 1: A GUIDE TO THE KERNEL DEVELOPMENT PROCESS

 The purpose of this document is to help developers (and their managers)
 work with the development community with a minimum of frustration.  It is
 an attempt to document how this community works in a way which is
 accessible to those who are not intimately familiar with Linux kernel
 development (or, indeed, free software development in general).  While
 there is some technical material here, this is very much a process-oriented
 discussion which does not require a deep knowledge of kernel programming to
 understand.


 1.1: EXECUTIVE SUMMARY

 The rest of this section covers the scope of the kernel development process
 and the kinds of frustrations that developers and their employers can
 encounter there.  There are a great many reasons why kernel code should be
 merged into the official ("mainline") kernel, including automatic
 availability to users, community support in many forms, and the ability to
 influence the direction of kernel development.  Code contributed to the
 Linux kernel must be made available under a GPL-compatible license.

 Section 2 introduces the development process, the kernel release cycle, and
 the mechanics of the merge window.  The various phases in the patch
 development, review, and merging cycle are covered.  There is some
 discussion of tools and mailing lists.  Developers wanting to get started
 with kernel development are encouraged to track down and fix bugs as an
 initial exercise.

 Section 3 covers early-stage project planning, with an emphasis on
 involving the development community as soon as possible.

 Section 4 is about the coding process; several pitfalls which have been
 encountered by other developers are discussed.  Some requirements for
 patches are covered, and there is an introduction to some of the tools
 which can help to ensure that kernel patches are correct.

 Section 5 talks about the process of posting patches for review.  To be
 taken seriously by the development community, patches must be properly
 formatted and described, and they must be sent to the right place.
 Following the advice in this section should help to ensure the best
 possible reception for your work.

 Section 6 covers what happens after posting patches; the job is far from
 done at that point.  Working with reviewers is a crucial part of the
 development process; this section offers a number of tips on how to avoid
 problems at this important stage.  Developers are cautioned against
 assuming that the job is done when a patch is merged into the mainline.

 Section 7 introduces a couple of "advanced" topics: managing patches with
 git and reviewing patches posted by others.

 Section 8 concludes the document with pointers to sources for more
 information on kernel development.


 1.2: WHAT THIS DOCUMENT IS ABOUT

 The Linux kernel, at over 6 million lines of code and well over 1000 active
 contributors, is one of the largest and most active free software projects
 in existence.  Since its humble beginning in 1991, this kernel has evolved
 into a best-of-breed operating system component which runs on pocket-sized
 digital music players, desktop PCs, the largest supercomputers in
 existence, and all types of systems in between.  It is a robust, efficient,
 and scalable solution for almost any situation.

 With the growth of Linux has come an increase in the number of developers
 (and companies) wishing to participate in its development.  Hardware
 vendors want to ensure that Linux supports their products well, making
 those products attractive to Linux users.  Embedded systems vendors, who
 use Linux as a component in an integrated product, want Linux to be as
 capable and well-suited to the task at hand as possible.  Distributors and
 other software vendors who base their products on Linux have a clear
 interest in the capabilities, performance, and reliability of the Linux
 kernel.  And end users, too, will often wish to change Linux to make it
 better suit their needs.

 One of the most compelling features of Linux is that it is accessible to
 these developers; anybody with the requisite skills can improve Linux and
 influence the direction of its development.  Proprietary products cannot
 offer this kind of openness, which is a characteristic of the free software
 process.  But, if anything, the kernel is even more open than most other
 free software projects.  A typical three-month kernel development cycle can
 involve over 1000 developers working for more than 100 different companies
 (or for no company at all).

 Working with the kernel development community is not especially hard.  But,
 that notwithstanding, many potential contributors have experienced
 difficulties when trying to do kernel work.  The kernel community has
 evolved its own distinct ways of operating which allow it to function
 smoothly (and produce a high-quality product) in an environment where
 thousands of lines of code are being changed every day.  So it is not
 surprising that the Linux kernel development process differs greatly from
 proprietary development methods.

 The kernel's development process may come across as strange and
 intimidating to new developers, but there are good reasons and solid
 experience behind it.  A developer who does not understand the kernel
 community's ways (or, worse, who tries to flout or circumvent them) will
 have a frustrating experience in store.  The development community, while
 being helpful to those who are trying to learn, has little time for those
 who will not listen or who do not care about the development process.

 It is hoped that those who read this document will be able to avoid that
 frustrating experience.  There is a lot of material here, but the effort
 involved in reading it will be repaid in short order.  The development
 community is always in need of developers who will help to make the kernel
 better; the following text should help you - or those who work for you -
 join our community.


 1.3: CREDITS

 This document was written by Jonathan Corbet, corbet@lwn.net.  It has been
 improved by comments from Johannes Berg, James Berry, Alex Chiang, Roland
 Dreier, Randy Dunlap, Jake Edge, Jiri Kosina, Matt Mackall, Arthur Marsh,
 Amanda McPherson, Andrew Morton, Andrew Price, Tsugikazu Shibata, and
 Jochen Voß.

 This work was supported by the Linux Foundation; thanks especially to
 Amanda McPherson, who saw the value of this effort and made it all happen.


 1.4: THE IMPORTANCE OF GETTING CODE INTO THE MAINLINE

 Some companies and developers occasionally wonder why they should bother
 learning how to work with the kernel community and get their code into the
 mainline kernel (the "mainline" being the kernel maintained by Linus
 Torvalds and used as a base by Linux distributors).  In the short term,
 contributing code can look like an avoidable expense; it seems easier to
 just keep the code separate and support users directly.  The truth of the
 matter is that keeping code separate ("out of tree") is a false economy.

 As a way of illustrating the costs of out-of-tree code, here are a few
 relevant aspects of the kernel development process; most of these will be
 discussed in greater detail later in this document.  Consider:

 - Code which has been merged into the mainline kernel is available to all
   Linux users.  It will automatically be present on all distributions which
   enable it.  There is no need for driver disks, downloads, or the hassles
   of supporting multiple versions of multiple distributions; it all just
   works, for the developer and for the user.  Incorporation into the
   mainline solves a large number of distribution and support problems.

 - While kernel developers strive to maintain a stable interface to user
   space, the internal kernel API is in constant flux.  The lack of a stable
   internal interface is a deliberate design decision; it allows fundamental
   improvements to be made at any time and results in higher-quality code.
   But one result of that policy is that any out-of-tree code requires
   constant upkeep if it is to work with new kernels.  Maintaining
   out-of-tree code requires significant amounts of work just to keep that
   code working.

   Code which is in the mainline, by contrast, does not require this work,
   thanks to a simple rule: any developer who makes an API change must also
   fix any code that breaks as a result of that change.  So code which has
   been merged into the mainline has significantly lower maintenance costs.

 - Beyond that, code which is in the kernel will often be improved by other
   developers.  Surprising results can come from empowering your user
   community and customers to improve your product.

 - Kernel code is subjected to review, both before and after merging into
   the mainline.  No matter how strong the original developer's skills are,
   this review process invariably finds ways in which the code can be
   improved.  Often review finds severe bugs and security problems.  This is
   especially true for code which has been developed in a closed
   environment; such code benefits strongly from review by outside
   developers.  Out-of-tree code is lower-quality code.

 - Participation in the development process is your way to influence the
   direction of kernel development.  Users who complain from the sidelines
   are heard, but active developers have a stronger voice - and the ability
   to implement changes which make the kernel work better for their needs.

 - When code is maintained separately, the possibility that a third party
   will contribute a different implementation of a similar feature always
   exists.  Should that happen, getting your code merged will become much
   harder - to the point of impossibility.  Then you will be faced with the
   unpleasant alternatives of either (1) maintaining a nonstandard feature
   out of tree indefinitely, or (2) abandoning your code and migrating your
   users over to the in-tree version.

 - Contribution of code is the fundamental action which makes the whole
   process work.  By contributing your code you can add new functionality to
   the kernel and provide capabilities and examples which are of use to
   other kernel developers.  If you have developed code for Linux (or are
   thinking about doing so), you clearly have an interest in the continued
   success of this platform; contributing code is one of the best ways to
   help ensure that success.

 All of the reasoning above applies to any out-of-tree kernel code,
 including code which is distributed in proprietary, binary-only form.
 There are, however, additional factors which should be taken into account
 before considering any sort of binary-only kernel code distribution.  These
 include:

 - The legal issues around the distribution of proprietary kernel modules
   are cloudy at best; quite a few kernel copyright holders believe that
   most binary-only modules are derived products of the kernel and that, as
   a result, their distribution is a violation of the GNU General Public
   License (about which more will be said below).  Your author is not a
   lawyer, and nothing in this document can possibly be considered to be
   legal advice.  The true legal status of closed-source modules can only be
   determined by the courts.  But the uncertainty which haunts those modules
   is there regardless.

 - Binary modules greatly increase the difficulty of debugging kernel
   problems, to the point that most kernel developers will not even try.  So
   the distribution of binary-only modules will make it harder for your
   users to get support from the community.

 - Support is also harder for distributors of binary-only modules, who must
   provide a version of the module for every distribution and every kernel
   version they wish to support.  Dozens of builds of a single module can
   be required to provide reasonably comprehensive coverage, and your users
   will have to upgrade your module separately every time they upgrade their
   kernel.

 - Everything that was said above about code review applies doubly to
   closed-source code.  Since this code is not available at all, it cannot
   have been reviewed by the community and will, beyond doubt, have serious
   problems.
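
 As a rough illustration of how the build count for a binary-only module
 grows, the sketch below enumerates one build per (distribution, kernel
 version) pair.  The distribution names and kernel versions are hypothetical
 placeholders chosen for the example, not data from any real vendor.

```python
from itertools import product

# Hypothetical support targets; names and counts are illustrative only.
distributions = ["distro-a", "distro-b", "distro-c", "distro-d"]
kernel_versions = ["2.6.23", "2.6.24", "2.6.25", "2.6.26", "2.6.27", "2.6.28"]


def build_matrix(distros, kernels):
    """Return every (distribution, kernel version) pair needing a build."""
    return list(product(distros, kernels))


builds = build_matrix(distributions, kernel_versions)
print(f"{len(builds)} separate module builds required")
```

 Under these assumptions, four distributions and six kernel versions already
 call for 24 builds, and each new kernel release adds another column to the
 matrix.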

 Makers of embedded systems, in particular, may be tempted to disregard much
 of what has been said in this section in the belief that they are shipping
 a self-contained product which uses a frozen kernel version and requires no
 more development after its release.  This argument misses the value of
 widespread code review and the value of allowing your users to add
 capabilities to your product.  But these products, too, have a limited
 commercial life, after which a new version must be released.  At that
 point, vendors whose code is in the mainline and well maintained will be
 much better positioned to get the new product ready for market quickly.


 1.5: LICENSING

 Code is contributed to the Linux kernel under a number of licenses, but all
 code must be compatible with version 2 of the GNU General Public License
 (GPLv2), which is the license covering the kernel distribution as a whole.
 In practice, that means that all code contributions are covered either by
 GPLv2 (with, optionally, language allowing distribution under later
 versions of the GPL) or the three-clause BSD license.  Any contributions
 which are not covered by a compatible license will not be accepted into the
 kernel.

 Copyright assignments are not required (or requested) for code contributed
 to the kernel.  All code merged into the mainline kernel retains its
 original ownership; as a result, the kernel now has thousands of owners.

 One implication of this ownership structure is that any attempt to change
 the licensing of the kernel is almost certain to fail.  There are
 few practical scenarios where the agreement of all copyright holders could
 be obtained (or their code removed from the kernel).  So, in particular,
 there is no prospect of a migration to version 3 of the GPL in the
 foreseeable future.

 It is imperative that all code contributed to the kernel be legitimately
 free software.  For that reason, code from anonymous (or pseudonymous)
 contributors will not be accepted.  All contributors are required to "sign
 off" on their code, stating that the code can be distributed with the
 kernel under the GPL.  Code which has not been licensed as free software by
 its owner, or which risks creating copyright-related problems for the
 kernel (such as code which derives from reverse-engineering efforts lacking
 proper safeguards) cannot be contributed.
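
 In practice the sign-off is recorded as a "Signed-off-by:" line at the end
 of the patch description.  The sketch below shows one way a submitter might
 check a description for such lines before posting.  It assumes the line has
 the form "Signed-off-by: Full Name <address>"; the function name, the
 pattern, and the sample description are illustrative, and a pattern match
 cannot confirm that the name given is a real identity.

```python
import re

# Assumed form: "Signed-off-by: Full Name <address>", one per line.
SIGNOFF_RE = re.compile(
    r"^Signed-off-by: (?P<name>[^<>\n]+?) <(?P<email>[^<>\s]+@[^<>\s]+)>$",
    re.MULTILINE,
)


def signoffs(message):
    """Return (name, email) pairs for each sign-off line in a description."""
    return [(m.group("name"), m.group("email"))
            for m in SIGNOFF_RE.finditer(message)]


# Hypothetical patch description used only for demonstration.
example = """example: fix a hypothetical bug

Describe what the change does and why it is needed.

Signed-off-by: Random J Developer <random@developer.example.org>
"""

found = signoffs(example)
if not found:
    print("no sign-off found; the patch is not ready to post")
for name, email in found:
    print(f"signed off by {name} <{email}>")
```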

 Questions about copyright-related issues are common on Linux development
 mailing lists.  Such questions will normally receive no shortage of
 answers, but one should bear in mind that the people answering those
 questions are not lawyers and cannot provide legal advice.  If you have
 legal questions relating to Linux source code, there is no substitute for
 talking with a lawyer who understands this field.  Relying on answers
 obtained on technical mailing lists is a risky affair.