Linux kernel development in the early 1990's was a pretty loose affair, with relatively small numbers of users and developers involved. With a user base in the millions and with some 2,000 developers involved over the course of one year, the kernel has since had to evolve a number of processes to keep development happening smoothly. A solid understanding of how the process works is required in order to be an effective part of it.
The kernel developers use a loosely time-based release process, with a new major kernel release happening every two or three months. The recent release history looks like this:
2.6.38 March 14, 2011 2.6.37 January 4, 2011 2.6.36 October 20, 2010 2.6.35 August 1, 2010 2.6.34 May 15, 2010 2.6.33 February 24, 2010
Every 2.6.x release is a major kernel release with new features, internal API changes, and more. A typical 2.6 release can contain nearly 10,000 changesets with changes to several hundred thousand lines of code. 2.6 is thus the leading edge of Linux kernel development; the kernel uses a rolling development model which is continually integrating major changes.
A relatively straightforward discipline is followed with regard to the merging of patches for each release. At the beginning of each development cycle, the "merge window" is said to be open. At that time, code which is deemed to be sufficiently stable (and which is accepted by the development community) is merged into the mainline kernel. The bulk of changes for a new development cycle (and all of the major changes) will be merged during this time, at a rate approaching 1,000 changes ("patches," or "changesets") per day.
(As an aside, it is worth noting that the changes integrated during the merge window do not come out of thin air; they have been collected, tested, and staged ahead of time. How that process works will be described in detail later on).
The merge window lasts for approximately two weeks. At the end of this time, Linus Torvalds will declare that the window is closed and release the first of the "rc" kernels. For the kernel which is destined to be 2.6.40, for example, the release which happens at the end of the merge window will be called 2.6.40-rc1. The -rc1 release is the signal that the time to merge new features has passed, and that the time to stabilize the next kernel has begun.
Over the next six to ten weeks, only patches which fix problems should be submitted to the mainline. On occasion a more significant change will be allowed, but such occasions are rare; developers who try to merge new features outside of the merge window tend to get an unfriendly reception. As a general rule, if you miss the merge window for a given feature, the best thing to do is to wait for the next development cycle. (An occasional exception is made for drivers for previously-unsupported hardware; if they touch no in-tree code, they cannot cause regressions and should be safe to add at any time).
As fixes make their way into the mainline, the patch rate will slow over time. Linus releases new -rc kernels about once a week; a normal series will get up to somewhere between -rc6 and -rc9 before the kernel is considered to be sufficiently stable and the final 2.6.x release is made. At that point the whole process starts over again.
As an example, here is how the 2.6.38 development cycle went (all dates in 2011):
January 4 2.6.37 stable release January 18 2.6.38-rc1, merge window closes January 21 2.6.38-rc2 February 1 2.6.38-rc3 February 7 2.6.38-rc4 February 15 2.6.38-rc5 February 21 2.6.38-rc6 March 1 2.6.38-rc7 March 7 2.6.38-rc8 March 14 2.6.38 stable release
How do the developers decide when to close the development cycle and create the stable release? The most significant metric used is the list of regressions from previous releases. No bugs are welcome, but those which break systems which worked in the past are considered to be especially serious. For this reason, patches which cause regressions are looked upon unfavorably and are quite likely to be reverted during the stabilization period.
The developers' goal is to fix all known regressions before the stable release is made. In the real world, this kind of perfection is hard to achieve; there are just too many variables in a project of this size. There comes a point where delaying the final release just makes the problem worse; the pile of changes waiting for the next merge window will grow larger, creating even more regressions the next time around. So most 2.6.x kernels go out with a handful of known regressions though, hopefully, none of them are serious.
Once a stable release is made, its ongoing maintenance is passed off to the "stable team," currently consisting of Greg Kroah-Hartman. The stable team will release occasional updates to the stable release using the 2.6.x.y numbering scheme. To be considered for an update release, a patch must (1) fix a significant bug, and (2) already be merged into the mainline for the next development kernel. Kernels will typically receive stable updates for a little more than one development cycle past their initial release. So, for example, the 2.6.36 kernel's history looked like:
October 10 2.6.36 stable release November 22 2.6.36.1 December 9 2.6.36.2 January 7 2.6.36.3 February 17 2.6.36.4
2.6.36.4 was the final stable update for the 2.6.36 release.
Some kernels are designated "long term" kernels; they will receive support for a longer period. As of this writing, the current long term kernels and their maintainers are:
2.6.27 Willy Tarreau (Deep-frozen stable kernel) 2.6.32 Greg Kroah-Hartman 2.6.35 Andi Kleen (Embedded flag kernel)
The selection of a kernel for long-term support is purely a matter of a maintainer having the need and the time to maintain that release. There are no known plans for long-term support for any specific upcoming release.
Patches do not go directly from the developer's keyboard into the mainline kernel. There is, instead, a somewhat involved (if somewhat informal) process designed to ensure that each patch is reviewed for quality and that each patch implements a change which is desirable to have in the mainline. This process can happen quickly for minor fixes, or, in the case of large and controversial changes, go on for years. Much developer frustration comes from a lack of understanding of this process or from attempts to circumvent it.
In the hopes of reducing that frustration, this document will describe how a patch gets into the kernel. What follows below is an introduction which describes the process in a somewhat idealized way. A much more detailed treatment will come in later sections.
The stages that a patch goes through are, generally:
- Design. This is where the real requirements for the patch - and the way those requirements will be met - are laid out. Design work is often done without involving the community, but it is better to do this work in the open if at all possible; it can save a lot of time redesigning things later.
- Early review. Patches are posted to the relevant mailing list, and developers on that list reply with any comments they may have. This process should turn up any major problems with a patch if all goes well.
- Wider review. When the patch is getting close to ready for mainline inclusion, it should be accepted by a relevant subsystem maintainer - though this acceptance is not a guarantee that the patch will make it all the way to the mainline. The patch will show up in the maintainer's subsystem tree and into the -next trees (described below). When the process works, this step leads to more extensive review of the patch and the discovery of any problems resulting from the integration of this patch with work being done by others.
- Please note that most maintainers also have day jobs, so merging your patch may not be their highest priority. If your patch is getting feedback about changes that are needed, you should either make those changes or justify why they should not be made. If your patch has no review complaints but is not being merged by its appropriate subsystem or driver maintainer, you should be persistent in updating the patch to the current kernel so that it applies cleanly and keep sending it for review and merging.
- Merging into the mainline. Eventually, a successful patch will be merged into the mainline repository managed by Linus Torvalds. More comments and/or problems may surface at this time; it is important that the developer be responsive to these and fix any issues which arise.
- Stable release. The number of users potentially affected by the patch is now large, so, once again, new problems may arise.
- Long-term maintenance. While it is certainly possible for a developer to forget about code after merging it, that sort of behavior tends to leave a poor impression in the development community. Merging code eliminates some of the maintenance burden, in that others will fix problems caused by API changes. But the original developer should continue to take responsibility for the code if it is to remain useful in the longer term.
One of the largest mistakes made by kernel developers (or their employers) is to try to cut the process down to a single "merging into the mainline" step. This approach invariably leads to frustration for everybody involved.
There is exactly one person who can merge patches into the mainline kernel repository: Linus Torvalds. But, of the over 9,500 patches which went into the 2.6.38 kernel, only 112 (around 1.3%) were directly chosen by Linus himself. The kernel project has long since grown to a size where no single developer could possibly inspect and select every patch unassisted. The way the kernel developers have addressed this growth is through the use of a lieutenant system built around a chain of trust.
The kernel code base is logically broken down into a set of subsystems: networking, specific architecture support, memory management, video devices, etc. Most subsystems have a designated maintainer, a developer who has overall responsibility for the code within that subsystem. These subsystem maintainers are the gatekeepers (in a loose way) for the portion of the kernel they manage; they are the ones who will (usually) accept a patch for inclusion into the mainline kernel.
Subsystem maintainers each manage their own version of the kernel source tree, usually (but certainly not always) using the git source management tool. Tools like git (and related tools like quilt or mercurial) allow maintainers to track a list of patches, including authorship information and other metadata. At any given time, the maintainer can identify which patches in his or her repository are not found in the mainline.
When the merge window opens, top-level maintainers will ask Linus to "pull" the patches they have selected for merging from their repositories. If Linus agrees, the stream of patches will flow up into his repository, becoming part of the mainline kernel. The amount of attention that Linus pays to specific patches received in a pull operation varies. It is clear that, sometimes, he looks quite closely. But, as a general rule, Linus trusts the subsystem maintainers to not send bad patches upstream.
Subsystem maintainers, in turn, can pull patches from other maintainers. For example, the networking tree is built from patches which accumulated first in trees dedicated to network device drivers, wireless networking, etc. This chain of repositories can be arbitrarily long, though it rarely exceeds two or three links. Since each maintainer in the chain trusts those managing lower-level trees, this process is known as the "chain of trust."
Clearly, in a system like this, getting patches into the kernel depends on finding the right maintainer. Sending patches directly to Linus is not normally the right way to go.
The chain of subsystem trees guides the flow of patches into the kernel, but it also raises an interesting question: what if somebody wants to look at all of the patches which are being prepared for the next merge window? Developers will be interested in what other changes are pending to see whether there are any conflicts to worry about; a patch which changes a core kernel function prototype, for example, will conflict with any other patches which use the older form of that function. Reviewers and testers want access to the changes in their integrated form before all of those changes land in the mainline kernel. One could pull changes from all of the interesting subsystem trees, but that would be a big and error-prone job.
The answer comes in the form of -next trees, where subsystem trees are collected for testing and review. The older of these trees, maintained by Andrew Morton, is called "-mm" (for memory management, which is how it got started). The -mm tree integrates patches from a long list of subsystem trees; it also has some patches aimed at helping with debugging.
Beyond that, -mm contains a significant collection of patches which have been selected by Andrew directly. These patches may have been posted on a mailing list, or they may apply to a part of the kernel for which there is no designated subsystem tree. As a result, -mm operates as a sort of subsystem tree of last resort; if there is no other obvious path for a patch into the mainline, it is likely to end up in -mm. Miscellaneous patches which accumulate in -mm will eventually either be forwarded on to an appropriate subsystem tree or be sent directly to Linus. In a typical development cycle, approximately 5-10% of the patches going into the mainline get there via -mm.
The current -mm patch is available in the "mmotm" (-mm of the moment) directory at:
http://www.ozlabs.org/~akpm/mmotm/
Use of the MMOTM tree is likely to be a frustrating experience, though; there is a definite chance that it will not even compile.
The primary tree for next-cycle patch merging is linux-next, maintained by Stephen Rothwell. The linux-next tree is, by design, a snapshot of what the mainline is expected to look like after the next merge window closes. Linux-next trees are announced on the linux-kernel and linux-next mailing lists when they are assembled; they can be downloaded from:
http://www.kernel.org/pub/linux/kernel/next/
Linux-next has become an integral part of the kernel development process; all patches merged during a given merge window should really have found their way into linux-next some time before the merge window opens.
The kernel source tree contains the drivers/staging/ directory, where many sub-directories for drivers or filesystems that are on their way to being added to the kernel tree live. They remain in drivers/staging while they still need more work; once complete, they can be moved into the kernel proper. This is a way to keep track of drivers that aren't up to Linux kernel coding or quality standards, but people may want to use them and track development.
Greg Kroah-Hartman currently maintains the staging tree. Drivers that still need work are sent to him, with each driver having its own subdirectory in drivers/staging/. Along with the driver source files, a TODO file should be present in the directory as well. The TODO file lists the pending work that the driver needs for acceptance into the kernel proper, as well as a list of people that should be Cc'd for any patches to the driver. Current rules require that drivers contributed to staging must, at a minimum, compile properly.
Staging can be a relatively easy way to get new drivers into the mainline where, with luck, they will come to the attention of other developers and improve quickly. Entry into staging is not the end of the story, though; code in staging which is not seeing regular progress will eventually be removed. Distributors also tend to be relatively reluctant to enable staging drivers. So staging is, at best, a stop on the way toward becoming a proper mainline driver.
As can be seen from the above text, the kernel development process depends heavily on the ability to herd collections of patches in various directions. The whole thing would not work anywhere near as well as it does without suitably powerful tools. Tutorials on how to use these tools are well beyond the scope of this document, but there is space for a few pointers.
By far the dominant source code management system used by the kernel community is git. Git is one of a number of distributed version control systems being developed in the free software community. It is well tuned for kernel development, in that it performs quite well when dealing with large repositories and large numbers of patches. It also has a reputation for being difficult to learn and use, though it has gotten better over time. Some sort of familiarity with git is almost a requirement for kernel developers; even if they do not use it for their own work, they'll need git to keep up with what other developers (and the mainline) are doing.
Git is now packaged by almost all Linux distributions. There is a home page at:
http://git-scm.com/
That page has pointers to documentation and tutorials.
Among the kernel developers who do not use git, the most popular choice is almost certainly Mercurial:
http://www.selenic.com/mercurial/
Mercurial shares many features with git, but it provides an interface which many find easier to use.
The other tool worth knowing about is Quilt:
http://savannah.nongnu.org/projects/quilt/
Quilt is a patch management system, rather than a source code management system. It does not track history over time; it is, instead, oriented toward tracking a specific set of changes against an evolving code base. Some major subsystem maintainers use quilt to manage patches intended to go upstream. For the management of certain kinds of trees (-mm, for example), quilt is the best tool for the job.
A great deal of Linux kernel development work is done by way of mailing lists. It is hard to be a fully-functioning member of the community without joining at least one list somewhere. But Linux mailing lists also represent a potential hazard to developers, who risk getting buried under a load of electronic mail, running afoul of the conventions used on the Linux lists, or both.
Most kernel mailing lists are run on vger.kernel.org; the master list can be found at:
http://vger.kernel.org/vger-lists.html
There are lists hosted elsewhere, though; a number of them are at lists.redhat.com.
The core mailing list for kernel development is, of course, linux-kernel. This list is an intimidating place to be; volume can reach 500 messages per day, the amount of noise is high, the conversation can be severely technical, and participants are not always concerned with showing a high degree of politeness. But there is no other place where the kernel development community comes together as a whole; developers who avoid this list will miss important information.
There are a few hints which can help with linux-kernel survival:
- Have the list delivered to a separate folder, rather than your main mailbox. One must be able to ignore the stream for sustained periods of time.
- Do not try to follow every conversation - nobody else does. It is important to filter on both the topic of interest (though note that long-running conversations can drift away from the original subject without changing the email subject line) and the people who are participating.
- Do not feed the trolls. If somebody is trying to stir up an angry response, ignore them.
- When responding to linux-kernel email (or that on other lists) preserve the Cc: header for all involved. In the absence of a strong reason (such as an explicit request), you should never remove recipients. Always make sure that the person you are responding to is in the Cc: list. This convention also makes it unnecessary to explicitly ask to be copied on replies to your postings.
- Search the list archives (and the net as a whole) before asking questions. Some developers can get impatient with people who clearly have not done their homework.
- Avoid top-posting (the practice of putting your answer above the quoted text you are responding to). It makes your response harder to read and makes a poor impression.
- Ask on the correct mailing list. Linux-kernel may be the general meeting point, but it is not the best place to find developers from all subsystems.
The last point - finding the correct mailing list - is a common place for beginning developers to go wrong. Somebody who asks a networking-related question on linux-kernel will almost certainly receive a polite suggestion to ask on the netdev list instead, as that is the list frequented by most networking developers. Other lists exist for the SCSI, video4linux, IDE, filesystem, etc. subsystems. The best place to look for mailing lists is in the MAINTAINERS file packaged with the kernel source.
Questions about how to get started with the kernel development process are common - from both individuals and companies. Equally common are missteps which make the beginning of the relationship harder than it has to be.
Companies often look to hire well-known developers to get a development group started. This can, in fact, be an effective technique. But it also tends to be expensive and does not do much to grow the pool of experienced kernel developers. It is possible to bring in-house developers up to speed on Linux kernel development, given the investment of a bit of time. Taking this time can endow an employer with a group of developers who understand the kernel and the company both, and who can help to train others as well. Over the medium term, this is often the more profitable approach.
Individual developers are often, understandably, at a loss for a place to start. Beginning with a large project can be intimidating; one often wants to test the waters with something smaller first. This is the point where some developers jump into the creation of patches fixing spelling errors or minor coding style issues. Unfortunately, such patches create a level of noise which is distracting for the development community as a whole, so, increasingly, they are looked down upon. New developers wishing to introduce themselves to the community will not get the sort of reception they wish for by these means.
Andrew Morton gives this advice for aspiring kernel developers
The #1 project for all kernel beginners should surely be "make sure that the kernel runs perfectly at all times on all machines which you can lay your hands on". Usually the way to do this is to work with others on getting things fixed up (this can require persistence!) but that's fine - it's a part of kernel development.
(http://lwn.net/Articles/283982/).
In the absence of obvious problems to fix, developers are advised to look at the current lists of regressions and open bugs in general. There is never any shortage of issues in need of fixing; by addressing these issues, developers will gain experience with the process while, at the same time, building respect with the rest of the development community.