This document was crafted from the various (320+) emails between 2nd June and 20th July regarding the move to GitHub. It tried to consolidate every issue that was raised and every solution that was presented to have a GitHub repository with sub-modules. It *does not* try to argue whether sub-modules are better or worse than any other Git solution, nor if Git is better than any other VCS, nor if GitHub is better than any other free code hosting service. This is just the final conclusions of 48 days and 320 emails (plus a lot of IRC discussions) on the LLVM community. This document will be presented at the survey that the foundation will setup for us to decide if we move to this solution or not. It reflects what was discussed on the lists, but it's not authoritative. If something is not clear enough, please refer to the mailing list discussions (hint: search for "GitHub"). Review: https://reviews.llvm.org/D22463 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276097 91177308-0d34-0410-b5e6-96231b3b80d8
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,268 @@ | ||
============================== | ||
Moving LLVM Projects to GitHub | ||
============================== | ||
|
||
Introduction | ||
============ | ||
|
||
This is a proposal to move our current revision control system from our own | ||
hosted Subversion to GitHub. Below are the financial and technical arguments as | ||
to why we need such a move and how people (and validation infrastructure) will | ||
continue to work with a Git-based LLVM. | ||
|
||
There will be a survey pointing at this document, which will tell us the | ||
community's reaction and, if we collectively decide to move, the time-frames. | ||
Be sure to make your views count. | ||
|
||
Essentially, the proposal is divided into the following parts: | ||
|
||
* Outline of the reasons to move to Git and GitHub | ||
* Description of what the workflow will look like (compared to SVN) | ||
* Remaining issues and potential problems | ||
* The proposed migration plan | ||
|
||
Why Git, and Why GitHub? | ||
======================== | ||
|
||
Why move at all? | ||
---------------- | ||
|
||
The strongest reason for the move, and why this discussion started in the first | ||
place, is that we currently host our own Subversion server and Git mirror on a | ||
voluntary basis. The LLVM Foundation sponsors the server and provides limited | ||
support, but there is only so much it can do. | ||
|
||
The volunteers are not sysadmins themselves, but compiler engineers who happen | ||
to know a thing or two about hosting servers. We also don't have 24/7 support, | ||
and we sometimes wake up to see that continuous integration is broken because | ||
the SVN server is either down or unresponsive. | ||
|
||
With time and money, the foundation and volunteers could improve our services, | ||
implement more functionality and provide around the clock support, so that we | ||
can have a first class infrastructure with which to work. But the cost is not | ||
small, both in money and time invested. | ||
|
||
On the other hand, there are multiple services out there (GitHub, GitLab, | ||
BitBucket among others) that offer those same services (24/7 stability, disk | ||
space, Git server, code browsing, forking facilities, etc.) for the very | ||
affordable price of *free*. | ||
|
||
Why Git? | ||
-------- | ||
|
||
Most new coders nowadays start with Git. A lot of them have never used SVN, CVS | ||
or anything else. Websites like GitHub have changed the landscape of open source | ||
contributions, reducing the cost of first contribution and fostering | ||
collaboration. | ||
|
||
Git is also the version control system most LLVM developers use. Despite the | ||
sources being stored in an SVN server, most people develop using the Git-SVN | ||
integration. That shows not only that Git is more powerful than SVN, but also | ||
that its features are now so indispensable to people's internal and external | ||
workflows that they have resorted to using a bridge. | ||
|
||
In essence, Git allows you to: | ||
|
||
* Commit, squash, merge, fork locally without any penalty to the server | ||
* Add as many branches as necessary to allow for multiple threads of development | ||
* Collaborate with peers directly, even without access to the Internet | ||
* Have multiple trees without multiplying disk space | ||
|
||
In addition, because Git seems to be replacing every project's version control | ||
system, there are many more tools that can use Git's enhanced feature set, so | ||
new tooling is much more likely to support Git first (if not only) than any | ||
other version control system. | ||
|
||
Why GitHub? | ||
----------- | ||
|
||
GitHub, like GitLab and BitBucket, provides free code hosting for open source | ||
projects. Essentially, any of them would completely replace *all* the | ||
infrastructure that we have today for code repositories, mirroring, user | ||
control, etc. | ||
|
||
They also have dedicated teams to monitor, migrate, improve and distribute the | ||
contents of the repositories depending on region and load. That is a level of | ||
quality that we'd never have without spending money that would be better spent | ||
elsewhere, for example on development meetings, on sponsoring disadvantaged | ||
people to work on compilers, and on fostering diversity and equality in our | ||
community. | ||
|
||
GitHub has the added benefit that we already have a presence there. Many | ||
developers use it already, and the mirror from our current repository is already | ||
set up. | ||
|
||
Furthermore, GitHub has an *SVN view* (https://github.com/blog/626-announcing-svn-support) | ||
where people who still have or want to use SVN infrastructure and tooling can | ||
slowly migrate or even keep working as if it were an SVN repository (including | ||
read-write access). | ||
|
||
So, any of the three solutions solves the cost and maintenance problem, but | ||
GitHub has two additional features (our existing presence there and the SVN | ||
view) that would benefit the migration plan as well as the community already | ||
settled there. | ||
|
||
|
||
What will the new workflow look like | ||
==================================== | ||
|
||
In order to move version control, we need to make sure that we get all the | ||
benefits with the least amount of problems. That's why the migration plan will | ||
be slow, one step at a time, and we'll try to make it look as close as possible | ||
to the current style without impacting the new features we want. | ||
|
||
Each LLVM project will continue to be hosted as a separate GitHub repository | ||
under a single GitHub organisation. Users can continue to choose to use either | ||
SVN or Git to access the repositories to suit their current workflow. | ||
|
||
In addition, we'll create a repository that will mimic our current *linear | ||
history* repository. The most accepted proposal, then, was to have an umbrella | ||
project that will contain *sub-modules* (https://git-scm.com/book/en/v2/Git-Tools-Submodules) | ||
of all the LLVM projects and nothing else. | ||
|
||
This repository can be checked out on its own, in order to have *all* LLVM | ||
projects in a single check-out, as many people have suggested, but it can also | ||
only hold the references to the other projects, and be used for the sole purpose | ||
of understanding the *sequence* in which commits were added by using the | ||
``git rev-list --count hash`` or ``git describe hash`` commands. | ||
|
||
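As a rough illustration of the paragraph above, the sketch below wraps
``git rev-list --count`` in Python to turn a commit hash in the umbrella
repository into an increasing number comparable to an SVN revision. The
function name ``revision_number`` and its injectable ``run`` parameter are
assumptions made for this example; they are not part of the proposal.

```python
import subprocess


def revision_number(commit, repo=".", run=subprocess.check_output):
    """Count the commits reachable from `commit` in the repository at `repo`.

    On a strictly linear history this count grows by one per commit, so it
    can stand in for an SVN-style revision number.
    """
    # `git -C <repo>` runs the command as if git had been started in <repo>.
    out = run(["git", "-C", repo, "rev-list", "--count", commit])
    return int(out.decode().strip())
```

Called as ``revision_number("HEAD", repo="llvm-project")`` inside a checkout of
the umbrella repository (the directory name is hypothetical), it would report
how many commits lead up to the current one.
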
One example of such a repository is Takumi's llvm-project-submodule | ||
(https://github.com/chapuni/llvm-project-submodule), which when checked out, | ||
will have the references to all sub-modules but not check them out, so one will | ||
need to *init* the module manually. This will allow the *exact* same behaviour | ||
as checking out individual SVN repositories, as it will keep the correct linear | ||
history. | ||
|
||
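The manual *init* step described above could look roughly like the following
sketch, which builds the git commands for cloning the umbrella repository and
initialising only the sub-modules a developer needs. The helper names and the
sub-module paths used in the usage note (``llvm``, ``clang``) are assumptions
for illustration; the actual paths depend on how the umbrella repository is
laid out.

```python
import subprocess


def checkout_commands(umbrella_url, target_dir, projects):
    """Return the git commands that clone the umbrella repository and
    initialise only the requested sub-modules."""
    commands = [["git", "clone", umbrella_url, target_dir]]
    if projects:
        # `git submodule update --init -- <path>...` registers and checks out
        # just the listed sub-modules; the others remain plain references.
        commands.append(
            ["git", "-C", target_dir, "submodule", "update", "--init", "--"]
            + list(projects)
        )
    return commands


def run_all(commands):
    """Execute each command in order, stopping at the first failure."""
    for argv in commands:
        subprocess.check_call(argv)
```

For example, ``run_all(checkout_commands(url, "llvm-project", ["llvm", "clang"]))``
would clone the umbrella repository at ``url`` and check out only those two
projects.
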
There is no need for additional tags, flags and properties, or external | ||
services controlling the history, since both SVN and *git rev-list* can already | ||
do that on their own. | ||
|
||
We will need additional server hooks to avoid non-fast-forward commits (e.g. | ||
merges and forced pushes) in order to keep the linearity of the history. | ||
|
||
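The rule such a hook has to enforce can be sketched independently of any
hosting service. The function below is a minimal illustration over an
in-memory commit graph, not GitHub's API: the name ``is_linear_fast_forward``
and the ``parents`` mapping are assumptions made for this example.

```python
def is_linear_fast_forward(parents, old_tip, new_tip):
    """Return True if moving a branch from old_tip to new_tip keeps history linear.

    `parents` maps each commit id to the list of its parent ids.
    The update is accepted only if every commit walked from new_tip back
    to old_tip has exactly one parent (no merges) and the walk actually
    reaches old_tip (no forced push that discards history).
    """
    current = new_tip
    while current != old_tip:
        commit_parents = parents.get(current, [])
        if len(commit_parents) != 1:
            # Zero parents: we hit a root (or unknown) commit without ever
            # reaching old_tip. Two or more parents: a merge commit.
            return False
        current = commit_parents[0]
    return True
```

A merge commit or a push that rewrites the branch tip would be rejected, while
any sequence of single-parent commits on top of the old tip would be accepted.
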
The three types of hooks to be implemented are: | ||
|
||
* Status Checks: By placing status checks on a protected branch, we can guarantee | ||
that the history is kept linear and sane at all times, on all repositories. | ||
See: https://help.github.com/articles/about-required-status-checks/ | ||
* Umbrella updates: By using GitHub web hooks, we can notify a small web service | ||
inside LLVM's own infrastructure, which will update the umbrella project remotely. The | ||
maintenance of this service will be lower than the current SVN maintenance and | ||
the scope of its failures will be less severe. | ||
See: https://developer.github.com/webhooks/ | ||
* Commits email update: By adding an email web hook, we can make every push show | ||
in the lists, allowing us to retain history and do post-commit reviews. | ||
See: https://help.github.com/articles/managing-notifications-for-pushes-to-a-repository/ | ||
|
||
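To make the umbrella update hook more concrete, here is a hedged sketch of what
the small web service might do when it receives a push event. The ``ref``,
``after`` and ``repository.name`` fields come from GitHub's push event payload;
everything else (the function name, the assumption that each sub-module path
equals the repository name, the ``master`` branch, and the local umbrella
checkout) is hypothetical.

```python
def umbrella_update_commands(payload, umbrella_dir, branch="refs/heads/master"):
    """Translate a GitHub push event into git commands that advance the
    matching sub-module in a local umbrella checkout.

    Returns an empty list for pushes to any other branch.
    """
    if payload["ref"] != branch:
        return []
    project = payload["repository"]["name"]  # assumed to equal the sub-module path
    sha = payload["after"]                   # commit id the branch now points to
    submodule_dir = f"{umbrella_dir}/{project}"
    short_branch = branch.rsplit("/", 1)[-1]
    return [
        ["git", "-C", submodule_dir, "fetch", "origin"],
        ["git", "-C", submodule_dir, "checkout", "--detach", sha],
        # Staging the sub-module path records its new commit in the umbrella.
        ["git", "-C", umbrella_dir, "add", project],
        ["git", "-C", umbrella_dir, "commit", "-m", f"Update {project} to {sha}"],
        ["git", "-C", umbrella_dir, "push", "origin", short_branch],
    ]
```

Because each project push produces exactly one umbrella commit, the umbrella
history records the order in which commits arrived across all projects. A real
service would also need to authenticate incoming requests and serialise
concurrent updates, which this sketch leaves out.
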
Access will be transferred one-to-one to GitHub accounts for everyone that already | ||
has commit access to our current repository. Those who don't have accounts will | ||
have to create one in order to continue contributing to the project. In the | ||
future, people only need to provide their GitHub accounts to be granted access. | ||
|
||
In a nutshell: | ||
|
||
* The projects' repositories will remain identical, with a new address (GitHub). | ||
* They'll continue to have SVN access (Read-Write), but will also gain Git RW access. | ||
* The linear history can still be accessed in the (RO) submodule meta project. | ||
* Individual projects' history will be local (i.e. not interlaced with the other | ||
projects, as the current SVN repos are), and we need the umbrella project | ||
(using submodules) to have the same view as we had in SVN. | ||
|
||
Additionally, each repository will have the following server hooks: | ||
|
||
* Pre-commit hooks to stop people from applying non-fast-forward merges | ||
* Webhook to update the umbrella project (via buildbot or web services) | ||
* Email hook to each commits list (llvm-commits, cfe-commits, etc.) | ||
|
||
Essentially, we're adding Git RW access in addition to the already existing | ||
structure, with all the additional benefits of it being in GitHub. | ||
|
||
What will *not* be changed | ||
-------------------------- | ||
|
||
This is a change of version control system, not the whole infrastructure. There | ||
are plans to replace our current tools (review, bugs, documents), but they're | ||
all orthogonal to this proposal. | ||
|
||
We'll also be keeping the buildbots (and migrating them to use Git) as well as | ||
LNT, and any other system that currently provides value upstream. | ||
|
||
Any discussion regarding those tools is out of scope for this proposal. | ||
|
||
Remaining questions and problems | ||
================================ | ||
|
||
1. How much does the SVN view emulate, and how much will it break tools/CI? | ||
|
||
For this one, we'll need people that will have problems in that area to tell | ||
us what's wrong and how to help them fix it. | ||
|
||
We also recommend that people and companies migrate to Git, for its many | ||
additional benefits. | ||
|
||
2. Which tools will need changing? | ||
|
||
LNT may break, since it relies on SVN's history. We can continue to | ||
use LNT with the SVN-View, but it would be best to move it to Git once and for | ||
all. | ||
|
||
The LLVMLab bisect tool will also be affected and will need adjusting. As with | ||
LNT, it should be fine to use GitHub's SVN view, but changing it to work on Git | ||
will be required in the long term. | ||
|
||
Phabricator will also need to change its configuration to point at the GitHub | ||
repositories, but since it already works with Git, this will be a trivial change. | ||
|
||
Migration Plan | ||
============== | ||
|
||
If we decide to move, we'll have to set a date for the process to begin. | ||
|
||
As usual, we should be announcing big changes in one release to happen in the | ||
next one. But since this won't impact external users (if they rely on our source | ||
release tarballs), we don't necessarily have to. | ||
|
||
We will have to make sure all the *problems* reported are solved before the | ||
final push. But we can start all non-binding processes (like mirroring to GitHub | ||
and testing the SVN interface in it) before any hard decision. | ||
|
||
Here's a proposed plan: | ||
|
||
STEP #1 : Pre Move | ||
|
||
0. Update docs to mention the move, so people are aware that it's going on. | ||
1. Register an official GitHub project with the LLVM foundation. | ||
2. Set up another (read-only) mirror of llvm.org/git at this GitHub project, | ||
adding all necessary hooks to avoid broken history (merge, dates, pushes), as | ||
well as a webhook to update the umbrella project (see below). | ||
3. Make sure we have an llvm-project (with submodules) set up in the official | ||
account, with all necessary hooks (history, update, merges). | ||
4. Make sure bisecting with llvm-project works. | ||
5. Make sure no one has any other blocker. | ||
|
||
STEP #2 : Git Move | ||
|
||
6. Update the buildbots to pick up updates and commits from the official git | ||
repository. | ||
7. Update Phabricator to pick up commits from the official git repository. | ||
8. Tell people living downstream to pick up commits from the official git | ||
repository. | ||
9. Give things time to settle. We could play some games like disabling the SVN | ||
repository for a few hours on purpose so that people can test that their | ||
infrastructure has really become independent of the SVN repository. | ||
|
||
Until this point nothing has changed for developers; it will just | ||
boil down to a lot of work for buildbot and other infrastructure | ||
owners. | ||
|
||
Once all dependencies are cleared, and all problems have been solved: | ||
|
||
STEP #3 : Write Access Move | ||
|
||
10. Collect people's GitHub account information and add them to the project. | ||
11. Switch SVN repository to read-only and allow pushes to the GitHub repository. | ||
12. Mirror Git to SVN. | ||
|
||
STEP #4 : Post Move | ||
|
||
13. Archive the SVN repository, if GitHub's SVN is good enough. | ||
14. Review and update *all* LLVM documentation. | ||
15. Review website links pointing to viewvc/klaus/phab etc. to point to GitHub | ||
instead. |