Johannes S hacked the living daylights out of some import scripts and has an up-to-date php-src git mirror.
This is really awesome, not only because I absolutely love Git, but also because it makes it a LOT easier to get work done.
Example*, starting with a snapshot tarball:
mjc@325i:~/scratch$ time (curl -O http://snaps.php.net/php5.3-200812031930.tar.bz2 ; tar xjf php5.3-200812031930.tar.bz2 )
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 10.1M  100 10.1M    0     0   551k      0  0:00:18  0:00:18 --:--:--  570k
real 0m32.450s
user 0m4.844s
sys 0m0.820s
CVS:
mjc@325i:~/scratch$ time cvs -Q -d :pserver:cvsread@cvs.php.net:/repository checkout -r PHP_5_3 php5
real 0m56.865s
user 0m4.828s
sys 0m4.308s
Git, with nearly the same workflow (nothing extra to think about beyond learning the syntax):
mjc@325i:~/scratch$ time (git clone src/mysqlnd/php-src php-src; cd php-src; git checkout -b php-5.3 origin/PHP_5_3)
Initialized empty Git repository in /home/mjc/scratch/php-src/.git/
0 blocks
Checking out files: 100% (10636/10636), done.
Branch php-5.3 set up to track remote branch refs/remotes/origin/PHP_5_3.
Switched to a new branch "php-5.3"
real 0m8.655s
user 0m1.616s
sys 0m1.992s
* This test was performed on an Amazon EC2 small instance. Bandwidth may vary, but the instance is significantly faster than my local connection and disk drive.
Additionally, even with an empty cache, and even though git clone copies the ENTIRE history, the git clone plus checkout (8.7s) was only about five times slower than a primed-cache cp of the CVS checkout (1.8s)!
mjc@325i:~/scratch$ time cp -r php5 php5_some-other-work
real 0m1.802s
user 0m0.064s
sys 0m0.704s
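If you want to check the ratios between these timings yourself, here is a rough sketch in plain sh and awk. The helper names to_seconds and ratio are made up for this post, and it assumes the values are in the "0m8.655s" form that bash's time prints:

```shell
#!/bin/sh
# to_seconds: hypothetical helper, not part of any tool.
# Converts a `time`-style value such as "0m8.655s" into plain seconds.
to_seconds() {
    printf '%s\n' "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# ratio: hypothetical helper. Divides the first timing by the second
# and rounds the result to one decimal place.
ratio() {
    a=$(to_seconds "$1")
    b=$(to_seconds "$2")
    awk -v a="$a" -v b="$b" 'BEGIN { printf "%.1f\n", a / b }'
}

# "real" times taken from the transcripts above.
ratio 0m8.655s 0m1.802s     # git clone + checkout vs. primed-cache cp
ratio 0m56.865s 0m8.655s    # cvs checkout vs. git clone + checkout
ratio 0m32.450s 0m8.655s    # snapshot download + untar vs. git clone + checkout
```

These are single runs on one machine, so treat the ratios as ballpark figures rather than a benchmark.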
Note, though, that the real time savings start to show once I use the standard git workflow.
To test this, first let's see how big the project is, and double it:
mjc@325i:~/scratch/php-src$ git checkout origin/PHP_5_3
Note: moving to "origin/PHP_5_3" which isn't a local branch
If you want to create a new branch from this checkout, you may do so
(now or later) by using -b with the checkout command again. Example:
git checkout -b
HEAD is now at 0986ce9... fix possible invalid read
mjc@325i:~/scratch/php-src$ du -hc .cvsignore .gdbinit * | grep total
101M total
mjc@325i:~/scratch/php-src$ mkdir foo; cp -r .cvsignore .gdbinit * foo; du -hc .cvsignore .gdbinit * | grep total
cp: cannot copy a directory, `foo', into itself, `foo/foo'
cp: cannot copy a directory, `foo', into itself, `foo/foo'
cp: cannot copy a directory, `foo', into itself, `foo/foo'
207M total
Now, let's commit those changes and then switch to php6 (the master branch):
mjc@325i:~/scratch/php-src$ time (git add foo && git commit -qam 'double the size of the repo, this is a big diff' && git checkout master)
Previous HEAD position was de6576c... double the size of the repo, this is a big diff
Switched to branch "master"
real 0m9.907s
user 0m3.376s
sys 0m0.900s
Twice the size, and only about 1.25 seconds slower than the original clone and checkout (9.9s vs. 8.7s), with most of that time spent committing the changes.
Is that enough of a real-world, if marginally excessive, example, or what?
Also, the whole revision history for PHP is in .git. As of this writing it clocks in at 119MB (only 18MB more than the actual checked-out source) with 54,304 commits.
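For anyone who wants to check numbers like these on their own clone, something along these lines should work. This is a sketch, not necessarily how I got the figures above. repo_stats is a made-up helper name, and it assumes git and a POSIX du are on your PATH:

```shell
#!/bin/sh
# repo_stats: hypothetical helper. Prints the size of a repository's
# .git directory (in KB) and the number of commits reachable from any ref.
repo_stats() {
    repo=${1:-.}
    # Size of the stored history and metadata, in kilobytes.
    size_kb=$(du -sk "$repo/.git" | awk '{ print $1 }')
    # One line per commit reachable from any branch or tag, then count them.
    commits=$(cd "$repo" && git rev-list --all | wc -l | tr -d ' ')
    printf 'history: %s KB, commits: %s\n' "$size_kb" "$commits"
}
```

Run it as `repo_stats ~/scratch/php-src` (or with no argument from inside the clone). The exact size will depend on how well the repository has been packed.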
Feel free to let me know in the comments if I missed anything.