<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Subversion Design</title>
</head>
<body>
<div class="h1">
<h1 style="text-align: center">Subversion Design</h1>
</div>
<p class="warningmark"><em>NOTE: This document is out of date. The last
substantial update was in October 2002 (r3377). However, people often come
here for the section on the <a href="#server.fs.struct.bubble-up">directory
bubble-up method</a>, which is still accurate.</em></p>
<div class="h1">
<h2>Table of Contents</h2>
<ol id="toc">
<li><a href="#goals">Goals — The goals of the Subversion project</a>
<ol>
<li><a href="#goals.rename-remove-resurrect">Rename/removal/resurrection support</a></li>
<li><a href="#goals.textbinary">Text vs binary issues</a></li>
<li><a href="#goals.i18n">I18N/Multilingual support</a></li>
<li><a href="#goals.branching-and-tagging">Branching and tagging</a></li>
<li><a href="#goals.misc">Miscellaneous new behaviors</a>
<ol>
<li><a href="#goals.misc.logmsgs">Log messages</a></li>
<li><a href="#goals.misc.diffplugins">Client side diff plug-ins</a></li>
<li><a href="#goals.misc.merging">Better merging</a></li>
<li><a href="#goals.misc.conflicts">Conflict resolution</a></li>
</ol>
</li> <!-- goals.misc -->
</ol>
</li> <!-- goals -->
<li><a href="#model">Model — The versioning model used by Subversion</a>
<ol>
<li><a href="#model.wc-and-repos">Working Directories and Repositories</a></li>
<li><a href="#model.txns-and-revnums">Transactions and Revision Numbers</a></li>
<li><a href="#model.how-wc">How Working Directories Track the Repository</a></li>
<li><a href="#model.lock-merge">Locking vs. Merging - Two Paradigms of Co-operative
Developments</a></li>
<li><a href="#model.props">Properties</a></li>
<li><a href="#model.merging-and-ancestry">Merging and Ancestry</a></li>
</ol>
</li> <!-- model -->
<li><a href="#archi">Architecture — How Subversion's components work together</a>
<ol>
<li><a href="#archi.client">Client Layer</a></li>
<li><a href="#archi.network">Network Layer</a></li>
<li><a href="#archi.fs">Filesystem Layer</a></li>
</ol>
</li> <!-- archi -->
<li><a href="#deltas">Deltas — How to describe changes</a>
<ol>
<li><a href="#deltas.text">Text Deltas</a></li>
<li><a href="#deltas.prop">Property Deltas</a></li>
<li><a href="#deltas.tree">Tree Deltas</a></li>
<li><a href="#deltas.postfix-text">Postfix Text Deltas</a></li>
<li><a href="#deltas.serializing-via-editor">Serializing Deltas via the "Editor" Interface</a></li>
</ol>
</li> <!-- deltas -->
<li><a href="#client">Client — How the client works</a>
<ol>
<li><a href="#client.wc">Working copies and the working copy library</a>
<ol>
<li><a href="#client.wc.layout">The layout of working copies</a></li>
<li><a href="#client.wc.library">The working copy management library</a></li>
</ol>
</li> <!-- client.wc -->
<li><a href="#client.libsvn_ra">The repository access library</a></li>
<li><a href="#client.libsvn_client">The client operation library</a></li>
</ol>
</li> <!-- client -->
<li><a href="#protocol">Protocol — How the client and server communicate</a>
<ol>
<li><a href="#protocol.webdav">The HTTP/WebDAV/DeltaV based protocol</a></li>
<li><a href="#protocol.svn">The custom protocol</a></li>
</ol>
</li> <!-- protocol -->
<li><a href="#server">Server — How the server works</a>
<ol>
<li><a href="#server.fs">Filesystem</a>
<ol>
<li><a href="#server.fs.overview">Filesystem Overview</a></li>
<li><a href="#server.fs.api">API</a></li>
<li><a href="#server.fs.struct">Repository Structure</a>
<ol>
<li><a href="#server.fs.struct.schema">Schema</a></li>
<li><a href="#server.fs.struct.bubble-up">Bubble-Up Method</a></li>
<li><a href="#server.fs.struct.diffy-storage">Diffy Storage</a></li>
</ol>
</li> <!-- server.fs.struct -->
<li><a href="#server.fs.implementation">Implementation</a></li>
</ol>
</li> <!-- server.fs -->
<li><a href="#server.libsvn_repos">Repository Library</a></li>
</ol>
</li> <!-- server -->
<li><a href="#license">License — Copyright</a></li>
</ol>
</div>
<!--
================================================================
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
====================================================================
This software consists of voluntary contributions made by many
individuals on behalf of CollabNet.
-->
<div class="h2" id="goals" title="#goals">
<h2>Goals — The goals of the Subversion project</h2>
<p>The goal of the Subversion project is to write a version control
system that takes over CVS's current and future user base.
(If you're not familiar with CVS or its shortcomings,
skip to the <a href="#model">Model</a> chapter.)
The first release
has all the major features of CVS, plus certain new features that CVS
users often wish they had. In general, Subversion works like CVS, except
where there's a compelling reason to be different.</p>
<p>So what does Subversion have that CVS doesn't?</p>
<ul>
<li><p>It versions directories, file metadata, renames, copies
and removals/resurrections. In other words, Subversion records the
changes users make to directory trees, not just changes to file
contents.</p></li>
<li><p>Tagging and branching are constant-time and
constant-space.</p></li>
<li><p>It is natively client-server, hence much more
maintainable than CVS. (In CVS, the client-server protocol was added
as an afterthought. This means that most new features have to be
implemented twice, or at least more than once: code for the local
case, and code for the client-server case.)</p></li>
<li><p>The repository is organized efficiently and
comprehensibly. (Without going into too much detail, let's just say
that CVS's repository structure is showing its
age.)</p></li>
<li><p>Commits are atomic. Each commit results in a single
revision number, which refers to the state of the entire tree. Files
no longer have their own revision numbers.</p></li>
<li><p>The locking scheme is only as strict as absolutely
necessary. Reads are never locked, and writes lock only the files
being written, for only as long as needed.</p></li>
<li><p>It has internationalization support.</p></li>
<li><p>It handles binary files gracefully (experience has shown
that CVS's binary file handling is prone to user
error).</p></li>
<li><p>It takes advantage of the Net's experience with CVS by
choosing better default behaviors for certain
situations.</p></li>
</ul>
<p>Some of these advantages are clear and require no further discussion.
Others are not so obvious, and are explained in greater detail
below.</p>
<div class="h3" id="goals.rename-remove-resurrect" title="#goals.rename-remove-resurrect">
<h3>Rename/removal/resurrection support</h3>
<p>Full rename support means you can trace through ancestry by name
<em>or</em> by entity. For example, if you say "Give me
revision 12 of foo.c", do you mean revision 12 of the file whose name is
<em>now</em> foo.c (but perhaps it was named bar.c back at
revision 12), or the file whose name was foo.c in revision 12 (perhaps
that file no longer exists, or has a different name now)? In Subversion,
both interpretations are available to the user.</p>
<p>(Note: we've not yet implemented this, but it wouldn't be too hard.
People are advocating switches to 'svn log' that cause history to be
traced backwards either by entity or by path.)</p>
</div> <!-- goals.rename-remove-resurrect (h3) -->
<div class="h3" id="goals.textbinary" title="#goals.textbinary">
<h3>Text vs binary issues</h3>
<p>Historically, binary files have been problematic in CVS for two
unrelated reasons: keyword expansion, and line-end conversion.</p>
<ul>
<li><p><strong class="firstterm">Keyword expansion</strong> is when CVS
expands "$Revision$" into "$Revision: 1.1 $", for example. There
are a number of keywords in CVS: "$Author: sussman $", "$Date:
2001/06/04 22:00:52 $", and so on.</p></li>
<li><p><strong class="firstterm">Line-end conversion</strong> is when CVS
gives plaintext files the appropriate line-ending conventions for the
working copy's platform. For example, Unix working copies use LF, but
Windows working copies use CRLF. (Like CVS, the Subversion
repository stores text files in Unix LF format).</p></li>
</ul>
<p>Both keyword substitution and line-end conversion are sensible only
for plain text files. CVS only recognizes two file types anyway:
plaintext and binary. And CVS assumes files are plain text unless you
tell it otherwise.</p>
<p>Subversion recognizes the same two types. The question is, how does
it determine a file's type? Experience with CVS suggests that assuming
text unless told otherwise is a losing strategy – people frequently
forget to mark images and other opaque formats as binary, then later they
wonder why CVS mangled their data. So Subversion will not mangle data:
when moving over the network, or when being stored in the repository, it
treats all files as binary. In the working copy, a tweakable meta-data
property indicates whether to treat the file as text or binary when
deciding whether to allow contextual merging during
updates.</p>
<p>Users can turn line-end conversion on or off per file by tweaking
meta-data. Files do <em>not</em> undergo keyword
substitution by default, on the theory that if someone wants substitution
and isn't getting it, they'll look in the manual; but if they are getting
it and didn't want it, they might just be confused and not know what to
do. Users can turn substitution on or off per file.</p>
<p>Both of these changes are done on the client side; the repository
does not even know about them.</p>
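<p>As a rough illustration of the two client-side translations described
above, the following Python sketch expands a keyword and converts line
endings on the way into a working copy. The function names and the exact
expanded form are assumptions made for this example; they are not
Subversion's actual implementation.</p>

```python
import re

def expand_keyword(text: str, keyword: str, value: str) -> str:
    """Replace $Keyword$ or $Keyword: old $ with $Keyword: value $."""
    pattern = re.compile(r"\$" + re.escape(keyword) + r"(?::[^$\n]*)?\$")
    return pattern.sub(lambda _m: "$" + keyword + ": " + value + " $", text)

def to_working_copy_eol(text: str, eol: str) -> str:
    """Convert LF line endings to the working copy's line-end style."""
    return text.replace("\r\n", "\n").replace("\n", eol)

def to_repository_eol(text: str) -> str:
    """Convert CRLF line endings back to LF before sending to the repository."""
    return text.replace("\r\n", "\n")

# Hypothetical file contents, translated for a Windows working copy.
repo_text = "/* $Revision$ */\nint main(void) { return 0; }\n"
wc_text = to_working_copy_eol(expand_keyword(repo_text, "Revision", "5"), "\r\n")
```

<p>Because both functions run only on the client, the repository in this
sketch never sees anything but the untranslated text.</p>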
</div> <!-- goals.textbinary (h3) -->
<div class="h3" id="goals.i18n" title="#goals.i18n">
<h3>I18N/Multilingual support</h3>
<p>Subversion is internationalized – commands, user messages, and
errors can be customized to the appropriate human language at build-time
(or run time, if that's not much harder).</p>
<p>File names and contents may be multilingual; Subversion does not
assume an ASCII-only universe. For purposes of keyword expansion and
line-end conversion, Subversion also understands the UTF-* encodings (but
not necessarily all of them by the first release).</p>
</div> <!-- goals.i18n (h3) -->
<div class="h3" id="goals.branching-and-tagging" title="#goals.branching-and-tagging">
<h3>Branching and tagging</h3>
<p>Subversion supports branching and tagging with one efficient
operation: `clone'. To clone a tree is to copy it, to create another
tree exactly like it (except that the new tree knows its ancestry
relationship to the old one).</p>
<p>At the moment of creation, a clone requires only a small, constant
amount of space in the repository – most of its storage is shared
with the original tree. If you never commit anything on the clone, then
it's just like a CVS tag. If you start committing on it, then it's a
branch. Voila! This also implies CVS's "vendor branching" feature,
since Subversion has real rename and directory support.</p>
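<p>The constant-space claim can be pictured with a small Python sketch in
which a clone is simply a second directory entry referring to the same
node, plus a record of where it came from. The <tt class="literal">Node</tt>
class and <tt class="literal">clone</tt> function are hypothetical names
used only for illustration; a real filesystem would also need to copy
nodes lazily once someone commits on the clone.</p>

```python
from dataclasses import dataclass, field
from typing import Dict

@dataclass
class Node:
    """A file or directory; directories map entry names to child nodes."""
    content: bytes = b""
    children: Dict[str, "Node"] = field(default_factory=dict)
    # Ancestry record: clone entry name mapped to the entry it was cloned from.
    copied_from: Dict[str, str] = field(default_factory=dict)

def clone(parent: Node, source_name: str, clone_name: str) -> None:
    """Add one entry that shares the whole source subtree; nothing is copied."""
    parent.children[clone_name] = parent.children[source_name]
    parent.copied_from[clone_name] = source_name

trunk = Node(children={"search.c": Node(content=b"int main(void);")})
root = Node(children={"trunk": trunk})
clone(root, "trunk", "release-1.0")   # behaves like a tag until someone commits on it
```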
</div> <!-- goals.branching-and-tagging (h3) -->
<div class="h3" id="goals.misc" title="#goals.misc">
<h3>Miscellaneous new behaviors</h3>
<div class="h4" id="goals.misc.logmsgs" title="#goals.misc.logmsgs">
<h4>Log messages</h4>
<p>Subversion has a flexible log message policy (a small matter, but
one dear to our hearts).</p>
<p>Log messages should be a matter of project policy, not version
control software policy. If a user commits with no log message, then
Subversion defaults to an empty message. (CVS tries to require log
messages, but fails: we've all seen empty log messages in CVS, where
the user committed with deliberately empty quotes. Let's stop the
madness now.)</p>
</div> <!-- goals.misc.logmsgs (h4) -->
<div class="h4" id="goals.misc.diffplugins" title="#goals.misc.diffplugins">
<h4>Client side diff plug-ins</h4>
<p>Subversion supports client-side plug-in diff programs.</p>
<p>There is no need for Subversion to have every possible diff
mechanism built in. It can invoke a user-specified client-side diff
program on the two revisions of the file(s) locally.</p>
<p>(Note: This feature does not exist yet, but is planned for
post-1.0.)</p>
</div> <!-- goals.misc.diffplugins (h4) -->
<div class="h4" id="goals.misc.merging" title="#goals.misc.merging">
<h4>Better merging</h4>
<p>Subversion remembers what has already been merged in and what
hasn't, thereby avoiding the problem, familiar to CVS users, of
spurious conflicts on repeated merges.</p>
<p>(Note: Parts of this feature (<a href="/merge-tracking/">Merge
Tracking</a>) are implemented in Subversion 1.5; see
the <a href="svn_1.5_releasenotes.html#merge-tracking"
>release notes</a>.)</p>
<p>For details, see <a href="#model.merging-and-ancestry">Merging and Ancestry</a>.</p>
</div> <!-- goals.misc.merging (h4) -->
<div class="h4" id="goals.misc.conflicts" title="#goals.misc.conflicts">
<h4>Conflict resolution</h4>
<p>For text files, Subversion resolves conflicts similarly to CVS, by
folding repository changes into the working files with conflict
markers. But, for <em>both</em> text and binary files,
Subversion also always puts the old and new pristine repository
revisions into temporary files, and the pristine working copy revision
in another temporary file.</p>
<p>Thus, for any conflict, the user has four files readily at
hand:</p>
<ol>
<li><p>the original working copy file with local
mods</p></li>
<li><p>the older repository file</p></li>
<li><p>the newest repository file</p></li>
<li><p>the merged file, with conflict
markers</p></li>
</ol>
<p>and in a binary file conflict, the user has all but the
last.</p>
<p>When the conflict has been resolved and the working copy is
committed, Subversion automatically removes the temporary pristine
files.</p>
<p>A more general solution would allow plug-in merge resolution tools
on the client side, but this is not scheduled for the first release.
Note that users can use their own merge tools anyway, since all the
original files are available.</p>
</div> <!-- goals.misc.conflicts (h4) -->
</div> <!-- goals.misc (h3) -->
</div> <!-- goals (h2) -->
<div class="h2" id="model" title="#model">
<h2>Model — The versioning model used by Subversion</h2>
<p>This chapter explains the user's view of Subversion — what
“objects” you interact with, how they behave, and how they
relate to each other.</p>
<div class="h3" id="model.wc-and-repos" title="#model.wc-and-repos">
<h3>Working Directories and Repositories</h3>
<p>Suppose you are using Subversion to manage a software project. There
are two things you will interact with: your working directory, and the
repository.</p>
<p>Your <strong class="firstterm">working directory</strong> is an ordinary
directory tree, on your local system, containing your project's sources.
You can edit these files and compile your program from them in the usual
way. Your working directory is your own private work area: Subversion
never changes the files in your working directory, or publishes the
changes you make there, until you explicitly tell it to do so.</p>
<p>After you've made some changes to the files in your working
directory, and verified that they work properly, Subversion provides
commands to publish your changes to the other people working with you on
your project. If they publish their own changes, Subversion provides
commands to incorporate those changes into your working directory.</p>
<p>A working directory contains some extra files, created and maintained
by Subversion, to help it carry out these commands. In particular, these
files help Subversion recognize which files contain unpublished changes,
and which files are out-of-date with respect to others' work.</p>
<p>While your working directory is for your use alone, the
<strong class="firstterm">repository</strong> is the common public record you share
with everyone else working on the project. To publish your changes, you
use Subversion to put them in the repository. (What this means, exactly,
we explain below.) Once your changes are in the repository, others can
tell Subversion to incorporate your changes into their working
directories. In a collaborative environment like this, each user will
typically have their own working directory (or perhaps more than one),
and all the working directories will be backed by a single repository,
shared amongst all the users.</p>
<p>A Subversion repository holds a single directory tree, and records
the history of changes to that tree. The repository retains enough
information to recreate any prior state of the tree, compute the
differences between any two prior trees, and report the relations between
files in the tree — which files are derived from which other
files.</p>
<p>A Subversion repository can hold the source code for several
projects; usually, each project is a subdirectory in the tree. In this
arrangement, a working directory will usually correspond to a particular
subtree of the repository.</p>
<p>For example, suppose you have a repository laid out like this:</p>
<pre>
/trunk/paint/Makefile
             canvas.c
             brush.c
       write/Makefile
             document.c
             search.c
</pre>
<p>In other words, the repository's root directory has a single
subdirectory named <tt class="filename">trunk</tt>, which itself contains two
subdirectories: <tt class="filename">paint</tt> and
<tt class="filename">write</tt>.</p>
<p>To get a working directory, you must <strong class="firstterm">check out</strong>
some subtree of the repository. If you check out
<tt class="filename">/trunk/write</tt>, you will get a working directory like
this:</p>
<pre>
write/Makefile
      document.c
      search.c
      .svn/
</pre>
<p>This working directory is a copy of the repository's
<tt class="filename">/trunk/write</tt> directory, with one additional entry
— <tt class="filename">.svn</tt> — which holds the extra
information needed by Subversion, as mentioned above.</p>
<p>Suppose you make changes to <tt class="filename">search.c</tt>. Since the
<tt class="filename">.svn</tt> directory remembers the file's modification
date and original contents, Subversion can tell that you've changed the
file. However, Subversion does not make your changes public until you
explicitly tell it to.</p>
<p>To publish your changes, you can use Subversion's
‘<tt class="literal">commit</tt>’ command:</p>
<pre>
$ pwd
/home/jimb/write
$ ls -a
.svn/ Makefile document.c search.c
$ svn commit search.c
$
</pre>
<p>Now your changes to <tt class="filename">search.c</tt> have been committed
to the repository; if another user checks out a working copy of
<tt class="filename">/trunk/write</tt>, they will see your text.</p>
<p>Suppose you have a collaborator, Felix, who checked out a working
directory of <tt class="filename">/trunk/write</tt> at the same time you did.
When you commit your change to <tt class="filename">search.c</tt>, Felix's
working copy is left unchanged; Subversion only modifies working
directories at the user's request.</p>
<p>To bring his working directory up to date, Felix can use the
Subversion ‘<tt class="literal">update</tt>’ command. This will
incorporate your changes into his working directory, as well as any
others that have been committed since he checked it out.</p>
<pre>
$ pwd
/home/felix/write
$ ls -a
.svn/ Makefile document.c search.c
$ svn update
U search.c
$
</pre>
<p>The output from the ‘<tt class="literal">svn update</tt>’
command indicates that Subversion updated the contents of
<tt class="filename">search.c</tt>. Note that Felix didn't need to specify
which files to update; Subversion uses the information in the
<tt class="filename">.svn</tt> directory, and further information in the
repository, to decide which files need to be brought up to date.</p>
<p>We explain below what happens when both you and Felix make changes to
the same file.</p>
</div> <!-- model.wc-and-repos (h3) -->
<div class="h3" id="model.txns-and-revnums" title="#model.txns-and-revnums">
<h3>Transactions and Revision Numbers</h3>
<p>A Subversion ‘<tt class="literal">commit</tt>’ operation can
publish changes to any number of files and directories as a single atomic
transaction. In your working directory, you can change files' contents,
create, delete, rename and copy files and directories, and then commit
the completed set of changes as a unit.</p>
<p>In the repository, each commit is treated as an atomic transaction:
either all the commit's changes take place, or none of them take place.
Subversion tries to retain this atomicity in the face of program crashes,
system crashes, network problems, and other users' actions. We may call
a commit a <strong class="firstterm">transaction</strong> when we want to emphasize
its indivisible nature.</p>
<p>Each time the repository accepts a transaction, this creates a new
state of the tree, called a <strong class="firstterm">revision</strong>. Each
revision is assigned a unique natural number, one greater than the number
of the previous revision. The initial revision of a freshly created
repository is numbered zero, and consists of an empty root
directory.</p>
<p>Since each transaction creates a new revision, with its own number,
we can also use these numbers to refer to transactions; transaction
<em class="replaceable">n</em> is the transaction which created revision
<em class="replaceable">n</em>. There is no transaction numbered
zero.</p>
<p>Unlike those of many other systems, Subversion's revision numbers
apply to an entire tree, not individual files. Each revision number
selects an entire tree.</p>
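<p>The relationship between transactions and revision numbers can be
sketched in Python as follows. This is a toy model under stated
assumptions: a tree is reduced to a mapping from paths to contents, and
the <tt class="literal">Repository</tt> class and its methods are
hypothetical names, not part of any Subversion API.</p>

```python
from typing import Dict, List, Optional

Tree = Dict[str, bytes]  # path mapped to file contents; stands in for a directory tree

class Repository:
    def __init__(self) -> None:
        # Revision 0 of a freshly created repository is an empty tree.
        self.revisions: List[Tree] = [{}]

    def youngest(self) -> int:
        return len(self.revisions) - 1

    def commit(self, changes: Dict[str, Optional[bytes]]) -> int:
        """Apply every change or none of them; return the new revision number."""
        new_tree = dict(self.revisions[-1])
        for path, content in changes.items():
            if content is None:
                del new_tree[path]   # KeyError here aborts the whole transaction
            else:
                new_tree[path] = content
        # The new tree becomes visible only after all changes have applied.
        self.revisions.append(new_tree)
        return self.youngest()

repo = Repository()
first = repo.commit({"write/search.c": b"v1", "write/Makefile": b"all:\n"})
try:
    # The deletion fails, so the accompanying edit is not published either.
    repo.commit({"write/search.c": b"v2", "write/missing.c": None})
except KeyError:
    pass
```

<p>Each revision number in this model selects a whole tree, and a failed
transaction leaves the youngest revision untouched.</p>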
<p>It's important to note that working directories do not always
correspond to any single revision in the repository; they may contain
files from several different revisions. For example, suppose you check
out a working directory from a repository whose most recent revision is
4:</p>
<pre>
write/Makefile:4
      document.c:4
      search.c:4
</pre>
<p>At the moment, this working directory corresponds exactly to revision
4 in the repository. However, suppose you make a change to
<tt class="filename">search.c</tt>, and commit that change. Assuming no other
commits have taken place, your commit will create revision 5 of the
repository, and your working directory will look like this:</p>
<pre>
write/Makefile:4
      document.c:4
      search.c:5
</pre>
<p>Suppose that, at this point, Felix commits a change to
<tt class="filename">document.c</tt>, creating revision 6. If you use
‘<tt class="literal">svn update</tt>’ to bring your working
directory up to date, then it will look like this:</p>
<pre>
write/Makefile:6
      document.c:6
      search.c:6
</pre>
<p>Felix's changes to <tt class="filename">document.c</tt> will appear in
your working copy of that file, and your change will still be present in
<tt class="filename">search.c</tt>. In this example, the text of
<tt class="filename">Makefile</tt> is identical in revisions 4, 5, and 6, but
Subversion will mark your working copy with revision 6 to indicate that
it is still current. So, after you do a clean update at the root of your
working directory, your working directory will generally correspond
exactly to some revision in the repository.</p>
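<p>The example above can be replayed with a small Python sketch that
tracks only the revision recorded for each file. The helper names are
made up for this illustration and do not correspond to Subversion
functions.</p>

```python
from typing import Dict, Iterable

def after_commit(wc: Dict[str, int], committed: Iterable[str], new_rev: int) -> Dict[str, int]:
    """Only the files that were committed move to the new revision."""
    result = dict(wc)
    for path in committed:
        result[path] = new_rev
    return result

def after_update(wc: Dict[str, int], head_rev: int) -> Dict[str, int]:
    """A clean update at the root marks every file with the head revision."""
    return {path: head_rev for path in wc}

checked_out = {"Makefile": 4, "document.c": 4, "search.c": 4}
mixed = after_commit(checked_out, ["search.c"], 5)   # Makefile and document.c stay at 4
updated = after_update(mixed, 6)                     # uniform again after Felix's revision 6
```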
</div> <!-- model.txns-and-revnums (h3) -->
<div class="h3" id="model.how-wc" title="#model.how-wc">
<h3>How Working Directories Track the Repository</h3>
<p>For each file in a working directory, Subversion records two
essential pieces of information:</p>
<ul>
<li><p>what revision of what repository file your working copy
is based on (this is called the file's <strong class="firstterm">base
revision</strong>), and</p></li>
<li><p>a timestamp recording when the local copy was last
updated.</p></li>
</ul>
<p>Given this information, by talking to the repository, Subversion can
tell which of the following four states a file is in:</p>
<ul>
<li><p><strong>Unchanged, and current</strong>.
The file is unchanged in the working directory, and no changes to that
file have been committed to the repository since its base
revision.</p></li>
<li><p><strong>Locally changed, and
current</strong>. The file has been changed in the working
directory, and no changes to that file have been committed to the
repository since its base revision. There are local changes that have
not been committed to the repository.</p></li>
<li><p><strong>Unchanged, and
out-of-date</strong>. The file has not been changed in
the working directory, but it has been changed in the repository. The
file should eventually be updated, to make it current with the
public revision.</p></li>
<li><p><strong>Locally changed, and
out-of-date</strong>. The file has been changed both in the
working directory, and in the repository. The file should be updated;
Subversion will attempt to merge the public changes with the local
changes. If it can't complete the merge in a plausible
way automatically, Subversion leaves it to the user to resolve the
conflict.</p></li>
</ul>
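<p>The four states listed above amount to a two-by-two classification,
which the following Python sketch makes explicit. It assumes the client
already knows whether the file was edited locally and has asked the
repository for the latest revision in which the file changed; the
function and parameter names are hypothetical.</p>

```python
def file_state(locally_changed: bool, base_rev: int, last_changed_rev: int) -> str:
    """Classify a working file into one of the four states.

    last_changed_rev is the latest repository revision in which the file
    changed; the file is out of date when that is newer than base_rev.
    """
    out_of_date = last_changed_rev > base_rev
    if not locally_changed and not out_of_date:
        return "unchanged, current"
    if locally_changed and not out_of_date:
        return "locally changed, current"
    if not locally_changed:
        return "unchanged, out-of-date"
    return "locally changed, out-of-date"
```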
</div> <!-- model.how-wc (h3) -->
<div class="h3" id="model.lock-merge" title="#model.lock-merge">
<h3>Locking vs. Merging - Two Paradigms of Co-operative
Developments</h3>
<p>By default, Subversion prefers the “merging” method of
handling simultaneous editing by multiple users. This means that
Subversion does not prevent two users from making changes to the same
file at the same time. For example, if both you and Felix have checked
out working directories of <tt class="filename">/trunk/write</tt>, Subversion
will allow both of you to change <tt class="filename">write/search.c</tt> in
your working directories. Then, the following sequence of events will
occur:</p>
<ul>
<li><p>Suppose Felix tries to commit his changes to
<tt class="filename">search.c</tt> first. His commit will succeed, and
his text will appear in the latest revision in the
repository.</p></li>
<li><p>When you attempt to commit your changes to
<tt class="filename">search.c</tt>, Subversion will reject your commit,
and tell you that you must update <tt class="filename">search.c</tt> before
you can commit it.</p></li>
<li><p>When you update <tt class="filename">search.c</tt>, Subversion
will try to merge Felix's changes from the repository with your local
changes. By default, Subversion merges as if it were applying a
patch: if your local changes do not overlap textually with Felix's,
then all is well; otherwise, Subversion leaves it to you to resolve
the overlapping changes. In either case, Subversion carefully
preserves a copy of the original pre-merge text.</p></li>
<li><p>Once you have verified that Felix's changes and your
changes have been merged correctly, you can commit the new revision
of <tt class="filename">search.c</tt>, which now contains everyone's
changes.</p></li>
</ul>
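<p>The sequence of events above can be sketched in Python. This is a
deliberately simplified model: it tracks a single file with its own
change counter rather than repository-wide revision numbers, the merge
result is written by hand instead of being computed, and all names are
hypothetical.</p>

```python
class OutOfDate(Exception):
    """Raised when a commit is based on a stale version of the file."""

class FileHistory:
    def __init__(self, text: str) -> None:
        self.texts = [text]   # index i holds the text after change i

    @property
    def latest(self) -> int:
        return len(self.texts) - 1

    def commit(self, base: int, new_text: str) -> int:
        """Accept the change only if it was made against the latest text."""
        if base != self.latest:
            raise OutOfDate("update before committing")
        self.texts.append(new_text)
        return self.latest

history = FileHistory("a\nb\n")
felix_rev = history.commit(0, "a\nb\nfelix\n")   # Felix commits first and succeeds
try:
    history.commit(0, "you\na\nb\n")             # your commit is based on stale text
    rejected = False
except OutOfDate:
    rejected = True
# After an update, the two non-overlapping changes are merged (by hand here).
merged = "you\na\nb\nfelix\n"
your_rev = history.commit(history.latest, merged)
```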
<p>Some version control systems provide “locks”, which
prevent others from changing a file once one person has begun working on
it. In our experience, merging is preferable to locks, because:</p>
<ul>
<li><p>changes usually do not conflict, so Subversion's behavior
does the right thing by default, while locking can interfere with
legitimate work;</p></li>
<li><p>locking can prevent conflicts within a file, but not
conflicts between files (say, between a C header file and another
file that includes it), so it doesn't really solve the problem; and
finally,</p></li>
<li><p>people often forget that they are holding locks,
resulting in unnecessary delays and friction.</p></li>
</ul>
<p>Of course, some kinds of files with rigid formats, like images or
executables, are simply not mergeable. To support this, Subversion
allows users to customize its merging behavior on a per-file basis.
Firstly, you can direct Subversion to refuse to merge changes to certain
files, and simply present you with the two original texts to choose from.
Secondly, in Subversion 1.2 and later, support for the
“locking” method of working is also available, and individual
files can be designated as requiring locking.</p>
<p>(In the future, you may be able to direct Subversion to merge using a
tool which respects the semantics of specific complex file
formats.)</p>
</div> <!-- model.lock-merge (h3) -->
<div class="h3" id="model.props" title="#model.props">
<h3>Properties</h3>
<p>Files generally have interesting attributes beyond their contents:
mime-types, executable permissions, EOL styles, and so on. Subversion
attempts to preserve these attributes, or at least record them, when
doing so would be meaningful. However, different operating systems
support very different sets of file attributes: Windows NT supports
access control lists, while Linux provides only the simpler traditional
Unix permission bits.</p>
<p>In order to interoperate well with clients on many different
operating systems, Subversion supports <strong class="firstterm">property
lists</strong>, a simple, general-purpose mechanism which clients
can use to store arbitrary out-of-band information about files.</p>
<p>A property list is a set of name / value pairs. A property name is
an arbitrary text string, expressed as a Unicode UTF-8 string,
canonically decomposed and ordered. A property value is an arbitrary
string of bytes. Property values may be of any size, but Subversion may
not handle very large property values efficiently. No two properties in
a given property list may have the same name. Although the word ‘list’
usually denotes an ordered sequence, there is no fixed order to the
properties in a property list; the term ‘property list’ is
historical.</p>
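<p>As an illustration of these rules, here is a small Python sketch of
an in-memory property list. It is not Subversion's API:
<tt class="literal">PropertyList</tt> and its methods are hypothetical,
and the sketch assumes that “canonically decomposed and
ordered” corresponds to Unicode normalization form NFD.</p>
<pre>
```python
import unicodedata


class PropertyList:
    """Hypothetical in-memory property list; not Subversion's real API."""

    def __init__(self):
        self._props = {}                 # name (str) -> value (bytes)

    def set(self, name, value):
        if not isinstance(value, bytes):
            raise TypeError("property values are byte strings")
        # Canonically decompose the name (NFD, an assumption) so that
        # equivalent spellings of the same name map to a single entry.
        self._props[unicodedata.normalize("NFD", name)] = value

    def get(self, name):
        return self._props[unicodedata.normalize("NFD", name)]

    def names(self):
        # No meaningful order is promised; sorting is only for display.
        return sorted(self._props)


props = PropertyList()
props.set("caf\u00e9", b"\x00\x01")      # precomposed e-acute
props.set("cafe\u0301", b"\x02")         # decomposed form, same name
```
</pre>
<p>Because both spellings normalize to the same name, the second call
replaces the first value instead of creating a second property.</p>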
<p>Each revision number, file, directory, and directory entry in the
Subversion repository has its own property list. Subversion puts these
property lists to several uses:</p>
<ul>
<li><p>Clients can use properties to store file attributes, as
described above.</p></li>
<li><p>The Subversion server uses properties to hold attributes
of its own, and allows clients to read and modify them. For example,
someday a hypothetical ‘<tt class="literal">svn-acl</tt>’
property might hold an access control list which the Subversion server
uses to regulate access to repository files.</p></li>
<li><p>Users can invent properties of their own, to store
arbitrary information for use by scripts, build environments, and so
on. Names of user properties should be URIs, to avoid conflicts
between organizations.</p></li>
</ul>
<p>Property lists are versioned, just like file contents. You can
change properties in your working directory, but those changes are not
visible in the repository until you commit your local changes. If you do
commit a change to a property value, other users will see your change
when they update their working directories.</p>
</div> <!-- model.props (h3) -->
<div class="h3" id="model.merging-and-ancestry" title="#model.merging-and-ancestry">
<h3>Merging and Ancestry</h3>
<p>[WARNING: this section was written in May 2000, at the very
beginning of the Subversion project. This functionality probably will
not exist in Subversion 1.0, but it's planned for post-1.0. The problem
should be reasonably solvable by recording merge data in
'properties'.]</p>
<p>Subversion defines merges the same way CVS does: to merge means to
take a set of previously committed changes and apply them, as a patch, to
a working copy. This change can then be committed, like any other
change. (In Subversion's case, the patch may include changes to
directory trees, not just file contents.)</p>
<p>As defined thus far, merging is equivalent to hand-editing the
working copy into the same state as would result from the patch
application. In fact, in CVS there <em>is</em> no difference
– it is equivalent to just editing the files, and there is no
record of which ancestors these particular changes came from.
Unfortunately, this leads to conflicts when users unintentionally merge
the same changes again. (Experienced CVS users avoid this problem by
using branch- and merge-point tags, but that involves a lot of unwieldy
bookkeeping.)</p>
<p>In Subversion, merges are remembered by recording <strong class="firstterm">ancestry
sets</strong>. A revision's ancestry set is the set of all changes
"accounted for" in that revision. By maintaining ancestry sets, and
consulting them when doing merges, Subversion can detect when it would
apply the same patch twice, and spare users much bookkeeping. Ancestry
sets are stored as properties.</p>
<p>In the examples below, bear in mind that revision numbers usually
refer to changes, rather than the full contents of that revision. For
example, "the change A:4" means "the delta that resulted in A:4", not
"the full contents of A:4".</p>
<p>The simplest ancestry sets are associated with linear histories. For
example, here's the history of a file A:</p>
<pre>
 _____        _____        _____        _____        _____
|     |      |     |      |     |      |     |      |     |
| A:1 |----->| A:2 |----->| A:3 |----->| A:4 |----->| A:5 |
|_____|      |_____|      |_____|      |_____|      |_____|
</pre>
<p>The ancestry set of A:5 is:</p>
<pre>
{ A:1, A:2, A:3, A:4, A:5 }
</pre>
<p>That is, it includes the change that brought A from nothing to A:1,
the change from A:1 to A:2, and so on to A:5. From now on, ranges like
this will be represented with a more compact notation:</p>
<pre>
{ A:1-5 }
</pre>
<p>Now assume there's a branch B based, or "rooted", at A:2. (This
postulates an entirely different revision history, of course, and the
global revision numbers in the diagrams will change to reflect it.)
Here's what the project looks like with the branch:</p>
<pre>
 _____        _____        _____        _____        _____        _____
|     |      |     |      |     |      |     |      |     |      |     |
| A:1 |----->| A:2 |----->| A:4 |----->| A:6 |----->| A:8 |----->| A:9 |
|_____|      |_____|      |_____|      |_____|      |_____|      |_____|
                 \
                  \
                   \  _____        _____        _____
                    \|     |      |     |      |     |
                     | B:3 |----->| B:5 |----->| B:7 |
                     |_____|      |_____|      |_____|
</pre>
<p>If we produce A:9 by merging the B branch back into the
trunk</p>
<pre>
 _____        _____        _____        _____        _____        _____
|     |      |     |      |     |      |     |      |     |      |     |
| A:1 |----->| A:2 |----->| A:4 |----->| A:6 |----->| A:8 |---.->| A:9 |
|_____|      |_____|      |_____|      |_____|      |_____|  /   |_____|
                 \                                           |
                  \                                          |
                   \  _____        _____        _____        /
                    \|     |      |     |      |     |      /
                     | B:3 |----->| B:5 |----->| B:7 |--->-'
                     |_____|      |_____|      |_____|
</pre>
<p>then what will A:9's ancestry set be?</p>
<pre>
{ A:1, A:2, A:4, A:6, A:8, A:9, B:3, B:5, B:7}
</pre>
<p>or more compactly:</p>
<pre>
{ A:1-9, B:3-7 }
</pre>
<p>(It's all right that each file's ranges seem to include non-changes;
this is just a notational convenience, and you can think of the
non-changes as either not being included, or being included but being
null deltas as far as that file is concerned.)</p>
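<p>The compact notation can be produced mechanically. The following
Python sketch is one possible way to do it, under stated assumptions: an
ancestry set is modelled as a set of (line, revision) pairs, and the
hypothetical <tt class="literal">history</tt> argument lists, for each
line, the revisions in which that line changed. Neither the function
name nor this representation comes from Subversion itself.</p>
<pre>
```python
def compact(ancestry, history):
    """Format a set of (line, revision) changes in range notation.

    history maps each line name to the ordered list of revisions in
    which that line changed.  Runs of changes that are adjacent in
    this list are collapsed to "low-high".
    """
    parts = []
    for line in sorted(history):
        run = []                             # current run of adjacent changes
        for rev in history[line] + [None]:   # None flushes the last run
            if rev is not None and (line, rev) in ancestry:
                run.append(rev)
                continue
            if len(run) == 1:
                parts.append(f"{line}:{run[0]}")
            elif run:
                parts.append(f"{line}:{run[0]}-{run[-1]}")
            run = []
    return "{ " + ", ".join(parts) + " }"


history = {"A": [1, 2, 4, 6, 8, 9], "B": [3, 5, 7]}
a9 = {(line, rev) for line, revs in history.items() for rev in revs}
print(compact(a9, history))                  # prints { A:1-9, B:3-7 }
```
</pre>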
<p>All changes along the B line are accounted for (changes B:3-7), and
so are all changes along the A line, including both the merge and any
non-merge-related edits made before the commit.</p>
<p>Although this merge happened to include all the branch changes, that
needn't be the case. For example, the next time we merge the B
line</p>
<pre>
 _____     _____     _____     _____     _____      _____      _____
|     |   |     |   |     |   |     |   |     |    |     |    |     |
| A:1 |-->| A:2 |-->| A:4 |-->| A:6 |-->| A:8 |-.->| A:9 |-.->|A:11 |
|_____|   |_____|   |_____|   |_____|   |_____| |  |_____| |  |_____|
             \                                 /           |
              \                               /            |
               \  _____     _____     _____  /  _____      |
                \|     |   |     |   |     |/  |     |    /
                 | B:3 |-->| B:5 |-->| B:7 |-->|B:10 |->-'
                 |_____|   |_____|   |_____|   |_____|
</pre>
<p>Subversion will know that A's ancestry set already contains B:3-7, so
only the difference between B:7 and B:10 will be applied. A's new
ancestry will be</p>
<pre>
{ A:1-11, B:3-10 }
</pre>
<p>But why limit ourselves to contiguous ranges? An ancestry set is
truly a set – it can be any subset of the changes available:</p>
<pre>
 _____        _____        _____        _____        _____        _____
|     |      |     |      |     |      |     |      |     |      |     |
| A:1 |----->| A:2 |----->| A:4 |----->| A:6 |----->| A:8 |--.-->|A:10 |
|_____|      |_____|      |_____|      |_____|      |_____| /    |_____|
                 |                                         /
                 |               ______________________.__/
                 |              /                      |
                 |             /                       |
                 \          __/_                      _|__
                  \        {    }                    {    }
                   \  _____        _____        _____        _____
                    \|     |      |     |      |     |      |     |
                     | B:3 |----->| B:5 |----->| B:7 |----->| B:9 |----->
                     |_____|      |_____|      |_____|      |_____|
</pre>
<p>In this diagram, the change from B:3-5 and the change from B:7-9 are
merged into a working copy whose ancestry set (so far) is
{ A:1-8 } plus any local changes. After committing, A:10's
ancestry set is</p>
<pre>
{ A:1-10, B:5, B:9 }
</pre>
<p>Clearly, saying "Let's merge branch B into A" is a little ambiguous.
It usually means "Merge all the changes accounted for in B's tip into A",
but it <em>might</em> mean "Merge the single change that
resulted in B's tip into A".</p>
<p>Any merge, when viewed in detail, is an application of a particular
set of changes – not necessarily adjacent ones – to a working
copy. The user-level interface may allow some of these changes to be
specified implicitly. For example, many merges involve a single,
contiguous range of changes, with one or both ends of the range easily
deducible from context (i.e., branch root to branch tip). These
inference rules are not specified here, but it should be clear in most
contexts how they work.</p>
<p>Because each node knows its ancestors, Subversion never merges the
same change twice (unless you force it to). For example, if after the
above merge, you tell Subversion to merge all B changes into A,
Subversion will notice that two of them have already been merged, and so
merge only the other two changes, resulting in a final ancestry set
of:</p>
<pre>
{ A:1-10, B:3-9 }
</pre>
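<p>The rule of never merging the same change twice amounts to a set
difference. The Python sketch below illustrates that idea only; the
function names and the representation of changes as (line, revision)
pairs are assumptions made for the example, not part of Subversion.</p>
<pre>
```python
def changes_to_merge(target_ancestry, requested):
    """Return the requested changes not yet accounted for, oldest first."""
    return sorted(set(requested) - set(target_ancestry))


def merge(target_ancestry, requested):
    """Return the target's ancestry set after merging the new changes."""
    new = changes_to_merge(target_ancestry, requested)
    return set(target_ancestry) | set(new)


# A:10 after the selective merge above: { A:1-10, B:5, B:9 }
a10 = {("A", r) for r in (1, 2, 4, 6, 8, 10)} | {("B", 5), ("B", 9)}
all_b = [("B", r) for r in (3, 5, 7, 9)]

todo = changes_to_merge(a10, all_b)      # only B:3 and B:7 remain
merged = merge(a10, all_b)               # now accounts for B:3-9 as well
```
</pre>
<p>Asking for the same merge a second time yields an empty list of
changes, which is how repeated merges become harmless in this
model.</p>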
<!--
Heh, what about this:
B:3 adds line 3, with the text "foo".
B:5 deletes line 3.
B:7 adds line 3, with the text "foo".
B:9 deletes line 3.
The user first merges B:5 and B:9 into A. If A had that line, it goes away
now, nothing more.
Next, user merges B:3 and B:7 into A. The second merge must conflict.
I'm not sure we need to care about this, I just thought I'd note how even
merges that seem like they ought to be easily composable can still suck. :-)
-->