GIT index format
================

= The git index file has the following format

  All binary numbers are in network byte order. Version 2 is described
  here unless stated otherwise.

   - A 12-byte header consisting of

     4-byte signature:
       The signature is { 'D', 'I', 'R', 'C' }

     4-byte version number:
       The currently supported versions are 2 and 3.

     32-bit number of index entries.

   - A number of sorted index entries

   - Extensions

     Extensions are identified by signature. Optional extensions can
     be ignored if GIT does not understand them.

     GIT currently supports tree cache and resolve undo extensions.

     4-byte extension signature. If the first byte is 'A'..'Z' the
     extension is optional and can be ignored.

     32-bit size of the extension

     Extension data

   - 160-bit SHA-1 over the content of the index file before this
     checksum.
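
  As an illustration of the header and trailing checksum described
  above, here is a minimal Python sketch. The function names
  (parse_index_header, checksum_ok) are hypothetical and not part of
  git; the sketch only assumes the layout stated in this document.

```python
import hashlib
import struct


def parse_index_header(data: bytes):
    """Return (version, entry_count) from the 12-byte header."""
    if len(data) < 12:
        raise ValueError("index too short for a header")
    # ">4sII": network byte order, 4-byte signature, two 32-bit numbers.
    signature, version, entry_count = struct.unpack(">4sII", data[:12])
    if signature != b"DIRC":
        raise ValueError("bad signature")
    if version not in (2, 3):
        raise ValueError("unsupported version")
    return version, entry_count


def checksum_ok(data: bytes) -> bool:
    """True if the last 20 bytes are the SHA-1 of everything before them."""
    if len(data) < 12 + 20:
        return False
    return hashlib.sha1(data[:-20]).digest() == data[-20:]


# Hand-built index with zero entries and no extensions.
body = b"DIRC" + struct.pack(">II", 2, 0)
empty_index = body + hashlib.sha1(body).digest()
```

  A reader would typically verify the checksum first and only then
  trust the entry count from the header.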

== Index entry

  Index entries are sorted in ascending order on the name field,
  interpreted as a string of unsigned bytes. Entries with the same
  name are sorted by their stage field.
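
  The ordering rule above can be sketched in Python, where bytes
  objects already compare as strings of unsigned bytes. The helper
  name index_sort_key and the sample paths are made up for the
  example.

```python
def index_sort_key(name: bytes, stage: int):
    """Sort key: name compared as unsigned bytes, ties broken by stage."""
    return (name, stage)


# (name, stage) pairs in arbitrary order.
entries = [
    (b"lib/x.c", 0),
    (b"conflict", 2),
    (b"lib.c", 0),
    (b"conflict", 1),
    (b"conflict", 3),
]

ordered = sorted(entries, key=lambda e: index_sort_key(*e))
```

  Here "lib.c" sorts before "lib/x.c" because '.' (0x2E) is smaller
  than '/' (0x2F), and the three "conflict" entries end up in stage
  order 1, 2, 3.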

  32-bit ctime seconds, the last time a file's metadata changed
    this is stat(2) data

  32-bit ctime nanosecond fractions
    this is stat(2) data

  32-bit mtime seconds, the last time a file's data changed
    this is stat(2) data

  32-bit mtime nanosecond fractions
    this is stat(2) data

  32-bit dev
    this is stat(2) data

  32-bit ino
    this is stat(2) data

  32-bit mode, split into (high to low bits)

    4-bit object type
      valid values in binary are 1000 (blob), 1010 (symbolic link)
      and 1110 (gitlink)

    3-bit unused

    9-bit unix permission (only 0755 and 0644 are valid)

  32-bit uid
    this is stat(2) data

  32-bit gid
    this is stat(2) data

  32-bit file size
    This is the on-disk size from stat(2)

  160-bit SHA-1 for the represented object

  A 16-bit field split into (high to low bits)

    1-bit assume-valid flag

    1-bit extended flag (must be zero in version 2)

    2-bit stage (during merge)

    12-bit name length if the length is less than 0xFFF; otherwise 0xFFF
    is stored in this field.

  (Version 3) A 16-bit field, only applicable if the "extended flag"
  above is 1, split into (high to low bits).

    1-bit reserved for future

    1-bit skip-worktree flag (used by sparse checkout)

    1-bit intent-to-add flag (used by "git add -N")

    13-bit unused, must be zero

  Entry path name (variable length) relative to top level directory
    (without leading slash). '/' is used as path separator. The special
    paths ".", ".." and ".git" (without quotes) are disallowed.
    Trailing slash is also disallowed.

    The exact encoding is undefined, but the '.' and '/' characters
    are encoded in 7-bit ASCII and the encoding cannot contain a nul
    byte. The encoding is generally a superset of ASCII.

  1-8 nul bytes as necessary to pad the entry to a multiple of eight bytes
  while keeping the name NUL-terminated.
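
  The entry layout above could be decoded along the following lines.
  This is a sketch, not git's implementation: parse_entry and the
  dictionary keys are hypothetical names, the version 2 rule that the
  extended flag must be zero is not enforced, and treating a stored
  name length of 0xFFF as "scan for the terminating NUL" is an
  assumption of the sketch.

```python
import struct


def parse_entry(data: bytes, offset: int = 0):
    """Parse one index entry; return (entry, offset_of_next_entry)."""
    start = offset
    # Ten 32-bit fields, the 20-byte SHA-1, then the 16-bit flags field.
    (ctime_s, ctime_ns, mtime_s, mtime_ns, dev, ino, mode,
     uid, gid, size, sha1, flags) = struct.unpack_from(">10I20sH", data, offset)
    offset += 62
    entry = {
        "ctime": (ctime_s, ctime_ns),
        "mtime": (mtime_s, mtime_ns),
        "dev": dev, "ino": ino, "uid": uid, "gid": gid, "size": size,
        "object_type": (mode >> 12) & 0xF,   # e.g. 0b1000 for a blob
        "permissions": mode & 0x1FF,         # low 9 bits
        "sha1": sha1,
        "assume_valid": bool(flags & 0x8000),
        "extended": bool(flags & 0x4000),
        "stage": (flags >> 12) & 0x3,
        "skip_worktree": False,
        "intent_to_add": False,
    }
    if entry["extended"]:
        # Second 16-bit flags field, present only when the extended flag is 1.
        (ext,) = struct.unpack_from(">H", data, offset)
        offset += 2
        entry["skip_worktree"] = bool(ext & 0x4000)
        entry["intent_to_add"] = bool(ext & 0x2000)
    name_len = flags & 0x0FFF
    if name_len == 0x0FFF:
        # Assumption: the real length does not fit, so find the NUL.
        name_len = data.index(b"\0", offset) - offset
    entry["name"] = data[offset:offset + name_len]
    # 1-8 NUL bytes pad the whole entry to a multiple of eight bytes.
    entry_len = (offset - start + name_len + 8) & ~7
    return entry, start + entry_len
```

  Calling parse_entry repeatedly, feeding each returned offset back
  in, walks the sorted entries; the first call would start at offset
  12, just past the header.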

== Extensions
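
  Every extension shares the framing given in the file format
  description: a 4-byte signature, a 32-bit size, then that many
  bytes of data. A reader might walk them roughly as follows. The
  name iter_extensions is hypothetical, and the caller is assumed to
  already know the offset at which the index entries end.

```python
import struct


def iter_extensions(data: bytes, offset: int):
    """Yield (signature, optional, payload) for each extension."""
    end = len(data) - 20  # the trailing SHA-1 checksum is not an extension
    while offset < end:
        if offset + 8 > end:
            raise ValueError("truncated extension header")
        signature, size = struct.unpack_from(">4sI", data, offset)
        offset += 8
        if offset + size > end:
            raise ValueError("truncated extension data")
        # A first byte in 'A'..'Z' marks the extension as optional.
        optional = b"A" <= signature[:1] <= b"Z"
        yield signature, optional, data[offset:offset + size]
        offset += size
```

  A reader that does not recognise a signature can skip the payload
  when the optional flag is true, and should refuse the file
  otherwise.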

=== Tree cache

  Tree cache extension contains pre-computed hashes for trees that can
  be derived from the index. It helps speed up tree object generation
  from index for a new commit.

  When a path is updated in the index, the path must be invalidated
  and removed from the tree cache.

  - Extension tag { 'T', 'R', 'E', 'E' }

  - 32-bit size

  - A number of entries

     NUL-terminated tree name

     Blank-terminated ASCII decimal number of entries in this tree

     Newline-terminated position of this tree in the parent tree. 0 for
     the root tree

     160-bit SHA-1 for this tree and its children

=== Resolve undo

  A conflict is represented in the index as a set of higher-stage
  entries. When a conflict is resolved (e.g. with "git add path"),
  these higher-stage entries are removed and a stage-0 entry with the
  proper resolution is added.

  Resolve undo extension saves these higher stage entries so that
  conflicts can be recreated (e.g. with "git checkout -m"), in case
  users want to redo a conflict resolution from scratch.

  - Extension tag { 'R', 'E', 'U', 'C' }

  - 32-bit size

  - A number of conflict entries

    NUL-terminated conflict path

    Three NUL-terminated ASCII octal numbers, entry mode of entries in
    stage 1 to 3.

    At most three 160-bit SHA-1s of the entry in three stages from 1
    to 3. SHA-1 is not saved for any stage with entry mode zero.
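
  The conflict entry layout above could be decoded along these lines.
  This is a sketch under the stated layout only; parse_resolve_undo
  and the shape of its return value are hypothetical choices, and the
  input is the extension data without the tag and size.

```python
def parse_resolve_undo(payload: bytes):
    """Return a list of (path, [(stage, mode, sha1), ...]) records."""
    records = []
    pos = 0
    while pos < len(payload):
        # NUL-terminated conflict path.
        end = payload.index(b"\0", pos)
        path = payload[pos:end]
        pos = end + 1
        # Three NUL-terminated ASCII octal modes, for stages 1 to 3.
        modes = []
        for _ in range(3):
            end = payload.index(b"\0", pos)
            modes.append(int(payload[pos:end], 8))
            pos = end + 1
        # One SHA-1 per stage whose mode is non-zero, in stage order.
        stages = []
        for stage, mode in enumerate(modes, start=1):
            if mode == 0:
                continue
            sha1 = payload[pos:pos + 20]
            if len(sha1) != 20:
                raise ValueError("truncated SHA-1")
            pos += 20
            stages.append((stage, mode, sha1))
        records.append((path, stages))
    return records
```

  For a path that had entries in stages 1 and 3 only, the record
  carries two SHA-1s and the stage 2 mode is stored as "0".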