Commits

Anonymous committed 094574b Merge

Merge branch 'nd/index-doc' into maint

* nd/index-doc:
doc: technical details about the index file format
doc: technical details about the index file format

Comments (0)

Files changed (1)

Documentation/technical/index-format.txt

+GIT index format
+================
+
+= The git index file has the following format
+
+  All binary numbers are in network byte order. Version 2 is described
+  here unless stated otherwise.
+
+   - A 12-byte header consisting of
+
+     4-byte signature:
+       The signature is { 'D', 'I', 'R', 'C' } (stands for "dircache")
+
+     4-byte version number:
+       The current supported versions are 2 and 3.
+
+     32-bit number of index entries.
+
+   - A number of sorted index entries (see below).
+
+   - Extensions
+
+     Extensions are identified by signature. Optional extensions can
+     be ignored if GIT does not understand them.
+
+     GIT currently supports cached tree and resolve undo extensions.
+
+     4-byte extension signature. If the first byte is 'A'..'Z' the
+     extension is optional and can be ignored.
+
+     32-bit size of the extension
+
+     Extension data
+
+   - 160-bit SHA-1 over the content of the index file before this
+     checksum.
+
+== Index entry
+
+  Index entries are sorted in ascending order on the name field,
+  interpreted as a string of unsigned bytes (i.e. memcmp() order, no
+  localization, no special casing of directory separator '/'). Entries
+  with the same name are sorted by their stage field.
+
+  32-bit ctime seconds, the last time a file's metadata changed
+    this is stat(2) data
+
+  32-bit ctime nanosecond fractions
+    this is stat(2) data
+
+  32-bit mtime seconds, the last time a file's data changed
+    this is stat(2) data
+
+  32-bit mtime nanosecond fractions
+    this is stat(2) data
+
+  32-bit dev
+    this is stat(2) data
+
+  32-bit ino
+    this is stat(2) data
+
+  32-bit mode, split into (high to low bits)
+
+    4-bit object type
+      valid values in binary are 1000 (regular file), 1010 (symbolic link)
+      and 1110 (gitlink)
+
+    3-bit unused
+
+    9-bit unix permission. Only 0755 and 0644 are valid for regular files.
+    Symbolic links and gitlinks have value 0 in this field.
+
+  32-bit uid
+    this is stat(2) data
+
+  32-bit gid
+    this is stat(2) data
+
+  32-bit file size
+    This is the on-disk size from stat(2), truncated to 32-bit.
+
+  160-bit SHA-1 for the represented object
+
+  A 16-bit 'flags' field split into (high to low bits)
+
+    1-bit assume-valid flag
+
+    1-bit extended flag (must be zero in version 2)
+
+    2-bit stage (during merge)
+
+    12-bit name length if the length is less than 0xFFF; otherwise 0xFFF
+    is stored in this field.
+
+  (Version 3) A 16-bit field, only applicable if the "extended flag"
+  above is 1, split into (high to low bits).
+
+    1-bit reserved for future
+
+    1-bit skip-worktree flag (used by sparse checkout)
+
+    1-bit intent-to-add flag (used by "git add -N")
+
+    13-bit unused, must be zero
+
+  Entry path name (variable length) relative to top level directory
+    (without leading slash). '/' is used as path separator. The special
+    path components ".", ".." and ".git" (without quotes) are disallowed.
+    Trailing slash is also disallowed.
+
+    The exact encoding is undefined, but the '.' and '/' characters
+    are encoded in 7-bit ASCII and the encoding cannot contain a NUL
+    byte (iow, this is a UNIX pathname).
+
+  1-8 nul bytes as necessary to pad the entry to a multiple of eight bytes
+  while keeping the name NUL-terminated.
+
+== Extensions
+
+=== Cached tree
+
+  Cached tree extension contains pre-computed hashes for trees that can
+  be derived from the index. It helps speed up tree object generation
+  from index for a new commit.
+
+  When a path is updated in index, the path must be invalidated and
+  removed from tree cache.
+
+  The signature for this extension is { 'T', 'R', 'E', 'E' }.
+
+  A series of entries fill the entire extension; each of which
+  consists of:
+
+  - NUL-terminated path component (relative to its parent directory);
+
+  - ASCII decimal number of entries in the index that is covered by the
+    tree this entry represents (entry_count);
+
+  - A space (ASCII 32);
+
+  - ASCII decimal number that represents the number of subtrees this
+    tree has;
+
+  - A newline (ASCII 10); and
+
+  - 160-bit object name for the object that would result from writing
+    this span of index as a tree.
+
+  An entry can be in an invalidated state and is represented by having -1
+  in the entry_count field.
+
+  The entries are written out in the top-down, depth-first order.  The
+  first entry represents the root level of the repository, followed by the
+  first subtree---let's call this A---of the root level (with its name
+  relative to the root level), followed by the first subtree of A (with
+  its name relative to A), ...
+
+=== Resolve undo
+
+  A conflict is represented in the index as a set of higher stage entries.
+  When a conflict is resolved (e.g. with "git add path"), these higher
+  stage entries will be removed and a stage-0 entry with proper resoluton
+  is added.
+
+  When these higher stage entries are removed, they are saved in the
+  resolve undo extension, so that conflicts can be recreated (e.g. with
+  "git checkout -m"), in case users want to redo a conflict resolution
+  from scratch.
+
+  The signature for this extension is { 'R', 'E', 'U', 'C' }.
+
+  A series of entries fill the entire extension; each of which
+  consists of:
+
+  - NUL-terminated pathname the entry describes (relative to the root of
+    the repository, i.e. full pathname);
+
+  - Three NUL-terminated ASCII octal numbers, entry mode of entries in
+    stage 1 to 3 (a missing stage is represented by "0" in this field);
+    and
+
+  - At most three 160-bit object names of the entry in stages from 1 to 3
+    (nothing is written for a missing stage).
+