On-disk pak file format changes depending on machine architecture

Issue #24 new
sqweek created an issue

OK, so I have this pak file [attached] in my repo. I think it is a merge commit of a file called "ori", which I changed separately on different machines to test out the conflict resolution machinery. But when orifs tries to apply the merge, it crashes with an assertion failure:

2016-07-23 20:47:35 ASSERT(it->second.size() >= sizeof(T)): getAs public/ori/tree.h:47

orifs: public/ori/tree.h:47: T AttrMap::getAs(const string&) const [with T = long unsigned int; std::__cxx11::string = std::__cxx11::basic_string<char>]: Assertion `it->second.size() >= sizeof(T)' failed.

I tracked the problem down. To explain, I'd like to draw attention to two areas of the pak, both of which describe the file's attributes:

00000100  c3 06 26 a8 a4 03 6f 72  69 a6 00 00 00 07 a8 a4  |..&...ori.......|
00000110  06 53 63 74 69 6d 65 a8  a4 08 ba 7c 05 55 00 00  |.Sctime....|.U..|
00000120  00 00 a8 a4 06 53 67 72  6f 75 70 a8 a4 06 73 71  |.....Sgroup...sq|
00000130  77 65 65 6b a8 a4 05 53  6c 69 6e 6b a8 a4 01 00  |week...Slink....|
00000140  a8 a4 06 53 6d 74 69 6d  65 a8 a4 08 ba 7c 05 55  |...Smtime....|.U|
00000150  00 00 00 00 a8 a4 06 53  70 65 72 6d 73 a8 a4 04  |.......Sperms...|
00000160  a4 01 00 00 a8 a4 05 53  73 69 7a 65 a8 a4 08 35  |.......Ssize...5|
00000170  00 00 00 00 00 00 00 a8  a4 05 53 75 73 65 72 a8  |..........Suser.|
...
000002f0  20 3d 62 d2 f5 f8 54 a1  c1 7a f3 82 a8 a4 03 6f  | =b...T..z.....o|
00000300  72 69 a6 00 00 00 07 a8  a4 06 53 63 74 69 6d 65  |ri........Sctime|
00000310  a8 a4 04 ba 7c 05 55 a8  a4 06 53 67 72 6f 75 70  |....|.U...Sgroup|
00000320  a8 a4 06 73 71 77 65 65  6b a8 a4 05 53 6c 69 6e  |...sqweek...Slin|
00000330  6b a8 a4 01 00 a8 a4 06  53 6d 74 69 6d 65 a8 a4  |k.......Smtime..|
00000340  04 ba 7c 05 55 a8 a4 06  53 70 65 72 6d 73 a8 a4  |..|.U...Sperms..|
00000350  04 a4 01 00 00 a8 a4 05  53 73 69 7a 65 a8 a4 04  |........Ssize...|
00000360  25 00 00 00 a8 a4 05 53  75 73 65 72 a8 a4 06 73  |%......Suser...s|

Note in particular the Ssize attribute. At byte 0x16e the length byte shows that 8 bytes have been allocated for the attribute's value. But at byte 0x35f Ssize gets only 4 bytes.
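To make the two layouts concrete, here is a minimal sketch that walks the Ssize entries copied from the dump. It assumes, purely from reading the hexdump, that each field is stored as the marker bytes a8 a4, then a one-byte length, then the payload. The names Field, readField and valueWidth are made up for this illustration and are not ori's API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// One length-prefixed field as it appears in the dump: a8 a4 <len> <payload>.
// The meaning of the a8 a4 marker is not known here; it is only checked and skipped.
struct Field {
    std::string bytes;  // raw payload
    std::size_t next;   // offset of the byte following the payload
};

// Hypothetical helper: read one field starting at offset `pos`.
Field readField(const std::vector<uint8_t> &buf, std::size_t pos) {
    assert(pos + 3 <= buf.size());
    assert(buf[pos] == 0xa8 && buf[pos + 1] == 0xa4);
    std::size_t len = buf[pos + 2];
    assert(pos + 3 + len <= buf.size());
    Field f;
    f.bytes.assign(buf.begin() + pos + 3, buf.begin() + pos + 3 + len);
    f.next = pos + 3 + len;
    return f;
}

// Bytes 0x164..0x176 of the dump: Ssize written with an 8-byte value.
const std::vector<uint8_t> kEntry64 = {
    0xa8, 0xa4, 0x05, 'S', 's', 'i', 'z', 'e',
    0xa8, 0xa4, 0x08, 0x35, 0, 0, 0, 0, 0, 0, 0};

// Bytes 0x355..0x363 of the dump: Ssize written with a 4-byte value.
const std::vector<uint8_t> kEntry32 = {
    0xa8, 0xa4, 0x05, 'S', 's', 'i', 'z', 'e',
    0xa8, 0xa4, 0x04, 0x25, 0, 0, 0};

// Read the key field, then the value field, and report the value's width in bytes.
std::size_t valueWidth(const std::vector<uint8_t> &entry) {
    Field key = readField(entry, 0);
    Field value = readField(entry, key.next);
    return value.bytes.size();
}
```

Under those assumptions, valueWidth(kEntry64) gives 8 and valueWidth(kEntry32) gives 4, which is the mismatch described above.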

When orifs tries to decode this 4-byte Ssize it runs into trouble during attrs.getAs<size_t>(ATTR_FILESIZE).

I'm running the merge on a 64-bit machine where sizeof(size_t) == 8. The other machine involved, however, is 32-bit, where sizeof(size_t) == 4. So it would seem the two machines cannot interoperate.
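The condition from the assertion message can be tried in isolation. The sketch below is not ori's code: passesSizeCheck is a made-up name that only reproduces the `size() >= sizeof(T)` comparison, and uint32_t / uint64_t stand in for size_t on the 32-bit and 64-bit machines respectively.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Mirrors the condition in the assertion message: the stored value must be
// at least as wide as the type being requested.
template <typename T>
bool passesSizeCheck(const std::string &stored) {
    return stored.size() >= sizeof(T);
}

// Placeholder values with the two widths seen in the dump.
const std::string kWrittenBy32(4, '\0');  // as written where sizeof(size_t) == 4
const std::string kWrittenBy64(8, '\0');  // as written where sizeof(size_t) == 8
```

With these stand-ins, passesSizeCheck<uint64_t>(kWrittenBy32) is false, which matches the crash on the 64-bit side. passesSizeCheck<uint32_t>(kWrittenBy64) is true, so a 32-bit reader would get past this particular check when given an 8-byte value; whether it then decodes the value correctly is a separate question.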

size_t is probably the wrong type here in the first place; off_t would at least match the stat struct's st_size member, but its width can still change depending on how _FILE_OFFSET_BITS is defined.

Naively switching to off_t would break ori for any existing users on 32-bit platforms, though. A better fix might be a backwards-compatible variable-width integer encoding, but that requires some overhaul of AttrMap and may also affect de-duplication.
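As a rough sketch of what a width-tolerant reader could look like (not a proposal for the actual AttrMap change), the function below accepts any stored width from 1 to 8 bytes and widens it to uint64_t. It assumes the value is stored least significant byte first, as the dump suggests; decodeUnsignedLE and encodeUnsignedLE64 are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical reader: decode a 1..8 byte little-endian unsigned integer,
// regardless of which architecture wrote it.
uint64_t decodeUnsignedLE(const std::string &stored) {
    assert(!stored.empty() && stored.size() <= sizeof(uint64_t));
    uint64_t value = 0;
    for (std::size_t i = 0; i < stored.size(); ++i) {
        // Go through unsigned char so bytes >= 0x80 are not sign-extended.
        value |= static_cast<uint64_t>(static_cast<unsigned char>(stored[i])) << (8 * i);
    }
    return value;
}

// Hypothetical writer: always emit 8 bytes, least significant byte first.
std::string encodeUnsignedLE64(uint64_t value) {
    std::string out(8, '\0');
    for (std::size_t i = 0; i < out.size(); ++i) {
        out[i] = static_cast<char>((value >> (8 * i)) & 0xff);
    }
    return out;
}
```

A reader along these lines would accept both widths already present in the pak above. What the writer should emit is the harder part: changing the stored width changes the serialized bytes, which is where the de-duplication concern comes in.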
