Commits

David Boddie  committed 68eb2fc

Changed to Wiki-style markup

  • Parent commits 42e926c

Files changed (1)

File Notes/Notes.txt

-Notes on the TechWriter file format
-===================================
+=Notes on the TechWriter file format=
 
-Files found in Reconnectn (a Spark archive)::
-
+Files found in Reconnectn (a Spark archive)
+{{{
     /home/david/Private/Beaker/Backup/Public/Out/MSc/Reconnectn,ddc
+}}}
 
 Using the documents, Conclusion and Summary.
 
 
-Styles
-------
+==Styles==
+
 In the Conclusion document, the string "Paragraph Style" at 0x12d is preceded
 by the bytes '0f08' at 0x126. The same bytes appear again at 0xae6,
 immediately before a second occurrence of "Paragraph Style".
 Similarly, the bytes '1008' appear at 0x2ca, followed by "Thesis Paragraph"
 at 0x2d1, and these occur again at 0xd14 with the text immediately after.
 
-A pattern emerges::
-
+A pattern emerges:
+{{{
     Bytes           Style text                      Bytes   Style text
     value   offset  offset  value                   offset  offset
     '0f08'  126     12d     "Paragraph Style"       ae6     ae8
     '1006'  239     240     "Expression Style"      1786    1788
      ^^
      length of the style string
+}}}
 
 Let us examine in more detail each entry in the index, starting just before
 the first style string, and continuing until the beginning of the fourth
-style string::
-
+style string:
+{{{
       length of the style string
                               vv ?? ?? ?? ?? ?? ??
   0000:0120 01 00 00 00 19 00 0f 08 00 01 06 02 00 50 61 72 .............Par
                      vv
   0000:0190          0a 03 00 01 09 05 00 4c 69 73 74 20 53    .......List S
   0000:01a0 74 79 6c 65 0d 07 00 01 07 01 00 50 69 63 74 75 tyle.......Pictu
+}}}
   
-At the end of the "style index", we see the following entries::
-
+At the end of the "style index", we see the following entries:
+{{{
   0000:02c0                               10 08 00 02 06 0f           ......
   0000:02d0 00 54 68 65 73 69 73 20 50 61 72 61 67 72 61 70 .Thesis Paragrap
   0000:02e0 68                                              h
   
   0000:0300             38 00 00 00 00 00 00 02 00 00 00 00     8...........
   0000:0310 40 03 00 00 1c 00 00 00 7e 03 00 00 11 00 00 00 @.......~.......
-
+}}}
 
 The pattern which emerges for the definition of a style is therefore:
 
   * six bytes of undetermined use
   * the style name (of length given by the first word)
 
-We can scan the entries and produce a summary using the following code::
-
+We can scan the entries and produce a summary using the following code:
+{{{
   f = open("Files/Conclusion")
   f.seek(0x126)
 
       bytes.append("%02x" % ord(f.read(1)))
     name = f.read(length)
     print "%02x" % length, "[%s]" % " ".join(bytes), name
+}}}
 
 This produces the following output, with some superfluous data found due to
-the inadequacy of the algorithm::
-
+the inadequacy of the algorithm:
+{{{
   0f [08 00 01 06 02 00] Paragraph Style
   0e [00 00 01 09 01 00] Document Style
   13 [00 00 02 09 02 00] Header/Footer Style
   0f [01 00 03 09 13 00] Thesis Appendix
   38 [00 00 00 00 00 00] @~��!�
   02 [00 00 00 00 04 05]
+}}}
 
 We note that, for each entry, the final byte of the six enclosed in square
 brackets is always zero. Other patterns are more difficult to extract.
 
 Let us reorder the list of styles above to be in the order in which their
-definitions appear in the file::
-
+definitions appear in the file:
+{{{
   0f [08 00 01 06 02 00] Paragraph Style
   06 [08 01 ff 06 06 00] Italic
   05 [08 01 ff 06 07 00] Greek
   0d [04 00 02 09 16 00] Thesis Figure
   0b [06 00 01 0a 01 00] Maths Style
   10 [06 00 02 0a 02 00] Expression Style
+}}}
 
 It becomes clear that the fourth byte in each entry somehow indicates the
 sequence in which the definitions of the styles appear in the file.
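
The reordering can be sketched in Python 3 as a sort on the fourth of the six bytes. This is only an illustration: the three sample entries are copied from the listings above, the function name `definition_order` is hypothetical, and nothing here establishes what the byte really encodes.

```python
# Sample entries copied from the listings above: (name, six unexplained bytes).
entries = [
    ("Maths Style", bytes.fromhex("06 00 01 0a 01 00")),
    ("Thesis Figure", bytes.fromhex("04 00 02 09 16 00")),
    ("Paragraph Style", bytes.fromhex("08 00 01 06 02 00")),
]


def definition_order(entries):
    """Sort (name, six_bytes) pairs by the fourth of the six bytes.

    Python's sort is stable, so entries sharing a fourth byte keep
    their input order; how ties should be resolved is not known.
    """
    return sorted(entries, key=lambda entry: entry[1][3])


for name, raw in definition_order(entries):
    print(" ".join("%02x" % b for b in raw), name)
```

Entries with equal fourth bytes (Italic and Greek, for example) are left in whatever order they were supplied, so this sketch cannot confirm the ordering within such a group.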
 
 
-The end of the style list
--------------------------
+===The end of the style list===
+
 Note that the ending at 304 coincides with a value in the file at 020::
 
  0000:0020 04 03 00 00 d2 07 00 00 de 07 00 00 ea 07 00 00 ....Ò...Þ...ê...
   0000:0310 48 00 00 00 00 00 00 02 00 00 00 00 5c 03 00 00 H...........\...
 
 
-The beginning of the style list
--------------------------------
-Looking at the data before the first entry in Conclusion, we see::
+===The beginning of the style list===
 
+Looking at the data before the first entry in Conclusion, we see:
+{{{
   0000:0110 00 00 06 00 01 00 00 00 00 00 00 00 00 00 00 00 ................
   
                         this is 25 and there are 25 styles listed
                         vv
   0000:0120 01 00 00 00 19 00 0f 08 00 01 06 02 00 50 61 72 .............Par
   0000:0130 61 67 72 61 70 68 20 53 74 79 6c 65 0e 00 00 01 agraph Style....
+}}}
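
If the value at 0x124 really is the number of styles, a scan can stop after that many entries instead of running on into unrelated data. The following Python 3 sketch assumes a 16-bit little-endian count at 0x124 followed directly by entries made of a length byte, six bytes and the name; the function name `read_style_index` and the synthetic test buffer are hypothetical, and the Latin-1 decoding is a guess.

```python
import struct


def read_style_index(data, count_offset=0x124):
    """Return a list of (length, six_bytes, name) tuples.

    Assumption: a 16-bit little-endian style count sits at count_offset
    and the first entry starts two bytes later.
    """
    (count,) = struct.unpack_from("<H", data, count_offset)
    pos = count_offset + 2
    styles = []
    for _ in range(count):
        length = data[pos]                 # length of the style name
        six = data[pos + 1:pos + 7]        # six bytes of undetermined use
        name = data[pos + 7:pos + 7 + length].decode("latin-1")
        styles.append((length, six, name))
        pos += 7 + length
    return styles


# Synthetic buffer: padding up to 0x124, a count of 2, then the first two
# entries as they appear in the dump above.
sample = (
    bytes(0x124)
    + struct.pack("<H", 2)
    + bytes.fromhex("0f 08 00 01 06 02 00") + b"Paragraph Style"
    + bytes.fromhex("0e 00 00 01 09 01 00") + b"Document Style"
)

for length, six, name in read_style_index(sample):
    print("%02x" % length, "[%s]" % " ".join("%02x" % b for b in six), name)
```

For a real document the buffer would come from reading the file in binary mode, for example `open("Files/Conclusion", "rb").read()`.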
 
 
+==Header offsets==
 
-
-Header offsets
---------------
 Having established, or at least asserted, that the end of the "style index"
 is given by the value (word) at 020 in the file, let us examine other values
 at the start of the file to determine something about the structure of the
 file format.
 
 In the Conclusion file, the data from 020 to 0a0 appears in the following
-form::
-
+form:
+{{{
  0000:0020 04 03 00 00 d2 07 00 00 de 07 00 00 ea 07 00 00 ....Ò...Þ...ê...
             ^^^^^^^^^^^ ^^^^^^^^^^^ ^^^^^^^^^^^ ^^^^^^^^^^^
      end of style index |           |           12 bytes later (0x08)
   after the document text (0x08)    12 bytes later (0x08)
+}}}
 
 While the byte referred to at 0304 is 0x38, the bytes found at the other
 locations typically contain 0x08.
 
 The data after 0304 but before 07d2 contains the document text.
-
+{{{
  0000:0030 f6 07 00 00 01 0a 00 00 0d 0a 00 00 b5 0d 00 00 ö...........µ...
             ^^^^^^^^^^^ ^^^^^^^^^^^ ^^^^^^^^^^^
   12 bytes later (0x08) |           |           
            12 bytes later (0x08)    12 bytes later (0xb0)
+}}}
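
The header values read so far can be decoded mechanically. The Python 3 sketch below treats the 32 bytes from 020 to 03f, copied from the two dumps above, as little-endian 32-bit words; the helper name `header_words` is hypothetical, and interpreting every word as a file offset is an assumption.

```python
import struct

# Header bytes 0x20-0x3f, copied from the two dumps above.
header = bytes.fromhex(
    "04 03 00 00 d2 07 00 00 de 07 00 00 ea 07 00 00 "
    "f6 07 00 00 01 0a 00 00 0d 0a 00 00 b5 0d 00 00"
)


def header_words(block, base=0x20):
    """Return (position, value) pairs, one per 32-bit little-endian word."""
    count = len(block) // 4
    values = struct.unpack_from("<%dI" % count, block, 0)
    return [(base + 4 * i, value) for i, value in enumerate(values)]


for position, value in header_words(header):
    print("%04x: %04x" % (position, value))
```

Printing the differences between consecutive values is a quick way to spot where the regular 12-byte spacing stops.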
 
 The quantity of data between the referenced data at 0a0d and that at 0db5
 is obviously more than another 12 bytes. In fact, there appears to be a
 table of some sort at 0a1c ending before 0aa0; this is followed by some
-style definitions. We continue with::
-
+style definitions. We continue with:
+{{{
  0000:0040 21 0e 00 00 de 0e 00 00 e5 16 00 00 06 00 06 00 !...Þ...å.......
+}}}
 
 The data referred to by the offsets, 0e21, 0ede and 16e5 are all within
 some sort of style definition area. The 0x00060006 word appears to be
-a missing entry since the offsets into the file continue below::
-  
+a missing entry since the offsets into the file continue below:
+{{{
  0000:0050 06 00 07 00 06 00 08 00 ed 17 00 00 09 00 05 00 ........í.......
-  
+}}}
+
 Here, the offset 17ed refers to data after what appears to be the final style
 definition. Entries in this table appear to continue up to 90::
-  
+{{{  
   0000:0060 09 00 06 00 08 00 01 00 0a 00 01 00 07 00 01 00 ................
   0000:0070 06 00 02 00 09 00 08 00 09 00 07 00 09 00 03 00 ................
   0000:0080 09 00 04 00 09 00 0f 00 00 00 00 00 00 00 00 00 ................
   0000:0090 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
+}}}
 
 It is possible that the 0x00060006 value and other similar values represent
 unused entries in a heap of some sort.
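
One way to separate the two kinds of header entry is to test whether a word could be an offset into the file at all. The Python 3 sketch below is a heuristic only: it treats any word smaller than the file size as an offset and splits the rest into two 16-bit halves. The function name `classify_word` and the file size of 0x2000 are hypothetical, since the real size of Conclusion is not recorded in these notes.

```python
import struct

# Header bytes 0x40-0x5f, copied from the dumps above.
rows = bytes.fromhex(
    "21 0e 00 00 de 0e 00 00 e5 16 00 00 06 00 06 00 "
    "06 00 07 00 06 00 08 00 ed 17 00 00 09 00 05 00"
)

ASSUMED_FILE_SIZE = 0x2000  # hypothetical; substitute the real file length


def classify_word(value, file_size):
    """Label a 32-bit word as a plausible offset or as two 16-bit halves."""
    if value < file_size:
        return ("offset", value)
    return ("pair", (value & 0xFFFF, value >> 16))


for i in range(0, len(rows), 4):
    (word,) = struct.unpack_from("<I", rows, i)
    print("%04x:" % (0x40 + i), classify_word(word, ASSUMED_FILE_SIZE))
```

A word of zero is also reported as an offset by this test, so the empty entries near the end of the table would need separate handling.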
 
 
 A pattern emerges for the offsets in the header::
-
+{{{
  +-----------+-------------------------------------------------------------+
  | Offset    | Contents                                                    |
  +===========+=============================================================+
  +-----------+-------------------------------------------------------------+
  | 0000:0058 | other data                                                  |
  +-----------+-------------------------------------------------------------+
+}}}
 
 Note that in the Models file, the offset at 58 refers to data later in the
 file which includes a Drawfile.
 
 
 
+==Style definitions table==
 
-
-Style definitions table
------------------------
-Let us examine the table referred to by the relevant header offset at 38::
-
+Let us examine the table referred to by the relevant header offset at 38:
+{{{
  0000:0a00 00 08 00 00 00 00 00 00 01 00 00 00 00 b0 00 00 .............°..
  0000:0a10 00 00 00 00 01 00 00 00 00 c1 0a 00 00 0d 00 00 .........Á......
  0000:0a20 00 db 0a 00 00 1c 00 00 00 04 0b 00 00 0d 00 00 .Û..............
   0000:0aa0 00 13 00 00 80 00 00 00 00 14 00 00 80 00 00 00 ................
   0000:0ab0 00 00 00 00 80 00 00 00 00 89 0d 00 00 0d 00 00 ................
   0000:0ac0 00 00 00 00 00 00 00 00 00 00 00 00 00 08 00 00 ................
+}}}
 
-Style definitions
------------------
-The definitions at
+==Style definitions==
+
+The definitions after the