Reading a Document

Accessing the source and target content is simple. Read the document and, when a TEXT_UNIT event occurs, get the Unit object that comes with the event and access the objects it contains.

For example, the following code uses the XLIFFReader class to open and read a document. When a <unit> has been parsed, it gets the corresponding Unit object, then loops through all its segments and displays the plain text of each segment's source content:

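#!java
// Imports assume the Okapi XLIFF Toolkit package layout (net.sf.okapi.lib.xliff2)
import java.io.File;
import net.sf.okapi.lib.xliff2.core.Segment;
import net.sf.okapi.lib.xliff2.core.Unit;
import net.sf.okapi.lib.xliff2.reader.Event;
import net.sf.okapi.lib.xliff2.reader.XLIFFReader;

// try-with-resources closes the reader when the block exits
try ( XLIFFReader reader = new XLIFFReader() ) {
   reader.open(new File("myDocument1.xlf"));
   // Loop through the reader events
   while ( reader.hasNext() ) {
      Event event = reader.next();
      // Only unit events carry a Unit object
      if ( event.isUnit() ) {
         Unit unit = event.getUnit();
         // Print the plain-text source content of each segment
         for ( Segment segment : unit.getSegments() ) {
            System.out.println(segment.getSource().getPlainText());
         }
      }
   }
}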

Event-Driven Process

Reading a document is done through events. Most events are associated with a resource object that holds the data specific to the type of event. On events that have a start and an end, that resource usually comes with the start event.

Processing a document generates the following events:

| Event Type | Where It Occurs | Resource Class |
|---|---|---|
| START_DOCUMENT | At the start of the XLIFF document. | none |
| START_XLIFF | At the end of the start tag of the <xliff> element. | StartXliffData |
| START_FILE | At the end of the start tag of each <file> element. | StartFileData |
| SKELETON | At the end of the <skeleton> element. | Skeleton |
| MID_FILE | After the SKELETON event, before the first START_GROUP or TEXT_UNIT event. | MidFileData |
| START_GROUP | At the end of the start tag of each <group> element. | StartGroupData |
| TEXT_UNIT | At the end of each <unit> element. | Unit |
| END_GROUP | At the end of the closing tag of each <group> element. | none |
| END_FILE | At the end of the closing tag of each <file> element. | none |
| END_XLIFF | At the end of the closing tag of the <xliff> element. | none |
| END_DOCUMENT | At the end of the document. | none |
| INSIGNIFICANT_PART | At the end of each insignificant part of the document (e.g. space between elements). | InsignificantPartData |
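
Because END_GROUP carries no resource, code that needs to know which group just closed has to remember what it saw at START_GROUP. The following is a minimal sketch of that bookkeeping using only the JDK. `GroupTrackerSketch` and `SimpleEvent` are hypothetical stand-ins invented for illustration, not toolkit classes; with the real reader you would take the id from the resource attached to the START_GROUP event.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical illustration; none of these types belong to the toolkit.
public class GroupTrackerSketch {

    /** Stand-in for a reader event: a type name plus the id held by its resource (null if none). */
    public static final class SimpleEvent {
        public final String type;
        public final String id;

        public SimpleEvent(String type, String id) {
            this.type = type;
            this.id = id;
        }
    }

    /** Returns the group ids in the order their END_GROUP events occur. */
    public static List<String> closedGroups(List<SimpleEvent> events) {
        Deque<String> open = new ArrayDeque<>(); // ids of the groups currently open
        List<String> closed = new ArrayList<>();
        for (SimpleEvent event : events) {
            if (event.type.equals("START_GROUP")) {
                open.push(event.id);      // the resource is only available here
            } else if (event.type.equals("END_GROUP")) {
                closed.add(open.pop());   // no resource: recover the id from the stack
            }
        }
        return closed;
    }

    public static void main(String[] args) {
        List<SimpleEvent> events = Arrays.asList(
            new SimpleEvent("START_GROUP", "g1"),
            new SimpleEvent("START_GROUP", "g2"),
            new SimpleEvent("TEXT_UNIT", "u1"),
            new SimpleEvent("END_GROUP", null),
            new SimpleEvent("END_GROUP", null));
        System.out.println(closedGroups(events));
    }
}
```

A stack fits here because groups can nest, so the most recently opened group is always the next one to close.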

Note that even if there is no skeleton or no element before the first <group> or <unit>, the SKELETON and MID_FILE events are always generated. This allows you to add new elements or perform some action at those locations even if there is no existing corresponding data in the input document.
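
To make the ordering concrete, here is a small model of the event sequence built on the JDK's StAX parser. It is a sketch under stated assumptions, not the toolkit's implementation: `EventOrderSketch` and `eventTypes` are hypothetical names, the events are plain strings rather than toolkit objects, and INSIGNIFICANT_PART events are left out. It only shows how SKELETON and MID_FILE can be emitted for every <file> whether or not the input contains matching data.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical model of the event order; NOT the toolkit's implementation.
public class EventOrderSketch {

    /** Lists, in order, the event type names modeled for the given XLIFF text. */
    public static List<String> eventTypes(String xliff) {
        List<String> events = new ArrayList<>();
        boolean skeletonSent = false; // SKELETON already emitted for the current <file>?
        boolean midFileSent = false;  // MID_FILE already emitted for the current <file>?
        try {
            XMLStreamReader xml = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xliff));
            events.add("START_DOCUMENT");
            while (xml.hasNext()) {
                int type = xml.next();
                if (type == XMLStreamConstants.START_ELEMENT) {
                    String name = xml.getLocalName();
                    if (name.equals("xliff")) {
                        events.add("START_XLIFF");
                    } else if (name.equals("file")) {
                        events.add("START_FILE");
                        skeletonSent = false;
                        midFileSent = false;
                    } else if (name.equals("group") || name.equals("unit")) {
                        // First <group> or <unit>: emit whatever has not been sent yet
                        if (!skeletonSent) { events.add("SKELETON"); skeletonSent = true; }
                        if (!midFileSent) { events.add("MID_FILE"); midFileSent = true; }
                        if (name.equals("group")) events.add("START_GROUP");
                    }
                } else if (type == XMLStreamConstants.END_ELEMENT) {
                    String name = xml.getLocalName();
                    if (name.equals("skeleton")) {
                        events.add("SKELETON");
                        skeletonSent = true;
                    } else if (name.equals("unit")) {
                        events.add("TEXT_UNIT");
                    } else if (name.equals("group")) {
                        events.add("END_GROUP");
                    } else if (name.equals("file")) {
                        // A <file> without any <group> or <unit> still gets both events
                        if (!skeletonSent) events.add("SKELETON");
                        if (!midFileSent) events.add("MID_FILE");
                        events.add("END_FILE");
                    } else if (name.equals("xliff")) {
                        events.add("END_XLIFF");
                    }
                } else if (type == XMLStreamConstants.END_DOCUMENT) {
                    events.add("END_DOCUMENT");
                }
            }
            xml.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Not well-formed XML", e);
        }
        return events;
    }

    public static void main(String[] args) {
        String doc = "<xliff xmlns='urn:oasis:names:tc:xliff:document:2.0' version='2.0' srcLang='en'>"
            + "<file id='f1'><unit id='u1'><segment><source>Hello</source></segment></unit></file>"
            + "</xliff>";
        System.out.println(eventTypes(doc));
    }
}
```

For the sample document in `main`, which has no <skeleton> and nothing before its only <unit>, the list still contains SKELETON and MID_FILE between START_FILE and TEXT_UNIT.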
