Reading a Document
Accessing the source and target content is simple: read the document and, when a TEXT_UNIT event occurs, grab the Unit object that comes with the event and access the objects in that unit.
For example, the following code uses the XLIFFReader class to open and read a document. When a <unit> element is parsed, it gets the corresponding Unit object, then loops through all the segments and prints the plain text of the source content of each segment:

    #!java
    try ( XLIFFReader reader = new XLIFFReader() ) {
        reader.open(new File("myDocument1.xlf"));
        // Loop through the reader events
        while ( reader.hasNext() ) {
            Event event = reader.next();
            // Do something: here print the source content
            if ( event.isUnit() ) {
                Unit unit = event.getUnit();
                for ( Segment segment : unit.getSegments() ) {
                    System.out.println(segment.getSource().getPlainText());
                }
            }
        }
    }
Event-Driven Process
Reading a document is event-driven. Most events are associated with a resource object that holds the data specific to that type of event. For events that come as a start/end pair, the resource usually comes with the start event.
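Because the resource travels with the start event, a consumer usually has to remember it until the matching end event arrives. The sketch below illustrates that idea only. `SimpleEvent` is a hypothetical stand-in (a type name plus an optional string resource), not the toolkit's own `Event` class, and the `labelUnits` helper and the ids `f1`, `f2`, `u1` are made up for the illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class StartResourceContext {

    // Hypothetical stand-in for a reader event: a type name plus an
    // optional resource (null when the event carries none).
    static final class SimpleEvent {
        final String type;
        final String resource;

        SimpleEvent(String type, String resource) {
            this.type = type;
            this.resource = resource;
        }
    }

    // Labels each unit with the id of the file it belongs to. The file id
    // is only available on the start event, so it is kept until END_FILE.
    static List<String> labelUnits(List<SimpleEvent> events) {
        List<String> labels = new ArrayList<>();
        String currentFileId = null;
        for (SimpleEvent event : events) {
            switch (event.type) {
                case "START_FILE":
                    currentFileId = event.resource;
                    break;
                case "TEXT_UNIT":
                    labels.add(currentFileId + "/" + event.resource);
                    break;
                case "END_FILE":
                    // The end event carries no resource: just drop the context.
                    currentFileId = null;
                    break;
                default:
                    break;
            }
        }
        return labels;
    }

    public static void main(String[] args) {
        List<SimpleEvent> events = Arrays.asList(
            new SimpleEvent("START_FILE", "f1"),
            new SimpleEvent("TEXT_UNIT", "u1"),
            new SimpleEvent("END_FILE", null),
            new SimpleEvent("START_FILE", "f2"),
            new SimpleEvent("TEXT_UNIT", "u1"),
            new SimpleEvent("END_FILE", null));
        System.out.println(labelUnits(events)); // prints [f1/u1, f2/u1]
    }
}
```

With the real reader, the same pattern would presumably keep the resource object obtained on the start event instead of a plain string.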
Processing a document generates the following events:
Event Type | Where It Occurs | Resource Class |
---|---|---|
START_DOCUMENT | At the start of the XLIFF document. | none |
START_XLIFF | At the end of the start tag of the <xliff> element. | StartXliffData |
START_FILE | At the end of the start tag of each <file> element. | StartFileData |
SKELETON | At the end of the <skeleton> element. | Skeleton |
MID_FILE | After the SKELETON event, before the first START_GROUP or TEXT_UNIT event. | MidFileData |
START_GROUP | At the end of the start tag of each <group> element. | StartGroupData |
TEXT_UNIT | At the end of each <unit> element. | Unit |
END_GROUP | At the end of the closing tag of each <group> element. | none |
END_FILE | At the end of the closing tag of each <file> element. | none |
END_XLIFF | At the end of the closing tag of the <xliff> element. | none |
END_DOCUMENT | At the end of the document. | none |
INSIGNIFICANT_PART | At the end of each insignificant part of the document (e.g. space between elements). | InsingnificantPartData |
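The table implies a strict nesting: every START_* event is eventually closed by its END_* counterpart, and groups may nest inside each other. The following is a minimal, self-contained sketch of how a consumer might track that nesting. The `EventType` enum here is a hypothetical stand-in that mirrors the names in the table, not the toolkit's own type, and the event list is written by hand rather than produced by `XLIFFReader`.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical stand-in mirroring the event names in the table;
// this is not the toolkit's own enum.
enum EventType {
    START_DOCUMENT, START_XLIFF, START_FILE, SKELETON, MID_FILE,
    START_GROUP, TEXT_UNIT, END_GROUP, END_FILE, END_XLIFF,
    END_DOCUMENT, INSIGNIFICANT_PART
}

class EventNesting {

    // Returns the END_* event that closes the given START_* event,
    // or null when the event does not open a scope.
    static EventType closerOf(EventType type) {
        switch (type) {
            case START_DOCUMENT: return EventType.END_DOCUMENT;
            case START_XLIFF:    return EventType.END_XLIFF;
            case START_FILE:     return EventType.END_FILE;
            case START_GROUP:    return EventType.END_GROUP;
            default:             return null;
        }
    }

    static boolean isEnd(EventType type) {
        return type == EventType.END_DOCUMENT || type == EventType.END_XLIFF
            || type == EventType.END_FILE || type == EventType.END_GROUP;
    }

    // Returns the deepest group nesting seen in the stream, and throws
    // IllegalStateException if a start/end pair does not match.
    static int maxGroupDepth(List<EventType> events) {
        Deque<EventType> expectedClosers = new ArrayDeque<>();
        int groupDepth = 0;
        int maxDepth = 0;
        for (EventType type : events) {
            EventType closer = closerOf(type);
            if (closer != null) {
                expectedClosers.push(closer);
                if (type == EventType.START_GROUP) {
                    groupDepth++;
                    maxDepth = Math.max(maxDepth, groupDepth);
                }
            } else if (isEnd(type)) {
                if (expectedClosers.isEmpty() || expectedClosers.pop() != type) {
                    throw new IllegalStateException("Unexpected " + type);
                }
                if (type == EventType.END_GROUP) {
                    groupDepth--;
                }
            }
            // SKELETON, MID_FILE, TEXT_UNIT and INSIGNIFICANT_PART
            // neither open nor close a scope.
        }
        if (!expectedClosers.isEmpty()) {
            throw new IllegalStateException("Missing " + expectedClosers.peek());
        }
        return maxDepth;
    }

    public static void main(String[] args) {
        List<EventType> events = Arrays.asList(
            EventType.START_DOCUMENT, EventType.START_XLIFF, EventType.START_FILE,
            EventType.SKELETON, EventType.MID_FILE,
            EventType.START_GROUP, EventType.START_GROUP, EventType.TEXT_UNIT,
            EventType.END_GROUP, EventType.END_GROUP, EventType.TEXT_UNIT,
            EventType.END_FILE, EventType.END_XLIFF, EventType.END_DOCUMENT);
        System.out.println(maxGroupDepth(events)); // prints 2
    }
}
```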
Note that even if there is no skeleton or no element before the first <group> or <unit>, the SKELETON and MID_FILE events are always generated. This allows you to add new elements or perform some action at those locations even if there is no existing corresponding data in the input document.
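Since MID_FILE is always generated, a handler can rely on it as a fixed hook. The sketch below shows the shape of such a hook under simplifying assumptions: `Kind` is a hypothetical stand-in for the event types, and the "insertion" is just an extra string in a trace, not a real XLIFF element written through the toolkit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class MidFileHook {

    // Hypothetical stand-in for the event types; only the ones needed
    // for this sketch are listed.
    enum Kind { START_FILE, SKELETON, MID_FILE, TEXT_UNIT, END_FILE }

    // Copies the event stream to a textual trace and adds one extra entry
    // right after every MID_FILE event.
    static List<String> traceWithInsertion(List<Kind> events, String insertion) {
        List<String> trace = new ArrayList<>();
        for (Kind kind : events) {
            trace.add(kind.name());
            if (kind == Kind.MID_FILE) {
                trace.add(insertion);
            }
        }
        return trace;
    }

    public static void main(String[] args) {
        // The input has no skeleton content, but the SKELETON and MID_FILE
        // events still appear, so the hook still runs.
        List<Kind> events = Arrays.asList(
            Kind.START_FILE, Kind.SKELETON, Kind.MID_FILE, Kind.TEXT_UNIT, Kind.END_FILE);
        for (String line : traceWithInsertion(events, "(file-level data added here)")) {
            System.out.println(line);
        }
    }
}
```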