Any plans to support change tracking? (specifically MS Word)

Issue #721 new
Oytun Tez created an issue

Hi there,

We are using Okapi as a library (not the applications) and we want to support preservation of change tracking tags in DOCX files. When we disable bPreferenceAutomaticallyAcceptRevisions, we receive an exception. We are now trying to fork and see how we can quickly work around it, but we would like to hear some insights about this problem.

  1. Do you have any plans to support change tracking?
  2. Have you tried implementing it before? We can't see much history about this issue. What kind of barriers do you think we should expect?

This looks like a complicated task, so we are basically asking for any kind of insight into this issue.

Thank you, Oytun

Comments (2)

  1. Oytun Tez reporter

    I want to clarify one thing: we can initially implement only "preservation of change elements".

    Example:

                <w:del w:id="1" w:author="Oytun Tez" w:date="2018-05-15T14:58:00Z">
                    <w:r w:rsidR="00760CFE" w:rsidDel="005D030A">
                        <w:rPr>
                            <w:color w:val="1155CC" />
                            <w:u w:val="single" />
                        </w:rPr>
                        <w:delText>SOT</w:delText>
                    </w:r>
                </w:del>
                <w:r w:rsidR="00760CFE">
                    <w:rPr>
                        <w:color w:val="FF0000" />
                    </w:rPr>
                    <w:t>Hello world for content to be translated</w:t>
                </w:r>
                <w:ins w:id="2" w:author="Oytun Tez" w:date="2018-05-17T13:36:00Z">
                    <w:r w:rsidR="00D22E43">
                        <w:t xml:space="preserve"> tomorrow</w:t>
                    </w:r>
                </w:ins>
    

    Okapi would not extract the content of <w:del/> and <w:ins/> elements, only the regular content. On merge, Okapi would still preserve the <w:del/> and <w:ins/> elements as they are.

    The translated document would look like this:

                <w:del w:id="1" w:author="Oytun Tez" w:date="2018-05-15T14:58:00Z">
                    <w:r w:rsidR="00760CFE" w:rsidDel="005D030A">
                        <w:rPr>
                            <w:color w:val="1155CC" />
                            <w:u w:val="single" />
                        </w:rPr>
                        <w:delText>SOT</w:delText>
                    </w:r>
                </w:del>
                <w:r w:rsidR="00760CFE">
                    <w:rPr>
                        <w:color w:val="FF0000" />
                    </w:rPr>
                    <w:t>This is the translated hello world content - and only this content was translated, the XLIFF does not present w:del and w:ins elements</w:t>
                </w:r>
                <w:ins w:id="2" w:author="Oytun Tez" w:date="2018-05-17T13:36:00Z">
                    <w:r w:rsidR="00D22E43">
                        <w:t xml:space="preserve"> tomorrow</w:t>
                    </w:r>
                </w:ins>
    

    This way, we don't have to decide on accept/reject automatically; that would be the user's problem. Accepted (regular) content would still be exported for translation and imported back into its original place.

    Would this help you guide us?
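
To make the proposed behavior concrete, here is a minimal standalone sketch in Python. It is not Okapi code and does not use any Okapi API; the helper names `regular_text_nodes` and `translate_paragraph`, the wrapping `<w:p>` element, and the uppercase "translation" are assumptions made purely for illustration. It walks a paragraph, skips everything inside `w:del` and `w:ins`, and rewrites only the regular `w:t` text.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace, as used by the w: prefix in DOCX files.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W)  # keep the w: prefix if the tree is serialized

# Tracked-change containers whose content is preserved but never extracted.
TRACKED = {f"{{{W}}}del", f"{{{W}}}ins"}

# Simplified version of the example above, wrapped in a paragraph (assumption).
SAMPLE = f"""<w:p xmlns:w="{W}">
  <w:del w:id="1" w:author="Oytun Tez" w:date="2018-05-15T14:58:00Z">
    <w:r><w:delText>SOT</w:delText></w:r>
  </w:del>
  <w:r><w:t>Hello world for content to be translated</w:t></w:r>
  <w:ins w:id="2" w:author="Oytun Tez" w:date="2018-05-17T13:36:00Z">
    <w:r><w:t xml:space="preserve"> tomorrow</w:t></w:r>
  </w:ins>
</w:p>"""


def regular_text_nodes(parent):
    """Yield w:t elements that are not nested inside w:del or w:ins."""
    for child in parent:
        if child.tag in TRACKED:
            continue  # leave the tracked change and its subtree untouched
        if child.tag == f"{{{W}}}t":
            yield child
        else:
            yield from regular_text_nodes(child)


def translate_paragraph(xml_text, translate):
    """Apply `translate` to regular text only; return the modified tree."""
    root = ET.fromstring(xml_text)
    for t in regular_text_nodes(root):
        t.text = translate(t.text or "")
    return root


# Stand-in "translation": uppercase the extracted text.
root = translate_paragraph(SAMPLE, str.upper)
```

A real filter would also have to deal with runs split across revisions and with other revision markup, so this only shows the skip-and-preserve idea, not a complete design.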

  2. Chase Tingley

    @DenisKonovalyenko and I talked about this a long time ago when we were reworking the filter. IIRC there is some technical complexity in that the markup inside the insertion and deletion sections can be complicated, but the filter can probably handle that.

    The main reason I didn't want to support this originally is that I don't understand the use case. Translating change tracking seems like an anti-pattern most of the time. The segments it produces are difficult to translate and obey no real grammatical structure (since they contain multiple revisions simultaneously), which makes them useless for TM, etc.

    If you are looking to implement it, though, I can't think of an objection. I'm sure somebody, somewhere, wants to translate this stuff.
