Issue #2076 open

merge-doc.js cannot merge Word 2010 documents

Ricardo Navarro
created an issue

Hello,

This applies to TortoiseHg 2.4.2. Version 2.1.4 does not have this issue.

Here is a way to reproduce the problem, which may explain it better:

1) Create a repository 2) Add a Word document (e.g. wordfile.docx) with some content and commit 3) Create a named branch (e.g. Branch 1), modify the Word document and commit 4) Update to the default branch, modify the Word document and commit 5) You end up with two branches, default and Branch 1, with two versions of the same document 6) Try "Merge with local" (Branch 1 with default) 7) Click on resolve conflicts; the Resolve conflicts window shows 8) Use docdiff and click on Tool Resolve 9) An error window shows for script merge-doc.js, line 92, char 9: "Error: This file could not be found. (C:\repo\wordfile)"

As you can see, the argument given to the script merge-doc.js is "wordfile" when it should be "wordfile.docx".

If you print out the variable sBaseDoc at line 44 of merge-doc.js, you can see that it is "wordfile" instead of "wordfile.docx". The remaining arguments, i.e. sMyDoc, sTheirDoc and sMergedDoc, seem to be fine.

I hope this explanation is understandable. I attach a repository ready to be merged so you don't have to create one and a couple of screenshots.

Thanks!

Comments (36)

  1. Ricardo Navarro reporter

    The steps to reproduce the problem in a proper list (sorry for the mess above):

    1. Create a repository.
    2. Add a word document (e.g. wordfile.docx) with some content and commit.
    3. Create a named branch (e.g. Branch 1), modify the Word document and commit.
    4. Update to the default branch, modify the word document and commit.
    5. You end up with two branches, default and branch 1, with two versions of the same document.
    6. Try "Merge with local" (branch 1 with default).
    7. Click on resolve conflicts, Resolve conflicts window shows.
    8. Use docdiff and click on Tool Resolve.
    9. An error window shows for script merge-doc.js, line 92, char 9: "Error: This file could not be found. (C:\repo\wordfile)".
  2. Ricardo Navarro reporter

    TortoiseHg version 2.2.2 does not have this problem; merging is perfect. However, the issue appears for the first time in version 2.3 (the next one). Hence, the problem must have been introduced between:

  3. Ricardo Navarro

    Thanks for the tip.

    Unfortunately I was not able to pinpoint the bug; I only have a Windows machine at work.

    I think the easiest way would be to debug it, with GDB or similar, and check that the argument passed to docdiff contains both the file name and the file extension.

  4. Yuya Nishihara

    Could you test the attached docdiff.exe?

    It includes the following change:

    diff --git a/win32/docdiff.py b/win32/docdiff.py
    --- a/win32/docdiff.py
    +++ b/win32/docdiff.py
    @@ -56,10 +56,10 @@ def main():
             sys.exit(1)
         elif len(args) == 2:
             local, other = [os.path.abspath(f) for f in args]
    -        base, ext = os.path.splitext(local)
    +        _base, ext = os.path.splitext(local)
         else:
             local, base, other, output = [os.path.abspath(f) for f in args]
    -        base, ext = os.path.splitext(output)
    +        _base, ext = os.path.splitext(output)
     
         if not ext or ext.lower()[1:] not in scripts.keys():
             print 'Unsupported file type', ext
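
    To make the effect of this patch easier to see, here is a minimal Python sketch of the four-argument (merge) case. It is an illustration only, not the actual docdiff.py code: the two helper functions and the file names are hypothetical. The point it shows is that, before the patch, the stem returned by os.path.splitext() was assigned to the same name as the base document path, which would explain why merge-doc.js received "wordfile" instead of "wordfile.docx".

```python
import os.path


def resolve_args_buggy(args):
    """Mimic the pre-patch logic: 'base' is reused for the splitext stem."""
    local, base, other, output = [os.path.abspath(f) for f in args]
    base, ext = os.path.splitext(output)  # overwrites the base document path
    return local, base, other, output, ext


def resolve_args_fixed(args):
    """Mimic the patched logic: the stem goes into a throwaway name."""
    local, base, other, output = [os.path.abspath(f) for f in args]
    _base, ext = os.path.splitext(output)  # base document path is preserved
    return local, base, other, output, ext


# Hypothetical file names, chosen only for illustration.
ARGS = ["wordfile.docx", "wordfile~base.docx", "wordfile~other.docx", "wordfile.docx"]

if __name__ == "__main__":
    print("buggy base:", resolve_args_buggy(ARGS)[1])
    print("fixed base:", resolve_args_fixed(ARGS)[1])
```

    In the buggy variant the base path ends in "wordfile" with no extension, matching the path in the error message from the original report.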
    
  5. Ricardo Navarro

    It seems to work partially: at least I can see the "Word diff" view ready to be merged. However, your docdiff currently launches three instances of Word:

    1. The first with the "base" document (no diff).
    2. The second is the "base" + "other" document, in a diff view (default branch)
    3. The third is the "local" + "default branch", in a diff view ("base + other" merged)

    In TortoiseHg version 2.2.2 you only get one instance of Word with the three documents in one diff view: "base", "other" and "local".

    Could you check whether this is happening only on my side?

    Thanks!

  6. Ricardo Navarro

    I'm re-checking this problem with your docdiff and unfortunately it seems that it is not handling the diffs correctly.

    I'll try to explain using my example in Attachment.zip. This is the glog:

    $ hg glog
    @  changeset:   2:bef6434d27d2
    |  tag:         tip
    |  parent:      0:0e60b79cfb52
    |  user:        User <name@email.net>
    |  date:        Fri Aug 10 15:56:50 2012 +0200
    |  summary:     Default branch modifications
    |
    | @  changeset:   1:a4661650f26f
    |/   branch:      Branch 1
    |    user:        User <name@email.net>
    |    date:        Fri Aug 10 15:56:01 2012 +0200
    |    summary:     Created branch 1
    |
    o  changeset:   0:0e60b79cfb52
       user:        User <name@email.net>
       date:        Fri Aug 10 15:54:30 2012 +0200
       summary:     Added files
    

    When you merge the changeset 1 with the 2 in TortoiseHg 2.2.2, in Word you get one instance with three windows (as expected), like this:

    -----------------------------------
    |                |                |
    |                |                |
    |                |   Changeset 1  |
    |     DIFF       |                |
    |      of        |                |
    |  changesets    |                |
    |    1 & 2       |----------------|
    |                |                |
    |                |                |
    |                |   Changeset 2  |
    |                |                |
    |                |                |
    |                |                |
    -----------------------------------
    

    After saving in Word, the status of the file in the "Resolve Conflicts" window (of TortoiseHg) is "Resolved Conflict".

    Currently, the docdiff you provide (installed in TortoiseHg 2.5) seems to have these issues:

    • Word launches 3 instances instead of one.
    • The diff views are not correct: changeset 0 shows up and has to be merged in any case (you cannot get rid of it). This should not occur; only changesets 1 & 2 should be merged.
    • After saving the Word document, the merge status is still "unresolved". This might lead to the conclusion that the merge was not performed.

    I don't know if I'm getting these results only on my computer or if this is also happening on yours.

    I think we are pretty close to the final solution, it is just a matter of digging a little bit more.

    Thanks!

  7. Yuya Nishihara

    The easiest workaround is to use the old merge-doc.js.

    I guess the bug in docdiff.exe, which is just a bridge between TortoiseHg and the diff-scripts, was fixed by my change. But still, the bundled merge-doc.js seems to have another issue.

    I read the TortoiseSVN changelog, but couldn't find out why it handles Office 2010 differently. Maybe we need a COM and MS Word expert.

  8. Ricardo Navarro

    To Yuya Nishihara: I have checked what you proposed and it works perfectly. The docdiff you provided fixes the parameter, and the merge-doc.js from TortoiseHg 2.2.2 handles the merging perfectly. Thanks! You've nailed it.

    To Steve Borho: Agreed, the issue is resolved on the TortoiseHg side. Do you know who the maintainer of merge-doc.js is? The current version (in TortoiseHg 2.5) cannot be used, but the version in 2.2.2 works like a charm.

    BTW, thanks to both of you for your efforts in resolving this issue.

  9. Ricardo Navarro

    Yuya, if you need anything just tell me. I don't know right now what would be the next step.

    In the meantime I'm going to check out the script to learn how it works and what the differences between the two versions are. Unfortunately I do not know JS, but at least I'll try.

    Cheers.

  10. Yuya Nishihara

    Not sure, but possibly the following change could solve the problem:

    diff --git a/contrib/diff-scripts/merge-doc.js b/contrib/diff-scripts/merge-doc.js
    --- a/contrib/diff-scripts/merge-doc.js
    +++ b/contrib/diff-scripts/merge-doc.js
    @@ -90,7 +90,7 @@ else //2010 - handle slightly differentl
     {
             theirDoc = baseDoc;
             baseDoc = word.Documents.Open(sBaseDoc);
    -        myDoc = word.Documents.Open(sMyDoc);
    +        myDoc = word.Documents.Open(sMergedDoc);
     
             baseDoc.Activate(); //required otherwise it compares the wrong docs !!!
             baseDoc.Compare(sTheirDoc, "theirs", wdCompareTargetSelected, true, true);
    

    sMergedDoc should be the output file, but it is not touched in the Office 2010 branch of the if statement.

    Reference for the doc.Merge() method: http://msdn.microsoft.com/en-us/library/microsoft.office.interop.word._document.merge.aspx

  11. Ricardo Navarro

    Hi Yuya, thanks for your help!

    Unfortunately, the change did not resolve the problem.

    I made some modifications and now it seems to be ok. Now you get only two documents open: one is the diff between the "base" and "theirs", the other is the diff between "mine" and "theirs". Hence you have the 3-way merge. Here is the diff between the merge-doc.js from TortoiseHg 2.5 and my modifications:

    19c19
    < var objArgs,num,sTheirDoc,sMyDoc,sBaseDoc,sMergedDoc,objScript,word,baseDoc,myDoc,theirDoc,WSHShell;
    ---
    > var objArgs,num,sTheirDoc,sMyDoc,sBaseDoc,sMergedDoc,objScript,word,baseDoc,myDoc,WSHShell;
    72,74d71
    < // Open the base document
    < baseDoc = word.Documents.Open(sTheirDoc);
    <
    77a75
    >         baseDoc = word.Documents.Open(sTheirDoc);
    81a80
    >         baseDoc = word.Documents.Open(sTheirDoc);
    85a85
    >         baseDoc = word.Documents.Open(sTheirDoc);
    91d90
    <         theirDoc = baseDoc;
    101,102d99
    <         //theirDoc.Save();
    <         //myDoc.Save();
    104a102
    >         baseDoc.Close();
    

    I'm also uploading the file as merge-doc_proposal_1.js

    What do you think?

  12. Yuya Nishihara

    Your change looks good to me, but since I'm not a Windows user, I can't say whether it's correct. Still, I don't understand how it can achieve merging without sMergedDoc, the output file parameter.

    Also, I guess the current implementation avoids baseDoc.Close() intentionally, as of r22247: http://code.google.com/p/tortoisesvn/source/detail?r=22247#

    FWIW, I recommend sending a patch to the thg-dev mailing list, or preferably to the TortoiseSVN team if you have the time to test it with TortoiseSVN. They would have much more experience with Windows stuff.

  13. Ricardo Navarro

    Having reviewed the SVN implementation, it seems that their approach is to have 4 "Word 2010" instances open. From my point of view this is not desirable, because a 3-way merge only needs two diffs: given documents "base", "theirs" and "mine", the only relevant diffs are "base-theirs" and "theirs-mine". Hence, only two instances are necessary. The SVN approach additionally opens the original "base" document and the original "theirs" document; no rationale was given for this decision.
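
    The "two diffs" idea can be sketched on plain text with Python's difflib. This is only an analogy under stated assumptions: Word's Compare feature works on documents, not line lists, and the function names and sample contents below are hypothetical.

```python
import difflib


def two_way_diff(old, new, old_name, new_name):
    """Return a unified diff between two lists of lines."""
    return list(
        difflib.unified_diff(old, new, fromfile=old_name, tofile=new_name, lineterm="")
    )


def three_way_views(base, theirs, mine):
    """Build the two diff views argued to be sufficient for a 3-way merge."""
    return {
        "base-theirs": two_way_diff(base, theirs, "base", "theirs"),
        "theirs-mine": two_way_diff(theirs, mine, "theirs", "mine"),
    }


# Hypothetical document contents, one list entry per line.
BASE = ["Title", "Shared paragraph", "Footer"]
THEIRS = ["Title", "Shared paragraph, edited on Branch 1", "Footer"]
MINE = ["Title (default)", "Shared paragraph", "Footer"]

if __name__ == "__main__":
    for name, lines in three_way_views(BASE, THEIRS, MINE).items():
        print("==", name, "==")
        print("\n".join(lines))
```

    Only two views are produced; the unmodified "base" and "theirs" documents are never shown on their own.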

    I would keep things simple. If the TortoiseHg Dev Team agrees with me, then the baseDoc.Close() instruction should be there (or some other mechanism); however, this will mark a separation from SVN development and will imply that TortoiseHg has to maintain its own version of the merging scripts.

    I will send a patch to thg-dev mailing list to see what they think.

    Thanks!

  14. Ricardo Navarro

    BTW, I've found the thg-winbuild repository where the diff-scripts are located. Therefore the following sentence (in my last post) no longer applies:

    "however, this will mark a separation from SVN development and will imply that TortoiseHg will have to keep its own version control of the merging scripts."
    

    Thanks!

  15. Yuya Nishihara

    "I think I have found a simpler way."

    That's nice. Thanks!

    "the thg-winbuild repository where the diff-scripts are located. Therefore the following sentence (in my last post) is not applicable"

    Even though thg-winbuild has a copy of the diff-scripts, patching it on the thg side will increase the future maintenance cost.

  16. Ricardo Navarro

    Hi Yuya, sorry for the delay, but I have had a lot of work over the last week.

    Before sending anything to the dev mailing list I would prefer to post the diff here. It is very simple, only changing a "<" to a "<="; however, I think it is your call whether to include it, as it affects maintainability. This change only closes the base document, as already happens with other versions of Office, so you only have two instances open: one is your diff with respect to the latest change, and the other is the diff between base and the latest change. Here is the change:

    diff -r 601256023034 -r 21650f7d2b4b contrib/diff-scripts/merge-doc.js
    --- a/contrib/diff-scripts/merge-doc.js	Thu Sep 13 15:55:17 2012 -0500
    +++ b/contrib/diff-scripts/merge-doc.js	Tue Oct 02 16:06:22 2012 +0200
    @@ -111,7 +111,7 @@
     }
     
     // Close the first document
    -if ((parseInt(word.Version) >= vOffice2002)&&(parseInt(word.Version) < vOffice2010))
    +if ((parseInt(word.Version) >= vOffice2002)&&(parseInt(word.Version) <= vOffice2010))
     {
             baseDoc.Close();
     }
    

    Let me know what you think.

  17. Yuya Nishihara

    The new code says it should skip baseDoc.Close() on the upcoming Office 2013. But why?

    "I think it is your call to include it or not as it affects maintainability."

    Maybe Steve will decide. I don't have enough experience to make decisions about the win32 build. That's why I suggested sending a patch to thg-dev.

  18. Ricardo Navarro

    Hello Cowwoc.

    It is not my call whether to modify the code, as I am not a developer. I just proposed a workaround: replace a "less than" with a "less than or equal to". But as it is a workaround, an appropriate fix has to be assessed, approved and included in the code by the developers.

    As Yuya said, this workaround leaves out Office 2013; however, I think that removing the last part of the IF condition should do the trick:

    if (parseInt(word.Version) >= vOffice2002)
     {
             baseDoc.Close();
     }
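
    To compare the three variants of the close condition discussed in this thread, here is a small Python sketch. It rests on assumptions not shown in the thread: that vOffice2002 is 10 and vOffice2010 is 14, and that Word 2013 reports version "15.0". The function names are hypothetical, and the real script is JScript, not Python.

```python
# Assumed constants: the thread does not show how merge-doc.js defines them.
V_OFFICE_2002 = 10
V_OFFICE_2010 = 14


def major_version(version_string):
    """Rough equivalent of parseInt(word.Version) for strings like '14.0'."""
    return int(version_string.split(".")[0])


def closes_base_original(version):
    """Condition before the workaround: Office 2010 is excluded."""
    return V_OFFICE_2002 <= version < V_OFFICE_2010


def closes_base_workaround(version):
    """'<' replaced by '<=': Office 2010 included, later versions excluded."""
    return V_OFFICE_2002 <= version <= V_OFFICE_2010


def closes_base_proposed(version):
    """Upper bound removed: every version from Office 2002 onwards."""
    return version >= V_OFFICE_2002


if __name__ == "__main__":
    # Assumed version strings for each Word release.
    for label, raw in [("Word 2007", "12.0"), ("Word 2010", "14.0"), ("Word 2013", "15.0")]:
        v = major_version(raw)
        print(label, closes_base_original(v), closes_base_workaround(v), closes_base_proposed(v))
```

    Under these assumptions, only the last variant would close the base document on Word 2013 as well as on Word 2010.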
    

    A JavaScript programmer who knows MS Word should take a look at this issue. Unfortunately, I cannot help much more. I hope it can be fixed.

  19. Telis Kit

    I've managed to overcome this problem by replacing diff-scripts\diff-doc.js with diff-doc.js from TortoiseSVN. Maybe the authors can simply update the script provided in the TortoiseHg installation and this issue will be resolved?
