Issue #139 resolved

Problems with mercurial changelog encoding on windows

Lukas Wöhrl
created an issue

I get the attached stacktrace when I want to take a look at the mercurial commits or I want to look at the "activities" of the repository. The server hangs partly and must be restarted!

I'm using the latest version 1.14 so it can't be related to Issue #95 or Issue #98

Comments (34)

  1. Sebastian Sebastian repo owner

    After looking into the Mercurial source code, I think this is not an encoding problem. It looks like a problem with the size of the XML returned by the Python process. Could you please do the following steps to verify that:

    • restart scm-manager
    • edit the file lib/python/changelog.py in your scm home directory
    • search for the following method:
    def appendChangesetNode(doc, parentNode, ctx):
      changesetNode = createChildNode(doc, parentNode, 'changeset')
      appendIdNode(doc, changesetNode, ctx)
      appendParentNodes(doc, changesetNode, ctx)
      appendTextNode(doc, changesetNode, 'description', ctx.description())
      appendDateNode(doc, changesetNode, 'date', ctx.date())
      appendAuthorNodes(doc, changesetNode, ctx)
      appendBranchesNode(doc, changesetNode, ctx)
      appendListNodes(doc, changesetNode, 'tags', ctx.tags())
      appendModifications(doc, changesetNode, ctx)
    
    • comment out the last line of the method:
    def appendChangesetNode(doc, parentNode, ctx):
      changesetNode = createChildNode(doc, parentNode, 'changeset')
      appendIdNode(doc, changesetNode, ctx)
      appendParentNodes(doc, changesetNode, ctx)
      appendTextNode(doc, changesetNode, 'description', ctx.description())
      appendDateNode(doc, changesetNode, 'date', ctx.date())
      appendAuthorNodes(doc, changesetNode, ctx)
      appendBranchesNode(doc, changesetNode, ctx)
      appendListNodes(doc, changesetNode, 'tags', ctx.tags())
      # appendModifications(doc, changesetNode, ctx)
    
    • try to reproduce the error again

    Note: SCM-Manager rewrites the Python files on every start, so make the edit after the restart.

  2. Sebastian Sebastian repo owner

    Sorry, but I'm not able to reproduce this issue. How many files have changed in the first 20 commits? I ask because I created a test repository with changes to 96860 files in the first 20 commits and was not able to reproduce the issue.

    Please create a new file at lib/python/test.py:

    import os
    os.environ['SCM_PATH'] = ''
    os.environ['SCM_REVISION_START'] = ''
    os.environ['SCM_REVISION_END'] = ''
    os.environ['SCM_REVISION'] = ''
    
    from util import *
    from changelog import *
    

    And a bat file to start the python file:

    @echo off
    set SCM_REPOSITORY_PATH=C:\Users\svndata\.scm\repositories\hg\REPOSITORYNAME
    set SCM_PAGE_START=0
    set SCM_PAGE_LIMIT=20
    C:\Python26\python.exe C:\Users\svndata\.scm\lib\python\test.py
    

    Please execute the batch file. Do you get the same error? If you get the same error, please try to modify C:\Users\svndata\.scm\lib\python\util.py as described below. If you do not get the error, redirect the output of the batch file to a file (test.bat > changelog.xml) and post the size of this file.

    Search for the following method:

    def writeXml(doc):
      # print doc.toprettyxml(indent="  ")
      doc.writexml(sys.stdout, encoding='UTF-8')
    

    And replace it with:

    def writeXml(doc):
      # print doc.toprettyxml(indent="  ")
      doc.writexml(sys.__stdout__, encoding='UTF-8')
    

    Try to reproduce the error again.

    Note: Keep in mind that SCM-Manager rewrites the Python files on every start.
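
The difference between sys.stdout and sys.__stdout__ in the change above can be sketched as follows. This is only an illustration in Python 3, not code from SCM-Manager; the function name stdout_was_replaced is made up for this example. sys.__stdout__ keeps pointing at the stream the interpreter started with, even if something has rebound sys.stdout in the meantime.

```python
import io
import sys


def stdout_was_replaced():
    """Rebind sys.stdout temporarily and report what was observed.

    Returns the text captured by the replacement stream and whether
    sys.stdout was still the interpreter's original stream at that time.
    """
    original = sys.stdout
    buffer = io.StringIO()
    sys.stdout = buffer  # simulate a library that redirects stdout
    try:
        print("captured")
        still_original = sys.stdout is sys.__stdout__
    finally:
        sys.stdout = original  # always restore the previous stream
    return buffer.getvalue(), still_original


captured_text, still_original = stdout_was_replaced()
```

Writing to sys.__stdout__ therefore bypasses any such redirection, which is presumably why the change was suggested as a diagnostic step.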

  3. Lukas Wöhrl reporter

    I didn't get this error using that test script. The size of the changelog is 33 KB.

    I don't know the exact number of files changed, but it is approximately 100.

  4. Sebastian Sebastian repo owner

    OK, then I think the size is not the problem.

    Could you enable the trace log of scm-manager and call the commit function again? You should see the complete process environment in the log file. Could you modify test.py and set the same environment variables in the script (in the same way as os.environ['SCM_PATH'] = '')? After that, please try again to reproduce the error with the script.

    https://bitbucket.org/sdorra/scm-manager/wiki/faq (How do i enable trace logging?)

  5. Sebastian Sebastian repo owner

    This issue is really strange. The Python stack trace shows a problem while writing the XML. Is it possible for you to send me the XML output of changelog.py for a detailed analysis? If this is not possible, I will try to write a small analysis program this weekend.

    Please try to change the output method of util.py:

    def writeXml(doc):
      # print doc.toprettyxml(indent="  ")
      doc.writexml(sys.stdout, encoding='UTF-8')
    

    And replace it with:

    def writeXml(doc):
      # print doc.toprettyxml(indent="  ")
      doc.writexml(sys.__stdout__, encoding='UTF-8')
    

    And please monitor the memory usage of your system.

  6. Sebastian Sebastian repo owner

    Sorry, I was distracted too much by the Python stack trace. It is an encoding problem. I missed the MalformedByteSequenceException in the stack trace. I'll try to reproduce the problem and fix it.
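
A malformed byte sequence of this kind can be reproduced in isolation. The following is a minimal Python 3 sketch, assuming the changelog XML is declared as UTF-8 while the text inside it is actually encoded in a Windows code page such as cp1252. The element name, the helper parses_as_utf8 and the sample word are made up for illustration; the sketch uses Python's own XML parser rather than the Java parser from the stack trace, but the parser rejects the input for the same underlying reason.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# The XML declaration promises UTF-8 to the parser.
HEADER = b'<?xml version="1.0" encoding="UTF-8"?>'


def parses_as_utf8(payload: bytes) -> bool:
    """Return True if an XML document wrapping the payload parses cleanly."""
    document = HEADER + b"<description>" + payload + b"</description>"
    try:
        minidom.parseString(document)
        return True
    except ExpatError:
        # Raised when the bytes are not valid for the declared encoding.
        return False


text = "ge\u00e4ndert"  # contains the umlaut "ä"
good = parses_as_utf8(text.encode("utf-8"))   # bytes match the declaration
bad = parses_as_utf8(text.encode("cp1252"))   # "ä" becomes the single byte 0xe4
```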

  7. Sebastian Sebastian repo owner

    Sorry for the trouble, please try the following.

    Please try to change the appendValue method of util.py:

    def appendValue(doc, node, value):
      textNode = doc.createTextNode(value)
      node.appendChild(textNode)
    

    And replace it with:

    def appendValue(doc, node, value):
      textNode = doc.createTextNode(value.encode('utf-8'))
      node.appendChild(textNode)
    
  8. Lukas Wöhrl reporter

    Sorry, but a copy is not possible.

    I tried the UTF-8 change, the new error is now:

    UnicodeDecodeError: 'ascii' codec can't decode byte 0xe4 in position 32: ordinal not in range(128)
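
The error above is typical for Python 2, where calling encode on a byte string first decodes it implicitly as ASCII. The sketch below mimics that behaviour in Python 3 by making the hidden decode step explicit. The function name py2_style_encode and the sample word are assumptions for illustration, and the position of the offending byte differs from the one in the reported message.

```python
def py2_style_encode(raw: bytes) -> bytes:
    """Mimic Python 2's str.encode('utf-8') on a byte string.

    Python 2 decoded the bytes as ASCII behind the scenes before encoding
    them again, so any byte above 0x7f made the call fail.
    """
    return raw.decode("ascii").encode("utf-8")


# A commit message fragment as stored by a Windows client using cp1252.
raw = "ge\u00e4ndert".encode("cp1252")

try:
    py2_style_encode(raw)
    error = None
except UnicodeDecodeError as exc:
    error = exc  # the 'ascii' codec cannot decode the byte 0xe4
```

In this sketch, error.start gives the index of the first byte that the ASCII codec rejected.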
    
  9. Sebastian Sebastian repo owner

    OK, I was now able to reproduce your error and I think I found a solution:

    Modify the util.py again.

    Search:

    from mercurial import hg, ui, commands
    

    And replace it with:

    from mercurial import hg, ui, commands, encoding
    

    Search:

    def appendValue(doc, node, value):
      textNode = doc.createTextNode(value)
      node.appendChild(textNode)
    

    And replace it with:

    def appendValue(doc, node, value):
      textNode = doc.createTextNode(encoding.tolocal(value))
      node.appendChild(textNode)
    

    This fixes the issue in my test environment.
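
The general principle behind such a fix can be sketched in Python 3: convert the raw bytes with their real source encoding before handing the text to the DOM, so that the serialized XML matches its UTF-8 declaration. This is not what mercurial's encoding.tolocal does internally and not the SCM-Manager code; the function build_description_xml, the element name and the cp1252 source encoding are assumptions for illustration.

```python
from xml.dom import minidom


def build_description_xml(raw_description: bytes, source_encoding: str) -> bytes:
    """Serialize a description as UTF-8 XML, decoding the input explicitly."""
    # Decode with the encoding the bytes were really written in.
    text = raw_description.decode(source_encoding)
    doc = minidom.Document()
    node = doc.createElement("description")
    node.appendChild(doc.createTextNode(text))
    doc.appendChild(node)
    # With an encoding argument, toxml returns bytes in that encoding.
    return doc.toxml(encoding="UTF-8")


xml_bytes = build_description_xml("ge\u00e4ndert".encode("cp1252"), "cp1252")

# Parsing the result again recovers the original text.
round_trip = minidom.parseString(xml_bytes).documentElement.firstChild.data
```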
