Malformed URL cloning repo with whitespaces in paths.

Create issue
Issue #104 resolved
Andrea Francia created an issue

In revision r5 trash-cli contains a white space in one of its directories. There is a way to start the clone only after a specific revision?

{{{ andrea@puzzle:~/src$ hg svn --version Mercurial Distributed SCM (version 23e941d7f507)

Copyright (C) 2005-2009 Matt Mackall mpm@selenic.com and others This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

andrea@puzzle:~/src$ hg identify ../hgsubversion/ ff69f1855086 tip

andrea@puzzle:~/src$ hg clone --stupid http://trash-cli.googlecode.com/svn/ trash-cli [r1] andreafrancia: First import. Fetching entire revision: previous revision does not exist. False madebranch None [r2] andreafrancia: Added python. [r3] andreafrancia: empty log message [r4] andreafrancia: empty log message [r5] andreafrancia: removed old version based on bash and c code. abort: ("URL 'http://trash-cli.googlecode.com/svn/old trunk' is malformed or the scheme or host or path is missing", 170000)

}}}

Comments (29)

  1. Augie Fackler repo owner

    Hi, the fix looks good, but it could use a test case. Any chance I could persuade you to make one and submit a patch generated via hg export?

  2. Andrea Francia reporter

    I'm a newbie in hg, I don't know how to use "hg export".

    About the test case, do you mean an automated test. I didn't found a test for the RevisionData.findmissing nor any other methods of RevisionData so it's difficult prepare an automated test that prepare all the stuff needed by findmissing.

    I saw there is a lot of integration tests but I didn't managed to create a such test.

  3. Andrea Francia reporter

    I forgot a question mark in the second sentence "About the test case, do you mean an automated test [?]"

  4. Augie Fackler repo owner

    If you can just make a shell script that produces an svn repo that demonstrates the need for the fix, that'll be enough that I can integrate the test the rest of the way. There are examples in tests/fixtures/*.sh.

    As far as hg export, you just commit as normal to hg, and then do hg export on the changes you want to contribute - then I can hg import them and the contributor information and log message are already done for me.

  5. Andrea Francia reporter

    The error occours only when accessing through http. The error is reproducible typing:

    $ hg clone http://trash-cli.googlecode.com/svn/ trash-cli
    [r1] andreafrancia: First import.
    [r2] andreafrancia: Added python.
    [r3] andreafrancia: *** empty log message ***
    [r4] andreafrancia: *** empty log message ***
    [r5] andreafrancia: removed old version based on bash and c code.
    abort: ("URL 'http://trash-cli.googlecode.com/svn/old trunk' is malformed or the scheme or host or path is missing", 170000)
    

    The problem is with something in the revision 5, probably something about whitespaces.

    I made various attempt to reproduce the error condition with a bash fixture script but without success. I think that error will not occours with the file: scheme In fact I also tried to reproduce exactly (with svnsync) the same set of changes from r1 to r5 from trash-cli and accessing via file: to it but in this case it works!

    $ svnsync init --username svnsync file://`pwd`/dest http://trash-cli.googlecode.com/svn/
    Copied properties for revision 0 (svn:sync-* properties skipped).
    $ svnsync sync file://`pwd`/dest # hit CTRL+C after revision 6
    Transmitting file data .................................................
    Committed revision 1.
    Copied properties for revision 1.
    Transmitting file data ......................
    Committed revision 2.
    Copied properties for revision 2.
    Transmitting file data .....
    Committed revision 3.
    Copied properties for revision 3.
    Transmitting file data ...
    Committed revision 4.
    Copied properties for revision 4.
    Transmitting file data ..
    Committed revision 5.
    Copied properties for revision 5.
    Transmitting file data ...
    Committed revision 6.
    Copied properties for revision 6.
    ^Csvnsync: Caught signal
    
    $ svn pdel --revprop -r 0 svn:sync-lock file://`pwd`/dest #Removing lock due the CTRL+C
    property 'svn:sync-lock' deleted from repository revision 0
    
    $ hg clone http://trash-cli.googlecode.com/svn/ trash-cli
    [r1] andreafrancia: First import.
    [r2] andreafrancia: Added python.
    [r3] andreafrancia: *** empty log message ***
    [r4] andreafrancia: *** empty log message ***
    [r5] andreafrancia: removed old version based on bash and c code.
    abort: ("URL 'http://trash-cli.googlecode.com/svn/old trunk' is malformed or the scheme or host or path is missing", 170000)
    
    

    So, for me, is pretty difficult create a bash script that will reproduce the bug. It seems to me that the need for fix is already demostrated by steps reported in this bug entry which are reproducible.

  6. Chema Cortes

    I'm suspecting the problem come from "libsvn", more precisely from the "neon" implementation of webdav protocol (function "ne_uri_parse"; file "ne_uri.c"). It doesn't consider the space as a valid char for paths.

    As alternative could be to use the other implementation for webdav in libsvn, "serf", but it's not recommended.

    At the moment, I'm working into a workaround by a patch of file "svn_swig_wrapper.py". It works with the trash-cli repo also.

    It's not a final patch.

  7. Augie Fackler repo owner

    Thank you so much for doing all the legwork on this - terrible that it's an svn bug.

  8. Andrea Francia reporter

    Thanks to you both for being interested in this bug that I opened.

    About closing the bug

    I would suggest to you to consider to insert a work around even if the problem is outside the hgsubversion. The new version of subversion isn't yet released and even when it will be many users could not be able to access it.

    While the Windows binaries are early availables many Linux distro are a bit slow to updating some software packages. For example the subversion package of Ubuntu is stuck to 1.5. It may take longer before the 1.7 will be packaged in the stable repo.

    So I kindly ask to reopen the bug.

    About the solution

    I think that the modification from the patch of Chema Cortes may be in a more right method that the my modification but I suggest you to use the urllib as I done instead of replace space with '%20'. urllib will take care of quoting all the needed characters.

    I uploaded some non correct patches. Please consider only the last (the second with the name url-quoting.patch). This is the working one.

  9. Augie Fackler repo owner
    • changed status to open

    FYI, marking this as resolved almost had me never notice the patch - better would have been to mark it as "open" so it's in the list I use as a queue. Looking at the patch now.

  10. Augie Fackler repo owner

    The patch looks pretty good, but I don't have any contributor information for you. Can you let me know what name and email address I should use for you?

    Also, instead of just referencing issue 104, could you link to the issue thus? http://bitbucket.org/durin42/hgsubversion/issue/104 I just want to hedge against the day that this might ship with Mercurial.

    Thanks!

  11. Augie Fackler repo owner

    One more thought: I know it makes the comment huge, but would you also link to the Subversion issue that we're working around?

    I'll have to ponder this some more as well - I'm not sure how I feel about patching around a single ra layer's bugs. Feels like the right thing though.

  12. Former user Account Deleted

    Augie, I agree completely with you. I repost the patch, but change the comments youself if you see necesary.

    As I see, this patch is basicaly harmless, include if subversion fix the bug. Without this patch, mercurial become fewer useful at my work.

    PD: i sent you my email in privacy.

  13. Chema Cortes

    Sorry, the last comment is mine. The systemraised an error and changed the author to anonymous.

  14. Augie Fackler repo owner

    Sorry for being slow on this - I'm hoping to get this figured out soon.

    The one question I have is this: Are you *sure* passing an already-quoted URL to the RA layer will be harmless if (say) ra_neon were fixed? There's something that just leaves me with a bad feeling about this.

  15. Augie Fackler repo owner

    I looked at this again - the ticket you've linked to is a similar issue, but not the same one. That fix landed in 1.5.3 and is present in all versions of 1.6. Thus, if there is a bug in Subversion, we need to find and report it. I suspect at this point that I'm using the API wrong, but I'm not sure. I'll have to research more this evening.

  16. Augie Fackler repo owner

    (just checked - this still happens in svn 1.6.5, with both serf and neon. It's almost certainly a misuse of the API, rather than a bug in svn).

  17. Former user Account Deleted

    Sorry, Augie, I was disconected some weeks for holidays.

    You are right, the ticket I've linked is fixed now (but has a target milestone of 1.7). The fail really come from not consider the space as a valid char for http paths, and this is common for neon and serf. I don't know if the API details this exception, but I'm quite sure that quoting the URL must always works.

    I'm going to check libsvn in detail.

  18. Log in to comment