Various connection errors when cloning big Mercurial repo

Issue #13837 on hold
Zoltán Lehóczky
created an issue

When cloning a "big" (~3600 commits, 347.7 MB as indicated here) Mercurial repo, I get various connection-related errors; see the attached files.

My internet connection is otherwise steady and fast (120/10Mb/s) and I have no issues with anything else (including interacting with already cloned BB repos).

A few days ago I experienced similar errors on other "big" repos as well (bigger in file size at ~500 MB but only tens of commits at most), over another (even faster) internet connection. This was my first time seeing this issue. At first I thought the issue was that I was cloning to slow storage (an SD card), but when I retried with fast storage (various SSDs), the same issues arose multiple times.

The affected repos can be successfully cloned, but in my limited tests clone fails somehow in half, if not majority of the times.

Comments (29)

  1. Zoltán Lehóczky reporter

    That's about cloning a git repo.

    BTW, in the meantime it seems that the problem was caused by the clone bundles config that was recently introduced on Bitbucket and turned on by default. This caused these issues for several of our team members (basically everyone who needed to freshly clone non-trivial repos) and our build server.

  2. Sean Farley

    Since you and Benedek work together, this only furthers my suspicion that there is something network related between your location and our media-api servers. I don't know what else to do besides try and debug that.

  3. Zoltán Lehóczky reporter

    That may be, although the exact same issue happens from two separate, otherwise stable and fast locations in Hungary as well as from inside an Azure datacenter on the east coast. So if it's a network error, then it affects at least two very different locations and connections.

  4. Abhin Chhabra staff

    Hi @Zoltán Lehóczky,

    Before we engage the people who maintain our media services, I'd like to confirm that it is indeed related to downloading the clone bundles. Would you mind pasting the output from running hg clone in a verbose mode from the command line so we could see the full order of events?

    Try using hg clone -e "ssh -v" -v --debug ssh://<repo-owner>/<repo-name>. The output will be very verbose, but it will help us figure out what's going on.

    Thank you very much.

  5. Abhin Chhabra staff

    Did your error stop manifesting when you disabled clone bundles? If so, I'm happy there is a workaround for the issue, but I would appreciate it if you could re-enable it for this test and disable it again later.

  6. Zoltán Lehóczky reporter

    I get this:

    running ssh -v "hg -R owner/repo serve --stdio"
    sending hello command
    sending between command
    remote: 'ssh' is not recognized as an internal or external command,
    remote: operable program or batch file.
    abort: no suitable response from remote hg!

    I'd add, in case it makes a difference, that the error happened via HTTPS (I don't use SSH at all).

  7. Abhin Chhabra staff

    Hah. Well that's not what I expected at all! Would you mind running hg clone -v --debug ssh://<repo-owner>/<repo-name> instead?

    I do want to mention that I appreciate your patience while we figure this out. Thank you.

  8. Zoltán Lehóczky reporter

    Now I get the below output and error. Is there something I need to configure locally or remotely to use SSH? I supposedly have everything installed on my machine for that, but keep in mind I never interact with repos via SSH normally.

    running "C:\Program Files\TortoiseHg\lib\TortoisePlink.exe" -ssh -2 "hg -R owner/repo serve --stdio"
    sending hello command
    sending between command
    abort: no suitable response from remote hg!

    PuTTY SSH error.png

  9. Zoltán Lehóczky reporter

    That's not an issue: I can clone via HTTPS fine even by using the command line, and since TortoiseHg also uses mercurial.exe in the background, I get the same error if I try to clone via SSH from inside TortoiseHg.

    However, now I see that I'd actually need some more setup to be able to use SSH, and I wouldn't want to go through all of this supposedly one-hour process for one test. Can't I diagnose something via HTTPS?

  10. Abhin Chhabra staff

    I think I was under the false impression that when you had the problem originally, you were using SSH. I was wrong. I'd be grateful if you could run the test with just HTTPS (assuming that is indeed what you used when you got your original issues) with the --debug option for hg turned on (as shown in the link I previously sent to you).
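
A minimal sketch of such an HTTPS test run, assuming a POSIX shell. The helper name `clone_debug` and the log file naming are hypothetical and not from this thread; `-v`, `--debug`, and `--config ui.clonebundles=...` are standard Mercurial options. The clone bundles setting is passed explicitly so the same command can be run with the feature on and off, and the full output is saved to a file for sharing.

```shell
# Hypothetical helper (not from this thread): clone over HTTPS with verbose
# debug output, keeping a copy of everything hg prints in a log file.
#   $1 = repository URL
#   $2 = destination directory
#   $3 = "true" or "false" for ui.clonebundles (defaults to "true")
clone_debug() {
  url="$1"
  dest="$2"
  bundles="${3:-true}"
  # 2>&1 merges stderr into the log so error messages are captured too.
  hg clone -v --debug --config "ui.clonebundles=${bundles}" "$url" "$dest" 2>&1 \
    | tee "${dest}.clone.log"
}
```

For example, `clone_debug "$REPO_URL" with-bundles true` followed by `clone_debug "$REPO_URL" without-bundles false` would give two logs to compare, where `REPO_URL` stands for the repository's HTTPS clone URL. Note that the function's exit status is that of `tee`, not of `hg`, so success or failure has to be read from the log.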

  11. Zoltán Lehóczky reporter

    Yes, all affected parties with this error used HTTPS in my company.

    I've turned clone bundles on temporarily and have run the clone of two repos that frequently failed with these errors. Please provide an e-mail address (or get in touch with me via my BB e-mail address) where I can share the output with you as I don't want to publish it. BTW one of the clones failed, the other succeeded (but keep in mind what I've written before: "in my limited tests clone fails somehow in half, if not majority of the times", so this doesn't mean the issue got better).
