LookupError: 00changelog.i@66f0739d7517: no node

Issue #176 resolved
Marc Sanfaçon created an issue

I have this error that pops up once in a while in RhodeCode:

{{{
Module weberror.errormiddleware:162 in __call__
>>  app_iter = self.application(environ, sr_checker)
Module rhodecode.lib.middleware.simplegit:103 in __call__
>>  return self.application(environ, start_response)
Module rhodecode.lib.middleware.simplehg:169 in __call__
>>  return app(environ, start_response)
Module mercurial.hgweb.request:146 in run_wsgi
>>  return application(env, respond)
Module mercurial.hgweb.hgweb_mod:86 in __call__
>>  return self.run_wsgi(req)
Module mercurial.hgweb.hgweb_mod:122 in run_wsgi
>>  return protocol.call(self.repo, req, cmd)
Module mercurial.hgweb.protocol:57 in call
>>  rsp = wireproto.dispatch(repo, p, cmd)
Module mercurial.wireproto:153 in dispatch
>>  return func(repo, proto, *args)
Module mercurial.wireproto:174 in branches
>>  for b in repo.branches(nodes):
Module mercurial.localrepo:1294 in branches
>>  p = self.changelog.parents(n)
Module mercurial.revlog:317 in parents
>>  d = i[self.rev(node)]
Module mercurial.revlog:309 in rev
>>  raise LookupError(node, self.indexfile, _('no node'))
}}}

Comments (24)

  1. Marc Sanfaçon reporter

    There is a revision 66f0739d7517 in one of the repos, but I can't find the repo name in the error message, so I'm not sure which one the error is from. I'm also not sure when this error was thrown.

  2. Marc Sanfaçon reporter

    I get this error several times a day. Most of the time, it happens during a push from the admin. Here's our setup:

    1 Main repo
    4 team repos that get all the Main repo's changesets pushed to them every 5 minutes, to make sure they are up to date. These repos contain changes not yet merged into the Main repo.

    When I get this error, the user sees:

    pushing to https://admin:***@mercurial/Application
    searching for changes
    abort: HTTP Error 500: Internal Server Error

    And then I get this error:

    URL: https://mercurial/?nodes=6f7ece15c9de569009ba622b28ed00a439ac64f1+642ef4d8e6c0085ccd020c589011a2d880df5ec0&cmd=branches
    Module weberror.errormiddleware:162 in __call__
    >>  app_iter = self.application(environ, sr_checker)
    Module rhodecode.lib.middleware.simplegit:103 in __call__
    >>  return self.application(environ, start_response)
    Module rhodecode.lib.middleware.simplehg:169 in __call__
    >>  return app(environ, start_response)
    Module mercurial.hgweb.request:146 in run_wsgi
    >>  return application(env, respond)
    Module mercurial.hgweb.hgweb_mod:86 in __call__
    >>  return self.run_wsgi(req)
    Module mercurial.hgweb.hgweb_mod:122 in run_wsgi
    >>  return protocol.call(self.repo, req, cmd)
    Module mercurial.hgweb.protocol:57 in call
    >>  rsp = wireproto.dispatch(repo, p, cmd)
    Module mercurial.wireproto:153 in dispatch
    >>  return func(repo, proto, *args)
    Module mercurial.wireproto:174 in branches
    >>  for b in repo.branches(nodes):
    Module mercurial.localrepo:1294 in branches
    >>  p = self.changelog.parents(n)
    Module mercurial.revlog:317 in parents
    >>  d = i[self.rev(node)]
    Module mercurial.revlog:309 in rev
    >>  raise LookupError(node, self.indexfile, _('no node'))
    LookupError: 00changelog.i@6f7ece15c9de: no node
    

    It happened this morning at 07:11 and I saw that a user was doing a pull around the same time. So maybe it is a concurrency issue.

    Anything I can help with to pinpoint this issue?

    Thanks

  3. Marcin Kuzminski repo owner

    Which repository is producing this error, Main or one of the others? Does a changeset with id 6f7ece15c9de exist in that repo?

    Why does Main push into the team repos? Maybe the team repos should pull from Main instead?

  4. Marc Sanfaçon reporter

    It is pushing from Main to Application. The revision exists in Application but not in Main. Somehow, if I do a push right now, it works.

    I use a push instead of a pull in order to keep RhodeCode up to date. I used a pull before, and RhodeCode would not see the modifications until I did a manual refresh (or invalidated the stats, not sure which). Pushing also gets me the journaling in RhodeCode.

  5. Marc Sanfaçon reporter

    Any way I can help pinpoint this? I get this error several times (> 20) a day. It really looks like a concurrency issue. It just happened while I was pulling and another user was pushing.

    Thanks.

  6. Marcin Kuzminski repo owner

    Can you try changing use_threadpool = true to false in the ini file and check if that helps? I'll have to run some tests in order to pinpoint the issue.

  7. Marc Sanfaçon reporter

    I did that, restarted RhodeCode only (not Celery), and just got the error again.

    Looks like it might be something else. It still says 'Multithreaded' in the WSGI variables:

    WSGI Variables
    application	<rhodecode.lib.middleware.simplegit.SimpleGit object at 0x41701d0>
    paste.parsed_querystring	([('nodes', 'bc357713bea8771ba54aaf6c54d620a55be0d81b'), ('cmd', 'branches')], 'nodes=bc357713bea8771ba54aaf6c54d620a55be0d81b&cmd=branches')
    paste.registry	<paste.registry.Registry object at 0xd37e210>
    paste.throw_errors	True
    pylons.status_code_redirect	True
    wsgi process	'Multithreaded'
    
  8. Marcin Kuzminski repo owner

    I'm running some tests to see if it's really concurrency. I have 3 scripts running concurrently (two of them pulling, one pushing). I'll let you know about the results. In the meantime, can you get the client versions for the clients that are getting the error? I'm using >=1.8 for the tests.

  9. Marcin Kuzminski repo owner

    And I can confirm that over 350+ operations of concurrent pulling and pushing, I had no errors, using RhodeCode 1.2 and Mercurial 1.8.2 for both server and client. So I would assume it's not concurrency related.

    You said it happens during a push from the admin. How is that command run?

  10. Marcin Kuzminski repo owner

    Do you use TortoiseHG to pull? Maybe it's a Windows problem? Maybe someone is pushing with the `-f` flag?

  11. Marc Sanfaçon reporter

    We do use TortoiseHG to pull and push. I don't think people use the '-f' flag, but I can't be sure.

    I sometimes receive the error during evenings or weekends, so I can't be sure it's a concurrency issue.

    Is there a way to activate a 'full' debug that logs all operations on the server? That way we would be able to see what's going on the server at that time.

  12. Marcin Kuzminski repo owner

    Can you switch from TortoiseHG to the command-line hg client for one day?

    To enable full logging, change the log level of every logger in the .ini file to DEBUG.
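The log-level change suggested above could be scripted roughly as follows. This is a minimal Python 3 sketch, assuming the .ini file uses the standard `logging.config.fileConfig` layout with `[logger_*]` sections that each carry a `level = ...` line. The function name `set_log_levels` is made up for illustration and is not part of RhodeCode.

```python
def set_log_levels(ini_text, level="DEBUG"):
    """Return ini_text with the level of every [logger_*] section set to `level`.

    Works line by line so that comments and all other sections are left as-is.
    Assumes the fileConfig layout (an assumption about the .ini, not verified).
    """
    out = []
    section = ""
    for line in ini_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            # Remember which section we are in.
            section = stripped[1:-1]
        elif section.startswith("logger_"):
            key, sep, _ = stripped.partition("=")
            if sep and key.strip() == "level":
                line = "level = " + level
        out.append(line)
    return "\n".join(out) + "\n"
```

Only logger sections are touched; handler sections keep their own levels, so a handler configured above DEBUG would still filter messages.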

  13. Marc Sanfaçon reporter

    Not really possible to ask the 30+ developers to stop using tortoise. However, the automated systems use the command line version of HG (bundled in Tortoise).

    The error just happened again and at least 2 users were accessing the system at the same time. They were two automated systems: one was pushing and the other was pulling. The log is really huge now (66MB), so it is kind of hard to go through it. Anything I can look for to help debugging?
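One way to make a log of that size manageable is to keep only the lines surrounding each `LookupError`. The following is a minimal Python 3 sketch; the function name `context_around`, the window sizes, and the file name `rhodecode.log` in the usage comment are all assumptions for illustration.

```python
from collections import deque


def context_around(lines, needle="LookupError", before=20, after=5):
    """Yield each line containing `needle` plus some surrounding context."""
    recent = deque(maxlen=before)  # rolling window of lines not yet emitted
    remaining = 0                  # lines still to emit after the last match
    for line in lines:
        if needle in line:
            yield from recent
            recent.clear()
            yield line
            remaining = after
        elif remaining > 0:
            yield line
            remaining -= 1
        else:
            recent.append(line)


# Usage (hypothetical file name):
#   with open("rhodecode.log") as fh:
#       for line in context_around(fh):
#           print(line, end="")
```

The file is read as a stream, so the whole 66MB never has to be held in memory.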

  14. Marcin Kuzminski repo owner

    Well, one good way of checking would be to run the same test scenario as I did yesterday. Make two loops, one constantly pulling changes and the other pushing.

    Since pushing is not so trivial, you could use this: https://hg.rhodecode.org/rhodecode/files/11548cc19c8f4e8f379c2750de46075a17afe2a3/rhodecode/tests/test_hg_operations.py

    and do

    for i in xrange(1000):
        test_push_new_file(commits=5)
    

    at line 330. It just generates some random files, commits them and pushes. You can play with the user settings.

    The pulls can be as simple as this in bash: for i in {1..1000}; do hg pull; done

    If you have any trouble, catch me on IRC.
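The two loops described above can also be driven from a single script, so that every failure is recorded with the operation and iteration that produced it. This is a minimal Python 3 sketch under stated assumptions: `run_concurrently` and `hg_pull` are hypothetical helpers, not part of the linked test file, and the push side would still need something like `test_push_new_file` from that file.

```python
import subprocess
import threading


def hg_pull(repo_path):
    """Run `hg pull` in repo_path; raises CalledProcessError on failure."""
    subprocess.run(["hg", "-R", repo_path, "pull"], check=True)


def run_concurrently(operations, iterations=1000):
    """Run each named zero-argument callable `iterations` times, one thread each.

    Returns a list of (name, iteration, error) tuples for every failed call.
    """
    failures = []
    lock = threading.Lock()

    def loop(name, op):
        for i in range(iterations):
            try:
                op()
            except Exception as exc:
                with lock:
                    failures.append((name, i, repr(exc)))

    threads = [threading.Thread(target=loop, args=(name, op))
               for name, op in operations.items()]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return failures
```

A call might look like `run_concurrently({"pull": lambda: hg_pull("/path/to/clone")}, iterations=1000)`, where the path is a placeholder for a local clone.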

  15. Former user Account Deleted

    The error is still occurring and I did not find the time to do the test you asked.

    I'll see if I can try it and will keep you posted

  16. Marc Sanfaçon reporter

    I may have found something to dig into. Our RhodeCode server hosts several repositories, and whenever the error occurs, there were multiple requests on the server at once for different repos.

    What I think happens is that one of the requests asks for the list of nodes for one repo but gets the nodes from another repo. Would that be possible?
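That hypothesis can be checked against the failing URL from an error report: extract the node ids from the `nodes=` parameter and see which hosted repositories actually contain them. This is a minimal Python 3 sketch; the function names are made up for illustration, and the list of repository paths has to come from your own server.

```python
import subprocess
from urllib.parse import parse_qs, urlsplit


def nodes_from_url(url):
    """Return the node ids carried in the `nodes` query parameter."""
    query = parse_qs(urlsplit(url).query)
    # A '+' in a query string decodes to a space, so the ids are space-separated.
    return query.get("nodes", [""])[0].split()


def repo_has_node(repo_path, node):
    """True if `hg log -r NODE` succeeds, i.e. the repo knows the revision."""
    result = subprocess.run(
        ["hg", "-R", repo_path, "log", "-r", node, "--template", "{node}"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def repos_containing(node, repo_paths, has_node=repo_has_node):
    """Return the repositories in repo_paths that contain `node`."""
    return [path for path in repo_paths if has_node(path, node)]
```

If a node from the failing request exists only in a repository other than the one the request was addressed to, that would support the idea that requests are being served against the wrong repo.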

  17. Marc Sanfaçon reporter

    Really good news! I just deployed it and will monitor it tomorrow.

    There were 60 errors of this type today. So I should know quickly if it is fixed.

  18. Marc Sanfaçon reporter
    • changed status to open

    After 15 hours of the fix being deployed there were no errors at all!

    It looks like you found the problem and fixed it - good job!

    I'll let you know if there are other errors, but I'm confident we will close this bug tomorrow.

    Thanks!
