Long operations block UI (Qt)

Issue #532 open
Marcin Wisnicki
created an issue

Clone a big repository such as http://hg.netbeans.org/main/ and try to use the THG workbench. Most operations (e.g. filtering) block the whole UI, and the program appears as not responding.

Comments (14)

  1. Steve Borho
    • changed status to open

    Most operations?

    The only operation I expect to be slow is building a graph thousands of revisions from the tip.

    Filtering is threaded but showing the results is only partially so.

  2. Marcin Wisnicki reporter

    Well, basically any operation that involves a mouse click blocks the whole UI for 10 s to 2 min. For example, clicking on a changeset blocks for 18 s, the application is marked as not responding, and Windows soon asks if it should be killed :(

    It's bad that operations take so long, but even worse that they block the UI. So far I have not found a single action that does NOT block the UI.

    BTW I have a 2x2.1 GHz Athlon64, 4 GB RAM and Win7/x64. Should I expect THG to be completely unusable on a repository the size of http://hg.netbeans.org/releases/ (working dir is 3 GB with 240k files, .hg is 1.75 GB with 135k files)?

  3. Steve Borho

    I started the clone then went home. Luckily it did finish.

    What I see are 2 seconds between clicking on a revision and seeing the contents in the revdetails tab. If I open the manifest browser.. wow that's big.. it now takes 3 seconds to switch revisions.

    Running hg cat -r [REV] file takes about 1.5 seconds on the command line, and so does 'hg manifest -r [REV] > nul' so things are not too out of line.

    Just curious, have you run hgtk on this repo? I expect it behaves about the same.

  4. Marcin Wisnicki reporter

    You mean the old TortoiseHG? It was more or less the same.

    After a restart, switching revisions takes "just" 3 s, but with the manifest enabled it grows to 15 s, which seems excessive: hg cat takes 2.5 s, and THG should be faster since it does not have to spawn Python every time.

    If I then switch back to revision details it still takes 15s to switch revision. I guess it is a bug.

    I don't know much about the structure of Mercurial, but I would expect switching to a nearby revision to be fast. In TortoiseGit/Gitk everything happens instantly regardless of repository size because it caches metadata. Does THG use any kind of cache?

    To recap: there are actually many separate issues here:

    1. [Bug] Mercurial operations block the UI thread (i.e. what this bug is really about). It seems Mercurial runs on the same thread as Qt (according to Process Explorer).
    2. [Bug] Switching off the manifest view does not bring back the "faster" switching.
    3. [Request] The UI could be updated incrementally: hg log shows commit information instantly, so at least this could be displayed instantly and the remaining information updated later.
    4. [Question] Caching: is there any? Can it be improved?

    Should I file separate issues for 2, 3 and possibly 4?

  5. Steve Borho

    1. The graph browsing code calls directly into Mercurial APIs to query changeset metadata. We don't run any Mercurial commands during normal browsing.

    2. Switching away from the manifest tab does not turn it off in any way. It's created on demand, but once opened, every revision click refreshes the manifest tab. This is something I expect will be improved on the default branch in the next few days.

    3. I have no idea how you're differentiating between commit information and remaining information. In general the history graph is demand loaded as you scroll. It only loads about 100 revisions initially and I believe you can drop that further through configuration.

    4. All of the data we query from Mercurial APIs is cached, and Mercurial does some caching itself internally (all in-process).
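
    The demand loading described in point 3 can be illustrated with a minimal sketch. This is not the THG code: the class, the fetch_batch callable and fake_fetch are hypothetical stand-ins, and the batch size of 100 simply mirrors the figure mentioned above.

```python
class DemandLoadedLog:
    """Loads revisions in fixed-size batches as the view asks for more rows."""

    def __init__(self, fetch_batch, batch_size=100):
        # fetch_batch is a hypothetical callable: (start, count) -> list of rows
        self._fetch_batch = fetch_batch
        self._batch_size = batch_size
        self._rows = []
        self._exhausted = False

    def row(self, index):
        # Fetch further batches only when the requested row is not loaded yet.
        while index >= len(self._rows) and not self._exhausted:
            batch = self._fetch_batch(len(self._rows), self._batch_size)
            if len(batch) < self._batch_size:
                self._exhausted = True  # a short batch means no more history
            self._rows.extend(batch)
        return self._rows[index]

    def loaded_count(self):
        return len(self._rows)


def fake_fetch(start, count, total=250):
    # Stand-in for a repository query; returns revision numbers only.
    return list(range(start, min(start + count, total)))


log = DemandLoadedLog(fake_fetch)
log.row(0)    # loads the first batch only
log.row(150)  # scrolling further pulls in a second batch
```

    Scrolling to a row triggers only as many batch fetches as are needed to reach it, so the initial display never pays for the full history.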

    Fixing #2 would help a lot. My guess is that the GUI is also fetching some piece of data that is especially expensive on the netbeans repo and is falling through the cache schemes. It needs detailed profiling to figure out what it is.

    What graph columns do you have visible? How many tags does the repo have? Branches? Heads?

    I was able to make browsing cpython about 30x faster by fixing the branch head cache. Perhaps there is more low hanging fruit here.

  6. Marcin Wisnicki reporter

    Graph columns: [Graph, Rev, Branch, Description, Author, Age, Tags]. Branches: 72, Tags: 949. But graph performance is sufficient.

    Re. 1: Proper GUI applications should never run business logic on the UI thread. It blocks the toolkit, which prevents screen updates and window movement, and the system will prompt to kill the application.

    Re. 3: By commit info I mean what hg log shows: [changeset, parents, user, date, summary]. The remaining info is what the THG Workbench shows in addition: diffs, manifest etc.

    It takes just a second to run hg log -l 1000 > nul, so the UI should be able to show such data instantly and fetch the remaining info, like the manifest or diff, later. This also applies to the revision list: when hg log does more than 1000 revs/s, so should the UI. If the bottleneck is in the Graph or Tags column, those columns should be updated asynchronously.
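
    A rough sketch of that two-phase update, assuming nothing about THG internals: show_revision and all of its callable parameters are hypothetical names, and defer stands in for whatever the toolkit offers to run a task once the event loop is idle.

```python
def show_revision(rev, get_summary, get_details, render, defer):
    """Render cheap metadata now; schedule the expensive part for later."""
    render("summary", get_summary(rev))
    defer(lambda: render("details", get_details(rev)))


# Tiny demonstration with stand-in callables.
events = []
pending = []

show_revision(
    42,
    get_summary=lambda rev: "rev %d: summary" % rev,
    get_details=lambda rev: "rev %d: diff and manifest" % rev,
    render=lambda kind, text: events.append((kind, text)),
    defer=pending.append,
)
first_paint = list(events)  # only the summary has been rendered so far
for task in pending:        # an event loop would run these when idle
    task()
```

    The summary reaches the screen before any expensive work starts; the diff and manifest arrive in a later pass.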

    Re. 4: Then the caching is clearly ineffective. With proper caching I would expect an instant (<100 ms) UI response in the common case (cache hit), just like in git tools. Also, the cache should be persisted to disk (like in git tools ;).

    If you agree I will split 3 and 4 (and 2?) to separate issues.

  7. Steve Borho

    I did a bit more digging and the cost of switching revisions (ignoring the manifest tab) is primarily the cost of generating the file list and showing the first file. The graph operations are already reasonably fast.

    The file list costs 2 seconds:

    hg status --rev 192047:192048

    Then showing you the first file costs about 1 second:

    hg cat --rev 192047 'crazy long file path'

    Which is about what you would expect because 'hg status --rev A:B' is mostly calculating the diff between the manifests of the two revisions. So if cat is 1 second, status will be 2 seconds.

    So we could make the UI more responsive by doing those two operations in a timer callback or thread.
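
    The thread variant might look roughly like the sketch below. It uses only the standard library rather than Qt, and run_in_background and slow_file_list are hypothetical names; in a real GUI the result queue would be drained from a timer or replaced by a signal. Note that with CPython's global interpreter lock, CPU-bound Python code in the worker still competes with the UI thread, so this helps most when the cost is I/O.

```python
import queue
import threading


def run_in_background(work, results):
    """Run work() on a worker thread and post its outcome to a queue."""
    def target():
        try:
            results.put(("ok", work()))
        except Exception as exc:  # report failures instead of losing them
            results.put(("error", exc))

    thread = threading.Thread(target=target, daemon=True)
    thread.start()
    return thread


def slow_file_list():
    # Stand-in for the expensive status call between two revisions.
    return ["a.txt", "b.txt"]


results = queue.Queue()
worker = run_in_background(slow_file_list, results)
# The calling thread is free here; a GUI would poll `results` from a timer.
status, file_list = results.get(timeout=5)
worker.join()
```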

    Number 4 is a non-starter. I'm not going to implement a disk cache for data that's already laid out perfectly well inside the repository.

  8. jtn

    "I'm not going to implement a disk cache for data that's already laid out perfectly well inside the repository."

    Maybe it is laid out perfectly well for Mercurial itself, but definitely not for THG. Otherwise I wouldn't have to wait 3 seconds or so after each click. Please do not throw away the idea of implementing a data cache; caching is an obvious way to speed up apps like this.

    I've created a simple prototype with only two changes:

    1. Cache status results in changesToParent()
    2. Avoid displaying diffs on revision change (user must click on file path to get the diff)

    Once the data is cached, I can browse the netbeans repo with delays below 50 ms instead of 1 to over 3 seconds... Isn't it worth the effort?
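
    To illustrate the first change, here is a minimal in-memory sketch of memoizing status results. It is not the prototype itself: StatusCache, compute_status and fake_status are hypothetical names, and the key uses revision numbers only for brevity (a real cache would probably key on changeset hashes, since local revision numbers can change).

```python
class StatusCache:
    """Memoizes status results per (parent, revision) pair."""

    def __init__(self, compute_status):
        # compute_status is a hypothetical slow callable: (parent, rev) -> dict
        self._compute_status = compute_status
        self._cache = {}
        self.misses = 0

    def changes_to_parent(self, parent, rev):
        key = (parent, rev)
        if key not in self._cache:
            self.misses += 1
            self._cache[key] = self._compute_status(parent, rev)
        return self._cache[key]


def fake_status(parent, rev):
    # Stand-in for the expensive manifest comparison.
    return {"modified": ["file-%d.txt" % rev], "added": [], "removed": []}


cache = StatusCache(fake_status)
first = cache.changes_to_parent(192047, 192048)
second = cache.changes_to_parent(192047, 192048)  # served from the dict
```

    Only the first visit to a revision pays for the status computation; revisiting it is a dictionary lookup.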

    Without fixing these performance problems thg is hardly usable with any large repository :(

  9. Anonymous

    @jtn: Great job with the optimizations! Please add your improvements as a patch to official thg. Do you require more testing?
