Long-running indexing on project open

Issue #1741 resolved
Scott Wells repo owner created an issue

I've had a user report EXTREMELY long-running indexing upon opening a project, though not all the time. Based on a provided profiling snapshot, I'm pretty sure I know what's going on. If my hunch proves out I'll describe both the problem and the fix in a later comment.

Comments (41)

  1. Scott Wells reporter

    Here's a build that tries to avoid consulting the CLI for org details when a project is opening/initializing/indexing. It will use the cached org details even if they appear to be stale until the project is in a state where it can consult the CLI again.

    You can install it by downloading the attached archive (don't extract it) and using Settings/Preferences>Plugins>Install plugin from disk (under the gear drop-down menu).

    Also please add the following to Help>Diagnostic Tools>Debug Log Settings as it will log when this type of behavior occurs:

    #com.illuminatedcloud.intellij.settings.application.IlluminatedCloudConnectionConfig
    

    If this problem continues, that should help provide useful diagnostic information, and if this seems to resolve the issue, I'd like to see the logs to verify that it's in fact avoiding the situation that results in long-running indexing.

    Thanks!

  2. Scott Wells reporter

    Thanks, Peter. So this is really strange. Basically there a bunch of threads waiting on one other thread, and that thread is just getting last modified timestamps of the files under ~/.sfdx to see if they're stale. I was going to ask how many files are under that directory, but honestly it shouldn't ever matter as even with thousands of files, I would think it could get the last modified timestamps in a matter of milliseconds...seconds at worst. Having said that, do you have an exorbitant number of files under ~/.sfdx? Let's explore that for a minute since what I'm seeing here is so (seemingly) anomalous.

  3. Scott Wells reporter

    Attaching a third build for the behavior I'm seeing in Peter's latest thread dump where everything seems to be blocking trying to get the last modified timestamp of a file under ~/.sfdx.

    Before installing this, please add the following to Help>Diagnostic Tools>Debug Log Settings in addition to the one mentioned previously:

    #com.illuminatedcloud.intellij.sfdx.SfdxUtil!trace
    

    Then install it and hopefully it will reproduce the behavior. The new idea.log should show on exactly which file it's hanging. Then we can determine whether that file should be removed...or what other action should be taken.

  4. Peter Churchill

    I only see 36 items - oldest modified is 8/21 and newest is this week. I do have two dev hubs enabled in case that is relevant?

  5. Scott Wells reporter

    Let's get this third build installed with the additional debug logging enabled and see what that tells us. Basically in your thread dump I see a number of blocked threads that look like:

    "JobScheduler FJ pool 6/7" prio=0 tid=0x0 nid=0x0 blocked
         java.lang.Thread.State: BLOCKED on com.illuminatedcloud.intellij.settings.project.IlluminatedCloudConnection@c342766 owned by "JobScheduler FJ pool 5/7" Id=90
    

    and then the referenced thread looks like:

    "JobScheduler FJ pool 5/7" prio=0 tid=0x0 nid=0x0 runnable
         java.lang.Thread.State: RUNNABLE
        at java.base@11.0.6/java.io.WinNTFileSystem.getLastModifiedTime(Native Method)
        at java.base@11.0.6/java.io.File.lastModified(File.java:957)
        at com.illuminatedcloud.intellij.sfdx.SfdxUtil.isSfdxConfigModifiedSince(SourceFile:1268)
    

    I'd like to know which file it's evaluating when that happens...or even whether it's the same file every time.

  6. Scott Wells reporter

    Thanks. And it's just hanging now, correct? And if you kill the process and get it back into the same state, does it hang on that exact same file again?

  7. Scott Wells reporter

    Hmmmm...that thread dump looks quite different. I mean, it's a bunch of threads blocked in the same spot waiting on one other thread that's in the method that gets last modified info for sfdx config files, but in this case it's writing to the debug log. Now I'm wondering if things aren't actually truly blocked, but rather it's just thrashing that area. Let me study your last log a bit more. In fact, do you mind attaching all idea.log* files from that same directory? My guess is that due to log rolling I'm not going to get a full picture from just the last log.

  8. Peter Churchill

    So there are 12 files, all identical the one I attached. But I will attach a new one now which is for project that does have scratch setup so you can see how it looks different

  9. Peter Churchill

    This is from a project where the scratch is in place, and seems to work fine. So maybe I just need to remove the reference in .sfdx to stale scratch orgs that don't exist anymore?

  10. Scott Wells reporter

    Yeah, that's how the logs roll over. idea.log is the latest and idea.log.12 is the oldest (if present). Let's do this. Set up debug logging as follows:

    #com.illuminatedcloud.intellij.settings.application.IlluminatedCloudConnectionConfig
    #com.illuminatedcloud.intellij.sfdx.SfdxUtil
    com.illuminatedcloud.intellij.sfdx.SfdxUtil!trace
    

    Stop the IDE. Delete all idea.log* files so we're starting fresh. Start the IDE until it freezes. Send me all of the idea.log* files from that run. At this point I'm mainly looking for how/when it decides it needs to try to load connection information from the CLI.

  11. Scott Wells reporter

    Regarding your last comment, you could certainly try running sfdx force:org:list --clean which would purge all stale scratch org entries from ~/.sfdx. Let's get the logs above first, then you can try that and see whether it helps while I digest the latest logs. Thanks for taking all the time to go back and forth with me on this!

  12. Scott Wells reporter

    I'm thinking more and more about the expired scratch org situation and am wondering if that's not just it. I could see how that might lead to a behavior like this. I'm going to set up a few scratch orgs now that will expire tomorrow so I can play with that idea a bit and see whether I can reproduce this behavior. If so, it'll help me to add some logic to IC to detect expired scratch orgs and keep things from going south. I'd still like to see the debug logs from when this problem is happening, though.

  13. Scott Wells reporter

    Peter, it doesn't look like you changed the debug logging setup. Can you please change it exactly as described above and get another set of logs? I'm most interested in what comes out of IlluminatedCloudConnectionConfig.

  14. Scott Wells reporter

    Thanks. And that one locked up or no? The log doesn't seem to indicate a lock-up but rather a clean shut-down.

    Have you run sfdx force:org:list --clean?

  15. Peter Churchill

    I did run the clean, and although not 100% locked, it is very unresponsive/not really usable. I’ll attach another thread dump

  16. Scott Wells reporter

    And it's just this one project where this happens? You said the others are fine, even others that use SFDX connections?

  17. Peter Churchill

    Yeah, the other project where the scratch has expired seems ok now, But that other project just seems to kill performance for any other open project. But honestly, if the answer is a current scratch org, I can live with that\work around that if the indexing works as I need it to, like it seems to now

  18. Scott Wells reporter

    To be clear, I'm going to fix these issues for you, both the indexing issue and whatever issue you're seeing now. Based on the latest thread dumps--which look very similar to the previous ones--it appears that it's just unable to lazily initialize/resolve the connection for this project, and it just keeps trying to do so expecting a different answer each time. And based on what I've implemented, that should be right. Either it can get a good connection or it will stub in a placeholder connection that ends up triggers a notification to the user that the project doesn't have a valid connection. But you're ending up in some other state here apparently. I just need to corner that.

    Do I understand correctly that if you give this project a valid/current scratch org, everything is fine? If so, please do that as a workaround while I continue to try to reproduce the problem. I have several scratch orgs that will expire tomorrow so I can try to reproduce/debug this hands-on.

  19. Peter Churchill

    Yes - as soon as I associated a scratch org, it worked just fine, so def seems to be something about the stale connections.
    So let me know if you see anything else I can help with, and thanks again for looking into this so thoroughly - it has been driving me a little crazy of late!

  20. Scott Wells reporter

    Okay, that's good information to have, and between that and the indexing changes I've made it seems like we're making progress.

    Hopefully when these scratch orgs expire for me tomorrow I'll see what you're seeing. I've tried to reproduce it by creating, association, and then deleting a scratch org, but that didn't do it for me.

    I'll let you know either way, and since you're able to reproduce this, if I can't I'll reach out to lean on you for more diagnostic info. Thanks so much again!

  21. Scott Wells reporter

    Well, my scratch orgs did expire overnight, and this morning I opened a project that used one of them. The good news is that IC did what it's designed to do and told me I was configured against a now-missing org; the bad news is...well...that IC did what it's designed to do and didn't reproduce the behavior you're seeing. I'll keep digging in, but assuming I'm still unable to reproduce this, I'll need to work with you again when you're next in this state to get more diagnostic info. At this point I know specifically where to look, though, so hopefully I can get more targeted information more quickly.

  22. Peter Churchill

    Ha - ‘tis always the way. Let me try deleting the scratch org linked to that troublesome project, and see if that causes the same issue. If not, I’ll fire up one with a one day expiration, and see what happens then

  23. Scott Wells reporter

    Sounds good. I won't get much of a chance to look at it until Monday because of an outage that will start shortly for me and continue through the weekend, but if you do find a way to reproduce it reliably, hold onto that and let's connect up next week when you have a few to try to get this thing cornered.

  24. Tom Fuda

    I’ve definitely had this happen to me if I open a project where my scratch org expired. You can also repro it by deleting the scratch org using the SFDX CLI and then trying to open a project that was connected to this scratch org. I was just about to file a new issue for this.

  25. Scott Wells reporter

    Thanks for chiming in, Tom. Unfortunately I haven't been able to reproduce the behavior using those steps. This week ended up being monopolized by completing a new set of profiling features that will be released on Monday, but after that this gets my full attention.

  26. Scott Wells reporter

    FYI, I'm still trying to figure this out this week, and unfortunately I'm still unable to reproduce the behavior. I was hoping that the following scenario would do it, but alas, IC behaved (frustratingly!) admirably and just told me that my connection was invalid:

    1. Create a scratch org with a namespace set to expire the next day.
    2. Create a project against that scratch org and populate it with enough Apex classes that there would be significant indexing.
    3. Generate an OST against that scratch org for the project to further bolster the indexing overhead.
    4. Wait a day for the scratch org to expire.
    5. Start the IDE with that project.

    At that point what I've seen in the logs you all have provided is that IC gets into a state where it's constantly trying to get the details for the now-missing org--primarily so that it will know the org's namespace--ultimately leading to slow indexing. What I saw is that IC successfully queried the org list via the CLI, saw that the org associated with the project's connection is no longer present, and told me that I need to set up a connection. Indexing completed in normal time.

    Is there anything else that you all are doing differently from this? In particular is there something else about your respective projects that I haven't covered? Needless to say I'd really like to get this nailed, but at this point I'm just poking and prodding from the sides based on what I've observed in the provided logs...which are invaluable, but I fear that approach may not lead to a proper resolution. Thanks!

  27. Scott Wells reporter

    Okay, the change that is included in the attached build seemed to resolve this for me without issue. Having said that, it does sound like it may still not be perfect based on feedback from those who have tried it. I think I'm just going to include it in the next build, keep this issue open for a while, and we'll see where that puts us and what else is needed.

  28. Scott Wells reporter

    The fix that had been included in custom builds attached here is now in 2.1.4.8+. I'm not resolving this yet as I'd like feedback from those affected as to how well it addresses the issue. I imagine there might be another part of the fix required, but if it seems to hold up well over a period of time, I'll go ahead and resolve this.

  29. Scott Wells reporter

    It's been a week. How is this going for everyone? Does last week's build seem to resolve the issue? I'm going to resolve this assuming (actually, hoping) that it does, but if it doesn't, please feel free to reopen and provide new logs.

  30. Log in to comment