exception "No free or removed slots available. Key set full?!!"

Issue #67 invalid
junmeng2000 created an issue

We are using version 3.0.3 and keep running into the exception "No free or removed slots available. Key set full?!!" in one of our applications. In one case, we were able to break at the point where the exception occurred. The first image I uploaded shows the internal state when the exception is thrown. The second image shows the stack trace.

Attachments: trove stack.PNG, trove state.PNG

I have the following questions.

1) We created the hash set using the default constructor. It should have a capacity of 23. Why did it become 3?
2) The set shows that all three slots were occupied while the load factor is 0.5. Why wasn't resizing triggered?
3) Again, all three slots were occupied, but _size=1 and _free=2. Why?

A set in this state is certain to throw the exception mentioned in the subject line. I traced through the code: it probes all the slots and finds that every one of them has state=1.

Also, this hash set is contained in a thread-local object. That should rule out corruption caused by concurrent access.

Please help look into the issue. If you need any additional information, please let me know and I'll try to gather it for you.

Comments (7)

  1. junmeng2000 reporter

    I'm not able to come up with a standalone case now. I ran two fairly complex applications and am able to reproduce the problem every single time. But since there are hundreds of thousands of messages passed between the applications, it's hard to trace the whole add/remove sequences that made the hash set arrive at the wrong state. I'll talk to developers who own these two apps to see if they can add additional debug information to capture the add/remove sequence. What would you recommend to capture?
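One way to capture the add/remove sequence mentioned above is a thin logging wrapper around the set. The sketch below is only an illustration under assumptions: `LoggingSet` is a hypothetical name, it wraps a plain `java.util.Set` rather than a Trove set, and it records the calling thread's name with each operation so that any access from an unexpected thread shows up in the log.

```java
import java.util.AbstractSet;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Queue;
import java.util.Set;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical diagnostic wrapper: records every add/remove with a sequence
// number and the name of the thread that performed it.
public class LoggingSet<E> extends AbstractSet<E> {
    private final Set<E> delegate;
    private final Queue<String> events = new ConcurrentLinkedQueue<>();
    private final AtomicLong seq = new AtomicLong();

    public LoggingSet(Set<E> delegate) {
        this.delegate = delegate;
    }

    // The sequence number and the queue insert are not one atomic step, so
    // under contention sort by sequence number when reading the log.
    private void record(String op, Object arg, boolean result) {
        events.add(seq.incrementAndGet() + " " + Thread.currentThread().getName()
                + " " + op + "(" + arg + ") -> " + result);
    }

    @Override public boolean add(E e) {
        boolean changed = delegate.add(e);
        record("add", e, changed);
        return changed;
    }

    @Override public boolean remove(Object o) {
        boolean changed = delegate.remove(o);
        record("remove", o, changed);
        return changed;
    }

    @Override public Iterator<E> iterator() { return delegate.iterator(); }

    @Override public int size() { return delegate.size(); }

    // Snapshot of the recorded events, in insertion order.
    public List<String> events() { return new ArrayList<>(events); }

    public static void main(String[] args) {
        LoggingSet<Integer> set = new LoggingSet<>(new HashSet<>());
        set.add(7);
        set.add(7);
        set.remove(7);
        set.events().forEach(System.out::println);
    }
}
```

If more than one thread name appears in the log for the same set instance, the set is being shared, whatever the surrounding code suggests.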

  2. Rob Eden

    Ensure that you're not accessing the collections concurrently. This causes 99% (made up stat, but you get the point) of similar bug reports.

    The order of events would be the main thing. All of your questions are good and off the top of my head I would agree that they're all issues that need to be investigated. My knee-jerk answer for how it got to that state would be concurrency... but I could be wrong.

    If you can consistently reproduce, try wrapping in TCollections.synchronizedMap and see if the problem goes away. If so, that would really point to a concurrent access issue.
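The diagnostic suggested above can be sketched with JDK classes alone. This is an assumption-laden illustration, not the reporter's code: it uses `java.util.Collections.synchronizedSet` around a `java.util.HashSet` to stay self-contained, where the thread's suggestion is Trove's own `TCollections` wrappers, and `SyncWrapperCheck` and `fill` are hypothetical names.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

public class SyncWrapperCheck {
    // Adds perThread distinct values from each of `threads` threads and
    // returns the final size of the set.
    static int fill(Set<Integer> set, int threads, int perThread) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int base = t * perThread;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    set.add(base + i);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return set.size();
    }

    public static void main(String[] args) {
        // Every call on the wrapper locks the same monitor, so concurrent
        // adds cannot interleave inside the backing hash table.
        Set<Integer> guarded = Collections.synchronizedSet(new HashSet<>());
        System.out.println("synchronized size = " + fill(guarded, 4, 10_000));
        // prints "synchronized size = 40000"
    }
}
```

If the failure disappears once the shared collection is wrapped this way, that points to unsynchronized access from more than one thread rather than a defect in the hash table itself.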

  3. junmeng2000 reporter

    Good idea to wrap it in synchronizedMap to see if the problem goes away.

    Regarding concurrent access, that's what I suspected in the beginning. But the hash set is contained in a thread-local object, as I said in the first email. That would eliminate the concurrency issue, right? I will take a closer look again anyway.

  4. Rob Eden

    But the hash set is contained in a thread local object as I said in the first email. That would eliminate the concurrency issue, right?

    Dunno. Depends on how it's being used once retrieved or after being put in the slot. It can be easy to not notice a leak to another thread.
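A leak of this kind can be made visible with an owner check. The sketch below is a hypothetical illustration (`OwnerCheckedSet` and `detectsLeak` are invented names, and it wraps a JDK `HashSet`, not a Trove set): the set remembers the thread that created it and fails fast when any other thread touches it, even if the reference was originally obtained from a `ThreadLocal`.

```java
import java.util.AbstractSet;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical guard: a set that only its creating thread may use.
public class OwnerCheckedSet<E> extends AbstractSet<E> {
    private final Set<E> delegate = new HashSet<>();
    private final Thread owner = Thread.currentThread();

    private void check() {
        Thread current = Thread.currentThread();
        if (current != owner) {
            throw new IllegalStateException("set owned by " + owner.getName()
                    + " was touched by " + current.getName());
        }
    }

    @Override public boolean add(E e) { check(); return delegate.add(e); }
    @Override public boolean remove(Object o) { check(); return delegate.remove(o); }
    @Override public boolean contains(Object o) { check(); return delegate.contains(o); }
    @Override public Iterator<E> iterator() { check(); return delegate.iterator(); }
    @Override public int size() { check(); return delegate.size(); }

    // Fetches a set from a ThreadLocal, then hands the reference to a second
    // thread. Returns true if the second thread's access was rejected.
    static boolean detectsLeak() {
        ThreadLocal<Set<String>> local =
                ThreadLocal.withInitial(() -> new OwnerCheckedSet<String>());
        Set<String> mine = local.get();   // created and owned by this thread
        mine.add("from-owner");

        AtomicBoolean rejected = new AtomicBoolean(false);
        Thread other = new Thread(() -> {
            try {
                mine.add("from-other");   // the reference has escaped its thread
            } catch (IllegalStateException e) {
                rejected.set(true);
            }
        });
        other.start();
        try {
            other.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return rejected.get();
    }

    public static void main(String[] args) {
        System.out.println("leak detected: " + detectsLeak());
    }
}
```

The `ThreadLocal` only guarantees that each thread's `get()` returns its own instance. It does nothing to stop a retrieved reference from being passed to a callback, queue, or field that another thread reads.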

  5. junmeng2000 reporter

    OK. It turns out my initial understanding of the other application was incorrect. The set is not thread local, so it's quite obvious that the issue is caused by concurrent access. You can close this issue. Thanks for your quick responses.
