django-autoslug field causes deadlocks OR does not work for unique fields

Issue #17 resolved
Former user created an issue

Hi,

first, thank you for a great and very useful project.

I have a project, where I use unique slug fields.

The problem is that when I import a lot of records using multiple processes (2 per CPU = 16 running processes importing data), I get exceptions about uniqueness violations on the slug field.

Why? Well, the answer is simple. Between checking the field for uniqueness in pre_save (and passing that check) and actually saving the record, another record with the same slug value can get saved, and we end up with a duplicate value in a unique field.
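To make the window concrete, here is a minimal sketch of the check-then-save race. It is not django-autoslug code: it assumes a plain in-memory SQLite table (the hypothetical `article` table) standing in for a model with a unique slug column, and it interleaves the two importers by hand instead of using real processes.

```python
import sqlite3

# Illustration only: an in-memory SQLite table standing in for a model
# with a unique slug column. The table name is hypothetical.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE article (slug TEXT UNIQUE)")


def slug_is_free(slug):
    """The pre_save-style check: SELECT first, write later."""
    row = db.execute("SELECT 1 FROM article WHERE slug = ?", (slug,)).fetchone()
    return row is None


# Both importers run the check before either of them has written anything,
# so both are told the slug is free.
a_ok = slug_is_free("hello")
b_ok = slug_is_free("hello")

db.execute("INSERT INTO article (slug) VALUES ('hello')")  # importer A saves
try:
    db.execute("INSERT INTO article (slug) VALUES ('hello')")  # importer B saves
    race_error = None
except sqlite3.IntegrityError as exc:
    race_error = exc  # the uniqueness violation described above
```

Both checks pass, yet the second insert still fails, which is the same pattern as the exceptions I see during the import.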

Running every process in a transaction is not an option either, as it causes deadlocks with PostgreSQL, but that's another story.

Do you have any ideas on how this could be fixed?

Wrapping obj.save in a transaction does not seem to help much.

What comes to my mind ATM is that I could drop the uniqueness constraint, check for duplicate (non-unique) slug values after importing all the data, update those slugs, and THEN re-enable uniqueness for the slug field, but that does not seem to be an optimal solution.
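A rough sketch of that workaround, again on a hypothetical in-memory SQLite table rather than a real Django model. The `-2`, `-3` suffix format is an assumption made for the example, and the loop reads all rows into memory, so treat it as an outline of the idea, not something tuned for a large import.

```python
import sqlite3

# Illustration only: the table is created WITHOUT a unique constraint,
# mimicking the import phase with uniqueness switched off.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE article (id INTEGER PRIMARY KEY, slug TEXT)")
db.executemany(
    "INSERT INTO article (slug) VALUES (?)",
    [("hello",), ("hello",), ("world",), ("hello",)],
)

# After the import: keep the first occurrence of each slug and give every
# later duplicate a numeric suffix (suffix format is an assumption).
seen = set()
rows = db.execute("SELECT id, slug FROM article ORDER BY id").fetchall()
for row_id, slug in rows:
    candidate, n = slug, 1
    while candidate in seen:
        n += 1
        candidate = f"{slug}-{n}"
    seen.add(candidate)
    if candidate != slug:
        db.execute("UPDATE article SET slug = ? WHERE id = ?", (candidate, row_id))

# Only now re-enable uniqueness; this would fail if duplicates remained.
db.execute("CREATE UNIQUE INDEX article_slug_uniq ON article (slug)")
final_slugs = [s for (s,) in db.execute("SELECT slug FROM article ORDER BY id")]
```

It works for a one-off import, but it has to run while nothing else writes to the table, which is part of why it does not feel like a real fix.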

Beware: this situation could also appear in a production setting. Unlikely, but it can happen.

Comments (7)

  1. Tomáš Rychlik

    I've tried to fix this by using Django 1.6+ select_for_update in pull request #10. It's not a complete fix, but it helps in some cases.

  2. Michał Pasternak

    Hi Tomas,

    just letting you know I'm still here.

    Funny thing is that I forgot about this bug and only recently got back to my code (the code that triggers it), planning to run it again on multiple CPUs. Currently it runs on a single CPU, as it caused deadlocks, as written in my problem report.

    Here's what I plan to do. I'll try to come up with a simple test application that shows this problem. Then, once exactly the same problem is reproduced, we'll be able to test that.

    Unfortunately, I'm not going to do that soon (like in 1-2 weeks); it may take some more time.

    Thanks again and I'll get back to this issue later. Please keep it open.

  3. Andy repo owner

    Hi guys, is this still relevant? I need some unit tests in order to confirm this.

    The repo has been moved to Github (https://github.com/neithere/django-autoslug) and the issue tracker here will be closed soon, too (no plans for migration because it's not that easy and I don't think it's worth it). So I'd like to close all issues except for those which are still relevant.

  4. Michał Pasternak

    Well, it is relevant... and it is not.

    The only time this bug occurs is when you do multithreaded writes to tables with AutoSlugField. This is because of how the project code is constructed: first there is a SELECT, then there is a write. If there are multiple instances writing to the same table and the same column, deadlocks may occur, and they did occur in my case.

    A possible solution would be, I guess, to just write to the database blindly and, in case there is a unique index error for the AutoSlugField, just add numbers to the slug, without issuing any SELECTs.

    If I really need this, I'll send some patches to GitHub.

    As for now, I'd close it. Thanks for maintaining the project. Job well done.

  5. Andy repo owner

    Thank you Michał, got it now. I'm not sure I understand how to fix it though. We could catch django.db.IntegrityError for unique slugs and keep incrementing the number until there's no error, but we don't have per-field granularity: the exception does not tell us which field caused it. I don't think we want AutoSlugField to affect model-wide behaviour on save. (OTOH, I haven't worked with Django for a very long time now, so perhaps I'm missing something rather obvious.)
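
For reference, the "write blindly and retry on a uniqueness error" idea from the last two comments could look roughly like the sketch below. It is not a patch for django-autoslug: it uses the standard-library sqlite3 module instead of Django, the helper name `insert_with_unique_slug` and the `-2`, `-3` suffix format are made up for illustration, and it sidesteps the per-field granularity concern by assuming the slug column is the only constraint on the table.

```python
import sqlite3

# Illustration only: a hypothetical table whose ONLY constraint is the
# unique slug, so any IntegrityError must come from the slug column.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE article (slug TEXT UNIQUE)")


def insert_with_unique_slug(conn, base, max_attempts=100):
    """Insert without a prior SELECT; on a uniqueness error, retry with a suffix."""
    for n in range(1, max_attempts + 1):
        candidate = base if n == 1 else f"{base}-{n}"
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute("INSERT INTO article (slug) VALUES (?)", (candidate,))
            return candidate
        except sqlite3.IntegrityError:
            continue  # slug already taken, try the next suffix
    raise RuntimeError(f"no free slug for {base!r} after {max_attempts} attempts")


slugs = [insert_with_unique_slug(db, "hello") for _ in range(3)]
```

Here the database's unique index is the only arbiter, so there is no gap between a check and a write. On a model with several unique constraints, the retry loop would also swallow errors caused by other fields, which is the granularity problem raised above.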
