Unique fields cannot be unlimited length in MySQL

Issue #276 resolved
Ed McDonagh created an issue

Raised by Josef Lundman: https://groups.google.com/d/topic/openrem/Qd117-eoebY/discussion

UniqueEquipmentNames has almost all text fields. MySQL cannot use these if uniqueness is required: http://stackoverflow.com/a/1827099

  • Option 1: Limit the length of each field, convert them to char fields
  • Option 2: Create an SHA hash of each field on edit, and make them unique together instead.

@dplatten: what do you think?

Comments (8)

  1. David Platten

    Wow @edmcdonagh, that's annoying.

    I don't like the idea of limiting the length of each field, as that may cause problems of its own. I'm not sure what Option 2 entails, but it's probably the way to go.

  2. Ed McDonagh reporter

    My intention @dplatten is to do the following:

    In models: Add a fixed length hash version of each field that is currently unique_together, and change the unique_together Meta to have the hash versions instead of the current fields

    In each extractor:

        from remapp.tools.hash_id import hash_id
    
        equip_display_name, created = UniqueEquipmentNames.objects.get_or_create(manufacturer=equip.manufacturer,
                                                                                 manufacturer_hash=hash_id(equip.manufacturer)
                                                                                 institution_name=equip.institution_name,
                                                                                 institution_name_hash=hash_id(equip.institution_name)
                                                                                 ...
                                                                                 )
    

    I was thinking that the hash would be created as a def in models, but I think this would be less complicated. I read something that suggested unique_together was evaluated before the model was saved, so it might be that this is the only way to do it anyway.

    I don't know how you create a migration for this? You'll want to create and populate the hash fields, then change the unique_together properties. Is that possible in a semi-automated way?

  3. Ed McDonagh reporter

    Added hash versions of each unique_together field, and a Bool indicating if hashes have been generated. Changed unique_together to look at the new fields. On a SQLite database, this migrated without complaint. Added routine called from display_name_update view which creates hash versions if they have not already been created. If this works with PostgreSQL @dplatten, after a migration the user would just need to update and save each name. Refs #276.

    → <<cset 5feba2a33597>>

  4. Ed McDonagh reporter

    Corrected error not setting hash_generated Bool. Added try except catch for displayname being None, which usually indicates something has not been imported correctly. Refs #276

    → <<cset 722043e3d4e2>>

  5. Log in to comment