Stop using Show ID as a uniqueness identifier

Issue #11 new
Chris Earley repo owner created an issue

NHK World reuses show IDs, so we must switch to the show's title to establish uniqueness.

Background

While working on issue #9, I dug into the old getshowIDs.py script and found that it wasn't generating the pickle database of shows correctly. Instead of recording each program only once, it was keeping multiple copies of a show's (Show Title, Show ID) tuple.

After rewriting the script to fix this, I noticed that some programs, such as "Dragon Dentist", "Snow Fever in Niseko", and "Symbols of Revival: Tohoku's Cherry Trees", were not ending up in the final pickle db. It turns out they all have the show ID 5001, and because the script was iterating through the show entries alphabetically, their entry was finally overwritten by "Somewhere Street", which is also ID 5001.
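The overwrite described above can be reproduced with a minimal sketch. This is not the actual getshowIDs.py code: the function name `build_db_by_id` and the use of a plain dict are assumptions made for illustration, and the input order is chosen so that "Somewhere Street" is processed last. Only the titles and the ID 5001 come from this issue.

```python
# Illustrative sketch only; not the real getshowIDs.py logic.
# Titles and ID 5001 are from this issue. The list order is an assumption,
# arranged so that "Somewhere Street" is processed last.
entries = [
    ("Dragon Dentist", 5001),
    ("Snow Fever in Niseko", 5001),
    ("Symbols of Revival: Tohoku's Cherry Trees", 5001),
    ("Somewhere Street", 5001),
]


def build_db_by_id(entries):
    """Hypothetical helper: key the database by show ID."""
    db = {}
    for title, show_id in entries:
        # A later entry with the same ID silently replaces the earlier one.
        db[show_id] = title
    return db


by_id = build_db_by_id(entries)
print(by_id)
```

Four programs go in, but only one entry comes out, because every write to key 5001 replaces the previous one.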

This is just one collision that I happened to notice. There are multiple* other instances of ID reuse, especially among specials and other non-recurring programs.

Fix

Switch to using the program's title as the identifier. However, a closer inspection of the malformed showID database turned up multiple instances of show titles with incorrect capitalization*, so all titles will need to be run through .lower() before checking.

  • Refer to the attached txt document for a list of all the show IDs and their associated programs
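A rough sketch of the proposed fix, assuming the database is a dict keyed by the lowercased title. The function name `build_db_by_title`, the choice to keep the first-seen spelling, and the "SOMEWHERE STREET" variant are all hypothetical and only stand in for the miscapitalized titles mentioned above.

```python
# Illustrative sketch only; names and sample data are assumptions.
# "SOMEWHERE STREET" is a made-up capitalization variant for demonstration.
entries = [
    ("Dragon Dentist", 5001),
    ("Snow Fever in Niseko", 5001),
    ("Somewhere Street", 5001),
    ("SOMEWHERE STREET", 5001),
]


def build_db_by_title(entries):
    """Hypothetical helper: key the database by the lowercased title."""
    db = {}
    for title, show_id in entries:
        key = title.lower()  # normalize capitalization before checking
        if key not in db:
            # Keep the first spelling seen along with its (reusable) show ID.
            db[key] = (title, show_id)
    return db


by_title = build_db_by_title(entries)
for key, (title, show_id) in by_title.items():
    print(key, "->", title, show_id)
```

With the title as the key, programs that share ID 5001 no longer overwrite each other, and the two spellings of the same title collapse into a single entry.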
