Matching with the API
Hi,
I'm trying to resolve a list of names with taxonome's Python API, and I wanted to know the best approach.
At first, I used the code below, but the problem was that sometimes the "original name" column had been changed and wasn't the exact string I provided as input.
Next, I tried a different tracker: instead of "tracker.CSVListMatches(f)" I used "tracker.CSVTaxaTracker(f,["Id"])" and used the "Id" column to match my original rows.
The problem is that with this tracker I don't get the score. Also, looking at the code, the two trackers look a little different, so I wondered which one I should use.
Thanks,
Here is the code snippet:
Comments (4)
-
Account Deleted -
Account Deleted - edited description
    trackers = []
    files = []
    f = open(mappings_file, "w", encoding='utf-8', newline='')
    files.append(f)
    trackers.append(tracker.CSVListMatches(f))
    f = open(log_file, "w", encoding='utf-8', newline='')
    files.append(f)
    trackers.append(tracker.CSVTracker(f))
    with open(input_filename, encoding='utf-8', errors='ignore') as f:
        input_taxa = load_taxa(f, namefield=namefield, authfield=authfield)
        run_match_taxa(input_taxa, taxonset, tracker=trackers, nameselector=name_selector.NameSelector())
-
(Just formatting the code)
    trackers = []
    files = []

    f = open(mappings_file, "w", encoding='utf-8', newline='')
    files.append(f)
    trackers.append(tracker.CSVListMatches(f))

    f = open(log_file, "w", encoding='utf-8', newline='')
    files.append(f)
    trackers.append(tracker.CSVTracker(f))

    with open(input_filename, encoding='utf-8', errors='ignore') as f:
        input_taxa = load_taxa(f, namefield=namefield, authfield=authfield)
        run_match_taxa(input_taxa, taxonset, tracker=trackers, nameselector=name_selector.NameSelector())
-
If you use iter_taxa instead of load_taxa, then it should work through the records from your input file in strict order, so the rows in the CSVListMatches output file will correspond to the input file. load_taxa loads the taxa into an unordered collection, which you then iterate over. Actually, my newer implementation does preserve order, but it's best not to rely on that, because it may, for example, remove duplicate taxa.

Another approach would be to write a specific tracker. You could copy CSVListMatches, but record the id field in place of the original name.
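A rough sketch of what the second approach could look like, using only the standard library. This is not taxonome's real tracker interface: the class name IdListMatches, the hook names start_taxon and record_match, and the sample names and scores are all made up for illustration. The point is only the pattern: remember the id when an input row starts, then write that id (rather than the name string) on every match row, so the output can be joined back to the input even if the name was altered along the way.

```python
import csv
import io


class IdListMatches:
    """Hypothetical tracker that keys every match row by the input row's id.

    The hook names below are illustrative stand-ins; they are NOT taken
    from taxonome's tracker module and would need adapting to it.
    """

    def __init__(self, fileobj, id_field="Id"):
        self.writer = csv.writer(fileobj)
        self.id_field = id_field
        self.current_id = None
        self.writer.writerow([id_field, "Matched name", "Score"])

    def start_taxon(self, info):
        # Remember the id of the input row that is about to be matched.
        self.current_id = info[self.id_field]

    def record_match(self, matched_name, score):
        # Write the stored id in place of the (possibly altered) input name.
        self.writer.writerow([self.current_id, matched_name, score])


def demo():
    """Drive the sketch by hand with invented sample values."""
    out = io.StringIO(newline="")
    t = IdListMatches(out)
    t.start_taxon({"Id": "17", "Name": "Vigna unguiculata"})
    t.record_match("Vigna unguiculata (L.) Walp.", 1.0)
    t.start_taxon({"Id": "18", "Name": "Vigna radiatta"})
    t.record_match("Vigna radiata (L.) R.Wilczek", 0.95)
    return out.getvalue()


if __name__ == "__main__":
    print(demo())
```

With the id on every row, the score column from the match list is kept and the join back to the input no longer depends on the name string or on row order.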