Add option to ignore duplicate entries

If I parse a file with duplicate keys, I get the following error:

>>> import pybtex.database
>>> d = pybtex.database.parse_file('anthology.bib')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.6/site-packages/pybtex/database/__init__.py", line 852, in parse_file
    return parser.parse_file(file)
  File "/usr/local/lib/python3.6/site-packages/pybtex/database/input/__init__.py", line 51, in parse_file
    self.parse_stream(f)
  File "/usr/local/lib/python3.6/site-packages/pybtex/database/input/bibtex.py", line 385, in parse_stream
    return self.parse_string(text)
  File "/usr/local/lib/python3.6/site-packages/pybtex/database/input/bibtex.py", line 380, in parse_string
    self.process_entry(entry_type, *entry[1])
  File "/usr/local/lib/python3.6/site-packages/pybtex/database/input/bibtex.py", line 347, in process_entry
    self.data.add_entry(key, entry)
  File "/usr/local/lib/python3.6/site-packages/pybtex/database/__init__.py", line 150, in add_entry
    report_error(BibliographyDataError('repeated bibliograhpy entry: %s' % key))
  File "/usr/local/lib/python3.6/site-packages/pybtex/errors.py", line 77, in report_error
    raise exception
pybtex.database.BibliographyDataError: repeated bibliograhpy entry: papineni2002:bleu

It would be nice if parse_file() accepted an option to ignore duplicate entries instead of raising an error (which is the behavior you get when running bibtex from the command line).
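
To make the request concrete, here is a minimal sketch of the semantics I have in mind. It assumes "ignore" means that the first entry with a given key wins and later ones are skipped. This is not pybtex code: EntryStore, DuplicateEntryError, the ignore_duplicates flag, and the sample records are all hypothetical stand-ins used for illustration only.

```python
class DuplicateEntryError(Exception):
    """Hypothetical stand-in for the error raised on a repeated key."""


class EntryStore:
    """Toy container illustrating the requested option (not pybtex's class)."""

    def __init__(self, ignore_duplicates=False):
        self.ignore_duplicates = ignore_duplicates
        self.entries = {}   # key -> entry, first occurrence wins
        self.skipped = []   # keys of entries dropped as duplicates

    def add_entry(self, key, entry):
        if key in self.entries:
            if self.ignore_duplicates:
                # Requested behavior: remember the skip and carry on.
                self.skipped.append(key)
                return
            # Behavior shown in the traceback: stop at the first repeat.
            raise DuplicateEntryError('repeated bibliography entry: %s' % key)
        self.entries[key] = entry


# Placeholder records; only the key is taken from the traceback above.
records = [
    ('papineni2002:bleu', {'note': 'first copy'}),
    ('papineni2002:bleu', {'note': 'second copy'}),
]

store = EntryStore(ignore_duplicates=True)
for key, entry in records:
    store.add_entry(key, entry)
```

With the flag set, the store keeps the first copy and records the skipped key, so a caller could still report which entries were dropped. With the flag left at its default, the second add_entry() call raises, as it does today.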
