Odd OCR results in SRT file

Issue #642 new
Michael Dreimiller created an issue

I mostly use Subler to create SRT files from the subtitles on DVDs. Sometime around the end of last year I started noticing that an SRT file would occasionally contain an entry that is total garbage, while the caption lines around it are fine. This is new to me: before then I only ever saw spacing issues and character substitutions (like “VV” instead of “W”, or the “fi” and “fl” ligatures instead of the individual characters). When it happens, it happens repeatedly throughout the SRT file, and I usually find that lines of dialog are skipped entirely as well.

I’ve attached a screenshot of an example from the 2001 Sony Pictures Classics DVD release of the 1971 film “Giardino dei Finzi-Contini”. The screenshot shows the actual dialog, and below is the section of the generated SRT file around it; entry 10 is the garbage that was produced for that dialog.

8
00:03:43,021 --> 00:03:46,650
— Race you to the entrance!
— Don‘t leave me behind!

9
00:03:51,029 --> 00:03:52,963
They‘re coming back.

10
00:03:53,031 --> 00:03:56,626
mbeganbfinkflm

11
00:03:56,701 --> 00:03:59,966
No, the Finzi—Continis
never leave their kingdom.

12
00:04:00,505 --> 00:04:04,066
The house must be
along way from the gate.
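
For anyone who wants to check their own files for the same symptom, here is a rough sketch of a scan for entries like number 10 above. It is only a heuristic and has nothing to do with how Subler's OCR works internally: the function names (`parse_srt`, `looks_like_garbage`, `suspicious_cues`) and the `min_len` threshold are made up for illustration, and the rule "one long run of letters with no spaces or punctuation" is an assumption based on this single example. It will miss other kinds of garbage and may flag a legitimate one-word caption.

```python
import re

def parse_srt(text):
    """Yield (index, timing, caption_lines) for each entry in SRT text."""
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        # A well-formed entry has an index line, a timing line, and text.
        if len(lines) >= 3 and "-->" in lines[1]:
            yield lines[0].strip(), lines[1].strip(), lines[2:]

def looks_like_garbage(caption_lines, min_len=10):
    """Assumed heuristic: a single unbroken run of letters (including the
    fi/fl ligatures) at least min_len long, with no spaces or punctuation."""
    text = " ".join(caption_lines).strip()
    if len(text) < min_len:
        return False
    return re.fullmatch(r"[A-Za-z\ufb01\ufb02]+", text) is not None

def suspicious_cues(text):
    """Return the index strings of entries that match the heuristic."""
    return [idx for idx, _, lines in parse_srt(text) if looks_like_garbage(lines)]
```

To run it on a file, read the SRT as text (for example `open(path, encoding="utf-8").read()`) and pass the result to `suspicious_cues`. It cannot detect the skipped lines of dialog, since those leave no trace in the SRT file.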

I just thought it was worth mentioning.
