I run a complex regex pattern that matches a sequence of 3-4 consecutive numbers, written in different numeric formats and possibly separated by some text. Matching is done WITH OVERLAP, using the finditer(text, overlapped=True) method of a compiled regex.
The pattern works fine on most inputs (tested on 100K different real-world texts), but it hangs on an input containing dense sequences of several hundred nearly consecutive numbers. This is a difficult input, no doubt. However, even after adding a timeout (for example, timeout=5), the method STILL HANGS for many minutes or longer, which suggests a problem with the way “timeout” is processed. htop shows the process is busy the whole time (100% CPU).
Environment: Ubuntu 20.04, 64-bit, Python 3.8.2. I first tried regex 2020.6.8, then upgraded to the latest version (2020.7.14). The problem occurs with both versions.
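
A possible stopgap while the timeout is not honoured (a workaround, not a fix) is to run the match in a child process and kill it after a deadline. The sketch below uses only the standard library. The names run_with_deadline, _child, double and busy_loop are made up for illustration and are not part of the regex module, and the "fork" start method is an assumption that limits it to POSIX systems such as the Ubuntu machine above.

```python
import multiprocessing as mp


def _child(conn, func, args):
    """Run func(*args) in the child process and send the outcome to the parent."""
    try:
        conn.send(("ok", func(*args)))
    except Exception as exc:  # report failures instead of dying silently
        conn.send(("error", repr(exc)))
    finally:
        conn.close()


def run_with_deadline(func, args, seconds):
    """Return func(*args), or raise TimeoutError after `seconds` seconds.

    Assumption: the "fork" start method is available (POSIX only).
    The result travels through a pipe, so it must be picklable,
    e.g. a list of (start, end) tuples rather than match objects.
    """
    ctx = mp.get_context("fork")
    parent_conn, child_conn = ctx.Pipe(duplex=False)
    proc = ctx.Process(target=_child, args=(child_conn, func, args))
    proc.start()
    child_conn.close()  # the parent only needs the read end
    try:
        # Wait for a result; give up once the deadline has passed.
        if not parent_conn.poll(seconds):
            raise TimeoutError(f"no result after {seconds} s")
        status, payload = parent_conn.recv()
    finally:
        # Kill the child if it is still running, then reap it.
        if proc.is_alive():
            proc.terminate()
        proc.join()
        parent_conn.close()
    if status == "error":
        raise RuntimeError(payload)
    return payload


def double(x):
    """Hypothetical fast job, used to show the normal path."""
    return x * 2


def busy_loop():
    """Hypothetical stand-in for a match that never returns (100% CPU)."""
    while True:
        pass
```

For the case in this report, func would be a module-level function that compiles the pattern, calls finditer(text, overlapped=True), and returns a list of m.span() tuples. Calling run_with_deadline(busy_loop, (), 5.0) raises TimeoutError after about five seconds instead of spinning indefinitely. The cost is one extra process per call, so this is only sensible for inputs already suspected to be pathological.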