Commits

Virgil Dupras  committed 393ed81

Instead of trying to optimize grouping (which broke a test), skip it when there are too many boxes to group.

What's the point of nesting 2-sized groups anyway?

  • Parent commits 26580b3

Files changed (1)

File pdfminer/layout.py

+import logging
 from itertools import combinations
 
 from .utils import (INF, get_bound, uniq, fsplit, drange, bbox2str, matrix2str, apply_matrix_pt,
             y1 = max(obj1.y1,obj2.y1)
             objs = set(plane.find((x0,y0,x1,y1)))
             return objs.difference((obj1,obj2))
+        if len(boxes) > 100:
+            # Grouping this many boxes would take too long and it doesn't make much sense to do so
+            # considering the type of grouping (nesting 2-sized subgroups) that is done here.
+            logging.warning("Too many boxes (%d) to group, skipping.", len(boxes))
+            return boxes
         # XXX this still takes O(n^2)  :(
         dists = []
         for obj1, obj2 in combinations(boxes, 2):
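
The guard added in this commit can be illustrated outside pdfminer with a minimal sketch. This is not the code in layout.py: the function name group_closest_pair, the dist callback, and the use of plain numbers as "boxes" are assumptions made for illustration. Only the threshold of 100 and the log message are taken from the diff. The reason for the guard is that combinations(boxes, 2) yields n*(n-1)/2 pairs, which is already 4,950 pairs for 100 boxes.

```python
import logging
from itertools import combinations

logger = logging.getLogger(__name__)


def group_closest_pair(boxes, dist, max_boxes=100):
    """Merge the two closest boxes into one 2-sized group (hypothetical helper).

    Returns the boxes unchanged when there are more than max_boxes of them,
    mirroring the early return added in the diff.
    """
    boxes = list(boxes)
    if len(boxes) > max_boxes:
        # Same message as in the commit; the pairwise scan below is skipped.
        logger.warning("Too many boxes (%d) to group, skipping.", len(boxes))
        return boxes
    if len(boxes) < 2:
        # Nothing to pair up.
        return boxes
    # O(n^2) scan over index pairs, so duplicate values are handled correctly.
    i, j = min(
        combinations(range(len(boxes)), 2),
        key=lambda ij: dist(boxes[ij[0]], boxes[ij[1]]),
    )
    rest = [b for k, b in enumerate(boxes) if k not in (i, j)]
    return rest + [(boxes[i], boxes[j])]


if __name__ == "__main__":
    # 10 and 11 are the closest pair, so they end up nested together.
    print(group_closest_pair([0, 10, 11, 30], lambda a, b: abs(a - b)))
```

Calling the helper with 101 or more items returns the input list as-is and logs a warning, which is the behavior the commit message describes: skipping the grouping step rather than trying to make it faster.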