Anonymous committed 64385e6

Don't split at apostrophes. The simple approach of keeping a list of words that can contain them ("you'll", etc.) fails to account for names, some of which contain multiple apostrophes (e.g. "Ng'ang'a").
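To illustrate the failure, here is a minimal sketch that compares the old suffix-list check with the new "never split at an apostrophe" check. The helper names `old_split_points` and `new_split_points` are hypothetical, and the sketch works on plain strings rather than the project's Token objects, so it only approximates what `PunctSplitRule` does.

```python
# Hypothetical stand-ins for the old and new apostrophe checks.
# The endings mirror the removed _contraction_endings list.
CONTRACTION_ENDINGS = ["t", "s", "d", "ll"]


def old_split_points(word):
    """Indexes of apostrophes the old suffix-list check would split at."""
    return [i for i, char in enumerate(word)
            if char == "'" and word[i + 1:] not in CONTRACTION_ENDINGS]


def new_split_points(word):
    """The new check never splits at an apostrophe."""
    return []


for word in ["you'll", "don't", "Ng'ang'a"]:
    print(word, old_split_points(word), new_split_points(word))
```

Under these assumptions the old check leaves "you'll" and "don't" intact but splits "Ng'ang'a" at both apostrophes, which is the case the commit message describes.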

Disable SpellDigitsRule: there are so many exceptions where it shouldn't be applied that it causes more work than it saves.
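The kind of exception meant here can be reproduced with a context-free digit speller. The following is a simplified, hypothetical stand-in (`naive_spell_digits` is not the project's code) that spells out every standalone digit from 1 to 9, applied to the two examples given in the comment added above `SpellDigitsRule`.

```python
import re

# Hypothetical, simplified stand-in for a digit-spelling rule: it has no
# awareness of times, lists, or any other surrounding context.
DIGIT_WORDS = {
    "1": "one", "2": "two", "3": "three", "4": "four", "5": "five",
    "6": "six", "7": "seven", "8": "eight", "9": "nine",
}


def naive_spell_digits(text):
    """Replace each standalone digit 1-9 with its spelled-out form."""
    return re.sub(r"\b[1-9]\b", lambda m: DIGIT_WORDS[m.group()], text)


# Times should keep their digits, but the naive rule spells them out.
print(naive_spell_digits("7 a.m. to 9 p.m."))
# A list ends up mixing spelled-out numbers with digits.
print(naive_spell_digits("children aged 3, 7, and 10"))
```

With this sketch the first string becomes "seven a.m. to nine p.m." and the second becomes "children aged three, seven, and 10". Both results need manual correction, which is the extra work the commit message refers to.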

Files changed (1)

 
     Avoid splitting numeric punctuation, e.g., 11,000.34 should not be
     split at the comma or the decimal. Also avoid splitting at
-    apostrophes in contractions.
+    apostrophes.
     """
 
     # this is the same as Token.delimited_decimal_re except that
           """,
         re.U | re.X)
 
-    # TODO: names containing apostrophes are not recognized here. Note
-    # some names may contain more than one apostrophe, e.g. Ng'ang'a.
-    _contraction_endings = [u't', u's', u'd', u'll']
-
     def __init__(self):
         """Set rule priority and name. """
         Rule.__init__(self, 60, 1.0,
                     # Found punctuation character, and it is not
                     # embedded within a number as a thousands separator
                     # or a decimal point. Check to see if it is an
-                    # apostrophe in a contraction.
-                    if (char == u"'" and
-                        token.str[i + 1:] in PunctSplitRule._contraction_endings):
-                            continue
+                    # apostrophe.
+                    if char == u"'":
+                        continue
 
                     # Create a transform to split the token at this
                     # point.
         return transforms
 
 
+# This rule is currently disabled because it turns out to be more
+# trouble than it's worth: there are too many exceptions that it isn't
+# aware of. For example, times shouldn't be spelled ("7 a.m. to 9 p.m.")
+# and numbers in lists should either all be spelled or all in digits
+# ("children aged 3, 7, and 10").
 class SpellDigitsRule(Rule):
     """Spell out numbers 1..9.
 
     def __init__(self):
         """Set rule priority and name. """
         Rule.__init__(self, 80, 1.0, "Spell out single digit  numbers.")
+        # DISABLED, see comment at head of class
+        self.enabled = False
 
     def get_transforms(self, tokens):
         """Return an array of transform objects."""
+        # DISABLED, see comment at head of class
+        self.enabled = False
+        return []
+
         self.tokens = tokens
         transforms = []
 