Ezio Melotti committed c3e9ae0

Improve the cleanpath function.

Comments (2)

  1. Ezio Melotti author

    The old version was more readable but not as powerful. I decided to use a regex while trying to get rid of the date. Splitting on '\t' didn't work because sometimes spaces were used as the separator, and some of the patches also contained spaces in the filename, so splitting on spaces wasn't an option either. Using a regex also simplified handling leading a/, b/, A/; dirs starting with python-, Python-, Python2, Python3; dates starting with a weekday or with a year; and things like "(working copy)". There are still a few cases left, though. Windows paths are not converted, and I've seen lines like "b/C:\Python32\Lib\distutils\msvc9compiler_manifest.py" (quotes included). Still, the regex covers 98% of the cases I've seen.
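To make the cases above concrete, here is a minimal sketch that runs the pattern from this commit against a few diff-header paths. The regex is copied from the diff; the `clean` helper and the `samples` list are hypothetical and only illustrate the kinds of input described in the comment, they are not taken from real patches.

```python
import re

# Pattern copied from the commit's diff. `clean` and `samples` are
# hypothetical names added only for this illustration.
days = 'Mon|Tue|Wed|Thu|Fri|Sat|Sun'
path_re = re.compile(r'^(?:[ab]/)?(?:python[-23][^/]*/)?(.*?)\s*'
                     r'(?:\s(?:%s|20[01]\d|199\d|\(\w+\s)\b.*)?(?:\.orig)?$' %
                     days, re.I)


def clean(raw):
    """Return the part of a raw diff-header path captured by group 1."""
    return path_re.match(raw).group(1)


# Made-up inputs, one per case mentioned in the comment.
samples = [
    'a/Lib/test/test_os.py',                  # leading a/
    'b/Python-3.2/Lib/os.py.orig',            # b/, Python- dir, .orig
    'Lib/os.py\tMon Jan 10 12:00:00 2011',    # tab, date starting with a weekday
    'a/Lib/my file.py  2010-05-01 10:00:00',  # space in filename, date starting with a year
    'Doc/library/os.rst (working copy)',      # "(working copy)" suffix
    r'"b/C:\Python32\Lib\distutils\msvc9compiler_manifest.py"',  # not handled
]

for raw in samples:
    print('%r -> %r' % (raw, clean(raw)))
```

The quoted Windows path comes back unchanged, because the leading quote keeps the `[ab]/` prefix from matching at the start of the string.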

Files changed (1)

 from __future__ import print_function
+import re
 import sys
 import json
     cached = True
   return (filename, cached)
+days = 'Mon|Tue|Wed|Thu|Fri|Sat|Sun'
+path_re = re.compile(r'^(?:[ab]/)?(?:python[-23][^/]*/)?(.*?)\s*'
+                     r'(?:\s(?:%s|20[01]\d|199\d|\(\w+\s)\b.*)?(?:\.orig)?$' %
+                     days, re.I)
 def cleanpath(source, target):
-    # best-effort function to clean up the path
-    path = source
-    if not source or source == 'dev/null':
-        path = target
-    # some paths are followed by the date
-    path = path.split()[0]
-    if path.startswith(('a/', 'b/')):
-        path = path[2:]
-    if path.endswith('.orig'):
-        path = path[:-5]
-    parts = path.split('/')
-    if parts[0].startswith('Python-'):
-        parts = parts[1:]
-    path = '/'.join(parts)
+    # clean up the path by removing leading a/, b/, or python* dirs,
+    # and trailing dates, or '(working copy)', or '.orig' extensions
+    path = target
+    if not target or target == 'dev/null':
+        path = source
+    # if this fails the regex is broken
+    path = path_re.match(path).group(1)
     return path
 issue_files = {}    # 'number' => []