jslex / unirange.py

Ned Batchelder af0593c

import unicodedata

# Character classes, each mapped to the Unicode general categories it covers.
classes = {
    'letters': "Lu Ll Lt Lm Lo Nl",
    'combining': "Mn Mc",
    'digit': "Nd",
    'connector': "Pc",
    }

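# Invert `classes` into a category -> class lookup, and start each class
# with an empty list of [first, last] code point ranges.
cat_to_class = {}
ranges = {}

for klass, cats in classes.items():
    for cat in cats.split():
        cat_to_class[cat] = klass
    ranges[klass] = []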

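# Walk the Basic Multilingual Plane (U+0000 through U+FFFF) and collapse
# consecutive code points of the same class into [first, last] pairs.
for i in range(0x10000):
    cat = unicodedata.category(chr(i))
    try:
        klass = cat_to_class[cat]
    except KeyError:
        # Not a category we care about.
        continue
    r = ranges[klass]
    if r and r[-1][1] == i - 1:
        # Extends the most recent range by one code point.
        r[-1][1] = i
    else:
        r.append([i, i])

# Emit each class as a bracketed character class built from \uXXXX escapes.
for k, r in ranges.items():
    reg = "["
    for a, b in r:
        if a == b:
            reg += r"\u%04x" % a
        else:
            reg += r"\u%04x-\u%04x" % (a, b)
    reg += "]"
    print("%s = %s" % (k, reg))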
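
The range-collapsing and formatting steps above can also be pulled out into small functions, which makes them easy to check in isolation. What follows is a minimal Python 3 sketch, not part of the original script: the names `collapse_ranges` and `char_class` are hypothetical, and the usage at the end assumes the generated class is fed to Python's own `re` module, which accepts `\uXXXX` escapes in `str` patterns.

```python
import re
import unicodedata


def collapse_ranges(code_points):
    """Collapse an ascending iterable of ints into [first, last] pairs.

    Hypothetical helper mirroring the script's main loop.
    """
    ranges = []
    for cp in code_points:
        if ranges and ranges[-1][1] == cp - 1:
            # cp directly follows the previous range, so extend it.
            ranges[-1][1] = cp
        else:
            ranges.append([cp, cp])
    return ranges


def char_class(ranges):
    """Format [first, last] pairs as a character class of \\uXXXX escapes.

    Hypothetical helper mirroring the script's output loop.
    """
    parts = []
    for first, last in ranges:
        if first == last:
            parts.append(r"\u%04x" % first)
        else:
            parts.append(r"\u%04x-\u%04x" % (first, last))
    return "[" + "".join(parts) + "]"


print(collapse_ranges([1, 2, 3, 7, 9, 10]))  # [[1, 3], [7, 7], [9, 10]]

# Build the 'digit' class (category Nd) over the Basic Multilingual Plane.
digits = [i for i in range(0x10000) if unicodedata.category(chr(i)) == "Nd"]
digit_class = char_class(collapse_ranges(digits))

# The generated text is usable directly as a Python regex.
digit_re = re.compile(digit_class)
print(bool(digit_re.fullmatch("7")))  # True
print(bool(digit_re.fullmatch("a")))  # False
```

Returning the ranges rather than printing them keeps the Unicode scan separate from the output format, so the same pairs could be rendered for a different regex dialect.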