wikify your texts!
micro-framework for text wikification

goals: avoid conflicts between text modification rules, and be easy to extend and debug

author: anatoly techtonik <techtonik@gmail.com>
license: Public Domain

the problem and solution

this example is pasted from the real-world replacement rules of the Roundup issue tracker:

>>> import re
>>> rules = [
...     # link to debian bug tracker
...     (re.compile(r'debian:\#(?P<id>\d+)'),
...      r'<a href="http://bugs.debian.org/\g<id>">debian#\g<id></a>'),
...
...     # link to local issue
...     (re.compile(r'\#(?P<id>\d+)'),
...      r'<a href="issue\g<id>">#\g<id></a>'),
... ]
>>> text = "debian:#222"
>>> for search, replace in rules:
...    text = search.sub(replace, text)
...
>>> text
'<a href="http://bugs.debian.org/222">debian<a href="issue222">#222</a></a>'

expected output is:

'<a href="http://bugs.debian.org/222">debian#222</a>'

the solution:

>>> import wikify
>>> wrules = [wikify.RegexpRule(s,r) for s,r in rules]
>>> wikify.wikify("debian:#222", wrules)
'<a href="http://bugs.debian.org/222">debian#222</a>'

usage

  1. define rules that match and process parts of text
  2. text = wikify(text, rules)

a rule is a function, or an object with a run() method, that takes text and returns either None (meaning not matched) or the text split into three parts [ not-matched, processed, the-rest ]. the processed part is returned as modified by the rule.

example of a rule in action:

>>> import wikify
>>> wikify.rule_link_wikify('wikify your texts!')
('', '<a href="https://bitbucket.org/techtonik/wikify/">wikify</a>', ' your texts!')

and its source code:

def rule_link_wikify(text):
  """ replace `wikify` text with a link to repository """
  if 'wikify' not in text:
    return None
  res = text.split('wikify', 1)
  site = 'https://bitbucket.org/techtonik/wikify/'
  url = '<a href="%s">wikify</a>' % site
  return (res[0], url, res[1])

using the rule with wikify to get processed text:

>>> from wikify import wikify, rule_link_wikify
>>> wikify('wikify your texts!', rule_link_wikify)
'<a href="https://bitbucket.org/techtonik/wikify/">wikify</a> your texts!'
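
since a rule may also be an object with a run() method, the same idea can be sketched as a small class. this is only an illustration: the class name WordLinkRule and its parameters are hypothetical and not part of wikify, and the example calls run() directly instead of going through wikify.

```python
class WordLinkRule(object):
    """Hypothetical rule object: links the first occurrence of `word` to `url`."""

    def __init__(self, word, url):
        self.word = word
        self.url = url

    def run(self, text):
        # None tells the caller that this rule did not match
        if self.word not in text:
            return None
        before, after = text.split(self.word, 1)
        link = '<a href="%s">%s</a>' % (self.url, self.word)
        # three parts: not-matched, processed, the-rest
        return (before, link, after)


rule = WordLinkRule('wikify', 'https://bitbucket.org/techtonik/wikify/')
parts = rule.run('wikify your texts!')
print(parts)
```

keeping the searched word and the url as constructor arguments means one class can serve several links without rewriting the rule body.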

you will probably want to change the url and the searched string, so to avoid rewriting the rule from scratch, wikify provides some ready-made rules.

API

RegexpRule(search, replace=r'\0')

wikify rule class. search is a regexp, replace can be a string with backreferences (like \0, \1 etc.) or a callable that receives a match object (re.MatchObject).

r = RegexpRule(r'(\d+)', r'[\1]')
print(wikify('wrap list 1 2 3 45', r))
# wrap list [1] [2] [3] [45]

unlike standard re.sub, RegexpRule expands \0 in the replacement template to the whole matched string.
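
for comparison, the standard re module spells "the whole match" as \g<0>. the sketch below shows one way the \0 convention could be layered on top of it; expand_template is a hypothetical helper written for this example, not the code RegexpRule actually uses, and the naive string replace would also touch an escaped backslash followed by 0.

```python
import re

def expand_template(match, template):
    r"""Expand `template` for `match`, treating \0 as the whole match."""
    # standard `re` uses \g<0> for the whole match; a bare \0 in a template
    # is read as an octal escape instead
    return match.expand(template.replace('\\0', '\\g<0>'))

match = re.search(r'(\d+)', 'issue 42')
expanded = expand_template(match, r'<b>\0</b> [\1]')
print(expanded)

# the same wrapping as the RegexpRule example, done with plain re.sub
wrapped = re.sub(r'(\d+)', r'[\g<0>]', 'wrap list 1 2 3 45')
print(wrapped)
```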

tracker_link_rule is a chained function rule (a function that returns a list of rules) that replaces references like #123 or issue #123 with a link to the given url with the issue number appended.

w = tracker_link_rule('https://bitbucket.org/techtonik/wikify/issue/')
print(wikify('issue #123, &#8121;', w))
# <a href="https://bitbucket.org/techtonik/wikify/issue/123">issue #123</a>, &#8121;
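
the matching behaviour shown above (link issue #123, leave the HTML entity &#8121; alone) can be approximated with a single regexp. this is a sketch under assumptions: link_issue_refs is a hypothetical stand-in built on plain re.sub, and the pattern is a guess at the behaviour, not the regexps tracker_link_rule really uses.

```python
import re

def link_issue_refs(url, text):
    """Hypothetical stand-in for a tracker link rule, using plain re.sub."""
    # optional "issue" prefix, then #<digits>; the (?<!&) lookbehind skips
    # HTML entities such as &#8121;
    pattern = re.compile(r'(?<!&)(?:\bissue\s*)?#(\d+)', re.IGNORECASE)

    def to_link(match):
        # group(1) is the issue number, group(0) the whole reference
        return '<a href="%s%s">%s</a>' % (url, match.group(1), match.group(0))

    return pattern.sub(to_link, text)

linked = link_issue_refs('https://bitbucket.org/techtonik/wikify/issue/',
                         'issue #123, &#8121;')
print(linked)
```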

wikify(text, rules)

the rules argument can be a single rule or a list of rules. wikify ensures that text processed by one rule is not reachable by the others. if you process text without wikify, using just a series of replacement commands, a later replacement may affect text that was just inserted by an earlier one. wikify was made to prevent this.

using as a Sphinx extension

wikify is also a Sphinx extension. the following lines, if added to conf.py, will link issue numbers on the changes page to the bug tracker of the sphinx project:

extensions = ['wikify']

# setup wikify extension to convert issue references to links
from wikify import RegexpRule, tracker_link_rule
wikify_html_rules = [
    # PR#123 or pull request #123
    RegexpRule(r'(PR|pull request\s)\s*#(\d+)',
        '<a href="https://bitbucket.org/birkenfeld/sphinx/pull-request/\\2">\\0</a>'),
    # issue #123 or just #123
    tracker_link_rule('https://bitbucket.org/birkenfeld/sphinx/issue/')
]
wikify_html_pages = ['changes']

operation (flat algorithm)

for each region:

  • find region in processed text
  • process text matched by region
  • exclude processed text from further processing

note: the flat algorithm doesn't process nested markup, such as:

*`bold preformatted text`*

example - replace all wiki:something with HTML links

  • [x] wrap text into list with single item
  • [x] split text into three parts using regexp wiki:\w+
  • [x] copy 1st part (not-matched) into the resulting list
  • [x] replace matched part with link, insert (processed) into the resulting list
  • [ ] process (the-rest) until text list doesn't change
  • [x] repeat the above for the rest of rules, skipping (processed) parts
  • [x] reassemble text from the list
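
the steps above can be sketched in a few lines of python. this is a minimal illustration of the flat algorithm, not wikify's actual source: make_regexp_rule and flat_wikify are hypothetical names, and the rules are the two Roundup patterns from the beginning of this document.

```python
import re

def make_regexp_rule(pattern, template):
    """Hypothetical helper: build a rule function from a regexp and a template."""
    regex = re.compile(pattern)

    def rule(text):
        match = regex.search(text)
        if match is None:
            return None
        # three parts: not-matched, processed, the-rest
        return (text[:match.start()], match.expand(template), text[match.end():])

    return rule

def flat_wikify(text, rules):
    """Apply rules in order; text produced by a rule is never processed again."""
    # wrap text into a list with a single item: (text, already processed?)
    chunks = [(text, False)]
    for rule in rules:
        result = []
        for chunk, processed in chunks:
            if processed:
                result.append((chunk, True))  # skip processed parts
                continue
            rest = chunk
            while rest:
                parts = rule(rest)
                if parts is None:
                    break
                before, replaced, after = parts
                if before:
                    result.append((before, False))
                result.append((replaced, True))
                no_progress = len(after) >= len(rest)
                rest = after
                if no_progress:
                    break  # guard against rules that consume nothing
            if rest:
                result.append((rest, False))
        chunks = result
    # reassemble text from the list
    return ''.join(chunk for chunk, _ in chunks)

rules = [
    make_regexp_rule(r'debian:\#(?P<id>\d+)',
                     r'<a href="http://bugs.debian.org/\g<id>">debian#\g<id></a>'),
    make_regexp_rule(r'\#(?P<id>\d+)', r'<a href="issue\g<id>">#\g<id></a>'),
]
print(flat_wikify('debian:#222', rules))
```

marking each chunk as processed or not is what keeps the second rule away from the link inserted by the first one.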

roadmap

  • [ ] optimize - measure performance of using indexes instead of text chunks
  • [x] write docs
  • [x] upload to PyPI

history

  • 1.5 - fixed major flaw in subst order for single rule
  • 1.4 - support named group replacements in RegexpRule
  • 1.3 - rename create_tracker_link_rule to tracker_link_rule
  • 1.2 - convert create_regexp_rule to RegexpRule class
  • 1.1 - allow rules to be classes (necessary for Sphinx)
  • 1.0 - use wikify as Sphinx extension
  • 0.9 - case insensitive match in tracker link rule
  • 0.8 - python 3 compatibility
  • 0.7 - fixed major flaw in text replacements mapping
  • 0.6 - flatten nested rule lists
  • 0.5 - helper to build rules to link tracker references
  • 0.4 - accept single rule in wikify in addition to list
  • 0.3 - allow callables in replacements for regexp rules
  • 0.2 - helper to build regexp based rules
  • 0.1 - proof of concept, production ready, no API sugar and optimizations