Commits

dirkbaechle committed 209b041

Version v1.0

  • Parent commits c2ba235

Files changed (3)

+@author: Dirk Baechle
+@title: Specification of the xmlwiko language
+
+== Basics == basic
+
+An xmlwiko file ($$*.wiki$$) consists of text blocks. These blocks
+are separated by one or more blank lines (2+ newlines). A text block itself
+does not contain blank lines.
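The block rule above can be sketched in a few lines of Python. This is only an illustration: `split_blocks` is a hypothetical helper name, and the handling of `@key: value` variable lines is left out.

```python
def split_blocks(text):
    """Split wiki text into blocks separated by one or more blank lines."""
    blocks = []
    current = []
    for line in text.splitlines():
        if line.strip() == "":
            # A blank line closes the block collected so far, if any.
            if current:
                blocks.append(current)
                current = []
        else:
            current.append(line)
    # The final block need not be followed by a blank line.
    if current:
        blocks.append(current)
    return blocks
```

Each block comes back as a list of its lines, so several consecutive blank lines never produce an empty block.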
+
+The markups for the text can be divided into the categories:
+sections, lists, environments and simple paragraphs with markup.
+
+== Sections == sections
+
+Sections outline the structure of your text. You can indent or dedent sections
+to any level you like, so we need ways of adding a subsection or closing
+several opened sections at once (dedent).
+
+A simple section is started by the code:
+
+Code: javascript
+== title == [id]
+
+As the square brackets imply, the id is optional for you...but required by
+the Forrest DTD. If you leave it out, the words of the given title are joined
+by underscores '$$_$$' and converted to lowercase to form the id of this
+section.
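A minimal sketch of how such a default id could be derived from the title. The function name `default_section_id` is made up for this example.

```python
def default_section_id(title):
    """Join the words of a section title with underscores, in lowercase."""
    return "_".join(word.lower() for word in title.split())
```

For a title like "Simple paragraphs" this gives `simple_paragraphs`.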
+
+Starting a section like this will keep the current indentation level. So if
+another section has been opened before, it will be closed first.
+If you want to open a subsection (indent) you type:
+
+Code:
+==+ title == [id]
+
+Note the '$$+$$' that signals: I want to increment the level of indentation.
+
+While you can only increment by steps of one, you can dedent arbitrarily
+using:
+
+Code:
+==-- title == [id]
+
+Here we dedent by two, which effectively results in closing the last three
+sections...and then opening the new one.
+
+Larger levels of dedent can be directly entered with a single minus, followed
+by an integer number:
+
+Code:
+==-7 title == [id]
+
+At the end of the text, all sections that are still open get closed
+automatically.
+
+Finally, you can jump to a lower indent by giving the target section level
+directly after the starting tag:
+
+Code:
+==0 title == [id]
+
+for starting a new section at the top level (all opened sections are closed first).
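The header forms described in this section can be told apart with a single regular expression. The following is a sketch under assumptions: `HEADER`, `parse_header` and the action names are hypothetical, and the numeric alternative is deliberately tried before the run of minus signs, because a pattern like `-*` also matches the empty string and would otherwise win.

```python
import re

# Indent spec: '+', a (possibly negative) number, or a run of '-' (maybe empty).
HEADER = re.compile(r"^==(\+|-?[0-9]+|-*)\s*([^=]+)\s*=*\s*(.*)$")

def parse_header(line):
    """Return (action, title, id) for a section header line, else None."""
    m = HEADER.match(line)
    if m is None:
        return None
    spec = m.group(1)
    title = m.group(2).rstrip()
    sec_id = m.group(3).strip()
    if spec == "":
        action = ("same", 0)            # keep the current level
    elif spec == "+":
        action = ("indent", 1)          # open a subsection
    elif spec.startswith("-"):
        if spec.count("-") == len(spec):
            action = ("dedent", len(spec))      # '--' means dedent by two
        else:
            action = ("dedent", int(spec[1:]))  # '-7' means dedent by seven
    else:
        action = ("goto", int(spec))    # jump to an absolute level
    return action, title, sec_id
```

The sketch only classifies the header; how many open sections get closed for each action is left to the caller.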
+
+== Simple paragraphs == para
+
+The following markups are local to a text block. They have to appear in
+matched pairs, because they are not closed automatically at the end of
+the block.
+
+Emphasis (em)
+
+Code:
+//emphasis//
+
+Bold (strong)
+
+Code:
+!!bold!!
+
+HTML link
+
+Code:
+[[URL text]]
+
+Code words, variables, verbatim inline text (code)
+
+Code:
+$$optionList$$
+
+Images
+
+Code:
+<<URL>>
+or
+<<URL||alt="alt" name="" width=""...>>
+
+Anchor (<anchor id=""/>)
+
+Code:
+@@label_id@@
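As a rough illustration of the paired inline markers listed above, the sketch below turns emphasis, bold, inline code and links into Forrest-style tags. The names `INLINE`, `LINK` and `render_inline` are assumptions made for this example, and images and anchors are omitted.

```python
import re

# (pattern, opening tag, closing tag) for the simple paired markers.
INLINE = [
    (re.compile(r"//([^/]*)//"), "<em>", "</em>"),
    (re.compile(r"!!([^!]*)!!"), "<strong>", "</strong>"),
    (re.compile(r"\$\$([^$]*)\$\$"), "<code>", "</code>"),
]
# [[URL text]]: the URL runs up to the first whitespace.
LINK = re.compile(r"\[\[(\S+)\s+([^\]]*)\]\]")

def _wrap(open_tag, close_tag):
    """Build a replacement function that wraps group 1 in the given tags."""
    return lambda m: open_tag + m.group(1) + close_tag

def render_inline(text):
    """Replace links first, then the paired inline markers."""
    text = LINK.sub(
        lambda m: '<a href="%s">%s</a>' % (m.group(1), m.group(2)), text)
    for pattern, open_tag, close_tag in INLINE:
        text = pattern.sub(_wrap(open_tag, close_tag), text)
    return text
```

An unmatched marker such as a lone `//` is simply left in the text, which is consistent with the requirement that markers appear in pairs.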
+
+
+A paragraph can also be started as environment with the '$$Para:$$' keyword:
+
+{{Code:
+Para:
+Starts a new paragraph at the current indent.
+Para:-2
+Closes the last two environments in the current section and starts a new paragraph
+}}
+
+These become important when you want to mix paragraphs and lists/list items.
+
+== Lists ==
+
+Within a list block you can indent/dedent the item level and also
+change between ordered, unordered and description lists.
+The opening and closing of the single environments is handled by xmlwiko.
+
+{{Code:
+# first
+# second
+# new first
+# parent 1
+## child 1
+##* non numerated child
+##* non numerated child
+## child 2
+### subchild 1
+## child 3
+##~ dt||dd (description list)
+# parent 2
+}}
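One way to read the item prefixes in the example above is sketched below. It assumes that the number of markers gives the nesting depth and that the last marker selects the list type at that depth; `ITEM`, `KINDS` and `parse_item` are hypothetical names.

```python
import re

ITEM = re.compile(r"^([*#~]+)\s*(.*)$")
KINDS = {"#": "ordered", "*": "unordered", "~": "description"}

def parse_item(line):
    """Return (depth, list kind, item text) for a list line, else None."""
    m = ITEM.match(line)
    if m is None:
        return None
    markers, text = m.group(1), m.group(2)
    # Depth is the prefix length; the last marker names the innermost list.
    return len(markers), KINDS[markers[-1]], text
```

Opening and closing the nested list tags from a sequence of such tuples is a separate step that is not shown here.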
+
+
+Also possible for mixing paragraphs and list items:
+
+{{Code:
+List:-4
+# Closes the last four envs before opening a new list.
+List:
+#* Opens a new list at the current env level.
+}}
+
+
+== Environments ==
+
+Just like normal sections, environments may contain several paragraphs.
+
+
+Code:
+
+versus
+
+{{Code:
+
+}}
+
+Available environments are: Abstract, Keywords, TODO, Code, Comment, Definition,
+Lemma, Proof, Theorem, Corollary, Quote
+
+Fixme, Warning, Note, Important
+
+e.g.:
+<note label=""></note>
+
+
+<source xml:space="preserve">
+  code text
+</source>
+
+Not to forget the two specials: List and Para (for mixed list/paragraph sections).
+
+== Special stuff ==
+
+The \blank marker was introduced as an escape sequence. If you want to
+print code that contains the end marker }}, put the marker between the two
+braces; it is removed from the output.
+Simply write
+
+Code:
+{{Code:
+  A test
+}\bla\blanknk}
+}}
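A small sketch of how the escape could be stripped. `ESCAPE` and `strip_blank_markers` are names chosen for this example. Note that in a Python string literal the backslash has to be doubled (or a raw string used), since `"\b"` on its own is the backspace character.

```python
ESCAPE = "\\blank"  # literal backslash followed by 'blank'

def strip_blank_markers(text):
    """Remove every escape marker in a single left-to-right pass."""
    return text.replace(ESCAPE, "")
```

Because the replacement is a single pass, the nested form in the example above loses only its inner marker and comes out as a visible marker between the two braces.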
+
+@title: Possible transitions of xmlwiko environments
+@author: Dirk Baechle
+
+void -> section
+void -> par
+void -> env
+
+par -> section*
+par -> section-
+
+section* -> section*
+section* -> par
+section* -> env
+
+par -> env
+env -> par
+env -> section
+
+
+par -> par
+
+
+Processing: Do not collect all pars in a section to one block but...
+process single par blocks individually and close pending envs (em, strong) at the end of the block!
+We need to keep track of which env (=mode) we are in, in order to do the right thing.
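One possible way to keep track of the current mode is a stack of open environments. The sketch below is an assumption for illustration, not the script's own code: `close_blocks` is a hypothetical helper that pops blocks until a given number of blocks of one type have been closed.

```python
def close_blocks(open_blocks, tag, num=1):
    """Pop blocks until `num` blocks named `tag` are closed.

    Returns the names of the closed blocks, innermost first.
    """
    closed = []
    count = 0
    while open_blocks and count < num:
        top = open_blocks.pop()
        closed.append(top)
        if top == tag:
            count += 1
    return closed
```

Anything nested inside the requested block (for example a pending paragraph) is closed along the way, and a request for zero blocks closes nothing.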
+
-#!/usr/bin/python
-
+# Copyright (c) 2009 Dirk Baechle.
+# www: http://www.mydarc.de/dl9obn/programming/python/xmlwiko
+# mail: dl9obn AT darc.de
+#
+# This program is free software; you can redistribute it and/or modify it under
+# the terms of the GNU General Public License as published by the Free Software
+# Foundation; either version 2 of the License, or (at your option) any later
+# version.
+#
+# This program is distributed in the hope that it will be useful, but WITHOUT
+# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+# FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License along with
+# this program; if not, write to the Free Software Foundation, Inc.,
+# 59 Temple Place - Suite 330, Boston, MA  02111-1307, USA.
 """
-WiKo: this script generates a web pages, PDF article or blog
-considering the files found in the local directory.
-See http://www.iua.upf.edu/~dgarcia/wiko/usage.html
+xmlwiko: This script generates XML files as input to ApacheForrest or Docbook from Wiki like input.
+         Inspired by WiKo (the WikiCompiler, http://wiko.sf.net) it tries to simplify
+         the setup and editing of web pages (for Forrest) or simple manuals and descriptions (Docbook).
 """
-
-# Bugs: 
-# * @cite:some at end of line, or @cite:some, something.
-# * don't work with iso encoding only utf8
  
-# TODOs:
-# * refactor most behaviour to a base class (done in-flight and lost)
-# * use @toc in the .wiki file
-# * deactivate implicit <pre> mode when in explicit "{{{ }}}" <pre> mode
-# * bullets should allow breaking line into a new line with spaces.
-
 import glob
 import os.path
 import re
 import urllib
 import codecs
 
-def formulaIdGen(): a=0; yield a; a+=1
-def equationIdGen(): a=0; yield a; a+=1
+def processVerbatim(txt, language):
+    if language.strip() == "":
+        return txt
+    else:    
+        try :
+            from pygments import highlight
+            from pygments.lexers import get_lexer_by_name
+            from pygments.formatters import HtmlFormatter
+        except:
+            return txt
+        file("style_code.css",'w').write(HtmlFormatter().get_style_defs('.code'))
 
-class HtmlVerbatimProcessor :
-	def __init__(self) :
-		self.content=[]
-	def __call__(self, line) :
-		if line is None:
-			return "\n".join(self.content).replace("%","%%")
-		self.content.append(line)
-		return ""
+        try:
+            lexer = get_lexer_by_name(language, stripall=True)
+            formatter = HtmlFormatter(linenos=False, cssclass="code")
+        except:
+            return txt
+        return highlight(txt, lexer, formatter)
 
-def formulaUri(latexContent) :
-	if useRemoteFormulas :
-		return "http://www.forkosh.dreamhost.com/mimetex.cgi?"+latexContent
-	mimetex = subprocess.Popen(["mimetex","-d",latexContent], stdout=subprocess.PIPE)
-	imageContent=mimetex.stdout.read()
-	if embeddedFormulas :
-		import base64
-		url = "data:image/png;base64,"+ base64.b64encode(imageContent)
-		return url
-	if not os.access("formulas",os.F_OK) :
-		os.mkdir("formulas")
-	id = formulaIdGen().next()
-	gifname = "formulas/eq%06i.gif"%id
-	print "generating",gifname
-	gif = open(gifname,'wb')
-	gif.write(imageContent)
-	gif.close()
-	return gifname
+# The numeric form has to be tried before '-*', which also matches the empty string
+header = re.compile(r"^==(\+|-?[0-9]+|-*)\s*([^=]+)\s*=*\s*(.*)$")
 
-class HtmlFormulaProcessor :
-	def __init__(self, match) :
-		self.content=[]
-	def __call__(self, line) :
-		if line is None:
-			return '"'+formulaUri("\Large{"+"".join(self.content)+"}")+'"'
-		self.content.append(line.strip())
-		return ""
+# Regular expressions
+em = re.compile(r"//([^/]*)//")
+strong = re.compile(r"!!([^!]*)!!")
+quote = re.compile(r"''([^']*)''")
+code = re.compile(r"\$\$([^\$]*)\$\$")
+url = re.compile(r"\[\[([^\s]*)\s+([^\]]*)\]\]")
+anchor = re.compile(r"@@([^@]*)@@")
+img = re.compile(r"<<([^>]*)>>")
 
-class HtmlCodeProcessor :
-	def __init__(self, match) :
-		self.content=[]
-		self.language=match.group(1) or "javascript"
-	def __call__(self, line) :
-		if line is not None:
-			self.content.append(line)
-			return ""
-		try :
-			from pygments import highlight
-			from pygments.lexers import get_lexer_by_name
-			from pygments.formatters import HtmlFormatter
-		except:
-			print >> sys.stderr, "Warning: Pygments package not available. Generating code without syntax highlighting."
-			return "\n".join(self.content)
-		file("style_code.css",'w').write(HtmlFormatter().get_style_defs('.code'))
+li  = re.compile(r"^([*#~]+)(.*)")
+var = re.compile(r"^@([^:]*): (.*)")
 
-		lexer = get_lexer_by_name(self.language, stripall=True)
-		formatter = HtmlFormatter(linenos=False, cssclass="code")
-		return highlight("\n".join(self.content), lexer, formatter)
+env = re.compile(r"^({*)([a-zA-Z]+):(-?[0-9]+|-*)\s*(.*)$")
+closeenv = re.compile(r"^}}\s*$")
 
-def htmlInlineFormula(match) :
-	formula = match.group(1)
-	return '<img class="inlineFormula" src="%s" alt="%s" />'%(formulaUri(formula), formula)
+# Forrest output tags
+envTagsForrest = {
+           'Section' : ['<section id="%(id)s"><title>%(title)s</title>', '</section>', True],
+           'Para' : ['<p>', '</p>', False],
+           'Code' : ['<source xml:space="preserve">', '</source>', False],
+           'Figure' : ['', '', False],           
+           'Abstract' : ['<p><strong>Abstract:</strong></p>', '', True],
+           'Remark'  : ['<p><strong>Remark:</strong></p>', '', True],
+           'Note'  : ['<note>', '</note>', False],
+           'Important'  : ['<p><strong>Important:</strong></p>', '', True],
+           'Warning'  : ['<warning>', '</warning>', False],
+           'Caution'  : ['<p><strong>Caution:</strong></p>', '', True],
+           'Keywords' : ['<p><strong>Keywords:</strong></p>', '', True],
+           'TODO'     : ['<p><strong>TODO:</strong></p>', '', True],
+           'Definition'  : ['<p><strong>Definition:</strong></p>', '', True],
+           'Lemma'    : ['<p><strong>Lemma:</strong></p>', '', True],
+           'Proof'    : ['<p><strong>Proof:</strong></p>', '', True],
+           'Theorem'  : ['<p><strong>Theorem:</strong></p>', '', True],
+           'Corollary': ['<p><strong>Corollary:</strong></p>', '', True]
+}
+listTagsForrest = {'#' : ['<ol>', '</ol>'],
+            '*' : ['<ul>', '</ul>'],
+            '~' : ['<dl>', '</dl>'],
+            'olItem' : ['<li>', '</li>'],
+            'ulItem' : ['<li>', '</li>'],
+            'dtItem' : ['<dt>', '</dt>'],
+            'ddItem' : ['<dd>', '</dd>'],
+           }
+inlineTagsForrest = {'em' : ['<em>', '</em>'],
+              'strong' : ['<strong>', '</strong>'],
+              'quote' : ['&quot;', '&quot;'],
+              'code' : ['<code>', '</code>'],
+              'anchor' : ['<a id="', '"/>']}
+dictTagsForrest = {'ulink' : '<a href="%(url)s"%(atts)s>%(linktext)s</a>',
+                   'inlinemediaobject' : '<img src="%(fref)s"%(atts)s/>',
+                   'mediaobject' : '<figure src="%(fref)s"%(atts)s/>',
+                   'figure' : '<figure src="%(fref)s"%(atts)s/><p><strong>Figure</strong>: %(title)s</p>'
+                  }
 
-inlineHtmlSubstitutions = [  # the order is important
-	(r"%%", r"%"),
-	(r"%([^(])", r"%%\1"),
-	(r"'''(([^']|'[^']|''[^'])*)'''", r"<b>\1</b>"),
-	(r"''(([^']|'[^'])*)''", r"<em>\1</em>"),
-	(r"\[\[(\S+)\s([^\]]+)\]\]", r"<a href='\1'>\2</a>"),
-	(r"\[\[(\S+)\]\]", r"<a href='\1'>\1</a>"),
-	(r"\[(http://\S+)\s([^\]]+)\]", r"<a href='\1'>\2</a>"),
-	(r"\[(http://\S+)\]", r"<a href='\1'>\1</a>"),
-	(r"\\ref{([-+_a-zA-Z0-9:]+)}", r"<a href='#\1'>\1</a>"), # TODO: numbered figures?
-	(r"`([^`]+)`", htmlInlineFormula),
-#	(r"{{{", r"<pre>"),
-#	(r"}}}", r"</pre>"),
-	(r"^@toc\s*$", r"%(toc)s"),
-	(r"^BeginProof\n*$", r"<div class='proof'><b>Proof:</b>"),
-	(r"^EndProof\n*$", r"</div>"),
-	(r"^BeginDefinition\n*$", r"<div class='definition'><b>Definition:</b>"),
-	(r"^EndDefinition\n*$", r"</div>"),
-	(r"^BeginTheorem\n*$", r"<div class='theorem'><b>Theorem:</b>"),
-	(r"^EndTheorem\n*$", r"</div>"),
-]
-
-header = re.compile(r"^(=+)([*]?)\s*([^=]+?)\s*\1\s*$")
-headersHtml = [
-	r"<section id='toc_%(n)s'><title>%(title)s</title>",
-	r"<section id='toc_%(n)s'><title>%(title)s</title>",
-	r"<section id='toc_%(n)s'><title>%(title)s</title>",
-	r"<section id='toc_%(n)s'><title>%(title)s</title>",
-	r"<section id='toc_%(n)s'><title>%(title)s</title>",
-]
-
-li  = re.compile(r"^([*#]+)(.*)")
-quote = re.compile(r"^[ \t](.*)")
-var = re.compile(r"^@([^:]*): (.*)")
-fig = re.compile(r"^Figure:[\s]*([^\s]+)[\s]*([^\s]+)(.*)");
-figs = re.compile(r"^Figures:[\s]*([^\s]+)[\s]*(.*)");
-todo = re.compile(r"^TODO:[\s]*(.+)");
-anno = re.compile(r"^:([^\s]+):[\s]*(.*)");
-code = re.compile(r"^Code:[\s]*([^\s]+)?");
-label = re.compile(r"^Label:[\s]*([^\s]+)");
-div = re.compile(r"^([a-zA-Z0-9]+):$")
-pre = re.compile(r"^{{{[\s]*([^\s])*")
-close = re.compile(r"^---[\s]*([^\s]+)?");
-dtdd = re.compile(r"^{{([^(\|])*\|\|[\s]}}")
-
-divMarkersLatex = {
-	'Abstract' : ('\\begin{abstract}', '\\end{abstract}'),
-	'Keywords' : ('\\begin{keywords}', '\\end{keywords}'),
-	'Equation' : ('\\begin{equation}', '\\end{equation}'),
-#	'Math' : ('\\[', '\\]'),
-	'Theorem': ('\\begin{thma}', '\\end{thma}'),
-	'Lemma': ('\\begin{lem}', '\\end{lem}'),
-	'Corollary': ('\\begin{cor}', '\\end{cor}'),
-	'Proof': ('\\begin{pro}', '\\end{pro}'),
-	'Definition': ('\\begin{defin}', '\\end{defin}'),
-	#TODO: add new keys added in html
-}
-
-divMarkersHtml = {
-	'Abstract' : ('<div class="abstract"><b>Abstract:</b>', '</div>'),
-	'Keywords' : ('<div class="keywords"><b>Keywords:</b>', '</div>'),
-	'Equation' : ("<div class='equation'><img src=", " /><!--<span class='eqnumber'>(123)</span>--></div>", HtmlFormulaProcessor),
-	'Math'     : ("<div class='equation'><img src=", " /></div>", HtmlFormulaProcessor),
-	'TODO'     : ('<div class="todo"><b>TODO:</b>', '</div>'),
-	'Comment'  : ('<div class="comment"><b>Comment:</b>', '</div>'),
-	'Definition'  : ('<div class="definition"><b>Definition:</b>', '</div>'),
-	'Lemma'    : ('<div class="lemma"><b>Lemma:</b>', '</div>'),
-	'Proof'    : ('<div class="proof"><b>Proof:</b>', '</div>'),
-	'Theorem'  : ('<div class="theorem"><b>Theorem:</b>', '</div>'),
-	'Corollary': ('<div class="corollary"><b>Corollary:</b>', '</div>'),
-}
-
-defaultForrestSkeleton = u"""<?xml version="1.0" encoding="utf-8"?>
+defaultSkeletonForrest = u"""<?xml version="1.0" encoding="utf-8"?>
 <!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "http://forrest.apache.org/dtd/document-v20.dtd">
 <document>
   <header>
 </document>
 """
 
+# Docbook output tags
+envTagsDocbook = {
+           'Section' : ['<section id="%(id)s"><title>%(title)s</title>', '</section>', True],
+           'Para' : ['<para>', '</para>', False],
+           'Code' : ['<screen xml:space="preserve">', '</screen>', False],
+           'Figure' : ['', '', False],
+           'Abstract' : ['<abstract>', '</abstract>', True],
+           'Remark'  : ['<remark>', '</remark>', True],
+           'Note'  : ['<note>', '</note>', True],
+           'Important'  : ['<important>', '</important>', True],
+           'Warning'  : ['<warning>', '</warning>', True],
+           'Caution'  : ['<caution>', '</caution>', True],
+           'Keywords': ['<remark><para>Keywords:</para>', '</remark>', True],
+           'TODO': ['<remark><para>TODO:</para>', '</remark>', True],
+           'Definition': ['<remark><para>Definition:</para>', '</remark>', True],
+           'Lemma': ['<remark><para>Lemma:</para>', '</remark>', True],
+           'Proof': ['<remark><para>Proof:</para>', '</remark>', True],
+           'Theorem': ['<remark><para>Theorem:</para>', '</remark>', True],
+           'Corollary': ['<remark><para>Corollary:</para>', '</remark>', True]
+}
+listTagsDocbook = {'#' : ['<orderedlist>', '</orderedlist>'],
+            '*' : ['<itemizedlist>', '</itemizedlist>'],
+            '~' : ['<variablelist>', '</variablelist>'],
+            'olItem' : ['<listitem>', '</listitem>'],
+            'ulItem' : ['<listitem>', '</listitem>'],
+            'dtItem' : ['<varlistentry><term>', '</term>'],
+            'ddItem' : ['<listitem>', '</listitem></varlistentry>'],
+           }
+inlineTagsDocbook = {'em' : ['<emphasis>', '</emphasis>'],
+              'strong' : ['<emphasis role="bold">', '</emphasis>'],
+              'quote' : ['<quote>', '</quote>'],
+              'code' : ['<code>', '</code>'],
+              'anchor' : ['<a id="', '"/>']}
+dictTagsDocbook = {'ulink' : '<ulink url="%(url)s">%(linktext)s</ulink>',
+                   'inlinemediaobject' : '<inlinemediaobject><imageobject><imagedata fileref="%(fref)s"%(atts)s/></imageobject></inlinemediaobject>',
+                   'mediaobject' : '<mediaobject><imageobject><imagedata fileref="%(fref)s"%(atts)s/></imageobject></mediaobject>',
+                   'figure' : '<figure><title>%(title)s</title><mediaobject><imageobject><imagedata fileref="%(fref)s"%(atts)s/></imageobject></mediaobject></figure>'
+                  }
+
+defaultSkeletonDocbook = u"""<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
+"http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd">
+<article>
+  <title>%(title)s</title>
+  <articleinfo>
+    <author>
+      <surname>%(author)s</surname>
+    </author>
+  </articleinfo>
+%(content)s
+</article>
+"""
+
+
 def stripUtfMarker(content) :
-	import codecs
-	return content.replace( unicode(codecs.BOM_UTF8,"utf8"), "")
+    import codecs
+    return content.replace( unicode(codecs.BOM_UTF8,"utf8"), "")
 
 def readUtf8(filename) :
-	print "Reading",filename
-	return stripUtfMarker(codecs.open(filename,'r','utf8').read())
+    print "Reading",filename
+    return stripUtfMarker(codecs.open(filename,'r','utf8').read())
 
 def loadOrDefault(filename, defaultContent) :
-	try: return readUtf8(filename)
-	except: return defaultContent
+    try: return readUtf8(filename)
+    except: return defaultContent
 
 def writeUtf8(filename, content) :
-	import codecs, os
-	print "Generating",filename
-	path = filename.split("/")[:-1]
-	for i in range(len(path)) :
-		try : os.mkdir("/".join(path[:i+1]))
-		except: pass
-	basepath='../'*len(path)
-	content=content.replace('<!--base-->','<base href="%s" />'%basepath)
-	codecs.open(filename, "w",'utf8').write(content)
+    import codecs, os
+    path = filename.split("/")[:-1]
+    for i in range(len(path)) :
+        try : os.mkdir("/".join(path[:i+1]))
+        except: pass
+    codecs.open(filename, "w",'utf8').write(content)
+
+def tos(seq):
+    if len(seq) > 0:
+        return seq[-1]
+    else:
+        return ""
 
 class WikiCompiler :
 
-	def compileInlines(self, inlines) :
-		self.inlines = [ (re.compile(wikipattern), substitution) 
-			for wikipattern, substitution in inlines  ]
-	def substituteInlines(self, line) :
-		for compiledPattern, substitution in self.inlines :
-			line = compiledPattern.sub(substitution, line)
-		return line
+    def closeAllOpenedBlocks(self):
+        while len(self.openBlocks):
+            tos = self.openBlocks.pop()
+            self.result += "%s\n" % self.envTags[tos][1]
+            
+    def closeOpenedBlocks(self, tag, num=1):
+        cnt = 0
+        while len(self.openBlocks):
+            tos = self.openBlocks.pop()
+            self.result += "%s\n" % self.envTags[tos][1]
+            if tos == tag:
+                cnt += 1
+            if cnt == num:
+                break
+            
+    def process(self, content) :
+        self.itemLevel = ""
+        self.closing=""
+        self.result=""
+        
+        self.vars = {
+            'title': '',
+            'author': ''
+        }
+        # Collect simple blocks
+        self.openBlocks = []
+        currentBlock = []
+        self.currentEnvironment = ""
+        self.currentText = ""
+        self.codeType = ""
+        self.lastBlock = None
+        self.sectionIndent = 0
+        self.paraIndent = 0
+        for line in content.splitlines():
+            if line.strip() == "":
+                # Line is empty
+                if len(currentBlock):
+                    # Current block was closed...so process it
+                    self.processBlock(currentBlock)
+                    currentBlock = []
+            else:
+                varMatch = var.match(line)
+                if varMatch:
+                    # Catch vars
+                    key = varMatch.group(1)
+                    if key in self.vars:
+                        self.vars[key] = varMatch.group(2)
+                else:
+                    # Continue to collect lines
+                    currentBlock.append(line)
+                
+        # Process the final block
+        if len(currentBlock):
+            self.processBlock(currentBlock)
 
-	def openDiv(self, markers, divMatch):
-		divType = divMatch.group(1)
-		try : divDef = list(markers[divType])
-		except : return False
-		if len(divDef) == 3 :
-			divDef[2] = divDef[2](divMatch)
-		self.openBlock(*divDef)
-		return True
-	def openBlock(self,opening,closing, processor=None):
-		self.closeAnyOpen()
-		self.result.append(opening)
-		self.closing=closing	
-		if processor :
-			self.processor=processor
-	def closeAnyOpen(self) :
-		if self.closing == "" : return
-		if self.processor : self.result.append(self.processor(None))
-		self.processor=None
-		self.result.append(self.closing)
-		self.closing=""
+        # Close all blocks that are still opened
+        self.closeAllOpenedBlocks()
 
-	def addToc(self, level, title) :
-		self.toc.append( (level, title) )
-		return len(self.toc)
-	def buildToc(self) :
-		"""Default, empty toc"""
-		return ""
+        self.vars["content"] = self.result
+        
+        return self.vars
 
-	def process(self, content) :
-		self.itemLevel = ""
-		self.closing=""
-		self.result=[]
-		self.spanStack = []
-		self.toc = []
-		self.vars = {
-			'title': '',
-			'author': '',
-		}
-		for line in content.splitlines() :
-			self.processLine(line)
-		self.processLine("")
+    def processBlock(self, block):
+        # Step 1: Identify block
+        blockType = "None"
+        addIndent = ""
+        blockSpec = ""
+        sectionTitle = ""
+        sectionId = ""
+        envMatch = env.match(block[0])
+        envStarted = False
+        envStopped = False
+        if envMatch and envMatch.start() == 0:
+            blockStart = envMatch.group(1)
+            blockType = envMatch.group(2)
+            addIndent = envMatch.group(3)
+            blockSpec = envMatch.group(4)
+            self.currentEnvironment = blockType
+            if blockStart == "{{":
+                envStarted = True
+                # Does the env also end in this block?
+                endMatch = closeenv.match(block[-1])
+                if endMatch and endMatch.start() == 0:
+                    # Yes
+                    text = "\n".join(block[1:-1])
+                    envStopped = True
+                else:
+                    text = "\n".join(block[1:])
+            else:
+                # Single block env
+                envStarted = True
+                envStopped = True
+                text = "\n".join(block[1:])
+        else:
+            if len(self.currentEnvironment):
+                # Does the env end in this block?
+                endMatch = closeenv.match(block[-1])
+                if endMatch and endMatch.start() == 0:
+                    # Yes
+                    text = "\n".join(block[:-1])
+                    envStopped = True
+                else:
+                    text = "\n".join(block)
+            else:
+                text = "\n".join(block)
+        
+        if blockType == "None":
+            # Is it a list block?
+            listMatch = li.match(text)
+            if listMatch and listMatch.start() == 0:
+                blockType = "List"
+            else:
+                varMatch = var.match(text)
+                if varMatch and varMatch.start() == 0:
+                    blockType = "Var"
+                    for l in block:
+                        varMatch = var.match(l)
+                        if varMatch:
+                            self.vars[varMatch.group(1)] = varMatch.group(2)
+                    return
+                else:
+                    # Is it a section header?
+                    headerMatch = header.match(block[0])
+                    if headerMatch and headerMatch.start() == 0:
+                        blockType = "Section"
+                        addIndent = headerMatch.group(1)
+                        sectionTitle = headerMatch.group(2).rstrip()
+                        sectionId = headerMatch.group(3)
+                        if sectionId.strip() == "":
+                            sectionId = '_'.join([f.lower() for f in sectionTitle.split()])
+        
+        # Step 2: Close old envs, based on block type and current indents
+        if blockType == "Section":
+            # Normal indentation?
+            if addIndent == "":
+                # Yes
+                self.closeOpenedBlocks(blockType)
+                self.sectionIndent -= 1
+            else:
+                # No
+                if addIndent[0] == "-":
+                    # Extract depth
+                    mcnt = 1
+                    if len(addIndent) > 1:
+                        mcnt = addIndent.count('-')
+                        if mcnt == 1:
+                            # Based on number
+                            mcnt = int(addIndent[1:])
+                    self.closeOpenedBlocks(blockType, mcnt+1)
+                    self.sectionIndent -= mcnt+1
+                elif addIndent[0] != "+":
+                    # Jump to given section depth
+                    mcnt = self.sectionIndent - int(addIndent)
+                    self.closeOpenedBlocks(blockType, mcnt)
+                    self.sectionIndent -= mcnt
+                                        
+        # Step 3: Open new section
+        if blockType == "Section":
+            self.openBlocks.append('Section')
+            self.sectionIndent += 1
+            self.result += "%s\n" % (self.envTags['Section'][0] % {'title':sectionTitle, 'id':sectionId})
+            return
+        
+        # Step 4: Process block
+        # Step 4a: Convert list items, if required
+        if blockType == "List":
+            text = self.processList(text)
+        elif blockType == "Figure":
+            fighref = blockSpec
+            seppos = fighref.find("||")
+            if seppos > 0:
+                figatts = ' '+fighref[seppos+2:]
+                fighref = fighref[:seppos]
+            else:
+                figatts = ' alt="'+fighref+'"'
+             
+            if text.strip() != "":
+                # Figure with title
+                text = self.dictTags['figure'] % {'fref' : fighref,
+                                                  'atts' : figatts,
+                                                  'title' : text}
+            else:
+                # No title
+                text = self.dictTags['mediaobject'] % {'fref' : fighref,
+                                                       'atts' : figatts}
+        elif blockType != "Code":
+            if blockType != "None":
+                if self.envTags[blockType][2]:
+                    # Wrap text in para
+                    text = "%s%s%s\n" % (self.envTags['Para'][0], text, self.envTags['Para'][1])
+            else:
+                # Wrap text in para
+                text = "%s%s%s\n" % (self.envTags['Para'][0], text, self.envTags['Para'][1])                
+        else:
+            text = processVerbatim(text, blockSpec)        
+            
+        # Step 4b: Replace inline expressions
+        if blockType != "Code":
+            text = self.inlineReplace(text)
 
-		self.vars["content"] = ("\n".join(self.result)) % {
-			'toc': self.buildToc(),
-		}
-		return self.vars
+        # Step 5: Wrap block in environment tags
+        if envStarted:
+            text = "%s\n%s" % (self.envTags[self.currentEnvironment][0],text)
+            self.openBlocks.append(blockType)
+        if envStopped:
+            text = "%s\n%s\n" % (text, self.envTags[self.currentEnvironment][1])
+            self.currentEnvironment = ""
+            self.openBlocks.pop()
+            
+        # Step 6: Add text to result
+        self.result += text
 
-class HtmlCompiler(WikiCompiler) :
-	def __init__(self) :
-		self.compileInlines(inlineHtmlSubstitutions)
-		self.headerPatterns = headersHtml
-		self.processor = None
-	def buildToc(self) :
-		result = []
-		lastLevel = 0
-		i=1
-		result+=["<h2>Index</h2>"]
-		result+=["<div class='toc'>"]
-		for (level, item) in self.toc :
-			while lastLevel < level :
-				result += ["<ul>"]
-				lastLevel+=1
-			while lastLevel > level :
-				result += ["</ul>"]
-				lastLevel-=1
-			result+=["<li><a href='#toc_%i'>%s</a></li>"%(i,item)]
-			i+=1
-		while lastLevel > 0 :
-			result += ["</ul>"]
-			lastLevel-=1
-		result += ["</div>"]
-		return "\n".join(result)
+    def replaceAll(self, text, regex, starttag, endtag):
+        match = regex.search(text)
+        while match:
+            text = text[:match.start()]+starttag+match.group(1)+endtag+text[match.end():]
+            match = regex.search(text)
+            
+        return text
+        
+    def inlineReplace(self, text):
+        # Find and replace URLs
+        uMatch = url.search(text)
+        while uMatch:
+            href = uMatch.group(1)
+            atxt = uMatch.group(2)
+            urlatts = ""
+            seppos = atxt.find("||")
+            if seppos > 0:
+                urlatts = ' '+atxt[:seppos]
+                atxt = atxt[seppos+2:]
+            text = (text[:uMatch.start()] + 
+                    self.dictTags['ulink'] % {'url' : href,
+                                              'atts' : urlatts,
+                                              'linktext' : atxt} + 
+                    text[uMatch.end():])
+            uMatch = url.search(text)
+            
+        # Find and replace images
+        iMatch = img.search(text)
+        while iMatch:
+            href = iMatch.group(1)
+            urlatts = ""
+            seppos = href.find("||")
+            if seppos > 0:
+                urlatts = ' '+href[seppos+2:]
+                href = href[:seppos]
+            else:
+                urlatts = ' alt="'+href+'"'
+            text = (text[:iMatch.start()] +
+                    self.dictTags['inlinemediaobject'] % {'fref' : href,
+                                                          'atts' : urlatts} +
+                    text[iMatch.end():])
+                
+            iMatch = img.search(text)
+        
+        # Apply non-greedy inline substitutions to the joined block
+        text = self.replaceAll(text, em, self.inlineTags["em"][0], self.inlineTags["em"][1])
+        text = self.replaceAll(text, strong, self.inlineTags["strong"][0], self.inlineTags["strong"][1])
+        text = self.replaceAll(text, quote, self.inlineTags["quote"][0], self.inlineTags["quote"][1])
+        text = self.replaceAll(text, code, self.inlineTags["code"][0], self.inlineTags["code"][1])
+        text = self.replaceAll(text, anchor, self.inlineTags["anchor"][0], self.inlineTags["anchor"][1])
+        # Replace \blank escape sequences
+        text = text.replace("\\blank","")
+        return text
 
-	def processLine(self, line) :
-		newItemLevel = ""
-		liMatch = li.match(line)
-		quoteMatch = quote.match(line)
-		headerMatch = header.match(line)
-		varMatch = var.match(line)
-		figMatch = fig.match(line)
-		figsMatch = figs.match(line)
-		todoMatch = todo.match(line)
-		annoMatch = anno.match(line)
-		labelMatch = label.match(line)
-		codeMatch = code.match(line)
-		divMatch = div.match(line)
-		preMatch = pre.match(line)
-		closeMatch = close.match(line)
-		if self.closing == "</pre>" and line == "}}}" :
-			self.closeAnyOpen()
-			return
-		elif line=="" :
-			self.closeAnyOpen()
-			return
-		elif self.processor : 
-			self.processor(line)
-			return
-		elif varMatch :
-			self.vars[varMatch.group(1)] = varMatch.group(2)
-			print "Var '%s': %s"%(varMatch.group(1),varMatch.group(2))
-			return
-		if liMatch :
-			self.closeAnyOpen()
-			newItemLevel = liMatch.group(1)
-			line = "%s<li>%s</li>" %("\t"*len(newItemLevel), liMatch.group(2) )
-		while len(newItemLevel) < len(self.itemLevel) or  \
-				self.itemLevel != newItemLevel[0:len(self.itemLevel)]:
-#			print "pop '"+self.itemLevel+"','"+newItemLevel+"' "+self.itemLevel[-1]
-			if self.itemLevel[-1] == "*":
-                            tag = "ul"
-			else:
-                            tag = "ol"
-			self.result.append("%s</%s>"%("\t"*(len(self.itemLevel)-1),tag))
-			self.itemLevel=self.itemLevel[0:-1]
-		if quoteMatch:
-			if self.closing != "</blockquote>" :
-				self.openBlock("<blockquote>","</blockquote>")
-			line=line[1:] # remove the quoting indicator space
-		elif figMatch :
-			self.closeAnyOpen()
-			self.openBlock(
-				"<div class='figure' id='%(id)s'><img src='%(img)s' alt='%(id)s'/><br />\n"%{
-					'id':figMatch.group(1),
-					'img': figMatch.group(2),
-					},
-				"</div>\n")
-			return
-		elif figsMatch :
-			self.closeAnyOpen()
-			self.openBlock(
-				("<div class='figure' id='%(id)s'>\n"
-				+"".join(["<img src='%s' alt='%%(id)s'/><br />\n"%image for image in figsMatch.group(2).split()]))
-				%{
-					'id':figsMatch.group(1),
-					},
-				"</div>\n")
-			return
-		elif codeMatch :
-			self.closeAnyOpen()
-			self.openBlock(
-				"<code>",
-				"</code>",
-				HtmlCodeProcessor(codeMatch))
-			return
-		elif preMatch :
-			self.closeAnyOpen()
-			self.openBlock(
-				"<pre>",
-				"</pre>",
-				HtmlVerbatimProcessor())
-			return
-		elif todoMatch :
-			line=" <span class='todo'>TODO: %s</span> "%todoMatch.group(1)
-		elif annoMatch :
-			annotator = annoMatch.group(1)
-			text = annoMatch.group(2)
-			line=(" <a class='anno'><img alt='[Ann:%s]' src='stock_notes.png' />"+ 
-				"<span class='tooltip'><b>%s:</b> %s</span></a> ")%(annotator,annotator,text)
-		elif labelMatch :
-			line=" <a name='#%s'></a>"%labelMatch.group(1)
-		elif headerMatch :
-			self.closeAnyOpen()
-			title = headerMatch.group(3)
-			level = len(headerMatch.group(1))
-			n=self.addToc(level,title)
-			line = self.headerPatterns[level-1]%{
-				"title": title,
-				"n": n,
-				"level": level,
-			}
-		elif closeMatch :
-			line="</%s>"%closeMatch.group(1)
-		elif not liMatch : 
-			if divMatch :
-				if self.openDiv(divMarkersHtml, divMatch) :
-					return
-				print "Not supported block class '%s'" % divMatch.group(1)
-			elif self.closing == "" :
-				self.openBlock("<p>","</p>")
-		# Equilibrate the item level
-		while len(self.itemLevel) != len(newItemLevel) :
-			self.closeAnyOpen()
-#			print "push '"+self.itemLevel+"','"+newItemLevel+"'"
-			levelToAdd = newItemLevel[len(self.itemLevel)]
-			if levelToAdd == u"*":
-                            tag = "ul"
-                        else:
-                            tag = "ol"
-			self.result.append("%s<%s>"%("\t"*len(self.itemLevel),tag))
-			self.itemLevel += levelToAdd
-		if self.processor :
-			self.processor(line)
-		else :
-			line = self.substituteInlines(line)	
-			self.result.append(line)
+    def getListItemText(self, lastItem, lastText):
+        if lastItem == "":
+            return lastItem
+        
+        if lastItem[-1] != '~':
+            if lastItem[-1] == '#':
+                return "%s%s%s\n" % (self.listTags['olItem'][0],
+                                     lastText,
+                                     self.listTags['olItem'][1])
+            else:
+                return "%s%s%s\n" % (self.listTags['ulItem'][0],
+                                     lastText,
+                                     self.listTags['ulItem'][1])                
+        else:
+            fpos = lastText.find('||')
+            if fpos > 0:
+                return "%s%s%s%s%s%s%s%s\n" % (self.listTags['dtItem'][0],
+                                               lastText[:fpos],
+                                               self.listTags['dtItem'][1],
+                                               self.listTags['ddItem'][0],
+                                               self.envTags['Para'][0],
+                                               lastText[fpos+2:],
+                                               self.envTags['Para'][1],
+                                               self.listTags['ddItem'][1])
+            else:
+                return "%s%s%s%s%s\n" % (self.listTags['dtItem'][0],
+                                         lastText,
+                                         self.listTags['dtItem'][1],
+                                         self.listTags['ddItem'][0],
+                                         self.listTags['ddItem'][1])
+        return ""
+        
+    def processList(self, txt):
+        lines = txt.split('\n')
+        listStack = []
+        listIndent = 0
+        ltxt = ""
+        lastItem = ""
+        lastText = ""
+        curItem = ""
+        curText = ""
+        for l in lines:
+            lMatch = li.match(l)
+            if lMatch:
+                curItem = lMatch.group(1)
+                curText = lMatch.group(2)
+                
+                if lastText != "":
+                    # Emit last item
+                    ltxt += self.getListItemText(lastItem, lastText)
+                # Close old list envs
+                toclose = len(lastItem)-len(curItem)
+                if toclose >= 0:
+                    if lastItem.find(curItem) != 0:
+                        toclose += 1 # Changed env
+                while len(listStack) and toclose > 0:
+                    ltxt += "%s\n" % self.listTags[listStack.pop()][1]
+                    if len(listStack):
+                        # Pop enclosing <li> item
+                        ltxt += "%s\n" % self.listTags[listStack.pop()][1]                        
+                    toclose -= 1
+                    listIndent -= 1
+                    
+                # Open new list envs
+                toopen = len(curItem)-listIndent
+                print curItem, lastItem, listIndent
+                if toopen > 0 and curItem != lastItem:
+                    opencnt = listIndent
+                    if len(curItem) > 1:
+                        toopen += 1 # ugly hack
+                    print "opencnt", opencnt, toopen
+                    
+                    while opencnt < toopen:
+                        if opencnt > 0:
+                            # Prepend <li> for list item
+                            otag = 'olItem'
+                            print "prepending"
+                            if curItem[opencnt-1] == '*':
+                                otag = 'ulItem'
+                            ltxt += "%s" % self.listTags[otag][0]
+                            listStack.append(otag)
+                        ltxt += "%s\n" % self.listTags[curItem[opencnt]][0]
+                        listStack.append(curItem[opencnt])
+                        opencnt += 1
+                        listIndent += 1
+                    
+                lastItem = curItem
+                lastText = curText
+            else:
+                lastText += "\n%s" % l
+        # Emit last
+        if lastText != "":
+            ltxt += self.getListItemText(lastItem, lastText)
+                                                        
+        # Close remaining envs
+        while len(listStack):
+            ltxt += "%s\n" % self.listTags[listStack.pop()][1]
+        
+        return ltxt
+
+class ForrestCompiler(WikiCompiler):
+    def __init__(self):
+        self.envTags = envTagsForrest
+        self.listTags = listTagsForrest
+        self.inlineTags = inlineTagsForrest
+        self.dictTags = dictTagsForrest
+
+class DocbookCompiler(WikiCompiler):
+    def __init__(self):
+        self.envTags = envTagsDocbook
+        self.listTags = listTagsDocbook
+        self.inlineTags = inlineTagsDocbook
+        self.dictTags = dictTagsDocbook
 
 skeletonFileName = "skeleton.xml"
-skeleton = loadOrDefault(skeletonFileName, defaultForrestSkeleton)
+# Any command-line argument selects Docbook output; the default is Forrest
+if len(sys.argv) > 1:
+    skeleton = loadOrDefault(skeletonFileName, defaultSkeletonDocbook)
+    hComp = DocbookCompiler()
+else:
+    skeleton = loadOrDefault(skeletonFileName, defaultSkeletonForrest)
+    hComp = ForrestCompiler()
 
 # Generate XML files from content files + skeleton
 for path,dirs,files in os.walk('.'):
     for f in files:
         if f.endswith(".wiki"):
-	    contentFile = os.path.join(path, f)
-	    target = "".join(os.path.splitext(f)[0:-1])+".xml"
+            contentFile = os.path.join(path, f)
+            target = "".join(os.path.splitext(f)[0:-1])+".xml"
             target = os.path.join(path, target)
-	    content = readUtf8(contentFile)
-            print "Generating", target, "from", contentFile, "..."
-            htmlResult = HtmlCompiler().process(content)
-            htmlResult['wikiSource']=contentFile;
+            content = readUtf8(contentFile)
+            htmlResult = hComp.process(content)
             writeUtf8(target, skeleton%htmlResult)
-
-
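
A note on the $$\blank$$ replacement in inlineReplace: in an ordinary Python string literal the sequence \b denotes a backspace character, so matching the literal text \blank needs either a doubled backslash or a raw string. The following is a minimal Python 3 sketch of that behaviour, not code from this commit; the helper name strip_blank_escapes is hypothetical and not part of xmlwiko.

```python
# Hypothetical helper, not part of xmlwiko: removes the literal text
# backslash + "blank" from a string.
def strip_blank_escapes(text):
    # r"\blank" is a raw string: one backslash followed by "blank".
    return text.replace(r"\blank", "")


# An ordinary literal turns "\b" into a backspace character (0x08) ...
unescaped = "\blank"
# ... while the raw literal keeps the backslash as written.
escaped = r"\blank"

# The two strings differ in length and in their first character.
print(len(unescaped), len(escaped))
print(repr(unescaped[0]), repr(escaped[0]))

# Only the backslash form is removed by the helper.
print(repr(strip_blank_escapes("a" + escaped + "b")))
print(repr(strip_blank_escapes("a" + unescaped + "b")))
```

Writing the pattern as a raw string (or with a doubled backslash) keeps the search text identical to what an author types in a $$*.wiki$$ file.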