Feature Grammars

Feature Grammars are a technology for extracting hierarchical feature reports from an XML document.

The features may be designed up front, reflect clusters detected in some corpus, or be reverse-engineered from existing code.

The report is a simple XML document. It can be used to route the document according to the features it contains, as a plan that tells code which processing to enable or disable, to detect anomalies or other invalidities, or to mark or grade a document.

A kind of grammar is used to enable and report each feature, and to group required or optional subfeatures. The feature is actually detected by an XPath expression on the document, which allows full querying of the document.
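As a rough illustration of the idea described above, and not the actual Feature Grammar syntax, the following Python sketch pairs each feature name with an XPath test and nests subfeatures under their parent. The Feature class, the report function, the rule names and the sample document are all hypothetical. Python's standard ElementTree supports only a limited XPath subset, whereas the real tool has full XPath available through XSLT.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class Feature:
    """Hypothetical feature rule: a name, an XPath test, and subfeatures."""
    name: str
    xpath: str                      # limited to ElementTree's XPath subset
    children: list = field(default_factory=list)


def report(feature, doc_root):
    """Return a <feature> element if the XPath matches, else None.

    Subfeatures are only tested when their parent feature is present,
    so the report mirrors the hierarchy of the rules.
    """
    if doc_root.find(feature.xpath) is None:
        return None
    node = ET.Element("feature", name=feature.name)
    for child in feature.children:
        sub = report(child, doc_root)
        if sub is not None:
            node.append(sub)
    return node


# Illustrative rules and document (assumptions, not from the project).
rules = Feature("invoice", ".//invoice", [
    Feature("has-discount", ".//invoice/discount"),
    Feature("has-shipping", ".//invoice/shipping"),
])
doc = ET.fromstring("<order><invoice><discount>5</discount></invoice></order>")

result = report(rules, doc)
print(ET.tostring(result, encoding="unicode"))
```

Here the report contains the invoice feature with a single nested has-discount subfeature, because the sample document has no shipping element.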


Feature Grammars is based on about 15 years of my own reflection on meta-schemas for document clusters and Schematron.

This version follows the same architecture as my Schematron implementation: an XSLT script converts the Feature Grammar into XSLT code, which is then run against the document of interest to produce an XML output (or failure messages).
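The two-stage architecture can be mimicked in Python as an analogy only: a first stage generates source code from the rules, and a second stage runs that generated code against the document. Everything in this sketch (the compile_grammar and load_detector functions, the rule format, the sample document) is a hypothetical stand-in for the real XSLT-generating-XSLT pipeline.

```python
import xml.etree.ElementTree as ET


def compile_grammar(rules):
    """Stage 1: turn (feature name, XPath) pairs into Python source text.

    This stands in for the XSLT script that generates XSLT; the rule
    format here is hypothetical.
    """
    lines = ["def detect(root):", "    found = []"]
    for name, xpath in rules:
        # repr() quotes the strings safely inside the generated source.
        lines.append(f"    if root.find({xpath!r}) is not None:")
        lines.append(f"        found.append({name!r})")
    lines.append("    return found")
    return "\n".join(lines)


def load_detector(source):
    """Stage 2 setup: execute the generated source and return its function."""
    namespace = {}
    exec(source, namespace)
    return namespace["detect"]


rules = [("has-title", "./title"), ("has-abstract", "./abstract")]
generated = compile_grammar(rules)
detect = load_detector(generated)

doc = ET.fromstring("<article><title>Example</title></article>")
features = detect(doc)
print(generated)
print(features)
```

Generating code once and running it many times keeps the per-document step simple, at the cost of an extra compilation stage. The exec call is only reasonable here because the rules are trusted input.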


The current version is a prototype, or proof of concept.

It does not have reasonable syntax checking for XPaths and models. A recursive grammar will cause a failure.
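A check for recursion could be added before the grammar is compiled. The sketch below is one possible approach, not something the prototype does: it assumes the grammar has been reduced to a mapping from each feature name to the names of its subfeatures, which is a hypothetical representation, and uses a depth-first search to find a feature that contains itself.

```python
def find_recursion(grammar):
    """Return one cycle of feature names as a list, or None if acyclic.

    `grammar` maps each feature name to the names of its subfeatures
    (a hypothetical representation, not the tool's own format).
    """
    visiting, done = [], set()

    def visit(name):
        if name in done:
            return None
        if name in visiting:
            # Slice from the first occurrence to show the loop itself.
            return visiting[visiting.index(name):] + [name]
        visiting.append(name)
        for child in grammar.get(name, []):
            cycle = visit(child)
            if cycle:
                return cycle
        visiting.pop()
        done.add(name)
        return None

    for start in grammar:
        cycle = visit(start)
        if cycle:
            return cycle
    return None


acyclic = {"doc": ["head", "body"], "body": ["para"]}
recursive = {"doc": ["section"], "section": ["para", "section"]}

print(find_recursion(acyclic))
print(find_recursion(recursive))
```

Reporting the cycle by name would let the grammar author see which feature refers back to itself, rather than having the run fail partway through.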

It uses the proprietary Saxon extension function saxon:parse(). It would be good to remove this dependency entirely. In the meantime, if you have XSLT 3.0 there is now a standard function for the same purpose, fn:parse-xml(), and some other implementations provide their own custom extension functions for it.

Who do I talk to?

Rick Jelliffe. Initially, my response times will probably depend on whether this project stimulates any interest or interesting follow-up. But please don't let that stop you from blasting off suggestions or complaints to me, and please don't consider me too rude if I don't always respond.