XMLy is an XML and HTML parser, written in PHP.

Features

  • Parsing XML/HTML documents
  • Filtering HTML elements and allowing only certain attributes
  • Pretty printing

Requirements

  • PHP 5.3
  • Some data to parse! :)


Basic example

require_once 'xmly/parser.php'; // the parser will include the datatypes on its own
$xmly = new xmly(); // initializing the parser
$xmly->loadHTML(file_get_contents('foo.html')); // you can either load the source manually...
$xmly->loadHTMLFile('foo.html'); // ...or make XMLy get it by itself
$xmly->parse(); // it's as simple as that!

NB: the methods loadXML and loadXMLFile are also available.

Using the parsed data

Pretty printing using the built-in method


Making your own printing function

The description of the types will be in the file datatypes.php.

// Get the root node after parsing the code
$root = $xmly->parse(true); // set it to true to make it return the root node

// Get the type of a node
$type = get_class($node);

// Looping through children/attributes
for ($i = 0; $i < $node->attributeCount; $i++) { /* ... */ } // only for xmlyNode
for ($i = 0; $i < $node->childCount; $i++) { /* ... */ } // xmlyNode and xmlyRoot

// Other data types have a `value` property
$text = $node->value; // for xmlyText and every class inheriting from it
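
As a rough illustration of the traversal described above, here is a minimal sketch of a recursive printing function. It does not use XMLy itself: `SketchNode` and `SketchText` are hypothetical stand-ins, and their `name`, `attributes` and `children` properties are assumptions made so the example runs on its own. Only `attributeCount`, `childCount` and `value` come from the description above; check datatypes.php for the real property names.

```php
<?php
// Hypothetical stand-ins for XMLy's node classes, only so this sketch is runnable.
// `name`, `attributes` and `children` are assumed names, not XMLy's documented API.
class SketchText {
    public $value;
    public function __construct($value) { $this->value = $value; }
}

class SketchNode {
    public $name;
    public $attributes = array(); // assumed: list of array(name, value) pairs
    public $children = array();   // assumed: list of child nodes
    public $attributeCount = 0;
    public $childCount = 0;

    public function __construct($name, $attributes = array(), $children = array()) {
        $this->name = $name;
        $this->attributes = $attributes;
        $this->children = $children;
        $this->attributeCount = count($attributes);
        $this->childCount = count($children);
    }
}

// Render a tree as an indented outline, one node per line.
function sketch_print($node, $depth = 0) {
    $indent = str_repeat('  ', $depth);

    // Text-like nodes only carry a value
    if ($node instanceof SketchText) {
        return $indent . '"' . $node->value . '"' . "\n";
    }

    // Element name followed by its attributes
    $line = $indent . $node->name;
    for ($i = 0; $i < $node->attributeCount; $i++) {
        list($attr, $val) = $node->attributes[$i];
        $line .= " $attr=\"$val\"";
    }
    $out = $line . "\n";

    // Recurse into the children, one level deeper
    for ($i = 0; $i < $node->childCount; $i++) {
        $out .= sketch_print($node->children[$i], $depth + 1);
    }
    return $out;
}

$tree = new SketchNode('p', array(array('class', 'intro')), array(
    new SketchText('Hello '),
    new SketchNode('a', array(array('href', '#')), array(new SketchText('world'))),
));
echo sketch_print($tree);
```

With the real parser, the `instanceof` check would presumably be replaced by a test on `get_class($node)` against the classes defined in datatypes.php.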

Filtering HTML/XML tags

If you want to use the built-in filter rules, start by initializing the filter when you create the parser.

$xmly = new xmly(true); // force init for filter lambda functions
$filtered_html = $xmly->filter(); // use the filter "as is"
$filtered_html = $xmly->filter($custom_tags); // or setup allowable tags/attributes

What you can do with your $custom_tags:

// If this array is empty, all tags will be removed
$custom_tags = array();

// Remove comments and doctype
$custom_tags['!'] = array(
    'comments' => false,
    'doctype' => false
);

// Simply allow a tag
$custom_tags['a'] = null;

// Allow a tag and only certain attributes
$custom_tags['a'] = array('href', 'target');

// Control the value of the allowed attributes
$custom_tags['a']['target'] = function($value) {
    return $value === '_blank' ? $value : null; // allowing 'target' if it is '_blank'
    // Note: if null is returned, the attribute will be removed
};

// More complex example, as used in the default filter
$custom_tags['p']['style'] = function($value) {
    $css = null;
    $style = explode(';', $value);
    foreach ($style as $line) {
        // Split each declaration into a trimmed property/value pair
        $rule = array_map('trim', explode(':', $line));
        if (count($rule) === 2 && in_array($rule[0], array('margin-left', 'margin-right', 'width', 'height', 'float'))) {
            $css .= "$rule[0]:$rule[1];";
        }
    }
    return $css; // null if no rule was kept, so the attribute is removed
};
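
To see what a callback like the `style` one above returns, it can be tried on its own, without loading XMLy. The sketch below is a stand-alone copy of that logic; the variable name `$filter_style` is hypothetical, and how XMLy itself invokes the callback is not shown here.

```php
<?php
// Stand-alone copy of the 'style' callback logic; $filter_style is a hypothetical name.
$filter_style = function ($value) {
    $allowed = array('margin-left', 'margin-right', 'width', 'height', 'float');
    $css = null;
    foreach (explode(';', $value) as $line) {
        // Split each declaration into a trimmed property/value pair
        $rule = array_map('trim', explode(':', $line));
        if (count($rule) === 2 && in_array($rule[0], $allowed)) {
            $css .= "$rule[0]:$rule[1];";
        }
    }
    return $css; // null when nothing was kept
};

var_dump($filter_style('color: red; width: 100px; float: left')); // "width:100px;float:left;"
var_dump($filter_style('color: red'));                            // NULL
```

Because each declaration is split on every colon, a value that itself contains a colon (a URL, for example) yields more than two parts and is dropped by the `count($rule) === 2` check.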