XMLy

XMLy is an XML and HTML parser, written in PHP.

Features:

  • Parsing XML/HTML documents
  • Filtering HTML elements and allowing only certain attributes
  • Pretty printing

Requirements

  • PHP 5.3 or later (the filter callbacks use anonymous functions)
  • Some data to parse! :)

Usage

Basic example

<?php
require_once 'xmly/parser.php'; // the parser includes the datatypes on its own
$xmly = new xmly(); // initialize the parser
// Pick one of the two loading methods:
$xmly->loadHTML(file_get_contents('foo.html')); // either load the source yourself...
$xmly->loadHTMLFile('foo.html'); // ...or let XMLy read the file
$xmly->parse(); // it's as simple as that!
?>

NB: the methods loadXML and loadXMLFile are also available.

Using the parsed data

Pretty printing using the built-in method

<?php
$xmly->prettyPrint();
?>

Making your own printing function

The data types are described in the file datatypes.php.

<?php
// Get the root node after parsing the code
$root = $xmly->parse(true); // pass true to make parse() return the root node

// Get the type of a node ($node is any node reached from $root; do_stuff() is a placeholder)
$type = get_class($node);

// Looping through children/attributes
for ($i = 0; $i < $node->attributeCount; $i++) // only for xmlyNode
    do_stuff($node->attribute($i));
for ($i = 0; $i < $node->childCount; $i++) // xmlyNode and xmlyRoot
    do_stuff($node->child($i));

// Other data types have a `value` property
$text = $node->value; // for xmlyText and every class inheriting from it
?>
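
The loops above can be combined into a recursive walk. This is a minimal sketch that assumes only what is described above: xmlyRoot and xmlyNode expose childCount and child($i), and xmlyText (with every class inheriting from it) exposes value. The function name xmly_dump is hypothetical and not part of XMLy.

```php
<?php
// Hypothetical helper, not part of XMLy: returns an indented outline of the tree.
function xmly_dump($node, $depth = 0)
{
    // One line per node: indentation followed by the class name
    $out = str_repeat('  ', $depth) . get_class($node);

    // xmlyText and every class inheriting from it carry a `value`
    if ($node instanceof xmlyText) {
        $out .= ': ' . trim($node->value);
    }
    $out .= "\n";

    // Only xmlyNode and xmlyRoot have children
    if ($node instanceof xmlyNode || $node instanceof xmlyRoot) {
        for ($i = 0; $i < $node->childCount; $i++) {
            $out .= xmly_dump($node->child($i), $depth + 1);
        }
    }
    return $out;
}
```

With the parser from the basic example, `echo xmly_dump($xmly->parse(true));` should print one line per node, indented by depth. Attributes are left out to keep the sketch short; they could be listed the same way with attributeCount and attribute($i) for xmlyNode.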

Filtering HTML/XML tags

If you want to use the built-in filter properties, initialize the filter by passing true to the constructor.

<?php
$xmly = new xmly(true); // force init for filter lambda functions
$filtered_html = $xmly->filter(); // use the filter "as is"
$filtered_html = $xmly->filter($custom_tags); // or setup allowable tags/attributes
?>

What you can do with your $custom_tags:

<?php
// If this array is empty, all tags will be removed
$custom_tags = array();

// Remove comments and doctype
$custom_tags['!'] = array(
    'comments' => false,
    'doctype' => false
);

// Simply allow a tag
$custom_tags['a'] = null;

// Allow a tag and only certain attributes
$custom_tags['a'] = array('href', 'target');

// Control the value of the allowed attributes
$custom_tags['a']['target'] = function($value) {
    return $value === '_blank' ? $value : null; // allowing 'target' if it is '_blank'
    // Note: if null is returned, the attribute will be removed
};

// More complex example, as used in the default filter
$custom_tags['p']['style'] = function($value) {
    $css = null;
    $style = explode(';', $value);
    foreach ($style as $line)
    {
        $rule = array_map('trim', explode(':', $line));
        if (count($rule) === 2 && in_array($rule[0], array('margin-left', 'margin-right', 'width', 'height', 'float')))
            $css .= "$rule[0]:$rule[1];";
    }
    return $css;
};
?>
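
Because each attribute rule is an ordinary PHP closure, it can be tried on its own before being handed to filter(). The sketch below turns the style callback above into a small factory and adds an href rule. The names css_whitelist and $http_only are hypothetical and not part of XMLy; the only contract assumed is the one stated above: return the value to keep the attribute, or null to remove it.

```php
<?php
// Hypothetical factory (not part of XMLy): builds a `style` callback
// that keeps only the listed CSS properties.
function css_whitelist(array $allowed)
{
    return function($value) use ($allowed) {
        $css = null;
        foreach (explode(';', $value) as $line)
        {
            $rule = array_map('trim', explode(':', $line));
            if (count($rule) === 2 && in_array($rule[0], $allowed, true))
                $css .= "$rule[0]:$rule[1];";
        }
        return $css; // still null when nothing survives, so the attribute is removed
    };
}

// Hypothetical `href` rule: keep the attribute only for http(s) URLs
$http_only = function($value) {
    $scheme = parse_url(trim($value), PHP_URL_SCHEME);
    if (!is_string($scheme))
        return null; // relative or malformed URL
    return in_array(strtolower($scheme), array('http', 'https'), true) ? $value : null;
};

$custom_tags = array();
$custom_tags['p']['style'] = css_whitelist(array('margin-left', 'margin-right', 'width', 'height', 'float'));
$custom_tags['a']['href'] = $http_only;

// Try the rules directly, without the parser
var_dump(call_user_func($custom_tags['p']['style'], 'margin-left: 10px; color: red; width: 50%'));
var_dump(call_user_func($custom_tags['a']['href'], 'javascript:alert(1)'));
```

The first call keeps margin-left and width and drops color; the second returns null because the scheme is not http or https. Note that $http_only also rejects relative links, and it is an illustration of the callback mechanism rather than a complete sanitizer.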