Python wrapper for W3C markup validator service.

Installation:

$ pip install py_w3c

Usage:

There are three ways to validate:

  1. validate url - HTMLValidator().validate(url)
  2. validate file - HTMLValidator().validate_file(filename_or_file)
  3. validate fragment - HTMLValidator().validate_fragment(fragment_string)
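The choice between the three methods can be sketched as a small dispatcher. This is only an illustration: validate_any is a hypothetical helper, not part of py_w3c, and the rules it uses to tell a URL from a file name from a fragment are assumptions. It only relies on the three method names listed above.

```python
import os


def validate_any(validator, target):
    """Call one of the three validation methods and return its name.

    Hypothetical helper: `validator` is any object exposing validate,
    validate_file and validate_fragment (for example an HTMLValidator).
    """
    # File-like objects (anything with a read method) go to validate_file.
    if hasattr(target, "read"):
        validator.validate_file(target)
        return "validate_file"
    # Strings that look like web addresses go to validate.
    if target.startswith(("http://", "https://")):
        validator.validate(target)
        return "validate"
    # Strings naming an existing file on disk go to validate_file.
    if os.path.isfile(target):
        validator.validate_file(target)
        return "validate_file"
    # Everything else is treated as a markup fragment (an assumption).
    validator.validate_fragment(target)
    return "validate_fragment"
```

With the real library this would be called as validate_any(HTMLValidator(), 'http://example.com'), after which the errors and warnings attributes can be read as usual.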

Note: You can pass a charset or doctype when creating a validator instance. This forces the validator to use the given doctype or charset for validation.

Example:

vld = HTMLValidator(doctype="XHTML1", charset="utf-8")
# the validator now uses the XHTML1 doctype and utf-8 charset, ignoring the doctype and charset declared in the document content
vld.validate('http://example.com')

Examples:

  • As library:
# import HTML validator
from py_w3c.validators.html.validator import HTMLValidator

# create validator instance
vld = HTMLValidator()

# validate
vld.validate('http://example.com')

# look for errors
print(vld.errors)  # list with dicts

# look for warnings
print(vld.warnings)
  • As standalone script (not very useful right now). The standalone script only supports URL validation and prints warnings and errors to the console.
$ w3c_validate http://example.com
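Since errors and warnings are lists of dicts, they can be turned into readable report lines with a few lines of code. The sketch below is hypothetical: format_messages is not part of py_w3c, and the dict keys 'line', 'col' and 'message' are assumptions about what the validator returns, so every lookup falls back to a placeholder.

```python
def format_messages(messages, kind="error"):
    """Turn a list of message dicts into printable lines.

    The 'line', 'col' and 'message' keys are assumed, not confirmed;
    missing keys are replaced with placeholders instead of failing.
    """
    lines = []
    for msg in messages:
        line = msg.get("line", "?")
        col = msg.get("col", "?")
        text = (msg.get("message") or "").strip() or "(no message)"
        lines.append("{0} at line {1}, col {2}: {3}".format(kind, line, col, text))
    return lines
```

After a validation call, something like print("\n".join(format_messages(vld.errors))) would print one line per error, and format_messages(vld.warnings, kind="warning") would do the same for warnings.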

Running tests:

$ python setup.py test

This command will install tox and run tests for py2.7 and py3.4.

To run the tests for a single Python version (py2.7, for example), use:

$ python setup.py test -a "-epy27"

To run the tests in your own environment, run:

$ python -m unittest discover

Note: By default the tests never send real requests to the W3C validator service; they use mocks instead (see the tests/responses directory). If you want to send real requests, comment out the @httpretty.activate decorator on the test of interest. Be careful in real mode, though: W3C may block your IP for 're-requesting the same resource too frequently'. More details here - https://www.w3.org/Help/abuse-info/re-reqs.html.
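When real requests are sent for several URLs, pausing between them lowers the risk of being blocked. The following is a minimal sketch under stated assumptions: validate_politely is a hypothetical helper, not part of py_w3c, and the one-second default delay is an assumed value, not a limit taken from this project. A fresh validator is created per URL so the sketch does not depend on how an instance handles repeated calls.

```python
import time


def validate_politely(urls, make_validator, delay_seconds=1.0, sleep=time.sleep):
    """Validate each URL in turn, pausing between consecutive requests.

    make_validator is a zero-argument callable returning a new validator
    (for example the HTMLValidator class itself). The delay is an assumption.
    Returns a dict mapping each URL to an (errors, warnings) tuple.
    """
    results = {}
    for index, url in enumerate(urls):
        # Pause before every request except the first one.
        if index > 0:
            sleep(delay_seconds)
        validator = make_validator()
        validator.validate(url)
        results[url] = (list(validator.errors), list(validator.warnings))
    return results
```

A call such as validate_politely(['http://example.com'], HTMLValidator) would use the real validator; the sleep parameter exists only so the pause can be replaced in tests.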