


A Python wrapper for the W3C markup validator service.


$ pip install py_w3c


There are three ways to validate:

  1. validate a URL - HTMLValidator().validate(url)
  2. validate a file - HTMLValidator().validate_file(filename_or_file)
  3. validate a fragment - HTMLValidator().validate_fragment(fragment_string)
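
The three entry points above can be wrapped in a small dispatcher. The sketch below is not part of py_w3c: `validate_any` and `RecordingValidator` are hypothetical names, and the only thing assumed about the real validator is the three method names listed above. The rule used to tell a URL from a file from a fragment is an illustrative choice, not the library's behaviour.

```python
import os


def validate_any(validator, target):
    """Call the validator method that matches the kind of `target`.

    Hypothetical helper, not part of py_w3c. `validator` is any object with
    validate, validate_file and validate_fragment methods.
    Returns the name of the method that was called.
    """
    if hasattr(target, "read"):
        # An open file object goes to validate_file.
        validator.validate_file(target)
        return "validate_file"
    if target.startswith(("http://", "https://")):
        validator.validate(target)
        return "validate"
    if os.path.isfile(target):
        # A path to an existing file also goes to validate_file.
        validator.validate_file(target)
        return "validate_file"
    # Anything else is treated as a markup fragment.
    validator.validate_fragment(target)
    return "validate_fragment"


class RecordingValidator(object):
    """Stand-in with the same three method names.

    It records calls instead of sending requests, so the dispatcher can be
    tried without touching the W3C service.
    """

    def __init__(self):
        self.calls = []

    def validate(self, url):
        self.calls.append(("validate", url))

    def validate_file(self, filename_or_file):
        self.calls.append(("validate_file", filename_or_file))

    def validate_fragment(self, fragment_string):
        self.calls.append(("validate_fragment", fragment_string))
```

Passing an HTMLValidator instance instead of the stand-in should work the same way, provided its methods accept the arguments shown in the list above.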

Note: You can pass a charset or doctype when creating the validator instance. The validator will then use the passed value for validation instead of the one declared in the document.


vld = HTMLValidator(doctype="XHTML1", charset="utf-8")
# now validator uses XHTML1 doctype and utf-8 charset ignoring doctype and charset in the document content


  • As library:
# import HTML validator
from py_w3c.validators.html.validator import HTMLValidator

# create validator instance
vld = HTMLValidator()

# validate
vld.validate("http://example.com")

# look for errors
print(vld.errors)  # list with dicts

# look for warnings
print(vld.warnings)
  • As standalone script (not very useful right now). Only URL validation is supported by the standalone script, which prints warnings and errors to the console.
$ w3c_validate http://example.com
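
For library use, a small reporting helper can produce console output similar in spirit to the standalone script. This is a minimal sketch under stated assumptions: `format_messages` and `report` are hypothetical names, the `warnings` attribute is assumed by analogy with `errors`, and the "line" and "message" keys are guesses about the layout of each dict. Missing keys fall back to "?".

```python
def format_messages(kind, messages):
    """Turn a list of message dicts into printable lines.

    The "line" and "message" keys are assumptions about the dict layout;
    missing keys fall back to "?" so the function still works if they differ.
    """
    lines = []
    for item in messages:
        line_no = item.get("line", "?")
        text = item.get("message", "?")
        lines.append("{0} (line {1}): {2}".format(kind, line_no, text))
    return lines


def report(validator):
    """Collect warnings first, then errors.

    `validator` is any object with .warnings and .errors attributes that
    hold lists of dicts (the .warnings name is an assumption).
    """
    return (format_messages("warning", validator.warnings)
            + format_messages("error", validator.errors))
```

After a validation call, `print("\n".join(report(vld)))` would print one line per message.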

Running tests:

$ python setup.py test

This command installs tox and runs the tests for Python 2.7 and Python 3.4.

To run the tests for a single Python version (Python 2.7, for example), use:

$ python setup.py test -a "-epy27"

Note: By default the tests never send real requests to the W3C validator service; they use mocks instead (see the validators/html/tests/responses directory). To send real requests, set FORCE_W3C_USE to True. Be careful with real mode, because W3C may block your IP for 're-requesting the same resource too frequently'. More details: https://www.w3.org/Help/abuse-info/re-reqs.html.
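
One way to lower the risk of being blocked when real requests are sent is to reuse recent results instead of asking for the same resource again. The sketch below is an illustration only and is not part of py_w3c: `CachedCheck` is a hypothetical name, `check` stands for any callable that takes a URL and returns a result, and the one-hour default is an arbitrary choice rather than a limit published by W3C.

```python
import time


class CachedCheck(object):
    """Remember results per URL so the same resource is not re-requested too soon.

    Hypothetical helper, not part of py_w3c. `check` is any callable taking a
    URL and returning a result; `max_age` is how long, in seconds, a stored
    result is reused. `clock` can be replaced in tests.
    """

    def __init__(self, check, max_age=3600.0, clock=time.time):
        self._check = check
        self._max_age = max_age
        self._clock = clock
        self._cache = {}  # url -> (timestamp, result)

    def __call__(self, url):
        now = self._clock()
        cached = self._cache.get(url)
        if cached is not None and now - cached[0] < self._max_age:
            return cached[1]  # fresh enough: no new request is made
        result = self._check(url)
        self._cache[url] = (now, result)
        return result
```

Here `check` could be a function that creates an HTMLValidator, calls validate(url), and returns the errors list. The cache lives in memory only, so it does not carry over between separate runs.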