Technique G134:Validating Web pages

Applicability

Any markup languages and many other technologies.

This technique relates to 4.1.1: Parsing (Sufficient).

Description

The objective of this technique is to avoid ambiguities in Web pages that often result from code that does not validate against formal specifications. Each technology's mechanism to specify the technology and technology version is used, and the Web page is validated against the formal specification of that technology. If a validator for that technology is available, the developer can use it.

Validation will usually eliminate ambiguities (and more) because an essential step in validation is to check for proper use of that technology's markup (in a markup language) or code (in other technologies). Validation does not necessarily check for full conformance with a specification but it is the best means for automatically checking content against its specification.

Examples

Example 1: Validating HTML

HTML pages include a document type declaration (sometimes referred to as !DOCTYPE statement) and are valid according to the HTML version specified by the document type declaration. The developer can use off-line or online validators (see Resources below) to check the validity of the HTML pages.

Example 2: Validating XML

XHTML, SVG, SMIL and other XML-based documents reference a Document Type Definition (DTD) or other type of XML schema. The developer can use online or off-line validators (including validation tools built into editors) to check the validity of the XML documents.

Example 3: Batch validation with Ant

The xmlvalidate task of Apache Ant can be used for batch validation of XML files. The following Apache Ant target is a simple example for the validation of files with the extension .xml in the directory dev\\Web (relative to the Ant build file).

              
              <target name="validate-xml"> 
                <xmlvalidate lenient="no"> 
                <fileset dir="dev/web" includes="*.xml" /> 
                </xmlvalidate> 
              </target>
            

Other sources

No endorsement implied.

Validating HTML and XHTML

  • The W3C Markup Validation Service by the World Wide Web Consortium allows you to validate HTML and XHTML files by URI, by file upload and by direct input of complete HTML or XHTML documents. There are also separate pages with an extended interface for file upload and for validating by URI (advanced options such as encodings and document types).
  • Installation Documentation for the W3C Markup Validation Service explains how to install this service (for example for use on an intranet).
  • WDG HTML Validator by the Web Design Group allows you to enter a URI to validate single pages or entire sites. There are also versions to validate Web pages in batch mode (by specifying one or more URIs of HTML documents to validate), by file upload and by direct input of HTML code.
  • Offline HTMLHelp.com Validator is a tool for Unix users; it is the off-line version of the online WDG HTML Validator.
  • Off-line HTML Validator – A clipbook for NoteTab by Professor Igor Podlubny is an extension for the programming editor NoteTab. It uses James Clark's open-source SGML parser, which is also used by the W3C Markup Validation Service.
  • Do-it-yourself Offline HTML Validator by Matti Tukiainen explains how you can create a simple validator with James Clark's SGML parser on Windows.
  • Validating an entire site by Peter Kranz explains how you can install a modified version of the W3C Markup Validation Service that outputs validation results as XML on Mac OS. Source code (in Perl and Python) is available.
  • Can I use the W3C MarkUp Validation Service to validate HTML? explains how you can validate HTML from within the free editor HTML-Kit.
  • HTML/XML Validator is an online repair tool for HTML and XHTML based on Tidy and PHP 5. It is available in several languages but it is not a real validator.
  • Fix Your Site With the Right DOCTYPE! by Jeffrey Zeldman explains what HTML and XHTML doctypes work and what their effect is on the rendering mode of a few browsers.
  • Modifying Dreamweaver to Produce Valid XHTML by Carrie Bickner.
  • XHTML-Schemata für FrontPage 2003 und Visual Studio .NET by Christoph Schneegans is a German article that explains how the W3C XML Schemas for XHTML 1.0 can be used in FrontPage 2003 and Visual Studio .NET to create valid code.
  • Nvu is a free and open-source Web authoring tool for Windows, Macintosh and Linux that can call the W3C HTML Validation Service.
  • Amaya by the World Wide Web Consortium is a free and open-source Web authoring tool with support for HTML, XHTML, CSS, SVG and MathML that alerts you to validity errors when you save a document.
  • Web Developer Extension is an extension for Mozilla, Firefox and Flock by Chris Pedrick that allows you to use the W3C Validation Services for HTML and CSS.

Validating XML

  • XML Validator - A Document Validation Service by JavaView allows you to check wellformedness and validity of XML files, by file upload or by direct input of XML code.
  • Apache Ant's XMLValidate Task can be used to validate XML-based documents. This tool can be used to validate entire directories (and subdirectories) of XML files.
  • XML Schema Validator by Christoph Schneegans is an online tool that allows you to validate XML (and XHTML) files by URI, by file upload, by direct input of complete XML documents, and by direct input of XML code fragments. A bookmarklet that allows you to validate the page currently displayed in your browser is also available. This validator claims to be more accurate than the W3C validator.
  • XML Schema Validator by CoreFiling is an online tool that allows you to validate an XML file against a W3C XML Schema, both of which can be uploaded.
  • Schema Validator: this is a validator that allows you to paste XML and W3C XML Schema code into text boxes to validate XML code.

Note that many programming editors, XML editors and integrated development environments (IDEs) can validate XML files. These include the following free and/or open-source tools:

  • the programming editor JEdit with the XML and SideKick plugins, which supports DTDs and W3C XML Schemas,
  • the “workbench" Eclipse with the Web Tools Platform,
  • the Web authoring tool SCREEM for the Gnome desktop environment, which supports DTDs,
  • the XML editor Jaxe, which validates XML files with Apache Xerces,
  • the XML editor Xerlin, which supports DTDs and to some extent W3C XML schema,
  • the XML editor xmloperator, which supports DTDs and RELAX NG schemas,
  • Emacs in nXML mode (see the YahooGroup Emacs nXML Mode),
  • the XML editor Pollo, which supports DTDs, W3C XML Schemas and RELAX NG schemas, and is best suited for tree-like XML files.

Tests

Procedure

For HTML, SGML-based and XML-based technologies:

  1. Load each page or document into a validating parser.
  2. Check that no validation errors are found.

For other technologies:

Follow the validation procedure defined for the technology in use, if any exists.

Expected Results

For HTML, SGML-based and XML-based technologies:

Step 2 is true.