XML Catalogs & Catalog Resolvers

XML documents typically refer to external entities, for example the public and/or system ID for the Document Type Definition. These external relationships are expressed using URIs, typically as URLs. However, absolute URLs only work when your network can reach them, so relying on remote resources makes XML processing susceptible to both planned and unplanned network downtime. Conversely, relative URLs are only useful in the context where they were initially created. For example, the URL "../../xml/dtd/docbookx.xml" will usually only be useful in very limited circumstances. One way to avoid these problems is to use an entity resolver (a standard part of SAX) or a URI resolver (a standard part of JAXP).

A resolver can examine the URIs of the resources being requested and determine how best to satisfy those requests. An XML catalog is a document that describes a mapping between external entity references and locally cached equivalents; a catalog resolver consults it to answer those requests.
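
As a rough illustration of what such a resolver does, the sketch below implements the SAX EntityResolver interface by hand, with an in-memory map standing in for a catalog. The class name LocalDtdResolver and the local paths passed to it are hypothetical; a real catalog resolver reads the mapping from a catalog file instead.

```java
import java.util.Map;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

/** Hypothetical resolver: maps known public IDs to local copies. */
class LocalDtdResolver implements EntityResolver {
    // Public ID -> local URI; the contents are supplied by the caller.
    private final Map<String, String> localCopies;

    LocalDtdResolver(Map<String, String> localCopies) {
        this.localCopies = localCopies;
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        // Guard against null: immutable maps reject null keys.
        String local = (publicId == null) ? null : localCopies.get(publicId);
        if (local == null) {
            // Returning null asks the parser to open the system ID itself.
            return null;
        }
        InputSource source = new InputSource(local);
        source.setPublicId(publicId);
        return source;
    }
}
```

Returning null for unknown identifiers keeps the default behaviour for anything the mapping does not cover, so only the listed entities are redirected.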

Java SAX Example

Catalog resolvers are available for various programming languages. The following example shows how, in Java, a SAX parser may be created to parse an input source, using `org.apache.xml.resolver.tools.CatalogResolver` to resolve external entities to locally cached instances. This resolver originates from the Apache XML Commons project (it ships alongside Apache Xerces) but is now included with the Sun Java runtime. Create a SAXParser in the normal way, using factories, then obtain the XML reader and set its entity resolver to the standard one or to one of your own.

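   import javax.xml.parsers.SAXParser;
   import javax.xml.parsers.SAXParserFactory;
   import org.apache.xml.resolver.tools.CatalogResolver;
   import org.xml.sax.ContentHandler;
   import org.xml.sax.InputSource;
   import org.xml.sax.XMLReader;

   final SAXParser saxParser = SAXParserFactory.newInstance().newSAXParser();
   final XMLReader reader = saxParser.getXMLReader();

   final ContentHandler handler = ...;
   final InputSource input = ...;

   // Route external entity lookups through the catalog before parsing
   reader.setEntityResolver( new CatalogResolver() );
   reader.setContentHandler( handler );
   reader.parse( input );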

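It is important to call the parse method on the reader, not on the SAX parser. The SAXParser parse methods install the handler they are given as the reader's entity resolver, which would replace the catalog resolver configured here.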
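
To watch the resolver hook being consulted without the Apache resolver on the classpath, the following sketch swaps in a stand-in resolver that serves an empty DTD from memory. The class name ReaderParseDemo, the method parseOffline and the stand-in resolver are assumptions made for this illustration; only the reader-based parse call mirrors the example above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class ReaderParseDemo {
    /** Parses xml and returns the system IDs the resolver was asked for. */
    static List<String> parseOffline(String xml) throws Exception {
        List<String> requested = new ArrayList<>();
        XMLReader reader =
            SAXParserFactory.newInstance().newSAXParser().getXMLReader();

        // Stand-in resolver (assumption): record the request and answer
        // with an empty in-memory DTD instead of touching the network.
        reader.setEntityResolver((publicId, systemId) -> {
            requested.add(systemId);
            return new InputSource(new StringReader(""));
        });
        reader.setContentHandler(new DefaultHandler());

        // Parse through the reader so the resolver set above stays in place.
        reader.parse(new InputSource(new StringReader(xml)));
        return requested;
    }
}
```

Calling parseOffline with a document whose DOCTYPE names an unreachable system ID should complete without a network request, because the resolver answers before the parser tries to open the URL.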

Example Catalog.xml

The following simple catalog shows how one might provide locally cached DTDs for a tool such as an XHTML page validator.

   <?xml version="1.0"?>
   <!DOCTYPE catalog
     PUBLIC "-//OASIS//DTD Entity Resolution XML Catalog V1.0//EN"
            "http://www.oasis-open.org/committees/entity/release/1.0/catalog.dtd">

   <catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog"
            prefer="public">

     <public publicId="-//W3C//DTD XHTML 1.0 Strict//EN"
             uri="dtd/xhtml1/xhtml1-strict.dtd"/>

     <public publicId="-//W3C//DTD XHTML 1.0 Transitional//EN"
             uri="dtd/xhtml1/xhtml1-transitional.dtd"/>

     <public publicId="-//W3C//DTD XHTML 1.1//EN"
             uri="dtd/xhtml11/xhtml11-flat.dtd"/>

   </catalog>
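
To check how a catalog like this one resolves a public identifier, the sketch below uses the JDK's own javax.xml.catalog API (available from Java 9) rather than the Apache resolver discussed earlier; note that its CatalogResolver is a different class with the same simple name. The class name CatalogLookupDemo, the temporary directory and the one-entry catalog (written without the DOCTYPE so that loading it needs no network access) are assumptions for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.catalog.CatalogFeatures;
import javax.xml.catalog.CatalogManager;
import javax.xml.catalog.CatalogResolver;
import org.xml.sax.InputSource;

class CatalogLookupDemo {
    // One entry copied from the catalog above; DOCTYPE omitted on purpose.
    static final String CATALOG =
        "<?xml version=\"1.0\"?>\n"
      + "<catalog xmlns=\"urn:oasis:names:tc:entity:xmlns:xml:catalog\"\n"
      + "         prefer=\"public\">\n"
      + "  <public publicId=\"-//W3C//DTD XHTML 1.0 Strict//EN\"\n"
      + "          uri=\"dtd/xhtml1/xhtml1-strict.dtd\"/>\n"
      + "</catalog>\n";

    /** Writes the catalog to a temp directory and resolves one public ID. */
    static String resolveStrictDtd() throws Exception {
        Path dir = Files.createTempDirectory("catalog-demo");
        Path catalogFile = dir.resolve("catalog.xml");
        Files.write(catalogFile, CATALOG.getBytes(StandardCharsets.UTF_8));

        CatalogResolver resolver = CatalogManager.catalogResolver(
            CatalogFeatures.defaults(), catalogFile.toUri());

        // The relative uri attribute is resolved against the catalog file.
        InputSource source = resolver.resolveEntity(
            "-//W3C//DTD XHTML 1.0 Strict//EN",
            "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd");
        return source.getSystemId();
    }
}
```

The returned system ID should be a file URI inside the temporary directory ending in dtd/xhtml1/xhtml1-strict.dtd, which shows that relative uri values are interpreted relative to the catalog's own location.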
