Preface
XML is one of the most important developments in document syntax in the history of computing. In the last few years it has been adopted in fields as diverse as law, aeronautics, finance, insurance, robotics, multimedia, hospitality, travel, art, construction, telecommunications, software, agriculture, physics, journalism, theology, retail, and comics. XML has become the syntax of choice for newly designed document formats across almost all computer applications. It's used on Linux, Windows, Macintosh, and many other computer platforms. Mainframes on Wall Street trade stocks with one another by exchanging XML documents. Children playing games on their home PCs save their documents in XML. Sports fans receive real-time game scores on their cell phones in XML. XML is simply the most robust, reliable, and flexible document syntax ever invented.
XML in a Nutshell is a comprehensive guide to the rapidly growing world of XML. It covers all aspects of XML, from the most basic syntax rules, to the details of DTD and schema creation, to the APIs you can use to read and write XML documents in a variety of programming languages.
What This Book Covers
There are hundreds of formally established XML applications from the W3C and other standards bodies, such as OASIS and the Object Management Group. There are even more informal, unstandardized applications from individuals and corporations, such as Microsoft's Channel Definition Format and John Guajardo's Mind Reading Markup Language. This book cannot cover them all, any more than a book on Java could discuss every program that has ever been or might ever be written in Java. This book focuses primarily on XML itself. It covers the fundamental rules that all XML documents and authors must adhere to, whether a web designer uses SMIL to add animations to web pages or a C++ programmer uses SOAP to exchange serialized objects with a remote database.
This book also covers generic supporting technologies that have been layered on top of XML and are used across a wide range of XML applications. These technologies include:
XLink
An attribute-based syntax for hyperlinks between XML and non-XML documents that provides the simple, one-directional links familiar from HTML, multidirectional links between many documents, and links between documents to which you don't have write access.
XSLT
An XML application that describes transformations from one document to another, in either the same or different XML vocabularies.
XPointer
A syntax for URI fragment identifiers that selects particular parts of the XML document referred to by the URI, often used in conjunction with an XLink.
XPath
A non-XML syntax used by both XPointer and XSLT for identifying particular pieces of XML documents. For example, an XPath expression can locate the third address element in the document, or all elements with an email attribute whose value is elharo@metalab.unc.edu.
Namespaces
A means of distinguishing between elements and attributes from different XML vocabularies that have the same name; for instance, the title of a book and the title of a web page in a web page about books.
Schemas
An XML vocabulary for describing the permissible contents of XML documents from other XML vocabularies.
SAX
The Simple API for XML, an event-based application programming interface implemented by many XML parsers.
DOM
The Document Object Model, a language-neutral tree-oriented API that treats an XML document as a set of nested objects with various properties.
XHTML
An XMLized version of HTML that can be extended with other XML applications such as MathML and SVG.
RDDL
The Resource Directory Description Language, an XML application based on XHTML for documents placed at the end of namespace URLs.
All these technologies, whether defined in XML (XLink, XSLT, Namespaces, Schemas, XHTML, and RDDL) or in another syntax (XPointer, XPath, SAX, and DOM), are used in many different XML applications.
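To make two of these descriptions concrete, here is a minimal sketch in Python of the XPath and Namespaces examples mentioned above. It uses the standard library's xml.etree.ElementTree module, which supports only a limited subset of XPath, so treat it as an illustration rather than a full XPath implementation. The sample document, the element names other than address and title, the someone@example.com address, and the two example.com namespace URIs are all invented for this sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for illustration only.
# The namespace URIs are placeholders, not real vocabularies.
DOC = """\
<people xmlns:bk="http://example.com/ns/book"
        xmlns:pg="http://example.com/ns/page">
  <pg:title>Books about XML</pg:title>
  <bk:title>XML in a Nutshell</bk:title>
  <address email="someone@example.com">First</address>
  <address>Second</address>
  <address email="elharo@metalab.unc.edu">Third</address>
</people>
"""

root = ET.fromstring(DOC)

# XPath-style query: the third address child of the root
# (positions are counted from 1, not 0).
third = root.find("address[3]")

# XPath-style query: every element at any depth whose email
# attribute has the given value.
matches = root.findall(".//*[@email='elharo@metalab.unc.edu']")

# Namespaces: both elements are named "title", but each belongs to a
# different namespace URI, so they can be selected separately.
ns = {"bk": "http://example.com/ns/book",
      "pg": "http://example.com/ns/page"}
book_title = root.find("bk:title", ns)
page_title = root.find("pg:title", ns)

print(third.text)
print([m.text for m in matches])
print(book_title.text, "|", page_title.text)
```

The prefixes bk and pg are only local shorthand. What actually distinguishes the two title elements is the namespace URI each prefix is bound to, which is why the lookup passes a prefix-to-URI mapping rather than relying on the prefixes themselves.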
This book does not specifically cover XML applications that are relevant to only some users of XML, such as:
SVG
Scalable Vector Graphics, a W3C-endorsed standard XML encoding of line art.
MathML
The Mathematical Markup Language, a W3C-endorsed standard XML application used for embedding equations in web pages and other documents.
RDF
The Resource Description Framework, a W3C-standard XML application used for describing resources, with a particular focus on the sort of metadata one might find in a library card catalog.
Occasionally we use one or more of these applications in an example, but we do not cover all aspects of the relevant vocabulary in depth. While interesting and important, these applications (and hundreds more like them) are intended primarily for use with special software that knows their format intimately. For instance, most graphic designers do not work directly with SVG. Instead, they use their customary tools, such as Adobe Illustrator, to create SVG documents. They may not even know they're using XML.
This book focuses on standards that are relevant to almost all developers working with XML. We investigate XML technologies that span a wide range of XML applications, not those that are relevant only within a few restricted domains.