C/C++ developers: Fill your XML toolbox

Wangchao C/C++ · Author unknown · 2006-05-18

URL:http://www-128.ibm.com/developerworks/xml/library/x-ctlbx.html

01 Sep 2001

Designed for C and C++ programmers who are new to XML development, this article gives an overview of tools to assemble in preparation for XML development. Tool tables outline generic XML tools like IDEs and schema designers, parsers, XSLT tools, SOAP and XML-RPC libraries, and other libraries either usable from or actually written in C and/or C++. The article includes advice for installing open-source libraries on Windows, Unix, and Linux, plus a brief glossary of key XML terms.


It seems as if everywhere you look there is some new XML-related tool being released in source code form written in Java. Despite Java's apparent dominance in the XML arena, many C/C++ programmers do XML development, and there is a large assortment of XML tools for the C and C++ programmer. We'll confront XML library issues like validation, schemas, and API models. Next, we'll look at a collection of generic XML tools like IDEs and schema designers. Finally, we'll conclude with a list and discussion of libraries either usable from or actually written in C and/or C++.

In this article, I'll skip arguments for using XML; I'll assume that you already have good reasons for wanting to ramp up on this technology. Also, I'll leave the more detailed explanations of XML to the background sources in Resources. Suffice it to say that XML is a standard for data exchange (not just a file format). The data may be exchanged in the form of application files in XML format, or over network connections that exist only for a moment before being discarded at the conclusion of an online transaction or at the close of a network connection.

Also, this isn't a comparative review that rates tools. My goal is to explain the types of tools you'll probably need and to point you to likely candidates. You'll still need to research, test, and compare tool features against your project needs to assemble your ultimate toolbox.

Two sets of tools

To incorporate XML in your own software projects, you're going to want to have two sets of tools in your bag of tricks. The first set is a dialect designer (or more properly "schema designer"). The second set of tools includes software libraries that will add parsing and XML-generation features to your application.


Designing your own XML dialect

An XML dialect is just a particular set of XML tags along with some rules for how the tags fit together. The two dominant ways for specifying or defining an XML dialect currently are through a Document Type Definition (DTD) or an XML Schema. I'll refer to both of these collectively as a schema.

Your project's domain may already have a particular schema designed for you. If not, you can hack up your own schema using a plain text editor. A more refined approach is to use an actual dialect designer that can check syntax. (An incorrect schema won't help when you try to use it later to validate your XML data.)


Design tools

Nowadays, most programmers' text editors -- particularly the ones found in IDEs -- have decent macro and template support for features like syntax highlighting and autocompletion of a partially typed word or phrase. Therefore I am omitting from this discussion any XML editors that do only syntax highlighting and autocompletion. Microsoft Word or an Emacs macro can do that, so "XML editor" ought to mean something more.

The tools shown in Table 1 fall into three categories:

The IDE (integrated development environment, the Swiss Army knife approach)

The schema-sensitive XML editor (reinforces allowable tag structure and attributes found in a schema or DTD)

The schema designer (for writing your own schema or DTD)

Because all of the tools in Table 1 are running applications, they are suitable for all XML developers -- not just those using C++ (unlike the tools listed in the other tables in this article).

Table 1. Dialect design tools for various platforms

| Product | Vendor | Description | License | Platform |
|---|---|---|---|---|
| Turbo XML | TIBCO/Extensibility | XML Schema/DTD designer and IDE | commercial | Java, Win32 |
| Komodo | Active State | XML editor and IDE | commercial | Linux, Win32, others |
| XML Spy | Altova | IDE for XML editing and schema design | commercial | Win32 |
| XML Notepad | Microsoft | XML editor | free | Win32 |
| Morphon XML | Lunatech Research | Schema-sensitive XML data editor | commercial | Java |
| XED | University of Edinburgh | Schema-sensitive XML data editor | noncommercial | Win32, Linux, Unix |
| Xeena | IBM alphaWorks | Schema-sensitive XML data editor | free trial/commercial | Mac, Unix, Win32 |
| Visual XML | Pierlou | Schema-sensitive XML data editor | noncommercial | Java |
| Netpadd | Phillip Lenssen | Alternative to Microsoft's XML Notepad | noncommercial | Win32 |
| XMetal | Softquad | DTD-sensitive XML editor | commercial | Win32 |
| Merlot | Channelpoint | Visual XML editor; supports plug-ins for DTDs | noncommercial | Java |
| XML Validator | ElCel Technologies | Command-line XML validation tool | noncommercial | Win32 |
| XML Canon | ElCel Technologies | Produces canonical XML by merging XML data with DTD | noncommercial | Win32 |


C and C++ tools

The rest of this article serves up the meat and potatoes of adding XML functionality for the C/C++ programmer through software libraries. You'll see in the next sections of this article that many more command-line utilities can be found included as test and/or sample programs in the software libraries. As an example, Transformiix can be used as a library, a Perl module, or a command-line tool.

Licensing

In all of the tool tables, commercial license means that the tool must be purchased to be used in a production environment. Trial versions (with time-limits or with key features disabled) are usually available for download and evaluation. All of the other tools have noncommercial licenses that require no fee for use, but if you're using them in commercial projects, make sure that your intended use complies with the terms of the license. For example, use any open-source code as shared libraries or DLLs so as to be consistent with the open-source license covenants (usually LGPL) that accompany these XML libraries. In the table, the type of noncommercial license is specified when it has been possible to determine the type of license without completely installing the tool; the table uses noncommercial to designate free software whose license terms are a bit more obscure.

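The most common open-source licenses are Apache, GPL, BSD, and LGPL. GPL code cannot be included in a closed-source commercial package, because the GPL requires the combined work to be released under the GPL; Apache, BSD, and LGPL code can be used in software that is sold, provided you follow each license's conditions. Several of these licenses also restrict using the project's name in a derived product without permission.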

Up until now I've tried to be frugal in my use of XML-related terms. Before you read further, though, if you are not familiar with XML basics, you might want to scan the terms defined in the XML terminology sidebar. The terms will assist you both in following the rest of this article and in sorting out the features of the tools and libraries mentioned as you explore them in greater depth on your own.


Parsers

Once you have a DTD or schema and the XML document to go with it, you'll need a parser to read and interpret the XML document. Table 2 outlines parser libraries for C/C++ developers. Before you start poring over the grid in the table, though, you'll need a little bit of background.

Validation

XML parsers come in two forms: validating and nonvalidating. Which one do you need? If you are working without a formal DTD or schema, validation features won't be important to you. If you have or are planning to use a DTD or schema, you will probably prefer a validating parser. (In that case, I suggest that you also learn how to read and write a DTD/schema by hand so that when validation issues arise you will be able to deal with the errors. Sometimes the error is in the DTD/schema, so you may be debugging the DTD/schema files as well as your XML data.)
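To make the distinction concrete, here is a minimal sketch in standard C++ of the kind of check a validating parser adds on top of well-formedness. The Element and ContentRules types and both function names are hypothetical, invented for illustration; they are not the API of any library in Table 2. A real DTD or XML Schema also constrains order, cardinality, and attributes, which this toy ignores.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical in-memory element: a name plus child elements.
struct Element {
    std::string name;
    std::vector<Element> children;
};

// Toy stand-in for a schema: for each declared element,
// the set of child element names it allows.
using ContentRules = std::map<std::string, std::set<std::string>>;

// Append the name of every element that breaks the rules, in document order.
void collect_violations(const Element& e, const ContentRules& rules,
                        std::vector<std::string>& out) {
    auto declared = rules.find(e.name);
    if (declared == rules.end()) {
        out.push_back(e.name);  // element is not declared at all
        return;
    }
    for (const Element& child : e.children) {
        if (declared->second.count(child.name) == 0) {
            out.push_back(child.name);  // parent does not allow this child
        } else {
            collect_violations(child, rules, out);
        }
    }
}

// A document is "valid" here when no element violates the rules.
bool is_valid(const Element& root, const ContentRules& rules) {
    std::vector<std::string> violations;
    collect_violations(root, rules, violations);
    return violations.empty();
}
```

A document can pass a nonvalidating parser (every tag closed and properly nested) and still fail a check like this one, which is why the list of violations is useful when you have to decide whether the data or the schema is at fault.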

Parser API models

Two common API models exist for interfacing software with an XML parser: the document model and the event model. The document API model parses XML data to produce an object. The object abstracts the document's contents into a tree structure. The application operates on this tree-structure object. The event API model uses a callback mechanism to notify the application of the structure of the XML data. The events/callbacks usually occur at the time of parsing.
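The two models can be sketched side by side in standard C++. Everything below is a toy written for this illustration: the Handlers struct, the scan function, and the Node type are hypothetical names, and the scanner understands only bare start tags, end tags, and text (no attributes, comments, entities, or error reporting). It is not the interface of expat, libxml, or Xerces.

```cpp
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Event model: the scanner calls these handlers while it reads the input.
struct Handlers {
    std::function<void(const std::string&)> start_element;
    std::function<void(const std::string&)> end_element;
    std::function<void(const std::string&)> text;
};

// Toy scanner for a tiny XML subset: <name>, </name>, and character data.
void scan(const std::string& xml, const Handlers& h) {
    std::size_t i = 0;
    while (i < xml.size()) {
        if (xml[i] == '<') {
            std::size_t close = xml.find('>', i);
            if (close == std::string::npos) return;  // truncated tag: stop
            if (xml[i + 1] == '/') {
                h.end_element(xml.substr(i + 2, close - i - 2));
            } else {
                h.start_element(xml.substr(i + 1, close - i - 1));
            }
            i = close + 1;
        } else {
            std::size_t next = xml.find('<', i);
            if (next == std::string::npos) next = xml.size();
            h.text(xml.substr(i, next - i));
            i = next;
        }
    }
}

// Event-model client: counts elements without keeping any document in memory.
int count_elements(const std::string& xml) {
    int n = 0;
    Handlers h;
    h.start_element = [&n](const std::string&) { ++n; };
    h.end_element = [](const std::string&) {};
    h.text = [](const std::string&) {};
    scan(xml, h);
    return n;
}

// Document model: the whole input becomes a tree the application can walk.
struct Node {
    std::string name;  // element name (empty for the synthetic root)
    std::string text;  // character data directly inside this element
    std::vector<std::unique_ptr<Node>> children;
};

// Builds the tree by listening to the same events as any other client.
std::unique_ptr<Node> build_tree(const std::string& xml) {
    std::unique_ptr<Node> root(new Node());
    std::vector<Node*> open;  // stack of elements that are still open
    open.push_back(root.get());
    Handlers h;
    h.start_element = [&open](const std::string& name) {
        std::unique_ptr<Node> n(new Node());
        n->name = name;
        Node* raw = n.get();
        open.back()->children.push_back(std::move(n));
        open.push_back(raw);
    };
    h.end_element = [&open](const std::string&) {
        if (open.size() > 1) open.pop_back();
    };
    h.text = [&open](const std::string& t) { open.back()->text += t; };
    scan(xml, h);
    return root;
}
```

The trade-off shows up in the two clients: count_elements needs only a counter no matter how large the input is, while build_tree holds the entire document in memory but lets the application revisit any part of it after parsing.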

API standards: DOM and SAX

The generic parser API models have been further refined into specific API standards. The W3C has recommended DOM (levels 1 and 2, with level 3 in draft) as a standardized document API model. While not a W3C project, SAX has taken its place as the de facto standard event API model.

W3C standards

As you compare features in parsers and other XML tools, look for support for W3C recommendations and emerging specifications, such as namespaces, XPath, XLink, XInclude, and XInfoset. Keep in mind that XML technologies are maturing rapidly and that support for the first level of a specification, such as the DOM, may lack important functionality introduced in level two of that specification. If functionality in the most current form of a specification is important to your project, choose your tools accordingly.

Reading the table

In Table 2, the Event column specifies parsers that support a push or event model API, like SAX. The Doc column specifies parsers that support a pull or document model API, like DOM. As before, the table lists both commercial and noncommercial tools (see the Licensing sidebar for details about the software licenses).

Table 2. Parsers for C/C++ developers

| Library | Supplier | Event | Doc | Highlights | License |
|---|---|---|---|---|---|
| expat | James Clark/expat team | native & SAX | - | Very fast push model parser with a native API and SAX wrappers | LGPL (free) |
| libxml | Gnome | SAX | DOM | Very robust; SAX & DOM wrappers; does DTD validation | LGPL (free) |
| MSXML | Microsoft | SAX | DOM | The Microsoft XML library for Win32 | EULA (free) |
| Xerces | Apache Software Foundation | SAX | DOM | Does SAX plus DOM levels 1 and 2; DTD validation; incremental XML Schema | Apache (free) |
| XTL | Vivid Creations | SAX | DOM | STL-based XML toolkit with SAX and DOM | commercial |
| RXP | University of Edinburgh | - | native | Validating namespace-aware XML parser in C | GPL (free) |
| XML4C | IBM alphaWorks | SAX | DOM | IBM-sponsored variant of Xerces | Apache (free) |
| Oracle XDK 8i | Oracle | SAX | DOM | Oracle-sponsored XML toolkit for C++ | noncommercial |
| Pull Parser | Extreme! Lab | - | native | Indiana University-sponsored lightweight XML toolkit for C++ | noncommercial |
| XML Booster | PhiDaNi Software | - | native | Parser generator, generates C-source parser | commercial |

Open-source top three

The three most popular of the open-source XML libraries are expat, libxml, and Xerces. All three are cross-platform, and each serves as a basis for an XSLT library implementation, which gives you a growth path once you've satisfied your basic XML needs.

Expat is an open-source event-oriented XML parsing library originated by James Clark. He has transferred the project to a small team on SourceForge. A SAX wrapper is available. The expat parser can be found in a number of projects, such as the open-source browser Mozilla, the XSLT processor Transformiix, and the RDF tool repat.

Libxml offers a dual-mode API for both SAX and DOM-like operations. It supports validating against a DTD and is used in Gnome's XSLT processor, libxslt. Libxml was rewritten and released as libxml(2) but may also be referred to as libxml2. Users of this library should make sure they have the current version.

Xerces is a very solid, well-documented library that serves as the basis for the IBM alphaWorks XML4C library. Xerces is also used in the Apache XSLT processor, Xalan. Xerces supports DOM, SAX, and validating against a DTD. The latest version reads and interprets parts of the W3C XML Schema Recommendation (with complete XML Schema support targeted for the end of 2001).

Compiling and linking one of the top three libraries into your project is painless. Most packages include thorough instructions for each platform. Here are some sample installation instructions.

Building an open-source library on Windows

Building libxml on Windows from scratch is an easy four steps:

1. Download the source tarball.

2. Use a program like Winzip to extract the contents to a directory. Be sure you instruct your extraction utility to preserve path names for any subdirectories that libxml may need.

3. Locate the libxml2.dsw file in the ./win32/dsp subfolder and open it from MS Developer Studio.

4. Select Build All from the top menu in DevStudio. This builds all the sample and test programs along with the libxml DLLs needed to run them.

You can use the above steps to build Xerces on Windows. The only difference is to look for the samples.dsw workspace file in the ./c/samples/Projects/Win32/VC6 subfolder.

Expat has started including DSP project makefiles. Look in the lib and xmlwf subfolders.

Building an open-source library on Unix

For projects running on Linux or Unix, in most cases you can untar the source code to an empty directory, set some options, and type "make" to build a shared library. Solaris users: Don't forget to use the GNU tar utility. The following commands worked for me from the bash shell under Slackware Linux:

tar -xf xerces-c-src1_5_1.tar

cd xerces-c-src1_5_1

export XERCESCROOT=/home/mine/xerces-c-src1_5_1

cd src

autoconf

chmod 755 runConfigure

./runConfigure -p linux

gmake

cd ../samples

chmod 755 runConfigure

./runConfigure -p linux

gmake

XML and COM

MSXML, Microsoft's proprietary XML offering for the Windows family of operating systems, is implemented as a collection of scriptable COM objects, so it plays well in other language environments and is well documented. The library supports DOM along with a native document-oriented interface. SAX events are also supported.

As an alternative to MSXML, the Apache XML Project's Xerces library comes with a COM wrapper that will make it act as a drop-in replacement for MSXML in many cases. Vivid Creations offers COM wrappers for the SAX and DOM APIs to its XTL library that also serve as substitutes for MSXML.


XML transforms: XSLT and XQuery

The transform is the next step up the XML evolutionary ladder from merely processing XML data at the element and attribute level. An XML transform operates on incoming XML data to produce XML output. A transform can reorganize tag structure, add/remove tags and attributes, and filter to zoom in on select fragments of XML data.

The XQuery documentation refers to the transform process as a query but the meaning is the same.

XSLT and XQuery are XML dialects for specifying how to perform such actions on arbitrary XML data. You can write a script file with the changes expressed in XML as XSLT or XQuery instead of loading some XML data into a DOM and having to programmatically manipulate the DOM version to produce a desired result. This more generalized approach leads to greater flexibility and reduced development time. Now your Web developers who are not C/C++ programmers can write their own transforms as XML, which may free a C++ programmer for more complex work.
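For contrast, the hand-coded approach that a transform language replaces might look like the following sketch in standard C++. The Element type and the transform and to_xml functions are hypothetical names made up for this example, and the rename and drop rules are hard-wired into function arguments; with XSLT those rules would live in a separate stylesheet that can change without recompiling.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical in-memory element: name, character data, and children.
struct Element {
    std::string name;
    std::string text;
    std::vector<Element> children;
};

// Copy the tree, renaming elements listed in `rename` and
// filtering out any child (with its subtree) listed in `drop`.
Element transform(const Element& in,
                  const std::map<std::string, std::string>& rename,
                  const std::set<std::string>& drop) {
    Element out;
    auto r = rename.find(in.name);
    out.name = (r != rename.end()) ? r->second : in.name;
    out.text = in.text;
    for (const Element& child : in.children) {
        if (drop.count(child.name) != 0) continue;  // filtered out
        out.children.push_back(transform(child, rename, drop));
    }
    return out;
}

// Serialize the tree back to markup. Text is written as-is,
// so this toy assumes it contains no '<' or '&' characters.
std::string to_xml(const Element& e) {
    std::string s = "<" + e.name + ">" + e.text;
    for (const Element& child : e.children) s += to_xml(child);
    return s + "</" + e.name + ">";
}
```

Every new rule in this style means another C++ change and rebuild, which is the development cost the paragraph above argues a transform language avoids.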

Table 3. C/C++ Transform/Query libraries

| Library | Supplier | Highlights | License |
|---|---|---|---|
| libxslt | Gnome | Built on top of libxml | LGPL or X11-like (free) |
| Xalan | Apache | Built on top of the Xerces parser | Apache (free) |
| Transformiix | MITRE | XSLT processor built on expat | noncommercial |
| xsltc | Oliver Gerardin | XSLT compiler, produces C code | noncommercial |
| sablotron | Ginger Alliance | XSL engine | noncommercial |


Messaging: XML-RPC and SOAP tools

For the purpose of this article, messaging refers to having two software agents communicate with one another. Such messaging is sometimes called message-oriented middleware. (This is not messaging like AOL, MSN, or ICQ, okay? There is an XML-based instant-messaging protocol effort underway called Jabber. I've included a link in Resources for your curiosity, but, again, that's not what I'm talking about here.)

Using XML for messaging has become popular enough to have produced these two alternatives: XML-RPC and SOAP. The most appealing feature of these protocols is that clients, servers, and peers can differ dramatically in terms of the developer's choice of tools for implementation. It's as though all developers get to use their favorite language, development kit, or software library and still work together.

(As a side note, Gregor Purdy has written an excellent critique of XML-RPC in the form of a proposed alternative; see Resources.)
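To give a feel for what travels over the wire, here is a minimal sketch in standard C++ that builds the body of an XML-RPC request. The element names (methodCall, methodName, params, param, value, string) come from the XML-RPC specification; the function names and the restriction to string parameters are assumptions made for this example, and it is not the interface of any library in Table 4. A real client would also send the body in an HTTP POST and parse the methodResponse that comes back.

```cpp
#include <string>
#include <vector>

// Escape the characters that must not appear raw in XML character data.
std::string escape_xml(const std::string& s) {
    std::string out;
    for (char c : s) {
        switch (c) {
            case '&': out += "&amp;"; break;
            case '<': out += "&lt;"; break;
            case '>': out += "&gt;"; break;
            default:  out += c;
        }
    }
    return out;
}

// Build an XML-RPC request body whose parameters are all strings.
std::string make_method_call(const std::string& method,
                             const std::vector<std::string>& params) {
    std::string xml = "<?xml version=\"1.0\"?><methodCall><methodName>" +
                      escape_xml(method) + "</methodName><params>";
    for (const std::string& p : params) {
        xml += "<param><value><string>" + escape_xml(p) +
               "</string></value></param>";
    }
    xml += "</params></methodCall>";
    return xml;
}
```

Because the request is plain text in a published format, the server receiving it can be written in any language with an XML parser, which is the interoperability point made above.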

Table 4 includes a few libraries for use in the message-oriented middleware category. This is not an exhaustive list of the resources in this category, and there are new tools evolving quickly, but it's a good start.

Table 4. C/C++ Messaging libraries

| Library | Supplier | Highlights | License | Platform |
|---|---|---|---|---|
| 4S4C SOAP services | Simon Fell | Open Source SOAP effort | noncommercial | Linux, Unix, Win32 |
| SOAP client | SQL Data | C++ SOAP client toolkit | commercial | Win32 |
| SOAP component | mozilla.org | Scriptable XPCOM component | noncommercial | numerous |
| XML-RPC for C/C++ | First Peer | XML-RPC library in C | noncommercial | Linux, Unix, Win32 |
| XML-RPC component | mozilla.org | Scriptable XPCOM component | noncommercial | numerous |
| XML-RPC for C/C++ | Epinions | XML-RPC library in C | noncommercial | Linux, Unix, Win32 |


In parting

These tools ought to give you a good start on your XML toolbox. If you want to suggest other C/C++ tools for XML that you have tried or to make any other comment, join the discussion attached to this article (use the link in Resources or click the Discuss icon at the top or bottom of the article page).

XML terminology

These XML terms may come in handy as you read about the libraries discussed in this article:

Document model: Technique for parsing and manipulating XML data as a treelike object; this is also called a "pull" model. See the DOM API standard as an example.

DOM: The Document Object Model is a specific tree-structured programming model of an XML document described as a standard by the W3C. The DOM standard is currently divided into three levels. DOM 1.0 refers to DOM Level 1 conformance; DOM Level 2 is the most current spec that has been approved by the W3C as a Recommendation; DOM Level 3 is in draft at the time of this writing.

DTD: Document Type Definition. A file, written in an SGML-derived syntax rather than in XML itself, that defines XML elements and XML attributes for those elements and that specifies rules for how XML tags may be nested and what data an element may contain.

Event model: Technique for parsing XML data using callbacks or handlers; this is also called a "push" model. See the SAX API standard as an example.

Namespaces: Means of unambiguously identifying XML tags from different DTDs or schemas so they can be mixed in the same XML document. RDF is highly dependent upon this feature; the reserved attribute "xmlns" is used to declare a namespace within an XML document.

RDF: Resource Description Framework, a compact XML dialect for associating XML attribute data with information that usually resides elsewhere. Your driver's license would be analogous to an RDF XML file that describes you.

SAX: Simple API for XML is a standard programming interface for XML parser implementations; SAX uses an event-oriented programming model. SAX is a de facto standard first developed by David Megginson and now maintained by the XML-dev mailing list.

SOAP: Simple Object Access Protocol is a network protocol similar to XML-RPC (see XML-RPC). Using SOAP, an application can create a remote object, invoke methods on that object, and retrieve results.

Validation: Verifying that a well-formed XML document is correct with respect to a DTD or schema.

Well formed: An XML document whose tags and data are consistent with XML 1.0 syntax.

W3C: The World Wide Web Consortium, which has become the key standards body for most of the XML-related technologies. The W3C calls a finally approved specification a Recommendation (rather than a standard).

XML 1.0: The first standard for XML syntax blessed by the W3C; establishes basic rules for XML data, such as all tags must be closed with a slash ( / ) like this: <example/> or followed by a closing tag like this: <example>close the tag on your way out</example>.

XML-RPC: XML Remote Procedure Call. XML-RPC is a standard XML dialect for invoking methods and services across a network; as you can guess, XML-RPC uses XML for the messaging between client and server.

XML Schema: XML Schema, a W3C Recommendation, works much like a DTD to define the structure of an XML document but with more flexibility. XML Schema uses XML 1.0 syntax to specify the schema, in contrast to the older SGML syntax used for DTDs.

XQuery: Similar in some of its functionality to XSLT but designed more toward acting as a query language for XML data -- analogous to using SQL in a relational database. Less mature than XSLT as a specification, XQuery may become the SQL of the next decade.

XSLT: Extensible Stylesheet Language Transformations, an XML dialect for transforming XML content. You apply an XSLT file to some XML input data to produce the desired XML output data.


Resources

Visit the W3C XML page, the host with the most in XML specifications.

Explore this compilation of XML-related software.

Catch up on XML basics with Zvon's general XML tutorial or Doug Tidwell's Intro to XML here on developerWorks and other offerings in the XML zone education area.

Look at Howard Katz's introduction to XQuery.

Review Gregor Purdy's excellent critique of XML-RPC.

Gain an understanding of XSLT in Michael Kay's technical overview, What kind of language is XSLT?.

Find out about that other kind of messaging (instant messaging) at Jabber.

Get a handle on the XML developer skillset by reviewing the XML Certification guidelines for the IBM Certified Developer Program.

XML Spec 1.0: The W3C core XML 1.0 specification.

DOM Level 1.0: The W3C Document Object Model level 1 API Recommendation.

DOM Level 2.0: The W3C Document Object Model level 2 API Recommendation.

SAX/SAX2: The Simple API for XML event model de facto standard.

Namespace: W3C Recommendation for handling XML namespaces.

XML Schema: All about the W3C XML Schema Recommendation


About the author

Rick is a long-time programmer whose professional career consists of stalking stock options and defeating deadlines while giving gratuity enough to make any waitress blush. His one claim to fame is being known by first name at nearly every coffeehouse in town. He also enjoys speaking in a seminar setting on technical topics. A bit of a maverick on design, he is moving to more modern modeling methodologies like UML. A bumper sticker on his car reads: "I <br/> for XHTML." You can contact Rick at rfmobile@swbell.net.

 
 
 