XML: How to get the benefits without the heartache, part 1

June 2, 2008, 12:50 PM —  ITworld.com — 

First up, a scope warning for this article. This is the first part of a two-part piece about XML. Here I focus on uses of XML in areas such as application configuration files, the exchange of structured, machine-oriented data, that sort of thing. In the second part, I discuss XML in document-centric applications such as content/document management and Web publication systems.

Much has been written and continues to be written about the "angle bracket tax". Let us start by calling a spade a spade. XML is not a silver bullet, and if you spray it indiscriminately over your application space you can get into trouble. No amount of pretty-printing an XML file containing a SOAP message will make it look pretty to an application developer's eyes. No amount of pretty-printing a complex Ant script or a CFML script will make the conditional logic that these things often contain easy to read or easy to process programmatically.

For many applications of XML you will come across, there appears to be a better non-XML solution available. For any given data representation requirement you as an application developer/designer might have, there is a "better" syntax than XML for representing it. XML is sub-optimal for everything, or so it sometimes seems.

In my opinion, that is not a weakness of XML; it is a key strength, and one that pays significant dividends. However, it must be used wisely to be effective.

The most important thing is to ensure that you use XML to solve the parsing problems that you do not want to take on yourself. Tagging data can really cut down on the amount of work you have to do, but only if the tags are in the right places. For example, the following does not really help you process a name in your application:


<name>Sean Mc Grath</name>


The problem is that your application must still do the tricky part: splitting the first name from the second name. This would be much more useful:


<name> <first>Sean</first> <second>Mc Grath</second> </name>
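
To make the difference concrete, here is a minimal Python sketch using the standard library's xml.etree.ElementTree module. The function name read_name and the two constants are hypothetical, chosen only for illustration. The point is simply that with the tagged version the parser hands the application each part of the name, while with the flat version the application gets one string back and is left to split it on its own.

```python
import xml.etree.ElementTree as ET

# The two forms from the text (hypothetical constant names).
FLAT = "<name>Sean Mc Grath</name>"
STRUCTURED = "<name> <first>Sean</first> <second>Mc Grath</second> </name>"


def read_name(xml_text):
    """Return (first, second) if the document tags them, else (None, whole_text)."""
    root = ET.fromstring(xml_text)
    first = root.findtext("first")
    second = root.findtext("second")
    if first is None or second is None:
        # Flat form: the parser cannot know where one name stops and the
        # next begins, so all it can give us is the undivided string.
        return None, (root.text or "").strip()
    # Structured form: the tags have already done the splitting.
    return first.strip(), second.strip()


if __name__ == "__main__":
    print(read_name(STRUCTURED))
    print(read_name(FLAT))
```

Guessing at the split in the flat case (on the first space, say) happens to work for this name and fails for plenty of others, which is exactly the parsing problem the tags should be solving.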


The second most important thing is to avoid complicating your life with complicated XML processing APIs. There are times when you absolutely must use event-oriented parsing techniques like SAX, but most of the time you don't. Life is much easier if you load up the XML document and "walk" it node-by-node, or "pull" parts of it, token by token, using pull parsing.
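
As a rough sketch of the two easier styles, again in Python with the standard library: walk loads the whole document and visits it node by node, while pull asks the parser for one token at a time via ElementTree's iterparse. The function names and the sample document are assumptions made for this example, and other languages offer their own equivalents of both styles.

```python
import io
import xml.etree.ElementTree as ET

# Sample document (hypothetical, reusing the name example).
DOC = "<name><first>Sean</first><second>Mc Grath</second></name>"


def walk(elem, depth=0):
    """Tree style: the document is already loaded; visit it node by node."""
    text = (elem.text or "").strip()
    label = elem.tag + (": " + text if text else "")
    lines = ["  " * depth + label]
    for child in elem:
        lines.extend(walk(child, depth + 1))
    return lines


def pull(xml_text):
    """Pull style: the application asks for the next token when it wants one."""
    stream = io.BytesIO(xml_text.encode("utf-8"))
    tokens = []
    for event, elem in ET.iterparse(stream, events=("start", "end")):
        tokens.append((event, elem.tag))
    return tokens


if __name__ == "__main__":
    print("\n".join(walk(ET.fromstring(DOC))))
    for token in pull(DOC):
        print(token)
```

In both cases the application code drives the loop, rather than registering callbacks and keeping track of state between them as a SAX handler must.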

The third most important thing to do is to consider internationalization. Now if you are happy to live in a US-ASCII world, this doesn't apply to you but for everyone else, listen up. Detecting and properly handling character encoding is hard and ugly. XML - for all its sub-optimality - provides a workable framework in which to handle character
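
On the encoding point, a small Python sketch may help. It assumes two hypothetical documents that carry the same name, one stored as ISO-8859-1 and one as UTF-8, each announcing its encoding in the XML declaration. The name "Seán" and the function name first_name are invented for illustration. The parser reads the declaration and decodes the bytes, so the application sees the same text either way.

```python
import xml.etree.ElementTree as ET

# "Se\u00e1n" is "Sean" with an acute accent on the "a" (illustrative data only).
BODY = "<name><first>Se\u00e1n</first></name>"

# Same content, two different byte encodings, each declared in the document.
LATIN1_DOC = ('<?xml version="1.0" encoding="ISO-8859-1"?>' + BODY).encode("iso-8859-1")
UTF8_DOC = ('<?xml version="1.0" encoding="UTF-8"?>' + BODY).encode("utf-8")


def first_name(xml_bytes):
    """Parse raw bytes; the parser picks the decoder from the XML declaration."""
    return ET.fromstring(xml_bytes).findtext("first")


if __name__ == "__main__":
    # The bytes on disk differ, but the decoded text is identical.
    print(LATIN1_DOC != UTF8_DOC)
    print(first_name(LATIN1_DOC) == first_name(UTF8_DOC))
```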
