In which we rant about the LACK of XML

Yesterday, we were chatting with Captain Telescope about development, XML, and how ugly and misused the latter can be. Frankly, in our experience it’s misused more often than not. XML+XSLT can be a real boon for some applications, but there’s a tendency among some to store Every. Damn. Thing. in XML, and there’s really no good reason for that. In some situations, a five-line pure-text “unix-style” config file is exactly what you need, not a stanza-filled XML abomination. In fact, even something as complex as an Apache config file would probably only suffer if converted to XML; as it is, it’s fairly clear if you know what you’re doing, and if you don’t, you have no business in the config file.

Likewise, XML ought never be a persistent data store for anything you’re going to read and write repeatedly. (Yes, we’ve really heard people suggest this.) XML is a way to move data around; it’s a great lingua franca for shifting data formats. XSLT allows the (relatively) easy transformation of XML into damn near anything else you want, which is awesome. Using an XML file or files as your database, though, is just fucking stupid in a world where wholly reasonable RDBMS tools abound at the “free” price point.

HOWEVER, today we find a perfect example of something that really, really, really needs some XML love. We’re working with [Nameless Government Entity] on some supply-chain issues, and one element of these transactions is something called an Advance Shipping Notification. An ASN is an electronic document transmitted to the recipient of a given shipment of goods; you send it on ahead of the shipment so that [NGE] knows that your shipment of widgets, catfish jerky, and whiskey is on its merry way (and how much of each is coming, and who it’s from, and all that goodness).

These ASN documents can be formatted in one of two ways, for the most part. Both formats look like what happens when Heathen Central’s Chief Feline Officer takes a shortcut across our keyboard; here’s an example from the better, more legible of the two:


START*1^
A*AFVendor11^
B*COMBO^
1*GS03F04702^FA940105F9126^20060104^^
2*STUC0001^20060115^^N^
3*SPL^
4*^^^

… and so forth for several dozen lines. Lovely, huh? Naturally, there’s no documentation at all in the file itself (we have a 96-page Word document for that, and of course it’s rife with additions and exceptions to otherwise inviolate rules). It’s exceeded in the “meaningful data most resembling line noise” competition only by certain Perl idioms, for crying out loud.

In this instance, at least, we’d kill for an XML alternative. The accessibility implications would be huge, especially in a world where many, many people are going to be creating these files in the next 6-18 months. Like, say, this one.
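For the curious, here's a rough Python sketch of what even a dumb, mechanical XML rendering of the sample above might look like. To be loud about the assumptions: we're guessing that `*` separates the record ID from its fields and that `^` terminates each field, and the element names (`asn`, `record`, `field`) are entirely made up for illustration. They're not part of any real ASN schema, and the actual meaning of each field lives in that 96-page Word document, not here.

```python
import xml.etree.ElementTree as ET

# The sample lines from above, verbatim.
SAMPLE = """\
START*1^
A*AFVendor11^
B*COMBO^
1*GS03F04702^FA940105F9126^20060104^^
2*STUC0001^20060115^^N^
3*SPL^
4*^^^
"""


def parse_line(line):
    """Split 'ID*f1^f2^' into (record_id, [fields]).

    Assumption: '*' follows the record ID and every field ends with '^',
    so 'START*1^' has one field and '4*^^^' has three empty ones.
    """
    record_id, _, rest = line.partition("*")
    if rest.endswith("^"):
        fields = rest[:-1].split("^")  # drop the final terminator, then split
    else:
        fields = rest.split("^") if rest else []
    return record_id, fields


def asn_to_xml(text):
    """Build a generic XML tree; element names are hypothetical placeholders."""
    root = ET.Element("asn")
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        record_id, fields = parse_line(line)
        record = ET.SubElement(root, "record", id=record_id)
        for position, value in enumerate(fields, start=1):
            field = ET.SubElement(record, "field", n=str(position))
            field.text = value
    return root


if __name__ == "__main__":
    tree = asn_to_xml(SAMPLE)
    ET.indent(tree)  # pretty-printing helper, Python 3.9+
    print(ET.tostring(tree, encoding="unicode"))
```

Note that this buys structure, not meaning: a numbered `field` is no more self-documenting than a caret. The real win would come from someone who owns the spec swapping those placeholders for actual names.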
