XML is about the simplest thing around. How can you screw it up?
I'm writing some metadata-extraction-from-image code and started looking at Adobe's XMP.
XMP is written using the W3C's RDF spec - [I almost wrote 'RDF framework', but that would then be 'Resource Description Framework framework' (RDF2), which might not be the same thing].
RDF defines a 'Framework' for writing machine-parseable statements of the form:
Tim has a bike.
- Tim the subject
- has the predicate
- bike the object
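For what it's worth, the subject/predicate/object idea itself is tiny. A minimal sketch in Python, assuming nothing beyond the standard library (the `Triple` class and its field names are made up for illustration, not taken from any RDF library):

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement: subject, predicate, object (hypothetical helper type)."""
    subject: str
    predicate: str
    obj: str  # 'object' would shadow the Python builtin, so use 'obj'


# "Tim has a bike." broken into its three parts
statement = Triple(subject="Tim", predicate="has", obj="bike")

# Rebuild the English sentence from the parts
sentence = f"{statement.subject} {statement.predicate} a {statement.obj}."
print(sentence)
```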
Here's an example from the RDF Primer. It says: 'http://www.example.org was created on August 16, 1999'
<exterms:creation-date>August 16, 1999</exterms:creation-date>
The RDF is 5.9 times LONGER than the English Language Sentence. That means about 83% of the text is overhead. Or, to put it another way, a BANDWIDTH UTILIZATION of about 17%.
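The arithmetic behind those percentages, as a quick sketch. The only input is the 5.9x figure quoted above; the variable names are mine:

```python
# Length of the RDF relative to the plain English sentence (figure from the text)
ratio = 5.9

# Fraction of the RDF that carries the actual sentence
utilization = 1 / ratio

# Everything else is markup
overhead = 1 - utilization

# How much the text grew, relative to the original sentence
growth = ratio - 1

print(f"utilization: {utilization:.0%}")
print(f"overhead:    {overhead:.0%}")
print(f"growth:      {growth:.0%}")
```

Note that the 83% is the share of the RDF that is overhead. The growth relative to the original sentence is far larger, since 5.9 times the length means 4.9 extra sentences' worth of text.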
For What Gain?
Nothing. And it takes them 6 LONG RFC-style documents to define this mess. That's 6 Long, Boring documents with much repetition and pedantic phrasing with many MUSTs and SHALLs and MAYs.
But it can be parsed by a machine - if you can understand the spec well enough to write the code.
Why take something so simple and make it so incomprehensibly complex?
But - Believe it or Not - I digress.
XMP is written using RDF [why re-invent the wheel when you can use somebody else's debacle and make it worse].
I'm not going to get into XMP - but at first glance it looks like they're using attributes for data, XML elements for data, and RDF nested structures for data - with NO obvious logic as to when and where these choices are made. To further mess it up, everything uses XML namespaces - which my XML parser translates back to URIs [which point to nothing, but are long and look cool], like a 'good parser should' - which obfuscates the already obfuscated and bulks out the fluff-to-content ratio admirably.
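To show what that namespace expansion looks like in practice, here's a small sketch using Python's standard `xml.etree.ElementTree`. The `exterms` prefix comes from the fragment above; the URI I bind it to is an assumption for illustration, and your parser may present expanded names differently:

```python
import xml.etree.ElementTree as ET

# ASSUMPTION: the namespace URI below is illustrative only.
doc = (
    '<exterms:creation-date xmlns:exterms="http://www.example.org/terms/">'
    'August 16, 1999'
    '</exterms:creation-date>'
)

elem = ET.fromstring(doc)

# ElementTree replaces the short prefix with the full URI in braces
print(elem.tag)   # {http://www.example.org/terms/}creation-date
print(elem.text)  # August 16, 1999

# To look anything up you have to spell the URI out again (or pass a prefix map)
ns = {"exterms": "http://www.example.org/terms/"}
wrapper = ET.fromstring(f"<root>{doc}</root>")
found = wrapper.find("exterms:creation-date", ns)
print(found.text)
```

A nine-character tag name turns into a forty-odd-character one, which is the bulking-out I'm complaining about.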
Here's how I think XML was intended to encapsulate a website's creation date:
<site-info site_name="www.example.com">
<creation-date>August 16, 1999</creation-date>
</site-info>
It's machine-parseable. It's (almost) human readable. It only wastes about 50% of the bandwidth - as opposed to about 83% using RDF.
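Reading the plain XML version takes a handful of lines with a stock parser. A sketch using Python's standard `xml.etree.ElementTree`, assuming the snippet is closed with a matching `</site-info>` tag:

```python
import xml.etree.ElementTree as ET

# The plain XML from the post, with the closing tag assumed
xml_text = (
    '<site-info site_name="www.example.com">'
    '<creation-date>August 16, 1999</creation-date>'
    '</site-info>'
)

root = ET.fromstring(xml_text)

# Attribute lookup: no namespaces, no URIs
site = root.get("site_name")

# Child element text lookup by plain tag name
created = root.findtext("creation-date")

print(f"{site} was created on {created}")
```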
How about JSON - where the entire spec fits on one web page:
"creation_date": "August 16, 1999"