XML is about the simplest thing around. How can you screw it up?
I'm writing some metadata-extraction-from-image code and started looking at Adobe's XMP.
XMP is written using the W3C's RDF spec - [I almost wrote 'RDF framework', but that would then be Resource Description Framework framework (RDF2), which might not be the same thing].
RDF defines a 'Framework' for writing machine-parseable statements of the form:
Tim has a bike.
RDF calls:
- Tim the subject
- has the predicate
- bike the object
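A minimal sketch of that three-part statement as a data structure, assuming Python. The `Triple` type and its field names are made up for illustration and do not come from any RDF library:

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF-style statement: subject, predicate, object."""
    subject: str
    predicate: str
    object: str


# "Tim has a bike." broken into the three parts RDF names.
statement = Triple(subject="Tim", predicate="has", object="bike")

print(statement.subject)    # Tim
print(statement.predicate)  # has
print(statement.object)     # bike
```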
Here's an example from the RDF Primer. It says: 'http://www.example.org was created on August 16, 1999'
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:exterms="http://www.example.org/terms/">
  <rdf:Description rdf:about="http://www.example.org/index.html">
    <exterms:creation-date>August 16, 1999</exterms:creation-date>
  </rdf:Description>
</rdf:RDF>
The RDF is 5.9 times LONGER than the English Language Sentence. That means about 83% of the text is overhead. Or, to put it another way, a BANDWIDTH UTILIZATION of about 17%.
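The arithmetic behind those two percentages, as a small Python sketch. It takes the 5.9 ratio quoted above as given rather than re-measuring the text, and the function name `overhead_fraction` is just an illustrative choice:

```python
def overhead_fraction(expansion_ratio: float) -> float:
    """Fraction of the encoded text that is not payload.

    expansion_ratio is the encoded length divided by the payload length.
    """
    return 1.0 - 1.0 / expansion_ratio


ratio = 5.9  # figure quoted in the post: RDF length / sentence length
overhead = overhead_fraction(ratio)
utilization = 1.0 - overhead

print(f"overhead:    {overhead:.0%}")     # overhead:    83%
print(f"utilization: {utilization:.0%}")  # utilization: 17%
```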
For What Gain?
Nothing. And it takes them 6 LONG RFC-style documents to define this mess. That's 6 Long, Boring documents with much repetition and pedantic phrasing with many MUSTs and SHALLs and MAYs.
But it can be parsed by a machine - if you can understand the spec well enough to write the code.
Why take something so simple and make it so incomprehensibly complex?
But - Believe It or Not - I digress.
XMP is written using RDF [why re-invent the wheel when you can use somebody else's debacle and make it worse].
I'm not going to get into XMP - but at first glance it looks like they're using attributes for data, XML elements for data, and RDF nested structures for data - with NO obvious logic as to when and where these choices are made. To further mess it up, everything uses XML namespaces - which my XML parser translates back to URIs [which point to nothing, but are long and look cool], like a 'good parser should' - which obfuscates the already obfuscated and bulks out the fluff-to-content ratio admirably.
Yech!!!!!!!
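To see the namespace expansion in action, here is a hedged sketch using Python's standard `xml.etree.ElementTree`. The parser used in the post isn't named, so ElementTree is only a stand-in that happens to do the same prefix-to-URI translation. The input is the RDF Primer example from earlier:

```python
import xml.etree.ElementTree as ET

RDF_EXAMPLE = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:exterms="http://www.example.org/terms/">
  <rdf:Description rdf:about="http://www.example.org/index.html">
    <exterms:creation-date>August 16, 1999</exterms:creation-date>
  </rdf:Description>
</rdf:RDF>
"""

root = ET.fromstring(RDF_EXAMPLE)

# Every prefixed name comes back with the full namespace URI in braces,
# e.g. "rdf:RDF" becomes "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}RDF".
for element in root.iter():
    print(element.tag)

# Attribute names are expanded the same way.
description = root[0]
print(list(description.attrib))
```

The short `rdf:` and `exterms:` prefixes are gone by the time the code sees the tree, so every lookup has to spell out the long URI form.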
Here's how I think XML intended to encapsulate a website's creation date:
<site-info site_name="www.example.com">
  <creation-date>August 16, 1999</creation-date>
</site-info>
It's machine parse-able. It's (almost) human readable. It only wastes 50% of the bandwidth - as opposed to 83% using RDF.
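As a rough check on the "machine parse-able" claim, the plain XML version above can be read in a few lines. This is a sketch assuming Python's standard `xml.etree.ElementTree`, and the variable names are illustrative:

```python
import xml.etree.ElementTree as ET

SITE_INFO = """\
<site-info site_name="www.example.com">
  <creation-date>August 16, 1999</creation-date>
</site-info>
"""

root = ET.fromstring(SITE_INFO)

site_name = root.get("site_name")               # attribute on the root element
creation_date = root.findtext("creation-date")  # text of the child element

print(site_name)      # www.example.com
print(creation_date)  # August 16, 1999
```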
How about JSON - where the entire spec fits on one web page:
{
  "site_info": {
    "site": "www.example.org",
    "creation_date": "August 16, 1999"
  }
}
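Reading the JSON version takes even less ceremony. A minimal sketch, assuming Python's standard `json` module:

```python
import json

SITE_INFO_JSON = """
{
  "site_info": {
    "site": "www.example.org",
    "creation_date": "August 16, 1999"
  }
}
"""

# json.loads turns the document straight into nested dicts and strings.
data = json.loads(SITE_INFO_JSON)
site_info = data["site_info"]

print(site_info["site"])           # www.example.org
print(site_info["creation_date"])  # August 16, 1999
```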