RDF for the beginner / Various data models compared with RDF (written for beginners!)
The Extensible Markup Language is a standard text format that was published by the W3C in 1998 and updated several times.
Its strengths are the following:
XML is based on a tree model. Here is the same family data, but in XML:
<families>
  <family familyName="Landais">
    <parents>
      <father id="3">
        <firstName>Eric</firstName>
        <age>34</age>
        <goesToEvents>
          <event href="events.xml#5"/>
        </goesToEvents>
      </father>
      <mother id="4">
        <firstName>Olga</firstName>
        <age>35</age>
        <goesToEvents>
          <event href="events.xml#5"/>
          <event href="events.xml#8"/>
        </goesToEvents>
      </mother>
    </parents>
    <children>
      <daughter id="2">
        <firstName>Salomé</firstName>
        <age>9</age>
        <goesToEvents>
          <event href="events.xml#11"/>
        </goesToEvents>
      </daughter>
      <son id="1">
        <firstName>Isaac</firstName>
        <age>5</age>
      </son>
    </children>
  </family>
  <family familyName="Todorov">
    <parents> ... </parents>
    <children>
      <daughter id="4">
        <firstName>Olga</firstName>
        <age>35</age>
        <goesToEvents>
          <event href="events.xml#5"/>
          <event href="events.xml#8"/>
        </goesToEvents>
      </daughter>
      <son id="5">
        <firstName>Boris</firstName>
        <age>42</age>
        <goesToEvents>
          <event href="events.xml#8"/>
        </goesToEvents>
      </son>
    </children>
  </family>
</families>
This is only one of many possible ways to structure this information in XML. After a lot of thought, I went for a structure that focuses on parents and children. The problem here is that each person who has children (in this case Olga) needs to be represented twice: once as a parent and once as a child.
Also, although each event is identified by a unique ID, the reader has to know how to resolve the references in order to get any information about the events. It is not self-explanatory.
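To make the point about references concrete, here is a minimal Python sketch of what "resolving" an `href` such as `events.xml#5` could involve. The article does not show `events.xml`, so its layout (an `<events>` root containing `<event id="...">` elements with a `<title>`), the placeholder titles and the helper name `resolve_event` are all assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urldefrag

# HYPOTHETICAL content for events.xml: the real file is not shown in the
# article, so this layout and these titles are placeholders.
EVENTS_XML = """
<events>
  <event id="5"><title>Placeholder event A</title></event>
  <event id="8"><title>Placeholder event B</title></event>
</events>
"""


def resolve_event(href, documents):
    """Return the <event> element an href like 'events.xml#5' points to.

    `documents` maps a file name to its parsed root element. Nothing in
    XML itself says that the part after '#' must match an `id` attribute:
    that convention is an assumption this code has to hard-wire.
    """
    file_name, fragment = urldefrag(href)
    root = documents.get(file_name)
    if root is None:
        return None
    for event in root.iter("event"):
        if event.get("id") == fragment:
            return event
    return None


documents = {"events.xml": ET.fromstring(EVENTS_XML)}
event = resolve_event("events.xml#5", documents)
print(event.findtext("title"))
```

The lookup works only because the code and the data share a convention about what the fragment means. That agreement lives outside the XML document, which is why the references are not self-explanatory.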
Storing a flat list of elements, one per person, using IDs to represent all relationships, would address the issue with duplicate entries...
<persons>
  <person id="4" parents="10,11" siblings="5" spouse="3" children="1,2">
    <firstName>Olga</firstName>
    <lastName>Landais</lastName>
    <age>35</age>
  </person>
  ...
</persons>
...but would result in two annoying issues, the consequences of using XML for a purpose it wasn't designed for:
So XML is a very versatile format but, as we just saw, its tree model doesn't facilitate cross-referencing information, because the only direct relations that XML can express between two nodes of information are the following:
The rest are the result of interpretations which require communication, agreement and extra coding effort.
For instance, using our family information, how do we find the names of Boris' nephews? It is possible, but the query we have to write is much more complex than the piece of information we are trying to retrieve. We could also restructure the document to answer this specific question, but that would bring a new set of trade-offs when answering other questions.
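As a rough illustration of that complexity, the following Python sketch looks up Boris' nephews in a trimmed copy of the parents-and-children document. It assumes that Olga carries the same `id` (4) in both places where she appears, that people can be looked up by first name, and that "nephews" means the `<son>` elements of a sibling's family. The function name `nephews_of` is made up for this example, and the Todorov parents are left out because the article elides them.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the family document (ages and events removed).
# Assumption: Olga has id="4" both as a mother and as a daughter.
FAMILIES_XML = """
<families>
  <family familyName="Landais">
    <parents>
      <father id="3"><firstName>Eric</firstName></father>
      <mother id="4"><firstName>Olga</firstName></mother>
    </parents>
    <children>
      <daughter id="2"><firstName>Salomé</firstName></daughter>
      <son id="1"><firstName>Isaac</firstName></son>
    </children>
  </family>
  <family familyName="Todorov">
    <children>
      <daughter id="4"><firstName>Olga</firstName></daughter>
      <son id="5"><firstName>Boris</firstName></son>
    </children>
  </family>
</families>
"""


def nephews_of(root, first_name):
    """Return the first names of the sons of `first_name`'s siblings."""
    nephews = []
    for family in root.findall("family"):
        children = family.find("children")
        if children is None:
            continue
        kids = list(children)
        # Step 1: is the person a child in this family?
        if not any(k.findtext("firstName") == first_name for k in kids):
            continue
        # Step 2: the other children of that family are the siblings.
        sibling_ids = {
            k.get("id") for k in kids if k.findtext("firstName") != first_name
        }
        # Step 3: find the families in which a sibling appears as a parent.
        for other in root.findall("family"):
            parents = other.find("parents")
            if parents is None:
                continue
            if any(p.get("id") in sibling_ids for p in parents):
                # Step 4: the sons of those families are the nephews.
                nephews.extend(
                    son.findtext("firstName")
                    for son in other.findall("children/son")
                )
    return nephews


root = ET.fromstring(FAMILIES_XML)
print(nephews_of(root, "Boris"))  # prints ['Isaac']
```

Four nested steps are needed to retrieve a single name, and every one of them depends on a convention (matching IDs, the meaning of `<parents>` and `<children>`) that the code has to know about in advance.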