RDF for the beginner / Various data models compared with RDF (written for beginners!)
Relational database management systems (RDBMS) are the most common way to store data. Well-known examples are MySQL and Oracle. In these databases, the data is stored in tables. Each table can have a virtually unlimited number of columns and rows, and each data entry is stored as a new row. Here is a typical example:
id | firstName | familyName | age | parent | sibling | child | goesToEvent |
1 | Salomé | Landais | 9 | 3,4 | 2 | 11 | |
2 | Isaac | Landais | 5 | 3,4 | 1 | ||
3 | Eric | Landais | 34 | 1,2 | 5 | ||
4 | Olga | Landais | 35 | 1,2 | 5,8 | ||
5 | Boris | Todorov | 42 | 9 | 4 | 8 | |
... | ... | ... | ... | ... | ... | ... | ... |
This table stores the name and age of people, as well as how they are related. Each person has an identifier that is unique within this table, which makes it possible to create relationships between the entries of the table. By convention, the last column refers to rows of another table that stores events, each event also having an identifier that is unique within the events table. Identifiers that point to entries in another table are called foreign keys.
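To make the idea of a foreign key concrete, here is a minimal sketch using Python's built-in `sqlite3` module. It is only an illustration under stated assumptions: the `people` and `events` table names, the reduced set of columns, the event id `11`, and the event name are all made up for this example and are not taken from a real schema.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when explicitly asked to.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
    CREATE TABLE events (
        id   INTEGER PRIMARY KEY,
        name TEXT
    );
    CREATE TABLE people (
        id          INTEGER PRIMARY KEY,
        firstName   TEXT,
        familyName  TEXT,
        goesToEvent INTEGER REFERENCES events(id)  -- foreign key
    );
    -- Hypothetical sample data.
    INSERT INTO events VALUES (11, 'Birthday party');
    INSERT INTO people VALUES (1, 'Salomé', 'Landais', 11);
""")

# Follow the foreign key from the people table to the events table.
row = conn.execute("""
    SELECT people.firstName, events.name
    FROM people
    JOIN events ON people.goesToEvent = events.id
    WHERE people.id = 1
""").fetchone()

print(row)
```

The value stored in `goesToEvent` only has a meaning for someone who knows that it must be looked up in the `events` table of this particular database.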
When it comes to interlinking data, we run into problems that are inherent to the way relational databases are meant to work.
Unlike the Web of documents, which relies on standards to present information (HTML, a W3C standard) and to transport it (HTTP, an IETF standard), relational databases are not supported by standard vocabularies or models.
If company Y wants to exchange data with company X, they cannot do so easily, even if they use the same database vendor, because they probably use different column headers and made different choices about how to distribute the data across tables.
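The mismatch can be sketched in a few lines of Python. Everything here is hypothetical: the column names chosen by company Y, the `Y_TO_X` mapping, and the `translate` helper exist only for the sake of the illustration.

```python
# The same person as exported by two companies (hypothetical column names).
company_x_row = {"firstName": "Eric", "familyName": "Landais", "age": 34}
company_y_row = {"first_name": "Eric", "last_name": "Landais", "years": 34}

# Without a shared vocabulary, someone has to write this mapping by hand,
# and a new one is needed for every pair of organizations.
Y_TO_X = {"first_name": "firstName", "last_name": "familyName", "years": "age"}


def translate(row, mapping):
    """Rename the columns of a row according to a hand-written mapping."""
    return {mapping[column]: value for column, value in row.items()}


translated = translate(company_y_row, Y_TO_X)
print(translated == company_x_row)
```

The mapping works, but it encodes knowledge that lives outside both databases: nothing in the data itself says that `years` and `age` mean the same thing.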
Finally, unique identifiers are only unique within a limited scope. In the worst case they are unique within a single table; in the best case they are unique within a circle of organizations that agreed on an identification system. This means that when the data is merged with another data set, there is a risk of ending up with data entries that have the same identifier, which results in data clashes. To get a feel for this situation, imagine two cars in the same country having the same license plate, or two SIM cards being bound to the same phone number.
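The clash is easy to reproduce. The sketch below assumes two small data sets keyed by a local numeric identifier; the second data set, in which identifier 1 is assigned to a different person, is invented for the example.

```python
# Two data sets that each number their entries starting from 1.
people_x = {1: "Salomé Landais", 2: "Isaac Landais"}
people_y = {1: "Boris Todorov"}  # hypothetical second data set

# A naive merge: later entries overwrite earlier ones with the same key.
merged = {**people_x, **people_y}

# Identifiers that exist in both data sets and therefore clash.
clashes = sorted(people_x.keys() & people_y.keys())

print(merged[1])  # the entry from people_x has been silently replaced
print(clashes)
```

Nothing warns us that an entry was lost: the identifier `1` was valid in both data sets, just not across them.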
On the Web of documents, we have robust worldwide identifiers: Uniform Resource Locators (URLs). If you visit https://github.com/ColinMaudry/dita-rdf, thanks to standards, you are certain of a couple of things: