Ontologies in XHTML/RDFa

For now this is simply a place to store my notes and remarks on this subject. I am still looking for a proper notation, so much will change here in the near future.

Ontology: a scheme for things and classes

Ontologies are mainly about two concepts: things and classes. Things are, for the most part, the concrete objects we know from daily life. Classes are abstract terms to which things can be related.

Example

Take the term "book". At first glance you might think this is a thing. It is not, because you could ask: which book? So let's take a specific book, one with the title "Harry Potter and the Philosopher's Stone" and an author named "Joanne K. Rowling". This is indeed a "thing" in the ontological sense. This thing belongs to the class "book".

For now I'll skip the fact that this thing has many attributes such as author, title, ISBN and the like. I simply take the title to be the thing (or the representation of the thing). So now I'll try to write that down in OWL:

Class

<owl:Class rdf:ID="book"/>

Here I define a class named "book". This is taken directly from the Web Ontology Language Guide of the W3C and should be quite straightforward.

Thing

Now the concrete individual, the book "Harry_Potter_and_the_philosophers_stone".

<owl:Thing rdf:ID="Harry_Potter_and_the_philosophers_stone"/>
<owl:Thing rdf:about="#Harry_Potter_and_the_philosophers_stone">
  <rdf:type rdf:resource="#book"/>
</owl:Thing>

The first line names a thing "Harry_Potter_and_the_philosophers_stone". The other three state that this thing is of class "book". This example is derived from the Web Ontology Language Guide, too.
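The two snippets above leave out the surrounding RDF/XML document. As a minimal sketch, assuming a hypothetical base URL (http://example.org/books is a placeholder, not part of the original notes), the class and the individual could be combined into one file like this. The two owl:Thing elements are merged into one here, which should state the same thing.

```xml
<?xml version="1.0"?>
<!-- Sketch only: the xml:base value is a made-up placeholder. -->
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xml:base="http://example.org/books">

  <!-- The class "book" -->
  <owl:Class rdf:ID="book"/>

  <!-- The individual, declared and typed as a book in one element -->
  <owl:Thing rdf:ID="Harry_Potter_and_the_philosophers_stone">
    <rdf:type rdf:resource="#book"/>
  </owl:Thing>

</rdf:RDF>
```

With this base, rdf:ID="book" would resolve to http://example.org/books#book, which is what rdf:resource="#book" then points to.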

HTML

This relationship can then be modeled in HTML. I'll start with the class "book". Since this is a definition, the dfn element is the natural choice:

<dfn class="Class" id="book">book</dfn>

It's that simple. It would be possible to construct a new microformat out of that. Up to now this has nothing to do with RDFa, though. This definition may appear anywhere within a web page. The attribute class="Class" says that the content of the dfn container is a class. Together with the element itself this forms a class definition. The name of the class is stated in the id attribute. The content of the container is not the class name but a human-readable "label".

The whole thing simply says that the thing named "book" is a class. As said above, a class is something to which concrete individuals, so-called "things", are related. So let's define such a thing:

<dfn class="Thing" id="Harry_Potter_and_the_philosophers_stone">Harry Potter and the philosophers stone</dfn>

This does the definition. But there is still something missing: The relationship to the class. Relations (and their references) are marked up in HTML using the anchor (a) element and the attributes rel and rev. So the next thing is to surround the definition with such an anchor. Since the relationship should not say something about the link target (we already know this is a class definition), but instead about the link source, we need the rev attribute rather than the rel attribute. The complete definition then reads like this:

<a href="#book" rev="Thing">
  <dfn class="Thing" id="Harry_Potter_and_the_philosophers_stone">Harry Potter and the philosophers stone</dfn>
</a>

As can be seen, the attribute class="Thing" of the dfn element is redundant and thus unnecessary. If only the relationship is needed, the dfn element could be omitted altogether. Since this is meant to be a definition, though, I'll keep the dfn. Compacted, the above code reads like this:

<a href="#book" rev="Thing">
  <dfn id="Harry_Potter_and_the_philosophers_stone">Harry Potter and the philosophers stone</dfn>
</a>

This now says that a thing named "Harry_Potter_and_the_philosophers_stone" is of class "book". Or, to put it more plainly: this thing is a book.
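To show how the two definitions might sit together on one page, here is a minimal sketch of a complete XHTML 1.0 Strict document. The page title and the surrounding sentences are invented for illustration; only the dfn and a markup comes from the notes above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Books</title> <!-- placeholder title -->
  </head>
  <body>
    <!-- Class definition: the id carries the name, the content is the label -->
    <p>A <dfn class="Class" id="book">book</dfn> is a written work.</p>

    <!-- Thing definition: rev describes the link source, href points at the class -->
    <p>
      <a href="#book" rev="Thing">
        <dfn id="Harry_Potter_and_the_philosophers_stone">Harry Potter and the philosophers stone</dfn>
      </a>
      is great.
    </p>
  </body>
</html>
```

Both a and dfn are inline elements, so in a Strict document they have to sit inside a block element such as p.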

RDFa

Well, the class="Class" and class="Thing" attributes were just invented here. I could set up web pages describing these classes and reference them as a profile in the head element. But it is possible to include many different profiles, so sooner or later I'd face naming conflicts. In XML this is handled with namespaces, and RDFa uses exactly that: the namespaces are chosen with a namespace declaration, and the class and thing identifiers are prefixed with the namespace prefix. Obviously this is only possible with XHTML. In this simple form it is already usable with XHTML 1.0, where it reads something like this (namespace declaration skipped):

<dfn class="owl:Class" id="book">book</dfn>
[...]
<a href="#book" rev="owl:Thing">
  <dfn id="Harry_Potter_and_the_philosophers_stone">Harry Potter and the philosophers stone</dfn>
</a>
[...]
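The namespace declaration that was skipped above would typically go on the root element. As a sketch, binding the owl prefix to the OWL namespace URI could look like this (the title and surrounding text are placeholders):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:owl="http://www.w3.org/2002/07/owl#">
  <!-- xmlns:owl binds the prefix used in class="owl:Class" and rev="owl:Thing" -->
  <head>
    <title>Books</title>
  </head>
  <body>
    <p>A <dfn class="owl:Class" id="book">book</dfn> is a written work.</p>
    <p>
      <a href="#book" rev="owl:Thing">
        <dfn id="Harry_Potter_and_the_philosophers_stone">Harry Potter and the philosophers stone</dfn>
      </a>
    </p>
  </body>
</html>
```

The document stays well-formed XML, but a validator working against the plain XHTML 1.0 DTD may complain about the extra xmlns:owl attribute, since that DTD does not declare it.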

Please note that the content of the href attribute is just a normal URL. In this case I used only the so-called fragment identifier, which means that both definitions have to be on the same page. Using complete URLs makes it possible to have the definitions located anywhere in the world.
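As a sketch of the cross-page case, the class definition could live on one page and the thing on another. Both URLs below are hypothetical placeholders chosen for illustration.

```xml
<!-- Fragment from http://example.org/ontology.html (hypothetical): the class -->
<dfn class="owl:Class" id="book">book</dfn>

<!-- Fragment from http://example.org/library.html (hypothetical): the thing,
     pointing at the class with a complete URL instead of a bare fragment -->
<a href="http://example.org/ontology.html#book" rev="owl:Thing">
  <dfn id="Harry_Potter_and_the_philosophers_stone">Harry Potter and the philosophers stone</dfn>
</a>
```

Only the href changes; the fragment part (#book) still has to match the id of the class definition on the target page.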

It is not always desirable or useful to construct this relationship as a clickable link, for example if both definitions are on the same page. To still express the relationship you need something more sophisticated from RDFa, namely some features that are planned for XHTML 2.0. Because these features are so useful, the W3C has published an intermediate specification. To use it we first need a different DOCTYPE:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.0//EN" "http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd">

Then you can take advantage of two things: the rev attribute can be used on any element, including dfn, and there is an attribute named resource. Applying these results in:

<dfn id="Harry_Potter_and_the_philosophers_stone" rev="owl:Thing" resource="#book">Harry Potter and the philosophers stone</dfn>

With a simple dfn we have now defined that this thing is a book. This definition is machine-readable. Human readers might know that without this definition. The computer, however, lacks all our implicit human knowledge; for computers everything has to be stated explicitly. For humans this simply reads like this:

The book Harry Potter and the philosophers stone is great.
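Putting the pieces together, a complete XHTML+RDFa page might look roughly like the following sketch. The title and the sentence defining "book" are placeholders; the DOCTYPE, the dfn markup and the rendered sentence are the ones from these notes.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.0//EN"
    "http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:owl="http://www.w3.org/2002/07/owl#"
      version="XHTML+RDFa 1.0">
  <head>
    <title>Books</title> <!-- placeholder title -->
  </head>
  <body>
    <!-- The class definition -->
    <p>A <dfn class="owl:Class" id="book">book</dfn> is a written work.</p>

    <!-- The thing, related to the class without a clickable link -->
    <p>The book
      <dfn id="Harry_Potter_and_the_philosophers_stone"
           rev="owl:Thing" resource="#book">Harry Potter and the philosophers stone</dfn>
      is great.</p>
  </body>
</html>
```

One caveat, as far as I understand the RDFa 1.0 processing rules: a processor takes the subject of a statement from an about attribute (or from the enclosing context) rather than from id, and class membership is conventionally expressed with rdf:type as the predicate. A variant such as about="#Harry_Potter_and_the_philosophers_stone" rel="rdf:type" resource="#book", with the rdf prefix declared as well, may therefore be closer to what a generic RDFa parser expects.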