Why I Want RIF
July 9, 2009
RIF is done, more or less.
When I say “done”, I don’t mean “done” like toast in the toaster is done, when it’s just perfectly crunchy, without quite being dry or hard. And I don’t mean “done” like a hacking project which is done at that precise moment when it stops being more fascinating than sleep or food or sunshine. No, RIF is “done” like a term paper, the night before it’s due. It meets the requirements, more or less, and the time has come to ship it.
Of course, the W3C process favors quality over speed, so instead of turning it in and walking away, we’ll have to do several rewrites, to address the teacher’s comments. In this case, we have at least three rounds of that. In the first round (called “last call”, which started last friday), the “teacher” is anyone who feels like reading and commenting. Then comes “candidate recommendation”, when we try to get everyone to implement it and give us comments as they do. (This is where OWL 2 is now). Finally, we’ll ask for a high level review from all W3C member organizations, as they decide whether to promote it from Proposed Recommendation to a full W3C Recommendation.
But still, it’s done like that term paper. It’s turned in, and now we wait for the review comments.
So what is RIF good for, anyway?
The consensus, Working Group answer is 26 pages long and rather in need of some polish, so here’s my short answer. Here’s why I’ve spent the last five years working on it. (No need to cry for my lost youth, I did some other fun things during that time, too.)
We need RIF so that we don’t need standards any more.
If you’ve ever tried to use FOAF (arguably the most popular Semantic Web vocabulary), you may have noticed a little problem with representing names. Am I:
- [ foaf:firstName "Sandro"; foaf:surname "Hawke"], or
- [ foaf:givenname "Sandro"; foaf:family_name "Hawke" ], or just plain
- [ foaf:name "Sandro Hawke" ] ?
Who knows? How can anyone decide? It’s a mess.
And, of course, this problem is repeated everywhere. Every ontology has its share of coin-flip design decision — decisions where you have no overwhelming engineering reason to make one choice over another. And every problem space has, or will soon have, a vast array of ontologies addressing it from many slightly-different angles.
I want data providers to publish using whatever ontology they know and love.
I want data consumers to consume (use) data in whatever ontology they know and love.
I expect RIF to be the glue in the middle, behind the scenes, in a fuzzy ball of linked-data rule-engine goodness.
Imagine Jos publishes using foaf:firstName and foaf:surname. Imagine Chris publishes using foaf:givenname and foaf:family_name. Imagine Gary writes an app which looks for foaf:name data. As long as the right RIF rules are present on the Web, in the right places, this should work. People using Gary’s app should see the data from Jos and the data from Chris, even though Gary never knew or cared about the vocabularies they used.
Of course, there’s some question as to what those rules should say. In the US, the givenname and the firstName can be treated as the same. Meanwhile, in Japan, the family_name is the firstName! And if you try to split a name back into firstName and surname, do you use the last space (as in “Sarah Jessica Parker”) or the first space (as in “Hillary Rodham Clinton”)?
I think the solution is to accept that there may be multiple rulesets, suitable for different purposes, and they may not be perfect. I explored this space to some degree in a different context: XTAN associates some “impact” (which might be called “semantic damage”) with each transformation (or ruleset). I think that can work.
So, there are still some details to work out. I’m not presenting the solution here; I’m just explaining why RIF interests me. Now you know.
(And no, I don’t really think this will entirely obviate the need for standards, but I think it will significantly reduce that need, taking some pressure off, and shifting some more work to the machines.)