To that end, I began poking and prodding the RDF landscape again to see if any viable products had popped up. Since we're an Oracle shop, the first thing I did was try Oracle 10g, with its support for RDF on top of Oracle Spatial and its Network Data Model.
After a few days of playing with it, here's what I know. It allows me to store RDF as triples directly in the database. It does this by providing a set of procedures and views that extend their Oracle Spatial functionality. Oracle Network Data Model already knows how to store graphs, so it was a logical extension to support RDF (which is a directed graph).
There are procedures provided to search the RDF models with a language that looks a bit like SPARQL. However, only the graph pattern matching parts, such as (?x :parentOf ?y), are available. For filtering on literals, as in (where ?y like 'seth'), you use straight-up SQL WHERE clauses. The search is pretty nice because you can join your search against standard SQL tables, thus combining your triple model and your relational models.
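To make the split between pattern matching and literal filtering concrete, here is a minimal sketch in plain Python. It is a toy in-memory matcher, not Oracle's API; the match function, the sample triples, and the names in them are all made up for illustration.

```python
from typing import Dict, Iterator, List, Tuple

Triple = Tuple[str, str, str]


def match(triples: List[Triple], pattern: Triple) -> Iterator[Dict[str, str]]:
    """Yield variable bindings for one (s, p, o) pattern.

    Terms starting with '?' are variables; anything else must match exactly.
    (Hypothetical helper, only meant to mimic the graph pattern step.)
    """
    for triple in triples:
        bindings: Dict[str, str] = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A repeated variable must bind to the same value each time.
                if bindings.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield bindings


# Made-up sample data.
triples = [
    (":alice", ":parentOf", ":seth"),
    (":alice", ":parentOf", ":beth"),
    (":bob", ":knows", ":seth"),
]

# Graph pattern part: (?x :parentOf ?y)
rows = list(match(triples, ("?x", ":parentOf", "?y")))

# Literal filter part, playing the role of the SQL WHERE clause.
filtered = [row for row in rows if "seth" in row["?y"]]
```

The point of the sketch is only the division of labor: the pattern produces rows of bindings, and ordinary row-level filtering (or a join against other tables) happens afterwards.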
Oracle also has rule support, in the form of rule indexes. A built-in set of rules for RDF Schema (RDFS) semantics is provided, which is a very nice touch. You are also free to create your own rules, using both query pattern matching and filters. Once your rules are created, you must run another procedure that effectively compiles a rule base into views used by searches. After your rule base is compiled and the index is created, you can reference rule bases by name when performing your searches, so that your queries take the rules into account. It should be noted that Oracle's rule indexes seem to chain, allowing you to write rules based on rules. This is very exciting stuff. I've seen this work locally and was very happy: a commercial, "off the shelf" product that supports RDF models and chained rules!
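For readers who haven't seen chained rules before, the idea can be sketched as naive forward chaining: keep applying rules until nothing new is derived, and store the derived triples separately. This is a toy illustration in Python, not a description of how Oracle builds its rule indexes; build_rule_index, the :ancestorOf rules, and the sample facts are all assumptions of mine.

```python
from typing import Dict, Iterator, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]
Rule = Tuple[List[Triple], Triple]  # (antecedent patterns, consequent pattern)


def unify(pattern: Triple, triple: Triple, env: Dict[str, str]) -> Optional[Dict[str, str]]:
    """Extend env so that pattern matches triple, or return None if it cannot."""
    env = dict(env)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            if env.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return env


def solve(patterns: List[Triple], facts: Set[Triple], env: Dict[str, str]) -> Iterator[Dict[str, str]]:
    """Yield every binding that satisfies all patterns (a simple nested-loop join)."""
    if not patterns:
        yield env
        return
    for fact in facts:
        extended = unify(patterns[0], fact, env)
        if extended is not None:
            yield from solve(patterns[1:], facts, extended)


def build_rule_index(facts: Set[Triple], rules: List[Rule]) -> Set[Triple]:
    """Apply rules to a fixed point and return only the inferred triples."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for antecedents, consequent in rules:
            # Collect bindings first so 'known' is not mutated while iterating.
            for env in list(solve(antecedents, known, {})):
                derived = tuple(env.get(term, term) for term in consequent)
                if derived not in known:
                    known.add(derived)
                    changed = True
    return known - facts


# Hypothetical rules: the second one builds on triples produced by the first.
rules: List[Rule] = [
    ([("?x", ":parentOf", "?y")], ("?x", ":ancestorOf", "?y")),
    ([("?x", ":ancestorOf", "?y"), ("?y", ":ancestorOf", "?z")], ("?x", ":ancestorOf", "?z")),
]
facts = {(":a", ":parentOf", ":b"), (":b", ":parentOf", ":c")}
inferred = build_rule_index(facts, rules)
```

Here (":a", ":ancestorOf", ":c") only appears because the second rule consumes the output of the first, which is what I mean by rules chaining.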
Now, for the downside. And frankly, I don't know how I'm going to work around this. Any time you modify the RDF model, the rule indexes become invalid. That's right, any time you add, update, or delete a triple, you must rebuild your rule indexes.
At first glance, this is a deal breaker. For all but the most read-only applications, the triple store must allow the RDF model to be modified while it is being queried and analyzed via the rules.
If anyone knows a good way around this limitation, I'd love to know. This product is poised to bring RDF into the enterprise, if only because Oracle is ubiquitous. However, I don't know if they expect the product to be used for anything but purely retrieval-oriented work.
What I'd like to figure out is how to update my RDF model without destroying my rule indexes. Even if the rule indexes can't take new triples into account until I rebuild them, that's fine with me right now.
I can't really write a trigger that will rebuild the indexes after every insertion or deletion, because I'm assuming there will be too many modifications. I don't yet know what impact constant index rebuilding will have on the system, or how it will affect in-progress queries.
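The behavior I'm after can be sketched at the application level: keep answering queries from the last set of inferred triples, count modifications, and rebuild only once enough of them have piled up. This is a toy Python model of the trade-off, not something I know Oracle to support; BatchedRuleIndex, rebuild_every, and the symmetric :knows rule are all hypothetical.

```python
from typing import Callable, Set, Tuple

Triple = Tuple[str, str, str]


class BatchedRuleIndex:
    """Serve possibly stale inferences and rebuild after a batch of changes."""

    def __init__(self, infer: Callable[[Set[Triple]], Set[Triple]], rebuild_every: int = 100):
        self.infer = infer                  # function that derives triples from the base model
        self.rebuild_every = rebuild_every  # hypothetical threshold, to be tuned
        self.base: Set[Triple] = set()
        self.inferred: Set[Triple] = set()
        self.pending = 0                    # modifications since the last rebuild
        self.rebuilds = 0

    def add(self, triple: Triple) -> None:
        if triple not in self.base:
            self.base.add(triple)
            self._changed()

    def remove(self, triple: Triple) -> None:
        if triple in self.base:
            self.base.remove(triple)
            self._changed()

    def _changed(self) -> None:
        self.pending += 1
        if self.pending >= self.rebuild_every:
            self.rebuild()

    def rebuild(self) -> None:
        self.inferred = self.infer(set(self.base))
        self.pending = 0
        self.rebuilds += 1

    def query(self) -> Set[Triple]:
        # Base triples are always current; inferred ones may lag behind.
        return self.base | self.inferred


def symmetric_knows(base: Set[Triple]) -> Set[Triple]:
    """Made-up rule: (?x :knows ?y) implies (?y :knows ?x)."""
    return {(o, p, s) for (s, p, o) in base if p == ":knows"} - base


index = BatchedRuleIndex(symmetric_knows, rebuild_every=3)
index.add((":a", ":knows", ":b"))
index.add((":b", ":knows", ":c"))
stale = (":b", ":knows", ":a") in index.query()   # checked before any rebuild
index.add((":c", ":knows", ":d"))                 # third change triggers a rebuild
fresh = (":b", ":knows", ":a") in index.query()   # checked after the rebuild
```

The catch is on the delete side: after a triple is removed, anything inferred from it stays visible until the next rebuild, so this only suits applications that can tolerate that lag.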
If you're looking for a good triple store with rules support for data mining and querying, this could be your solution. For more transactional processing, like the standard web application, I might need to find some workarounds.
Of course, pointers to other products that are scalable RDF triple stores with inferencing are also appreciated. What's also nice about this new job is that we can spend some money, which opens up the options considerably.