Friday, March 24, 2006

SWOOP - Helping to Debug Ontologies

Henry has discovered SWOOP, a small ontology editor from MindSwap. This reminds me to point out a very nice feature of SWOOP, one that Protege does not have out of the box.

SWOOP, through its integration with Pellet, can not only compute classifications but also tell you *why* your ontology is inconsistent. This, to me, is a huge step towards being able to use ontologies in "everyday" problems.

The use case here is: You are collecting information from many many different sources. You have an ontology that defines your view of the world. It's possible that all the different information sources will present to you conflicting information. Your ontology defines what is consistent, so therefore it should be able to tell you when you have information that is *inconsistent*.

SWOOP does a pretty good job at giving you the feedback to find the source of the inconsistency. An example output is:

Inconsistent ontology Reason: Individual ErrorBoat has more than 1 values for property atLocation violating the cardinality restriction
Axioms causing the problem:
1) (ErrorBoat decommissionedAt DryDock)
2) |_(DryDock ≠ SanDiego)
3) |_(ErrorBoat deployedTo SanDiego)
4) |_(ErrorBoat rdf:type Boat)
5) |_(Boat ⊆ (= 1 atLocation))

Now that's a simple problem, where an instance has two values for a property that is constrained to have only one.
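To make the reasoning behind that output concrete, here is a minimal Python sketch of the same check over plain tuples. It is not how Pellet works internally. The sub-property mapping is an assumption (the SWOOP output does not list those axioms), and the names `sub_property_of`, `different_from`, `at_most_one`, and `cardinality_violations` are made up for illustration.

```python
from itertools import combinations

# Facts taken from the axioms listed in the SWOOP output above.
triples = [
    ("ErrorBoat", "decommissionedAt", "DryDock"),
    ("ErrorBoat", "deployedTo", "SanDiego"),
    ("ErrorBoat", "rdf:type", "Boat"),
]

# Assumption: both properties are sub-properties of atLocation.
sub_property_of = {"decommissionedAt": "atLocation", "deployedTo": "atLocation"}

# Individuals explicitly declared different (DryDock ≠ SanDiego).
different_from = {frozenset(("DryDock", "SanDiego"))}

# Simplified form of Boat ⊆ (= 1 atLocation): only the "at most one" half is checked.
at_most_one = [("Boat", "atLocation")]


def cardinality_violations(triples):
    """Return (individual, property, value_a, value_b) for each provable clash."""
    types, values = {}, {}
    for s, p, o in triples:
        if p == "rdf:type":
            types.setdefault(s, set()).add(o)
        else:
            prop = sub_property_of.get(p, p)
            values.setdefault((s, prop), set()).add(o)

    problems = []
    for cls, prop in at_most_one:
        for individual in sorted(types):
            if cls not in types[individual]:
                continue
            found = sorted(values.get((individual, prop), ()))
            for a, b in combinations(found, 2):
                # Open world: two names are only two values if declared different.
                if frozenset((a, b)) in different_from:
                    problems.append((individual, prop, a, b))
    return problems


print(cardinality_violations(triples))
```

Note that without the `different_from` entry nothing is reported, because a reasoner would be free to conclude that DryDock and SanDiego are the same place.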

What about when an individual belongs to two disjoint classes? This one is a bit more confusing if you don't have a background in DL, but if you stare long enough at it, you might see the real cause:

Inconsistent ontology Reason: Individual ErrorBoat is forced to belong to class all(decommissionedAt, not(Location)) and its complement

Axioms causing the problem:
1) (ErrorBoat decommissionedAt DryDock)
2) |_(DryDock rdf:type Location)
3) |_(SanDiego rdf:type Location)
4) |_(ErrorBoat deployedTo SanDiego)
5) (Decommissioned ≡ (∃decommissionedAt . Location))
6) (Active ⊆ ¬ Decommissioned)
7) (Active ≡ (∃deployedTo . Location))

To explain this one better, I've declared that the class Active and the class Decommissioned are disjoint. And I've placed the instance ErrorBoat into both classes.
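The chain of inference in that second example can be sketched in a few lines of Python. This is only an illustration of the logic, not a DL reasoner: it handles just the "some values from" definitions and the disjointness axiom shown above, and the names `defined_by_some_values`, `inferred_classes`, and `disjointness_clashes` are hypothetical.

```python
# Facts from axioms 1-4 in the output above.
facts = {
    ("ErrorBoat", "decommissionedAt", "DryDock"),
    ("ErrorBoat", "deployedTo", "SanDiego"),
    ("DryDock", "rdf:type", "Location"),
    ("SanDiego", "rdf:type", "Location"),
}

# Axioms 5 and 7: class ≡ (∃ property . filler)
defined_by_some_values = {
    "Decommissioned": ("decommissionedAt", "Location"),
    "Active": ("deployedTo", "Location"),
}

# Axiom 6: Active ⊆ ¬ Decommissioned
disjoint = [("Active", "Decommissioned")]


def inferred_classes(individual):
    """Classes the individual must belong to, given the definitions above."""
    found = set()
    for cls, (prop, filler) in defined_by_some_values.items():
        for s, p, o in facts:
            if s == individual and p == prop and (o, "rdf:type", filler) in facts:
                found.add(cls)
    return found


def disjointness_clashes(individual):
    """Disjoint pairs where the individual ends up in both classes."""
    classes = inferred_classes(individual)
    return [pair for pair in disjoint if set(pair) <= classes]


print(inferred_classes("ErrorBoat"))
print(disjointness_clashes("ErrorBoat"))
```

ErrorBoat is never typed as Active or Decommissioned directly. Both memberships fall out of the property values, which is why the error message looks indirect.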

For this reason, I like SWOOP and Pellet. Though, Protege still has nicer workflows.

Thursday, March 23, 2006

RDF - Connecting Software and People - Google Video

Found RDF - Connecting Software and People - Google Video via a post to the swig mailing list. The video is quite compressed, but you can view the original slides and high quality video from the original blog post by the author.

The video does a good job of grounding RDF into some real world problem domains with real world tools such as NetBeans (the author works for Sun).

The funny bit is, as I started to watch the video, I was impressed by the cool soundtrack it had. I thought, "Wow, this guy went all out and put in a cool trance soundtrack to match the cutting edge, futuristic feel of the semantic web." Then I realized I had an internet radio station playing in the background.

Time for a remix of the RDF presentation video? :)

More on temporal relations in RDF

Temporal relations from Henry Story on 2006-03-23 (semantic-web@w3.org from March 2006) is a post to the swig mailing list dealing with time in RDF.

The proposal in the email attempts to use N3's ability to declare metadata about graphs, in order to say when the graph was fetched. For instance:

{ :Oven :temp "22"^^t:celsius . } :fetched [ :at "2006-03-23T10:00:00Z"^^xsd:dateTime; ] .

Of course, the problem here is there's no formal RDF semantics for this, as there's no formal way to write triples about a graph. This is an N3 thing. I would guess the closest thing in RDF would be reification.

This is similar to how I've been thinking about the problem, which is marking the time the triples were retrieved from the source. The email, though, points out that it is difficult to merge those graphs together, as you'll end up with many values for an Oven's :temp. Once merged, how do you query for a particular temp at a particular time?

Another way to do it is to make the measurement a formal object. I've blogged on this before, and it's been heavily discussed in comments, but in short:

:Measurement :of :Oven ; :takenAt "2006-03-23T10:00:00Z"^^xsd:dateTime ; :valueRecorded "22"^^t:celsius .
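Once each measurement is its own object, the "temp at a particular time" question becomes an ordinary lookup. A rough Python sketch, using plain dictionaries rather than an RDF store; the second reading (180 at 11:00) and the function name `value_at` are made up for illustration:

```python
from datetime import datetime, timezone

# Each measurement is an object with its own timestamp.
# The second reading is hypothetical, added only to have two values to choose from.
measurements = [
    {"of": "Oven", "taken_at": datetime(2006, 3, 23, 10, 0, tzinfo=timezone.utc), "celsius": 22.0},
    {"of": "Oven", "taken_at": datetime(2006, 3, 23, 11, 0, tzinfo=timezone.utc), "celsius": 180.0},
]


def value_at(subject, when):
    """Return the latest value recorded at or before `when`, or None."""
    candidates = [
        m for m in measurements
        if m["of"] == subject and m["taken_at"] <= when
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda m: m["taken_at"])["celsius"]


print(value_at("Oven", datetime(2006, 3, 23, 10, 30, tzinfo=timezone.utc)))
```

Merging graphs from several fetches just adds more measurement objects. No two of them compete for a single :temp property.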

A bit of OWL makes it slightly more interesting:

:TemperatureMeasurement owl:equivalentClass [ a owl:Restriction; owl:onProperty :valueRecorded; owl:someValuesFrom t:celsius ].

(Does that work? Will have to try...)

Wednesday, March 22, 2006

Instance Store

Instance Store is a

> Java application for performing efficient and scalable Description Logic (DL) reasoning over individuals

What does this mean? It's an attempt to build a system that can reason (using OWL, for instance) across a large set of instances.

Apparently it stores individuals in an RDBMS, such as MySQL or Oracle. It then connects that store to a reasoner, such as Racer or FaCT++. I haven't tried it yet, but I will soon, so I'm storing this here for future reference.

Would be interesting to see if plugging this directly into Oracle 10g's RDF store provides any benefits.

I should note that a quick glance at the code indicates that it hasn't been touched in about two years.

Expert Spring MVC Given Away at Symantec Seminar

The J2EE Best Practice seminar put on by Symantec is giving attendees a free copy of Expert Spring MVC and Web Flow. That's pretty exciting, but what about a free trip to London for the authors? :)

Tuesday, March 21, 2006

Expert Spring MVC Gets Second Printing

Expert Spring MVC and Web Flow is off for a second printing. I have no idea how many were printed in the first place, but this is good news nonetheless. A big mahalo to everyone who purchased a copy. All of the typos and errors identified will be fixed.

OWL Consistency Checking

ConsVISor is a nice web service that can check an OWL ontology (with individuals) for consistency. It displays any errors as easy-to-read sentences free of technical jargon. I've been trying to find a workflow where I can test a set of individuals against an ontology to see where the inconsistencies are, and this tool is the best so far for reporting any problems.

FaCT++, via Protege, doesn't seem to provide any error messages. It does fail, though (correctly). Can't figure out how to get it to tell me what and where the problem is.

Pellet, via their online demo, seems to work. However, the error messages are cryptic and seemingly misleading. I'm assuming they are correct, but they would never lead the layperson to the real problem.

Anyone else using OWL for consistency checking across individuals? What tool are you using? Can it provide useful error messages?

Sunday, March 19, 2006

Response to Why we need explicit temporal labelling

Why we need explicit temporal labelling is an excellent new post in the continuing saga about temporal labelling in RDF. The author provides a great example of a real world scenario: the changing value of a web page's title. To reiterate, yesterday this triple was valid:

:page dc:title "I like Cheeses";

but today it's now:

:page dc:title "I like Cheese";

The author asserts that there are now two triples, which would indicate that there are two titles.

Going back to my relational database roots, I don't see how there would be two triples (unless you explicitly store two triples in your local Model). Given just the source RDF document that the triple is found in, at any one time there is at most one triple that asserts the page's dc:title. If I'm consuming the RDF document that asserts the triple, I'm in a position to store the URI of the RDF document. When my RDF crawler hits the same RDF page again, it will simply update its local store with all new values. The old triple will be deleted and replaced by whatever new triples are asserted.
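Here is a small Python sketch of that crawl-and-replace strategy, using an in-memory dictionary keyed by source document URI. The class name `SourceTrackingStore`, its methods, and the example document URI are all hypothetical. A real store would persist the data and parse actual RDF.

```python
class SourceTrackingStore:
    """Keeps triples grouped by the document they were fetched from."""

    def __init__(self):
        self._by_source = {}

    def update(self, source_uri, triples):
        # Replace everything previously asserted by this document.
        self._by_source[source_uri] = set(triples)

    def triples(self):
        merged = set()
        for asserted in self._by_source.values():
            merged |= asserted
        return merged

    def objects(self, subject, predicate):
        return sorted(
            o for s, p, o in self.triples()
            if s == subject and p == predicate
        )


store = SourceTrackingStore()
doc = "http://example.org/page.rdf"  # hypothetical source document

# Yesterday's crawl.
store.update(doc, [(":page", "dc:title", "I like Cheeses")])
# Today's crawl of the same document replaces the old triple.
store.update(doc, [(":page", "dc:title", "I like Cheese")])

print(store.objects(":page", "dc:title"))
```

After the second crawl there is only one title in the store, because the old triple was dropped along with the rest of the document's previous contents.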

Of course, that's one strategy for crawling/consuming RDF documents. But it does remove the need to attach arbitrary metadata to triples just to attach a timestamp. I believe that if we let time into the model, it won't stop there. We have reification for saying things *about* statements. And reification has a bad rap mainly because of the syntax, not the model.

In any case, the use case of a web page's title changing over time is excellent, but correctly modeling it doesn't require a new addition to the RDF model. You can store the time you received the RDF document that asserted the triple, you can use reification to say what time the statement was asserted, or you can model explicitly that titles have a date at which they were said. Heck, nothing stops you from adding your own reifications to the triples you just downloaded.

I want to talk about one statement the blog post made:

> In the current model, I would end up with two titles for this article. While technically correct, it is intuitively wrong - and that difference is what holds back RDF for most developers. They expect to see a single title with the updated value.

Developers don't always expect to see a single value for the title. What if someone says "I want to know what the title for the web page was two weeks ago?" In other words, it's all in how you look at the data and what you're trying to see. If all you care about is the *now*, then track where triples came from (the original RDF document) and consistently update it. Delete all old triples from the original document when you do an update.

Maybe this points out that an RDF triple is pretty bare all alone, and tracking its source document is pretty important.

On the semantic web, you can't un-say something, and that's part of this whole problem. If I can't un-say something, how do I say, "This thing I just said, well, it's no longer true." Attaching a timestamp doesn't really help to un-say anything, because there's no semantics of TTL to the timestamp. Just because there's a timestamp of yesterday on a triple doesn't mean that today that triple is invalid.

The bigger question I have is, why don't I ever have this problem of temporal labeling when writing relational database applications? When I need time as explicit data, I put it into the relational model (usually as a created_on, updated_at, performed_on, etc). If time isn't important to the data, it's assumed that whatever is in the database is the truth at now.

The web has a nice way to declare whether resource representations can be cached, and therefore whether you can trust the data inside a representation for longer than the moment you received it. If I receive an RDF document whose HTTP headers say not to cache it, then I had better treat the triples inside the document as truthful only for *now*. If I query those triples again from a local cache, I had better understand that the values might have been updated at the source Resource. So what is the relationship between a triple, the document it's in, and the HTTP headers sent with the document?
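One possible reading of that relationship, sketched in Python: derive a "trust window" for a document's triples from its Cache-Control header. This is a deliberately simplified assumption, not full HTTP caching semantics (it ignores Expires, Age, validators, and more), and the function name `trust_seconds` is made up.

```python
def trust_seconds(headers):
    """How long (in seconds) to treat a document's triples as current.

    Simplified assumption: no-cache/no-store or a missing header means
    "only true right now" (0); otherwise use max-age if it is present.
    """
    value = ""
    for name, header_value in headers.items():
        if name.lower() == "cache-control":
            value = header_value

    directives = [d.strip().lower() for d in value.split(",") if d.strip()]
    if "no-store" in directives or "no-cache" in directives:
        return 0
    for directive in directives:
        if directive.startswith("max-age="):
            try:
                return max(0, int(directive.split("=", 1)[1]))
            except ValueError:
                return 0
    return 0


print(trust_seconds({"Cache-Control": "public, max-age=3600"}))
print(trust_seconds({"Cache-Control": "no-cache"}))
```

A crawler could store this number next to the source document URI and refetch once the window has passed.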

Wow, got off track there.

Computer Networks: The Heralds of Resource Sharing - Google Video

Computer Networks: The Heralds of Resource Sharing is a 1972 documentary on ARPAnet. It's a really interesting look back at the initial thinking about computer networks, especially in contrast to what we take for granted today. It includes such quotes as "programming is fun." Amen to that. Another excellent quote: "We should deal with information, not the paper it is written on." Watch this, and learn your roots.

Thursday, March 16, 2006

Does RDF’s Model Need to Include Explicit Support for Temporal Labelling?

John Barstow, in Visions of Aestia » Thinking about RDF-lite, requests that an RDF-lite type proposal include:

> Formally include provenance and temporal labelling in the model without requiring reification.

I agree that provenance should be a first class citizen in the RDF world. Assuming that RDF is used on the web, and many of the triples will come from some URI, why not include support for marking a triple with where it came from? From what I can see, this is required if you ever want to start working on the Trust layer. As pointed out, you can do this now with reification, but that's a difficult and round-about concept to teach and implement. Most RDF systems support quads under the covers anyway, so there's an obvious need to support Subject, Predicate, Object, Source (provenance).
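As a rough illustration of the Subject, Predicate, Object, Source idea, here is a Python sketch of quads with a simple filter on where a statement came from. The data, the source URIs, and the names `Quad` and `said_by` are hypothetical. Real quad stores expose this differently.

```python
from typing import NamedTuple


class Quad(NamedTuple):
    subject: str
    predicate: str
    obj: str
    source: str  # URI of the document the triple came from (provenance)


# Hypothetical statements from two different documents.
quads = [
    Quad(":page", "dc:title", "I like Cheese", "http://example.org/page.rdf"),
    Quad(":page", "dc:title", "Cheese is awful", "http://example.org/rumours.rdf"),
]


def said_by(quads, trusted_sources):
    """Return only the triples asserted by one of the trusted sources."""
    return [
        (q.subject, q.predicate, q.obj)
        for q in quads
        if q.source in trusted_sources
    ]


print(said_by(quads, {"http://example.org/page.rdf"}))
```

Nothing here needs reification: the fourth column carries the provenance, and a trust policy is just a decision about which sources to include.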

I disagree that RDF needs to include some explicit support for temporal labelling. IMHO modeling events that need to be clarified by time is perfectly possible now, without crazy hacks.

Example: Let's say someone asks, "What is Seth doing now?" The answer would be, "Seth is currently running." OK, no problem.

A naive approach to modeling "Seth is currently running" would be to first create a triple like:

:Seth :is :running.

Hmm... is Seth always running? What about yesterday? I believe it's this type of thinking that makes people think they need time and date in the model. How would you clarify that the running is only "now", where "now" is some point in time?

Flip the triple around, and think in Nouns. The above triple doesn't work because it's modelling a verb (:running). Turning it around, you can model it like:

:ExerciseRun :performedBy :Seth ;
:startedAt "2006-01-03T12:23:45";
:endedAt "2006-01-03T13:21:20".

This says something to the effect of, "Seth went for a run for exercise between 12:23 and 1:21 on the 3rd of Jan." I've made the verb an instance of a class here, in other words a Noun.

The question I have is: what temporal events can't be modeled this way?
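With the event modeled as a Noun, "What is Seth doing now?" turns into an interval query. A minimal Python sketch under that assumption, with dictionaries standing in for the triples and a made-up function name `activities_at`:

```python
from datetime import datetime

# The run from the example above, modeled as an event object.
events = [
    {
        "type": "ExerciseRun",
        "performed_by": "Seth",
        "started_at": datetime(2006, 1, 3, 12, 23, 45),
        "ended_at": datetime(2006, 1, 3, 13, 21, 20),
    },
]


def activities_at(person, when):
    """Return the types of events the person was performing at `when`."""
    return [
        e["type"]
        for e in events
        if e["performed_by"] == person and e["started_at"] <= when <= e["ended_at"]
    ]


print(activities_at("Seth", datetime(2006, 1, 3, 12, 30)))
print(activities_at("Seth", datetime(2006, 1, 4, 12, 30)))
```

"Now" is just another value for `when`, so the model needs no special notion of time.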

OWL 1.1 Web Ontology Language

OWL 1.1 Web Ontology Language is beginning to take shape. I'm most excited about the new datatype support. For example, you can now say "Adults are Persons that have age someValuesFrom >= 18". In other words, you can now define ranges of acceptable values for a datatype. This is different from cardinality (which would be something like "SuperPetLover is a Person with 5 pets"); this is restricting the acceptable ranges for values of datatype properties.

Tuesday, March 14, 2006

Graphs in MySQL 5

Working with Graphs in MySQL has many nice examples on how to use MySQL 5's stored procedures and views to create efficient graph structures in SQL.

Friday, March 10, 2006

Data-Driven Enterprise: Slouching toward the Semantic Web

Data-Driven Enterprise: Slouching toward the Semantic Web is a new article found in IEEE DS Online. It includes many quotes from me, as well as from others involved in semantic web and/or enterprise integration efforts.

It includes such greatest hits as "RDF XML serialization ain't that great" and "Where are the reasoning databases?" But don't forget old time favorites like "Focusing on what can be done now instead of what's possibly possible later."

Thanks to Greg for the chance to participate.

Thursday, March 9, 2006

The Semantics Are Important

Stefan Plantikow asks a good question:

> But isnt it absurd or at least meaningless to believe we are talking about the same “uri-wise” thing iff we put “it” into completely different classes? What does such an identity establish at all?

I don't think it's meaningless at all. Think about how we talk about things in the real world.

Q: "Where are you going this weekend?"
A: :Seth :travelsTo [geo:lat "123"; geo:long "456"];

OK, so now we have a bnode for some location. That is unambiguous as far as I'm concerned, as there's only one Point on Earth with that lat and long.

Now, here's my point about the difference between identity and meaning. The identity is singular. The meaning is relative. Let's think about all the different meanings we can associate to that identity.

[geo:lat "123"; geo:long "456"] a :Beach.

OR

[geo:lat "123"; geo:long "456"] a :CountyPark.

OR

[geo:lat "123"; geo:long "456"] a :Ocean.

All of the above would work in the real world. You can easily answer the question with "I'm going to the beach" or "I'm going to the park" or "I'm going to the ocean" and we know you mean the same thing. The real world is that flexible, so the semantic web must allow for this type of flexibility.

Now, this is where Semantics comes in. When you say "ocean", you might mean "Something with sand, next to the ocean, where people hang out, where you can swim". Again, the identifier for the class :Ocean is what is globally unique. But the definition of :Ocean is context specific.

Thinking about it, it happens that your definition of :Ocean (above) matches perfectly with my definition of :Beach ("Something with sand, next to the ocean, where people hang out, where you can swim"). And this is where the semantic web shines... when you say :Ocean and I say :Beach, we can reason we are talking about the same type of thing.

If we apply our contextual definitions (semantics) to the Thing in question (here it is [geo:lat "123"; geo:long "456"]) then we are at least talking about the same thing. This is the important part about the whole operation. No matter how we define that location on earth, it's the same location between the both of us.

Wednesday, March 8, 2006

Phil Dawes’ Stuff » Blog Archive » Global identifier schemes don’t scale II

Phil continues his thoughts with the assertion that

>As the system gets big and less consistent, the global identifiers cannot guarantee to unambiguously identify things across the system

Immediately I don't agree (quite respectfully). Phil, are you saying that a URI does not identify one and only one Resource?

I want to follow the rest of your reasoning, but I can't get past this point.

Perhaps you are confusing identity with meaning? I agree that the semantic web allows for ambiguous meanings for things, and that there is no way to declare that a Resource *is* one single thing (where *is* means, *is an instance of a class*).

But, and please elaborate on this one, a URI allows us to talk about the same thing without any ambiguity. Of course, we may describe the thing with completely different views/contexts/words but at least we're talking about the same thing.

I want to get to the rest of your post, but until I hear why a URI doesn't uniquely identify a Resource, I'm stuck in confusion.

Thumbs Up for Expert Spring MVC

Rakesh from the Spring Forums gave Expert Spring MVC a thumbs up. Thanks Rakesh!

> i just purchased the Expert Spring MVC and Web Flow book and I found it to be excellent

He did point out that there are quite a few typos in there. You can read my response on the forum, but basically "I know" was all I could muster and a promise to fix whatever I'm made aware of in the errata.

Still, kind words from a stranger.

Matt Raible Enjoying Expert Spring MVC

Matt Raible is enjoying Expert Spring MVC.

>I just picked up the Expert Spring MVC one and I've been really enjoying it so far - great job Seth and co.

Very cool, thanks Matt! Coming from a fellow author, that means a lot.

Meaning is Context Relative, Not Identifiers

Phil asked for clarification on a previous post, so hopefully with some sleep in me I can elaborate a bit more. I think I was trying to say that the concept of weight can be modeled in many different ways. This includes different ontological descriptions for weight, where weight can be identified using the same or different URI.

Take, for example, the URI http://example.org/xyz. You can say that:

<http://example.org/xyz> a :Animal.

and I can say

<http://example.org/xyz> a :Car.

The context here is Who Said What. What's not up for question is that there is some Resource identified by <http://example.org/xyz>, and that when we use that identifier, we are talking about the same thing.

For the semantic web to succeed, and this might be your point all along, the environment must be able to cope with the above situation (where you think the thing is an animal, and I think it's a car). This is difficult because of the Open World assumption, no doubt about it. For all the reasoner knows, something can be both an :Animal and a :Car. It's up to the authors of the ontologies to say that :Animal owl:disjointWith :Car.

Even though the Open World assumption exists, URIs still identify one thing. Now, whatever meaning you assign to that thing is your own. It's the meaning that is context relative, not the identifier for the thing.
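The Animal/Car situation can be sketched in a few lines of Python. The identifier is shared, the class assertions depend on who is speaking, and a clash only appears once someone declares the classes disjoint. This is an illustration of the idea, not a reasoner, and the names `asserted_classes` and `clashes` are made up.

```python
# (who said it, which resource, which class)
statements = [
    ("you", "http://example.org/xyz", "Animal"),
    ("me", "http://example.org/xyz", "Car"),
]


def asserted_classes(uri):
    """All classes anyone has asserted for the resource identified by `uri`."""
    return {cls for _who, subject, cls in statements if subject == uri}


def clashes(uri, disjoint_pairs):
    """Disjoint pairs where the resource has been placed in both classes."""
    classes = asserted_classes(uri)
    return [tuple(sorted(pair)) for pair in disjoint_pairs if pair <= classes]


uri = "http://example.org/xyz"

# Open world: with no disjointness axiom, being both is perfectly fine.
print(clashes(uri, []))

# Once an ontology author declares the classes disjoint, the conflict surfaces.
print(clashes(uri, [frozenset(("Animal", "Car"))]))
```

In both cases the URI still picks out exactly one Resource. Only the meaning attached to it is in dispute.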

Tuesday, March 7, 2006

Essential Reading for the Web of Trust

Essential RDF context is now required reading for anyone building the web of trust. Sometimes I feel the issue of *who* created the assertion is quietly swept under the rug. That information is extremely important, for I would want to tell my reasoner to trust statements about diseases from the Department of Health over Joe Bob any day.

This goes back to the problem of context for a triple. Who said it? When was it said? Was it signed (cryptographically)?

Response to Global identifier schemes don’t scale

Phil Dawes wrote in Global identifier schemes don’t scale that context is required to make any statement about a Resource on a global scale (at least, I think he did :)

I completely agree, but that doesn't mean his original point is correct. Global identifiers do scale globally, because at their core they have no meaning and are simply identifiers. It's only further statements about the Resource the identifier identifies that provide meaning. And those statements are always given with a context.

Context on the semantic web, with its world of triples, has to be explicit, unfortunately. Phil's original example, that of assigning a value to someone's weight property, is a good one. The concept of context comes up depending on, you guessed it, the context.

Take, for example, the weight of a person. *When* the person weighed that much can be important under some contexts, but not others. My driver's license says I'm 140 lbs, but I don't always have that weight. In the context of the DMV, that's my weight. Now, my weight for Jenny Craig is certainly time sensitive. Therefore, Jenny Craig's RDF store and ontology would define weight properties with time qualifiers.

The great thing about all of this is that the definition of what a weight is, is context specific. The semantic web doesn't attempt to define the One, True Meaning for weight. It fully embraces that what you mean by weight is different from what I mean.

Getting back to identifiers, the issue of context of the data is not affected by global identifiers. What can make the semantic web actually work is that no matter the semantic meaning of weight, it is possible to talk about the same Resource. The Resource's meaning becomes more valuable as we allow anyone to say anything about it. As long as we always know what "it" is. :)

Keep up the good work, Phil! Always a pleasure to read your blog.

OWL Reasoning Examples

SSSW OWL Examples contains OWL reasoning examples, including OWL ontologies and explanations as to what inferences can be made. This is a good source of OWL statements and what they can mean to a reasoner.

ActiveRDF

ActiveRDF is an ORM-type product for Ruby that maps Ruby objects to RDF stores.

It's a very new project, so examples and downloads are lacking. However, apparently it has support for pluggable storage engines. The next step would be to see if I can plug this into Oracle 10g's RDF store.

Of course, the real value here is if ActiveRDF supports inferencing. Otherwise, it's a cumbersome ORM product, imho.

Saturday, March 4, 2006

A CSS Framework

A CSS Framework is an effort to create a framework for XHTML pages such that it conforms to being styled into layouts commonly found on the web.

If your page conforms to the framework (a misuse of the term, imho), then it will be easy to apply many different style sheets. In other words, write your XHTML in a certain way, and you get the flexibility to use many different layouts.

There's even a CSS Framework Stylesheet Design Contest going on.

Friday, March 3, 2006

Bossam Rule/OWL Reasoner

Bossam Rule/OWL Reasoner is another Java based rule engine. It includes a set of rules for OWL reasoning, uses the RETE algorithm, and can be embedded in Java systems. It can reason with Java objects as well as simple facts. Bossam also supports SWRL.

Thomas Risberg Likes Expert Spring MVC

Thomas Risberg recommends Expert Spring MVC, which is really cool because it was his Spring MVC tutorial that I used to first get into Spring.

Wednesday, March 1, 2006

Java User Group Sardegna - ExpertSpringMVCandWebFlow

An Italian review of Expert Spring MVC and Web Flow is already out. The reviewer gives the book five stars, which is a good sign. There is an English translation thanks to Google.

First spotted via the Spring Framework Forums.

If anyone knows of a proper English translation, please let me know! Thanks!

Disclaimer

I'm probably required to say that the views expressed in this blog are my own, and do not necessarily reflect those of my employer.