
Semantics of Entry



Here is a little discussion I have been having off-list with Danny Ayers. I hope it will help explain part of how I moved from his original interpretation of the Atom Entry, which he published at http://semtext.org/atom/atom/index.html and which can be represented by this graph:

[Figure (JPEG image): graph of Danny Ayers's interpretation of the Atom Entry]



... how I moved from the above, to the following model with the extra EntryState class:

[Figure (JPEG image): graph of the model with the extra EntryState class]



The burden of proof of course is on my side, since my model requires an extra class and seems to introduce something that is not in the original Atom spec.

On 10 Jun 2004, at 21:36, Danny Ayers wrote:
Henry Story wrote:

Btw. I think that the EntryState class is really necessary for you to be able to encode Atom properly in RDF. This is because in a declarative world you can never remove facts from a theory. So as I understand (and I will have to read the descriptive logics book to make sure I am right) anytime an object can change you will need to move the changeable state to a new object.

I'm not certain, but I'm not sure that kind of state persistence is there in the syndication model - if the change is major then it won't be the same object anyway. If the change is minor then the new version will have the later atom:modified date, the old version is generally forgotten about. Ok, so in principle in the RDF/OWL model both the old and the new properties will apply, but I reckon those should be sorted out at the application level - e.g. for an aggregator normally only the most recent version would be preserved/displayed. [...]
Raw
http://dannyayers.com

The burden of proof is also on my side, because I believe that Danny's model is not just another model of the Atom spec, but is a flawed one.


Ok. So I need to show that the state persistence is there in the Atom model, and/or that one needs it in the OWL model.

----

Some of the more attentive readers here may notice that we have not given an id to the Entry as required by the Atom Spec. This is because Entry, Content and EntryState objects are all resources. I don't know if there is a way in OWL of requiring a resource not to be anonymous, but if there were, this is what we would want.

Let's take a simple example of an Entry as we might xslt it from an Atom feed. Here it is, represented in what I hope is an intuitive notation:

tag:fish.net,2004-05-20:e1
                    |------isa---------> DannyEntry
                    |------title-------> content1: "great day"
                    |------content-----> content2: "I love sunni daze"
                    |------created-----> 20 May 2004, 8 am

Some time later the entry owner notices the spelling mistake, corrects it and publishes it, and we receive his Atom feed, which we xslt into something like this:

tag:fish.net,2004-05-20:e1
                    |------isa---------> DannyEntry
                    |------title-------> content1: "great day"
                    |------content-----> content3: "I love sunni days"
                    |------created-----> 20 May 2004, 8 am
                    |------modified----> 20 May 2004, 4 pm

What we want, as I understand it, in a declarative semantics, is to just be able to add true statements to our theory without creating contradictions whilst also maintaining the meaning of the original statements. But if we now merge the first set of statements with the second ones above we get the following:


tag:fish.net,2004-05-20:e1
                    |------isa---------> DannyEntry
                    |------title-------> content1: "great day"
                    |------content-----> content2: "I love sunni daze"
                    |------content-----> content3: "I love sunni days"
                    |------created-----> 20 May 2004, 8 am
                    |------modified----> 20 May 2004, 4 pm


The problem with this theory is that we no longer know which of content2 or content3 is the latest one. (No cheating by guessing that because one has a 3 in its name it is likely to come after the one that has a 2 in its name.)
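
The merge above can be sketched in Python as a plain set union over triples. This is only an illustration: the (subject, property, object) tuple encoding, the `objects` helper and the ISO-style date strings are assumptions of mine, and the content nodes are flattened to their text. None of it comes from the Atom spec.

```python
# Hypothetical triple encoding of the two transformed feeds:
# each statement is a (subject, property, object) tuple.
ENTRY = "tag:fish.net,2004-05-20:e1"

first = {
    (ENTRY, "isa", "DannyEntry"),
    (ENTRY, "title", "great day"),
    (ENTRY, "content", "I love sunni daze"),
    (ENTRY, "created", "2004-05-20T08:00"),
}

second = {
    (ENTRY, "isa", "DannyEntry"),
    (ENTRY, "title", "great day"),
    (ENTRY, "content", "I love sunni days"),
    (ENTRY, "created", "2004-05-20T08:00"),
    (ENTRY, "modified", "2004-05-20T16:00"),
}

# Merging two sets of facts is just set union.
merged = first | second


def objects(triples, subject, prop):
    """Return every object asserted for (subject, prop)."""
    return {o for s, p, o in triples if s == subject and p == prop}


contents = objects(merged, ENTRY, "content")
modified = objects(merged, ENTRY, "modified")

# Two content values survive the merge, but there is a single 'modified'
# date hanging off the entry, so nothing says which content it belongs to.
print(sorted(contents))
print(sorted(modified))
```

Identical statements collapse under the union, so only the content statement is duplicated, and the merged set carries no ordering between the two values.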


So what was the first solution I thought of? I thought: why not time stamp the content objects? This way after merging the new transforms of the two previously mentioned Atom xml feeds we get:

tag:fish.net,2004-05-20:e1
                    |-------isA---------> DannyEntry
                    |-------title-------> _c1: "great day"
                    |                     |--isA--> TimeStampedContent
                    |                     |---->20 May 2004, 8 am
                    |-------content-----> _c2: "I love sunni daze"
                    |                     |--isA--> TimeStampedContent
                    |                     |---->20 May 2004, 8 am
                    |-------created-----> 20 May 2004
                    |-------content-----> _c3: "I love sunny days"
                                          |--isA--> TimeStampedContent
                                          |---->20 May 2004, 4 pm

Cool, now we know which content came first.
Sadly, I then discovered that this solution will not generalize to attachments. Consider:


tag:fish.net,2004-05-20:e1
                    |------title--------> _c1: "great day"
                    |                     |---->20 May 2004, 8 am
                    |------content------> _c2: "I love sunni daze"
                    |                     |---->20 May 2004, 8 am
                    |------created------> 20 May 2004
                    |------content------> _c3: "I love sunny days"
                    |                     |---->20 May 2004, 4pm
                    |------content------> _c4: "file:/User/hjs/beach.icon.jpg"
                    |                     |---->20 May 2004, 5pm
                    |------content------> _c5: "file:/User/hjs/beach.jpg"
                                          |---->20 May 2004, 5:05pm


(I have left out the isA Properties to unclutter the image)
Here the problem is that we no longer know if content _c4 is meant to be replaced by content _c5 or if the two are meant to coexist. Since an entry can have a number of content objects, we find there is no operation to delete content objects. We could devise some rules for dealing with contents that have attachments as their object. But these rules would end up being difficult to understand, easily misunderstood, and prone to getting more and more complicated as people come up with new complex examples of what we would like to do. Hence the importance of the declarative semantics: we want an operation as simple as Set Union to be the operation that merges new facts into our theory.
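
To make the difficulty concrete, here is a small Python sketch of the two most obvious rules one might apply to the time-stamped contents. The rule names `latest_wins` and `keep_all` are hypothetical, and the `intended` list assumes that what the author wants is the corrected text plus both attachments; that is my reading, not something the time stamps themselves say.

```python
from datetime import datetime

# The time-stamped content nodes from the diagram: (node, value, stamp).
contents = [
    ("_c2", "I love sunni daze", datetime(2004, 5, 20, 8, 0)),
    ("_c3", "I love sunny days", datetime(2004, 5, 20, 16, 0)),
    ("_c4", "file:/User/hjs/beach.icon.jpg", datetime(2004, 5, 20, 17, 0)),
    ("_c5", "file:/User/hjs/beach.jpg", datetime(2004, 5, 20, 17, 5)),
]


def latest_wins(items):
    """Hypothetical rule 1: only the most recently stamped content counts."""
    newest = max(items, key=lambda item: item[2])
    return [newest[0]]


def keep_all(items):
    """Hypothetical rule 2: every content ever stated still counts."""
    return [name for name, _value, _stamp in items]


# Assumed intention: corrected text plus both attachments.
intended = ["_c3", "_c4", "_c5"]

print(latest_wins(contents))  # drops the text and the icon
print(keep_all(contents))     # keeps the misspelled text
print(latest_wins(contents) == intended, keep_all(contents) == intended)
```

Neither simple rule recovers the assumed intention, and any rule that does would have to look at what kind of thing each content is, which is exactly the sort of growing rule set the paragraph above warns about.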


If we move the identification off the entry to the id (as originally proposed by the Atom-OWL spec) then things look much better, and we don't have to time stamp content nodes:

            _entry1 ----isA-----------> Entry
                |-------id------------> tag:fish.net,2004-05-20:e1
                |-------title---------> _c1: "great day"
                |-------content-------> _c2: "I love sunni daze"
                |-------created ------> 20 May 2004, 8am

            _entry2 ----isA-----------> Entry
                |-------id------------> tag:fish.net,2004-05-20:e1
                |-------title---------> _c1
                |-------content-------> _c3: "I love sunny days"
                |-------modified -----> 20 May 2004, 5pm
                |-------content-------> _c4: "file:/User/hjs/beach.icon.jpg"
                |-------content-------> _c5: "file:/User/hjs/beach.jpg"


This is quite consistent with the Atom spec. Ids should not change but Entries can (hence the 'modified' property). But one may find the picture above a little odd: we have anonymous nodes that are identified by unique tags that don't change, yet these tags, as opposed to the e-mail addresses in the FOAF spec, are not inverse functional. Here an entry id inversely refers to more than one entry object.
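
A rough way to see the point about inverse functionality is to index the id statements by value, as in the Python sketch below. It assumes that _entry1 and _entry2 really do denote distinct things; the helper names are hypothetical, and the check is a simple uniqueness test rather than OWL reasoning (an OWL reasoner would instead infer that the two nodes are the same).

```python
from collections import defaultdict

# (subject, id value) pairs taken from the two Entry nodes above.
id_statements = [
    ("_entry1", "tag:fish.net,2004-05-20:e1"),
    ("_entry2", "tag:fish.net,2004-05-20:e1"),
]


def subjects_by_value(pairs):
    """Index subjects by the value they point to (the inverse relation)."""
    index = defaultdict(set)
    for subject, value in pairs:
        index[value].add(subject)
    return dict(index)


def is_inverse_functional(pairs):
    """True if every value is pointed to by exactly one subject."""
    return all(len(subjects) == 1 for subjects in subjects_by_value(pairs).values())


print(subjects_by_value(id_statements))
print(is_inverse_functional(id_statements))
```

The inverted index groups both entry nodes under the one tag, which is the shape the redrawn diagram takes.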

By moving the statements around a little like this, things start to take on a very interesting form:

tag:fish.net,2004-05-20:e1
    |---1/id----->_entry1 ----isA-----------> Entry
    |                 |-------title---------> _c1: "great day"
    |                 |-------content-------> _c2: "I love sunni daze"
    |                 |-------created ------> 20 May 2004, 8am
    |
    |---1/id----->_entry2 ----isA-----------> Entry
                      |-------title---------> _c1
                      |-------content-------> _c3: "I love sunny days"
                      |-------modified -----> 20 May 2004, 5pm
                      |-------content-------> _c4: "file:/User/hjs/beach.icon.jpg"
                      |-------content-------> _c5: "file:/User/hjs/beach.jpg"


Here I have just redrawn the previous picture with the id relation inverted (1/id), making explicit that both ids point to the same thing.

It is now clear that _entry1 and _entry2 might better be thought of as states of an entry at a time. In which case, why not make it clearer:

tag:fish.net,2004-05-20:e1
    |---isa----------> Entry
    |---created------> 20 May 2004, 8am
    |---hasLink------> http://someserver/changeEntry.cgi?entry1
    |---hasState----->_entry1 ----isA---------> EntryState
    |                     |-------title---------> _c1: "great day"
    |                     |-------content-------> _c2: "I love sunni daze"
    |
    |---hasState----->_entry2 ----isA---------> EntryState
                          |-------title---------> _c1
                          |-------content-------> _c3: "I love sunny days"
                          |-------modified------> 20 May 2004, 5pm
                          |-------content-------> _c4: "file:/User/hjs/beach.icon.jpg"
                          |-------content-------> _c5: "file:/User/hjs/beach.jpg"



This is how I came to the conclusion that the EntryState class was in fact presupposed by the Atom syntax. We might say, à la Bertrand Russell, that the EntryState class reveals the deep logical structure of the Atom syntax.
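
As a rough check that the EntryState model really does let Set Union do the merging, here is a Python sketch. The tuple encoding, the `values` and `current_state` helpers, and the rule that a state without a 'modified' date falls back to the entry's 'created' date are all assumptions made for illustration. It also assumes the state nodes keep stable names (_entry1, _entry2) across feeds, which anonymous nodes would not guarantee.

```python
from datetime import datetime

ENTRY = "tag:fish.net,2004-05-20:e1"
CREATED = datetime(2004, 5, 20, 8, 0)

# First feed: the entry with its original state.
feed1 = {
    (ENTRY, "isa", "Entry"),
    (ENTRY, "created", CREATED),
    (ENTRY, "hasState", "_entry1"),
    ("_entry1", "isa", "EntryState"),
    ("_entry1", "title", "_c1"),
    ("_entry1", "content", "_c2"),
}

# Second feed: the same entry with a new state.
feed2 = {
    (ENTRY, "isa", "Entry"),
    (ENTRY, "created", CREATED),
    (ENTRY, "hasState", "_entry2"),
    ("_entry2", "isa", "EntryState"),
    ("_entry2", "title", "_c1"),
    ("_entry2", "content", "_c3"),
    ("_entry2", "modified", datetime(2004, 5, 20, 17, 0)),
    ("_entry2", "content", "_c4"),
    ("_entry2", "content", "_c5"),
}

# Merging is plain set union; no statement has to be retracted.
theory = feed1 | feed2


def values(triples, subject, prop):
    """Return every object asserted for (subject, prop)."""
    return {o for s, p, o in triples if s == subject and p == prop}


def current_state(triples, entry):
    """Pick the state with the latest 'modified' date.

    Assumption: a state with no 'modified' date dates from the
    entry's 'created' time.
    """
    created = min(values(triples, entry, "created"))

    def stamp(state):
        modified = values(triples, state, "modified")
        return max(modified) if modified else created

    return max(values(triples, entry, "hasState"), key=stamp)


latest = current_state(theory, ENTRY)
print(latest, sorted(values(theory, latest, "content")))
```

Both states stay in the theory after the merge, and an aggregator that only wants the most recent version can select it at the application level without any statement having been removed.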


One question for this list: if the reasoning is correct, should the link property perhaps also be attached to the EntryState, rather than to the Entry as represented in the second diagram above? I.e., is a link an essential property of an Entry?

Henry Story
http://bblfish.net/