Semantics and Belief contexts - was: PaceDuplicateIdsEntryOrigin posted
On 25 May 2005, at 21:06, Antone Roundy wrote:
* The accepted language does not speak of the origin feed of the
entries. Ideally, an atom:id should be universally unique to one
entry resource, and we rightly require publishers to mint them with
that goal. However, in reality, malicious or undereducated
publishers might duplicate the IDs of others. Therefore, it is
proposed to modify the specification to state that the atom:entry
elements describe the same entry (resource) if they originate in
the same feed.
* Aggregators wishing to protect against DOS attacks are not
unlikely to perform some sort of safety checks to detect malicious
atom:id duplication, regardless of whether the specification
"authorizes" them to or not.
I understand your motivation, but I think it is misguided. I only
recently understood why myself [1].
Let me explain a little how I came to this conclusion. An easy way to
understand semantics is to think of it as being about the objects out
there. Take the sentences:
(a) Superman can fly
(b) Superman is Clark Kent
From these we can validly deduce that
(c) Clark Kent can fly.
Since the referents of "Superman" and "Clark Kent" are the same, what
is true of the one is true of the other. When speaking directly about
the world, we can replace any occurrence of "Superman" with "Clark
Kent" and still say something true.
When we are speaking about what others believe, this is no longer
true. Lois Lane may believe (a) without believing (c). She may think
Superman is a hero, but not think that Clark Kent is one. There
is in logic therefore a fundamental distinction between sentences
used in a direct semantic way and sentences used in an indirect way,
inside a belief context. The distinction is fundamental enough to show
up in psychology: research on autism reports that many autistic
children have great difficulty telling apart what is the case and how
other people perceive things to be.
In RDF this distinction shows up when moving from triples to
4-tuples. RDF/XML is a language that works best in the semantic realm.
With triples we can describe objects and their relationships. If we
want to speak about consistent ways of seeing the world, we need to
group statements into formulae, as is done in N3 [2] and TriX for
example. This allows us to name consistent sets of statements. It also
allows us to refer simultaneously to sets that are inconsistent with
one another. I can, for example, hold both of the following:
Lois Lane believes that Superman is different from Clark Kent
Clark Kent believes that he is Superman.
without contradiction.
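To make the move from triples to 4-tuples concrete, here is a minimal
sketch in Python. It is not RDF, N3 or TriX: the context names "lois"
and "clark", the predicate names "sameAs" and "differentFrom", and the
toy consistency check are all assumptions made up for illustration. The
fourth element of each tuple names whose view of the world a statement
belongs to.

```python
from collections import defaultdict

# A quad is (context, subject, predicate, object). The context names the
# believer; all names here are hypothetical, chosen for illustration.
quads = [
    ("lois", "Superman", "differentFrom", "ClarkKent"),
    ("clark", "ClarkKent", "sameAs", "Superman"),
]


def by_context(quads):
    """Group statements into one set of triples per context."""
    graphs = defaultdict(set)
    for context, s, p, o in quads:
        graphs[context].add((s, p, o))
    return dict(graphs)


def is_consistent(triples):
    """Toy check: no pair is stated to be both the same and different."""
    same = {frozenset((s, o)) for s, p, o in triples if p == "sameAs"}
    different = {frozenset((s, o)) for s, p, o in triples
                 if p == "differentFrom"}
    return not (same & different)


graphs = by_context(quads)

# Each believer's statements are consistent on their own ...
each_ok = all(is_consistent(g) for g in graphs.values())

# ... but dropping the context and asserting everything directly is not.
merged_ok = is_consistent(set().union(*graphs.values()))
```

Keeping the context is what lets both statements be recorded without
the recorder contradicting itself; the contradiction only appears once
the statements are merged into a single set of plain triples.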
So how does this relate to Atom? We need to be clear that,
semantically, an entryId and a feedId each point to one thing and one
thing only. But this does not mean that there cannot be erroneous,
false or corrupted feeds out there. Aggregators wishing to protect
against DOS attacks should simply do what we humans do in such
circumstances: quote what others are saying, without asserting it
themselves. This is why the proposal by Roy Fielding to allow feeds
inside of feeds was probably the best way to do things (I only came to
this conclusion yesterday; before that I had no idea what he was going
on about).
So to prevent a DOS attack, it is best to have aggregator feeds such as:
<feed>
  <!-- aggregator feed -->
  <feed src="http://true.org">
    <id>tag://true.org,2005/feed1</id>
    <entry>
      <title>Enter your credit card number here</title>
      ...
    </entry>
  </feed>
  <feed src="http://false.org">
    <id>tag://true.org,2005/feed1</id>
    <entry>
      <title>Enter your credit card number here</title>
      ...
    </entry>
  </feed>
</feed>
Here all the aggregator feed is claiming is that it has seen entries
inside other feeds. It need never claim to agree with any of their
content. And so the content of the first internal feed and the second
internal feed can be contradictory. They can, for example, contain
entries with the same id and the same updated timestamp but different
content.
It will be up to the consumer of such aggregated feeds to decide
which to trust.
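A consumer of such an aggregated feed could start by finding out where
the quoted feeds disagree. The following Python sketch assumes the
inner feeds have already been parsed into records. The QuotedEntry
type, the conflicting_claims function, the entry id, the timestamp and
the first content string are all hypothetical; only the two source URLs
and the credit card title come from the example above.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class QuotedEntry:
    source: str    # where the aggregator fetched the inner feed from
    entry_id: str  # atom:id as claimed by that feed (not verified)
    updated: str   # atom:updated as claimed by that feed
    content: str


def conflicting_claims(entries):
    """Map (entry_id, updated) to the sources whose content disagrees."""
    contents = defaultdict(set)
    sources = defaultdict(set)
    for e in entries:
        key = (e.entry_id, e.updated)
        contents[key].add(e.content)
        sources[key].add(e.source)
    return {key: sources[key] for key in contents if len(contents[key]) > 1}


# Hypothetical data: two sources claim the same id and timestamp.
entries = [
    QuotedEntry("http://true.org", "tag:true.org,2005:entry1",
                "2005-05-25T21:06:00Z", "Real announcement"),
    QuotedEntry("http://false.org", "tag:true.org,2005:entry1",
                "2005-05-25T21:06:00Z", "Enter your credit card number here"),
]

conflicts = conflicting_claims(entries)
```

Nothing is thrown away and nothing is asserted: every entry stays
attached to the source it was seen in, and the sketch only reports
which sources disagree. Deciding whom to trust remains a separate step.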
The good thing about this way of doing things is that one can define
a first-level feed in a simple semantic vocabulary, without needing to
scatter exception clauses all over the place. When dealing with feeds
inside a feed, one can then simply state that this indirection is
equivalent to the indirection of a belief context. Statements can be
contradictory across such internal feeds.
Taking this into account should help make the spec a lot cleaner,
easier to write and easier to understand. The problems are
fundamental, so they cannot be swept under the carpet. They will keep
popping up.
Henry Story
[1] http://www.imc.org/atom-syntax/mail-archive/msg15608.html
[2] http://www.w3.org/DesignIssues/Notation3.html