Re: Blogging needs a whole new BAG...
Bob Wyman wrote:
Just as "the web" has a TAG (Technical Architecture Group) I think
Blogging needs a BAG (Blogging Architecture Group.) I believe that the
charter for either the W3C or IETF Atom group should include a task to
form a BAG and produce a Blogging Architecture.
Too much to take on; it'll never get done. It'll be enough work to
get Atom nailed within scope for year's end 2005.
Some of the issues that need to be dealt with are: (this is a
very incomplete list...)
* Push vs. Pull: Currently, blogging is all about pull and polling.
However, it is clear to many that the current approach simply won't
scale.
I agree with Tim Bray - that engineers can keep things in check,
with a network of CDNs and better behaved clients and the like. But
- who pays for that? What's the incentive? This ain't the nineties.
XMPP, NNTP, SMTP, SMS, BitTorrent are all options if anyone wanted
to deploy the client tools in anger. On the other hand I'm not even
sure feed traffic is a good fit for CDNs as deployed - perhaps Mark
Nottingham has thoughts. But it does seem to me that punishing the
popular becomes an even bigger problem with syndication.
We can't have millions of aggregators polling and pulling from
millions of Atom files without introducing massive resource consumption
issues.
True, but see above. It is doable, and the resources are available
to be punished.
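To make "better behaved clients" concrete, here is a minimal sketch,
assuming the feed server supports the standard HTTP validators (ETag and
Last-Modified). The function names are hypothetical and the server side
is a toy stand-in, not anyone's deployed code. A client that replays its
stored validators lets the server answer 304 with an empty body when
nothing has changed:

```python
from typing import Dict, Optional, Tuple


def conditional_headers(etag: Optional[str],
                        last_modified: Optional[str]) -> Dict[str, str]:
    """Build the validator headers a polite aggregator sends on a re-poll."""
    headers: Dict[str, str] = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers


def serve_feed(request_headers: Dict[str, str],
               current_etag: str,
               body: bytes) -> Tuple[int, bytes]:
    """Toy server: reply 304 with no body when the client's copy is current."""
    if request_headers.get("If-None-Match") == current_etag:
        return 304, b""
    return 200, body


# First poll has no validators, so the full feed is sent.
first = serve_feed(conditional_headers(None, None), '"v1"', b"<feed/>")
# Re-poll with the stored ETag: only headers cross the wire.
repoll = serve_feed(conditional_headers('"v1"', None), '"v1"', b"<feed/>")
```

This cuts bytes per poll but not the number of polls, so it eases the
cost without answering the push vs. pull question.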
Clearly, we need to consider providing real "publish/subscribe"
support for blogging and thus for Atom data.
We only need to consider it. But I would think unless something
funny is going to happen with linking/identity then Atom can float
above the transportation issues for the most part. Why is Atom
affected if I get it over SMS instead of HTTP?
We need to deal with the
question of "Push" not "Pull." Unfortunately, in order to get around
firewall and NAT issues, this probably implies that connection/session
oriented protocols be used.
I really would wait this out for at least 12 months. Syndication is
not the only technology with this issue. Who knows, maybe now the P2P
folks have a real-world problem to solve.
* Trackback: The "trackback protocol" is popular, provides benefit,
etc.; however, there are a number of major issues that should be
addressed. These issues include the same authentication and signature
issues that relate to pings. There are also important architectural
issues that should be addressed in order to deal with the problem of
trackback-spam. Is it possible/desirable to provide for "trackback
aggregators" or trackback services on behalf of blogs? Is Trackback
really just a special case of Pinging?
I don't see any difference. Really they're both subsumed by
backlinking - that is, they're dynamically generated backlinks. The web
has a lot of history around that subject as you allude to and Tim
Bray can attest to directly.
* OPML: Many blogging applications rely on OPML files as a means of
recording and exchanging lists of blogs, metadata about them, etc. There
is not, however, a well-accepted definition of how one builds an OPML
file (although Danny Ayers, among others, has worked on this issue) and
OPML is not the subject of any serious standardization effort. If OPML
use is to grow, we really must see some standardization here.
I would let OPML grow some more before nailing it down. Yes, it's an
interop joke, but so are many other things.
* The role of proxies, synthetic feed generators, systems like
PubSub.com, etc. is neither well understood nor defined; however, it is
clear that there are today quite a number of these things and there will
be more in the future. Effort should be put into at least clarifying the
issues related to these intermediary processors and distributors of Atom
files.
As you say, this stuff is not well defined, but a standards effort is
premature, don't you think?
* Comment support is provided by many blogs and we're beginning to
see issues related to comment spamming, etc. Is there something we can
do from an architectural point of view to prevent comment spamming?
(i.e. do we need something like an IRTF spamming group?) How do we
handle remote commenting and what is the place/role of services like
SixApart's "TypeKey" service? Should a "TypeKey" standard be developed?
How does the W3C Annotea effort relate to commenting and/or blogs in
general?
Many spam problems evaporate with usable crypto/pki/dsig. That most
no-one signs anything but most everyone bitches about spam is not an
Atom problem imvho.
* HTTP has limitations that are a true burden in blogging. For
instance, there is no server-side support for identifying and retrieving
of "fragments" of a resource served by an HTTP server. Thus, you can't
say things like "Give me only the entries that have been updated since
time XXXX". Should HTTP be extended to better address the needs of
Atom? Should RFC-3229 be extended to define an Atom-specific mechanism
for retrieving Atom Fragments?
That would be - "we need a query language 'cos the XPath hacks don't
cut it anymore". Again Atom is not the only technology that has a
querying issue - this by the way is one area where the semweb folks
would have very useful contributions. They've been thinking about
this stuff for years.
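For reference, the "updated since time XXXX" request looks like this
when done on the client today. This is a rough sketch with assumptions:
the namespace is the one later fixed for Atom 1.0, the sample feed and
function names are made up for illustration, and timestamps are taken to
be RFC 3339 strings. Note that the whole feed still has to be
transferred before it can be filtered, which is the complaint.

```python
import xml.etree.ElementTree as ET
from datetime import datetime
from typing import List

# Assumption: the namespace eventually standardized for Atom 1.0.
ATOM = "{http://www.w3.org/2005/Atom}"

# Hypothetical two-entry feed, for illustration only.
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry><id>urn:example:1</id><updated>2004-05-01T10:00:00Z</updated></entry>
  <entry><id>urn:example:2</id><updated>2004-05-03T10:00:00Z</updated></entry>
</feed>"""


def parse_ts(text: str) -> datetime:
    """Parse an RFC 3339 timestamp; 'Z' is rewritten for older Pythons."""
    return datetime.fromisoformat(text.strip().replace("Z", "+00:00"))


def entries_updated_since(feed_xml: str, since: str) -> List[str]:
    """Return the ids of entries whose atom:updated is later than `since`."""
    cutoff = parse_ts(since)
    fresh = []
    for entry in ET.fromstring(feed_xml).findall(ATOM + "entry"):
        updated = entry.findtext(ATOM + "updated")
        if updated and parse_ts(updated) > cutoff:
            fresh.append(entry.findtext(ATOM + "id"))
    return fresh


fresh_ids = entries_updated_since(SAMPLE, "2004-05-02T00:00:00Z")
```

A server-side mechanism, whether a query language or a delta encoding,
would run the same predicate before the bytes leave the server.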
* As Atom and blogging become more popular, there will be some
sites that generate large quantities of Atom formatted data and others
that will consume large quantities of it. This raises the issue of
binary or compressed formats for Atom data.
Bob. tut-tut.
* Many commercial publishers have expressed an interest in
syndication but also express great concern about the various issues
related to IPR and usage rights. Is there anything we can do to provide
a means to support digital rights management in Atom feeds or systems?
The basis for that would be to clearly specify c14n for Atom. That's
step one. Once that's down other specs can follow. This is one area
where sticking to raw XML is a good thing.
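Why c14n comes first, as a small sketch: two serializations of the same
entry differ byte for byte, so a digest or signature over the raw text
would not match until both are canonicalized. This uses generic XML
C14N 2.0 from the Python standard library (xml.etree.ElementTree.canonicalize,
Python 3.8+), which is an assumption standing in for whatever profile
Atom ends up specifying; the sample entries and the helper name are
hypothetical.

```python
import hashlib
from xml.etree.ElementTree import canonicalize

# Same entry, different attribute order, quoting and empty-element style.
ENTRY_A = ('<entry xmlns="http://www.w3.org/2005/Atom">'
           '<link rel="alternate" href="http://example.org/1"/></entry>')
ENTRY_B = ("<entry xmlns='http://www.w3.org/2005/Atom'>"
           "<link href='http://example.org/1' rel='alternate'></link></entry>")


def canonical_digest(xml_text: str) -> str:
    """SHA-256 over the C14N 2.0 form of the XML, not over the raw bytes."""
    canonical = canonicalize(xml_text)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


raw_match = (hashlib.sha256(ENTRY_A.encode("utf-8")).hexdigest()
             == hashlib.sha256(ENTRY_B.encode("utf-8")).hexdigest())
canonical_match = canonical_digest(ENTRY_A) == canonical_digest(ENTRY_B)
```

Here raw_match is False and canonical_match is True. Signing and rights
metadata can only be layered on once everyone agrees on that canonical
form.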
* PICS [...] P3P
This strikes me as an area where client tools innovation is desperately
needed. Allowing people to express their content filters is
primarily a usability issue. Most people can't/won't use basic email
filters, never mind this stuff. My point is that it doesn't matter
what's specified when people are unable to use it. I would vote to
let the Bayesians handle this.
* Various efforts have been made to support "categories" as
meta-data in blogs and blog entries. Most have failed. It seems like
these efforts should be formally linked to similar efforts such as the
ISO Topic Map or the XTM standards. Ideally, any solution for blogs
would also work well in non-blog environments. Do we need
standardization of things like "subject indicators" or do the existing
standards do the job?
A mapping into RDF is a charter option (rather than the RSS1.0
instaparse approach). We'll provide a mapping for Atom one way or
another. Danny has been working with OWL lately - Oh well, I've been
too busy to contribute anything to that effort recently, so I guess
that's where it's heading.
cheers
Bill