RE: mnot-03, Infoset & syntax in sect. 2
Bandwidth is a concern, but I believe interop to be a greater one. The
current text form of XML is the one currently deployed virtually everywhere.
Ok, if you need to aggregate millions of feeds then I can see the attraction
of a binary encoding throughout. I don't think this really is such an issue
with small devices - they aren't that small in terms of capability any more.
If it is an issue then I think it likely that the solution is server-side
filtering by relevance of feed/entries rather than improved compression. You
talk of it being difficult for people wanting to mine the data - but these
are exceptional cases. Compare the number of people that view the web in
a browser with the number of search engines.
But anyway, given that there is no single widely-deployed standard for
binary XML (beyond server-gzipped), why would you want a binary
serialization of Atom to be based on the Infoset? If bandwidth and
efficiency are the concern, wouldn't it make a lot more sense to derive a
binary serialization directly from the abstract model of an Atom feed rather
than through what is in effect a model used to help create a text-based
serialization?
Cheers,
Danny.
> -----Original Message-----
> From: owner-atom-syntax@xxxxxxxxxxxx
> [mailto:owner-atom-syntax@xxxxxxxxxxxx]On Behalf Of Bob Wyman
> Sent: 28 December 2003 00:02
> To: 'Tim Bray'
> Cc: 'Mark Nottingham'; 'atom-syntax'
> Subject: RE: mnot-03, Infoset & syntax in sect. 2
>
>
>
> Tim Bray wrote:
> >> Already, we've seen a number of cases where people
> >> with popular blogs have had to stop providing
> >> RSS feeds due to bandwidth costs.
> > I wasn't aware of this, but I've been busy. General question:
> > is this one of our big concerns?
> Yes, bandwidth is a major concern. I'm building a system that
> must consume a large percentage of all the RSS and/or
> Atom feeds on the web on a continuous basis. Thus, bandwidth is
> a *very* major concern. Additionally, I need to
> support many, many client programs and/or web browsers which
> are consuming RSS or Atom data -- once again bandwidth is a concern.
> And, I've got a requirement to support readers on PocketPCs, cell
> phones, etc. -- bandwidth and memory footprint are major concerns with
> mobile devices. And, since I need to sort/filter all the content in
> these feeds, hand them off to the right clients at the right time,
> etc., I'm really concerned about parsing speed. XML is not only "fat",
> it is also slow to parse compared to the alternatives.
> These bandwidth/parsing issues don't directly impact most
> individual users but they have a tremendous impact on anyone who is
> doing aggregation or doing cross-feed data mining or analysis of any
> sort. I totally agree that people should be able to do the easy thing
> when building client programs and that typically means relying on XML.
> My problem is with the almost dogmatic objection to even tolerating
> systems that would make life easier for those who *do* care about
> bandwidth or parsing speed.
>
> What I would like to see is a world where the following
> conditions hold:
> 1. All publishers who provide data directly to end-users are
> expected to provide that data as RSS, Atom, or whatever in XML format
> -- complete with pointy brackets, etc.
> 2. Publishers who want to be more "aggregator friendly" have
> the *option* of providing an additional feed which is an ASN.1/PER
> binary encoding of the XML Infoset. (It parses faster, is much more
> compressed than XML, and provides semantic equivalence with the XML
> feeds, i.e. application/atom+per == application/atom+xml.)
> 3. Aggregators, or anyone else who consumes the binary feeds,
> are assumed to also support feeds in XML encodings.
>
> The problem with the kind of blind opposition to the binary
> feeds that Tim Bray brings to this thread is that it makes it hard to
> get discussion going on a *standard* binary encoding for the Infoset.
> This means that there are wide-open opportunities for people to push
> proprietary encodings that start off by addressing "critical needs"
> and then start to seep out onto desktop systems, etc. As has been seen
> in discussions in other lists, we've already got several flavors of
> binary XML being defined in various standards groups. We've also got
> Microsoft, BEA and others defining "binary formats" for interchange of
> otherwise XML data. Each one of those "proprietary" formats is
> basically a dormant virus just waiting to make a mess of the open net
> and introduce interop problems.
> My personal, very strongly held opinion is that the best way
> to ensure that interop is maintained is to address the binary issue
> directly and define a *standard* binary encoding of the Infoset as
> well as the rules for its use, requirements for XML-text support as a
> fall-back, etc. Once this is done, there will be little to justify the
> work of those who today see an opportunity or requirement for
> proprietary formats.
> Tim, you flame regularly against "proprietary" interfaces. I
> agree with every one of your posts on this subject -- except, I
> disagree with your approach. To me, you sound like Nancy Reagan saying
> "Just say No!". I believe that the more active approach of defining
> standard binary encodings can eliminate the risk of proprietary
> formats and interfaces being defined.
> As you say, we should "define the bits on the wire". Hear!
> Hear!... Let's just make sure that we make it possible to define the
> binary bits as well as the text bits...
> But, understand that I am not suggesting that Atom define a
> binary format or even address the issue -- just that in the definition
> of Atom, nothing should be done to make it difficult or "forbidden" to
> produce binary encodings of Atom data if a *standard* binary encoding
> is ever defined and accepted.
>
> > ... which makes me think that maybe I'm feeding
> > a troll in answering this.
> And, I'm wondering why I bother responding to this sort of
> insult... After all it was *you* who introduced the inflammatory,
> troll-like, "fight to the death" language into this thread... No, I am
> not a troll. I just happen to care about this issue a great deal.
> Perhaps my concerns are the result of advanced age. Unlike some of
> you folk, I've been building commercial software systems since the
> late 70's and I've learned the hard way that ignoring "pesky little
> issues" like bandwidth and processing efficiency can kill otherwise
> elegant systems.
>
> bob wyman
>
>