
RE: mnot-03, Infoset & syntax in sect. 2



Tim Bray wrote:
>> Already, we've seen a number of cases where people 
>> with popular blogs have had to stop providing 
>> RSS feeds due to bandwidth costs.
> I wasn't aware of this, but I've been busy.  General question: 
> is this one of our big concerns?
	Yes, bandwidth is a major concern. I'm building a system that
must continuously consume a large percentage of all the RSS and/or
Atom feeds on the web. Thus, bandwidth is a *very* major concern.
Additionally, I need to support many, many client programs and/or
web browsers which are consuming RSS or Atom data -- once again,
bandwidth is a concern. And, I've got a requirement to support
readers on PocketPCs, cell phones, etc. -- bandwidth and memory
footprint are major concerns with mobile devices. And, since I need to
sort/filter all the content in these feeds, hand it off to the right
clients at the right time, etc., I'm really concerned about parsing
speed. XML is not only "fat", it is also slow to parse compared to
the alternatives.
	These bandwidth/parsing issues don't directly impact most
individual users but they have a tremendous impact on anyone who is
doing aggregation or doing cross-feed data mining or analysis of any
sort. I totally agree that people should be able to do the easy thing
when building client programs and that typically means relying on XML.
My problem is with the almost dogmatic objection to even tolerating
systems that would make life easier for those who *do* care about
bandwidth or parsing speed. 

	What I would like to see is a world where the following
conditions hold:
	1. All publishers who provide data directly to end-users are
expected to provide that data as RSS, Atom, or whatever in XML format
-- complete with pointy brackets, etc.
	2. Publishers who want to be more "aggregator friendly" have
the *option* of providing an additional feed which is an ASN.1/PER
binary encoding of the XML Infoset. (It parses faster, is much more
compact than XML, and is semantically equivalent to the XML feed,
i.e. application/atom+per == application/atom+xml.)
	3. Aggregators, or anyone else who consumes the binary feeds,
are assumed to also support feeds in XML encodings.
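
One way conditions 1-3 could play out over HTTP is ordinary content
negotiation on the Accept header. The following is only a minimal
sketch under stated assumptions: "application/atom+per" is the
illustrative name used above, not a registered media type; the function
names are hypothetical; and wildcard ranges such as */* are deliberately
not handled. The XML feed is always the fall-back.

```python
XML_TYPE = "application/atom+xml"
PER_TYPE = "application/atom+per"  # assumption: illustrative name, not registered


def parse_accept(header):
    """Split an Accept header into (media_type, q) pairs."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # a missing q parameter means full preference
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q as "not wanted"
        prefs.append((media_type, q))
    return prefs


def choose_representation(accept_header, binary_available):
    """Pick the feed encoding to serve; XML is the default and fall-back."""
    offered = [XML_TYPE]
    if binary_available:
        offered.append(PER_TYPE)  # condition 2: binary is optional
    best_type, best_q = XML_TYPE, 0.0
    for media_type, q in parse_accept(accept_header):
        if media_type in offered and q > best_q:
            best_type, best_q = media_type, q
    return best_type
```

An aggregator that prefers the binary feed but also accepts XML
(condition 3) might send "application/atom+per,
application/atom+xml;q=0.5". A publisher with no binary feed still
answers with XML, and a client that never mentions the binary type
never receives it.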

	The problem with the kind of blind opposition to the binary
feeds that Tim Bray brings to this thread is that it makes it hard to
get discussion going on a *standard* binary encoding for the Infoset.
This means that there are wide-open opportunities for people to push
proprietary encodings that start off by addressing "critical needs"
and then start to seep out onto desktop systems, etc. As has been seen
in discussions on other lists, we've already got several flavors of
binary XML being defined in various standards groups. We've also got
Microsoft, BEA and others defining "binary formats" for interchange of
otherwise XML data. Each one of those "proprietary" formats is
basically a dormant virus just waiting to make a mess of the open net
and introduce interop problems.
	My personal, very strongly held opinion is that the best way
to ensure that interop is maintained is to address the binary issue
directly and define a *standard* binary encoding of the Infoset as
well as the rules for its use, requirements for XML-text support as a
fall-back, etc. Once this is done, there will be little to justify the
work of those who today see an opportunity or requirement for
proprietary formats. 
	Tim, you flame regularly against "proprietary" interfaces. I
agree with every one of your posts on this subject, except that I
disagree with your approach. To me, you sound like Nancy Reagan saying
"Just say No!". I believe that the more active approach of defining
standard binary encodings can eliminate the risk of proprietary
formats and interfaces being defined.
	As you say, we should "define the bits on the wire". Hear!
Hear!... Let's just make sure that we make it possible to define the
binary bits as well as the text bits...
	But, understand that I am not suggesting that Atom define a
binary format or even address the issue -- just that in the definition
of Atom, nothing should be done to make it difficult or "forbidden" to
produce binary encodings of Atom data if a *standard* binary encoding
is ever defined and accepted.

> ... which makes me think that maybe I'm feeding 
> a troll in answering this.
	And, I'm wondering why I bother responding to this sort of
insult... After all it was *you* who introduced the inflammatory,
troll-like, "fight to the death" language into this thread... No, I am
not a troll. I just happen to care about this issue a great deal.
Perhaps my concerns are the result of advanced age. Unlike some of
you folk, I've been building commercial software systems since the
late 70's and I've learned the hard way that ignoring "pesky little
issues" like bandwidth and processing efficiency can kill otherwise
elegant systems.

		bob wyman