
Fwd: PaceBatch and pipelining



The debate on atom-protocol has turned into a pretty big firestorm,
with the general battle lines drawn around whether HTTP 1.1 pipelining
makes batch-level support in the Atom Publishing Protocol unnecessary.
There are some pretty big names on both sides of the fence (e.g. Roy
Fielding is anti-batch, Tim Bray is pro-batch).

AdamB and I discussed this briefly and came to the conclusion that the
best (and right) way to inform this debate is with empirical data, so
I posted the message below and signed us up to do so.

I'm hoping I can get some assistance with this effort to speed things
up.  I can write the client side of the tests and drive execution/roll
up the data, but I was hoping that we could redirect/reprioritize
Arthur for the near term to batch-enable Forseti to use as our test
case on the server side.  I think we can just test against a Noop
FeedProvider that doesn't actually do anything with the updates
(agree?), since this is primarily a network/client issue.  I could
also use an assist from someone on how best to set up for #3 below by
deploying Arthur's server to various near/far production environments
and running the client over fast/slow links (which I assume already
exist for testing other Google client stuff).

If someone wants the blow-by-blow on this issue, just go to [1] and
look for the threads that contain PaceBatch in the title.

TIA,

-- Kyle

[1] - http://www.imc.org/atom-protocol/mail-archive/threads.html

---------- Forwarded message ----------
From: kmarvin <kmarvin@xxxxxxxxxx>
Date: Jun 28, 2005 11:39 AM
Subject: Re: PaceBatch and pipelining
To: Bill de hÓra <bill@xxxxxxxxxx>
Cc: Atom-Protocol <atom-protocol@xxxxxxx>


>
> It would help if those arguing pro or con could tell us what they want
> to optimize for, if anything - for all I know this is /purely/ a
> convenience mechanism for the client to stuff entries together a la
> Frank's point and has nothing much to do with the server-side handling
> or connection details.
>

Speaking as one of the authors of the PACE: the original motivation
was a belief that it would be difficult in practice for clients to
achieve good performance by relying solely upon HTTP 1.1 pipelining as
the mechanism to avoid round-trip latency when doing bulk updates.
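
To make the round-trip concern concrete, here is a purely illustrative
back-of-the-envelope calculation (the numbers are made up, not
measurements):

    50 entry creates, one request per round trip, 100 ms round-trip time:
      latency cost alone >= 50 x 100 ms = 5 seconds
    the same 50 creates pipelined on one connection, or carried in a
    single batch request:
      latency cost approaches one round trip (~100 ms), leaving
      bandwidth and server processing as the limiting factors

Which mechanism gets closer to that ideal in practice, and at what
cost in client complexity, is exactly what I'd like to measure.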

Some very strong data and great thinking have come out on both sides
of this thread, and it is greatly appreciated.   Inside Google, the
general rule of thumb is to make these types of decisions empirically,
not theoretically.   On that basis, I'm willing to commit to an effort
to build some real-world tests against a real-world server to compare
the tradeoffs.

These tests should (a rough harness sketch follows the list):

1. vary the batch/pipeline sizes to see the performance tradeoffs
2. vary the amount of data in the content payload of entries (for creates/puts)
3. vary (as much as possible) the latency between the client and the server
4. use currently available HTTP client libraries that support HTTP 1.1
pipelining
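
To give a feel for what I have in mind on the client side, here is a
rough sketch of the harness in Java.  Everything concrete in it is a
placeholder I made up for illustration (the collection URL, the media
types, the idea of concatenating entries into a feed document as the
batch format, and the class/method names); it is not the real Forseti
setup or the final PaceBatch format, and for variant A the naive
serial loop would be replaced by a genuinely pipelining client:

    import java.io.OutputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;

    /**
     * Rough harness sketch: times N entry creations sent as individual
     * POSTs versus a single POST carrying all N entries, and reports
     * client-side throughput.  All endpoints and payloads are placeholders.
     */
    public class BatchVsSerialSketch {

        // Placeholder collection URL; real tests would point at the
        // batch-enabled test server.
        static final String COLLECTION = "http://example.org/app/collection";

        public static void main(String[] args) throws Exception {
            int entries = args.length > 0 ? Integer.parseInt(args[0]) : 50;
            int payloadBytes = args.length > 1 ? Integer.parseInt(args[1]) : 1024;
            byte[] entry = buildEntry(payloadBytes);

            // Variant A: one POST per entry.  Done naively here (each POST is
            // a full round trip); the real test would use a pipelining-capable
            // client so the requests overlap on one connection.
            long start = System.nanoTime();
            for (int i = 0; i < entries; i++) {
                post("application/atom+xml;type=entry", entry);
            }
            report("serial POSTs", entries, System.nanoTime() - start);

            // Variant B: one POST carrying all the entries in a single batch
            // document (the format here is just a stand-in).
            byte[] batch = buildBatch(entry, entries);
            start = System.nanoTime();
            post("application/atom+xml;type=feed", batch);
            report("single batch POST", entries, System.nanoTime() - start);
        }

        static void post(String contentType, byte[] body) throws Exception {
            HttpURLConnection conn =
                    (HttpURLConnection) new URL(COLLECTION).openConnection();
            conn.setRequestMethod("POST");
            conn.setDoOutput(true);
            conn.setRequestProperty("Content-Type", contentType);
            conn.setFixedLengthStreamingMode(body.length);
            OutputStream out = conn.getOutputStream();
            out.write(body);
            out.close();
            conn.getResponseCode();   // wait for the response so timing includes it
            conn.disconnect();
        }

        static byte[] buildEntry(int payloadBytes) {
            StringBuilder sb = new StringBuilder(
                    "<entry xmlns='http://www.w3.org/2005/Atom'><title>t</title><content>");
            for (int i = 0; i < payloadBytes; i++) sb.append('x');
            return sb.append("</content></entry>").toString().getBytes();
        }

        static byte[] buildBatch(byte[] entry, int count) {
            StringBuilder sb =
                    new StringBuilder("<feed xmlns='http://www.w3.org/2005/Atom'>");
            String e = new String(entry);
            for (int i = 0; i < count; i++) sb.append(e);
            return sb.append("</feed>").toString().getBytes();
        }

        static void report(String label, int entries, long elapsedNanos) {
            double seconds = elapsedNanos / 1e9;
            System.out.printf("%s: %d entries in %.3f s (%.1f entries/s)%n",
                    label, entries, seconds, entries / seconds);
        }
    }

Varying the entries and payloadBytes arguments covers #1 and #2; #3
comes from where the client and server are deployed relative to each
other.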

The hard metric is throughput as measured from the client end of the
wire.  A softer metric is the effort/complexity required on the client
side to achieve that performance and to deal with error conditions
under the two mechanisms.

I should warn up front that it may be problematic to share the code
for these tests, since they will be based upon a potential usage of
APP that is not yet part of public products.   I should be able to
answer many reasonable questions, though.  Any feedback on the test
process above (ideally, prior to building/executing the tests :) is
certainly appreciated.

Google is willing to base its support (or withdrawal) of the PACE on
the results of these tests.

It is going to take some time to pull together the various pieces and
produce results, and I haven't seen any specifics from Sam or Tim on
the timeline for considering the PACE.   Can someone help me
understand this, given that I am relatively new to the process?

Related to #4 above, I'm also soliciting information about existing
client library support for HTTP 1.1 pipelining.   I'm aware of libwww,
but am wondering what else is available for other languages, like Java.
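
In case it helps frame that question, here is the kind of thing I mean
at the wire level: a bare-bones Java sketch that pipelines a few
requests over a raw socket.  It uses GETs purely for simplicity, the
host and request count are arbitrary, and it is only an illustration
of the mechanism, not something I'd use for the tests (which is
exactly why I'm asking about real library support):

    import java.io.InputStream;
    import java.io.OutputStream;
    import java.net.Socket;

    /**
     * Bare-bones illustration of HTTP 1.1 pipelining: several requests are
     * written back-to-back on one connection before any response is read,
     * and the responses come back in the same order.  Not a real client.
     */
    public class PipelineSketch {
        public static void main(String[] args) throws Exception {
            String host = args.length > 0 ? args[0] : "www.example.org";
            int requests = 3;

            Socket socket = new Socket(host, 80);
            OutputStream out = socket.getOutputStream();

            // Write all the requests up front; this is the pipelining part.
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < requests; i++) {
                sb.append("GET / HTTP/1.1\r\n")
                  .append("Host: ").append(host).append("\r\n");
                if (i == requests - 1) {
                    sb.append("Connection: close\r\n"); // lets us read to EOF below
                }
                sb.append("\r\n");
            }
            out.write(sb.toString().getBytes("US-ASCII"));
            out.flush();

            // Read everything the server sends; a real client would parse each
            // response and match it to its request by order on the connection.
            InputStream in = socket.getInputStream();
            byte[] buf = new byte[8192];
            int n, total = 0;
            while ((n = in.read(buf)) != -1) {
                total += n;
            }
            System.out.println("read " + total + " bytes for "
                    + requests + " pipelined requests");
            socket.close();
        }
    }

Of course the interesting case for APP is pipelining the POSTs/PUTs
themselves, where client and intermediary support and error handling
get murkier, hence the question.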

Thanks again for all the serious consideration and feedback that the
PACE has received thus far (both pro and con).

-- Kyle