
Re: [Fwd: A modest proposal for an OpenSocial RESTful API]

On Mar 3, 2008, at 4:11 PM, Lisa Dusseault wrote:
On Mar 3, 2008, at 10:15 AM, Roy T. Fielding wrote:
There are no caching issues with pipelining -- only benefits.  The
idempotency issue is due to error-recovery concerns when a connection is
terminated, leaving the client unsure which non-repeatable requests were
received by the origin.

Is this just a multiple-instance version of the same problem that occurs when a connection is terminated, leaving the client unsure whether a single non-repeatable request was received? Is anything different about handling broken connections with pipelining versus without, apart from the number of requests whose status is in doubt?

Yes, same problem.  To make it clear, let's just use POST as the example
instead of talking about idempotence (a mostly useless and confusing term).
In all cases below, substituting "single POST request" results in the same
issues and has the same set of solutions (mostly using out-of-band
knowledge), yet we don't forbid POST.

To quote from 2616:
   Clients SHOULD NOT pipeline requests using non-idempotent methods or
   non-idempotent sequences of methods (see section 9.1.2). Otherwise, a
   premature termination of the transport connection could lead to
   indeterminate results. A client wishing to send a non-idempotent
   request SHOULD wait to send that request until it has received the
   response status for the previous request.

   This means that clients, servers, and proxies MUST be able to recover
   from asynchronous close events. Client software SHOULD reopen the
   transport connection and retransmit the aborted sequence of requests
   without user interaction so long as the request sequence is
   idempotent (see section 9.1.2). Non-idempotent methods or sequences
   MUST NOT be automatically retried, although user agents MAY offer a
   human operator the choice of retrying the request(s). Confirmation by
   user-agent software with semantic understanding of the application
   MAY substitute for user confirmation. The automatic retry SHOULD NOT
   be repeated if the second sequence of requests fails.
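As a hypothetical Python sketch of those retry rules (the method table and function names here are mine, not from the spec): after an asynchronous close, the aborted sequence may be retransmitted automatically only if every request in it used an idempotent method, and the automatic retry is not repeated.

```python
# Sketch of the RFC 2616 automatic-retry rules quoted above.
IDEMPOTENT_METHODS = {"GET", "HEAD", "PUT", "DELETE", "OPTIONS", "TRACE"}

def may_auto_retry(aborted_requests, already_retried_once):
    """aborted_requests: list of (method, uri) pairs whose fate is unknown
    after an asynchronous close."""
    if already_retried_once:
        # "The automatic retry SHOULD NOT be repeated if the second
        # sequence of requests fails."
        return False
    # Retry without user interaction only if the whole sequence is
    # idempotent; a sequence containing POST needs confirmation.
    return all(method in IDEMPOTENT_METHODS
               for method, _uri in aborted_requests)

print(may_auto_retry([("GET", "/a"), ("PUT", "/b")], False))   # True
print(may_auto_retry([("GET", "/a"), ("POST", "/b")], False))  # False
```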

9.1.2 Idempotent Methods

   Methods can also have the property of "idempotence" in that (aside
   from error or expiration issues) the side-effects of N > 0 identical
   requests is the same as for a single request. The methods GET, HEAD,
   PUT and DELETE share this property. Also, the methods OPTIONS and
   TRACE SHOULD NOT have side effects, and so are inherently idempotent.

   However, it is possible that a sequence of several requests is non-
   idempotent, even if all of the methods executed in that sequence are
   idempotent. (A sequence is idempotent if a single execution of the
   entire sequence always yields a result that is not changed by a
   reexecution of all, or part, of that sequence.) For example, a
   sequence is non-idempotent if its result depends on a value that is
   later modified in the same sequence.

   A sequence that never has side effects is idempotent, by definition
   (provided that no concurrent operations are being executed on the
   same set of resources).
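To illustrate the definition in 9.1.2 with a toy in-memory store (names here are illustrative, not any real API): N identical PUTs leave the same state as a single PUT, while N identical POSTs do not.

```python
# Toy resource store: PUT replaces, POST creates a new subordinate.
store = {}
counter = 0

def put(uri, body):
    store[uri] = body            # replacement: repeating it changes nothing

def post(uri, body):
    global counter
    counter += 1                 # each call creates a distinct resource
    store[f"{uri}/{counter}"] = body

put("/doc", "v1"); put("/doc", "v1")
assert store == {"/doc": "v1"}   # two PUTs, same state as one

post("/coll", "x"); post("/coll", "x")
assert store == {"/doc": "v1", "/coll/1": "x", "/coll/2": "x"}  # two entries
```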

I've never understood how a client can tell which methods are idempotent. PUT may not be idempotent on a versioning server. Side-effects are always possible. Unless I'm missing something, telling the client not to pipeline non-idempotent methods is a useless requirement.

I'd call it actively harmful, given the number of times people have used
it as an excuse not to implement pipelines. It should have been a simple
warning to implementers.

Is there a requirement that would make more sense? There should be no problem pipelining non-idempotent requests as long as the connection is healthy. This is effectively what BATCH proposals try to do: make a bunch of requests atomic, or require the server to read to the end of the batch before applying the methods within it, effectively checking that the connection remained healthy before applying the series of requests.

Exactly.  And once you have a BATCH defined, the next thing that is
"optimized" is the parallel handling of requests within the batch ...
and we are back to square one.

If the problem solved by idempotency requirements only happens when there are failed connections, is it worth solving? Or is there a requirement for recovering from failed connections which clients would be capable of following -- such as retrying requests in order if it retries any?

The requirement should have been on the recipient of pipelined requests.
Namely, each request must be handled in the order received and responded
to in the same order, with one exception: any sequence of requests
consisting entirely of known safe methods (e.g., GET, HEAD, OPTIONS, TRACE)
may be processed in parallel provided that the associated responses are
still delivered in the order received.
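That recipient-side rule could be sketched roughly as follows (a hypothetical helper, not a real server API): requests are processed in order, except that a run of known-safe methods may be handled in parallel, with responses still collected back in received order.

```python
# Sketch: process pipelined requests in order; a run of safe methods may
# be handled concurrently, but responses are delivered in request order.
from concurrent.futures import ThreadPoolExecutor

SAFE = {"GET", "HEAD", "OPTIONS", "TRACE"}

def handle_pipeline(requests, handle):
    """requests: ordered list of (method, uri) pairs.
    handle(method, uri) -> response. Returns responses in request order."""
    responses = [None] * len(requests)
    i = 0
    with ThreadPoolExecutor() as pool:
        while i < len(requests):
            j = i
            while j < len(requests) and requests[j][0] in SAFE:
                j += 1
            if j > i:
                # A run of safe requests: process in parallel ...
                futures = [pool.submit(handle, *requests[k])
                           for k in range(i, j)]
                for k, f in zip(range(i, j), futures):
                    responses[k] = f.result()   # ... respond in order
                i = j
            else:
                # Unsafe request: handle strictly in sequence.
                responses[i] = handle(*requests[i])
                i += 1
    return responses
```

For example, `handle_pipeline([("GET", "/a"), ("POST", "/b"), ("GET", "/c")], handler)` would run the two GETs sequentially around the POST, never reordering the responses.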

Note that this does not solve the problem of mid-stream request failures
any more than HTTP solves the problem of mid-POST request failures.
We would still need some warning to implementors.  However, it doesn't
prevent applications from using their own out-of-band information to
recover from failures.  For example, an automated Atom feeder is fully
capable of discovering whether a POST succeeded or not, automatically
and without user input, even if the connection goes away.  The same is
true of any authoring protocol that allows discovery of versions.
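The out-of-band recovery idea might look like this in a hypothetical Python sketch (all names are mine): if the connection dies mid-POST, an Atom-style client consults the collection to learn whether its entry was created before deciding to retry.

```python
# Sketch: recover from a broken connection after a POST by checking,
# out of band, whether the entry already exists in the collection.
def post_with_recovery(post, get_collection, entry_id, body):
    """post(entry_id, body) -> response, may raise ConnectionError.
    get_collection() -> set of entry IDs currently in the collection."""
    try:
        return post(entry_id, body)
    except ConnectionError:
        # Fate of the POST is unknown: ask the server what it has.
        if entry_id in get_collection():
            return "already created"    # POST succeeded before the close
        return post(entry_id, body)     # never arrived: safe to retry
```

This is why the non-repeatability of POST is a recoverable problem for any authoring protocol that lets the client discover what the server has stored.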