
ETags and concurrency control



We have been looking at concurrency control in Astoria (http://astoria.mslivelabs.com). The approach we're taking is to use the usual HTTP ETag header to encode the state of the resource, and then If-Match headers to validate that the client is making modifications based on correct assumptions about the state of the server (or the client sends "*" to indicate it does not care).
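To make the check concrete, here is a minimal sketch of the server-side If-Match test in Python. It is an illustration only, not Astoria's code: the function name `precondition_ok` is hypothetical, and the header parsing is simplified (it splits on commas, so it assumes no ETag contains a comma).

```python
def precondition_ok(if_match, current_etag):
    """Decide whether a write carrying an If-Match header may proceed.

    if_match     -- raw If-Match header value sent by the client
    current_etag -- the resource's current ETag, or None if it does not exist
    """
    if current_etag is None:
        # Nothing to match against, so even "*" fails.
        return False
    if if_match.strip() == "*":
        # The client does not care which version it overwrites.
        return True
    if current_etag.startswith("W/"):
        # If-Match uses strong comparison; a weak tag never matches.
        return False
    # Simplified parsing: assumes no ETag contains a comma.
    candidates = [tag.strip() for tag in if_match.split(",")]
    return current_etag in candidates
```

A server following this sketch would answer 412 (Precondition Failed) whenever the function returns False, and apply the update otherwise.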

We ran into a few interesting things that I thought I'd share here to see if there are pre-established assumptions about any of them, or whether somebody has already run into them before.

I posted a complete write-up here, and the short version follows below:
http://blogs.msdn.com/astoriateam/archive/2008/04/22/optimistic-concurrency-data-services.aspx

Creating ETags: Astoria can be layered on top of many data sources, so ETags can be many different things. We designed it so that the developer creating the service can indicate a subset of the properties of the "entity" (e.g. object, record, etc.) being exposed as part of the ETag. In the ideal case they'd just use a single timestamp property, but in practice that's not always possible. One effect of this is that in some corner cases ETags can get big and bloat the header.
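A rough sketch of how an ETag could be derived from a chosen subset of properties, in Python. The names `make_etag`, `etag_properties` and the `digest` flag are hypothetical, and the encoding is an assumption rather than the format Astoria emits. The `digest` option hashes the values to keep the header a fixed size; that is a swapped-in idea to illustrate one answer to the bloat problem, not something described above.

```python
import hashlib
from urllib.parse import quote


def make_etag(entity, etag_properties, digest=False):
    """Build a quoted ETag from the named properties of an entity (a dict).

    Values are percent-encoded so that quotes, commas, or semicolons inside
    a property value cannot break the header syntax.
    """
    parts = [
        f"{name}={quote(str(entity[name]), safe='')}"
        for name in etag_properties
    ]
    raw = ";".join(parts)
    if digest:
        # Assumption: trade readability for a fixed-size tag.
        raw = hashlib.sha256(raw.encode("utf-8")).hexdigest()
    return f'"{raw}"'


order = {"id": 1, "version": 7, "note": "rush"}
etag = make_etag(order, ["version"])  # only "version" participates
```

With this scheme a change to a property outside `etag_properties` leaves the ETag untouched, which is exactly the situation where the tag stops being a bit-for-bit validator.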

ETags in headers versus body: whenever we return a resource that represents a single underlying entity, if the entity has ETag-annotated properties we include an ETag response header. However, there are times when a URL points to a resource that's a list of other resources/entities (e.g. /Customers(123)/SalesOrders returns a feed with an entry for each sales-order resource for a given customer). In those cases returning an ETag in the header wouldn't help a client that needs to update a single one of those entries later on. So we include the ETag for each entry in an attribute on the entry element. I wonder if there is prior art for this...
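The per-entry idea can be sketched as follows, in Python. The attribute name `etag` (with no namespace) and the helper names are placeholders chosen for illustration; they are not the actual attribute or wire format Astoria uses. Only the Atom namespace URI is real.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"


def build_feed(entries):
    """Build an Atom-style feed from (entry_id, etag) pairs.

    Each entry carries its own ETag in a hypothetical 'etag' attribute.
    """
    feed = ET.Element(f"{{{ATOM}}}feed")
    for entry_id, etag in entries:
        entry = ET.SubElement(feed, f"{{{ATOM}}}entry", {"etag": etag})
        ET.SubElement(entry, f"{{{ATOM}}}id").text = entry_id
    return feed


def etags_by_id(feed):
    """Client side: map each entry id to the ETag to send back in If-Match."""
    return {
        entry.findtext(f"{{{ATOM}}}id"): entry.get("etag")
        for entry in feed.iter(f"{{{ATOM}}}entry")
    }


feed = build_feed([("SalesOrders(1)", '"v3"'), ("SalesOrders(2)", '"v9"')])
wire = ET.tostring(feed)                  # what the server would send
tags = etags_by_id(ET.fromstring(wire))   # what the client recovers
```

A client that later updates SalesOrders(2) would send the recovered value for that entry in its If-Match header, without first re-fetching the entry on its own.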

Weak ETags: the HTTP spec is pretty explicit that ETag/If-Match headers require bit-by-bit equality of the resource, and if the resource changes by a single bit then the ETag should be different. There is a concept of "weak ETags" that can be equal if the resource is "semantically equivalent", but the spec indicates those are only to be used in GETs. So for the cases where a developer uses something that's not a reliable timestamp as an ETag, we end up deviating somewhat from the spec. I'm sure that other folks have run into this before.
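For reference, the two comparison functions the HTTP spec defines for entity tags can be sketched like this in Python. The function names are hypothetical; the `W/` prefix is the spec's marker for a weak tag.

```python
def split_etag(etag):
    """Return (is_weak, opaque_tag) for an ETag such as '"abc"' or 'W/"abc"'."""
    weak = etag.startswith("W/")
    return weak, (etag[2:] if weak else etag)


def strong_equal(a, b):
    """Strong comparison: both tags must be strong and identical."""
    a_weak, a_tag = split_etag(a)
    b_weak, b_tag = split_etag(b)
    return not a_weak and not b_weak and a_tag == b_tag


def weak_equal(a, b):
    """Weak comparison: the opaque tags match, weak or not."""
    return split_etag(a)[1] == split_etag(b)[1]
```

If-Match is defined in terms of the strong function, so a tag built from properties that do not capture every change behaves like a weak validator being used where a strong one is expected.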

Thanks
-pablo