Dealing with large collections [Re: URI constraints]
On Oct 11, 2004, at 11:51 AM, Robert Sayre wrote:
There is an open question for what to do with the blog that has 100,000
posts in it. That would be a scary-huge PROPFIND result (legal, if a bit
expensive; could be about a 20 meg response).
What's stopping you from having child collections? But yes, in
general, we have a query problem.
Robert, how are you suggesting child collections could be used to solve
this problem? I'm having trouble picturing a use that would feel
'natural.'
Full-blown search as in [1] seems a bit heavy for the Atom protocol.
The current problem, as I see it, is not "how to search an Atom
collection" but "how to specify and return subsets of a collection."
Search might be a legitimate function, but out of scope for now, I
think. How to deal with subsets is more pressing because of the cost of
returning huge results. I like a client-specified "query" approach that
specifies limits, without a full-blown SQL-like language.
On Oct 11, 2004, at 12:19 PM, Greg Stein wrote:
I'd think that two types of limits would be reasonable: max-count and
since-this-date.
Would "since-this-date" indicate those items that have been added since
that date, or modified? I can see use cases for both. Similarly, if
max-count is n, we'd need to pin down which n are expected to be
returned (what the sort order is, in other words). I don't want to
debate the number of date fields that are needed. In fact, we haven't
talked about whether date fields should be required for templates,
categories, etc., and I'd rather not force the issue.
Instead, I propose that a collection-getting request simply includes a
sort-field element which specifies some field of the item. And instead
of "since-this-date" we use "max-value," "min-value," or "limit-value".
Here's a sketchy example (whether the request is PROPFIND or something
else is orthogonal):
PROPFIND /my/silly/resource HTTP/1.1
...
<A:limit>
  <A:sort-field>A:updated</A:sort-field>
  <A:limit-value>2004-09-01T12:00:00Z</A:limit-value>
  <A:sort-order>ascending</A:sort-order>
</A:limit>
Is there any precedent for using XML element names as the content of an
element or attribute? I can imagine a swamp of namespacing issues. Is
there a better way to pick out an element?
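To make the namespacing worry concrete, here is a minimal Python sketch of how a server might read such a limit element and apply it to an in-memory list. Everything in it is an assumption for illustration: the namespace URI, the helper names parse_limit and apply_limit, entries modeled as plain dicts, and the reading that an ascending sort treats limit-value as an inclusive lower bound (descending treats it as an inclusive upper bound). Note that the XML parser resolves the prefix on element names, but "A:updated" inside the element content is just a string that the application has to interpret itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI; the thread never names one for the "A:" prefix.
NS = "http://example.org/atom-protocol-sketch"

REQUEST_BODY = f"""
<A:limit xmlns:A="{NS}">
  <A:sort-field>A:updated</A:sort-field>
  <A:limit-value>2004-09-01T12:00:00Z</A:limit-value>
  <A:sort-order>ascending</A:sort-order>
</A:limit>
"""


def parse_limit(body):
    """Return (field, limit_value, ascending) from a limit element."""
    root = ET.fromstring(body)

    def text(name):
        return root.findtext(f"{{{NS}}}{name}")

    # The prefix in the content is opaque to the parser, so this sketch
    # simply strips it instead of resolving it against a namespace.
    field = text("sort-field").split(":", 1)[-1]
    return field, text("limit-value"), text("sort-order") == "ascending"


def apply_limit(entries, field, limit_value, ascending):
    """Filter and sort entries (dicts) according to the parsed limit."""
    # Assumed semantics: ascending means "at or after limit-value",
    # descending means "at or before limit-value".
    if ascending:
        kept = [e for e in entries if e[field] >= limit_value]
    else:
        kept = [e for e in entries if e[field] <= limit_value]
    # UTC timestamps in one fixed format compare correctly as strings.
    return sorted(kept, key=lambda e: e[field], reverse=not ascending)
```

Calling apply_limit(entries, *parse_limit(REQUEST_BODY)) on a list of dicts that each carry an "updated" timestamp would return only the entries updated on or after 2004-09-01T12:00:00Z, oldest first.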
Anyway, something like this could address client-specified limits. Is
there a way for a server to legally return a subset of what was
requested? For example, if the list contains 100,000 items, and the
client requests a max-count of 100,000, can the server refuse to do
that?
Presumably there are cases where an Atom client wants to build a
complete list, limits be damned. If the server returns a subset, the
client would need to make multiple requests in order to form a full
model of the server's state. How does the client know when it's got
everything? Do we need a method to query for the total number of items?
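One possible answer, sketched in Python under loud assumptions: the server caps every response and reports a total item count alongside each page, and the client keeps asking for items sorted after the last one it saw until it holds that many. The names SERVER_CAP, fetch_page, and fetch_all are hypothetical, the total field is not part of any proposal in this thread, and fetch_page is a local stand-in for a real collection request.

```python
SERVER_CAP = 3  # hypothetical ceiling the server enforces per response

# Ten fake entries standing in for a collection on the server.
COLLECTION = [
    {"id": f"post-{i}", "updated": f"2004-09-{i:02d}T12:00:00Z"}
    for i in range(1, 11)
]


def fetch_page(after, max_count):
    """Stand-in for one collection request; returns (items, total)."""
    count = min(max_count, SERVER_CAP)  # server may return fewer than asked
    ordered = sorted(COLLECTION, key=lambda e: (e["updated"], e["id"]))
    if after is not None:
        # Exclusive cursor on (updated, id) so ties on the date are not
        # skipped or repeated across pages.
        ordered = [e for e in ordered if (e["updated"], e["id"]) > after]
    return ordered[:count], len(COLLECTION)


def fetch_all(max_count=100_000):
    """Repeat requests until the reported total has been collected."""
    seen, after, requests = [], None, 0
    while True:
        items, total = fetch_page(after, max_count)
        requests += 1
        seen.extend(items)
        # Stop on an empty page too, in case the total shrank mid-way.
        if not items or len(seen) >= total:
            return seen, requests
        last = items[-1]
        after = (last["updated"], last["id"])
```

A total count is only a snapshot: if entries are added or removed between requests, the client's stopping test can be off, which is why the sketch also stops on an empty page.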
Ezra
[1] http://greenbytes.de/tech/webdav/draft-reschke-webdav-search-07.html
[2] http://www.imc.org/atom-syntax/mail-archive/msg00763.html