RE: Adding search constraints to collection listing
Daniel Jalkut wrote:
> While thinking about some of the existing problems with the XMLRPC
> based MetaWeblog API, it occurred to me that it would benefit from a
> "search" mechanism. Its "getRecentPosts" interface requires that
> clients searching for posts of a specific nature fetch all billion-
> zillion posts and then do a client-side evaluation to find the ones
> they're interested in.
>
> It got me thinking about how AtomPub basically suffers from the same
> problem, unless a server supports a means of winnowing down the
> results of a collection listing.
If the AtomPub collection feed contains just a minimal summary of each
entry, then you can fit a lot of entries into a very small space. (My
AtomPub server implementation averaged ~350 bytes per entry the last time I
checked.) That's about 3000 entries per megabyte. If the user creates 10
entries per day on average, then you get about a year's worth of posts per
megabyte. In most situations it doesn't take that long to transfer a few
years of posts to the client. Since collection feeds are sorted
reverse-chronologically, and recently-changed entries are usually more
immediately needed than older ones, the initial load of the cache can be
done in the background without affecting the perceived performance of the
client too much. As long as the client does a good job of caching, it
will have good overall performance after the initial delay in fetching all
the entry metadata.
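For anyone who wants to plug in their own numbers, here is a rough sketch of
the estimate above. The 350-byte and 10-per-day figures are the ones quoted in
this message; the decimal megabyte and the function names are assumptions made
only for illustration.

```python
# Back-of-envelope estimate of how much collection-feed history fits in 1 MB.
BYTES_PER_ENTRY = 350            # average summary size quoted above
ENTRIES_PER_DAY = 10             # posting rate quoted above
BYTES_PER_MEGABYTE = 1_000_000   # assumption: decimal megabyte


def entries_per_megabyte(bytes_per_entry=BYTES_PER_ENTRY):
    """Whole entry summaries that fit in one megabyte."""
    return BYTES_PER_MEGABYTE // bytes_per_entry


def days_per_megabyte(bytes_per_entry=BYTES_PER_ENTRY,
                      entries_per_day=ENTRIES_PER_DAY):
    """Days of posting history covered by one megabyte of summaries."""
    return entries_per_megabyte(bytes_per_entry) / entries_per_day


if __name__ == "__main__":
    print(entries_per_megabyte())        # a little under 3000 entries
    print(round(days_per_megabyte()))    # most of a year of posts
```

Doubling the entry size or the posting rate halves the history per megabyte,
which still leaves a few years of posts within a few megabytes.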
No matter how you mix up the raw numbers above, I think it is safe to say
that for most blogging scenarios, for most users, we can get by with the
current minimal AtomPub interface, as long as the client caches entries from
the collection feed. I'm not saying that a search interface isn't useful,
but we can do a lot without one.
> Would a natural way to solve this problem be to support keyed search
> terms in the URL?
>
> http://myblog.com/app/posts?status=draft&author=Daniel%20Jalkut
>
> This would give clients and servers an easy way to expose only a
> subset of collection data, which would probably be a big win for
> bandwidth and speed.
>
> It's also possible I'm stating the obvious. I just wanted to
> get this idea out here and hopefully it's not a repeat or
> redundant to some other discussion that's already occurred.
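As a minimal sketch of the client side of that idea: the URL quoted above can
be built from a collection URI plus keyed constraints. The helper name
collection_query_url is hypothetical, and the status and author parameters are
only the ones from the example URL; nothing in AtomPub defines them.

```python
from urllib.parse import quote, urlencode


def collection_query_url(collection_url, **constraints):
    """Append keyed search constraints to a collection URI.

    Hypothetical helper: the parameter names are whatever the server
    chooses to support; AtomPub itself does not specify any.
    """
    # quote_via=quote encodes spaces as %20 rather than '+'.
    query = urlencode(constraints, quote_via=quote)
    return f"{collection_url}?{query}" if query else collection_url


if __name__ == "__main__":
    print(collection_query_url("http://myblog.com/app/posts",
                               status="draft", author="Daniel Jalkut"))
```

A server that does not recognize the parameters would most likely ignore them
and return the full collection, so a client would still need to filter
locally.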
You could start by looking at Blogger's API
(http://code.google.com/apis/blogger/developers_guide_protocol.html), which
is based on GData. GData's search syntax is tuned for some particular use
cases (it is optimized for neat-looking category queries and a fixed set of
indexed properties) but it has most of the functionality that a blogging
client would probably like to have. If a few AtomPub servers implemented the
GData interface, it could become something of a standard. Or, somebody could
derive URI templates with well-specified parameters from GData's API,
and we could build a usable search interface that way. I've been meaning to
implement the GData search interface as-is. I don't like some aspects of the
syntax but I don't think any alternative is likely to be overwhelmingly
superior.
Regards,
Brian