
RE: -1 on the features draft



James M Snell wrote:
> > As a result, I think there is no need to have features that 
> > distinguish between plain text and XHTML text; every implementation 
> > should support both.
> 
> I disagree and I know my users would as well, especially if 
> it means that table they spent so much time putting together 
> with formatting and image includes and a youtube embed, etc 
> suddenly came out with all markup removed.

> You, Tim and Rob have all alluded to this idea that impls 
> "should" be following some minimum set of behaviors.  
> Unfortunately, neither of Atom specs back any of that up.

First, let's talk specifically about using AtomPub for weblog entries.
Somebody could write a separate RFC or a whole book, "Atom Weblog
Publishing Guide." I suspect it would have something like the
following in it:

* Drafts: It is unacceptable not to offer the ability to preview an
entry before publishing it. The only way the client can currently expect
to provide realistic preview is by submitting a draft. So, this feature
is basically a requirement.

* Titles: I don't see how this is even an argument. Take any weblog, and
look at its web page representation. There is a plain-text version of
the title in the <html:title>. Now, look at it in Google Reader; it is
plain text there too. If your title doesn't make sense in plain text
then you are making a mistake as an author. If your client software
allows somebody to put a YouTube video into the title then you need to
find a new UI designer. As a result, the server should assume that it
can strip out as much markup as it wants from the title if it has a good
reason to do so, without worrying if it is going to turn the title into
nonsense. But, the more markup it supports, the better. By the
robustness principle, the server should never reject a title it can
parse, and everybody can parse XHTML and plain text without question, so
those types should never be rejected.

* Summaries: The server should be able to tell the client "don't waste
any effort generating a summary, because I will never use it." But, if
the client sends a summary anyway, the server should use it. The server
should definitely never reject an entry for containing a
client-generated summary. And, the client should never expect the server
to auto-generate a summary, so if it wants one, it should generate one
itself. Similarly to titles, the client's UI shouldn't encourage the
user to insert a 5,000 row table into a summary. Should the server
accept a 5,000 row table in a summary? Probably not. But the software
should accept some basic markup like hyperlinks and <em> tags in
summaries or nobody will want to use it.

* Rights: The server should be able to tell the client "don't waste your
time generating a rights statement, I have my own policy I'm going to
use instead." But, the server should never require a rights statement.
Whether or not the server accepts, edits, or rejects a rights statement
is a legal question, not a technological question. IMO, it would be
smart for clients to just omit a user interface for per-entry rights
statements.

* Content: Users will learn the server's markup capabilities from the
documentation and from previewing. If the user wants to insert a YouTube
video, a table, or an image, then he is going to find some way to do it,
even if it means dumping his software for something else. If your
software doesn't allow as much markup as WordPress allows, then it will
fail.

No client is going to let the user insert a YouTube embed, an image, or
a table into a weblog title. Weblog authors already know that their
titles have to be presentable in plain text: That is how they are
presented in <html:title> tags and in most feed readers. If an entry
title doesn't make sense as plain text, then that is the author's fault.
If the client encourages the user to put nonsense into the title, that
is the software designer's fault.
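
To make the title point concrete, here is a rough sketch of how a server
might reduce a title to plain text when it has a good reason to drop the
markup. The function name and the (type, payload) calling convention are
made up for illustration; nothing here comes from either Atom spec, and
it only covers the two types I claimed everybody can parse.

```python
import xml.etree.ElementTree as ET


def title_as_plain_text(title_type, payload):
    """Return a plain-text rendering of an Atom title.

    Hypothetical helper: `title_type` is the value of the type attribute
    ("text" or "xhtml") and `payload` is the title's content as a string.
    For "xhtml" the payload is assumed to be the wrapping XHTML <div>.
    """
    if title_type == "text":
        # Plain text is accepted as-is; only whitespace is normalized.
        return " ".join(payload.split())
    if title_type == "xhtml":
        # Parse the XHTML and keep only the character data, dropping tags.
        div = ET.fromstring(payload)
        return " ".join("".join(div.itertext()).split())
    # Other types (e.g. "html") are outside the scope of this sketch.
    raise ValueError("unsupported title type: %r" % (title_type,))


xhtml_title = '<div xmlns="http://www.w3.org/1999/xhtml">Less is <em>more</em></div>'
print(title_as_plain_text("xhtml", xhtml_title))
```

A title that survives this kind of reduction and still makes sense is
exactly what I mean by a title that works as plain text.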

* ID: The client *has* to generate an ID, but it is likely that the
server is going to override it. Clients must be coded to handle that. A
server probably *should* override it to guarantee uniqueness. 

* Foreign markup and links: The client should not expect any foreign
markup to survive the publishing process, because servers are likely to
strip out foreign markup for privacy reasons. But, I think the spec
already says that a server cannot flat-out reject an entry because it
contains foreign markup. How clients and servers preserve particular
kinds of foreign markup and links probably needs to be specified on a
case-by-case basis, and is out of scope here.

* Dates: The client should send reasonable dates and the client should
expect those dates to be overridden by the server. Servers probably need
to do that for many reasons. If the server provides some way to
forward-date or back-date entries, then it should indicate that as a
feature, but a client shouldn't assume that functionality is there
without the explicit indication.

* Authors and Contributors: The server should probably not reject these
but it will probably have some policies in place that may cause them to
be totally overridden. Some hinting would be nice here.

* Support for HTML: Let it be a quality-of-implementation issue. It is
recommended that servers support receiving HTML everywhere and clients
should send XHTML unless they think the server supports HTML. If a
client does not send HTML in an entry then the server should not
transform it into HTML when sending it back to the client; that is, the
server should store HTML if and only if it was originally given HTML.
(Disclaimer: I greatly prefer XHTML over HTML.)

* Slug: A weblog server should support it but may ignore it. A hint
would be helpful so the client can enable/disable the UI for it, but
client software should include some heuristics to determine whether
slugs are supported.
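
As a rough illustration of the ID, dates, and summary points above, a
server's handling of a newly posted entry might look something like the
following. This is only a sketch under my own assumptions: the entry is
modeled as a plain dict, and normalize_entry and allow_backdating are
made-up names, not anything defined by the Atom specs or the features
draft.

```python
import uuid
from datetime import datetime, timezone


def normalize_entry(client_entry, now=None, allow_backdating=False):
    """Return the entry as a hypothetical server would store it.

    `client_entry` is a dict with keys such as "id", "title", "summary",
    and "published". `allow_backdating` stands in for a server that
    advertises forward-dating/back-dating as a feature.
    """
    now = now or datetime.now(timezone.utc)
    stored = dict(client_entry)  # keep title, summary, etc. as sent

    # Override the client's ID so the server can guarantee uniqueness.
    stored["id"] = "urn:uuid:" + str(uuid.uuid4())

    # Override the publication date unless back-dating is supported
    # and the client actually supplied a date.
    if not (allow_backdating and "published" in client_entry):
        stored["published"] = now
    stored["updated"] = now
    return stored


posted = {
    "id": "urn:example:client-generated",
    "title": "Hello",
    "summary": "A client-written summary",
    "published": datetime(2000, 1, 1, tzinfo=timezone.utc),
}
stored = normalize_entry(posted)
print(stored["id"], stored["published"])
```

The client-side consequence is that, after posting, the client should
replace its local ID and dates with whatever the server sends back
rather than assuming its own values were kept.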

- Brian


> I've got no problem with the notion that any impl that 
> supports HTML should also support XHTML (and vice versa) but 
> it's silly to think that text and xhtml support are somehow 
> equivalent.
> 
> This particular set of features likely could be simplified, 
> but not like this.
> 
> > However, I think it is unreasonable to require implementations to 
> > parse HTML. I think content producers (clients and servers) should 
> > observe Postel's law and send XHTML; the receiver can always easily 
> > convert it to HTML if needed. If the server accepts HTML at all, it 
> > should support it in all text constructs (likely stripping out
> > different types of markup for titles, summaries, rights, and
> > content).
> > 
> > 
> - James
> 
>