
Re: draft-nottingham-atom-format-02




   This specification describes version 0.3 of the Atom, an XML-based
   Web content and metadata syndication format.

I take it that the SSFF and putting the content items and the feed in separate HTTP resources are considered rejected ideas. Is this correct?


1.2 Conformance

[[ talk about atom documents and atom consumers, and how requirements
are placed on them ]]

(I still suggest requiring that Atom consumers use an XML processor as defined in XML 1.0.)


   This specification uses XML Namespaces [W3C.REC-xml-names-19990114]
   to uniquely identify XML elements and attribute names. It uses the
   following namespace prefixes for the indicated namespace URIs;

"atom": http://purl.org/atom/ns#

The spec talks mainly about elements which are in the Atom namespace. Is the use of the prefix in the prose necessary? The use of a particular prefix creates an illusion that the prefix is significant (even though the prose states it isn't).


Atom documents MAY have a Document Type Declaration.

What for?


In order to make it easy to consume Atom documents, I think the documents should be parseable using a non-validating XML processor. If a validating XML processor isn't required, Atom producers can't be certain the DTD will be processed by the consumers, so relying on DTD-based infoset augmentation is pointless and possibly even harmful. Therefore, having the doctype declaration in an Atom document would be rather pointless.

For validation, using an external Relax NG schema that is designated by the validator is much more useful than validating a document against a DTD designated by the document producer.

I suggest: "Atom documents SHOULD NOT have a Document Type Declaration. Implementors of Atom producers should be aware that Atom consumers which use non-validating XML processors are not required (as per [XML 1.0]) to process the DTD."

[[Entities]]

I suggest just saying a firm "No" to entities.


Atom documents are expected to be created by software, so the usual rationale for using entities (the ability to compensate for inadequate text input methods) is moot. Also, entities imply DTD processing, which non-validating XML processors are not required to do.
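
To illustrate the DTD dependency, here is a minimal Python sketch. It assumes Python's xml.etree.ElementTree, which sits on the non-validating expat parser and does not fetch external DTDs; the sample title is made up. A named entity that is only declared in some DTD fails to parse, while the equivalent numeric character reference works without any DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: "nbsp" is declared in the XHTML DTDs, not in this document.
DOC = '<title xmlns="http://purl.org/atom/ns#">Fish&nbsp;and chips</title>'


def parses(doc: str) -> bool:
    """Return True if the document parses without any DTD being read."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


# The named entity is undefined as far as this parser can tell.
with_entity = parses(DOC)

# A numeric character reference needs no declaration at all.
with_char_ref = parses(DOC.replace("&nbsp;", "&#160;"))

print(with_entity, with_char_ref)
```

A producer that writes the character itself or a numeric character reference never depends on what the consumer does with the DTD.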

Atom documents MAY contain Comments wherever they are legal in XML.

In order to discourage bogus use of comments, I suggest this wording: "[XML 1.0] allows comments in XML documents but does not require XML processors to make comments available to the application. Therefore, Atom producers MAY include comments where allowed by [XML 1.0] but MUST NOT use comments to transfer data intended for processing by Atom consumers."
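
As an illustration of why comments are an unreliable data channel, here is a small Python sketch using xml.etree.ElementTree, whose default tree builder discards comments. The entry and the comment text are hypothetical.

```python
import xml.etree.ElementTree as ET

DOC = (
    '<entry xmlns="http://purl.org/atom/ns#">'
    "<!-- hypothetical out-of-band data: priority=high -->"
    "<title>Example</title>"
    "</entry>"
)

root = ET.fromstring(DOC)

# Only element children survive; the comment never reaches the application.
children = [child.tag for child in root]
serialized = ET.tostring(root, encoding="unicode")

print(children)
```

A consumer built on such a processor has no way to see the "priority" value, which is exactly the situation the suggested wording is meant to warn producers about.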


All elements and attributes in an Atom document MUST be
namespace-qualified. Note that this requirement does not preclude the
use of a default namespace.

The first syntax example contains attributes which aren't in a namespace. That's why I suggested in an earlier message that the spec include something like this: "Attributes which are in no namespace and attributes which are in the Atom namespace MUST be treated equivalently (according to the meaning given in this specification) when the attributes appear on an element which is in the Atom namespace."
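
A sketch of how a consumer might implement the suggested equivalence, in Python with xml.etree.ElementTree. The helper name atom_attr is my own invention, and the two sample elements are made up. Note that an unprefixed attribute is in no namespace even when a default namespace is declared.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://purl.org/atom/ns#"


def atom_attr(element, name, default=None):
    """Look up an attribute on an element.

    On elements in the Atom namespace, the no-namespace form and the
    Atom-namespace form of the attribute are treated as equivalent.
    """
    if name in element.attrib:
        return element.attrib[name]
    if element.tag.startswith("{" + ATOM_NS + "}"):
        return element.get("{" + ATOM_NS + "}" + name, default)
    return default


# Unprefixed attribute: no namespace, despite the default namespace declaration.
plain = ET.fromstring('<content xmlns="http://purl.org/atom/ns#" mode="escaped"/>')

# Prefixed attribute: in the Atom namespace.
qualified = ET.fromstring(
    '<a:content xmlns:a="http://purl.org/atom/ns#" a:mode="escaped"/>'
)

print(plain.attrib, qualified.attrib)
print(atom_attr(plain, "mode"), atom_attr(qualified, "mode"))
```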


"escaped": A mode attribute with the value "escaped" indicates that
the element's content is an escaped string. Processors MUST
unescape the element's content before considering it as content of
the indicated media type.

An attempt at an alternative (hopefully more precise) wording:
"escaped": A mode attribute with the value "escaped" indicates that the element's content is a string. Note: The value "escaped" is due to [XML 1.0] requiring markup-significant characters to be escaped in character data. The XML processor will do the required unescaping on behalf of an Atom consumer application. The mode "escaped" is only suitable for transferring media types that are defined in terms of characters and can be represented as a sequence of characters allowed in [XML 1.0].

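To show that the unescaping is done by the XML processor and not by the Atom consumer, here is a minimal Python sketch using xml.etree.ElementTree. The sample content is made up.

```python
import xml.etree.ElementTree as ET

DOC = (
    '<content xmlns="http://purl.org/atom/ns#" type="text/html" mode="escaped">'
    "&lt;p&gt;Fish &amp;amp; chips&lt;/p&gt;"
    "</content>"
)

content = ET.fromstring(DOC)

# The parser has already resolved the XML-level escapes; what the application
# sees is the text/html string itself, with no further unescaping to do.
html = content.text

print(html)
```

The application receives the string `<p>Fish &amp; chips</p>`. The remaining `&amp;` belongs to the HTML payload, not to the XML transfer syntax.
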

"base64": A mode attribute with the value "base64" indicates that
the element's content is base64-encoded [RFC2045]. Processors MUST
decode the element's content before considering it as content of
the the indicated media type.

"base64": A mode attribute with the value "base64" indicates that the element's logical content is a byte sequence and the byte sequence has been encoded using the base64 encoding [RFC2045] for transfer as XML character data. Atom consumers MUST decode the element's content before considering it as content of the indicated media type.

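A sketch of the decoding step in Python, assuming the consumer uses xml.etree.ElementTree and the standard base64 module. The helper name and the sample payload are hypothetical.

```python
import base64
import xml.etree.ElementTree as ET

# Hypothetical payload: the eight-byte PNG file signature.
PAYLOAD = b"\x89PNG\r\n\x1a\n"

doc = (
    '<content xmlns="http://purl.org/atom/ns#" type="image/png" mode="base64">\n  '
    + base64.b64encode(PAYLOAD).decode("ascii")
    + "\n</content>"
)


def decode_base64_content(element) -> bytes:
    """Recover the byte sequence from a mode="base64" content element."""
    # Drop line breaks and indentation before decoding.
    encoded = "".join((element.text or "").split())
    return base64.b64decode(encoded)


decoded = decode_base64_content(ET.fromstring(doc))

print(decoded == PAYLOAD)
```

The result is a byte sequence, not a string, so any character encoding question is left to the indicated media type.
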


3.2 Person Constructs

3.3 Date Constructs
(etc.)

I suggest specifying that the Atom consumer must first normalize whitespace in the element content (remove leading and trailing whitespace and replace each run of whitespace in the middle with a single space) before trying to figure out whether the content matches the format of the data type (URL, email address, etc.).
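
The normalization I have in mind could be sketched like this in Python. The function name is hypothetical, and I am assuming that "whitespace" means the four whitespace characters of XML 1.0 (space, tab, carriage return, line feed).

```python
import re

# The whitespace characters defined by XML 1.0: space, tab, CR, LF.
_XML_WS = re.compile(r"[ \t\r\n]+")


def normalize_ws(text: str) -> str:
    """Collapse each run of XML whitespace to one space and trim both ends."""
    return _XML_WS.sub(" ", text).strip(" ")


# Pretty-printed element content as it might appear in a feed.
url = normalize_ws("\n    http://example.org/\n  ")
name = normalize_ws("John \t\n Doe")

print(repr(url), repr(name))
```

Only after this step would the consumer check whether the value is a well-formed URL, email address or date.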

4.3 "atom:title" Element

4.13.1 "atom:title" Element

(Need to resolve the markup in titles issue. I suggest the compromise I described in
http://www.imc.org/atom-syntax/mail-archive/msg01154.html)


The content of an atom:modified element SHOULD have a time zone whose
value MUST be "UTC".

Shouldn't that be "Z"?


atom:entry elements MUST contain an
   atom:issued element, but MUST NOT contain more than one.

Why is the issued date required?


If @type="multipart/alternative", @mode MUST NOT be specified, and
content element MUST contain 1 or more content elements. These
content elements MUST NOT specify @type="multipart/alternative" (i.e.
only one level of nesting is allowed).

I still think using "multipart/alternative" to denote anything other than what it means in the MIME context is a very bad idea. I suggest allowing multiple content children of an entry, with their order indicating the order of preference.
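
A sketch of how a consumer could handle the alternative I am suggesting, in Python with xml.etree.ElementTree. The entry below follows my proposal, not the current draft, and the function name and the set of supported types are made up.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://purl.org/atom/ns#"

# Hypothetical entry: several content children, most preferred first.
ENTRY = """<entry xmlns="http://purl.org/atom/ns#">
  <content type="application/xhtml+xml">rich version</content>
  <content type="text/html" mode="escaped">&lt;p&gt;HTML version&lt;/p&gt;</content>
  <content type="text/plain">Plain version</content>
</entry>"""


def pick_content(entry, supported):
    """Return the first content child whose type the consumer supports."""
    for content in entry.findall("{" + ATOM_NS + "}content"):
        if content.get("type") in supported:
            return content
    return None


entry = ET.fromstring(ENTRY)
chosen = pick_content(entry, {"text/plain", "text/html"})
missing = pick_content(entry, {"image/png"})

print(chosen.get("type"), missing)
```

Document order carries the preference, so no wrapper element and no special media type are needed.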


--
Henri Sivonen
hsivonen@xxxxxx
http://iki.fi/hsivonen/