Re: draft-nottingham-atom-format-02
This specification describes version 0.3 of the Atom, an XML-based
Web content and metadata syndication format.
I take it that the SSFF and putting the content items and the feed in
separate HTTP resources are considered rejected ideas. Is this correct?
1.2 Conformance
[[ talk about atom documents and atom consumers, and how requirements
are placed on them ]]
(I still suggest requiring that Atom consumers use an XML processor as
defined in XML 1.0.)
This specification uses XML Namespaces [W3C.REC-xml-names-19990114]
to uniquely identify XML elements and attribute names. It uses the
following namespace prefixes for the indicated namespace URIs;
"atom": http://purl.org/atom/ns#
The spec talks mainly about elements which are in the Atom namespace.
Is the use of the prefix in the prose necessary? The use of a
particular prefix creates an illusion that the prefix is significant
(even though the prose states it isn't).
Atom documents MAY have a Document Type Declaration.
What for?
In order to make it easy to consume Atom documents, I think the
documents should be parseable using a non-validating XML processor. If
a validating XML processor isn't required, Atom producers can't be
certain the DTD will be processed by the consumers, so relying on
DTD-based infoset augmentation is pointless and possibly even harmful.
Therefore, having the doctype declaration in an Atom document would be
rather pointless.
For validation, using an external Relax NG schema that is designated by
the validator is much more useful than validating a document against a
DTD designated by the document producer.
I suggest: "Atom documents SHOULD NOT have a Document Type Declaration.
Implementors of Atom producers should be aware that Atom consumers
which use non-validating XML processors are not required (as per [XML
1.0]) to process the DTD."
[[Entities]]
I suggest just saying a firm "No" to entities.
Atom documents are expected to be created by software, so the usual
rationale for using entities (the ability to compensate for inadequate
text input methods) is moot. Also, entities imply DTD processing, which
non-validating XML processors are not required to do.
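
The point about entities can be illustrated with a minimal sketch,
assuming an Atom consumer built on Python's xml.etree.ElementTree (which
does not fetch external DTDs). The DTD URL and the feed below are
hypothetical and only serve the illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed: it points at an external DTD and uses a named entity
# that only that DTD could define. The DTD URL is made up.
DOC = (
    '<!DOCTYPE feed SYSTEM "http://example.org/hypothetical-atom.dtd">\n'
    '<feed xmlns="http://purl.org/atom/ns#">'
    '<title>Caf&eacute; news</title>'
    '</feed>'
)

def try_parse(text):
    """Return the parsed root element, or the parse error as a string."""
    try:
        return ET.fromstring(text)
    except ET.ParseError as err:
        return str(err)

# The external DTD is never read, so the entity stays undefined and the
# consumer cannot use the document.
print(try_parse(DOC))

# A numeric character reference needs no DTD and always works.
print(try_parse('<title>Caf&#233; news</title>').text)
```

A producer that writes the character directly or uses a numeric
character reference avoids the problem entirely.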
Atom documents MAY contain Comments wherever they are legal in XML.
In order to discourage bogus use of comments, I suggest this wording:
"[XML 1.0] allows comments in XML documents but does not require XML
processors to make comments available to the application. Therefore,
Atom producers MAY include comments where allowed by [XML 1.0] but MUST
NOT use comments to transfer data intended for processing by Atom
consumers."
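
As a rough illustration of why data in comments is unreliable, here is a
sketch assuming a consumer that uses Python's xml.etree.ElementTree with
its default settings. The "rating" comment is a made-up example of data
smuggled in a comment.

```python
import xml.etree.ElementTree as ET

# Hypothetical entry that tries to carry data in a comment.
DOC = (
    '<entry xmlns="http://purl.org/atom/ns#">'
    '<!-- rating: 5 -->'
    '<title>Example</title>'
    '</entry>'
)

def child_tags(text):
    """Return the tags of the root's children as the application sees them."""
    root = ET.fromstring(text)
    return [child.tag for child in root]

# With the default parser the comment is not part of the tree at all.
print(child_tags(DOC))  # ['{http://purl.org/atom/ns#}title']
```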
All elements and attributes in an Atom document MUST be
namespace-qualified. Note that this requirement does not preclude the
use of a default namespace.
The first syntax example contains attributes which aren't in a
namespace. That's why I suggested in an earlier message that the spec
include something like this: "Attributes which are in no namespace and
attributes which are in the Atom namespace MUST be treated equivalently
(according to the meaning given in this specification) when the
attributes appear on an element which is in the Atom namespace."
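
One way a consumer could implement the suggested equivalence is sketched
below. The helper name atom_attr is hypothetical, and Python's
xml.etree.ElementTree is assumed as the XML processor.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://purl.org/atom/ns#"

def atom_attr(elem, name, default=None):
    """Look up an attribute on an Atom element, treating the no-namespace
    form and the Atom-namespace form as equivalent (hypothetical helper)."""
    if not elem.tag.startswith("{%s}" % ATOM_NS):
        # The equivalence only applies to elements in the Atom namespace.
        return elem.get(name, default)
    value = elem.get(name)
    if value is None:
        value = elem.get("{%s}%s" % (ATOM_NS, name))
    return default if value is None else value

plain = ET.fromstring(
    '<content xmlns="http://purl.org/atom/ns#" mode="escaped"/>')
qualified = ET.fromstring(
    '<atom:content xmlns:atom="http://purl.org/atom/ns#" atom:mode="escaped"/>')

print(atom_attr(plain, "mode"), atom_attr(qualified, "mode"))
```

Which form wins when both are present is not addressed by the suggested
wording; the sketch arbitrarily prefers the no-namespace attribute.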
"escaped": A mode attribute with the value "escaped" indicates that
the element's content is an escaped string. Processors MUST
unescape the element's content before considering it as content of
the indicated media type.
An attempt at an alternative (hopefully more precise) wording:
"escaped": A mode attribute with the value "escaped" indicates that the
element's content is a string. Note: The value "escaped" is due to [XML
1.0] requiring markup-significant characters to be escaped in character
data. The XML processor will do the required unescaping on behalf of an
Atom consumer application. The mode "escaped" is only suitable for
transferring media types that are defined in terms of characters and
can be represented as a sequence of characters allowed in [XML 1.0].
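
To show that the XML processor already does the unescaping, here is a
small sketch assuming Python's xml.etree.ElementTree; the content element
and its HTML payload are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical content element carrying an HTML fragment as character data.
DOC = (
    '<content xmlns="http://purl.org/atom/ns#" type="text/html" mode="escaped">'
    '&lt;p&gt;Fish &amp;amp; chips&lt;/p&gt;'
    '</content>'
)

def escaped_content(text):
    """Return the element's character data as delivered by the XML processor."""
    return ET.fromstring(text).text

# The application receives the HTML string; no extra unescaping step is needed.
print(escaped_content(DOC))  # <p>Fish &amp; chips</p>
```

Unescaping a second time in the application would turn the remaining
"&amp;" into a bare "&" and corrupt the HTML.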
"base64": A mode attribute with the value "base64" indicates that
the element's content is base64-encoded [RFC2045]. Processors MUST
decode the element's content before considering it as content of
the indicated media type.
"base64": A mode attribute with the value "base64" indicates that the
element's logical content is a byte sequence and the byte sequence has
been encoded using the base64 encoding [RFC2045] for transfer as XML
character data. Atom consumers MUST decode the element's content before
considering it as content of the indicated media type.
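
The base64 case, by contrast, does need an explicit decoding step in the
consumer. A minimal sketch, assuming Python's standard base64 module and
xml.etree.ElementTree; the payload is just the eight-byte PNG signature.

```python
import base64
import xml.etree.ElementTree as ET

# Hypothetical content element; the payload is the PNG file signature.
DOC = (
    '<content xmlns="http://purl.org/atom/ns#" type="image/png" mode="base64">\n'
    '  iVBORw0KGgo=\n'
    '</content>'
)

def base64_content(text):
    """Decode the element's character data into the byte sequence it encodes."""
    chars = ET.fromstring(text).text or ""
    # Whitespace around or inside the encoded data is not part of it.
    return base64.b64decode("".join(chars.split()))

print(base64_content(DOC))  # b'\x89PNG\r\n\x1a\n'
```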
3.2 Person Constructs
3.3 Date Constructs
(etc.)
I suggest specifying that the Atom consumer must normalize whitespace
(remove leading and trailing whitespace and replace sequences of
whitespace in the middle using single spaces) in the element content
first, before trying to figure out whether the content matches the data
type (URL, email address, etc.) format.
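
The normalization described above could look like the following sketch.
The function name is hypothetical, and "whitespace" is assumed to mean
the XML 1.0 white space characters (space, tab, CR, LF).

```python
import re

# XML 1.0 white space: space, tab, carriage return, line feed.
_XML_WS = re.compile(r"[ \t\r\n]+")

def normalize_ws(value):
    """Collapse whitespace runs to one space and trim both ends."""
    return _XML_WS.sub(" ", value).strip(" ")

print(normalize_ws("\n   http://example.org/feed \t"))  # http://example.org/feed
print(normalize_ws("Jane   \n Doe"))                     # Jane Doe
```

The consumer would then check the normalized string against the data
type format (URL, email address, etc.).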
4.3 "atom:title" Element
4.13.1 "atom:title" Element
(Need to resolve the markup in titles issue. I suggest the compromise I
described in
http://www.imc.org/atom-syntax/mail-archive/msg01154.html)
The content of an atom:modified element SHOULD have a time zone whose
value MUST be "UTC".
Shouldn't that be "Z"?
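
For illustration, a check for the "Z" designator might be sketched as
follows. The date format (seconds precision, no fractional part) and the
function name are assumptions, not something the draft text quoted here
spells out.

```python
from datetime import datetime

def has_z_designator(value):
    """Return True if value is a date-time ending in the literal 'Z'
    (assumed format: YYYY-MM-DDThh:mm:ssZ)."""
    try:
        datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    except ValueError:
        return False
    return True

print(has_z_designator("2003-12-13T18:30:02Z"))    # True
print(has_z_designator("2003-12-13T18:30:02UTC"))  # False
```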
atom:entry elements MUST contain an
atom:issued element, but MUST NOT contain more than one.
Why is the issued date required?
If @type="multipart/alternative", @mode MUST NOT be specified, and
content element MUST contain 1 or more content elements. These
content elements MUST NOT specify @type="multipart/alternative" (i.e.
only one level of nesting is allowed).
I still think using the "multipart/alternative" to denote anything
other than what it means in the MIME context is a very bad idea. I
suggest allowing multiple content children of an entry with the order
being an indication of the order of preference.
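
A consumer-side sketch of the suggested alternative: several content
children directly under the entry, with document order taken as the
producer's order of preference. The function name and the sample entry
are hypothetical; Python's xml.etree.ElementTree is assumed.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://purl.org/atom/ns#"

# Hypothetical entry with two alternatives, most preferred first.
DOC = (
    '<entry xmlns="http://purl.org/atom/ns#">'
    '<content type="text/html" mode="escaped">&lt;p&gt;Hello&lt;/p&gt;</content>'
    '<content type="text/plain" mode="escaped">Hello</content>'
    '</entry>'
)

def pick_content(entry, supported_types):
    """Return the first content child whose type the consumer supports,
    or None if there is no such child."""
    # findall returns children in document order, i.e. order of preference.
    for content in entry.findall("{%s}content" % ATOM_NS):
        if content.get("type") in supported_types:
            return content
    return None

entry = ET.fromstring(DOC)
print(pick_content(entry, {"text/plain"}).text)               # Hello
print(pick_content(entry, {"text/html", "text/plain"}).text)  # <p>Hello</p>
```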
--
Henri Sivonen
hsivonen@xxxxxx
http://iki.fi/hsivonen/