Re: xml:lang attribute
On Thu, 7 Aug 2003 14:28:27 -0400, Maciej Ceglowski
<mceglows@xxxxxxxxxxxxxx> wrote:
> Right now, there are 'title' elements at the feed and entry level, for
> which the language attribute is undefined. If the feed itself had an
> xml:lang attribute, it could cascade down to child elements.
As far as I have understood, the xml:lang attribute on the feed element
does indeed cascade down.
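To make the cascading concrete, here is a minimal sketch of how a consumer might resolve the language in effect for each title. The feed below is made up for illustration, and effective_lang is a hypothetical helper, not something taken from the spec:

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical feed; element names are illustrative only.
FEED = """<feed xml:lang="en">
  <title>Example feed</title>
  <entry>
    <title>First entry</title>
  </entry>
  <entry xml:lang="no">
    <title>Andre innlegg</title>
  </entry>
</feed>"""


def effective_lang(element, parents):
    """Return the nearest xml:lang on element or its ancestors, else None."""
    node = element
    while node is not None:
        lang = node.get(XML_LANG)
        if lang is not None:
            return lang
        node = parents.get(node)  # the root has no parent, so this ends the loop
    return None


root = ET.fromstring(FEED)
# ElementTree has no parent pointers, so build a child -> parent map once.
parents = {child: parent for parent in root.iter() for child in parent}

langs = [effective_lang(title, parents) for title in root.iter("title")]
print(langs)  # ['en', 'en', 'no']
```

The feed-level title and the first entry's title pick up "en" from the feed element, while the second entry overrides it with its own xml:lang.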
> I believe the consensus on the Wiki was to allow for xml:lang values of
> 'unknown' to grandfather in tools that don't export language metadata.
This is what the XML specification has to say about xml:lang [1]:
A special attribute named xml:lang may be inserted in documents to
specify the language used in the contents and attribute values of
any element in an XML document. In valid documents, this attribute,
like any other, must be declared if it is used. The values of the
attribute are language identifiers as defined by [IETF RFC 1766], Tags
for the Identification of Languages, or its successor on the
IETF Standards Track.
RFC 1766 [2] allows one of the following:
1. Two-letter language codes, as defined in ISO 639, optionally followed by
a hyphen and dialect/variant information. Examples:
no-nynorsk, en-cockney
2. For languages that aren't defined in ISO 639, but have an IANA-registered
language code, one can use the prefix i-. Example:
i-sami-no
3. For languages that do not fit 1) or 2), one can use the private prefix
x-. Examples:
x-klingon, x-quenya, x-minbari
After explaining these three valid uses, RFC 1766 explicitly says:
Other values cannot be assigned except by updating this standard.
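As a rough sketch of what these three forms mean for a tool that wants to sanity-check xml:lang values: the function below only looks at the shape of the tag (1 to 8 letters per subtag, as in the RFC 1766 grammar) and does not check whether the code actually exists in ISO 639 or the IANA registry. The name classify_lang_tag and its return labels are my own invention:

```python
import re

# RFC 1766 grammar: a primary tag and optional subtags, each 1-8 ASCII letters.
TAG_RE = re.compile(r"[A-Za-z]{1,8}(-[A-Za-z]{1,8})*")


def classify_lang_tag(tag):
    """Classify a tag by shape only; no registry lookups are performed."""
    if not TAG_RE.fullmatch(tag):
        return "invalid"
    primary, *subtags = tag.lower().split("-")
    if len(primary) == 2:
        return "iso639"   # two-letter primary tag
    if primary == "i" and subtags:
        return "iana"     # IANA-registered tag
    if primary == "x" and subtags:
        return "private"  # private-use tag
    return "invalid"      # anything else would need an update to the standard


for tag in ("no-nynorsk", "i-sami-no", "x-klingon", "unknown", "x-unknown"):
    print(tag, classify_lang_tag(tag))
```

Note that a bare "unknown" fails this check, while "x-unknown" passes it as a private-use tag; passing the syntax check says nothing about whether the value is a sensible thing to put there.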
Before someone suggests using "x-unknown" as an attribute value for unknown
languages: that would overload the meaning of xml:lang, suggesting that we
are using a private language whose name is "unknown". If the language is
unknown, this should be addressed by omitting the xml:lang attribute from
that particular feed.
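For a tool that has no language metadata, that could look something like the sketch below. make_feed is a hypothetical helper and the element names are illustrative; the point is only that the attribute is left out rather than filled with a placeholder:

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def make_feed(title, lang=None):
    """Build a feed element, setting xml:lang only when the language is known."""
    feed = ET.Element("feed")
    if lang:
        feed.set(XML_LANG, lang)
    # Unknown language: emit no xml:lang at all, not "unknown" or "x-unknown".
    ET.SubElement(feed, "title").text = title
    return feed


known = make_feed("Example feed", "en")
unknown = make_feed("Example feed")
print(ET.tostring(known, encoding="unicode"))
print(ET.tostring(unknown, encoding="unicode"))
```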
If we then read the current informal specification [3], it says:
optional attributes of feed:
- xml:lang. SHOULD be included. MAY be overwritten on individual
entries, if the feed contains entries in more than one language.
Which is, IMHO, exactly as it should be. RFC 2119 [4] defines the use of
the word "SHOULD" as:
3. SHOULD This word, or the adjective "RECOMMENDED", mean that there
may exist valid reasons in particular circumstances to ignore a
particular item, but the full implications must be understood and
carefully weighed before choosing a different course.
References:
-----------
[1] <URL:http://www.w3.org/TR/REC-xml#sec-lang-tag>
[2] <URL:http://www.ietf.org/rfc/rfc1766.txt>
[3] <URL:http://diveintomark.org/public/2003/08/atom02spec.txt>
[4] <URL:http://www.ietf.org/rfc/rfc2119.txt>
--
Arve Bersvendsen
http://www.virtuelvis.com
http://www.bersvendsen.com