RE: PaceItemLang created. (Add xml:lang attribute to atom:entry)



Asbjorn Ulsberg wrote:
> It looks like some of us feel that 'xml:lang' should be 
> possible to use on _all_ elements in Atom. 
> Does anyone oppose that
	I don't think we should rush into this... I hadn't been aware that
we *could* restrict the use of xml:lang, but given that we can, I think we
should carefully consider being conservative in our support for xml:lang.
	Before saying that xml:lang should be supported on *every* element,
it would make sense to ask why this might be a useful thing to do.
	My guess is that xml:lang would be very commonly used on the
atom:feed and atom:entry elements. I'm not convinced, however, that there
are compelling reasons to allow (and by implication encourage) the use of
multiple languages within an entry. 
	1. Would it make sense for the atom:title element to have a
different language than the atom:content element?
	2. Would it make sense for atom:copyright to be in some language
different from the language of atom:content? 
	3. If a search engine wanted to allow searchers to constrain results
based on language (e.g. I only want stuff in English), should it return
results that have English content but French titles? What about stuff with
French titles but English content? Or, an entry where everything is in
English except for the atom:author? This could get confusing...
	4. In English, at least, we often appropriate phrases from other
languages. For instance, "esprit de corps" is a phrase that apparently
became popular in English during World War One. If an entry, whose content
was in English, had a title of "Esprit de Corps", should the title be marked
as English or as French?
	5. What does it mean if an atom:modified element is tagged with an
xml:lang? Don't Atom dates look the same in all languages? The same question
should be asked re: atom:issued, atom:created, atom:id, etc... Does xml:lang
even mean anything on all elements? If not, why support it?
	6. Is it really sensible for an atom:author:name to have a language
associated with it? I've always considered authors' names to be opaque
strings... Or, is it being suggested that "Phillip" in English should be
somehow considered similar to "Philippe" in some other language? If two
entries are identical except that their author elements have different
xml:lang values, would equivalence be asserted if the names could be
translated? Should "duplicate detection" software in readers use such logic?
(Does someone have the needed name translation/equivalence tables?)
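	To make question 3 concrete, here is a rough Python sketch of what a
language-filtering processor might have to do if xml:lang were allowed on every
element. It assumes the usual XML rule that an element without its own xml:lang
inherits the value of its nearest ancestor. The sample entry, the placeholder
namespace URI, and the helper names (effective_langs, matches_any, matches_all)
are all made up for illustration; none of this comes from any Atom draft.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical entry; the namespace URI is a placeholder, not a real Atom one.
ENTRY = """\
<entry xmlns="http://example.org/atom-draft" xml:lang="en">
  <title xml:lang="fr">Esprit de Corps</title>
  <author><name xml:lang="fr">Philippe</name></author>
  <content>Notes on team morale.</content>
</entry>
"""


def effective_langs(elem, inherited=None, out=None):
    """Map each element's local name to its effective xml:lang.

    An element uses its own xml:lang if present, otherwise the value
    inherited from its nearest ancestor.
    """
    if out is None:
        out = {}
    lang = elem.get(XML_LANG, inherited)
    local_name = elem.tag.rsplit("}", 1)[-1]  # strip the "{namespace}" prefix
    out[local_name] = lang
    for child in elem:
        effective_langs(child, lang, out)
    return out


def matches_any(langs, wanted):
    """Loose policy: some element of the entry is in the wanted language."""
    return wanted in langs.values()


def matches_all(langs, wanted):
    """Strict policy: every element of the entry is in the wanted language."""
    return all(lang == wanted for lang in langs.values())


langs = effective_langs(ET.fromstring(ENTRY))
print(langs)
print("any en:", matches_any(langs, "en"), "| all en:", matches_all(langs, "en"))
```

	Under the loose policy this entry matches both an "English only" and a
"French only" search; under the strict policy it matches neither. Which answer
is right is exactly the question a spec allowing per-element xml:lang would
leave to each processor.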
	We have here, I think, a conflict between the desires of an author
to be expressive and the need/desire of content processors, and readers, to
limit the complexity of the data streams they consume. If xml:lang can be
used on every field, then some processors will be compelled to support a
very granular model for language tagging without providing much benefit to
their users. This will increase the "cost" of solutions without actually
making them much more useful. Is the extra complexity really worth it?

		bob wyman