
Re: Resources for AtomPub parser validation

Daniel Jalkut wrote:
<content type="xhtml"><div xmlns="http://www.w3.org/1999/xhtml" xml:space="preserve">
Some test HTML with an escaped ampersand in it:
<img src="foo?bar=1&amp;baz=2"/>
What's happening is that the editor ends up showing that &amp; as just a "&". This sounds right, based on what you're saying: the xhtml content gets unescaped by the XML parser, right?

correct. &amp; is simply an XML-encoded "&", regardless of (X)HTML.

But the customer's contention is that the &amp; should remain escaped in the HTML source, because that's how he typed it, and that's how it exists in the database on his server.

it has to remain &amp; as long as it's within the context of the markup. when it's taken out of that context, it has to be unescaped. if your customer were using XML tools instead of text tools, that would happen automatically. in your editor, the interface has to display a "&", but the underlying markup must be &amp;.

I guess it boils down to whether I should be presenting data "in HTML" (re-escaped?) or "literally." Perhaps this is not a question that the Atom specification needs to or can answer :)

no, this is basic XML mechanics. a literal "&" in XML always has to be escaped, because "&" is the character that starts an entity or character reference ;-)
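
to make the round trip concrete, here is a minimal sketch in Python using only the standard library. the sample document is modeled on the snippet quoted above, and the variable names are made up for illustration; it is not a statement of how any particular editor or AtomPub server does this.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical entry content, modeled on the example quoted in this thread.
source = (
    '<content type="xhtml">'
    '<div xmlns="http://www.w3.org/1999/xhtml">'
    '<img src="foo?bar=1&amp;baz=2"/>'
    '</div></content>'
)

root = ET.fromstring(source)
img = root.find(f".//{{{XHTML_NS}}}img")

# The parser hands back the unescaped value: the "&amp;" is gone.
parsed_src = img.get("src")

# Serializing the element puts the escaping back automatically.
serialized = ET.tostring(img, encoding="unicode")

# An editor that shows "HTML source" from the parsed value has to
# re-escape it itself; a plain-text view shows parsed_src as-is.
shown_as_source = escape(parsed_src)

print(parsed_src)
print(serialized)
print(shown_as_source)
```

the point of the sketch is that the unescaped "&" only exists once the value has left the markup; anything that writes markup again, whether a serializer or an editor's source view, has to produce &amp; again.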