
Re: Proposal for an HTTP ERR method




[I am also cc'ing this to atom-syntax, so that those there who want to continue this discussion can move over here. A log of it can be found here:
http://lists.w3.org/Archives/Public/ietf-http-wg/2004AprJun/
]


On 23 Jun 2004, at 16:58, Jamie Lokier wrote:

Henry Story wrote:
When a client receives a malformed server response it CAN (SHOULD?)
notify the resource that it is broken, by sending an ERR request,

What kind of malformed server response?


Broken HTTP headers are comparatively rare and should probably get an
ERR, except perhaps for the Server header.

Thanks. One more good reason for ERR. :-)


Malformed HTML is very common.  Sending ERR in response to malformed
HTML would generate a flood of ERRs.  But -- what is malformed HTML
anyway?

Yes. Presumably HTML would not warrant an ERR. But XHTML might very well.


XML and XHTML do have well defined specifications.
I admit, that despite reading many documents and specifications,
I hadn't realised that text/xml needed to use ASCII characters only.

Neither had I, nor had most of the people on the atom mailing list. There is a HUGE thread there going on and on about that, which led us to two proposals to solve this issue, of which this is the more generally applicable one.


To your example:

GET /index.xml HTTP/1.x
Content-encoding: text/xml; charset=UTF-8
Accept: */*
Accept-Encoding: gzip, deflate;q=1.0, identity;q=0.5, *;q=0
Accept-Language: en-us, ja;q=0.62, de-de;q=0.93, de;
...

That's a malformed request. 400 Bad Request is the correct server response :)

Thanks. I am sorry I wrote this all out a little too fast. The request would be the following:

------------8<---------------------
GET /index.xml HTTP/1.1
Host: example.com
Connection: keep-alive
User-Agent: BlogEx
Accept: text/xml
-----------8<-----------------------

The response would be something like

------------8<---------------------
HTTP/1.1 200 OK
Cache-Control: private
Content-Type: text/xml
Server: SomeServer/2.1
Content-Length: 55
Date: Wed, 23 Jun 2004 15:36:05 GMT

<?xml version="1.0" encoding="iso-8859-1" ?>
<pløtz/>
-----------8<---------------------

I will fix it right away on the wiki at:
http://www.intertwingly.net/wiki/pie/PaceErrVerb#preview

[...] I can't see what character that is after the "l" and before the "t".
It appears as a box in my mailer. (Emacs says it's character code
0x8f8 but I am suspicious).

It's a Swedish o, I think (the one with a line through it). So it is not ASCII.


The response is broken though clearly interpretable. Clients (in the
wider of Consumer2C or B2C) will therefore attempt to accommodate the
standards due to market pressure. Market pressures are close to
physical laws in their ferocity. We cannot change them. As a result
more and more such breakages will occur, and the standards will be left
in the dust of this vicious whirlwind.[1] In any case fighting against
it is going to be very tiresome.


It would be an easier fight if there were a central, high profile place where commonly needed implementation bugs and workarounds could be deposited -- and eventually removed a few years later when it's confirmed they're not required any more.

I imagine that in the body of the message (if one thinks it would be a good thing for ERR to have a body, that is) one could have a URL that points to such a place. Perhaps a few will pop up as a result of creating such a method.


Not sure if that would help or hinder the fight to get clean standard
implementations out there, but it would certainly help with building
interoperable code, and highlighting the problems of real implementations.
Name and shame, perhaps?

That can be something additional. But perhaps before shaming someone one should first alert them to the error of their ways.


Here is an example of the client's message:

-------8<-------
ERR /index.xml HTTP/1.x
Content-encoding: text/xml; charset=UTF-8
Accept: */*
Accept-Encoding: gzip, deflate;q=1.0, identity;q=0.5, *;q=0
Accept-Language: en-us, ja;q=0.62, de-de;q=0.93, de;
Error-Message: XML is of incorrect content type
Error-Code: XXXX
Error-Spec: RFCXYZ,sec 3; RFCXXX, sec54
Error-Date:  Saturday 19 June 2004, 18:05:30 GMT (whatever encoding)
Error-Method: GET
Error-ContentLength: 63

Again, why do you have a Content-Encoding header, and malformed at that, in the request?

Thanks for pointing that out. I will get it fixed right away.



The MIME type of the content was text/xml. This requires the content to
be in ASCII format, but we found some UTF-8 characters in the message.
We could interpret the message at present but will not necessarily be
able to do so in the future. Please refer to RFCXYZ, sec 3 and RFCXXX,
sec54 for more information. These can be found at http://ietf.org/
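For illustration only, here is a minimal Python sketch of how a client might serialize such an ERR request without the stray Content-Encoding header. The ERR method and the Error-* header names are only the proposal from this thread, not part of HTTP; the function name build_err_request, the Host header value, and the choice of a text/plain body are assumptions made for the example.

```python
def build_err_request(path, host, error_headers, body):
    """Serialize a hypothetical ERR request as raw HTTP/1.1 bytes.

    ERR is the method proposed in this thread; no server is known to
    implement it. The Error-* headers are passed in by the caller.
    """
    payload = body.encode("us-ascii")
    lines = [f"ERR {path} HTTP/1.1", f"Host: {host}"]
    # Proposed (non-standard) Error-* headers describing the fault.
    lines += [f"{name}: {value}" for name, value in error_headers.items()]
    # Standard entity headers describing the human-readable body.
    lines.append("Content-Type: text/plain; charset=us-ascii")
    lines.append(f"Content-Length: {len(payload)}")
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("us-ascii") + payload


request = build_err_request(
    "/index.xml",
    "example.com",
    {
        "Error-Message": "XML is of incorrect content type",
        "Error-Method": "GET",
    },
    "The MIME type of the content was text/xml without a charset.\n",
)
```

The request line and headers are joined with CRLF, and Content-Length is computed from the encoded body so the two cannot drift apart.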

The XML file identifies itself as iso-8859-1. Clearly it's intended that those bytes are understood as iso-8859-1 characters, not UTF-8 characters. A decent implementation would surely _either_ use the encoding declaration, when none appears in the Content-Type (i.e. the same as if "application/xml" were the content-type), or (conforming to RFC 2376) use us-ascii, and treat all the high byte characters as broken or single byte unexpected characters in a default encoding such as (so often the case) iso-8859-1?

I have to direct you to the huge thread on the atom mailing list that started all this: http://www.imc.org/atom-syntax/mail-archive/msg04656.html Perhaps someone there can post a short summary of it.


Also, shouldn't the text say US-ASCII as opposed to just ASCII? :)
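The charset rule under discussion can be sketched roughly as follows in Python: an explicit charset parameter wins; text/xml without one defaults to us-ascii; other XML types fall back to the encoding declaration, then UTF-8. This is a simplified reading of the XML media type rules, not a full implementation (it ignores byte order marks, for instance). All function names are hypothetical, and the byte 0xF8 stands in for the non-ASCII character of the example response, which is an assumption.

```python
import re


def declared_charset(content_type):
    """Return the charset parameter of a Content-Type value, or None."""
    match = re.search(r'charset\s*=\s*"?([^\s;"]+)', content_type, re.IGNORECASE)
    return match.group(1).lower() if match else None


def xml_declared_encoding(body):
    """Return the encoding named in the XML declaration, or None."""
    match = re.match(
        rb'<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z0-9._-]+)["\']', body
    )
    return match.group(1).decode("ascii").lower() if match else None


def effective_charset(content_type, body):
    """Pick the charset a strict client would use to decode the body."""
    charset = declared_charset(content_type)
    if charset:
        return charset
    media_type = content_type.split(";")[0].strip().lower()
    if media_type == "text/xml":
        # text/xml without a charset parameter: the declaration is ignored.
        return "us-ascii"
    return xml_declared_encoding(body) or "utf-8"


def is_malformed(content_type, body):
    """True if the body cannot be decoded in its effective charset."""
    try:
        body.decode(effective_charset(content_type, body))
    except (UnicodeDecodeError, LookupError):
        return True
    return False


# The example response body, with 0xF8 assumed for the non-ASCII character.
body = b'<?xml version="1.0" encoding="iso-8859-1" ?>\n<pl\xf8tz/>'
```

With this body, is_malformed("text/xml", body) reports a problem, while the same bytes served as "application/xml" or as "text/xml; charset=iso-8859-1" decode cleanly. That is the case where an ERR might be warranted.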


ADVANTAGES:

I quite like the idea. Filling up logs of broken servers -- excellent. Perhaps you could take advantage of the Referer header to get a short message in there. :)

There should clearly be some good behavior rules.


Note that some dubious servers ignore the method: they'll treat ERR
the same as GET, or do even worse things.  (E.g. one server treats
this request line as a GET of "HTTP/1.1": "ERR /GET HTTP/1.1", and
treats this request as a GET of an empty URL: "ERR /index.html
HTTP/1.1").

Interesting point. You can't do much about broken servers. Hopefully they will slowly die out.



So you might not want to send ERRs to servers which haven't solicited them.

Yes. One could request which methods a server supports before sending the ERR.
There is an HTTP method for that, OPTIONS I think.
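A rough sketch of that check, using Python's standard http.client: send OPTIONS first and only consider ERR if the server lists it in its Allow header. The helper names methods_allowed and server_solicits_err are made up for this example, and a server advertising "ERR" is of course hypothetical since the method is only a proposal.

```python
import http.client


def methods_allowed(allow_header):
    """Split an Allow header value into a set of method names."""
    if not allow_header:
        return set()
    return {token.strip() for token in allow_header.split(",") if token.strip()}


def server_solicits_err(host, path="/", timeout=10):
    """Ask the server via OPTIONS whether it lists ERR for this resource."""
    conn = http.client.HTTPConnection(host, timeout=timeout)
    try:
        conn.request("OPTIONS", path)
        response = conn.getresponse()
        response.read()  # drain the body so the connection closes cleanly
        return "ERR" in methods_allowed(response.getheader("Allow"))
    finally:
        conn.close()
```

A client would call server_solicits_err("example.com", "/index.xml") before sending anything, and stay silent if the answer is False or the OPTIONS request itself fails.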


-- Jamie

Thanks a lot for the lengthy response. Looks like this is the right place to debate this proposal.


Henry Story
http://bblfish.net