Re: Proposal for an HTTP ERR method
[I am also cc'ing this to atom-syntax, so that those there who are
interested in continuing this discussion can move over here. A
log of it can be found here:
http://lists.w3.org/Archives/Public/ietf-http-wg/2004AprJun/
]
On 23 Jun 2004, at 16:58, Jamie Lokier wrote:
Henry Story wrote:
When a client receives a malformed server response it CAN (SHOULD?)
notify the resource that it is broken, by sending an ERR request,
What kind of malformed server response?
Broken HTTP headers are comparatively rare and should probably get an
ERR, except perhaps for the Server header.
Thanks. One more good reason for ERR. :-)
Malformed HTML is very common. Sending ERR in response to malformed
HTML would generate a flood of ERRs. But -- what is malformed HTML
anyway?
Yes. Presumably HTML would not warrant an ERR. But XHTML might very
well.
XML and XHTML do have well defined specifications.
I admit, that despite reading many documents and specifications,
I hadn't realised that text/xml needed to use ASCII characters only.
Neither had I, nor most of the people on the atom mailing list. There
is a HUGE thread there going on and on about that, which led us to two
proposals to solve this issue, of which this is the more generally
applicable one.
To your example:
GET /index.xml HTTP/1.x
Content-encoding: text/xml; charset=UTF-8
Accept: */*
Accept-Encoding: gzip, deflate;q=1.0, identity;q=0.5, *;q=0
Accept-Language: en-us, ja;q=0.62, de-de;q=0.93, de;
...
That's a malformed request.
400 Bad Request is the correct server response :)
Thanks. I am sorry I wrote this all out a little too fast.
The request would be the following:
------------8<---------------------
GET /index.xml HTTP/1.1
Host: example.com
Connection: keep-alive
User-Agent: BlogEx
Accept: text/xml
-----------8<-----------------------
The response would be something like
------------8<---------------------
HTTP/1.1 200 OK
Cache-Control: private
Content-Type: text/xml
Server: SomeServer/2.1
Content-Length: 55
Date: Wed, 23 Jun 2004 15:36:05 GMT

<?xml version="1.0" encoding="iso-8859-1" ?>
<pløtz/>
-----------8<---------------------
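For illustration, the check a client might make before deciding that
this response deserves an ERR could look like the following. It is only
a sketch in Python, assuming (as discussed in this thread) that text/xml
without a charset parameter defaults to us-ascii; the function name is
made up for the example.

```python
from email.message import Message


def has_non_ascii_without_charset(content_type: str, body: bytes) -> bool:
    """True if a text/xml body has no charset parameter but contains non-ASCII bytes."""
    msg = Message()
    msg["Content-Type"] = content_type
    if msg.get_content_type() != "text/xml":
        return False
    if msg.get_param("charset") is not None:
        # An explicit charset was sent, so the us-ascii default does not apply.
        return False
    # Any byte above 0x7F cannot be us-ascii.
    return any(byte > 0x7F for byte in body)
```

With the response above, the header value is "text/xml" and the body
contains the single byte 0xF8, so the check would flag it.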
I will fix it right away on the wiki at:
http://www.intertwingly.net/wiki/pie/PaceErrVerb#preview
[...] I can't see what character that is after the "l" and before the
"t".
It appears as a box in my mailer. (Emacs says it's character code
0x8f8 but I am suspicious).
It's a Swedish o I think (the one with a line through it). So it is not
ASCII.
The response is broken though clearly interpretable. Clients (in the
wider sense of Consumer2C or B2C) will therefore attempt to accommodate the
standards due to market pressure. Market pressures are close to
physical laws in their ferocity. We cannot change them. As a result
more and more such breakages will occur, and the standards will be left
in the dust of this vicious whirlwind.[1] In any case fighting against
it is going to be very tiresome.
It would be an easier fight if there were a central, high profile
place where commonly needed implementation bugs and workarounds could
be deposited -- and eventually removed a few years later when it's
confirmed they're not required any more.
I imagine that in the body of the message (if one thinks it would be a
good thing for ERR to have a body that is) one could have a URL that
points to such a place. Perhaps a few will pop up as a result of
creating such a method.
Not sure if that would help or hinder the fight to get clean standard
implementations out there, but it would certainly help with building
interoperable code, and highlighting the problems of real
implementations.
Name and shame, perhaps?
That can be something additional. But perhaps before shaming someone
one should first alert them to the error of their ways.
Here is an example of the client's message:
-------8<-------
ERR /index.xml HTTP/1.x
Content-encoding: text/xml; charset=UTF-8
Accept: */*
Accept-Encoding: gzip, deflate;q=1.0, identity;q=0.5, *;q=0
Accept-Language: en-us, ja;q=0.62, de-de;q=0.93, de;
Error-Message: XML is of incorrect content type
Error-Code: XXXX
Error-Spec: RFCXYZ,sec 3; RFCXXX, sec54
Error-Date: Saturday 19 June 2004, 18:05:30 GMT (whatever encoding)
Error-Method: GET
Error-ContentLength: 63
Again, why do you have a Content-Encoding header, and malformed at
that, in the request?
Thanks for pointing that out. I will get it fixed right away.
The Mime type of the content was text/xml. This requires the content
to be in ASCII format, but we found some UTF-8 characters in the message.
We could interpret the message at present but will not necessarily be
able to do so in the future. Please refer to RFCXYZ, sec 3 and RFCXXX,
sec54 for more information. These can be found at http://ietf.org/
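To make the shape of such a message concrete, here is a rough sketch in
Python of how a client might serialise an ERR request. The Error-*
header names are the ones from the example above; the Host,
Content-Type and Content-Length headers, the HTTP/1.1 version and the
function name are assumptions of this sketch, not part of the proposal.

```python
def build_err_request(path: str, host: str, error_headers: dict, body: str) -> bytes:
    """Serialise a hypothetical ERR request with a plain-text explanation as body."""
    body_bytes = body.encode("us-ascii")
    lines = [f"ERR {path} HTTP/1.1", f"Host: {host}"]
    # Error-* headers as proposed, e.g. Error-Message, Error-Method.
    lines += [f"{name}: {value}" for name, value in error_headers.items()]
    # Assumed: the explanation is sent as us-ascii plain text.
    lines.append("Content-Type: text/plain; charset=us-ascii")
    lines.append(f"Content-Length: {len(body_bytes)}")
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("us-ascii") + body_bytes
```

Computing Content-Length from the encoded body avoids the header and
the body getting out of step.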
The XML file identifies itself as iso-8859-1. Clearly it's intended
that those bytes are understood as iso-8859-1 characters, not UTF-8
characters. A decent implementation would surely _either_ use the
encoding declaration, when none appears in the Content-Type (i.e. the
same as if "application/xml" were the content-type), or (conforming to
RFC 2376) use us-ascii, and treat all the high byte characters as
broken or single byte unexpected characters in a default encoding such
as (so often the case) iso-8859-1?
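The two alternatives described here can be sketched as a single
function with a switch. This is an illustration in Python only: the
function name, the flag and the regular expression are assumptions of
the sketch, and it reads only the encoding pseudo-attribute at the very
start of the body rather than doing full XML encoding detection.

```python
import re
from email.message import Message

# Matches the encoding pseudo-attribute of an XML declaration at the start of the body.
_XML_DECL_ENCODING = re.compile(
    rb"^<\?xml[^>]*?encoding\s*=\s*[\"']([A-Za-z][A-Za-z0-9._-]*)[\"']"
)


def pick_encoding(content_type: str, body: bytes, trust_xml_declaration: bool) -> str:
    """Choose an encoding: charset parameter, then (optionally) the XML declaration, else us-ascii."""
    msg = Message()
    msg["Content-Type"] = content_type
    charset = msg.get_param("charset")
    if charset is not None:
        return str(charset).lower()
    if trust_xml_declaration:
        match = _XML_DECL_ENCODING.match(body)
        if match:
            return match.group(1).decode("ascii").lower()
    # Strict reading: text/xml with no charset parameter means us-ascii.
    return "us-ascii"
```

With trust_xml_declaration set, the response above would be read as
iso-8859-1; without it, as us-ascii, and the high byte would then be
treated as an error.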
I have to direct you to the huge thread that started this out on the
atom mailing list.
http://www.imc.org/atom-syntax/mail-archive/msg04656.html
Perhaps someone there can post a short summary of it.
Also, shouldn't the text say US-ASCII as opposed to just ASCII? :)
ADVANTAGES:
I quite like the idea. Filling up logs of broken servers --
excellent. Perhaps you could take advantage of the Referer header to
get a short message in there. :)
There should clearly be some good behavior rules.
Note that some dubious servers ignore the method: they'll treat ERR
the same as GET, or do even worse things. (E.g. one server treats
this request line as a GET of "HTTP/1.1": "ERR /GET HTTP/1.1", and
treats this request as a GET of an empty URL: "ERR /index.html
HTTP/1.1").
Interesting point. You can't do much about broken servers. They will
slowly die out hopefully.
So you might not want to send ERRs to servers which haven't solicited
them.
Yes. One could request which methods a server supports before sending
the ERR.
There is an HTTP method for that, OPTIONS I think.
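As a rough sketch of that idea in Python: ask with OPTIONS first, and
only send an ERR if the Allow header of the reply lists it. That a
server would advertise ERR in Allow is an assumption here, since ERR is
only a proposal, and both function names are invented for the example.

```python
import http.client


def allows_method(allow_header, method="ERR"):
    """Check whether an Allow header value lists the given method (exact match)."""
    if not allow_header:
        return False
    allowed = {token.strip() for token in allow_header.split(",")}
    return method in allowed


def server_accepts_err(host, path="/index.xml"):
    """Send OPTIONS for the resource and report whether ERR is advertised (needs network)."""
    conn = http.client.HTTPConnection(host, timeout=10)
    try:
        conn.request("OPTIONS", path)
        response = conn.getresponse()
        response.read()  # drain any body so the connection can be closed cleanly
        return allows_method(response.getheader("Allow"))
    finally:
        conn.close()
```

A server that ignores the method, like the broken ones mentioned above,
would most likely not list ERR in Allow, so it would not be sent an ERR.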
-- Jamie
Thanks a lot for the lengthy response. Looks like this is the right
place to debate this proposal.
Henry Story
http://bblfish.net