
Re: Finishing the XML-tagging discussion



> > > An advisory parameter of this sort is a worthless parameter. If you cannot
> > > depend on it appearing for every instance of a given content type you
> > > cannot use it for anything.
> 
> > that's not true.   if the parameter is missing then some opportunities
> > to evaluate the object (in the absence of knowledge of the complete type)
> > will be lost.  but that's not the same thing as "you cannot use it for
> > anything".
> 
> Something that I cannot count on being there is of no use to me, and, I
> suspect, to anyone else who cares about this stuff. This has to be
> dependable to be useful.

by that argument nothing in MIME would be useful except the ability
to handle text/plain; charset=us-ascii, since that's all that was
there in RFC 822 and all that could be expected out of email recipients
at the time MIME was deployed.  

and I fail to see how special recognition of the -xml frob is any 
more likely than special treatment of a $superclass parameter.
by your criteria above neither of these will be useful.

> > > (Or worse, you will use it and fail when it isn't present.)
> 
> > how is this kind of "failure" any different than the kind of failure
> > that exists when a reader doesn't understand image/svg-xml and
> > fails to recognize that the -xml suffix means that it can be
> > treated as generic xml?
> 
> It is different because I can do something about it -- I can upgrade
> my software. Whereas in order to get benefit from your proposal I have
> to upgrade the software that created the object I received -- software I 
> do not control. And in some cases I even have to upgrade the protocols
> in use.

okay.  I see that there is a higher barrier to adoption of 
$superclass than for -xml because the sender is more likely to
be able to generate -xml than $superclass.  

but I still think you overstate the hazard.  if someone sends you
an image/svg content without the $superclass label you can still
fix the problem on your end merely by defining a specific handler
for image/svg.  true, it doesn't solve the problem for how to deal
with large numbers of unanticipated XML-based types, but it will 
work on a small scale.
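
To make the fallback order concrete, here is a minimal Python sketch of how a receiving agent might pick a handler. It assumes the proposed parameter is literally spelled `$superclass`; the handler names and both tables are hypothetical and exist only for illustration.

```python
def parse_content_type(value):
    """Split 'type/subtype; name=value' into (media type, parameter dict)."""
    parts = [p.strip() for p in value.split(";")]
    media_type = parts[0].lower()
    params = {}
    for part in parts[1:]:
        if "=" in part:
            name, _, val = part.partition("=")
            params[name.strip().lower()] = val.strip().strip('"').lower()
    return media_type, params

# Hypothetical handler tables, not taken from any real user agent.
SPECIFIC_HANDLERS = {"image/svg": "svg-handler"}
GENERIC_HANDLERS = {"text/xml": "generic-xml-handler"}

def choose_handler(content_type):
    media_type, params = parse_content_type(content_type)
    # 1. A handler defined for the exact type wins, label or no label.
    if media_type in SPECIFIC_HANDLERS:
        return SPECIFIC_HANDLERS[media_type]
    # 2. Otherwise use the advisory superclass label if the sender supplied it.
    superclass = params.get("$superclass")
    if superclass in GENERIC_HANDLERS:
        return GENERIC_HANDLERS[superclass]
    # 3. Otherwise fall back to offering to save the object.
    return "save-as-octet-stream"
```

With these tables, an unlabelled `image/svg` still reaches the specific handler, while an unanticipated type is only recognized as XML when the sender included the label.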

but I have to wonder - how likely is it that such XML-ish things 
will be generated by vanilla MIME user agents anyway?  isn't it
more likely that they will be generated by things that are XML-aware?

> > > This is a complete red herring. Nobody is proposing that the suffix be
> > > used for negotiation purposes. Negotiation is a different problem than
> > > labelling.
> 
> > yes, they are different problems, but if we establish this syntactic
> > convention, people will want to use that convention in content
> > negotiation.  or to put it another way, a content-type feature
> > labelling convention  that works hand in hand with content negotiation
> > is vastly more useful than one which does not.
> 
> Then let's by all means add a content-features tag as well. Negotiation
> problem solved.

so you want to put the xml frob in two different places?
seems like that introduces far more silly states than putting
it in either of them. 

> > > Again, an advisory parameter of this sort accomplishes nothing.
> 
> > "nothing" seems to me like a gross exaggeration.
> 
> I see it as an understatement.

fine, but it's not exactly an illuminating explanation.

> > > > To put it another way, if it's absolutely important that every instance
> > > > of image/svg be externally labelled as XML then I'd agree with you.
> > > > But I don't see this as absolutely important.
> 
> > > I do.
> 
> > okay, why?
> 
> Because I like to build software that actually works most of the time.

funny, I do too.  OTOH, I realize that when I send someone a new
or unusual kind of MIME object that there's a chance they won't
be able to interpret it, and that I won't find this out immediately,
and I accept that as a fundamental limitation of MIME.

> > okay, how many HTTP servers can deal with
> 
> > Accept: */*-xml
> 
> Any compliant HTTP server can deal with it. Now, it may not give you  the
> result you want, but so what? This is nothing but a strawman argument; the
> proposal at hand doesn't specify any extensions to the accept field in HTTP. 
> If you think it is such a risk that such usage will appear then we can
> specifically ban it.

banning content negotiation's use of this information hardly seems like a 
constructive way forward.  what I'd like to see is a proposal for  adding 
these frobs that anticipates the needs of content negotiation rather than 
pretending that they don't exist.

(this can still be done even if they are bundled in the content type name,
but it probably needs a slightly different syntax; for one thing,
"-" already appears in a number of content-type names without being
intended as a feature separator.)
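
A small Python sketch of the ambiguity: if "-" were treated as the feature separator, a naive splitter (the function `naive_features` is hypothetical) would also find "features" in existing type names that were never meant to carry any.

```python
def naive_features(media_type):
    """Treat every '-' in the subtype as a feature separator (naive)."""
    subtype = media_type.split("/", 1)[1]
    base, *features = subtype.split("-")
    return base, features

# The intended case: the suffix really is a feature label.
print(naive_features("image/svg-xml"))

# Existing types where '-' is just part of the name.
print(naive_features("application/octet-stream"))
print(naive_features("message/external-body"))
```

Under this naive rule `application/octet-stream` would appear to be an "octet" type with a "stream" feature, which is why a separator that does not already occur in registered names seems preferable.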

> > > Thus far all you have cited are easily surmountable
> > > problems, like the ordering of future additional suffixes (assuming there
> > > ever are any).
> 
> > yes, if we decide to use frobs on the content-type name it's not too
> > difficult to define a canonical ordering such that there's one unique
> > way of spelling any content-type name.  but if we do go in that direction
> > then I'd like us to go ahead and define that syntactic convention,
> > including the ordering
> 
> Fine with me.
> 
> > > > a concrete example of something that this breaks would be helpful
> > > > in getting me to understand your concerns.
> > >
> > > It breaks so many things in so many ways... Some examples:
> > >
> > > (1) Silly state problems. Consider the possible effect of image/jpeg;
> > >     $superclass=text/xml on a handler only prepared to accept XML text.
> > >     (And compare it with the effect of image/jpeg; charset=us-ascii on
> > >     any existing handler.)
> 
> > we need to compare apples to apples.  if the handler is only prepared
> > to accept XML text and the image/jpeg arrives with a $superclass=text/xml
> > type then it gets handed to the XML layer and that layer says
> > "invalid XML content" (and ideally the recipient gets a chance to save it).
> > if it sees image/jpeg; charset=us-ascii then it gets treated as
> > application/octet-stream and the recipient gets a chance to save it.
> 
> I'm sorry, but I _am_ comparing apples with apples. I've seen way too much
> software that simply crashes under such circumstances. (And so, I suspect, 
> have you.) Whereas the latter case causes no harm at all.

okay, I'm making assumptions about XML here - which is that XML readers
actually are built on parsing technology and can thus supply reasonable
error messages a high percentage of the time.  and that there are
cookies within XML that the XML parser will use to quickly distinguish
most obviously-not-valid-xml content from maybe-valid-xml content.
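
As an illustration of that assumption, here is a minimal sketch using Python's standard `xml.etree.ElementTree` parser. The function name `try_generic_xml` and the sample bytes are made up for the example; the point is only that a parser-based reader can reject a mislabelled JPEG with an error instead of crashing.

```python
import xml.etree.ElementTree as ET

def try_generic_xml(payload: bytes) -> str:
    """Hand a payload to a generic XML layer and report what happened."""
    try:
        root = ET.fromstring(payload)
    except ET.ParseError as err:
        # The parser gives up at the first token that cannot be XML.
        return f"invalid XML content: {err}"
    return f"parsed XML with root element <{root.tag}>"

# First bytes of a JPEG-like payload, mislabelled as XML.
print(try_generic_xml(b"\xff\xd8\xff\xe0JFIF"))

# A trivially well-formed XML payload.
print(try_generic_xml(b"<svg/>"))
```

This says nothing about readers that are not built on a real parser, which is the case the earlier objection is about.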

now if we were talking about subclasses of e.g. postscript or pdf I
would indeed expect to see gobbledegook and crashes.  (since interpreters
for these languages often seem to crash even on perfectly valid documents)

and of course this would be a general purpose mechanism, not one specific
to xml.

but given that there's already a great deal of potential for mislabelling
in MIME, does this additional opportunity really increase the risk
that an object will be mislabelled?  i.e. how much more likely is it that 

a) the content-type is correct, but a $superclass parameter is present 
   and incorrect, than
b) the content-type is incorrect?

my guess is that superclasses will often be omitted, but will rarely
be incorrect when they are present.

> 
> > > (2) Problems with sending agents not including the tag. Suppose an application
> > >     is deployed that depends on the superclass tag. (This is inevitable once
> > >     the tag is defined, you can call it advisory until you are blue in the
> > >     face but if it is used at all it won't be taken as such.)
> 
> > let me see if I understand you correctly: what you are saying is that
> > people will expect the new convention (whatever it is) to work and
> > control the recipient's MIME reader's fallback behavior even in
> > the presence of a  vast installed base that neither understands this
> > convention nor XML?
> 
> No, that is not what I'm saying. What I'm saying is that a new agent, one that
> depends on the parameter being present, only works if it is present. 

why should an agent depend on the parameter being present (if it knows
that the specific type is XML-based)?

> And this then implies that a substantial number of agents need to send the 
> tag in order for it to be useful. This won't happen soon if at all, so my 
> agent that I wrote which depends on the tag we've called for ends up not 
> working.
> 
> > or that this would be like the user agents that could recognize
> > filename suffixes but refused  to look at the content-type?
> > people would expect their content to be read by the recipient
> > even if they could not label it correctly?
> 
> People do expect it. We have argued long and hard that a filename isn't
> sufficient, and we've mostly won. Now that we've won we want to say, 
> "Surprise, we've changed our minds, you now need to generate these 
> parameters for things to work". This is unacceptable.

we've always said that the content-type specification included the parameters,
that the content-type name without the parameters was incomplete.  so I don't
see how we're changing our minds by defining new types that happen to have 
parameters - even if some of those parameters are shared with other types.

> > (the latter strikes me as an argument against any sort of XML
> > convention at all - since you seem to be saying that if the convention
> > does exist that it will be (mis)used in preference to the primary
> > content-type )
> 
> The nice thing about the tag being part of the media type is that it leverages
> off of our years of insisting that proper media type labels be used. We define
> the convention and the rest of the problem takes care of itself.
> 
> > and how many user agents does this affect?  e.g., what percentage
> > of UAs cannot send out a text/plain attachment with a charset label?
> 
> A pretty large number, actually.
> 
> > > (3) Problems with places where parameters aren't expected/allowed. Once
> > >     the tag is required there will be pressure to generate it. This in turn
> > >     will lead to sending agents being upgraded to produce it and thereby cranking
> > >     out parameters for the first time. Some of these agents are now used to
> > >     generate values for fields that don't allow parameters. The upgrade will
> > >     cause these fields to become syntactically invalid.
> 
> > such as in accept headers?
> 
> Strawman again. There's nothing syntactically invalid about putting funny
> wildcards in an accept header.

no, but as you point out, that doesn't have the desired result.

anyway the content negotiation argument isn't about the particular syntax 
that is used. it's about whether, having defined this new frob for
content-types, we then have to define new content negotiation
mechanisms to recognize those frobs.  if we can come up with something
that can be made to more-or-less work with existing HTTP accept and/or 
conneg expressions, that would seem like a win.
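
For reference, a simplified Python sketch of how existing Accept matching treats a media range. It covers only the three standard forms (`*/*`, `type/*`, `type/subtype`) and ignores parameters and q-values; the function name `range_matches` is hypothetical.

```python
def range_matches(media_range, media_type):
    """Match one Accept media range against one media type (simplified)."""
    r_type, r_sub = media_range.lower().split("/", 1)
    m_type, m_sub = media_type.lower().split("/", 1)
    if r_type == "*" and r_sub == "*":
        return True                      # "*/*" matches everything
    if r_sub == "*":
        return r_type == m_type          # "image/*" matches any image type
    # Anything else is compared literally, including "*/*-xml".
    return (r_type, r_sub) == (m_type, m_sub)

print(range_matches("*/*", "image/svg-xml"))
print(range_matches("image/*", "image/svg-xml"))
print(range_matches("*/*-xml", "image/svg-xml"))
```

Under these rules `*/*-xml` is accepted syntactically but is compared as a literal type and subtype, so it matches nothing useful. A suffix frob would need either new matching rules or a spelling that the existing forms can already express.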

Keith