From owner-ietf-imaa Sun Feb 9 11:17:18 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h19JHIq24693 for ietf-imaa-bks; Sun, 9 Feb 2003 11:17:18 -0800 (PST) Received: from [63.202.92.156] (adsl-63-202-92-156.dsl.snfc21.pacbell.net [63.202.92.156]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h19JHGd24687 for ; Sun, 9 Feb 2003 11:17:16 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . Date: Sun, 9 Feb 2003 11:10:29 -0800 To: ietf-imaa@imc.org From: Paul Hoffman / IMC Subject: Dealing with open issues in IMAA Content-Type: text/plain; charset="iso-8859-1" ; format="flowed" Content-Transfer-Encoding: 8bit Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Greetings. The list is now open. Adam Costello and I came up with draft-hoffman-imaa (with some early help from Patrik Fältström), and we think it is a good way forward for internationalizing Internet email addresses. Those of you who have read the draft know that there are a bunch of open issues. In fact, there are a few where Adam and I strongly disagree with each other. For the first round of discussion, people who have strong opinions on any of the open issues should speak up, starting a new thread with an appropriate subject line (if one isn't happening already). After we get a sense of what people think, we'll revise the document, possibly closing off some issues. 
In the latter case, we'll start an appendix of "design choices" so that people can see how we got to where we end up. --Paul Hoffman, Director --Internet Mail Consortium From owner-ietf-imaa Sun Feb 9 11:17:19 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h19JHJH24696 for ietf-imaa-bks; Sun, 9 Feb 2003 11:17:19 -0800 (PST) Received: from [63.202.92.156] (adsl-63-202-92-156.dsl.snfc21.pacbell.net [63.202.92.156]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h19JHHd24691 for ; Sun, 9 Feb 2003 11:17:17 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . Date: Sun, 9 Feb 2003 11:16:50 -0800 To: ietf-imaa@imc.org From: Paul Hoffman / IMC Subject: Case sensitivity on the LHS Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: This one should be lively. RFC 2821 and RFC 2822 make it clear that the left-hand side (LHS) of email addresses are opaque, which in turn means they are case-sensitive. The -00 draft of IMAA preserves this. Most email users don't know that the LHS is case-sensitive, and probably guess that it isn't because the RHS (the domain name) is not. Further, there are some mail systems and gateways that do case conversion on the LHS. If we simplify IMAA to make the LHS case-insensitive, it will probably match the expectations of users better. It would also mean that we could reuse Nameprep instead of using our own Stringprep profile. 
There are other reasons why this might be good listed in the IMAA document. But to do so would go against the spirit of the standards on which IMAA rests, namely 2821 and 2822 (and 821 and 822 before them). Purity or modernity? --Paul Hoffman, Director --Internet Mail Consortium From owner-ietf-imaa Sun Feb 9 11:28:25 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h19JSPO24870 for ietf-imaa-bks; Sun, 9 Feb 2003 11:28:25 -0800 (PST) Received: from stoneport.math.uic.edu (stoneport.math.uic.edu [131.193.178.160]) by above.proper.com (8.11.6/8.11.3) with SMTP id h19JSPd24866 for ; Sun, 9 Feb 2003 11:28:25 -0800 (PST) Received: (qmail 80940 invoked by uid 1016); 9 Feb 2003 19:28:51 -0000 Date: 9 Feb 2003 19:28:51 -0000 Message-ID: <20030209192851.80939.qmail@cr.yp.to> Automatic-Legal-Notices: See http://cr.yp.to/mailcopyright.html. From: "D. J. Bernstein" To: ietf-imaa@imc.org Subject: Background reading for non-ASCII mailbox names Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: My web page http://cr.yp.to/djbdns/idn.html discusses six problems caused by careless internationalization proposals: * interoperability failures; * inconsistent displays; * unnecessary implementation and deployment costs; * multiple semantically similar names; * identical displays of different names; and * typing failures. The discussion focuses on domain names for concreteness, but the same principles apply to mailbox names, login names, etc. ---D. J. Bernstein, Associate Professor, Department of Mathematics, Statistics, and Computer Science, University of Illinois at Chicago P.S. The mailing-list software silently discarded this message the first time I sent it. 
From owner-ietf-imaa Sun Feb 9 14:47:03 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h19Ml3Y29944 for ietf-imaa-bks; Sun, 9 Feb 2003 14:47:03 -0800 (PST) Received: from tux.w3.org (IDENT:root@tux.w3.org [18.29.0.27]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h19Mkwd29938; Sun, 9 Feb 2003 14:47:01 -0800 (PST) Received: from enoshima (IDENT:root@tux.w3.org [18.29.0.27]) by tux.w3.org (8.9.3/8.9.3) with ESMTP id RAA16869; Sun, 9 Feb 2003 17:47:00 -0500 Message-Id: <4.2.0.58.J.20030209173037.05a45ca0@localhost> X-Sender: duerst@localhost X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J Date: Sun, 09 Feb 2003 17:43:27 -0500 To: Paul Hoffman / IMC , ietf-imaa@imc.org From: Martin Duerst Subject: Re: Case sensitivity on the LHS In-Reply-To: Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 11:16 03/02/09 -0800, Paul Hoffman / IMC wrote: >If we simplify IMAA to make the LHS case-insensitive, it will probably >match the expectations of users better. It would also mean that we could >reuse Nameprep instead of using our own Stringprep profile. There are >other reasons why this might be good listed in the IMAA document. But to >do so would go against the spirit of the standards on which IMAA rests, >namely 2821 and 2822 (and 821 and 822 before them). Would it just go against the spirit, or also create other problems? The following problems come to my mind: [mostly just thinking aloud] - Some systems currently treat ASCII as case-sensitive (don't have any idea how many). But these names would not be encoded, so the behavior would stay the same. (except if nameprep is applied before the check for ascii-only is done, which may well be the case). - What is the current user expectation? My guess is that case-insensitive is more widespread. 
In any case, one or the other expectation will be disappointed (if they ever happen to notice). Do we have any idea which systems are more numerous (the only sample I have at the moment is my own email address, which is case-insensitive). Overall, I think that the whole nameprep/stringprep stuff is already complicated enough, and so if there are not major problems, going with case-insensitive looks much better to me. Regards, Martin. From owner-ietf-imaa Sun Feb 9 21:57:06 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1A5v6f08711 for ietf-imaa-bks; Sun, 9 Feb 2003 21:57:06 -0800 (PST) Received: from mercury.ccil.org (mail@[192.190.237.100]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1A5v5d08707 for ; Sun, 9 Feb 2003 21:57:05 -0800 (PST) Received: from cowan by mercury.ccil.org with local (Exim 3.35 #1 (Debian)) id 18i6wC-0007YE-00 for ; Mon, 10 Feb 2003 00:57:04 -0500 Subject: John Cowan on IMAA draft To: ietf-imaa@imc.org Date: Mon, 10 Feb 2003 00:57:04 -0500 (EST) X-Mailer: ELM [version 2.4ME+ PL66 (25)] MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Message-Id: From: John Cowan Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Since local names are under the sole control of a single domain (even if people nominate their own local names, it's the domain mail administrator who decides whether they work), I think that having an ACE prefix is not necessary. There is no requirement that every possible name be available. I think the de facto situation that local names are case-insensitive should be accepted. Doing local names by parts (delimited by non-alphanumeric ASCII characters) is a good idea. However, I'm not wedded to it. We should go for 63-character limitation. 
Recognizing fullwidth @ is important, because it's context dependent whether people are using halfwidth or fullwidth characters, and they may not even be conscious of it in double-width environments. -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org To say that Bilbo's breath was taken away is no description at all. There are no words left to express his staggerment, since Men changed the language that they learned of elves in the days when all the world was wonderful. --_The Hobbit_ From owner-ietf-imaa Mon Feb 10 05:04:39 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1AD4dr22966 for ietf-imaa-bks; Mon, 10 Feb 2003 05:04:39 -0800 (PST) Received: from crow.verisign.com (crow.verisign.com [216.168.237.103]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1AD4bd22957; Mon, 10 Feb 2003 05:04:37 -0800 (PST) Received: from vsvapostalgw3.prod.netsol.com (vsvapostalgw3.prod.netsol.com [10.170.12.61]) by crow.verisign.com (nsi_0.1/8.9.1) with ESMTP id IAA04734; Mon, 10 Feb 2003 08:04:32 -0500 (EST) Received: by vsvapostalgw3.prod.netsol.com with Internet Mail Service (5.5.2653.19) id <1SMTVFFZ>; Mon, 10 Feb 2003 08:02:33 -0500 Message-ID: <3CD14E451751BD42BA48AAA50B07BAD60337064E@vsvapostal3.prod.netsol.com> From: "Hollenbeck, Scott" To: "'Paul Hoffman / IMC'" , ietf-imaa@imc.org Subject: RE: Case sensitivity on the LHS Date: Mon, 10 Feb 2003 08:00:34 -0500 MIME-Version: 1.0 X-Mailer: Internet Mail Service (5.5.2653.19) Content-Type: text/plain; charset="iso-8859-1" Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: > If we simplify IMAA to 
make the LHS case-insensitive, it will > probably match the expectations of users better. It would also mean > that we could reuse Nameprep instead of using our own Stringprep > profile. There are other reasons why this might be good listed in the > IMAA document. But to do so would go against the spirit of the > standards on which IMAA rests, namely 2821 and 2822 (and 821 and 822 > before them). > > Purity or modernity? Modernity. I agree that case insensitivity will probably match the expectations of users better. Is this document intended to be a formal update to 2821 and 2822? Both (2821 section 4.1.2 and 2822 section 3.4.1) contain formal definitions of the local part of an email address. -Scott- From owner-ietf-imaa Mon Feb 10 06:08:16 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1AE8Gv26517 for ietf-imaa-bks; Mon, 10 Feb 2003 06:08:16 -0800 (PST) Received: from server1.matic.com (server.iicinternet.com [66.159.16.71] (may be forged)) by above.proper.com (8.11.6/8.11.3) with SMTP id h1AE8Ed26510 for ; Mon, 10 Feb 2003 06:08:14 -0800 (PST) Received: (qmail 12207 invoked from network); 10 Feb 2003 14:07:57 -0000 Received: from adsl-65-42-242-53.dsl.lgnnmi.ameritech.net (HELO ?192.168.0.100?) (65.42.242.53) by server.iicinternet.com with SMTP; 10 Feb 2003 14:07:57 -0000 Mime-Version: 1.0 X-Sender: tedd@sperling.com (Unverified) Message-Id: Date: Mon, 10 Feb 2003 09:07:37 -0500 To: ietf-imaa@imc.org From: tedd Subject: Re: Case sensitivity on the LHS Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Paul: >This one should be lively. Why not? Everything else is. >Most email users don't know that the LHS is case-sensitive, and >probably guess that it isn't because the RHS (the domain name) is >not. Further, there are some mail systems and gateways that do case >conversion on the LHS. 
> >If we simplify IMAA to make the LHS case-insensitive, it will >probably match the expectations of users better. Absolutely. >It would also mean that we could reuse Nameprep instead of using our >own Stringprep profile. There are other reasons why this might be >good listed in the IMAA document. But to do so would go against the >spirit of the standards on which IMAA rests, namely 2821 and 2822 >(and 821 and 822 before them). > >Purity or modernity? > >--Paul Hoffman, Director >--Internet Mail Consortium I am open to arguments otherwise, but at present, my vote would be to make the LHS case-insensitive. In fact, I don't understand the reasoning behind considering case-sensitive in the first place. Would someone be so kind as to point out the benefit(s) of having an email address of Tedd@sperling.com being different than tedd@sperling.com? To me, it just doesn't make any sense -- or do I not understand the problem. Thank you. tedd -- http://sperling.com/ From owner-ietf-imaa Mon Feb 10 06:21:07 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1AEL7r28484 for ietf-imaa-bks; Mon, 10 Feb 2003 06:21:07 -0800 (PST) Received: from fluff.x42.com (xp8rji20lb1dl3ntueen@fluff.x42.com [213.187.218.11]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1AEL5d28477 for ; Mon, 10 Feb 2003 06:21:05 -0800 (PST) Received: (qmail 28505 invoked by uid 569); 10 Feb 2003 14:21:05 -0000 Date: Mon, 10 Feb 2003 15:21:05 +0100 From: Magnus Bodin To: tedd Cc: ietf-imaa@imc.org Subject: Re: Case sensitivity on the LHS Message-ID: <20030210142105.GG12186@bodin.org> References: Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: User-Agent: Mutt/1.4i X-Face: "/J{klxZ0}6u#[u&\4L/KMmGO}7(W|&yk4c(NYO^IPyMT<3DMOn7\?+Bw?33T ,}nX(Pj6}j;X1LPn$%d<;in~z50w#P>3u6)|bgwm~ZB@Hl?Y|BTa*/vH!~}Iln6F>>3: s/'5[>fW7gYB$B.m=85bu$GTPN#NG##a_^mc9uBp9.gvh*i>fHyB: Reply-By: Thu Feb 13 15:18:45 2003 
Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: On Mon, Feb 10, 2003 at 09:07:37AM -0500, tedd wrote: > > Would someone be so kind as to point out the benefit(s) of having > an email address of Tedd@sperling.com being different than > tedd@sperling.com? To me, it just doesn't make any sense -- or do I > not understand the problem. It might not make any sense in English with ASCII [A-Za-z]. In a different language with different pairs of upper/lower-case characters, there might be a bigger difference between an upper-, lower-, or mixed-case word. /magnus -- http://x42.com From owner-ietf-imaa Mon Feb 10 06:33:58 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1AEXwU29596 for ietf-imaa-bks; Mon, 10 Feb 2003 06:33:58 -0800 (PST) Received: from mercury.ccil.org (mail@[192.190.237.100]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1AEXvd29591 for ; Mon, 10 Feb 2003 06:33:57 -0800 (PST) Received: from cowan by mercury.ccil.org with local (Exim 3.35 #1 (Debian)) id 18iF0P-000303-00 for ; Mon, 10 Feb 2003 09:33:57 -0500 Subject: Re: Case sensitivity on the LHS To: ietf-imaa@imc.org Date: Mon, 10 Feb 2003 09:33:57 -0500 (EST) X-Mailer: ELM [version 2.4ME+ PL66 (25)] MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Message-Id: From: John Cowan Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: tedd wrote: > Would > someone be so kind as to point out the benefit(s) of having an email > address of Tedd@sperling.com being different than tedd@sperling.com? The classic minimal pair is Tedd@example.com vs. TedD@example.com. But I agree that it's a silly distinction. Mail admins don't *have* to let people choose the absolutely-precisely-preferred forms of their name as local-parts. -- LEAR: Dost thou call me fool, boy? 
John Cowan FOOL: All thy other titles http://www.ccil.org/~cowan thou hast given away: jcowan@reutershealth.com That thou wast born with. http://www.reutershealth.com From owner-ietf-imaa Mon Feb 10 06:36:50 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1AEaof00539 for ietf-imaa-bks; Mon, 10 Feb 2003 06:36:50 -0800 (PST) Received: from relay-3m.club-internet.fr (relay-3m.club-internet.fr [194.158.104.42]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1AEamd00527; Mon, 10 Feb 2003 06:36:48 -0800 (PST) Received: from mine.club-internet.fr (f11v-10-143.d1.club-internet.fr [213.44.169.143]) by relay-3m.club-internet.fr (Postfix) with ESMTP id 7E9A2E33C; Mon, 10 Feb 2003 15:37:32 +0100 (CET) Message-Id: <5.2.0.9.0.20030210144156.023fbec0@mail.club-internet.fr> X-Sender: jefsey@mail.club-internet.fr X-Mailer: QUALCOMM Windows Eudora Version 5.2.0.9 Date: Mon, 10 Feb 2003 14:45:44 +0100 To: Martin Duerst , Paul Hoffman / IMC , ietf-imaa@imc.org From: "J-F C. (Jefsey) Morfin" Subject: Re: Case sensitivity on the LHS In-Reply-To: <4.2.0.58.J.20030209173037.05a45ca0@localhost> References: Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 23:43 09/02/03, Martin Duerst wrote: >Overall, I think that the whole nameprep/stringprep stuff is already >complicated enough, and so if there are not major problems, going with >case-insensitive looks much better to me. Agreed. We also have to consider all the possible devices (existing or to come) that must send mail with reduced keyboards and support IDNs with reduced computing resources. 
From owner-ietf-imaa Mon Feb 10 06:59:39 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1AExdq01122 for ietf-imaa-bks; Mon, 10 Feb 2003 06:59:39 -0800 (PST) Received: from tux.w3.org (IDENT:root@tux.w3.org [18.29.0.27]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1AExad01114; Mon, 10 Feb 2003 06:59:36 -0800 (PST) Received: from enoshima (IDENT:root@tux.w3.org [18.29.0.27]) by tux.w3.org (8.9.3/8.9.3) with ESMTP id JAA24238; Mon, 10 Feb 2003 09:59:35 -0500 Message-Id: <4.2.0.58.J.20030210094623.05b46498@localhost> X-Sender: duerst@localhost X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J Date: Mon, 10 Feb 2003 09:48:25 -0500 To: "J-F C. (Jefsey) Morfin" , Paul Hoffman / IMC , ietf-imaa@imc.org From: Martin Duerst Subject: Re: Case sensitivity on the LHS In-Reply-To: <5.2.0.9.0.20030210144156.023fbec0@mail.club-internet.fr> References: <4.2.0.58.J.20030209173037.05a45ca0@localhost> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 14:45 03/02/10 +0100, J-F C. (Jefsey) Morfin wrote: >we also have to consider all the possible devices (existing or to come) >having to send mails with reduced keyboards and to support IDNs with >reduced computing resources. Yes. Please note that nameprep/stringprep can be reduced (in some cases drastically) if you know that you will only get a subset of characters as input. But even then, being able to use the same nameprep/stringprep for both sides of the '@' is a clear win. Regards, Martin. 
From owner-ietf-imaa Mon Feb 10 07:03:08 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1AF38101235 for ietf-imaa-bks; Mon, 10 Feb 2003 07:03:08 -0800 (PST) Received: from mailgen2.internet.gouv.qc.ca (courrier4.internet.gouv.qc.ca [192.197.162.9] (may be forged)) by above.proper.com (8.11.6/8.11.3) with SMTP id h1AF37d01231 for ; Mon, 10 Feb 2003 07:03:07 -0800 (PST) Received: (qmail 3585 invoked from network); 10 Feb 2003 15:02:54 -0000 Received: from unknown (HELO p295.sct1.gouv.qc.ca) (142.213.85.104) by mailgen2.internet.gouv.qc.ca with SMTP; 10 Feb 2003 15:02:54 -0000 Message-Id: <5.0.2.1.2.20030210095250.00b03c68@entree.sct1.gouv.qc.ca> X-Sender: alabonte@entree.sct1.gouv.qc.ca X-Mailer: QUALCOMM Windows Eudora Version 5.0.2 Date: Mon, 10 Feb 2003 10:02:57 -0500 To: Magnus Bodin , tedd From: Alain LaBonté Subject: Re: Case sensitivity on the LHS Cc: ietf-imaa@imc.org In-Reply-To: <20030210142105.GG12186@bodin.org> References: Mime-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1"; format=flowed Content-Transfer-Encoding: 8bit Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 15:21 2003-02-10 +0100, Magnus Bodin wrote: >On Mon, Feb 10, 2003 at 09:07:37AM -0500, tedd wrote: > > > > Would someone be so kind as to point out the benefit(s) of having > > an email address of Tedd@sperling.com being different than > > tedd@sperling.com? To me, it just doesn't make any sense -- or do I > > not understand the problem. > >[Magnus] It might not make any sense in English with ASCII [A-Za-z]. In a >different language with a different pair of upper/lower-case characters, >it might be a bigger difference between a upper/lower/mixed-case word. [Alain] You mean in German for non-proper names? 
(in French and English there are also cases where case sensitivity matters, but they are rather rare and should not be the rule[*]; still, I could agree that we have to consider the German issue -- are there other languages like German?). As this is relevant, are there cases with proper names where upper and lower case change anything? Alain LaBonté (no different from ALAIN LABONTÉ, except that I would then write LA BONTÉ in two words, although I would not make a special case of this two-word issue either; it is a side issue) Québec *: French: "J'aime le Français" means "I love the Frenchman" "J'aime le français" means "I love French" English: "This month is august" does not mean the same as "This month is August" But these are very exceptional in both languages... From owner-ietf-imaa Mon Feb 10 07:20:34 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1AFKY301656 for ietf-imaa-bks; Mon, 10 Feb 2003 07:20:34 -0800 (PST) Received: from server1.matic.com (server.iicinternet.com [66.159.16.71] (may be forged)) by above.proper.com (8.11.6/8.11.3) with SMTP id h1AFKVd01651 for ; Mon, 10 Feb 2003 07:20:31 -0800 (PST) Received: (qmail 16440 invoked from network); 10 Feb 2003 15:20:14 -0000 Received: from adsl-65-42-242-53.dsl.lgnnmi.ameritech.net (HELO ?192.168.0.100?) 
(65.42.242.53) by server.iicinternet.com with SMTP; 10 Feb 2003 15:20:14 -0000 Mime-Version: 1.0 X-Sender: tedd@sperling.com (Unverified) Message-Id: In-Reply-To: <20030210142105.GG12186@bodin.org> References: <20030210142105.GG12186@bodin.org> Date: Mon, 10 Feb 2003 10:19:51 -0500 To: ietf-imaa@imc.org From: tedd Subject: Re: Case sensitivity on the LHS Content-Type: text/plain; charset="iso-8859-1" ; format="flowed" Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by above.proper.com id h1AFKXd01653 Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: >On Mon, Feb 10, 2003 at 09:07:37AM -0500, tedd wrote: >> >> Would someone be so kind as to point out the benefit(s) of having >> an email address of Tedd@sperling.com being different than >> tedd@sperling.com? To me, it just doesn't make any sense -- or do I >> not understand the problem. > >It might not make any sense in English with ASCII [A-Za-z]. In a >different language with a different pair of upper/lower-case characters, >it might be a bigger difference between a upper/lower/mixed-case word. > >/magnus /magnus: Thank you -- very interesting. But the upper/lower-case character problem you describe will still exist on the RHS regardless, and that is not going to change. Thus, as I see it, making the LHS case-sensitive would only compound the problem. I believe that most users will not understand why case would matter on one side and not on the other; most don't realize that it does now, as Paul pointed out. I think a considerable amount of user frustration, confusion, and error would enter into the mix if the rules were different for each side of the "@" -- not to mention the problems that may arise from implementing two different sets of rules for servers, mail admins, and such. Thus, I believe that whatever method is adopted for character handling should be consistent throughout the address. 
tedd -- http://sperling.com/ From owner-ietf-imaa Mon Feb 10 07:31:40 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1AFVeT01928 for ietf-imaa-bks; Mon, 10 Feb 2003 07:31:40 -0800 (PST) Received: from server1.matic.com (server.iicinternet.com [66.159.16.71] (may be forged)) by above.proper.com (8.11.6/8.11.3) with SMTP id h1AFVbd01922 for ; Mon, 10 Feb 2003 07:31:37 -0800 (PST) Received: (qmail 17131 invoked from network); 10 Feb 2003 15:31:19 -0000 Received: from adsl-65-42-242-53.dsl.lgnnmi.ameritech.net (HELO ?192.168.0.100?) (65.42.242.53) by server.iicinternet.com with SMTP; 10 Feb 2003 15:31:19 -0000 Mime-Version: 1.0 X-Sender: tedd@sperling.com (Unverified) Message-Id: In-Reply-To: References: Date: Mon, 10 Feb 2003 10:30:57 -0500 To: ietf-imaa@imc.org From: tedd Subject: Re: Case sensitivity on the LHS Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: >tedd scripsit: > >> Would >> someone be so kind as to point out the benefit(s) of having an email >> address of Tedd@sperling.com being different than tedd@sperling.com? > >The classic minimal pair is Tedd@example.com vs. TedD@example.com. >But I agree that it's a silly distinction. Mail admins don't *have* >to let people choose the absolutely-precisely-preferred forms of their >name as local-parts. > >-- >LEAR: Dost thou call me fool, boy? John Cowan John: Agreed. And furthermore, I believe that: (lower-case)omega@(lower-case)omega.com is better than allowing -- (upper-case)omega@(lower-case)omega.com -- and trying to explain, implement, and having people understand why the upper-case omega character (code point) is not allowed on both sides of the "@". 
tedd -- http://sperling.com/ From owner-ietf-imaa Mon Feb 10 07:32:58 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1AFWw601967 for ietf-imaa-bks; Mon, 10 Feb 2003 07:32:58 -0800 (PST) Received: from tux.w3.org (IDENT:root@tux.w3.org [18.29.0.27]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1AFWvd01962 for ; Mon, 10 Feb 2003 07:32:57 -0800 (PST) Received: from enoshima (IDENT:root@tux.w3.org [18.29.0.27]) by tux.w3.org (8.9.3/8.9.3) with ESMTP id KAA04483; Mon, 10 Feb 2003 10:32:23 -0500 Message-Id: <4.2.0.58.J.20030210094840.059796b8@localhost> X-Sender: duerst@localhost X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J Date: Mon, 10 Feb 2003 10:32:18 -0500 To: Magnus Bodin , tedd From: Martin Duerst Subject: Re: Case sensitivity on the LHS Cc: ietf-imaa@imc.org In-Reply-To: <20030210142105.GG12186@bodin.org> References: Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 15:21 03/02/10 +0100, Magnus Bodin wrote: >It might not make any sense in English with ASCII [A-Za-z]. In a >different language with a different pair of upper/lower-case characters, >it might be a bigger difference between a upper/lower/mixed-case word. This is a point to consider. However, the differences between upper/lower/mixed-case words usually apply to the actual language (e.g. nouns vs. verbs,...), not to names. This is certainly the case in German. In various European languages, there are individual differences of how to spell names with prefixes (e.g. French 'de' or 'du', Dutch 'van', German 'von', ...). German is not special in this respect. There are not only casing variants, but also whether there is a space or not (e.g. 'du Bois' vs. 'Du Bois' vs. 'duBois' vs. 'DuBois' vs. 'Dubois', not all of them necessarily in use). We kind of know that we cannot deal with the space. 
So half of the distinctions in this area are already lost, and it becomes impossible to completely and faithfully reflect personal spelling differences to the last detail. In that case, it seems better to just go all the way to case-insensitivity. Regards, Martin. From owner-ietf-imaa Mon Feb 10 09:23:07 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1AHN7P09555 for ietf-imaa-bks; Mon, 10 Feb 2003 09:23:07 -0800 (PST) Received: from yxa.extundo.com (178.230.13.217.in-addr.dgcsystems.net [217.13.230.178]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1AHN5d09546; Mon, 10 Feb 2003 09:23:05 -0800 (PST) Received: from latte.josefsson.org (yxa.extundo.com [217.13.230.178]) (authenticated bits=0) by yxa.extundo.com (8.12.7/8.12.7) with ESMTP id h1AHN4NG032108 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=OK); Mon, 10 Feb 2003 18:23:05 +0100 To: Paul Hoffman / IMC Cc: ietf-imaa@imc.org Subject: Re: Case sensitivity on the LHS X-Payment: hashcash 1.1 0:030210:phoffman@imc.org:a1921e04f1a69add X-Hashcash: 0:030210:phoffman@imc.org:a1921e04f1a69add X-Payment: hashcash 1.1 0:030210:ietf-imaa@imc.org:d446b39ab17e9259 X-Hashcash: 0:030210:ietf-imaa@imc.org:d446b39ab17e9259 From: Simon Josefsson Date: Mon, 10 Feb 2003 18:23:03 +0100 In-Reply-To: (Paul Hoffman / IMC's message of "Sun, 9 Feb 2003 11:16:50 -0800") Message-ID: User-Agent: Gnus/5.090015 (Oort Gnus v0.15) Emacs/21.3.50 (i686-pc-linux-gnu) References: X-Face: %bo>yc#X1.-jVa- List-Unsubscribe: List-ID: Paul Hoffman / IMC writes: > This one should be lively. > > RFC 2821 and RFC 2822 make it clear that the left-hand side (LHS) of > email addresses are opaque, which in turn means they are > case-sensitive. The -00 draft of IMAA preserves this. > > Most email users don't know that the LHS is case-sensitive, and > probably guess that it isn't because the RHS (the domain name) is > not. 
Further, there are some mail systems and gateways that do case > conversion on the LHS. > > If we simplify IMAA to make the LHS case-insensitive, it will probably > match the expectations of users better. It would also mean that we > could reuse Nameprep instead of using our own Stringprep > profile. There are other reasons why this might be good listed in the > IMAA document. But to do so would go against the spirit of the > standards on which IMAA rests, namely 2821 and 2822 (and 821 and 822 > before them). > > Purity or modernity? It depends on which definition of "case-insensitive" you use. If you use NFKC you will collapse many distinct names of humans into the same name, which is a failure as far as the LHS is concerned. For example, ß maps to ss. LHSs are often human names, which are free text, and I fear NFKC will damage a significant number of non-Western names. NFKC is appropriate for preparing strings for equality comparisons, but can be too aggressive in other situations. Changing the LHS definition in RFC 282{1,2} should IMHO be done based on technical reasons, and I don't see any technical reason presented above. Arguing that users don't read the technical specification isn't a good motivation for changing the specification; users will never read the technical specification. Applications are responsible for implementing a non-surprising behavior for clients (which I agree treating the LHS as case-insensitive is), and with the current specifications they can, e.g. by searching case-insensitively. Unless a technical case can be made for changing the specification, let's move on. 
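The distinction drawn above between normalization and case folding can be made concrete with a minimal sketch. It assumes Python 3's standard `unicodedata` module and `str.casefold()` as rough stand-ins for the Stringprep tables; the helper names are hypothetical, and this is not Nameprep itself. Under those assumptions, NFKC on its own leaves ß intact, and it is the case-folding step that turns ß into ss, so which names collapse depends on which steps a profile includes.

```python
import unicodedata


def nfkc_only(local_part: str) -> str:
    """Apply NFKC compatibility normalization and nothing else."""
    return unicodedata.normalize("NFKC", local_part)


def fold_and_nfkc(local_part: str) -> str:
    """Case-fold, then NFKC: a rough stand-in for a Nameprep-like mapping.

    Assumption: this approximates, but does not reproduce, the mapping
    tables of a real Stringprep profile.
    """
    return unicodedata.normalize("NFKC", local_part.casefold())


if __name__ == "__main__":
    # "Straße", "Strasse", mixed-case ASCII, and fullwidth "Tedd".
    samples = ["Stra\u00dfe", "Strasse", "TedD", "\uff34\uff45\uff44\uff44"]
    for name in samples:
        print(repr(name), repr(nfkc_only(name)), repr(fold_and_nfkc(name)))
```

With the folding step included, "Straße" and "Strasse" become indistinguishable, which is the kind of collapse of distinct human names described above; without it, only compatibility variants such as fullwidth letters are merged.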
From owner-ietf-imaa Mon Feb 10 11:44:17 2003
Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1AJiHU15917 for ietf-imaa-bks; Mon, 10 Feb 2003 11:44:17 -0800 (PST)
Received: from slarti.muc.de (slarti.muc.de [193.149.48.10]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1AJiEd15909 for ; Mon, 10 Feb 2003 11:44:15 -0800 (PST)
Received: (qmail 1494 invoked by uid 66); 10 Feb 2003 19:44:15 -0000
Received: from faerber.muc.de by slarti.muc.de with BSMTP (rsmtp-qm-ot 0.4) for ietf-imaa@imc.org; 10 Feb 2003 19:44:15 -0000
Received: by faerber.muc.de (OpenXP/32 v3.9.4 (Win32) alpha @ 2003-02-10-2035d); 10 Feb 2003 20:44:09 +0100
Date: 10 Feb 2003 20:43:00 +0100
From: list-ietf-i18n-imaa@faerber.muc.de (=?ISO-8859-1?Q?Claus_F=E4rber?=)
To: ietf-imaa@imc.org
Message-ID: <8f$3A$+JcDD@3247.org>
In-Reply-To:
Subject: Re: Case sensitivity on the LHS
User-Agent: OpenXP/32 v3.9.4 (Win32) alpha @ 2003-02-10-2035d
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Sender: owner-ietf-imaa@mail.imc.org
Precedence: bulk
List-Archive:
List-Unsubscribe:
List-ID:

John Cowan schrieb/wrote:

> The classic minimal pair is Tedd@example.com vs. TedD@example.com.
> But I agree that it's a silly distinction. Mail admins don't *have*
> to let people choose the absolutely-precisely-preferred forms of their
> name as local-parts.

This example shows, however, that *preserving* case can be important. “Tedd” and “TedD” have different meanings.

This, of course, is even more important wrt languages that have a case mapping not identical to that of Unicode. The typical example is “I”↔“i” vs. “İ”↔“i” and “I”↔“ı”. Mapping the address “I.Surname@example.com” to “i.surname@example.com” might be *just* *wrong*.

Claus

(NB: The important non-ASCII characters above are the Capital Latin Letter I with Dot and the small Latin Letter Dotless i.)
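The dotted/dotless i problem described above can be reproduced with any locale-independent case mapping. The following is only a sketch: it uses Python's built-in string methods, which follow Unicode's default mappings, and the helper `fold_address` and the address are made up for illustration. Nothing here is taken from the IMAA draft.

```python
# Illustration only: Python's str methods implement Unicode's default
# (locale-independent) case mappings, not the Turkish ones.

CAPITAL_I_WITH_DOT = "\u0130"   # LATIN CAPITAL LETTER I WITH DOT ABOVE
SMALL_DOTLESS_I = "\u0131"      # LATIN SMALL LETTER DOTLESS I

# Default lowercasing of ASCII "I" gives dotted "i"; a Turkish reader
# would expect the dotless form instead.
assert "I".lower() == "i"

# Dotless i uppercases to plain ASCII "I", so a round trip through
# upper() and lower() silently turns dotless i into dotted i.
assert SMALL_DOTLESS_I.upper() == "I"
assert SMALL_DOTLESS_I.upper().lower() == "i"

# Capital I with dot lowercases to "i" plus COMBINING DOT ABOVE (U+0307).
assert CAPITAL_I_WITH_DOT.lower() == "i\u0307"


def fold_address(addr: str) -> str:
    """Hypothetical helper: case-fold a whole address, ignoring locale."""
    return addr.casefold()


# Made-up address: after folding, the capital I is an ordinary dotted i.
print(fold_address("I.Surname@example.com"))  # i.surname@example.com
```

Whether this loss matters in practice is exactly what the thread goes on to debate; the sketch only shows that a locale-blind mapping cannot keep the Turkish distinction.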
-- 
------------------------ http://www.faerber.muc.de/ ------------------------
OpenPGP: DSS 1024/639680F0 E7A8 AADB 6C8A 2450 67EA AF68 48A5 0E63 6396 80F0

From owner-ietf-imaa Mon Feb 10 11:44:16 2003
Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1AJiGC15916 for ietf-imaa-bks; Mon, 10 Feb 2003 11:44:16 -0800 (PST)
Received: from slarti.muc.de (slarti.muc.de [193.149.48.10]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1AJiEd15908 for ; Mon, 10 Feb 2003 11:44:14 -0800 (PST)
Received: (qmail 1493 invoked by uid 66); 10 Feb 2003 19:44:15 -0000
Received: from faerber.muc.de by slarti.muc.de with BSMTP (rsmtp-qm-ot 0.4) for ietf-imaa@imc.org; 10 Feb 2003 19:44:14 -0000
Received: by faerber.muc.de (OpenXP/32 v3.9.4 (Win32) alpha @ 2003-02-10-2035d); 10 Feb 2003 20:44:09 +0100
Date: 10 Feb 2003 20:42:00 +0100
From: list-ietf-i18n-imaa@faerber.muc.de (=?ISO-8859-1?Q?Claus_F=E4rber?=)
To: ietf-imaa@imc.org
Message-ID: <8f$3A5e3cDD@3247.org>
Subject: Compatibility with IDNA
User-Agent: OpenXP/32 v3.9.4 (Win32) alpha @ 2003-02-10-2035d
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Sender: owner-ietf-imaa@mail.imc.org
Precedence: bulk
List-Archive:
List-Unsubscribe:
List-ID:

Hello,

I'd like to explain some of the design decisions I have made for draft-faerber-i18n-email-netnews-names-00.txt

The basic idea is that IDNA, IMAA and similar internationalised identifiers (e.g. newsgroup names) should be able to use the very same encoding method.

This has several advantages:

. You can put a local-part, a domain, or a complete email address into the same encoding/decoding function and the results are correct. This makes implementations much easier. (Note: This need not be true for email addresses in non-canonical form, i.e. with anything not allowed in RFC 2821.)

. You can have a domain name embedded in a local-part and it is encoded the same way as a domain on the right hand side if it is delimited by one of the delimiters listed above (useful for the so-called percent hack, MIXER, etc.)

. The reverse is also true: You can have an email address converted to a domain name (as seen in SOA DNS records, for example).

The design decisions that have to be made to make this work as expected are these:

. Do an NFKC normalisation at the very beginning (needed for delimiters, I've missed that in my draft).

. Don't encode the local-part as a whole, but use as many delimiters as possible to split it into pieces. (NB: We have to be very intelligent wrt quoted-strings here.)

. Use the mixed-case annotation. (Yes, it has to be formalised then.)

. Use the same ACE prefix as IDNA.

It should be noted that this differs in some important aspects from IDNA:

. IDNA does the normalisation later.

. IDNA only recognises the dot (in four variants, two in NFKC) as a separator.

. IDNA maps everything to lower-case.

. IDNA uses ``UseSTD13ASCIIRules''.

. IDNA has a strong length limit.

But these differences don't have an impact on the output for all valid domain names (or, in the case of the mixed-case annotation, produce an equivalent result).

Claus

From owner-ietf-imaa Mon Feb 10 16:42:05 2003
Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1B0g5P24401 for ietf-imaa-bks; Mon, 10 Feb 2003 16:42:05 -0800 (PST)
Received: from nicemice.net (arwen.CS.Berkeley.EDU [128.32.132.165]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1B0g4d24397 for ; Mon, 10 Feb 2003 16:42:04 -0800 (PST)
Received: from amc by nicemice.net with local (Exim 3.35 #1 (Debian)) id 18iOUy-0004Nd-00 for ; Mon, 10 Feb 2003 16:42:08 -0800
Date: Mon, 10 Feb 2003 11:44:16 +0000
From: "Adam M.
Costello" To: ietf-imaa@imc.org Subject: Re: John Cowan on IMAA draft Message-ID: <20030210114416.GB9872@nicemice.net> Reply-To: IETF IMAA list References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4i Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: John Cowan wrote: > Since local names are under the sole control of a single domain > (even if people nominate their own local names, it's the domain mail > administrator who decides whether they work), I think that having an > ACE prefix is not necessary. An ACE prefix (or suffix, or infix, etc) is necessary so that applications know whether to convert ASCII local-parts to non-ASCII for display. How else will my mail program know how to display the From address of incoming mail? AMC From owner-ietf-imaa Mon Feb 10 16:41:57 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1B0fva24387 for ietf-imaa-bks; Mon, 10 Feb 2003 16:41:57 -0800 (PST) Received: from nicemice.net (arwen.CS.Berkeley.EDU [128.32.132.165]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1B0fsd24381 for ; Mon, 10 Feb 2003 16:41:56 -0800 (PST) Received: from amc by nicemice.net with local (Exim 3.35 #1 (Debian)) id 18iOUn-0004NR-00 for ; Mon, 10 Feb 2003 16:41:57 -0800 Date: Mon, 10 Feb 2003 11:40:16 +0000 From: "Adam M. Costello" To: ietf-imaa@imc.org Subject: Re: Case sensitivity on the LHS Message-ID: <20030210114016.GA9872@nicemice.net> Reply-To: IETF IMAA list References: <4.2.0.58.J.20030209173037.05a45ca0@localhost> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4.2.0.58.J.20030209173037.05a45ca0@localhost> User-Agent: Mutt/1.4i Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Martin Duerst wrote: > - Some systems currently treat ASCII as case-sensitive (don't have > any idea how many). 
But these names would not be encoded, so the > behavior would stay the same. (except if nameprep is applied before > the check for ascii-only is done, which may well be the case). In the imaa-00 draft, Stringprep is applied only to non-ASCII strings, same as in IDNA. ToASCII and ToUnicode are designed that way so that IDNA/IMAA will have no impact on the handling of traditional ASCII labels/local-parts. One gotcha (the only one, I think) is non-ASCII representations of ASCII strings. For example, suppose that foo and FOO are local-parts referring to two distinct mailboxes. Today, you can type foo to send mail to one mailbox, and you can type FOO to send mail to the other mailbox, but if you type FOO using fullwidth characters, the mail will probably go nowhere. But if the sender's user agent is IMAA-aware, it will perform ToASCII on the fullwidth FOO, resulting in either ASCII FOO (if case-folding is not done) or ASCII foo (if case-folding is done). So having case-insensitive internationalized local-parts (which entails case-folding) will cause counter-intuitive results when a user tries to send mail to a case-sensitive ASCII local-part by typing a string containing both uppercase characters and non-ASCII characters. The problem occurs only when all three atypical circumstances coincide (case-sensitive ASCII local parts, users typing uppercase characters in email addresses, and users typing non-ASCII characters to represent an ASCII string), and case-sensitive local-parts are already counter-intuitive anyway, so this pitfall is probably not worth worrying about. > Do we have any idea which systems are more numerous (the only > sample I have at the moment is my own email address, which is > case-insensitive). I have never encountered a case-sensitive local-part. Has anyone here ever encountered a case-sensitive local-part? I don't mean to argue that local-parts are de facto case-insensitive, despite the standards. 
In fact it irks me whenever I see applications convert local-parts to all-caps or all-lowercase in defiance of the standards. The fact is that ASCII local-parts "may be case-sensitive", and so they must be treated as if they are case-sensitive, even if they almost always aren't. But for non-ASCII local-parts, "may be case-sensitive" is not an option, they either must be case-sensitive or must be case-insensitive, because it's the sender that decides, not the recipient. So we're doomed to depart from tradition(*), we just have to pick a direction. AMC (*) Unless we mandate mixed-case annotations, which I don't see happening. From owner-ietf-imaa Mon Feb 10 17:44:10 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1B1iAQ25996 for ietf-imaa-bks; Mon, 10 Feb 2003 17:44:10 -0800 (PST) Received: from nicemice.net (arwen.CS.Berkeley.EDU [128.32.132.165]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1B1i9d25992 for ; Mon, 10 Feb 2003 17:44:09 -0800 (PST) Received: from amc by nicemice.net with local (Exim 3.35 #1 (Debian)) id 18iPT3-0004ct-00 for ; Mon, 10 Feb 2003 17:44:13 -0800 Date: Tue, 11 Feb 2003 01:44:13 +0000 From: "Adam M. Costello" To: ietf-imaa@imc.org Subject: Re: Case sensitivity on the LHS Message-ID: <20030211014413.GD16359@nicemice.net> Reply-To: IETF IMAA list References: <8f$3A$+JcDD@3247.org> <3CD14E451751BD42BA48AAA50B07BAD60337064E@vsvapostal3.prod.netsol.com> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <8f$3A$+JcDD@3247.org> <3CD14E451751BD42BA48AAA50B07BAD60337064E@vsvapostal3.prod.netsol.com> User-Agent: Mutt/1.4i Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: "Hollenbeck, Scott" wrote: > Is this document intended to be a formal update to 2821 and 2822? Is IDNA a formal update to RFCs 1034 and 1035? IMAA would bear a similar relation to RFCs 821, 822, 2821, 2822. 
> Both (2821 section 4.1.2 and 2822 section 3.4.1) contain formal > definitions of the local part of an email address. IMAA does not redefine local part, but instead defines a new term, internationalized local part. Just as IDNA does not redefine domain label, but instead defines a new term, internationalized domain label. tedd wrote: > (lower-case)omega@(lower-case)omega.com > > is better than allowing -- > > (upper-case)omega@(lower-case)omega.com > > -- and trying to explain, implement, and having people understand why > the upper-case omega character (code point) is not allowed on both > sides of the "@". Actually, uppercase OMEGA *is* allowed on both sides of the @, in the sense that a user can type uppercase OMEGA on both sides, and IMAA and IDNA will both accept it. The question is whether mail sent to OMEGA@OMEGA.com will reach the same mailbox as mail sent to omega@omega.com. We know that omega vs. OMEGA makes no difference in the domain part; they are the same domain. The question is whether OMEGA and omega in the local part should refer to the same mailbox, or to two different mailboxes. For ASCII local parts, the answer is unknown, and can vary from one mail server to another. Applications preserve the case, and the destination mail server decides whether to perform a case-sensitive comparison or a case-insensitive comparison against its database of mailbox names. But for non-ASCII local parts, we can't defer the decision to the destination mail server, because the applications need to know whether to perform case-folding or not. So we need to pick one rule, case-sensitive or case-insensitive, for everyone. [Unless we mandate the use of mixed-case annotation. But that would mean mandating a significant amount of additional complexity in all IMAA applications, where the only benefit from the additional complexity is the ability of each mail server to decide whether its local parts are case-sensitive or case-insensitive. 
But in practice, virtually no one is interested in creating case-sensitive local parts. So mandating mixed-case annotations would have a significant cost and no significant practical benefit. The IDNA model appears to be a better tradeoff: mandate that non-ASCII local parts are always case-insensitive, and leave mixed-case annotations as an *optional* technique for preserving case. Applications that care about preserving case can choose to spend the extra implementation effort.]

Claus Färber wrote:

> This example shows, however, that *preserving* case can be important.
> This, of course, is even more important wrt languages that have a
> case mapping not identical to that of Unicode. The typical example is
> [Turkish i]

Is this a "typical" example, or the *only* example? Turkish i is the only locale-dependent aspect of the Unicode case-folding operation, according to the Unicode case-folding table. I suppose it's possible that the table overlooks some other locale-dependent things that ought to be in there, but can anyone name any examples besides Turkish i?

Simon Josefsson wrote:

> If you use NFKC you will collapse many distinct names of humans into
> the same name, which is a failure as far as LHS is concerned. C.f. ß
> maps to ss.

NFKC does not map German sharp s to ss. Neither does NFC. It is the Unicode case-folding operation that maps German sharp s to ss. It has nothing to do with NFKC.

You claim that "many distinct names" of humans will get collapsed by NFKC into the same name. So far you have provided zero examples. Could you supply a few more please?

> NFKC is appropriate for preparing strings for equality comparisons,

Which is exactly why we use it. IDNA and IMAA are designed to allow the equality comparisons to be performed by legacy software that doesn't know about Stringprep. That means the preparation has to be done by the IDNA/IMAA-aware applications before the strings are inserted into old protocols and transferred to old servers.
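The distinction between normalization and case folding drawn above is easy to check. The sketch below uses Python's standard `unicodedata` module; the `prepare` function is a hypothetical, heavily simplified stand-in for a Stringprep-style preparation step (the real Nameprep profile also has its own mapping table, prohibited characters, and bidi checks, none of which appear here).

```python
import unicodedata

SHARP_S = "\u00df"                    # LATIN SMALL LETTER SHARP S
FULLWIDTH_FOO = "\uff26\uff2f\uff2f"  # fullwidth F, O, O


def nfkc(s: str) -> str:
    """Apply Unicode Normalization Form KC."""
    return unicodedata.normalize("NFKC", s)


# Normalization alone leaves sharp s untouched, under both NFC and NFKC.
assert unicodedata.normalize("NFC", SHARP_S) == SHARP_S
assert nfkc(SHARP_S) == SHARP_S

# It is case folding that maps sharp s to "ss".
assert SHARP_S.casefold() == "ss"

# What NFKC does do is collapse compatibility variants, for example
# fullwidth letters onto their ASCII counterparts.
assert nfkc(FULLWIDTH_FOO) == "FOO"


def prepare(s: str) -> str:
    """Hypothetical preparation step: normalize, fold case, normalize again.

    An assumption-laden simplification, not the Nameprep profile.
    """
    return nfkc(nfkc(s).casefold())


# With folding included, these spellings compare equal after preparation,
# so a legacy server can match them with a plain byte comparison.
assert prepare("Stra\u00dfe") == prepare("STRASSE") == "strasse"
```

The design point is that all of the matching logic runs on the sending side, before the string reaches software that knows nothing about Unicode.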
> Changing the LHS definition in RFC 282{1,2} should IMHO be done based > on technical reasons, and I don't see any technical reason presented > above. Arguing that users doesn't read the technical specification > isn't a good motivation for changing the specification; users will > never read the technical specification. Applications are responsible > for implementing a non-surprising behavior for clients (which I > agree treating LHS as case-insensitive is), and with the current > specifications they can, e.g. by searching case insensitively. Here is the technical argument: The day that IMAA is adopted, I should be able to create an ACE username on yahoo.com, and people should be able to send mail to that account by typing the corresponding non-ASCII username into their IMAA-aware mail user agent. The mail should reach me even if yahoo.com is completely IMAA-unaware. Now suppose the sender types that non-ASCII username slightly differently from the way I typed it when I created it. Will that mail reach me? If we do case-folding in ToASCII, then yes it will. If we don't do case-folding in ToASCII, then the mail will either bounce, or worse yet, it will go to some other user. While this pitfall (mail going to the wrong user because the sender typed the wrong case) has always been theoretically possible with ASCII local parts, it never happens in practice, because in practice mail servers recognize local parts case-insensitively. But if we omit case-folding from the IMAA ToASCII, then this pitfall will become very real, because the existing mail servers won't know how to do case-insensitive comparisons of ACE local parts. If we include case-folding in IMAA ToASCII, then all non-ASCII local parts are automatically case-insensitive, even on legacy mail servers. 
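The argument above can be sketched in a few lines. This is an illustration under stated assumptions, not the ToASCII operation from the draft: `to_ascii_sketch` is a hypothetical function, `zz--` is a placeholder prefix invented for this example, the preparation step is reduced to NFKC plus Unicode case folding, and Python's built-in `punycode` codec stands in for the ACE encoding.

```python
import unicodedata

ACE_PREFIX = "zz--"  # placeholder only; not the prefix of IDNA or any draft


def to_ascii_sketch(local_part: str, fold_case: bool) -> str:
    """Rough stand-in for a ToASCII-style operation on one local part."""
    if local_part.isascii():
        # ASCII local parts pass through untouched, so legacy behaviour
        # (including any case sensitivity) is unchanged.
        return local_part
    prepared = unicodedata.normalize("NFKC", local_part)
    if fold_case:
        prepared = unicodedata.normalize("NFKC", prepared.casefold())
    if prepared.isascii():
        # Preparation reduced the input to plain ASCII (e.g. fullwidth
        # letters), so no ACE encoding is needed.
        return prepared
    return ACE_PREFIX + prepared.encode("punycode").decode("ascii")


created = "\u03c9mega"  # account created with a lowercase omega
typed = "\u03a9mega"    # sender types an uppercase omega

# With folding, both spellings yield the same ASCII string, so even an
# unaware server that compares bytes delivers to the same mailbox.
assert to_ascii_sketch(created, True) == to_ascii_sketch(typed, True)

# Without folding, the two ASCII forms differ, and a case-insensitive
# ASCII comparison on the server does not rescue them.
assert (to_ascii_sketch(created, False).lower()
        != to_ascii_sketch(typed, False).lower())
```

Note that an all-ASCII local part such as `FOO` is returned unchanged in either mode, which matches the point made earlier in the thread that traditional local parts are not affected.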
AMC From owner-ietf-imaa Mon Feb 10 18:07:30 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1B27UD26439 for ietf-imaa-bks; Mon, 10 Feb 2003 18:07:30 -0800 (PST) Received: from mercury.ccil.org (mail@[192.190.237.100]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1B27Td26435 for ; Mon, 10 Feb 2003 18:07:29 -0800 (PST) Received: from cowan by mercury.ccil.org with local (Exim 3.35 #1 (Debian)) id 18iPpb-0006XA-00 for ; Mon, 10 Feb 2003 21:07:31 -0500 Subject: Re: John Cowan on IMAA draft In-Reply-To: <20030210114416.GB9872@nicemice.net> from "Adam M. Costello" at "Feb 10, 2003 11:44:16 am" To: IETF IMAA list Date: Mon, 10 Feb 2003 21:07:31 -0500 (EST) X-Mailer: ELM [version 2.4ME+ PL66 (25)] MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Message-Id: From: John Cowan Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Adam M. Costello scripsit: > An ACE prefix (or suffix, or infix, etc) is necessary so that > applications know whether to convert ASCII local-parts to non-ASCII for > display. How else will my mail program know how to display the From > address of incoming mail? Hmm. You're right. -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org To say that Bilbo's breath was taken away is no description at all. There are no words left to express his staggerment, since Men changed the language that they learned of elves in the days when all the world was wonderful. 
--_The Hobbit_ From owner-ietf-imaa Mon Feb 10 18:18:08 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1B2I8j26859 for ietf-imaa-bks; Mon, 10 Feb 2003 18:18:08 -0800 (PST) Received: from mercury.ccil.org (mail@[192.190.237.100]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1B2I7d26854 for ; Mon, 10 Feb 2003 18:18:07 -0800 (PST) Received: from cowan by mercury.ccil.org with local (Exim 3.35 #1 (Debian)) id 18iPzv-0006nN-00 for ; Mon, 10 Feb 2003 21:18:11 -0500 Subject: Re: Case sensitivity on the LHS In-Reply-To: <20030211014413.GD16359@nicemice.net> from "Adam M. Costello" at "Feb 11, 2003 01:44:13 am" To: IETF IMAA list Date: Mon, 10 Feb 2003 21:18:11 -0500 (EST) X-Mailer: ELM [version 2.4ME+ PL66 (25)] MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Message-Id: From: John Cowan Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Adam M. Costello scripsit: > Is this a "typical" example, or the *only* example? Turkish i is the > only locale-dependent aspect of the Unicode case-folding operation, > according to the Unicode case-folding table. I suppose it's possible > that the table overlooks some other locale-dependent things that ought > to be in there, but can anyone name any examples besides Turkish i? Lithuanian "I" with an accent above lowercases to "I" + DOT ABOVE + the main accent, because (unlike all other "i"s with accents) the i keeps its dot. For Unicode case-folding purposes, this discrepancy is ignored. -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org To say that Bilbo's breath was taken away is no description at all. There are no words left to express his staggerment, since Men changed the language that they learned of elves in the days when all the world was wonderful. 
--_The Hobbit_ From owner-ietf-imaa Mon Feb 10 18:33:58 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1B2XwF27243 for ietf-imaa-bks; Mon, 10 Feb 2003 18:33:58 -0800 (PST) Received: from nicemice.net (arwen.CS.Berkeley.EDU [128.32.132.165]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1B2Xwd27239 for ; Mon, 10 Feb 2003 18:33:58 -0800 (PST) Received: from amc by nicemice.net with local (Exim 3.35 #1 (Debian)) id 18iQFF-0004lZ-00 for ; Mon, 10 Feb 2003 18:34:01 -0800 Date: Tue, 11 Feb 2003 02:34:01 +0000 From: "Adam M. Costello" To: ietf-imaa@imc.org Subject: Re: Compatibility with IDNA Message-ID: <20030211023401.GE16359@nicemice.net> Reply-To: IETF IMAA list References: <8f$3A5e3cDD@3247.org> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <8f$3A5e3cDD@3247.org> User-Agent: Mutt/1.4i Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Claus Färber wrote: > The basic idea is that IDNA, IMAA and similar internationalised > identifiers (e.g. newsgroup names) should be able to use the very same > encoding method. It would certainly be nice, if it's possible, but it's not obvious whether it's possible. > This has several advantages: > > . You can put a local-part, a domain, or a complete email address > into the same encoding/decoding function and the results are > correct. Currently, we don't even have a single encoding/decoding function for domain names. We have functions for domain labels. IDNA does not define functions for whole domain names because the delimiting and quoting conventions vary. For example, DNS master files use dots as delimiters, but DNS protocol messages don't. DNS master files use backslashes to quote dots that are not delimiters, but in DNS protocol messages there is nothing special about dots or backslashes. 
The message header and SMTP protocols add still more delimiters and quoting mechanisms. Various config file formats have their own ad-hoc quoting mechanisms, different from the ones used in message headers and DNS master files.

It seems that the only hope is to require the application to parse the larger items (domain names, mail addresses) into their constituent parts (labels, local parts) using whatever delimiters and quoting mechanisms are appropriate in its context, and then standardize the encoding/decoding of the individual parts.

In IDNA, each label is encoded as a unit, never subdivided, even if it contains dots or other ASCII punctuation. It's far too late to consider altering that fundamental architecture of IDNA. Maybe, if we had co-designed internationalized domain names, mail addresses, and newsgroup names all at the same time, we would have done things differently, but IDNA is done and approved and due to be deployed any day now. At this point, the most we could try for is to use the exact same encoding for local-parts (or subparts) as is used for domain labels.

> . You can have a domain name embedded in a local-part and it is
> encoded the same way as a domain on the right hand side if it is
> delimited by one of the delimiters listed above (useful for the
> so-called percent hack, MIXER, etc.)

We might be able to do that if we subdivide local parts.

> . The reverse is also true: You can have an email address converted to
> a domain name (as seen in SOA DNS records, for example).

We might be able to do that if we *don't* subdivide local parts. If we subdivide local parts, then foo.bar@example.org would be encoded differently depending on whether it was ACE-ified and then domain-ified, or domain-ified and then ACE-ified.

> . Use the mixed-case annotation. (Yes, it has to be formalised then.)

I support mixed-case annotation as an option, but not as a requirement.
The meager incremental benefit of requiring it versus merely allowing it does not appear to be worth the additional required complexity. > . Use the same ACE prefix as IDNA. We should use the same ACE prefix if and only if the ToASCII and ToUnicode operations are identical. If two different sets of ToASCII/ToUnicode operations were to use the same prefix, that would invite errors where a string gets encoded by one ToASCII and decoded by the wrong ToUnicode, which would probably cause the original string to get converted into a non-equivalent string. AMC From owner-ietf-imaa Mon Feb 10 18:57:11 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1B2vBj27732 for ietf-imaa-bks; Mon, 10 Feb 2003 18:57:11 -0800 (PST) Received: from nicemice.net (arwen.CS.Berkeley.EDU [128.32.132.165]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1B2vAd27728 for ; Mon, 10 Feb 2003 18:57:10 -0800 (PST) Received: from amc by nicemice.net with local (Exim 3.35 #1 (Debian)) id 18iQbi-0004pZ-00 for ; Mon, 10 Feb 2003 18:57:14 -0800 Date: Tue, 11 Feb 2003 02:57:14 +0000 From: "Adam M. Costello" To: IETF IMAA list Subject: Re: Case sensitivity on the LHS Message-ID: <20030211025714.GF16359@nicemice.net> Reply-To: IETF IMAA list References: <20030211014413.GD16359@nicemice.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4i Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: John Cowan wrote: > > Turkish i is the only locale-dependent aspect of the Unicode > > case-folding operation, according to the Unicode case-folding > > table. I suppose it's possible that the table overlooks some other > > locale-dependent things that ought to be in there, but can anyone > > name any examples besides Turkish i? 
> > Lithuanian "I" with an accent above lowercases to "I" + DOT ABOVE + > the main accent, because (unlike all other "i"s with accents) the i > keeps its dot. For Unicode case-folding purposes, this discrepancy is > ignored. So if we consider case-mapping, rather than case-folding, then there are three affected languages rather than just two, but it's still all about the dot on the letter i. It still appears to me that this is an isolated anomaly, not a typical example from a large class of locale-dependent upper/lower case issues. Some people have raised a concern that case-folding causes damage (loss of important information). It does cause very slight damage in rare circumstances, but I think not doing it would cause annoyance (mail bouncing or going to the wrong user) quite often. AMC From owner-ietf-imaa Mon Feb 10 20:04:51 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1B44pE29233 for ietf-imaa-bks; Mon, 10 Feb 2003 20:04:51 -0800 (PST) Received: from stoneport.math.uic.edu (stoneport.math.uic.edu [131.193.178.160]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1B44od29229 for ; Mon, 10 Feb 2003 20:04:50 -0800 (PST) Received: (qmail 99896 invoked by uid 1016); 11 Feb 2003 04:05:19 -0000 Date: 11 Feb 2003 04:05:19 -0000 Message-ID: <20030211040519.99895.qmail@cr.yp.to> Automatic-Legal-Notices: See http://cr.yp.to/mailcopyright.html. From: "D. J. Bernstein" To: ietf-imaa@imc.org Subject: Sound mapping References: <8f$3A$+JcDD@3247.org> <3CD14E451751BD42BA48AAA50B07BAD60337064E@vsvapostal3.prod.netsol.com> <20030211014413.GD16359@nicemice.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Adam M. Costello writes: > Now suppose the sender types that non-ASCII username slightly > differently from the way I typed it when I created it. Will that mail > reach me? 
If we do case-folding in ToASCII, then yes it will. False. Case folding fixes only some of the errors caused by human retyping. For example, studies have shown that SOUNDEX folding has a substantially higher correction rate. (Go read the literature.) Now that you're aware of this fact, are you going to demand that all similar-sounding names be mapped together? Why do you support case folding and not SOUNDEX folding? Of course, we're just talking about English. Errors are far more difficult to characterize in a world of many languages. As a practical matter, you're going to have to stop expecting the computer to fix your spelling mistakes. This is why I support ISO 14755. ---D. J. Bernstein, Associate Professor, Department of Mathematics, Statistics, and Computer Science, University of Illinois at Chicago From owner-ietf-imaa Mon Feb 10 21:17:26 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1B5HQZ00888 for ietf-imaa-bks; Mon, 10 Feb 2003 21:17:26 -0800 (PST) Received: from nicemice.net (arwen.CS.Berkeley.EDU [128.32.132.165]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1B5HPd00884 for ; Mon, 10 Feb 2003 21:17:25 -0800 (PST) Received: from amc by nicemice.net with local (Exim 3.35 #1 (Debian)) id 18iSnR-00056t-00 for ; Mon, 10 Feb 2003 21:17:29 -0800 Date: Tue, 11 Feb 2003 05:17:29 +0000 From: "Adam M. 
Costello" To: ietf-imaa@imc.org Subject: Re: Sound mapping Message-ID: <20030211051729.GG16359@nicemice.net> Reply-To: IETF IMAA list References: <8f$3A$+JcDD@3247.org> <3CD14E451751BD42BA48AAA50B07BAD60337064E@vsvapostal3.prod.netsol.com> <20030211014413.GD16359@nicemice.net> <20030211040519.99895.qmail@cr.yp.to> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20030211040519.99895.qmail@cr.yp.to> User-Agent: Mutt/1.4i Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: I wrote: > Now suppose the sender types that non-ASCII username slightly > differently from the way I typed it when I created it. Will that mail > reach me? If we do case-folding in ToASCII, then yes it will. "D. J. Bernstein" wrote: > False. Case folding fixes only some of the errors caused by human > retyping. For example, studies have shown that SOUNDEX folding has a > substantially higher correction rate. Right. There was a portion of my sentence that was in my head but didn't make it to my fingers. What I meant to write was: Now suppose the sender types that non-ASCII username slightly differently from the way I typed it when I created it, using an uppercase letter where I used a lowercase letter (or vice versa). I noticed the omission when I received the message back from the list server, but I thought the intention was clear enough from context. > Now that you're aware of this fact, are you going to demand that all > similar-sounding names be mapped together? No. > Why do you support case folding and not SOUNDEX folding? Because users are already accustomed to not bothering to remember and type the proper case of letters in mail addresses, because in practice it doesn't matter. By doing case-folding we can avoid surprising them. Users are not accustomed to mail addresses being sound-insensitive, so there's no point in us working harder to meet nonexistent expectations. 
AMC From owner-ietf-imaa Mon Feb 10 22:21:53 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1B6Lrk02368 for ietf-imaa-bks; Mon, 10 Feb 2003 22:21:53 -0800 (PST) Received: from leonis.nus.edu.sg (leonis.nus.edu.sg [137.132.1.18]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1B6Lod02363 for ; Mon, 10 Feb 2003 22:21:51 -0800 (PST) Received: from camus (8-55.priv.nus.edu.sg [172.18.8.55]) by leonis.nus.edu.sg (8.12.1/8.12.1) with SMTP id h1B6NBwE001576; Tue, 11 Feb 2003 14:23:13 +0800 (SGT) Message-ID: <001c01c2d195$b1370320$f57812ac@camus> From: "Maynard Kang" To: "D. J. Bernstein" , References: <8f$3A$+JcDD@3247.org> <3CD14E451751BD42BA48AAA50B07BAD60337064E@vsvapostal3.prod.netsol.com> <20030211014413.GD16359@nicemice.net> <20030211040519.99895.qmail@cr.yp.to> Subject: Re: Sound mapping Date: Tue, 11 Feb 2003 14:20:36 +0800 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2800.1106 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1106 Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: > Of course, we're just talking about English. Errors are far more > difficult to characterize in a world of many languages. As a practical > matter, you're going to have to stop expecting the computer to fix your > spelling mistakes. This is why I support ISO 14755. > IMHO, I think case folding is perfectly reasonable in a world where only upper-case characters are shown on keyboards (not just US keyboards, but also other keyboards that present Latin characters in addition to local characters) as it is a simplistic algorithm that can be implemented easily. To require compulsory 14755 input for e-mail addresses is plain ridiculous, if you ask me. It's like telling consumers that they need to know how to build a TV before they can watch it. 
I certainly do not want to have to know the internal code points of my e-mail address before I can enter it. regards, maynard From owner-ietf-imaa Tue Feb 11 04:31:10 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1BCVAK15393 for ietf-imaa-bks; Tue, 11 Feb 2003 04:31:10 -0800 (PST) Received: from crow.verisign.com (crow.verisign.com [216.168.237.103]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1BCV9d15388 for ; Tue, 11 Feb 2003 04:31:09 -0800 (PST) Received: from vsvapostalgw3.prod.netsol.com (vsvapostalgw3.prod.netsol.com [10.170.12.61]) by crow.verisign.com (nsi_0.1/8.9.1) with ESMTP id HAA11638 for ; Tue, 11 Feb 2003 07:31:03 -0500 (EST) Received: by vsvapostalgw3.prod.netsol.com with Internet Mail Service (5.5.2653.19) id <1V7STYV0>; Tue, 11 Feb 2003 07:29:02 -0500 Message-ID: <3CD14E451751BD42BA48AAA50B07BAD603370662@vsvapostal3.prod.netsol.com> From: "Hollenbeck, Scott" To: "'IETF IMAA list'" Subject: RE: Case sensitivity on the LHS Date: Tue, 11 Feb 2003 07:27:04 -0500 MIME-Version: 1.0 X-Mailer: Internet Mail Service (5.5.2653.19) Content-Type: text/plain; charset="iso-8859-1" Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: > "Hollenbeck, Scott" wrote: > > > Is this document intended to be a formal update to 2821 and 2822? > > Is IDNA a formal update to RFCs 1034 and 1035? IMAA would bear a > similar relation to RFCs 821, 822, 2821, 2822. Maybe yours is a rhetorical question since you probably know the answer (I don't because IDNA hasn't yet been published as an RFC), but here's why I asked mine: someone implementing 1034/1035 or 2821/2822 (which, by the way, obsolete 821 and 822) might not necessarily know of the new features defined by IDNA and IMAA since there are no "update" references maintained by the RFC Editor. 
I suspect that it would probably be a good idea to have IDNA update 1034/1035, and likewise it's probably a good idea to have IMAA update 2821/2822 so that implementers see a clear relationship between the specifications. > > Both (2821 section 4.1.2 and 2822 section 3.4.1) contain formal > > definitions of the local part of an email address. > > IMAA does not redefine local part, but instead defines a new term, > internationalized local part. Just as IDNA does not redefine domain > label, but instead defines a new term, internationalized domain label. These new terms won't necessarily be known to implementers of the earlier specifications that IDNA and IMAA build upon. If new features with new processing rules are being defined, why not give RFC readers a clear pointer to the specifications that describe the new features? Asking a different way: would these new features (or something similar) have been included in 1034/1035 or 2821/2822 if internationalization was considered when the earlier specifications were being written? While we'll never know for sure, I'm suggesting that the possibility of a "yes" answer implies that IDNA and IMAA _should_ update the earlier specifications. 
-Scott- From owner-ietf-imaa Tue Feb 11 05:17:26 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1BDHQt18383 for ietf-imaa-bks; Tue, 11 Feb 2003 05:17:26 -0800 (PST) Received: from yxa.extundo.com (178.230.13.217.in-addr.dgcsystems.net [217.13.230.178]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1BDHOd18375 for ; Tue, 11 Feb 2003 05:17:24 -0800 (PST) Received: from latte.josefsson.org (yxa.extundo.com [217.13.230.178]) (authenticated bits=0) by yxa.extundo.com (8.12.7/8.12.7) with ESMTP id h1BDHMNG023285 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=OK) for ; Tue, 11 Feb 2003 14:17:23 +0100 To: IETF IMAA list Subject: Re: Case sensitivity on the LHS X-Payment: hashcash 1.1 0:030211:ietf-imaa@imc.org:5019fd0edc5cc9c4 X-Hashcash: 0:030211:ietf-imaa@imc.org:5019fd0edc5cc9c4 From: Simon Josefsson Date: Tue, 11 Feb 2003 14:17:22 +0100 In-Reply-To: <20030211014413.GD16359@nicemice.net> ("Adam M. Costello"'s message of "Tue, 11 Feb 2003 01:44:13 +0000") Message-ID: User-Agent: Gnus/5.090016 (Oort Gnus v0.16) Emacs/21.2 References: <8f$3A$+JcDD@3247.org> <3CD14E451751BD42BA48AAA50B07BAD60337064E@vsvapostal3.prod.netsol.com> <20030211014413.GD16359@nicemice.net> X-Face: %bo>yc#X1.-jVa- List-Unsubscribe: List-ID: "Adam M. Costello" writes: > "Hollenbeck, Scott" wrote: > >> Is this document intended to be a formal update to 2821 and 2822? > > Is IDNA a formal update to RFCs 1034 and 1035? IDNA does not modify RFC 1034/1035 behaviour, so no. > IMAA would bear a similar relation to RFCs 821, 822, 2821, 2822. Not if it makes changes to those RFCs, which seems to be (part of) what this discussion is about. >> If you use NFKC you will collapse many distinct names of humans into >> the same name, which is a failure as far as LHS is concerned. C.f. ß >> maps to ss. > > NFKC does not map German sharp s to ss. Neither does NFC. It is the > Unicode case-folding operation that maps German sharp s to ss. 
It has > nothing to do with NFKC. You are right, sorry. Still, it is part of the nameprep steps, so the result is the same. > You claim that "many distinct names" of humans will get collapsed by > NFKC into the same name. So far you have provided zero examples. Could > you supply a few more please? If you want to modify existing standards, I think you need to prove that it doesn't break things, not the other way around. >> Changing the LHS definition in RFC 282{1,2} should IMHO be done based >> on technical reasons, and I don't see any technical reason presented >> above. Arguing that users doesn't read the technical specification >> isn't a good motivation for changing the specification; users will >> never read the technical specification. Applications are responsible >> for implementing a non-surprising behavior for clients (which I >> agree treating LHS as case-insensitive is), and with the current >> specifications they can, e.g. by searching case insensitively. > > Here is the technical argument: The day that IMAA is adopted, I should > be able to create an ACE username on yahoo.com, and people should be > able to send mail to that account by typing the corresponding non-ASCII > username into their IMAA-aware mail user agent. The mail should reach > me even if yahoo.com is completely IMAA-unaware. You don't need to change the definition of LHS for this. > Now suppose the sender types that non-ASCII username slightly > differently from the way I typed it when I created it. Will that mail > reach me? If we do case-folding in ToASCII, then yes it will. If we > don't do case-folding in ToASCII, then the mail will either bounce, or > worse yet, it will go to some other user. The same is true today. > While this pitfall (mail going to the wrong user because the sender > typed the wrong case) has always been theoretically possible with > ASCII local parts, it never happens in practice, because in practice > mail servers recognize local parts case-insensitively. 
But if we > omit case-folding from the IMAA ToASCII, then this pitfall will > become very real, because the existing mail servers won't know how > to do case-insensitive comparisons of ACE local parts. If we > include case-folding in IMAA ToASCII, then all non-ASCII local parts > are automatically case-insensitive, even on legacy mail servers. You are saying that the LHS works case-insensitively today because mail servers already treat it case-insensitively, yet you think they will not know how to do case-insensitive ACE mappings if IMAA is introduced? If server administrators want case-insensitive behaviour, they can configure their software that way. You claim this works today; I claim it will work tomorrow, with or without IMAA. If server administrators don't want case-insensitive behaviour, which is fine by today's specifications, then I don't see why they should suffer any pain only because other people want a different behaviour from their software but are incapable of configuring it that way. It seems to me that the question of case-sensitive LHS can continue to be decided by server administrators, not specification writers.
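A minimal sketch of the ACE mechanics being argued about above, assuming bare Punycode (Python's built-in "punycode" codec) as a stand-in for IMAA's ToASCII; the real ToASCII steps, prefix, and delimiter handling are not spelled out in this thread. The function names and the local part "ünal" are made up for illustration. The point is only that two spellings differing in the case of a non-ASCII letter encode to ACE strings that differ by more than ASCII case, so an ASCII-only case-insensitive comparison cannot match them unless case is folded before encoding.

```python
def to_ace(local_part: str, fold: bool) -> str:
    """Hypothetical stand-in for IMAA ToASCII: optional case folding, then Punycode."""
    if fold:
        local_part = local_part.casefold()
    return local_part.encode("punycode").decode("ascii")


def legacy_server_match(a: str, b: str) -> bool:
    """What an IMAA-unaware server can do: compare ASCII strings, ignoring ASCII case."""
    return a.lower() == b.lower()


registered = "\u00fcnal"  # "ünal", as typed when the account was created
typed = "\u00dcnal"       # "Ünal", as typed later by a sender

# Without folding, the two ACE forms encode different code points.
unfolded_match = legacy_server_match(
    to_ace(registered, fold=False), to_ace(typed, fold=False)
)
# With folding before encoding, both spellings yield the same ACE form.
folded_match = legacy_server_match(
    to_ace(registered, fold=True), to_ace(typed, fold=True)
)

print("match without folding:", unfolded_match)
print("match with folding:   ", folded_match)
```

This shows the mechanics only. Whether the folding belongs in the specification or in each server's configuration is exactly what is under debate here.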
From owner-ietf-imaa Tue Feb 11 05:59:08 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1BDx8t20276 for ietf-imaa-bks; Tue, 11 Feb 2003 05:59:08 -0800 (PST) Received: from mailgen2.internet.gouv.qc.ca (courrier4.internet.gouv.qc.ca [192.197.162.9] (may be forged)) by above.proper.com (8.11.6/8.11.3) with SMTP id h1BDx7d20271 for ; Tue, 11 Feb 2003 05:59:07 -0800 (PST) Received: (qmail 8649 invoked from network); 11 Feb 2003 13:58:55 -0000 Received: from unknown (HELO p295.sct1.gouv.qc.ca) (142.213.85.47) by mailgen2.internet.gouv.qc.ca with SMTP; 11 Feb 2003 13:58:55 -0000 Message-Id: <5.0.2.1.2.20030211083406.00a95b50@entree.sct1.gouv.qc.ca> X-Sender: alabonte@entree.sct1.gouv.qc.ca X-Mailer: QUALCOMM Windows Eudora Version 5.0.2 Date: Tue, 11 Feb 2003 08:58:57 -0500 To: Simon Josefsson , IETF IMAA list From: =?iso-8859-1?Q?Alain_LaBont=E9?= Subject: Re: Case sensitivity on the LHS Cc: alb@iquebec.com In-Reply-To: References: <20030211014413.GD16359@nicemice.net> <8f$3A$+JcDD@3247.org> <3CD14E451751BD42BA48AAA50B07BAD60337064E@vsvapostal3.prod.netsol.com> <20030211014413.GD16359@nicemice.net> Mime-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1"; format=flowed Content-Transfer-Encoding: 8bit Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: A 14:17 2003-02-11 +0100, Simon Josefsson a écrit : [non-quoted correspondent] > > You claim that "many distinct names" of humans will get collapsed by > > NFKC into the same name. So far you have provided zero examples. Could > > you supply a few more please? > >[Simon] If you want to modify existing standards, I think you need to prove >that it doesn't break things, not the other way around. [Alain] Could somebody summarize what is the actual behaviour of NFKC for me? I'm not sure what kind of mapping is done by NFKC... but I suspect the kind of problems that may ocur. 
For example, if accents are removed from Latin letters once entered (which in most cases would be convenient for those who can't enter accented characters -- because I would like them to reach me if my email address were to be « Alain.LaBonté@iquébec.com »), there might indeed be collapses, and I will give an actual example, using a look-alike family-name cluster from the city of Québec's telephone book: There are actual real-life collisions for example, in these cases: Cote B Côte B Coté B Côté B This was already one of my favorites for dictionary ordering (the 4 family names, are 4 disting French words, meaning respectively "quote", "hill side", "quoted" and "side"). However this case of collapse is imho no different from cases of names collapsing just because they are simply identical. One has, with the same ISP, to find extra ways to distinguish them. Imho an email address "B.Cote@..." should be able to reach "B.Côté@..." Alain LaBonté Québec From owner-ietf-imaa Tue Feb 11 06:27:21 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1BERLk24356 for ietf-imaa-bks; Tue, 11 Feb 2003 06:27:21 -0800 (PST) Received: from smtp.denic.de (smtp.denic.de [194.246.96.22]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1BERJd24348 for ; Tue, 11 Feb 2003 06:27:19 -0800 (PST) Received: from notes.denic.de (denics15.denic.de [194.246.96.18]) by smtp.denic.de with esmtp id 18ibNT-0007ib-00; Tue, 11 Feb 2003 15:27:15 +0100 Subject: Re: Re: Case sensitivity on the LHS To: ietf-imaa@imc.org X-Mailer: Lotus Notes Release 5.0.6a January 17, 2001 Message-ID: From: "Marcos Sanz/Denic" Date: Tue, 11 Feb 2003 15:28:19 +0100 X-MIMETrack: Serialize by Router on notes/Denic(Release 5.0.11 |July 24, 2002) at 11.02.2003 15:27:14 MIME-Version: 1.0 Content-type: text/plain; charset=us-ascii Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: On 10.02.2003 15:48 Martin Duerst wrote: > [...] 
> But even then, being able to use the same > nameprep/stringprep for both sides of the '@' is a clear win. As far as I remember, Nameprep states to be a profile for preparing domain names and it should not be used for any other purpose. Wouldn't it be a better idea to make anyway a new profile for IMAA? My 2p. Regards, Marcos Sanz From owner-ietf-imaa Tue Feb 11 06:58:33 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1BEwXE26704 for ietf-imaa-bks; Tue, 11 Feb 2003 06:58:33 -0800 (PST) Received: from yxa.extundo.com (178.230.13.217.in-addr.dgcsystems.net [217.13.230.178]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1BEwUd26700 for ; Tue, 11 Feb 2003 06:58:31 -0800 (PST) Received: from latte.josefsson.org (yxa.extundo.com [217.13.230.178]) (authenticated bits=0) by yxa.extundo.com (8.12.7/8.12.7) with ESMTP id h1BEwQNG026578 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=OK); Tue, 11 Feb 2003 15:58:28 +0100 To: Alain =?iso-8859-1?q?LaBont=E9?= Cc: IETF IMAA list , alb@iquebec.com Subject: Re: Case sensitivity on the LHS X-Payment: hashcash 1.1 0:030211:alb@sct1.gouv.qc.ca:078f1ff838583efb X-Hashcash: 0:030211:alb@sct1.gouv.qc.ca:078f1ff838583efb X-Payment: hashcash 1.1 0:030211:ietf-imaa@imc.org:114c7a167863957a X-Hashcash: 0:030211:ietf-imaa@imc.org:114c7a167863957a X-Payment: hashcash 1.1 0:030211:alb@iquebec.com:c64a1ad7a8b64f8f X-Hashcash: 0:030211:alb@iquebec.com:c64a1ad7a8b64f8f From: Simon Josefsson Date: Tue, 11 Feb 2003 15:58:25 +0100 In-Reply-To: <5.0.2.1.2.20030211083406.00a95b50@entree.sct1.gouv.qc.ca> (Alain =?iso-8859-1?q?LaBont=E9's?= message of "Tue, 11 Feb 2003 08:58:57 -0500") Message-ID: User-Agent: Gnus/5.090016 (Oort Gnus v0.16) Emacs/21.2 References: <20030211014413.GD16359@nicemice.net> <8f$3A$+JcDD@3247.org> <3CD14E451751BD42BA48AAA50B07BAD60337064E@vsvapostal3.prod.netsol.com> <20030211014413.GD16359@nicemice.net> <5.0.2.1.2.20030211083406.00a95b50@entree.sct1.gouv.qc.ca> X-Face: 
%bo>yc#X1.-jVa- List-Unsubscribe: List-ID: Alain LaBonté writes: > A 14:17 2003-02-11 +0100, Simon Josefsson a écrit : > > [non-quoted correspondent] >> > You claim that "many distinct names" of humans will get collapsed by >> > NFKC into the same name. So far you have provided zero examples. Could >> > you supply a few more please? >> >>[Simon] If you want to modify existing standards, I think you need to prove >>that it doesn't break things, not the other way around. > > [Alain] Could somebody summarize what is the actual behaviour of NFKC for me? > > I'm not sure what kind of mapping is done by NFKC... but I suspect > the kind of problems that may ocur. > > For example, if accents are removed from Latin letters once > entered (which in most cases would be convenient for those who can't > enter accented characters -- because I would like them to reach me if > my email address were to be « Alain.LaBonté@iquébec.com »), there > might indeed be collapses, and I will give an actual example, using a > look-alike family-name cluster from the city of Québec's telephone > book: > > There are actual real-life collisions for example, in these cases: While being an interesting example, nameprep works fine on these strings. To illustrate, I'm including below (a) initial data as UTF-8 (b) after NFKC (c) after nameprep. 
> Cote B

Initial data (length 6): 43 6f 74 65 20 42
After normalization (length 6): 43 6f 74 65 20 42
After nameprep (length 6): 63 6f 74 65 20 62

> Côte B

Initial data (length 7): 43 c3 b4 74 65 20 42
After normalization (length 7): 43 c3 b4 74 65 20 42
After nameprep (length 7): 63 c3 b4 74 65 20 62

> Coté B

Initial data (length 7): 43 6f 74 c3 a9 20 42
After normalization (length 7): 43 6f 74 c3 a9 20 42
After nameprep (length 7): 63 6f 74 c3 a9 20 62

> Côté B

Initial data (length 8): 43 c3 b4 74 c3 a9 20 42
After normalization (length 8): 43 c3 b4 74 c3 a9 20 42
After nameprep (length 8): 63 c3 b4 74 c3 a9 20 62

> This was already one of my favorites for dictionary ordering (the
> 4 family names, are 4 disting French words, meaning respectively
> "quote", "hill side", "quoted" and "side").
>
> However this case of collapse is imho no different from cases of
> names collapsing just because they are simply identical. One has, with
> the same ISP, to find extra ways to distinguish them. Imho an email
> address "B.Cote@..." should be able to reach "B.Côté@..."

As you can see, a nameprep approach would distinguish between all four strings. I think this is good though, and even fear that it might even be too aggressive.
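For anyone who wants to reproduce dumps like the ones above, here is a rough sketch using only Python's standard library. It is an approximation, not the tool used for the figures in this message: `approx_nameprep` is a hypothetical helper that applies Unicode case folding followed by NFKC, and it skips nameprep's prohibited-character and bidi checks. For these four strings that simplification should not matter.

```python
import unicodedata


def approx_nameprep(s: str) -> str:
    """Rough stand-in for nameprep's mapping steps: case-fold, then NFKC.

    Assumption: real nameprep also rejects prohibited characters and applies
    bidi rules, none of which is modelled here.
    """
    return unicodedata.normalize("NFKC", s.casefold())


def hexdump(s: str) -> str:
    """Return the UTF-8 bytes of s as space-separated lowercase hex."""
    return " ".join(f"{b:02x}" for b in s.encode("utf-8"))


# The four family names from the Québec telephone book example,
# written with escapes so the precomposed characters are unambiguous.
names = ["Cote B", "C\u00f4te B", "Cot\u00e9 B", "C\u00f4t\u00e9 B"]

for name in names:
    nfkc = unicodedata.normalize("NFKC", name)
    print(name)
    print("  initial data        :", hexdump(name))
    print("  after normalization :", hexdump(nfkc))
    print("  after folding + NFKC:", hexdump(approx_nameprep(name)))

# Collect the prepared forms to see whether any two names collapse.
prepared = {approx_nameprep(n) for n in names}
print("distinct prepared forms:", len(prepared))
```

NFKC leaves the accented letters alone and only the case changes, so the four names stay distinct.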
From owner-ietf-imaa Tue Feb 11 07:08:35 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1BF8Ze27118 for ietf-imaa-bks; Tue, 11 Feb 2003 07:08:35 -0800 (PST) Received: from m3001.hostcentric.net (m3001.hostcentric.net [216.157.79.237]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1BF8Yd27114 for ; Tue, 11 Feb 2003 07:08:34 -0800 (PST) Received: (qmail 6928 invoked by alias); 11 Feb 2003 15:08:35 -0000 Received: from unknown (HELO DAVIS1) (12.234.226.61) by 0 with SMTP; 11 Feb 2003 15:08:35 -0000 Message-ID: <003d01c2d1df$6cedef40$7300a8c0@DAVIS1> From: "Mark Davis" To: "Simon Josefsson" , "IETF IMAA list" , "Alain LaBonté" Cc: References: <20030211014413.GD16359@nicemice.net> <8f$3A$+JcDD@3247.org> <3CD14E451751BD42BA48AAA50B07BAD60337064E@vsvapostal3.prod.netsol.com> <20030211014413.GD16359@nicemice.net> <5.0.2.1.2.20030211083406.00a95b50@entree.sct1.gouv.qc.ca> Subject: Re: Case sensitivity on the LHS Date: Tue, 11 Feb 2003 07:08:24 -0800 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 8bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2800.1106 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1106 Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: > I'm not sure what kind of mapping is done by NFKC... but I suspect the > kind of problems that may ocur. > > For example, if accents are removed from Latin letters once entered ... NFKC does not remove the accents from Latin letters. If you are going to comment on NFKC, pro or contra, you should comment on what it does, rather than on simple speculation as to what it does. The formal results are in UAX #15 on the Unicode site. There is a chart that shows the effects on characters on http://www.unicode.org/charts/normalization/. 
For example, if you look at Latin characters on http://www.unicode.org/charts/normalization/chart_Latin.html you will see that "ô" remains as "ô" in NFKC. Mark ________ mark.davis@jtcsv.com IBM, MS 50-2/B11, 5600 Cottle Rd, SJ CA 95193 (408) 256-3148 fax: (408) 256-0799 ----- Original Message ----- From: "Alain LaBonté" To: "Simon Josefsson" ; "IETF IMAA list" Cc: Sent: Tuesday, February 11, 2003 05:58 Subject: Re: Case sensitivity on the LHS > > A 14:17 2003-02-11 +0100, Simon Josefsson a écrit : > > [non-quoted correspondent] > > > You claim that "many distinct names" of humans will get collapsed by > > > NFKC into the same name. So far you have provided zero examples. Could > > > you supply a few more please? > > > >[Simon] If you want to modify existing standards, I think you need to prove > >that it doesn't break things, not the other way around. > > [Alain] Could somebody summarize what is the actual behaviour of NFKC for me? > > I'm not sure what kind of mapping is done by NFKC... but I suspect the > kind of problems that may ocur. > > For example, if accents are removed from Latin letters once entered > (which in most cases would be convenient for those who can't enter accented > characters -- because I would like them to reach me if my email address > were to be « Alain.LaBonté@iquébec.com »), there might indeed be collapses, > and I will give an actual example, using a look-alike family-name cluster > from the city of Québec's telephone book: > > There are actual real-life collisions for example, in these cases: > > Cote B > Côte B > Coté B > Côté B > > This was already one of my favorites for dictionary ordering (the 4 > family names, are 4 disting French words, meaning respectively "quote", > "hill side", "quoted" and "side"). > > However this case of collapse is imho no different from cases of names > collapsing just because they are simply identical. One has, with the same > ISP, to find extra ways to distinguish them. 
Imho an email address > "B.Cote@..." should be able to reach "B.Côté@..." > > Alain LaBonté > Québec > > From owner-ietf-imaa Tue Feb 11 09:34:23 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1BHYNK07149 for ietf-imaa-bks; Tue, 11 Feb 2003 09:34:23 -0800 (PST) Received: from m3001.hostcentric.net (m3001.hostcentric.net [216.157.79.237]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1BHYLd07141 for ; Tue, 11 Feb 2003 09:34:21 -0800 (PST) Received: (qmail 25074 invoked by alias); 11 Feb 2003 17:34:23 -0000 Received: from unknown (HELO DAVIS1) (12.234.226.61) by 0 with SMTP; 11 Feb 2003 17:34:23 -0000 Message-ID: <008a01c2d1f3$cb36f5b0$7300a8c0@DAVIS1> From: "Mark Davis" To: "IETF IMAA list" References: <20030211014413.GD16359@nicemice.net> <20030211025714.GF16359@nicemice.net> Subject: Re: Case sensitivity on the LHS Date: Tue, 11 Feb 2003 09:34:12 -0800 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2800.1106 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1106 Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: 1. From a practical point of view, I personally very much in favor of case-insensitive names. While many programmers may be accustomed to (and use) case sensitivity, the average user is simply annoyed by it, or worse. Having mail bounce from john@foo.com because it had to be John@foo.com just doesn't make sense to most people. 2. In terms of data, the Lithuanian accent behavior exists, but the use of such accents is uncommon -- it is really for dictionary annotations. So from a practical standpoint it does not play a role. 3. Turkic spelling is the more important issue. 
If the normal case folding is used, then one gets the following equivalence sets:

{i, I}
{İ}
{ı}

So kiz@foo.com and KIZ@foo.com are considered case variants, while kız@foo.com and KİZ@foo.com are different.

Turkic languages, on the other hand, use:

{i, İ}
{ı, I}

For them, kiz@foo.com and KİZ@foo.com should be considered case variants, and different from kız@foo.com and KIZ@foo.com (which should also be case variants).

Where a system can have different case matching behavior for different languages, this is not a problem. So where a database client presents data sorted or selected for Turkish, this should be taken into account.

Where a system needs a single uniform case matching behavior over all strings, such as for a case-insensitive file system, typically implementations use an inclusive case matching, since it covers the vast majority of the world. That is, they use the following equivalence set:

{i, I, İ, ı}

The one downside for Turkic languages is that it does not allow them to have two email addresses that only differ by the dots on the I(s).

Mark ________ mark.davis@jtcsv.com IBM, MS 50-2/B11, 5600 Cottle Rd, SJ CA 95193 (408) 256-3148 fax: (408) 256-0799

----- Original Message ----- From: "Adam M. Costello" To: "IETF IMAA list" Sent: Monday, February 10, 2003 18:57 Subject: Re: Case sensitivity on the LHS > > John Cowan wrote: > > > > Turkish i is the only locale-dependent aspect of the Unicode > > > case-folding operation, according to the Unicode case-folding > > > table. I suppose it's possible that the table overlooks some other > > > locale-dependent things that ought to be in there, but can anyone > > > name any examples besides Turkish i? > > > > Lithuanian "I" with an accent above lowercases to "I" + DOT ABOVE + > > the main accent, because (unlike all other "i"s with accents) the i > > keeps its dot. For Unicode case-folding purposes, this discrepancy is > > ignored.
> > So if we consider case-mapping, rather than case-folding, then there > are three affected languages rather than just two, but it's still > all about the dot on the letter i. It still appears to me that this > is an isolated anomaly, not a typical example from a large class of > locale-dependent upper/lower case issues. > > Some people have raised a concern that case-folding causes damage (loss > of important information). It does cause very slight damage in rare > circumstances, but I think not doing it would cause annoyance (mail > bouncing or going to the wrong user) quite often. > > AMC > From owner-ietf-imaa Tue Feb 11 10:54:26 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1BIsQ710199 for ietf-imaa-bks; Tue, 11 Feb 2003 10:54:26 -0800 (PST) Received: from mail.uni-bielefeld.de (IDENT:72@mail2.uni-bielefeld.de [129.70.4.90]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1BIsPd10195 for ; Tue, 11 Feb 2003 10:54:25 -0800 (PST) Received: from 192.168.0.17 (ppp36-56.hrz.uni-bielefeld.de [129.70.36.56]) by mail.uni-bielefeld.de (Sun Internet Mail Server sims.4.0.2000.10.12.16.25.p8) with ESMTP id <0HA500805R6KXS@mail.uni-bielefeld.de> for ietf-imaa@imc.org; Tue, 11 Feb 2003 19:54:22 +0100 (MET) Date: Tue, 11 Feb 2003 19:47:15 +0100 From: Marc Mutz Subject: Re: Case sensitivity on the LHS In-reply-to: To: ietf-imaa@imc.org Message-id: <200302111947.30700@sendmail.mutz.com> Organization: KDE MIME-version: 1.0 Content-type: multipart/signed; protocol="application/pgp-signature"; micalg=pgp-sha1; boundary="Boundary-02=_CVUS+eOvYT6jMLA"; charset="iso-8859-1" Content-transfer-encoding: 7bit User-Agent: KMail/1.5.9 X-PGP-Key: 0xBDBFE838 References: Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: --Boundary-02=_CVUS+eOvYT6jMLA Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Content-Description: signed data Content-Disposition: inline On 
Sunday 09 February 2003 20:16, Paul Hoffman / IMC wrote: > RFC 2821 and RFC 2822 make it clear that the left-hand side (LHS) of > email addresses are opaque, which in turn means they are > case-sensitive. The -00 draft of IMAA preserves this. It is of course nice to be able to use one function for the domain, as well as the local-part, but since we already consider different field delimiters in addition to DOT, this single-function argument probably won't fly in practice anyway. So we might as well define another profile. And IMO, it should _not_ include case folding. For the simple reason that it's too aggressive. As an example (in addition to the ones already given by others), consider German "Maße" (measures). The sz ligature is a character that only exists in lower case form. I don't know what lead IDNA to fold that to "ss", but I think that this is a bug in IDNA and should be avoided by all means in IMAA. That's b/c Masse is German for "mass". :-o The easy way to avoid this is doing no case folding. The hard way (since incompatible with IDNA) would be to fix the ß-mapping in the IMAA tables...
Marc -- You can fool some people sometimes But you can't fool all the people all the time -- Bob Marley --Boundary-02=_CVUS+eOvYT6jMLA Content-Type: application/pgp-signature Content-Description: signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.1 (GNU/Linux) iD8DBQA+SUVB3oWD+L2/6DgRAsqSAKDFBpXKUqHeiArs+tmXwM+SoeYUbACeJP01 R/A8rthSmBiQMxNZk4ZHV+M= =zmxv -----END PGP SIGNATURE----- --Boundary-02=_CVUS+eOvYT6jMLA-- From owner-ietf-imaa Tue Feb 11 10:58:33 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1BIwXL10282 for ietf-imaa-bks; Tue, 11 Feb 2003 10:58:33 -0800 (PST) Received: from bs.jck.com (ns.jck.com [209.187.148.211]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1BIwWd10277 for ; Tue, 11 Feb 2003 10:58:32 -0800 (PST) Received: from [209.187.148.215] (helo=p3.JCK.COM) by bs.jck.com with esmtp (Exim 4.10) id 18ifbr-0006gf-00; Tue, 11 Feb 2003 13:58:23 -0500 Date: Tue, 11 Feb 2003 13:58:23 -0500 From: John C Klensin To: "Hollenbeck, Scott" , "'IETF IMAA list'" Subject: IMAA (or alternative) and 2821/2822 (was: RE: Case sensitivity on the LHS) Message-ID: <152208734.1044971903@p3.JCK.COM> In-Reply-To: <3CD14E451751BD42BA48AAA50B07BAD603370662@vsvapostal3.prod.netsol.com> References: <3CD14E451751BD42BA48AAA50B07BAD603370662@vsvapostal3 .prod.netsol.com> X-Mailer: Mulberry/3.0.0 (Win32) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Scott, The listing of 2821/2822 as obsoleting 821/822 was basically a mistake on someone's part -- Proposed Standards can't replace full Standards. Questions about why the RFC-Index has not been corrected should be addressed to the RFC Editor and/or IESG... I don't know.
However, while the process is moving very slowly (anyone who went through the DRUMS end-game can probably guess at many of the reasons), Pete and I are working on revisions/updates to 2821/2822 that fix the errors and incorporate changes suggested on various lists. IMO, it would border on the insane to produce new versions of those documents that omit mention of internationalization work. But it would probably be unwise to make those mentions/references normative, since doing so would tie the maturity level of 2821/2822 to that of the internationalization work. E.g., it would force the revisions to recycle at Proposed, rather than being candidates for processing at Draft. There is another twist in this, best seen when one examines the question of updating 1034/1035. For better or worse, a key property of IDNA was that it not modify 1034/1035 in any way. I don't see any way that it can be meaningful to claim that IDNA doesn't modify 1034/1035 in any way and that it "updates" them. regards, john --On Tuesday, 11 February, 2003 07:27 -0500 "Hollenbeck, Scott" wrote: > >> "Hollenbeck, Scott" wrote: >> >> > Is this document intended to be a formal update to 2821 and >> > 2822? >> >> Is IDNA a formal update to RFCs 1034 and 1035? IMAA would >> bear a similar relation to RFCs 821, 822, 2821, 2822. > > Maybe yours is a rhetorical question since you probably know > the answer (I don't because IDNA hasn't yet been published as > an RFC), but here's why I asked mine: someone implementing > 1034/1035 or 2821/2822 (which, by the way, obsolete 821 and > 822) might not necessarily know of the new features defined by > IDNA and IMAA since there are no "update" references > maintained by the RFC Editor. I suspect that it would > probably be a good idea to have IDNA update 1034/1035, and > likewise it's probably a good idea to have IMAA update > 2821/2822 so that implementers see a clear relationship > between the specifications. 
> >> > Both (2821 section 4.1.2 and 2822 section 3.4.1) contain >> > formal definitions of the local part of an email address. >> >> IMAA does not redefine local part, but instead defines a new >> term, internationalized local part. Just as IDNA does not >> redefine domain label, but instead defines a new term, >> internationalized domain label. > > These new terms won't necessarily be known to implementers of > the earlier specifications that IDNA and IMAA build upon. If > new features with new processing rules are being defined, why > not give RFC readers a clear pointer to the specifications > that describe the new features? > > Asking a different way: would these new features (or something > similar) have been included in 1034/1035 or 2821/2822 if > internationalization was considered when the earlier > specifications were being written? While we'll never know for > sure, I'm suggesting that the possibility of a "yes" answer > implies that IDNA and IMAA _should_ update the earlier > specifications. 
> > -Scott- From owner-ietf-imaa Tue Feb 11 11:07:11 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1BJ7BH10532 for ietf-imaa-bks; Tue, 11 Feb 2003 11:07:11 -0800 (PST) Received: from crow.verisign.com (crow.verisign.com [216.168.237.103]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1BJ79d10528 for ; Tue, 11 Feb 2003 11:07:09 -0800 (PST) Received: from VSVAPOSTALGW1.prod.netsol.com (vsvapostalgw1.prod.netsol.com [10.170.12.38]) by crow.verisign.com (nsi_0.1/8.9.1) with ESMTP id OAA07691; Tue, 11 Feb 2003 14:07:04 -0500 (EST) Received: by VSVAPOSTALGW1.prod.netsol.com with Internet Mail Service (5.5.2653.19) id <1V7PBABM>; Tue, 11 Feb 2003 14:03:14 -0500 Message-ID: <3CD14E451751BD42BA48AAA50B07BAD603370674@vsvapostal3.prod.netsol.com> From: "Hollenbeck, Scott" To: "'John C Klensin'" , "'IETF IMAA list'" Subject: RE: IMAA (or alternative) and 2821/2822 (was: RE: Case sensitivit y on the LHS) Date: Tue, 11 Feb 2003 14:03:05 -0500 MIME-Version: 1.0 X-Mailer: Internet Mail Service (5.5.2653.19) Content-Type: text/plain; charset="iso-8859-1" Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: > There is another twist in this, best seen when one examines the > question of updating 1034/1035. For better or worse, a key > property of IDNA was that it not modify 1034/1035 in any way. I > don't see any way that it can be meaningful to claim that IDNA > doesn't modify 1034/1035 in any way and that it "updates" them. Thanks, I'd forgotten that point. As a practical matter I can see the value in not tying the specifications together. 
-Scott- From owner-ietf-imaa Tue Feb 11 11:24:26 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1BJOQS11588 for ietf-imaa-bks; Tue, 11 Feb 2003 11:24:26 -0800 (PST) Received: from m3001.hostcentric.net (m3001.hostcentric.net [216.157.79.237]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1BJOOd11582 for ; Tue, 11 Feb 2003 11:24:24 -0800 (PST) Received: (qmail 17253 invoked by alias); 11 Feb 2003 19:24:27 -0000 Received: from unknown (HELO DAVIS1) (12.234.226.61) by 0 with SMTP; 11 Feb 2003 19:24:27 -0000 Message-ID: <00a401c2d203$2ad8a090$7300a8c0@DAVIS1> From: "Mark Davis" To: "Marc Mutz" , References: <200302111947.30700@sendmail.mutz.com> Subject: Re: Case sensitivity on the LHS Date: Tue, 11 Feb 2003 11:24:14 -0800 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 8bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2800.1106 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1106 Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: The reason that those are case folded is that the uppercase of Maße is MASSE, which is the same as the uppercase of Masse. So for it to be a well-defined equivalence relation, {ss, SS, ß} have to be in the same class. 
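A minimal sketch of the equivalence-class argument above, assuming Python's built-in str.casefold (Unicode full case folding) as a stand-in for the folding step. It is not Nameprep itself, and the helper name equivalence_classes is made up for illustration. Folding puts all three spellings in one class, while plain lowercasing leaves two:

```python
def equivalence_classes(words, key):
    """Group words whose key(word) values are identical (hypothetical helper)."""
    classes = {}
    for word in words:
        classes.setdefault(key(word), []).append(word)
    return sorted(classes.values())

words = ["Maße", "Masse", "MASSE"]

# Uppercasing maps the sharp s to "SS", so both spellings collide in uppercase.
assert "Maße".upper() == "Masse".upper() == "MASSE"

# Full case folding: ß folds to "ss", so all three land in a single class.
by_casefold = equivalence_classes(words, str.casefold)

# Lowercasing only: ß stays ß, so "Maße" ends up in a class of its own.
by_lower = equivalence_classes(words, str.lower)

print(by_casefold)
print(by_lower)
```

The lowercase-only grouping is well defined as long as nothing ever uppercases the input; once uppercasing is allowed anywhere in the chain, the two classes merge.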
Mark ________ mark.davis@jtcsv.com IBM, MS 50-2/B11, 5600 Cottle Rd, SJ CA 95193 (408) 256-3148 fax: (408) 256-0799 ----- Original Message ----- From: "Marc Mutz" To: Sent: Tuesday, February 11, 2003 10:47 Subject: Re: Case sensitivity on the LHS From owner-ietf-imaa Tue Feb 11 11:32:19 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1BJWJb12246 for ietf-imaa-bks; Tue, 11 Feb 2003 11:32:19 -0800 (PST) Received: from mail.reutershealth.com ([65.246.141.36]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1BJWId12241 for ; Tue, 11 Feb 2003 11:32:18 -0800 (PST) Received: from skunk.reutershealth.com (mail [65.246.141.36]) by mail.reutershealth.com (Pro-8.9.3/Pro-8.9.3) with SMTP id OAA02525; Tue, 11 Feb 2003 14:29:24 -0500 (EST) Message-Id: <200302111929.OAA02525@mail.reutershealth.com> Received: by skunk.reutershealth.com (sSMTP sendmail emulation); Tue, 11 Feb 2003 14:31:50 -0500 From: John Cowan Subject: Re: Case sensitivity on the LHS To: mark.davis@jtcsv.com (Mark Davis) Date: Tue, 11 Feb 2003 14:31:50 -0500 (EST) Cc: ietf-imaa@imc.org (IETF IMAA list) In-Reply-To: <008a01c2d1f3$cb36f5b0$7300a8c0@DAVIS1> from "Mark Davis" at Feb 11, 2003 09:34:12 AM X-Mailer: ELM [version 2.5 PL6] MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Mark Davis scripsit: > The one downside for Turkic languages is that it does not allow them to have > two email addresses that only differ by the dots on the I(s). This is not much of a downside, at least for Turkish, because due to the vowel harmony rules you can usually tell whether an i is dotted or dotless. 
-- After fixing the Y2K bug in an application: John Cowan WELCOME TO jcowan@reutershealth.com DATE: SUNDAK, JANUARK 1, 2000 http://www.ccil.org/~cowan From owner-ietf-imaa Tue Feb 11 13:40:17 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1BLeHP17213 for ietf-imaa-bks; Tue, 11 Feb 2003 13:40:17 -0800 (PST) Received: from tux.w3.org (IDENT:root@tux.w3.org [18.29.0.27]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1BLeFd17207 for ; Tue, 11 Feb 2003 13:40:16 -0800 (PST) Received: from enoshima (IDENT:root@tux.w3.org [18.29.0.27]) by tux.w3.org (8.9.3/8.9.3) with ESMTP id QAA12518; Tue, 11 Feb 2003 16:40:11 -0500 Message-Id: <4.2.0.58.J.20030211162348.0479ea38@localhost> X-Sender: duerst@localhost X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J Date: Tue, 11 Feb 2003 16:27:57 -0500 To: "Mark Davis" , "Marc Mutz" , From: Martin Duerst Subject: Re: Case sensitivity on the LHS In-Reply-To: <00a401c2d203$2ad8a090$7300a8c0@DAVIS1> References: <200302111947.30700@sendmail.mutz.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: [using sz for the German sharp-s] At 11:24 03/02/11 -0800, Mark Davis wrote: >The reason that those are case folded is that the uppercase of Masze is >MASSE, which is the same as the uppercase of Masse. So for it to be a >well-defined equivalence relation, {ss, SS, sz} have to be in the same class. Well, this is only the case if you want a full "round-trip" equivalence. For the purpose of IDNA and IMAA, it would have been possible to define an equivalence with two classes ({ss, SS}, {sz}) without actual problems, because IDNA maps to lowercase. (The only exception being for people trying to input words with sz in all-uppercase, which they won't do anyway.) 
But given that IDNA has taken the decision it has, I don't think it's worth creating a whole new table with all the associated confusion for IMAA just for this little tweak. Regards, Martin. From owner-ietf-imaa Tue Feb 11 13:49:11 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1BLnBk17486 for ietf-imaa-bks; Tue, 11 Feb 2003 13:49:11 -0800 (PST) Received: from slarti.muc.de (slarti.muc.de [193.149.48.10]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1BLn8d17479 for ; Tue, 11 Feb 2003 13:49:09 -0800 (PST) Received: (qmail 2792 invoked by uid 66); 11 Feb 2003 21:49:04 -0000 Received: from faerber.muc.de by slarti.muc.de with BSMTP (rsmtp-qm-ot 0.4) for ietf-imaa@imc.org; 11 Feb 2003 21:49:04 -0000 Received: by faerber.muc.de (OpenXP/32 v3.9.4 (Win32) alpha @ 2003-02-11-0944d); 11 Feb 2003 22:48:30 +0100 Date: 11 Feb 2003 12:17:00 +0100 From: list-ietf-i18n-imaa@faerber.muc.de (=?ISO-8859-1?Q?Claus_F=E4rber?=) To: ietf-imaa@imc.org Message-ID: <8fd5$Jh3cDD@3247.org> In-Reply-To: <20030211023401.GE16359@nicemice.net> Subject: Re: Compatibility with IDNA User-Agent: OpenXP/32 v3.9.4 (Win32) alpha @ 2003-02-11-0944d MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 8bit Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Adam M. Costello wrote: > Claus Färber wrote: >> This has several advantages: >> >> . You can put a local-part, a domain, or a complete email address >> into the same encoding/decoding function and the results are >> correct. > Currently, we don't even have a single encoding/decoding function for > domain names. We have functions for domain labels. IDNA does not > define functions for whole domain names because the delimiting and > quoting conventions vary. For example, DNS master files use dots as > delimiters, but DNS protocol messages don't.
DNS master files use > backslashes to quote dots that are not delimiters, but in DNS protocol > messages there is nothing special about dots or backslashes. Well, any quoting in config files is not part of the address. The next generation of software will take UTF-8 as input and encode the labels automatically anyway. > The message header and SMTP protocols add still more delimiters and > quoting mechanisms. Any quoting, whitespace, etc. allowed by RFC 2822 is just extra noise used to embed an email address in higher-level protocols. The real question is how to deal with the minimum quoting required by RFC 2821. Is that considered part of the email address? For example, how is that quoting handled if such an email address is included as a DNS label? Do MTAs match the email address ``"joe user"@example.com'' against the login name ``joe user'' or against ``"joe user"''? This basically defines how much of the quoting to undo before doing the encoding (and to reapply after doing the encoding, to the extent necessary). Well, even if this means that you have to encode the local-part and the RHS separately, it would still be a benefit to be able to use the same function on both sides (which will still encode valid domain names embedded in the local part the same way as IDNA). > Various config file formats have their own ad-hoc quoting mechanisms, > different from the ones used in message headers and DNS master files. Again, this is just extra noise used to embed an email address in higher-level protocols. > In IDNA, each label is encoded as a unit, never subdivided, even if > it contains dots or other ASCII punctuation. It's far too late to > consider altering that fundamental architecture of IDNA. None of the characters that could make a difference is valid within a domain label that is part of a host name (host names use ``UseSTD3ASCIIRules''). So even though IDNA is being deployed now, there will be no domain names where it would make any difference.
I think that changes that don't make a difference for deployed IDNs should still be possible. Or, to describe the idea better: It should be possible to define IMAA in such a way that its encoding function just happens to work with *valid* domain names too. > Maybe, if we had co-designed internationalized domain names, mail any > At this point, the most we could try for is to use the exact same > encoding for local-parts (or subparts). >> . You can have a domain name embedded in a local-part and it is >> encoded the same way as a domain on the right hand side if it is >> delimited by one of the delimiters listed above (useful for the >> so-called percent hack, MIXER, etc.) > We might be able to do that if we subdivide local parts. >> . The reverse is also true: You can have an email address converted to >> a domain name (as seen in SOA DNS records, for example). > We might be able to do that if we *don't* subdivide local parts. Oh, right. I missed the point that c\.faerber.example.com and c.faerber.example.com are different domain names. It is clear that the string used in the DNS must be identical to the one produced by IMAA so that IDNA-unaware and IMAA-unaware software can handle these addresses. I wonder if anything would break if DNS server software were to encode labels containing dots according to an IDNA-compatible IMAA (and not IDNA). IDNAs are only used for domain names, which have a very restricted subset of characters. Binary data in the DNS does not use IDNA anyway (and must bypass any ACE if it contains any octets above 0x80). Non-binary data that does make use of non-ASCII characters is currently limited to domain names, which use UseSTD3ASCIIRules. Mandating an encoding different from IDNA that will produce the same output for domain names should not hurt anyone. For example, a DNS server implementation could do this:
. Parse the config file, treat non-ASCII chars as opaque.
. For each label found, do this:
  . Check whether it's binary or made up of characters (for example, it could assume that anything that contains octets above 0x80 encoded as '\OOO' is binary and everything else is character data [you could quote Unicode chars as '\x{XXXX}', for example]).
  . For binary data, just undo the quoting and be done. Print an error if (quoted) Unicode characters or unencoded octets above 0x80 are found.
  . For character data, convert from the local charset (e.g. UTF-8) to Unicode, undo the quoting of characters and encode the resulting name using the IDNA-compatible IMAA.
Unless the zone file writer really means to include *invalid* domain names (that contain characters not from ['A'..'Z','a'..'z','0'..'9','-'] plus the separators) as domain names encoded according to IDNA, this would work. If s/he really means to do that (some people just like to break things), s/he can still do the IDNA-encoding manually. Claus -- ------------------------ http://www.faerber.muc.de/ ------------------------ OpenPGP: DSS 1024/639680F0 E7A8 AADB 6C8A 2450 67EA AF68 48A5 0E63 6396 80F0 From owner-ietf-imaa Tue Feb 11 13:55:08 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1BLt8q17629 for ietf-imaa-bks; Tue, 11 Feb 2003 13:55:08 -0800 (PST) Received: from bs.jck.com (ns.jck.com [209.187.148.211]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1BLt6d17624 for ; Tue, 11 Feb 2003 13:55:07 -0800 (PST) Received: from [209.187.148.215] (helo=p3.JCK.COM) by bs.jck.com with esmtp (Exim 4.10) id 18iiMv-00073D-00 for ietf-imaa@imc.org; Tue, 11 Feb 2003 16:55:09 -0500 Date: Tue, 11 Feb 2003 16:55:09 -0500 From: John C Klensin To: IETF IMAA list Subject: Re: Sound mapping Message-ID: <162814004.1044982509@p3.JCK.COM> In-Reply-To: <20030211051729.GG16359@nicemice.net> References: <8f$3A$+JcDD@3247.org> <3CD14E451751BD42BA48AAA50B07BAD60337064E@vsvapostal3.prod.netsol.
com> <20030211014413.GD16359@nicemice.net> <20030211040519.99895.qmail@cr.yp.to> <20030211051729.GG16359@nicemice.net> X-Mailer: Mulberry/3.0.0 (Win32) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: --On Tuesday, 11 February, 2003 05:17 +0000 "Adam M. Costello" wrote: > Because users are already accustomed to not bothering to > remember and type the proper case of letters in mail > addresses, because in practice it doesn't matter. By doing > case-folding we can avoid surprising them. Users are not > accustomed to mail addresses being sound-insensitive, so > there's no point in us working harder to meet nonexistent > expectations. And because there is a fairly large experimental and observational literature in human factors in computing (as well as some other areas) that indicates that ordinary people, unprompted, will assume that upper and lower case strings that look like names or words in most languages that use Latin-based alphabets will be treated as equivalent*. One corollary to those results (or an independent result, depending whose work one reads) is that, when people spell words out loud, they do not give indications of case unless there is some special clue that they should. By contrast, the expectation that things that sounded alike would be treated as equivalent largely disappeared with the standardization of spelling in the 19th century (at least for English -- I haven't researched other languages and would not be surprised if some standardized spellings earlier and some did so later). Of course, "simplified spelling" efforts show up every decade or two, and most of them start from the notion that identically-pronounced words should match and be spelled identically, but they have failed to gain any traction (and their advocates have mostly been dismissed as nut cases). 
john * Note that the statement above is fairly conservative. I haven't followed developments in the relevant literature for 15 or 20 years and have never studied any work that might exist on languages using characters that are not Latin-derived. "Ordinary people" is an informal category that does not include specialists who have learned about, and gotten used to, case distinctions. Two more disclaimers: (i) I have not seen any research that would establish whether people would expect strange-case constructions to be case-independent. E.g., those expectations about "names or words" would predict the assumption that "JOHN", "John", and "john" would be equivalent. But I don't know what assumptions would be made about the equivalence (or lack thereof) of "john" and "jOhn" and "jOhN" -- those are sufficiently strange-looking that I would be unsurprised if the reader/observer at least wondered. (ii) There could easily be some cultural differences that might cause people to suspect that case distinctions were important if their languages, e.g., capitalized all nouns (and not just "proper nouns", as in English). Seeing a noun written entirely in lower case might trigger sufficient alarms for them to turn it into a variation on the first disclaimer. I suspect there has been research on that topic, but I haven't seen it. From owner-ietf-imaa Tue Feb 11 16:57:44 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1C0vi122761 for ietf-imaa-bks; Tue, 11 Feb 2003 16:57:44 -0800 (PST) Received: from nicemice.net (arwen.CS.Berkeley.EDU [128.32.132.165]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1C0vfd22757 for ; Tue, 11 Feb 2003 16:57:43 -0800 (PST) Received: from amc by nicemice.net with local (Exim 3.35 #1 (Debian)) id 18ilDb-0007ZU-00 for ; Tue, 11 Feb 2003 16:57:43 -0800 Date: Wed, 12 Feb 2003 00:57:43 +0000 From: "Adam M. 
Costello" To: IETF IMAA list Subject: Re: Case sensitivity on the LHS Message-ID: <20030212005743.GA27754@nicemice.net> Reply-To: IETF IMAA list References: <8f$3A$+JcDD@3247.org> <3CD14E451751BD42BA48AAA50B07BAD60337064E@vsvapostal3.prod.netsol.com> <20030211014413.GD16359@nicemice.net> <3CD14E451751BD42BA48AAA50B07BAD603370662@vsvapostal3.prod.netsol.com> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <4.2.0.58.J.20030211162348.0479ea38@localhost> <200302111947.30700@sendmail.mutz.com> <3CD14E451751BD42BA48AAA50B07BAD603370662@vsvapostal3.prod.netsol.com> User-Agent: Mutt/1.4i Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: This message responds to Scott Hollenbeck, Simon Josefsson, Marc Mutz, and Marcos Sanz/Denic. "Hollenbeck, Scott" wrote: > > > Is this document intended to be a formal update to 2821 and 2822? > > > > Is IDNA a formal update to RFCs 1034 and 1035? IMAA would bear a > > similar relation to RFCs 821, 822, 2821, 2822. > > Maybe yours is a rhetorical question since you probably know the > answer Actually, I don't know the answer, and it wasn't a rhetorical question. All I know is that the answer ought to be the same for IDNA and IMAA. I do know that IDNA does not require any implementations of RFC 1034/1035 to be changed, ever. I don't know exactly what it means for one RFC to "update" another. People interested in RFCs 1034/1035 are very likely to be interested in IDNA too. > 2821/2822 (which, by the way, obsolete 821 and 822) I hope John is correct that this is an error, because RFC 2822 neglects to mention that mail user agents need to preserve case in local parts in message headers, an important requirement that is stated in RFC 822. Simon Josefsson wrote: > Not if it makes changes to those RFCs, which seems to be (part of) > what this discussion is about. 
This discussion is not about making changes to the definitions of local part in RFCs 821, 822, 2821, 2822. Those RFCs define local parts as ASCII strings conforming to a certain syntax, which may be case-sensitive at the whim of the mail server for the domain where the local parts have meaning, and which therefore must have their case preserved by all other agents. IMAA would not change any of that. It would define a new concept, internationalized local parts, which unlike local parts can contain non-ASCII characters. The case-sensitivity rules for ASCII local parts would stay the same. But non-ASCII internationalized local parts already break the first rule of local parts (that they contain only ASCII characters), so there's no reason to think that they necessarily obey the other rules (that they may be case-sensitive and must be case-preserving). Of course, we would like internationalized local parts to resemble local parts as much as possible. But is it possible to use the exact same case-sensitivity rules (may be case-sensitive, must be case-preserving)? I can think of only two ways to achieve that. One way is to require all IMAA-aware mail user agents to support mixed-case annotations. I've already argued that that would have too great a cost for too little benefit. The other way is to expect existing mail servers that currently do case-insensitive ASCII comparisons of local parts (which is pretty much all of them) to upgrade to IMAA-awareness and do case-insensitive comparisons of internationalized local parts. But a key goal of IMAA, like IDNA, is not asking for upgrades of the infrastructure. We only want to ask for upgrades of end-user applications. IMAA is supposed to affect mail user agents, not mail transport agents. The mail server at yahoo.com is merely an intermediary between the sender of a message and the recipient of a message.
Once the sender and recipient have upgraded their mail user agents, they shouldn't need to wait for yahoo.com to take action before they gain full use of IMAA. Therefore, I think it is undesirable for non-ASCII internationalized local parts to try to exactly duplicate the case-sensitivity rules of ASCII local parts (may be case-sensitive, must be case-preserving). Using different rules for non-ASCII internationalized local parts is not a change in existing standards, for reasons given above. That leaves at least three choices for handling case-sensitivity in non-ASCII internationalized local parts: 1) must be case-sensitive, case must be preserved 2) must be case-insensitive, case cannot be preserved 3) lowercase must be accepted, non-lowercase may be accepted (in which case it must be considered equivalent), case may be preserved Option 1 would not use case-folding. Options 2 and 3 would use case-folding. Option 3 is a new idea I have brewing, more on that later. > > You claim that "many distinct names" of humans will get collapsed by > > NFKC into the same name. So far you have provided zero examples. > > If you want to modify existing standards, I think you need to prove > that it doesn't break things, not the other way around. I don't want to modify existing standards, and I'm not breaking anything. If the use of Nameprep means that IMAA cannot distinguish two strings (like Maße and Masse), that may be unfortunate, but the existing mail standards can't distinguish them either, because they don't allow non-ASCII characters (like ß). IMAA would not cause anything to stop working that used to work, so it wouldn't "break" anything. Marc Mutz wrote: > It is of course nice to be able to use one function for the domain, as > well as the local-part, but since we already consider different field > delimiters in addition to DOT, this single-function argument probably > won't fly in practice anyway. 
We can't use a single function for entire domain names, or entire mail addresses, but we might be able to use the same function for domain labels and local parts (or subparts). > So we might as well define another profile. I have no objection to defining another profile if Nameprep isn't appropriate. But I think Nameprep is appropriate. I think case-folding is appropriate. > And IMO, it should _not_ include case folding. For the simple reason > that it's too aggressive. > > As an example (in addition to the ones already given by others), > consider German "Maße" (measures). The sz ligature is a character > that only exists in lower case form. I don't know what lead IDNA > to fold that to "ss", but I think that this is a bug in IDNA and > should be avoided by all means in IMAA. That's b/c Masse is German for > "mass". Please estimate the amount of annoyance that German speakers would suffer if ß matches ss, and the amount of annoyance that Turkish/Azeri/Lithuanian speakers would suffer regarding dots on i's, and weigh that against the amount of annoyance that all Latin/Cyrillic/Greek/etc users would suffer if uppercase non-ASCII letters do not match lowercase non-ASCII letters. As for why IDNA folds ß into ss, it's because that's what the Unicode case-folding operation does. We decided that trying to fix various perceived problems in Unicode's case-folding and normalization tables was a rat-hole (it would never end, someone would always find another problem in need of fixing), and was outside our scope of expertise, and it was better to simply include those Unicode-defined operations as-is. Marcos Sanz/Denic wrote: > As far as I remember, Nameprep states to be a profile for preparing > domain names and it should not be used for any other purpose. Actually, it does not state that. Here's what it states: Nameprep is used by the IDNA [IDNA] protocol for preparing domain names; it is not designed for any other purpose. 
It is explicitly not designed for processing arbitrary free text and SHOULD NOT be used for that purpose. That's all true. Nameprep was not designed for preparing local parts. If we design a profile for preparing local parts, and notice that it is identical to (or nearly identical to) Nameprep, then we should consider reusing Nameprep. There's no recommendation against that. The recommendation is against using Nameprep for arbitrary free text, which is not what we're doing. > Wouldn't it be a better idea to make anyway a new profile for IMAA? Maybe, maybe not. That's what we're trying to figure out now. I'm leaning toward "maybe not". AMC From owner-ietf-imaa Tue Feb 11 17:27:38 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1C1Rcb23404 for ietf-imaa-bks; Tue, 11 Feb 2003 17:27:38 -0800 (PST) Received: from nicemice.net (arwen.CS.Berkeley.EDU [128.32.132.165]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1C1Rbd23400 for ; Tue, 11 Feb 2003 17:27:37 -0800 (PST) Received: from amc by nicemice.net with local (Exim 3.35 #1 (Debian)) id 18ilga-0007dM-00 for ; Tue, 11 Feb 2003 17:27:40 -0800 Date: Wed, 12 Feb 2003 01:27:40 +0000 From: "Adam M. Costello" To: IETF IMAA list Subject: Re: Case sensitivity on the LHS Message-ID: <20030212012740.GB27754@nicemice.net> Reply-To: IETF IMAA list References: <20030211014413.GD16359@nicemice.net> <8f$3A$+JcDD@3247.org> <3CD14E451751BD42BA48AAA50B07BAD60337064E@vsvapostal3.prod.netsol.com> <20030211014413.GD16359@nicemice.net> <5.0.2.1.2.20030211083406.00a95b50@entree.sct1.gouv.qc.ca> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <5.0.2.1.2.20030211083406.00a95b50@entree.sct1.gouv.qc.ca> User-Agent: Mutt/1.4i Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Alain LaBonté wrote: > Could somebody summarize what is the actual behaviour of NFKC for me? 
NFKC normalizes the representation of strings that are "compatibly equivalent". There are two kinds of equivalence: canonical equivalence, and compatible equivalence. Any strings that are canonically equivalent are also compatibly equivalent. Typical examples of canonical equivalence:

    Latin small letter a with dot above
    Latin small letter a, combining dot above

    Latin small letter a, combining dot above, combining dot below
    Latin small letter a, combining dot below, combining dot above

    Kelvin sign
    Latin capital letter K

Compatible equivalence adds equivalences between compatibility characters and other characters. Compatibility characters are generally characters that the Unicode consortium would not have included at all if it had not been necessary to support round-trip lossless conversions to/from character sets that make the distinction. Here are some typical examples (compatibility character first, followed by the regular character(s)):

    vulgar fraction one half
    digit one, fraction slash, digit two

    Latin small ligature ij
    Latin small letter i, Latin small letter j

    dot above
    space, combining dot above

    em space
    space

    double prime
    prime, prime

    superscript two
    digit two

    roman numeral one
    Latin capital letter I

    circled digit one
    digit one

    Arabic letter Beeh final form
    Arabic letter Beeh

    fullwidth Latin small letter a
    Latin small letter a

AMC From owner-ietf-imaa Tue Feb 11 18:30:28 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1C2USd24655 for ietf-imaa-bks; Tue, 11 Feb 2003 18:30:28 -0800 (PST) Received: from nicemice.net (arwen.CS.Berkeley.EDU [128.32.132.165]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1C2UQd24651 for ; Tue, 11 Feb 2003 18:30:26 -0800 (PST) Received: from amc by nicemice.net with local (Exim 3.35 #1 (Debian)) id 18imfO-0007n5-00 for ; Tue, 11 Feb 2003 18:30:30 -0800 Date: Wed, 12 Feb 2003 02:30:30 +0000 From: "Adam M.
Costello" To: ietf-imaa@imc.org Subject: Re: Compatibility with IDNA Message-ID: <20030212023030.GA28984@nicemice.net> Reply-To: IETF IMAA list References: <20030211023401.GE16359@nicemice.net> <8fd5$Jh3cDD@3247.org> <8f$3A5e3cDD@3247.org> <20030211023401.GE16359@nicemice.net> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <8fd5$Jh3cDD@3247.org> <20030211023401.GE16359@nicemice.net> User-Agent: Mutt/1.4i Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: I wrote: > We should use the same ACE prefix [for IMAA and IDNA] if and only if > the ToASCII and ToUnicode operations are identical. Oops, that's a little stronger than we need. We should use the same ACE prefix if and only if the ToUnicode operations are identical. The ToASCII operations could be different if the difference has no effect on the observable ToUnicode behavior (remember that ToUnicode invokes ToASCII). > If two different sets of ToASCII/ToUnicode operations were to use the > same prefix, that would invite errors where a string gets encoded by > one ToASCII and decoded by the wrong ToUnicode, which would probably > cause the original string to get converted into a non-equivalent > string. Notice that this is a concern only if the ToUnicode operations differ. Claus Färber wrote: > > IDNA does not define functions for whole domain names because the > > delimiting and quoting conventions vary. For example, DNS master > > files use dots as delimiters, but DNS protocol messages don't. DNS > > master files use backslashes to quote dots that are not delimiters, > > but in DNS protocol messages there is nothing special about dots or > > backslashes. > > Well, any quoting in config files is not part of the address. The next > generation of software will take UTF-8 as input and encode the labels > automatically anyway. 
Okay, but how does that help me define a single function that takes entire domain names (or even entire mail addresses) as input? Consider the domain name whose first label is foo.bar and whose second and third labels are example and org. In a DNS protocol message, the domain name would contain only one dot, and no backslashes. In a DNS master file, the domain name would be a string looking like this: foo\.bar.example.org. Or actually, it might gratuitously look like this: \f\o\o\.\b\a\r.\e\x\am\p\l\e.\o\r\g. Anyway, what am I supposed to pass to this single function that takes entire domain names? It needs to be either a list of labels (in which case we have failed to factor out the parsing and left it up to the application, same as in IDNA), or it needs to be a string with some quoting mechanism for the embedded dot, in which case applications are still going to have to parse everything anyway in order to undo the various quoting styles (DNS master file, message header, SMTP, other config files) before applying the single quoting style used by this single function. If applications are doomed to parse everything anyway, we might as well stick with the model where the ACE conversions are applied independently to the individual pieces. The most we can hope for is to reuse the same conversion operations for pieces found on either side of the at-sign. > The real question is how to deal with the minimum quoting required by > RFC 2821. Is that considered part of the email address? For example, > how is that quoting handled if such an email address is included as a > DNS label? Do MTAs match the email address ``"joe user"@example.com'' > against the login name ``joe user'' or agains ``"joe user"''? I was wondering the same thing myself this morning. You can also ask the question in the other direction. If I find "foo".example.org. in an SOA record, and I want to send mail there, do I need to compose the To: field like this: "\"foo\""@example.org ? 
The various RFCs (1034, 1035, 822, 2822) are not clear about this. My best guess is that the RFC 822 and SMTP quotes and backslashes are not really part of the local-part, and should be removed before the local part is inserted into some other context, like a DNS master file (and therefore, if a quote character appears in the DNS master file, it really is part of the local part, and needs to be quoted in the To: field and the SMTP RCPT command). But I wouldn't rely on that guess. I'd avoid using any special characters in domain-mapped mail addresses until/unless an official clarification is published. > Well, even if this means that you have to encode the local-part and > the RHS separately, it would still be a benefit to be able to use the > same function on both sides Agreed. I have some ideas for how to achieve that, which I'll describe in an upcoming message. > > > You can have an email address converted to a domain name (as seen > > > in SOA DNS records, for example). > > > We might be able to do that if we *don't* subdivide local parts. > > Oh, right. I missed the point that c\.faerber.example.com and > c.faerber.example.com are different domain names. > > It is clear that the string used in the DNS must be identical to the > one produced by IMAA so that IDNA-unaware and IMAA-unaware software > can handle these addresses. > > I wonder if anything would break if a DNS server software would encode > labels containing dots according to an IDNA-compatible IMAA (and not > IDNA). > > IDNAs are only used for domain names, which have a very restricted > subset of characters. Binary data in the DNS does not use IDNA anyway > (and must bypass any ACE if it contains any octets above 0x80). > > Non-binary data that does make use of non-ASCII characters is > currently limited to domain names, which UseSTD3ASCIIRules. Mandating > an encoding different from IDNA that will produce the same output for > domain names should not hurt anyone.
That's a clever idea, but you're assuming that all textual domain names use the STD-3 ASCII rules (LDH restrictions). Actually, that restricted syntax is "preferred" for domain names in general, but required only for names of hosts and mail domains. Domain names used to name things other than hosts and mail domains are not obligated to use the preferred syntax. An example is SRV names, like _ldap._tcp.example.org [RFC-2782]. Another example is classless in-addr.arpa delegations, like 0/25.2.0.192.in-addr.arpa [RFC-2317 = BCP-20]. Of course these particular examples (the only two I know of) have no need for non-ASCII characters, but I'd still be very hesitant to contemplate backtracking on IDNA's applicability by declaring "IDNA doesn't apply to all textual domain labels like we said it did, it applies only to labels conforming to the STD-3 ASCII rules". I have a hard time imagining how we could put that cat back in the bag. Getting consensus on such a change would take months (or forever), judging from the history of the IDN working group. 
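To make the two points in this message concrete (master-file quoting, and the STD-3 ASCII rules not covering every textual label), here is a minimal Python sketch. The names split_master_file_name and is_ldh_label are invented for this illustration and come from no RFC or library. The sketch assumes only the backslash-X and backslash-DDD (decimal) escapes of RFC 1035 master files, ignores length limits, and is not a complete master-file parser.

```python
import re

# Letters, digits, hyphen; no leading or trailing hyphen (the "LDH" shape).
_LDH = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?")
_DIGITS = "0123456789"


def split_master_file_name(text):
    """Split a master-file style name into unescaped labels (hypothetical helper)."""
    labels, current = [], []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\":
            if i + 1 >= len(text):
                raise ValueError("dangling backslash at end of name")
            ddd = text[i + 1:i + 4]
            if len(ddd) == 3 and all(c in _DIGITS for c in ddd):
                # \DDD: three decimal digits naming one octet
                current.append(chr(int(ddd)))
                i += 4
            else:
                # \X: the next character is taken literally
                current.append(text[i + 1])
                i += 2
        elif ch == ".":
            # an unescaped dot ends the current label
            labels.append("".join(current))
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    if current:
        labels.append("".join(current))
    return labels


def is_ldh_label(label):
    """True if the label fits the letters-digits-hyphen restriction."""
    return _LDH.fullmatch(label) is not None


if __name__ == "__main__":
    for name in (r"foo\.bar.example.org.", r"\f\o\o\.\b\a\r.\e\x\am\p\l\e.\o\r\g."):
        print(split_master_file_name(name))
    for label in ("example", "foo.bar", "_ldap", "0/25"):
        print(label, is_ldh_label(label))
```

Both spellings of the example name yield the same three labels, the first of which contains a dot. That is why a function taking a whole domain name would still need the caller to undo context-specific quoting first. The labels _ldap and 0/25 fail the LDH check although they appear in real domain names.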
AMC From owner-ietf-imaa Tue Feb 11 18:41:37 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1C2fb424880 for ietf-imaa-bks; Tue, 11 Feb 2003 18:41:37 -0800 (PST) Received: from pie1.i-dns.net ([203.81.44.31]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1C2fZd24873 for ; Tue, 11 Feb 2003 18:41:35 -0800 (PST) Received: from localhost (localhost [127.0.0.1]) by pie1.i-dns.net (Postfix) with ESMTP id B1992789A7 for ; Wed, 12 Feb 2003 02:41:37 +0000 (GMT) Received: from pie1.i-dns.net ([127.0.0.1]) by localhost (pie1.i-dns.net [127.0.0.1:10024]) (amavisd-new) with SMTP id 56549-06 for ; Wed, 12 Feb 2003 02:41:35 +0000 (GMT) Received: from jeffreyibm (unknown [211.104.147.95]) by pie1.i-dns.net (Postfix) with SMTP id A75317886F for ; Wed, 12 Feb 2003 02:41:30 +0000 (GMT) Message-ID: <032801c2d240$89cb09c0$fc00a8c0@jeffreyibm> From: "jeffrey" To: "'IETF IMAA list'" References: <3CD14E451751BD42BA48AAA50B07BAD603370662@vsvapostal3.prod.netsol.com> Subject: Re: Case sensitivity on the LHS Date: Wed, 12 Feb 2003 11:43:30 +0900 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2800.1106 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1106 X-Virus-Scanned: by amavisd-new Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: IDNA and IMAA do not redefine the local or the domain labels of 1034/1035 & 2821/2822. They are meant to be definitions of internationalised labels that have an ascii equivalent, hence no updates needs to be made. 
I do a cut and paste from a previous AMC post on the idn wg:

+----------------------------+
|  internationalized labels  |
|                            |
|     +----------------+     |
|     |  ASCII labels  |     |
|     |                |     |
|     |   +--------+   |     |
|     |   |  ACE   |   |     |
|     |   | labels |   |     |
|     |   +--------+   |     |
|     +----------------+     |
+----------------------------+

> Asking a different way: would these new features (or something similar) have > been included in 1034/1035 or 2821/2822 if internationalization was > considered when the earlier specifications were being written? I'm under the impression that the result of the current idn efforts are meant to be transitory ( ? ), in keeping with the timeliness criteria of the wg. I suspect if these rfcs were written with i18n in mind, the results would be different. jeffrey ----- Original Message ----- From: "Hollenbeck, Scott" To: "'IETF IMAA list'" Sent: Tuesday, February 11, 2003 9:27 PM Subject: RE: Case sensitivity on the LHS > > > "Hollenbeck, Scott" wrote: > > > > > Is this document intended to be a formal update to 2821 and 2822? > > > > Is IDNA a formal update to RFCs 1034 and 1035? IMAA would bear a > > similar relation to RFCs 821, 822, 2821, 2822. > > Maybe yours is a rhetorical question since you probably know the answer (I > don't because IDNA hasn't yet been published as an RFC), but here's why I > asked mine: someone implementing 1034/1035 or 2821/2822 (which, by the way, > obsolete 821 and 822) might not necessarily know of the new features defined > by IDNA and IMAA since there are no "update" references maintained by the > RFC Editor. I suspect that it would probably be a good idea to have IDNA > update 1034/1035, and likewise it's probably a good idea to have IMAA update > 2821/2822 so that implementers see a clear relationship between the > specifications. > > > > Both (2821 section 4.1.2 and 2822 section 3.4.1) contain formal > > > definitions of the local part of an email address.
> > > > IMAA does not redefine local part, but instead defines a new term, > > internationalized local part. Just as IDNA does not redefine domain > > label, but instead defines a new term, internationalized domain label. > > These new terms won't necessarily be known to implementers of the earlier > specifications that IDNA and IMAA build upon. If new features with new > processing rules are being defined, why not give RFC readers a clear pointer > to the specifications that describe the new features? > > Asking a different way: would these new features (or something similar) have > been included in 1034/1035 or 2821/2822 if internationalization was > considered when the earlier specifications were being written? While we'll > never know for sure, I'm suggesting that the possibility of a "yes" answer > implies that IDNA and IMAA _should_ update the earlier specifications. > > -Scott- > From owner-ietf-imaa Wed Feb 12 00:38:20 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1C8cK717926 for ietf-imaa-bks; Wed, 12 Feb 2003 00:38:20 -0800 (PST) Received: from bs.jck.com (ns.jck.com [209.187.148.211]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1C8cId17918 for ; Wed, 12 Feb 2003 00:38:19 -0800 (PST) Received: from [209.187.148.215] (helo=p3.JCK.COM) by bs.jck.com with esmtp (Exim 4.10) id 18isPK-0008QN-00 for ietf-imaa@imc.org; Wed, 12 Feb 2003 03:38:18 -0500 Date: Wed, 12 Feb 2003 03:38:18 -0500 From: John C Klensin To: IETF IMAA list Subject: Re: Compatibility with IDNA Message-ID: <201402821.1045021098@p3.JCK.COM> In-Reply-To: <20030212023030.GA28984@nicemice.net> References: <20030211023401.GE16359@nicemice.net> <8fd5$Jh3cDD@3247.org> <8f$3A5e3cDD@3247.org> <20030211023401.GE16359@nicemice.net> <20030212023030.GA28984@nicemice.net> X-Mailer: Mulberry/3.0.0 (Win32) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline Sender: 
owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: --On Wednesday, 12 February, 2003 02:30 +0000 "Adam M. Costello" wrote: >> The real question is how to deal with the minimum quoting >> required by RFC 2821. Is that considered part of the email >> address? For example, how is that quoting handled if such an >> email address is included as a DNS label? Do MTAs match the >> email address ``"joe user"@example.com'' against the login >> name ``joe user'' or agains ``"joe user"''? > > I was wondering the same thing myself this morning. You can > also ask the question in the other direction. If I find > > "foo".example.org. > > in an SOA record, and I want to send mail there, do I need to > compose the To: field like this: > > "\"foo\""@example.org Adam, The "only the receiving MTA gets to mess with the local-part" rule has been historically interpreted _very_ strictly and bad things have happened when it isn't. The general intent is that ''joe user'' and ''"joe user"'' be treated as equal and that ''foo'' and ''\"foo\"'' be equivalent as well, although, in the ''\"foo\"'' case, the minimal quoting rule is violated. However, the specifications very carefully avoid the assumption that a mailbox name bears any relationship to a login name. Some users, systems, and administrators find that relationships convenient. At the other extreme, some believe that having a mailbox name match the user name is an unnecessary and undesirable disclosure of information that puts important information into the hands of potential crackers and they simply won't permit it. So one answer would be that the question "which form matches the user name" is irrelevant; the only important question is "which form the receiving/delivery MTA will interprets as matching the internal mailbox (or maildrop) name". There is a second principle, which is that mailbox names, unlike most traditional DNS strings, get really close to user command-level interfaces. 
And command interfaces have a history of mucking up quoting conventions in a big way. Different operating systems foul up things in different ways, just to make things interesting. People who write code for the Internet email environment have discovered, after years and years of abuse of the system, a need to get really conservative about anything they actually want to have delivered. Smart email administrators tend to avoid configuring "joe user" as a mailbox name, or make sure that "joe.user", or something else that doesn't require quoting, is supported as a recommended alias. Similarly, despite the fact that the SOA record mailbox form joe\.user.some.domain is perfectly well defined as equivalent to joe.user@some.domain, folks who are more interested in making sure that the domain admin mailbox can be contacted than they are in demonstrating how much they know about the DNS usually set up names or aliases to avoid having to deal with periods in the local part. And receiving/delivery MTAs (or the associated alias mechanisms) written by people with a strong "the mail must go through if I can possibly figure out what was intended" mentality are usually configured so that joe user "joe user" joe\ user "joe\ user" and even 'joe user' and maybe even 'joe user" """"joe user" and "\"joe user" and all of their case variants, end up pointing to the same maildrop. That is either the robustness principle carried to one of its extremes or just good sense. But nothing requires that all of those cases be treated the same, any more than anything requires case-matching. Consequently, a sending/originating MUA that makes strong assumptions about how the delivery MTA is going to interpret local-parts will, at best, violate the protocols and periodically end up with undeliverable mail or, at worst, do fairly severe violence to the email environment. 
john From owner-ietf-imaa Wed Feb 12 01:09:01 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1C991424946 for ietf-imaa-bks; Wed, 12 Feb 2003 01:09:01 -0800 (PST) Received: from smtp.denic.de (smtp.denic.de [194.246.96.22]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1C98xd24938 for ; Wed, 12 Feb 2003 01:08:59 -0800 (PST) Received: from notes.denic.de (denics15.denic.de [194.246.96.18]) by smtp.denic.de with esmtp id 18ist0-0006Gc-00; Wed, 12 Feb 2003 10:08:59 +0100 Subject: Re: Re: Case sensitivity on the LHS To: Marc Mutz Cc: ietf-imaa@imc.org X-Mailer: Lotus Notes Release 5.0.6a January 17, 2001 Message-ID: From: "Marcos Sanz/Denic" Date: Wed, 12 Feb 2003 10:09:59 +0100 X-MIMETrack: Serialize by Router on notes/Denic(Release 5.0.11 |July 24, 2002) at 12.02.2003 10:08:59 MIME-Version: 1.0 Content-type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by above.proper.com id h1C990d24941 Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: On 11.02.2003 19:47 Marc Mutz wrote: > > As an example (in addition to the ones already given by others), > consider German "Maße" (measures). The sz ligature is a character that > only exists in lower case form. I don't know what lead IDNA to fold > that to "ss", but I think that this is a bug in IDNA and should be > avoided by all means in IMAA. That is completely right. Maybe should we create a new profile for stringprep that overwrites the mapping tables of RFC3454? 
Regards, Marcos Sanz From owner-ietf-imaa Wed Feb 12 02:32:51 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1CAWp505075 for ietf-imaa-bks; Wed, 12 Feb 2003 02:32:51 -0800 (PST) Received: from nicemice.net (arwen.CS.Berkeley.EDU [128.32.132.165]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1CAWod05071 for ; Wed, 12 Feb 2003 02:32:50 -0800 (PST) Received: from amc by nicemice.net with local (Exim 3.35 #1 (Debian)) id 18iuCB-0000Od-00 for ; Wed, 12 Feb 2003 02:32:51 -0800 Date: Wed, 12 Feb 2003 10:32:51 +0000 From: "Adam M. Costello" To: IETF IMAA list Subject: Re: Compatibility with IDNA Message-ID: <20030212103251.GC1140@nicemice.net> Reply-To: IETF IMAA list References: <20030211023401.GE16359@nicemice.net> <8fd5$Jh3cDD@3247.org> <8f$3A5e3cDD@3247.org> <20030211023401.GE16359@nicemice.net> <20030212023030.GA28984@nicemice.net> <201402821.1045021098@p3.JCK.COM> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <201402821.1045021098@p3.JCK.COM> User-Agent: Mutt/1.4i Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: John C Klensin wrote: > The "only the receiving MTA gets to mess with the local-part" > rule has been historically interpreted _very_ strictly and bad > things have happened when it isn't. The general intent is that > > ''joe user'' and ''"joe user"'' > be treated as equal and that > ''foo'' and ''\"foo\"'' > be equivalent as well, although, in the ''\"foo\"'' case, the > minimal quoting rule is violated. Okay, but I don't know how to use that principle to answer my questions. Let me ask them again: Suppose I am told to create an SOA record that will cause people to send mail like so: To: "joe:user"@example.org What is the most correct thing for me to put in the DNS master file? joe:user.example.org. "joe:user".example.org. \"joe:user\".example.org. 
The first two mean exactly the same thing (they cause the same DNS protocol message to be sent), so they are equally right or wrong. In the other direction, suppose I encounter an SOA record containing \"joe:user\".example.org. What is the most correct To: field I should construct to send mail there? To: "joe:user"@example.org To: "\"joe:user\""@example.org I don't dispute that it would be foolish to actually put such addresses in SOA records, I'm just trying to understand the intention of relevant standards. As far as I can tell, the issue wasn't really addressed. AMC From owner-ietf-imaa Wed Feb 12 11:23:12 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1CJNCN02980 for ietf-imaa-bks; Wed, 12 Feb 2003 11:23:12 -0800 (PST) Received: from m3001.hostcentric.net (m3001.hostcentric.net [216.157.79.237]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1CJNBd02976 for ; Wed, 12 Feb 2003 11:23:11 -0800 (PST) Received: (qmail 23985 invoked by alias); 12 Feb 2003 19:22:56 -0000 Received: from unknown (HELO DAVIS1) (32.97.110.142) by 0 with SMTP; 12 Feb 2003 19:22:56 -0000 Message-ID: <002a01c2d2cc$1f565c70$6dde2b09@DAVIS1> From: "Mark Davis" To: "Marc Mutz" , , "Martin Duerst" References: <200302111947.30700@sendmail.mutz.com> <4.2.0.58.J.20030211162348.0479ea38@localhost> Subject: Re: Case sensitivity on the LHS Date: Wed, 12 Feb 2003 11:22:44 -0800 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2800.1106 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1106 Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: [using sz for the German sharp-s] But the uppercase of "Masze" is "MASSE". So for case-insensitivity, you need to treat these equivalently. If someone types in either Masze@foo.com or MASSE@foo.com, they should get the same result. 
However, the same is true for "Masse" and "MASSE", so that tosses Masse@foo.com into the hopper. So for it to be a well-defined equivalence relation, {ss, SS, sz} have to be in the same class. Mark ________ mark.davis@jtcsv.com IBM, MS 50-2/B11, 5600 Cottle Rd, SJ CA 95193 (408) 256-3148 fax: (408) 256-0799 ----- Original Message ----- From: "Martin Duerst" To: "Mark Davis" ; "Marc Mutz" ; Sent: Tuesday, February 11, 2003 13:27 Subject: Re: Case sensitivity on the LHS > > [using sz for the German sharp-s] > > At 11:24 03/02/11 -0800, Mark Davis wrote: > > >The reason that those are case folded is that the uppercase of Masze is > >MASSE, which is the same as the uppercase of Masse. So for it to be a > >well-defined equivalence relation, {ss, SS, sz} have to be in the same class. > > Well, this is only the case if you want a full "round-trip" equivalence. > For the purpose of IDNA and IMAA, it would have been possible to define > an equivalence with two classes ({ss, SS}, {sz}) without actual problems, > because IDNA maps to lowercase. > (The only exception being for people trying to input words with sz > in all-uppercase, which they won't do anyway.) > > But given that IDNA has taken the decision it has, I don't think it's > worth to create a whole new table with all the associated confusion > for IMAA just for this little tweak. > > Regards, Martin. 
> From owner-ietf-imaa Wed Feb 12 11:42:29 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1CJgTe03470 for ietf-imaa-bks; Wed, 12 Feb 2003 11:42:29 -0800 (PST) Received: from mailgen2.internet.gouv.qc.ca (courrier4.internet.gouv.qc.ca [192.197.162.9] (may be forged)) by above.proper.com (8.11.6/8.11.3) with SMTP id h1CJgSd03465 for ; Wed, 12 Feb 2003 11:42:28 -0800 (PST) Received: (qmail 23962 invoked from network); 12 Feb 2003 19:42:22 -0000 Received: from unknown (HELO p295.sct1.gouv.qc.ca) (142.213.85.49) by mailgen2.internet.gouv.qc.ca with SMTP; 12 Feb 2003 19:42:22 -0000 Message-Id: <5.0.2.1.2.20030212143718.00a96fa8@entree.sct1.gouv.qc.ca> X-Sender: alabonte@entree.sct1.gouv.qc.ca X-Mailer: QUALCOMM Windows Eudora Version 5.0.2 Date: Wed, 12 Feb 2003 14:42:24 -0500 To: "Mark Davis" , "Simon Josefsson" , "IETF IMAA list" From: =?iso-8859-1?Q?Alain_LaBont=E9?= Subject: Re: Case sensitivity on the LHS Cc: In-Reply-To: <003d01c2d1df$6cedef40$7300a8c0@DAVIS1> References: <20030211014413.GD16359@nicemice.net> <8f$3A$+JcDD@3247.org> <3CD14E451751BD42BA48AAA50B07BAD60337064E@vsvapostal3.prod.netsol.com> <20030211014413.GD16359@nicemice.net> <5.0.2.1.2.20030211083406.00a95b50@entree.sct1.gouv.qc.ca> Mime-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1"; format=flowed Content-Transfer-Encoding: 8bit Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: A 07:08 2003-02-11 -0800, Mark Davis a écrit : > > I'm not sure what kind of mapping is done by NFKC... but I suspect the > > kind of problems that may ocur. > > > > For example, if accents are removed from Latin letters once entered >... > >NFKC does not remove the accents from Latin letters. If you are going to >comment on NFKC, pro or contra, you should comment on what it does, rather >than on simple speculation as to what it does. The formal results are in UAX >#15 on the Unicode site. 
There is a chart that shows the effects on >characters on http://www.unicode.org/charts/normalization/. For example, if >you look at Latin characters on >http://www.unicode.org/charts/normalization/chart_Latin.html you will see >that "ô" remains as "ô" in NFKC. [Alain] Good (and thanks for the reference). But I wanted to say that even if having email addresses including accented latin letters is a "must have" (I look forward to seeing my address as Alain.LaBonté@abc.com), we need to have a way to allow those who can't enter accented letters to be able to access the same email address (on, say, a US keyboard [or a Japanese one with Romaji support], in typing Alain.LaBonte@abc.com). One way would be to have email aliases, but is it the best way? Will those ewith accented names have the burden of being sure to have aliases all the time to do this? Alain LaBonté Québec From owner-ietf-imaa Wed Feb 12 11:52:53 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1CJqrl04670 for ietf-imaa-bks; Wed, 12 Feb 2003 11:52:53 -0800 (PST) Received: from mailgen2.internet.gouv.qc.ca (inet-cou2.gouv.qc.ca [192.197.162.9]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1CJqqd04665 for ; Wed, 12 Feb 2003 11:52:52 -0800 (PST) Received: (qmail 3593 invoked from network); 12 Feb 2003 19:52:46 -0000 Received: from unknown (HELO p295.sct1.gouv.qc.ca) (142.213.85.49) by mailgen2.internet.gouv.qc.ca with SMTP; 12 Feb 2003 19:52:46 -0000 Message-Id: <5.0.2.1.2.20030212144247.05326e38@entree.sct1.gouv.qc.ca> X-Sender: alabonte@entree.sct1.gouv.qc.ca X-Mailer: QUALCOMM Windows Eudora Version 5.0.2 Date: Wed, 12 Feb 2003 14:52:49 -0500 To: "Mark Davis" , "IETF IMAA list" From: =?iso-8859-1?Q?Alain_LaBont=E9?= Subject: Re: Case sensitivity on the LHS In-Reply-To: <008a01c2d1f3$cb36f5b0$7300a8c0@DAVIS1> References: <20030211014413.GD16359@nicemice.net> <20030211025714.GF16359@nicemice.net> Mime-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1"; 
format=flowed Content-Transfer-Encoding: 8bit Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: A 09:34 2003-02-11 -0800, Mark Davis a écrit : >1. From a practical point of view, I personally very much in favor of >case-insensitive names. Me too. And I would add that ideally, the behaviour in reaching an email address would be optimum with accent-insensitivity as well. A bit like searching with Google or Altavista. Unaccented-affected-Latin-letter keyword search requests will reach all accented targets. So should email addresses behave. Of course this makes a special case for the Latin alphabet, but this is a ransom of having had the Latin alphabet implemented on all keyboards of the world, but only implemented for the basic historical letters, unaccented. That situation is realistically going to last for a while. Alain LaBonté Québec From owner-ietf-imaa Wed Feb 12 11:58:58 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1CJwwh04792 for ietf-imaa-bks; Wed, 12 Feb 2003 11:58:58 -0800 (PST) Received: from mailgen2.internet.gouv.qc.ca (inet-cou2.gouv.qc.ca [192.197.162.9]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1CJwvd04788 for ; Wed, 12 Feb 2003 11:58:57 -0800 (PST) Received: (qmail 443 invoked from network); 12 Feb 2003 19:58:52 -0000 Received: from unknown (HELO p295.sct1.gouv.qc.ca) (142.213.85.49) by mailgen2.internet.gouv.qc.ca with SMTP; 12 Feb 2003 19:58:52 -0000 Message-Id: <5.0.2.1.2.20030212145756.06d781c0@entree.sct1.gouv.qc.ca> X-Sender: alabonte@entree.sct1.gouv.qc.ca X-Mailer: QUALCOMM Windows Eudora Version 5.0.2 Date: Wed, 12 Feb 2003 14:58:54 -0500 To: "Adam M. 
Costello" , IETF IMAA list From: =?iso-8859-1?Q?Alain_LaBont=E9?= Subject: Re: Case sensitivity on the LHS In-Reply-To: <20030212012740.GB27754@nicemice.net> References: <5.0.2.1.2.20030211083406.00a95b50@entree.sct1.gouv.qc.ca> <20030211014413.GD16359@nicemice.net> <8f$3A$+JcDD@3247.org> <3CD14E451751BD42BA48AAA50B07BAD60337064E@vsvapostal3.prod.netsol.com> <20030211014413.GD16359@nicemice.net> <5.0.2.1.2.20030211083406.00a95b50@entree.sct1.gouv.qc.ca> Mime-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1"; format=flowed Content-Transfer-Encoding: 8bit Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: A 01:27 2003-02-12 +0000, Adam M. Costello a écrit : >Alain LaBonté wrote: > > > Could somebody summarize what is the actual behaviour of NFKC for me? > >NFKC normalizes the representation of strings that are "compatibly >equivalent". There are two kinds of equivalence: canonical equivalence, >and compatible equivalence. Any strings that are canonically equivalent >are also compatibly equivalent. [Alain] Thanks very much, I appreciate such an explanatory answer, it is what is required so that everybody understands the same thing. 
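The behaviour of NFKC described in this exchange can be checked with Python's standard unicodedata module. This is only a small sketch: the helper name nfkc and the sample strings are chosen for illustration and do not come from the thread or from any draft.

```python
import unicodedata


def nfkc(text):
    """Return the NFKC form of text (thin wrapper, name chosen for this sketch)."""
    return unicodedata.normalize("NFKC", text)


samples = {
    "precomposed o-circumflex": "\u00f4",
    "o + combining circumflex": "o\u0302",  # canonically equivalent to U+00F4
    "fi ligature": "\ufb01",                # compatibly equivalent to "fi"
    "fullwidth A": "\uff21",                # compatibly equivalent to "A"
}

for name, s in samples.items():
    before = [hex(ord(c)) for c in s]
    after = [hex(ord(c)) for c in nfkc(s)]
    print(name, before, "->", after)
```

The accented letter survives normalization: the precomposed form is unchanged and the decomposed form is composed into it. Only compatibility variants such as the ligature and the fullwidth letter are replaced by their plain counterparts.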
Alain LaBonté Québec From owner-ietf-imaa Wed Feb 12 12:09:10 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1CK9AT05137 for ietf-imaa-bks; Wed, 12 Feb 2003 12:09:10 -0800 (PST) Received: from mail.uni-bielefeld.de (IDENT:72@mail2.uni-bielefeld.de [129.70.4.90]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1CK98d05133 for ; Wed, 12 Feb 2003 12:09:08 -0800 (PST) Received: from 192.168.0.17 (ppp36-228.hrz.uni-bielefeld.de [129.70.36.228]) by mail.uni-bielefeld.de (Sun Internet Mail Server sims.4.0.2000.10.12.16.25.p8) with ESMTP id <0HA700J0IPB0SH@mail.uni-bielefeld.de> for ietf-imaa@imc.org; Wed, 12 Feb 2003 21:09:07 +0100 (MET) Date: Wed, 12 Feb 2003 20:59:14 +0100 From: Marc Mutz Subject: Re: Case sensitivity on the LHS In-reply-to: <002a01c2d2cc$1f565c70$6dde2b09@DAVIS1> To: ietf-imaa@imc.org Cc: Mark Davis Message-id: <200302122059.41792@sendmail.mutz.com> Organization: KDE MIME-version: 1.0 Content-type: multipart/signed; protocol="application/pgp-signature"; micalg=pgp-sha1; boundary="Boundary-02=_teqS+CWcnNzGiNa"; charset="iso-8859-1" Content-transfer-encoding: 7bit User-Agent: KMail/1.5.9 X-PGP-Key: 0xBDBFE838 References: <4.2.0.58.J.20030211162348.0479ea38@localhost> <002a01c2d2cc$1f565c70$6dde2b09@DAVIS1> Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: --Boundary-02=_teqS+CWcnNzGiNa Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Content-Description: signed data Content-Disposition: inline On Wednesday 12 February 2003 20:22, Mark Davis wrote: > But the uppercase of "Masze" is "MASSE". So for case-insensitivity, > you need to treat these equivalently. No, not at all. If you fold to the lower-case versions of characters, then SS->ss, not SS->ß, so ß could/should be a class of its own. > If someone types in either > Masze@foo.com or MASSE@foo.com, they should get the same result.
Nobody would enter "MASSE" if asked to enter "Maße" and ß was available on the keyboard. Even with caps lock enabled, they'd enter "MAßE". Mapping ß->ss is like mapping ä->ae. As "Maße" vs. "Masse" shows, ß is a letter in its own right in German; it just happens to have the tradition of never appearing at the beginning of a word, so no one ever bothered to define what its upper-case version should look like... Also, the uppercase-is-SS trick works reasonably well, but not perfectly. E.g., if I write NUSS, then there's no ambiguity. It's uppercase "Nuß" (nut). [ In this particular instance, the usage of ß was a bug in the language that was fixed with the new orthographic rules a few years back. Nuss is pronounced with a short u, while Nuß would be pronounced with a long u. So basically, with the modern orthographic rules, the only uses of ß that are left are those where "ss" would not fit the pronunciation, which is another argument for keeping "ss" and ß separate. ] > However, the same is true for "Masse" and "MASSE", so that tosses > Masse@foo.com into the hopper. So for it to be a well-defined > equivalence relation, {ss, SS, sz} have to be in the same class. That ambiguity is probably why, in Austria, some people use(d?) SZ as the upper-case version of ß (enough so that a mid-80's dictionary I have here mentions it). Why was the class then not defined to be { ss, SS, ß, sz, SZ }? Marc -- The illegal we do immediately. The unconstitutional takes a bit longer.
-- Henry Kissinger --Boundary-02=_teqS+CWcnNzGiNa Content-Type: application/pgp-signature Content-Description: signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.1 (GNU/Linux) iD8DBQA+Sqet3oWD+L2/6DgRAlAsAJ0XNcdb94vDJ720S1MWq2xkfy3bNwCgkdFe AuaDWfHL3q/d/Pd4Sg3LZ1A= =mPEX -----END PGP SIGNATURE----- --Boundary-02=_teqS+CWcnNzGiNa-- From owner-ietf-imaa Wed Feb 12 12:21:23 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1CKLNS05605 for ietf-imaa-bks; Wed, 12 Feb 2003 12:21:23 -0800 (PST) Received: from mail.reutershealth.com ([65.246.141.36]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1CKLLd05601 for ; Wed, 12 Feb 2003 12:21:21 -0800 (PST) Received: from skunk.reutershealth.com (mail [65.246.141.36]) by mail.reutershealth.com (Pro-8.9.3/Pro-8.9.3) with SMTP id PAA13404; Wed, 12 Feb 2003 15:18:33 -0500 (EST) Message-Id: <200302122018.PAA13404@mail.reutershealth.com> Received: by skunk.reutershealth.com (sSMTP sendmail emulation); Wed, 12 Feb 2003 15:20:56 -0500 From: John Cowan Subject: Re: Case sensitivity on the LHS To: mutz@kde.org (Marc Mutz) Date: Wed, 12 Feb 2003 15:20:56 -0500 (EST) Cc: ietf-imaa@imc.org, mark.davis@jtcsv.com (Mark Davis) In-Reply-To: <200302122059.41792@sendmail.mutz.com> from "Marc Mutz" at Feb 12, 2003 08:59:14 PM X-Mailer: ELM [version 2.5 PL6] MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Marc Mutz scripsit: > That ambiguity is probably why, in Austria, some people use(d?) SZ as=20 > the upper-case version of =DF (enough so that a mid-80's dictionary I=20 > have here mentions it). Why was the class then not defined to be > { ss, SS, =DF, sz, SZ }? Indeed, when learning to type in the mid-70s, I was taught to type not only "SZ" but "sz" as well when using an English-language typewriter. -- Values of beeta will give rise to dom! 
John Cowan --mv, Unix 6th edition jcowan@reutershealth.com (http://cm.bell-labs.com/cm/cs/who/dmr/odd.html) From owner-ietf-imaa Wed Feb 12 12:33:29 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1CKXTt05829 for ietf-imaa-bks; Wed, 12 Feb 2003 12:33:29 -0800 (PST) Received: from tux.w3.org (IDENT:root@tux.w3.org [18.29.0.27]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1CKXRd05821 for ; Wed, 12 Feb 2003 12:33:27 -0800 (PST) Received: from enoshima (IDENT:root@tux.w3.org [18.29.0.27]) by tux.w3.org (8.9.3/8.9.3) with ESMTP id PAA05210; Wed, 12 Feb 2003 15:33:24 -0500 Message-Id: <4.2.0.58.J.20030212143823.05074340@localhost> X-Sender: duerst@localhost X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J Date: Wed, 12 Feb 2003 14:50:53 -0500 To: "Mark Davis" , "Marc Mutz" , From: Martin Duerst Subject: Re: Case sensitivity on the LHS In-Reply-To: <002a01c2d2cc$1f565c70$6dde2b09@DAVIS1> References: <200302111947.30700@sendmail.mutz.com> <4.2.0.58.J.20030211162348.0479ea38@localhost> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 11:22 03/02/12 -0800, Mark Davis wrote: >[using sz for the German sharp-s] > >But the uppercase of "Masze" is "MASSE". So for case-insensitivity, you need >to treat these equivalently. If someone types in either Masze@foo.com or >MASSE@foo.com, they should get the same result. However, the same is true >for "Masse" and "MASSE", so that tosses Masse@foo.com into the hopper. So >for it to be a well-defined equivalence relation, {ss, SS, sz} have to be in >the same class. One could use a very similar argument for getting rid of the accents in French. The upper case of "e'te'" is (for many purposes) "ETE". So that would then put {e, e', E, E'} all in the same class. [There are of course some differences, such as that the upper case E' actually exists and is sometimes used.] 
The question, from a user's perspective, is "Is it more important to have MASSE be the same as masze, or is it more important to have masse and masze be different?". The answer to this question, in the case of things such as domain names (mostly lower case, false positives not allowed), would be "the latter". The answer is different in a typical search-engine context (false positives allowed). Applying upper-case and lower-case operations until one gets a result that doesn't change anymore is not the only way to obtain (well-defined) equivalence classes. And stringprep/nameprep doesn't actually need equivalence classes, it only needs a 'toLower' operation. What is most important is to make sure that user expectations within the application at hand are met as well as possible. Anyway, because IDNA is now defined the way it is, I don't think it is worth doing something different for IMAA. For German users to have to learn that they can use sz on the left hand side, but not on the right hand is not really a good solution. Regards, Martin.
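A minimal sketch of the distinction drawn in the message above, assuming Python's built-in `str.casefold()` (Unicode full case folding) and `str.lower()` as stand-ins for "case folding" and a plain 'toLower' operation. The helper names `fold_classes` and `lower_classes` are hypothetical, and the real ß is used where the thread writes "sz" as an ASCII stand-in. This illustrates the trade-off only; it is not what Nameprep or any IMAA profile specifies.

```python
def fold_classes(words):
    """Group words by Unicode full case folding (caseless matching)."""
    classes = {}
    for w in words:
        classes.setdefault(w.casefold(), []).append(w)
    return classes


def lower_classes(words):
    """Group words by a plain lowercase mapping (a 'toLower' operation)."""
    classes = {}
    for w in words:
        classes.setdefault(w.lower(), []).append(w)
    return classes


words = ["Masse", "MASSE", "Maße"]

# Case folding maps ß to "ss", so all three spellings land in one class.
print(fold_classes(words))

# Plain lowercasing leaves ß alone, so "Maße" stays in a class of its own.
print(lower_classes(words))

# Uppercasing is what ties the two together: the uppercase of "Maße" is "MASSE".
print("Maße".upper())
```

Under these assumptions, folding gives the "MASSE equals Maße" behaviour, while lowercasing gives the "masse differs from maße" behaviour; which one is wanted depends on whether false positives are acceptable in the application.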
From owner-ietf-imaa Wed Feb 12 12:33:26 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1CKXQU05817 for ietf-imaa-bks; Wed, 12 Feb 2003 12:33:26 -0800 (PST) Received: from tux.w3.org (IDENT:root@tux.w3.org [18.29.0.27]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1CKXOd05812 for ; Wed, 12 Feb 2003 12:33:25 -0800 (PST) Received: from enoshima (IDENT:root@tux.w3.org [18.29.0.27]) by tux.w3.org (8.9.3/8.9.3) with ESMTP id PAA05218; Wed, 12 Feb 2003 15:33:25 -0500 Message-Id: <4.2.0.58.J.20030212145155.05a0dd50@localhost> X-Sender: duerst@localhost X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J Date: Wed, 12 Feb 2003 14:55:08 -0500 To: Alain LaBonte , "IETF IMAA list" From: Martin Duerst Subject: Re: Case sensitivity on the LHS Cc: In-Reply-To: <5.0.2.1.2.20030212143718.00a96fa8@entree.sct1.gouv.qc.ca> References: <003d01c2d1df$6cedef40$7300a8c0@DAVIS1> <20030211014413.GD16359@nicemice.net> <8f$3A$+JcDD@3247.org> <3CD14E451751BD42BA48AAA50B07BAD60337064E@vsvapostal3.prod.netsol.com> <20030211014413.GD16359@nicemice.net> <5.0.2.1.2.20030211083406.00a95b50@entree.sct1.gouv.qc.ca> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 14:42 03/02/12 -0500, Alain LaBont wrote: >[Alain] Good (and thanks for the reference). But I wanted to say that >even if having email addresses including accented latin letters is a "must >have" (I look forward to seeing my address as Alain.LaBonte'@abc.com), we >need to have a way to allow those who can't enter accented letters to be >able to access the same email address (on, say, a US keyboard [or a >Japanese one with Romaji support], in typing Alain.LaBonte@abc.com). > >One way would be to have email aliases, but is it the best way? Will those >ewith accented names have the burden of being sure to have aliases all the >time to do this? 
I would have to say, unfortunately, yes. ASCII-only equivalents are not something that can be generated mechanically. For some languages, that may be possible, but not for others (in particular languages not written with the Latin script), and not for the collection of all languages together. Regards, Martin. From owner-ietf-imaa Wed Feb 12 12:48:46 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1CKmkB06081 for ietf-imaa-bks; Wed, 12 Feb 2003 12:48:46 -0800 (PST) Received: from relay-1m.club-internet.fr (relay-1m.club-internet.fr [194.158.104.40]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1CKmjd06077 for ; Wed, 12 Feb 2003 12:48:45 -0800 (PST) Received: from mine.club-internet.fr (f06a-4-36.d1.club-internet.fr [212.194.123.36]) by relay-1m.club-internet.fr (Postfix) with ESMTP id 083811755 for ; Wed, 12 Feb 2003 21:49:04 +0100 (CET) Message-Id: <5.2.0.9.0.20030212215301.02cc21b0@mail.club-internet.fr> X-Sender: jefsey@mail.club-internet.fr X-Mailer: QUALCOMM Windows Eudora Version 5.2.0.9 Date: Wed, 12 Feb 2003 21:54:29 +0100 To: "IETF IMAA list" From: "J-F C. (Jefsey) Morfin" Subject: Re: Case sensitivity on the LHS In-Reply-To: <5.0.2.1.2.20030212144247.05326e38@entree.sct1.gouv.qc.ca> References: <008a01c2d1f3$cb36f5b0$7300a8c0@DAVIS1> <20030211014413.GD16359@nicemice.net> <20030211025714.GF16359@nicemice.net> Mime-Version: 1.0 Content-Type: multipart/mixed; x-avg-checked=avg-ok-51C62536; boundary="=======592A1CA4=======" Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: --=======592A1CA4======= Content-Type: text/plain; x-avg-checked=avg-ok-51C62536; charset=iso-8859-1; format=flowed Content-Transfer-Encoding: 8bit At 20:52 12/02/03, Alain LaBonté wrote: >A 09:34 2003-02-11 -0800, Mark Davis a écrit : >>1. From a practical point of view, I personally very much in favor of >>case-insensitive names. >Me too. 
>And I would add that ideally, the behaviour in reaching an email address >would be optimum with accent-insensitivity as well. >A bit like searching with Google or Altavista. >Unaccented-affected-Latin-letter keyword search requests will reach all >accented targets. So should email addresses behave. From reading all the mails in here it appears that - from a user point of view and seemingly to respect the RFCs - [optional] case sensitivity is a need. If we take the user's perspective, we will want consistent support of the whole URL on the left of the "@" and on the right of the "?". Case insensitivity can only be used when the receiving end has accepted it (the case of small devices as I noted before). Otherwise it seems there is a real need for sensitivity. We are here considering only *existing* cases, but if languages use upper and lower case, it is to extend their possibilities. A *new* development should not *reduce* the existing possibilities even if they are not *yet* used. The more people use the network, the more they will want what they have everywhere else. --=======592A1CA4======= Content-Type: text/plain; charset=us-ascii; x-avg=cert; x-avg-checked=avg-ok-51C62536 Content-Disposition: inline --- Outgoing mail is certified Virus Free. Checked by AVG anti-virus system (http://www.grisoft.com).
Version: 6.0.449 / Virus Database: 251 - Release Date: 27/01/03 --=======592A1CA4=======-- From owner-ietf-imaa Wed Feb 12 13:37:38 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1CLbcS07211 for ietf-imaa-bks; Wed, 12 Feb 2003 13:37:38 -0800 (PST) Received: from m3001.hostcentric.net (m3001.hostcentric.net [216.157.79.237]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1CLbad07205 for ; Wed, 12 Feb 2003 13:37:37 -0800 (PST) Received: (qmail 609 invoked by alias); 12 Feb 2003 21:37:40 -0000 Received: from unknown (HELO DAVIS1) (32.97.110.142) by 0 with SMTP; 12 Feb 2003 21:37:40 -0000 Message-ID: <00b901c2d2de$f1472720$6dde2b09@DAVIS1> From: "Mark Davis" To: "Marc Mutz" , Cc: References: <4.2.0.58.J.20030211162348.0479ea38@localhost> <002a01c2d2cc$1f565c70$6dde2b09@DAVIS1> <200302122059.41792@sendmail.mutz.com> Subject: Re: Case sensitivity on the LHS Date: Wed, 12 Feb 2003 13:37:27 -0800 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 8bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2800.1106 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1106 Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: > No, not at all. If you fold to the lower-case versions of characters, > then SS->ss, not SS->ß, so ß could/should be a class of it's own. But the *purpose* of case folding is to do a caseless match; in other words, the caseless matching drives the folding operation. You look at what a caseless match would produce, then drive the case folding from that. > Mapping ß->ss is like mapping ä->ae. Now it is unfortunate that there is no uppercase version of ess-zed; sometimes you see the character preserved in all uppercase words, more typically it is converted to SS. But it is not like ä vs ae; you would not normally see uppercased words converted so that the umlauts became E. 
> That ambiguity is probably why, in Austria, some people use(d?) SZ This would not be normal German orthography. Mark =================== Marc Mutz wrote: On Wednesday 12 February 2003 20:22, Mark Davis wrote: > But the uppercase of "Masze" is "MASSE". So for case-insensitivity, > you need to treat these equivalently. No, not at all. If you fold to the lower-case versions of characters, then SS->ss, not SS->ß, so ß could/should be a class of its own. > If someone types in either > Masze@foo.com or MASSE@foo.com, they should get the same result. Nobody would enter "MASSE" if asked to enter "Maße" and ß was available on the keyboard. Even with caps lock enabled, they'd enter "MAßE". Mapping ß->ss is like mapping ä->ae. As "Maße" vs. "Masse" shows, ß is a letter in its own right in German, it just happens to have the tradition of never appearing at the beginning of a word, so no-one ever bothered to define what its upper-case version should look like... Also, the uppercase-is-SS trick works reasonably well, but not perfectly. E.g., if I write NUSS, then there's no ambiguity. It's uppercase "Nuß" (nut). [ In this particular instance, the usage of ß was a bug in the language that was fixed with the new orthographic rules a few years back. Nuss is pronounced with a short u, while Nuß would be pronounced with a long u. So basically, with the modern orthographic rules, the only uses of ß that are left are those where "ss" would not fit the pronunciation, which is another argument for keeping "ss" and ß separate. ] > However, the same is true for "Masse" and "MASSE", so that tosses > Masse@foo.com into the hopper. So for it to be a well-defined > equivalence relation, {ss, SS, sz} have to be in the same class. That ambiguity is probably why, in Austria, some people use(d?) SZ as the upper-case version of ß (enough so that a mid-80's dictionary I have here mentions it). Why was the class then not defined to be { ss, SS, ß, sz, SZ }? Marc -- The illegal we do immediately.
The unconstitutional takes a bit longer. -- Henry Kissinger Mark ________ mark.davis@jtcsv.com IBM, MS 50-2/B11, 5600 Cottle Rd, SJ CA 95193 (408) 256-3148 fax: (408) 256-0799 ----- Original Message ----- From: "Marc Mutz" To: Cc: "Mark Davis" Sent: Wednesday, February 12, 2003 11:59 Subject: Re: Case sensitivity on the LHS From owner-ietf-imaa Wed Feb 12 15:09:11 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1CN9Be09877 for ietf-imaa-bks; Wed, 12 Feb 2003 15:09:11 -0800 (PST) Received: from nicemice.net (arwen.CS.Berkeley.EDU [128.32.132.165]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1CN9Ad09872 for ; Wed, 12 Feb 2003 15:09:10 -0800 (PST) Received: from amc by nicemice.net with local (Exim 3.35 #1 (Debian)) id 18j609-000252-00 for ; Wed, 12 Feb 2003 15:09:13 -0800 Date: Wed, 12 Feb 2003 23:09:13 +0000 From: "Adam M. Costello" To: IETF IMAA list Subject: Re: Case sensitivity on the LHS Message-ID: <20030212230913.GA7477@nicemice.net> Reply-To: IETF IMAA list References: <8f$3A$+JcDD@3247.org> <3CD14E451751BD42BA48AAA50B07BAD60337064E@vsvapostal3.prod.netsol.com> <20030211014413.GD16359@nicemice.net> <5.0.2.1.2.20030211083406.00a95b50@entree.sct1.gouv.qc.ca> <5.0.2.1.2.20030212143718.00a96fa8@entree.sct1.gouv.qc.ca> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <4.2.0.58.J.20030212145155.05a0dd50@localhost> <4.2.0.58.J.20030212143823.05074340@localhost> <5.0.2.1.2.20030212144247.05326e38@entree.sct1.gouv.qc.ca> <5.0.2.1.2.20030212143718.00a96fa8@entree.sct1.gouv.qc.ca> User-Agent: Mutt/1.4i Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Alain LaBonté wrote: > even if having email addresses including accented latin letters > is a "must have" (I look forward to seeing my address as > Alain.LaBonté@abc.com), we need to have a way to allow those who can't > enter accented letters to be 
able to access the same email address > (on, say, a US keyboard [or a Japanese one with Romaji support], in > typing Alain.LaBonte@abc.com). > > One way would be to have email aliases, but is it the best way? There are two ways. Either use an alias Alain.LaBonte --> Alain.LaBonté, or tell the sender the ACE form, which would look something like xn##Alain.LaBont-meb (of course we have yet to work out the details). > ideally, the behaviour in reaching an email address would be optimum > with accent-insensitivity as well. > > A bit like searching with Google or Altavista. Email addresses are not a directory or a search service. They are identifiers. We had the same discussion for IDNs. If you want a domain that's easy and intuitive to access for both accent-enabled users and accent-challenged users, you'll need to have two domains (one of which could be an alias for the other). The same is true of email addresses. Martin Duerst wrote: > stringprep/nameprep doesn't actually need equivalence classes, it only > needs a 'toLower' operation. Well, UTR#21 says "Caseless matching is implemented using case-folding", and we wanted to do case-insensitive comparisons, which we assumed is a synonym for "caseless matching", so we used case-folding. > Anyway, because IDNA is now defined the way it is, I don't think it is > worth doing something different for IMAA. For German users to have to > learn that they can use sz on the left hand side, but not on the right > hand is not really a good solution. Definitely.
AMC From owner-ietf-imaa Wed Feb 12 17:49:39 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1D1ndt13456 for ietf-imaa-bks; Wed, 12 Feb 2003 17:49:39 -0800 (PST) Received: from slarti.muc.de (slarti.muc.de [193.149.48.10]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1D1nbd13452 for ; Wed, 12 Feb 2003 17:49:38 -0800 (PST) Received: (qmail 26059 invoked by uid 66); 13 Feb 2003 01:49:36 -0000 Received: from faerber.muc.de by slarti.muc.de with BSMTP (rsmtp-qm-ot 0.4) for ietf-imaa@imc.org; 13 Feb 2003 01:49:36 -0000 Received: by faerber.muc.de (OpenXP/32 v3.9.4 (Win32) alpha @ 2003-02-13-0123d); 13 Feb 2003 02:49:23 +0100 Date: 13 Feb 2003 01:45:00 +0100 From: list-ietf-i18n-imaa@faerber.muc.de (=?ISO-8859-1?Q?Claus_F=E4rber?=) To: ietf-imaa@imc.org Message-ID: <8flEF$rJcDD@3247.org> In-Reply-To: <4.2.0.58.J.20030212145155.05a0dd50@localhost> Subject: Re: Case sensitivity on the LHS User-Agent: OpenXP/32 v3.9.4 (Win32) alpha @ 2003-02-13-0123d MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Martin Duerst schrieb/wrote: > I would have to say, unfortunately, yes. ASCII-only equivalents are > not something that can be generated mechanically. For some languages, > that may be possible, but not for others (in particular languages > not written with the Latin script), and not for the collection of > all languages together. Which means... let the user register multiple names/aliases if they want to have two versions point to them. 
Claus -- ------------------------ http://www.faerber.muc.de/ ------------------------ OpenPGP: DSS 1024/639680F0 E7A8 AADB 6C8A 2450 67EA AF68 48A5 0E63 6396 80F0 From owner-ietf-imaa Wed Feb 12 18:03:22 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1D23Ma13701 for ietf-imaa-bks; Wed, 12 Feb 2003 18:03:22 -0800 (PST) Received: from mailgen2.internet.gouv.qc.ca (courrier4.internet.gouv.qc.ca [192.197.162.9] (may be forged)) by above.proper.com (8.11.6/8.11.3) with SMTP id h1D23Ld13697 for ; Wed, 12 Feb 2003 18:03:21 -0800 (PST) Received: (qmail 13652 invoked from network); 13 Feb 2003 02:03:16 -0000 Received: from unknown (HELO p295.sct1.gouv.qc.ca) (142.213.85.49) by mailgen2.internet.gouv.qc.ca with SMTP; 13 Feb 2003 02:03:16 -0000 Message-Id: <5.0.2.1.2.20030212210030.00a97200@entree.sct1.gouv.qc.ca> X-Sender: alabonte@entree.sct1.gouv.qc.ca X-Mailer: QUALCOMM Windows Eudora Version 5.0.2 Date: Wed, 12 Feb 2003 21:03:19 -0500 To: list-ietf-i18n-imaa@faerber.muc.de (Claus Färber), ietf-imaa@imc.org From: =?iso-8859-1?Q?Alain_LaBont=E9?= Subject: Re: Case sensitivity on the LHS In-Reply-To: <8flEF$rJcDD@3247.org> References: <4.2.0.58.J.20030212145155.05a0dd50@localhost> Mime-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1"; format=flowed Content-Transfer-Encoding: 8bit Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: A 01:45 2003-02-13 +0100, Claus Färber a écrit : >Martin Duerst schrieb/wrote: > > I would have to say, unfortunately, yes. ASCII-only equivalents are > > not something that can be generated mechanically. For some languages, > > that may be possible, but not for others (in particular languages > > not written with the Latin script), and not for the collection of > > all languages together. > >Which means... let the user register multiple names/aliases if they want >to have two versions point to them. [Alain] Fair enough. 
That's at least a reasonable way to solve the problem, if it can't be done mechanically in a universal way. This automaton could be localized at the ISP's level for a language group, but that is of course not relevant at this point, I guess... Alain LaBonté Québec From owner-ietf-imaa Wed Feb 12 18:56:16 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1D2uGq14890 for ietf-imaa-bks; Wed, 12 Feb 2003 18:56:16 -0800 (PST) Received: from [63.202.92.149] (adsl-63-202-92-149.dsl.snfc21.pacbell.net [63.202.92.149]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1D2uEd14884 for ; Wed, 12 Feb 2003 18:56:14 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . Date: Wed, 12 Feb 2003 18:56:26 -0800 To: ietf-imaa@imc.org From: Paul Hoffman / IMC Subject: There are other open topics, folks... Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Adam and I will digest the thread on case-sensitivity and make some changes in the -01 draft based on it. There are a lot of other open issues; please pick your favorite and start a thread on it! 
--Paul Hoffman, Director --Internet Mail Consortium From owner-ietf-imaa Wed Feb 12 19:08:03 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1D383g15219 for ietf-imaa-bks; Wed, 12 Feb 2003 19:08:03 -0800 (PST) Received: from stoneport.math.uic.edu (stoneport.math.uic.edu [131.193.178.160]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1D382d15215 for ; Wed, 12 Feb 2003 19:08:02 -0800 (PST) Received: (qmail 44294 invoked by uid 1016); 13 Feb 2003 03:08:31 -0000 Date: 13 Feb 2003 03:08:31 -0000 Message-ID: <20030213030831.44293.qmail@cr.yp.to> Automatic-Legal-Notices: See http://cr.yp.to/mailcopyright.html. From: "D. J. Bernstein" To: ietf-imaa@imc.org Subject: The typing issue References: <8f$3A$+JcDD@3247.org> <3CD14E451751BD42BA48AAA50B07BAD60337064E@vsvapostal3.prod.netsol.com> <20030211014413.GD16359@nicemice.net> <20030211040519.99895.qmail@cr.yp.to> <001c01c2d195$b1370320$f57812ac@camus> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Maynard Kang writes: > To require compulsory 14755 input for e-mail addresses is plain > ridiculous, if you ask me. Keyboard interfaces have to support ISO 14755. For users, ISO 14755 is simply an extra option---sometimes the only option that works. Let me put it this way. Someone gives you a business card. The card has an email address. The email address has (say) Japanese characters that you've never seen before. How do you type those characters? Answer: The card shows you, on the next line, what to type, thanks to a universal keyboard standard for Unicode, namely ISO 14755. Done. The only alternative proposal I've seen is forcing every international user to set up a second email address---an ASCII address. 
Why waste all that effort to work around the typing issue, imposing extra costs on billions of users, when we can simply have keyboard interfaces support a perfectly straightforward standard that allows everything to be typed? ---D. J. Bernstein, Associate Professor, Department of Mathematics, Statistics, and Computer Science, University of Illinois at Chicago From owner-ietf-imaa Wed Feb 12 19:48:42 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1D3mg216276 for ietf-imaa-bks; Wed, 12 Feb 2003 19:48:42 -0800 (PST) Received: from stoneport.math.uic.edu (stoneport.math.uic.edu [131.193.178.160]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1D3mfd16272 for ; Wed, 12 Feb 2003 19:48:41 -0800 (PST) Received: (qmail 50739 invoked by uid 1016); 13 Feb 2003 03:49:11 -0000 Date: 13 Feb 2003 03:49:10 -0000 Message-ID: <20030213034910.50738.qmail@cr.yp.to> Automatic-Legal-Notices: See http://cr.yp.to/mailcopyright.html. From: "D. J. Bernstein" To: ietf-imaa@imc.org Subject: Re: Case sensitivity on the LHS References: <20030210142105.GG12186@bodin.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: tedd writes: > making the LHS case-sensitive Mailbox names _are_ case-sensitive. Any message sender that tries to convert mailbox names to lowercase will break interoperability. > Thus, I believe that whatever method is adapted for character > consideration should be consistent throughout the address. Sorry, but the IDNA proponents repeatedly refused to think beyond domain names. They said that mailbox names (and login names and so on) were ``out of scope.'' The IDNA proponents also refused to take the conservative approach of prohibiting uppercase---which allows uppercase to be safely added later if it's necessary. 
They insisted on case-insensitivity---which becomes an irrevocable decision if users start relying on uppercase addresses. The IDNA proponents also didn't stop and think ``Gee, maybe we're doing something wrong'' when they received public objections from more than THREE HUNDRED PEOPLE. They simply went ahead and declared ``consensus.'' ---D. J. Bernstein, Associate Professor, Department of Mathematics, Statistics, and Computer Science, University of Illinois at Chicago From owner-ietf-imaa Wed Feb 12 20:00:27 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1D40R916473 for ietf-imaa-bks; Wed, 12 Feb 2003 20:00:27 -0800 (PST) Received: from nicemice.net (arwen.CS.Berkeley.EDU [128.32.132.165]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1D40Qd16469 for ; Wed, 12 Feb 2003 20:00:26 -0800 (PST) Received: from amc by nicemice.net with local (Exim 3.35 #1 (Debian)) id 18jAY1-0002wq-00 for ; Wed, 12 Feb 2003 20:00:29 -0800 Date: Thu, 13 Feb 2003 04:00:29 +0000 From: "Adam M. Costello" To: ietf-imaa@imc.org Subject: Re: Compatibility with IDNA Message-ID: <20030213040028.GA9630@nicemice.net> Reply-To: IETF IMAA list References: <8f$3A5e3cDD@3247.org> <20030211023401.GE16359@nicemice.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20030211023401.GE16359@nicemice.net> User-Agent: Mutt/1.4i Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: I wrote: > At this point, the most [reuse of IDNA] we could try for is to use > the exact same encoding for local-parts (or subparts) as is used for > domain labels. Let's explore this idea and see where it leads. As I argued earlier, IDNA and IMAA can use the same ACE prefix only if they use the exact same ToUnicode operation. ToUnicode invokes ToASCII, so it might appear that we would therefore also need the exact same ToASCII operation. 
However, the only thing ToUnicode does with the result of ToASCII is perform a case-insensitive ASCII comparison on it. Therefore, the IMAA ToASCII and the IDNA ToASCII can differ, provided that the differences are confined to the case of letters in the output. IDNA leaves a lot of implementation freedom regarding the case of the letters output from ToASCII, because the output of ToASCII is an ASCII domain label, which is case-insensitive. Therefore IDNA & Punycode do not specify whether Punycode uses uppercase or lowercase letters to encode the deltas, and IDNA does not specify whether ToASCII prepends a lowercase prefix or an uppercase prefix (or a mixed-case prefix). The main difficulty with reusing ToUnicode in IMAA is that it accepts both lowercase ACEs and uppercase ACEs (and mixed-case ACEs). For example, if xn--blahblah is an ACE local part, then ToUnicode will convert xn--blahblah and XN--BLAHBLAH into the exact same Unicode string. But if we then apply ToASCII to those two identical Unicode strings, we obviously get two identical ASCII strings. So the two local parts xn--blahblah and XN--BLAHBLAH, which "may" refer to distinct mailboxes according to the standards, have been collapsed into a single local part after a round trip through ToUnicode and ToASCII. Not good! ToUnicode and ToASCII need to be lossless. I can think of one way out of this trap: Impose a new administrative requirement that non-lowercase ACE local parts must not be created unless they refer to the same mailbox as the corresponding all-lowercase ACE local part. So if xn--blahblah exists at all, the all-lowercase form is the one guaranteed to work. All other capitalizations are either equivalent to xn--blahblah, or don't exist. This new requirement is automatically obeyed by all mail servers that do case-insensitive ASCII comparisons on local parts (which is virtually all of them in practice). 
For case-sensitive mail servers, administrators will need to avoid creating non-lowercase ACE local parts. There is theoretically a chance that there exists a local part today that violates the new requirement. But it would have to be a valid ACE (which is very rare), and be non-lowercase (which is atypical), and be served by a case-sensitive mail server (which is very rare). I would be extremely surprised if the intersection of those sets is not empty. Getting back to the round-trip problem, we're not done solving it yet. For a case-sensitive mail server, it might be that xn--blahblah exists, but all other capitalizations don't exist. Therefore users need to be able to reliably type something that ToASCII will convert to xn--blahblah and not some other capitalization. The most obvious modification of ToASCII that would accomplish this would be to have Punycode always use lowercase letters for encoding deltas, and have ToASCII always use the lowercase form of the ACE prefix when prepending it. That's a perfectly reasonable way to implement it, but it would be overkill to specify such a strong constraint. All we really need is: If the input of ToASCII contains no uppercase characters, then the output of ToASCII must contain no uppercase characters. The simple implementation (always use lowercase when you have a choice) obviously satisfies the constraint, but the door is also left open for more complex optional behavior (like mixed-case annotations for preserving case). If a user types a non-ASCII local part without using uppercase characters, it will definitely work; if the user capitalizes any of the characters, it will still work if the mail server performs case-insensitive ASCII comparisons on local parts. 
That's the same situation as today: If a user types an ASCII local part in the original correct case, it will definitely work; if the user types the local part using some other capitalization, it will still work if the mail server performs case-insensitive ASCII comparisons on local parts. That's it. The open issue of whether to subdivide local parts is orthogonal. Whatever pieces we obtain before the at-sign (a whole local part or subparts of it), we can use the very same ToASCII and ToUnicode implementation that we use for domain labels, including the same ACE prefix and the same profile. (But we can't necessarily use any off-the-shelf IDNA implementation; we need to make sure ToASCII satisfies IMAA's additional constraint.) Here's a summary of what this would mean for case sensitivity: For domains whose mail server is already case-insensitive for ASCII local parts, non-ASCII local parts would likewise be case-insensitive, automatically. For domains whose mail server is case-sensitive for ASCII local parts, it is possible for two ASCII case-variants to refer to different mailboxes, but this would not be possible for non-ASCII case variants. Only the all-lowercase version could exist. The non-lowercase non-ASCII variants would either work by accident or bounce, depending on the exact implementation of the sender's ToASCII. Whereas most mail user agents preserve case for ASCII local parts, most probably would not preserve case for non-ASCII local parts, because mixed-case annotations require considerable additional effort. But it would be possible. If you know that your own mail address is case-insensitive, then you can use mixed-case annotations in your outgoing From: fields, and recipients whose mail user agents make the extra effort will display it in mixed-case form. You know they can reply because you know your address is case-insensitive.
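The "simple implementation" described above (always use lowercase when there is a choice) might be sketched as follows. This is only an illustration under stated assumptions: it uses Python's built-in `punycode` codec, reuses the IDNA prefix `xn--` as the message explores, and skips the Stringprep/Nameprep step entirely. The function names `to_ascii_lowercase_safe` and `matches_on_case_insensitive_server` are hypothetical and do not come from any draft.

```python
ACE_PREFIX = "xn--"  # assumption: same prefix as IDNA, as explored in the message


def to_ascii_lowercase_safe(label: str) -> str:
    """Hypothetical ToASCII variant: uppercase-free input gives uppercase-free output.

    Preparation (Stringprep/Nameprep) is deliberately omitted from this sketch.
    """
    if label.isascii():
        return label  # ASCII local parts pass through untouched, case preserved
    # Python's punycode codec encodes the deltas with lowercase letters and digits.
    result = ACE_PREFIX + label.encode("punycode").decode("ascii")
    # The proposed constraint: no uppercase in the input, none in the output.
    if not any(ch.isupper() for ch in label):
        assert not any(ch.isupper() for ch in result)
    return result


def matches_on_case_insensitive_server(a: str, b: str) -> bool:
    """How a case-insensitive mail server would compare two ASCII local parts."""
    return a.lower() == b.lower()


ace = to_ascii_lowercase_safe("bücher")
print(ace)

# Other capitalizations of the ACE only work where the server ignores ASCII case.
print(matches_on_case_insensitive_server(ace, ace.upper()))
print(ace == ace.upper())
```

On a case-sensitive server only the first, all-lowercase form is guaranteed to exist, which is why the constraint is phrased in terms of what a sender's ToASCII may output.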
AMC From owner-ietf-imaa Wed Feb 12 21:49:56 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1D5nua18752 for ietf-imaa-bks; Wed, 12 Feb 2003 21:49:56 -0800 (PST) Received: from exchange.ad.skymv.com (66-120-210-136.ded.pacbell.net [66.120.210.136]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1D5ntd18745 for ; Wed, 12 Feb 2003 21:49:55 -0800 (PST) Received: from exchange.ad.skymv.com ([192.168.1.71]) by exchange.ad.skymv.com with Microsoft SMTPSVC(5.0.2195.5329); Wed, 12 Feb 2003 21:49:41 -0800 content-class: urn:content-classes:message Subject: RE: The typing issue MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Date: Wed, 12 Feb 2003 21:49:41 -0800 X-MimeOLE: Produced By Microsoft Exchange V6.0.6249.0 Message-ID: <138AA78F80DCE84B8EE424399FFBF9C904FAA1@exchange.ad.skymv.com> X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: The typing issue Thread-Index: AcLTDVPpjzxKc7xkRPeVQU+sL7WzqAAFB+Pg From: "Dan Kohn" To: X-OriginalArrivalTime: 13 Feb 2003 05:49:41.0765 (UTC) FILETIME=[B409C350:01C2D323] Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by above.proper.com id h1D5ntd18746 Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: D. J. Bernstein wrote: > Let me put it this way. Someone gives you a business card. The card > has an email address. The email address has (say) Japanese characters > that you've never seen before. How do you type those characters? > Answer: The card shows you, on the next line, what to type, thanks to > a universal keyboard standard for Unicode, namely ISO 14755. Done. > The only alternative proposal I've seen is forcing every international > user to set up a second email address---an ASCII address. 
Why waste > all that effort to work around the typing issue, imposing extra costs > on billions of users, when we can simply have keyboard interfaces > support a perfectly straightforward standard that allows everything > to be typed? There is an alternative to registering an ASCII domain for each IDN: instead, you can print the punycode on the business card below the IMAA/IDN email address. Compared to ISO 14755 [1], it seems to me that punycode is more universal (it works wherever ASCII is available), more compact (it supports LDH rather than hex), and no more ugly than ISO 14755. Take an email address on a business card of @example.com (where the bracketed characters would be shown as kanji). Not knowing Japanese, I'd rather see (and type) xn--d9juau41awczczp@example.com than u+305D u+306E u+30B9 u+30D4 u+30FC u+30C9 u+3067@example.com where for each u+ I need to hold down ctrl-alt. Of course, we both agree that they could also set up sonosupiidode@example.com to forward to the same mailbox. Of course, I know how much Dan hates punycode and so he'll hate this suggestion. [1] http://www-rocq.inria.fr/qui/Philippe.Deschamp/divers/ALB-CD.html - dan -- Dan Kohn From owner-ietf-imaa Thu Feb 13 00:08:00 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1D880W03379 for ietf-imaa-bks; Thu, 13 Feb 2003 00:08:00 -0800 (PST) Received: from nicemice.net (arwen.CS.Berkeley.EDU [128.32.132.165]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1D87xd03375 for ; Thu, 13 Feb 2003 00:07:59 -0800 (PST) Received: from amc by nicemice.net with local (Exim 3.35 #1 (Debian)) id 18jEPX-0004mI-00 for ; Thu, 13 Feb 2003 00:07:59 -0800 Date: Thu, 13 Feb 2003 08:07:59 +0000 From: "Adam M. 
Costello" To: ietf-imaa@imc.org Subject: Re: Compatibility with IDNA Message-ID: <20030213080759.GA18181@nicemice.net> Reply-To: IETF IMAA list References: <8f$3A5e3cDD@3247.org> <20030211023401.GE16359@nicemice.net> <20030213040028.GA9630@nicemice.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20030213040028.GA9630@nicemice.net> User-Agent: Mutt/1.4i Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: I wrote: > If the input of ToASCII contains no uppercase characters, then the > output of ToASCII must contain no uppercase characters. Actually, IMAA would need to impose this constraint on both ToASCII and ToUnicode. The ToUnicode operation, as written in the IDNA spec, automatically satisfies the constraint, whereas ToASCII, as written, has some flexibility that IMAA would need to limit. But just as it's okay for various ToASCII implementations to output slightly different strings, provided they are all equivalent, it would also be harmless for various ToUnicode implementations to output slightly different strings, provided they are all equivalent. (IDNA defines equivalence between X and Y as: ToASCII(X) matches ToASCII(Y) using a case-insensitive ASCII comparison.) It is this flexibility in both ToASCII and ToUnicode that makes possible case preservation via mixed-case annotations. So in order to guarantee that the lowercase form of internationalized local parts always works and survives round-trips through ToASCII and ToUnicode, we need to impose the above constraint on both operations. Note that I'm not suggesting that we mention mixed-case annotations or case preservation in the IMAA spec. IDNA doesn't mention it, so IMAA probably won't either. As long as it's possible, I'm content.
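The equivalence rule and the lowercase round-trip described above can be sketched as follows. This again assumes Python's built-in "idna" codec in place of IMAA's ToASCII and ToUnicode; the function names are hypothetical and the sample label is only an illustration.

```python
# Sketch only: Python's "idna" codec stands in for ToASCII/ToUnicode (an assumption).

def to_ascii(label: str) -> str:
    return label.encode("idna").decode("ascii")

def to_unicode(label: str) -> str:
    return label.encode("ascii").decode("idna")

def equivalent(x: str, y: str) -> bool:
    """IDNA equivalence: ToASCII(X) matches ToASCII(Y), ignoring ASCII case."""
    return to_ascii(x).lower() == to_ascii(y).lower()

def lowercase_survives(label: str) -> bool:
    """Check the proposed constraint for one all-lowercase input label."""
    encoded = to_ascii(label)
    decoded = to_unicode(encoded)
    no_upper = not any(ch.isupper() for ch in encoded + decoded)
    return no_upper and decoded == label

print(equivalent("B\u00fccher", "b\u00fccher"))   # case variants of one label
print(equivalent("b\u00fccher", "bucher"))        # genuinely different labels
print(lowercase_survives("b\u00fccher"))
```

An implementation that emitted mixed-case annotations for mixed-case input would still be equivalent under this definition; the constraint only pins down what happens when the input is already all lowercase.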
AMC From owner-ietf-imaa Thu Feb 13 00:52:37 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1D8qba11707 for ietf-imaa-bks; Thu, 13 Feb 2003 00:52:37 -0800 (PST) Received: from leonis.nus.edu.sg (leonis.nus.edu.sg [137.132.1.18]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1D8qUd11664 for ; Thu, 13 Feb 2003 00:52:31 -0800 (PST) Received: from bic.nus.edu.sg (12-49.priv.nus.edu.sg [172.18.12.49]) by leonis.nus.edu.sg (8.12.1/8.12.1) with ESMTP id h1D8rRwE015282; Thu, 13 Feb 2003 16:53:33 +0800 (SGT) Message-ID: <3E4B5CAD.9090304@bic.nus.edu.sg> Date: Thu, 13 Feb 2003 16:51:57 +0800 From: Tan Tin Wee User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.0.1) Gecko/20020823 Netscape/7.0 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Dan Kohn CC: ietf-imaa@imc.org Subject: Re: The typing issue References: <138AA78F80DCE84B8EE424399FFBF9C904FAA1@exchange.ad.skymv.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: I always thought that a Japanese name card had Japanese characters on one side for those who can read and write Japanese, and English (Romaji) characters on the other side of the card, for those who can't read or write Japanese. So in the same vein, the IDN user, say Japanese, has his/her Japanese domain name or email address, not for the sake of the non-Japanese reader/writer. If it were intended for this guy, then the IDN user would have sent the email using the ASCII character address. This means that anyone not knowing Japanese is not expected to be able to type or read Japanese on his/her computer, and if they sent an email address to you in Japanese, chances are that the content included Japanese characters which you can't read either. 
So the issue of whether you or I prefer to type "xn--d9juau41awczczp@example.com than u+305D u+306E u+30B9 u+30D4 u+30FC u+30C9 u+3067@example.com " is a near non-issue except in the following kind of circumstance, for instance, say, "I know both Japanese and Romaji, and this Japanese (who can handle English too) guy wrote email to me in Japanese, which I typically reply to on my Japanese-enabled notebook, but I was flying into San Francisco, and had to reply to him using my yahoo account (which is by then IDN'ised), and the computer at the airport business center, doesn't have the keyboard input system for Japanese, I had to figure out the punycode xn-- version of his email address in order to reply to him urgently, and cannot possible have the time to hold down the Control Alt key to do the U+ thing." The issue IMHO is also not "forcing every international user to set up a second email address". The point is that if every IDN user is also an international person, he would have had his namecards printed one side in say, Japanese, for Japanese meetings, and the other side in English, for international meetings which he is having. In the same way, he would naturally have both an ASCII email address and the Japanese email address. And in his common usage, he will be using his Japanese language email address when mailing stuff in Japanese to his Japanese friends and if any of the stuff leaks out to the International friends, he may well include his ASCII email address, and if not, he's not expecting you to respond anyway. In fact, if all of us English speaking folks put ourselves in the same shoes as the IDN guy, we might be asking the opposite. "I would rather type email addresses in Japanese characters rather than type ASCII because it is not natural and I find it rather difficult to recognise the terribly confusing ASCII characters on the keyboard simply so that I can write a completely Japanese email to my friend who is also Japanese down the next block. 
So the Internet stuff of sending email in ASCII, an alien character set is pretty broken for me." So take home message here is that all of us should be aware that we are arguing from the world view of the English-enabled person. The whole purpose of having IDNs and IDN email addresses, in my opinion, is for the sake of the un-ASCII'ed masses in the world who take a long time, (as long as you take to key in funny IDN characters or U+whatever characters), to use the Web or the Email to read stuff in their own language. So long as they're ok with it, I'm ok with it. To them, U+whatever, or xn-whateverpunycode, or even tinwee@pobox.org.sg are all just as bad for them, as much as U+whatever, or xn-whatever or @example.com is equally bad for me as a non-Japanese user. Sorry to take so long to put across a small point, but I am not a true native English user. -- tin wee Dan Kohn wrote: >D. J. Bernstein wrote: > > > >>Let me put it this way. Someone gives you a business card. The card >>has an email address. The email address has (say) Japanese characters >>that you've never seen before. How do you type those characters? >> >> >>Answer: The card shows you, on the next line, what to type, thanks to >>a universal keyboard standard for Unicode, namely ISO 14755. Done. >> >> >>The only alternative proposal I've seen is forcing every international >>user to set up a second email address---an ASCII address. Why waste >>all that effort to work around the typing issue, imposing extra costs >>on billions of users, when we can simply have keyboard interfaces >>support a perfectly straightforward standard that allows everything >>to be typed? >> >> > >There is an alternative to registering an ASCII domain for each IDN: >instead, you can print the punycode on the business card below the >IMAA/IDN email address. 
> >Compared to ISO 14755 [1], it seems to me that punycode is more >universal (it works wherever ASCII is available), more compact (it >supports LDH rather than hex), and no more ugly than ISO 14755. > >Take an email address on a business card of >@example.com (where the bracketed characters would be >shown as kanji). > >Not knowing Japanese, I'd rather see (and type) >xn--d9juau41awczczp@example.com than u+305D u+306E u+30B9 u+30D4 u+30FC >u+30C9 u+3067@example.com where for each u+ I need to hold down >ctrl-alt. Of course, we both agree that they could also set up >sonosupiidode@example.com to forward to the same mailbox. > >Of course, I know how much Dan hates punycode and so he'll hate this >suggestion. > >[1] http://www-rocq.inria.fr/qui/Philippe.Deschamp/divers/ALB-CD.html > > > - dan >-- >Dan Kohn > > > > > From owner-ietf-imaa Thu Feb 13 03:30:02 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1DBU2o29007 for ietf-imaa-bks; Thu, 13 Feb 2003 03:30:02 -0800 (PST) Received: from moriarty.gnomon.org.uk (pc4-cmbg2-5-cust162.cmbg.cable.ntl.com [81.100.86.162]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1DBU1d29003 for ; Thu, 13 Feb 2003 03:30:01 -0800 (PST) Received: from moriarty.gnomon.org.uk (roy@localhost [127.0.0.1]) by moriarty.gnomon.org.uk (8.12.3/8.12.3/Debian -4) with ESMTP id h1DBUDKA015006 for ; Thu, 13 Feb 2003 11:30:14 GMT To: ietf-imaa@imc.org Subject: Re: Case sensitivity on the LHS Date: Tue, 11 Feb 2003 22:53:04 +0000 From: Roy Badami Message-ID: <1045135813.15005.TMDA@moriarty.gnomon.org.uk> X-Delivery-Agent: TMDA/0.65 (Johnstown) X-Primary-Address: roy@gnomon.org.uk Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: What is the current user expectation? My guess is that case-insensitive is more widespread. In any case, one or the other expectation will be disappointed (if they ever happen to notice). 
Do we have any idea which systems are more numerous (the only sample I have at the moment is my own email address, which is case-insensitive). In terms of traditional Unix MTAs, sendmail is case insensitive by default, and this would have been a good starting point to answer the question five or ten years ago. If you want to know what's the norm now, you have to ask the questions: Are hotmail addresses case sensitive? Are e-mail addresses on Exchange servers case sensitive (by default)? I'm pretty sure the answer to both these questions is that they're case-insensitive. There's no doubt that case-sensitive local parts are unusual; the important question is whether they exist to any significant degree at all... -roy From owner-ietf-imaa Thu Feb 13 03:29:09 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1DBT9028976 for ietf-imaa-bks; Thu, 13 Feb 2003 03:29:09 -0800 (PST) Received: from moriarty.gnomon.org.uk (pc4-cmbg2-5-cust162.cmbg.cable.ntl.com [81.100.86.162]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1DBT7d28972 for ; Thu, 13 Feb 2003 03:29:07 -0800 (PST) Received: from moriarty.gnomon.org.uk (roy@localhost [127.0.0.1]) by moriarty.gnomon.org.uk (8.12.3/8.12.3/Debian -4) with ESMTP id h1DBTKKA014977 for ; Thu, 13 Feb 2003 11:29:20 GMT To: ietf-imaa@imc.org Subject: Re: Case sensitivity on the LHS Date: Tue, 11 Feb 2003 22:25:13 +0000 From: Roy Badami Message-ID: <1045135760.14976.TMDA@moriarty.gnomon.org.uk> X-Delivery-Agent: TMDA/0.65 (Johnstown) X-Primary-Address: roy@gnomon.org.uk Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: I have never encountered a case-sensitive local-part. Has anyone here ever encountered a case-sensitive local-part? I have never encountered one myself, though I have heard rumours of their existence.
I think it's important to be aware of the fact that Internet e-mail is often gatewayed into non-Internet systems such as UUCP or FidoNET in parts of the world where IP connectivity is not commonplace. Is anyone familiar with the state of current deployment of modern UUCP and FidoNET networks, and in a position to comment on the case issues that arise there? -roy From owner-ietf-imaa Thu Feb 13 03:28:12 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1DBSCF28929 for ietf-imaa-bks; Thu, 13 Feb 2003 03:28:12 -0800 (PST) Received: from moriarty.gnomon.org.uk (pc4-cmbg2-5-cust162.cmbg.cable.ntl.com [81.100.86.162]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1DBSAd28925 for ; Thu, 13 Feb 2003 03:28:10 -0800 (PST) Received: from moriarty.gnomon.org.uk (roy@localhost [127.0.0.1]) by moriarty.gnomon.org.uk (8.12.3/8.12.3/Debian -4) with ESMTP id h1DBSMKA014972 for ; Thu, 13 Feb 2003 11:28:23 GMT To: ietf-imaa@imc.org Subject: Re: John Cowan on IMAA draft Date: Tue, 11 Feb 2003 22:10:53 +0000 From: Roy Badami Message-ID: <1045135702.14971.TMDA@moriarty.gnomon.org.uk> X-Delivery-Agent: TMDA/0.65 (Johnstown) X-Primary-Address: roy@gnomon.org.uk Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Recognizing fullwidth @ is important, because it's context dependent whether people are using halfwidth or fullwidth characters, and they may not even be conscious of it in double-width environments. Seconded. It's not unusual to see our Japanese customers accidentally put full width characters into their English language e-mail messages to the company I work for. Disallowing full-width-at will almost certainly cause confusion. The argument for consistency with the IDNA approach to full-width-dot is also a strong one. 
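As a rough illustration of what recognizing fullwidth @ could look like in an address parser, here is a minimal sketch. The function name split_address is hypothetical, and nothing here is taken from the draft; the dot list is the set of label separators that IDNA (RFC 3490) asks implementations to recognize besides the ASCII full stop.

```python
# Sketch only: one possible way to accept fullwidth separators when parsing.

AT_SIGNS = "@\uFF20"               # COMMERCIAL AT, FULLWIDTH COMMERCIAL AT
IDNA_DOTS = "\u3002\uFF0E\uFF61"   # ideographic, fullwidth, halfwidth ideographic full stop

def split_address(address: str) -> tuple[str, str]:
    """Split at the last at-sign of either width; normalize dots in the domain."""
    pos = max(address.rfind(ch) for ch in AT_SIGNS)
    if pos < 0:
        raise ValueError("no at-sign found")
    local, domain = address[:pos], address[pos + 1:]
    for dot in IDNA_DOTS:
        domain = domain.replace(dot, ".")
    return local, domain

print(split_address("roy\uFF20example\uFF0Ecom"))
```

Treating the at-sign the same way IDNA treats dots keeps the two separators consistent, which is the consistency argument made above.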
-roy From owner-ietf-imaa Thu Feb 13 03:29:34 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1DBTYu28990 for ietf-imaa-bks; Thu, 13 Feb 2003 03:29:34 -0800 (PST) Received: from moriarty.gnomon.org.uk (pc4-cmbg2-5-cust162.cmbg.cable.ntl.com [81.100.86.162]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1DBTVd28986 for ; Thu, 13 Feb 2003 03:29:32 -0800 (PST) Received: from moriarty.gnomon.org.uk (roy@localhost [127.0.0.1]) by moriarty.gnomon.org.uk (8.12.3/8.12.3/Debian -4) with ESMTP id h1DBTiKA014982 for ; Thu, 13 Feb 2003 11:29:44 GMT To: ietf-imaa@imc.org Subject: Re: Case sensitivity on the LHS Date: Tue, 11 Feb 2003 22:42:21 +0000 From: Roy Badami Message-ID: <1045135784.14981.TMDA@moriarty.gnomon.org.uk> X-Delivery-Agent: TMDA/0.65 (Johnstown) X-Primary-Address: roy@gnomon.org.uk Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Lithuanian "I" with an accent above lowercases to "I" + DOT ABOVE + the main accent, because (unlike all other "i"s with accents) the i keeps its dot. For Unicode case-folding purposes, this discrepancy is ignored. It's a font issue :) Evidently in a Lithuanian font, the glyph for 'dotless I' incorporates a dot... -roy From owner-ietf-imaa Thu Feb 13 03:27:47 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1DBRlq28912 for ietf-imaa-bks; Thu, 13 Feb 2003 03:27:47 -0800 (PST) Received: from moriarty.gnomon.org.uk (pc4-cmbg2-5-cust162.cmbg.cable.ntl.com [81.100.86.162]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1DBRjd28908 for ; Thu, 13 Feb 2003 03:27:45 -0800 (PST) Received: from moriarty.gnomon.org.uk (roy@localhost [127.0.0.1]) by moriarty.gnomon.org.uk (8.12.3/8.12.3/Debian -4) with ESMTP id h1DBRlKA014954 for ; Thu, 13 Feb 2003 11:27:55 GMT To: ietf-imaa@imc.org cc: roy@gnomon.org.uk Subject: A couple of comments on the open issues... 
Date: Tue, 11 Feb 2003 21:55:45 +0000 From: Roy Badami Message-ID: <1045135666.14953.TMDA@moriarty.gnomon.org.uk> X-Delivery-Agent: TMDA/0.65 (Johnstown) X-Primary-Address: roy@gnomon.org.uk Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: [I'm not a subscriber to the list, so it would be helpful to cc me on any reply, though I do intend to review the archives periodically] By way of background, I should say that I became interested in IDNs late on in the process. I followed much of the later part of the process on the idn list via the archives, but never really felt the need to post, since my opinions tended to be well represented by many other long-standing members of the WG. However, I hope you won't mind my voicing my opinions on a couple of the open issues raised in the IMAA document. I shan't get into the obvious big issue of case (in)sensitivity, since this is not an issue I have strong opinions on, and I have confidence that this forum (and any subsequent WG) will do something sensible in that respect. But a couple of other issues in the document seemed worthy of comment: Rather than transform the entire local part as a single unit, another approach is to pick out smaller pieces of the local part, and transform each piece independently, analogous to the way labels are picked out of a domain name and transformed independently. The tradeoff is complexity versus compatibility with various unofficial conventions for structured local parts, like owner-listname, user+tag, sublocal.local, path!user, etc. I'm particularly keen that the use of a tag or suffix with a local username is not broken by IMAA. Many MTAs provide the functionality that all mail addressed to will be delivered to , either by default, or as a configuration option. (Delimiter is to my knowledge typically either '+' or '-', though it's possible that there are others in use).
This effectively allows a user of such an MTA (that has been suitably configured) to have multiple e-mail addresses without requiring any action on the part of the mail administrator, and allows the user to run scripts that process their mail according to the suffix. There are several software packages available to users of Unix and Unix-like machines that make use of this functionality, and I think it is important that users of IMAs (if that's the right term) are not excluded or seriously inconvenienced in the use of such packages. Secondly: Should we consider using punctuation other than hyphens in the ACE prefix? Then we could use the same letters as IDNA. For example, if the IDNA ACE prefix were bq--, the IMAA ACE prefix could be bq== or bq## I'm not going to voice an opinion on the specific question posed, but I would like to counsel against using either the equals sign or the hash sign for this purpose. In particular, I would urge the list to ensure that as far as possible local parts generated by any IMAA specification adhere to a far more conservative character set than that mandated by RFC822/2822, namely the set of characters that has historically been commonplace in local parts. I would personally regard this as being alphanumerics, period, hyphen and underscore. It is unfortunately still the case that there are systems in widespread use which have difficulty in accommodating RFC822 addresses that contain valid but unusual characters. In general, I suspect that this is most likely to be the case when a local non-RFC822 mail system is gatewaying into the RFC822 world. For instance, I am aware that Lotus Notes systems running release 4 have problems sending mail to RFC822 addresses containing plus signs, at least in the default configuration.
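To illustrate the two points above together (transforming each piece of a structured local part separately, and keeping the result inside a conservative character set), here is a small sketch. It assumes Python's "idna" codec with IDNA's "xn--" prefix as a placeholder for whatever encoding IMAA adopts, and the function names are hypothetical. The character set tested is the one proposed above: alphanumerics, period, hyphen and underscore.

```python
import re

# Sketch only: per-piece encoding with Python's "idna" codec (an assumption),
# plus a check against the conservative character set suggested above.

CONSERVATIVE = re.compile(r"[A-Za-z0-9._-]+")

def encode_piece(piece: str) -> str:
    return piece.encode("idna").decode("ascii")

def encode_local_part(local: str, delimiter: str = "+") -> str:
    """Encode each delimiter-separated piece independently, keeping the delimiter."""
    return delimiter.join(encode_piece(p) for p in local.split(delimiter))

def is_conservative(local: str) -> bool:
    return CONSERVATIVE.fullmatch(local) is not None

encoded = encode_local_part("b\u00fccher+list")
print(encoded)
print(is_conservative(encoded))                # '+' falls outside the conservative set
print(is_conservative("xn--bcher-kva-list"))
print(is_conservative("bq==abc"))
```

Because the delimiter survives encoding, an MTA that delivers user+tag to user can still find the tag. Whether the delimiter itself is acceptable to legacy gateways is a separate question, as the Lotus Notes example shows.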
Whilst such restrictions in Internet e-mail addressing are clearly undesirable, and whilst some such gateways may have mechanisms for escaping unusual characters in order to represent them within local constraints, it remains a fact of life that systems such as these are currently deployed on the Internet in significant numbers, and I think it is desirable that IMAA does not exacerbate the deficiencies of such legacy systems to the point of rendering them incapable of communicating with users of IMAs. A second argument for not straying outside the traditional character set I define above is the danger that there may be MTAs in use which ascribe special meaning to unusual characters in a way which is not easily configurable (if it is configurable at all). We all know that there are deployed systems that ascribe special meaning to the characters '%' and '!', despite the fact that these characters have no special meaning in RFC822. There may be systems out there that ascribe special meaning to other unusual characters, too, and use of these characters may make adoption of IMAA problematic for sites which depend on such systems. These are just my initial thoughts on a couple of the issues that seemed important. -roy From owner-ietf-imaa Thu Feb 13 03:30:50 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1DBUow29038 for ietf-imaa-bks; Thu, 13 Feb 2003 03:30:50 -0800 (PST) Received: from moriarty.gnomon.org.uk (pc4-cmbg2-5-cust162.cmbg.cable.ntl.com [81.100.86.162]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1DBUmd29034 for ; Thu, 13 Feb 2003 03:30:48 -0800 (PST) Received: from moriarty.gnomon.org.uk (roy@localhost [127.0.0.1]) by moriarty.gnomon.org.uk (8.12.3/8.12.3/Debian -4) with ESMTP id h1DBV1KA015011 for ; Thu, 13 Feb 2003 11:31:01 GMT Date: Tue, 11 Feb 2003 23:59:08 +0000 to: ietf-imaa@imc.org Subject: Re: A couple of comments on the open issues... 
From: Roy Badami Message-ID: <1045135861.15010.TMDA@moriarty.gnomon.org.uk> X-Delivery-Agent: TMDA/0.65 (Johnstown) X-Primary-Address: roy@gnomon.org.uk Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: A second argument for not straying outside the traditional character set I define above is the danger that there may be MTAs in use which ascribe special meaning to unusual characters in a way which is not easily configurable (if it is configurable at all). We all know that there are deployed systems that ascribe special meaning to the characters '%' and '!', despite the fact that these characters have no special meaning in RFC822. There may be systems out there that ascribe special meaning to other unusual characters, too, and use of these characters may make adoption of IMAA problematic for sites which depend on such systems. I'd like to reflect further on this particular thought of mine, if I may... The main issue here, I think, is that there may be sites that wish to allow the creation of internationalized mailboxes without being in a position to upgrade to IMAA-compliant software. In some cases, it may be adequate simply to create mailboxes within the legacy system corresponding to the ACE-encoded local part. Now, we don't know in general what restrictions these legacy systems might place on mailbox names. We know there are systems that will not permit '%' or '!' or many other punctuation characters in mailbox names, but for all we know there may be systems that don't permit dot, hyphen, or underscore in local names. There may even be systems that don't permit digits in mailbox names. There are almost certainly still systems on the Internet that don't allow the creation of mailbox names longer than eight (or even six) characters -- clearly they will have major difficulties if they attempt to support the creation of internationalized mailboxes, and there's nothing we can do about that. 
So in the end even my defined set of alphanumerics, dot, hyphen and underscore is somewhat arbitrary. Although all these characters have been historically commonplace in local parts, that's not to say that all systems have historically allowed the creation of mailboxes containing these characters. There's little we can do to help those systems that don't allow the creation of arbitrary mailbox names, but restricting the character set to a conservative one can only help. (The other argument in my previous post is probably the more important one, though. There are legacy systems connected to the Internet that are incapable of sending e-mail to addresses containing certain unusual punctuation characters, so we should be careful to avoid using those characters in our ACE encoding.) -roy From owner-ietf-imaa Thu Feb 13 04:28:10 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1DCSA329937 for ietf-imaa-bks; Thu, 13 Feb 2003 04:28:10 -0800 (PST) Received: from mercury.ccil.org (mail@[192.190.237.100]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1DCS9d29933 for ; Thu, 13 Feb 2003 04:28:09 -0800 (PST) Received: from cowan by mercury.ccil.org with local (Exim 3.35 #1 (Debian)) id 18jIRz-0007hy-00; Thu, 13 Feb 2003 07:26:47 -0500 Subject: Re: Case sensitivity on the LHS In-Reply-To: <1045135760.14976.TMDA@moriarty.gnomon.org.uk> from Roy Badami at "Feb 11, 2003 10:25:13 pm" To: Roy Badami Date: Thu, 13 Feb 2003 07:26:47 -0500 (EST) CC: ietf-imaa@imc.org X-Mailer: ELM [version 2.4ME+ PL66 (25)] MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Message-Id: From: John Cowan Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Roy Badami scripsit: > I think it's important to be aware of the fact that Internet e-mail is > often gatewayed into non-Internet systems such as UUCP or FidoNET in > parts of the world where IP connectivity is not commonplace. 
UUCP has case-sensitive host names, but the local-part is subject to the same ambiguous rule as Internet local-parts: it's case-insensitive iff the recipient MUA decides it is. Fidonet names are case-insensitive. -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org To say that Bilbo's breath was taken away is no description at all. There are no words left to express his staggerment, since Men changed the language that they learned of elves in the days when all the world was wonderful. --_The Hobbit_ From owner-ietf-imaa Thu Feb 13 05:06:02 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1DD62901360 for ietf-imaa-bks; Thu, 13 Feb 2003 05:06:02 -0800 (PST) Received: from slarti.muc.de (slarti.muc.de [193.149.48.10]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1DD60d01356 for ; Thu, 13 Feb 2003 05:06:00 -0800 (PST) Received: (qmail 27071 invoked by uid 66); 13 Feb 2003 13:05:59 -0000 Received: from faerber.muc.de by slarti.muc.de with BSMTP (rsmtp-qm-ot 0.4) for ietf-imaa@imc.org; 13 Feb 2003 13:05:59 -0000 Received: by faerber.muc.de (OpenXP/32 v3.9.4 (Win32) alpha @ 2003-02-13-1304d); 13 Feb 2003 14:05:53 +0100 Date: 13 Feb 2003 14:01:00 +0100 From: list-ietf-i18n-imaa@faerber.muc.de (=?ISO-8859-1?Q?Claus_F=E4rber?=) To: ietf-imaa@imc.org Message-ID: <8flF09g3cDD@3247.org> In-Reply-To: <5.0.2.1.2.20030212143718.00a96fa8@entree.sct1.gouv.qc.ca> Subject: Re: Case sensitivity on the LHS User-Agent: OpenXP/32 v3.9.4 (Win32) alpha @ 2003-02-13-1304d MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 8bit Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Alain LaBonté schrieb/wrote: > [Alain] Good (and thanks for the reference). 
But I wanted to say that even > if having email addresses including accented latin letters is a "must have" > (I look forward to seeing my address as Alain.LaBonté@abc.com), we need to > have a way to allow those who can't enter accented letters to be able to > access the same email address (on, say, a US keyboard [or a Japanese one > with Romaji support], in typing Alain.LaBonte@abc.com). They can view/enter the address as Alain.xn--LaBont-gva@abc.com (for example). Claus -- http://www.faerber.muc.de/ From owner-ietf-imaa Thu Feb 13 05:19:11 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1DDJBs02582 for ietf-imaa-bks; Thu, 13 Feb 2003 05:19:11 -0800 (PST) Received: from mailgen2.internet.gouv.qc.ca (inet-cou2.gouv.qc.ca [192.197.162.9]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1DDJAd02575 for ; Thu, 13 Feb 2003 05:19:10 -0800 (PST) Received: (qmail 6535 invoked from network); 13 Feb 2003 13:19:02 -0000 Received: from unknown (HELO p295.sct1.gouv.qc.ca) (142.213.85.49) by mailgen2.internet.gouv.qc.ca with SMTP; 13 Feb 2003 13:19:02 -0000 Message-Id: <5.0.2.1.2.20030213080513.00a97200@entree.sct1.gouv.qc.ca> X-Sender: alabonte@entree.sct1.gouv.qc.ca X-Mailer: QUALCOMM Windows Eudora Version 5.0.2 Date: Thu, 13 Feb 2003 08:19:08 -0500 To: "D. J. Bernstein" , ietf-imaa@imc.org From: =?iso-8859-1?Q?Alain_LaBont=E9?= Subject: Re: The typing issue In-Reply-To: <20030213030831.44293.qmail@cr.yp.to> References: <8f$3A$+JcDD@3247.org> <3CD14E451751BD42BA48AAA50B07BAD60337064E@vsvapostal3.prod.netsol.com> <20030211014413.GD16359@nicemice.net> <20030211040519.99895.qmail@cr.yp.to> <001c01c2d195$b1370320$f57812ac@camus> Mime-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1"; format=flowed Content-Transfer-Encoding: 8bit Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: A 03:08 2003-02-13 +0000, D. J. 
Bernstein a écrit : >Maynard Kang writes: > > To require compulsory 14755 input for e-mail addresses is plain > > ridiculous, if you ask me. [DJB] >Keyboard interfaces have to support ISO 14755. For users, ISO 14755 is >simply an extra option---sometimes the only option that works. > >Let me put it this way. Someone gives you a business card. The card has >an email address. The email address has (say) Japanese characters that >you've never seen before. How do you type those characters? > >Answer: The card shows you, on the next line, what to type, thanks to a >universal keyboard standard for Unicode, namely ISO 14755. Done. > >The only alternative proposal I've seen is forcing every international >user to set up a second email address---an ASCII address. Why waste all >that effort to work around the typing issue, imposing extra costs on >billions of users, when we can simply have keyboard interfaces support a >perfectly straightforward standard that allows everything to be typed? > >---D. J. Bernstein, Associate Professor, Department of Mathematics, >Statistics, and Computer Science, University of Illinois at Chicago [ALB] As the instigator and project editor of ISO/IEC 14755, I am but pleased with this promotion of that standard that ought to be on any computer which has a keyboard. However there is a trap with some characters. A lot of similar glyphs refer to different characters. With Latin letters and Unicode normalization as a case in point one can probably get out of this trap most of the times (unless someone, say, uses a Greek upper case Alpha letter or a Cyrillic letter A instead of Latin A, although one must expect that this reasonably won't happen). But with some other scripts there might be an issue with this multi-character-one-glyph problem (not talking about the issue of compatibility characters which could be normalized too). 
ISO/IEC 14755 has a feedback option that allows to know exactly what UCS id a glyph on the screen corresponds to, for future entry. But on a business card, this feedback does not exist. I don't know a fool-proof solution to this problem, except for a yet-to-be-developed intelligent interface, dedicated to glyph-character correspondence display, after optical reading (coupled with an ISO/IEC 14755 implementation). That said, I agree that ISO/IEC 14755 should be on every computer, that would allow people to get out of trouble for rare but familiar characters which people occasionally have to enter. Alain LaBonté Project editor, ISO/IEC 14755 (developed by ISO/IEC JTC1/SC35/WG1) Québec From owner-ietf-imaa Thu Feb 13 08:16:17 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1DGGHC11736 for ietf-imaa-bks; Thu, 13 Feb 2003 08:16:17 -0800 (PST) Received: from tux.w3.org (IDENT:root@tux.w3.org [18.29.0.27]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1DGGFd11732 for ; Thu, 13 Feb 2003 08:16:15 -0800 (PST) Received: from enoshima (IDENT:root@tux.w3.org [18.29.0.27]) by tux.w3.org (8.9.3/8.9.3) with ESMTP id LAA14526; Thu, 13 Feb 2003 11:15:42 -0500 Message-Id: <4.2.0.58.J.20030213085558.05a4b9c8@localhost> X-Sender: duerst@localhost X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J Date: Thu, 13 Feb 2003 11:15:14 -0500 To: Tan Tin Wee , Dan Kohn From: Martin Duerst Subject: Re: The typing issue Cc: ietf-imaa@imc.org In-Reply-To: <3E4B5CAD.9090304@bic.nus.edu.sg> References: <138AA78F80DCE84B8EE424399FFBF9C904FAA1@exchange.ad.skymv.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: I fully agree with Tan Tin Wee. 
To summarize, the following options have been proposed: 1) Print ISO 14755 for your email address on your namecard 2) Print punycode for your email address on your namecard 3) Print an ASCII-only email address on your namecard 4) Do nothing We don't have to do anything on this issue here; we can just leave it to the user. However, for the record, I predict here that we will mostly see 3), and there will be lots of 4) (but you probably won't see that very much). 1) and 2) will only be used very rarely. Regards, Martin. At 16:51 03/02/13 +0800, Tan Tin Wee wrote: >I always thought that a Japanese name card had >Japanese characters on one side for those who can read and write Japanese, >and English (Romaji) characters on the other side of the card, >for those who can't read or write Japanese. > >So in the same vein, the IDN user, say Japanese, has his/her Japanese >domain name or email address, not for the sake of the non-Japanese >reader/writer. If it were intended for this guy, >then the IDN user would have sent the email using >the ASCII character address. > >This means that anyone not knowing Japanese is not expected >to be able to type or read Japanese on his/her computer, and >if they sent an email address to you in Japanese, chances are >that the content included Japanese characters which you can't >read either.
> >So the issue of whether you or I prefer to type > >"xn--d9juau41awczczp@example.com than u+305D u+306E u+30B9 u+30D4 u+30FC >u+30C9 u+3067@example.com " > >is a near non-issue except in the following kind of circumstance, >for instance, say, > >"I know both Japanese and Romaji, and this Japanese (who can handle >English too) guy wrote email to me in Japanese, which I typically reply to >on my Japanese-enabled notebook, but I was flying into San Francisco, and >had to reply to him using my yahoo account (which is by then IDN'ised), >and the computer at the airport business center, doesn't have the keyboard >input system for Japanese, I had to figure out the punycode xn-- version >of his email address in order to reply to him >urgently, and cannot possible have the time to hold down >the Control Alt key to do the U+ thing." > >The issue IMHO is also not "forcing every international user >to set up a second email address". The point is that >if every IDN user is also an international person, he would >have had his namecards printed one side in say, Japanese, >for Japanese meetings, and the other side in English, >for international meetings which he is having. In the >same way, he would naturally have both an ASCII email >address and the Japanese email address. > >And in his common usage, he will be using his Japanese >language email address when mailing stuff in Japanese >to his Japanese friends and if any of the stuff leaks >out to the International friends, he may well >include his ASCII email address, and if not, he's not expecting you to >respond anyway. >In fact, if all of us English speaking folks put >ourselves in the same shoes as the IDN guy, we might be asking the opposite. 
>"I would rather type email addresses in Japanese characters rather than >type ASCII because it is not natural and I find >it rather difficult to recognise the terribly confusing >ASCII characters on the keyboard simply so that I can >write a completely Japanese email to my friend >who is also Japanese down the next block. So the >Internet stuff of sending email in ASCII, an alien >character set is pretty broken for me." > >So take home message here is that all of us should be aware that >we are arguing from the world view of the English-enabled person. >The whole purpose of having IDNs and IDN email addresses, in >my opinion, is for the sake of the un-ASCII'ed masses in the world >who take a long time, (as long as you take to key in funny IDN >characters or U+whatever characters), to use the Web or the >Email to read stuff in their own language. So long as they're >ok with it, I'm ok with it. To them, U+whatever, or >xn-whateverpunycode, or even tinwee@pobox.org.sg >are all just as bad for them, as much as U+whatever, or >xn-whatever or @example.com >is equally bad for me as a non-Japanese user. > >Sorry to take so long to put across a small point, >but I am not a true native English user. > >-- >tin wee > > > >Dan Kohn wrote: > >>D. J. Bernstein wrote: >> >> >> >>>Let me put it this way. Someone gives you a business card. The card >>>has an email address. The email address has (say) Japanese characters >>>that you've never seen before. How do you type those characters? >>> >>> >>>Answer: The card shows you, on the next line, what to type, thanks to >>>a universal keyboard standard for Unicode, namely ISO 14755. Done. >>> >>> >>>The only alternative proposal I've seen is forcing every international >>>user to set up a second email address---an ASCII address. 
Why waste >>>all that effort to work around the typing issue, imposing extra costs >>>on billions of users, when we can simply have keyboard interfaces >>>support a perfectly straightforward standard that allows everything >>>to be typed? >> >>There is an alternative to registering an ASCII domain for each IDN: >>instead, you can print the punycode on the business card below the >>IMAA/IDN email address. >> >>Compared to ISO 14755 [1], it seems to me that punycode is more >>universal (it works wherever ASCII is available), more compact (it >>supports LDH rather than hex), and no more ugly than ISO 14755. >> >>Take an email address on a business card of >>@example.com (where the bracketed characters would be >>shown as kanji). >> >>Not knowing Japanese, I'd rather see (and type) >>xn--d9juau41awczczp@example.com than u+305D u+306E u+30B9 u+30D4 u+30FC >>u+30C9 u+3067@example.com where for each u+ I need to hold down >>ctrl-alt. Of course, we both agree that they could also set up >>sonosupiidode@example.com to forward to the same mailbox. >> >>Of course, I know how much Dan hates punycode and so he'll hate this >>suggestion. 
>> >>[1] http://www-rocq.inria.fr/qui/Philippe.Deschamp/divers/ALB-CD.html >> >> >> - dan >>-- >>Dan Kohn >> >> >> >> From owner-ietf-imaa Thu Feb 13 09:29:00 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1DHT0V16853 for ietf-imaa-bks; Thu, 13 Feb 2003 09:29:00 -0800 (PST) Received: from relay-2m.club-internet.fr (relay-2m.club-internet.fr [194.158.104.41]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1DHSrd16833 for ; Thu, 13 Feb 2003 09:28:53 -0800 (PST) Received: from mine.club-internet.fr (f10v-8-217.d1.club-internet.fr [213.44.235.217]) by relay-2m.club-internet.fr (Postfix) with ESMTP id CF44D169C for ; Thu, 13 Feb 2003 18:28:44 +0100 (CET) Message-Id: <5.2.0.9.0.20030213181053.040e2a00@mail.club-internet.fr> X-Sender: jefsey@mail.club-internet.fr X-Mailer: QUALCOMM Windows Eudora Version 5.2.0.9 Date: Thu, 13 Feb 2003 18:25:04 +0100 To: ietf-imaa@imc.org From: "J-F C. (Jefsey) Morfin" Subject: Re: The typing issue In-Reply-To: <4.2.0.58.J.20030213085558.05a4b9c8@localhost> References: <3E4B5CAD.9090304@bic.nus.edu.sg> <138AA78F80DCE84B8EE424399FFBF9C904FAA1@exchange.ad.skymv.com> Mime-Version: 1.0 Content-Type: multipart/mixed; x-avg-checked=avg-ok-6CE14C30; boundary="=======68DF64EB=======" Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: --=======68DF64EB======= Content-Type: text/plain; x-avg-checked=avg-ok-6CE14C30; charset=iso-8859-1; format=flowed Content-Transfer-Encoding: 8bit A WSIS discussion of this issue, on cultural and societal grounds, shows a clear demand that "To: M.le.Président@République.fr", if sent as such, be received and read as such by the addressee. The rationale is: as long as the character set in use is technically limited, politeness accepts being constrained by those technical limitations. Once the character set becomes the usual standard character set, the usual standard social conventions must resume.
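To make the example address above concrete, here is a minimal Python sketch of what an ASCII form of it might look like. Only the domain half follows a published standard (Python's built-in "idna" codec implements IDNA 2003). The treatment of the local part, Punycode per full-stop-separated label with no prefix, is purely an assumption for illustration and is not what any IMAA draft is known to specify. The last lines reproduce the xn--d9juau41awczczp example quoted elsewhere in this thread.

```python
# Sketch only: the local-part handling below is an assumption.
address = "M.le.Président@République.fr"
local, _, domain = address.rpartition("@")

# Domain: the "idna" codec applies Nameprep, then Punycode with the
# "xn--" prefix, to each non-ASCII label.
ace_domain = domain.encode("idna").decode("ascii")


def encode_label(label):
    """Punycode-encode a non-ASCII label; leave ASCII labels untouched."""
    if label.isascii():
        return label
    return label.encode("punycode").decode("ascii")


# Local part: hypothetical per-label encoding, split at full stops.
ace_local_labels = [encode_label(label) for label in local.split(".")]

# The seven code points quoted in this thread (u+305D ... u+3067).
sono_supiido_de = "\u305D\u306E\u30B9\u30D4\u30FC\u30C9\u3067"
reference = sono_supiido_de.encode("punycode").decode("ascii")

print(ace_domain)
print(ace_local_labels)
print(reference)
```

Note that the domain comes back lowercased after a round trip, because Nameprep case-folds it, whereas the raw Punycode step used for the local part preserves case. That difference is exactly what the case-sensitivity discussion on this list is about.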
jfc At 17:15 13/02/03, Martin Duerst wrote: >I fully agree with Tan Tin Wee. To summarize, the following options >have been proposed: > >1) Print ISO 14755 for your email address on your namecard >2) Print punicode for your email address on your namecard >3) Print an ASCII-only email address on your namecard >4) Do nothing > >We don't have to do anything on this issue here, we can just leave >it to the user. However, for the record, I predict here that we >will mostly see 3), and there will be lot's of 4) (but you >probably won't see that very much). 1) and 2) will only be used >very rarely. > >Regards, Martin. --=======68DF64EB======= Content-Type: text/plain; charset=us-ascii; x-avg=cert; x-avg-checked=avg-ok-6CE14C30 Content-Disposition: inline --- Outgoing mail is certified Virus Free. Checked by AVG anti-virus system (http://www.grisoft.com).
Version: 6.0.454 / Virus Database: 253 - Release Date: 10/02/03 --=======68DF64EB=======-- From owner-ietf-imaa Thu Feb 13 09:40:38 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1DHecH18225 for ietf-imaa-bks; Thu, 13 Feb 2003 09:40:38 -0800 (PST) Received: from relay-5v.club-internet.fr (relay-5v.club-internet.fr [194.158.96.110]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1DHebd18220 for ; Thu, 13 Feb 2003 09:40:37 -0800 (PST) Received: from mine.club-internet.fr (f10v-8-217.d1.club-internet.fr [213.44.235.217]) by relay-5v.club-internet.fr (Postfix) with ESMTP id 6605B1771 for ; Thu, 13 Feb 2003 18:40:57 +0100 (CET) Message-Id: <5.2.0.9.0.20030213182505.040e5570@pop.online.fr> X-Sender: jefsey@mail.club-internet.fr X-Mailer: QUALCOMM Windows Eudora Version 5.2.0.9 Date: Thu, 13 Feb 2003 18:32:12 +0100 To: ietf-imaa@imc.org From: "J-F C. (Jefsey) Morfin" Subject: how should I address this? Mime-Version: 1.0 Content-Type: multipart/mixed; x-avg-checked=avg-ok-6CE14C30; boundary="=======353C7706=======" Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: --=======353C7706======= Content-Type: text/plain; x-avg-checked=avg-ok-6CE14C30; charset=us-ascii; format=flowed Content-Transfer-Encoding: 8bit It has been explained to me why case sensitivity is necessary for a worldwide business project that could reach a real magnitude. This is obviously key IP for its developers. They claim they are entitled to have existing RFCs respected (I am not competent enough to judge that). They also understand the concern of this WG and are ready to disclose their application under NDA to who ever is the final decision maker. How should I address this? jfc --=======353C7706======= Content-Type: text/plain; charset=us-ascii; x-avg=cert; x-avg-checked=avg-ok-6CE14C30 Content-Disposition: inline --- Outgoing mail is certified Virus Free. Checked by AVG anti-virus system (http://www.grisoft.com).
Version: 6.0.454 / Virus Database: 253 - Release Date: 10/02/03 --=======353C7706=======-- From owner-ietf-imaa Thu Feb 13 09:58:53 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1DHwr618860 for ietf-imaa-bks; Thu, 13 Feb 2003 09:58:53 -0800 (PST) Received: from [63.202.92.157] (adsl-63-202-92-157.dsl.snfc21.pacbell.net [63.202.92.157]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1DHwod18853; Thu, 13 Feb 2003 09:58:50 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: In-Reply-To: <5.2.0.9.0.20030213182505.040e5570@pop.online.fr> References: <5.2.0.9.0.20030213182505.040e5570@pop.online.fr> X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . Date: Thu, 13 Feb 2003 09:58:49 -0800 To: "J-F C. (Jefsey) Morfin" , ietf-imaa@imc.org From: Paul Hoffman / IMC Subject: Re: how should I address this? Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 6:32 PM +0100 2/13/03, J-F C. (Jefsey) Morfin wrote: >They also understand the concern of this WG and are ready to >disclose their application under NDA to who ever is the final >decision maker. How should I address this? This is not a WG. It never was, and never pretended to be. If you are unclear on this concept, please read the Tao of the IETF document. There is an undisclosed number of people and archivers who subscribe to this list. No one should send any information to this list that could be considered confidential. 
There is no "final decision maker" for this document. The authors will strive to take the reasonable technical concerns of others into account when crafting the protocol. We are quite sure that we cannot make everyone happy in the end. --Paul Hoffman, Director --Internet Mail Consortium From owner-ietf-imaa Thu Feb 13 10:15:47 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1DIFlN19434 for ietf-imaa-bks; Thu, 13 Feb 2003 10:15:47 -0800 (PST) Received: from stoneport.math.uic.edu (stoneport.math.uic.edu [131.193.178.160]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1DIFkd19430 for ; Thu, 13 Feb 2003 10:15:46 -0800 (PST) Received: (qmail 8486 invoked by uid 1016); 13 Feb 2003 18:16:14 -0000 Date: 13 Feb 2003 18:16:14 -0000 Message-ID: <20030213181614.8485.qmail@cr.yp.to> Automatic-Legal-Notices: See http://cr.yp.to/mailcopyright.html. From: "D. J. Bernstein" To: ietf-imaa@imc.org Subject: facts about the real world, part 1 References: <4.2.0.58.J.20030209173037.05a45ca0@localhost> <20030210114016.GA9872@nicemice.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Adam M. Costello writes: > I have never encountered a case-sensitive local-part. Has anyone here > ever encountered a case-sensitive local-part? Certainly. Here's the qmail situation, for example: * Case-insensitive: ASCII-lowercased mailbox names are compared to case-sensitive names in /etc/passwd and .qmail-*. * Can go either way: The original mailbox name is passed to mail delivery agents. * Case-sensitive: ezmlm, mailing-list-management software running under qmail, pays attention to case in its mailbox names. All of this is very widely deployed. ---D. J. 
Bernstein, Associate Professor, Department of Mathematics, Statistics, and Computer Science, University of Illinois at Chicago From owner-ietf-imaa Thu Feb 13 10:21:46 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1DILkd19635 for ietf-imaa-bks; Thu, 13 Feb 2003 10:21:46 -0800 (PST) Received: from stoneport.math.uic.edu (stoneport.math.uic.edu [131.193.178.160]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1DILjd19631 for ; Thu, 13 Feb 2003 10:21:45 -0800 (PST) Received: (qmail 10863 invoked by uid 1016); 13 Feb 2003 18:22:13 -0000 Date: 13 Feb 2003 18:22:13 -0000 Message-ID: <20030213182213.10862.qmail@cr.yp.to> Automatic-Legal-Notices: See http://cr.yp.to/mailcopyright.html. From: "D. J. Bernstein" To: ietf-imaa@imc.org Subject: facts about the real world, part 2 References: <1045135666.14953.TMDA@moriarty.gnomon.org.uk> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Roy Badami writes: > Delimiter is to my knowledge typically either '+' or '-', though it's > possible that there are others in use The Andrew mail system---still in use, and historically the first system to provide convenient subaddressing to users---has = as its standard separator; the same separator has some uses in ezmlm. Although the default qmail separator is -, qmail makes it very easy for the system administrator to choose any byte as a separator. (With a bit more work, you can even have different separators for different users.) ---D. J. 
Bernstein, Associate Professor, Department of Mathematics, Statistics, and Computer Science, University of Illinois at Chicago From owner-ietf-imaa Thu Feb 13 10:29:58 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1DITwV19790 for ietf-imaa-bks; Thu, 13 Feb 2003 10:29:58 -0800 (PST) Received: from stoneport.math.uic.edu (stoneport.math.uic.edu [131.193.178.160]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1DITud19780 for ; Thu, 13 Feb 2003 10:29:56 -0800 (PST) Received: (qmail 14214 invoked by uid 1016); 13 Feb 2003 18:30:24 -0000 Date: 13 Feb 2003 18:30:24 -0000 Message-ID: <20030213183024.14213.qmail@cr.yp.to> Automatic-Legal-Notices: See http://cr.yp.to/mailcopyright.html. From: "D. J. Bernstein" To: ietf-imaa@imc.org Subject: Re: The typing issue References: <138AA78F80DCE84B8EE424399FFBF9C904FAA1@exchange.ad.skymv.com> <3E4B5CAD.9090304@bic.nus.edu.sg> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Tan Tin Wee writes: > he would naturally have both an ASCII email > address and the Japanese email address As I said, that's forcing every international user to set up a second email address in ASCII. With ISO 14755, the second email address disappears. There's still ASCII information on the business card---but it isn't a second address; it's a universal explanation of how to type the first address. ---D. J. 
Bernstein, Associate Professor, Department of Mathematics, Statistics, and Computer Science, University of Illinois at Chicago From owner-ietf-imaa Thu Feb 13 11:05:40 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1DJ5eu20634 for ietf-imaa-bks; Thu, 13 Feb 2003 11:05:40 -0800 (PST) Received: from mail.uni-bielefeld.de (IDENT:72@mail2.uni-bielefeld.de [129.70.4.90]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1DJ5Yd20626 for ; Thu, 13 Feb 2003 11:05:34 -0800 (PST) Received: from 192.168.0.17 (ppp36-99.hrz.uni-bielefeld.de [129.70.36.99]) by mail.uni-bielefeld.de (Sun Internet Mail Server sims.4.0.2000.10.12.16.25.p8) with ESMTP id <0HA9002W5H0ZB4@mail.uni-bielefeld.de> for ietf-imaa@imc.org; Thu, 13 Feb 2003 20:05:32 +0100 (MET) Date: Thu, 13 Feb 2003 19:50:55 +0100 From: Marc Mutz Subject: Open Issue: Stored strings vs. queries. To: ietf-imaa@imc.org Message-id: <200302131950.55716@sendmail.mutz.com> Organization: KDE MIME-version: 1.0 Content-type: multipart/signed; protocol="application/pgp-signature"; micalg=pgp-sha1; boundary="Boundary-02=_Pk+S+FS5Koy9ryY"; charset="iso-8859-1" Content-transfer-encoding: 7bit User-Agent: KMail/1.5.9 X-PGP-Key: 0xBDBFE838 Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: --Boundary-02=_Pk+S+FS5Koy9ryY Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Content-Description: signed data Content-Disposition: inline Hi! An interesting issue is what is to be considered stored local-parts and what is a queried local-part. Obvious stored local-parts: - MTA config - address books (e.g. LDAP) Obvious queries: - address lookups (e.g.
LDAP queries) - SMTP commands Non-obvious: - Mail headers One might very well say that a mail header is a request (in that the user enters it and the SMTP server for the given domain needs to look it up to return success or failure), so that its slots would fall into the "query" category. And naturally, one would like to have the content of the message headers and that of the SMTP commands be subject to the same rules, now that RFC 2821 and RFC 2822 agree on the mailbox definition. However, the POV that the message is stored and its addresses will be subject to queries (e.g. by user filtering or searching) isn't way off, either. There's also the argument that what the server looks up is the argument given to SMTP's RCPT TO command, not what's in the header fields. So the main question I see is whether header field slots (and, consequently[1], SMTP command slots) are queries or stored. Marc [1] I think that whatever category those belong to, they should both belong to the same class.
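The difference between header field slots and SMTP command slots can be sketched in a few lines of Python. This is only an illustration of the point that the two sets of addresses are separate data. The message text, the addresses, and the `envelope_rcpts` list are all invented, and no SMTP session is actually opened.

```python
from email import message_from_string
from email.utils import getaddresses

RAW = (
    "From: Alice <alice@example.com>\n"
    "To: Some List <list@example.org>\n"
    "Subject: header slots versus envelope slots\n"
    "\n"
    "Hello.\n"
)


def header_recipients(raw):
    """Return the bare addresses found in the To and Cc header fields."""
    msg = message_from_string(raw)
    fields = msg.get_all("To", []) + msg.get_all("Cc", [])
    return [addr for _name, addr in getaddresses(fields)]


# Hypothetical envelope: what a list exploder might pass in RCPT TO.
envelope_rcpts = ["subscriber@example.net"]

header_rcpts = header_recipients(RAW)
print("header:  ", header_rcpts)
print("envelope:", envelope_rcpts)
```

The receiving server only ever looks up the envelope address. The header address travels with the stored message and is later searched or filtered, which is why either classification can be argued.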
-- Never is there so much lying as before an election, during a war, and after a hunt -- Otto von Bismarck --Boundary-02=_Pk+S+FS5Koy9ryY Content-Type: application/pgp-signature Content-Description: signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.1 (GNU/Linux) iD8DBQA+S+kP3oWD+L2/6DgRAggyAJ0UiO7wNe8s1xQIt52SIXYoZQp81gCgv1KR zbOiBJtKL6dooUQmIT+PWYw= =Xdx2 -----END PGP SIGNATURE----- --Boundary-02=_Pk+S+FS5Koy9ryY-- From owner-ietf-imaa Thu Feb 13 11:05:38 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1DJ5cC20630 for ietf-imaa-bks; Thu, 13 Feb 2003 11:05:38 -0800 (PST) Received: from mail.uni-bielefeld.de (IDENT:72@mail2.uni-bielefeld.de [129.70.4.90]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1DJ5Wd20624 for ; Thu, 13 Feb 2003 11:05:32 -0800 (PST) Received: from 192.168.0.17 (ppp36-99.hrz.uni-bielefeld.de [129.70.36.99]) by mail.uni-bielefeld.de (Sun Internet Mail Server sims.4.0.2000.10.12.16.25.p8) with ESMTP id <0HA9002W5H0ZB4@mail.uni-bielefeld.de> for ietf-imaa@imc.org; Thu, 13 Feb 2003 20:05:30 +0100 (MET) Date: Thu, 13 Feb 2003 19:48:11 +0100 From: Marc Mutz Subject: Open Issue: Splitting of local-part into labels and where? To: ietf-imaa@imc.org Message-id: <200302131948.20807@sendmail.mutz.com> Organization: KDE MIME-version: 1.0 Content-type: multipart/signed; protocol="application/pgp-signature"; micalg=pgp-sha1; boundary="Boundary-02=_0h+S+kOHuL79s6j"; charset="us-ascii" Content-transfer-encoding: 7bit User-Agent: KMail/1.5.9 X-PGP-Key: 0xBDBFE838 Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: --Boundary-02=_0h+S+kOHuL79s6j Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Content-Description: signed data Content-Disposition: inline Hi! The question of whether or not to split the dequoted local-part into labels and if so, at which delimiter characters, remains unanswered in the -00 draft.
I think that splitting into labels is needed to keep old software (esp. MTAs and filtering software) working. I can't add more to Roy's arguments here, and I feel there will not be much discussion about this particular issue. The more interesting point is _where_ to split. There are some very obvious characters (mainly full stops and hyphens), but apart from that, the remaining candidate characters are much less clear. I'd include at least the following: @ [1] + (used for subaddresses) The draft also mentions '!'; Roy mentioned the underscore. In addition, all such separators should be recognized in all their variants (for full-stops, see IDNA, for @ see draft, for others I admit to not knowing the Unicode repertoire by heart ;-)) and be replaced with their US-ASCII equivalents. The draft also mentions the option of using all non-alnum US-ASCII characters. If the number of equivalent Unicode code points is small, then this is certainly the best option, although we should then provide a mapping table for the to-usascii mapping. Marc [1] Which reminds me that the draft specifies splitting local-part and domain at _the_ at-sign. It should probably read "at the _last_ at-sign", since local-parts may contain at-signs themselves. -- It's one thing to accept a risk to your own data, but quite another to standardize on something that imposes that risk on others, no matter how unlikely you think it is that anything "really bad" will happen, and no matter how desirable the outcome.
-- Bart Schaefer, on ietf-822 --Boundary-02=_0h+S+kOHuL79s6j Content-Type: application/pgp-signature Content-Description: signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.1 (GNU/Linux) iD8DBQA+S+h03oWD+L2/6DgRAnJRAJ4upfq3kM+jKGaGTEiN0WS2q6dY9gCfcTcG 23gpkEsxEw7trxbbnwvxiMI= =uGFj -----END PGP SIGNATURE----- --Boundary-02=_0h+S+kOHuL79s6j-- From owner-ietf-imaa Thu Feb 13 11:38:41 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1DJcfQ22706 for ietf-imaa-bks; Thu, 13 Feb 2003 11:38:41 -0800 (PST) Received: from tux.w3.org (IDENT:root@tux.w3.org [18.29.0.27]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1DJced22701 for ; Thu, 13 Feb 2003 11:38:40 -0800 (PST) Received: from enoshima (IDENT:root@tux.w3.org [18.29.0.27]) by tux.w3.org (8.9.3/8.9.3) with ESMTP id OAA09916; Thu, 13 Feb 2003 14:38:41 -0500 Message-Id: <4.2.0.58.J.20030213141928.050c46f0@localhost> X-Sender: duerst@localhost X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J Date: Thu, 13 Feb 2003 14:21:26 -0500 To: "D. J. Bernstein" , ietf-imaa@imc.org From: Martin Duerst Subject: Re: The typing issue In-Reply-To: <20030213183024.14213.qmail@cr.yp.to> References: <138AA78F80DCE84B8EE424399FFBF9C904FAA1@exchange.ad.skymv.com> <3E4B5CAD.9090304@bic.nus.edu.sg> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Hello Dan, As I said, nobody is forcing anybody to do anything. Addressees who think they like ISO 14755 better will go for that one. Addressees who think they like an ASCII equivalent email address better will go for that one. Let's just see how things develop, and talk again in a few years. Regards, Martin. At 18:30 03/02/13 +0000, D. J. 
Bernstein wrote: >Tan Tin Wee writes: > > he would naturally have both an ASCII email > > address and the Japanese email address > >As I said, that's forcing every international user to set up a second >email address in ASCII. > >With ISO 14755, the second email address disappears. There's still ASCII >information on the business card---but it isn't a second address; it's a >universal explanation of how to type the first address. > >---D. J. Bernstein, Associate Professor, Department of Mathematics, >Statistics, and Computer Science, University of Illinois at Chicago From owner-ietf-imaa Thu Feb 13 11:38:44 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1DJciV22718 for ietf-imaa-bks; Thu, 13 Feb 2003 11:38:44 -0800 (PST) Received: from tux.w3.org (IDENT:root@tux.w3.org [18.29.0.27]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1DJchd22711 for ; Thu, 13 Feb 2003 11:38:43 -0800 (PST) Received: from enoshima (IDENT:root@tux.w3.org [18.29.0.27]) by tux.w3.org (8.9.3/8.9.3) with ESMTP id OAA09919; Thu, 13 Feb 2003 14:38:41 -0500 Message-Id: <4.2.0.58.J.20030213142550.03349d90@localhost> X-Sender: duerst@localhost X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J Date: Thu, 13 Feb 2003 14:34:58 -0500 To: Marc Mutz , ietf-imaa@imc.org From: Martin Duerst Subject: Re: Open Issue: Splitting of local-part into labels and where? In-Reply-To: <200302131948.20807@sendmail.mutz.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 19:48 03/02/13 +0100, Marc Mutz wrote: >Hi! > >The question of whether or not to split the dequoted local-part into >labels and if so, at which delimiter characters, remains unanswered in >the -00 draft. >The more interesting point is _where_ to split. 
There are some very >obvious characters (mainly full stops and hyphens), but apart from >that, the rest of the candidate chars is much less clear. My understanding is that this is very much a slippery slope. So we better stop as soon as possible. I see some argument for making '.' a delimiter, because that would make it easier to apply the same function to a whole email address as to a domain name only. But that on the other hand prohibits us to make '-' a delimiter. >In addition, all such separators should be recognized in all their >variants (for full-stops, see IDNA, for @ see draft, for others I admit >to not know the Unicode repertoire by heart ;-)) and be replaced with >their US-ASCII equivalents. There are a lot of variants, and they very much depend on circumstances. IDNA recognizes the full-width and ideographic variants of the full stop, but as far as I remember, it does not recognize any other variants or equivalents. Making sure that a user enters a '@' in the right form can and should be a quality of implementation issue under the responsibility of the application. Such issues can easily be integrated into the input architecture e.g. for East Asian languages and scripts. If we treat all non-alphanum ASCII characters special, then what about non-alphabetic/syllabic/ideographic symbols,... in other scripts? Shouldn't they also be treated as special separators? Regards, Martin. 
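To illustrate the point about full-stop variants: a minimal Python sketch that maps the dot variants IDNA treats as label separators onto the ASCII full stop. The four code points are the ones listed in RFC 3490, section 3.1 (which also includes the halfwidth ideographic full stop); the function names are made up for this illustration, and applying the mapping to a local part rather than a domain name is only an assumption about how IMAA might reuse it.

```python
# Code points that IDNA (RFC 3490, section 3.1) requires to be recognized as dots:
# U+002E full stop, U+3002 ideographic full stop,
# U+FF0E fullwidth full stop, U+FF61 halfwidth ideographic full stop.
IDNA_DOTS = "\u002e\u3002\uff0e\uff61"


def normalize_dots(text: str) -> str:
    """Replace every IDNA dot variant with the ASCII full stop."""
    return "".join("." if ch in IDNA_DOTS else ch for ch in text)


def split_labels(text: str) -> list:
    """Split a string into labels at any IDNA dot variant (hypothetical helper)."""
    return normalize_dots(text).split(".")


if __name__ == "__main__":
    print(split_labels("first\u3002second\uff0ethird"))
```

Nothing here touches '@', '-' or '+'; whether those should get the same treatment is exactly the open question in this thread.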
From owner-ietf-imaa Thu Feb 13 11:49:14 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1DJnE023016 for ietf-imaa-bks; Thu, 13 Feb 2003 11:49:14 -0800 (PST) Received: from pie1.i-dns.net ([203.81.44.31]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1DJnDd23012 for ; Thu, 13 Feb 2003 11:49:13 -0800 (PST) Received: from localhost (localhost [127.0.0.1]) by pie1.i-dns.net (Postfix) with ESMTP id 6E60978A18 for ; Thu, 13 Feb 2003 19:49:14 +0000 (GMT) Received: from pie1.i-dns.net ([127.0.0.1]) by localhost (pie1.i-dns.net [127.0.0.1:10024]) (amavisd-new) with SMTP id 98275-07 for ; Thu, 13 Feb 2003 19:49:12 +0000 (GMT) Received: from jeffreyibm (unknown [211.219.53.139]) by pie1.i-dns.net (Postfix) with SMTP id 83C7F78A2A for ; Thu, 13 Feb 2003 19:49:08 +0000 (GMT) Message-ID: <008701c2d399$40ca9330$fc00a8c0@jeffreyibm> From: "Jeffrey J Zahari" To: References: <138AA78F80DCE84B8EE424399FFBF9C904FAA1@exchange.ad.skymv.com> <3E4B5CAD.9090304@bic.nus.edu.sg> <20030213183024.14213.qmail@cr.yp.to> Subject: Re: The typing issue Date: Fri, 14 Feb 2003 04:50:56 +0900 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2800.1106 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1106 X-Virus-Scanned: by amavisd-new Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Just out of curiosity, are there any complete systems/applications that support ISO14755? Because implementing idnc3 phase 1 seems like it could be a long time a coming. jeffrey j zahari ----- Original Message ----- From: "D. J. 
Bernstein" To: Sent: Friday, February 14, 2003 3:30 AM Subject: Re: The typing issue > > Tan Tin Wee writes: > > he would naturally have both an ASCII email > > address and the Japanese email address > > As I said, that's forcing every international user to set up a second > email address in ASCII. > > With ISO 14755, the second email address disappears. There's still ASCII > information on the business card---but it isn't a second address; it's a > universal explanation of how to type the first address. > > ---D. J. Bernstein, Associate Professor, Department of Mathematics, > Statistics, and Computer Science, University of Illinois at Chicago > From owner-ietf-imaa Thu Feb 13 12:57:08 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1DKv8K26394 for ietf-imaa-bks; Thu, 13 Feb 2003 12:57:08 -0800 (PST) Received: from mail.uni-bielefeld.de (IDENT:72@mail2.uni-bielefeld.de [129.70.4.90]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1DKv2d26385 for ; Thu, 13 Feb 2003 12:57:02 -0800 (PST) Received: from 192.168.0.17 (ppp36-220.hrz.uni-bielefeld.de [129.70.36.220]) by mail.uni-bielefeld.de (Sun Internet Mail Server sims.4.0.2000.10.12.16.25.p8) with ESMTP id <0HA9008APM70QG@mail.uni-bielefeld.de> for ietf-imaa@imc.org; Thu, 13 Feb 2003 21:57:01 +0100 (MET) Date: Thu, 13 Feb 2003 21:43:28 +0100 From: Marc Mutz Subject: Re: Open Issue: Splitting of local-part into labels and where? 
In-reply-to: <4.2.0.58.J.20030213142550.03349d90@localhost> To: ietf-imaa@imc.org Message-id: <200302132143.49081@sendmail.mutz.com> Organization: KDE MIME-version: 1.0 Content-type: multipart/signed; protocol="application/pgp-signature"; micalg=pgp-sha1; boundary="Boundary-02=_EOAT+UDYhC4polm"; charset="us-ascii" Content-transfer-encoding: 7bit User-Agent: KMail/1.5.9 X-PGP-Key: 0xBDBFE838 References: <4.2.0.58.J.20030213142550.03349d90@localhost> Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: --Boundary-02=_EOAT+UDYhC4polm Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Content-Description: signed data Content-Disposition: inline On Thursday 13 February 2003 20:34, Martin Duerst wrote: > I see some argument > for making '.' a delimiter, because that would make it easier > to apply the same function to a whole email address as to > a domain name only. We will not be able to use the same function for the LHS as for the RHS=20 anyway. That's simply b/c there are characters that are allowed in=20 local-parts, but not in domains (the definition of quoted-string with=20 that of dot-atom in rfc 2822). > If we treat all non-alphanum ASCII characters special, then what > about non-alphabetic/syllabic/ideographic symbols,... in other > scripts? Shouldn't they also be treated as special separators? No, b/c they were never used as separators (since they can't appear in=20 local-parts, obviously). OTOH, dot, plus, etc _are_ used as separators currently. And since=20 no-one here can tell what separators are used in the wild (e.g. b/c the=20 subaddress separator is config'able), it's only logical to split at=20 non-alphanum ASCII characters. It also makes the ACE-ILP more readable=20 for those that have to deal with it (users of non-imaa compliant MUAs,=20 MTA admins). 
We should, of course, exclude pathological cases, such as control=20 characters, so a quick lookup in a charset table reveals the following=20 candidates: HT CRLF SP (whitespace) ! " # $ % & ' ( ) * +, - . / : ; < =3D > ? @ [ \ ] ^ _ ` { | } ~ all of which are currently allowed in quoted-string and thus in=20 local-part. (Note: Only splitting at dots would e.g. mean that we mangle whitespace!=20 I don't think that that's what we want, esp. b/c CRLF is a "multi-byte=20 character", so to speak) Marc =2D-=20 It has become fashionable in the post Cold War world to label opponents as terrorists [...]. By doing so, the authorities instill within society a culture of fear, leading people to accept that their rights (and the rights of others) be trampled on for the sake of the common good. In other words, it justifies the loss of privacy and a state of surveillance they would otherwise not accept. Both communism and fascism were examples of this technique used to perfection. -- John Horvath: The Internet: A Terrorist Network? 
Telepolis 2001/08/22 (#9350) --Boundary-02=_EOAT+UDYhC4polm Content-Type: application/pgp-signature Content-Description: signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.1 (GNU/Linux) iD8DBQA+TAOE3oWD+L2/6DgRAqDGAKDe3PTP8PdgdVJtu88Q2zuKh4qtuwCbBMSV EJQ0g7Rk/4vI3u2yGVtRdUA= =5F5m -----END PGP SIGNATURE----- --Boundary-02=_EOAT+UDYhC4polm-- From owner-ietf-imaa Thu Feb 13 13:04:42 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1DL4gh26691 for ietf-imaa-bks; Thu, 13 Feb 2003 13:04:42 -0800 (PST) Received: from [63.202.92.157] (adsl-63-202-92-157.dsl.snfc21.pacbell.net [63.202.92.157]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1DL4ad26687; Thu, 13 Feb 2003 13:04:36 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: In-Reply-To: <200302131950.55716@sendmail.mutz.com> References: <200302131950.55716@sendmail.mutz.com> X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . Date: Thu, 13 Feb 2003 13:04:33 -0800 To: Marc Mutz , ietf-imaa@imc.org From: Paul Hoffman / IMC Subject: Re: Open Issue: Stored strings vs. queries. Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 7:50 PM +0100 2/13/03, Marc Mutz wrote: >One might very well say that a mail header is a request (in that the >user enters it and the SMTP server for the given domain needs to look >it up to return success or failure), so that it's slots would fall into >the "query" category. 
And naturally, one would like to have the content >of the message headers and that of the smtp commands be subject to the >same rules, now that 282{1,2} agree on the mailbox definition. > >However, the POV that the message is stored and it's addresses will be >subject to queries (e.g. by user filtering or searching) isn't way off, >too. There's also the argument that what the server looks up is the >argument given to SMTP's RCPT TO command, not what's in the header >fields. This doesn't match the general definition of queries and stored in Stringprep. I cannot see how a header would be considered a query. It isn't asking for anything, it is a part of the message. --Paul Hoffman, Director --Internet Mail Consortium From owner-ietf-imaa Thu Feb 13 14:12:09 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1DMC9X28252 for ietf-imaa-bks; Thu, 13 Feb 2003 14:12:09 -0800 (PST) Received: from patan.sun.com (patan.Sun.COM [192.18.98.43]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1DMC7d28244 for ; Thu, 13 Feb 2003 14:12:07 -0800 (PST) Received: from esunmail ([129.147.58.120]) by patan.sun.com (8.9.3+Sun/8.9.3) with ESMTP id PAA10894 for ; Thu, 13 Feb 2003 15:12:09 -0700 (MST) Received: from xpa-fe2 (esunmail [129.147.58.120]) by edgemail1.Central.Sun.COM (iPlanet Messaging Server 5.2 HotFix 1.08 (built Dec 6 2002)) with ESMTP id <0HA9000SWPO9VA@edgemail1.Central.Sun.COM> for ietf-imaa@imc.org; Thu, 13 Feb 2003 15:12:09 -0700 (MST) Received: from nifty-jr.west.sun.com ([129.153.12.95]) by mail.sun.net (iPlanet Messaging Server 5.2 HotFix 1.08 (built Dec 6 2002)) with ESMTPSA id <0HA900JDTPO63G@mail.sun.net> for ietf-imaa@imc.org; Thu, 13 Feb 2003 15:12:08 -0700 (MST) Date: Thu, 13 Feb 2003 14:11:50 -0800 From: Chris Newman Subject: Re: facts about the real world, part 2 In-reply-to: <20030213182213.10862.qmail@cr.yp.to> To: "D. J. 
Bernstein" , ietf-imaa@imc.org Message-id: <2147483647.1045145510@nifty-jr.west.sun.com> MIME-version: 1.0 X-Mailer: Mulberry/3.0.0 (Mac OS X) Content-type: text/plain; charset=us-ascii; format=flowed Content-transfer-encoding: 7BIT Content-disposition: inline X-message-flag: Outlook: the best virus distribution system around References: <1045135666.14953.TMDA@moriarty.gnomon.org.uk> <20030213182213.10862.qmail@cr.yp.to> Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: begin quotation by D. J. Bernstein on 2003/2/13 18:22 +0000: > Roy Badami writes: >> Delimiter is to my knowledge typically either '+' or '-', though it's >> possible that there are others in use > > The Andrew mail system---still in use, and historically the first system > to provide convenient subaddressing to users---has = as its standard > separator; the same separator has some uses in ezmlm. AMS at Carnegie Mellon University (where it was created) used '+'. Perhaps the AMS deployment you saw changed it from the default '+' to '='. But there are some other systems using "=" as a subaddress delimiter. - Chris From owner-ietf-imaa Thu Feb 13 14:32:53 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1DMWrN28676 for ietf-imaa-bks; Thu, 13 Feb 2003 14:32:53 -0800 (PST) Received: from stoneport.math.uic.edu (stoneport.math.uic.edu [131.193.178.160]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1DMWqd28672 for ; Thu, 13 Feb 2003 14:32:52 -0800 (PST) Received: (qmail 53066 invoked by uid 1016); 13 Feb 2003 22:33:20 -0000 Date: 13 Feb 2003 22:33:20 -0000 Message-ID: <20030213223320.53054.qmail@cr.yp.to> Automatic-Legal-Notices: See http://cr.yp.to/mailcopyright.html. From: "D. J. 
Bernstein" To: ietf-imaa@imc.org Subject: Re: facts about the real world, part 2 References: <1045135666.14953.TMDA@moriarty.gnomon.org.uk> <20030213182213.10862.qmail@cr.yp.to> <2147483647.1045145510@nifty-jr.west.sun.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Chris Newman writes: > AMS at Carnegie Mellon University (where it was created) used '+'. Perhaps > the AMS deployment you saw changed it from the default '+' to '='. My comment was, in fact, based on current = addresses at CMU. ---D. J. Bernstein, Associate Professor, Department of Mathematics, Statistics, and Computer Science, University of Illinois at Chicago From owner-ietf-imaa Thu Feb 13 14:31:39 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1DMVdA28656 for ietf-imaa-bks; Thu, 13 Feb 2003 14:31:39 -0800 (PST) Received: from mail.uni-bielefeld.de (IDENT:72@mail2.uni-bielefeld.de [129.70.4.90]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1DMVWd28650 for ; Thu, 13 Feb 2003 14:31:33 -0800 (PST) Received: from 192.168.0.17 (ppp36-244.hrz.uni-bielefeld.de [129.70.36.244]) by mail.uni-bielefeld.de (Sun Internet Mail Server sims.4.0.2000.10.12.16.25.p8) with ESMTP id <0HA900BS3QKEXX@mail.uni-bielefeld.de> for ietf-imaa@imc.org; Thu, 13 Feb 2003 23:31:32 +0100 (MET) Date: Thu, 13 Feb 2003 23:20:01 +0100 From: Marc Mutz Subject: Re: Open Issue: Stored strings vs. queries. 
In-reply-to: To: ietf-imaa@imc.org Message-id: <200302132320.20874@sendmail.mutz.com> Organization: KDE MIME-version: 1.0 Content-type: multipart/signed; protocol="application/pgp-signature"; micalg=pgp-sha1; boundary="Boundary-02=_koBT+av/jSvKTA3"; charset="us-ascii" Content-transfer-encoding: 7bit User-Agent: KMail/1.5.9 X-PGP-Key: 0xBDBFE838 References: <200302131950.55716@sendmail.mutz.com> Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: --Boundary-02=_koBT+av/jSvKTA3 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Content-Description: signed data Content-Disposition: inline On Thursday 13 February 2003 22:04, Paul Hoffman / IMC wrote: > I cannot see how a header would be considered a query. OK, let's do a Gedankenexperiment: I think we agree that the smtp slots have to be queries. (Else, a single=20 intermediate MTA running oldVersion will break the transport of a=20 newVersion-MUA-composed message to a newVersion-mailbox. Consider the case of stored mail header field slots. I want to send a message with an oldVersion MUA to a newVersion mailbox=20 whose local-part contains a code point from the set U(oldVersion) \cap AO(newVersion) My MUA will rfc2822-serialize the message, thereby checking that any=20 to/cc/bcc address meets the requirements of oldVersion-stored=20 addresses. The address for the newVersion mailbox will fail that step,=20 since it contains a code point from oldVersion of the U set. This way, only addresses meeting the requirements of stored addresses=20 are permitted. This doesn't even change if the MUA keeps track of the=20 user-entered address separately of the rfc2822 serialization (for later=20 use in the smtp rcpt to command), since it is simply not permitted to=20 put the address into the rfc2822 serialization. 
The only option for the MUA would be to leave out any address that fails=20 the stored requirements test, but not the query requirements test[1]=20 from the rfc2822 serialization, but use the address in the RCPT TO=20 command. Well, that's what spammers do... Consider OTOH the case of query mail header field slots. My MUA will detect that the mailbox meets the requirements of a query=20 mailbox, and happily put the mailbox together with the others (if any)=20 into both the rfc2822 serialization slots, as well as the SMTP slots.=20 Sending succeeds. Conclusion: If we want oldVersion MUAs to be able to send mails to=20 newVersion mailboxes (ie. a mailbox whose address contains newly=20 assigned code points), then rfc2822 slots need to be queries or - if we=20 insist they are stored - MUAs be required to omit the newVersion=20 mailbox from the message, but give it in smtp rcpt to. Since the latter option isn't really an option (all mails from=20 oldVersion MUA users would appear to be bcc'ed to me, with the BCC=20 header stripped), the question boils down to: Do we want oldVersion MUAs to be able to send messages to newVersion=20 mailboxes? My answer would be "Yes", since that's what I understand IDNA works hard=20 to achieve for DNS (an oldVersion browser can still do a dns lookup of=20 a newVersion IDN). Marc [1] If both fail, the address should be rejected. =2D-=20 Ich gegen meinen Bruder. Ich und mein Bruder gegen unseren Cousin. Ich, mein Bruder und unser Cousin gegen unsere Nachbarn. Wir alle gegen den Fremden. 
-- Beduinen-Sprichwort --Boundary-02=_koBT+av/jSvKTA3 Content-Type: application/pgp-signature Content-Description: signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.1 (GNU/Linux) iD8DBQA+TBok3oWD+L2/6DgRAmEAAJ9yEoVVfilLvFTbB5d7HsezDI0v5QCg4jSO aqozkASVdbPFFRP6zXmHD4k= =g5Dt -----END PGP SIGNATURE----- --Boundary-02=_koBT+av/jSvKTA3-- From owner-ietf-imaa Thu Feb 13 15:09:40 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1DN9eG29479 for ietf-imaa-bks; Thu, 13 Feb 2003 15:09:40 -0800 (PST) Received: from nicemice.net (arwen.CS.Berkeley.EDU [128.32.132.165]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1DN9dd29475 for ; Thu, 13 Feb 2003 15:09:39 -0800 (PST) Received: from amc by nicemice.net with local (Exim 3.35 #1 (Debian)) id 18jSUA-0006bq-00 for ; Thu, 13 Feb 2003 15:09:42 -0800 Date: Thu, 13 Feb 2003 23:09:42 +0000 From: "Adam M. Costello" To: ietf-imaa@imc.org Subject: Re: how should I address this? Message-ID: <20030213230942.GA25048@nicemice.net> Reply-To: IETF IMAA list References: <5.2.0.9.0.20030213182505.040e5570@pop.online.fr> <5.2.0.9.0.20030213182505.040e5570@pop.online.fr> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <5.2.0.9.0.20030213182505.040e5570@pop.online.fr> User-Agent: Mutt/1.4i Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: "J-F C. (Jefsey) Morfin" wrote: > I have been explained why case-sensitivity is necessary to a worldwide > business project... > > They are ready to disclose their application under NDA to who ever is > the final decision maker. Paul Hoffman / IMC replied: > There is no "final decision maker" for this document. The authors > will strive to take the reasonable technical concerns of others into > account when crafting the protocol. Furthermore, IMAA aims to be a public standard, and therefore it is being developed by a public forum (this mailing list). 
It would be improper for the design decisions to be based on secret discussions. The business project in question can choose to participate in this public forum, or they can choose not to participate. AMC From owner-ietf-imaa Thu Feb 13 15:22:48 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1DNMmr29693 for ietf-imaa-bks; Thu, 13 Feb 2003 15:22:48 -0800 (PST) Received: from smtp5.andrew.cmu.edu (SMTP5.andrew.cmu.edu [128.2.10.85]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1DNMld29689 for ; Thu, 13 Feb 2003 15:22:47 -0800 (PST) Received: from penguin.andrew.cmu.edu (PENGUIN.andrew.cmu.edu [128.2.121.100]) by smtp5.andrew.cmu.edu (8.12.3.Beta2/8.12.3.Beta2) with ESMTP id h1DNMmDX006235; Thu, 13 Feb 2003 18:22:48 -0500 Date: Thu, 13 Feb 2003 18:22:48 -0500 Message-Id: <200302132322.h1DNMmDX006235@smtp5.andrew.cmu.edu> From: Lawrence Greenfield X-Mailer: BatIMail version 3.3 To: ietf-imaa@imc.org In-reply-to: <20030213223320.53054.qmail@cr.yp.to> Subject: Re: facts about the real world, part 2 References: <1045135666.14953.TMDA@moriarty.gnomon.org.uk> <20030213182213.10862.qmail@cr.yp.to> <2147483647.1045145510@nifty-jr.west.sun.com> <20030213223320.53054.qmail@cr.yp.to> User-Agent: SEMI/1.14.3 (Ushinoya) FLIM/1.14.3 (=?ISO-8859-4?Q?Unebigory?= =?ISO-8859-4?Q?=F2mae?=) Emacs/21.2 (i686-pc-linux-gnu) MULE/5.0 (SAKAKI) MIME-Version: 1.0 (generated by SEMI 1.14.3 - "Ushinoya") Content-Type: text/plain; charset=US-ASCII Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Date: 13 Feb 2003 22:33:20 -0000 From: "D. J. Bernstein" Chris Newman writes: > AMS at Carnegie Mellon University (where it was created) used > '+'. Perhaps the AMS deployment you saw changed it from the > default '+' to '='. My comment was, in fact, based on current = addresses at CMU. That's a different system from AMS, actually. That's run by the CS department. That's running MMDF. 
The large AMS installation at CMU has been replaced by Cyrus IMAP and Sendmail. MMDF has some interesting other quirks besides supporting both + and = separator characters. It doesn't support ESMTP, and it actually chokes on 8-bit characters _anywhere_ in the message. Larry From owner-ietf-imaa Thu Feb 13 15:53:20 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1DNrKM00733 for ietf-imaa-bks; Thu, 13 Feb 2003 15:53:20 -0800 (PST) Received: from nicemice.net (arwen.CS.Berkeley.EDU [128.32.132.165]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1DNrJd00729 for ; Thu, 13 Feb 2003 15:53:19 -0800 (PST) Received: from amc by nicemice.net with local (Exim 3.35 #1 (Debian)) id 18jTAN-0006iG-00; Thu, 13 Feb 2003 15:53:19 -0800 Date: Thu, 13 Feb 2003 23:53:19 +0000 From: "Adam M. Costello" To: ietf-imaa@imc.org Cc: Roy Badami Subject: Re: A couple of comments on the open issues... Message-ID: <20030213235319.GB25048@nicemice.net> Reply-To: IETF IMAA list , Roy Badami References: <1045135861.15010.TMDA@moriarty.gnomon.org.uk> <1045135666.14953.TMDA@moriarty.gnomon.org.uk> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1045135861.15010.TMDA@moriarty.gnomon.org.uk> <1045135666.14953.TMDA@moriarty.gnomon.org.uk> User-Agent: Mutt/1.4i Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Roy Badami wrote: > there may be sites that wish to allow the creation of > internationalized mailboxes without being in a position to upgrade to > IMAA-compliant software. > > In some cases, it may be adequate simply to create mailboxes within > the legacy system corresponding to the ACE-encoded local part. Indeed, one of the primary design goals of IMAA, like IDNA, is that it not be necessary to upgrade any infrastructure, but only end-user applications. > I'm particularly keen that the use of a tag or suffix with a local > username is not broken by IMAA.
Many MTAs provide the functionality
> that all mail addressed to will be delivered
> to , either by default, or as a configuration option.
>
> This effectively allows a user of such an MTA (that has been suitably
> configured) to have multiple e-mail addresses without requiring any
> action on the part of the mail administrator, and allows the user to
> run scripts that process their mail according to the suffix.

So if IMAA operates on the entire local part, this multiple-address feature will be unavailable to users with internationalized local parts. This is a good argument in favor of having IMAA operate independently on subparts of the local part. (Of course, there are arguments on the other side too, and I'm sure we'll hear them. I haven't yet chosen a side in this debate.)

> I would urge the list to ensure that as far as possible local parts
> generated by any IMAA specification adhere to a far more conservative
> character set than that mandated by RFC822/2822, namely the set of
> characters that has historically been commonplace in local parts. I
> would personally regard this as being alphanumerics, period, hyphen
> and underscore.
>
> It is unfortunately still the case that there are systems in
> widespread use which have difficulty in accommodating RFC822 addresses
> that contain valid but unusual characters.
>
> there may be MTAs in use which ascribe special meaning to unusual
> characters in a way which is not easily configurable (if it is
> configurable at all).

I think that's also a good argument. I have a strong gut feeling that we wouldn't want dots in the prefix, for any number of reasons.
AMC From owner-ietf-imaa Thu Feb 13 16:54:07 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1E0s7e02122 for ietf-imaa-bks; Thu, 13 Feb 2003 16:54:07 -0800 (PST) Received: from moriarty.gnomon.org.uk (pc4-cmbg2-5-cust162.cmbg.cable.ntl.com [81.100.86.162]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1E0s5d02118 for ; Thu, 13 Feb 2003 16:54:05 -0800 (PST) Received: from moriarty.gnomon.org.uk (roy@localhost [127.0.0.1]) by moriarty.gnomon.org.uk (8.12.3/8.12.3/Debian -4) with ESMTP id h1E0sOKA018315 for ; Fri, 14 Feb 2003 00:54:24 GMT To: ietf-imaa@imc.org cc: roy@gnomon.org.uk Subject: Re: Open Issue: Splitting of local-part into labels and where? From: Roy Badami Date: Fri, 14 Feb 2003 00:54:24 +0000 Message-ID: <1045184064.18314.TMDA@moriarty.gnomon.org.uk> X-Delivery-Agent: TMDA/0.65 (Johnstown) X-Primary-Address: roy@gnomon.org.uk Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: I'd include at least the following: @ [1] + (used for subaddresses) The draft also mentions '!', Roy mentioned the underscore. Actually, I mention hyphen as a separator. Underscore I simply mention as a common character in e-mail addresses of the form r_badami, and hence part of my 'safe set' of characters that it is reasonable to assume all common software handles correctly. 
-roy From owner-ietf-imaa Thu Feb 13 16:49:17 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1E0nH302040 for ietf-imaa-bks; Thu, 13 Feb 2003 16:49:17 -0800 (PST) Received: from moriarty.gnomon.org.uk (pc4-cmbg2-5-cust162.cmbg.cable.ntl.com [81.100.86.162]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1E0nFd02035 for ; Thu, 13 Feb 2003 16:49:15 -0800 (PST) Received: from moriarty.gnomon.org.uk (roy@localhost [127.0.0.1]) by moriarty.gnomon.org.uk (8.12.3/8.12.3/Debian -4) with ESMTP id h1E0nXKA018277 for ; Fri, 14 Feb 2003 00:49:34 GMT To: ietf-imaa@imc.org cc: roy@gnomon.org.uk Subject: Another issue: quoting From: Roy Badami Date: Fri, 14 Feb 2003 00:49:33 +0000 Message-ID: <1045183773.18276.TMDA@moriarty.gnomon.org.uk> X-Delivery-Agent: TMDA/0.65 (Johnstown) X-Primary-Address: roy@gnomon.org.uk Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Apologies if I've missed this (I've only had time to skim the document and discussion so far), but I'd like to raise the issue of quoting as a possible open issue (that isn't listed as such in the base document). Quoting in local parts is a messy construct. As I understand it, the revised grammar in RFC2822 doesn't allow quoting (or linear whitespace) in a localpart, though of course they are still allowed for backward compatibility as part of the obsolete syntax. Given that these constructs are deprecated by RFC2822, is there any reason to permit their use in internationalized local parts at all? I can think of one very strong reason for forbidding them outright in internationalized localparts: complexity. The current specification is rarely implemented correctly on the current Internet. Fully supporting quoting in IMAA could significantly complicate the complexity of IMAA, depending on what other design choices are made, and there's no reason to believe it would be supported any better than it is with ASCII. 
One of the problems with the 822 specification is that localparts aren't just a sequence of characters, with quoting being just a transport encoding. Localparts are a sequence of tokens, namely atoms and dots. In fact, the definition of "dequoted local part" in the base document is problematic as it stands, because in 822 there isn't any such thing.

By way of example, it seems to me that the RFC822 parse of the following two localparts is distinct, and as a result they could validly refer to distinct mailboxes:

  roy.badami
  "roy.badami"

The first is a sequence of three tokens: the atom "roy", the token "." and the atom "badami". The second is a single atom.

The following three localparts are all equivalent, however:

  roy.badami
  roy . badami
  "roy"."badami"

Most software wants to treat the localpart as simply a sequence of characters (at least after dequoting) but it isn't defined that way.

Continuing to allow quoting in localparts may not be without its benefits (e.g. for X.400 gatewaying), but the authors of RFC2822 appear to have already made a decision on the future of quoting. I would therefore propose that we consider defining a grammar for internationalized local parts based on the 2822 grammar. As a result, any localpart that contains a non-ASCII character MUST NOT contain any of the following:

  quoting (i.e. backslash or double-quote)
  whitespace
  any ASCII character that is not permitted unquoted.

Opinions?
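The proposed restriction can be sketched as a small check. This is only an illustration under stated assumptions: the allowed ASCII set is the RFC 2822 `atext` characters plus the dot, the function name is invented, pure-ASCII local parts are treated as out of scope, and dot placement (leading, trailing or doubled dots) is not checked.

```python
import string

# RFC 2822 "atext": the ASCII characters allowed in an atom without quoting.
ATEXT = set(string.ascii_letters + string.digits + "!#$%&'*+-/=?^_`{|}~")


def violates_proposal(local_part: str) -> bool:
    """Return True if a local part containing non-ASCII characters also
    contains quoting, ASCII whitespace, or any other ASCII character
    that is not permitted unquoted (hypothetical helper)."""
    if all(ord(ch) < 128 for ch in local_part):
        # Pure-ASCII local parts are outside the proposal; legacy rules apply.
        return False
    for ch in local_part:
        if ord(ch) < 128 and ch != "." and ch not in ATEXT:
            # Covers backslash, double-quote, space, tab, CR, LF, '@', etc.
            return True
    return False


if __name__ == "__main__":
    for candidate in ['"roy.badami"', "r\u00f6y.badami", '"r\u00f6y"', "r\u00f6y badami"]:
        print(repr(candidate), violates_proposal(candidate))
```

Under this reading, `"roy.badami"` with quotes stays legal because it is pure ASCII, while the same quoting around a non-ASCII name is rejected.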
-roy From owner-ietf-imaa Thu Feb 13 17:12:04 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1E1C4b02551 for ietf-imaa-bks; Thu, 13 Feb 2003 17:12:04 -0800 (PST) Received: from moriarty.gnomon.org.uk (pc4-cmbg2-5-cust162.cmbg.cable.ntl.com [81.100.86.162]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1E1C2d02547 for ; Thu, 13 Feb 2003 17:12:02 -0800 (PST) Received: from moriarty.gnomon.org.uk (roy@localhost [127.0.0.1]) by moriarty.gnomon.org.uk (8.12.3/8.12.3/Debian -4) with ESMTP id h1E1CKKA018392 for ; Fri, 14 Feb 2003 01:12:21 GMT To: ietf-imaa@imc.org cc: roy@gnomon.org.uk Subject: Re: Open Issue: Splitting of local-part into labels and where? From: Roy Badami Date: Fri, 14 Feb 2003 01:12:20 +0000 Message-ID: <1045185140.18391.TMDA@moriarty.gnomon.org.uk> X-Delivery-Agent: TMDA/0.65 (Johnstown) X-Primary-Address: roy@gnomon.org.uk Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: My understanding is that this is very much a slippery slope. So we better stop as soon as possible. I see some argument for making '.' a delimiter, because that would make it easier to apply the same function to a whole email address as to a domain name only. If we assume we are going to use punycode, then you pretty much *have* to split at dots, otherwise you'd have cases where punycode generates strings containing multiple consecutive dots. Yes, you could make this valid by generating suitable quoting, but you'd be creating local parts that are (a) deprecated by 2822 and (b) stand a significant chance of not working on the current Internet. (See my separate post on quoting.) But that on the other hand prohibits us to make '-' a delimiter. But I don't think it's clear at this stage that the implementational convenience of using the same function justifies the cost to the utility of the specification. My gut feeling at this point is that it doesn't.
Actually, I'm not even convinced at this point that it's clear that we should use punycode for the ACE, eg if we choose to ACE-encode strings containing dots, and require the result not to contain consecutive dots. Adam, what comes after AMC-ACE-Z ? :) -roy From owner-ietf-imaa Thu Feb 13 17:48:38 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1E1mcl03507 for ietf-imaa-bks; Thu, 13 Feb 2003 17:48:38 -0800 (PST) Received: from moriarty.gnomon.org.uk (pc4-cmbg2-5-cust162.cmbg.cable.ntl.com [81.100.86.162]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1E1mWd03503 for ; Thu, 13 Feb 2003 17:48:34 -0800 (PST) Received: from moriarty.gnomon.org.uk (roy@localhost [127.0.0.1]) by moriarty.gnomon.org.uk (8.12.3/8.12.3/Debian -4) with ESMTP id h1E1mkKA018527 for ; Fri, 14 Feb 2003 01:48:46 GMT To: ietf-imaa@imc.org cc: roy@gnomon.org.uk Subject: Splitting and encoding of local-parts: some thoughts From: Roy Badami Date: Fri, 14 Feb 2003 01:48:46 +0000 Message-ID: <1045187326.18524.TMDA@moriarty.gnomon.org.uk> X-Delivery-Agent: TMDA/0.65 (Johnstown) X-Primary-Address: roy@gnomon.org.uk Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Some thoughts on where to split, and how to encode. (1) We could encode the whole dequoted localpart in one go. But then we can't use punycode as the ACE, because it could produce multiple consecutive dots, and I really don't think we want to go there. (That's not to say we shouldn't consider this option with a different ACE, which could still be a bootstring profile.) (2) We could simply split at dots, and encode using punycode as the ACE. Each token (between dots) would of course have to be marked in some way (eg with an ACE prefix) if it's ACE-encoded. This means that if dots are commonplace in internationalized localparts, the extra ACE prefixes would eat into our length limit. 
Splitting just at dots, however, doesn't allow much of the manipulation that is commonly performed on localparts, so the base document proposes an alternative for consideration, namely (3) split at all non-alphanumeric ASCII characters.

So we need to pose the question: are non-alphanumeric ASCII characters (or characters that normalize to them) likely to be commonplace in internationalized localparts? If so, this will eat into our length limit further.

If we want to go down the road of accepting some added complexity in the algorithms in order to maximize the length of string we can encode, then perhaps neither (2) nor (3) is ideal. Then again, maybe even (3) is good enough if such characters will in practice be uncommon in internationalized local parts. (I note that the IDN WG essentially chose to trade simplicity for coding efficiency in their selection of punycode against the previous WG favourites of RACE and DUDE.)

However, if we opt for (1) then we pretty much rule out punycode (though we could use a bootstring profile in which dot is not a basic code point). If we have to go for something different from the IDN encoding, there's no real extra cost in going for something more radically different.

While I'm there, though, I'd suggest that if we end up defining a new bootstring profile here, we should consider going for underscore rather than hyphen as the delimiter, and use underscores rather than hyphens in the ACE prefix. The reason is that hyphen is often used as a delimiter in structured local parts, whereas underscore is a safe character to use in a localpart that is almost never used in this way.

I'd also like to throw a slightly wacky idea out there: if we accept that any prefixes and suffixes applied to (or stripped off from) e-mail addresses are restricted to ASCII, then it is adequate to identify the substring of the localpart from the first non-ASCII character to the last non-ASCII character, and encode and mark that in some way.
It then becomes safe to append and strip arbitrary ASCII suffixes from IMAs. If we avoid common separators such as plus and minus in the output from the ACE, and in whatever markup we use to tag the string as ACE-encoded, it also becomes safe to split on common separators. My concern here is that maybe even this doesn't go far enough. Is it reasonable to require users of localpart suffixes (delimited by plus or minus) to restrict their suffixes to ASCII characters? Just some more thoughts, and apologies for the fact that this has turned into a bit of a stream of consciousness... :) -roy From owner-ietf-imaa Thu Feb 13 17:59:29 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1E1xTR03771 for ietf-imaa-bks; Thu, 13 Feb 2003 17:59:29 -0800 (PST) Received: from moriarty.gnomon.org.uk (pc4-cmbg2-5-cust162.cmbg.cable.ntl.com [81.100.86.162]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1E1xRd03767 for ; Thu, 13 Feb 2003 17:59:27 -0800 (PST) Received: from moriarty.gnomon.org.uk (roy@localhost [127.0.0.1]) by moriarty.gnomon.org.uk (8.12.3/8.12.3/Debian -4) with ESMTP id h1E1xgKA018585 for ; Fri, 14 Feb 2003 01:59:42 GMT To: ietf-imaa@imc.org cc: roy@gnomon.org.uk Subject: Re: facts about the real world, part 2 From: Roy Badami Date: Fri, 14 Feb 2003 01:59:42 +0000 Message-ID: <1045187982.18584.TMDA@moriarty.gnomon.org.uk> X-Delivery-Agent: TMDA/0.65 (Johnstown) X-Primary-Address: roy@gnomon.org.uk Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: AMS at Carnegie Mellon University (where it was created) used '+'. That certainly tallies with my recollections. I remember encountering addresses of the form foo+bar@cmu.edu around 1989/1990, I think. In particular, there was some Andrew-based mailing list management software that used listname+request@cmu.edu, and there were a couple of high-profile lists hosted on this platform around that time. 
-roy From owner-ietf-imaa Thu Feb 13 18:08:36 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1E28aI04001 for ietf-imaa-bks; Thu, 13 Feb 2003 18:08:36 -0800 (PST) Received: from moriarty.gnomon.org.uk (pc4-cmbg2-5-cust162.cmbg.cable.ntl.com [81.100.86.162]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1E28Yd03996 for ; Thu, 13 Feb 2003 18:08:35 -0800 (PST) Received: from moriarty.gnomon.org.uk (roy@localhost [127.0.0.1]) by moriarty.gnomon.org.uk (8.12.3/8.12.3/Debian -4) with ESMTP id h1E28qKA018632 for ; Fri, 14 Feb 2003 02:08:52 GMT To: ietf-imaa@imc.org cc: roy@gnomon.org.uk Subject: Re: Open Issue: Splitting of local-part into labels and where? From: Roy Badami Date: Fri, 14 Feb 2003 02:08:52 +0000 Message-ID: <1045188532.18629.TMDA@moriarty.gnomon.org.uk> X-Delivery-Agent: TMDA/0.65 (Johnstown) X-Primary-Address: roy@gnomon.org.uk Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: [1] Which reminds me that the draft specifies splitting local-part and domain at _the_ at-sign. It should probably read "at the _last_ at-sign", since local-parts may contain at-signs themselves. Strictly speaking, if we're going to follow RFC822, then we split at _the_ unquoted at-sign. There can only be one, according to the syntax. There may of course be quoted at-signs in atoms on either side of the unquoted at-sign (though having them on the right would result in a domain name that doesn't follow hostname rules). See my separate thread on quoting on why I think we probably shouldn't go down this road. 
-roy From owner-ietf-imaa Thu Feb 13 18:19:33 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1E2JXR04508 for ietf-imaa-bks; Thu, 13 Feb 2003 18:19:33 -0800 (PST) Received: from slarti.muc.de (slarti.muc.de [193.149.48.10]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1E2JRd04487 for ; Thu, 13 Feb 2003 18:19:28 -0800 (PST) Received: (qmail 19180 invoked by uid 66); 14 Feb 2003 02:19:20 -0000 Received: from faerber.muc.de by slarti.muc.de with BSMTP (rsmtp-qm-ot 0.4) for ietf-imaa@imc.org; 14 Feb 2003 02:19:20 -0000 Received: by faerber.muc.de (OpenXP/32 v3.9.4 (Win32) alpha @ 2003-02-13-1613d); 14 Feb 2003 03:19:14 +0100 Date: 14 Feb 2003 02:44:00 +0100 From: list-ietf-i18n-imaa@faerber.muc.de (=?ISO-8859-1?Q?Claus_F=E4rber?=) To: ietf-imaa@imc.org Message-ID: <8fpGadzJcDD@3247.org> In-Reply-To: <20030213183024.14213.qmail@cr.yp.to> Subject: Re: The typing issue User-Agent: OpenXP/32 v3.9.4 (Win32) alpha @ 2003-02-13-1613d MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: D. J. Bernstein schrieb/wrote: > With ISO 14755, the second email address disappears. There's still > ASCII information on the business card---but it isn't a second > address; it's a universal explanation of how to type the first > address. If I wanted to type an email address from a business card I'd prefer an alternative address I can easily type over some ``strange numbers and letters'' I have to enter while holding down two other keys. In other words, my order of preference is (for characters I can't type): . ASCII transliteration/ASCII-only alias address . Punycode . 
ISO 14755 Claus -- http://www.faerber.muc.de/ From owner-ietf-imaa Thu Feb 13 18:19:33 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1E2JXq04509 for ietf-imaa-bks; Thu, 13 Feb 2003 18:19:33 -0800 (PST) Received: from slarti.muc.de (slarti.muc.de [193.149.48.10]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1E2JRd04488 for ; Thu, 13 Feb 2003 18:19:28 -0800 (PST) Received: (qmail 19179 invoked by uid 66); 14 Feb 2003 02:19:20 -0000 Received: from faerber.muc.de by slarti.muc.de with BSMTP (rsmtp-qm-ot 0.4) for ietf-imaa@imc.org; 14 Feb 2003 02:19:20 -0000 Received: by faerber.muc.de (OpenXP/32 v3.9.4 (Win32) alpha @ 2003-02-13-1613d); 14 Feb 2003 03:19:14 +0100 Date: 14 Feb 2003 02:41:00 +0100 From: list-ietf-i18n-imaa@faerber.muc.de (=?ISO-8859-1?Q?Claus_F=E4rber?=) To: ietf-imaa@imc.org Message-ID: <8fpG$x4JcDD@3247.org> In-Reply-To: <200302131948.20807@sendmail.mutz.com> Subject: Re: Open Issue: Splitting of local-part into labels and where? User-Agent: OpenXP/32 v3.9.4 (Win32) alpha @ 2003-02-13-1613d MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Organization: KDE Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Marc Mutz schrieb/wrote: > The question of whether or not to split the dequoted local-part into > labels and if so, at which delimiter characters, remains unanswered in > the -00 draft. Well, in draft-faerber-i18n-email-netnews-names-00.txt I suggested this list: SP / %x00-1F / "." / "@" / "+" / "%" / "=" / "/" / "," / ";" / ":" / "!" / "(" / ")" / "[" / "]" / "<" / ">" [[RATIONALE: As many delimiters as possible are used to increase the chance that individual parts of the identifier are encoded the same way when included in other identifiers: "@" - used to separate local-part and domain name. 
"+" - used by some mailers for subaddressing "%" - used by some MTAs to embed domains within the local-part of email addresses ("percent-hack") "=" - used within MIXER (RFC 2156) "/" - used wihtin MIXER (RFC 2156), used as a newsgroup component separator in some leagacy non-RFC BBS networks. ",", ";" - used to separate identifiers in many positions ":" - used to seperate (obsolete) source routes from the destination address " " - used to separate source routes from each other. "!" - used as a separator within the Path header in RFC 1036, used as a address separator within (obsolete) UUCP bang addresses "(", ")" - used for comments, used within the replacement for some seperators according to MIXER (e.g. "(a)" instead of "@") "[", "]", "<", ">" - as precaution ]] And all Unicode characters that have a compatibility mapping to one of those, of course. We might want to add control characters - just in case they appear. I'm not quite sure about the quoting characters ``"'' and ``\''. If we apply toASCII/toUnicode after dequoting the local part, there's no reason to include them. If we keep some quotes (canonicalised to the minimal quoted version), we should probably include them. 
Claus -- http://www.faerber.muc.de/ From owner-ietf-imaa Thu Feb 13 18:19:33 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1E2JXJ04507 for ietf-imaa-bks; Thu, 13 Feb 2003 18:19:33 -0800 (PST) Received: from slarti.muc.de (slarti.muc.de [193.149.48.10]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1E2JRd04489 for ; Thu, 13 Feb 2003 18:19:28 -0800 (PST) Received: (qmail 19187 invoked by uid 66); 14 Feb 2003 02:19:21 -0000 Received: from faerber.muc.de by slarti.muc.de with BSMTP (rsmtp-qm-ot 0.4) for ietf-imaa@imc.org; 14 Feb 2003 02:19:21 -0000 Received: by faerber.muc.de (OpenXP/32 v3.9.4 (Win32) alpha @ 2003-02-13-1613d); 14 Feb 2003 03:19:14 +0100 Date: 14 Feb 2003 03:19:00 +0100 From: list-ietf-i18n-imaa@faerber.muc.de (=?ISO-8859-1?Q?Claus_F=E4rber?=) To: ietf-imaa@imc.org Message-ID: <8fpGcCiZcDD@3247.org> In-Reply-To: <200302132143.49081@sendmail.mutz.com> Subject: Re: Open Issue: Splitting of local-part into labels and where? User-Agent: OpenXP/32 v3.9.4 (Win32) alpha @ 2003-02-13-1613d MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Organization: KDE Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Marc Mutz schrieb/wrote: > We will not be able to use the same function for the LHS as for the > RHS anyway. That's simply b/c there are characters that are allowed in > local-parts, but not in domains. So the function will produce invalid output for invalid input on the RHS. That's not a problem. We _can_ invent a function that will yield the same result as the IDNA function for all valid domain names. It can then be used for the RHS, too. Our function does not have to produce exactly identical results for something that is not a domain name. 
Claus -- http://www.faerber.muc.de/ From owner-ietf-imaa Thu Feb 13 18:19:33 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1E2JXg04506 for ietf-imaa-bks; Thu, 13 Feb 2003 18:19:33 -0800 (PST) Received: from slarti.muc.de (slarti.muc.de [193.149.48.10]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1E2JRd04490 for ; Thu, 13 Feb 2003 18:19:28 -0800 (PST) Received: (qmail 19181 invoked by uid 66); 14 Feb 2003 02:19:21 -0000 Received: from faerber.muc.de by slarti.muc.de with BSMTP (rsmtp-qm-ot 0.4) for ietf-imaa@imc.org; 14 Feb 2003 02:19:21 -0000 Received: by faerber.muc.de (OpenXP/32 v3.9.4 (Win32) alpha @ 2003-02-13-1613d); 14 Feb 2003 03:19:14 +0100 Date: 14 Feb 2003 03:14:00 +0100 From: list-ietf-i18n-imaa@faerber.muc.de (=?ISO-8859-1?Q?Claus_F=E4rber?=) To: ietf-imaa@imc.org Message-ID: <8fpGbcs3cDD@3247.org> In-Reply-To: <200302131950.55716@sendmail.mutz.com> Subject: Re: Open Issue: Stored strings vs. queries. User-Agent: OpenXP/32 v3.9.4 (Win32) alpha @ 2003-02-13-1613d MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Organization: KDE Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Marc Mutz schrieb/wrote: > An interesting issue is what is to be considered stored local-parts and > what is a queried local-part. > Obvious stored local-parts: > - MTA config > - address books (e.g. LDAP) There are two types of address books: . Authoritative address books, which are run by the same authority as the MTA config files. These are clearly ``stored''. . Personal address books, which contain queries (because they are used to match against addresses _created_ elsewhere). Otherwise, you could not enter an address that was created by someone using a newer version of the profile. > Obvious queries: > - address lookups (e.g. LDAP queries) > - SMTP commands > Non-obvious: > - Mail headers Again, clearly a ``query''. It makes use of names created elsewhere. 
Otherwise, you could not send mail to an address that was created by someone using a newer version of the profile. The distinction between ``stored'' and ``query'' is an application of the Robustness Principle: Don't generate names with unassigned code points (=> ``stored'') but allow them if someone insists that such an address exists (=> ``query''). The terms ``stored'' and ``query'' are confusing if used for anything other than DNS (where the servers ``store'' the authoritative list of names and everyone else ``queries'' these servers). Claus -- http://www.faerber.muc.de/ From owner-ietf-imaa Thu Feb 13 18:30:12 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1E2UCe04798 for ietf-imaa-bks; Thu, 13 Feb 2003 18:30:12 -0800 (PST) Received: from [63.202.92.157] (adsl-63-202-92-157.dsl.snfc21.pacbell.net [63.202.92.157]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1E2U7d04793; Thu, 13 Feb 2003 18:30:07 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: In-Reply-To: <1045187326.18524.TMDA@moriarty.gnomon.org.uk> References: <1045187326.18524.TMDA@moriarty.gnomon.org.uk> X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . 
Date: Thu, 13 Feb 2003 18:30:05 -0800 To: Roy Badami , ietf-imaa@imc.org From: Paul Hoffman / IMC Subject: Re: Splitting and encoding of local-parts: some thoughts Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 1:48 AM +0000 2/14/03, Roy Badami wrote: >Some thoughts on where to split, and how to encode. > >(1) We could encode the whole dequoted localpart in one go. But then >we can't use punycode as the ACE, because it could produce multiple >consecutive dots, and I really don't think we want to go there. >(That's not to say we shouldn't consider this option with a different >ACE, which could still be a bootstring profile.) Um, punycode doesn't output dots. --Paul Hoffman, Director --Internet Mail Consortium From owner-ietf-imaa Thu Feb 13 18:39:38 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1E2dcR05321 for ietf-imaa-bks; Thu, 13 Feb 2003 18:39:38 -0800 (PST) Received: from kathmandu.sun.com (kathmandu.sun.com [192.18.98.36]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1E2dbd05317 for ; Thu, 13 Feb 2003 18:39:37 -0800 (PST) Received: from esunmail ([129.147.58.120]) by kathmandu.sun.com (8.9.3+Sun/8.9.3) with ESMTP id TAA28302 for ; Thu, 13 Feb 2003 19:39:40 -0700 (MST) Received: from xpa-fe1 (esunmail [129.147.58.120]) by edgemail1.Central.Sun.COM (iPlanet Messaging Server 5.2 HotFix 1.08 (built Dec 6 2002)) with ESMTP id <0HAA00FI5223LD@edgemail1.Central.Sun.COM> for ietf-imaa@imc.org; Thu, 13 Feb 2003 19:39:40 -0700 (MST) Received: from nifty-jr.west.sun.com ([129.153.12.95]) by mail.sun.net (iPlanet Messaging Server 5.2 HotFix 1.08 (built Dec 6 2002)) with ESMTPSA id <0HAA006I62215C@mail.sun.net> for ietf-imaa@imc.org; Thu, 13 Feb 2003 19:39:39 -0700 (MST) Date: Thu, 13 Feb 2003 18:39:21 -0800 From: Chris Newman Subject: Re: Another issue: quoting In-reply-to: 
<1045183773.18276.TMDA@moriarty.gnomon.org.uk> To: Roy Badami , ietf-imaa@imc.org Message-id: <2147483647.1045161561@nifty-jr.west.sun.com> MIME-version: 1.0 X-Mailer: Mulberry/3.0.0 (Mac OS X) Content-type: text/plain; charset=us-ascii; format=flowed Content-transfer-encoding: 7BIT Content-disposition: inline X-message-flag: Outlook: the best virus distribution system around References: <1045183773.18276.TMDA@moriarty.gnomon.org.uk> Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: begin quotation by Roy Badami on 2003/2/14 0:49 +0000: > I would therefore propose that we consider defining a grammar for > internationalized local parts based on the 2822 grammar. As a result, > any localpart that contains a non-ASCII character MUST NOT contain any > of the following: > > quoting (ie backslash or double-quote) > whitespace > any ascii character that is not permitted unquoted. > > Opinions? While we should ban all use of obsolete 2822 grammar with non-ASCII chars, we should support the full richness of the legal-to-generate grammar for 2822 with non-ASCII chars for consistency. The extra work is not that hard: 1. remove "@domain" from addr-spec 2. if LHS starts with double-quote, strip surrounding double-quotes and internal use of "\" as the quote for a quoted-pair. 3. apply IDN transformation. I would argue that step 2 is sufficiently simple that it's not worth banning potentially useful address syntaxes (like "Common Name"@domain). The reverse transformation for step 2 is a bit more complex: 2-reverse: if output of IDN decode contains any of: 2822-specials, US-ASCII whitespace, leading or trailing "." or embedded "..", then wrap with double quotes and apply quoted-chars where necessary. This is still simple compared to all the IDN gunk. 
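Step 2 and its reverse, as described above, might be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: the function names are hypothetical, "2822-specials" is read here as the specials other than "." (a plain embedded dot is legal in a dot-atom, so only leading, trailing or doubled dots force quoting), and comments and folding whitespace around the local part are ignored.

```python
import re

# RFC 2822 "specials" other than ".", which is legal between atoms.
SPECIALS_EXCEPT_DOT = set('()<>[]:;@\\,"')


def dequote(lhs: str) -> str:
    """Step 2: strip surrounding double quotes and quoted-pair backslashes."""
    if len(lhs) >= 2 and lhs[0] == '"' and lhs[-1] == '"':
        # Replace each backslash-plus-character pair with the character.
        return re.sub(r"\\(.)", r"\1", lhs[1:-1], flags=re.DOTALL)
    return lhs


def requote(text: str) -> str:
    """Step 2-reverse: wrap in double quotes only when the text needs it."""
    needs_quotes = (
        any(ch in SPECIALS_EXCEPT_DOT or ch in " \t\r\n" for ch in text)
        or text.startswith(".")
        or text.endswith(".")
        or ".." in text
    )
    if not needs_quotes:
        return text
    # Backslash-escape the two characters that cannot appear bare in quotes.
    escaped = re.sub(r'([\\"])', r"\\\1", text)
    return '"' + escaped + '"'


print(dequote('"Common Name"'))
print(requote("Common Name"))
print(requote("roy.badami"))
```

Note that this treats quoting purely as a transport encoding, so roy.badami and "roy.badami" dequote to the same string; whether that is the right reading of the specifications is exactly what is disputed elsewhere in this thread.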
- Chris From owner-ietf-imaa Thu Feb 13 18:39:14 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1E2dEa05309 for ietf-imaa-bks; Thu, 13 Feb 2003 18:39:14 -0800 (PST) Received: from nicemice.net (arwen.CS.Berkeley.EDU [128.32.132.165]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1E2dBd05303 for ; Thu, 13 Feb 2003 18:39:13 -0800 (PST) Received: from amc by nicemice.net with local (Exim 3.35 #1 (Debian)) id 18jVkw-00073d-00 for ; Thu, 13 Feb 2003 18:39:14 -0800 Date: Fri, 14 Feb 2003 02:39:14 +0000 From: "Adam M. Costello" To: ietf-imaa@imc.org Subject: Re: Splitting and encoding of local-parts: some thoughts Message-ID: <20030214023914.GC25048@nicemice.net> Reply-To: IETF IMAA list References: <1045187326.18524.TMDA@moriarty.gnomon.org.uk> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4i Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Paul Hoffman / IMC wrote: > Um, punycode doesn't output dots. It doesn't introduce them, but it will copy them from the input to the output, if they exist in the input. The first thing the Punycode encoder does is copy all ASCII characters appearing in the input directly to the output. 
AMC From owner-ietf-imaa Thu Feb 13 18:48:43 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1E2mhd05655 for ietf-imaa-bks; Thu, 13 Feb 2003 18:48:43 -0800 (PST) Received: from moriarty.gnomon.org.uk (pc4-cmbg2-5-cust162.cmbg.cable.ntl.com [81.100.86.162]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1E2mfd05651 for ; Thu, 13 Feb 2003 18:48:41 -0800 (PST) Received: from moriarty.gnomon.org.uk (roy@localhost [127.0.0.1]) by moriarty.gnomon.org.uk (8.12.3/8.12.3/Debian -4) with ESMTP id h1E2mvKA018838 for ; Fri, 14 Feb 2003 02:48:57 GMT To: ietf-imaa@imc.org cc: roy@gnomon.org.uk Subject: Question: full-width at From: Roy Badami Date: Fri, 14 Feb 2003 02:48:56 +0000 Message-ID: <1045190936.18837.TMDA@moriarty.gnomon.org.uk> X-Delivery-Agent: TMDA/0.65 (Johnstown) X-Primary-Address: roy@gnomon.org.uk Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Question: is full-width-at the only character that contains the at-sign in its compatibility decomposition? If not, should we perhaps consider converting the entire e-mail address to NFKC before splitting at at-sign? Even if there are no other such characters currently defined in Unicode, should we consider doing this anyway, in case a future compatibility character has such a compatibility decomposition, or can/should we avoid the issue by tying ourselves to a particular version of Unicode? And I won't even ask what happens if you try and follow the at-sign with a combining accent. I'm assuming there are no precomposed characters of that form, so NFKC vs NFCD won't make a difference here... Actually, I will ask: assuming we just split at the at sign, the domain name will begin with a combining accent. How will IDNA treat this? 
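The first question above can be checked mechanically. The following Python sketch scans every code point and reports the non-ASCII characters whose NFKC form contains an at-sign. It is only a sketch: the function name is made up for illustration, and the answer reflects whichever Unicode version the interpreter's unicodedata module ships with, which is not necessarily the version a specification would pin.

```python
import sys
import unicodedata


def chars_normalizing_to_at():
    """Return the non-ASCII characters whose NFKC form contains '@'."""
    found = []
    for cp in range(0x80, sys.maxunicode + 1):
        if 0xD800 <= cp <= 0xDFFF:
            continue  # surrogate code points are not characters
        ch = chr(cp)
        if "@" in unicodedata.normalize("NFKC", ch):
            found.append(ch)
    return found


for ch in chars_normalizing_to_at():
    print(f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}")
```

Rerunning such a scan against each new Unicode version is one way to notice a newly added compatibility character, though it does not settle whether the address should be normalized before or after splitting.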
-roy From owner-ietf-imaa Thu Feb 13 18:53:24 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1E2rO005825 for ietf-imaa-bks; Thu, 13 Feb 2003 18:53:24 -0800 (PST) Received: from moriarty.gnomon.org.uk (pc4-cmbg2-5-cust162.cmbg.cable.ntl.com [81.100.86.162]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1E2rMd05821 for ; Thu, 13 Feb 2003 18:53:23 -0800 (PST) Received: from moriarty.gnomon.org.uk (roy@localhost [127.0.0.1]) by moriarty.gnomon.org.uk (8.12.3/8.12.3/Debian -4) with ESMTP id h1E2rgKA018882 for ; Fri, 14 Feb 2003 02:53:42 GMT To: ietf-imaa@imc.org CC: roy@gnomon.org.uk In-reply-to: <20030214023914.GC25048@nicemice.net> (ietf-imaa.amc+0@nicemice.net.RemoveThisWord) Subject: Re: Splitting and encoding of local-parts: some thoughts References: <1045187326.18524.TMDA@moriarty.gnomon.org.uk> <20030214023914.GC25048@nicemice.net> From: Roy Badami Date: Fri, 14 Feb 2003 02:53:42 +0000 Message-ID: <1045191222.18879.TMDA@moriarty.gnomon.org.uk> X-Delivery-Agent: TMDA/0.65 (Johnstown) X-Primary-Address: roy@gnomon.org.uk Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: It doesn't introduce them, but it will copy them from the input to the output, if they exist in the input. The first thing the Punycode encoder does is copy all ASCII characters appearing in the input directly to the output. Indeed, so: encodes to Note that the input didn't contain any consecutive dots, but the output did. 
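A stand-in illustration of this effect (not the example from the message itself), using the punycode codec in Python's standard library: the input below is an assumed one, chosen so that two single dots are separated only by a non-ASCII character.

```python
# Illustrative input (an assumption, not the example from the message):
# two single dots separated by one non-ASCII character.
text = "a.\u00fc.b"

encoded = text.encode("punycode").decode("ascii")

# Punycode copies the basic (ASCII) code points to the output first, in
# their original order, so the two dots end up adjacent in the result.
print(encoded)
print(".." in text, ".." in encoded)
```

Splitting at dots before encoding avoids this, because each label handed to the encoder then contains no dots at all.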
-roy From owner-ietf-imaa Thu Feb 13 18:59:29 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1E2xTR05967 for ietf-imaa-bks; Thu, 13 Feb 2003 18:59:29 -0800 (PST) Received: from moriarty.gnomon.org.uk (pc4-cmbg2-5-cust162.cmbg.cable.ntl.com [81.100.86.162]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1E2xRd05963 for ; Thu, 13 Feb 2003 18:59:27 -0800 (PST) Received: from moriarty.gnomon.org.uk (roy@localhost [127.0.0.1]) by moriarty.gnomon.org.uk (8.12.3/8.12.3/Debian -4) with ESMTP id h1E2xlKA018905 for ; Fri, 14 Feb 2003 02:59:47 GMT To: Chris.Newman@Sun.COM CC: ietf-imaa@imc.org, roy@gnomon.org.uk In-reply-to: <2147483647.1045161561@nifty-jr.west.sun.com> (message from Chris Newman on Thu, 13 Feb 2003 18:39:21 -0800) Subject: Re: Another issue: quoting References: <1045183773.18276.TMDA@moriarty.gnomon.org.uk> <2147483647.1045161561@nifty-jr.west.sun.com> Date: Fri, 14 Feb 2003 02:59:46 +0000 Message-ID: <1045191586.18900.TMDA@moriarty.gnomon.org.uk> From: Roy Badami X-Delivery-Agent: TMDA/0.65 (Johnstown) X-Primary-Address: roy@gnomon.org.uk Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: While we should ban all use of obsolete 2822 grammar with non-ASCII chars, we should support the full richness of the legal-to-generate grammar for 2822 with non-ASCII chars for consistency. The extra work is not that hard: Oops, you're right, my bad. I had thought that local-part had to be a dot-atom, but that's not true. I'm going to have to read 2822 carefully, but can anyone confirm that roy.badami and "roy.badami" are still potentially distinct mailboxes in 2822, even neglecting obsolete deprecated stuff? If so, this will require great care on the part of IMAA... 
-roy From owner-ietf-imaa Thu Feb 13 19:08:17 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1E38HX06103 for ietf-imaa-bks; Thu, 13 Feb 2003 19:08:17 -0800 (PST) Received: from moriarty.gnomon.org.uk (pc4-cmbg2-5-cust162.cmbg.cable.ntl.com [81.100.86.162]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1E38Ed06099 for ; Thu, 13 Feb 2003 19:08:15 -0800 (PST) Received: from moriarty.gnomon.org.uk (roy@localhost [127.0.0.1]) by moriarty.gnomon.org.uk (8.12.3/8.12.3/Debian -4) with ESMTP id h1E38YKA018949 for ; Fri, 14 Feb 2003 03:08:35 GMT To: ietf-imaa@imc.org cc: roy@gnomon.org.uk In-reply-to: <1045185140.18391.TMDA@moriarty.gnomon.org.uk> (message from Roy Badami on Fri, 14 Feb 2003 01:12:20 +0000) Subject: Re: Open Issue: Splitting of local-part into labels and where? References: <1045185140.18391.TMDA@moriarty.gnomon.org.uk> From: Roy Badami Date: Fri, 14 Feb 2003 03:08:34 +0000 Message-ID: <1045192114.18946.TMDA@moriarty.gnomon.org.uk> X-Delivery-Agent: TMDA/0.65 (Johnstown) X-Primary-Address: roy@gnomon.org.uk Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: If we assume we are going to use punycode, then you pretty much *have* to split at dots, otherwise you'd have cases where punycode generates strings containing multiple consecutive dots. Yes, you could make this valid by generating suitable quoting, but you'd be creating local parts that are (a) deprecated by 2822 and (b) stand a significant chance of not working on the current Internet. (See my separate post on quoting.) Oops, as has just been pointed out to me, quoting is valid within the legal-to-generate grammar of 2822. I would still voice the concern that anything containing quoting is likely to be less than universally acceptable to currently deployed systems (and we therefore shouldn't generate it). 
-roy From owner-ietf-imaa Thu Feb 13 19:24:16 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1E3OGf06468 for ietf-imaa-bks; Thu, 13 Feb 2003 19:24:16 -0800 (PST) Received: from mercury.ccil.org (mail@[192.190.237.100]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1E3OFd06464 for ; Thu, 13 Feb 2003 19:24:15 -0800 (PST) Received: from cowan by mercury.ccil.org with local (Exim 3.35 #1 (Debian)) id 18jWHP-0000Dn-00; Thu, 13 Feb 2003 22:12:47 -0500 Subject: Re: Question: full-width at In-Reply-To: <1045190936.18837.TMDA@moriarty.gnomon.org.uk> from Roy Badami at "Feb 14, 2003 02:48:56 am" To: Roy Badami Date: Thu, 13 Feb 2003 22:12:47 -0500 (EST) CC: ietf-imaa@imc.org X-Mailer: ELM [version 2.4ME+ PL66 (25)] MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Message-Id: From: John Cowan Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Roy Badami scripsit: > Question: is full-width-at the only character that contains the > at-sign in its compatibility decomposition? No, there is also U+FE6B, the SMALL COMMERCIAL AT. It's fullwidth but standard size, with lots of kerning around it. U+FF20 is fullwidth and large-sized. > And I won't even ask what happens if you try and follow the at-sign > with a combining accent. I'm assuming there are no precomposed > characters of that form, so NFKC vs NFCD won't make a difference > here... There are none, fortunately. -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org To say that Bilbo's breath was taken away is no description at all. There are no words left to express his staggerment, since Men changed the language that they learned of elves in the days when all the world was wonderful. 
--_The Hobbit_ From owner-ietf-imaa Thu Feb 13 19:31:12 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1E3VC106642 for ietf-imaa-bks; Thu, 13 Feb 2003 19:31:12 -0800 (PST) Received: from stoneport.math.uic.edu (stoneport.math.uic.edu [131.193.178.160]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1E3VBd06638 for ; Thu, 13 Feb 2003 19:31:11 -0800 (PST) Received: (qmail 65924 invoked by uid 1016); 14 Feb 2003 03:31:40 -0000 Date: 14 Feb 2003 03:31:40 -0000 Message-ID: <20030214033140.65923.qmail@cr.yp.to> Automatic-Legal-Notices: See http://cr.yp.to/mailcopyright.html. From: "D. J. Bernstein" To: ietf-imaa@imc.org Subject: Re: Another issue: quoting References: <1045183773.18276.TMDA@moriarty.gnomon.org.uk> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Roy Badami writes: > One of the problems with the 822 specification is that localparts > aren't just a sequence of characters, with quoting being just a > transport encoding. False. RFC 822, section 3.4.4, specifies beyond a shadow of a doubt that quoting _is_ just a transport encoding. Read it! Disclaimers: 1. I agree that there are many serious problems in RFC 822. 2. I agree that what you falsely accuse RFC 822 of doing would have been another problem---if it were true. 3. None of this is meant as a comment on what RFC 2822 says. 4. None of this is meant as a comment on what software actually does. ---D. J. 
Bernstein, Associate Professor, Department of Mathematics, Statistics, and Computer Science, University of Illinois at Chicago From owner-ietf-imaa Thu Feb 13 19:49:24 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1E3nOE07237 for ietf-imaa-bks; Thu, 13 Feb 2003 19:49:24 -0800 (PST) Received: from nicemice.net (arwen.CS.Berkeley.EDU [128.32.132.165]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1E3nNd07233 for ; Thu, 13 Feb 2003 19:49:23 -0800 (PST) Received: from amc by nicemice.net with local (Exim 3.35 #1 (Debian)) id 18jWqs-0007Hb-00 for ; Thu, 13 Feb 2003 19:49:26 -0800 Date: Fri, 14 Feb 2003 03:49:26 +0000 From: "Adam M. Costello" To: ietf-imaa@imc.org Subject: Re: Compatibility with IDNA Message-ID: <20030214034926.GA27030@nicemice.net> Reply-To: IETF IMAA list References: <8f$3A5e3cDD@3247.org> <20030211023401.GE16359@nicemice.net> <20030213040028.GA9630@nicemice.net> <20030213080759.GA18181@nicemice.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20030213080759.GA18181@nicemice.net> User-Agent: Mutt/1.4i Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Reminder of context: This is a continuation of the thread exploring the question "What would it take to use the same ToASCII/ToUnicode operations for both IMAA and IDNA?" I wrote that IMAA could use the IDNA ToASCII if it didn't allow quite as much flexibility in the case of the output: > If the input of ToASCII contains no uppercase characters, then the > output of ToASCII must contain no uppercase characters. I had the right idea, but I didn't express it quite correctly. The above wording fails to account for titlecase characters, or for uppercase characters that have no lowercase mates (like Cyrillic letter palochka and all the Georgian capital letters). What I meant was: Definition: String X is "canonical case" iff CaseFold(X) = X. 
Constraint: If the input of ToASCII is canonical case, then the output of ToASCII must also be canonical case. I then explained that IMAA would need to impose the constraint on ToUnicode too. But earlier I had said that IMAA and IDNA could use the same prefix only if they used the exact same ToUnicode. Oops! How bad is this? Not a disaster, I think. The danger of using the same prefix is that the wrong ToUnicode operation might get applied. I can think of two ways this could happen: 1) Implementation error. A program applies the IDNA ToUnicode operation to a local part, in blatant violation of the standards. A different prefix would protect against this blunder, but one could argue that we as spec designers are not obligated to provide such protection, if there are other advantages to be gained by not providing it. 2) Copying local parts into domain names, or vice-versa. For example, DNS SOA records. Now a program might innocently use the ToUnicode operation appropriate for the data type where the string now sits, not knowing that it was actually copied from a different data type. In fact, this is one of the main reasons for reusing ToUnicode, so that things might get displayed intelligibly even when copied from one side of the at-sign to the other. Regardless of how the wrong ToUnicode gets applied, what damage can result? If the IMAA ToUnicode is used on a domain label, there is no problem, because the IMAA ToUnicode is just a more constrained version of the IDNA ToUnicode. If the IDNA ToUnicode is used on a local part, the result could be wrong (sending mail to it would bounce), but only if the mail server is case-sensitive and the ToUnicode implementation gratuitously uppercases some of the output characters even though its input was all lowercase ASCII. 
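A minimal sketch of the "canonical case" definition and constraint stated in this message. It assumes that Python's built-in str.casefold() is an acceptable stand-in for CaseFold (the case-folding step of the actual Stringprep profile may differ); the names is_canonical_case and satisfies_constraint are hypothetical and exist only for illustration:

```python
def is_canonical_case(s: str) -> bool:
    """Return True iff case folding leaves s unchanged."""
    # Assumption: str.casefold() approximates the CaseFold operation
    # referred to in the message; it is not the Stringprep mapping itself.
    return s.casefold() == s


def satisfies_constraint(to_ascii, label: str) -> bool:
    """Check the constraint for one input of a candidate ToASCII function.

    Canonical-case input must give canonical-case output; input that is
    not canonical case is left unconstrained.
    """
    if not is_canonical_case(label):
        return True
    return is_canonical_case(to_ascii(label))
```

Note that the check is phrased in terms of folding rather than "contains no uppercase characters", so a titlecase character such as U+01C5 is correctly treated as not canonical case.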
I can imagine a ToUnicode implementation that uppercases some of its output characters when some of the input characters are uppercase ASCII (that's how case preservation via mixed-case annotations would work), but I cannot imagine what would possess a ToUnicode implementation to go to the extra effort of uppercasing some of its output characters given an all lowercase ASCII input. Not only is the risk small, but it's a risk that already exists today with ASCII local parts that are copied into domain names. Whereas applications "must" preserve the case of ASCII local parts, they merely "should" preserve the case of ASCII domain labels. If the local part of foo@example.org is case-sensitive, and I put foo.example.org in an SOA record, there is a risk that the domain name will somehow fall into the hands of a program that gratuitously capitalizes domain names, yielding FOO.EXAMPLE.ORG. When this is eventually converted back to a mail address, FOO@EXAMPLE.ORG, it won't work. In summary, I still think that using the same ToASCII & ToUnicode operations, including the same profile and prefix, for both domain labels and local parts, is a viable option worth considering. I haven't decided whether I think it's the best option. AMC From owner-ietf-imaa Thu Feb 13 20:40:56 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1E4euQ08567 for ietf-imaa-bks; Thu, 13 Feb 2003 20:40:56 -0800 (PST) Received: from nicemice.net (arwen.CS.Berkeley.EDU [128.32.132.165]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1E4etd08563 for ; Thu, 13 Feb 2003 20:40:55 -0800 (PST) Received: from amc by nicemice.net with local (Exim 3.35 #1 (Debian)) id 18jXel-0007PR-00; Thu, 13 Feb 2003 20:40:59 -0800 Date: Fri, 14 Feb 2003 04:40:59 +0000 From: "Adam M.
Costello" To: ietf-imaa@imc.org Cc: Roy Badami Subject: Re: Another issue: quoting Message-ID: <20030214044058.GE25048@nicemice.net> Reply-To: IETF IMAA list , Roy Badami References: <1045183773.18276.TMDA@moriarty.gnomon.org.uk> <20030214033140.65923.qmail@cr.yp.to> <1045183773.18276.TMDA@moriarty.gnomon.org.uk> <2147483647.1045161561@nifty-jr.west.sun.com> <1045191586.18900.TMDA@moriarty.gnomon.org.uk> <1045183773.18276.TMDA@moriarty.gnomon.org.uk> <2147483647.1045161561@nifty-jr.west.sun.com> <1045183773.18276.TMDA@moriarty.gnomon.org.uk> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20030214033140.65923.qmail@cr.yp.to> <1045191586.18900.TMDA@moriarty.gnomon.org.uk> <2147483647.1045161561@nifty-jr.west.sun.com> <1045183773.18276.TMDA@moriarty.gnomon.org.uk> User-Agent: Mutt/1.4i Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Roy Badami wrote: > Quoting in local parts is a messy construct. Yes. Quoting in general is inherently messy, and the quoting in message headers is no exception. > One of the problems with the 822 specification is that localparts > aren't just a sequence of characters, with quoting being just a > transport encoding. Localparts are a sequence of tokens, namely atoms > and dots. > > By way of example, it seems to me that the RFC822 parse of the > following two localparts is distinct, and as a result they could > validly refer to distinct mailboxes: > > roy.badami > "roy.badami" I had these exact same concerns two months ago, and careful reading of RFC 822 clarified some things. D. J. Bernstein has already pointed to section 3.4.4 of RFC 822, which explains that the backslashes used to quote the next character, and the quotation marks around quoted-strings, are not part of the data, and should not be retained outside a message-header context. As for the atom/dot structure, section 6.2.4 explains that it is not significant. 
So in your example above, roy.badami and "roy.badami" do in fact refer to the same mailbox, even though they parse differently. > In fact, the definition of "dequoted local part" in the base document > is problematic as it stands, because in 822 there isn't any such > thing. I don't see what's wrong with defining a new term. Actually, the definition of "dequoted local part" is merely giving a name to a concept described (but not named) in the last paragraph of section 3.4.4 of RFC 822. AMC From owner-ietf-imaa Thu Feb 13 21:07:45 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1E57jX09063 for ietf-imaa-bks; Thu, 13 Feb 2003 21:07:45 -0800 (PST) Received: from nicemice.net (arwen.CS.Berkeley.EDU [128.32.132.165]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1E57id09059 for ; Thu, 13 Feb 2003 21:07:44 -0800 (PST) Received: from amc by nicemice.net with local (Exim 3.35 #1 (Debian)) id 18jY4i-0007Tk-00; Thu, 13 Feb 2003 21:07:48 -0800 Date: Fri, 14 Feb 2003 05:07:48 +0000 From: "Adam M. Costello" To: ietf-imaa@imc.org Cc: Roy Badami Subject: Re: Question: full-width at Message-ID: <20030214050748.GF25048@nicemice.net> Reply-To: IETF IMAA list , Roy Badami References: <1045190936.18837.TMDA@moriarty.gnomon.org.uk> <1045190936.18837.TMDA@moriarty.gnomon.org.uk> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1045190936.18837.TMDA@moriarty.gnomon.org.uk> User-Agent: Mutt/1.4i Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Roy> Question: is full-width-at the only character that contains the Roy> at-sign in its compatibility decomposition? John> No, there is also U+FE6B, the SMALL COMMERCIAL AT. Roy> If not, should we perhaps consider converting the entire e-mail Roy> address to NFKC before splitting at at-sign? Roy> And I won't even ask what happens if you try and follow the at-sign Roy> with a combining accent. 
Roy> Actually, I will ask: assuming we just split at the at sign, the Roy> domain name will begin with a combining accent. How will IDNA treat Roy> this? These issues were considered for IDNA. There are many characters that decompose to dot (like U+FE52 small full stop, U+2024 one dot leader) or decompose to strings containing dots (like U+2488 digit one full stop, U+33C2 square AM). Therefore, if a whole domain name is normalized before being scanned for dots, it might result in a different number of labels than if it had not been normalized. We discussed whether IDNA should require this pre-normalization. Ultimately we decided that that was getting too far into user interface issues. IDNA instead focuses almost entirely on individual labels, not whole domain names. The one exception is to require that three particular dot-like non-ASCII characters be recognized as dots, just the three that we had reason to believe would be likely to be input by users trying to type dots. The IMAA draft follows this example. It says as little as possible about entire mail addresses, and focuses on the local part. The one exception is to require one particular at-like character to be recognized as an at-sign, the one that we have reason to believe is likely to be input by users trying to type at-signs. As for initial combining characters, the same question arose with IDNA. If a label begins with a combining character, will it combine with the preceding dot? This was considered a user-interface issue that IDNA should not address.
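The decompositions mentioned in this thread can be checked with a short sketch, assuming Python's standard unicodedata module (whose Unicode version is newer than the one in use in 2003, although these particular mappings are stable). The helper names nfkc and count_labels are hypothetical:

```python
import unicodedata


def nfkc(s: str) -> str:
    """Apply Unicode Normalization Form KC to s."""
    return unicodedata.normalize("NFKC", s)


def count_labels(name: str, normalize_first: bool) -> int:
    """Count dot-separated labels, optionally normalizing the whole name first."""
    if normalize_first:
        name = nfkc(name)
    return len(name.split("."))


# At-like and dot-like characters named in the messages above.
for cp in ("\uff20", "\ufe6b", "\ufe52", "\u2024", "\u2488", "\u33c2"):
    print(f"U+{ord(cp):04X} -> {nfkc(cp)!r}")

# Normalizing before scanning for dots can change the number of labels.
name = "example\u2488com"  # contains U+2488 DIGIT ONE FULL STOP
print(count_labels(name, normalize_first=False))
print(count_labels(name, normalize_first=True))
```

The last two calls illustrate the point about pre-normalization: the same input yields a different label count depending on whether NFKC is applied to the whole name before it is split.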
AMC From owner-ietf-imaa Thu Feb 13 22:50:30 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1E6oU911516 for ietf-imaa-bks; Thu, 13 Feb 2003 22:50:30 -0800 (PST) Received: from nicemice.net (arwen.CS.Berkeley.EDU [128.32.132.165]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1E6oTd11512 for ; Thu, 13 Feb 2003 22:50:29 -0800 (PST) Received: from amc by nicemice.net with local (Exim 3.35 #1 (Debian)) id 18jZgA-0007i2-00; Thu, 13 Feb 2003 22:50:34 -0800 Date: Fri, 14 Feb 2003 06:50:34 +0000 From: "Adam M. Costello" To: ietf-imaa@imc.org Cc: Roy Badami Subject: Re: Open Issue: Splitting of local-part into labels and where? Message-ID: <20030214065034.GG25048@nicemice.net> Reply-To: IETF IMAA list , Roy Badami References: <1045185140.18391.TMDA@moriarty.gnomon.org.uk> <1045192114.18946.TMDA@moriarty.gnomon.org.uk> <1045185140.18391.TMDA@moriarty.gnomon.org.uk> <200302132143.49081@sendmail.mutz.com> <8fpGcCiZcDD@3247.org> <4.2.0.58.J.20030213142550.03349d90@localhost> <200302132143.49081@sendmail.mutz.com> <200302131948.20807@sendmail.mutz.com> <4.2.0.58.J.20030213142550.03349d90@localhost> <200302131948.20807@sendmail.mutz.com> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <8fpG$x4JcDD@3247.org> <1045188532.18629.TMDA@moriarty.gnomon.org.uk> <1045192114.18946.TMDA@moriarty.gnomon.org.uk> <1045185140.18391.TMDA@moriarty.gnomon.org.uk> <8fpGcCiZcDD@3247.org> <200302132143.49081@sendmail.mutz.com> <4.2.0.58.J.20030213142550.03349d90@localhost> <200302131948.20807@sendmail.mutz.com> User-Agent: Mutt/1.4i Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: This message responds to messages by Marc Mutz, Martin Duerst, Roy Badami, and Claus Färber. Marc Mutz wrote: > I think that splitting into labels is needed to keep old software > (esp. MTAs and filtering software) working. 
I can't add more to Roy's > arguments here, and I feel there will be not much discussion about > this particular issue. That's optimistic! :) > The more interesting point is _where_ to split. There are some very > obvious characters (mainly full stops and hyphens), It would be nice to split at hyphens, but it would also be nice to share the ToASCII/ToUnicode operations between IDNA and IMAA. We can't have both. > The draft also mentions the option of using all non-alnum US-ASCII > characters. If the number of equivalent Unicode code points is small, > then this is certainly the best option, although we should then > provide a mapping table for the to-usascii mapping. It might be preferable to simply require applying NFKC to the entire local part before splitting it. It's a more heavyweight operation, but it's already in the toolbox. > the draft specifies splitting local-part and domain at _the_ at-sign. I don't think the draft says that. It says: In an internationalized mail address, the following characters MUST be recognized as at-signs for separating the local part from the domain name: U+0040 (commercial at), U+FF20 (fullwidth commercial at). This is not intended to be a full specification of how to parse mail addresses; see other RFCs for that. This is merely intended to expand the set of characters that can be used to separate the local part from the domain name. There are other places where the phrase "the at-sign" is used. It is meant as shorthand for "the at-sign that separates the local part from the domain name". If that's not sufficiently clear from context, we could be more explicit. Martin Duerst wrote: > Making sure that a user enters a '@' in the right form can and should > be a quality of implementation issue under the responsibility of the > application. The same argument could be made about fullwidth full stop in IDNA. 
There were arguments on both sides, and ultimately we chose to require that all IDNA-aware applications recognize a few characters as dots. If IMAA is to be a natural follow-on to IDNA, it ought to handle the at-sign the same way. By the way, one of the arguments for the dot requirement was the following scenario: You and I both have IDNA-aware applications, and I can type a domain name into mine and it works, but when I type it into the body of a message and mail it to you, and you paste it into your application, it fails, because yours doesn't recognize the same dots as mine. We don't try to 100% solve that problem, but standardizing the few most common variants of the essential delimiters is an easy 99% solution. Marc Mutz wrote: > We will not be able to use the same function for the LHS as for the > RHS anyway. That's simply b/c there are characters that are allowed > in local-parts, but not in domains The IDNA ToASCII and ToUnicode operations are designed to handle all ASCII characters just fine, because domain names in general can contain all ASCII characters, although host names and mail domains are restricted to letters, digits, hyphens, and the dot separators. In another thread I have outlined a scheme for reusing the IDNA ToASCII and ToUnicode in IMAA. If you think it wouldn't work, please explain. > it's only logical to split at non-alphanum ASCII characters. I agree. Except we'd have to make sure that we can reverse the process. For example, if we encode each piece in such a way that the encoding can introduce hyphens, then we better not split on hyphens! If we don't reuse the IDNA operations, then it's possible to use an encoding that uses only alphanumerics, and we can split on all non-alphanumeric ASCII characters. > We should, of course, exclude pathological cases, such as control > characters Huh? If control characters are not considered delimiters, then that leaves them as part of the pieces that get encoded, which is even worse. 
If they're considered delimiters, then we don't mess with them at all. I think "*all* non-alphanumeric ASCII characters" is the right target. Roy Badami wrote: > If we assume we are going to use punycode, then you pretty much *have* > to split at dots, otherwise you'd have cases where punycode generates > strings containing multiple consecutive dots. Yes, you could make > this valid by generating suitable quoting, but you'd be creating local > parts that stand a significant chance of not working on the current > Internet. That's a very keen observation. I don't know how afraid of quoting we should really be, but it's good to be aware that non-ASCII forms that don't need quotes can map to ACE forms that do need quotes. I hadn't really noticed that. > I don't think it's clear at this stage that the implementational > convenience of using the same function justifies the cost to the > utility of the specification. Implementational convenience is not the only reason to use the same operations. Another reason is so that when domains are copied into local parts and vice-versa, they might be displayed intelligibly. How well that would or wouldn't work depends on whether local parts are subdivided. > Actually, I'm not even convinced at this point that it's clear that we > should use punycode for the ACE, eg if we choose to ACE-encode strings > containing dots, and require the result not to contain consecutive > dots. > > Adam, what comes after AMC-ACE-Z ? :) We can consider different prefixes, and different profiles, but I doubt that anyone wants to see another encoding. It just gets to be too much. People might be willing to accept a simple wrapper around Punycode. For example, in order to use hyphen as a delimiter we would need to eliminate hyphens from the Punycode encoding, which could be done by applying Punycode itself, then replacing the hyphen with the ACE infix (which would be purely alphanumeric) (or prepend the infix if there is no hyphen).
> There may of course be quoted at-signs in atoms on either side of the > unquoted at-sign Actually, atoms cannot contain at-signs. Quoted-strings can, but they're not permitted after the at-sign. Claus Färber wrote: [regarding the set of delimiters for subdividing local parts] > I'm not quite sure about the quoting characters ``"'' and ``\''. If > we apply toASCII/toUnicode after dequoting the local part, there's no > reason to include them. Sure there is. Consider: From: foo bar <"foo\"bar\\"@example.org> The dequoted local part is: foo"bar\ In any case, I don't see the point of trying to justify each delimiter individually. I think it's easier to justify a simple policy like "all non-alphanumeric ASCII characters" or "all ASCII characters that Punycode doesn't generate" (alphanumerics and hyphen). AMC From owner-ietf-imaa Fri Feb 14 02:01:25 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1EA1Pf12223 for ietf-imaa-bks; Fri, 14 Feb 2003 02:01:25 -0800 (PST) Received: from moriarty.gnomon.org.uk (pc4-cmbg2-5-cust162.cmbg.cable.ntl.com [81.100.86.162]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1EA1Md12210 for ; Fri, 14 Feb 2003 02:01:22 -0800 (PST) Received: from moriarty.gnomon.org.uk (roy@localhost [127.0.0.1]) by moriarty.gnomon.org.uk (8.12.3/8.12.3/Debian -4) with ESMTP id h1EA1dKA020809 for ; Fri, 14 Feb 2003 10:01:39 GMT To: ietf-imaa@imc.org, roy@gnomon.org.uk CC: ietf-imaa@imc.org In-reply-to: <20030214044058.GE25048@nicemice.net> (ietf-imaa.amc+0@nicemice.net.RemoveThisWord) Subject: Re: Another issue: quoting References: <1045183773.18276.TMDA@moriarty.gnomon.org.uk> <20030214033140.65923.qmail@cr.yp.to> <1045183773.18276.TMDA@moriarty.gnomon.org.uk> <2147483647.1045161561@nifty-jr.west.sun.com> <1045191586.18900.TMDA@moriarty.gnomon.org.uk> <1045183773.18276.TMDA@moriarty.gnomon.org.uk> <2147483647.1045161561@nifty-jr.west.sun.com> <20030214044058.GE25048@nicemice.net> Date: Fri, 14 Feb 2003 10:01:37 +0000
Message-ID: <1045216897.20793.TMDA@moriarty.gnomon.org.uk> From: Roy Badami X-Delivery-Agent: TMDA/0.65 (Johnstown) X-Primary-Address: roy@gnomon.org.uk Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: > I had these exact same concerns two months ago, and careful reading of RFC 822 clarified some things. D. J. Bernstein has already pointed to section 3.4.4 of RFC 822, which explains that the backslashes used to quote the next character, and the quotation marks around quoted-strings, are not part of the data, and should not be retained outside a message-header context. As for the atom/dot structure, section 6.2.4 explains that it is not significant. So in your example above, roy.badami and "roy.badami" do in fact refer to the same mailbox, even though they parse differently. Thanks for the reference to 6.2.4. It appears that I'm just plain wrong here. > I don't see what's wrong with defining a new term. Not worth pursuing this line of discussion; my comment was based on a misapprehension of what RFC822 said.
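The dequoting rule from section 3.4.4 of RFC 822 discussed in this thread can be sketched as follows. This is an illustration only: dequote_local_part is a hypothetical name, and the function performs no validation of the local part's syntax (it simply drops quotation marks and resolves backslash quoted-pairs):

```python
def dequote_local_part(local_part: str) -> str:
    """Strip quoted-string quotation marks and quoted-pair backslashes.

    Sketch only: the input is assumed to be a syntactically valid
    local part; malformed input is not detected.
    """
    out = []
    i = 0
    while i < len(local_part):
        ch = local_part[i]
        if ch == "\\" and i + 1 < len(local_part):
            # Quoted-pair: the backslash is not data, the next character is.
            out.append(local_part[i + 1])
            i += 2
            continue
        if ch != '"':
            # Unescaped quotation marks are delimiters, not data.
            out.append(ch)
        i += 1
    return "".join(out)


# The two spellings from the thread dequote to the same string.
print(dequote_local_part('roy.badami') == dequote_local_part('"roy.badami"'))
```

Under this reading, comparing two local parts means comparing their dequoted forms, which is why the quoted and unquoted spellings name the same mailbox.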
-roy From owner-ietf-imaa Fri Feb 14 02:31:28 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1EAVS416548 for ietf-imaa-bks; Fri, 14 Feb 2003 02:31:28 -0800 (PST) Received: from nicemice.net (arwen.CS.Berkeley.EDU [128.32.132.165]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1EAVRd16542 for ; Fri, 14 Feb 2003 02:31:27 -0800 (PST) Received: from amc by nicemice.net with local (Exim 3.35 #1 (Debian)) id 18jd7v-0008Av-00; Fri, 14 Feb 2003 02:31:27 -0800 Date: Fri, 14 Feb 2003 10:31:27 +0000 From: "Adam M. Costello" To: ietf-imaa@imc.org Cc: Roy Badami Subject: Re: Splitting and encoding of local-parts: some thoughts Message-ID: <20030214103127.GC30815@nicemice.net> Reply-To: IETF IMAA list , Roy Badami References: <1045187326.18524.TMDA@moriarty.gnomon.org.uk> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1045187326.18524.TMDA@moriarty.gnomon.org.uk> User-Agent: Mutt/1.4i Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Roy Badami wrote: > (1) We could encode the whole dequoted localpart in one go. But then > we can't use punycode as the ACE, because it could produce multiple > consecutive dots, and I really don't think we want to go there. I'm not sure we need to be quite so fearful of quoting, but we should certainly keep this in mind as we weigh the options. > (2) We could simply split at dots, and encode using punycode as the > ACE. ...if dots are commonplace in internationalized localparts, the > extra ACE prefixes would eat into our length limit.
> > (3) split at all non-alphanumeric ASCII characters. ...this will eat > into our length limit further. Yes, we need to weigh the benefit of being friendly to structured local part conventions (like user+tag@domain) and the benefit of IDNs copied into local parts being displayed intelligibly (like user%domain@domain or listname-return.user=domain@domain) versus the cost of extra ACE prefixes, and the cost of the extra subdividing step. > I'd also like to throw a slightly wacky idea out there: if we accept > that any prefixes and suffixes applied to (or stripped off from) > e-mail addresses are restricted to ASCII, then it is adequate to > identify the substring of the localpart from the first non-ASCII > character to the last non-ASCII character, and encode and mark that in > some way. > > Is it reasonable to restrict users of localpart suffixes (delimited by > plus or minus) to restrict their suffixes to ASCII characters? I would think that anyone who wants to use the user+tag convention, and wants to use non-ASCII characters in the user part, probably also wants to use non-ASCII characters in the tag part. 
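The difference between options (1) and (2) can be illustrated with a minimal sketch, assuming Python's built-in "punycode" codec. The prefix "xx--" is a hypothetical placeholder and not a prefix taken from any draft; the function names are likewise invented for the example:

```python
ACE_PREFIX = "xx--"  # hypothetical placeholder prefix, for illustration only


def encode_piece(piece: str) -> str:
    """ACE-encode one piece; all-ASCII pieces are passed through unchanged."""
    if piece.isascii():
        return piece
    return ACE_PREFIX + piece.encode("punycode").decode("ascii")


def encode_whole(local_part: str) -> str:
    """Option (1): encode the whole dequoted local part in one go."""
    return encode_piece(local_part)


def encode_split_at_dots(local_part: str) -> str:
    """Option (2): split at dots and encode each piece separately."""
    return ".".join(encode_piece(p) for p in local_part.split("."))


local_part = "a.\u00fc.b"  # no consecutive dots in the non-ASCII form
print(encode_whole(local_part))          # ASCII characters are copied first, so the dots become adjacent
print(encode_split_at_dots(local_part))  # dots stay where they were
```

Because Punycode copies the ASCII characters to the front of its output in their original order, option (1) turns the two separated dots into ".." and would therefore force quoting, whereas option (2) keeps the dots in place at the cost of one prefix per non-ASCII piece.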
AMC From owner-ietf-imaa Fri Feb 14 03:10:56 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1EBAum21164 for ietf-imaa-bks; Fri, 14 Feb 2003 03:10:56 -0800 (PST) Received: from moriarty.gnomon.org.uk (pc4-cmbg2-5-cust162.cmbg.cable.ntl.com [81.100.86.162]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1EBAsd21156 for ; Fri, 14 Feb 2003 03:10:54 -0800 (PST) Received: from moriarty.gnomon.org.uk (roy@localhost [127.0.0.1]) by moriarty.gnomon.org.uk (8.12.3/8.12.3/Debian -4) with ESMTP id h1EBBDKA021114 for ; Fri, 14 Feb 2003 11:11:13 GMT To: ietf-imaa@imc.org, roy@gnomon.org.uk CC: ietf-imaa@imc.org In-reply-to: <20030214103127.GC30815@nicemice.net> (ietf-imaa.amc+0@nicemice.net.RemoveThisWord) Subject: Re: Splitting and encoding of local-parts: some thoughts References: <1045187326.18524.TMDA@moriarty.gnomon.org.uk> <20030214103127.GC30815@nicemice.net> Date: Fri, 14 Feb 2003 11:11:12 +0000 Message-ID: <1045221072.21097.TMDA@moriarty.gnomon.org.uk> From: Roy Badami X-Delivery-Agent: TMDA/0.65 (Johnstown) X-Primary-Address: roy@gnomon.org.uk Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: > I'm not sure we need to be quite so fearful of quoting, but we should certainly keep this in mind as we weigh the options. I think if we generate quoting there will be MUAs out there that can't cope with it. I certainly remember that when X.400 was popular in Britain, people using RFC1148 mapped addresses had problems communicating with some people.
-roy From owner-ietf-imaa Fri Feb 14 03:37:26 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1EBbQZ22297 for ietf-imaa-bks; Fri, 14 Feb 2003 03:37:26 -0800 (PST) Received: from moriarty.gnomon.org.uk (pc4-cmbg2-5-cust162.cmbg.cable.ntl.com [81.100.86.162]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1EBbOd22293 for ; Fri, 14 Feb 2003 03:37:24 -0800 (PST) Received: from moriarty.gnomon.org.uk (roy@localhost [127.0.0.1]) by moriarty.gnomon.org.uk (8.12.3/8.12.3/Debian -4) with ESMTP id h1EBbgKA021248 for ; Fri, 14 Feb 2003 11:37:43 GMT To: ietf-imaa@imc.org cc: roy@gnomon.org.uk Subject: Question: UseSTD3ASCIIRules on RHS of IMA From: Roy Badami Date: Fri, 14 Feb 2003 11:37:42 +0000 Message-ID: <1045222662.21247.TMDA@moriarty.gnomon.org.uk> X-Delivery-Agent: TMDA/0.65 (Johnstown) X-Primary-Address: roy@gnomon.org.uk Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Should IMAA make any statement as to whether UseSTD3ASCIIRules MUST or SHOULD be set when processing any IDN on the RHS of an IMA, or should this decision be left up to the implementor? I'd suggest that IMAA should at least make a recommendation on this, to ensure consistency in the handling of IMAs between implementations.
-roy From owner-ietf-imaa Fri Feb 14 04:06:51 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1EC6pQ23251 for ietf-imaa-bks; Fri, 14 Feb 2003 04:06:51 -0800 (PST) Received: from moriarty.gnomon.org.uk (pc4-cmbg2-5-cust162.cmbg.cable.ntl.com [81.100.86.162]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1EC6nd23246 for ; Fri, 14 Feb 2003 04:06:49 -0800 (PST) Received: from moriarty.gnomon.org.uk (roy@localhost [127.0.0.1]) by moriarty.gnomon.org.uk (8.12.3/8.12.3/Debian -4) with ESMTP id h1EC78KA021359 for ; Fri, 14 Feb 2003 12:07:08 GMT To: ietf-imaa@imc.org cc: roy@gnomon.org.uk Subject: Question: Fullwidth double-quote and fullwidth backslash From: Roy Badami Date: Fri, 14 Feb 2003 12:07:08 +0000 Message-ID: <1045224428.21358.TMDA@moriarty.gnomon.org.uk> X-Delivery-Agent: TMDA/0.65 (Johnstown) X-Primary-Address: roy@gnomon.org.uk Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: When dequoting/requoting localparts, should we consider recognizing fullwidth double quotes and fullwidth backslash (and any other double-quote-like and backslash-like characters)? It seems to me that the arguments for this are similar to those for fullwidth dot and fullwidth at, and once we decide to recognize metacharacters in fullwidth form, we should apply this consistently to *all* metacharacters.
-roy From owner-ietf-imaa Fri Feb 14 05:05:26 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1ED5Q625022 for ietf-imaa-bks; Fri, 14 Feb 2003 05:05:26 -0800 (PST) Received: from mail.uni-bielefeld.de (IDENT:72@mail2.uni-bielefeld.de [129.70.4.90]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1ED5Od25018 for ; Fri, 14 Feb 2003 05:05:24 -0800 (PST) Received: from dirichlet.Physik.Uni-Bielefeld.DE (dirichlet.Physik.Uni-Bielefeld.DE [129.70.125.234]) by mail.uni-bielefeld.de (Sun Internet Mail Server sims.4.0.2000.10.12.16.25.p8) with ESMTP id <0HAA00M4JV0WDB@mail.uni-bielefeld.de> for ietf-imaa@imc.org; Fri, 14 Feb 2003 14:05:20 +0100 (MET) Date: Fri, 14 Feb 2003 13:52:24 +0100 From: Marc Mutz Subject: Re: Open Issue: Splitting of local-part into labels and where? In-reply-to: <20030214065034.GG25048@nicemice.net> To: ietf-imaa@imc.org Message-id: <200302141352.49408@sendmail.mutz.com> Organization: KDE MIME-version: 1.0 Content-type: multipart/signed; protocol="application/pgp-signature"; micalg=pgp-sha1; boundary="Boundary-02=_gaOT+4am4xdI/Kv"; charset="iso-8859-1" Content-transfer-encoding: 7bit User-Agent: KMail/1.5.9 X-PGP-Key: 0xBDBFE838 References: <1045185140.18391.TMDA@moriarty.gnomon.org.uk> <200302131948.20807@sendmail.mutz.com> <20030214065034.GG25048@nicemice.net> Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: --Boundary-02=_gaOT+4am4xdI/Kv Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Content-Description: signed data Content-Disposition: inline On Friday 14 February 2003 07:50, Adam M. Costello wrote: > It might be preferable to simply require applying NFKC to the entire > local part before splitting it. It's a more heavyweight operation, > but it's already in the toolbox. Completely agreed. 
Marc =2D-=20 The [Sonny Bono Copyright Term Extension Act] expands copyright not only for future, but also for existing works, even though their authors obviously don't need any additional incentive to create them. -- "The Progress Of Science And Useful Arts": Why Copyright Today Threatens Intellectual Freedom, Free Expression Policy Project --Boundary-02=_gaOT+4am4xdI/Kv Content-Type: application/pgp-signature Content-Description: signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.1 (GNU/Linux) iD8DBQA+TOag3oWD+L2/6DgRAtHgAJ45taH/Kr+73Gt+lnjCalw+71meRACfZkIJ IL6M4+jd9GIDVDEWqSu4A9Q= =66sK -----END PGP SIGNATURE----- --Boundary-02=_gaOT+4am4xdI/Kv-- From owner-ietf-imaa Fri Feb 14 05:52:20 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1EDqKK27647 for ietf-imaa-bks; Fri, 14 Feb 2003 05:52:20 -0800 (PST) Received: from slarti.muc.de (slarti.muc.de [193.149.48.10]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1EDqGd27632 for ; Fri, 14 Feb 2003 05:52:16 -0800 (PST) Received: (qmail 27000 invoked by uid 66); 14 Feb 2003 13:52:14 -0000 Received: from faerber.muc.de by slarti.muc.de with BSMTP (rsmtp-qm-ot 0.4) for ietf-imaa@imc.org; 14 Feb 2003 13:52:14 -0000 Received: by faerber.muc.de (OpenXP/32 v3.9.4 (Win32) alpha @ 2003-02-14-1342d); 14 Feb 2003 14:52:04 +0100 Date: 14 Feb 2003 14:43:00 +0100 From: list-ietf-i18n-imaa@faerber.muc.de (=?ISO-8859-1?Q?Claus_F=E4rber?=) To: ietf-imaa@imc.org Message-ID: <8fpJ5tRZcDD@3247.org> In-Reply-To: <1045183773.18276.TMDA@moriarty.gnomon.org.uk> Subject: Re: Another issue: quoting User-Agent: OpenXP/32 v3.9.4 (Win32) alpha @ 2003-02-14-1342d MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Roy Badami schrieb/wrote: > Most software wants to treat the localpart as simply a sequence of > characters (at least after dequoting) but it isn't defined that way. 
To be quirks-compatible with software that does make a distinction between different forms of the same local-part, I'd recommend the following: 1. Save the original form. 2. Fully dequote/remove comments/extra whitespace before toUnicode and/or toASCII. 3. If the output of toUnicode/toASCII is identical to the input, use the original form in step #1, otherwise use the output generated by the function. Claus -- http://www.faerber.muc.de/ From owner-ietf-imaa Fri Feb 14 05:52:20 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1EDqK127646 for ietf-imaa-bks; Fri, 14 Feb 2003 05:52:20 -0800 (PST) Received: from slarti.muc.de (slarti.muc.de [193.149.48.10]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1EDqGd27633 for ; Fri, 14 Feb 2003 05:52:16 -0800 (PST) Received: (qmail 26999 invoked by uid 66); 14 Feb 2003 13:52:14 -0000 Received: from faerber.muc.de by slarti.muc.de with BSMTP (rsmtp-qm-ot 0.4) for ietf-imaa@imc.org; 14 Feb 2003 13:52:14 -0000 Received: by faerber.muc.de (OpenXP/32 v3.9.4 (Win32) alpha @ 2003-02-14-1342d); 14 Feb 2003 14:52:04 +0100 Date: 14 Feb 2003 14:33:00 +0100 From: list-ietf-i18n-imaa@faerber.muc.de (=?ISO-8859-1?Q?Claus_F=E4rber?=) To: ietf-imaa@imc.org Message-ID: <8fpJ5$qocDD@3247.org> In-Reply-To: <1045224428.21358.TMDA@moriarty.gnomon.org.uk> Subject: Re: Question: Fullwidth double-quote and fullwidth backslash User-Agent: OpenXP/32 v3.9.4 (Win32) alpha @ 2003-02-14-1342d MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Roy Badami schrieb/wrote: > When dequoting/requoting localparts, should we consider recognizing > fullwidth double quotes and fullwidth backslash (and any other > double-quote-like and backslash-like characters)? Just do an NFKC normalisation at the very beginning and then additionally map U+3002 to U+002E. This will handle all of these special cases.
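The pre-normalisation suggested above can be sketched as follows. This is a minimal illustration only, assuming Python's standard unicodedata module; the helper name prenormalise is hypothetical and is not taken from the IMAA draft.

```python
import unicodedata

IDEOGRAPHIC_FULL_STOP = "\u3002"


def prenormalise(local_part: str) -> str:
    """Hypothetical helper: NFKC-normalise, then map U+3002 to U+002E."""
    folded = unicodedata.normalize("NFKC", local_part)
    # NFKC leaves U+3002 untouched, so it needs the extra explicit mapping.
    return folded.replace(IDEOGRAPHIC_FULL_STOP, ".")


# Fullwidth quotation mark, reverse solidus, full stop and commercial at,
# plus an ideographic full stop, all fold to their ASCII counterparts.
sample = "\uff02a\uff3c\uff02b\uff02\uff0ec\u3002d\uff20"
print(prenormalise(sample))
```

After this step the ordinary ASCII dequoting and splitting rules can be applied, because every width variant of a metacharacter has already been folded.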
Claus -- http://www.faerber.muc.de/ From owner-ietf-imaa Fri Feb 14 05:52:20 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1EDqKf27645 for ietf-imaa-bks; Fri, 14 Feb 2003 05:52:20 -0800 (PST) Received: from slarti.muc.de (slarti.muc.de [193.149.48.10]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1EDqGd27634 for ; Fri, 14 Feb 2003 05:52:16 -0800 (PST) Received: (qmail 26998 invoked by uid 66); 14 Feb 2003 13:52:14 -0000 Received: from faerber.muc.de by slarti.muc.de with BSMTP (rsmtp-qm-ot 0.4) for ietf-imaa@imc.org; 14 Feb 2003 13:52:14 -0000 Received: by faerber.muc.de (OpenXP/32 v3.9.4 (Win32) alpha @ 2003-02-14-1342d); 14 Feb 2003 14:52:04 +0100 Date: 14 Feb 2003 14:07:00 +0100 From: list-ietf-i18n-imaa@faerber.muc.de (=?ISO-8859-1?Q?Claus_F=E4rber?=) To: ietf-imaa@imc.org Message-ID: <8fpJ4yM3cDD@3247.org> In-Reply-To: <20030214065034.GG25048@nicemice.net> Subject: Re: Open Issue: Splitting of local-part into labels and where? User-Agent: OpenXP/32 v3.9.4 (Win32) alpha @ 2003-02-14-1342d MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 8bit Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Adam M. Costello schrieb/wrote: > Claus Färber wrote: > [regarding the set of delimiters for subdividing local parts] >> I'm not quite sure about the quoting characters ``"'' and ``\''. If >> we apply toASCII/toUnicode after dequoting the local part, there's no >> reason to include them. > Sure there is. Consider: > From: foo bar <"foo\"bar\\"@example.org> > The dequoted local part is: > foo"bar\ So what? They're just ordinary characters like ``a'', ``b'' and ``c'' then. The quote characters are rarely used as separators. The dequoted local part ``foö"bär\'' could be encoded as ``xn--fo"br\-eua5l'', which could be quoted/included in a From header as ``"xn--fo\"br\\-eua5l" (comment allowed by RFC 822) @example.org''. 
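The dequote and requote steps in this example can be sketched roughly as below. This is an illustration under simplifying assumptions: it handles only a single quoted-string local part, ignores comments and folding whitespace, and always requotes even when an unquoted form would be legal. The function names are hypothetical.

```python
def dequote(local_part: str) -> str:
    """Strip surrounding double quotes and undo backslash escapes."""
    if len(local_part) < 2 or not (
        local_part.startswith('"') and local_part.endswith('"')
    ):
        return local_part  # not a quoted-string; leave it alone
    inner = local_part[1:-1]
    out = []
    i = 0
    while i < len(inner):
        if inner[i] == "\\" and i + 1 < len(inner):
            i += 1  # skip the backslash, keep the escaped character
        out.append(inner[i])
        i += 1
    return "".join(out)


def requote(text: str) -> str:
    """Escape backslashes and double quotes, then wrap in double quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


print(dequote(r'"foo\"bar\\"'))
print(requote('xn--fo"br\\-eua5l'))
```

Once dequoted, the double quote and backslash are ordinary characters, which is the point being made in the message above.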
For decoding, you would first decode the local part, apply toUnicode and then re-apply a quoting for display. Claus -- http://www.faerber.muc.de/ From owner-ietf-imaa Fri Feb 14 05:49:37 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1EDnbR27547 for ietf-imaa-bks; Fri, 14 Feb 2003 05:49:37 -0800 (PST) Received: from mail.uni-bielefeld.de (IDENT:72@mail2.uni-bielefeld.de [129.70.4.90]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1EDnZd27543 for ; Fri, 14 Feb 2003 05:49:35 -0800 (PST) Received: from dirichlet.Physik.Uni-Bielefeld.DE (dirichlet.Physik.Uni-Bielefeld.DE [129.70.125.234]) by mail.uni-bielefeld.de (Sun Internet Mail Server sims.4.0.2000.10.12.16.25.p8) with ESMTP id <0HAA00161X2F0S@mail.uni-bielefeld.de> for ietf-imaa@imc.org; Fri, 14 Feb 2003 14:49:28 +0100 (MET) Date: Fri, 14 Feb 2003 14:36:44 +0100 From: Marc Mutz Subject: Re: Another issue: quoting In-reply-to: <1045191586.18900.TMDA@moriarty.gnomon.org.uk> To: ietf-imaa@imc.org Message-id: <200302141436.55925@sendmail.mutz.com> Organization: KDE MIME-version: 1.0 Content-type: multipart/signed; protocol="application/pgp-signature"; micalg=pgp-sha1; boundary="Boundary-02=_3DPT+1EFGU+k7dQ"; charset="us-ascii" Content-transfer-encoding: 7bit User-Agent: KMail/1.5.9 X-PGP-Key: 0xBDBFE838 References: <1045183773.18276.TMDA@moriarty.gnomon.org.uk> <2147483647.1045161561@nifty-jr.west.sun.com> <1045191586.18900.TMDA@moriarty.gnomon.org.uk> Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: --Boundary-02=_3DPT+1EFGU+k7dQ Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Content-Description: signed data Content-Disposition: inline On Friday 14 February 2003 03:59, Roy Badami wrote: > I'm going to have to read 2822 carefully, but can anyone confirm that > roy.badami and "roy.badami" are still potentially distinct mailboxes > in 2822, even neglecting obsolete deprecated stuff? 
After lexing, they are equivalent. Marc =2D-=20 This is as small as I think is sensible. -- Don Sanders after committing a 1MB patch to KMail CVS --Boundary-02=_3DPT+1EFGU+k7dQ Content-Type: application/pgp-signature Content-Description: signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.1 (GNU/Linux) iD8DBQA+TPD33oWD+L2/6DgRArYnAJ9kIRXFd0cDQADzZ3Fr6s+B4lwFgQCg+hlq DNwIW1XHJAcPuJ9hur2POx4= =KoqQ -----END PGP SIGNATURE----- --Boundary-02=_3DPT+1EFGU+k7dQ-- From owner-ietf-imaa Fri Feb 14 06:09:54 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1EE9sl28039 for ietf-imaa-bks; Fri, 14 Feb 2003 06:09:54 -0800 (PST) Received: from yxa.extundo.com (178.230.13.217.in-addr.dgcsystems.net [217.13.230.178]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1EE9qd28035 for ; Fri, 14 Feb 2003 06:09:52 -0800 (PST) Received: from latte.josefsson.org (yxa.extundo.com [217.13.230.178]) (authenticated bits=0) by yxa.extundo.com (8.12.7/8.12.7) with ESMTP id h1EE9gXf002461 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=OK); Fri, 14 Feb 2003 15:09:44 +0100 To: Roy Badami Cc: ietf-imaa@imc.org Subject: Re: Question: UseSTD3ASCIIRules on RHS of IMA X-Payment: hashcash 1.1 0:030214:roy@gnomon.org.uk:c5139691bad24d4e X-Hashcash: 0:030214:roy@gnomon.org.uk:c5139691bad24d4e X-Payment: hashcash 1.1 0:030214:ietf-imaa@imc.org:42ddbd6510bb9e54 X-Hashcash: 0:030214:ietf-imaa@imc.org:42ddbd6510bb9e54 From: Simon Josefsson Date: Fri, 14 Feb 2003 15:09:42 +0100 In-Reply-To: <1045222662.21247.TMDA@moriarty.gnomon.org.uk> (Roy Badami's message of "Fri, 14 Feb 2003 11:37:42 +0000") Message-ID: User-Agent: Gnus/5.090016 (Oort Gnus v0.16) Emacs/21.3.50 References: <1045222662.21247.TMDA@moriarty.gnomon.org.uk> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Spam-Status: No, hits=-2.8 required=5.0 tests=IN_REP_TO,QUOTED_EMAIL_TEXT,REFERENCES,SPAM_PHRASE_00_01, USER_AGENT,USER_AGENT_GNUS_UA version=2.44 Sender: 
owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Roy Badami writes: > Should IMAA make any statement as to whether UseSTD3ASCIIRules MUST or > SHOULD be set when processessing any IDN on the RHS of an IMA, or > should this decision be left up to the implementor? Why should IMAA discuss it at all? IDNA defines how domain names are internationalized, and that takes care of RHS. From owner-ietf-imaa Fri Feb 14 06:17:31 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1EEHVU28381 for ietf-imaa-bks; Fri, 14 Feb 2003 06:17:31 -0800 (PST) Received: from yxa.extundo.com (178.230.13.217.in-addr.dgcsystems.net [217.13.230.178]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1EEHUd28377 for ; Fri, 14 Feb 2003 06:17:30 -0800 (PST) Received: from latte.josefsson.org (yxa.extundo.com [217.13.230.178]) (authenticated bits=0) by yxa.extundo.com (8.12.7/8.12.7) with ESMTP id h1EEHTXf002600 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=OK); Fri, 14 Feb 2003 15:17:29 +0100 To: list-ietf-i18n-imaa@faerber.muc.de (Claus =?iso-8859-1?q?F=E4rber?=) Cc: ietf-imaa@imc.org Subject: Re: Question: Fullwidth double-quote and fullwidth backslash X-Payment: hashcash 1.1 0:030214:list-ietf-i18n-imaa@faerber.muc.de:eaede714d93ebaca X-Hashcash: 0:030214:list-ietf-i18n-imaa@faerber.muc.de:eaede714d93ebaca X-Payment: hashcash 1.1 0:030214:ietf-imaa@imc.org:60732d59cfd3dbea X-Hashcash: 0:030214:ietf-imaa@imc.org:60732d59cfd3dbea From: Simon Josefsson Date: Fri, 14 Feb 2003 15:17:28 +0100 In-Reply-To: <8fpJ5$qocDD@3247.org> (list-ietf-i18n-imaa@faerber.muc.de's message of "14 Feb 2003 14:33:00 +0100") Message-ID: User-Agent: Gnus/5.090016 (Oort Gnus v0.16) Emacs/21.3.50 References: <8fpJ5$qocDD@3247.org> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: 8bit X-Spam-Status: No, hits=-2.8 required=5.0 tests=IN_REP_TO,QUOTED_EMAIL_TEXT,REFERENCES,SPAM_PHRASE_00_01, 
USER_AGENT,USER_AGENT_GNUS_UA version=2.44 Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: list-ietf-i18n-imaa@faerber.muc.de (Claus Färber) writes: > Roy Badami schrieb/wrote: >> When dequoting/requoting localparts, should we consider recognizing >> fullwidth double quotes and fullwidth backslash (and any other >> double-quote-like and backlash-like characters)? > > Just do a NFKC normalisation at the very beginning and then additinally > map U+3002 to U+002E. This will handle all of these special cases. Doing normalization before mapping goes against stringprep and results in different behaviour (see the "self reverting" test vectors on ). I'm not saying your idea is a bad one, I think it is another indication that IMAA cannot be a simple stringprep profile. From owner-ietf-imaa Fri Feb 14 06:22:25 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1EEMPM28437 for ietf-imaa-bks; Fri, 14 Feb 2003 06:22:25 -0800 (PST) Received: from mail.uni-bielefeld.de (IDENT:72@mail2.uni-bielefeld.de [129.70.4.90]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1EEMNd28433 for ; Fri, 14 Feb 2003 06:22:23 -0800 (PST) Received: from dirichlet.Physik.Uni-Bielefeld.DE (dirichlet.Physik.Uni-Bielefeld.DE [129.70.125.234]) by mail.uni-bielefeld.de (Sun Internet Mail Server sims.4.0.2000.10.12.16.25.p8) with ESMTP id <0HAA002QUYL83O@mail.uni-bielefeld.de> for ietf-imaa@imc.org; Fri, 14 Feb 2003 15:22:20 +0100 (MET) Date: Fri, 14 Feb 2003 15:09:34 +0100 From: Marc Mutz Subject: Re: Question: UseSTD3ASCIIRules on RHS of IMA In-reply-to: <1045222662.21247.TMDA@moriarty.gnomon.org.uk> To: ietf-imaa@imc.org Message-id: <200302141509.45488@sendmail.mutz.com> Organization: KDE MIME-version: 1.0 Content-type: multipart/signed; protocol="application/pgp-signature"; micalg=pgp-sha1; boundary="Boundary-02=_piPT+XLS9Ul2dz1"; charset="us-ascii" Content-transfer-encoding: 7bit User-Agent: KMail/1.5.9 X-PGP-Key: 
0xBDBFE838 References: <1045222662.21247.TMDA@moriarty.gnomon.org.uk> Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: --Boundary-02=_piPT+XLS9Ul2dz1 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Content-Description: signed data Content-Disposition: inline On Friday 14 February 2003 12:37, Roy Badami wrote: > Should IMAA make any statement as to whether UseSTD3ASCIIRules MUST > or SHOULD be set when processing any IDN on the RHS of an IMA, or > should this decision be left up to the implementor? The RHS is an IDN-unaware domain name slot. As such, it is accounted for by IDNA, not IMAA. Marc -- It takes 5 minutes to create [a OpenPGP key]. Of course it takes a bit more time to get it signed... -- David Faure --Boundary-02=_piPT+XLS9Ul2dz1 Content-Type: application/pgp-signature Content-Description: signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.1 (GNU/Linux) iD8DBQA+TPip3oWD+L2/6DgRAvYfAKCSAi65EtZZcRIeJmjHeNCu1XgDvwCgp7g3 sFF0tGowIEbsNWZPSYyAUu0= =Ojx1 -----END PGP SIGNATURE----- --Boundary-02=_piPT+XLS9Ul2dz1-- From owner-ietf-imaa Fri Feb 14 06:40:40 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1EEeeR01160 for ietf-imaa-bks; Fri, 14 Feb 2003 06:40:40 -0800 (PST) Received: from moriarty.gnomon.org.uk (pc4-cmbg2-5-cust162.cmbg.cable.ntl.com [81.100.86.162]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1EEecd01150 for ; Fri, 14 Feb 2003 06:40:38 -0800 (PST) Received: from moriarty.gnomon.org.uk (roy@localhost [127.0.0.1]) by moriarty.gnomon.org.uk (8.12.3/8.12.3/Debian -4) with ESMTP id h1EEeuKA021987 for ; Fri, 14 Feb 2003 14:40:56 GMT To: jas@extundo.com CC: ietf-imaa@imc.org, roy@gnomon.org.uk In-reply-to: (message from Simon Josefsson on Fri, 14 Feb 2003 15:09:42 +0100) Subject: Re: Question: UseSTD3ASCIIRules on RHS of IMA References: <1045222662.21247.TMDA@moriarty.gnomon.org.uk> Date: Fri, 14 Feb 2003
14:40:55 +0000 Message-ID: <1045233655.21973.TMDA@moriarty.gnomon.org.uk> From: Roy Badami X-Delivery-Agent: TMDA/0.65 (Johnstown) X-Primary-Address: roy@gnomon.org.uk Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Why should IMAA discuss it at all? IDNA defines how domain names are internationalized, and that takes care of RHS. Because, as IDNA says, the rules as to where the STD3 hostname rules should be applied are imprecise and the subject of debate, and the implementor needs to interpret the RFCs to decide whether they apply in any given circumstance. If the RFCs are clear on the matter, then it's surely helpful to the implementor to document in IMAA the required value of the UseSTD3ASCIIRules flag for processing IMAs. (And I'd appreciate a reference to the RFC that makes it clear whether mail domains are subject to hostname rules, if there is one.) If there's any ambiguity or disagreement as to whether the hostname rules apply, then perhaps IMAA should take the opportunity of clarifying the situation. 
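To make concrete what the flag under discussion changes, here is a small sketch of the per-label check that UseSTD3ASCIIRules switches on, following ToASCII step 3 as it reads in the published IDNA specification (RFC 3490): no ASCII code points other than letters, digits and hyphen, and no leading or trailing hyphen. The function names are hypothetical, and whether a mail domain should be checked this way is exactly the open question in this thread.

```python
def violates_std3(label: str) -> bool:
    """True if the label breaks the STD3 host name restrictions."""
    for ch in label:
        # Only ASCII code points are restricted here; non-ASCII ones are
        # handled by the rest of ToASCII, not by this check.
        if ord(ch) < 0x80 and not (ch.isalnum() or ch == "-"):
            return True
    return label.startswith("-") or label.endswith("-")


def rejected_labels(domain: str, use_std3_ascii_rules: bool) -> list:
    """List the labels a caller would reject for the given flag value."""
    if not use_std3_ascii_rules:
        return []
    return [label for label in domain.split(".") if violates_std3(label)]


print(rejected_labels("my_host.example.org", use_std3_ascii_rules=True))
print(rejected_labels("my_host.example.org", use_std3_ascii_rules=False))
```

Two implementations that pick different flag values will therefore disagree about an address such as user@my_host.example.org, which is the consistency concern raised above.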
-roy From owner-ietf-imaa Fri Feb 14 07:09:25 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1EF9PH03763 for ietf-imaa-bks; Fri, 14 Feb 2003 07:09:25 -0800 (PST) Received: from moriarty.gnomon.org.uk (pc4-cmbg2-5-cust162.cmbg.cable.ntl.com [81.100.86.162]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1EF9Nd03759 for ; Fri, 14 Feb 2003 07:09:24 -0800 (PST) Received: from moriarty.gnomon.org.uk (roy@localhost [127.0.0.1]) by moriarty.gnomon.org.uk (8.12.3/8.12.3/Debian -4) with ESMTP id h1EF9iKA022090 for ; Fri, 14 Feb 2003 15:09:44 GMT To: ietf-imaa@imc.org cc: roy@gnomon.org.uk Subject: Re: Another issue: quoting From: Roy Badami Date: Fri, 14 Feb 2003 15:09:44 +0000 Message-ID: <1045235384.22089.TMDA@moriarty.gnomon.org.uk> X-Delivery-Agent: TMDA/0.65 (Johnstown) X-Primary-Address: roy@gnomon.org.uk Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: To be quirks-compatible with software that does make a distinction between different forms of the same local-part, I'd recommend the following: That doesn't necessarily sound like a bad idea, but my original reason for raising this issue was based on a misunderstanding of RFC822. So this raises the question, do we believe that there are implementations out there that have quirks we need to work around. Do we have any idea what the quirks are? 
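Claus Färber's recommendation (save the original form, dequote before conversion, and keep the original whenever conversion changes nothing) can be sketched as below. The conversion and dequoting functions are passed in as parameters; the two stand-ins used in the demonstration are assumptions for illustration only and are not the ToASCII operation of the IMAA draft.

```python
from typing import Callable


def convert_keeping_original(
    original: str,
    canonicalise: Callable[[str], str],
    convert: Callable[[str], str],
) -> str:
    """Return the original form unless conversion actually changes it."""
    canonical = canonicalise(original)  # dequote, drop comments, etc.
    converted = convert(canonical)      # toASCII or toUnicode
    if converted == canonical:
        return original                 # conversion was a no-op
    return converted


def strip_quotes(local_part: str) -> str:
    """Stand-in canonicaliser: removes one pair of surrounding quotes."""
    if len(local_part) >= 2 and local_part[0] == '"' and local_part[-1] == '"':
        return local_part[1:-1]
    return local_part


def toy_to_ascii(text: str) -> str:
    """Stand-in converter: prefixed Punycode for non-ASCII input only."""
    if text.isascii():
        return text
    return "xn--" + text.encode("punycode").decode("ascii")


print(convert_keeping_original('"roy.badami"', strip_quotes, toy_to_ascii))
print(convert_keeping_original('"b\u00fccher"', strip_quotes, toy_to_ascii))
```

With this shape, an all-ASCII address passes through byte for byte, so software that distinguishes "roy.badami" from roy.badami sees no change.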
-roy From owner-ietf-imaa Fri Feb 14 07:36:56 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1EFauc04331 for ietf-imaa-bks; Fri, 14 Feb 2003 07:36:56 -0800 (PST) Received: from yxa.extundo.com (178.230.13.217.in-addr.dgcsystems.net [217.13.230.178]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1EFasd04327 for ; Fri, 14 Feb 2003 07:36:54 -0800 (PST) Received: from latte.josefsson.org (yxa.extundo.com [217.13.230.178]) (authenticated bits=0) by yxa.extundo.com (8.12.7/8.12.7) with ESMTP id h1EFaqXf005046 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=OK); Fri, 14 Feb 2003 16:36:53 +0100 To: Roy Badami Cc: ietf-imaa@imc.org Subject: Re: Question: UseSTD3ASCIIRules on RHS of IMA X-Payment: hashcash 1.1 0:030214:roy@gnomon.org.uk:e78805723debcece X-Hashcash: 0:030214:roy@gnomon.org.uk:e78805723debcece X-Payment: hashcash 1.1 0:030214:ietf-imaa@imc.org:32b2312a5b929d15 X-Hashcash: 0:030214:ietf-imaa@imc.org:32b2312a5b929d15 From: Simon Josefsson Date: Fri, 14 Feb 2003 16:36:52 +0100 In-Reply-To: <1045233655.21973.TMDA@moriarty.gnomon.org.uk> (Roy Badami's message of "Fri, 14 Feb 2003 14:40:55 +0000") Message-ID: User-Agent: Gnus/5.090016 (Oort Gnus v0.16) Emacs/21.3.50 References: <1045222662.21247.TMDA@moriarty.gnomon.org.uk> <1045233655.21973.TMDA@moriarty.gnomon.org.uk> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Spam-Status: No, hits=-2.8 required=5.0 tests=IN_REP_TO,QUOTED_EMAIL_TEXT,REFERENCES,SPAM_PHRASE_00_01, USER_AGENT,USER_AGENT_GNUS_UA version=2.44 Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Roy Badami writes: > Why should IMAA discuss it at all? IDNA defines how domain names are > internationalized, and that takes care of RHS. 
> > Because, as IDNA says, the rules as to where the STD3 hostname rules > should be applied are imprecise and the subject of debate, and the > implementor needs to interpret the RFCs to decide whether they apply > in any given circumstance. > > If the RFCs are clear on the matter, then it's surely helpful to the > implementor to document in IMAA the required value of the > UseSTD3ASCIIRules flag for processing IMAs. (And I'd appreciate a > reference to the RFC that makes it clear whether mail domains are > subject to hostname rules, if there is one.) > > If there's any ambiguity or disagreement as to whether the hostname > rules apply, then perhaps IMAA should take the opportunity of > clarifying the situation. Informative text doesn't hurt, I guess, but IMAA shouldn't make normative statements about this IMHO. All application must already know the answer to this problem, as the same problem existed before IDNA. If the RFCs are unclear, that should be fixed independently of IMAA. From owner-ietf-imaa Fri Feb 14 07:55:01 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1EFt1O04839 for ietf-imaa-bks; Fri, 14 Feb 2003 07:55:01 -0800 (PST) Received: from moriarty.gnomon.org.uk (pc4-cmbg2-5-cust162.cmbg.cable.ntl.com [81.100.86.162]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1EFsxd04835 for ; Fri, 14 Feb 2003 07:55:00 -0800 (PST) Received: from moriarty.gnomon.org.uk (roy@localhost [127.0.0.1]) by moriarty.gnomon.org.uk (8.12.3/8.12.3/Debian -4) with ESMTP id h1EFtKKA022313 for ; Fri, 14 Feb 2003 15:55:20 GMT To: jas@extundo.com CC: ietf-imaa@imc.org, roy@gnomon.org.uk In-reply-to: (message from Simon Josefsson on Fri, 14 Feb 2003 16:36:52 +0100) Subject: Re: Question: UseSTD3ASCIIRules on RHS of IMA References: <1045222662.21247.TMDA@moriarty.gnomon.org.uk> Date: Fri, 14 Feb 2003 15:55:19 +0000 Message-ID: <1045238119.22308.TMDA@moriarty.gnomon.org.uk> From: Roy Badami X-Delivery-Agent: TMDA/0.65 (Johnstown) X-Primary-Address: 
roy@gnomon.org.uk Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Informative text doesn't hurt, I guess, but IMAA shouldn't make normative statements about this IMHO. All application must already know the answer to this problem, as the same problem existed before IDNA. If the RFCs are unclear, that should be fixed independently of IMAA. This is what IDNA says, but I'm not sure it's true. I doubt most mail software knows the answer to this problem; I suspect that most MUAs simply pass the address to the MTA and see what happens, and that most MTAs simply pass the domain to the resolver and see what happens. In the absence of an RFC-mandated requirement to validate the domain, it could be argued that this is the correct thing to do in terms of the robustness principle, even if the domain name contains characters that are technically illegal. However, with IMAs, the MUA has to explicitly invoke IDNA, and it hence has to choose a value for the UseSTD3ASCIIRules flag. It isn't absolutely clear to me what the correct value of this flag is (though it may be clear to others), and it's highly desirable that IMAA implementations behave consistently. I don't see why normative text must be ruled out. -roy From owner-ietf-imaa Fri Feb 14 08:51:23 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1EGpNt07460 for ietf-imaa-bks; Fri, 14 Feb 2003 08:51:23 -0800 (PST) Received: from server1.matic.com (server.iicinternet.com [66.159.16.71] (may be forged)) by above.proper.com (8.11.6/8.11.3) with SMTP id h1EGpMd07453 for ; Fri, 14 Feb 2003 08:51:22 -0800 (PST) Received: (qmail 1358 invoked from network); 14 Feb 2003 16:50:53 -0000 Received: from adsl-65-43-34-157.dsl.lgtpmi.ameritech.net (HELO ?192.168.0.100?) 
(65.43.34.157) by server.iicinternet.com with SMTP; 14 Feb 2003 16:50:53 -0000 Mime-Version: 1.0 X-Sender: tedd@sperling.com (Unverified) Message-Id: In-Reply-To: <3E4B5CAD.9090304@bic.nus.edu.sg> References: <138AA78F80DCE84B8EE424399FFBF9C904FAA1@exchange.ad.skymv.com> <3E4B5CAD.9090304@bic.nus.edu.sg> Date: Fri, 14 Feb 2003 11:50:39 -0500 To: ietf-imaa@imc.org From: tedd Subject: Re: The typing issue Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: >Sorry to take so long to put across a small point, >but I am not a true native English user. > >-- >tin wee tin wee: It's not a small point -- it's THE point. The entire issue has been to globalize Internet use, not limit it. tedd -- http://sperling.com/ From owner-ietf-imaa Fri Feb 14 09:00:11 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1EH0BS07836 for ietf-imaa-bks; Fri, 14 Feb 2003 09:00:11 -0800 (PST) Received: from yxa.extundo.com (178.230.13.217.in-addr.dgcsystems.net [217.13.230.178]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1EH09d07832 for ; Fri, 14 Feb 2003 09:00:09 -0800 (PST) Received: from latte.josefsson.org (yxa.extundo.com [217.13.230.178]) (authenticated bits=0) by yxa.extundo.com (8.12.7/8.12.7) with ESMTP id h1EH07Xf006936 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=OK); Fri, 14 Feb 2003 18:00:08 +0100 To: Roy Badami Cc: ietf-imaa@imc.org Subject: Re: Question: UseSTD3ASCIIRules on RHS of IMA X-Payment: hashcash 1.1 0:030214:roy@gnomon.org.uk:69c15676ad6cb0f7 X-Hashcash: 0:030214:roy@gnomon.org.uk:69c15676ad6cb0f7 X-Payment: hashcash 1.1 0:030214:ietf-imaa@imc.org:d228d3dbac191cd6 X-Hashcash: 0:030214:ietf-imaa@imc.org:d228d3dbac191cd6 From: Simon Josefsson Date: Fri, 14 Feb 2003 18:00:07 +0100 In-Reply-To: <1045238119.22308.TMDA@moriarty.gnomon.org.uk> (Roy Badami's message of "Fri, 14 Feb 2003 15:55:19 +0000") Message-ID: 
User-Agent: Gnus/5.090016 (Oort Gnus v0.16) Emacs/21.3.50 References: <1045222662.21247.TMDA@moriarty.gnomon.org.uk> <1045238119.22308.TMDA@moriarty.gnomon.org.uk> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Spam-Status: No, hits=-2.8 required=5.0 tests=IN_REP_TO,QUOTED_EMAIL_TEXT,REFERENCES,SPAM_PHRASE_00_01, USER_AGENT,USER_AGENT_GNUS_UA version=2.44 Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Roy Badami writes: > Informative text doesn't hurt, I guess, but IMAA shouldn't make > normative statements about this IMHO. > > All application must already know the answer to this problem, as the > same problem existed before IDNA. If the RFCs are unclear, that > should be fixed independently of IMAA. > > This is what IDNA says, but I'm not sure it's true. I doubt most mail > software knows the answer to this problem; I suspect that most MUAs > simply pass the address to the MTA and see what happens, and that most > MTAs simply pass the domain to the resolver and see what happens. > > In the absence of an RFC-mandated requirement to validate the domain, > it could be argued that this is the correct thing to do in terms of the > robustness principle, even if the domain name contains characters that > are technically illegal. > > However, with IMAs, the MUA has to explicitly invoke IDNA, and it > hence has to choose a value for the UseSTD3ASCIIRules flag. It isn't > absolutely clear to me what the correct value of this flag is (though > it may be clear to others), and it's highly desirable that IMAA > implementations behave consistently. I don't see why normative text > must be ruled out. IDNA suggests that applications that simply passed things on before should not set the flag, and applications that enforces hostname restrictions today should set the flag. Application have made a decision, conscious or not. 
I'm not sure I see any need to enforce anything with regard to this more than already is, and in particular not how IMAA would be the right place for it. ,---- | 3) For each label, decide whether or not to enforce the restrictions on | ASCII characters in host names [STD3]. (Applications already faced this | choice before the introduction of IDNA, and can continue to make the | decision the same way they always have; IDNA makes no new | recommendations regarding this choice.) If the restrictions are to be | enforced, set the flag called "UseSTD3ASCIIRules" for that label. `---- From owner-ietf-imaa Fri Feb 14 09:58:57 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1EHwvJ13057 for ietf-imaa-bks; Fri, 14 Feb 2003 09:58:57 -0800 (PST) Received: from slarti.muc.de (slarti.muc.de [193.149.48.10]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1EHwud13053 for ; Fri, 14 Feb 2003 09:58:56 -0800 (PST) Received: (qmail 22443 invoked by uid 66); 14 Feb 2003 17:58:54 -0000 Received: from faerber.muc.de by slarti.muc.de with BSMTP (rsmtp-qm-ot 0.4) for ietf-imaa@imc.org; 14 Feb 2003 17:58:54 -0000 Received: by faerber.muc.de (OpenXP/32 v3.9.4 (Win32) alpha @ 2003-02-14-1342d); 14 Feb 2003 18:58:49 +0100 Date: 14 Feb 2003 18:43:00 +0100 From: list-ietf-i18n-imaa@faerber.muc.de (=?ISO-8859-1?Q?Claus_F=E4rber?=) To: ietf-imaa@imc.org Message-ID: <8fpKYQTZcDD@3247.org> In-Reply-To: Subject: Re: Question: Fullwidth double-quote and fullwidth backslash User-Agent: OpenXP/32 v3.9.4 (Win32) alpha @ 2003-02-14-1342d MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 8bit Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Simon Josefsson schrieb/wrote: > list-ietf-i18n-imaa@faerber.muc.de (Claus Färber) writes: >> Just do a NFKC normalisation at the very beginning and then additinally >> map U+3002 to U+002E. This will handle all of these special cases. 
> Doing normalization before mapping goes against stringprep and results > in different behaviour (see the "self reverting" test vectors on > ). It does not if you do the normalisation twice (at the very beginning and after mapping). For IMAA, it suffices to specify that implementations MUST accept all characters as delimiters that decompose to one of our delimiters during NFKC-with-U+3002-to-U+002E normalisation and that the delimiters MUST be normalised. The easiest way to implement this is an additional normalisation at the very beginning. IDNA can get away without such a normalisation because they have a single delimiter (U+002E) in their output. The IDNA processing maps all dot variants (including U+3002 and the width variants) to whatever delimiter is used (usually a dot U+002E or, in DNS packets, no delimiter at all). Claus -- http://www.faerber.muc.de/ From owner-ietf-imaa Fri Feb 14 11:13:10 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1EJDAf16564 for ietf-imaa-bks; Fri, 14 Feb 2003 11:13:10 -0800 (PST) Received: from tux.w3.org (IDENT:root@tux.w3.org [18.29.0.27]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1EJD9d16559 for ; Fri, 14 Feb 2003 11:13:09 -0800 (PST) Received: from enoshima (IDENT:root@tux.w3.org [18.29.0.27]) by tux.w3.org (8.9.3/8.9.3) with ESMTP id OAA20781; Fri, 14 Feb 2003 14:13:08 -0500 Message-Id: <4.2.0.58.J.20030214105633.05c158e0@localhost> X-Sender: duerst@localhost X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J Date: Fri, 14 Feb 2003 10:59:27 -0500 To: Roy Badami , ietf-imaa@imc.org From: Martin Duerst Subject: Re: Question: Fullwidth double-quote and fullwidth backslash In-Reply-To: <1045224428.21358.TMDA@moriarty.gnomon.org.uk> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: My (limited) understanding is that quotes and backslashes are not printed on
business cards, and not entered by the user. It therefore seems completely unnecessary to consider full-width variants. While the average user might not get the '@' right, we should be able to rely on programmers getting the quotes and backslashes right. Regards, Martin. At 12:07 03/02/14 +0000, Roy Badami wrote: >When dequoting/requoting localparts, should we consider recognizing >fullwidth double quotes and fullwidth backslash (and any other >double-quote-like and backslash-like characters)? > >It seems to me that the arguments for this are similar to those for >fullwidth dot and fullwidth at, and once we decide to recognize >metacharacters in fullwidth form, we should apply this consistently to >*all* metacharacters. > > -roy From owner-ietf-imaa Fri Feb 14 11:13:08 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1EJD8b16552 for ietf-imaa-bks; Fri, 14 Feb 2003 11:13:08 -0800 (PST) Received: from tux.w3.org (IDENT:root@tux.w3.org [18.29.0.27]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1EJD7d16546 for ; Fri, 14 Feb 2003 11:13:07 -0800 (PST) Received: from enoshima (IDENT:root@tux.w3.org [18.29.0.27]) by tux.w3.org (8.9.3/8.9.3) with ESMTP id OAA20778; Fri, 14 Feb 2003 14:13:08 -0500 Message-Id: <4.2.0.58.J.20030214105432.05980360@localhost> X-Sender: duerst@localhost X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J Date: Fri, 14 Feb 2003 10:55:37 -0500 To: IETF IMAA list , ietf-imaa@imc.org From: Martin Duerst Subject: Re: Question: full-width at Cc: Roy Badami In-Reply-To: <20030214050748.GF25048@nicemice.net> References: <1045190936.18837.TMDA@moriarty.gnomon.org.uk> <1045190936.18837.TMDA@moriarty.gnomon.org.uk> <1045190936.18837.TMDA@moriarty.gnomon.org.uk> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Hello Adam, I agree very much with the general direction of staying away from user
interface issues. The less of these, the better. Regards, Martin. At 05:07 03/02/14 +0000, Adam M. Costello wrote: >These issues were considered for IDNA. There are many characters that >decompose to dot (like U+FE52 small full stop, U+2024 one dot leader) or >decompose to strings containing dots (like U+2488 digit one full stop, >U+33C2 square AM). Therefore, if a whole domain name is normalized >before being scanned for dots, it might result in a different number of >labels than if it had not been normalized. We discussed whether IDNA >should require this pre-normalization. Ultimately we decided that that >was getting too far into user interface issues. IDNA instead focuses >almost entirely on individual labels, not whole domain names. The >one exception is to require that three particular dot-like non-ASCII >characters be recognized as dots, just the three that we had reason to >believe would be likely to be input by users trying to type dots. > >The IMAA draft follows this example. It says as little as possible >about entire mail addresses, and focuses on the local part. The >one exception is to require one particular at-like character to be >recognized as an at-sign, the one that we have reason to believe is >likely to be input by users trying to type at-signs. > >As for initial combining characters, the same question arose with IDNA. >If a label begins with a combining character, will it combine with the >preceding dot? This was considered a user-interface issue that IDNA >should not address.
> >AMC From owner-ietf-imaa Fri Feb 14 11:13:09 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1EJD9P16560 for ietf-imaa-bks; Fri, 14 Feb 2003 11:13:09 -0800 (PST) Received: from tux.w3.org (IDENT:root@tux.w3.org [18.29.0.27]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1EJD8d16551 for ; Fri, 14 Feb 2003 11:13:08 -0800 (PST) Received: from enoshima (IDENT:root@tux.w3.org [18.29.0.27]) by tux.w3.org (8.9.3/8.9.3) with ESMTP id OAA20788; Fri, 14 Feb 2003 14:13:09 -0500 Message-Id: <4.2.0.58.J.20030214110907.05bf4398@localhost> X-Sender: duerst@localhost X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J Date: Fri, 14 Feb 2003 11:36:49 -0500 To: IETF IMAA list , Roy Badami From: Martin Duerst Subject: Re: A couple of comments on the open issues... In-Reply-To: <20030213235319.GB25048@nicemice.net> References: <1045135861.15010.TMDA@moriarty.gnomon.org.uk> <1045135666.14953.TMDA@moriarty.gnomon.org.uk> <1045135861.15010.TMDA@moriarty.gnomon.org.uk> <1045135666.14953.TMDA@moriarty.gnomon.org.uk> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 23:53 03/02/13 +0000, Adam M. Costello wrote: >Roy Badami wrote: > > I'm particularly keen that the use of a tag or suffix with a local > > username is not broken by IMAA. Many MTAs provide the functionality > > that all mail addressed to will be delivered > > to , either by default, or as a configuration option. > > > > This effectively allows a user of such an MTA (that has been suitably > > configured) to have multiple e-mail addresses without requiring any > > action on the part of the mail administrator, and allows the user to > > run scripts that process their mail according to the suffix. > >So if IMAA operates on the entire local part, this multiple-address >feature will be unavailable to users with internationalized local parts.
>This is a good argument in favor of having IMAA operate independently on >subparts of the local part. First a question: Is subaddressing just something that is done by a few email systems locally, or is it something specified in some of the email standards? The use of different separators in different systems seems to suggest the former. If that's the case, to what extent do we really have to consider it here? Also, using subaddressing seems to be quite popular for high-end users. But is it actually very much used by the bulk of users (the proverbial hotmail/yahoo/... crowd)? My guess would be that it's not. If that's the case, then this may give us some slack, because as far as I understand, the main push for internationalized addresses is from average users, not necessarily high-end users. It's also easy to imagine that users have a single internationalized address (for personal mail) and a bunch of structured addresses (for list subscriptions,...). A third thing I have just thought about is that the design of punycode actually has some very nice properties that allow separation of the subnet address in most cases even if the whole LHS is encoded in one go. The first thing here is that the separators, as long as they are ASCII characters, are still visible in the encoded version. Secondly, a very simple pattern search can find all the addresses belonging to the same primary address, with one exception: It's difficult to check whether there are accidental non-ASCII characters smuggled in before the separator. As an example of the last case, assume '+' is the separator, 'abc' is the primary address, 'AABBCC' is its punycode encoding, and we find an address looking something like xn--+fghAABBCCDDEE, then we don't know whether the original address was abc+defgh, or whether it was abcd+efgh or abcde+fgh, i.e. we don't know whether what's encoded in DDEE comes before or after the +.
But we know that 'abc' comes before the +, because otherwise it would be encoded differently. So a search for something like m/^xn--+[\x21-\x7e]*$user.*/ (where $user is the punycode encoding of the username in initial position) will pretty much identify all the mail for that user. Regards, Martin. From owner-ietf-imaa Fri Feb 14 12:24:23 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1EKONH19778 for ietf-imaa-bks; Fri, 14 Feb 2003 12:24:23 -0800 (PST) Received: from moriarty.gnomon.org.uk (pc4-cmbg2-5-cust162.cmbg.cable.ntl.com [81.100.86.162]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1EKOLd19774 for ; Fri, 14 Feb 2003 12:24:21 -0800 (PST) Received: from moriarty.gnomon.org.uk (roy@localhost [127.0.0.1]) by moriarty.gnomon.org.uk (8.12.3/8.12.3/Debian -4) with ESMTP id h1EKOZKA023410 for ; Fri, 14 Feb 2003 20:24:35 GMT To: duerst@w3.org CC: ietf-imaa@imc.org, roy@gnomon.org.uk In-reply-to: <4.2.0.58.J.20030214105633.05c158e0@localhost> (message from Martin Duerst on Fri, 14 Feb 2003 10:59:27 -0500) Subject: Re: Question: Fullwidth double-quote and fullwidth backslash References: <4.2.0.58.J.20030214105633.05c158e0@localhost> Date: Fri, 14 Feb 2003 20:24:34 +0000 Message-ID: <1045254274.23405.TMDA@moriarty.gnomon.org.uk> From: Roy Badami X-Delivery-Agent: TMDA/0.65 (Johnstown) X-Primary-Address: roy@gnomon.org.uk Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: On the contrary, quotes can appear on business cards. Consider the following (invented) address, obtained by mapping an X.400 address using RFC1148 or successors: "/PN=Roy.Badami/OU=Systems/O=Microsoft Inc/C=US/ADMD=ATT/"@x-400-relay.att.com I really have seen addresses like this (though not recently, I'll admit). If the LHS contains unusual characters, quoting had better appear on the business card.
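As a rough illustration of why the quotes matter in an address like the one above, here is a minimal Python sketch that splits an address at the first at-sign outside a quoted string. The name `split_address` is hypothetical, and the sketch assumes only double-quoted strings with backslash escapes inside them; it ignores comments, fullwidth variants, and the rest of the RFC 2822 syntax.

```python
def split_address(addr: str) -> tuple[str, str]:
    """Split addr into (local part, domain) at the first unquoted at-sign.

    Assumption: only "..." quoting and backslash escapes inside quotes
    are handled. This is an illustration, not a full RFC 2822 parser.
    """
    in_quotes = False
    i = 0
    while i < len(addr):
        ch = addr[i]
        if in_quotes and ch == "\\":
            i += 2  # skip the backslash and the character it escapes
            continue
        if ch == '"':
            in_quotes = not in_quotes
        elif ch == "@" and not in_quotes:
            return addr[:i], addr[i + 1:]
        i += 1
    raise ValueError("no unquoted at-sign found")
```

Without the quotes, a local part containing an at-sign or a space could not be told apart from the delimiter, which is the reason the quoting would have to be printed along with the address.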
-roy From owner-ietf-imaa Fri Feb 14 12:31:06 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1EKV6W20348 for ietf-imaa-bks; Fri, 14 Feb 2003 12:31:06 -0800 (PST) Received: from moriarty.gnomon.org.uk (pc4-cmbg2-5-cust162.cmbg.cable.ntl.com [81.100.86.162]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1EKV4d20340 for ; Fri, 14 Feb 2003 12:31:04 -0800 (PST) Received: from moriarty.gnomon.org.uk (roy@localhost [127.0.0.1]) by moriarty.gnomon.org.uk (8.12.3/8.12.3/Debian -4) with ESMTP id h1EKVRKA023447 for ; Fri, 14 Feb 2003 20:31:28 GMT To: ietf-imaa@imc.org cc: roy@gnomon.org.uk Subject: Re: Question: Fullwidth double-quote and fullwidth backslash From: Roy Badami Date: Fri, 14 Feb 2003 20:31:27 +0000 Message-ID: <1045254687.23446.TMDA@moriarty.gnomon.org.uk> X-Delivery-Agent: TMDA/0.65 (Johnstown) X-Primary-Address: roy@gnomon.org.uk Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: It does not if you do the normalisation twice (at the very beginning and after mapping). For IMAA, it suffices to specify that implementations MUST accept all characters as delimiters that decompose to one of our delimiters during NFKC-with-U+3002-to-U+002E normalisation and that the delimiters MUST be normalised. The easiest way to implement this is an additional normalisation at the very beginning. Are you saying we can do a normalization of the entire e-mail address without violating IDNA (which specifies that the domain be split on dot-like characters before normalization)? Because we have to parse the quoting in order to identify the local-part (the LHS may contain quoted at-signs).
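The difference between normalising a whole name and splitting it first can be seen with Python's standard `unicodedata` module. The sketch below applies NFKC and then maps U+3002 to U+002E, as in the proposal quoted above; the helper names are made up for illustration, and the sample characters are ones mentioned earlier in the thread.

```python
import unicodedata


def normalize_delimiters(text: str) -> str:
    """Apply NFKC, then map U+3002 (ideographic full stop) to U+002E.

    NFKC alone leaves U+3002 unchanged, hence the extra mapping step.
    """
    return unicodedata.normalize("NFKC", text).replace("\u3002", ".")


def count_labels(domain: str) -> int:
    """Count labels by splitting on the ASCII dot only."""
    return len(domain.split("."))


# U+2488 (digit one full stop) decomposes to "1." under NFKC, so
# normalising the whole string introduces a dot that was not there.
raw = "foo\u2488bar"
print(count_labels(raw), count_labels(normalize_delimiters(raw)))
```

The label count changes from one to two in this case, which is the effect that splitting on dot-like characters before normalising is meant to avoid.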
-roy From owner-ietf-imaa Fri Feb 14 12:34:20 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1EKYKx20880 for ietf-imaa-bks; Fri, 14 Feb 2003 12:34:20 -0800 (PST) Received: from moriarty.gnomon.org.uk (pc4-cmbg2-5-cust162.cmbg.cable.ntl.com [81.100.86.162]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1EKYId20872 for ; Fri, 14 Feb 2003 12:34:18 -0800 (PST) Received: from moriarty.gnomon.org.uk (roy@localhost [127.0.0.1]) by moriarty.gnomon.org.uk (8.12.3/8.12.3/Debian -4) with ESMTP id h1EKYfKA023468 for ; Fri, 14 Feb 2003 20:34:41 GMT To: duerst@w3.org CC: ietf-imaa@imc.org In-reply-to: <4.2.0.58.J.20030214110907.05bf4398@localhost> (message from Martin Duerst on Fri, 14 Feb 2003 11:36:49 -0500) Subject: Re: A couple of comments on the open issues... References: <1045135861.15010.TMDA@moriarty.gnomon.org.uk> <1045135666.14953.TMDA@moriarty.gnomon.org.uk> <1045135861.15010.TMDA@moriarty.gnomon.org.uk> <4.2.0.58.J.20030214110907.05bf4398@localhost> Date: Fri, 14 Feb 2003 20:34:40 +0000 Message-ID: <1045254880.23456.TMDA@moriarty.gnomon.org.uk> From: Roy Badami X-Delivery-Agent: TMDA/0.65 (Johnstown) X-Primary-Address: roy@gnomon.org.uk Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: First a question: Is subaddressing just something that is done by a few email systems locally, or is it something specified in some of the email standards? It's not specified in any standard that I'm aware of, but it is functionality that is provided in many Unix MTAs. Also, using subaddressing seems to be quite popular for high-end users. But is it actually very much used by the bulk of users (the proverbial hotmail/yahoo/... crowd)? My guess would be that it's not. Probably true. If that's the case, then this may give us some slack, because as far as I understand, the main push for internationalized addresses is from average users, not necessarily high-end users.
It's also easy to imagine that users have a single internationalized address (for personal mail) and a bunch of structured addresses (for list subscriptions,...). I guess. But it would be nice if high-end users could use IMAs with subaddresses if they wish. A third thing I have just thought about is that the design of punycode actually has some very nice properties that allow separation of the subnet address in most cases even if the whole LHS is encoded in one go. The problem here is that you don't want to change the MTA. The MTA uses a particular algorithm for splitting the address into mailbox and subaddress, and it would be nice if IMAA didn't break it. -roy From owner-ietf-imaa Fri Feb 14 12:51:42 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1EKpgN21827 for ietf-imaa-bks; Fri, 14 Feb 2003 12:51:42 -0800 (PST) Received: from moriarty.gnomon.org.uk (pc4-cmbg2-5-cust162.cmbg.cable.ntl.com [81.100.86.162]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1EKped21823 for ; Fri, 14 Feb 2003 12:51:40 -0800 (PST) Received: from moriarty.gnomon.org.uk (roy@localhost [127.0.0.1]) by moriarty.gnomon.org.uk (8.12.3/8.12.3/Debian -4) with ESMTP id h1EKq3KA023558 for ; Fri, 14 Feb 2003 20:52:03 GMT To: ietf-imaa@imc.org cc: roy@gnomon.org.uk Subject: Re: Case sensitivity on the LHS From: Roy Badami Date: Fri, 14 Feb 2003 20:52:03 +0000 Message-ID: <1045255923.23555.TMDA@moriarty.gnomon.org.uk> X-Delivery-Agent: TMDA/0.65 (Johnstown) X-Primary-Address: roy@gnomon.org.uk Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: The IDNA model appears to be a better tradeoff: mandate that non-ASCII local parts are always case-insensitive, and leave mixed-case annotations as an *optional* technique for preserving case. So does IDNA permit mixed-case annotation?
I was under the impression that at one point the draft forbade the use of mixed-case annotation with IDNA, but I can't find that prohibition in the current documents. How exactly does mixed-case annotation work in IDNA? You have to somehow pass case information through the nameprep process; is the precise algorithm actually spelled out anywhere? -roy From owner-ietf-imaa Fri Feb 14 13:06:51 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1EL6pB22060 for ietf-imaa-bks; Fri, 14 Feb 2003 13:06:51 -0800 (PST) Received: from moriarty.gnomon.org.uk (pc4-cmbg2-5-cust162.cmbg.cable.ntl.com [81.100.86.162]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1EL6md22056 for ; Fri, 14 Feb 2003 13:06:48 -0800 (PST) Received: from moriarty.gnomon.org.uk (roy@localhost [127.0.0.1]) by moriarty.gnomon.org.uk (8.12.3/8.12.3/Debian -4) with ESMTP id h1EL7AKA023630 for ; Fri, 14 Feb 2003 21:07:10 GMT To: jas@extundo.com CC: ietf-imaa@imc.org, roy@gnomon.org.uk In-reply-to: (message from Simon Josefsson on Fri, 14 Feb 2003 18:00:07 +0100) Subject: Re: Question: UseSTD3ASCIIRules on RHS of IMA References: <1045222662.21247.TMDA@moriarty.gnomon.org.uk> Date: Fri, 14 Feb 2003 21:07:09 +0000 Message-ID: <1045256829.23623.TMDA@moriarty.gnomon.org.uk> From: Roy Badami X-Delivery-Agent: TMDA/0.65 (Johnstown) X-Primary-Address: roy@gnomon.org.uk Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: IDNA suggests that applications that simply passed things on before should not set the flag, and applications that enforce hostname restrictions today should set the flag. Applications have made a decision, conscious or not. I'm not sure I see any need to enforce anything with regard to this more than already is, and in particular not how IMAA would be the right place for it. Ok, maybe you're right.
However, an informational note would not go amiss, even if it's only on the lines of the note in IDNA that basically just says 'this point is contentious'. Further minor wrinkle: if the application chooses not to enforce hostname rules (ie not to set UseSTD3ASCIIRules), it still needs to enforce those restrictions on domain names that occur as a result of the 822/2822 syntax (ie a label can't contain specials). Does IMAA need to say anything about the application of the 822/2822 rules (in the same way that IDNA talks about the application of the STD3 rules)? -roy From owner-ietf-imaa Fri Feb 14 13:59:09 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1ELx9R23331 for ietf-imaa-bks; Fri, 14 Feb 2003 13:59:09 -0800 (PST) Received: from tux.w3.org (IDENT:root@tux.w3.org [18.29.0.27]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1ELx8d23327 for ; Fri, 14 Feb 2003 13:59:08 -0800 (PST) Received: from enoshima (IDENT:root@tux.w3.org [18.29.0.27]) by tux.w3.org (8.9.3/8.9.3) with ESMTP id QAA03780; Fri, 14 Feb 2003 16:59:04 -0500 Message-Id: <4.2.0.58.J.20030214153013.059a8818@localhost> X-Sender: duerst@localhost X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J Date: Fri, 14 Feb 2003 15:38:18 -0500 To: Roy Badami From: Martin Duerst Subject: Re: Question: Fullwidth double-quote and fullwidth backslash Cc: ietf-imaa@imc.org In-Reply-To: <1045254274.23405.TMDA@moriarty.gnomon.org.uk> References: <4.2.0.58.J.20030214105633.05c158e0@localhost> <4.2.0.58.J.20030214105633.05c158e0@localhost> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Hello Roy, At 20:24 03/02/14 +0000, Roy Badami wrote: >On the contrary, quotes can appear on business cards. Ok, thanks. So they actually can, and do, in odd cases. Paper is patient.
(German saying) But are we really required, or do we see it as our goal, to help people avoid some potential typing mistakes in addresses that are, by their length and complexity, not at all user-friendly in the first place? My position is that we don't have any reason to go there. Regards, Martin. >Consider the following (invented) address, obtained by mapping an >X.400 address using RFC1148 or successors: > >"/PN=Roy.Badami/OU=Systems/O=Microsoft Inc/C=US/ADMD=ATT/"@x-400-relay.att.com > >I really have seen addresses like this (though not recently, I'll admit). > >If the LHS contains unusual characters, quoting had better appear on >the business card. > > -roy From owner-ietf-imaa Fri Feb 14 15:06:19 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1EN6Jt25355 for ietf-imaa-bks; Fri, 14 Feb 2003 15:06:19 -0800 (PST) Received: from relay-3m.club-internet.fr (relay-3m.club-internet.fr [194.158.104.42]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1EN6Hd25351 for ; Fri, 14 Feb 2003 15:06:17 -0800 (PST) Received: from mine.club-internet.fr (f16m-10-114.d1.club-internet.fr [212.195.121.114]) by relay-3m.club-internet.fr (Postfix) with ESMTP id AB982E113; Sat, 15 Feb 2003 00:07:01 +0100 (CET) Message-Id: <5.2.0.9.0.20030214233425.02baf6e0@mail.club-internet.fr> X-Sender: jefsey@mail.club-internet.fr X-Mailer: QUALCOMM Windows Eudora Version 5.2.0.9 Date: Sat, 15 Feb 2003 00:02:13 +0100 To: Martin Duerst , Roy Badami From: "J-F C. 
(Jefsey) Morfin" Subject: Re: Question: Fullwidth double-quote and fullwidth backslash Cc: ietf-imaa@imc.org In-Reply-To: <4.2.0.58.J.20030214153013.059a8818@localhost> References: <1045254274.23405.TMDA@moriarty.gnomon.org.uk> <4.2.0.58.J.20030214105633.05c158e0@localhost> <4.2.0.58.J.20030214105633.05c158e0@localhost> Mime-Version: 1.0 Content-Type: multipart/mixed; x-avg-checked=avg-ok-2353733D; boundary="=======3B9A1B11=======" Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: --=======3B9A1B11======= Content-Type: text/plain; x-avg-checked=avg-ok-2353733D; charset=us-ascii; format=flowed Content-Transfer-Encoding: 8bit At 21:38 14/02/03, Martin Duerst wrote: >My position is that we don't have any reason to go there. What 95% of the users could accept today will not tell you much about what 10% will demand once IMAA has changed the conception of 80% of the worldwide users, and service providers and designers, about the mail address, i.e. a key element of a service representing 80% of the internet traffic. IMHO the question is not "what should we do?", but "what cannot we really do?". I have used the WSIS lists and asked around. People cannot commit on something they never saw. But the interest, and the subsequent demands, are here. I suggest you carry out the same test. Also, remember that people mostly use Windows, and that Windows uses file names with spaces, writes file names with upper case on displays, etc., and does some other funny things that people see every day and understand as an improvement over the current proposition (or a liberation from limitations they do not understand: "why would it be so complex? it is all over my IE screen today").
--=======3B9A1B11=======-- From owner-ietf-imaa Fri Feb 14 15:51:37 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1ENpbr26241 for ietf-imaa-bks; Fri, 14 Feb 2003 15:51:37 -0800 (PST) Received: from nicemice.net (arwen.CS.Berkeley.EDU [128.32.132.165]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1ENpad26231 for ; Fri, 14 Feb 2003 15:51:36 -0800 (PST) Received: from amc by nicemice.net with local (Exim 3.35 #1 (Debian)) id 18jpcJ-0001Xp-00; Fri, 14 Feb 2003 15:51:39 -0800 Date: Fri, 14 Feb 2003 23:51:39 +0000 From: "Adam M. Costello" To: ietf-imaa@imc.org Cc: Roy Badami Subject: Re: Case sensitivity on the LHS Message-ID: <20030214235139.GC4500@nicemice.net> Reply-To: IETF IMAA list , Roy Badami References: <1045255923.23555.TMDA@moriarty.gnomon.org.uk> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1045255923.23555.TMDA@moriarty.gnomon.org.uk> User-Agent: Mutt/1.4i Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Roy Badami wrote: > So does IDNA permit mixed-case annotation? I was under the impression > that at one point the draft forbade the use of mixed-case annotation > with IDNA, but I can't find that prohibition in the current documents. IDNA makes no mention of mixed-case annotation or case preservation. > How exactly does mixed-case annotation work in IDNA? Officially, it doesn't. It was too contentious to be mentioned. The prevailing opinion was that IDNA is complex enough already, and shouldn't saddle implementors with nonessential complications. However, IDNA is architected so that mixed-case annotations could be added in the future in a backward-compatible way. The ToASCII operation already has almost enough flexibility for this, because the case of the letters used to encode the non-ASCII characters is unconstrained.
The ASCII letters, however, are always lowercased by the Nameprep step, and Punycode simply copies them, so a strict implementation of ToASCII cannot preserve the case of ASCII letters for inputs containing some non-ASCII characters. And a strict implementation of ToUnicode has no flexibility at all; the exact Unicode output is completely determined regardless of any mixed-case in the ACE input. But ToASCII and ToUnicode don't really need to be that strict to ensure interoperability. As long as they output something equivalent to what the official versions output, that's good enough. A future update of IDNA could therefore relax the specification enough to allow for mixed-case annotations. Even if such an update never comes, gutsy implementors could go ahead and do it, and still interoperate with the strictly conformant implementations and with each other. That last sentence is probably heresy. :) > You have to somehow pass case information through the nameprep > process; is the precise algorithm actually spelled out anywhere? Nope. Since it's not officially supported, working out all the details wasn't a priority, and hasn't yet been done. AMC P.S. Roy, have you subscribed to the list yet? You appear to be participating as actively as anyone. From owner-ietf-imaa Fri Feb 14 16:15:37 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1F0FbU26803 for ietf-imaa-bks; Fri, 14 Feb 2003 16:15:37 -0800 (PST) Received: from nicemice.net (arwen.CS.Berkeley.EDU [128.32.132.165]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1F0FZd26799 for ; Fri, 14 Feb 2003 16:15:35 -0800 (PST) Received: from amc by nicemice.net with local (Exim 3.35 #1 (Debian)) id 18jpzX-0001dF-00 for ; Fri, 14 Feb 2003 16:15:39 -0800 Date: Sat, 15 Feb 2003 00:15:39 +0000 From: "Adam M. Costello" To: IETF IMAA list Subject: Re: A couple of comments on the open issues... 
Message-ID: <20030215001538.GE4500@nicemice.net> Reply-To: IETF IMAA list References: <1045135861.15010.TMDA@moriarty.gnomon.org.uk> <1045135666.14953.TMDA@moriarty.gnomon.org.uk> <1045135861.15010.TMDA@moriarty.gnomon.org.uk> <4.2.0.58.J.20030214110907.05bf4398@localhost> <1045254880.23456.TMDA@moriarty.gnomon.org.uk> <1045135861.15010.TMDA@moriarty.gnomon.org.uk> <1045135666.14953.TMDA@moriarty.gnomon.org.uk> <1045135861.15010.TMDA@moriarty.gnomon.org.uk> <1045135666.14953.TMDA@moriarty.gnomon.org.uk> <4.2.0.58.J.20030214110907.05bf4398@localhost> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1045254880.23456.TMDA@moriarty.gnomon.org.uk> <4.2.0.58.J.20030214110907.05bf4398@localhost> User-Agent: Mutt/1.4i Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Martin Duerst wrote: > Is subaddressing just something that is done by a few email systems > locally, or is it something specified in some of the email standards? As far as I know, there are no standards for any structured local parts. It's all unofficial conventions. > to what extent do we really have to consider it here? I'd say we're under no obligation to play nicely with these unofficial practices, but our goal is to create utility, not frustration, so we shouldn't immediately dismiss the concern. > Also, using subaddressing seems to be quite popular for high-end > users. But is it actually very much used by the bulk of users (the > proverbial hotmail/yahoo/... crowd)? Currently, no. As spam becomes ever more of a problem, more users might start using subaddressing as a means of combating it (like I do). > It's also easy to imagine that users have a single internationalized > address (for personal mail) and a bunch of structured addresses (for > list subscriptions,...). That's a good point. But then there are freaks like me who use subaddressing for all mail, personal or not, as a means of eliminating spam.
Every address is expendable and can be disabled the first time it receives spam; there is no single stable mail address for me (but there is a stable template, and a stable URL). > A third thing I have just thought about is that the design of punycode > actually has some very nice properties that allow separation of the > subnet address in most cases even if the whole LHS is encoded in one > go. That's kind of cool, but as Roy says, it still doesn't let people use IMAs containing delimiters in the local part. If the mail server for the domain is configured to support user+tag syntax, but the ACE form of USER+tag looks like xn--+tag-blahblah, then it won't work at all, because the mail server will try to deliver to xn--. So users would not get the full IMAA functionality until the infrastructure (the MTA) has been upgraded, which is an undesirable dependence (one that IDNA does not suffer). The counter-argument is that most users won't care about the missing functionality. AMC From owner-ietf-imaa Fri Feb 14 16:33:45 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1F0XjC27112 for ietf-imaa-bks; Fri, 14 Feb 2003 16:33:45 -0800 (PST) Received: from nicemice.net (arwen.CS.Berkeley.EDU [128.32.132.165]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1F0Xid27108 for ; Fri, 14 Feb 2003 16:33:44 -0800 (PST) Received: from amc by nicemice.net with local (Exim 3.35 #1 (Debian)) id 18jqH5-0001gJ-00 for ; Fri, 14 Feb 2003 16:33:47 -0800 Date: Sat, 15 Feb 2003 00:33:47 +0000 From: "Adam M. 
Costello" To: ietf-imaa@imc.org Subject: Re: Another issue: quoting Message-ID: <20030215003347.GF4500@nicemice.net> Reply-To: IETF IMAA list References: <1045183773.18276.TMDA@moriarty.gnomon.org.uk> <8fpJ5tRZcDD@3247.org> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <8fpJ5tRZcDD@3247.org> User-Agent: Mutt/1.4i Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Claus Färber wrote: > To be quirks-compatible with software that does make a distinction > between different forms of the same local-part, I'd recommend the > following: > > . Save the original form. > . Fully dequote/remove comments/extra whitespace before toUnicode > and/or toASCII. > . If the output of toUnicode/toASCII is identical to the input, use the > original form in step #1, otherwise use the output generated by the > function. An interesting idea, but you seem to be assuming that the destination of the local part uses the same quoting mechanism as the origin of the local part. When these transformations are applied, it's typically because a local part obtained from a user interface is being transferred into a message header or SMTP command, or a local part obtained from a message header is being transferred onto a display. The quotation mechanisms for local parts in user interfaces (or app-specific config files, etc) are not standardized, and may or may not match the quotation mechanisms used in message headers and SMTP commands.
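The three quoted steps can be sketched in Python as follows. This is only an illustration under stated assumptions: `strip_quotes` and `fake_to_ascii` are hypothetical stand-ins for a real dequoting routine and for the draft's toASCII, the "ace--" prefix is a placeholder rather than any prefix the draft defines, and the sketch does not address the objection above that the destination may need different quoting.

```python
from typing import Callable


def convert_preserving_form(original: str,
                            dequote: Callable[[str], str],
                            transform: Callable[[str], str]) -> str:
    """Keep the original form when the transformation changes nothing."""
    bare = dequote(original)     # fully dequoted local part
    converted = transform(bare)  # stand-in for toASCII or toUnicode
    # Unchanged output: return the saved original, quirks and all.
    return original if converted == bare else converted


def strip_quotes(local: str) -> str:
    """Hypothetical dequoting: removes one pair of surrounding quotes."""
    if len(local) >= 2 and local[0] == '"' and local[-1] == '"':
        return local[1:-1]
    return local


def fake_to_ascii(local: str) -> str:
    """Hypothetical toASCII: ASCII passes through, the rest is ACE-like."""
    if local.isascii():
        return local
    return "ace--" + local.encode("punycode").decode("ascii")
```

With these stand-ins, a quoted all-ASCII local part comes back exactly as it was entered, while a non-ASCII one comes back in converted form with its original quoting discarded.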
AMC From owner-ietf-imaa Fri Feb 14 16:58:34 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1F0wYS27982 for ietf-imaa-bks; Fri, 14 Feb 2003 16:58:34 -0800 (PST) Received: from nicemice.net (arwen.CS.Berkeley.EDU [128.32.132.165]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1F0wXd27978 for ; Fri, 14 Feb 2003 16:58:33 -0800 (PST) Received: from amc by nicemice.net with local (Exim 3.35 #1 (Debian)) id 18jqf6-0001kD-00; Fri, 14 Feb 2003 16:58:36 -0800 Date: Sat, 15 Feb 2003 00:58:36 +0000 From: "Adam M. Costello" To: ietf-imaa@imc.org Cc: Roy Badami Subject: Re: Question: UseSTD3ASCIIRules on RHS of IMA Message-ID: <20030215005836.GG4500@nicemice.net> Reply-To: IETF IMAA list , Roy Badami References: <1045222662.21247.TMDA@moriarty.gnomon.org.uk> <1045238119.22308.TMDA@moriarty.gnomon.org.uk> <1045222662.21247.TMDA@moriarty.gnomon.org.uk> <1045233655.21973.TMDA@moriarty.gnomon.org.uk> <1045222662.21247.TMDA@moriarty.gnomon.org.uk> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1045256829.23623.TMDA@moriarty.gnomon.org.uk> <1045222662.21247.TMDA@moriarty.gnomon.org.uk> User-Agent: Mutt/1.4i Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Roy Badami wrote: > Should IMAA make any statement as to whether UseSTD3ASCIIRules MUST > or SHOULD be set when processing any IDN on the RHS of an IMA, or > should this decision be left up to the implementor? Simon Josefsson wrote: > All application must already know the answer to this problem, as the > same problem existed before IDNA. If the RFCs are unclear, that > should be fixed independently of IMAA. > > I'm not sure I see any need to enforce anything with regard to this > more than already is, and in particular not how IMAA would be the > right place for it. I agree with Simon.
IMAA explicitly points to IDNA for handling the domain part of the mail address, and IDNA already says as much as we dare on the subject. Here's a taste of why we don't want to stick our necks out: RFC-821 forbids 3foo.org. RFC-2821 allows 3foo.org, but forbids foo+.org. RFC-822 and RFC-2822 allow foo+.org. It's a mess, and we don't want to make it our job to try to make sense of it. We know from experience with IDNA that any attempt at clarification will be too controversial to reach consensus. Roy Badami wrote: > Further minor wrinkle: if the application chooses not to enforce > hostname rules (ie not to set UseSTD3ASCIIRules), it still needs to > enforce those restrictions on domain names that occur as a result of > the 822/2822 syntax (ie a label can't contain specials). If the mail address is being put into a message header, then it ought to enforce those restrictions, yes. If the mail address is being put into an SMTP command, then it's the 821/2821 syntax that ought to be enforced. The destination might be neither of those contexts, but something else with its own formal syntax rules. > Does IMAA need to say anything about the application of the 822/2822 > rules (in the same way that IDNA talks about the application of the > STD3 rules)? It might be good to generalize "quote the local part if necessary" to something like "handle any context-dependent syntax rules", and give the same example about quotation sometimes being needed in message headers and SMTP commands, and maybe give another example about character restrictions in those contexts. 
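To make "quote the local part if necessary" concrete for one context, here is a minimal sketch for the message-header case, assuming the RFC 2822 dot-atom and quoted-string rules and an ASCII local part without control characters. The function name `quote_local_part_for_header` is made up for this illustration; an SMTP command or a user interface would need its own variant, which is the point of the generalization suggested above.

```python
import re

# RFC 2822 "atext": the characters allowed in an unquoted atom.
ATEXT = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]"
# dot-atom-text: atoms separated by single dots, no leading or trailing dot.
DOT_ATOM = re.compile(rf"{ATEXT}+(?:\.{ATEXT}+)*")


def quote_local_part_for_header(local_part: str) -> str:
    """Return the local part unchanged if it is a dot-atom, else quoted."""
    if DOT_ATOM.fullmatch(local_part):
        return local_part
    # Inside a quoted-string, backslash and double quote must be escaped.
    escaped = local_part.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'
```

With this sketch `john.klensin` passes through untouched, while `A B` or `a..b` come back wrapped in double quotes.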
AMC From owner-ietf-imaa Fri Feb 14 17:00:05 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1F105O28041 for ietf-imaa-bks; Fri, 14 Feb 2003 17:00:05 -0800 (PST) Received: from bs.jck.com (ns.jck.com [209.187.148.211]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1F103d28035 for ; Fri, 14 Feb 2003 17:00:03 -0800 (PST) Received: from [209.187.148.215] (helo=p3.JCK.COM) by bs.jck.com with esmtp (Exim 4.10) id 18jqgT-000FpW-00 for ietf-imaa@imc.org; Fri, 14 Feb 2003 20:00:01 -0500 Date: Fri, 14 Feb 2003 20:00:00 -0500 From: John C Klensin To: IETF IMAA list Subject: Can we back up a bit and ask some basic questions? An alternate model Message-ID: <18245836.1045252800@p3.JCK.COM> X-Mailer: Mulberry/3.0.0 (Win32) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: ( Very long -- contains both a high-level critique and the basis for an alternate proposal that preserves local-part opacity and skips ACE forms and goes directly to UTF-8 or another common Unicode encoding. ) Hi. I've been trying to follow the traffic about this proposal, and am frankly overwhelmed. I hope I'm not addressing anything that has already been discussed..., but I haven't seen such traffic. (1) A context for email internationalization and a critique of IMAA's starting assumptions... IMAA starts from the assumption that the right way to handle domain names is, with slight modifications, the right way to handle email local-parts. The debate, with few exceptions, has been about details and options given that choice. I'd like to move up a half-level and suggest that choice may be the wrong one, and that there are other options that will serve us, and Internet email generally, better. 
A different way to put my concern is that I wonder whether, with the IDNA-hammer in hand, email is just the nearest handy thing that can be construed as a nail. Several of the things that imposed constraints on the DNS solution are different for the local-part (LHS) of email. That may point to a different solution. The two important examples that occur to me include: (i) Single-turnaround UDP transactions versus multistep, TCP-based, SMTP. Because of the use of UDP, and some other things, there is no possibility for the DNS client and the server to interact about capabilities, form of names, etc. While multi-hop (relay) transmissions of mail complicates things, we do precisely those types of negotiations in SMTP all the time. In some respects, almost every mail command is such a negotiation. But, more important, we've got well-established (and, by now, widely deployed) precedents that permit clients to send extended materials only to servers that are prepared to deal with them. That is a feature, not a bug. It is not something we can do with the DNS, which was one of the major reasons why IDNA was needed. (ii) An established DNS syntax versus an opaque local-part For the DNS, the syntax of names used by client and server is fairly well specified. While, in theory, a client could somehow use, e.g., slashes or equal signs to separate labels in presentation, but parse things appropriately for DNS queries and responses, in practice it just doesn't happen. And someone might be able to read the specifications to prohibit such variations in a lot of cases. With email, the idea of a completely opaque local-part -- one that can be internally parsed, decomposed, and interpreted only by the delivery MTA -- is required by existing standards. More important, we have taken huge advantage of it over the years to do all sorts of interesting things. 
Many of the ones that were very important some years ago have fallen into disuse, but it would be, IMO, very unfortunate, and very dangerous, to discard the capability. Of course, if doing so were the only way to get internationalization, that would be worth considering anyway. But it isn't, as a sketch of a counterproposal below will, I hope, demonstrate. There has been a good deal of discussion of subaddresses on this list, but they are certainly not the only case, or even the most important one. We have used trickily-written local-parts to embed routing information (with the %-hack and bang-strings), to express X.400 addresses (MIXER and the UFN work), to implement Dan Bernstein's idea that binds return-paths to target addresses (the idea is interesting and useful, regardless of how the spammers use it), to implement portions of a variety of services whose commands are transported through email, and so on, for a long list of brilliant ideas, cheap hacks, and most points in between. The variety of those techniques is such that, unlike the DNS, we can't reasonably say "no sane person would construct a name such as XN--abcDeF" and then look at enough zone files to increase confidence that no one has. With email, we've had people working for years to find strange combinations and sequences of characters that could safely be used to delimit important (to them) information in different contexts. It would be impossible to claim that every combination has been used, but a plan that requires uniform interpretation of the local-part is in a lot more danger of wrecking something that some group or enterprise cares deeply about than with IDNA (or, more generally, prefixed-ACE) strings in the DNS. Even were "no human would construct such a name" true, it wouldn't help. Some of these systems use a hash of the origin address, or a variant on the message ID, or part or all of the message itself. 
Incorporating hashed or encoded identifying information in addresses goes back at least to some of the proto-groupware projects of the late 70s and early 80s. And the most aggressive of the "user name doesn't go in the email address because it discloses information" crew have been using random or encrypted strings instead. For years. Remember, too, that case-insensitivity for local names isn't something the protocol calls for. It is a suggestion for those mail servers who support email addresses that contain people's names or other obvious "name" or "word"-type strings. But the MTAs that support non-case-sensitive names have, historically, done it in a number of ways. I've seen systems in which local-parts such as John.C.Klensin John.Klensin john.klensin klensin J.Klensin J.C.Klensin and so forth, all match --and do so by algorithm, not alias files-- but where johN.klenSin is going to bounce. Maybe that is reasonable, maybe it is perverse, but it is certainly conforming. There are at least a couple of systems and configurations in which a leading underscore on the name implies something like "use it, but don't pass it through aliasing or translation". And a leading tilde means something else -- Ned would remember, but I've blocked it out at the moment. There are many more of these examples. There are enough of them that one just can't give up opacity -- or shouldn't want to-- unless there is no alternative that still gives us internationalized names. The question is not "how character X should be handled", but "can we get away with interpreting the local part in the sending MUA or MTA". And the answer, I suspect, is "no". To put this differently, a prime condition for IDNA was that it not require changes in the protocols or assumptions of the DNS. A stated condition for IMAA is that it not require changes to the mail infrastructure. 
But weakening or crippling the "opaque local part" rule does as much, or more, violence to the email infrastructure than, e.g., just putting UTF-8 in the addresses. I'll come back to that. Again, if this were the only way to do internationalization, it would be worth considering the risk of breaking things. But it isn't. Let me outline an idea, just to prove that one exists (if it gets any traction, I'd be happy to try to write, or collaborate on, an I-D). (2) A different set of assumptions What we have usually said about email, especially at the mail transport (SMTP) level, is that, if the sender wants to do something that is (historically) unusual, it must get the permission of the receiver first. That permission originally came out of band -- as a private agreement among consenting adults-- and then we introduced ESMTP to provide an in-band permission mechanism that could actually scale. If, as the administrator of an MTA, I want to accept "XN--AbCdEf" as a local-part for one of my mailboxes, I can do that today -- all I need to do is to put such an address in my alias files. And, if my MUA is configured to see it and turn it into a displayable Unicode string, that is my problem or feature and not anyone else's. The problem arises if the sender wants to type in a Unicode (or local CCS) string in the hope of getting the right one to me. And that is a mess, with a large selection of loose ends and tradeoffs, as 250 or so messages in less than five days attests. But we don't need to do it. With one change --and a different way of looking at the problem-- International local parts do not need to be handled differently from ASCII ones. That is how, IMO, it should be. If it is not, we either break existing mail conventions, or put non-ASCII users at a permanent disadvantage relative to ASCII ones. Those two options, to use a technical term, stink. So, instead, let's assume * We want to simply move to Unicode strings in local-parts. 
* There isn't really a lot of point in sending off strangely-encoded addresses to systems that can't handle them. Gibberish is gibberish, and users don't much like gibberish. If you are going to send me a message in English, and assume I can read English and that you might want a reply, you are better off using ASCII addressing. If you are going to send me a message in Klingon... well, I can't render it or read it, and whether you expect me to have an address in Klingon, or want to put Klingon characters in your return address (reverse-path or From: field), is going to be the least of our problems. Indeed, I would _prefer_ that it get dumped by some MTA because, if it doesn't, I want my little antispambot to toss it in the same bin with From: "=?x-unknown?Q?=C1=F6=BE=D0=BB=F7=B4=DE?=" Subject: =?x-unknown?Q?[=B1=A4=B0=ED]=C7=D1=B9=E6 [...] (chosen at random from the most recent bin-addition) * We don't want to destroy the "opaque local part" principle -- even with Unicode local-parts, the sender needs to either know the exact form of the mailbox name or the alias and conversion rules adopted by the recipient MTA. Guessing what those rules are will work sometimes, just as they do in today's ASCII environment. And sometimes it won't... ditto. * Unlike the MIME situation, where we were trying to be sure that users could get multilingual and multimedia messages even if their system mail administrators were slow about upgrading, no one is going to get a mailbox that can be reached with a non-ASCII string unless some system administrative process happens. So the upgrade question is rather different, unless people are really worried about ASCII mailbox names (forward paths) but i18n addresses in [2]822-level "From:" and "To:" fields. I don't think that is a major issue, but, if it is, we should probably be looking at RFC2047-like updates. (3) Strawman semi-proposal This is not a proposal, because there are still loose ends (some identified below). 
But it should be sufficient to act as an existence proof that there is a plausible alternative to IMAA, one that meets the assumptions/ conditions above. (a) We define a new SMTP extension. For purposes of discussion, let's call it UTF8ADDRESSES. Loose end a.1: This probably won't work unless 8BITMIME also works. That may need to be specified. Conversely, perhaps we could extend 8BITMIME so that this became just a parameter somewhere. I haven't thought that through and it doesn't make a lot of difference right now. Loose end a.2: I'm assuming that UTF-8 is the right choice here. It may not be, although we should really pick one, and only one, encoding. But there is no reason that I can see to force an ACE; 8-bit characters should be fine, modulo the limitation in (b.1). If UTF-8 isn't the right answer, make appropriate substitutions elsewhere in this note. Loose end a.3: Because of the opacity requirement, I believe that even IMAA would ultimately require an ESMTP extension and negotiation to work properly. So, in some ways this proposal and IMAA are compatible and complementary if an IMAA ACE string is used rather than UTF-8. I think that would be overkill unless we really like that hammer. (b) If a server advertises UTF8ADDRESSES, the local-part definition is changed so that the characters in local parts are construed as being Unicode in UTF-8 (or whatever is chosen), but are otherwise left essentially unchanged. E.g., the special rules about @ (ASCII 0x40), " (ASCII 0x22), \ (ASCII 0x5C), etc., simply get promoted to U+0040, U+0022, and so on. Loose end b.1: I think UTF-8 helps here, because any occurrence of ASCII characters gets represented as the relevant single octets. So the string is fairly easy to parse. Other codings would need to ensure that there was no confusion with anything 2821 thinks is a delimiter. Loose end b.2: some tidying will have to be done to the 2822[bis] text to make this work, but I don't think it is rocket science. 
The 2821[bis] tuning is trivial, since the work would be done in the extension document. (c) The parsing rules don't change. Everything to the left of [unquoted] "@" is the local-part, everything to its right is a domain name. Local-parts are opaque and interpreted only by the delivery MTA, modulo the quoting rules. And they normally don't get interpreted by anything else either. Loose end c.1: Life would be a good deal easier if any sender taking advantage of this feature were flatly prohibited from using source routes. It wouldn't harm anything that I can imagine, and it would make it a lot easier to safely decompose the string. Loose end c.2: I can't see any particular reason why the domain name in this arrangement would be forced to be transmitted in punycode (as an ACE). UTF-8 would probably work as well. Or one might be able to write the spec to permit either punycode or UTF-8 (or whatever) on the RHS. Of course, this doesn't change the requirement for nameprep. Loose end c.3: The opacity principle probably prevents rules about folding, special provisions for full width characters, etc. If those are used, and the delivery MTA doesn't match them up as intended (whatever that means), the mail would be undeliverable (just like today with ASCII). Some mappings/foldings would be wise for receiving MTAs to support, just as case-insensitive ASCII handling is wise, and others would be stupid. We should give advice, but opacity prevents requirements. Nor, IMO, are requirements needed. (d) An originating or relay MTA that received a forward-path address containing non-ASCII characters, but that discovered the next MTA in sequence didn't advertise UTF8ADDRESSES would more or less follow the rules for 8BITMIME relaying. I.e., it would either have to find a valid address (presumably ASCII) it can forward for delivery, or find a routing path that would permit sending the UTF-8 addresses, or bounce the mail because it has no clue how to process it. 
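The relay behaviour described in (d) can be sketched as a small decision function. This is only an illustration under stated assumptions: UTF8ADDRESSES is the placeholder keyword from this strawman and not a registered extension, `relay_decision` and its arguments are hypothetical, EHLO keywords are assumed to be already upper-cased, and the order in which the options are tried (same address over any capable route first, then an ASCII alternative, then bounce) is a choice made for the sketch, since the text does not rank them.

```python
from typing import Dict, Optional, Set, Tuple

EXTENSION = "UTF8ADDRESSES"  # placeholder keyword from the strawman above


def relay_decision(forward_path: str,
                   routes: Dict[str, Set[str]],
                   ascii_fallback: Optional[str] = None
                   ) -> Tuple[str, Optional[str], Optional[str]]:
    """Return (action, next_hop, address) for one forward-path.

    `routes` maps each candidate next hop, in preference order, to the
    set of EHLO keywords it advertised.
    """
    preferred = next(iter(routes), None)
    if preferred is None:
        return ("bounce", None, None)
    # Pure-ASCII addresses need no negotiation at all.
    if forward_path.isascii():
        return ("send", preferred, forward_path)
    # Otherwise look for any route whose next hop accepts UTF-8 addresses.
    for hop, keywords in routes.items():
        if EXTENSION in keywords:
            return ("send", hop, forward_path)
    # No capable route: use a known ASCII address if one was supplied.
    if ascii_fallback is not None and ascii_fallback.isascii():
        return ("send", preferred, ascii_fallback)
    # Nothing works, so report the failure rather than risk a silent loss.
    return ("bounce", None, None)
```

How the relay would learn of an ASCII alternative or of other routes is deliberately left to the caller here.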
Loose end d.1: The way such an MTA gets the information needed for those first two options is outside the scope of the standard. That isn't much different from the situation today when a relay MTA gets an address it can't figure out how to parse because some rule or other is violated. And it is better for something that can explain what is happening to bounce the mail than to deliver it to something that might blow up on the 8-bit characters and not return any non-delivery information. Accepting the message and then blowing up of course violates the standard, but that fact and a dollar will get you... (e) The issue of what the delivery MTA actually puts in the mail store, and how the receiving MUA(s) handle that, has never been the subject of Internet protocols and that should probably not change. But, again, we can give advice. It seems to me that a great deal of this week's discussion could usefully be turned into that advice. E.g., "Dear sysadmin, if you configure your MTA so that you have mailboxes whose names contain characters that look like quotes or the at-sign, you are inviting big trouble". It also seems to me that IMAA might well turn out to be (or be easily transformed into) a good delivery MTA -> mailstore or delivery MTA-> final MUA protocol for MUAs that have not been updated. But, again, these are basically local-machine and user interface issues, which are traditionally outside IETF scope. (f) There are some other issues here, but that is the general picture. For example, we would have to _very_ carefully work out what went into reply messages, given that those might hit a host that wasn't prepared to have i18n characters in them. But that is a problem to be worked out, not a showstopper. Comments? 
john From owner-ietf-imaa Fri Feb 14 17:18:26 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1F1IQr28408 for ietf-imaa-bks; Fri, 14 Feb 2003 17:18:26 -0800 (PST) Received: from moriarty.gnomon.org.uk (pc4-cmbg2-5-cust162.cmbg.cable.ntl.com [81.100.86.162]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1F1IOd28404 for ; Fri, 14 Feb 2003 17:18:24 -0800 (PST) Received: from moriarty.gnomon.org.uk (roy@localhost [127.0.0.1]) by moriarty.gnomon.org.uk (8.12.3/8.12.3/Debian -4) with ESMTP id h1F1InKA024695 for ; Sat, 15 Feb 2003 01:18:49 GMT To: duerst@w3.org CC: ietf-imaa@imc.org In-reply-to: <4.2.0.58.J.20030214153013.059a8818@localhost> (message from Martin Duerst on Fri, 14 Feb 2003 15:38:18 -0500) Subject: Re: Question: Fullwidth double-quote and fullwidth backslash References: <4.2.0.58.J.20030214105633.05c158e0@localhost> <4.2.0.58.J.20030214153013.059a8818@localhost> Date: Sat, 15 Feb 2003 01:18:48 +0000 Message-ID: <1045271928.24683.TMDA@moriarty.gnomon.org.uk> From: Roy Badami X-Delivery-Agent: TMDA/0.65 (Johnstown) X-Primary-Address: roy@gnomon.org.uk Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: But are we really required, or do we see it as our goal, to help people avoid some potential typing mistakes in addresses that are, by their length and complexity, not at all user-friendly in the first place? If we're going to support quoting in IMAs then I think my original question is still a valid one. We know that Japanese users (at least those who are not intimately familiar with character set issues) often consider full width and half width characters as equivalent and interchangeable. For this reason, the IDN group chose to accept full width dot as equivalent to half width dot. The IMAA base document suggests doing the same for at-sign, presumably for the same reason. 
*If* we are going to allow quoting in IMAs (that aren't plain 822 addresses) then it is a reasonable question to pose to the group as to whether the same approach should be taken with the relevant metacharacters. If we're going to construct a syntax for IMAs that involves double quotes and backslash, then I think making an effort to ensure that these are interpreted correctly by the software is sensible. So I'm not sure I really understand your objection... -roy From owner-ietf-imaa Fri Feb 14 17:12:24 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1F1CO028327 for ietf-imaa-bks; Fri, 14 Feb 2003 17:12:24 -0800 (PST) Received: from bs.jck.com (ns.jck.com [209.187.148.211]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1F1CNd28323 for ; Fri, 14 Feb 2003 17:12:23 -0800 (PST) Received: from [209.187.148.215] (helo=p3.JCK.COM) by bs.jck.com with esmtp (Exim 4.10) id 18jqsU-000Frq-00 for ietf-imaa@imc.org; Fri, 14 Feb 2003 20:12:26 -0500 Date: Fri, 14 Feb 2003 20:12:26 -0500 From: John C Klensin To: IETF IMAA list Subject: Re: A couple of comments on the open issues... 
Message-ID: <18991408.1045253546@p3.JCK.COM> In-Reply-To: <20030215001538.GE4500@nicemice.net> References: <1045135861.15010.TMDA@moriarty.gnomon.org.uk> <1045135666.14953.TMDA@moriarty.gnomon.org.uk> <1045135861.15010.TMDA@moriarty.gnomon.org.uk> <4.2.0.58.J.20030214110907.05bf4398@localhost> <1045254880.23456.TMDA@moriarty.gnomon.org.uk> <1045135861.15010.TMDA@moriarty.gnomon.org.uk> <1045135666.14953.TMDA@moriarty.gnomon.org.uk> <1045135861.15010.TMDA@moriarty.gnomon.org.uk> <1045135666.14953.TMDA@moriarty.gnomon.org.uk> <4.2.0.58.J.20030214110907.05bf4398@localhost> <20030215001538.GE4500@nicemice.net> X-Mailer: Mulberry/3.0.0 (Win32) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: --On Saturday, 15 February, 2003 00:15 +0000 "Adam M. Costello" wrote: > > Martin Duerst wrote: > >> Is subaddressing just something that is done by a few email >> systems locally, or is it something specified in some of the >> email standards? > > As far as I know, there are no standards for any structured > local parts. It's all unofficial conventions. Adam and Martin, Let me see if I can shed a little light on this, independent of the tome I just mailed (which addresses this, and several other, issues in the context of questioning whether IMAA is a reasonable approach and proposing an alternative). There are no standards at all for interpreting the local part. The standard is "no one but the delivery MTA can try to interpret the local part in any way". >> to what extent do we really have to consider it here? > > I'd say we're under no obligation to play nicely with these > unofficial practices, but our goal is to create utility, not > frustration, so we shouldn't immediately dismiss the concern. 
If your goal is not to break the existing standards, then you are obligated to not assign _any_ special interpretation to anything that appears in the local-part (unless your document applies strictly to the interface between the delivery MTA and the network). That causes these "unofficial practices" --which are standard-conforming-- to take care of themselves. >> Also, using subaddressing seems to be quite popular for >> high-end users. But is it actually very much used by the >> bulk of users (the proverbial hotmail/yahoo/... crowd)? Wrong question, I think. There are all sorts of things that do things when email arrives. Many of them are not "users", but robots and agents. Some handle a _lot_ of messages. And many of them use funny characters and encodings in the forward and/or reverse paths as well as message header lines like "subject:". One really doesn't want to break them. >... > So users would not get the full IMAA functionality until the > infrastructure (the MTA) has been upgraded, which is an > undesirable dependence (one that IDNA does not suffer). The > counter-argument is that most users won't care about the > missing functionality. Yep. But you may really screw the users/ robots/ agents/ etc. who do care. Getting internationalization at the cost of reduced functionality for anyone or anything that is now doing something that conforms should be considered only if it is the only way. It isn't. 
regards, john From owner-ietf-imaa Fri Feb 14 17:39:15 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1F1dFx28852 for ietf-imaa-bks; Fri, 14 Feb 2003 17:39:15 -0800 (PST) Received: from moriarty.gnomon.org.uk (pc4-cmbg2-5-cust162.cmbg.cable.ntl.com [81.100.86.162]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1F1dDd28846 for ; Fri, 14 Feb 2003 17:39:13 -0800 (PST) Received: from moriarty.gnomon.org.uk (roy@localhost [127.0.0.1]) by moriarty.gnomon.org.uk (8.12.3/8.12.3/Debian -4) with ESMTP id h1F1dcKA024793 for ; Sat, 15 Feb 2003 01:39:39 GMT To: ietf-imaa@imc.org, roy@gnomon.org.uk CC: ietf-imaa@imc.org In-reply-to: <20030215005836.GG4500@nicemice.net> (ietf-imaa.amc+0@nicemice.net.RemoveThisWord) Subject: Re: Question: UseSTD3ASCIIRules on RHS of IMA References: <1045222662.21247.TMDA@moriarty.gnomon.org.uk> <1045238119.22308.TMDA@moriarty.gnomon.org.uk> <1045222662.21247.TMDA@moriarty.gnomon.org.uk> <1045233655.21973.TMDA@moriarty.gnomon.org.uk> <20030215005836.GG4500@nicemice.net> Date: Sat, 15 Feb 2003 01:39:37 +0000 Message-ID: <1045273177.24777.TMDA@moriarty.gnomon.org.uk> From: Roy Badami X-Delivery-Agent: TMDA/0.65 (Johnstown) X-Primary-Address: roy@gnomon.org.uk Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: RFC-821 forbids 3foo.org. If I'd ever spotted that in the past, I'd completely forgotten it. Quick, someone tell 3com that no-one's allowed to send mail to them... :) I propose that we all give up and use X.400 instead... 
-roy From owner-ietf-imaa Fri Feb 14 17:49:16 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1F1nGD29228 for ietf-imaa-bks; Fri, 14 Feb 2003 17:49:16 -0800 (PST) Received: from bs.jck.com (ns.jck.com [209.187.148.211]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1F1nFd29223 for ; Fri, 14 Feb 2003 17:49:15 -0800 (PST) Received: from [209.187.148.215] (helo=p3.JCK.COM) by bs.jck.com with esmtp (Exim 4.10) id 18jrS5-000Fvu-00; Fri, 14 Feb 2003 20:49:13 -0500 Date: Fri, 14 Feb 2003 20:49:13 -0500 From: John C Klensin To: Roy Badami , "ietf-imaa@imc.org" Subject: Re: Question: UseSTD3ASCIIRules on RHS of IMA Message-ID: <21197961.1045255753@p3.JCK.COM> In-Reply-To: <1045273177.24777.TMDA@moriarty.gnomon.org.uk> References: <1045222662.21247.TMDA@moriarty.gnomon.org.uk> <1045238119.22308.TMDA@moriarty.gnomon.org.uk> <1045222662.21247.TMDA@moriarty.gnomon.org.uk> <1045233655.21973.TMDA@moriarty.gnomon.org.uk> <20030215005836.GG4500@nicemice.net> <1045273177.24777.TMDA@moriarty.gnomon.org.uk> X-Mailer: Mulberry/3.0.0 (Win32) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: --On Saturday, 15 February, 2003 01:39 +0000 Roy Badami wrote: > RFC-821 forbids 3foo.org. > > If I'd ever spotted that in the past, I'd completely forgotten > it. Quick, someone tell 3com that no-one's allowed to send > mail to them... :) The rule was changed in RFC 1123, which is what permitted 3COM to make that registration. > I propose that we all give up and use X.400 instead... Great idea. 
IA4 in addresses, no i18n problems :-) john From owner-ietf-imaa Fri Feb 14 18:53:27 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1F2rRr01202 for ietf-imaa-bks; Fri, 14 Feb 2003 18:53:27 -0800 (PST) Received: from stoneport.math.uic.edu (stoneport.math.uic.edu [131.193.178.160]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1F2rQd01198 for ; Fri, 14 Feb 2003 18:53:27 -0800 (PST) Received: (qmail 76882 invoked by uid 1016); 15 Feb 2003 02:53:56 -0000 Date: 15 Feb 2003 02:53:56 -0000 Message-ID: <20030215025356.76880.qmail@cr.yp.to> Automatic-Legal-Notices: See http://cr.yp.to/mailcopyright.html. From: "D. J. Bernstein" To: ietf-imaa@imc.org Subject: Re: A couple of comments on the open issues... References: <1045135861.15010.TMDA@moriarty.gnomon.org.uk> <4.2.0.58.J.20030214110907.05bf4398@localhost> <1045254880.23456.TMDA@moriarty.gnomon.org.uk> <1045135861.15010.TMDA@moriarty.gnomon.org.uk> <1045135666.14953.TMDA@moriarty.gnomon.org.uk> <1045135861.15010.TMDA@moriarty.gnomon.org.uk> <1045135666.14953.TMDA@moriarty.gnomon.org.uk> <4.2.0.58.J.20030214110907.05bf4398@localhost> <20030215001538.GE4500@nicemice.net> <18991408.1045253546@p3.JCK.COM> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: John C Klensin writes: > There are no standards at all for interpreting the local part. False. RFC 822 clearly states that the local part consists of ASCII characters. The ASCII standard specifies interpretations of bytes 33-126 as glyphs. (If you're under the delusion that this interpretation is not actually standardized, or that users don't rely on it, please switch your MUA to EBCDIC header I/O for a month and let us know the results.) The question of which mailbox names are valid on a system, and how those mailboxes are handled, is (almost) entirely up to the system. 
But we have a global interpretation of ASCII mailbox names as glyphs. ---D. J. Bernstein, Associate Professor, Department of Mathematics, Statistics, and Computer Science, University of Illinois at Chicago From owner-ietf-imaa Fri Feb 14 19:02:08 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1F328s01430 for ietf-imaa-bks; Fri, 14 Feb 2003 19:02:08 -0800 (PST) Received: from stoneport.math.uic.edu (stoneport.math.uic.edu [131.193.178.160]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1F328d01426 for ; Fri, 14 Feb 2003 19:02:08 -0800 (PST) Received: (qmail 78290 invoked by uid 1016); 15 Feb 2003 03:02:37 -0000 Date: 15 Feb 2003 03:02:37 -0000 Message-ID: <20030215030237.78289.qmail@cr.yp.to> Automatic-Legal-Notices: See http://cr.yp.to/mailcopyright.html. From: "D. J. Bernstein" To: ietf-imaa@imc.org Subject: facts about the real world, part 3 References: <1045235384.22089.TMDA@moriarty.gnomon.org.uk> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Roy Badami writes: > So this raises the question, do we believe that there are > implementations out there that have quirks we need to work around. Do > we have any idea what the quirks are? Certainly. sendmail screws up quoting in the worst possible way; it does exactly what you accused 822 of doing. If you try sending mail to "A.B" then sendmail won't deliver it to A.B. (Perhaps this has been fixed in new versions, but it's certainly true for a very large fraction of the sendmail installations on the net.) This is why 2822 requires minimal quoting. The general principle, as discussed in http://cr.yp.to/proto/design.html, is that (in the absence of overriding concerns such as efficiency) there should be only one way to encode each object. ---D. J. 
Bernstein, Associate Professor, Department of Mathematics, Statistics, and Computer Science, University of Illinois at Chicago From owner-ietf-imaa Fri Feb 14 19:25:23 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1F3PNU01793 for ietf-imaa-bks; Fri, 14 Feb 2003 19:25:23 -0800 (PST) Received: from nicemice.net (arwen.CS.Berkeley.EDU [128.32.132.165]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1F3PMd01788 for ; Fri, 14 Feb 2003 19:25:22 -0800 (PST) Received: from amc by nicemice.net with local (Exim 3.35 #1 (Debian)) id 18jsxC-00025F-00; Fri, 14 Feb 2003 19:25:26 -0800 Date: Sat, 15 Feb 2003 03:25:26 +0000 From: "Adam M. Costello" To: "ietf-imaa@imc.org" Cc: Roy Badami Subject: Re: Question: UseSTD3ASCIIRules on RHS of IMA Message-ID: <20030215032526.GH4500@nicemice.net> Reply-To: IETF IMAA list , Roy Badami References: <1045238119.22308.TMDA@moriarty.gnomon.org.uk> <1045222662.21247.TMDA@moriarty.gnomon.org.uk> <1045233655.21973.TMDA@moriarty.gnomon.org.uk> <20030215005836.GG4500@nicemice.net> <1045273177.24777.TMDA@moriarty.gnomon.org.uk> <21197961.1045255753@p3.JCK.COM> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <21197961.1045255753@p3.JCK.COM> User-Agent: Mutt/1.4i Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Warning: This has gotten off topic. This is a rathole, whose only possible purpose is to illustrate that this topic is a rathole and should be avoided in IMAA like it was avoided in IDNA. :) John C Klensin wrote: > The rule was changed in RFC 1123, which is what permitted 3COM to make > that registration. RFC-1123 relaxed the syntax of "host names". RFC-821 refers to its field as a "host name" because when RFC-821 was written, it was always the name of a host. 
But almost three years before RFC-1123 was published, RFC-974 had generalized the domain part of a mail address to be either the name of a host or the name of an MX record (which is not a host, and RFC-974 always calls it a "domain name", never a "host name"). So when RFC-1123 relaxed the syntax of "host names", did the SMTP field even qualify as a host name anymore? If not, what tells us that the relaxation applied to it? Nothing that I can see. This may have been an innocent oversight, but I can't find any clarification. RFC-2821 tries to have it both ways. On the one hand, it alters the syntax to match the RFC-1123 host name syntax. On the other hand, RFC-2821 states very clearly that names of MX records are *not* host names, which suggests that if the syntax of "host names" were relaxed again, that relaxation would not automatically apply to the syntax of RFC-2821. AMC From owner-ietf-imaa Fri Feb 14 20:00:27 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1F40R502443 for ietf-imaa-bks; Fri, 14 Feb 2003 20:00:27 -0800 (PST) Received: from nicemice.net (arwen.CS.Berkeley.EDU [128.32.132.165]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1F40Qd02439 for ; Fri, 14 Feb 2003 20:00:26 -0800 (PST) Received: from amc by nicemice.net with local (Exim 3.35 #1 (Debian)) id 18jtV8-00029S-00 for ; Fri, 14 Feb 2003 20:00:30 -0800 Date: Sat, 15 Feb 2003 04:00:30 +0000 From: "Adam M. Costello" To: ietf-imaa@imc.org Subject: Re: facts about the real world, part 3 Message-ID: <20030215040030.GI4500@nicemice.net> Reply-To: IETF IMAA list References: <1045235384.22089.TMDA@moriarty.gnomon.org.uk> <20030215030237.78289.qmail@cr.yp.to> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20030215030237.78289.qmail@cr.yp.to> User-Agent: Mutt/1.4i Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: "D. J. 
Bernstein" wrote: > This is why 2822 requires minimal quoting. Nitpick: Although it recommends that quoting be avoided when it's not needed (so foo.bar is preferred over "foo.bar"), I don't think it recommends "minimal quoting"; for example, I see no preference for "foo:bar" over "\f\o\o\:\b\a\r". If an implementation can handle the quotation marks, it can probably handle the backslashes too, so I guess there's no need to recommend minimal quoting. AMC From owner-ietf-imaa Fri Feb 14 21:25:31 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1F5PVT03427 for ietf-imaa-bks; Fri, 14 Feb 2003 21:25:31 -0800 (PST) Received: from nicemice.net (arwen.CS.Berkeley.EDU [128.32.132.165]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1F5PUd03423 for ; Fri, 14 Feb 2003 21:25:30 -0800 (PST) Received: from amc by nicemice.net with local (Exim 3.35 #1 (Debian)) id 18jupS-0002Le-00; Fri, 14 Feb 2003 21:25:34 -0800 Date: Sat, 15 Feb 2003 05:25:34 +0000 From: "Adam M.
Costello" To: ietf-imaa@imc.org Cc: Roy Badami Subject: Re: Question: Fullwidth double-quote and fullwidth backslash Message-ID: <20030215052534.GJ4500@nicemice.net> Reply-To: IETF IMAA list , Roy Badami References: <1045271928.24683.TMDA@moriarty.gnomon.org.uk> <4.2.0.58.J.20030214105633.05c158e0@localhost> <1045254274.23405.TMDA@moriarty.gnomon.org.uk> <8fpKYQTZcDD@3247.org> <8fpJ5$qocDD@3247.org> <1045224428.21358.TMDA@moriarty.gnomon.org.uk> <8fpJ5$qocDD@3247.org> <1045224428.21358.TMDA@moriarty.gnomon.org.uk> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <1045254687.23446.TMDA@moriarty.gnomon.org.uk> <1045271928.24683.TMDA@moriarty.gnomon.org.uk> <1045254274.23405.TMDA@moriarty.gnomon.org.uk> <8fpKYQTZcDD@3247.org> <8fpJ5$qocDD@3247.org> <1045224428.21358.TMDA@moriarty.gnomon.org.uk> User-Agent: Mutt/1.4i Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: This message responds to messages by Roy Badami and Claus Färber. Roy Badami wrote: > When dequoting/requoting localparts, should we consider recognizing > fullwidth double quotes and fullwidth backslash (and any other > double-quote-like and backslash-like characters)? > > It seems to me that the arguments for this are similar to those for > fullwidth dot and fullwidth at, and once we decide to recognize > metacharacters in fullwidth form, we should apply this consistently to > *all* metacharacters. I don't think the arguments are sufficiently similar. For one thing, the dots and at-signs that delimit a mail address are not metacharacters. They are part of the address, and they serve a standard function in all mail addresses in all contexts. Metacharacters are characters that are not actually part of the string they appear in. Examples are quote characters, wildcard characters, macro-expansion characters, etc.
The motivating example for requiring the recognition of various dots and at-signs as separators in IDNs and IMAs is this: If I can type an address into my IMAA-aware application and it works, then I expect to be able to type the address into a message body, mail it to you, and have you paste it into your IMAA-aware application, and have it work. We cannot guarantee success, but standardizing the most common dots and at-signs gets us 99% of the way there. But local parts that require quotation are fundamentally more difficult, even with today's ASCII local parts. Although there is a standard quotation mechanism for local parts in message headers and SMTP commands, there is no standard quotation mechanism for user interfaces. Some user agents might copy the user input directly into the header (relying on the user to supply any needed quotation), others might assume the user input is literal and add more quotation if needed, and others might allow users to use some other quotation mechanism altogether, which the agent undoes before applying the 822-style quotation. There's no standard, so we can't expect local parts requiring quotation to be mailable and paste-able, even in today's ASCII world. It would be a wasted effort to try to standardize the Unicode variants of non-standard ASCII metacharacters. > quotes can appear on business cards. They can, but anyone who puts such an address on a business card must not be very concerned about being reachable (for the reasons above). Aside from the futility argument, it would probably be overstepping our authority to try to standardize Unicode variants of metacharacters. It's not hard to imagine that local parts might be found in contexts where dequoting them involves undoing %hex escapes or &ent; escapes. Should we try to insist that fullwidth % and fullwidth & should be recognized as introducing those escape sequences? Of course not, that would almost surely contradict the relevant standards. 
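The separator recognition described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the four dots are the ones IDNA (RFC 3490) treats as label separators, the fullwidth commercial at (U+FF20) is assumed to be the alternate at-sign under discussion, and the function name `split_pasted_address` is hypothetical rather than anything taken from the draft. Quoted local parts are deliberately not handled, for the reasons given above.

```python
# Dots that IDNA (RFC 3490) treats as label separators:
# full stop, ideographic full stop, fullwidth full stop,
# halfwidth ideographic full stop.
IDNA_DOTS = "\u002e\u3002\uff0e\uff61"

# Assumption: the at-sign variants an IMAA-aware application might accept
# (commercial at and fullwidth commercial at).
AT_SIGNS = "\u0040\uff20"


def split_pasted_address(address: str) -> tuple[str, list[str]]:
    """Split a pasted address into (local part, list of domain labels).

    Hypothetical helper: it does no dequoting and no validation.
    """
    # Use the last at-sign-like character as the delimiter.
    at_index = max(address.rfind(ch) for ch in AT_SIGNS)
    if at_index < 0:
        raise ValueError("no at-sign found in address")
    local_part = address[:at_index]
    domain = address[at_index + 1:]
    # Map every recognized dot in the domain to U+002E before splitting.
    for dot in IDNA_DOTS[1:]:
        domain = domain.replace(dot, ".")
    return local_part, domain.split(".")
```

With these assumptions, an address typed with a fullwidth at-sign and an ideographic full stop splits the same way as its plain ASCII form.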
Claus Färber wrote: > Just do a NFKC normalisation at the very beginning Not before dequoting, for the reason given in the preceding paragraph. Metacharacters are context-dependent and out of our jurisdiction, and need to be removed before we even have a string to work with. Applying NFKC after dequoting, but before subdividing the local part, is okay. > For IMAA, it suffices to specify that implementations MUST accept > all characters as delimiters that decompose to one of our delimiters > during NFKC-with-U+3002-to-U+002E normalisation and that the > delimiters MUST be normalised. > > The easiest way to implement this is an additional normalisation at > the very beginning. I'm not confident that the first paragraph is exactly equivalent to the second. Normalization is very subtle. If the latter is what you have in mind, it might be best to specify that, and leave it up to the optimizers to prove the existence of a shortcut if there is one. By the way, I'm not sure the CJK community would want ideographic full stop mapped to full stop inside the local part. They might prefer the ability to have genuine ideographic full stops in there. Roy Badami wrote: > Are you saying we can do a normalization of the entire e-mail address > without violating IDNA (which specifies that the domain be split on > dot-like characters before normalization). IDNA requires that normalization happen as part of the processing of each individual label, but it doesn't say there must never have been a previous normalization step. IDNA does not specify exactly how a domain name is split into labels (because it depends on context). In some situations normalization could be a part of, or a precursor to, that splitting operation.
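A minimal sketch of the ordering discussed above (dequote first, then NFKC, then subdivide), assuming 822-style quoted-string quoting is the only quotation in play. The function names are hypothetical, and splitting on U+002E alone is an assumption made for illustration, not something the draft specifies.

```python
import unicodedata


def dequote_local_part(local_part: str) -> str:
    """Undo 822-style quoting: strip the outer quotes, resolve backslash pairs."""
    if len(local_part) >= 2 and local_part[0] == '"' and local_part[-1] == '"':
        chars = iter(local_part[1:-1])
        out = []
        for ch in chars:
            if ch == "\\":
                # A backslash quotes the next character literally.
                ch = next(chars, "")
            out.append(ch)
        return "".join(out)
    return local_part


def prepare_local_part(local_part: str) -> list[str]:
    """Dequote, then apply NFKC, then subdivide at ASCII dots."""
    literal = dequote_local_part(local_part)
    normalized = unicodedata.normalize("NFKC", literal)
    return normalized.split(".")
```

Because dequoting runs before NFKC, a fullwidth quotation mark is never treated as a metacharacter; it simply normalizes to a literal ASCII quote inside the string. Plain NFKC also leaves the ideographic full stop (U+3002) untouched, so any mapping of it to U+002E would have to be an explicit extra step.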
AMC From owner-ietf-imaa Fri Feb 14 22:13:34 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1F6DY304195 for ietf-imaa-bks; Fri, 14 Feb 2003 22:13:34 -0800 (PST) Received: from bs.jck.com (ns.jck.com [209.187.148.211]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1F6DWd04189 for ; Fri, 14 Feb 2003 22:13:33 -0800 (PST) Received: from [209.187.148.215] (helo=p3.JCK.COM) by bs.jck.com with esmtp (Exim 4.10) id 18jvZw-000GQC-00; Sat, 15 Feb 2003 01:13:36 -0500 Date: Sat, 15 Feb 2003 01:13:36 -0500 From: John C Klensin To: "D. J. Bernstein" cc: "ietf-imaa@imc.org" Subject: Re: A couple of comments on the open issues... Message-ID: <37061101.1045271616@p3.JCK.COM> In-Reply-To: <20030215025356.76880.qmail@cr.yp.to> References: <1045135861.15010.TMDA@moriarty.gnomon.org.uk> <4.2.0.58.J.20030214110907.05bf4398@localhost> <1045254880.23456.TMDA@moriarty.gnomon.org.uk> <1045135861.15010.TMDA@moriarty.gnomon.org.uk> <1045135666.14953.TMDA@moriarty.gnomon.org.uk> <1045135861.15010.TMDA@moriarty.gnomon.org.uk> <1045135666.14953.TMDA@moriarty.gnomon.org.uk> <4.2.0.58.J.20030214110907.05bf4398@localhost> <20030215001538.GE4500@nicemice.net> <18991408.1045253546@p3.JCK.COM> <20030215025356.76880.qmail@cr.yp.to> X-Mailer: Mulberry/3.0.0 (Win32) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: I used the term "interpreting" differently from the way Dan understood it. But his interpretation is reasonable, so I apologize for any confusion I may have caused. Let me restate my assertion: 821/822 and 2821/2822 specify the _syntax_ of the local part. That syntax specification, to all intents and purposes, limits the local-part string to ASCII graphics and specifies the quoting conventions. 
(If you care about what I'm glossing over here, and don't know, I strongly suggest you go read the specifications rather than taking either my word or Dan's for it. In reading them, remember that 821 and 2821 specify the addresses that can be used to transport and deliver Internet mail. Additional flexibilities permitted in message bodies are fine, but you may not be able to get the additional forms there unless you use a non-Internet (or non-conforming) mail transport. ) However, the definition in terms of a sequence of ASCII characters (with or without quoting) is a rather low-level piece of syntax. There are no productions for disaggregating or organizing the local-part into specific subdivisions such as "user name", "subaddress", "embedded route", or the like. And it was in the sense of being able to look at (or parse) a local-part and say, e.g., "ok, this is a user name and that is a subaddress" that I used the term "interpret". And it is "interpretation" of the local-part in that sense, by anything but the delivery MTA, that is prohibited. The _syntax_ of local parts consists of a string of ASCII characters and some quoting rules but, except for the delivery MTA, one cannot interpret those characters to suggest any particular meaning, nor can one subdivide them into units with particular meaning. Except to the delivery MTA, they are just a character string, albeit one restricted to ASCII and with some rules about quoting. john --On Saturday, 15 February, 2003 02:53 +0000 "D. J. Bernstein" wrote: > > John C Klensin writes: >> There are no standards at all for interpreting the local >> part. > > False. > > RFC 822 clearly states that the local part consists of ASCII > characters. The ASCII standard specifies interpretations of > bytes 33-126 as glyphs. > > (If you're under the delusion that this interpretation is not > actually standardized, or that users don't rely on it, please > switch your MUA to EBCDIC header I/O for a month and let us > know the results.) 
> > The question of which mailbox names are valid on a system, and > how those mailboxes are handled, is (almost) entirely up to > the system. But we have a global interpretation of ASCII > mailbox names as glyphs. > > ---D. J. Bernstein, Associate Professor, Department of > Mathematics, Statistics, and Computer Science, University of > Illinois at Chicago From owner-ietf-imaa Sat Feb 15 05:03:12 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1FD3CF16117 for ietf-imaa-bks; Sat, 15 Feb 2003 05:03:12 -0800 (PST) Received: from moriarty.gnomon.org.uk (pc4-cmbg2-5-cust162.cmbg.cable.ntl.com [81.100.86.162]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1FD3Bd16112 for ; Sat, 15 Feb 2003 05:03:11 -0800 (PST) Received: from moriarty.gnomon.org.uk (roy@localhost [127.0.0.1]) by moriarty.gnomon.org.uk (8.12.3/8.12.3/Debian -4) with ESMTP id h1FD3aKA027638 for ; Sat, 15 Feb 2003 13:03:36 GMT To: ietf-imaa@imc.org, roy@gnomon.org.uk In-reply-to: <20030215052534.GJ4500@nicemice.net> (ietf-imaa.amc+0@nicemice.net.RemoveThisWord) Subject: Re: Question: Fullwidth double-quote and fullwidth backslash References: <1045271928.24683.TMDA@moriarty.gnomon.org.uk> <4.2.0.58.J.20030214105633.05c158e0@localhost> <1045254274.23405.TMDA@moriarty.gnomon.org.uk> <8fpKYQTZcDD@3247.org> <8fpJ5$qocDD@3247.org> <1045224428.21358.TMDA@moriarty.gnomon.org.uk> <8fpJ5$qocDD@3247.org> <20030215052534.GJ4500@nicemice.net> From: Roy Badami Date: Sat, 15 Feb 2003 13:03:35 +0000 Message-ID: <1045314215.27637.TMDA@moriarty.gnomon.org.uk> X-Delivery-Agent: TMDA/0.65 (Johnstown) X-Primary-Address: roy@gnomon.org.uk Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: But local parts that require quotation are fundamentally more difficult, even with today's ASCII local parts. 
Although there is a standard quotation mechanism for local parts in message headers and SMTP commands, there is no standard quotation mechanism for user interfaces. Some user agents might copy the user input directly into the header (relying on the user to supply any needed quotation), others might assume the user input is literal and add more quotation if needed, and others might allow users to use some other quotation mechanism altogether, which the agent undoes before applying the 822-style quotation. There's no standard, so we can't expect local parts requiring quotation to be mailable and paste-able, even in today's ASCII world. It would be a wasted effort to try to standardize the Unicode variants of non-standard ASCII metacharacters. I'm not sure I follow where the user interface issues come in. As I see it, an RFC-822 address (in the form in which it appears in headers) is regarded by most users as an opaque string of characters which they must copy verbatim in order to reach the recipient. Any quoting that is needed in an address will already be present when the address is given to the end user (e.g. out of band) and will be typed in literally by the user into the MUA. Most users probably won't even explicitly know that quoting is going on, they'll just notice that it's a slightly unusual address that they've been given. The only thing a user (who is not familiar with the RFCs) will do with this opaque string is copy it or transcribe it. The string they are given to transcribe may contain a number of classes of characters: ASCII characters, international characters, dots, at-sign, double-quote and backslash. IDNA already ensures that if the user accidentally mis-transcribes normal alphanumeric characters on the RHS as full-width characters this won't break the address, and I imagine that IMAA will involve a similar normalization on the LHS. IDNA already ensures that if the user accidentally mis-transcribes dot on the RHS as full width dot, this will work.
IMAA proposes ensuring that if the user accidentally mis-transcribes the at-sign as a full-width at-sign, the address will still work. So if IMAA chooses not to allow for users mis-transcribing backslash and double-quote as full-width characters, these will end up being the *only* ASCII characters in an address (as presented to the users) that are sensitive to full-width/half-width transcription issues. I feel that the group should consider attempting to either solve or avoid the transcription problem by doing one of the following: either (1) modify the dequoting mechanism to recognize full-width-backslash, full-width double-quote, and any other similar characters that are considered appropriate, or (2) declare that an IMA that contains non-ASCII characters SHOULD NOT use quoting. I don't have a strong preference one way or the other between the above options. -roy From owner-ietf-imaa Sat Feb 15 06:59:48 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1FExml23519 for ietf-imaa-bks; Sat, 15 Feb 2003 06:59:48 -0800 (PST) Received: from moriarty.gnomon.org.uk (pc4-cmbg2-5-cust162.cmbg.cable.ntl.com [81.100.86.162]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1FExkd23515 for ; Sat, 15 Feb 2003 06:59:47 -0800 (PST) Received: from moriarty.gnomon.org.uk (roy@localhost [127.0.0.1]) by moriarty.gnomon.org.uk (8.12.3/8.12.3/Debian -4) with ESMTP id h1FF0DKA028094 for ; Sat, 15 Feb 2003 15:00:13 GMT To: ietf-imaa@imc.org cc: roy@gnomon.org.uk Subject: What is IMAA: some scenarios for deployment From: Roy Badami Date: Sat, 15 Feb 2003 15:00:13 +0000 Message-ID: <1045321213.28093.TMDA@moriarty.gnomon.org.uk> X-Delivery-Agent: TMDA/0.65 (Johnstown) X-Primary-Address: roy@gnomon.org.uk Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: The goal of this group is clearly to make IMAs (internationalized mail addresses) usable on the Internet, and a subgoal that has been inherited from the IDNA
work seems to be to try and do that in a way that eliminates (or at least minimises) the impact in terms of requiring changes to existing infrastructure. But I'm not sure we're all clear exactly what we expect from this subgoal, so I'd like to analyse it further with a few example scenarios. Scenario 1a A user of a POP/IMAP service from an ISP. The user has no control over the domain, but can choose their local mailbox name when they sign up for the service. The service has not been upgraded to contain any support for IMAs. The user wishes to use an address containing an internationalized local-part, and wishes to be able to conveniently send mail to and receive mail from other users of IMAs. IMAA will work here with just an upgrade to an IMA-aware MUA, as long as the user is able to register the ACE-encoded mailbox name within any syntax and length restrictions imposed by the ISP. Clearly the sign-up process will be cumbersome, but this can be assisted by an ACE-conversion tool bundled with the IMA-aware MUA. The ISP could choose to provide an IMA-aware sign-up process at a later date, and this would be a relatively minor upgrade to their systems. All it would require is for the sign-up form to perform the ToASCII transformation for the user; it wouldn't require changes to the ISP's core e-mail infrastructure. In fact, in cases where sign-up is performed by means of an application provided to the user on CD-ROM (rather than by means of a web form), it wouldn't necessarily require any changes to the ISP's infrastructure at all, just an enhancement to the ISP's sign-up application. If we care about supporting this scenario, then we need to be sympathetic towards the kinds of restrictions on mailbox names that are likely to exist in this context. Anyone know what the restrictions on hotmail addresses are, for instance? Scenario 1b Same user, same requirements, but the user prefers to use the ISP's web mail interface rather than a POP/IMAP client.
Whilst the user can still advertise an address containing an internationalized local-part, the user will have to work with ACE-encoded addresses within the web-mail interface. There is nothing we can do to fix this; the ISP must upgrade their web mail system to one which is IMA-aware, and this is likely to be a major project for a large ISP. This is one way in which IMAA deployment is going to be more problematic than, say, the deployment of IDNs in URLs. Virtually every ISP that provides end-user e-mail services operates a web mail service, which is essentially a large, centralized MUA. Requiring all MUAs to be upgraded therefore *does* require a major infrastructure upgrade on the part of virtually every ISP. This is not the fault of IMAA, of course -- any IMA solution will have this consequence. Scenario 2a A small organization runs its own Unix mail server. The organization has registered their own domain name, and the best preference MX record points to their own mail server. A second MX record points to their ISP, who will queue mail for them if their server is down. Neither the organization's mail server nor the ISP's mail infrastructure is IMA-aware. Users pick up mail from the organization's internal mail server using a variety of clients, including local Unix clients and POP/IMAP clients. The organization wishes to register an IDN, and to use fully internationalised e-mail addresses (i.e. both the local-part and the domain contain international characters). Clearly this can be solved in the same way as scenario 1a. All that is needed is to upgrade the client software to be IMA-aware. Administering the mail server will be cumbersome (but not impossible) since all local-parts and domains will have to be entered in ACE-encoded form. Upgrading to an IMA-aware version of the MTA software that provides tools to allow the administrator to work with the internationalized local-parts and domains directly is beneficial, but not required.
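The kind of ACE-conversion helper mentioned in scenarios 1a and 2a could look something like the sketch below, for the domain part only. It relies on Python's built-in `idna` codec, which implements the IDNA ToASCII and ToUnicode operations of RFC 3490; the function names are hypothetical. The local part is left out on purpose, since its encoding is exactly what is still under discussion here.

```python
def domain_to_ace(domain: str) -> str:
    """Convert an internationalized domain name to its ACE form."""
    # The "idna" codec applies ToASCII label by label.
    return domain.encode("idna").decode("ascii")


def domain_from_ace(ace_domain: str) -> str:
    """Convert an ACE-encoded domain name back to Unicode for display."""
    return ace_domain.encode("ascii").decode("idna")


if __name__ == "__main__":
    # What an administrator would paste into a non-IMA-aware configuration.
    print(domain_to_ace("b\u00fccher.example"))
```

An administrator of a non-IMA-aware server would enter the output of `domain_to_ace` in the configuration, and use `domain_from_ace` to check what an existing entry stands for.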
Scenario 2b Same organization as the above scenario, but some of the power users who were used to using the subaddressing facilities of their MTA (and perhaps other functionality relying on structured local-parts) are finding that IMAA doesn't really meet their needs, and are staying with ASCII e-mail addresses. The organization expedites their upgrade to an IMA-aware MTA (which they had been planning anyway for the reasons described in scenario 2a above). The new MTA parses structured addresses only after applying the ToUnicode operation, and the users who want IMA-aware subaddressing are happy. Granted, these subaddresses are not human-readable to correspondents using non-IMA-aware software, but this isn't perceived by the users to be a major problem, since correspondents aren't expected to manipulate the subaddresses themselves, and in any case the users maintain alternate ASCII addresses for use when corresponding with people who don't use IMA-aware MUAs. Note that whilst the organization had to upgrade their MTA to support subaddressing in IMAs, this has no impact on the backup MX service which is provided by their ISP, which remains IMA-unaware. Is scenario 2b really that bad? I don't think so, particularly if IMAA contains some informational matter on this issue and suggests that MTA authors wishing to allow subaddressing consider doing subaddressing after ToUnicode. So actually, I'm now sitting on the fence again on the issue of splitting the domain at anything that might locally be used as a separator. Particularly if the increase in length of the encoded local-part impacts the ability to support scenario 1a above. (But note that if we encode the local-part as a single opaque object, we need to fix the consecutive dots issue as this will probably impact on the ability to support scenario 1a.)
-roy From owner-ietf-imaa Sat Feb 15 07:40:15 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1FFeFP24986 for ietf-imaa-bks; Sat, 15 Feb 2003 07:40:15 -0800 (PST) Received: from moriarty.gnomon.org.uk (pc4-cmbg2-5-cust162.cmbg.cable.ntl.com [81.100.86.162]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1FFeDd24980 for ; Sat, 15 Feb 2003 07:40:13 -0800 (PST) Received: from moriarty.gnomon.org.uk (roy@localhost [127.0.0.1]) by moriarty.gnomon.org.uk (8.12.3/8.12.3/Debian -4) with ESMTP id h1FFeeKA028251 for ; Sat, 15 Feb 2003 15:40:40 GMT To: ietf-imaa@imc.org cc: roy@gnomon.org.uk Subject: Re: Can we back up a bit and ask some basic questions? Analternate model From: Roy Badami Date: Sat, 15 Feb 2003 15:40:40 +0000 Message-ID: <1045323640.28250.TMDA@moriarty.gnomon.org.uk> X-Delivery-Agent: TMDA/0.65 (Johnstown) X-Primary-Address: roy@gnomon.org.uk Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Your document is well argued. We certainly shouldn't blindly assume that just because the ACE vs just-send-8 issue was argued to death in the IDN WG, the trade-offs between the two approaches when applied to IMAs will automatically be the same as those for IDNs. (Though I can also understand that this group probably really doesn't want to go there again.) But I'd urge you to consider the four scenarios I just put forward in the thread "What is IMAA: some scenarios for deployment" Scenario 1a: with IMAA there's a reasonable hope of basic support with just an updated mail client, and better support with minor updates to the ISP's sign-up systems. With UTF8ADDRESSES this will require in addition a major upgrade to the ISP's mail infrastructure. Scenario 1b: IMAA and UTF8ADDRESSES both require a major upgrade to the ISP's infrastructure.
Scenario 2a: IMAA requires only an upgrade to the mail clients; UTF8ADDRESSES requires an upgrade to the clients, the organization's MTA, and the ISP's mail infrastructure (so that the backup MX will continue to work). Scenario 2b: IMAA requires upgrades to the mail clients and, as currently specified in the draft, an upgrade to the organization's MTA (though the need to upgrade the MTA might disappear depending on the design decisions we take in IMAA). UTF8ADDRESSES requires upgrades to the clients, the MTA and the ISP's infrastructure. -roy From owner-ietf-imaa Sat Feb 15 09:31:09 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1FHV9l01014 for ietf-imaa-bks; Sat, 15 Feb 2003 09:31:09 -0800 (PST) Received: from [63.202.92.157] (adsl-63-202-92-157.dsl.snfc21.pacbell.net [63.202.92.157]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1FHV7d01005 for ; Sat, 15 Feb 2003 09:31:07 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: In-Reply-To: <18991408.1045253546@p3.JCK.COM> References: <1045135861.15010.TMDA@moriarty.gnomon.org.uk> <1045135666.14953.TMDA@moriarty.gnomon.org.uk> <1045135861.15010.TMDA@moriarty.gnomon.org.uk> <4.2.0.58.J.20030214110907.05bf4398@localhost> <1045254880.23456.TMDA@moriarty.gnomon.org.uk> <1045135861.15010.TMDA@moriarty.gnomon.org.uk> <1045135666.14953.TMDA@moriarty.gnomon.org.uk> <1045135861.15010.TMDA@moriarty.gnomon.org.uk> <1045135666.14953.TMDA@moriarty.gnomon.org.uk> <4.2.0.58.J.20030214110907.05bf4398@localhost> <20030215001538.GE4500@nicemice.net> <18991408.1045253546@p3.JCK.COM> X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. 
Please report use of this X-Habeas-SWE-9: mark in spam to . Date: Sat, 15 Feb 2003 09:29:01 -0800 To: IETF IMAA list From: Paul Hoffman / IMC Subject: Re: A couple of comments on the open issues... Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: [ This thread is about subaddressing ] At 8:12 PM -0500 2/14/03, John C Klensin wrote: >There are no standards at all for interpreting the local part. The >standard is "no one but the delivery MTA can try to interpret the >local part in any way". Exactly right. >Getting internationalization at the cost of reduced functionality >for anyone or anything that is now doing something that conforms >should be considered only if it is the only way. It isn't. Fully agree. The purpose of the discussion about making IMAA subaddress-aware is to make it so that the delivery MTA can do something based on the subaddresses. But IMAA doesn't need to be subaddress-aware for that to happen. MTAs that do things with subaddresses already have a processing step for how to parse a full address and what to do after parsing. Those processes could simply have a ToUnicode step added before the processing step, at which point the subaddress delimiters appear. If we try to make IMAA subaddress-aware, we inherently restrict the characters that can be used as delimiters for subaddresses. If we don't make IMAA subaddress-aware, there is no restriction, and current systems work exactly the way they do now if they add the ToUnicode step before subaddress processing. We can do a lot of damage and make a lot of restrictions trying to be smart here. Let's instead go back to IETF first principles and let the end systems do the thinking.
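As a rough sketch of "ToUnicode first, then subaddress processing" at a delivery MTA: the code below uses stand-ins throughout. The prefix `zz--` is a made-up placeholder and not the draft's ACE prefix, the stub functions use Python's raw `punycode` codec over the whole local part and skip the Stringprep step a real ToASCII/ToUnicode would perform, and `+` is only one site's choice of delimiter.

```python
from typing import Optional, Tuple

ACE_PREFIX = "zz--"  # hypothetical placeholder, NOT the prefix from the draft


def to_ascii_stub(local_part: str) -> str:
    """Stand-in for ToASCII: Punycode the whole local part, no Stringprep."""
    if local_part.isascii():
        return local_part
    return ACE_PREFIX + local_part.encode("punycode").decode("ascii")


def to_unicode_stub(local_part: str) -> str:
    """Stand-in for ToUnicode: reverse to_ascii_stub if the prefix is present."""
    if local_part.lower().startswith(ACE_PREFIX):
        return local_part[len(ACE_PREFIX):].encode("ascii").decode("punycode")
    return local_part


def split_subaddress(local_part: str, delimiter: str = "+") -> Tuple[str, Optional[str]]:
    """Decode first, then split at the site's own subaddress delimiter."""
    decoded = to_unicode_stub(local_part)
    user, sep, detail = decoded.partition(delimiter)
    return user, (detail if sep else None)
```

The delimiter is a local parameter of the delivery system, so nothing in the encoding has to know about it; splitting the still-encoded form at the same delimiter would generally cut in the wrong place.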
--Paul Hoffman, Director --Internet Mail Consortium From owner-ietf-imaa Sat Feb 15 09:47:39 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1FHldk03486 for ietf-imaa-bks; Sat, 15 Feb 2003 09:47:39 -0800 (PST) Received: from moriarty.gnomon.org.uk (pc4-cmbg2-5-cust162.cmbg.cable.ntl.com [81.100.86.162]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1FHlbd03476 for ; Sat, 15 Feb 2003 09:47:37 -0800 (PST) Received: from moriarty.gnomon.org.uk (roy@localhost [127.0.0.1]) by moriarty.gnomon.org.uk (8.12.3/8.12.3/Debian -4) with ESMTP id h1FHm4KA028771 for ; Sat, 15 Feb 2003 17:48:05 GMT To: ietf-imaa@imc.org cc: roy@gnomon.org.uk Subject: Re: Can we back up a bit and ask some basic questions? Analternate model From: Roy Badami Date: Sat, 15 Feb 2003 17:48:04 +0000 Message-ID: <1045331284.28770.TMDA@moriarty.gnomon.org.uk> X-Delivery-Agent: TMDA/0.65 (Johnstown) X-Primary-Address: roy@gnomon.org.uk Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: I'd like to make a further comment on the UTF8ADDRESSES proposal. I don't necessarily think that UTF8ADDRESSES is a bad idea. There is far more scope for both IMAA and UTF8ADDRESSES to co-exist than there was for two approaches to coexist in IDNs. The e-mail infrastructure is far more easily extensible than the DNS infrastructure. RFC (2)821 allows us the ESMTP extension mechanism, and RFC (2)822 allows us to add additional headers. These mechanisms are both powerful and general, and may well be useful to furthering the goal of IMAs in a number of ways (about which I have more to say, but that will have to wait for another message). Compare it with the situation for mail bodies. The original MIME WG chose not to extend SMTP to require it to be 8-bit-clean, but to define quoting mechanisms to allow the transport of 8-bit mail over a 7-bit transport. 
Subsequently, 8BITMIME was defined, to allow consenting MTAs to exchange 8-bit mail unencoded over 8-bit-clean channels. I think that there will clearly be a demand for both mechanisms to exist, and I'm sure both kinds of mechanisms will be standardised. As with MIME, initial deployment will probably primarily use an ASCII encoding, but over time the native 8-bit approach will become more and more prevalent. It's unclear to me whether this group should confine itself to the ASCII-encoded protocol, or should address both approaches simultaneously. I realize that many will probably wish to confine themselves to IMAA, but I think there is a benefit in releasing both solutions to the world simultaneously, so that people can choose which they wish to deploy. -roy From owner-ietf-imaa Sat Feb 15 10:32:24 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1FIWO205579 for ietf-imaa-bks; Sat, 15 Feb 2003 10:32:24 -0800 (PST) Received: from [63.202.92.157] (adsl-63-202-92-157.dsl.snfc21.pacbell.net [63.202.92.157]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1FIWMd05573 for ; Sat, 15 Feb 2003 10:32:22 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: In-Reply-To: <18245836.1045252800@p3.JCK.COM> References: <18245836.1045252800@p3.JCK.COM> X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . Date: Sat, 15 Feb 2003 10:31:46 -0800 To: IETF IMAA list From: Paul Hoffman / IMC Subject: Re: Can we back up a bit and ask some basic questions?
An alternate model Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 8:00 PM -0500 2/14/03, John C Klensin wrote: >IMAA starts from the assumption that the right way to handle domain >names is, with slight modifications, the right way to handle email >local-parts. Those words are not in the document, and they should not be put there by interpretation. IMAA starts from the assumption that the right way to handle domain names is likely to be as good as or better than other proposals. If it is as good or better, it is preferable to using a different approach. If a better approach based on a different format is proposed, that's fine. Early views of what would be best for IDNs were shown to be not as good as what we ended up with. [ Much elided below ] >(3) Strawman semi-proposal To help the discussion, let's call the two proposals IMAA-ACE and IMAA-UTF8. FWIW, I thought about something along these lines a long time ago and rejected it for some of the reasons below. >(a) We define a new SMTP extension. For purposes of discussion, >let's call it UTF8ADDRESSES. Therefore IMAA-UTF8 cannot achieve wide use until this extension is nearly universally adopted. When you get a business card from a colleague that has an internationalized email address, you can probably assume that their MTA supports UTF8ADDRESSES, but there is no way for you to know if your own MTA supports it, so you don't know if you can send him or her mail. With IMAA-UTF8, you always need to know that the originating MTA supports UTF8ADDRESSES. If you are a road warrior, you would need to know if your on-the-road ISP's MTAs support UTF8ADDRESSES before you could send mail to an IMAA-UTF8 address. Of course, you have no control over this. With IMAA-UTF8, if a company has an MTA that supports UTF8ADDRESSES, and they want to add a firewall that has an SMTP proxy, that proxy must support UTF8ADDRESSES.
Today, companies can safely add firewalls that have SMTP proxies that have fewer ESMTP services than the MTA for which they front. These sound like show-stoppers to me. If I'm wrong, please let me know. >(e) The issue of what the delivery MTA actually puts in the mail >store, and how the receiving MUA(s) handle that, has never been the >subject of Internet protocols and that should probably not change. This can be read as "the display of internationalized addresses to the user will happen by magic". That seems pretty irresponsible to the people who have deployed POP and IMAP clients following the IETF's standards. It also assumes that the MTA knows that all mail clients that are going to read mail from the mailstore can read IMAA-UTF8 addresses. Given that there is no negotiation possible there, that is a pretty heady assumption. Basically, you are forcing the MTA to know the display capabilities of all possible readers of stored mail. MIME avoided this: IMAA-foo should avoid this. >(f) There are some other issues here, but that is the general >picture. For example, we would have to _very_ carefully work out >what went into reply messages, given that those might hit a host >that wasn't prepared to have i18n characters in them. But that is a >problem to be worked out, not a showstopper. We fully disagree about the status of that. Many systems based on IETF standards will break if reasonable responses are routinely lost. The outcome of this proposal is that no one will know when it is safe for them to start using their internationalized LHS. This will be even more confusing to the people who can very safely use their internationalized RHS in email, but not their internationalized LHS. This sounds like a setup for frustration and failure.
--Paul Hoffman, Director --Internet Mail Consortium From owner-ietf-imaa Sat Feb 15 10:32:26 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1FIWQ705583 for ietf-imaa-bks; Sat, 15 Feb 2003 10:32:26 -0800 (PST) Received: from [63.202.92.157] (adsl-63-202-92-157.dsl.snfc21.pacbell.net [63.202.92.157]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1FIWNd05578; Sat, 15 Feb 2003 10:32:23 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: In-Reply-To: <18245836.1045252800@p3.JCK.COM> References: <18245836.1045252800@p3.JCK.COM> X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . Date: Sat, 15 Feb 2003 09:59:53 -0800 To: John C Klensin , IETF IMAA list From: Paul Hoffman / IMC Subject: Need for ESMPT extension for IMAA? Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: In John's proposal for a very different idea of how to do IMAA, he said: At 8:00 PM -0500 2/14/03, John C Klensin wrote: > Loose end a.3: Because of the opacity requirement, I > believe that even IMAA would ultimately require an ESMTP > extension and negotiation to work properly. So, in some > ways this proposal and IMAA are compatible and > complementary if an IMAA ACE string is used rather than > UTF-8. I think that would be overkill unless we really > like that hammer. Could you elucidate? I cannot see how IMAA touches standard SMTP at all if we agree (as I'm pretty sure we do) that the local part is opaque. 
--Paul Hoffman, Director --Internet Mail Consortium From owner-ietf-imaa Sat Feb 15 10:31:56 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1FIVup05549 for ietf-imaa-bks; Sat, 15 Feb 2003 10:31:56 -0800 (PST) Received: from bs.jck.com (ns.jck.com [209.187.148.211]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1FIVsd05543; Sat, 15 Feb 2003 10:31:54 -0800 (PST) Received: from [209.187.148.215] (helo=p3.JCK.COM) by bs.jck.com with esmtp (Exim 4.10) id 18k76S-000Ie6-00; Sat, 15 Feb 2003 13:31:56 -0500 Date: Sat, 15 Feb 2003 13:31:56 -0500 From: John C Klensin To: "Paul Hoffman / IMC" , IETF IMAA list Subject: Subaddressing (was: Re: A couple of comments on the open issues...) Message-ID: <81361551.1045315916@p3.JCK.COM> In-Reply-To: References: <1045135861.15010.TMDA@moriarty.gnomon.org.uk> <1045135666.14953.TMDA@moriarty.gnomon.org.uk> <1045135861.15010.TMDA@moriarty.gnomon.org.uk> <4.2.0.58.J.20030214110907.05bf4398@localhost> <1045254880.23456.TMDA@moriarty.gnomon.org.uk> <1045135861.15010.TMDA@moriarty.gnomon.org.uk> <1045135666.14953.TMDA@moriarty.gnomon.org.uk> <1045135861.15010.TMDA@moriarty.gnomon.org.uk> <1045135666.14953.TMDA@moriarty.gnomon.org.uk> <4.2.0.58.J.20030214110907.05bf4398@localhost> <20030215001538.GE4500@nicemice.net> <18991408.1045253546@p3.JCK.COM> X-Mailer: Mulberry/3.0.0 (Win32) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: --On Saturday, 15 February, 2003 09:29 -0800 "Paul Hoffman / IMC" wrote: > [ This thread is about subaddressing ] I adjusted the subject line in the hope that will be helpful. In case it isn't clear, I think Paul and I are in complete agreement about this. Just to be sure, let me try to restate this as a design principle to see if he agrees... 
Whether one uses an IDNA-like approach, or something of the general character of 8BITADDRESSES (I suspect I'm going to fairly quickly come to hate that term), we don't want to disrupt local-address opacity. Consequently, anything that is to be done with subaddresses --or any other special-purpose parsing of the local-part into subdivisions for differential processing-- must be done before encoding on the sending side and processed post-decoding on the receiving side. The only exception, or group of exceptions, arises if the sender and receiver explicitly agree, before the address is sent, on interpreting that address in some specific fashion. Such an agreement could be via an SMTP extension or, at least in theory, via some out-of-band negotiation or private agreement. If we can actually agree on that, then a good many of the recent discussions become irrelevant, and we can move on to other topics. Where we disagree, I think, is how much that rule can be bent and whether it is necessary to do so. I am imagining implementations in which the originating user enters an address in a local CCS, or Unicode in some form, into an MUA and passes the message off to the local MTA, which processes it into IMAA. Variations on that model, without the i18n characters, are very popular today, and we know roughly how well they work (let's say the experience has not been uniformly good). The problem is that the MTA implementers and admins, after long experience, decide that they are smarter than the user or the MUA-writer (they are often correct) and start fixing things up. I think the IMAA approach encourages them to include trying to understand, and fix, the local part, and that is what worries me -- experience indicates that the more transparent that originating MTA can be to what it receives, the less likely we are to have problems.
If the user types (or pastes) the punycode string for the local-part into the MUA, and the MTA just passes it on without looking, I don't have a problem at all, but that really isn't our objective, is it? At the other end, the cases that scare me most are the ones that show up as edge cases in Roy's analysis (Roy, I will respond in more detail to that later -- I'm considering it rather than ignoring it). For example, suppose I'm an ISP operating a commercial mail server and I let my users choose their own mailbox names. But suppose that server also supports some control functions, and those control functions are introduced (I'm going to pick an example that is obviously perverse, but the more realistic ones are just more complicated to explain) by a semi-random set of characters that could reasonably include IESG--. Now, unless I'm careless or an idiot, I prevent users from creating mailboxes that might conflict with those names. But, as soon as I do that, the hypothesis that IMAA would permit any user to create an i18n name without having to negotiate with the ISP or mail supplier fails. Conversely, if I am careless or an idiot (there is ample evidence that there are sysadmins out there with those properties), users create mailboxes for which messages silently disappear, or my control messages don't show up, depending on who is unlucky. There is a possibly-more-interesting case involving a firewall-gateway that poses as the receiving mail server but intercepts and parses control messages from the Internet (as noted in RFC 2979, those critters skirt close to the line of standards violations, but there are lots of them out there). Are these high-risk cases? I don't know. But they are plausible and, so far, we have had a firm rule that we don't put the email infrastructure at risk while making improvements if there is any other way. And it is a risk. 
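The collision in the deliberately perverse example above can be made concrete with a small Python sketch. Everything here is hypothetical: the function name, the idea that the operator's control names are matched case-insensitively, and the use of `iesg--` as both the control prefix and the placeholder ACE prefix.

```python
# Hypothetical mailbox-creation policy for the ISP in the example above.
RESERVED_PREFIXES = ("iesg--",)  # control-function prefix from the example


def may_create_mailbox(name: str, reserved=RESERVED_PREFIXES) -> bool:
    """Return False if the requested name could shadow a control function."""
    lowered = name.lower()
    return not any(lowered.startswith(prefix) for prefix in reserved)
```

Under these assumptions the careful operator refuses every mailbox whose encoded form begins with the reserved prefix, which is exactly the negotiation with the mail supplier that the example says cannot be avoided; the careless operator omits the check and gets the silent collisions instead.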
Now, by contrast, if the receiving server explicitly says "I'm prepared to receive i18n addresses" then, whatever standard we adopt, we can be assured that the server isn't going to muck it up in some unpredictable way. I think prudence requires looking at that option very carefully, especially since, as illustrated above, the notion of users at opposite ends of the network using IMAA to communicate using Unicode addresses without any of the intervening MTA being involved may not always work. And I think that analysis of that part of the issue applies where the transport encoding is then UTF-8, or Punycode, or something else. john > At 8:12 PM -0500 2/14/03, John C Klensin wrote: >> There are no standards at all for interpreting the local >> part. The standard is "no one but the delivery MTA can try >> to interpret the local part in any way". > > Exactly right. > >> Getting internationalization at the cost of reduced >> functionality for anyone or anything that is now doing >> something that conforms should be considered only if it is >> the only way. It isn't. > > Fully agree. > > The purpose of the discussion about making IMAA > subaddress-aware is to make is so that the delivery MTA can do > something based on the subaddresses. But IMAA doesn't need to > be subaddress-aware for that to happen. MTAs that do things > with subaddresses already have a processing step for how to > parse a full address and what to do after parsing. Those > processes could simply have a ToUnicode step added before the > processing step, at which point the subaddress delimiters > appear. > > If we try to make IMAA subaddress-aware, we inherently > restrict the characters that can be used as delimiters for > subaddress. If we don't make IMAA subaddress-aware, there is > no restriction, and current systems work exactly the way they > do now if they add the ToUnicode step before subaddress > processing. > > We can do a lot of damage and make a lot of restrictions > trying to be smart here. 
Let's instead go back to IETF first > principles and let the end systems do the thinking. > > --Paul Hoffman, Director > --Internet Mail Consortium From owner-ietf-imaa Sat Feb 15 10:37:53 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1FIbrh05751 for ietf-imaa-bks; Sat, 15 Feb 2003 10:37:53 -0800 (PST) Received: from [63.202.92.157] (adsl-63-202-92-157.dsl.snfc21.pacbell.net [63.202.92.157]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1FIbnd05746; Sat, 15 Feb 2003 10:37:49 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: In-Reply-To: <1045331284.28770.TMDA@moriarty.gnomon.org.uk> References: <1045331284.28770.TMDA@moriarty.gnomon.org.uk> X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . Date: Sat, 15 Feb 2003 10:37:46 -0800 To: Roy Badami , ietf-imaa@imc.org From: Paul Hoffman / IMC Subject: Re: Can we back up a bit and ask some basic questions? Analternate model Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 5:48 PM +0000 2/15/03, Roy Badami wrote: >I realize that many will probably wish to confine themselves to IMAA, >but I think there is a benefit in releasing both solutions to the >world simultaneously, so that people can choose which they wish to >deploy. The IETF has a long history of very bad outcomes when we release two very different ways to do the same thing. 
In the email world, the fact that S/MIME and PGP are both IETF standards has pretty much prevented any sender from being able to assume what the recipient can receive. The fact that there are two standardized PKIX certificate enrollment protocols has caused it to be almost impossible to roll out secure email or VPNs in a reasonable fashion. And so on. --Paul Hoffman, Director --Internet Mail Consortium From owner-ietf-imaa Sat Feb 15 11:05:03 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1FJ53P06441 for ietf-imaa-bks; Sat, 15 Feb 2003 11:05:03 -0800 (PST) Received: from [63.202.92.157] (adsl-63-202-92-157.dsl.snfc21.pacbell.net [63.202.92.157]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1FJ50d06431; Sat, 15 Feb 2003 11:05:00 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: In-Reply-To: <81361551.1045315916@p3.JCK.COM> References: <1045135861.15010.TMDA@moriarty.gnomon.org.uk> <1045135666.14953.TMDA@moriarty.gnomon.org.uk> <1045135861.15010.TMDA@moriarty.gnomon.org.uk> <4.2.0.58.J.20030214110907.05bf4398@localhost> <1045254880.23456.TMDA@moriarty.gnomon.org.uk> <1045135861.15010.TMDA@moriarty.gnomon.org.uk> <1045135666.14953.TMDA@moriarty.gnomon.org.uk> <1045135861.15010.TMDA@moriarty.gnomon.org.uk> <1045135666.14953.TMDA@moriarty.gnomon.org.uk> <4.2.0.58.J.20030214110907.05bf4398@localhost> <20030215001538.GE4500@nicemice.net> <18991408.1045253546@p3.JCK.COM> <81361551.1045315916@p3.JCK.COM> X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . 
Date: Sat, 15 Feb 2003 11:01:29 -0800 To: John C Klensin , IETF IMAA list From: Paul Hoffman / IMC Subject: Re: Subaddressing (was: Re: A couple of comments on the open issues...) Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 1:31 PM -0500 2/15/03, John C Klensin wrote: >--On Saturday, 15 February, 2003 09:29 -0800 "Paul Hoffman / IMC" > wrote: > >>[ This thread is about subaddressing ] > >I adjusted the subject line in the hope that will be helpful. > >In case it isn't clear, I think Paul and I are in complete agreement >about this. Just to be sure, let me try to restate this as a design >principle to see if he agrees... > > Whether one uses an IDNA-like approach, or something of > the general character of 8BITADDRESSES (I suspect I'm > going fairly quickly come to hate that term), we don't > want to disrupt local-address opacity. Consequently, > anything that is to be done with subaddresses --or any > other special-purpose parsing of the local-part into > subdivisions for differential processing-- must be > done before encoding on the sending side and processed > post-decoding on the receiving side. > > The only exception, or group of exceptions, arise if the > sender and receiver explicit agree, before the address > is sent, on interpreting that address in some specific > fashion. Such an agreement could be via an SMTP > extension or, at least in theory, via some out of band > negotiation or private agreement. > >If we can actually agree on that, then a good many of the recent >discussions become irrelevant, and we can move on to other topics. John and I agree on this; if others do as well, the latter part of that sentence becomes true. >Where we disagree, I think, is how much that rule can be bent and >whether it is necessary to do so. 
I am imagining implementations in >which the originating user enters an address in a local CCS, or >Unicode in some form, into an MUA and passes the message off to the >local MTA, which processes it into IMAA. The typical mail model today is that the MUA software acts as the first MTA in the chain, so it is the MUA that would do the IMAA conversion. > Variations on that model, without the i18n characters, are very >popular today, and we know roughly how well they work (let's say the >experience has not been uniformly good). The problem is that the >MTA implementers and admins, after long experience, decide that they >are smarter than the user or the MUA-writer (they are often correct) >and start fixing things up. I think the IMAA approach encourages >them to include trying to understand, and fix, the local part, and >that is what worries me -- experience indicates that the more >transparent that originating MTA can be to what it receives, the >less likely we are to have problems. We agree with the worry, which is why I would trust my MUA to do any conversion, but no one else. That also supports the view of opacity through all the SMTP servers. >If the user types (or pastes) the punycode string for the local-part >into the MUA, and the MTA just passes it on without looking, I don't >have a problem at all, but that really isn't our objective, is it? Correct. The objective is that the user will { type | paste | speak } a name using { the current charset | ACE } into the LHS and It Just Works. . . . >But, as soon as I do that, the hypothesis that IMAA would permit any >user to create an i18n name without having to negotiate with the ISP >or mail supplier fails. And that is why the hypothesis is nowhere in the IMAA-ACE document. It is not congruent with today's practice, and it isn't needed. The rest of the argument about why this wouldn't work is perfectly true but irrelevant to IMAA-ACE. 
--Paul Hoffman, Director --Internet Mail Consortium From owner-ietf-imaa Sat Feb 15 11:31:56 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1FJVu007126 for ietf-imaa-bks; Sat, 15 Feb 2003 11:31:56 -0800 (PST) Received: from moriarty.gnomon.org.uk (pc4-cmbg2-5-cust162.cmbg.cable.ntl.com [81.100.86.162]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1FJVrd07115 for ; Sat, 15 Feb 2003 11:31:53 -0800 (PST) Received: from moriarty.gnomon.org.uk (roy@localhost [127.0.0.1]) by moriarty.gnomon.org.uk (8.12.3/8.12.3/Debian -4) with ESMTP id h1FJWMKA029314 for ; Sat, 15 Feb 2003 19:32:22 GMT To: phoffman@imc.org CC: ietf-imaa@imc.org, roy@gnomon.org.uk In-reply-to: (message from Paul Hoffman / IMC on Sat, 15 Feb 2003 10:37:46 -0800) Subject: Re: Can we back up a bit and ask some basic questions? Analternate model References: <1045331284.28770.TMDA@moriarty.gnomon.org.uk> Date: Sat, 15 Feb 2003 19:32:21 +0000 Message-ID: <1045337541.29302.TMDA@moriarty.gnomon.org.uk> From: Roy Badami X-Delivery-Agent: TMDA/0.65 (Johnstown) X-Primary-Address: roy@gnomon.org.uk Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: The IETF has a long history of very bad outcomes when we release two very different ways to do the same thing. In the email world, the fact that S/MIME and PGP are both IETF standards has pretty much prevented any sender from being able to assume what the recipient can receive. The fact that there are two standardized PKIX certificate enrollment protocols has caused it to be almost impossible to roll out secure email or VPNs in a reasonable fashion. And so on. I'm not sure that you understand what I was proposing. A better analogy would be to pose the question: would it have been a good thing for ESMTP and 8BITMIME to have been defined concurrently with the base MIME standards? 
(That isn't intended as a loaded question; "No, it wouldn't have made any difference" is a perfectly valid answer.) I'm proposing that the two solutions are defined as part of an encompassing IMA Architecture, not independently without regard to interoperability. This is what I imagine the world will be like a few years from now: All IMA-aware systems will support IMAA (this will be a mandatory part of some IMA Architecture standard). Many systems, particularly those in parts of the world where IMAs are popular, will use ESMTP extensions to exchange IMAs in native UTF-8, and to exchange messages in an extended format that allows native UTF-8 addresses in the headers. Systems that receive a message with UTF-8 addresses and need to relay it to a system that doesn't support the requisite ESMTP extensions will need to apply ToASCII to both the envelope and header addresses before forwarding the message. This is analogous to the (admittedly inconsistently implemented) requirement that a system which receives an 8BITMIME message converts it to a suitable 7-bit encoding if the destination system doesn't support 8BITMIME. Maybe defining a complete IMA Architecture is too much to tackle at once. As I envision it, IMAA will probably be the only mandatory part of the IMA Architecture, so maybe there's a benefit in getting that done quickly. However, I think there may be benefits in putting some thought into the bigger picture of the IMA Architecture, even if we decide to concentrate on IMAA for the present. One example: if we were for the sake of argument to decide that ILPs should be case insensitive, but that it would be desirable (but not absolutely essential) to make them case preserving, our decision as to whether to put any effort into making IMAA preserve the case of local-parts might be strongly influenced by whether we believe that another component of the ultimate IMA Architecture would generally carry e-mail addresses in ILP-aware and IDN-aware slots.
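The relay behaviour imagined above (pass UTF-8 addresses through when the next hop advertises the extension, otherwise apply ToASCII) could be sketched in Python roughly as below. This is a sketch under stated assumptions, not a specification: `UTF8ADDRESSES` is only the working name used in this thread, `iesg--` is a placeholder prefix, the function names are invented, and the local-part encoding uses Python's `punycode` codec as a stand-in for the real ToASCII operation. The domain part uses Python's built-in `idna` codec.

```python
# Hypothetical sketch of a relay choosing the address form for the next hop.
ACE_PREFIX = "iesg--"        # placeholder prefix; an assumption
EXTENSION = "UTF8ADDRESSES"  # working name from this thread, not a registered keyword


def to_ascii(address: str) -> str:
    """Encode both sides of an address into an ASCII-compatible form."""
    local, _, domain = address.rpartition("@")
    if not local.isascii():
        # Stand-in for ToASCII on the local part (preparation steps omitted).
        local = ACE_PREFIX + local.encode("punycode").decode("ascii")
    domain = domain.encode("idna").decode("ascii")
    return f"{local}@{domain}"


def address_for_next_hop(address: str, ehlo_keywords) -> str:
    """Return the address unchanged if the next hop advertised the extension,
    otherwise its ASCII-compatible encoding."""
    advertised = {keyword.upper() for keyword in ehlo_keywords}
    if EXTENSION in advertised:
        return address
    return to_ascii(address)
```

A real relay would have to apply the same conversion to header addresses as well as the envelope, as the message above notes; that part is left out of the sketch.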
Only half jokingly, I propose that we repurpose the acronym IMAA to mean IMA Architecture, and come up with a new name for the protocol described in the base document. Part of the reason for this is that I don't actually think that IMAA is a good name for the base document protocol, since it has consequences that go beyond end-user applications. -roy From owner-ietf-imaa Sat Feb 15 11:57:28 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1FJvSA08305 for ietf-imaa-bks; Sat, 15 Feb 2003 11:57:28 -0800 (PST) Received: from max.kde.org (max.kde.org [134.2.170.93]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1FJvQd08301 for ; Sat, 15 Feb 2003 11:57:27 -0800 (PST) Received: from 192.168.0.3 (pD9525249.dip.t-dialin.net [217.82.82.73]) by max.kde.org (Postfix) with ESMTP id CDAC3B5DB9 for ; Sat, 15 Feb 2003 20:57:23 +0100 (CET) From: Marc Mutz Organization: KDE To: ietf-imaa@imc.org Subject: Re: A couple of comments on the open issues... Date: Sat, 15 Feb 2003 20:41:29 +0100 User-Agent: KMail/1.5.9 References: <1045135861.15010.TMDA@moriarty.gnomon.org.uk> <20030215001538.GE4500@nicemice.net> <18991408.1045253546@p3.JCK.COM> In-Reply-To: <18991408.1045253546@p3.JCK.COM> X-PGP-Key: 0xBDBFE838 MIME-Version: 1.0 Message-Id: <200302152040.05763@sendmail.mutz.com> Content-Type: multipart/signed; protocol="application/pgp-signature"; micalg=pgp-sha1; boundary="Boundary-02=_pfpT+EQe8ieglmC"; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: --Boundary-02=_pfpT+EQe8ieglmC Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Content-Description: signed data Content-Disposition: inline On Saturday 15 February 2003 02:12, John C Klensin wrote: > There are no standards at all for interpreting the local part. > The standard is "no one but the delivery MTA can try to > interpret the local part in any way". 
MIXER (rfc 2156) specifies a "sub-syntax" for local-parts. The interpretation is done by the X.400<->rfc(2)822 gateways, not the final delivery MTA. MIXER is standards-track. Marc -- Eternal vigilance is the price of liberty -- Thomas Jefferson --Boundary-02=_pfpT+EQe8ieglmC Content-Type: application/pgp-signature Content-Description: signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.1 (GNU/Linux) iD8DBQA+Tpfp3oWD+L2/6DgRAs4UAJ9gwPxB+dQpEoV6DDjOMNjy8wF2MgCgiYsY OsRmOauqhojYI5yoDJt1oz8= =nkzC -----END PGP SIGNATURE----- --Boundary-02=_pfpT+EQe8ieglmC-- From owner-ietf-imaa Sat Feb 15 12:46:28 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1FKkSS10412 for ietf-imaa-bks; Sat, 15 Feb 2003 12:46:28 -0800 (PST) Received: from moriarty.gnomon.org.uk (pc4-cmbg2-5-cust162.cmbg.cable.ntl.com [81.100.86.162]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1FKkQd10407 for ; Sat, 15 Feb 2003 12:46:26 -0800 (PST) Received: from moriarty.gnomon.org.uk (roy@localhost [127.0.0.1]) by moriarty.gnomon.org.uk (8.12.3/8.12.3/Debian -4) with ESMTP id h1FKkuKA029644 for ; Sat, 15 Feb 2003 20:46:56 GMT To: phoffman@imc.org CC: john-ietf@jck.com, ietf-imaa@imc.org In-reply-to: (message from Paul Hoffman / IMC on Sat, 15 Feb 2003 11:01:29 -0800) Subject: Re: Subaddressing (was: Re: A couple of comments on the open issues...)
References: <1045135861.15010.TMDA@moriarty.gnomon.org.uk> <1045135666.14953.TMDA@moriarty.gnomon.org.uk> <1045135861.15010.TMDA@moriarty.gnomon.org.uk> <4.2.0.58.J.20030214110907.05bf4398@localhost> <1045254880.23456.TMDA@moriarty.gnomon.org.uk> <1045135861.15010.TMDA@moriarty.gnomon.org.uk> <1045135666.14953.TMDA@moriarty.gnomon.org.uk> <1045135861.15010.TMDA@moriarty.gnomon.org.uk> <1045135666.14953.TMDA@moriarty.gnomon.org.uk> <4.2.0.58.J.20030214110907.05bf4398@localhost> <20030215001538.GE4500@nicemice.net> <18991408.1045253546@p3.JCK.COM> <81361551.1045315916@p3.JCK.COM> Date: Sat, 15 Feb 2003 20:46:54 +0000 Message-ID: <1045342014.29626.TMDA@moriarty.gnomon.org.uk> From: Roy Badami X-Delivery-Agent: TMDA/0.65 (Johnstown) X-Primary-Address: roy@gnomon.org.uk Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: > Whether one uses an IDNA-like approach, or something of > the general character of 8BITADDRESSES (I suspect I'm > going fairly quickly come to hate that term), we don't > want to disrupt local-address opacity. Consequently, > anything that is to be done with subaddresses --or any > other special-purpose parsing of the local-part into > subdivisions for differential processing-- must be > done before encoding on the sending side and processed > post-decoding on the receiving side. > > [...] > >If we can actually agree on that, then a good many of the recent >discussions become irrelevant, and we can move on to other topics. John and I agree on this; if others do as well, the latter part of that sentence becomes true. I've moved from strongly disagreeing to sitting on the fence to feeling like I'm going to fall off the fence on your side. 
I hope the grass is nice and soft on your side of the fence :) -roy From owner-ietf-imaa Sat Feb 15 13:04:43 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1FL4hs10759 for ietf-imaa-bks; Sat, 15 Feb 2003 13:04:43 -0800 (PST) Received: from bs.jck.com (ns.jck.com [209.187.148.211]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1FL4gd10755 for ; Sat, 15 Feb 2003 13:04:42 -0800 (PST) Received: from [209.187.148.215] (helo=p3.JCK.COM) by bs.jck.com with esmtp (Exim 4.10) id 18k9UF-000IxP-00; Sat, 15 Feb 2003 16:04:39 -0500 Date: Sat, 15 Feb 2003 16:04:39 -0500 From: John C Klensin To: Marc Mutz cc: "ietf-imaa@imc.org" Subject: Re: A couple of comments on the open issues... Message-ID: <90524747.1045325079@p3.JCK.COM> In-Reply-To: <200302152040.05763@sendmail.mutz.com> References: <1045135861.15010.TMDA@moriarty.gnomon.org.uk> <20030215001538.GE4500@nicemice.net> <18991408.1045253546@p3.JCK.COM> <200302152040.05763@sendmail.mutz.com> X-Mailer: Mulberry/3.0.0 (Win32) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: --On Saturday, 15 February, 2003 20:41 +0100 Marc Mutz wrote: > On Saturday 15 February 2003 02:12, John C Klensin wrote: > >> There are no standards at all for interpreting the local part. >> The standard is "no one but the delivery MTA can try to >> interpret the local part in any way". > > > MIXER (rfc 2156) specifies a "sub-syntax" for local-parts. The > interpretation is done by the X.400<->rfc(2)822 gateways, not > the final delivery MTA. > > MIXER is standards-track. Marc, we could get into an interesting, and quite lengthy, discussion of hairsplitting definitions here. I recommend against it. Just as Dan did not understand "interpreting" the way I intended it, you are oversimplifying the concept of a delivery MTA. 
I perhaps should not have said "no one", or I should have made the (intended) contextual restriction to the Internet/SMTP mail transport environment more explicit, but I am (believe it or not) trying to keep these notes from running on forever.

The relevant text from 2821 is section 2.3.8, part of which says

   A "gateway" SMTP system (usually referred to just as a "gateway") receives mail from a client system in one transport environment and transmits it to a server system in another transport environment. Differences in protocols or message semantics between the transport environments on either side of a gateway may require that the gateway system perform transformations to the message that are not permitted to SMTP relay systems.

That is an uncomfortable, but necessary, blanket exception for gateway systems to do anything at all that they need to do. MIXER is, at least, a gateway between an Internet/SMTP mail transport environment and an X.400 one, so it is covered by that exception.

If one _really_ wants to split hairs, it probably doesn't even need the exemption: The assortment of BITNET, ccMAIL, FidoNet, Profs, UUCP mail, etc., etc., gateways we had floating around several years ago (I suspect that few, if any, of them have gone completely away) were typically defined as SMTP<->foo gateways, regardless of what they did once they got hold of the mail. But MIXER is, as you say, defined largely as an [2]822 <-> X.400 gateway. As such, it is acting either before the first ("originating") SMTP MTA gets hold of the message, or it is acting after handoff from an MTA at the receiving end. And the MTA that hands things off to an MUA or a message store so that the 822 processing becomes relevant _is_ a "delivery MTA".

On a side note, I think I can assure you that the much-abused editor of 2821 was quite aware of the gateway situation generally, and of MIXER, RFC1327, and other predecessors, when that section was written, as were most of the WG, the relevant ADs, etc.
You might even say "painfully aware". If the text isn't clear enough, I'd welcome suggestions for clarification -- the list of clarification suggestions I still have to retrofit before posting the first 2821bis I-D is shrinking, and I'd much rather deal with suggestions now than later.

(A note to anyone else reading this: messages suggesting changes that will make the text more clear, especially when they propose specific text, are being incorporated to the degree possible. Messages explaining that 2821, or SMTP generally, are broken, or insisting that it should work differently, or claiming that the 2821 text didn't actually represent WG consensus, are being filed for passing off to the ADs and former WG Chair, whom I intend to let make the decisions as to which of the issues are worth [re]opening. But my filing systems are not good, and some of those messages may get lost (or might have been lost in one of a series of disk crashes last year) or not noticed when the recipients go through the pile, so I would really suggest holding them until the ADs and Chair are ready for them -- or sending them to them, not me.)
:-( john From owner-ietf-imaa Sat Feb 15 13:43:25 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1FLhPD11600 for ietf-imaa-bks; Sat, 15 Feb 2003 13:43:25 -0800 (PST) Received: from mail.epost.de (mail.epost.de [193.28.100.164] (may be forged)) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1FLhMd11591 for ; Sat, 15 Feb 2003 13:43:22 -0800 (PST) Received: from [217.82.82.73] (217.82.82.73) by mail.epost.de (6.7.015) (authenticated as Marc.Mutz@epost.de) id 3E41A7C700110AD6 for ietf-imaa@imc.org; Sat, 15 Feb 2003 22:43:24 +0100 From: Marc Mutz Organization: KDE To: ietf-imaa@imc.org Subject: Re: Subaddressing Date: Sat, 15 Feb 2003 22:23:40 +0100 User-Agent: KMail/1.5.9 References: <1045135861.15010.TMDA@moriarty.gnomon.org.uk> <81361551.1045315916@p3.JCK.COM> In-Reply-To: <81361551.1045315916@p3.JCK.COM> X-PGP-Key: 0xBDBFE838 MIME-Version: 1.0 Content-Type: multipart/signed; protocol="application/pgp-signature"; micalg=pgp-sha1; boundary="Boundary-02=_d/qT+jNKGmB3+Ji"; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <200302152223.41136@sendmail.mutz.com> Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: --Boundary-02=_d/qT+jNKGmB3+Ji Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Content-Description: signed data Content-Disposition: inline On Saturday 15 February 2003 19:31, John C Klensin wrote: > Whether one uses an IDNA-like approach, or something of > the general character of 8BITADDRESSES (I suspect I'm > going fairly quickly come to hate that term), we don't > want to disrupt local-address opacity. Consequently, > anything that is to be done with subaddresses --or any > other special-purpose parsing of the local-part into > subdivisions for differential processing-- must be > done before encoding on the sending side and processed > post-decoding on the receiving side. 
One very notable feature of IDNA is that it requires no modification whatsoever on the server side.

OK, let this sink in.

Now, why should IMAA? I'd like to lay out why it _shouldn't_.

I haven't followed the IDNA discussions, so I don't know how much of an issue server security and scalability had on the design of IDNA, but, looking back, IDNA does a very good job of keeping complexity out of the server. The complexity is completely contained in the clients, which results in better scalability, since the clients run on over-(cpu)-powered PCs, and in better server security, since the servers don't change at all.

So the requirements to balance are
1. keeping complexity out of the server for the reasons stated above
2. keeping the concept of opaque local-parts in all its strictness.

Why are those requirements in conflict, so that they need to be balanced?

Because strictly opaque local-parts require the interpretation to happen after the ToUnicode operation, which increases the complexity of the server (US-ASCII/char*->UTF-8/wchar_t transition).

The reverse also holds: Keeping complexity out of the server requires (partial) interpretation of the local-part (splitting at a set of characters) to be performed before the ToUnicode operation.

I assert - and the IDNA design agrees with me here - that it is a great deal more important to keep complexity out of the server than to keep the strict opaque local-part concept.

Marc

--
"Similia similibus curentur" -- Bush's new motto in fighting terrorism.
--Boundary-02=_d/qT+jNKGmB3+Ji Content-Type: application/pgp-signature Content-Description: signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.1 (GNU/Linux) iD8DBQA+Tq/d3oWD+L2/6DgRAtWtAKCFolNmJwZmUATJUaqV/Ft4WtUqBACdEEwD iU4Yrh9mgoOgyrWyWbDR6Yo= =Abvx -----END PGP SIGNATURE----- --Boundary-02=_d/qT+jNKGmB3+Ji-- From owner-ietf-imaa Sat Feb 15 13:43:25 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1FLhP411599 for ietf-imaa-bks; Sat, 15 Feb 2003 13:43:25 -0800 (PST) Received: from mail.epost.de (mail.epost.de [193.28.100.164] (may be forged)) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1FLhMd11592 for ; Sat, 15 Feb 2003 13:43:22 -0800 (PST) Received: from [217.82.82.73] (217.82.82.73) by mail.epost.de (6.7.015) (authenticated as Marc.Mutz@epost.de) id 3E41A7C700110AD2 for ietf-imaa@imc.org; Sat, 15 Feb 2003 22:43:19 +0100 From: Marc Mutz Organization: KDE To: ietf-imaa@imc.org Subject: Re: Can we back up a bit and ask some basic questions? 
Analternate model Date: Sat, 15 Feb 2003 21:58:41 +0100 User-Agent: KMail/1.5.9 References: <1045323640.28250.TMDA@moriarty.gnomon.org.uk> In-Reply-To: <1045323640.28250.TMDA@moriarty.gnomon.org.uk> X-PGP-Key: 0xBDBFE838 MIME-Version: 1.0 Message-Id: <200302152158.57661@sendmail.mutz.com> Cc: Reply-To: Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Content-Description: Content-Disposition: inline Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: On Saturday 15 February 2003 16:40, Roy Badami wrote: > But I'd urge you to consider the four scenarios I just put forward in > the thread "What is IMAA: some scenarios for deployment" I'd like to extend this to include IMAA-ACE-opaque, IMAA-ACE-split and IMAA-UTF8: > Scenario 1a: (ISP IMAA-unaware, user want to use IMAs with her local MUA): > with IMAA there's a reasonable hope of basic support > with just an updated mail client, and better support with minor > updates to the ISPs sign-up systems. With UTF8ADDRESSES this will > require in addition a major upgrade to the ISP's mail infrastructure. IMAA-ACE-* are no different here. > Scenario 1b: (1a with webmail) > IMAA and UTF8ADDRESSES both require a major upgrade to > the ISP's infrastructure. IMAA-ACE-* yes, IMAA-UTF8 probably not. It might suffice to add or change to in the delivered html page. > Scenario 2a: (IMAA-unaware ISP and company mail server, IMA/IDN use) > IMAA requires only an upgrade to the mail clients; > UTF8ADDRESSES requires an upgrade to the clients, the organization's > MTA, and the ISP's mail infrastructure (so that the backup MX will > continue to work). No change here, too. > Scenario 2b: IMAA requires upgrades to the mail clients and, as > currently specified in the draft, an upgrade to the organization's > MTA (though the need to upgrade the MTA might disappear depending on > the design decisions we take in IMAA). 
UTF8ADDRESSES requires > upgrades to the clients, the MTA and the ISP's infrastructure. IMAA-ACE-opaque requires that. IMAA-ACE-split requires no modification on the server side, just like 2a. Marc -- 'When you see the ping of death, duck and cover.' -- Bruce Schneier, Crypto-Gram Oct 2002 From owner-ietf-imaa Sat Feb 15 13:46:35 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1FLkZu11648 for ietf-imaa-bks; Sat, 15 Feb 2003 13:46:35 -0800 (PST) Received: from bs.jck.com (ns.jck.com [209.187.148.211]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1FLkXd11644 for ; Sat, 15 Feb 2003 13:46:34 -0800 (PST) Received: from [209.187.148.215] (helo=p3.JCK.COM) by bs.jck.com with esmtp (Exim 4.10) id 18kA8q-000J1V-00; Sat, 15 Feb 2003 16:46:36 -0500 Date: Sat, 15 Feb 2003 16:46:35 -0500 From: John C Klensin To: Roy Badami cc: "ietf-imaa@imc.org" Subject: Re: Can we back up a bit and ask some basic questions? Analternate model Message-ID: <93041416.1045327595@p3.JCK.COM> In-Reply-To: <1045337541.29302.TMDA@moriarty.gnomon.org.uk> References: <1045331284.28770.TMDA@moriarty.gnomon.org.uk> <1045337541.29302.TMDA@moriarty.gnomon.org.uk> X-Mailer: Mulberry/3.0.0 (Win32) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: --On Saturday, 15 February, 2003 19:32 +0000 Roy Badami wrote: > I'm not sure that you understand what I was proposing. A > better analogy would be to pose the question: would it have > been a good thing for ESMTP and 8BITMIME to have been defined > concurrently with the base MIME standards? (That isn't > intended as a loaded question; "No, it wouldn't have made any > difference" is a perfectly valid answer.) How would "basically, they were" grab you as an answer? 
> I'm proposing that the two solutions are defined as part of an > encompassing IMA Architecture, not independently without > regard to interoperability. > > This is what I imagine the world will be like a few years from > now: > > All IMA-aware systems will support IMAA (this will be a > mandatory part of some IMA Architecture standard). > > Many systems, particularly those in parts of the world where > IMAs are popular, will use ESMTP extensions to exchange IMAs > in native UTF-8, and to exchange messages in an extended > format that allows native UTF-8 addresses in the headers. > > Systems that receive a message with UTF-8 addresses and need > to relay it to a system that doesn't support the requisite > ESMTP extensions will need to apply ToASCII to both the > envelope and header addresses before forwarding the message. > This is analogous to the (admitedly inconsistently > implemented) requirement that a system which receives an > 8BITMIME message converts it to a suitable 7-bit encoding if > the destination system doesn't support 8BITMIME. >... Ok, I see where you are headed. Let me try to summarize many months of moaning in the IDN WG, plus some email experience, including with 8BITMIME downgrading, which I'm glad you cited. Disclaimer: unlike Paul and Adam, I'm not an IDNA co-author, and am widely believed even be an IDNA-hater (not true), so you don't get to assume that I'm biased in favor of IDNA-derived solutions. * UTF-8, while more common and better known than punycode, is really not a very efficient encoding, especially for Asian languages. Indeed, under a number of conditions, it is a less efficient encoding. So, other than aesthetics and the belief that large benefits will accrue from its being closer to the internal form used by several (many?) systems, there is no really strong case to using it instead of punycode. 
* There are more efficient encodings than either, and they use all eight bits of octets, but they are even more strange (less familiar and less used in other places) than punycode. Several of them are members of the "start from 16-bit UCS-2 Unicode (or 32-bit UCS-4 10646) strings and compress" family.

* Where we have two ways to do something, bad things often happen. Paul identified one of them -- industry looks at the two possibilities, throws up its collective hands about interoperability and either does nothing or does something that won't interwork with many systems. The other is that they get mixed up. The scenario you outline is a nearly-guaranteed recipe (as the complexities of 8BITMIME downgrading have been) for an over-clever MTA author to say "if I can send IMAA without negotiation, and negotiation fails, I can either go to all that downgrading trouble, which might not work anyway, or I can just send the 8bit stuff, which might get through. The latter is a lot less work, so..."

If negotiation is needed, then we should negotiate, regardless of the agreed mail transport format. If the best mail transport format is punycode, we should use it, whether the transport environment permits 8bit or not. But alternate ways to do the same thing, especially when they don't provide significantly different functionality, tend to cause far more problems than they are worth.

I covered another aspect of this in an off-list note to you and Paul a short time ago. If either of you believes that it contains any profound insights that would be helpful to others, please feel free to forward it to the list.
john From owner-ietf-imaa Sat Feb 15 14:12:27 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1FMCRu12107 for ietf-imaa-bks; Sat, 15 Feb 2003 14:12:27 -0800 (PST) Received: from mail.epost.de ([193.28.100.187]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1FMCPd12103 for ; Sat, 15 Feb 2003 14:12:25 -0800 (PST) Received: from [217.82.82.73] (217.82.82.73) by mail.epost.de (6.7.015) (authenticated as Marc.Mutz@epost.de) id 3E41A8510010C568 for ietf-imaa@imc.org; Sat, 15 Feb 2003 23:12:27 +0100 From: Marc Mutz Organization: KDE To: ietf-imaa@imc.org Subject: Re: Subaddressing Date: Sat, 15 Feb 2003 22:55:29 +0100 User-Agent: KMail/1.5.9 References: <1045135861.15010.TMDA@moriarty.gnomon.org.uk> <81361551.1045315916@p3.JCK.COM> <200302152223.41136@sendmail.mutz.com> In-Reply-To: <200302152223.41136@sendmail.mutz.com> X-PGP-Key: 0xBDBFE838 MIME-Version: 1.0 Content-Type: multipart/signed; protocol="application/pgp-signature"; micalg=pgp-sha1; boundary="Boundary-02=_hdrT+Ttq3TZ7FdO"; charset="us-ascii" Content-Transfer-Encoding: 7bit Message-Id: <200302152255.46332@sendmail.mutz.com> Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: --Boundary-02=_hdrT+Ttq3TZ7FdO Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Content-Description: signed data Content-Disposition: inline On Saturday 15 February 2003 22:23, Marc Mutz wrote: > So the requirements to balance are > 1. keeping complexity out of the server for the reasons stated above > 2. keeping the concept of opaque local-parts in all it's strictness. In the light of John's recent mail, I should add that I'm talking about=20 keeping or not keeping the concept of opaque local-parts /for IMAs/,=20 iow, whether or not to _extend_ the concept to ILPs. I'm specifically not trying to start a discussion whether or not opaque=20 local-parts as currently used are a bad idea - they're not. 
Apologies=20 if my mail came across like that. Marc =2D-=20 [Norton SystemWorks 2002] Wipe Info uses hexadecimal values to wipe files. This provides more security than wiping with decimal values. -- Norton SystemWorks 2002 Manual, p.160 (seen on Cryptogram 12/01) --Boundary-02=_hdrT+Ttq3TZ7FdO Content-Type: application/pgp-signature Content-Description: signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.1 (GNU/Linux) iD8DBQA+Trdh3oWD+L2/6DgRAtBoAJ9dZm6z34AxPYU1dvR7yG9k09tqlgCeOueJ VVaIwlCWQxRAd/Q/gYqhegM= =XgPC -----END PGP SIGNATURE----- --Boundary-02=_hdrT+Ttq3TZ7FdO-- From owner-ietf-imaa Sat Feb 15 14:17:43 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1FMHhH12214 for ietf-imaa-bks; Sat, 15 Feb 2003 14:17:43 -0800 (PST) Received: from moriarty.gnomon.org.uk (pc4-cmbg2-5-cust162.cmbg.cable.ntl.com [81.100.86.162]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1FMHfd12210 for ; Sat, 15 Feb 2003 14:17:42 -0800 (PST) Received: from moriarty.gnomon.org.uk (roy@localhost [127.0.0.1]) by moriarty.gnomon.org.uk (8.12.3/8.12.3/Debian -4) with ESMTP id h1FMIBKA030485 for ; Sat, 15 Feb 2003 22:18:12 GMT MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <15950.48285.727512.858157@moriarty.gnomon.org.uk> Date: Sat, 15 Feb 2003 22:18:05 +0000 To: John C Klensin Cc: Roy Badami , "ietf-imaa@imc.org" Subject: Re: Can we back up a bit and ask some basic questions? Analternate model In-Reply-To: <93041416.1045327595@p3.JCK.COM> References: <1045331284.28770.TMDA@moriarty.gnomon.org.uk> <1045337541.29302.TMDA@moriarty.gnomon.org.uk> <93041416.1045327595@p3.JCK.COM> X-Mailer: VM 7.03 under Emacs 20.7.2 From: Roy Badami X-Delivery-Agent: TMDA/0.65 (Johnstown) X-Primary-Address: roy@gnomon.org.uk Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: > > I'm not sure that you understand what I was proposing. 
A
> > better analogy would be to pose the question: would it have
> > been a good thing for ESMTP and 8BITMIME to have been defined
> > concurrently with the base MIME standards? (That isn't
> > intended as a loaded question; "No, it wouldn't have made any
> > difference" is a perfectly valid answer.)
>
> How would "basically, they were" grab you as an answer?

Thanks, that's interesting. I'd always assumed that 8BITMIME was defined much later (probably because MTAs get updated more slowly than MUAs, so I first encountered it much later).

For the record, I'm largely happy with both IDNA and the base IMAA propo

> * UTF-8, while more common and better known than
> punycode, is really not a very efficient encoding [...]
> * There are more efficient encodings than either [...]

I have to say that I don't believe coding efficiency is incredibly important to e-mail, particularly coding efficiency of addresses (except insofar as we need to allow useful IMAs within existing protocols that contain length restrictions).

The main reason why I think this will inevitably happen in the future (regardless of whether this forum mandates it) is that in the long term we will move to a message body which (by default) is just a block of UTF-8, with no requirements for special coding in any headers or in the body. Once this happens, and punycode is unnecessary within the message, it would seem to me to make sense to eliminate it from (2)821, too. I guess this is outside the scope of the IMA list, however...

> over-clever MTA author to say "if I can send IMAA
> without negotiation, and negotiation fails, I can either
> go to all that downgrading trouble, which might not work
> anyway, or I can just send the 8bit stuff, which might
> get through. The latter is a lot less work, so..."

I'm not sure that the situations are comparable. The reason that MTA authors do this with 8-bit to 7-bit conversion is partly that it almost invariably works, so they can get away with it.
Indeed, Dan Bernstein has some arguments (that I don't entirely agree with) that it works _better_ than following the RFCs (ie at least one author of a major MTA made a considered decision to disregard this particular requirement because in his opinion his approach interoperated better with the rest of the Internet). I don't believe the situation would be the same with IMAs -- I strongly suspect that just-send-8 for RFC-(2)821 commands and RFC-(2)822 headers simply won't work in most cases... -roy From owner-ietf-imaa Sat Feb 15 14:27:27 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1FMRRj12357 for ietf-imaa-bks; Sat, 15 Feb 2003 14:27:27 -0800 (PST) Received: from moriarty.gnomon.org.uk (pc4-cmbg2-5-cust162.cmbg.cable.ntl.com [81.100.86.162]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1FMRPd12353 for ; Sat, 15 Feb 2003 14:27:26 -0800 (PST) Received: from moriarty.gnomon.org.uk (roy@localhost [127.0.0.1]) by moriarty.gnomon.org.uk (8.12.3/8.12.3/Debian -4) with ESMTP id h1FMRuKA030537 for ; Sat, 15 Feb 2003 22:27:56 GMT MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <15950.48872.86946.619159@moriarty.gnomon.org.uk> Date: Sat, 15 Feb 2003 22:27:52 +0000 To: ietf-imaa@imc.org cc: roy@gnomon.org.uk Subject: Re: Can we back up a bit and ask some basic questions? Analternate model In-Reply-To: <200302152158.57661@sendmail.mutz.com> References: <1045323640.28250.TMDA@moriarty.gnomon.org.uk> <200302152158.57661@sendmail.mutz.com> X-Mailer: VM 7.03 under Emacs 20.7.2 From: Roy Badami X-Delivery-Agent: TMDA/0.65 (Johnstown) X-Primary-Address: roy@gnomon.org.uk Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: > > Scenario 1b: > (1a with webmail) > > IMAA and UTF8ADDRESSES both require a major upgrade to > > the ISP's infrastructure. > > IMAA-ACE-* yes, IMAA-UTF8 probably not. 
It might suffice to add or > change to > > in the delivered html page. I'm not sure how IMAA-UTF8 differs from John Klensin's UTF8ADDRESSES proposal, but this would appear to apply equally to both. I'm not sure I believe that it really would be that simple to upgrade web mail systems to support UTF-8 addresses, but it's an interesting suggestion. > > Scenario 2b: IMAA requires upgrades to the mail clients and, as > > currently specified in the draft, an upgrade to the organization's > > MTA (though the need to upgrade the MTA might disappear depending on > > the design decisions we take in IMAA). UTF8ADDRESSES requires > > upgrades to the clients, the MTA and the ISP's infrastructure. > > > IMAA-ACE-opaque requires that. > IMAA-ACE-split requires no modification on the server side, just like > 2a. That's what I intended by saying that the requirement to upgrade the MTA might disappear depending on the design decisions we make in IMAA (splitting is an open issue within IMAA). -roy From owner-ietf-imaa Sat Feb 15 14:43:31 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1FMhVW12620 for ietf-imaa-bks; Sat, 15 Feb 2003 14:43:31 -0800 (PST) Received: from bs.jck.com (ns.jck.com [209.187.148.211]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1FMhUd12616 for ; Sat, 15 Feb 2003 14:43:30 -0800 (PST) Received: from [209.187.148.215] (helo=p3.JCK.COM) by bs.jck.com with esmtp (Exim 4.10) id 18kB1w-000J8a-00; Sat, 15 Feb 2003 17:43:32 -0500 Date: Sat, 15 Feb 2003 17:43:32 -0500 From: John C Klensin To: Marc Mutz cc: "ietf-imaa@imc.org" Subject: Re: Subaddressing Message-ID: <96457829.1045331012@p3.JCK.COM> In-Reply-To: <200302152255.46332@sendmail.mutz.com> References: <1045135861.15010.TMDA@moriarty.gnomon.org.uk> <81361551.1045315916@p3.JCK.COM> <200302152223.41136@sendmail.mutz.com> <200302152255.46332@sendmail.mutz.com> X-Mailer: Mulberry/3.0.0 (Win32) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed 
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Sender: owner-ietf-imaa@mail.imc.org
Precedence: bulk
List-Archive:
List-Unsubscribe:
List-ID:

--On Saturday, 15 February, 2003 22:55 +0100 Marc Mutz wrote:

> On Saturday 15 February 2003 22:23, Marc Mutz wrote:
>> So the requirements to balance are
>> 1. keeping complexity out of the server for the reasons
>> stated above
>> 2. keeping the concept of opaque local-parts in
>> all it's strictness.
>
> In the light of John's recent mail, I should add that I'm
> talking about keeping or not keeping the concept of opaque
> local-parts /for IMAs/, iow, whether or not to _extend_ the
> concept to ILPs.
>
> I'm specifically not trying to start a discussion whether or
> not opaque local-parts as currently used are a bad idea -
> they're not. Apologies if my mail came across like that.

I think we are getting close to figuring out where we agree or disagree, which is progress even if we can't agree.

When I read the above, my reaction is "MTAs don't read minds". You presumably don't intend that they do, so I probably don't understand your suggestion.

Unless there is an external negotiation, or something more drastic, both traditional local parts and ones containing i18n strings are going to enter and leave MTAs in Forward-path and Reverse-path fields containing a Mailbox. If the presence of i18n material is not identified to the MTA, then any information that identifies that a particular address is an IMA must be embedded in the Local-part (obviously, an ACE prefix is such an embedded indication). However, if the Local-part of the relevant Mailbox is opaque, then the MTA can't tell whether what is in it is a traditional (opaque) local part or an internationalized one. In the limiting case, if the MTA peeks, it still can't be guaranteed that IESG--string is an IMA rather than some sort of traditional address or instruction. After all, it is just an ASCII string that obeys the quoting rules.
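
As a rough illustration of the "peeking" problem just described, here is a minimal Python sketch. It assumes a purely hypothetical ACE prefix "iesg--" (borrowed from the "IESG--string" example, not an assigned prefix), a hypothetical helper name looks_like_ace, and case-insensitive prefix matching, which is itself an assumption. It only shows that a prefix test on the ASCII string cannot distinguish an encoded IMA from a traditional local part that happens to begin the same way.

```python
# Hypothetical placeholder prefix, borrowed from the "IESG--string"
# example in the text; it is NOT an assigned ACE prefix.
ACE_PREFIX = "iesg--"


def looks_like_ace(local_part: str) -> bool:
    """Return True if the local part starts with the placeholder prefix.

    This is all an MTA could learn by peeking: the test inspects the
    ASCII string only and knows nothing about what the sender meant.
    Case-insensitive matching is an assumption of this sketch.
    """
    return local_part.lower().startswith(ACE_PREFIX)


# A local part a client might have produced by ACE-encoding an
# internationalized name (illustrative value only).
encoded_ima = "iesg--bcher-kva"

# A traditional ASCII local part a site might already be using;
# nothing in the existing syntax forbids this spelling.
traditional = "iesg--minutes"

for candidate in (encoded_ima, traditional, "postmaster"):
    print(candidate, looks_like_ace(candidate))
```

Both of the first two candidates pass the test, which is the ambiguity at issue: the prefix is a hint, not proof.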
So, I don't see how you can have non-opaque IMAs without making traditional local-parts non-opaque. Unless, again, there is some sort of handshake at the SMTP level that tells the MTA what is an IMA and what isn't. What am I missing? john From owner-ietf-imaa Sat Feb 15 15:12:52 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1FNCqd13115 for ietf-imaa-bks; Sat, 15 Feb 2003 15:12:52 -0800 (PST) Received: from mail.epost.de (mail.epost.de [193.28.100.167] (may be forged)) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1FNCod13109 for ; Sat, 15 Feb 2003 15:12:50 -0800 (PST) Received: from [217.82.82.73] (217.82.82.73) by mail.epost.de (6.7.015) (authenticated as Marc.Mutz@epost.de) id 3E4D1FC10001B9B7 for ietf-imaa@imc.org; Sun, 16 Feb 2003 00:12:52 +0100 From: Marc Mutz Organization: KDE To: ietf-imaa@imc.org Subject: Re: Can we back up a bit and ask some basic questions? Analternate model Date: Sat, 15 Feb 2003 23:55:35 +0100 User-Agent: KMail/1.5.9 References: <1045323640.28250.TMDA@moriarty.gnomon.org.uk> <200302152158.57661@sendmail.mutz.com> <15950.48872.86946.619159@moriarty.gnomon.org.uk> In-Reply-To: <15950.48872.86946.619159@moriarty.gnomon.org.uk> X-PGP-Key: 0xBDBFE838 MIME-Version: 1.0 Content-Type: multipart/signed; protocol="application/pgp-signature"; micalg=pgp-sha1; boundary="Boundary-02=_yVsT+7Vs6mU4HLw"; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <200302152355.46847@sendmail.mutz.com> Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: --Boundary-02=_yVsT+7Vs6mU4HLw Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Content-Description: signed data Content-Disposition: inline On Saturday 15 February 2003 23:27, Roy Badami wrote: > I'm not sure how IMAA-UTF8 differs from John Klensin's UTF8ADDRESSES > proposal, They're the same. > but this would appear to apply equally to both. 
I'm not > sure I believe that it really would be that simple to upgrade web > mail systems to support UTF-8 addresses, but it's an interesting > suggestion. It's _potentially_ that simple, not necessarily. OTOH, punycode necessarily is _not_ that simple. Just an observation. UTF-8 has other problems, that others have already=20 mentioned. Marc =2D-=20 If privacy is outlawed, only outlaws will have privacy. -- Phil Zimmermann --Boundary-02=_yVsT+7Vs6mU4HLw Content-Type: application/pgp-signature Content-Description: signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.1 (GNU/Linux) iD8DBQA+TsVy3oWD+L2/6DgRAv3/AKD9ULs9OIYi5uie/qeW+Cmo29s6RgCgxKvI f1r+S7Xb9Bg1nm81sVB2tTk= =LbEP -----END PGP SIGNATURE----- --Boundary-02=_yVsT+7Vs6mU4HLw-- From owner-ietf-imaa Sat Feb 15 15:56:33 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1FNuX113784 for ietf-imaa-bks; Sat, 15 Feb 2003 15:56:33 -0800 (PST) Received: from [63.202.92.157] (adsl-63-202-92-157.dsl.snfc21.pacbell.net [63.202.92.157]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1FNuRd13780; Sat, 15 Feb 2003 15:56:27 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: In-Reply-To: <200302152223.41136@sendmail.mutz.com> References: <1045135861.15010.TMDA@moriarty.gnomon.org.uk> <81361551.1045315916@p3.JCK.COM> <200302152223.41136@sendmail.mutz.com> X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . 
Date: Sat, 15 Feb 2003 14:39:10 -0800
To: Marc Mutz , ietf-imaa@imc.org
From: Paul Hoffman / IMC
Subject: Re: Subaddressing
Content-Type: text/plain; charset="us-ascii" ; format="flowed"
Sender: owner-ietf-imaa@mail.imc.org
Precedence: bulk
List-Archive:
List-Unsubscribe:
List-ID:

At 10:23 PM +0100 2/15/03, Marc Mutz wrote:
>I haven't followed the IDNA discussions, so I don't know how much of an
>issue server security and scalability had on the design of IDNA, but,
>looking back, IDNA does a very good job of keeping complexity out of
>the server.

"very good" = complete

>Because strictly opaque local-parts require the interpretation to happen
>after the ToUnicode operation, which increases the complexity of the
>server (US-ASCII/char*->UTF-8/wchar_t transition).

Whoa. Server interpretation of IMAA-ACE is only required when the server is already interpreting the LHS, that is, when the server is making the LHS non-opaque. Server use of ToUnicode is strictly optional.

>I assert - and the IDNA design agrees with me here - that it is a great
>deal more important to keep complexity out of the server than to keep
>the strict opaque local-part concept.

Please do not say things like "the IDNA design agrees with me here". The IDNA design had nothing to do with keeping complexity out of servers: it was about making no changes to servers. That is the same design principle that we used with IMAA-ACE.
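
To make the two IMAA-ACE variants discussed in this thread concrete (IMAA-ACE-opaque versus IMAA-ACE-split), here is a minimal Python sketch. It rests on several assumptions that are not taken from draft-hoffman-imaa: a placeholder prefix "iesg--", Python's built-in punycode codec as a stand-in for ToASCII (with no Stringprep step), "+" as the only subaddress delimiter, and hypothetical function names.

```python
ACE_PREFIX = "iesg--"   # placeholder prefix, an assumption of this sketch
DELIMITER = "+"         # assumed subaddress delimiter; sites vary


def to_ace(text: str) -> str:
    """Stand-in for ToASCII: prefix plus punycode; pure ASCII passes through."""
    if text.isascii():
        return text
    return ACE_PREFIX + text.encode("punycode").decode("ascii")


def encode_opaque(local_part: str) -> str:
    """IMAA-ACE-opaque: encode the whole local part as a single unit."""
    return to_ace(local_part)


def encode_split(local_part: str) -> str:
    """IMAA-ACE-split: split at the delimiter first, then encode each piece."""
    pieces = local_part.split(DELIMITER)
    return DELIMITER.join(to_ace(piece) for piece in pieces)


address = "j\u00fcrgen+lists"   # "jürgen+lists"

# What an unmodified server sees when it splits the ASCII form at "+":
print(encode_split(address).split(DELIMITER))
print(encode_opaque(address).split(DELIMITER))
```

In the split form the second piece is still the literal "lists", so existing "+" handling on an unmodified server would keep working. In the opaque form the ASCII pieces no longer line up with the user's subaddress, so a server that wants to interpret the local part would have to decode first; a server that treats the local part as opaque needs no change in either case.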
--Paul Hoffman, Director --Internet Mail Consortium From owner-ietf-imaa Sat Feb 15 16:08:00 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1G080714059 for ietf-imaa-bks; Sat, 15 Feb 2003 16:08:00 -0800 (PST) Received: from mail.epost.de (mail.epost.de [193.28.100.165] (may be forged)) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1G07wd14052 for ; Sat, 15 Feb 2003 16:07:58 -0800 (PST) Received: from [217.82.82.73] (217.82.82.73) by mail.epost.de (6.7.015) (authenticated as Marc.Mutz@epost.de) id 3E426BBE00136D3A for ietf-imaa@imc.org; Sun, 16 Feb 2003 01:08:01 +0100 From: Marc Mutz Organization: KDE To: ietf-imaa@imc.org Subject: Re: Subaddressing Date: Sun, 16 Feb 2003 00:50:54 +0100 User-Agent: KMail/1.5.9 References: <1045135861.15010.TMDA@moriarty.gnomon.org.uk> <200302152255.46332@sendmail.mutz.com> <96457829.1045331012@p3.JCK.COM> In-Reply-To: <96457829.1045331012@p3.JCK.COM> X-PGP-Key: 0xBDBFE838 MIME-Version: 1.0 Content-Type: multipart/signed; protocol="application/pgp-signature"; micalg=pgp-sha1; boundary="Boundary-02=_sJtT+3FtnwT/aia"; charset="us-ascii" Content-Transfer-Encoding: 7bit Message-Id: <200302160051.08385@sendmail.mutz.com> Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: --Boundary-02=_sJtT+3FtnwT/aia Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Content-Description: signed data Content-Disposition: inline On Saturday 15 February 2003 23:43, John C Klensin wrote: > --On Saturday, 15 February, 2003 22:55 +0100 Marc Mutz > wrote: > > In the light of John's recent mail, I should add that I'm > > talking about keeping or not keeping the concept of opaque > > local-parts /for IMAs/, iow, whether or not to _extend_ the > > concept to ILPs. > > > > I'm specifically not trying to start a discussion whether or > > not opaque local-parts as currently used are a bad idea - > > they're not. 
Apologies if my mail came across like that. > > I think we are getting close to figuring out where we agree or > disagree, which is progress even if we can't agree. :-) > When I read the above, my reaction is "MTAs don't read minds". > You presumably don't intend that they do, so I probably don't > understand your suggestion. There are two separate things here: 1. What I left of your quotation of my mails was a disclaimer intended to respond to your reply to my mail about MIXER. It was just to say that I don't intend to suggest changes to 2821bis this way. You shouldn't try to read too much into that. 2. The point of the first of the two mails you were replying to was - very condensed - : a. I assert that splitting the ILP[1] avoids the need to change servers, thus reducing complexity[2] b. I understood your point about local-parts being opaque and interpretation[3] of the local part to be performed after the ToUnicode operation as an argument _against_ splitting[1]. c. I assert that requiring the ToUnicode operation in servers to keep current functionality[4] working is needlessly increasing implementation complexity[2] d. As a consequence of a-c, I think that splitting[1] is preferable over keeping the local-part opaque[4] w.r.t. IMAA. [1] at non-alnum US-ASCII characters, and normalizing the delimiter chars to their US-ASCII equivalents. [2] where I assume that increased complexity decreases security and scalability. [3] subaddresses, etc, and I don't want to engage in a hair-splitting discussion about which of the M*As does actually interpret the local-part according to a local convention, either ;-) That's why I talked about "servers" (which for me also includes IMAP and whatnot). [4] i.e. processing it without prior splitting[1] Marc -- The DMCA is unconstitutional, but they don't care. Until it's ruled unconstitutional, they've won.
If they can scare software companies, ISPs, programmers, and T-shirt manufacturers [...] into submission, they've won for another day. The entertainment industry is fighting a holding action, and fear, uncertainty, and doubt are their weapons. We need to win this, and we need to win it quickly. Every day we don't win is a loss. -- Bruce Schneier, Crypto-Gram Aug 2001 --Boundary-02=_sJtT+3FtnwT/aia Content-Type: application/pgp-signature Content-Description: signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.1 (GNU/Linux) iD8DBQA+TtJs3oWD+L2/6DgRAk7eAJ0QITPQgRHnfAKvhvueVZWA3qWJMwCgpGcH k+Fc25v9jJ7Mbez8tchauzA= =vkLe -----END PGP SIGNATURE----- --Boundary-02=_sJtT+3FtnwT/aia-- From owner-ietf-imaa Sat Feb 15 16:08:51 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1G08pp14083 for ietf-imaa-bks; Sat, 15 Feb 2003 16:08:51 -0800 (PST) Received: from tux.w3.org (IDENT:root@tux.w3.org [18.29.0.27]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1G08od14077 for ; Sat, 15 Feb 2003 16:08:51 -0800 (PST) Received: from enoshima (IDENT:root@tux.w3.org [18.29.0.27]) by tux.w3.org (8.9.3/8.9.3) with ESMTP id TAA25845 for ; Sat, 15 Feb 2003 19:08:53 -0500 Message-Id: <4.2.0.58.J.20030215172311.0511a0c8@localhost> X-Sender: duerst@localhost X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J Date: Sat, 15 Feb 2003 17:59:17 -0500 To: ietf-imaa@imc.org From: Martin Duerst Subject: Updating RFC 2368 (mailto: URI) Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: This is an issue that I don't think has been raised yet. RFC 2368 (http://www.ietf.org/rfc/rfc2368.txt), "The mailto URL scheme" (Hoffman, Masinter, Zawinski) defines the details of the 'mailto:' URI scheme. We should think about how this is affected by IMAs. 
In particular, with IRIs, I think that users will expect that they can write Comments about my page (where IMA and IDN are non-ASCII characters). The IRI-to-URI conversion converts the above to mailto:%II%MM%AA@%II%DD%NN.tld where %II,... is the %-hex-escaped UTF-8 encoding of the original characters. So we have to extend the definition of mailto: URIs to include this case, and we have to make sure that this maps to IMAA. This mapping will turn out to be more or less complex depending on how IMAA looks. For IMAA-ACE, we will have to describe the details of the conversion (similar e.g. to those in http://www.ietf.org/rfc/rfc2192.txt), and they will have to be implemented. For 'UTF8ADDRESS' (I changed that from plural to singular, but I think that there is still room for improvement), it will be more straightforward. This of course would not prevent the ACE-initiated from using e.g. an ACEd IDN directly. But this would confuse the users. While we are at it, some more comments on RFC 2368 and internationalization. Here is some text that I found: 8-bit characters in mailto URLs are forbidden. MIME encoded words (as defined in [RFC2047]) are permitted in header values, but not for any part of a "body" hname. This is not completely clear. Is it forbidden to use 8-bit octets in mailto URIs in raw form? The general URI spec would definitely forbid this, so this would just be a repetition. Or is it forbidden to include such octets %-escaped? In that case, it would be impossible e.g. to create a mailto: URI that sends a PNG image to somebody (for example, not that that's frequently done). Regards, Martin.
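A minimal sketch of the IRI-to-URI step described in the message above, assuming Python's standard urllib.parse module and hypothetical helper names (mailto_iri_to_uri, mailto_uri_to_parts, domain_to_ace). It percent-escapes the UTF-8 bytes of each side of the address, and separately shows the ACE form of the domain using Python's built-in IDNA (2003) codec. How the local part would map to IMAA-ACE was still open in this thread, so that step is deliberately left out.

```python
from urllib.parse import quote, unquote


def mailto_iri_to_uri(local_part, domain):
    """Percent-escape the UTF-8 encoding of both sides of the address."""
    # safe="" so that reserved ASCII characters are escaped as well.
    return "mailto:" + quote(local_part, safe="") + "@" + quote(domain, safe="")


def mailto_uri_to_parts(uri):
    """Undo the escaping; returns (local_part, domain) as Unicode strings."""
    address = uri[len("mailto:"):]
    # A literal "@" inside the local part would have been escaped as %40,
    # so the last raw "@" separates local part and domain.
    local_part, _, domain = address.rpartition("@")
    return unquote(local_part), unquote(domain)


def domain_to_ace(domain):
    """ACE form of the domain only, via Python's IDNA 2003 codec."""
    return domain.encode("idna").decode("ascii")


uri = mailto_iri_to_uri("jos\u00e9", "b\u00fccher.example")
print(uri)                                  # mailto:jos%C3%A9@b%C3%BCcher.example
print(mailto_uri_to_parts(uri))             # ('josé', 'bücher.example')
print(domain_to_ace("b\u00fccher.example"))  # xn--bcher-kva.example
```

The escaped form is plain ASCII, so it satisfies the "no raw 8-bit octets" reading of RFC 2368; whether the %-escaped octets are also meant to be forbidden is exactly the question raised above.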
From owner-ietf-imaa Sat Feb 15 16:08:52 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1G08qa14093 for ietf-imaa-bks; Sat, 15 Feb 2003 16:08:52 -0800 (PST) Received: from tux.w3.org (IDENT:root@tux.w3.org [18.29.0.27]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1G08pd14080; Sat, 15 Feb 2003 16:08:51 -0800 (PST) Received: from enoshima (IDENT:root@tux.w3.org [18.29.0.27]) by tux.w3.org (8.9.3/8.9.3) with ESMTP id TAA25852; Sat, 15 Feb 2003 19:08:54 -0500 Message-Id: <4.2.0.58.J.20030215184409.05123db8@localhost> X-Sender: duerst@localhost X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J Date: Sat, 15 Feb 2003 19:07:46 -0500 To: Paul Hoffman / IMC , IETF IMAA list From: Martin Duerst Subject: Re: Can we back up a bit and ask some basic questions? An alternate model In-Reply-To: References: <18245836.1045252800@p3.JCK.COM> <18245836.1045252800@p3.JCK.COM> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 10:31 03/02/15 -0800, Paul Hoffman / IMC wrote: >>(a) We define a new SMTP extension. For purposes of discussion, let's >>call it UTF8ADDRESSES. > >Therefore IMAA-UTF8 cannot achieve wide use until this extension is nearly >universally adopted. When you get a business card from a colleague that >has an internationalized email address, you can probably assume that their >MTA supports UTF8ADDRESSES, but there is no way for you to know if your >own MTA supports it, so you don't know if you can send him or her mail. Well, I think we can distinguish the following main situations here: (taking Japanese as an example) 1) You don't know Japanese, so you won't even try. 2) You have such an address yourself, and if your MTA supports it on incoming mail, the chances are high it will work for outgoing mail. 
3) You have tried it for another, similar, address, and it failed, so you won't try it again unless you have good reasons to assume it might have improved. 4) You'll give it a try, and see what happens. You might also cc the mail to the older (ASCII-only) address to make sure. >With IMAA-UTF8, you always need to know that the originating MTA supports >UTF8ADDRESSES. If you are a road warrior, you would need to know if your >on-the-road ISP's MTAs support UTF8ADDRESSES before you could send mail to >an IMAA-UTF8 address. Of course, you have no control over this. Well, I have my 'on-the-road MTA', but I don't use its MTA. SMTP after POP and now SSH tunneling work fine for me, and wouldn't be affected. Similar for VPNs that many companies use these days for security reasons. Of course, individual mileage may vary. >With IMAA-UTF8, if a company has an MTA that supports UTF8ADDRESSES, and >they want to add a firewall that has an SMTP proxy, that proxy must >support UTF8ADDRESSES. Today, companies can safely add firewalls that have >SMTP proxies that have fewer ESMTP services than the MTA for which they front. Your last sentence can be interpreted in two ways. In the trivial way, such SMTP firewalls are always possible if the ESMTP service is not needed. That wouldn't be different for IMAA-UTF8. The other interpretation would raise some questions: Do you want to say that all functionality offered by current ESMTP options can be tunnelled over MTAs that don't offer these options? And do you claim that this is true not only on paper (as spec'ed), but also in real life (implementations)? Regards, Martin.
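A rough sketch of the capability question being argued here, assuming the placeholder keyword UTF8ADDRESSES from this thread (it is a discussion label, not a registered ESMTP extension) and hypothetical helper names. It parses the text of an EHLO reply and decides whether an address with non-ASCII characters could be handed to that server at all.

```python
def ehlo_keywords(reply):
    """Collect the extension keywords advertised in an EHLO reply."""
    keywords = set()
    lines = reply.splitlines()
    # The first reply line carries the server's domain and greeting,
    # not an extension keyword, so it is skipped.
    for line in lines[1:]:
        if not line.startswith("250") or len(line) < 5:
            continue
        # Drop the "250-" or "250 " prefix; the keyword is the first token.
        rest = line[4:].split()
        if rest:
            keywords.add(rest[0].upper())
    return keywords


def can_hand_over(address, keywords):
    """ASCII addresses always pass; others need the hypothetical extension."""
    return address.isascii() or "UTF8ADDRESSES" in keywords


reply = "250-mx.example.org Hello\r\n250-8BITMIME\r\n250-SIZE 10240000\r\n250 PIPELINING\r\n"
advertised = ehlo_keywords(reply)
print(can_hand_over("user@example.org", advertised))       # True
print(can_hand_over("jos\u00e9@example.org", advertised))  # False
```

The second result is the situation described for firewalls and on-the-road ISPs: a single hop that does not advertise the keyword blocks the address, whatever the final destination supports.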
From owner-ietf-imaa Sat Feb 15 16:08:54 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1G08sW14104 for ietf-imaa-bks; Sat, 15 Feb 2003 16:08:54 -0800 (PST) Received: from tux.w3.org (IDENT:root@tux.w3.org [18.29.0.27]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1G08rd14100 for ; Sat, 15 Feb 2003 16:08:53 -0800 (PST) Received: from enoshima (IDENT:root@tux.w3.org [18.29.0.27]) by tux.w3.org (8.9.3/8.9.3) with ESMTP id TAA25848; Sat, 15 Feb 2003 19:08:53 -0500 Message-Id: <4.2.0.58.J.20030215182950.03351550@localhost> X-Sender: duerst@localhost X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J Date: Sat, 15 Feb 2003 18:37:24 -0500 To: Roy Badami , John C Klensin From: Martin Duerst Subject: Re: Can we back up a bit and ask some basic questions?Analternate model Cc: "ietf-imaa@imc.org" In-Reply-To: <15950.48285.727512.858157@moriarty.gnomon.org.uk> References: <93041416.1045327595@p3.JCK.COM> <1045331284.28770.TMDA@moriarty.gnomon.org.uk> <1045337541.29302.TMDA@moriarty.gnomon.org.uk> <93041416.1045327595@p3.JCK.COM> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 22:18 03/02/15 +0000, Roy Badami wrote: >I have to say that I don't believe coding efficiency is incredibly >important to e-mail, particularly coding efficiently of addresses >(except insofar as we need to allow useful IMAs within existing >protocols that contain length restrictions). > >The main reason why I think this will inevitably happen in the future >(regardless of whether this forum mandates it) is that in the long >term we will move to a message body which (by default) is just a block >of UTF-8, I think this is the direction we are moving to, but not very quickly. Similar for the WWW, we are seeing more and more UTF-8, but again, not extremely quickly. >with no requirements for special coding in any headers or in >the body. 
Something like Content-Type: text/plain;charset=utf-8 will be present for a VERY long time. But maybe that's not what you mean by encoding. >Once this happens, and punycode is unnecessary within the >message, it would seem to me to make sense to eliminate it from >(2)821, too. I think this is one way to argue, but a) I don't think there is any plan for using ACE explicitly within the message body (it can always be used, but it will be just a random sequence of ASCII letters); b) The motivation for uniform encoding is much stronger in the headers than in the body (I'm very happy that nobody has brought up proposals yet for using a variety of legacy encodings, with labeling, in the header); c) If we think we have a good feel about where we are going, then it may be a lot cheaper to try to go there faster and on the most direct way we can find rather than waste time. Regards, Martin. From owner-ietf-imaa Sat Feb 15 16:24:48 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1G0OmY14432 for ietf-imaa-bks; Sat, 15 Feb 2003 16:24:48 -0800 (PST) Received: from mail.epost.de (mail.epost.de [193.28.100.167] (may be forged)) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1G0Okd14427 for ; Sat, 15 Feb 2003 16:24:46 -0800 (PST) Received: from [217.82.82.73] (217.82.82.73) by mail.epost.de (6.7.015) (authenticated as Marc.Mutz@epost.de) id 3E4D1FC10001C8AF for ietf-imaa@imc.org; Sun, 16 Feb 2003 01:24:49 +0100 From: Marc Mutz Organization: KDE To: ietf-imaa@imc.org Subject: Re: Subaddressing Date: Sun, 16 Feb 2003 01:08:35 +0100 User-Agent: KMail/1.5.9 References: <1045135861.15010.TMDA@moriarty.gnomon.org.uk> <200302152223.41136@sendmail.mutz.com> In-Reply-To: X-PGP-Key: 0xBDBFE838 MIME-Version: 1.0 Content-Type: multipart/signed; protocol="application/pgp-signature"; micalg=pgp-sha1; boundary="Boundary-02=_EatT+pz5zL17F9o"; charset="us-ascii" Content-Transfer-Encoding: 7bit Message-Id: <200302160108.36146@sendmail.mutz.com> Sender: 
owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: --Boundary-02=_EatT+pz5zL17F9o Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Content-Description: signed data Content-Disposition: inline On Saturday 15 February 2003 23:39, Paul Hoffman / IMC wrote: > Whoa. Server interpretation of IMAA-ACE is only required when the > server is already interpreting the LHS, that is, when the server is > making the LHS non-opaque. Servers' use of ToUnicode is strictly > optional. As you agreed, IDNA doesn't require servers to know about IDNA for current functionality, even if that functionality is used for IDNs. "Non-splitting" IMAA-ACE does. My point is: It shouldn't, so IMAA-ACE should split. Marc -- Silent leges inter arma -- Cicero --Boundary-02=_EatT+pz5zL17F9o Content-Type: application/pgp-signature Content-Description: signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.1 (GNU/Linux) iD8DBQA+TtaE3oWD+L2/6DgRAuoqAJ9aAFg7Cobm9w3GAttTh13WiSixrQCfQ7db xRuqbhBSPP+sLleWwqw2WgE= =dSG0 -----END PGP SIGNATURE----- --Boundary-02=_EatT+pz5zL17F9o-- From owner-ietf-imaa Sat Feb 15 20:04:00 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1G440H18542 for ietf-imaa-bks; Sat, 15 Feb 2003 20:04:00 -0800 (PST) Received: from [63.202.92.157] (adsl-63-202-92-157.dsl.snfc21.pacbell.net [63.202.92.157]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1G43nd18528 for ; Sat, 15 Feb 2003 20:03:51 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: In-Reply-To: <4.2.0.58.J.20030215184409.05123db8@localhost> References: <18245836.1045252800@p3.JCK.COM> <18245836.1045252800@p3.JCK.COM> <4.2.0.58.J.20030215184409.05123db8@localhost> X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm).
The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . Date: Sat, 15 Feb 2003 20:01:43 -0800 To: IETF IMAA list From: Paul Hoffman / IMC Subject: Re: Can we back up a bit and ask some basic questions? An alternate model Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 7:07 PM -0500 2/15/03, Martin Duerst wrote: >At 10:31 03/02/15 -0800, Paul Hoffman / IMC wrote: > >>>(a) We define a new SMTP extension. For purposes of discussion, >>>let's call it UTF8ADDRESSES. >> >>Therefore IMAA-UTF8 cannot achieve wide use until this extension is >>nearly universally adopted. When you get a business card from a >>colleague that has an internationalized email address, you can >>probably assume that their MTA supports UTF8ADDRESSES, but there is >>no way for you to know if your own MTA supports it, so you don't >>know if you can send him or her mail. > >Well, I think we can distinguish the following main situations here: >(taking Japanese as an example) > >1) You don't know Japanese, so you won't even try. >2) You have such an address yourself, and if your MTA supports it > on incoming mail, the chances are high it will work for outgoing mail. >3) You have tried it for another, similar, address, and it failed, > so you won't try it again unless you have good reasons to assume > it might have improved. >4) You'll give it a try, and see what happens. You might also > cc the mail to the older (ASCII-only) address to make sure. Agree, almost. That last sentence makes an assumption that is not warranted. >>With IMAA-UTF8, you always need to know that the originating MTA >>supports UTF8ADDRESSES. 
If you are a road warrior, you would need >>to know if your on-the-road ISP's MTAs support UTF8ADDRESSES before >>you could send mail to an IMAA-UTF8 address. Of course, you have no >>control over this. > >Well, I have my 'on-the-road MTA', but I don't use its MTA. >SMTP after POP and now SSH tunneling work fine for me, >and wouldn't be affected. Similar for VPNs that many companies >use these days for security reasons. Of course, individual >mileage may vary. So IMAA-UTF8 is only for experts who know how to use tunneling. No, thank you. >>With IMAA-UTF8, if a company has an MTA that supports >>UTF8ADDRESSES, and they want to add a firewall that has an SMTP >>proxy, that proxy must support UTF8ADDRESSES. Today, companies can >>safely add firewalls that have SMTP proxies that have fewer ESMTP >>services than the MTA for which they front. > >Your last sentence can be interpreted in two ways. In the trivial way, >such SMTP firewalls are always possible if the ESMTP service is not >needed. That wouldn't be different for IMAA-UTF8. Yes, it would be quite different. Most ESMTP extensions are non-critical; the message can still be delivered if the extension is not available. IMAA-UTF8 would be critical: if any MTA in the path didn't support it, that MTA would have to make out-of-band guesses about how to deliver the message or would have to send back a "message is undeliverable" response. Both are pretty bad, and I'm not sure which is worse. >The other interpretation would raise some questions: Do you want to say >that all functionality offered by current ESMTP options can be tunnelled >over MTAs that don't offer these options? No, of course not.
--Paul Hoffman, Director --Internet Mail Consortium From owner-ietf-imaa Sat Feb 15 20:03:44 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1G43ij18523 for ietf-imaa-bks; Sat, 15 Feb 2003 20:03:44 -0800 (PST) Received: from [63.202.92.157] (adsl-63-202-92-157.dsl.snfc21.pacbell.net [63.202.92.157]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1G43Md18517; Sat, 15 Feb 2003 20:03:25 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: In-Reply-To: <200302160108.36146@sendmail.mutz.com> References: <1045135861.15010.TMDA@moriarty.gnomon.org.uk> <200302152223.41136@sendmail.mutz.com> <200302160108.36146@sendmail.mutz.com> X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . Date: Sat, 15 Feb 2003 19:46:53 -0800 To: Marc Mutz , ietf-imaa@imc.org From: Paul Hoffman / IMC Subject: Re: Subaddressing Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 1:08 AM +0100 2/16/03, Marc Mutz wrote: >As you agreed, IDNA doesn't require servers to know about IDNA for >current functionality, even if that functionality is used for IDNs. > >"Non-splitting" IMAA-ACE does. > >My point is: It shouldn't, so IMAA-ACE should split. OK, I see where you are coming from. I don't agree with the conclusion yet, but I can see that we haven't been specific enough in our discussion. I'll try to create a taxonomy that includes IMAA-ACE-OPAQUE, IMAA-ACE-SPLIT, and IMAA-UTF8. Tomorrow. 
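A minimal sketch contrasting the "opaque" and "split" variants under discussion, assuming Python's built-in punycode codec, a made-up ACE prefix ("ace--"), and hypothetical function names. It is not the ToASCII operation from the draft: Nameprep/Stringprep is omitted, and the delimiter set is an assumption (ASCII characters other than letters, digits and hyphen; hyphen stays inside segments because Punycode output itself contains hyphens).

```python
import re

ACE_PREFIX = "ace--"  # hypothetical prefix, chosen only for this sketch

# ASCII characters other than letters, digits and hyphen act as delimiters.
_DELIMS = re.compile(r"([\x00-\x2c\x2e\x2f\x3a-\x40\x5b-\x60\x7b-\x7f])")


def _encode_segment(segment):
    # ASCII-only segments (including the delimiters) pass through unchanged.
    if segment.isascii():
        return segment
    return ACE_PREFIX + segment.encode("punycode").decode("ascii")


def _decode_segment(segment):
    if segment.startswith(ACE_PREFIX):
        return segment[len(ACE_PREFIX):].encode("ascii").decode("punycode")
    return segment


def to_ascii_split(local_part):
    """Encode each delimiter-separated segment on its own."""
    return "".join(_encode_segment(s) for s in _DELIMS.split(local_part))


def to_unicode_split(ace_local_part):
    """Inverse of to_ascii_split for strings it produced."""
    return "".join(_decode_segment(s) for s in _DELIMS.split(ace_local_part))


def to_ascii_opaque(local_part):
    """Encode the whole local part as one unit."""
    if local_part.isascii():
        return local_part
    return ACE_PREFIX + local_part.encode("punycode").decode("ascii")


address = "jos\u00e9+lists"
print(to_ascii_split(address))   # the text after "+" is still exactly "lists"
print(to_ascii_opaque(address))  # the text after "+" now carries encoded data
```

With the split form, a server that already cuts subaddresses at "+" keeps working on the ASCII string without ToUnicode; with the opaque form, the part after "+" is no longer the original subaddress, so the server would have to decode first. That is the trade-off the taxonomy is meant to spell out.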
--Paul Hoffman, Director --Internet Mail Consortium From owner-ietf-imaa Sat Feb 15 20:03:47 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1G43l818527 for ietf-imaa-bks; Sat, 15 Feb 2003 20:03:47 -0800 (PST) Received: from [63.202.92.157] (adsl-63-202-92-157.dsl.snfc21.pacbell.net [63.202.92.157]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1G43gd18520 for ; Sat, 15 Feb 2003 20:03:45 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: In-Reply-To: <4.2.0.58.J.20030215172311.0511a0c8@localhost> References: <4.2.0.58.J.20030215172311.0511a0c8@localhost> X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . Date: Sat, 15 Feb 2003 19:54:49 -0800 To: ietf-imaa@imc.org From: Paul Hoffman / IMC Subject: Re: Updating RFC 2368 (mailto: URI) Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 5:59 PM -0500 2/15/03, Martin Duerst wrote: >We should think about how this is affected by IMAs. >In particular, with IRIs, We should think about this eventually, but it is premature to do so now. The IRI spec has not gone through IETF last call. >While we are it, some more comments on RFC 2368 and internationalization. >Here is some text that I found: > > 8-bit characters in mailto URLs are forbidden. MIME encoded words (as > defined in [RFC2047]) are permitted in header values, but not for any > part of a "body" hname. > >This is not completely clear. Is it forbidden to use 8-bit octets >in mailto URIs in raw form? 
Yes. >The general URI spec would definitely >forbid this, so this would just be a repetition. Yes. It was added for emphasis. >Or is it forbidden to include such octets %-escaped? In that case, >it would be impossible e.g. to create a mailto: URI that sends a >PNG image to somebody (for example, not that that's frequently done). It is impossible to do that anyhow. RFC 2368 explicitly forbids MIME body parts. --Paul Hoffman, Director --Internet Mail Consortium From owner-ietf-imaa Sun Feb 16 05:48:15 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1GDmFu04547 for ietf-imaa-bks; Sun, 16 Feb 2003 05:48:15 -0800 (PST) Received: from moriarty.gnomon.org.uk (pc4-cmbg2-5-cust162.cmbg.cable.ntl.com [81.100.86.162]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1GDmDd04542 for ; Sun, 16 Feb 2003 05:48:13 -0800 (PST) Received: from moriarty.gnomon.org.uk (roy@localhost [127.0.0.1]) by moriarty.gnomon.org.uk (8.12.3/8.12.3/Debian -4) with ESMTP id h1GDmiKA002113 for ; Sun, 16 Feb 2003 13:48:45 GMT To: duerst@w3.org CC: john-ietf@jck.com, ietf-imaa@imc.org In-reply-to: <4.2.0.58.J.20030215182950.03351550@localhost> (message from Martin Duerst on Sat, 15 Feb 2003 18:37:24 -0500) Subject: Re: Can we back up a bit and ask some basic questions?Analternate model References: <93041416.1045327595@p3.JCK.COM> <1045331284.28770.TMDA@moriarty.gnomon.org.uk> <1045337541.29302.TMDA@moriarty.gnomon.org.uk> <4.2.0.58.J.20030215182950.03351550@localhost> Date: Sun, 16 Feb 2003 13:48:42 +0000 Message-ID: <1045403322.2104.TMDA@moriarty.gnomon.org.uk> From: Roy Badami X-Delivery-Agent: TMDA/0.65 (Johnstown) X-Primary-Address: roy@gnomon.org.uk Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: >The main reason why I think this will inevitably happen in the future >(regardless of whether this forum mandates it) is that in the long >term we will move to a message body which (by default) is just a block >of UTF-8, I 
think this is the direction we are moving to, but not very quickly. Similar for the WWW, we are seeing more and more UTF-8, but again, not extremely quickly. Sorry, there was a typo in my comment above. I meant to say: we will move to a _message_ which (by default) is just a block of UTF-8 ie, not only will our content-transfer-encoding be 8-bit, but we'll dispense with the escaping mechanism in subject and other headers. At this point it makes sense to dispense with the ACE in the headers, too. A message once again becomes just a piece of text you can view (most of) in a text editor. >with no requirements for special coding in any headers or in >the body. Something like Content-Type: text/plain;charset=utf-8 will be present for a VERY long time. But maybe that's not what you mean by encoding. Indeed. I meant that quoted-printable and base64 will go away, at least for plain text. I think this is one way to argue, but a) I don't think there is any plan for using ACE explicitly within the message body (it can always be used, but it will be just a random sequence of ASCII letters); Sorry, I intended to refer to the entire message, where there is a clear plan to use ACE. b) The motivation for uniform encoding is much stronger in the headers than in the body (I'm very happy that nobody has brought up proposals yet for using a variety of legacy encodings, with labeling, in the header); Agreed. c) If we think we have a good feel about where we are going, then it may be a lot cheaper to try to go there faster and on the most direct way we can find rather than waste time. Having thought about it further, the kind of solution I was envisioning would have to wait for a new message format to be defined, in which the headers were 8-bit. Making this change just for addresses doesn't make sense, and defining the native UTF-8 message format is clearly outside the scope of the present discussions.
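To make the "escaping mechanism in subject and other headers" concrete, here is a small sketch assuming Python's standard email.header module; the sample subject is made up. It shows the same header value once as an RFC 2047 encoded-word (7-bit safe, unreadable in a plain text editor) and once as the raw UTF-8 bytes that a native 8-bit header format would carry.

```python
from email.header import Header, decode_header, make_header

subject = "Gr\u00fc\u00dfe"  # hypothetical sample subject with non-ASCII letters

# Today's form: an RFC 2047 encoded-word made only of ASCII characters.
encoded = Header(subject, charset="utf-8").encode()

# What a native UTF-8 header would carry instead: the bytes themselves.
raw_utf8 = subject.encode("utf-8")

# Decoding the encoded-word recovers the original text.
decoded = str(make_header(decode_header(encoded)))

print(encoded)   # an "=?utf-8?...?=" token
print(raw_utf8)  # b'Gr\xc3\xbc\xc3\x9fe'
print(decoded == subject)
```

The encoded-word survives 7-bit transports but needs a decoder to be read; the raw form is readable as text but needs every hop and every header parser to be 8-bit clean, which is why this is tied above to a new message format rather than to addresses alone.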
-roy From owner-ietf-imaa Sun Feb 16 07:48:50 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1GFmo311832 for ietf-imaa-bks; Sun, 16 Feb 2003 07:48:50 -0800 (PST) Received: from moriarty.gnomon.org.uk (pc4-cmbg2-5-cust162.cmbg.cable.ntl.com [81.100.86.162]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1GFmmd11828 for ; Sun, 16 Feb 2003 07:48:48 -0800 (PST) Received: from moriarty.gnomon.org.uk (roy@localhost [127.0.0.1]) by moriarty.gnomon.org.uk (8.12.3/8.12.3/Debian -4) with ESMTP id h1GFnLKA002572 for ; Sun, 16 Feb 2003 15:49:21 GMT To: ietf-imaa@imc.org cc: roy@gnomon.org.uk Subject: POP3 mailbox names and IMAP userids From: Roy Badami Date: Sun, 16 Feb 2003 15:49:21 +0000 Message-ID: <1045410561.2571.TMDA@moriarty.gnomon.org.uk> X-Delivery-Agent: TMDA/0.65 (Johnstown) X-Primary-Address: roy@gnomon.org.uk Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: POP3 mailbox names (the argument to the POP3 USER or APOP command) are not local-parts. They identify a mailbox on the POP3 server, but are significant only to the POP3 server. (RFC 1939 section 7). In practice, they often are the same as the local-part of the e-mail address with which the mailbox is associated, but this is neither required by the standards, nor is it universally true. It is not uncommon, for instance, for the POP3 mailbox name to be the entire RFC-822 address, possibly with some other character (such as percent) substituted for at-sign, presumably to accommodate broken clients that believe that a POP3 mailbox name cannot contain an at-sign. Would it be a reasonable design decision for an IMA-aware POP3 client to apply ToASCII to mailbox names that contain international characters? This would do the right thing for mailbox names that corresponded to local parts (probably the most common case?).
But what about mailbox names that correspond to an entire e-mail address (but possibly with the at-sign replaced by some other punctuation character)? Is the user going to be on their own here? The MUA could have explicit support for this kind of structured mailbox name, but it's going to be cumbersome: the user is going to have to enter the ILP, the delimiter and the IDN into three separate fields in the client, or at least tell the client what the delimiter is, so that the client can apply the correct transformations to the local part and domain labels embedded within the mailbox name. Is this going to be too confusing for users? As a side note (and I'm not suggesting this would be a good idea): if we were to split at all non-LDH ASCII characters (and perhaps full-width dot as well), and we use nameprep, and we use the same ACE-prefix as IDNA, and the domain happens to obey hostname rules, then it appears to me that if the client chooses to apply the IMAA ToASCII function to the mailbox name, it would generally Do The Right Thing -- ie it would correctly transform the IDN as a side effect, without having to know what the delimiter was between the local part and the domain (as long as it isn't a hyphen). On to IMAP clients. I'm not very familiar with IMAP, but as I understand it the corresponding concept is the IMAP user name or userid (not the IMAP mailbox, which is more akin to a folder on an MUA). Although RFC 2060 has some words to say on internationalized mailbox names, it appears to say very little that I could find about userids. Notably, the grammar allows userids to be unrestricted 8-bit strings with the restriction that they can't contain NUL. Does it make sense for an IMAP MUA to choose to apply ToASCII to IMAP userids? Or perhaps there are circumstances where a client might want to send UTF-8 as the userid? Since IMAP appears to be defined to be 8-bit-clean, it's possible that there are clients that already send UTF-8 for userids.
So perhaps IMAP MUAs will have to have a user preference to enable ACE encoding of userids? Yet another setting for the user to come to grips with. The ultimate problem here with both POP and IMAP is that local parts are being carried in protocol elements that are defined to be opaque, so it's not clear (to me) what IMAA can or should do about this. -roy From owner-ietf-imaa Sun Feb 16 08:34:59 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1GGYxo13011 for ietf-imaa-bks; Sun, 16 Feb 2003 08:34:59 -0800 (PST) Received: from bs.jck.com (ns.jck.com [209.187.148.211]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1GGYvd13005 for ; Sun, 16 Feb 2003 08:34:57 -0800 (PST) Received: from [209.187.148.215] (helo=p3.JCK.COM) by bs.jck.com with esmtp (Exim 4.10) id 18kRko-000LCH-00; Sun, 16 Feb 2003 11:34:58 -0500 Date: Sun, 16 Feb 2003 11:34:58 -0500 From: John C Klensin To: Roy Badami cc: "ietf-imaa@imc.org" Subject: Re: POP3 mailbox names and IMAP userids Message-ID: <160744047.1045395298@p3.JCK.COM> In-Reply-To: <1045410561.2571.TMDA@moriarty.gnomon.org.uk> References: <1045410561.2571.TMDA@moriarty.gnomon.org.uk> X-Mailer: Mulberry/3.0.0 (Win32) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Roy, In practice, POP3 and IMAP accounts (mailboxes and user ids, respectively) have two important properties: (i) Unlike email addresses, which people type, enter into address books, and pass around to each other using mechanisms other than mail headers and envelopes, they are typically configured once per user (or at most, once per client machine). There are some separate issues when these things are accessed through web interfaces that simulate MUAs, but those are, well, separate issues. 
(ii) The typical user finds out the form in which the POP3 or IMAP accessing information (which often involves a domain name for the server that doesn't correspond obviously to the email address either) by getting a message that provides the information. That message may appear in a mail message to a different mailbox, on a web page, in a paper letter or two, but the point is that the user is given it in exact form. Mailbox providers who pass them out in any form that can be misinterpreted, even by the totally clueless, are just looking for extra customer service calls... and no one sane self-inflicts that damage intentionally. So, this is a non-problem. If the enrollment message says "configure your mail reader so that the Pop server is popmail.gnomon.org.uk and your mailboxID is roy12345&&/%", you will put that on the relevant screen and move on. You may wonder where on earth those strings came from (I would), but the typical user won't care (and, if they decline to tell, you and I will get past worrying about it). If, instead, it specifies one or more punycode strings, you will put those in and move on. And, if they give you something complex in UTF-8 or some other encoding, they had better hope your mail-reader knows how to handle it, or the customer support line will start ringing. When it does, those mail suppliers who can learn will change their minds about the presentation format. An oversimplification of the design principle here is pretty close to "people process strings that use characters, and even ones they think represent words; computers process bits". john --On Sunday, 16 February, 2003 15:49 +0000 Roy Badami wrote: > POP3 mailbox names (the argument to the POP3 USER or APOP > command) are not local-parts. They identify a mailbox on the > POP3 server, but are significant only to the POP3 server. > (RFC 1939 section 7). 
> > In practice, they often are the same as the local-part of the > e-mail address with which the mailbox is associated, but this > is neither required by the standards, nor is it universally > true. > > It is not uncommon, for instance, for the POP3 mailbox name to > be the entire RFC-822 address, possibly with some other > character (such as percent) substituted for at-sign, > presumably to accomodate broken clients that believe that a > POP3 mailbox name cannot contain an at-sign. > > Would it be a reasonable design decision for an IMA-aware POP3 > client to apply ToASCII to mailbox names that contain > international characters? This would do the right thing for > mailbox names that corresponded to local parts (probably the > most common case?). >... From owner-ietf-imaa Sun Feb 16 09:10:49 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1GHAnq17483 for ietf-imaa-bks; Sun, 16 Feb 2003 09:10:49 -0800 (PST) Received: from [63.202.92.157] (adsl-63-202-92-157.dsl.snfc21.pacbell.net [63.202.92.157]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1GHAkd17472; Sun, 16 Feb 2003 09:10:46 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: In-Reply-To: <1045410561.2571.TMDA@moriarty.gnomon.org.uk> References: <1045410561.2571.TMDA@moriarty.gnomon.org.uk> X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . 
Date: Sun, 16 Feb 2003 09:10:30 -0800 To: Roy Badami , ietf-imaa@imc.org From: Paul Hoffman / IMC Subject: Re: POP3 mailbox names and IMAP userids Cc: roy@gnomon.org.uk Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 3:49 PM +0000 2/16/03, Roy Badami wrote: >Would it be a reasonable design decision for an IMA-aware POP3 client >to apply ToASCII to mailbox names that contain international >characters? This is completely out of scope for this mailing list. We are discussing modifying email addresses, not the POP (or IMAP) protocol. Please take this discussion to the appropriate mailing list. --Paul Hoffman, Director --Internet Mail Consortium From owner-ietf-imaa Sun Feb 16 09:28:32 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1GHSWt18716 for ietf-imaa-bks; Sun, 16 Feb 2003 09:28:32 -0800 (PST) Received: from [63.202.92.157] (adsl-63-202-92-157.dsl.snfc21.pacbell.net [63.202.92.157]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1GHSQd18710; Sun, 16 Feb 2003 09:28:27 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: In-Reply-To: References: <1045135861.15010.TMDA@moriarty.gnomon.org.uk> <200302152223.41136@sendmail.mutz.com> <200302160108.36146@sendmail.mutz.com> X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to .
Date: Sun, 16 Feb 2003 09:28:26 -0800 To: Marc Mutz , ietf-imaa@imc.org From: Paul Hoffman / IMC Subject: Re: Subaddressing Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Let's look at two conceptual variants of IMAA (called IMAA-ACE-OPAQUE and IMAA-ACE-SPLIT) and how they affect current SMTP servers that do sub-address processing before writing received messages to a data store. We all agree that the LHS is opaque to the MTA according to the protocol. This means that, if the MTA wants to do any sub-address processing, it inherently makes the LHS non-opaque. In sub-address processing, the MTA parses the LHS, usually looking for delimiters, but sometimes looking for non-delimiter flags at the beginning or end of the name. Neither IMAA-ACE-OPAQUE nor IMAA-ACE-SPLIT requires any change to current MTAs unless the MTA parses the LHS. IMAA-ACE-OPAQUE always requires that an MTA that parses the LHS perform ToUnicode first. IMAA-ACE-SPLIT does not require the MTA to perform ToUnicode before parsing the LHS if the IMAA-ACE-SPLIT protocol uses exactly the same delimiters and flags as the MTA uses when parsing. If the delimiters or flags used by the MTA and IMAA-ACE-SPLIT are different, the MTA must perform ToUnicode before it parses the LHS. Thus, the advantage of IMAA-ACE-SPLIT is only for a subset of the MTAs out there. Further, I am not convinced that we can make an IMAA-ACE-SPLIT that sanely handles more than one delimiter (although I suspect Adam will try and I may be proved wrong). Regardless, IMAA-ACE-SPLIT will be a more complex protocol than IMAA-ACE-OPAQUE. The question is whether the additional complexity of the protocol is worth the advantage of sparing the MTAs that do sub-address processing from having to add a ToUnicode step at the beginning of their processing.
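To make the difference between the two variants concrete, here is a rough Python sketch. It is an illustration under stated assumptions, not the ToASCII/ToUnicode operations of the draft: it skips the Stringprep step entirely and relies only on Python's built-in "punycode" codec. The prefix "zz--", the "+" delimiter, and all function names are made up for this example.

```python
# Illustration only: NOT the draft's ToASCII/ToUnicode (no Stringprep step).
ACE_PREFIX = "zz--"   # hypothetical ACE prefix, chosen for this example
DELIM = "+"           # assumed sub-address delimiter


def to_ace(text):
    """Punycode-encode text and add the prefix, but only if it has non-ASCII characters."""
    if text.isascii():
        return text
    return ACE_PREFIX + text.encode("punycode").decode("ascii")


def to_unicode(text):
    """Reverse to_ace; text without the prefix is returned unchanged."""
    if text.startswith(ACE_PREFIX):
        return text[len(ACE_PREFIX):].encode("ascii").decode("punycode")
    return text


def encode_opaque(local_part):
    """OPAQUE variant: the whole local part is encoded as one unit."""
    return to_ace(local_part)


def encode_split(local_part):
    """SPLIT variant: each piece between delimiters is encoded on its own."""
    return DELIM.join(to_ace(piece) for piece in local_part.split(DELIM))


def parse_opaque(wire):
    """An MTA must decode first, then look for the delimiter."""
    return to_unicode(wire).split(DELIM)


def parse_split(wire):
    """An MTA can split the wire form directly; decoding each piece is optional."""
    return [to_unicode(piece) for piece in wire.split(DELIM)]
```

With the local part "bücher+list", encode_split yields "zz--bcher-kva+list", so an MTA that splits on "+" still sees "list" as the sub-address. In the opaque encoding the "+" survives in the wire form, but the text after it is no longer just "list", which is why the decode step has to come before parsing.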
--Paul Hoffman, Director --Internet Mail Consortium From owner-ietf-imaa Sun Feb 16 11:06:13 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1GJ6Dk24329 for ietf-imaa-bks; Sun, 16 Feb 2003 11:06:13 -0800 (PST) Received: from tux.w3.org (IDENT:root@tux.w3.org [18.29.0.27]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1GJ6Cd24323; Sun, 16 Feb 2003 11:06:12 -0800 (PST) Received: from enoshima (IDENT:root@tux.w3.org [18.29.0.27]) by tux.w3.org (8.9.3/8.9.3) with ESMTP id OAA08051; Sun, 16 Feb 2003 14:06:11 -0500 Message-Id: <4.2.0.58.J.20030216095144.051234e0@localhost> X-Sender: duerst@localhost X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J Date: Sun, 16 Feb 2003 10:12:54 -0500 To: Paul Hoffman / IMC , IETF IMAA list From: Martin Duerst Subject: Re: Can we back up a bit and ask some basic questions? An alternate model In-Reply-To: References: <4.2.0.58.J.20030215184409.05123db8@localhost> <18245836.1045252800@p3.JCK.COM> <18245836.1045252800@p3.JCK.COM> <4.2.0.58.J.20030215184409.05123db8@localhost> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 20:01 03/02/15 -0800, Paul Hoffman / IMC wrote: >At 7:07 PM -0500 2/15/03, Martin Duerst wrote: >>Well, I think we can distinguish the following main situations here: >>(taking Japanese as an example) >> >>1) You don't know Japanese, so you won't even try. >>2) You have such an address yourself, and if your MTA supports it >> on incoming mail, the chances are high it will work for outgoing mail. >>3) You have tried it for another, similar, address, and it failed, >> so you won't try it again unless you have good reasons to assume >> it might have improved. >>4) You'll give it a try, and see what happens. You might also >> cc the mail to the older (ASCII-only) address to make sure. > >Agree, almost. 
That last sentence makes an assumption that is not warranted. You mean the assumption that an ASCII-only address is available? That assumption is not warranted in all cases, but it is in most. Initial deployment of IMAs will in most cases be of the form 'hello, I've now got this new Japanese address, can you see if it works?', and only later will first-time IMA exchange (e.g. 'here is my IMA, please send me mail') increase. >>Well, I have my 'on-the-road MTA', but I don't use its MTA. >>SMTP after POP and now SSH tunneling work fine for me, >>and wouldn't be affected. Similar for VPNs that many companies >>use these days for security reasons. Of course, individual >>mileage may vary. > >So IMAA-UTF8 is only for experts who know how to use tunneling. No, thank you. I'm not sure there are that many people left between those who don't travel, those who use webmail when they travel, those who use the same ISP when they travel, and those who have to tunnel or VPN because of company policy. Does anybody have estimates? >Yes, it would be quite different. Most ESMTP extensions are non-critical; >the message can still be delivered if the extension is not available. >IMAA-UTF8 would be critical: if any MTA in the path didn't support it, >that MTA would have to make out-of-band guesses about how to deliver the >message or would have to send back a "message is undeliverable" response. >Both are pretty bad, and I'm not sure which is worse. Well, the assumption is still that a destination that wants to accept IMAA-UTF8 will make sure it upgrades its firewall, too. So as long as the firewall isn't upgraded, nobody is actually going to use IMAs at all (except maybe internally). This may change the rate of adoption, but it doesn't create additional external dependencies. Regards, Martin.
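The hop-by-hop dependency discussed above can be sketched roughly as follows. This is only an illustration: "UTF8ADDRESS" is the keyword name used in this thread, not a registered ESMTP extension, and the function names and the "relay"/"bounce" outcomes are assumptions made up for the example.

```python
def ehlo_keywords(reply):
    """Collect the extension keywords from a multi-line 250 reply to EHLO."""
    keywords = set()
    lines = reply.splitlines()
    for line in lines[1:]:              # the first line is the greeting, not a keyword
        if line.startswith(("250-", "250 ")):
            rest = line[4:].strip()
            if rest:
                keywords.add(rest.split()[0].upper())
    return keywords


def next_hop_action(reply, has_non_ascii_address):
    """Decide what a relaying MTA can do with a message for this next hop.

    A critical extension leaves no middle ground: if the next hop does not
    advertise it, an address carried as UTF-8 cannot be passed along.
    """
    if not has_non_ascii_address:
        return "relay"
    if "UTF8ADDRESS" in ehlo_keywords(reply):   # hypothetical keyword
        return "relay"
    return "bounce"
```

For a next hop that answers EHLO with only 8BITMIME and SIZE, a message with a non-ASCII address ends up as "bounce", while an all-ASCII message is relayed as before.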
From owner-ietf-imaa Sun Feb 16 11:06:11 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1GJ6BC24317 for ietf-imaa-bks; Sun, 16 Feb 2003 11:06:11 -0800 (PST) Received: from tux.w3.org (IDENT:root@tux.w3.org [18.29.0.27]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1GJ69d24309; Sun, 16 Feb 2003 11:06:09 -0800 (PST) Received: from enoshima (IDENT:root@tux.w3.org [18.29.0.27]) by tux.w3.org (8.9.3/8.9.3) with ESMTP id OAA08048; Sun, 16 Feb 2003 14:06:11 -0500 Message-Id: <4.2.0.58.J.20030216094052.051204a8@localhost> X-Sender: duerst@localhost X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J Date: Sun, 16 Feb 2003 09:47:30 -0500 To: Paul Hoffman / IMC , ietf-imaa@imc.org From: Martin Duerst Subject: Re: Updating RFC 2368 (mailto: URI) In-Reply-To: References: <4.2.0.58.J.20030215172311.0511a0c8@localhost> <4.2.0.58.J.20030215172311.0511a0c8@localhost> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 19:54 03/02/15 -0800, Paul Hoffman / IMC wrote: >At 5:59 PM -0500 2/15/03, Martin Duerst wrote: >>Or is it forbidden to include such octets %-escaped? In that case, >>it would be impossible e.g. to create a mailto: URI that sends a >>PNG image to somebody (for example, not that that's frequently done). > >It is impossible to do that anyhow. RFC explicitly 2368 forbids MIME body >parts. Here is all I have found about this topic; if there is anything that I have missed, please tell me: The special hname "body" indicates that the associated hvalue is the body of the message. The "body" hname should contain the content for the first text/plain body part of the message. The mailto URL is primarily intended for generation of short text messages that are actually the content of automatic processing (such as "subscribe" messages for mailing lists), not general MIME bodies. 
'primarily intended' doesn't mean that everything else is forbidden, I guess. A mail client should never send anything without complete disclosure to the user of what is will be sent; it should disclose not only the message destination, but also any headers. Unrecognized headers, or headers with values inconsistent with those the mail client would normally send should be especially suspect. MIME headers (MIME- Version, Content-*) are most likely inappropriate, as are those relating to routing (From, Bcc, Apparently-To, etc.) Again, 'most likely inappropriate' isn't the same as forbidden. It may well be that the user is shown the image that is being sent. Anyway, the PNG example was just an example. Actually, I worried more about sending a message body with text other than just ASCII. How would that be done? Regards, Martin. From owner-ietf-imaa Sun Feb 16 11:06:12 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1GJ6CF24322 for ietf-imaa-bks; Sun, 16 Feb 2003 11:06:12 -0800 (PST) Received: from tux.w3.org (IDENT:root@tux.w3.org [18.29.0.27]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1GJ6Ad24314 for ; Sun, 16 Feb 2003 11:06:10 -0800 (PST) Received: from enoshima (IDENT:root@tux.w3.org [18.29.0.27]) by tux.w3.org (8.9.3/8.9.3) with ESMTP id OAA08045; Sun, 16 Feb 2003 14:06:10 -0500 Message-Id: <4.2.0.58.J.20030215192922.05cdc728@localhost> X-Sender: duerst@localhost X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J Date: Sat, 15 Feb 2003 19:30:34 -0500 To: Roy Badami , ietf-imaa@imc.org From: Martin Duerst Subject: Re: Can we back up a bit and ask some basic questions? Analternatemodel Cc: roy@gnomon.org.uk In-Reply-To: <1045323640.28250.TMDA@moriarty.gnomon.org.uk> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 15:40 03/02/15 +0000, Roy Badami wrote: >Your document is well argued. 
We certainly shouldn't blindly assume >that just because the ACE vs just-send-8 issue was argued to death in >the IDN WG, the trade-offs between the two approaches when applied to >IMAs will automatically be the same as those for IDNs. (Though I can >also understand that this group probably really doesn't want to go >there again.) Please note that John never proposed 'just-send-8'. 'just-send-8' is different from an ESMTP UTF8ADDRESS extension. Regards, Martin. From owner-ietf-imaa Sun Feb 16 11:06:16 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1GJ6Gt24337 for ietf-imaa-bks; Sun, 16 Feb 2003 11:06:16 -0800 (PST) Received: from tux.w3.org (IDENT:root@tux.w3.org [18.29.0.27]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1GJ6Ed24331 for ; Sun, 16 Feb 2003 11:06:14 -0800 (PST) Received: from enoshima (IDENT:root@tux.w3.org [18.29.0.27]) by tux.w3.org (8.9.3/8.9.3) with ESMTP id OAA08057; Sun, 16 Feb 2003 14:06:14 -0500 Message-Id: <4.2.0.58.J.20030216101703.05134de8@localhost> X-Sender: duerst@localhost X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J Date: Sun, 16 Feb 2003 10:33:21 -0500 To: Roy Badami From: Martin Duerst Subject: Re: Can we back up a bit and ask some basicquestions?Analternate model Cc: john-ietf@jck.com, ietf-imaa@imc.org In-Reply-To: <1045403322.2104.TMDA@moriarty.gnomon.org.uk> References: <4.2.0.58.J.20030215182950.03351550@localhost> <93041416.1045327595@p3.JCK.COM> <1045331284.28770.TMDA@moriarty.gnomon.org.uk> <1045337541.29302.TMDA@moriarty.gnomon.org.uk> <4.2.0.58.J.20030215182950.03351550@localhost> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 13:48 03/02/16 +0000, Roy Badami wrote: >Sorry, there was a typo in my comment above. I meant to say: > > we will move to a _message_ which (by default) is just a block of UTF-8 Thanks for the clarification.
>ie, not only will out content-transfer-encoding be 8-bit, but we'll >dispense with the escaping mechanism in subject and other headers. At >this point it makes sense to dispense with the ACE in the headers, >too. A message once again becomes just a piece of text you can view >(most of) in a text editor. Yes, therefore allowing people around the globe to do all those things with emails that people in the ASCII-only world have done all the time: use text editors (as you say), write simple scripts (with the emphasis on simple) to process their email, and so on. Great! >Having thought about it further, the kind of solution I was >envisioning would have to wait for a new message format to be defined, >in which the headers were 8-bit. Making this change just for >addresses doesn't make sense, and defining the native UTF-8 message >format is clearly outside the scope of the present discussions. Well, I agree that we should concentrate on addresses here, but looking ahead is part of good engineering. So even if this happens in two steps (UTF8ADDRESS and UTF8HEADER), we can think about the interactions. And if we find out that it would be almost as easy to do both at the same, and maybe just as one extension, then I don't think we should feel restricted to not do it. Actually, my current guess is that it's almost as much effort to do both things in one extension as to do them separately: - Widening code paths to 8 bits has to be done only once, and can be done completely. - Negotiation is done on one item, rather than on several. This significantly reduces code complexity. The main problem is how to distinguish between header parts that are addresses and those that are other text. Regards, Martin. 
From owner-ietf-imaa Tue Feb 18 19:49:32 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1J3nWh26694 for ietf-imaa-bks; Tue, 18 Feb 2003 19:49:32 -0800 (PST) Received: from pie1.i-dns.net ([203.81.44.31]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1J3nUd26690 for ; Tue, 18 Feb 2003 19:49:30 -0800 (PST) Received: from localhost (localhost [127.0.0.1]) by pie1.i-dns.net (Postfix) with ESMTP id 9F14078A5D; Wed, 19 Feb 2003 03:49:15 +0000 (GMT) Received: from pie1.i-dns.net ([127.0.0.1]) by localhost (pie1.i-dns.net [127.0.0.1:10024]) (amavisd-new) with SMTP id 27593-10; Wed, 19 Feb 2003 03:49:13 +0000 (GMT) Received: from jeffreyibm (unknown [211.219.16.83]) by pie1.i-dns.net (Postfix) with SMTP id 4184578A4F; Wed, 19 Feb 2003 03:49:10 +0000 (GMT) Message-ID: <00ca01c2d7ca$2800c560$3800a8c0@jeffreyibm> From: "Jeffrey J Zahari" To: "Roy Badami" , "Martin Duerst" Cc: , References: <4.2.0.58.J.20030215182950.03351550@localhost> <93041416.1045327595@p3.JCK.COM> <1045331284.28770.TMDA@moriarty.gnomon.org.uk> <1045337541.29302.TMDA@moriarty.gnomon.org.uk> <4.2.0.58.J.20030215182950.03351550@localhost> <4.2.0.58.J.20030216101703.05134de8@localhost> Subject: Re: Can we back up a bit and ask some basicquestions?Analternate model Date: Wed, 19 Feb 2003 12:51:13 +0900 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2800.1106 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1106 X-Virus-Scanned: by amavisd-new Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: ----- Original Message ----- From: "Martin Duerst" To: "Roy Badami" Cc: ; Sent: Monday, February 17, 2003 12:33 AM Subject: Re: Can we back up a bit and ask some basicquestions?Analternate model > > At 13:48 03/02/16 +0000, Roy Badami wrote: > > >Sorry, there was a typo in my comment above. 
I meant to say: > > > > we will move to a _message_ which (by default) is just a block of UTF-8 > >Having thought about it further, the kind of solution I was > >envisioning would have to wait for a new message format to be defined, > >in which the headers were 8-bit. Making this change just for > >addresses doesn't make sense, and defining the native UTF-8 message > >format is clearly outside the scope of the present discussions. > > Well, I agree that we should concentrate on addresses here, > but looking ahead is part of good engineering. So even if > this happens in two steps (UTF8ADDRESS and UTF8HEADER), > we can think about the interactions. And if we find > out that it would be almost as easy to do both at the same, > and maybe just as one extension, then I don't think we > should feel restricted to not do it. > > Actually, my current guess is that it's almost as much > effort to do both things in one extension as to do them > separately: > What you're envisioning is something brought up previously, that IMAA should update 2821/2822. I assume UTF8ADDRESS refers to 2821 level email addresses and UTF8HEADER refers to email addresses within 2822 headers. What happens if an intermediate legacy smtp server cannot handle UTF8ADDRESS, and a receiver's MUA cannot handle messages with UTF8HEADER? With the IMAA-ACE approaches ( or until 2821/2 is altered/implemented ), unless the MTA requires non opaque lhs, this seems like the most efficient path to internationalised emails. 
jeffrey j zahari From owner-ietf-imaa Wed Feb 19 07:29:46 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1JFTkm22826 for ietf-imaa-bks; Wed, 19 Feb 2003 07:29:46 -0800 (PST) Received: from tux.w3.org (IDENT:root@tux.w3.org [18.29.0.27]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1JFTid22819 for ; Wed, 19 Feb 2003 07:29:44 -0800 (PST) Received: from enoshima (IDENT:root@tux.w3.org [18.29.0.27]) by tux.w3.org (8.9.3/8.9.3) with ESMTP id KAA16927; Wed, 19 Feb 2003 10:29:20 -0500 Message-Id: <4.2.0.58.J.20030219091733.054f4718@localhost> X-Sender: duerst@localhost X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J Date: Wed, 19 Feb 2003 09:45:48 -0500 To: "Jeffrey J Zahari" , "Roy Badami" From: Martin Duerst Subject: Re: Can we back up a bit and ask some basicquestions?Analternate model Cc: , In-Reply-To: <00ca01c2d7ca$2800c560$3800a8c0@jeffreyibm> References: <4.2.0.58.J.20030215182950.03351550@localhost> <93041416.1045327595@p3.JCK.COM> <1045331284.28770.TMDA@moriarty.gnomon.org.uk> <1045337541.29302.TMDA@moriarty.gnomon.org.uk> <4.2.0.58.J.20030215182950.03351550@localhost> <4.2.0.58.J.20030216101703.05134de8@localhost> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 12:51 03/02/19 +0900, Jeffrey J Zahari wrote: >----- Original Message ----- >From: "Martin Duerst" > > Well, I agree that we should concentrate on addresses here, > > but looking ahead is part of good engineering. So even if > > this happens in two steps (UTF8ADDRESS and UTF8HEADER), > > we can think about the interactions. And if we find > > out that it would be almost as easy to do both at the same, > > and maybe just as one extension, then I don't think we > > should feel restricted to not do it. 
> > > > Actually, my current guess is that it's almost as much > > effort to do both things in one extension as to do them > > separately: >What you're envisioning is something brought up previously, that IMAA should >update 2821/2822. Well, as far as I understand, defining an SMTP service extension does not constitute an update to 2821/2822 itself. >I assume UTF8ADDRESS refers to 2821 level email addresses and UTF8HEADER >refers to email addresses within 2822 headers. Well, overall, there would actually be three things: - 2821 email addresses - 2822 email addresses - 2822 other text (where encoded words are used currently) [there may also be 2821 other text, but I'm not aware of such] So overall, we might need three different extensions. But as I said, I don't think it makes too much sense to allow 2822 email addresses in UTF-8 but to restrict other text to encoded words (although I have heard others think about such proposals). Of course, because our main focus is on email addresses, it also doesn't make much sense to solve the problem for 2822 other text, but not for 2822 email addresses. I understand less about the relationship between 2821 and 2822 functionality, but it may also turn out that they are related enough that it doesn't make sense to define two different extensions. >What happens if an >intermediate legacy smtp server cannot handle UTF8ADDRESS, and a receiver's >MUA cannot handle messages with UTF8HEADER? In the scenarios we are discussing here, that would be negotiated as any other SMTP extension. The only problem may be that there is not really a negotiation between the last MTA and the receiving MUA. >With the IMAA-ACE approaches ( or until 2821/2 is altered/implemented ), >unless the MTA requires non opaque lhs, this seems like the most efficient >path to internationalised emails. Well, adding another layer of encoding such as proposed in IMAA-ACE can in some way be quite efficient, but it complicates things forever if there is no alternative. 
Also, it's not clear how quickly we will arrive at a solution. If I think about IDNS, then my original proposal was published in December 1996, and it took overall more than six years to come to the point where actual deployment can now start. So I'm not so confident that this one will be so very quick. Regards, Martin. From owner-ietf-imaa Wed Feb 19 08:47:15 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1JGlFu25264 for ietf-imaa-bks; Wed, 19 Feb 2003 08:47:15 -0800 (PST) Received: from [63.202.92.157] (adsl-63-202-92-157.dsl.snfc21.pacbell.net [63.202.92.157]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1JGlEd25260 for ; Wed, 19 Feb 2003 08:47:14 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: In-Reply-To: <4.2.0.58.J.20030219091733.054f4718@localhost> References: <4.2.0.58.J.20030215182950.03351550@localhost> <93041416.1045327595@p3.JCK.COM> <1045331284.28770.TMDA@moriarty.gnomon.org.uk> <1045337541.29302.TMDA@moriarty.gnomon.org.uk> <4.2.0.58.J.20030215182950.03351550@localhost> <4.2.0.58.J.20030216101703.05134de8@localhost> <4.2.0.58.J.20030219091733.054f4718@localhost> X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . 
Date: Wed, 19 Feb 2003 08:47:11 -0800 To: ietf-imaa@imc.org From: Paul Hoffman / IMC Subject: Re: Can we back up a bit and ask some basicquestions?Analternate model Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 9:45 AM -0500 2/19/03, Martin Duerst wrote: >Well, as far as I understand, defining an SMTP service extension >does not constitute an update to 2821/2822 itself. It requires an update to 2822 if that extension will change the rules for message format. >>What happens if an >>intermediate legacy smtp server cannot handle UTF8ADDRESS, and a receiver's >>MUA cannot handle messages with UTF8HEADER? > >In the scenarios we are discussing here, that would be negotiated >as any other SMTP extension. The only problem may be that there is >not really a negotiation between the last MTA and the receiving MUA. That's not a small "only problem". It means that messages might be rejected or accepted at different times and therefore the sender will not be able to predict whether or not he can send a message. Note that this problem manifests not only for To: addresses, but also From: and Cc: addresses (and probably others...). You can't predict whether or not your inclusion of your own IMAA in the From: field or the Cc: field will cause a message to be bounced. 
--Paul Hoffman, Director --Internet Mail Consortium From owner-ietf-imaa Wed Feb 19 12:02:29 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1JK2Tr06462 for ietf-imaa-bks; Wed, 19 Feb 2003 12:02:29 -0800 (PST) Received: from moriarty.gnomon.org.uk (pc4-cmbg2-5-cust162.cmbg.cable.ntl.com [81.100.86.162]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1JK2Qd06448 for ; Wed, 19 Feb 2003 12:02:26 -0800 (PST) Received: from moriarty.gnomon.org.uk (roy@localhost [127.0.0.1]) by moriarty.gnomon.org.uk (8.12.3/8.12.3/Debian -4) with ESMTP id h1JK3IKA021177 for ; Wed, 19 Feb 2003 20:03:18 GMT MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <15955.58113.661260.889533@moriarty.gnomon.org.uk> Date: Wed, 19 Feb 2003 20:03:13 +0000 To: Paul Hoffman / IMC Cc: ietf-imaa@imc.org Subject: Re: Can we back up a bit and ask some basicquestions?Analternate model In-Reply-To: References: <4.2.0.58.J.20030215182950.03351550@localhost> <93041416.1045327595@p3.JCK.COM> <1045331284.28770.TMDA@moriarty.gnomon.org.uk> <1045337541.29302.TMDA@moriarty.gnomon.org.uk> <4.2.0.58.J.20030216101703.05134de8@localhost> <4.2.0.58.J.20030219091733.054f4718@localhost> X-Mailer: VM 7.03 under Emacs 20.7.2 From: Roy Badami X-Delivery-Agent: TMDA/0.65 (Johnstown) X-Primary-Address: roy@gnomon.org.uk Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: > That's not a small "only problem". It means that messages might be > rejected or accepted at different times and therefore the sender will > not be able to predict whether or not he can send a message. Note > that this problem manifests not only for To: addresses, but also > From: and Cc: addresses (and probably others...). You can't predict > whether or not your inclusion of your own IMAA in the From: field or > the Cc: field will cause a message to be bounced. 
I think the only way this would be viable is if it was mandatory to convert to IMAA-ACE rather than bounce. The MUA issue would be up to local sites. Until they'd upgraded all their MUAs, they could simply convert all mail to IMAA-ACE, or they could make decisions on a per mailbox basis. The point is that it *is* viable to regard IMAA-ACE as a transition strategy to a native UTF-8 format (at least as far as SMTP based mail goes) because we have a robust negotiation mechanism. In this way, IMAs are different from IDNs. Support for the IMAA-ACE would still have to remain for a *long* time, of course (perhaps decades). I'm not necessarily saying it's a good idea, just noting that it is possible. -roy From owner-ietf-imaa Wed Feb 19 12:14:35 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1JKEZs07245 for ietf-imaa-bks; Wed, 19 Feb 2003 12:14:35 -0800 (PST) Received: from mail.uni-bielefeld.de (IDENT:72@mail2.uni-bielefeld.de [129.70.4.90]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1JKETd07239 for ; Wed, 19 Feb 2003 12:14:29 -0800 (PST) Received: from 192.168.0.3 (ppp36-90.hrz.uni-bielefeld.de [129.70.36.90]) by mail.uni-bielefeld.de (Sun Internet Mail Server sims.4.0.2000.10.12.16.25.p8) with ESMTP id <0HAK00CZ0O778S@mail.uni-bielefeld.de> for ietf-imaa@imc.org; Wed, 19 Feb 2003 21:14:27 +0100 (MET) Date: Wed, 19 Feb 2003 21:06:14 +0100 From: Marc Mutz Subject: Re: Open Issue: Stored strings vs. queries. 
In-reply-to: <8fpGbcs3cDD@3247.org> To: ietf-imaa@imc.org Message-id: <200302192106.26062@sendmail.mutz.com> Organization: "Old Europe" - and proud MIME-version: 1.0 Content-type: multipart/signed; protocol="application/pgp-signature"; micalg=pgp-sha1; boundary="Boundary-02=_CP+U+Wbmf9Tc14m"; charset="iso-8859-1" Content-transfer-encoding: 7bit User-Agent: KMail/1.5.9 X-PGP-Key: 0xBDBFE838 References: <8fpGbcs3cDD@3247.org> Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID:

--Boundary-02=_CP+U+Wbmf9Tc14m Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Content-Description: signed data Content-Disposition: inline

Let's wrap at least this issue up ;-)

On Friday 14 February 2003 03:14, Claus Färber wrote:
> Marc Mutz wrote:
> > An interesting issue is what is to be considered a stored local-part
> > and what is a queried local-part.
> >
> > Obvious stored local-parts:
> > - MTA config
> > - address books (e.g. LDAP)
>
> There are two types of address books:
> . Authoritative address books, which are run by the same authority as
>   the MTA config files. These are clearly ``stored''.
> . Personal address books, which contain queries (because they are
>   used to match against addresses _created_ elsewhere). Otherwise, you
>   could not enter an address that was created by someone using a newer
>   version of the profile.
>
> > Obvious queries:
> > - address lookups (e.g. LDAP queries)
> > - SMTP commands
> >
> > Non-obvious:
> > - Mail headers
>
> Again, clearly a ``query''. It makes use of names created elsewhere.
> Otherwise, you could not send mail to an address that was created by
> someone using a newer version of the profile.

I agree. I think there's a need to spell out when to use the
AllowUnassigned flag in IMAA. So, keeping the "stored" vs. "query"
terms for IDNA compat's sake, we currently have:

  Only unset AllowUnassigned for authoritative address books.
...for a certain definition of "authoritative address book" (AAB). What's
that definition? AABs certainly include the config database/files of
final delivery MTAs. What about MUA identities? Queries, I'd say. What
about a company's LDAP server that contains the addresses of employees
and is administered by the same authority that configures the company's
MTAs? Query? Stored? According to your reply, you'd see them as stored.
Where to draw the line? And as important: How to spell that out?

Proposed text (cf. rfc2821, 2.3.10, last sentence):

  The AllowUnassigned flag MUST be set except in the following case.
  When the result of the ToAscii operation is to be used as part of the
  MTA configuration on the host specified in the domain part of the
  address, the AllowUnassigned flag MUST NOT be set.

IOW, only the delivery SMTP server responsible for the mailboxes on the
host specified by the domain part of the address should disallow
unassigned code points in the part of its configuration that is used
to determine the valid local-parts.

What other "authoritative address books" are there and need to be
considered? The LDAP directory that contains the email addresses for
the employees? There are two scenarios:

1. The MTA reads its config from LDAP, so the MUST NOT set
   AllowUnassigned is in effect (since the string stored in LDAP is part
   of the MTA's config).
2. The LDAP and the MTA config are fed independently or from a common
   source. In this case, the LDAP tree will contain the same addresses as
   the MTA config, since the addresses have to match for the LDAP
   directory to be useful. So again, the MUST NOT from the MTA config
   implicitly carries over to the LDAP strings.

Fine. However, I think the text could more precisely define what part of
the MTA config the clause refers to (like adding a "to determine the
valid local-parts" somewhere in the text). Native speakers?
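A minimal sketch of the proposed rule, assuming Python and its standard stringprep module (whose in_table_a1 tests for code points unassigned in Unicode 3.2, i.e. RFC 3454 Table A.1). The function names allow_unassigned and check_local_part, and the way "MTA configuration on the host specified in the domain part" is reduced to two arguments, are hypothetical and only illustrate the decision; this is not the full ToAscii operation from the draft.

```python
import stringprep


def allow_unassigned(used_as_mta_config: bool, config_host: str,
                     address_domain: str) -> bool:
    """Return the AllowUnassigned setting under the proposed text.

    Hypothetical helper: the flag is set in every case except when the
    result becomes MTA configuration on the host named by the domain part.
    """
    authoritative = (used_as_mta_config
                     and config_host.lower() == address_domain.lower())
    return not authoritative


def check_local_part(local_part: str, flag_allow_unassigned: bool) -> str:
    """Reject unassigned code points only when the flag is not set.

    This covers just the unassigned-code-point step, not mapping,
    normalization, or the ACE encoding.
    """
    if not flag_allow_unassigned:
        for ch in local_part:
            if stringprep.in_table_a1(ch):
                raise ValueError(f"unassigned code point U+{ord(ch):04X}")
    return local_part


# A mail header or personal address book entry is a "query": flag set.
query_flag = allow_unassigned(False, "mx.example.org", "example.org")
# The delivery MTA's own table of valid local-parts is "stored": flag unset.
stored_flag = allow_unassigned(True, "example.org", "example.org")
```

Under this reading, a query carrying a code point from a newer Unicode version passes through, while the authoritative MTA configuration refuses to create such a local-part in the first place.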
Marc
--
If privacy is outlawed, only outlaws will have privacy.
 -- Phil Zimmermann

--Boundary-02=_CP+U+Wbmf9Tc14m Content-Type: application/pgp-signature Content-Description: signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.1 (GNU/Linux) iD8DBQA+U+PC3oWD+L2/6DgRAmJpAJwKpvrsIh2cLGfJqqMdplDafEN4zgCfZXzZ PSoMQNk++e/iddZXWFS0YdI= =g6Pz -----END PGP SIGNATURE----- --Boundary-02=_CP+U+Wbmf9Tc14m--

From owner-ietf-imaa Wed Feb 19 13:41:30 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1JLfU311104 for ietf-imaa-bks; Wed, 19 Feb 2003 13:41:30 -0800 (PST) Received: from [63.202.92.157] (adsl-63-202-92-157.dsl.snfc21.pacbell.net [63.202.92.157]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1JLfSd11099 for ; Wed, 19 Feb 2003 13:41:28 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: In-Reply-To: <15955.58113.661260.889533@moriarty.gnomon.org.uk> References: <4.2.0.58.J.20030215182950.03351550@localhost> <93041416.1045327595@p3.JCK.COM> <1045331284.28770.TMDA@moriarty.gnomon.org.uk> <1045337541.29302.TMDA@moriarty.gnomon.org.uk> <4.2.0.58.J.20030216101703.05134de8@localhost> <4.2.0.58.J.20030219091733.054f4718@localhost> <15955.58113.661260.889533@moriarty.gnomon.org.uk>
Date: Wed, 19 Feb 2003 13:40:34 -0800 To: ietf-imaa@imc.org From: Paul Hoffman / IMC Subject: Re: Can we back up a bit and ask some basic questions? An alternate model Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID:

At 8:03 PM +0000 2/19/03, Roy Badami wrote:
>I think the only way this would be viable is if it was mandatory to
>convert to IMAA-ACE rather than bounce.

So we need two mechanisms instead of one? And the advantage of that is...?

For those of you who didn't follow the IDN WG for the past few years, this is highly analogous to the debate that happened there. The whole idea of a "transition" sounds great until you realize that the second format is going to be with us forever. Given that the transition strategy is harder than simply going with IMAA-ACE, there has to be a good reason for it.

I don't consider "UTF-8 is good" to be a good enough reason. (And before anyone here calls me "anti-UTF-8", please look at the top of the first page of the UTF-8 RFC.)

>The point is that it *is* viable to regard IMAA-ACE as a transition
>strategy to a native UTF-8 format (at least as far as SMTP based mail
>goes) because we have a robust negotiation mechanism. In this way,
>IMAs are different from IDNs.

Right: we were smart enough not to do that in the IDN WG.

>I'm not necessarily saying it's a good idea, just noting that it is
>possible.

Generally, a stronger argument than that is needed.
--Paul Hoffman, Director --Internet Mail Consortium From owner-ietf-imaa Wed Feb 19 14:53:35 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1JMrZc12488 for ietf-imaa-bks; Wed, 19 Feb 2003 14:53:35 -0800 (PST) Received: from stoneport.math.uic.edu (stoneport.math.uic.edu [131.193.178.160]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1JMrYd12482 for ; Wed, 19 Feb 2003 14:53:34 -0800 (PST) Received: (qmail 96845 invoked by uid 1016); 19 Feb 2003 22:54:03 -0000 Date: 19 Feb 2003 22:54:03 -0000 Message-ID: <20030219225403.96844.qmail@cr.yp.to> Automatic-Legal-Notices: See http://cr.yp.to/mailcopyright.html. From: "D. J. Bernstein" To: ietf-imaa@imc.org Subject: Re: POP3 mailbox names and IMAP userids References: <1045410561.2571.TMDA@moriarty.gnomon.org.uk> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Roy Badami writes: > It is not uncommon, for instance, for the POP3 mailbox name to be the > entire RFC-822 address, possibly with some other character (such as > percent) substituted for at-sign, presumably to accomodate broken > clients that believe that a POP3 mailbox name cannot contain an > at-sign. Correct. The same protocol works just fine in a UTF-8 world. (As far as I know, the only relevant software bugs are in the client UI, where some graphical clients assume 8859-1 rather than UTF-8.) This is a good illustration of the advantages of a universal character encoding. Sure, it's _possible_ to have different character encodings in different locations, with nasty conversions whenever information moves from one location to another; but getting it right is so difficult that the GoofyCode proponents would rather say stupid things like ``out of scope'' than even begin talking about the details. In contrast, UTF-8 works with the same simple byte copying that we use for ASCII today. ---D. J. 
Bernstein, Associate Professor, Department of Mathematics, Statistics, and Computer Science, University of Illinois at Chicago From owner-ietf-imaa Wed Feb 19 16:07:25 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1K07PM13981 for ietf-imaa-bks; Wed, 19 Feb 2003 16:07:25 -0800 (PST) Received: from tux.w3.org (IDENT:root@tux.w3.org [18.29.0.27]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1K07Nd13975; Wed, 19 Feb 2003 16:07:24 -0800 (PST) Received: from enoshima (IDENT:root@tux.w3.org [18.29.0.27]) by tux.w3.org (8.9.3/8.9.3) with ESMTP id TAA20241; Wed, 19 Feb 2003 19:07:26 -0500 Message-Id: <4.2.0.58.J.20030219174016.04396718@localhost> X-Sender: duerst@localhost X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J Date: Wed, 19 Feb 2003 19:07:19 -0500 To: Paul Hoffman / IMC , ietf-imaa@imc.org From: Martin Duerst Subject: Re: Can we back up a bit and ask some basic questions? In-Reply-To: References: <15955.58113.661260.889533@moriarty.gnomon.org.uk> <4.2.0.58.J.20030215182950.03351550@localhost> <93041416.1045327595@p3.JCK.COM> <1045331284.28770.TMDA@moriarty.gnomon.org.uk> <1045337541.29302.TMDA@moriarty.gnomon.org.uk> <4.2.0.58.J.20030216101703.05134de8@localhost> <4.2.0.58.J.20030219091733.054f4718@localhost> <15955.58113.661260.889533@moriarty.gnomon.org.uk> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 13:40 03/02/19 -0800, Paul Hoffman / IMC wrote: >At 8:03 PM +0000 2/19/03, Roy Badami wrote: >>I think the only way this would be viable is if it was mandatory to >>convert to IMAA-ACE rather than bounce. > >So we need two mechanisms instead of one? And the advantage of that is...? So we needed two different mechanisms (QP/base64 and 8-bit MIME) for body parts. I assume these things were not created without some good advantages in mind. 
>For those of you who didn't follow the IDN WG for the past few years, this
>is highly analogous to the debate that happened there. The whole idea of a
>"transition" sounds great until you realize that the second format is
>going to be with us forever. Given that the transition strategy is harder
>than simply going with IMAA-ACE, there has to be a good reason for it.
>
>I don't consider "UTF-8 is good" to be a good enough reason.

Of course just saying 'UTF-8 is good' doesn't cut it. But the same goes for 'ACE is good'.

The best way to explain the advantages of UTF-8, in my view, is to look at how to work on email data (mailboxes) with scripts and tools. While this is not laid down in any standard, and is usually not considered much in discussions like these, the whole area of scripts and tools is very important for the success of a technology. And this definitely was the case for Internet mail (as opposed, e.g., to some ISO projects in the same area).

Now being able to more/grep/less/awk/sed/perl/... through a mailbox is extremely easy as long as everything relevant stays in US-ASCII. It would also be very easy as long as everything relevant is in UTF-8. But as soon as things such as RFC 2047 and ACE come in, things get extremely complicated. Searching for 'Paul' or 'Hoffman' in email headers is trivial. Searching for (the native character equivalents of) 'Taro' or 'Suzuki' in the same headers turns into a major engineering project. It doesn't need to stay that way.

A different way to explain things: Ulrich Drepper, of glibc fame, once put it very clearly that in order to move internationalization forward, he and other people with his knowledge would work on getting the basics done (i.e. moving things to UTF-8 or equivalent), and the people on the other side of the globe could then work on top of that (localizing applications, language-specific search,...).
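The header-search point made earlier in this message can be sketched roughly as follows, assuming Python's standard email.header module. The addresses, the sample name, and the helper header_contains are hypothetical; the sketch only shows why a plain substring search (the grep case) stops working once a name is carried as an RFC 2047 encoded-word, and that a decoding step is needed first.

```python
from email.header import Header, decode_header, make_header


def header_contains(raw_header: str, needle: str) -> bool:
    """Search a header value after decoding any RFC 2047 encoded-words."""
    decoded = str(make_header(decode_header(raw_header)))
    return needle in decoded


# An ASCII-only header: a plain substring search is enough.
ascii_from = "Paul Hoffman <paul@example.org>"

# The same kind of header with a non-ASCII display name, as it would
# appear on the wire (an encoded-word followed by the address).
encoded_from = Header("鈴木 太郎", "utf-8").encode() + " <taro@example.org>"

plain_hit = "Hoffman" in ascii_from          # works without any decoding
naive_hit = "鈴木" in encoded_from            # fails: the wire form is ASCII
decoded_hit = header_contains(encoded_from, "鈴木")  # needs the decode step
```

With headers carried directly in UTF-8, the naive substring test would behave like the ASCII case; with encoded-words (or an ACE), every tool needs its own decoder.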
The IETF community has formalized this kind of thinking in a BCP, http://www.ietf.org/rfc/rfc2277.txt, (and look who is at the top of that document!) which says: "Protocols MUST be able to use the UTF-8 charset,"... While the IETF is very well known for its flexibility, RFC 2277 should not be something that is easily dismissed. Indeed, the burden should be on people to prove that UTF-8 does not work at all (which I haven't seen argued here yet), without constantly trying to turn around the burden of proof. >(And before anyone here calls me "anti-UTF-8", please look at the top of >the first page of the UTF-8 RFC.) Sorry, but http://www.ietf.org/rfc/rfc2279.txt lists Francois Yergeau as its only author. Same for the next version that is in the works (http://www.ietf.org/internet-drafts/draft-yergeau-rfc2279bis-03.txt). That one lists you in the acknowledgements. You have co-authored the RFC on UTF-16 (http://www.ietf.org/rfc/rfc2781.txt). That's something different. Regards, Martin. From owner-ietf-imaa Wed Feb 19 19:37:20 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1K3bK319030 for ietf-imaa-bks; Wed, 19 Feb 2003 19:37:20 -0800 (PST) Received: from pie1.i-dns.net ([203.81.44.31]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1K3bId19026 for ; Wed, 19 Feb 2003 19:37:18 -0800 (PST) Received: from localhost (localhost [127.0.0.1]) by pie1.i-dns.net (Postfix) with ESMTP id 5F47A78A84 for ; Thu, 20 Feb 2003 03:37:10 +0000 (GMT) Received: from pie1.i-dns.net ([127.0.0.1]) by localhost (pie1.i-dns.net [127.0.0.1:10024]) (amavisd-new) with SMTP id 86335-04 for ; Thu, 20 Feb 2003 03:37:08 +0000 (GMT) Received: from jeffreyibm (unknown [211.219.16.83]) by pie1.i-dns.net (Postfix) with SMTP id 10CC978A94 for ; Thu, 20 Feb 2003 03:37:07 +0000 (GMT) Message-ID: <006601c2d891$a29fbbb0$3800a8c0@jeffreyibm> From: "Jeffrey J Zahari" To: References: <4.2.0.58.J.20030215182950.03351550@localhost> <93041416.1045327595@p3.JCK.COM> 
<1045331284.28770.TMDA@moriarty.gnomon.org.uk> <1045337541.29302.TMDA@moriarty.gnomon.org.uk> <4.2.0.58.J.20030216101703.05134de8@localhost> <4.2.0.58.J.20030219091733.054f4718@localhost> <15955.58113.661260.889533@moriarty.gnomon.org.uk> Subject: Re: Can we back up a bit and ask some basic questions?An alternate model Date: Thu, 20 Feb 2003 12:39:10 +0900 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2800.1106 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1106 X-Virus-Scanned: by amavisd-new Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: ----- Original Message ----- From: "Paul Hoffman / IMC" To: Sent: Thursday, February 20, 2003 6:40 AM Subject: Re: Can we back up a bit and ask some basicquestions?Analternate model > > At 8:03 PM +0000 2/19/03, Roy Badami wrote: > >I think the only way this would be viable is if it was mandatory to > >convert to IMAA-ACE rather than bounce. > > So we need two mechanisms instead of one? And the advantage of that is...? > > For those of you who didn't follow the IDN WG for the past few years, > this is highly analogous to the debate that happened there. The whole > idea of a "transition" sounds great until you realize that the second > format is going to be with us forever. Given that the transition > strategy is harder than simply going with IMAA-ACE, there has to be a > good reason for it. > > I don't consider "UTF-8 is good" to be a good enough reason. (And > before anyone here calls me "anti-UTF-8", please look at the top of > the first page of the UTF-8 RFC.) > > >The point is that it *is* viable to regard IMAA-ACE as a transition > >strategy to a native UTF-8 format (at least as far as SMTP based mail > >goes) because we have a robust negotiation mechanism. In this way, > >IMAs are different from IDNs. 
> > Right: we were smart enough not to do that in the IDN WG.
>

Query and reply for domain names differs from the mechanism provided by 2821 in that 2821 gives the sending MTA an opportunity to negotiate the encoding of destination mailbox names with the receiving MTA. In that sense, a 2821 UTF8ADDRESS extension can exist as a separate proposal alongside IMAA.

Here is how both can exist: because the IMAA-ACE approach implicitly assumes IMAA-ACE in 2821, a sender/receiver MTA can, using SMTP extensions, specify UTF8ADDRESS or IMAAADDRESS, leaving it up to the intermediate MTA to do the appropriate conversions if necessary. It is assumed that servers advertising UTF8ADDRESS have the wherewithal to apply IDNA to the RHS for DNS resolution.

jeffrey j zahari

From owner-ietf-imaa Thu Feb 20 05:28:07 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1KDS7P07652 for ietf-imaa-bks; Thu, 20 Feb 2003 05:28:07 -0800 (PST) Received: from mail.uni-bielefeld.de (IDENT:72@mail2.uni-bielefeld.de [129.70.4.90]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1KDS5d07648 for ; Thu, 20 Feb 2003 05:28:06 -0800 (PST) Received: from dirichlet.Physik.Uni-Bielefeld.DE (dirichlet.Physik.Uni-Bielefeld.DE [129.70.125.234]) by mail.uni-bielefeld.de (Sun Internet Mail Server sims.4.0.2000.10.12.16.25.p8) with ESMTP id <0HAM004S702L36@mail.uni-bielefeld.de> for ietf-imaa@imc.org; Thu, 20 Feb 2003 14:27:58 +0100 (MET) Date: Thu, 20 Feb 2003 14:19:54 +0100 From: Marc Mutz Subject: Re: Can we back up a bit and ask some basic questions? An alternate model In-reply-to: <006601c2d891$a29fbbb0$3800a8c0@jeffreyibm> To: ietf-imaa@imc.org Message-id: <200302201419.58159@sendmail.mutz.com> Organization: "Old Europe" - and proud MIME-version: 1.0 Content-type: multipart/signed; protocol="application/pgp-signature"; micalg=pgp-sha1; boundary="Boundary-02=_+XNV+IRR01s4Z+J"; charset="us-ascii" Content-transfer-encoding: 7bit User-Agent: KMail/1.5.9 X-PGP-Key: 0xBDBFE838 References:
<4.2.0.58.J.20030215182950.03351550@localhost> <006601c2d891$a29fbbb0$3800a8c0@jeffreyibm> Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID:

--Boundary-02=_+XNV+IRR01s4Z+J Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Content-Description: signed data Content-Disposition: inline

On Thursday 20 February 2003 04:39, Jeffrey J Zahari wrote:
> Here is how both can exist: because IMAA-ACE approach implicitly
> assumes 2821 IMAA-ACE, a sender/receiver MTA can, using smtp
> extensions, specify UTF8ADDRESS or IMAAADDRESS, leaving it up to the
> intermediate MTA to do the appropriate conversions if necessary. It
> is assumed that servers advertising UTF8ADDRESS have the wherewithal
> to IDNA the RHS for dns resolution.

Any UTF8ADDRESS extension to SMTP is a way to make the SMTP local part
(and domain?) slots explicitly IMA-aware. This is completely orthogonal
to IMAA-ACE. A server supporting UTF8ADDRESS would be required to
encode all addresses in IMAA-ACE if the next hop doesn't announce the
UTF8ADDRESS extension.

A UTF8ADDRESS SMTP extension doesn't solve the problem for any other
IMA-unaware slot. To that extent its usefulness is limited (though the
same can be said about 8BITMIME, of course). It's a convenience to...
well, to whom? While 8BITMIME corresponds to the "8bit" CTE in MIME
messages and thus saves (if supported) applying a CTE at the MUA level,
UTF8ADDRESS lacks such support outside of SMTP, since rfc2822 slots will
still be IMAA-unaware.

So: Who is going to benefit from UTF8ADDRESS (other than aesthetics)?

Marc
--
Ein Grundrecht auf Sicherheit steht bewusst nicht in der Verfassung.
 -- Sabine Leutheusser-Schnarrenberger (ehem.
Bundesjustizministerin) --Boundary-02=_+XNV+IRR01s4Z+J Content-Type: application/pgp-signature Content-Description: signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.1 (GNU/Linux) iD8DBQA+VNX93oWD+L2/6DgRAto+AKDtV5Ti/JqIe3wk7U8PcCc/mtfVcQCg4+4w WDvykc1fP6L5XzF63Ryjcto= =ATK/ -----END PGP SIGNATURE----- --Boundary-02=_+XNV+IRR01s4Z+J-- From owner-ietf-imaa Thu Feb 20 07:37:22 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1KFbME17187 for ietf-imaa-bks; Thu, 20 Feb 2003 07:37:22 -0800 (PST) Received: from relay-3m.club-internet.fr (relay-3m.club-internet.fr [194.158.104.42]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1KFbKd17183 for ; Thu, 20 Feb 2003 07:37:21 -0800 (PST) Received: from mine.club-internet.fr (f05v-22-96.d1.club-internet.fr [212.194.201.96]) by relay-3m.club-internet.fr (Postfix) with ESMTP id 2B4A7E1D1; Thu, 20 Feb 2003 16:38:08 +0100 (CET) Message-Id: <5.2.0.9.0.20030220164234.0337cdd0@mail.club-internet.fr> X-Sender: jefsey@mail.club-internet.fr X-Mailer: QUALCOMM Windows Eudora Version 5.2.0.9 Date: Thu, 20 Feb 2003 16:44:05 +0100 To: Marc Mutz , ietf-imaa@imc.org From: "J-F C. (Jefsey) Morfin" Subject: Re: Can we back up a bit and ask some basic questions?An alternate model In-Reply-To: <200302201419.58159@sendmail.mutz.com> References: <006601c2d891$a29fbbb0$3800a8c0@jeffreyibm> <4.2.0.58.J.20030215182950.03351550@localhost> <006601c2d891$a29fbbb0$3800a8c0@jeffreyibm> Mime-Version: 1.0 Content-Type: multipart/mixed; x-avg-checked=avg-ok-6B0D5DA6; boundary="=======1311FC3=======" Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: --=======1311FC3======= Content-Type: text/plain; x-avg-checked=avg-ok-6B0D5DA6; charset=us-ascii; format=flowed Content-Transfer-Encoding: 8bit At 14:19 20/02/03, Marc Mutz wrote: >Who is going to benefit from UTF8ADDRESS (other than aesthetics)? users. 
It happens that they do not care about UTF8ADDRESS, but they DO care about aesthetics.

--=======1311FC3=======--

From owner-ietf-imaa Thu Feb 20 09:14:38 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1KHEc721659 for ietf-imaa-bks; Thu, 20 Feb 2003 09:14:38 -0800 (PST) Received: from [63.202.92.157] (adsl-63-202-92-157.dsl.snfc21.pacbell.net [63.202.92.157]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1KHEbd21649 for ; Thu, 20 Feb 2003 09:14:37 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: In-Reply-To: <006601c2d891$a29fbbb0$3800a8c0@jeffreyibm> References: <4.2.0.58.J.20030215182950.03351550@localhost> <93041416.1045327595@p3.JCK.COM> <1045331284.28770.TMDA@moriarty.gnomon.org.uk> <1045337541.29302.TMDA@moriarty.gnomon.org.uk> <4.2.0.58.J.20030216101703.05134de8@localhost> <4.2.0.58.J.20030219091733.054f4718@localhost> <15955.58113.661260.889533@moriarty.gnomon.org.uk> <006601c2d891$a29fbbb0$3800a8c0@jeffreyibm>
Date: Thu, 20 Feb 2003 08:58:15 -0800 To: ietf-imaa@imc.org From: Paul Hoffman / IMC Subject: Re: Can we back up a bit and ask some basic questions?An alternate model Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: This thread has gone towards making guesses about how IMAA-UTF8 would be specified, and different people have different guesses. When there is a complete Internet Draft on IMAA-UTF8, we can discuss it sensibly; until then, we can't. If John or Martin or some other proponent of the idea wants to make a draft, please do so. It would be quite appropriate to discuss it on this mailing list. But until then, could we curtail the guessing? --Paul Hoffman, Director --Internet Mail Consortium From owner-ietf-imaa Thu Feb 20 09:34:03 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1KHY3122854 for ietf-imaa-bks; Thu, 20 Feb 2003 09:34:03 -0800 (PST) Received: from neteka.com (www.namesbeyond.com [216.220.34.103]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1KHY2d22849 for ; Thu, 20 Feb 2003 09:34:02 -0800 (PST) Message-ID: <02a701c2d906$1a4a3fe0$0f01a8c0@neteka.inc> From: "Edmon Chung" To: , "Paul Hoffman / IMC" References: <4.2.0.58.J.20030215182950.03351550@localhost> <93041416.1045327595@p3.JCK.COM> <1045331284.28770.TMDA@moriarty.gnomon.org.uk> <1045337541.29302.TMDA@moriarty.gnomon.org.uk> <4.2.0.58.J.20030216101703.05134de8@localhost> <4.2.0.58.J.20030219091733.054f4718@localhost> <15955.58113.661260.889533@moriarty.gnomon.org.uk> <006601c2d891$a29fbbb0$3800a8c0@jeffreyibm> Subject: Re: Can we back up a bit and ask some basic questions?An alternate model Date: Thu, 20 Feb 2003 12:32:53 -0500 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2600.0000 X-MimeOLE: Produced By Microsoft 
MimeOLE V6.00.2600.0000 Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: I do have a draft for it actually based on ESMTP. Should I just send it to this list or do we have a draft archive?... or should I send to ietf for archival?... Edmon ----- Original Message ----- From: "Paul Hoffman / IMC" To: Sent: Thursday, February 20, 2003 11:58 AM Subject: Re: Can we back up a bit and ask some basic questions?An alternate model > > This thread has gone towards making guesses about how IMAA-UTF8 would > be specified, and different people have different guesses. > > When there is a complete Internet Draft on IMAA-UTF8, we can discuss > it sensibly; until then, we can't. If John or Martin or some other > proponent of the idea wants to make a draft, please do so. It would > be quite appropriate to discuss it on this mailing list. But until > then, could we curtail the guessing? > > --Paul Hoffman, Director > --Internet Mail Consortium > From owner-ietf-imaa Thu Feb 20 10:30:36 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1KIUaY27770 for ietf-imaa-bks; Thu, 20 Feb 2003 10:30:36 -0800 (PST) Received: from tux.w3.org (IDENT:root@tux.w3.org [18.29.0.27]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1KIUZd27764; Thu, 20 Feb 2003 10:30:35 -0800 (PST) Received: from enoshima (IDENT:root@tux.w3.org [18.29.0.27]) by tux.w3.org (8.9.3/8.9.3) with ESMTP id NAA26244; Thu, 20 Feb 2003 13:30:36 -0500 Message-Id: <4.2.0.58.J.20030220132811.056277a0@localhost> X-Sender: duerst@localhost X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J Date: Thu, 20 Feb 2003 13:28:52 -0500 To: "Edmon Chung" , , "Paul Hoffman / IMC" From: Martin Duerst Subject: Re: Can we back up a bit and ask some basic questions?An alternate model In-Reply-To: <02a701c2d906$1a4a3fe0$0f01a8c0@neteka.inc> References: <4.2.0.58.J.20030215182950.03351550@localhost> <93041416.1045327595@p3.JCK.COM> 
<1045331284.28770.TMDA@moriarty.gnomon.org.uk> <1045337541.29302.TMDA@moriarty.gnomon.org.uk> <4.2.0.58.J.20030216101703.05134de8@localhost> <4.2.0.58.J.20030219091733.054f4718@localhost> <15955.58113.661260.889533@moriarty.gnomon.org.uk> <006601c2d891$a29fbbb0$3800a8c0@jeffreyibm> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Great! Please submit it as an Internet-Draft and copy this list. Regards, Martin. At 12:32 03/02/20 -0500, Edmon Chung wrote: >I do have a draft for it actually based on ESMTP. >Should I just send it to this list or do we have a draft archive?... or >should I send to ietf for archival?... >Edmon > > >----- Original Message ----- >From: "Paul Hoffman / IMC" >To: >Sent: Thursday, February 20, 2003 11:58 AM >Subject: Re: Can we back up a bit and ask some basic questions?An alternate >model > > > > > > This thread has gone towards making guesses about how IMAA-UTF8 would > > be specified, and different people have different guesses. > > > > When there is a complete Internet Draft on IMAA-UTF8, we can discuss > > it sensibly; until then, we can't. If John or Martin or some other > > proponent of the idea wants to make a draft, please do so. It would > > be quite appropriate to discuss it on this mailing list. But until > > then, could we curtail the guessing? 
> > > > --Paul Hoffman, Director > > --Internet Mail Consortium > > From owner-ietf-imaa Thu Feb 20 12:58:36 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1KKwau05183 for ietf-imaa-bks; Thu, 20 Feb 2003 12:58:36 -0800 (PST) Received: from neteka.com (www.namesbeyond.com [216.220.34.103]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1KKwZd05177 for ; Thu, 20 Feb 2003 12:58:35 -0800 (PST) Message-ID: <036f01c2d922$d1b59910$0f01a8c0@neteka.inc> From: "Edmon Chung" To: , "Paul Hoffman / IMC" , "Martin Duerst" References: <4.2.0.58.J.20030215182950.03351550@localhost> <93041416.1045327595@p3.JCK.COM> <1045331284.28770.TMDA@moriarty.gnomon.org.uk> <1045337541.29302.TMDA@moriarty.gnomon.org.uk> <4.2.0.58.J.20030216101703.05134de8@localhost> <4.2.0.58.J.20030219091733.054f4718@localhost> <15955.58113.661260.889533@moriarty.gnomon.org.uk> <006601c2d891$a29fbbb0$3800a8c0@jeffreyibm> <4.2.0.58.J.20030220132811.056277a0@localhost> Subject: Re: Can we back up a bit and ask some basic questions?An alternate model Date: Thu, 20 Feb 2003 15:58:27 -0500 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2600.0000 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2600.0000 Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: I have just submitted the draft on IMA based on SMTP and POP extensions to the IETF. You could also check it out at http://www.dnsii.org/draft-ietf-chung-imax-00.txt Because I wrote it about 2.5 years ago some of the stuff might need some update. Anyway, comments and discussions would be very much appreciated. :-) Edmon ----- Original Message ----- From: "Martin Duerst" To: "Edmon Chung" ; ; "Paul Hoffman / IMC" Sent: Thursday, February 20, 2003 1:28 PM Subject: Re: Can we back up a bit and ask some basic questions?An alternate model > > Great! 
Please submit it as an Internet-Draft and copy this list. > > Regards, Martin. > > At 12:32 03/02/20 -0500, Edmon Chung wrote: > > >I do have a draft for it actually based on ESMTP. > >Should I just send it to this list or do we have a draft archive?... or > >should I send to ietf for archival?... > >Edmon > > > > > >----- Original Message ----- > >From: "Paul Hoffman / IMC" > >To: > >Sent: Thursday, February 20, 2003 11:58 AM > >Subject: Re: Can we back up a bit and ask some basic questions?An alternate > >model > > > > > > > > > > This thread has gone towards making guesses about how IMAA-UTF8 would > > > be specified, and different people have different guesses. > > > > > > When there is a complete Internet Draft on IMAA-UTF8, we can discuss > > > it sensibly; until then, we can't. If John or Martin or some other > > > proponent of the idea wants to make a draft, please do so. It would > > > be quite appropriate to discuss it on this mailing list. But until > > > then, could we curtail the guessing? 
> > > > > > --Paul Hoffman, Director > > > --Internet Mail Consortium > > > > > From owner-ietf-imaa Thu Feb 20 13:41:05 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1KLf5X06323 for ietf-imaa-bks; Thu, 20 Feb 2003 13:41:05 -0800 (PST) Received: from yxa.extundo.com (178.230.13.217.in-addr.dgcsystems.net [217.13.230.178]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1KLf3d06318 for ; Thu, 20 Feb 2003 13:41:04 -0800 (PST) Received: from latte.josefsson.org (yxa.extundo.com [217.13.230.178]) (authenticated bits=0) by yxa.extundo.com (8.12.7/8.12.7) with ESMTP id h1KLf3Xf002609 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=OK); Thu, 20 Feb 2003 22:41:05 +0100 To: "Edmon Chung" Cc: Subject: Re: Can we back up a bit and ask some basic questions?An alternate model X-Payment: hashcash 1.1 0:030220:edmon@neteka.com:712c461fff8dfceb X-Hashcash: 0:030220:edmon@neteka.com:712c461fff8dfceb X-Payment: hashcash 1.1 0:030220:ietf-imaa@imc.org:3600676ef9aa3613 X-Hashcash: 0:030220:ietf-imaa@imc.org:3600676ef9aa3613 From: Simon Josefsson Date: Thu, 20 Feb 2003 22:41:03 +0100 In-Reply-To: <036f01c2d922$d1b59910$0f01a8c0@neteka.inc> ("Edmon Chung"'s message of "Thu, 20 Feb 2003 15:58:27 -0500") Message-ID: User-Agent: Gnus/5.090016 (Oort Gnus v0.16) Emacs/21.3.50 References: <4.2.0.58.J.20030215182950.03351550@localhost> <93041416.1045327595@p3.JCK.COM> <1045331284.28770.TMDA@moriarty.gnomon.org.uk> <1045337541.29302.TMDA@moriarty.gnomon.org.uk> <4.2.0.58.J.20030216101703.05134de8@localhost> <4.2.0.58.J.20030219091733.054f4718@localhost> <15955.58113.661260.889533@moriarty.gnomon.org.uk> <006601c2d891$a29fbbb0$3800a8c0@jeffreyibm> <4.2.0.58.J.20030220132811.056277a0@localhost> <036f01c2d922$d1b59910$0f01a8c0@neteka.inc> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Spam-Status: No, hits=-2.0 required=5.0 tests=IN_REP_TO,REFERENCES,SPAM_PHRASE_00_01,USER_AGENT, USER_AGENT_GNUS_UA version=2.44 Sender: 
owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: "Edmon Chung" writes: > I have just submitted the draft on IMA based on SMTP and POP extensions to > the IETF. You could also check it out at > http://www.dnsii.org/draft-ietf-chung-imax-00.txt This looks good IMHO. This appears to be a backwards-compatible way to make SMTP and POP3 accept internationalized mail addresses, thus making it possible to gradually phase out the 7-bit legacy-compatible punycode hack. I'm not sure the M-* headers are such a good idea, though; perhaps it is better to simply make this a way to enable non-ASCII in SMTP and POP3 and leave header internationalization up to another standard. The M-From header also seems to break the dichotomy between the SMTP envelope and the RFC 822 header (MAIL FROM doesn't need to be the same as From:). Add an IMAP capability in the same vein and we are set. From owner-ietf-imaa Thu Feb 20 15:07:13 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1KN7D708059 for ietf-imaa-bks; Thu, 20 Feb 2003 15:07:13 -0800 (PST) Received: from neteka.com (www.namesbeyond.com [216.220.34.103]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1KN7Bd08055 for ; Thu, 20 Feb 2003 15:07:11 -0800 (PST) Message-ID: <037501c2d934$c4d47560$0f01a8c0@neteka.inc> From: "Edmon Chung" To: "Simon Josefsson" Cc: References: <4.2.0.58.J.20030215182950.03351550@localhost><93041416.1045327595@p3.JCK.COM><1045331284.28770.TMDA@moriarty.gnomon.org.uk><1045337541.29302.TMDA@moriarty.gnomon.org.uk><4.2.0.58.J.20030216101703.05134de8@localhost><4.2.0.58.J.20030219091733.054f4718@localhost><15955.58113.661260.889533@moriarty.gnomon.org.uk><006601c2d891$a29fbbb0$3800a8c0@jeffreyibm><4.2.0.58.J.20030220132811.056277a0@localhost><036f01c2d922$d1b59910$0f01a8c0@neteka.inc> Subject: Re: Can we back up a bit and ask some basic questions?An alternate model Date: Thu, 20 Feb 2003 18:06:56 -0500 MIME-Version: 1.0 Content-Type: text/plain; 
charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2600.0000 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2600.0000 Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Good to hear from you. I actually agree with you about the M- headers... I wasn't so sure about them to begin with. But in order to phase out the ACE fallback, there need to be new header fields that can carry email addresses in forms other than ACE... What are your thoughts? Edmon ----- Original Message ----- From: "Simon Josefsson" To: "Edmon Chung" Cc: Sent: Thursday, February 20, 2003 4:41 PM Subject: Re: Can we back up a bit and ask some basic questions?An alternate model > > "Edmon Chung" writes: > > > I have just submitted the draft on IMA based on SMTP and POP extensions to > > the IETF. You could also check it out at > > http://www.dnsii.org/draft-ietf-chung-imax-00.txt > > This looks good IMHO. > > Apparently, this seems to be a back-wards compatible way to make SMTP > and POP-3 accept internationalized mail addresses, thus making it > possible to gradually phase out the 7-bit legacy compatible punycode > hack. I'm not sure the M-* headers is such a good idea though, > perhaps it is better to simply make this a way to enable non-ASCII in > SMTP and POP3 and leave header internationalization up to another > standard. The M-From also seem to break the SMTP and RFC822 envelope > header dichotomy (MAIL FROM doesn't need to be the same as From:). > > Add a IMAP capability in the same vein and we are set. 
> > From owner-ietf-imaa Thu Feb 20 15:19:11 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1KNJB208293 for ietf-imaa-bks; Thu, 20 Feb 2003 15:19:11 -0800 (PST) Received: from yxa.extundo.com (178.230.13.217.in-addr.dgcsystems.net [217.13.230.178]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1KNJAd08289 for ; Thu, 20 Feb 2003 15:19:10 -0800 (PST) Received: from latte.josefsson.org (yxa.extundo.com [217.13.230.178]) (authenticated bits=0) by yxa.extundo.com (8.12.7/8.12.7) with ESMTP id h1KNJ9Xf006788 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=OK); Fri, 21 Feb 2003 00:19:11 +0100 To: "Edmon Chung" Cc: Subject: Re: Can we back up a bit and ask some basic questions?An alternate model X-Payment: hashcash 1.1 0:030220:edmon@neteka.com:8dafbf099f914554 X-Hashcash: 0:030220:edmon@neteka.com:8dafbf099f914554 X-Payment: hashcash 1.1 0:030220:ietf-imaa@imc.org:83e495a30fe86269 X-Hashcash: 0:030220:ietf-imaa@imc.org:83e495a30fe86269 From: Simon Josefsson Date: Fri, 21 Feb 2003 00:19:09 +0100 In-Reply-To: <037501c2d934$c4d47560$0f01a8c0@neteka.inc> ("Edmon Chung"'s message of "Thu, 20 Feb 2003 18:06:56 -0500") Message-ID: User-Agent: Gnus/5.090016 (Oort Gnus v0.16) Emacs/21.3.50 References: <4.2.0.58.J.20030215182950.03351550@localhost> <93041416.1045327595@p3.JCK.COM> <1045331284.28770.TMDA@moriarty.gnomon.org.uk> <1045337541.29302.TMDA@moriarty.gnomon.org.uk> <4.2.0.58.J.20030216101703.05134de8@localhost> <4.2.0.58.J.20030219091733.054f4718@localhost> <15955.58113.661260.889533@moriarty.gnomon.org.uk> <006601c2d891$a29fbbb0$3800a8c0@jeffreyibm> <4.2.0.58.J.20030220132811.056277a0@localhost> <036f01c2d922$d1b59910$0f01a8c0@neteka.inc> <037501c2d934$c4d47560$0f01a8c0@neteka.inc> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Spam-Status: No, hits=-2.0 required=5.0 tests=IN_REP_TO,REFERENCES,SPAM_PHRASE_02_03,USER_AGENT, USER_AGENT_GNUS_UA version=2.44 Sender: owner-ietf-imaa@mail.imc.org 
Precedence: bulk List-Archive: List-Unsubscribe: List-ID: "Edmon Chung" writes: > Good to hear from you. > I actually agree with you about the M- headers thing... i wasnt so sure to > begin with. > But in order to phase out the ACE fallback, there needs to be new header > fields that could use email addresses in other forms than ACE... What are > your thoughts? I don't think the issues need to be linked -- it would be possible to phase out punycoded addresses in SMTP MAIL FROM but keep them in message headers. It seems to me that internationalization of RFC 2821, POP3, and IMAP is independent of internationalization of RFC 2822. It seems unlikely that it will be possible to move away from punycode in RFC 2822 soon, because it is a stored format rather than an interactive protocol like RFC 2821. In a protocol you can agree on new non-backwards-compatible behaviour for a single session (such as your proposal), because all parties that are interested in the session are present and can negotiate. In a storage format, any new features should either be developed within the original specification, or a completely new version of the format should be developed, because the parties that will see a given RFC 2822 message are not all present and able to interact with the sender to negotiate what features to use. A separate issue: I think your document should say that the strings passed in MAIL FROM (whether as ACE or UTF-8) should be processed by the IMA stringprep profile. 
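The per-session negotiation described above can be sketched roughly as follows. This is only an illustration, not text from any draft: it assumes the extension would be advertised as an EHLO keyword, and the keyword name "IMAX" and both function names are hypothetical placeholders. It parses a multi-line EHLO reply and decides whether the client may send the address as UTF-8 or must fall back to ACE.

```python
def advertised_keywords(ehlo_reply: str) -> set[str]:
    """Collect extension keywords from a multi-line EHLO reply."""
    keywords = set()
    lines = ehlo_reply.splitlines()
    # The first reply line carries the server greeting, not a keyword.
    for line in lines[1:]:
        if len(line) < 4 or not line.startswith("250"):
            continue
        rest = line[4:].strip()  # drop the "250-" or "250 " prefix
        if rest:
            keywords.add(rest.split()[0].upper())
    return keywords


def choose_address_form(ehlo_reply: str, keyword: str = "IMAX") -> str:
    """Return "utf-8" if the (hypothetical) keyword is advertised, else "ace"."""
    if keyword.upper() in advertised_keywords(ehlo_reply):
        return "utf-8"
    return "ace"  # server said nothing about the extension: fall back
```

The point of the sketch is that the decision is made per session, with both parties present; nothing comparable is available to the reader of a stored message.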
From owner-ietf-imaa Thu Feb 20 18:50:57 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1L2ovw12509 for ietf-imaa-bks; Thu, 20 Feb 2003 18:50:57 -0800 (PST) Received: from pie1.i-dns.net ([203.81.44.31]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1L2otd12505 for ; Thu, 20 Feb 2003 18:50:55 -0800 (PST) Received: from localhost (localhost [127.0.0.1]) by pie1.i-dns.net (Postfix) with ESMTP id 5CD3778A6C; Fri, 21 Feb 2003 02:50:42 +0000 (GMT) Received: from pie1.i-dns.net ([127.0.0.1]) by localhost (pie1.i-dns.net [127.0.0.1:10024]) (amavisd-new) with SMTP id 17025-05; Fri, 21 Feb 2003 02:50:35 +0000 (GMT) Received: from jeffreyibm (unknown [211.219.16.83]) by pie1.i-dns.net (Postfix) with SMTP id 6E2DE78A69; Fri, 21 Feb 2003 02:50:32 +0000 (GMT) Message-ID: <00b101c2d954$4a9898b0$3800a8c0@jeffreyibm> From: "Jeffrey J Zahari" To: "Edmon Chung" , "Simon Josefsson" Cc: References: <4.2.0.58.J.20030215182950.03351550@localhost><93041416.1045327595@p3.JCK.COM><1045331284.28770.TMDA@moriarty.gnomon.org.uk><1045337541.29302.TMDA@moriarty.gnomon.org.uk><4.2.0.58.J.20030216101703.05134de8@localhost><4.2.0.58.J.20030219091733.054f4718@localhost><15955.58113.661260.889533@moriarty.gnomon.org.uk><006601c2d891$a29fbbb0$3800a8c0@jeffreyibm><4.2.0.58.J.20030220132811.056277a0@localhost><036f01c2d922$d1b59910$0f01a8c0@neteka.inc><037501c2d934$c4d47560$0f01a8c0@neteka.inc> Subject: Re: Can we back up a bit and ask some basic questions?An alternate model Date: Fri, 21 Feb 2003 11:52:33 +0900 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2800.1106 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1106 X-Virus-Scanned: by amavisd-new Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: The charset identifier is redundant within the examples. 
The use of Q/B encoded-word-like tokens in MAIL FROM already identifies the charset. Similarly, the xn-- prefix identifies the ACE form. Unless 8-bit is used, the identifier is not needed. The M- headers should be renamed as X- headers for experimental usage. But isn't the split between 2821 and 2822 a design decision to keep the message payload separate from the transport mechanism? This looks like a tie-in between 2821 and 2822. In general, SMTP implementations should keep data from 2821 separate from the 2822 message object, storing it internally as some form of metadata. jeffrey j zahari ----- Original Message ----- From: "Simon Josefsson" To: "Edmon Chung" Cc: Sent: Friday, February 21, 2003 8:19 AM Subject: Re: Can we back up a bit and ask some basic questions?An alternate model > > "Edmon Chung" writes: > > > Good to hear from you. > > I actually agree with you about the M- headers thing... i wasnt so sure to > > begin with. > > But in order to phase out the ACE fallback, there needs to be new header > > fields that could use email addresses in other forms than ACE... What are > > your thoughts? > > I don't think the issues need to be linked -- it would be possible to > phase out punycoded address in SMTP MAIL FROM but keep them in message > headers. It seems to me that internationalization of RFC 2821 and > POP3 and IMAP is independent of internationalization of RFC 2822. It > seems unlikely that it will be possible to move away from punycode in > RFC 2822 soon because it is a stored format rather than a interactive > protocol like RFC 2821. 
In a protocol you can agree on new > non-back-wards compatible behaviour for a single session (such as your > proposal) because all parties that are interested in the session are > present and can negotiate, but in a storage format any new features > should either be developed within the original or specification, or a > completely new version of the format should be developed, because all > parties that will see a certain RFC 2821 message is not present and > able to interact with the sender to negotiate what features to use. > > A separate issue: I think your document should say that the strings > passed in MAIL FROM (whether as ACE or UTF-8) should be processed by > the IMA stringprep profile. > > From owner-ietf-imaa Fri Feb 21 04:16:46 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1LCGkk28268 for ietf-imaa-bks; Fri, 21 Feb 2003 04:16:46 -0800 (PST) Received: from yxa.extundo.com (178.230.13.217.in-addr.dgcsystems.net [217.13.230.178]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1LCGhd28264 for ; Fri, 21 Feb 2003 04:16:44 -0800 (PST) Received: from latte.josefsson.org (yxa.extundo.com [217.13.230.178]) (authenticated bits=0) by yxa.extundo.com (8.12.7/8.12.7) with ESMTP id h1LCGdXf022155 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=OK); Fri, 21 Feb 2003 13:16:40 +0100 To: "Edmon Chung" Cc: Subject: Re: Can we back up a bit and ask some basic questions?An alternate model X-Payment: hashcash 1.1 0:030221:edmon@neteka.com:38b5b9d809502e32 X-Hashcash: 0:030221:edmon@neteka.com:38b5b9d809502e32 X-Payment: hashcash 1.1 0:030221:ietf-imaa@imc.org:9a2bab253c846df9 X-Hashcash: 0:030221:ietf-imaa@imc.org:9a2bab253c846df9 From: Simon Josefsson Date: Fri, 21 Feb 2003 13:16:39 +0100 In-Reply-To: (Simon Josefsson's message of "Fri, 21 Feb 2003 00:19:09 +0100") Message-ID: User-Agent: Gnus/5.090016 (Oort Gnus v0.16) Emacs/21.3.50 References: <4.2.0.58.J.20030215182950.03351550@localhost> 
<93041416.1045327595@p3.JCK.COM> <1045331284.28770.TMDA@moriarty.gnomon.org.uk> <1045337541.29302.TMDA@moriarty.gnomon.org.uk> <4.2.0.58.J.20030216101703.05134de8@localhost> <4.2.0.58.J.20030219091733.054f4718@localhost> <15955.58113.661260.889533@moriarty.gnomon.org.uk> <006601c2d891$a29fbbb0$3800a8c0@jeffreyibm> <4.2.0.58.J.20030220132811.056277a0@localhost> <036f01c2d922$d1b59910$0f01a8c0@neteka.inc> <037501c2d934$c4d47560$0f01a8c0@neteka.inc> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Spam-Status: No, hits=-2.8 required=5.0 tests=IN_REP_TO,QUOTED_EMAIL_TEXT,REFERENCES,SPAM_PHRASE_00_01, USER_AGENT,USER_AGENT_GNUS_UA version=2.44 Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Simon Josefsson writes: > A separate issue: I think your document should say that the strings > passed in MAIL FROM (whether as ACE or UTF-8) should be processed by > the IMA stringprep profile. Sorry for following up to myself, but on second thought I think this was a poor suggestion. The stringprep processing should be performed by the receiver. The sender may do it, but shouldn't be required to. My main reason is that stringprep is Unicode-specific, and this proposal parametrizes the character set, which is a good property. Thus, stringprep would only be applicable to the UTF-8 case, which would be confusing. The alternative, restricting the proposal to UTF-8 only, would be less useful. Another alternative would be to add a meta-charset "IMAA" which is UTF-8 with IMAA stringprep processing, but I see no gain from it. Another reason is that a robust receiver will perform stringprep processing anyway, to be sure to catch non-IMAA-aware clients, for example when someone uses telnet to an SMTP port. 
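A minimal sketch of the receiver-side processing argued for above, under loud assumptions: the IMAA stringprep profile itself is not implemented here, and Unicode NFKC normalization is used purely as a stand-in for it. The function name is hypothetical. What the sketch tries to show is the order of operations: the receiver first converts from whatever charset the sender declared into Unicode, and only then applies the Unicode-specific preparation step.

```python
import unicodedata


def receiver_normalize(raw: bytes, charset: str) -> str:
    """Decode an address from its declared charset, then normalize it.

    NFKC is only a stand-in for the IMAA stringprep profile, which this
    sketch does not implement.
    """
    # Step 1: charset conversion; raises if the bytes or charset are invalid.
    text = raw.decode(charset)
    # Step 2: Unicode-specific preparation, done by the receiver regardless
    # of whether the sender already did it.
    return unicodedata.normalize("NFKC", text)
```

Because the receiver normalizes unconditionally, two senders using different charsets, or a hand-typed telnet session, end up compared on the same Unicode form.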
From owner-ietf-imaa Sun Feb 23 13:56:38 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1NLucH07961 for ietf-imaa-bks; Sun, 23 Feb 2003 13:56:38 -0800 (PST) Received: from smtp6.andrew.cmu.edu (SMTP6.andrew.cmu.edu [128.2.10.86]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1NLuad07956 for ; Sun, 23 Feb 2003 13:56:36 -0800 (PST) Received: from penguin.andrew.cmu.edu (PENGUIN.andrew.cmu.edu [128.2.121.100]) by smtp6.andrew.cmu.edu (8.12.7.Beta1/8.12.3.Beta2) with ESMTP id h1NLuApZ021414; Sun, 23 Feb 2003 16:56:10 -0500 Date: Sun, 23 Feb 2003 16:56:10 -0500 Message-Id: <200302232156.h1NLuApZ021414@smtp6.andrew.cmu.edu> From: Lawrence Greenfield X-Mailer: BatIMail version 3.3 To: Roy Badami , John C Klensin Cc: "ietf-imaa@imc.org" In-reply-to: <160744047.1045395298@p3.JCK.COM> Subject: Re: POP3 mailbox names and IMAP userids References: <1045410561.2571.TMDA@moriarty.gnomon.org.uk> <160744047.1045395298@p3.JCK.COM> User-Agent: SEMI/1.14.3 (Ushinoya) FLIM/1.14.3 (=?ISO-8859-4?Q?Unebigory?= =?ISO-8859-4?Q?=F2mae?=) Emacs/21.2 (i686-pc-linux-gnu) MULE/5.0 (SAKAKI) MIME-Version: 1.0 (generated by SEMI 1.14.3 - "Ushinoya") Content-Type: text/plain; charset=US-ASCII Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Date: Sun, 16 Feb 2003 11:34:58 -0500 From: John C Klensin [...] In practice, POP3 and IMAP accounts (mailboxes and user ids, respectively) have two important properties: (i) Unlike email addresses, which people type, enter into address books, and pass around to each other using mechanisms other than mail headers and envelopes, they are typically configured once per user (or at most, once per client machine). There are some separate issues when these things are accessed through web interfaces that simulate MUAs, but those are, well, separate issues. This is incorrect. In IMAP they are manipulated on ACLs and may form part of a hierarchy. 
Being able to understand another username ("principal" in the Kerberos world) is very important for any collaborative system. It is also important to users that they match up with e-mail addresses. My Kerberos principal ("leg@ANDREW.CMU.EDU") matches up nicely with my e-mail address ("leg@andrew.cmu.edu") and both might appear in a personal certificate. Unfortunately there's no BCP on principal names and authorization, so IETF protocols have different ideas of "user", "realm", "dn", etc. While it's possible for these things to diverge from e-mail addresses, it is highly undesirable that they do so. Larry From owner-ietf-imaa Sun Feb 23 21:01:31 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1O51V916775 for ietf-imaa-bks; Sun, 23 Feb 2003 21:01:31 -0800 (PST) Received: from sentosa.post1.com (sentosa.post1.com [202.27.17.100]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1O51Td16770 for ; Sun, 23 Feb 2003 21:01:30 -0800 (PST) Received: (qmail 11727 invoked from network); 24 Feb 2003 05:00:22 -0000 Received: from 220-128-56-69.hinet-ip.hinet.net (HELO JSENGTOSHIBA) (220.128.56.69) by sentosa.post1.com with SMTP; 24 Feb 2003 05:00:22 -0000 Message-ID: <097e01c2dbc0$0f87ff90$b03080dc@JSENGTOSHIBA> From: "James Seng" To: "Edmon Chung" , , "Paul Hoffman / IMC" , "Martin Duerst" References: <4.2.0.58.J.20030215182950.03351550@localhost> <93041416.1045327595@p3.JCK.COM> <1045331284.28770.TMDA@moriarty.gnomon.org.uk> <1045337541.29302.TMDA@moriarty.gnomon.org.uk> <4.2.0.58.J.20030216101703.05134de8@localhost> <4.2.0.58.J.20030219091733.054f4718@localhost> <15955.58113.661260.889533@moriarty.gnomon.org.uk> <006601c2d891$a29fbbb0$3800a8c0@jeffreyibm> <4.2.0.58.J.20030220132811.056277a0@localhost> <036f01c2d922$d1b59910$0f01a8c0@neteka.inc> Subject: Re: Can we back up a bit and ask some basic questions?An alternate model Date: Mon, 24 Feb 2003 12:49:04 +0800 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" 
Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2800.1106 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1106 Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: A few comments: 1. Please submit this as a proper IETF I-D, thank you. 2. There are two separate issues, 2821 and 2822. IMAA deals with 2822 (and probably the layers above it), whereas your proposal deals with 2821 specifically. Please don't violate layering. 3. Your proposal does not address 2821 servers that do not respond with IMAX. 4. Don't forget the lessons of 8BITMIME and/or the lack of it. -James Seng ----- Original Message ----- From: "Edmon Chung" To: ; "Paul Hoffman / IMC" ; "Martin Duerst" Sent: Friday, February 21, 2003 4:58 AM Subject: Re: Can we back up a bit and ask some basic questions?An alternate model > > I have just submitted the draft on IMA based on SMTP and POP extensions to > the IETF. You could also check it out at > http://www.dnsii.org/draft-ietf-chung-imax-00.txt > > Because I wrote it about 2.5 years ago some of the stuff might need some > update. Anyway, comments and discussions would be very much appreciated. > :-) > > Edmon > > > > ----- Original Message ----- > From: "Martin Duerst" > To: "Edmon Chung" ; ; "Paul Hoffman / > IMC" > Sent: Thursday, February 20, 2003 1:28 PM > Subject: Re: Can we back up a bit and ask some basic questions?An alternate > model > > > > > > Great! Please submit it as an Internet-Draft and copy this list. > > > > Regards, Martin. > > > > At 12:32 03/02/20 -0500, Edmon Chung wrote: > > > > >I do have a draft for it actually based on ESMTP. > > >Should I just send it to this list or do we have a draft archive?... or > > >should I send to ietf for archival?... 
> > >Edmon > > > > > > > > >----- Original Message ----- > > >From: "Paul Hoffman / IMC" > > >To: > > >Sent: Thursday, February 20, 2003 11:58 AM > > >Subject: Re: Can we back up a bit and ask some basic questions?An > alternate > > >model > > > > > > > > > > > > > > This thread has gone towards making guesses about how IMAA-UTF8 would > > > > be specified, and different people have different guesses. > > > > > > > > When there is a complete Internet Draft on IMAA-UTF8, we can discuss > > > > it sensibly; until then, we can't. If John or Martin or some other > > > > proponent of the idea wants to make a draft, please do so. It would > > > > be quite appropriate to discuss it on this mailing list. But until > > > > then, could we curtail the guessing? > > > > > > > > --Paul Hoffman, Director > > > > --Internet Mail Consortium > > > > > > > > > From owner-ietf-imaa Sun Feb 23 23:07:13 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1O77De21303 for ietf-imaa-bks; Sun, 23 Feb 2003 23:07:13 -0800 (PST) Received: from neteka.com (www.namesbeyond.com [216.220.34.103]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1O77Bd21292 for ; Sun, 23 Feb 2003 23:07:12 -0800 (PST) Message-ID: <025d01c2dbd3$2aa36950$fb5016d3@neteka.inc> From: "Edmon Chung" To: "James Seng" , , "Paul Hoffman / IMC" , "Martin Duerst" References: <4.2.0.58.J.20030215182950.03351550@localhost> <93041416.1045327595@p3.JCK.COM> <1045331284.28770.TMDA@moriarty.gnomon.org.uk> <1045337541.29302.TMDA@moriarty.gnomon.org.uk> <4.2.0.58.J.20030216101703.05134de8@localhost> <4.2.0.58.J.20030219091733.054f4718@localhost> <15955.58113.661260.889533@moriarty.gnomon.org.uk> <006601c2d891$a29fbbb0$3800a8c0@jeffreyibm> <4.2.0.58.J.20030220132811.056277a0@localhost> <036f01c2d922$d1b59910$0f01a8c0@neteka.inc> <097e01c2dbc0$0f87ff90$b03080dc@JSENGTOSHIBA> Subject: Re: Can we back up a bit and ask some basic questions?An alternate model Date: Mon, 24 Feb 2003 02:05:44 -0500 MIME-Version: 
1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2600.0000 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2600.0000 Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Hi James, ----- Original Message ----- From: "James Seng" > 1. Please submit this as an proper IETF I-D, thank you. This has been done; please check: http://www.ietf.org/internet-drafts/draft-chung-imax-00.txt > 2. There are two separate issues, 2821 & 2822. IMAA deals along 2822 (and > probably above it) whereas your proposal deals with 2821 specifically. > Please dont violate laying. Yup, this has been pointed out by a couple of people. It will be updated in -01, where Section 3 will be eliminated altogether. > 3. Your proposal did not address with 2821 servers who did not response > IMAX. You are right. I forgot to add it in; it would be the same as the second example in Section 2.2, though. But yes, I will add it in. > 4. Dont forget the lessons on 8BITMIME and/or the lack of it. I was actually thinking that IMAX would encourage servers to support and announce ESMTP + 8BITMIME. :-) Edmon > > -James Seng > > ----- Original Message ----- > From: "Edmon Chung" > To: ; "Paul Hoffman / IMC" ; "Martin > Duerst" > Sent: Friday, February 21, 2003 4:58 AM > Subject: Re: Can we back up a bit and ask some basic questions?An alternate > model > > > > > > I have just submitted the draft on IMA based on SMTP and POP extensions to > > the IETF. You could also check it out at > > http://www.dnsii.org/draft-ietf-chung-imax-00.txt > > > > Because I wrote it about 2.5 years ago some of the stuff might need some > > update. Anyway, comments and discussions would be very much appreciated. 
> > :-) > > > > Edmon > > > > > > > > ----- Original Message ----- > > From: "Martin Duerst" > > To: "Edmon Chung" ; ; "Paul Hoffman / > > IMC" > > Sent: Thursday, February 20, 2003 1:28 PM > > Subject: Re: Can we back up a bit and ask some basic questions?An > alternate > > model > > > > > > > > > > Great! Please submit it as an Internet-Draft and copy this list. > > > > > > Regards, Martin. > > > > > > At 12:32 03/02/20 -0500, Edmon Chung wrote: > > > > > > >I do have a draft for it actually based on ESMTP. > > > >Should I just send it to this list or do we have a draft archive?... or > > > >should I send to ietf for archival?... > > > >Edmon > > > > > > > > > > > >----- Original Message ----- > > > >From: "Paul Hoffman / IMC" > > > >To: > > > >Sent: Thursday, February 20, 2003 11:58 AM > > > >Subject: Re: Can we back up a bit and ask some basic questions?An > > alternate > > > >model > > > > > > > > > > > > > > > > > > This thread has gone towards making guesses about how IMAA-UTF8 > would > > > > > be specified, and different people have different guesses. > > > > > > > > > > When there is a complete Internet Draft on IMAA-UTF8, we can discuss > > > > > it sensibly; until then, we can't. If John or Martin or some other > > > > > proponent of the idea wants to make a draft, please do so. It would > > > > > be quite appropriate to discuss it on this mailing list. But until > > > > > then, could we curtail the guessing? 
> > > > > > > > > > --Paul Hoffman, Director > > > > > --Internet Mail Consortium > > > > > > > > > > > > > > > From owner-ietf-imaa Mon Feb 24 08:19:17 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1OGJH013247 for ietf-imaa-bks; Mon, 24 Feb 2003 08:19:17 -0800 (PST) Received: from [63.202.92.157] (adsl-63-202-92-157.dsl.snfc21.pacbell.net [63.202.92.157]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1OGJ8d13239; Mon, 24 Feb 2003 08:19:08 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: In-Reply-To: <025d01c2dbd3$2aa36950$fb5016d3@neteka.inc> References: <4.2.0.58.J.20030215182950.03351550@localhost> <93041416.1045327595@p3.JCK.COM> <1045331284.28770.TMDA@moriarty.gnomon.org.uk> <1045337541.29302.TMDA@moriarty.gnomon.org.uk> <4.2.0.58.J.20030216101703.05134de8@localhost> <4.2.0.58.J.20030219091733.054f4718@localhost> <15955.58113.661260.889533@moriarty.gnomon.org.uk> <006601c2d891$a29fbbb0$3800a8c0@jeffreyibm> <4.2.0.58.J.20030220132811.056277a0@localhost> <036f01c2d922$d1b59910$0f01a8c0@neteka.inc> <097e01c2dbc0$0f87ff90$b03080dc@JSENGTOSHIBA> <025d01c2dbd3$2aa36950$fb5016d3@neteka.inc> X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . 
Date: Mon, 24 Feb 2003 07:59:54 -0800 To: "Edmon Chung" , "James Seng" , , "Martin Duerst" From: Paul Hoffman / IMC Subject: Re: Can we back up a bit and ask some basic questions?An alternate model Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 2:05 AM -0500 2/24/03, Edmon Chung wrote: >http://www.ietf.org/internet-drafts/draft-chung-imax-00.txt This doesn't address any of the issues that I raised with John's proposal of IMAA-UTF8, and in fact brings in many more horrible problems, such as bad charset mappings and forcing a client fallback. What is the actual deployment advantage of an ESMTP extension over just plain IMAA-ACE? If the MUA or the MTA that is about to write to the message store needs to be able to act differently based on different input, which is true in all the proposals so far, wouldn't a solution that has zero effect on SMTP be better than one that requires a complete infrastructure upgrade (IMAA-UTF8) or one that has an optional infrastructure upgrade and requires an IMAA-ACE "fallback"? 
--Paul Hoffman, Director --Internet Mail Consortium From owner-ietf-imaa Mon Feb 24 10:53:30 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1OIrUh22122 for ietf-imaa-bks; Mon, 24 Feb 2003 10:53:30 -0800 (PST) Received: from yxa.extundo.com (178.230.13.217.in-addr.dgcsystems.net [217.13.230.178]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1OIrRd22108; Mon, 24 Feb 2003 10:53:28 -0800 (PST) Received: from latte.josefsson.org (yxa.extundo.com [217.13.230.178]) (authenticated bits=0) by yxa.extundo.com (8.12.7/8.12.7) with ESMTP id h1OIrPXf029193 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=OK); Mon, 24 Feb 2003 19:53:26 +0100 To: Paul Hoffman / IMC Cc: "Edmon Chung" , "James Seng" , , "Martin Duerst" Subject: Re: Can we back up a bit and ask some basic questions?An alternate model X-Payment: hashcash 1.1 0:030224:phoffman@imc.org:a7e355f4ccfe0946 X-Hashcash: 0:030224:phoffman@imc.org:a7e355f4ccfe0946 X-Payment: hashcash 1.1 0:030224:edmon@neteka.com:fe0591c8fe13090d X-Hashcash: 0:030224:edmon@neteka.com:fe0591c8fe13090d X-Payment: hashcash 1.1 0:030224:jseng@pobox.org.sg:60ec4b1877c6110a X-Hashcash: 0:030224:jseng@pobox.org.sg:60ec4b1877c6110a X-Payment: hashcash 1.1 0:030224:ietf-imaa@imc.org:dd6cab4d7a87b442 X-Hashcash: 0:030224:ietf-imaa@imc.org:dd6cab4d7a87b442 X-Payment: hashcash 1.1 0:030224:duerst@w3.org:f7f8668248f1793b X-Hashcash: 0:030224:duerst@w3.org:f7f8668248f1793b From: Simon Josefsson Date: Mon, 24 Feb 2003 19:53:25 +0100 In-Reply-To: (Paul Hoffman / IMC's message of "Mon, 24 Feb 2003 07:59:54 -0800") Message-ID: User-Agent: Gnus/5.090016 (Oort Gnus v0.16) Emacs/21.3.50 References: <4.2.0.58.J.20030215182950.03351550@localhost> <93041416.1045327595@p3.JCK.COM> <1045331284.28770.TMDA@moriarty.gnomon.org.uk> <1045337541.29302.TMDA@moriarty.gnomon.org.uk> <4.2.0.58.J.20030216101703.05134de8@localhost> <4.2.0.58.J.20030219091733.054f4718@localhost> 
<15955.58113.661260.889533@moriarty.gnomon.org.uk> <006601c2d891$a29fbbb0$3800a8c0@jeffreyibm> <4.2.0.58.J.20030220132811.056277a0@localhost> <036f01c2d922$d1b59910$0f01a8c0@neteka.inc> <097e01c2dbc0$0f87ff90$b03080dc@JSENGTOSHIBA> <025d01c2dbd3$2aa36950$fb5016d3@neteka.inc> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Spam-Status: No, hits=-32.4 required=5.0 tests=EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,REFERENCES, REPLY_WITH_QUOTES,USER_AGENT_GNUS_UA autolearn=ham version=2.50 X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Paul Hoffman / IMC writes: > At 2:05 AM -0500 2/24/03, Edmon Chung wrote: >>http://www.ietf.org/internet-drafts/draft-chung-imax-00.txt > > This doesn't address any of the issues that I raised with John's > proposal of IMAA-UTF8, and in fact brings in many more horrible > problems like bad charset mappings and forcing a client fallback. That is a problem? Not supporting anything else than UTF-8 is a problem with IDNA and IMAA in the real world. The bad charset mappings exist with IDNA/IMAA too, only that they are disguised by an assumption in the specifications. As long as not every machine on the Internet uses UTF-8 you must handle conversion and fallback at some point. I'd advocate solutions that tries to face and solve that problem, instead of hiding the problem by assuming the real world only uses Unicode. (I'm not saying the proposed fallback mechanism is the perfect one though, I'm sure it can be improved.) > What is the actual deployment advantage of a ESMTP extension over just > plain IMAA-ACE? It makes SMTP support non-ASCII for email addresses. Compared to using IMAA-ACE, the advantage is that an ESMTP extension doesn't require the use of a punycode encoder/decoder. What would the actual deployment advantage of using IMAA-ACE over an ESMTP extension with an ACE fallback be? 
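For readers unfamiliar with what a punycode encoder/decoder involves, here is a minimal Python sketch of the round trip. It relies on the `punycode` codec in Python's standard library. The helper names `to_ace` and `from_ace` and the sample string are illustrative assumptions, and the sketch deliberately leaves out the Stringprep step and the ACE prefix that an IMAA implementation would also need.

```python
# Round-trip a non-ASCII string through Punycode.
# Assumption: helper names and the sample string are illustrative only;
# no Stringprep normalization and no ACE prefix are applied here.

def to_ace(label: str) -> str:
    """Encode a Unicode string into its ASCII-only Punycode form."""
    return label.encode("punycode").decode("ascii")


def from_ace(encoded: str) -> str:
    """Decode a Punycode string back into the original Unicode string."""
    return encoded.encode("ascii").decode("punycode")


sample = "bücher"
encoded = to_ace(sample)
print(encoded)            # ASCII-only form, safe for a 7-bit transport
print(from_ace(encoded))  # original string restored
```

The encoder is what a program needs when it accepts readable input and must emit the 7-bit form; the decoder is what it needs when it receives the 7-bit form and wants to show or compare the readable one.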
> If the MUA or the MTA that is about to write to the message store needs to be able to act differently based on different input, which is true in all the proposals so far, wouldn't a solution that has zero effect on SMTP be better than one that requires a complete infrastructure upgrade (IMAA-UTF8) or one that has an optional infrastructure upgrade and requires an IMAA-ACE "fallback"?

If you want to deploy as fast as possible and only require changes in transport end points, yes. If you want to upgrade to a clean and easily maintained design in the long term, no.

An optional infrastructure upgrade and ACE fallback sounds good to me. Then in 20 years' time, when all machines use Unicode, we can relax the ACE fallback into a MAY and eventually get rid of it.

From owner-ietf-imaa Mon Feb 24 11:14:44 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1OJEi224006 for ietf-imaa-bks; Mon, 24 Feb 2003 11:14:44 -0800 (PST) Received: from [63.202.92.157] (adsl-63-202-92-157.dsl.snfc21.pacbell.net [63.202.92.157]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1OJEhd24001 for ; Mon, 24 Feb 2003 11:14:43 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: In-Reply-To: References: <4.2.0.58.J.20030215182950.03351550@localhost> <93041416.1045327595@p3.JCK.COM> <1045331284.28770.TMDA@moriarty.gnomon.org.uk> <1045337541.29302.TMDA@moriarty.gnomon.org.uk> <4.2.0.58.J.20030216101703.05134de8@localhost> <4.2.0.58.J.20030219091733.054f4718@localhost> <15955.58113.661260.889533@moriarty.gnomon.org.uk> <006601c2d891$a29fbbb0$3800a8c0@jeffreyibm> <4.2.0.58.J.20030220132811.056277a0@localhost> <036f01c2d922$d1b59910$0f01a8c0@neteka.inc> <097e01c2dbc0$0f87ff90$b03080dc@JSENGTOSHIBA> <025d01c2dbd3$2aa36950$fb5016d3@neteka.inc> X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE)
(tm). The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . Date: Mon, 24 Feb 2003 11:14:15 -0800 To: ietf-imaa@imc.org From: Paul Hoffman / IMC Subject: Re: Can we back up a bit and ask some basic questions?An alternate model Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID:

At 7:53 PM +0100 2/24/03, Simon Josefsson wrote:
>> What is the actual deployment advantage of an ESMTP extension over just plain IMAA-ACE?
>
>It makes SMTP support non-ASCII for email addresses. Compared to using IMAA-ACE, the advantage is that an ESMTP extension doesn't require the use of a punycode encoder/decoder.

Are you saying that using a punycode decoder when writing to a message store is *harder* than doing an ESMTP extension that might involve bouncing or dropping mail? That seems kind of extreme, given that the punycode decoding is completely optional. And I don't understand why you talk about a punycode encoder; that is never needed by the SMTP server in IMAA-ACE.

>What would the actual deployment advantage of using IMAA-ACE over an ESMTP extension with an ACE fallback be?

That there would be no required change to the deployed base of SMTP servers out there, some of which are in hardware and cannot be upgraded. Internet mail is already deployed; forcing a change when it isn't needed is just plain bad design. Localizing the protocol change to one place makes it easier to deploy and makes it more predictable for end users. Do I need to go on?

>If you want to deploy as fast as possible and only require changes in transport end points, yes. If you want to upgrade to a clean and easily maintained design in the long term, no.

In what way is using a new ESMTP extension more "easily maintained"? That certainly is not the experience in the SMTP world so far. Clean is in the eye of the beholder. You and I like UTF-8, but many people don't. Forcing them to use our preferred charset isn't a good practice if it can be avoided. --Paul Hoffman, Director --Internet Mail Consortium From owner-ietf-imaa Mon Feb 24 11:53:10 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1OJrAF25382 for ietf-imaa-bks; Mon, 24 Feb 2003 11:53:10 -0800 (PST) Received: from yxa.extundo.com (178.230.13.217.in-addr.dgcsystems.net [217.13.230.178]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1OJr7d25375; Mon, 24 Feb 2003 11:53:08 -0800 (PST) Received: from latte.josefsson.org (yxa.extundo.com [217.13.230.178]) (authenticated bits=0) by yxa.extundo.com (8.12.7/8.12.7) with ESMTP id h1OJr7Xf029968 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=OK); Mon, 24 Feb 2003 20:53:08 +0100 To: Paul Hoffman / IMC Cc: ietf-imaa@imc.org Subject: Re: Can we back up a bit and ask some basic questions?An alternate model X-Payment: hashcash 1.1 0:030224:phoffman@imc.org:ed2824e1155d3245 X-Hashcash: 0:030224:phoffman@imc.org:ed2824e1155d3245 X-Payment: hashcash 1.1 0:030224:ietf-imaa@imc.org:37e291d28c166de8 X-Hashcash: 0:030224:ietf-imaa@imc.org:37e291d28c166de8 From: Simon Josefsson Date: Mon, 24 Feb 2003 20:53:07 +0100 In-Reply-To: (Paul Hoffman / IMC's message of "Mon, 24 Feb 2003 11:14:15 -0800") Message-ID: User-Agent: Gnus/5.090016 (Oort Gnus v0.16) Emacs/21.3.50 References: <4.2.0.58.J.20030215182950.03351550@localhost> <93041416.1045327595@p3.JCK.COM> <1045331284.28770.TMDA@moriarty.gnomon.org.uk> <1045337541.29302.TMDA@moriarty.gnomon.org.uk> <4.2.0.58.J.20030216101703.05134de8@localhost> <4.2.0.58.J.20030219091733.054f4718@localhost> <15955.58113.661260.889533@moriarty.gnomon.org.uk> <006601c2d891$a29fbbb0$3800a8c0@jeffreyibm> <4.2.0.58.J.20030220132811.056277a0@localhost> 
<036f01c2d922$d1b59910$0f01a8c0@neteka.inc> <097e01c2dbc0$0f87ff90$b03080dc@JSENGTOSHIBA> <025d01c2dbd3$2aa36950$fb5016d3@neteka.inc> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Spam-Status: No, hits=-29.8 required=5.0 tests=EMAIL_ATTRIBUTION,IN_REP_TO,NEW_DOMAIN_EXTENSIONS, QUOTED_EMAIL_TEXT,REFERENCES,REPLY_WITH_QUOTES, USER_AGENT_GNUS_UA autolearn=ham version=2.50 X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID:

Paul Hoffman / IMC writes:

> At 7:53 PM +0100 2/24/03, Simon Josefsson wrote:
>>> What is the actual deployment advantage of an ESMTP extension over just plain IMAA-ACE?
>>
>>It makes SMTP support non-ASCII for email addresses. Compared to using IMAA-ACE, the advantage is that an ESMTP extension doesn't require the use of a punycode encoder/decoder.
>
> Are you saying that using a punycode decoder when writing to a message store is *harder* than doing an ESMTP extension that might involve bouncing or dropping mail? That seems kind of extreme, given that the punycode decoding is completely optional. And I don't understand why you talk about a punycode encoder; that is never needed by the SMTP server in IMAA-ACE.

I'm saying that when implementing an MTA it is easier if I don't have to implement punycode in order to support non-ASCII.

How would an ESMTP extension with an ACE fallback (i.e., IMAX) involve bouncing or dropping mail?

Punycode decoding is not optional if the MTA wants to support non-ASCII. If the MTA doesn't want to support non-ASCII (for logging, for aliases, for routing, etc.), none of this is relevant anyway, and the MTA can continue to live in the old 7-bit world and no one would notice or care.

A punycode encoder is required if the MTA handles non-ASCII data in its decoded, normal format, as in the user interface for /etc/aliases, /etc/mail/virtusertable, etc.
If it doesn't handle non-ASCII in normal format, it might as well not support non-ASCII at all, since the user would never notice the difference. In theory, I agree that a (probably) compliant MTA could be developed that didn't include a punycode encoder, but it would be limited.

>> What would the actual deployment advantage of using IMAA-ACE over an ESMTP extension with an ACE fallback be?
>
> That there would be no required change to the deployed base of SMTP servers out there, some of which are in hardware and cannot be upgraded. Internet mail is already deployed; forcing a change when it isn't needed is just plain bad design. Localizing the protocol change to one place makes it easier to deploy and makes it more predictable for end users.
>
> Do I need to go on?

Yes, please. Why would an ESMTP extension with an ACE fallback (e.g., IMAX) require any changes to the deployed base of SMTP servers?

>>If you want to deploy as fast as possible and only require changes in transport end points, yes. If you want to upgrade to a clean and easily maintained design in the long term, no.
>
> In what way is using a new ESMTP extension more "easily maintained"? That certainly is not the experience in the SMTP world so far.

It appears easier to implement an MTA that handles non-ASCII data than to implement an MTA that handles non-ASCII data AND punycode decoding/encoding of that data.

> Clean is in the eye of the beholder. You and I like UTF-8, but many people don't. Forcing them to use our preferred charset isn't a good practice if it can be avoided.

I agree completely. This is one of my problems with IDNA and IMAA: they force Unicode on everyone.
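The point about /etc/aliases above can be sketched in Python. This is only an illustration under stated assumptions: the alias table contents, the names `ALIASES` and `lookup_recipient`, and the idea that the envelope carries a bare Punycode string (with no ACE prefix and no Stringprep step) are all hypothetical simplifications, not anything taken from the IMAA or IMAX drafts.

```python
# Hypothetical sketch: the alias table is kept in readable Unicode form,
# while the SMTP envelope carries an ASCII-only Punycode form. Table
# contents and names are assumptions made for illustration.

ALIASES = {"bücher": "librarian"}  # as an administrator might type it


def lookup_recipient(ace_local_part: str):
    """Decode the envelope form, then match it against the readable table."""
    try:
        readable = ace_local_part.encode("ascii").decode("punycode")
    except UnicodeError:
        return None  # not ASCII, or not valid Punycode: treat as no match
    return ALIASES.get(readable)


print(lookup_recipient("bcher-kva"))
```

The alternative design stores the table in the encoded form, which removes the decoder from the lookup path but moves an encoder into whatever tool the administrator uses to edit the table.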
From owner-ietf-imaa Mon Feb 24 15:24:58 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1ONOww04984 for ietf-imaa-bks; Mon, 24 Feb 2003 15:24:58 -0800 (PST) Received: from [63.202.92.157] (adsl-63-202-92-157.dsl.snfc21.pacbell.net [63.202.92.157]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1ONOud04980 for ; Mon, 24 Feb 2003 15:24:56 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: In-Reply-To: References: <4.2.0.58.J.20030215182950.03351550@localhost> <93041416.1045327595@p3.JCK.COM> <1045331284.28770.TMDA@moriarty.gnomon.org.uk> <1045337541.29302.TMDA@moriarty.gnomon.org.uk> <4.2.0.58.J.20030216101703.05134de8@localhost> <4.2.0.58.J.20030219091733.054f4718@localhost> <15955.58113.661260.889533@moriarty.gnomon.org.uk> <006601c2d891$a29fbbb0$3800a8c0@jeffreyibm> <4.2.0.58.J.20030220132811.056277a0@localhost> <036f01c2d922$d1b59910$0f01a8c0@neteka.inc> <097e01c2dbc0$0f87ff90$b03080dc@JSENGTOSHIBA> <025d01c2dbd3$2aa36950$fb5016d3@neteka.inc> X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . Date: Mon, 24 Feb 2003 15:24:57 -0800 To: ietf-imaa@imc.org From: Paul Hoffman / IMC Subject: Problems of Internationalized Mail Address eXtensions (IMAX) Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 8:53 PM +0100 2/24/03, Simon Josefsson wrote: >I'm saying that when implementing a MTA it is easier if I don't have >to implement punycode in order to support non-ASCII. And you don't have to. 
IMAA-ACE works with no changes to the MTA. IMAX forces changes, including changing the maximum line lengths for the MAIL FROM and RCPT TO commands. That's pretty non-trivial.

>How would an ESMTP extension with an ACE fallback (i.e., IMAX) involve bouncing or dropping mail?

The second paragraph of section 2.3 sure sounds like it would bounce things instead of doing an ACE fallback.

>Punycode decoding is not optional if the MTA wants to support non-ASCII.

Where in the IMAA document does it say that? I believe you are completely wrong here.

>A punycode encoder is required if the MTA handles non-ASCII data in its decoded, normal format, as in the user interface for /etc/aliases, /etc/mail/virtusertable, etc.

Neither of those is controlled by the MTA. This is getting pretty silly.

> If it doesn't handle non-ASCII in normal format, it might as well not support non-ASCII at all, since the user would never notice the difference.

You are mixing up the MTA and the MUA.

> In theory, I agree that a (probably) compliant MTA could be developed that didn't include a punycode encoder, but it would be limited.

You have mixed up compliance with marketability.

>> In what way is using a new ESMTP extension more "easily maintained"? That certainly is not the experience in the SMTP world so far.
>
>It appears easier to implement an MTA that handles non-ASCII data than to implement an MTA that handles non-ASCII data AND punycode decoding/encoding of that data.

But you keep talking about the need to handle fallback. Handling two protocols is not easier than handling one in any universe.

>> Clean is in the eye of the beholder. You and I like UTF-8, but many people don't. Forcing them to use our preferred charset isn't a good practice if it can be avoided.
>
>I agree completely. This is one of my problems with IDNA and IMAA: they force Unicode on everyone.

Unicode is not a charset.
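The distinction can be illustrated with a short Python sketch: the same sequence of Unicode code points serializes to different bytes under different charsets, so it is the charset, not Unicode itself, that determines what goes on the wire. The sample string and the particular charsets chosen are assumptions made for illustration.

```python
# One Unicode string, several charsets: the code points stay the same,
# the byte sequences differ. Sample text and charset list are illustrative.
text = "é"  # a single code point, U+00E9

code_points = [f"U+{ord(ch):04X}" for ch in text]
encodings = {
    name: text.encode(name)
    for name in ("utf-8", "utf-16-be", "iso-8859-1")
}

print(code_points)
for name, raw in encodings.items():
    print(name, raw.hex())
```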
--Paul Hoffman, Director --Internet Mail Consortium From owner-ietf-imaa Mon Feb 24 16:03:30 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1P03Ud06411 for ietf-imaa-bks; Mon, 24 Feb 2003 16:03:30 -0800 (PST) Received: from neteka.com (www.namesbeyond.com [216.220.34.103]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1P03Td06406 for ; Mon, 24 Feb 2003 16:03:29 -0800 (PST) Message-ID: <03b001c2dc61$4edadd60$fb5016d3@neteka.inc> From: "Edmon Chung" To: , "Paul Hoffman / IMC" References: <4.2.0.58.J.20030215182950.03351550@localhost> <93041416.1045327595@p3.JCK.COM> <1045331284.28770.TMDA@moriarty.gnomon.org.uk> <1045337541.29302.TMDA@moriarty.gnomon.org.uk> <4.2.0.58.J.20030216101703.05134de8@localhost> <4.2.0.58.J.20030219091733.054f4718@localhost> <15955.58113.661260.889533@moriarty.gnomon.org.uk> <006601c2d891$a29fbbb0$3800a8c0@jeffreyibm> <4.2.0.58.J.20030220132811.056277a0@localhost> <036f01c2d922$d1b59910$0f01a8c0@neteka.inc> <097e01c2dbc0$0f87ff90$b03080dc@JSENGTOSHIBA> <025d01c2dbd3$2aa36950$fb5016d3@neteka.inc> Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) Date: Mon, 24 Feb 2003 19:03:11 -0500 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2600.0000 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2600.0000 Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Hi Paul, ----- Original Message ----- From: "Paul Hoffman / IMC" > And you don't have to. IMAA-ACE works with no changes to the MTA. > IMAX forces changes, including changing the maximum line lengths for > the MAIL FROM and RCPT TO commands. That's pretty non-trivial. if you think that changing the max line lengths is to big, I will take it out. Just thought that it would also be a good time to upgrade that part, especially due to the use of Punycode infact! 
;-) It really isn't a "MUST".

> >How would an ESMTP extension with an ACE fallback (i.e., IMAX) involve bouncing or dropping mail?
>
> The second paragraph of section 2.3 sure sounds like it would bounce things instead of doing an ACE fallback.

This describes the situation today! That is, an IDN/IMA-unaware client tries to send out a message to/from a multilingual address. It has nothing to do with the IMAX architecture. I just hoped to make this reality clear to people rather than shy away from it. If you think it is actually more confusing, I will take away the description.

> >Punycode decoding is not optional if the MTA wants to support non-ASCII.
>
> Where in the IMAA document does it say that? I believe you are completely wrong here.

Yes, it does say that support for ACE (to be updated to Punycode in -01 or when the RFC is out; as I said, I wrote this 2 years ago...) is mandatory. Both ACE and UTF8 are mandatory. Please refer to the last paragraph of section 2.2.

Edmon

From owner-ietf-imaa Mon Feb 24 16:41:19 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1P0fJd08100 for ietf-imaa-bks; Mon, 24 Feb 2003 16:41:19 -0800 (PST) Received: from [63.202.92.157] (adsl-63-202-92-157.dsl.snfc21.pacbell.net [63.202.92.157]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1P0fHd08095 for ; Mon, 24 Feb 2003 16:41:17 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: In-Reply-To: <03b001c2dc61$4edadd60$fb5016d3@neteka.inc> References: <4.2.0.58.J.20030215182950.03351550@localhost> <93041416.1045327595@p3.JCK.COM> <1045331284.28770.TMDA@moriarty.gnomon.org.uk> <1045337541.29302.TMDA@moriarty.gnomon.org.uk> <4.2.0.58.J.20030216101703.05134de8@localhost> <4.2.0.58.J.20030219091733.054f4718@localhost> <15955.58113.661260.889533@moriarty.gnomon.org.uk> <006601c2d891$a29fbbb0$3800a8c0@jeffreyibm> <4.2.0.58.J.20030220132811.056277a0@localhost> <036f01c2d922$d1b59910$0f01a8c0@neteka.inc>
<097e01c2dbc0$0f87ff90$b03080dc@JSENGTOSHIBA> <025d01c2dbd3$2aa36950$fb5016d3@neteka.inc> <03b001c2dc61$4edadd60$fb5016d3@neteka.inc> X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . Date: Mon, 24 Feb 2003 16:33:22 -0800 To: ietf-imaa@imc.org From: Paul Hoffman / IMC Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 7:03 PM -0500 2/24/03, Edmon Chung wrote: >if you think that changing the max line lengths is to big, I will take it >out. Gratuitous changes to deployed standards are never appreciated. > > The second paragraph of section 2.3 sure sounds like it would bounce > > things instead of doing an ACE fallback. > >This describes the situation today! No, it doesn't. The situation today is that sending non-ASCII in SMTP is forbidden. > > Where in the IMAA document does it say that? I believe you are >> completely wrong here. > >Yes it does say that support for ACE (to be updated to Punycode in -01 or >when the RFC is out, as I said I wrote this 2 years ago...) is mandatory. >ACE and UTF8 is mandatory. Please refer to last paragraph of section 2.2. I was asking about IMAA, not IMAX. Simon claims that Punycode is required for IMAA, and I asked where in IMAA it says that. 
--Paul Hoffman, Director --Internet Mail Consortium From owner-ietf-imaa Mon Feb 24 17:06:54 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1P16sn09059 for ietf-imaa-bks; Mon, 24 Feb 2003 17:06:54 -0800 (PST) Received: from neteka.com (www.namesbeyond.com [216.220.34.103]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1P16rd09055 for ; Mon, 24 Feb 2003 17:06:53 -0800 (PST) Message-ID: <03be01c2dc6a$2aa9cb50$fb5016d3@neteka.inc> From: "Edmon Chung" To: References: <4.2.0.58.J.20030215182950.03351550@localhost> <93041416.1045327595@p3.JCK.COM> <1045331284.28770.TMDA@moriarty.gnomon.org.uk> <1045337541.29302.TMDA@moriarty.gnomon.org.uk> <4.2.0.58.J.20030216101703.05134de8@localhost> <4.2.0.58.J.20030219091733.054f4718@localhost> <15955.58113.661260.889533@moriarty.gnomon.org.uk> <006601c2d891$a29fbbb0$3800a8c0@jeffreyibm> <4.2.0.58.J.20030220132811.056277a0@localhost> <036f01c2d922$d1b59910$0f01a8c0@neteka.inc> <097e01c2dbc0$0f87ff90$b03080dc@JSENGTOSHIBA> <025d01c2dbd3$2aa36950$fb5016d3@neteka.inc> <03b001c2dc61$4edadd60$fb5016d3@neteka.inc> Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) Date: Mon, 24 Feb 2003 20:06:36 -0500 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2600.0000 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2600.0000 Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: ----- Original Message ----- From: "Paul Hoffman / IMC" > Gratuitous changes to deployed standards are never appreciated. ok. > > > The second paragraph of section 2.3 sure sounds like it would bounce > > > things instead of doing an ACE fallback. > > > >This describes the situation today! > > No, it doesn't. The situation today is that sending non-ASCII in SMTP > is forbidden. 
I think that the SMTP specification states that a server "MAY" / "SHOULD" reject instead of "MUST". It also specifies that the local part MUST be interpreted by the host. Anyway, I understand where you are coming from, so I would end the sentence here:

If no charset is specified, the server SHOULD assume that the client is not IMAX compliant.

Would this be ok?

Edmon

From owner-ietf-imaa Mon Feb 24 17:59:35 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1P1xZN10259 for ietf-imaa-bks; Mon, 24 Feb 2003 17:59:35 -0800 (PST) Received: from [63.202.92.157] (adsl-63-202-92-157.dsl.snfc21.pacbell.net [63.202.92.157]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1P1xXd10251 for ; Mon, 24 Feb 2003 17:59:33 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: In-Reply-To: <03be01c2dc6a$2aa9cb50$fb5016d3@neteka.inc> References: <4.2.0.58.J.20030215182950.03351550@localhost> <93041416.1045327595@p3.JCK.COM> <1045331284.28770.TMDA@moriarty.gnomon.org.uk> <1045337541.29302.TMDA@moriarty.gnomon.org.uk> <4.2.0.58.J.20030216101703.05134de8@localhost> <4.2.0.58.J.20030219091733.054f4718@localhost> <15955.58113.661260.889533@moriarty.gnomon.org.uk> <006601c2d891$a29fbbb0$3800a8c0@jeffreyibm> <4.2.0.58.J.20030220132811.056277a0@localhost> <036f01c2d922$d1b59910$0f01a8c0@neteka.inc> <097e01c2dbc0$0f87ff90$b03080dc@JSENGTOSHIBA> <025d01c2dbd3$2aa36950$fb5016d3@neteka.inc> <03b001c2dc61$4edadd60$fb5016d3@neteka.inc> <03be01c2dc6a$2aa9cb50$fb5016d3@neteka.inc> X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to .
Date: Mon, 24 Feb 2003 17:19:27 -0800 To: ietf-imaa@imc.org From: Paul Hoffman / IMC Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 8:06 PM -0500 2/24/03, Edmon Chung wrote: >If no charset is specified, the server SHOULD assume that the client is not >IMAX compliant. > >Would this be ok? Wouldn't that prevent other ESMTP extensions from using the CHARSET option? Are there other ESMTP extensions that make this kind of assumption? --Paul Hoffman, Director --Internet Mail Consortium From owner-ietf-imaa Mon Feb 24 18:43:48 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1P2hmL11276 for ietf-imaa-bks; Mon, 24 Feb 2003 18:43:48 -0800 (PST) Received: from stoneport.math.uic.edu (stoneport.math.uic.edu [131.193.178.160]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1P2hld11272 for ; Mon, 24 Feb 2003 18:43:47 -0800 (PST) Received: (qmail 98713 invoked by uid 1016); 25 Feb 2003 02:44:17 -0000 Date: 25 Feb 2003 02:44:17 -0000 Message-ID: <20030225024417.98712.qmail@cr.yp.to> Automatic-Legal-Notices: See http://cr.yp.to/mailcopyright.html. From: "D. J. Bernstein" To: ietf-imaa@imc.org Subject: Re: Can we back up a bit and ask some basic questions?An alternate model References: <006601c2d891$a29fbbb0$3800a8c0@jeffreyibm> <4.2.0.58.J.20030220132811.056277a0@localhost> <036f01c2d922$d1b59910$0f01a8c0@neteka.inc> <097e01c2dbc0$0f87ff90$b03080dc@JSENGTOSHIBA> <025d01c2dbd3$2aa36950$fb5016d3@neteka.inc> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Simon Josefsson writes: > Like in the user interface for /etc/aliases, /etc/mail/virtusertable etc. Right. 
Addresses are stored in many different locations, possibly with many different encodings. When a program copies an address from one location to another location with a different encoding, or compares addresses in two locations with different encodings, it has to convert between the encodings. Two examples: * An SMTP server compares an SMTP RCPT to a configuration file specifying acceptable RCPTs. If the address is encoded as GoofyCode in SMTP, but as UTF-8 in the file, then the SMTP server has to convert between UTF-8 and GoofyCode. * A system administrator uses the ``more'' program to feed the configuration file to ``xterm,'' which displays it on the screen. If the address is actually encoded as GoofyCode in the file, but UTF-8 in the xterm input, then the ``more'' program has to convert between GoofyCode and UTF-8. Any failure to do these conversions---or confusion over which encoding is used for a particular location---produces failures for the users. The cost of programming and deploying all these conversions is _the_ fundamental obstacle to moving beyond ASCII. Novice programmers might not be aware of how many different locations and conversions we're talking about here, so I've given a partial list of locations at the end of this message. The big advantage of UTF-8 is that it can and will be used everywhere, eliminating all of these conversions. Non-universal encodings such as GoofyCode don't have the same benefit; GoofyCode can't and won't be used as the xterm input format, for example. ---D. J. 
Bernstein, Associate Professor, Department of Mathematics, Statistics, and Computer Science, University of Illinois at Chicago

$DEFAULT in MDAs: partial mailbox name
$EXT in MDAs: mailbox name
$HOST in MDAs: domain name
$LOCALDOMAIN in DNS clients: domain name
$SENDER in MDAs: domain name
$SENDER in MDAs: mailbox name
.fetchmailrc, various locations: domain name
.fetchmailrc, various locations: login name (often includes mailbox name)
.qmail*: file names (partial mailbox names)
.ssh/known_hosts: domain name
/etc/aliases, various locations: domain name
/etc/aliases, various locations: mailbox name
/etc/hosts second column: domain name
/etc/hosts.allow, various locations: domain name
/etc/hosts.allow, various locations: port-113 name (often mailbox name)
/etc/namedb/named.conf, various locations: domain name
/etc/resolv.conf, search line: domain name
/etc/virtusertable, various locations: mailbox name
/public/file: domain name
/service/dnscache/root/servers: domain name
/service/tinydns/root/data, SOA hostmaster address: mailbox name
/service/tinydns/root/data, various locations: domain name
BIND log files, various locations: domain name
DNS packet, SOA hostmaster address: domain name
DNS packet, SOA hostmaster address: mailbox name (attached to domain name)
DNS packet: query domain name
DNS packet: record domain name
DNS registration form: domain name
HTTP Host field: domain name
IMAP messages, To and Cc and so on: domain name
IMAP messages, To and Cc and so on: mailbox name
POP USER commands: login name (often includes mailbox name and domain name)
POP messages, To and Cc and so on: domain name
POP messages, To and Cc and so on: mailbox name
SMTP HELO commands: domain name
SMTP MAIL and RCPT commands: domain name
SMTP MAIL and RCPT commands: mailbox name
SMTP messages, To and Cc and so on: domain name
SMTP messages, To and Cc and so on: mailbox name
add-host command line: domain name
add-mx command line: domain name
add-ns command line: domain name
dig command line: domain name
dig output, in SOA hostmaster address: mailbox name
dig output: domain name
dnscache log files, various locations: domain name
ezmlm subscription UI: mailbox name (includes domain name)
gethostbyname(), first argument: domain name
h_name and h_aliases: domain name
host command line: domain name
host output, in SOA hostmaster address: mailbox name
host output: domain name
http URLs: domain name
httpd.conf, various locations: domain name
lynx.cfg, various locations: domain name
mail.local command line: mailbox name
mailq output, various locations: domain name
mailq output, various locations: mailbox name
mailto URLs: domain name
mailto URLs: mailbox name
mutt command line: domain name
mutt command line: mailbox name
named zone files, SOA hostmaster address: mailbox name
named zone files, various locations: domain name
named zone files: file name (often includes domain name)
ndc command line: domain name
nsupdate command line: domain name
pine command line: domain name
pine command line: mailbox name
praliases output, various locations: domain name
praliases output, various locations: mailbox name
qmail-inject command line: domain name
qmail-inject command line: mailbox name
qmail-queue envelope input: domain name
qmail-queue envelope input: mailbox name
sendmail command line: domain name
sendmail command line: mailbox name
ssh command line: domain name
telnet command line: domain name
tinydns log files, various locations: domain name

From owner-ietf-imaa Mon Feb 24 18:57:16 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1P2vGN11611 for ietf-imaa-bks; Mon, 24 Feb 2003 18:57:16 -0800 (PST) Received: from stoneport.math.uic.edu (stoneport.math.uic.edu [131.193.178.160]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1P2vFd11607 for ; Mon, 24 Feb 2003 18:57:15 -0800 (PST) Received: (qmail 2065 invoked by uid 1016); 25 Feb 2003 02:57:46 -0000 Date: 25 Feb 2003 02:57:46 -0000 Message-ID:
<20030225025746.2064.qmail@cr.yp.to> Automatic-Legal-Notices: See http://cr.yp.to/mailcopyright.html. From: "D. J. Bernstein" To: ietf-imaa@imc.org Subject: Re: Can we back up a bit and ask some basic questions?An alternate model References: <006601c2d891$a29fbbb0$3800a8c0@jeffreyibm> <4.2.0.58.J.20030220132811.056277a0@localhost> <036f01c2d922$d1b59910$0f01a8c0@neteka.inc> <097e01c2dbc0$0f87ff90$b03080dc@JSENGTOSHIBA> <025d01c2dbd3$2aa36950$fb5016d3@neteka.inc> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Paul Hoffman / IMC writes: > SMTP servers out there, some of which are in hardware and cannot be > upgraded. As usual, specifics would be helpful: which SMTP servers you're talking about, and what costs you're actually referring to when you say ``cannot be upgraded.'' In particular, if these SMTP servers have any trouble handling 8-bit data, I'd like to document that fact inside http://pi.cr.yp.to. > forcing a change when it isn't needed is just plain bad design The users are demanding a change. They're trying non-ASCII characters and screaming in anguish at the results. You can hear them, can't you? ---D. J. 
Bernstein, Associate Professor, Department of Mathematics, Statistics, and Computer Science, University of Illinois at Chicago From owner-ietf-imaa Tue Feb 25 04:50:35 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1PCoZm02951 for ietf-imaa-bks; Tue, 25 Feb 2003 04:50:35 -0800 (PST) Received: from yxa.extundo.com (178.230.13.217.in-addr.dgcsystems.net [217.13.230.178]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1PCoXd02943; Tue, 25 Feb 2003 04:50:33 -0800 (PST) Received: from latte.josefsson.org (yxa.extundo.com [217.13.230.178]) (authenticated bits=0) by yxa.extundo.com (8.12.7/8.12.7) with ESMTP id h1PCoOXf014332 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=OK); Tue, 25 Feb 2003 13:50:24 +0100 To: Paul Hoffman / IMC Cc: ietf-imaa@imc.org Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) X-Payment: hashcash 1.1 0:030225:phoffman@imc.org:083c3561d2b1bb76 X-Hashcash: 0:030225:phoffman@imc.org:083c3561d2b1bb76 X-Payment: hashcash 1.1 0:030225:ietf-imaa@imc.org:a31d8508b49f96f0 X-Hashcash: 0:030225:ietf-imaa@imc.org:a31d8508b49f96f0 From: Simon Josefsson Date: Tue, 25 Feb 2003 13:50:24 +0100 In-Reply-To: (Paul Hoffman / IMC's message of "Mon, 24 Feb 2003 15:24:57 -0800") Message-ID: User-Agent: Gnus/5.090016 (Oort Gnus v0.16) Emacs/21.3.50 References: <4.2.0.58.J.20030215182950.03351550@localhost> <93041416.1045327595@p3.JCK.COM> <1045331284.28770.TMDA@moriarty.gnomon.org.uk> <1045337541.29302.TMDA@moriarty.gnomon.org.uk> <4.2.0.58.J.20030216101703.05134de8@localhost> <4.2.0.58.J.20030219091733.054f4718@localhost> <15955.58113.661260.889533@moriarty.gnomon.org.uk> <006601c2d891$a29fbbb0$3800a8c0@jeffreyibm> <4.2.0.58.J.20030220132811.056277a0@localhost> <036f01c2d922$d1b59910$0f01a8c0@neteka.inc> <097e01c2dbc0$0f87ff90$b03080dc@JSENGTOSHIBA> <025d01c2dbd3$2aa36950$fb5016d3@neteka.inc> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Spam-Status: No, hits=-29.8 
required=5.0 tests=EMAIL_ATTRIBUTION,IN_REP_TO,NEW_DOMAIN_EXTENSIONS, QUOTED_EMAIL_TEXT,REFERENCES,REPLY_WITH_QUOTES, USER_AGENT_GNUS_UA autolearn=ham version=2.50 X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Paul Hoffman / IMC writes: > At 8:53 PM +0100 2/24/03, Simon Josefsson wrote: >>I'm saying that when implementing a MTA it is easier if I don't have >>to implement punycode in order to support non-ASCII. > > And you don't have to. IMAA-ACE works with no changes to the MTA. IMAX > forces changes, including changing the maximum line lengths for the > MAIL FROM and RCPT TO commands. That's pretty non-trivial. Perhaps IMAX can be modified so it doesn't require those changes? >>How would an ESMTP extension with an ACE fallback (i.e., IMAX) involve >>bouncing or dropping mail? > > The second paragraph of section 2.3 sure sounds like it would bounce > things instead of doing an ACE fallback. I don't get that impression. It sounds to me that unless IMAX is used, the interpretation and handling of the mail addresses is out of scope of IMAX. Perhaps it would be good to clarify that section so whatever the intention was, it is made specific? >>Punycode decoding is not optional if the MTA wants to support >>non-ASCII. > > Where in the IMAA document does it say that? I believe you are > completely wrong here. Are you saying that if I implement a MTA and want to support non-ASCII mail addresses in the places where MTAs use ASCII mail addresses today, that MTA need not implement punycode decoding? If so, how would you translate an incoming punycoded string into non-ASCII data that is stored in the log file, for instance? If you are saying that the MTA should put the IMAA encoded mail address in the log file, I'd say then that MTA doesn't support non-ASCII. 
An essential feature of supporting non-ASCII is to make it possible for the user of the application to actually see the characters. ASCII encoding them and displaying them to the user doesn't make the application support non-ASCII in practice. It would be like claiming to support Unicode in a terminal emulator when it only displayed Base64 encoding of the UTF-8 encoded Unicode code points. >>A punycode encoder is required if the MTA handle non-ASCII data in >>decoded, normal, format. Like in the user interface for /etc/aliases, >>/etc/mail/virtusertable etc. > > Neither of those are controlled by the MTA. This is getting pretty silly. That was not a generic example, it was an example for one MTA implementation: Sendmail. It uses and control those files. >> If it doesn't handle non-ASCII in normal >>format, it might as well not support non-ASCII at all since the user >>would never notice the different. > > You are mixing up the MTA and the MUA. I wasn't clear. I meant the user of the MTA, i.e., the administrator. Administrators have non-ASCII requirements too. >> In theory, I agree that a >>(probably) compliant MTA could be developed that didn't include a >>punycode encoder, but it would be limited. > > You have mixed up compliance with marketability. Perhaps. I'd like to consider that as being open to what practical requirements exists before designing a solution. MTA implementations, nor internationalization solutions for MTAs, exist in a vacuum. If it is impossible to implement an internationalized product and being compliant, the specification has a problem. >> > In what way is using a new ESMTP extension more "easily maintained"? >>> That certainly is not the experience in the SMTP world so far. >> >>It appears easier to implement a MTA that handle non-ASCII data, than >>to implement a MTA that handle non-ASCII data AND punycode >>decoding/encoding of that data. > > But you keep talking about the need to handle fallback. 
Handling two > protocols is not easier than handling one in any universe. True. Yes, the fallback is a problem. Hm. Perhaps those interested in non-ASCII need to require the use of modern software at the receiver and the sender, then implementations doesn't need to implement the fall back case. >> > Clean is in the eye of the beholder. You and I like UTF-8, but many >> > people don't. Forcing them to use our preferred charset isn't a good >> > practice if it can be avoided. >> >>I agree completely. This is one of my problems with IDNA and IMAA, it >>forces Unicode on everyone. > > Unicode is not a charset. I'm not sure if you genuinely missed my point due to this misunderstanding, but assuming you did, let me correct myself: replace "Unicode" with "Any charset encoding format of Unicode". I'm sorry that I cannot express this in any clearer way, perhaps someone who manages to comprehend what I mean can formulate this in a more precise way so my point gets through. I note that IMAA talks about representing characters using the Unicode "character set". I use charset as a short-hand for character set, but apparently that must be wrong if IMAA and what you say is consistent. What distinct definitions of "character set" and "charset" do you use? 
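The "character set" versus "charset" question raised above can be made concrete with a small sketch. It is only an illustration: the sample string and the variable names are assumptions chosen for the example, not anything defined by IMAA or IMAX. It shows one sequence of Unicode code points (the coded character set view) turning into two different octet sequences depending on which charset, in the MIME sense of an encoding, is applied.

```python
# Illustration only: the sample string and variable names are assumptions,
# not taken from the IMAA or IMAX drafts.
text = "Fältström"

# Coded character set view: each abstract character maps to an integer.
code_points = [ord(ch) for ch in text]

# "charset" view (MIME sense): a rule for turning those integers into octets.
as_utf8 = text.encode("utf-8")
as_latin1 = text.encode("iso-8859-1")

# Same code points, different octets on the wire.
print(code_points)
print(as_utf8.hex(" "))
print(as_latin1.hex(" "))

# A receiver recovers the same characters only if it knows which charset
# was used for the octets it was handed.
assert as_utf8.decode("utf-8") == as_latin1.decode("iso-8859-1") == text
```

Under this reading, a protocol that fixes the repertoire to Unicode still leaves open which octet encoding travels on the wire, which appears to be the point both sides are circling.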
From owner-ietf-imaa Tue Feb 25 07:43:16 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1PFhGQ12123 for ietf-imaa-bks; Tue, 25 Feb 2003 07:43:16 -0800 (PST) Received: from neteka.com (www.namesbeyond.com [216.220.34.103]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1PFhFd12118 for ; Tue, 25 Feb 2003 07:43:15 -0800 (PST) Message-ID: <052601c2dce4$8c34cd00$fb5016d3@neteka.inc> From: "Edmon Chung" To: "Paul Hoffman / IMC" , "Simon Josefsson" Cc: References: <4.2.0.58.J.20030215182950.03351550@localhost><93041416.1045327595@p3.JCK.COM><1045331284.28770.TMDA@moriarty.gnomon.org.uk><1045337541.29302.TMDA@moriarty.gnomon.org.uk><4.2.0.58.J.20030216101703.05134de8@localhost><4.2.0.58.J.20030219091733.054f4718@localhost><15955.58113.661260.889533@moriarty.gnomon.org.uk><006601c2d891$a29fbbb0$3800a8c0@jeffreyibm><4.2.0.58.J.20030220132811.056277a0@localhost><036f01c2d922$d1b59910$0f01a8c0@neteka.inc><097e01c2dbc0$0f87ff90$b03080dc@JSENGTOSHIBA><025d01c2dbd3$2aa36950$fb5016d3@neteka.inc> Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) Date: Tue, 25 Feb 2003 10:42:10 -0500 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2600.0000 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2600.0000 Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: ----- Original Message ----- From: "Simon Josefsson" > Paul Hoffman / IMC writes: > > > At 8:53 PM +0100 2/24/03, Simon Josefsson wrote: > >>I'm saying that when implementing a MTA it is easier if I don't have > >>to implement punycode in order to support non-ASCII. > > > > And you don't have to. IMAA-ACE works with no changes to the MTA. IMAX > > forces changes, including changing the maximum line lengths for the > > MAIL FROM and RCPT TO commands. That's pretty non-trivial. 
> > Perhaps IMAX can be modified so it doesn't require those changes? Absolutely. I just thought it would be good to lengthen the fields... actually for punycode... It is not critical to IMAX, and I will take it out. > >>How would an ESMTP extension with an ACE fallback (i.e., IMAX) involve > >>bouncing or dropping mail? > > > > The second paragraph of section 2.3 sure sounds like it would bounce > > things instead of doing an ACE fallback. > > I don't get that impression. It sounds to me that unless IMAX is > used, the interpretation and handling of the mail addresses is out of > scope of IMAX. Perhaps it would be good to clarify that section so > whatever the intention was, it is made specific? I have changed 2.3 to reflect it. I will submit an updated draft based on our discussions so far. Edmon From owner-ietf-imaa Tue Feb 25 08:37:50 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1PGbo014695 for ietf-imaa-bks; Tue, 25 Feb 2003 08:37:50 -0800 (PST) Received: from [63.202.92.157] (adsl-63-202-92-157.dsl.snfc21.pacbell.net [63.202.92.157]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1PGbmd14691 for ; Tue, 25 Feb 2003 08:37:48 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: In-Reply-To: References: <4.2.0.58.J.20030215182950.03351550@localhost> <93041416.1045327595@p3.JCK.COM> <1045331284.28770.TMDA@moriarty.gnomon.org.uk> <1045337541.29302.TMDA@moriarty.gnomon.org.uk> <4.2.0.58.J.20030216101703.05134de8@localhost> <4.2.0.58.J.20030219091733.054f4718@localhost> <15955.58113.661260.889533@moriarty.gnomon.org.uk> <006601c2d891$a29fbbb0$3800a8c0@jeffreyibm> <4.2.0.58.J.20030220132811.056277a0@localhost> <036f01c2d922$d1b59910$0f01a8c0@neteka.inc> <097e01c2dbc0$0f87ff90$b03080dc@JSENGTOSHIBA> <025d01c2dbd3$2aa36950$fb5016d3@neteka.inc> X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: 
Sender Warranted Email (SWE) (tm). The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . Date: Tue, 25 Feb 2003 08:37:48 -0800 To: ietf-imaa@imc.org From: Paul Hoffman / IMC Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 1:50 PM +0100 2/25/03, Simon Josefsson wrote: > > Where in the IMAA document does it say that? I believe you are >> completely wrong here. > >Are you saying that if I implement a MTA and want to support non-ASCII >mail addresses in the places where MTAs use ASCII mail addresses >today, that MTA need not implement punycode decoding? You are (again) confusing the protocol with the implementation. The protocol does not require these things; the implementation might. >If so, how would you translate an incoming punycoded string into >non-ASCII data that is stored in the log file, for instance? MTA implementations that want to write into log files already need Punycode decoding for the host names. Your complaint here is invalid. >If you are saying that the MTA should put the IMAA encoded mail >address in the log file, I'd say then that MTA doesn't support >non-ASCII. You are free to say that. Others would disagree. In the case of IMAX, what would you want in your log file. All UTF-8? That means you need converters from every accepted charset to UTF-8. Careful sysadmins would probably want to know *exactly* what came in, not some converted form, but that means that their log file would have multiple charsets in it, which would make display a mess. 
A reasonable option is to store the addresses as ACE and to have a log-file viewer that converts on display (and has an option for not converting). Again, this is an implementation issue, not a protocol issue. > An essential feature of supporting non-ASCII is to make it >possible for the user of the application to actually see the >characters. ASCII encoding them and displaying them to the user >doesn't make the application support non-ASCII in practice. It would >be like claiming to support Unicode in a terminal emulator when it >only displayed Base64 encoding of the UTF-8 encoded Unicode code >points. IMAA describes in detail when and how to display the Unicode form to the user; IMAX mostly glosses over this. > >>A punycode encoder is required if the MTA handle non-ASCII data in >>>decoded, normal, format. Like in the user interface for /etc/aliases, >>>/etc/mail/virtusertable etc. >> >> Neither of those are controlled by the MTA. This is getting pretty silly. > >That was not a generic example, it was an example for one MTA >implementation: Sendmail. It uses and control those files. And, again, you are mixing up protocols with implementations. > >> If it doesn't handle non-ASCII in normal >>>format, it might as well not support non-ASCII at all since the user >>>would never notice the different. >> >> You are mixing up the MTA and the MUA. > >I wasn't clear. I meant the user of the MTA, i.e., the administrator. >Administrators have non-ASCII requirements too. Correct, and IMAA describes when and how to convert for display. >MTA implementations, nor internationalization solutions for MTAs, >exist in a vacuum. If it is impossible to implement an >internationalized product and being compliant, the specification has a >problem. Of course. Nothing in IMAA makes it "impossible to implement an internationalized product". > > But you keep talking about the need to handle fallback. Handling two >> protocols is not easier than handling one in any universe. > >True. 
Yes, the fallback is a problem. Hm. Perhaps those interested >in non-ASCII need to require the use of modern software at the >receiver and the sender, then implementations doesn't need to >implement the fall back case. That's not what the IMAX document says. If you want to propose a ESMTP extension with no fallback, either change IMAX or create your own Internet Draft. In either case, you will have to say explicitly how this will interact with SMTP servers that do not support the new protocol, how bounces would be handled, how users would know if they could send a message, and so on. I think when you write that, if you do so honestly, you will see that it would be silly to propose such a solution. > >> > Clean is in the eye of the beholder. You and I like UTF-8, but many >>> > people don't. Forcing them to use our preferred charset isn't a good >>> > practice if it can be avoided. >>> > >>I agree completely. This is one of my problems with IDNA and IMAA, it >>>forces Unicode on everyone. >> >> Unicode is not a charset. > >I'm not sure if you genuinely missed my point due to this >misunderstanding, but assuming you did, let me correct myself: replace >"Unicode" with "Any charset encoding format of Unicode". I think I hear you saying that you think that the protocols should allow any repertoire and any encoding of those repertoires. If so, we certainly disagree. The IETF is not very keen on creating protocols for which there would be limited and unpredictable interoperability. Other standards group might not be so picky. 
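The log-file option mentioned earlier in this message (store addresses as ACE, convert on display, with a switch to turn conversion off) might look roughly like the sketch below. It rests on several assumptions: the function and pattern names are hypothetical, the log line format is invented for the example, and only domain labels with the IDNA "xn--" prefix are converted, using Python's standard "idna" codec. The IMAA encoding of local parts is not handled at all.

```python
import re

# Hypothetical helper names. Only the "idna" codec is a real stdlib feature;
# it implements IDNA for domain labels and knows nothing about IMAA local parts.
ACE_LABEL = re.compile(r"xn--[a-z0-9-]+", re.IGNORECASE)


def _to_unicode(match):
    """Decode one ACE label, falling back to the stored form on any error."""
    label = match.group(0)
    try:
        return label.encode("ascii").decode("idna")
    except UnicodeError:
        return label  # leave anything undecodable exactly as logged


def render_log_line(line, convert=True):
    """Return the stored line, optionally with ACE labels shown as Unicode."""
    return ACE_LABEL.sub(_to_unicode, line) if convert else line


# Assumed log format, for illustration only.
stored = "to=<user@xn--bcher-kva.example> status=sent"
print(render_log_line(stored))                  # converted for display
print(render_log_line(stored, convert=False))   # exactly what was stored
```

The stored file stays pure ASCII, so ordinary text tools still work on it, and the viewer is the only component that needs a decoder. Whether that trade-off is acceptable to administrators is the implementation question under dispute in this thread.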
--Paul Hoffman, Director --Internet Mail Consortium From owner-ietf-imaa Tue Feb 25 13:14:01 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1PLE1r29519 for ietf-imaa-bks; Tue, 25 Feb 2003 13:14:01 -0800 (PST) Received: from yxa.extundo.com (178.230.13.217.in-addr.dgcsystems.net [217.13.230.178]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1PLDwd29513; Tue, 25 Feb 2003 13:13:59 -0800 (PST) Received: from latte.josefsson.org (yxa.extundo.com [217.13.230.178]) (authenticated bits=0) by yxa.extundo.com (8.12.7/8.12.7) with ESMTP id h1PLDuXf022180 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=OK); Tue, 25 Feb 2003 22:13:57 +0100 To: Paul Hoffman / IMC Cc: ietf-imaa@imc.org Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) X-Payment: hashcash 1.1 0:030225:phoffman@imc.org:e9e11c085ec1ff2f X-Hashcash: 0:030225:phoffman@imc.org:e9e11c085ec1ff2f X-Payment: hashcash 1.1 0:030225:ietf-imaa@imc.org:815610c0f12621dc X-Hashcash: 0:030225:ietf-imaa@imc.org:815610c0f12621dc From: Simon Josefsson Date: Tue, 25 Feb 2003 22:13:56 +0100 In-Reply-To: (Paul Hoffman / IMC's message of "Tue, 25 Feb 2003 08:37:48 -0800") Message-ID: User-Agent: Gnus/5.090016 (Oort Gnus v0.16) Emacs/21.3.50 References: <4.2.0.58.J.20030215182950.03351550@localhost> <1045337541.29302.TMDA@moriarty.gnomon.org.uk> <4.2.0.58.J.20030216101703.05134de8@localhost> <4.2.0.58.J.20030219091733.054f4718@localhost> <15955.58113.661260.889533@moriarty.gnomon.org.uk> <006601c2d891$a29fbbb0$3800a8c0@jeffreyibm> <4.2.0.58.J.20030220132811.056277a0@localhost> <036f01c2d922$d1b59910$0f01a8c0@neteka.inc> <097e01c2dbc0$0f87ff90$b03080dc@JSENGTOSHIBA> <025d01c2dbd3$2aa36950$fb5016d3@neteka.inc> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Spam-Status: No, hits=-32.4 required=5.0 tests=EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,REFERENCES, REPLY_WITH_QUOTES,USER_AGENT_GNUS_UA autolearn=ham version=2.50 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Paul Hoffman / IMC writes: > At 1:50 PM +0100 2/25/03, Simon Josefsson wrote: >>>> Punycode decoding is not optional if the MTA wants to support >>>> non-ASCII. >>> Where in the IMAA document does it say that? I believe you are >>> completely wrong here. >> >>Are you saying that if I implement a MTA and want to support non-ASCII >>mail addresses in the places where MTAs use ASCII mail addresses >>today, that MTA need not implement punycode decoding? > > You are (again) confusing the protocol with the implementation. The > protocol does not require these things; the implementation might. Right, I was talking about the implementation, I tried to make that clear by saying "the MTA" rather than "the specification". Isn't (one of) the goal of the IMAA protocol to make it possible for MTA implementations to support non-ASCII? Then whether the implementation or the specification is generating the requirement seems like an academic point. The end result is that punycode decoding is required in the implementation, which is what I consider the problem. If a solution that didn't involve encoding techniques such as punycode could be developed, I think that should be preferred. >>If so, how would you translate an incoming punycoded string into >>non-ASCII data that is stored in the log file, for instance? > > MTA implementations that want to write into log files already need > Punycode decoding for the host names. Your complaint here is invalid. Obviously we are interpreting IMAX differently, or you wouldn't say that. Now that you write this I would agree that IMAX is unclear on one thing: does IMAX make the RHS of the email address a (in IDNA terminology) a IDN-aware domain name slot? I think it should. 
It doesn't make sense to negotiate non-ASCII and then simply don't take advantage of that and use IDNA for the RHS, treating it as a IDN-unaware domain name slot. IMAX authors, perhaps add an example (and text to go with it) that illustrates non-ASCII RHS too. MAIL FROM: if this is what you intend? The alternative would be MAIL FROM: but then IMHO the whole point of IMAX falls: that you can support non-ASCII using raw charset encodings instead of application specific encodings. I interpreted IMAX as providing a IDN-aware domain name slot for the RHS too, where you could send non-punycoded data. >>If you are saying that the MTA should put the IMAA encoded mail >>address in the log file, I'd say then that MTA doesn't support >>non-ASCII. > > You are free to say that. Others would disagree. In the case of IMAX, > what would you want in your log file. All UTF-8? That means you need > converters from every accepted charset to UTF-8. Careful sysadmins > would probably want to know *exactly* what came in, not some converted > form, but that means that their log file would have multiple charsets > in it, which would make display a mess. A reasonable option is to > store the addresses as ACE and to have a log-file viewer that converts > on display (and has an option for not converting). > > Again, this is an implementation issue, not a protocol issue. Yes. But it is an important point. A internationalization solution that doesn't consider these practical issues is of only theoretical value. I would want the log file to contain characters that can be read without special IDNA/IMAA/IMAX aware programs. I.e., if the system uses UTF-8 as the system encoding, I'd want the log file to be in UTF-8. If the system uses ISO-8859-1, the log file should be in ISO-8859-1 (and the application must cope with data that can't be represented somehow). Yes, the application must know how to convert alien (but charset tagged) data into the system charset. 
But IDNA and IMAA have the same characteristic: it require the application to convert Unicode (which is the only charset IDNA/IMAA accept) to the system charset. So I cannot see where the big difference lies? I agree careful sysadmins want to see exactly what came in. The only way to represent that, unless the system uses the same charset as the data that came in, is to print the charset of the incoming data and the byte sequence. The same is true today on a ISO-8859-1 system that receives Unicode via IDNA. It seems we disagree that it is reasonable to require users to use special applications to view log files, or edit configuration files, etc. Personally, I don't use applications that have configuration files or log files that can't be manipulated using text operations. I do suppose many Microsoft Windows users would find your approach acceptable though, since that's what they are accustomed to. IMHO a solution must be able to accomodate both users. >> An essential feature of supporting non-ASCII is to make it >>possible for the user of the application to actually see the >>characters. ASCII encoding them and displaying them to the user >>doesn't make the application support non-ASCII in practice. It would >>be like claiming to support Unicode in a terminal emulator when it >>only displayed Base64 encoding of the UTF-8 encoded Unicode code >>points. > > IMAA describes in detail when and how to display the Unicode form to > the user; IMAX mostly glosses over this. Yes, IMAX is not a final document so this isn't surprising. Although for IMAX, those issues are simpler since IMAX allows implementations to use charsets that the system already support natively. >> >>A punycode encoder is required if the MTA handle non-ASCII data in >>>>decoded, normal, format. Like in the user interface for /etc/aliases, >>>>/etc/mail/virtusertable etc. >>> >>> Neither of those are controlled by the MTA. This is getting pretty silly. 
>> >>That was not a generic example, it was an example for one MTA >>implementation: Sendmail. It uses and control those files. > > And, again, you are mixing up protocols with implementations. I'm sorry, I'll try to make it more clear when I talk about the implementation or the specification. If you are saying that we should simply ignore all implementation related aspects in a proposed solution, then I guess I simply don't agree with that. I'll continue to relate a proposal to the real world. >> >> If it doesn't handle non-ASCII in normal >>>>format, it might as well not support non-ASCII at all since the user >>>>would never notice the different. >>> >>> You are mixing up the MTA and the MUA. >> >>I wasn't clear. I meant the user of the MTA, i.e., the administrator. >>Administrators have non-ASCII requirements too. > > Correct, and IMAA describes when and how to convert for display. Right. This is what cause the dependence on punycode decoding. Since administrators not only view non-ASCII but input non-ASCII too, punycode encoding is required too. >>MTA implementations, nor internationalization solutions for MTAs, >>exist in a vacuum. If it is impossible to implement an >>internationalized product and being compliant, the specification has a >>problem. > > Of course. Nothing in IMAA makes it "impossible to implement an > internationalized product". Cool. Then, perhaps, what we have is two solutions that can implement an internationalized product. I'm trying to convince myself which of them is the better approach. >> > But you keep talking about the need to handle fallback. Handling two >>> protocols is not easier than handling one in any universe. >> >>True. Yes, the fallback is a problem. Hm. Perhaps those interested >>in non-ASCII need to require the use of modern software at the >>receiver and the sender, then implementations doesn't need to >>implement the fall back case. > > That's not what the IMAX document says. Right, I proposed something new. 
> If you want to propose a ESMTP extension with no fallback, either > change IMAX or create your own Internet Draft. In either case, you > will have to say explicitly how this will interact with SMTP servers > that do not support the new protocol, how bounces would be handled, > how users would know if they could send a message, and so on. I > think when you write that, if you do so honestly, you will see that > it would be silly to propose such a solution. Discarding it as silly seems a bit premature to me. Having such a proposal, that discusses all the consequences you mention seems like a valuable contribution to this discussion. But I guess it is easier to advocate one solution if the competition are discarded early on... >> >> > Clean is in the eye of the beholder. You and I like UTF-8, but many >>>> > people don't. Forcing them to use our preferred charset isn't a good >>>> > practice if it can be avoided. >>>> >> >>I agree completely. This is one of my problems with IDNA and IMAA, it >>>>forces Unicode on everyone. >>> >>> Unicode is not a charset. >> >>I'm not sure if you genuinely missed my point due to this >>misunderstanding, but assuming you did, let me correct myself: replace >>"Unicode" with "Any charset encoding format of Unicode". > > I think I hear you saying that you think that the protocols should > allow any repertoire and any encoding of those repertoires. If so, we > certainly disagree. The IETF is not very keen on creating protocols > for which there would be limited and unpredictable > interoperability. Other standards group might not be so picky. That is stretching it a bit, I think. I believe that a solution worth its salt should consider existing habits, and whether we like it or not there is more than charset used on the Internet. MIME appears to acknowledge this and is rather successful. HTML acknowledge this and is rather successful. Same for HTTP. 
Come to think of it, I can't recall any successful internationalization product the IETF has produced to counter my examples, can you help me? If you are speaking for IETF, I find it interesting that RFC 2277 "IETF Policy on Character Sets and Languages" says that protocols MAY allow use of any repertoire. It doesn't say that it is a bad idea to allow more than one charset. I agree with that document, let's require the use of UTF-8 in protocols, but allow negotiation of other charsets to smooth transition and deployment. From owner-ietf-imaa Tue Feb 25 15:02:18 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1PN2IJ02831 for ietf-imaa-bks; Tue, 25 Feb 2003 15:02:18 -0800 (PST) Received: from [63.202.92.157] (adsl-63-202-92-156.dsl.snfc21.pacbell.net [63.202.92.156]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1PN2Gd02824 for ; Tue, 25 Feb 2003 15:02:16 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: In-Reply-To: References: <4.2.0.58.J.20030215182950.03351550@localhost> <1045337541.29302.TMDA@moriarty.gnomon.org.uk> <4.2.0.58.J.20030216101703.05134de8@localhost> <4.2.0.58.J.20030219091733.054f4718@localhost> <15955.58113.661260.889533@moriarty.gnomon.org.uk> <006601c2d891$a29fbbb0$3800a8c0@jeffreyibm> <4.2.0.58.J.20030220132811.056277a0@localhost> <036f01c2d922$d1b59910$0f01a8c0@neteka.inc> <097e01c2dbc0$0f87ff90$b03080dc@JSENGTOSHIBA> <025d01c2dbd3$2aa36950$fb5016d3@neteka.inc> X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . 
Date: Tue, 25 Feb 2003 14:56:03 -0800 To: ietf-imaa@imc.org From: Paul Hoffman / IMC Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 10:13 PM +0100 2/25/03, Simon Josefsson wrote: >Isn't (one >of) the goal of the IMAA protocol to make it possible for MTA >implementations to support non-ASCII? Asking ludicrous questions is not a good form in technical discussions. Of course that is a goal. And IMAA does that already. >The end result is that punycode decoding is required >in the implementation, which is what I consider the problem. If a >solution that didn't involve encoding techniques such as punycode >could be developed, I think that should be preferred. Then you are not talking about IMAX. If you have some other protocol in mind that doesn't require punycode but still guarantees mail delivery, please write an Internet Draft for it. (If you are thinking of a protocol that doesn't require punycode but would instead simply bounce or lose mail that was sent to MTAs that didn't understand the new protocol, please don't bother writing an Internet draft...) >Now that you write this I would agree that IMAX is unclear on >one thing: One of many things.... > > You are free to say that. Others would disagree. In the case of IMAX, >> what would you want in your log file. All UTF-8? That means you need >> converters from every accepted charset to UTF-8. Careful sysadmins >> would probably want to know *exactly* what came in, not some converted >> form, but that means that their log file would have multiple charsets >> in it, which would make display a mess. A reasonable option is to >> store the addresses as ACE and to have a log-file viewer that converts >> on display (and has an option for not converting). >> >> Again, this is an implementation issue, not a protocol issue. > >Yes. 
But it is an important point. An internationalization solution >that doesn't consider these practical issues is of only theoretical >value. So the current SMTP, POP, IMAP, and HTTP protocols are only of theoretical value. Oh, well. >I would want the log file to contain... Fine. Ask your vendor to include that feature. This is not part of a protocol specification. >It seems we disagree that it is reasonable to require users to use >special applications to view log files, or edit configuration files, >etc. No. We disagree as to whether this is part of the protocol. Few (if any) IETF protocols cover this. > > IMAA describes in detail when and how to display the Unicode form to >> the user; IMAX mostly glosses over this. > >Yes, IMAX is not a final document so this isn't surprising. Neither is IMAA. At least the IMAA authors admit where the open issues are. >I'm sorry, I'll try to make it more clear when I talk about the >implementation or the specification. If you are saying that we should >simply ignore all implementation-related aspects in a proposed >solution, then I guess I simply don't agree with that. I'll continue >to relate a proposal to the real world. This is a discussion of a potential IETF protocol. Please hold your discussion to things that could be included in an IETF protocol. If you don't like the way the IETF makes protocols, there are other standards organizations in which you might want to be active instead of the IETF. Or, you can take your concerns about the way we create protocols to the main IETF mailing list and see if there is enough support for your views to change the way the IETF works. > > Correct, and IMAA describes when and how to convert for display. > >Right. This is what causes the dependence on punycode decoding. Since >administrators not only view non-ASCII but input non-ASCII too, >punycode encoding is required too. Wrong, yet again. There is nothing in the IMAA document about how the administrator views documents.
IMAA is about mail transport, not system administration. > > If you want to propose an ESMTP extension with no fallback, either >> change IMAX or create your own Internet Draft. In either case, you >> will have to say explicitly how this will interact with SMTP servers >> that do not support the new protocol, how bounces would be handled, > > how users would know if they could send a message, and so on. I >> think when you write that, if you do so honestly, you will see that >> it would be silly to propose such a solution. > >Discarding it as silly seems a bit premature to me. Having such a >proposal, one that discusses all the consequences you mention, seems like a >valuable contribution to this discussion. But I guess it is easier to >advocate one solution if the competition is discarded early on... We disagree here. This discussion has been happening for over 10 years with respect to ESMTP extensions. I don't consider that to be "early on". > > I think I hear you saying that you think that the protocols should >> allow any repertoire and any encoding of those repertoires. If so, we >> certainly disagree. The IETF is not very keen on creating protocols >> for which there would be limited and unpredictable >> interoperability. Other standards groups might not be so picky. > >That is stretching it a bit, I think. I believe that a solution worth >its salt should consider existing habits, and whether we like it or >not there is more than one charset used on the Internet. MIME appears to >acknowledge this and is rather successful. HTML acknowledges this and >is rather successful. Same for HTTP. Come to think of it, I can't >recall any successful internationalization product the IETF has >produced to counter my examples, can you help me? You just helped yourself. -MIME headers (RFC 2047) often display unreadable gibberish for charsets that the recipient can't decode, even when using quoted-printable (commonly called "quoted-unreadable" by people in the mail world).
-HTML shows unintelligible gibberish if the charset used and stated in the document cannot be displayed by the user. People see this every day. -HTTP fails completely if the client lists charsets that it can read and none of those are charsets that the server can write. This is uncommon only because the feature is rarely used, owing to its high failure rate. The third case is analogous to what you are proposing with "fail to deliver if the charset is not supported". >If you are speaking for IETF, I'm not, and you should assume that anyone other than Harald Alvestrand or Leslie Daigle who speaks for the IETF is bluffing or lying. > I find it interesting that RFC 2277 >"IETF Policy on Character Sets and Languages" says that protocols MAY >allow use of any repertoire. Really? It says that? Could you quote the sentence for us here? I couldn't find the word "repertoire" anywhere in RFC 2277. > It doesn't say that it is a bad idea to >allow more than one charset. I agree with that document, let's >require the use of UTF-8 in protocols, but allow negotiation of other >charsets to smooth transition and deployment. You should take this up with Harald Alvestrand, the author of RFC 2277. Note that IDN chose not to use UTF-8, and Harald (as chair of the IESG) approved it to be on standards track.
--Paul Hoffman, Director --Internet Mail Consortium From owner-ietf-imaa Tue Feb 25 15:36:29 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1PNaT503647 for ietf-imaa-bks; Tue, 25 Feb 2003 15:36:29 -0800 (PST) Received: from m3001.hostcentric.net (m3001.hostcentric.net [216.157.79.237]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1PNaSd03643 for ; Tue, 25 Feb 2003 15:36:28 -0800 (PST) Received: (qmail 14934 invoked by alias); 25 Feb 2003 23:36:27 -0000 Received: from unknown (HELO DAVIS1) (12.234.231.250) by 0 with SMTP; 25 Feb 2003 23:36:27 -0000 Message-ID: <014d01c2dd26$afad2150$7900a8c0@DAVIS1> From: "Mark Davis" To: , "Paul Hoffman / IMC" References: <4.2.0.58.J.20030215182950.03351550@localhost> <1045337541.29302.TMDA@moriarty.gnomon.org.uk> <4.2.0.58.J.20030216101703.05134de8@localhost> <4.2.0.58.J.20030219091733.054f4718@localhost> <15955.58113.661260.889533@moriarty.gnomon.org.uk> <006601c2d891$a29fbbb0$3800a8c0@jeffreyibm> <4.2.0.58.J.20030220132811.056277a0@localhost> <036f01c2d922$d1b59910$0f01a8c0@neteka.inc> <097e01c2dbc0$0f87ff90$b03080dc@JSENGTOSHIBA> <025d01c2dbd3$2aa36950$fb5016d3@neteka.inc> Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) Date: Tue, 25 Feb 2003 15:36:10 -0800 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2800.1106 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1106 Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: > > It doesn't say that it is a bad idea to > >allow more than one charset. I agree with that document, let's > >require the use of UTF-8 in protocols, but allow negotiation of other > >charsets to smooth transition and deployment. > You should take this up with Harald Alvestrand, the author of RFC > 2277. 
Note that IDN chose not to use UTF-8, and Harald (as chair of > the IESG) approved it to be on standards track. I want to point out a very important feature here. While IDN does not use UTF-8, the contents are algorithmically mappable to UTF-8. That is *very* different from allowing arbitrary charsets. There is a huge problem with using arbitrary charsets; they don't interoperate well. They may not be supported on the recipient platform, or if supported, even the 'same' charset (such as SJIS) is interpreted in different ways on different platforms. If the on-the-wire protocol is UTF-8 (or algorithmically mappable to UTF-8) then senders and recipients only need to deal with one charset. Mark From owner-ietf-imaa Tue Feb 25 18:20:36 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1Q2Ka607625 for ietf-imaa-bks; Tue, 25 Feb 2003 18:20:36 -0800 (PST) Received: from neteka.com (www.namesbeyond.com [216.220.34.103]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1Q2KZd07620 for ; Tue, 25 Feb 2003 18:20:35 -0800 (PST) Message-ID: <05f001c2dd3d$9ecd1860$fb5016d3@neteka.inc> From: "Edmon Chung" To: "Mark Davis" , , "Paul Hoffman / IMC" References: <4.2.0.58.J.20030215182950.03351550@localhost> <1045337541.29302.TMDA@moriarty.gnomon.org.uk> <4.2.0.58.J.20030216101703.05134de8@localhost> <4.2.0.58.J.20030219091733.054f4718@localhost> <15955.58113.661260.889533@moriarty.gnomon.org.uk> <006601c2d891$a29fbbb0$3800a8c0@jeffreyibm> <4.2.0.58.J.20030220132811.056277a0@localhost> <036f01c2d922$d1b59910$0f01a8c0@neteka.inc> <097e01c2dbc0$0f87ff90$b03080dc@JSENGTOSHIBA> <025d01c2dbd3$2aa36950$fb5016d3@neteka.inc> <014d01c2dd26$afad2150$7900a8c0@DAVIS1> Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) Date: Tue, 25 Feb 2003 21:20:19 -0500 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2600.0000 
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2600.0000 Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Hi Mark, I think you are right. That is why in the IMAX description, UTF-8 is mandated. The thinking is similar to XML among other things. And in order to not reinvent the wheel, a fallback to punycode is suggested. What are your thoughts overall on the doc? BTW, I have updated the draft to -01 and changed a number of things. Most notably, I took out section 3 as suggested by everyone... including myself :-) You can find it at: http://www.dnsii.org/draft-ietf-chung-imax-01.txt (Paul, I haven't changed the optional parameter word "CHARSET" yet, but I think you are right and I will do so in the next version) (James, I have sent it to the IETF, but I don't know when they will get it posted... just in case you ask.) Edmon
From owner-ietf-imaa Tue Feb 25 19:08:42 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1Q38gu08783 for ietf-imaa-bks; Tue, 25 Feb 2003 19:08:42 -0800 (PST) Received: from m3001.hostcentric.net (m3001.hostcentric.net [216.157.79.237]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1Q38ed08776 for ; Tue, 25 Feb 2003 19:08:40 -0800 (PST) Received: (qmail 20603 invoked by alias); 26 Feb 2003 03:08:44 -0000 Received: from unknown (HELO DAVIS1) (12.234.231.250) by 0 with SMTP; 26 Feb 2003 03:08:44 -0000 Message-ID: <018b01c2dd44$57134330$7900a8c0@DAVIS1> From: "Mark Davis" To: "Edmon Chung" , , "Paul Hoffman / IMC" References: <4.2.0.58.J.20030215182950.03351550@localhost> <4.2.0.58.J.20030216101703.05134de8@localhost> <4.2.0.58.J.20030219091733.054f4718@localhost> <15955.58113.661260.889533@moriarty.gnomon.org.uk> <006601c2d891$a29fbbb0$3800a8c0@jeffreyibm> <4.2.0.58.J.20030220132811.056277a0@localhost> <036f01c2d922$d1b59910$0f01a8c0@neteka.inc> <097e01c2dbc0$0f87ff90$b03080dc@JSENGTOSHIBA> <025d01c2dbd3$2aa36950$fb5016d3@neteka.inc> <014d01c2dd26$afad2150$7900a8c0@DAVIS1> <05f001c2dd3d$9ecd1860$fb5016d3@neteka.inc> Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) Date: Tue, 25 Feb 2003 19:08:29 -0800 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2800.1106 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1106 Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: > What are your thoughts overall on the doc? Sadly, I am up to my ears in Unicode 4.0 work right now, and am only able to keep half an ear open to this mailing list. I should have more time in a couple of weeks.
Mark ________ mark.davis@jtcsv.com IBM, MS 50-2/B11, 5600 Cottle Rd, SJ CA 95193 (408) 256-3148 fax: (408) 256-0799
From owner-ietf-imaa Wed Feb 26 03:38:11 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1QBcB327010 for ietf-imaa-bks; Wed, 26 Feb 2003 03:38:11 -0800 (PST) Received: from yxa.extundo.com (178.230.13.217.in-addr.dgcsystems.net [217.13.230.178]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1QBc8d26999; Wed, 26 Feb 2003 03:38:09 -0800 (PST) Received: from latte.josefsson.org (yxa.extundo.com [217.13.230.178]) (authenticated bits=0) by yxa.extundo.com (8.12.7/8.12.7) with ESMTP id h1QBc0Xf001113 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=OK); Wed, 26 Feb 2003 12:38:02 +0100 To: Paul Hoffman / IMC Cc: ietf-imaa@imc.org Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) References: <4.2.0.58.J.20030215182950.03351550@localhost> <4.2.0.58.J.20030216101703.05134de8@localhost> <4.2.0.58.J.20030219091733.054f4718@localhost> <15955.58113.661260.889533@moriarty.gnomon.org.uk> <006601c2d891$a29fbbb0$3800a8c0@jeffreyibm> <4.2.0.58.J.20030220132811.056277a0@localhost> <036f01c2d922$d1b59910$0f01a8c0@neteka.inc> <097e01c2dbc0$0f87ff90$b03080dc@JSENGTOSHIBA> <025d01c2dbd3$2aa36950$fb5016d3@neteka.inc> From: Simon Josefsson X-Payment: hashcash 1.1 0:030226:phoffman@imc.org:630af6f59ec7bc87 X-Hashcash: 0:030226:phoffman@imc.org:630af6f59ec7bc87 X-Payment: hashcash 1.1 0:030226:ietf-imaa@imc.org:0c20164f29ee9415 X-Hashcash: 0:030226:ietf-imaa@imc.org:0c20164f29ee9415 Date: Wed, 26 Feb 2003 12:38:00 +0100 In-Reply-To: (Paul Hoffman / IMC's message of "Tue, 25
Feb 2003 14:56:03 -0800") Message-ID: User-Agent: Gnus/5.090016 (Oort Gnus v0.16) Emacs/21.3.50 MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: 8bit X-Spam-Status: No, hits=-32.4 required=5.0 tests=EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,REFERENCES, REPLY_WITH_QUOTES,USER_AGENT_GNUS_UA autolearn=ham version=2.50 X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Paul Hoffman / IMC writes: > At 10:13 PM +0100 2/25/03, Simon Josefsson wrote: >>Isn't (one >>of) the goal of the IMAA protocol to make it possible for MTA >>implementations to support non-ASCII? > > Asking ludicrous questions is not a good form in technical > discussions. Of course that is a goal. And IMAA does that already. My question was sincere. IMAX appears to be a solution for internationalization of MTAs, at the SMTP layer. It does not propose solving the internationalization problem for MUAs. SMTP is an interactive protocol between two end-entities, and can therefore negotiate non-ASCII support. That is different from RFC (2)822, where the entities that will handle the stored data are not able to interact with the creator of that data to negotiate non-ASCII. IMAX takes advantage of this difference. I believe it would be possible to design an internationalization solution for RFC (2)822 that would be distinct from an SMTP internationalization solution. Those two distinctions could be investigated in parallel and evaluated on their own merits. If you think this is ludicrous and want this to be a productive discussion, please take the question seriously and explain in technical terms why your proposal is better. >>The end result is that punycode decoding is required >>in the implementation, which is what I consider the problem.
If a >>solution that didn't involve encoding techniques such as punycode >>could be developed, I think that should be preferred. > > Then you are not talking about IMAX. If you have some other protocol > in mind that doesn't require punycode but still guarantees mail > delivery, please write an Internet Draft for it. (If you are thinking > of a protocol that doesn't require punycode but would instead simply > bounce or lose mail that was sent to MTAs that didn't understand the > new protocol, please don't bother writing an Internet draft...) Why not? That seems to be one serious alternative solution to IMAA. I can only interpret your dismissal of alternative solutions without a serious analysis to mean that you either have done this analysis already and know the answers or that you don't want to see alternative ideas discussed. In the former case, I think it would be useful to read your analysis. >>Now that you write this I would agree that IMAX is unclear on >>one thing: > > One of many things.... Perhaps your experience in this area could be applied to improving the specification? I'm sure having two serious alternatives to look at would help the discussion. >> > You are free to say that. Others would disagree. In the case of IMAX, >>> what would you want in your log file. All UTF-8? That means you need >>> converters from every accepted charset to UTF-8. Careful sysadmins >>> would probably want to know *exactly* what came in, not some converted >>> form, but that means that their log file would have multiple charsets >>> in it, which would make display a mess. A reasonable option is to >>> store the addresses as ACE and to have a log-file viewer that converts >>> on display (and has an option for not converting). >>> >>> Again, this is an implementation issue, not a protocol issue. >> >>Yes. But it is an important point. An internationalization solution >>that doesn't consider these practical issues is of only theoretical >>value.
> > So the current SMTP, POP, IMAP, and HTTP protocols are only of > theoretical value. Oh, well. Those protocols do consider practical issues. A simple proof that they aren't only of theoretical value is that they are used in practice. >>I would want the log file to contain... > > Fine. Ask your vendor to include that feature. This is not part of a > protocol specification. I'm the vendor, and I'm here to understand how to implement it. If the protocol specification doesn't give guidance on, or hasn't considered, how it will be implemented, I fear it will not work. If you take this fear seriously and want to prove me wrong, please explain how it can be done in the real world. This exercise would be useful when comparing IMAA with IMAX since it would give a complete picture. >>It seems we disagree that it is reasonable to require users to use >>special applications to view log files, or edit configuration files, >>etc. > > No. We disagree as to whether this is part of the protocol. Few (if > any) IETF protocols cover this. Few (if any) IETF protocols have designs that make this a problem. An ESMTP extension with tagged charsets might not make this a problem, but IMAA does. If the IMAA design makes you propose that a reasonable approach is to implement special applications for viewing log files or editing configurations, a valid critique of IMAA would be that alternative solutions would not generate these problems. Dismissing this critique because the IMAA protocol doesn't clearly state the consequences of its design, or declares them out of scope, isn't productive. >> > IMAA describes in detail when and how to display the Unicode form to >>> the user; IMAX mostly glosses over this. >> >>Yes, IMAX is not a final document so this isn't surprising. > > Neither is IMAA. At least the IMAA authors admit where the open issues are. Are you accusing the IMAX authors of holding out on what the open issues are?
I can't tell, but of course as an evaluator of both specifications I'd appreciate it if you could disclose those problems. >>I'm sorry, I'll try to make it more clear when I talk about the >>implementation or the specification. If you are saying that we should >>simply ignore all implementation-related aspects in a proposed >>solution, then I guess I simply don't agree with that. I'll continue >>to relate a proposal to the real world. > > This is a discussion of a potential IETF protocol. Please hold your > discussion to things that could be included in an IETF protocol. If > you don't like the way the IETF makes protocols, there are other > standards organizations in which you might want to be active instead > of the IETF. Or, you can take your concerns about the way we create > protocols to the main IETF mailing list and see if there is enough > support for your views to change the way the IETF works. I'm sorry, I was under the impression that the IETF worried about how protocols are implemented in practice too. >> > Correct, and IMAA describes when and how to convert for display. >> >>Right. This is what causes the dependence on punycode decoding. Since >>administrators not only view non-ASCII but input non-ASCII too, >>punycode encoding is required too. > > Wrong, yet again. There is nothing in the IMAA document about how the > administrator views documents. IMAA is about mail transport, not > system administration. You said that a reasonable approach to implementing a non-ASCII solution based on IMAA was to implement special applications for viewing log files and editing configuration files. I don't consider this a reasonable solution, and thus object to IMAA based on this. In a discussion, the productive response to this criticism would be to explain how IMAA can accommodate other views as well. If IMAA cannot accommodate this view, all it would take is to say so, and I'll be enlightened.
I agree the IMAA document doesn't answer my question, that's why I brought it up. >> > If you want to propose a ESMTP extension with no fallback, either >>> change IMAX or create your own Internet Draft. In either case, you >>> will have to say explicitly how this will interact with SMTP servers >>> that do not support the new protocol, how bounces would be handled, >> > how users would know if they could send a message, and so on. I >>> think when you write that, if you do so honestly, you will see that >>> it would be silly to propose such a solution. >> >>Discarding it as silly seems a bit premature to me. Having such a >>proposal, that discusses all the consequences you mention seems like a >>valuable contribution to this discussion. But I guess it is easier to >>advocate one solution if the competition are discarded early on... > > We disagree here. This discussion has been happening for over 10 years > with respect to ESMTP extensions. I don't consider that to be "early > on". If an idea can't be dismissed after 10 years of discussion, perhaps there is some merit with that idea. Since technical solutions are continuously proposed based on the idea, perhaps it would be useful to document why that idea is a bad one, if you believe that. >> > I think I hear you saying that you think that the protocols should >>> allow any repertoire and any encoding of those repertoires. If so, we >>> certainly disagree. The IETF is not very keen on creating protocols >>> for which there would be limited and unpredictable >>> interoperability. Other standards group might not be so picky. >> >>That is stretching it a bit, I think. I believe that a solution worth >>its salt should consider existing habits, and whether we like it or >>not there is more than charset used on the Internet. MIME appears to >>acknowledge this and is rather successful. HTML acknowledge this and >>is rather successful. Same for HTTP. 
Come to think of it, I can't >>recall any successful internationalization product the IETF has >>produced to counter my examples, can you help me? > > You just helped yourself. > > -MIME headers (RFC 2047) often display unreadable gibberish for > charsets that the recipient can't decode, even when using > quoted-printable (commonly called "quoted-unreadble" by people in the > mail world). > > -HTML shows unintelligible gibberish if the charset used and stated in > the document cannot be displayed by the user. People see this every > day. > > -HTTP fails compeletly if the client lists charsets that it can read > and none of those are charsets that the server can write. This is > uncommon because this feature is rarely used because of the high > failure rate. > > The third case is analogous to what you are proposing with "fail to > deliver if the charset is not supported". So you are saying that the IETF is "not very keen on creating protocols" like MIME, HTML and HTTP? That's an interesting proposition. >>If you are speaking for IETF, > > I'm not, and you should assume that anyone other than Harald > Alvestrand or Leslie Daigle who speaks for the IETF is bluffing or > lying. Right, I didn't want to make that assumption in this case as it would be offensive. >> I find it interesting that RFC 2277 >>"IETF Policy on Character Sets and Languages" says that protocols MAY >>allow use of any repertoire. > > Really? It says that? Could you quote the sentence for us here? I > couldn't find the word "repertoire" anywhere in RFC 2277. The word "repertoire" is indeed not used. The following section (§3.1, page 3 of the document, if you want to look it up) says that other charsets or other encoding schemes may be used. 
,---- | Protocols MAY specify, in addition, how to use other charsets or | other character encoding schemes for ISO 10646, such as UTF-16, but | lack of an ability to use UTF-8 is a violation of this policy; such a | violation would need a variance procedure ([BCP9] section 9) with | clear and solid justification in the protocol specification document | before being entered into or advanced upon the standards track. `---- I note that punycode is an encoding scheme, and thus IDNA and IMAA violate this policy by lacking an ability to use UTF-8. >> It doesn't say that it is a bad idea to >>allow more than one charset. I agree with that document, let's >>require the use of UTF-8 in protocols, but allow negotiation of other >>charsets to smooth transition and deployment. > > You should take this up with Harald Alvestrand, the author of RFC > 2277. Note that IDN chose not to use UTF-8, and Harald (as chair of > the IESG) approved it to be on standards track. Perhaps he is busy with other things, but I will ask if the policy in RFC 2277 doesn't apply any more, or where the variance procedure steps for the IDN working group are documented. Thanks for the suggestion.
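To make the "encoding scheme" point concrete, here is a minimal sketch, assuming Python 3 and its built-in `punycode` codec. The sample label `bücher` and the helper names `ace_to_utf8` / `utf8_to_ace` are illustrative assumptions, not taken from IMAA, IMAX, or IDNA. It only shows that a Punycode string is pure ASCII and converts losslessly to and from UTF-8; it does not apply any ACE prefix or Stringprep/Nameprep step.

```python
# Sketch: Punycode as an ASCII-only encoding of a Unicode string.
# Assumption: the sample label is illustrative; no ACE prefix or
# Nameprep processing is applied here.
label = "bücher"

ace = label.encode("punycode")   # ASCII-compatible form of the label
utf8 = label.encode("utf-8")     # the same characters as UTF-8 octets


def ace_to_utf8(ace_bytes: bytes) -> bytes:
    """Map a Punycode byte string to the UTF-8 encoding of the same text."""
    return ace_bytes.decode("punycode").encode("utf-8")


def utf8_to_ace(utf8_bytes: bytes) -> bytes:
    """Map UTF-8 octets to the Punycode encoding of the same text."""
    return utf8_bytes.decode("utf-8").encode("punycode")


print(ace)                 # b'bcher-kva'
print(ace_to_utf8(ace))    # b'b\xc3\xbccher'
```

Both directions are purely algorithmic, so no charset table or negotiation is involved in moving between the two forms.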
From owner-ietf-imaa Wed Feb 26 11:45:38 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1QJjcV25259 for ietf-imaa-bks; Wed, 26 Feb 2003 11:45:38 -0800 (PST) Received: from tux.w3.org (IDENT:root@tux.w3.org [18.29.0.27]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1QJjbd25253; Wed, 26 Feb 2003 11:45:37 -0800 (PST) Received: from enoshima (IDENT:root@tux.w3.org [18.29.0.27]) by tux.w3.org (8.9.3/8.9.3) with ESMTP id OAA25230; Wed, 26 Feb 2003 14:45:37 -0500 Message-Id: <4.2.0.58.J.20030226125526.02dafed8@localhost> X-Sender: duerst@localhost X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J Date: Wed, 26 Feb 2003 12:59:47 -0500 To: "Edmon Chung" , "Mark Davis" , , "Paul Hoffman / IMC" From: Martin Duerst Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) In-Reply-To: <05f001c2dd3d$9ecd1860$fb5016d3@neteka.inc> References: <4.2.0.58.J.20030215182950.03351550@localhost> <1045337541.29302.TMDA@moriarty.gnomon.org.uk> <4.2.0.58.J.20030216101703.05134de8@localhost> <4.2.0.58.J.20030219091733.054f4718@localhost> <15955.58113.661260.889533@moriarty.gnomon.org.uk> <006601c2d891$a29fbbb0$3800a8c0@jeffreyibm> <4.2.0.58.J.20030220132811.056277a0@localhost> <036f01c2d922$d1b59910$0f01a8c0@neteka.inc> <097e01c2dbc0$0f87ff90$b03080dc@JSENGTOSHIBA> <025d01c2dbd3$2aa36950$fb5016d3@neteka.inc> <014d01c2dd26$afad2150$7900a8c0@DAVIS1> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 21:20 03/02/25 -0500, Edmon Chung wrote: >Hi Mark, > >I think you are right. That is why in the IMAX description, UTF8 is >mandated. The thinking is similar to XML among other things. There is a huge difference between headers (where having a single encoding is very important, because the tagging overhead is high, there are conversion problems,...) and bodies. 
Parallels to XML work for other body formats, but are really not adequate for headers. Also, XML is already 5 years old. Something new should look ahead, and not try to carry too much unnecessary old stuff. Regards, Martin. From owner-ietf-imaa Wed Feb 26 12:57:35 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1QKvZa27726 for ietf-imaa-bks; Wed, 26 Feb 2003 12:57:35 -0800 (PST) Received: from [63.202.92.156] (adsl-66-123-66-34.dsl.pltn13.pacbell.net [66.123.66.34] (may be forged)) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1QKvXd27706 for ; Wed, 26 Feb 2003 12:57:33 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: In-Reply-To: References: <4.2.0.58.J.20030215182950.03351550@localhost> <4.2.0.58.J.20030216101703.05134de8@localhost> <4.2.0.58.J.20030219091733.054f4718@localhost> <15955.58113.661260.889533@moriarty.gnomon.org.uk> <006601c2d891$a29fbbb0$3800a8c0@jeffreyibm> <4.2.0.58.J.20030220132811.056277a0@localhost> <036f01c2d922$d1b59910$0f01a8c0@neteka.inc> <097e01c2dbc0$0f87ff90$b03080dc@JSENGTOSHIBA> <025d01c2dbd3$2aa36950$fb5016d3@neteka.inc> X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . Date: Wed, 26 Feb 2003 12:57:34 -0800 To: ietf-imaa@imc.org From: Paul Hoffman / IMC Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 12:38 PM +0100 2/26/03, Simon Josefsson wrote: >My question was sincere. 
IMAX appears to be a solution for >internationalization of MTAs, at the SMTP layer. It does not propose >solving the internationalization problem for MUAs. Yes, it does. It shows exactly how an MUA should display ACE names. > SMTP is an >interactive protocol between two end-entities, and can therefor >negotiate non-ASCII support, which is different from RFC (2)822 where >all entities that will handle the stored data is not able to interact >with the creator of that data to negotiate non-ASCII. IMAX takes >advantage of this difference. I believe it would be possible to >design a internationalization solution for RFC (2)822 that would be >distinct from a SMTP internationalization solution. We did that with IMAA. If you have a different proposal, please write an Internet Draft for it. > Those two >distinctions could be investigated in parallel and evaluated on their >own merits. Yes, but we need Internet Drafts before we can do that. > If you think this is ludicrous and want this to be a >productive discussion, please take the question seriously and explain >in technical terms why your proposal is better. How many times should this be done? IMAA is certainly going to be simpler than any proposal that requires changes to both MTAs and MUAs because it localizes the changes to one place (the MUA). It allows other entities in the Internet Mail system to easily use the internationalized email addresses without having to know anything about multiple charsets and repertoires. >(If you are thinking > > of a protocol that doesn't require punycode but would instead simply >> bounce or lose mail that was sent to MTAs that didn't understand the >> new protocol, please don't bother writing an Internet draft...) > >Why not? Because no one who cares about Internet mail wants to start bouncing mail messages unpredictably. Seriously, if you want to do that, don't do it here. Start your own mailing list. 
I'm quite willing to have folks who propose different solutions that are as reliable as IMAA-ACE discuss them here, because then we can pick just one. But people proposing to make Internet mail unreliable aren't welcome. >I can only interprete your dismissal of alternative solutions without >a serious analysis that you either have done this analysis already and >know the answers or that you don't want to see alternative ideas >discussed. The former. > In the former case, I think it would be useful to read >your analysis. No analysis needed. A "new and improved" mail system that is less reliable is a non-starter. > >>I would want the log file to contain... >> >> Fine. Ask your vendor to include that feature. This is not part of a >> protocol specification. > >I'm the vendor, and I'm here to understand how to implement it. If >the protocol specification doesn't give guidance or have considered >how it will be implemented, I fear it will not work. Then you're not a useful vendor. Others will be able to easily figure out where they want to write raw ACE blobs and where they want to convert them into Unicode characters (and, hopefully, which encoding to use for the Unicode characters). >I note that punycode is a encoding scheme, and thus IDNA and IMAA >violates this by lacking an ability to use UTF-8. Right. > > You should take this up with Harald Alvestrand, the author of RFC >> 2277. Note that IDN chose not to use UTF-8, and Harald (as chair of >> the IESG) approved it to be on standards track. > >Perhaps he is busy with other things, He posted 152 messages to the IDN WG mailing list, some of which were on this very topic. It seems likely that he was paying attention... > but I will ask if the policy in >RFC 2277 doesn't apply any more, or where the variance procedure steps >for the IDN working group are documented. Thanks for the suggestion. Let us know what you find out. 
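The point above about choosing where to write raw ACE blobs and where to convert them into Unicode characters can be sketched roughly as follows. This is only a minimal sketch under stated assumptions: it handles IDNA `xn--` domain labels through Python's built-in `idna` codec, it does not attempt the IMAA local-part encoding, and the names `ACE_HOST` and `render_log_line` are invented for illustration.

```python
import re

# Host names that contain at least one "xn--" label. The pattern is an
# illustrative assumption, not something taken from the IDNA or IMAA drafts.
ACE_HOST = re.compile(
    r"\b(?:[A-Za-z0-9-]+\.)*xn--[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)*\b"
)


def _to_unicode(match):
    """Decode one ACE host name; fall back to the raw text on failure."""
    host = match.group(0)
    try:
        return host.encode("ascii").decode("idna")
    except UnicodeError:
        return host  # leave undecodable blobs exactly as they were logged


def render_log_line(line, convert=True):
    """Return a log line for display; convert=False shows the raw ACE form."""
    return ACE_HOST.sub(_to_unicode, line) if convert else line


if __name__ == "__main__":
    raw = "to=<user@xn--bcher-kva.example> status=sent"
    print(render_log_line(raw))                 # converted for display
    print(render_log_line(raw, convert=False))  # exactly what was logged
```

The log itself stays 7-bit ASCII in this arrangement; only the viewer needs to know about Unicode.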
--Paul Hoffman, Director --Internet Mail Consortium From owner-ietf-imaa Wed Feb 26 15:10:36 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1QNAaj04830 for ietf-imaa-bks; Wed, 26 Feb 2003 15:10:36 -0800 (PST) Received: from yxa.extundo.com (178.230.13.217.in-addr.dgcsystems.net [217.13.230.178]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1QNAXd04824; Wed, 26 Feb 2003 15:10:34 -0800 (PST) Received: from latte.josefsson.org (yxa.extundo.com [217.13.230.178]) (authenticated bits=0) by yxa.extundo.com (8.12.7/8.12.7) with ESMTP id h1QNAWqt011946 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=OK); Thu, 27 Feb 2003 00:10:33 +0100 Cc: ietf-imaa@imc.org To: Paul Hoffman / IMC From: Simon Josefsson Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) References: <4.2.0.58.J.20030215182950.03351550@localhost> <15955.58113.661260.889533@moriarty.gnomon.org.uk> <006601c2d891$a29fbbb0$3800a8c0@jeffreyibm> <4.2.0.58.J.20030220132811.056277a0@localhost> <036f01c2d922$d1b59910$0f01a8c0@neteka.inc> <097e01c2dbc0$0f87ff90$b03080dc@JSENGTOSHIBA> <025d01c2dbd3$2aa36950$fb5016d3@neteka.inc> X-Payment: hashcash 1.1 0:030226:phoffman@imc.org:caea1291be305fee X-Hashcash: 0:030226:phoffman@imc.org:caea1291be305fee X-Payment: hashcash 1.1 0:030226:ietf-imaa@imc.org:1dd99d1b76b0c89c X-Hashcash: 0:030226:ietf-imaa@imc.org:1dd99d1b76b0c89c Date: Thu, 27 Feb 2003 00:10:32 +0100 In-Reply-To: (Paul Hoffman / IMC's message of "Wed, 26 Feb 2003 12:57:34 -0800") Message-ID: User-Agent: Gnus/5.090016 (Oort Gnus v0.16) Emacs/21.3.50 (i686-pc-linux-gnu) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Spam-Status: No, hits=-32.4 required=5.0 tests=EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,REFERENCES, REPLY_WITH_QUOTES,USER_AGENT_GNUS_UA autolearn=ham version=2.50 X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: 
List-Unsubscribe:
List-ID:

Paul Hoffman / IMC writes:

> At 12:38 PM +0100 2/26/03, Simon Josefsson wrote:
>>My question was sincere. IMAX appears to be a solution for
>>internationalization of MTAs, at the SMTP layer. It does not propose
>>solving the internationalization problem for MUAs.
>
> Yes, it does. It shows exactly how an MUA should display ACE names.

Are you referring to the M-* headers? Those were (rightly) dropped from IMAX, I believe. IMAX doesn't mention the term "MUA" at all. The section regarding M-* headers definitely does not show "exactly" how an MUA should display the ACE names.

>> Those two
>>distinctions could be investigated in parallel and evaluated on their
>>own merits.
>
> Yes, but we need Internet Drafts before we can do that.

There are two Internet Drafts, with different approaches. We don't need more to investigate these two.

>> If you think this is ludicrous and want this to be a
>>productive discussion, please take the question seriously and explain
>>in technical terms why your proposal is better.
>
> How many times should this be done?

The IMAX draft is only a few weeks old; if you have discussed it many times before, please provide a reference.

> IMAA is certainly going to be simpler than any proposal that
> requires changes to both MTAs and MUAs because it localizes the
> changes to one place (the MUA).

Earlier you said my question about whether IMAA was an internationalization solution for MTAs was ludicrous, yet you now say IMAA doesn't require any changes to the MTA. Clearly, if you want internationalization support in the MTA, you will have to modify it. Let's take a step back:

Compare the situation for the MTA with IMAX: if you want internationalization support in the MTA, you can implement IMAX; if you don't want or care about it, don't implement it. Neither choice will disrupt existing Internet mail services.
Having asserted that an MTA without support for IMAA or IMAX will not disrupt existing services, for the remaining discussion we can assume that the MTA does want to be an internationalized product. I'll call it an I18NMTA to help keep things apart. What I'm trying to understand now is whether IMAA or IMAX is the better choice for the I18NMTA. Some propositions:

* IMAA requires the I18NMTA to implement punycode. IMAX doesn't (assuming my suggested clarification about treating the RHS as an IDNA-aware domain name slot is adopted).

* You claim that under the IMAA design it is reasonable to implement separate applications for viewing log files and editing configuration files in the I18NMTA. IMAX doesn't require this as it uses the system's native character set.

* IMAA requires the I18NMTA to support Unicode. While Unicode is a good thing, it can be difficult to implement in existing systems. It is potentially disruptive to the Internet Mail system, using your terminology. My idea of using an IMAX solution without fallback does not require this. No, I haven't described this idea in an Internet Draft, so you don't have to challenge the proposition, but I'd appreciate it if you did.

You are welcome to add propositions that are to IMAA's advantage.

> It allows other entities in the Internet Mail system to easily use
> the internationalized email addresses without having to know
> anything about multiple charsets and repertoires.

That isn't true. Not all systems are using Unicode, but IMAA requires that they implement Unicode. Clearly that is forcing them to know about multiple charsets.

>>(If you are thinking
>> > of a protocol that doesn't require punycode but would instead simply
>>> bounce or lose mail that was sent to MTAs that didn't understand the
>>> new protocol, please don't bother writing an Internet draft...)
>>
>>Why not?
>
> Because no one who cares about Internet mail wants to start bouncing
> mail messages unpredictably.

Of course not, that is obvious.
How did you infer the bouncing would be unpredictable? > Seriously, if you want to do that, don't do it here. Start your own > mailing list. I'm quite willing to have folks who propose different > solutions that are as reliable as IMAA-ACE discuss them here, because > then we can pick just one. But people proposing to make Internet mail > unreliable aren't welcome. If you believe IMAX would make Internet mail unreliable, please explain why. >>I can only interprete your dismissal of alternative solutions without >>a serious analysis that you either have done this analysis already and >>know the answers or that you don't want to see alternative ideas >>discussed. > > The former. > >> In the former case, I think it would be useful to read >>your analysis. > > No analysis needed. A "new and improved" mail system that is less > reliable is a non-starter. "New and improved" is a loose term. IMAA could be considered a "new and improved" mail system. I believe analysis is needed if you want to make good decisions. >>>>> You are free to say that. Others would disagree. In the case of IMAX, >>>>> what would you want in your log file. All UTF-8? That means you need >>>>> converters from every accepted charset to UTF-8. Careful sysadmins >>>>> would probably want to know *exactly* what came in, not some converted >>>>> form, but that means that their log file would have multiple charsets >>>>> in it, which would make display a mess. A reasonable option is to >>>>> store the addresses as ACE and to have a log-file viewer that converts >>>>> on display (and has an option for not converting). >>>>> >>>>> Again, this is an implementation issue, not a protocol issue. >>>> >>>> Yes. But it is an important point. A internationalization solution >>>> that doesn't consider these practical issues is of only theoretical >>>> value. >>>> >>>> I would want the log file to contain characters that can be read >>>> without special IDNA/IMAA/IMAX aware programs. 
I.e., if the system >>>> uses UTF-8 as the system encoding, I'd want the log file to be in >>>> UTF-8. If the system uses ISO-8859-1, the log file should be in >>>> ISO-8859-1 (and the application must cope with data that can't be >>>> represented somehow). >>>> >>> Fine. Ask your vendor to include that feature. This is not part of a >>> protocol specification. >> >>I'm the vendor, and I'm here to understand how to implement it. If >>the protocol specification doesn't give guidance or have considered >>how it will be implemented, I fear it will not work. > > Then you're not a useful vendor. Others will be able to easily figure > out where they want to write raw ACE blobs and where they want to > convert them into Unicode characters (and, hopefully, which encoding > to use for the Unicode characters). And convert them into the system's native character set too, I'm sure. >>>>> I think I hear you saying that you think that the protocols should >>>>> allow any repertoire and any encoding of those repertoires. If so, we >>>>> certainly disagree. The IETF is not very keen on creating protocols >>>>> for which there would be limited and unpredictable >>>>> interoperability. Other standards group might not be so picky. >>>> >>>> That is stretching it a bit, I think. I believe that a solution worth >>>> its salt should consider existing habits, and whether we like it or >>>> not there is more than charset used on the Internet. MIME appears to >>>> acknowledge this and is rather successful. HTML acknowledge this and >>>> is rather successful. Same for HTTP. Come to think of it, I can't >>>> recall any successful internationalization product the IETF has >>>> produced to counter my examples, can you help me? >>>> >>>> If you are speaking for IETF, I find it interesting that RFC 2277 >>>> "IETF Policy on Character Sets and Languages" says that protocols MAY >>>> allow use of any repertoire. It doesn't say that it is a bad idea to >>>> allow more than one charset. 
I agree with that document, let's
>>>> require the use of UTF-8 in protocols, but allow negotiation of other
>>>> charsets to smooth transition and deployment.
>>>
>>> You should take this up with Harald Alvestrand, the author of RFC
>>> 2277. Note that IDN chose not to use UTF-8, and Harald (as chair of
>>> the IESG) approved it to be on standards track.
>>
>>Perhaps he is busy with other things,
>
> He posted 152 messages to the IDN WG mailing list, some of which were
> on this very topic. It seems likely that he was paying attention...
>
>> but I will ask if the policy in
>>RFC 2277 doesn't apply any more, or where the variance procedure steps
>>for the IDN working group are documented. Thanks for the suggestion.
>
> Let us know what you find out.

I found out that RFC 2277 hasn't been obsoleted. So that means you were wrong in saying (see first paragraph of quoted text) that the IETF is not keen on creating protocols that allow any repertoire and any encoding. The quoted text from RFC 2277 I provided earlier says that they MAY do this.

For reference, Harald Tveit Alvestrand writes:

> simon,
> the "escape clause", if you want one, is that DNS names are not, in
> many senses of the word, text; they're names.
> And RFC 2277 says, in extenso:
>
> 2. Where to do internationalization
>
> Internationalization is for humans. This means that protocols are not
> subject to internationalization; text strings are. Where protocol
> elements look like text tokens, such as in many IETF application
> layer protocols, protocols MUST specify which parts are protocol and
> which are text. [WR 2.2.1.1]
>
> Names are a problem, because people feel strongly about them, many of
> them are mostly for local usage, and all of them tend to leak out of
> the local context at times. RFC 1958 [RFC 1958] recommends US-ASCII
> for all globally visible names.
> > This document does not mandate a policy on name internationalization, > but requires that all protocols describe whether names are > internationalized or US-ASCII. > > So IDN is really carrying internationalization outside the scope of > RFC 2277. > > The more basic reason is that the IETF is about doing what's right, > not what the rules say you have to do - in this particular case, using > UTF-8 > was debated up the wazoo and far beyond, and the group concluded that > using Punycode rather than raw UTF-8 encoding was the Right Decision, > and the IESG backed them on that. > > Using up processing time to write a BCP to cover this variance from > RFC 2277 would not be useful - especially since RFC 2277 can be read > to say that they didn't have to do this anyway. > > Feel free to forward this message wherever you feel like... > > Harald > > --On 26. februar 2003 12:51 +0100 Simon Josefsson wrote: > >> Harald, >> >> I'm sure you are busy, but I'd appreciate if you could take time to >> answer this question. It was suggested on the IMAA list by Paul >> Hoffman to ask you how to reconcile RFC 2277 with the approval of IDN. 
>> In particular, RFC 2277 says: >> From owner-ietf-imaa Wed Feb 26 15:40:05 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1QNe5O05493 for ietf-imaa-bks; Wed, 26 Feb 2003 15:40:05 -0800 (PST) Received: from neteka.com (www.namesbeyond.com [216.220.34.103]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1QNe4d05488 for ; Wed, 26 Feb 2003 15:40:04 -0800 (PST) Message-ID: <06d601c2ddf0$36d08540$fb5016d3@neteka.inc> From: "Edmon Chung" To: "Mark Davis" , , "Paul Hoffman / IMC" , "Martin Duerst" References: <4.2.0.58.J.20030215182950.03351550@localhost> <4.2.0.58.J.20030216101703.05134de8@localhost> <4.2.0.58.J.20030219091733.054f4718@localhost> <15955.58113.661260.889533@moriarty.gnomon.org.uk> <006601c2d891$a29fbbb0$3800a8c0@jeffreyibm> <4.2.0.58.J.20030220132811.056277a0@localhost> <036f01c2d922$d1b59910$0f01a8c0@neteka.inc> <097e01c2dbc0$0f87ff90$b03080dc@JSENGTOSHIBA> <025d01c2dbd3$2aa36950$fb5016d3@neteka.inc> <014d01c2dd26$afad2150$7900a8c0@DAVIS1> <4.2.0.58.J.20030226125526.02dafed8@localhost> Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) Date: Wed, 26 Feb 2003 18:38:43 -0500 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2600.0000 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2600.0000 Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: So you think its better to mandate some other encoding rather than UTF8? Edmon ----- Original Message ----- From: "Martin Duerst" To: "Edmon Chung" ; "Mark Davis" ; ; "Paul Hoffman / IMC" Sent: Wednesday, February 26, 2003 12:59 PM Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) > > At 21:20 03/02/25 -0500, Edmon Chung wrote: > > >Hi Mark, > > > >I think you are right. That is why in the IMAX description, UTF8 is > >mandated. 
The thinking is similar to XML among other things. > > There is a huge difference between headers (where having > a single encoding is very important, because the tagging > overhead is high, there are conversion problems,...) and > bodies. Parallels to XML work for other body formats, but > are really not adequate for headers. > > Also, XML is already 5 years old. Something new should look > ahead, and not try to carry too much unnecessary old stuff. > > > Regards, Martin. > From owner-ietf-imaa Wed Feb 26 15:54:55 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1QNstp05864 for ietf-imaa-bks; Wed, 26 Feb 2003 15:54:55 -0800 (PST) Received: from exchange.ad.skymv.com (66-120-210-136.ded.pacbell.net [66.120.210.136]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1QNssd05859 for ; Wed, 26 Feb 2003 15:54:54 -0800 (PST) Received: from exchange.ad.skymv.com ([192.168.1.71]) by exchange.ad.skymv.com with Microsoft SMTPSVC(5.0.2195.5329); Wed, 26 Feb 2003 15:54:38 -0800 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1106 content-class: urn:content-classes:message Subject: RE: Problems of Internationalized Mail Address eXtensions (IMAX) MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Date: Wed, 26 Feb 2003 15:54:37 -0800 Message-ID: <138AA78F80DCE84B8EE424399FFBF9C904FAED@exchange.ad.skymv.com> X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: Problems of Internationalized Mail Address eXtensions (IMAX) Importance: normal Thread-Index: AcLd7EjjrCbZqfmlRKqNHz0C4Xvf8AAA+jwA From: "Dan Kohn" To: "Simon Josefsson" Cc: X-OriginalArrivalTime: 26 Feb 2003 23:54:38.0078 (UTC) FILETIME=[6BD599E0:01C2DDF2] X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). 
The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by above.proper.com id h1QNssd05860 Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Simon, I feel that Paul is showing a Sisyphean level of patience here, but I know it can't continue. I believe you understand that very similar issues were hashed out, and an ASCII Compatible Encoding (ACE) solution was adopted in IDNA. I think you understand that this does not foreclose someone from eventually standardizing true UTF-8 support for DNS (probably using EDNS), but I suspect that no one ever will, because it's a huge amount of implementation work for no meaningful gain (since IDNA code paths will still have to be supported forever). The exact same logic holds for IMAA, and is why an IMAX ESMTP extension simply adds no meaningful value over IMAA for the only folks who care about i18n, which is the end-users. Simon Josefsson wrote: > Earlier you said my question whether IMAA was an internationalization > solution for MTAs was ludicrous yet you now say IMAA doesn't require > any changes to the MTA. Clearly, if you want to internationalization > support in the MTA, you will have to modify it. Let's take a step > back: The whole point of IMAA, as I'm sure you know, is that we don't want i18n support in MTAs. Why bother? It's a huge amount of work, and any mail admin who really cares about the LHS can still treat it as an opaque string (as the standard says it is). Other than the specific issue of sub-addressing, the whole concept of an I18NMTA is a huge amount of work for no value. 
If you do need to implement Unicode and nameprep support on an MTA to support sub-addressing (and that's not clear yet), then additionally adding punycode is a minor step.

>> It allows other entities in the Internet Mail system to easily use
>> the internationalized email addresses without having to know
>> anything about multiple charsets and repertoires.

> That isn't true. Not all systems are using Unicode, but IMAA requires
> that they implement Unicode. Clearly that is forcing them to know
> about multiple charsets.

No, no, no. IMAA requires nameprep and punycode for i18n-capable MUAs. But the installed base of 500+ M MUAs out there today can continue to interact perfectly normally with any IMAA address. They just don't see the i18n version of the IMAA LHS, which is fine, because it's an opaque string. And MTAs don't need to be upgraded. This is the whole point of IMAA (and IDNA).

> If you believe IMAX would make Internet mail unreliable, please
> explain why.

Because most MTAs don't support IMAX, and so every IMAX-capable MUA and MTA would always have to be downgrading to IMAA (or have to bounce the message), in which case no value has been added, but a lot of additional work has been done. Since there will always be some non-IMAX-capable MTAs and MUAs, IMAA will always have to be around, so every one of those IMAX MUAs and MTAs will still need to implement punycode. Or they can bounce the message, which is what Paul was referring to as a non-starter.

Simon, I've seen this movie before, and I know how it ends. The ACE wins.

If you want to go forward with IMAX, you may even succeed in getting it published as Experimental, though I doubt it. But no one will implement it, since it adds lots of effort but no value over IMAA, so why bother?
- dan -- Dan Kohn From owner-ietf-imaa Wed Feb 26 17:22:06 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1R1M6308669 for ietf-imaa-bks; Wed, 26 Feb 2003 17:22:06 -0800 (PST) Received: from neteka.com (www.namesbeyond.com [216.220.34.103]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1R1M4d08657 for ; Wed, 26 Feb 2003 17:22:04 -0800 (PST) Message-ID: <073601c2ddfe$93cc8100$fb5016d3@neteka.inc> From: "Edmon Chung" To: "Dan Kohn" , "Simon Josefsson" Cc: References: <138AA78F80DCE84B8EE424399FFBF9C904FAED@exchange.ad.skymv.com> Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) Date: Wed, 26 Feb 2003 20:21:34 -0500 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2600.0000 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2600.0000 Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Hi Dan, ----- Original Message ----- From: "Dan Kohn" > Because most MTAs don't support IMAX, and so every IMAX-capable MUA and > MTA would always have to be downgrading to IMAA (or have to bounce the > message), in which case no value has been added, but a lot of addition > work has been done. Since there will always be some non-IMAX capable > MTAs and MUAs, IMAA will always have to be around, so everyone of those > IMAX MUAs and MTAs will still need to implement punycode. Or they can > bounce the message, which is what Paul was referring to as a > non-starter. That was true from the very proposition of ESMTP. And ESMTP was designed so that features could be added as such. > If you want to go forward with IMAX, you may even succeed in getting it > published as Experimental, though I doubt it. But no one will implement > it, since it adds lots of effort but no value over IMAA, so why bother? This is not a fair statement. 
We have implemented the IMAX extension as an experiment already! :-) Of course, it's because we devised it... But discussion with other peer vendors indicates that there is interest in this direction.

I think Experimental is interesting. Perhaps we should pursue that... I think vendors would be interested in adding the feature, and we can argue till the end of time and we won't know if we don't publish it. Your thoughts?

Edmon

From owner-ietf-imaa Wed Feb 26 19:48:52 2003
Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1R3mqr14233 for ietf-imaa-bks; Wed, 26 Feb 2003 19:48:52 -0800 (PST)
Received: from smtp6.andrew.cmu.edu (SMTP6.andrew.cmu.edu [128.2.10.86]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1R3mpY14226 for ; Wed, 26 Feb 2003 19:48:51 -0800 (PST)
Received: from dunbar.rem.cmu.edu (DUNBAR.REM.cmu.edu [128.2.4.197]) by smtp6.andrew.cmu.edu (8.12.7.Beta1/8.12.3.Beta2) with ESMTP id h1R3mrpZ029463; Wed, 26 Feb 2003 22:48:53 -0500
Date: Wed, 26 Feb 2003 22:48:53 -0500
Message-Id: <200302270348.h1R3mrpZ029463@smtp6.andrew.cmu.edu>
From: Lawrence Greenfield
X-Mailer: BatIMail version 3.3
To: Dan Kohn
Cc: ietf-imaa@imc.org
In-reply-to: <138AA78F80DCE84B8EE424399FFBF9C904FAED@exchange.ad.skymv.com>
Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX)
References: <138AA78F80DCE84B8EE424399FFBF9C904FAED@exchange.ad.skymv.com>
User-Agent: SEMI/1.14.3 (Ushinoya) FLIM/1.14.3 (=?ISO-8859-4?Q?Unebigory?= =?ISO-8859-4?Q?=F2mae?=) Emacs/21.2 (i686-pc-linux-gnu) MULE/5.0 (SAKAKI)
MIME-Version: 1.0 (generated by SEMI 1.14.3 - "Ushinoya")
Content-Type: text/plain; charset=US-ASCII
Sender: owner-ietf-imaa@mail.imc.org
Precedence: bulk
List-Archive:
List-Unsubscribe:
List-ID:

   Date: Wed, 26 Feb 2003 15:54:37 -0800
   From: "Dan Kohn"

   [...] work has been done.
Since there will always be some non-IMAX capable MTAs and MUAs, IMAA will always have to be around, so everyone of those IMAX MUAs and MTAs will still need to implement punycode. Or they can bounce the message, which is what Paul was referring to as a non-starter. Simon, I've seen this movie before, and I know how it ends. The ACE wins. Dan, I think you're being a bit disingenuous here. Are you against the 8BITMIME extension? Do you think it is a failure? Was it a mistake? Larry From owner-ietf-imaa Wed Feb 26 20:37:33 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1R4bXd15688 for ietf-imaa-bks; Wed, 26 Feb 2003 20:37:33 -0800 (PST) Received: from exchange.ad.skymv.com (66-120-210-136.ded.pacbell.net [66.120.210.136]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1R4bXY15684 for ; Wed, 26 Feb 2003 20:37:33 -0800 (PST) Received: from exchange.ad.skymv.com ([192.168.1.71]) by exchange.ad.skymv.com with Microsoft SMTPSVC(5.0.2195.5329); Wed, 26 Feb 2003 20:37:17 -0800 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1106 content-class: urn:content-classes:message Subject: RE: Problems of Internationalized Mail Address eXtensions (IMAX) MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Date: Wed, 26 Feb 2003 20:37:17 -0800 Message-ID: <138AA78F80DCE84B8EE424399FFBF9C9195D3E@exchange.ad.skymv.com> X-MS-Has-Attach: X-MS-TNEF-Correlator: Importance: normal Thread-Topic: Problems of Internationalized Mail Address eXtensions (IMAX) Thread-Index: AcLeEyAzauSJ+27KTuWPLN5YbF4TLAAAYa3g From: "Dan Kohn" To: "Lawrence Greenfield" Cc: X-OriginalArrivalTime: 27 Feb 2003 04:37:17.0687 (UTC) FILETIME=[E88C9070:01C2DE19] X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). 
The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by above.proper.com id h1R4bXY15685 Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Lawrence Greenfield wrote: > I think you're being a bit disingenuous here. Are you against the > 8BITMIME extension? Do you think it is a failure? Was it a mistake? It's a fair criticism, based on the fact that I just proposed a new CTE that requires 8BitMIME on email. The difference, I believe, is that 8BitMIME provides a 33% bandwidth reduction when it can be used end-to-end, at the cost of requiring base64 transformations when encountering a non-compliant MTA or MUA. By contrast, I don't think IMAX offers any bandwidth, complexity, or usability advantages, while still requiring a lot of additional complexity in implementation. In that way, as I said in the original message, I believe it is much more analogous to the (since withdrawn) proposal to implement IDNs using EDNS and UTF-8. One could certainly design something that would work, but it would require servers to be upgraded, would offer no more functionality than IDNA, and would increase complexity. Why bother? 
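To make the 8BITMIME trade-off above concrete, here is a minimal sketch of the fallback decision: send the body as-is when the next hop advertises 8BITMIME in its EHLO reply, otherwise fall back to base64. It is an illustration under assumptions, not how any particular MTA does it. The function names and the sample EHLO replies are invented, and a real client would also have to add BODY=8BITMIME to MAIL FROM and set the Content-Transfer-Encoding header itself.

```python
import base64


def advertises(ehlo_reply, keyword):
    """True if a multi-line EHLO reply lists the given extension keyword."""
    for line in ehlo_reply.splitlines()[1:]:   # first line is the greeting
        fields = line[4:].split()              # drop the "250-" or "250 " prefix
        if fields and fields[0].upper() == keyword.upper():
            return True
    return False


def prepare_body(body, ehlo_reply):
    """Return (content_transfer_encoding, bytes_to_send) for an 8-bit body."""
    if advertises(ehlo_reply, "8BITMIME"):
        return "8bit", body
    # Non-compliant next hop: every 3 octets become 4 ASCII characters,
    # plus a line break after each 76 characters.
    return "base64", base64.encodebytes(body)


if __name__ == "__main__":
    modern = "250-mx.example Hello\r\n250-8BITMIME\r\n250 SIZE 10240000"
    legacy = "250-mx.example Hello\r\n250 SIZE 10240000"
    body = "Gr\u00fc\u00dfe aus Z\u00fcrich\r\n".encode("utf-8") * 100
    for reply in (modern, legacy):
        cte, wire = prepare_body(body, reply)
        print(cte, len(body), len(wire))
```

With base64, the encoded body is about a third larger than the original, which is the overhead an end-to-end 8-bit path avoids.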
- dan -- Dan Kohn From owner-ietf-imaa Wed Feb 26 21:11:37 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1R5BbM16232 for ietf-imaa-bks; Wed, 26 Feb 2003 21:11:37 -0800 (PST) Received: from stoneport.math.uic.edu (stoneport.math.uic.edu [131.193.178.160]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1R5BaY16227 for ; Wed, 26 Feb 2003 21:11:36 -0800 (PST) Received: (qmail 30122 invoked by uid 1016); 27 Feb 2003 05:12:07 -0000 Date: 27 Feb 2003 05:12:07 -0000 Message-ID: <20030227051207.30121.qmail@cr.yp.to> Automatic-Legal-Notices: See http://cr.yp.to/mailcopyright.html. From: "D. J. Bernstein" To: ietf-imaa@imc.org Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) References: <138AA78F80DCE84B8EE424399FFBF9C904FAED@exchange.ad.skymv.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Dan Kohn writes: > standardizing true UTF-8 support for DNS (probably using EDNS) You don't know what you're talking about. The DNS protocol is already 8-bit clean. DNS servers and caches can already handle 8-bit data. > it's a huge amount of implementation work for no meaningful gain (since > IDNA code paths will still have to be supported forever) If IDNA were a complete solution, if the massive costs of implementing and deploying that solution had already been incurred, and if Internet software development came to a complete halt so that we didn't have to worry about costs imposed on future implementors, then I'd agree. But IDNA isn't a complete solution; only a tiny fraction of the IDNA costs have been incurred; and, most importantly, Internet development shows no signs of coming to a halt. All the short-term upgrade costs that we're considering, no matter how huge they might seem, are tiny compared to the long-term costs of the character-set mess. 
Do you want implementors in ten years, or twenty years, or fifty years, to be continuing to worry about conversions from one character encoding to another? I want them to be spending the same time providing _new_ features for the users. ---D. J. Bernstein, Associate Professor, Department of Mathematics, Statistics, and Computer Science, University of Illinois at Chicago From owner-ietf-imaa Wed Feb 26 23:28:19 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1R7SJB25625 for ietf-imaa-bks; Wed, 26 Feb 2003 23:28:19 -0800 (PST) Received: from pie1.i-dns.net ([203.81.44.31]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1R7SHY25621 for ; Wed, 26 Feb 2003 23:28:17 -0800 (PST) Received: from localhost (localhost [127.0.0.1]) by pie1.i-dns.net (Postfix) with ESMTP id C4BBC789E0; Thu, 27 Feb 2003 07:28:13 +0000 (GMT) Received: from pie1.i-dns.net ([127.0.0.1]) by localhost (pie1.i-dns.net [127.0.0.1:10024]) (amavisd-new) with SMTP id 05224-04; Thu, 27 Feb 2003 07:28:09 +0000 (GMT) Received: from jeffreyibm (unknown [137.132.137.37]) by pie1.i-dns.net (Postfix) with SMTP id 504AB78ABE; Thu, 27 Feb 2003 07:28:08 +0000 (GMT) Message-ID: <00bc01c2de32$12cfbb10$25898489@jeffreyibm> From: "Jeffrey J Zahari" To: "Dan Kohn" , "Lawrence Greenfield" Cc: References: <138AA78F80DCE84B8EE424399FFBF9C9195D3E@exchange.ad.skymv.com> Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) Date: Thu, 27 Feb 2003 15:30:14 +0800 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2800.1106 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1106 X-Virus-Scanned: by amavisd-new Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Actually dan, though it has been brought up before that 8 bit DNS is frowned upon, as of bind 9, anyone could have already set up a working 
DNS/mail infrastructure using bind 9 and qmail using UTF-8. However, IDNA was chosen because of better compression compared to UTF-8 and, most importantly, because the use of this "shim" meant that all legacy implementations would not have to be upgraded. So, while ACE will probably be around for an extremely long time to come, it may be a bit premature to say that investigations into divesting the 7 bit legacy are too bothersome to be attempted. jeffrey j zahari ----- Original Message ----- From: "Dan Kohn" To: "Lawrence Greenfield" Cc: Sent: Thursday, February 27, 2003 12:37 PM Subject: RE: Problems of Internationalized Mail Address eXtensions (IMAX) > > By contrast, I don't think IMAX offers any bandwidth, complexity, or > usability advantages, while still requiring a lot of additional > complexity in implementation. In that way, as I said in the original > message, I believe it is much more analogous to the (since withdrawn) > proposal to implement IDNs using EDNS and UTF-8. One could certainly > design something that would work, but it would require servers to be > upgraded, would offer no more functionality than IDNA, and would > increase complexity. Why bother? 
> > - dan > -- > Dan Kohn > > From owner-ietf-imaa Wed Feb 26 23:58:50 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1R7woF01390 for ietf-imaa-bks; Wed, 26 Feb 2003 23:58:50 -0800 (PST) Received: from pie1.i-dns.net ([203.81.44.31]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1R7wmY01374; Wed, 26 Feb 2003 23:58:48 -0800 (PST) Received: from localhost (localhost [127.0.0.1]) by pie1.i-dns.net (Postfix) with ESMTP id 7DC4C789E0; Thu, 27 Feb 2003 07:58:47 +0000 (GMT) Received: from pie1.i-dns.net ([127.0.0.1]) by localhost (pie1.i-dns.net [127.0.0.1:10024]) (amavisd-new) with SMTP id 05538-10; Thu, 27 Feb 2003 07:58:45 +0000 (GMT) Received: from jeffreyibm (unknown [137.132.137.37]) by pie1.i-dns.net (Postfix) with SMTP id 52A2778A2E; Thu, 27 Feb 2003 07:58:44 +0000 (GMT) Message-ID: <00de01c2de36$58d94730$25898489@jeffreyibm> From: "Jeffrey J Zahari" To: "Paul Hoffman / IMC" , "Simon Josefsson" Cc: References: <4.2.0.58.J.20030215182950.03351550@localhost><15955.58113.661260.889533@moriarty.gnomon.org.uk><006601c2d891$a29fbbb0$3800a8c0@jeffreyibm><4.2.0.58.J.20030220132811.056277a0@localhost><036f01c2d922$d1b59910$0f01a8c0@neteka.inc><097e01c2dbc0$0f87ff90$b03080dc@JSENGTOSHIBA><025d01c2dbd3$2aa36950$fb5016d3@neteka.inc> Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) Date: Thu, 27 Feb 2003 16:00:51 +0800 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2800.1106 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1106 X-Virus-Scanned: by amavisd-new Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: ----- Original Message ----- From: "Simon Josefsson" To: "Paul Hoffman / IMC" Cc: Sent: Thursday, February 27, 2003 7:10 AM Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) > > Paul Hoffman / IMC writes: > > 
> At 12:38 PM +0100 2/26/03, Simon Josefsson wrote: > >>My question was sincere. IMAX appears to be a solution for > >>internationalization of MTAs, at the SMTP layer. It does not propose > >>solving the internationalization problem for MUAs. > > > > Yes, it does. It shows exactly how an MUA should display ACE names. > > * IMAA requires the I18NMTA to implement punycode. IMAX doesn't > (assuming my suggested clarification about treating RHS as a IDNA > aware domain name slot is adopted). > > * You claim that under the IMAA design it is reasonable to implement > separate applications for viewing log files and editing > configuration files in the I18NMTA. IMAX doesn't require this as it > uses the system's native character set. > > * IMAA requires the I18NMTA to support Unicode. While Unicode is a > good thing, it can be difficult to implement in existing systems. > It is potentially disruptive to the Internet Mail system, using your > terminology. My idea of using a IMAX solution without fallback do > not require this. No, I haven't described this idea in an Internet > Draft, so you don't have to challenge the proposition, but I'd > appreciate if you did. > I actually have some trouble understanding this. IMAA would not require I18NMTA. It uses ACE straight off the bat, and the implementation would require updates to the MUA only and not the MTA. The only problem anyone would face would be to ensure that the input of the names into whatever mapping tables/control files are of the correct ACE format. > You are welcome to add propositions that are to IMAA's advantage. > > > It allows other entities in the Internet Mail system to easily use > > the internationalized email addresses without having to know > > anything about multiple charsets and repertoires. > > That isn't true. Not all systems are using Unicode, but IMAA requires > that they implement Unicode. Clearly that is forcing them to know > about multiple charsets. 
> Because IMAA is done right at the start of the email process, before the first SMTP MUA-MTA transaction, the addresses are already in ACE. Once again the only problem anyone would face would be to ensure that input of the names into whatever mapping tables/control files are of the correct ACE format. The only other reason why this could be a problem would be if an implementation required these files to be in something like UTF-8 or other similar formats. jeffrey j zahari From owner-ietf-imaa Thu Feb 27 02:22:23 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1RAMNq21192 for ietf-imaa-bks; Thu, 27 Feb 2003 02:22:23 -0800 (PST) Received: from stoneport.math.uic.edu (stoneport.math.uic.edu [131.193.178.160]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1RAMMY21188 for ; Thu, 27 Feb 2003 02:22:22 -0800 (PST) Received: (qmail 1377 invoked by uid 1016); 27 Feb 2003 10:22:49 -0000 Date: 27 Feb 2003 10:22:49 -0000 Message-ID: <20030227102249.1376.qmail@cr.yp.to> Automatic-Legal-Notices: See http://cr.yp.to/mailcopyright.html. From: "D. J. Bernstein" To: ietf-imaa@imc.org Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) References: <138AA78F80DCE84B8EE424399FFBF9C9195D3E@exchange.ad.skymv.com> <00bc01c2de32$12cfbb10$25898489@jeffreyibm> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Jeffrey J Zahari writes: > anyone could have already set up a working DNS/mail > infrastructure using bind 9 and qmail using UTF-8 Here you're talking about a massive change to the Internet as if it were some trivial overnight task. > However, IDNA was chosen Here you're talking about even more massive change to the Internet as if it were already done. > better compression comparitive to UTF-8 Here you're wildly exaggerating the importance of a ludicrously small issue. 
> all legacy implementations would not have to be upgraded Here you're making the content-free observation that we don't have to do anything if we don't want to accomplish anything. Are you an implementor? Do you speak any languages other than English? What are your goals here? ---D. J. Bernstein, Associate Professor, Department of Mathematics, Statistics, and Computer Science, University of Illinois at Chicago From owner-ietf-imaa Thu Feb 27 08:12:48 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1RGCmG17647 for ietf-imaa-bks; Thu, 27 Feb 2003 08:12:48 -0800 (PST) Received: from relay-3v.club-internet.fr (relay-3v.club-internet.fr [194.158.96.114]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1RGCkY17643 for ; Thu, 27 Feb 2003 08:12:46 -0800 (PST) Received: from mine.club-internet.fr (f09m-1-5.d1.club-internet.fr [213.44.216.5]) by relay-3v.club-internet.fr (Postfix) with ESMTP id BD88C16DB; Thu, 27 Feb 2003 17:11:36 +0100 (CET) Message-Id: <5.2.0.9.0.20030227151554.02fd8b50@mail.club-internet.fr> X-Sender: jefsey@mail.club-internet.fr X-Mailer: QUALCOMM Windows Eudora Version 5.2.0.9 Date: Thu, 27 Feb 2003 15:18:02 +0100 To: "D. J. Bernstein" , ietf-imaa@imc.org From: "J-F C. (Jefsey) Morfin" Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) In-Reply-To: <20030227102249.1376.qmail@cr.yp.to> References: <138AA78F80DCE84B8EE424399FFBF9C9195D3E@exchange.ad.skymv.com> <00bc01c2de32$12cfbb10$25898489@jeffreyibm> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Has someone documented: - the real obsolescence time of the obsolete solutions one is to support (vs OS version, hardware)? - possible strategies to force updates through partial compatibility, versioning? - by when the massive changes must have occurred (technical necessity due to complexity, for example)? 
All the products have version incompatibilities. I certainly understand that stability must be protected, but aren't we favoring the past over the future? I mean, is it compatibility with the last 20 years (ASCII only for everyone, and 600M users) or compatibility with the 1000 years to come (everyone's vernacular supported, and 6B users) that is to prevail? At 11:22 27/02/03, D. J. Bernstein wrote: >Jeffrey J Zahari writes: > > anyone could have already set up a working DNS/mail > > infrastructure using bind 9 and qmail using UTF-8 > >Here you're talking about a massive change to the Internet as if it were >some trivial overnight task. > > > However, IDNA was chosen >Here you're talking about even more massive change to the Internet as if >it were already done. The key deployment issue (ITLDs) is not supported. Thank you. jfc From owner-ietf-imaa Thu Feb 27 08:34:44 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1RGYik18587 for ietf-imaa-bks; Thu, 27 Feb 2003 08:34:44 -0800 (PST) Received: from tux.w3.org (IDENT:root@tux.w3.org [18.29.0.27]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1RGYgY18575; Thu, 27 Feb 2003 08:34:43 -0800 (PST) Received: from enoshima (IDENT:root@tux.w3.org [18.29.0.27]) by tux.w3.org (8.9.3/8.9.3) with ESMTP id LAA12909; Thu, 27 Feb 2003 11:34:39 -0500 Message-Id: <4.2.0.58.J.20030227105805.05962008@localhost> X-Sender: duerst@localhost X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J Date: Thu, 27 Feb 2003 10:59:45 -0500 To: "Edmon Chung" , "Mark Davis" , , "Paul Hoffman / IMC" From: Martin Duerst Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) In-Reply-To: <06d601c2ddf0$36d08540$fb5016d3@neteka.inc> References: <4.2.0.58.J.20030215182950.03351550@localhost> <4.2.0.58.J.20030216101703.05134de8@localhost> <4.2.0.58.J.20030219091733.054f4718@localhost> <15955.58113.661260.889533@moriarty.gnomon.org.uk> <006601c2d891$a29fbbb0$3800a8c0@jeffreyibm> 
<4.2.0.58.J.20030220132811.056277a0@localhost> <036f01c2d922$d1b59910$0f01a8c0@neteka.inc> <097e01c2dbc0$0f87ff90$b03080dc@JSENGTOSHIBA> <025d01c2dbd3$2aa36950$fb5016d3@neteka.inc> <014d01c2dd26$afad2150$7900a8c0@DAVIS1> <4.2.0.58.J.20030226125526.02dafed8@localhost> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 18:38 03/02/26 -0500, Edmon Chung wrote: >So you think its better to mandate some other encoding rather than UTF8? >Edmon No, what I mean is that nothing else than UTF-8 should be used, except for something like punycode for downgrading. So nothing like Big5, no negotiation on charsets, and so on. Regards, Martin. >----- Original Message ----- >From: "Martin Duerst" >To: "Edmon Chung" ; "Mark Davis" ; >; "Paul Hoffman / IMC" >Sent: Wednesday, February 26, 2003 12:59 PM >Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) > > > > > > At 21:20 03/02/25 -0500, Edmon Chung wrote: > > > > >Hi Mark, > > > > > >I think you are right. That is why in the IMAX description, UTF8 is > > >mandated. The thinking is similar to XML among other things. > > > > There is a huge difference between headers (where having > > a single encoding is very important, because the tagging > > overhead is high, there are conversion problems,...) and > > bodies. Parallels to XML work for other body formats, but > > are really not adequate for headers. > > > > Also, XML is already 5 years old. Something new should look > > ahead, and not try to carry too much unnecessary old stuff. > > > > > > Regards, Martin. 
> > From owner-ietf-imaa Thu Feb 27 08:34:44 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1RGYi618585 for ietf-imaa-bks; Thu, 27 Feb 2003 08:34:44 -0800 (PST) Received: from tux.w3.org (IDENT:root@tux.w3.org [18.29.0.27]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1RGYgY18574 for ; Thu, 27 Feb 2003 08:34:42 -0800 (PST) Received: from enoshima (IDENT:root@tux.w3.org [18.29.0.27]) by tux.w3.org (8.9.3/8.9.3) with ESMTP id LAA12915; Thu, 27 Feb 2003 11:34:39 -0500 Message-Id: <4.2.0.58.J.20030227110506.03318158@localhost> X-Sender: duerst@localhost X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J Date: Thu, 27 Feb 2003 11:32:02 -0500 To: "Dan Kohn" , "Simon Josefsson" From: Martin Duerst Subject: RE: Problems of Internationalized Mail Address eXtensions (IMAX) Cc: In-Reply-To: <138AA78F80DCE84B8EE424399FFBF9C904FAED@exchange.ad.skymv.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Hello Dan, At 15:54 03/02/26 -0800, Dan Kohn wrote: >Simon, I feel that Paul is showing a Sisyphean level of patience here, >but I know it can't continue. I believe you understand that very >similar issues were hashed out, and an ASCII Compatible Encoding (ACE) >solution was adopted in IDNA. The issues were in some respects very similar, and in others they are probably quite different. I know that Paul may easily get tired because he went through similar argumentation before, and probably also because one of the purposes of working on IDN was to get some solution for IMAs. But some people (such as John Klensin) think that there are important differences between IDN and IMA. So closing the discussion early just because one is tired from a previous discussion doesn't seem adequate. 
>The exact same logic holds for IMAA, and is why an IMAX ESMTP extension >simply adds no meaningful value over IMAA for the only folks who care >about i18n, which is the end-users. End-users should be taken quite widely here. Somebody who is trying to write a new and interesting spam filter may be an end user. Email isn't just only passed around by MTAs and then read by people using MUAs, there are all kinds of other ways in which it is processed. >The whole point of IMAA, as I'm sure you know, is that we don't want >i18n support in MTAs. Why bother? It's a huge amount of work, and any >mail admin who really cares about the LHS can still treat it as an >opaque string (as the standard says it is). Other than the specific >issue of sub-addressing, the whole concept of an I18NMTA is a huge >amount of work for no value. Let's look at it. Making MTAs work with 8-bit headers in many ways is actually rather trivial. Doing negotiation isn't exactly trivial, but it is done already for 8BITMIME, and probably for other extensions (which is a huge difference from the IDN situation, as John noted). The main problem isn't the amount of work it takes in each instance, it is that due to various historical accidents, we are currently in a situation that is significantly suboptimal for everybody. We can either say "we are deep in this mess, let's dig deeper", or we can say "let's think about how we might dig into a direction where we might get out of this mess". > > If you believe IMAX would make Internet mail unreliable, please > > explain why. > >Because most MTAs don't support IMAX, and so every IMAX-capable MUA and >MTA would always have to be downgrading to IMAA (or have to bounce the >message), in which case no value has been added, but a lot of addition >work has been done. Since there will always be some non-IMAX capable >MTAs and MUAs, IMAA will always have to be around, so everyone of those >IMAX MUAs and MTAs will still need to implement punycode. 
This situation seems to be quite similar to 8BITMIME. Still, 8BITMIME is used a lot. >Simon, I've seen this movie before, and I know how it ends. The ACE >wins. So you are concluding from a sample of 1? >If you want to go forward with IMAX, you may even succeed in getting it >published as Experimental, though I doubt it. But no one will implement >it, since it adds lots of effort but no value over IMAA, so why bother? By your argumentation, 8BITMIME would have become experimental, and nobody would have implemented it. Why do you think reality as we see it today is different? Regards, Martin. From owner-ietf-imaa Thu Feb 27 08:58:50 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1RGwoj19831 for ietf-imaa-bks; Thu, 27 Feb 2003 08:58:50 -0800 (PST) Received: from tux.w3.org (IDENT:root@tux.w3.org [18.29.0.27]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1RGwnY19826 for ; Thu, 27 Feb 2003 08:58:49 -0800 (PST) Received: from enoshima (IDENT:root@tux.w3.org [18.29.0.27]) by tux.w3.org (8.9.3/8.9.3) with ESMTP id LAA23494 for ; Thu, 27 Feb 2003 11:58:50 -0500 Message-Id: <4.2.0.58.J.20030227114053.04f977f0@localhost> X-Sender: duerst@localhost X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J Date: Thu, 27 Feb 2003 11:42:07 -0500 To: ietf-imaa@imc.org From: Martin Duerst Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) In-Reply-To: <20030227051207.30121.qmail@cr.yp.to> References: <138AA78F80DCE84B8EE424399FFBF9C904FAED@exchange.ad.skymv.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: The long-term cost argument is very well put below. Regards, Martin. At 05:12 03/02/27 +0000, D. J. 
Bernstein wrote: >Dan Kohn writes: > > it's a huge amount of implementation work for no meaningful gain (since > > IDNA code paths will still have to be supported forever) > >If IDNA were a complete solution, if the massive costs of implementing >and deploying that solution had already been incurred, and if Internet >software development came to a complete halt so that we didn't have to >worry about costs imposed on future implementors, then I'd agree. > >But IDNA isn't a complete solution; only a tiny fraction of the IDNA >costs have been incurred; and, most importantly, Internet development >shows no signs of coming to a halt. > >All the short-term upgrade costs that we're considering, no matter how >huge they might seem, are tiny compared to the long-term costs of the >character-set mess. Do you want implementors in ten years, or twenty >years, or fifty years, to be continuing to worry about conversions from >one character encoding to another? I want them to be spending the same >time providing _new_ features for the users. > >---D. J. 
Bernstein, Associate Professor, Department of Mathematics, >Statistics, and Computer Science, University of Illinois at Chicago From owner-ietf-imaa Thu Feb 27 09:00:28 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1RH0SY19899 for ietf-imaa-bks; Thu, 27 Feb 2003 09:00:28 -0800 (PST) Received: from neteka.com (www.namesbeyond.com [216.220.34.103]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1RH0RY19893 for ; Thu, 27 Feb 2003 09:00:27 -0800 (PST) Message-ID: <082001c2de81$7365bcf0$fb5016d3@neteka.inc> From: "Edmon Chung" To: "Mark Davis" , , "Paul Hoffman / IMC" , "Martin Duerst" References: <4.2.0.58.J.20030215182950.03351550@localhost> <4.2.0.58.J.20030219091733.054f4718@localhost> <15955.58113.661260.889533@moriarty.gnomon.org.uk> <006601c2d891$a29fbbb0$3800a8c0@jeffreyibm> <4.2.0.58.J.20030220132811.056277a0@localhost> <036f01c2d922$d1b59910$0f01a8c0@neteka.inc> <097e01c2dbc0$0f87ff90$b03080dc@JSENGTOSHIBA> <025d01c2dbd3$2aa36950$fb5016d3@neteka.inc> <014d01c2dd26$afad2150$7900a8c0@DAVIS1> <4.2.0.58.J.20030226125526.02dafed8@localhost> <4.2.0.58.J.20030227105805.05962008@localhost> Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) Date: Thu, 27 Feb 2003 11:58:20 -0500 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2600.0000 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2600.0000 Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: I see. The reason support for other charset seems to make sense is that I can see that a lot of times IMA would be used within a same region, say china MTA to china MTA, which means that by simply using GB would make it easier in most cases. It is therefore going to be true that in most cases using the local encoding will be a much more efficient transport than UTF8. 
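To put a rough number on the efficiency point above, here is a minimal sketch (not part of the IMAX proposal) that compares the encoded size of a short Chinese string in GB2312, UTF-8, and Punycode using Python's built-in codecs. The sample string is an assumption chosen only for illustration.

```python
def encoded_sizes(text: str) -> dict:
    """Return the byte length of `text` in several encodings."""
    return {
        "gb2312": len(text.encode("gb2312")),      # 2 bytes per Chinese character
        "utf-8": len(text.encode("utf-8")),        # 3 bytes per character in this range
        "punycode": len(text.encode("punycode")),  # ASCII-only form, length varies
    }

# Hypothetical two-character local part, used only as sample input.
sample = "中文"
sizes = encoded_sizes(sample)

for name, size in sizes.items():
    print(f"{name}: {size} bytes")
```

For an address-sized string the gap between GB2312 and UTF-8 is a handful of bytes, which is the scale on which the efficiency argument has to be weighed against the cost of supporting several charsets.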
Because the IMAX-capable MTA should really announce which encodings it supports, in a real transaction there wouldn't really be negotiation. Perhaps I should change it so that if there is no announcement of additional charset support, then the MUA MUST use UTF8 to start with and avoid having to negotiate further. Would that be better? Edmon ----- Original Message ----- From: "Martin Duerst" To: "Edmon Chung" ; "Mark Davis" ; ; "Paul Hoffman / IMC" Sent: Thursday, February 27, 2003 10:59 AM Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) > At 18:38 03/02/26 -0500, Edmon Chung wrote: > >So you think its better to mandate some other encoding rather than UTF8? > >Edmon > > No, what I mean is that nothing else than UTF-8 should be used, > except for something like punycode for downgrading. So nothing > like Big5, no negotiation on charsets, and so on. > > Regards, Martin. > > > > > > >----- Original Message ----- > >From: "Martin Duerst" > >To: "Edmon Chung" ; "Mark Davis" ; > >; "Paul Hoffman / IMC" > >Sent: Wednesday, February 26, 2003 12:59 PM > >Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) > > > > > > > > > > At 21:20 03/02/25 -0500, Edmon Chung wrote: > > > > > > >Hi Mark, > > > > > > > >I think you are right. That is why in the IMAX description, UTF8 is > > > >mandated. The thinking is similar to XML among other things. > > > > > > There is a huge difference between headers (where having > > > a single encoding is very important, because the tagging > > > overhead is high, there are conversion problems,...) and > > > bodies. Parallels to XML work for other body formats, but > > > are really not adequate for headers. > > > > > > Also, XML is already 5 years old. Something new should look > > > ahead, and not try to carry too much unnecessary old stuff. > > > > > > > > > Regards, Martin. 
> > > > > From owner-ietf-imaa Thu Feb 27 08:58:51 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1RGwpD19836 for ietf-imaa-bks; Thu, 27 Feb 2003 08:58:51 -0800 (PST) Received: from tux.w3.org (IDENT:root@tux.w3.org [18.29.0.27]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1RGwoY19832 for ; Thu, 27 Feb 2003 08:58:50 -0800 (PST) Received: from enoshima (IDENT:root@tux.w3.org [18.29.0.27]) by tux.w3.org (8.9.3/8.9.3) with ESMTP id LAA23498; Thu, 27 Feb 2003 11:58:50 -0500 Message-Id: <4.2.0.58.J.20030227114254.04fda2b8@localhost> X-Sender: duerst@localhost X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J Date: Thu, 27 Feb 2003 11:58:43 -0500 To: "Dan Kohn" , "Lawrence Greenfield" From: Martin Duerst Subject: RE: Problems of Internationalized Mail Address eXtensions (IMAX) Cc: In-Reply-To: <138AA78F80DCE84B8EE424399FFBF9C9195D3E@exchange.ad.skymv.c om> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 20:37 03/02/26 -0800, Dan Kohn wrote: >Lawrence Greenfield wrote: > > > I think you're being a bit disingenuous here. Are you against the > > 8BITMIME extension? Do you think it is a failure? Was it a mistake? > >It's a fair criticism, based on the fact that I just proposed a new CTE >that requires 8BitMIME on email. The difference, I believe, is that >8BitMIME provides a 33% bandwidth reduction when it can be used >end-to-end, at the cost of requiring base64 transformations when >encountering a non-compliant MTA or MUA. > >By contrast, I don't think IMAX offers any bandwidth, agreed that this is negligible, and irrelevant. >complexity, or usability advantages, Disagreed. Let's say I were a procurer/sysadmin in Japan, and would have to procure MTAs and MUAs for a whole department. 
I would clearly buy a solution that lets me have a look at all the stuff that's going on without having hundreds of tools to make sure that I would see the right punycode decoding at the right place (rather than to have to stare at punycode anywhere). That would strongly reduce complexity and increase usability for me, and improve service for my user base. >while still requiring a lot of additional >complexity in implementation. In that way, as I said in the original >message, I believe it is much more analogous to the (since withdrawn) >proposal to implement IDNs using EDNS and UTF-8. One could certainly >design something that would work, but it would require servers to be >upgraded, would offer no more functionality than IDNA, and would >increase complexity. Why bother? Why think about the long-term future of the internet? Anyway, one of the problems in IDN was that there were so many different ways that something could be negotiated/distinguished, and none of them very well established. For SMTP, this is clearly different. Also, the DNS is so low-level that it is in fact possible to hide the ugliness of IDNA quite easily (look e.g. at idnkit, http://www.nic.ad.jp/ja/idn/mdnkit/download/#sources). It actually allows patching binary applications in some cases. This is quite different for IMAA; its ugliness will show in many more places. Regards, Martin. 
From owner-ietf-imaa Thu Feb 27 09:45:34 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1RHjYc22790 for ietf-imaa-bks; Thu, 27 Feb 2003 09:45:34 -0800 (PST) Received: from slarti.muc.de (slarti.muc.de [193.149.48.10]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1RHjWY22780 for ; Thu, 27 Feb 2003 09:45:32 -0800 (PST) Received: (qmail 20859 invoked by uid 66); 27 Feb 2003 17:45:30 -0000 Received: from faerber.muc.de by slarti.muc.de with BSMTP (rsmtp-qm-ot 0.4) for ietf-imaa@imc.org; 27 Feb 2003 17:45:30 -0000 Received: by faerber.muc.de (OpenXP/32 v3.9.4 (Win32) alpha @ 2003-02-27-1438d); 27 Feb 2003 18:45:08 +0000 Date: 27 Feb 2003 00:00:00 +0000 From: list-ietf-i18n-imaa@faerber.muc.de (=?ISO-8859-1?Q?Claus_F=E4rber?=) To: ietf-imaa@imc.org Message-ID: <8geEHyYJcDD@3247.org> In-Reply-To: <138AA78F80DCE84B8EE424399FFBF9C904FAED@exchange.ad.skymv.com> Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) User-Agent: OpenXP/32 v3.9.4 (Win32) alpha @ 2003-02-27-1438d MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Dan Kohn schrieb/wrote: > If you do need to implement Unicode and nameprep support on an MTA to > support sub-addressing (and that's not clear yet), then additionally > adding punycode is a minor step. You don't need nameprep just for subaddressing, only a Punycode decoder. The output of the Punycode decoder consists of normalised and nameprepped Unicode character sequences that can be compared directly. 
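The point above can be sketched as follows: a minimal illustration, assuming the sender already applied the preparation step before encoding, of decoding bare Punycode and then comparing or splitting the result. The function names, the "+" delimiter, and the sample strings are hypothetical, and the ACE prefix handling that IMAA would need is deliberately left out. Python's built-in "punycode" codec is used.

```python
def decode_ace(ace: str) -> str:
    """Decode a bare Punycode string (no ACE prefix) to Unicode."""
    return ace.encode("ascii").decode("punycode")

def split_subaddress(ace_local: str, delimiter: str = "+") -> tuple:
    """Decode a local part and split it at the first delimiter.

    Returns (user, detail); detail is empty when no delimiter is present.
    The '+' delimiter is an assumption, since sites choose their own.
    """
    decoded = decode_ace(ace_local)
    user, _, detail = decoded.partition(delimiter)
    return user, detail

# Well-known Punycode example: "bcher-kva" decodes to "bücher".
decoded = decode_ace("bcher-kva")

# Round-trip a hypothetical sub-addressed local part through the codec.
ace = "bücher+list".encode("punycode").decode("ascii")
user, detail = split_subaddress(ace)

# Decoded strings can be compared directly, with no further preparation here.
same_user = (user == decoded)
print(decoded, user, detail, same_user)
```

Nothing in this sketch validates the input; whether a receiving MTA can rely on the sender having prepared the string correctly is exactly the open question in the thread.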
Claus -- http://www.faerber.muc.de/ From owner-ietf-imaa Thu Feb 27 09:59:53 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1RHxre23983 for ietf-imaa-bks; Thu, 27 Feb 2003 09:59:53 -0800 (PST) Received: from neteka.com (www.namesbeyond.com [216.220.34.103]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1RHxqY23974 for ; Thu, 27 Feb 2003 09:59:52 -0800 (PST) Message-ID: <086401c2de89$ed907b70$fb5016d3@neteka.inc> From: "Edmon Chung" To: "Claus Färber" , References: <8geEHyYJcDD@3247.org> Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) Date: Thu, 27 Feb 2003 12:59:03 -0500 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 8bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2600.0000 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2600.0000 Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: I guess part of the concern is that nameprep might not be right for IMA. For example, some mail servers are case sensitive in the local part... and that should be perfectly fine. Edmon ----- Original Message ----- From: "Claus Färber" To: Sent: Wednesday, February 26, 2003 7:00 PM Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) > > Dan Kohn schrieb/wrote: > > If you do need to implement Unicode and nameprep support on an MTA to > > support sub-addressing (and that's not clear yet), then additionally > > adding punycode is a minor step. > > You don't need nameprep just for subaddressing, only a Punycode decoder. > The output of the Punycode decoder consists of normalised and > nameprepped Unicode character sequences that can be compared directly. 
> > Claus > -- > http://www.faerber.muc.de/ > From owner-ietf-imaa Thu Feb 27 10:16:01 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1RIG1C25156 for ietf-imaa-bks; Thu, 27 Feb 2003 10:16:01 -0800 (PST) Received: from [63.202.92.157] (adsl-63-202-92-157.dsl.snfc21.pacbell.net [63.202.92.157]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1RIFxY25148 for ; Thu, 27 Feb 2003 10:15:59 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: In-Reply-To: <082001c2de81$7365bcf0$fb5016d3@neteka.inc> References: <4.2.0.58.J.20030215182950.03351550@localhost> <4.2.0.58.J.20030219091733.054f4718@localhost> <15955.58113.661260.889533@moriarty.gnomon.org.uk> <006601c2d891$a29fbbb0$3800a8c0@jeffreyibm> <4.2.0.58.J.20030220132811.056277a0@localhost> <036f01c2d922$d1b59910$0f01a8c0@neteka.inc> <097e01c2dbc0$0f87ff90$b03080dc@JSENGTOSHIBA> <025d01c2dbd3$2aa36950$fb5016d3@neteka.inc> <014d01c2dd26$afad2150$7900a8c0@DAVIS1> <4.2.0.58.J.20030226125526.02dafed8@localhost> <4.2.0.58.J.20030227105805.05962008@localhost> <082001c2de81$7365bcf0$fb5016d3@neteka.inc> X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . Date: Thu, 27 Feb 2003 09:54:38 -0800 To: ietf-imaa@imc.org From: Paul Hoffman / IMC Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 11:58 AM -0500 2/27/03, Edmon Chung wrote: >I see. 
The reason support for other charset seems to make sense is that I >can see that a lot of times IMA would be used within a same region, say >china MTA to china MTA, which means that by simply using GB would make it >easier in most cases. And much more difficult and prone to errors in the others unless you standardize all of the charset mappings. (Which I assume you aren't proposing to do....) --Paul Hoffman, Director --Internet Mail Consortium From owner-ietf-imaa Thu Feb 27 11:20:29 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1RJKTt01298 for ietf-imaa-bks; Thu, 27 Feb 2003 11:20:29 -0800 (PST) Received: from tux.w3.org (IDENT:root@tux.w3.org [18.29.0.27]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1RJKSY01292; Thu, 27 Feb 2003 11:20:28 -0800 (PST) Received: from enoshima (IDENT:root@tux.w3.org [18.29.0.27]) by tux.w3.org (8.9.3/8.9.3) with ESMTP id OAA15272; Thu, 27 Feb 2003 14:20:29 -0500 Message-Id: <4.2.0.58.J.20030227133401.04d7cdb8@localhost> X-Sender: duerst@localhost X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J Date: Thu, 27 Feb 2003 13:37:21 -0500 To: Paul Hoffman / IMC , ietf-imaa@imc.org From: Martin Duerst Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) In-Reply-To: References: <082001c2de81$7365bcf0$fb5016d3@neteka.inc> <4.2.0.58.J.20030215182950.03351550@localhost> <4.2.0.58.J.20030219091733.054f4718@localhost> <15955.58113.661260.889533@moriarty.gnomon.org.uk> <006601c2d891$a29fbbb0$3800a8c0@jeffreyibm> <4.2.0.58.J.20030220132811.056277a0@localhost> <036f01c2d922$d1b59910$0f01a8c0@neteka.inc> <097e01c2dbc0$0f87ff90$b03080dc@JSENGTOSHIBA> <025d01c2dbd3$2aa36950$fb5016d3@neteka.inc> <014d01c2dd26$afad2150$7900a8c0@DAVIS1> <4.2.0.58.J.20030226125526.02dafed8@localhost> <4.2.0.58.J.20030227105805.05962008@localhost> <082001c2de81$7365bcf0$fb5016d3@neteka.inc> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Sender: owner-ietf-imaa@mail.imc.org 
Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 09:54 03/02/27 -0800, Paul Hoffman / IMC wrote: >At 11:58 AM -0500 2/27/03, Edmon Chung wrote: >>I see. The reason support for other charset seems to make sense is that I >>can see that a lot of times IMA would be used within a same region, say >>china MTA to china MTA, which means that by simply using GB would make it >>easier in most cases. > >And much more difficult and prone to errors in the others unless you >standardize all of the charset mappings. (Which I assume you aren't >proposing to do....) I agree with Paul that having a large variety of charsets is a big problem. But I don't think the problem of mappings is the main aspect of it. The mapping differences usually apply for characters that are not that much used in the specific encoding. The main problem with using legacy encodings is that it's not at all forward-looking. It's introducing more legacy instead of removing legacy. Regards, Martin. From owner-ietf-imaa Thu Feb 27 11:43:16 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1RJhGA02186 for ietf-imaa-bks; Thu, 27 Feb 2003 11:43:16 -0800 (PST) Received: from neteka.com (www.namesbeyond.com [216.220.34.103]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1RJhFY02182 for ; Thu, 27 Feb 2003 11:43:15 -0800 (PST) Message-ID: <08e801c2de98$6f90a1a0$fb5016d3@neteka.inc> From: "Edmon Chung" To: References: <4.2.0.58.J.20030215182950.03351550@localhost> <006601c2d891$a29fbbb0$3800a8c0@jeffreyibm> <4.2.0.58.J.20030220132811.056277a0@localhost> <036f01c2d922$d1b59910$0f01a8c0@neteka.inc> <097e01c2dbc0$0f87ff90$b03080dc@JSENGTOSHIBA> <025d01c2dbd3$2aa36950$fb5016d3@neteka.inc> <014d01c2dd26$afad2150$7900a8c0@DAVIS1> <4.2.0.58.J.20030226125526.02dafed8@localhost> <4.2.0.58.J.20030227105805.05962008@localhost> <082001c2de81$7365bcf0$fb5016d3@neteka.inc> <089101c2de8e$45513f30$fb5016d3@neteka.inc> Subject: Re: Problems of Internationalized Mail Address eXtensions 
(IMAX) Date: Thu, 27 Feb 2003 14:42:56 -0500 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2600.0000 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2600.0000 Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: I see. But why would someone convert to charset A to charset B? Except for from (or to) a particular local encoding to UTF8. Since charsets registered should be able to convert between ISO10646, there shouldnt be any problem. Edmon ----- Original Message ----- From: "Paul Hoffman / IMC" To: "Edmon Chung" Sent: Thursday, February 27, 2003 1:42 PM Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) > At 1:30 PM -0500 2/27/03, Edmon Chung wrote: > >I am not sure what you mean. > >The document did refer to using the charsets registered at IANA for MIME. > >Are the charsets registered there not a good set to work with? > > If you expect someone to convert from charset A to charset B > reliably, you have to standardize the mapping table. There are lots > of examples of mapping tables in use today that differ. Without that, > someone sending with one charset and expecting the mail to be put in > exactly the mailbox they mean might be in for a bad surprise. > > This was discussed on the IDN mailing list. 
> > --Paul Hoffman, Director > --Internet Mail Consortium > From owner-ietf-imaa Thu Feb 27 11:46:25 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1RJkP302356 for ietf-imaa-bks; Thu, 27 Feb 2003 11:46:25 -0800 (PST) Received: from neteka.com (www.namesbeyond.com [216.220.34.103]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1RJkOY02352 for ; Thu, 27 Feb 2003 11:46:24 -0800 (PST) Message-ID: <08f401c2de98$dd377670$fb5016d3@neteka.inc> From: "Edmon Chung" To: "Paul Hoffman / IMC" , References: <082001c2de81$7365bcf0$fb5016d3@neteka.inc> <15955.58113.661260.889533@moriarty.gnomon.org.uk> <006601c2d891$a29fbbb0$3800a8c0@jeffreyibm> <4.2.0.58.J.20030220132811.056277a0@localhost> <036f01c2d922$d1b59910$0f01a8c0@neteka.inc> <097e01c2dbc0$0f87ff90$b03080dc@JSENGTOSHIBA> <025d01c2dbd3$2aa36950$fb5016d3@neteka.inc> <014d01c2dd26$afad2150$7900a8c0@DAVIS1> <4.2.0.58.J.20030226125526.02dafed8@localhost> <4.2.0.58.J.20030227105805.05962008@localhost> <082001c2de81$7365bcf0$fb5016d3@neteka.inc> <4.2.0.58.J.20030227133401.04d7cdb8@localhost> Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) Date: Thu, 27 Feb 2003 14:45:58 -0500 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2600.0000 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2600.0000 Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: I dont agree with the forward looking part. The big reason as far as I believe that there should be a charset parameter is because of forward looking. Today we move from English only to multilingual... if the design in the first place allowed the charset parameter, we wouldnt have the problem at all... realizing that, adding the feature would allow better transition to whatever we might face in terms of charset in the future. Your thoughts on this? 
Edmon ----- Original Message ----- From: "Martin Duerst" To: "Paul Hoffman / IMC" ; Sent: Thursday, February 27, 2003 1:37 PM Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) > > At 09:54 03/02/27 -0800, Paul Hoffman / IMC wrote: > > >At 11:58 AM -0500 2/27/03, Edmon Chung wrote: > >>I see. The reason support for other charset seems to make sense is that I > >>can see that a lot of times IMA would be used within a same region, say > >>china MTA to china MTA, which means that by simply using GB would make it > >>easier in most cases. > > > >And much more difficult and prone to errors in the others unless you > >standardize all of the charset mappings. (Which I assume you aren't > >proposing to do....) > > I agree with Paul that having a large variety of charsets is a big problem. > But I don't think the problem of mappings is the main aspect of it. > The mapping differences usually apply for characters that are not that > much used in the specific encoding. > The main problem with using legacy encodings is that it's not at all > forward-looking. It's introducing more legacy instead of removing > legacy. > > Regards, Martin. 
> > From owner-ietf-imaa Thu Feb 27 11:49:04 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1RJn4d02445 for ietf-imaa-bks; Thu, 27 Feb 2003 11:49:04 -0800 (PST) Received: from tux.w3.org (IDENT:root@tux.w3.org [18.29.0.27]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1RJn3Y02438; Thu, 27 Feb 2003 11:49:03 -0800 (PST) Received: from enoshima (IDENT:root@tux.w3.org [18.29.0.27]) by tux.w3.org (8.9.3/8.9.3) with ESMTP id OAA28021; Thu, 27 Feb 2003 14:49:03 -0500 Message-Id: <4.2.0.58.J.20030227142957.027930c8@localhost> X-Sender: duerst@localhost X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J Date: Thu, 27 Feb 2003 14:47:53 -0500 To: "Edmon Chung" , "Mark Davis" , , "Paul Hoffman / IMC" From: Martin Duerst Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) In-Reply-To: <082001c2de81$7365bcf0$fb5016d3@neteka.inc> References: <4.2.0.58.J.20030215182950.03351550@localhost> <4.2.0.58.J.20030219091733.054f4718@localhost> <15955.58113.661260.889533@moriarty.gnomon.org.uk> <006601c2d891$a29fbbb0$3800a8c0@jeffreyibm> <4.2.0.58.J.20030220132811.056277a0@localhost> <036f01c2d922$d1b59910$0f01a8c0@neteka.inc> <097e01c2dbc0$0f87ff90$b03080dc@JSENGTOSHIBA> <025d01c2dbd3$2aa36950$fb5016d3@neteka.inc> <014d01c2dd26$afad2150$7900a8c0@DAVIS1> <4.2.0.58.J.20030226125526.02dafed8@localhost> <4.2.0.58.J.20030227105805.05962008@localhost> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Hello Edmon, At 11:58 03/02/27 -0500, Edmon Chung wrote: >I see. The reason support for other charset seems to make sense is that I >can see that a lot of times IMA would be used within a same region, say >china MTA to china MTA, which means that by simply using GB would make it >easier in most cases. 
It is therefore going to be true that in most cases
>using the local encoding will be a much more efficient transport than UTF8.

'easier' and 'efficient' are not very clear to me here. If we want IMAX
to be really interoperable, we have to have at least one encoding that
every IMAX-aware piece of software supports. The obvious choice for this
is UTF-8. [Even if punycode is the only encoding that we require every
IMAX-aware piece of software to support (for backwards compatibility),
the software has to support Unicode.]

So just doing things in UTF-8 looks easier to me than also using
all these legacy encodings.

Of course some people will say that doing punycode only is even easier.
The point is that with UTF-8, we are investing in our future, and
creating a clear upgrade path, whereas bringing in legacy encodings
is more like 'sidegrading'.

>Because the IMAX capable MTA should really announce which encoding it
>supports, in a real transaction, there wouldn't really be negotiation.
>Perhaps I should change it so that if there is no announcement of additional
>charset support then the MUA MUST use UTF8 to start with and avoid having to
>negotiate further. Would that be better?

That would be a step in the right direction. But I suggest going directly
to a model where UTF-8 and punycode (or whatever fallback we choose)
are the only choices.

Regards, Martin.
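
The two encodings weighed above can be compared side by side. What follows is only a minimal sketch, assuming Python's standard-library `utf-8` and `punycode` codecs; the sample local part `bücher` and the helper `is_ascii` are hypothetical, and the code shows raw Punycode rather than the wire format of either the IMAA or the IMAX draft.

```python
# Illustrative only: the sample string is hypothetical, and neither the
# IMAA nor the IMAX draft is implemented here.
local_part = "bücher"

# UTF-8 form: contains bytes >= 0x80, so it needs an 8-bit-clean path.
utf8_form = local_part.encode("utf-8")

# Punycode form: plain ASCII, so it survives legacy 7-bit software.
puny_form = local_part.encode("punycode")


def is_ascii(data: bytes) -> bool:
    """Return True if every byte is in the 7-bit ASCII range."""
    return all(b < 0x80 for b in data)


# Both forms decode back to the same Unicode string, which is why
# software handling either one still has to understand Unicode.
assert utf8_form.decode("utf-8") == local_part
assert puny_form.decode("punycode") == local_part

print("utf-8   :", utf8_form, "ascii-only:", is_ascii(utf8_form))
print("punycode:", puny_form, "ascii-only:", is_ascii(puny_form))
```

Either form maps back to one Unicode string, so supporting both requires only two well-defined conversions, whereas each additional legacy charset would add a mapping table of its own.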
From owner-ietf-imaa Thu Feb 27 11:59:46 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1RJxkp02775 for ietf-imaa-bks; Thu, 27 Feb 2003 11:59:46 -0800 (PST) Received: from neteka.com (www.namesbeyond.com [216.220.34.103]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1RJxjY02769 for ; Thu, 27 Feb 2003 11:59:45 -0800 (PST) Message-ID: <091f01c2de9a$bd20d2d0$fb5016d3@neteka.inc> From: "Edmon Chung" To: "Mark Davis" , , "Paul Hoffman / IMC" , "Martin Duerst" References: <4.2.0.58.J.20030215182950.03351550@localhost> <15955.58113.661260.889533@moriarty.gnomon.org.uk> <006601c2d891$a29fbbb0$3800a8c0@jeffreyibm> <4.2.0.58.J.20030220132811.056277a0@localhost> <036f01c2d922$d1b59910$0f01a8c0@neteka.inc> <097e01c2dbc0$0f87ff90$b03080dc@JSENGTOSHIBA> <025d01c2dbd3$2aa36950$fb5016d3@neteka.inc> <014d01c2dd26$afad2150$7900a8c0@DAVIS1> <4.2.0.58.J.20030226125526.02dafed8@localhost> <4.2.0.58.J.20030227105805.05962008@localhost> <4.2.0.58.J.20030227142957.027930c8@localhost> Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) Date: Thu, 27 Feb 2003 14:59:22 -0500 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2600.0000 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2600.0000 Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: I see. You probably havent seen my latest reply on the "Forward looking" part yet. However, I know where you are coming from. Perhaps we should introduce the "charset" parameter but limit it to only UTF8 for now therefore we will have the flexibility to add in the future... it will also distinguish between regular mailfrom and rcptto commands. Your thoughts? 
Edmon ----- Original Message ----- From: "Martin Duerst" To: "Edmon Chung" ; "Mark Davis" ; ; "Paul Hoffman / IMC" Sent: Thursday, February 27, 2003 2:47 PM Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) > > Hello Edmon, > > At 11:58 03/02/27 -0500, Edmon Chung wrote: > >I see. The reason support for other charset seems to make sense is that I > >can see that a lot of times IMA would be used within a same region, say > >china MTA to china MTA, which means that by simply using GB would make it > >easier in most cases. It is therefore going to be true that in most cases > >using the local encoding will be a much more efficient transport than UTF8. > > 'easier' and 'efficient' are not very clear to me here. If we want IMAX > to be really interoperable, we have to have at least one encoding that > every IMAX-aware piece of software supports. The obvious choice for this > is UTF-8. [Even if punycode is the only encoding that we require every > IMAX-aware piece of software to support (for backwards compatibility), > the software has to support Unicode.] > > So just doing things in UTF-8 looks easier to me than also using > all these legacy encodings. > > Of course some people will say that just doing punycode only is even easier. > The point is that with UTF-8, we are investing in our future, and > creating a clear upgrade path, whereas bringing in legacy encodings > is more like 'sidegrading'. > > > >Because the the IMAX capable MTA should really annouce which encoding it > >supports, in real transaction, there wouldn't really be negotiation. > >Perhaps I should change it so that if there is no annoucement of additional > >charset support then the MUA MUST use UTF8 to start with and avoid having to > >negotiate further. Would that be better? > > That would be a step in the right direction. But I suggest to directly > go to a model where UTF-8 and punycode (or whatever fallback we choose) > are the only choices. > > > Regards, Martin. 
> From owner-ietf-imaa Thu Feb 27 14:36:50 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1RMaoQ09629 for ietf-imaa-bks; Thu, 27 Feb 2003 14:36:50 -0800 (PST) Received: from max.kde.org (max.kde.org [134.2.170.93]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1RManY09623 for ; Thu, 27 Feb 2003 14:36:49 -0800 (PST) Received: from 192.168.0.3 (ppp36-239.hrz.uni-bielefeld.de [129.70.36.239]) by max.kde.org (Postfix) with ESMTP id 3544DB4383 for ; Thu, 27 Feb 2003 23:36:54 +0100 (CET) From: Marc Mutz Organization: "Old Europe" - and proud To: ietf-imaa@imc.org Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) Date: Thu, 27 Feb 2003 20:13:50 +0100 User-Agent: KMail/1.5.9 References: <138AA78F80DCE84B8EE424399FFBF9C9195D3E@exchange.ad.skymv.com> In-Reply-To: <138AA78F80DCE84B8EE424399FFBF9C9195D3E@exchange.ad.skymv.com> X-PGP-Key: 0xBDBFE838 MIME-Version: 1.0 Content-Type: multipart/signed; protocol="application/pgp-signature"; micalg=pgp-sha1; boundary="Boundary-02=_EOmX+iA97E/3Ahh"; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <200302272014.12984@sendmail.mutz.com> Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: --Boundary-02=_EOmX+iA97E/3Ahh Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Content-Description: signed data Content-Disposition: inline On Thursday 27 February 2003 05:37, Dan Kohn wrote: > The difference, I believe, is > that 8BitMIME provides a 33% bandwidth reduction when it can be used > end-to-end, at the cost of requiring base64 transformations when > encountering a non-compliant MTA or MUA. The big difference of 8bitmime vs. any IMA-enabling SMTP extension is=20 that the former has 8bit CTE as it's companion in rfc2822=20 serializations, while the latter has nothing like that. 
Also, 8BITMIME makes it easier for the MIA[1] (doesn't need to apply CTE=20 in certain situations anymore), while IMA-SMTP extensions that exist in=20 a universe of their own (ie. not backed by the same structure in=20 rfc2822 serializations) don't. If a MIA wants to use the IMA-SMTP=20 extension, then it doesn't _save_ a conversion (8bit->qp/b64 as is the=20 case in 8BITMIME), but needs to _add_ one (IMAA->IMAX). Why would a MIA=20 want to use the extension if it was more work? [1] MIA =3D Message Injection Agent =2D-=20 The illegal we do immediately. The unconstitutional takes a bit longer. -- Henry Kissinger --Boundary-02=_EOmX+iA97E/3Ahh Content-Type: application/pgp-signature Content-Description: signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.1 (GNU/Linux) iD8DBQA+XmOE3oWD+L2/6DgRArtvAKC71O4dylXrRLnDLd1LsquZCTjwwQCdHrv+ VPfn8zLlJaNYve3h+CNJdwc= =0FFc -----END PGP SIGNATURE----- --Boundary-02=_EOmX+iA97E/3Ahh-- From owner-ietf-imaa Thu Feb 27 14:35:38 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1RMZcR09590 for ietf-imaa-bks; Thu, 27 Feb 2003 14:35:38 -0800 (PST) Received: from exchange.ad.skymv.com (66-120-210-136.ded.pacbell.net [66.120.210.136]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1RMZbY09585 for ; Thu, 27 Feb 2003 14:35:37 -0800 (PST) Received: from exchange.ad.skymv.com ([192.168.1.71]) by exchange.ad.skymv.com with Microsoft SMTPSVC(5.0.2195.5329); Thu, 27 Feb 2003 14:35:21 -0800 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1106 content-class: urn:content-classes:message Subject: RE: Problems of Internationalized Mail Address eXtensions (IMAX) MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Date: Thu, 27 Feb 2003 14:35:20 -0800 Message-ID: <138AA78F80DCE84B8EE424399FFBF9C9133B25@exchange.ad.skymv.com> Importance: normal X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: Problems of Internationalized Mail Address eXtensions (IMAX) Thread-Index: 
AcLegXoOb/oj6qmmRxSvKBErCMkdygAJkDuA From: "Dan Kohn" To: X-OriginalArrivalTime: 27 Feb 2003 22:35:21.0062 (UTC) FILETIME=[82D85060:01C2DEB0] X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by above.proper.com id h1RMZcY09587 Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Martin Duerst wrote: > Disagreed. Let's say I were a procurer/sysadmin in Japan, > and would have to procure MTAs and MUAs for a whole department. > I would clearly buy a solution that lets me have a look at > all the stuff that's going on without having hundreds of > tools to make sure that I would see the right punycode > decoding at the right place (rather than to have to stare > at punycode anywhere). That would strongly reduce complexity > and increase usability for me, and improve service for my > user base. I believe sysadmins who care can use a punycode decoding tool (or Emacs macro) and that the vast majority won't care. But, as I said, if you really think there's a market for I18NMTAs, please go forward with a standard for it (maybe it will even get on the standards rather than the experimental track). But, every IMAX-capable MUA and MTA will need to implement IMAA for at least the next 50 years (IMHO), and I think most implementers will not bother with both. 
- dan -- Dan Kohn From owner-ietf-imaa Thu Feb 27 14:43:04 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1RMh4f09814 for ietf-imaa-bks; Thu, 27 Feb 2003 14:43:04 -0800 (PST) Received: from m3001.hostcentric.net (m3001.hostcentric.net [216.157.79.237]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1RMh2Y09809 for ; Thu, 27 Feb 2003 14:43:02 -0800 (PST) Received: (qmail 18451 invoked by alias); 27 Feb 2003 22:42:55 -0000 Received: from unknown (HELO DAVIS1) (12.234.231.250) by 0 with SMTP; 27 Feb 2003 22:42:55 -0000 Message-ID: <004201c2deb1$8a119160$7900a8c0@DAVIS1> From: "Mark Davis" To: "Edmon Chung" , References: <4.2.0.58.J.20030215182950.03351550@localhost> <4.2.0.58.J.20030220132811.056277a0@localhost> <036f01c2d922$d1b59910$0f01a8c0@neteka.inc> <097e01c2dbc0$0f87ff90$b03080dc@JSENGTOSHIBA> <025d01c2dbd3$2aa36950$fb5016d3@neteka.inc> <014d01c2dd26$afad2150$7900a8c0@DAVIS1> <4.2.0.58.J.20030226125526.02dafed8@localhost> <4.2.0.58.J.20030227105805.05962008@localhost> <082001c2de81$7365bcf0$fb5016d3@neteka.inc> <089101c2de8e$45513f30$fb5016d3@neteka.inc> <08e801c2de98$6f90a1a0$fb5016d3@neteka.inc> Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) Date: Thu, 27 Feb 2003 14:42:41 -0800 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2800.1106 X-Mimeole: Produced By Microsoft MimeOLE V6.00.2800.1106 Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Just because a charset is registered does not mean it is always interpreted the same way: in fact, quite the contrary. See http://oss.software.ibm.com/icu/charset/index.html for examples. 
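
One commonly cited instance of this divergence can be reproduced with a short sketch, assuming Python's standard-library `shift_jis` and `cp932` codecs as stand-ins for two converters that both treat their input as "Shift_JIS". The byte sequence is chosen for illustration and is not taken from the page referenced above.

```python
# Illustrative only: two decoders that are both used for data labelled
# "Shift_JIS" in practice, applied to the same two bytes.
raw = b"\x81\x60"

as_jis = raw.decode("shift_jis")  # table based on JIS X 0208
as_ms = raw.decode("cp932")       # Microsoft's variant of the same charset

print(f"shift_jis -> U+{ord(as_jis):04X}")
print(f"cp932     -> U+{ord(as_ms):04X}")

# If these bytes were part of a local part, the two converters would
# hand the receiving MTA two different Unicode strings to match on.
same_mailbox = as_jis == as_ms
print("same mailbox name after conversion:", same_mailbox)
```

The two results are visually similar characters with different code points, so a mailbox lookup keyed on the converted string could fail or reach a different mailbox depending on which table the converting system uses.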
Mark ________ mark.davis@jtcsv.com IBM, MS 50-2/B11, 5600 Cottle Rd, SJ CA 95193 (408) 256-3148 fax: (408) 256-0799 ----- Original Message ----- From: "Edmon Chung" To: Sent: Thursday, February 27, 2003 11:42 Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) > > I see. But why would someone convert to charset A to charset B? Except for > from (or to) a particular local encoding to UTF8. Since charsets registered > should be able to convert between ISO10646, there shouldnt be any problem. > Edmon > > > > ----- Original Message ----- > From: "Paul Hoffman / IMC" > To: "Edmon Chung" > Sent: Thursday, February 27, 2003 1:42 PM > Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) > > > > At 1:30 PM -0500 2/27/03, Edmon Chung wrote: > > >I am not sure what you mean. > > >The document did refer to using the charsets registered at IANA for MIME. > > >Are the charsets registered there not a good set to work with? > > > > If you expect someone to convert from charset A to charset B > > reliably, you have to standardize the mapping table. There are lots > > of examples of mapping tables in use today that differ. Without that, > > someone sending with one charset and expecting the mail to be put in > > exactly the mailbox they mean might be in for a bad surprise. > > > > This was discussed on the IDN mailing list. 
> > > > --Paul Hoffman, Director > > --Internet Mail Consortium > > > > From owner-ietf-imaa Thu Feb 27 15:21:06 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1RNL6O10736 for ietf-imaa-bks; Thu, 27 Feb 2003 15:21:06 -0800 (PST) Received: from stoneport.math.uic.edu (stoneport.math.uic.edu [131.193.178.160]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1RNL5Y10732 for ; Thu, 27 Feb 2003 15:21:05 -0800 (PST) Received: (qmail 37788 invoked by uid 1016); 27 Feb 2003 23:21:35 -0000 Date: 27 Feb 2003 23:21:35 -0000 Message-ID: <20030227232135.37787.qmail@cr.yp.to> Automatic-Legal-Notices: See http://cr.yp.to/mailcopyright.html. From: "D. J. Bernstein" To: ietf-imaa@imc.org Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) References: <138AA78F80DCE84B8EE424399FFBF9C9133B25@exchange.ad.skymv.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Dan Kohn writes: > I believe sysadmins who care can use a punycode decoding tool (or > Emacs macro) and that the vast majority won't care. Won't care? _Won't care_? Do you think that moving beyond ASCII is some sort of theoretical game? You declare that the string "xyzzy" is a Greek alpha, you put "xyzzy" on the screen, and you expect the user to pretend he's seeing an alpha? Imagine that your computer's UI used octal instead of ABCDE etc. If you want an A, you type 101, and see 101 on your screen. If you want a B, you type 102, and see 102 on your screen. Maybe you have a special mail viewer that shows you A and B and so on, but everything else you do with the computer is in octal. How would you like that? I expect that you'll respond by saying that the user interface is ``out of scope.'' But declaring the failures of your solution to be ``out of scope'' doesn't make them disappear; it simply makes you look foolish. ---D. J. 
Bernstein, Associate Professor, Department of Mathematics, Statistics, and Computer Science, University of Illinois at Chicago From owner-ietf-imaa Thu Feb 27 19:12:22 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1S3CMV16216 for ietf-imaa-bks; Thu, 27 Feb 2003 19:12:22 -0800 (PST) Received: from exchange.ad.skymv.com (66-120-210-136.ded.pacbell.net [66.120.210.136]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1S3CLY16212 for ; Thu, 27 Feb 2003 19:12:21 -0800 (PST) Received: from exchange.ad.skymv.com ([192.168.1.71]) by exchange.ad.skymv.com with Microsoft SMTPSVC(5.0.2195.5329); Thu, 27 Feb 2003 19:12:06 -0800 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1106 content-class: urn:content-classes:message Subject: RE: Problems of Internationalized Mail Address eXtensions (IMAX) MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Date: Thu, 27 Feb 2003 19:12:05 -0800 Message-ID: <138AA78F80DCE84B8EE424399FFBF9C9195D40@exchange.ad.skymv.com> Importance: normal X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: Problems of Internationalized Mail Address eXtensions (IMAX) Thread-Index: AcLetulLRtIWbKBUTYqpja5tfRtehgAHUZNQ From: "Dan Kohn" To: X-OriginalArrivalTime: 28 Feb 2003 03:12:06.0031 (UTC) FILETIME=[2C2D81F0:01C2DED7] X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . 
Content-Transfer-Encoding: 8bit
X-MIME-Autoconverted: from quoted-printable to 8bit by above.proper.com id h1S3CMY16213
Sender: owner-ietf-imaa@mail.imc.org
Precedence: bulk
List-Archive:
List-Unsubscribe:
List-ID:

Dan Kohn wrote:
>> I believe sysadmins who care can use a punycode decoding tool (or
>> Emacs macro) and that the vast majority won't care.

D. J. Bernstein writes:
> Won't care? _Won't care_?

Yes, won't care. The difference between and (using the constructs from
Section 3.4 of RFC 2822) is that the former is unambiguously user text
(in the RFC 2277 sense) and the latter can be treated much more like a
protocol element. I agree that it is nice to often be able to guess
usernames or use heuristics about their meaning, but it's not necessary.
(One could argue that it would also be nice to know what MAIL FROM or
Content-Disposition means in your native tongue rather than treating
them as abstract protocol elements, which they obviously are.)

Anyway, I'm not making an absolutist argument here, that things are only
protocol or text and never anything in between. I'm just pointing out
that a Japanese mail admin certainly *could* get by just fine treating
the LHS and RHS of as opaque text, as they would have to do anyway for
many of their users' correspondents' ASCII addresses.
- dan -- Dan Kohn From owner-ietf-imaa Thu Feb 27 19:32:34 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1S3WYc16671 for ietf-imaa-bks; Thu, 27 Feb 2003 19:32:34 -0800 (PST) Received: from smtp6.andrew.cmu.edu (SMTP6.andrew.cmu.edu [128.2.10.86]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1S3WWY16667 for ; Thu, 27 Feb 2003 19:32:33 -0800 (PST) Received: from penguin.andrew.cmu.edu (PENGUIN.andrew.cmu.edu [128.2.121.100]) by smtp6.andrew.cmu.edu (8.12.7.Beta1/8.12.3.Beta2) with ESMTP id h1S3WYpZ029993; Thu, 27 Feb 2003 22:32:34 -0500 Date: Thu, 27 Feb 2003 22:32:34 -0500 Message-Id: <200302280332.h1S3WYpZ029993@smtp6.andrew.cmu.edu> From: Lawrence Greenfield X-Mailer: BatIMail version 3.3 To: ietf-imaa@imc.org, Dan Kohn In-reply-to: <138AA78F80DCE84B8EE424399FFBF9C9195D40@exchange.ad.skymv.com> Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) References: <138AA78F80DCE84B8EE424399FFBF9C9195D40@exchange.ad.skymv.com> User-Agent: SEMI/1.14.3 (Ushinoya) FLIM/1.14.3 (=?ISO-8859-4?Q?Unebigory?= =?ISO-8859-4?Q?=F2mae?=) Emacs/21.2 (i686-pc-linux-gnu) MULE/5.0 (SAKAKI) MIME-Version: 1.0 (generated by SEMI 1.14.3 - "Ushinoya") Content-Type: text/plain; charset=US-ASCII Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: [...] > Won't care? _Won't care_? Yes, won't care. The difference between and (using the constructs from Section 3.4 of RFC 2822) is that the former is unambiguously user text (in the RFC 2277 sense) and the latter can be treated much more like a protocol element. I agree that it is nice to often be able to guess usernames or use heuristics about their meaning, but it's not necessary. (One could argue that it would also be nice to know what MAIL FROM or Content-Disposition means in your native tongue rather than treating them as abstract protocol elements, which they obviously are.) 

E-mail administrators are frequently looking at logs to track what
happened to certain e-mail messages. They're given e-mail addresses like
"bob sent this" or "joe was expecting this". Any MTA that didn't give
facilities to do this sort of tracking (either through a program or
reading logs or whatever) is clearly substandard.

MTAs written after Punycode or whatever has come into widespread use
would have to decode it for administrators, because administrators will
not be given Punycode addresses by users. (Well, maybe sometime they
will? But one hopes not by non-technical users.) It is silly to think
that administrators can do their job effectively by treating them as
"abstract protocol elements".

MTA administrators dealing with addresses outside their native
language/alphabet may well prefer to deal with Punycode addresses.
Designing a usable system for an administrator to deal with many
languages/scripts is probably pretty hard.

Now, this fact doesn't mean that we _have_ to use UTF-8 in SMTP or in
message headers. It just means that MTAs and other internal
infrastructure _will_ become aware of any encodings.
Larry From owner-ietf-imaa Thu Feb 27 21:08:52 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1S58qS19472 for ietf-imaa-bks; Thu, 27 Feb 2003 21:08:52 -0800 (PST) Received: from pie1.i-dns.net ([203.81.44.31]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1S58pY19468 for ; Thu, 27 Feb 2003 21:08:51 -0800 (PST) Received: from localhost (localhost [127.0.0.1]) by pie1.i-dns.net (Postfix) with ESMTP id 2BD3678ACF; Fri, 28 Feb 2003 05:08:54 +0000 (GMT) Received: from pie1.i-dns.net ([127.0.0.1]) by localhost (pie1.i-dns.net [127.0.0.1:10024]) (amavisd-new) with SMTP id 27039-08; Fri, 28 Feb 2003 05:08:51 +0000 (GMT) Received: from jeffreyibm (unknown [137.132.137.37]) by pie1.i-dns.net (Postfix) with SMTP id D572878ACB; Fri, 28 Feb 2003 05:08:50 +0000 (GMT) Message-ID: <001401c2dee7$c8683780$25898489@jeffreyibm> From: "Jeffrey J Zahari" To: "D. J. Bernstein" , References: <138AA78F80DCE84B8EE424399FFBF9C9195D3E@exchange.ad.skymv.com> <00bc01c2de32$12cfbb10$25898489@jeffreyibm> <20030227102249.1376.qmail@cr.yp.to> Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) Date: Fri, 28 Feb 2003 13:10:58 +0800 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2800.1106 X-MIMEOLE: Produced By Microsoft MimeOLE V6.00.2800.1106 X-Virus-Scanned: by amavisd-new Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: ----- Original Message ----- From: "D. J. Bernstein" To: Sent: Thursday, February 27, 2003 6:22 PM Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) > > Jeffrey J Zahari writes: > > anyone could have already set up a working DNS/mail > > infrastructure using bind 9 and qmail using UTF-8 > > Here you're talking about a massive change to the Internet as if it were > some trivial overnight task. 
>
I am not advocating that everyone should go down this path, merely pointing out that efforts have been made to allow 8-bit on the Internet.

> > However, IDNA was chosen
>
> Here you're talking about even more massive change to the Internet as if
> it were already done.
>
IDNA was chosen, yes, by the IDN WG.

> > better compression comparitive to UTF-8
>
> Here you're wildly exaggerating the importance of a ludicrously small
> issue.
>
There have been no relaxations of the domain label length restrictions, and no recommendations to cater for variable-width encodings, so these limits need due consideration.

> > all legacy implementations would not have to be upgraded
>
> Here you're making the content-free observation that we don't have to
> do anything if we don't want to accomplish anything.
>
Existing DNS implementations would not need any upgrades. The tools used to manipulate the zone files, on the other hand, would need helper applications or Punycode support to do the necessary conversions. Protocols that use domain names as protocol elements would likewise need some reworking to cater for the impact of ACE IDNs. However, one could argue that the scope of DNS per se does not need to cover this.

> Are you an implementor? Do you speak any languages other than English?
> What are your goals here?
>
Yes, I speak Mandarin, a smattering of dialects from Guangdong, Fujian and Malay, a language which originally had no written form but has adopted US-ASCII (no i18n problems there). I presume all our goals are to present balanced opinions on the solutions at hand?

> ---D. J.
Bernstein, Associate Professor, Department of Mathematics, > Statistics, and Computer Science, University of Illinois at Chicago > jeffrey j zahari From owner-ietf-imaa Fri Feb 28 03:29:07 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1SBT7k03662 for ietf-imaa-bks; Fri, 28 Feb 2003 03:29:07 -0800 (PST) Received: from slarti.muc.de (slarti.muc.de [193.149.48.10]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1SBT5Y03658 for ; Fri, 28 Feb 2003 03:29:05 -0800 (PST) Received: (qmail 23597 invoked by uid 66); 28 Feb 2003 11:28:46 -0000 Received: from faerber.muc.de by slarti.muc.de with BSMTP (rsmtp-qm-ot 0.4) for ietf-imaa@imc.org; 28 Feb 2003 11:28:46 -0000 Received: by faerber.muc.de (OpenXP/32 v3.9.4 (Win32) alpha @ 2003-02-28-0038d); 28 Feb 2003 12:28:34 +0000 Date: 28 Feb 2003 00:00:00 +0000 From: list-ietf-i18n-imaa@faerber.muc.de (=?ISO-8859-1?Q?Claus_F=E4rber?=) To: ietf-imaa@imc.org Message-ID: <8giKPPM3cDD@3247.org> In-Reply-To: <08f401c2de98$dd377670$fb5016d3@neteka.inc> Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) User-Agent: OpenXP/32 v3.9.4 (Win32) alpha @ 2003-02-28-0038d MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 8bit Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Edmon Chung schrieb/wrote: > I dont agree with the forward looking part. > The big reason as far as I believe that there should be a charset parameter > is because of forward looking. Today we move from English only to > multilingual... if the design in the first place allowed the charset > parameter, we wouldnt have the problem at all... No, we would have a lot of problems now. For example, someone wants to send me a mail to cfärber@muc.de, but his mailer uses ISO-8859-15, which the MDA of my ISP does not know (although it knows ISO-8859-1 and UTF-8). The message bounces. 
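[Archive annotation, not part of the original message.] The failure mode in the example above can be sketched in a few lines. This is only a minimal illustration using Python's standard codecs. The address and the two charsets the receiving side knows come from the message; the name `mda_can_decode` and the accept-or-bounce rule are hypothetical and do not describe any real MDA.

```python
# Hypothetical receiver that understands only some charset labels
# (ISO-8859-1 and UTF-8, as in the example above).
KNOWN_CHARSETS = {"iso-8859-1", "utf-8"}


def mda_can_decode(charset_label: str) -> bool:
    """Return True if the (hypothetical) MDA recognises the charset label."""
    return charset_label.lower() in KNOWN_CHARSETS


local_part = "cf\u00e4rber"  # "cfärber", written with an escape for portability

latin1 = local_part.encode("iso-8859-1")
latin9 = local_part.encode("iso-8859-15")
utf8 = local_part.encode("utf-8")

# The same address has different wire forms depending on the charset.
same_bytes_latin1_latin9 = latin1 == latin9  # 'ä' is byte 0xE4 in both
same_bytes_latin1_utf8 = latin1 == utf8      # UTF-8 needs two bytes for 'ä'

# Even a byte-identical form is of no use if the label itself is unknown.
would_bounce = not mda_can_decode("ISO-8859-15")

# Where the two Latin charsets really differ: byte 0xA4.
sign_latin1 = b"\xa4".decode("iso-8859-1")   # currency sign
sign_latin9 = b"\xa4".decode("iso-8859-15")  # euro sign
```

With a charset parameter, every receiver has to recognise every label a sender might choose. A single fixed encoding removes that negotiation, which is the point the example makes.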
Claus -- http://www.faerber.muc.de/ From owner-ietf-imaa Fri Feb 28 04:16:03 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1SCG3f08373 for ietf-imaa-bks; Fri, 28 Feb 2003 04:16:03 -0800 (PST) Received: from slarti.muc.de (slarti.muc.de [193.149.48.10]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1SCG1Y08365 for ; Fri, 28 Feb 2003 04:16:01 -0800 (PST) Received: (qmail 2014 invoked by uid 66); 28 Feb 2003 12:16:01 -0000 Received: from faerber.muc.de by slarti.muc.de with BSMTP (rsmtp-qm-ot 0.4) for ietf-imaa@imc.org; 28 Feb 2003 12:16:01 -0000 Received: by faerber.muc.de (OpenXP/32 v3.9.4 (Win32) alpha @ 2003-02-28-0038d); 28 Feb 2003 13:15:51 +0000 Date: 28 Feb 2003 00:00:00 +0000 From: list-ietf-i18n-imaa@faerber.muc.de (=?ISO-8859-1?Q?Claus_F=E4rber?=) To: ietf-imaa@imc.org Message-ID: <8giKyXl3cDD@3247.org> In-Reply-To: <20030227232135.37787.qmail@cr.yp.to> Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) User-Agent: OpenXP/32 v3.9.4 (Win32) alpha @ 2003-02-28-0038d MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: D. J. Bernstein schrieb/wrote: > Do you think that moving beyond ASCII is some sort of theoretical game? > You declare that the string "xyzzy" is a Greek alpha, you put "xyzzy" on > the screen, and you expect the user to pretend he's seeing an alpha? Someone who does not know the Greek language and can't type Greek characters might actually prefer entering xn--mxa (which is Nameprep and Punycode of a single alpha); it's even easier than using ISO 14755 for longer name components. While Greek and Cyrillic are quite familiar to people used to Latin, other scripts are certainly not; most people could not compare strings if they use different fonts or even rendering engines. 
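[Archive annotation, not part of the original message.] The claim that xn--mxa is the Nameprep-plus-Punycode form of a single alpha can be checked with Python's built-in "punycode" and "idna" codecs. This is a sketch under the assumption that the "idna" codec, which follows the IDNA 2003 rules (Nameprep, then Punycode, then the "xn--" prefix), matches the procedure meant in the message; the variable names are illustrative only.

```python
# Uses only codecs that ship with Python; no third-party packages.
alpha = "\u03b1"          # GREEK SMALL LETTER ALPHA
capital_alpha = "\u0391"  # GREEK CAPITAL LETTER ALPHA

# Bare Punycode of the single code point, without the ACE prefix.
raw_punycode = alpha.encode("punycode")

# Full ToASCII conversion: Nameprep, Punycode, then the "xn--" prefix.
ace_label = alpha.encode("idna")

# Nameprep case-folds, so the capital letter yields the same label.
ace_from_capital = capital_alpha.encode("idna")

# Decoding the ACE label gives back the (lowercase) Unicode form.
round_trip = ace_label.decode("idna")

print(raw_punycode, ace_label, ace_from_capital, round_trip)
```

The case-folded result for the capital letter is the behaviour under discussion in the LHS case-sensitivity thread: reusing Nameprep on the local part would make it case-insensitive.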
Claus -- http://www.faerber.muc.de/ From owner-ietf-imaa Fri Feb 28 08:32:41 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1SGWfi23159 for ietf-imaa-bks; Fri, 28 Feb 2003 08:32:41 -0800 (PST) Received: from tux.w3.org (IDENT:root@tux.w3.org [18.29.0.27]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1SGWeY23152 for ; Fri, 28 Feb 2003 08:32:40 -0800 (PST) Received: from enoshima (IDENT:root@tux.w3.org [18.29.0.27]) by tux.w3.org (8.9.3/8.9.3) with ESMTP id LAA20366; Fri, 28 Feb 2003 11:32:40 -0500 Message-Id: <4.2.0.58.J.20030228111508.042cedf0@localhost> X-Sender: duerst@localhost X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J Date: Fri, 28 Feb 2003 11:26:32 -0500 To: "Dan Kohn" , From: Martin Duerst Subject: RE: Problems of Internationalized Mail Address eXtensions (IMAX) In-Reply-To: <138AA78F80DCE84B8EE424399FFBF9C9195D40@exchange.ad.skymv.c om> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 19:12 03/02/27 -0800, Dan Kohn wrote: >Dan Kohn wrote: > > >> I believe sysadmins who care can use a punycode decoding tool (or > >> Emacs macro) and that the vast majority won't care. > >D. J. Bernstein writes: > > > Won't care? _Won't care_? > >Yes, won't care. The difference between and >(using the constructs from Section 3.4 of RFC 2822) is that the former >is unambiguously user text (in the RFC 2277 sense) and the latter can be >treated much more like a protocol element. I agree that it is nice to >often be able to guess usernames or use heuristics about their meaning, >but it's not necessary. We are not talking about guessing here. Having somebody tell that there is a problem with her mail (address FOO), and you as a sysadmin go looking for that address FOO in some of your files has nothing to do with guessing. 
And the argumentation about protocol elements doesn't really work, of course it is a protocol element, but the whole effort we are making here is just about making it readable for users (at all levels, not just the end recipients). >(One could argue that it would also be nice to >know what MAIL FROM or Content-Disposition means in your native tongue >rather than treating them as abstract protocol elements, which they >obviously are.) Administrators (not end users) can and do obviously learn these. And there is a big difference between something being English, and always being the same, whereas something actually being e.g. Japanese, but mutilated beyond recognition. Just think about it the other way round: If you were a sysadmin and everything was in Chinese, you would probably pick up on the character for 'MAIL FROM' rather quickly, but then if user 'John Smith' with address john.smith@foo.com came over with a problem, you wouldn't want to have to put john.smith@foo.com into a tool and then go hunt for the problem with some Chinese characters. >Anyway, I'm not making an absolutist argument here, that things are only >protocol or text and never anything in between. Rather than 'between', I'd say 'overlap'. >I'm just pointing out >that a Japanese mail admin certainly *could* get by just fine treating >the LHS and RHS of as opaque text, as they would have to do >anyway for many of their users' correspondents' ASCII addresses. Japanese don't treat ASCII email addresses as opaque. If you have an address such as dan@dankohn.com, they don't have that many problems recognizing the element in the address. It will take them a lot longer than it will take the average native reader, but it's not really a problem. And even Japanese users do feel much better with ken@suzuki.name than with $%&^*@#%$((^&%(. So why should we work on a system that makes things worse for them? Regards, Martin. 
From owner-ietf-imaa Fri Feb 28 08:32:42 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1SGWgP23163 for ietf-imaa-bks; Fri, 28 Feb 2003 08:32:42 -0800 (PST) Received: from tux.w3.org (IDENT:root@tux.w3.org [18.29.0.27]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1SGWfY23158 for ; Fri, 28 Feb 2003 08:32:41 -0800 (PST) Received: from enoshima (IDENT:root@tux.w3.org [18.29.0.27]) by tux.w3.org (8.9.3/8.9.3) with ESMTP id LAA20369; Fri, 28 Feb 2003 11:32:41 -0500 Message-Id: <4.2.0.58.J.20030228112727.03845bb8@localhost> X-Sender: duerst@localhost X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J Date: Fri, 28 Feb 2003 11:30:52 -0500 To: list-ietf-i18n-imaa@faerber.muc.de (Claus Faerber), ietf-imaa@imc.org From: Martin Duerst Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) In-Reply-To: <8giKyXl3cDD@3247.org> References: <20030227232135.37787.qmail@cr.yp.to> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 00:00 03/02/28 +0000, Claus Faerber wrote: >Someone who does not know the Greek language and can't type Greek >characters might actually prefer entering xn--mxa (which is Nameprep and >Punycode of a single alpha); it's even easier than using ISO 14755 for >longer name components. Somebody who doesn't know Greek probably doesn't care to type Greek at all. I guess we should design IMAs so that Greek IMAs work best for Greeks, and Japanese IMAs work best for Japanese, and so on. Designing IMAs so that Greek IMAs work best for Japanese and Japanese IMAs work best for Greeks doesn't make sense. Regards, Martin. 
From owner-ietf-imaa Fri Feb 28 10:40:27 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1SIeRL29658 for ietf-imaa-bks; Fri, 28 Feb 2003 10:40:27 -0800 (PST) Received: from server1.matic.com (server.iicinternet.com [66.159.16.71] (may be forged)) by above.proper.com (8.11.6/8.11.3) with SMTP id h1SIeQY29653 for ; Fri, 28 Feb 2003 10:40:26 -0800 (PST) Received: (qmail 13331 invoked from network); 28 Feb 2003 18:34:32 -0000 Received: from adsl-65-43-32-68.dsl.lgtpmi.ameritech.net (HELO ?192.168.0.100?) (65.43.32.68) by server.iicinternet.com with SMTP; 28 Feb 2003 18:34:32 -0000 Mime-Version: 1.0 X-Sender: tedd@sperling.com (Unverified) Message-Id: Date: Fri, 28 Feb 2003 13:34:36 -0500 To: ietf-imaa@imc.org From: tedd Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: >At 00:00 03/02/28 +0000, Claus Faerber wrote: > >>Someone who does not know the Greek language and can't type Greek >>characters might actually prefer entering xn--mxa (which is Nameprep and >>Punycode of a single alpha); it's even easier than using ISO 14755 for >>longer name components. > >Somebody who doesn't know Greek probably doesn't care to type Greek >at all. I guess we should design IMAs so that Greek IMAs work best >for Greeks, and Japanese IMAs work best for Japanese, and so on. >Designing IMAs so that Greek IMAs work best for Japanese and Japanese >IMAs work best for Greeks doesn't make sense. > >Regards, Martin. Martin: What about char sets that are not language specific, like mathematical symbols? The Unicode character database does provide many char sets that are not language specific -- look at symbol fonts and even dingbats for sake of argument. 
Shouldn't Greeks and Japanese (as well as everyone else) have easy and equal access to those characters -- and to all others char sets in the Unicode database as well? Designing things on a language specific basis looks like a "no matter how many times you cut it, it's still too short" type of thing. tedd -- http://sperling.com/ From owner-ietf-imaa Fri Feb 28 11:18:42 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1SJIg103623 for ietf-imaa-bks; Fri, 28 Feb 2003 11:18:42 -0800 (PST) Received: from tux.w3.org (IDENT:root@tux.w3.org [18.29.0.27]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1SJIfY03617 for ; Fri, 28 Feb 2003 11:18:41 -0800 (PST) Received: from enoshima (IDENT:root@tux.w3.org [18.29.0.27]) by tux.w3.org (8.9.3/8.9.3) with ESMTP id OAA05046; Fri, 28 Feb 2003 14:18:39 -0500 Message-Id: <4.2.0.58.J.20030228135703.0549aa20@localhost> X-Sender: duerst@localhost X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J Date: Fri, 28 Feb 2003 13:58:01 -0500 To: Marc Mutz , ietf-imaa@imc.org From: Martin Duerst Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) In-Reply-To: <200302272014.12984@sendmail.mutz.com> References: <138AA78F80DCE84B8EE424399FFBF9C9195D3E@exchange.ad.skymv.com> <138AA78F80DCE84B8EE424399FFBF9C9195D3E@exchange.ad.skymv.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 20:13 03/02/27 +0100, Marc Mutz wrote: >On Thursday 27 February 2003 05:37, Dan Kohn wrote: > > > The difference, I believe, is > > that 8BitMIME provides a 33% bandwidth reduction when it can be used > > end-to-end, at the cost of requiring base64 transformations when > > encountering a non-compliant MTA or MUA. > > >The big difference of 8bitmime vs. 
any IMA-enabling SMTP extension is >that the former has 8bit CTE as it's companion in rfc2822 >serializations, while the latter has nothing like that. I agree that it works better if we have both. That just may mean that we need both. Regards, Martin. >Also, 8BITMIME makes it easier for the MIA[1] (doesn't need to apply CTE >in certain situations anymore), while IMA-SMTP extensions that exist in >a universe of their own (ie. not backed by the same structure in >rfc2822 serializations) don't. If a MIA wants to use the IMA-SMTP >extension, then it doesn't _save_ a conversion (8bit->qp/b64 as is the >case in 8BITMIME), but needs to _add_ one (IMAA->IMAX). Why would a MIA >want to use the extension if it was more work? > >[1] MIA = Message Injection Agent > >-- >The illegal we do immediately. >The unconstitutional takes a bit longer. -- Henry Kissinger From owner-ietf-imaa Fri Feb 28 11:19:09 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1SJJ9M03717 for ietf-imaa-bks; Fri, 28 Feb 2003 11:19:09 -0800 (PST) Received: from tux.w3.org (IDENT:root@tux.w3.org [18.29.0.27]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1SJJ8Y03707 for ; Fri, 28 Feb 2003 11:19:08 -0800 (PST) Received: from enoshima (IDENT:root@tux.w3.org [18.29.0.27]) by tux.w3.org (8.9.3/8.9.3) with ESMTP id OAA05053; Fri, 28 Feb 2003 14:18:39 -0500 Message-Id: <4.2.0.58.J.20030228140741.05499050@localhost> X-Sender: duerst@localhost X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J Date: Fri, 28 Feb 2003 14:12:04 -0500 To: tedd , ietf-imaa@imc.org From: Martin Duerst Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) In-Reply-To: Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 13:34 03/02/28 -0500, tedd wrote: [reordered] >What about char sets that are not language specific, like mathematical >symbols? 
The Unicode character database does provide many char sets that >are not language specific -- look at symbol fonts and even dingbats for >sake of argument. > >Shouldn't Greeks and Japanese (as well as everyone else) have easy and >equal access to those characters -- and to all others char sets in the >Unicode database as well? Designing things on a language specific basis >looks like a "no matter how many times you cut it, it's still too short" >type of thing. Sorry for having created confusion by maybe stating my opinion in a somewhat simplified fashion. Of course, if we would go so far as to create different designs for Greek and Japanese, and so on, we would end up in deep chaos. The idea is just that the system is designed so that everybody can easily deal with the characters they mostly use. If somebody is familiar with math symbols and wants to use them for mail addresses (I personally doubt that there will be much such use, but that's not the issue), then they should be able to just use them, without having to go through an ACE. Regards, Martin. >>At 00:00 03/02/28 +0000, Claus Faerber wrote: >> >>>Someone who does not know the Greek language and can't type Greek >>>characters might actually prefer entering xn--mxa (which is Nameprep and >>>Punycode of a single alpha); it's even easier than using ISO 14755 for >>>longer name components. >> >>Somebody who doesn't know Greek probably doesn't care to type Greek >>at all. I guess we should design IMAs so that Greek IMAs work best >>for Greeks, and Japanese IMAs work best for Japanese, and so on. >>Designing IMAs so that Greek IMAs work best for Japanese and Japanese >>IMAs work best for Greeks doesn't make sense. >> >>Regards, Martin. 
From owner-ietf-imaa Fri Feb 28 13:54:36 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1SLsal11471 for ietf-imaa-bks; Fri, 28 Feb 2003 13:54:36 -0800 (PST) Received: from slarti.muc.de (slarti.muc.de [193.149.48.10]) by above.proper.com (8.11.6/8.11.3) with SMTP id h1SLsYY11466 for ; Fri, 28 Feb 2003 13:54:34 -0800 (PST) Received: (qmail 21052 invoked by uid 66); 28 Feb 2003 21:54:28 -0000 Received: from faerber.muc.de by slarti.muc.de with BSMTP (rsmtp-qm-ot 0.4) for ietf-imaa@imc.org; 28 Feb 2003 21:54:28 -0000 Received: by faerber.muc.de (OpenXP/32 v3.9.4 (Win32) alpha @ 2003-02-28-2224d); 28 Feb 2003 22:54:20 +0000 Date: 28 Feb 2003 00:00:00 +0000 From: list-ietf-i18n-imaa@faerber.muc.de (=?ISO-8859-1?Q?Claus_F=E4rber?=) To: ietf-imaa@imc.org Message-ID: <8giNmL1JcDD@3247.org> In-Reply-To: <4.2.0.58.J.20030228112727.03845bb8@localhost> Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) User-Agent: OpenXP/32 v3.9.4 (Win32) alpha @ 2003-02-28-2224d MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Martin Duerst schrieb/wrote: > Designing IMAs so that Greek IMAs work best for Japanese and Japanese > IMAs work best for Greeks doesn't make sense. So it's ok if someone in Greece can't write down the email address of a friend in Japan on paper? Although most people will continue to have ASCII-only addresses in addition to their non-ASCII addresses, they won't switch the From line every time they mail foreign people. 
Claus -- http://www.faerber.muc.de/ From owner-ietf-imaa Fri Feb 28 14:33:47 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h1SMXlZ13761 for ietf-imaa-bks; Fri, 28 Feb 2003 14:33:47 -0800 (PST) Received: from tux.w3.org (IDENT:root@tux.w3.org [18.29.0.27]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h1SMXjY13757 for ; Fri, 28 Feb 2003 14:33:46 -0800 (PST) Received: from enoshima (IDENT:root@tux.w3.org [18.29.0.27]) by tux.w3.org (8.9.3/8.9.3) with ESMTP id RAA05171; Fri, 28 Feb 2003 17:33:46 -0500 Message-Id: <4.2.0.58.J.20030228173123.05716328@localhost> X-Sender: duerst@localhost X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J Date: Fri, 28 Feb 2003 17:33:33 -0500 To: list-ietf-i18n-imaa@faerber.muc.de (Claus Faerber), ietf-imaa@imc.org From: Martin Duerst Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) In-Reply-To: <8giNmL1JcDD@3247.org> References: <4.2.0.58.J.20030228112727.03845bb8@localhost> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 00:00 03/02/28 +0000, Claus Faerber wrote: >So it's ok if someone in Greece can't write down the email address of a >friend in Japan on paper? It's okay that someone in Greece can't write down an email with Japanese characters. It's okay that somebody in Greece doesn't understand Japanese. Would be nice if we could change that, but that won't happen soon. >Although most people will continue to have ASCII-only addresses in >addition to their non-ASCII addresses, they won't switch the From line >every time they mail foreign people. They won't want to do that. Tools will do that for them, pretty easily. Regards, Martin. 
From owner-ietf-imaa Fri Feb 28 16:10:14 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h210AEs16582 for ietf-imaa-bks; Fri, 28 Feb 2003 16:10:14 -0800 (PST) Received: from imsm031.netvigator.com (imsm031.netvigator.com [218.102.48.149]) by above.proper.com (8.11.6/8.11.3) with SMTP id h210ADY16578 for ; Fri, 28 Feb 2003 16:10:13 -0800 (PST) Received: (qmail 10798 invoked from network); 1 Mar 2003 00:10:04 -0000 Received: from n218103230004.netvigator.com (HELO EDMON15) (218.103.230.4) by imsm031.netvigator.com with SMTP; 1 Mar 2003 00:10:04 -0000 Message-ID: <0a0801c2df86$e2317920$fb5016d3@neteka.inc> From: "Edmon Chung" To: "Claus Färber" , References: <8giKPPM3cDD@3247.org> Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) Date: Fri, 28 Feb 2003 19:09:52 -0500 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 8bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2600.0000 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2600.0000 Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Why would the message bounce? His mailer would use UTF8. Edmon ----- Original Message ----- From: "Claus Färber" To: Sent: Thursday, February 27, 2003 7:00 PM Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) Edmon Chung schrieb/wrote: > I dont agree with the forward looking part. > The big reason as far as I believe that there should be a charset parameter > is because of forward looking. Today we move from English only to > multilingual... if the design in the first place allowed the charset > parameter, we wouldnt have the problem at all... No, we would have a lot of problems now. For example, someone wants to send me a mail to cfärber@muc.de, but his mailer uses ISO-8859-15, which the MDA of my ISP does not know (although it knows ISO-8859-1 and UTF-8). The message bounces. 
Claus -- http://www.faerber.muc.de/ From owner-ietf-imaa Fri Feb 28 19:23:30 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h213NUa22490 for ietf-imaa-bks; Fri, 28 Feb 2003 19:23:30 -0800 (PST) Received: from leonis.nus.edu.sg (leonis.nus.edu.sg [137.132.1.18]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h213NPY22484 for ; Fri, 28 Feb 2003 19:23:26 -0800 (PST) Received: from bic.nus.edu.sg ([137.132.137.32]) by leonis.nus.edu.sg (8.12.1/8.12.1) with ESMTP id h213OjUn003683; Sat, 1 Mar 2003 11:24:48 +0800 (SGT) Message-ID: <3E60279D.4060105@bic.nus.edu.sg> Date: Sat, 01 Mar 2003 11:23:09 +0800 From: Tan Tin Wee User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.0.1) Gecko/20020823 Netscape/7.0 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Martin Duerst CC: Claus Faerber , ietf-imaa@imc.org Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) References: <4.2.0.58.J.20030228112727.03845bb8@localhost> <4.2.0.58.J.20030228173123.05716328@localhost> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: but it is NOT ok, if someone in Japan who understands Japanese cannot write down an email address or send email to another Japanese person who can understand Japanese. This needs to happen SOON. And it would be a big tragedy if 1 billion Chinese persons cannot write down their email addresses in Chinese and communicate that to the other 1 billion minus one Chinese persons using the Internet because we are worried that someone in Greece can't write down an email with Japanese characters, or because we are still hoping that all of them will master English first. We have gone through these arguments before in the IDN mailing list in case IMAA mailing list folks are new to these issues. So we should move on, and focus on going forward as Martin and others point out. 
bestrgds tin wee Martin Duerst wrote: > > At 00:00 03/02/28 +0000, Claus Faerber wrote: > >> So it's ok if someone in Greece can't write down the email address of a >> friend in Japan on paper? > > > It's okay that someone in Greece can't write down an email with > Japanese characters. It's okay that somebody in Greece doesn't > understand Japanese. Would be nice if we could change that, > but that won't happen soon. > > >> Although most people will continue to have ASCII-only addresses in >> addition to their non-ASCII addresses, they won't switch the From line >> every time they mail foreign people. > > > They won't want to do that. Tools will do that for them, pretty easily. > > Regards, Martin. > > From owner-ietf-imaa Sat Mar 1 07:25:40 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h21FPeW21728 for ietf-imaa-bks; Sat, 1 Mar 2003 07:25:40 -0800 (PST) Received: from server1.matic.com (server.iicinternet.com [66.159.16.71] (may be forged)) by above.proper.com (8.11.6/8.11.3) with SMTP id h21FPcY21723 for ; Sat, 1 Mar 2003 07:25:39 -0800 (PST) Received: (qmail 12140 invoked from network); 1 Mar 2003 15:25:34 -0000 Received: from adsl-65-43-32-68.dsl.lgtpmi.ameritech.net (HELO ?192.168.0.100?) (65.43.32.68) by server.iicinternet.com with SMTP; 1 Mar 2003 15:25:34 -0000 Mime-Version: 1.0 X-Sender: tedd@sperling.com (Unverified) Message-Id: In-Reply-To: <4.2.0.58.J.20030228140741.05499050@localhost> References: <4.2.0.58.J.20030228140741.05499050@localhost> Date: Sat, 1 Mar 2003 10:25:02 -0500 To: ietf-imaa@imc.org From: tedd Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: >>Shouldn't Greeks and Japanese (as well as everyone else) have easy >>and equal access to those characters -- and to all others char sets >>in the Unicode database as well? 
Designing things on a language
>>specific basis looks like a "no matter how many times you cut it,
>>it's still too short" type of thing.
>
>Sorry for having created confusion by maybe stating my opinion in a
>somewhat simplified fashion. Of course, if we would go so far
>as to create different designs for Greek and Japanese, and so on,
>we would end up in deep chaos. The idea is just that the system
>is designed so that everybody can easily deal with the characters
>they mostly use. If somebody is familiar with math symbols and
>wants to use them for mail addresses (I personally doubt that
>there will be much such use, but that's not the issue), then
>they should be able to just use them, without having to go
>through an ACE.
>
>Regards, Martin.

Martin:

I believe that the answer will come from commercial software designers and not from the creators of ACE (or whatever the end algorithm will be). I believe that whatever encoding is used (i.e., xx--whatever) to stand in for a seven-bit to eight-bit conversion will be converted for the user by Internet end-user software. I don't think the end-user will ever have to see, or understand, what an ACE-like encoded string is.

Hardware and software developers clearly want a global market, and adopting a Unicode-like database is one way, if not the only way, to accomplish that goal. Likewise, adopting an ACE-like algorithm for delivering an eight-bit message via a seven-bit medium has been the only way to fulfill that goal without trashing the net in the process.

The end result I envision, as do many, is a Chinese user sitting before his keyboard using a Chinese char set to converse with his brethren with absolutely no regard for, nor reliance upon, English -- all AMC-like mechanics (i.e., the Latin alphanumerics) will be completely transparent to him. As a side note, his preferred language char set will be as easy for him to set as it is for us to change from a Helvetica to a Times font.

Please note that this will be done via end-user software and not through the efforts of this group or any group like it.

Eventually, I believe that the net will adopt an eight-bit format (or greater), and all this Punycode and other such "make-fit" conversions will be nothing more than an uncomfortable growing-pains footnote in history. But until then, we will have to make do with what's available and, in doing so, provide opportunity for others to solve and meet end-user needs.

Now with that said, this list is a discussion as to what to do with the LHS of email addresses. I claim that treating both sides the same will simplify and speed the process for developers and will get this global opportunity to the end-user sooner. However, this will (perhaps unnecessarily) limit opportunities for the global end user.

Keep in mind that case considerations made by us (the English-speaking people) do not have the same implications for the rest of the world as they do for us. In other words, we have made a distinction that UC/LC means something and have extended, or rather imposed, that limitation globally.

So, the question to this list is -- should we continue to impose the same restraints on the LHS as we have for the RHS -- or should we consider the LHS of the argument different, to be treated with less restriction and thus more opportunity -- opportunity, I might add, which is not without problems in implementation.
tedd
--
http://sperling.com/

From owner-ietf-imaa Sat Mar 1 12:45:35 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h21KjZK09717 for ietf-imaa-bks; Sat, 1 Mar 2003 12:45:35 -0800 (PST) Received: from relay-2m.club-internet.fr (relay-2m.club-internet.fr [194.158.104.41]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h21KjXY09711 for ; Sat, 1 Mar 2003 12:45:34 -0800 (PST) Received: from mine.club-internet.fr (f09m-6-102.d1.club-internet.fr [213.44.221.102]) by relay-2m.club-internet.fr (Postfix) with ESMTP id 70271170C; Sat, 1 Mar 2003 21:45:28 +0100 (CET) Message-Id: <5.2.0.9.0.20030301214954.02505d90@mail.club-internet.fr> X-Sender: jefsey@mail.club-internet.fr X-Mailer: QUALCOMM Windows Eudora Version 5.2.0.9 Date: Sat, 01 Mar 2003 21:51:31 +0100 To: tedd , ietf-imaa@imc.org From: "J-F C. (Jefsey) Morfin" Subject: Re: Problems of Internationalized Mail Address eXtensions (IMAX) In-Reply-To: References: <4.2.0.58.J.20030228140741.05499050@localhost> <4.2.0.58.J.20030228140741.05499050@localhost> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID:

At 16:25 01/03/03, tedd wrote:
>So, the question to this list is -- should we continue to impose the same
>restraints on the LHS as we have for the RHS --

No.

> or should we consider that the LHS of the argument different and be
> treated with less restriction and thus more opportunity -- opportunity, I
> might add, which is not without problems in implementation.

Definitely yes. This is the only interest users have in this discussion (users being end users and application/inter-application developers).
jfc

From owner-ietf-imaa Sun Mar 2 03:09:45 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h22B9j414358 for ietf-imaa-bks; Sun, 2 Mar 2003 03:09:45 -0800 (PST) Received: from malmo.kicore.net (malmo.kicore.net [217.212.0.10]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h22B9bY14352 for ; Sun, 2 Mar 2003 03:09:38 -0800 (PST) Received: from terra (terra.malmo.kicore.net [217.212.0.22]) by malmo.kicore.net (8.12.5+Sun/KiNet-primary) with SMTP id h22B9O16009762 for ; Sun, 2 Mar 2003 12:09:24 +0100 (MET) Message-Id: <200303021109.h22B9O16009762@malmo.kicore.net> Date: Sun, 2 Mar 2003 12:12:34 +0100 (CET) From: Dan Oscarsson Reply-To: Dan Oscarsson Subject: Leaving the legacy world To: ietf-imaa@imc.org MIME-Version: 1.0 Content-Type: TEXT/plain; charset=us-ascii Content-MD5: vVNssXwBo5GQnRfSlE8Wyw== X-Mailer: dtmail 1.3.0 @(#)CDE Version 1.5 SunOS 5.9 sun4u sparc Sender: owner-ietf-imaa@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID:

Reading through the massive amount of messages on the IMAA list I start getting the same type of hopelessness I got while being part of the IDN list. Is it not time to start leaving the legacy world behind us?

The amount of messages is so large that I cannot give good comments on them. Below I will give my view of things. It will cover several different things. You may split comments into separate threads.

To avoid unclear semantics with words like internationalisation, ASCII and non-ASCII, I will use "legacy" as meaning systems/programs seeing the world in ASCII.

Standard form of names
-------------------------------------
In the world today names are used a lot. Mail addresses, domain names, host names, URIs and URLs are names. A name is composed of characters, any UCS character. A legacy name is composed of ASCII characters.

Users do not see names as ACE, UTF-8, %-encoding or other encoded forms. For them a name is a sequence of characters.
And by characters they mean any character from UCS.

The Unicode/ISO 10646 groups made one very bad design mistake: they allowed more than one representation of a single character. For example, there are "full width" forms, which do not belong in a character set as that is a display feature. NFKC looks like a partial attempt to fix that, but fails because it also removes semantically different characters.

I have, for a long time, studied all the ways to have a standard form for names (and text). I have read and looked at discussions and code about normalised/unnormalised text, decomposed/precomposed, form NFC/NFKC and others. I have looked at UTF-8, SCSU, ISO 8859-1, UCS-2, UCS-4, UTF-16 and many others. I have looked at the impact on code.

Some of my conclusions are:
- UTF-8 is among the better for interoperability due to its simple format and lack of endian problems.
- UTF-8 is not good for character handling. It should not be used inside programs handling characters. UCS-1, UCS-2 or UCS-4 is much better (due to less complex handling and less CPU usage).
- UTF-8 is not good in all protocols. It would be fine in the e-mail protocol but not in the DNS protocol due to space constraints. In DNS, SCSU would be fine.
- Unnormalised text is not good as it allows multiple forms of a character.
- Decomposed text is not good as it takes a lot of space, does not match legacy character set handling and breaks semantics of some characters.
- NFC preserves all data in the text but allows multiple forms.
- NFKC does not preserve all data and changes semantics of some characters.
- A lot of people say UTF-8 but do not define what they encode using UTF-8. To interoperate you have to state the encoding, character set and form used.

From this I think that the best choice for standard form of text is:
- Normalised using NFC combined with all equivalent characters replaced with one character code.
This means that
    Kelvin sign  -> K
    Ligature ij  -> characters i and j
    Full width a -> a
But not that superscript 2 -> 2.

Transmission of a name in a protocol should use the above form encoded with UTF-8.

This will give us a single simple form that is easy to handle while preserving all important data. At protocol level you never need to know that some users have a "full width" @, because at protocol level there is only one form of @.

Case sensitivity/case insensitivity
---------------------------------------
I see no reason not to have case insensitivity in name matching. This includes all parts of an e-mail address, URLs, domain names, file names. That is what most users expect.

I have studied different forms of matching and think the following is best on a global level:
- All single character to single character case insensitive matching defined by Unicode is used.
- SC/TC matching.
- No single to multiple character matching (like s-sharp to ss). I have seen so many examples of how this results in a failure, and it also makes code much more complex.

The matching/mangling of names used in IDNA is unacceptable.

Keep to protocol level
------------------------
In the discussions both user interface and protocol level are discussed. Could we try to leave user interface matters until later? Or at least separate the threads.

For example, it is very important in a protocol to have ONE well defined form of protocol elements. A mail address is: local-part@domain

In a user interface the @-sign could be displayed using bold, wide, narrow, green or some other display feature. In a protocol only one form of @ should be allowed. While a MUA may recognize a green or extra wide @ sign as the @-sign, the MTA should only recognize the standard @-sign.

E-mail supporting full UCS
-----------------------------
While we have to interoperate with legacy e-mail, I think it is high time to take a step forward and use the full UCS.
To avoid unneeded mess, we design it so that a system handling non-legacy mailboxes is not a legacy system.

Let us take a step forward with SMTP: Add an ESMTP extension that switches to UCS mode. Like this:

    EHLO mail.xxx.com
    250 UCS            Meaning the server supports UCS
    MAIL FROM: UCS     Meaning that transmitting client switches to UCS mode.

In UCS mode of SMTP the following is used:
- All addresses in the protocol use standard UCS form (this includes MAIL FROM, RCPT TO, ...).
- All headers are in standard UCS form (they may not contain MIME encoded headers).
- Default text body parts are in UCS/NFC/UTF-8.
- Other text body format encodings are not recommended and not required to be supported.

Standard UCS form used in protocol is: UCS/NFC with multiple definitions removed/UTF-8.

Matching of "local part" should (or must) be done case insensitively, but case should be preserved in the protocol.

If the server/client does not support UCS, the e-mail will be downgraded/upgraded depending on direction. Downgrading is done by converting headers to MIME encoding; e-mail addresses have the local part encoded, using a form-preserving encoding, into an opaque ASCII part, and the domain name encoded using IDNA (which will destroy data in the domain name). Upgrading is doing the reverse.

Summary
------------
- Use a standard form for names: UCS/NFC with multiple definitions removed/UTF-8.
- Case insensitive matching of names using single character equivalence plus SC/TC matching.
- Update SMTP protocol to use UCS with ASCII as legacy downgrading.

That is all. I have probably forgotten a lot of things.

I can write drafts for the update to SMTP and the standard form for text/names, on my own or with someone. But I will only do so if I feel that there is a real will from many people that it is time to go beyond the legacy ASCII world. I have not yet written one for DNS, even though I know most of what needs to be in it, due to lack of time and lack of a feeling that people really want to leave legacy DNS.
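
The standard form and the matching rules summarised above can be sketched in Python using the standard `unicodedata` module. This is only an illustration under stated assumptions, not part of the proposal: the message does not say which characters count as "equivalent", so the set of compatibility tags in `FOLDED_TAGS` is a guess chosen to reproduce the Kelvin sign, ligature ij, full width a and superscript 2 examples. The names `standard_form`, `simple_lower` and `names_match` are hypothetical, folding is done in a single pass only, and SC/TC matching is left out because it needs a mapping table that the message does not define.

```python
import unicodedata

# Assumption (not from the proposal): compatibility tags whose characters
# are treated as just another representation of the same character.
FOLDED_TAGS = {"<wide>", "<narrow>", "<compat>"}


def fold_equivalent(ch: str) -> str:
    """Replace one character by its equivalent if its tag is in FOLDED_TAGS."""
    decomp = unicodedata.decomposition(ch)
    if not decomp.startswith("<"):
        # No decomposition, or a canonical one: left for NFC to handle.
        return ch
    tag, *codes = decomp.split()
    if tag not in FOLDED_TAGS:
        return ch  # e.g. <super>: superscript 2 is kept as it is
    return "".join(chr(int(code, 16)) for code in codes)


def standard_form(name: str) -> str:
    """Single-pass folding of equivalents, followed by NFC."""
    folded = "".join(fold_equivalent(ch) for ch in name)
    return unicodedata.normalize("NFC", folded)


def simple_lower(ch: str) -> str:
    """One-to-one lowercasing; multi-character mappings are not applied."""
    lowered = ch.lower()
    return lowered if len(lowered) == 1 else ch


def names_match(a: str, b: str) -> bool:
    """Case-insensitive comparison of two names in standard form."""
    def key(s: str) -> str:
        return "".join(simple_lower(c) for c in standard_form(s))
    return key(a) == key(b)


# Kelvin sign, ligature ij, full width a, superscript 2
print([standard_form(c) for c in "\u212a\u0133\uff41\u00b2"])
print(names_match("Dan@Example", "dan@example"))
print(names_match("stra\u00dfe", "strasse"))
```

Lowercasing one character at a time, instead of calling `str.casefold()`, is what keeps s-sharp from matching "ss": full case folding would apply exactly the single-to-multiple mapping that the list above rules out.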
Regards, Dan From owner-ietf-imaa Sun Mar 2 09:10:38 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.3) id h22HAcI09573 for ietf-imaa-bks; Sun, 2 Mar 2003 09:10:38 -0800 (PST) Received: from [63.202.92.157] (adsl-63-202-92-157.dsl.snfc21.pacbell.net [63.202.92.157]) by above.proper.com (8.11.6/8.11.3) with ESMTP id h22HAYY09565; Sun, 2 Mar 2003 09:10:34 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: In-Reply-To: <200303021109.h22B9O16009762@malmo.kicore.net> References: <200303021109.h22B9O16009762@malmo.kicore.net> X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warran