From owner-idn-reg-policy Tue Mar 25 09:31:20 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.6) id h2PHVKZ25308 for idn-reg-policy-bks; Tue, 25 Mar 2003 09:31:20 -0800 (PST) Received: from [63.202.92.152] (adsl-63-202-92-152.dsl.snfc21.pacbell.net [63.202.92.152]) by above.proper.com (8.11.6/8.11.6) with ESMTP id h2PHVIg25300 for ; Tue, 25 Mar 2003 09:31:18 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . Date: Tue, 25 Mar 2003 09:31:09 -0800 To: idn-reg-policy@imc.org From: Paul Hoffman / IMC Subject: New Internet Draft on registering IDNs Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Greetings. I have just submitted a new Internet Draft that gives suggestions on how to register IDNs. You can find a link to the draft at the web site for this mailing list at . Comments are, of course, welcome. This document is different than the JET document in many ways. It is meant to be generic and usable by anyone, not just registries using CJK characters. It also attempts to make the registry policy easier and more predictable from the outside. I have heard that other folks will also be preparing Internet Drafts on this issue, so it will be good to see what the differences are. 
--Paul Hoffman, Director
--Internet Mail Consortium

From owner-idn-reg-policy Tue Mar 25 09:31:21 2003
Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.6) id h2PHVLQ25309 for idn-reg-policy-bks; Tue, 25 Mar 2003 09:31:21 -0800 (PST)
Received: from [63.202.92.152] (adsl-63-202-92-152.dsl.snfc21.pacbell.net [63.202.92.152]) by above.proper.com (8.11.6/8.11.6) with ESMTP id h2PHVJg25304 for ; Tue, 25 Mar 2003 09:31:19 -0800 (PST)
Mime-Version: 1.0
X-Sender: phoffman@mail.imc.org
Date: Tue, 25 Mar 2003 09:30:13 -0800
To: idn-reg-policy@imc.org
From: Paul Hoffman / IMC
Subject: Starting the list
Content-Type: text/plain; charset="us-ascii" ; format="flowed"
Sender: owner-idn-reg-policy@mail.imc.org
Precedence: bulk

Greetings. There are now more than 50 people on this list, so it is a good time to make it active. The web site for this mailing list is at .

The main purpose of this list is to have an open discussion about the needs of registries when dealing with IDNs. There will be an active discussion of this at IDNConnect at the end of May, so this list can give attendees a lot of preparatory material to think about. More information on IDNConnect can be found at ; registration will start soon.

BTW, if anyone has suggestions for other useful information to put on the web site, please let me know. I would like to restrict the list of documents to those that have been submitted as Internet Drafts so that there are no possible copyright issues.
However, if there are other documents such as presentations that should be listed, I can add links to them. Please let me know off-line, not on the mailing list. Thanks!

--Paul Hoffman, Director
--Internet Mail Consortium

From owner-idn-reg-policy Wed Mar 26 02:50:26 2003
Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.6) id h2QAoQ127688 for idn-reg-policy-bks; Wed, 26 Mar 2003 02:50:26 -0800 (PST)
Received: from twnic.net.tw (twnic.net.tw [211.72.210.250]) by above.proper.com (8.11.6/8.11.6) with ESMTP id h2QAoOg27678; Wed, 26 Mar 2003 02:50:24 -0800 (PST)
Received: from twnic.net.tw (pc138.twnic.net.tw [211.72.211.138]) by twnic.net.tw (8.12.8/8.12.8) with ESMTP id h2QAoOwi028104; Wed, 26 Mar 2003 18:50:24 +0800
Message-ID: <3E818686.6010307@twnic.net.tw>
Date: Wed, 26 Mar 2003 18:52:54 +0800
From: Erin Chen
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:0.9.4) Gecko/20011128 Netscape6/6.2.1
X-Accept-Language: en-us
MIME-Version: 1.0
To: Paul Hoffman / IMC
CC: idn-reg-policy@imc.org
Subject: Re: New Internet Draft on registering IDNs
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
Sender: owner-idn-reg-policy@mail.imc.org
Precedence: bulk

Hi Paul and dear all,

Thanks, Paul, for proposing draft-hoffman-idn-reg-00.txt.
I have set out some of my opinions below, and I would like to hear others' comments.

----------cut-------------
2.3 Table for a zone that has no language restrictions

A registry that does not restrict the number of languages will probably allow a much wider range of characters to be used in names. At the same time, that registry cannot easily use character variants because variants for one language will be different from the variants used in a different language. To handle conflicting variants among languages, the registry can choose to have no variants for any base characters, or can choose to have variants for a subset of the languages that are expressible in the characters allowed.
-----------cut------------

A registry that does not restrict the number of languages may end up offering registration for languages whose variants overlap. To reduce confusion, cybersquatting, and DRP cases, it is very important to work out how to make the results of such registrations consistent with the principle described in 2.2.

---------cut--------------
The table would look like:
U+2200
U+2201|U+0043
U+2237|U+003AU+003A
U+2202|U+0064;U+03B4
---------cut--------------

According to the table format described in the document, shouldn't the table look like this?
U+2200
U+2201|U+0043
U+2237|U+003A:U+003A
U+2202|U+0064:U+03B4

--------cut------------
Option 3 is likely to cause the most confusion with users because including some variants will cause a name to be found, bout using other variants will cause the name to be not found.
---------cut-----------

I think option 3 is a combination of option 1 and option 2. If options 1 and 2 can be implemented, then option 3 can be implemented too, and its result would be the combination of the results of options 1 and 2. The key point to consider is whether, and for which languages, there is a requirement to allocate some variant labels for resolution while blocking other variant labels to prevent later registration.

---------cut---------
If the registry chose option 3, it must use an unspecified method to keep the elements in the registration bundle cohesive. This option SHOULD NOT be used except under carefully-controlled circumstances.
---------cut----------

I know this document is intended to be generic, unlike the IDN Admin Guideline, which focuses only on CJK characters. So I suppose it is natural for a generic principle to accommodate specific language requirements such as those the CJK Guideline explains, or at least not to mention only how impossible or difficult it is for a registry to choose option 3.

Why is "Allocate some labels and block some other labels" needed? Because the variants of a base domain name label can be separated into two categories: one to allocate and one to block. The allocated labels would be put in the zone file so that they resolve to the same destination as the base domain name, while the blocked labels would prevent later registration by others.

So, if the registry explains its registration policy for the variant table very clearly to the public, and lets users see, when they register a domain name, which labels have been allocated and which have been blocked, that would improve the user experience.

----------cut------------
    $ORIGIN example.com.
    pale    IN    NS       x.example.com.
    pale    IN    NS       y.example.com.
    pa1e    IN    DNAME    pale.example.com.
----------cut-------------

As far as I know, DNAME is supported by BIND 9 but not by BIND 8. Is it possible to let users know that, if they do not use DNAME, they could have serious delegation problems? For example, users might have too many zone files to maintain, and they would have to keep the settings for the variant labels consistent.

Erin Chen, TWNIC

Paul Hoffman / IMC wrote:
>
> Greetings. I have just submitted a new Internet Draft that gives
> suggestions on how to register IDNs. You can find a link to the draft
> at the web site for this mailing list at
> . Comments are, of course, welcome.
>
> This document is different than the JET document in many ways. It is
> meant to be generic and usable by anyone, not just registries using
> CJK characters. It also attempts to make the registry policy easier
> and more predictable from the outside. I have heard that other folks
> will also be preparing Internet Drafts on this issue, so it will be
> good to see what the differences are.
> > --Paul Hoffman, Director > --Internet Mail Consortium From owner-idn-reg-policy Wed Mar 26 08:16:09 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.6) id h2QGG9R23159 for idn-reg-policy-bks; Wed, 26 Mar 2003 08:16:09 -0800 (PST) Received: from [63.202.92.152] (adsl-63-202-92-152.dsl.snfc21.pacbell.net [63.202.92.152]) by above.proper.com (8.11.6/8.11.6) with ESMTP id h2QGG7g23155 for ; Wed, 26 Mar 2003 08:16:07 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: In-Reply-To: <3E818686.6010307@twnic.net.tw> References: <3E818686.6010307@twnic.net.tw> X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . Date: Wed, 26 Mar 2003 08:16:08 -0800 To: idn-reg-policy@imc.org From: Paul Hoffman / IMC Subject: Re: New Internet Draft on registering IDNs Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 6:52 PM +0800 3/26/03, Erin Chen wrote: >Thanks Paul for propose draft-hoffman-idn-reg-00.txt . >I have expressed some of my opinions, and I would like to know others >comments. Yes, it would be great to have more comments here. >----------cut------------- >2.3 Table for a zone that has no language restrictions > >A registry that does not restrict the number of languages will probably >allow a much wider range of characters to be used in names. At the same >time, that registry cannot easily use character variants because >variants for one language will be different from the variants used in a >different language. 
To handle conflicting variants among languages, the >registry can choose to have no variants for any base characters, or can >choose to have variants for a subset of the languages that are >expressible in the characters allowed. >-----------cut------------ > >A registry that does not restrict the number of languages here means the >registry would possible offer the registration service for the languages >which have overlapped variants. For the purpose of decrease confusion, >cybersquating and DRP, so, how to let such registration result consist with >the principle described in 2.2 is very important. It would be great if you could suggest some wording about how such a registry would work. I can't think of any way it could, but that doesn't mean it is impossible. >---------cut-------------- >The table would look like: >U+2200 >U+2201|U+0043 >U+2237|U+003AU+003A >U+2202|U+0064;U+03B4 >---------cut-------------- > >As the table format describe in the document, the tabel would look like: >U+2200 >U+2201|U+0043 >U+2237|U+003A:U+003A >U+2202|U+0064:U+03B4 >Isn't it? Your second change (to "U+2202|U+0064:U+03B4") is correct. The first one is not, however. The text above said: >- allows the PROPORTION character (U+2237) which has one variant which > is the string COLON (U+003A) COLON (U+003A) The way to show a string is to append all the characters together with no delimiters between them. >--------cut------------ >Option 3 is likely to cause the most confusion with users because >including some variants will cause a name to be found, bout using >other variants will cause the name to be not found. >---------cut----------- > >I think the option 3 is a combination case of option 1 and option 2. >If option 1 and option 2 is possible for implement. So implement >option 3 would also possible for implement. The result would be >the combination result of option 1 and option2. We fully agree. The JET document shows a way that it can be implemented. 
I very purposely tried not to say "it can't be done", just "it will cause the most confusion for users". > The key point >has to consider is whether and which language has the requirements >of allocate some variant labels for resolution and block other variant >labels for prevent later registration at the same time. Right. Unfortunately, the current draft of the JET document is silent about these requirements, and from talking to some JET members, I haven't heard any good description of why Chinese needs both. In fact, I remember many long conversations with CNNIC and TWNIC people a few years ago where they all said that just blocking (with no allocating) was fine. Maybe opinions in the Chinese language community have changed since then, but I haven't seen any written down in the JET document yet. Maybe the next version will cover this clearly. >---------cut--------- >If the registry chose option 3, it must use an unspecified method to >keep the elements in the registration bundle cohesive. This option >SHOULD NOT be used except under carefully-controlled circumstances. >---------cut---------- > >I know the intention of this document is for the generic purpose not like >IDN Admin Guideline just forcus on CJK characters. Exactly. >So, I suppose that, it is natural that a generic principle should comprise >the specific language requirement like CJK Guideline has explain. At least >does not just mention how impossible/difficult for registry choice option 3. I never said "impossible" because I am sure it is not impossible. I trust the JET people's analysis that it can be done. >Why need "Allocate some labels and block some other labels" ? That's >because the variants could be seperate into 2 categories. One for allocate >and one for block with a base domain name label. My apologies, but I don't understand that. The fact that it *could* be done does not explain why it is needed. Could you be clearer on the *need*? 
It would be helpful in this document (and in the JET document) if the actual need is clearly stated. >The allocated labels would put in the zone file for the same destination >resolution with the base domain name. However the blocked labels >would prevent the latter registration from others. > >So, if th registry make more explain to the public about his >registration policy >regard the variant table very clearly, and when user register a domain name >let user recognize about what labels have been allocated and what labels have >been blocked, that would be improve the user experience. True, but it would only help a little bit. Telling the users what has been done does not let them predict what will happen. If a registry says "we have mapped these characters to these other ones for this language reason", users will understand that; if a registry says "we have blocked these characters for this language reason", users will understand that. But I don't know how many users will understand "we have mapped some of them but blocked other ones even though the language reason is the same". If there is a good language reason for differentiating the two cases, that would be wonderful. >----------cut------------ > $ORIGIN example.com. > pale IN NS x.example.com. > pale IN NS y.example.com. > pa1e IN DNAME pale.example.com. >----------cut------------- >As I know DNAME is support by BIND9 not BIND8, is it possible to >let the users know if do not use DNAME the would have possible serious >delegation problem? Such as there might be too much zone file have to >maintain bye users, user has to keep the variant labels setting consistent. I don't understand your question. Why would there be "serious delegation problems"? Having "too many zone files" is not a problem if the users have the tools to manage them. If the users don't have the tools, then the additional zones would give bad information, but that will be fixed when users complain. 
This isn't any different than the current DNS when people make mistakes that affect users. The big point is that the effects can only affect the owner of the registration bundle, and that a cybersquatter can't cause the problems.

--Paul Hoffman, Director
--Internet Mail Consortium

From owner-idn-reg-policy Wed Mar 26 12:25:14 2003
Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.6) id h2QKPE908255 for idn-reg-policy-bks; Wed, 26 Mar 2003 12:25:14 -0800 (PST)
Received: from holodoc.allard.nu (postfix@holodoc.allard.nu [212.112.184.131]) by above.proper.com (8.11.6/8.11.6) with ESMTP id h2QKPBg08249; Wed, 26 Mar 2003 12:25:12 -0800 (PST)
Received: from iis.se (as2-6-1.nvik.s.bonet.se [217.215.75.3]) by holodoc.allard.nu (Postfix) with ESMTP id 2D65496D3; Wed, 26 Mar 2003 21:25:14 +0100 (CET)
Message-ID: <3E820CB0.1000605@iis.se>
Date: Wed, 26 Mar 2003 21:25:20 +0100
From: Staffan Hagnell
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.0.2) Gecko/20030208 Netscape/7.02
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: Paul Hoffman / IMC
Cc: idn-reg-policy@imc.org
Subject: Re: New Internet Draft on registering IDNs
References: <3E818686.6010307@twnic.net.tw>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
Sender: owner-idn-reg-policy@mail.imc.org
Precedence: bulk

What happens when the "Allocate all labels" policy is used and the input label contains several characters with equivalents?

E.g., if there are 6 base characters with one variant each in the input label, will the owner of the registration have to keep 2 raised to 6 = 64 zones?

Best regards - Staffan Hagnell

Paul Hoffman / IMC wrote:
>
> At 6:52 PM +0800 3/26/03, Erin Chen wrote:
>
>> Thanks Paul for propose draft-hoffman-idn-reg-00.txt .
>> I have expressed some of my opinions, and I would like to know others
>> comments.
>
>
> Yes, it would be great to have more comments here.
> >> ----------cut------------ >> $ORIGIN example.com. >> pale IN NS x.example.com. >> pale IN NS y.example.com. >> pa1e IN DNAME pale.example.com. >> ----------cut------------- >> As I know DNAME is support by BIND9 not BIND8, is it possible to >> let the users know if do not use DNAME the would have possible serious >> delegation problem? Such as there might be too much zone file have to >> maintain bye users, user has to keep the variant labels setting >> consistent. > > > I don't understand your question. Why would there be "serious > delegation problems"? Having "too many zone files" is not a problem if > the users have the tools to manage them. If the users don't have the > tools, then the additional zones would give bad information, but that > will be fixed when users complain. This isn't any different than the > current DNS when people make mistakes that affect users. The big point > is that the effects can only affect the owner of the registration > bundle, and that a cybersquatter can't cause the problems. > > --Paul Hoffman, Director > --Internet Mail Consortium From owner-idn-reg-policy Wed Mar 26 14:28:49 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.6) id h2QMSn813219 for idn-reg-policy-bks; Wed, 26 Mar 2003 14:28:49 -0800 (PST) Received: from [63.202.92.152] (adsl-63-202-92-152.dsl.snfc21.pacbell.net [63.202.92.152]) by above.proper.com (8.11.6/8.11.6) with ESMTP id h2QMSjg13211; Wed, 26 Mar 2003 14:28:45 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: In-Reply-To: <3E820CB0.1000605@iis.se> References: <3E818686.6010307@twnic.net.tw> <3E820CB0.1000605@iis.se> X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). 
The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . Date: Wed, 26 Mar 2003 14:28:50 -0800 To: Staffan Hagnell From: Paul Hoffman / IMC Subject: Re: New Internet Draft on registering IDNs Cc: idn-reg-policy@imc.org Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 9:25 PM +0100 3/26/03, Staffan Hagnell wrote: >What happens when the Allocation all labels policy is used and the >input label contains several characters with equivalents? > >E.g. if there 6 base characters with one variant" each in the >input label, will the owner of the registration have to keep 2 >raised to 6 = 64 zones? Exactly right. If this wasn't clear from the draft, please suggest some wording to make it more so. 
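The count confirmed here can be checked mechanically. The following Python sketch is not taken from the draft. It assumes a hypothetical one-entry variant table in which DIGIT ONE is the only variant of LATIN SMALL LETTER L (the pairing behind the draft's pale/pa1e example), and the function name registration_bundle is made up for illustration. It enumerates every label obtained by substituting variants, which is the set that an "allocate all labels" registry would hand to the registrant.

```python
from itertools import product

# Hypothetical variant table for illustration only: base character -> variants.
# The l/1 pairing mirrors the pale/pa1e example; nothing else is assumed.
VARIANTS = {"l": ["1"]}


def registration_bundle(label, variants):
    """Return every label formed by replacing characters with their variants."""
    # For each position, the choices are the character itself plus its variants.
    choices = [[ch] + variants.get(ch, []) for ch in label]
    # The bundle is the Cartesian product of the per-position choices.
    return ["".join(combo) for combo in product(*choices)]


bundle = registration_bundle("llllll", VARIANTS)
print(len(bundle))  # 2 ** 6 == 64 labels, hence 64 zones if each is delegated
```

With n characters that each have one variant, the bundle holds 2 to the power n labels, so the number of zones doubles with every additional such character.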
--Paul Hoffman, Director
--Internet Mail Consortium

From owner-idn-reg-policy Wed Mar 26 23:33:25 2003
Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.6) id h2R7XPr08304 for idn-reg-policy-bks; Wed, 26 Mar 2003 23:33:25 -0800 (PST)
Received: from holodoc.allard.nu (postfix@holodoc.allard.nu [212.112.184.131]) by above.proper.com (8.11.6/8.11.6) with ESMTP id h2R7XMg08298; Wed, 26 Mar 2003 23:33:23 -0800 (PST)
Received: from iis.se (as2-6-1.nvik.s.bonet.se [217.215.75.3]) by holodoc.allard.nu (Postfix) with ESMTP id 0BE0D96D3; Thu, 27 Mar 2003 08:33:21 +0100 (CET)
Message-ID: <3E82A944.6040703@iis.se>
Date: Thu, 27 Mar 2003 08:33:24 +0100
From: Staffan Hagnell
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.0.2) Gecko/20030208 Netscape/7.02
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: Paul Hoffman / IMC
Cc: idn-reg-policy@imc.org
Subject: Re: New Internet Draft on registering IDNs
References: <3E818686.6010307@twnic.net.tw> <3E820CB0.1000605@iis.se>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
Sender: owner-idn-reg-policy@mail.imc.org
Precedence: bulk

Well, my point is that the "Allocate all labels" policy could be a bit messy for the owner. Therefore it would be proper to have some sort of information about that, for example:

If the input label contains several characters that have equivalents, the owner could end up having to take care of a large number of zones. For instance, if DIGIT ONE is a variant of LATIN SMALL LETTER L, the owner of the domain name llllll.com will have to manage 64 zones.

-Staffan

Paul Hoffman / IMC wrote:
>
> At 9:25 PM +0100 3/26/03, Staffan Hagnell wrote:
>
>> What happens when the Allocation all labels policy is used and
>> the input label contains several characters with equivalents?
>>
>> E.g.
if there 6 base characters with one variant" each in the >> input label, will the owner of the registration have to keep 2 raised >> to 6 = 64 zones? > > > Exactly right. If this wasn't clear from the draft, please suggest > some wording to make it more so. > > --Paul Hoffman, Director > --Internet Mail Consortium From owner-idn-reg-policy Thu Mar 27 06:34:39 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.6) id h2REYdd18370 for idn-reg-policy-bks; Thu, 27 Mar 2003 06:34:39 -0800 (PST) Received: from maya20.nic.fr (maya20.nic.fr [192.134.4.152]) by above.proper.com (8.11.6/8.11.6) with ESMTP id h2REYbg18362; Thu, 27 Mar 2003 06:34:37 -0800 (PST) Received: from vespucci.nic.fr (postfix@vespucci.nic.fr [192.134.4.68]) by maya20.nic.fr (8.12.4/8.12.4) with ESMTP id h2REYMNn1202586; Thu, 27 Mar 2003 15:34:22 +0100 (CET) Received: by vespucci.nic.fr (Postfix, from userid 1055) id C5468110F0; Thu, 27 Mar 2003 15:34:25 +0100 (CET) Date: Thu, 27 Mar 2003 15:34:25 +0100 From: Stephane Bortzmeyer To: Paul Hoffman / IMC Cc: idn-reg-policy@imc.org Subject: Re: New Internet Draft on registering IDNs Message-ID: <20030327143425.GA11197@nic.fr> References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.3.28i X-Operating-System: Debian GNU/Linux 3.0 X-Kernel: Linux 2.4.18-686 i686 Organization: NIC France X-URL: http://www.nic.fr/ Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: On Tue, Mar 25, 2003 at 09:31:09AM -0800, Paul Hoffman / IMC wrote a message of 16 lines which said: > Greetings. I have just submitted a new Internet Draft that gives > suggestions on how to register IDNs. You can find a link to the draft > at the web site for this mailing list at > . Comments are, of course, > welcome. 
Well, to summarize, I find it quite good, much simpler and more understandable than draft-jseng-idn-admin-02, especially for non-CJK countries.

> A "string" is an ordered set of one or more characters.
>
> This document discusses characters that have equivalent or
> near-equivalent characters or strings. The "base character" is the

Shouldn't we use "code point" instead of "character"?

> If the base character has more than one variant, the variants
> are separated by a colon (":", ASCII 0x3A). Strings are given without
> any intervening spaces
...
> U+2202|U+0064;U+03B4
               ^
Isn't it a typo?

> A registry has three options for how to handle the case where
> the registration bundle has more than one label. The policy options are:
>
> 1) Allocate all labels to the same registrant, making
> the zone information identical to that of the input label.
>
> 2) Block all labels so they cannot be registered in the
> future.
>
> 3) Allocate some labels and block some other labels.

This entire scheme does not discuss financial issues. For instance, in Option 1, it will mean that a registrant will get more labels than he paid for. The registry will not be happy :-) In Option 2, OTOH, it means that there is no option for the registrant to activate some variants.

Do you think this case (all variants are blocked, and some are allocated to the same registrant if he chooses so and if he pays, maybe at a lower price than for a "real" domain) is covered by Option 3? If so, I suggest rewriting it to make it clearer.

> Option 3 is likely to cause the most confusion with users because
> including some variants will cause a name to be found, bout using
> other variants will cause the name to be not found.

If the variants actually allocated are chosen by the registrant, it is up to her to minimize confusion.
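The table syntax debated in this thread (a base character, then "|", then variants separated by ":", with multi-character strings written as code points run together) is simple enough to parse in a few lines. The Python sketch below is only an illustration of that reading of the format; the function names and the strictness of the validation are assumptions, not anything the draft specifies. Under this reading, the line with ";" is rejected, which matches the conclusion that it is a typo, while U+003AU+003A is accepted as the two-character string "::".

```python
import re

# One or more code points written as U+XXXX with no delimiter between them.
CODE_POINT_RUN = re.compile(r"(?:U\+[0-9A-Fa-f]{4,6})+")
CODE_POINT = re.compile(r"U\+([0-9A-Fa-f]{4,6})")


def parse_code_points(text):
    """Convert a run such as 'U+003AU+003A' into the string '::'."""
    if not CODE_POINT_RUN.fullmatch(text):
        raise ValueError(f"not a run of code points: {text!r}")
    return "".join(chr(int(hexval, 16)) for hexval in CODE_POINT.findall(text))


def parse_variant_table(lines):
    """Map each base character to its list of variant strings."""
    table = {}
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        # Everything before the first '|' is the base; the rest holds variants.
        base, _, rest = line.partition("|")
        variants = [parse_code_points(v) for v in rest.split(":")] if rest else []
        table[parse_code_points(base)] = variants
    return table


# The example table from the thread, with the ';' corrected to ':'.
EXAMPLE = """\
U+2200
U+2201|U+0043
U+2237|U+003AU+003A
U+2202|U+0064:U+03B4
"""

table = parse_variant_table(EXAMPLE.splitlines())
```

A stricter parser could also insist that the base is exactly one code point; this sketch leaves that unchecked.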
From owner-idn-reg-policy Thu Mar 27 06:53:30 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.6) id h2RErUa20241 for idn-reg-policy-bks; Thu, 27 Mar 2003 06:53:30 -0800 (PST) Received: from [63.202.92.152] (adsl-63-202-92-156.dsl.snfc21.pacbell.net [63.202.92.156]) by above.proper.com (8.11.6/8.11.6) with ESMTP id h2RErQg20228; Thu, 27 Mar 2003 06:53:26 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: In-Reply-To: <3E82A944.6040703@iis.se> References: <3E818686.6010307@twnic.net.tw> <3E820CB0.1000605@iis.se> <3E82A944.6040703@iis.se> X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . Date: Thu, 27 Mar 2003 06:53:20 -0800 To: Staffan Hagnell From: Paul Hoffman / IMC Subject: Re: New Internet Draft on registering IDNs Cc: idn-reg-policy@imc.org Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 8:33 AM +0100 3/27/03, Staffan Hagnell wrote: >Well, my point is that the "Allocation all labels" policy could be a >bit messy for the owner. Therefore it would be proper to have some >sort of information about that, for example: > >If the input label contains several characters that have >equivalents, the owner could end up having to take care of large >number of zones. For instance, if DIGIT ONE is a variant of LATIN >SMALL LETTER L, the owner of the domain name llllll.com will have to >manage 64 zones. Sounds fine (with a slight rewording to "llllll.example.com"). 
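The figure of 64 zones comes from multiplying, for each character of
the input label, the number of forms it can take (the base character
plus its variants). A minimal Python sketch follows, assuming a
hypothetical in-memory table `VARIANTS` in which DIGIT ONE is the only
variant of LATIN SMALL LETTER L; the names `VARIANTS`, `choices`,
`bundle_size` and `bundle` are illustrative and not taken from the
draft.

```python
from itertools import product
from math import prod

# Hypothetical variant table: base character -> list of variants.
# Assumption for illustration only: "1" is the single variant of "l".
VARIANTS = {"l": ["1"]}


def choices(ch):
    """Return the base character followed by its variants, if any."""
    return [ch] + VARIANTS.get(ch, [])


def bundle_size(label):
    """Count the labels in the bundle without enumerating them."""
    return prod(len(choices(ch)) for ch in label)


def bundle(label):
    """Enumerate every label in the bundle of `label`."""
    return ["".join(combo) for combo in product(*(choices(ch) for ch in label))]


print(bundle_size("llllll"))  # 2 ** 6 = 64
print(bundle("ll"))           # ['ll', 'l1', '1l', '11']
```

Because the size is a product over the characters of the label, every
additional character that has variants multiplies the number of zones
the owner has to manage.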
I'll add a parallel warning under the blocking section with a
different example.

If the input label contains characters that have equivalents, Internet
users who don't know which base characters were used in the
registration will not know what character to type in to get a DNS
response. For instance, if DIGIT ONE is a variant of LATIN SMALL LETTER
L, and LATIN SMALL LETTER L is a variant of DIGIT ONE, the user who
sees "pale.example.com" will not know whether to type a "1" or an "l"
after the "pa" in the first label.

--Paul Hoffman, Director
--Internet Mail Consortium

From owner-idn-reg-policy Thu Mar 27 07:46:26 2003
Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.6) id h2RFkQe25059 for idn-reg-policy-bks; Thu, 27 Mar 2003 07:46:26 -0800 (PST)
Received: from [63.202.92.152] (adsl-63-202-92-156.dsl.snfc21.pacbell.net [63.202.92.156]) by above.proper.com (8.11.6/8.11.6) with ESMTP id h2RFkNg25054; Thu, 27 Mar 2003 07:46:23 -0800 (PST)
Mime-Version: 1.0
X-Sender: phoffman@mail.imc.org
Message-Id:
In-Reply-To: <20030327143425.GA11197@nic.fr>
References: <20030327143425.GA11197@nic.fr>
X-Habeas-SWE-1: winter into spring
X-Habeas-SWE-2: brightly anticipated
X-Habeas-SWE-3: like Habeas SWE (tm)
X-Habeas-SWE-4: Copyright 2002 Habeas (tm)
X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this
X-Habeas-SWE-6: email in exchange for a license for this Habeas
X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant
X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this
X-Habeas-SWE-9: mark in spam to .
Date: Thu, 27 Mar 2003 07:46:20 -0800
To: Stephane Bortzmeyer
From: Paul Hoffman / IMC
Subject: Re: New Internet Draft on registering IDNs
Cc: idn-reg-policy@imc.org
Content-Type: text/plain; charset="us-ascii" ; format="flowed"
Sender: owner-idn-reg-policy@mail.imc.org
Precedence: bulk
List-Archive:
List-Unsubscribe:
List-ID:

At 3:34 PM +0100 3/27/03, Stephane Bortzmeyer wrote:
>Well, to summarize, I find it quite good, much simpler and more
>understandable than draft-jseng-idn-admin-02, especially for non-CJK
>countries.

Thanks!

> > A "string" is an ordered set of one or more characters.
>>
>> This document discusses characters that have equivalent or
>> near-equivalent characters or strings. The "base character" is the
>
>Shouldn't we use "code point" instead of "character"?

Character is more understandable, and I don't see any place in the
document where a character wouldn't be considered its code point, but
I'm open to changing it.

> > If the base character has more than one variant, the variants
>> are separated by a colon (":", ASCII 0x3A). Strings are given without
>> any intervening spaces
>...
>> U+2202|U+0064;U+03B4
>               ^
> Isn't it a typo?

Yes.

> > A registry has three options for how to handle the case where
>> the registration bundle has more than one label. The policy options are:
>>
>> 1) Allocate all labels to the same registrant, making
>> the zone information identical to that of the input label.
>>
>> 2) Block all labels so they cannot be registered in the
>> future.
>>
>> 3) Allocate some labels and block some other labels.
>
>This entire scheme does not discuss financial issues.

Correct. It would be unwise to predict how registries will act on
this, especially over time.

> For instance, in
>Option 1, it will mean that a registrant will get more labels than he
>paid for. The registry will not be happy :-)

That's not necessarily true. Registries can decide whether or not to
charge more for names with bundles.
Note that under options 2 and 3, the registry cannot make money from
the blocked labels, so it might charge for bundles regardless of which
option it chooses.

>In Option 2, OTOH, it means that there is no option for the registrant
>to activate some variants.

Right.

> Do you think this case (all variants are
>blocked and some are allocated to the same registrant, if he chooses
>so and if he pays, maybe a smaller price than a "real" domain) is
>covered by Option 3?

That wasn't my intention, but it is an interesting thought. My
intention for option 3, which is what is discussed in the JET document,
is that the registry decides, using the table, which names are
allocated and which are blocked.

>If the variants actually allocated are chosen by the registrant, it
>is up to her to minimize confusion.

You are proposing something different: the registry allows the
registrant to pick which names from the bundle go in either category.
It will still have the same problem as option 3, namely that typical
users of the DNS won't be able to predict what they should type, but it
sounds "nicer" than option 3.

What do other folks on this list think of that idea?
--Paul Hoffman, Director --Internet Mail Consortium From owner-idn-reg-policy Thu Mar 27 15:11:57 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.6) id h2RNBvC19770 for idn-reg-policy-bks; Thu, 27 Mar 2003 15:11:57 -0800 (PST) Received: from dgesmtp01.wcom.com (dgesmtp01.wcom.com [199.249.16.16]) by above.proper.com (8.11.6/8.11.6) with ESMTP id h2RNBpg19760; Thu, 27 Mar 2003 15:11:51 -0800 (PST) Received: from dgismtp02.wcomnet.com ([166.38.58.142]) by firewall.wcom.com (Iplanet MTA) with ESMTP id <0HCF00B0FK9SJQ@firewall.wcom.com>; Thu, 27 Mar 2003 23:08:16 +0000 (GMT) Received: from dgismtp02.wcomnet.com by dgismtp02.wcomnet.com (iPlanet Messaging Server 5.1 HotFix 0.7 (built May 7 2002)) with SMTP id <0HCF00701K9R8B@dgismtp02.wcomnet.com>; Thu, 27 Mar 2003 23:08:16 +0000 (GMT) Received: from vint.wcom.com ([166.50.135.56]) by dgismtp02.wcomnet.com (iPlanet Messaging Server 5.1 HotFix 0.7 (built May 7 2002)) with ESMTP id <0HCF0058UK9MZX@dgismtp02.wcomnet.com>; Thu, 27 Mar 2003 23:08:15 +0000 (GMT) Date: Thu, 27 Mar 2003 14:40:47 -0300 From: "vinton g. cerf" Subject: Re: New Internet Draft on registering IDNs In-reply-to: X-Sender: vcerf@pop.wcomnet.com To: Paul Hoffman / IMC , Stephane Bortzmeyer Cc: idn-reg-policy@imc.org Message-id: <5.2.0.9.2.20030327163511.02533da0@pop.wcomnet.com> MIME-version: 1.0 X-Mailer: QUALCOMM Windows Eudora Version 5.2.0.9 Content-type: text/plain; charset=us-ascii References: <20030327143425.GA11197@nic.fr> <20030327143425.GA11197@nic.fr> Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: paul, et al, the alternative of allowing the registrant to adjust the subset that is registered from the "bundle" has the nice property that it is the registrant's choice (and responsibility) for the resulting ease of use or lack thereof. I am not sure what to say about the economic position to be taken and perhaps this could be best left to the TLD operator. 
Excessive "greed", if you will pardon the use of the word, might prove
to be a poor business choice, so there might be some balance between a
single price regardless of the size of the selected bundle subset and a
price equal to registering N distinct SLDs.

Intuition is hard to rely on here since the properties of different
languages and chosen rules for "equivalence" will lead to quite a
variety of different cases, I would think.

Vint

(p.s. I am participating in this discussion as an interested party but
I hope no one misunderstands any of my opinions as representative of
the ICANN board or its IDN committee)

At 07:46 AM 3/27/2003 -0800, Paul Hoffman / IMC wrote:
>You are proposing something different: the registry allows the registrant to pick which names from the bundle go in either category. It will still have the same problem as option 3, namely that typical users of the DNS won't be able to predict what they should type, but it sounds "nicer" than option 3.
>
>What do other folks on this list think of that idea?
Vint Cerf SVP Architecture & Technology WorldCom 22001 Loudoun County Parkway, F2-4115 Ashburn, VA 20147 703 886 1690 (v806 1690) 703 886 0047 fax From owner-idn-reg-policy Thu Mar 27 17:23:36 2003 Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.6) id h2S1NaL28744 for idn-reg-policy-bks; Thu, 27 Mar 2003 17:23:36 -0800 (PST) Received: from [63.202.92.152] (adsl-63-202-92-152.dsl.snfc21.pacbell.net [63.202.92.152]) by above.proper.com (8.11.6/8.11.6) with ESMTP id h2S1NRg28734; Thu, 27 Mar 2003 17:23:27 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: In-Reply-To: <5.2.0.9.2.20030327163511.02533da0@pop.wcomnet.com> References: <20030327143425.GA11197@nic.fr> <20030327143425.GA11197@nic.fr> <5.2.0.9.2.20030327163511.02533da0@pop.wcomnet.com> X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . Date: Thu, 27 Mar 2003 17:23:24 -0800 To: "vinton g. cerf" , Stephane Bortzmeyer From: Paul Hoffman / IMC Subject: Re: New Internet Draft on registering IDNs Cc: idn-reg-policy@imc.org Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 2:40 PM -0300 3/27/03, vinton g. cerf wrote: >the alternative of allowing the registrant to adjust the subset that >is registered from the "bundle" has the nice property that it is the >registrant's choice (and responsibility) for the resulting ease of >use or lack thereof. Indeed. 
Novice registrants could start with "let me handle the most likely name and block the rest", and then migrate to fuller use when they are ready. Within a few years, when registrant-level bundle handlers are common, users will probably mostly go towards all-resolving. In thinking about this more, there are some downsides as well: - Much more of a hassle for the registry because now they have to manage the individual elements of the bundle manually instead of automatically - Much less predictable to DNS users of the zone > I am not sure what to say about the economic position to be taken >and perhaps this could be best left to the TLD operator. Right. RFCs usually don't talk about economics. > Excessive "greed" if you will pardon the use of the word, might >prove to be a poor business choice, so there might be some balance >between a single price regardless of the size of the selected bundle >subset and a price equal to registering N distinct SLDs. I think any registry reading this document (or the JET document, or others) will be able to quickly figure out the costs associated with the work they are taking on. We don't need to list it for them any more than BGP documents talk about the financial aspects of routing choices. >Intuition is hard to rely on here since the properties of different >languages and chosen rules for "equivalence" will lead to quite a >variety of different cases, I would think. Which is exactly the reason this document is more generic than the JET document. 
--Paul Hoffman, Director
--Internet Mail Consortium

From owner-idn-reg-policy Fri Mar 28 08:51:31 2003
Received: (from majordomo@localhost) by above.proper.com (8.11.6/8.11.6) id h2SGpVe07228 for idn-reg-policy-bks; Fri, 28 Mar 2003 08:51:31 -0800 (PST)
Received: from [63.202.92.152] (adsl-63-202-92-152.dsl.snfc21.pacbell.net [63.202.92.152]) by above.proper.com (8.11.6/8.11.6) with ESMTP id h2SGpTg07223 for ; Fri, 28 Mar 2003 08:51:29 -0800 (PST)
Mime-Version: 1.0
X-Sender: phoffman@mail.imc.org
Message-Id:
X-Habeas-SWE-1: winter into spring
X-Habeas-SWE-2: brightly anticipated
X-Habeas-SWE-3: like Habeas SWE (tm)
X-Habeas-SWE-4: Copyright 2002 Habeas (tm)
X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this
X-Habeas-SWE-6: email in exchange for a license for this Habeas
X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant
X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this
X-Habeas-SWE-9: mark in spam to .
Date: Fri, 28 Mar 2003 08:51:28 -0800
To: idn-reg-policy@imc.org
From: Paul Hoffman / IMC
Subject: Updates to the web site
Content-Type: text/plain; charset="us-ascii" ; format="flowed"
Sender: owner-idn-reg-policy@mail.imc.org
Precedence: bulk
List-Archive:
List-Unsubscribe:
List-ID:

I have updated the web site for this discussion () with pointers to two
more documents:

- Japanese characters in Internationalized Domain Name label
  (draft-yoneya-idn-jpchar)

- Chinese Domain Name Consortium (CDNC) Status Update for IDN, a
  PowerPoint presentation made at the ICANN meeting in March, 2003

When there are any concrete documents from ICANN about the IDN
registration policy discussions this week, I will post them.
--Paul Hoffman, Director --Internet Mail Consortium From owner-idn-reg-policy Mon Mar 31 04:59:24 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h2VCxOJM013593 for ; Mon, 31 Mar 2003 04:59:24 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h2VCxOH9013592 for idn-reg-policy-bks; Mon, 31 Mar 2003 04:59:24 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from sina.sharif.edu (sina.Sharif.AC.IR [194.225.40.9]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h2VCx9JM013471; Mon, 31 Mar 2003 04:59:12 -0800 (PST) Received: from bamdad.org (IDENT:root@bamdad.org [81.31.160.190]) by sina.sharif.edu (8.11.6/8.11.6) with ESMTP id h2VCugF26734; Mon, 31 Mar 2003 17:26:50 +0430 Received: from localhost (roozbeh@localhost) by bamdad.org (8.11.6/8.11.6) with ESMTP id h2VD5ia07732; Mon, 31 Mar 2003 17:36:01 +0430 X-Authentication-Warning: gilas.bamdad.org: roozbeh owned process doing -bs Date: Mon, 31 Mar 2003 17:35:44 +0430 (IRST) From: Roozbeh Pournader X-X-Sender: roozbeh@gilas.bamdad.org To: Paul Hoffman / IMC cc: Stephane Bortzmeyer , Subject: Re: New Internet Draft on registering IDNs In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Virus-Scanned: by amavisd-milter (http://amavis.org/) Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: On Thu, 27 Mar 2003, Paul Hoffman / IMC wrote: > At 3:34 PM +0100 3/27/03, Stephane Bortzmeyer wrote: > >Well, to summary, I find it quite good, much simpler and more > >understandable than draft-jseng-idn-admin-02, specially for non-CJK > >countries. > > Thanks! I guess Stephane really means LGC (Latin, Greek, Cyrillic, ...) countries. The requirements of Indic, Hebrew and Arabic scripts in IDN may prove to be more complex than CJK. 
roozbeh From owner-idn-reg-policy Mon Mar 31 06:10:23 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h2VEANJM017935 for ; Mon, 31 Mar 2003 06:10:23 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h2VEAN3E017934 for idn-reg-policy-bks; Mon, 31 Mar 2003 06:10:23 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from sina.sharif.edu (sina.Sharif.AC.IR [194.225.40.9]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h2VEAFJM017927 for ; Mon, 31 Mar 2003 06:10:17 -0800 (PST) Received: from bamdad.org (IDENT:root@bamdad.org [81.31.160.190]) by sina.sharif.edu (8.11.6/8.11.6) with ESMTP id h2VEAFF19664 for ; Mon, 31 Mar 2003 18:40:15 +0430 Received: from localhost (roozbeh@localhost) by bamdad.org (8.11.6/8.11.6) with ESMTP id h2VEJuL08310 for ; Mon, 31 Mar 2003 18:49:56 +0430 X-Authentication-Warning: gilas.bamdad.org: roozbeh owned process doing -bs Date: Mon, 31 Mar 2003 18:49:56 +0430 (IRST) From: Roozbeh Pournader X-X-Sender: roozbeh@gilas.bamdad.org To: IDN registration policy list Subject: Comparison of hoffman-idn-reg and jseng-idn-admin Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Virus-Scanned: by amavisd-milter (http://amavis.org/) Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: With a personal and an Arabic script mindset, this is just a basic comparison: I like hoffman-idn-reg, since it's short and provides a solution for simple Latin-like scripts. I can point someone not very technical to the I-D and have him understand it. I also like it, since it sometimes tries not to be ignorant, and mentions things like: A registry MUST NOT blindly combine multiple tables which have overlapping equivalences. 
Instead, the registry MUST carefully analyze every instance in the
combined table where a base character has one or more different
variants and select the desired set of variants for the base character.

(But unfortunately it doesn't suggest any guidelines for doing so.)

Unfortunately, the list ends here. Specifically, there are features
that are *required* for Arabic but are missing in the language of the
tables. hoffman-idn-reg has some chance to become better, but it needs
to address these areas:

1. Mandatory equivalences as opposed to secondary/variant equivalences.
This feature is necessary for defining equivalences between European
and Arabic-Indic digit shapes in Arabic labels, for example. Many
software platforms like MS Windows have features that will
automatically change the shape of European digits to Arabic-Indic ones
in an Arabic context. Also, some Arabic countries only use one form of
the digit set (and only have this set on their keyboards), while others
use the other (and only have that set on their keyboards). This is a
feature that *all* Arabic script zones *require*, contrary to optional
features like reservation of a label with very similar characters that
may create cybersquatting problems and *may* be solved by just limiting
the repertoire to a shorter list of characters.

2. Clear language about conflict resolution. There need to be clear
guidelines or recommendations for the cases where the variant labels
associated with two registered labels intersect. This will happen with
almost any multi-language Arabic-script zone (e.g. U+0649 vs U+064A vs
U+06CC).

3. Clear language with specific guidelines and real-life examples for
merging tables for different languages/locales.

4. Better syntax for the table. Don't you agree that a
U+ABCDU+BCDAU+CDAB syntax is unreadable? Why can't one use a space?
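As background to point 4: the table syntax quoted earlier in this
thread puts the base character first, then "|", then the variants
separated by ":", and writes a multi-character string as consecutive
U+XXXX code points without spaces. A rough Python sketch of a parser
for one such line follows. The function names are hypothetical, the
sample lines are illustrative, and the assumption that a code point is
written with four to six hex digits belongs to this sketch, not to the
draft.

```python
import re

# One code point such as U+0061; 4 to 6 hex digits is an assumption here.
CODE_POINT = re.compile(r"U\+([0-9A-Fa-f]{4,6})")
# A string is one or more code points with nothing in between.
STRING = re.compile(r"(?:U\+[0-9A-Fa-f]{4,6})+")


def parse_string(token):
    """Turn 'U+006FU+0065' into the string it denotes ('oe')."""
    if not STRING.fullmatch(token):
        raise ValueError(f"malformed string: {token!r}")
    return "".join(chr(int(h, 16)) for h in CODE_POINT.findall(token))


def parse_line(line):
    """Parse 'base|variant:variant' into (base, [variants]).

    Returns None for blank lines and '#' comment lines.
    """
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    base, _, rest = line.partition("|")
    variants = [parse_string(v) for v in rest.split(":")] if rest else []
    return parse_string(base), variants


print(parse_line("U+2202|U+0064:U+03B4"))  # two single-character variants
print(parse_line("U+0153|U+006FU+0065"))   # one two-character variant
print(parse_line("U+0062"))                # no variants
```

Allowing a space between the code points of a string, as point 4
suggests, would only require relaxing the two patterns at the top.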
Having said all that, I still recommend helping to make the language of jseng-idn-admin more understandable to a western mindset (and possibly making it two I-Ds, one general and one CJK-only), than trying to make hoffman-idn-reg a second jseng-idn-admin. jseng-idn-admin is much better and clearer in all the above four points, and even provides an exact algorithm for point 3. roozbeh From owner-idn-reg-policy Mon Mar 31 07:28:11 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h2VFSBJM022271 for ; Mon, 31 Mar 2003 07:28:11 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h2VFSBda022270 for idn-reg-policy-bks; Mon, 31 Mar 2003 07:28:11 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from [63.202.92.152] (adsl-63-202-92-152.dsl.snfc21.pacbell.net [63.202.92.152]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h2VFRsJN022229; Mon, 31 Mar 2003 07:27:55 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: In-Reply-To: References: X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . 
Date: Mon, 31 Mar 2003 07:27:41 -0800 To: Roozbeh Pournader From: Paul Hoffman / IMC Subject: Re: New Internet Draft on registering IDNs Cc: Stephane Bortzmeyer , Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 5:35 PM +0430 3/31/03, Roozbeh Pournader wrote: >I guess Stephane really means LGC (Latin, Greek, Cyrillic, ...) countries. >The requirements of Indic, Hebrew and Arabic scripts in IDN may prove to >be more complex than CJK. Indeed. That's the reason we need much more scrutiny of the different proposals. --Paul Hoffman, Director --Internet Mail Consortium From owner-idn-reg-policy Mon Mar 31 08:08:13 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h2VG8DJM026331 for ; Mon, 31 Mar 2003 08:08:13 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h2VG8DLD026330 for idn-reg-policy-bks; Mon, 31 Mar 2003 08:08:13 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from maya20.nic.fr (maya20.nic.fr [192.134.4.152]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h2VG86JM026320; Mon, 31 Mar 2003 08:08:11 -0800 (PST) Received: from vespucci.nic.fr (postfix@vespucci.nic.fr [192.134.4.68]) by maya20.nic.fr (8.12.4/8.12.4) with ESMTP id h2VG7rMn1260031; Mon, 31 Mar 2003 18:07:53 +0200 (CEST) Received: by vespucci.nic.fr (Postfix, from userid 1055) id E41F3110F0; Mon, 31 Mar 2003 18:07:56 +0200 (CEST) Date: Mon, 31 Mar 2003 18:07:56 +0200 From: Stephane Bortzmeyer To: Paul Hoffman / IMC Cc: idn-reg-policy@imc.org Subject: Re: New Internet Draft on registering IDNs Message-ID: <20030331160756.GA15352@nic.fr> References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.3.28i X-Operating-System: 
Debian GNU/Linux 3.0 X-Kernel: Linux 2.4.18-686 i686 Organization: NIC France X-URL: http://www.nic.fr/ Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: On Tue, Mar 25, 2003 at 09:31:09AM -0800, Paul Hoffman / IMC wrote a message of 16 lines which said: > Greetings. I have just submitted a new Internet Draft that gives > suggestions on how to register IDNs. There is a big problem with the bundle approach described in this draft, a problem I discovered while trying to implement it. I designed a table, as specified in the draft, for the French language (for multi-lingual countries or for multinational domains like '.eu', the problem will be worse). My table may be too large, allowing all Latin-1 (ISO-8859-1) characters, even those not used in French but it is still too small for the entire European Union. With this table, I experience a dramatic explosion. Many domains generate a bundle of several thousands of names, sometimes more. If I implement Option 1 of the draft ("Allocate all labels to the same registrant, making the zone information identical to that of the input label."), my zone file will explode :-) The problem, as I see it, is that the draft uses variant tables which work on a per-character basis, without any concern that not all combinations mean something. Most of the words in a bundle have no meaning in French and there is no real reason to keep them. I understand that there is no easy option (using a dictionary will not work since many domain names are not in any dictionary). Did anyone try a bundle approach on his zone? Here is the table, for those interested. It is simply "accent-insensitive". I regard any composed character as a variant of the plain character. 
# Variant table for the French language
# See Internet-Draft draft-hoffman-idn-reg-00
#
# Designed at AFNIC
# Stephane Bortzmeyer
# $Id$
# a-z
# a
U+0061|U+00E0:U+00E1:U+00E2:U+00E3:U+00E4:U+00E5
U+0062
U+0063
U+0064
# e
U+0065|U+00E8:U+00E9:U+00EA:U+00EB
U+0066
U+0067
U+0068
# i
U+0069|U+00EC:U+00ED:U+00EE:U+00EF
U+006A
U+006B
U+006C
U+006D
U+006E
# o
U+006F|U+00F2:U+00F3:U+00F4:U+00F5:U+00F6:U+00F8
U+0070
U+0071
U+0072
U+0073
U+0074
# u
U+0075|U+00F9:U+00FA:U+00FB:U+00FC
U+0076
U+0077
U+0078
U+0079
U+007A
# 0-9
U+0030
U+0031
U+0032
U+0033
U+0034
U+0035
U+0036
U+0037
U+0038
U+0039
# - (hyphen)
U+002D
# Ligature oe
U+0153|U+006FU+0065

From owner-idn-reg-policy Mon Mar 31 12:00:27 2003
Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h2VK0RJM006093 for ; Mon, 31 Mar 2003 12:00:27 -0800 (PST)
Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h2VK0R8t006091 for idn-reg-policy-bks; Mon, 31 Mar 2003 12:00:27 -0800 (PST)
X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f
Received: from mail-aubervilliers.netaktiv.com (soyouz.netaktiv.com [80.67.162.6]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h2VK0EJM005962; Mon, 31 Mar 2003 12:00:25 -0800 (PST)
Received: by mail-aubervilliers.netaktiv.com (Postfix, from userid 10) id 3D4E523EBF; Mon, 31 Mar 2003 22:00:07 +0200 (CEST)
Received: from sources.org (stephane@localhost [127.0.0.1]) by ludwigV.sources.org (8.12.3/8.12.3/Debian -4) with ESMTP id h2VJuD6v006142; Mon, 31 Mar 2003 21:56:13 +0200
Message-Id: <200303311956.h2VJuD6v006142@ludwigV.sources.org>
X-Mailer: exmh version 2.5 07/13/2001 (debian 2.5-1) with nmh-1.0.4+dev
To: Paul Hoffman / IMC
Subject: Re: New Internet Draft on registering IDNs
From: Stephane Bortzmeyer
Cc: idn-reg-policy@imc.org
Organization: NIC France
Date: Mon, 31 Mar 2003 21:56:13 +0200
Sender: owner-idn-reg-policy@mail.imc.org
Precedence: bulk
List-Archive:
List-Unsubscribe:
List-ID:

On Tue, Mar 25, 2003 at 09:31:09AM -0800, Paul Hoffman / IMC wrote a message of 16 lines which said:

> Greetings. I have just submitted a new Internet Draft that gives
> suggestions on how to register IDNs.

There is a big problem with the bundle approach described in this
draft, a problem I discovered while trying to implement it.

I designed a table, as specified in the draft, for the French language
(for multi-lingual countries or for multinational domains like '.eu',
the problem will be worse).
My table may be too large, allowing all Latin-1 (ISO-8859-1) characters, even those not used in French but it is still too small for the entire European Union. With this table, I experience a dramatic explosion. Many domains generate a bundle of several thousands of names, sometimes more. If I implement Option 1 of the draft ("Allocate all labels to the same registrant, making the zone information identical to that of the input label."), my zone file will explode :-) The problem, as I see it, is that the draft uses variant tables which work on a per-character basis, without any concern that not all combinations mean something. Most of the words in a bundle have no meaning in French and there is no real reason to keep them. I understand that there is no easy option (using a dictionary will not work since many domain names are not in any dictionary). Did anyone try a bundle approach on his zone? Here is the table, for those interested. It is simply "accent-insensitive". I regard any composed character as a variant of the plain character. 
# Variant table for the French language
# See Internet-Draft draft-hoffman-idn-reg-00
#
# Designed at AFNIC
# Stephane Bortzmeyer
# $Id$

# a-z
# a
U+0061|U+00E0:U+00E1:U+00E2:U+00E3:U+00E4:U+00E5
U+0062
U+0063
U+0064
# e
U+0065|U+00E8:U+00E9:U+00EA:U+00EB
U+0066
U+0067
U+0068
# i
U+0069|U+00EC:U+00ED:U+00EE:U+00EF
U+006A
U+006B
U+006C
U+006D
U+006E
# o
U+006F|U+00F2:U+00F3:U+00F4:U+00F5:U+00F6:U+00F8
U+0070
U+0071
U+0072
U+0073
U+0074
# u
U+0075|U+00F9:U+00FA:U+00FB:U+00FC
U+0076
U+0077
U+0078
U+0079
U+007A

# 0-9
U+0030
U+0031
U+0032
U+0033
U+0034
U+0035
U+0036
U+0037
U+0038
U+0039

# - (hyphen)
U+002D

# Ligature oe
U+0153|U+006FU+0065

From owner-idn-reg-policy Mon Mar 31 12:43:05 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h2VKh4JM010780 for ; Mon, 31 Mar 2003 12:43:04 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h2VKh45F010779 for idn-reg-policy-bks; Mon, 31 Mar 2003 12:43:04 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from [63.202.92.152] (adsl-63-202-92-152.dsl.snfc21.pacbell.net [63.202.92.152]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h2VKh1JN010775; Mon, 31 Mar 2003 12:43:01 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: In-Reply-To: <200303311956.h2VJuD6v006142@ludwigV.sources.org> References: <200303311956.h2VJuD6v006142@ludwigV.sources.org>
Date: Mon, 31 Mar 2003 12:42:46 -0800 To: Stephane Bortzmeyer From: Paul Hoffman / IMC Subject: Re: New Internet Draft on registering IDNs Cc: idn-reg-policy@imc.org Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 9:56 PM +0200 3/31/03, Stephane Bortzmeyer wrote: >The problem, as I see it, is that the draft uses variant tables which >work on a per-character basis, without any concern that not all >combinations mean something. Most of the words in a bundle have no >meaning in French and there is no real reason to keep them. > >I understand that there is no easy option (using a dictionary will not >work since many domain names are not in any dictionary). Exactly right. Either the system is automatic (no human intervention) and can create many unnecessary alternatives, or it is human-controlled and error-prone. In the latter case, what happens if the human who is reviewing all the names disagrees with the registrant about which names "make sense"? What if a reviewer misses some "sensible" variants and a different registrant grabs them? >Did anyone try a bundle approach on his zone? Asian registries have done so. But they have very different script variant issues than European registries. 
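To put rough numbers on the bundle growth discussed in this exchange, here is a minimal Python sketch. It assumes the accent-insensitive French table posted earlier in the thread, treats every base character and its variants as one equivalence class, and ignores the "oe" ligature entry because that is a character sequence rather than a single character. The names `VARIANTS`, `CLASS_SIZE` and `bundle_size` are made up for this illustration and do not come from the draft.

```python
from math import prod

# Variants per base character, transcribed from the French table in this
# thread (accented forms are treated as variants of the plain vowel).
VARIANTS = {
    "a": "\u00e0\u00e1\u00e2\u00e3\u00e4\u00e5",
    "e": "\u00e8\u00e9\u00ea\u00eb",
    "i": "\u00ec\u00ed\u00ee\u00ef",
    "o": "\u00f2\u00f3\u00f4\u00f5\u00f6\u00f8",
    "u": "\u00f9\u00fa\u00fb\u00fc",
}

# Every member of an equivalence class (base or variant) maps to the
# number of characters in that class.
CLASS_SIZE = {
    ch: 1 + len(variants)
    for base, variants in VARIANTS.items()
    for ch in base + variants
}


def bundle_size(label: str) -> int:
    """Count the labels in a bundle without enumerating them.

    Each position can independently take any member of its class, so the
    bundle size is the product of the per-character class sizes.
    Characters that are not in the table form a class of size one.
    """
    return prod(CLASS_SIZE.get(ch, 1) for ch in label)
```

Under these assumptions `bundle_size("boulangerie")` is 7 * 5 * 7 * 5 * 5 * 5 = 30625, which matches the order of magnitude Stephane reports. Counting is cheap even when enumerating the bundle is not, so a registry could use a count like this to decide whether a bundle is small enough to put in the zone.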
--Paul Hoffman, Director --Internet Mail Consortium From owner-idn-reg-policy Tue Apr 1 16:50:48 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h320omJM003656 for ; Tue, 1 Apr 2003 16:50:48 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h320olkH003655 for idn-reg-policy-bks; Tue, 1 Apr 2003 16:50:47 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from nicemice.net (arwen.CS.Berkeley.EDU [128.32.132.165]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h320olJM003649 for ; Tue, 1 Apr 2003 16:50:47 -0800 (PST) Received: from amc by nicemice.net with local (Exim 3.35 #1 (Debian)) id 190WSo-0008N3-00 for ; Tue, 01 Apr 2003 16:50:50 -0800 Date: Mon, 31 Mar 2003 21:04:43 +0000 From: "Adam M. Costello" To: IDN registration policy list Subject: initial thoughts Message-ID: <20030331210443.GB15622@nicemice.net> Reply-To: IDN registration policy list Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4i Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: The following message was originally sent last Friday, but never made it to the list. AMC Registration policy was on my mind today, so I wrote up my thoughts, then looked at the list archive and Paul's draft for the first time. Apparently Paul and I think very much alike on this topic (probably because we'd seen the same earlier documents). Here's what I wrote: ---begin--- What policies ought a registry have before it starts admitting IDNs? 1. A policy defining what labels are admissible. This could be a set of allowed characters, or could be more complicated (for example, it could involve sequences of characters). 
The policy should allow only those strings for which the registry has sufficient expertise to define a sensible grouping policy (see below). The set of admissible labels can start small and gradually be expanded as the registry develops (or borrows) expertise. 2. A policy defining how labels are grouped. With traditional ASCII domain names, you register one name, and you get one name. If you register color.com, that doesn't prevent someone else from registering c010r.com or colour.com. For IDNs, it may be prudent to partition the set of admissible labels into groups (every admissible label belongs to exactly one group), where the group is the unit of registration; that is, registering any label in the group allocates the entire group, so that no one else may register any label in the same group. This policy could be implemented by defining a function that maps a label to a group identifier, which could be a canonical member of the group. The function might work by applying ToUnicode, then Nameprep, then some custom mappings of individual characters or sequences of characters, then ToASCII. 3. A policy defining how lookups behave with respect to groups. Does a lookup return the same resource records no matter which member of the group is queried? Or does the lookup fail except for particular members of the group chosen by the registrant (or by the registry)? All of the above policies need to be unambiguously specified and publicly available, so that registrants know what they're buying. I think those three things are all that is needed. In particular, I don't think it would be helpful to associate a language tag (or tags) with a registration; I think that would just raise unnecessary questions and controversies. ---end--- That's very close to Paul's draft, but there are a few slight differences: Paul's draft assumes that the policies can focus on single characters, and disregard sequences of characters. 
I wonder if that might not be powerful enough for some registries. We both propose partitioning the set of admissible labels into groups (bundles). Paul imagines a function that constructs the entire group given any member of the group, while I imagine a function that computes a group identifier given any member of the group, where the group identifier could be a single member of the group arbitrarily chosen to stand for the whole group. Either kind of function implies the same partition of the space, and both functions would use the same tables, but I think the group-id function might be easier to describe and understand. The group-construction function might be useful internally by the registry, if the registry needed to enumerate the group, but that doesn't really scale. That brings us to an issue I neglected to address: If the registry (or the registrant) decides that all members of the group should be visible, how will that work? Will it scale? The combinatorics are exponential. I think the only way it would work well is if the authoritative DNS servers (both the registry's servers that delegate the name, and the registrant's servers that provide the data for the name) include support for the grouping. I guess you would supply the tables to the DNS server, and it would perform the group-id computation whenever a request comes in, then look for the group-id in the zone database. This would require a standard machine-readable format for describing the grouping policy for a zone. Without that support in the DNS server, I think the sensible approach is for the registrant to choose individual member(s) of the group to be visible, and let the others be invisible (blocked). Registries could come up with a pricing plan, maybe the normal registration fee for a group with one visible member, and a small additional fee for each additional visible member. 
You could also imagine that the registry, rather than the registrant, chooses which member(s) will be visible, but I think it would be difficult for registries to come up with rules that would please everyone; it would probably be easier to let the registrant choose, and simply store the list. AMC From owner-idn-reg-policy Tue Apr 1 18:20:53 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h322KrJM007170 for ; Tue, 1 Apr 2003 18:20:53 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h322KruS007168 for idn-reg-policy-bks; Tue, 1 Apr 2003 18:20:53 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from nicemice.net (arwen.CS.Berkeley.EDU [128.32.132.165]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h322KpJM007158 for ; Tue, 1 Apr 2003 18:20:52 -0800 (PST) Received: from amc by nicemice.net with local (Exim 3.35 #1 (Debian)) id 190Xrz-00008g-00 for ; Tue, 01 Apr 2003 18:20:55 -0800 Date: Wed, 2 Apr 2003 02:20:55 +0000 From: "Adam M. Costello" To: IDN registration policy list Subject: Re: Comparison of hoffman-idn-reg and jseng-idn-admin Message-ID: <20030402022055.GB30135@nicemice.net> Reply-To: IDN registration policy list References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4i Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Roozbeh Pournader wrote: > 1. Mandatory equivalences as opposed to secondary/variant > equivalences. This feature is necessary for defining equivalences > between European and Arabic-Indic digit shapes in Arabic labels, for > example. > > This is some feature that *all* Arabic script zones *require* I'm not sure what you mean by "mandatory" and "require". 
Suppose I create the names .nicemice.net and .nicemice.net on the DNS server for nicemice.net (which I control), and suppose these two names are the same except that one uses ASCII digits and the other uses the corresponding Arabic-Indic digits, and suppose I associate different resource records with these names. Should there be a standard that prohibits me from doing this? Or suppose I register .com, which allocates an entire bundle of names (including .com) that no one else can register. Should the registry for .com be able to offer me the choice of whether .com is visible (equivalent to .com) or invisible (no such domain)? Or should the registry for .com be required to make .com visible and equivalent to .com, regardless of my preference? What would you think of this model: The Arabic-speaking community develops a best-practice recommendation regarding equivalences of names that should resolve to the same resource records, and TLDs can opt to support those equivalences, and can advertise their voluntary conformance in order to attract Arabic registrants. By the way, I'd like to point out that trying to make names automagically appear to be equivalent is a much trickier task than merely preventing registration of too-similar names to different registrants. The latter task involves the registry database only, not the DNS servers, nor any other servers or applications. But automagic equivalence potentially depends on the cooperation of anything that compares domain names. For example, HTTP servers compare the requested host name with host names in their config files in order to decide which virtual server to present, web browsers compare host names with user-entered lists to decide whether to accept images and cookies from a given site, and mail servers compare recipient domain names with domain names in their config files in order to decide whether the mail is local. It might be easier to handle this sort of thing more manually, at least in the beginning. 
For example, if contains four digits, it's one of 16 equivalent names. But really only two of them have any chance of being used. A registrant could select those two names to be visible, and leave all others invisible. They could then manually configure their web server and their mail server to recognize those two names. > 2. Clear language about conflict resolution. There needs to be > some clear guidelines or recommendations about the times that two > registered labels come into an intersection regarding the variant > labels associated to them. The whole point of having bundles is so that two registered labels cannot be variants of each other. Whichever one was registered first blocks the other from being registered. Now this does suggest a new sort of problem that hasn't existed before: bundles that are too big. Currently, the problem is that bundles are too small (just one name), so that registering a name doesn't prevent someone else from registering a confusingly similar name. The opposite problem would be bundles that are too big, where registering one name prevents someone else from registering another name that's different enough to be considered distinct under trademark law (or whatever rules are relevant to the dispute). It might happen that registrant X has a trademark on nameX, and registrant Y has a trademark on nameY, so both registrants are equally "entitled" to their respective names, but the two names are in the same bundle, so only one of the registrants can have their name. I guess this is not so different from cases where two companies have trademarks on the same name in different markets, but only one of them can have the name in .com. 
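The group-id idea and the four-digit example above can be sketched in Python as follows. This is only an illustration under stated assumptions: the sole equivalence is between ASCII digits and the Arabic-Indic digits U+0660 to U+0669, the ToUnicode, Nameprep and ToASCII steps mentioned earlier are left out, and the function names `group_id`, `enumerate_group` and `likely_visible` are invented here.

```python
from itertools import product

ASCII_DIGITS = "0123456789"
ARABIC_INDIC_DIGITS = "".join(chr(0x0660 + i) for i in range(10))

# Translation tables between the two digit sets.
TO_ASCII_DIGIT = str.maketrans(ARABIC_INDIC_DIGITS, ASCII_DIGITS)
TO_ARABIC_INDIC = str.maketrans(ASCII_DIGITS, ARABIC_INDIC_DIGITS)


def group_id(label: str) -> str:
    """Map any member of a group to one canonical member (ASCII digits)."""
    return label.translate(TO_ASCII_DIGIT)


def enumerate_group(label: str) -> set:
    """List every member of the group; the size doubles with each digit."""
    choices = [
        (ch, ch.translate(TO_ARABIC_INDIC)) if ch in ASCII_DIGITS else (ch,)
        for ch in group_id(label)
    ]
    return {"".join(member) for member in product(*choices)}


def likely_visible(label: str) -> set:
    """The two uniform spellings: all ASCII digits or all Arabic-Indic."""
    canonical = group_id(label)
    return {canonical, canonical.translate(TO_ARABIC_INDIC)}
```

A registry that only needs to block conflicting registrations can store `group_id(label)` as the key and never call `enumerate_group` at all. For a label with four digits, `enumerate_group` returns 16 names and `likely_visible` returns the two a registrant would plausibly choose to make visible.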
AMC From owner-idn-reg-policy Tue Apr 1 18:25:36 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h322PaJM007314 for ; Tue, 1 Apr 2003 18:25:36 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h322PaSY007313 for idn-reg-policy-bks; Tue, 1 Apr 2003 18:25:36 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from [63.202.92.152] (adsl-63-202-92-152.dsl.snfc21.pacbell.net [63.202.92.152]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h322PWJN007305; Tue, 1 Apr 2003 18:25:33 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: In-Reply-To: References: <200303311956.h2VJuD6v006142@ludwigV.sources.org> Date: Tue, 1 Apr 2003 18:05:18 -0800 To: Stephane Bortzmeyer From: Paul Hoffman / IMC Subject: Re: New Internet Draft on registering IDNs Cc: idn-reg-policy@imc.org Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: A followup on my own message, because I believe what I said was wrong. I said: >Either the system is automatic (no human intervention) and can >create many unnecessary alternatives, or it is human-controlled and >error-prone. That's too simplistic.
Human-controlled systems that decide what is and is not a variant are error-prone, but human-controlled systems that decide which labels from an automatically created bundle to allocate are not. This ties into Vint's suggestion the other day about non-automatically partitioning the labels in the bundle between allocated and blocked. In this scenario, the registry can say many different things, such as "your bundle contains 1024 entries, and..."

- you can choose which five to put in the zone; the other 1019 will be blocked
- we have chosen the five that we will put in the zone, and the 1019 that will be blocked
- you can put as many as you want in the zone, and it will cost you US$10 per name per year to do so; the rest will be blocked for free
- because your bundle has 1024 names, the base bundle cost is five times higher than if you had chosen a name whose bundle had only 128 names; plus, you must pay...

... and so on. I will cover this additional scenario in my next draft. More comments on this are welcome!

--Paul Hoffman, Director
--Internet Mail Consortium

From owner-idn-reg-policy Tue Apr 1 18:25:37 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h322PaJM007316 for ; Tue, 1 Apr 2003 18:25:36 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h322Pa18007315 for idn-reg-policy-bks; Tue, 1 Apr 2003 18:25:36 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from [63.202.92.152] (adsl-63-202-92-152.dsl.snfc21.pacbell.net [63.202.92.152]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h322PWJP007305 for ; Tue, 1 Apr 2003 18:25:34 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: In-Reply-To: References: Date: Tue, 1 Apr 2003 18:25:34 -0800 To: idn-reg-policy@imc.org From: Paul Hoffman / IMC Subject: Re: Comparison of hoffman-idn-reg and jseng-idn-admin Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID:

At 9:56 PM +0430 3/31/03, Roozbeh Pournader wrote:
> A registry MUST NOT blindly combine multiple tables which have
> overlapping equivalences. Instead, the registry MUST carefully analyze
> every instance in the combined table where a base character has one or
> more different variants and select the desired set of variants for the
> base character.
>
>(But unfortunately doesn't suggest any guidelines when doing so.)

Correct. I think that if I give a few suggestions for guidelines, it will lead readers to think that the problem is simple, which it is not. Either we list lots and lots, or none. I will add a note about why I have done none; see below.

>Unfortunately, the list ends here. Specifically, there are features that
>are *required* for Arabic but are missing in the language of the tables.

And therefore it will be added. (Note to the list: I doubt that Arabic is the only language that I missed. If you know of others, please speak up!)

>1. Mandatory equivalences as opposed to secondary/variant equivalences.
>This feature is necessary for defining equivalences between European and
>Arabic-Indic digit shapes in Arabic labels, for example.

Very good point! This is a registry-specific early mapping step that must be done. I think it should be done before the variants are checked in the table; do folks here agree?

>2. Clear language about conflict resolution.
>There needs to be some clear guidelines or recommendations about the times
>that two registered labels come into an intersection regarding the variant
>labels associated to them. This will happen with almost any multi-language
>Arabic-script zone (e.g. U+0649 vs U+064A vs U+06CC).

I am unclear on how this differs from point #1. If any of those three characters are supposed to only be represented by one of them in names, then the registry-specific early mapping step will take care of them. Or is that not what you are referring to? Please be more specific.

>3. Clear language with specific guidelines and real-life examples for
>merging tables for different languages/locales.

Currently, I believe that there are three possibilities:

- the merging is trivially easy because there is no overlap
- the merging is a policy decision by the registry at the time of table-making as to which language "wins" for the overlapping characters
- it is impossible to register without knowing the supposed language of the registration

I can add more discussion of that, but the third option is not "merging", it is forcing the problem on the registrant (who might be sly and use it as a way to make the bundle contain things that the registry might not have intended). From my reading of the JET document, they call the third option "merging" when in fact it is just the opposite: it prevents merging by pointing at one table.

>4. Better syntax for the table. Don't you agree that a U+ABCDU+BCDAU+CDAB
>syntax is unreadable? Why can't one use a space?

Spaces as separators in tables cause problems going through gateway programs. I'm happy to add an inter-character separator of "-".
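The first two merging possibilities listed above can be sketched in Python. This is a hedged illustration, not anything specified in the draft: a table is assumed to be a mapping from a base character to a set of variants, the registry's policy is reduced to an ordered list saying which language "wins", and the function name `merge_tables` plus both example tables are invented. The example tables reuse the code points from this thread purely as placeholders and make no claim about real Arabic-script policy.

```python
from typing import Dict, FrozenSet, List, Tuple

# base character -> set of variant characters
Table = Dict[str, FrozenSet[str]]


def merge_tables(
    tables: Dict[str, Table], priority: List[str]
) -> Tuple[Table, List[Tuple[str, str, str]]]:
    """Merge per-language tables; earlier entries in `priority` win.

    Returns the merged table and a list of (base, winner, loser) tuples
    for every base character whose variant sets disagree, so that a
    person can review each overlap instead of combining blindly.
    """
    merged: Table = {}
    source: Dict[str, str] = {}
    conflicts: List[Tuple[str, str, str]] = []
    for lang in priority:
        for base, variants in tables[lang].items():
            if base not in merged:
                merged[base] = variants
                source[base] = lang
            elif merged[base] != variants:
                conflicts.append((base, source[base], lang))
    return merged, conflicts


# Hypothetical tables for illustration only.
EXAMPLE_TABLES = {
    "lang-a": {"\u064a": frozenset({"\u0649"})},
    "lang-b": {"\u064a": frozenset({"\u06cc"}), "\u0628": frozenset()},
}
```

When the tables share no base characters the conflict list is empty, which is the trivially easy case. When they overlap, `merge_tables(EXAMPLE_TABLES, ["lang-a", "lang-b"])` keeps the "lang-a" variants for U+064A and reports the disagreement for review.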
--Paul Hoffman, Director --Internet Mail Consortium From owner-idn-reg-policy Wed Apr 2 03:10:40 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h32BAeJM020363 for ; Wed, 2 Apr 2003 03:10:40 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h32BAet8020361 for idn-reg-policy-bks; Wed, 2 Apr 2003 03:10:40 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from neteka.com (www.namesbeyond.com [216.220.34.103]) by above.proper.com (8.12.9/8.11.6) with SMTP id h32BAcJM020340 for ; Wed, 2 Apr 2003 03:10:39 -0800 (PST) Message-ID: <01c701c2f908$52fef390$8c7a4b0a@neteka.inc> From: "Edmon Chung" To: References: <200303311956.h2VJuD6v006142@ludwigV.sources.org> Subject: Framework for IDN Operations Date: Wed, 2 Apr 2003 06:09:22 -0500 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2600.0000 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2600.0000 Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Hi, I have just submitted 2 drafts on a flexible framework for managing character equivalence. More importantly driving towards a generic structure that can be incorporated into a provisioning protocol. 
Charprep: http://www.ietf.org/internet-drafts/draft-chung-idnop-charprep-00.txt
Describes a framework for publishing:
- codepoint inclusion table
- character equivalence tables based on a given script

Zoneprep: http://www.ietf.org/internet-drafts/draft-chung-idnop-zoneprep-00.txt
Describes a framework that categorizes Reserved Variants and Zone Variants into different types:

Primary Domain: originally submitted domain

Reserved Variants (RV):
- Normal RV: blocked from registration and can be activated as Zone variant by client
- Restricted RV: blocked from registration and cannot be activated
- Automatic ZV: automatically included as a zone variant
- Suggested RV: suggested by the registry to be reserved by the client

Zone Variants (ZV):
- Normal ZV: can have its own delegation NS as well as hosts (function as a unique domain)
- Same NS: must have same delegation NS as primary domain
- Alias Only: is aliased to the primary domain

Based on the Charprep/Zoneprep framework, I have also drafted an EPP mapping that introduces a new IDN object (that contains the set/bundle/package of IDN variants for a given primary domain/domain object):
http://www.ietf.org/internet-drafts/draft-chung-idnop-epp-idn-00.txt

I think this IDNOP framework provides a good generic structure that can accommodate different needs of different registries, while still maintaining a standard architecture. For example, registries can choose to make the entire RV set Restricted RV, which is the same as saying that all variants will simply be blocked. Or registries may choose to make all RVs into Automatic ZV, which effectively puts all RVs into the zonefile. Or there could be a mixture including Normal RVs that can be activated later.

I am also drafting an IDNOP Guideline that describes a set of best practices for domain registries for implementing/launching IDNs, more specifically, on the following topics:

1. Character Equivalence Preparation Policies (and provisioning protocol considerations)
2. Sunrise and/or land-rush Management (including grandfathering of existing registered domains)
3. DNS Resolution Considerations (Zone file preparations, DNS request expectations management, and IDNA client considerations)
4. Domain Dispute Resolution Considerations

Any thoughts or comments would be great.

Edmon

From owner-idn-reg-policy Wed Apr 2 05:47:32 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h32DlWJM003571 for ; Wed, 2 Apr 2003 05:47:32 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h32DlW4J003570 for idn-reg-policy-bks; Wed, 2 Apr 2003 05:47:32 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from grappa.isoc.org.il (root@grappa.isoc.org.il [132.70.9.72]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h32DlTJM003555; Wed, 2 Apr 2003 05:47:30 -0800 (PST) Received: from BENNYPC (benny-pc.isoc.org.il [192.114.22.72]) by grappa.isoc.org.il (8.9.3p2/8.9.0) with ESMTP id QAA29249; Wed, 2 Apr 2003 16:47:23 +0300 From: "Benny Lipsicas" To: "'Paul Hoffman / IMC'" , Subject: RE: Comparison of hoffman-idn-reg and jseng-idn-admin Date: Wed, 2 Apr 2003 16:52:35 +0200 Message-ID: <00c101c2f927$82afdf40$481672c0@BENNYPC> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit X-Priority: 3 (Normal) X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook, Build 10.0.3416 x-mimeole: Produced By Microsoft MimeOLE V6.00.2800.1106 Importance: Normal In-Reply-To: Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID:

>>Unfortunately, the list ends here. Specifically, there are features that
>>are *required* for Arabic but are missing in the language of the tables.

>And therefore it will be added. (Note to the list: I doubt that
>Arabic is the only language that I missed.
If you know of others, >please speak up!) Since it's my first post - I'll introduce myself: Benny Lipsicas, ".il" registry administrator. The language in question is Hebrew. One feature that may be of importance to us is the ability to prevent certain characters from appearing anywhere else but at the end of the label (i.e. it can only be the last char of a label), and we have another issue, which I'm not certain is in the scope of this list, any label in Hebrew needs to be written RTL, and if i'm not mistaken, this technically prevents the mixing of Hebrew and non-Hebrew chars in the same label. From owner-idn-reg-policy Wed Apr 2 07:59:38 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h32FxcJM014159 for ; Wed, 2 Apr 2003 07:59:38 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h32FxcOG014158 for idn-reg-policy-bks; Wed, 2 Apr 2003 07:59:38 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from [63.202.92.152] (adsl-63-202-92-152.dsl.snfc21.pacbell.net [63.202.92.152]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h32FxXJN014144; Wed, 2 Apr 2003 07:59:33 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: In-Reply-To: <00c101c2f927$82afdf40$481672c0@BENNYPC> References: <00c101c2f927$82afdf40$481672c0@BENNYPC>
Date: Wed, 2 Apr 2003 07:59:26 -0800 To: "Benny Lipsicas" , From: Paul Hoffman / IMC Subject: RE: Comparison of hoffman-idn-reg and jseng-idn-admin Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 4:52 PM +0200 4/2/03, Benny Lipsicas wrote: >The language in question is Hebrew. One feature that may be of >importance to us is the ability to prevent certain characters from >appearing anywhere else but at the end of the label (i.e. it can only be >the last char of a label), and we have another issue, which I'm not >certain is in the scope of this list, any label in Hebrew needs to be >written RTL, and if i'm not mistaken, this technically prevents the >mixing of Hebrew and non-Hebrew chars in the same label. The latter issue is definitely handled by the IDNA standard. Could you explain the reason for the first issue (that a particular character has to be the last character in the label)? 
--Paul Hoffman, Director --Internet Mail Consortium From owner-idn-reg-policy Wed Apr 2 12:52:27 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h32KqRJM001164 for ; Wed, 2 Apr 2003 12:52:27 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h32KqR7A001162 for idn-reg-policy-bks; Wed, 2 Apr 2003 12:52:27 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from tux.w3.org (IDENT:root@tux.w3.org [18.29.0.27]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h32KqPJM001143 for ; Wed, 2 Apr 2003 12:52:25 -0800 (PST) Received: from enoshima (IDENT:root@tux.w3.org [18.29.0.27]) by tux.w3.org (8.12.9/8.12.9) with ESMTP id h32KqKsx017302; Wed, 2 Apr 2003 15:52:26 -0500 Message-Id: <4.2.0.58.J.20030402145707.03cc9c18@localhost> X-Sender: duerst@localhost X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J Date: Wed, 02 Apr 2003 15:00:31 -0500 To: Roozbeh Pournader , IDN registration policy list From: Martin Duerst Subject: Re: Comparison of hoffman-idn-reg and jseng-idn-admin In-Reply-To: Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 21:56 03/03/31 +0430, Roozbeh Pournader wrote: >4. Better syntax for the table. Don't you agree that a U+ABCDU+BCDAU+CDAB >syntax is unreadable? Why can't one use a space? What about using Unicode (UTF-8) directly? What about defining an XML format for the tables? This would allow tables to be published in ASCII-only contexts but also easily viewed in a browser with a simple stylesheet. As some of the drafts say, registries should only accept characters for which they have expertise, which usually means that they can (and want to!) see them directly. Regards, Martin.
From owner-idn-reg-policy Wed Apr 2 12:52:34 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h32KqXJM001182 for ; Wed, 2 Apr 2003 12:52:33 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h32KqXhk001181 for idn-reg-policy-bks; Wed, 2 Apr 2003 12:52:33 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from tux.w3.org (IDENT:root@tux.w3.org [18.29.0.27]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h32KqWJM001173; Wed, 2 Apr 2003 12:52:32 -0800 (PST) Received: from enoshima (IDENT:root@tux.w3.org [18.29.0.27]) by tux.w3.org (8.12.9/8.12.9) with ESMTP id h32KqKt9017302; Wed, 2 Apr 2003 15:52:27 -0500 Message-Id: <4.2.0.58.J.20030402154656.035a9988@localhost> X-Sender: duerst@localhost X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J Date: Wed, 02 Apr 2003 15:52:09 -0500 To: Paul Hoffman / IMC , Staffan Hagnell From: Martin Duerst Subject: Re: New Internet Draft on registering IDNs Cc: idn-reg-policy@imc.org In-Reply-To: References: <3E82A944.6040703@iis.se> <3E818686.6010307@twnic.net.tw> <3E820CB0.1000605@iis.se> <3E82A944.6040703@iis.se> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 06:53 03/03/27 -0800, Paul Hoffman / IMC wrote: >If the input label contains characters that have equivalents, Internet >users who don't know what the base characters used in the registration >will not know what character to type in to get a DNS response. For >instance, if DIGIT ONE is a variant of LATIN SMALL LETTER L, and LATIN >SMALL LETTER L is a variant of DIGIT ONE, the user who sees >"pale.example.com" will no know whether to type a "1" or a "l" after the >"pa" in the first label. I think there are various different situations. 
There are cases where it's totally unclear what to type in. There are cases where it's quite clear what to type in. There are cases where it would be clear what to type in, but the user isn't able to type the characters in question, and falls back to something else. Also, in some cases, it is easy for the owner of the 'bundle' to make sure users know what to type in. In other cases, it may be difficult. Regards, Martin. From owner-idn-reg-policy Wed Apr 2 12:52:28 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h32KqRJM001169 for ; Wed, 2 Apr 2003 12:52:27 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h32KqRHq001165 for idn-reg-policy-bks; Wed, 2 Apr 2003 12:52:27 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from tux.w3.org (IDENT:root@tux.w3.org [18.29.0.27]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h32KqOJM001139 for ; Wed, 2 Apr 2003 12:52:25 -0800 (PST) Received: from enoshima (IDENT:root@tux.w3.org [18.29.0.27]) by tux.w3.org (8.12.9/8.12.9) with ESMTP id h32KqKt1017302; Wed, 2 Apr 2003 15:52:26 -0500 Message-Id: <4.2.0.58.J.20030402150335.035b1fc8@localhost> X-Sender: duerst@localhost X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J Date: Wed, 02 Apr 2003 15:07:03 -0500 To: IDN registration policy list , IDN registration policy list From: Martin Duerst Subject: Re: Comparison of hoffman-idn-reg and jseng-idn-admin In-Reply-To: <20030402022055.GB30135@nicemice.net> References: Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 02:20 03/04/02 +0000, Adam M. Costello wrote: >Now this does suggest a new sort of problem that hasn't existed before: >bundles that are too big. 
Currently, the problem is that bundles are >too small (just one name), so that registering a name doesn't prevent >someone else from registering a confusingly similar name. The opposite >problem would be bundles that are too big, where registering one name >prevents someone else from registering another name that's different >enough to be considered distinct under trademark law (or whatever rules >are relevant to the dispute). It might happen that registrant X has a >trademark on nameX, and registrant Y has a trademark on nameY, so both >registrants are equally "entitled" to their respective names, but the >two names are in the same bundle, so only one of the registrants can >have their name. I guess this is not so different from cases where two >companies have trademarks on the same name in different markets, but >only one of them can have the name in .com. The danger of bundles being too big can easily happen for European languages, with a bundle that defines that all accented versions of a character are treated as the same as the base character. In that case, Paul's approach (also described by Adam) of using equivalence classes won't scale. What may work is that an accented character blocks the base character, but not characters with a different accent. But this no longer can be described with equivalence classes. Regards, Martin. 
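To make the last point concrete, here is a minimal Python sketch of why such a rule cannot be described with equivalence classes. It assumes a hypothetical policy in which an accented character conflicts with its unaccented base character but not with the same base carrying a different accent. The names base_of and conflicts are made up for illustration and do not come from either draft.

```python
import unicodedata


def base_of(ch: str) -> str:
    """Return ch with combining marks removed (via NFD decomposition)."""
    decomposed = unicodedata.normalize("NFD", ch)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def conflicts(a: str, b: str) -> bool:
    """Hypothetical policy: two characters conflict if they are identical,
    or if one of them is the unaccented base of the other."""
    if a == b:
        return True
    return base_of(a) == b or base_of(b) == a


E, E_ACUTE, E_GRAVE = "e", "\u00e9", "\u00e8"

# e-acute conflicts with e, and e conflicts with e-grave ...
print(conflicts(E_ACUTE, E), conflicts(E, E_GRAVE))
# ... but e-acute does not conflict with e-grave, so the relation is not
# transitive and cannot be expressed as a partition into classes.
print(conflicts(E_ACUTE, E_GRAVE))
```

A registry using a rule like this would have to store a pairwise conflict relation, or compute it on demand, rather than a single class identifier per character.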
From owner-idn-reg-policy Wed Apr 2 12:52:28 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h32KqRJM001170 for ; Wed, 2 Apr 2003 12:52:27 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h32KqRQI001166 for idn-reg-policy-bks; Wed, 2 Apr 2003 12:52:27 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from tux.w3.org (IDENT:root@tux.w3.org [18.29.0.27]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h32KqPJM001142; Wed, 2 Apr 2003 12:52:25 -0800 (PST) Received: from enoshima (IDENT:root@tux.w3.org [18.29.0.27]) by tux.w3.org (8.12.9/8.12.9) with ESMTP id h32KqKt5017302; Wed, 2 Apr 2003 15:52:27 -0500 Message-Id: <4.2.0.58.J.20030402151053.0321e328@localhost> X-Sender: duerst@localhost X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J Date: Wed, 02 Apr 2003 15:16:42 -0500 To: Paul Hoffman / IMC , "Benny Lipsicas" , From: Martin Duerst Subject: RE: Comparison of hoffman-idn-reg and jseng-idn-admin In-Reply-To: References: <00c101c2f927$82afdf40$481672c0@BENNYPC> <00c101c2f927$82afdf40$481672c0@BENNYPC> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 07:59 03/04/02 -0800, Paul Hoffman / IMC wrote: >At 4:52 PM +0200 4/2/03, Benny Lipsicas wrote: >>The language in question is Hebrew. One feature that may be of >>importance to us is the ability to prevent certain characters from >>appearing anywhere else but at the end of the label (i.e. it can only be >>the last char of a label), and we have another issue, which I'm not >>certain is in the scope of this list, any label in Hebrew needs to be >>written RTL, and if i'm not mistaken, this technically prevents the >>mixing of Hebrew and non-Hebrew chars in the same label. 
>
>The latter issue is definitely handled by the IDNA standard. Could you
>explain the reason for the first issue (that a particular character has to
>be the last character in the label)?

Some Hebrew characters (kaf, mem, nun, peh, tsadi) have different forms when appearing at the end of a word (label). The Greek sigma is another example. In Unicode, this is handled by having separate codepoints for the final forms. This is in contrast to e.g. Arabic, where there are many more contextual forms, shaping is handled on display, and there is only one codepoint per character.

So a registry registering Hebrew would want to make sure that e.g. a kaf in the middle of a label is always U+05DB, but at the end of the label is always U+05DA.

I'm not sure what should happen with labels that consist of more than one word, whether simple concatenation would be acceptable (and a final letter could help in seeing the word boundary) or whether a hyphen or other, similar character would be used to concatenate words.

Regards, Martin.
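As an illustration of the positional rule described above, the following Python sketch checks a label for misplaced final forms. It is only a sketch under stated assumptions: the function name positional_errors is hypothetical, the label is assumed to be in logical (typing) order, and the rule is the strict one (final forms only at the end of the label), which would reject the multi-word concatenation case left open above.

```python
from typing import List, Tuple

# Regular form -> final form, as encoded in the Unicode Hebrew block.
FINAL_FORM = {
    "\u05DB": "\u05DA",  # kaf   -> final kaf
    "\u05DE": "\u05DD",  # mem   -> final mem
    "\u05E0": "\u05DF",  # nun   -> final nun
    "\u05E4": "\u05E3",  # pe    -> final pe
    "\u05E6": "\u05E5",  # tsadi -> final tsadi
}
REGULAR_FORM = {final: regular for regular, final in FINAL_FORM.items()}


def positional_errors(label: str) -> List[Tuple[int, str]]:
    """Return (index, message) pairs for characters in the wrong form.

    Strict, hypothetical rule: a final form may appear only as the last
    character of the label, and a letter that has a final form must use
    it when it is the last character.
    """
    errors = []
    last = len(label) - 1
    for i, ch in enumerate(label):
        if i < last and ch in REGULAR_FORM:
            errors.append((i, "final form used before the end of the label"))
        elif i == last and ch in FINAL_FORM:
            errors.append((i, "non-final form used at the end of the label"))
    return errors


# mem, lamed, final kaf: accepted under the strict rule.
print(positional_errors("\u05DE\u05DC\u05DA"))
# mem, lamed, regular kaf: the last character should be the final form.
print(positional_errors("\u05DE\u05DC\u05DB"))
```

A registry that allows several words in one label would need to relax the first check, for example by permitting a final form before a hyphen.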
From owner-idn-reg-policy Wed Apr 2 12:52:27 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h32KqRJM001163 for ; Wed, 2 Apr 2003 12:52:27 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h32KqQEg001159 for idn-reg-policy-bks; Wed, 2 Apr 2003 12:52:26 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from tux.w3.org (IDENT:root@tux.w3.org [18.29.0.27]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h32KqOJM001140; Wed, 2 Apr 2003 12:52:25 -0800 (PST) Received: from enoshima (IDENT:root@tux.w3.org [18.29.0.27]) by tux.w3.org (8.12.9/8.12.9) with ESMTP id h32KqKt3017302; Wed, 2 Apr 2003 15:52:27 -0500 Message-Id: <4.2.0.58.J.20030402150927.03cdd948@localhost> X-Sender: duerst@localhost X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J Date: Wed, 02 Apr 2003 15:09:51 -0500 To: Paul Hoffman / IMC , idn-reg-policy@imc.org From: Martin Duerst Subject: Re: Comparison of hoffman-idn-reg and jseng-idn-admin In-Reply-To: References: Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 18:25 03/04/01 -0800, Paul Hoffman / IMC wrote: >>4. Better syntax for the table. Don't you agree that a U+ABCDU+BCDAU+CDAB >>syntax is unreadable? Why can't one use a space? > >Spaces as separators in tables cause problems going through gateway >programs. I'm happy to add an inter-character separator of "-". What 'gateway programs'? Regards, Martin. 
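For reference, the two notations under discussion (unseparated "U+ABCDU+BCDAU+CDAB" and the proposed "-" separator) are both easy to read mechanically. The Python sketch below assumes a table field holds nothing but a code point sequence in one of those two forms with uppercase hex digits; the actual table grammar is not given in this thread, and the names parse_sequence and format_sequence are hypothetical.

```python
import re

# One or more U+XXXX items, optionally joined by a single "-".
_SEQUENCE = re.compile(r"U\+[0-9A-F]{4,6}(?:-?U\+[0-9A-F]{4,6})*")
_CODEPOINT = re.compile(r"U\+([0-9A-F]{4,6})")


def parse_sequence(field: str) -> str:
    """Turn 'U+05DB-U+05DA' or 'U+05DBU+05DA' into the string it denotes."""
    if not _SEQUENCE.fullmatch(field):
        raise ValueError(f"malformed code point sequence: {field!r}")
    return "".join(chr(int(h, 16)) for h in _CODEPOINT.findall(field))


def format_sequence(text: str, sep: str = "-") -> str:
    """Write a string back out as U+XXXX items joined by sep."""
    return sep.join(f"U+{ord(ch):04X}" for ch in text)


print(format_sequence(parse_sequence("U+05DBU+05DA")))
```

A space-separated field is rejected by this sketch on purpose, so that a table that was re-wrapped in transit fails loudly instead of being silently misread.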
From owner-idn-reg-policy Wed Apr 2 13:19:22 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h32LJLJM002921 for ; Wed, 2 Apr 2003 13:19:21 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h32LJLwO002920 for idn-reg-policy-bks; Wed, 2 Apr 2003 13:19:21 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from tux.w3.org (IDENT:root@tux.w3.org [18.29.0.27]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h32LJKJM002906; Wed, 2 Apr 2003 13:19:20 -0800 (PST) Received: from enoshima (IDENT:root@tux.w3.org [18.29.0.27]) by tux.w3.org (8.12.9/8.12.9) with ESMTP id h32LJMsj026422; Wed, 2 Apr 2003 16:19:22 -0500 Message-Id: <4.2.0.58.J.20030402161258.0291df10@localhost> X-Sender: duerst@localhost X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J Date: Wed, 02 Apr 2003 16:19:14 -0500 To: Paul Hoffman / IMC , idn-reg-policy@imc.org From: Martin Duerst Subject: Re: New Internet Draft on registering IDNs In-Reply-To: References: <3E818686.6010307@twnic.net.tw> <3E818686.6010307@twnic.net.tw> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 08:16 03/03/26 -0800, Paul Hoffman / IMC wrote: >Right. Unfortunately, the current draft of the JET document is silent >about these requirements, and from talking to some JET members, I haven't >heard any good description of why Chinese needs both. In fact, I remember >many long conversations with CNNIC and TWNIC people a few years ago where >they all said that just blocking (with no allocating) was fine. Maybe >opinions in the Chinese language community have changed since then, but I >haven't seen any written down in the JET document yet. Maybe the next >version will cover this clearly. 
This is just a wild guess, but it may have to do with the fact that even in Taiwan, simplified characters are sometimes used. The most often cited example is the 'tai' in Taiwan (U+53F0). This is clearly a simplified character, but it is often used. While in general, combinations of simplified and traditional variants can just be blocked, this is a case where just blocking would not work. >True, but it would only help a little bit. Telling the users what has been >done does not let them predict what will happen. If a registry says "we >have mapped these characters to these other ones for this language >reason", users will understand that; if a registry says "we have blocked >these characters for this language reason", users will understand that. >But I don't know how many users will understand "we have mapped some of >them but blocked other ones even though the language reason is the same". >If there is a good language reason for differentiating the two cases, that >would be wonderful. 'language reason' may be the same or different. It may be the same language, but a different reason. Also, in some cases, it may appear very natural to people understanding the language to read 'we have mapped A, B, and C, and blocked D, E, and F'. A very simplistic example would be French, with somebody registering e-acute. If the system replied 'we have mapped e (without any accent) and blocked e-grave, e-circumflex, and e-diaeresis, that would make sense to somebody understanding French. The e without accent can be used as an equivalent for e-acute, and is therefore mapped, but the other accented variants are never equivalents, and may be blocked just because they would otherwise interfere with the e without accent. [I don't claim that this is the right thing to do for French.] Regards, Martin. 
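The French example can be written down as a small table plus a lookup. The Python sketch below is purely illustrative: the POLICY table and the function classify_variants are hypothetical, the table has only the one e-acute entry from the example, and the function substitutes one position at a time, so labels with several affected characters would need combinations that are not generated here.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical per-character policy: for a registered character, which
# alternatives are also activated ("map") and which are only reserved
# so that nobody else can register them ("block").
POLICY: Dict[str, Dict[str, List[str]]] = {
    "\u00e9": {"map": ["e"], "block": ["\u00e8", "\u00ea", "\u00eb"]},
}


def classify_variants(label: str) -> Tuple[Set[str], Set[str]]:
    """Return (mapped, blocked) labels that differ from label in one position."""
    mapped: Set[str] = set()
    blocked: Set[str] = set()
    for i, ch in enumerate(label):
        rule = POLICY.get(ch)
        if rule is None:
            continue
        for alt in rule["map"]:
            mapped.add(label[:i] + alt + label[i + 1:])
        for alt in rule["block"]:
            blocked.add(label[:i] + alt + label[i + 1:])
    return mapped, blocked


mapped, blocked = classify_variants("caf\u00e9")
print("mapped: ", sorted(mapped))
print("blocked:", sorted(blocked))
```

Reporting the two sets separately is what would let a registry tell the registrant which names were activated and which were merely reserved.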
From owner-idn-reg-policy Wed Apr 2 14:43:03 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h32Mh3JM006638 for ; Wed, 2 Apr 2003 14:43:03 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h32Mh375006637 for idn-reg-policy-bks; Wed, 2 Apr 2003 14:43:03 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from nicemice.net (arwen.CS.Berkeley.EDU [128.32.132.165]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h32Mh1JM006633 for ; Wed, 2 Apr 2003 14:43:02 -0800 (PST) Received: from amc by nicemice.net with local (Exim 3.35 #1 (Debian)) id 190qwi-0002TD-00 for ; Wed, 02 Apr 2003 14:43:04 -0800 Date: Wed, 2 Apr 2003 22:43:04 +0000 From: "Adam M. Costello" To: idn-reg-policy@imc.org Subject: Re: Comparison of hoffman-idn-reg and jseng-idn-admin Message-ID: <20030402224304.GB8966@nicemice.net> Reply-To: IDN registration policy list References: <00c101c2f927$82afdf40$481672c0@BENNYPC> <00c101c2f927$82afdf40$481672c0@BENNYPC> <4.2.0.58.J.20030402151053.0321e328@localhost> <00c101c2f927$82afdf40$481672c0@BENNYPC> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4.2.0.58.J.20030402151053.0321e328@localhost> <00c101c2f927$82afdf40$481672c0@BENNYPC> User-Agent: Mutt/1.4i Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Benny Lipsicas wrote: > any label in Hebrew needs to be written RTL, and if i'm not mistaken, > this technically prevents the mixing of Hebrew and non-Hebrew chars in > the same label. This is governed by Nameprep. Basically, RTL and LTR characters cannot both be present in the same label. See Stringprep for exact definitions of RTL and LTR as used in this rule, and note that some characters are neither LTR nor RTL. 
You could mix Hebrew and Arabic characters in the same label, because both are RTL. Martin Duerst wrote: > So a registry registring Hebrew would want to make sure that e.g. a > kaf in the middle of a label is always U+05DB, but at the end of the > label is always U+05DA. I don't see why, and you yourself immediately contradict that statement: > I'm not sure what should happen with labels that consist of more than > one word, whether simple concatenation would be acceptable (and a > final letter could help seeing the word boundary) or whether a hyphen > or other, similar character would be used to concatenate words. Regardless of whether a hyphen-like character is used, the result is a label with a final character in the middle of the label. AMC From owner-idn-reg-policy Wed Apr 2 15:20:02 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h32NK2JM009633 for ; Wed, 2 Apr 2003 15:20:02 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h32NK20R009632 for idn-reg-policy-bks; Wed, 2 Apr 2003 15:20:02 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from [66.125.125.92] (adsl-63-202-92-152.dsl.snfc21.pacbell.net [63.202.92.152]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h32NK0JN009628 for ; Wed, 2 Apr 2003 15:20:01 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: In-Reply-To: <01c701c2f908$52fef390$8c7a4b0a@neteka.inc> References: <200303311956.h2VJuD6v006142@ludwigV.sources.org> <01c701c2f908$52fef390$8c7a4b0a@neteka.inc> X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). 
The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . Date: Wed, 2 Apr 2003 15:19:59 -0800 To: From: Paul Hoffman / IMC Subject: Re: Framework for IDN Operations Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 6:09 AM -0500 4/2/03, Edmon Chung wrote: >Any thoughts or comments will be great. Everyone on this list should see Neteka's wording in . Because of the nature of the Neteka statement, I would want to have these documents turned into a standard, and even worse into an Informational RFC or Best Current Practice RFC (where the Neteka patent statement doesn't even apply and therefore Neteka could nail anyone they wanted to under any terms). --Paul Hoffman, Director --Internet Mail Consortium From owner-idn-reg-policy Wed Apr 2 15:35:50 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h32NZoJM010683 for ; Wed, 2 Apr 2003 15:35:50 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h32NZoqn010682 for idn-reg-policy-bks; Wed, 2 Apr 2003 15:35:50 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from [66.125.125.92] (adsl-63-202-92-152.dsl.snfc21.pacbell.net [63.202.92.152]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h32NZmJN010678 for ; Wed, 2 Apr 2003 15:35:48 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: In-Reply-To: <20030331210443.GB15622@nicemice.net> References: <20030331210443.GB15622@nicemice.net> X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: 
Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . Date: Wed, 2 Apr 2003 15:35:45 -0800 To: IDN registration policy list From: Paul Hoffman / IMC Subject: Re: initial thoughts Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 9:04 PM +0000 3/31/03, Adam M. Costello wrote: >Paul's draft assumes that the policies can focus on single characters, >and disregard sequences of characters. I wonder if that might not be >powerful enough for some registries. It might not be, but I believe that it would be really, really hard to describe how one would use strings (sequences of characters) as input to the table. With the table as I have described it (and as the JET folks have in their document), the order of the records doesn't matter. If you have strings as the index, it does. >We both propose partitioning the set of admissible labels into groups >(bundles). Paul imagines a function that constructs the entire group >given any member of the group, while I imagine a function that computes >a group identifier given any member of the group, where the group >identifier could be a single member of the group arbitrarily chosen to >stand for the whole group. Either kind of function implies the same >partition of the space, and both functions would use the same tables, >but I think the group-id function might be easier to describe and >understand. The group-construction function might be useful internally >by the registry, if the registry needed to enumerate the group, but that >doesn't really scale. This seems like a somewhat academic difference, but I am probably missing something. 
Can you say where it might be important? >That brings us to an issue I neglected to address: If the registry (or >the registrant) decides that all members of the group should be visible, >how will that work? That is shown in the document. > Will it scale? Absolutely. As long as the zone's name servers can serve the records, there is no problem. Even if the database is kept on disk instead of in RAM, the response time on modern computers for even a gigantic zone is low relative to typical network delays. There are also other mitigating methods for huge zones, such as partitioning the responses on different servers at the same address. The VeriSign folks have already done a lot of research on this for the .com zone. > The combinatorics are exponential. >I think the only way it would work well is if the authoritative DNS >servers (both the registry's servers that delegate the name, and the >registrant's servers that provide the data for the name) include support >for the grouping. I guess you would supply the tables to the DNS >server, and it would perform the group-id computation whenever a request >comes in, then look for the group-id in the zone database. This would >require a standard machine-readable format for describing the grouping >policy for a zone. We disagree here. I don't see that requirement at all. As the managers of the current large zones have shown, you can have gigantic zone files. >Without that support in the DNS server, I think the sensible approach >is for the registrant to choose individual member(s) of the group to be >visible, and let the others be invisible (blocked). Registries could >come up with a pricing plan, maybe the normal registration fee for a >group with one visible member, and a small additional fee for each >additional visible member. Even though we disagree about the motivation for this, we agree that this is a good option for registries. I'm adding it to the next version of the document. 
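For readers following the group-id versus group-construction discussion, a minimal Python sketch of the group-id idea is given here. It assumes a hypothetical per-character variant table (the "l"/"1" and "o"/"0" classes are examples only) and picks the smallest code point in each class as its representative. The names group_id and bundle_size are made up for illustration.

```python
from math import prod

# Hypothetical variant table: each set lists characters that the
# registry treats as interchangeable.
VARIANT_CLASSES = [
    {"l", "1"},
    {"o", "0"},
]

# Every character in a class maps to one arbitrarily chosen representative
# (here: the smallest code point) and to the size of its class.
CANONICAL = {ch: min(cls) for cls in VARIANT_CLASSES for ch in cls}
CLASS_SIZE = {ch: len(cls) for cls in VARIANT_CLASSES for ch in cls}


def group_id(label: str) -> str:
    """Compute the bundle identifier without enumerating the bundle."""
    return "".join(CANONICAL.get(ch, ch) for ch in label)


def bundle_size(label: str) -> int:
    """Number of labels in the bundle: the product of the class sizes."""
    return prod(CLASS_SIZE.get(ch, 1) for ch in label)


print(group_id("pale") == group_id("pa1e"))
print(bundle_size("hello"))
```

Computing the identifier is linear in the label length, while the bundle itself grows as a product of class sizes, which is the combinatorial growth mentioned in the quoted text.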
>You could also imagine that the registry, rather than the registrant, >chooses which member(s) will be visible, but I think it would be >difficult for registries to come up with rules that would please >everyone; it would probably be easier to let the registrant choose, and >simply store the list. Speaking of "scaling", your suggestion makes scaling harder. That is, the registration process would have to include a step where the registrant chooses some things from a list. This is much more difficult than the registry saying "here's what you get, you can contact us if you don't like it". Well, of course they won't like it, but at least the process isn't blocked. It is up to the registry to decide what makes more sense to their customers. --Paul Hoffman, Director --Internet Mail Consortium From owner-idn-reg-policy Wed Apr 2 15:43:06 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h32Nh6JM010859 for ; Wed, 2 Apr 2003 15:43:06 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h32Nh6Ef010857 for idn-reg-policy-bks; Wed, 2 Apr 2003 15:43:06 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from tux.w3.org (IDENT:root@tux.w3.org [18.29.0.27]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h32Nh3JM010843 for ; Wed, 2 Apr 2003 15:43:04 -0800 (PST) Received: from enoshima (IDENT:root@tux.w3.org [18.29.0.27]) by tux.w3.org (8.12.9/8.12.9) with ESMTP id h32Nh4sl031064; Wed, 2 Apr 2003 18:43:06 -0500 Message-Id: <4.2.0.58.J.20030402183922.0291bf10@localhost> X-Sender: duerst@localhost X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J Date: Wed, 02 Apr 2003 18:40:25 -0500 To: IDN registration policy list , idn-reg-policy@imc.org From: Martin Duerst Subject: Re: Comparison of hoffman-idn-reg and jseng-idn-admin In-Reply-To: <20030402224304.GB8966@nicemice.net> References: 
<4.2.0.58.J.20030402151053.0321e328@localhost> <00c101c2f927$82afdf40$481672c0@BENNYPC> <00c101c2f927$82afdf40$481672c0@BENNYPC> <00c101c2f927$82afdf40$481672c0@BENNYPC> <4.2.0.58.J.20030402151053.0321e328@localhost> <00c101c2f927$82afdf40$481672c0@BENNYPC> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 22:43 03/04/02 +0000, Adam M. Costello wrote: >Martin Duerst wrote: > > > So a registry registring Hebrew would want to make sure that e.g. a > > kaf in the middle of a label is always U+05DB, but at the end of the > > label is always U+05DA. > >I don't see why, and you yourself immediately contradict that statement: To be exact, I mention a possible contradiction for the first part of the example, but the second part of the example still holds. Regards, Martin. > > I'm not sure what should happen with labels that consist of more than > > one word, whether simple concatenation would be acceptable (and a > > final letter could help seeing the word boundary) or whether a hyphen > > or other, similar character would be used to concatenate words. > >Regardless of whether a hyphen-like character is used, the result is a >label with a final character in the middle of the label. 
> >AMC From owner-idn-reg-policy Wed Apr 2 15:43:19 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h32NhJJM010871 for ; Wed, 2 Apr 2003 15:43:19 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h32NhJW4010870 for idn-reg-policy-bks; Wed, 2 Apr 2003 15:43:19 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from [66.125.125.92] (adsl-63-202-92-152.dsl.snfc21.pacbell.net [63.202.92.152]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h32NhEJN010864; Wed, 2 Apr 2003 15:43:14 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: In-Reply-To: <4.2.0.58.J.20030402151053.0321e328@localhost> References: <00c101c2f927$82afdf40$481672c0@BENNYPC> <00c101c2f927$82afdf40$481672c0@BENNYPC> <4.2.0.58.J.20030402151053.0321e328@localhost> X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . Date: Wed, 2 Apr 2003 15:43:13 -0800 To: Martin Duerst , "Benny Lipsicas" , From: Paul Hoffman / IMC Subject: RE: Comparison of hoffman-idn-reg and jseng-idn-admin Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 3:16 PM -0500 4/2/03, Martin Duerst wrote: >Some Hebrew characters (kaf, mem, nun, peh, tsadi) have different forms >when appearing at the end of a word (label). The Greek sigma is another >example. 
In Unicode, this is handled by having separate codepoints for >the final forms. This is in contrast to e.g. Arabic, where there are much >more contextual forms, and shaping is handled on display, and there is >only one codepoint per character. Thank you, this makes sense. I will add text about this situation in the next draft. --Paul Hoffman, Director --Internet Mail Consortium From owner-idn-reg-policy Wed Apr 2 15:43:06 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h32Nh6JM010858 for ; Wed, 2 Apr 2003 15:43:06 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h32Nh526010856 for idn-reg-policy-bks; Wed, 2 Apr 2003 15:43:05 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from tux.w3.org (IDENT:root@tux.w3.org [18.29.0.27]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h32Nh3JM010842; Wed, 2 Apr 2003 15:43:04 -0800 (PST) Received: from enoshima (IDENT:root@tux.w3.org [18.29.0.27]) by tux.w3.org (8.12.9/8.12.9) with ESMTP id h32Nh4sj031064; Wed, 2 Apr 2003 18:43:06 -0500 Message-Id: <4.2.0.58.J.20030402171329.0294df10@localhost> X-Sender: duerst@localhost X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J Date: Wed, 02 Apr 2003 18:35:40 -0500 To: Paul Hoffman / IMC , idn-reg-policy@imc.org From: Martin Duerst Subject: Re: New Internet Draft on registering IDNs Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Hello Paul, Below are some comments on your draft. At 09:31 03/03/25 -0800, Paul Hoffman / IMC wrote: >Greetings. I have just submitted a new Internet Draft that gives >suggestions on how to register IDNs. You can find a link to the draft at >the web site for this mailing list at >. Comments are, of course, welcome. 
Small aside: it would be helpful to have a direct link, and the name of the draft.

>This document is different than the JET document in many ways.

I often see 'JET document' in this discussion. Is this draft-jseng-idn-admin, or something else?

>It is meant to be generic and usable by anyone, not just registries using
>CJK characters. It also attempts to make the registry policy easier and
>more predictable from the outside. I have heard that other folks will also
>be preparing Internet Drafts on this issue, so it will be good to see what
>the differences are.

>Internet Draft                                         Paul Hoffman
>draft-hoffman-idn-reg-00.txt                             IMC & VPNC
>March 25, 2003
>Expires in six months
>Intended status: Best Current Practice (BCP)
>
>
>       Framework for Registering Internationalized Domain Names

I have been thinking about 'Framework' quite a bit. Is this draft a framework? It seems to be a definition of a table format with some associated algorithm(s), and a recommendation to use this table format/algorithms. And the recommendation isn't very clear: Should all registries use this table format/algorithm? Or is it just one of potentially many formats/algorithms, and registries can choose?

>Abstract
>
>This document describes a framework for registering internationalized
>domain names (IDNs) in a zone. Before accepting registrations of domain
>names into a zone, the zone's registry should decide which codepoints in
>the Unicode character set the zone will accept. The registry should also
>decide whether particular characters in a registered domain name should
>cause registration of multiple equivalent domain names. With those
>decisions, the registry can safely register names using the steps
>described here.

This does not mention table format or algorithm. It gives rather detailed instructions to registries; just saying 'this memo gives advice to registries on ...' should be okay.
Also, if the abstract mentions 'registration of multiple equivalent', which is 'mapping', it should probably also mention 'blocking'.

>1. Introduction

The intro does not mention the table format and algorithm, it just speaks about a 'mechanism'. That left me wondering for too long what was going on.

>IDNA [IDNA] specifies an encoding of characters in the Unicode character
>set [UNICODE]

The 'in' here confused me, because on first reading, I read it as 'into'. Maybe change to 'from'. Also, IDNA isn't about encoding [single] characters, but character strings.

>which is backwards-compatible with the current definition
>of hostnames. This implies that domain names encoded according to IDNA
>will be able to be transported between peers using any existing
>protocol, including DNS.
>
>IDNA, through its requirement of Nameprep [NAMEPREP], uses equivalence
>tables that are based only on the characters themselves; no attention is
>paid to the intended language (if any) for the domain name. However, for
>many domain names, the intended language of one or more parts of the
>domain name actually does matter to the registry for the names and to
>users.
>
>If there are no constraints on registration in a zone, people can
>register characters that increases

increases -> increase

>the risk of misunderstandings,
>cybersquatting, and other forms of confusion. A similar situation
>existed before

existed before -> exists independently of (it didn't go away with IDNA)

>the introduction of IDNA exemplified by domain names such
>as example.com and examp1e.com (note that the latter domain has

has -> contains (this is just a small stylistic issue)

>the
>digit "1" instead of the letter "l").
>
>For some human languages, there are characters and/or strings that have
>equivalent or near-equivalent meanings.

I would change 'meanings' to 'usages', because this is mostly about single characters, and these in general don't have meanings.
>If someone is allowed to
>register a name with such a character or string, the registry might want
>to automatically register all the names that have the same meaning in
>that language. Further, some registries might want to restrict the set
>of characters to be registered for language-based reasons. In addition,
>IDNA allows the use of thousands of non-alphanumeric characters, and
>some zone administrators will want to prohibit some or all of these
>characters.

This paragraph may look much clearer if it is changed to a list of bullet points. The need for documenting what is done should also be listed here.

>The intent of this document is that checking whether a label
>can be approved can be a mathematical, objective inspection of the
>codepoints in the label with no human intervention, and that all
>applications of a particular table will yield identical results.
>
>The mechanism

See above about 'mechanism'.

>described here does not require a registry to know the
>"intended language" of a label. It is impossible to describe the
>"intended language" of names that include numbers or acronyms.

It is in many cases impossible to know the 'intended language' of names even without numbers or acronyms.

>Proposals
>that have this requirement require human intervention to validate the
>assertion from the registrant and are therefore susceptible to fraud
>from the registrant. Further, such a requirement prevents

I don't think this is true. It would not prevent, but it would make things more difficult.

>the
>registration of labels that have two languages, some of which are common
>in countries with multiple languages.
>
>[IDN-ADMIN] shows a different proposal to the problem of registration
>policy. That document uses a more complex algorithm and a different
>registration philosophy that what is described here.

that -> than

>It is suggested that a registry act conservatively when starting
>accepting IDNA-based domain names.
This should also say that this means starting with a small set of base characters, and maybe adding more later. It should probably also say something about informing all current registrants about changes in policy.

>Equivalences are very hard (if not
>impossible) to define after registration has started. Assume that the
>labels "x" and "y" at first are different, but later the tables for the
>registry are changed so that "x" and "y" are then treated as being the
>same. If x.example.com and y.example.com both were already registered to
>different registrants, it is unclear which of them has to withdraw the
>registration, how that selection process

Insert 'is'.

>done, and so on. Thus, having
>complete, publicly-stated policies before accepting registration will
>lead to a much more stable registration process.

The 'act conservatively' to some extent suggests that policies could be relaxed as we go on, e.g. blocked variants reduced. Is this a good idea or not? Should probably be stated explicitly.

>This document does not deal with how to handle whois data for multiple
>registrations, and does not deal with regitrar-registry protocols.

regitrar -> registrar

>This document also only deals only with variants of single characters,

Remove one 'only'.

>not variants of strings.

Add something like 'although variants can be strings'.

>1.1 Terminology

Say that this is terminology used in this memo, not necessarily of general use.

>A "string" is an ordered set of one or more characters.

'ordered set' -> 'sequence' (an ordered set does not allow duplicates, but a sequence allows repetitions).

>This document discusses characters that have equivalent or
>near-equivalent characters or strings. The "base character" is the
>character that has one or more equivalents; the "variant(s)" are the
>character(s) and/or string(s) that are equivalent to the base character.

'base character' is already used in the context of combining characters; it would be better to find another term.
It would also be good to clearly state what the purpose of the base is. My understanding is as follows:
- The base must be a single character; variants need not be.
- If blocking is used, variants are blocked, but not the base.
- The base, and not any variant, must be used in a registration request.
Say that characters are Unicode codepoints.

>A "registration bundle" is the set of all labels that comes from
>expanding all base characters for a single name into their variants.
>
>A registry is the administrative authority for a DNS zone. That is, the
>registry is the body that makes and enforces policies that are used in a
>particular zone in the DNS.

Add quotes around 'registry'. Is this the same as the general use of the term, or is this specific to the discussion here?

>The key words "MUST", "SHALL", "REQUIRED", "SHOULD", "RECOMMENDED", and
>"MAY" in this document are to be interpreted as described in RFC 2119
>[RFC2119].
>
>2. Language-based tables
>
>The registration strategy described in this document uses a table that
>lists all characters allowed for input and any variants of those
>characters.

If there are two equivalent characters, and I as a registry want to allow a registrant to use either of these when registering, do I need two entries in the table?

>Note that the table lists all characters allowed, not only
>the ones that have variants.
>
>It is widely expected that there will be different tables for the same
>language created by different people. Many languages are spoken in many
>different countries, and each country might have a different view of
>which characters should or should not be considered needed for that
>language. For example, some people would say that the Latin characters
>are needed for various Indic languages, while others would say that
>they are not.

I think the question of whether ASCII is allowed or not is a very special one, which should be considered separately.
There may be cases where ASCII is already allowed; there is a good argument for always allowing ASCII, in particular if the higher-level domains are all ASCII; there is a good argument to not allow ASCII if the higher-level domains are all non-ASCII. These arguments are rather different from arguments about allowing a few more or fewer characters.

>A zone needs to have exactly one table; having more than one table can
>lead to unpredictable results because the variants in the different
>tables may conflict. The table must be carefully composed

The word 'composed' here suggests that it is composed from several different tables. But I think the general statement is that a table must be carefully checked, in all cases, even for single languages.

>so that all
>expected variants will be created, and no unexpected variants are
>created.
>
>The registry's table MUST NOT have more than one entry for a particular
>base character. A table with more than one variant rule

Add 'for the same base character'.

>requires that
>some names be evaluated by humans and will open the registration process
>to dispute.

What about tables that don't have the same base character twice, but may map to the same character? E.g.:

   U+00E8|U+0065
   U+00E9|U+0065

Or where a base character also appears as a variant:

   U+00E8|U+0065
   U+0065

Or where a base character appears as part of a variant:

   U+00FC|U+0075U+0065
   U+0075
   U+0065

>The tables are language-specific, although it is possible to create a
>single table that covers multiple languages. The following three
>sub-sections describe the use of tables in three scenarios.
>
>2.1 Table for a zone that uses names from one language
>
>A zone that has a single language has a significant advantage over
>zones that cover multiple languages. Its table can be constructed
>without concern for variants that appear in other languages for the
>base characters of the language used in the zone.
>
>2.2 Table for a zone that uses names from a small number of languages
>
>If a zone covers more than one language, the registry must create its
>registration table from multiple language tables. Creating a table from
>many languages is easy if none of the languages have overlapping
>character variants for any single base character.
>
>A registry MUST NOT blindly combine multiple tables which have
>overlapping equivalences. Instead, the registry MUST carefully analyze
>every instance in the combined table where a base character has one or
>more different variants and select the desired set of variants for the
>base character.
>
>2.3 Table for a zone that has no language restrictions
>
>A registry that does not restrict the number of languages will probably
>allow a much wider range of characters to be used in names. At the same
>time, that registry cannot easily use character variants because
>variants for one language will be different from the variants used in a
>different language. To handle conflicting variants among languages, the
>registry can choose to have no variants for any base characters, or can
>choose to have variants for a subset of the languages that are
>expressible in the characters allowed.
>
>
>3. Table processing rules
>
>The input to the process is called the "input label".

The input is a label and a table, I guess.

>The output of the
>process is either failure (the input label cannot be registered at all),
>or a registration bundle that contains one or more labels that have been
>processed with ToASCII.

It doesn't seem necessary to describe the algorithm as using ToASCII at the end. It would be more straightforward to have one Unicode string as input, and several as output. From a registrant's and from a user's point of view, and from the rest of this memo, these are the things mapped/blocked.

>Processing the input label requires two versions of ToASCII: "standard
>ToASCII" and "enhanced ToASCII".
>Standard ToASCII is exactly the same as
>the ToASCII in [IDNA]. Enhanced ToASCII is standard ToASCII with the
>steps from section 3.1 added.
>
>Note that the process MUST be executed only once. The process MUST NOT
>be run on any output of the process, only on the new label that was
>input.
>
>
>3.1 Creating enhanced ToASCII.

Probably no need for a '.' at the end of the title.

It would be clearer if there was a description for 'checking the input label' (preparation and checking against base characters in table) and another description for 'creating variant strings' (iteration through string and creation of combinations). This would avoid the term 'enhanced ToASCII', which is bound to create some confusion.

>During the processing, an "temporary bundle" contains partial labels,
>that is, labels that are being built and are not complete labels. The
>partial labels in the temporary bundle consist of Unicode characters.

The partial labels are strings, not necessarily single characters.

>The following steps after step 2 but before step 3 of ToASCII.

This implies that we continue with ToASCII after 2e). But the continuation for the input label is in 2b), and for the variants, in 2da).

>2a) Split the input label into individual characters, called "candidate
>characters". Compare each candidate character against the base
>characters in the table. If any candidate character does not exist in
>the set of base characters, the system MUST stop and not register any
>names (that is, it MUST not register either the base name or any labels
>that would have come from character variants).
>
>2b) Continue the steps in standard ToASCII for the input label. If
>ToASCII fails for the input label, the system MUST stop and not register
>any of the labels (even if the other labels would have passed ToASCII).
>If ToASCII succeeds, add the result to the registration bundle.
>
>2c) For each candidate character in the input label, do the following:

This is confusing.
To show you why I think it is confusing, let's assume we have the following input label:

   U+0064U+0064U+0065 (dde)

and the following table:

   U+0064 (d)
   U+0065|U+0066:U+0067 (e|f:g)

If we start out with a 'temporary bundle' containing a single empty string, i.e. [""], then after 2c1), we have ["d"], then we go to 2c3) which (probably) does nothing because there are no variants, then back to 2c1), we get ["dd"], and the next time round, we get ["dde"]. Then at 2c2), for variant 'f', we get at 2c2a): ["dde","dde"], and then for variant 'g' we get ["dde","dde","dde","dde"] again by duplication at 2c2a). Then at 2c3), we get ["ddeg","ddeg","ddeg","ddeg"]. Of course what was intended was to get ["dde","ddf","ddg"], but that needs a different description.

>   2c1) Copy the candidate character into every partial label in the
>   temporary bundle. If the base character that matches the candidate
>   character has no variants, go to step 2c3.
>
>   2c2) For each variant of the base character, do the following:
>
>      2c2a) Duplicate all of the current partial labels in the
>      temporary bundle.
>
>      2c2b) If this is the last variant, go to step 2c3; otherwise,
>      select the next variant, and go to step 2c2a.
>
>   2c3) Copy the variant into each partial label.
>   2c4) If there are more candidate characters, select the next
>   candidate character and got to step 2c1. Otherwise, go to step 2d.
>
>2d) The temporary bundle now contains zero or more labels that consist
>of Unicode characters. For each label in the temporary bundle:
>
>   2da) Process the label with standard ToASCII.
>
>   2db) If ToASCII succeeds, put the result in the registration bundle.
>   Otherwise, do not put anything into the registration bundle.
>
>   2dc) Select the next label and go to step 2da.
>
>2e) The resulting registration bundle has all the labels in ToASCII
>encoding. Finish.

What if some labels in this bundle conflict with already existing registrations? What if the same label appears more than once in the bundle?

>4. Table format
>
>The format of the table is meant to be machine-readable but not
>human-readable. It is fairly trivial

For some people, writing a C program or a perl script is 'fairly trivial'. For others, it's not. It is easy to change the format to make it even more trivial.

>to convert the table into one
>that can be read by people.
>
>Each character in the table is given in the "U+" notation for Unicode
>characters. The lines of the table are terminated with either a carriage
>return character (ASCII 0x0D), a linefeed character (ASCII 0x0A), or a
>sequence of carriage return followed by linefeed (ASCII 0x0D 0x0A). The
>order of the lines in the table do not matter.
>
>Each line in the table starts with the character that is allowed in the
>registry. If that character has any variants, the base character

Is 'the character'/'that character' the 'base character'?

>is
>followed by a vertical bar character ("|", ASCII 0x7C) and the variant
>string. If the base character has more than one variant, the variants
>are separated by a colon (":", ASCII 0x3A). Strings are given without
>any intervening spaces
>
>The following is an example of how a table might look. The entries in
>this table are purposely silly and should not be used by any registry as
>the basis for choosing variants. For the example, assume that the
>registry:
>- allows the FOR ALL character (U+2200) with no variants
>- allows the COMPLEMENT character (U+2201) which has a single variant
>  of LATIN CAPITAL LETTER C (U+0043)
>- allows the PROPORTION character (U+2237) which has one variant which
>  is the string COLON (U+003A) COLON (U+003A)
>- allows the PARTIAL DIFFERENTIAL character (U+2202) which has two
>  variants: LATIN SMALL LETTER D (U+0064) and GREEK SMALL LETTER DELTA
>  (U+03B4)
>
>The table would look like:
>U+2200
>U+2201|U+0043
>U+2237|U+003AU+003A
>U+2202|U+0064;U+03B4
>
>The registry's table MUST NOT have more than one entry for a particular
>base character.

What about other restrictions?
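For readers who want to experiment with the table format quoted above, here is a minimal Python sketch of a reader for it. It is only an illustration under stated assumptions, not code from the draft: the names parse_codepoints and parse_table are hypothetical, variants are split on the colon that the draft's prose specifies (so the sample input writes the last entry with a colon), and the only restriction enforced is the one the draft states, namely one entry per base character.

```python
def parse_codepoints(field):
    """Turn a run such as 'U+003AU+003A' into the string '::'.

    Hex values may have more than four digits, so characters above
    0xFFFF are handled as well.
    """
    if not field.startswith("U+"):
        raise ValueError(f"expected U+ notation, got {field!r}")
    # Splitting on 'U+' leaves an empty first element, which is skipped.
    return "".join(chr(int(h, 16)) for h in field.split("U+")[1:])


def parse_table(text):
    """Return {base_character: [variant_string, ...]} for a table.

    Assumptions (hypothetical, for illustration): blank lines are
    ignored, and a duplicate base character is treated as an error.
    """
    table = {}
    # splitlines() accepts CR, LF, and CRLF line terminators.
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        base_field, _, variant_field = line.partition("|")
        base = parse_codepoints(base_field)
        if len(base) != 1:
            raise ValueError(f"base must be a single character: {line!r}")
        if base in table:
            raise ValueError(f"duplicate base character: {base_field}")
        variants = []
        if variant_field:
            variants = [parse_codepoints(v) for v in variant_field.split(":")]
        table[base] = variants
    return table


# The draft's example, with mixed line terminators and one character
# above 0xFFFF added to exercise the long-codepoint case.
EXAMPLE = (
    "U+2200\n"
    "U+2201|U+0043\r\n"
    "U+2237|U+003AU+003A\r"
    "U+2202|U+0064:U+03B4\n"
    "U+10400\n"
)
example_table = parse_table(EXAMPLE)
```

A registry tool would probably want further checks, such as the overlapping-variant cases raised earlier in this review; those are left out here on purpose.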
>Implementors of table processors should remember that there are tens of
>thousands of characters whose codepoints are greater than 0xFFFF. Thus,
>any program that assumes that each character in the table is represented
>in exactly six octets ("U", "+", and exactly four octets representing
>the character value) will fail with tables that use characters whose
>value is greater than 0xFFFF.
>
>
>5. Steps after registering an input label
>
>A registry has three options for how to handle the case where
>the registration bundle has more than one label. The policy options are:
>
>1) Allocate all labels to the same registrant, making
>the zone information identical to that of the input label.
>
>2) Block all labels so they cannot be registered in the
>future.

Does 'all labels' include the input label?

>3) Allocate some labels and block some other labels.
>
>Option 1 will cause end users to be able to find names with variants
>more easily, but will result in larger zone files. For some
>language tables, the zone file could become so large that it
>could negatively affect the ability of the registry to perform name
>resolution.
>
>Option 2 does not increase the size of the zone file, but it
>may cause end users to not be able to find names with variants
>that they would expect.
>
>Option 3 is likely to cause the most confusion with users because
>including some variants will cause a name to be found, bout using
>other variants will cause the name to be not found.
>
>With any of these three options, the registry MUST keep a database that
>links each label in the registration bundle to the input label. This link
>needs to be maintained so that changes in the non-DNS registration
>information (such as the label's owner name and address) is reflected in
>every member of the registration bundle as well.
>
>If the registry chose option 1, when the zone information for the input
>label changes, the zone information for all the members of the
>registration bundle MUST change in exactly the same way. The zone
>information for every member of the registration bundle MUST remain
>identical as long as any of the members of the registration bundle
>remain in the zone. A registry can keep the zone information for the
>registration bundle identical using a database, or using DNAME records,
>or using a combination of the two.
>
>If the registry chose option 2, when the zone information for the input
>label changes, the blocked information for all the members of the
>registration bundle MUST be identical to that of the input label, and
>MUST remain identical as long as the input label remains in the zone. A
>registry can keep the zone and blocked name information for the
>registration bundle identical using a database.
>
>If the registry chose option 3, it must use an unspecified method to
>keep the elements in the registration bundle cohesive. This option
>SHOULD NOT be used except under carefully-controlled circumstances.
>7. Owner implications of multiple labels
>
>The creation of a registration bundle for equivalent or near-equivalent
>labels in a zone at the time of registration leads to many delegations.
>This leads to records in parallel zones which MUST be synchronized. That
>is, the owner of a registration bundle MUST keep the same information in the
>zone for each label in the bundle.
>
>Using the examples from section 6, assume that the owner of the label
>"pale" and "pa1e" creates a subdomain, "www". If the owner of
>"example.com" used multiple delegations for the labels, the owner of
>"pale" and "pa1e" would use two records:
>
>    $ORIGIN pale.example.com.
>    www     IN   A      1.2.3.4
>
>    $ORIGIN pa1e.example.com.
>    www     IN   A      1.2.3.4
>
>An alternative for these two records, which helps the registrant
>keep their names in synch, would be:
>
>    $ORIGIN pale.example.com.
>    www     IN   A      1.2.3.4
>
>    $ORIGIN pa1e.example.com.
>    www     IN   CNAME  www.pale.example.com.
>
>If the owner of "example.com" used a DNAME

CNAME or DNAME?

>record to make "pale" and
>"pa1e" equivalent, the owner of "pale" and "pa1e" could instead use one
>record:

Lots of 'if' and 'would' here, suggesting that this is not necessarily a good thing to do. But looking at the result, CNAME is much easier to handle than anything else. Relying on an arbitrary registrant to keep their (potentially many) variants straight doesn't sound like something reliable. So I would propose that we make a strong recommendation for CNAME.

>    $ORIGIN pale.example.com.
>    www     IN   A      1.2.3.4
>
>
>8. Security considerations
>
>Apart from considerations listed in the IDNA specification, this
>document explicitly talks about equivalences that a registry can define
>as part of the policy which can be applied in a zone. A registry can
>apply an equivalence table which solves some problems with homographs
>already outlined in the security consideration section of IDNA. This
>might be considered good for security because it will reduce the
>possible confusion for the user, and lower the risk that the user will
>"connect" to a service which was not intended.

This should mention a) the potential of security problems created by badly designed tables, and b) the potential for user confusion and related security problems created by different tables for different zones (e.g. in .com, certain equivalences are valid, and users get used to these, but then in .new, these equivalences are not used, and users can get spoofed).

Regards, Martin.
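The review above notes that the intended result for input "dde" with the table d / e|f:g is ["dde","ddf","ddg"]. One way to describe that expansion is as a cross product over the choices available at each position. The Python sketch below is only an illustration under stated assumptions: expand_bundle is a hypothetical name, the function takes a Unicode string and returns Unicode strings (ToASCII is left out, following the suggestion in the review), and repeated labels are dropped, which is one possible answer to the duplicate question raised above rather than anything the draft specifies.

```python
from itertools import product


def expand_bundle(label, table):
    """Expand a label into its registration bundle (Unicode in and out).

    `table` maps each allowed base character to a list of variant
    strings. Returns None when the label contains a character that is
    not a base character in the table (the draft's failure case).
    """
    choices = []
    for ch in label:
        if ch not in table:
            return None
        # Each position may keep the base character or take any variant.
        choices.append([ch] + list(table[ch]))

    bundle = []
    seen = set()
    for combo in product(*choices):
        candidate = "".join(combo)
        # Drop repeats so that no label appears twice in the bundle.
        if candidate not in seen:
            seen.add(candidate)
            bundle.append(candidate)
    return bundle


# The table and input label from the worked example in this review.
REVIEW_TABLE = {"d": [], "e": ["f", "g"]}
review_bundle = expand_bundle("dde", REVIEW_TABLE)
```

The input label always comes first in the result, because each position lists its base character before its variants.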
From owner-idn-reg-policy Wed Apr 2 15:49:57 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h32NnuJM011182 for ; Wed, 2 Apr 2003 15:49:56 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h32NnulS011181 for idn-reg-policy-bks; Wed, 2 Apr 2003 15:49:56 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from [66.125.125.92] (adsl-63-202-92-152.dsl.snfc21.pacbell.net [63.202.92.152]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h32NnsJP011169 for ; Wed, 2 Apr 2003 15:49:55 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: In-Reply-To: References: <200303311956.h2VJuD6v006142@ludwigV.sources.org> <01c701c2f908$52fef390$8c7a4b0a@neteka.inc> Date: Wed, 2 Apr 2003 15:49:53 -0800 To: From: Paul Hoffman / IMC Subject: Re: Framework for IDN Operations Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Whoops! The last message went out without a very important "not" in it. It should have read: At 6:09 AM -0500 4/2/03, Edmon Chung wrote: >Any thoughts or comments will be great. Everyone on this list should see Neteka's wording in .
Because of the nature of the Neteka statement, I would *not* want to have these documents turned into a standard, and even worse into an Informational RFC or Best Current Practice RFC (where the Neteka patent statement doesn't even apply and therefore Neteka could nail anyone they wanted to under any terms). --Paul Hoffman, Director --Internet Mail Consortium From owner-idn-reg-policy Wed Apr 2 16:58:17 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h330wGJM014943 for ; Wed, 2 Apr 2003 16:58:16 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h330wGmT014942 for idn-reg-policy-bks; Wed, 2 Apr 2003 16:58:16 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from neteka.com (www.namesbeyond.com [216.220.34.103]) by above.proper.com (8.12.9/8.11.6) with SMTP id h330wEJM014917 for ; Wed, 2 Apr 2003 16:58:15 -0800 (PST) Message-ID: <022401c2f97c$161cc3c0$8c7a4b0a@neteka.inc> From: "Edmon Chung" To: , "Paul Hoffman / IMC" References: <200303311956.h2VJuD6v006142@ludwigV.sources.org> <01c701c2f908$52fef390$8c7a4b0a@neteka.inc> Subject: Re: Framework for IDN Operations Date: Wed, 2 Apr 2003 19:58:01 -0500 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2600.0000 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2600.0000 Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Hi Everyone, understand that patent related submissions would include a section stating IPR considerations. Also I will name the draft under "dnsii" as mentioned in the statement. 
So, the recent new ones are not :-) Edmon ----- Original Message ----- From: "Paul Hoffman / IMC" To: Sent: Wednesday, April 02, 2003 6:49 PM Subject: Re: Framework for IDN Operations > > Whoops! The last message went out without a very important "not" in > it. It should have read: > > At 6:09 AM -0500 4/2/03, Edmon Chung wrote: > >Any thoughts or comments will be great. > > Everyone on this list should see Neteka's wording in > . > > Because of the nature of the Neteka statement, I would *not* want to > have these documents turned into a standard, and even worse into an > Informational RFC or Best Current Practice RFC (where the Neteka > patent statement doesn't even apply and therefore Neteka could nail > anyone they wanted to under any terms). > > --Paul Hoffman, Director > --Internet Mail Consortium From owner-idn-reg-policy Wed Apr 2 17:48:16 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h331mGJM017561 for ; Wed, 2 Apr 2003 17:48:16 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h331mGaa017560 for idn-reg-policy-bks; Wed, 2 Apr 2003 17:48:16 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from nicemice.net (arwen.CS.Berkeley.EDU [128.32.132.165]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h331mFJM017552 for ; Wed, 2 Apr 2003 17:48:15 -0800 (PST) Received: from amc by nicemice.net with local (Exim 3.35 #1 (Debian)) id 190tpy-0002xF-00 for ; Wed, 02 Apr 2003 17:48:18 -0800 Date: Thu, 3 Apr 2003 01:48:18 +0000 From: "Adam M. 
Costello" To: IDN registration policy list Subject: Re: initial thoughts Message-ID: <20030403014818.GC8966@nicemice.net> Reply-To: IDN registration policy list References: <20030331210443.GB15622@nicemice.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4i Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Paul Hoffman / IMC wrote: > I believe that it would be really, really hard to describe how one > would use strings (sequences of characters) as input to the table. Couldn't you simply say that when more than one key matches, use the longest match? It would be trickier to implement, but that's exactly how routing tables work. It seems pretty easy to describe. Whether it would be worth the additional complexity, I don't know. > > We both propose partitioning the set of admissible labels into > > groups (bundles). Paul imagines a function that constructs the > > entire group given any member of the group, while I imagine a > > function that computes a group identifier given any member of the > > group, where the group identifier could be a single member of the > > group arbitrarily chosen to stand for the whole group. Either > > kind of function implies the same partition of the space, and both > > functions would use the same tables, but I think the group-id > > function might be easier to describe and understand. > > This seems like a somewhat academic difference Yes, it is academic, because it's possible to define the exact same bundles either way. But people might find it difficult to wrap their heads around the bundle-generating function and reason about it. I know I do. 
I can't really follow the description in the draft, and even if I could, I think I would have difficulty answering this fundamental question: Is it possible that bundle(labelX) and bundle(labelY) are neither equal nor disjoint (that is, they overlap but are not exactly the same)? I've been assuming that you intended for the bundles to form a partition; that is, any two bundles are either equal or disjoint. Was that indeed your intention? The approach I proposed is to define the bundles implicitly, like so: labelX and labelY belong to the same bundle iff bundleID(labelX) == bundleID(labelY). This obviously forms a partition, no matter how the bundleID() function behaves. With this approach, you wouldn't need to generate all the variants when the label is registered (or ever). You could just create one entry under the bundle ID, containing the registrant info and a list of active variants (which could be just a handful). Whenever someone wants to register a label, or activate/deactivate a variant, you would compute bundleID(label) to see which bundle it belongs to. I find this approach easier to conceptualize, perhaps because it goes in the same direction as Stringprep's mapping and normalization steps (many input strings map to the same output string) and would use tables in a similar way. The function you propose uses tables in the other direction, so that one input string maps to many different output strings. I have no practice thinking that way. :) > > Will it scale? > > Absolutely. Suppose JPNIC decides that hiragana and katakana should block each other. [Background for readers not familiar with Japanese writing: In addition to the ideographic script (kanji), there are two parallel phonetic scripts: hiragana (for normal Japanese words) and katakana (for words recently imported from other languages, and sometimes merely for emphasis, like italics).] 
Now imagine a label consisting of 39 hiragana (for example, the Japanese translation of "good morning good day good evening good night thank you very much" can be written using 39 hiragana and fits in a single label with one byte to spare). The number of variants is over a half million million (5e11). At a storage cost of 64 bytes per variant, that's 32 terabytes, just for this one registration. > > You could also imagine that the registry, rather than the > > registrant, chooses which member(s) will be visible, but I think it > > would be difficult for registries to come up with rules that would > > please everyone; it would probably be easier to let the registrant > > choose, and simply store the list. > > Speaking of "scaling", your suggestion makes scaling harder. That > is, the registration process would have to include a step where > the registrant chooses some things from a list. This is much more > difficult than the registry saying "here's what you get, you can > contact us if you don't like it". Well, of course they won't like it, > but at least the process isn't blocked. I would still call that "registrant chooses", even though the registry is offering a default choice, because the registry still needs to be able to store lists of visible names for registrants who ask to deviate from the default. When I said "registry chooses", I meant that the registrant has zero input, so that the registry could define the choice algorithmically and omit any capability of storing lists. > It is up to the registry to decide what makes more sense to their > customers. Certainly. 
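[Illustrative note] A minimal sketch of the group-identifier approach and the storage estimate discussed above. It assumes a hypothetical policy in which every katakana letter folds to the hiragana letter 0x60 code points below it; the names bundle_id, same_bundle and variant_count are made up for this illustration and appear in neither draft.

```python
# Hypothetical policy: hiragana and katakana block each other, so both
# spellings of a label must fall into the same bundle.
KATAKANA_FIRST = 0x30A1  # KATAKANA LETTER SMALL A
KATAKANA_LAST = 0x30F6   # KATAKANA LETTER SMALL KE
KANA_OFFSET = 0x60       # katakana code point minus matching hiragana


def bundle_id(label):
    """Return one arbitrarily chosen member (all-hiragana) of the bundle."""
    return "".join(
        chr(ord(c) - KANA_OFFSET)
        if KATAKANA_FIRST <= ord(c) <= KATAKANA_LAST
        else c
        for c in label
    )


def same_bundle(label_x, label_y):
    """Two labels share a bundle iff their bundle IDs are equal."""
    return bundle_id(label_x) == bundle_id(label_y)


def variant_count(label, choices_per_char=2):
    """Size of the fully generated bundle when every character has
    the same number of interchangeable forms (assumed: 2)."""
    return choices_per_char ** len(label)


if __name__ == "__main__":
    label = "あ" * 39  # stand-in for any 39-hiragana label
    n = variant_count(label)
    print(n)                   # number of variants, about 5.5e11
    print(n * 64 // 2 ** 40)   # storage in TiB at 64 bytes per variant
```

Because same_bundle() is defined through equality of bundle IDs, the bundles form a partition whatever the folding table contains, and the registry stores one entry per bundle ID rather than every variant.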
AMC From owner-idn-reg-policy Wed Apr 2 18:39:43 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h332dhJM019080 for ; Wed, 2 Apr 2003 18:39:43 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h332dhKk019079 for idn-reg-policy-bks; Wed, 2 Apr 2003 18:39:43 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from hosting.altserver.com (hosting.altserver.com [209.124.80.2]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h332dgJM019075 for ; Wed, 2 Apr 2003 18:39:42 -0800 (PST) Received: from f02m-7-22.d1.club-internet.fr ([212.194.18.22] helo=mine.jefsey.com) by hosting.altserver.com with esmtp (Exim 3.36 #1) id 190udk-0004aO-00 for idn-reg-policy@imc.org; Wed, 02 Apr 2003 18:39:44 -0800 Message-Id: <5.2.0.9.0.20030403034851.00a507d0@mail.jefsey.com> X-Sender: jefsey+jefsey.com@mail.jefsey.com X-Mailer: QUALCOMM Windows Eudora Version 5.2.0.9 Date: Thu, 03 Apr 2003 04:17:31 +0200 To: idn-reg-policy@imc.org From: "JFC (Jefsey) Morfin" Subject: Re: Framework for IDN Operations In-Reply-To: <022401c2f97c$161cc3c0$8c7a4b0a@neteka.inc> References: <200303311956.h2VJuD6v006142@ludwigV.sources.org> <01c701c2f908$52fef390$8c7a4b0a@neteka.inc> Mime-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1"; format=flowed Content-Transfer-Encoding: 8bit X-AntiAbuse: This header was added to track abuse, please include it with any abuse report X-AntiAbuse: Primary Hostname - hosting.altserver.com X-AntiAbuse: Original Domain - imc.org X-AntiAbuse: Originator/Caller UID/GID - [0 0] / [0 0] X-AntiAbuse: Sender Address Domain - jefsey.com Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Let me try to clarify what is proposed on this list in plain wording, to be sure there is no mistake. 1. 
IDNs introduce (much) more possible confusion than DNSascii names. The change, however, is in the number of confusions rather than in the possibility of confusion: IBM.com and 1BM.com are already possible with standard DNs. Judges will use that comparison to take decisions easily, which could change the whole mechanism at any time.

2. IDNs introduce a way to write in plain language what previously had to be written in ASCII only. This means that http://liberté.com can duplicate http://liberte.com. This is a subjective issue where a Judge may also take a decision changing the mechanism, all the more so because different IDNs may correspond to the same ASCII sequence in different languages (ccTLDs).

The proposed solutions consist in addressing these problems with a technical mechanism (I like the word being unclear and generic: this is a word the Judges will like and use). This mechanism, however, cannot be described in permanent legal contractual terms such as "whole numeric DNs will not be accepted".

Also, these rules are supposed to be established on a per-registry basis. Does this mean that something may be permitted in ".com" and prohibited in ".de"? It means that I may protect a TM in one registry and not in another. Yet if someone obtains a change in a mechanism from a Judge, I may all of a sudden be able to protect it, which means that if I am not watching, a squatter may register it. It may also mean that all of a sudden I lose the right to use an until-now legitimate DN because the mechanism has changed in some other area.

If I am not mistaken and if this may happen, even with a very low risk factor, the first thing to do is for Registries to implement a wish list where all the denied registrations would be listed and allocated by lottery among those who were denied them. So the tables would keep not only the existing equivalents, but also the possible claims. This would probably decrease the pressure on Registries.
Otherwise everytime they will give away an accentuated DN to match a non accentuated one they will fear to be sued by other registrants. I think that the iWhos should list all the reserved words. Today if I want 1BM.com I can find who owns IBM.com and buy it :-) to avoid confusion. If I really want http://liberté.com I must be able to know who has it reserved and to buy the whole lot. jfc From owner-idn-reg-policy Wed Apr 2 18:51:43 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h332phJM019279 for ; Wed, 2 Apr 2003 18:51:43 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h332phh9019278 for idn-reg-policy-bks; Wed, 2 Apr 2003 18:51:43 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from [66.125.125.92] (adsl-63-202-92-152.dsl.snfc21.pacbell.net [63.202.92.152]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h332pcJN019272; Wed, 2 Apr 2003 18:51:39 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: In-Reply-To: <022401c2f97c$161cc3c0$8c7a4b0a@neteka.inc> References: <200303311956.h2VJuD6v006142@ludwigV.sources.org> <01c701c2f908$52fef390$8c7a4b0a@neteka.inc> <022401c2f97c$161cc3c0$8c7a4b0a@neteka.inc> X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . 
Date: Wed, 2 Apr 2003 18:46:13 -0800 To: "Edmon Chung" , From: Paul Hoffman / IMC Subject: Re: Framework for IDN Operations Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 7:58 PM -0500 4/2/03, Edmon Chung wrote: >Hi Everyone, >understand that patent related submissions would include a section stating >IPR considerations. Also I will name the draft under "dnsii" as mentioned >in the statement. >So, the recent new ones are not :-) Your company is on public record as saying that you have applied for a patent that may be relevant, and that you will only grant rights to the patent if it is used in a standards-track technology. If your corporate officers want to change that statement, that's fine; until they do, I would not suggest that anyone take the above message as legally binding or even relevant. --Paul Hoffman, Director --Internet Mail Consortium From owner-idn-reg-policy Wed Apr 2 18:56:21 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h332uLJM019380 for ; Wed, 2 Apr 2003 18:56:21 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h332uL94019379 for idn-reg-policy-bks; Wed, 2 Apr 2003 18:56:21 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from nicemice.net (arwen.CS.Berkeley.EDU [128.32.132.165]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h332uKJM019375 for ; Wed, 2 Apr 2003 18:56:20 -0800 (PST) Received: from amc by nicemice.net with local (Exim 3.35 #1 (Debian)) id 190uts-00036F-00 for ; Wed, 02 Apr 2003 18:56:24 -0800 Date: Thu, 3 Apr 2003 02:56:24 +0000 From: "Adam M. 
Costello" To: idn-reg-policy@imc.org Subject: Re: New Internet Draft on registering IDNs Message-ID: <20030403025624.GD8966@nicemice.net> Reply-To: IDN registration policy list References: <4.2.0.58.J.20030402171329.0294df10@localhost> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4.2.0.58.J.20030402171329.0294df10@localhost> User-Agent: Mutt/1.4i Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID:

Martin Duerst wrote:

> CNAME or DNAME?

CNAME and DNAME mean different things. CNAME defines an alias for a particular name (the alias is a leaf in the tree). DNAME defines an alias for a domain suffix (the alias effectively creates a mirror of a whole subtree).

Paul's draft describes three different methods to get www.foo.example.com and www.bar.example.com to refer to the same server:

parallel delegations and resource records:

  example.com zone:
    foo.example.com.      NS
    bar.example.com.      NS

  foo.example.com zone:
    www.foo.example.com.  A      1.2.3.4

  bar.example.com zone:
    www.bar.example.com.  A      1.2.3.4

parallel delegations with CNAME:

  example.com zone:
    foo.example.com.      NS
    bar.example.com.      NS

  foo.example.com zone:
    www.foo.example.com.  A      1.2.3.4

  bar.example.com zone:
    www.bar.example.com.  CNAME  www.foo.example.com

single delegation with DNAME:

  example.com zone:
    foo.example.com.      NS
    bar.example.com.      DNAME  foo.example.com.

  foo.example.com zone:
    www.foo.example.com.  A      1.2.3.4

The last method is the simplest, because there is no bar zone to manage. On the other hand, it's less flexible--the manager of the foo zone no longer has the ability to cause www.bar.example.com to refer to a distinct server, except by asking the manager of example.com to change the DNAME to an NS.

Hmmm, is there a fourth method?

parallel delegations with DNAME:

  example.com zone:
    foo.example.com.      NS
    bar.example.com.      NS

  foo.example.com zone:
    www.foo.example.com.  A      1.2.3.4

  bar.example.com zone:
    bar.example.com.      DNAME  foo.example.com

Is that allowed? This way the foo manager wouldn't need to update the bar zone whenever a new host gets added to the foo zone, and would still have the power to fork off the bar.example.com domain without having to wait for updates in the example.com zone.

AMC

From owner-idn-reg-policy Wed Apr 2 19:44:56 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h333iuJM021360 for ; Wed, 2 Apr 2003 19:44:56 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h333iujt021359 for idn-reg-policy-bks; Wed, 2 Apr 2003 19:44:56 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from neteka.com (www.namesbeyond.com [216.220.34.103]) by above.proper.com (8.12.9/8.11.6) with SMTP id h333isJM021354 for ; Wed, 2 Apr 2003 19:44:54 -0800 (PST) Message-ID: <03da01c2f993$5e4cb7b0$8c7a4b0a@neteka.inc> From: "Edmon Chung" To: , "Paul Hoffman / IMC" References: <200303311956.h2VJuD6v006142@ludwigV.sources.org> <01c701c2f908$52fef390$8c7a4b0a@neteka.inc> <022401c2f97c$161cc3c0$8c7a4b0a@neteka.inc> Subject: Re: Framework for IDN Operations Date: Wed, 2 Apr 2003 22:44:41 -0500 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2600.0000 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2600.0000 Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID:

Perhaps the best way then is for me to add an IPR considerations section to the documents, explicitly stating that the discussions within the documents are not bound by any patents implied by the statement. Would that work for you? Anyway, I look forward to comments and feedback on the documents themselves.
:-) Edmon ----- Original Message ----- From: "Paul Hoffman / IMC" To: "Edmon Chung" ; Sent: Wednesday, April 02, 2003 9:46 PM Subject: Re: Framework for IDN Operations > > At 7:58 PM -0500 4/2/03, Edmon Chung wrote: > >Hi Everyone, > >understand that patent related submissions would include a section stating > >IPR considerations. Also I will name the draft under "dnsii" as mentioned > >in the statement. > >So, the recent new ones are not :-) > > Your company is on public record as saying that you have applied for > a patent that may be relevant, and that you will only grant rights to > the patent if it is used in a standards-track technology. If your > corporate officers want to change that statement, that's fine; until > they do, I would not suggest that anyone take the above message as > legally binding or even relevant. > > --Paul Hoffman, Director > --Internet Mail Consortium > From owner-idn-reg-policy Wed Apr 2 22:57:11 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h336vAJM027122 for ; Wed, 2 Apr 2003 22:57:10 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h336vADW027121 for idn-reg-policy-bks; Wed, 2 Apr 2003 22:57:10 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from grappa.isoc.org.il (root@grappa.isoc.org.il [132.70.9.72]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h336v8JM027113; Wed, 2 Apr 2003 22:57:09 -0800 (PST) Received: from BENNYPC (benny-pc.isoc.org.il [192.114.22.72]) by grappa.isoc.org.il (8.9.3p2/8.9.0) with ESMTP id JAA18598; Thu, 3 Apr 2003 09:56:59 +0300 From: "Benny Lipsicas" To: "'Paul Hoffman / IMC'" , Subject: RE: Comparison of hoffman-idn-reg and jseng-idn-admin Date: Thu, 3 Apr 2003 10:02:11 +0200 Message-ID: <00eb01c2f9b7$583142f0$481672c0@BENNYPC> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" 
Content-Transfer-Encoding: 7bit X-Priority: 3 (Normal) X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook, Build 10.0.3416 x-mimeole: Produced By Microsoft MimeOLE V6.00.2800.1106 Importance: Normal In-Reply-To: Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Rethinking of it, I should have said last char of a word, and not a label. Hebrew has three letters which have an "ending form", i.e., when at the end of a word they are written differently (Arabic has also a few of those). Grammatically speaking, they can't be written in a word (which we need to decide whether or not to block, I guess it is legit for someone to want to write two words without separating between them). On top of that, two of those ending forms look similar to two other letters, therefore can be used to write words that look similar (like the 0 (zero) and o case). >>-----Original Message----- >>From: Paul Hoffman / IMC [mailto:phoffman@imc.org] >>Sent: Wednesday, April 02, 2003 5:59 PM >>To: Benny Lipsicas; idn-reg-policy@imc.org >>Subject: RE: Comparison of hoffman-idn-reg and jseng-idn-admin >> >>At 4:52 PM +0200 4/2/03, Benny Lipsicas wrote: >>>The language in question is Hebrew. One feature that may be of >>>importance to us is the ability to prevent certain characters from >>>appearing anywhere else but at the end of the label (i.e. it can only be >>>the last char of a label), and we have another issue, which I'm not >>>certain is in the scope of this list, any label in Hebrew needs to be >>>written RTL, and if i'm not mistaken, this technically prevents the >>>mixing of Hebrew and non-Hebrew chars in the same label. >> >>The latter issue is definitely handled by the IDNA standard. Could >>you explain the reason for the first issue (that a particular >>character has to be the last character in the label)? 
>> >>--Paul Hoffman, Director >>--Internet Mail Consortium From owner-idn-reg-policy Wed Apr 2 23:35:47 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h337ZlJM004896 for ; Wed, 2 Apr 2003 23:35:47 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h337Zl1V004894 for idn-reg-policy-bks; Wed, 2 Apr 2003 23:35:47 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from maya40.nic.fr (maya40.nic.fr [192.134.4.151]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h337ZjJM004863 for ; Wed, 2 Apr 2003 23:35:46 -0800 (PST) Received: from vespucci.nic.fr (postfix@vespucci.nic.fr [192.134.4.68]) by maya40.nic.fr (8.12.4/8.12.4) with ESMTP id h337ZXvg840705; Thu, 3 Apr 2003 09:35:34 +0200 (CEST) Received: by vespucci.nic.fr (Postfix, from userid 1055) id F30E7110F0; Thu, 3 Apr 2003 09:35:36 +0200 (CEST) Date: Thu, 3 Apr 2003 09:35:36 +0200 From: Stephane Bortzmeyer To: Martin Duerst Cc: Roozbeh Pournader , IDN registration policy list Subject: Format of the tables (U+xxxx, UTF-8, etc) Was: Comparison of hoffman-idn-reg and jseng-idn-admin Message-ID: <20030403073536.GA6569@nic.fr> References: <4.2.0.58.J.20030402145707.03cc9c18@localhost> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4.2.0.58.J.20030402145707.03cc9c18@localhost> User-Agent: Mutt/1.3.28i X-Operating-System: Debian GNU/Linux 3.0 X-Kernel: Linux 2.4.18-686 i686 Organization: NIC France X-URL: http://www.nic.fr/ Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: On Wed, Apr 02, 2003 at 03:00:31PM -0500, Martin Duerst wrote a message of 16 lines which said: > >4. Better syntax for the table. Don't you agree that a U+ABCDU+BCDAU+CDAB > >syntax is unreadable? Why can't one use a space? 
> > What about using Unicode (UTF-8) directly? > What about defining an XML format for the tables? > This would allow to publish tables in ASCII-only contexts > but also easily view them in a browser with a simple stylesheet. Good idea but there is already a solution to do so. I suggest that Paul rewrites his draft according to RFC 2629, which is widely implemented and already has stylesheets. From owner-idn-reg-policy Wed Apr 2 23:54:54 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h337ssJM007416 for ; Wed, 2 Apr 2003 23:54:54 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h337sstE007415 for idn-reg-policy-bks; Wed, 2 Apr 2003 23:54:54 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from grappa.isoc.org.il (root@grappa.isoc.org.il [132.70.9.72]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h337spJM007399; Wed, 2 Apr 2003 23:54:52 -0800 (PST) Received: from BENNYPC (benny-pc.isoc.org.il [192.114.22.72]) by grappa.isoc.org.il (8.9.3p2/8.9.0) with ESMTP id KAA19363; Thu, 3 Apr 2003 10:54:50 +0300 From: "Benny Lipsicas" To: "'Paul Hoffman / IMC'" , Subject: RE: Comparison of hoffman-idn-reg and jseng-idn-admin Date: Thu, 3 Apr 2003 11:00:02 +0200 Message-ID: <011301c2f9bf$6d0cf270$481672c0@BENNYPC> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit X-Priority: 3 (Normal) X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook, Build 10.0.3416 x-mimeole: Produced By Microsoft MimeOLE V6.00.2800.1106 Importance: Normal In-Reply-To: <00eb01c2f9b7$583142f0$481672c0@BENNYPC> Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: I sent this reply before noticing the rest of the replies posted. 
Those were very helpful, and did a better job in explaining the situation :-) - so this message can be ignored. Benny. >>-----Original Message----- >>From: owner-idn-reg-policy@mail.imc.org [mailto:owner-idn-reg- >>policy@mail.imc.org] On Behalf Of Benny Lipsicas >>Sent: Thursday, April 03, 2003 10:02 AM >>To: 'Paul Hoffman / IMC'; idn-reg-policy@imc.org >>Subject: RE: Comparison of hoffman-idn-reg and jseng-idn-admin >> >> >>Rethinking of it, I should have said last char of a word, and not a >>label. Hebrew has three letters which have an "ending form", i.e., when >>at the end of a word they are written differently (Arabic has also a few >>of those). Grammatically speaking, they can't be written in a word >>(which we need to decide whether or not to block, I guess it is legit >>for someone to want to write two words without separating between them). >>On top of that, two of those ending forms look similar to two other >>letters, therefore can be used to write words that look similar (like >>the 0 (zero) and o case). >> >> >>>>-----Original Message----- >>>>From: Paul Hoffman / IMC [mailto:phoffman@imc.org] >>>>Sent: Wednesday, April 02, 2003 5:59 PM >>>>To: Benny Lipsicas; idn-reg-policy@imc.org >>>>Subject: RE: Comparison of hoffman-idn-reg and jseng-idn-admin >>>> >>>>At 4:52 PM +0200 4/2/03, Benny Lipsicas wrote: >>>>>The language in question is Hebrew. One feature that may be of >>>>>importance to us is the ability to prevent certain characters from >>>>>appearing anywhere else but at the end of the label (i.e. it can only >>be >>>>>the last char of a label), and we have another issue, which I'm not >>>>>certain is in the scope of this list, any label in Hebrew needs to be >>>>>written RTL, and if i'm not mistaken, this technically prevents the >>>>>mixing of Hebrew and non-Hebrew chars in the same label. >>>> >>>>The latter issue is definitely handled by the IDNA standard. 
Could >>>>you explain the reason for the first issue (that a particular >>>>character has to be the last character in the label)? >>>> >>>>--Paul Hoffman, Director >>>>--Internet Mail Consortium >> >> From owner-idn-reg-policy Thu Apr 3 00:20:53 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h338KrJM011682 for ; Thu, 3 Apr 2003 00:20:53 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h338KrrO011681 for idn-reg-policy-bks; Thu, 3 Apr 2003 00:20:53 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from maya40.nic.fr (maya40.nic.fr [192.134.4.151]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h338KpJM011662 for ; Thu, 3 Apr 2003 00:20:52 -0800 (PST) Received: from vespucci.nic.fr (postfix@vespucci.nic.fr [192.134.4.68]) by maya40.nic.fr (8.12.4/8.12.4) with ESMTP id h338Klvg854095; Thu, 3 Apr 2003 10:20:47 +0200 (CEST) Received: by vespucci.nic.fr (Postfix, from userid 1055) id 58625110F0; Thu, 3 Apr 2003 10:20:51 +0200 (CEST) Date: Thu, 3 Apr 2003 10:20:51 +0200 From: Stephane Bortzmeyer To: Martin Duerst Cc: IDN registration policy list Subject: Re: Comparison of hoffman-idn-reg and jseng-idn-admin Message-ID: <20030403082051.GA6764@nic.fr> References: <4.2.0.58.J.20030402150335.035b1fc8@localhost> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4.2.0.58.J.20030402150335.035b1fc8@localhost> User-Agent: Mutt/1.3.28i X-Operating-System: Debian GNU/Linux 3.0 X-Kernel: Linux 2.4.18-686 i686 Organization: NIC France X-URL: http://www.nic.fr/ Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: On Wed, Apr 02, 2003 at 03:07:03PM -0500, Martin Duerst wrote a message of 28 lines which said: > The danger of bundles being too big can easily happen for European > languages, with a 
> bundle that defines that all accented versions of
> a character are treated as the same as the base character.

Yes, see my previous message, in the thread "New Internet Draft on registering IDNs". A typical example is the label "3suisses-assurances" (which actually exists in '.fr'), which has a bundle of 306,250 labels with a table that uses (almost) all the Latin-1 characters. Not all Latin-1 characters exist in French, so we could downsize the table and therefore the bundles. But, on the other hand, a registry like '.eu' will need an even larger table, since Europe requires more than just Latin-1.

> In that case, Paul's approach (also described by Adam) of using
> equivalence classes won't scale.

It doesn't scale if you want to actually generate the bundle and publish it in a static zone file. I tried this for the '.fr' zone, which is quite small (150,000 domains), and the resulting zone file was larger than '.com' even before the domains starting with the letter A were fully processed.

But you have other approaches:

* a dynamic DNS server like PowerDNS with a back-end that will match a label to its bundle at query-time,
* Option 2 or 3 of Paul's draft, which do not require actually storing the complete bundle.

> What may work is that an accented character blocks the base
> character, but not characters with a different accent.

Interesting. We could also draw inspiration from most Web search engines. They work this way: if there is no composed character in the query, they search "accent-insensitive". If there is at least one, they switch to "accent-sensitive".
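
[Illustrative note] The search-engine behaviour described above can be sketched as follows. This is only a sketch under stated assumptions: "composed character" is taken to mean a character whose canonical decomposition contains a combining mark, the function names are hypothetical, and only Python's standard unicodedata module is used. It is not taken from either draft.

```python
import unicodedata


def strip_accents(text):
    """Decompose the text and drop all combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def has_accent(text):
    """True if any character decomposes into base + combining mark."""
    decomposed = unicodedata.normalize("NFD", text)
    return any(unicodedata.combining(c) for c in decomposed)


def matches(query, registered):
    """Accent-sensitive if the query carries an accent, otherwise
    accent-insensitive."""
    if has_accent(query):
        return (unicodedata.normalize("NFC", query)
                == unicodedata.normalize("NFC", registered))
    return strip_accents(query) == strip_accents(registered)


if __name__ == "__main__":
    for query, registered in [("bar", "bär"), ("bär", "bar"),
                              ("bär", "bâr"), ("bär", "bär")]:
        print(query, registered, matches(query, registered))
```

With this rule an unaccented query collides with every accented form of the same base label, while an accented query collides only with the identical label, which is close to the "accented character blocks the base character, but not characters with a different accent" idea quoted above.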
From owner-idn-reg-policy Thu Apr 3 02:28:56 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h33AStJM027285 for ; Thu, 3 Apr 2003 02:28:55 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h33AStR2027284 for idn-reg-policy-bks; Thu, 3 Apr 2003 02:28:55 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from maya40.nic.fr (maya40.nic.fr [192.134.4.151]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h33ASmJM027269; Thu, 3 Apr 2003 02:28:53 -0800 (PST) Received: from vespucci.nic.fr (postfix@vespucci.nic.fr [192.134.4.68]) by maya40.nic.fr (8.12.4/8.12.4) with ESMTP id h33ASjvg893219; Thu, 3 Apr 2003 12:28:45 +0200 (CEST) Received: by vespucci.nic.fr (Postfix, from userid 1055) id E3FC1110F0; Thu, 3 Apr 2003 12:28:48 +0200 (CEST) Date: Thu, 3 Apr 2003 12:28:48 +0200 From: Stephane Bortzmeyer To: Martin Duerst Cc: Paul Hoffman / IMC , idn-reg-policy@imc.org Subject: Equivalence only in one direction (Was: New Internet Draft on registering IDNs Message-ID: <20030403102848.GA7626@nic.fr> References: <4.2.0.58.J.20030402171329.0294df10@localhost> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4.2.0.58.J.20030402171329.0294df10@localhost> User-Agent: Mutt/1.3.28i X-Operating-System: Debian GNU/Linux 3.0 X-Kernel: Linux 2.4.18-686 i686 Organization: NIC France X-URL: http://www.nic.fr/ Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: On Wed, Apr 02, 2003 at 06:35:40PM -0500, Martin Duerst wrote a message of 690 lines which said: > If there are two equivalent characters, and I as a registry want > to allow a registrant to use either of these when registering, > do I need two entries in the table? This issue is not really clear to me in the current draft. But see below. 
> what about tables that don't have the same base character twice,
> but may map to the same character? E.g.:
>
> U+00E8|U+0065
> U+00E9|U+0065

This is a special case of a more general issue, which is not clearly stated in the current draft: equivalence goes from the Left Hand Side to the Right Hand Side but not the reverse. An example in French:

# Ligature oe
U+0153|U+006FU+0065

Every occurrence of U+0153 is equivalent to the string "oe" but the reverse is not true. I suggest clarifying the draft in that respect (no, no actual suggestion yet).

From owner-idn-reg-policy Thu Apr 3 02:38:37 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h33AcaJM028787 for ; Thu, 3 Apr 2003 02:38:36 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h33AcarF028785 for idn-reg-policy-bks; Thu, 3 Apr 2003 02:38:36 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from maya40.nic.fr (maya40.nic.fr [192.134.4.151]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h33AcYJM028774; Thu, 3 Apr 2003 02:38:35 -0800 (PST) Received: from vespucci.nic.fr (postfix@vespucci.nic.fr [192.134.4.68]) by maya40.nic.fr (8.12.4/8.12.4) with ESMTP id h33AcUvg896956; Thu, 3 Apr 2003 12:38:30 +0200 (CEST) Received: by vespucci.nic.fr (Postfix, from userid 1055) id 8ADAA110F0; Thu, 3 Apr 2003 12:38:34 +0200 (CEST) Date: Thu, 3 Apr 2003 12:38:34 +0200 From: Stephane Bortzmeyer To: Martin Duerst Cc: Paul Hoffman / IMC , idn-reg-policy@imc.org Subject: Implementing the draft: first partial attempt (Was: New Internet Draft on registering IDNs Message-ID: <20030403103834.GA7922@nic.fr> References: <4.2.0.58.J.20030402171329.0294df10@localhost> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="bg08WKrSYDhXBjb5" Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To:
<4.2.0.58.J.20030402171329.0294df10@localhost> User-Agent: Mutt/1.3.28i X-Operating-System: Debian GNU/Linux 3.0 X-Kernel: Linux 2.4.18-686 i686 Organization: NIC France X-URL: http://www.nic.fr/ Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID:

--bg08WKrSYDhXBjb5 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit

On Wed, Apr 02, 2003 at 06:35:40PM -0500, Martin Duerst wrote a message of 690 lines which said: > >The format of the table is meant to be machine-readable but not > >human-readable. It is fairly trivial > > For some people, writing a C program or a perl script is > 'fairly trivial'. For others, it's not. It is easy to change > the format to make it even more trivial.

Most humans do not program; they use shrink-wrapped software. Here is a small script which parses a table in the draft format and generates the bundle. WARNING: strings in the RHS are not really supported yet (each code point of such a string is treated as a separate variant). Example of use:

~/AFNIC/IDN % ./gen-bundles.py bar
bar
bär
bâr
bār
bår
bãr
bár

--bg08WKrSYDhXBjb5 Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="gen-bundles.py"

#!/usr/bin/env python3

import re
import sys
import unicodedata

udigit = r"u\+[0-9A-F]+"
file = "variant-table"
dump_table = 0


def usage():
    print("Usage: " + sys.argv[0] + " name...")
    print("       (You can also send the names on standard input)")


class Bundles:
    """All the variant labels (the bundle) generated from one label."""

    def __init__(self, variants, label, canonical=1):
        if not canonical:
            # Replace every character by its base character first.
            canonic_label = ""
            for character in label:
                if character not in variants.characters:
                    raise ValueError(
                        'Invalid character "' + character + '" (' +
                        unicodedata.name(character, "Unknown character") + ")")
                canonic_label = canonic_label + variants.characters[character]
            label = canonic_label
        self.list = [label]
        for i in range(len(label)):
            character = label[i]
            for variant in variants.base_characters[character]:
                if variant != character:
                    # Put the variant at position i, then combine it with
                    # every member of the bundle of the remaining suffix.
                    prefix = label[:i] + variant
                    rest = label[i + 1:]
                    new_bundle = Bundles(variants, rest)
                    self.append(prefix, new_bundle)

    def append(self, prefix, bundle):
        for member in bundle.get_list():
            self.list.append(prefix + member)

    def get_list(self):
        # TODO: flatten before returning
        return self.list


class Variants:
    """Variant table, one "base|variant:variant..." entry per line."""

    def __init__(self, file):
        self.characters = {}       # any character -> its base character
        self.base_characters = {}  # base character -> list of its variants
        expr = re.compile("^(" + udigit + r")(\|(" + udigit + ":?)+)?$",
                          re.IGNORECASE)
        first_digit = re.compile("^(" + udigit + ")", re.IGNORECASE)
        with open(file, "r") as fh:
            for num, line in enumerate(fh, 1):
                line = line.strip()
                if not line or line.startswith("#"):
                    continue  # Comment or empty line
                found = expr.match(line)
                if not found:
                    raise ValueError("Invalid line " + str(num) + ": " + line)
                base_character = found.group(1)
                ubase_character = chr(self.unhex(base_character[2:]))
                self.characters[ubase_character] = ubase_character
                self.base_characters[ubase_character] = []
                if not found.group(2):
                    continue  # No variant
                over = 0
                line = line[found.end(1) + 1:]
                while not over:
                    found = first_digit.match(line)
                    if not found:
                        raise ValueError(
                            "Invalid line " + str(num) + ": " + line)
                    character = found.group(1)
                    ucharacter = chr(self.unhex(character[2:]))
                    self.characters[ucharacter] = ubase_character
                    if found.end(1) >= len(line):
                        over = 1
                        continue
                    if line[found.end(1)] == ":":
                        line = line[found.end(1) + 1:]
                    elif line[found.end(1)] in "uU":
                        # Start of the next code point of an RHS string
                        line = line[found.end(1):]
                    else:
                        raise ValueError(
                            "Invalid character sequence in line " + str(num))
        for character in self.characters.keys():
            self.base_characters[self.characters[character]].append(character)

    def unhex(self, s):
        """Get the integer value of a hexadecimal number."""
        return int(s, 16)


def print_bundle(variants, name):
    # TODO: we should nameprep before sending to Bundles()
    bundle = Bundles(variants, name.lower(), canonical=0)
    for variant in bundle.get_list():
        print(variant)


if __name__ == "__main__":
    variants = Variants(file)
    if dump_table:
        for character in variants.characters.keys():
            print(character +
                  " (" + unicodedata.name(character, "Unknown character") + ")" +
                  ". Base character: " + variants.characters[character])
    if len(sys.argv) > 1:
        for word in sys.argv[1:]:
            print_bundle(variants, word)
    else:
        for line in sys.stdin:
            name = line.rstrip("\n")  # Chop end-of-line
            if not name:
                break
            print_bundle(variants, name)

--bg08WKrSYDhXBjb5--

From owner-idn-reg-policy Thu Apr 3 02:47:04 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h33Al4JM029192 for ; Thu, 3 Apr 2003 02:47:04 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h33Al43V029191 for idn-reg-policy-bks; Thu, 3 Apr 2003 02:47:04 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from maya40.nic.fr (maya40.nic.fr [192.134.4.151]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h33Al2JM029185; Thu, 3 Apr 2003 02:47:03 -0800 (PST) Received: from vespucci.nic.fr (postfix@vespucci.nic.fr [192.134.4.68]) by maya40.nic.fr (8.12.4/8.12.4) with ESMTP id h33Akwvg899257; Thu, 3 Apr 2003 12:46:58 +0200 (CEST) Received: by vespucci.nic.fr (Postfix, from userid 1055) id 90F92110F0; Thu, 3 Apr 2003 12:47:02 +0200 (CEST) Date: Thu, 3 Apr 2003 12:47:02 +0200 From: Stephane Bortzmeyer To: Paul Hoffman / IMC Cc: Stephane Bortzmeyer , idn-reg-policy@imc.org Subject: Re: New Internet Draft on registering IDNs Message-ID: <20030403104702.GA8121@nic.fr> References: <200303311956.h2VJuD6v006142@ludwigV.sources.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: 
User-Agent: Mutt/1.3.28i X-Operating-System: Debian GNU/Linux 3.0 X-Kernel: Linux 2.4.18-686 i686 Organization: NIC France X-URL: http://www.nic.fr/ Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: On Tue, Apr 01, 2003 at 06:05:18PM -0800, Paul Hoffman / IMC wrote a message of 31 lines which said: > - you can choose which five to put in the zone; the other 1019 will be > blocked > - we have chosen the five that we will put in the zone, and the 1019 > that will be blocked > - you can put as many as you want in the zone, and it will cost you > US$10 per name per year to do so; the rest will be blocked for free > - because your bundle has 1024 names, the base bundle cost is five > times higher than if you had chosen a name that only had 128 names; > plus, you must pay... I agree with this description of options. From owner-idn-reg-policy Thu Apr 3 05:57:53 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h33DvrJM015558 for ; Thu, 3 Apr 2003 05:57:53 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h33DvrS3015557 for idn-reg-policy-bks; Thu, 3 Apr 2003 05:57:53 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from sina.sharif.edu (sina.Sharif.AC.IR [194.225.40.9]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h33DvdJM015538 for ; Thu, 3 Apr 2003 05:57:45 -0800 (PST) Received: from bamdad.org (IDENT:root@bamdad.org [81.31.160.190]) by sina.sharif.edu (8.11.6/8.11.6) with ESMTP id h33DvXF32354 for ; Thu, 3 Apr 2003 18:27:33 +0430 Received: from localhost (roozbeh@localhost) by bamdad.org (8.11.6/8.11.6) with ESMTP id h33E8mW16017 for ; Thu, 3 Apr 2003 18:38:48 +0430 X-Authentication-Warning: gilas.bamdad.org: roozbeh owned process doing -bs Date: Thu, 3 Apr 2003 18:38:48 +0430 (IRST) From: Roozbeh Pournader X-X-Sender: 
roozbeh@gilas.bamdad.org To: IDN registration policy list Subject: Re: Comparison of hoffman-idn-reg and jseng-idn-admin In-Reply-To: <20030402022055.GB30135@nicemice.net> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Virus-Scanned: by amavisd-milter (http://amavis.org/) Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: On Wed, 2 Apr 2003, Adam M. Costello wrote: > Roozbeh Pournader wrote: > > > 1. Mandatory equivalences as opposed to secondary/variant > > equivalences. This feature is necessary for defining equivalences > > between European and Arabic-Indic digit shapes in Arabic labels, for > > example. > > > > This is some feature that *all* Arabic script zones *require* > > I'm not sure what you mean by "mandatory" and "require". Well, neither am I. ;) > Suppose I create the names .nicemice.net and > .nicemice.net on the DNS server for nicemice.net (which I > control), and suppose these two names are the same except that one uses > ASCII digits and the other uses the corresponding Arabic-Indic digits, > and suppose I associate different resource records with these names. > > Should there be a standard that prohibits me from doing this? No. But the standard should allow you to do otherwise (define them to be the same thing as a general policy in your zone) automatically. Actually, I'm not worried a lot about these two referring to different resources, or how they are handled in web servers, mail servers, etc. I'm worried about these two being owned by different persons/entities. > What would you think of this model: The Arabic-speaking community > develops a best-practice recommendation regarding equivalences of > names that should resolve to the same resource records, and TLDs can > opt to support those equivalences, and can advertise their voluntary > conformance in order to attract Arabic registrants. That is very similar to the model I'm thinking about. 
But TLDs, when opting to support those, should be able to do it automatically. This is best specified in IDN administration files. > The whole point of having bundles is so that two registered labels > cannot be variants of each other. Whichever one was registered first > blocks the other from being registered. I was talking about variants that are shared between two different labels (that are not variants of each other), which *will* happen in zones, since we can't define the equivalence tables to be 'equivalence classes' (in mathematical terms). roozbeh From owner-idn-reg-policy Thu Apr 3 06:13:58 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h33EDwJM016632 for ; Thu, 3 Apr 2003 06:13:58 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h33EDweD016631 for idn-reg-policy-bks; Thu, 3 Apr 2003 06:13:58 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from sina.sharif.edu (sina.Sharif.AC.IR [194.225.40.9]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h33EDkJM016597; Thu, 3 Apr 2003 06:13:54 -0800 (PST) Received: from bamdad.org (IDENT:root@bamdad.org [81.31.160.190]) by sina.sharif.edu (8.11.6/8.11.6) with ESMTP id h33EDjF04750; Thu, 3 Apr 2003 18:43:45 +0430 Received: from localhost (roozbeh@localhost) by bamdad.org (8.11.6/8.11.6) with ESMTP id h33EP0K16197; Thu, 3 Apr 2003 18:55:00 +0430 X-Authentication-Warning: gilas.bamdad.org: roozbeh owned process doing -bs Date: Thu, 3 Apr 2003 18:55:00 +0430 (IRST) From: Roozbeh Pournader X-X-Sender: roozbeh@gilas.bamdad.org To: Paul Hoffman / IMC cc: idn-reg-policy@imc.org Subject: Re: Comparison of hoffman-idn-reg and jseng-idn-admin In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Virus-Scanned: by amavisd-milter (http://amavis.org/) Sender: owner-idn-reg-policy@mail.imc.org 
Precedence: bulk List-Archive: List-Unsubscribe: List-ID:

On Tue, 1 Apr 2003, Paul Hoffman / IMC wrote: > >2. Clear language about conflict resolution. There needs to be some clear > >guidelines or recommendations about the times that two registered labels > >come into an intersection regarding the variant labels associated to them. > >This will happen with almost any multi-language Arabic-script zone > >(e.g. U+0649 vs U+064A vs U+06CC). > > I am unclear on how this differs from point #1. If any of those three > characters are supposed to only be represented by one of them in > names, then the registry-specific early mapping step will take care > of them. Or is that not what you are referring to? Please be more > specific.

I am talking about two main labels that are not variants of each other, but share a variant. For example, a multilanguage zone that supports both Arabic and Persian will have something like this in its equivalence data table:

U+0649|U+06CC
U+064A|U+06CC
U+06CC|U+0649;U+064A

The first two characters are only used in the Arabic languages, and the third is only used in Persian. The above data is necessary if you want to distinguish the first two (that are distinguished in Arabic), but avoid security problems (U+06CC looks *identical* to U+0649 in isolated and final contextual forms, and *identical* to U+064A in medial and initial forms).

Now, when one registers a label with a U+0649 and someone else goes with a U+064A, who will own the U+06CC version?

> - the merging is a policy decision by the registry at the time of > table-making as to which language "wins" for the overlapping > characters

Or which character wins, or which registrant wins, or which orthography wins... 
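To make the shared-variant question concrete, here is a minimal sketch of one possible answer: first come, first served. This is an assumption for illustration only, not something the draft specifies. The table is the three-row Arabic/Persian example above; `bundle`, `Registry` and `register` are hypothetical names.

```python
from itertools import product

# Equivalence data from the example table: character -> its variants.
VARIANTS = {
    "\u0649": {"\u06CC"},
    "\u064A": {"\u06CC"},
    "\u06CC": {"\u0649", "\u064A"},
}


def bundle(label, variants=VARIANTS):
    """Return the label plus every label reachable by swapping in variants."""
    choices = [sorted({ch} | variants.get(ch, set())) for ch in label]
    return {"".join(combo) for combo in product(*choices)}


class Registry:
    """Toy registry: a name belongs to whoever claimed it first."""

    def __init__(self):
        self.owner = {}  # name -> registrant who holds or blocks it

    def register(self, label, registrant):
        if label in self.owner:
            raise ValueError("label already held by " + self.owner[label])
        granted = set()
        for name in bundle(label):
            # A variant already claimed by an earlier bundle stays there.
            if name not in self.owner:
                self.owner[name] = registrant
                granted.add(name)
        return granted
```

Under this assumed policy, whoever registers the U+0649 label first also gets the U+06CC form; a later U+064A registration succeeds but receives only its own label.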
> - it is impossible to register without knowing the supposed language > of the registration > > I can add more discussion of that, but the third option is not > "merging", it is forcing the problem on the registrant (who might be > sly and use it as a way to make the bundle contain things that the > registry might not have intended). From my reading of the JET > document, they call the third option "merging" when in fact it is > just the opposite: it prevents merging by pointing at one table. I would like to see a standard that limits each zone to a specific *language*, but I guess that can't happen in situations like a Swiss or European zone. But I would also like to see a nice mechanism or some guideline for a Swiss registry: "Grab the Italian, French, and German and merge them according to the following guidelines." The guidelines may require a language priority list or whatever, but it should be usable in a semi-automatic way. roozbeh From owner-idn-reg-policy Thu Apr 3 06:20:55 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h33EKsJM017161 for ; Thu, 3 Apr 2003 06:20:54 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h33EKsUn017160 for idn-reg-policy-bks; Thu, 3 Apr 2003 06:20:54 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from sina.sharif.edu (sina.Sharif.AC.IR [194.225.40.9]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h33EKmJM017149; Thu, 3 Apr 2003 06:20:50 -0800 (PST) Received: from bamdad.org (IDENT:root@bamdad.org [81.31.160.190]) by sina.sharif.edu (8.11.6/8.11.6) with ESMTP id h33EKgF07494; Thu, 3 Apr 2003 18:50:42 +0430 Received: from localhost (roozbeh@localhost) by bamdad.org (8.11.6/8.11.6) with ESMTP id h33EVpl16324; Thu, 3 Apr 2003 19:01:54 +0430 X-Authentication-Warning: gilas.bamdad.org: roozbeh owned process doing -bs Date: Thu, 3 Apr 2003 19:01:51 
+0430 (IRST) From: Roozbeh Pournader X-X-Sender: roozbeh@gilas.bamdad.org To: Martin Duerst cc: Paul Hoffman / IMC , Benny Lipsicas , Subject: RE: Comparison of hoffman-idn-reg and jseng-idn-admin In-Reply-To: <4.2.0.58.J.20030402151053.0321e328@localhost> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Virus-Scanned: by amavisd-milter (http://amavis.org/) Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: On Wed, 2 Apr 2003, Martin Duerst wrote: > So a registry registring Hebrew would want to make sure that e.g. > a kaf in the middle of a label is always U+05DB, but at the end of > the label is always U+05DA. But why? First, I'm sure there is a list of Hebrew words that use the final forms in non-final places (mostly abbreviations). Second, why would a registry want to prevent such a (complex) thing automatically in this very low level of character equivalence tables? Can't a registry just have a complicated script that after one asks for a label to register tells: Sorry, you can't register this label, because it "contains the word sex" or "uses FINAL KAF in a non-final position"? 
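The kind of registration-time check described in the question above could be sketched as follows. This is only an illustration, kept outside the equivalence tables; the function name `check_final_forms` is hypothetical, and the sketch flags only final-form letters in non-final positions, ignoring the abbreviation exceptions mentioned above.

```python
import unicodedata

# Hebrew letters with a distinct word-final form (non-final -> final).
FINAL_FORMS = {
    "\u05DB": "\u05DA",  # KAF   -> FINAL KAF
    "\u05DE": "\u05DD",  # MEM   -> FINAL MEM
    "\u05E0": "\u05DF",  # NUN   -> FINAL NUN
    "\u05E4": "\u05E3",  # PE    -> FINAL PE
    "\u05E6": "\u05E5",  # TSADI -> FINAL TSADI
}
FINALS = set(FINAL_FORMS.values())


def check_final_forms(label):
    """Return a list of objections; an empty list means the label passes."""
    problems = []
    last = len(label) - 1
    for i, ch in enumerate(label):
        if ch in FINALS and i != last:
            problems.append(
                "uses %s in a non-final position (index %d)"
                % (unicodedata.name(ch), i))
    return problems
```

A registry could run such a check before consulting its variant table and reject the request with the returned message.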
roozbeh From owner-idn-reg-policy Thu Apr 3 07:17:10 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h33FHAJM018859 for ; Thu, 3 Apr 2003 07:17:10 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h33FH9hM018858 for idn-reg-policy-bks; Thu, 3 Apr 2003 07:17:09 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from [66.125.125.92] (adsl-63-202-92-152.dsl.snfc21.pacbell.net [63.202.92.152]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h33FH8JN018852 for ; Thu, 3 Apr 2003 07:17:08 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: In-Reply-To: <20030403014818.GC8966@nicemice.net> References: <20030331210443.GB15622@nicemice.net> <20030403014818.GC8966@nicemice.net> X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . Date: Wed, 2 Apr 2003 19:15:19 -0800 To: IDN registration policy list From: Paul Hoffman / IMC Subject: Re: initial thoughts Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 1:48 AM +0000 4/3/03, Adam M. Costello wrote: >Couldn't you simply say that when more than one key matches, use the >longest match? It would be trickier to implement, but that's exactly >how routing tables work. It seems pretty easy to describe. Whether it >would be worth the additional complexity, I don't know. How do others feel about this? 
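For readers weighing the longest-match suggestion, here is a minimal sketch of what it could look like. The function name `map_longest_match` and the toy table are assumptions for illustration; the draft's actual table format and mapping step may differ.

```python
def map_longest_match(label, table):
    """Rewrite label left to right, preferring the longest matching key.

    table maps key strings (one or more characters) to replacements.
    Characters that match no key are copied unchanged.
    """
    max_len = max((len(key) for key in table), default=0)
    out = []
    i = 0
    while i < len(label):
        # Try the longest candidate first, as a routing table would.
        for size in range(min(max_len, len(label) - i), 0, -1):
            key = label[i:i + size]
            if key in table:
                out.append(table[key])
                i += size
                break
        else:
            out.append(label[i])
            i += 1
    return "".join(out)
```

With the toy table `{"a": "A", "ab": "X"}`, the label `abba` starts with both keys; the longer key `ab` wins at position 0, and the single `a` key applies only at the end.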
> > > We both propose partitioning the set of admissible labels into > > > groups (bundles). Paul imagines a function that constructs the >> > entire group given any member of the group, while I imagine a >> > function that computes a group identifier given any member of the >> > group, where the group identifier could be a single member of the >> > group arbitrarily chosen to stand for the whole group. Either >> > kind of function implies the same partition of the space, and both >> > functions would use the same tables, but I think the group-id >> > function might be easier to describe and understand. >> >> This seems like a somewhat academic difference > >Yes, it is academic, because it's possible to define the exact same >bundles either way. But people might find it difficult to wrap their >heads around the bundle-generating function and reason about it. I >know I do. Please propose alternate wording, then. The folks on the list could compare them and see which makes more sense. Given that they come up with the same bundles, we obviously would want to have the clearer wording. However, maybe wait until I get the -01 draft out with all the changes I have been promising... > > > Will it scale? > > > > Absolutely. > >Suppose JPNIC decides that hiragana and katakana should block each >other. Er, in order to say that something doesn't scale, you need to use real-world examples. JPNIC has already stated that they have no intention to do this. It is easy to come up with extreme examples for *any* protocol, then say "if you do this a zillion times it doesn't scale". Please show the scaling properties for typical use and, if they show that the current proposal cannot be deployed over the long term, what changes would be needed. > > Speaking of "scaling", your suggestion makes scaling harder. That >> is, the registration process would have to include a step where >> the registrant chooses some things from a list. 
This is much more >> difficult than the registry saying "here's what you get, you can >> contact us if you don't like it". Well, of course they won't like it, >> but at least the process isn't blocked. > >I would still call that "registrant chooses", even though the registry >is offering a default choice, because the registry still needs to be >able to store lists of visible names for registrants who ask to deviate >from the default. When I said "registry chooses", I meant that the >registrant has zero input, so that the registry could define the choice >algorithmically and omit any capability of storing lists. Right, but you're missing the problem of time-scaling. Any automated process that has a dependent step that includes novice end-users deciding something that they have no expertise in will scale more poorly than one that doesn't have the step in the control steps. --Paul Hoffman, Director --Internet Mail Consortium From owner-idn-reg-policy Thu Apr 3 08:15:00 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h33GExJM025932 for ; Thu, 3 Apr 2003 08:14:59 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h33GExv6025930 for idn-reg-policy-bks; Thu, 3 Apr 2003 08:14:59 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from tux.w3.org (IDENT:root@tux.w3.org [18.29.0.27]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h33GEwJM025921 for ; Thu, 3 Apr 2003 08:14:58 -0800 (PST) Received: from enoshima (IDENT:root@tux.w3.org [18.29.0.27]) by tux.w3.org (8.12.9/8.12.9) with ESMTP id h33GErsl001715; Thu, 3 Apr 2003 11:14:59 -0500 Message-Id: <4.2.0.58.J.20030403102116.05b53c50@localhost> X-Sender: duerst@localhost X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J Date: Thu, 03 Apr 2003 10:23:12 -0500 To: IDN registration policy list , idn-reg-policy@imc.org From: Martin 
Duerst Subject: Re: New Internet Draft on registering IDNs In-Reply-To: <20030403025624.GD8966@nicemice.net> References: <4.2.0.58.J.20030402171329.0294df10@localhost> <4.2.0.58.J.20030402171329.0294df10@localhost> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 02:56 03/04/03 +0000, Adam M. Costello wrote: >Martin Duerst wrote: > > > CNAME or DNAME? > >CNAME and DNAME mean different things. CNAME defines an alias for a >particular name (the alias is a leaf in the tree). DNAME defines an >alias for a domain suffix (the alias effectively creates a mirror of a >whole subtree). Paul's draft describes three different methods to get >www.foo.example.com and www.bar.example.com to refer to the same server: okay, got it. It would be great if this could be made clearer in the document, by numbering the proposals, and mentioning the used features in the intro text (which is currently not done for CNAME). Regards, Martin. 
From owner-idn-reg-policy Thu Apr 3 08:15:09 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h33GF9JM025943 for ; Thu, 3 Apr 2003 08:15:09 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h33GF9FJ025942 for idn-reg-policy-bks; Thu, 3 Apr 2003 08:15:09 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from tux.w3.org (IDENT:root@tux.w3.org [18.29.0.27]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h33GF8JM025936; Thu, 3 Apr 2003 08:15:08 -0800 (PST) Received: from enoshima (IDENT:root@tux.w3.org [18.29.0.27]) by tux.w3.org (8.12.9/8.12.9) with ESMTP id h33GErsn001715; Thu, 3 Apr 2003 11:14:59 -0500 Message-Id: <4.2.0.58.J.20030403102556.02988430@localhost> X-Sender: duerst@localhost X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J Date: Thu, 03 Apr 2003 10:27:15 -0500 To: Roozbeh Pournader From: Martin Duerst Subject: RE: Comparison of hoffman-idn-reg and jseng-idn-admin Cc: Paul Hoffman / IMC , Benny Lipsicas , In-Reply-To: References: <4.2.0.58.J.20030402151053.0321e328@localhost> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 19:01 03/04/03 +0430, Roozbeh Pournader wrote: >On Wed, 2 Apr 2003, Martin Duerst wrote: > > > So a registry registring Hebrew would want to make sure that e.g. > > a kaf in the middle of a label is always U+05DB, but at the end of > > the label is always U+05DA. > >But why? First, I'm sure there is a list of Hebrew words that use the >final forms in non-final places (mostly abbreviations). Second, why would >a registry want to prevent such a (complex) thing automatically in this >very low level of character equivalence tables? 
> >Can't a registry just have a complicated script that after one asks for a >label to register tells: Sorry, you can't register this label, because it >"contains the word sex" or "uses FINAL KAF in a non-final position"? Note that above, I didn't say what means would be used for checking. We could agree that we don't want to do this by tables, but then a document that aims for BCP should mention such cases and clearly say that the tables don't cover everything. Regards, Martin. From owner-idn-reg-policy Thu Apr 3 08:15:00 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h33GExJM025933 for ; Thu, 3 Apr 2003 08:14:59 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h33GExmX025931 for idn-reg-policy-bks; Thu, 3 Apr 2003 08:14:59 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from tux.w3.org (IDENT:root@tux.w3.org [18.29.0.27]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h33GEvJM025920; Thu, 3 Apr 2003 08:14:58 -0800 (PST) Received: from enoshima (IDENT:root@tux.w3.org [18.29.0.27]) by tux.w3.org (8.12.9/8.12.9) with ESMTP id h33GErsj001715; Thu, 3 Apr 2003 11:14:54 -0500 Message-Id: <4.2.0.58.J.20030403085402.02997fe8@localhost> X-Sender: duerst@localhost X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J Date: Thu, 03 Apr 2003 09:06:23 -0500 To: Paul Hoffman / IMC , IDN registration policy list From: Martin Duerst Subject: Re: initial thoughts In-Reply-To: References: <20030331210443.GB15622@nicemice.net> <20030331210443.GB15622@nicemice.net> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 15:35 03/04/02 -0800, Paul Hoffman / IMC wrote: >At 9:04 PM +0000 3/31/03, Adam M. Costello wrote: >> Will it scale? > >Absolutely. 
As long as the zone's name servers can serve the records, >there is no problem. Even if the database is kept on disk instead of in >RAM, the response time on modern computers for even a gigantic zone is low >relative to typical network delays. There are also other mitigating >methods for huge zones, such as partitioning the responses on different >servers at the same address. The VeriSign folks have already done a lot of >research on this for the .com zone. I would want to call for extreme caution here. VeriSign definitely has managed this. But is their experience fully documented? Is it reasonable to assume that an average cctld will be able to manage it? And are the scales the same? The .com zone contains one entry per registration only, and is already huge. Now assume that .com uses some table generating mappings. This will blow up the .com zone, potentially to a size of a completely different magnitude. >> The combinatorics are exponential. >>I think the only way it would work well is if the authoritative DNS >>servers (both the registry's servers that delegate the name, and the >>registrant's servers that provide the data for the name) include support >>for the grouping. I guess you would supply the tables to the DNS >>server, and it would perform the group-id computation whenever a request >>comes in, then look for the group-id in the zone database. This would >>require a standard machine-readable format for describing the grouping >>policy for a zone. > >We disagree here. I don't see that requirement at all. As the managers of >the current large zones have shown, you can have gigantic zone files. Even at 'gigantic', there are still huge potential differences. The earth is gigantic, from a human point of view. Yet the solar system is much more 'gigantic'. And then there is the galaxy, and on top of that the universe. 
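To put numbers on the "exponential" remark, here is a small sketch that computes a bundle's size without enumerating it. It assumes that each character's variants are chosen independently, as in the gen-bundles.py script posted earlier in this thread; the function name `bundle_size` and the variant counts are hypothetical.

```python
from math import prod


def bundle_size(label, variant_counts):
    """Number of names in the bundle of label.

    variant_counts maps a character to how many forms it can take
    (the character itself included); unlisted characters count as 1.
    """
    return prod(variant_counts.get(ch, 1) for ch in label)


# Assumed table: the letter "a" has two interchangeable forms.
counts = {"a": 2}
print(bundle_size("a" * 7, counts))
print(bundle_size("a" * 10, counts))
```

Because the size is a product over positions, every additional character with two forms doubles the bundle, so a zone that publishes whole bundles grows by that factor per registration.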
>>You could also imagine that the registry, rather than the registrant, >>chooses which member(s) will be visible, but I think it would be >>difficult for registries to come up with rules that would please >>everyone; it would probably be easier to let the registrant choose, and >>simply store the list. > >Speaking of "scaling", your suggestion makes scaling harder. That is, the >registration process would have to include a step where the registrant >chooses some things from a list. This is much more difficult than the >registry saying "here's what you get, you can contact us if you don't like >it". Well, of course they won't like it, but at least the process isn't >blocked. Well, in many cases, it would actually be very easy. The registrant inputs the label s/he wants, and that's what is entered, and the rest is blocked. Some kind of confirmation would be needed, of course. Regards, Martin. From owner-idn-reg-policy Thu Apr 3 08:43:54 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h33GhsJM027495 for ; Thu, 3 Apr 2003 08:43:54 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h33Ghsr5027494 for idn-reg-policy-bks; Thu, 3 Apr 2003 08:43:54 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from [66.125.125.92] (adsl-63-202-92-152.dsl.snfc21.pacbell.net [63.202.92.152]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h33GhXJR027440; Thu, 3 Apr 2003 08:43:37 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: In-Reply-To: References: X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). 
The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . Date: Thu, 3 Apr 2003 07:54:01 -0800 To: Roozbeh Pournader From: Paul Hoffman / IMC Subject: Re: Comparison of hoffman-idn-reg and jseng-idn-admin Cc: idn-reg-policy@imc.org Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 6:55 PM +0430 4/3/03, Roozbeh Pournader wrote: >On Tue, 1 Apr 2003, Paul Hoffman / IMC wrote: > >> >2. Clear language about conflict resolution. There needs to be some clear >> >guidelines or recommendations about the times that two registered labels >> >come into an intersection regarding the variant labels associated to them. >> >This will happen with almost any multi-language Arabic-script zone >> >(e.g. U+0649 vs U+064A vs U+06CC). >> >> I am unclear on how this differs from point #1. If any of those three >> characters are supposed to only be represented by one of them in >> names, then the registry-specific early mapping step will take care >> of them. Or is that not what you are referring to? Please be more >> specific. > >I am talking about two main labels that are not variants of each other, >but share a variant. For example, a multilanguage zone that support both >Arabic and Persian will have something like this in its equivalence data >table: > >U+0649|U+06CC >U+064A|U+06CC >U+06CC|U+0649;U+064A > >The first two characters are only used in the Arabic languages, and the >third is only used in Persian. The above data is necessary if you want >to distinguish the first two (that are distinguished in Arabic), but avoid >security problems (U+06CC looks *identical* to U+0649 in isolated and >final contextual forms, and *identical* to U+064A in medial and initial >forms). 
> >Now, when one registers a label with a U+0649 and someone else goes with a >U+064A, who will own the U+06CC version? The first person to register either name. I can add (hopefully) "clear language" on this in the next draft. > > - the merging is a policy decision by the registry at the time of >> table-making as to which language "wins" for the overlapping >> characters > >Or which character wins, or which registrant wins, or which orthography >wins... Not "registrant"; the policy is in place before the registrants start using it. But "character" and "orthography" are certainly part of the decision that the registry that takes this route must factor in. It is really "which policy-maker at the registry wins". That is not a good position to be in. > > - it is impossible to register without knowing the supposed language >> of the registration >> >> I can add more discussion of that, but the third option is not >> "merging", it is forcing the problem on the registrant (who might be >> sly and use it as a way to make the bundle contain things that the >> registry might not have intended). From my reading of the JET >> document, they call the third option "merging" when in fact it is >> just the opposite: it prevents merging by pointing at one table. > >I like to see a standard that limits each zone to a specific *language*, >but I guess that can't happen in situations like a Swiss or European zone. There are many countries outside of Europe that have more than one official language that by law would have to be represented in the country's ccTLD. There are even more countries that have multiple "unofficial" languages that the registries would catch tremendous grief if the people who spoke that language could not register names in the ccTLD. >But also, I like to see a nice mechanism or some guideline for a Swiss >registry: "Grab the Italian, French, and German and merge them >according to the following guidelines."
The guidelines may require a >language priority list or whatever, but it should be usable in a >semi-automatic way. If you (or anyone!) have suggestions on how to do this with real-world examples, please send them along. I cannot see any, but I am hampered by speaking only one language and by being too close to the current proposal. Such examples might be useful in whatever proposal a multi-lingual zone adopts. --Paul Hoffman, Director --Internet Mail Consortium From owner-idn-reg-policy Thu Apr 3 08:44:04 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h33Gi4JM027536 for ; Thu, 3 Apr 2003 08:44:04 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h33Gi4I9027535 for idn-reg-policy-bks; Thu, 3 Apr 2003 08:44:04 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from [66.125.125.92] (adsl-63-202-92-152.dsl.snfc21.pacbell.net [63.202.92.152]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h33GhXJT027440; Thu, 3 Apr 2003 08:43:39 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: In-Reply-To: References: X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . 
Date: Thu, 3 Apr 2003 08:00:51 -0800 To: Roozbeh Pournader , Martin Duerst From: Paul Hoffman / IMC Subject: RE: Comparison of hoffman-idn-reg and jseng-idn-admin Cc: Benny Lipsicas , Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 7:01 PM +0430 4/3/03, Roozbeh Pournader wrote: >Can't a registry just have a complicated script that after one asks for a >label to register tells: Sorry, you can't register this label, because it >"contains the word sex" or "uses FINAL KAF in a non-final position"? This is an important point. Registries do not have to blindly follow any rules, such as the ones in this document or the JET document. They can look at the output bundle and make local policy decisions. For characters that are position-dependent in words, a registry will have to have non-automatic rules. For example, assume that the letter "z" could not be at the end of an English word. "baz.us" should be not allowed, but the .us registry couldn't just check for "is 'z' at the end of the word" because that would allow someone to register "baz123.us". Or assume that "y" can only be at the end of a word. The registry should not have a rule that disallows any name that has "y" in the middle because someone might correctly want to register "floycorp.us". 
--Paul Hoffman, Director --Internet Mail Consortium From owner-idn-reg-policy Thu Apr 3 08:54:54 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h33GssJM028150 for ; Thu, 3 Apr 2003 08:54:54 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h33Gss4U028149 for idn-reg-policy-bks; Thu, 3 Apr 2003 08:54:54 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from sina.sharif.edu (sina.Sharif.AC.IR [194.225.40.9]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h33GshJM028140; Thu, 3 Apr 2003 08:54:47 -0800 (PST) Received: from bamdad.org (IDENT:root@bamdad.org [81.31.160.190]) by sina.sharif.edu (8.11.6/8.11.6) with ESMTP id h33GsgF26083; Thu, 3 Apr 2003 21:24:42 +0430 Received: from localhost (roozbeh@localhost) by bamdad.org (8.11.6/8.11.6) with ESMTP id h33H61418064; Thu, 3 Apr 2003 21:36:01 +0430 X-Authentication-Warning: gilas.bamdad.org: roozbeh owned process doing -bs Date: Thu, 3 Apr 2003 21:36:01 +0430 (IRST) From: Roozbeh Pournader X-X-Sender: roozbeh@gilas.bamdad.org To: Paul Hoffman / IMC cc: idn-reg-policy@imc.org Subject: Re: Comparison of hoffman-idn-reg and jseng-idn-admin In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Virus-Scanned: by amavisd-milter (http://amavis.org/) Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: On Thu, 3 Apr 2003, Paul Hoffman / IMC wrote: > >Now, when one registers a label with a U+0649 and someone else goes with a > >U+064A, who will own the U+06CC version? > > The first person to register either name. I can add (hopefully) > "clear language" on this in the next draft. Well, I'll do it the other way: neither will own it, since it will definitely create security problems. On a second thought, we may need three classes of equivalents: 1. 
Required: Always resolving. Example: different digit forms (U+0030->U+0660). 2. Variant: Optionally resolving. The current variant. 3. Blocking: Reserved forever. Example: Arabic vs Persian Yehs (U+0649->U+06CC). > >Or which character wins, or which registrant wins, or which orthography > >wins... > > Not "registrant"; the policy is in place before the registrants start > using it. Actually, you may have policies like "the first registrant wins". > If you (or anyone!) have suggestions on how to do this with > real-world examples, please send them along. I cannot see any, but I > am hampered by speaking only one language and by being too close to > the current proposal. Such examples might be useful in whatever > proposal a multi-lingual zone adopts. 'jseng-idn-admin' has such an algorithm. But let's see if someone comes with a better or clearer suggestion. roozbeh From owner-idn-reg-policy Thu Apr 3 09:16:06 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h33HG6JM029015 for ; Thu, 3 Apr 2003 09:16:06 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h33HG6Uq029014 for idn-reg-policy-bks; Thu, 3 Apr 2003 09:16:06 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from tux.w3.org (IDENT:root@tux.w3.org [18.29.0.27]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h33HG5JM029008; Thu, 3 Apr 2003 09:16:05 -0800 (PST) Received: from enoshima (IDENT:root@tux.w3.org [18.29.0.27]) by tux.w3.org (8.12.9/8.12.9) with ESMTP id h33HG6sj024043; Thu, 3 Apr 2003 12:16:06 -0500 Message-Id: <4.2.0.58.J.20030403120442.03d16270@localhost> X-Sender: duerst@localhost X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J Date: Thu, 03 Apr 2003 12:15:34 -0500 To: Paul Hoffman / IMC , IDN registration policy list From: Martin Duerst Subject: Re: initial thoughts In-Reply-To: References: 
<20030403014818.GC8966@nicemice.net> <20030331210443.GB15622@nicemice.net> <20030403014818.GC8966@nicemice.net> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 19:15 03/04/02 -0800, Paul Hoffman / IMC wrote: >At 1:48 AM +0000 4/3/03, Adam M. Costello wrote: >>Couldn't you simply say that when more than one key matches, use the >>longest match? It would be trickier to implement, but that's exactly >>how routing tables work. It seems pretty easy to describe. Whether it >>would be worth the additional complexity, I don't know. > >How do others feel about this? I think first we need clarification from you about whether you intended, in your approach, that (variants in) bundles can overlap. There is no indication in your draft that they can't, but on the other hand, there is no indication that you were aware of the fact that they could. If you intended them to overlap, then your and Adam's approaches are technically different, and we probably would have to discuss these actual differences first. If you didn't intend them to overlap, then you have to add the necessary restrictions to your draft. I think I agree with Adam that his approach then is easier to describe, because it can describe how to map one label to one other label, which should make the description a lot simpler. But of course I would like to first see an actual wording from Adam. Your description of how to generate a bundle could probably also be cleaned up (see my comments in a previous mail), but may still be more complicated. >> > > Will it scale? >> > >> > Absolutely. >> >>Suppose JPNIC decides that hiragana and katakana should block each >>other. > >Er, in order to say that something doesn't scale, you need to use >real-world examples. JPNIC has already stated that they have no intention >to do this.
> >It is easy to come up with extreme examples for *any* protocol, then say >"if you do this a zillion times it doesn't scale". Please show the scaling >properties for typical use and, if they show that the current proposal >cannot be deployed over the long term, what changes would be needed. Stephane already has shown a quite reasonable example, and applied it to an actual zone, with clear results. And in general, for scalability, the burden of proof is on the people who claim that things scale. Just saying that things will scale because nobody has proved that they don't is not a good idea. >>I would still call that "registrant chooses", even though the registry >>is offering a default choice, because the registry still needs to be >>able to store lists of visible names for registrants who ask to deviate >>from the default. When I said "registry chooses", I meant that the >>registrant has zero input, so that the registry could define the choice >>algorithmically and omit any capability of storing lists. > >Right, but you're missing the problem of time-scaling. Any automated >process that has a dependent step that includes novice end-users deciding >something that they have no expertise in will scale more poorly than one >that doesn't have the step in the control steps. If the end users have to make a choice, then timewise, this scales because they can do that in parallel. If somebody at the registry has to intervene, it definitely doesn't scale. Also, we should assume that the end users at least know the language they are registering in. Regards, Martin.
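The longest-match rule that Adam suggests in the text quoted at the top of this message can be sketched roughly as follows. This is only an illustration, not wording from either draft: the dict representation of the table, the sample keys, and the name longest_match_variants are all assumptions made for the example.

```python
def longest_match_variants(label, table):
    """Walk the label left to right. At each position, use the longest
    table key that matches there, the way a routing table prefers the
    longest prefix. Returns a list of (matched_key, variants) pairs."""
    max_len = max((len(k) for k in table), default=1)
    out = []
    i = 0
    while i < len(label):
        # Try the longest candidate first, then shorter ones.
        for n in range(min(max_len, len(label) - i), 0, -1):
            key = label[i:i + n]
            if key in table:
                out.append((key, table[key]))
                i += n
                break
        else:
            # No key matches here: the character has no variants.
            out.append((label[i], []))
            i += 1
    return out


# Hypothetical table in which both "ue" and a bare "u" are keys.
TABLE = {"ue": ["\u00fc"], "u": ["\u00fa"]}
print(longest_match_variants("uebu", TABLE))
```

In this sketch the leading "ue" is consumed as one key, so the shorter "u" key is never consulted at that position. That is the whole of the extra complexity Adam mentions.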
From owner-idn-reg-policy Thu Apr 3 10:47:53 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h33IlqJM003334 for ; Thu, 3 Apr 2003 10:47:52 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h33Ilq9u003330 for idn-reg-policy-bks; Thu, 3 Apr 2003 10:47:52 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from tux.w3.org (IDENT:root@tux.w3.org [18.29.0.27]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h33IlpJM003321 for ; Thu, 3 Apr 2003 10:47:51 -0800 (PST) Received: from enoshima (IDENT:root@tux.w3.org [18.29.0.27]) by tux.w3.org (8.12.9/8.12.9) with ESMTP id h33Ilmsl019062; Thu, 3 Apr 2003 13:47:53 -0500 Message-Id: <4.2.0.58.J.20030403122243.05467b90@localhost> X-Sender: duerst@localhost X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J Date: Thu, 03 Apr 2003 13:33:51 -0500 To: Stephane Bortzmeyer From: Martin Duerst Subject: Re: Format of the tables (U+xxxx, UTF-8, etc) Was: Comparison of hoffman-idn-reg and jseng-idn-admin Cc: IDN registration policy list In-Reply-To: <20030403073536.GA6569@nic.fr> References: <4.2.0.58.J.20030402145707.03cc9c18@localhost> <4.2.0.58.J.20030402145707.03cc9c18@localhost> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Hello Stephane, Thanks for making the subject more explicit. At 09:35 03/04/03 +0200, Stephane Bortzmeyer wrote: >On Wed, Apr 02, 2003 at 03:00:31PM -0500, > Martin Duerst wrote > a message of 16 lines which said: > > > >4. Better syntax for the table. Don't you agree that a U+ABCDU+BCDAU+CDAB > > >syntax is unreadable? Why can't one use a space? > > > > What about using Unicode (UTF-8) directly? > > What about defining an XML format for the tables? 
> > This would allow to publish tables in ASCII-only contexts > > but also easily view them in a browser with a simple stylesheet. > >Good idea but there is already a solution to do so. I suggest that >Paul rewrites his draft according to RFC 2629, which is widely >implemented and already has stylesheets. RFC 2629 is an XML solution for writing RFCs and Internet Drafts. It is not an XML-based format for the mapping/blocking tables. Also, RFC 2629 has virtually no support for characters outside US-ASCII (some such support was discussed, but hasn't been implemented yet). An xml-based format for the mapping/blocking tables is something completely different. Regards, Martin. From owner-idn-reg-policy Thu Apr 3 10:47:53 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h33IlrJM003337 for ; Thu, 3 Apr 2003 10:47:53 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h33Ilrnx003335 for idn-reg-policy-bks; Thu, 3 Apr 2003 10:47:53 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from tux.w3.org (IDENT:root@tux.w3.org [18.29.0.27]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h33IlpJM003323; Thu, 3 Apr 2003 10:47:51 -0800 (PST) Received: from enoshima (IDENT:root@tux.w3.org [18.29.0.27]) by tux.w3.org (8.12.9/8.12.9) with ESMTP id h33Ilmsn019062; Thu, 3 Apr 2003 13:47:53 -0500 Message-Id: <4.2.0.58.J.20030403133539.03c8f7b8@localhost> X-Sender: duerst@localhost X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J Date: Thu, 03 Apr 2003 13:47:36 -0500 To: Paul Hoffman / IMC , Roozbeh Pournader From: Martin Duerst Subject: Re: Comparison of hoffman-idn-reg and jseng-idn-admin Cc: idn-reg-policy@imc.org In-Reply-To: References: Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: 
List-Unsubscribe: List-ID: At 07:54 03/04/03 -0800, Paul Hoffman / IMC wrote: >At 6:55 PM +0430 4/3/03, Roozbeh Pournader wrote: >>U+0649|U+06CC >>U+064A|U+06CC >>U+06CC|U+0649;U+064A >> >>The first two characters are only used in the Arabic languages, and the >>third is only used in Persian. The above data is necessary if you want >>to distinguish the first two (that are distinguished in Arabic), but avoid >>security problems (U+06CC looks *identical* to U+0649 in isolated and >>final contextual forms, and *identical* to U+064A in medial and initial >>forms). Small aside: Would it be possible to create rules (not necessarily checked by a table approach) that are less restrictive by taking into account the position of the character with respect to other characters? (For non-experts of Arabic: this is not the position in the label, as there are some characters that don't connect to one side, and therefore make e.g. the following character take an initial or isolated form even if it's in the middle or at the end of a word.) I.e. we would have something like: For final and isolated: U+0649|U+06CC U+064A U+06CC|U+0649 In medial and initial form: U+0649 U+064A|U+06CC U+06CC|U+064A >>Now, when one registers a label with a U+0649 and someone else goes with a >>U+064A, who will own the U+06CC version? > >The first person to register either name. I can add (hopefully) "clear >language" on this in the next draft. Okay, this clears things up quite a bit. You definitely have to make that much clearer in your draft. This also makes it understandable that you require all the blockings to be put in a database, because otherwise, it would be difficult to reconstruct them (the exact sequence of registrations would be needed for reconstruction). But I think such a sequence-dependent structure is difficult to understand and treat mathematically. That's maybe why Adam has problems with it. It would be much better if we had a structure that didn't depend on sequence.
Something along the lines of 'blockings can overlap (and don't belong to anybody), but mappings cannot overlap'. Regards, Martin. From owner-idn-reg-policy Thu Apr 3 10:47:58 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h33IlwJM003345 for ; Thu, 3 Apr 2003 10:47:58 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h33IlwZ1003344 for idn-reg-policy-bks; Thu, 3 Apr 2003 10:47:58 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from tux.w3.org (IDENT:root@tux.w3.org [18.29.0.27]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h33IlpJM003320 for ; Thu, 3 Apr 2003 10:47:57 -0800 (PST) Received: from enoshima (IDENT:root@tux.w3.org [18.29.0.27]) by tux.w3.org (8.12.9/8.12.9) with ESMTP id h33Ilmsj019062; Thu, 3 Apr 2003 13:47:53 -0500 Message-Id: <4.2.0.58.J.20030403122117.053479a8@localhost> X-Sender: duerst@localhost X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J Date: Thu, 03 Apr 2003 12:22:22 -0500 To: Stephane Bortzmeyer From: Martin Duerst Subject: Re: Comparison of hoffman-idn-reg and jseng-idn-admin Cc: IDN registration policy list In-Reply-To: <20030403082051.GA6764@nic.fr> References: <4.2.0.58.J.20030402150335.035b1fc8@localhost> <4.2.0.58.J.20030402150335.035b1fc8@localhost> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 10:20 03/04/03 +0200, Stephane Bortzmeyer wrote: >On Wed, Apr 02, 2003 at 03:07:03PM -0500, > Martin Duerst wrote > > What may work is that an accented character blocks the base > > character, but not characters with a different accent. > >Interesting. We could also draw inspiration from most Web search >engines. 
They work that way: If there is no composed character in the >query, they search "accent-insensitive". If there is at least one, >they switch to "accent-sensitive". Accent-sensitive is easy to do. Accent-insensitive would require multiple results, which DNS doesn't do. Regards, Martin. From owner-idn-reg-policy Thu Apr 3 17:11:59 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h341BxJM029173 for ; Thu, 3 Apr 2003 17:11:59 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h341BxEm029172 for idn-reg-policy-bks; Thu, 3 Apr 2003 17:11:59 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from [66.125.125.92] (adsl-63-202-92-152.dsl.snfc21.pacbell.net [63.202.92.152]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h341BtJN029167; Thu, 3 Apr 2003 17:11:56 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: In-Reply-To: <4.2.0.58.J.20030402171329.0294df10@localhost> References: <4.2.0.58.J.20030402171329.0294df10@localhost> X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . Date: Thu, 3 Apr 2003 17:11:50 -0800 To: Martin Duerst , idn-reg-policy@imc.org From: Paul Hoffman / IMC Subject: Re: New Internet Draft on registering IDNs Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Many thanks for the detailed editorial comments! 
I haven't listed them here but went ahead and marked up the document. A few left-overs: At 6:35 PM -0500 4/2/03, Martin Duerst wrote: >I often see 'JET document' in this discussion. Is this >draft-jseng-idn-admin, or something else? That is the one. >I have been thinking about 'Framework' quite a bit. Is this draft >a framework? It seems to be a definition of a table format with >some associated algorithm(s), and a recommendation to use this >table format/algorithms. And the recommendation isn't very clear: >Should all registries use this table format/algorithm? Or is it >just one of potentially many formats/algorithms, and registries >can choose? Sorry, I don't see how your questions apply to the word "framework". Could you suggest alternative wording? >>This document discusses characters that have equivalent or >>near-equivalent characters or strings. The "base character" is the >>character that has one or more equivalents; the "variant(s)" are the >>character(s) and/or string(s) that are equivalent to the base character. > >'base character' is used in the context of combining characters, >it would be better to find another term. It seems pretty clear in the context of this document. Did anyone else have a problem with this term? If so, alternative proposals are appreciated. >I think the question of whether ASCII is allowed or not is a very >special one, which should be considered separately. There may >be cases where ASCII is already allowed; there is a good argument >for always allowing ASCII, in particular if the higher-level >domains are all ASCII; there is a good argument to not allow >ASCII if the higher-level domains are all non-ASCII. >These arguments are rather different from arguments about >allowing a few more or less characters. I don't see why ASCII should be special. A registry decides which characters are allowed in the IDN labels, regardless of the scripts. Why does it seem special?
>what about tables that don't have the same base character twice, >but may map to the same character? E.g.: > >U+00E8|U+0065 >U+00E9|U+0065 > >Or where a base character also appears as a variant > >U+00E8|U+0065 >U+0065 > >Or where a base character appears as part of a variant: > >U+00FC|U+0075U+0065 >U+0075 >U+0065 I'm not sure what you mean by "what about" here. All of what you list is just fine. There is nothing anywhere that prohibits those possibilities. Are you proposing that I add examples with all these in them? >>If the owner of "example.com" used a DNAME > >CNAME or DNAME? DNAME. Using CNAMEs here would clearly be broken because allocated labels are not terminal. --Paul Hoffman, Director --Internet Mail Consortium From owner-idn-reg-policy Thu Apr 3 17:41:36 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h341faJM001621 for ; Thu, 3 Apr 2003 17:41:36 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h341fZt5001618 for idn-reg-policy-bks; Thu, 3 Apr 2003 17:41:35 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from [66.125.125.92] (adsl-63-202-92-152.dsl.snfc21.pacbell.net [63.202.92.152]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h341fVJN001607; Thu, 3 Apr 2003 17:41:31 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: In-Reply-To: <20030403082051.GA6764@nic.fr> References: <4.2.0.58.J.20030402150335.035b1fc8@localhost> <20030403082051.GA6764@nic.fr> X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm).
The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . Date: Thu, 3 Apr 2003 17:16:52 -0800 To: Stephane Bortzmeyer , Martin Duerst From: Paul Hoffman / IMC Subject: Re: Comparison of hoffman-idn-reg and jseng-idn-admin Cc: IDN registration policy list Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 10:20 AM +0200 4/3/03, Stephane Bortzmeyer wrote: >It doesn't scale if you want to actually generate the bundle and >publish them in a static zone file. I tried for the '.fr' zone which >is quite small - 150,000 domains - and the resulting zone file was >larger than '.com' even before the domains starting with the letter A >were fully processed. But you have other approaches: Exactly. Saying "it doesn't scale" in the IETF means something very different than "I can't think of how to make this work". 
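For a rough feel of the zone-file growth Stephane describes in the quoted text, here is a back-of-the-envelope sketch. It assumes, and this is an assumption rather than something stated in this message, that a bundle contains every combination of each character with its variants, so the bundle size is the product of (1 + number of variants) over the characters of the label. The variant table and the sample labels are hypothetical.

```python
from math import prod

# Hypothetical variant table: each base character maps to its variants.
VARIANTS = {
    "a": ["\u00e0", "\u00e2"],
    "e": ["\u00e8", "\u00e9", "\u00ea", "\u00eb"],
    "u": ["\u00f9", "\u00fb", "\u00fc"],
}


def bundle_size(label, variants=VARIANTS):
    """Number of labels in the bundle if every combination is generated:
    each character contributes itself plus each of its variants."""
    return prod(1 + len(variants.get(ch, [])) for ch in label)


for name in ("eau", "beaute", "evenement"):
    print(name, bundle_size(name))
```

Under these assumed tables a three-letter label already yields 60 names and a nine-letter label with four "e"s yields 625, which is the multiplicative growth that makes a fully expanded static zone file so large.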
--Paul Hoffman, Director --Internet Mail Consortium From owner-idn-reg-policy Thu Apr 3 17:41:45 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h341fiJM001648 for ; Thu, 3 Apr 2003 17:41:44 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h341fi4N001646 for idn-reg-policy-bks; Thu, 3 Apr 2003 17:41:44 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from [66.125.125.92] (adsl-63-202-92-152.dsl.snfc21.pacbell.net [63.202.92.152]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h341fVJT001607; Thu, 3 Apr 2003 17:41:34 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: In-Reply-To: <4.2.0.58.J.20030403133539.03c8f7b8@localhost> References: <4.2.0.58.J.20030403133539.03c8f7b8@localhost> X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . Date: Thu, 3 Apr 2003 17:39:07 -0800 To: Martin Duerst , Roozbeh Pournader From: Paul Hoffman / IMC Subject: Re: Comparison of hoffman-idn-reg and jseng-idn-admin Cc: idn-reg-policy@imc.org Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 1:47 PM -0500 4/3/03, Martin Duerst wrote: >But I think such a sequence-dependent structure is difficult >to understand and treat mathematically. That's maybe why >Adam has problems with it. It would be much better if we >had a structure that didn't depend on sequence. 
Something >along the lines of 'blockings can overlap (and don't belong >to anybody), but mappings cannot overlap'. Blockings *have* to belong to someone so that the registry knows when they are unblocked. Registrations don't last forever. From the real examples that people have given in the JET document and the examples on this list, you can't generate a list of blocked names just by looking at the allocated names: you have to look at the registered names. --Paul Hoffman, Director --Internet Mail Consortium From owner-idn-reg-policy Thu Apr 3 17:41:37 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h341fbJM001629 for ; Thu, 3 Apr 2003 17:41:37 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h341fagS001626 for idn-reg-policy-bks; Thu, 3 Apr 2003 17:41:36 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from [66.125.125.92] (adsl-63-202-92-152.dsl.snfc21.pacbell.net [63.202.92.152]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h341fVJP001607; Thu, 3 Apr 2003 17:41:32 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: In-Reply-To: <20030403102848.GA7626@nic.fr> References: <4.2.0.58.J.20030402171329.0294df10@localhost> <20030403102848.GA7626@nic.fr> X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . 
Date: Thu, 3 Apr 2003 17:18:14 -0800 To: Stephane Bortzmeyer , Martin Duerst From: Paul Hoffman / IMC Subject: Re: Equivalence only in one direction (Was: New Internet Draft on registering IDNs Cc: idn-reg-policy@imc.org Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 12:28 PM +0200 4/3/03, Stephane Bortzmeyer wrote: >This is a special case of a more general issue, which is not clearly >stated in the current draft: equivalence goes from the Left Hand Side >to the Right Hand Side but not the reverse. Correct. I'll make a point of that in the next draft. --Paul Hoffman, Director --Internet Mail Consortium From owner-idn-reg-policy Thu Apr 3 17:41:37 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h341faJM001628 for ; Thu, 3 Apr 2003 17:41:36 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h341faV8001625 for idn-reg-policy-bks; Thu, 3 Apr 2003 17:41:36 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from [66.125.125.92] (adsl-63-202-92-152.dsl.snfc21.pacbell.net [63.202.92.152]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h341fVJR001607; Thu, 3 Apr 2003 17:41:33 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: In-Reply-To: <4.2.0.58.J.20030403120442.03d16270@localhost> References: <20030403014818.GC8966@nicemice.net> <20030331210443.GB15622@nicemice.net> <20030403014818.GC8966@nicemice.net> <4.2.0.58.J.20030403120442.03d16270@localhost> X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). 
The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . Date: Thu, 3 Apr 2003 17:41:32 -0800 To: Martin Duerst , IDN registration policy list From: Paul Hoffman / IMC Subject: Re: initial thoughts Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 12:15 PM -0500 4/3/03, Martin Duerst wrote: >At 19:15 03/04/02 -0800, Paul Hoffman / IMC wrote: > >>At 1:48 AM +0000 4/3/03, Adam M. Costello wrote: >>>Couldn't you simply say that when more than one key matches, use the >>>longest match? It would be trickier to implement, but that's exactly >>>how routing tables work. It seems pretty easy to describe. Whether it >>>would be worth the additional complexity, I don't know. >> >>How do others feel about this? > >I think first we need clarification from you about whether >you intended, in your approach, that (variants in) bundles >can overlap. There is no indication in your draft that they >can't, but on the other hand, there is no indication that you >were aware of the fact that they could. I will add text saying that they cannot overlap, and add steps in the process to check for that. 
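As a rough illustration of the "use the longest match" idea quoted above: the following Python sketch picks, among all table keys that match at a given position in a label, the longest one. This is not taken from the draft. The table contents and the name `longest_match` are hypothetical, chosen only to show two overlapping keys.

```python
def longest_match(table, label, pos=0):
    """Return (key, value) for the longest key in `table` that matches
    `label` starting at index `pos`, or None if no key matches."""
    best = None
    for key, value in table.items():
        # Skip empty keys; they would match everywhere and consume nothing.
        if key and label.startswith(key, pos):
            if best is None or len(key) > len(best[0]):
                best = (key, value)
    return best


# Hypothetical table in which the keys "u" and "ue" overlap.
TABLE = {"u": "u", "ue": "\u00fc", "e": "e"}

# Both "u" and "ue" match at the start; the longer key "ue" is chosen.
print(longest_match(TABLE, "uebung"))
```

A linear scan over the table is enough for a sketch; a real implementation with large tables would more likely use a trie, which is how routing-table lookups are usually done.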
--Paul Hoffman, Director --Internet Mail Consortium From owner-idn-reg-policy Thu Apr 3 18:40:36 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h342eaJM004717 for ; Thu, 3 Apr 2003 18:40:36 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h342eaCe004715 for idn-reg-policy-bks; Thu, 3 Apr 2003 18:40:36 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from nicemice.net (arwen.CS.Berkeley.EDU [128.32.132.165]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h342eZJM004709 for ; Thu, 3 Apr 2003 18:40:35 -0800 (PST) Received: from amc by nicemice.net with local (Exim 3.35 #1 (Debian)) id 191H8B-0007HJ-00 for ; Thu, 03 Apr 2003 18:40:39 -0800 Date: Fri, 4 Apr 2003 02:40:39 +0000 From: "Adam M. Costello" To: idn-reg-policy@imc.org Subject: Re: Equivalence only in one direction Message-ID: <20030404024039.GE24059@nicemice.net> Reply-To: IDN registration policy list References: <4.2.0.58.J.20030403133539.03c8f7b8@localhost> <20030402022055.GB30135@nicemice.net> <4.2.0.58.J.20030402171329.0294df10@localhost> <20030403102848.GA7626@nic.fr> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4.2.0.58.J.20030403120442.03d16270@localhost> <4.2.0.58.J.20030403133539.03c8f7b8@localhost> <20030403102848.GA7626@nic.fr> User-Agent: Mutt/1.4i Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: This message responds to three people (Stephane, Roozbeh, and Martin) and three threads, but the subject line reflects the topic of the entire message. Stephane Bortzmeyer wrote: > This is a special case of a more general issue, which is not clearly > stated in the current draft: equivalence goes from the Left Hand Side > to the Right Hand Side but not the reverse. 
Roozbeh Pournader wrote:

> I was talking about variants that are shared between two different labels
> (that are not variants of each other), which *will* happen in zones, since
> we can't define the equivalence tables to be 'equivalence classes' (in
> mathematical terms).

Now that we're finally starting to understand each other, I'd like to request that everyone please stop using the word "equivalence" to refer to relations that are not equivalence relations. It's confusing. I jumped to the conclusion that my approach and Paul's approach were essentially the same, merely described differently, because Paul's draft uses the word "equivalent" for the variant relation. Now Martin and I are starting to see that the two approaches may be fundamentally different after all.

An equivalence relation is a relation that is reflexive (X is equivalent to itself), symmetric (if X is equivalent to Y, then Y is equivalent to X) and transitive (if X is equivalent to Y, and Y is equivalent to Z, then X is equivalent to Z). In the quotations above, Stephane wants to talk about relations that are not symmetric. Roozbeh wants to talk about relations that are not transitive. That's fine, but we all (myself included) need to state our assumptions and choose our terms carefully.

Roozbeh Pournader wrote:

> I am talking about two main labels that are not variants of each
> other, but share a variant.
>
> Now, when one registers a label with a U+0649 and someone else goes
> with a U+064A, who will own the U+06CC version?

Very good question, illustrating how much more complex the situation becomes when the relations are not equivalence relations. More generally, let's consider various orders of registrations. Let's call these three labels X, Y, and V, where V is the common variant of X and Y, and X and Y are not variants of each other.

scenario 1: User1 asks for V. User2 asks for X. User1 asks for X.
scenario 2: User1 asks for X. User1 asks for V. User2 asks for Y.
scenario 3 (same as 2 but reordered): User1 asks for X. User2 asks for Y. User1 asks for V.

It's not obvious who should get what in each scenario.

> 1. Required: Always resolving. Example: different digit forms
> (U+0030->U+0660).
>
> 2. Variant: Optionally resolving. The current variant.
>
> 3. Blocking: Reserved forever. Example: Arabic vs Persian Yehs
> (U+0649->U+06CC).

I think I'm starting to understand your vision. Let me try to explain it back to you to check.

The name(s) that the registrant requests is (are) the center(s) of three nested regions of the namespace. The innermost region contains all the labels that are automatically active and delegated to the registrant. The next larger region contains all the labels that the registrant controls (including both the automatically active labels and the labels that the registrant may activate at any time with no further collision checking). The largest region contains all the names that cannot be delegated to any other registrant (some of them cannot be delegated to anyone at all).

Presumably it would be okay for the outermost region of one registrant to overlap the outermost region of another registrant. But it would not be okay for the outermost region of one registrant to overlap the middle region of another registrant.

I have not begun to think about how these regions might be formally defined.

Martin Duerst wrote:

> I think I agree with Adam that his approach then is easier to
> describe, because it can describe how to map one label to one other
> label, which should make the description a lot simpler. But of course
> I would like to first see an actual wording from Adam.

Here's one way the groupID function might work:

1. Apply ToUnicode to the label.
2. Apply Nameprep to the label.
3. Initialize an empty string buffer.
4. While the label is not empty, apply a two-column mapping table (containing strings in both columns) as follows:
a.
Find the longest prefix match between the label and the first column of the table. b. If there was no match then remove the first character of the label and append it to the buffer, otherwise remove the matching prefix from the label and append the corresponding string (from the second column of the table) to the buffer. 5. Apply ToASCII to the buffer. 6. Apply tolower() to the buffer and return the result. This serves only to define an equivalence relation. If we decide we need some other sort of relation, this sort of function might not be useful at all. AMC From owner-idn-reg-policy Thu Apr 3 18:42:07 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h342g7JM004946 for ; Thu, 3 Apr 2003 18:42:07 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h342g7mj004945 for idn-reg-policy-bks; Thu, 3 Apr 2003 18:42:07 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from nicemice.net (arwen.CS.Berkeley.EDU [128.32.132.165]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h342g6JM004941 for ; Thu, 3 Apr 2003 18:42:06 -0800 (PST) Received: from amc by nicemice.net with local (Exim 3.35 #1 (Debian)) id 191H9e-0007IJ-00 for ; Thu, 03 Apr 2003 18:42:10 -0800 Date: Fri, 4 Apr 2003 02:42:10 +0000 From: "Adam M. Costello" To: IDN registration policy list Subject: Re: initial thoughts Message-ID: <20030404024210.GF24059@nicemice.net> Reply-To: IDN registration policy list References: <20030331210443.GB15622@nicemice.net> <20030403014818.GC8966@nicemice.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4i Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Paul Hoffman / IMC wrote: > > Suppose JPNIC decides that hiragana and katakana should block each > > other. 
> > Er, in order to say that something doesn't scale, you need to use > real-world examples. JPNIC has already stated that they have no > intention to do this. Maybe it won't happen for Japanese, but there are other opportunities for it to happen. Serbo-Croatian can be written in either of two scripts (Cyrillic and Latin), and there are several Indian scripts that use identical names for most of their letters, and I've heard of Indian languages that can use more than one script interchangeably (sorry I don't have more details). Are you sure you want an architecture that dares to flirt with exponential explosions? AMC From owner-idn-reg-policy Fri Apr 4 00:16:57 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h348GvJM028970 for ; Fri, 4 Apr 2003 00:16:57 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h348Gvvt028969 for idn-reg-policy-bks; Fri, 4 Apr 2003 00:16:57 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from nicemice.net (arwen.CS.Berkeley.EDU [128.32.132.165]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h348GuJM028964 for ; Fri, 4 Apr 2003 00:16:56 -0800 (PST) Received: from amc by nicemice.net with local (Exim 3.35 #1 (Debian)) id 191MNc-00080G-00 for ; Fri, 04 Apr 2003 00:16:56 -0800 Date: Fri, 4 Apr 2003 08:16:56 +0000 From: "Adam M. 
Costello" To: IDN registration policy list Subject: model with overlapping variants Message-ID: <20030404081656.GA29886@nicemice.net> Reply-To: IDN registration policy list Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4i Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Let's suppose for a moment that it turns out to be essential to use a relation that is not transitive, as Roozbeh has suggested, and as (I now think) Paul would agree. That is, even if labels X and Y are unrelated, there might exist a label V that is related to both. But let's assume the relation can be symmetric; that is, there is no need to distinguish between "X is related to Y" versus "Y is related to X" versus "X and Y are related". Someone tell me if you think that's an unreasonable assumption. For the moment I'll call the relation "confusability". Given any two labels (in no particular order), they are either confusable or not, and it is possible to compute that boolean value. The computation might enumerate all the labels that could be confused with one of the inputs and check whether the other input is among them, or it might be possible to be more efficient by using a clever algorithm. We'll worry about that later. In this model, there are no atomic groups of labels. In a different model that used an equivalence relation, we could be sure that every group is either entirely registered or entirely available. But with this non-transitive relation, it's possible the set of labels confusable with a submitted label is partially registered and partially available. What properties ought a registration policy to have in this model? 1. Two labels in the zone belonging to different registrants must not be confusable. 
Corollary: Two labels in the zone belonging to different registration bundles must not be confusable, even if they belong to the same registrant (because the registrant could sell them to different buyers). 2. Bundles must not tie together unrelated things in the zone. (That would cheat the registry out of fees for what should be separate bundles.) There are at least three ways to define the relatedness property that a bundle must satisfy: a. loosely related: For every pair of labels X and Y in the zone that are in the same bundle, there must exist a sequence X, Z1, Z2, Z3, ..., Zk, Y such that every pair of adjacent labels in the sequence is confusable. In other words, X and Y must be related by the transitive closure of the confusability relation. b. tightly related: Every pair of labels X and Y in the zone that are in the same bundle must be confusable. In other words, the labels in the zone in a particular bundle form a clique. c. radially related: Among the labels in the zone that belong to a particular bundle, one is designated as the center, and all the others are confusable with it. I think properties 1 and 2(a|b|c) are all we need. Am I overlooking anything? Notice that the properties speak only of labels in the zone. They place no constraints on labels that are not in the zone. Nobody really cares what sort of behind-the-scenes bookkeeping the registry might be doing with the labels that are not in the zone, as long as the labels in the zone are well behaved. 2a looks difficult to enforce, and might lead to bundles that are too large, so let's put that one aside for now. 2b and 2c both look reasonable to me. How could a registry enforce those properties? Here's one general approach: Each bundle contains a set of related labels, all of which are in the zone. Bundles do not contain "blocked" labels. For 2c, one of the labels is flagged as the primary label and cannot be removed from the bundle. 
When a registrant asks to create a new bundle, containing a single initial label, the request is denied if the label is confusable with any label already in the zone. When a registrant asks to add a label to their existing bundle, the request is denied if the label is confusable with any label in any other bundle. Also, the request is denied if the label is not confusable with: every label in the bundle (2b) / the primary label (2c). When a registrant asks to remove a label from their existing bundle, the request is denied if it is the only(2b)/primary(2c) label in the bundle. A request by a registrant to remove one of their entire bundles is never denied. We do not need to specify whether a registry keeps records regarding labels that are not in the zone. It might do so, as a precomputation to help with the required checks, or there might be clever pattern-matching algorithms and clever indexing data structures that allow the registry to perform the required checks without extra storage. If this model looks promising, the big question is whether there is a confusability relation (parameterized by tables) that is both expressive enough to be useful and tractable enough to permit a feasible implementation of the model. With the sort of tables that have been proposed so far, the checks to be performed are effectively regular expression matches on the zone file or on bundles. And not arbitrary regular expressions, but a restricted class of regular expressions built from rows of the mapping table. It wouldn't surprise me if there were tricks that could be played, but I have no expertise in this area. 
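The enforcement approach described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a proposal for how a registry should be built. The class name `Registry`, its method names, and the toy `toy_confusable` predicate (which models the X, Y, V example, where V is confusable with both X and Y) are all hypothetical. The confusability relation is assumed to be symmetric and is passed in as a plain function; how to compute it from tables is left open, as in the text.

```python
class Registry:
    """Toy registry enforcing property 1 plus either 2b (clique) or 2c (radial)."""

    def __init__(self, confusable, mode="2c"):
        self.confusable = confusable  # symmetric predicate on two labels
        self.mode = mode              # "2b" or "2c"
        self.bundles = {}             # bundle id -> list of labels, primary first
        self._next_id = 0

    def _clashes(self, label, skip=None):
        # True if `label` equals or is confusable with a label in any
        # bundle other than `skip`.
        return any(
            label == other or self.confusable(label, other)
            for bid, labels in self.bundles.items() if bid != skip
            for other in labels
        )

    def create_bundle(self, label):
        # Denied if the label is confusable with any label already in the zone.
        if self._clashes(label):
            return None
        bid = self._next_id
        self._next_id += 1
        self.bundles[bid] = [label]
        return bid

    def add_label(self, bid, label):
        labels = self.bundles[bid]
        # Denied if confusable with any label in any other bundle.
        if label in labels or self._clashes(label, skip=bid):
            return False
        # 2b: must be confusable with every label; 2c: with the primary only.
        required = labels if self.mode == "2b" else labels[:1]
        if not all(self.confusable(label, other) for other in required):
            return False
        labels.append(label)
        return True

    def remove_label(self, bid, label):
        labels = self.bundles[bid]
        if label not in labels:
            return False
        if self.mode == "2b" and len(labels) == 1:
            return False  # the only label cannot be removed (2b)
        if self.mode == "2c" and label == labels[0]:
            return False  # the primary label cannot be removed (2c)
        labels.remove(label)
        return True

    def remove_bundle(self, bid):
        # Removing an entire bundle is never denied.
        del self.bundles[bid]


# Hypothetical relation: "v" is confusable with "x" and with "y",
# but "x" and "y" are not confusable with each other.
PAIRS = {frozenset(("x", "v")), frozenset(("y", "v"))}


def toy_confusable(a, b):
    return frozenset((a, b)) in PAIRS


# Scenario 3 from earlier in the thread: X, then Y, then a request for V.
reg = Registry(toy_confusable, mode="2c")
bundle_x = reg.create_bundle("x")
bundle_y = reg.create_bundle("y")
print(reg.add_label(bundle_x, "v"))  # denied: "v" clashes with "y" in the other bundle
```

Under these rules the outcome depends on request order: once both X and Y are in the zone, nobody can obtain V, whereas if User1 adds V before User2 asks for Y, the request for Y is the one that is denied. The sketch checks every label in the zone on each request, which is the brute-force end of the spectrum; the indexing tricks mentioned above would replace `_clashes`.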
AMC From owner-idn-reg-policy Fri Apr 4 07:27:00 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h34FR0JM013176 for ; Fri, 4 Apr 2003 07:27:00 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h34FR065013175 for idn-reg-policy-bks; Fri, 4 Apr 2003 07:27:00 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from server1.matic.com (server.iicinternet.com [66.159.16.71] (may be forged)) by above.proper.com (8.12.9/8.11.6) with SMTP id h34FQwJM013154 for ; Fri, 4 Apr 2003 07:26:58 -0800 (PST) Received: (qmail 20762 invoked from network); 4 Apr 2003 15:26:53 -0000 Received: from adsl-65-43-33-245.dsl.lgtpmi.ameritech.net (HELO ?192.168.0.100?) (65.43.33.245) by server.iicinternet.com with SMTP; 4 Apr 2003 15:26:53 -0000 Mime-Version: 1.0 X-Sender: tedd@sperling.com (Unverified) Message-Id: In-Reply-To: References: Date: Fri, 4 Apr 2003 10:26:19 -0500 To: idn-reg-policy@imc.org From: tedd Subject: confusability Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: AMC said: >For the moment I'll call the relation "confusability". Given any two >labels (in no particular order), they are either confusable or not, and >it is possible to compute that boolean value. The computation might >enumerate all the labels that could be confused with one of the inputs >and check whether the other input is among them, or it might be possible >to be more efficient by using a clever algorithm. We'll worry about >that later. From an earlier post, someone talked about IBM.com vs 1BM.com -- which should have been ibm.com vs 1bm.com, but none the less this type of similar-looking-glyph use can be confusing. 
It can be even more confusing if one uses a Greek small letter iota with tonos (U03AF) to produce an ibm.com. Is this the type of confusion you are talking about? tedd -- http://sperling.com/ From owner-idn-reg-policy Fri Apr 4 08:55:07 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h34Gt7JM020617 for ; Fri, 4 Apr 2003 08:55:07 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h34Gt7qu020616 for idn-reg-policy-bks; Fri, 4 Apr 2003 08:55:07 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from tux.w3.org (IDENT:root@tux.w3.org [18.29.0.27]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h34Gt5JM020602; Fri, 4 Apr 2003 08:55:05 -0800 (PST) Received: from enoshima (IDENT:root@tux.w3.org [18.29.0.27]) by tux.w3.org (8.12.9/8.12.9) with ESMTP id h34Gt0sr007582; Fri, 4 Apr 2003 11:55:06 -0500 Message-Id: <4.2.0.58.J.20030404112907.0666ae48@localhost> X-Sender: duerst@localhost X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J Date: Fri, 04 Apr 2003 11:34:29 -0500 To: Paul Hoffman / IMC , idn-reg-policy@imc.org From: Martin Duerst Subject: Re: New Internet Draft on registering IDNs In-Reply-To: References: <4.2.0.58.J.20030402171329.0294df10@localhost> <4.2.0.58.J.20030402171329.0294df10@localhost> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 17:11 03/04/03 -0800, Paul Hoffman / IMC wrote: >Many thanks for the detailed editorial comments! I haven't listed them >here but went ahead and marked up the document. A few left-overs: One more comment, of a general nature: The document should clearly state that it is intended for public registries with large numbers of registrations (i.e. 
mostly second-level and for some ccTLDs third-level zones) rather than tightly controlled, privately managed, and in most cases much smaller zones (mostly third-level and some higher levels). In the latter case, the owner does not have to state a policy, and can just register whatever they think is feasible. Regards, Martin. From owner-idn-reg-policy Fri Apr 4 08:55:07 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h34Gt6JM020614 for ; Fri, 4 Apr 2003 08:55:07 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h34Gt6vG020613 for idn-reg-policy-bks; Fri, 4 Apr 2003 08:55:06 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from tux.w3.org (IDENT:root@tux.w3.org [18.29.0.27]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h34Gt5JM020601; Fri, 4 Apr 2003 08:55:05 -0800 (PST) Received: from enoshima (IDENT:root@tux.w3.org [18.29.0.27]) by tux.w3.org (8.12.9/8.12.9) with ESMTP id h34Gt0sp007582; Fri, 4 Apr 2003 11:55:05 -0500 Message-Id: <4.2.0.58.J.20030404110254.066b2530@localhost> X-Sender: duerst@localhost X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J Date: Fri, 04 Apr 2003 11:08:28 -0500 To: Paul Hoffman / IMC , Stephane Bortzmeyer From: Martin Duerst Subject: Re: Comparison of hoffman-idn-reg and jseng-idn-admin Cc: IDN registration policy list In-Reply-To: References: <20030403082051.GA6764@nic.fr> <4.2.0.58.J.20030402150335.035b1fc8@localhost> <20030403082051.GA6764@nic.fr> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 17:16 03/04/03 -0800, Paul Hoffman / IMC wrote: >At 10:20 AM +0200 4/3/03, Stephane Bortzmeyer wrote: >>It doesn't scale if you want to actually generate the bundle and >>publish them in a static zone file.
I tried for the '.fr' zone which >>is quite small - 150,000 domains - and the resulting zone file was >>larger than '.com' even before the domains starting with the letter A >>were fully processed. But you have other approaches: > >Exactly. Saying "it doesn't scale" in the IETF means something very >different than "I can't think of how to make this work". Hello Paul, If your 'exactly' refers to Stephane's last sentence, i.e. if you say that mapping with such a table is okay because, even though a brute force implementation will not scale, there are other implementation alternatives available, then this makes me wonder about the following: One of the overarching goals of IDN/IDNA was that we would not have to change anything in the DNS servers. If we now say "well, we'll just fix this problem by fixing DNS servers", then don't we invalidate our whole approach? Regards, Martin. From owner-idn-reg-policy Fri Apr 4 08:55:04 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h34Gt4JM020599 for ; Fri, 4 Apr 2003 08:55:04 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h34Gt4o2020598 for idn-reg-policy-bks; Fri, 4 Apr 2003 08:55:04 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from tux.w3.org (IDENT:root@tux.w3.org [18.29.0.27]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h34Gt2JM020591; Fri, 4 Apr 2003 08:55:03 -0800 (PST) Received: from enoshima (IDENT:root@tux.w3.org [18.29.0.27]) by tux.w3.org (8.12.9/8.12.9) with ESMTP id h34Gt0sl007582; Fri, 4 Apr 2003 11:55:02 -0500 Message-Id: <4.2.0.58.J.20030404101948.03cd3e28@localhost> X-Sender: duerst@localhost X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J Date: Fri, 04 Apr 2003 11:36:49 -0500 To: Paul Hoffman / IMC , idn-reg-policy@imc.org From: Martin Duerst Subject: Re: New Internet Draft on registering IDNs
In-Reply-To: References: <4.2.0.58.J.20030402171329.0294df10@localhost> <4.2.0.58.J.20030402171329.0294df10@localhost> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 17:11 03/04/03 -0800, Paul Hoffman / IMC wrote: >At 6:35 PM -0500 4/2/03, Martin Duerst wrote: >>I often see 'JET document' in this discussion. Is this >>draft-jseng-idn-admin, or something else? > >That is the one. Thanks. Any explanation for the name 'JET'? >>I have been thinking about 'Framework' quite a bit. Is this draft >>a framework? It seems to be a definition of a table format with >>some associated algorithm(s), and a recommendation to use this >>table format/algorithms. And the recommendation isn't very clear: >>Should all registries use this table format/algorithm? Or is it >>just one of potentially many formats/algorithms, and registries >>can choose? > >Sorry, I don't see how your questions apply to the word "framework". Could >you suggest alternative wording? Okay, let me try again. On a very high level, I'm suspicious of documents with the word "framework" in the title, because this seems to be used more for buzzword compatibility than anything else. On a lower level, I think your document contains two things: 1) the table format and the algorithm to use it (which is basically a technical/protocol spec, and therefore doesn't seem appropriate for a BCP) 2) Some general recommendations for registries (think about what languages you want to support, think about what mappings/blockings to use, document your policies, use the table format/algorithm where possible). This kind of material seems to be adequate for a BCP. So I guess my suggestion is that the draft be split in two, which would then get rid of "framework" automatically. >>I think the question of whether ASCII is allowed or not is a very >>special one, which should be considered separately.
There may
>>be cases where ASCII is already allowed; there is a good argument
>>for always allowing ASCII, in particular if the higher-level
>>domains are all ASCII; there is a good argument to not allow
>>ASCII if the higher-level domains are all non-ASCII.
>>These arguments are rather different from arguments about
>>allowing a few more or less characters.
>
>I don't see why ASCII should be special. A registry decides which
>characters are allowed in the IDN labels, regardless of the scripts. Why
>does it seem special?

For ASCII, as for any other script, there is the question of whether some of these characters are part of the language or not.

However, there are also other questions specific to ASCII, because all current domain names are in ASCII. These are not linguistic questions, but very much operational questions. One could imagine, for example, that someone makes a recommendation that every host reachable by an IDN is also reachable by an ASCII-only domain name. One could then imagine that in many cases, this is done most easily by allowing ASCII in the same zone. Also, there is the question of constructing tables for zones that already allow ASCII (and maybe already have ASCII entries).

I think all these questions should be discussed in a BCP, even if it's only as "you may want to think about".

>>what about tables that don't have the same base character twice,
>>but may map to the same character? E.g.:
>>
>>U+00E8|U+0065
>>U+00E9|U+0065
>>
>>Or where a base character also appears as a variant
>>
>>U+00E8|U+0065
>>U+0065
>>
>>Or where a base character appears as part of a variant:
>>
>>U+00FC|U+0075U+0065
>>U+0075
>>U+0065
>
>I'm not sure what you mean by "what about" here.

My question was: Are these allowed or not?

>All of what you list is just fine. There is nothing anywhere that
>prohibits those possibilities.

In another mail, you wrote:

>>>> I will add text saying that they cannot overlap, and add steps in the process to check for that.
>>>> So it seems that what you are saying is that by definition according to the table, bundles can overlap, but you will add checks so that once a label is taken by a certain bundle, it cannot be taken anymore by another bundle, and therefore effectively they cannot overlap. >Are you proposing that I add examples with all these in them? Yes, please add examples like these. Regards, Martin. From owner-idn-reg-policy Fri Apr 4 09:22:28 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h34HMSJM021375 for ; Fri, 4 Apr 2003 09:22:28 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h34HMRl8021373 for idn-reg-policy-bks; Fri, 4 Apr 2003 09:22:27 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from [66.125.125.92] (adsl-63-202-92-152.dsl.snfc21.pacbell.net [63.202.92.152]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h34HMOJN021361; Fri, 4 Apr 2003 09:22:24 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: In-Reply-To: <4.2.0.58.J.20030404112907.0666ae48@localhost> References: <4.2.0.58.J.20030402171329.0294df10@localhost> <4.2.0.58.J.20030402171329.0294df10@localhost> <4.2.0.58.J.20030404112907.0666ae48@localhost> X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . 
Date: Fri, 4 Apr 2003 09:16:07 -0800 To: Martin Duerst , idn-reg-policy@imc.org From: Paul Hoffman / IMC Subject: Re: New Internet Draft on registering IDNs Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 11:34 AM -0500 4/4/03, Martin Duerst wrote: >The document should clearly state that it is intended for >public registries with large numbers of registrations >(i.e. mostly second-level and for some ccTLDs third-level >zones) rather than tightly controlled, privately managed, >and in most cases much smaller zones (mostly third-level >and some higher levels). But that's not true. It is intended for any zone regardless of size and regardless of where they are in the DNS. > In the later case, the owner >does not have to state a policy, and can just register >whatever they think is feasible. We disagree on how to run zones, then. --Paul Hoffman, Director --Internet Mail Consortium From owner-idn-reg-policy Fri Apr 4 09:22:28 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h34HMRJM021374 for ; Fri, 4 Apr 2003 09:22:28 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h34HMR0a021372 for idn-reg-policy-bks; Fri, 4 Apr 2003 09:22:27 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from [66.125.125.92] (adsl-63-202-92-152.dsl.snfc21.pacbell.net [63.202.92.152]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h34HMOJP021361 for ; Fri, 4 Apr 2003 09:22:25 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: In-Reply-To: <4.2.0.58.J.20030404110254.066b2530@localhost> References: <20030403082051.GA6764@nic.fr> <4.2.0.58.J.20030402150335.035b1fc8@localhost> <20030403082051.GA6764@nic.fr> <4.2.0.58.J.20030404110254.066b2530@localhost> X-Habeas-SWE-1: 
winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . Date: Fri, 4 Apr 2003 09:22:20 -0800 To: IDN registration policy list From: Paul Hoffman / IMC Subject: Re: Comparison of hoffman-idn-reg and jseng-idn-admin Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 11:08 AM -0500 4/4/03, Martin Duerst wrote: >One of the overreaching goals of IDN/IDNA was that we would not >have to change anything in the DNS servers. If we now say >"well, we'll just fix this problem by fixing DNS servers", >then don't we invalidate our whole approach? It is rude to put words in other people's mouths, particularly when the words you put there are stupid. A zone does not have to create variant tables. A zone that creates variant tables has to decide how those tables will be used in practice. A zone that creates variant tables and practices should consider what the effect of creating those tables would be on both its DNS servers and on its registration system. This is pretty obvious to people who actually run zones. 
--Paul Hoffman, Director --Internet Mail Consortium From owner-idn-reg-policy Fri Apr 4 09:33:58 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h34HXwJM021751 for ; Fri, 4 Apr 2003 09:33:58 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h34HXwJr021750 for idn-reg-policy-bks; Fri, 4 Apr 2003 09:33:58 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from [66.125.125.92] (adsl-63-202-92-152.dsl.snfc21.pacbell.net [63.202.92.152]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h34HXtJN021746; Fri, 4 Apr 2003 09:33:55 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: In-Reply-To: <4.2.0.58.J.20030404101948.03cd3e28@localhost> References: <4.2.0.58.J.20030402171329.0294df10@localhost> <4.2.0.58.J.20030402171329.0294df10@localhost> <4.2.0.58.J.20030404101948.03cd3e28@localhost> X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . Date: Fri, 4 Apr 2003 09:33:53 -0800 To: Martin Duerst , idn-reg-policy@imc.org From: Paul Hoffman / IMC Subject: Re: New Internet Draft on registering IDNs Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 11:36 AM -0500 4/4/03, Martin Duerst wrote: >Okay, let me try again. 
On a very high level, I'm suspicious of
>documents with the word "framework" in the title, because this
>seems to be used more for buzzword compatibility than anything else.

Years ago, I created a protocol that had the word "Simple" in the title. Someone commented that every protocol should have the word "simple" or "fast" or "revised" in the title so that people could not argue against it. Maybe I should add "framework" to that list.

>On a lower level, I think your document contains two things:
>1) the table format and the algorithm to use it (which is
>   basically a technical/protocol spec, and therefore doesn't
>   seem appropriate for BCP)
>2) Some general recommendations for registries (think about
>   what languages you want to support, think about what
>   mappings/blockings to use, document your policies,
>   use the table format/algorithm where possible).
>   This kind of material seems to be adequate for a BCP.
>
>So I guess my suggestion is that the draft be split in two,
>which would then get rid of "framework" automatically.

I don't think the draft can be split into two because the whole idea of allocations/blockings is tied quite tightly to the algorithm that creates the bundles. Having said that, and since I'm slogging through creating the -01 draft, I will see what I can do to separate out the three parts (recommendations, algorithm, format) better in the words in the document.

>For ASCII, as for any other script, there is the question of whether
>some of these characters are part of the language or not.

Agree, but that is a concern for the person making the table.

>However, there are also other questions specific to ASCII,
>because all current domain names are in ASCII.

This document is only for IDNs. Non-IDN domain names are not relevant here.

> These are
>not linguistic questions, but very much operational questions.

Sorry, I still don't see how.
>One could imagine for example that we/somebody make/s a recommendation >that every host reachable by an IDN is also reachable by an ASCII-only >domain name. One could then imagine that in many cases, this is >done most easily by allowing ASCII in the same zone. We aren't making that recommendation here. If someone wants to make that recommendation, they would have to show how to have IDNs map to non-IDNs. They can do that, but it isn't relevant here, just as someone who says "every host reachable by an IDN is also reachable by an all-Ethiopic IDN." >Also, there is the question of constructing tables for >zones that already allow ASCII (and maybe already have >ASCII entries). Sorry, I'm still not seeing what you are saying. How does this relate to IDNs? >Yes, please add examples like these. Will do. --Paul Hoffman, Director --Internet Mail Consortium From owner-idn-reg-policy Fri Apr 4 12:13:20 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h34KDKJM028453 for ; Fri, 4 Apr 2003 12:13:20 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h34KDKD3028452 for idn-reg-policy-bks; Fri, 4 Apr 2003 12:13:20 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from nicemice.net (arwen.CS.Berkeley.EDU [128.32.132.165]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h34KDGJM028447 for ; Fri, 4 Apr 2003 12:13:19 -0800 (PST) Received: from amc by nicemice.net with local (Exim 3.35 #1 (Debian)) id 191XYt-0000uu-00 for ; Fri, 04 Apr 2003 12:13:19 -0800 Date: Fri, 4 Apr 2003 20:13:19 +0000 From: "Adam M. 
Costello" To: idn-reg-policy@imc.org Subject: Re: confusability Message-ID: <20030404201319.GA3202@nicemice.net> Reply-To: IDN registration policy list References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4i Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: tedd wrote: > > For the moment I'll call the relation "confusability". Given any > > two labels (in no particular order), they are either confusable or > > not, and it is possible to compute that boolean value. > > From an earlier post, someone talked about IBM.com vs 1BM.com -- which > should have been ibm.com vs 1bm.com, but none the less this type of > similar-looking-glyph use can be confusing. It can be even more > confusing if one uses a Greek small letter iota with tonos (U03AF) to > produce an ibm.com. Is this the type of confusion you are talking > about? Could be. A registry would define its confusability relation as it sees fit. It doesn't want to define confusability so narrowly that not enough things are considered confusable, because then it would be swamped by disputes about name ownership. But it doesn't want to define confusability so broadly that it drastically curtails the number of registrations (and hence revenue). Maybe "confusable" is not the best term. Maybe "neighboring" would be better. It's got some of the right intuition: If you are my neighbor, then I am your neighbor (symmetry), but my neighbor's neighbor is not necessarily my neighbor (intransitivity). You can speak of the neighborhood centered around a particular label. Neighborhoods centered around different labels can partially overlap. A bundle would be either a set of labels that are all neighbors of each other, or a subset of the neighborhood centered around the bundle's primary label, depending on which version of property 2 we use. 
Property 1 says that neighboring labels in a zone must not belong to distinct bundles. I just noticed that I forgot to state an assumption, which we can call property 0: Every label in a zone belongs to exactly one bundle. AMC From owner-idn-reg-policy Fri Apr 4 12:14:36 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h34KEaJM028503 for ; Fri, 4 Apr 2003 12:14:36 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h34KEaKu028502 for idn-reg-policy-bks; Fri, 4 Apr 2003 12:14:36 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from tux.w3.org (IDENT:root@tux.w3.org [18.29.0.27]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h34KEYJM028488; Fri, 4 Apr 2003 12:14:34 -0800 (PST) Received: from enoshima (IDENT:root@tux.w3.org [18.29.0.27]) by tux.w3.org (8.12.9/8.12.9) with ESMTP id h34KEZsl026309; Fri, 4 Apr 2003 15:14:36 -0500 Message-Id: <4.2.0.58.J.20030404133734.02e0b788@localhost> X-Sender: duerst@localhost X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J Date: Fri, 04 Apr 2003 13:59:32 -0500 To: Paul Hoffman / IMC , idn-reg-policy@imc.org From: Martin Duerst Subject: Re: New Internet Draft on registering IDNs In-Reply-To: References: <4.2.0.58.J.20030404101948.03cd3e28@localhost> <4.2.0.58.J.20030402171329.0294df10@localhost> <4.2.0.58.J.20030402171329.0294df10@localhost> <4.2.0.58.J.20030404101948.03cd3e28@localhost> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 09:33 03/04/04 -0800, Paul Hoffman / IMC wrote: >At 11:36 AM -0500 4/4/03, Martin Duerst wrote: >>On a lower level, I think your document contains two things: >>1) the table format and the algorithm to use it (which is >> basically a technical/protocol spec, and therefore 
doesn't
>>   seem appropriate for BCP)
>>2) Some general recommendations for registries (think about
>>   what languages you want to support, think about what
>>   mappings/blockings to use, document your policies,
>>   use the table format/algorithm where possible).
>>   This kind of material seems to be adequate for a BCP.
>>
>>So I guess my suggestion is that the draft be split in two,
>>which would then get rid of "framework" automatically.
>
>I don't think the draft can be split into two because the whole idea of
>allocations/blockings is tied quite tightly to the algorithm that creates
>the bundles.

Blockings are clearly related to the tables. But there are recommendations at a much higher level that aren't really related, such as the top recommendation of "think things through before you start registering". It might also be that we end up with two or more different tables/algorithms, each suited for different kinds of (linguistic/orthographic/whatever) situations. Each of these would then be described in its own document, and there would be a BCP that would contain the general considerations. But it is not clear to me from your current document whether you intend this to be the only table/algorithm, or whether you think there might be several.

>Having said that, and since I'm slogging through creating the -01 draft, I
>will see what I can do to separate out the three parts (recommendations,
>algorithm, format) better in the words in the document.

I think this is an excellent way to move forward.

>>For ASCII, as for any other script, there is the question of whether
>>some of these characters are part of the language or not.
>
>Agree, but that is a concern for the person making the table.
>
>>However, there are also other questions specific to ASCII,
>>because all current domain names are in ASCII.
>
>This document is only for IDNs. Non-IDN domain names are not relevant here.

I'm sorry, but I don't understand what you are saying.
As far as I understand, a zone can contain both ASCII-only and IDN domain names. Also, in practice, there are a lot of cases where there will be e.g. mappings from/to ASCII characters. We already have seen many examples. The tables could easily be used for a zone even if that zone only allowed ASCII, or they could be specifically used to state that a zone contains only ASCII. (ASCII always referring to what the user sees, because obviously with IDNA, the actual zone data is pure ASCII anyway).

For example, an u-umlaut in German is often equated with 'ue'. Now suppose that the German TLD wants to define tables to take this into account, and apply these to the current ASCII registrations in .de. Can they design the tables so that they don't have to take away already registered domains from their owners? If they apply the algorithm, in what sequence do they apply it to the registrations they already have? Or, how do they design the tables so that the result is the same regardless of the sequence in which they apply it?

These are all specific, real questions. If there are easy answers to them, it doesn't hurt to state them in the draft. If these are unsolved questions, then we can state that so that we can start trying to solve them.

Regards, Martin.
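To make the sequence question concrete, here is a minimal sketch in Python. It is not taken from the draft: the one-row table (u-umlaut equated with "ue"), the `fold` and `bundles` helpers, and the sample labels are all assumptions made up for illustration. It only shows that if a table can be reduced to a canonical key per label, grouping existing registrations by that key gives the same result in any processing order, and that a requested label can be checked against those groups.

```python
from collections import defaultdict
from itertools import permutations


def fold(label):
    # Hypothetical one-row table: treat u-umlaut (U+00FC) as equivalent to "ue".
    return label.replace("\u00fc", "ue")


def bundles(labels):
    """Group labels by their folded key; the result does not depend on input order."""
    groups = defaultdict(set)
    for label in labels:
        groups[fold(label)].add(label)
    return {key: frozenset(members) for key, members in groups.items()}


# Hypothetical existing ASCII registrations in a zone.
existing = ["clue", "fuehrer", "muenchen"]
taken = bundles(existing)

# A later request for "cl" + u-umlaut folds to "clue", which is already taken,
# so a policy could reserve it for the holder of "clue" instead of revoking anything.
requested = "cl\u00fc"
clashes = fold(requested) in taken

# Every processing order of the existing registrations yields the same grouping.
order_independent = all(bundles(p) == taken for p in permutations(existing))
```

This only covers tables that define an equivalence. A table whose relation is not transitive has no single canonical key, so the ordering question stays open for that case.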
From owner-idn-reg-policy Fri Apr 4 12:14:36 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h34KEaJM028505 for ; Fri, 4 Apr 2003 12:14:36 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h34KEaK8028504 for idn-reg-policy-bks; Fri, 4 Apr 2003 12:14:36 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from tux.w3.org (IDENT:root@tux.w3.org [18.29.0.27]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h34KEYJM028487; Fri, 4 Apr 2003 12:14:34 -0800 (PST) Received: from enoshima (IDENT:root@tux.w3.org [18.29.0.27]) by tux.w3.org (8.12.9/8.12.9) with ESMTP id h34KEZsj026309; Fri, 4 Apr 2003 15:14:36 -0500 Message-Id: <4.2.0.58.J.20030404133143.05c4e718@localhost> X-Sender: duerst@localhost X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J Date: Fri, 04 Apr 2003 13:34:02 -0500 To: Paul Hoffman / IMC , idn-reg-policy@imc.org From: Martin Duerst Subject: Re: New Internet Draft on registering IDNs In-Reply-To: References: <4.2.0.58.J.20030404112907.0666ae48@localhost> <4.2.0.58.J.20030402171329.0294df10@localhost> <4.2.0.58.J.20030402171329.0294df10@localhost> <4.2.0.58.J.20030404112907.0666ae48@localhost> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 09:16 03/04/04 -0800, Paul Hoffman / IMC wrote: >At 11:34 AM -0500 4/4/03, Martin Duerst wrote: >>The document should clearly state that it is intended for >>public registries with large numbers of registrations >>(i.e. mostly second-level and for some ccTLDs third-level >>zones) rather than tightly controlled, privately managed, >>and in most cases much smaller zones (mostly third-level >>and some higher levels). > >But that's not true. 
It is intended for any zone regardless of size and >regardless of where they are in the DNS. > >> In the later case, the owner >>does not have to state a policy, and can just register >>whatever they think is feasible. > >We disagree on how to run zones, then. Okay, so let's take an example. Assume that at w3.org, we want to create a small number of third-level IDNs. Do you think that we should set up tables and other infrastructure before we can put a single IDN into that zone? If you think this is the case, can you explain why? Regards, Martin. From owner-idn-reg-policy Fri Apr 4 13:09:05 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h34L95JM004086 for ; Fri, 4 Apr 2003 13:09:05 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h34L95nK004085 for idn-reg-policy-bks; Fri, 4 Apr 2003 13:09:05 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from bartok.sidn.nl (bartok.sidn.nl [193.176.144.164]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h34L93JM004060; Fri, 4 Apr 2003 13:09:04 -0800 (PST) Received: from bartok.sidn.nl (localhost.sidn.nl [127.0.0.1]) by bartok.sidn.nl (8.12.9/8.12.9) with ESMTP id h34L8tuc001122; Fri, 4 Apr 2003 23:08:55 +0200 (CEST) (envelope-from jaap@bartok.sidn.nl) Message-Id: <200304042108.h34L8tuc001122@bartok.sidn.nl> To: Martin Duerst cc: Paul Hoffman / IMC , idn-reg-policy@imc.org Subject: Re: New Internet Draft on registering IDNs In-reply-to: Your message of Fri, 04 Apr 2003 13:59:32 -0500. <4.2.0.58.J.20030404133734.02e0b788@localhost> Date: Fri, 04 Apr 2003 23:08:55 +0200 From: Jaap Akkerhuis Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Martin & all, Your next questions/examples was something I was thinking about some time. 
We (can) have similar problems with Dutch as well, but I will keep my comments to your .de example.

    For example, an u-umlaut in German is often equated with 'ue'. Now suppose that the German TLD wants to define tables to take this into account, and apply these to the current ASCII registrations in .de. Can they design the tables so that they don't have to take away already registered domains from their owners? If they apply the algorithm, in what sequence do they apply it to the registrations they already have? Or, how do they design the tables so that there is no difference in the result independent of what sequence they apply it? These are all specific, real, questions.

Yes, these are real problems. There is the domain name clueless.de (and clue.de and probably more such as fuehrer.de). If the algorithm specifies that u-umlaut and ue are in a single bundle it might mean that either the clue.de delegation should be cancelled or that nobody can register clu+umlaut. That last one is probably unlikely, but fuehrer.de and fu(+umlaut)hrer.de (or any other word that makes sense and is already registered) is a potential clash. There are probably more of these clashes than I can think of, but I leave it to more qualified German speakers than me to come up with them.

    If there are easy answers to them, it doesn't hurt to state them in the draft. If these are unsolved questions, then we can state it so that we can start trying to solve them.

The easy answer is of course that denic decides not to introduce the u+umlaut at all and just lives with the current status quo. But that is quite unsatisfactory. I'm afraid I don't even see a start toward solving this type of problem.
jaap

From owner-idn-reg-policy Fri Apr 4 13:31:48 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h34LVlJM004855 for ; Fri, 4 Apr 2003 13:31:47 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h34LVlKJ004853 for idn-reg-policy-bks; Fri, 4 Apr 2003 13:31:47 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from bartok.sidn.nl (bartok.sidn.nl [193.176.144.164]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h34LVhJM004841; Fri, 4 Apr 2003 13:31:44 -0800 (PST) Received: from bartok.sidn.nl (localhost.sidn.nl [127.0.0.1]) by bartok.sidn.nl (8.12.9/8.12.9) with ESMTP id h34LVeuc001195; Fri, 4 Apr 2003 23:31:40 +0200 (CEST) (envelope-from jaap@bartok.sidn.nl) Message-Id: <200304042131.h34LVeuc001195@bartok.sidn.nl> To: Martin Duerst cc: Paul Hoffman / IMC , idn-reg-policy@imc.org Subject: Re: New Internet Draft on registering IDNs In-reply-to: Your message of Fri, 04 Apr 2003 13:34:02 -0500. <4.2.0.58.J.20030404133143.05c4e718@localhost> Date: Fri, 04 Apr 2003 23:31:40 +0200 From: Jaap Akkerhuis Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID:

    (paul) >We disagree on how to run zones, then.

It really seems you do. However, how to run zones is another political problem.

    Okay, so let's take an example. Assume that at w3.org, we want to create a small number of third-level IDNs. Do you think that we should set up tables and other infrastructure before we can put a single IDN into that zone? If you think this is the case, can you explain why?

I can't, but it raises another issue. At first, one would say that if w3.org makes a big mess of its subdomains (in the eyes of the users of these domains) it is the problem of the w3.org domain name holder. They made the mess, they should sort it out.
However, if it spills over into a more generic bigger fight, can the .org holder be forced to delete the w3.org delegation so the problem is solved (somewhat) by brute force? Also, suppose that centralnic (www.centralnic.com) makes a mess of its *..com delegations, can .com pull the plug? I don't have an answer, but if you generalize this slightly more, such as .org makes a mess of its IDN delegations, you're likely to land into ICANN related policy discussions.

jaap

From owner-idn-reg-policy Fri Apr 4 13:35:26 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h34LZQJM004936 for ; Fri, 4 Apr 2003 13:35:26 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h34LZQJE004935 for idn-reg-policy-bks; Fri, 4 Apr 2003 13:35:26 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from nicemice.net (arwen.CS.Berkeley.EDU [128.32.132.165]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h34LZPJM004931 for ; Fri, 4 Apr 2003 13:35:25 -0800 (PST) Received: from amc by nicemice.net with local (Exim 3.35 #1 (Debian)) id 191YqO-00018r-00 for ; Fri, 04 Apr 2003 13:35:28 -0800 Date: Fri, 4 Apr 2003 21:35:28 +0000 From: "Adam M. Costello" To: idn-reg-policy@imc.org Subject: Re: New Internet Draft on registering IDNs Message-ID: <20030404213527.GC3202@nicemice.net> Reply-To: IDN registration policy list References: <4.2.0.58.J.20030404133734.02e0b788@localhost> <200304042108.h34L8tuc001122@bartok.sidn.nl> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <200304042108.h34L8tuc001122@bartok.sidn.nl> User-Agent: Mutt/1.4i Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID:

Jaap Akkerhuis wrote:
> There is the domain name clueless.de (and clue.de and probably more
> such as fuehrer.de).
If the algorithm specifies that u-umlaut and > ue are in a single bundle it might mean that either the clue.de > delegation should be cancelled or that nobody can register clu+umlaut. Or that only the owner of clue.de can have clu. > That last one is probably unlikely, but fuehrer.de and > fu(+umlaut)hrer.de (or any other word that makes sense and is already > registered) is a potential clash. Similarly, fuhrer.de might be unavailable to everyone except the owner of fuehrer.de. That seems like the right policy to me. I think revoking existing registrations is a non-starter; the registrants would be furious, and rightly so. AMC From owner-idn-reg-policy Fri Apr 4 13:44:10 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h34LiAJM005268 for ; Fri, 4 Apr 2003 13:44:10 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h34LiAFB005267 for idn-reg-policy-bks; Fri, 4 Apr 2003 13:44:10 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from tux.w3.org (IDENT:root@tux.w3.org [18.29.0.27]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h34Li9JM005263 for ; Fri, 4 Apr 2003 13:44:09 -0800 (PST) Received: from enoshima (IDENT:root@tux.w3.org [18.29.0.27]) by tux.w3.org (8.12.9/8.12.9) with ESMTP id h34Li9sj019080; Fri, 4 Apr 2003 16:44:10 -0500 Message-Id: <4.2.0.58.J.20030404141327.02dfc8b0@localhost> X-Sender: duerst@localhost X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J Date: Fri, 04 Apr 2003 16:43:49 -0500 To: IDN registration policy list , IDN registration policy list From: Martin Duerst Subject: Re: model with overlapping variants In-Reply-To: <20030404081656.GA29886@nicemice.net> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Hello Adam, Thanks 
a lot for this very clear writeup.

At 08:16 03/04/04 +0000, Adam M. Costello wrote:
>Let's suppose for a moment that it turns out to be essential to use a
>relation that is not transitive, as Roozbeh has suggested, and as (I now
>think) Paul would agree. That is, even if labels X and Y are unrelated,
>there might exist a label V that is related to both.
>
>But let's assume the relation can be symmetric; that is, there is no
>need to distinguish between "X is related to Y" versus "Y is related to
>X" versus "X and Y are related". Someone tell me if you think that's an
>unreasonable assumption.

I currently tend to think that it could be quite reasonable. The main thing I'm worried about is what happens if X or Y (or both) are strings rather than characters. There may be some weird cases where the 'longest prefix' match somehow destroys the symmetry.

>For the moment I'll call the relation "confusability".

What about "similarity"?

>Given any two
>labels (in no particular order), they are either confusable or not, and
>it is possible to compute that boolean value. The computation might
>enumerate all the labels that could be confused with one of the inputs
>and check whether the other input is among them, or it might be possible
>to be more efficient by using a clever algorithm. We'll worry about
>that later.
>
>In this model, there are no atomic groups of labels. In a different
>model that used an equivalence relation, we could be sure that every
>group is either entirely registered or entirely available. But with
>this non-transitive relation, it's possible the set of labels confusable
>with a submitted label is partially registered and partially available.
>
>What properties ought a registration policy to have in this model?
>
> 1. Two labels in the zone belonging to different registrants must not
>    be confusable.
Corollary: Two labels in the zone belonging to
>    different registration bundles must not be confusable, even if they
>    belong to the same registrant (because the registrant could sell
>    them to different buyers).
>
> 2. Bundles must not tie together unrelated things in the zone. (That
>    would cheat the registry out of fees for what should be separate
>    bundles.) There are at least three ways to define the relatedness
>    property that a bundle must satisfy:
>
>    a. loosely related: For every pair of labels X and Y in the zone
>       that are in the same bundle, there must exist a sequence X, Z1,
>       Z2, Z3, ..., Zk, Y such that every pair of adjacent labels in
>       the sequence is confusable. In other words, X and Y must be
>       related by the transitive closure of the confusability relation.
>
>    b. tightly related: Every pair of labels X and Y in the zone that
>       are in the same bundle must be confusable. In other words, the
>       labels in the zone in a particular bundle form a clique.
>
>    c. radially related: Among the labels in the zone that belong to a
>       particular bundle, one is designated as the center, and all the
>       others are confusable with it.
>
>I think properties 1 and 2(a|b|c) are all we need. Am I overlooking
>anything?
>
>Notice that the properties speak only of labels in the zone. They place
>no constraints on labels that are not in the zone. Nobody really cares
>what sort of behind-the-scenes bookkeeping the registry might be doing
>with the labels that are not in the zone, as long as the labels in the
>zone are well behaved.
>
>2a looks difficult to enforce, and might lead to bundles that are too
>large, so let's put that one aside for now.

If we assume that registrants try to grab as much as possible, then the bundles could become very large. In fact, if everybody tried to grab as many entries as possible into a bundle, we would end up with an equivalence relation.
Using ~ to express confusability, 2c gives us situations where we can have X~Y and Y~Z, which we could call a confusability chain of length 3. The question is whether there are longer confusability chains in practice.

>2b and 2c both look reasonable to me.
>
>How could a registry enforce those properties? Here's one general
>approach:
>
>Each bundle contains a set of related labels, all of which are in the
>zone. Bundles do not contain "blocked" labels. For 2c, one of the
>labels is flagged as the primary label and cannot be removed from the
>bundle.
>
>When a registrant asks to create a new bundle, containing a single
>initial label, the request is denied if the label is confusable with any
>label already in the zone.
>
>When a registrant asks to add a label to their existing bundle, the
>request is denied if the label is confusable with any label in any other
>bundle.

One way to visualize this is that there would be a one layer thick 'neutral zone' between any two bundles that might be transitively confusable.

>Also, the request is denied if the label is not confusable
>with: every label in the bundle (2b) / the primary label (2c).
>
>When a registrant asks to remove a label from their existing bundle, the
>request is denied if it is the only(2b)/primary(2c) label in the bundle.

The 'only(2b)' case seems irrelevant because of what you say in the next point.

>A request by a registrant to remove one of their entire bundles is never
>denied.
>
>We do not need to specify whether a registry keeps records regarding
>labels that are not in the zone. It might do so, as a precomputation to
>help with the required checks, or there might be clever pattern-matching
>algorithms and clever indexing data structures that allow the registry
>to perform the required checks without extra storage.
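The create/add/remove rules quoted above can be sketched in a few lines of Python. This is an illustration only, not anything specified in the thread: the `Registry` class, its method names, and the toy `confusable` relation built from the hypothetical pairs "1"~"i" and "i"~"l" are all assumptions. The relation is symmetric but deliberately not transitive, so the difference between 2b and 2c shows up.

```python
from itertools import count

# Hypothetical similarity pairs: "1"~"i" and "i"~"l", but NOT "1"~"l".
SIMILAR = {frozenset("1i"), frozenset("il")}


def confusable(x, y):
    """Toy relation: symmetric, reflexive, and deliberately non-transitive."""
    return len(x) == len(y) and all(
        a == b or frozenset((a, b)) in SIMILAR for a, b in zip(x, y))


class Registry:
    def __init__(self, mode="2c"):
        self.mode = mode          # "2b" (clique) or "2c" (radial)
        self.bundles = {}         # bundle id -> labels; index 0 is the primary
        self._ids = count(1)

    def _labels_outside(self, bundle_id):
        return [l for b, ls in self.bundles.items() if b != bundle_id for l in ls]

    def create_bundle(self, label):
        # Denied if the label is confusable with any label already in the zone.
        if any(confusable(label, l) for l in self._labels_outside(None)):
            return None
        bundle_id = next(self._ids)
        self.bundles[bundle_id] = [label]
        return bundle_id

    def add_label(self, bundle_id, label):
        labels = self.bundles[bundle_id]
        if label in labels:
            return False
        # Denied if confusable with any label in any other bundle.
        if any(confusable(label, l) for l in self._labels_outside(bundle_id)):
            return False
        # Must be confusable with every label (2b) or with the primary (2c).
        required = labels if self.mode == "2b" else labels[:1]
        if not all(confusable(label, l) for l in required):
            return False
        labels.append(label)
        return True

    def remove_label(self, bundle_id, label):
        labels = self.bundles[bundle_id]
        if label not in labels:
            return False
        if self.mode == "2b" and len(labels) == 1:
            return False          # the only label (2b)
        if self.mode == "2c" and label == labels[0]:
            return False          # the primary label (2c)
        labels.remove(label)
        return True

    def remove_bundle(self, bundle_id):
        # Removing an entire bundle is never denied.
        self.bundles.pop(bundle_id, None)
```

With "ibm" as the first label of a bundle, adding "1bm" and then "lbm" succeeds under 2c because both are confusable with the primary, while under 2b the second addition is refused because "lbm" and "1bm" are not confusable with each other. The checks here scan every label in the zone, which is the naive approach; a real registry would presumably need an index.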
>
>If this model looks promising, the big question is whether there
>is a confusability relation (parameterized by tables) that is both
>expressive enough to be useful and tractable enough to permit a feasible
>implementation of the model.
>
>With the sort of tables that have been proposed so far, the checks to be
>performed are effectively regular expression matches on the zone file
>or on bundles. And not arbitrary regular expressions, but a restricted
>class of regular expressions built from rows of the mapping table. It
>wouldn't surprise me if there were tricks that could be played, but I
>have no expertise in this area.

I think it's easy to make matching a regular expression against an ordered zone file much more efficient than matching that regular expression against each of the entries in the zone in turn. The main observation is that the state 'matched up to here' is represented not just by an index into the target string, but by an index and an interval from the first to the last string matched.

To show a very simple example, assume we have a label "1bm", and a similarity "1" ~ "i". This gives us a regular expression "(1|i)bm". Now assume we try to match this against a zone file with the following entries:

1) 0bm
2) 1bb
3) 1oo
4) abb
5) bbc
6) ibc
7) ibm
8) ibx
9) quu

Starting by matching the "1", this gives us an interval of 2)-3) (i.e. it can match either string 2) or string 3)). Trying with "1b", this reduces the interval to only 2), and trying with "1bm" reduces the interval to nothing, so we have to backtrack. Trying with "i" gives us an interval of 6)-8), and with "ib", we continue with the same interval, to then find a match with "ibm" at 7).

Regards, Martin.
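The interval-narrowing walk in the example above can be sketched in Python with binary search over the sorted zone. This is an illustration under assumptions, not an implementation anyone on the list proposed: the function names are made up, the restricted regular expression is represented as a list of alternatives per position, and labels are assumed to be sorted by code point.

```python
from bisect import bisect_left


def prefix_range(zone, prefix, lo, hi):
    """Narrow [lo, hi) in the sorted list `zone` to the labels starting with `prefix`."""
    # Smallest string that sorts after every string beginning with `prefix`
    # (assumes the last character is not the maximum code point).
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    new_lo = bisect_left(zone, prefix, lo, hi)
    new_hi = bisect_left(zone, upper, new_lo, hi)
    return new_lo, new_hi


def find_confusable(zone, pattern):
    """Return the zone labels matched by `pattern`, a list of alternatives per position."""
    matches = []

    def walk(pos, prefix, lo, hi):
        if lo >= hi:
            return                      # empty interval: backtrack
        if pos == len(pattern):
            # The prefix itself sorts first among the labels that start with it.
            if zone[lo] == prefix:
                matches.append(prefix)
            return
        for alt in pattern[pos]:
            new_prefix = prefix + alt
            new_lo, new_hi = prefix_range(zone, new_prefix, lo, hi)
            walk(pos + 1, new_prefix, new_lo, new_hi)

    walk(0, "", 0, len(zone))
    return matches


# The nine zone entries from the example, already in sorted order.
ZONE = ["0bm", "1bb", "1oo", "abb", "bbc", "ibc", "ibm", "ibx", "quu"]
# "(1|i)bm" written as alternatives per position.
PATTERN = [["1", "i"], ["b"], ["m"]]

result = find_confusable(ZONE, PATTERN)   # ['ibm']
```

Using 0-based indexes, the prefix "1" narrows the interval to entries 2)-3), "1b" to entry 2), and "1bm" to nothing, after which the walk backtracks to "i" and ends at entry 7), as in the trace above. Each step costs two binary searches instead of a pass over the zone.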
From owner-idn-reg-policy Fri Apr 4 14:09:20 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h34M9KJM006041 for ; Fri, 4 Apr 2003 14:09:20 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h34M9J7d006039 for idn-reg-policy-bks; Fri, 4 Apr 2003 14:09:19 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from [66.125.125.92] (adsl-63-202-92-152.dsl.snfc21.pacbell.net [63.202.92.152]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h34M9DJR006019; Fri, 4 Apr 2003 14:09:15 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: In-Reply-To: <200304042131.h34LVeuc001195@bartok.sidn.nl> References: <200304042131.h34LVeuc001195@bartok.sidn.nl> X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . Date: Fri, 4 Apr 2003 14:08:15 -0800 To: Jaap Akkerhuis , Martin Duerst From: Paul Hoffman / IMC Subject: Re: New Internet Draft on registering IDNs Cc: idn-reg-policy@imc.org Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 11:31 PM +0200 4/4/03, Jaap Akkerhuis wrote: > > (paul) >We disagree on how to run zones, then. > >It really seems you do. However, how to run zones is another political >problem. Exactly. This document should not say "little zones do this, but big zones do that", particularly if we can't say what is little and big. 
This document says "if you want to plan, here's what you should do; if you don't want to plan, here are some of the problems you will possibly encounter later". >At first, one would say that if w3.org makes a big mess of its >subdomains (in the eyes of the users of these domains) it is the >problem of the w3.org domain name holder. They made the mess, they >should sort it out. However, if it spills over into a more >generic bigger fight, can the .org holder be forced to delete the >w3.org delegation so the problem is solved (somewhat) by brute >force? Nothing in this document suggests that. Of course, any zone (like w3.org, or .org, or the root) can decide to force rules on its sub-zones. >I don't have an answer, but if you generalize this slightly more, >such as .org makes a mess of its IDN delegations, you're likely to >land in ICANN related policy discussions. Of course. If things for which ICANN is responsible (the TLDs) make a mess, it is up to ICANN to fix it. The same is true for any level of the DNS hierarchy. 
--Paul Hoffman, Director --Internet Mail Consortium From owner-idn-reg-policy Fri Apr 4 14:09:15 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h34M9FJM006025 for ; Fri, 4 Apr 2003 14:09:15 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h34M9FWJ006024 for idn-reg-policy-bks; Fri, 4 Apr 2003 14:09:15 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from [66.125.125.92] (adsl-63-202-92-152.dsl.snfc21.pacbell.net [63.202.92.152]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h34M9DJN006019 for ; Fri, 4 Apr 2003 14:09:13 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: In-Reply-To: <4.2.0.58.J.20030404133734.02e0b788@localhost> References: <4.2.0.58.J.20030404101948.03cd3e28@localhost> <4.2.0.58.J.20030402171329.0294df10@localhost> <4.2.0.58.J.20030402171329.0294df10@localhost> <4.2.0.58.J.20030404101948.03cd3e28@localhost> <4.2.0.58.J.20030404133734.02e0b788@localhost> X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . 
Date: Fri, 4 Apr 2003 14:00:34 -0800 To: idn-reg-policy@imc.org From: Paul Hoffman / IMC Subject: Re: New Internet Draft on registering IDNs Content-Type: text/plain; charset="iso-8859-1" ; format="flowed" Content-Transfer-Encoding: 8bit Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 1:59 PM -0500 4/4/03, Martin Duerst wrote: >But it is not clear to me from your current document whether you intend >this to be the only table/algorithm, or whether you think there might >be several. I'll try to be clearer. And, obviously, other folks can submit different proposals, some of which might be more framework-y than this one. >For example, a u-umlaut in German is often equated with 'ue'. >Now suppose that the German TLD wants to define tables to take >this into account, and apply these to the current ASCII registrations >in .de. Can they design the tables so that they don't have to take >away already registered domains from their owners? This ties directly into the overlapping bundle question that I need to clarify. Every all-ASCII name in a zone is inherently a bundle of one. The bundle for dürst would contain dürst and duerst, and the bundle for duerst would contain just duerst. An unstated rule that needs to be stated is that you cannot allocate a name from your bundle if that name already exists in the zone. Such names can be in the bundle and will block further registration, but they won't go into the zone for that registrant. In the case above, assume that someone already has duerst (which is true in .de and .ch). You now register dürst and get the bundle of dürst and duerst, with dürst being allocated. The current owner of duerst lets their registration expire. You could then put duerst into the zone as part of your bundle, if the registry rules allow that. But, even if you couldn't put it in the zone, the existence of your bundle would prevent me from registering duerst directly. 
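The dürst/duerst case above can be sketched in a few lines. This is only an illustration under stated assumptions: the `Zone` class, its method names, and the one-rule table in `bundle_for` (u-umlaut also bundles its 'ue' spelling) are hypothetical and are not taken from the draft.

```python
def bundle_for(name):
    """Hypothetical table: a label with u-umlaut also bundles its 'ue' form.

    An all-ASCII label is a bundle of one, as described in the message.
    """
    names = {name}
    if "ü" in name:
        names.add(name.replace("ü", "ue"))
    return names


class Zone:
    def __init__(self):
        self.allocated = {}  # label in the zone -> registrant
        self.bundles = {}    # registered label -> (registrant, bundle labels)

    def register(self, registrant, name):
        """Allocate `name` unless it is in the zone or in another's bundle."""
        if name in self.allocated:
            return False
        for owner, labels in self.bundles.values():
            if name in labels and owner != registrant:
                return False  # blocked by someone else's bundle
        # Bundle members already in the zone stay with their current holder.
        self.bundles[name] = (registrant, bundle_for(name))
        self.allocated[name] = registrant
        return True

    def expire(self, name):
        """Drop a registration and the bundle created by it."""
        self.allocated.pop(name, None)
        self.bundles.pop(name, None)


zone = Zone()
zone.register("first", "duerst")   # existing ASCII registration
zone.register("second", "dürst")   # bundle {dürst, duerst}; only dürst allocated
zone.expire("duerst")              # the original holder lets duerst lapse
print(zone.register("third", "duerst"))  # False: blocked by second's bundle
```

Whether the holder of the dürst bundle may later place duerst in the zone is left to registry rules, so the sketch does not model that step.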
--Paul Hoffman, Director --Internet Mail Consortium From owner-idn-reg-policy Fri Apr 4 14:09:20 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h34M9JJM006038 for ; Fri, 4 Apr 2003 14:09:19 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h34M9JUp006037 for idn-reg-policy-bks; Fri, 4 Apr 2003 14:09:19 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from [66.125.125.92] (adsl-63-202-92-152.dsl.snfc21.pacbell.net [63.202.92.152]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h34M9DJP006019; Fri, 4 Apr 2003 14:09:14 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: In-Reply-To: <200304042108.h34L8tuc001122@bartok.sidn.nl> References: <200304042108.h34L8tuc001122@bartok.sidn.nl> X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . Date: Fri, 4 Apr 2003 14:03:35 -0800 To: Jaap Akkerhuis , Martin Duerst From: Paul Hoffman / IMC Subject: Re: New Internet Draft on registering IDNs Cc: idn-reg-policy@imc.org Content-Type: text/plain; charset="iso-8859-1" ; format="flowed" Content-Transfer-Encoding: 8bit Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 11:08 PM +0200 4/4/03, Jaap Akkerhuis wrote: >Yes, these are real problems. There is the domain name clueless.de >(and clue.de and probably more such as feuhrer.de). 
If the algorithm >specifies that u-umlaut and ue are in a single bundle it will might >mean that either the clue.de delegation should be cancelled or that >nonody can register clu+umlaut. *Nothing* in this proposal or the JET proposal says anything about cancelling existing registrations. The next draft of this proposal will show that you can register clü just fine, but that registration won't cause conflict in the zone for clue. --Paul Hoffman, Director --Internet Mail Consortium From owner-idn-reg-policy Fri Apr 4 14:27:34 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h34MRYJM006644 for ; Fri, 4 Apr 2003 14:27:34 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h34MRYLU006643 for idn-reg-policy-bks; Fri, 4 Apr 2003 14:27:34 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from nicemice.net (arwen.CS.Berkeley.EDU [128.32.132.165]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h34MRXJM006638 for ; Fri, 4 Apr 2003 14:27:33 -0800 (PST) Received: from amc by nicemice.net with local (Exim 3.35 #1 (Debian)) id 191Zeq-0001I2-00 for ; Fri, 04 Apr 2003 14:27:36 -0800 Date: Fri, 4 Apr 2003 22:27:36 +0000 From: "Adam M. Costello" To: idn-reg-policy@imc.org Subject: Re: New Internet Draft on registering IDNs Message-ID: <20030404222736.GD3202@nicemice.net> Reply-To: IDN registration policy list References: <4.2.0.58.J.20030404133143.05c4e718@localhost> <200304042131.h34LVeuc001195@bartok.sidn.nl> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <200304042131.h34LVeuc001195@bartok.sidn.nl> User-Agent: Mutt/1.4i Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: > how to run zones I think it would be useful to distinguish between registries and private zones. 
A private zone does not accept registrations from the public; the names in a private zone and all its subzones belong to the members/departments/etc of a single organization, or a single person. A registry accepts registrations from the public, so its zone contains names (and delegates subzones) belonging to organizations/individuals unrelated to the registry (except by the registration agreement). Here's an analogy: The regulations/expectations of what is proper in restaurants do not apply to homes. Although both are privately owned, one is considered a public place because the general public is invited in. AMC From owner-idn-reg-policy Fri Apr 4 14:37:21 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h34MbLJM006848 for ; Fri, 4 Apr 2003 14:37:21 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h34MbL4o006847 for idn-reg-policy-bks; Fri, 4 Apr 2003 14:37:21 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from nicemice.net (arwen.CS.Berkeley.EDU [128.32.132.165]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h34MbKJM006843 for ; Fri, 4 Apr 2003 14:37:20 -0800 (PST) Received: from amc by nicemice.net with local (Exim 3.35 #1 (Debian)) id 191ZoJ-0001Jt-00 for ; Fri, 04 Apr 2003 14:37:23 -0800 Date: Fri, 4 Apr 2003 22:37:23 +0000 From: "Adam M. 
Costello" To: idn-reg-policy@imc.org Subject: Re: New Internet Draft on registering IDNs Message-ID: <20030404223723.GE3202@nicemice.net> Reply-To: IDN registration policy list References: <4.2.0.58.J.20030404101948.03cd3e28@localhost> <4.2.0.58.J.20030402171329.0294df10@localhost> <4.2.0.58.J.20030402171329.0294df10@localhost> <4.2.0.58.J.20030404101948.03cd3e28@localhost> <4.2.0.58.J.20030404133734.02e0b788@localhost> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: User-Agent: Mutt/1.4i Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Paul Hoffman / IMC wrote: > In the case above, assume that someone already has duerst (which is > true in .de and .ch). You now register dürst Are you sure you want to allow that? It seems to me that dürst and duerst should not be allowed to coexist in the same zone unless they belong to the same bundle (otherwise we fail in our goal to avoid surprising and confusing users). In this case that would mean that no one may add dürst to the zone except the owner of duerst. That seems fair to me. 
AMC From owner-idn-reg-policy Fri Apr 4 15:25:17 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h34NPHJM008453 for ; Fri, 4 Apr 2003 15:25:17 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h34NPHeM008452 for idn-reg-policy-bks; Fri, 4 Apr 2003 15:25:17 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from nicemice.net (arwen.CS.Berkeley.EDU [128.32.132.165]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h34NPGJM008446 for ; Fri, 4 Apr 2003 15:25:16 -0800 (PST) Received: from amc by nicemice.net with local (Exim 3.35 #1 (Debian)) id 191aYh-0001Qr-00 for ; Fri, 04 Apr 2003 15:25:19 -0800 Date: Fri, 4 Apr 2003 23:25:19 +0000 From: "Adam M. Costello" To: IDN registration policy list Subject: Re: model with overlapping variants Message-ID: <20030404232518.GF3202@nicemice.net> Reply-To: IDN registration policy list References: <20030404081656.GA29886@nicemice.net> <4.2.0.58.J.20030404141327.02dfc8b0@localhost> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4.2.0.58.J.20030404141327.02dfc8b0@localhost> User-Agent: Mutt/1.4i Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Martin Duerst wrote: > The main thing I'm worried about is what happens if X or Y (or both) > are strings rather than characters. There may be some weird cases > where the 'longest prefix' match somehow destroys the symmetry. Could be. If so, we could force the symmetry by defining the relation this way: X and Y are neighbors iff table(X) matches Y or table(Y) matches X, where table() and "matches" are yet-to-be-defined operations that might be asymmetric. > > For the moment I'll call the relation "confusability". > > What about "similarity"? I think I like "neighbors" better. 
"Similarity" sounds like a real-valued function (how similar are X and Y?), while "neighbors" sounds more boolean (are X and Y neighbors?). > > 2a looks difficult to enforce, and might lead to bundles that are > > too large, so let's put that one aside for now. > > If we assume that registrants try to grab as much as possible, then > the bundles could become very large. In fact, if everybody would try > to grab as many entries as possible into a bundle, we would end up > with an equivalence relation. If the registrants performed their grabbing sequentially (each bundle is maximally expanded before the next bundle is created), then we'd end up with an equivalence relation. If the bundles expand concurrently, then the equivalence classes would not form, because labels destined to be in the same class could start off in separate bundles, and bundles never merge. > Using ~ to express confusability, 2c gives us situations where we can > have X~Y and Y~Z, which we could call a confusability chain of length > 3. The question is whether there are longer confusability chains in > practice. Character-chains of length 2 can give rise to string-chains of greater length. For example, suppose we have characters related as follows: A~B B~C X~Y Y~Z Now suppose a zone contains AX, BX, CY, CZ. The shortest path (using labels in the zone) from AX to CZ goes through all four labels. There is a shorter path involving the label BY, but BY might not exist in the zone. > One way to visualize this is that there would be a one layer thick > 'neutral zone' between any two bundles that might be transitively > confusable. Right, any labels that neighbor multiple bundles are unavailable to everyone. (A label neighbors a bundle if it neighbors any label in the bundle.) We need to remember that we're speaking in the context of a model where bundles contain only labels that are in the zone, not any "blocked" labels. 
Maybe I shouldn't have used the term "bundle" for this purpose, since Paul introduced it to include blocked labels. > > When a registrant asks to remove a label from their existing bundle, > > the request is denied if it is the only(2b)/primary(2c) label in the > > bundle. > > The 'only(2b)' case seems irrelevant because of what you say in the > next point. > > > A request by a registrant to remove one of their entire bundles is > > never denied. It is relevant because it means that bundles cannot be empty. A registrant might imagine that they could remove the last label from a bundle, and still have an empty bundle until it expires normally, and before then they could add a label back into the bundle. But I want to assert that this is not allowed, because it could be used to disguise the registration of a completely new & different name as a mere expansion of an existing bundle (the fees for those two operations are likely to be different). > I think it's easy to make matching a regular expression against an > ordered zone file much more efficient than matching that regular > expression against each of the entries in the zone in turn. Thanks for the illustration. I didn't realize that so much could be gained with such a simple data structure (a sorted list). I think there are still pathological cases where the time spent backtracking is exponential, but maybe a more sophisticated data structure could help with that (if needed). 
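The string-chain example earlier in this message (a zone containing AX, BX, CY, CZ) can be checked mechanically. The sketch below assumes, purely for illustration, that two labels are neighbors when they have the same length and every pair of corresponding characters is either equal or related in the table; the real table() and "matches" operations are still undefined, and the function names here are hypothetical.

```python
from collections import deque

# Character relation from the example: A~B, B~C, X~Y, Y~Z (symmetric).
PAIRS = {("A", "B"), ("B", "C"), ("X", "Y"), ("Y", "Z")}


def chars_related(a, b):
    return a == b or (a, b) in PAIRS or (b, a) in PAIRS


def neighbors(x, y):
    """Assumed label relation: position-wise equal or related characters."""
    return (x != y and len(x) == len(y)
            and all(chars_related(a, b) for a, b in zip(x, y)))


def shortest_path(labels, start, goal):
    """Breadth-first search over labels that exist in the zone."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for label in labels:
            if label not in seen and neighbors(path[-1], label):
                seen.add(label)
                queue.append(path + [label])
    return None


print(shortest_path(["AX", "BX", "CY", "CZ"], "AX", "CZ"))
# ['AX', 'BX', 'CY', 'CZ']
print(shortest_path(["AX", "BX", "BY", "CY", "CZ"], "AX", "CZ"))
# ['AX', 'BY', 'CZ']
```

With BY absent, the only route from AX to CZ passes through all four labels; adding BY to the zone shortens it to three.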
AMC From owner-idn-reg-policy Fri Apr 4 15:43:28 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h34NhSJM010044 for ; Fri, 4 Apr 2003 15:43:28 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h34NhS4j010043 for idn-reg-policy-bks; Fri, 4 Apr 2003 15:43:28 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from nicemice.net (arwen.CS.Berkeley.EDU [128.32.132.165]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h34NhRJM010039 for ; Fri, 4 Apr 2003 15:43:27 -0800 (PST) Received: from amc by nicemice.net with local (Exim 3.35 #1 (Debian)) id 191aqI-0001UC-00 for ; Fri, 04 Apr 2003 15:43:30 -0800 Date: Fri, 4 Apr 2003 23:43:30 +0000 From: "Adam M. Costello" To: idn-reg-policy@imc.org Subject: Re: New Internet Draft on registering IDNs Message-ID: <20030404234330.GA5649@nicemice.net> Reply-To: IDN registration policy list References: <4.2.0.58.J.20030404101948.03cd3e28@localhost> <4.2.0.58.J.20030402171329.0294df10@localhost> <4.2.0.58.J.20030402171329.0294df10@localhost> <4.2.0.58.J.20030404101948.03cd3e28@localhost> <4.2.0.58.J.20030404133734.02e0b788@localhost> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4i Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Paul Hoffman / IMC wrote: > Every all-ASCII name in a a zone is inherently a bundle of one. Why just one? When I register a non-ASCII label, I generally get a bundle containing multiple blocked labels (in your model). Why shouldn't the same thing happen when I register an ASCII label? Shouldn't an ASCII label block non-ASCII labels that are too similar to it? 
For existing registrations that predate the introduction of the bundling system, they could each be converted to a bundle, in the order they were originally registered. AMC From owner-idn-reg-policy Fri Apr 4 17:24:32 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h351OVJM014513 for ; Fri, 4 Apr 2003 17:24:31 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h351OVFM014512 for idn-reg-policy-bks; Fri, 4 Apr 2003 17:24:31 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from w1309.hostcentric.net (w1309.hostcentric.net [66.40.78.254]) by above.proper.com (8.12.9/8.11.6) with SMTP id h351OUJM014506 for ; Fri, 4 Apr 2003 17:24:30 -0800 (PST) Received: (qmail 8713 invoked by alias); 5 Apr 2003 01:24:33 -0000 Received: from unknown (HELO DAVIS1) (12.234.231.178) by 0 with SMTP; 5 Apr 2003 01:24:33 -0000 Message-ID: <01ad01c2fb12$0e7ad7f0$7900a8c0@DAVIS1> From: "Mark Davis" To: "IDN registration policy list" References: <20030404081656.GA29886@nicemice.net> <4.2.0.58.J.20030404141327.02dfc8b0@localhost> <20030404232518.GF3202@nicemice.net> Subject: Re: model with overlapping variants Date: Fri, 4 Apr 2003 17:24:08 -0800 MIME-Version: 1.0 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 8bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2800.1106 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1106 Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: > > I think it's easy to make matching a regular expression against an > > ordered zone file much more efficient than matching that regular > > expression against each of the entries in the zone in turn. > > Thanks for the illustration. 
I didn't realize that so much could be > gained with such a simple data structure (a sorted list). I think 1. Any equivalence relation on strings corresponds to a partition of all strings. By far the most efficient matching of strings is where one member of each partition is chosen as the 'skeleton' or representative for that partition. All you need is an efficient mapping function toSkeleton() that takes each member of each partition onto its skeleton. The toSkeleton() function can be any idempotent string function. Once you have that, for each partition, you only need to store the skeleton, *not* all the members of the partition. And to test any probe string, you only need to convert that probe to its skeleton, and do a straight, normal lookup in a SkeletonSet (typically a HashSet of some sort; could also be sorted if needed). Since the skeleton might not be the preferred, most-readable form, you need to store the original string as well. So the SkeletonSet can actually be a HashMap (or SortedMap), one that maps from the skeleton to the preferred form. So you end up storing 2N strings -- you *don't* have to store all the possible strings in each partition. 2. One can generalize the process in #1, if there are multiple equivalence relations and some kind of registration process, as long as the registrations are serial. For example, one could support both German and French equivalence relations. It's a bit more work: You have to keep a set of skeletons for all registrations, for each equivalence relation, e.g. FrenchSkeletonSet GermanSkeletonSet SwedishSkeletonSet ... What you do is when each name N is registered, you have to say which equivalence relation is to be used with it (e.g. German). To make sure that N doesn't collide with a pre-existing registered name, look N up against each of the above sets using *that* set's isSkeleton function. If it ever collides, don't register it. 
If it doesn't collide in any of the sets, add its skeleton to the appropriate Skeletons set (e.g. GermanSkeletonSet). To lookup an incoming string, probe each of the skeleton sets according to the set's isSkeleton function until you find a match. Since these are equivalence relations, the order of lookup in the sets doesn't matter. You end up with M hash lookups per test. Note that each SkeletonSet can have its own preferred forms, so the preferred form for simplified Chinese can be different than the preferred form for traditional, for example. Mark (مرقص بن داود) ________ mark.davis@jtcsv.com IBM, MS 50-2/B11, 5600 Cottle Rd, SJ CA 95193 (408) 256-3148 fax: (408) 256-0799 ----- Original Message ----- From: "Adam M. Costello" To: "IDN registration policy list" Sent: Friday, April 04, 2003 15:25 Subject: Re: model with overlapping variants > > Martin Duerst wrote: > > > The main thing I'm worried about is what happens if X or Y (or both) > > are strings rather than characters. There may be some weird cases > > where the 'longest prefix' match somehow destroys the symmetry. > > Could be. If so, we could force the symmetry by defining the relation > this way: X and Y are neighbors iff table(X) matches Y or table(Y) > matches X, where table() and "matches" are yet-to-be-defined operations > that might be asymmetric. > > > > For the moment I'll call the relation "confusability". > > > > What about "similarity"? > > I think I like "neighbors" better. "Similarity" sounds like a > real-valued function (how similar are X and Y?), while "neighbors" > sounds more boolean (are X and Y are neighbors?). > > > > 2a looks difficult to enforce, and might lead to bundles that are > > > too large, so let's put that one aside for now. > > > > If we assume that registrants try to grab as much as possible, then > > the bundles could become very large. 
In fact, if everybody would try > > to grab as many entries as possible into a bundle, we would end up > > with an equivalence relation. > > If the registrants performed their grabbing sequentially (each bundle is > maximally expanded before the next bundle is created), then we'd end up > with an equivalence relation. If the bundles expand concurrently, then > the equivalence classes would not form, because labels destined to be in > the same class could start off in separate bundles, and bundles never > merge. > > > Using ~ to express confusability, 2c gives us situations where we can > > have X~Y and Y~Z, which we could call a confusability chain of length > > 3. The question is whether there are longer confusability chains in > > practice. > > Character-chains of length 2 can give rise to string-chains of greater > length. For example, suppose we have characters related as follows: > > A~B B~C > X~Y Y~Z > > Now suppose a zone contains AX, BX, CY, CZ. The shortest path (using > labels in the zone) from AX to CZ goes through all four labels. There > is a shorter path involving the label BY, but BY might not exist in the > zone. > > > One way to visualize this is that there would be a one layer thick > > 'neutral zone' between any two bundles that might be transitively > > confusable. > > Right, any labels that neighbor multiple bundles are unavailable to > everyone. (A label neighbors a bundle if it neighbors any label in the > bundle.) > > We need to remember that we're speaking in the context of a model where > bundles contain only labels that are in the zone, not any "blocked" > labels. Maybe I shouldn't have used the term "bundle" for this purpose, > since Paul introduced it to include blocked labels. > > > > When a registrant asks to remove a label from their existing bundle, > > > the request is denied if it is the only(2b)/primary(2c) label in the > > > bundle. > > > > The 'only(2b)' case seems irrelevant because of what you say in the > > next point. 
> > > > > A request by a registrant to remove one of their entire bundles is > > > never denied. > > It is relevant because it means that bundles cannot be empty. A > registrant might imagine that they could remove the last label from > a bundle, and still have an empty bundle until it expires normally, > and before then they could add a label back into the bundle. But I > want to assert that this is not allowed, because it could be used to > disguise the registration of a completely new & different name as a mere > expansion of an existing bundle (the fees for those two operations are > likely to be different). > > > I think it's easy to make matching a regular expression against an > > ordered zone file much more efficient than matching that regular > > expression against each of the entries in the zone in turn. > > Thanks for the illustration. I didn't realize that so much could be > gained with such a simple data structure (a sorted list). I think > there are still pathological cases where the time spent backtracking is > exponential, but maybe a more sophisticated data structure could help > with that (if needed). > > AMC > From owner-idn-reg-policy Fri Apr 4 18:27:40 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h352ReJM018984 for ; Fri, 4 Apr 2003 18:27:40 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h352Rd5u018981 for idn-reg-policy-bks; Fri, 4 Apr 2003 18:27:39 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from nicemice.net (arwen.CS.Berkeley.EDU [128.32.132.165]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h352RcJM018976 for ; Fri, 4 Apr 2003 18:27:39 -0800 (PST) Received: from amc by nicemice.net with local (Exim 3.35 #1 (Debian)) id 191dPC-0001xD-00 for ; Fri, 04 Apr 2003 18:27:42 -0800 Date: Sat, 5 Apr 2003 02:27:42 +0000 From: "Adam M. 
Costello" To: IDN registration policy list Subject: Re: model with overlapping variants Message-ID: <20030405022742.GG3202@nicemice.net> Reply-To: IDN registration policy list References: <20030404081656.GA29886@nicemice.net> <4.2.0.58.J.20030404141327.02dfc8b0@localhost> <20030404232518.GF3202@nicemice.net> <01ad01c2fb12$0e7ad7f0$7900a8c0@DAVIS1> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <01ad01c2fb12$0e7ad7f0$7900a8c0@DAVIS1> User-Agent: Mutt/1.4i Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Mark Davis wrote: > 1. Any equivalence relation on strings corresponds to a partition > of all strings. By far the most efficient matching of strings is > where one member of each partition is chosen as the 'skeleton' or > representative for that partition. All you need is an efficient > mapping function toSkeleton() that takes each member of each partition > onto its skeleton. The toSkeleton() function can be any idempotent > string function. That was my original proposal, but it has been suggested that equivalence relations are not general enough to express the sort of policies that registries want to express, and that an intransitive relation is needed. > 2. One can generalize the process in #1, if there are multiple > equivalence relations and some kind of registration process, as long > as the registrations are serial. For example, one could support both > German and French equivalence relations. It's a bit more work: > > You have to keep a set of skeletons for all registrations, for each > equivalence relation, e.g. > FrenchSkeletonSet > GermanSkeletonSet > SwedishSkeletonSet > ... > > What you do is when each name N is registered, you have to say which > equivalence relation is to be used with it (e.g. German). 
To make > sure that N doesn't collide with a pre-existing registered name, look > N up against each of the above sets using *that* set's isSkeleton > function. If it ever collides, don't register it. If it doesn't > collide in any of the sets, add its skeleton to the appropriate > Skeletons set (e.g. GermanSkeletonSet). > > To lookup an incoming string, probe each of the skeleton sets > according to the set's isSkeleton function until you find a match. > Since these are equivalence relations, the order of lookup in the sets > doesn't matter. I'm not convinced, but maybe I'm not understanding. Suppose toSkeleton1 creates this view of the namespace:

+-------+-------+
|       |       |
|       |       |
|       |       |
|       |       |
|       |       |
|       |       |
|       |       |
+-------+-------+

That is, the entire namespace consists of two equivalence classes. Now suppose toSkeleton2 creates this view of the namespace:

+---------------+
|               |
|               |
|               |
+---------------+
|               |
|               |
|               |
+---------------+

Now suppose the following two names are registered:

+---------------+
|               |
|   1           |
|               |
|               |
|               |
|           2   |
|               |
+---------------+

The digits denote which version of toSkeleton the name was registered under. These two names do not collide with each other, no matter which toSkeleton you ask. Now let's try to do a lookup on this name:

+---------------+
|               |
|               |
|               |
|               |
|               |
|   *           |
|               |
+---------------+

toSkeleton1 will say it matches name 1, and toSkeleton2 will say it matches name 2. 
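The counterexample can be reproduced with a toy model of the multi-set scheme. Everything in this sketch is an assumption made for illustration: names are two-character strings whose first character is the left/right half (L or R) and whose second is the top/bottom half (T or B), and the `Registry` class and its methods are hypothetical stand-ins for the per-relation skeleton sets.

```python
def to_skeleton1(name):
    """Relation 1: names are equivalent iff in the same left/right half."""
    return name[0]


def to_skeleton2(name):
    """Relation 2: names are equivalent iff in the same top/bottom half."""
    return name[1]


class Registry:
    def __init__(self, relations):
        self.relations = relations  # relation id -> skeleton function
        # One map per relation: skeleton -> preferred (registered) form.
        self.sets = {rel: {} for rel in relations}

    def lookup(self, name):
        """Probe every skeleton set with that set's own skeleton function."""
        hits = []
        for rel, to_skeleton in self.relations.items():
            hit = self.sets[rel].get(to_skeleton(name))
            if hit is not None:
                hits.append(hit)
        return hits

    def register(self, name, rel):
        """Register `name` under relation `rel` unless any set collides."""
        if self.lookup(name):
            return False
        self.sets[rel][self.relations[rel](name)] = name
        return True


reg = Registry({1: to_skeleton1, 2: to_skeleton2})
print(reg.register("LT", 1))  # True: name 1, registered under relation 1
print(reg.register("RB", 2))  # True: name 2 collides with nothing
print(reg.lookup("LB"))       # ['LT', 'RB']: two different matches
```

Both registrations pass the collision check, yet the probe matches a different name in each set, so the result of a stop-at-first-match lookup depends on the order in which the sets are probed.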
AMC From owner-idn-reg-policy Sat Apr 5 05:02:43 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h35D2hJM007741 for ; Sat, 5 Apr 2003 05:02:43 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h35D2hkc007740 for idn-reg-policy-bks; Sat, 5 Apr 2003 05:02:43 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from sina.sharif.edu (sina.sharif.ac.ir [194.225.40.9]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h35D2YJM007716; Sat, 5 Apr 2003 05:02:37 -0800 (PST) Received: from bamdad.org (IDENT:root@bamdad.org [81.31.160.190]) by sina.sharif.edu (8.11.6/8.11.6) with ESMTP id h35D2PF08501; Sat, 5 Apr 2003 17:32:25 +0430 Received: from localhost (roozbeh@localhost) by bamdad.org (8.11.6/8.11.6) with ESMTP id h35DEdJ23711; Sat, 5 Apr 2003 17:44:39 +0430 X-Authentication-Warning: gilas.bamdad.org: roozbeh owned process doing -bs Date: Sat, 5 Apr 2003 17:44:39 +0430 (IRST) From: Roozbeh Pournader X-X-Sender: roozbeh@gilas.bamdad.org To: Paul Hoffman / IMC cc: Martin Duerst , IDN registration policy list Subject: Re: initial thoughts In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Virus-Scanned: by amavisd-milter (http://amavis.org/) Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: On Thu, 3 Apr 2003, Paul Hoffman / IMC wrote: > >I think first we need clarification from you about whether > >you intended, in your approach, that (variants in) bundles > >can overlap. There is no indication in your draft that they > >can't, but on the other hand, there is no indication that you > >were aware of the fact that they could. > > I will add text saying that they cannot overlap, and add steps in the > process to check for that. Well, I object. 
Sometimes these overlaps are unavoidable (examples already posted). I really prefer a protocol/system that allows overlaps but states how to handle them. roozbeh From owner-idn-reg-policy Sat Apr 5 05:58:44 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h35DwiJM010635 for ; Sat, 5 Apr 2003 05:58:44 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h35Dwi9s010634 for idn-reg-policy-bks; Sat, 5 Apr 2003 05:58:44 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from server1.matic.com (server.iicinternet.com [66.159.16.71] (may be forged)) by above.proper.com (8.12.9/8.11.6) with SMTP id h35DwhJM010621 for ; Sat, 5 Apr 2003 05:58:43 -0800 (PST) Received: (qmail 3967 invoked from network); 5 Apr 2003 13:58:38 -0000 Received: from adsl-65-43-33-245.dsl.lgtpmi.ameritech.net (HELO ?192.168.0.100?) (65.43.33.245) by server.iicinternet.com with SMTP; 5 Apr 2003 13:58:38 -0000 Mime-Version: 1.0 X-Sender: tedd@sperling.com (Unverified) Message-Id: In-Reply-To: <20030404213527.GC3202@nicemice.net> References: <4.2.0.58.J.20030404133734.02e0b788@localhost> <200304042108.h34L8tuc001122@bartok.sidn.nl> <20030404213527.GC3202@nicemice.net> Date: Sat, 5 Apr 2003 08:58:04 -0500 To: IDN registration policy list From: tedd Subject: Re: New Internet Draft on registering IDNs Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: >Similarly, fuhrer.de might be unavailable to everyone except the >owner of fuehrer.de. > >That seems like the right policy to me. > >I think revoking existing registrations is a non-starter; the >registrants would be furious, and rightly so. > >AMC AMC: So then, under this policy the owner of fuehrer.de won't have to do anything? 
He doesn't have to take any steps to protect his original name? He doesn't even have to register fuhrer.de? He can just sit there, do nothing and let your policy protect him -- nice for him.

What about other TLDs? Does he not have to worry about them as well?

This proposed policy doesn't sound right to me and I am sure that registrars won't like missing the revenue generated by customers protecting their name.

As you well know, many organizations/companies register different names for protection. For example, my wife's business, namely Earth Stones, has registered earthstones.com as well as earth-stones.com and earthstones under five other TLDs. She did this because protective registration makes sense and it's cheaper than court.

If the end result of this policy making is that the original owner of a DN doesn't have to worry about like-registrations, then registrars will lose money. And if this policy extends to other TLDs, then it will revoke existing registrations from those who arrived late to register a name already registered under a different TLD. I don't think this will work.
tedd -- http://sperling.com/ From owner-idn-reg-policy Sat Apr 5 07:04:13 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h35F4CJM016175 for ; Sat, 5 Apr 2003 07:04:12 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h35F4CqC016173 for idn-reg-policy-bks; Sat, 5 Apr 2003 07:04:12 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from sina.sharif.edu (sina.sharif.ac.ir [194.225.40.9]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h35F45JM016154 for ; Sat, 5 Apr 2003 07:04:07 -0800 (PST) Received: from bamdad.org (IDENT:root@bamdad.org [81.31.160.190]) by sina.sharif.edu (8.11.6/8.11.6) with ESMTP id h35EtiF19795 for ; Sat, 5 Apr 2003 19:25:44 +0430 Received: from localhost (roozbeh@localhost) by bamdad.org (8.11.6/8.11.6) with ESMTP id h35F83424950 for ; Sat, 5 Apr 2003 19:38:03 +0430 X-Authentication-Warning: gilas.bamdad.org: roozbeh owned process doing -bs Date: Sat, 5 Apr 2003 19:38:03 +0430 (IRST) From: Roozbeh Pournader X-X-Sender: roozbeh@gilas.bamdad.org To: IDN registration policy list Subject: Re: Equivalence only in one direction In-Reply-To: <20030404024039.GE24059@nicemice.net> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Virus-Scanned: by amavisd-milter (http://amavis.org/) Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: On Fri, 4 Apr 2003, Adam M. Costello wrote: > Let's call these three labels X, Y, and V, where V is the common > variant of X and Y, and X and Y are not variants of each other. Which of the following tables do you mean? X|V Y|V V|X;Y or X|V Y|V V ? > > 1. Required: Always resolving. Example: different digit forms > > (U+0030->U+0660). > > > > 2. Variant: Optionally resolving. The current variant. > > > > 3. Blocking: Reserved forever. 
Example: Arabic vs Persian Yehs > > (U+0649->U+06CC).

> I think I'm starting to understand your vision. Let me try to explain
> it back to you to check. The name(s) that the registrant requests
> is (are) the center(s) of three nested regions of the namespace. The
> innermost region contains all the labels that are automatically active
> and delegated to the registrant.

Correct.

> The next larger region contains all the labels that the registrant
> controls (including both the automatically active labels and the labels
> that the registrant may activate at any time with no further collision
> checking).

Correct.

> The largest region contains all the names that cannot be delegated to
> any other registrant (some of them cannot be delegated to anyone at
> all).

Correct. (But I will emphasize the part in the parentheses.)

> Presumably it would be okay for the outermost region of one registrant
> to overlap the outermost region of another registrant.

Not only would it be okay, it would be unavoidable because of language requirements. (Well, actually these are Unicode requirements; it is the Unicode model that makes this unavoidable. You can usually design a character set for your zone that doesn't have some of these requirements.)

> But it would not be okay for the outermost region of one registrant to
> overlap the middle region of another registrant.

Correct.

> I have not begun to think about how these regions might be formally
> defined.

I suggest using three levels of variants (U+AAAA|U+BBBB|U+CCCC|U+DDDD), with some specified restrictions (TBD) on the data to make sure certain overlaps don't happen. 'jseng-idn-admin' already specifies two levels.
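The overlap rules agreed on above can be written down as a small check. This is a sketch only, under the assumption that each registration is described by three explicit sets of labels; the `Registration` class, the `compatible` function and the labels "x", "y" and "v" are hypothetical and do not come from any draft.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Registration:
    owner: str
    active: frozenset      # innermost: automatically active and delegated
    controlled: frozenset  # middle: active plus labels the owner may activate
    blocked: frozenset     # outermost: labels no other registrant may get

    def __post_init__(self):
        # The three regions are nested, innermost to outermost.
        if not (self.active <= self.controlled <= self.blocked):
            raise ValueError("regions must be nested")

def compatible(a, b):
    # Outermost regions may overlap each other, but neither outermost
    # region may touch the other registrant's middle region.
    return not (a.blocked & b.controlled) and not (b.blocked & a.controlled)

a = Registration("A", frozenset({"x"}), frozenset({"x"}), frozenset({"x", "v"}))
b = Registration("B", frozenset({"y"}), frozenset({"y"}), frozenset({"y", "v"}))
c = Registration("C", frozenset({"v"}), frozenset({"v"}), frozenset({"v"}))

print(compatible(a, b))  # the outermost regions share only "v"
print(compatible(a, c))  # A's outermost region reaches C's middle region
```

Here A and B can coexist because they overlap only in their outermost regions, while C cannot coexist with A because "v" is blocked by A yet controlled by C.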
roozbeh From owner-idn-reg-policy Sat Apr 5 07:14:32 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h35FEVJM016867 for ; Sat, 5 Apr 2003 07:14:31 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h35FEVtE016866 for idn-reg-policy-bks; Sat, 5 Apr 2003 07:14:31 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from server1.matic.com (server.iicinternet.com [66.159.16.71] (may be forged)) by above.proper.com (8.12.9/8.11.6) with SMTP id h35FEUJM016850 for ; Sat, 5 Apr 2003 07:14:30 -0800 (PST) Received: (qmail 6639 invoked from network); 5 Apr 2003 15:14:26 -0000 Received: from adsl-65-43-33-245.dsl.lgtpmi.ameritech.net (HELO ?192.168.0.100?) (65.43.33.245) by server.iicinternet.com with SMTP; 5 Apr 2003 15:14:26 -0000 Mime-Version: 1.0 X-Sender: tedd@sperling.com (Unverified) Message-Id: In-Reply-To: <20030404201319.GA3202@nicemice.net> References: <20030404201319.GA3202@nicemice.net> Date: Sat, 5 Apr 2003 10:13:51 -0500 To: IDN registration policy list From: tedd Subject: Re: confusability Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: >tedd wrote: > > > > For the moment I'll call the relation "confusability". Given any >> > two labels (in no particular order), they are either confusable or >> > not, and it is possible to compute that boolean value. >> >> From an earlier post, someone talked about IBM.com vs 1BM.com -- which >> should have been ibm.com vs 1bm.com, but none the less this type of >> similar-looking-glyph use can be confusing. It can be even more >> confusing if one uses a Greek small letter iota with tonos (U03AF) to >> produce an ibm.com. Is this the type of confusion you are talking >> about? > >Could be. 
A registry would define its confusability relation as it
>sees fit. It doesn't want to define confusability so narrowly that
>not enough things are considered confusable, because then it would be
>swamped by disputes about name ownership. But it doesn't want to define
>confusability so broadly that it drastically curtails the number of
>registrations (and hence revenue).
>
>Maybe "confusable" is not the best term. Maybe "neighboring" would be
>better. It's got some of the right intuition: If you are my neighbor,
>then I am your neighbor (symmetry), but my neighbor's neighbor is
>not necessarily my neighbor (intransitivity). You can speak of the
>neighborhood centered around a particular label. Neighborhoods centered
>around different labels can partially overlap. A bundle would be either
>a set of labels that are all neighbors of each other, or a subset of the
>neighborhood centered around the bundle's primary label, depending on
>which version of property 2 we use. Property 1 says that neighboring
>labels in a zone must not belong to distinct bundles.
>
>I just noticed that I forgot to state an assumption, which we can call
>property 0: Every label in a zone belongs to exactly one bundle.
>
>AMC

AMC:

I understand -- but I cannot see how the "confusability" avoidance issue can be applied to the entire Unicode database.

It appears to me (perhaps I'm wrong) that this group is trying to predict and solve all possible problems that may arise from IDN registrations because of look-alike possibilities within the Unicode database.

I don't know the actual number of additional characters added thus far, but the upward limit is 65,535. So, as I see it, you will have some 65,000 different possibilities of character confusion at a single character domain level (i.e., a.com). Now, move to two characters (aa.com) and the figure becomes much larger -- something on the order of 65000 x 65000.

Now, what's the upper limit to the number of characters allowed in a domain name and what's its factorial? Do you honestly believe that you can solve this confusability problem for all possible combinations -- even if your interpretation is the correct one for each situation? Be reasonable, you're approaching a number that rivals the US national debt. Plus, no offense, you're making decisions about glyphs in other languages that are not your own.

I think this group has made some significant progress in that some characters have already been mapped to others -- such as all occurrences of glyphs looking like "A" having been mapped to "a" and so on. But you have done that primarily because you are familiar with the Latin character set and its use.

Now, to map all occurrences of everything that looks similar to one character may do more harm than good in ways not apparent to you presently. Plus, considering the sheer number of combinations and the thoughtful consideration required for each one -- I don't think this group has enough time or resources to accomplish the task.

It might be best, for all concerned, to let the market and courts work it out.
tedd -- http://sperling.com/ From owner-idn-reg-policy Sat Apr 5 08:35:40 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h35GZeJM021318 for ; Sat, 5 Apr 2003 08:35:40 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h35GZecq021315 for idn-reg-policy-bks; Sat, 5 Apr 2003 08:35:40 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from [66.125.125.92] (adsl-63-202-92-152.dsl.snfc21.pacbell.net [63.202.92.152]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h35GZcJN021309 for ; Sat, 5 Apr 2003 08:35:38 -0800 (PST) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: In-Reply-To: <20030404234330.GA5649@nicemice.net> References: <4.2.0.58.J.20030404101948.03cd3e28@localhost> <4.2.0.58.J.20030402171329.0294df10@localhost> <4.2.0.58.J.20030402171329.0294df10@localhost> <4.2.0.58.J.20030404101948.03cd3e28@localhost> <4.2.0.58.J.20030404133734.02e0b788@localhost> <20030404234330.GA5649@nicemice.net> X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . Date: Fri, 4 Apr 2003 16:59:28 -0800 To: IDN registration policy list From: Paul Hoffman / IMC Subject: Re: New Internet Draft on registering IDNs Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 11:43 PM +0000 4/4/03, Adam M. 
Costello wrote: >Paul Hoffman / IMC wrote: > >> Every all-ASCII name in a a zone is inherently a bundle of one. > >Why just one? Because there are no variant tables yet. > When I register a non-ASCII label, I generally get >a bundle containing multiple blocked labels (in your model). Why >shouldn't the same thing happen when I register an ASCII label? It could. >Shouldn't an ASCII label block non-ASCII labels that are too similar to >it? That is up to the registry. >For existing registrations that predate the introduction of the bundling >system, they could each be converted to a bundle, in the order they were >originally registered. You are assuming that zones keep registration date information; that is a bad assumption. --Paul Hoffman, Director --Internet Mail Consortium From owner-idn-reg-policy Sat Apr 5 08:37:37 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h35GbbJM021347 for ; Sat, 5 Apr 2003 08:37:37 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h35Gbatx021346 for idn-reg-policy-bks; Sat, 5 Apr 2003 08:37:36 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from hosting.altserver.com (hosting.altserver.com [209.124.80.2]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h35GbZJM021342 for ; Sat, 5 Apr 2003 08:37:36 -0800 (PST) Received: from f07a-6-160.d1.club-internet.fr ([212.194.149.160] helo=mine.jefsey.com) by hosting.altserver.com with esmtp (Exim 3.36 #1) id 191qff-0002Ii-00 for idn-reg-policy@imc.org; Sat, 05 Apr 2003 08:37:35 -0800 Message-Id: <5.2.0.9.0.20030405142129.00a1a510@mail.jefsey.com> X-Sender: jefsey+jefsey.com@mail.jefsey.com X-Mailer: QUALCOMM Windows Eudora Version 5.2.0.9 Date: Sat, 05 Apr 2003 14:25:33 +0200 To: IDN registration policy list From: "JFC (Jefsey) Morfin" Subject: Re: New Internet Draft on registering IDNs In-Reply-To: 
<20030404222736.GD3202@nicemice.net> References: <200304042131.h34LVeuc001195@bartok.sidn.nl> <4.2.0.58.J.20030404133143.05c4e718@localhost> <200304042131.h34LVeuc001195@bartok.sidn.nl> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed X-AntiAbuse: This header was added to track abuse, please include it with any abuse report X-AntiAbuse: Primary Hostname - hosting.altserver.com X-AntiAbuse: Original Domain - imc.org X-AntiAbuse: Originator/Caller UID/GID - [0 0] / [0 0] X-AntiAbuse: Sender Address Domain - jefsey.com Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: On 00:27 05/04/03, Adam M. Costello said: >Here's an analogy: The regulations/expectations of what is proper in >restaurants do not apply to homes. Although both are privately owned, >one is considered a public place because the general public is invited in. If you welcome third parties in your home - let say for a non-profit organization - it will. The IDN situation we may very quickly wind up with non-profit gov sponsored new.net like 4/3LDs supporting ML.ML if appropriate solutions are not found with IDNA S/TLD. We all know that. 
jfc From owner-idn-reg-policy Sat Apr 5 14:54:24 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h35MsOJM009236 for ; Sat, 5 Apr 2003 14:54:24 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h35MsOOo009235 for idn-reg-policy-bks; Sat, 5 Apr 2003 14:54:24 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from nicemice.net (arwen.CS.Berkeley.EDU [128.32.132.165]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h35MsNJM009231 for ; Sat, 5 Apr 2003 14:54:23 -0800 (PST) Received: from amc by nicemice.net with local (Exim 3.35 #1 (Debian)) id 191wYM-0004Ho-00 for ; Sat, 05 Apr 2003 14:54:26 -0800 Date: Sat, 5 Apr 2003 22:54:26 +0000 From: "Adam M. Costello" To: IDN registration policy list Subject: Re: New Internet Draft on registering IDNs Message-ID: <20030405225426.GB15271@nicemice.net> Reply-To: IDN registration policy list References: <20030404201319.GA3202@nicemice.net> <4.2.0.58.J.20030404133734.02e0b788@localhost> <200304042108.h34L8tuc001122@bartok.sidn.nl> <20030404213527.GC3202@nicemice.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4i Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID:

tedd wrote:

> > Similarly, fuhrer.de might be unavailable to everyone except
> > the owner of fuehrer.de.
>
> So then, under this policy the owner of fuehrer.de won't have to do
> anything? He doesn't have to take any steps to protect his original
> name. He doesn't even have to register fuhrer.de? He can
> just sit there, do nothing and let your policy protect him -- nice for
> him.

Yes, nice for him, and nice for everyone in the internet community who doesn't have to be surprised by fuehrer.de and fuhrer.de belonging to unrelated organizations.
> What about other TLD's? Does he not have to worry about them as well? The whole purpose of a hierarchical domain name system is so that the exact same label foo can exist independently under multiple parent domains. It has always been understood that foo.org and foo.com and foo.de are three independent names that don't necessarily have any connection. No one is proposing to change that. > This proposed policy doesn't sound right to me and I am sure that > registrars won't like missing the revenue generated by customers > protecting their name. If registry revenue is the only metric, then the entire topic of this mailing list is completely pointless. No registration should block any other registration, there should be no bundles, and registrants should have to register every individual name that they don't want someone else to have. However, many people are concerned that the whole naming system will lose its value if human beings cannot recognize when two names are the same and when they are different. So registries will try to find a balance, designing tables that bundle enough names together so that names continue to be recognizable and useful to the general public, without bundling so many names together that registrants would get away with paying a single fee for what are obviously (even to humans) multiple names. > If the end result of this policy making makes it such that the > original owner of a DN doesn't have to worry about like-registrations, > then registrars will lose money. This mailing list is designing an architecture, not the actual tables. The tables will be designed by the registries themselves, who will presumably look out for their own interests. 
I have no position on whether u and ue ought to be considered neighbors in .de; my point was that *if* the registration of fuhrer.de blocks the registration of fuehrer.de, then the registration of fuehrer.de ought to block the registration of fuhrer.de, and therefore if fuehrer.de is already registered when the bundling system is introduced, it should be grandfathered in and expanded to whatever bundle it would have been if the bundling system had existed when it was originally registered. > It appears to me (perhaps I'm wrong) that this group is trying to > predict and solve all possible problems that may arise from IDN > registrations because of look-alike possibilities within the Unicode > database. No, this group is not trying to predict them all. It is trying to devise an architecture that can be extended to deal with the issues incrementally as they are discovered. The first tables to be developed (not by this group, but by experts in the relevant language) will prohibit all Unicode characters except those essential to a particular language, and will contain carefully crafted mappings for those characters. Then some tables will be developed that combine two languages, with special care taken for the characters used by both. Gradually, as more experience and expertise is developed, more inclusive tables will be created. Each registry will decide what table to use for each of its zones. > I don't know the actual number of additional characters added thus > far, but the upward limit is 65,535. It's actually about a million. > So, as I see it, you will have some 65,000 different possibilities > of character confusion at a single character domain level (i.e., > a.com). Now, move to two characters (aa.com) and figure becomes much > larger -- something in the order of 65000 x 65000 range. > > Now, what's the upper limit to the number of characters allowed in a > domain name and what's it's factorial? That's irrelevant. 
If I'm told that ue and u are confusable, I don't need to be told separately that xue and xu are confusable, and that foouebar and fooubar are confusable.

> you're making decisions about glyphs in other languages that are not
> your own.

This group is certainly not doing that. It's not designing the tables.

> I think this group has made some significant progress in that
> some characters have been already mapped to others -- such as all
> occurrences of glyphs looking like "A" and have been mapped to "a" and
> so on.

What are you referring to? I don't know of anyone who has done that.

> considering the sheer number of combinations and thoughtful
> considerations required for each one -- I don't think this group has
> enough time nor resources to accomplish the task.

Right, which is why this group is not even going to try. At most, it's going to develop an architecture into which tables can be plugged, and other groups with the needed expertise will devise the tables.

AMC

From owner-idn-reg-policy Sat Apr 5 15:02:47 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h35N2lJM009409 for ; Sat, 5 Apr 2003 15:02:47 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h35N2lhq009408 for idn-reg-policy-bks; Sat, 5 Apr 2003 15:02:47 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from w1309.hostcentric.net (w1309.hostcentric.net [66.40.78.254]) by above.proper.com (8.12.9/8.11.6) with SMTP id h35N2jJM009401 for ; Sat, 5 Apr 2003 15:02:46 -0800 (PST) Received: (qmail 28332 invoked by alias); 5 Apr 2003 23:02:47 -0000 Received: from unknown (HELO DAVIS1) (12.234.231.178) by 0 with SMTP; 5 Apr 2003 23:02:47 -0000 Message-ID: <002701c2fbc7$6c097ce0$7900a8c0@DAVIS1> From: "Mark Davis" To: "IDN registration policy list" , "tedd" References: <20030404201319.GA3202@nicemice.net> Subject: Re:
confusability Date: Sat, 5 Apr 2003 15:02:24 -0800 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2800.1106 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1106 Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: > I don't know the actual number of additional characters added thus > far, but the upward limit is 65,535. So, as I see it, you will have A small correction: there are currently over 95,000 characters in Unicode 3.2; in Unicode 4.0 (very soon to be released) there will be an additional thousand-odd characters. In addition, there are 131,068 possible private use characters, and there are 871,758 reserved positions still available for future characters. Mark ________ mark.davis@jtcsv.com IBM, MS 50-2/B11, 5600 Cottle Rd, SJ CA 95193 (408) 256-3148 fax: (408) 256-0799 ----- Original Message ----- From: "tedd" To: "IDN registration policy list" Sent: Saturday, April 05, 2003 07:13 Subject: Re: confusability > > >tedd wrote: > > > > > > For the moment I'll call the relation "confusability". Given any > >> > two labels (in no particular order), they are either confusable or > >> > not, and it is possible to compute that boolean value. > >> > >> From an earlier post, someone talked about IBM.com vs 1BM.com -- which > >> should have been ibm.com vs 1bm.com, but none the less this type of > >> similar-looking-glyph use can be confusing. It can be even more > >> confusing if one uses a Greek small letter iota with tonos (U03AF) to > >> produce an ibm.com. Is this the type of confusion you are talking > >> about? > > > >Could be. A registry would define its confusability relation as it > >sees fit. It doesn't want to define confusability so narrowly that > >not enough things are considered confusable, because then it would be > >swamped by disputes about name ownership. 
But it doesn't want to define > >confusability so broadly that it drastically curtails the number of > >registrations (and hence revenue). > > > >Maybe "confusable" is not the best term. Maybe "neighboring" would be > >better. It's got some of the right intuition: If you are my neighbor, > >then I am your neighbor (symmetry), but my neighbor's neighbor is > >not necessarily my neighbor (intransitivity). You can speak of the > >neighborhood centered around a particular label. Neighborhoods centered > >around different labels can partially overlap. A bundle would be either > >a set of labels that are all neighbors of each other, or a subset of the > >neighborhood centered around the bundle's primary label, depending on > >which version of property 2 we use. Property 1 says that neighboring > >labels in a zone must not belong to distinct bundles. > > > >I just noticed that I forgot to state an assumption, which we can call > >property 0: Every label in a zone belongs to exactly one bundle. > > > >AMC > > AMC: > > I understand -- but, I cannot see how the "confusability" avoidance > issue can be implemented to the entire Unicode database. > > It appears to me (perhaps I'm wrong) that this group is trying to > predict and solve all possible problems that may arise from IDN > registrations because of look-alike possibilities within the Unicode > database. > > I don't know the actual number of additional characters added thus > far, but the upward limit is 65,535. So, as I see it, you will have > some 65,000 different possibilities of character confusion at a > single character domain level (i.e., a.com). Now, move to two > characters (aa.com) and figure becomes much larger -- something in > the order of 65000 x 65000 range. > > Now, what's the upper limit to the number of characters allowed in a > domain name and what's it's factorial? 
Do you honestly believe that > you can solve this confusability problem for all possible > combinations -- even if your interpretation is the correct one for > each situation? Be reasonable, you're approaching a number that > rivals the US national debt. Plus, no offense, you're making > decisions about glyphs in other languages that are not you're own. > > I think this group has made some significant progress in that some > characters have been already mapped to others -- such as all > occurrences of glyphs looking like "A" and have been mapped to "a" > and so on. But,you have done that primarily because you are familiar > with the Latin character set and it's use. > > Now, to map all occurrences of everything that looks similar to one > character may do more harm than good in ways not apparent to you > presently. Plus, considering the shear number of combinations and > thoughtful considerations required for each one -- I don't think this > group has enough time nor resources to accomplish the task. > > It might be best, for all concerned, to let the market and courts work it out. > > tedd > -- > http://sperling.com/ > From owner-idn-reg-policy Sat Apr 5 15:16:47 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h35NGlJM009626 for ; Sat, 5 Apr 2003 15:16:47 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h35NGk06009625 for idn-reg-policy-bks; Sat, 5 Apr 2003 15:16:46 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from nicemice.net (arwen.CS.Berkeley.EDU [128.32.132.165]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h35NGkJM009621 for ; Sat, 5 Apr 2003 15:16:46 -0800 (PST) Received: from amc by nicemice.net with local (Exim 3.35 #1 (Debian)) id 191wu1-0004LU-00 for ; Sat, 05 Apr 2003 15:16:49 -0800 Date: Sat, 5 Apr 2003 23:16:49 +0000 From: "Adam M. 
Costello" To: IDN registration policy list Subject: Re: Equivalence only in one direction Message-ID: <20030405231648.GC15271@nicemice.net> Reply-To: IDN registration policy list References: <20030404024039.GE24059@nicemice.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4i Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID:

Roozbeh Pournader wrote:

> > Let's call these three labels X, Y, and V, where V is the common
> > variant of X and Y, and X and Y are not variants of each other.
>
> Which of the following tables do you mean?
>
> X|V
> Y|V
> V|X;Y
>
> or
>
> X|V
> Y|V
> V
>
> ?

Neither, for two reasons: (1) I haven't started thinking in terms of tables, I'm still thinking in terms of relations, and (2) X, Y, and V are entire labels, so they are unlikely to appear in the table (the table contains single characters and perhaps short strings).

I think you're asking whether I had a symmetric variant relation in mind, or an asymmetric one. I probably had a symmetric relation in mind. The important point was that it was intransitive--that's what made the scenarios interesting (to me).

I think it would make sense to avoid fixating on a particular table format while we're still discussing what kind of relation it ought to generate.
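To make the symmetric-but-intransitive point concrete, here is a minimal sketch. It assumes the relation is given directly as a hypothetical set of unordered pairs over the three labels X, Y and V from the quoted scenario; it says nothing about how a real table would generate such a relation.

```python
# Hypothetical neighbor relation over the three labels in the scenario:
# V is a variant of both X and Y, but X and Y are not variants of each other.
LABELS = ("X", "Y", "V")
PAIRS = {frozenset(("X", "V")), frozenset(("Y", "V"))}

def neighbors(a, b):
    # Every label neighbors itself; otherwise consult the unordered pairs.
    return a == b or frozenset((a, b)) in PAIRS

symmetric = all(
    neighbors(a, b) == neighbors(b, a)
    for a in LABELS for b in LABELS
)

transitive = all(
    neighbors(a, c)
    for a in LABELS for b in LABELS for c in LABELS
    if neighbors(a, b) and neighbors(b, c)
)

print(symmetric, transitive)
```

The relation comes out symmetric but not transitive (X neighbors V and V neighbors Y, yet X does not neighbor Y), so it is not an equivalence relation and cannot be produced by a single skeleton function.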
AMC From owner-idn-reg-policy Sat Apr 5 15:26:03 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h35NQ3JM009836 for ; Sat, 5 Apr 2003 15:26:03 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h35NQ3I5009835 for idn-reg-policy-bks; Sat, 5 Apr 2003 15:26:03 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from nicemice.net (arwen.CS.Berkeley.EDU [128.32.132.165]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h35NQ2JM009831 for ; Sat, 5 Apr 2003 15:26:02 -0800 (PST) Received: from amc by nicemice.net with local (Exim 3.35 #1 (Debian)) id 191x2z-0004NR-00 for ; Sat, 05 Apr 2003 15:26:05 -0800 Date: Sat, 5 Apr 2003 23:26:05 +0000 From: "Adam M. Costello" To: IDN registration policy list Subject: Re: New Internet Draft on registering IDNs Message-ID: <20030405232605.GD15271@nicemice.net> Reply-To: IDN registration policy list References: <4.2.0.58.J.20030404101948.03cd3e28@localhost> <4.2.0.58.J.20030402171329.0294df10@localhost> <4.2.0.58.J.20030402171329.0294df10@localhost> <4.2.0.58.J.20030404101948.03cd3e28@localhost> <4.2.0.58.J.20030404133734.02e0b788@localhost> <20030404234330.GA5649@nicemice.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4i Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Paul Hoffman / IMC wrote: > > For existing registrations that predate the introduction of the > > bundling system, they could each be converted to a bundle, in the > > order they were originally registered. > > You are assuming that zones keep registration date information; that > is a bad assumption. Indeed, I was making that assumption. 
I am astonished that a registry, which needs to remember a slew of information about the registration (everything that goes in the whois database), would not include the original registration date among that info, if for no other purpose than to serve as evidence in disputes.

But anyway, if a registry really has no record of the original registration date of the names, I suppose a fair method would be to expand them into bundles in random order. Then at least every existing registrant would have priority over all future registrants for the variants of its own name, but ties among the existing registrants would be broken at random.

By the way, this issue is moot in the model I recently proposed, because in that model bundles do not contain blocked names, only in-the-zone names; therefore each existing ASCII registration would indeed become a bundle of one.

AMC

From owner-idn-reg-policy Sat Apr 5 15:22:09 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h35NM9JM009772 for ; Sat, 5 Apr 2003 15:22:09 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h35NM9lC009771 for idn-reg-policy-bks; Sat, 5 Apr 2003 15:22:09 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from w1309.hostcentric.net (w1309.hostcentric.net [66.40.78.254]) by above.proper.com (8.12.9/8.11.6) with SMTP id h35NM8JM009767 for ; Sat, 5 Apr 2003 15:22:08 -0800 (PST) Received: (qmail 30370 invoked by alias); 5 Apr 2003 23:22:09 -0000 Received: from unknown (HELO DAVIS1) (12.234.231.178) by 0 with SMTP; 5 Apr 2003 23:22:09 -0000 Message-ID: <004201c2fbca$20b89b10$7900a8c0@DAVIS1> From: "Mark Davis" To: "IDN registration policy list" References: <20030404081656.GA29886@nicemice.net> <4.2.0.58.J.20030404141327.02dfc8b0@localhost> <20030404232518.GF3202@nicemice.net> <01ad01c2fb12$0e7ad7f0$7900a8c0@DAVIS1>
<20030405022742.GG3202@nicemice.net> Subject: Re: model with overlapping variants Date: Sat, 5 Apr 2003 15:21:46 -0800 MIME-Version: 1.0 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 8bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2800.1106 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1106 Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID:

Adam,

1. I have not been following this discussion in detail, but using a non-transitive relation is really pretty ugly. It makes the implementation significantly more difficult, and I believe will make the matching much harder to understand for end-users. Can you point me to the use-cases that people felt were problems with this?

2. For the generalized case, I didn't explain it very well (going off the top of my head). I'll try to take some time to come up with a clear account, with examples.

Mark (مرقص بن داود)
________
mark.davis@jtcsv.com
IBM, MS 50-2/B11, 5600 Cottle Rd, SJ CA 95193
(408) 256-3148
fax: (408) 256-0799

----- Original Message -----
From: "Adam M. Costello"
To: "IDN registration policy list"
Sent: Friday, April 04, 2003 18:27
Subject: Re: model with overlapping variants

>
> Mark Davis wrote:
>
> > 1. Any equivalence relation on strings corresponds to a partition
> > of all strings. By far the most efficient matching of strings is
> > where one member of each partition is chosen as the 'skeleton' or
> > representative for that partition. All you need is an efficient
> > mapping function toSkeleton() that takes each member of each partition
> > onto its skeleton. The toSkeleton() function can be any idempotent
> > string function.
>
> That was my original proposal, but it has been suggested that
> equivalence relations are not general enough to express the sort of
> policies that registries want to express, and that an intransitive
> relation is needed.
>
> > 2.
One can generalize the process in #1, if there are multiple > > equivalence relations and some kind of registration process, as long > > as the registrations are serial. For example, one could support both > > German and French equivalence relations. It's a bit more work: > > > > You have to keep a set of skeletons for all registrations, for each > > equivalence relation, e.g. > > FrenchSkeletonSet > > GermanSkeletonSet > > SwedishSkeletonSet > > ... > > > > What you do is when each name N is registered, you have to say which > > equivalence relation is to be used with it (e.g. German). To make > > sure that N doesn't collide with a pre-existing registered name, look > > N up against each of the above sets using *that* set's isSkeleton > > function. If it ever collides, don't register it. If it doesn't > > collide in any of the sets, add its skeleton to the appropriate > > Skeletons set (e.g. GermanSkeletonSet). > > > > To lookup an incoming string, probe each of the skeleton sets > > according to the set's isSkeleton function until you find a match. > > Since these are equivalence relations, the order of lookup in the sets > > doesn't matter. > > I'm not convinced, but maybe I'm not understanding. Suppose toSkeleton1 > creates this view of the namespace: > > +-------+-------+ > | | | > | | | > | | | > | | | > | | | > | | | > | | | > +-------+-------+ > > That is, the entire namespace consists of two equivalence classes. Now > suppose toSkeleton2 creates this view of the namespace: > > +---------------+ > | | > | | > | | > +---------------+ > | | > | | > | | > +---------------+ > > Now suppose the following two names are registered: > > +---------------+ > | | > | 1 | > | | > | | > | | > | 2 | > | | > +---------------+ > > The digits denote which version of toSkeleton the name was registered > under. These two names do not collide with each other, no matter which > toSkeleton you ask. 
> > Now let's try to do a lookup on this name: > > +---------------+ > | | > | | > | | > | | > | | > | * | > | | > +---------------+ > > toSkeleton1 will say it matches name 1, and toSkeleton2 will say it > matches name 2. > > AMC > From owner-idn-reg-policy Sat Apr 5 16:26:48 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h360QmJM012786 for ; Sat, 5 Apr 2003 16:26:48 -0800 (PST) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h360QmmS012785 for idn-reg-policy-bks; Sat, 5 Apr 2003 16:26:48 -0800 (PST) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from nicemice.net (arwen.CS.Berkeley.EDU [128.32.132.165]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h360QlJM012779 for ; Sat, 5 Apr 2003 16:26:47 -0800 (PST) Received: from amc by nicemice.net with local (Exim 3.35 #1 (Debian)) id 191xzm-0004WW-00 for ; Sat, 05 Apr 2003 16:26:50 -0800 Date: Sun, 6 Apr 2003 00:26:50 +0000 From: "Adam M. Costello" To: IDN registration policy list Subject: Re: model with overlapping variants Message-ID: <20030406002650.GE15271@nicemice.net> Reply-To: IDN registration policy list References: <20030404081656.GA29886@nicemice.net> <4.2.0.58.J.20030404141327.02dfc8b0@localhost> <20030404232518.GF3202@nicemice.net> <01ad01c2fb12$0e7ad7f0$7900a8c0@DAVIS1> <20030405022742.GG3202@nicemice.net> <004201c2fbca$20b89b10$7900a8c0@DAVIS1> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <004201c2fbca$20b89b10$7900a8c0@DAVIS1> User-Agent: Mutt/1.4i Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Mark Davis wrote: > 1. I have not been following this discussion in detail, but using > a non-transitive relation is really pretty ugly. It makes the > implementation significantly more difficult, Agreed. 
> and I believe will make the matching much harder to understand for
> end-users.

I'm skeptical of that claim. Look at the end-user reactions to the handling of ß in case folding. Users find it reasonable that ß matches SS, and that SS matches ss, but then they're surprised when ß matches ss (and they complain about it). The transitive closure is needed for technical reasons, not because it meets user expectations.

> Can you point me to the use-cases that people felt were problems with
> this?

Roozbeh gave this example:

    U+0649 Arabic letter alef maksura
    U+064A Arabic letter yeh
    U+06CC Arabic letter Farsi yeh

In Arabic, the first two letters are distinct, and the third is not used. In Persian, the first two letters are not used, and the third looks exactly like either of the other two, depending on its position in a word. So if a zone wants to support both Arabic and Persian, it needs to prevent different registrants from having names whose only difference is that one uses the Persian letter and the other uses the identical-looking Arabic letter. But if we require the relation to be transitive, we make the two distinct Arabic letters equivalent, which will surprise users, and possibly upset both users and registries (because names would be blocked for no apparent reason).

I've heard that there are examples involving simplified and traditional Chinese characters that motivate intransitive relations, but I'm not familiar with those.
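The trade-off in Roozbeh's example can be sketched in Python. This is only an illustration under stated assumptions: the variant pairs are a reading of the example above, the three labels are arbitrary strings rather than real words, and the union-find helper is just one way to build a toSkeleton()-style function by taking the transitive closure.

```python
ALEF_MAKSURA = "\u0649"  # Arabic letter alef maksura
YEH = "\u064A"           # Arabic letter yeh
FARSI_YEH = "\u06CC"     # Arabic letter Farsi yeh

# Assumed variant pairs: the Persian letter is a variant of each Arabic
# letter, but the two Arabic letters are not variants of each other.
VARIANT_PAIRS = [(FARSI_YEH, ALEF_MAKSURA), (FARSI_YEH, YEH)]

def chars_related(a, b):
    return a == b or (a, b) in VARIANT_PAIRS or (b, a) in VARIANT_PAIRS

def labels_related(x, y):
    """Intransitive check: compare two labels position by position."""
    return len(x) == len(y) and all(chars_related(a, b) for a, b in zip(x, y))

def make_to_skeleton(pairs):
    """Build a skeleton function from the transitive closure of the pairs."""
    parent = {}

    def find(c):
        parent.setdefault(c, c)
        while parent[c] != c:
            c = parent[c]
        return c

    for a, b in pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)  # lowest code point represents the class
    return lambda label: "".join(find(c) for c in label)

to_skeleton = make_to_skeleton(VARIANT_PAIRS)

PREFIX = "\u0628"  # arbitrary placeholder letter (beh), not a real word
arabic_a = PREFIX + ALEF_MAKSURA
arabic_y = PREFIX + YEH
persian = PREFIX + FARSI_YEH

print(labels_related(arabic_a, persian))               # True
print(labels_related(arabic_y, persian))               # True
print(labels_related(arabic_a, arabic_y))              # False
print(to_skeleton(arabic_a) == to_skeleton(arabic_y))  # True
```

The pairwise check keeps the two Arabic labels distinct, while any skeleton function, being an equivalence relation, has to merge them.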
AMC From owner-idn-reg-policy Sun Apr 6 03:38:40 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h36AcdJM001509 for ; Sun, 6 Apr 2003 03:38:39 -0700 (PDT) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h36AcdBX001508 for idn-reg-policy-bks; Sun, 6 Apr 2003 03:38:39 -0700 (PDT) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from grappa.isoc.org.il (root@grappa.isoc.org.il [132.70.9.72]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h36AcZJM001493 for ; Sun, 6 Apr 2003 03:38:36 -0700 (PDT) Received: from BENNYPC (benny-pc.isoc.org.il [192.114.22.72]) by grappa.isoc.org.il (8.9.3p2/8.9.0) with ESMTP id NAA03286; Sun, 6 Apr 2003 13:38:28 +0300 From: "Benny Lipsicas" To: "'tedd'" , "'IDN registration policy list'" Subject: RE: New Internet Draft on registering IDNs Date: Sun, 6 Apr 2003 13:42:24 +0200 Message-ID: <000001c2fc31$a0c1fce0$481672c0@BENNYPC> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit X-Priority: 3 (Normal) X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook, Build 10.0.3416 Importance: Normal In-Reply-To: X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1106 Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: >> >>>Similarly, fuhrer.de might be unavailable to everyone except the >>>owner of fuehrer.de. >>> >>>That seems like the right policy to me. >>> >>>I think revoking existing registrations is a non-starter; the >>>registrants would be furious, and rightly so. >>> >>>AMC >> >>AMC: >> >>So then, under this policy the owner of fuehrer.de won't have to do >>anything? He doesn't have to take any steps protect his original >>name. He doesn't even have to register fuhrer.de? He can just >>sit there, do nothing and let your policy protect him -- nice for him. 
>> >>What about other TLD's? Does he not have to worry about them as well? >> >>This proposed policy doesn't sound right to me and I am sure that >>registrars won't like missing the revenue generated by customers >>protecting their name. >> >>As you well know, many organizations/companies register different >>names for protection. For example, my wife's business, namely Earth >>Stones, has registered earthstones.com as well as earth-stones.com >>and earthstones under five other TLDs. She did this because >>protective registration makes sense and it's cheaper than court. >> >>If the end result of this policy making makes it such that the >>original owner of a DN doesn't have to worry about >>like-registrations, then registrars will lose money. AND, if this >>policy extends to other TLD's, then this policy will revoke existing >>registrations from those who arrived late to register any name >>registered under a different TLD. >> >>I don't think this will work. >> >>tedd >> >>-- >>http://sperling.com/ What about translations from one language to another? (that's a real life question already presented to me) - will the holder of family.co.il have any guarantee that no one else registers family.co.il? 
From owner-idn-reg-policy Sun Apr 6 03:38:38 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h36AccJM001506 for ; Sun, 6 Apr 2003 03:38:38 -0700 (PDT) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h36Acc6L001505 for idn-reg-policy-bks; Sun, 6 Apr 2003 03:38:38 -0700 (PDT) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from grappa.isoc.org.il (root@grappa.isoc.org.il [132.70.9.72]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h36AcZJM001495; Sun, 6 Apr 2003 03:38:36 -0700 (PDT) Received: from BENNYPC (benny-pc.isoc.org.il [192.114.22.72]) by grappa.isoc.org.il (8.9.3p2/8.9.0) with ESMTP id NAA03289; Sun, 6 Apr 2003 13:38:28 +0300 From: "Benny Lipsicas" To: "'Paul Hoffman / IMC'" , "'Roozbeh Pournader'" , "'Martin Duerst'" Cc: Subject: RE: Comparison of hoffman-idn-reg and jseng-idn-admin Date: Sun, 6 Apr 2003 13:42:24 +0200 Message-ID: <000101c2fc31$a49d1430$481672c0@BENNYPC> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit X-Priority: 3 (Normal) X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook, Build 10.0.3416 Importance: Normal In-Reply-To: X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1106 Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: >> >>At 7:01 PM +0430 4/3/03, Roozbeh Pournader wrote: >>>Can't a registry just have a complicated script that after one asks for a >>>label to register tells: Sorry, you can't register this label, because it >>>"contains the word sex" or "uses FINAL KAF in a non-final position"? >> >>This is an important point. Registries do not have to blindly follow >>any rules, such as the ones in this document or the JET document. >>They can look at the output bundle and make local policy decisions. 
>>
>>For characters that are position-dependent in words, a registry will
>>have to have non-automatic rules. For example, assume that the letter
>>"z" could not be at the end of an English word. "baz.us" should be
>>not allowed, but the .us registry couldn't just check for "is 'z' at
>>the end of the word" because that would allow someone to register
>>"baz123.us". Or assume that "y" can only be at the end of a word. The
>>registry should not have a rule that disallows any name that has "y"
>>in the middle because someone might correctly want to register
>>"floycorp.us".
>>

I agree with Paul. Unless your rules ban *all* registrations of two consecutive words in a label (i.e., without any non-letter separator between them), an automatic block of only the final forms will lead to "discriminatory" blocking of forms that are legitimate (in Paul's example "tellme.us" will go through, but "floycorp.us" will not, unless you disallow these forms and allow only "tell-me" or "tell_me" etc.).

This discussion also seems to assume that there is a requirement in the rules that labels must have *some* meaning.

Making the above rule binding, and applying it automatically, seems too strong a limitation on the shape of the name space. On the other hand, unless the registry has no limitation on the shape of a label, this particular "corner" seems to indicate that human supervision and discretion in these languages are necessary.

Benny.
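To make the point concrete, the two automatic rules from Paul's hypothetical could be sketched in Python as follows. The function names are made up for this sketch, and the letter sets are Paul's invented English examples, not real registry rules.

```python
def naive_final_only_ok(label, final_only):
    """Naive rule: letters in final_only may appear only as the last character."""
    return all(ch not in final_only for ch in label[:-1])

def naive_not_final_ok(label, never_final):
    """Naive rule: letters in never_final may not be the last character."""
    return not label or label[-1] not in never_final

# Suppose "y" may only end a word: the two-word label is rejected even
# though each of its words obeys the rule.
print(naive_final_only_ok("floy", {"y"}))      # True
print(naive_final_only_ok("floycorp", {"y"}))  # False

# Suppose "z" may never end a word: the rule is dodged by appending digits.
print(naive_not_final_ok("baz", {"z"}))        # False
print(naive_not_final_ok("baz123", {"z"}))     # True
```

Both rules look only at positions within the label, so neither can tell where one word ends and the next begins; that is the gap human review would have to cover.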
From owner-idn-reg-policy Sun Apr 6 04:57:08 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h36Bv8JM003653 for ; Sun, 6 Apr 2003 04:57:08 -0700 (PDT) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h36Bv8Cx003652 for idn-reg-policy-bks; Sun, 6 Apr 2003 04:57:08 -0700 (PDT) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from nicemice.net (arwen.CS.Berkeley.EDU [128.32.132.165]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h36Bv6JM003648 for ; Sun, 6 Apr 2003 04:57:07 -0700 (PDT) Received: from amc by nicemice.net with local (Exim 3.35 #1 (Debian)) id 1928ln-00061Y-00 for ; Sun, 06 Apr 2003 04:57:07 -0700 Date: Sun, 6 Apr 2003 11:57:07 +0000 From: "Adam M. Costello" To: "'IDN registration policy list'" Subject: Re: New Internet Draft on registering IDNs Message-ID: <20030406115707.GA22913@nicemice.net> Reply-To: IDN registration policy list References: <000001c2fc31$a0c1fce0$481672c0@BENNYPC> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <000001c2fc31$a0c1fce0$481672c0@BENNYPC> User-Agent: Mutt/1.4i Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Benny Lipsicas wrote: > will the holder of family.co.il have any guarantee that no one else > registers family.co.il? No. No one expects big.com and large.com to be the same site, and your example is no different. Trying to block registrations that have the same meaning as other registrations would be hopelessly difficult, and even if it could be done, I think most registrants and all registries would hate such a policy. 
AMC From owner-idn-reg-policy Sun Apr 6 07:31:13 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h36EVDJM007282 for ; Sun, 6 Apr 2003 07:31:13 -0700 (PDT) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h36EVDoO007281 for idn-reg-policy-bks; Sun, 6 Apr 2003 07:31:13 -0700 (PDT) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from server1.matic.com (server.iicinternet.com [66.159.16.71] (may be forged)) by above.proper.com (8.12.9/8.11.6) with SMTP id h36EVBJM007275 for ; Sun, 6 Apr 2003 07:31:11 -0700 (PDT) Received: (qmail 4994 invoked from network); 6 Apr 2003 14:31:04 -0000 Received: from adsl-65-43-33-245.dsl.lgtpmi.ameritech.net (HELO ?192.168.0.100?) (65.43.33.245) by server.iicinternet.com with SMTP; 6 Apr 2003 14:31:04 -0000 Mime-Version: 1.0 X-Sender: tedd@sperling.com (Unverified) Message-Id: In-Reply-To: <002701c2fbc7$6c097ce0$7900a8c0@DAVIS1> References: <20030404201319.GA3202@nicemice.net> <002701c2fbc7$6c097ce0$7900a8c0@DAVIS1> Date: Sun, 6 Apr 2003 10:30:26 -0400 To: idn-reg-policy@imc.org From: tedd Subject: Re: confusability Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Mark: Thanks for the correction. I figured that someone would nail me on something I said. For some reason I thought the limit was FFFF, but now I see it's considerably larger than that. In any event, was there anything else that I was mistaken about? From my limited perspective the "confusability" issue appears very large, even larger than I originally thought. tedd > > I don't know the actual number of additional characters added thus >> far, but the upward limit is 65,535. 
So, as I see it, you will have > >A small correction: there are currently over 95,000 characters in Unicode >3.2; in Unicode 4.0 (very soon to be released) there will be an additional >thousand-odd characters. In addition, there are 131,068 possible private use >characters, and there are 871,758 reserved positions still available for >future characters. > >Mark >________ >mark.davis@jtcsv.com >IBM, MS 50-2/B11, 5600 Cottle Rd, SJ CA 95193 >(408) 256-3148 >fax: (408) 256-0799 > >----- Original Message ----- >From: "tedd" >To: "IDN registration policy list" >Sent: Saturday, April 05, 2003 07:13 >Subject: Re: confusability > > >> >> >tedd wrote: >> > >> > > > For the moment I'll call the relation "confusability". Given any >> >> > two labels (in no particular order), they are either confusable or >> >> > not, and it is possible to compute that boolean value. >> >> >> >> From an earlier post, someone talked about IBM.com vs 1BM.com -- which >> >> should have been ibm.com vs 1bm.com, but none the less this type of >> >> similar-looking-glyph use can be confusing. It can be even more >> >> confusing if one uses a Greek small letter iota with tonos (U03AF) to >> >> produce an ibm.com. Is this the type of confusion you are talking >> >> about? >> > >> >Could be. A registry would define its confusability relation as it >> >sees fit. It doesn't want to define confusability so narrowly that >> >not enough things are considered confusable, because then it would be >> >swamped by disputes about name ownership. But it doesn't want to define >> >confusability so broadly that it drastically curtails the number of >> >registrations (and hence revenue). >> > >> >Maybe "confusable" is not the best term. Maybe "neighboring" would be >> >better. It's got some of the right intuition: If you are my neighbor, >> >then I am your neighbor (symmetry), but my neighbor's neighbor is >> >not necessarily my neighbor (intransitivity). 
You can speak of the >> >neighborhood centered around a particular label. Neighborhoods centered >> >around different labels can partially overlap. A bundle would be either >> >a set of labels that are all neighbors of each other, or a subset of the >> >neighborhood centered around the bundle's primary label, depending on >> >which version of property 2 we use. Property 1 says that neighboring >> >labels in a zone must not belong to distinct bundles. >> > >> >I just noticed that I forgot to state an assumption, which we can call >> >property 0: Every label in a zone belongs to exactly one bundle. >> > >> >AMC >> >> AMC: >> >> I understand -- but, I cannot see how the "confusability" avoidance >> issue can be implemented to the entire Unicode database. >> >> It appears to me (perhaps I'm wrong) that this group is trying to >> predict and solve all possible problems that may arise from IDN >> registrations because of look-alike possibilities within the Unicode >> database. >> >> I don't know the actual number of additional characters added thus >> far, but the upward limit is 65,535. So, as I see it, you will have >> some 65,000 different possibilities of character confusion at a >> single character domain level (i.e., a.com). Now, move to two >> characters (aa.com) and figure becomes much larger -- something in >> the order of 65000 x 65000 range. >> >> Now, what's the upper limit to the number of characters allowed in a >> domain name and what's it's factorial? Do you honestly believe that >> you can solve this confusability problem for all possible > > combinations -- even if your interpretation is the correct one for >> each situation? Be reasonable, you're approaching a number that >> rivals the US national debt. Plus, no offense, you're making >> decisions about glyphs in other languages that are not you're own. 
>> >> I think this group has made some significant progress in that some >> characters have been already mapped to others -- such as all >> occurrences of glyphs looking like "A" and have been mapped to "a" >> and so on. But,you have done that primarily because you are familiar >> with the Latin character set and it's use. >> >> Now, to map all occurrences of everything that looks similar to one >> character may do more harm than good in ways not apparent to you >> presently. Plus, considering the shear number of combinations and >> thoughtful considerations required for each one -- I don't think this >> group has enough time nor resources to accomplish the task. >> >> It might be best, for all concerned, to let the market and courts work it >out. >> >> tedd >> -- >> http://sperling.com/ >> -- http://sperling.com/ From owner-idn-reg-policy Sun Apr 6 08:05:07 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h36F56JM008493 for ; Sun, 6 Apr 2003 08:05:06 -0700 (PDT) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h36F56x9008492 for idn-reg-policy-bks; Sun, 6 Apr 2003 08:05:06 -0700 (PDT) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from server1.matic.com (server.iicinternet.com [66.159.16.71] (may be forged)) by above.proper.com (8.12.9/8.11.6) with SMTP id h36F55JM008487 for ; Sun, 6 Apr 2003 08:05:05 -0700 (PDT) Received: (qmail 6320 invoked from network); 6 Apr 2003 15:05:01 -0000 Received: from adsl-65-43-33-245.dsl.lgtpmi.ameritech.net (HELO ?192.168.0.100?) 
(65.43.33.245) by server.iicinternet.com with SMTP; 6 Apr 2003 15:05:01 -0000 Mime-Version: 1.0 X-Sender: tedd@sperling.com (Unverified) Message-Id: In-Reply-To: <20030405225426.GB15271@nicemice.net> References: <20030404201319.GA3202@nicemice.net> <4.2.0.58.J.20030404133734.02e0b788@localhost> <200304042108.h34L8tuc001122@bartok.sidn.nl> <20030404213527.GC3202@nicemice.net> <20030405225426.GB15271@nicemice.net> Date: Sun, 6 Apr 2003 11:04:23 -0400 To: IDN registration policy list From: tedd Subject: Re: New Internet Draft on registering IDNs Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID:

>> I think this group has made some significant progress in that
>> some characters have been already mapped to others -- such as all
>> occurrences of glyphs looking like "A" and have been mapped to "a" and
>> so on.
>
>What are you referring to? I don't know of anyone who has done that.

AMC:

Please forgive me, perhaps I'm not using the right terminology, but clearly "A" has been mapped to "a" already.

Try using nameprep found at -- http://www.imc.org/nameprep/ -- and see what happens when you enter a HEX string of 0041 (A). You will find that it has been mapped to 0061 (a) -- as it has always been.

But please note that every glyph that looks like "A" has also been mapped to 0061 (a). For example, from the newer Unicode Mathematical Alphanumeric Symbols table every "A" (1D400 to 1D670) has been mapped to 0061 (a) as well. I believe there are thirteen in all: the bold, italic, bold-italic, script, bold-script, fraktur, double-struck, bold-fraktur, sans-serif, sans-serif-bold, sans-serif-italic, sans-serif-bold-italic and monospace A's have all been mapped to 0061 (a).

That's what I meant about characters being mapped to others -- is there a better way to say this?

Thanks.
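The mappings mentioned above can be checked with a short Python sketch. It is an approximation only: it uses NFKC normalization followed by case folding from the standard unicodedata module as a stand-in for Nameprep's mapping step, not Nameprep itself, and the helper name fold is invented here.

```python
import unicodedata

def fold(ch):
    """Rough stand-in for Nameprep's mapping: NFKC, then case folding."""
    return unicodedata.normalize("NFKC", ch).casefold()

# The thirteen Mathematical Alphanumeric capital A's, U+1D400 through
# U+1D670, one per style, spaced 0x34 code points apart.
MATH_CAPITAL_AS = [chr(cp) for cp in range(0x1D400, 0x1D671, 0x34)]

print(len(MATH_CAPITAL_AS))                            # 13
print(all(fold(ch) == "a" for ch in MATH_CAPITAL_AS))  # True
print(fold("A") == "a")                                # True
print(fold("\u0391") == "a")  # False: Greek capital alpha folds to U+03B1
print(fold("\u0410") == "a")  # False: Cyrillic capital A folds to U+0430
```

Under this approximation the styled A's fold to "a" because they carry compatibility decompositions to the Latin letter, whereas look-alike letters from other scripts fold to their own lowercase forms.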
tedd -- http://sperling.com/ From owner-idn-reg-policy Sun Apr 6 10:16:11 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h36HGAJM015674 for ; Sun, 6 Apr 2003 10:16:10 -0700 (PDT) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h36HGAsR015673 for idn-reg-policy-bks; Sun, 6 Apr 2003 10:16:10 -0700 (PDT) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from maya40.nic.fr (maya40.nic.fr [192.134.4.151]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h36HG4JM015664 for ; Sun, 6 Apr 2003 10:16:09 -0700 (PDT) Received: from vespucci.nic.fr (postfix@vespucci.nic.fr [192.134.4.68]) by maya40.nic.fr (8.12.4/8.12.4) with ESMTP id h36HFYvg860806; Sun, 6 Apr 2003 19:15:34 +0200 (CEST) Received: by vespucci.nic.fr (Postfix, from userid 1055) id 8F39B110F0; Sun, 6 Apr 2003 19:15:38 +0200 (CEST) Date: Sun, 6 Apr 2003 19:15:38 +0200 From: Stephane Bortzmeyer To: tedd Cc: IDN registration policy list Subject: Re: New Internet Draft on registering IDNs Message-ID: <20030406171538.GA4000@nic.fr> References: <20030404201319.GA3202@nicemice.net> <4.2.0.58.J.20030404133734.02e0b788@localhost> <200304042108.h34L8tuc001122@bartok.sidn.nl> <20030404213527.GC3202@nicemice.net> <20030405225426.GB15271@nicemice.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.3.28i X-Operating-System: Debian GNU/Linux 3.0 X-Kernel: Linux 2.4.18-686 i686 Organization: NIC France X-URL: http://www.nic.fr/ Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: On Sun, Apr 06, 2003 at 11:04:23AM -0400, tedd wrote a message of 39 lines which said: > But please note that every glyph that looks like "A" has also been > mapped to 0061 (a). This is clearly wrong (especially since Unicode does not deal with glyphs). 
For instance, the Greek capital Alpha U+0391 cannot be distinguished visually from a Latin capital A but is not mapped to A by nameprep. From owner-idn-reg-policy Sun Apr 6 12:57:13 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h36JvDJM021253 for ; Sun, 6 Apr 2003 12:57:13 -0700 (PDT) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h36JvDX3021252 for idn-reg-policy-bks; Sun, 6 Apr 2003 12:57:13 -0700 (PDT) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from nicemice.net (arwen.CS.Berkeley.EDU [128.32.132.165]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h36JvBJM021248 for ; Sun, 6 Apr 2003 12:57:12 -0700 (PDT) Received: from amc by nicemice.net with local (Exim 3.35 #1 (Debian)) id 192GGQ-0007KY-00 for ; Sun, 06 Apr 2003 12:57:14 -0700 Date: Sun, 6 Apr 2003 19:57:14 +0000 From: "Adam M. Costello" To: IDN registration policy list Subject: Re: New Internet Draft on registering IDNs Message-ID: <20030406195714.GA27898@nicemice.net> Reply-To: IDN registration policy list References: <20030404201319.GA3202@nicemice.net> <4.2.0.58.J.20030404133734.02e0b788@localhost> <200304042108.h34L8tuc001122@bartok.sidn.nl> <20030404213527.GC3202@nicemice.net> <20030405225426.GB15271@nicemice.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20030406171538.GA4000@nic.fr> User-Agent: Mutt/1.4i Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: tedd wrote: > But please note that every glyph that looks like "A" has also been > mapped to 0061 (a). For example, from the newer Unicode Mathematical > Alphanumeric Symbols table every "A" (1D400 to 1D670 ) has been mapped > to 0061 (a) as well. Those are not merely things that look like A, they are all in fact instances of the letter A. 
That's why they are mapped. Things that merely look like A are not mapped. Stephane gave one example: > For instance, the Greek capital Alpha U+0391 cannot be distinguished > visually from a Latin capital A but is not mapped to A by nameprep. There is another example in the Cyrillic alphabet. By the way, the IDN folks are responsible for none of the mappings in Nameprep except the few map-to-nothing rules. All other mappings were simply imported from the Unicode Consortium. The normalization mappings are designed to coalesce alternate representations of the same text, while the case-folding mappings are designed to coalesce strings that differ only in the upper/lower case of letters. Neither set of mappings is designed to coalesce strings that are visually similar. AMC From owner-idn-reg-policy Sun Apr 6 13:03:54 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h36K3rJM021466 for ; Sun, 6 Apr 2003 13:03:53 -0700 (PDT) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h36K3rM6021465 for idn-reg-policy-bks; Sun, 6 Apr 2003 13:03:53 -0700 (PDT) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from hosting.altserver.com (hosting.altserver.com [209.124.80.2]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h36K3qJM021461 for ; Sun, 6 Apr 2003 13:03:52 -0700 (PDT) Received: from f14m-1-78.d1.club-internet.fr ([212.195.88.78] helo=mine.jefsey.com) by hosting.altserver.com with esmtp (Exim 3.36 #1) id 192GMq-000380-00 for idn-reg-policy@imc.org; Sun, 06 Apr 2003 13:03:53 -0700 Message-Id: <5.2.0.9.0.20030406210249.00a4cdb0@mail.jefsey.com> X-Sender: jefsey+jefsey.com@mail.jefsey.com X-Mailer: QUALCOMM Windows Eudora Version 5.2.0.9 Date: Sun, 06 Apr 2003 21:24:07 +0200 To: idn-reg-policy@imc.org From: "JFC (Jefsey) Morfin" Subject: Re: New Internet Draft on registering IDNs In-Reply-To: 
<20030405225426.GB15271@nicemice.net> References: <20030404201319.GA3202@nicemice.net> <4.2.0.58.J.20030404133734.02e0b788@localhost> <200304042108.h34L8tuc001122@bartok.sidn.nl> <20030404213527.GC3202@nicemice.net> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed X-AntiAbuse: This header was added to track abuse, please include it with any abuse report X-AntiAbuse: Primary Hostname - hosting.altserver.com X-AntiAbuse: Original Domain - imc.org X-AntiAbuse: Originator/Caller UID/GID - [0 0] / [0 0] X-AntiAbuse: Sender Address Domain - jefsey.com Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: On 00:54 06/04/03, Adam M. Costello said: >However, many people are concerned that the whole naming system will >lose its value if human beings cannot recognize when two names are the >same and when they are different. dear Adam, Your point is well taken, but human beings are not just alphabets. Basing such tables on unicode is like using the genome to decide about man/woman to fall in love. >The first tables to be developed (not by this group, but by experts in the >relevant language) will prohibit all Unicode characters except those >essential to a particular language, and will contain carefully crafted >mappings for those characters. This would be acceptable if any language oriented TLD did exist. ccTLDs are geographical and ruled by WTO regulations of equal and fair treatment to all. They also may be subject to local laws (but WTO supersedes laws as a Treaty). >That's irrelevant. If I'm told that ue and u are confusable, I >don't need to be told separately that xue and xu are confusable, >and that foouebar and fooubar are confusable. This does not take language context into account. Accentuations are by nature clarified by context. This is why rule has exceptions. > > you're making decisions about glyphs in other languages that are not > > you're own. 
> >This group is certainly are not doing that. It's not designing the >tables. The principles behind the tables are the problems. >Right, which is why this group is not even going to try. At most, it's >going to develop an architecture into which tables can be plugged, and >other groups with the needed expertise will devise the tables. Multi-layer working repartition: these groups will then say they delegate to other experts.... This is real life. So we all know that the ultimate experts will be lawyers and Judges. jfc From owner-idn-reg-policy Sun Apr 6 14:58:18 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h36LwHJM024057 for ; Sun, 6 Apr 2003 14:58:17 -0700 (PDT) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h36LwHXM024056 for idn-reg-policy-bks; Sun, 6 Apr 2003 14:58:17 -0700 (PDT) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from server1.matic.com (server.iicinternet.com [66.159.16.71] (may be forged)) by above.proper.com (8.12.9/8.11.6) with SMTP id h36LwGJM024045 for ; Sun, 6 Apr 2003 14:58:16 -0700 (PDT) Received: (qmail 25298 invoked from network); 6 Apr 2003 21:58:13 -0000 Received: from adsl-65-43-33-245.dsl.lgtpmi.ameritech.net (HELO ?192.168.0.100?) 
(65.43.33.245) by server.iicinternet.com with SMTP; 6 Apr 2003 21:58:13 -0000 Mime-Version: 1.0 X-Sender: tedd@sperling.com (Unverified) Message-Id: In-Reply-To: <20030405225426.GB15271@nicemice.net> References: <20030404201319.GA3202@nicemice.net> <4.2.0.58.J.20030404133734.02e0b788@localhost> <200304042108.h34L8tuc001122@bartok.sidn.nl> <20030404213527.GC3202@nicemice.net> <20030405225426.GB15271@nicemice.net> Date: Sun, 6 Apr 2003 17:57:36 -0400 To: IDN registration policy list From: tedd Subject: Re: New Internet Draft on registering IDNs Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: AMC said: >No, this group is not trying to predict them all. It is trying to >devise an architecture that can be extended to deal with the issues >incrementally as they are discovered. The first tables to be developed >(not by this group, but by experts in the relevant language) will >prohibit all Unicode characters except those essential to a particular >language, and will contain carefully crafted mappings for those >characters. Then some tables will be developed that combine two >languages, with special care taken for the characters used by both. >Gradually, as more experience and expertise is developed, more inclusive >tables will be created. Each registry will decide what table to use for >each of its zones. Okay, fair enough -- I understand then, you are simply trying to set up a procedure for dealing with confusing, or otherwise look-alike, code-points. With that said, are you trying to prohibit them, or map them to other code-points, or both? Another question -- What "experts in the relevant language" will oversee char sets such as Dingbats, Mathematical Symbols, or other such non-language specific (or multi-language) characters? Thanks for the clarification. 
tedd -- http://sperling.com/ From owner-idn-reg-policy Sun Apr 6 15:27:51 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h36MRoJM024765 for ; Sun, 6 Apr 2003 15:27:50 -0700 (PDT) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h36MRov5024764 for idn-reg-policy-bks; Sun, 6 Apr 2003 15:27:50 -0700 (PDT) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from nicemice.net (arwen.CS.Berkeley.EDU [128.32.132.165]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h36MRnJM024760 for ; Sun, 6 Apr 2003 15:27:49 -0700 (PDT) Received: from amc by nicemice.net with local (Exim 3.35 #1 (Debian)) id 192IcC-0007k2-00 for ; Sun, 06 Apr 2003 15:27:52 -0700 Date: Sun, 6 Apr 2003 22:27:52 +0000 From: "Adam M. Costello" To: IDN registration policy list Subject: Re: New Internet Draft on registering IDNs Message-ID: <20030406222752.GC27898@nicemice.net> Reply-To: IDN registration policy list References: <20030404201319.GA3202@nicemice.net> <4.2.0.58.J.20030404133734.02e0b788@localhost> <200304042108.h34L8tuc001122@bartok.sidn.nl> <20030404213527.GC3202@nicemice.net> <20030405225426.GB15271@nicemice.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4i Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: tedd wrote: > Okay, fair enough -- I understand then, you are simply trying to set > up a procedure for dealing with confusing, or otherwise look-alike, > code-points. With that said, are you trying to prohibit them, or map > them to other code-points, or both? The primary concern of this group (I think) is that registries prevent too-similar labels in the same zone from belonging to different registrants. 
If they belong to the same registrant, that's fine, and then the question of how the labels behave is less important. The registry could force the labels to resolve the same, or the registry could leave it up to the registrant to control how they behave (the registrant can configure the subordinate DNS servers and the application-level servers). For example, the traditional and simplified forms of a Chinese name might be required by the registry to belong to the same registrant, but that registrant might be able to choose whether to have them return the same web page or two different web pages (one using traditional Chinese and one using simplified Chinese). > Another question -- What "experts in the relevant language" will > oversee char sets such as Dingbats, Mathematical Symbols, or other > such non-language specific (or multi-language) characters? If potential registrants demand to use those characters in names (which is a big if, because registrants typically want names that are easy to type), then presumably the registries will find some experts on those characters. 
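A minimal sketch of the same-registrant rule described above, in Python. It is an illustration under stated assumptions, not anything taken from the drafts under discussion: the table VARIANT_CLASS, the helper class_key, and the Zone class are hypothetical names, and the single table entry (U+53F0 folded onto U+81FA) is a placeholder for whatever table a registry would actually adopt. The sketch only decides who may hold a label; whether two labels resolve the same way is left to the registrant, as in the text.

```python
# Hypothetical variant table: maps a code point to the representative of its
# variant class. The single entry is illustrative only, not a registry table.
VARIANT_CLASS = {
    "\u53f0": "\u81fa",  # U+53F0 is folded onto U+81FA for comparison
}


def class_key(label: str) -> str:
    """Replace every character by its class representative."""
    return "".join(VARIANT_CLASS.get(ch, ch) for ch in label)


class Zone:
    """Toy zone that keeps too-similar labels with a single registrant."""

    def __init__(self) -> None:
        self._holder = {}  # class key -> registrant holding that class
        self._labels = {}  # registered label -> registrant

    def register(self, label: str, registrant: str) -> bool:
        """Accept the label unless a similar label belongs to someone else."""
        key = class_key(label)
        holder = self._holder.setdefault(key, registrant)
        if holder != registrant:
            return False  # a too-similar label is held by another registrant
        self._labels[label] = registrant
        return True


zone = Zone()
first = zone.register("\u81fa\u5317", "alice")   # accepted
second = zone.register("\u53f0\u5317", "alice")  # same registrant: accepted
third = zone.register("\u53f0\u5317", "bob")     # different registrant: refused
```

Keying the ownership check on the class representative rather than on the label itself means the registry never has to enumerate every similar label in advance.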
AMC From owner-idn-reg-policy Sun Apr 6 15:44:24 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h36MiOJM025152 for ; Sun, 6 Apr 2003 15:44:24 -0700 (PDT) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h36MiOOI025151 for idn-reg-policy-bks; Sun, 6 Apr 2003 15:44:24 -0700 (PDT) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from [63.202.92.152] (adsl-63-202-92-152.dsl.snfc21.pacbell.net [63.202.92.152]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h36MiMJN025147 for ; Sun, 6 Apr 2003 15:44:22 -0700 (PDT) Mime-Version: 1.0 X-Sender: phoffman@mail.imc.org Message-Id: In-Reply-To: <20030406222752.GC27898@nicemice.net> References: <20030404201319.GA3202@nicemice.net> <4.2.0.58.J.20030404133734.02e0b788@localhost> <200304042108.h34L8tuc001122@bartok.sidn.nl> <20030404213527.GC3202@nicemice.net> <20030405225426.GB15271@nicemice.net> <20030406222752.GC27898@nicemice.net> X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . Date: Sun, 6 Apr 2003 15:44:20 -0700 To: IDN registration policy list From: Paul Hoffman / IMC Subject: Re: New Internet Draft on registering IDNs Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 10:27 PM +0000 4/6/03, Adam M. 
Costello wrote: >The primary concern of this group (I think) is that registries prevent >too-similar labels in the same zone from belonging to different >registrants. Not quite (in fact, not close). The primary concern of the list is to help zones/registries decide: a) which characters they want in their zones b) if they want to handle variants c) if (b), how to decide what goes in the zone and what prevents other registrations Registries are free to use or not use any of the documents that we are discussing. We are *not* trying to get them to do anything specifically: that is completely their decision. It is particularly inappropriate for anyone who doesn't fully understand a script or language to try to get someone whose zone uses that script or language to do anything in particular. The trick is to come up with advice that is wide enough for almost any zone that is not so vague as to be useless to all of them. --Paul Hoffman, Director --Internet Mail Consortium From owner-idn-reg-policy Mon Apr 7 08:53:20 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h37FrKJM013036 for ; Mon, 7 Apr 2003 08:53:20 -0700 (PDT) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h37FrKlI013035 for idn-reg-policy-bks; Mon, 7 Apr 2003 08:53:20 -0700 (PDT) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from neteka.com (www.namesbeyond.com [216.220.34.103]) by above.proper.com (8.12.9/8.11.6) with SMTP id h37FrIJM013031 for ; Mon, 7 Apr 2003 08:53:19 -0700 (PDT) Message-ID: <02d901c2fd1d$bf8a8380$6701a8c0@neteka.inc> From: "Edmon Chung" To: "Roozbeh Pournader" , "IDN registration policy list" References: Subject: Re: Equivalence only in one direction Date: Mon, 7 Apr 2003 11:52:51 -0400 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: 
Normal
X-Mailer: Microsoft Outlook Express 6.00.2600.0000
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2600.0000
Sender: owner-idn-reg-policy@mail.imc.org
Precedence: bulk
List-Archive:
List-Unsubscribe:
List-ID:

----- Original Message -----
From: "Roozbeh Pournader"

> > > 1. Required: Always resolving. Example: different digit forms
> > > (U+0030->U+0660).
> > >
> > > 2. Variant: Optionally resolving. The current variant.
> > >
> > > 3. Blocking: Reserved forever. Example: Arabic vs Persian Yehs
> > > (U+0649->U+06CC).

This is quite right, and it makes sense to me as well. This is actually the concept I tried to capture and describe in zoneprep, where "Reserved Variants" are further distinguished into 4 possible types:

- Normal Reserved Variants (blocked for registration, and can later be activated into the zone)
- Restricted Reserved Variants (blocked for registration and cannot be activated)
- Automatic Zone Variants (automatically included in zone files)
- Suggested Reserved Variants (not blocked from the registration pool, but suggested to the client to reserve)

Edmon

From owner-idn-reg-policy Mon Apr 7 08:51:10 2003
Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h37FpAJM012982 for ; Mon, 7 Apr 2003 08:51:10 -0700 (PDT)
Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h37Fp974012981 for idn-reg-policy-bks; Mon, 7 Apr 2003 08:51:09 -0700 (PDT)
X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f
Received: from neteka.com (www.namesbeyond.com [216.220.34.103]) by above.proper.com (8.12.9/8.11.6) with SMTP id h37Fp8JM012973 for ; Mon, 7 Apr 2003 08:51:08 -0700 (PDT)
Message-ID: <02cf01c2fd1d$328a5280$6701a8c0@neteka.inc>
From: "Edmon Chung"
To: "Roozbeh Pournader" , "Paul Hoffman / IMC"
Cc: "Martin Duerst" , "IDN registration policy list"
References:
Subject: Re: initial thoughts
Date: Mon, 7 Apr 2003 11:48:54 -0400
MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2600.0000 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2600.0000 Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Overlaps are very possible considering that multiple languages use the same script/codepoints. Edmon ----- Original Message ----- From: "Roozbeh Pournader" To: "Paul Hoffman / IMC" Cc: "Martin Duerst" ; "IDN registration policy list" Sent: Saturday, April 05, 2003 9:14 AM Subject: Re: initial thoughts > > On Thu, 3 Apr 2003, Paul Hoffman / IMC wrote: > > > >I think first we need clarification from you about whether > > >you intended, in your approach, that (variants in) bundles > > >can overlap. There is no indication in your draft that they > > >can't, but on the other hand, there is no indication that you > > >were aware of the fact that they could. > > > > I will add text saying that they cannot overlap, and add steps in the > > process to check for that. > > Well, I object. Sometimes these overlaps are unavoidable (examples already > posted). I really prefer a protocol/system that allows overlaps but states > how to handle them. 
> > roozbeh > From owner-idn-reg-policy Mon Apr 7 08:53:59 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h37FrwJM013059 for ; Mon, 7 Apr 2003 08:53:58 -0700 (PDT) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h37Frwh7013058 for idn-reg-policy-bks; Mon, 7 Apr 2003 08:53:58 -0700 (PDT) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from neteka.com (www.namesbeyond.com [216.220.34.103]) by above.proper.com (8.12.9/8.11.6) with SMTP id h37FrvJM013053 for ; Mon, 7 Apr 2003 08:53:57 -0700 (PDT) Message-ID: <02e501c2fd1d$e06e5040$6701a8c0@neteka.inc> From: "Edmon Chung" To: "IDN registration policy list" References: <20030404081656.GA29886@nicemice.net> <4.2.0.58.J.20030404141327.02dfc8b0@localhost> <20030404232518.GF3202@nicemice.net> Subject: Re: model with overlapping variants Date: Mon, 7 Apr 2003 11:53:46 -0400 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2600.0000 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2600.0000 Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: ----- Original Message ----- From: "Adam M. Costello" > > What about "similarity"? > > I think I like "neighbors" better. "Similarity" sounds like a > real-valued function (how similar are X and Y?), while "neighbors" > sounds more boolean (are X and Y are neighbors?). I think both are confusing... just calling it a "variant" should be sufficient. 
Edmon

From owner-idn-reg-policy Mon Apr 7 12:57:40 2003
Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h37JveJM022955 for ; Mon, 7 Apr 2003 12:57:40 -0700 (PDT)
Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h37Jvegs022953 for idn-reg-policy-bks; Mon, 7 Apr 2003 12:57:40 -0700 (PDT)
X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f
Received: from [63.202.92.152] (adsl-63-202-92-152.dsl.snfc21.pacbell.net [63.202.92.152]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h37JvcJN022945 for ; Mon, 7 Apr 2003 12:57:38 -0700 (PDT)
Mime-Version: 1.0
X-Sender: phoffman@mail.imc.org
Message-Id:
X-Habeas-SWE-1: winter into spring
X-Habeas-SWE-2: brightly anticipated
X-Habeas-SWE-3: like Habeas SWE (tm)
X-Habeas-SWE-4: Copyright 2002 Habeas (tm)
X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this
X-Habeas-SWE-6: email in exchange for a license for this Habeas
X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant
X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this
X-Habeas-SWE-9: mark in spam to .
Date: Mon, 7 Apr 2003 12:57:09 -0700
To: idn-reg-policy@imc.org
From: Paul Hoffman / IMC
Subject: Why we don't want to restrict zones to a single language
Content-Type: text/plain; charset="iso-8859-1" ; format="flowed"
Content-Transfer-Encoding: 8bit
Sender: owner-idn-reg-policy@mail.imc.org
Precedence: bulk
List-Archive:
List-Unsubscribe:
List-ID:

Humor alert: in today's San Jose Mercury sports section, a talented young baseball pitcher from Venezuela is described as a "Wünderniño".
--Paul Hoffman, Director --Internet Mail Consortium From owner-idn-reg-policy Mon Apr 7 14:17:11 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h37LHBJM011986 for ; Mon, 7 Apr 2003 14:17:11 -0700 (PDT) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h37LHBiH011985 for idn-reg-policy-bks; Mon, 7 Apr 2003 14:17:11 -0700 (PDT) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from nicemice.net (arwen.CS.Berkeley.EDU [128.32.132.165]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h37LHAJM011981 for ; Mon, 7 Apr 2003 14:17:10 -0700 (PDT) Received: from amc by nicemice.net with local (Exim 3.35 #1 (Debian)) id 192dzN-00027e-00 for ; Mon, 07 Apr 2003 14:17:13 -0700 Date: Mon, 7 Apr 2003 21:17:13 +0000 From: "Adam M. Costello" To: IDN registration policy list Subject: Re: model with overlapping variants Message-ID: <20030407211712.GD7147@nicemice.net> Reply-To: IDN registration policy list References: <20030404081656.GA29886@nicemice.net> <4.2.0.58.J.20030404141327.02dfc8b0@localhost> <20030404232518.GF3202@nicemice.net> <02e501c2fd1d$e06e5040$6701a8c0@neteka.inc> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <02e501c2fd1d$e06e5040$6701a8c0@neteka.inc> User-Agent: Mutt/1.4i Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Edmon Chung wrote: > > I think I like "neighbors" better. "Similarity" sounds like a > > real-valued function (how similar are X and Y?), while "neighbors" > > sounds more boolean (are X and Y are neighbors?). > > I think both are confusing... just calling it a "variant" should be > sufficient. For the model I proposed I need a word that intuitively refers to a symmetric intransitive relation. The word "variant" sounds likely to be asymmetric. 
(If X is a variant of Y, does that mean Y is a variant of X? Not necessarily. But if X is a neighbor of Y, then surely Y is a neighbor of X.) But it might turn out to be convenient to define the symmetric relation in terms of an asymmetric relation. For example, there might be tables that tell when one thing is a variant of another (which is asymmetric), and then we could define that X and Y are neighbors iff either is a variant of the other. AMC From owner-idn-reg-policy Tue Apr 8 00:42:29 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h387gTJM010209 for ; Tue, 8 Apr 2003 00:42:29 -0700 (PDT) Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h387gT8R010208 for idn-reg-policy-bks; Tue, 8 Apr 2003 00:42:29 -0700 (PDT) X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from maya20.nic.fr (maya20.nic.fr [192.134.4.152]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h387gNJM010185; Tue, 8 Apr 2003 00:42:28 -0700 (PDT) Received: from vespucci.nic.fr (postfix@vespucci.nic.fr [192.134.4.68]) by maya20.nic.fr (8.12.4/8.12.4) with ESMTP id h387gCMn1089101; Tue, 8 Apr 2003 09:42:13 +0200 (CEST) Received: by vespucci.nic.fr (Postfix, from userid 1055) id 7407D110F0; Tue, 8 Apr 2003 09:42:16 +0200 (CEST) Date: Tue, 8 Apr 2003 09:42:16 +0200 From: Stephane Bortzmeyer To: Paul Hoffman / IMC Cc: idn-reg-policy@imc.org Subject: Re: Why we don't want to restrict zones to a single language Message-ID: <20030408074216.GA18746@nic.fr> References: Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: User-Agent: Mutt/1.3.28i X-Operating-System: Debian GNU/Linux 3.0 X-Kernel: Linux 2.4.18-686 i686 Organization: NIC France X-URL: http://www.nic.fr/ Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: 
List-ID:

On Mon, Apr 07, 2003 at 12:57:09PM -0700, Paul Hoffman / IMC wrote a message of 6 lines which said:

> Humor alert: in today's San Jose Mercury sports section, a talented
> young baseball pitcher from Venezuela is described as a "Wünderniño".

The funniest thing is that "wunder" does not have an umlaut (diaeresis)... wunderlich, wunderbar, etc.

So, the paper's sentence is no more German than "Asereje" is Spanish.

From owner-idn-reg-policy Tue Apr 8 03:24:55 2003
Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h38AOtJM001929 for ; Tue, 8 Apr 2003 03:24:55 -0700 (PDT)
Received: (from majordomo@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h38AOtSS001927 for idn-reg-policy-bks; Tue, 8 Apr 2003 03:24:55 -0700 (PDT)
X-Authentication-Warning: above.proper.com: majordomo set sender to owner-idn-reg-policy@mail.imc.org using -f
Received: from hosting.altserver.com (hosting.altserver.com [209.124.80.2]) by above.proper.com (8.12.9/8.11.6) with ESMTP id h38AOsJM001917; Tue, 8 Apr 2003 03:24:54 -0700 (PDT)
Received: from f07a-9-5.d1.club-internet.fr ([212.194.152.5] helo=mine.jefsey.com) by hosting.altserver.com with esmtp (Exim 3.36 #1) id 192qHQ-0001ze-00; Tue, 08 Apr 2003 03:24:41 -0700
Message-Id: <5.2.0.9.0.20030408111624.02808880@mail.jefsey.com>
X-Sender: jefsey+jefsey.com@mail.jefsey.com
X-Mailer: QUALCOMM Windows Eudora Version 5.2.0.9
Date: Tue, 08 Apr 2003 11:23:06 +0200
To: Stephane Bortzmeyer , Paul Hoffman / IMC
From: "JFC (Jefsey) Morfin"
Subject: Re: Why we don't want to restrict zones to a single language
Cc: idn-reg-policy@imc.org
In-Reply-To: <20030408074216.GA18746@nic.fr>
References:
Mime-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"; format=flowed
Content-Transfer-Encoding: 8bit
X-AntiAbuse: This header was added to track abuse, please include it with any abuse report
X-AntiAbuse: Primary Hostname - hosting.altserver.com
X-AntiAbuse: Original Domain - imc.org
X-AntiAbuse: Originator/Caller UID/GID - [0 0] / [0 0]
X-AntiAbuse: Sender Address Domain - jefsey.com
Sender: owner-idn-reg-policy@mail.imc.org
Precedence: bulk
List-Archive:
List-Unsubscribe:
List-ID:

At 09:42 08/04/03, Stephane Bortzmeyer wrote:
>On Mon, Apr 07, 2003 at 12:57:09PM -0700,
> Paul Hoffman / IMC wrote
> a message of 6 lines which said:
> > Humor alert: in today's San Jose Mercury sports section, a talented
> > young baseball pitcher from Venezuela is described as a "Wünderniño".
>The funniest thing is that "wunder" does not have an umlaut
>(diaeresis)...
>wunderlich, wunderbar, etc.
>So, the paper's sentence is no more German than "Asereje" is Spanish.

Totally true. But this is an intellectual creation by the San Jose Mercury sports editor, meant to carry a meaning we all understand, by the way. There are already enough TM problems without adding IP problems, don't you think?

I suggest that we use "l'étalon-wünderniño" (a word most can type on their keyboards) to check that whatever proposal is made meets first-level, real-life, international minimum requirements (the San Jose Mercury is American), without yet considering Berber, dialectal Arabic, Greek, Cyrillic, etc.
jfc

From owner-idn-reg-policy Sun Apr 13 20:07:03 2003
Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.8p1/8.12.8) with ESMTP id h3E3731r002498 for ; Sun, 13 Apr 2003 20:07:03 -0700 (PDT) (envelope-from owner-idn-reg-policy@mail.imc.org)
Received: (from majordom@localhost) by above.proper.com (8.12.8p1/8.12.9/Submit) id h3E372P7002497 for idn-reg-policy-bks; Sun, 13 Apr 2003 20:07:02 -0700 (PDT)
X-Authentication-Warning: above.proper.com: majordom set sender to owner-idn-reg-policy@mail.imc.org using -f
Received: from sentosa.post1.com (sentosa.post1.com [202.27.17.100]) by above.proper.com (8.12.8p1/8.12.8) with SMTP id h3E3701r002490 for ; Sun, 13 Apr 2003 20:07:01 -0700 (PDT) (envelope-from jseng@pobox.org.sg)
Received: (qmail 15142 invoked from network); 14 Apr 2003 03:19:49 -0000
Received: from ida120.ida.gov.sg (HELO JSENGTOSHIBA) (210.24.194.120) by sentosa.post1.com with SMTP; 14 Apr 2003 03:19:49 -0000
Message-ID: <00f101c30232$e1716030$91784b0a@JSENGTOSHIBA>
From: "James Seng"
To: "Stephane Bortzmeyer" , "Martin Duerst"
Cc: "IDN registration policy list"
References: <4.2.0.58.J.20030402150335.035b1fc8@localhost> <20030403082051.GA6764@nic.fr>
Subject: Re: Comparison of hoffman-idn-reg and jseng-idn-admin
Date: Mon, 14 Apr 2003 11:06:43 +0800
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 6.00.2800.1106
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1106
Sender: owner-idn-reg-policy@mail.imc.org
Precedence: bulk
List-Archive:
List-Unsubscribe:
List-ID:

Stephane,

This is one of the reasons why I don't think a generic algorithm with a single table format can be designed to handle *all* languages[1]. I believe in a more stripped-down generic framework for handling IDNs and their variants (e.g. registration, deletion, transfer, etc).
But the exact algorithm to generate the variants should be defined per language. [1] language as defined by RFC 3066. -James Seng ----- Original Message ----- From: "Stephane Bortzmeyer" To: "Martin Duerst" Cc: "IDN registration policy list" Sent: Thursday, April 03, 2003 4:20 PM Subject: Re: Comparison of hoffman-idn-reg and jseng-idn-admin > > On Wed, Apr 02, 2003 at 03:07:03PM -0500, > Martin Duerst wrote > a message of 28 lines which said: > > > The danger of bundles being too big can easily happen for European > > languages, with a bundle that defines that all accented versions of > > a character are treated as the same as the base character. > > Yes, see my previous message, in the thread "New Internet Draft on > registering IDNs". A typical example is the label > "3suisses-assurances" (which actually exist in '.fr') which has a > bundle of 306,250 labels with a table that uses (almost) all the > Latin-1 characters. > > Not all of Latin-1 characters exist in French so we could downsize the > table and therefore the bundles. But, on the other hand, for a > registry like '.eu', we will need an even larger table since Europe > requires more than just Latin-1. > > > In that case, Paul's approach (also described by Adam) of using > > equivalence classes won't scale. > > It doesn't scale if you want to actually generate the bundle and > publish them in a static zone file. I tried for the '.fr' zone which > is quite small - 150,000 domains - and the resulting zone file was > larger than '.com' even before the domains starting with the letter A > were fully processed. But you have other approaches: > > * a dynamic DNS server like PowerDNS with a > back-end that will match a label to its bundle at query-time, > > * Option 2 or 3 of Paul's draft, which do not require to actually > store the complete bundle. > > > What may work is that an accented character blocks the base > > character, but not characters with a different accent. > > Interesting. 
We could also draw inspiration from most Web search > engines. They work that way: If there is no composed character in the > query, they search "accent-insensitive". If there is at least one, > they switch to "accent-sensitive". > > > From owner-idn-reg-policy Sun Apr 13 20:02:44 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.8p1/8.12.8) with ESMTP id h3E32i1r002413 for ; Sun, 13 Apr 2003 20:02:44 -0700 (PDT) (envelope-from owner-idn-reg-policy@mail.imc.org) Received: (from majordom@localhost) by above.proper.com (8.12.8p1/8.12.9/Submit) id h3E32ifB002412 for idn-reg-policy-bks; Sun, 13 Apr 2003 20:02:44 -0700 (PDT) X-Authentication-Warning: above.proper.com: majordom set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from sentosa.post1.com (sentosa.post1.com [202.27.17.100]) by above.proper.com (8.12.8p1/8.12.8) with SMTP id h3E32d1r002397 for ; Sun, 13 Apr 2003 20:02:42 -0700 (PDT) (envelope-from jseng@pobox.org.sg) Received: (qmail 15082 invoked from network); 14 Apr 2003 03:15:25 -0000 Received: from ida120.ida.gov.sg (HELO JSENGTOSHIBA) (210.24.194.120) by sentosa.post1.com with SMTP; 14 Apr 2003 03:15:25 -0000 Message-ID: <006501c30232$43e6ded0$91784b0a@JSENGTOSHIBA> From: "James Seng" To: "Paul Hoffman / IMC" , , "Martin Duerst" References: <3E818686.6010307@twnic.net.tw> <3E818686.6010307@twnic.net.tw> <4.2.0.58.J.20030402161258.0291df10@localhost> Subject: Re: New Internet Draft on registering IDNs Date: Mon, 14 Apr 2003 11:02:19 +0800 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2800.1106 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1106 Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: Not all variants generated are equal. Some variants are more often used then others. 
Some others don't even make sense because they are generated algorithmically. And because the number of variants each character may have is multiplied across the Chinese characters in a name, we may get thousands of possible variants. Putting them *all* in the zone file puts too much stress on the DNS for little benefit. That's why the JET guideline puts only some of the variants (hopefully a handful) in the zone file and puts the others on reserve (to prevent potential disputes).

-James Seng

----- Original Message -----
From: "Martin Duerst"
To: "Paul Hoffman / IMC" ;
Sent: Thursday, April 03, 2003 5:19 AM
Subject: Re: New Internet Draft on registering IDNs

>
> At 08:16 03/03/26 -0800, Paul Hoffman / IMC wrote:
>
> >Right. Unfortunately, the current draft of the JET document is silent
> >about these requirements, and from talking to some JET members, I haven't
> >heard any good description of why Chinese needs both. In fact, I remember
> >many long conversations with CNNIC and TWNIC people a few years ago where
> >they all said that just blocking (with no allocating) was fine. Maybe
> >opinions in the Chinese language community have changed since then, but I
> >haven't seen any written down in the JET document yet. Maybe the next
> >version will cover this clearly.
>
> This is just a wild guess, but it may have to do with the fact that
> even in Taiwan, simplified characters are sometimes used. The
> most often cited example is the 'tai' in Taiwan (U+53F0). This is
> clearly a simplified character, but it is often used. While in
> general, combinations of simplified and traditional variants
> can just be blocked, this is a case where just blocking would not work.
>
> >True, but it would only help a little bit. Telling the users what has been
> >done does not let them predict what will happen.
If a registry says "we > >have mapped these characters to these other ones for this language > >reason", users will understand that; if a registry says "we have blocked > >these characters for this language reason", users will understand that. > >But I don't know how many users will understand "we have mapped some of > >them but blocked other ones even though the language reason is the same". > >If there is a good language reason for differentiating the two cases, that > >would be wonderful. > > 'language reason' may be the same or different. It may be the same language, > but a different reason. Also, in some cases, it may appear very natural > to people understanding the language to read 'we have mapped A, B, and C, > and blocked D, E, and F'. A very simplistic example would be French, > with somebody registering e-acute. If the system replied 'we have > mapped e (without any accent) and blocked e-grave, e-circumflex, > and e-diaeresis', that would make sense to somebody understanding > French. The e without accent can be used as an equivalent for e-acute, > and is therefore mapped, but the other accented variants are never > equivalents, and may be blocked just because they would otherwise > interfere with the e without accent. > [I don't claim that this is the right thing to do for French.] > > Regards, Martin. 
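The multiplication James describes, and the split between a handful of variants in the zone file and the rest held in reserve, can be sketched roughly as follows. This is only an illustration: the variant table, the function names, and the choice of which variant counts as "preferred" are all assumptions made for the example, not taken from the JET guidelines or from any registry's actual table.

```python
from itertools import product
from math import prod

# Hypothetical variant table: each code point maps to the code points treated
# as its variants (itself included). Real tables are defined by each registry.
VARIANTS = {
    "\u53f0": ["\u53f0", "\u81fa"],  # 'tai': simplified U+53F0, traditional U+81FA
    "\u6e7e": ["\u6e7e", "\u7063"],  # 'wan': simplified U+6E7E, traditional U+7063
}

def variant_count(label):
    """Number of variant labels: the product of per-character variant counts."""
    return prod(len(VARIANTS.get(ch, [ch])) for ch in label)

def all_variants(label):
    """Every combination of per-character variants, the label itself included."""
    choices = [VARIANTS.get(ch, [ch]) for ch in label]
    return ["".join(combo) for combo in product(*choices)]

def split_variants(label, preferred):
    """Split the variants of `label` (excluding the label itself) into those
    placed in the zone file and those only reserved. `preferred` is the
    (assumed) small set the registry or registrant wants to resolve."""
    variants = set(all_variants(label)) - {label}
    zone = sorted(v for v in variants if v in preferred)
    reserved = sorted(variants - set(zone))
    return zone, reserved

registered = "\u53f0\u6e7e"
zone, reserved = split_variants(registered, preferred={"\u81fa\u7063"})
print(variant_count(registered), "variant labels in total")
print("zone:", zone)
print("reserved:", reserved)
```

With only two variants per character the numbers stay small, but the same product grows quickly: four variants per character in a six-character name would already give 4**6 = 4096 labels, which is the case for keeping most of them out of the zone file.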
> From owner-idn-reg-policy Mon Apr 14 02:55:30 2003 Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.8p1/8.12.8) with ESMTP id h3E9tU1r040228 for ; Mon, 14 Apr 2003 02:55:30 -0700 (PDT) (envelope-from owner-idn-reg-policy@mail.imc.org) Received: (from majordom@localhost) by above.proper.com (8.12.8p1/8.12.9/Submit) id h3E9tU2H040227 for idn-reg-policy-bks; Mon, 14 Apr 2003 02:55:30 -0700 (PDT) X-Authentication-Warning: above.proper.com: majordom set sender to owner-idn-reg-policy@mail.imc.org using -f Received: from hosting.altserver.com (hosting.altserver.com [209.124.80.2]) by above.proper.com (8.12.8p1/8.12.8) with ESMTP id h3E9tS1r040221 for ; Mon, 14 Apr 2003 02:55:29 -0700 (PDT) (envelope-from jefsey@jefsey.com) Received: from f15v-6-128.d1.club-internet.fr ([212.195.197.128] helo=mine.jefsey.com) by hosting.altserver.com with esmtp (Exim 3.36 #1) id 1950gN-0005dm-00 for idn-reg-policy@imc.org; Mon, 14 Apr 2003 02:55:23 -0700 Message-Id: <5.2.0.9.0.20030414113923.00a73300@mail.jefsey.com> X-Sender: jefsey+jefsey.com@mail.jefsey.com X-Mailer: QUALCOMM Windows Eudora Version 5.2.0.9 Date: Mon, 14 Apr 2003 12:01:53 +0200 To: idn-reg-policy@imc.org From: "JFC (Jefsey) Morfin" Subject: Re: Comparison of hoffman-idn-reg and jseng-idn-admin In-Reply-To: <00f101c30232$e1716030$91784b0a@JSENGTOSHIBA> References: <4.2.0.58.J.20030402150335.035b1fc8@localhost> <20030403082051.GA6764@nic.fr> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed X-AntiAbuse: This header was added to track abuse, please include it with any abuse report X-AntiAbuse: Primary Hostname - hosting.altserver.com X-AntiAbuse: Original Domain - imc.org X-AntiAbuse: Originator/Caller UID/GID - [0 0] / [0 0] X-AntiAbuse: Sender Address Domain - jefsey.com Sender: owner-idn-reg-policy@mail.imc.org Precedence: bulk List-Archive: List-Unsubscribe: List-ID: At 05:06 14/04/03, James Seng wrote: >I believe in a mo