| Author | Message |
|---|---|
|
Posted on: 23. 05. 2005 [15:33]
|
|
|
fgomez@criba.edu.ar
Fernando J. Gómez
Thread author
registered since: 31.12.1969
Posts: 0
|
Hello, maleteros.

Regarding collation, I'd like to make an observation and ask a question.

The observation is about the digraphs "ch" and "ll", which since 1994 no longer have a special status when sorting Spanish words. An "official" document stating this is the Ortografía de la Lengua Española (1999), pp. 1-2, available at http://www.rae.es/ (follow the link "Ortografía"). In fact, the Real Academia Española agreed in 1994, at the request of various international organizations, to return to its traditional (i.e., pre-1803) Latin alphabetical order: cg < ch < ci, lk < ll < lm. So, though I can see that these particular digraphs are useful to show the power of collation as implemented in Malete, IMHO they should be avoided in the examples, so that we don't unintentionally contribute to the perpetuation of an obsolete sorting habit.

Now the question. The .m0d file allows A(lias) and M(aps). It seems that M is the preferred option when dealing with expansions (ä => ae). But in the case of single accented letters (e.g. á => a, Á => a), both A and M can be used, as can be seen in the examples provided in CharSet.txt and the test/*.m0d files. Is there any reason to choose one method over the other?

Regards!

--
Fernando J Gómez
Biblioteca Dr. Antonio Monteiro
Instituto de Matemática de Bahía Blanca
Conicet / Universidad Nacional del Sur
Av. Alem 1253
B8000CPB Bahía Blanca - Argentina
Tel. (54 291) 459 5116

Posted to Phorum via PhorumMail |
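To make the difference between the two orderings concrete (cg < ch < ci versus "ch" as a separate letter after "c"), here is a minimal Python sketch. It is not Malete code and does not use the .m0d format; the function names `latin_key` and `digraph_key` and the word list are made up for illustration, and accented letters and "ñ" are deliberately left out.

```python
def latin_key(word):
    # Post-1994 order: "ch" and "ll" are just two ordinary letters each.
    return [(ch, 0) for ch in word.lower()]


def digraph_key(word):
    # Obsolete habit: "ch" is its own letter right after "c",
    # and "ll" its own letter right after "l".
    key = []
    w = word.lower()
    i = 0
    while i < len(w):
        pair = w[i:i + 2]
        if pair in ("ch", "ll"):
            # Rank 1 places the digraph after every plain "c" / "l".
            key.append((pair[0], 1))
            i += 2
        else:
            key.append((w[i], 0))
            i += 1
    return key


words = ["cima", "llama", "chico", "luz", "cola", "lobo"]

print(sorted(words, key=latin_key))
# ['chico', 'cima', 'cola', 'llama', 'lobo', 'luz']
print(sorted(words, key=digraph_key))
# ['cima', 'cola', 'chico', 'lobo', 'luz', 'llama']
```

With the plain Latin key, "chico" lands between "cg..." and "ci..." words, as the 1994 rule requires; with the digraph key it moves behind every other "c" word.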
|
Posted on: 24. 05. 2005 [08:06]
|
|
|
paul@malete.org
Klaus Ripke
registered since: 31.12.1969
Posts: 0
|
On Mon, May 23, 2005 at 05:33:03PM -0300, Fernando Gomez wrote:

> In fact, the Real Academia Española agreed in 1994 --at the request of

uhhh! ten years late! too sad!

> avoided in the examples, so that we don't unintentionally contribute to

then I prolly have to resort to de_phonebook as the last really weird collation found in the wild ... of course, the French do better, but we are not yet ready for their reverse second-level sorting.

> Now the question. The .m0d file allows A(lias) and M(aps). It seems that
> M is the preferred option when dealing with expansions (ä => ae). But in
> the case of single accented letters (e.g. á => a, Á => a), both A and M
> can be used, as can be seen in the examples provided on CharSet.txt and
> the test/*.m0d files. Is there any reason to choose one or the other method?

Aliases are preferable where sufficient. They are a little bit cheaper, as they directly use one code point, whereas mappings are inspected in turn (not recursively, exactly 1 level) to produce zero, one or multiple codes.

regards |
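The distinction drawn in the reply can be sketched in a few lines of Python. This is only one reading of it, not Malete's implementation or its .m0d syntax: the tables `ALIASES` and `MAPPINGS`, the function `fold`, and the soft-hyphen entry are assumptions made up for illustration.

```python
# Alias: one code point stands in directly for exactly one other code point.
ALIASES = {"á": "a", "Á": "a"}

# Mapping: one code point produces zero, one or several codes.
MAPPINGS = {
    "ä": "ae",      # expansion to two codes (from the thread)
    "\u00ad": "",   # soft hyphen dropped: zero codes (hypothetical entry)
}


def fold(text):
    out = []
    for ch in text:
        if ch in ALIASES:
            # Cheap path: a single direct substitution.
            out.append(ALIASES[ch])
        elif ch in MAPPINGS:
            # Exactly one level: the replacement is emitted as-is
            # and is not looked up again.
            out.append(MAPPINGS[ch])
        else:
            out.append(ch)
    return "".join(out)


print(fold("Ábaco"))  # abaco
print(fold("Bär"))    # Baer
```

For á => a either table would give the same result; the alias is simply the lighter mechanism when a single replacement code point is enough.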