Solsoft

Forums

[php-gni] Inverted terms


Author Message
Posted on: 31.05.2005 [14:57]
israel@FEE.TCHE.BR
Israel Jose Cefrin da Silva
Thread author
registered since: 31.12.1969
Posts: 0
Hi all

I have a site built on the OpenIsis/GNI search engine.

When I search for the term 'economia*informal',
it retrieves 5 records.

But if I run the same search as 'informal*economia',
it retrieves 57 records.

What could be wrong with my search? Is it something in my FST index?


You can test my search at this URL:
http://www.bibvirtual.rs.gov.br:8080/pg_pesquisa.php
(be sure to tick the 'FEE' database)

- Search with "economia*informal":
http://www.bibvirtual.rs.gov.br:8080/pg_pesquisa_resultado.php?termo=economia*informal&tipodedocumento=%24&ano=&operador=*&campo=&from=0&pagina=1&base%5B%5D=FEE&from=0&enviar=Pesquisar

- Search with "informal*economia":
http://www.bibvirtual.rs.gov.br:8080/pg_pesquisa_resultado.php?termo=informal*economia&tipodedocumento=%24&ano=&operador=*&campo=&from=0&pagina=1&base%5B%5D=FEE&from=0&enviar=Pesquisar


regards
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
 Israel Cefrin
 web design technician
    israel @ fee.tche[dot]br
    msn:isra_rs@hotmail.com
    icq:74378983
    +55 51 3216 9084 - work
    +55 51 8421 7888 - cel
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

------------------------------------------
Posted to Phorum via PhorumMail
Posted on: 31.05.2005 [17:57]
paul@malete.org
Klaus Ripke
registered since: 31.12.1969
Posts: 0
On Tue, May 31, 2005 at 03:57:28PM -0300, Israel Cefrin wrote:
> Hi all
>
> I have a site built on the OpenIsis/GNI search engine.
>
> When I search for the term 'economia*informal',
> it retrieves 5 records.
>
> But if I run the same search as 'informal*economia',
> it retrieves 57 records.
>
> What could be wrong with my search? Is it something in my FST index?
Nah.

This is probably expected (and even documented!) behaviour.

It's an old story: the infamous OPENISIS_SETLEN.
Mea culpa, I should have shipped a default with one or two more zeros.

For the record, the story is:
OPENISIS_SETLEN determines the number of *hits*, not records,
that are kept in the internal query buffer.
So, if every record containing economia contains it five times
(which is likely for such a word, since those PowerPoint freaks
from the economics department bet their wages on repetition),
only the first 200 records fit into the buffer of 1000 hits.
The second term works completely differently:
it filters the buffer (checking for informal in those 200 or so records).

The other way round, checking informal first, you will find
many more records within the 1000 hits, from which
economia is then filtered.
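
The order dependence described above can be reproduced with a toy model. This is only a sketch under stated assumptions, not OpenIsis code: the functions `make_postings` and `and_query` are hypothetical, the collection is made up, and the only figures taken from the explanation are the cap of 1000 hits and five occurrences per record.

```python
SETLEN = 1000  # assumed buffer capacity, counted in hits (the figure quoted above)


def make_postings(record_ids, hits_per_record):
    """Return one posting per occurrence, so a record with 5 occurrences yields 5 hits."""
    return [rid for rid in record_ids for _ in range(hits_per_record)]


def and_query(first_postings, second_postings, setlen=SETLEN):
    """Toy AND: buffer the first term's hits (capped at setlen), then filter by the second term."""
    buffered = first_postings[:setlen]      # hits past the cap are dropped
    second_records = set(second_postings)   # records that contain the second term
    return sorted({rid for rid in buffered if rid in second_records})


# Made-up collection: 1000 records all mention "economia" five times;
# every tenth record mentions "informal" once.
economia = make_postings(range(1000), hits_per_record=5)         # 5000 hits
informal = make_postings(range(0, 1000, 10), hits_per_record=1)  # 100 hits

print(len(and_query(economia, informal)))                 # economia*informal
print(len(and_query(informal, economia)))                 # informal*economia
print(len(and_query(economia, informal, setlen=10_000)))  # same query, larger buffer
```

With these made-up numbers the first query only sees records 0 to 199 and returns 20 matches, the second returns all 100, and raising the cap makes both orders agree.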

Yes, it is documented.

So:
a) recompile with a larger value for OPENISIS_SETLEN
b) educate your users to ask the interesting question (the more selective term) first,
so there is less spam to filter out; it helps a lot anyway!
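
If educating users is not practical, option b) could also be approximated in the front end by reordering the terms before the query is sent. The sketch below is an assumption, not part of php-gni or OpenIsis: `order_terms_rarest_first` and `build_query` are hypothetical helpers, and the hit counts are made-up numbers that would have to come from the index by some means not covered here. Only the `*` operator is taken from the queries in this thread.

```python
from typing import Dict, List


def order_terms_rarest_first(terms: List[str], hit_counts: Dict[str, int]) -> List[str]:
    """Sort AND-ed terms by estimated hit count, fewest first.

    Terms missing from hit_counts are treated as having 0 hits and sort to the front.
    """
    return sorted(terms, key=lambda term: hit_counts.get(term, 0))


def build_query(terms: List[str], hit_counts: Dict[str, int], operator: str = "*") -> str:
    """Join the reordered terms with the AND operator used in the searches above."""
    return operator.join(order_terms_rarest_first(terms, hit_counts))


# Hypothetical estimates: "economia" is far more frequent than "informal".
estimated_hits = {"economia": 5000, "informal": 100}

print(build_query(["economia", "informal"], estimated_hits))
```

Because the rarest term fills the buffer first, it is less likely to hit the cap before the more common term is used as a filter.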


Regards

------------------------------------------
Posted to Phorum via PhorumMail





Copyright © 2003-2009, Solsoft de Costa Rica S.A.