Discussion:
Question About FST, multiple-column index
ly铖
2018-09-20 14:56:23 UTC
Hi,


I am using Lucene as a full-text search engine and have a question about multi-field indexes. For example, suppose we have two fields, user and age, and we always search for one user whose name is "xxx" and whose age matches exactly. So we add both fields to Lucene (there may be better ways; I just want to explain my question). In this case the result set for user is small, while the result set for age is much larger. Even though Lucene uses a leading query to reduce the query result bitsets, I wonder whether there is a combined index structure like multiple-column indexes in MySQL. Is there any way to extend the FST so that the final state of one FST connects to another FST?


THANKS
Mikhail Khludnev
2018-09-21 07:40:06 UTC
No way, and this is the point. To have a combined index you need to combine
the fields by concatenating their terms. It will be faster, but it brings many
other hurdles. Do you think that this is the real problem? What is the search
time now, and how exactly do you search?
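
To make the "concatenating terms" idea concrete, here is a minimal sketch in plain Java. It is a toy model, not Lucene code: the class name CompositeTermIndex, the NUL separator, and the in-memory map are all assumptions chosen for illustration. Each document is stored under a single combined (user, age) term, so the two-field query becomes one lookup instead of an intersection.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy inverted index keyed by a concatenated (user, age) term. Not Lucene code. */
final class CompositeTermIndex {
    // Hypothetical separator, assumed never to occur inside a user name.
    private static final char SEP = '\0';

    // Combined term -> doc IDs that carry exactly that (user, age) pair.
    private final Map<String, List<Integer>> postings = new HashMap<>();

    /** Builds the combined term: user, then the separator, then the age. */
    static String compositeTerm(String user, int age) {
        return user + SEP + age;
    }

    /** Indexes one document under its single combined term. */
    void add(int docId, String user, int age) {
        postings.computeIfAbsent(compositeTerm(user, age), k -> new ArrayList<>())
                .add(docId);
    }

    /** One map lookup answers "user AND age"; no intersection is needed. */
    List<Integer> search(String user, int age) {
        return postings.getOrDefault(compositeTerm(user, age), Collections.emptyList());
    }
}
```

One of the hurdles shows up directly in this sketch: search only works when both values are supplied exactly, so a query by age alone would still need a separate index.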
--
Sincerely yours
Mikhail Khludnev
Michael McCandless
2018-09-23 03:25:34 UTC
You might want to index the name field normally (as StringField, for
example), then index the age as a NumericDocValuesField, and then make a
BooleanQuery with two required clauses, one clause TermQuery on the name,
the other a NumericDocValuesField.newSlowExactQuery. Even though its name
is "slow", it can be very fast for cases like what you are doing, where you
expect very few matches by name, and many many matches with exactly a
specific age.
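
A rough way to see why this can be fast is the toy model below, in plain Java. It is not Lucene's implementation: the class LeadingClauseDemo, the int[] posting list, and the long[] age column are assumptions standing in for the TermQuery matches and the doc values. The sparse name clause drives the iteration, and the age is only looked up for the few documents it yields.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a two-clause conjunction led by the sparse clause. Not Lucene code. */
final class LeadingClauseDemo {
    /**
     * @param namePostings sorted doc IDs matching the name term (expected to be few)
     * @param ageByDoc     per-document age indexed by doc ID (stands in for doc values)
     * @param age          the exact age required
     * @return doc IDs that match both the name and the age
     */
    static List<Integer> search(int[] namePostings, long[] ageByDoc, long age) {
        List<Integer> hits = new ArrayList<>();
        // Only documents that already matched the name are visited.
        for (int doc : namePostings) {
            // Per-document check: one column read, no posting list for the age.
            if (ageByDoc[doc] == age) {
                hits.add(doc);
            }
        }
        return hits;
    }
}
```

In this model the work grows with the number of name matches, not with the number of documents that share the age, which is the situation described above.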

This is assuming you want precise (including case) matching on the name; if
you do not, then index the name as a TextField, and analyze the search
terms at query time using a query parser.

Mike McCandless

http://blog.mikemccandless.com