Discussion:
regarding comparing texts using Lucene
Veda G M
2018-09-19 04:56:04 UTC
Hello,

Is it possible to compare large chunks of text and get the similarity
score/percentage using Lucene?

For example, say we have 2-3 paragraphs of text and need to find out whether
any document matches it semantically, and how similar the returned hit is to
the search text, expressed as a percentage.

Could you please let me know if this is possible with Lucene?

Thanks.

Regards,
Veda
Adrien Grand
2018-09-19 13:37:21 UTC
Hi Veda,

Lucene doesn't provide such functionality out of the box, but you could use
MoreLikeThis (
https://lucene.apache.org/core/7_4_0/queries/org/apache/lucene/queries/mlt/MoreLikeThis.html)
to search for similar documents and then compute a finer-grained similarity
score on the client side. This would avoid having to compute a similarity
score against every document in your collection.
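
As a rough illustration of the client-side step, here is a minimal Java sketch of one possible finer-grained score: cosine similarity over raw term-frequency vectors, reported as a percentage. This is only an assumption about what such a score could look like, not something Lucene provides or prescribes. The class and method names (TextSimilarity, termFrequencies, cosine, percent) are hypothetical, the tokenizer is a naive split on non-alphanumeric characters, and the code uses only the JDK. It measures term overlap rather than meaning, so it would be applied only to the handful of candidates that a MoreLikeThis query returns.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper: cosine similarity over raw term-frequency vectors. */
final class TextSimilarity {

    private TextSimilarity() {}

    /** Lowercases the text, splits on non-letter/non-digit runs, and counts terms. */
    static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Returns a value in [0, 1]; 0 if either text contains no terms. */
    static double cosine(String a, String b) {
        Map<String, Integer> fa = termFrequencies(a);
        Map<String, Integer> fb = termFrequencies(b);
        if (fa.isEmpty() || fb.isEmpty()) {
            return 0.0;
        }
        double dot = 0.0;
        for (Map.Entry<String, Integer> e : fa.entrySet()) {
            Integer other = fb.get(e.getKey());
            if (other != null) {
                dot += (double) e.getValue() * other;
            }
        }
        // Clamp to 1.0 to absorb floating-point rounding on identical texts.
        return Math.min(1.0, dot / (norm(fa) * norm(fb)));
    }

    /** Euclidean length of a term-frequency vector. */
    private static double norm(Map<String, Integer> freqs) {
        double sum = 0.0;
        for (int c : freqs.values()) {
            sum += (double) c * c;
        }
        return Math.sqrt(sum);
    }

    /** Convenience wrapper that expresses the cosine score as a percentage. */
    static double percent(String a, String b) {
        return 100.0 * cosine(a, b);
    }
}
```

In use, one would call TextSimilarity.percent(queryText, candidateText) for each candidate's stored text and keep the highest-scoring hits. A real setup would more likely reuse the same Analyzer that built the index instead of the naive split shown here, so that both sides are tokenized consistently.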