Rob Audenaerde
2018-09-28 12:40:41 UTC
Hi all,
We build an FST on the terms of our index by iterating the terms of the
readers for our fields, like this:
    for (final LeafReaderContext ctx : leaves) {
        final LeafReader leafReader = ctx.reader();
        for (final String indexField : indexFields) {
            final Terms terms = leafReader.terms(indexField);
            // If the field does not exist in this reader, then we get null,
            // so check for that.
            if (terms != null) {
                final TermsEnum termsEnum = terms.iterator();
However, building the FST sometimes seems to find terms that come from
deleted documents. Judging by the javadocs, this is expected behaviour.
So we switched the IndexWriter to a config with a TieredMergePolicy on
which we call setForceMergeDeletesPctAllowed(0).
After calling indexWriter.forceMergeDeletes(true) we expect that no deleted
documents remain. However, terms from deleted documents still sometimes
appear. We use DirectoryReader.openIfChanged() to refresh the reader before
iterating the terms.
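For reference, a minimal sketch of the sequence described above, not our exact code. The names TermCollector, mergePolicy, purgeAndReopen and collectTerms are made up for illustration, and the terms are collected into a sorted set here instead of being fed into the FST. The sketch also assumes a commit between forceMergeDeletes and the refresh, since openIfChanged(reader) without a writer argument only sees committed changes.

```java
import java.io.IOException;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.LeafReader;
import org.apache.lucene.index.LeafReaderContext;
import org.apache.lucene.index.Terms;
import org.apache.lucene.index.TermsEnum;
import org.apache.lucene.index.TieredMergePolicy;
import org.apache.lucene.util.BytesRef;

// Hypothetical helper class, for illustration only.
public final class TermCollector {

    // Merge policy as described: with an allowed percentage of 0, any segment
    // that contains deletes is eligible when forceMergeDeletes is called.
    static TieredMergePolicy mergePolicy() {
        TieredMergePolicy policy = new TieredMergePolicy();
        policy.setForceMergeDeletesPctAllowed(0.0);
        return policy;
    }

    // Merge away deletes, commit (an assumption of this sketch), then refresh.
    // Returns the new reader, or the old one if nothing changed.
    static DirectoryReader purgeAndReopen(IndexWriter writer, DirectoryReader reader)
            throws IOException {
        writer.forceMergeDeletes(true); // true = wait for the merges to finish
        writer.commit();
        DirectoryReader changed = DirectoryReader.openIfChanged(reader);
        if (changed == null) {
            return reader; // openIfChanged returns null when the index is unchanged
        }
        reader.close();
        return changed;
    }

    // Iterate the terms of every leaf for the given fields.
    static SortedSet<String> collectTerms(DirectoryReader reader, List<String> indexFields)
            throws IOException {
        SortedSet<String> result = new TreeSet<>();
        for (LeafReaderContext ctx : reader.leaves()) {
            LeafReader leafReader = ctx.reader();
            for (String indexField : indexFields) {
                Terms terms = leafReader.terms(indexField);
                if (terms == null) {
                    continue; // field does not exist in this leaf
                }
                TermsEnum termsEnum = terms.iterator();
                BytesRef term;
                while ((term = termsEnum.next()) != null) {
                    result.add(term.utf8ToString());
                }
            }
        }
        return result;
    }
}
```

The merge policy would be installed with IndexWriterConfig.setMergePolicy(TermCollector.mergePolicy()) before the writer is opened.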
Are we forgetting something?
Thanks in advance.
Rob Audenaerde