Not sure if this can be considered a bug, but it is certainly a caveat that may slip through testing due to its nature.
Consider the following scenario:
- all documents in the index have a field "numfield" indexed as IntPoint
- in addition, SOME of those documents are also indexed with a SortedNumericDocValuesField using the same "numfield" name
The documents without the DocValues are never matched by queries that involve sorting, so we save some space by omitting the DocValues for those documents.
This works perfectly fine, unless
- the index contains a segment that only contains documents without the DocValues
In this case, running a query that sorts by "numfield" will throw the following exception:
java.lang.IllegalStateException: unexpected docvalues type NONE for field 'numfield' (expected one of [SORTED_NUMERIC, NUMERIC]). Re-index with correct docvalues type.
    at org.apache.lucene.index.DocValues.checkField(DocValues.java:317)
    at org.apache.lucene.index.DocValues.getSortedNumeric(DocValues.java:389)
    at org.apache.lucene.search.SortedNumericSortField$3.getNumericDocValues(SortedNumericSortField.java:159)
    at org.apache.lucene.search.FieldComparator$NumericComparator.doSetNextReader(FieldComparator.java:155)
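To make the failure mode easier to reason about, here is a small toy model of the per-segment check, written without any Lucene dependency. It is not Lucene's code: the class `CheckFieldModel`, the enum `DvType`, and the method `resolve` are hypothetical names, and the branch for a field that is unknown to the segment reflects my reading of how 8.x behaves, so treat that part as an assumption. The point it illustrates is that a segment which knows the field (through its points) but records doc-values type NONE is the case that raises the exception.

```java
import java.util.Collections;
import java.util.Map;

public class CheckFieldModel {

    /** Doc-values kinds named in the exception message (toy subset). */
    public enum DvType { NONE, NUMERIC, SORTED_NUMERIC }

    /**
     * Toy stand-in for the per-segment lookup. The map plays the role of one
     * segment's field metadata: field name -> doc-values type recorded there.
     */
    public static String resolve(Map<String, DvType> segmentFields, String field) {
        DvType actual = segmentFields.get(field);
        if (actual == null) {
            // Assumption: a field the segment has never seen is treated as
            // "no values" rather than as an error.
            return "EMPTY";
        }
        if (actual == DvType.NUMERIC || actual == DvType.SORTED_NUMERIC) {
            return "VALUES";
        }
        // The field exists in this segment (e.g. as points only), but no
        // document in the segment carries doc values for it.
        throw new IllegalStateException("unexpected docvalues type " + actual
                + " for field '" + field
                + "' (expected one of [SORTED_NUMERIC, NUMERIC])."
                + " Re-index with correct docvalues type.");
    }

    public static void main(String[] args) {
        Map<String, DvType> mixedSegment =
                Collections.singletonMap("numfield", DvType.SORTED_NUMERIC);
        Map<String, DvType> pointsOnlySegment =
                Collections.singletonMap("numfield", DvType.NONE);

        System.out.println(resolve(mixedSegment, "numfield"));
        System.out.println(resolve(pointsOnlySegment, "someOtherField"));
        try {
            resolve(pointsOnlySegment, "numfield");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this model, a segment that holds at least one document with the DocValues passes the check for the whole segment, which would explain why the problem only shows up once a segment consists entirely of documents without them.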
I have included a minimal example program that demonstrates the issue. This will
- create an index with two documents, each having "numfield" indexed
- add a DocValuesField "numfield" only for the first document
- force the two documents into separate index segments
- run a query that matches only the first document and sorts by "numfield"
This results in the aforementioned exception.
When removing the following lines from the code:
    if (i == docCount / 2) {
        iw.commit();
    }
both documents get added to the same segment. When the code is re-run so that it creates only a single index segment, the query works fine.
Tested with Lucene 8.3.1 and 8.8.0.
Like I said, this may not be considered a bug. But it has slipped through our testing because the existence of such a DocValues-free segment is such a rare and short-lived event.
We can avoid this issue in the future by using a different field name for the DocValuesField. But for our production systems we have to patch DocValues.checkField() to suppress the IllegalStateException as reindexing is not an option right now.
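For readers without the attachment, the scenario and the field-name workaround can be sketched in one small program. This is not the attached DocValuesTest.java. It is a hedged reconstruction from the steps listed above, assuming lucene-core 8.x on the classpath. The class name `SeparateSortFieldSketch`, the method `indexAndSort`, and the field name "numfield_sort" are made up for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.IntPoint;
import org.apache.lucene.document.SortedNumericDocValuesField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.NoMergePolicy;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Sort;
import org.apache.lucene.search.SortField;
import org.apache.lucene.search.SortedNumericSortField;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

public class SeparateSortFieldSketch {

    /**
     * Indexes two documents into two segments. Both get an IntPoint "numfield";
     * only the first gets doc values, stored under dvField. Then runs a query
     * matching the first document, sorted by dvField, and returns the hit count.
     */
    public static int indexAndSort(String dvField) {
        try (Directory dir = new ByteBuffersDirectory()) {
            IndexWriterConfig cfg = new IndexWriterConfig(new StandardAnalyzer());
            cfg.setMergePolicy(NoMergePolicy.INSTANCE); // keep the segments apart
            try (IndexWriter iw = new IndexWriter(dir, cfg)) {
                int docCount = 2;
                for (int i = 0; i < docCount; i++) {
                    if (i == docCount / 2) {
                        iw.commit(); // earlier documents go into their own segment
                    }
                    Document doc = new Document();
                    doc.add(new IntPoint("numfield", i));
                    if (i == 0) {
                        // Only the first document carries doc values.
                        doc.add(new SortedNumericDocValuesField(dvField, i));
                    }
                    iw.addDocument(doc);
                }
            } // close() commits the second segment
            try (DirectoryReader reader = DirectoryReader.open(dir)) {
                IndexSearcher searcher = new IndexSearcher(reader);
                Sort sort = new Sort(new SortedNumericSortField(dvField, SortField.Type.INT));
                return searcher.search(IntPoint.newExactQuery("numfield", 0), 10, sort)
                        .scoreDocs.length;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical separate name for the doc-values field (the workaround).
        System.out.println("hits with separate field: " + indexAndSort("numfield_sort"));
        try {
            // Same name for points and doc values (the scenario in this report).
            indexAndSort("numfield");
        } catch (IllegalStateException e) {
            System.out.println("same field name: " + e.getMessage());
        }
    }
}
```

On the 8.x versions named in this report, I would expect the call with "numfield" to fail with the IllegalStateException shown above, and the call with the separate field name to return the one matching document, since the second segment then has no record of the sort field at all. From 9.0 on, the stricter consistency checks may reject the mixed "numfield" case earlier, so this sketch is only meant for 8.x.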
>> all documents in the index have a field "numfield" indexed as IntPoint
>> in addition, SOME of those documents are also indexed with a SortedNumericDocValuesField using the same "numfield" name
Thomas Hecker, I am working on #10374, which will ensure that this never happens. That is, if a document has "numfield" indexed as IntPoint, it must also have "numfield" indexed as SortedNumericDocValuesField. In other words, the data structures of each field will be consistent across all the documents of an index.
But this will only apply from version 9.0 onwards. Your point is still valid for 8.x.
Migrated from LUCENE-9755 by Thomas Hecker, updated Feb 11 2021
Attachments: DocValuesTest.java