NitriteCollection.update slow when unrelated properties in document are indexed #902

@chris9182

We found that calling:

elementCollection.update(
        FluentFilter.where(ATTRIBUTE_PROPERTY_ID).eq(Long.toString(index)),
        Document.createDocument().put(somePropertyName, someValue));

is very slow when the database is large (more than 200,000 entries). Several properties in this collection are indexed, but not the one changed by the update (so there is no index on somePropertyName in the example above).

Pausing in the debugger at a random moment during the update shows the following stack trace:

String.intern() line: not available [native method]
ObjectStreamField.<init>(String, String, boolean) line: 109
ObjectStreamClass.readNonProxy(ObjectInputStream) line: 714
ObjectInputStream.readClassDescriptor() line: 988
ObjectInputStream.readNonProxyDesc(boolean) line: 2034
ObjectInputStream.readClassDesc(boolean) line: 1909
ObjectInputStream.readOrdinaryObject(boolean) line: 2235
ObjectInputStream.readObject0(Class, boolean) line: 1744
ObjectInputStream.readObject(Class) line: 514
ObjectInputStream.readObject() line: 472
ObjectDataType.deserialize(byte[]) line: 377
ObjectDataType$SerializedObjectType.read(ByteBuffer, int) line: 1612
ObjectDataType.read(ByteBuffer) line: 256
ObjectDataType(BasicDataType).read(ByteBuffer, Object, int) line: 74
Page$Leaf<K,V>(Page<K,V>).read(ByteBuffer) line: 657
Page<K,V>.read(ByteBuffer, long, MVMap<K,V>) line: 262
SingleFileStore(FileStore).readPage(MVMap<K,V>, long) line: 1968
MVStore.readPage(MVMap<K,V>, long) line: 1021
MVMap<K,V>.readPage(long) line: 632
Page$NonLeaf<K,V>.getChildPage(int) line: 1117
Cursor<K,V>.hasNext() line: 64
MVMap$2$1.hasNext() line: 745
NitriteMVMap$1.hasNext() line: 141
IndexOperations.buildIndexInternal(IndexDescriptor, boolean) line: 188
IndexOperations.buildIndex(IndexDescriptor, boolean) line: 76
DocumentIndexWriter.removeIndexEntryInternal(IndexDescriptor, Document, NitriteIndexer) line: 105
DocumentIndexWriter.updateIndexEntry(Document, Document) line: 74
WriteOperations.update(Filter, Document, UpdateOptions) line: 178
CollectionOperations.update(Filter, Document, UpdateOptions) line: 102
DefaultNitriteCollection.update(Filter, Document, UpdateOptions) line: 125
DefaultNitriteCollection(NitriteCollection).update(Filter, Document) line: 131
BackendCollectionWrapper.update(Filter, Document) line: 78
... our code...

Our guess is that the update method does not take into account which properties were modified (those set via the .put method in the example above) and instead updates the indices for every indexed property present in the document retrieved by the filter. Our investigation leads us to believe that there are two possible solutions:

Either the modified properties are taken into account starting from org.dizitart.no2.collection.operation.WriteOperations.update(Filter, Document, UpdateOptions), line 129, where they are available. The call documentIndexWriter.updateIndexEntry(oldDocument, processed); could be passed those properties so that unmodified ones are skipped.

Or, in org.dizitart.no2.collection.operation.DocumentIndexWriter.updateIndexEntry(Document, Document), only the properties that differ between oldDocument and newDocument are indexed anew. This would have more overhead than the first suggestion.
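
To make the two suggestions concrete, here is a minimal sketch of the field selection each one implies. It is not Nitrite code: documents are modelled as plain Map<String, Object> instances with flat field names, and the class IndexedFieldSelector and both of its methods are hypothetical names chosen for illustration. How the result would be wired into updateIndexEntry is left open.

```java
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

// Hypothetical helper, not part of Nitrite. Documents are modelled as flat maps.
final class IndexedFieldSelector {
    private IndexedFieldSelector() {}

    // Suggestion 1: only the fields named in the update document can have
    // changed, so intersect its keys with the set of indexed fields.
    static Set<String> fromUpdate(Map<String, Object> update, Set<String> indexedFields) {
        Set<String> touched = new LinkedHashSet<>(indexedFields);
        touched.retainAll(update.keySet());
        return touched;
    }

    // Suggestion 2: compare old and new document and keep the indexed fields
    // whose values differ. A missing field and a null value are treated alike.
    static Set<String> fromDiff(Map<String, Object> oldDoc,
                                Map<String, Object> newDoc,
                                Set<String> indexedFields) {
        Set<String> changed = new LinkedHashSet<>();
        for (String field : indexedFields) {
            if (!Objects.equals(oldDoc.get(field), newDoc.get(field))) {
                changed.add(field);
            }
        }
        return changed;
    }
}
```

In our case both methods would return an empty set, because the updated property is not indexed, so no index would need to be touched at all. The first variant never reads the old values; the second compares every indexed field, which is the extra overhead mentioned above.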

This slowdown is currently a serious problem for us, because updating properties in documents is essential to our workflow.
