SOLR-16667: LTR Add feature vector caching for ranking #3433
alessandrobenedetti
left a comment
There are some changes to make and some points to discuss; then I'll review the tests!
solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java
solr/modules/ltr/src/java/org/apache/solr/ltr/LTRScoringQuery.java
solr/modules/ltr/src/java/org/apache/solr/ltr/LTRScoringQuery.java
solr/modules/ltr/src/java/org/apache/solr/ltr/LTRScoringQuery.java
.../ltr/src/java/org/apache/solr/ltr/response/transform/LTRFeatureLoggerTransformerFactory.java
.../ltr/src/java/org/apache/solr/ltr/response/transform/LTRFeatureLoggerTransformerFactory.java
solr/modules/ltr/src/test-files/solr/collection1/conf/solrconfig-ltr.xml
solr/modules/ltr/src/test/org/apache/solr/ltr/TestFeatureVectorCache.java
solr/modules/ltr/src/test/org/apache/solr/ltr/TestFeatureVectorCache.java
solr/modules/ltr/src/test/org/apache/solr/ltr/TestFeatureVectorCache.java
We have finished the review iterations, and the code is open to further suggestions and revisions. In the meantime, I am running a benchmark to report the performance of the new contribution compared to the current implementation.
solr/solr-ref-guide/modules/query-guide/pages/learning-to-rank.adoc
alessandrobenedetti
left a comment
Aside from a few minor points, we are ready to merge; we are just waiting for main to be stable.
…e/SOLR-16667 # Conflicts: # solr/CHANGES.txt
by Anna and Alessandro (cherry picked from commit aeb9063)
As mentioned, here are the results of the benchmark. We compared the current main branch, which contains the existing FV_cache for LTR, with our contributed cache. In both setups we also ran a test with all caches set to zero size, to make sure query execution is not slowed down when no caches are used at all.

The first column is the difference in milliseconds between the first query (no cache hit: a miss, then an insert) and the second query (cache hit). The higher, the better, since a larger difference means the cache does more to reduce query execution time.
The second column is the average time in milliseconds taken to execute the first query (the one that finds no entry: a miss, then an insert).
The third column is the average time in milliseconds taken to execute the second query (the one that finds an entry: a hit).
Finally, the last columns are the cache statistics from the Solr UI.

We executed 10 pairs of queries (the first a miss, the second a cache hit), each retrieving 10000 documents.

All the results are positive. The "miss" query is slightly slower during pure reranking, but query execution time remains comparable. A detailed blog post about this will follow on sease.io
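To make the column definitions above concrete, here is a minimal sketch of how the three timing columns could be derived from per-pair measurements. The class and method names are hypothetical and not part of the PR, and the timings in `main` are made-up illustration values, not benchmark results.

```java
public class CacheBenchmarkSummary {

    /** Arithmetic mean of the given timings (milliseconds). */
    static double mean(double[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("at least one timing is required");
        }
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    /**
     * First column: average time saved by the cache, i.e. the average
     * "miss" query time minus the average "hit" query time.
     */
    static double averageGain(double[] missMillis, double[] hitMillis) {
        if (missMillis.length != hitMillis.length) {
            throw new IllegalArgumentException("timings must come in miss/hit pairs");
        }
        return mean(missMillis) - mean(hitMillis);
    }

    public static void main(String[] args) {
        // Hypothetical timings for two query pairs; NOT data from the benchmark.
        double[] miss = {120.0, 100.0};
        double[] hit = {40.0, 60.0};

        System.out.println("difference (ms): " + averageGain(miss, hit));
        System.out.println("avg miss (ms):   " + mean(miss));
        System.out.println("avg hit (ms):    " + mean(hit));
    }
}
```

A larger difference means the hit query saves more time relative to the miss query, which is why the first column reads "the higher, the better".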

https://issues.apache.org/jira/browse/SOLR-16667
Description
The current definition and usage of the QUERY_DOC_FV feature cache has been modified to support both reranking and logging.
Solution
Tests
Tests have been added in the solr/modules/ltr/src/test/org/apache/solr/ltr/TestFeatureVectorCache.java file.
The tests check for correct cache usage and correct responses in different scenarios, covering the LTR parameters logAll, store, efis... and their defaults.
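As an illustration of the kind of request these scenarios vary, the sketch below assembles the parameters for a query that both reranks with an LTR model and logs feature vectors. It assumes the `rq={!ltr ...}` rerank parser and the `[features ...]` document transformer described in the Solr LTR documentation; the class name, model name, store name, and the `user_query` external feature are hypothetical, and this is not code from the PR's tests.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LtrRequestSketch {

    /**
     * Builds request parameters that combine reranking (rq) with feature
     * logging (the [features] transformer in fl). All names passed in are
     * illustrative assumptions.
     */
    static Map<String, String> buildParams(
            String model, String store, boolean logAll, String userQuery) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", "*:*");
        // Reranking: score the top documents with the given LTR model.
        params.put("rq",
            "{!ltr model=" + model + " reRankDocs=100 efi.user_query='" + userQuery + "'}");
        // Logging: return the feature vector for each document in the response.
        params.put("fl",
            "id,score,[features store=" + store
                + " logAll=" + logAll
                + " efi.user_query='" + userQuery + "']");
        return params;
    }

    public static void main(String[] args) {
        buildParams("myModel", "myStore", true, "w1")
            .forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

Changing `store`, `logAll`, or the `efi.*` values between otherwise identical requests is the sort of variation the scenarios above describe.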