lucene - What's a recommended strategy to replace ord() and rord() function queries in Solr?


I'm using the rord() function in Solr queries in order to boost query results against a "rank" field, using syntax like this:

bf=rord(cur_rank)^1.8 

The algorithm works well, but recent changes in Solr indicate that using ord() and rord() is now a memory hog. From the changelog:

Searching and sorting is now done on a per-segment basis, meaning that the FieldCache entries used for sorting and for function queries are created and used per-segment and can be reused for segments that don't change between index updates. While generally beneficial, this can lead to increased memory usage over 1.3 in certain scenarios:

[...]

2) Certain function queries such as ord() and rord() require a top level FieldCache instance and can thus lead to increased memory usage. Consider replacing ord() and rord() with alternatives, such as function queries based on ms() for date boosting.

It mentions possible strategies for handling date-based boosting, but how about a numeric "rank" field, where the rank is a number between 1 and the total number of records?

rord() seems ideal... Are there any other strategies?

The point of using segment-based field caches is to reduce load time. If you want the value of a field after having added a new segment (which is done every time you commit), you only have to load a new field cache for the newly added segment.

This is not possible with ord and rord, because they give the ordinal across the whole index instead of the value for a single document.
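To see why an ordinal cannot be cached per segment, here is a small Python sketch. It is not Solr code: the rank values and the `rord` helper are made up for illustration, and the helper only mimics rord() as a position in the reverse-sorted list of all distinct values in the index. Adding a segment shifts the ordinal of a document that did not change, while its field value stays the same.

```python
def rord(values):
    """Map each distinct value to its 1-based position in descending order.

    A simplified stand-in for Solr's rord(), for illustration only.
    """
    ordered = sorted(set(values), reverse=True)
    return {v: i + 1 for i, v in enumerate(ordered)}


segment_a = [10, 20, 30]  # hypothetical cur_rank values already indexed
segment_b = [25, 40]      # hypothetical values added by a later commit

before = rord(segment_a)
after = rord(segment_a + segment_b)

# The document with value 20 did not change, but its reverse ordinal did.
print(before[20], after[20])  # prints: 2 4
```

Because the ordinal depends on every value in the index, it has to be recomputed at the top level after each commit, which is what the changelog warns about.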

So the solution is to compute the boost based on the value of the field "cur_rank" instead of its ord.

This is how date boosting works now: it used to use the rord of the date field in order to compute the boost, whereas it now uses the number of milliseconds between the value of the date field and now. See http://wiki.apache.org/solr/solrrelevancyfaq ("How can I boost the score of newer documents") for more details.
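As a rough sketch of what a value-based boost could look like for the rank case: Solr's recip(x,m,a,b) function computes a/(m*x+b), and it is the same function the FAQ entry combines with ms() for dates. The Python below only evaluates that formula and builds a bf string. The constants (m=1, a=100, b=100), the weight, and the helper name `rank_boost_param` are arbitrary assumptions chosen for illustration, and they would need tuning against real data.

```python
def recip(x, m, a, b):
    """Same formula as Solr's recip(x,m,a,b): a / (m*x + b)."""
    return a / (m * x + b)


def rank_boost_param(field="cur_rank", m=1, a=100, b=100, weight=1.8):
    """Build a bf parameter that boosts on the field value instead of rord().

    Hypothetical helper; the default constants are illustrative only.
    """
    return f"bf=recip({field},{m},{a},{b})^{weight}"


# A small rank value gets the largest boost; the boost decays as rank grows.
for rank in (1, 100, 10000):
    print(rank, round(recip(rank, 1, 100, 100), 4))

print(rank_boost_param())  # prints: bf=recip(cur_rank,1,100,100)^1.8
```

Like rord(cur_rank), this gives documents with a smaller cur_rank a larger boost, but it only needs each document's own field value, so the per-segment field caches can be reused.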

