This comes down to the exact eDisMax query we use. As a general rule, tweaking any part of this expression results in some queries improving while others get worse.

    params = {
        "query": {
            "edismax": {
                "query": query,
                # qf = query fields, i.e. how should we boost these fields if they contain the same fields as the input.
                # https://solr.apache.org/guide/solr/latest/query-guide/dismax-query-parser.html#qf-query-fields-parameter
                "qf": "preferred_name_exactish^250 names_exactish^100 preferred_name^25 names^10",
                # pf = phrase fields, i.e. how should we boost these fields if they contain the entire search phrase.
                # https://solr.apache.org/guide/solr/latest/query-guide/dismax-query-parser.html#pf-phrase-fields-parameter
                "pf": "preferred_name_exactish^300 names_exactish^200 preferred_name^30 names^20",
                # Boosts
                "bq": [],
                "boost": [
                    # The boost is multiplied with score -- calculating the log() reduces how quickly this increases
                    # the score for increasing clique identifier counts.
                    "log(sum(clique_identifier_count, 1))"
                ],
            },
        },
        "sort": "score DESC, clique_identifier_count DESC, curie_suffix ASC",
        "limit": limit,
        "offset": offset,
        "filter": filters,
        "fields": "*, score",
        "params": inner_params,
    }

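To make the effect of the boost concrete, here is a minimal sketch of the multiplier that `log(sum(clique_identifier_count, 1))` would produce for a few counts. It assumes Solr's `log` function is base 10 (with `ln` being the natural log); the helper name `clique_boost` and the sample counts are made up for illustration and are not part of server.py.

```python
import math


def clique_boost(clique_identifier_count: int) -> float:
    """Multiplier corresponding to log(sum(clique_identifier_count, 1)).

    Assumption: Solr's `log` function query is base 10.
    """
    return math.log10(clique_identifier_count + 1)


# Hypothetical counts, chosen only to show how slowly the multiplier grows.
for count in (0, 1, 9, 99, 999):
    print(f"count={count:>4}  multiplier={clique_boost(count):.3f}")
```

Under these assumptions, going from 9 to 999 identifiers only raises the multiplier from 1 to 3, which matches the intent stated in the code comment. Note that in this sketch a count of 0 gives a multiplier of 0.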
This might work very differently if we switched to Elasticsearch (#182).
The params definition quoted above comes from NameResolution/api/server.py (lines 454 to 479 in 8e6e889).
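Since tweaking the query tends to help some searches and hurt others, one way to see the trade-off is to compare rankings for a fixed set of queries before and after a change. The sketch below is only an illustration of that idea: the function names, the query strings, and the `EX:` CURIEs are all hypothetical, and in practice the `before` and `after` lists would come from running the old and new parameters against the index.

```python
def reciprocal_rank(ranked_curies, expected_curie):
    """Return 1/rank of expected_curie in ranked_curies, or 0.0 if it is absent."""
    for rank, curie in enumerate(ranked_curies, start=1):
        if curie == expected_curie:
            return 1.0 / rank
    return 0.0


def compare_rankings(expected, before, after):
    """Per-query change in reciprocal rank (positive means the query improved)."""
    return {
        query: reciprocal_rank(after.get(query, []), curie)
        - reciprocal_rank(before.get(query, []), curie)
        for query, curie in expected.items()
    }


# Hypothetical test set: query -> CURIE we expect to rank first.
expected = {"aspirin": "EX:1", "diabetes": "EX:2"}
# Hypothetical ranked results with the old and the tweaked parameters.
before = {"aspirin": ["EX:9", "EX:1"], "diabetes": ["EX:2", "EX:8"]}
after = {"aspirin": ["EX:1", "EX:9"], "diabetes": ["EX:8", "EX:2"]}

deltas = compare_rankings(expected, before, after)
for query, delta in deltas.items():
    print(f"{query}: {delta:+.2f}")
```

In this made-up example one query improves and the other gets worse by the same amount, which is the pattern described above.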