Commit Graph

131 Commits

Author SHA1 Message Date
Sarah Hoffmann
8bb53c22be raise minimum supported Python version to 3.9 2025-07-19 15:23:17 +02:00
Sarah Hoffmann
8a96e4f802 Merge pull request #3781 from lonvia/partial-address-index-lookup
Reduce number of tokens used for index lookups during search
2025-07-15 10:11:12 +02:00
Sarah Hoffmann
09b5ea097b restrict pre-selection by postcode to country 2025-07-14 14:21:09 +02:00
Sarah Hoffmann
e111257644 restrict name-only address searches early by postcode 2025-07-14 14:21:09 +02:00
Sarah Hoffmann
93ac1023f7 restrict name-only search more 2025-07-14 14:21:09 +02:00
Sarah Hoffmann
1fe2353682 restrict postcode distance computation to within country 2025-07-14 14:21:09 +02:00
Sarah Hoffmann
6d2b79870c only use most infrequent tokens for search index lookup 2025-07-14 14:18:22 +02:00
Sarah Hoffmann
621d8e785b Merge pull request #3779 from lonvia/fix-zero-devision-direction
Fix direction factor computation on empty strings
2025-07-11 14:51:00 +02:00
Sarah Hoffmann
21ef3be433 fix direction factor computation on empty strings 2025-07-11 11:25:14 +02:00
Sarah Hoffmann
fe30663b21 remove penalty from TokenRanges
The parameter is no longer needed.
2025-07-11 11:01:22 +02:00
Sarah Hoffmann
b9252cc348 reduce maximum number of SQL queries per search 2025-07-11 11:01:22 +02:00
Sarah Hoffmann
71025f3f43 fix order of address rankings prefering longest words 2025-07-11 11:01:21 +02:00
Sarah Hoffmann
e4b671f8b1 reinstate penalty for partial only matches 2025-07-11 11:01:21 +02:00
Sarah Hoffmann
7ebd121abc give word break slight advantage towards continuation
prefers longer words
2025-07-11 11:01:21 +02:00
Sarah Hoffmann
4634ad0720 rebalance word transition penalties 2025-07-11 11:01:21 +02:00
Sarah Hoffmann
4a9253a0a9 simplify QueryNode penalty and initial assignment 2025-07-11 11:01:09 +02:00
Sarah Hoffmann
cf9e8d6b8e split up query for deletable endpoint by osm type
This is needed to ensure index use on placex.
2025-07-08 11:03:29 +02:00
Sarah Hoffmann
3e04eb2ffe increase penalty on mismatching postcodes for address searches
Otherwise there is an imbalance towards matching housenumbers
instead of the actual street (where no housenumber exists).
2025-07-07 16:07:32 +02:00
Sarah Hoffmann
970d81fb27 sort housenumber parents by accuracy first
Sorting them by presence of housenumber only will give an undue
preference to results with a housenumber while disregarding other
factors like matching postcodes.
2025-07-07 12:06:06 +02:00
Sarah Hoffmann
cecdbeb7cf reduce candidates for place search 2025-07-07 12:03:56 +02:00
Sarah Hoffmann
c634e9fc5f differentiate between place searches with and without address 2025-07-07 12:03:56 +02:00
Sarah Hoffmann
13eaea8aae split place search into address search and named search
The presence/absence of houenumbers makes quite a difference for search.
2025-07-07 09:13:48 +02:00
Sarah Hoffmann
11d624e92a split db_searches moving each class in its own file 2025-07-01 22:57:04 +02:00
Sarah Hoffmann
87a8c246a0 improve result cutting when a POI comes out with top importance 2025-06-01 12:00:36 +02:00
Sarah Hoffmann
90050de717 only rerank results if there is more than one
With one result order is obvious.
2025-06-01 11:55:27 +02:00
Sarah Hoffmann
10a7d1106d reduce influence of query rematching a little bit 2025-06-01 11:54:21 +02:00
Sarah Hoffmann
f2236f68f1 when rematching only distinguish between perfect, somewhat and bad match 2025-06-01 11:53:23 +02:00
Sarah Hoffmann
d2e691b63f work around bogus type error in latest starlette 2025-05-31 09:43:48 +02:00
anqixxx
6220bde2d6 Added mypy ignore fix for logging.py (library change), as well as quick mac fix on mem.cached 2025-05-21 11:11:56 -07:00
Sarah Hoffmann
800c56642b tweak full count cut-off (as per deployment on osm.org) 2025-05-11 11:48:07 +02:00
Sarah Hoffmann
34b72591cc exclude address searches with country from direction penalty
Countries are not adequately represented by partial term counts.
2025-04-29 17:37:31 +02:00
Sarah Hoffmann
bc450d110c Merge pull request #3722 from emmanuel-ferdman/master
resolve datetime deprecation warnings
2025-04-22 14:21:05 +02:00
Sarah Hoffmann
3999977941 revert accidental change in json output format 2025-04-18 12:05:25 +02:00
Emmanuel Ferdman
df58870e3f resolve datetime deprecation warnings
Signed-off-by: Emmanuel Ferdman <emmanuelferdman@gmail.com>
2025-04-17 11:15:16 -07:00
Sarah Hoffmann
7f710d2394 add a comment about the precomputed denominator 2025-04-15 09:38:05 +02:00
Sarah Hoffmann
06e39e42d8 add direction penalties
Direction penalties are estimated by getting the name to address
ratio usage for each partial term in the query and computing the
linear regression of that ratio over the entire phrase. Or to put
it in ither words: we try to determine if the terms at the beginning
or the end of the query are more likely to constitute a name.

Direction penalties are currently used only in classic name queries.
2025-04-11 20:41:06 +02:00
Sarah Hoffmann
2ef0e20a3f reorganise token reranking
As the reranking is about changing penalties in presence of other
tokens, change the datastructure to have the other tokens readily
avilable.
2025-04-11 13:38:34 +02:00
Sarah Hoffmann
b680d81f0a ensure that bailout-check is done after each iteration 2025-04-11 11:02:11 +02:00
Sarah Hoffmann
e0e067b1d6 replace use of range when computing word list 2025-04-11 09:59:04 +02:00
Sarah Hoffmann
3980791cfd use iterator instead of list to go over partials 2025-04-11 09:38:24 +02:00
Sarah Hoffmann
497e27bb9a move partial token into a separate field in the query struct
There is exactly one token to be expected and the token is usually
present.
2025-04-11 08:57:34 +02:00
Sarah Hoffmann
39f56ba4b8 restrict coordinate output to 7 digits 2025-04-04 11:02:51 +02:00
Sarah Hoffmann
c5bbeb626f Merge pull request #3700 from lonvia/ignore-inherited-addresses
Ignore POIs with inherited addresses for the address layer
2025-04-02 12:00:45 +02:00
Sarah Hoffmann
3bc77629c8 ignore POIs with inherited addresses for the address layer
We know that there is a building which describes the address as a
polygon and is therefore more suitable.
2025-04-02 10:30:45 +02:00
Sarah Hoffmann
6cf1287c4e Merge pull request #3686 from astridx/output_names
Output names as setting
2025-04-01 20:16:15 +02:00
Sarah Hoffmann
a49e8b9cf7 Merge pull request #3675 from TuringVerified/generic-preprocessors
Add generic preprocessors
2025-04-01 20:14:43 +02:00
TuringVerified
2eeec46040 Remove unnecessary assert statement, Fix regex_replace docstring and simplify regex_replace 2025-04-01 18:54:30 +05:30
TuringVerified
6d5a4a20c5 Update documentation, optimise regex_replace, add tests 2025-04-01 18:54:30 +05:30
TuringVerified
4665ea3e77 Add generic preprocessor 2025-04-01 18:54:30 +05:30
Sarah Hoffmann
fce279226f prepare release 5.1.0 2025-04-01 10:16:35 +02:00