Commit Graph

28 Commits

Author SHA1 Message Date
Sarah Hoffmann
193d6c4173 in-word penalty for final address token 2025-09-12 12:05:29 +02:00
Sarah Hoffmann
54620f9566 base penalty for housenumber searches on similar address searches 2025-09-12 10:52:42 +02:00
Sarah Hoffmann
341c09ee95 remove unused functions 2025-09-06 11:09:40 +02:00
Sarah Hoffmann
93ac1023f7 restrict name-only search more 2025-07-14 14:21:09 +02:00
Sarah Hoffmann
6d2b79870c only use most infrequent tokens for search index lookup 2025-07-14 14:18:22 +02:00
Sarah Hoffmann
71025f3f43 fix order of address rankings preferring longest words 2025-07-11 11:01:21 +02:00
Sarah Hoffmann
e4b671f8b1 reinstate penalty for partial only matches 2025-07-11 11:01:21 +02:00
Sarah Hoffmann
4634ad0720 rebalance word transition penalties 2025-07-11 11:01:21 +02:00
Sarah Hoffmann
c634e9fc5f differentiate between place searches with and without address 2025-07-07 12:03:56 +02:00
Sarah Hoffmann
13eaea8aae split place search into address search and named search
The presence/absence of housenumbers makes quite a difference for search.
2025-07-07 09:13:48 +02:00
Sarah Hoffmann
800c56642b tweak full count cut-off (as per deployment on osm.org) 2025-05-11 11:48:07 +02:00
Sarah Hoffmann
b680d81f0a ensure that bailout-check is done after each iteration 2025-04-11 11:02:11 +02:00
Sarah Hoffmann
3980791cfd use iterator instead of list to go over partials 2025-04-11 09:38:24 +02:00
Sarah Hoffmann
497e27bb9a move partial token into a separate field in the query struct
There is exactly one token to be expected and the token is usually
present.
2025-04-11 08:57:34 +02:00
Sarah Hoffmann
f2aa15778f always use lookup when requested
Doesn't seem to cause any issues in production.
2025-03-31 11:38:21 +02:00
Sarah Hoffmann
efe65c3e49 increase allowable address counts 2025-03-31 11:38:21 +02:00
Sarah Hoffmann
51847ebfeb more aggressively reduce expected count for multi-word terms
Improves searching of non-latin scripts with forced token spaces.
2025-03-31 11:18:22 +02:00
Miroslav Šedivý
6ff51712fe Simplify int/float manipulation 2025-03-06 19:26:56 +01:00
Sarah Hoffmann
31412e0674 replace TokenType enum with simple char constants 2025-02-21 10:23:41 +01:00
Sarah Hoffmann
4577669213 replace BreakType enum with simple char constants 2025-02-21 09:57:48 +01:00
Sarah Hoffmann
9bf1428d81 consistently use query module as qmod 2025-02-21 09:31:21 +01:00
Sarah Hoffmann
46c4446dc2 remove address penalty for postcode search
Searches of the form <postcode> <city> are in fact quite common.
2025-02-20 11:11:45 +01:00
Sarah Hoffmann
499110f549 add SOFT_PHRASE break and enable parsing
Also enables parsing of PART breaks.
2025-01-06 17:10:24 +01:00
Sarah Hoffmann
1f07967787 fix style issue found by flake8 2024-11-10 22:47:14 +01:00
Sarah Hoffmann
a690605a96 remove support for unindexed tokens
This was a special feature of the legacy tokenizer, which would not
index very frequent tokens.
2024-09-22 10:39:10 +02:00
Sarah Hoffmann
cfe5284f64 make housenumber search work with non-indexed partials 2024-07-31 14:09:35 +02:00
Mateusz Konieczny
e51973f8b1 fix some typos 2024-07-01 15:03:57 +02:00
Sarah Hoffmann
6e89310a92 split code into submodules 2024-06-26 11:52:47 +02:00