Sarah Hoffmann
|
193d6c4173
|
in-word penalty for final address token
|
2025-09-12 12:05:29 +02:00 |
|
Sarah Hoffmann
|
54620f9566
|
base penalty for housenumber searches on similar address searches
|
2025-09-12 10:52:42 +02:00 |
|
Sarah Hoffmann
|
341c09ee95
|
remove unused functions
|
2025-09-06 11:09:40 +02:00 |
|
Sarah Hoffmann
|
93ac1023f7
|
restrict name-only search more
|
2025-07-14 14:21:09 +02:00 |
|
Sarah Hoffmann
|
6d2b79870c
|
only use most infrequent tokens for search index lookup
|
2025-07-14 14:18:22 +02:00 |
|
Sarah Hoffmann
|
71025f3f43
|
fix order of address rankings preferring longest words
|
2025-07-11 11:01:21 +02:00 |
|
Sarah Hoffmann
|
e4b671f8b1
|
reinstate penalty for partial only matches
|
2025-07-11 11:01:21 +02:00 |
|
Sarah Hoffmann
|
4634ad0720
|
rebalance word transition penalties
|
2025-07-11 11:01:21 +02:00 |
|
Sarah Hoffmann
|
c634e9fc5f
|
differentiate between place searches with and without address
|
2025-07-07 12:03:56 +02:00 |
|
Sarah Hoffmann
|
13eaea8aae
|
split place search into address search and named search
The presence/absence of housenumbers makes quite a difference for search.
|
2025-07-07 09:13:48 +02:00 |
|
Sarah Hoffmann
|
800c56642b
|
tweak full count cut-off (as per deployment on osm.org)
|
2025-05-11 11:48:07 +02:00 |
|
Sarah Hoffmann
|
b680d81f0a
|
ensure that bailout-check is done after each iteration
|
2025-04-11 11:02:11 +02:00 |
|
Sarah Hoffmann
|
3980791cfd
|
use iterator instead of list to go over partials
|
2025-04-11 09:38:24 +02:00 |
|
Sarah Hoffmann
|
497e27bb9a
|
move partial token into a separate field in the query struct
There is exactly one token to be expected and the token is usually
present.
|
2025-04-11 08:57:34 +02:00 |
|
Sarah Hoffmann
|
f2aa15778f
|
always use lookup when requested
Doesn't seem to cause any issues in production.
|
2025-03-31 11:38:21 +02:00 |
|
Sarah Hoffmann
|
efe65c3e49
|
increase allowable address counts
|
2025-03-31 11:38:21 +02:00 |
|
Sarah Hoffmann
|
51847ebfeb
|
more aggressively reduce expected count for multi-word terms
Improves searching of non-latin scripts with forced token spaces.
|
2025-03-31 11:18:22 +02:00 |
|
Miroslav Šedivý
|
6ff51712fe
|
Simplify int/float manipulation
|
2025-03-06 19:26:56 +01:00 |
|
Sarah Hoffmann
|
31412e0674
|
replace TokenType enum with simple char constants
|
2025-02-21 10:23:41 +01:00 |
|
Sarah Hoffmann
|
4577669213
|
replace BreakType enum with simple char constants
|
2025-02-21 09:57:48 +01:00 |
|
Sarah Hoffmann
|
9bf1428d81
|
consistently use query module as qmod
|
2025-02-21 09:31:21 +01:00 |
|
Sarah Hoffmann
|
46c4446dc2
|
remove address penalty for postcode search
Searches of the form <postcode> <city> are in fact quite common.
|
2025-02-20 11:11:45 +01:00 |
|
Sarah Hoffmann
|
499110f549
|
add SOFT_PHRASE break and enable parsing
Also enables parsing of PART breaks.
|
2025-01-06 17:10:24 +01:00 |
|
Sarah Hoffmann
|
1f07967787
|
fix style issue found by flake8
|
2024-11-10 22:47:14 +01:00 |
|
Sarah Hoffmann
|
a690605a96
|
remove support for unindexed tokens
This was a special feature of the legacy tokenizer, which would not
index very frequent tokens.
|
2024-09-22 10:39:10 +02:00 |
|
Sarah Hoffmann
|
cfe5284f64
|
make housenumber search work with non-indexed partials
|
2024-07-31 14:09:35 +02:00 |
|
Mateusz Konieczny
|
e51973f8b1
|
fix some typos
|
2024-07-01 15:03:57 +02:00 |
|
Sarah Hoffmann
|
6e89310a92
|
split code into submodules
|
2024-06-26 11:52:47 +02:00 |
|