Sarah Hoffmann
86ad9efa8a
keep break indicators [:-] during normalisation
...
All punctuation will be converted to '-'. Soft breaks (':') may be
added by preprocessors. The break signs are only used during
query analysis and are ignored during import token analysis.
2025-01-09 09:21:55 +01:00
Sarah Hoffmann
d984100e23
add inner word break penalty
2025-01-07 21:42:25 +01:00
Sarah Hoffmann
499110f549
add SOFT_PHRASE break and enable parsing
...
Also enables parsing of PART breaks.
2025-01-06 17:10:24 +01:00
Sarah Hoffmann
2b87c016db
generalize normalization step for search query
...
It is now possible to configure functions for changing the query
input before it is analysed by the tokenizer.
Code is a cleaned-up version of the implementation by @miku.
2024-12-13 14:31:08 +01:00
Sarah Hoffmann
0770eaa5d0
use bbox size for secondary order of results
...
Helps to return the largest object when deduplicating results.
2024-11-19 10:38:50 +01:00
Sarah Hoffmann
122ecd4626
remove remaining pylint hints
2024-11-10 22:49:29 +01:00
Sarah Hoffmann
1f07967787
fix style issue found by flake8
2024-11-10 22:47:14 +01:00
Sarah Hoffmann
2c0f2e1ede
remove now unnecessary type-ignores
2024-10-25 17:56:47 +02:00
Sarah Hoffmann
5160a1d577
get bbox of postcode areas into results
2024-09-30 08:58:40 +02:00
Sarah Hoffmann
a690605a96
remove support for unindexed tokens
...
This was a special feature of the legacy tokenizer, which would not
index very frequent tokens.
2024-09-22 10:39:10 +02:00
Sarah Hoffmann
b87d6226fb
remove legacy tokenizer and direct tests
2024-09-21 11:38:08 +02:00
Sarah Hoffmann
a97bfaf26c
fix postcode lookup with legacy tokenizer
2024-07-31 14:54:55 +02:00
Sarah Hoffmann
cfe5284f64
make housenumber search work with non-indexed partials
2024-07-31 14:09:35 +02:00
Mateusz Konieczny
e51973f8b1
fix some typos
2024-07-01 15:03:57 +02:00
Sarah Hoffmann
4da4cbfe27
reduce from 3 to 2 packages
2024-06-28 09:13:22 +02:00
Sarah Hoffmann
63da70685a
fix linting issues
2024-06-26 11:52:47 +02:00
Sarah Hoffmann
2bab0ca060
port unit tests to new python package layout
2024-06-26 11:52:47 +02:00
Sarah Hoffmann
dc7c11a9d1
adapt plugin imports
2024-06-26 11:52:47 +02:00
Sarah Hoffmann
6e89310a92
split code into submodules
2024-06-26 11:52:47 +02:00