Sarah Hoffmann
c2a311e69c
fix postcode update computation: use distance
2025-08-29 15:10:27 +02:00
marc tobias
247afe1f56
sanitizer no longer strips name parts in brackets when more parts follow
2025-08-23 01:06:35 +02:00
anqixxx
6b627df4fb
Locales and localization refactor with Locales as a localizer object.
...
Removed auto-localization from search/search_address APIs (now explicit), simplified AddressLines to subclass List[AddressLine], made display_name a computed property in Results instead of a field, and removed result-localization circular dependencies.
2025-08-12 08:05:37 -04:00
Sarah Hoffmann
0045203092
don't restrict to viewbox for frequent terms
...
All searched places may be outside the viewbox in which case the
restriction means that there are no results at all. Add the penalty for
being outside the viewbox earlier instead and then cut the list.
2025-08-06 17:27:52 +02:00
marc tobias
9bad3b1e61
Better hint to user if database import didn't finish
2025-07-30 10:25:14 +02:00
Sarah Hoffmann
a9cd706bb6
adapt test to new lookup limits
2025-07-14 14:21:09 +02:00
Sarah Hoffmann
73ee17af95
adapt tests for new function signatures
2025-07-11 11:01:22 +02:00
Sarah Hoffmann
c634e9fc5f
differentiate between place searches with and without address
2025-07-07 12:03:56 +02:00
Sarah Hoffmann
13eaea8aae
split place search into address search and named search
...
The presence/absence of housenumbers makes quite a difference for search.
2025-07-07 09:13:48 +02:00
Sarah Hoffmann
f43fec0d57
Merge pull request #3764 from lonvia/update-importance
...
'refresh --importance' also needs to refresh importances in search_name table
2025-06-27 10:02:18 +02:00
Sarah Hoffmann
af82c3debb
remove duplicated test
...
There is a more extensive test of recompute_importance with
result check in test_refresh_wiki_data.py
2025-06-26 22:35:38 +02:00
Sarah Hoffmann
678702ceb7
rewrite importances in search_name after updating in placex
2025-06-26 20:27:37 +02:00
Sarah Hoffmann
f9eb93c4ab
remove support for deprecated gazetteer osm2pgsql output
2025-06-25 23:09:08 +02:00
anqixxx
20cf4b56b9
Refactored min and associated tests to follow greater than or equal to logic, so that min=0 accounted for no filtering
2025-06-04 00:53:52 -07:00
Sarah Hoffmann
75b4c7e56b
adapt to changed loop handling of pytest_asyncio
2025-05-26 11:51:20 +02:00
anqixxx
618fbc63d7
Added testing to test get classtype pairs in import special phrases
2025-05-21 10:39:51 -07:00
anqixxx
3f51cb3fd1
Made the limit configurable with an optional argument and updated the tests to reflect this. The default is now 0, meaning that it will return everything that occurs more than once. Removed the mock database test and got rid of fetch all. Rebased all tests to monkeypatch.
2025-05-21 10:38:34 -07:00
anqixxx
59a947c5f5
Removed class type pair getter that used style sheets from both spi_importer and the associated testing function
2025-05-21 10:38:08 -07:00
anqixxx
1952290359
Removed magic mocking, using monkeypatch instead, and using a placex table to simulate a 'real database'
2025-05-21 10:37:42 -07:00
anqixxx
1a323165f9
Filter special phrases by style and frequency to fix #235
2025-05-21 10:36:46 -07:00
Sarah Hoffmann
3980791cfd
use iterator instead of list to go over partials
2025-04-11 09:38:24 +02:00
Sarah Hoffmann
497e27bb9a
move partial token into a separate field in the query struct
...
There is exactly one token to be expected and the token is usually
present.
2025-04-11 08:57:34 +02:00
Sarah Hoffmann
97d9e3c548
allow updating postcodes without a project directory
...
Postcodes will then be updated without looking for external postcodes.
2025-04-09 20:04:01 +02:00
Sarah Hoffmann
2ce2d031fa
Merge pull request #3702 from lonvia/remove-tokenizer-dir
...
Remove automatic setup of tokenizer directory
So far the tokenizer factory would create a directory for private data for the tokenizer and then hand in the directory location to the tokenizer.
ICU tokenizer doesn't need any extra data anymore, so it doesn't make sense to create a directory which then remains empty. If a tokenizer needs such a directory in the future, it needs to create it on its own and make sure to handle the situation correctly where no project directory is used at all.
2025-04-03 09:04:48 +02:00
Sarah Hoffmann
186f562dd7
remove automatic setup of tokenizer directory
...
ICU tokenizer doesn't need any extra data anymore, so it doesn't
make sense to create a directory which then remains empty. If a
tokenizer needs such a directory in the future, it needs to create
it on its own and make sure to handle the situation correctly where
no project directory is used at all.
2025-04-02 20:20:04 +02:00
Sarah Hoffmann
c5bbeb626f
Merge pull request #3700 from lonvia/ignore-inherited-addresses
...
Ignore POIs with inherited addresses for the address layer
2025-04-02 12:00:45 +02:00
Sarah Hoffmann
3bc77629c8
ignore POIs with inherited addresses for the address layer
...
We know that there is a building which describes the address as a
polygon and is therefore more suitable.
2025-04-02 10:30:45 +02:00
Sarah Hoffmann
6cf1287c4e
Merge pull request #3686 from astridx/output_names
...
Output names as setting
2025-04-01 20:16:15 +02:00
TuringVerified
2eeec46040
Remove unnecessary assert statement, Fix regex_replace docstring and simplify regex_replace
2025-04-01 18:54:30 +05:30
TuringVerified
6d5a4a20c5
Update documentation, optimise regex_replace, add tests
2025-04-01 18:54:30 +05:30
astridx
12ad95067d
output names as setting
2025-03-31 16:55:05 +02:00
Sarah Hoffmann
be4ba370ef
adapt tests to extended results
2025-03-31 14:52:50 +02:00
Sarah Hoffmann
35baf77b18
make query upper-case when parsing postcodes
...
The postcode patterns expect upper-case letters.
2025-03-21 09:44:15 +01:00
Sarah Hoffmann
f5755a7a82
remove code for setting osm2pgsql via config.lib_dir
...
With the internal osm2pgsql gone, configuration of the binary location
via settings is the only option left that makes sense.
2025-03-11 09:04:05 +01:00
Sarah Hoffmann
4cc788f69e
enable flake for Python tests
2025-03-09 15:33:24 +01:00
Sarah Hoffmann
6b0d58d9fd
restrict postcode parsing in typed phrases
...
Postcodes can only appear in postcode-type phrases and must then
cover the full phrase
2025-03-05 10:09:33 +01:00
Sarah Hoffmann
afb89f9c7a
add unit tests for postcode parser
2025-03-04 16:25:00 +01:00
Sarah Hoffmann
a574b98e4a
remove postcode computation for word table during import
2025-03-04 08:57:59 +01:00
Sarah Hoffmann
6759edfb5d
make word generation from query a class method
2025-03-04 08:57:37 +01:00
Sarah Hoffmann
e362a965e1
search: merge QueryPart array with QueryNodes
...
The basic information on terms is pretty much always used together
with the node information. Merging them together saves some
allocation while making lookup easier at the same time.
2025-03-04 08:57:37 +01:00
Sarah Hoffmann
13db4c9731
replace datrie library with a simpler pure-Python class
2025-02-24 10:24:21 +01:00
Sarah Hoffmann
49bd18b048
replace PhraseType enum with simple int constants
2025-02-21 16:44:12 +01:00
Sarah Hoffmann
31412e0674
replace TokenType enum with simple char constants
2025-02-21 10:23:41 +01:00
Sarah Hoffmann
4577669213
replace BreakType enum with simple char constants
2025-02-21 09:57:48 +01:00
Sarah Hoffmann
95e2d8c846
adapt tests to changed wikimedia importance test table
2025-01-14 14:19:17 +01:00
Sarah Hoffmann
efc09a5cfc
add Japanese phrase preprocessing
...
Code adapted from GSOC code by @miku.
2025-01-09 09:24:10 +01:00
Sarah Hoffmann
4760e8341b
move lua scripts into a separate directory
2024-12-16 10:26:55 +01:00
Sarah Hoffmann
2b87c016db
generalize normalization step for search query
...
It is now possible to configure functions for changing the query
input before it is analysed by the tokenizer.
Code is a cleaned-up version of the implementation by @miku.
2024-12-13 14:31:08 +01:00
Sarah Hoffmann
5160a1d577
get bbox of postcode areas into results
2024-09-30 08:58:40 +02:00
Sarah Hoffmann
90e207a497
drop automatic migration from versions <4.3
2024-09-27 12:07:48 +02:00