Commit Graph

48 Commits

Author SHA1 Message Date
Sarah Hoffmann
35baf77b18 make query upper-case when parsing postcodes
The postcode patterns expect upper-case letters.
2025-03-21 09:44:15 +01:00
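
As a rough illustration of why the case matters, the sketch below matches a query against a pattern that only accepts upper-case letters. The pattern and the function name `match_postcode` are assumptions made up for this example (a simplified UK-style postcode), not the project's actual patterns or parser.

```python
import re
from typing import Optional

# Hypothetical pattern for illustration only: a simplified UK-style postcode
# that, like the patterns mentioned in the commit, expects upper-case letters.
POSTCODE_PATTERN = re.compile(r'[A-Z]{1,2}[0-9][0-9A-Z]? ?[0-9][A-Z]{2}')


def match_postcode(query: str) -> Optional[str]:
    """Return the postcode if the whole query matches, otherwise None.

    The query is upper-cased first, so lower-case user input still matches.
    """
    match = POSTCODE_PATTERN.fullmatch(query.upper())
    return match.group(0) if match else None
```

Without the `upper()` call, input such as `sw1a 1aa` would not match the pattern at all.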
Sarah Hoffmann
4cc788f69e enable flake for Python tests 2025-03-09 15:33:24 +01:00
Sarah Hoffmann
6b0d58d9fd restrict postcode parsing in typed phrases
Postcodes can only appear in postcode-type phrases and must then
cover the full phrase
2025-03-05 10:09:33 +01:00
Sarah Hoffmann
afb89f9c7a add unit tests for postcode parser 2025-03-04 16:25:00 +01:00
Sarah Hoffmann
a574b98e4a remove postcode computation for word table during import 2025-03-04 08:57:59 +01:00
Sarah Hoffmann
6759edfb5d make word generation from query a class method 2025-03-04 08:57:37 +01:00
Sarah Hoffmann
e362a965e1 search: merge QueryPart array with QueryNodes
The basic information on terms is pretty much always used together
with the node information. Merging them together saves some
allocation while making lookup easier at the same time.
2025-03-04 08:57:37 +01:00
Sarah Hoffmann
49bd18b048 replace PhraseType enum with simple int constants 2025-02-21 16:44:12 +01:00
Sarah Hoffmann
31412e0674 replace TokenType enum with simple char constants 2025-02-21 10:23:41 +01:00
Sarah Hoffmann
4577669213 replace BreakType enum with simple char constants 2025-02-21 09:57:48 +01:00
Sarah Hoffmann
5160a1d577 get bbox of postcode areas into results 2024-09-30 08:58:40 +02:00
Sarah Hoffmann
a690605a96 remove support for unindexed tokens
This was a special feature of the legacy tokenizer, which would not
index very frequent tokens.
2024-09-22 10:39:10 +02:00
Sarah Hoffmann
b87d6226fb remove legacy tokenizer and direct tests 2024-09-21 11:38:08 +02:00
Sarah Hoffmann
882fb16881 restrict use of os.environ in Configuration
Only use the OS environment when the environ parameter is set
to None. Currently it would use the OS env on an empty dict.
2024-09-01 16:17:30 +02:00
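
The behaviour described in this commit is a common Python pitfall: a truthiness check treats an empty dict like `None`. The sketch below contrasts the two checks. The function names are hypothetical and the code is not taken from the `Configuration` class itself.

```python
import os
from typing import Mapping, Optional


def pick_environ_truthy(environ: Optional[Mapping[str, str]]) -> Mapping[str, str]:
    """Old-style check: an empty dict is falsy, so the OS environment
    is used even though the caller explicitly passed a mapping."""
    return environ or os.environ


def pick_environ_none_check(environ: Optional[Mapping[str, str]]) -> Mapping[str, str]:
    """Stricter check: fall back to the OS environment only for None."""
    return os.environ if environ is None else environ
```

With the `None` check, passing `{}` gives the caller a genuinely empty environment, which is useful for isolating tests from the machine they run on.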
Sarah Hoffmann
7f11de0db9 allow None and str for project_dir in NominatimAPI init 2024-08-22 22:49:12 +02:00
Sarah Hoffmann
c2594aca40 make NominatimAPI[Async] a context manager
If close() isn't properly called, it can lead to odd error messages
about uncaught exceptions.
2024-08-19 11:31:38 +02:00
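
A minimal sketch of the context-manager pattern this commit refers to, using a hypothetical stand-in class `Resource` rather than `NominatimAPI` itself. The point is only that `__exit__` guarantees the `close()` call, even when the body raises.

```python
class Resource:
    """Hypothetical stand-in for an object that must be closed after use."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True

    def __enter__(self) -> 'Resource':
        return self

    def __exit__(self, exc_type, exc_value, traceback) -> None:
        # Always release the resource; returning None lets exceptions propagate.
        self.close()


def use_resource() -> Resource:
    """Use the resource in a with-block and hand it back for inspection."""
    with Resource() as res:
        assert not res.closed  # still open inside the block
    return res
```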
Sarah Hoffmann
2bab0ca060 port unit tests to new python package layout 2024-06-26 11:52:47 +02:00
Sarah Hoffmann
38798bba13 increase search area when filtering by postcode 2024-04-02 19:36:16 +02:00
Sarah Hoffmann
fe873ad0e2 adapt tests for windowing SQL
Results with high penalty are now thrown out earlier.
2024-04-02 16:32:49 +02:00
Sarah Hoffmann
07b7fd1dbb add address counts to tokens 2024-03-18 11:25:48 +01:00
Sarah Hoffmann
fed46240d5 disallow category tokens in the middle of a query string
This already worked for left-to-right readings and now is also
implemented for right-to-left reading. A qualifier must always be
before or after the name.
2024-01-28 19:03:11 +01:00
Sarah Hoffmann
e0ca2ce6ec interpret stand-alone special terms always as near term
Fixes #3298.
2024-01-16 17:19:21 +01:00
Sarah Hoffmann
10a5424a71 do not run near queries on qualifier words
There is too much potential for confusion (e.g. 'Rio Grande' read
as 'river near Grande') for too little gain. Use near phrases
instead.
2024-01-07 11:33:11 +01:00
Sarah Hoffmann
6d39563b87 enable all API tests for sqlite and port missing features 2023-12-07 09:32:02 +01:00
Sarah Hoffmann
b06f5fddcb simplify handling of SQL lookup code for search_name
Use function classes which can be instantiated directly.
2023-12-07 09:31:00 +01:00
Sarah Hoffmann
b2319e52ff correctly exclude streets with housenumber searches
Street results are not subject to the full filtering in the SQL
query, so recheck them.
2023-11-28 17:53:37 +01:00
Sarah Hoffmann
25279d009a add tests for interaction of category parameter with category terms 2023-11-28 16:56:08 +01:00
Sarah Hoffmann
3f72ca4bca rename use of category as POI search to near_item
Use the term category only as a short-cut for "tuple of key and value".
2023-11-28 16:27:05 +01:00
Sarah Hoffmann
70dc4957dc the category parameter in search should result in a qualifier 2023-11-28 12:01:49 +01:00
Sarah Hoffmann
a7f5c6c8f5 drop category tokens when they make up a full phrase 2023-11-26 20:58:50 +01:00
Sarah Hoffmann
b62dbd1f92 reduce influence of viewbox
Perfectly matching city names should still get priority.
2023-10-07 22:00:52 +02:00
Sarah Hoffmann
b00b16aa3a more unit tests for search 2023-09-27 15:00:05 +02:00
Sarah Hoffmann
c284df2dc9 restrict range for interpolated housenumbers
Interpolations are only supported up to 2^32 by the database.
Limit to 8 digits, which is still more than should be needed.
2023-09-05 11:41:41 +02:00
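
The arithmetic behind the 8-digit limit can be checked directly: the largest 8-digit number is 99,999,999, which is well below 2^32 (4,294,967,296). The helper below is a hypothetical sketch of such a length check, not the project's own implementation.

```python
MAX_HOUSENUMBER_DIGITS = 8  # limit named in the commit message


def is_interpolatable(housenumber: str) -> bool:
    """Accept only plain ASCII digit strings of at most 8 digits."""
    return (housenumber.isascii()
            and housenumber.isdigit()
            and len(housenumber) <= MAX_HOUSENUMBER_DIGITS)


# The largest value that passes the check still fits below 2^32.
LARGEST_ACCEPTED = 10 ** MAX_HOUSENUMBER_DIGITS - 1
```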
Sarah Hoffmann
fa3ac22a8f adapt tests to changes in search 2023-08-12 16:12:31 +02:00
Sarah Hoffmann
78648f1faf remove lookup by address only
There are too many lookups where the address is very frequent,
even when many address parts are present.
2023-08-06 21:00:10 +02:00
Sarah Hoffmann
afdbdb02a1 do not lookup by address vector when only few tokens are available
Names of countries and states are exceedingly rare in the word count
but are very frequent in the address. A short name has the danger
of producing too many results.
2023-08-02 09:25:47 +02:00
Sarah Hoffmann
927d2cc824 do not split names from typed phrases
When phrases are typed, they should only contain exactly one term.
2023-07-17 20:09:08 +02:00
Sarah Hoffmann
cc45930ef9 avoid lookup via partials on frequent words
Drops expensive searches via partials on terms like 'rue de'.

See #2979.
2023-07-06 12:16:57 +02:00
Sarah Hoffmann
9bc5be837b remove useless check
Found by new mypy version.
2023-06-21 11:56:39 +02:00
Sarah Hoffmann
d0a1e8e311 tweak postcode search
Give a preference to left-to-right reading, i.e. <postcode>,<address>
prefers a postcode search, while <address>,<postcode> prefers
an address search.

Also exclude non-addressables, countries and states from results when a
postcode is contained in the query.
2023-06-20 11:56:43 +02:00
Sarah Hoffmann
146a0b29c0 add support for search by housenumber 2023-05-26 14:10:57 +02:00
Sarah Hoffmann
dc99bbb0af implement actual database searches 2023-05-24 13:52:31 +02:00
Sarah Hoffmann
c42273a4db implement search builder 2023-05-23 11:23:44 +02:00
Sarah Hoffmann
3bf489cd7c implement token assignment 2023-05-22 15:49:03 +02:00
Sarah Hoffmann
d8240f9ee4 add query analyser for legacy tokenizer 2023-05-22 11:07:14 +02:00
Sarah Hoffmann
2448cf2a14 add factory for query analyzer 2023-05-22 09:23:19 +02:00
Sarah Hoffmann
004883bdb1 query analyzer for ICU tokenizer 2023-05-22 08:46:19 +02:00
Sarah Hoffmann
ff66595f7a add data structure for tokenized query 2023-05-21 09:30:57 +02:00