Sarah Hoffmann
93afe5a7c3
update typing for latest changes in SQLAlchemy
2023-12-29 20:55:33 +01:00
Sarah Hoffmann
6d39563b87
enable all API tests for sqlite and port missing features
2023-12-07 09:32:02 +01:00
Sarah Hoffmann
df6eddebcd
avoid unnecessary aliases
2023-12-07 09:31:00 +01:00
Sarah Hoffmann
b6c8c0e72b
factor out SQL for filtering by location
...
Also improves the decision on whether an index is used or not.
2023-12-07 09:31:00 +01:00
Sarah Hoffmann
b06f5fddcb
simplify handling of SQL lookup code for search_name
...
Use function classes which can be instantiated directly.
2023-12-07 09:31:00 +01:00
Sarah Hoffmann
615b166c68
clean up ST_DWithin and intersects() functions
...
A non-index version of ST_DWithin is not necessary. ST_Distance
can be used for that purpose. Index use for intersects can be
covered with a simple parameter.
2023-12-07 09:31:00 +01:00
Sarah Hoffmann
c41f2fed21
simplify weigh_search() function
...
Use JSON arrays which can have mixed types and therefore have
a more logical structure than separate arrays. Avoid JSON dicts
because of their verboseness.
2023-12-07 09:31:00 +01:00
Sarah Hoffmann
c4fd3ab97f
hide type differences between Postgres and Sqlite in custom types
...
Also define a custom set of operators in preparation for differences
in implementation.
2023-12-07 09:31:00 +01:00
Sarah Hoffmann
8a2c6067a2
skip lookup with full names when there are none
2023-12-01 12:11:58 +01:00
Sarah Hoffmann
3c7a28dab0
further restrict stop search criterion
2023-11-29 11:28:54 +01:00
Sarah Hoffmann
0c72a434e0
use restrict for housenumber lookups with few numbers
2023-11-29 11:28:54 +01:00
Sarah Hoffmann
32e7b59b1f
NearSearch needs to inherit penalty from inner search
2023-11-29 11:28:52 +01:00
Sarah Hoffmann
b2319e52ff
correctly exclude streets with housenumber searches
...
Street results are not subject to the full filtering in the SQL
query, so recheck.
2023-11-28 17:53:37 +01:00
Sarah Hoffmann
25279d009a
add tests for interaction of category parameter with category terms
2023-11-28 16:56:08 +01:00
Sarah Hoffmann
3f72ca4bca
rename use of category as POI search to near_item
...
Use the term category only as a short-cut for "tuple of key and value".
2023-11-28 16:27:05 +01:00
Sarah Hoffmann
70dc4957dc
the category parameter in search should result in a qualifier
2023-11-28 12:01:49 +01:00
Sarah Hoffmann
a7f5c6c8f5
drop category tokens when they make up a full phrase
2023-11-26 20:58:50 +01:00
Sarah Hoffmann
a8b023e57e
restrict base results in near search by rank
...
In particular, this prevents roads or POIs from being used as the base
for the near search when a place result is present.
2023-11-26 17:41:29 +01:00
Sarah Hoffmann
47ca56f21b
deduplicate categories/qualifiers
2023-11-26 17:11:15 +01:00
Sarah Hoffmann
580a7b032f
order near searches by distance instead of importance
2023-11-26 16:48:04 +01:00
Sarah Hoffmann
8fcc2bb7f5
avoid duplicate lines during category search
2023-11-26 14:53:20 +01:00
Sarah Hoffmann
d6fe58f84e
fix polygon selection for classtable lookups
...
Polygons should be used preferably with higher address ranks
where the areas are smaller.
2023-11-25 21:01:27 +01:00
Sarah Hoffmann
4e4d29f653
increase penalty for one-letter words
2023-11-23 10:51:58 +01:00
Sarah Hoffmann
195c13ee8a
more preference for name-only queries in search
2023-11-22 23:57:23 +01:00
Sarah Hoffmann
ac5ef64701
avoid index use when filtering by layer
2023-11-22 20:54:04 +01:00
Sarah Hoffmann
155f26060d
avoid index on rank_address in near search
2023-11-22 17:33:17 +01:00
Sarah Hoffmann
8216899a9a
trim all coordinate output to 7 digits
2023-10-23 17:19:12 +02:00
Sarah Hoffmann
b62dbd1f92
reduce influence of viewbox
...
Perfectly matching city names should still get priority.
2023-10-07 22:00:52 +02:00
Sarah Hoffmann
b00b16aa3a
more unit tests for search
2023-09-27 15:00:05 +02:00
Sarah Hoffmann
7fcbe13669
move get_addressdata() implementation to Python
...
The pgsql function get_addressdata() does a lookup of a lot of data
that is already available in Python.
2023-09-26 11:21:36 +02:00
Sarah Hoffmann
21df87dedc
filter duplicate results after DB query
2023-09-20 14:58:54 +02:00
Sarah Hoffmann
fd26310d6a
rerank results by query
...
The algorithm is similar to the PHP reranking and uses the terms from
the display name to check against the query terms. However, instead of
exact matching it uses a per-word edit distance, so that it is less
strict when it comes to mismatching accents or other one-letter
differences.
Country names get a higher penalty because they don't receive a
penalty during token matching right now.
This will work badly with the legacy tokenizer. Given that it is
marked for removal, it is simply not worth optimising for it.
2023-09-20 14:52:05 +02:00
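
The per-word edit distance idea from the commit message above can be illustrated with a minimal sketch. It is not the code from this commit: the function names, the normalisation by word length and the averaging are assumptions made for illustration only.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance using the classic two-row dynamic programme."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                 # delete ca
                           cur[j - 1] + 1,              # insert cb
                           prev[j - 1] + (ca != cb)))   # substitute
        prev = cur
    return prev[-1]


def rerank_penalty(query: str, display_name: str) -> float:
    """Hypothetical penalty: for each word of the display name take the
    normalised distance to the closest query word, then average."""
    qwords = query.lower().split()
    nwords = display_name.lower().replace(',', ' ').split()
    if not qwords or not nwords:
        return 0.0
    total = 0.0
    for word in nwords:
        total += min(edit_distance(word, q) / max(len(word), len(q))
                     for q in qwords)
    return total / len(nwords)
```

With this sketch a name that differs from the query only by an accent, such as "Zürich" for the query "zurich", receives a small penalty instead of being treated as a complete mismatch.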
Sarah Hoffmann
44da684d1d
reduce expected count for multi-part words
...
Fixes #3196.
2023-09-11 17:45:34 +02:00
Sarah Hoffmann
c284df2dc9
restrict range for interpolated housenumbers
...
Interpolations are only supported up to 2^32 by the database.
Limit to 8 digits, which is still more than should be needed.
2023-09-05 11:41:41 +02:00
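
The digit limit described in the commit message above can be checked with a small sketch. The constant and function names are hypothetical and not taken from the commit; only the 8-digit limit comes from the message.

```python
MAX_DIGITS = 8  # limit stated in the commit message


def is_valid_interpolation_number(value: str) -> bool:
    """Accept only plain ASCII digit strings of at most MAX_DIGITS digits."""
    return value.isascii() and value.isdigit() and len(value) <= MAX_DIGITS


# The largest 8-digit number stays well below 2^32 (and below 2^31,
# should the database column be a signed 32-bit integer).
assert 10**MAX_DIGITS - 1 < 2**31 < 2**32
```

Checking the string length rather than the numeric value keeps the test cheap and avoids converting arbitrarily long input to an integer first.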
Sarah Hoffmann
15e09f2b24
remove alias where it does not work with lambdas
...
Fixes #3177.
2023-08-30 21:55:34 +02:00
Sarah Hoffmann
1115705cbc
add additional timeout for entire request
2023-08-25 09:16:53 +02:00
Sarah Hoffmann
2762c45569
apply adjusted counts only to final result
2023-08-24 21:37:02 +02:00
Sarah Hoffmann
0a2d0c3b5c
allow terms with frequent searches together with viewbox
2023-08-24 09:21:09 +02:00
Sarah Hoffmann
dcdda314e2
further tweak search containing very frequent tokens
...
Excluding non-rare full names is not really possible because it makes
addresses with street names like 'main st' unsearchable. This tries to
leave all names in but refrain from ordering results by accuracy
when too many results are expected. This means that the DB will simply
get the first n results without any particular order.
2023-08-23 23:04:12 +02:00
Sarah Hoffmann
23eed4ff2f
fix tag name for housename addresses in layer selection
...
Fixes #3156.
2023-08-19 15:57:33 +02:00
Sarah Hoffmann
bfc706a596
cache ICU transliterators and reuse them
2023-08-15 23:08:44 +02:00
Sarah Hoffmann
746dd057b9
prefer name-only searches more
2023-08-13 15:24:16 +02:00
Sarah Hoffmann
b710297d05
return bbox of full country for country searches
...
Fixes #3149.
2023-08-13 14:37:28 +02:00
Sarah Hoffmann
0a8e8cec0f
fix application of label to wrong expression
2023-08-13 11:59:01 +02:00
Sarah Hoffmann
96e5a23727
avoid lambda SQL in connection with alias tables
2023-08-13 11:40:49 +02:00
Sarah Hoffmann
cab2a74740
do not use index when searching in large areas
...
This concerns viewboxes as well as radius search.
2023-08-12 16:12:44 +02:00
Sarah Hoffmann
95d1048789
take token_assignment penalty into account
...
Also computes the expected count differently when addresses are
involved. Address token counts do not bear a direct relation to
real counts.
2023-08-12 15:33:50 +02:00
Sarah Hoffmann
38b2b8a143
fix debug output for NearSearch
...
The search info is in a subsearch and was therefore not taken into
account.
2023-08-12 11:27:55 +02:00
Sarah Hoffmann
3d0bc85b4d
improve penalty for token-split words
...
The rematch penalty for partial words created by the transliteration
needs to take into account that they are rematched against the full word.
That means that missing beginning and end should not get a significant
penalty.
2023-08-12 11:26:02 +02:00
Sarah Hoffmann
78648f1faf
remove lookup by address only
...
There are too many lookups where the address is very frequent,
even when many address parts are present.
2023-08-06 21:00:10 +02:00