Sarah Hoffmann
f923304eea
add slight preference for locating point POIs over POI areas
2024-04-11 10:21:31 +02:00
Sarah Hoffmann
5c4c98d17e
Merge pull request #3384 from mtmail/geocodejson-admin-levels-only-boundaries
...
geocodejson: admin level output should only print boundaries
2024-04-03 11:52:08 +02:00
Sarah Hoffmann
38798bba13
increase search area when filtering by postcode
2024-04-02 19:36:16 +02:00
marc tobias
05eb1d5f42
geocodejson: admin level output should only print boundaries
2024-04-02 18:58:09 +02:00
Sarah Hoffmann
bdded69ab6
housenumber position should hint on direction
...
rather than increasing penalty.
2024-04-02 16:30:50 +02:00
Sarah Hoffmann
9f42c3f3b8
remove restriction on frequent one word names
...
This is now solved by reducing results with the windowing SQL
during search.
2024-04-02 16:28:17 +02:00
Sarah Hoffmann
424ebd7fe9
split search SQL in windowed search_name lookup and constraint search
2024-04-02 16:28:12 +02:00
Sarah Hoffmann
78c19bc006
minimum counts for tokens should always be 1
...
to avoid accidental division by 0.
2024-04-01 14:25:51 +02:00
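The idea behind the commit above can be illustrated with a minimal sketch. The function names and the ratio computed here are hypothetical and not taken from the project's code; the only point is that clamping a count to at least 1 keeps any later division safe.

```python
def effective_count(raw_count: int) -> int:
    """Clamp a token count so that it is never below 1 (hypothetical helper)."""
    return max(1, raw_count)


def rarity_score(total: int, raw_count: int) -> float:
    """Illustrative ratio that would raise ZeroDivisionError without clamping."""
    return total / effective_count(raw_count)


print(rarity_score(100, 0))  # count of 0 is treated as 1
print(rarity_score(100, 4))
```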
Sarah Hoffmann
c39fc5d180
don't even try heavily penalized searches
2024-03-26 22:00:25 +01:00
Sarah Hoffmann
a96b6a1289
reintroduce cutoffs when searching for very frequent words
2024-03-26 21:46:37 +01:00
Sarah Hoffmann
ace84ed0e3
use address counts for improving index lookup
2024-03-18 11:25:48 +01:00
Sarah Hoffmann
ff3230a7f3
add penalty for single words that look like stop words
2024-03-18 11:25:48 +01:00
Sarah Hoffmann
07b7fd1dbb
add address counts to tokens
2024-03-18 11:25:48 +01:00
Sarah Hoffmann
bb5de9b955
extend word statistics to address index
...
Word frequency in names is not sufficient to interpolate word
frequency in the address because names of towns, states etc. are
much more frequently used than, say, street names.
2024-03-18 11:25:48 +01:00
Sarah Hoffmann
9c48726691
add geometry details for postcode area output
2024-03-12 13:51:29 +01:00
Sarah Hoffmann
6e688a0113
postcodes: exclude seen places later
...
The seen list will only have the postcode area when available but
we want the postcode point excluded as well if the area has been seen.
2024-03-11 15:18:57 +01:00
Sarah Hoffmann
dc7cfd1708
look for postcode areas when finding something in the postcode table
2024-03-11 14:48:24 +01:00
marc tobias
b7eea4d53a
Github Actions: add codespell linter, warn only
2024-03-04 00:22:24 +01:00
Sarah Hoffmann
9fa73cfb15
improve display name for postcodes
...
Don't add the postcode again in the list of address details and
make sure that the result proper always comes before anything else
independently of the address rank.
2024-02-28 16:50:40 +01:00
Sarah Hoffmann
247065ff6f
Merge pull request #3342 from mtmail/tyops
...
Correct some typos
2024-02-28 14:25:16 +01:00
Sarah Hoffmann
c6d40d4bf4
reduce importance when computed from search rank
2024-02-27 10:15:54 +01:00
Sarah Hoffmann
dc1baaa0af
prefer min() function over if construct
...
Fixes a linter complaint.
2024-02-27 09:26:50 +01:00
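As a rough illustration of the kind of change described in the commit above, the two functions below behave identically. The names are made up for this sketch; the log does not say which linter raised the complaint, although pylint has a check of this kind (consider-using-min-builtin).

```python
def cap_with_if(value: float, limit: float) -> float:
    """Cap a value using an if construct (the form a linter may flag)."""
    if value > limit:
        value = limit
    return value


def cap_with_min(value: float, limit: float) -> float:
    """Same behaviour expressed with the built-in min()."""
    return min(value, limit)
```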
marc tobias
7205491b84
Correct some typos
2024-02-26 18:13:30 +01:00
Sarah Hoffmann
4aba36c5ac
API debug: properly escape non-highlighted code
2024-02-19 18:39:01 +01:00
Sarah Hoffmann
4ce13f5c1f
prefilter bad results before adding details and reranking
...
Move the first cutting of the result list before reranking
by result match. This means that results with significantly
less importance are removed early, regardless of
how well they match the original query.
Fixes #3266.
2024-02-06 20:29:48 +01:00
Sarah Hoffmann
33c0f249b1
avoid LookupAny with address and too many name tokens
...
The index for nameaddress_vector has grown so large that PostgreSQL
will resort to a sequential scan if there are too many items
in the LookupAny list.
2024-01-29 16:52:14 +01:00
Sarah Hoffmann
f07f8530a8
housenumber-only searches cannot be combined with qualifiers
2024-01-28 19:03:11 +01:00
Sarah Hoffmann
103800a732
adjust rankings for housenumber-only searches
...
A normal address search with housenumber will use name rankings for
the street name. This is slightly different from the weighting for
address parts. Use the same ranking for the first part of the
address for housenumber-only searches to make sure that penalties
remain comparable.
2024-01-28 19:03:11 +01:00
Sarah Hoffmann
f9ba7a465a
always add a penalty for name + address search fallback
...
If there already was a search by full names, the search is likely
a repetition that yields the same results, only running slower.
2024-01-28 19:03:11 +01:00
Sarah Hoffmann
fed46240d5
disallow category tokens in the middle of a query string
...
This already worked for left-to-right readings and now is also
implemented for right-to-left reading. A qualifier must always be
before or after the name.
2024-01-28 19:03:11 +01:00
Sarah Hoffmann
2703442fd2
protect against very frequent bad partials
2024-01-28 19:03:11 +01:00
Sarah Hoffmann
2813bf18e6
avoid duplicates in the list of partial tokens for a query
...
This messes with the estimates for expected results.
2024-01-28 19:03:11 +01:00
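One common way to drop duplicates while keeping the original order is shown below. This is a generic sketch, not the project's implementation; the function name and the use of plain integers as token ids are assumptions made for illustration.

```python
from typing import Iterable, List


def unique_partials(token_ids: Iterable[int]) -> List[int]:
    """Return token ids without duplicates, preserving first-seen order.

    dict.fromkeys() keeps insertion order, so each id appears once at the
    position where it first occurred.
    """
    return list(dict.fromkeys(token_ids))


print(unique_partials([3, 1, 3, 2, 1]))
```

With duplicates removed, any estimate that sums or multiplies per-token counts sees each partial token only once.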
Sarah Hoffmann
e0ca2ce6ec
interpret stand-alone special terms always as near term
...
Fixes #3298.
2024-01-16 17:19:21 +01:00
Sarah Hoffmann
28f7e51279
add country code to words to be rematched
2024-01-08 12:23:23 +01:00
Sarah Hoffmann
b2afe3ce3e
when a country is in the results, restrict further searches to places
...
A country search result usually comes with a very high importance.
As a result only other very well known places will show up together
with country results and that means only places with lower address
ranks. Name searches for country names tend to yield a lot of POI
results because the country name is part of the name
(think "embassy of Sweden"). By excluding POIs from further searches,
the search is sped up quite a bit.
2024-01-07 17:29:12 +01:00
Sarah Hoffmann
7337898b84
dump params in log view
2024-01-07 15:37:53 +01:00
Sarah Hoffmann
4305160c91
prioritize country searches when penalty is equal
2024-01-07 15:28:37 +01:00
Sarah Hoffmann
d3a575319f
Merge pull request #3289 from lonvia/viewbox-and-housenumbers
...
Do not restrict by viewbox when housenumber or postcode is available
2024-01-07 15:23:14 +01:00
Sarah Hoffmann
2592bf1954
Merge pull request #3290 from lonvia/near-vs-quaifier-words
...
Do not run near queries on qualifier words
2024-01-07 15:23:00 +01:00
Sarah Hoffmann
474d4230b8
fix timezone handling for timestamps from the database
...
SQLite is not timezone-aware, so make sure to convert to UTC
before inserting any data.
2024-01-07 11:37:40 +01:00
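The conversion mentioned in the commit above can be sketched with the standard library alone. The helper name is hypothetical, and treating naive timestamps as already being in UTC is an assumption of this sketch rather than something the log states.

```python
from datetime import datetime, timedelta, timezone


def to_utc(stamp: datetime) -> datetime:
    """Normalise a timestamp to UTC before storing it (hypothetical helper)."""
    if stamp.tzinfo is None:
        # Assumption: naive timestamps are already expressed in UTC.
        return stamp.replace(tzinfo=timezone.utc)
    return stamp.astimezone(timezone.utc)


cet = timezone(timedelta(hours=1))
local = datetime(2024, 1, 7, 11, 37, 40, tzinfo=cet)
print(to_utc(local).isoformat())
```

Storing only UTC values means that a database without timezone support can hand back timestamps that are unambiguous to interpret.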
Sarah Hoffmann
10a5424a71
do not run near queries on qualifier words
...
There is too much potential for confusion (e.g. 'Rio Grande' read
as 'river near Grande') for too little gain. Use near phrases
instead.
2024-01-07 11:33:11 +01:00
Sarah Hoffmann
7eb04f67e2
do not restrict by viewbox when housenumber or postcode is available
...
Fixes #3274.
2024-01-07 11:29:26 +01:00
Sarah Hoffmann
8e90fa3395
avoid closure variables in lambda statements
...
There is a bug in SQLAlchemy that assigns the wrong value to bind
parameters from closure variables when reusing lambda statements
that are later extended with other non-lambda expressions.
Thus either avoid lambda statements with closure variables or avoid
extending them with non-lambda expressions.
2024-01-05 17:49:28 +01:00
Sarah Hoffmann
02af0a2c87
use correct SQLAlchemy pool for asynchronous connections
...
See https://github.com/sqlalchemy/sqlalchemy/issues/8771
2024-01-02 16:15:44 +01:00
Sarah Hoffmann
fa4e5513d1
API: avoid engine disposal on startup
2024-01-02 16:10:30 +01:00
Sarah Hoffmann
93afe5a7c3
update typing for latest changes in SQLAlchemy
2023-12-29 20:55:33 +01:00
Sarah Hoffmann
89094cf92e
error out when a SQLite database does not exist
...
Requires marking the database r/w when it is newly created in the
convert function.
2023-12-07 10:24:53 +01:00
Sarah Hoffmann
6d39563b87
enable all API tests for sqlite and port missing features
2023-12-07 09:32:02 +01:00
Sarah Hoffmann
df6eddebcd
avoid unnecessary aliases
2023-12-07 09:31:00 +01:00
Sarah Hoffmann
b6c8c0e72b
factor out SQL for filtering by location
...
Also improves on the decision if an index is used or not.
2023-12-07 09:31:00 +01:00