986 Commits

Author SHA1 Message Date
Sarah Hoffmann
0fb4fe8e4d add Python package configuration
The global configuration builds one large package.
2024-06-26 11:52:47 +02:00
Sarah Hoffmann
90eea6b909 adapt database test for wikipedia importance to new tables 2024-05-16 15:23:54 +02:00
Sarah Hoffmann
60b03d506f add CSV format for importance import 2024-05-16 15:23:54 +02:00
Sarah Hoffmann
9889c72c55 work around new pylint warnings 2024-05-14 14:50:37 +02:00
Sarah Hoffmann
77631f90fd reindex postcodes that lose their parents
When the parent place of a postcode is deleted, it needs to
be reindexed to get a new parent. Otherwise displaying of
results is broken.
2024-05-04 12:33:26 +02:00
Sarah Hoffmann
8f3845660f add full tokens to addresses
This is now needed to weigh results.
2024-05-02 11:47:35 +02:00
Sarah Hoffmann
f923304eea add slight preference for locating point POIs over POI areas 2024-04-11 10:21:31 +02:00
Sarah Hoffmann
1a0f851d0d Merge pull request #3389 from mtmail/cli-autodiscover-valid-formats
CLI: get valid --format values via autodiscover
2024-04-09 14:58:53 +02:00
marc tobias
28444d9435 CLI: get valid --format values via autodiscover 2024-04-09 14:03:23 +02:00
Sarah Hoffmann
5c4c98d17e Merge pull request #3384 from mtmail/geocodejson-admin-levels-only-boundaries
geocodejson: admin level output should only print boundaries
2024-04-03 11:52:08 +02:00
Sarah Hoffmann
38798bba13 increase search area when filtering by postcode 2024-04-02 19:36:16 +02:00
marc tobias
05eb1d5f42 geocodejson: admin level output should only print boundaries 2024-04-02 18:58:09 +02:00
Sarah Hoffmann
bdded69ab6 housenumber position should hint on direction
rather than increasing penalty.
2024-04-02 16:30:50 +02:00
Sarah Hoffmann
9f42c3f3b8 remove restriction on frequent one word names
This is now solved by reducing results with the windowing SQL
during search.
2024-04-02 16:28:17 +02:00
Sarah Hoffmann
424ebd7fe9 split search SQL in windowed search_name lookup and constraint search 2024-04-02 16:28:12 +02:00
Sarah Hoffmann
78c19bc006 minimum counts for tokens should always be 1
to avoid accidental division by 0.
2024-04-01 14:25:51 +02:00
Sarah Hoffmann
c39fc5d180 don't even try heavily penalized searches 2024-03-26 22:00:25 +01:00
Sarah Hoffmann
a96b6a1289 reintroduce cutoffs when searching for very frequent words 2024-03-26 21:46:37 +01:00
Sarah Hoffmann
ace84ed0e3 use address counts for improving index lookup 2024-03-18 11:25:48 +01:00
Sarah Hoffmann
ff3230a7f3 add penalty for single words that look like stop words 2024-03-18 11:25:48 +01:00
Sarah Hoffmann
07b7fd1dbb add address counts to tokens 2024-03-18 11:25:48 +01:00
Sarah Hoffmann
bb5de9b955 extend word statistics to address index
Word frequency in names is not sufficient to interpolate word
frequency in the address because names of towns, states etc. are
much more frequently used than, say, street names.
2024-03-18 11:25:48 +01:00
Sarah Hoffmann
9c48726691 add geometry details for postcode area output 2024-03-12 13:51:29 +01:00
Sarah Hoffmann
6e688a0113 postcodes: exclude seen places later
The seen list will only have the postcode area when available but
we want the postcode point excluded as well if the area has been seen.
2024-03-11 15:18:57 +01:00
Sarah Hoffmann
dc7cfd1708 look for postcode areas when finding something in the postcode table 2024-03-11 14:48:24 +01:00
Sarah Hoffmann
e5a5f02666 prepare release 4.4.0 2024-03-07 11:43:01 +01:00
Sarah Hoffmann
e929693cae Merge pull request #3356 from lonvia/use-date-from-osm2pgsql-prop
Use import date from osm2pgsql property table if available
2024-03-05 15:32:16 +01:00
Sarah Hoffmann
ae7c584e28 use import date from osm2pgsql property table if available 2024-03-05 11:33:32 +01:00
marc tobias
b7eea4d53a Github Actions: add codespell linter, warn only 2024-03-04 00:22:24 +01:00
Sarah Hoffmann
9fa73cfb15 improve display name for postcodes
Don't add the postcode again in the list of address details and
make sure that the result proper always comes before anything else
independently of the address rank.
2024-02-28 16:50:40 +01:00
Sarah Hoffmann
247065ff6f Merge pull request #3342 from mtmail/tyops
Correct some typos
2024-02-28 14:25:16 +01:00
Sarah Hoffmann
1879cf902c Merge pull request #3346 from lonvia/reduce-artificial-importance
Reduce default importance
2024-02-28 14:21:46 +01:00
Sarah Hoffmann
36b1660121 add support for new middle table format of osm2pgsql
Functions are adapted according to the format detected from the
osm2pgsql property table.
2024-02-27 18:18:19 +01:00
Sarah Hoffmann
c6d40d4bf4 reduce importance when computed from search rank 2024-02-27 10:15:54 +01:00
Sarah Hoffmann
a4f2e6a893 do not send outdated parameters to osm2pgsql flex 2024-02-27 10:15:36 +01:00
Sarah Hoffmann
dc1baaa0af prefer min() function over if construct
Fixes a linter complaint.
2024-02-27 09:26:50 +01:00
marc tobias
7205491b84 Correct some typos 2024-02-26 18:13:30 +01:00
Sarah Hoffmann
4aba36c5ac API debug: properly escape non-highlighted code 2024-02-19 18:39:01 +01:00
Sarah Hoffmann
05fad607ff make Python frontend default and PHP optional 2024-02-19 18:39:01 +01:00
Sarah Hoffmann
b2d3f0a8b3 remove unnecessary nested group in CLI import command 2024-02-16 11:32:50 +01:00
Sarah Hoffmann
4ce13f5c1f prefilter bad results before adding details and reranking
Move the first cutting of the result list before reranking
by result match. This means that results with significantly
less importance are removed early, regardless of
how well they match the original query.

Fixes #3266.
2024-02-06 20:29:48 +01:00
Sarah Hoffmann
bc51378aee properly grant rights to read-only user when switching out word table 2024-02-06 17:30:01 +01:00
Sarah Hoffmann
81eed0680c recreate word table when refreshing counts
The counting touches a large part of the word table, leaving
bloated tables and indexes. Thus recreate the table instead and
swap it in.
2024-02-04 21:35:10 +01:00
Sarah Hoffmann
33c0f249b1 avoid LookupAny with address and too many name tokens
The index for nameaddress_vector has grown so large that PostgreSQL
will resort to a sequential scan if there are too many items
in the LookupAny list.
2024-01-29 16:52:14 +01:00
Sarah Hoffmann
76eadc562c print any collected debug output when returning a timeout error 2024-01-28 22:30:34 +01:00
Sarah Hoffmann
f07f8530a8 housenumber-only searches cannot be combined with qualifiers 2024-01-28 19:03:11 +01:00
Sarah Hoffmann
103800a732 adjust rankings for housenumber-only searches
A normal address search with housenumber will use name rankings for
the street name. This is slightly different than weighing for
address parts. Use the same ranking for the first part of the
address for housenumber-only searches to make sure that penalties
remain comparable.
2024-01-28 19:03:11 +01:00
Sarah Hoffmann
f9ba7a465a always add a penalty for name + address search fallback
If there already was a search by full names, the search is likely
a repetition that yields the same results, only running slower.
2024-01-28 19:03:11 +01:00
Sarah Hoffmann
fed46240d5 disallow category tokens in the middle of a query string
This already worked for left-to-right readings and now is also
implemented for right-to-left reading. A qualifier must always be
before or after the name.
2024-01-28 19:03:11 +01:00
Sarah Hoffmann
2703442fd2 protect against very frequent bad partials 2024-01-28 19:03:11 +01:00