Commit Graph

703 Commits

Author SHA1 Message Date
Sarah Hoffmann
2231401483 clean up uses of cli.nominatim()
They should not hand in data paths anymore.
2022-11-27 15:27:04 +01:00
Sarah Hoffmann
2abe9e6fd9 use data paths from new nominatim.paths 2022-11-27 12:15:41 +01:00
Sarah Hoffmann
0ed60d29cb remove NOMINATIM_NOMINATIM_TOOL variable
This was used by the old PHP scripts to call the Python tool.
With the scripts now gone, the variable can be removed.
2022-11-26 16:40:20 +01:00
Sarah Hoffmann
41e8bddaa9 remove BDD test for tiger:county
We no longer rely on the import to strip the tag.
2022-11-23 10:37:27 +01:00
Sarah Hoffmann
fd3dec8efe add sanitizer for TIGER tags
Currently only takes over cleaning the tiger:county data. This was
done by the import until now.
2022-11-23 10:37:27 +01:00
Sarah Hoffmann
c9ff7d2130 drop illegal values for addr:interpolation on update 2022-11-18 17:26:56 +01:00
Sarah Hoffmann
52456230cc Merge pull request #2887 from lonvia/lookup-linked-places
Add support for lookup of linked places
2022-11-17 13:35:53 +01:00
Sarah Hoffmann
c4b13f2b7f add support for lookup of linked places 2022-11-16 21:34:45 +01:00
Sarah Hoffmann
4f05a03d13 handle associatedStreet relations with multiple streets
When a associatedStreet relation has multiple street members
always take the closest one. Avoid geometry operations for
the frequent case that there is only one street.
2022-11-16 17:25:51 +01:00
Sarah Hoffmann
93ada250f7 bdd: add tests for osm2pgsql update of postcode nodes 2022-11-14 17:27:04 +01:00
Sarah Hoffmann
d8e3ba3b54 bdd: add osm2pgsql tests for updating interpolations 2022-11-14 16:57:31 +01:00
Sarah Hoffmann
a46348da38 bdd: test placex content when updating with osm2pgsql 2022-11-14 14:48:44 +01:00
Sarah Hoffmann
36cf0eb922 reorganize handling of place type changes
Always replace existing entries in place, never delete them because
a direct delete will cause conflicts.
2022-11-14 13:57:26 +01:00
Sarah Hoffmann
63a9bc94f7 fix country handling in flex style
If the country tag does not match a 2-letter code, it needs to
be dropped.
2022-11-10 15:52:13 +01:00
Sarah Hoffmann
2dafc4cf4f remove tests that differ between lua and gazetteer versions 2022-11-10 15:51:55 +01:00
Sarah Hoffmann
68d09f9cad node locations must be stable for osm2pgsql update tests 2022-11-10 11:11:45 +01:00
Sarah Hoffmann
b98d3d3f00 bdd: extend osm2pgsql update tests
Now also checks for correct indexing state of placex table.
2022-11-10 09:38:25 +01:00
Sarah Hoffmann
2fac507453 change updates to handle delete/insert workflow
This makes Nominatim compatible with osm2pgsql's default update
modus operandi of deleting and reinserting data. Deletes are diverted
into a TODO table instead of executing them. When data is reinserted,
the corresponding entry in the TODO table is deleted. After updates are
finished, the remaining entries in the TODO table are executed, doing
the same work as the delete trigger did before.

The new behaviour also works against the gazetteer output with its
insert-only mechanism.
2022-11-10 09:38:23 +01:00
Sarah Hoffmann
51ed55cc32 initial flex import scripts
Only implements the extratags style for the moment. Tests pass
for the same behaviour as the gazetteer output. Updates still need
to be done.
2022-11-10 09:37:38 +01:00
Sarah Hoffmann
de2a3bd5f8 bdd tests: make import style configurable
The switch is for development. Tests are not guaranteed to still
work when run with anything but the 'extratags' style.
2022-11-10 09:37:38 +01:00
Sarah Hoffmann
981e9700be add osm2pgsql gazetteer tests
This ports the gazetteer tests from osm2pgsql to BDD tests.
2022-11-10 09:37:38 +01:00
Sarah Hoffmann
5f6dcd36ed fix flaky API test
The search 'landstr' produces many duplicates so that with
some bad luck 4 or less results may appear. Disable deduplication
to make it more predictable.
2022-10-05 15:16:14 +02:00
Sarah Hoffmann
5877b69d51 do not run unit test when postgis_raster is not available 2022-10-01 11:01:49 +02:00
Sarah Hoffmann
5ec2c1b712 adapt unit tests to changed function names 2022-10-01 11:01:49 +02:00
Sarah Hoffmann
0a73ed7d64 add secondary importance to API BDD tests
Also fixes a path issue during API test DB creation that could
never possibly have worked.
2022-10-01 11:01:49 +02:00
Tareq Al-Ahdal
0ab0f0ea44 Integrated OSM views into importance computation 2022-10-01 11:01:49 +02:00
Tareq Al-Ahdal
ac467c7a2d Enhanced the implementation of OSM views GeoTIFF import functionality 2022-10-01 11:01:49 +02:00
Tareq Al-Ahdal
c85b74497b Initial implementation of GeoTIFF import functionality 2022-10-01 11:01:49 +02:00
Sarah Hoffmann
f4d3ae6f70 consolidate indexes over geometry_sectors
The index over geometry_sectors are mainly used for ordering
the places which need indexing. That means they function effectively
as a TODO list. Consolodate them so that they always only contain
the places which are still to do. Also add the appropriate index
for the boundary indexing phase.
2022-09-21 10:38:58 +02:00
Sarah Hoffmann
dddfa3a075 ignore irrelevant extra tags on address interpolations
When deciding if an address interpolation has address information, only
look for addr:street and addr:place. If they are not there go looking
for the address on the address nodes. Ignores irrelevant tags like
addr:inclusion.

Fixes #2797.
2022-08-13 14:07:06 +02:00
Sarah Hoffmann
487e81fe3c more invalidations when boundary changes rank
When a boundary or place changes its address rank, all places where
it participates as address need to be potentially reindexed.
Also use the computed rank when testing place nodes against
boundaries. Boundaries are computed earlier.

Fixes #2794.
2022-08-12 09:48:46 +02:00
Sarah Hoffmann
51b6d16dc6 overhaul the token analysis interface
The functional split betweenthe two functions is now that the
first one creates the ID that is used in the word table and
the second one creates the variants. There no longer is a
requirement that the ID is the normalized version. We might
later reintroduce the requirement that a normalized version be available
but it doesn't necessarily need to be through the ID.

The function that creates the ID now gets the full PlaceName. That way
it might take into account attributes that were set by the sanitizers.

Finally rename both functions to something more sane.
2022-07-29 15:14:11 +02:00
Sarah Hoffmann
c8873d34af harmonize interface of token analysis module
The configure() function now receives a Transliterator object instead
of the ICU rules. This harmonizes the parameters with the create
function.
2022-07-29 10:43:07 +02:00
Sarah Hoffmann
6d41046b15 add support for external sanitizer modules 2022-07-25 16:10:19 +02:00
Sarah Hoffmann
7b7203c149 add function for loading plugin modules
Loads modules for configurable code like tokenizers, sanitizers, etc.
Supports internal modules, external libraries and code from the
project directory.
2022-07-25 16:10:10 +02:00
Sarah Hoffmann
cd4bcea894 ignore API parameters in array notation
PHP automatically parses parameters in an array notation(foo[]) into
array types. Ignore these parameters as 'unknown'.

Fixes #2763.
2022-07-23 10:51:44 +02:00
Kian-Meng Ang
f5e52e748f docs: fix typos 2022-07-20 22:05:31 +08:00
Sarah Hoffmann
9963261d8d add type annotations to special phrase importer 2022-07-18 09:54:29 +02:00
Sarah Hoffmann
62eedbb8f6 add type hints for sanitizers 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
fc254fc744 adapt use of Connection in bdd tests to name change 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
aaf2b6032e fix uses of config.get_path() to expect None 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
b1903f0fbf Merge pull request #2761 from lonvia/repair-index-analysis
Repair `admin --analyse-indexing`
2022-07-18 09:38:08 +02:00
marc tobias
c70ca7f57b In tests for PHP 8 disable Just-in-time, it conflicts with tools that determine coverage 2022-07-09 22:03:48 +02:00
Sarah Hoffmann
4b12d52ef5 convert admin --analyse-indexing to new indexing method
A proper run of indexing requires the place information from the
analyzer. Add the pre-processing of place data, so the right
information is handed into the update function.
2022-07-07 16:20:08 +02:00
Sarah Hoffmann
cbbcbb1fd7 move country_info into data submodule 2022-07-06 11:08:36 +02:00
Sarah Hoffmann
bce93d60bd move PlaceInfo into data submodule
This data structure is shared between indexer and tokenizer.
2022-07-06 10:54:47 +02:00
Sarah Hoffmann
69e51aebab test: avoid column names with upper-case letters
This may cause problems when the column names get quoted.
2022-07-05 09:12:55 +02:00
Marc Tobias
ccf119206d PHP 8 behaves slightly different with in_array and usort 2022-07-03 10:55:34 +02:00
Sarah Hoffmann
3dd7410bb7 bdd: correctly skip postcode tests for legacy 2022-06-23 23:42:31 +02:00
Sarah Hoffmann
93d5be097a bdd: do not expect legacy word table to be without empty tokens
It can happen for bogus names and this will not get fixed anymore.
2022-06-23 23:42:31 +02:00