Commit Graph

128 Commits

Author SHA1 Message Date
Sarah Hoffmann
f314abcfe1 bdd: restrict imports to four languages
This mainly restricts the number of country names that are loaded.
2022-05-11 16:40:53 +02:00
Sarah Hoffmann
e74e577029 bdd: recreate functions on template DB
Avoids calling function refresh on every scenario. The content won't
change between runs.
2022-05-11 15:50:22 +02:00
Sarah Hoffmann
aa0ae610c6 avoid calling OSM servers during bdd tests 2022-05-11 15:33:01 +02:00
Sarah Hoffmann
adeebec32a switch tests to ICU tokenizer as default 2022-05-10 14:54:50 +02:00
Sarah Hoffmann
a0ed80d821 restore the tokenizer directory when missing
Automatically repopulate the tokenizer/ directory with the PHP stub
and the postgresql module, when the directory is missing. This allows
to switch working directories and in particular run the service
from a different maschine then where it was installed.
Users still need to make sure that .env files are set up correctly
or they will shoot themselves in the foot.

See #2515.
2022-03-20 11:31:42 +01:00
Sarah Hoffmann
e133476c35 merge linked names correctly into namedetails
Convert the '_place_*' entries back to normal entries before
returning them in the 'namedetails' section. If the name field is
duplicated, kept the '_place_*' notation. This preserves the previous
behaviour before _place_ names were introduces but adds the additional
names from the linked place for reference.
2022-03-17 11:02:02 +01:00
Sarah Hoffmann
524dc64ab7 make sure outputs take into account linked place names 2022-03-16 21:44:52 +01:00
Sarah Hoffmann
42cd021d04 save differing linked polace names in extra fields
This keeps the names tracable and ensures that all names are searchable
when they differ. Do not keep names when they are exactly the same
to save some space. Linked names are cleaned out before relinking.
2022-03-16 16:38:52 +01:00
Sarah Hoffmann
ef98a85b05 correctly handle single-point interpolations in reverse
Lookup in location_property_osmline needs to be special cased
for startnumber = endnumber. Also adds tests for the case.

Fixes #2680.
2022-03-16 11:19:09 +01:00
Sarah Hoffmann
f74228830d bdd: run full import on tests
This uncovered a couple of outdated/wrong tests which have been
fixed, too.
2022-02-24 14:27:51 +01:00
Sarah Hoffmann
c3788d765e add consistent SPDX copyright headers 2022-01-03 16:23:58 +01:00
Sarah Hoffmann
04857d32cd enable PHPUnit 9 for coverage
A couple of functions have been renamed.
2021-12-07 12:07:17 +01:00
Sarah Hoffmann
118858a55e rename legacy_icu tokenizer to icu tokenizer
The new icu tokenizer is now no longer compatible with the old
legacy tokenizer in terms of data structures. Therefore there
is also no longer a need to refer to the legacy tokenizer in the
name.
2021-08-17 23:11:47 +02:00
Sarah Hoffmann
1db098c05d reinstate word column in icu word table
Postgresql is very bad at creating statistics for jsonb
columns. The result is that the query planer tends to
use JIT for queries with a where over 'info' even when
there is an index.
2021-07-28 11:31:47 +02:00
Sarah Hoffmann
324b1b5575 bdd tests: do not query word table directly
The BDD tests cannot make assumptions about the structure of the
word table anymore because it depends on the tokenizer. Use more
abstract descriptions instead that ask for specific kinds of
tokens.
2021-07-28 11:31:47 +02:00
Sarah Hoffmann
2e3c5d4c5b adapt tests for ICU tokenizer 2021-07-04 10:28:20 +02:00
Sarah Hoffmann
3aac51c81f switch BDD tests to always use search API 2021-06-06 15:27:52 +02:00
Sarah Hoffmann
00094c43d1 enable Tiger BDD API test for legacy_icu 2021-05-21 22:39:56 +02:00
AntoJvlt
3206bf59df Resolve conflicts 2021-05-17 13:52:35 +02:00
AntoJvlt
fb0ebb5bf0 Add tests for the new SPWikiLoader and SPCsvLoader 2021-05-16 16:10:06 +02:00
Darkshredder
e5ffc59cd5 feat: Added reverse-only-search validation 2021-05-14 02:36:21 +05:30
Sarah Hoffmann
1ccd4360b4 correctly handle removing all postcodes for country 2021-05-13 14:15:42 +02:00
Sarah Hoffmann
a263e54b94 enable BDD tests for different tokenizers
The tokenizer to be used can be choosen with -DTOKENIZER.

Adapt all tests, so that they work with legacy_icu tokenizer.
Move lookup in word table to a function in the tokenizer.
Special phrases are temporarily imported from the wiki until
we have an implementation that can import from file. TIGER
tests do not work yet.
2021-05-05 10:31:51 +02:00
Sarah Hoffmann
be6262c6ce move status test to tokenizer
The availability of the module is now tested by the tokenizer.
2021-04-30 17:41:08 +02:00
Sarah Hoffmann
3eb4d88057 boilerplate for PHP code of tokenizer
This adds an installation step for PHP code for the tokenizer. The
PHP code is split in two parts. The updateable code is found in
lib-php. The tokenizer installs an additional script in the
project directory which then includes the code from lib-php and
defines all settings that are static to the database. The website
code then always includes the PHP from the project directory.
2021-04-30 11:31:52 +02:00
Sarah Hoffmann
e1c5673ac3 require tokeinzer for indexer 2021-04-30 11:30:51 +02:00
Sarah Hoffmann
9397bf54b8 introduce external processing in indexer
Indexing is now split into three parts: first a preparation step
that collects the necessary information from the database and
returns it to Python. In a second step the data is transformed
within Python as necessary and then returned to the database
through the usual UPDATE which now not only sets the indexed_status
but also other fields. The third step comprises the address
computation which is still done inside the update trigger in
the database.

The second processing step doesn't do anything useful yet.
2021-04-30 11:30:51 +02:00
Sarah Hoffmann
1fd483643b add tests for different scripts 2021-04-26 23:01:06 +02:00
Sarah Hoffmann
79d55357e8 simplify sql and website creation functions 2021-04-19 10:53:30 +02:00
Sarah Hoffmann
118befd7d7 bdd tests: make indexing less verbose
Do not print progress info for indexing when there is an error
in the BDD tests.
2021-03-20 10:39:29 +01:00
Sarah Hoffmann
ebae3553e0 bdd: run all setup via nominatim Python library
Drops all calls to PHP utility functions. nominatim cli functions
are used where possible, to stay as close to the final code as
possible with the tests.

By removing the PHP calls, the test code now only uses osm2pgsql and
the database module from the build directory.
2021-03-16 22:20:41 +01:00
Sarah Hoffmann
dd03aeb966 bdd: use python library where possible
Replace calls to PHP scripts with direct calls into the
nominatim Python library where possible. This speed up
tests quite a bit.
2021-02-26 16:14:29 +01:00
Sarah Hoffmann
f08078ccca bdd tests: directly call python code for setup-website 2021-02-19 18:20:55 +01:00
Sarah Hoffmann
a60c34bded use a frozen DB for API tests
This way we also test that dropping does the right thing.
2021-02-17 22:35:27 +01:00
Sarah Hoffmann
3cb6f3e460 use DataDir constant for data only
So far the data directory constant has pointed to the source
directory to be usable with different subdirectories. Now only
the data subdirectory itself is being used with the constant,
so point to the directory directly.
2021-02-09 20:04:08 +01:00
Sarah Hoffmann
8ffd7d9243 remove unused BINDIR constant 2021-02-09 19:30:31 +01:00
Sarah Hoffmann
298ed11261 introduce constant for configuration directory
This replaces {data_dir}/settings throughout the code, so that
the configuration may be placed somewhere else in the directory
structure (e.g. in /etc).
2021-02-09 18:45:45 +01:00
Sarah Hoffmann
b9517c99ae rename sql directory to lib-sql
Also introduces a separate constant for the sql directory, so that
it can be put separately from the rest of the data if required.
2021-02-09 15:26:56 +01:00
Sarah Hoffmann
db3ced17bb rename lib to lib-php 2021-02-09 11:52:07 +01:00
Sarah Hoffmann
504922ffbe remove old nominatim.py in favour of 'nominatim index'
The PHP scripts need to know the position of the nominatim
tool in order to call it. This is handed in as environment
variable, so it can be set by the Python script.
2021-01-18 15:43:27 +01:00
Sarah Hoffmann
340e7f7210 bdd: complete coverage for API tests
Also removes some functions that are no longer used and
fixes debug output where the tests found an issue.
2021-01-17 16:12:06 +01:00
Sarah Hoffmann
c6c907d451 bdd: clean up and extend API tests for details
- remove duplicates created by replacing HTML tests
  with JSON tests
- add tests for newer functions for returning geometries
  and hierarchies
2021-01-16 12:04:13 +01:00
Sarah Hoffmann
19ab038724 collect coverage for /website directory as well 2021-01-15 20:27:14 +01:00
Sarah Hoffmann
2f73bb3643 bdd: directly call utility scripts in lib
This removes the dependency on php-symfony-dotenv for the tests.
2021-01-14 18:19:22 +01:00
Sarah Hoffmann
74122dc965 bdd: improve assert output for API query checks
Adds wrapper function for checking address parts and
more explanation strings to asserts.
2021-01-09 16:58:37 +01:00
Sarah Hoffmann
ee18a511c6 bdd: import API test DB as part of step setup
In the future, the BDD tests will simply set up the required
test database themselves. Like with the template database, it
is not reimported when it already exists unless that is explicitly
forced.

Makes most of the API tests currently fail because they still
point to old test data.
2021-01-07 11:51:38 +01:00
Sarah Hoffmann
73cbb6eb9a bdd: clean up DB ops steps
Adds comments and modernizes code.
2021-01-06 16:37:32 +01:00
Sarah Hoffmann
1f29475fa5 bdd: move column comparison in separate file
Introduces a new class DBRow that encapsulates the comparison
functions. This also is responsible for formatting more informative
assert messages. place and placex steps are unified.
2021-01-06 12:28:09 +01:00
Sarah Hoffmann
d586b95ff1 bdd: move nominitim id reader to separate file 2021-01-05 16:00:48 +01:00
Sarah Hoffmann
25557e5f14 bdd: factor out reindexing on updates 2021-01-05 15:17:46 +01:00