Nominatim

Author	SHA1	Message	Date
Sarah Hoffmann	93d5be097a	bdd: do not expect legacy word table to be without empty tokens. It can happen for bogus names and this will not get fixed anymore.	2022-06-23 23:42:31 +02:00
Sarah Hoffmann	612d34930b	handle postcodes properly on word table updates. update_postcodes_from_db() needs to do the full postcode treatment in order to derive the correct word table entries.	2022-06-23 23:42:31 +02:00
Sarah Hoffmann	00d8df6fc3	bdd: move update tests from scenes to grid descriptions	2022-06-17 11:54:18 +02:00
Sarah Hoffmann	3493d317e4	bdd: clear log buffer after a successful import run	2022-06-17 11:54:18 +02:00
Sarah Hoffmann	e74e577029	bdd: recreate functions on template DB. Avoids calling function refresh on every scenario. The content won't change between runs.	2022-05-11 15:50:22 +02:00
Sarah Hoffmann	aa0ae610c6	avoid calling OSM servers during bdd tests	2022-05-11 15:33:01 +02:00
Sarah Hoffmann	adeebec32a	switch tests to ICU tokenizer as default	2022-05-10 14:54:50 +02:00
Sarah Hoffmann	42cd021d04	save differing linked place names in extra fields. This keeps the names traceable and ensures that all names are searchable when they differ. Do not keep names when they are exactly the same to save some space. Linked names are cleaned out before relinking.	2022-03-16 16:38:52 +01:00
Sarah Hoffmann	f74228830d	bdd: run full import on tests. This uncovered a couple of outdated/wrong tests which have been fixed, too.	2022-02-24 14:27:51 +01:00
Sarah Hoffmann	c3788d765e	add consistent SPDX copyright headers	2022-01-03 16:23:58 +01:00
Sarah Hoffmann	118858a55e	rename legacy_icu tokenizer to icu tokenizer. The new icu tokenizer is no longer compatible with the old legacy tokenizer in terms of data structures. Therefore there is also no longer a need to refer to the legacy tokenizer in the name.	2021-08-17 23:11:47 +02:00
Sarah Hoffmann	1db098c05d	reinstate word column in icu word table. PostgreSQL is very bad at creating statistics for jsonb columns. The result is that the query planner tends to use JIT for queries with a WHERE over 'info' even when there is an index.	2021-07-28 11:31:47 +02:00
Sarah Hoffmann	324b1b5575	bdd tests: do not query word table directly. The BDD tests cannot make assumptions about the structure of the word table anymore because it depends on the tokenizer. Use more abstract descriptions instead that ask for specific kinds of tokens.	2021-07-28 11:31:47 +02:00
Sarah Hoffmann	2e3c5d4c5b	adapt tests for ICU tokenizer	2021-07-04 10:28:20 +02:00
Sarah Hoffmann	1ccd4360b4	correctly handle removing all postcodes for country	2021-05-13 14:15:42 +02:00
Sarah Hoffmann	a263e54b94	enable BDD tests for different tokenizers. The tokenizer to be used can be chosen with -DTOKENIZER. Adapt all tests, so that they work with legacy_icu tokenizer. Move lookup in word table to a function in the tokenizer. Special phrases are temporarily imported from the wiki until we have an implementation that can import from file. TIGER tests do not work yet.	2021-05-05 10:31:51 +02:00
Sarah Hoffmann	e1c5673ac3	require tokenizer for indexer	2021-04-30 11:30:51 +02:00
Sarah Hoffmann	9397bf54b8	introduce external processing in indexer. Indexing is now split into three parts: first a preparation step that collects the necessary information from the database and returns it to Python. In a second step the data is transformed within Python as necessary and then returned to the database through the usual UPDATE which now not only sets the indexed_status but also other fields. The third step comprises the address computation which is still done inside the update trigger in the database. The second processing step doesn't do anything useful yet.	2021-04-30 11:30:51 +02:00
Sarah Hoffmann	118befd7d7	bdd tests: make indexing less verbose. Do not print progress info for indexing when there is an error in the BDD tests.	2021-03-20 10:39:29 +01:00
Sarah Hoffmann	ebae3553e0	bdd: run all setup via nominatim Python library. Drops all calls to PHP utility functions. nominatim cli functions are used where possible, to stay as close to the final code as possible with the tests. By removing the PHP calls, the test code now only uses osm2pgsql and the database module from the build directory.	2021-03-16 22:20:41 +01:00
Sarah Hoffmann	dd03aeb966	bdd: use python library where possible. Replace calls to PHP scripts with direct calls into the nominatim Python library where possible. This speeds up tests quite a bit.	2021-02-26 16:14:29 +01:00
Sarah Hoffmann	73cbb6eb9a	bdd: clean up DB ops steps. Adds comments and modernizes code.	2021-01-06 16:37:32 +01:00
Sarah Hoffmann	1f29475fa5	bdd: move column comparison in separate file. Introduces a new class DBRow that encapsulates the comparison functions. This also is responsible for formatting more informative assert messages. place and placex steps are unified.	2021-01-06 12:28:09 +01:00
Sarah Hoffmann	d586b95ff1	bdd: move nominatim id reader to separate file	2021-01-05 16:00:48 +01:00
Sarah Hoffmann	25557e5f14	bdd: factor out reindexing on updates	2021-01-05 15:17:46 +01:00
Sarah Hoffmann	197870e67a	bdd: move place table inserter into separate file. Also simplifies usage by implementing a function that inserts a complete table row.	2021-01-05 12:12:59 +01:00
Sarah Hoffmann	b8e39d2dde	bdd: move scene setup to OSM data steps. The step has nothing to do with the database.	2021-01-05 11:42:28 +01:00
Sarah Hoffmann	5dfa76a610	bdd: switch to auto commit mode. Put the connection to the test database into auto-commit mode and get rid of the explicit commits. Also use cursors always in context managers and unify the two implementations that copy data from the place table.	2021-01-05 11:42:28 +01:00
Sarah Hoffmann	58c471c627	bdd: remove class for lazy formatting. assert in combination with format() does the right thing and calls the __str__() method only when an assertion hits.	2021-01-05 10:39:44 +01:00
Sarah Hoffmann	213bf7d19d	bdd: rename db_ops steps. Now all files implementing steps are called steps_*.py.	2021-01-05 10:20:00 +01:00

30 Commits