Linked places may bring in extra names. These names need to be
processed by the tokenizer, which means that the linking must
happen before the data is handed to the tokenizer. Move the lookup
of the linked place into the preparation stage and update the name
fields there. Everything else is still done in the indexing stage.
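A minimal sketch of the name merging that now happens in the
preparation stage; the function name and data layout are illustrative,
not the actual Nominatim code:

    def merge_linked_names(place_names, linked_names):
        """Merge the names of the linked place into the place's own
        name dictionary before the tokenizer runs. The place's own
        names take precedence."""
        merged = dict(linked_names)
        merged.update(place_names)
        return merged

For example, merging {'name:en': 'Rome'} from the linked place into
{'name': 'Roma'} gives the tokenizer both keys to work with.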
The new icu tokenizer is no longer compatible with the old
legacy tokenizer in terms of data structures. Therefore there
is also no longer a need to refer to the legacy tokenizer in its
name.
PostgreSQL is very bad at creating statistics for jsonb
columns. The result is that the query planner tends to
use JIT for queries with a WHERE clause over 'info' even when
there is an index.
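One way to keep the planner from JIT-compiling such queries is to
switch JIT off for the affected connection. The SET command is
standard PostgreSQL; the psycopg2 wiring around it is only an
illustration:

    import psycopg2

    conn = psycopg2.connect(dbname='nominatim')
    with conn.cursor() as cur:
        # Disable JIT for this session so the misjudged cost of the
        # jsonb lookup cannot trigger expensive JIT compilation.
        cur.execute('SET jit = off')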
The BDD tests can no longer make assumptions about the structure of
the word table because it depends on the tokenizer. Use more
abstract descriptions instead that ask for specific kinds of
tokens.
Compound decomposition now creates a full name variant on
import, just like abbreviations do. This simplifies query-time
normalization and opens a path for changing the abbreviation
and compound decomposition lists for an existing database.
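A simplified sketch of the variant expansion; the rule format and
function name are invented for illustration:

    def expand_variants(name, replacements):
        """Apply abbreviation and decomposition replacements to a
        normalized name and return all resulting full-name variants."""
        variants = {name}
        for src, targets in replacements.items():
            new_variants = set()
            for variant in variants:
                if src in variant:
                    for target in targets:
                        new_variants.add(variant.replace(src, target))
            variants |= new_variants
        return variants

With replacements like {'strasse': ['strasse', 'str', ' strasse']} the
name 'hauptstrasse' yields the full variants 'hauptstrasse', 'hauptstr'
and 'haupt strasse', all of which are now created at import time.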
We previously added searching through rank 30 in a house
number search to enable searches for house number + name.
This had the unintended side effect that rank 30 objects
are also returned in a search that dropped the house number
from the query. This is wrong because POIs cannot function
as the parent of a house number.
This fix drops all rank 30 objects from the results for a
house number search if they do not match the requested house
number.
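The fix boils down to a filter like the following sketch (the field
names are illustrative):

    def filter_housenumber_results(results, requested_hnr):
        """Drop rank 30 objects (POIs) that do not carry the requested
        house number; they cannot act as the parent of a house number
        and therefore must not appear as fallback results."""
        return [r for r in results
                if r['rank_address'] < 30
                or r['housenumber'] == requested_hnr]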
- only save partial words without internal spaces
- consider comma and semicolon a separator of full words
- consider parts before an opening bracket a full word
  (but not the part after the bracket; see the sketch below)
Fixes #244.
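One possible reading of these rules, as a sketch only and not the
actual tokenizer code:

    import re

    def full_words(name):
        """Full words are the comma- or semicolon-separated parts of
        the text before an opening bracket; the part after the bracket
        is not turned into a full word."""
        before_bracket = name.split('(', 1)[0]
        return [w.strip() for w in re.split('[,;]', before_bracket)
                if w.strip()]

    def partial_words(name):
        """Partial words are single tokens only and never contain
        internal spaces."""
        return [p for w in full_words(name) for p in w.split()]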
The ICU library only offers transliterations for a limited set of
scripts. Add transliterations for the missing scripts from the
PostgreSQL module. This means that the same selection of scripts is
supported as with the old module.
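With PyICU, additional hand-written rules can be chained after the
stock ICU transforms roughly like this; the rule shown is only a
placeholder for the rules taken over from the PostgreSQL module:

    from icu import Transliterator, UTransDirection

    # Placeholder rule; the real rule sets cover the scripts that ICU
    # does not transliterate out of the box.
    EXTRA_RULES = '\u0259 > e;'

    base = Transliterator.createInstance('Any-Latin; Latin-ASCII',
                                         UTransDirection.FORWARD)
    extra = Transliterator.createFromRules('ExtraScripts', EXTRA_RULES,
                                           UTransDirection.FORWARD)

    def transliterate(text):
        return extra.transliterate(base.transliterate(text))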
The tokenizer to be used can be chosen with -DTOKENIZER.
Adapt all tests so that they work with the legacy_icu tokenizer.
Move the lookup in the word table to a function in the tokenizer.
Special phrases are temporarily imported from the wiki until
we have an implementation that can import them from a file. TIGER
tests do not work yet.
This adds an installation step for the tokenizer's PHP code. The
PHP code is split into two parts: the updateable code is found in
lib-php, while the tokenizer installs an additional script in the
project directory which includes the code from lib-php and
defines all settings that are static for the database. The website
code then always includes the PHP from the project directory.
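Conceptually the installation step writes a small stub into the
project directory, roughly like this sketch (file names, constants
and paths are illustrative):

    from pathlib import Path

    PHP_STUB = """<?php
    // Settings that are static for this database.
    @define('CONST_Max_Word_Frequency', {max_word_freq});
    // Pull in the updateable tokenizer code from lib-php.
    require_once('{lib_php_dir}/tokenizer/legacy_icu_tokenizer.php');
    """

    def install_php_stub(project_dir, lib_php_dir, max_word_freq):
        stub = Path(project_dir) / 'tokenizer.php'
        stub.write_text(PHP_STUB.format(max_word_freq=max_word_freq,
                                        lib_php_dir=lib_php_dir))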
Indexing is now split into three parts: first comes a preparation
step that collects the necessary information from the database and
returns it to Python. In a second step the data is transformed
within Python as necessary and then returned to the database
through the usual UPDATE, which now not only sets the indexed_status
but also other fields. The third step comprises the address
computation, which is still done inside the update trigger in
the database.
The second processing step doesn't do anything useful yet.
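As a sketch, with table and column names simplified and a
hypothetical tokenizer call, the loop now looks roughly like this:

    from psycopg2.extras import Json

    def index_places(conn, rank, tokenizer):
        with conn.cursor() as cur:
            # Step 1: preparation - collect the places that need indexing.
            cur.execute("""SELECT place_id, name FROM placex
                           WHERE indexed_status > 0 AND rank_address = %s""",
                        (rank,))
            places = cur.fetchall()
            for place_id, name in places:
                # Step 2: transform the data in Python; currently this
                # only computes the token information for the names.
                token_info = tokenizer.process_place(name)
                # Step 3: write back. The UPDATE sets indexed_status
                # together with the new fields and fires the trigger
                # that still computes the address inside the database.
                cur.execute("""UPDATE placex
                               SET indexed_status = 0, token_info = %s
                               WHERE place_id = %s""",
                            (Json(token_info), place_id))
        conn.commit()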
Drops all calls to PHP utility functions. nominatim cli functions
are used where possible, so that the tests stay as close to the
final code as possible.
With the PHP calls removed, the test code now only uses osm2pgsql
and the database module from the build directory.
Always look up the closest housenumber before looking up
interpolations. This ensures that closer housenumbers are
preferred over interpolations.
Fixes #2214.
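The effect can be illustrated with a sketch over candidate lists of
(distance, place) pairs:

    def pick_result(housenumbers, interpolations):
        """Prefer the closest real house number; an interpolation only
        wins if it is strictly closer."""
        house = min(housenumbers, key=lambda c: c[0], default=None)
        interp = min(interpolations, key=lambda c: c[0], default=None)
        if house is None:
            return interp
        if interp is not None and interp[0] < house[0]:
            return interp
        return house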
So far the data directory constant has pointed to the source
directory so that it could be used with different subdirectories.
Now only the data subdirectory itself is used with the constant,
so point it at that directory directly.
This replaces {data_dir}/settings throughout the code, so that
the configuration may be placed somewhere else in the directory
structure (e.g. in /etc).
The PHP scripts need to know the location of the nominatim
tool in order to call it. This is handed in as an environment
variable, so it can be set by the Python script.
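A sketch of the hand-over; the variable name and paths are
illustrative only:

    import os
    import subprocess

    # The Python side exports the location of the nominatim tool before
    # invoking the PHP script, which reads it via getenv().
    env = dict(os.environ, NOMINATIM_TOOL='/usr/local/bin/nominatim')
    subprocess.run(['php', 'website/search.php'], env=env, check=True)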
In the future, the BDD tests will simply set up the required
test database themselves. Like the template database, it
is not reimported when it already exists unless that is explicitly
forced.
This currently makes most of the API tests fail because they still
point to old test data.