Nominatim

mirror of https://github.com/osm-search/Nominatim.git synced 2026-02-15 10:57:58 +00:00

Author	SHA1	Message	Date
Sarah Hoffmann	fdff579188	php: force use of global Exception class	2021-07-28 11:31:47 +02:00
Sarah Hoffmann	001b2aa9f9	fix linting issues in PHP	2021-07-28 11:31:47 +02:00
Sarah Hoffmann	1db098c05d	reinstate word column in icu word table. PostgreSQL is very bad at creating statistics for jsonb columns. The result is that the query planner tends to use JIT for queries with a WHERE clause over 'info' even when there is an index.	2021-07-28 11:31:47 +02:00
Sarah Hoffmann	324b1b5575	bdd tests: do not query word table directly. The BDD tests cannot make assumptions about the structure of the word table anymore because it depends on the tokenizer. Use more abstract descriptions instead that ask for specific kinds of tokens.	2021-07-28 11:31:47 +02:00
Sarah Hoffmann	6ad35aca4a	adapt special terms lookup to new word table	2021-07-28 11:31:47 +02:00
Sarah Hoffmann	70f154be8b	switch word tokens to new word table layout	2021-07-28 11:31:47 +02:00
Sarah Hoffmann	4342b28882	switch special phrases to new word table format	2021-07-28 11:31:47 +02:00
Sarah Hoffmann	5394b1fa1b	switch postcode tokens to new word table layout	2021-07-28 11:31:47 +02:00
Sarah Hoffmann	5ab0a63fd6	switch housenumber tokens to new word table layout	2021-07-28 11:31:47 +02:00
Sarah Hoffmann	1618aba5f2	switch country name tokens to new word table layout	2021-07-28 11:31:47 +02:00
Sarah Hoffmann	8096a1d67f	fix parameters for TokenWord creation	2021-07-20 10:21:40 +02:00
Sarah Hoffmann	3cd85eaaf1	remove Token from explicit input for SearchDescription extension. The token string is only required by the PartialToken type, so it can simply save the token string internally. No need to pass it to every type. Also moves the check for multi-word partials to the token loader code in the tokenizer. Multi-word partials can only happen with the legacy tokenizer and when the database was loaded with an older version of Nominatim. No need to keep the check for everybody.	2021-07-17 18:18:31 +02:00
Sarah Hoffmann	143ff14466	remove special status of partial tokens. Full-word tokens are no longer marked by a space at the beginning of the token. Use the new Partial token category instead. This removes a couple of special cases that we don't really need. The word table still has the space for compatibility reasons, so the tokenizer code needs to get rid of it when loading the tokens.	2021-07-14 22:17:17 +02:00
Sarah Hoffmann	6070c3d1d5	introduce a separate token type for partials. This means that the leading space can be removed as a partial word indicator.	2021-07-13 16:57:12 +02:00
Sarah Hoffmann	500c61685b	remove unused variables. As reported by SonarQube.	2021-07-09 16:36:42 +02:00
Sarah Hoffmann	8413075249	move abbreviation computation into import phase. This adds precomputation of abbreviated terms for names and removes abbreviation of terms in the query. Basic import works but still needs some thorough testing as well as speed improvements during import. Adds a new dependency on the Python library datrie.	2021-07-04 10:28:20 +02:00
Sarah Hoffmann	ba8ed7967d	add PHP part for new ICU-based tokenizer	2021-05-05 10:15:27 +02:00
Sarah Hoffmann	be6262c6ce	move status test to tokenizer. The availability of the module is now tested by the tokenizer.	2021-04-30 17:41:08 +02:00
Sarah Hoffmann	044bb6afa5	move tokenization in query into tokenizer	2021-04-30 17:41:08 +02:00
Sarah Hoffmann	3eb4d88057	boilerplate for PHP code of tokenizer. This adds an installation step for the PHP code of the tokenizer. The PHP code is split into two parts. The updatable code is found in lib-php. The tokenizer installs an additional script in the project directory, which then includes the code from lib-php and defines all settings that are static to the database. The website code then always includes the PHP from the project directory.	2021-04-30 11:31:52 +02:00

20 Commits