Commit Graph

103 Commits

Author SHA1 Message Date
Miroslav Šedivý
6ff51712fe Simplify int/float manipulation 2025-03-06 19:26:56 +01:00
Sarah Hoffmann
6b0d58d9fd restrict postcode parsing in typed phrases
Postcodes can only appear in postcode-type phrases and must then
cover the full phrase
2025-03-05 10:09:33 +01:00
Sarah Hoffmann
434fbbfd18 add support for country prefixes in postcodes 2025-03-04 15:18:27 +01:00
Sarah Hoffmann
921db8bb2f cache all info of ICUQueryAnalyser in a single object 2025-03-04 08:58:57 +01:00
Sarah Hoffmann
a574b98e4a remove postcode computation for word table during import 2025-03-04 08:57:59 +01:00
Sarah Hoffmann
e67ae701ac show token begin and end in debug output 2025-03-04 08:57:59 +01:00
Sarah Hoffmann
fc1c6261ed add postcode parser 2025-03-04 08:57:37 +01:00
Sarah Hoffmann
6759edfb5d make word generation from query a class method 2025-03-04 08:57:37 +01:00
Sarah Hoffmann
e362a965e1 search: merge QueryPart array with QueryNodes
The basic information on terms is pretty much always used together
with the node information. Merging them together saves some
allocation while making lookup easier at the same time.
2025-03-04 08:57:37 +01:00
Sarah Hoffmann
13db4c9731 replace datrie library with a simpler pure-Python class 2025-02-24 10:24:21 +01:00
Sarah Hoffmann
49bd18b048 replace PhraseType enum with simple int constants 2025-02-21 16:44:12 +01:00
Sarah Hoffmann
31412e0674 replace TokenType enum with simple char constants 2025-02-21 10:23:41 +01:00
Sarah Hoffmann
4577669213 replace BreakType enum with simple char constants 2025-02-21 09:57:48 +01:00
Sarah Hoffmann
9bf1428d81 consistently use query module as qmod 2025-02-21 09:31:21 +01:00
Sarah Hoffmann
b56edf3d0a avoid yielding when extracting words from query 2025-02-20 23:32:39 +01:00
Sarah Hoffmann
abc911079e remove word_number counting for phrases
We can just examine the break types to know if we are dealing
with a partial token.
2025-02-20 17:36:50 +01:00
Sarah Hoffmann
adabfee3be Merge pull request #3655 from lonvia/remove-name-ranking-in-postcode-search
Tweak penalties for postcode searches
2025-02-20 14:32:43 +01:00
Sarah Hoffmann
46c4446dc2 remove address penalty for postcode search
Searches of the form <postcode> <city> are in fact quite common.
2025-02-20 11:11:45 +01:00
Sarah Hoffmann
add9244a2f do not rerank address by full match in postcode search
The reranking result will not be completely correct because
the address of a postcode refers to the address _and_ name
of the parent and reranking was only done against the
address. We assume here that the postcode is precise enough
as to not require a penalty for partial matches.
2025-02-20 10:29:03 +01:00
Sarah Hoffmann
55c3176957 strip normalisation results of normal and special spaces 2025-02-19 14:40:35 +01:00
Sarah Hoffmann
6730c8bac8 add optional output of extratags to geocodejson 2025-02-16 10:16:40 +01:00
Sarah Hoffmann
ee8915f2b6 prepare 5.0.0 release 2025-02-05 10:54:38 +01:00
Sarah Hoffmann
c2cb6722fe use autocommit when creating tables and indexes
Might avoid some deadlock situations with autovacuum.
2025-01-09 17:14:37 +01:00
Sarah Hoffmann
efc09a5cfc add Japanese phrase preprocessing
Code adapted from GSOC code by @miku.
2025-01-09 09:24:10 +01:00
Sarah Hoffmann
86ad9efa8a keep break indicators [:-] during normalisation
All punctuation will be converted to '-'. Soft breaks : may be
added by preprocessors. The break signs are only used during
query analysis and are ignored during import token analysis.
2025-01-09 09:21:55 +01:00
Sarah Hoffmann
d984100e23 add inner word break penalty 2025-01-07 21:42:25 +01:00
Sarah Hoffmann
499110f549 add SOFT_PHRASE break and enable parsing
Also enables parsing of PART breaks.
2025-01-06 17:10:24 +01:00
Sarah Hoffmann
eeb3d5dd0a make nominatim callable with themepark style 2024-12-16 10:26:55 +01:00
Sarah Hoffmann
4760e8341b move lua scripts into a separate directory 2024-12-16 10:26:55 +01:00
Sarah Hoffmann
fbb6edfdaf add documentation for new query preprocessing 2024-12-13 16:53:08 +01:00
Sarah Hoffmann
2b87c016db generalize normalization step for search query
It is now possible to configure functions for changing the query
input before it is analysed by the tokenizer.

Code is a cleaned-up version of the implementation by @miku.
2024-12-13 14:31:08 +01:00
Sarah Hoffmann
d9b4d1591d ignore postcode areas on reverse
Postcode lookups are best done by doing reverse at a higher
level and then extracting the postcode.
2024-12-12 19:02:00 +01:00
Sarah Hoffmann
416e70b97e have reverse fall back to country table when no country is found 2024-12-12 17:14:02 +01:00
Sarah Hoffmann
0770eaa5d0 use bbox size for secondary order of results
Helps to return the largest object when deduplicating results.
2024-11-19 10:38:50 +01:00
Sarah Hoffmann
98c1b923fc remove code only needed for older PostgreSQL/PostGIS versions 2024-11-18 10:11:09 +01:00
Sarah Hoffmann
fd1f2bc719 increase minimum versions for PostgreSQL and PostGIS 2024-11-18 09:28:06 +01:00
Sarah Hoffmann
689bcbd6ea Merge pull request #3590 from lonvia/lookup-per-osm-type
Look up different OSM types in placex separately
2024-11-15 09:44:16 +01:00
Sarah Hoffmann
3acd7df5c4 Merge pull request #3588 from lonvia/optional-reverse-api
Add support for adding endpoints to server conditionally
2024-11-14 19:33:57 +01:00
Sarah Hoffmann
7d418da564 look up different OSM types in placex separately
There are separate indexes on placex for the different OSM types.
PostgreSQL can only use these indexes if the type is fixed per query.
2024-11-14 17:47:01 +01:00
Sarah Hoffmann
20d0fb35ce enable search endpoint only when search table is available 2024-11-14 08:53:09 +01:00
Sarah Hoffmann
754ff15ebd move server route creation into async function 2024-11-13 21:27:14 +01:00
danieldegroot2
7c9002cae7 Update lookup.py - Correct spelling for "simultaneously"
Corrects minor spelling mistake.
2024-11-13 20:35:37 +01:00
Sarah Hoffmann
2735ea768a look up all places at once 2024-11-13 14:21:05 +01:00
Sarah Hoffmann
122ecd4626 remove remaining pylint hints 2024-11-10 22:49:29 +01:00
Sarah Hoffmann
1f07967787 fix style issue found by flake8 2024-11-10 22:47:14 +01:00
Sarah Hoffmann
b9e4563beb fix backward compatibility issues with Python 3.7 2024-10-25 23:43:59 +02:00
Sarah Hoffmann
2c0f2e1ede remove now unnecessary type-ignores 2024-10-25 17:56:47 +02:00
Sarah Hoffmann
5160a1d577 get bbox of postcode areas into results 2024-09-30 08:58:40 +02:00
Sarah Hoffmann
83013f819b derive bbox size for postcode nodes from rank_search 2024-09-30 08:58:40 +02:00
Sarah Hoffmann
15eb7f0bb1 add new format 'raw' for CLI commands
This dumps the original results with all details available.
2024-09-30 08:58:40 +02:00