Compare commits

463 Commits
4.0.x ... 4.1.x

Author SHA1 Message Date
Sarah Hoffmann
63852d2252 prepare release 4.1.2 2023-02-20 17:52:00 +01:00
Sarah Hoffmann
5c3691fb64 harmonize flags for PHP's htmlspecialchars 2023-02-20 17:44:59 +01:00
Sarah Hoffmann
6d94af3b5a adapt PHP tests for debug output 2023-02-20 17:44:32 +01:00
Sarah Hoffmann
a1592faf5f properly encode special HTML characters in debug mode 2023-02-20 17:44:29 +01:00
Sarah Hoffmann
ec533f6a1a prepare release 4.1.1 2022-11-19 16:15:47 +01:00
Sarah Hoffmann
9f5adabd12 update osm2pgsql to 1.7.1 2022-11-19 15:54:27 +01:00
Sarah Hoffmann
3d9c33192b drop illegal values for addr:interpolation on update 2022-11-19 15:53:29 +01:00
Sarah Hoffmann
05863ae5ca correctly handle special term + name combination
Special terms with operator name usually appear in combination with the
name. The current penalties only took name + special term into account,
not special term + name.

Fixes #2876.
2022-11-19 15:52:19 +01:00
Sarah Hoffmann
a856c56450 fix type issues with calls to pyosmium 2022-11-19 15:51:09 +01:00
Marc Tobias
aa2e4e411b Tiger install doc: add -refresh website- step 2022-11-19 15:51:02 +01:00
Sarah Hoffmann
5cdeaac967 add types-requests dependency 2022-11-19 15:50:45 +01:00
Sarah Hoffmann
6a7b2b823a respect socket timeout also in other replication functions 2022-11-19 15:50:38 +01:00
Sarah Hoffmann
2dd8433ab6 fix timeout use for replication timeout
The timeout parameter is no longer taken into account since
pyosmium switched to the requests library. This adds the parameter
back.
2022-11-19 15:50:30 +01:00
Marc Tobias
951f92f665 update those github action packages still using node12 2022-11-19 15:49:58 +01:00
Sarah Hoffmann
9d009c7967 ignore interpolations without parent on reverse search
If no parent can be found for an interpolation, there is most
likely a data error involved. So don't show these interpolations
in reverse search results.
2022-11-19 15:49:17 +01:00
marc tobias
442e8fb411 Install scripts: remove version from /var/run/php-fpm filenames 2022-11-19 15:48:52 +01:00
Sarah Hoffmann
6a5bbdfae0 actions: pin pyicu to 2.9 2022-11-19 15:48:30 +01:00
marc tobias
6bac238760 Documentation: remove year from TIGER filename 2022-11-19 15:47:05 +01:00
Sarah Hoffmann
185c3cf7a8 mypy: fix new warnings due to external type updates 2022-11-19 15:45:20 +01:00
Mauricio Scheffer
ae5687539a docs: fix links to rank docs 2022-11-19 15:44:58 +01:00
Sarah Hoffmann
d71be2b60a ignore irrelevant extra tags on address interpolations
When deciding if an address interpolation has address information, only
look for addr:street and addr:place. If they are not there go looking
for the address on the address nodes. Ignores irrelevant tags like
addr:inclusion.

Fixes #2797.
2022-11-19 15:44:20 +01:00
Sarah Hoffmann
d910f52221 more invalidations when boundary changes rank
When a boundary or place changes its address rank, all places where
it participates as address need to be potentially reindexed.
Also use the computed rank when testing place nodes against
boundaries. Boundaries are computed earlier.

Fixes #2794.
2022-11-19 15:43:08 +01:00
Sarah Hoffmann
f48a37deea fix base number of returned results
The intent was to always search for at least 10 results.

Improves on #882.
2022-11-19 15:40:43 +01:00
Sarah Hoffmann
c08e3849b8 adapt to new type annotations from typeshed
Some more functions from psycopg are now properly annotated.
No ignoring necessary anymore.
2022-11-19 15:40:01 +01:00
Sarah Hoffmann
ec92167514 docs: add types-psutil requirement 2022-11-19 15:39:47 +01:00
Sarah Hoffmann
5a05608b34 remove mypy ignore for psutil.virtual_memory()
Now available in typeshed.
2022-11-19 15:39:09 +01:00
Sarah Hoffmann
eecc73ea1a docs: fix dangling links 2022-08-05 15:29:43 +02:00
Sarah Hoffmann
8c73c0795e docs: update links to vagrant instructions 2022-08-05 15:27:11 +02:00
Sarah Hoffmann
7d68aa8f04 prepare release 4.1.0 2022-08-05 14:33:11 +02:00
Sarah Hoffmann
a0cd96e05e Merge pull request #2786 from lonvia/export-centroid-for-tokenizer
Export centroid to tokenizer
2022-08-01 11:38:24 +02:00
Sarah Hoffmann
b19c90b9a6 export centroid to tokenizer
May come in handy when developing sanitizers for an area smaller
than country size.
2022-07-31 22:10:58 +02:00
Sarah Hoffmann
e427712cb0 Merge pull request #2784 from lonvia/doscs-customizing-icu-tokenizer
Document the public API of sanitizers and token analysis modules
2022-07-31 19:15:50 +02:00
Sarah Hoffmann
9864b191b1 fix various typos 2022-07-31 17:10:35 +02:00
Sarah Hoffmann
e7574f119e add simple examples of sanitizers and token analysis 2022-07-29 17:15:25 +02:00
Sarah Hoffmann
51b6d16dc6 overhaul the token analysis interface
The functional split between the two functions is now that the
first one creates the ID that is used in the word table and
the second one creates the variants. There no longer is a
requirement that the ID is the normalized version. We might
later reintroduce the requirement that a normalized version be available
but it doesn't necessarily need to be through the ID.

The function that creates the ID now gets the full PlaceName. That way
it might take into account attributes that were set by the sanitizers.

Finally rename both functions to something more sane.
2022-07-29 15:14:11 +02:00
Sarah Hoffmann
34d27ed45c move PlaceName into the generic data module 2022-07-29 11:42:20 +02:00
Sarah Hoffmann
094100bbf6 harmonize spelling
Stick with the American spelling of Analyze.
2022-07-29 10:52:01 +02:00
Sarah Hoffmann
c8873d34af harmonize interface of token analysis module
The configure() function now receives a Transliterator object instead
of the ICU rules. This harmonizes the parameters with the create
function.
2022-07-29 10:43:07 +02:00
Sarah Hoffmann
f0d640961a add documentation for custom token analysis 2022-07-29 09:41:28 +02:00
Sarah Hoffmann
3746befd88 add documentation for sanitizer interface
Also switches mkdocstrings to 0.18 with the rather unfortunate
consequence that now mkdocstrings-python-legacy is needed as well.
2022-07-28 22:00:29 +02:00
Sarah Hoffmann
a8b037669a Merge pull request #2780 from lonvia/python-modules-in-project-directory
Support for external sanitizer and token analysis modules
2022-07-28 21:58:04 +02:00
Sarah Hoffmann
d819036daa add support for external token analysis modules 2022-07-25 16:27:22 +02:00
Sarah Hoffmann
6d41046b15 add support for external sanitizer modules 2022-07-25 16:10:19 +02:00
Sarah Hoffmann
7b7203c149 add function for loading plugin modules
Loads modules for configurable code like tokenizers, sanitizers, etc.
Supports internal modules, external libraries and code from the
project directory.
2022-07-25 16:10:10 +02:00
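The loading scheme described in this commit (internal modules, external libraries, code from the project directory) could look roughly like the following. This is a minimal sketch and not the project's actual implementation: the function name `load_plugin_module`, its parameters, and the rule used to tell the three cases apart (a `.py` suffix for project files, a dot for external libraries) are assumptions made for illustration.

```python
import importlib
import importlib.util
from pathlib import Path
from types import ModuleType


def load_plugin_module(name: str, project_dir: Path,
                       internal_package: str) -> ModuleType:
    """Hypothetical loader that resolves `name` to a Python module.

    Assumed convention (not taken from the source):
    - names ending in '.py' are files inside the project directory
    - names containing a dot are external libraries
    - anything else is a submodule of `internal_package`
    """
    if name.endswith('.py'):
        path = project_dir / name
        spec = importlib.util.spec_from_file_location(path.stem, path)
        if spec is None or spec.loader is None:
            raise ImportError(f"Cannot load module from {path}")
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)  # run the file's top-level code
        return module

    if '.' in name:
        return importlib.import_module(name)

    return importlib.import_module(f"{internal_package}.{name}")
```

With this sketch, a project-local sanitizer would be referenced by its file name, while built-in ones are referenced by a short name.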
Sarah Hoffmann
95d4061b2a Merge pull request #2775 from lonvia/remove-centos-instructions
Remove vagrant scripts for CentOS
2022-07-25 10:29:32 +02:00
Sarah Hoffmann
375b57a96a vagrant: remove proj dependency and only require php-cli 2022-07-24 10:24:18 +02:00
Sarah Hoffmann
12ace4329d remove CentOS installation instructions
Fixes #2601.
2022-07-24 10:22:22 +02:00
Sarah Hoffmann
09e0be0e39 Merge pull request #2774 from lonvia/parameter-arrays
Ignore URL parameters in array notation
2022-07-23 23:56:32 +02:00
Sarah Hoffmann
cd4bcea894 ignore API parameters in array notation
PHP automatically parses parameters in an array notation (foo[]) into
array types. Ignore these parameters as 'unknown'.

Fixes #2763.
2022-07-23 10:51:44 +02:00
Sarah Hoffmann
1bee151fe3 Merge pull request #2772 from kianmeng/fix-typos
docs: fix typos
2022-07-20 17:13:30 +02:00
Kian-Meng Ang
f5e52e748f docs: fix typos 2022-07-20 22:05:31 +08:00
Sarah Hoffmann
b7f6c7c76a docs: slightly increase recommended hardware requirements 2022-07-20 10:16:23 +02:00
Sarah Hoffmann
bc7f6209d8 Merge pull request #2770 from lonvia/typed-python
Type annotations for Python code
2022-07-19 09:03:30 +02:00
Sarah Hoffmann
372a548c28 CI: remove installation of pip on Ubuntu 20 2022-07-18 12:19:04 +02:00
Sarah Hoffmann
5aad105c73 add explicit cast for fetchone 2022-07-18 10:18:51 +02:00
Sarah Hoffmann
f40c83d025 CI: use psutil type stubs 2022-07-18 09:55:58 +02:00
Sarah Hoffmann
83054af46f remove typing_extensions requirement
The typing_extensions package is only necessary now when running mypy.
It won't be used at runtime anymore.
2022-07-18 09:55:58 +02:00
Sarah Hoffmann
cb81f11422 CI: make type checking strict 2022-07-18 09:55:58 +02:00
Sarah Hoffmann
a849f3c9ec add type annotations for command line functions 2022-07-18 09:55:54 +02:00
Sarah Hoffmann
25d854dc5c add type annotations for Tiger import function 2022-07-18 09:54:29 +02:00
Sarah Hoffmann
9963261d8d add type annotations to special phrase importer 2022-07-18 09:54:29 +02:00
Sarah Hoffmann
459ab3bbdc add type annotations to database check functions 2022-07-18 09:54:29 +02:00
Sarah Hoffmann
a21d4d3ac4 add type annotations for database import functions 2022-07-18 09:54:29 +02:00
Sarah Hoffmann
4da1f0da6f add type annotations for migrations 2022-07-18 09:54:29 +02:00
Sarah Hoffmann
17bbe2637a add type annotations to tool functions 2022-07-18 09:54:27 +02:00
Sarah Hoffmann
6c6bbe5747 add type annotations for ICU tokenizer 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
18b16e06ca add type annotations for legacy tokenizer 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
e37cfc64d2 add type annotations to ICU tokenizer helper modules 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
77510f4a3b add typing extensions for Ubuntu22.04 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
d35e3c25b6 add type annotations for token analysis
No annotations for ICU types yet.
2022-07-18 09:47:57 +02:00
Sarah Hoffmann
62eedbb8f6 add type hints for sanitizers 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
5617bffe2f add type annotations for indexer 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
8adab2c6ca add typing information for postcode formatter 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
d0c44431d0 add typing information for place_info and country_info 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
282a61ce51 add typing information for utils submodule 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
7a1d22ff15 type annotations for non-blocking DB connection 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
0dff71a410 add type annotations for SQL preprocessor 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
26f30bff28 add type annotation to DB utils
As a cursor is needed as type, make this a public type.
2022-07-18 09:47:57 +02:00
Sarah Hoffmann
e6775e713c add typing information to DB properties 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
69f9122bef add typing annotations for DB status module
Requires TypedDict which is only available from Python 3.8. Require
therefore typing_extensions to make the functions available for
earlier Python versions.
2022-07-18 09:47:57 +02:00
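The fallback described in this commit is a common pattern: import `TypedDict` from the standard library where it exists and from `typing_extensions` otherwise. A minimal sketch follows; the `StatusRow` class and its fields are hypothetical and only illustrate what a typed status record might look like.

```python
import sys
from typing import Optional

# TypedDict is part of the standard library from Python 3.8 onwards.
if sys.version_info >= (3, 8):
    from typing import TypedDict
else:
    from typing_extensions import TypedDict


class StatusRow(TypedDict):
    """Hypothetical shape of a status record (field names are assumptions)."""
    lastimportdate: str
    sequence_id: Optional[int]
    indexed: Optional[bool]


def is_up_to_date(row: StatusRow) -> bool:
    """Return True when the record has a sequence ID and is fully indexed."""
    return row['sequence_id'] is not None and bool(row['indexed'])
```

At runtime a `TypedDict` is an ordinary `dict`; the field types are only checked by tools such as mypy.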
Sarah Hoffmann
fc254fc744 adapt use of Connection in bdd tests to name change 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
845c43137a add type annotations to freeze functions 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
aaf2b6032e fix uses of config.get_path() to expect None 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
c4928c646d define type for environment dictionaries 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
f12fe54d2b restrict return type more 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
fc03c0266a add type annotations to exec_utils 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
7b042de300 CI: install type info for psycopg2 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
681aad7e0d avoid issues with Python < 3.9 and linting 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
f22fa992f7 move complex typing annotations to extra file 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
992e6f72cf type annotations for DB utils 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
e6ee3c772c type annotations for DB connection 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
9d716f0f7d mypy: add psycopg2 typing info from typeshed 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
95ed95c616 add type annotations to config module 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
bf36f33e79 add type annotations for version.py 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
58ab8319b9 mypy: ignore dotenv library 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
a87cb77ce8 document use of mypy 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
2be45a35b4 CI: add mypy to tests 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
9b636fdc10 mypy: minimal annotations to enable a clean run 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
b1903f0fbf Merge pull request #2761 from lonvia/repair-index-analysis
Repair `admin --analyse-indexing`
2022-07-18 09:38:08 +02:00
Sarah Hoffmann
00f5b78160 Merge pull request #2764 from otbutz/patch-4
Remove legacy Postgres options
2022-07-13 15:51:47 +02:00
otbutz
d58061473e Remove legacy Postgres options 2022-07-12 09:49:10 +02:00
Sarah Hoffmann
33cb925f2e Merge pull request #2691 from mtmail/ubuntu-22
Vagrant and CI tests for Ubuntu 22.04
2022-07-11 15:37:51 +02:00
marc tobias
c70ca7f57b In tests for PHP 8 disable Just-in-time, it conflicts with tools that determine coverage 2022-07-09 22:03:48 +02:00
Marc Tobias
a6dab5e300 Vagrant and CI tests for Ubuntu 22.04 2022-07-09 22:03:48 +02:00
Sarah Hoffmann
7cafec0750 decode_json() always create arrays instead of objects 2022-07-09 09:10:21 +02:00
Sarah Hoffmann
4b12d52ef5 convert admin --analyse-indexing to new indexing method
A proper run of indexing requires the place information from the
analyzer. Add the pre-processing of place data, so the right
information is handed into the update function.
2022-07-07 16:20:08 +02:00
Sarah Hoffmann
300612c5a8 Merge pull request #2760 from lonvia/reorganize-data-classes
Code cleanup: move some common code into the data submodule
2022-07-07 16:12:11 +02:00
Sarah Hoffmann
856925d19b remove analyze() from PlaceInfo class
The function creates circular dependencies.
2022-07-07 12:06:58 +02:00
Sarah Hoffmann
cbbcbb1fd7 move country_info into data submodule 2022-07-06 11:08:36 +02:00
Sarah Hoffmann
bce93d60bd move PlaceInfo into data submodule
This data structure is shared between indexer and tokenizer.
2022-07-06 10:54:47 +02:00
Sarah Hoffmann
69e51aebab test: avoid column names with upper-case letters
This may cause problems when the column names get quoted.
2022-07-05 09:12:55 +02:00
Sarah Hoffmann
8ac133f2ee CI: remove unneeded stuff to make space for DB 2022-07-03 16:42:57 +02:00
Sarah Hoffmann
67996929e0 Merge pull request #2706 from mtmail/php-fixes-php7-vs-php8
PHP 8 behaves slightly differently with in_array and usort
2022-07-03 11:28:52 +02:00
Marc Tobias
ccf119206d PHP 8 behaves slightly differently with in_array and usort 2022-07-03 10:55:34 +02:00
Sarah Hoffmann
bc63f10057 fix syntax error with tablespaces 2022-06-30 09:19:16 +02:00
Sarah Hoffmann
6f15306766 docs: replace deprecated pages option
Fixes #2661.
2022-06-29 20:30:28 +02:00
Sarah Hoffmann
161d83af5b fix handling of zero importance
To avoid importance becoming zero and cancelling out other weights,
df008d99f5 introduced a minimum value
for importance. That broke importances for interpolated addresses,
which are less than zero.

Instead of setting a minimum, set zero importances to a very small
value.

Fixes #2753.
2022-06-29 17:54:30 +02:00
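The difference between the two strategies named in this commit can be illustrated in a few lines. This is only a sketch of the arithmetic, not the code that was changed; the constant `MIN_IMPORTANCE` and both function names are hypothetical.

```python
MIN_IMPORTANCE = 1e-7  # hypothetical "very small value"


def clamp_to_minimum(importance: float) -> float:
    """Earlier approach: enforce a lower bound.

    This also overwrites negative importances, which is what broke
    interpolated addresses.
    """
    return max(importance, MIN_IMPORTANCE)


def replace_zero(importance: float) -> float:
    """Approach described in the fix: only an exact zero is replaced,
    negative values pass through unchanged."""
    return MIN_IMPORTANCE if importance == 0 else importance
```

For a negative input such as `-0.1`, the first function returns the minimum while the second keeps `-0.1`; for `0` both return the small positive value, so other weights are not cancelled out.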
Sarah Hoffmann
3bf3b894ea Merge pull request #2757 from lonvia/filter-postcodes
Add filtering, normalisation and variants for postcodes
2022-06-24 21:09:41 +02:00
Sarah Hoffmann
536f08f33a ignore 5+ postcodes in the US for now
Hierarchical postcodes need a different treatment.
2022-06-24 19:24:22 +02:00
Sarah Hoffmann
3dd7410bb7 bdd: correctly skip postcode tests for legacy 2022-06-23 23:42:31 +02:00
Sarah Hoffmann
93d5be097a bdd: do not expect legacy word table to be without empty tokens
It can happen for bogus names and this will not get fixed anymore.
2022-06-23 23:42:31 +02:00
Sarah Hoffmann
6eb9044353 adapt search algorithm to new postcode format in word 2022-06-23 23:42:31 +02:00
Sarah Hoffmann
612d34930b handle postcodes properly on word table updates
update_postcodes_from_db() needs to do the full postcode treatment
in order to derive the correct word table entries.
2022-06-23 23:42:31 +02:00
Sarah Hoffmann
5be320368c add documentation for postcode customization 2022-06-23 23:42:31 +02:00
Sarah Hoffmann
7f2ad4ac7e fix linting issue 2022-06-23 23:42:31 +02:00
Sarah Hoffmann
0f00f4968c fix up BDD tests for postcode changes
Includes smaller code fixes found by the tests.
2022-06-23 23:42:31 +02:00
Sarah Hoffmann
37b2c6a830 port legacy tokenizer to new postcode handling
Also documents the changes to the SQL functions of the tokenizer.
2022-06-23 23:42:31 +02:00
Sarah Hoffmann
e86db3001f fix postcode pattern for Mozambique
Optional groups are not implemented yet.
2022-06-23 23:42:31 +02:00
Sarah Hoffmann
7b6ec4fc6c add tests for discarding bad postcodes 2022-06-23 23:42:31 +02:00
Sarah Hoffmann
67dfa38e60 fix linting problems 2022-06-23 23:42:31 +02:00
Sarah Hoffmann
2eca9fc8af cache postcode normalization 2022-06-23 23:42:31 +02:00
Sarah Hoffmann
b5e5efc131 only add well-formatted postcodes to location table 2022-06-23 23:42:31 +02:00
Sarah Hoffmann
80ea13437d move postcode matcher in a separate file 2022-06-23 23:42:31 +02:00
Sarah Hoffmann
bf86b45178 move postcode centroid computation to Python 2022-06-23 23:42:31 +02:00
Sarah Hoffmann
4885fdf0f9 add class for online centroid computation 2022-06-23 23:42:31 +02:00
Sarah Hoffmann
b7704833e4 icu: switch postcodes to using the pre-formatted one 2022-06-23 23:42:31 +02:00
Sarah Hoffmann
ca7b46511d introduce and use analyzer for postcodes 2022-06-23 23:42:31 +02:00
Sarah Hoffmann
18864afa8a postcodes: introduce a default pattern for countries without postcodes 2022-06-23 23:42:31 +02:00
Sarah Hoffmann
5ba75df507 postcode: generate a generic form 2022-06-23 23:42:31 +02:00
Sarah Hoffmann
9cf700e85d add postcodes for most of the remaining countries
Now includes all postcodes that have optional parts.
2022-06-23 23:42:31 +02:00
Sarah Hoffmann
9172696324 postcodes: add support for optional spaces 2022-06-23 23:42:31 +02:00
Sarah Hoffmann
49626ba709 add postcode formats with optional country code
If the country code is not part of the mandatory output, the
country code filter will do the correct handling.
2022-06-23 23:42:31 +02:00
Sarah Hoffmann
baee6f3de0 postcodes: strip leading country codes 2022-06-23 23:42:31 +02:00
Sarah Hoffmann
28ab2f6048 add postcodes patterns without optional spaces 2022-06-23 23:42:31 +02:00
Sarah Hoffmann
90d4d339db initial postcode cleaner for simple patterns
Moves postcodes that are either in countries without a postcode
system or don't correspond to the local pattern for postcodes into
a field for a normal address part. Makes them searchable but not as
a special address. This has two consequences: they are no longer a
skippable part of the address and the postcodes cannot be searched
on their own.
2022-06-23 23:42:31 +02:00
Sarah Hoffmann
6e0014e138 add postcode patterns for numeric postcodes
Adds patterns for countries that have simple numeric-only postcodes.
2022-06-23 23:42:31 +02:00
Sarah Hoffmann
8080625747 remove postcodes from countries that don't have them
The postcodes will only be removed as a 'computed postcode'; they
are still searchable for the given object.
2022-06-23 23:42:31 +02:00
Sarah Hoffmann
21fb501699 add info about countries without a postcode 2022-06-23 23:42:31 +02:00
Sarah Hoffmann
0cd3a1b9bd avoid near searches in very large areas
At some point the contains call becomes too expensive.
2022-06-23 23:42:09 +02:00
Sarah Hoffmann
8de483a45b Merge pull request #2755 from Luflosi/fix-typo
Fix typo
2022-06-20 22:23:36 +02:00
Luflosi
3ea87169ac Fix typo 2022-06-20 20:41:00 +02:00
Sarah Hoffmann
42d16d8296 Merge pull request #2751 from mtmail/issue-2750
Documentation fix: should be "nominatim refresh"
2022-06-20 10:21:06 +02:00
marc tobias
adf3ae004f Documentation fix: should be "nominatim refresh" 2022-06-20 02:32:23 +02:00
Sarah Hoffmann
fced1172c4 Merge pull request #2746 from bgo-eiu/patch-2
Added additional languages for Pakistan in country settings
2022-06-18 09:40:47 +02:00
Sarah Hoffmann
299e98776e Merge pull request #2749 from stefkiourk/patch-1
Typos and syntax on Reverse.md
2022-06-17 22:11:55 +02:00
Stef Ki
b803505402 Typos and syntax on Reverse.md 2022-06-17 21:01:38 +02:00
Sarah Hoffmann
8fb9795d04 Merge pull request #2748 from lonvia/bdd-grid-origin
BDD tests: remove support for scenes
2022-06-17 15:25:29 +02:00
Sarah Hoffmann
d8623d6818 bdd: remove support for scenes
Only keep support for the special point geometry 'country:xx'.
2022-06-17 11:54:18 +02:00
Sarah Hoffmann
6c58a4c46c bdd: move query tests from scene to grid description 2022-06-17 11:54:18 +02:00
Sarah Hoffmann
19f67e167c bdd: remove step for scene setup 2022-06-17 11:54:18 +02:00
Sarah Hoffmann
00d8df6fc3 bdd: move update tests from scenes to grid descriptions 2022-06-17 11:54:18 +02:00
Sarah Hoffmann
02068aec7f bdd: move import tests from scenes to grid descriptions 2022-06-17 11:54:18 +02:00
Sarah Hoffmann
3493d317e4 bdd: clear log buffer after a successful import run 2022-06-17 11:54:18 +02:00
Sarah Hoffmann
a2b486a5b0 bdd: allow to set an origin of the grid 2022-06-17 11:54:18 +02:00
Sarah Hoffmann
3d0f8bdc39 Merge pull request #2745 from lonvia/city-in-city-fix
Improve hierarchy computation for place areas
2022-06-16 15:36:39 +02:00
bgo-eiu
04644102f2 added additional languages for pakistan in country settings 2022-06-16 06:26:44 -04:00
Sarah Hoffmann
f833cc80df use default ranks when reorganising rank_address
When shifting address ranks, the evaluation is always done against
unshifted address ranks on import because the objects we compare against
have not been indexed yet. This changes for updates when the objects have
been touched in the meantime. To ensure consistent behaviour across
imports and updates, always use the unshifted address ranks.
2022-06-16 11:20:23 +02:00
Sarah Hoffmann
df0142678a improve address ordering with mixes of place and admin areas
Resolves a couple of situations where a mixed use of place areas and
administrative boundaries would result in a hierarchy that did not
properly respect the contains relation.
2022-06-16 10:44:16 +02:00
Sarah Hoffmann
800240550b Merge pull request #2737 from lonvia/reset-linking-ranks
Fix rank inheritance from linked places
2022-06-06 09:29:32 +02:00
Sarah Hoffmann
15cf7dd416 add testcase for #2551
This test proves that places that are linked need to be reindexed.
2022-06-05 21:39:17 +02:00
Sarah Hoffmann
2c05fc858a fix rank inheritance from linked places
When taking over the address rank from a linked place, it needs
to be the originally computed rank, not the one that might have
been adjusted in the meantime. The adjustment was made under the
assumption that the node is not linked.
2022-06-05 19:38:14 +02:00
Sarah Hoffmann
a024c7665c Merge pull request #2736 from lonvia/reverse-interpolation-index-order
Change indexing order for interpolations and non-addressable objects
2022-06-03 10:42:54 +02:00
Sarah Hoffmann
cbb4749996 change indexing order for interpolations
Interpolations are now indexed after rank 30 objects. The housenumber
nodes no longer need information from the interpolations while the
interpolations can make use of precomputed postcodes.
2022-06-02 15:16:46 +02:00
Sarah Hoffmann
4b0d9f71e8 Merge pull request #2735 from lonvia/geocodejson-type-reverse
Also fix type output in geocodejson for reverse
2022-06-01 22:14:06 +02:00
Sarah Hoffmann
218c56f9a6 use getattr() instead of __getattr__
Makes the linter happy.
2022-06-01 21:26:13 +02:00
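The linter complaint behind this commit is about calling a dunder method directly. A small sketch of the difference, using a hypothetical `Row` class that is not taken from the source:

```python
class Row:
    """Hypothetical record that exposes stored values as attributes."""

    def __init__(self, **values):
        self._values = values

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        try:
            return self._values[name]
        except KeyError:
            raise AttributeError(name) from None


row = Row(osm_type='N')

# Preferred: the built-in goes through the regular lookup machinery
# and supports a default for missing attributes.
osm_type = getattr(row, 'osm_type')
missing = getattr(row, 'missing', None)

# Discouraged by linters: bypasses normal lookup and has no default.
same_value = row.__getattr__('osm_type')
```

Both calls return the same value here; the built-in form is simply the idiomatic one.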
Sarah Hoffmann
a35eda3d2a also fix type output in geocodejson for reverse 2022-06-01 20:46:08 +02:00
Sarah Hoffmann
8a0e3e2f3d Merge pull request #2732 from lonvia/fix-ordering-address-parts
Fix order when searching for addr:* components
2022-05-31 20:26:05 +02:00
Sarah Hoffmann
12a3d51bcc Merge pull request #2731 from lonvia/cleanup-special-phrases
Minor code reorganisation around special phrase parsing
2022-05-31 17:13:56 +02:00
Sarah Hoffmann
60367d95dd Merge pull request #2730 from lonvia/exclude-inclusion-tag
Exclude addr:inclusion from search
2022-05-31 17:13:37 +02:00
Sarah Hoffmann
bd0e157b91 fix order when searching for addr:* components
When matching addr:* components the preference was given to
matches that do not intersect with the place.
2022-05-31 16:57:37 +02:00
Sarah Hoffmann
1821f68ca0 exclude addr:inclusion from search 2022-05-31 14:19:19 +02:00
Sarah Hoffmann
b5ac546275 CI: always use the latest version of pylint
This makes it easier to reproduce issues locally.
2022-05-31 09:12:26 +02:00
Sarah Hoffmann
46689df668 custom comparison for SpecialPhrase
Duplicate elimination only works when a custom hash/equal function
is implemented that is based on the members.
2022-05-30 16:30:41 +02:00
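Member-based hashing and equality, as this commit describes, can be sketched as follows. The attribute names of this `SpecialPhrase` stand-in are assumptions for illustration and may not match the real class.

```python
class SpecialPhrase:
    """Illustrative stand-in; attribute names are hypothetical."""

    def __init__(self, label: str, p_class: str, p_type: str, operator: str):
        self.label = label
        self.p_class = p_class
        self.p_type = p_type
        self.operator = operator

    def _key(self):
        # Single place that defines which members make up the identity.
        return (self.label, self.p_class, self.p_type, self.operator)

    def __eq__(self, other):
        if not isinstance(other, SpecialPhrase):
            return NotImplemented
        return self._key() == other._key()

    def __hash__(self):
        return hash(self._key())
```

Without these two methods, Python compares objects by identity, so two phrases with identical members would both survive in a `set`.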
Sarah Hoffmann
e828d0d3f7 move quoting hack to wiki loader
The bad quotes around the type for special phrases
specifically occur in the Wiki pages, so it should be
removed by the loader and not in the generic SpecialPhrase
object.
2022-05-30 14:40:33 +02:00
Sarah Hoffmann
cce0e5ea38 convert special phrase loaders to generators
Generators simplify the code quite a bit compared to the previous
Iterator approach.
2022-05-30 14:12:46 +02:00
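Why generators simplify loader code can be seen in a small sketch. The function below is hypothetical and does not reproduce the real loaders; it assumes a simple comma-separated input format purely for demonstration.

```python
from typing import Iterable, Iterator, Tuple


def iter_phrases(rows: Iterable[str]) -> Iterator[Tuple[str, str, str]]:
    """Yield (label, class, type) for each well-formed row.

    A generator keeps its position implicitly, so no separate iterator
    class with __iter__/__next__ and explicit state is needed.
    """
    for row in rows:
        parts = [part.strip() for part in row.split(',')]
        if len(parts) == 3 and all(parts):
            yield parts[0], parts[1], parts[2]
```

Malformed rows are skipped by simply not yielding, which in a hand-written iterator would require a loop inside `__next__`.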
Sarah Hoffmann
042e314589 remove the language parameter in the SPWikiLoader
Languages must always be configured through config or environment.
Also use monkeypatched environment in tests.
2022-05-30 10:26:20 +02:00
Sarah Hoffmann
61d813bfef add get_str_list() for config
Converts a config value written as a comma-separated list into
a Python list of strings.
2022-05-29 13:53:50 +02:00
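A conversion of this kind can be sketched as a standalone function. This is not the project's implementation: the real method lives on the configuration object, and the handling of empty values shown here (returning `None`) is an assumption.

```python
from typing import List, Optional


def get_str_list(raw: Optional[str]) -> Optional[List[str]]:
    """Split a comma-separated config value into a list of strings.

    Assumption: an unset or empty value yields None; surrounding
    whitespace and empty items are dropped.
    """
    if not raw:
        return None
    return [item.strip() for item in raw.split(',') if item.strip()]
```

A value such as `'en, de ,fr'` would come back as `['en', 'de', 'fr']`.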
Sarah Hoffmann
ecee5cf801 Merge pull request #2728 from lonvia/allow-more-partials
Allow search for partials consisting of 3 or more words
2022-05-27 18:09:11 +02:00
Sarah Hoffmann
9e4e913bf7 allow search for partials consisting of 3 or more words
The search query builder currently rejects searches for partial
names only, when the partial terms are all very frequent to avoid
queries that return too many results.

This change slightly relaxes the condition to allow the search when
there are 3 or more partial terms. With so many terms the number
of matches should be manageable.
2022-05-27 16:49:14 +02:00
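The relaxed condition can be expressed as a small predicate. This sketch only mirrors the rule as stated in the commit message; the function name, the input representation (one match count per partial term) and the frequency threshold are hypothetical.

```python
from typing import Sequence

FREQUENT_THRESHOLD = 10000  # hypothetical cut-off for a "very frequent" term


def allow_partial_only_search(term_counts: Sequence[int]) -> bool:
    """Decide whether a search on partial terms alone should be run.

    term_counts holds the (assumed) number of matches per partial term.
    """
    if not term_counts:
        return False
    all_frequent = all(count > FREQUENT_THRESHOLD for count in term_counts)
    # Old rule: reject whenever all terms are frequent.
    # Relaxed rule: still allow it when there are 3 or more terms.
    return not all_frequent or len(term_counts) >= 3
```

Two very frequent terms are still rejected, while three or more are let through on the expectation that their combination narrows the result set.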
Sarah Hoffmann
98fc528d8e Merge pull request #2715 from otbutz/patch-2
Simplify apache rewrite rules
2022-05-24 14:40:28 +02:00
otbutz
d1cd2d1674 Change to regular regex group 2022-05-24 11:32:59 +02:00
Sarah Hoffmann
b593fe9c3e Merge pull request #2718 from nslxndr/fix-log-endtime
Undefined offset in error log
2022-05-23 16:25:41 +02:00
Sarah Hoffmann
6ca6725f6e Merge pull request #2722 from lonvia/fix-relinking-on-updates
Fix bug with keeping linking on updates
2022-05-23 11:36:20 +02:00
Sarah Hoffmann
1d203fdb3c fix bug with keeping linking on updates
When moving the finding of linked places to the precomputation stage,
it was also moved before the statement where the linked_place_id was
removed from the linkee. The result was that the current linkee was
excluded when looking for a linked place on updates because it was
still linked to the boundary to be updated.

Fixed by allowing to either keep the linkage or change to an unlinked
place.
2022-05-23 10:55:10 +02:00
Sandor Nagy
3f30699131 correct end time computation 2022-05-20 23:11:00 +02:00
otbutz
22bd9c4993 Simplify apache rewrite rules 2022-05-20 10:15:28 +02:00
Sarah Hoffmann
4654701c10 Merge pull request #2713 from lonvia/remove-county-nodes-in-canada
Remove county nodes in Canada from addresses
2022-05-19 10:21:09 +02:00
Sarah Hoffmann
8a67ddcb2b remove county nodes in Canada from addresses
Canada has complete coverage for administrative boundaries on
county level. Removing the county nodes from the addresses avoids error
due to a wide-spread doubling of place nodes for city counties.
2022-05-18 10:19:05 +02:00
Sarah Hoffmann
ab71f17c47 Merge pull request #2710 from lonvia/offline-import-mode
Assorted performance improvements for BDD tests
2022-05-12 11:08:29 +02:00
Sarah Hoffmann
f314abcfe1 bdd: restrict imports to four languages
This mainly restricts the number of country names that are loaded.
2022-05-11 16:40:53 +02:00
Sarah Hoffmann
2d1a22705f Merge pull request #2709 from lonvia/less-strict-country-assignment
Be more strict with country assignments
2022-05-11 16:24:47 +02:00
Sarah Hoffmann
e74e577029 bdd: recreate functions on template DB
Avoids calling function refresh on every scenario. The content won't
change between runs.
2022-05-11 15:50:22 +02:00
Sarah Hoffmann
aa0ae610c6 avoid calling OSM servers during bdd tests 2022-05-11 15:33:01 +02:00
Sarah Hoffmann
dc6c4bf22e add offline import mode
In offline mode no attempts are made to download data from the internet.
At the moment that only concerns the computation of the database date.
It contacts the main API to get the date.
2022-05-11 15:03:02 +02:00
Sarah Hoffmann
a7a5f0161f Merge pull request #2708 from lonvia/use-format-literals
Assorted fixes for new pylint warnings
2022-05-11 14:29:56 +02:00
Sarah Hoffmann
739fe1c2c4 no longer allow fuzzy assignment of country
The fallback country boundaries already contain a sufficiently large
part of the water area, so there is no need to extend the country
assignment even more. Features outside countries should not show a
country in their address.
2022-05-11 11:54:25 +02:00
Sarah Hoffmann
3ba975466c fix spacing
Some versions of pylint are oddly picky.
2022-05-11 10:36:09 +02:00
Sarah Hoffmann
d14a585cc9 pylint: disable no-self-use check
This checker encourages bad behaviour (namely changing the static
status of a function during inheritance) and will be made optional
in upcoming versions of pylint.
2022-05-11 10:25:00 +02:00
Sarah Hoffmann
7f7a7df3a2 solve assorted issues with newer pylint versions
Includes more use of 'with', adding encodings to open statements
and a couple of issues with parameter renaming.
2022-05-11 10:22:14 +02:00
Sarah Hoffmann
5d5f40a82f use context management when processing Tiger data 2022-05-11 09:48:56 +02:00
Sarah Hoffmann
ae6b029543 remove redundant 'u' prefixes for unicode strings 2022-05-11 09:48:56 +02:00
Sarah Hoffmann
bb2bd76f91 pylint: avoid explicit use of format() function
Use psycopg2 SQL formatters for SQL and formatted string literals
everywhere else.
2022-05-11 09:48:56 +02:00
Sarah Hoffmann
4e1e166c6a add a function to return a formatted version
Replaces the various repeated format strings throughout the code.
2022-05-11 09:01:24 +02:00
Sarah Hoffmann
5ff35d9984 Merge pull request #2707 from lonvia/make-icu-tokenizer-the-default
Make ICU tokenizer the default
2022-05-11 08:52:49 +02:00
Sarah Hoffmann
c6a426a885 no longer need postgresql-server-dev packages 2022-05-10 18:33:51 +02:00
Sarah Hoffmann
11103268e9 make legacy tokenizer tests the extra on CI 2022-05-10 18:33:34 +02:00
Sarah Hoffmann
b332b1ae23 Merge pull request #2704 from mtmail/migrate-phpunit-xml-schema
PHPUnit 9 changed configuration schema slightly
2022-05-10 17:44:34 +02:00
Sarah Hoffmann
7e70e5f503 always state encoding when opening files in text mode
Also applies to Path.write_text().
2022-05-10 15:36:29 +02:00
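The rule from this commit applies to both `open()` and the `pathlib` helpers. Without an explicit `encoding`, Python falls back to a locale-dependent default, so the same code may behave differently across machines. A minimal sketch, with a hypothetical helper name:

```python
from pathlib import Path


def write_and_read(path: Path, text: str) -> str:
    """Write text and read it back, stating the encoding both times."""
    # Path.write_text() takes the same encoding argument as open().
    path.write_text(text, encoding='utf-8')

    with path.open('r', encoding='utf-8') as fd:
        return fd.read()
```

Stating `encoding='utf-8'` on both sides guarantees that non-ASCII names such as `Straße` round-trip unchanged.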
Marc Tobias
99fa23040a PHPUnit 9 changed configuration schema slightly 2022-05-10 15:20:43 +02:00
Sarah Hoffmann
adeebec32a switch tests to ICU tokenizer as default 2022-05-10 14:54:50 +02:00
Sarah Hoffmann
b93ef23d3f add migration hint for the new tokenizer default 2022-05-10 12:07:21 +02:00
Sarah Hoffmann
4002bee0c1 make ICU the default tokenizer 2022-05-10 12:02:50 +02:00
Sarah Hoffmann
ed6fda6968 Merge pull request #2702 from lonvia/move-country-names-into-includes
Clean up country name settings
2022-05-10 09:21:16 +02:00
Sarah Hoffmann
2ae13c5583 Merge pull request #2695 from mtmail/git-commit-hash-to-version
add git commit hash to 'nominatim --version' output
2022-05-10 09:14:15 +02:00
Marc Tobias
821dabb138 add git commit hash to --version output 2022-05-09 23:56:13 +02:00
Sarah Hoffmann
9d468f6da0 support arbitrary prefixes in country name list
This means we can now get rid of the last special cases for names.
2022-05-09 11:55:26 +02:00
Sarah Hoffmann
3a8ddf736e move country names into separate include files 2022-05-09 11:55:26 +02:00
Sarah Hoffmann
720c7b7519 Merge pull request #2696 from mtmail/norminatyn-typos
fix typos of name Nominatim
2022-05-05 10:04:55 +02:00
Marc Tobias
0de83c4a51 fix typos of name Nominatim 2022-05-05 01:04:47 +02:00
Sarah Hoffmann
8c073993ef Merge pull request #2693 from mtmail/nominatim-cli-version
new "nominatim --version" global CLI argument
2022-05-04 09:14:35 +02:00
Marc Tobias
a79ab41782 new nominatim --version CLI argument 2022-05-04 01:33:25 +02:00
Sarah Hoffmann
f509526e5c Merge pull request #2681 from lonvia/improve-geocodejson
Fix 'type' field in the geocodejson response
2022-05-02 16:05:02 +02:00
Sarah Hoffmann
896199c9d4 Merge pull request #2687 from lonvia/check-for-wikipedia
Add check for presence of wikipedia importance
2022-05-02 16:04:32 +02:00
Sarah Hoffmann
08672cdf0a explicit cast for osm_type parameter in SQL needed
Otherwise PostgreSQL won't correctly pick up the index
condition.
2022-05-02 14:12:17 +02:00
Sarah Hoffmann
8163723e22 respect exclude_place_ids for housenumber search 2022-05-02 11:44:10 +02:00
Sarah Hoffmann
32a5f812a9 Merge pull request #2689 from lonvia/relations-in-associated-street
Accept any OSM type in street member of associatedStreet
2022-05-02 11:42:34 +02:00
Sarah Hoffmann
372874e89a accept any OSM type in street member of associatedStreet
This is needed for pedestrian areas mapped as multipolygons
and consequently as relations. The lookup in placex guarantees
that the referenced OSM object is indeed a street.

Fixes #2669.
2022-05-02 09:48:51 +02:00
Sarah Hoffmann
8ebb8ee304 Merge pull request #2686 from mtmail/ubuntu20-php-fpm-version
Install-on-Ubuntu-20.sh - correct php version
2022-04-29 14:16:44 +02:00
Sarah Hoffmann
3d58254462 skip wikipedia table test on reverse-only installations
Wikipedia importances are not imported on reverse-only imports.
2022-04-29 14:12:55 +02:00
Marc Tobias
95de411a81 Install-on-Ubuntu-20.sh - correct php-fpm version 2022-04-29 13:24:15 +02:00
Sarah Hoffmann
439d17569d Merge pull request #2685 from lonvia/show-inherited-housenumber
Keep address parts inherited from surrounding buildings after indexing
2022-04-29 12:15:33 +02:00
Sarah Hoffmann
8bcdba1a14 add check for wikipedia importance data
Adds a new check level WARNING because missing wikipedia importances
are not necessarily an error. If the database is run for reverse
requests only, then it is fine to go without them.
2022-04-29 12:14:53 +02:00
Sarah Hoffmann
37e5f07d83 Merge pull request #2684 from lonvia/translit-keep-spacing-marks
ICU: better letter identification in normalization
2022-04-29 10:38:28 +02:00
Sarah Hoffmann
3c68b12176 keep inherited address parts after indexing
The inherited housenumber is needed for display output. We can't
take the one from the housenumber field because it is already
normalized. Remove the inherited address only when reindexing.

Fixes #2683.
2022-04-28 21:38:00 +02:00
Sarah Hoffmann
63dc4b39bc ICU: better letter identification in normalization
The Letter class does not include non-spacing marks that can also
have a consonant or vowel meaning, especially in Indian languages.
Use the alnum property instead which includes them all. Also
include the vowel-canceling Virama, which is not a letter by itself
but changes the transliteration.
2022-04-28 18:23:17 +02:00
Sarah Hoffmann
0ea099bfd5 mention the breaking API change in the migration docs 2022-04-27 11:52:53 +02:00
Sarah Hoffmann
310776671b adapt docs to geocodejson changes 2022-04-27 11:50:12 +02:00
Sarah Hoffmann
4b84de400b geocodejson: add osm_key and osm_value fields
Return OSM main tag information in geocodejson. This is not part
of the official spec but can be useful to get more detailed information
of the object type. Brings the Nominatim output closer to what
Photon produces.
2022-04-27 10:58:25 +02:00
Sarah Hoffmann
8677da2a72 geocodejson: type should contain the general feature class
'type' so far contained the value of the OSM tag. That is rarely
helpful because it is not a restricted class of values. Change
this to contain the types as defined in the geocodejson spec,
which correspond to the address layer names.
2022-04-27 10:53:12 +02:00
Sarah Hoffmann
de828b723e Merge pull request #2678 from lonvia/address-part-order
Change selection of primary address part for ways that cross boundaries
2022-04-22 20:32:10 +02:00
Sarah Hoffmann
a515761193 further tweaking of address distance
For point features, keep using the distance to centroid.
For area features, add a tie breaker for the case where the
center point falls on the boundary.
2022-04-22 14:32:19 +02:00
Sarah Hoffmann
784dad866f change distance computation between place and address part
Instead of computing the distance to the centroid of the area
compute the distance of the area to the centroid of the feature.
This means we give preference to the area that covers the centroid.
It's still a heuristic, but one that is a bit less random.
2022-04-22 14:32:09 +02:00
Sarah Hoffmann
403e6f7e5c Merge pull request #2666 from lonvia/admin-command-for-forced-indexing
Admin command for forced indexing
2022-04-14 21:44:08 +02:00
Sarah Hoffmann
27f7c7fd88 add documentation for new refresh command 2022-04-14 15:10:24 +02:00
Sarah Hoffmann
4f59644cc2 add tests for new data invalidation functions 2022-04-14 14:52:13 +02:00
Sarah Hoffmann
c3f1d34b71 add new commands for forced invalidation before indexing 2022-04-14 11:05:43 +02:00
Sarah Hoffmann
f8f20899a3 recommend PostgreSQL 13+
See https://github.com/osm-search/Nominatim/discussions/2659.
2022-04-14 09:21:25 +02:00
Sarah Hoffmann
a319b0a0b4 docs: different default for format on osm.org
Add a note that the format parameter is needed for
nominatim.openstreetmap.org for historical reasons.
2022-04-08 17:13:42 +02:00
Sarah Hoffmann
604ddc0f9d Merge pull request #2660 from lonvia/pyosmium-contextmanager
Support using ReplicationServer as contextmanager
2022-04-08 17:07:33 +02:00
Sarah Hoffmann
126cabacb8 support new ReplicationServer as contextmanager 2022-04-07 17:58:04 +02:00
Sarah Hoffmann
f78ae969e9 Merge pull request #2466 from I70l0teN4ik/state-code
add ISO3166-2-lvl<admin_level> field to response address details
2022-04-07 16:39:50 +02:00
Artem Ziablytskyi
d1479072ae fix bdd tests and docs 2022-04-07 16:37:51 +02:00
Artem Ziablytskyi
9a56e53d50 use ISO3166-2-lvl<admin_level> instead of typeLabel prefix 2022-04-07 16:37:51 +02:00
Artem Ziablytskyi
7899654675 proper instruction to import data 2022-04-07 16:37:51 +02:00
Artem Ziablytskyi
a79c1bda9b Fix API docs and Vagrant instructions to import data 2022-04-07 16:37:51 +02:00
Artem Ziablytskyi
665fae8343 Fix API docs and Vagrant instructions to import data 2022-04-07 16:37:51 +02:00
Artem Ziablytskyi
6bee188f24 Change the key to <addresspart_type>-ISO3166-2 to support xml response correctly 2022-04-07 16:37:51 +02:00
Artem Ziablytskyi
82dbcbb12a add <addresspart_type>:ISO3166-2 field to response address details 2022-04-07 16:37:51 +02:00
Artem Ziablytskyi
76c146f326 add state_code field to response address details 2022-04-07 16:37:51 +02:00
Sarah Hoffmann
fd4ab3f262 Merge pull request #2629 from tareqpi/country-names-yaml-configuration
Move default country names into yaml configuration
2022-04-04 09:04:25 +02:00
Tareq Al-Ahdal
cfbd3652ef fix linting error 2022-04-02 00:14:18 +08:00
Tareq Al-Ahdal
e9c14979a4 remove the conversion to json for name 2022-04-01 22:54:14 +08:00
Tareq Al-Ahdal
e9f979b67b 'read_config' is no longer a fixture
add 'read_config' to test cases that need it
2022-04-01 22:52:17 +08:00
Tareq Al-Ahdal
a323b8f63a test for loading special characters from country_settings.yaml 2022-04-01 21:58:57 +08:00
Tareq Al-Ahdal
9411c14fd2 fix reset country info before loading custom data 2022-04-01 21:55:34 +08:00
Tareq Al-Ahdal
8525e7542f custom country config loads correctly 2022-04-01 21:46:56 +08:00
Sarah Hoffmann
7dabbc5462 Merge pull request #2655 from lonvia/migration-internal-country-name
Add migration for new country name handling in ICU tokenizer
2022-03-31 18:04:18 +02:00
Sarah Hoffmann
de18cd1523 add test for new table_has_column function 2022-03-31 15:55:20 +02:00
Sarah Hoffmann
36a1560117 add migration to mark internal country names 2022-03-31 15:55:20 +02:00
Tareq Al-Ahdal
b5f311d6bc separate unit test function into three functions 2022-03-30 22:06:59 +08:00
Sarah Hoffmann
83dd4362aa remove temporary file 2022-03-30 15:13:31 +02:00
Sarah Hoffmann
a71cab639b Merge pull request #2650 from mtmail/update-lookup-examples
documentation: update example output of lookup endpoint
2022-03-28 20:21:45 +02:00
Marc Tobias
5e0155ae29 documentation: update example output of lookup endpoint 2022-03-28 16:41:10 +02:00
Tareq Al-Ahdal
afef83b1c6 fix edge case handling when 'names' is not there 2022-03-25 22:25:55 +08:00
Tareq Al-Ahdal
9db13aac72 Added unit tests for loading country info from yaml file 2022-03-25 22:22:44 +08:00
Tareq Al-Ahdal
9a1f891998 fix linting error 2022-03-24 13:27:24 +08:00
Tareq Al-Ahdal
7bb7ed468a fix storing of escape sequences in database 2022-03-24 13:18:44 +08:00
Tareq Al-Ahdal
4fc61d260f clean up 2022-03-24 13:16:59 +08:00
Tareq Al-Ahdal
1ceb6926b7 merge of insert query + modularity enhancements 2022-03-24 13:13:38 +08:00
Sarah Hoffmann
d33c82cb66 Merge pull request #2641 from lonvia/reinit-tokenizer-dir
Transparently reinitialize tokenizer directory when necessary
2022-03-20 21:46:07 +01:00
Sarah Hoffmann
4c66c35ed6 reinit the tokenizer directory on website refresh
This means the project directory is usable again, once refresh --website
was run.
2022-03-20 17:49:22 +01:00
Sarah Hoffmann
54db1d8915 docs: copying project dir no longer necessary 2022-03-20 16:01:27 +01:00
Sarah Hoffmann
a0ed80d821 restore the tokenizer directory when missing
Automatically repopulate the tokenizer/ directory with the PHP stub
and the postgresql module, when the directory is missing. This allows
switching working directories and, in particular, running the service
from a different machine than where it was installed.
Users still need to make sure that .env files are set up correctly
or they will shoot themselves in the foot.

See #2515.
2022-03-20 11:31:42 +01:00
Sarah Hoffmann
e65913d376 cache loaded configuration
Reading the YAML files is fairly expensive and slows down the BDD tests
significantly. Therefore cache the results from reading the file.
2022-03-20 11:30:03 +01:00
Sarah Hoffmann
2f266d946b Merge pull request #2639 from lonvia/remove-operator
No longer use operator tag as a name
2022-03-18 16:42:18 +01:00
Tareq Al-Ahdal
b6ac4ad837 fix linting error 2022-03-18 21:05:47 +08:00
Sarah Hoffmann
42f0282f14 remove special case for operator names
The OSM data has been sufficiently cleaned up by now that
the operator no longer needs to be considered a name tag.
Use 'brand' as the searchable alternative.
2022-03-18 10:48:53 +01:00
Tareq Al-Ahdal
af739d2f57 modify logic of _include_key function 2022-03-18 06:52:16 +08:00
Tareq Al-Ahdal
fa2aca1cbc adding prefix to keys is now more configurable 2022-03-18 06:20:00 +08:00
Tareq Al-Ahdal
943e5fe699 Revert the removal of new line at the end of the file 2022-03-18 06:07:48 +08:00
Tareq Al-Ahdal
d09670d208 modify logic to prepend 'name:' to keys 2022-03-18 06:01:25 +08:00
Tareq Al-Ahdal
83b4b8d9c1 reattach 'name:' prefix to keys 2022-03-18 05:46:23 +08:00
Tareq Al-Ahdal
d32a7c1888 initialize an empty dictionary for nested name key 2022-03-18 02:50:33 +08:00
Tareq Al-Ahdal
d0c1b73fb3 remove duplicate values 2022-03-18 02:43:42 +08:00
Tareq Al-Ahdal
90ac15748e fix comment 2022-03-18 02:38:04 +08:00
Tareq Al-Ahdal
6be2077d92 Merge branch 'master' into country-names-yaml-configuration 2022-03-18 02:36:12 +08:00
Tareq Al-Ahdal
456d439e97 Reformatting of country keys 2022-03-18 02:23:11 +08:00
Sarah Hoffmann
2723553593 Merge pull request #2637 from lonvia/keep-linked-place-names
Introduce separation of names from linked places
2022-03-17 16:39:30 +01:00
Sarah Hoffmann
23de4c7aca adapt ParameterParser tests to new key list 2022-03-17 11:45:05 +01:00
Sarah Hoffmann
ce14964943 fix linting 2022-03-17 11:05:32 +01:00
Sarah Hoffmann
e133476c35 merge linked names correctly into namedetails
Convert the '_place_*' entries back to normal entries before
returning them in the 'namedetails' section. If the name field is
duplicated, keep the '_place_*' notation. This preserves the previous
behaviour before _place_ names were introduced but adds the additional
names from the linked place for reference.
2022-03-17 11:02:02 +01:00
Sarah Hoffmann
524dc64ab7 make sure outputs take into account linked place names 2022-03-16 21:44:52 +01:00
Sarah Hoffmann
17da5f45be fix return code for PHP exceptions
These have returned a 0 until now.
2022-03-16 21:44:02 +01:00
Sarah Hoffmann
42cd021d04 save differing linked place names in extra fields
This keeps the names traceable and ensures that all names are searchable
when they differ. Do not keep names when they are exactly the same
to save some space. Linked names are cleaned out before relinking.
2022-03-16 16:38:52 +01:00
Sarah Hoffmann
433d2f4c7d Merge pull request #2633 from lonvia/fix-reverse-single-interpolation-point
Correctly handle single-point interpolations in reverse
2022-03-16 14:22:59 +01:00
Sarah Hoffmann
be8f5778a1 use https protocol for cloning from github
Does not need authentication.
2022-03-16 12:05:58 +01:00
Sarah Hoffmann
ef98a85b05 correctly handle single-point interpolations in reverse
Lookup in location_property_osmline needs to be special cased
for startnumber = endnumber. Also adds tests for the case.

Fixes #2680.
2022-03-16 11:19:09 +01:00
Tareq Al-Ahdal
b4bd4ff67d fix linting error 2022-03-15 19:14:04 +08:00
Sarah Hoffmann
930a5cd12a Merge pull request #2632 from nslxndr/fix-log-typo
Fix typo in log message on replication initialisation
2022-03-15 11:01:57 +01:00
Sandor Nagy
7e3701b64a Fix typo in log message on replication initialisation 2022-03-15 07:50:47 +01:00
Tareq Al-Ahdal
165d17f7f7 reintroduce 'name:' prefix to country name keys 2022-03-13 18:58:27 +08:00
Tareq Al-Ahdal
3939cb614e Remove country.sql from CMakeLists.txt 2022-03-13 18:56:19 +08:00
Tareq Al-Ahdal
377cf36be3 modify data import logic to load country names from yaml 2022-03-12 15:20:57 +08:00
Tareq Al-Ahdal
8b6652a40b move default country names into yaml configuration 2022-03-12 15:17:01 +08:00
Sarah Hoffmann
479d726774 Merge pull request #2627 from mtmail/location-of-osm2pgsql
documentation: clarify osm2pgsql isn't in project directory by default
2022-03-10 15:39:10 +01:00
Marc Tobias
1fcc9717bb documentation: clarify osm2pgsql isn't in project directory by default 2022-03-10 14:16:12 +01:00
Sarah Hoffmann
c35b3ea5c7 Merge pull request #2621 from lonvia/housenumber-analyzer
Introduce optional token analysis for housenumbers
2022-03-01 15:19:07 +01:00
Sarah Hoffmann
15beeef6ce do not expand records in select list
An expression of the form 'SELECT (func()).*' will be expanded
by Postgresql _before_ execution with the result that the function
will be called as many times as there are fields in the record.
This is not what we want. The function call needs to go into
the FROM clause instead.
2022-03-01 09:34:32 +01:00
Sarah Hoffmann
92bc3cd0a7 fix linting issue 2022-03-01 09:34:32 +01:00
Sarah Hoffmann
0a9f971e44 add tests for new analyzed housenumbers 2022-03-01 09:34:32 +01:00
Sarah Hoffmann
4a3bbd0319 adapt housenumber cleanup to new word table structure 2022-03-01 09:34:32 +01:00
Sarah Hoffmann
89e1446131 bdd: disable some housenumber tests for legacy
Optional spaces in housenumbers are not supported by legacy tokenizer,
so disable those tests.
2022-03-01 09:34:32 +01:00
Sarah Hoffmann
b694a97edf add documentation for housenumber analyzer 2022-03-01 09:34:32 +01:00
Sarah Hoffmann
13ed184efd housenumber analyzer: avoid creating too many variants
Housenumber fields with lots of text are likely bad data. So is
data with many changes from letter to digit. Exclude them from adding
optional spaces.
2022-03-01 09:34:32 +01:00
Sarah Hoffmann
f03a05f6bb add new analyser for housenumbers
This analyser makes spaces optional.
2022-03-01 09:34:32 +01:00
Sarah Hoffmann
a6903651fc add framework for analysing housenumbers
This lays the groundwork for adding variants for housenumbers.
When analysis is enabled, then the 'word' field in the word table
is used as usual, so that variants can be created. There will be
only one analyser allowed which must have the fixed name
'@housenumber'.
2022-03-01 09:34:32 +01:00
Sarah Hoffmann
b8c544cc98 icu: move token deduplication into TokenInfo
Puts collection into one common place.
2022-03-01 09:34:32 +01:00
Sarah Hoffmann
243725aae1 icu: move housenumber token computation out of TokenInfo
This was the last function to use the cache. There is a cleaner
separation of responsibility now.
2022-03-01 09:34:32 +01:00
Sarah Hoffmann
0bb59b2e22 handle unknown analyzer
When changing something in the default configuration of the sanitizers
that refers to an analyzer that is not yet loaded, there shouldn't be
any errors.
2022-03-01 09:34:32 +01:00
Sarah Hoffmann
837d44391c move generation of normalized token form to analyzer
This gives the analyzer more flexibility in choosing the normalized
form. In particular, an analyzer creating different variants can choose
the variant that will be used as the canonical form.
2022-03-01 09:34:32 +01:00
Sarah Hoffmann
691ec08586 Merge pull request #2614 from lonvia/reorganise-country-names
Reorganise handling of country names imported from OSM
2022-02-25 09:46:20 +01:00
Sarah Hoffmann
5425394654 add migration to add new derived_names column 2022-02-24 20:50:33 +01:00
Sarah Hoffmann
1d82569f6d add tests for country updates 2022-02-24 16:18:49 +01:00
Sarah Hoffmann
f74228830d bdd: run full import on tests
This uncovered a couple of outdated/wrong tests which have been
fixed, too.
2022-02-24 14:27:51 +01:00
Sarah Hoffmann
a9e3329c39 country_name: use separate columns for names from OSM
This allows us to distinguish between base names and imported ones
and consequently to remove imported ones if necessary.
2022-02-23 09:23:06 +01:00
Sarah Hoffmann
a3e4e8e5cd delete unused country name tokens 2022-02-23 09:23:06 +01:00
Sarah Hoffmann
898febcec5 update supported versions 2022-02-23 09:22:17 +01:00
Sarah Hoffmann
855909b4e9 add 'healthcare' as main tag
Given that the tag is most of the time duplicated by an amenity
tag which is already imported, only import it as a fallback when
there is no name.

Fixes #2609.
2022-02-21 11:52:17 +01:00
Sarah Hoffmann
85d65a2fd2 create idx_place_interpolations for import already
It is needed to look up if a node is part of an interpolation.

Fixes #2608.
2022-02-18 11:11:22 +01:00
Sarah Hoffmann
cd9b0c9a20 Merge pull request #2603 from lonvia/one-step-housenumber-search
One step housenumber search
2022-02-10 17:27:56 +01:00
Sarah Hoffmann
0e11ca9b76 add test that interpolations are found by odd/even 2022-02-10 11:23:51 +01:00
Sarah Hoffmann
fd38dd02ce make sure step is taken into account for interpolations 2022-02-09 21:42:28 +01:00
Sarah Hoffmann
474418f03c include housenumber search in name query
The name query already looks for the existence of housenumbers and
may as well retrieve them. Saves up to three additional lookups.
It also means that we can lift the restriction on looking
for existence of housenumbers for simple queries only.
2022-02-08 22:35:12 +01:00
Sarah Hoffmann
6b9fea6f1a disable debug message in interpolation processing 2022-02-07 23:30:25 +01:00
Sarah Hoffmann
02894ca4a4 Merge pull request #2602 from lonvia/filter-bad-housenumbers
Handle mistagged housenumbers like names
2022-02-07 16:27:04 +01:00
Sarah Hoffmann
7d19209fa1 linting: disable too-many-ancestors
This is triggered by UserDict which is meant to be derived from.
2022-02-07 11:49:18 +01:00
Sarah Hoffmann
a6b4e8ff67 add tests for housenumber-as-name feature 2022-02-07 11:45:12 +01:00
Sarah Hoffmann
38c3ef3da0 add tests for get_string_list()
Renaming test file for sanitizer config because pytest requires
unique names for test files.
2022-02-07 11:22:24 +01:00
Sarah Hoffmann
610f2cc254 sanitizer: move helpers into a configuration class 2022-02-07 10:48:00 +01:00
Sarah Hoffmann
a79a3210e6 implement is-a-name option for housenumbers 2022-02-07 09:27:11 +01:00
Sarah Hoffmann
39ede26b5c Merge pull request #2598 from geofabrik/doc-update-systemd-timer
Document how to set up systemd timers for --once updates
2022-02-06 10:24:48 +01:00
Sarah Hoffmann
c3f206733f really remove CentOS from CI 2022-02-05 16:07:12 +01:00
Sarah Hoffmann
69481d1590 remove CentOS from CI
The CentOS docker image no longer works after CentOS8 went EOL.
See #2601 for discussion.
2022-02-05 15:14:47 +01:00
Sarah Hoffmann
6877668cab Merge pull request #2599 from StephanGeorg/patch-1
Fixed link
2022-02-03 09:45:59 +01:00
Stephan Georg
dc520bd156 Fixed link 2022-02-03 09:39:03 +01:00
Amanda McCann
bc4a343502 Document how to set up systemd timers for --once updates 2022-02-01 17:01:45 +01:00
Sarah Hoffmann
fbc8884693 restrict change propagation to interpolation lines
Also means that Postgresql will use the right index for the query.
2022-01-28 11:05:37 +01:00
Sarah Hoffmann
c50c534d19 Merge pull request #2597 from lonvia/reorganise-interpolations
Reorganise interpolation code
2022-01-28 08:40:08 +01:00
Sarah Hoffmann
45627b485f Merge pull request #2596 from lonvia/remove-codecov
Remove codecov
2022-01-27 17:11:17 +01:00
Sarah Hoffmann
b6fa121f53 remove tests for closest housenumber function 2022-01-27 16:21:45 +01:00
Sarah Hoffmann
9b31ffaa9f php unit tests don't work on ubuntu 18 2022-01-27 15:18:23 +01:00
Sarah Hoffmann
39e300640e remove codecov
Causes more trouble than it does good.
2022-01-27 15:17:33 +01:00
Sarah Hoffmann
2ffc1537e7 raise PostgreSQL requirement to 9.6
The new code uses the open-ended array notation which is only
available since PostgreSQL 9.6.
2022-01-27 15:15:56 +01:00
Sarah Hoffmann
64abc90d30 use new tiger step column for queries 2022-01-27 14:08:08 +01:00
Sarah Hoffmann
788505095e add step column to tiger data table
This replaces the interpolationtype column.
2022-01-27 11:54:12 +01:00
Sarah Hoffmann
98432395c3 add migration for upcoming change to tiger tables 2022-01-27 11:48:27 +01:00
Sarah Hoffmann
6b89624f33 adapt frontend to new interpolation table layout 2022-01-27 11:14:55 +01:00
Sarah Hoffmann
4b28b4fed4 adapt BDD tests for new interpolation style 2022-01-27 11:14:55 +01:00
Sarah Hoffmann
fea4dbba50 inherit tags from interpolation not parent
Nodes on an interpolation now only get the address tags of
interpolations and then compute their own parent from that. They no
longer inherit the parent directly.
2022-01-27 11:14:55 +01:00
Sarah Hoffmann
83d2c440d5 add migration for new interpolation table layout 2022-01-27 11:14:55 +01:00
Sarah Hoffmann
e6d855b954 add migration for new lookup index 2022-01-27 11:14:55 +01:00
Sarah Hoffmann
9f64c34f1a optimize indexes for interpolation lines
Do not index 'inactive' rows (with startnumber is null) where possible.
2022-01-27 11:14:55 +01:00
Sarah Hoffmann
638ed15ada improve handling of updates on nodes in interpolations
Use the same update mechanism as for updates on the interpolations
themselves. Updates must solely happen in place_insert as this is
the place where actual changes of the data happen.
2022-01-27 11:14:55 +01:00
Sarah Hoffmann
c0d8b95f67 update interpolations instead of deleting and recreating 2022-01-27 11:14:55 +01:00
Sarah Hoffmann
c65938d53c Merge pull request #2595 from nslxndr/fix-doc-typos
Fix typos in UI doc
2022-01-26 23:08:41 +01:00
Sandor Nagy
2e3f3a55f1 Fix typos in UI doc 2022-01-26 21:39:20 +01:00
Sarah Hoffmann
cdd0f78bc6 Merge pull request #2594 from lonvia/update-osm2pgsql
Update to osm2pgsql 1.6.0
2022-01-25 12:11:57 +01:00
Sarah Hoffmann
9fac20ceef update to osm2pgsql 1.6.0 2022-01-24 16:55:52 +01:00
Sarah Hoffmann
38bd08d25f Merge pull request #2591 from lonvia/cleanup-place-insert
Reorganise code of place_insert() trigger
2022-01-24 15:58:23 +01:00
Sarah Hoffmann
b44493e7f2 reorganise place_insert trigger
Code cleanup and formatting as well as minor improvements, in
particular removal of unnecessary code.
2022-01-24 09:12:50 +01:00
Sarah Hoffmann
f6ec8d2e33 Merge pull request #2589 from lonvia/clean-housenumbers
Add command for cleaning up word table
2022-01-21 10:17:58 +01:00
Sarah Hoffmann
c170d323d9 add tests for cleaning housenumbers 2022-01-20 23:47:20 +01:00
Sarah Hoffmann
3ce123ab69 do not clean housenumbers in reverse-only mode 2022-01-20 20:21:13 +01:00
Sarah Hoffmann
d8b7a51ab6 add actual removal of housenumber tokens 2022-01-20 20:18:15 +01:00
Sarah Hoffmann
344a2bfc1a add new command for cleaning word tokens
Just pulls outdated housenumbers for the moment.
2022-01-20 20:05:15 +01:00
Sarah Hoffmann
86588419fb Merge pull request #2588 from lonvia/housenumber-sanitizer
Move housenumber parsing into sanitizer
2022-01-20 17:44:24 +01:00
Sarah Hoffmann
d09db09849 adapt ICU tests to new housenumber sanitizer
Restrict tests to making sure that handing in multiple housenumbers
works.
2022-01-20 16:05:49 +01:00
Sarah Hoffmann
1e5a8561c0 fix linting issues 2022-01-20 16:00:23 +01:00
Sarah Hoffmann
f3c9578bca complete documentation for new clean-housenumbers sanitizer 2022-01-20 15:49:32 +01:00
Sarah Hoffmann
3741afa6dc generalize filter-kind parameter for sanitizers
Now behaves the same for tag_analyzer_by_language and
clean_housenumbers. Adds tests.
2022-01-20 15:42:42 +01:00
Sarah Hoffmann
560a006892 add pytest config
We are using custom marks now which need to be registered to avoid
warnings.
2022-01-20 15:38:02 +01:00
Sarah Hoffmann
4774e45218 clean_housenumbers: make kinds and delimiters configurable
Also adds unit tests for various options.
2022-01-20 12:07:12 +01:00
Sarah Hoffmann
206ee87188 factor out housenumber splitting into sanitizer 2022-01-19 17:27:50 +01:00
Sarah Hoffmann
a7e048484b Merge pull request #2585 from lonvia/name-mutations
Introduce character mutations to token analysis
2022-01-19 17:09:36 +01:00
Sarah Hoffmann
d6b5f2f5da docs: add pointer to caddy deployment discussion 2022-01-19 15:28:01 +01:00
Sarah Hoffmann
3df560ea38 fix linting error 2022-01-18 11:09:21 +01:00
Sarah Hoffmann
adbaf700cd move parsing of mutation config to setup phase 2022-01-18 11:09:21 +01:00
Sarah Hoffmann
4a41bff3ab add documentation for new mutation feature 2022-01-18 11:09:21 +01:00
Sarah Hoffmann
b453b0ea95 introduce mutation variants to generic token analyser
Mutations are regular-expression-based replacements that are applied
after variants have been computed. They are meant to be used for
variations on character level.

Add spelling variations for German umlauts.
2022-01-18 11:09:21 +01:00
Sarah Hoffmann
0192a7af96 move variant configuration reading in separate file 2022-01-18 11:09:21 +01:00
Sarah Hoffmann
630ad38a67 refactor variant production to use generators 2022-01-18 11:09:21 +01:00
Sarah Hoffmann
21156fc2a2 Merge pull request #2578 from lonvia/iso-3166-2
Make ISO3166-2 references searchable
2022-01-13 14:54:35 +01:00
Sarah Hoffmann
fa99f5bc03 Merge pull request #2579 from geofabrik/doc-update-typo
Fix typo in name of service. The rest of the docs call it nominatim-updateS
2022-01-13 14:01:57 +01:00
Amanda McCann
09aa1e7af4 Fix typo in name of service. The rest of the docs call it nominatim-updateS 2022-01-13 13:14:17 +01:00
Sarah Hoffmann
2034ed387b make ISO3166-2 references searchable 2022-01-13 09:44:42 +01:00
Sarah Hoffmann
d6140d6d54 Merge pull request #2571 from lonvia/ukrainian-apostrophe
Consider "modifier letter apostrophe" to be punctuation
2022-01-11 09:41:07 +01:00
Sarah Hoffmann
fb54bd3fcf consider "modifier letter apostrophe" to be punctuation
While technically being a letter, the apostrophe is often replaced
with a normal apostrophe in writing which is a punctuation mark.
This makes sure that the modifier letter apostrophe yields the same
normalization results and thus is really interchangeable.

Only has an effect after the next reimport.

Fixes #2569.
2022-01-10 17:40:03 +01:00
Sarah Hoffmann
a486ee347a Merge pull request #2570 from woodpeck/patch-3
Fix typos
2022-01-10 14:21:48 +01:00
Frederik Ramm
5fb3582b31 Fix typos 2022-01-10 13:38:53 +01:00
Sarah Hoffmann
8b0b9db31e Merge pull request #2565 from lonvia/swap-wordset-order
Swap order of query interpretation
2022-01-06 09:02:46 +01:00
Sarah Hoffmann
f9889f81d6 swap order of query interpretation
A forward interpretation of the form 'street, city, country' is
much more frequent than the reverse form 'country, city, street'.
Thus swap the order of interpretations so that the forward order comes
first.
2022-01-05 15:21:14 +01:00
Sarah Hoffmann
efafa52719 Merge pull request #2562 from lonvia/copyright-headers
Add consistent copyright headers
2022-01-04 23:10:37 +01:00
Sarah Hoffmann
c3788d765e add consistent SPDX copyright headers 2022-01-03 16:23:58 +01:00
Sarah Hoffmann
e407558f76 Merge pull request #2559 from lonvia/disable-jit-in-queries
Disable JIT and parallel workers on search frontend
2022-01-03 15:13:57 +01:00
Sarah Hoffmann
042df4198a disable JIT and parallel workers on search frontend
Bad query planning now also interferes with queries for search and
reverse.
2021-12-22 10:47:54 +01:00
Sarah Hoffmann
ab6f35d83a Merge pull request #2553 from lonvia/revert-street-matching-to-full-names
Revert street matching to full names
2021-12-14 15:52:34 +01:00
Sarah Hoffmann
f9b56a8581 correctly match abbreviated addr:street
This only works when addr:street is abbreviated and the street
name isn't. It does not work the other way around.
2021-12-08 21:58:43 +01:00
Sarah Hoffmann
fedc8ed474 Merge pull request #2542 from lonvia/update-phpunit
Update PHPUnit use to 9.5
2021-12-07 15:44:45 +01:00
Sarah Hoffmann
79aeb31088 restrict PHPUnit to 9.5 version
There are so many breaking changes with PHPUnit that it is
impossible to give any other guarantees.
2021-12-07 14:49:31 +01:00
Sarah Hoffmann
04857d32cd enable PHPUnit 9 for coverage
A couple of functions have been renamed.
2021-12-07 12:07:17 +01:00
Sarah Hoffmann
109cdce92c php unit: replace deprecated regex assert
The regEx assertion has been renamed in PHPUnit 9.5
and causes deprecation warnings.
2021-12-07 11:34:21 +01:00
Sarah Hoffmann
b7554d9ed8 php unit: don't enforce a name on the test database
Also gets rid of a PHPUnit deprecation warning.
2021-12-07 11:31:45 +01:00
Sarah Hoffmann
6106f1a32e php test: class must be called like the file 2021-12-07 11:20:38 +01:00
Sarah Hoffmann
f2a8307bb6 disable codecov
Not working.
2021-12-07 11:13:30 +01:00
Sarah Hoffmann
470ee7aef9 Merge pull request #2540 from lonvia/remove-support-for-centos7
Remove installation instructions for CentOS 7
2021-12-07 09:17:29 +01:00
Sarah Hoffmann
aefca48e78 remove installation instructions for CentOS 7
This ends official support for CentOS 7.
2021-12-06 16:05:27 +01:00
Sarah Hoffmann
5e792078b3 remove some odd variants of addr:street from the styles
Some import has added names in partial tags which confuse the
street name matching.
2021-12-06 15:17:00 +01:00
Sarah Hoffmann
7f7d2fd5b3 skip most addr: tags with suffixes
Only one addr: tag can be processed currently, so make
sure it is the one without suffixes to not get odd data.
addr:street is the exception because it uses a different
matching mechanism.
2021-12-06 14:55:10 +01:00
Sarah Hoffmann
5e435b41ba ICU: matching any street name will do again 2021-12-06 14:26:08 +01:00
Sarah Hoffmann
44cfce1ca4 revert to using full names for street name matching
Using partial names turned out to not work well because there are
often similarly named streets next to each other. It also
prevents us from being able to take into account all addr:street:*
tags.

This change gets all the full term tokens for the addr:street tags
from the DB. As they are used for matching only, we can assume that
the term must already be there or there will be no match. This
avoids creating unused full name tags.
2021-12-06 11:38:38 +01:00
Sarah Hoffmann
bb175cc958 Merge pull request #2539 from lonvia/clean-up-python-tests
Restructure and extend python unit tests
2021-12-03 17:08:25 +01:00
Sarah Hoffmann
5a9fb6eaf7 specify text type in test SQL
Older version of postgres fail otherwise.
2021-12-03 13:56:23 +01:00
Sarah Hoffmann
54d35ddfe9 split cli tests by subcommand and extend coverage 2021-12-02 23:45:48 +01:00
Sarah Hoffmann
7beccb7997 remove unnecessary pass statements 2021-12-02 15:54:24 +01:00
Sarah Hoffmann
14a78f55cd more unit tests for tokenizers 2021-12-02 15:46:36 +01:00
Sarah Hoffmann
7617a9316e extend API unit tests 2021-12-01 20:48:29 +01:00
Sarah Hoffmann
a52ed366e4 add tests for migration 2021-12-01 20:27:40 +01:00
Sarah Hoffmann
7be164e2a5 more testing for refresh functions 2021-12-01 14:58:54 +01:00
Sarah Hoffmann
a24f25c0d8 more tests for exec utilities 2021-12-01 14:23:51 +01:00
Sarah Hoffmann
993b238a41 add more tests for database import 2021-12-01 11:54:58 +01:00
Sarah Hoffmann
bbbfc8201c add tests for adding additional data
Also adds checks that parameters for osm2pgsql are set
as expected.
2021-12-01 11:22:46 +01:00
Sarah Hoffmann
6f03a4d6ce add tests for flatten_config_file and other than yaml formats 2021-12-01 10:24:11 +01:00
Sarah Hoffmann
c8958a22d2 tests: add fixture for making test project directory 2021-11-30 18:01:46 +01:00
Sarah Hoffmann
37afa2180b generalize fixtures for cli tests 2021-11-30 14:07:39 +01:00
Sarah Hoffmann
b2df8e478a python test: move single-use fixtures to subdirectories 2021-11-30 12:03:16 +01:00
Sarah Hoffmann
50fccb52be remove unused test files 2021-11-30 11:44:10 +01:00
Sarah Hoffmann
b90e719da5 organise python tests in subdirectories
The directories follow the same structure as the modules in
nominatim/.
2021-11-30 11:22:26 +01:00
Sarah Hoffmann
97f1723181 Merge pull request #2530 from lonvia/declassify-highway
Change default rank for highway objects to 30
2021-11-25 08:41:25 +01:00
Sarah Hoffmann
80e0a3cce4 change default rank for highway objects to 30
The highway key is being used more and more for non-ways these
days. This clashes with Nominatim's assumption that essentially
everything that has a highway tag can be used as the street part
of the address.

Change the default rank of highway objects to 30 to avoid this.
Only the known values for streets keep the rank 26 and are now
listed explicitly.
2021-11-24 22:10:40 +01:00
Sarah Hoffmann
79effae933 Merge pull request #2529 from lonvia/sort-street-results-by-tiger-housenumber
Take tiger housenumber into account when ranking street results
2021-11-24 16:23:41 +01:00
Sarah Hoffmann
810056349f add migration for inclusive housenumber Tiger index 2021-11-24 12:03:20 +01:00
Sarah Hoffmann
b1d490ea53 add index for Tiger housenumber queries 2021-11-24 11:10:20 +01:00
Sarah Hoffmann
345637290b take Tiger housenumbers into account when ranking street results
Queries with a housenumber need to rank streets higher that
have the requested housenumber attached. We already do that for
ordinary housenumber objects and for interpolations. This
adds support for Tiger housenumbers as well.

Fixes #2501.
2021-11-24 11:10:20 +01:00
609 changed files with 40894 additions and 7156 deletions

View File

@@ -5,34 +5,38 @@ inputs:
description: 'Version of Ubuntu to install on'
required: false
default: '20'
cmake-args:
description: 'Additional options to hand to cmake'
required: false
default: ''
runs:
using: "composite"
steps:
- name: Clean out the disk
run: |
sudo rm -rf /opt/hostedtoolcache/go /opt/hostedtoolcache/CodeQL /usr/lib/jvm /usr/local/share/chromium /usr/local/lib/android
df -h
shell: bash
- name: Install prerequisites
run: |
sudo apt-get install -y -qq libboost-system-dev libboost-filesystem-dev libexpat1-dev zlib1g-dev libbz2-dev libpq-dev libproj-dev libicu-dev
if [ "x$UBUNTUVER" == "x18" ]; then
pip3 install python-dotenv psycopg2==2.7.7 jinja2==2.8 psutil==5.4.2 pyicu osmium PyYAML==5.1 datrie
pip3 install python-dotenv psycopg2==2.7.7 jinja2==2.8 psutil==5.4.2 pyicu==2.9 osmium PyYAML==5.1 datrie
else
sudo apt-get install -y -qq python3-icu python3-datrie python3-pyosmium python3-jinja2 python3-psutil python3-psycopg2 python3-dotenv python3-yaml
fi
shell: bash
env:
UBUNTUVER: ${{ inputs.ubuntu }}
- name: Download dependencies
run: |
if [ ! -f country_grid.sql.gz ]; then
wget --no-verbose https://www.nominatim.org/data/country_grid.sql.gz
fi
cp country_grid.sql.gz Nominatim/data/country_osm_grid.sql.gz
shell: bash
CMAKE_ARGS: ${{ inputs.cmake-args }}
- name: Configure
run: mkdir build && cd build && cmake ../Nominatim
run: mkdir build && cd build && cmake $CMAKE_ARGS ../Nominatim
shell: bash
env:
CMAKE_ARGS: ${{ inputs.cmake-args }}
- name: Build
run: |

View File

@@ -22,7 +22,7 @@ runs:
- name: Install PostgreSQL
run: |
sudo apt-get install -y -qq --no-install-suggests --no-install-recommends postgresql-client-${PGVER} postgresql-${PGVER}-postgis-${POSTGISVER} postgresql-${PGVER}-postgis-${POSTGISVER}-scripts postgresql-contrib-${PGVER} postgresql-${PGVER} postgresql-server-dev-${PGVER}
sudo apt-get install -y -qq --no-install-suggests --no-install-recommends postgresql-client-${PGVER} postgresql-${PGVER}-postgis-${POSTGISVER} postgresql-${PGVER}-postgis-${POSTGISVER}-scripts postgresql-contrib-${PGVER} postgresql-${PGVER}
shell: bash
env:
PGVER: ${{ inputs.postgresql-version }}

View File

@@ -7,11 +7,11 @@ jobs:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v3
with:
submodules: true
- uses: actions/cache@v2
- uses: actions/cache@v3
with:
path: |
data/country_osm_grid.sql.gz
@@ -27,7 +27,7 @@ jobs:
mv nominatim-src.tar.bz2 Nominatim
- name: 'Upload Artifact'
uses: actions/upload-artifact@v2
uses: actions/upload-artifact@v3
with:
name: full-source
path: nominatim-src.tar.bz2
@@ -37,10 +37,10 @@ jobs:
needs: create-archive
strategy:
matrix:
ubuntu: [18, 20]
ubuntu: [18, 20, 22]
include:
- ubuntu: 18
postgresql: 9.5
postgresql: 9.6
postgis: 2.5
pytest: pytest
php: 7.2
@@ -49,11 +49,16 @@ jobs:
postgis: 3
pytest: py.test-3
php: 7.4
- ubuntu: 22
postgresql: 14
postgis: 3
pytest: py.test-3
php: 8.1
runs-on: ubuntu-${{ matrix.ubuntu }}.04
steps:
- uses: actions/download-artifact@v2
- uses: actions/download-artifact@v3
with:
name: full-source
@@ -64,10 +69,10 @@ jobs:
uses: shivammathur/setup-php@v2
with:
php-version: ${{ matrix.php }}
coverage: xdebug
tools: phpunit, phpcs, composer
ini-values: opcache.jit=disable
- uses: actions/setup-python@v2
- uses: actions/setup-python@v4
with:
python-version: 3.6
if: matrix.ubuntu == 18
@@ -82,12 +87,19 @@ jobs:
ubuntu: ${{ matrix.ubuntu }}
- name: Install test prerequsites
run: sudo apt-get install -y -qq pylint python3-pytest python3-behave python3-pytest-cov php-codecoverage
run: sudo apt-get install -y -qq python3-pytest python3-behave
if: matrix.ubuntu == 20
- name: Install test prerequsites
run: pip3 install pylint==2.6.0 pytest pytest-cov behave==1.2.6
if: matrix.ubuntu == 18
run: pip3 install pylint pytest behave==1.2.6
if: ${{ (matrix.ubuntu == 18) || (matrix.ubuntu == 22) }}
- name: Install test prerequsites
run: sudo apt-get install -y -qq python3-pytest
if: matrix.ubuntu == 22
- name: Install latest pylint/mypy
run: pip3 install -U pylint mypy types-PyYAML types-jinja2 types-psycopg2 types-psutil types-requests typing-extensions
- name: PHP linting
run: phpcs --report-width=120 .
@@ -97,60 +109,34 @@ jobs:
run: pylint nominatim
working-directory: Nominatim
- name: Python static typechecking
run: mypy --strict nominatim
working-directory: Nominatim
- name: PHP unit tests
run: phpunit --coverage-clover ../../coverage-php.xml ./
run: phpunit ./
working-directory: Nominatim/test/php
if: matrix.ubuntu == 20
if: ${{ (matrix.ubuntu == 20) || (matrix.ubuntu == 22) }}
- name: Python unit tests
run: $PYTEST --cov=nominatim --cov-report=xml test/python
run: $PYTEST test/python
working-directory: Nominatim
env:
PYTEST: ${{ matrix.pytest }}
- name: BDD tests
run: |
mkdir cov
behave -DREMOVE_TEMPLATE=1 -DBUILDDIR=$GITHUB_WORKSPACE/build --format=progress3 -DPHPCOV=./cov
composer require phpunit/phpcov:7.0.2
vendor/bin/phpcov merge --clover ../../coverage-bdd.xml ./cov
working-directory: Nominatim/test/bdd
if: matrix.ubuntu == 20
- name: BDD tests
run: |
behave -DREMOVE_TEMPLATE=1 -DBUILDDIR=$GITHUB_WORKSPACE/build --format=progress3
working-directory: Nominatim/test/bdd
if: matrix.ubuntu == 18
- name: Upload coverage to Codecov
uses: codecov/codecov-action@v1
with:
files: ./Nominatim/coverage*.xml
directory: ./
name: codecov-umbrella
fail_ci_if_error: false
path_to_write_report: ./coverage/codecov_report.txt
verbose: true
if: matrix.ubuntu == 20
icu-test:
legacy-test:
needs: create-archive
strategy:
matrix:
ubuntu: [20]
include:
- ubuntu: 20
postgresql: 13
postgis: 3
pytest: py.test-3
php: 7.4
runs-on: ubuntu-${{ matrix.ubuntu }}.04
runs-on: ubuntu-20.04
steps:
- uses: actions/download-artifact@v2
- uses: actions/download-artifact@v3
with:
name: full-source
@@ -160,35 +146,27 @@ jobs:
- name: Setup PHP
uses: shivammathur/setup-php@v2
with:
php-version: ${{ matrix.php }}
coverage: xdebug
tools: phpunit, phpcs, composer
- uses: actions/setup-python@v2
with:
python-version: 3.6
if: matrix.ubuntu == 18
php-version: 7.4
- uses: ./Nominatim/.github/actions/setup-postgresql
with:
postgresql-version: ${{ matrix.postgresql }}
postgis-version: ${{ matrix.postgis }}
postgresql-version: 13
postgis-version: 3
- name: Install Postgresql server dev
run: sudo apt-get install postgresql-server-dev-13
- uses: ./Nominatim/.github/actions/build-nominatim
with:
ubuntu: ${{ matrix.ubuntu }}
ubuntu: 20
cmake-args: -DBUILD_MODULE=on
- name: Install test prerequsites
run: sudo apt-get install -y -qq python3-behave
if: matrix.ubuntu == 20
- name: Install test prerequsites
run: pip3 install behave==1.2.6
if: matrix.ubuntu == 18
- name: BDD tests (icu tokenizer)
- name: BDD tests (legacy tokenizer)
run: |
behave -DREMOVE_TEMPLATE=1 -DBUILDDIR=$GITHUB_WORKSPACE/build -DTOKENIZER=icu --format=progress3
behave -DREMOVE_TEMPLATE=1 -DBUILDDIR=$GITHUB_WORKSPACE/build -DTOKENIZER=legacy --format=progress3
working-directory: Nominatim/test/bdd
@@ -198,7 +176,7 @@ jobs:
strategy:
matrix:
name: [Ubuntu-18, Ubuntu-20, Centos-8]
name: [Ubuntu-18, Ubuntu-20, Ubuntu-22]
include:
- name: Ubuntu-18
flavour: ubuntu
@@ -210,9 +188,11 @@ jobs:
image: "ubuntu:20.04"
ubuntu: 20
install_mode: install-apache
- name: Centos-8
flavour: centos
image: "centos:8"
- name: Ubuntu-22
flavour: ubuntu
image: "ubuntu:22.04"
ubuntu: 22
install_mode: install-apache
container:
image: ${{ matrix.image }}
@@ -251,7 +231,7 @@ jobs:
OS: ${{ matrix.name }}
INSTALL_MODE: ${{ matrix.install_mode }}
- uses: actions/download-artifact@v2
- uses: actions/download-artifact@v3
with:
name: full-source
path: /home/nominatim
@@ -281,6 +261,10 @@ jobs:
working-directory: /home/nominatim
if: matrix.flavour == 'centos'
- name: Print version
run: nominatim --version
working-directory: /home/nominatim/nominatim-project
- name: Import
run: nominatim import --osm-file ../test.pbf
working-directory: /home/nominatim/nominatim-project
@@ -309,12 +293,20 @@ jobs:
NOMINATIM_REPLICATION_MAX_DIFF=1 nominatim replication --once
working-directory: /home/nominatim/nominatim-project
- name: Clean up database
run: nominatim refresh --postcodes --word-tokens
working-directory: /home/nominatim/nominatim-project
- name: Run reverse-only import
run : |
echo 'NOMINATIM_DATABASE_DSN="pgsql:dbname=reverse"' >> .env
nominatim import --osm-file ../test.pbf --reverse-only --no-updates
working-directory: /home/nominatim/data-env-reverse
- name: Check reverse import
- name: Check reverse-only import
run: nominatim admin --check-database
working-directory: /home/nominatim/data-env-reverse
- name: Clean up database (reverse-only import)
run: nominatim refresh --postcodes --word-tokens
working-directory: /home/nominatim/nominatim-project

13
.mypy.ini Normal file
View File

@@ -0,0 +1,13 @@
[mypy]
[mypy-icu.*]
ignore_missing_imports = True
[mypy-osmium.*]
ignore_missing_imports = True
[mypy-datrie.*]
ignore_missing_imports = True
[mypy-dotenv.*]
ignore_missing_imports = True

View File

@@ -10,6 +10,9 @@ ignored-modules=icu,datrie
# closing added here because it sometimes triggers a false positive with
# 'with' statements.
ignored-classes=NominatimArgs,closing
disable=too-few-public-methods,duplicate-code
# 'too-many-ancestors' is triggered already by deriving from UserDict
# 'not-context-manager' disabled because it causes false positives once
# typed Python is enabled. See also https://github.com/PyCQA/pylint/issues/5273
disable=too-few-public-methods,duplicate-code,too-many-ancestors,bad-option-value,no-self-use,not-context-manager
good-names=i,x,y,fd,db
good-names=i,x,y,fd,db,cc

16
AUTHORS
View File

@@ -1,15 +1,15 @@
Nominatim was written by:
Brian Quinion
Sarah Hoffmann
Marc Tobias Metten
* Brian Quinion
* Sarah Hoffmann
* Marc Tobias Metten
markigail
gemo1011
IrlJidel
Frederik Ramm
* markigail
* AntoJvlt
* gemo1011
* darkshredder
and many more.
For a full list of contributors see
For a full list of contributors see the Git logs or visit
https://github.com/openstreetmap/Nominatim/graphs/contributors

View File

@@ -19,13 +19,24 @@ list(APPEND CMAKE_MODULE_PATH "${CMAKE_SOURCE_DIR}/cmake")
project(nominatim)
set(NOMINATIM_VERSION_MAJOR 4)
set(NOMINATIM_VERSION_MINOR 0)
set(NOMINATIM_VERSION_PATCH 0)
set(NOMINATIM_VERSION_MINOR 1)
set(NOMINATIM_VERSION_PATCH 2)
set(NOMINATIM_VERSION "${NOMINATIM_VERSION_MAJOR}.${NOMINATIM_VERSION_MINOR}.${NOMINATIM_VERSION_PATCH}")
add_definitions(-DNOMINATIM_VERSION="${NOMINATIM_VERSION}")
# Setting GIT_HASH
find_package(Git)
if (GIT_FOUND)
execute_process(
COMMAND "${GIT_EXECUTABLE}" log -1 --format=%h
WORKING_DIRECTORY ${CMAKE_CURRENT_LIST_DIR}
OUTPUT_VARIABLE GIT_HASH
OUTPUT_STRIP_TRAILING_WHITESPACE
ERROR_QUIET
)
endif()
#-----------------------------------------------------------------------------
# Configuration
@@ -33,7 +44,7 @@ add_definitions(-DNOMINATIM_VERSION="${NOMINATIM_VERSION}")
set(BUILD_IMPORTER on CACHE BOOL "Build everything for importing/updating the database")
set(BUILD_API on CACHE BOOL "Build everything for the API server")
set(BUILD_MODULE on CACHE BOOL "Build PostgreSQL module")
set(BUILD_MODULE off CACHE BOOL "Build PostgreSQL module for legacy tokenizer")
set(BUILD_TESTS on CACHE BOOL "Build test suite")
set(BUILD_DOCS on CACHE BOOL "Build documentation")
set(BUILD_MANPAGE on CACHE BOOL "Build Manual Page")
@@ -226,8 +237,7 @@ if (BUILD_IMPORTER)
PATTERN __pycache__ EXCLUDE)
install(DIRECTORY lib-sql DESTINATION ${NOMINATIM_LIBDIR})
install(FILES data/country_name.sql
${COUNTRY_GRID_FILE}
install(FILES ${COUNTRY_GRID_FILE}
data/words.sql
DESTINATION ${NOMINATIM_DATADIR})
endif()
@@ -266,6 +276,8 @@ install(FILES settings/env.defaults
install(DIRECTORY settings/icu-rules
DESTINATION ${NOMINATIM_CONFIGDIR})
install(DIRECTORY settings/country-names
DESTINATION ${NOMINATIM_CONFIGDIR})
if (INSTALL_MUNIN_PLUGINS)
install(FILES munin/nominatim_importlag

View File

@@ -36,7 +36,7 @@ Nominatim historically hasn't followed a particular coding style but we
are in process of consolidating the style. The following rules apply:
* Python code uses the official Python style
* indention
* indentation
* SQL use 2 spaces
* all other file types use 4 spaces
* [BSD style](https://en.wikipedia.org/wiki/Indent_style#Allman_style) for braces

View File

@@ -1,3 +1,70 @@
4.1.2
* fix XSS vulnerability in debug view
4.1.1
* fix crash on update when addr:interpolation receives an illegal value
* fix minimum number of retrieved results to be at least 10
* fix search for combinations of special term + name (e.g. Hotel Bellevue)
* do not return interpolations without a parent street on reverse search
* improve invalidation of linked places on updates
* fix address parsing for interpolation lines
* make sure socket timeouts are respected during replication
(working around a bug in some versions of pyosmium)
* update bundled osm2pgsql to 1.7.1
* typing fixes to work with latest type annotations from typeshed
* smaller improvements to documentation (thanks to @mausch)
4.1.0
* switch to ICU tokenizer as default
* add housenumber normalization and support optional spaces during search
* add postcode format checking and support optional spaces during search
* add function for cleaning housenumbers in word table
* add updates/deletion of country names imported from OSM
* linked places no longer overwrite names from a place permanently
* move default country name configuration into yaml file (thanks @tareqpi)
* more compact layout for interpolation and TIGER tables
* introduce mutations to ICU tokenizer (used for German umlauts)
* support reinitializing a full project directory with refresh --website
* fix various issues with linked places on updates
* add support for external sanitizers and token analyzers
* add CLI commands for forced indexing
* add CLI command for version report
* add offline import mode
* change geocodejson to return a feature class in the 'type' field
* add ISO3166-2 to address output (thanks @I70l0teN4ik)
* improve parsing and matching of addr: tags
* support relations as street members of associatedStreet
* better ranking for address results from TIGER data
* adapt rank classification to changed tag usage in OSM
* update bundled osm2pgsql to 1.6.0
* add typing information to Python code
* improve unit test coverage
* reorganise and speed up code for BDD tests, drop support for scenes
* move PHP unit tests to PHPUnit 9.5
* extensive typo fixes in documentation (thanks @woodpeck,@StephanGeorg,
@amandasaurus, @nslxndr, @stefkiourk, @Luflosi, @kianmeng)
* drop official support for installation on CentOS
* add installation instructions for Ubuntu 22.04
* add support for PHP8
* add setup instructions for updates and systemd
* drop support for PostgreSQL 9.5
4.0.2
* fix XSS vulnerability in debug view
4.0.1
* fix initialisation error in replication script
* ICU tokenizer: avoid any special characters in word tokens
* better error message when API php script does not exist
* fix quoting of house numbers in SQL queries
* small fixes and improvements in search query parsing
* add documentation for moving the database to a different machine
4.0.0
* refactor name token computation and introduce ICU tokenizer
@@ -27,6 +94,10 @@
* add testing of installation scripts via CI
* drop support for Python < 3.6 and Postgresql < 9.5
3.7.3
* fix XSS vulnerability in debug view
3.7.2
* fix database check for reverse-only imports

View File

@@ -9,10 +9,10 @@ versions.
| Version | End of support for security updates |
| ------- | ----------------------------------- |
| 4.0.x | 2023-11-02 |
| 3.7.x | 2023-04-05 |
| 3.6.x | 2022-12-12 |
| 3.5.x | 2022-06-05 |
| 3.4.x | 2021-10-24 |
## Reporting a Vulnerability

View File

@@ -42,9 +42,9 @@ is.
```
# inside the virtual machine:
cd build
wget --no-verbose --output-document=/tmp/monaco.osm.pbf http://download.geofabrik.de/europe/monaco-latest.osm.pbf
./utils/setup.php --osm-file /tmp/monaco.osm.pbf --osm2pgsql-cache 1000 --all 2>&1 | tee monaco.$$.log
cd nominatim-project
wget --no-verbose --output-document=monaco.osm.pbf http://download.geofabrik.de/europe/monaco-latest.osm.pbf
nominatim import --osm-file monaco.osm.pbf 2>&1 | tee monaco.$$.log
```
To repeat an import you'd need to delete the database first
@@ -56,7 +56,7 @@ is.
## Development
Vagrant maps the virtual machine's port 8089 to your host machine. Thus you can
see Nominatim in action on [locahost:8089](http://localhost:8089/nominatim/).
see Nominatim in action on [localhost:8089](http://localhost:8089/nominatim/).
You edit code on your host machine in any editor you like. There is no need to
restart any software: just refresh your browser window.

View File

@@ -7,6 +7,9 @@ sys.path.insert(1, '@NOMINATIM_LIBDIR@/lib-python')
os.environ['NOMINATIM_NOMINATIM_TOOL'] = os.path.abspath(__file__)
from nominatim import cli
from nominatim import version
version.GIT_COMMIT_HASH = '@GIT_HASH@'
exit(cli.nominatim(module_dir='@NOMINATIM_LIBDIR@/module',
osm2pgsql_path='@NOMINATIM_LIBDIR@/osm2pgsql',

View File

@@ -7,6 +7,9 @@ sys.path.insert(1, '@CMAKE_SOURCE_DIR@')
os.environ['NOMINATIM_NOMINATIM_TOOL'] = os.path.abspath(__file__)
from nominatim import cli
from nominatim import version
version.GIT_COMMIT_HASH = '@GIT_HASH@'
exit(cli.nominatim(module_dir='@CMAKE_BINARY_DIR@/module',
osm2pgsql_path='@CMAKE_BINARY_DIR@/osm2pgsql/osm2pgsql',

View File

@@ -1,14 +0,0 @@
codecov:
require_ci_to_pass: yes
coverage:
status:
project: off
patch: off
comment:
require_changes: true
after_n_builds: 2
fixes:
- "Nominatim/::"

File diff suppressed because one or more lines are too long

View File

@@ -23,10 +23,9 @@ foreach (src ${DOC_SOURCES})
endforeach()
ADD_CUSTOM_TARGET(doc
COMMAND ${CMAKE_CURRENT_SOURCE_DIR}/bash2md.sh ${PROJECT_SOURCE_DIR}/vagrant/Install-on-Centos-7.sh ${CMAKE_CURRENT_BINARY_DIR}/appendix/Install-on-Centos-7.md
COMMAND ${CMAKE_CURRENT_SOURCE_DIR}/bash2md.sh ${PROJECT_SOURCE_DIR}/vagrant/Install-on-Centos-8.sh ${CMAKE_CURRENT_BINARY_DIR}/appendix/Install-on-Centos-8.md
COMMAND ${CMAKE_CURRENT_SOURCE_DIR}/bash2md.sh ${PROJECT_SOURCE_DIR}/vagrant/Install-on-Ubuntu-18.sh ${CMAKE_CURRENT_BINARY_DIR}/appendix/Install-on-Ubuntu-18.md
COMMAND ${CMAKE_CURRENT_SOURCE_DIR}/bash2md.sh ${PROJECT_SOURCE_DIR}/vagrant/Install-on-Ubuntu-20.sh ${CMAKE_CURRENT_BINARY_DIR}/appendix/Install-on-Ubuntu-20.md
COMMAND ${CMAKE_CURRENT_SOURCE_DIR}/bash2md.sh ${PROJECT_SOURCE_DIR}/vagrant/Install-on-Ubuntu-22.sh ${CMAKE_CURRENT_BINARY_DIR}/appendix/Install-on-Ubuntu-22.md
COMMAND PYTHONPATH=${PROJECT_SOURCE_DIR} mkdocs build -d ${CMAKE_CURRENT_BINARY_DIR}/../site-html -f ${CMAKE_CURRENT_BINARY_DIR}/../mkdocs.yml
)

View File

@@ -111,7 +111,7 @@ library.
!!! note
The external module is only needed when using the legacy tokenizer.
If you have choosen the ICU tokenizer, then you can ignore this section
If you have chosen the ICU tokenizer, then you can ignore this section
and follow the standard import documentation.
### Option 1: Compiling the library on the database server
@@ -198,11 +198,10 @@ target machine.
of a full database.
Next install Nominatim on the target machine by following the standard installation
instructions. Again make sure to use the same version as the source machine.
instructions. Again, make sure to use the same version as the source machine.
You can now copy the project directory from the source machine to the new machine.
If necessary, edit the `.env` file to point it to the restored database.
Finally run
Create a project directory on your destination machine and set up the `.env`
file to match the configuration on the source machine. Finally run
nominatim refresh --website
@@ -210,6 +209,8 @@ to make sure that the local installation of Nominatim will be used.
If you are using the legacy tokenizer you might also have to switch to the
PostgreSQL module that was compiled on your target machine. If you get errors
that PostgreSQL cannot find or access `nominatim.so` then copy the installed
version into the `module` directory of your project directory. The installed
copy can usually be found under `/usr/local/lib/nominatim/module/nominatim.so`.
that PostgreSQL cannot find or access `nominatim.so` then rerun
nominatim refresh --functions
on the target machine to update the location of the module.

View File

@@ -82,7 +82,7 @@ The website should now be available on `http://localhost/nominatim`.
### Installing the required packages
Nginx has no built-in PHP interpreter. You need to use php-fpm as a deamon for
Nginx has no built-in PHP interpreter. You need to use php-fpm as a daemon for
serving PHP cgi.
On Ubuntu/Debian install nginx and php-fpm with:
@@ -99,7 +99,7 @@ Unix socket instead, change the pool configuration
``` ini
; Replace the tcp listener and add the unix socket
listen = /var/run/php-fpm.sock
listen = /var/run/php-fpm-nominatim.sock
; Ensure that the daemon runs as the correct user
listen.owner = www-data
@@ -121,7 +121,7 @@ location @php {
fastcgi_param SCRIPT_FILENAME "$document_root$uri.php";
fastcgi_param PATH_TRANSLATED "$document_root$uri.php";
fastcgi_param QUERY_STRING $args;
fastcgi_pass unix:/var/run/php-fpm.sock;
fastcgi_pass unix:/var/run/php-fpm-nominatim.sock;
fastcgi_index index.php;
include fastcgi_params;
}
@@ -131,7 +131,7 @@ location ~ [^/]\.php(/|$) {
if (!-f $document_root$fastcgi_script_name) {
return 404;
}
fastcgi_pass unix:/var/run/php-fpm.sock;
fastcgi_pass unix:/var/run/php-fpm-nominatim.sock;
fastcgi_index search.php;
include fastcgi.conf;
}
@@ -140,3 +140,9 @@ location ~ [^/]\.php(/|$) {
Restart the nginx and php-fpm services and the website should now be available
at `http://localhost/`.
## Nominatim with other webservers
Users have created instructions for other webservers:
* [Caddy](https://github.com/osm-search/Nominatim/discussions/2580)

View File

@@ -79,7 +79,7 @@ When running the import you may get a version mismatch:
pg_config seems to use bad includes sometimes when multiple versions
of PostgreSQL are available in the system. Make sure you remove the
server development libraries (`postgresql-server-dev-9.5` on Ubuntu)
server development libraries (`postgresql-server-dev-13` on Ubuntu)
and recompile (`cmake .. && make`).
@@ -106,11 +106,6 @@ If you are using a flatnode file, then it may also be that the underlying
filesystem does not fully support 'mmap'. A notable candidate is virtualbox's
vboxfs.
### I see the error: "clang: Command not found" on CentOS
On CentOS 7 users reported `/opt/rh/llvm-toolset-7/root/usr/bin/clang: Command not found`.
Double-check clang is installed. Instead of `make` try running `make CLANG=true`.
### nominatim UPDATE failed: ERROR: buffer 179261 is not owned by resource owner Portal
Several users [reported this](https://github.com/openstreetmap/Nominatim/issues/1168)
@@ -126,22 +121,6 @@ The server cannot access your database. Add `&debug=1` to your URL
to get the full error message.
### On CentOS the website shows "Could not connect to server"
`could not connect to server: No such file or directory`
On CentOS v7 the PostgreSQL server is started with `systemd`. Check if
`/usr/lib/systemd/system/httpd.service` contains a line `PrivateTmp=true`. If
so then Apache cannot see the `/tmp/.s.PGSQL.5432` file. It's a good security
feature, so use the
[preferred solution](../appendix/Install-on-Centos-7.md#adding-selinux-security-settings).
However, you can solve this the quick and dirty way by commenting out that line and then run
sudo systemctl daemon-reload
sudo systemctl restart httpd
### Website reports "DB Error: insufficient permissions"
The user the webserver, e.g. Apache, runs under needs to have access to the
@@ -181,9 +160,6 @@ by everybody, e.g.
Try `chmod a+r nominatim.so; chmod a+x nominatim.so`.
When running SELinux, make sure that the
[context is set up correctly](../appendix/Install-on-Centos-7.md#adding-selinux-security-settings).
When you recently updated your operating system, updated PostgreSQL to
a new version or moved files (e.g. the build directory) you should
recreate `nominatim.so`. Try

View File

@@ -150,7 +150,7 @@ database or reuse the space later.
If you only want to use the Nominatim database for reverse lookups or
if you plan to use the installation only for exports to a
[photon](https://photon.komoot.de/) database, then you can set up a database
[photon](https://photon.komoot.io/) database, then you can set up a database
without search indexes. Add `--reverse-only` to your setup command above.
This saves about 5% of disk space.

View File

@@ -4,10 +4,9 @@ This page contains generic installation instructions for Nominatim and its
prerequisites. There are also step-by-step instructions available for
the following operating systems:
* [Ubuntu 22.04](../appendix/Install-on-Ubuntu-22.md)
* [Ubuntu 20.04](../appendix/Install-on-Ubuntu-20.md)
* [Ubuntu 18.04](../appendix/Install-on-Ubuntu-18.md)
* [CentOS 8](../appendix/Install-on-Centos-8.md)
* [CentOS 7.2](../appendix/Install-on-Centos-7.md)
These OS-specific instructions can also be found in executable form
in the `vagrant/` directory.
@@ -25,8 +24,9 @@ and can't offer support.
### Software
!!! Warning
For larger installations you **must have** PostgreSQL 11+ and Postgis 3+
For larger installations you **must have** PostgreSQL 11+ and PostGIS 3+
otherwise import and queries will be slow to the point of being unusable.
Query performance has marked improvements with PostgreSQL 13+ and PostGIS 3.2+.
For compiling:
@@ -42,7 +42,7 @@ For compiling:
For running Nominatim:
* [PostgreSQL](https://www.postgresql.org) (9.5+ will work, 11+ strongly recommended)
* [PostgreSQL](https://www.postgresql.org) (9.6+ will work, 11+ strongly recommended)
* [PostGIS](https://postgis.net) (2.2+ will work, 3.0+ strongly recommended)
* [Python 3](https://www.python.org/) (3.6+)
* [Psycopg2](https://www.psycopg.org) (2.7+)
@@ -67,10 +67,10 @@ the [Development section](../develop/Development-Environment.md).
### Hardware
A minimum of 2GB of RAM is required or installation will fail. For a full
planet import 64GB of RAM or more are strongly recommended. Do not report
planet import 128GB of RAM or more are strongly recommended. Do not report
out of memory problems if you have less than 64GB RAM.
For a full planet install you will need at least 900GB of hard disk space.
For a full planet install you will need at least 1TB of hard disk space.
Take into account that the OSM database is growing fast.
Fast disks are essential. Using NVME disks is recommended.
@@ -89,8 +89,7 @@ your `postgresql.conf` file.
work_mem = (50MB)
effective_cache_size = (24GB)
synchronous_commit = off
checkpoint_segments = 100 # only for postgresql <= 9.4
max_wal_size = 1GB # postgresql > 9.4
max_wal_size = 1GB
checkpoint_timeout = 10min
checkpoint_completion_target = 0.9
@@ -113,7 +112,7 @@ For the initial import, you should also set:
fsync = off
full_page_writes = off
Don't forget to reenable them after the initial import or you risk database
Don't forget to re-enable them after the initial import or you risk database
corruption.
@@ -130,7 +129,7 @@ If you want to install latest development version from github, make sure to
also check out the osm2pgsql subproject:
```
git clone --recursive git://github.com/openstreetmap/Nominatim.git
git clone --recursive https://github.com/openstreetmap/Nominatim.git
```
The development version does not include the country grid. Download it separately:
@@ -158,6 +157,17 @@ make
sudo make install
```
!!! warning
The default installation no longer compiles the PostgreSQL module that
is needed for the legacy tokenizer from older Nominatim versions. If you
are upgrading an older database or want to run the
[legacy tokenizer](../customize/Tokenizers.md#legacy-tokenizer) for
some other reason, you need to enable the PostgreSQL module via
cmake: `cmake -DBUILD_MODULE=on ../Nominatim`. To compile the module
you need to have the server development headers for PostgreSQL installed.
On Ubuntu/Debian run: `sudo apt install postgresql-server-dev-<postgresql version>`
Nominatim installs itself into `/usr/local` per default. To choose a different
installation directory add `-DCMAKE_INSTALL_PREFIX=<install root>` to the
cmake command. Make sure that the `bin` directory is available in your path

View File

@@ -34,6 +34,30 @@ to rerun the statistics computation when adding larger amounts of new data,
for example, when adding an additional country via `nominatim add-data`.
## Forcing recomputation of places and areas
Command: `nominatim refresh --data-object [NWR]<id> --data-area [NWR]<id>`
When running replication updates, Nominatim tries to recompute the search
and address information for all places that are affected by a change. But it
needs to restrict the total number of changes to make sure it can keep up
with the minutely updates. Therefore it will refrain from propagating changes
that affect a lot of objects.
The administrator may force an update of places in the database.
`nominatim refresh --data-object` invalidates a single OSM object.
`nominatim refresh --data-area` invalidates an OSM object and all dependent
objects. These are usually the places inside its area or around the
center of the object. Both commands expect the OSM object as an argument
of the form OSM type + OSM id. The type must be `N` (node), `W` (way) or
`R` (relation).
After invalidating the object, indexing must be run again. If continuous
updates are running in the background, the objects will be recomputed together
with the next round of updates. Otherwise you need to run `nominatim index`
to finish the recomputation.
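The argument format described above can be validated before the command is run. The following is a minimal sketch; `refresh_args` is a hypothetical helper and not part of Nominatim, only the `--data-object` and `--data-area` flags are taken from the text.

```python
import re

# OSM type letter (N, W or R) followed by a numeric id, as described above.
OSM_REF = re.compile(r"[NWR][0-9]+")


def refresh_args(osm_ref, with_dependents=False):
    """Build the argument list for invalidating one OSM object.

    Hypothetical helper for illustration only.
    """
    if not OSM_REF.fullmatch(osm_ref):
        raise ValueError(f"expected N, W or R followed by an id, got {osm_ref!r}")
    # --data-area also invalidates dependent objects, --data-object only one.
    flag = "--data-area" if with_dependents else "--data-object"
    return ["nominatim", "refresh", flag, osm_ref]


print(refresh_args("W50637691"))
print(refresh_args("R146656", with_dependents=True))
```

The resulting list could be passed to `subprocess.run` from within the project directory. Remember that indexing still has to run afterwards.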
## Removing large deleted objects
Nominatim refuses to delete very large areas because often these deletions are


@@ -15,6 +15,23 @@ breaking changes. **Please read them before running the migration.**
If you are migrating from a version <3.6, then you still have to follow
the manual migration steps up to 3.6.
## 4.0.0 -> 4.1.0
### ICU tokenizer is the new default
Nominatim now installs the [ICU tokenizer](../customize/Tokenizers.md#icu-tokenizer)
by default. This only has an effect on newly installed databases. Older
databases keep their installed tokenizer when they are updated. If you still
run with the legacy tokenizer, make sure to compile Nominatim with the
PostgreSQL module, see [Installation](Installation.md#building-nominatim).
### geocodejson output changed
The `type` field of the geocodejson output has changed. It now contains
the address class of the object instead of the value of the OSM tag. If
your client has used the `type` field, switch it to read `osm_value`
instead.
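A client that has to work against servers before and after this change could read the tag value as sketched below. This is only an illustration: the helper name `osm_tag_value` is hypothetical, the sample features are made up, and the nesting under `properties.geocoding` is assumed from the GeocodeJSON layout.

```python
def osm_tag_value(feature):
    """Return the OSM tag value of a geocodejson feature.

    Hypothetical helper; assumes the properties live under
    feature["properties"]["geocoding"].
    """
    props = feature["properties"]["geocoding"]
    # From 4.1.0 on, `type` holds the address level and the tag value
    # is available as `osm_value`.
    if "osm_value" in props:
        return props["osm_value"]
    # Older servers: `type` still contains the value of the OSM tag.
    return props.get("type")


# Illustrative sample data, not real server output.
new_style = {"properties": {"geocoding": {
    "type": "house", "osm_key": "amenity", "osm_value": "restaurant"}}}
old_style = {"properties": {"geocoding": {"type": "restaurant"}}}

print(osm_tag_value(new_style))
print(osm_tag_value(old_style))
```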
## 3.7.0 -> 4.0.0
### NOMINATIM_PHRASE_CONFIG removed


@@ -16,12 +16,12 @@ and run it. Grab the latest release from
[nominatim-ui's Github release page](https://github.com/osm-search/nominatim-ui/releases)
and unpack it. You can use `nominatim-ui-x.x.x.tar.gz` or `nominatim-ui-x.x.x.zip`.
Next you need to adapt the UI yo your installation. Custom settings need to be
Next you need to adapt the UI to your installation. Custom settings need to be
put into `dist/theme/config.theme.js`. At a minimum you need to
set `Nominatim_API_Endpoint` to point to your Nominatim installation:
cd nominatim-ui
echo "Nominatim_Config.Nominatim_API_Endpoint='https:\\myserver.org\nominatim';" > dist/theme/config.theme.js
echo "Nominatim_Config.Nominatim_API_Endpoint='https://myserver.org/nominatim/';" > dist/theme/config.theme.js
For the full set of available settings, have a look at `dist/config.defaults.js`.
@@ -161,24 +161,16 @@ directory like this:
# If no endpoint is given, then use search.
RewriteRule ^(/|$) "search.php"
# If format-html is explicity requested, forward to the UI.
# If format-html is explicitly requested, forward to the UI.
RewriteCond %{QUERY_STRING} "format=html"
RewriteRule ^([^/]+).php ui/$1.html [R,END]
# Same but .php suffix is missing.
RewriteCond %{QUERY_STRING} "format=html"
RewriteRule ^([^/]+) ui/$1.html [R,END]
RewriteRule ^([^/]+)(.php)? ui/$1.html [R,END]
# If no format parameter is there then forward anything
# but /reverse and /lookup to the UI.
RewriteCond %{QUERY_STRING} "!format="
RewriteCond %{REQUEST_URI} "!/lookup"
RewriteCond %{REQUEST_URI} "!/reverse"
RewriteRule ^([^/]+).php ui/$1.html [R,END]
# Same but .php suffix is missing.
RewriteCond %{QUERY_STRING} "!format="
RewriteCond %{REQUEST_URI} "!/lookup"
RewriteCond %{REQUEST_URI} "!/reverse"
RewriteRule ^([^/]+) ui/$1.html [R,END]
RewriteRule ^([^/]+)(.php)? ui/$1.html [R,END]
</Directory>
```


@@ -70,7 +70,7 @@ The update application keeps running forever and retrieves and applies
new updates from the server as they are published.
You can run this command as a simple systemd service. Create a service
description like that in `/etc/systemd/system/nominatim-update.service`:
description like that in `/etc/systemd/system/nominatim-updates.service`:
```
[Unit]
@@ -122,14 +122,71 @@ cd /srv/nominatim
while true; do
nominatim replication --once
if [ -f "/srv/nominatim/schedule-mainenance" ]; then
rm /srv/nominatim/schedule-mainenance
if [ -f "/srv/nominatim/schedule-maintenance" ]; then
rm /srv/nominatim/schedule-maintenance
nominatim refresh --postcodes
fi
done
```
A cron job then creates the file `/srv/nominatim/need-mainenance` once per night.
A cron job then creates the file `/srv/nominatim/schedule-maintenance` once per night.
##### One-time mode with systemd
You can run the one-time mode with a systemd timer & service.
Create a timer description like `/etc/systemd/system/nominatim-updates.timer`:
```
[Unit]
Description=Timer to start updates of Nominatim
[Timer]
OnActiveSec=2
OnUnitActiveSec=1min
Unit=nominatim-updates.service
[Install]
WantedBy=multi-user.target
```
And then a similar service definition: `/etc/systemd/system/nominatim-updates.service`:
```
[Unit]
Description=Single updates of Nominatim
[Service]
WorkingDirectory=/srv/nominatim
ExecStart=nominatim replication --once
StandardOutput=append:/var/log/nominatim-updates.log
StandardError=append:/var/log/nominatim-updates.error.log
User=nominatim
Group=nominatim
Type=simple
[Install]
WantedBy=multi-user.target
```
Replace the `WorkingDirectory` with your project directory. Also adapt user and
group names as required. `OnUnitActiveSec` defines how often the individual
update command is run.
Now activate the service and start the updates:
```
sudo systemctl daemon-reload
sudo systemctl enable nominatim-updates.timer
sudo systemctl start nominatim-updates.timer
```
You can stop future data updates, while allowing any current, in-progress
update steps to finish, by running `sudo systemctl stop
nominatim-updates.timer` and waiting until `nominatim-updates.service` isn't
running (`sudo systemctl is-active nominatim-updates.service`). Current output
from the update can be seen like above (`systemctl status
nominatim-updates.service`).
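The stop-and-wait procedure can also be scripted. The sketch below is one possible way to do it, assuming the unit names from the examples above; the function `stop_updates` and its parameters are hypothetical. It relies on `systemctl is-active` exiting with status 0 only while the unit is active.

```python
import subprocess
import time

TIMER = "nominatim-updates.timer"
SERVICE = "nominatim-updates.service"


def stop_updates(run=subprocess.run, poll_seconds=5.0, sleep=time.sleep):
    """Stop the timer, then wait until a running update has finished.

    `run` and `sleep` are injectable so the function can be tried out
    without touching systemd.
    """
    # No new update runs are scheduled once the timer is stopped.
    run(["sudo", "systemctl", "stop", TIMER], check=True)
    # An update that is already in progress keeps the service active.
    while run(["sudo", "systemctl", "is-active", "--quiet", SERVICE]).returncode == 0:
        sleep(poll_seconds)
```

Calling `stop_updates()` returns once the timer is stopped and no update is running anymore.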
#### Catch-up mode
@@ -158,7 +215,7 @@ replication catch-up at whatever interval you desire.
a replication source with an update frequency that is an order of magnitude
lower. For example, if you want to update once a day, use an hourly updated
source. This makes sure that you don't miss an entire day of updates when
the source is unexpectely late to publish its update.
the source is unexpectedly late to publish its update.
If you want to use the source with the same update frequency (e.g. a daily
updated source with daily updates), use the


@@ -90,11 +90,11 @@ This overrides the specified machine readable format. (Default: 0)
##### XML
[https://nominatim.openstreetmap.org/lookup?osm_ids=R146656,W104393803,N240109189](https://nominatim.openstreetmap.org/lookup?osm_ids=R146656,W104393803,N240109189)
[https://nominatim.openstreetmap.org/lookup?osm_ids=R146656,W50637691,N240109189](https://nominatim.openstreetmap.org/lookup?osm_ids=R146656,W50637691,N240109189)
```xml
<lookupresults timestamp="Mon, 29 Jun 15 18:01:33 +0000" attribution="Data © OpenStreetMap contributors, ODbL 1.0. https://www.openstreetmap.org/copyright" querystring="R146656,W104393803,N240109189" polygon="false">
<place place_id="127761056" osm_type="relation" osm_id="146656" place_rank="16" lat="53.4791466" lon="-2.2447445" display_name="Manchester, Greater Manchester, North West England, England, United Kingdom" class="boundary" type="administrative" importance="0.704893333438333">
<lookupresults timestamp="Mon, 28 Mar 22 14:38:54 +0000" attribution="Data &#xA9; OpenStreetMap contributors, ODbL 1.0. http://www.openstreetmap.org/copyright" querystring="R146656,W50637691,N240109189" more_url="">
<place place_id="282236157" osm_type="relation" osm_id="146656" place_rank="16" address_rank="16" boundingbox="53.3401044,53.5445923,-2.3199185,-2.1468288" lat="53.44246175" lon="-2.2324547359718547" display_name="Manchester, Greater Manchester, North West England, England, United Kingdom" class="boundary" type="administrative" importance="0.35">
<city>Manchester</city>
<county>Greater Manchester</county>
<state_district>North West England</state_district>
@@ -102,21 +102,20 @@ This overrides the specified machine readable format. (Default: 0)
<country>United Kingdom</country>
<country_code>gb</country_code>
</place>
<place place_id="77769745" osm_type="way" osm_id="104393803" place_rank="30" lat="52.5162024" lon="13.3777343363579" display_name="Brandenburg Gate, 1, Pariser Platz, Mitte, Berlin, 10117, Germany" class="tourism" type="attraction" importance="0.443472858361592">
<attraction>Brandenburg Gate</attraction>
<house_number>1</house_number>
<pedestrian>Pariser Platz</pedestrian>
<suburb>Mitte</suburb>
<city_district>Mitte</city_district>
<city>Berlin</city>
<state>Berlin</state>
<postcode>10117</postcode>
<place place_id="115462561" osm_type="way" osm_id="50637691" place_rank="30" address_rank="30" boundingbox="52.3994612,52.3996426,13.0479574,13.0481754" lat="52.399550700000006" lon="13.048066846939687" display_name="Brandenburger Tor, Brandenburger Stra&#xDF;e, Historische Innenstadt, Innenstadt, Potsdam, Brandenburg, 14467, Germany" class="tourism" type="attraction" importance="0.29402874005524">
<tourism>Brandenburger Tor</tourism>
<road>Brandenburger Stra&#xDF;e</road>
<suburb>Historische Innenstadt</suburb>
<city>Potsdam</city>
<state>Brandenburg</state>
<postcode>14467</postcode>
<country>Germany</country>
<country_code>de</country_code>
</place>
<place place_id="2570600569" osm_type="node" osm_id="240109189" place_rank="15" lat="52.5170365" lon="13.3888599" display_name="Berlin, Germany" class="place" type="city" importance="0.822149797630868">
<place place_id="567505" osm_type="node" osm_id="240109189" place_rank="15" address_rank="16" boundingbox="52.3586925,52.6786925,13.2396024,13.5596024" lat="52.5186925" lon="13.3996024" display_name="Berlin, 10178, Germany" class="place" type="city" importance="0.78753902824914">
<city>Berlin</city>
<state>Berlin</state>
<postcode>10178</postcode>
<country>Germany</country>
<country_code>de</country_code>
</place>
@@ -125,38 +124,50 @@ This overrides the specified machine readable format. (Default: 0)
##### JSON with extratags
[https://nominatim.openstreetmap.org/lookup?osm_ids=W50637691&format=json](https://nominatim.openstreetmap.org/lookup?osm_ids=W50637691&format=json)
[https://nominatim.openstreetmap.org/lookup?osm_ids=W50637691&format=json&extratags=1](https://nominatim.openstreetmap.org/lookup?osm_ids=W50637691&format=json&extratags=1)
```json
[
{
"place_id": "84271358",
"licence": "Data © OpenStreetMap contributors, ODbL 1.0. https://osm.org/copyright",
"osm_type": "way",
"osm_id": "50637691",
"lat": "52.39955055",
"lon": "13.04806574678",
"display_name": "Brandenburger Tor, Brandenburger Straße, Nördliche Innenstadt, Innenstadt, Potsdam, Brandenburg, 14467, Germany",
"class": "historic",
"type": "city_gate",
"importance": "0.221233780277011",
"address": {
"address29": "Brandenburger Tor",
"pedestrian": "Brandenburger Straße",
"suburb": "Nördliche Innenstadt",
"city": "Potsdam",
"state": "Brandenburg",
"postcode": "14467",
"country": "Germany",
"country_code": "de"
},
"extratags": {
"image": "http://commons.wikimedia.org/wiki/File:Potsdam_brandenburger_tor.jpg",
"wikidata": "Q695045",
"wikipedia": "de:Brandenburger Tor (Potsdam)",
"wheelchair": "yes",
"description": "Kleines Brandenburger Tor in Potsdam"
}
}
{
"place_id": 115462561,
"licence": "Data © OpenStreetMap contributors, ODbL 1.0. https://osm.org/copyright",
"osm_type": "way",
"osm_id": 50637691,
"boundingbox": [
"52.3994612",
"52.3996426",
"13.0479574",
"13.0481754"
],
"lat": "52.399550700000006",
"lon": "13.048066846939687",
"display_name": "Brandenburger Tor, Brandenburger Straße, Historische Innenstadt, Innenstadt, Potsdam, Brandenburg, 14467, Germany",
"class": "tourism",
"type": "attraction",
"importance": 0.2940287400552381,
"address": {
"tourism": "Brandenburger Tor",
"road": "Brandenburger Straße",
"suburb": "Historische Innenstadt",
"city": "Potsdam",
"state": "Brandenburg",
"postcode": "14467",
"country": "Germany",
"country_code": "de"
},
"extratags": {
"image": "http://commons.wikimedia.org/wiki/File:Potsdam_brandenburger_tor.jpg",
"heritage": "4",
"wikidata": "Q695045",
"architect": "Carl von Gontard;Georg Christian Unger",
"wikipedia": "de:Brandenburger Tor (Potsdam)",
"wheelchair": "yes",
"description": "Kleines Brandenburger Tor in Potsdam",
"heritage:website": "http://www.bldam-brandenburg.de/images/stories/PDF/DML%202012/04-p-internet-13.pdf",
"heritage:operator": "bldam",
"architect:wikidata": "Q68768;Q95223",
"year_of_construction": "1771"
}
}
]
```
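The request URLs used in these examples can be assembled programmatically. A minimal sketch, assuming a hypothetical helper `lookup_url`; the endpoint and the `osm_ids`, `format` and `extratags` parameters are the ones shown above.

```python
from urllib.parse import urlencode

BASE_URL = "https://nominatim.openstreetmap.org/lookup"


def lookup_url(osm_ids, fmt="json", extratags=False):
    """Build a lookup URL for a list of OSM ids such as 'W50637691'."""
    params = {"osm_ids": ",".join(osm_ids), "format": fmt}
    if extratags:
        params["extratags"] = 1
    # Keep the commas readable; urlencode would otherwise escape them as %2C.
    return f"{BASE_URL}?{urlencode(params, safe=',')}"


print(lookup_url(["W50637691"], extratags=True))
```

When sending such requests to the public server, follow the usage policy of the service.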


@@ -28,6 +28,7 @@ a single place (for reverse) of the following format:
"city": "London",
"state_district": "Greater London",
"state": "England",
"ISO3166-2-lvl4": "GB-ENG",
"postcode": "SW1A 2DU",
"country": "United Kingdom",
"country_code": "gb"
@@ -97,7 +98,10 @@ The GeocodeJSON format follows the
The following feature attributes are implemented:
* `osm_type`, `osm_id` - reference to the OSM object (unofficial extension, [see notes](#osm-reference))
* `type` - value of the main tag of the object (e.g. residential, restaurant, ...)
* `type` - the 'address level' of the object (`house`, `street`, `district`, `city`,
`county`, `state`, `country`, `locality`)
* `osm_key` - key of the main tag of the OSM object (e.g. boundary, highway, amenity)
* `osm_value` - value of the main tag of the OSM object (e.g. residential, restaurant)
* `label` - full comma-separated address
* `name` - localised name of the place
* `housenumber`, `street`, `locality`, `district`, `postcode`, `city`,
@@ -126,6 +130,7 @@ formats depending on the API call.
</result>
<addressparts>
<state>Bavaria</state>
<ISO3166-2-lvl4>DE-BY</ISO3166-2-lvl4>
<country>Germany</country>
<country_code>de</country_code>
</addressparts>
@@ -179,6 +184,7 @@ Additional information requested with `addressdetails=1`, `extratags=1` and
<city>London</city>
<state_district>Greater London</state_district>
<state>England</state>
<ISO3166-2-lvl4>GB-ENG</ISO3166-2-lvl4>
<postcode>SW1A 2DU</postcode>
<country>United Kingdom</country>
<country_code>gb</country_code>
@@ -205,8 +211,8 @@ be more than one. The attributes of that element contain:
* `ref` - content of `ref` tag if it exists
* `lat`, `lon` - latitude and longitude of the centroid of the object
* `boundingbox` - comma-separated list of corner coordinates ([see notes](#boundingbox))
* `place_rank` - class [search rank](../develop/Ranking#search-rank)
* `address_rank` - place [address rank](../develop/Ranking#address-rank)
* `place_rank` - class [search rank](../customize/Ranking#search-rank)
* `address_rank` - place [address rank](../customize/Ranking#address-rank)
* `display_name` - full comma-separated address
* `class`, `type` - key and value of the main OSM tag
* `importance` - computed importance rank
@@ -230,7 +236,7 @@ on another server. It may even change its ID on the same server when it is
removed and reimported while updating the database with fresh OSM data.
It is thus not useful to treat it as permanent for later use.
The combination `osm_type`+`osm_id` is slighly better but remember in
The combination `osm_type`+`osm_id` is slightly better, but remember that in
OpenStreetMap mappers can delete, split and recreate places (and those
get a new `osm_id`). There is no link between the old and new ids.
Places can also change their meaning without changing their `osm_id`,
@@ -279,12 +285,12 @@ with a designation label. Per default the following labels may appear:
* continent
* country, country_code
* region, state, state_district, county
* region, state, state_district, county, ISO3166-2-lvl<admin_level>
* municipality, city, town, village
* city_district, district, borough, suburb, subdivision
* hamlet, croft, isolated_dwelling
* neighbourhood, allotments, quarter
* city_block, residental, farm, farmyard, industrial, commercial, retail
* city_block, residential, farm, farmyard, industrial, commercial, retail
* road
* house_number, house_name
* emergency, historic, military, natural, landuse, place, railway,


@@ -27,11 +27,11 @@ The main format of the reverse API is
https://nominatim.openstreetmap.org/reverse?lat=<value>&lon=<value>&<params>
```
where `lat` and `lon` are latitude and longitutde of a coordinate in WGS84
where `lat` and `lon` are latitude and longitude of a coordinate in WGS84
projection. The API returns exactly one result or an error when the coordinate
is in an area with no OSM data coverage.
Additional paramters are accepted as listed below.
Additional parameters are accepted as listed below.
!!! warning "Deprecation warning"
The reverse API used to allow address lookup for a single OSM object by
@@ -118,7 +118,7 @@ geometry. Topology is preserved in the result. (Default: 0.0)
* `email=<valid email address>`
If you are making large numbers of request please include an appropriate email
If you are making a large number of requests, please include an appropriate email
address to identify your requests. See Nominatim's [Usage Policy](https://operations.osmfoundation.org/policies/nominatim/) for more details.


@@ -51,6 +51,12 @@ Both query forms accept the additional parameters listed below.
See [Place Output Formats](Output.md) for details on each format. (Default: jsonv2)
!!! note
The Nominatim service at
[https://nominatim.openstreetmap.org](https://nominatim.openstreetmap.org)
has a different default behaviour for historical reasons. When the
`format` parameter is omitted, the request will be forwarded to the Web UI.
* `json_callback=<string>`
Wrap JSON output in a callback function ([JSONP](https://en.wikipedia.org/wiki/JSONP)) i.e. `<string>(<json>)`.


@@ -0,0 +1,149 @@
# Customizing Per-Country Data
Whenever an OSM object is imported into Nominatim, the object is first assigned
a country. Nominatim can use this information to adapt various aspects of
the address computation to the local customs of the country. This section
explains how country assignment works and the principal per-country
localizations.
## Country assignment
Countries are assigned on the basis of country data from the OpenStreetMap
input data itself. Countries are expected to be tagged according to the
[administrative boundary schema](https://wiki.openstreetmap.org/wiki/Tag:boundary%3Dadministrative):
an OSM relation with `boundary=administrative` and `admin_level=2`. Nominatim
uses the country code to distinguish the countries.
If there is no country data available for a point, then Nominatim uses the
fallback data imported from `data/country_osm_grid.sql.gz`. This was computed
from OSM data as well but is guaranteed to cover all countries.
Some OSM objects may also be located outside any country, for example a buoy
in the middle of the ocean. These objects do not get any country assigned and
get a default treatment when it comes to localized handling of data.
## Per-country settings
### Global country settings
The main place to configure settings per country is the file
`settings/country_settings.yaml`. This file has one section per country that
is recognised by Nominatim. Each section is tagged with the country code
(in lower case) and contains the different localization information. Only
countries which are listed in this file are taken into account for computations.
For example, the section for Andorra looks like this:
```
partition: 35
languages: ca
names: !include country-names/ad.yaml
postcode:
pattern: "(ddd)"
output: AD\1
```
The individual settings are described below.
#### `partition`
Nominatim internally splits the data into multiple tables to improve
performance. The partition number tells Nominatim into which table to put
the country. This is purely internal management and has no effect on the
output data.
The default is to have one partition per country.
#### `languages`
A comma-separated list of ISO-639 language codes of default languages in the
country. These are the languages used in name tags without a language suffix.
Note that this is not necessarily the same as the list of official languages
in the country. There may be officially recognised languages in a country
which are only ever used in name tags with the appropriate language suffixes.
Conversely, a non-official language may appear a lot in the name tags, for
example when used as an unofficial Lingua Franca.
List the languages in order of frequency of appearance with the most frequently
used language first. It is not recommended to add languages when there are only
very few occurrences.
If only one language is listed, then Nominatim will 'auto-complete' the
language of names without an explicit language-suffix.
#### `names`
List of names of the country and its translations. These names are used as
a baseline. It is always possible to search countries by the given names, no
matter what other names are in the OSM data. They are also used as a fallback
when a needed translation is not available.
!!! Note
The list of names per country is currently fairly large because Nominatim
supports translations in many languages per default. That is why the
name lists have been separated out into extra files. You can find the
name lists in the file `settings/country-names/<country code>.yaml`.
The names section in the main country settings file only refers to these
files via the special `!include` directive.
#### `postcode`
Describes the format of the postcode that is in use in the country.
When a country has no official postcodes, set this to `no`. Example:
```
ae:
postcode: no
```
When a country has a postcode, you need to state the postcode pattern and
the default output format. Example:
```
bm:
postcode:
pattern: "(ll)[ -]?(dd)"
output: \1 \2
```
The **pattern** is a regular expression that describes the possible formats
accepted as a postcode. The pattern follows the standard syntax for
[regular expressions in Python](https://docs.python.org/3/library/re.html#regular-expression-syntax)
with two extra shortcuts: `d` is a shortcut for a single digit (`[0-9]`)
and `l` for a single ASCII letter (`[A-Z]`).
Use match groups to indicate groups in the postcode that may optionally be
separated with a space or a hyphen.
For example, the postcode for Bermuda above always consists of two letters
and two digits. They may optionally be separated by a space or hyphen. That
means that Nominatim will consider `AB56`, `AB 56` and `AB-56` spelling variants
for one and the same postcode.
Never add the country code in front of the postcode pattern. Nominatim will
automatically accept variants with a country code prefix for all postcodes.
The **output** field is an optional field that describes what the canonical
spelling of the postcode should be. The format is the
[regular expression expand syntax](https://docs.python.org/3/library/re.html#re.Match.expand) referring back to the bracket groups in the pattern.
Most simple postcodes only have one spelling variant. In that case, the
**output** can be omitted. The postcode will simply be used as is.
In the Bermuda example above, the canonical spelling would be to have a space
between letters and digits.
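How **pattern** and **output** interact can be sketched in a few lines of Python. This is not Nominatim's implementation, only an illustration under stated assumptions: the function names are hypothetical, the shortcuts are expanded by naive text replacement (so the pattern must not contain other literal `d` or `l` characters), and the input is upper-cased before matching.

```python
import re


def compile_postcode_pattern(pattern):
    """Expand the `d` and `l` shortcuts into regular expression classes."""
    return re.compile(pattern.replace("d", "[0-9]").replace("l", "[A-Z]"))


def normalize_postcode(postcode, pattern, output=r"\g<0>"):
    """Return the canonical spelling, or None if the postcode does not match.

    Without an explicit output format the matched text is used as is.
    """
    match = compile_postcode_pattern(pattern).fullmatch(postcode.strip().upper())
    return match.expand(output) if match else None


# The Bermuda settings from above: all spellings map to the same output.
for spelling in ("AB56", "AB 56", "ab-56"):
    print(normalize_postcode(spelling, "(ll)[ -]?(dd)", r"\1 \2"))
```

With the Andorra settings, `normalize_postcode("500", "(ddd)", r"AD\1")` would give `AD500`.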
!!! Warning
When your postcode pattern covers multiple variants of the postcode, then
you must explicitly state the canonical output or Nominatim will not
handle the variations correctly.
### Other country-specific configuration
There are some other configuration files where you can set localized settings
according to the assigned country. These are:
* [Place ranking configuration](Ranking.md)
Please see the linked documentation sections for more information.


@@ -10,7 +10,7 @@ option. There are a number of default styles, which are explained in detail
in the [Import section](../admin/Import.md#filtering-imported-data). These
standard styles may be referenced by their name.
You can also create your own custom syle. Put the style file into your
You can also create your own custom style. Put the style file into your
project directory and then set `NOMINATIM_IMPORT_STYLE` to the name of the file.
It is always recommended to start with one of the standard styles and customize
those. You find the standard styles under the name `import-<stylename>.style`


@@ -189,7 +189,7 @@ will be used.
| **Description:** | Enable searching for Tiger house number data |
| **Format:** | boolean |
| **Default:** | no |
| **After Changes:** | run `nominatim --refresh --functions` |
| **After Changes:** | run `nominatim refresh --functions` |
When this setting is enabled, search and reverse queries also take data
from [Tiger house number data](Tiger.md) into account.
@@ -202,7 +202,7 @@ from [Tiger house number data](Tiger.md) into account.
| **Description:** | Enable searching in external house number tables |
| **Format:** | boolean |
| **Default:** | no |
| **After Changes:** | run `nominatim --refresh --functions` |
| **After Changes:** | run `nominatim refresh --functions` |
| **Comment:** | Do not use. |
When this setting is enabled, search queries also take data from external


@@ -5,22 +5,22 @@ address set to complement the OSM house number data in the US. You can add
TIGER data to your own Nominatim instance by following these steps. The
entire US adds about 10GB to your database.
1. Get preprocessed TIGER 2021 data:
1. Get preprocessed TIGER data:
cd $PROJECT_DIR
wget https://nominatim.org/data/tiger2021-nominatim-preprocessed.csv.tar.gz
wget https://nominatim.org/data/tiger-nominatim-preprocessed-latest.csv.tar.gz
2. Import the data into your Nominatim database:
nominatim add-data --tiger-data tiger2021-nominatim-preprocessed.csv.tar.gz
nominatim add-data --tiger-data tiger-nominatim-preprocessed-latest.csv.tar.gz
3. Enable use of the Tiger data in your `.env` by adding:
3. Enable use of the Tiger data in your existing `.env` file by adding:
echo NOMINATIM_USE_US_TIGER_DATA=yes >> .env
4. Apply the new settings:
nominatim refresh --functions
nominatim refresh --functions --website
See the [TIGER-data project](https://github.com/osm-search/TIGER-data) for more


@@ -19,7 +19,22 @@ they can be configured.
The legacy tokenizer implements the analysis algorithms of older Nominatim
versions. It uses a special Postgresql module to normalize names and queries.
This tokenizer is currently the default.
This tokenizer is automatically installed and used when upgrading an older
database. It should not be used for new installations anymore.
### Compiling the PostgreSQL module
The tokenizer needs a special C module for PostgreSQL which is not compiled
by default. If you need the legacy tokenizer, compile Nominatim as follows:
```
mkdir build
cd build
cmake -DBUILD_MODULE=on
make
```
### Enabling the tokenizer
To enable the tokenizer add the following line to your project configuration:
@@ -47,6 +62,7 @@ normalization functions are hard-coded.
The ICU tokenizer uses the [ICU library](http://site.icu-project.org/) to
normalize names and queries. It also offers configurable decomposition and
abbreviation handling.
This tokenizer is currently the default.
To enable the tokenizer add the following line to your project configuration:
@@ -99,6 +115,9 @@ token-analysis:
- words:
- road -> rd
- bridge -> bdge,br,brdg,bri,brg
mutations:
- pattern: 'ä'
replacements: ['ä', 'ae']
```
The configuration file contains four sections:
@@ -178,6 +197,21 @@ The following is a list of sanitizers that are shipped with Nominatim.
rendering:
heading_level: 6
##### clean-housenumbers
::: nominatim.tokenizer.sanitizers.clean_housenumbers
selection:
members: False
rendering:
heading_level: 6
##### clean-postcodes
::: nominatim.tokenizer.sanitizers.clean_postcodes
selection:
members: False
rendering:
heading_level: 6
#### Token Analysis
@@ -196,21 +230,25 @@ by a sanitizer (see for example the
The token-analysis section contains the list of configured analyzers. Each
analyzer must have an `id` parameter that uniquely identifies the analyzer.
The only exception is the default analyzer that is used when no special
analyzer was selected.
analyzer was selected. There are analyzers with special ids:
* '@housenumber'. If an analyzer with that name is present, it is used
for normalization of house numbers.
* '@postcode'. If an analyzer with that name is present, it is used
for normalization of postcodes.
Different analyzer implementations may exist. To select the implementation,
the `analyzer` parameter must be set. Currently there is only one implementation
`generic` which is described in the following.
the `analyzer` parameter must be set. The different implementations are
described in the following.
##### Generic token analyzer
The generic analyzer is able to create variants from a list of given
abbreviation and decomposition replacements. It takes one optional parameter
`variants` which lists the replacements to apply. If the section is
omitted, then the generic analyzer becomes a simple analyzer that only
applies the transliteration.
The generic analyzer `generic` is able to create variants from a list of given
abbreviation and decomposition replacements and introduce spelling variations.
The variants section defines lists of replacements which create alternative
###### Variants
The optional 'variants' section defines lists of replacements which create alternative
spellings of a name. To create the variants, a name is scanned from left to
right and the longest matching replacement is applied until the end of the
string is reached.
@@ -296,6 +334,48 @@ decomposition has an effect here on the source as well. So a rule
means that for a word like `hauptstrasse` four variants are created:
`hauptstrasse`, `haupt strasse`, `hauptstr` and `haupt str`.
###### Mutations
The 'mutations' section in the configuration describes an additional set of
replacements to be applied after the variants have been computed.
Each mutation is described by two parameters: `pattern` and `replacements`.
The pattern must contain a single regular expression to search for in the
variant name. The regular expressions need to follow the syntax for
[Python regular expressions](https://docs.python.org/3/library/re.html#regular-expression-syntax).
Capturing groups are not permitted.
`replacements` must contain a list of strings that the pattern
should be replaced with. Each occurrence of the pattern is replaced with
all given replacements. Be mindful of combinatorial explosion of variants.
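The effect of a single mutation, including the growth in the number of variants, can be illustrated as follows. This is a sketch only and not the code used by the analyzer; `apply_mutation` is a hypothetical function that replaces every occurrence of the pattern with each of the given replacements.

```python
import itertools
import re


def apply_mutation(name, pattern, replacements):
    """Return all spellings created by replacing every match of `pattern`."""
    # Without capturing groups, re.split returns only the text between matches.
    parts = re.split(pattern, name)
    occurrences = len(parts) - 1
    variants = []
    # Each occurrence independently takes one of the replacements.
    for choice in itertools.product(replacements, repeat=occurrences):
        pieces = [parts[0]]
        for replacement, rest in zip(choice, parts[1:]):
            pieces.extend((replacement, rest))
        variants.append("".join(pieces))
    return variants


print(apply_mutation("bärenjäger", "ä", ["ä", "ae"]))
```

A name with two occurrences and two replacements already yields four variants; in general the count is the number of replacements raised to the number of occurrences.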
###### Modes
The generic analyzer supports a special mode `variant-only`. When configured,
it consumes the input token and emits only variants (if any exist). Enable
the mode by adding:
```
mode: variant-only
```
to the analyzer configuration.
##### Housenumber token analyzer
The analyzer `housenumbers` is purpose-made to analyze house numbers. It
creates variants with optional spaces between numbers and letters. Thus,
house numbers of the form '3 a', '3A', '3-A' etc. are all considered equivalent.
The analyzer cannot be customized.
##### Postcode token analyzer
The analyzer `postcodes` is purpose-made to analyze postcodes. It supports
a 'lookup' variant of the token, which produces variants with optional
spaces. Use together with the clean-postcodes sanitizer.
The analyzer cannot be customized.
### Reconfiguration
Changing the configuration after the import is currently not possible, although


@@ -119,7 +119,7 @@ to compute the address relations between places. These tables are partitioned.
Each country is assigned a partition number in the country_name table (see
below) and the data is then split between a set of tables, one for each
partition. Note that Nominatim still manually manages partitioned tables.
Native support for partitions in PostgreSQL only became useable with version 13.
Native support for partitions in PostgreSQL only became usable with version 13.
It will be a little while before Nominatim drops support for older versions.
![address tables](address-tables.svg)
@@ -155,9 +155,9 @@ Nominatim also creates a number of static tables at import:
default languages and saves the assignment of countries to partitions.
* `country_osm_grid` provides a fallback for country geometries
## Auxilary data tables
## Auxiliary data tables
Finally there are some table for auxillary data:
Finally there are some table for auxiliary data:
* `location_property_tiger` - saves housenumber from the Tiger import. Its
layout is similar to that of `location_property_osmline`.

View File

@@ -1,6 +1,6 @@
# Setting up Nominatim for Development
This chapter gives an overview how to set up Nominatim for developement
This chapter gives an overview how to set up Nominatim for development
and how to run tests.
!!! Important
@@ -30,15 +30,18 @@ unit tests (using PHPUnit for PHP code and pytest for Python code).
It has the following additional requirements:
* [behave test framework](https://behave.readthedocs.io) >= 1.2.6
* [phpunit](https://phpunit.de) >= 7.3
* [phpunit](https://phpunit.de) (9.5 is known to work)
* [PHP CodeSniffer](https://github.com/squizlabs/PHP_CodeSniffer)
* [Pylint](https://pylint.org/) (2.6.0 is used for the CI)
* [Pylint](https://pylint.org/) (CI always runs the latest version from pip)
* [mypy](http://mypy-lang.org/) (plus typing information for external libs)
* [Python Typing Extensions](https://github.com/python/typing_extensions) (for Python < 3.9)
* [pytest](https://pytest.org)
The documentation is built with mkdocs:
* [mkdocs](https://www.mkdocs.org/) >= 1.1.2
* [mkdocstrings](https://mkdocstrings.github.io/)
* [mkdocstrings](https://mkdocstrings.github.io/) >= 0.16
* [mkdocstrings-python-legacy](https://mkdocstrings.github.io/python-legacy/)
### Installing prerequisites on Ubuntu/Debian
@@ -50,9 +53,10 @@ To install all necessary packages run:
```sh
sudo apt install php-cgi phpunit php-codesniffer \
python3-pip python3-setuptools python3-dev pylint
python3-pip python3-setuptools python3-dev
pip3 install --user behave mkdocs mkdocstrings pytest
pip3 install --user behave mkdocs mkdocstrings pytest pylint \
mypy types-PyYAML types-jinja2 types-psycopg2 types-psutil
```
The `mkdocs` executable will be located in `.local/bin`. You may have to add

View File

@@ -0,0 +1,227 @@
# Writing custom sanitizer and token analysis modules for the ICU tokenizer
The [ICU tokenizer](../customize/Tokenizers.md#icu-tokenizer) provides a
highly customizable method to pre-process and normalize the name information
of the input data before it is added to the search index. It comes with a
selection of sanitizers and token analyzers which you can use to adapt your
installation to your needs. If the provided modules are not enough, you can
also provide your own implementations. This section describes the API
of sanitizers and token analysis.
!!! warning
This API is currently in early alpha status. While this API is meant to
be a public API on which other sanitizers and token analyzers may be
implemented, it is not guaranteed to be stable at the moment.
## Using non-standard sanitizers and token analyzers
Sanitizer names (in the `step` property) and token analysis names (in the
`analyzer` property) may refer to externally supplied modules. There are two
ways to include external modules: through a library or from the project directory.
To include a module from a library, use the absolute import path as name and
make sure the library can be found in your PYTHONPATH.
To use a custom module without creating a library, you can put the module
somewhere in your project directory and then use the relative path to the
file. Include the whole name of the file including the `.py` ending.
## Custom sanitizer modules
A sanitizer module must export a single factory function `create` with the
following signature:
``` python
def create(config: SanitizerConfig) -> Callable[[ProcessInfo], None]
```
The function receives the custom configuration for the sanitizer and must
return a callable (function or class) that transforms the name and address
terms of a place. When a place is processed, then a `ProcessInfo` object
is created from the information that was queried from the database. This
object is sequentially handed to each configured sanitizer, so that each
sanitizer receives the result of processing from the previous sanitizer.
After the last sanitizer is finished, the resulting name and address lists
are forwarded to the token analysis module.
Sanitizer functions are instantiated once and then called for each place
that is imported or updated. They don't need to be thread-safe.
If multi-threading is used, each thread creates its own instance of
the function.
### Sanitizer configuration
::: nominatim.tokenizer.sanitizers.config.SanitizerConfig
rendering:
show_source: no
heading_level: 6
### The main filter function of the sanitizer
The filter function receives a single object of type `ProcessInfo`
which has three members:
* `place`: read-only information about the place being processed.
See PlaceInfo below.
* `names`: The current list of names for the place. Each name is a
PlaceName object.
* `address`: The current list of address names for the place. Each name
is a PlaceName object.
While the `place` member is provided for information only, the `names` and
`address` lists are meant to be manipulated by the sanitizer. It may add and
remove entries, change information within a single entry (for example by
adding extra attributes) or completely replace the list with a different one.
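The three kinds of manipulation can be sketched as follows. The classes `StandInName` and `StandInInfo` are hypothetical, reduced stand-ins that only mimic the members described above so that the snippet runs on its own; they are not the Nominatim types, and the rules applied are purely illustrative.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class StandInName:
    """Hypothetical, reduced stand-in for a PlaceName entry."""
    name: str
    kind: str = 'name'
    suffix: Optional[str] = None
    attr: Dict[str, str] = field(default_factory=dict)


@dataclass
class StandInInfo:
    """Hypothetical stand-in for ProcessInfo (the `place` member is omitted)."""
    names: List[StandInName]
    address: List[StandInName]


def create(config):
    # No configuration is read in this sketch.
    def _filter(obj):
        # Replace the whole list: keep only names that are not blank.
        obj.names = [n for n in obj.names if n.name.strip()]
        # Change a single entry: remember the suffix as an extra attribute.
        for n in obj.names:
            if n.suffix:
                n.attr['lang'] = n.suffix
        # Remove entries in place: drop address parts of kind 'country'.
        obj.address[:] = [a for a in obj.address if a.kind != 'country']

    return _filter
```

Because each sanitizer receives the output of the previous one, a filter like this should leave the lists in a state that later sanitizers can still work with.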
#### PlaceInfo - information about the place
::: nominatim.data.place_info.PlaceInfo
rendering:
show_source: no
heading_level: 6
#### PlaceName - extended naming information
::: nominatim.data.place_name.PlaceName
rendering:
show_source: no
heading_level: 6
### Example: Filter for US street prefixes
The following sanitizer removes the directional prefixes from street names
in the US:
``` python
import re

def _filter_function(obj):
    if obj.place.country_code == 'us' \
       and obj.place.rank_address >= 26 and obj.place.rank_address <= 27:
        for name in obj.names:
            name.name = re.sub(r'^(north|south|west|east) ',
                               '',
                               name.name,
                               flags=re.IGNORECASE)

def create(config):
    return _filter_function
```
This is the simplest form of a sanitizer module. It defines a single
filter function and implements the required `create()` function by returning
the filter.
The filter function first checks if the object is interesting for the
sanitizer. Namely, it checks if the place is in the US (through `country_code`)
and if the place is a street (a `rank_address` of 26 or 27). If the
conditions are met, then it goes through all available names and
removes any leading directional prefix using a simple regular expression.
Save the source code in a file in your project directory, for example as
`us_streets.py`. Then you can use the sanitizer in your `icu_tokenizer.yaml`:
``` yaml
...
sanitizers:
- step: us_streets.py
...
```
!!! warning
This example is just a simplified showcase of how to create a sanitizer.
It is not really ready for real-world use: while the sanitizer would
correctly transform `West 5th Street` into `5th Street`, it would also
shorten a simple `North Street` to `Street`.
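One possible way to avoid the `North Street` problem is to strip the prefix only when at least two more words follow it. The sketch below is an assumption about how that could look, not a tested production rule; the `SimpleNamespace` objects at the end are stand-ins for the real `ProcessInfo` data and are only there to make the snippet runnable.

```python
import re
from types import SimpleNamespace

# Match a directional prefix only if at least two further words follow,
# so that 'North Street' is left untouched.
_PREFIX = re.compile(r'^(north|south|west|east) (?=\S+ \S)', re.IGNORECASE)


def _filter_function(obj):
    if obj.place.country_code == 'us' and 26 <= obj.place.rank_address <= 27:
        for name in obj.names:
            name.name = _PREFIX.sub('', name.name)


def create(config):
    return _filter_function


# Quick check with stand-in objects (not the Nominatim types).
place = SimpleNamespace(country_code='us', rank_address=26)
names = [SimpleNamespace(name='West 5th Street'),
         SimpleNamespace(name='North Street')]
create(None)(SimpleNamespace(place=place, names=names))
print([n.name for n in names])  # ['5th Street', 'North Street']
```

Other edge cases, such as streets that are genuinely named after a direction plus two words, would still need separate handling.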
For more sanitizer examples, have a look at the sanitizers provided by Nominatim.
They can be found in the directory
[`nominatim/tokenizer/sanitizers`](https://github.com/osm-search/Nominatim/tree/master/nominatim/tokenizer/sanitizers).
## Custom token analysis module
::: nominatim.tokenizer.token_analysis.base.AnalysisModule
rendering:
show_source: no
heading_level: 6
::: nominatim.tokenizer.token_analysis.base.Analyzer
rendering:
show_source: no
heading_level: 6
### Example: Creating acronym variants for long names
The following example of a token analysis module creates acronyms from
very long names and adds them as a variant:
``` python
class AcronymMaker:
    """ This class is the actual analyzer.
    """
    def __init__(self, norm, trans):
        self.norm = norm
        self.trans = trans

    def get_canonical_id(self, name):
        # In simple cases, the normalized name can be used as a canonical id.
        return self.norm.transliterate(name.name).strip()

    def compute_variants(self, name):
        # The transliterated form of the name always makes up a variant.
        variants = [self.trans.transliterate(name)]

        # Only create acronyms from very long names.
        if len(name) > 20:
            # Take the first letter from each word to form the acronym.
            acronym = ''.join(w[0] for w in name.split())
            # If that leads to an acronym with at least three letters,
            # add the resulting acronym as a variant.
            if len(acronym) > 2:
                # Never forget to transliterate the variants before returning them.
                variants.append(self.trans.transliterate(acronym))

        return variants

# The following two functions are the module interface.

def configure(rules, normalizer, transliterator):
    # There is no configuration to parse and no data to set up.
    # Just return an empty configuration.
    return None

def create(normalizer, transliterator, config):
    # Return a new instance of our token analysis class above.
    return AcronymMaker(normalizer, transliterator)
```
Given the name `Trans-Siberian Railway`, the code above would return the full
name `Trans-Siberian Railway` and the acronym `TSR` as variant, so that
searching would work for both.
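The acronym rule itself can be tried out without a Nominatim installation. The sketch below isolates it in a hypothetical helper, `acronym_variants`, and leaves out transliteration entirely. It assumes that the name has already been normalized to lower case with punctuation turned into spaces; whether that happens depends on the configured normalization rules.

```python
def acronym_variants(name, min_length=20, min_letters=3):
    """Return the name plus an acronym variant for long, multi-word names.

    Illustrative only: `name` is expected to be already normalized.
    """
    variants = [name]
    if len(name) > min_length:
        # Take the first letter of each whitespace-separated word.
        acronym = ''.join(word[0] for word in name.split())
        if len(acronym) >= min_letters:
            variants.append(acronym)
    return variants
```

Called with `'trans siberian railway'`, the helper yields the name and `'tsr'`. With the raw `'Trans-Siberian Railway'` the hyphenated part counts as one word, so no acronym is added, which shows how much the result depends on the normalization step.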
## Sanitizers vs. Token analysis - what to use for variants?
It is not always clear when to implement variations in the sanitizer and
when to write a token analysis module. Just take the acronym example
above: it would also have been possible to write a sanitizer which adds the
acronym as an additional name to the name list. The result would have been
similar. So which should be used when?
The most important thing to keep in mind is that variants created by the
token analysis are only saved in the word lookup table. They do not need
extra space in the search index. If there are many spelling variations, this
can mean quite a significant amount of space is saved.
When creating additional names with a sanitizer, these names are completely
independent. In particular, they can be fed into different token analysis
modules. This gives a much greater flexibility but at the price that the
additional names increase the size of the search index.

View File

@@ -78,7 +78,7 @@ The inheritance is computed in the data preparation step.
The prepared place information is handed to the tokenizer next. This is a
Python module responsible for processing the names from both name and address
terms and building up the word index from them. The process is explained in
more detail in the [Tokenizer chapter](Tokenizer.md).
more detail in the [Tokenizer chapter](Tokenizers.md).
### Address processing

View File

@@ -22,8 +22,8 @@ This test directory is sturctured as follows:
|
+- php PHP unit tests
+- python Python unit tests
+- scenes Geometry test data
+- testdb Base data for generating API test database
+- testdata Additional test data used by unit tests
```
## PHP Unit Tests (`test/php`)

View File

@@ -93,7 +93,7 @@ for a custom tokenizer implementation.
Nominatim expects two files for a tokenizer:
* `nominiatim/tokenizer/<NAME>_tokenizer.py` containing the Python part of the
* `nominatim/tokenizer/<NAME>_tokenizer.py` containing the Python part of the
implementation
* `lib-php/tokenizer/<NAME>_tokenizer.php` with the PHP part of the
implementation
@@ -105,7 +105,7 @@ functions. By convention, these should be placed in `lib-sql/tokenizer`.
If the tokenizer has a default configuration file, this should be saved in
the `settings/<NAME>_tokenizer.<SUFFIX>`.
### Configuration and Persistance
### Configuration and Persistence
Tokenizers may define custom settings for their configuration. All settings
must be prefixed with `NOMINATIM_TOKENIZER_`. Settings may be transient or
@@ -245,11 +245,11 @@ Currently, tokenizers are encouraged to make sure that matching works against
both the search token list and the match token list.
```sql
FUNCTION token_normalized_postcode(postcode TEXT) RETURNS TEXT
FUNCTION token_get_postcode(info JSONB) RETURNS TEXT
```
Return the normalized version of the given postcode. This function must return
the same value as the Python function `AbstractAnalyzer->normalize_postcode()`.
Return the postcode for the object, if any exists. The postcode must be in
the form that should also be presented to the end-user.
```sql
FUNCTION token_strip_info(info JSONB) RETURNS JSONB

View File

@@ -13,7 +13,7 @@ More details in [osm-search/country-grid-data](https://github.com/osm-search/cou
## US Census TIGER
For the United States you can choose to import additonal street-level data.
For the United States you can choose to import additional street-level data.
The data isn't mixed into OSM data but queried as fallback when no OSM
result can be found.

View File

@@ -14,10 +14,11 @@ th {
background-color: #eee;
}
/* Indentation for mkdocstrings.
div.doc-contents:not(.first) {
padding-left: 25px;
border-left: 4px solid rgba(230, 230, 230);
margin-bottom: 60px;
}*/
.doc-object h6 {
margin-bottom: 0.8em;
font-size: 120%;
}
.doc-object {
margin-bottom: 1.3em;
}

View File

@@ -3,7 +3,7 @@ theme: readthedocs
docs_dir: ${CMAKE_CURRENT_BINARY_DIR}
site_url: https://nominatim.org
repo_url: https://github.com/openstreetmap/Nominatim
pages:
nav:
- 'Introduction' : 'index.md'
- 'API Reference':
- 'Overview': 'api/Overview.md'
@@ -28,6 +28,7 @@ pages:
- 'Overview': 'customize/Overview.md'
- 'Import Styles': 'customize/Import-Styles.md'
- 'Configuration Settings': 'customize/Settings.md'
- 'Per-Country Data': 'customize/Country-Settings.md'
- 'Place Ranking' : 'customize/Ranking.md'
- 'Tokenizers' : 'customize/Tokenizers.md'
- 'Special Phrases': 'customize/Special-Phrases.md'
@@ -38,14 +39,14 @@ pages:
- 'Database Layout' : 'develop/Database-Layout.md'
- 'Indexing' : 'develop/Indexing.md'
- 'Tokenizers' : 'develop/Tokenizers.md'
- 'Custom modules for ICU tokenizer': 'develop/ICU-Tokenizer-Modules.md'
- 'Setup for Development' : 'develop/Development-Environment.md'
- 'Testing' : 'develop/Testing.md'
- 'External Data Sources': 'develop/data-sources.md'
- 'Appendix':
- 'Installation on CentOS 7' : 'appendix/Install-on-Centos-7.md'
- 'Installation on CentOS 8' : 'appendix/Install-on-Centos-8.md'
- 'Installation on Ubuntu 18' : 'appendix/Install-on-Ubuntu-18.md'
- 'Installation on Ubuntu 20' : 'appendix/Install-on-Ubuntu-20.md'
- 'Installation on Ubuntu 22' : 'appendix/Install-on-Ubuntu-22.md'
markdown_extensions:
- codehilite
- admonition
@@ -57,7 +58,7 @@ plugins:
- search
- mkdocstrings:
handlers:
python:
python-legacy:
rendering:
show_source: false
show_signature_annotations: false

View File

@@ -1,4 +1,12 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
namespace Nominatim;
@@ -84,6 +92,10 @@ class AddressDetails
|| $aLine['class'] == 'place')
) {
$aAddress[$sTypeLabel] = $sName;
if (!empty($aLine['name'])) {
$this->addSubdivisionCode($aAddress, $aLine['admin_level'], $aLine['name']);
}
}
}
@@ -166,4 +178,14 @@ class AddressDetails
{
return $this->aAddressLines;
}
private function addSubdivisionCode(&$aAddress, $iAdminLevel, $nameDetails)
{
if (is_string($nameDetails)) {
$nameDetails = json_decode('{' . str_replace('"=>"', '":"', $nameDetails) . '}', true);
}
if (!empty($nameDetails['ISO3166-2'])) {
$aAddress["ISO3166-2-lvl$iAdminLevel"] = $nameDetails['ISO3166-2'];
}
}
}

View File

@@ -1,4 +1,12 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
namespace Nominatim\ClassTypes;

View File

@@ -1,4 +1,12 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
namespace Nominatim;
@@ -38,6 +46,9 @@ class DB
$conn->exec("SET DateStyle TO 'sql,european'");
$conn->exec("SET client_encoding TO 'utf-8'");
// Disable JIT and parallel workers. They interfere badly with search SQL.
$conn->exec("UPDATE pg_settings SET setting = -1 WHERE name = 'jit_above_cost'");
$conn->exec("UPDATE pg_settings SET setting = 0 WHERE name = 'max_parallel_workers_per_gather'");
$iMaxExecution = ini_get('max_execution_time');
if ($iMaxExecution > 0) {
$conn->setAttribute(\PDO::ATTR_TIMEOUT, $iMaxExecution); // seconds

View File

@@ -1,4 +1,12 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
namespace Nominatim;

View File

@@ -1,4 +1,12 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
namespace Nominatim;
@@ -127,7 +135,7 @@ class Debug
public static function printSQL($sSQL)
{
echo '<p><tt><font color="#aaa">'.$sSQL.'</font></tt></p>'."\n";
echo '<p><tt><font color="#aaa">'.htmlspecialchars($sSQL, ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML401).'</font></tt></p>'."\n";
}
private static function outputVar($mVar, $sPreNL)
@@ -170,11 +178,12 @@ class Debug
}
if (is_string($mVar)) {
echo "'$mVar'";
return strlen($mVar) + 2;
$sOut = "'$mVar'";
} else {
$sOut = (string)$mVar;
}
echo (string)$mVar;
return strlen((string)$mVar);
echo htmlspecialchars($sOut, ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML401);
return strlen($sOut);
}
}

View File

@@ -1,4 +1,12 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
namespace Nominatim;

View File

@@ -1,4 +1,12 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
namespace Nominatim;
@@ -95,7 +103,7 @@ class Geocode
}
$this->iFinalLimit = $iLimit;
$this->iLimit = $iLimit + min($iLimit, 10);
$this->iLimit = $iLimit + max($iLimit, 10);
}
public function setFeatureType($sFeatureType)
@@ -182,7 +190,7 @@ class Geocode
$this->bFallback = $oParams->getBool('fallback', $this->bFallback);
// List of excluded Place IDs - used for more acurate pageing
// List of excluded Place IDs - used for more accurate pageing
$sExcluded = $oParams->getStringList('exclude_place_ids');
if ($sExcluded) {
foreach ($sExcluded as $iExcludedPlaceID) {
@@ -609,16 +617,15 @@ class Geocode
}
$aReverseGroupedSearches = $this->getGroupedSearches($aSearches, $aPhrases, $oValidTokens);
foreach ($aGroupedSearches as $aSearches) {
foreach ($aReverseGroupedSearches as $aSearches) {
foreach ($aSearches as $aSearch) {
if (!isset($aReverseGroupedSearches[$aSearch->getRank()])) {
$aReverseGroupedSearches[$aSearch->getRank()] = array();
if (!isset($aGroupedSearches[$aSearch->getRank()])) {
$aGroupedSearches[$aSearch->getRank()] = array();
}
$aReverseGroupedSearches[$aSearch->getRank()][] = $aSearch;
$aGroupedSearches[$aSearch->getRank()][] = $aSearch;
}
}
$aGroupedSearches = $aReverseGroupedSearches;
ksort($aGroupedSearches);
}
} else {
@@ -836,7 +843,9 @@ class Geocode
$aResult['importance'] = 0.001;
$aResult['foundorder'] = $aResult['addressimportance'];
} else {
$aResult['importance'] = max(0.001, $aResult['importance']);
if ($aResult['importance'] == 0) {
$aResult['importance'] = 0.0001;
}
$aResult['importance'] *= $this->viewboxImportanceFactor(
$aResult['lon'],
$aResult['lat']

View File

@@ -1,4 +1,12 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
namespace Nominatim;
@@ -14,7 +22,10 @@ class ParameterParser
public function getBool($sName, $bDefault = false)
{
if (!isset($this->aParams[$sName]) || strlen($this->aParams[$sName]) == 0) {
if (!isset($this->aParams[$sName])
|| !is_string($this->aParams[$sName])
|| strlen($this->aParams[$sName]) == 0
) {
return $bDefault;
}
@@ -23,7 +34,7 @@ class ParameterParser
public function getInt($sName, $bDefault = false)
{
if (!isset($this->aParams[$sName])) {
if (!isset($this->aParams[$sName]) || is_array($this->aParams[$sName])) {
return $bDefault;
}
@@ -36,7 +47,7 @@ class ParameterParser
public function getFloat($sName, $bDefault = false)
{
if (!isset($this->aParams[$sName])) {
if (!isset($this->aParams[$sName]) || is_array($this->aParams[$sName])) {
return $bDefault;
}
@@ -49,7 +60,10 @@ class ParameterParser
public function getString($sName, $bDefault = false)
{
if (!isset($this->aParams[$sName]) || strlen($this->aParams[$sName]) == 0) {
if (!isset($this->aParams[$sName])
|| !is_string($this->aParams[$sName])
|| strlen($this->aParams[$sName]) == 0
) {
return $bDefault;
}
@@ -58,11 +72,14 @@ class ParameterParser
public function getSet($sName, $aValues, $sDefault = false)
{
if (!isset($this->aParams[$sName]) || strlen($this->aParams[$sName]) == 0) {
if (!isset($this->aParams[$sName])
|| !is_string($this->aParams[$sName])
|| strlen($this->aParams[$sName]) == 0
) {
return $sDefault;
}
if (!in_array($this->aParams[$sName], $aValues)) {
if (!in_array($this->aParams[$sName], $aValues, true)) {
userError("Parameter '$sName' must be one of: ".join(', ', $aValues));
}
@@ -106,21 +123,27 @@ class ParameterParser
}
foreach ($aLanguages as $sLanguage => $fLanguagePref) {
$aLangPrefOrder['name:'.$sLanguage] = 'name:'.$sLanguage;
$this->addNameTag($aLangPrefOrder, 'name:'.$sLanguage);
}
$aLangPrefOrder['name'] = 'name';
$aLangPrefOrder['brand'] = 'brand';
$this->addNameTag($aLangPrefOrder, 'name');
$this->addNameTag($aLangPrefOrder, 'brand');
foreach ($aLanguages as $sLanguage => $fLanguagePref) {
$aLangPrefOrder['official_name:'.$sLanguage] = 'official_name:'.$sLanguage;
$aLangPrefOrder['short_name:'.$sLanguage] = 'short_name:'.$sLanguage;
$this->addNameTag($aLangPrefOrder, 'official_name:'.$sLanguage);
$this->addNameTag($aLangPrefOrder, 'short_name:'.$sLanguage);
}
$aLangPrefOrder['official_name'] = 'official_name';
$aLangPrefOrder['short_name'] = 'short_name';
$aLangPrefOrder['ref'] = 'ref';
$aLangPrefOrder['type'] = 'type';
$this->addNameTag($aLangPrefOrder, 'official_name');
$this->addNameTag($aLangPrefOrder, 'short_name');
$this->addNameTag($aLangPrefOrder, 'ref');
$this->addNameTag($aLangPrefOrder, 'type');
return $aLangPrefOrder;
}
private function addNameTag(&$aLangPrefOrder, $sTag)
{
$aLangPrefOrder[$sTag] = $sTag;
$aLangPrefOrder['_place_'.$sTag] = '_place_'.$sTag;
}
public function hasSetAny($aParamNames)
{
foreach ($aParamNames as $sName) {

View File

@@ -1,4 +1,12 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
namespace Nominatim;
@@ -24,7 +32,7 @@ class Phrase
}
/**
* Get the orginal phrase of the string.
* Get the original phrase of the string.
*/
public function getPhrase()
{

View File

@@ -1,4 +1,12 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
namespace Nominatim;
@@ -340,7 +348,9 @@ class PlaceLookup
$sSQL .= ' null::text AS extra_place ';
$sSQL .= ' FROM (';
$sSQL .= ' SELECT place_id, '; // interpolate the Tiger housenumbers here
$sSQL .= ' ST_LineInterpolatePoint(linegeo, (housenumber_for_place-startnumber::float)/(endnumber-startnumber)::float) AS centroid, ';
$sSQL .= ' CASE WHEN startnumber != endnumber';
$sSQL .= ' THEN ST_LineInterpolatePoint(linegeo, (housenumber_for_place-startnumber::float)/(endnumber-startnumber)::float)';
$sSQL .= ' ELSE ST_LineInterpolatePoint(linegeo, 0.5) END AS centroid, ';
$sSQL .= ' parent_place_id, ';
$sSQL .= ' housenumber_for_place';
$sSQL .= ' FROM (';
@@ -397,7 +407,7 @@ class PlaceLookup
$sSQL .= ' CASE '; // interpolate the housenumbers here
$sSQL .= ' WHEN startnumber != endnumber ';
$sSQL .= ' THEN ST_LineInterpolatePoint(linegeo, (housenumber_for_place-startnumber::float)/(endnumber-startnumber)::float) ';
$sSQL .= ' ELSE ST_LineInterpolatePoint(linegeo, 0.5) ';
$sSQL .= ' ELSE linegeo ';
$sSQL .= ' END as centroid, ';
$sSQL .= ' parent_place_id, ';
$sSQL .= ' housenumber_for_place ';
@@ -435,18 +445,14 @@ class PlaceLookup
if ($this->bExtraTags) {
if ($aPlace['extra']) {
$aPlace['sExtraTags'] = json_decode($aPlace['extra']);
$aPlace['sExtraTags'] = json_decode($aPlace['extra'], true);
} else {
$aPlace['sExtraTags'] = (object) array();
}
}
if ($this->bNameDetails) {
if ($aPlace['names']) {
$aPlace['sNameDetails'] = json_decode($aPlace['names']);
} else {
$aPlace['sNameDetails'] = (object) array();
}
$aPlace['sNameDetails'] = $this->extractNames($aPlace['names']);
}
$aPlace['addresstype'] = ClassTypes\getLabelTag(
@@ -469,6 +475,33 @@ class PlaceLookup
return $aResults;
}
private function extractNames($sNames)
{
if (!$sNames) {
return (object) array();
}
$aFullNames = json_decode($sNames, true);
$aNames = array();
foreach ($aFullNames as $sKey => $sValue) {
if (strpos($sKey, '_place_') === 0) {
$sSubKey = substr($sKey, 7);
if (array_key_exists($sSubKey, $aFullNames)) {
$aNames[$sKey] = $sValue;
} else {
$aNames[$sSubKey] = $sValue;
}
} else {
$aNames[$sKey] = $sValue;
}
}
return $aNames;
}
/* returns an array which will contain the keys
* aBoundingBox
* and may also contain one or more of the keys
@@ -479,8 +512,6 @@ class PlaceLookup
* lat
* lon
*/
public function getOutlines($iPlaceID, $fLon = null, $fLat = null, $fRadius = null, $fLonReverse = null, $fLatReverse = null)
{

View File

@@ -1,4 +1,12 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
namespace Nominatim;

View File

@@ -1,4 +1,12 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
namespace Nominatim;
@@ -56,12 +64,15 @@ class ReverseGeocode
{
Debug::newFunction('lookupInterpolation');
$sSQL = 'SELECT place_id, parent_place_id, 30 as rank_search,';
$sSQL .= ' ST_LineLocatePoint(linegeo,'.$sPointSQL.') as fraction,';
$sSQL .= ' startnumber, endnumber, interpolationtype,';
$sSQL .= ' (CASE WHEN endnumber != startnumber';
$sSQL .= ' THEN (endnumber - startnumber) * ST_LineLocatePoint(linegeo,'.$sPointSQL.')';
$sSQL .= ' ELSE startnumber END) as fhnr,';
$sSQL .= ' startnumber, endnumber, step,';
$sSQL .= ' ST_Distance(linegeo,'.$sPointSQL.') as distance';
$sSQL .= ' FROM location_property_osmline';
$sSQL .= ' WHERE ST_DWithin('.$sPointSQL.', linegeo, '.$fSearchDiam.')';
$sSQL .= ' and indexed_status = 0 and startnumber is not NULL ';
$sSQL .= ' and indexed_status = 0 and startnumber is not NULL ';
$sSQL .= ' and parent_place_id != 0';
$sSQL .= ' ORDER BY distance ASC limit 1';
Debug::printSQL($sSQL);
@@ -255,7 +266,7 @@ class ReverseGeocode
// starts if the search is on POI or street level,
// searches for the nearest POI or street,
// if a street is found and a POI is searched for,
// the nearest POI which the found street is a parent of is choosen.
// the nearest POI which the found street is a parent of is chosen.
$sSQL = 'select place_id,parent_place_id,rank_address,country_code,';
$sSQL .= ' ST_distance('.$sPointSQL.', geometry) as distance';
$sSQL .= ' FROM ';
@@ -319,9 +330,9 @@ class ReverseGeocode
&& $this->iMaxRank >= 28
) {
$sSQL = 'SELECT place_id,parent_place_id,30 as rank_search,';
$sSQL .= 'ST_LineLocatePoint(linegeo,'.$sPointSQL.') as fraction,';
$sSQL .= 'ST_distance('.$sPointSQL.', linegeo) as distance,';
$sSQL .= 'startnumber,endnumber,interpolationtype';
$sSQL .= ' (endnumber - startnumber) * ST_LineLocatePoint(linegeo,'.$sPointSQL.') as fhnr,';
$sSQL .= ' startnumber, endnumber, step,';
$sSQL .= ' ST_Distance('.$sPointSQL.', linegeo) as distance';
$sSQL .= ' FROM location_property_tiger WHERE parent_place_id = '.$oResult->iId;
$sSQL .= ' AND ST_DWithin('.$sPointSQL.', linegeo, 0.001)';
$sSQL .= ' ORDER BY distance ASC limit 1';
@@ -333,7 +344,11 @@ class ReverseGeocode
if ($aPlaceTiger) {
$aPlace = $aPlaceTiger;
$oResult = new Result($aPlaceTiger['place_id'], Result::TABLE_TIGER);
$oResult->iHouseNumber = closestHouseNumber($aPlaceTiger);
$iRndNum = max(0, round($aPlaceTiger['fhnr'] / $aPlaceTiger['step']) * $aPlaceTiger['step']);
$oResult->iHouseNumber = $aPlaceTiger['startnumber'] + $iRndNum;
if ($oResult->iHouseNumber > $aPlaceTiger['endnumber']) {
$oResult->iHouseNumber = $aPlaceTiger['endnumber'];
}
$iRankAddress = 30;
}
}
@@ -345,7 +360,7 @@ class ReverseGeocode
// We can't reliably go from the closest street to an
// interpolation line because the closest interpolation
// may have a different street segments as a parent.
// Therefore allow an interpolation line to take precendence
// Therefore allow an interpolation line to take precedence
// even when the street is closer.
$fDistance = $iRankAddress < 28 ? 0.001 : $aPlace['distance'];
}
@@ -355,7 +370,11 @@ class ReverseGeocode
if ($aHouse) {
$oResult = new Result($aHouse['place_id'], Result::TABLE_OSMLINE);
$oResult->iHouseNumber = closestHouseNumber($aHouse);
$iRndNum = max(0, round($aHouse['fhnr'] / $aHouse['step']) * $aHouse['step']);
$oResult->iHouseNumber = $aHouse['startnumber'] + $iRndNum;
if ($oResult->iHouseNumber > $aHouse['endnumber']) {
$oResult->iHouseNumber = $aHouse['endnumber'];
}
$aPlace = $aHouse;
}
}

View File

@@ -1,4 +1,12 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
namespace Nominatim;

View File

@@ -1,4 +1,12 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
namespace Nominatim;
@@ -228,7 +236,7 @@ class SearchDescription
* Add the given full-word token to the list of terms to search for in the
* name.
*
* @param interger iId ID of term to add.
* @param integer iId ID of term to add.
* @param bool bRareName True if the term is infrequent enough to not
* require other constraints for efficient search.
*/
@@ -256,6 +264,8 @@ class SearchDescription
{
if (empty($this->aName)) {
$this->bNameNeedsAddress = $bNeedsAddress;
} elseif ($bSearchable && count($this->aName) >= 2) {
$this->bNameNeedsAddress = false;
} else {
$this->bNameNeedsAddress &= $bNeedsAddress;
}
@@ -377,7 +387,7 @@ class SearchDescription
*
* @return mixed[] An array with two fields: IDs contains the list of
* matching place IDs and houseNumber the houseNumber
* if appicable or -1 if not.
* if applicable or -1 if not.
*/
public function query(&$oDB, $iMinRank, $iMaxRank, $iLimit)
{
@@ -412,28 +422,6 @@ class SearchDescription
$iLimit
);
// Now search for housenumber, if housenumber provided. Can be zero.
if (($this->sHouseNumber || $this->sHouseNumber === '0') && !empty($aResults)) {
$aHnResults = $this->queryHouseNumber($oDB, $aResults);
// Downgrade the rank of the street results, they are missing
// the housenumber. Also drop POI places (rank 30) here, they
// cannot be a parent place and therefore must not be shown
// as a result for a search with a missing housenumber.
foreach ($aResults as $oRes) {
if ($oRes->iAddressRank < 28) {
if ($oRes->iAddressRank >= 26) {
$oRes->iResultRank++;
} else {
$oRes->iResultRank += 2;
}
$aHnResults[$oRes->iId] = $oRes;
}
}
$aResults = $aHnResults;
}
// finally get POIs if requested
if ($this->sClass && !empty($aResults)) {
$aResults = $this->queryPoiByOperator($oDB, $aResults, $iLimit);
@@ -579,41 +567,6 @@ class SearchDescription
$aTerms = array();
$aOrder = array();
// Sort by existence of the requested house number but only if not
// too many results are expected for the street, i.e. if the result
// will be narrowed down by an address. Remember that with ordering
// every single result has to be checked.
if ($this->sHouseNumber && ($this->bRareName || !empty($this->aAddress) || $this->sPostcode)) {
$sHouseNumberRegex = $oDB->getDBQuoted('\\\\m'.$this->sHouseNumber.'\\\\M');
// Housenumbers on streets and places.
$sChildHnr = 'SELECT * FROM placex WHERE parent_place_id = search_name.place_id';
$sChildHnr .= ' AND housenumber ~* E'.$sHouseNumberRegex;
// Interpolations on streets and places.
if (preg_match('/^[0-9]+$/', $this->sHouseNumber)) {
$sIpolHnr = 'SELECT * FROM location_property_osmline ';
$sIpolHnr .= 'WHERE parent_place_id = search_name.place_id ';
$sIpolHnr .= ' AND startnumber is not NULL';
$sIpolHnr .= ' AND '.$this->sHouseNumber.'>=startnumber ';
$sIpolHnr .= ' AND '.$this->sHouseNumber.'<=endnumber ';
} else {
$sIpolHnr = false;
}
// Housenumbers on the object iteself for unlisted places.
$sSelfHnr = 'SELECT * FROM placex WHERE place_id = search_name.place_id';
$sSelfHnr .= ' AND housenumber ~* E'.$sHouseNumberRegex;
$sSql = '(CASE WHEN address_rank = 30 THEN EXISTS('.$sSelfHnr.') ';
$sSql .= ' ELSE EXISTS('.$sChildHnr.') ';
if ($sIpolHnr) {
$sSql .= 'OR EXISTS('.$sIpolHnr.') ';
}
$sSql .= 'END) DESC';
$aOrder[] = $sSql;
}
if (!empty($this->aName)) {
$aTerms[] = 'name_vector @> '.$oDB->getArraySQL($this->aName);
}
@@ -659,10 +612,6 @@ class SearchDescription
$aTerms[] = 'centroid && '.$this->oContext->sqlViewboxSmall;
}
if ($this->oContext->hasNearPoint()) {
$aOrder[] = $this->oContext->distanceSQL('centroid');
}
if ($this->sHouseNumber) {
$sImportanceSQL = '- abs(26 - address_rank) + 3';
} else {
@@ -685,122 +634,128 @@ class SearchDescription
$sExactMatchSQL = '0::int as exactmatch';
}
if ($this->sHouseNumber || $this->sClass) {
$iLimit = 40;
if (empty($aTerms)) {
return array();
}
$aResults = array();
if ($this->hasHousenumber()) {
$sHouseNumberRegex = $oDB->getDBQuoted('\\\\m'.$this->sHouseNumber.'\\\\M');
if (!empty($aTerms)) {
$sSQL = 'SELECT place_id, address_rank,'.$sExactMatchSQL;
// Housenumbers on streets and places.
$sPlacexSql = 'SELECT array_agg(place_id) FROM placex';
$sPlacexSql .= ' WHERE parent_place_id = sin.place_id AND sin.address_rank < 30';
$sPlacexSql .= $this->oContext->excludeSQL(' AND place_id');
$sPlacexSql .= ' and housenumber ~* E'.$sHouseNumberRegex;
// Interpolations on streets and places.
$sInterpolSql = 'null';
$sTigerSql = 'null';
if (preg_match('/^[0-9]+$/', $this->sHouseNumber)) {
$sIpolHnr = 'WHERE parent_place_id = sin.place_id ';
$sIpolHnr .= ' AND startnumber is not NULL AND sin.address_rank < 30';
$sIpolHnr .= ' AND '.$this->sHouseNumber.' between startnumber and endnumber';
$sIpolHnr .= ' AND ('.$this->sHouseNumber.' - startnumber) % step = 0';
$sInterpolSql = 'SELECT array_agg(place_id) FROM location_property_osmline '.$sIpolHnr;
if (CONST_Use_US_Tiger_Data) {
$sTigerSql = 'SELECT array_agg(place_id) FROM location_property_tiger '.$sIpolHnr;
$sTigerSql .= " and sin.country_code = 'us'";
}
}
if ($this->sClass) {
$iLimit = 40;
}
$sSelfHnr = 'SELECT * FROM placex WHERE place_id = search_name.place_id';
$sSelfHnr .= ' AND housenumber ~* E'.$sHouseNumberRegex;
$aTerms[] = '(address_rank < 30 or exists('.$sSelfHnr.'))';
$sSQL = 'SELECT sin.*, ';
$sSQL .= '('.$sPlacexSql.') as placex_hnr, ';
$sSQL .= '('.$sInterpolSql.') as interpol_hnr, ';
$sSQL .= '('.$sTigerSql.') as tiger_hnr ';
$sSQL .= ' FROM (';
$sSQL .= ' SELECT place_id, address_rank, country_code,'.$sExactMatchSQL.',';
$sSQL .= ' CASE WHEN importance = 0 OR importance IS NULL';
$sSQL .= ' THEN 0.75001-(search_rank::float/40) ELSE importance END as importance';
$sSQL .= ' FROM search_name';
$sSQL .= ' WHERE '.join(' and ', $aTerms);
$sSQL .= ' ORDER BY '.join(', ', $aOrder);
$sSQL .= ' LIMIT 40000';
$sSQL .= ') as sin';
$sSQL .= ' ORDER BY address_rank = 30 desc, placex_hnr, interpol_hnr, tiger_hnr,';
$sSQL .= ' importance';
$sSQL .= ' LIMIT '.$iLimit;
} else {
if ($this->sClass) {
$iLimit = 40;
}
$sSQL = 'SELECT place_id, address_rank, '.$sExactMatchSQL;
$sSQL .= ' FROM search_name';
$sSQL .= ' WHERE '.join(' and ', $aTerms);
$sSQL .= ' ORDER BY '.join(', ', $aOrder);
$sSQL .= ' LIMIT '.$iLimit;
Debug::printSQL($sSQL);
$aDBResults = $oDB->getAll($sSQL, null, 'Could not get places for search terms.');
foreach ($aDBResults as $aResult) {
$oResult = new Result($aResult['place_id']);
$oResult->iExactMatches = $aResult['exactmatch'];
$oResult->iAddressRank = $aResult['address_rank'];
$aResults[$aResult['place_id']] = $oResult;
}
}
return $aResults;
}
private function queryHouseNumber(&$oDB, $aRoadPlaceIDs)
{
$aResults = array();
$sRoadPlaceIDs = Result::joinIdsByTableMaxRank(
$aRoadPlaceIDs,
Result::TABLE_PLACEX,
27
);
$sPOIPlaceIDs = Result::joinIdsByTableMinRank(
$aRoadPlaceIDs,
Result::TABLE_PLACEX,
30
);
$aIDCondition = array();
if ($sRoadPlaceIDs) {
$aIDCondition[] = 'parent_place_id in ('.$sRoadPlaceIDs.')';
}
if ($sPOIPlaceIDs) {
$aIDCondition[] = 'place_id in ('.$sPOIPlaceIDs.')';
}
if (empty($aIDCondition)) {
return $aResults;
}
$sHouseNumberRegex = $oDB->getDBQuoted('\\\\m'.$this->sHouseNumber.'\\\\M');
$sSQL = 'SELECT place_id FROM placex WHERE';
$sSQL .= ' housenumber ~* E'.$sHouseNumberRegex;
$sSQL .= ' AND ('.join(' OR ', $aIDCondition).')';
$sSQL .= $this->oContext->excludeSQL(' AND place_id');
Debug::printSQL($sSQL);
// XXX should inherit the exactMatches from its parent
foreach ($oDB->getCol($sSQL) as $iPlaceId) {
$aResults[$iPlaceId] = new Result($iPlaceId);
}
$aDBResults = $oDB->getAll($sSQL, null, 'Could not get places for search terms.');
$bIsIntHouseNumber= (bool) preg_match('/[0-9]+/', $this->sHouseNumber);
$iHousenumber = intval($this->sHouseNumber);
if ($bIsIntHouseNumber && $sRoadPlaceIDs && empty($aResults)) {
// if nothing found, search in the interpolation line table
$sSQL = 'SELECT distinct place_id FROM location_property_osmline';
$sSQL .= ' WHERE startnumber is not NULL';
$sSQL .= ' AND parent_place_id in ('.$sRoadPlaceIDs.') AND (';
if ($iHousenumber % 2 == 0) {
// If housenumber is even, look for housenumber in streets
// with interpolationtype even or all.
$sSQL .= "interpolationtype='even'";
} else {
// Else look for housenumber with interpolationtype odd or all.
$sSQL .= "interpolationtype='odd'";
$aResults = array();
foreach ($aDBResults as $aResult) {
$oResult = new Result($aResult['place_id']);
$oResult->iExactMatches = $aResult['exactmatch'];
$oResult->iAddressRank = $aResult['address_rank'];
$bNeedResult = true;
if ($this->hasHousenumber() && $aResult['address_rank'] < 30) {
if ($aResult['placex_hnr']) {
foreach (explode(',', substr($aResult['placex_hnr'], 1, -1)) as $sPlaceID) {
$iPlaceID = intval($sPlaceID);
$oHnrResult = new Result($iPlaceID);
$oHnrResult->iExactMatches = $aResult['exactmatch'];
$oHnrResult->iAddressRank = 30;
$aResults[$iPlaceID] = $oHnrResult;
$bNeedResult = false;
}
}
if ($aResult['interpol_hnr']) {
foreach (explode(',', substr($aResult['interpol_hnr'], 1, -1)) as $sPlaceID) {
$iPlaceID = intval($sPlaceID);
$oHnrResult = new Result($iPlaceID, Result::TABLE_OSMLINE);
$oHnrResult->iExactMatches = $aResult['exactmatch'];
$oHnrResult->iAddressRank = 30;
$oHnrResult->iHouseNumber = intval($this->sHouseNumber);
$aResults[$iPlaceID] = $oHnrResult;
$bNeedResult = false;
}
}
if ($aResult['tiger_hnr']) {
foreach (explode(',', substr($aResult['tiger_hnr'], 1, -1)) as $sPlaceID) {
$iPlaceID = intval($sPlaceID);
$oHnrResult = new Result($iPlaceID, Result::TABLE_TIGER);
$oHnrResult->iExactMatches = $aResult['exactmatch'];
$oHnrResult->iAddressRank = 30;
$oHnrResult->iHouseNumber = intval($this->sHouseNumber);
$aResults[$iPlaceID] = $oHnrResult;
$bNeedResult = false;
}
}
if ($aResult['address_rank'] < 26) {
$oResult->iResultRank += 2;
} else {
$oResult->iResultRank++;
}
}
$sSQL .= " or interpolationtype='all') and ";
$sSQL .= $iHousenumber.'>=startnumber and ';
$sSQL .= $iHousenumber.'<=endnumber';
$sSQL .= $this->oContext->excludeSQL(' AND place_id');
Debug::printSQL($sSQL);
foreach ($oDB->getCol($sSQL) as $iPlaceId) {
$oResult = new Result($iPlaceId, Result::TABLE_OSMLINE);
$oResult->iHouseNumber = $iHousenumber;
$aResults[$iPlaceId] = $oResult;
}
}
// If nothing found then search in Tiger data (location_property_tiger)
if (CONST_Use_US_Tiger_Data && $sRoadPlaceIDs && $bIsIntHouseNumber && empty($aResults)) {
$sSQL = 'SELECT place_id FROM location_property_tiger';
$sSQL .= ' WHERE parent_place_id in ('.$sRoadPlaceIDs.') and (';
if ($iHousenumber % 2 == 0) {
$sSQL .= "interpolationtype='even'";
} else {
$sSQL .= "interpolationtype='odd'";
}
$sSQL .= " or interpolationtype='all') and ";
$sSQL .= $iHousenumber.'>=startnumber and ';
$sSQL .= $iHousenumber.'<=endnumber';
$sSQL .= $this->oContext->excludeSQL(' AND place_id');
Debug::printSQL($sSQL);
foreach ($oDB->getCol($sSQL) as $iPlaceId) {
$oResult = new Result($iPlaceId, Result::TABLE_TIGER);
$oResult->iHouseNumber = $iHousenumber;
$aResults[$iPlaceId] = $oResult;
if ($bNeedResult) {
$aResults[$aResult['place_id']] = $oResult;
}
}
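
The result loop above reads the `placex_hnr`, `interpol_hnr` and `tiger_hnr` columns, which come from `array_agg(place_id)`, by trimming the first and last character and splitting on commas. A minimal sketch of that parsing step follows, assuming the driver returns the array in its text form such as `{12,34,56}`. The helper name `parsePgIntArray` is hypothetical and not part of the repository.

```php
<?php
// Hypothetical helper: turn the text form of a one-dimensional Postgres
// integer array, e.g. '{12,34,56}', into a list of PHP integers.
function parsePgIntArray(?string $sArray): array
{
    // NULL or an empty string means no matching rows were aggregated.
    if (!$sArray) {
        return array();
    }

    $aIds = array();
    // substr(..., 1, -1) drops the surrounding braces.
    foreach (explode(',', substr($sArray, 1, -1)) as $sId) {
        $aIds[] = intval($sId);
    }

    return $aIds;
}

print_r(parsePgIntArray('{12,34,56}'));
```

The guard on an empty value matters because `array_agg` yields NULL rather than an empty array when no row matches, which is also why the diff tests each column for truthiness before splitting it.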
@@ -852,6 +807,7 @@ class SearchDescription
$sSQL = 'SELECT geometry FROM placex';
$sSQL .= " WHERE place_id in ($sPlaceIDs)";
$sSQL .= " AND rank_search < $iMaxRank + 5";
$sSQL .= ' AND ST_Area(Box2d(geometry)) < 20';
$sSQL .= " AND ST_GeometryType(geometry) in ('ST_Polygon','ST_MultiPolygon')";
$sSQL .= ' ORDER BY rank_search ASC ';
$sSQL .= ' LIMIT 1';


@@ -1,4 +1,12 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
namespace Nominatim;


@@ -1,4 +1,12 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
namespace Nominatim;


@@ -1,4 +1,12 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
namespace Nominatim;
@@ -112,13 +120,18 @@ class SimpleWordList
return array_slice($aWordSets, 0, SimpleWordList::MAX_WORDSETS);
}
/**
* Custom search routine which takes two arrays. The array with the fewest
* items wins. If same number of items then the one with the longest first
* element wins.
*/
public static function cmpByArraylen($aA, $aB)
{
$iALen = count($aA);
$iBLen = count($aB);
if ($iALen == $iBLen) {
return 0;
return strlen($aB[0]) <=> strlen($aA[0]);
}
return ($iALen < $iBLen) ? -1 : 1;
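
The effect of the new tie-break can be seen by sorting a few word sets. This is a minimal sketch: the function name `cmpByArrayLenSketch` and the sample data are illustrative only, and the body simply restates the comparator shown in the hunk above.

```php
<?php
// Sketch of the comparator: fewer items sort first; on equal counts the
// array whose first element is longer sorts first.
function cmpByArrayLenSketch(array $aA, array $aB): int
{
    $iALen = count($aA);
    $iBLen = count($aB);

    if ($iALen == $iBLen) {
        // Reversed operands: a longer first element compares as "smaller".
        return strlen($aB[0]) <=> strlen($aA[0]);
    }

    return ($iALen < $iBLen) ? -1 : 1;
}

$aWordSets = array(
    array('new', 'york', 'city'),
    array('new', 'york city'),
    array('new york', 'city'),
    array('new york city'),
);
usort($aWordSets, 'cmpByArrayLenSketch');

foreach ($aWordSets as $aSet) {
    echo implode(' | ', $aSet)."\n";
}
```

Before the change, two word sets of equal length compared as equal, so their relative order was left to the sort implementation.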


@@ -1,4 +1,12 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
namespace Nominatim;


@@ -1,4 +1,12 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
namespace Nominatim;


@@ -1,4 +1,12 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
namespace Nominatim\Token;


@@ -1,4 +1,12 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
namespace Nominatim\Token;


@@ -1,4 +1,12 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
namespace Nominatim;


@@ -1,4 +1,12 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
namespace Nominatim\Token;


@@ -1,4 +1,12 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
namespace Nominatim\Token;
@@ -9,7 +17,7 @@ class Postcode
{
/// Database word id, if available.
private $iId;
/// Full nomralized postcode (upper cased).
/// Full normalized postcode (upper cased).
private $sPostcode;
// Optional country code the postcode belongs to (currently unused).
private $sCountryCode;
@@ -17,7 +25,12 @@ class Postcode
public function __construct($iId, $sPostcode, $sCountryCode = '')
{
$this->iId = $iId;
$this->sPostcode = $sPostcode;
$iSplitPos = strpos($sPostcode, '@');
if ($iSplitPos === false) {
$this->sPostcode = $sPostcode;
} else {
$this->sPostcode = substr($sPostcode, 0, $iSplitPos);
}
$this->sCountryCode = empty($sCountryCode) ? '' : $sCountryCode;
}
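
The constructor change above keeps only the part of the postcode that precedes the first `@`. A minimal sketch of that step in isolation follows. The helper name `stripPostcodeSuffix` and the sample values are assumptions for illustration, and the diff itself does not say what the text after `@` holds.

```php
<?php
// Hypothetical helper mirroring the constructor logic: keep only the part
// of the stored postcode that precedes the first '@'.
function stripPostcodeSuffix(string $sPostcode): string
{
    $iSplitPos = strpos($sPostcode, '@');

    // Strict comparison: strpos() returns 0 for a match at the start,
    // which a loose check would confuse with "not found".
    if ($iSplitPos === false) {
        return $sPostcode;
    }

    return substr($sPostcode, 0, $iSplitPos);
}

echo stripPostcodeSuffix('EC1A 1BB')."\n";
echo stripPostcodeSuffix('12345@12 345')."\n";
```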


@@ -1,4 +1,12 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
namespace Nominatim\Token;
@@ -61,19 +69,31 @@ class SpecialTerm
*/
public function extendSearch($oSearch, $oPosition)
{
$iSearchCost = 2;
$iSearchCost = 0;
$iOp = $this->iOperator;
if ($iOp == \Nominatim\Operator::NONE) {
if ($oSearch->hasName() || $oSearch->getContext()->isBoundedSearch()) {
if ($oPosition->isFirstToken()
|| $oSearch->hasName()
|| $oSearch->getContext()->isBoundedSearch()
) {
$iOp = \Nominatim\Operator::NAME;
$iSearchCost += 3;
} else {
$iOp = \Nominatim\Operator::NEAR;
$iSearchCost += 2;
$iSearchCost += 4;
if (!$oPosition->isFirstToken()) {
$iSearchCost += 3;
}
}
} elseif (!$oPosition->isFirstToken() && !$oPosition->isLastToken()) {
} elseif ($oPosition->isFirstToken()) {
$iSearchCost += 2;
} elseif ($oPosition->isLastToken()) {
$iSearchCost += 4;
} else {
$iSearchCost += 6;
}
if ($oSearch->hasHousenumber()) {
$iSearchCost ++;
}


@@ -1,4 +1,12 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
namespace Nominatim\Token;
@@ -54,7 +62,7 @@ class Word
public function extendSearch($oSearch, $oPosition)
{
// Full words can only be a name if they appear at the beginning
// of the phrase. In structured search the name must forcably in
// of the phrase. In structured search the name must forcibly in
// the first phrase. In unstructured search it may be in a later
// phrase when the first phrase is a house number.
if ($oSearch->hasName()


@@ -1,4 +1,12 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
@define('CONST_LibDir', dirname(dirname(__FILE__)));
// Script to extract structured city and street data
// from a running nominatim instance as CSV data


@@ -1,4 +1,12 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
@define('CONST_LibDir', dirname(dirname(__FILE__)));
require_once(CONST_LibDir.'/init-cmd.php');


@@ -1,4 +1,12 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
require_once(CONST_LibDir.'/Shell.php');
@@ -98,7 +106,7 @@ function getCmdOpt($aArg, $aSpec, &$aResult, $bExitOnError = false, $bExitOnUnkn
showUsage($aSpec, $bExitOnError, 'Option \''.$aLine[0].'\' is missing');
}
if ($aCounts[$aLine[0]] > $aLine[3]) {
showUsage($aSpec, $bExitOnError, 'Option \''.$aLine[0].'\' is pressent too many times');
showUsage($aSpec, $bExitOnError, 'Option \''.$aLine[0].'\' is present too many times');
}
if ($aLine[6] == 'bool' && !array_key_exists($aLine[0], $aResult)) {
$aResult[$aLine[0]] = false;


@@ -1,4 +1,12 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
require('Symfony/Component/Dotenv/autoload.php');


@@ -1,4 +1,12 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
require_once('init.php');
require_once('cmd.php');


@@ -1,4 +1,12 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
require_once('init.php');
require_once('ParameterParser.php');
@@ -18,7 +26,7 @@ function userError($sMsg)
function exception_handler_json($exception)
{
http_response_code($exception->getCode());
http_response_code($exception->getCode() == 0 ? 500 : $exception->getCode());
header('Content-type: application/json; charset=utf-8');
include(CONST_LibDir.'/template/error-json.php');
exit();
@@ -26,7 +34,7 @@ function exception_handler_json($exception)
function exception_handler_xml($exception)
{
http_response_code($exception->getCode());
http_response_code($exception->getCode() == 0 ? 500 : $exception->getCode());
header('Content-type: text/xml; charset=utf-8');
echo '<?xml version="1.0" encoding="UTF-8" ?>'."\n";
include(CONST_LibDir.'/template/error-xml.php');
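
Both handlers above now fall back to status 500 when the exception carries no code. A minimal sketch of that selection follows, assuming exceptions raised without an explicit code report 0, which is the default for PHP's built-in `Exception`. The helper name `statusForException` is hypothetical.

```php
<?php
// Hypothetical helper: choose the HTTP status for an uncaught exception.
function statusForException(Throwable $oException): int
{
    $iCode = $oException->getCode();

    // A code of 0 means none was set, so report a generic server error.
    return $iCode == 0 ? 500 : $iCode;
}

echo statusForException(new Exception('unexpected failure'))."\n";
echo statusForException(new Exception('bad parameter', 400))."\n";
```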


@@ -1,4 +1,12 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
require_once(CONST_LibDir.'/lib.php');
require_once(CONST_LibDir.'/DB.php');


@@ -1,9 +1,17 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
function loadSettings($sProjectDir)
{
@define('CONST_InstallDir', $sProjectDir);
// Temporary hack to set the direcory via environment instead of
// Temporary hack to set the directory via environment instead of
// the installed scripts. Neither setting is part of the official
// set of settings.
defined('CONST_ConfigDir') or define('CONST_ConfigDir', $_SERVER['NOMINATIM_CONFIGDIR']);
@@ -198,24 +206,34 @@ function parseLatLon($sQuery)
return array($sFound, $fQueryLat, $fQueryLon);
}
function closestHouseNumber($aRow)
function addressRankToGeocodeJsonType($iAddressRank)
{
$fHouse = $aRow['startnumber']
+ ($aRow['endnumber'] - $aRow['startnumber']) * $aRow['fraction'];
switch ($aRow['interpolationtype']) {
case 'odd':
$iHn = (int)($fHouse/2) * 2 + 1;
break;
case 'even':
$iHn = (int)(round($fHouse/2)) * 2;
break;
default:
$iHn = (int)(round($fHouse));
break;
if ($iAddressRank >= 29 && $iAddressRank <= 30) {
return 'house';
}
if ($iAddressRank >= 26 && $iAddressRank < 28) {
return 'street';
}
if ($iAddressRank >= 22 && $iAddressRank < 26) {
return 'locality';
}
if ($iAddressRank >= 17 && $iAddressRank < 22) {
return 'district';
}
if ($iAddressRank >= 13 && $iAddressRank < 17) {
return 'city';
}
if ($iAddressRank >= 10 && $iAddressRank < 13) {
return 'county';
}
if ($iAddressRank >= 5 && $iAddressRank < 10) {
return 'state';
}
if ($iAddressRank >= 4 && $iAddressRank < 5) {
return 'country';
}
return max(min($aRow['endnumber'], $iHn), $aRow['startnumber']);
return 'locality';
}
if (!function_exists('array_key_last')) {
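
The rank ranges in `addressRankToGeocodeJsonType` can also be read as a lookup table. The sketch below is a table-driven variant written for illustration, not the code from the diff. The names `RANK_TYPE_TABLE` and `rankToTypeSketch` are hypothetical, and the ranges are copied from the conditions shown above.

```php
<?php
// Hypothetical table-driven variant of the rank-to-type mapping.
// Each entry is [lowest rank, highest rank, GeocodeJSON type].
const RANK_TYPE_TABLE = array(
    array(29, 30, 'house'),
    array(26, 27, 'street'),
    array(22, 25, 'locality'),
    array(17, 21, 'district'),
    array(13, 16, 'city'),
    array(10, 12, 'county'),
    array(5, 9, 'state'),
    array(4, 4, 'country'),
);

function rankToTypeSketch(int $iAddressRank): string
{
    foreach (RANK_TYPE_TABLE as list($iMin, $iMax, $sType)) {
        if ($iAddressRank >= $iMin && $iAddressRank <= $iMax) {
            return $sType;
        }
    }

    // Ranks outside every range (for example 28) fall back to 'locality'.
    return 'locality';
}

foreach (array(30, 28, 26, 16, 4) as $iRank) {
    echo $iRank.' => '.rankToTypeSketch($iRank)."\n";
}
```

Note that rank 28 matches none of the listed ranges, so under the conditions in the diff it ends up with the default `locality`.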


@@ -1,4 +1,12 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
function logStart(&$oDB, $sType = '', $sQuery = '', $aLanguageList = array())
@@ -70,7 +78,7 @@ function logEnd(&$oDB, $hLog, $iNumResults)
if (CONST_Log_DB) {
$aEndTime = explode('.', $fEndTime);
if (!$aEndTime[1]) {
if (!isset($aEndTime[1])) {
$aEndTime[1] = '0';
}
$sEndTime = date('Y-m-d H:i:s', $aEndTime[0]).'.'.$aEndTime[1];
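
The switch to `isset()` above matters when the timestamp has no fractional part: `explode()` then returns a single element, and reading index 1 directly raises an undefined-index notice or warning, depending on the PHP version. A minimal sketch follows, with the hypothetical helper `splitTimestamp` and made-up timestamps.

```php
<?php
// Hypothetical helper: split a timestamp string into whole seconds and
// the fractional part, defaulting the fraction to '0' when it is absent.
function splitTimestamp(string $sTime): array
{
    $aParts = explode('.', $sTime);

    // isset() checks for the element without triggering a diagnostic.
    if (!isset($aParts[1])) {
        $aParts[1] = '0';
    }

    return $aParts;
}

print_r(splitTimestamp('1668869747.25'));
print_r(splitTimestamp('1668869747'));
```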


@@ -1,4 +1,12 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
function formatOSMType($sType, $bIncludeExternal = true)


@@ -1,4 +1,12 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
function getOsm2pgsqlBinary()
{


@@ -1,4 +1,12 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
// https://github.com/geocoders/geocodejson-spec/
@@ -28,7 +36,7 @@ if (empty($aPlace)) {
$aFilteredPlaces['properties']['geocoding']['osm_id'] = $aPlace['osm_id'];
}
$aFilteredPlaces['properties']['geocoding']['type'] = $aPlace['type'];
$aFilteredPlaces['properties']['geocoding']['type'] = addressRankToGeocodeJsonType($aPlace['rank_address']);
$aFilteredPlaces['properties']['geocoding']['accuracy'] = (int) $fDistance;
@@ -48,7 +56,7 @@ if (empty($aPlace)) {
}
if (isset($aPlace['asgeojson'])) {
$aFilteredPlaces['geometry'] = json_decode($aPlace['asgeojson']);
$aFilteredPlaces['geometry'] = json_decode($aPlace['asgeojson'], true);
} else {
$aFilteredPlaces['geometry'] = array(
'type' => 'Point',
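
Several templates in this diff add `true` as the second argument of `json_decode`. The short sketch below shows what that flag changes, using a made-up GeoJSON string. The diff does not state the motivation, so the example only demonstrates the difference in the returned type.

```php
<?php
// Made-up GeoJSON fragment for illustration.
$sGeoJson = '{"type":"Point","coordinates":[8.5,47.4]}';

// Default: JSON objects become stdClass instances.
$oAsObject = json_decode($sGeoJson);

// With the second argument set to true: JSON objects become
// associative arrays.
$aAsArray = json_decode($sGeoJson, true);

echo get_class($oAsObject)."\n";
echo gettype($aAsArray)."\n";
echo $aAsArray['type']."\n";
```

Either form encodes back to the same JSON text, so the output format is unaffected. What changes is how the surrounding PHP code can access or merge the value.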


@@ -1,4 +1,12 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
$aFilteredPlaces = array();
@@ -57,7 +65,7 @@ if (empty($aPlace)) {
}
if (isset($aPlace['asgeojson'])) {
$aFilteredPlaces['geometry'] = json_decode($aPlace['asgeojson']);
$aFilteredPlaces['geometry'] = json_decode($aPlace['asgeojson'], true);
} else {
$aFilteredPlaces['geometry'] = array(
'type' => 'Point',


@@ -1,4 +1,12 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
$aFilteredPlaces = array();
@@ -55,7 +63,7 @@ if (empty($aPlace)) {
}
if (isset($aPlace['asgeojson'])) {
$aFilteredPlaces['geojson'] = json_decode($aPlace['asgeojson']);
$aFilteredPlaces['geojson'] = json_decode($aPlace['asgeojson'], true);
}
if (isset($aPlace['assvg'])) {


@@ -1,4 +1,12 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
header('content-type: text/xml; charset=UTF-8');
echo '<';


@@ -1,4 +1,12 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
$aPlaceDetails = array();
@@ -40,7 +48,7 @@ $aPlaceDetails['centroid'] = array(
'coordinates' => array( (float) $aPointDetails['lon'], (float) $aPointDetails['lat'] )
);
$aPlaceDetails['geometry'] = json_decode($aPointDetails['asgeojson']);
$aPlaceDetails['geometry'] = json_decode($aPointDetails['asgeojson'], true);
$funcMapAddressLine = function ($aFull) {
return array(


@@ -1,4 +1,12 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
$error = array(
'code' => $exception->getCode(),
'message' => $exception->getMessage()


@@ -1,4 +1,12 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
$aOutput = array();
$aOutput['licence'] = 'Data © OpenStreetMap contributors, ODbL 1.0. https://osm.org/copyright';
@@ -52,7 +60,7 @@ foreach ($aBatchResults as $aSearchResults) {
}
if (isset($aPointDetails['asgeojson'])) {
$aPlace['geojson'] = json_decode($aPointDetails['asgeojson']);
$aPlace['geojson'] = json_decode($aPointDetails['asgeojson'], true);
}
if (isset($aPointDetails['assvg'])) {


@@ -1,4 +1,12 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
$aFilteredPlaces = array();
foreach ($aSearchResults as $iResNum => $aPointDetails) {
@@ -17,8 +25,10 @@ foreach ($aSearchResults as $iResNum => $aPointDetails) {
$aPlace['properties']['geocoding']['osm_type'] = $sOSMType;
$aPlace['properties']['geocoding']['osm_id'] = $aPointDetails['osm_id'];
}
$aPlace['properties']['geocoding']['osm_key'] = $aPointDetails['class'];
$aPlace['properties']['geocoding']['osm_value'] = $aPointDetails['type'];
$aPlace['properties']['geocoding']['type'] = $aPointDetails['type'];
$aPlace['properties']['geocoding']['type'] = addressRankToGeocodeJsonType($aPointDetails['rank_address']);
$aPlace['properties']['geocoding']['label'] = $aPointDetails['langaddress'];
@@ -36,7 +46,7 @@ foreach ($aSearchResults as $iResNum => $aPointDetails) {
}
if (isset($aPointDetails['asgeojson'])) {
$aPlace['geometry'] = json_decode($aPointDetails['asgeojson']);
$aPlace['geometry'] = json_decode($aPointDetails['asgeojson'], true);
} else {
$aPlace['geometry'] = array(
'type' => 'Point',


@@ -1,4 +1,12 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
$aFilteredPlaces = array();
foreach ($aSearchResults as $iResNum => $aPointDetails) {
@@ -46,7 +54,7 @@ foreach ($aSearchResults as $iResNum => $aPointDetails) {
}
if (isset($aPointDetails['asgeojson'])) {
$aPlace['geometry'] = json_decode($aPointDetails['asgeojson']);
$aPlace['geometry'] = json_decode($aPointDetails['asgeojson'], true);
} else {
$aPlace['geometry'] = array(
'type' => 'Point',


@@ -1,4 +1,12 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
$aFilteredPlaces = array();
foreach ($aSearchResults as $iResNum => $aPointDetails) {
@@ -45,7 +53,7 @@ foreach ($aSearchResults as $iResNum => $aPointDetails) {
}
if (isset($aPointDetails['asgeojson'])) {
$aPlace['geojson'] = json_decode($aPointDetails['asgeojson']);
$aPlace['geojson'] = json_decode($aPointDetails['asgeojson'], true);
}
if (isset($aPointDetails['assvg'])) {


@@ -1,4 +1,12 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
header('content-type: text/xml; charset=UTF-8');
echo '<';


@@ -1,4 +1,12 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
namespace Nominatim;
@@ -149,7 +157,8 @@ class Tokenizer
$sSQL = 'SELECT word_id, word_token, type, word,';
$sSQL .= " info->>'op' as operator,";
$sSQL .= " info->>'class' as class, info->>'type' as ctype,";
$sSQL .= " info->>'count' as count";
$sSQL .= " info->>'count' as count,";
$sSQL .= " info->>'lookup' as lookup";
$sSQL .= ' FROM word WHERE word_token in (';
$sSQL .= join(',', $this->oDB->getDBQuotedList($aTokens)).')';
@@ -171,7 +180,8 @@ class Tokenizer
}
break;
case 'H': // house number tokens
$oValidTokens->addToken($sTok, new Token\HouseNumber($iId, $aWord['word_token']));
$sLookup = $aWord['lookup'] ?? $aWord['word_token'];
$oValidTokens->addToken($sTok, new Token\HouseNumber($iId, $sLookup));
break;
case 'P': // postcode tokens
// Postcodes are not normalized, so they may have content
@@ -180,13 +190,17 @@ class Tokenizer
if ($aWord['word'] !== null
&& pg_escape_string($aWord['word']) == $aWord['word']
) {
$sNormPostcode = $this->normalizeString($aWord['word']);
if (strpos($sNormQuery, $sNormPostcode) !== false) {
$oValidTokens->addToken(
$sTok,
new Token\Postcode($iId, $aWord['word'], null)
);
$iSplitPos = strpos($aWord['word'], '@');
if ($iSplitPos === false) {
$sPostcode = $aWord['word'];
} else {
$sPostcode = substr($aWord['word'], 0, $iSplitPos);
}
$oValidTokens->addToken(
$sTok,
new Token\Postcode($iId, $sPostcode, null)
);
}
break;
case 'S': // tokens for classification terms (special phrases)
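
For house number tokens, the hunk above prefers the new `lookup` field and falls back to `word_token` when it is missing. A minimal sketch of that fallback follows, assuming `lookup` comes back as NULL for rows whose `info` has no such key. The helper name `houseNumberLookup` and the sample rows are hypothetical.

```php
<?php
// Hypothetical helper: pick the value passed on to the house number token.
function houseNumberLookup(array $aWord): string
{
    // ?? uses the right-hand side when the key is absent or its value is null.
    return $aWord['lookup'] ?? $aWord['word_token'];
}

// Made-up rows shaped like results of the word-table query.
echo houseNumberLookup(array('word_token' => ' 12a', 'lookup' => '12a'))."\n";
echo houseNumberLookup(array('word_token' => ' 7', 'lookup' => null))."\n";
```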


@@ -1,4 +1,12 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
namespace Nominatim;


@@ -1,4 +1,12 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
require_once(CONST_LibDir.'/init-website.php');
require_once(CONST_LibDir.'/log.php');


@@ -1,4 +1,12 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
require_once(CONST_LibDir.'/init-website.php');
require_once(CONST_LibDir.'/log.php');
@@ -198,7 +206,7 @@ if ($bIncludeLinkedPlaces) {
$aLinkedLines = $oDB->getAll($sSQL);
}
// All places this is an imediate parent of
// All places this is an immediate parent of
$aHierarchyLines = false;
if ($bIncludeHierarchy) {
$sSQL = 'SELECT obj.place_id, osm_type, osm_id, class, type, housenumber,';


@@ -1,4 +1,12 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
require_once(CONST_LibDir.'/init-website.php');
require_once(CONST_LibDir.'/log.php');


@@ -1,4 +1,12 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
require_once(CONST_LibDir.'/init-website.php');
require_once(CONST_LibDir.'/log.php');


@@ -1,4 +1,12 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
require_once(CONST_LibDir.'/init-website.php');
require_once(CONST_LibDir.'/ParameterParser.php');


@@ -1,4 +1,12 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
require_once(CONST_LibDir.'/init-website.php');
require_once(CONST_LibDir.'/log.php');


@@ -1,4 +1,12 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
require_once(CONST_LibDir.'/init-website.php');
require_once(CONST_LibDir.'/log.php');

Some files were not shown because too many files have changed in this diff