Compare commits

...

2224 Commits

Author SHA1 Message Date
Sarah Hoffmann
a3343ab048 fix package name in Centos 7 vagrant script
Fixes #2531.
2021-12-06 15:45:09 +01:00
Sarah Hoffmann
517eb52352 prepare 4.0.1 release 2021-11-22 16:23:19 +01:00
Sarah Hoffmann
02bd1bad2d add a section about moving the database to another machine 2021-11-22 16:17:47 +01:00
Sarah Hoffmann
8d7afbfd4f adapt to release 4.0.0 2021-11-02 20:58:07 +01:00
Sarah Hoffmann
d479a0585d prepare release 4.0.0 2021-11-02 20:27:55 +01:00
Sarah Hoffmann
addfae31b6 fix typo 2021-11-02 11:09:17 +01:00
Sarah Hoffmann
ccf61db726 Merge pull request #2502 from lonvia/improve-development-documentation
Extend developer's documentation
2021-11-01 16:12:23 +01:00
Sarah Hoffmann
5b86b2078a docs: add overview over indexing 2021-11-01 11:04:03 +01:00
Sarah Hoffmann
a069479340 docs: section about database layout
Replaces the import description which basically was
table layout only now.
2021-10-29 12:03:22 +02:00
Sarah Hoffmann
d11bf9288e Merge pull request #2498 from lonvia/ordering-for-unlisted-place-results
Include unlisted places in ordering by housenumber
2021-10-28 15:28:47 +02:00
Sarah Hoffmann
86eeb4d2ed Merge pull request #2497 from lonvia/docs-maintenance
docs: add new maintenance section
2021-10-28 11:33:34 +02:00
Sarah Hoffmann
2275fe59ab include unlisted places in ordering by housenumber
When ordering results by the fact that they have a housenumber,
also take cases into account where the housenumber is on the
place itself. This may happen when the search includes the name
of the place and the housenumber or for addr:place addresses
where the place is unlisted.
2021-10-28 11:27:31 +02:00
Sarah Hoffmann
48be8c33ba docs: add new maintenance section
currently used for postcode updates, word count updates and
deleted relations.
2021-10-28 09:22:37 +02:00
Sarah Hoffmann
d3d07128b2 Merge pull request #2495 from lonvia/fix-normalization-in-php
ICU: use correct normalization during search
2021-10-27 14:40:42 +02:00
Sarah Hoffmann
37eeccbf4c ICU: use normalization from config in PHP
The TERM_NORMALIZATION config option is no longer applicable.
That was already documented but not yet implemented.
2021-10-27 11:32:44 +02:00
Sarah Hoffmann
1722fc537f bdd: add tests for non-latin scripts 2021-10-26 17:29:03 +02:00
Sarah Hoffmann
b240b182cb Merge pull request #2493 from lonvia/handle-frequent-partials
Tune search queries with frequent partial words
2021-10-26 17:00:43 +02:00
Sarah Hoffmann
c0f347fc8c adapt BDD tests to stricter partial search 2021-10-26 15:52:57 +02:00
Sarah Hoffmann
53dbe58ada do not count words when in reverse-only mode 2021-10-26 12:00:13 +02:00
Sarah Hoffmann
2c4b798f9b further refactor setup to keep function small 2021-10-26 12:00:13 +02:00
Sarah Hoffmann
1cf14a8e94 searches for house numbers must have an address 2021-10-26 12:00:13 +02:00
Sarah Hoffmann
4864bf1509 disallow search for partials without address
Very frequent partial terms take too long to look up and
do not return any valuable results unless the search is
further narrowed down by an address.
2021-10-26 12:00:13 +02:00
Sarah Hoffmann
9934421442 make word count computation part of the import
Accurate word counts are now essential when using
the ICU tokenizer and don't hurt for the legacy one.

Adds about an hour import time.
2021-10-26 12:00:13 +02:00
Sarah Hoffmann
d7267c1603 actions: move ICU tests into its own run 2021-10-26 11:59:13 +02:00
Sarah Hoffmann
5c778c6d32 Merge pull request #2486 from lonvia/fix-special-phrases
Fix parsing of operator in special phrases
2021-10-25 21:45:08 +02:00
Sarah Hoffmann
85797acf1e ICU: add an index over word_ids
Needed for keyword lookup in the details response.
2021-10-25 21:33:27 +02:00
Sarah Hoffmann
c4f5c11a4e be case-insensitve about special phrase operator 2021-10-25 19:51:20 +02:00
Sarah Hoffmann
5a1c3dbea3 fix parsing of operator in special phrases
Because of unstripped input, the operators wouldn't match.
2021-10-25 19:46:30 +02:00
Sarah Hoffmann
8e439d3dd9 Merge pull request #2484 from lonvia/fix-index-use
Reverse: add index hints
2021-10-25 17:20:42 +02:00
Sarah Hoffmann
9ebf921c53 Merge pull request #2483 from lonvia/fix-warming
Fix warming for ICU tokenizer
2021-10-25 16:21:36 +02:00
Sarah Hoffmann
7bd9094aaa reverse: add index hints
The fairly complex where condition of idx_placex_geometry_placenode
won't always be matched by the query planner if the condition
part doesn't appear verbatim in the query.

Fixes #2480.
2021-10-25 15:01:03 +02:00
Sarah Hoffmann
16cc395f78 fix warming for ICU tokenizer
Running the warm-up search requests requires querying
the most frequent words. This must be done via the tokenizer
to honor the different formats of the word table.
2021-10-25 13:08:16 +02:00
Sarah Hoffmann
13e7398566 allow relative paths for log files 2021-10-25 10:26:05 +02:00
Sarah Hoffmann
8b90ee4364 Merge pull request #2476 from lonvia/harmonize-configuration-file-settings
Standardize handling of file names in configuration values
2021-10-24 10:57:48 +02:00
Sarah Hoffmann
1098ab732f allow relative paths for flatnode file 2021-10-22 17:32:51 +02:00
Sarah Hoffmann
507fdd4f40 switch IMPORT_STYLE to use generic file search
Allows relative paths wrt project directory.
2021-10-22 16:49:57 +02:00
Sarah Hoffmann
0ae8d7ac08 have ADDRESS_LEVEL_CONFIG use load_sub_configuration
This means that relative paths now are looked up in the
project directory.
2021-10-22 16:36:52 +02:00
Sarah Hoffmann
c77df2d1eb replace NOMINATIM_PHRASE_CONFIG with command line option 2021-10-22 14:41:14 +02:00
Sarah Hoffmann
cefae021db doc: clarify relative paths for tokenizer config 2021-10-21 16:38:06 +02:00
Sarah Hoffmann
771aee8cd8 Merge pull request #2475 from lonvia/catchup-mode
Add catch-up mode to replication and extend documentation for updating
2021-10-21 16:21:58 +02:00
Sarah Hoffmann
2d13d8b3b6 extend documentation for updating database
Explains the different modes and adds hints for
setting up a systemd job.
2021-10-21 12:14:47 +02:00
Sarah Hoffmann
c1fa70639b add new replication mode catch-up
This mode gets updates until the server reports no new diffs
anymore.

Also adds additional indexing, when the main indexing step left
a couple of objects to process. This happens only when the
next update is expected to be more than 40min away.
2021-10-20 22:05:15 +02:00
Sarah Hoffmann
12643c5986 run Tiger import with parallel threads per default 2021-10-19 15:00:26 +02:00
Sarah Hoffmann
a0f5613a23 Merge pull request #2472 from lonvia/word-count-computation
Fix word count computation for ICU tokenizer
2021-10-19 14:58:57 +02:00
Sarah Hoffmann
824562357b adapt tests for new word count mechanism 2021-10-19 12:03:48 +02:00
Sarah Hoffmann
ec7184c533 icu: no longer precompute terms
The ICU analyzer no longer drops frequent partials, so it is no
longer necessary to know the frequencies in advance.
2021-10-19 11:52:28 +02:00
Sarah Hoffmann
e8e2502e2f make word recount a tokenizer-specific function 2021-10-19 11:21:16 +02:00
Sarah Hoffmann
c86cfefc48 Merge pull request #2471 from lonvia/update-install-rules
Reorganise, update and extend documentation
2021-10-19 09:11:16 +02:00
Sarah Hoffmann
2635fe8b4c docs: fix more links 2021-10-18 17:26:14 +02:00
Sarah Hoffmann
632436d54d docs: refer to our new Settings chapter in the import instruchtions 2021-10-18 17:02:52 +02:00
Sarah Hoffmann
74be6828dd check and fix all liks in documentation 2021-10-18 16:53:24 +02:00
Sarah Hoffmann
f4acfed48f add extended documentation of settings 2021-10-18 16:30:52 +02:00
Sarah Hoffmann
91e1c1bea8 docs: update overview pages 2021-10-18 09:04:06 +02:00
Sarah Hoffmann
bbb9a41ea4 docs: move place ranking into customization part 2021-10-18 09:04:06 +02:00
Sarah Hoffmann
f6418887b2 docs: nominatim-ui has a new place for custom config 2021-10-18 09:04:06 +02:00
Sarah Hoffmann
a3f8a097a1 docs: move import style description to customize section 2021-10-18 09:04:06 +02:00
Sarah Hoffmann
751563644f docs: make customization chapter a separate section 2021-10-18 09:04:01 +02:00
Sarah Hoffmann
e52b801cd0 fix typo 2021-10-18 09:03:07 +02:00
Sarah Hoffmann
445a6428a6 docs: remove the development warning for ICU tokenizer 2021-10-18 09:03:07 +02:00
Sarah Hoffmann
d59b26dad7 docs: add a warning about using --no-updates with TIGER data 2021-10-18 09:03:07 +02:00
Sarah Hoffmann
47417d1871 update and extend man page
Provide extended descriptions for most subcommands.
2021-10-18 09:03:07 +02:00
Sarah Hoffmann
381aecb952 rename manual directory to man
Avoids confusion between 'docs' and 'manual'.
2021-10-18 09:03:07 +02:00
Sarah Hoffmann
45344575c6 add munin scipts and ICU subrules to installation 2021-10-18 09:03:07 +02:00
Sarah Hoffmann
83381625bd Merge pull request #2469 from lonvia/fix-tablespace-assignment
Fix template expressions for tablespaces
2021-10-15 18:20:43 +02:00
Sarah Hoffmann
552fb16cb2 fix template expressions for tablespaces 2021-10-15 15:11:09 +02:00
Sarah Hoffmann
75c631f080 Merge pull request #2450 from mtmail/tiger-data-2021
US TIGER data 2021 released
2021-10-11 19:22:15 +02:00
Sarah Hoffmann
e2464fdf62 Merge pull request #2465 from lonvia/use-spgist-index
Use SP-GIST for building index
2021-10-11 10:48:44 +02:00
Sarah Hoffmann
9ff98073db remove outdated country_languages.php 2021-10-10 21:58:43 +02:00
Sarah Hoffmann
98ee5def37 add recommendation for Postgis 3+ 2021-10-10 21:55:38 +02:00
Sarah Hoffmann
3649487f5e use SP-GIST index for building index where available
Point-in-polygon queries are much faster with a SP-GIST geometry
index, so use that for the index used to check if a housenumber
is inside a building.

Only available with Postgis 3. There is an automatic fallback to
GIST for Postgis 2.
2021-10-10 21:55:38 +02:00
Sarah Hoffmann
4b007ae740 Merge pull request #2460 from lonvia/multiple-analyzers
Add support for multiple token analyzers
2021-10-09 14:41:09 +02:00
Sarah Hoffmann
6c79a60e19 add documentation for new configuration of ICU tokenizer 2021-10-07 11:55:53 +02:00
Sarah Hoffmann
2a94bfc703 fix argument description for check_database 2021-10-07 09:49:13 +02:00
Sarah Hoffmann
299934fd2a reorganize and complete tests around generic token analysis 2021-10-06 17:03:37 +02:00
Sarah Hoffmann
b18d042832 add tests for sanitizer tagging language 2021-10-06 12:29:25 +02:00
Sarah Hoffmann
97a10ec218 apply variants by languages
Adds a tagger for names by language so that the analyzer of that
language is used. Thus variants are now only applied to names
in the specific language and only tag name tags, no longer to
reference-like tags.
2021-10-06 11:09:54 +02:00
Sarah Hoffmann
d35400a7d7 use analyser provided in the 'analyzer' property
Implements per-name choice of analyzer. If a non-default
analyzer is choosen, then the 'word' identifier is extended
with the name of the ana;yzer, so that we still have unique
items.
2021-10-05 14:10:32 +02:00
Sarah Hoffmann
92f6ec2328 remove support for properties on variants
Those are not going to be used in the near future, so no need to
carry that code around just now.
2021-10-05 10:29:36 +02:00
Sarah Hoffmann
9ba2019470 precompute replacements while loading configuration 2021-10-05 10:20:08 +02:00
Sarah Hoffmann
c171d88194 move parsing of token analysis config to analyzer
Adds a second callback for the analyzer which is responsible
for parsing the configuration rules and converting it to
whatever format necessary. This way, each analyzer implementation
can define its own configuration rules.
2021-10-04 18:31:58 +02:00
Sarah Hoffmann
7cfcbacfc7 make token analyzers configurable modules
Adds a mandatory section 'analyzer' to the token-analysis entries
which define, which analyser to use. Currently there is exactly
one, generic, which implements the former ICUNameProcessor.
2021-10-04 17:37:34 +02:00
Sarah Hoffmann
52847b61a3 extend ICU config to accomodate multiple analysers
Adds parsing of multiple variant lists from the configuration.
Every entry except one must have a unique 'id' paramter to
distinguish the entries. The entry without id is considered
the default. Currently only the list without an id is used
for analysis.
2021-10-04 16:40:28 +02:00
Sarah Hoffmann
5a36559834 move flatten_config_list into config module
For general usage by other modules.
2021-10-04 11:56:54 +02:00
Sarah Hoffmann
19d4e047f6 Merge pull request #2458 from lonvia/add-tokenizer-preprocessing
Add a "sanitation" step for name and address tags before token processing
2021-10-01 21:53:34 +02:00
Sarah Hoffmann
6b348d43c6 replace test variable for PG env tests
'tty' was removed in PG14 and causes an error.
2021-10-01 12:27:24 +02:00
Sarah Hoffmann
732cd27d2e add unit tests for new sanatizer functions 2021-10-01 12:27:24 +02:00
Sarah Hoffmann
8171fe4571 introduce sanitizer step before token analysis
Sanatizer functions allow to transform name and address tags before
they are handed to the tokenizer. Theses transformations are visible
only for the tokenizer and thus only have an influence on the
search terms and address match terms for a place.

Currently two sanitizers are implemented which are responsible for
splitting names with multiple values and removing bracket additions.
Both was previously hard-coded in the tokenizer.
2021-10-01 12:27:24 +02:00
Sarah Hoffmann
16daa57e47 unify ICUNameProcessorRules and ICURuleLoader
There is no need for the additional layer of indirection that
the ICUNameProcessorRules class adds. The ICURuleLoader can
fill the database properties directly.
2021-10-01 12:27:24 +02:00
Sarah Hoffmann
5e5addcdbf fix typo 2021-09-29 14:16:09 +02:00
Sarah Hoffmann
be65c8303f export more data for the tokenizer name preparation
Adds class, type, country and rank to the exported information
and removes the rather odd hack for countries. Whether a place
represents a country boundary can now be computed by the tokenizer.
2021-09-29 11:54:14 +02:00
Sarah Hoffmann
231250f2eb add wrapper class for place data passed to tokenizer
This is mostly for convenience and documentation purposes.
2021-09-29 11:54:07 +02:00
Sarah Hoffmann
d44a428b74 Merge pull request #2455 from lonvia/adjust-address-levels-slovakia
Adjust address levels for boundaries in Slovakia
2021-09-28 11:21:08 +02:00
Sarah Hoffmann
40f9d52ad8 Merge pull request #2454 from lonvia/sort-out-token-assignment-in-sql
ICU tokenizer: switch match method to using partial terms
2021-09-28 09:45:15 +02:00
Sarah Hoffmann
7f3b05c179 adjust address levels for boundaries in Slovakia
Levels choosen according to OSM wiki. Mainly moves admin_level 6
to county level and admin_level 8 to city/town level. Higher
levels are adjusted accordingly.

Fixes #2453.
2021-09-27 23:32:11 +02:00
Sarah Hoffmann
09c9fad6c3 adapt tests to new ICU address token handling 2021-09-27 17:36:23 +02:00
Sarah Hoffmann
bb18479d5b remove unused parameter 2021-09-27 14:58:43 +02:00
Sarah Hoffmann
779ea8ac62 Merge pull request #2452 from lonvia/update-houses-on-street-name-change
Force update of surrounding houses when street or place name changes
2021-09-27 14:55:50 +02:00
Sarah Hoffmann
bd7c7ddad0 icu tokenizer: switch to matching against partial names
When matching address parts from addr:* tags against place names,
the address names where so far converted to full names and compared
those to the place names. This can become problematic with the new
ICU tokenizer once we introduce creation of different variants
depending on the place name context. It wouldn't be clear which
variant to produce to get a match, so we would have to create all of
them. To work around this issue, switch to using the partial terms
for matching. This introduces a larger fuzziness between matches but
that shouldn't be a problem because matching is always geographically
restricted.

The search terms created for address parts have a different problem:
they are already created before we even know if they are going to be
used. This can lead to spurious entries in the word table, which slows
down searching. This problem can also be circumvented by using only
partial terms for the search terms. In terms of searching that means
that the address terms would not get the full-word boost, but given
that the case where an address part does not exist as an OSM object
should be the exception, this is likely acceptable.
2021-09-27 11:36:19 +02:00
Sarah Hoffmann
c6fdcf9b0d adapt documentation for SQL tokenizer interface 2021-09-27 11:36:19 +02:00
Sarah Hoffmann
59fe74ddf6 move name matching into tokenizer module
Instead of requesting the match tokens from the tokenizer
when looking for parent streets/places and address parts,
hand in the saved tokens and ask if they match. This gives
the tokenizer more freedom to decide how name matching
should be done.
2021-09-27 11:36:19 +02:00
Sarah Hoffmann
6d7c067461 force update on rank30 children when place name changes
Name changes may have an effect on parenting. Don't update
surrounding rank30 objects with addr:place tags as this is
potentially too expensive.
2021-09-27 11:04:17 +02:00
Sarah Hoffmann
316205e455 force update of surrounding houses when street name changes
When the street changes its name then this may cause changes
in the parenting of rank-30 objects with an addr:street
tag.

Fixes #2242.
2021-09-27 10:22:41 +02:00
marc tobias
834ae0a93f US TIGER data 2021 released 2021-09-25 00:05:17 +02:00
Sarah Hoffmann
d562f11298 slightly increase radius to look for postcodes 2021-09-24 23:56:42 +02:00
Sarah Hoffmann
972628c751 Merge pull request #2449 from lonvia/address-ranking-spain
Adjust address ranks for Spain
2021-09-24 22:48:21 +02:00
Sarah Hoffmann
09b1db63f4 adjust address ranks for Spain
Adjusts levels for boundaries according to the list on
https://wiki.openstreetmap.org/wiki/Tag:boundary%3Dadministrative

* no admin_level 5, so drop that from addresses
* admin_level 6 has the province
* admin_level 7 has the county when it exists

Also reranks place=province so that it matches up with
admin_level 6 and introduces place=civil_parish which
is used as a place node for some admin_level=9 boundaries
in Galicia.
2021-09-24 18:39:44 +02:00
Sarah Hoffmann
e9d54f752c Merge pull request #2447 from lonvia/fix-dynamic-address-assignment
Fix dynamic assignment of address parts
2021-09-19 15:57:28 +02:00
Sarah Hoffmann
c335025167 CI: install locale for CentOS 2021-09-19 13:49:11 +02:00
Sarah Hoffmann
2b2109c89a Remove the installation warning
Installation has become a lot easier.
2021-09-19 13:01:32 +02:00
Sarah Hoffmann
56124546a6 fix dynamic assignment of address parts
A boolean check for dynamic changes of address parts is not
sufficient. The order of choice should be:

 1. an addr:* part matches the name
 2. the address part surrounds the object
 3. the address part was declared as isaddress

The implementation uses a slightly different ordering
to avoid geometry checks unless strictly necessary (isaddress
is false and no matching address).

See #2446.
2021-09-19 12:34:39 +02:00
Sarah Hoffmann
336258ecf8 Merge pull request #2440 from lonvia/generic-config-loader
Add generic loader for YAML configuration files
2021-09-04 17:41:15 +02:00
Sarah Hoffmann
b894d2c04a fix indent 2021-09-04 10:30:35 +02:00
Sarah Hoffmann
8e1d4818ac use yaml config loader for country info 2021-09-04 00:22:55 +02:00
Sarah Hoffmann
28c98584c1 add tests for generic YAML config reader 2021-09-03 22:31:30 +02:00
Sarah Hoffmann
1c42780bb5 introduce generic YAML config loader
Adds a function to the Configuration class to load a YAML
file. This means that searching for the file is generalised
and works the same now for all configuration files. Changes
the search logic, so that it is always possible to have a
custom version of the configuration file in the project
directory.

Move ICU tokenizer to use new load function.
2021-09-03 18:20:07 +02:00
Sarah Hoffmann
18554dfed7 Merge pull request #2437 from lonvia/tweak-ranking-searches
Some more tweaks for search interpretation
2021-09-03 14:16:23 +02:00
Sarah Hoffmann
2e493fec46 Merge pull request #2436 from lonvia/country-configuration
Move configuration of default languages into a configuration file
2021-09-03 08:55:36 +02:00
Sarah Hoffmann
98c2e08add reduce penalty for special searches by name
Additional penalty for special terms with operator None
should only go to near searches. To reduce the number
of produced searches, restrict the none operator to
appear only in conjunction with the name.
2021-09-03 08:50:38 +02:00
Sarah Hoffmann
94d3dee369 further increase penalty on housenumbers without numbers
Make the penality dependent on the length of the token:
no penalty for one letter house numbers and increasing one
for more letters.
2021-09-02 18:11:49 +02:00
Sarah Hoffmann
7e7dd769fd remove language and partition from name import 2021-09-02 14:41:11 +02:00
Sarah Hoffmann
79da96b369 read partition and languages from config file 2021-09-02 14:41:11 +02:00
Sarah Hoffmann
78fcabade8 move country name generation to country_info module 2021-09-02 14:41:11 +02:00
Sarah Hoffmann
284645f505 move generation of country tables in own module 2021-09-02 14:41:11 +02:00
Sarah Hoffmann
0b349761a8 add country configuration
The new configuration saves the default language(s) originally
maintained in the OSM wiki as well as the partition information.
2021-09-02 14:41:11 +02:00
Sarah Hoffmann
d18794931a Merge pull request #2435 from lonvia/simplified-to-traditional-chinese
icu: normalise simplified to traditional chinese
2021-08-31 15:29:26 +02:00
Sarah Hoffmann
b7d4ff3201 icu: normalise simplified to traditional chinese
The conversion is unambigious in most cases, so that the
information loss is minimal.
2021-08-31 11:18:34 +02:00
Sarah Hoffmann
4c6d674e03 Merge pull request #2434 from lonvia/vagrant-scripts-in-actions
Test installation instructions via CI
2021-08-29 10:11:59 +02:00
Sarah Hoffmann
2c97af8021 CI: use packaged source also for test runs 2021-08-24 10:10:01 +02:00
Sarah Hoffmann
832f75a55e CI: unify jobs for different vagrant scripts 2021-08-24 10:10:01 +02:00
Sarah Hoffmann
4e77969545 add workflow for centos 8 2021-08-24 10:10:01 +02:00
Sarah Hoffmann
6ebbbfee61 CI: use vagrant scripts for import tests
Use vanilla docker images of Ubuntu and leave the setup
to the vagrant scripts. Then do the usual import tests.

Also fixes a couple of issues found with the scripts
2021-08-24 10:10:01 +02:00
Sarah Hoffmann
0fabeefc3e Merge pull request #2432 from Mastercuber/patch-1
Added postcode
2021-08-22 09:32:31 +02:00
Mastercuber
c70d72f06b Added postcode
Added postcode to the list of addressdetails
2021-08-22 02:52:41 +02:00
Sarah Hoffmann
cc141bf1a5 Add link to fixthemap to issue template 2021-08-21 20:36:16 +02:00
Sarah Hoffmann
199532c802 Merge pull request #2429 from lonvia/place-name-to-admin-boundary
Indexing: move linking of places to the preparation stage
2021-08-21 10:21:39 +02:00
Sarah Hoffmann
28ee3d0949 move linking of places to the preparation stage
Linked places may bring in extra names. These names need to be
processed by the tokenizer. That means that the linking needs
to be done before the data is handed to the tokenizer. Move finding
the linked place into the preparation stage and update the name
fields. Everything else is still done in the indexing stage.
2021-08-20 22:44:17 +02:00
Sarah Hoffmann
925195725d Merge pull request #2428 from lonvia/rename-icu-tokenizer
Rename legacy_icu tokenizer to icu tokenizer
2021-08-18 15:02:19 +02:00
Sarah Hoffmann
f6d22df76e adapt CI workflow to new tokenizer name 2021-08-18 09:08:20 +02:00
Sarah Hoffmann
118858a55e rename legacy_icu tokenizer to icu tokenizer
The new icu tokenizer is now no longer compatible with the old
legacy tokenizer in terms of data structures. Therefore there
is also no longer a need to refer to the legacy tokenizer in the
name.
2021-08-17 23:11:47 +02:00
Sarah Hoffmann
656c1291b1 Merge pull request #2427 from lonvia/remove-us-states-special-casing
Move US state hack into legacy tokenizer
2021-08-17 21:55:32 +02:00
Sarah Hoffmann
f00b8dd1c3 move special hack for US states to legacy tokenizer
The hack for IL, AL and LA is only needed because these abbreviations
are removed by the legacy tokenizer as a stop word. There is no need
to keep the hack for future tokenizers. Move it therefore to the
token extraction function.
2021-08-17 14:28:55 +02:00
Sarah Hoffmann
5f2b9e317a add tests for US state hacks
IL, AS and LA are replaced with the US state in Geocode because
the old tokenizer would simply remove the abbreviations otherwise.
2021-08-17 10:49:07 +02:00
Sarah Hoffmann
4ae5ba7fc4 Merge pull request #2425 from lonvia/tokenizer-documentation
Introduce official Tokenizer API
2021-08-17 09:38:03 +02:00
Sarah Hoffmann
3656eed9ad add mkdocstrings requirement for building docs
mkdocstrings also needs access to the Python sources, so set
a PYTHONPATH accordingly. This makes running mkdocs directly
a bit awkward, therefore add a `make serve-doc` target.
2021-08-16 11:51:49 +02:00
Sarah Hoffmann
2e82a6ce03 docs: extend explanation of query phrase 2021-08-16 11:51:49 +02:00
Sarah Hoffmann
c4b8a3b768 add documentation for PHP part of tokenizer 2021-08-16 11:51:49 +02:00
Sarah Hoffmann
1147b83b22 php: make word list a first-class object
This separates the logic of creating word sets from the Phrase
class. A tokenizer may now derived the word sets any way they
like. The SimpleWordList class provides a standard implementation
for splitting phrases on spaces.
2021-08-16 11:51:49 +02:00
Sarah Hoffmann
0fb8eade13 remove country restriction from tokenizer
Restricting tokens due to the search context is better done in
the generic search part instead of repeating the same test in
every tokenizer implementation.
2021-08-16 11:41:54 +02:00
Sarah Hoffmann
78d11fe628 document tokenizer SQL interface 2021-08-16 11:41:54 +02:00
Sarah Hoffmann
90b40fc3e6 define formal public Python interface for tokenizer
This introduces an abstract class for the Tokenizer/Analyzer
for documentation purposes.
2021-08-16 11:41:54 +02:00
Sarah Hoffmann
e25e268e2e docs: querying and tokenizers 2021-08-16 08:59:44 +02:00
Sarah Hoffmann
68bff31cc9 docs: add developer doc page for Tokenizer 2021-08-16 08:58:56 +02:00
Sarah Hoffmann
31d9545702 Merge pull request #2424 from lonvia/multi-country-import
Update instructions for importing multiple regions
2021-08-16 08:48:28 +02:00
Sarah Hoffmann
e449071a35 Merge pull request #2423 from hummeltech/patch-1
Fix old paths for `phpcs` when using `make test`
2021-08-15 22:00:50 +02:00
Sarah Hoffmann
23e3724abb ignore words without id for status 2021-08-15 21:59:36 +02:00
Sarah Hoffmann
75a5c7013f split up large setup function 2021-08-15 12:24:13 +02:00
Sarah Hoffmann
56d24085f9 port multi-region update scripts to nominatim tool
Also updates the documentation. For the simple case of just
importing multiple regions, provide simplified instructions
that use the new multi-file import feature.

Fixes #2365.
2021-08-14 23:55:48 +02:00
Sarah Hoffmann
95b82af42a update osm2pgsql to 1.5.1 2021-08-14 22:46:35 +02:00
Sarah Hoffmann
87dedde5d6 allow multiple files for the import command
The files are forwarded to osm2pgsql which is now able to merge
them correctly.
2021-08-14 21:42:21 +02:00
David Hummel
8b6489c60e Fix old paths for phpcs when using make test
These paths no longer exist since db3ced17bb, they are now all located under `lib-php`
2021-08-12 13:34:18 -07:00
Sarah Hoffmann
bf4f05fff3 Merge pull request #2413 from osm-search/helm-chart
Installation docs - link to Kubernetes install project
2021-08-08 11:09:36 +02:00
mtmail
b0aaa25f0d Installation docs - link to Kubernetes install project
As reported by @robjuz in https://github.com/osm-search/Nominatim/discussions/2412
2021-08-03 12:02:35 +02:00
Sarah Hoffmann
c3ddc7579a Merge pull request #2408 from lonvia/icu-change-word-table-layout
Change table layout of word table for ICU tokenizer
2021-07-28 14:28:49 +02:00
Sarah Hoffmann
fdff579188 php: force use of global Exception class 2021-07-28 11:31:47 +02:00
Sarah Hoffmann
d48793c22c fix Python linitin errors 2021-07-28 11:31:47 +02:00
Sarah Hoffmann
001b2aa9f9 fix linitin issues in PHP 2021-07-28 11:31:47 +02:00
Sarah Hoffmann
1db098c05d reinstate word column in icu word table
Postgresql is very bad at creating statistics for jsonb
columns. The result is that the query planer tends to
use JIT for queries with a where over 'info' even when
there is an index.
2021-07-28 11:31:47 +02:00
Sarah Hoffmann
324b1b5575 bdd tests: do not query word table directly
The BDD tests cannot make assumptions about the structure of the
word table anymore because it depends on the tokenizer. Use more
abstract descriptions instead that ask for specific kinds of
tokens.
2021-07-28 11:31:47 +02:00
Sarah Hoffmann
e42878eeda adapt unit test for new word table
Requires a second wrapper class for the word table with the new
layout. This class is interface-compatible, so that later when
the ICU tokenizer becomes the default, all tests that depend on
behaviour of the default tokenizer can be switched to the other
wrapper.
2021-07-28 11:31:47 +02:00
Sarah Hoffmann
eb6814d74e convert word info column to json before copying 2021-07-28 11:31:47 +02:00
Sarah Hoffmann
6ad35aca4a adapt special terms lookup to new word table 2021-07-28 11:31:47 +02:00
Sarah Hoffmann
70f154be8b switch word tokens to new word table layout 2021-07-28 11:31:47 +02:00
Sarah Hoffmann
4342b28882 switch special phrases to new word table format 2021-07-28 11:31:47 +02:00
Sarah Hoffmann
5394b1fa1b switch postcode tokens to new word table layout 2021-07-28 11:31:47 +02:00
Sarah Hoffmann
5ab0a63fd6 switch housenumber tokens to new word table layout 2021-07-28 11:31:47 +02:00
Sarah Hoffmann
1618aba5f2 switch country name tokens to new word table layout 2021-07-28 11:31:47 +02:00
Sarah Hoffmann
8377528952 new word table layout for icu tokenizer
The table now directly reflects the different token types.
Extra information is saved in a json structure that may be
dynamically extended in the future without affecting the
table layout.
2021-07-28 11:31:47 +02:00
Sarah Hoffmann
34dcf02dee fix typos in tokenizer docs 2021-07-28 11:28:49 +02:00
Sarah Hoffmann
5d7d7f15d9 Merge pull request #2401 from lonvia/port-add-data-to-python
Port add-data functions from PHP to Python
2021-07-26 12:38:56 +02:00
Sarah Hoffmann
0c023fb4d2 adapt cli tests to Python port for add-data 2021-07-26 10:41:37 +02:00
Sarah Hoffmann
1bd068d42d remove unused update script 2021-07-26 10:41:37 +02:00
Sarah Hoffmann
e42349c963 replace add-data function with native Python code 2021-07-26 10:41:37 +02:00
Sarah Hoffmann
878835e4bd move add-data subcommand into a separate file 2021-07-25 18:14:12 +02:00
Sarah Hoffmann
8096a1d67f fix parameters for TokenWord creation 2021-07-20 10:21:40 +02:00
Sarah Hoffmann
e16c5d5f70 Merge pull request #2397 from lonvia/increase-minimum-required-versions
Increase minimum required PostgreSQL version to 9.5
2021-07-19 14:28:02 +02:00
Sarah Hoffmann
2c8242c8df remove special code for pre9.5 postgresql
9.5 is now the minimum requirement.
2021-07-19 10:24:57 +02:00
Sarah Hoffmann
e7d6f89aca increase minimum version for PostgreSQL to 9.5
This is the minimum version we can test with the CI.
With 9.5 there is also complete support for jsonb available.
2021-07-19 10:21:19 +02:00
Sarah Hoffmann
379f5db516 require Python 3.6 also in CMakeFile
This had been forgotten when increasing the minimum Python version.
2021-07-19 10:14:14 +02:00
Sarah Hoffmann
ee32315378 Merge pull request #2396 from lonvia/partial-word-token
Reorganise code that build the SearchDescription
2021-07-19 09:42:37 +02:00
Sarah Hoffmann
cca912af4e make all Token menbers private 2021-07-18 22:54:55 +02:00
Sarah Hoffmann
86ea077092 merge marking rare name with adding name token
Only name tokens can be rare, so this should be the same
function.
2021-07-18 16:52:37 +02:00
Sarah Hoffmann
5d6aabc457 add documentation for public interface of SearchDescription 2021-07-18 16:10:42 +02:00
Sarah Hoffmann
b14ce959d9 factor out check if a token fits current search
Saves allocating an empty array.
2021-07-17 22:01:35 +02:00
Sarah Hoffmann
a48ebd9b47 move SearchDescription building into tokens
Moving the logic for extending the SearchDescription into the
token classes splits up the code and makes it more readable.
More importantly: it allows tokenizer to define custom token
classes in the future.
2021-07-17 20:24:33 +02:00
Sarah Hoffmann
3cd85eaaf1 remove Token from explicit input for SearchDescription extension
The token string is only required by the PartialToken type, so
it can simply save the token string internally. No need to pass
it to every type.

Also moves the check for multi-word partials to the token loader
code in the tokenizer. Multi-word partials can only happen with
the legacy tokenizer and when the database was loaded with an
older version of Nominatim. No need to keep the check for
everybody.
2021-07-17 18:18:31 +02:00
Sarah Hoffmann
ec3f6c9c42 factor out query position
Moves token and phrase position and phrase type into a separate
class that is handed in when assembling the search description.
This drastically reduces the number of parameters for the function
to extend the search descriptions and gives us more flexibility
in the future for more complex positional analysis.
2021-07-15 14:12:59 +02:00
Sarah Hoffmann
143ff14466 remove special status of partial tokens
Full-word tokens are no longer marked by a space at the
beginning of the token. Use the new Partial token category
instead. This removes a couple of special casing, we don't
really need.

The word table still has the space for compatibility reasons,
so the tokenizer code needs to get rid of it when loading the
tokens.
2021-07-14 22:17:17 +02:00
Sarah Hoffmann
6070c3d1d5 introduce a separate token type for partials
This means that the leading space can be removed as a partial
word indicator.
2021-07-13 16:57:12 +02:00
Sarah Hoffmann
bc8b2d4ae0 Merge pull request #2393 from lonvia/fix-flake8-issues
Fix flake8 issues
2021-07-13 16:46:12 +02:00
Sarah Hoffmann
14f777da18 use psycopg's SQL quoting where possible
Use the SQL formatting supplied with psycopg whenever the
query needs to be put together from snippets.
2021-07-12 22:05:22 +02:00
Sarah Hoffmann
6f6681ce67 add helper function for execute_values
Make psycopg2's convenience function accessible through
the cursor.
2021-07-12 21:08:20 +02:00
Sarah Hoffmann
06602b4ec0 provide wrapper function for DROP TABLE
Use psycopg2 formatting to ensure correct quoting.
2021-07-12 20:32:46 +02:00
Sarah Hoffmann
cf98cff2a1 more formatting fixes
Found by flake8.
2021-07-12 17:45:42 +02:00
Sarah Hoffmann
b4fec57b6d Merge pull request #2391 from lonvia/fix-sonar-issues
Fix bugs and code smells found by Sonarqube
2021-07-12 17:14:59 +02:00
Sarah Hoffmann
f8b5a63de3 factor out connection reset code 2021-07-12 14:58:44 +02:00
Sarah Hoffmann
568316f07c simplify analyse function 2021-07-12 14:47:50 +02:00
Sarah Hoffmann
daa597b300 split up variant computation for better readability 2021-07-12 14:43:50 +02:00
Sarah Hoffmann
47adb2a3fc reorganise process_place function
Move address processing into its own function as it is
rather extensive.
2021-07-12 11:57:55 +02:00
Sarah Hoffmann
fff0012249 simplify website setup code
Use formaat strings and move variable quoting code into extra
function.
2021-07-12 11:41:05 +02:00
Sarah Hoffmann
d5a1883b62 avoid repeated patterns for table name 2021-07-12 11:33:09 +02:00
Sarah Hoffmann
a08ef43e40 simplify if statements 2021-07-12 11:28:47 +02:00
Sarah Hoffmann
bc5e15996a convert single case switch to if statement 2021-07-12 11:28:47 +02:00
Sarah Hoffmann
128ca800cd avoid local variable assignment 2021-07-11 23:22:53 +02:00
Sarah Hoffmann
000d133af6 fix more missing braces on one-liners 2021-07-11 23:22:53 +02:00
Sarah Hoffmann
1e40d65aa9 remove dead code 2021-07-11 23:22:53 +02:00
Sarah Hoffmann
bffbe68ec3 do not intermix params with and without default 2021-07-11 23:22:53 +02:00
Sarah Hoffmann
58b10074ad directly return data in function
The temporary variable is not necessary.
2021-07-11 19:24:04 +02:00
Sarah Hoffmann
d933ead2b5 remove unnecessayly nested ifs
Found by Sonarqube.
2021-07-11 19:11:37 +02:00
Sarah Hoffmann
1cdc30c5e8 remove unused functions
The functions were necessary for the transitory code
to Python and are no longer used.
2021-07-11 19:10:04 +02:00
Sarah Hoffmann
3661f7a321 avoid multiple returns of same value
Found by Sonarqube.
2021-07-11 18:23:42 +02:00
Sarah Hoffmann
27af9b102c always use brackets on if statements
This adds bracket around all one-line if statements that did
not have them yet.
2021-07-10 17:04:46 +02:00
Sarah Hoffmann
500c61685b remove unused variables
As reported by sonarqube.
2021-07-09 16:36:42 +02:00
Sarah Hoffmann
106d960f84 fix bad use of echo in PHP output 2021-07-09 12:50:35 +02:00
Sarah Hoffmann
322fa19ceb Merge pull request #2390 from lonvia/responsible-disclosure
Add security issue disclosure policy
2021-07-09 12:32:37 +02:00
Sarah Hoffmann
5bea0b6086 add security issue disclosure policy 2021-07-09 11:36:59 +02:00
Sarah Hoffmann
a5970d7548 Merge pull request #2384 from lonvia/actions-add-icu-tokenizer
CI: run tests on Ubuntu 18
2021-07-07 14:39:53 +02:00
Sarah Hoffmann
c216144dd1 add missing pyyaml requirement 2021-07-07 11:29:33 +02:00
Sarah Hoffmann
42e08da7ca enable PHP 7.2 for Ubuntu 18 CI 2021-07-07 11:29:33 +02:00
Sarah Hoffmann
a2edbbf78a cannot use capture_output in subprocess.run
Only available since Python 3.7.
2021-07-06 22:57:42 +02:00
Sarah Hoffmann
1e86dc1d93 remove default parameter for namedtuple
This is only available in Python 3.7.
2021-07-06 22:57:42 +02:00
Sarah Hoffmann
54f295be52 CI: run tests on older Ubuntu version as well 2021-07-06 22:57:42 +02:00
Sarah Hoffmann
8bc3c0a07c Merge pull request #2382 from lonvia/remove-json-config
Remove outdated ICU tokenizer JSON config
2021-07-05 12:34:34 +02:00
Sarah Hoffmann
d75bc20174 Merge pull request #2383 from lonvia/remove-more-names
Exclude name:etymology and name:signed
2021-07-05 12:34:16 +02:00
Sarah Hoffmann
fd8751658f exclude name:etymology and name:signed
name:etymology contains a description of the name origin and is
thus more informative than search-worthy.

name:signed basically indicates that the feature does not have
a name.
2021-07-05 11:04:16 +02:00
Sarah Hoffmann
4db5a1a0b8 remove outdated ICU tokenizer JSON config 2021-07-05 11:01:35 +02:00
Sarah Hoffmann
4c52777ef0 Merge pull request #2371 from lonvia/increase-python-version
Increase minimum required Python version to 3.6
2021-07-05 10:32:38 +02:00
Sarah Hoffmann
d4c7bf20a2 Merge pull request #2381 from lonvia/reorganise-abbreviations
Reorganise abbreviation handling
2021-07-05 10:32:16 +02:00
Sarah Hoffmann
affe1300d9 add warning about experimental nature of ICU tokenizer 2021-07-04 10:44:58 +02:00
Sarah Hoffmann
62d5984b1b limit the number of variants that can be produced 2021-07-04 10:28:28 +02:00
Sarah Hoffmann
c32551b4e0 restrict partial word counting to names of reasoanble length
The partial word count does not split names to save a bit of time.
The result is that it might enounter unreasonably long names
which in truth consist of multiple words. No accurate statistics
are needed so simply restrict the count to words shorter than
75 characters.
2021-07-04 10:28:28 +02:00
Sarah Hoffmann
e85f7e7aa9 fix subsequent replacements
Two replacement words directly following each other did not
work as expected because each expects a space at the
beginning/end while there was only one space available.

Also forbit composing a word after a space was added in the
end by a previous replacement.
2021-07-04 10:28:28 +02:00
Sarah Hoffmann
7b0f6b7905 leave ICU variant properties empty for now
Saving unused properties causes unnecessary duplicates.
2021-07-04 10:28:20 +02:00
Sarah Hoffmann
0894ce9dc3 import abbreviations from OSM Wiki
Replaces the variant rules with a slightly cleaned-up
version of the abbreviation lists at
https://wiki.openstreetmap.org/wiki/Name_finder:Abbreviations
2021-07-04 10:28:20 +02:00
Sarah Hoffmann
4fd2e961b6 improve normalization
Make sure all special symbols are removed during normalization already.
Those won't be interpreted in any way because they are unlikely to be
searched for.
2021-07-04 10:28:20 +02:00
Sarah Hoffmann
b9fbfeff67 only consider partials in multi-words for initial count
This ensures that it is less likely that we exclude meaningful
words like 'hauptstrasse' just because they are frequent.
2021-07-04 10:28:20 +02:00
Sarah Hoffmann
5dd24b3ef0 add documentation for ICU tokenizer configuration 2021-07-04 10:28:20 +02:00
Sarah Hoffmann
62828fc5c1 switch to a more flexible variant description format
The new format combines compound splitting and abbreviation.
It also allows to restrict rules to additional conditions
(like language or region). This latter ability is not used
yet.
2021-07-04 10:28:20 +02:00
Sarah Hoffmann
a6aa6360e0 use yaml tag syntax to mark include files 2021-07-04 10:28:20 +02:00
Sarah Hoffmann
c4f6c06f44 add dependency on datrie 2021-07-04 10:28:20 +02:00
Sarah Hoffmann
0d80a9b897 tests for composing decomposed suffixes 2021-07-04 10:28:20 +02:00
Sarah Hoffmann
f70930b1a0 make compund decomposition pure import feature
Compound decomposition now creates a full name variant on
import just like abbreviations. This simplifies query time
normalization and opens a path for changing abbreviation
and compund decomposition lists for an existing database.
2021-07-04 10:28:20 +02:00
Sarah Hoffmann
9ff4f66f55 complete tests for icu tokenizer 2021-07-04 10:28:20 +02:00
Sarah Hoffmann
32ca631b74 fix full term token in special phrases 2021-07-04 10:28:20 +02:00
Sarah Hoffmann
2e81084f35 complete tests for rule loader 2021-07-04 10:28:20 +02:00
Sarah Hoffmann
a0a7b05c9f correctly quote strings when copying in data
Encapsulate the copy string in a class that ensures that
copy lines are written with correct quoting.
2021-07-04 10:28:20 +02:00
Sarah Hoffmann
2f6e4edcdb update unit tests for adapted abbreviation code 2021-07-04 10:28:20 +02:00
Sarah Hoffmann
1bd9f455fc add abbreviations from legacy tokenizer
These abbreviations are not a perfect fit anymore because
abbreviation replacement is now applied before transliteration.
2021-07-04 10:28:20 +02:00
Sarah Hoffmann
2e3c5d4c5b adapt tests for ICU tokenizer 2021-07-04 10:28:20 +02:00
Sarah Hoffmann
8413075249 move abbreviation computation into import phase
This adds precomputation of abbreviated terms for names and removes
abbreviation of terms in the query. Basic import works but still
needs some thorough testing as well as speed improvements during
import.

New dependency for python library datrie.
2021-07-04 10:28:20 +02:00
Sarah Hoffmann
6ba00e6aee icu tokenizer: move transliteration rules in separate file
The tokenizer configuration has become difficult to handle
due to the additional manual transliteration rules. Allow
to have a separate rule file that is given to the ICU library
as is.
2021-07-04 10:28:20 +02:00
Sarah Hoffmann
de4fac33dc docs: nominatim-ui should be installed from the release
The development version does not provide the pre-packaged
dist directory anymore.
2021-07-03 21:16:52 +02:00
Sarah Hoffmann
c9984669a7 Merge pull request #2373 from lonvia/tweak-search-cost
Further tweaking of search cost
2021-06-26 16:21:08 +02:00
Sarah Hoffmann
63755c31ff remove penalty for full words in address
Now that mutli-word partials no longer exist, multi-word full
words need to be used to search in addresses and therefore no
longer should have a penalty.

Also changes the condition when a full word is included into
the address. It is no longer relevant if an equivalent partial
exists but only if the term consists of more than one word.
2021-06-26 11:37:15 +02:00
Sarah Hoffmann
161f5f5cee adjust penalty for housenumber-in-name searches
When searching for house numbers in the name (for place-only
terms) then the same penalties need to apply as for the
regular house number search.

Change the code to first compute the penalties and then create
the new search variants.
2021-06-26 11:37:15 +02:00
Sarah Hoffmann
c7073a1fc0 increase minimum Python to 3.6
Python 3.6 introduces formatted string literals and
flag enums as well as a much faster dict implementation.
These changes make the code so much simpler as to warrant
dropping Python 3.5 support.

Affected distributions are Ubuntu 16.04 and Debian Stretch.
2021-06-21 18:37:37 +02:00
Sarah Hoffmann
e7b4fc70e7 make sure old data gets deleted on place type change
When changing from some other place type to place=postcode
make sure that the old place type entry in the place table
is deleted.
2021-06-18 10:58:41 +02:00
Sarah Hoffmann
457982e1d2 update postcode in place if it already exists 2021-06-18 00:28:52 +02:00
Sarah Hoffmann
aa558e6080 Merge pull request #2369 from lonvia/exclude-poi-from-housenumber-search
Do not return POIs when dropping house number in query
2021-06-17 15:30:05 +02:00
Sarah Hoffmann
fe11d3cbbd do not return POIs when dropping house number in query
We've previously added searching through rank 30 in a house
number search to enable searches for house number+name.
This had the unintended side effect that rank 30 objects
are also returned in s search that dropped the house number
from the query. This is wrong because POIs cannot function
as a parent to a house number.

This fix drops all rank 30 objects from the results for a
house number search if they do not match the requested house
number.
2021-06-17 14:21:20 +02:00
Sarah Hoffmann
1ce223a83b Merge pull request #2360 from AntoJvlt/postcodes-place-table
Use place instead of placex to compute postcodes
2021-06-16 11:45:07 +02:00
AntoJvlt
3676310efe Improved performance of the postcodes query and some code cleaning 2021-06-12 15:46:08 +02:00
AntoJvlt
ddf866c4c7 Always delete old placex entry for type=postcode when inserting a new one into the place table 2021-06-12 15:35:51 +02:00
AntoJvlt
9e07a197e9 Handle postcode type change in place insert trigger 2021-06-09 09:31:32 +02:00
AntoJvlt
1c175e3a67 Clean and update tests for postcodes 2021-06-09 09:31:32 +02:00
AntoJvlt
47fb7cd3a8 Use place_exists() into can_compute() for postcodes 2021-06-09 09:31:32 +02:00
AntoJvlt
e879814e43 Update tests for postcodes 2021-06-09 09:31:32 +02:00
AntoJvlt
a4733eed90 Use place instead of placex to compute postcodes 2021-06-09 09:31:32 +02:00
Sarah Hoffmann
38fbc4fcbb do not fail CI on codecov errors
The CodeCove upload depends on unreliable external code.
2021-06-08 10:42:14 +02:00
Sarah Hoffmann
c6fe91bfa5 Merge pull request #2359 from lonvia/switch-bdd-tests-to-api-search
Remove deprecated commandline query function
2021-06-06 18:29:51 +02:00
Sarah Hoffmann
7383f05e45 remove deprecated query interface
Searches can now be done via the thin API wrapper.
2021-06-06 15:28:21 +02:00
Sarah Hoffmann
3aac51c81f switch BDD tests to always use search API 2021-06-06 15:27:52 +02:00
Sarah Hoffmann
f0a7850edf Merge pull request #2358 from AntoJvlt/documentation-update
Update documentation
2021-06-04 23:54:37 +02:00
AntoJvlt
4336ca69c7 Update documentation 2021-06-03 18:39:40 +02:00
Sarah Hoffmann
4bca5e838b Merge pull request #2357 from lonvia/legacy-tokenizer-fix-word-entries
Fix insertion of special terms and countries into word table
2021-06-02 20:58:14 +02:00
Sarah Hoffmann
bc981d0261 fix insertion of special terms and countries into word table
Special terms need to be prefixed by a space because they are
full terms.

For countries avoid duplicate entries of word tokens.

Adds tests for adding country terms.
2021-06-02 20:22:39 +02:00
Sarah Hoffmann
b1d33e6b49 Merge pull request #2356 from lonvia/freeze-after-import
Call freeze after running and non-updateable import
2021-06-02 16:25:26 +02:00
Sarah Hoffmann
38d442edf6 docs: reload SQL when migrating to 3.6
SQL functions must always be reloaded when updating the software.
All other updates included the instruction as part of some other
migration. From 3.7 on it will happen as part of the migration
command.

Fixes #2335.
2021-06-02 16:11:29 +02:00
Sarah Hoffmann
72625dc72a call freeze after running and non-updateable import
Some of the tables will have already been removed but
the tables for indexing are still there and should be
dropped.
2021-06-02 11:08:48 +02:00
Sarah Hoffmann
cc2f152d70 commit changes to replication log table
Fixes #2350.
2021-05-26 11:47:08 +02:00
Sarah Hoffmann
f74dc38766 always compute guessed postcode for POIs from centroid
When guessing postcodes from the area, only postcodes within
that area are accepted. For POIs that is usually not what we
want as the postcode would have to be within a house for
example.

Fixes #2301.
2021-05-26 11:15:13 +02:00
Sarah Hoffmann
7d9665d8d2 Merge pull request #2349 from lonvia/fix-website-refresh
Only initialise tokenizer for refresh functions where needed
2021-05-25 20:43:44 +02:00
Sarah Hoffmann
a0e85cc17c only initialise tokenizer for refresh functions where needed
Fixes #2347.
2021-05-25 19:16:22 +02:00
Sarah Hoffmann
29b02f9e56 Merge pull request #2346 from lonvia/words-vs-tokens
Cleanup use of partial words in legacy tokenizers
2021-05-24 17:41:38 +02:00
Sarah Hoffmann
24c986c842 add tests for new full name computation with ICU 2021-05-24 10:41:42 +02:00
Sarah Hoffmann
4f4d15c28a reorganize keyword creation for legacy tokenizer
- only save partial words without internal spaces
- consider comma and semicolon a separator of full words
- consider parts before an opening bracket a full word
  (but not the part after the bracket)

Fixes #244.
2021-05-24 10:41:42 +02:00
Sarah Hoffmann
fa3e48c59f use make_keywords for place search terms also
Ensures that place indeed uses the same search names as other
names.
2021-05-23 23:08:11 +02:00
Sarah Hoffmann
02f6afa51b always ignore multi term partials in search
Partial terms should only ever consist of one word. Ignore
any other, they are a leftover from inefficient word index
builts.
2021-05-23 22:13:03 +02:00
Sarah Hoffmann
10143e0ac7 Merge pull request #2342 from lonvia/icu-tokenizer-ci
Add BDD tests with icu tokenizer to CI runs
2021-05-22 10:36:35 +02:00
Sarah Hoffmann
8f3429939f CI: run BDD tests with legacy_icu tokenizer 2021-05-21 23:18:45 +02:00
Sarah Hoffmann
00094c43d1 enable Tiger BDD API test for legacy_icu 2021-05-21 22:39:56 +02:00
Sarah Hoffmann
8bf15fa691 Merge pull request #2341 from lonvia/cleanup-python-tests
Cleanup and linting of python tests
2021-05-20 17:30:30 +02:00
Sarah Hoffmann
63dc503b8d Merge pull request #2337 from mogita/fix/invalid-query-string
fix: add the missing question mark
2021-05-20 10:26:23 +02:00
Sarah Hoffmann
430c316e45 test: fix linting errors 2021-05-19 23:07:39 +02:00
Sarah Hoffmann
01f5a9ff84 test: more use of table_factory 2021-05-19 17:37:03 +02:00
Sarah Hoffmann
af52eed0dd test: avoid use of tempfile module
Use the tmp_path fixture instead which provides automatic
cleanup.
2021-05-19 16:43:26 +02:00
Sarah Hoffmann
f93d0fa957 test: use src_dir fixture instead of self-computed paths 2021-05-19 16:03:54 +02:00
Sarah Hoffmann
c06a1d007a test: replace raw execute() with fixture code where possible 2021-05-19 12:11:04 +02:00
Sarah Hoffmann
65bd749918 test: use table_rows() and execute_values() where possible
Some uses of scalar() could also be replaced with convenience
functions from the word table mock.
2021-05-19 10:51:10 +02:00
Sarah Hoffmann
510eb53f53 test: move Testingcursor into separate class
Also adds more convenience functions: counting with a where
statement and a wrapper to execute_values().
2021-05-19 10:30:36 +02:00
mogita
507543a482 fix: add the missing question mark 2021-05-19 13:35:15 +08:00
Sarah Hoffmann
16bb007135 Merge pull request #2336 from lonvia/do-not-mask-error-when-loading-tokenizer
Do not hide errors when importing tokenizer
2021-05-18 23:00:10 +02:00
Sarah Hoffmann
1ffb6bd5d0 Merge pull request #2321 from AntoJvlt/csv-import-special-phrases
CSV import for special phrases and loader refactoring
2021-05-18 22:58:25 +02:00
AntoJvlt
799a4c9ab6 Documentation update and small code fixes 2021-05-18 22:35:21 +02:00
Sarah Hoffmann
b2722650d4 do not hide errors when importing tokenizer
Explicitly check for the tokenizer source file to check that
the name is correct. We can't use the import error for that
because it hides other import errors like a missing
library.

Fixes #2327.
2021-05-18 16:28:21 +02:00
Sarah Hoffmann
54b06d7abc Merge pull request #2332 from lonvia/fix-keyword-details
Always use object type for details keywords
2021-05-18 11:30:58 +02:00
Sarah Hoffmann
fef1bbb1a7 always use object type for details keywords
When name and address is empty, the keywords field in the response
of the details API would be an array because that is what PHP's
json_encode defaults to with empty array(). This default can only
be changed globally per json_encode call and that might cause
unintended colleteral damage. Work around the issue by making
name and address an empty array instead of keywords.

Fixes #2329.
2021-05-17 16:36:32 +02:00
AntoJvlt
3206bf59df Resolve conflicts 2021-05-17 13:52:35 +02:00
AntoJvlt
a33f2c0f5b Special phrases documentation updated 2021-05-17 13:25:16 +02:00
AntoJvlt
8b8dfc46eb Added --no-replace command for special phrases importation and added corresponding tests 2021-05-17 13:25:06 +02:00
AntoJvlt
06aab389ed Code cleaning and SPLoader deleted 2021-05-16 16:59:12 +02:00
AntoJvlt
fb0ebb5bf0 Add tests for the new SPWikiLoader and SPCsvLoader 2021-05-16 16:10:06 +02:00
Sarah Hoffmann
925726222f Merge pull request #2323 from darkshredder/disable-search-reverse-only
Feat: Disabled search API for --reverse-only imports
2021-05-14 10:40:22 +02:00
Sarah Hoffmann
550e7edb64 Merge pull request #2328 from lonvia/convert-tiger-to-csv
Switch external Tiger data to CSV format
2021-05-14 09:58:50 +02:00
Sarah Hoffmann
2992dea5c8 install default settings for legacy_icu tokenizer 2021-05-14 09:44:10 +02:00
Sarah Hoffmann
e76e4bd964 adapt documentation to use Tiger CSV dump 2021-05-14 00:02:50 +02:00
Sarah Hoffmann
7d621389ee adapt tests to new TIGER CSV format 2021-05-14 00:02:50 +02:00
Sarah Hoffmann
35efe3b41c use tokenizer during Tiger data import
This also changes the required import format to CSV.
2021-05-14 00:02:50 +02:00
Darkshredder
e5ffc59cd5 feat: Added reverse-only-search validation 2021-05-14 02:36:21 +05:30
Sarah Hoffmann
d7f9d2bde9 Merge pull request #2326 from lonvia/wokerpool-for-tiger-data
Use WorkerPool when importing Tiger data
2021-05-13 22:09:56 +02:00
Sarah Hoffmann
5feece64c1 use WorkerPool for Tiger data import
Requires adding an option that SQL errors are ignored.
2021-05-13 20:36:50 +02:00
Sarah Hoffmann
b9a09129fa move WorkerPool into db module
The pool is independent of the indexer and may also be used
by other parts of the software.
2021-05-13 17:11:17 +02:00
Sarah Hoffmann
96e6bbe3a1 Merge pull request #2325 from lonvia/do-not-precompute-postcodes
Do not preload postcodes in the legacy tokenizer
2021-05-13 17:00:29 +02:00
Frederik Ramm
fe39185894 Add array_key_last function for PHP <7.3
This patch adds an array_key_last function if it doesn't yet exist, fixes #2316. It is tested on PHP 7.2.24 but not PHP 7.3.
2021-05-13 16:42:22 +02:00
Sarah Hoffmann
fc860787dd do not preload postcodes
This is too expensive for updates.
2021-05-13 16:14:12 +02:00
Sarah Hoffmann
63e35574d4 Merge pull request #2324 from lonvia/generic-external-postcodes
Rework postcode handling and generalised external postcode support
2021-05-13 14:52:19 +02:00
Sarah Hoffmann
db2dbf15f7 fix token_info migration
A bad indent meant that only one table received the new column.
2021-05-13 14:31:41 +02:00
Sarah Hoffmann
f5977dac75 ignore invalid coordinates in external postcodes 2021-05-13 14:15:42 +02:00
Sarah Hoffmann
8f2746fe24 ignore entries without country code 2021-05-13 14:15:42 +02:00
Sarah Hoffmann
41b9bc9984 add documentation for external postcode feature 2021-05-13 14:15:42 +02:00
Sarah Hoffmann
1ccd4360b4 correctly handle removing all postcodes for country 2021-05-13 14:15:42 +02:00
Sarah Hoffmann
bf864b2c54 index postcodes after refreshing 2021-05-13 14:15:42 +02:00
Sarah Hoffmann
4abaf71234 add and extend tests for new postcode handling 2021-05-13 14:15:42 +02:00
Sarah Hoffmann
a4aba23a83 move filling of postcode table to python
The Python code now takes care of reading postcodes from placex,
enhancing them with potentially existing external postcodes and
updating location_postcodes accordingly. The initial setup and
updates use exactly the same function.

External postcode handling has been generalized. External postcodes
for any country are now accepted. The format of the external postcode
file has changed. We now expect CSV, potentially gzipped. The
postcodes are no longer saved in the database.
2021-05-13 14:15:42 +02:00
Sarah Hoffmann
cae0cf3546 Merge pull request #2322 from mtmail/type-label-already-lowercased
typelabel value is already lowercased
2021-05-12 20:25:22 +02:00
marc tobias
38f9e18afb typelabel value is already lowercased 2021-05-12 19:16:51 +02:00
AntoJvlt
9d83da830f Introduction of SPCsvLoader to load special phrases from a csv file 2021-05-10 23:26:39 +02:00
AntoJvlt
00959fac57 Refactoring loading of external special phrases and importation process by introducing SPLoader and SPWikiLoader 2021-05-10 21:49:31 +02:00
Sarah Hoffmann
40cb17d299 Merge pull request #2314 from lonvia/fix-status-no-import-date
Correctly catch the exception when import date is missing
2021-05-06 17:41:53 +02:00
Sarah Hoffmann
2ae293aeb6 Merge pull request #2312 from lonvia/icu-tokenizer
Add new tokenizer based on libICU
2021-05-06 17:22:04 +02:00
Sarah Hoffmann
d8ead78e03 correctly catch the exception when import date is missing 2021-05-06 16:27:42 +02:00
Sarah Hoffmann
b2c6eca2c8 add missing transliterations
The ICU library only offers transliterations for a limited set of
script. Add transliterations for missing scripts from the PostgreSQL
module. These means that the same selection of scripts is supported
as with the old module.
2021-05-05 21:16:55 +02:00
Sarah Hoffmann
872ab91421 fix name of transliterator
Should be different from the normalisation rules.
2021-05-05 17:09:38 +02:00
Sarah Hoffmann
a263e54b94 enable BDD tests for different tokenizers
The tokenizer to be used can be choosen with -DTOKENIZER.

Adapt all tests, so that they work with legacy_icu tokenizer.
Move lookup in word table to a function in the tokenizer.
Special phrases are temporarily imported from the wiki until
we have an implementation that can import from file. TIGER
tests do not work yet.
2021-05-05 10:31:51 +02:00
Sarah Hoffmann
18c99a5c5f add unit tests for legacy ICU tokenizer 2021-05-05 10:15:27 +02:00
Sarah Hoffmann
d55fc39275 cache translieration results 2021-05-05 10:15:27 +02:00
Sarah Hoffmann
ba8ed7967d add PHP part for new ICU-base tokenizer 2021-05-05 10:15:27 +02:00
Sarah Hoffmann
f44af49df9 add Python part for new ICU-based tokenizer 2021-05-05 10:15:27 +02:00
Sarah Hoffmann
3c67bae868 Merge pull request #2310 from RhinoDevel/master
2nd try: Add hint about replication update & recheck intervals being in seconds.
2021-05-04 12:45:26 +02:00
Marc
3dade534fd Add hint about replication update & recheck intervals being in seconds. 2021-05-04 11:47:15 +02:00
Sarah Hoffmann
8b1a509442 Merge pull request #2305 from lonvia/tokenizer
Factor out normalization into a separate module
2021-05-03 09:15:34 +02:00
Sarah Hoffmann
8bdb9aa607 mock tokenizer factory for replication tests 2021-05-01 10:50:39 +02:00
Sarah Hoffmann
36c624ec71 commit between migrations
Later migrations may require tables set up by older ones.
2021-05-01 10:47:35 +02:00
Sarah Hoffmann
7fd871a74d increase database version for tokenizer migration 2021-05-01 10:47:35 +02:00
Sarah Hoffmann
ced8f0f4a2 fix liniting issues 2021-04-30 17:59:50 +02:00
Sarah Hoffmann
388ebcbae2 move index creation for word table to tokenizer
This introduces a finalization routing for the tokenizer
where it can post-process the import if necessary.
2021-04-30 17:41:08 +02:00
Sarah Hoffmann
20891abe1c indexer: fetch extra place data asynchronously
The indexer now fetches any extra data besides the place_id
asynchronously while processing the places from the last batch.
This also means that more places are now fetched at once.
2021-04-30 17:41:08 +02:00
Sarah Hoffmann
6ce6f62b8e fetch place info asynchronously 2021-04-30 17:41:08 +02:00
Sarah Hoffmann
602728895e indexer: fetch ids in batches 2021-04-30 17:41:08 +02:00
Sarah Hoffmann
fc995ea6b9 move database check for module to tokenizer 2021-04-30 17:41:08 +02:00
Sarah Hoffmann
be6262c6ce move status test to tokenizer
The availability of the module is now tested by the tokenizer.
2021-04-30 17:41:08 +02:00
Sarah Hoffmann
893490f94e add more tests for legacy tokenizer 2021-04-30 17:41:08 +02:00
Sarah Hoffmann
044bb6afa5 move tokenization in query into tokenizer 2021-04-30 17:41:08 +02:00
Sarah Hoffmann
3eb4d88057 boilerplate for PHP code of tokenizer
This adds an installation step for PHP code for the tokenizer. The
PHP code is split in two parts. The updateable code is found in
lib-php. The tokenizer installs an additional script in the
project directory which then includes the code from lib-php and
defines all settings that are static to the database. The website
code then always includes the PHP from the project directory.
2021-04-30 11:31:52 +02:00
Sarah Hoffmann
23fd1d032a tests for legacy tokenizer 2021-04-30 11:30:51 +02:00
Sarah Hoffmann
7cb7cf848d move amenity creation to tokenizer
The BDD tests still use the old-style amenity creation scripts
because we don't have simple means to import a hand-crafted
test file of special phrases right now.
2021-04-30 11:30:51 +02:00
Sarah Hoffmann
bef300305e move default country name creation to tokenizer
The new function is also used, when a country us updated. All SQL
function related to country names have been removed.
2021-04-30 11:30:51 +02:00
Sarah Hoffmann
dc700c25b6 cache all postcodes 2021-04-30 11:30:51 +02:00
Sarah Hoffmann
0ba93e5ba9 reorganise address iteration in tokenizer 2021-04-30 11:30:51 +02:00
Sarah Hoffmann
0da481f207 remove debug code 2021-04-30 11:30:51 +02:00
Sarah Hoffmann
d75a235c1f use address tokens in SQL 2021-04-30 11:30:51 +02:00
Sarah Hoffmann
9e92759ac7 extract address tokens in tokenizer 2021-04-30 11:30:51 +02:00
Sarah Hoffmann
ffc2d82b0e move postcode normalization into tokenizer 2021-04-30 11:30:51 +02:00
Sarah Hoffmann
d8ed1bfc60 move houseunumber handling to tokenizer
Normalization and token computation are now done in the tokenizer.
The tokenizer keeps a cache to the hundred most used house numbers
to keep the numbers of calls to the database low.
2021-04-30 11:30:51 +02:00
Sarah Hoffmann
d711f5a81e move name token creation into tokenizer
Name tokens are now handed in via token_info and used from there.

Also moves the generic search name insertion function back to
placex_triggers.sql.
2021-04-30 11:30:51 +02:00
Sarah Hoffmann
fa2bc60468 introduce name analyzer
The name analyzer is the actual work horse of the tokenizer. It
is instantiated on a thread-base and provides all functions for
analysing names and queries.
2021-04-30 11:30:51 +02:00
Sarah Hoffmann
e1c5673ac3 require tokeinzer for indexer 2021-04-30 11:30:51 +02:00
Sarah Hoffmann
1b1ed820c3 introduce index for finding surrounding buildings 2021-04-30 11:30:51 +02:00
Sarah Hoffmann
a73711f3cd add extra column for tokenizer
Add a jsonb column to the placex and location_property_osmline tables
which can be used by the installed tokenizer as required. No other
part of the software will use or otherwise rely on this column.
2021-04-30 11:30:51 +02:00
Sarah Hoffmann
9397bf54b8 introduce external processing in indexer
Indexing is now split into three parts: first a preparation step
that collects the necessary information from the database and
returns it to Python. In a second step the data is transformed
within Python as necessary and then returned to the database
through the usual UPDATE which now not only sets the indexed_status
but also other fields. The third step comprises the address
computation which is still done inside the update trigger in
the database.

The second processing step doesn't do anything useful yet.
2021-04-30 11:30:51 +02:00
Sarah Hoffmann
fbbdd31399 move word table and normalisation SQL into tokenizer
Creating and populating the word table is now the responsibility
of the tokenizer.

The get_maxwordfreq() function has been replaced with a
simple template parameter to the SQL during function installation.
The number is taken from the parameter list in the database to
ensure that it is not changed after installation.
2021-04-30 11:30:51 +02:00
Sarah Hoffmann
b5540dc35c add migration for configurable tokenizer
Adds a migration that initialises a legacy tokenizer for
an existing database. The migration is not active yet as
it will need completion when more functionality is added
to the legacy tokenizer.
2021-04-30 11:29:57 +02:00
Sarah Hoffmann
296a66558f move module installation to legacy tokenizer 2021-04-30 11:29:57 +02:00
Sarah Hoffmann
af968d4903 introduce tokenizer modules
This adds the boilerplate for selecting configurable tokenizers.
A tokenizer can be chosen at import time and will then install
itself such that it is fixed for the given database import even
when the software itself is updated.

The legacy tokenizer implements Nominatim's traditional algorithms.
2021-04-30 11:29:57 +02:00
Sarah Hoffmann
5c7b9ef909 Merge pull request #2303 from lonvia/remove-aux-support
Remove support for AUX housenumber tables
2021-04-30 11:19:35 +02:00
Sarah Hoffmann
185d369404 remove support for AUX housenumber tables
These tables have never been actively maintained and the code is
completely untested. With the upcomming changes, it is unlikely
that the code remains usable.

This removes the aux tables and all code that references them.
2021-04-30 10:08:29 +02:00
Sarah Hoffmann
51d20b19b6 Merge pull request #2299 from lonvia/update-actions
Fix database check for reverse-only
2021-04-27 12:18:45 +02:00
Sarah Hoffmann
46e8c6b112 Merge pull request #2291 from AntoJvlt/special-phrases-statistics
Special phrases statistics
2021-04-27 11:57:05 +02:00
Sarah Hoffmann
c8fb25201a do not check for extra housenumber index for reverse-only
Also adds a database check for reverse only import to the CI.
2021-04-27 10:14:26 +02:00
Sarah Hoffmann
1fd483643b add tests for different scripts 2021-04-26 23:01:06 +02:00
Sarah Hoffmann
a21a0864f1 Merge pull request #2298 from lonvia/add-warming-to-ci
Add warming to CI import tests and fix more Python 3.5 compatibility issues
2021-04-26 11:21:44 +02:00
Sarah Hoffmann
4457bf7528 avoid Path in subprocess parameters
Not supported by Python 3.5.
2021-04-26 10:55:23 +02:00
Sarah Hoffmann
5ed6f18d83 add warming to CI import test 2021-04-26 09:54:09 +02:00
AntoJvlt
abb3d56b20 Switching to log info and only send warning for invalid phrases 2021-04-25 17:57:43 +02:00
AntoJvlt
c5ecb9bae0 Implemented statistics for the import of special phrases through the SpecialPhrasesImporterStatistics class 2021-04-25 17:57:43 +02:00
AntoJvlt
1b68152fb2 reorganization of folder/file for the special phrases importer 2021-04-25 17:57:42 +02:00
Sarah Hoffmann
6812f397af Merge pull request #2297 from lonvia/update-deployment-docs
docs: update deployment to use project directory
2021-04-24 15:35:00 +02:00
Sarah Hoffmann
68bd9c6091 Merge pull request #2296 from lonvia/disable-too-few-public-methods-check
pylint: disable too-few-public-methods check
2021-04-24 15:03:28 +02:00
Sarah Hoffmann
754f9e3a20 docs: update deployment to use project directory
Fixes #2295.
2021-04-24 15:00:46 +02:00
Sarah Hoffmann
b951b11336 fix pylint complaints 2021-04-24 11:59:32 +02:00
Sarah Hoffmann
89c90bedb9 pylint: disable check too-few-public-methods 2021-04-24 11:39:44 +02:00
Sarah Hoffmann
b4fe7d7c7d Merge pull request #2293 from darkshredder/update-manpage
Updated manual page
2021-04-24 09:20:28 +02:00
Sarah Hoffmann
5071710db7 Merge pull request #2294 from lonvia/update-actions
CI: add import test against Python 3.5 and fix discovered issues
2021-04-23 23:33:15 +02:00
Sarah Hoffmann
9faaf3fc88 actions: add import on ubuntu 18.04
This uses oldest possible dependencies where possible.
2021-04-23 22:50:08 +02:00
Sarah Hoffmann
9c51c133f7 indexes with includes are not available for postgresql < 11 2021-04-23 22:50:08 +02:00
Sarah Hoffmann
91d2fb6b1c use group() for regex matches
Needed for compatibility with Python 3.5.
2021-04-23 22:50:08 +02:00
Sarah Hoffmann
280406c0d7 use pathlib version of open 2021-04-23 22:50:08 +02:00
Sarah Hoffmann
d5fc3b5e99 subprocess needs string argument
Compatibility change for Python 3.5.
2021-04-23 22:50:08 +02:00
Sarah Hoffmann
f8f8c7e534 check for existance of custom .env before opening 2021-04-23 22:50:08 +02:00
Sarah Hoffmann
3a642d50a4 use more generic ImportError to check for module
ModuleNotFoundError was only introduced in Python 3.6.
2021-04-23 22:50:08 +02:00
Sarah Hoffmann
9685c68e30 replace usages of fromisoformat() with strptime()
fromisoformat was only introduced with Python 3.7 while we
still support Python 3.5.

Fixes #2292.
2021-04-23 22:50:08 +02:00
Sarah Hoffmann
95e6ec091b remove argparse dependency for vagrant scripts
Users don't need to recreate the manpage.
2021-04-23 22:50:08 +02:00
Darkshredder
34f5e4a199 Updated manual page 2021-04-24 01:42:38 +05:30
Sarah Hoffmann
788baafa26 bdd tests: fix place dependen ranking tests
The ranks of places may differ for some countries. Force the
place nodes in the test on null island which always uses the
default ranking.
2021-04-22 17:31:00 +02:00
Sarah Hoffmann
4c31813398 Merge pull request #2288 from RhinoDevel/patch-1
Replace "nominatim-update" with "nominatim".
2021-04-22 17:12:25 +02:00
RhinoDevel
b7bae80616 Replace "nominatim-update" with "nominatim".
If I am not mistaken, the correct command to index imported data via commandline is "nominatim index".
2021-04-22 15:40:22 +02:00
Sarah Hoffmann
f7e4aa51d3 indexer: reset query counter
Reset the counter for queries after the asynchronous connections
have been reopened.
2021-04-21 10:33:45 +02:00
Sarah Hoffmann
696c50459f Merge pull request #2285 from lonvia/split-indexer-code
Rework indexer code
2021-04-20 15:34:14 +02:00
Sarah Hoffmann
50b6d7298c factor out async connection handling into separate class
Also adds a test for reconnecting regularly while indexing.
2021-04-20 14:08:37 +02:00
Sarah Hoffmann
26a81654a8 indexer: make self.conn function-local
Also switches to our internal connect function which gives us
a cursor with a sclar() function.
2021-04-20 14:08:37 +02:00
Sarah Hoffmann
6430371d7d make index() function private 2021-04-20 14:08:37 +02:00
Sarah Hoffmann
18705b3f18 move analyse function into indexinf function 2021-04-20 14:08:37 +02:00
Sarah Hoffmann
c6bd2bb7fb indexer: move runner into separate file 2021-04-20 14:08:37 +02:00
Sarah Hoffmann
c4fd94bd1a Merge pull request #2284 from lonvia/cleanup-word-frequency-computation
Rename and simplify function for word pre-computation
2021-04-19 18:28:04 +02:00
Sarah Hoffmann
b88b952f56 simplify token precomputation
Rename function to reflect that it is only used for precomputation.
The token IDs are not really needed, so don't bother to compute
the array of tokens.
2021-04-19 17:24:19 +02:00
Sarah Hoffmann
d68b02d36a remove unused word recomputation script
Has been replaced by a script recomputing counts from search_name.
2021-04-19 16:40:57 +02:00
Sarah Hoffmann
b9b85eb208 Merge pull request #2283 from darkshredder/tiger-data-test-fix
Fix: tiger-data tarfile test
2021-04-19 13:56:36 +02:00
Darkshredder
1f898405a6 Fix: tiger-data tarfile test 2021-04-19 16:02:52 +05:30
Sarah Hoffmann
6f6910101e Merge pull request #2282 from lonvia/add-paths-to-config
Include software paths in Python config object
2021-04-19 12:14:25 +02:00
Sarah Hoffmann
79d55357e8 simplify sql and website creation functions 2021-04-19 10:53:30 +02:00
Sarah Hoffmann
4fa6c0ad53 simplify constructor for SQL preprocessor
Use sql path from config.
2021-04-19 10:26:25 +02:00
Sarah Hoffmann
8f63f9516b simplify interface for adding tiger data
Also simplifies tests using existing fixtures.
2021-04-19 10:26:25 +02:00
Sarah Hoffmann
995ba2c7c2 add library directories to config
Allows to reduce the number of parameters in functions that take
the config anyway.
2021-04-19 10:26:25 +02:00
Sarah Hoffmann
830e3be1e6 Merge pull request #2281 from changpingc/changping/fix-tiger-index
fix index on location_property_tiger (parent_place_id)
2021-04-19 08:42:59 +02:00
Channgping Chen
29a314a092 fix index on location_property_tiger (parent_place_id)
Looks like 2af82975cd
accidentally renamed an index. Because of the added "if not
exists" clause, the index doesn't get created. This
significantly slows down reverse queries because they now
require full scans on location_property_tiger.

Without this fix, reverse queries can take 8s on a full
planet install on an r5.8xlarge instance in EC2.
2021-04-19 00:33:15 +00:00
Sarah Hoffmann
abdba5fdc7 Merge pull request #2280 from AntoJvlt/Fix-special-phrases-import-and-tests-cleaning
Fix regex and sanity check for the import of special phrases and tests cleaning.
2021-04-18 11:57:19 +02:00
AntoJvlt
b2ae715699 Only log a warning if a wrong input is detected on the wiki while importing special phrases 2021-04-17 20:19:39 +02:00
AntoJvlt
a95c748363 Fix occurence regex 2021-04-17 19:24:13 +02:00
AntoJvlt
ec859e41c6 Cleaned tests and add database cleaning tests on test_import_from_wiki 2021-04-17 19:23:33 +02:00
Sarah Hoffmann
7aeae9da81 Merge pull request #2279 from lonvia/add-index-for-continued-indexing
Add index for continued indexing
2021-04-17 11:51:21 +02:00
Sarah Hoffmann
2ca11ccc6b add tests for continuing import 2021-04-17 11:10:36 +02:00
Sarah Hoffmann
d74ae669e3 add support index when continuing import at index phase
Indexing scans the placex table sequentially during indexing
on the initial import. That is okay because we know that all
rows need to be processed anywhere. When continuing the import,
however, a large part might already be indexed, so that the
process spends a lot of time going through rows that are no
longer of interest. Create a supporting index for all unindexed
rows to speed up the scan. This is the same index as used later
for updates.
2021-04-17 11:07:04 +02:00
Sarah Hoffmann
9fabc5572d Merge pull request #2278 from lonvia/remove-transistion-functions
Remove transition functions
2021-04-17 10:13:33 +02:00
Sarah Hoffmann
da98a2102a remove transition functions from Python 2021-04-16 18:41:14 +02:00
Sarah Hoffmann
fb3353b854 Merge pull request #2277 from lonvia/update-osm2pgsql
Update osm2pgsql to current master
2021-04-16 17:40:43 +02:00
Sarah Hoffmann
b7e5c54593 remove PHP code for transition functions 2021-04-16 17:28:51 +02:00
Sarah Hoffmann
68beec5590 remove installation of PHP util scripts 2021-04-16 17:09:40 +02:00
Sarah Hoffmann
6ba06d1eb4 Merge pull request #2276 from lonvia/port-country-code-creation-to-python
Port country code creation to python
2021-04-16 16:57:04 +02:00
Sarah Hoffmann
0f11e311c4 add test for new postcode import function 2021-04-16 16:11:20 +02:00
Sarah Hoffmann
886a01c796 port function to compute initial postcodes to Python 2021-04-16 16:11:20 +02:00
Sarah Hoffmann
a632b9f86a Merge pull request #2275 from lonvia/switch-to-absolute-imports
Use absolute imports in Python code
2021-04-16 15:04:10 +02:00
Sarah Hoffmann
76b1885595 use absolute imports in Python code
Relative imports are no longer officially recommended.
2021-04-16 14:20:09 +02:00
Sarah Hoffmann
c55b409cf6 update osm2pgsql to current master (fixes version output) 2021-04-15 10:24:01 +02:00
Sarah Hoffmann
c64193f839 Merge pull request #2263 from AntoJvlt/special-phrases-autoupdate
Implemented auto update of special phrases while importing them
2021-04-15 10:13:25 +02:00
Sarah Hoffmann
28a2a795ba Merge pull request #2270 from lonvia/simplify-place-boundary-merge
Simplify matching between place and boundary names
2021-04-15 10:12:53 +02:00
Sarah Hoffmann
e90adfc7c3 adapt database check to new index layout 2021-04-14 17:52:59 +02:00
Sarah Hoffmann
16267dc021 add migration for new placenode geometry index 2021-04-14 17:52:59 +02:00
Sarah Hoffmann
e7266b52ae simplify name matching between boundary and place node
Instead of normalising the names simply compare them in lower
case. This removes the dependency on the tokenizer for
linking boundaries and nodes. When looking up the linked places
by place type also allow that one name is simply contained in the
other. This catches the frequent case where one of the names has
an addendum (e.g. Newport vs. City of Newport).

Drops the special index for the name lookup and insted relies
on a slightly extended version of the geometry index used for
reverse lookup. Saves around 100MB on a planet.
2021-04-14 17:52:59 +02:00
Sarah Hoffmann
dc02610408 Merge pull request #2269 from lonvia/fix-actions
github actions: reintroduce postgresql repo
2021-04-14 17:50:02 +02:00
Sarah Hoffmann
dc1bfe4a93 github actions: reintroduce postgresql repo 2021-04-14 17:25:44 +02:00
Sarah Hoffmann
cf69daaafb Merge pull request #2264 from darkshredder/tiger-data-tests
Fix:  Error if last statements is wrong and improved tests in tiger data import
2021-04-14 10:56:12 +02:00
Darkshredder
49ee7505ed Fix: Removed error if endstatement is wrong and improved tests 2021-04-13 15:44:12 +05:30
AntoJvlt
ae2b2cb9a5 Tests added for the auto update of special phrases during import 2021-04-12 14:35:29 +02:00
AntoJvlt
8c2f287ce4 Implemented auto update of special phrases while importing them 2021-04-12 14:30:48 +02:00
Sarah Hoffmann
2351f36315 Merge pull request #2260 from AntoJvlt/fix-load-languages-special-phrases
Fix default languages loading for special phrases import
2021-04-11 23:09:45 +02:00
AntoJvlt
5ecae10713 Fix default languages loading 2021-04-11 22:26:31 +02:00
Sarah Hoffmann
2e3d657794 Merge pull request #2258 from darkshredder/code-coverage
Disabled Code coverage status checks
2021-04-10 21:19:55 +02:00
Darkshredder
90f990b806 CodeCov comment only when codecoverage changes 2021-04-10 22:28:29 +05:30
Darkshredder
7666d48409 Disabled Coverage status checks 2021-04-10 20:44:52 +05:30
Sarah Hoffmann
be4cb190e8 add badge for codecov 2021-04-10 16:57:39 +02:00
Sarah Hoffmann
2f4eca8c46 Merge pull request #2252 from darkshredder/code-coverage
Added Code coverage support using Codecov
2021-04-10 16:37:12 +02:00
Sarah Hoffmann
71564fa1de split LANGUAGES parameter before use
The user supplies the languages as a comma-separated list.
2021-04-09 17:48:28 +02:00
Sarah Hoffmann
ce08cb6cd7 add migration information for new configuration format 2021-04-08 11:01:46 +02:00
Sarah Hoffmann
1f0cf6311a Merge pull request #2256 from lonvia/remove-reverseinplan-option
Remove ReverseInPlan option
2021-04-08 10:54:16 +02:00
Sarah Hoffmann
1db468b6c3 remove special handling for reversed queries in getGroupedSearches
getGroupedSearches is guaranteed not to be called with reversed
structured queries, so there is no need to have special exclusion
code.
2021-04-08 10:35:14 +02:00
Sarah Hoffmann
534de5ba81 remove reverseInPlan option from Geocode
Disabling query reversal is no longer possible in the configuration,
so there is no need to keep this as an option. Reversal is
automatically disabled for structured search only.
2021-04-08 10:19:27 +02:00
Sarah Hoffmann
492186716f prepare 3.7.0 release 2021-04-06 21:23:29 +02:00
Sarah Hoffmann
07fda48cee docs: minor spelling corrections 2021-04-06 16:09:53 +02:00
Sarah Hoffmann
4b31be5203 docs: unpacking tiger data is no longer necessary 2021-04-06 15:56:08 +02:00
Sarah Hoffmann
5d69c7ade1 Merge pull request #2250 from lonvia/save-transliterated-housenumbers
Switch to saving transliterated housenumbers in placex
2021-04-05 15:48:22 +02:00
Darkshredder
2bfea15fdc Fixed BDD tests coverage reports 2021-04-05 06:30:31 +05:30
Sarah Hoffmann
96b0699621 add migration for transliterated housenumbers 2021-04-04 15:26:47 +02:00
Sarah Hoffmann
6cbef84cad use new transliteration in initial housenumber word computation
The new create_housenumber_id() function splits housenumber
lists correctly. Otherwise there is no difference.
2021-04-04 15:26:47 +02:00
Sarah Hoffmann
55fcc44c8c correctly handle housenumber lists
Lists are now standardised to use a semicolon separator.
2021-04-04 15:26:47 +02:00
Sarah Hoffmann
16a66b5326 move transliteration of housenumbers into indexing
Housenumbers are now saved in transliterated form in the housenumber
column. This saves the transliteration step during lookup.
2021-04-04 15:26:47 +02:00
Sarah Hoffmann
3590e76a1c tests for finding non-ascii housenumbers 2021-04-04 15:26:47 +02:00
Sarah Hoffmann
0ec3fdd3ba return housenumbers always from address field
This means that we can use normalized versions of the
housenumber in the housenumber field as it is no longer
a user visible field.
2021-04-04 15:26:47 +02:00
Sarah Hoffmann
c0f0b66509 Merge pull request #2248 from darkshredder/special-term-test
Added Test for TokenSpecialTerm
2021-04-03 18:31:01 +02:00
Darkshredder
0f9df32d11 Added Test for TokenSpecialTerm 2021-04-02 04:49:05 +05:30
Sarah Hoffmann
a370c8be4b Merge pull request #2247 from lonvia/index-for-housenumber-lookup
Index for housenumber lookup
2021-04-01 18:35:00 +02:00
Sarah Hoffmann
d6e0bc698e add recommendation for Postgresql 11+ 2021-04-01 17:10:44 +02:00
Sarah Hoffmann
8d8b1d4307 use non-key index to speed up housenumber search
On Postgresql versions 11+ add an index to speed up the lookup
of housenumbers for terms found in search_name. This is really
just a band-aid around the query planer's interpretation of the
query.
2021-04-01 17:10:44 +02:00
Darkshredder
771b3377c0 Added code-cov Support for Code Coverage 2021-03-31 05:00:03 +05:30
Sarah Hoffmann
8dbfdd59b0 Merge pull request #2243 from darkshredder/XML-format-fix
Fixed: XML format: more_url points to localhost, not base URL
2021-03-30 09:19:01 +02:00
Sarah Hoffmann
cd03882536 Merge pull request #2244 from AntoJvlt/import-special-phrases-tests-cleaning
Cleaned tests for special phrases.
2021-03-30 09:17:27 +02:00
Darkshredder
0b154a2a1a Added HTTP_HOST to if statement 2021-03-30 03:02:55 +05:30
AntoJvlt
e82de99e5a Cleaned tests of exceptions and fix phrase_settings.json test file name. 2021-03-29 22:07:29 +02:00
Darkshredder
27b379c1e3 fixed: XML format: more_url points to localhost, not base URL 2021-03-30 01:02:43 +05:30
Sarah Hoffmann
f9517e9143 Merge pull request #2234 from darkshredder/add-man-page
Added Manual page for Nominatim tool
2021-03-29 14:25:10 +02:00
Sarah Hoffmann
e05dee6df5 allow sorting by housenumbers for rare street names
Usually we don't narrow down search results by house number when
only a street name is given because there may be a lot of rows
to cross check when the street name is very frequent. However,
when it is known to be rare, the housenumber check may be done
anyway.

Fixes #2238.
2021-03-29 12:06:51 +02:00
Darkshredder
3fad492c6f Update manpage after rebase 2021-03-29 14:27:06 +05:30
Darkshredder
b7d6ae93e3 Nominatim/cli.py rebase fixes 2021-03-29 14:16:41 +05:30
Darkshredder
21b1b75b08 Rebase with master 2021-03-29 14:00:45 +05:30
Darkshredder
bbe0353b23 fixed indentation and used sed to remove AUTHORS section 2021-03-29 13:57:13 +05:30
Darkshredder
51e2654cd2 Added Manual page and fixed documentation 2021-03-29 13:57:13 +05:30
Sarah Hoffmann
09b2510219 Merge pull request #2228 from AntoJvlt/import-special-phrases-porting-python
Import special phrases porting python
2021-03-29 09:49:35 +02:00
AntoJvlt
57ce75eb67 Change command 'import-special-phrases --from-wiki' to 'special-phrases --import-from-wiki'. 2021-03-26 02:22:38 +01:00
AntoJvlt
cde9389e75 Errors fixes, Cleaning code, Improvement and addition of tests 2021-03-26 01:53:33 +01:00
AntoJvlt
2c19bd5ea3 Encapsulation of tools/special_phrases.py into SpecialPhrasesImporter class and add new tests. 2021-03-25 21:13:57 +01:00
AntoJvlt
ff34198569 Code cleaning, tests simplification and use of python3-icu package 2021-03-23 23:56:39 +01:00
AntoJvlt
919469c8fe Updated documentation for PyICU support 2021-03-23 23:34:19 +01:00
AntoJvlt
1ce8b530cd Introduction of PyICU for transliteration in python. Reversed changes in normalization.sql. 2021-03-23 23:34:16 +01:00
AntoJvlt
2fb6018078 Added wrapper in specialphrases.php to call corresponding nominatim command. 2021-03-23 23:30:42 +01:00
AntoJvlt
6d56cbb3e8 Changed phrase_settings.py to phrase-settings.json and added migration function for old php settings file. 2021-03-23 23:30:39 +01:00
AntoJvlt
1a93319093 Changed phrase_settings.py to phrase-settings.json and added migration function for old php settings file. 2021-03-23 23:27:56 +01:00
Sarah Hoffmann
28b4fb12b6 Merge pull request #2233 from lonvia/index-for-postcode-ids
Create postcode id index earlier
2021-03-23 09:18:10 +01:00
Sarah Hoffmann
5dabc0aca8 create postcode id index earlier
Now that the indexer takes care of indexing the postcode tables,
the id index is needed to find the rows to index.
2021-03-22 22:24:56 +01:00
Sarah Hoffmann
4f1bdde32e Merge pull request #2231 from mtmail/correct-cli-help-page
nominatim -h was printing wrong text for lookup and details
2021-03-21 16:52:20 +01:00
Sarah Hoffmann
a08ca5b1b5 avoid division by zero in progress meter
On Windows systems the timer may not be accurate enough to measure
the time between init() and done(). Avoid computing statistics with
a diff time of 0 in such cases.

Fixes #2230.
2021-03-21 16:47:22 +01:00
marc tobias
87d5883ddb nominatim -h was priting wrong text for lookup and details 2021-03-21 16:06:41 +01:00
AntoJvlt
d5acade4db Deleted specialphrases.php and phrase_settings.php 2021-03-20 19:48:05 +01:00
AntoJvlt
9d1c23e4f5 Updated specialphrases_testdb.sql 2021-03-20 19:17:03 +01:00
AntoJvlt
17cb59efbd Ported functions for the import of special phrases from php to python.
- the command is now --import-special-phrases
- the output is not an sql file anymore, data are directly imported to the database.
- the little part on the documentation (section data import) has been modified.
2021-03-20 19:11:50 +01:00
Sarah Hoffmann
118befd7d7 bdd tests: make indexing less verbose
Do not print progress info for indexing when there is an error
in the BDD tests.
2021-03-20 10:39:29 +01:00
Sarah Hoffmann
0d9fe6e49c Merge pull request #2219 from lonvia/bdd-test-remove-php
BDD tests: run all setup via nominatim Python library
2021-03-17 11:40:34 +01:00
Sarah Hoffmann
ebae3553e0 bdd: run all setup via nominatim Python library
Drops all calls to PHP utility functions. nominatim cli functions
are used where possible, to stay as close to the final code as
possible with the tests.

By removing the PHP calls, the test code now only uses osm2pgsql and
the database module from the build directory.
2021-03-16 22:20:41 +01:00
Sarah Hoffmann
d3ff831b8a Merge pull request #2216 from lonvia/fix-reverse-interpolation
Reverse: do not prefer interpolations over closer housenumbers
2021-03-15 14:08:54 +01:00
Sarah Hoffmann
4d7c5ec089 reverse: do not prefer interpolations over closer housenumbers
Always look up the closest housenumber before looking up
interpolations. This ensures that closer housenumbers are
preferred over interpolations.

Fixes #2214.
2021-03-15 10:50:04 +01:00
Sarah Hoffmann
81a6b746b8 Merge pull request #2212 from darkshredder/country-name
Ported createCountryNames() to python and Added tests
2021-03-15 09:36:06 +01:00
Darkshredder
f356a75a24 Add setup.php 2021-03-14 15:02:30 +05:30
Sarah Hoffmann
7212fa8630 fix template variable name 2021-03-13 12:05:53 +01:00
Sarah Hoffmann
6cabc44841 Merge pull request #2213 from lonvia/tweak-search-weights
Some more tweaking of the ranking of search interpretations
2021-03-12 15:47:36 +01:00
Darkshredder
b108bd1c1e Linting fix 2021-03-12 18:28:47 +05:30
Darkshredder
077a8c1f95 refactored tests and made changes to code for easy readibility 2021-03-12 18:23:20 +05:30
Darkshredder
7a874d5b97 Ported createCountryNames() to python and added tests 2021-03-12 10:28:41 +05:30
Sarah Hoffmann
9086a794a1 Merge pull request #2204 from darkshredder/tiger-data
Ported tiger-data-import to Python and Added Tarball Support
2021-03-11 22:48:38 +01:00
Sarah Hoffmann
6dd2b9c2ec do not mix partial names with other words
As soon as a housenumber, postcode, etc. appear, the name term
must obviously be closed and no further partial terms can be
appended.
2021-03-11 22:44:49 +01:00
Sarah Hoffmann
3fbe4511f9 make linter happy 2021-03-11 21:14:23 +01:00
Sarah Hoffmann
3933fc3ad3 avoid multi-term partials in names
Names are either full words or single-word partial names.
Searching for multi-word partials yields exactly the same
result as with full words.
2021-03-11 20:42:37 +01:00
Sarah Hoffmann
00b05e2394 higher penalty for special searches
Adds a general higher penalty for special search term and an
additional one if the term is anywhere but the beginning or the
end. Also housenumbers and special searches together are less
likely.
2021-03-11 20:37:51 +01:00
Sarah Hoffmann
d5e8c5e975 do not mix partial and full name terms
If NameNonSearch already contains a partial term, then a
full term must not be added to the Name list anymore.
2021-03-11 20:22:54 +01:00
Sarah Hoffmann
478dfb0639 add one-rank penalty for using partial search
Ensures that full matches are preferred over partial ones even when
the full word consists of only one term.
2021-03-11 17:52:44 +01:00
Sarah Hoffmann
f498e40208 fix result splitting for last search group
When we are in the final iteration of the search groups, it is not
possible to further delay the results. Unconditionally use the
results with the best rank instead.
2021-03-11 17:14:46 +01:00
Sarah Hoffmann
182f5f5d7b give preference to full words in address, too
Full word terms are already preferred for the name part. Adding
only one-word partials to the address, makes it impossible to
give a similar preference for the address part. Each term adds
a rank penalty. The problem here is that we interpret the query
forwards and backwards. Having different penalty systems for
name and address means that the same term ends up with different
penalties and that often leads to interpretations of the wrong
direction being in the way.
2021-03-11 15:03:36 +01:00
Darkshredder
e5719de657 Added fixture for sql_preprocessor and fixed some issues 2021-03-11 15:39:17 +05:30
Darkshredder
8486a83cf5 Added test for tarfile 2021-03-10 18:14:17 +05:30
Darkshredder
ccfad57fca Added test and removed runlegacyscript 2021-03-10 17:18:12 +05:30
Darkshredder
64128b699a fixed linting, refactored threaded sql handling and removed importTigerData() function 2021-03-10 13:28:29 +05:30
Darkshredder
4080fbb95c Test fixes 2021-03-09 01:00:56 +05:30
Darkshredder
14ec83c886 Linting fixes 2021-03-08 23:10:49 +05:30
Darkshredder
122c4618b9 Linting fixes 2021-03-08 22:59:51 +05:30
Darkshredder
2af82975cd Ported tiger-data-import to python and Added Tarball Support 2021-03-08 21:57:56 +05:30
Sarah Hoffmann
35f4695b67 Merge pull request #2200 from lonvia/migrations-for-current-version
Introduce a command for database migration
2021-03-08 10:14:03 +01:00
Sarah Hoffmann
3c9e09545e documentation for new migration command 2021-03-06 16:38:37 +01:00
Sarah Hoffmann
764a41b973 automatic migration from 3.6 release
Adds a 'admin --migrate' command that checks for the current
database version and runs any necessary migrations. Also
has migrations going back to 3.6.
2021-03-06 16:36:57 +01:00
Sarah Hoffmann
9d103503f7 Merge pull request #2197 from lonvia/use-jinja-for-sql-preprocessing
Use jinja2 for SQL preprocessing
2021-03-04 16:36:18 +01:00
Sarah Hoffmann
09f4d767e4 port index creation to python
Also switches to jinja-based preprocessing, which allows to
simplify the SQL files. Use 'if not exists' where possible
so that the step can be rerun to fix missing indexes.
2021-03-04 11:11:47 +01:00
Sarah Hoffmann
dd301cf5ac indexer: ANALYSE must be run outside transactions 2021-03-04 11:06:33 +01:00
Sarah Hoffmann
eacabb0e96 move table creation to jinja-based preprocessing 2021-03-03 22:07:51 +01:00
Sarah Hoffmann
6cda021d9b add new jinja2 requirement 2021-03-03 17:51:08 +01:00
Sarah Hoffmann
d2bd6aa78d introduce jinja2 for preprocessing SQL
Replaces various hand-crafted replacements of varying format with
a single Jinja2 templating mechanism. Allows full access to
configuration if necessary.
2021-03-03 17:51:08 +01:00
Sarah Hoffmann
6b306f30b6 Merge pull request #2194 from grischard/patch-1
Fix typo in .github/actions/build-nominatim/action.yml
2021-03-03 11:34:12 +01:00
Guillaume Rischard
c48fd18344 Update action.yml 2021-03-03 11:20:21 +01:00
Sarah Hoffmann
8ea7e04363 Merge pull request #2192 from lonvia/database-versioning
Introduce database versioning
2021-03-02 15:57:46 +01:00
Sarah Hoffmann
32c2d2b248 document new status fields 2021-03-01 22:21:37 +01:00
Sarah Hoffmann
111cca8c9a return database version with status API 2021-03-01 22:17:16 +01:00
Sarah Hoffmann
7ae9c3a9f0 add database_version setting to tests 2021-03-01 21:49:33 +01:00
Sarah Hoffmann
bf4320a7d6 do not depend on cmdline parameter for creating partition tables
The partition numbers in use only depend on the entries in search_name.
2021-03-01 21:28:39 +01:00
Sarah Hoffmann
3a0a4b9175 save software version in the database
The version represents the software version that was used to
import the data.
2021-03-01 20:35:15 +01:00
Sarah Hoffmann
4faefe156c report software version of status call 2021-03-01 16:47:19 +01:00
Sarah Hoffmann
86273f5e2a introduce database patch level for version
This will be needed later for automatic migrations.
2021-03-01 16:46:19 +01:00
Sarah Hoffmann
b4f64aa770 make sure that calls to PHP legacy scripts are fatal on error 2021-03-01 16:10:45 +01:00
Sarah Hoffmann
976c5e9121 introduce table for in-database properties
Adds a simple table where settings for the database can be
saved. This is useful for state that must not change after
import.
2021-03-01 16:09:17 +01:00
Sarah Hoffmann
db663dd92f remove unused import 2021-03-01 09:26:08 +01:00
Sarah Hoffmann
90a5d23016 use tmp_path fixture in config tests 2021-03-01 09:24:04 +01:00
Sarah Hoffmann
99e35d256a fix typo 2021-03-01 09:07:49 +01:00
Sarah Hoffmann
e14e7c6235 Merge pull request #2186 from lonvia/port-import-to-python
Move setup procedure to Python
2021-02-27 12:09:23 +01:00
Sarah Hoffmann
b46adbad22 make sure psql always finishes
If an execption is raised by other means, we still have to close
the stdin pipe to psql to make sure that it exits and releases its
connection to the database.
2021-02-27 10:24:40 +01:00
Sarah Hoffmann
afabbeb546 older versions of Postgresql need explicit return type 2021-02-27 09:46:42 +01:00
Sarah Hoffmann
d14a3df10f do not truncate search_name in reverse-only mode 2021-02-27 09:46:42 +01:00
Sarah Hoffmann
9feb84e426 actions: add psutil dependency 2021-02-26 16:50:09 +01:00
Sarah Hoffmann
c7f40e3cee fix verbose flag for PHP wrapper scripts
The flag must come after the command.
2021-02-26 16:49:32 +01:00
Sarah Hoffmann
dd03aeb966 bdd: use python library where possible
Replace calls to PHP scripts with direct calls into the
nominatim Python library where possible. This speed up
tests quite a bit.
2021-02-26 16:14:29 +01:00
Sarah Hoffmann
15b5906790 move setup function to python
There are still back-calls to PHP for some of the sub-steps.
These needs some larger refactoring to be moved to Python.
2021-02-26 15:02:39 +01:00
Sarah Hoffmann
3ee8d9fa75 properly close connections of indexer after use 2021-02-26 12:10:54 +01:00
Sarah Hoffmann
57db5819ef prot load-data function to python 2021-02-25 21:32:40 +01:00
Sarah Hoffmann
3c186f8030 add a function for the intial indexing run
Also moves postcodes to fully parallel indexing.
2021-02-25 18:42:54 +01:00
Sarah Hoffmann
db5e78c879 remove unused partitionfunction function 2021-02-25 18:42:54 +01:00
Sarah Hoffmann
c7fd0a7af4 port wikipedia importance functions to python 2021-02-25 18:42:54 +01:00
Sarah Hoffmann
32683f73c7 move import-data option to native python
This adds a new dependecy to the Python psutil package.
2021-02-25 18:42:54 +01:00
Sarah Hoffmann
7222235579 introduce custom object for cmdline arguments
Allows to define special functions over the arguments.

Also splits CLI tests in two files as they have become too many.
2021-02-25 18:42:54 +01:00
Sarah Hoffmann
f6e894a53a port database setup function to python
Hide the former PHP functions in a transition command until
they are removed.
2021-02-25 18:42:54 +01:00
Sarah Hoffmann
b93ec2522e use psql for executing sql files
This allows to run larger files without needing to keep
them in memory.
2021-02-25 18:42:54 +01:00
Sarah Hoffmann
af7226393a add function to set up libpq environment
Instead of parsing the DSN for each external libpq program we
are going to execute, provide a function that feeds them all
necessary parameters through the environment.

osm2pgsql is the first user.
2021-02-25 18:42:54 +01:00
Sarah Hoffmann
e520613362 convert connect() into a context manager 2021-02-25 18:42:54 +01:00
Sarah Hoffmann
204fe20b4b Merge pull request #2185 from lonvia/fix-deadlock-handling-for-psycopg27
Improve deadlock detection for various versions of psycopg2
2021-02-25 18:39:40 +01:00
Sarah Hoffmann
a1f0fc1a10 improve deadlock detection for various versions of psycopg2
Psycopg2 has changed the kind of exception that is emitted on
deadlocks between versions 2.7 and 2.8. The code was already
trying to catch both kind of errors but because the
psycopg2.errors package is unknown in 2.7 and below, the
code would throw an exception on anything but a deadlock error.

This commit wraps the deadlock handling into a context manager
to avoid code duplication and uses module imports to detect if
the new error codes are available.

Also sets the required psycopg2 version to 2.7 or bigger as
versions below are difficult to test.
2021-02-25 18:11:16 +01:00
Sarah Hoffmann
68c3862270 Merge pull request #2182 from lonvia/change-error-for-details
Return 404 for details when no object is found in database
2021-02-23 09:09:35 +01:00
Sarah Hoffmann
5b7483ada5 return 404 for details when no bject is found in database
Fixes #2157.
2021-02-22 16:28:29 +01:00
Sarah Hoffmann
72b01148d2 Merge pull request #2181 from lonvia/port-more-tool-functions-to-python
Port more tool functions to python
2021-02-22 16:11:21 +01:00
Sarah Hoffmann
971df231b0 avoid os.environ as default valie 2021-02-19 19:29:57 +01:00
Sarah Hoffmann
4b32cbe518 fix return code for check database run with 'not applicable' 2021-02-19 18:32:00 +01:00
Sarah Hoffmann
f08078ccca bdd tests: directly call python code for setup-website 2021-02-19 18:20:55 +01:00
Sarah Hoffmann
389138abfe port setup-website to python 2021-02-19 17:51:06 +01:00
Sarah Hoffmann
a0ae4945cd add unit tests for new check_database code 2021-02-18 20:36:11 +01:00
Sarah Hoffmann
b169e4c88c port check-database function to python
This change also adapts the hints to use the nominatim tool.
Slightly changed checks, so that they are just as effective on
a frozen database.
2021-02-18 17:32:30 +01:00
Sarah Hoffmann
a60c34bded use a frozen DB for API tests
This way we also test that dropping does the right thing.
2021-02-17 22:35:27 +01:00
Sarah Hoffmann
153dbb71b8 remove unused code 2021-02-17 22:25:23 +01:00
Sarah Hoffmann
101a1f895d port freeze function to python 2021-02-17 21:43:15 +01:00
Sarah Hoffmann
bd27310c68 Merge pull request #2173 from lonvia/penality-for-housenumberless-places
Increase penalty for places without housenumber
2021-02-17 17:52:59 +01:00
Sarah Hoffmann
42ecd535b3 Merge pull request #2174 from lonvia/disable-jit-for-osm2pgsql-again
Disable JIT and parallel execution for osm2pgsql updates again
2021-02-16 21:32:57 +01:00
Sarah Hoffmann
c9838a02ce disable JIT and parallel execution for osm2pgsql updates again
The gazetteer output doesn't disable these functions when
writing to the place table but the triggers may contain
operations that cause misplanning for the query planner.
2021-02-16 18:23:47 +01:00
Sarah Hoffmann
7ebcf602ac add simple test for result splitting with multiple ranks 2021-02-16 17:59:12 +01:00
Sarah Hoffmann
8eb85f1340 increase penalty for places without housenumber
Results where the housenumber was dropped are an unlikely result
when they refer to something other than a street. Therefore
increase their result rank so that other matches are tried first
before choosing them as a result.

Improves #2167.
2021-02-16 17:47:06 +01:00
Sarah Hoffmann
2a8e3741fa Merge pull request #2166 from mtmail/tiger-2020
documentation: 2020 TIGER data got released
2021-02-16 14:45:14 +01:00
Sarah Hoffmann
684378722c Merge pull request #2171 from lonvia/update-vagrant-scripts-for-make-install
Update vagrant scripts for make install
2021-02-16 14:42:38 +01:00
Sarah Hoffmann
286a686f88 switch vagrant scripts to make install 2021-02-16 12:04:34 +01:00
Sarah Hoffmann
7360e6c5df use file copy on older cmake to install osm2pgsql
Fixes #2170.
2021-02-16 11:06:14 +01:00
Sarah Hoffmann
fbe7be760b ignore failure to get replication date 2021-02-14 12:17:30 +01:00
marc tobias
a3ce89aeff documentation: 2020 TIGER data got released 2021-02-12 23:57:12 +01:00
Sarah Hoffmann
6a7e0d652b Merge pull request #2164 from lonvia/add-make-install
Make Nominatim installable
2021-02-12 10:32:26 +01:00
Sarah Hoffmann
7cc4c53adb always return 0 for updates unless there is an error
This is more in line with previous behavioru than returning
a status code when no updates are available.
2021-02-11 10:33:49 +01:00
Sarah Hoffmann
24b13a7a87 docs: adapt check-database command 2021-02-10 21:55:04 +01:00
Sarah Hoffmann
b6c2dbf69c actions: remove install directories before import
This ensures that any dangling references to the build
or source directory are caught by the CI.
2021-02-10 17:59:52 +01:00
Sarah Hoffmann
0e0e9a6809 need test database for analysing cli test 2021-02-10 16:19:51 +01:00
Sarah Hoffmann
ed60154552 actions: test import with installed version of Nominatim 2021-02-10 16:17:52 +01:00
Sarah Hoffmann
85589cf7dc add 'make install' to installation instructions 2021-02-10 11:15:21 +01:00
Sarah Hoffmann
99dcd10d3f test for existance of country grid in cmake already
Given that the file potentially gets installed, it needs to be
present during build time already. Checking during the import
is too late.
2021-02-10 10:40:36 +01:00
Sarah Hoffmann
745ae02f47 make installation targets conditional to what is built 2021-02-10 10:04:07 +01:00
Sarah Hoffmann
b6bd11f292 add make install target
Installation includes PHP andPython libraries, settings, the basic
country data, the postgresql module and our custom version of
osm2pgsql. The latter is installed in our private library directory
so that it does not get in the way of a potentially installed
osm2pgsql from the distribution.
2021-02-09 21:04:42 +01:00
Sarah Hoffmann
c60a0784ea adapt unit tests to new directory structure 2021-02-09 20:13:00 +01:00
Sarah Hoffmann
3cb6f3e460 use DataDir constant for data only
So far the data directory constant has pointed to the source
directory to be usable with different subdirectories. Now only
the data subdirectory itself is being used with the constant,
so point to the directory directly.
2021-02-09 20:04:08 +01:00
Sarah Hoffmann
de37dc9300 forgot to replace one occurence of sql_dir 2021-02-09 19:32:05 +01:00
Sarah Hoffmann
8ffd7d9243 remove unused BINDIR constant 2021-02-09 19:30:31 +01:00
Sarah Hoffmann
298ed11261 introduce constant for configuration directory
This replaces {data_dir}/settings throughout the code, so that
the configuration may be placed somewhere else in the directory
structure (e.g. in /etc).
2021-02-09 18:45:45 +01:00
Sarah Hoffmann
b9517c99ae rename sql directory to lib-sql
Also introduces a separate constant for the sql directory, so that
it can be put separately from the rest of the data if required.
2021-02-09 15:26:56 +01:00
Sarah Hoffmann
db3ced17bb rename lib to lib-php 2021-02-09 11:52:07 +01:00
Sarah Hoffmann
248b4cddab update osm2pgsql (disable install rule) 2021-02-09 09:48:50 +01:00
Sarah Hoffmann
d81e152804 integrate analyse of indexing into nominatim tool 2021-02-08 22:22:49 +01:00
Sarah Hoffmann
0cbf98c020 consolidate warm and db-check into single admin command 2021-02-08 21:05:06 +01:00
Sarah Hoffmann
195f9f5ef3 split cli.py by subcommands
Reduces file size below 1000 lines.
2021-02-08 17:23:05 +01:00
Sarah Hoffmann
a759c5b75b move website into php library directory 2021-02-08 12:00:34 +01:00
Sarah Hoffmann
7dfe645b5f move postcode table setup to sql/
Also moves the call to the setup from the setup-db
step to the calculate-postcodes step. The tables also need
no longer be accessible by the webservice.
2021-02-08 11:53:01 +01:00
Sarah Hoffmann
ca3283cbaa remove unused SQL script 2021-02-08 11:28:24 +01:00
Sarah Hoffmann
861e67dfe8 fix off-by-one error in replication download 2021-02-04 17:04:04 +01:00
Sarah Hoffmann
82ef02cd1a Merge pull request #2161 from lonvia/timeout-for-replication
Reintroduce timeout for replication file download
2021-02-04 16:52:24 +01:00
Sarah Hoffmann
948217d5e9 reintroduce timeout for replication file download
This ports the --socket-timeout parameter from
pyosmium-get-changes which ensures that the update
process eventually times out on hanging network connections.
2021-02-04 11:47:11 +01:00
Sarah Hoffmann
6cc06828db Merge pull request #2160 from lonvia/introduce-project-dir
Officially introduce and recommend use of a project directory
2021-02-04 09:52:59 +01:00
Sarah Hoffmann
0b2abfb115 replace make serve with nominatim serve command
With the website directory now tied to the project directory instead
of the build directory, it is no longer possible to use make for
running the web server.
2021-02-03 16:34:31 +01:00
Sarah Hoffmann
b2f8fb6201 add migration info for status table 2021-02-03 14:13:09 +01:00
Sarah Hoffmann
e2329c03fe Revert "increase splitting for large geometries"
This reverts commit 559fe513fa.

Increasing the splitting results in geometries where with rounding
issues at the split points, so that contain operations do not
work as expected anymore.

Fixes #2137.
2021-02-03 10:23:38 +01:00
Sarah Hoffmann
9bca670b4e adapt quick start instructions in README to project dir 2021-02-03 10:17:22 +01:00
Sarah Hoffmann
cb06d1f4ca do not overwrite custom set module paths
Given that the module is now copied to the project directory
when no module path is set, we need the information that the
module path is empty. Therefore hand in the default module path
in a separate variable.
2021-02-02 18:31:25 +01:00
Sarah Hoffmann
36447c488a print project directory before running any command 2021-02-02 11:19:31 +01:00
Sarah Hoffmann
69092030cd make phpcs happy 2021-02-02 11:15:56 +01:00
Sarah Hoffmann
109aa9c428 actions: switch to using separate project dir
Also fixes reverse-only import which not run at all.
2021-02-02 11:03:09 +01:00
Sarah Hoffmann
1d97816c53 docs: add hint about putting the nominatim tool into the PATH 2021-02-02 10:56:40 +01:00
Sarah Hoffmann
7591c4fb42 copy database module on install
When no explicity database module is configured, then the
module is now copied into the project directory and used from
there. This means that Nominatim can be updated to a new
version of the module while existing installation keep their
version of normalisation.
2021-02-02 10:56:40 +01:00
Sarah Hoffmann
60cbeb165e hand in absolute path to nominatim tool to php scripts 2021-02-02 10:56:40 +01:00
Sarah Hoffmann
bddfc109f8 refer to new nominatim tool in configuration comments 2021-02-02 10:56:40 +01:00
Sarah Hoffmann
b05c379b39 change the default location for external data to project dir 2021-02-02 10:56:40 +01:00
Sarah Hoffmann
7ba5283fe8 actions: revert to reletive paths for caching 2021-02-02 10:37:18 +01:00
Sarah Hoffmann
98fe5af07d actions: remove setting custom .env
It only set the pyosmium-get-changes binary which is no longer
needed.
2021-02-02 10:35:30 +01:00
Sarah Hoffmann
59cb1d6c27 remove pyosmium-get-changes detection from cmake
pyosmium-get-changes is not longer used.
2021-02-02 10:33:15 +01:00
Sarah Hoffmann
0ad1b28497 Merge pull request #2155 from lonvia/port-regresh-to-python
Port replication and part of the refrsh function to native Python
2021-02-01 11:50:05 +01:00
Sarah Hoffmann
5f63d4ca1f print nice summary after updates 2021-02-01 10:34:31 +01:00
Sarah Hoffmann
90aaab77fc fix linting issues 2021-01-30 16:42:25 +01:00
Sarah Hoffmann
7158433cd3 disable warning about non-toplevel import
They are needed here so nominatim can be run when osmium
is not installed. Everything except replication will work fine.
2021-01-30 16:29:28 +01:00
Sarah Hoffmann
e629a175ed introduce custom UsageError
This is a exception to be thrown when the error occures because
of bad user data. We don't want to print a full stack trace in
these cases but just tell the user what went wrong.
2021-01-30 16:20:10 +01:00
Sarah Hoffmann
45ea73913f remove setting for PYOSMIUM_BINARY
pyosmium is now called as a library from the python code,
so that pyosmium-get-changes is no longer needed.
2021-01-30 15:55:04 +01:00
Sarah Hoffmann
01e0fd7e13 whitelist pyosmium for pylint 2021-01-30 15:52:49 +01:00
Sarah Hoffmann
4cb6dc01f3 port replication update function to python 2021-01-30 15:50:34 +01:00
Sarah Hoffmann
8f0885f6cb port check-for-update function to python 2021-01-28 14:50:14 +01:00
Sarah Hoffmann
beb0fa0727 Merge pull request #2153 from rizkyarlin/patch-1
fix indentation
2021-01-28 09:06:33 +01:00
Muh. Rizky Eka Arlin
436cb9229b fix indentation 2021-01-28 14:21:54 +08:00
Sarah Hoffmann
d78f0ba804 port replication initialisation to Python 2021-01-26 22:50:54 +01:00
Sarah Hoffmann
5b46fcad8e convert functon creation to python
The new functions always creates normal and partitioned functions.
Also adds specialised connection and cursor classes for adding
frequently used helper functions.
2021-01-26 22:50:54 +01:00
Sarah Hoffmann
94fa7162be port address level computation to Python
Also adds simple tests for correct table creation.
2021-01-26 22:50:54 +01:00
Sarah Hoffmann
e6c2842b66 move update code for postcode and word count to Python
Adds also tests for the new function to execute a SQL script.
2021-01-26 22:50:54 +01:00
Sarah Hoffmann
e6d9485c4a cli: import python modules for commands on demand
Given that only one command will be executed in the end, it is
not necessary to import what amounts to the whole library. This
becomes in particular important for update functions that have
a dependency on pyosmium. The dependency can remain optional for
people not using updates.
2021-01-26 22:50:54 +01:00
Sarah Hoffmann
30cd2f2280 remove API comparison util
This is outdated and unmaintained. There are tools out there
which can do this better. Try, for example
https://github.com/radarlabs/api-diff
2021-01-26 22:46:35 +01:00
Sarah Hoffmann
2c909c1f0c Merge pull request #2147 from lonvia/tests-for-python-code
Add basic set of tests for Python code
2021-01-21 10:07:50 +01:00
Sarah Hoffmann
063a4cb403 cli indexer tests need a fake database
The Indexer constructor opens a connection to the given database.
2021-01-20 21:30:27 +01:00
Sarah Hoffmann
42ec67f63c add more tests for CLI parameter parser 2021-01-20 21:30:27 +01:00
Sarah Hoffmann
8c02786820 add tests for indexer 2021-01-20 21:30:27 +01:00
Sarah Hoffmann
c26f323bf5 add simple tests for CLI parsing 2021-01-20 21:30:27 +01:00
Sarah Hoffmann
041ae67fd9 optionally hand in command line arguments to CLI functions
Allows easier testing.
2021-01-20 21:30:27 +01:00
Sarah Hoffmann
bfa6580ad5 use pytest mocking functions for manipulating os.environ 2021-01-20 21:30:27 +01:00
Sarah Hoffmann
52b76d1d01 add tests for Python exec_utils 2021-01-20 21:30:27 +01:00
Sarah Hoffmann
a3767f9142 Merge pull request #2146 from mtmail/two-typos
correct parameter name in query CLI
2021-01-20 21:29:49 +01:00
marc tobias
f62c784102 correct parameter name in query CLI 2021-01-20 21:09:41 +01:00
Sarah Hoffmann
ffc221a87f Merge pull request #2145 from lonvia/cli-query-functions
Add interface to search via command line tool
2021-01-20 09:00:45 +01:00
Sarah Hoffmann
8cf54a1317 add API functions to nominatim tool 2021-01-19 19:38:46 +01:00
Sarah Hoffmann
77e287f669 rename nominatim.admin to nominatim.tools 2021-01-19 19:38:46 +01:00
Sarah Hoffmann
5d95a72758 probe for php_cgi in cmake to be used for querying 2021-01-19 19:38:46 +01:00
Sarah Hoffmann
3475e1dfd6 Merge pull request #2143 from lonvia/integrate-indexer-into-nominatim-tool
Integrate indexer into nominatim tool
2021-01-19 08:42:22 +01:00
Sarah Hoffmann
504922ffbe remove old nominatim.py in favour of 'nominatim index'
The PHP scripts need to know the position of the nominatim
tool in order to call it. This is handed in as environment
variable, so it can be set by the Python script.
2021-01-18 15:43:27 +01:00
Sarah Hoffmann
c77877a934 implementaion of 'nominatim index' 2021-01-18 15:43:27 +01:00
Sarah Hoffmann
27977411e9 move indexing function into its own Python module
This makes it mow a standard function of our new Python
library instead of a stand-alone program.
2021-01-18 15:43:27 +01:00
Sarah Hoffmann
b79c79fa73 add function to get a DSN for psycopg
Converts the PHP DSN syntax into psycopg syntax when necessary.
2021-01-18 15:43:27 +01:00
Sarah Hoffmann
cd0001b55a Merge pull request #2142 from lonvia/update-bdd-api-tests
Update BDD API tests
2021-01-18 15:40:50 +01:00
Sarah Hoffmann
340e7f7210 bdd: complete coverage for API tests
Also removes some functions that are no longer used and
fixes debug output where the tests found an issue.
2021-01-17 16:12:06 +01:00
Sarah Hoffmann
f9c43137c9 remove unused output formatting functions 2021-01-16 17:39:49 +01:00
Sarah Hoffmann
171ed36e36 bdd: remove duplicated tests 2021-01-16 16:57:28 +01:00
Sarah Hoffmann
c6c907d451 bdd: clean up and extend API tests for details
- remove duplicates created by replacing HTML tests
  with JSON tests
- add tests for newer functions for returning geometries
  and hierarchies
2021-01-16 12:04:13 +01:00
Sarah Hoffmann
19ab038724 collect coverage for /website directory as well 2021-01-15 20:27:14 +01:00
Sarah Hoffmann
1c26fd489d Merge pull request #2139 from lonvia/add-pytest
Introduce unit testing for Python code
2021-01-15 17:37:36 +01:00
Sarah Hoffmann
e8cfba1b10 pytest may also be installed as py-test[-3] 2021-01-15 17:22:31 +01:00
Sarah Hoffmann
496a3d29db enable pytest testing in CI 2021-01-15 15:33:53 +01:00
Sarah Hoffmann
438ed431dd add documentation for new pytest tests 2021-01-15 15:18:45 +01:00
Sarah Hoffmann
f1f0032758 add pytest as a test goal in cmake 2021-01-15 15:09:36 +01:00
Sarah Hoffmann
eb3b789855 add initial pytest test for Configuration 2021-01-15 14:42:03 +01:00
Sarah Hoffmann
c077050855 Merge pull request #2136 from lonvia/introduce-pylint
Introduce pylint for code style checking for Python.
2021-01-15 14:39:26 +01:00
Sarah Hoffmann
d9998bfab3 pylint may be available as pylint3 or pylint 2021-01-15 10:52:25 +01:00
Sarah Hoffmann
7cf9d459d6 use check parameter of subprocess.run
...instead of checking on our own.

Also increase required version of Python to 3.5 because of
subprocess.run().
2021-01-15 10:43:04 +01:00
Sarah Hoffmann
de724aa576 add pylint to list of required linting tools
With pylint being run in the CI, passing it is required now.
2021-01-15 10:43:04 +01:00
Sarah Hoffmann
8e53f63036 fix errors reported by pylint 2021-01-15 08:57:00 +01:00
Sarah Hoffmann
565356613a Merge pull request #2135 from lonvia/python-frontend
Introduce new 'nominatim' all-in-one command-line tool
2021-01-15 08:56:07 +01:00
Sarah Hoffmann
eda0900c8e fix typo 2021-01-14 20:30:27 +01:00
Sarah Hoffmann
3dd67083b2 replace Symfony dotenv dependency with Python dotenv 2021-01-14 18:31:18 +01:00
Sarah Hoffmann
2f73bb3643 bdd: directly call utility scripts in lib
This removes the dependency on php-symfony-dotenv for the tests.
2021-01-14 18:19:22 +01:00
Sarah Hoffmann
9348fc5e15 move dotenv parsing to installed PHP scripts
This means that the php-symfony-dotenv library is now only needed
when using the legacy scripts. This includes the BDD tests which
currently still rely on the PHP utils.
2021-01-14 18:06:22 +01:00
Sarah Hoffmann
97710ee9d1 use cli tool for github CI 2021-01-14 16:35:01 +01:00
Sarah Hoffmann
9619cb3fe5 forward cli tool return value as exit code 2021-01-14 14:36:41 +01:00
Sarah Hoffmann
1c1e951826 adapt documentation to new nominatim cli tool 2021-01-14 12:12:38 +01:00
Sarah Hoffmann
88c57b4dc8 maller command execution fixes 2021-01-14 12:03:49 +01:00
Sarah Hoffmann
ba13cfd9ff make sure that environment variables have highest prio 2021-01-14 11:12:45 +01:00
Sarah Hoffmann
1ff8751caa liniting of new python code 2021-01-14 10:19:21 +01:00
Sarah Hoffmann
98dbc84836 add wrapper calls for all nominatim tool functions 2021-01-14 09:37:47 +01:00
Sarah Hoffmann
0847964a27 avoid accessing constants when getting data from env
When a setting can be read from the environment variable, avoid
accessing the internal defaults. This way the scripts can be
accessed directly in \lib as long as the environment is set up
correctly with full defaults.
2021-01-14 09:37:04 +01:00
Sarah Hoffmann
bc09d7aedb fix access to environment variable 2021-01-14 09:29:43 +01:00
Sarah Hoffmann
04690ad8c4 implement warming in new cli tool
Adds infrastructure for calling the legacy PHP scripts. As the
CONST_* values cannot be set from the python script, hand the values
in via secret environment variables instead. These are all
temporary hacks for the transition phase to python code.
2021-01-13 18:25:15 +01:00
Sarah Hoffmann
ec636111ba warm.php needs constant setup for queries
Warming is done using the query classes and therefore the same
copy-over from dotenv settings to CONST_* parameters is needed
as for query.php.
2021-01-13 18:12:53 +01:00
Sarah Hoffmann
e467b956ff set CONST_LibDir directly from the source scripts
Now that the source scripts have been moved to \lib, they
can determine the position of the PHP library relative to
themselves.
2021-01-13 17:00:38 +01:00
Sarah Hoffmann
ff5a237200 move PHP utilities into the lib directory
These are not called directly as programs but used in a library
fashion by the installed utilities. So the library directory
is a better place.
2021-01-13 14:44:45 +01:00
Sarah Hoffmann
d6bcb7c8b7 consolidate cli interface to single tool 2021-01-13 10:11:58 +01:00
Sarah Hoffmann
57f5e6d898 create skeleton for new CLI tools 2021-01-12 22:21:20 +01:00
Sarah Hoffmann
612fd50612 add skeleton for new Nominatim executables 2021-01-12 10:17:28 +01:00
Sarah Hoffmann
a74e736283 Merge pull request #2132 from lonvia/reduce-api-testdb
Reduce BDD API test database to Liechtenstein
2021-01-11 10:42:22 +01:00
Sarah Hoffmann
86cd5ddd65 also run BDD API tests in CI 2021-01-09 17:58:06 +01:00
Sarah Hoffmann
812de0545d test can be run all in one go with make 2021-01-09 17:57:30 +01:00
Sarah Hoffmann
3bed5516da update documentation for new BDD API tests 2021-01-09 17:54:45 +01:00
Sarah Hoffmann
0495dbe756 bdd: add new API test data
Make all data necessary for API tests directly available in the
repository.
2021-01-09 17:01:33 +01:00
Sarah Hoffmann
5d656891ba bdd: convert API tests to smaller test db
Changes BDD API tests to restrict themselves to
Liechtenstein. One test moved to DB as no appropriate
data is available.
2021-01-09 16:59:46 +01:00
Sarah Hoffmann
74122dc965 bdd: improve assert output for API query checks
Adds wrapper function for checking address parts and
more explanation strings to asserts.
2021-01-09 16:58:37 +01:00
Sarah Hoffmann
ee18a511c6 bdd: import API test DB as part of step setup
In the future, the BDD tests will simply set up the required
test database themselves. Like with the template database, it
is not reimported when it already exists unless that is explicitly
forced.

Makes most of the API tests currently fail because they still
point to old test data.
2021-01-07 11:51:38 +01:00
Sarah Hoffmann
da20881096 Merge pull request #2129 from lonvia/cleanup-bdd-tests
Clean up Python support code for BDD tests
2021-01-07 09:10:40 +01:00
Sarah Hoffmann
aaabb46f20 add symphony dotenv to prerequisites list 2021-01-07 08:56:52 +01:00
Sarah Hoffmann
49142eb6e5 use relative dir for sources for phpunit 2021-01-07 08:55:15 +01:00
Sarah Hoffmann
73cbb6eb9a bdd: clean up DB ops steps
Adds comments and modernizes code.
2021-01-06 16:37:32 +01:00
Sarah Hoffmann
1f29475fa5 bdd: move column comparison in separate file
Introduces a new class DBRow that encapsulates the comparison
functions. This also is responsible for formatting more informative
assert messages. place and placex steps are unified.
2021-01-06 12:28:09 +01:00
Sarah Hoffmann
d586b95ff1 bdd: move nominitim id reader to separate file 2021-01-05 16:00:48 +01:00
Sarah Hoffmann
25557e5f14 bdd: factor out reindexing on updates 2021-01-05 15:17:46 +01:00
Sarah Hoffmann
197870e67a bdd: move place table inserter into separate file
Also simplifies usage by implementing a function that inserts
a complete table row.
2021-01-05 12:12:59 +01:00
Sarah Hoffmann
b8e39d2dde bdd: move scene setup to OSM data steps
The step has nothing to do with the database.
2021-01-05 11:42:28 +01:00
Sarah Hoffmann
5dfa76a610 bdd: switch to auto commit mode
Put the connection to the test database into auto-commit mode
and get rid of the explicit commits. Also use cursors always in
context managers and unify the two implementations that copy
data from the place table.
2021-01-05 11:42:28 +01:00
Sarah Hoffmann
58c471c627 bdd: remove class for lazy formatting
assert in combination with format() does the right thing and calls
the __str__() method only when an assertion hits.
2021-01-05 10:39:44 +01:00
Sarah Hoffmann
213bf7d19d bdd: rename db_ops steps
Now all files implementing steps are called steps_*.py.
2021-01-05 10:20:00 +01:00
Sarah Hoffmann
12ae8a4ed3 bdd: move output format computation into response 2021-01-05 10:17:59 +01:00
Sarah Hoffmann
8a93f8ed94 bdd: move Response classes in own file and simplify
Removes most of the duplicated parse functions, introduces
a common assert_field function with a more expressive error
message.
2021-01-05 10:03:47 +01:00
Sarah Hoffmann
2712c5f90e bdd: rename and clean up osm_data steps
Move common OPL creation code into a function and remove
unused imports.
2021-01-04 20:17:17 +01:00
Sarah Hoffmann
72587b08fa bdd: move external process execution in separate func 2021-01-04 19:58:59 +01:00
Sarah Hoffmann
faa85ded50 bdd: move NominatimEnvironment into separate file
Also cleans up and modernizes the code and adds documentation.
2021-01-04 17:54:51 +01:00
Sarah Hoffmann
14e5bc7a17 bdd: move grid generation code into geometry factory 2021-01-04 17:04:47 +01:00
Sarah Hoffmann
f727620859 bdd: move geoemtry creation into separate file
Also renames the OsmDataFactory in the more appropriate
GeometryFactory and modernizes code for python3.
2021-01-04 16:34:40 +01:00
Sarah Hoffmann
843d3a137c remove stale code for python2 2021-01-04 14:14:34 +01:00
Sarah Hoffmann
e4691005e2 Merge pull request #2125 from lonvia/independent-project-directory
Allow for truely independent project directory
2021-01-04 14:10:24 +01:00
Sarah Hoffmann
4aba70caee create a temporary project dir for tests
The project directory contains the website script as
configured through the test configuration. This means
that tests are now completely independet of any
configuration that may be contained in the build
directory.

Also removes the hack to inject additional settings via
a environment variable.
2021-01-04 11:39:45 +01:00
Sarah Hoffmann
5e989b9296 configure osm2pgsql and module location via cmake
The default location of osm2pgsql and the postgresql module
is decided at compile/installation time and is not necessarily
in the project directory.

With this change it is now possible to have a project directory
that is completely separate from the build directory.
2021-01-04 11:37:56 +01:00
Sarah Hoffmann
cba2d252c8 Merge pull request #2124 from lonvia/remove-nose
Remove nose dependency for tests
2021-01-03 21:04:59 +01:00
Sarah Hoffmann
2ecec19df0 remove nose requirement from documentation 2021-01-03 17:23:44 +01:00
Sarah Hoffmann
4ca7197826 replace nose assertions with simple asserts 2021-01-03 17:21:24 +01:00
Sarah Hoffmann
a8ec250993 Merge pull request #2119 from mtmail/check-import-finished-when-tables-droped
utils/check_import_finished: skip some checks when setup ran with --drop
2020-12-22 15:57:48 +01:00
Sarah Hoffmann
f3e0e401fd Merge pull request #2118 from mtmail/vagrant-ubuntu-dotenv
Vagrant ubuntu: install dotenv package
2020-12-22 15:54:48 +01:00
marc tobias
d60f89867b utils/check_import_finished: skip some checks when setup ran with --drop 2020-12-21 20:12:31 +01:00
marc tobias
b133f2bc4c Vagrant ubuntu: install dotenv package 2020-12-21 20:10:13 +01:00
Sarah Hoffmann
301fd7f7e8 Merge pull request #2115 from lonvia/use-dotenv
Switch configuration to dotenv
2020-12-21 11:33:38 +01:00
Sarah Hoffmann
45148c7078 switch documentation to describing dotenv 2020-12-20 12:09:27 +01:00
Sarah Hoffmann
3c75194448 adapt instructions for creating the test db to dotenv 2020-12-20 11:53:19 +01:00
Sarah Hoffmann
f218e20522 mark CentOS installation instructions as broken
Getting symfony-dotenv installed on CentOS is a major pain,
so just mark it broken instead.

Still sSwitch the config format to dotenv already.
2020-12-20 11:35:29 +01:00
Sarah Hoffmann
33b038ce6f tests: always create the config file
There is also one database test that uses the API functions.
2020-12-19 17:55:46 +01:00
Sarah Hoffmann
f62c65e9d9 adapt php tests to new directory constants 2020-12-19 14:33:04 +01:00
Sarah Hoffmann
867baab3d1 make phpcs happy 2020-12-19 14:33:04 +01:00
Sarah Hoffmann
63ad0cb498 github actions: need dotenv 2020-12-19 14:33:04 +01:00
Sarah Hoffmann
433017b990 move creation of website scripts to setup script
Instead of creating the website wrapper scripts with cmake,
they are now created when --setup-website is called. The
setup of the configuration constants is directly embedded
into the scripts. This means we can get rid of the separate
settings-frontend.php. More importantly however, it means
that it is now possible to set up multiple website directories
from the same build directory.
2020-12-19 14:33:04 +01:00
Sarah Hoffmann
d97aed8741 adapt tests to new dotenv environment
DB tests now can simply set the environment to change configuration
variables. API tests still rely on a configuration file.

Also, query.php needs to set up the CONST_* variables to work with
the query scripts. That is a tiny bit messy and duplicates code
but this part will need to be reworked later.
2020-12-19 14:33:04 +01:00
Sarah Hoffmann
06d89e1d47 fix various typos 2020-12-19 14:33:04 +01:00
Sarah Hoffmann
8676e45d88 remove old default settings 2020-12-19 14:33:04 +01:00
Sarah Hoffmann
992d3faac8 switch all utils to initialising dotenv 2020-12-19 14:33:04 +01:00
Sarah Hoffmann
0947b61808 switch remaining settings to dotenv format
CONST_Search_AreaPolygons and CONST_Search_ReversePlanForAll have
been removed completely.
2020-12-19 14:33:04 +01:00
Sarah Hoffmann
d43f30903c use explicit DSN for website scripts
Website scripts have no access to the dotenv variables, so use
the DSN constant instead when connecting to the database.
2020-12-19 14:33:04 +01:00
Sarah Hoffmann
15a1666f8a replace database settings with dotenv variant
As we can't refer to the project root dir in the module path, the
module path may now also be a relative directory which is then
taken as being relative to the project root path.

Moves the checkModulePresence() function into the Setup class, so
that it can work on the computed absolute module path.
2020-12-19 14:33:04 +01:00
Sarah Hoffmann
25bdd7c6d9 introduce dotenv parsing for setup.php
This adds the notion of a project directory. This is the directory
that holds all necessary files for one specific installation of
Nominatim. Dotenv looks for an .env file in this directory and
adds it to the global environment together with the defaults from
Nominatim's data directory.

Add's symfony's dotenv library as a new dependency.
2020-12-19 14:33:04 +01:00
Sarah Hoffmann
ac116980ac make HTTP proxy setup explicit
The setup relies on the project configuration which we want to
explicitly set up in later steps. Therefore proxy setup needs to
be done explicitly as well. There is the added bonus that the
setup is done only for the utils which try to call outside.
2020-12-19 14:33:04 +01:00
Sarah Hoffmann
b5480f6e36 reorganise path settings in config
CONST_BasePath is split into separate configuration variables
for binaries, libraries and data. These variables as well as
the installation path are now set in the executable directly and
no longer configurable via project settings.

This is the first step towards an installable software. The
executables should know per installation where to find their
necessary data to execute. Project configuration needs to be
restricted to settings that really concern the specific Nominatim
installation.
2020-12-19 14:33:04 +01:00
Sarah Hoffmann
17a8cc5e29 use /usr/bin/env for python script
Makes it easier to use the script with a virtualenv setup.
2020-12-19 14:33:04 +01:00
Sarah Hoffmann
aeeee0d5da Merge pull request #2112 from lonvia/fix-tests-for-php-8
work around failing CI tests
2020-12-18 14:25:50 +01:00
Sarah Hoffmann
de03a0f924 work around failing CI tests
Force use of phpunit7 to avoid an issue with different sort order.
2020-12-18 10:58:09 +01:00
Sarah Hoffmann
5528918d5d Issue templates: require postgres config modifications 2020-12-18 10:37:01 +01:00
Sarah Hoffmann
9e0d5cb669 Issue templates: more commenting of instructions 2020-12-16 08:41:20 +01:00
Sarah Hoffmann
11622b2863 Issue templates: put instructions into comments 2020-12-16 08:38:04 +01:00
Sarah Hoffmann
9610664fc5 prepare 3.6.0 release 2020-12-13 11:59:16 +01:00
Sarah Hoffmann
eed2e3f2a8 Merge pull request #2105 from lonvia/fix-use-of-records
Use a typed record for place info in get_addressdata
2020-12-13 09:48:21 +01:00
Sarah Hoffmann
1b62162e08 actions: run tests also on postgresql 9.5
This is the oldest version for which a postgis is available.
2020-12-12 20:55:54 +01:00
Sarah Hoffmann
5cf573c340 use a typed record for place info in get_addressdata
Older versions of Postgresql cannot handle an untyped record
for INTO.

Fixes #2100.
2020-12-12 14:46:34 +01:00
Sarah Hoffmann
76cd8bf258 Merge pull request #2102 from mtmail/typo-in-reverse-md
Reverse.md: fix two typos
2020-12-12 10:16:23 +01:00
marc tobias
31ba7e7cf0 Reverse.md: fix two typos 2020-12-12 00:30:26 +01:00
Sarah Hoffmann
a338ba695b Merge pull request #2101 from lonvia/update-documentation
Update documentation
2020-12-11 18:02:44 +01:00
Sarah Hoffmann
9f6e2de4ed fix typos 2020-12-11 17:08:13 +01:00
Sarah Hoffmann
1b3acc4f8f update import times 2020-12-11 16:53:38 +01:00
Sarah Hoffmann
56cf3b362e Merge pull request #2099 from lonvia/update-country-names
Dynamically update country names from OSM data
2020-12-10 17:56:32 +01:00
Sarah Hoffmann
b59d01fe85 update country names
Copies all name:xx country names that are in OSM as of today
into the country name fallback table.
2020-12-09 17:52:25 +01:00
Sarah Hoffmann
65d8770b28 update country_names from OSM data
Update names in the coutry_names table on the fly from incomming
OSM country data. Adding a small sanity check that the country
must be an OSM relation and within the area where we expect the
country to be.
2020-12-09 11:38:19 +01:00
Sarah Hoffmann
38582c3e52 Merge pull request #2096 from otbutz/patch-1
Add link to discussions when creating new issue
2020-12-09 09:55:59 +01:00
otbutz
5f6fe6cab7 Add link to discussions when creating new issue 2020-12-09 09:33:42 +01:00
Sarah Hoffmann
34f8e6ddf7 remove bug reporting hints in favour of issue templates 2020-12-08 17:47:38 +01:00
Sarah Hoffmann
603367dced updates to admin and develop documentation
Mostly minor updates in wording and resource consumption.
2020-12-08 17:47:38 +01:00
Sarah Hoffmann
eaee87e73d update API documentation
* remove traces of HTML output
* add details on artificial objects (see also #1671)
* add geometry output documentation for lookup
* deprecate query by ID via reverse endpoint
* remove /search/<query> query format, no longer supported
* explain better what reverse geocoding does
* lots of smaller fixes to wording
2020-12-08 17:47:38 +01:00
Sarah Hoffmann
ea844db847 do not classify housenumbers as rare
House numbers are highly redundant, so don't even attempt to
do it as a rare name search. Greatly improves speed of such
queries.
2020-12-08 17:25:15 +01:00
Sarah Hoffmann
3ca8cc0344 Merge pull request #2092 from lonvia/update-osm2pgsql
Update osm2pgsql to 1.4.0 release
2020-12-08 10:56:25 +01:00
Sarah Hoffmann
f67c06f128 update osm2pgsql to 1.4.0 release 2020-12-08 10:18:18 +01:00
Sarah Hoffmann
7889509856 Create issue templates 2020-12-06 22:24:45 +01:00
Sarah Hoffmann
67c995aef6 Merge pull request #2090 from lonvia/avoid-contains-operator
avoid contains operator for geometries
2020-12-03 09:39:50 +01:00
Sarah Hoffmann
e20defeebd avoid contains operator for geometries
Postgis keeps messing up use of index in some circumstances.
2020-12-02 22:18:27 +01:00
Sarah Hoffmann
cca646a19e Merge pull request #2087 from lonvia/only-one-link-per-node
Place nodes can only be linked once against boundaries
2020-12-02 22:16:56 +01:00
Sarah Hoffmann
ddc2b4b806 Merge pull request #2088 from lonvia/update-osm2pgsql
update osm2pgsql
2020-12-02 22:16:38 +01:00
Sarah Hoffmann
0334915067 update osm2pgsql
Needs now an explicit switch to avoid propagating changes from
nodes to ways to relations.
2020-12-02 18:27:18 +01:00
Sarah Hoffmann
987d60ccda place nodes can only be linked once against boundaries
If a place node is already linked against a boundary, it should not
be used for linking again. It is usually a sign of a mapping error,
when there are multiple boundary candidates. This change just avoids
inconsistent data in the database, it does not guarantee that the
linking is against the more correct boundary.
2020-12-02 15:31:02 +01:00
Sarah Hoffmann
7d520bf448 also add explicit cast for varchar 2020-12-01 22:15:51 +01:00
Sarah Hoffmann
dc53288e6b Merge pull request #2085 from lonvia/add-address-rank-to-xml-output
add address rank to XML output
2020-12-01 19:59:29 +01:00
Sarah Hoffmann
db02673b60 Merge pull request #2084 from lonvia/fix-term-count
fix use of term count in partial terms
2020-12-01 19:59:07 +01:00
Sarah Hoffmann
cf9b248f29 add address rank to XML output
The address rank is much more interesting than the search rank
these days because it tells something about the kind of object.

Reverse did have neither rank, so add both for consistency.
2020-12-01 17:54:53 +01:00
Sarah Hoffmann
df12954312 fix use of term count in partial terms
Term count for partial words is one less than the actual number
of words. Take that into account when adding to the search rank.

Fixes #2081.
2020-12-01 17:21:01 +01:00
Sarah Hoffmann
a9357b4dce Merge pull request #2082 from lonvia/compute-address-on-the-fly-II
Compute address for POIs on the fly
2020-12-01 16:41:31 +01:00
Sarah Hoffmann
63544db8f9 null entries need to be typed 2020-12-01 14:54:42 +01:00
Sarah Hoffmann
7295cad715 compute address parts for rank 30 objects on the fly
Rank 30 objects usually use the address parts of their parent.
When the parent has address parts that are areas but not marked
as isaddress, then the parent might go through multiple administrative
areas. In that case recheck if the right area has been choosen
for the object in question instead of relying on isaddress.
Note that we really only have to do the recomputation in the
case of 'isarea = True and isaddress = False' which hopefully
keeps the number of additional geometric operations we have to do
to a minimum.

There is one more special case to be taken into account here: a
street may go through two administrative areas and a house along
that street is placed in one of the area while the addr:* tags
says it belongs to the other. In that case we must not switch
the isaddress to the one it is situated. To avoid that recheck
the address names against the name of the ara. That is not perfect
but should cover most cases.

Fixes #328.
2020-12-01 11:58:25 +01:00
Sarah Hoffmann
ff85da0a31 cleanup get_addressdata
Save location data in a ROW instead of using separate varaibles
for each value.
2020-11-30 22:54:36 +01:00
Sarah Hoffmann
75b2d7ca99 Merge pull request #2080 from donalhunt/fix-Migration.md-typos
Migration.md: fix typos, improve style consistency and readability.
2020-11-30 16:21:35 +01:00
Donal Hunt
3c9eeb11fa Migration.md: fix typos, improve style consistency and readability. 2020-11-30 11:59:10 +00:00
Sarah Hoffmann
63bacaee2e Merge pull request #2079 from lonvia/improve-progress-logging
Improve progress logging during indexing
2020-11-30 11:42:08 +01:00
Sarah Hoffmann
5016eace34 improve progress logging during indexing
Wait for 2 seconds before logging the first progress, so that we
have numbers that are a bit more reliable statistically speaking.

Also provides an actual implementation for the log_interval
parameter and fixes some small style issues.
2020-11-30 10:59:29 +01:00
Sarah Hoffmann
2e5c8b5cd3 Merge pull request #2077 from lonvia/optimize-large-rank-0-areas
Restrict size of features that get a full address search
2020-11-26 14:40:54 +01:00
Sarah Hoffmann
bf0f81adcb Merge pull request #2076 from lonvia/search-name-index-migration
Docs: add migration for search_name_* tables
2020-11-26 12:01:38 +01:00
Sarah Hoffmann
2db751700e restrict size of features that get a full address search
It would be nice to always compute addresses for rank 0 objects
over the complete geometry, so that they can be found via all
the admin boundaries that they intersect. However, there are a
couple of extramely large boundaries in OSM (like timezones)
where this results in thousands of possible address candidates
that need to be checked. Fall back to getting the address of the
centroid for them.
2020-11-26 11:53:58 +01:00
Sarah Hoffmann
62bee4ed37 docs: add migration for search_name_* tables 2020-11-26 09:18:33 +01:00
Sarah Hoffmann
1f07d63dc5 Merge pull request #2075 from lonvia/filter-postcodes-from-location-area-large
Filter postcodes by search rank when adding to address list
2020-11-25 21:42:27 +01:00
Sarah Hoffmann
cc1af99dbd filter postcodes by search rank when adding to address list
The post codes are the last part that does not fit the new
address ranking scheme. In particular, the search rank is still
relevant for choosing if a postcode should be included into
the address terms. Filter out irrelevant postcodes in
getNearFeatures() already, to avoid having to check for
geometry relation.
2020-11-25 21:01:33 +01:00
Sarah Hoffmann
c5d98effc0 Merge pull request #2074 from lonvia/add-housenumber-to-unknown-places
Improve finding addresses that have their own search_name entry because of unknown addr:* parts
2020-11-25 16:57:09 +01:00
Sarah Hoffmann
b68b2ff6b8 Merge pull request #2073 from lonvia/multi-word-partial-terms-in-search-description
Improve handling of multi-word partials in SearchDescription
2020-11-25 16:57:00 +01:00
Sarah Hoffmann
57f0d55c2e make phpcs happy 2020-11-25 16:14:31 +01:00
Sarah Hoffmann
3cf763475f do not use artificial housenumbers as names
If they are artificial they cannot have a search_name entry.
2020-11-25 16:11:32 +01:00
Sarah Hoffmann
0f87da017f improve handling of multi-word partials in SearchDescription
Multi-word partial terms had an undue advantage over separate partial
terms because they only need to pay the penalty once. This changes
the behaviour by setting the penalty according to the number of
words in the token. This should get rid of search interpretations
with low chance of matching.

This also fixes handling of exact term matching. We now match against
all exact terms of the query, not just a couple of them collected
while building the interpretations.

Also adds a penalty to very short postcodes.
2020-11-25 12:07:04 +01:00
Sarah Hoffmann
22800d7d59 Search housenumbers with unknown address parts by housenumber term
House numbers need special handling because they may appear after
the street term. That means we canot just use them as the main name
for searches where the address has its own search term entries.
Doing this right now, we are able to find '40, Main St, Town' but not
'Main St 40, Town'.

This switches to using the housenumber token as the name term instead.
House number tokens can get special handling when building the search
query that covers the case where they come after the street.

The main disadvantage is that this once more increases the numbers
of possible search interpretation of which we have already too many.

no penalty for housenumber searches
2020-11-25 11:36:10 +01:00
Sarah Hoffmann
f21853ea9d Merge pull request #2071 from lonvia/fix-more-ranks
Search rank 30 must always go with address rank 30
2020-11-24 21:45:30 +01:00
Sarah Hoffmann
1e76d668bd Merge pull request #2070 from lonvia/unlisted-places-to-rank-25
Move unlisted places to address rank 25
2020-11-24 21:45:16 +01:00
Sarah Hoffmann
b4b50eef15 search rank 30 must always go with address rank 30 2020-11-24 17:57:28 +01:00
Sarah Hoffmann
a9ad390b9e move unlisted places to address rank 25
Unlisted places are derived from addr:place and as such are
still places not streets.
2020-11-24 17:54:00 +01:00
Sarah Hoffmann
2e9e961fff Merge pull request #2068 from lonvia/fix-reverse-only
Do not create POI search terms in reverse-only mode
2020-11-24 08:22:48 +01:00
Sarah Hoffmann
13180989d9 Test --reverse-only with CI 2020-11-23 22:36:28 +01:00
Sarah Hoffmann
a4f1e40b72 do not create POI search terms on reverse-only
Fixes #2067.
2020-11-23 19:55:36 +01:00
Sarah Hoffmann
04d485c550 Merge pull request #2065 from rustycamper/patch-1
viewbox arguments are no longer accepter "in any order"
2020-11-23 09:55:29 +01:00
Pietro
a92bd1e2db viewbox arguments are no longer accepter "in any order"
Order should be longitude, then latitude
2020-11-23 10:40:43 +02:00
Sarah Hoffmann
f89e71a861 make sure that admin levels in NL are kept in order 2020-11-19 09:44:02 +01:00
Hendrik Morée
dcc075b34b Admin levels 8 and 10 of the Netherlands are municipal / city 2020-11-18 11:30:24 +01:00
Sarah Hoffmann
49083c2597 Merge pull request #2058 from lonvia/split-address-words
Split addr:* tags into words before adding to the search index
2020-11-18 08:58:17 +01:00
Sarah Hoffmann
29785ba166 Merge pull request #2059 from lonvia/include-parent-name-for-unknown-places
POIs with unknown addr:place must add parent name to address
2020-11-18 08:58:03 +01:00
Sarah Hoffmann
ffb2c93ba3 POIs with unknown addr:place must add parent name to address
The previous behaviour was a left-over from a former version
where such POIs parented to the street. Now that they parent to
places, it should be included.
2020-11-17 19:44:43 +01:00
Sarah Hoffmann
30a6b6bdac split addr: tags into words before adding to the search index
Address parts are only matched by single partial words. If
the addr: names are not split, then multi-word names cannot
be found.
2020-11-17 18:03:33 +01:00
Sarah Hoffmann
cc345f531a Merge pull request #2056 from lonvia/avoid-linking-postal-areas
Disallow linking for postcode areas
2020-11-17 11:15:56 +01:00
Sarah Hoffmann
9ede048769 disallow linking for postcode areas 2020-11-17 10:53:26 +01:00
Sarah Hoffmann
d23bf6e659 Merge pull request #2054 from lonvia/display-addr-terms
Merge places into address lists referred to by addr:* tags but not computed by Nominatim
2020-11-16 16:08:06 +01:00
Sarah Hoffmann
6b60f0ab03 use bool_or(ST_Intersects) instead of ST_Intersects(ST_Collect)
ST_Intersects segfaults on geometry collections for certain versions
of Postgis 3.
2020-11-16 15:28:01 +01:00
Sarah Hoffmann
aa9923bf07 fix typo 2020-11-16 15:28:01 +01:00
Sarah Hoffmann
9160cce6d8 remove unused columns in search_name_* and use right index
We only need the address rank these days, so get rid of
search rank. Also switch indexes to work on address rank.
2020-11-16 15:28:01 +01:00
Sarah Hoffmann
885dc0a8e1 more tests for absense of additional addressline entries 2020-11-16 15:28:01 +01:00
Sarah Hoffmann
7324431b12 get additional addresses for rank 30 objects
get_addressdata() now also checks if the place itself has entries
in the place_addressline table and merges them into the results.

Also restrict checking for address tag places to cases where the
name cannot be found in the parent's address search terms. Looking
up all address tags is just too slow.
2020-11-16 15:28:01 +01:00
Sarah Hoffmann
021f2bef4c get address terms from address tags for rank 30
For rank 30 objects add extra elements into the place_addressline
table.
2020-11-16 15:28:01 +01:00
Sarah Hoffmann
6260fef2e8 add test for placex from addr tags 2020-11-16 15:28:01 +01:00
Sarah Hoffmann
c7472662a6 lookup places for address tags for rank < 30
While previously the content of addr:* tags was only added
to the list of address search keywords, we now really look up
the matching place. This has the advantage that we pull in all
potential translations from the place, just like all the other
address terms that are looked up by neighbourhood search.

If no place can be found for a given name, the content of the
addr:* tag is still added to the search keywords as before.
2020-11-16 15:28:01 +01:00
Sarah Hoffmann
fecfe62fc6 Merge pull request #2055 from lonvia/fix-actions
Actions: update apt repo before installing software
2020-11-16 11:26:10 +01:00
Sarah Hoffmann
21b0430e46 actions: update apt repo before installing software 2020-11-16 10:14:38 +01:00
Sarah Hoffmann
66595c2d2b Merge pull request #2046 from lonvia/less-parallel-ranking
Only index larger batches for rank 30
2020-11-06 09:39:07 +01:00
Sarah Hoffmann
4ac29fb525 only index larger batches for rank 30
Fixes #2045.
2020-11-05 22:14:49 +01:00
Sarah Hoffmann
04d50a271d Merge pull request #2041 from lonvia/address-ranks-belgium
Adapt admin_levels for Belgium
2020-11-03 16:23:35 +01:00
Sarah Hoffmann
cbbda1ddf0 adapt admin_levels for Belgium
Fixes #272.
2020-11-03 10:46:52 +01:00
Sarah Hoffmann
928c6245c9 Merge pull request #2038 from lonvia/addresses-for-large-areas
Improve addresses for large areas
2020-11-03 08:49:01 +01:00
Sarah Hoffmann
ac9be161f6 Merge pull request #2039 from lonvia/migration-for-ui
Add  migration hints for UI removal and remove tests for icon attribute
2020-11-03 08:48:45 +01:00
Sarah Hoffmann
33378dcf6e remove tests for icon attribute
The icon attribute requires the CONST_MapIcon_URL to be present
which we cannot guarantee for the tests.
2020-11-02 16:46:29 +01:00
Sarah Hoffmann
e31a1f7ef1 docs: add migration hints for removed UI 2020-11-02 16:34:17 +01:00
Sarah Hoffmann
fa574ae9fd use different area estimates for large countries 2020-11-02 14:21:30 +01:00
Sarah Hoffmann
b2ebf4b4b7 adapt tests to rank changes of natural 2020-11-02 11:42:10 +01:00
Sarah Hoffmann
0f5615b618 guess a base address level for address rank 0 objects
The guess is based on the area and mainly avoids odd
addresses for very large or small objects.
2020-11-02 11:42:10 +01:00
Sarah Hoffmann
f050f898bc elevate most natural feature to address rank 22
Makes them be in par with landuse features.
2020-11-02 11:42:10 +01:00
Sarah Hoffmann
ce1c3bab6d Merge pull request #2032 from lonvia/remove-ui
Remove HTML output
2020-11-01 15:12:12 +01:00
Sarah Hoffmann
5cdabc5173 Merge pull request #2035 from lonvia/add-index-to-multicountry-script
docs: need to index after updating with a file
2020-11-01 15:11:02 +01:00
Sarah Hoffmann
42775f959b docs: need to index after updating with a file
Fixes #2031.
2020-10-31 22:53:08 +01:00
Sarah Hoffmann
7f55dcef3a use simpler recurse operator for overpass download
Also fixes a typo in the OSM link.
2020-10-31 21:44:28 +01:00
Sarah Hoffmann
ba8accf4bb make phpcs happy 2020-10-29 11:36:16 +01:00
Sarah Hoffmann
e62db51b06 vagrant: setting website URL is not longer necessary 2020-10-29 11:13:32 +01:00
Sarah Hoffmann
b81894d3d5 remove now unused settings related to website
There are two places where the website URL is still used:
for icons, replace the URL with a link to the icon repository
of the UI repo. The more URL now builds the link from the
server info.
2020-10-29 11:13:32 +01:00
Sarah Hoffmann
d86cf6801f remove tests for HTML output 2020-10-29 11:13:32 +01:00
Sarah Hoffmann
c0d21d0bd3 remove HTML output and UI elements 2020-10-29 11:13:04 +01:00
Sarah Hoffmann
b838f66dd7 remove hierarchy endpoint
This endpoint was never maintained and most of the information
can be obtained via the details endpoint.
2020-10-29 11:13:03 +01:00
Sarah Hoffmann
db9cc270b3 Merge pull request #2030 from lonvia/improve-ci
Small improvements for github actions run
2020-10-28 15:20:40 +01:00
Sarah Hoffmann
4147e04319 action: cache downloaded dependencies 2020-10-28 14:51:05 +01:00
Sarah Hoffmann
a888f6ff93 Merge pull request #2027 from lonvia/remove-duplicate-admin-boundaries
Handle duplicated admin boundaries
2020-10-28 11:11:42 +01:00
Sarah Hoffmann
bd04c49bc1 actions: tweak database settings
Disabling fsync and friends should speed up the CI run
significantly.
2020-10-28 11:10:14 +01:00
Sarah Hoffmann
5872b81232 use highest admin boundary for duplicated ones 2020-10-28 10:49:26 +01:00
Sarah Hoffmann
abd20d3ca6 disable admin level 5 in Russia
They either interfere with cities or refer to historical boundaries.
2020-10-28 10:49:26 +01:00
Sarah Hoffmann
95f83b90d2 minor fixes for geometry compuation during boundary ranking
Go back to using centroid when determining if one admin level
is within another. There are cases where boundaries are slightly
misaligned due to mapping errors (not using the same ways in the
relations).

Only declare boundaries the same when they have the same wikidata
tag _and_ have exactly the same geometry. This works around tagging
errors with the wikidata tag, which happen because of automated
edits to the wikidata tag.
2020-10-28 10:49:26 +01:00
Sarah Hoffmann
7a16909219 detect and remove admin boundary duplicates
The Polish community maps admin boundaries that span multiple
levels by duplicating the boundary relations. Detect this situation
by looking out for matching wikidata tags. The higher ranked
duplicates are then thrown out from the address pool by setting
their address rank to 0.
2020-10-28 10:49:26 +01:00
Sarah Hoffmann
ee4684e6a9 Merge pull request #2029 from lonvia/master
Switch CI to github actions
2020-10-28 10:12:52 +01:00
Sarah Hoffmann
a2b1a1eb33 switch CI badge to github actions 2020-10-28 10:11:22 +01:00
Sarah Hoffmann
7bc0fc9611 switch CI to github acitons 2020-10-28 09:29:22 +01:00
Sarah Hoffmann
e0e18e2b6f Merge pull request #2025 from lonvia/fix-secondary-importance-for-countries
Improve secondary result ordering for administrative boundaries
2020-10-22 13:43:22 +02:00
Sarah Hoffmann
788ba6d985 adjust secondary order when no addressimportance available
In cases of countries and remote places without an address
it is possible that 'addressimportance' comes back empty.
Adjust the 'foundorder' to the places importance instead
in such cases.

Fixes #2023.
2020-10-22 10:20:16 +02:00
Sarah Hoffmann
b012e15245 readd boundary:administrative to class importance 2020-10-22 10:16:04 +02:00
Sarah Hoffmann
ba31456278 Merge pull request #2022 from lonvia/populate-rank-25
reorganize ranks of high-level place types
2020-10-20 23:26:18 +02:00
Sarah Hoffmann
b661c66c00 reorganize ranks of high-level place types
Rank 25 is now available for places that should appear in addresses
but not when a street is present. Use this for som block-like
place types. Also document the particularity of rank 25.

subdevisions and allotments are now at the same level as landuse
which they are frequently used together with.
2020-10-20 20:20:49 +02:00
Sarah Hoffmann
f0372736aa Merge pull request #2019 from lonvia/find-pgconfig-inpostgresql-repo
Add support for finding pg_config in Postgresql repos
2020-10-20 15:02:00 +02:00
Sarah Hoffmann
e286d3f23d add support for finding pg_config in Postgresql repos
It uses the same PostgreSQL_ADDITIONAL_VERSIONS variable as
osm2pgsql so that setting that should be sufficient to make
it work.
2020-10-20 11:39:00 +02:00
Sarah Hoffmann
6c8a7b0a1a Merge pull request #2018 from lonvia/update-osm2pgsql
Update to latest osm2pgsql
2020-10-20 11:33:36 +02:00
Sarah Hoffmann
3107dc208a update to latest osm2pgsql
Important changes:

* fix disabling of JIT in Postgresql
* support for finding latests Postgresql from their repos
* no longer create nodes table with flatnodes
2020-10-20 11:08:07 +02:00
Sarah Hoffmann
bf4d75458c add explicit bbox contains check
Now that the containment check uses ST_Relate, we need to add
a separate bbox contains check to ensure that Postgis does the
efficient check first. Note that we still cannot get rid of the
overlap(&&) check because then Postgis will use the wrong indexes.
2020-10-19 10:39:01 +02:00
Sarah Hoffmann
3604d0d913 Merge pull request #2016 from lonvia/locale-address-russia
add country-specific address ranks for Russia
2020-10-18 09:48:23 +02:00
Sarah Hoffmann
dd06638dec Merge pull request #2015 from lonvia/cleanup-address-computation
Rework collection of address parts
2020-10-18 09:48:05 +02:00
Sarah Hoffmann
73a0ec22a3 add country-specific address ranks for Russia
Removes admin level 7, which should not exist and promotes
admin level 8 to municipality level.

place=municipality is only used for boroughs of St. Petersburg,
so demote to level 18.

Fixes #926.
2020-10-17 17:54:06 +02:00
Sarah Hoffmann
b0ef84caae add tests for rank computation 2020-10-17 17:51:22 +02:00
Sarah Hoffmann
64899ef54b add tests for address computation 2020-10-16 11:07:17 +02:00
Sarah Hoffmann
1064a9264e revert to && comparison for geometries
Postgis 3 picks the wrong index when using ~ or @.
2020-10-16 09:49:48 +02:00
Sarah Hoffmann
acfa7bec9c use computed centroid for location_area_large
The new address computation assumes that the centroid is inside
the area. Therefore we cannot use the centroid function. Use the
pre-computed centroid instead which has already been corrected to
be inside the area.
2020-10-15 17:30:52 +02:00
Sarah Hoffmann
62b94e838b correctly set from area column in place_addressline
This was always set to true which brings us to the question
if it is even still needed.
2020-10-15 12:06:53 +02:00
Sarah Hoffmann
5236e7a03e fix use of geometry operators
@ is contained by while ~ is contains.
2020-10-15 12:06:18 +02:00
Sarah Hoffmann
7e9412a044 demote admin boundaries for place areas
Also demote the address rank of an admin boundary when there
is a place area of higher rank that completely contains the
area. This catches the case where city boundaries do not exactly
align with administrative units (see for example Moscow).
2020-10-14 11:33:47 +02:00
Sarah Hoffmann
e47c19beb9 exclude rank 25 when computing addresses of streets
Address rank 25 is used for squares which are address-wise on the
same level as streets.
2020-10-13 22:36:17 +02:00
Sarah Hoffmann
2fe3c654fc overhaul address computation
This is a complete rewrite of the selection of address parts to
be inserted into the place_addressline table.

The new algorithm selects for each rank:
* the boundary overlapping with the addressee and contained
  in the already selected boundaries of lower rank, or failing that
* the place node closest to the addressee that is contained in
  the already selected boundaries and in the influence radius
  of already selected place nodes of lower rank

Place nodes that are not contained in already selected boundaries
of lower rank are completely thrown away. All other candidates are
added as non-address parts.
2020-10-13 22:10:07 +02:00
Sarah Hoffmann
5ec48c66cb move ordering out of getNearFeatures
The two places where the function is called have different ordering
requirement.
2020-10-13 15:24:54 +02:00
Sarah Hoffmann
4e7ec92d6f Merge pull request #2012 from lonvia/format-reverse-debug-output
use Debug class for formatting reverse debug output
2020-10-13 09:23:43 +02:00
Sarah Hoffmann
e115de47fc use Debug class for formatting reverse debug output 2020-10-12 17:12:03 +02:00
Sarah Hoffmann
bb5ffd3904 Merge pull request #2011 from lonvia/increase-city-radius
Increase radius of influence around city nodes
2020-10-12 16:17:39 +02:00
Sarah Hoffmann
887ae7fcab increase radius of influence around city nodes
The current radius does not cover cities with more than a
million inhabitants well.
2020-10-12 14:17:37 +02:00
Sarah Hoffmann
abaaf942cb Merge pull request #2004 from lonvia/demote-place-nodes-in-admin-areas
demote place nodes in admin areas
2020-10-12 11:54:52 +02:00
Sarah Hoffmann
ff47f6f65d when linking always check against original address rank 2020-10-11 12:29:49 +02:00
Sarah Hoffmann
ca680fc9fc make housenumber interpolation tests more lenient 2020-10-11 12:04:53 +02:00
Sarah Hoffmann
b04463bb2d demote place nodes in admin areas
If a place node of city rank and above finds itself in an
administrative boundary of the same address rank, then
increase the address rank by 2. This catches the rather
frequent case where city suburbs are tagged for historical
reasons as towns or villages.
2020-10-11 12:04:53 +02:00
Sarah Hoffmann
c66f701232 Merge pull request #2008 from lonvia/docs-missing-index-migration
docs: migration to new wikipedia needs new index
2020-10-11 11:05:59 +02:00
Sarah Hoffmann
3b23144ae6 docs: migration to new wikipedia needs new index
Fixes #1998.
2020-10-11 10:40:23 +02:00
Sarah Hoffmann
f7a9462337 Merge pull request #2006 from mtmail/ubuntu-20-postgresql-contrib
Ubuntu 20: use postgresql-contrib-12 so no version higher gets installed
2020-10-11 09:45:37 +02:00
Sarah Hoffmann
17f24d0061 Merge pull request #2003 from lonvia/admin-levels-indonesia
Adapt address levels for admin boundaries in Indonesia
2020-10-11 09:43:33 +02:00
marc tobias
eb8bce22b8 Ubuntu 20: use postgresql-contrib-12 so no version higher gets installed 2020-10-10 02:42:46 +02:00
Sarah Hoffmann
6a31691121 adapt address levels for admin boundaries in Indonesia 2020-10-09 22:28:06 +02:00
Sarah Hoffmann
25c4bf6ed4 Merge pull request #2002 from lonvia/analyse-indexing-script
Add script for detailed explaing of indexing trigger
2020-10-09 20:23:51 +02:00
Sarah Hoffmann
ec5743bcc0 add script for detailed explaing of indexing trigger 2020-10-09 17:38:33 +02:00
Sarah Hoffmann
7cd330ee55 Merge pull request #1996 from lonvia/remove-postcode-search-structured
Restrict postcode searches to postcode in first token
2020-10-06 17:04:22 +02:00
Sarah Hoffmann
7d2b6879c8 restrict postcode searches to postcode in first token
In structured queries we should only assume that it is
a postcode search when only the postcode and optionally
the country is given. If any other term is present, it
is better to avoid the search for postcode as it yields
too many bad searches. Given that the terms in a structured
query are ordered, this means that the postcode must be
the first token just like in the unstructured query.

Fixes #1988.
2020-10-06 14:08:31 +02:00
Sarah Hoffmann
a40684162a Revert "adapt tests to rank_search removal"
This reverts commit 2a717da850.
2020-10-06 13:59:50 +02:00
Sarah Hoffmann
cd997ff058 Merge pull request #1995 from lonvia/update-osm2pgsql
update to latest osm2pgsql version
2020-10-05 18:14:29 +02:00
Sarah Hoffmann
5391bb6cf7 update to latest osm2pgsql version
The latest version of osm2pgsql no longer creates indexes on
the members of planet_osm_rels. So we do that ourselves.
Given that we only need to access associated street relations,
the index can become quite a bit smaller.
2020-10-05 17:11:13 +02:00
Sarah Hoffmann
6b8ce1ee74 Merge pull request #1986 from mtmail/document-drop-idx_placex_geometry_reverse_lookupPoint
migration guide: idx_placex_geometry_reverse_lookupPoint can be dropped
2020-10-05 16:18:46 +02:00
Sarah Hoffmann
a37d46ec25 Merge pull request #1993 from mtmail/country-name-eSwatini
country names: Swaziland => eSwatini
2020-10-05 16:17:54 +02:00
marc tobias
b5c8237118 country names: Swaziland => eSwatini 2020-10-04 13:38:15 +02:00
marc tobias
eeffe2caf4 migration guide: idx_placex_geometry_reverse_lookupPoint can be dropped 2020-09-30 14:28:20 +02:00
Sarah Hoffmann
c5c7a6a453 Merge pull request #1984 from lonvia/remove-reverse-index
Remove unused idx_placex_geometry_reverse_lookupPoint
2020-09-30 12:02:34 +02:00
Sarah Hoffmann
b969392f40 remove removed index from database check 2020-09-30 11:33:15 +02:00
Sarah Hoffmann
40bc1752c2 remove unused idx_placex_geometry_reverse_lookupPoint
The index has been unused ever since the query using it was
changed two years ago. Given that it has not been missed much,
drop it completely here.
2020-09-30 09:21:35 +02:00
Sarah Hoffmann
851c3779b5 Merge pull request #1982 from lonvia/more-rank-search-removal
More rank search removal
2020-09-26 11:08:20 +02:00
Sarah Hoffmann
2a717da850 adapt tests to rank_search removal 2020-09-26 09:10:37 +02:00
Sarah Hoffmann
f8694da3c9 Remove more rank_search usage from address computation
Fixes #1904.
2020-09-25 17:50:36 +02:00
Sarah Hoffmann
eeb2b1f998 Merge pull request #1980 from lonvia/add-descriptive-term-for-address-rank-24
add descriptive term for address rank 24
2020-09-25 16:56:35 +02:00
Sarah Hoffmann
1db8f7e353 add descriptive term for address rank 24
With that term we have terms for all ranks, so that no generic
'administrative' term will show up in the address details anymore.
2020-09-25 16:02:17 +02:00
Sarah Hoffmann
6625e93be6 Merge pull request #1975 from lonvia/simplify-parent-assignment-for-unlisted-places
Use closest containing place area for parent of unlisted addr:place
2020-09-23 19:10:42 +02:00
Sarah Hoffmann
56053988bf Merge pull request #1970 from lonvia/remove-duplicate-geometry-check
Switch recursive updates to using rank_search
2020-09-23 18:51:35 +02:00
Sarah Hoffmann
d9325dc11a use rank_address when invalidating containing objects
Only rank_address is now relevant for determining if a place
could be part of an address.
2020-09-23 17:44:31 +02:00
Sarah Hoffmann
d3ca9dd3f7 remove ST_Covers check when also testing for ST_Intersects
Using both is slightly problematic because they have different
ways to use the index. Newer versions of Postgis exhibit a
query planner issue when both functions appear together.
As ST_Intersects includes ST_Covers, simply remove the latter.
2020-09-23 17:44:31 +02:00
Sarah Hoffmann
e552f6bce5 use closest containing place for unlisted addr:place
We can't use getNearFeatures() to determine the parent of a
place with an unlisted addr:place because this function
returns place nodes that are potentially outside the area
of interest. Doing the complete address computation is too
expensive, so simply use the area with the largest rank that
contains the feature instead.
2020-09-23 17:33:42 +02:00
Sarah Hoffmann
cf23e10382 Merge pull request #1974 from lonvia/show-unknown-addr-place
add unknown addr:place to address output
2020-09-23 15:26:12 +02:00
Sarah Hoffmann
c84e7e72f1 add unknown addr:place to address output
When a POI has no addr:street but an addr:place that is not
contained in the name list of the parent place, then remember
this situation and merge the content of addr:place into the
address output.

We don't need to care about translations in this case because
it is obvious that no object with translations exists if the
parent isn't the object named in addr:place.
2020-09-23 11:55:18 +02:00
Sarah Hoffmann
f2ff351da4 Merge pull request #1971 from lonvia/drop-support-for-isin
Drop support for is_in tag
2020-09-23 09:20:35 +02:00
Sarah Hoffmann
c5c242d193 Merge pull request #1972 from lonvia/exclude-unnamed-highway-areas
Exclude unnamed highway areas
2020-09-23 09:20:16 +02:00
Sarah Hoffmann
72193a1c23 exclude unnamed highway areas
These are used to mark large paved areas. Sometimes they exists
together with named regular streets. In such cases the unnamed
area may overshadow the actual street when computing the address
parent. As unnamed highways are not very useful anyway, we
simply remove them from the database.
2020-09-22 21:42:13 +02:00
Sarah Hoffmann
248d6b413a remove test for is_in 2020-09-22 21:36:49 +02:00
Sarah Hoffmann
d04e87fb80 drop suport for is_in tag 2020-09-22 20:26:36 +02:00
Sarah Hoffmann
cc61b74cde Merge pull request #1965 from lonvia/make-addr-tags-searchable
Make addresses searchable by their addr: tags
2020-09-22 20:21:06 +02:00
Sarah Hoffmann
915d362b11 Merge pull request #1966 from lonvia/remove-dead-code
remove dead code
2020-09-21 10:59:58 +02:00
Sarah Hoffmann
f8a5f2964f remove dead code
The SQL query has moved into the addTokensFromDB() funtion.
2020-09-21 10:39:14 +02:00
Sarah Hoffmann
a8dfbcef44 always bind addr:place to place instead of street
If an addr:place is given but no addr:street tag, then bind
the rank 30 object always to a <=25 object, even when there
is none found with the same name.
2020-09-21 10:15:14 +02:00
Sarah Hoffmann
caea14d035 merge addr tags into search_name table
When a place of rank 30 has addr tags that are not covered by the
search terms of the parent, add a separate entry for the POI in
the search_name table that includes the addr tags. We can only
do that with named places. For POIs without a name the housenumber
is used as name. If that is not available either, searching still
won't work.
2020-09-21 10:15:14 +02:00
Sarah Hoffmann
c5fc12e04b Merge pull request #1964 from lonvia/remove-postcodes-with-colon
ignore postcodes with colons
2020-09-19 17:47:43 +02:00
Sarah Hoffmann
731c620e31 ignore postcodes with colons
Colons are used as a delimiter in tiger:left and tiger:right tags
when multiple postcodes are given. Ignore those. This was already
done in the postcode update script. This changes just makes the
two places consistent where postcodes are added.
2020-09-19 17:23:40 +02:00
Ben Antony
0dad470eb9 Update broken links in documentation 2020-09-18 22:52:59 +02:00
Sarah Hoffmann
6947ab3a65 Merge pull request #1963 from lonvia/remove-postcodes-from-search-index
Remove postcodes from search index
2020-09-18 22:41:24 +02:00
Sarah Hoffmann
b219374d36 remove special casing for rank 25 postcodes
They can be computed like any other place.
2020-09-18 16:18:02 +02:00
Sarah Hoffmann
576ee5aaab use same label for all types of postcode in address 2020-09-18 16:17:30 +02:00
Sarah Hoffmann
4c9cfe2532 remove postcodes entirely from indexing
place=postcode places are artificial places that collect addr:postcode
points for aggration. They should neither show up in the address nor
be searchable. That means that there is no need to index them at all.
Only let boundary=postal_code through which define correct areas for
postcodes.
2020-09-18 15:09:35 +02:00
Sarah Hoffmann
7fb62ea904 postal boundary may be imported without name
Postal boundaries usually just have the postcode tag set and are
therefore officially 'nameless'. We want to have them as
boundary=postal_code anyways in order to distiguish them from postcode
points inherited from addr: tags.
2020-09-18 11:33:45 +02:00
Sarah Hoffmann
fe250d3ee8 Merge pull request #1961 from lonvia/set-place-type-for-result-in-address
Use place type of for result object in address parts
2020-09-17 21:23:40 +02:00
Sarah Hoffmann
6f55c67d16 Merge pull request #1960 from lonvia/fix-postcodes-duplicated-by-normalization
Make sure that all postcodes have an entry in the word table
2020-09-17 21:23:23 +02:00
Sarah Hoffmann
3aa6e6a365 Merge pull request #1957 from lonvia/docs-separate-out-deployment
Restructure vagrant scripts and installation documentation
2020-09-17 21:22:48 +02:00
Sarah Hoffmann
b8ced5d96b Merge pull request #1962 from mtmail/travis-ci-without-webserver
travis-ci: we dont need Apache installed
2020-09-17 20:38:17 +02:00
marc tobias
9e21c6a862 travis-ci: we dont need Apache installed 2020-09-17 18:25:23 +02:00
Sarah Hoffmann
fe8566928e use place type of for result object in address parts
Boundaries shound derive the address part type from the
linked place if possible. This was already implemented
for the address objects but not for the address information
from the address itself.

Fixes #1949.
2020-09-17 18:17:01 +02:00
Sarah Hoffmann
3600709116 make sure that all postcodes have an entry in word
It may happen that two different postcodes normalize to exactly
the same token. In that case we still need two different entries
in the word table. Token lookup will then make sure that the correct
one is choosen.

Fixes #1953.
2020-09-17 17:11:22 +02:00
Sarah Hoffmann
df115c73b2 Merge pull request #1945 from mtmail/travis-ubuntu-20
Upgrade Travis-CI from Ubuntu 18 to 20
2020-09-17 16:20:19 +02:00
Sarah Hoffmann
dbe025fe40 docs: fix formatting 2020-09-17 10:16:25 +02:00
Sarah Hoffmann
2b11a47a2f restructure developer's manual
Add a section on setting up the development environment which now
also includes the former chapter on recreating the documentation.
Move the README from test/ into the manual as the new Testing
chapter.
2020-09-17 09:54:46 +02:00
Sarah Hoffmann
fb63a7418e make CentOS 8 the default vagrant script
This puts it in line with the Ubunutu scripts.
2020-09-16 17:34:36 +02:00
Sarah Hoffmann
2029931b95 adapt Ubunut-20 vagrant file to triple webserver config 2020-09-16 16:27:34 +02:00
Sarah Hoffmann
91219bb3dd restructure webserver setup in ubuntu 18 script
Unify the two vagrant scripts for Ubuntu 18. The script can now
be run in three modes: no webserver, with apache, with nginx.
The default mode is to not install any webserver at all. This is
normally sufficient when just developping.

The commit also switches from bento to generic boxes and adds config
for running with a libvirt provider. You need an NFS deamon for
synchronized folders.
2020-09-16 11:19:38 +02:00
Sarah Hoffmann
ab4fe4d58a add 'make serve-global'
This runs the PHP development server in a mode where it listens
globally. This is needed when running inside vagrant and port-forwarding
to the host machine.
2020-09-16 11:15:55 +02:00
Sarah Hoffmann
8ff1f16b7f remove host from default website URL
Just assume that Nominatim runs under the root URL. This is a
more versatile base that also makes 'make serve' work out of the
box.
2020-09-16 11:13:51 +02:00
Sarah Hoffmann
cf4f62c82c docs: move webserver installation into separate chapter 2020-09-15 23:51:25 +02:00
marc tobias
1d2f4264a2 Upgrade Travis-CI from Ubuntu 18 to 20 2020-09-14 12:44:50 +02:00
Sarah Hoffmann
9d506a4afa Merge pull request #1943 from mtmail/no-get-magic-quotes
starting PHP 5.4 get_magic_quotes_gpc() returns false, no need to check
2020-09-14 08:47:49 +02:00
marc tobias
7ac22e9227 starting PHP 5.4 get_magic_quotes_gpc() returns false, no need to check 2020-09-14 00:45:22 +02:00
Sarah Hoffmann
47f2fe724f Merge pull request #1940 from mtmail/north-macedonia
country_name Macedonia => North Macedonia
2020-09-12 16:15:22 +02:00
marc tobias
130571fb29 country_name Macedonia => North Macedonia 2020-09-07 17:24:06 +02:00
Sarah Hoffmann
da7218350b Merge pull request #1936 from lonvia/tweeking-of-ranks
More fine tuning of default rank assignments
2020-09-01 21:23:28 +02:00
Sarah Hoffmann
b6078de6f8 adapt tests to ranking changes 2020-09-01 18:03:17 +02:00
Sarah Hoffmann
07430b0194 tweak size of large POIs
Further reduce the size from which on POIs are no longer bound
to streets but only to larger objects. The point of reference,
of what a largest POI could be that is still bound is JFK airport.
2020-09-01 18:00:40 +02:00
Sarah Hoffmann
fae02fab00 address rank adjustment for addressable boundaries only
Only administrative boundaries with an address rank need
to be adjusted. Otherwise just handle them like any other
object.
2020-09-01 17:59:26 +02:00
Sarah Hoffmann
a68cdc40be improve fallback ranking
Boundaries and places now always get a rank < 26 to make sure that
they do not parent to a street. Skip boundary=place completely
because they will be covered throught the secondary place tag.
2020-09-01 17:55:40 +02:00
Sarah Hoffmann
76b307f42a Merge pull request #1934 from lonvia/fix-deletion-of-large-highway-areas
Do not block deletion of large highway areas
2020-08-28 10:08:42 +02:00
Sarah Hoffmann
6e4b7eb966 do not block deletion of large highway areas
Deletion of areas should only e blocked for addressable features.
Streets and POIs do not have a large impact on updates.
2020-08-28 09:49:21 +02:00
Sarah Hoffmann
770754ae2c place lookup: filter places that have no details
In rare cases search_name might have entries for places for
which we do not return details, in particular for linkees.
Need to remove those entries in the result list before returning
the details.

Fixes #1932.
2020-08-27 09:33:21 +02:00
Sarah Hoffmann
a932855f6f Merge pull request #1931 from lonvia/stable-sort-for-results
Reranking of results must be stable
2020-08-26 20:52:17 +02:00
Sarah Hoffmann
72ee1abc90 ensure that ordering by importance is stable
The initial search results retrieved from the database already come
preordered, either by importnace or by distance. We want to keep
that order if all other things are equal.
2020-08-26 17:42:43 +02:00
Sarah Hoffmann
9e1909643c PlaceLookup should return results in input order 2020-08-26 17:15:11 +02:00
Sarah Hoffmann
77a1329285 Merge pull request #1930 from lonvia/add-support-for-squares
Add support for place=square
2020-08-26 15:11:45 +02:00
Sarah Hoffmann
be6ecc388c add support for place=square
Squares are now addressable (on address level 25) and thus can
be attached to a house number via addr:place. Needed to increase
the rank range for matching up addr:place to 25.
2020-08-26 12:12:52 +02:00
Sarah Hoffmann
13dba94307 do not run rank 0 objects in parallel
Waterways are at address rank 0 and do linking. This might lead to
deadlocks.
2020-08-22 19:51:19 +02:00
Sarah Hoffmann
d51440bb5d Merge pull request #1926 from lonvia/speed-up-location-lookup
Increase splitting for large geometries
2020-08-22 17:04:34 +02:00
Sarah Hoffmann
d730e179bf tests: use larger grid to avoid rouding errors 2020-08-22 16:04:24 +02:00
Sarah Hoffmann
559fe513fa increase splitting for large geometries
When computing the address parts for a geometry, we need to do
a ST_Relates lookup in the location_area_large_* tables. This is
potentially very expensive for geometries with many vertices.
There is already a funtion for splitting large areas to reduce the
impact. This commit reduces the minimum area of a split, effectively
increasing the number of splits.

The effect on database size is minimal (around 3% increase), while
the indexing speed for streets increases by a good 60%.
2020-08-20 16:37:33 +02:00
Sarah Hoffmann
5b20fa7e38 Merge pull request #1923 from lonvia/split-indexing-for-boundries
Rework indexing order of places
2020-08-20 15:03:29 +02:00
Sarah Hoffmann
d16e75de91 Merge pull request #1924 from lonvia/installation-instructions-external-server
docs: installation hints for external databases
2020-08-19 15:17:32 +02:00
Sarah Hoffmann
6fd9994590 docs: installation hints for external databases
Fixes #1882.
2020-08-19 15:03:42 +02:00
Sarah Hoffmann
b9729f3b66 Merge pull request #1921 from lonvia/skip-over-traffic-signs
Remove traffic signs from full styles
2020-08-19 11:50:28 +02:00
Sarah Hoffmann
d6ff7475f1 make sure that addr:* tags can always be searched for
Always add contents of addr:* tags into address part of the search
table, even when there is no corresponding other name. This keeps
search tolerant to the kind of tagging where parts show up in the
address that have no corresponding object in the database or where
it is only an unaddressable object.
2020-08-19 11:44:10 +02:00
Sarah Hoffmann
984979d9bf add migration for new indxing schema 2020-08-18 21:40:53 +02:00
Sarah Hoffmann
73c449b97b switch indexind to address rank
A place needs all lower address rank object indexed to make up
the address. The search rank no longer ensures that as it can have
a different ordering than the address rank.

This switches indexing rank order to address ranks. Non-address
objects (with address rank 0) are indexed together with POIs.
2020-08-18 16:58:58 +02:00
Sarah Hoffmann
1529666232 use only centroid to get parent admin boundaries
Using the full geometry is far too expensive.
2020-08-18 15:17:09 +02:00
Sarah Hoffmann
3816b86a9e nominatim: also index boundaries by rank
We need to make sure that the entry in serach_name from a lower rank
is indeed available.
2020-08-18 15:17:09 +02:00
Sarah Hoffmann
a4b30fc649 index admin boundaries before everything else
Avoids irregularities that might happen because the address
rank of a boundary is changed through linking.
2020-08-18 15:17:09 +02:00
Sarah Hoffmann
fc50eb8688 nominatim: move DBConnection class into its own file 2020-08-18 15:17:09 +02:00
Sarah Hoffmann
071db1fae7 remove traffic signs from full styles
Traffic signs rarely have name and are therefore mostly not
searchable. Remove them completely. Allow street lamps only when
they have a name. Removes about 2M object from a planet instance.
2020-08-15 22:37:45 +02:00
Sarah Hoffmann
a163ea63c5 Merge pull request #1920 from lonvia/remove-linked-place-when-updating
Remove linked_place from extratags when updating
2020-08-14 09:44:56 +02:00
Sarah Hoffmann
e21a707166 remove linked_place from extratags when updating
Before updating an admin boundary we need to make sure that any
artificially generated 'linked_place' entry is removed from the
extratags column. This ensures that the place designation does
not linger when a linked place disappears and that it is updated
when the linking changes.
2020-08-13 16:59:11 +02:00
Sarah Hoffmann
1b484fa90d Merge pull request #1919 from lonvia/tests-for-ranking
More tests and fixes for address rank computation
2020-08-13 09:13:10 +02:00
Sarah Hoffmann
06aa0f0b76 use address rank for address forming when available 2020-08-12 22:22:24 +02:00
Sarah Hoffmann
fb8bb30144 boundary address ranks must not go above 25
Fixes #1914.
2020-08-12 22:22:24 +02:00
Sarah Hoffmann
7429a33818 add simple tests for address rank computation 2020-08-12 22:22:24 +02:00
Sarah Hoffmann
5b9f61cff8 also take place tags into account for address rank
An admin boundary might have a place tag but no matching place node.
We still should use the place value as indicator for the address
rank in this case.
2020-08-12 22:22:24 +02:00
Sarah Hoffmann
4ff0e20e1f Merge pull request #1917 from lonvia/docs-rank-levels
docs: add tables for the meaning of address and search ranks
2020-08-11 15:12:16 +02:00
Sarah Hoffmann
a692bfa8f9 docs: add tables for the meaning of address and search ranks
Also makes tables a bit more readable by adding margins and better
headers.
2020-08-11 11:48:55 +02:00
Sarah Hoffmann
49dd927406 Merge pull request #1918 from lonvia/remove-more-osmosis-init
remove more traces of --osmosis-init switch
2020-08-11 11:48:17 +02:00
Sarah Hoffmann
4b21cc1737 remove more traces of osmosis-init 2020-08-11 10:43:04 +02:00
Sarah Hoffmann
7ae16f7302 Merge pull request #1912 from lonvia/remove-unused-import-update-functions
remove unused functions from setup and update
2020-08-09 09:31:48 +02:00
Sarah Hoffmann
73566a9f15 remove unused functions from setup and update
Removes the defunct --osmosis-init and --no-api switches and the
unsupported (and unnecessary) deduplicate. Also removes
'experimental' from --setup-website as this is a required
function now.
2020-08-06 16:16:35 +02:00
Sarah Hoffmann
fbdf205ab4 Merge pull request #1909 from lonvia/minor-fixes
Make SQL debug statements execute again
2020-08-06 11:07:38 +02:00
Sarah Hoffmann
83b2b4970d Make SQL debug statements execute again
There were some old variable names used that are no longer valid.
Either fix them or remove the statement completely.

Fixes #1907.
2020-08-06 09:29:19 +02:00
Sarah Hoffmann
f29dc7d7ac Merge pull request #1865 from mtmail/how-to-import-test-db
test/README.md - more instructions how to import test db
2020-08-04 14:31:19 +02:00
Sarah Hoffmann
4d5db74c18 Merge pull request #1902 from lonvia/avoid-touching-boundaries-in-addresses
Be more strict what areas make up an address
2020-08-04 14:30:08 +02:00
Sarah Hoffmann
8c7d285e03 Merge pull request #1901 from lonvia/speed-up-indexing
Batch-index places at rank 30
2020-08-04 12:32:16 +02:00
Sarah Hoffmann
1347abb1e7 be more strict what areas make up an address
Exclude boundaries that touch a line in only one point and
that touch areas only along the boundary.

Fixes #1900.
2020-08-04 12:08:50 +02:00
Sarah Hoffmann
2cb85e48b4 adapt test results to new ranking 2020-08-03 16:57:22 +02:00
Sarah Hoffmann
5be084e0f5 indexer: allow batch processing of places
Request and process multiple place_ids at once so that
Postgres can make better use of caching and there are less
transactions running.
2020-08-03 10:32:39 +02:00
Sarah Hoffmann
2323923bec indexer: move progress tracker into separate class 2020-08-03 10:32:39 +02:00
Sarah Hoffmann
0f54d42863 indexer: get rid of special handling of few places
Given that we do not distiribute geometry sectors to threads anymore,
there is no point in this kind of special handling.
2020-08-03 10:32:39 +02:00
Sarah Hoffmann
8201c7f46c Merge pull request #1899 from mtmail/use-new-dsn-format-in-vagrant-md
VAGRANT.md: we use different database DSN syntax these days
2020-08-03 10:26:35 +02:00
marc tobias
ed22d640f4 VAGRANT.md: we use different database DSN syntax these days 2020-07-31 16:52:29 +02:00
marc tobias
01b009ff24 test/README.md - more instructions how to import test db 2020-07-31 16:50:27 +02:00
Sarah Hoffmann
665b90bf5a Merge pull request #1898 from lonvia/show-housenumber-with-housename
make house number reappear in display name on named POIs
2020-07-31 10:11:09 +02:00
Sarah Hoffmann
4e1f245331 make house number reappear in display name on named POIs
After 6cc6cf950c names and house numbers
of POIS got mingled into a single item when creating the display name.
Add the house number as extra information without place_id to avoid
later mangling.
2020-07-30 23:39:55 +02:00
Sarah Hoffmann
f8e1d39208 Merge pull request #1894 from lonvia/fix-hierarchy-by-admin-level
Preserve admin level hierarchy between admin boundaries
2020-07-29 09:20:34 +02:00
Sarah Hoffmann
955dae5d4b Merge pull request #1895 from lonvia/update-less-quiet
Make indexing during updates less quiet
2020-07-29 09:20:14 +02:00
Sarah Hoffmann
b78cd3f4c9 make indexing during updates less quiet
Adjust verbosity behaviour to that of indexing during setup.
2020-07-28 22:35:51 +02:00
Sarah Hoffmann
6a3eb7edf2 preserve admin level hierarchy between admin boundaries
When the address rank of an admin boundary is changed because
of an attached place type, it may happen that the admin_level
hierarchy gets inversed. Avoid that by adjusting the address
rank if an inversion is detected.
2020-07-28 22:15:25 +02:00
Sarah Hoffmann
0a710c0762 Merge pull request #1891 from lonvia/automatic-db-setup
Implicitly connect to database during setup
2020-07-26 16:08:45 +02:00
Sarah Hoffmann
9a204f6284 test: make road really cross the boundary 2020-07-26 15:57:07 +02:00
Sarah Hoffmann
7837970303 remove connect() in update script
This is now implicit.
2020-07-26 12:27:52 +02:00
Sarah Hoffmann
8cd9550295 implicitly connect to database during setup
Make access to the DB object a function, so that the connection
can be opened implicitly when the object is accessed for the first
time. This way we no longer need to check beforehand if a specific
function of the setup needs DB access or not.

Also move the check for the module to the relevant sub step.
2020-07-26 11:56:00 +02:00
Sarah Hoffmann
840c692d5b Merge pull request #1890 from lonvia/add-wiki-tags-to-all-styles
Add wiki tags to all styles
2020-07-25 11:34:51 +02:00
Sarah Hoffmann
05e0d3e2d4 add wiki tags to all styles
wikipedia and wikidata tags are needed to compute the importance
so we need to put them into extra tags for all styles.

Fixes #1885.
2020-07-25 10:00:18 +02:00
Sarah Hoffmann
7429ff9dce forgit to adapt info message 2020-07-18 12:14:51 +02:00
Sarah Hoffmann
29602fe0cf Merge pull request #1874 from joy-yyd/rank_modification
House number search fix for #164
2020-07-18 12:12:37 +02:00
Sarah Hoffmann
1b95ec5591 Merge pull request #1884 from lonvia/fixes-for-webserver-settigns
Small fixes for new webserver settings file
2020-07-18 11:44:01 +02:00
Sarah Hoffmann
3efe0dc8dc move website settings back to settings/
We don't want the settings to become visible when a server is
accidentally configured wrongly.
2020-07-18 11:02:07 +02:00
Sarah Hoffmann
241e4af1b0 log file is a string when not set to false 2020-07-18 11:00:17 +02:00
(Joy) Yuanyue Ding
cac8a8df18 Modifiy the range of address_rank, fix for issue #164 2020-07-08 17:47:38 +02:00
Sarah Hoffmann
1181ceb735 Merge pull request #1873 from lonvia/resurrect-debug-option
Reenable debug parameter
2020-07-08 10:06:49 +02:00
Sarah Hoffmann
f376f45277 default language is a string when not set to false 2020-07-08 08:38:11 +02:00
Sarah Hoffmann
d364afdf3b reenable debug parameter
The parameter got lost when switching to website settings.
Given that the use of a fixed parameter is limited,
debugging output can now only be set via the URL parameter.
2020-07-08 08:32:46 +02:00
Sarah Hoffmann
7ecfcf7eaa Merge pull request #1869 from lonvia/migration-for-setup-website
docs: add migration for new --setup-website step
2020-07-05 21:06:27 +02:00
Sarah Hoffmann
709c9bbe88 Merge pull request #1868 from lonvia/reverse-for-addressable-places-only
reverse: ignore place nodes without an address rank
2020-07-05 21:06:00 +02:00
Sarah Hoffmann
db175f606e docs: add migration for new --setup-website step 2020-07-05 15:46:06 +02:00
Sarah Hoffmann
3a664dc676 reverse: ignore place nodes without an address rank
We already exclude all polygon places without an address
rank. place nodes should also be ignored. This removes
places like locality from the reverse results.

Fixes #1839.
2020-07-05 15:38:49 +02:00
Sarah Hoffmann
5f8d5f10a6 docs: rename documentation chapter
Avoids confusion about this being the documentation itself.
2020-07-05 11:14:48 +02:00
Sarah Hoffmann
f02d4d9677 docs: move external data sources into simple page 2020-07-05 11:13:28 +02:00
Sarah Hoffmann
1889643eca Merge branch 'move-datasources-into-separate-repos' of https://github.com/mtmail/Nominatim into mtmail-move-datasources-into-separate-repos 2020-07-05 10:57:12 +02:00
Sarah Hoffmann
4fc5c2024b Merge pull request #1864 from lonvia/langauge-specific-presuffixes
exclude language-specific name:prefix and name:suffix
2020-07-05 10:54:42 +02:00
Sarah Hoffmann
354487d7f4 Merge pull request #1829 from krahulreddy/websiteSetup
Added setup-website option
2020-07-01 18:11:50 +02:00
Sarah Hoffmann
6478058946 Merge pull request #1857 from mtmail/db-migration-update-functions
Migration.md - admin also need to run recreate db functions
2020-07-01 18:11:14 +02:00
Sarah Hoffmann
8b8dcea3de exclude language-specific name:prefix and name:suffix
There are about 1k suffixes and 20k prefixes with a
language-speicfic variant in use. These should not
show up as names.
2020-07-01 18:00:53 +02:00
marc tobias
64ace51e02 move data-sources/ directory in new git repos 2020-07-01 17:38:44 +02:00
marc tobias
4cb5c67a44 Migration.md - admin also need to run recreate db functions 2020-07-01 16:46:54 +02:00
Sarah Hoffmann
2edefd9e80 sql: fix rank variable type
The rank type needs to match the parameter type of
update_place_diameter().

Fixes #1851.
2020-07-01 15:48:00 +02:00
Sarah Hoffmann
fa8b16e7e7 Merge pull request #1858 from lonvia/update-osm2pgsql
Update osm2pgsql
2020-07-01 11:38:40 +02:00
Sarah Hoffmann
2a953700e2 update osm2pgsql (hang on multipolygons) 2020-06-30 22:39:38 +02:00
Sarah Hoffmann
c1dc835b5c Merge pull request #1852 from osm-search/disable-jit-for-updates
Disable Postgresql jit and parallel processing for osm2pgsql updates
2020-06-28 22:39:44 +02:00
Sarah Hoffmann
214f92c428 make phpcs happy 2020-06-28 18:24:29 +02:00
Sarah Hoffmann
95fc680af9 travis: reduce the size of diff download 2020-06-28 18:21:42 +02:00
Sarah Hoffmann
22d0c6b5e1 travis: run a single round of updates on the Monaco import 2020-06-28 18:09:08 +02:00
Sarah Hoffmann
ff1be13d0e disable JIT and parallel processing for osm2pgsql in updates
This is known to cause issues because of bad indexing
statistics.
2020-06-28 18:06:06 +02:00
K Rahul Reddy
a3201be7e7 Moved settings-frontend to website/ 2020-06-27 10:45:53 +05:30
K Rahul Reddy
37f0b51dff Updated setup.php 2020-06-27 10:45:53 +05:30
K Rahul Reddy
95d2dd74ad Documentation updated 2020-06-27 10:45:53 +05:30
K Rahul Reddy
a175a25e6c Added setup-website to travis.yml 2020-06-27 10:45:51 +05:30
K Rahul Reddy
6c406124dd Added setup-website option 2020-06-27 10:45:51 +05:30
Sarah Hoffmann
0f17529486 Merge pull request #1836 from lonvia/rework-large-location-II
Change processing of place nodes in addresses
2020-06-26 21:36:32 +02:00
Sarah Hoffmann
dd10c867db docs: minor typo and grammar fixes 2020-06-23 23:31:18 +02:00
Sarah Hoffmann
8335dd3aa5 Merge branch 'split-off-test-tool-installation-instructions' of https://github.com/mtmail/Nominatim into mtmail-split-off-test-tool-installation-instructions 2020-06-23 23:25:46 +02:00
Sarah Hoffmann
cd73ac7038 Merge pull request #1841 from mtmail/faq-entry-about-rebuilding-nominatim-so
FAQ addition when to rebuild nominatim.so
2020-06-23 23:20:43 +02:00
Sarah Hoffmann
7cc33a839c Merge pull request #1834 from mtmail/faq-invalid-page-in-block
FAQ entry for PostgreSQL -invalid page in block-
2020-06-23 23:10:26 +02:00
marc tobias
828da6a425 FAQ addition when to rebuild nominatim.so 2020-06-20 04:01:50 +02:00
marc tobias
2e5bdb8794 Put install instructions of test tools into separate docs/ markdown file 2020-06-20 03:48:07 +02:00
marc tobias
f56bac350b FAQ entry for PostgreSQL -invalid page in block- 2020-06-19 21:16:57 +02:00
Sarah Hoffmann
d373f16c81 Merge pull request #1838 from lonvia/make-serve
add 'make serve' command
2020-06-19 20:12:56 +02:00
Sarah Hoffmann
ebffa15c7c add 'make serve' command
Starts up PHP's built-in webserver in the website/ directory.
Useful for testing and development.

See #1831.
2020-06-19 17:35:24 +02:00
Sarah Hoffmann
6e4ee160ee adapt tests to new search ranks 2020-06-17 10:53:11 +02:00
Sarah Hoffmann
a5697c5279 change place node expansion for large area table
So far we've used a buffer around a place node to define its
potential address reach. This had two problems: the buffer was
so large that addresses often contain false positives and the
buffer is really distorted when getting closer to the poles.

Change the buffer here to draw a bounndig box at a certain
distance in meter. This means that we always use the same
box everywhere on the planet and can make the extent much
smaller. Using a box has the advantage that it is much faster
to figure out if a point is within the box.
2020-06-17 10:53:11 +02:00
Sarah Hoffmann
5abec720d8 Merge pull request #1830 from lonvia/docs-for-nominatim-ui
Add usage docs for nominatim-ui
2020-06-17 09:30:29 +02:00
Sarah Hoffmann
84403b47cb add usage docs for nominatim-ui
Includes migration guides for Apache and nginx.
2020-06-13 20:09:20 +02:00
Sarah Hoffmann
4342a539af Merge pull request #1827 from mtmail/centos8-postgresql12-without-proj
Vagrant centos8: proj52 not needed, use postgresql 12/postgis 3.0
2020-06-11 23:12:38 +02:00
Sarah Hoffmann
f4e744ade5 remove version warning from ubuntu 20 installation 2020-06-11 22:46:04 +02:00
Sarah Hoffmann
155d4c5591 cmake: only require php for import and api 2020-06-11 22:44:00 +02:00
marc tobias
f5cbe0e6ba Vagrant centos8: proj52 not needed, use postgresql 12/postgis 3.0 2020-06-11 19:10:38 +02:00
Sarah Hoffmann
a0e7d80daf prepare 3.5.0 release 2020-06-06 20:30:29 +02:00
Sarah Hoffmann
d7e2f61e13 Merge pull request #1822 from lonvia/document-address-labels
document which labels may appear in the address info
2020-06-06 18:02:40 +02:00
Sarah Hoffmann
96ed4b02d7 document which labels may appear in the address info
The list is manually generated and only valid for the default
configuration as used on openstreetmap.org.

Fixes #1808.
2020-06-06 17:32:30 +02:00
Sarah Hoffmann
cffc7c0121 parents for large POIs must be address features
There are a couple of places with a search rank < 25 which are
not addressable like waterways and islands. We don't want them
to function as parents for POI-level objects. So use the
address rank for finding parents, not the search rank.

See #1815.
2020-06-03 11:30:51 +02:00
Sarah Hoffmann
d89000cc3d Merge pull request #1813 from lonvia/revert-concurrent-indexing
revert building indexes concurrently
2020-06-01 22:13:33 +02:00
Sarah Hoffmann
3661c75b39 Merge pull request #1814 from lonvia/disable-jit
Disable JIT and parallel workers when indexing
2020-06-01 22:13:07 +02:00
Sarah Hoffmann
3b20b11a9f remove warnings about postgres 12 and postgis 3 2020-05-30 11:25:00 +02:00
Sarah Hoffmann
cca366196d Disable JIT and parallel workers when indexing
Locally disable jit and parallel workers in the connection that
do indexing. The query planner tends to be overenthusiatic about
using JIT. But with the rather less complex queries we have, the
overhead tends to be larger than the performance gain.

Fixes #1677.
2020-05-30 11:20:16 +02:00
Sarah Hoffmann
e09d444068 revert building indexes concurrently
This does not really solve the issue with blocking autovacuum
requests and can lead to incomplete indexes and bogus
out-of-disk messages.

Fixes #1549.
2020-05-30 11:00:05 +02:00
Sarah Hoffmann
4956f5e710 Merge pull request #1809 from lonvia/fix-display-names
Fix some glitches in choice of address tags
2020-05-27 21:28:17 +02:00
Sarah Hoffmann
5bebdfa434 Merge pull request #1804 from lonvia/ranking-improvement-germany
Localized ranking adaptions for Germany, Sweden and Norway
2020-05-27 11:58:21 +02:00
Sarah Hoffmann
aea915aa8d prefer linked place type over own place type
For state cities, tagging might prefer the place=state on
the admin boundary. The linked place is a more reliable indicator.
2020-05-27 11:31:50 +02:00
Sarah Hoffmann
e0d29f398e each address line must fill at most one geocodejson field
This fixes an issue where a postcode with rank_address 5
would also appear in the state field.
2020-05-27 11:16:27 +02:00
Frederik Ramm
c43b39bd88 Fix script names in README (#1805) 2020-05-25 12:45:35 +02:00
Sarah Hoffmann
8218da27b3 adapt tests to new ranks 2020-05-23 19:40:41 +02:00
Sarah Hoffmann
aa4bd00631 Adapt boundary labels for Sweden and Norway
This also gives us the correct labels for address output in
json and xml.
2020-05-23 16:19:27 +02:00
Sarah Hoffmann
af6b9fdb39 fix admin levels for Norway and Sweden
Admin levels 3 and 4 are used for region and county respectively,
so downgrade the ranking.
2020-05-23 15:48:40 +02:00
Sarah Hoffmann
c1b6493373 adapt municipality and region for Germany 2020-05-23 15:20:15 +02:00
Sarah Hoffmann
c386cca73f Merge pull request #1801 from lonvia/rework-classtypes
Rework ClassTypes helper functions
2020-05-20 08:22:56 +02:00
Sarah Hoffmann
cadbdaff18 fix style 2020-05-18 22:20:36 +02:00
Sarah Hoffmann
57510f517a adapt tests to modified address types 2020-05-17 16:53:33 +02:00
Sarah Hoffmann
3a2ddbe2e0 encapsulate icon URL in a function 2020-05-17 16:46:45 +02:00
Sarah Hoffmann
859347523f also adapt uses of ClassTypes in website/ 2020-05-17 16:46:45 +02:00
Sarah Hoffmann
528fe6553f adapt php tests
Also fixes some errors found by the tests.
2020-05-17 16:46:45 +02:00
Sarah Hoffmann
1faa0f4d41 reorganise class/type information
Add a separate function for each property which saves necessary
information independently. Simplify computation of labels and
simple labels to not explicitly save the labels.
2020-05-17 16:46:45 +02:00
Sarah Hoffmann
82a11cae2d first draft 2020-05-17 16:46:45 +02:00
Sarah Hoffmann
431948d768 nominatim: always use deadlock-protected wait
Fixes #1785.
2020-05-15 18:49:27 +02:00
Sarah Hoffmann
f69c3d2b66 Merge pull request #1793 from lonvia/remove-struct-params-in-gui
search UI: hide unused query parameters
2020-05-15 16:50:54 +02:00
mmd
08b05964fa Update travis to bionic=Ubuntu18 (#1800) 2020-05-14 22:52:04 +02:00
Sarah Hoffmann
bd7f597682 Merge pull request #1797 from mtmail/jquery-3-5-1
update jquery dependency 3.5.0 => 3.5.1
2020-05-14 20:54:17 +02:00
marc tobias
6d4fbc9d32 update jquery dependency 3.5.0 => 3.5.1 2020-05-14 15:53:05 +02:00
Sarah Hoffmann
124410a17b improve syntax highlighting for apache and nginx examples 2020-05-13 10:13:15 +02:00
Sarah Hoffmann
a543d57cbd switch to php-fpm 7.3
Also fixes indent.
2020-05-13 10:04:31 +02:00
Sarah Hoffmann
8c3a0efe8b Merge branch 'patch-1' of https://github.com/ganeshkrishnan1/Nominatim into ganeshkrishnan1-patch-1 2020-05-13 09:55:48 +02:00
Sarah Hoffmann
9e2841ad44 search UI: hide unused query parameters
Only send query parameters relevant for the current query
type (simple/structured), hide the other input fields.

This is quite a bit of CSS state changing, so move the intial
setup of the input field states into Javascript.
2020-05-11 00:19:33 +02:00
Sarah Hoffmann
233e5f7c0e show simple query field when no parameters are given 2020-05-10 23:52:53 +02:00
Sarah Hoffmann
d5d9445cfd Fix PHP errors in structured HTML output
Correctly handle missing parameters.
2020-05-10 23:41:04 +02:00
Sarah Hoffmann
7be7417b5b Merge pull request #1792 from lonvia/remove-from-location-area
remove linked places also from the location_area_large tables
2020-05-10 15:49:20 +02:00
Sarah Hoffmann
0a14142156 remove linked places also from the location_area_large tables
We don't want linked places to show up in addresses either,
so remove them from the address lookup table.
2020-05-10 13:59:47 +02:00
galewis2
a5e3785843 Add simple/structured query selector to HTML search page (#1722) 2020-05-08 01:29:44 +02:00
Sarah Hoffmann
fc19ebb218 Merge pull request #1786 from lonvia/remove-ubuntu-1604
remove Ubuntu 16 installation instructions
2020-05-07 22:42:31 +02:00
Sarah Hoffmann
b45411f988 Merge pull request #1782 from Simon-Will/1781-make-tests-work-with-phpunit-8
Make tests work with phpunit 8
2020-05-07 22:01:35 +02:00
Sarah Hoffmann
42f6371e47 remove Ubuntu 16 installation instructions
Also fixes up CentOS 8 links in documentation.
2020-05-07 21:55:04 +02:00
Simon Will
be2aa6ab3a Use Ubuntu’s packaged composer, not the custom installation 2020-05-07 21:44:45 +02:00
Sarah Hoffmann
6e39ed9573 Merge pull request #1780 from Simon-Will/1768-vagrant-installation-for-ubuntu-20
Add vagrant machine for Ubuntu 20.04
2020-05-07 20:46:44 +02:00
Simon Will
daf45a2993 Integrate Ubuntu 20 instructions into documentation 2020-05-07 00:36:13 +02:00
Simon Will
d351b10fde Document minimum phpunit version 2020-05-06 23:47:16 +02:00
Simon Will
0b21050904 Install phpunit 8 on Ubuntu 18 with composer 2020-05-06 23:46:53 +02:00
Sarah Hoffmann
644a7f524c Merge pull request #1784 from krahulreddy/patch-1
Removed redundant question
2020-05-06 21:37:29 +02:00
K Rahul Reddy
53949ace36 Removed redundant question 2020-05-06 21:26:32 +05:30
Simon Will
14dba39157 Use assertEqualsWithDelta for float comparisons
PHPUnit 7.3 introduced the functions assertEqualsWithDelta for comparing
floats with a delta. The old four-argument version of assertEquals is
deprecated in PHPUnit 8 and removed in PHPUnit 9.

This commit means that the tests will fail with PHPUnit < 7.3 because
assertEqualsWithDelta is not defined there.
2020-05-05 23:43:09 +02:00
Simon Will
43fd2a7423 Declare return type of testcase setUp method
PHPUnit 7 changed the signature of the TestCase methods to include the
return type.
2020-05-05 23:40:18 +02:00
Simon Will
4b0ac5356e Add vagrant machine for Ubuntu 20.04
The instructions in
[`VAGRANT.md`](42c80893cb/VAGRANT.md)
still work as before. The names of the Vagrant machines are updated so
that Ubuntu 18.04 (previously called `ubuntu`) is now called `ubuntu18`
and Ubuntu 20.04 is now called `ubuntu20`.

The version changes from Ubuntu 18.04 to Ubuntu 20.04 are:

  - Python: 3.6 to 3.8
  - Postgres: 10 to 12
  - PHP: 7.2 to 7.4

In the `apt-get`, I changed `--force` to `--allow-downgrades --allow-remove-essential --allow-change-held-packages`, because the former is deprecated. Cf. the [manpage of apt-get](http://manpages.ubuntu.com/manpages/focal/man8/apt-get.8.html)

The php module `codesniffer` was previously installed via Composer, but it is available via the Ubuntu repository, so I installed it via `apt-get` now.
2020-05-05 23:10:35 +02:00
Sarah Hoffmann
c2f0d8e5ba docs: add link to new status page 2020-05-04 21:11:57 +02:00
marc tobias
0fb93b1e8a documenation for /status endpoint 2020-05-04 17:06:06 +02:00
Sarah Hoffmann
f94828c3f4 properly escape class parameter
The class parameter was used as is, allowing for potential
SQL injection via the API.

Thanks to @bladeswords for finding this.
2020-05-02 21:54:14 +02:00
Sarah Hoffmann
0e1e7c7df2 Merge pull request #1770 from lonvia/eyusupov-separate-compilation
Separate compilation
2020-04-26 21:48:43 +02:00
Sarah Hoffmann
06110ba358 Merge pull request #1769 from lonvia/display-name-order
Ensure that result object name is always first in display_name
2020-04-26 16:18:56 +02:00
Sarah Hoffmann
bae69f0102 cmake: reintroduce check script 2020-04-26 16:17:43 +02:00
Sarah Hoffmann
77e7f4696b fix docs typos 2020-04-26 15:00:28 +02:00
Sarah Hoffmann
47fb2c9126 cmake: restructure splitting between modules
Make a clear distinction between parts used for the importer
and parts used for the API.
2020-04-26 14:17:21 +02:00
Sarah Hoffmann
2ab9e4acd3 Merge branch 'separate-compilation' of https://github.com/eyusupov/Nominatim into eyusupov-separate-compilation 2020-04-26 10:47:41 +02:00
Sarah Hoffmann
65ee7a8002 Merge pull request #1754 from mtmail/nominatim-db-tests-against-postgres
Nominatim::DB tests against separate postgresql database
2020-04-26 10:20:30 +02:00
marc tobias
a5d0657d9b lonvia PR feedback 2020-04-26 03:33:15 +02:00
Sarah Hoffmann
b8f01f91ca simplify display_name computation 2020-04-26 00:18:29 +02:00
Sarah Hoffmann
6cc6cf950c ensure that result object name is always first in display_name
The display name might be mixed up if the result object has a lower
rank_address than its address members.
2020-04-26 00:14:55 +02:00
Sarah Hoffmann
0b0349f746 Merge pull request #1752 from mtmail/new-oo-shell-class
new PHP Nominatim\Shell class to wrap shell escaping
2020-04-25 16:48:04 +02:00
Sarah Hoffmann
2740974a13 Merge pull request #1758 from krahulreddy/advanced-installations
Advanced installations
2020-04-22 09:59:44 +02:00
Sarah Hoffmann
97a9a262bb Merge pull request #1764 from mtmail/docs-countrycodes-based-on-adminlevel-2
API docs: countrycode assignment happens using admin_level=2 tags
2020-04-22 09:57:59 +02:00
Sarah Hoffmann
207efe700f highway:construction should appear as 'road' in the address list
Fixes #1763.
2020-04-22 09:08:33 +02:00
marc tobias
e33315eaa6 API documentation: clarification countrycode assignment happens using admin_level=2 tags 2020-04-21 17:42:12 +02:00
Sarah Hoffmann
5469d02d03 nominatim.py: fix wrong use of assert
Fixes #1762.
2020-04-19 17:59:49 +02:00
K Rahul Reddy
42c80893cb Fix documentation links (#1760)
Update installation documentation link in VAGRANT.md, update.php
2020-04-19 00:42:24 +02:00
K Rahul Reddy
5c56ea3198 Adjustments made to documentation 2020-04-17 21:53:50 +05:30
K Rahul Reddy
42f86329a9 Added Advanced Installations documentation 2020-04-17 21:53:41 +05:30
K Rahul Reddy
08e273c0c7 Added scripts for multiple country setup and updates 2020-04-17 21:50:59 +05:30
Sarah Hoffmann
5f8f98fa03 Merge pull request #1756 from lonvia/downgrade-waterways
downgrade waterways
2020-04-17 08:46:40 +02:00
Sarah Hoffmann
08c53ae27d downgrade waterways
A lot of streams in OSM are of minor importance, they certainly
should show up lower in the list of results than villages. Those
rivers/streams that are well known have a wikipedia page and get
a higher importance from that.

The disadvantage with downgrading is that the address gets even
more useless but that's something that needs to be solved outside
the rank search.
2020-04-14 17:14:20 +02:00
Sarah Hoffmann
f4f369895c Merge pull request #1753 from mtmail/fix-travis-ci-badge
fix Travis-CI badge in README output
2020-04-13 18:36:02 +02:00
marc tobias
38c21de0ee Nominatim::DB tests against separate postgresql database 2020-04-13 18:01:37 +02:00
marc tobias
43cf36e0c7 fix Travis-CI badge in README output 2020-04-13 17:55:17 +02:00
Sarah Hoffmann
9a9ff95989 fix logging of lookup calls
Log start was called but the actual writing was missing.
2020-04-13 11:55:24 +02:00
marc tobias
fc40939775 new PHP Nominatim\Shell class to wrap shell escaping 2020-04-12 03:50:40 +02:00
Sarah Hoffmann
553d8a828c Merge pull request #1751 from lonvia/respect-admin-hierarchy
Address ranks must not invert admin_level hierarchy
2020-04-11 23:41:19 +02:00
Sarah Hoffmann
80f7392fb1 address ranks must not invert admin_level hierarchy
When inheriting an address rank from a linked place we
must be careful not to destroy the hierarchy established
through boundary admin_level. Therefore, before assigning
an address rank from a linked place, find the next higher
boundary in the admin_level hierarchy, look up its address
rank and then only use the address rank from the linked
place if it is higher.
2020-04-11 20:56:30 +02:00
Sarah Hoffmann
61535c9972 Merge branch 'update-jquery-leaflet' of https://github.com/mtmail/Nominatim into mtmail-update-jquery-leaflet 2020-04-11 20:53:14 +02:00
Sarah Hoffmann
b443c92a7a Merge pull request #1749 from lonvia/ranking-during-updates
Reset search and address ranks on update
2020-04-11 20:52:05 +02:00
marc tobias
22da6c541d website dependencies: jQuery v2.1 => 3.5, leaflet 1.3 => 1.6 2020-04-11 18:18:57 +02:00
Sarah Hoffmann
cd96354bc7 reset address and search ranks on update
With ranks being dynamically changed through linking of places,
it is important to reset the ranks on update, so that changes
of the rank due to changes in linking are correctly taken into
account.
2020-04-11 09:20:13 +02:00
Sarah Hoffmann
c6d859a08a factor out computation of address and search rank 2020-04-10 23:18:31 +02:00
Sarah Hoffmann
ef47515420 make admin levels 3 and 7 distinct ones in addresses
There really is no need to conflate these two levels as they
are in use in various countries.

Also adds province as a distinct place.

Fixes #1736.
2020-04-10 22:58:11 +02:00
Sarah Hoffmann
a471a3d1b0 Merge pull request #1745 from lonvia/shuffle-sql-functions
Some more SQL function reorganisation
2020-04-10 17:23:09 +02:00
Sarah Hoffmann
79a68fc2db Merge branch 'deletable-and-polygons-as-json' of https://github.com/mtmail/Nominatim into mtmail-deletable-and-polygons-as-json 2020-04-10 17:20:51 +02:00
marc tobias
93ddd46231 Add JSON output for /deletable.php and /polygons.php 2020-04-10 15:34:56 +02:00
Sarah Hoffmann
f5f0c197be move ranks-related functions in separate sql file
Also adds a common function for computing the update radius
around place nodes.
2020-04-10 11:34:14 +02:00
Sarah Hoffmann
4a30ec28b9 move helper functions from placex_triggers into utils
Also adds documentation for these functions.
2020-04-10 11:05:11 +02:00
Sarah Hoffmann
37ef9bb3d3 Merge pull request #1742 from mtmail/travis-ci-add-os
Travis-Ci configuration: remove -sudo-, add -os-
2020-04-10 08:46:50 +02:00
marc tobias
3e1d4a87fa travis-ci configuration: remove -sudo-, add -os- 2020-04-10 01:20:51 +02:00
Sarah Hoffmann
320d46cc96 Merge pull request #1741 from filimongeorge/patch-1
Updated Import and Update .md file
2020-04-09 23:00:34 +02:00
Sarah Hoffmann
a06ceeef4c Merge pull request #1740 from mtmail/setupclass-index-outputfile-not-used
SetupClass.php: remove unused variable
2020-04-09 22:57:51 +02:00
Sarah Hoffmann
def573d7b4 Merge pull request #1739 from lonvia/remove-self-from-geojson
Further tweaks to geocodejson output
2020-04-09 22:51:20 +02:00
filimongeorge
7f7d29fdd1 Updated Import and Update .md file 2020-04-09 20:51:38 +03:00
marc tobias
c611d49941 SetupClass.php: remove unused variable 2020-04-08 14:16:06 +02:00
Sarah Hoffmann
11cd648699 remove name from geocodejson when not set 2020-04-08 11:19:43 +02:00
Sarah Hoffmann
98be5bf637 adapt tests to geocodejson format adaptions 2020-04-08 11:19:43 +02:00
Sarah Hoffmann
29df9771bb further tweaks to geocodejson address output
Removes the place itself from the address details and use
the lowest ranking element in the rank range for the output.
2020-04-08 11:11:33 +02:00
Sarah Hoffmann
e68c1132da ignore isaddress in details output when it is not present 2020-04-08 10:28:28 +02:00
Sarah Hoffmann
1047b1c191 Merge pull request #1737 from mtmail/expose-isaddress-in-details-json
details JSON: also print isaddress addressline field
2020-04-07 20:49:31 +02:00
marc tobias
9431e80eb4 details JSON: also print isaddress addressline field 2020-04-07 14:50:41 +02:00
Sarah Hoffmann
178501de61 Merge pull request #1734 from krahulreddy/fixed-parselatlon
Added whitespace support for parseLatLon
2020-04-05 23:25:50 +02:00
Sarah Hoffmann
81c7f618fb avoid deletes on search_name in reverse-only mode 2020-04-04 18:26:27 +02:00
Rahul
eb2d816f2a Added test cases for whitespaces in LatLon 2020-04-04 00:53:40 +05:30
Rahul
244cb0e98c Added whitespace characters support in LatLon parsing 2020-04-04 00:53:40 +05:30
Sarah Hoffmann
300ac4b77b fix phpcs issues 2020-04-03 20:08:08 +02:00
Sarah Hoffmann
0d189ac5df Merge pull request #1733 from krahulreddy/whitespaces-considered-as-single-space
Support whitespace characters(x09-x0d) as single space
2020-04-03 18:01:47 +02:00
Sarah Hoffmann
fed2c307a7 Merge pull request #1732 from lonvia/improve-geocodejson-output
Improve geocodejson output
2020-04-02 21:21:04 +02:00
K Rahul Reddy
7aa2df5389 Support whitespace characters(x09-x0d) as single space 2020-04-02 05:04:40 +05:30
Sarah Hoffmann
975ef0b305 re-add district to geocodejson 2020-04-01 21:24:42 +02:00
Sarah Hoffmann
e59146a733 update documentation for geocodejson
Address parts should be usable now.
2020-04-01 11:17:25 +02:00
Sarah Hoffmann
8150c3602b add tests for geocodejson address fields 2020-04-01 11:14:48 +02:00
Sarah Hoffmann
ca8d776724 determine geocodejson address by rank instead of type
Using the address rank to set the address parts catches
a much wider variety of types like 'town' and 'suburb'.
With recent address ranking changes the rank ranges
are relatively reliable.
2020-04-01 11:12:52 +02:00
Sarah Hoffmann
fdc40d5169 factor out geocodejson address generation
Unifies the two implementations currently used for search and address.
2020-04-01 10:27:17 +02:00
Sarah Hoffmann
d0a97056c4 Merge pull request #1731 from lonvia/remove-polygon-from-docs
docs: remove example with polygon parameter
2020-04-01 10:21:45 +02:00
Sarah Hoffmann
e98619f801 docs: remove example with polygon parameter
This parameter was undocumented, long deprecated and is gone now.
2020-03-31 20:10:03 +02:00
Sarah Hoffmann
86eebc4305 fix typo
Fixes #1730
2020-03-31 19:53:55 +02:00
Sarah Hoffmann
4930f776fe fix handling of postcode areas in addresses
The order of preference is now:
1. a post code on the place itself
2. a post code area in the address
3. the computed postcode from the place

Fixes #1723.
2020-03-30 23:27:48 +02:00
Sarah Hoffmann
19948c378a adapt tests to new borough ranking 2020-03-30 23:04:20 +02:00
Sarah Hoffmann
b3215b802d downgrade borough and remove unincorporated area 2020-03-30 18:37:23 +02:00
Sarah Hoffmann
ed16d5b6aa Merge pull request #1729 from lonvia/fix-details-link-for-boundaries
Fix details link for boundaries
2020-03-29 23:12:16 +02:00
marc tobias
7a94872413 remove polygon=1 (polypoints) feature 2020-03-29 21:58:11 +02:00
Sarah Hoffmann
98750922eb also emit place_type in json version of details 2020-03-29 21:06:39 +02:00
Sarah Hoffmann
60c4c9ef2c rather use new place_type in getAddressNames()
If for a boundary the place_type is defined, handle the address
part like a place node. This is the same behaviour as before
when class/type where patched earlier.
2020-03-29 20:49:35 +02:00
Sarah Hoffmann
101f04bbf2 Fix address link for boundaries in details
Removes the special casing for boundaries with a place
type in get_addressdata(). Instead the place type is now
returned as an extra field, so that the caller has to
handle the situation.

This fixes the details link next to the address in the details
view, which previously would go to a place class instead of the
original boundary class.
2020-03-29 17:40:56 +02:00
Sarah Hoffmann
4c593fa859 Merge pull request #1720 from lonvia/better-linking-of-places
Use wikidata tags for improving linking of places with boundaries
2020-03-27 21:12:39 +01:00
Aakankasha Sharma
6603ad4006 Updated TIGER database link in documentation (#1725)
Updated TIGER database link in documentation
2020-03-27 15:50:05 +01:00
Sarah Hoffmann
d56c69dd01 adapt API tests to place linkage changes
The missing district is due to a data error for wikidata tags.
2020-03-25 11:38:31 +01:00
Sarah Hoffmann
e26a300c2f use wikidata tag for linking places
Having the same wikidata is a strong indicator that the same place
is meant. There are some assignment errors where the wikidata does
not link to the object itself but to something that mentions the
place. To reduce errors there, prefer same name.
2020-03-21 22:46:54 +01:00
Sarah Hoffmann
405482ede4 remove linking via admin_centre role
The admin_centre role is for the seat of government which is not
the same as the administrative entity. This is mostly used
correctly these days, so avoid matching by that role.
2020-03-21 21:59:11 +01:00
Sarah Hoffmann
3db2b05069 linking: better name matching for address-less places
Administrative boundaries that do not figure in the address
should still be able to take part in the name matching.
Use the rank_search for comparison in this case.
2020-03-21 21:57:04 +01:00
Sarah Hoffmann
ce5870223a Merge pull request #1706 from mtmail/warn-if-no-tiger-files-found
print warning if no Tiger files found
2020-03-06 22:55:37 +01:00
Sarah Hoffmann
9c1bb87493 Merge pull request #1707 from lonvia/regression-address-in-area
place node address parts must be in lower rank area
2020-03-06 22:55:24 +01:00
Sarah Hoffmann
1f7394dd54 place node address parts must be in lower rank area
This fixes a regression where the area of the lower ranking
area was not computed correctly.

Also excludes postcodes areas now as they have their own
hierarchy.
2020-03-06 21:51:38 +01:00
marc tobias
bb569aa484 print warning if no Tiger files found 2020-03-06 17:52:46 +01:00
Sarah Hoffmann
b0a350db37 Merge pull request #1705 from lonvia/delete-linkee-from-search-name
Remove linkees from search_name
2020-03-04 11:55:05 +01:00
Sarah Hoffmann
78526a33b4 Remove linkees from search_name
Fixes #722
2020-03-04 11:36:39 +01:00
Sarah Hoffmann
ab997b7fb1 Merge pull request #1704 from lonvia/centroid-within-geometry
linked centroids must always be within geometry
2020-03-04 10:18:57 +01:00
Sarah Hoffmann
6d431aebb7 linked centroids must always be within geometry
When using a linked place as centroid for a boundary, check
first that it is really within the area. If it is outside,
just keep the computed centroid because a centroid outside the
area just causes havok.

Fixes #1352.
2020-03-04 09:59:57 +01:00
Sarah Hoffmann
a00ea23847 Merge pull request #1702 from lonvia/rename-derived-place
Make admin boundaries inherit address rank from place nodes
2020-03-04 08:08:39 +01:00
Sarah Hoffmann
53ca751a02 fix style 2020-03-01 22:24:32 +01:00
Sarah Hoffmann
8c444378bc better grouping 2020-02-28 22:10:35 +01:00
Sarah Hoffmann
55fdf0abda output linked place into address details 2020-02-28 22:07:06 +01:00
Sarah Hoffmann
acd8ca2ebd add testing for rank adaption while linking 2020-02-28 15:22:48 +01:00
Sarah Hoffmann
06fdfad89e link against place nodes by place type
If a boundary relation has no label member preferably
link against a place node with the same place type.

Also inherit the rank_address from the place node (only
has an effect when linking via lable member or place type).
2020-02-28 15:22:48 +01:00
Sarah Hoffmann
00ca493f33 move linked place type into linked_place extratags
Using linked_place means that we don't overwrite any
place tags on the boundary. This is important when we
wanto to use the information for linking.
2020-02-28 15:22:48 +01:00
Sarah Hoffmann
b00d16fd7d Merge pull request #1698 from lonvia/cleanup-partition-functions
Cleanup partition functions
2020-02-26 20:21:10 +01:00
Sarah Hoffmann
03c373a4b3 make all query partition functions stable 2020-02-26 11:41:49 +01:00
Sarah Hoffmann
bdaa39573f remove unused nearfeature types
Also move the remaining nearfeaturecentr type close to the
function that is using it.
2020-02-26 11:01:29 +01:00
Sarah Hoffmann
8a4c7f6e2b simplify getNearestParallelRoadFeature function
The function only ever returns one result of which only the
place_id is used. So simplify it to return a single place_id
only (or NULL if none is found).

Also fix typo in function name.
2020-02-26 11:01:04 +01:00
Sarah Hoffmann
84ea0753d8 simplify getNearestRoadFeature function
The function only ever returns one result of which only the
place_id is used. So simplify it to return a single place_id
only (or NULL if none is found). Rename funciton to avoid
conflicts when updating an existing database.
2020-02-26 10:58:55 +01:00
Sarah Hoffmann
c1ef56c870 advise against using Postgresql 12 and Postgis 3
See also #1677
2020-02-25 09:44:32 +01:00
Sarah Hoffmann
0e3252f045 revert using stricter uniqueness constraint on place
Multiple objects with the same (osm_type, osm_id, class) may
exist when we hold back deleting an area because it is so
large.

Fixes #1695.
2020-02-24 22:55:03 +01:00
Sarah Hoffmann
65df218f91 Merge pull request #1693 from lonvia/reorganize-addressline-computation
Reorganize addressline computation
2020-02-24 22:39:51 +01:00
Sarah Hoffmann
5220a92be4 adapt API tests 2020-02-22 16:46:03 +01:00
Sarah Hoffmann
d643ca8dee move address line computation in its own function 2020-02-21 16:38:14 +01:00
Sarah Hoffmann
de45152028 Merge pull request #1692 from mtmail/tests-for-HasSetAny
unit tests for ParameterParser::hasSetAny
2020-02-20 21:16:39 +01:00
marc tobias
7fd9d0eeef unit tests for ParameterParser::hasSetAny 2020-02-19 16:55:17 +01:00
Sarah Hoffmann
d35a0b392e Merge pull request #1691 from lonvia/structured-query-via-cmdline
add structured search to command-line query tool
2020-02-19 11:12:37 +01:00
Sarah Hoffmann
cbddfcde5b add structured search to command-line query tool 2020-02-19 11:04:07 +01:00
Sarah Hoffmann
02ffa752ea Merge pull request #1690 from lonvia/parenting-large-rank-30-areas
improve parenting for large areas with rank 30
2020-02-19 09:20:12 +01:00
Sarah Hoffmann
6189e0c79b improve parenting for large areas with rank 30
Instead of unconditionally parenting them to a street, the
larger areas get a parent area that contains them. To keep
things computationally light-weight, only use the centroid and
bbox to determine if an area is contained.

Requires renaming of parenting functions because renaming
a parameter of the function causes issues when updating the
function (it requires a manual delete, which I'd like to
avoid).
2020-02-19 08:43:53 +01:00
Sarah Hoffmann
6ed6a0d447 Merge pull request #1689 from mtmail/travis-postgres-stopped-working
Travis: documentation suggests we need to add postgresql-client package
2020-02-19 08:00:08 +01:00
marc tobias
484892ae97 Travis: documentation suggests we need to add postgresql-client package pre-startup 2020-02-18 23:45:21 +01:00
Sarah Hoffmann
027d9e938c Merge pull request #1688 from mtmail/snippet-noun-vs-snipped-verb
documentation wording: snipped->snippet
2020-02-18 22:55:20 +01:00
marc tobias
e171f90d81 documentation wording: snipped->snippet 2020-02-18 22:48:27 +01:00
Sarah Hoffmann
92c5d3b720 make sure that linked places are within a boundary
This is a regression from previous code refactoring.

Fixes #1684.
2020-02-18 22:46:32 +01:00
Sarah Hoffmann
2a6e8ad68e add bbox whereclause to make postgis 3.0 happy
Normally ST_Covers() should include a bbox index use,
so adding a bbox where clause is not really necessary.
However, the query planner messes up and uses a parallel
index search with a second index instead of exclusively
running on the geometry index, when the bbox part is
missing.
2020-02-16 14:10:22 +01:00
Sarah Hoffmann
55c8a0ac08 Merge pull request #1678 from lonvia/early-drop
Clean up intermediate tables earlier with --drop
2020-02-13 22:50:41 +01:00
Sarah Hoffmann
6073d948e6 fix duplicate keys in tests
The tests suddenly failed because the unique key constraint
is more strict and does no longer include the type.
2020-02-12 11:29:33 +01:00
Sarah Hoffmann
b9171dd10b clean up intermediate tables earlier with --drop
When --drop is given, we can remove all node geometry information
already after the import with osm2pgsql. Also drop all unnecessary
tables before creating the final indices.
2020-02-12 11:03:20 +01:00
Sarah Hoffmann
97b892fac2 Merge pull request #1675 from lonvia/refresh-connection-while-indexing
Fix a couple of issues with the new Python nominatim script
2020-02-12 08:18:09 +01:00
Sarah Hoffmann
b3fdf19b85 Merge pull request #1674 from mtmail/testdb-how-to-select-tiger-data
document how to extract subset of TIGER data needed for API tests
2020-02-11 22:57:39 +01:00
Sarah Hoffmann
8c89e16ad2 Merge pull request #1673 from mtmail/wikidata-wget-incomplete
wikipedia: wget didnt download, skip index generation
2020-02-11 22:56:45 +01:00
Sarah Hoffmann
960409c701 psycopg 2.6 is now usable on ubuntu 16 2020-02-11 22:49:03 +01:00
Sarah Hoffmann
d1eeaa59a6 nominatim.py: use async in connect() function
The _async parameter name is only supported since psycopg 2.7.
However, async is a keyword in Python >= 3.7, so using this
gives us a syntax error. Working around this by defining the
parameters in a dict and handing that into the connect function.
2020-02-11 22:16:17 +01:00
Sarah Hoffmann
882f496e0a nominatim.py: also catch deadlocks on final wait 2020-02-11 22:16:17 +01:00
Sarah Hoffmann
8b8aa1b4e6 regularly close connection while indexing
Postgres sooner or later runs out of memory when the connection
is used for too long.
2020-02-11 22:16:17 +01:00
marc tobias
932ac23f18 document how to extract subset of TIGER data needed for API tests 2020-02-11 18:50:27 +01:00
marc tobias
6c6560ca53 wikipedia: wget didnt download, skip index generation 2020-02-10 17:20:11 +01:00
Sarah Hoffmann
0698757e6e Merge pull request #1670 from lonvia/permalinks-for-tiger-and-interpolation
Enable Permalinks to dtails for tiger and interpolation
2020-02-09 21:07:19 +01:00
Sarah Hoffmann
3a3f9b3496 fix formatting 2020-02-09 16:57:55 +01:00
Sarah Hoffmann
97d87895bf details: also look for interpolations when way id is given 2020-02-09 16:50:04 +01:00
Sarah Hoffmann
c36fd72f99 use detailsPermaLink function on main website as well 2020-02-09 16:05:22 +01:00
Sarah Hoffmann
57ae3d03a1 return place_id link to details when not an OSM object
Stop-gap solution to find the right object for Tiger and
interpolation objects.
2020-02-09 15:45:38 +01:00
Sarah Hoffmann
3737712044 Merge pull request #1667 from mtmail/setup-delete-invalid-indices
setup: delete invalid indices in create-search-indices step
2020-02-09 14:21:17 +01:00
Sarah Hoffmann
8531339b4e Remove hack that changes the class/type of cities
This interferes badly with the details view.

Fixes #1668.
2020-02-09 12:14:32 +01:00
marc tobias
540b12537a setup: delete invalid indices in create-search-indices step 2020-02-08 15:16:20 +01:00
Sarah Hoffmann
9e2fb45783 Merge pull request #1665 from mtmail/centos7-php7
CentOS7: update from PHP 5.4 to 7.2, add psycopg2
2020-02-07 20:43:49 +01:00
Sarah Hoffmann
e7c128b973 Merge pull request #1666 from mtmail/ubuntu-installs-add-psycopg2
Vagrant Ubuntu: psycopg2 is required
2020-02-07 20:42:38 +01:00
marc tobias
d4a3470c9e Vagrant Ubuntu: psycopg2 is required 2020-02-07 15:26:09 +01:00
marc tobias
4165b8c011 CentOS7: update from PHP 5.4 to 7.2 2020-02-07 15:18:46 +01:00
Sarah Hoffmann
357ba2f64d Merge pull request #1663 from mtmail/vagrant-centos-8
Vagrant setup for CentOS 8
2020-02-06 21:12:28 +01:00
Sarah Hoffmann
3ce8818045 Merge pull request #1664 from mtmail/check-import-finished-for-reverse-only
check_import_finished.php - reverse_only mode has less indices
2020-02-06 21:07:04 +01:00
marc tobias
76082ac7cb check_import_finished.php - reverse_only mode has less indices 2020-02-06 16:48:06 +01:00
marc tobias
4a451671d3 Vagrant setup for CentOS 8 2020-02-06 00:43:30 +01:00
marc tobias
a3728b7188 details html page: no longer use place_id in URLs 2020-02-02 01:16:31 +01:00
Sarah Hoffmann
a8711ab013 fix verboseness of nominiatim script during updates 2020-01-31 18:18:50 +01:00
Sarah Hoffmann
3e4754febd Merge pull request #1648 from lonvia/nominatim-as-python-script
Replace C Nominatim indexer with Python script
2020-01-31 17:53:49 +01:00
Sarah Hoffmann
da1d661fa0 remove libxml dependency for travis as well 2020-01-31 15:15:35 +01:00
Sarah Hoffmann
1801db523b fix typo 2020-01-29 11:50:30 +01:00
Sarah Hoffmann
2979c39628 also adapt indexing command in update script 2020-01-29 11:36:12 +01:00
Sarah Hoffmann
bb9bb40287 update cMake build documentation
Remove the dependency on libxml, no longer needed.
2020-01-24 22:53:26 +01:00
Sarah Hoffmann
8f6fdfeb0b forgot to index last rank 2020-01-24 22:06:30 +01:00
Sarah Hoffmann
b4e6d72fde replace nominatim C program 2020-01-24 22:06:30 +01:00
Sarah Hoffmann
a338ebfce0 fix log levels 2020-01-24 22:06:30 +01:00
Sarah Hoffmann
4144364a15 add time display for nominatim.py 2020-01-24 22:06:30 +01:00
Sarah Hoffmann
11c0dd235b clean up and document script 2020-01-24 22:06:30 +01:00
Sarah Hoffmann
4a9502bf88 fix SQL and some other stuff 2020-01-24 22:06:30 +01:00
Sarah Hoffmann
6c0d6d3178 Revert "switch to threading"
This reverts commit 8b1c2181be5aa5335c68d36a49cab9c4e2cd8bef.
2020-01-24 22:06:30 +01:00
Sarah Hoffmann
0a26ca7104 switch to threading 2020-01-24 22:06:30 +01:00
Sarah Hoffmann
2a15b2522f use generator for thread choice 2020-01-24 22:06:30 +01:00
Sarah Hoffmann
c11d1d78e9 add prepared statement 2020-01-24 22:06:30 +01:00
Sarah Hoffmann
7e51aa4cef simple implementation 2020-01-24 22:06:30 +01:00
Sarah Hoffmann
9abb96fa6b Merge pull request #1647 from lonvia/split-out-linking
Split up placex update trigger code
2020-01-24 21:55:19 +01:00
Sarah Hoffmann
879aafc916 fix indent 2020-01-24 21:16:26 +01:00
Sarah Hoffmann
5ec25122f6 rename functions where return parameter changed
Postgresql cannot cleanly reimport these functions when
upgrading, so simply rename to avoid errors.
2020-01-23 22:28:43 +01:00
Sarah Hoffmann
9371b1aeb9 forgot new trigger sql 2020-01-23 22:28:43 +01:00
Sarah Hoffmann
6f6d116451 adapt index for changes name lookup 2020-01-23 22:28:43 +01:00
Sarah Hoffmann
3ff6eccfd7 move trigger creation later in setup 2020-01-23 22:28:43 +01:00
Sarah Hoffmann
5d1fa597ea clean up get_word_id function
Replaced by addr_ids_from_name() which also normalises the
string.
2020-01-23 22:28:43 +01:00
Sarah Hoffmann
3b6c2c9155 getNearestNamed*Feature functions better return values 2020-01-23 22:28:43 +01:00
Sarah Hoffmann
f863040b38 factor out parent search from addr:street/addr:place 2020-01-23 22:28:43 +01:00
Sarah Hoffmann
1033f8bce7 factor out searching for parent road for pois 2020-01-23 22:28:43 +01:00
Sarah Hoffmann
cf4dbbd681 remove unused function 2020-01-23 22:28:43 +01:00
Sarah Hoffmann
6dccc693d0 factor out computation of default names
Also moves the computation down the line so that we never
have to do it twice.
2020-01-23 22:28:43 +01:00
Sarah Hoffmann
c3dc66ce9c factor out place linking sql 2020-01-23 22:28:43 +01:00
Sarah Hoffmann
4856f56d61 adapt test to change in hamlet classification 2020-01-23 22:26:47 +01:00
Sarah Hoffmann
2edc15dfb8 doc: clarify the influence of autovacuum on memory 2020-01-22 12:02:38 +01:00
Sarah Hoffmann
69e31baf68 docs: add a note that the faltnode file is needed for updates
Fixes #1644.
2020-01-22 11:44:05 +01:00
Sarah Hoffmann
586ff0c364 Merge pull request #1638 from mtmail/check-for-invalid-indices
check_import_finished: check for invalid indices
2020-01-14 21:34:35 +01:00
Sarah Hoffmann
f3ba358d50 hint that invalid indices must be manually deleted 2020-01-14 21:33:09 +01:00
Sarah Hoffmann
54bf4c3339 Merge pull request #1637 from mtmail/fix-warnings-in-verbose-warm
warm.php verbose mode was printing errors
2020-01-14 21:30:46 +01:00
Sarah Hoffmann
acda4344de Merge pull request #1636 from lonvia/update-introduction
Update help text on website
2020-01-14 21:29:25 +01:00
marc tobias
6b0afd5d9b check_import_finished: check for invalid indices 2020-01-14 19:36:40 +01:00
marc tobias
850910ed9e warm.php verbose mode was printing errors 2020-01-14 18:24:49 +01:00
Sarah Hoffmann
4ffa11a26c update report a problem page 2020-01-13 22:42:49 +01:00
Sarah Hoffmann
f5e60f8c40 update help links
All links should go to nominatim.org now. Also add links
to Help OSM and the Github repository.
2020-01-13 22:20:53 +01:00
Sarah Hoffmann
ddaf1b79d4 remove special handling of rail
Skip railway=rail in the style, so that installations can remove
it if they wish.
2020-01-08 23:53:23 +01:00
Sarah Hoffmann
d732dc3bb2 update place address levels
Adds province and allotments and downgrades hamlet.
2020-01-08 23:53:03 +01:00
Sarah Hoffmann
f0af5c5643 Merge pull request #1628 from lonvia/split-sql-functions
Split monstrous functions.sql
2020-01-08 21:36:40 +01:00
Sarah Hoffmann
7a194789bc remove remaining sql functions into function/ directory 2020-01-08 11:45:51 +01:00
Sarah Hoffmann
827d7a9a62 move postcode table triggers to own file 2020-01-08 11:22:23 +01:00
Sarah Hoffmann
dae2761137 move placex triggers into own file 2020-01-08 11:18:42 +01:00
Sarah Hoffmann
4304c1a7bb move place triggers into own file 2020-01-07 23:55:38 +01:00
Sarah Hoffmann
c537ea18a4 move functions for interpolation table in own file 2020-01-05 16:36:46 +01:00
Sarah Hoffmann
28fa7be75a move functions for address lookup into own file 2020-01-05 16:16:21 +01:00
Sarah Hoffmann
f1a5862f3d move creation function for aux data into own file
This function is currently unused, so don't even load it.
2020-01-05 16:04:04 +01:00
Sarah Hoffmann
4088e4e371 move importance/wikipedia functions into separate file 2020-01-05 15:55:39 +01:00
Sarah Hoffmann
0ef6425847 move SQL functions for normalisation in separate file 2020-01-05 15:38:20 +01:00
Sarah Hoffmann
7489deb1b7 Merge pull request #1627 from lonvia/cleanup-setup-scripts
Minor cleanup of setup scripts
2020-01-05 14:17:39 +01:00
Sarah Hoffmann
2059e18e8b setup: factor out parameter replacement in SQL scripts
Put all into a single function and use for all SQL
templates.
2020-01-04 23:48:49 +01:00
Sarah Hoffmann
d11ee4c6d9 fix osm link in polygon error view 2020-01-04 21:51:14 +01:00
Sarah Hoffmann
c74cbde329 Merge pull request #1626 from lonvia/move-tests-to-osm2pgsql
Move tests to osm2pgsql
2020-01-04 20:08:35 +01:00
Sarah Hoffmann
20d541af06 remove osm2pgsql tag tests
These tests are now part of the osm2pgsql test suite.
2020-01-04 16:23:29 +01:00
Sarah Hoffmann
2c163b3959 update osm2pgsql (gazetteer output tests) 2020-01-04 16:20:52 +01:00
Sarah Hoffmann
c6a7ef5574 Merge pull request #1624 from lonvia/add-extratags-style
Add new extratags style
2020-01-03 10:06:47 +01:00
Sarah Hoffmann
7005c6af12 add new extratags style
This is the same as the full style but also adds all unused tags
except for a couple of internal tags to the extratags column.
2020-01-02 14:24:51 +01:00
Sarah Hoffmann
256986f01f Merge pull request #1622 from mtmail/clang-faq-entry
Documentation: add FAQ entry about clang not found
2020-01-02 14:14:00 +01:00
Sarah Hoffmann
33d322df9d update osm2pgsql (exclude country and postcode from address tags) 2019-12-28 22:36:02 +01:00
marc tobias
631013be02 Documentation: add FAQ entry about clang not found 2019-12-28 21:21:39 +01:00
Sarah Hoffmann
89a990e000 Merge pull request #1612 from mtmail/cleanup-wiki-scripts
refactor import_wiki* script for readability
2019-12-23 22:05:39 +01:00
Sarah Hoffmann
ccddd9d1de Merge pull request #1619 from mtmail/document-check-import-script
Use check_import_finished in test suite, document
2019-12-23 22:04:25 +01:00
marc tobias
9587fc9909 refactor import_wiki* script for readability 2019-12-23 21:38:33 +01:00
marc tobias
22b7aed769 Use check_import_finished in test suite, document 2019-12-23 21:25:06 +01:00
marc tobias
7db0da40ad new script utils/check_import_finished.php 2019-12-23 15:13:18 +01:00
Sarah Hoffmann
2be70b2c36 Merge pull request #1596 from mtmail/remove-obsolete-wikidata-scripts
remove old wikidata script. See data-sources/wikipedia-wikidata/ for new process
2019-12-18 21:55:17 +01:00
Sarah Hoffmann
0e03668cf2 Merge pull request #1601 from mtmail/spheric-distance-on-details-page
details page: differentiate between spheric distance and distance in meters
2019-12-18 21:54:06 +01:00
Sarah Hoffmann
28b89daa22 remove duplicate addr:country from style 2019-12-17 20:48:27 +01:00
Sarah Hoffmann
2bfa2f4292 Merge pull request #1609 from txtor/master
typo - fixes openstreetmap#1606
2019-12-17 20:46:20 +01:00
Francesc Hervada-Sala
3b22b9911b typo - fixes openstreetmap#1606 2019-12-17 17:24:29 +01:00
Sarah Hoffmann
bc68ff1e43 update osm2pgsql (deletion and address updates) 2019-12-16 21:27:41 +01:00
Sarah Hoffmann
f59af7483b Merge pull request #1604 from mtmail/wiki-create-pagelinkcount-earlier
wikipedia: create all language pagelinkcount tables before querying them
2019-12-14 21:43:16 +01:00
marc tobias
394f85a96b wikipedia: create all language pagelinkcount tables before querying them 2019-12-14 20:36:54 +01:00
marc tobias
626e3238f2 details page: differentiate between spheric distance and distance in meters 2019-12-11 00:49:32 +01:00
marc tobias
2051a84a09 remove old wikidata script. See data-sources/wikipedia-wikidata/ for new process 2019-12-03 19:27:32 +01:00
Sarah Hoffmann
f8bd4f5133 add test for finding housenumber 0 2019-12-01 20:36:59 +01:00
Sarah Hoffmann
a4e514033d Merge branch 'housenumber-zero' of https://github.com/mtmail/Nominatim into mtmail-housenumber-zero 2019-12-01 20:17:27 +01:00
Sarah Hoffmann
95f20ed7ab Merge pull request #1585 from mmd-osm/patch-1
Mention FAQ / troubleshooting page in README
2019-12-01 20:13:35 +01:00
Sarah Hoffmann
bfe92ea191 bdd tests: enforce use of full import style 2019-12-01 16:25:39 +01:00
Sarah Hoffmann
d3bacf475a Revert "update osm2pgsql (reduce memory usage)"
This reverts commit 3474464894.
2019-12-01 16:25:02 +01:00
Sarah Hoffmann
3474464894 update osm2pgsql (reduce memory usage) 2019-11-29 19:31:48 +01:00
mmd
5b25bff2d8 Mention FAQ / troubleshooting page in README 2019-11-26 22:33:00 +01:00
Sarah Hoffmann
c36896c524 Merge pull request #1578 from lonvia/docs-installation-support
Update installation documentation wrt memory usage
2019-11-26 22:21:45 +01:00
Sarah Hoffmann
d9fe25ac2e Merge pull request #1581 from mtmail/wrap-leaflet-map
Allow leaflet map to wrap-around, still longitude should be -180..+180
2019-11-26 21:54:26 +01:00
Sarah Hoffmann
0bd006eef8 fix typo 2019-11-26 21:52:37 +01:00
Sarah Hoffmann
081d1f9779 Merge pull request #1582 from mtmail/documentation-direct-link-osm2pgsql
add wiki link to osm2psql README
2019-11-26 21:51:30 +01:00
Sarah Hoffmann
546c975e28 Merge pull request #1583 from mtmail/documentation-pear-db-no-longer-prerequisite
PHP PEAR::DB is not longer a prerequisite
2019-11-26 21:51:01 +01:00
marc tobias
05fb037edb PHP PEAR::DB is not longer a prerequisite 2019-11-25 19:11:08 +01:00
marc tobias
5cdc196df1 add wiki link to osm2psql README 2019-11-25 19:08:34 +01:00
marc tobias
0896c07972 Allow map to wrap-around, still longitude should be -180..+180 2019-11-25 18:56:46 +01:00
Sarah Hoffmann
be9f54d0a9 set default osm2pgsql to 0 when using flatnode file 2019-11-24 14:36:36 +01:00
Sarah Hoffmann
88fab44006 update minimum required RAM to 64GB
Also adds more background explanation on time and RAM use,
as well as a hint that github issues are not good for
installation support.
2019-11-24 10:31:34 +01:00
Sarah Hoffmann
000fe3ddff remove reference to nominatim.xml OpenSearch description
The file was meant for the openstreetmap instance only.

Fixes #1554.
2019-11-20 11:53:04 +01:00
Sarah Hoffmann
f180f99a95 Merge pull request #1570 from lonvia/wikipedia-importance-updates
Wikipedia importance updates
2019-11-20 11:25:23 +01:00
Sarah Hoffmann
8d9aa9bf33 formatting: avoid long lines 2019-11-20 10:57:36 +01:00
Sarah Hoffmann
9fed91a47f adapt tests for new wikipedia tables 2019-11-20 09:57:40 +01:00
Sarah Hoffmann
12f830fbbb fix loading of data without wikipedia files
Also removes unused place_boundingbox table.
2019-11-17 23:58:43 +01:00
Sarah Hoffmann
6d764a05b0 prefer English wikipedia for wikidata matches 2019-11-17 19:53:42 +01:00
Sarah Hoffmann
cd3ddec746 Switch to sql.gz format for wikipedia data
The dump import is no longer needed.
2019-11-17 10:09:31 +01:00
Sarah Hoffmann
e4555a208d fix wikidata import
The loop was not skipping empty results of get_wikipedia_match().
2019-11-17 10:06:34 +01:00
Sarah Hoffmann
d53af96aa4 update documentation for new wikipedia data 2019-11-16 16:52:23 +01:00
Sarah Hoffmann
d1a9dc0f24 use wikidata links for importance as well 2019-11-16 16:52:23 +01:00
Sarah Hoffmann
a1bcb28cea also update wikipedia from linked places 2019-11-16 16:52:23 +01:00
Sarah Hoffmann
3fbba8b9db add command for recomputing importance 2019-11-16 16:52:23 +01:00
Sarah Hoffmann
5fb850982a move importance computation into its own function 2019-11-16 16:52:23 +01:00
Sarah Hoffmann
6f2e767c77 Merge pull request #1557 from mtmail/document-boundingbox
documentation: add note what bounding box can be used for
2019-11-13 10:30:22 +01:00
Sarah Hoffmann
1ead5b0f3f Merge pull request #1563 from mtmail/remove-pear-db-faq-entries
documentation: remove FAQ entries related to PEAR::DB
2019-11-13 10:29:52 +01:00
Sarah Hoffmann
eb6681d486 Merge pull request #1564 from mtmail/typo-sql-in-postcodes-md
typo in SQL in Postcodes.md
2019-11-13 10:28:57 +01:00
marc tobias
7503987630 typo in SQL in Postcodes.md 2019-11-11 23:34:44 +01:00
marc tobias
1d337e8a76 documentation: remove FAQ entries related to PEAR::DB, we no longer use that 2019-11-11 16:54:12 +01:00
Sarah Hoffmann
f42e40712e Merge pull request #1555 from mtmail/setup-escape-shell-args
setup: escape arguments when executing shell commands (psql, createdb)
2019-11-06 22:47:00 +01:00
Sarah Hoffmann
b5fb8608ba some reformatting of documentation changes and typo fixes
The newest mkdocs is more demanding when it comes to code block
formatting.
2019-11-06 22:34:43 +01:00
marc tobias
dea1d67d03 documentation: new page explaining calculated postcodes 2019-11-06 22:15:44 +01:00
Sarah Hoffmann
9e6fc8f073 Merge pull request #1548 from mtmail/centos7-postgresql-11
Update CentOS7 instruction to postgresql 11 (default was 9.2)
2019-11-06 21:52:45 +01:00
marc tobias
db6da75683 documentation: add note what bounding box can be used for 2019-11-06 17:33:20 +01:00
marc tobias
9cfd891fb9 setup: escape arguments when executing shell commands (psql, createdb) 2019-11-05 23:50:46 +01:00
marc tobias
f985680d2c Update CentOS instruction to postgresql 11 (default was 9.2) 2019-11-02 14:17:53 +01:00
Sarah Hoffmann
8ae317e002 Merge pull request #1545 from mtmail/details-index-html-page
HTML page with search form when /details.php called without params
2019-10-31 18:30:35 +01:00
marc tobias
36e99c43ce HTML page with search form when /details.php called without params 2019-10-31 17:13:43 +01:00
marc tobias
eeb26aaa6f Addresses with housenumber 0 are found 2019-10-31 13:52:10 +01:00
Sarah Hoffmann
260dbe302a Merge pull request #1542 from mtmail/tiger-import-more-verbose
Tiger data import: display which path is searched for files
2019-10-31 10:24:02 +01:00
marc tobias
c297584726 Tiger data import: display which path is searched for files 2019-10-30 15:50:29 +01:00
Sarah Hoffmann
5930383404 Merge branch 'fix-getcol-zero' of https://github.com/eyusupov/Nominatim into eyusupov-fix-getcol-zero 2019-10-28 23:07:28 +01:00
Sarah Hoffmann
7395eb9791 Merge pull request #1532 from eyusupov/use-extradir
Use ExtraDataPath for country grid
2019-10-28 23:01:42 +01:00
Sarah Hoffmann
e0de838b13 adapt tests to short_name demotion 2019-10-28 22:53:41 +01:00
Sarah Hoffmann
5292239714 demote short_name in the list of names to display
The usage of short_name in OSM has changed with time. It is no
longer suitable as a display name.

Fixes #783.
2019-10-28 22:12:21 +01:00
Eldar Yusupov
b54fca4f7e Handle zero in getCol() 2019-10-26 20:19:36 +03:00
Sarah Hoffmann
26f47d2eb7 switch to pygments for mkdocs hilighting 2019-10-25 23:57:23 +02:00
Sarah Hoffmann
233e064f0b prepare for 3.4.0 release 2019-10-25 22:04:59 +02:00
Eldar Yusupov
72c0898409 Add optional compilation of osm2pgsl 2019-10-25 23:01:19 +03:00
Eldar Yusupov
92d1f5122b Separate Nominatim build 2019-10-25 23:01:19 +03:00
Eldar Yusupov
314de3c3c0 Add options to compile only PG module/frontend
It makes it easier to install Nominatim server and PostgresSQL on
separate servers or use separate docker images for them.
2019-10-25 23:01:19 +03:00
Eldar Yusupov
544db43026 Separate more dependencies 2019-10-25 23:01:19 +03:00
Eldar Yusupov
4aca3700b2 Move nominatim lib deps to its dir 2019-10-25 23:01:19 +03:00
Eldar Yusupov
af833ff042 Include CheckSymbolExists CMake module 2019-10-25 23:01:19 +03:00
Eldar Yusupov
96c1a0a101 Use ExtraDataPath for country grid 2019-10-23 06:19:04 +03:00
Sarah Hoffmann
203e210d3a update osm2pgsql (bound COPY buffers) 2019-10-22 22:47:03 +02:00
Sarah Hoffmann
ff1c78fef5 Merge pull request #1502 from mtmail/specialphrases-quotes
Specialphrases quotes
2019-10-22 21:41:53 +02:00
Sarah Hoffmann
d3a731dae4 Merge pull request #1528 from chatelao/patch-2
Typo - Wekipedia (Wikipedia)
2019-10-22 00:21:33 +02:00
chatelao
73a4433d8e Typo - Wekipedia (Wikipedia) 2019-10-21 15:35:55 +02:00
Sarah Hoffmann
3b4ffea690 Merge pull request #1526 from lonvia/index-concurrently
create/drop indexes concurrently
2019-10-19 18:23:59 +02:00
Sarah Hoffmann
05d7f91392 fix rank of postcode results
Fixes #1487.
2019-10-19 18:12:22 +02:00
Sarah Hoffmann
e3e9f69654 fix rank of postcode results
Fixes #1487.
2019-10-19 17:57:57 +02:00
Sarah Hoffmann
34a4a9b08f create/drop indexes concurrently
Fixes #1507.
2019-10-19 17:13:05 +02:00
Sarah Hoffmann
e0836664e5 Merge pull request #1524 from MatthiasLohr/bugfix/uninitialized-string-offset
Fix for #1523: Fix PHP warning
2019-10-15 09:51:20 +02:00
Matthias Lohr
8d7499342f Fixed PHP warning from #1523
Signed-off-by: Matthias Lohr <mail@mlohr.com>
2019-10-15 08:46:19 +02:00
Sarah Hoffmann
a7b24627b5 Merge pull request #1484 from mtmail/ignore-errors-on-setup-drop
on --drop warn on non-existing tables, dont croak
2019-10-15 00:37:33 +02:00
Sarah Hoffmann
452324cf01 Merge pull request #1519 from mtmail/doc-viewbox-parameters2
documentation: add note what x,y mean for viewbox parameter
2019-10-15 00:34:30 +02:00
Sarah Hoffmann
15c5c8db24 add place=city_block/quarter to address hierarchy
Fixes #1516.
2019-10-14 23:49:06 +02:00
marc tobias
423efd54e4 documentation: add note what x,y mean for viewbox parameter 2019-10-08 19:22:51 +13:00
TC Haddad
5e45e0b3d7 Gsoc2019 contributions for adding Wikidata to Nominatim (#1475)
Complete rewrite of wikipedia processing scripts, addition of processing wikidata, new data source, new documentation by @tchaddad during Google Summer of Code 2019 project.
2019-10-06 15:56:39 +08:00
Sarah Hoffmann
a60e7f2376 Merge pull request #1511 from cbpetersen/patch-1
Remove duplicate format query param
2019-10-01 13:52:24 +02:00
Christoffer Bo Petersen
ac7f0f7581 Remove duplicate format query param 2019-10-01 12:37:53 +02:00
marc tobias
9c872345d6 special phrases: use printf, line length below 120char 2019-09-19 01:12:42 +02:00
marc tobias
bd312fa747 special phrases: sometimes quotes are not escaped 2019-09-19 00:20:30 +02:00
marc tobias
573fba55af SetupClass: on --drop check if table exists first 2019-09-04 13:12:11 +02:00
Sarah Hoffmann
39787f7d62 Merge pull request #1474 from mtmail/tiger-data-2019
US TIGER 2019 data got released
2019-09-03 22:54:22 +02:00
Sarah Hoffmann
f4c067d527 Merge pull request #1478 from tbertels/patch-1
Remove administrative arrondissements from Belgian addresses
2019-09-02 17:50:51 +02:00
Thomas Bertels
8d3595c3e2 Remove administrative arrondissements from Belgian addresses
"administrative7" -> [14, 0]
2019-08-27 14:15:18 +02:00
Sarah Hoffmann
b81a57f1e4 Merge pull request #1477 from dpasqualin/fix-python-shebang
Fix python shebang following PEP 394 recommendation
2019-08-26 22:40:43 +02:00
Diego Pasqualin
a624f8b599 Fix python shebang following PEP 394 recommendation 2019-08-26 14:54:19 +02:00
marc tobias
74f49a9d89 US TIGER 2019 data got released 2019-08-23 14:59:03 +02:00
TC Haddad
b7b89b30ea fix spelling on US-Tiger documentation page (#1459) 2019-08-12 01:40:13 +02:00
Sarah Hoffmann
fb012504b2 Merge pull request #1444 from lonvia/require-python-3
Require python 3
2019-08-07 22:38:43 +02:00
Sarah Hoffmann
7ed9ecf350 Merge pull request #1453 from mtmail/add-boundingbox-to-lookup-results
lookup endpoint returns boundingbox
2019-08-06 20:40:06 +02:00
marc tobias
3af1520461 lookup endpoint returns boundingbox 2019-08-05 23:32:46 +02:00
Sarah Hoffmann
a7edda32ba Merge pull request #1445 from mtmail/hierarchy-endpoint-broke
/hierarchy.php was missing namespace calling AddressDetails
2019-07-28 23:11:02 +02:00
marc tobias
7b09e320a8 /hierarchy.php was missing namespace calling AddressDetails 2019-07-28 22:05:51 +02:00
Sarah Hoffmann
46e077c40b adapt TIGER conversion script for python 3 2019-07-28 20:56:02 +02:00
Sarah Hoffmann
7753ba6019 require python 3 for all tools used in updates 2019-07-28 20:36:35 +02:00
Sarah Hoffmann
511204c158 Merge pull request #1443 from lonvia/reorganize-search-name-partition-tables
Reorganize search name partition tables
2019-07-28 15:18:12 +02:00
Sarah Hoffmann
65daef70c1 Merge pull request #1433 from mtmail/us-postcode-import-optional
make US postcode data to an optional download
2019-07-28 14:50:13 +02:00
Sarah Hoffmann
7ab373e86d add cmake mode for building documentation only 2019-07-28 14:27:14 +02:00
Sarah Hoffmann
79b81d39d8 streamline indexes of search_name partition tables
Remove index on name_vector. We always do near search where the
geometry index is sufficient. Also split centroid index in low
and high rank indexes. Reduces index size by about 25%.
2019-07-28 13:29:35 +02:00
Sarah Hoffmann
2bbe5017d4 use bbox of geometry when searching for attached streets
As we are doing a distance search, this improves results for
large places like airports.

Fixes #1442.
2019-07-28 13:28:27 +02:00
marc tobias
765a932561 make US postcode data to an optional download 2019-07-24 01:13:57 +02:00
Sarah Hoffmann
4a2c9431ee Merge pull request #1432 from mtmail/two-outputformats-for-lookup-endpoint
lookup endpoint supports jsonv2 and geocodejson output now
2019-07-22 23:31:56 +02:00
Sarah Hoffmann
de15d10f86 Merge pull request #1430 from mtmail/exclude-negative-tiger-housenumber-ranges
during TIGER import skip records with negative house number range
2019-07-22 23:30:06 +02:00
Sarah Hoffmann
55d414bd72 Merge pull request #1427 from mtmail/documentation-how-to-build-documentation
New readme file on how to build the documentation
2019-07-22 21:24:32 +02:00
marc tobias
1560685020 lookup endpoint supports jsonv2 and geocodejson output now 2019-07-21 23:20:48 +02:00
marc tobias
0e44659033 during TIGER import skip records with negative house number range 2019-07-21 21:41:12 +02:00
marc tobias
3b39cfb1cf New readme file on how to build the documentation 2019-07-21 21:31:14 +02:00
Sarah Hoffmann
15bca71b0d Merge pull request #1422 from lonvia/remove-country-from-addressline
Remove country from addressline
2019-07-16 22:29:17 +02:00
Sarah Hoffmann
3c12455c5b Merge pull request #1421 from asantoz/patch-1
Minor change on lookup endpoint doc
2019-07-11 10:09:33 +02:00
Sarah Hoffmann
927b4c928e add migration hints for country table 2019-07-10 22:54:32 +02:00
Sarah Hoffmann
be47cd2549 remove country from place_addressline
The country information can be determined sufficiently from
the country code. We only loose the specific OSM object
behind the address.

Also streamlines the location_area_country table.
2019-07-10 21:29:47 +02:00
André Santos
a4a17f93f5 Minor change on lookup endpoint doc
Fix documentation about lookup endpoint on output formats available on filter `format`
2019-07-10 19:26:38 +01:00
Sarah Hoffmann
745e52b798 Merge pull request #1419 from asantoz/minor-fix-doc
Minor issue on api docs
2019-07-08 22:23:10 +02:00
André Santos
bbc2da2a4b Minor issue on api docs
Fix a minor issue on API docs in details endpoint example 🙏
2019-07-08 20:08:43 +01:00
Sarah Hoffmann
4c1793b4e3 recreate interpolations when one of their support nodes changes
A simple update is not enough because the interpolation splits
might change as well as the housenumbers.

Fixes #1360.
2019-07-03 23:15:54 +02:00
Sarah Hoffmann
d1ca73f813 Reset housenumber on every place update
As it is a computed field, it needs to be computed from scratch
to take into account any surrounding changes.

Fixes #1395.
2019-07-03 20:56:35 +02:00
Sarah Hoffmann
cdc7d0fe0e remove visibility modifier from constants again
Only supported on PHP >= 7.1.
2019-07-02 23:24:49 +02:00
Sarah Hoffmann
a27a271034 Merge pull request #1415 from nslxndr/fix-db-log
Fix DB log
2019-07-02 20:47:42 +02:00
Sandor Nagy
6c097d24b1 Fix SQL concatenation for new query log 2019-07-02 01:19:59 +02:00
Sandor Nagy
0115b655bd lib/log.php broke after switch to PDO DB abstraction 2019-07-02 01:19:55 +02:00
Sarah Hoffmann
e8f1463cc2 Merge pull request #1414 from lonvia/remove-more-places-from-address
Remove more places from address ranking
2019-07-01 22:33:20 +02:00
Sarah Hoffmann
e164d53fcc adapt tests to new place address ranks 2019-06-30 23:09:43 +02:00
Sarah Hoffmann
b8f7b3cc8d Remove county places and Regierungsbezirke vom German addresses 2019-06-30 22:27:44 +02:00
Sarah Hoffmann
b0e6fb73c6 generally remove all country and state places from address
Gets rid of the hard-coded expection for place nodes and sets
the address rank generally via the address level config instead.
That means only administrative boundaries are now used at that
level in addresses.
2019-06-30 22:27:44 +02:00
Sarah Hoffmann
dd50f1737b Merge pull request #1412 from lonvia/rewrite-wordset-computation
Rework word set computation
2019-06-30 10:48:09 +02:00
Sarah Hoffmann
38a99856c0 Rework word set computation
Switch from an recursive algorithm for computing the word sets
to an iterative one that benefits from caching intermediate
results. This considerably reduces the amount of memory needed,
so that the depth restriction can be dropped. To ensure that
the number of word sets remains manageable, only sets up to
a certain length are accepted and only a certain number of
total word sets. If word sets need to be dropped, we drop
the ones with more words per word set first.

To further reduce the number of potential word sets, the valid
tokens are looked up first and then only word sets containing
valid tokens are computed.

Fixes #1403, #1404 and #654.
2019-06-29 18:22:31 +02:00
Sarah Hoffmann
09e7f0d013 remove historic:neighbourhood from address ranks
Should not be reverse searchable.

Fixes #1379.
2019-06-10 20:12:27 +02:00
Sarah Hoffmann
e05e413cc4 use real centroid when looking for near roads
The point-on-surface may be at the corner in large objects, so
that roads are too far away.

Fixes #1389.
2019-06-10 18:23:12 +02:00
Sarah Hoffmann
2c21cbb5e6 update osm2pgsql (downgrading unnamed places)
Also adds tests for updating unnamed places.
2019-06-10 18:22:11 +02:00
Sarah Hoffmann
3bc4b4bf9f update osm2pgsql (import special tags) 2019-06-09 13:58:05 +02:00
Sarah Hoffmann
a09f2a6987 Merge pull request #1381 from mtmail/faq-entry-about-managed-database-services
FAQ entry about managed database services
2019-06-09 11:04:47 +02:00
Sarah Hoffmann
1f57d730df Merge pull request #1394 from mtmail/update-postcodes-without-colon
exclude postcode ranges separated by colon from centre point calculation
2019-06-09 11:03:10 +02:00
Sarah Hoffmann
eebc72b2bc Merge pull request #1388 from mtmail/register-shutdown-function
register shutdown function to handle out-of-memory errors
2019-06-09 10:20:19 +02:00
rlytleatrel8edto
2f3cf19afa Ubuntu18-nginx install instructions - fix php-fpm socket path (#1398)
Ubuntu18-nginx install instructions - fix php-fpm socket path
2019-06-02 17:04:02 +02:00
marc tobias
10fbda702b exclude postcode ranges separated by colon from centre point calculation 2019-05-25 20:43:38 +02:00
Sarah Hoffmann
17f130550e Merge pull request #1387 from joto/master
Fix some minor issues in docs.
2019-05-23 23:54:29 +02:00
Jochen Topf
251f335fe3 Revert layout changes in list. 2019-05-22 09:25:41 +02:00
marc tobias
ed2fb84e82 register shutdown function to handle out-of-memory errors 2019-05-21 18:41:06 +02:00
Jochen Topf
634684236c Fix some minor issues in docs. 2019-05-21 13:55:16 +02:00
marc tobias
11e0d9ec14 FAQ entry about managed database services 2019-05-14 19:45:56 +02:00
Sarah Hoffmann
5fd8f5aa27 Merge pull request #1372 from lonvia/raise-postgres-version
increase minimum versions for PostgreSQL, Postgis and PHP
2019-05-02 22:56:08 +02:00
Sarah Hoffmann
c05ddb6119 increase minimum versions for PostgreSQL, Postgis and PHP
Remove checks and hacks for older versions.
2019-05-02 21:48:40 +02:00
Sarah Hoffmann
ec86a972a2 prepare for 3.3.0 release 2019-05-01 09:38:45 +02:00
Sarah Hoffmann
62da8a34f3 add documentation for new reverse zoom 17 2019-04-30 23:27:04 +02:00
Sarah Hoffmann
6511ec3aa8 Convert importance to float value
Fixes #1369.
2019-04-30 23:21:53 +02:00
Sarah Hoffmann
1707157c4d fix indent in docs 2019-04-29 23:13:37 +02:00
Sarah Hoffmann
ee49ab84a4 Merge branch 'markdown-syntax-fix-gb-postcodes' of https://github.com/mtmail/Nominatim into mtmail-markdown-syntax-fix-gb-postcodes 2019-04-29 23:12:37 +02:00
marc tobias
b92a55f5fe Readme for GB postcodes had markdown syntax error 2019-04-28 23:18:36 +02:00
Sarah Hoffmann
7d3b16f24c Ignore no-fatal errors during dump file restore
The owner should never be restored, the table should be owned
by the caller instead. Non-existing indexes and similar only
started to throw a warning with Postgresql 9.4 and later, so
ignore them explicitly there.
2019-04-28 22:44:42 +02:00
Sarah Hoffmann
b612b99421 Merge pull request #1321 from mtmail/interpolating-0-housenumbers
Support housenumber=0 in interpolations
2019-04-19 18:29:43 +02:00
Sarah Hoffmann
5a5b3de79a Merge pull request #1359 from mtmail/fix-export-script
utils/export.php broke after switch to PDO DB abstraction
2019-04-17 23:04:51 +02:00
marc tobias
0862e21a1b utils/export.php broke after switch to PDO DB abstraction 2019-04-17 22:29:50 +02:00
Sarah Hoffmann
c148b768f4 Merge pull request #1358 from mtmail/travis-php-7dot1
travis-CI: use PHP 7.1
2019-04-17 22:05:28 +02:00
marc tobias
fab9f684af travis-CI: use PHP 7.1 2019-04-17 16:05:49 +02:00
Sarah Hoffmann
0af48fe802 exclude all objects without address rank from reverse
This was forgotten when looking for a housenumber for
a street point.

Fixes #1319.
2019-04-16 23:13:27 +02:00
Sarah Hoffmann
a9ae2c7457 add reverse zoom level that includes minor streets
Zoom 17 now also resolves service roads and similar.

Fixes #1350.
2019-04-15 22:43:07 +02:00
Sarah Hoffmann
87c0049e75 isaddress field may be missing in details view 2019-04-14 12:03:37 +02:00
Sarah Hoffmann
e5eb7ecdc1 Merge branch 'observe-bounded-viewbox-in-postcode-search' of https://github.com/mtmail/Nominatim into mtmail-observe-bounded-viewbox-in-postcode-search 2019-04-14 11:29:28 +02:00
Sarah Hoffmann
a7e7823535 Merge pull request #1336 from mtmail/faq-entry-about-buffer-not-owned
New FAQ entry about -buffer is not owned by resource owner-
2019-04-14 11:27:36 +02:00
Sarah Hoffmann
33ff96fd83 Merge pull request #1348 from mtmail/checkmodulepresence-to-raise-exception
checkModulePresence now raises exception instead of its callers
2019-04-14 11:25:52 +02:00
Sarah Hoffmann
58852b3eeb Merge pull request #1347 from mtmail/pdo-returns-proper-boolean
PDO library returns proper boolean. We dont need string comparison
2019-04-14 11:24:23 +02:00
Sarah Hoffmann
403ee260f6 Ensure that postcode relations are used in addresses
Postcode nodes are normally thrown away as they only play
a role for computing artifical postcodes. However, if we
have a postcode area this still should take part of the
address.

Fixes #1330.
2019-04-14 11:20:03 +02:00
marc tobias
84149f26df checkModulePresence now raises exception instead of its callers 2019-04-02 18:37:11 +02:00
marc tobias
2ab836c11c PDO library returns proper boolean. We dont need string comparison 2019-04-02 16:52:37 +02:00
marc tobias
7d9dbd62c7 Support housenumber=0 in interpolations 2019-04-02 15:13:45 +02:00
marc tobias
c9a6350894 On postcode searches observe given bounded viewbox 2019-04-02 14:49:31 +02:00
Sarah Hoffmann
2a4198f94d add test for issue #1343
Keyword details for countries (which don't have address details).
2019-03-26 21:49:44 +01:00
marc tobias
850ab6999c if nameaddress_vector was {} the database queries failed 2019-03-26 18:03:26 +01:00
marc tobias
2946e81995 New FAQ entry about -buffer is not owned by resource owner- 2019-03-19 01:52:35 +01:00
Sarah Hoffmann
c78a64ec9b Merge pull request #1334 from mtmail/fix-PDOException-call
PDOException call in catch was causing exception itself
2019-03-18 21:26:23 +01:00
marc tobias
61386c5b4d PDOException call in catch was causing exception itself 2019-03-17 02:47:28 +01:00
Sarah Hoffmann
279eae4b92 Merge pull request #1333 from Arun179/patch-1
Rectified a small spelling mistake
2019-03-14 19:06:07 +01:00
Arun Kumar
37f7af56e4 Rectified a small spelling mistake
changed "mailinglist" to "mailing list"
2019-03-14 22:53:09 +05:30
Sarah Hoffmann
ec2d491dc8 Merge pull request #1328 from mtmail/php-pdo-with-prepare
Nominatim::DB support input variables, custom error messages
2019-03-13 11:10:17 +01:00
marc tobias
890d415e1f Nominatim::DB support input variables, custom error messages 2019-03-10 16:56:36 +01:00
Sarah Hoffmann
75f951d254 Merge pull request #1318 from mtmail/php-pdo
replace database abstraction DB with PDO
2019-03-09 11:27:51 +01:00
marc tobias
d4b633bfc5 replace database abstraction DB with PDO 2019-03-09 00:18:15 +01:00
marc tobias
b20a534e0c add logEnd to reverse.php, just like search.php 2019-02-27 20:22:50 +01:00
Sarah Hoffmann
64f7b13888 Merge pull request #1315 from mtmail/ui-initialize-switch-to-reverse-link
UI: update the switch-to-reverse link after each map click
2019-02-24 17:51:09 +01:00
Sarah Hoffmann
7523359aba Merge pull request #1314 from mtmail/query-php-5dot4
query.php - we no longer support PHP < 5.4
2019-02-24 17:50:09 +01:00
marc tobias
eae9e1cbfa UI: update the switch-to-reverse link after each map click 2019-02-24 16:34:21 +01:00
marc tobias
178cb98795 query.php - we no longer support PHP < 5.4 2019-02-24 16:22:55 +01:00
Sarah Hoffmann
8f0c628310 downgrade housenumbers without numbers
Fixes #1312.
2019-02-24 14:39:14 +01:00
Sarah Hoffmann
16794a84de Change accepted features for reverse geocoding at rank 30
Always exclude line features (removes railways, tunnels,
piers, historical streets etc.) and boundaries (removes
electoral, historical boundaries etc.)

Fixes #1313.
2019-02-24 11:00:33 +01:00
Sarah Hoffmann
189da9afb3 add osm_id index for osmline table
Needed when deleting address interpolation.
2019-02-21 23:26:31 +01:00
Sarah Hoffmann
af97682cca add hint that setup.php must be run from build directory
Fixes #1307.
2019-02-13 21:58:59 +01:00
Sarah Hoffmann
bdd64093e5 Merge pull request #1295 from mtmail/move-searchrank-labels-to-php
Remove get_addressrank_label. Move get_searchrank_label to PHP
2019-02-10 17:22:49 +01:00
Sarah Hoffmann
1dc5a6e5f8 Merge pull request #1294 from mtmail/behave-support-for-DB-PORT
BDD: support for DB_PORT environment variable
2019-02-10 17:20:57 +01:00
Sarah Hoffmann
89a888de76 Merge pull request #1305 from mtmail/travis-ci-ubuntu-16
Travis CI Ubuntu 14 => Ubuntu 16
2019-02-10 17:20:06 +01:00
marc tobias
3be797c759 BDD: support for DB_PORT environment variable 2019-02-09 20:54:18 +01:00
marc tobias
ad585771e7 Travis CI Ubuntu 14 => Ubuntu 16 2019-02-09 20:47:35 +01:00
marc tobias
853b536394 Remove get_addressrank_label. Move get_searchrank_label to PHP 2019-02-09 20:38:36 +01:00
Sarah Hoffmann
bfb20aaa47 Merge pull request #1282 from lonvia/remove-self-from-place-addressline
Remove self from place addressline
2019-02-09 16:00:46 +01:00
Sarah Hoffmann
cf4dcb12ed docs: adapt sizes to smaller place_address table 2019-02-09 15:26:10 +01:00
Sarah Hoffmann
3811d916b9 remove self-reference from place-addressline 2019-02-09 15:26:10 +01:00
Sarah Hoffmann
db6d3ba486 Merge pull request #1304 from mtmail/travis-use-composer-instead-of-pear-install
pear.php.net offline. Use composer instead of pear install
2019-02-09 15:24:04 +01:00
Sarah Hoffmann
dd5315cbaa Merge pull request #1303 from mtmail/remove-deprecated-phpunit-config-key
remove phpunit config key deprecated since version 3.5
2019-02-09 09:54:34 +01:00
marc tobias
f8ac0ef0b9 pear.php.net offline. Use composer instead of pear install 2019-02-09 02:19:00 +01:00
marc tobias
b56f7e8ad2 remove phpunit config key deprecated since version 3.5 2019-02-09 00:37:11 +01:00
Sarah Hoffmann
57ca1e0cf6 Merge pull request #1296 from mtmail/readme-markdown-syntax-error
README: tiny markdown syntax error
2019-01-28 21:19:46 +01:00
marc tobias
d9e0ef0ebf README: tiny markdown syntax error 2019-01-28 19:42:40 +01:00
Sarah Hoffmann
8237aba840 Do not allow --no-index together with --import-osmosis-all
Fixes #1283.
2019-01-26 15:25:00 +01:00
Sarah Hoffmann
63781c4953 set import_status when indexing only
Makes sure that things work as expected when running
`--import-osmosis --no-index` and `--index` separately.

Fixes #1284.
2019-01-26 15:01:39 +01:00
Sarah Hoffmann
c822012aad ignore admin boundary ways for countries and states
Countries and states are mapped world-wide as relations by now.

Fixes #543 and #1291.
2019-01-26 13:37:10 +01:00
Sarah Hoffmann
7d192ace6d Merge pull request #1277 from lonvia/osm2pgsql-import-from-json
Osm2pgsql import from json style file
2019-01-10 20:36:20 +01:00
Sarah Hoffmann
58e461e4c7 postcodes also need fallback 2019-01-08 23:46:18 +01:00
Sarah Hoffmann
5dc10bd5a2 add final missing import numbers 2019-01-08 22:54:41 +01:00
Sarah Hoffmann
f9a098743b update osm2pgsql (custom style) 2019-01-08 22:54:41 +01:00
Sarah Hoffmann
f1fe70656f more style docs 2019-01-08 22:54:41 +01:00
Sarah Hoffmann
e24ea7c1bb add tests for import of interpolations 2019-01-08 22:54:41 +01:00
Sarah Hoffmann
8e2e852b89 add postcodes and interpolations to osm2pgsql style 2019-01-08 22:54:41 +01:00
Sarah Hoffmann
4c10294a29 document import style variants 2019-01-08 22:54:41 +01:00
Sarah Hoffmann
31bf7443a6 fix typo 2019-01-08 22:54:41 +01:00
Sarah Hoffmann
e6d18fc948 fixup admin import style and add two new ones
Remove unnamed landuses and postcode points from
importing. The latter will cause all objects with
address tags to be imported after all. Not expected
in the admin import style.
2019-01-08 22:54:41 +01:00
Sarah Hoffmann
a90ace7fa1 add documentation for new import style 2019-01-08 22:54:41 +01:00
Sarah Hoffmann
caa8210112 Switch to configurable style for osm2pgsql
Includes the full style, which is the same as now (minus
sidwalk exclusion) and a minimal style for boundaries only.
2019-01-08 22:54:41 +01:00
Sarah Hoffmann
1c85edbda9 adapt warm.php to new Result objects
Fixes #1276.
2019-01-08 22:45:21 +01:00
Sarah Hoffmann
72d19cd523 keep country_osm_grid table when dropping update tables 2019-01-08 22:44:33 +01:00
Sarah Hoffmann
cc17aa8d6b Remove postcodes also from word table when they no longer exist
Also adds tests for postcode updates.

Fixes #1273.
2019-01-04 23:11:47 +01:00
Sarah Hoffmann
181e238b55 Do not log file sizes in the index step on updates
Fixes #1274.
2019-01-04 21:51:18 +01:00
Sarah Hoffmann
7d74bf781c correctly discard partially matching duplicates
The same result may be found with different result ranks
in the same search loop when housenumber or postcode are
part of the name or address. In this case we need to keep
the result with the lower result rank.

Fixes #1264.
2019-01-03 21:49:50 +01:00
Sarah Hoffmann
a0fde50c08 Merge pull request #1263 from mtmail/add-new-postcodes-to-searchnames
when updating GB postcodes, also run SQL getorcreate_postcode_id
2018-12-11 21:39:28 +01:00
marc tobias
8224cc34ea documentation: when updating GB postcodes, also run SQL getorcreate_postcode_id [SKIP CI] 2018-12-11 21:34:56 +01:00
Sarah Hoffmann
411f361fcb traverse address list backwards when computing admin levels
By starting with the lowest address level, when collecting
administrative boundaries by level, there is a better chance
to actually get the boundary when the capital of the
administrative boundary is tagged with the level of the
boundary as well.

This is just a heuristics adaption to make the best out of
the imprecise admin_level definition for place nodes.

Fixes #1261.
2018-12-06 21:18:47 +01:00
Sarah Hoffmann
e080bdff0f Don't escape slashes in json output
Fixes #1256.
2018-12-04 22:28:29 +01:00
Sarah Hoffmann
fae8da2bcb Merge pull request #1252 from mtmail/update-and-document-gb-postcode-data2
GB postcode: new conversion script, documentation
2018-12-04 22:20:55 +01:00
Sarah Hoffmann
2d1337e190 Merge pull request #1255 from mtmail/faq-about-permanent-ids
new FAQ entry about place_id values
2018-12-04 22:03:41 +01:00
Sarah Hoffmann
ec4e3c36af Merge pull request #1258 from lonvia/cleanup-utils
Restructure script and website installation
2018-12-04 22:00:31 +01:00
marc tobias
1489e6c00e Explain place_id, i.e. shouldn't be use as permanent id 2018-12-03 19:05:18 +01:00
Marc Tobias Metten
61769a1bad GB postcode: new conversion script, documentation 2018-12-03 18:43:28 +01:00
Sarah Hoffmann
56839ba50f No longer install phrase configuration
Instead add it as a configurable path with the one from
the source directory as the default.

Also reinstates that settings/defaults.php is installed as
settings/settings.php.
2018-12-02 11:50:44 +01:00
Sarah Hoffmann
fe6a2e9f14 Remove server_compare from list of installed scripts
This is a stand alone script and does not depend on
the configured environment.
2018-12-02 11:21:09 +01:00
Sarah Hoffmann
11c91e3b8d Remove settings/settings.php
This was only a stub to warn when something was
executed directly from utils/ in the source directory.
This is no longer possible.
2018-12-02 11:16:41 +01:00
Sarah Hoffmann
e70f405abd Restructure script and website installation
Just make cmake install a small stub that includes
the settings from the build directory and then the
script from the source directory. Remove executable
rights from php files in utils/ so that they cannot
be accidentally executed.
2018-12-02 11:13:48 +01:00
Sarah Hoffmann
8b8ee00725 remove blocks script
Belongs to the rate-limiting code that has been
removed a long time ago.
2018-12-02 10:18:54 +01:00
Sarah Hoffmann
121126cb50 Migration hint for address levels 2018-12-01 23:20:04 +01:00
Sarah Hoffmann
9a13086122 fixup typos and linking of data-source docs
Can't create symbolic links to a directory and then
to files within.
2018-12-01 22:40:37 +01:00
Sarah Hoffmann
c68833cd7f Merge branch 'document-osm-country-grid' of https://github.com/mtmail/Nominatim 2018-12-01 22:20:50 +01:00
Sarah Hoffmann
d4fa528d5c Merge pull request #1245 from lonvia/address-levels-from-json
Make rank assignments configurable
2018-12-01 21:43:53 +01:00
marc tobias
8e19336f49 document what country_osm_grid does 2018-11-29 17:06:04 +01:00
Sarah Hoffmann
52178caa98 fix tests 2018-11-28 23:40:17 +01:00
Sarah Hoffmann
e5cb5d439d Merge pull request #1251 from mtmail/remove-naturalearth-boundary-fallback
remove Natural Earth dataset
2018-11-28 22:29:41 +01:00
Sarah Hoffmann
e28fa6c787 Merge pull request #1253 from RhinoDevel/patch-1
Fix typo.
2018-11-28 22:28:16 +01:00
RhinoDevel
313574ce97 Fix typo. 2018-11-28 13:11:06 +01:00
Sarah Hoffmann
96a84294f4 use consistent naming in doc pages 2018-11-27 22:59:18 +01:00
Sarah Hoffmann
7611aa2f65 Move address level config into settings/ 2018-11-27 22:32:27 +01:00
Sarah Hoffmann
97fa7e0817 Merge pull request #1250 from mtmail/correct-builddir-variable-in-test-readme
test/README.txt: BUILDDIR should be BUILD_DIR [SKIP CI]
2018-11-27 22:27:25 +01:00
marc tobias
417b5b031b test/README.txt: BUILDDIR should be BUILD_DIR [SKIP CI] 2018-11-27 20:17:27 +01:00
marc tobias
a7e26e8f59 remove Natural Earth dataset 2018-11-27 20:13:33 +01:00
Sarah Hoffmann
7665e5a035 Merge pull request #1247 from mtmail/exit-with-error-code
settings.php: when printing error, also exit with error code
2018-11-26 14:53:11 +01:00
marc tobias
c9a553fdb4 settings.php: when printing error, also exit with error code 2018-11-26 14:28:09 +01:00
Sarah Hoffmann
5e072dabc3 remove PHP parameter typing
Older PHPs don't seem to like it.
2018-11-24 19:05:13 +01:00
Sarah Hoffmann
e5b7424592 travis: make sure to start with fresh template for DB tests 2018-11-24 16:22:05 +01:00
Sarah Hoffmann
e99dc2a3da Add function to update address levels 2018-11-24 16:21:16 +01:00
Sarah Hoffmann
211214a8d3 Add documentation for new ranking level configuration 2018-11-24 16:21:16 +01:00
Sarah Hoffmann
e10d11c6c7 Make rank assignments configurable
The initial search and address rank is saved in a table
that is set up from a json configuration file. Ranks may
be assigned on a country level according to class and
type of the object. Special handling that depends on the
geometry or OSM type is still hard-coded in placex insert.

The new default config file mimicks the current assignment
as close as possible. A couple of exceptions have been
removed, most notably the exception for Irish townlands.
2018-11-24 16:21:16 +01:00
Sarah Hoffmann
f0088ca2be Merge pull request #1237 from ckquentvp/fix-accept-language-underscore-parsing
match languages such as ja_rm (or any other with underscore) properly
2018-11-24 16:20:00 +01:00
name
3cd3d1f5ae test languages with underscores (e.g. ja_rm) 2018-11-24 16:52:24 +02:00
Sarah Hoffmann
14cef94e61 fix variable name in setup --drop 2018-11-24 12:29:00 +01:00
Sarah Hoffmann
71ef94dae6 add Makefile for tests 2018-11-24 12:25:28 +01:00
Sarah Hoffmann
fc99954b2e Merge pull request #1242 from lonvia/import-for-reverse-only
Add a reverse-only mode
2018-11-21 21:36:11 +01:00
Sarah Hoffmann
1526501ed7 add documentation for reverse-only 2018-11-21 19:38:39 +01:00
Sarah Hoffmann
5d98c09ee9 Add reverse-only parameter to setup
Avoids creating the search_name table. Useful when only /reverse
is used or the content is directly exported to photon.

Fixes #939.
2018-11-21 19:36:21 +01:00
Sarah Hoffmann
7da5196bac setup: add convenience function for executing SQL commands 2018-11-21 12:18:13 +01:00
Sarah Hoffmann
7fd40cb0e6 Merge pull request #1238 from lonvia/simplify-version-check
Simplify parsing of postgres and postgis versions
2018-11-20 23:07:55 +01:00
Sarah Hoffmann
b6b1c23575 fix phpcs offences 2018-11-20 23:05:56 +01:00
Sarah Hoffmann
409ded385f simplify connection handling in setup script
- factor out runWithEnv
- require explicit connect() call to avoid rechecking for oDB
  (more for readability than for speed)
- clean DSNInfo of empty strings and simplify check for entries
2018-11-20 22:51:37 +01:00
Sarah Hoffmann
e2d0c9f3c1 fix variable prefix 2018-11-20 21:07:24 +01:00
Sarah Hoffmann
9cf85f90fb Simplify parsing of postgres and postgis versions
Switch to functions server_version_num and postgis_lib_version
which both only return the version string, so that no elaborate
string parsing is necessary anymore. The version string could
become especially cumbersome in pre-release versions.
2018-11-18 17:27:20 +01:00
Sarah Hoffmann
fb796d14ec Always ignore continents for addresses
Fixes #1236.
2018-11-18 17:00:59 +01:00
Sarah Hoffmann
43c2eb383e Remove country and state nodes from address computation
OSM has by now almost complete coverage of admin
boundaries up to state level. Place nodes will do more
harm than good in this case.
2018-11-17 23:32:08 +01:00
name
2bc46b8f21 match languages such as ja_rm (or any other with underscore) properly 2018-11-17 20:20:06 +02:00
Sarah Hoffmann
c84648c157 update osm2pgsql (restrict operator)
Fixes #1176.
2018-11-17 17:32:30 +01:00
Sarah Hoffmann
b15441df1c Document the mmap requirement for flatnode files
Fixes #877.
2018-11-17 15:37:46 +01:00
Sarah Hoffmann
85f32d6c0f Keep matches without house number
Now that we have result ranking, we can keep the street results
for housenumber searches and reuse them in the next group round
if required. Also fixes an issue where postcode and housenumber
are in the query and one of them is wrong.

Fixes #1200.
2018-11-17 00:35:38 +01:00
Sarah Hoffmann
9908c93d4c Add result ranking for missing housenumber and postcode
Fixes #988.
2018-11-17 00:00:01 +01:00
Sarah Hoffmann
388c7f706d Merge pull request #1233 from mtmail/better-gbpostcode-setup-warning
Improved warning message when looking for optional GB postcode file
2018-11-15 23:28:04 +01:00
Sarah Hoffmann
36398eedca docs: more specific chapter title 2018-11-15 23:01:08 +01:00
marc tobias
aa41b813b8 2018 TIGER data conversion scripts, add documentation to /docs/data-sources 2018-11-15 23:01:08 +01:00
Sarah Hoffmann
4e2fe6427c Merge pull request #1188 from mtmail/prototype-setup-ubuntu18-nginx
copy of the Ubuntu18 vagrant setup but with nginx as webserver
2018-11-15 22:25:41 +01:00
Sarah Hoffmann
7f0a0ce5e5 make HTML error message less technical 2018-11-15 21:19:31 +01:00
Sarah Hoffmann
2a39bc6e68 Merge branch 'set-exception-handler-by-request-format' of https://github.com/mtmail/Nominatim into mtmail-set-exception-handler-by-request-format 2018-11-15 20:57:20 +01:00
marc tobias
07c47eed54 Improved warning message when looking for optional GB postcode file 2018-11-09 10:06:17 +00:00
marc tobias
a165072915 copy of the Ubuntu18 vagrant setup but with nginx as webserver 2018-10-31 16:13:02 +01:00
Sarah Hoffmann
c5109d39d0 increase limit when searching for street w/ house number
Increase the chance that the correct street is found.
2018-10-20 17:26:45 +02:00
marc tobias
e4a51e460e set exception handler by request format, not always HTML 2018-10-03 22:58:20 +02:00
Sarah Hoffmann
2467e9996e fix permissions for CMakeLists.txt 2018-10-02 23:42:33 +02:00
Sarah Hoffmann
3afd12f977 simplify constructor of SetupFunctions
Also cleans up spacing.
2018-10-02 23:42:33 +02:00
Sarah Hoffmann
f45b3fa3f2 Merge branch 'updatePHP' of https://github.com/ThomasBarris/Nominatim into ThomasBarris-updatePHP 2018-10-02 22:46:53 +02:00
Sarah Hoffmann
fc6b08c8ab Merge branch '201809-test-db' of https://github.com/mtmail/Nominatim into mtmail-201809-test-db 2018-10-02 21:41:57 +02:00
Sarah Hoffmann
441cd27a53 Merge pull request #1193 from mtmail/postgresql-10-postgis-scripts
also install postgis.control for postgresql-10
2018-09-30 21:14:55 +02:00
marc tobias
c73737f77f adjust BDD api test cases to 2018 test database 2018-09-28 18:46:35 +02:00
marc tobias
ecd92d5e71 also install postgis.control for postgresql-10 2018-09-27 19:38:56 +02:00
Sarah Hoffmann
d1143b4580 docs: rewrite functions when migrating
Fixes #1183.
2018-09-22 13:22:08 +02:00
Sarah Hoffmann
09595697cc Merge pull request #1189 from mtmail/classtypes-unit-tests
PHP unit tests for Nominatim\ClassTypes
2018-09-22 10:48:32 +02:00
Sarah Hoffmann
1f887a6ca0 Merge pull request #1187 from mtmail/faq-about-pear-db-warning
Installation FAQ entry about a PHP warning that started with PHP7.2
2018-09-22 10:47:01 +02:00
Sarah Hoffmann
eba6e46c74 Merge pull request #1186 from mtmail/getAddressDetails-fix
fix AddressDetails->getAddressDetails, add tests
2018-09-22 10:46:08 +02:00
marc tobias
f0daf11375 PHP unit tests for Nominatim\ClassTypes 2018-09-20 19:15:58 +02:00
marc tobias
71341a623a Installation FAQ entry about a PHP warning that started with PHP7.2 [SKIP CI] 2018-09-20 13:41:43 +02:00
marc tobias
e2a7a795d4 fix AddressDetails->getAddressDetails, add tests 2018-09-20 02:16:01 +02:00
Sarah Hoffmann
ed7d7a9ad9 Merge pull request #1185 from mtmail/three-faq-entries
three further FAQ entries regarding timezone, continents, exports
2018-09-19 20:24:34 +02:00
marc tobias
9b69bde613 three further FAQ entries regarding timezone, continents, exports 2018-09-19 19:31:54 +02:00
Sarah Hoffmann
119ffbab40 address tokens get a double search rank also as full terms
Fixes #1170.
2018-09-18 21:54:08 +02:00
ThomasBarris
e92b54b869 Merge branch 'updatePHP' of https://github.com/ThomasBarris/Nominatim into updatePHP 2018-09-18 21:29:24 +02:00
ThomasBarris
0273e128f4 change variables for class SetupClass.php instantiation 2018-09-18 21:28:05 +02:00
ThomasBarris
a948050015 typo 2018-09-18 09:17:54 +02:00
ThomasBarris
a0dbeabed1 move setupclass, move command line array, remove args from update array 2018-09-17 10:28:00 +02:00
Mateusz Konieczny
eb615347d2 link CONTRIBUTING file from README file
This change should also encourage to read "how to report bugs" guide before reporting bugs

fixes #1133
2018-09-16 20:49:02 +02:00
Sarah Hoffmann
2d4063234a Merge pull request #1180 from mtmail/php-testsuite-phpunit6-compatible
make PHP testsuite work with PHPUnit6
2018-09-16 20:45:41 +02:00
Sarah Hoffmann
4fcb66df92 Merge pull request #1179 from mtmail/import-table-higher-batchsize
import_osmosis_log table: increase possible batch size
2018-09-15 18:58:37 +02:00
marc tobias
a9bdac836c make PHP testsuite work with PHPUnit6 2018-09-15 15:23:10 +02:00
marc tobias
bb696f3fd0 import_osmosis_log table: increase possible batch size 2018-09-15 11:36:46 +02:00
Sarah Hoffmann
bc26244114 docs: remove tablespace placeholder from index commands
Fixes #1171.
2018-09-10 21:00:15 +02:00
ThomasBarris
08c2f03ccc moving comment to right position 2018-09-08 10:14:08 +02:00
ThomasBarris
9e35e5c2b0 move checkModilePresence to class, delete own debug echo 2018-09-08 09:26:23 +02:00
ThomasBarris
d10f63b666 format change revert, removed bogus CL options, SetupClass to a new dir 2018-09-05 22:01:03 +02:00
ThomasBarris
aa6ac5a751 more format changes for Mr. Travis 2018-08-31 22:01:53 +02:00
ThomasBarris
42e79bfab9 delete an empty line to make the pendantic Mister Travis happy 2018-08-31 21:44:49 +02:00
ThomasBarris
a3b4f80c99 small fixes on setup.php and a bring update.php to work 2018-08-31 21:31:38 +02:00
ThomasBarris
b2f3cfde0b splitted createTables and changed formatting to please Travis 2018-08-29 22:54:28 +02:00
ThomasBarris
c036480ce2 first draft of setupClass 2018-08-29 21:31:19 +02:00
Ganesh Krishnan
043f9d8298 allow nginx to serve files without php extensions
The apache config allows api calls without extension for eg /search?q=query string.
This does not work on nginx and we need to enable this via this patch
2018-08-29 12:59:29 -04:00
Sarah Hoffmann
627a487fcf prepare release 3.2.0 2018-08-26 17:33:49 +02:00
Sarah Hoffmann
b0d0d046e7 add migration for 3.2 version 2018-08-26 17:27:31 +02:00
Sarah Hoffmann
e09c42a78a add docs for class parameter in /details 2018-08-26 16:40:52 +02:00
Sarah Hoffmann
28d7e11e4f Merge pull request #1155 from lonvia/namespace-for-phpunit
use namespaces for PHPUnit classes
2018-08-26 15:51:11 +02:00
Sarah Hoffmann
a5d35d6e84 use namespaces for PHPUnit classes
This is mandatory for PHPUnit 6.x. Older versions provide a
forward compatibility layer, so we should be good.

Fixes #1150.
2018-08-25 14:57:31 +02:00
Sarah Hoffmann
e8982068b6 fix quotes to make phpcs happy 2018-08-24 21:39:20 +02:00
Sarah Hoffmann
6e3670bce5 add vagrant and installation instructions for ubuntu 18 2018-08-24 21:33:24 +02:00
ThomasBarris
14aca11dcd moving functions from setup.php to a lib file in lib/setup_functions.php and change a passthru in setup.php by calling the function with this new lib 2018-08-24 16:15:39 +02:00
Sarah Hoffmann
e7b738fe35 cleanup documentation
Remove level 3 (page title) from TOC and add permalinks.
Also fix and update some minor stuff in the docs.
2018-08-23 23:14:41 +02:00
Sarah Hoffmann
1ee636461c Mute notices from postgresql during setup
They are mostly warnings about tables that do not exists. This
is intentional and would only confuse the casual user.
2018-08-23 21:14:34 +02:00
Sarah Hoffmann
c7e7d5e980 improve wording of error message 2018-08-23 20:58:15 +02:00
Sarah Hoffmann
57bf76a0e1 Merge branch 'customPHP1' of https://github.com/ThomasBarris/Nominatim into ThomasBarris-customPHP1 2018-08-23 20:41:44 +02:00
Sarah Hoffmann
c80c80200c fix links in details documentation
Thanks to @JamesKingdom.
2018-08-23 20:30:07 +02:00
ThomasBarris
5859f9a3cb just add missing semicolon in importWikipedia.php line 312 2018-08-23 19:36:13 +02:00
ThomasBarris
a825414558 push the right version of update.php 2018-08-23 10:02:34 +02:00
ThomasBarris
6577be3744 delete PHP_BIN from default and add @PHP_BIN@ to passthru function in update.php 2018-08-23 09:24:51 +02:00
ThomasBarris
e936041713 Revert "Split of setup.php into one file with functions and one with the control of the workflow. The functions will also be included by update.php to avoid the passthru"
This reverts commit 55fa051d3a.
2018-08-23 08:07:40 +02:00
ThomasBarris
a8b31090a1 Revert "code beauty improvements"
This reverts commit ee3973f507.
2018-08-23 08:07:08 +02:00
Sarah Hoffmann
14e708f366 ignore linked country nodes for reverse geocoding
Fixes #1145.
2018-08-22 23:33:11 +02:00
Sarah Hoffmann
3502ff837f Merge pull request #1151 from mtmail/documentation-for-details-endpoint
documentation for /details endpoint
2018-08-22 22:38:32 +02:00
Sarah Hoffmann
84cfe5db53 fix warning in keyword output of details 2018-08-22 22:35:46 +02:00
Sarah Hoffmann
26bd8304c6 fix formatting 2018-08-22 22:25:55 +02:00
ThomasBarris
ee3973f507 code beauty improvements 2018-08-22 21:33:17 +02:00
ThomasBarris
55fa051d3a Split of setup.php into one file with functions and one with the control of the workflow. The functions will also be included by update.php to avoid the passthru 2018-08-22 21:06:55 +02:00
marc tobias
4d38833170 documentation for /details endpoint 2018-08-22 13:55:43 +02:00
marc tobias
e47646ddba in /details JSON output add check if place (type) has an icon set 2018-08-22 13:34:34 +02:00
marc tobias
7081e2ab66 documentation for /details endpoint 2018-08-21 17:15:47 +02:00
ThomasBarris
e286536959 replaced all shebangs in utility scripts with @PHP_BIN@, to be replaced with detected PHP binary on install 2018-08-21 13:21:08 +02:00
Sarah Hoffmann
88374c2522 improve test coverage for reverse and debug output 2018-08-20 23:05:49 +02:00
Sarah Hoffmann
3fcfab9772 clean up reverse code
Removes unused variables, improves naming of internal functions
and makes structure of SQL calls a bit cleaner.
2018-08-20 23:04:33 +02:00
Sarah Hoffmann
de8888ec58 directly do country search for reverse zoom < 5
Fixes #1145.
2018-08-20 22:09:08 +02:00
Sarah Hoffmann
3af8dc9580 Merge pull request #1148 from mtmail/docs-review
small typos and wording in the API docs
2018-08-20 21:23:01 +02:00
marc tobias
71ae7f10f7 small typos and wording in the API docs 2018-08-20 19:11:38 +02:00
ThomasBarris
38304136d3 Allow custom locations for PHP binary - part 1 2018-08-20 17:24:20 +02:00
Sarah Hoffmann
ff4b1758e1 reduce search radius when searching for nodes within country 2018-08-19 18:08:05 +02:00
Sarah Hoffmann
a9b4894b07 update osm2pgsql to latest master 2018-08-17 22:43:08 +02:00
Sarah Hoffmann
28ba5fc8a4 Merge pull request #1138 from mtmail/php-test-for-sqlViewboxLarge
test for SearchContext::setViewboxFromBox
2018-08-14 21:59:24 +02:00
marc tobias
c7a4c2c88a test for SearchContext::setViewboxFromBox 2018-08-14 17:37:54 +02:00
Sarah Hoffmann
0617768ee2 Fix partial word computation
Partial word tokens have a space at the beginning of the
token not the word.
2018-08-13 21:22:45 +02:00
Sarah Hoffmann
62b60c70e6 reverse on street level should compute distance to object
The centroid of a building may be far away even when still inside
the building.

Fixes #1136.
2018-08-13 21:17:49 +02:00
Sarah Hoffmann
0394f32438 fix large viewbox computation
Fixes #1137.
2018-08-13 20:52:49 +02:00
Sarah Hoffmann
569184a5b0 add FAQ from nominatim.org 2018-08-11 14:47:36 +02:00
Sarah Hoffmann
41c4b51be5 add description of output of API 2018-08-11 14:17:41 +02:00
Sarah Hoffmann
513bf485f2 clean up docs for lookup call 2018-08-11 09:24:59 +02:00
Sarah Hoffmann
263240919c clean up and format search documentation 2018-08-09 23:06:52 +02:00
Sarah Hoffmann
d2b9493d72 clean up and format search documentation 2018-08-09 21:47:57 +02:00
Sarah Hoffmann
60cea6c8bf fixup use of indexes for latest reverse changes 2018-08-08 22:04:39 +02:00
Sarah Hoffmann
9b7f0627ea Merge pull request #1111 from lonvia/remove-postcode-nodes-from-address
Do not have postcode node appear in addresses directly
2018-08-05 17:04:54 +02:00
Sarah Hoffmann
2e9cebf025 define types for null returns in PlaceLookup
Fixes #1127.
2018-08-05 15:47:55 +02:00
Sarah Hoffmann
48d4ea5542 Do not have postcode node appear in addresses directly
Many of the postcode nodes are actually derived from
incomplete addresses and are as such not even centroids.
Better let them only take part in the address computation
via the postcode table.
2018-08-05 14:25:20 +02:00
Sarah Hoffmann
9bdbbec0c8 Merge pull request #1110 from lonvia/remove-address-check-for-long-lines
Remove special search for address part for long ways
2018-08-04 23:19:32 +02:00
Sarah Hoffmann
8f0b3cb00f Merge pull request #1126 from lonvia/improve-country-reverse
improve place node search when no areas found
2018-08-04 23:00:13 +02:00
Sarah Hoffmann
646fa53b44 improve place node search when no areas found
Only look for place nodes in a certain radius according
to the rank_search of the place node.
2018-08-04 21:51:23 +02:00
Sarah Hoffmann
7f10264fb6 Merge pull request #1095 from estadtherr/remote_postgres_pr
Enable Postgres to run on a different host than the web server
2018-08-02 23:16:07 +02:00
Sarah Hoffmann
7e0fdf5928 fall back to debugInfo() for printing objects
Fixes #1122.
2018-08-02 00:06:02 +02:00
Sarah Hoffmann
4a28d28c08 prefix function calls in make_standard_name() with schema
Fixes #1097.
2018-08-01 23:12:34 +02:00
Eric Stadtherr
bfea79f1e4 address phpcs issue (strange it didn't appear in earlier runs) 2018-07-30 11:20:43 -06:00
Eric Stadtherr
87d78e87d2 changed nominatim.so module path to be a runtime configuration setting as opposed to a command line argument 2018-07-24 15:25:12 -06:00
Sarah Hoffmann
c712c6e55e Merge pull request #1115 from mtmail/php-linting-for-all-not-just-tests
On Travis phpcs only ran against tests/php/ directory
2018-07-23 22:41:24 +02:00
Marc Tobias Metten
510054492c On Travis phpcs only ran against tests/php/ directory 2018-07-23 22:18:32 +02:00
Sarah Hoffmann
12db7a9af2 Merge pull request #1113 from mtmail/remove-empty-icon-configuration
ClassTypes: treat empty string for -icon-same as null
2018-07-23 20:37:31 +02:00
Sarah Hoffmann
fcab682231 Merge pull request #1114 from mtmail/fix-tiny-phplint-error
PHP code style: use long array syntax
2018-07-23 20:35:38 +02:00
marc tobias
0981a6403d PHP code style: use long array syntax 2018-07-23 17:24:31 +02:00
marc tobias
37bd4dfd83 ClassTypes: treat empty string for -icon-same as null 2018-07-23 17:10:27 +02:00
Eric Stadtherr
057d77fcd4 rework repeatability change in data/words.sql 2018-07-22 15:58:11 -06:00
Sarah Hoffmann
713ea080d2 Remove special search for address part for long ways
Ways now always have the complete set of crossing boundaries
independent of the length.

Fixes #1108.
2018-07-22 15:05:45 +02:00
Eric Stadtherr
bf5063de9a fix variable reference 2018-07-21 21:59:03 -06:00
Eric Stadtherr
13efedde34 fix omitted initialization 2018-07-21 21:53:41 -06:00
Eric Stadtherr
c1beefd543 PR review changes 2018-07-21 21:45:23 -06:00
Eric Stadtherr
1d81c17335 fix a couple errors with naming convention changes 2018-07-21 20:43:48 -06:00
Eric Stadtherr
b8b87716db adapt PR changes to use new variable naming convention 2018-07-21 17:09:59 -06:00
Eric Stadtherr
1108bf7d86 PR review changes 2018-07-21 12:09:47 -06:00
Eric Stadtherr
62747c934d Work on setup/update scripts, unit tests, and documentation to enable Postgres server to be optionally configured on a remote host 2018-07-21 12:09:47 -06:00
Sarah Hoffmann
81b90c9f15 add a note about variable naming for PHP 2018-07-21 08:47:37 +02:00
Sarah Hoffmann
5a772a5770 Don't add viewbox weight when no viewbox is given
Fixes #1068.
2018-07-20 23:29:36 +02:00
Sarah Hoffmann
3433ec306e fix operator type assignment
Fixes #1084.
2018-07-20 22:27:27 +02:00
Sarah Hoffmann
03039ebfa8 Merge pull request #1102 from mtmail/tests-for-tokenlist
tests for Nominatim::TokenList
2018-07-20 21:39:04 +02:00
Sarah Hoffmann
271c23f459 Merge pull request #1099 from lonvia/sanity-check-pyosmium
More sanity checks for pyosmium tools
2018-07-20 21:37:27 +02:00
Marc Tobias Metten
0892eab1d3 tests for Nominatim::TokenList 2018-07-19 14:19:50 +02:00
Sarah Hoffmann
d68996127d fix bad namespace for Operator class 2018-07-17 22:38:40 +02:00
Sarah Hoffmann
83270557a7 more sanity checks for pyosmium tools 2018-07-17 21:54:37 +02:00
Sarah Hoffmann
0e4f80bf1b Merge pull request #1090 from mtmail/add-more-nominatim.so-faq-entries
FAQ: more answers regarding nominatim.so file permissions
2018-07-13 21:33:13 +02:00
Sarah Hoffmann
b7abc8566e Merge pull request #1089 from lonvia/clean-up-address-computation
Classes for ClassTypes and AddressDetails and geocodejson cleanup
2018-07-13 21:27:00 +02:00
marc tobias
92f86de938 FAQ: more answers regarding nominatim.so file permissions 2018-07-13 18:43:51 +02:00
Sarah Hoffmann
b17019a21c fix unit tests for class types 2018-07-12 23:59:29 +02:00
Sarah Hoffmann
be58b929f2 add tests for geocodejson and fix syntax errors 2018-07-12 22:00:18 +02:00
Sarah Hoffmann
25baaf530d unify address details lookup
Introduces new AddressDetails class which is responsible
for address lookups. Saves always the complete result
and then allows filtering throught the different access
function. Remove special handling in Geocode() and use
there the lookup throught PlaceLookup() as well.
2018-07-10 23:54:35 +02:00
Sarah Hoffmann
320d488627 move ClassTypes into own namespace
Also adds some convenience functions for lookups.
2018-07-09 23:20:46 +02:00
marc tobias
879f818d81 add geojson,geocodejson formats to API documentation 2018-07-09 16:06:48 +02:00
Sarah Hoffmann
80a6751c51 make phpcs happy 2018-07-06 22:06:05 +02:00
Sarah Hoffmann
05bef92f0f ignore admin_level = 15 in geocodejson output
Level 15 is an artifical value.
2018-07-06 21:59:17 +02:00
Sarah Hoffmann
01d5ecb86b use already existing address field in geocodejson 2018-07-06 21:58:41 +02:00
Sarah Hoffmann
f50f46c1ce Merge branch 'geojson-output' of https://github.com/mtmail/Nominatim into mtmail-geojson-output 2018-07-06 20:26:33 +02:00
Sarah Hoffmann
adae49b184 remove trailing spaces 2018-07-05 19:28:17 +02:00
Sarah Hoffmann
2bd7c75a35 avoid 'SELECT *' 2018-07-05 19:27:21 +02:00
Sarah Hoffmann
9955155ce0 update tests for off-coast reverse geocoding 2018-07-04 21:03:04 +02:00
Sarah Hoffmann
b0e0f7ae76 make sure index is used when looking for places in country 2018-07-04 20:56:09 +02:00
Sarah Hoffmann
0315b87664 update indexes for new reverse algorithm 2018-07-02 23:56:56 +02:00
Sarah Hoffmann
ac29f8bc91 Merge branch 'better-reverse' of https://github.com/gemo1011/Nominatim into gemo1011-better-reverse 2018-07-02 21:33:27 +02:00
Unknown
ec6a427e0a edited indices an setup file to grant select for table country_osm_grid 2018-06-28 11:34:19 +02:00
Sarah Hoffmann
09b59bd56a increase search radius when looking for addr:place base objects
Fixes #1036.
2018-06-27 22:45:34 +02:00
Sarah Hoffmann
d0548caa76 use computed postcode by default in export script 2018-06-27 21:39:00 +02:00
Sarah Hoffmann
96d2a331a1 install export script in build directory 2018-06-27 21:26:50 +02:00
gemo1011
1d7ed6737e added comments and improved geOutline function 2018-06-27 14:55:24 +02:00
gemo1011
97b6656182 no polygon search over country-level 2018-06-27 14:55:23 +02:00
gemo1011
f108eac527 changed the lookupPolygon function
- Search for Polygons begins at rank_address 4
- $iMaxRank changed to 25 if its higher
2018-06-27 14:55:22 +02:00
gemo1011
144c3b3eb2 fixed syntax error 2018-06-27 14:55:19 +02:00
gemo1011
426e108b34 added case when for highways in subquery 2018-06-27 14:55:18 +02:00
gemo1011
229ad01042 added search diameter for the place node search, depending on rank 2018-06-27 14:55:17 +02:00
gemo1011
71e3d8b1de GRANT SELECT ON Table country_osm_grid 2018-06-27 14:55:16 +02:00
gemo1011
a5750a6ef6 fixed getoutlinesfunction 2018-06-27 14:55:15 +02:00
gemo1011
d5e39260b3 updated indices for reverse geocoding 2018-06-27 14:55:14 +02:00
gemo1011
0996fdfb55 improvements for pull request 2018-06-27 14:55:13 +02:00
gemo1011
07eb108a6d fixed typo 2018-06-27 14:55:12 +02:00
gemo1011
41249377d2 fixed getoutlines function if no coordinates are passed 2018-06-27 14:55:11 +02:00
gemo1011
be091b17d9 better search for interpolated housenumbers 2018-06-27 14:55:10 +02:00
gemo1011
398467b2f4 using ST_ClosestPoint and a subquery 2018-06-27 14:55:10 +02:00
gemo1011
796f069d74 test adjusting 2018-06-27 14:55:09 +02:00
gemo1011
82b6245aaa better place node search with rank_search 2018-06-27 14:55:08 +02:00
gemo1011
f0e5cf53b8 only starts the search in country_osm_grid if $iMaxRank > 4 2018-06-27 14:55:07 +02:00
gemo1011
7585d37818 edited test 2018-06-27 14:55:07 +02:00
gemo1011
ee54ebfe77 new query if no polygon is found
the new query searchs in the country_osm_grid table for a polygon
2018-06-27 14:55:06 +02:00
gemo1011
43ee4a8faf changed parameters for lookup function in the reverse.php 2018-06-27 14:55:05 +02:00
gemo1011
ab5bcd6d2f rebase 2018-06-27 14:55:05 +02:00
gemo1011
6b8c99a275 rebase 2018-06-27 14:55:03 +02:00
gemo1011
073221d321 changed export.php to work with current master 2018-06-27 14:17:08 +02:00
Sarah Hoffmann
dfb9579a73 initial version of an export script
So far supports type selection down to street level, restriction to
country or an OSM place and postcode printing. Output is standard CSV.
2018-06-27 14:11:14 +02:00
Sarah Hoffmann
26bc83c984 Merge pull request #1069 from woodpeck/patch-2
limit default threads to 15
2018-06-21 22:16:24 +02:00
Frederik Ramm
8139a079f8 limit default threads to 15
When no explicit number of threads is given, don't simply use getProcessorCount()-1, but limit to max. 15
2018-06-20 14:17:07 +02:00
Sarah Hoffmann
0d341c256b Merge pull request #1062 from mtmail/display-viewbox-on-map
Display viewbox on map
2018-06-14 23:16:05 +02:00
Marc Tobias Metten
5a17bfc9c9 display viewbox on map 2018-06-14 02:19:19 +02:00
Marc Tobias Metten
1d981b3171 update leaflet.js 1.0 => 1.3 2018-06-14 02:18:00 +02:00
Sarah Hoffmann
743ec43460 nearest place search should match any of given tokens not all
When multiple isin tokens are given, then these are duplicates
and it is enough that any one of them is found in the
name_vector.

Fixes #1056.
2018-06-14 00:11:19 +02:00
Sarah Hoffmann
87ee3a6f58 Merge pull request #1053 from mtmail/update-tiger-install-instructions
Update tiger install instructions. Mirror no longer working
2018-06-12 22:54:29 +02:00
gemo1011
10897787af deleted query for place nodes search if no polygon is found and added search for interpolation lines 2018-06-05 11:54:12 +02:00
gemo1011
4d073b0350 new query to search place nodes if no polygon was found 2018-06-05 11:54:12 +02:00
gemo1011
625018b654 adapted the coding style with phpcs 2018-06-05 11:54:12 +02:00
gemo1011
5f2410119d better performance 2018-06-05 11:54:12 +02:00
gemo1011
424c0d0ebb faster query through bbox preselection 2018-06-05 11:54:12 +02:00
gemo1011
71c9adfdba performence update through subquerry 2018-06-05 11:54:12 +02:00
gemo1011
723bb4d0b9 changing to from rank_search to rank_address 2018-06-05 11:54:12 +02:00
gemo1011
cb76635da7 better performance for place node search 2018-06-05 11:54:12 +02:00
gemo1011
237e31b3ce use the linked_place_id for adress search if a place node is found with a linked_place_id 2018-06-05 11:54:12 +02:00
gemo1011
d0741f21b1 changed reverse geocode algorithm 2018-06-05 11:54:12 +02:00
marc tobias
a376608344 Update tiger install instructions. Mirror no longer working 2018-05-29 17:42:58 +02:00
Marc Tobias Metten
7a964efb3a search/reverse/lookup with geojson,geocodejson output 2018-05-29 17:20:34 +02:00
Sarah Hoffmann
1d0da944a6 document polygon_threashold parameter
Fixes #1041.
2018-05-15 23:30:58 +02:00
Sarah Hoffmann
d0880694eb Merge pull request #1043 from lonvia/token-as-a-class
Token as a class
2018-05-15 20:21:32 +02:00
Sarah Hoffmann
1f689bdaae document tokens 2018-05-14 23:23:38 +02:00
Sarah Hoffmann
6a0361d0c6 add documentation for TokenList 2018-05-14 23:17:54 +02:00
Sarah Hoffmann
f29c7bf910 introduce classes for token list and token types 2018-05-14 23:04:15 +02:00
Sarah Hoffmann
2cc4c73b64 Merge pull request #1038 from mtmail/phpcs-array-key-alignment
add PHPCS Squiz.Arrays.ArrayDeclaration.KeyNotAligned rule
2018-05-08 09:04:03 +02:00
Marc Tobias Metten
8841a328c8 add PHPCS Squiz.Arrays.ArrayDeclaration.KeyNotAligned rule 2018-05-08 00:37:41 +02:00
Sarah Hoffmann
c555b60b36 narrow down search by house number when postcode is given
Fixes #1034.
2018-05-07 21:34:11 +02:00
Sarah Hoffmann
b30e2ab5dc Merge pull request #1033 from lonvia/remove-word-frequency-scores
Replace word frequency hash
2018-05-07 20:59:20 +02:00
Sarah Hoffmann
115792d1db replace word frequency hash
The word frequency hash was only used to determine if the
name of a SearchDescription is rare. Do this already when
building the SearchDescription (when the word frequency
is still available) and get gid of the extra hash.
2018-05-06 22:35:31 +02:00
mtmail
7075a5828e add JSON format to /status endpoint (#1013)
add JSON format to /status endpoint
2018-05-04 23:37:48 +02:00
Sarah Hoffmann
bd04ce62e0 Merge pull request #1032 from mtmail/tests-for-debughtml
PHP unit tests for DebugHtml
2018-05-04 21:24:08 +02:00
Sarah Hoffmann
3bb6ecdc3b Merge pull request #1029 from lonvia/streamline-sql
streamline SQL for parenting rank 30 places
2018-05-04 21:17:09 +02:00
Marc Tobias Metten
a885e7309a PHP unit tests for DebugHtml 2018-05-03 22:19:19 +02:00
Sarah Hoffmann
6706a23fb5 streamline SQL for parenting rank 30 places
- avoid select all
 - prefer direct select into
 - use early loop exit when possible
2018-05-01 15:44:18 +02:00
Sarah Hoffmann
c7faab4d7c adapt reverse index to changed reverse query
Thanks to @gemo1011.
2018-05-01 15:29:39 +02:00
Sarah Hoffmann
080ba00956 Merge pull request #1024 from lonvia/reduce-address-search-terms
Reduce address search terms
2018-04-26 22:28:53 +02:00
Sarah Hoffmann
2613ebfa01 Merge pull request #977 from lonvia/fix-byteswap-check
clean up byte order detection
2018-04-18 21:35:32 +02:00
Sarah Hoffmann
53c526c01d remove search_name_country table
The table is no longer used, country names are handled
directly via the word table.
2018-04-16 20:47:45 +02:00
Sarah Hoffmann
dc371618ba remove now unused getNearestNamedRoadFeature() function 2018-04-16 20:34:28 +02:00
Sarah Hoffmann
59288417f0 remove debug code 2018-04-16 20:29:30 +02:00
Sarah Hoffmann
1dd401b570 remove use of is_in terms for address computation
The code has been dead for a long time because all is_in
terms have been added to the nameaddress_vector so
that the IF condition would never hit.
2018-04-16 20:27:16 +02:00
Sarah Hoffmann
ee194ab369 remove special search for Tiger postcodes
Postcodes should no longer appear in the address search terms.
2018-04-16 20:25:19 +02:00
Sarah Hoffmann
b8113abd93 fix variable name 2018-04-16 20:16:11 +02:00
Sarah Hoffmann
5182da9f45 add tests for address tag parsing for search name 2018-04-15 22:52:42 +02:00
Sarah Hoffmann
14c25717ab restrict addr:* tags that are used for search term
Fixes #1001.
2018-04-15 22:17:20 +02:00
mtmail
3087ac1145 PHP code style: enforce long array initialisation (#1015) 2018-04-13 13:18:29 +02:00
mtmail
cdbde5b88d get rid of Python psycopg2 install warning (#1014) 2018-04-13 12:28:12 +02:00
marc tobias
7a1ee99345 return centroid on geojson format 2018-04-12 22:02:24 +02:00
marc tobias
7a31a3d106 remove rank_search_label field 2018-04-12 22:02:24 +02:00
marc tobias
fe3dba3fd7 use RFC3339 for human readable date 2018-04-12 22:02:24 +02:00
marc tobias
31c7f25541 rename parentof to hierarchy and other lonvia Mar/29 PR feedback 2018-04-12 22:02:24 +02:00
Marc Tobias Metten
45aef06d00 localname field is required by nominatim-ui 2018-04-12 22:01:10 +02:00
Marc Tobias Metten
0eb71cdce8 only return polygon if &polygon_geojson=1 is set 2018-04-12 22:01:10 +02:00
marc tobias
45bc511955 variable naming after lonvia PR feedback 2018-04-12 22:01:10 +02:00
marc tobias
dedf56b5f8 spacing changes after lonvia PR feedback 2018-04-12 22:01:10 +02:00
Marc Tobias Metten
1e28f2478c details support json output 2018-04-12 22:01:10 +02:00
mtmail
3cdbcbff8f get apt-get php-db package running on travis-ci (#973)
travis: /usr/bin/env php whenever calling PHP scripts to deal with phpenv
2018-04-12 00:54:59 +02:00
Sarah Hoffmann
9a3bc9cc1e Merge pull request #1010 from lonvia/ignore-unicode-format-characters
Ignore Unicode format characters for normalization
2018-04-10 23:48:58 +02:00
Sarah Hoffmann
6bd905ff43 update osm2pgsql (name:suffix)
Fixes #823.
2018-04-10 23:46:57 +02:00
Sarah Hoffmann
ae83ceab5e ignore Unicode format characters for normalization
Also adds tests.

Fixes #1007.
2018-04-10 22:48:17 +02:00
Sarah Hoffmann
28ee59dd64 test: drop template DB when something goes wrong during creation
Fixes #951.
2018-04-08 10:06:33 +02:00
Sarah Hoffmann
b4ef3d91ab Merge pull request #1005 from lonvia/no-limit-for-housenumber-search
do not apply limit to house number place searches
2018-04-06 22:46:23 +02:00
Sarah Hoffmann
efac4a135a do not apply limit to house number place searches
Searches for house numbers are already limited by the
number of parent places. In fact, the limit assumed that
every parent place has exactly one match against the
given housenumber. That is not true in reality and so
we were dropping relevant results.

Fixes #329.
2018-04-06 22:20:21 +02:00
Sarah Hoffmann
984e91e519 Merge pull request #1003 from mtmail/details-permalink
details page: add a perma-link
2018-04-06 21:15:48 +02:00
Sarah Hoffmann
49dc62201c Merge pull request #1002 from mtmail/sql-bracket-error-in-details
when looking for keywords on detail page SQL bracket error was possible
2018-04-06 21:09:52 +02:00
marc tobias
4791fc341e details page: move permalink next to page title 2018-04-06 17:45:25 +02:00
marc tobias
4743a5e166 details page: add a perma-link 2018-04-06 16:11:54 +02:00
marc tobias
908c66ca84 when looking for keywords on detail page a SQL bracket error was possible 2018-04-06 15:22:29 +02:00
Sarah Hoffmann
62719f58c9 update osm2pgsql 2018-04-03 21:11:13 +02:00
Sarah Hoffmann
d183cd3c78 Merge pull request #994 from mtmail/bugfix-when-calling-debug-printDebugArray
fix -undefined offset- error
2018-03-27 09:04:59 +02:00
Sarah Hoffmann
ac5a901daf Merge pull request #992 from mtmail/phpcs-whitespace-warnings
phpcs: remove trailing whitespace from comments
2018-03-27 08:56:11 +02:00
Sarah Hoffmann
aaee03d502 Merge pull request #991 from mtmail/rename-NominatimTest-php
NominatimTest.php => LibTest.php
2018-03-27 08:54:11 +02:00
Marc Tobias Metten
329948e685 fix -undefined offset- error 2018-03-27 03:00:07 +02:00
Marc Tobias Metten
f6a76ebcd5 phpcs: remove trailing whitespace from comments 2018-03-27 01:43:02 +02:00
Marc Tobias Metten
1cb87164d9 NominatimTest.php => LibTest.php 2018-03-27 01:38:39 +02:00
Sarah Hoffmann
64fa70ac0a Merge pull request #989 from lonvia/pretty-debug
nicer formatting for Geocode debug output
2018-03-26 20:56:57 +02:00
Sarah Hoffmann
2c42bda9ce nicer formatting for Geocode debug output 2018-03-25 22:28:18 +02:00
Sarah Hoffmann
1787892d32 Merge pull request #986 from mtmail/php-replace-sizeof
replace PHP sizeof() with either count() or empty()
2018-03-24 18:51:34 +01:00
Sarah Hoffmann
7cc8f63125 Merge pull request #981 from mtmail/api-documentation-from-wiki-to-docs
copied API endpoint documentation from wiki.osm.org to /docs
2018-03-24 18:12:43 +01:00
Sarah Hoffmann
2b6515e704 Merge pull request #979 from mtmail/bdd-paths-in-vagrant-documentation
use real paths in BDD examples
2018-03-23 20:30:21 +01:00
marc tobias
27bc8d4f7b replace PHP sizeof() with either count() or empty() 2018-03-22 12:36:24 +01:00
marc tobias
90d531c640 copied API endpoint documentation from wiki.osm.org to /docs 2018-03-19 17:10:22 +01:00
Marc Tobias Metten
2c24b9da5a use real paths in BDD examples 2018-03-18 02:13:42 +01:00
Sarah Hoffmann
d79a2bb17e increase search radius for named streets in addresses
Fixes #950.
2018-03-16 23:37:42 +01:00
Sarah Hoffmann
4ac1bf2d47 clean up byte order detection
Check for existence of the expected functions and macros
and error out if nothing appropriate can be found.
2018-03-16 23:09:40 +01:00
Sarah Hoffmann
8b4b647d77 Merge pull request #974 from mtmail/remove-two-minute-data-date-offset
remove the 2-minute offset hack from data date in HTML output
2018-03-16 21:18:49 +01:00
Sarah Hoffmann
6495717036 Merge pull request #969 from mtmail/update-vagrant-md
update Vagrant instructions. E.g. cucumber => behave
2018-03-16 20:25:52 +01:00
marc tobias
d23fa84471 remove the 2-minute offset hack from data date in HTML output 2018-03-14 16:42:41 +01:00
marc tobias
937c8ba231 update Vagrant instructions. E.g. cucumber => behave 2018-03-14 16:12:39 +01:00
Marc Tobias Metten
88beeb7916 merge json and jsonv2 templates, they were very similar 2018-03-13 23:49:04 +01:00
marc tobias
34a27c7cab update Vagrant instructions. E.g. cucumber => behave 2018-03-09 15:06:57 +01:00
Sarah Hoffmann
c04541b4da remove now unnecessary DOCS comments in vagrant script 2018-03-08 21:44:21 +01:00
Sarah Hoffmann
8d4a86635f Merge branch 'vagrant-centos-with-selinux' of https://github.com/mtmail/Nominatim 2018-03-08 21:37:24 +01:00
marc tobias
2dc6ee7e1c vagrant centos: update documentation. /build directory is sibling, not child of /Nominatim 2018-03-07 16:09:08 +01:00
marc tobias
ccab565a4a vagrant centos: make sure /home/vagrant/Nominatim directory doesnt get created 2018-03-07 16:05:22 +01:00
Sarah Hoffmann
5c8fbe8186 Merge pull request #941 from mtmail/parameter-parser-tests2
PHP tests for ParameterParser
2018-03-06 23:25:20 +01:00
marc tobias
7fd46dcee9 ParameterParser: getSet default value doesnt have to be part of the set 2018-03-06 14:53:23 +01:00
marc tobias
3ef4c4fbe7 ParameterParser: getStringList removes empty strings 2018-03-06 14:51:48 +01:00
marc tobias
47258f40ea ParameterParser: getFloat with empty string value throws exception 2018-03-06 13:35:27 +01:00
marc tobias
123a3c0347 ParameterParser: getInt with empty string value throws exception 2018-03-06 13:33:19 +01:00
mtmail
d60a2e693c Merge pull request #963 from matkoniecz/typo
fix two typos in docs
2018-03-03 22:41:25 +01:00
Mateusz Konieczny
db524a35c6 fix two typos in docs 2018-03-03 08:36:47 +01:00
Sarah Hoffmann
f23a860b33 second attempt at strict names in structured queries
If a term does not go into names it should go into
address terms.
2018-03-02 00:26:48 +01:00
Sarah Hoffmann
df008d99f5 do not allow importance to become 0
Importance is weighed against a viewbox factor which disappears
when the importance is 0.

Fixes #930.
2018-03-01 22:37:45 +01:00
Sarah Hoffmann
fd920fba9b for structured search only accept name terms from the first phrase
Fixes #952.
2018-03-01 22:35:34 +01:00
Marc Tobias Metten
d9cd8c6fff use assertSame to check array order, 0 vs false 2018-02-28 23:22:45 +01:00
marc tobias
b303c785e9 CentOS: move SELinux setup step so it can install in /srv 2018-02-27 17:06:25 +01:00
Sarah Hoffmann
36fa21d7ce fix more behave table formatting errors 2018-02-26 23:41:57 +01:00
Sarah Hoffmann
f1a388700f Merge branch 'patch-1' of https://github.com/NeilRickards/Nominatim into NeilRickards-patch-1 2018-02-26 20:54:44 +01:00
Sarah Hoffmann
2b66a7a39a test: fix format of behave table 2018-02-26 20:49:26 +01:00
Neil Rickards
4daa584b7d Update nominatim.c 2018-02-26 00:07:53 +01:00
Sarah Hoffmann
de9507bc63 add develop section to documentation 2018-02-24 23:24:25 +01:00
Sarah Hoffmann
2b48d09286 Merge pull request #948 from mtmail/phpcs-enable-3-spacing-rules
enable 3 spacing rules again, no PHP file needed changes
2018-02-24 20:49:41 +01:00
Marc Tobias Metten
97741aaf1c enable 3 spacing rules again, no PHP file needed changes 2018-02-24 18:40:45 +01:00
Marc Tobias Metten
146779340c use setExpectedException to make sure exceptions are really thrown 2018-02-24 18:14:34 +01:00
marc tobias
d1f6fab68a add links to docker, ansible respositories 2018-02-24 17:17:43 +01:00
Sarah Hoffmann
c835918123 Merge pull request #940 from mtmail/phpcs-deep-levels
phpcs deep levels
2018-02-23 21:59:42 +01:00
Marc Tobias Metten
fd9345cda3 PHP tests for ParameterParser 2018-02-23 01:46:18 +01:00
Marc Tobias Metten
8a615ad969 phpcs fixes. Mostly spacing and single quotes 2018-02-23 01:16:01 +01:00
Marc Tobias Metten
eaaa4a7b31 phpcs instructions only searched one level deep 2018-02-23 01:15:36 +01:00
marc tobias
c3e5654113 move CentOS Vagrant VM to a SELinux-enabled base image 2018-02-22 17:51:55 +01:00
Sarah Hoffmann
ff2a40b109 Merge pull request #928 from EdwardBetts/spelling
Correct spelling mistakes.
2018-02-18 14:31:15 +01:00
Edward Betts
7e00a6e2ff Correct spelling mistakes. 2018-02-18 13:11:35 +00:00
Neil Rickards
8ee36fb78c Avoid reading outside buffer
Current str_replace code will read outside buffer if `isspace` and `from` occurs at the start of `buffer`
2018-02-15 18:02:59 +00:00
Sarah Hoffmann
c3483747eb reimport boundaries from scratch when type is changed
Fixes #895.
2018-02-12 21:19:27 +01:00
Sarah Hoffmann
3fda792929 ignore empty flatnode file option
Fixes #902.
2018-02-12 20:47:04 +01:00
Sarah Hoffmann
ba57a9ba07 Merge pull request #915 from foodev/master
PlaceLookup::getAddressDetails() should be public
2018-02-12 20:00:13 +01:00
Jonas Hantelmann
a489ac07cd PlaceLookup::getAddressDetails() should be public, restore default values
PlaceLookup::getAddressDetails() is also used within the hierarchy.php
file so it must not be private.
2018-02-12 11:21:05 +01:00
Sarah Hoffmann
3505417e3f Merge pull request #905 from mtmail/illinois-li-case-insensitive
make sure Illinois,Alabama,Louisiana state code special handling is case insensitive
2018-02-10 15:50:42 +01:00
Sarah Hoffmann
29e78780e5 Merge pull request #909 from mtmail/decimal-coord-parsing-with-sub-seconds
parsing coordinates allows second with floats
2018-02-10 15:49:48 +01:00
Sarah Hoffmann
868caeaf1b Merge pull request #910 from mtmail/shorten-line-in-update-php
shorten line to please PHP style guide
2018-02-10 15:43:41 +01:00
Sarah Hoffmann
7f72c7b5fc Merge pull request #911 from mtmail/trivial-typo
typo in error message
2018-02-10 15:42:28 +01:00
marc tobias
e428019170 typo in error message 2018-02-08 18:02:19 +01:00
marc tobias
e9407cd48d shorten line to please PHP style guide 2018-02-08 17:52:26 +01:00
marc tobias
5042be1b72 parsing coordinates allows second with floats 2018-02-08 17:45:43 +01:00
Marc Tobias Metten
315713ff9a make sure Illinois,Alabama,Louisiana state code special handling is case insensitive 2018-02-07 00:48:18 +01:00
Sarah Hoffmann
7b3fb23216 Merge pull request #900 from mtmail/check-file-exist-before-delete
update.php - check file exists before deleting
2018-01-31 08:58:32 +01:00
Marc Tobias Metten
1d6861667b update.php - check file exists before deleting 2018-01-31 00:38:05 +01:00
Sarah Hoffmann
ae1df044e2 update links and remove MapQuest reference 2018-01-22 23:47:41 +01:00
Sarah Hoffmann
9d2de46c47 prepare for release 3.1.0 2018-01-17 21:36:37 +01:00
Sarah Hoffmann
dfa74daf52 Merge pull request #884 from lonvia/docs-tomkdocs
Switch to mkdocs for building documentation
2018-01-16 22:43:49 +01:00
Sarah Hoffmann
d4110eef7e improve syntax highlighting for vagrant scripts 2018-01-15 23:47:00 +01:00
Sarah Hoffmann
86833454a4 update vagrant scripts 2018-01-15 22:59:16 +01:00
Sarah Hoffmann
b8f7563da9 use mkdocs for compiling the documentation
Requires to shuffle around the documentation.
make doc will now compile the documentation
in the build directory. The markdowns created
from the vagrant files are no longer versioned.
2018-01-14 23:43:15 +01:00
Sarah Hoffmann
e080ccbcf8 update US postcode file from 2017 Tiger data
Location has been computed as the centroid over all
lines of a given postcodes. Then all postcodes which
cover a radius of more than 0.9 have been removed.
2018-01-14 20:18:29 +01:00
Sarah Hoffmann
13469e1576 convert remaining http links and shorten copyright URL 2018-01-11 23:05:28 +01:00
Sarah Hoffmann
8f23ba076b replace non-standard uint type with unsigned
See #879.
2018-01-10 23:27:49 +01:00
Sarah Hoffmann
2cf1ff41c0 move nominatim.org links to https
Solves #737.
2018-01-10 23:21:21 +01:00
Sarah Hoffmann
118517b076 Merge pull request #874 from lonvia/check-for-updates
Add function to check if new updates are available
2018-01-10 22:51:12 +01:00
Sarah Hoffmann
45abcbc301 update drop list for new postcode table
Fixes #875.
2018-01-05 22:41:25 +01:00
Sarah Hoffmann
d5df1c8ae3 fix setup when no us_postcode is available 2018-01-05 22:41:05 +01:00
Sarah Hoffmann
9712decefe update URLs in code and documentation
Use https for all openstreetmap addresses, remove defunct
mapquest link and redirect documentation links to
nominatim.org.
2018-01-05 22:38:51 +01:00
Sarah Hoffmann
6ba87c37d6 switch default replication source to https 2018-01-04 23:27:53 +01:00
Sarah Hoffmann
b06bc799bc add function to check if new updates are available 2018-01-01 22:23:29 +01:00
Sarah Hoffmann
a36b316079 Merge pull request #868 from JonathanMontane/feat/export
feat(export): added linked_place_id as an attribute to feature element
2017-12-20 22:00:39 +01:00
Jonathan Montane
c54fc44b33 feat(export): added linked_place_id as an attribute to feature element 2017-12-18 10:34:05 +01:00
Sarah Hoffmann
c7b903f4b0 assume name for special operator in bounded search
With bounded=1 we already have a restricted area, so it does
not make sense to interpret the query as a near search.

Fixes #311.
2017-12-17 23:50:16 +01:00
Sarah Hoffmann
cdfa31c390 Gives preference to special terms like postcode and housenumber
Fixes #846.
2017-12-17 20:23:34 +01:00
Sarah Hoffmann
3d51c2a4e7 show by default all entries from the broken polygon list
Fixes #854.
2017-12-17 17:29:08 +01:00
Sarah Hoffmann
b94229fb8e Give higher penalty to partial search terms
Avoids that the interpreation of a term as partial term
is ranked higher than as a special term like postcode
or house number.

Fixes #847.
2017-12-17 16:00:44 +01:00
Sarah Hoffmann
637c5c2936 add documentation for new word count compute 2017-12-17 16:00:28 +01:00
Sarah Hoffmann
cbaabe7c24 add function to recalculate counts for full-word search term 2017-12-17 16:00:28 +01:00
Sarah Hoffmann
35c7269bac when linking waterway ways and relations allow all river-like types
Fixes #848
2017-12-16 17:00:55 +01:00
Sarah Hoffmann
ed85388de5 fix address walk-up for reverse
Fixes the row for the join and completely drops parts that have
a linked_place_id.

Fixes #859.
2017-12-15 00:10:05 +01:00
mtmail
f79434f49d Merge pull request #843 from matejkrajcovic/patch-1
Fix typos in introduction.php
2017-10-30 12:38:55 +01:00
Matej Krajčovič
fcba2eabc4 Fix typos in introduction.php 2017-10-30 12:36:20 +01:00
Sarah Hoffmann
e523b34db9 precomputed postcodes must use dedicated import function 2017-10-28 18:24:01 +02:00
Sarah Hoffmann
96b5f8786b Merge pull request #842 from mtmail/updated-tiger-county-json-file
update utils/tiger_county_fips.json data
2017-10-28 16:20:05 +02:00
Marc Tobias Metten
1a1e0ef138 update utils/tiger_county_fips.json data 2017-10-28 00:08:59 +02:00
Sarah Hoffmann
2d3ea552c4 Merge branch 'fix-map-on-details-page' of https://github.com/mtmail/Nominatim into mtmail-fix-map-on-details-page 2017-10-27 22:55:47 +02:00
Marc Tobias Metten
c4e72e6ca9 UI: minimap causes main map not to initialize 2017-10-27 22:15:22 +02:00
Marc Tobias Metten
9f6f3dd75d UI: minimap causes main map not to initialize 2017-10-27 22:13:47 +02:00
Marc Tobias Metten
6b994cb5ff UI: minimap causes main map not to initialize 2017-10-27 22:06:48 +02:00
Sarah Hoffmann
c44324fda5 ignore linked places for address details
Fixes #816.
2017-10-27 21:57:35 +02:00
Sarah Hoffmann
8d91a88b22 allow unnamed roads for reverse geocoding
Should avoid that results are too far off in areas where
most roads are unnamed.
2017-10-27 20:10:32 +02:00
Sarah Hoffmann
f7258e314d ignore linked places for reverse geocoding
Fixes #838.
2017-10-27 20:06:53 +02:00
Sarah Hoffmann
6a3c6c43ea Merge pull request #835 from lonvia/fix-quoting
Replace double quoting with single quotes
2017-10-26 22:01:43 +02:00
Sarah Hoffmann
6c1977b448 replace double-quoting with single quotes where applicable 2017-10-26 21:40:33 +02:00
marc tobias
71602afcad PHP code style rule to enforce single quotes 2017-10-26 21:03:17 +02:00
Sarah Hoffmann
0c053431f5 Merge pull request #834 from mtmail/tests-for-closest-housenumber
tests for lib.php closestHouseNumber
2017-10-26 20:51:51 +02:00
Marc Tobias Metten
185a983c9d tests for lib.php closestHouseNumber 2017-10-25 23:58:13 +02:00
Sarah Hoffmann
adbbb1ce02 restrict number of results for reverse queries
When given a coordinate off the coast of a large town, the entire
town may end up in the potential results during the reverse query.
Postgres then needs to sort tens of thousands of results before it
can determine the clostest one. Given that the results at such a
large search radius are bound to be imprecise anyway, restrict
the number of results postgres should consider to 1000.
2017-10-25 22:34:29 +02:00
Sarah Hoffmann
f78d094483 fix variable typo when filtering results
Fixes #830 and #832.
2017-10-25 20:25:23 +02:00
Sarah Hoffmann
7eeb79ce67 placex must not return a lookup housenumber 2017-10-25 20:11:51 +02:00
Sarah Hoffmann
7caa67d8ec penalize housenumber after the postcode 2017-10-24 23:30:41 +02:00
Sarah Hoffmann
919b1b42fa fix uninitialised rank variable when regrouping searches 2017-10-24 23:17:47 +02:00
Sarah Hoffmann
760807c5e0 revert use of global penalty for a search direction
Adding a penalty to a search description because there
is a term at the beginning which looks like a country
turned out to be a bad idea as there are too many
abbreviations around that match against frequently
matched words.
2017-10-24 22:42:29 +02:00
Sarah Hoffmann
9ac401267a tiger import: convert counties to str
For python2 the gdal features come out as str and
cannot be combined with unicode strings.
2017-10-24 22:27:09 +02:00
marc tobias
a71200a57a huge cleanup of tigerAddressImport.py 2017-10-24 22:27:09 +02:00
marc tobias
b062e7e774 huge cleanup of tigerAddressImport.py 2017-10-24 22:27:09 +02:00
Sarah Hoffmann
d42aa08705 Merge pull request #829 from lonvia/result-as-a-class
Use PlaceLookup in search for retriving place details
2017-10-24 22:17:59 +02:00
Sarah Hoffmann
282c6777ee use PlaceLookup::loadParamArray in search and lookup 2017-10-23 23:30:53 +02:00
Sarah Hoffmann
9981d74ee1 add loadParamArray function to PlaceLookup and use for reverse 2017-10-23 23:30:53 +02:00
Sarah Hoffmann
1a4506f6ab use PlaceLookup in search 2017-10-23 23:30:53 +02:00
Sarah Hoffmann
914caab43d make PlaceLookup::lookup() accept multiple results 2017-10-23 23:30:53 +02:00
Sarah Hoffmann
5eb11800a7 replace SQL code in PlaceLookup with content of search's get_details 2017-10-23 23:30:53 +02:00
Sarah Hoffmann
1424e8e29b use Result class in reverse geocoding
Also simplifies the reverse algorithm slightly by no longer
having an additional distance lookup.
2017-10-23 23:30:53 +02:00
Sarah Hoffmann
42f079c355 introduce Result class in Geocode and SearchDescription 2017-10-23 23:30:53 +02:00
Sarah Hoffmann
8f884d7f23 Merge pull request #822 from mtmail/ui-allow-copypaste-combined-latlon
Ui allow copypaste combined latlon
2017-10-22 12:11:02 +02:00
Sarah Hoffmann
2cf21a3008 Merge pull request #819 from mtmail/tiger-2017-import
Tiger 2017 data no longer contains -divroad- field
2017-10-21 15:50:05 +02:00
marc tobias
2361ca2c71 UI: allow copy&pasting lat,lon into the lat search field 2017-10-21 14:28:28 +02:00
marc tobias
3cee2d185d UI: allow copy&pasting lat,lon into the lat search field 2017-10-21 14:11:46 +02:00
Sarah Hoffmann
cc785ccad0 Merge pull request #821 from mtmail/ui-scrollwheel-minimap
UI: scrollwheel, minimap
2017-10-21 13:53:24 +02:00
marc tobias
da4a2b7b6e UI: scrollwheel, minimap 2017-10-21 13:24:02 +02:00
Sarah Hoffmann
f17be2403f Merge pull request #820 from mtmail/php-test-path-changed
Vagrant documentation: update path to php tests
2017-10-21 12:17:50 +02:00
marc tobias
47bb49384e Vagrant documentation: update path to php tests 2017-10-21 11:46:12 +02:00
marc tobias
8eed1a8bec Tiger 2017 data no longer contains -divroad- field 2017-10-20 15:17:51 +02:00
Sarah Hoffmann
fcf7fcee03 Merge pull request #814 from lonvia/phrase-as-a-class
Make phrases a class and add early checking of token validity
2017-10-15 18:07:55 +02:00
Sarah Hoffmann
5c18d6865d adapt unit tests to new Phrase class 2017-10-14 20:45:20 +02:00
Sarah Hoffmann
cdf8c67898 fix CodeSniffer offences 2017-10-13 23:11:09 +02:00
Sarah Hoffmann
00265af528 move word recheck into token collection
Drop tokens for special and postcode searches already when
collecting them for ValidTokens when they cannot be found
in the normalized query.
2017-10-13 23:04:12 +02:00
Sarah Hoffmann
77b76ae51b simplify cross-check of country tokens
Drop country tokens that do not match the country code list
early. Remove in turn the special country code check for
structured phrases. It is sufficient to do this during
word list building.
2017-10-13 22:23:39 +02:00
Sarah Hoffmann
9ef2370a2a remove unused $aPossibleMainWordIDs array 2017-10-13 21:34:13 +02:00
Sarah Hoffmann
c700421aa7 add documentation for Phrase 2017-10-13 21:23:45 +02:00
Sarah Hoffmann
77abe882ab take frequency scores from token description
No need to hand them in separately.
2017-10-12 22:59:07 +02:00
Sarah Hoffmann
023f94b066 convert phrase array to class 2017-10-12 22:37:44 +02:00
Sarah Hoffmann
7ea1ef3feb take country names only from relations 2017-10-12 21:03:03 +02:00
Sarah Hoffmann
df463f4ea6 Show address rank in details and hide unset admin_level
Address rank explains better why the address parts are where
they are.

Fixes #766.
2017-10-11 22:17:59 +02:00
Sarah Hoffmann
3da4c9c384 Sort results for near searches by proximity
If a reference coordinate is given, results really should be
sorted by distance to this point ignoring importance completely.

Fixes #796.
2017-10-10 23:03:28 +02:00
Sarah Hoffmann
97bc185152 Merge pull request #812 from lonvia/search-as-a-class
Refactoring Search arrays
2017-10-10 21:08:11 +02:00
Sarah Hoffmann
c8780da19c documentation for SearchContext and SearchDescription 2017-10-10 00:15:56 +02:00
Sarah Hoffmann
c02bf4986f coding style and some documentation 2017-10-09 23:13:04 +02:00
Sarah Hoffmann
9a5d5d9aec move complete search query code into SearchDescription 2017-10-09 22:55:50 +02:00
Sarah Hoffmann
2c62a8dbbc adapt phpunit tests to new SearchContext class 2017-10-09 22:11:46 +02:00
Sarah Hoffmann
55629a4891 move country list to SearchContext 2017-10-08 23:33:54 +02:00
Sarah Hoffmann
907133a38c move excluded place list to SearchContext 2017-10-08 23:15:06 +02:00
Sarah Hoffmann
86c0858130 move viewbox sql to new SearchContext 2017-10-08 22:44:01 +02:00
Sarah Hoffmann
30511fd3ab replace NearPoint with a more generic context object
The NearPoint is actually common to all SearchDescriptions
and there is other context data as well. like viewbox, that
needs to be available to the search object but is common.
2017-10-08 21:23:31 +02:00
Sarah Hoffmann
614a6ab861 don't trust words from word table to be sanatized 2017-10-08 17:36:38 +02:00
Sarah Hoffmann
4bff2814a9 add missing include 2017-10-08 17:13:41 +02:00
Sarah Hoffmann
8e0ffde3e0 fix CodeSniffer violations 2017-10-08 17:00:59 +02:00
Sarah Hoffmann
795153b213 fix more syntax issues 2017-10-08 16:42:04 +02:00
Sarah Hoffmann
fd08d41962 move Search dump function into SearchDescription class 2017-10-08 16:05:27 +02:00
Sarah Hoffmann
75e35f3832 fix syntax errors from introduction of SearchDescription 2017-10-08 15:26:14 +02:00
Sarah Hoffmann
16268f92cc convert getGroupedSearches to SearchDescription class 2017-10-08 12:57:22 +02:00
Sarah Hoffmann
d72c863353 add function to convert array to SQL 2017-10-08 10:06:17 +02:00
Sarah Hoffmann
96b6a1a418 use SearchDescription class in query loop 2017-10-08 09:54:12 +02:00
Sarah Hoffmann
0067555c38 move initial search setup to new class type 2017-10-07 12:24:21 +02:00
Sarah Hoffmann
77d4453334 add new class for searches 2017-10-07 12:24:21 +02:00
Sarah Hoffmann
c563c2bfec drop searches with excluded country codes earlier 2017-10-07 12:23:46 +02:00
Sarah Hoffmann
266153f218 remove code for dropping address terms
This code has been inactive in quite a while and is a suboptimal
solution. We need to be much more selective in what gets dropped.
2017-10-07 11:53:33 +02:00
Sarah Hoffmann
73e737d775 fix variable names 2017-10-06 22:01:52 +02:00
Sarah Hoffmann
5029101048 further restrict use of partial terms in names 2017-10-06 21:36:28 +02:00
Sarah Hoffmann
0c9a241487 housenumbers may only appear before or after the name 2017-10-06 21:16:35 +02:00
Sarah Hoffmann
41d2cd318d penalize search order where a country comes first 2017-10-06 21:07:33 +02:00
Sarah Hoffmann
0d2cdb5c2f allow postcodes and housenumbers together
Fixes #805.
2017-10-06 20:48:35 +02:00
Sarah Hoffmann
f8d55b5448 sanitize special search term before normalizing 2017-10-06 00:22:27 +02:00
Sarah Hoffmann
00a3a8834b fix postcode search
Name token must be fully replaced with the postcode and
postcode search must be done only once.
2017-10-04 23:33:29 +02:00
Sarah Hoffmann
32f6ddf6db only allow either postcode or special search
Fixes #804.
2017-10-04 20:15:06 +02:00
Sarah Hoffmann
1220ff5da6 use correct source for radius column in debug view 2017-10-04 20:14:35 +02:00
Sarah Hoffmann
89c576fbe1 tests: more coverage for all API endpoints 2017-10-04 00:05:34 +02:00
Sarah Hoffmann
e276ec2e94 Merge pull request #803 from lonvia/update-postcodes
Add script to update table with artifical postcode centroids
2017-10-03 16:28:22 +02:00
Sarah Hoffmann
bafbf679b6 add script for updating postcodes 2017-10-03 15:58:14 +02:00
Sarah Hoffmann
8e2ef2842e move psqlRunScript implementation into cmd lib
Function needed for update.php as well.
2017-10-03 14:26:59 +02:00
Sarah Hoffmann
218b70fd96 test: remove road-fallback test from db tests
This should be tested in the api section.
2017-10-03 14:26:08 +02:00
Sarah Hoffmann
e3323e8888 fix search for postcode via structured query
Results from the artifical postcode table were dropped
when reevaluating rank of results.
2017-10-03 12:10:27 +02:00
Sarah Hoffmann
eacaf3489e more coverage tests for Geocode.php 2017-10-02 23:09:45 +02:00
Sarah Hoffmann
2deac34648 remove unnecessary size check 2017-10-02 22:31:52 +02:00
Sarah Hoffmann
e8c52c6780 be more strict with searches involving house numbers
Housenumber searches without a name cannot exist per
definition. Searches with only a name but no address
should not fall back on a search without house number.
This should improve postcode only search.
2017-10-02 22:22:50 +02:00
Sarah Hoffmann
e7e7ae0104 avoid unnecessary SQL when rechecking rank restrictions 2017-10-02 20:42:37 +02:00
Sarah Hoffmann
0d4c1e8460 fix viewbox related test
Coordinates are no longer specially ordered.
2017-10-02 20:39:33 +02:00
Sarah Hoffmann
cdabea7c76 docs: clarify how to run pip install
Fixes #792.
2017-10-01 22:48:57 +02:00
Sarah Hoffmann
749091bf3a order of viewbox coordinates does not matter 2017-10-01 22:48:57 +02:00
Sarah Hoffmann
28810e6ce0 Merge pull request #802 from mtmail/coordinate-extract-missing-first-minus-sign
NearPoint::extractFromQuery - greedy-match optional quote sign
2017-10-01 22:42:24 +02:00
Sarah Hoffmann
f2c15b73ad skip output of lat/lon in debug when no near point given 2017-09-30 12:24:37 +02:00
Sarah Hoffmann
a88527b2a0 fix index when rechecking postcode name 2017-09-30 12:19:16 +02:00
Sarah Hoffmann
b1e8db7ca7 return unchanged term if normalizer was not found 2017-09-30 09:39:47 +02:00
marc tobias
06657b3e10 NearPoint::extractFromQuery - greedy-match optional quote sign 2017-09-21 19:13:50 +02:00
Sarah Hoffmann
81a7ea36db more API tests (mostly for user errors) 2017-09-19 23:06:31 +02:00
Sarah Hoffmann
af74c037f4 enable coverage also for tests with HTTP errors 2017-09-19 22:42:09 +02:00
Sarah Hoffmann
6796749136 Merge pull request #798 from mtmail/coordinate-extract-missing-first-minus-sign
fix to NearPoint::extractFromQuery handling first minus sign
2017-09-19 21:23:49 +02:00
marc tobias
e67a6dc321 fix to NearPoint::extractFromQuery handling first minus sign 2017-09-19 12:40:10 +01:00
Sarah Hoffmann
15a215729e fix handling of near queries with special search
Make sure to use the classtype tables with near search and
allow to search for arbitrary key/values (forbidding it
for viewbox searches).

Add tests for near queries.
2017-09-19 00:07:11 +02:00
Sarah Hoffmann
ce95c55d65 fix display of nearpoint in debug view 2017-09-18 23:06:30 +02:00
Sarah Hoffmann
8eb066c692 reinstate key-value amenity search
Reenable search by the secret special term [key=value]
matching against the given main tag. Note that for most
cases that works only for tags that also have a special
search table.
2017-09-18 22:09:06 +02:00
Sarah Hoffmann
a0de20e9bc more API tests for code coverage
Also fixes two minor issues related to structured queries.
2017-09-17 23:30:08 +02:00
Sarah Hoffmann
2dbf58d461 improve code coverage documentation 2017-09-17 20:27:06 +02:00
Sarah Hoffmann
9a47e1834f reduce message frequency during indexing 2017-09-17 20:13:05 +02:00
Sarah Hoffmann
61ed3b8ab3 setup: bail out earl when something is wrong with nominatim.so 2017-09-17 20:07:03 +02:00
Sarah Hoffmann
bb1552be29 setup: error out when web site user does not exist
User is needed to be able to grant rights.
2017-09-17 19:51:00 +02:00
Sarah Hoffmann
5614ece9a1 run psql in quiet mode unless 'verbose' is enabled 2017-09-17 11:34:35 +02:00
Sarah Hoffmann
3546b30473 timestamp info message and repeat warnings at end 2017-09-17 11:06:52 +02:00
Sarah Hoffmann
cf32da3748 docs: add more requirements for running tests 2017-09-16 22:11:39 +02:00
Sarah Hoffmann
909b0c7462 Merge pull request #782 from lonvia/rework-postcodes
Rework handling of artificial postcode centroids
2017-09-16 15:54:55 +02:00
Sarah Hoffmann
15cd5c777b README: point to release instalation instructions 2017-09-06 20:36:59 +02:00
Sarah Hoffmann
37c653396b increase search rank of leisure=park
Fixes #786.
2017-08-31 21:10:48 +02:00
Sarah Hoffmann
8c4bcd36ea check that replication URL points to a repo of OSM diffs
Also check that pyosmium does not return None to work around
a bad return code in the current release of pyosmium-get-changes.

Fixes #784.
2017-08-29 21:05:54 +02:00
Sarah Hoffmann
88610b1b74 further restrict results for <postcode>, <term>
Disallow postcode operator together with housenumbers
and force results around a postcode when no address is
given.
2017-08-21 22:29:51 +02:00
Sarah Hoffmann
9aeb111fba tests: add new admin scene 2017-08-20 09:29:56 +02:00
Sarah Hoffmann
7ca5219297 fixup tests 2017-08-19 19:37:06 +02:00
Sarah Hoffmann
f4a00eba26 enable details view for artificial postcodes 2017-08-19 19:37:06 +02:00
Sarah Hoffmann
86a8900e21 fix subqueries when getting details for postcodes 2017-08-19 19:37:06 +02:00
Sarah Hoffmann
67bb885900 throw away searches with two postcodes 2017-08-19 19:37:06 +02:00
Sarah Hoffmann
e55ac77c94 add simple tests for postcode import 2017-08-19 19:37:06 +02:00
Sarah Hoffmann
f9205caf22 adapt scene generation tool to newest libosmium 2017-08-19 19:37:06 +02:00
Sarah Hoffmann
3c9af7f151 move adding postcodes to word table to calculation step 2017-08-19 19:37:06 +02:00
Sarah Hoffmann
5e54e78176 add migration path for postcodes 2017-08-19 19:37:06 +02:00
Sarah Hoffmann
caf018538f normalize all postcodes before use 2017-08-19 19:37:06 +02:00
Sarah Hoffmann
ccae2c733b simplify search for artificial postcodes 2017-08-19 19:37:06 +02:00
Sarah Hoffmann
5673c4cf91 special handling for estimated postcode in areas
Don't add a postcode at all if multiple estimated
postcodes fall into the area.
2017-08-19 19:37:06 +02:00
Sarah Hoffmann
ec8af1dd40 fix more tests 2017-08-19 19:37:06 +02:00
Sarah Hoffmann
50c5abf6bb fix API tests wrt postcodes 2017-08-19 19:37:06 +02:00
Sarah Hoffmann
a44377c7b0 fix postcode-related tests 2017-08-19 19:37:06 +02:00
Sarah Hoffmann
5237f44c4a remove lat/lon check for search terms
Was only used with GB postcodes which were removed.
2017-08-19 19:37:06 +02:00
Sarah Hoffmann
99e9abe843 require postcodes to match exactly in normalised form 2017-08-19 19:37:06 +02:00
Sarah Hoffmann
5b4bbab9be include GB CodePoint data into location_postcode table 2017-08-19 19:37:06 +02:00
Sarah Hoffmann
413c69ddc9 improve calculation of postcode for interpolations 2017-08-19 19:37:05 +02:00
Sarah Hoffmann
53f8459e97 move postcode indexing to end of setup
The search_name tables are needed for finding the parent,
so the rest of the database must be indexed.
2017-08-19 19:37:05 +02:00
Sarah Hoffmann
3714b7ea7d take postcode into account for other searches
Existence of postcode is still optional but if a matching
result is found, then non-matching ones will be discarded.
2017-08-19 19:37:05 +02:00
Sarah Hoffmann
0ecb920866 immediately drop searches where requested country code does not match 2017-08-19 19:37:05 +02:00
Sarah Hoffmann
563099f7fa take address part into account in postcode search 2017-08-19 19:37:05 +02:00
Sarah Hoffmann
ce76a25101 merging back postcodes is no longer necessary 2017-08-19 19:37:05 +02:00
Sarah Hoffmann
57dc0304b5 add search for postcode
Implements the 'postcode' operator.
2017-08-19 19:37:05 +02:00
Sarah Hoffmann
872e73314e move postcodes into special operation for Searches
Introduces postcode field in searches and sorts out any marked
postcodes.
2017-08-19 19:37:05 +02:00
Sarah Hoffmann
727bd73d0b fix typo 2017-08-19 19:37:05 +02:00
Sarah Hoffmann
a2a1901b09 add postcodes as special items in word table 2017-08-19 19:37:05 +02:00
Sarah Hoffmann
43869b9938 make sure postcode gets recomputed on update 2017-08-19 19:37:05 +02:00
Sarah Hoffmann
9a86c0cebc use only computed postcode when getting address
Postcodes from address parts are now ignored as they have
been already taken into account when computing the postcode.
2017-08-19 19:37:05 +02:00
Sarah Hoffmann
d0cc4006d3 remove unused get_address_postcode function 2017-08-19 19:37:05 +02:00
Sarah Hoffmann
dc2911ae72 normalize postcodes before adding to location tables 2017-08-19 19:37:05 +02:00
Sarah Hoffmann
16053e81bf show address tags and postcode in details 2017-08-19 19:37:05 +02:00
Sarah Hoffmann
d59d57957c precompute postcodes
Set postcode column to the best guess for the postcode for
the place.
2017-08-19 19:37:05 +02:00
Sarah Hoffmann
4e792546e8 add postcode to location_area tables 2017-08-19 19:37:05 +02:00
Sarah Hoffmann
79aa74b771 remove unused loaddata file
Load data is being done in the setup script.
2017-08-19 19:37:05 +02:00
Sarah Hoffmann
bdec4e6488 replace AddGeometryColumn() functions
Directly add the columns to the table definitions instead.
2017-08-19 19:37:05 +02:00
Sarah Hoffmann
71ddeb40a6 remove now useless getNearestPostcode function
Most postcodes are not in the location-area tables anymore.
2017-08-19 19:37:05 +02:00
Sarah Hoffmann
80ef6cbaab add indexing of artificial postcodes 2017-08-19 19:37:05 +02:00
Sarah Hoffmann
15dbb6383c add new location_postcode table
Artifical postcode centroids are now saved in there.
2017-08-19 19:37:05 +02:00
Sarah Hoffmann
3e9fb0dc84 fix syntax typo 2017-08-14 22:34:53 +02:00
Sarah Hoffmann
95df39c292 prepare for release 3.0.1 2017-08-13 22:18:08 +02:00
Sarah Hoffmann
aab41b78af Merge pull request #779 from lonvia/update-osm2pgsql
update osm2pgsql
2017-07-26 23:53:47 +02:00
Sarah Hoffmann
96ecee431b Merge pull request #776 from lonvia/vagrant-fix-country-paths
vagrant: download country data into correct directory
2017-07-26 23:44:00 +02:00
Sarah Hoffmann
64bf44eaf0 update osm2pgsql
Fixes #770.
2017-07-26 23:40:14 +02:00
Sarah Hoffmann
9996cc495c Merge pull request #778 from woodpeck/patch-1
Mention explicitly that Osmosis is not required.
2017-07-23 21:01:51 +02:00
Frederik Ramm
bce11a5fc7 Mention explicitly that Osmosis is not required. 2017-07-21 09:23:38 +02:00
Sarah Hoffmann
cce57139cf Merge pull request #775 from lonvia/code-coverage
enable code coverage computation for API BDD tests
2017-07-17 23:19:41 +02:00
Sarah Hoffmann
937bead547 vagrant: download country data into correct directory 2017-07-17 23:17:26 +02:00
Sarah Hoffmann
3fba5e7867 enable code coverage computation for API BDD tests
Fixes #505.
2017-07-17 22:59:13 +02:00
Sarah Hoffmann
a27e191335 Merge pull request #771 from rksh/patch-1
Warn about operation order with Wikipedia data
2017-07-15 10:34:43 +02:00
Andy Carra
8c087e4cb8 Warn about operation order with Wikipedia data
Give first-time users a tip about installing Wikipedia data before performing initial import.
once https://github.com/openstreetmap/Nominatim/issues/255 has been addressed, this note is no-longer important.
2017-07-14 18:52:54 -07:00
Sarah Hoffmann
39bab5f4a1 Merge pull request #768 from SrihariThalla/update-typo-fix
[Minor] [/docs] Update file name of update.php to init updates command
2017-07-07 12:51:51 +02:00
Srihari Thalla
d004a0e3df Update file name of update.php to init updates 2017-07-07 15:57:01 +05:30
mtmail
bc0cf74ba7 Merge pull request #767 from manzari/patch-1
add country_osm_grid download to development (vagrant) setup instructions
2017-07-07 01:46:04 +02:00
manzari
e64e3b4939 Update VAGRANT.md 2017-07-06 18:05:14 +02:00
Sarah Hoffmann
6de0a854fb Merge pull request #763 from SrihariThalla/update-doc-links
Update incorrect internal links to docs
2017-06-29 21:00:29 +02:00
Srihari Thalla
97ad4e4654 Update incorrect internal links to docs 2017-06-29 17:29:15 +05:30
3735 changed files with 63112 additions and 61174 deletions

4
.github/ISSUE_TEMPLATE/config.yml vendored Normal file
View File

@@ -0,0 +1,4 @@
contact_links:
- name: Nominatim Discussions
url: https://github.com/osm-search/Nominatim/discussions
about: Ask questions, get support, share ideas and discuss with community members.

View File

@@ -0,0 +1,22 @@
---
name: Feature request
about: Suggest an idea for this project
title: ''
labels: ''
assignees: ''
---
<!-- Before opening a new feature request, please search through the open issue to check that your request hasn't been reported already. -->
**Is your feature request related to a problem? Please describe.**
<!-- A clear and concise description of what the problem is. Ex. I'm always frustrated when [...] -->
**Describe the solution you'd like**
<!-- A clear and concise description of what you want to happen. -->
**Describe alternatives you've considered**
<!-- A clear and concise description of any alternative solutions or features you've considered. -->
**Additional context**
<!-- Add any other context or screenshots about the feature request here. -->

View File

@@ -0,0 +1,39 @@
---
name: Report issues with search results
about: You have searched something with Nominatim and did not get the expected result.
title: ''
labels: ''
assignees: ''
---
<!-- Note: this template is for reporting problems with searching. If you have found an issue with the data, you need to report/fix the issue directly in OpenStreetMap. See https://www.openstreetmap.org/fixthemap for details. -->
## What did you search for?
<!-- Please try to provide a link to your search. You can go to https://nominatim.openstreetmap.org and repeat your search there. If you originally found the issue somewhere else, please tell us what software/website you were using. -->
## What result did you get?
## What result did you expect?
**When the result in the right place and just named wrongly:**
<!-- Please tell us the display name you expected. -->
**When the result missing completely:**
<!-- Make sure that the data you are looking for is in OpenStreetMap. Provide a link to the OpenStreetMap object or if you cannot get it, a link to the map on https://openstreetmap.org where you expect the result to be.
To get the link to the OSM object, you can try the following:
* Go to [https://openstreetmap.org](https://openstreetmap.org).
* Move to the area of the map where you expect the result and then zoom in as much as possible.
* Click on the question mark on the right side of the map. You get a question cursor. Use it to click on the map where your object is located.
* Find the object of interest in the list that appears on the left side.
* Click on the object and report back the URL that the browser shows.
-->
## Further details
<!-- Anything else we should know about the search. Particularities with addresses in the area etc. -->

View File

@@ -0,0 +1,36 @@
---
name: Report problems with the software
about: You have your own installation of Nominatim and found a bug.
title: ''
labels: ''
assignees: ''
---
<!-- Note: if you are installing Nominatim through a docker image, you should report issues with the installation process with the docker repository first. -->
**Describe the bug**
<!-- A clear and concise description of what the bug is. -->
**To Reproduce**
<!-- Please describe what you did to get to the issue. -->
**Software Environment (please complete the following information):**
- Nominatim version:
- Postgresql version:
- Postgis version:
- OS:
**Hardware Configuration (please complete the following information):**
- RAM:
- number of CPUs:
- type and size of disks:
- bare metal/AWS/other cloud service:
**Postgresql Configuration:**
<!-- List any configuration items you changed in your postgresql configuration. -->
**Additional context**
<!-- Add any other context about the problem here. -->

View File

@@ -0,0 +1,42 @@
name: 'Build Nominatim'
inputs:
ubuntu:
description: 'Version of Ubuntu to install on'
required: false
default: '20'
runs:
using: "composite"
steps:
- name: Install prerequisites
run: |
sudo apt-get install -y -qq libboost-system-dev libboost-filesystem-dev libexpat1-dev zlib1g-dev libbz2-dev libpq-dev libproj-dev libicu-dev
if [ "x$UBUNTUVER" == "x18" ]; then
pip3 install python-dotenv psycopg2==2.7.7 jinja2==2.8 psutil==5.4.2 pyicu osmium PyYAML==5.1 datrie
else
sudo apt-get install -y -qq python3-icu python3-datrie python3-pyosmium python3-jinja2 python3-psutil python3-psycopg2 python3-dotenv python3-yaml
fi
shell: bash
env:
UBUNTUVER: ${{ inputs.ubuntu }}
- name: Download dependencies
run: |
if [ ! -f country_grid.sql.gz ]; then
wget --no-verbose https://www.nominatim.org/data/country_grid.sql.gz
fi
cp country_grid.sql.gz Nominatim/data/country_osm_grid.sql.gz
shell: bash
- name: Configure
run: mkdir build && cd build && cmake ../Nominatim
shell: bash
- name: Build
run: |
make -j2 all
sudo make install
shell: bash
working-directory: build

View File

@@ -0,0 +1,47 @@
name: 'Setup Postgresql and Postgis'
inputs:
postgresql-version:
description: 'Version of PostgreSQL to install'
required: true
postgis-version:
description: 'Version of Postgis to install'
required: true
runs:
using: "composite"
steps:
- name: Remove existing PostgreSQL
run: |
sudo apt-get purge -yq postgresql*
sudo sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list'
sudo apt-get update -qq
shell: bash
- name: Install PostgreSQL
run: |
sudo apt-get install -y -qq --no-install-suggests --no-install-recommends postgresql-client-${PGVER} postgresql-${PGVER}-postgis-${POSTGISVER} postgresql-${PGVER}-postgis-${POSTGISVER}-scripts postgresql-contrib-${PGVER} postgresql-${PGVER} postgresql-server-dev-${PGVER}
shell: bash
env:
PGVER: ${{ inputs.postgresql-version }}
POSTGISVER: ${{ inputs.postgis-version }}
- name: Adapt postgresql configuration
run: |
echo 'fsync = off' | sudo tee /etc/postgresql/${PGVER}/main/conf.d/local.conf
echo 'synchronous_commit = off' | sudo tee -a /etc/postgresql/${PGVER}/main/conf.d/local.conf
echo 'full_page_writes = off' | sudo tee -a /etc/postgresql/${PGVER}/main/conf.d/local.conf
echo 'shared_buffers = 1GB' | sudo tee -a /etc/postgresql/${PGVER}/main/conf.d/local.conf
echo 'port = 5432' | sudo tee -a /etc/postgresql/${PGVER}/main/conf.d/local.conf
shell: bash
env:
PGVER: ${{ inputs.postgresql-version }}
- name: Setup database
run: |
sudo systemctl restart postgresql
sudo -u postgres createuser -S www-data
sudo -u postgres createuser -s runner
shell: bash

320
.github/workflows/ci-tests.yml vendored Normal file
View File

@@ -0,0 +1,320 @@
name: CI Tests
on: [ push, pull_request ]
jobs:
create-archive:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v2
with:
submodules: true
- uses: actions/cache@v2
with:
path: |
data/country_osm_grid.sql.gz
key: nominatim-country-data-1
- name: Package tarball
run: |
if [ ! -f data/country_osm_grid.sql.gz ]; then
wget --no-verbose -O data/country_osm_grid.sql.gz https://www.nominatim.org/data/country_grid.sql.gz
fi
cd ..
tar czf nominatim-src.tar.bz2 Nominatim
mv nominatim-src.tar.bz2 Nominatim
- name: 'Upload Artifact'
uses: actions/upload-artifact@v2
with:
name: full-source
path: nominatim-src.tar.bz2
retention-days: 1
tests:
needs: create-archive
strategy:
matrix:
ubuntu: [18, 20]
include:
- ubuntu: 18
postgresql: 9.5
postgis: 2.5
pytest: pytest
php: 7.2
- ubuntu: 20
postgresql: 13
postgis: 3
pytest: py.test-3
php: 7.4
runs-on: ubuntu-${{ matrix.ubuntu }}.04
steps:
- uses: actions/download-artifact@v2
with:
name: full-source
- name: Unpack Nominatim
run: tar xf nominatim-src.tar.bz2
- name: Setup PHP
uses: shivammathur/setup-php@v2
with:
php-version: ${{ matrix.php }}
coverage: xdebug
tools: phpunit, phpcs, composer
- uses: actions/setup-python@v2
with:
python-version: 3.6
if: matrix.ubuntu == 18
- uses: ./Nominatim/.github/actions/setup-postgresql
with:
postgresql-version: ${{ matrix.postgresql }}
postgis-version: ${{ matrix.postgis }}
- uses: ./Nominatim/.github/actions/build-nominatim
with:
ubuntu: ${{ matrix.ubuntu }}
- name: Install test prerequsites
run: sudo apt-get install -y -qq pylint python3-pytest python3-behave python3-pytest-cov php-codecoverage
if: matrix.ubuntu == 20
- name: Install test prerequsites
run: pip3 install pylint==2.6.0 pytest pytest-cov behave==1.2.6
if: matrix.ubuntu == 18
- name: PHP linting
run: phpcs --report-width=120 .
working-directory: Nominatim
- name: Python linting
run: pylint nominatim
working-directory: Nominatim
- name: PHP unit tests
run: phpunit --coverage-clover ../../coverage-php.xml ./
working-directory: Nominatim/test/php
if: matrix.ubuntu == 20
- name: Python unit tests
run: $PYTEST --cov=nominatim --cov-report=xml test/python
working-directory: Nominatim
env:
PYTEST: ${{ matrix.pytest }}
- name: BDD tests
run: |
mkdir cov
behave -DREMOVE_TEMPLATE=1 -DBUILDDIR=$GITHUB_WORKSPACE/build --format=progress3 -DPHPCOV=./cov
composer require phpunit/phpcov:7.0.2
vendor/bin/phpcov merge --clover ../../coverage-bdd.xml ./cov
working-directory: Nominatim/test/bdd
if: matrix.ubuntu == 20
- name: BDD tests
run: |
behave -DREMOVE_TEMPLATE=1 -DBUILDDIR=$GITHUB_WORKSPACE/build --format=progress3
working-directory: Nominatim/test/bdd
if: matrix.ubuntu == 18
- name: Upload coverage to Codecov
uses: codecov/codecov-action@v1
with:
files: ./Nominatim/coverage*.xml
directory: ./
name: codecov-umbrella
fail_ci_if_error: false
path_to_write_report: ./coverage/codecov_report.txt
verbose: true
if: matrix.ubuntu == 20
icu-test:
needs: create-archive
strategy:
matrix:
ubuntu: [20]
include:
- ubuntu: 20
postgresql: 13
postgis: 3
pytest: py.test-3
php: 7.4
runs-on: ubuntu-${{ matrix.ubuntu }}.04
steps:
- uses: actions/download-artifact@v2
with:
name: full-source
- name: Unpack Nominatim
run: tar xf nominatim-src.tar.bz2
- name: Setup PHP
uses: shivammathur/setup-php@v2
with:
php-version: ${{ matrix.php }}
coverage: xdebug
tools: phpunit, phpcs, composer
- uses: actions/setup-python@v2
with:
python-version: 3.6
if: matrix.ubuntu == 18
- uses: ./Nominatim/.github/actions/setup-postgresql
with:
postgresql-version: ${{ matrix.postgresql }}
postgis-version: ${{ matrix.postgis }}
- uses: ./Nominatim/.github/actions/build-nominatim
with:
ubuntu: ${{ matrix.ubuntu }}
- name: Install test prerequsites
run: sudo apt-get install -y -qq python3-behave
if: matrix.ubuntu == 20
- name: Install test prerequsites
run: pip3 install behave==1.2.6
if: matrix.ubuntu == 18
- name: BDD tests (icu tokenizer)
run: |
behave -DREMOVE_TEMPLATE=1 -DBUILDDIR=$GITHUB_WORKSPACE/build -DTOKENIZER=icu --format=progress3
working-directory: Nominatim/test/bdd
install:
runs-on: ubuntu-latest
needs: create-archive
strategy:
matrix:
name: [Ubuntu-18, Ubuntu-20, Centos-8]
include:
- name: Ubuntu-18
flavour: ubuntu
image: "ubuntu:18.04"
ubuntu: 18
install_mode: install-nginx
- name: Ubuntu-20
flavour: ubuntu
image: "ubuntu:20.04"
ubuntu: 20
install_mode: install-apache
- name: Centos-8
flavour: centos
image: "centos:8"
container:
image: ${{ matrix.image }}
env:
LANG: en_US.UTF-8
defaults:
run:
shell: sudo -Hu nominatim bash --noprofile --norc -eo pipefail {0}
steps:
- name: Prepare container (Ubuntu)
run: |
export APT_LISTCHANGES_FRONTEND=none
export DEBIAN_FRONTEND=noninteractive
apt-get update -qq
apt-get install -y git sudo wget
ln -snf /usr/share/zoneinfo/$CONTAINER_TIMEZONE /etc/localtime && echo $CONTAINER_TIMEZONE > /etc/timezone
shell: bash
if: matrix.flavour == 'ubuntu'
- name: Prepare container (CentOS)
run: |
dnf update -y
dnf install -y sudo glibc-langpack-en
shell: bash
if: matrix.flavour == 'centos'
- name: Setup import user
run: |
useradd -m nominatim
echo 'nominatim ALL=(ALL:ALL) NOPASSWD: ALL' > /etc/sudoers.d/nominiatim
echo "/home/nominatim/Nominatim/vagrant/Install-on-${OS}.sh no $INSTALL_MODE" > /home/nominatim/vagrant.sh
shell: bash
env:
OS: ${{ matrix.name }}
INSTALL_MODE: ${{ matrix.install_mode }}
- uses: actions/download-artifact@v2
with:
name: full-source
path: /home/nominatim
- name: Install Nominatim
run: |
export USERNAME=nominatim
export USERHOME=/home/nominatim
export NOSYSTEMD=yes
export HAVE_SELINUX=no
tar xf nominatim-src.tar.bz2
. vagrant.sh
working-directory: /home/nominatim
- name: Prepare import environment
run: |
mv Nominatim/test/testdb/apidb-test-data.pbf test.pbf
rm -rf Nominatim
mkdir data-env-reverse
working-directory: /home/nominatim
- name: Prepare import environment (CentOS)
run: |
sudo ln -s /usr/local/bin/nominatim /usr/bin/nominatim
echo NOMINATIM_DATABASE_WEBUSER="apache" > nominatim-project/.env
cp nominatim-project/.env data-env-reverse/.env
working-directory: /home/nominatim
if: matrix.flavour == 'centos'
- name: Import
run: nominatim import --osm-file ../test.pbf
working-directory: /home/nominatim/nominatim-project
- name: Import special phrases
run: nominatim special-phrases --import-from-wiki
working-directory: /home/nominatim/nominatim-project
- name: Check full import
run: nominatim admin --check-database
working-directory: /home/nominatim/nominatim-project
- name: Warm up database
run: nominatim admin --warm
working-directory: /home/nominatim/nominatim-project
- name: Prepare update (Ubuntu)
run: apt-get install -y python3-pip
shell: bash
if: matrix.flavour == 'ubuntu'
- name: Run update
run: |
pip3 install --user osmium
nominatim replication --init
NOMINATIM_REPLICATION_MAX_DIFF=1 nominatim replication --once
working-directory: /home/nominatim/nominatim-project
- name: Run reverse-only import
run : |
echo 'NOMINATIM_DATABASE_DSN="pgsql:dbname=reverse"' >> .env
nominatim import --osm-file ../test.pbf --reverse-only --no-updates
working-directory: /home/nominatim/data-env-reverse
- name: Check reverse import
run: nominatim admin --check-database
working-directory: /home/nominatim/data-env-reverse

8
.gitignore vendored
View File

@@ -1,11 +1,9 @@
*.log
*.pyc
build
settings/local.php
docs/develop/*.png
data/wiki_import.sql
data/wiki_specialphrases.sql
data/osmosischange.osc
build
.vagrant
data/country_osm_grid.sql.gz

15
.pylintrc Normal file
View File

@@ -0,0 +1,15 @@
[MASTER]
extension-pkg-whitelist=osmium
ignored-modules=icu,datrie
[MESSAGES CONTROL]
[TYPECHECK]
# closing added here because it sometimes triggers a false positive with
# 'with' statements.
ignored-classes=NominatimArgs,closing
disable=too-few-public-methods,duplicate-code
good-names=i,x,y,fd,db

View File

@@ -1,31 +0,0 @@
---
sudo: required
dist: trusty
language: python
python:
- "3.6"
addons:
postgresql: "9.6"
git:
depth: 3
env:
- TEST_SUITE=tests
- TEST_SUITE=monaco
install:
- vagrant/install-on-travis-ci.sh
before_script:
- psql -U postgres -c "create extension postgis"
script:
- cd $TRAVIS_BUILD_DIR/build
- if [[ $TEST_SUITE == "monaco" ]]; then wget --no-verbose --output-document=../data/monaco.osm.pbf http://download.geofabrik.de/europe/monaco-latest.osm.pbf; fi
- if [[ $TEST_SUITE == "monaco" ]]; then ./utils/setup.php --osm-file ../data/monaco.osm.pbf --osm2pgsql-cache 1000 --all 2>&1 | grep -v 'ETA (seconds)'; fi
- if [[ $TEST_SUITE == "monaco" ]]; then ./utils/specialphrases.php --wiki-import | psql -d test_api_nominatim >/dev/null; fi
- cd $TRAVIS_BUILD_DIR/test/php
- if [[ $TEST_SUITE == "tests" ]]; then phpunit ./ ; fi
- if [[ $TEST_SUITE == "tests" ]]; then phpcs --report-width=120 */**.php ; fi
- cd $TRAVIS_BUILD_DIR/test/bdd
- # behave --format=progress3 api
- if [[ $TEST_SUITE == "tests" ]]; then behave --format=progress3 db ; fi
- if [[ $TEST_SUITE == "tests" ]]; then behave --format=progress3 osm2pgsql ; fi
notifications:
email: false

17
AUTHORS
View File

@@ -3,16 +3,13 @@ Nominatim was written by:
Brian Quinion
Sarah Hoffmann
Marc Tobias Metten
markigail
gemo1011
IrlJidel
Frederik Ramm
Michael Spreng
Daniele Forsi
mfn
Grant Slater
Andree Klattenhoff
appelflap
b3nn0
Spin0us
Kurt Roeckx
Rodolphe Quiédeville
and many more.
For a full list of contributors see
https://github.com/openstreetmap/Nominatim/graphs/contributors

View File

@@ -6,7 +6,7 @@
#
#-----------------------------------------------------------------------------
cmake_minimum_required(VERSION 2.8 FATAL_ERROR)
cmake_minimum_required(VERSION 3.0 FATAL_ERROR)
list(APPEND CMAKE_MODULE_PATH "${CMAKE_SOURCE_DIR}/cmake")
@@ -18,133 +18,258 @@ list(APPEND CMAKE_MODULE_PATH "${CMAKE_SOURCE_DIR}/cmake")
project(nominatim)
set(NOMINATIM_VERSION_MAJOR 3)
set(NOMINATIM_VERSION_MAJOR 4)
set(NOMINATIM_VERSION_MINOR 0)
set(NOMINATIM_VERSION_PATCH 0)
set(NOMINATIM_VERSION "${NOMINATIM_VERSION_MAJOR}.${NOMINATIM_VERSION_MINOR}")
set(NOMINATIM_VERSION "${NOMINATIM_VERSION_MAJOR}.${NOMINATIM_VERSION_MINOR}.${NOMINATIM_VERSION_PATCH}")
add_definitions(-DNOMINATIM_VERSION="${NOMINATIM_VERSION}")
#-----------------------------------------------------------------------------
#
# Find external dependencies
#
# Configuration
#-----------------------------------------------------------------------------
set(BUILD_TESTS off CACHE BOOL "Build test suite" FORCE)
set(WITH_LUA off CACHE BOOL "Build with lua support" FORCE)
set(BUILD_IMPORTER on CACHE BOOL "Build everything for importing/updating the database")
set(BUILD_API on CACHE BOOL "Build everything for the API server")
set(BUILD_MODULE on CACHE BOOL "Build PostgreSQL module")
set(BUILD_TESTS on CACHE BOOL "Build test suite")
set(BUILD_DOCS on CACHE BOOL "Build documentation")
set(BUILD_MANPAGE on CACHE BOOL "Build Manual Page")
set(BUILD_OSM2PGSQL on CACHE BOOL "Build osm2pgsql (expert only)")
set(INSTALL_MUNIN_PLUGINS on CACHE BOOL "Install Munin plugins for supervising Nominatim")
if (NOT EXISTS "${CMAKE_SOURCE_DIR}/osm2pgsql/CMakeLists.txt")
message(FATAL_ERROR "The osm2pgsql directory is empty.\
Did you forget to check out Nominatim recursively?\
\nTry updating submodules with: git submodule update --init")
endif()
add_subdirectory(osm2pgsql)
#-----------------------------------------------------------------------------
# osm2pgsql (imports/updates only)
#-----------------------------------------------------------------------------
find_package(Threads REQUIRED)
unset(PostgreSQL_TYPE_INCLUDE_DIR CACHE)
set(PostgreSQL_TYPE_INCLUDE_DIR "/usr/include/")
find_package(PostgreSQL REQUIRED)
include_directories(${PostgreSQL_INCLUDE_DIRS})
link_directories(${PostgreSQL_LIBRARY_DIRS})
find_program(PYOSMIUM pyosmium-get-changes)
if (NOT EXISTS "${PYOSMIUM}")
set(PYOSMIUM_PATH "/nonexistent")
message(WARNING "pyosmium-get-changes not found (required for updates)")
else()
set(PYOSMIUM_PATH "${PYOSMIUM}")
message(STATUS "Using pyosmium-get-changes at ${PYOSMIUM_PATH}")
if (BUILD_IMPORTER AND BUILD_OSM2PGSQL)
if (NOT EXISTS "${CMAKE_SOURCE_DIR}/osm2pgsql/CMakeLists.txt")
message(FATAL_ERROR "The osm2pgsql directory is empty.\
Did you forget to check out Nominatim recursively?\
\nTry updating submodules with: git submodule update --init")
endif()
set(BUILD_TESTS_SAVED "${BUILD_TESTS}")
set(BUILD_TESTS off)
set(WITH_LUA off CACHE BOOL "")
add_subdirectory(osm2pgsql)
set(BUILD_TESTS ${BUILD_TESTS_SAVED})
endif()
find_program(PG_CONFIG pg_config)
execute_process(COMMAND ${PG_CONFIG} --pgxs
OUTPUT_VARIABLE PGXS
OUTPUT_STRIP_TRAILING_WHITESPACE)
#-----------------------------------------------------------------------------
# python (imports/updates only)
#-----------------------------------------------------------------------------
if (NOT EXISTS "${PGXS}")
message(FATAL_ERROR "Postgresql server package not found.")
if (BUILD_IMPORTER)
find_package(PythonInterp 3.6 REQUIRED)
endif()
find_package(ZLIB REQUIRED)
find_package(BZip2 REQUIRED)
find_package(LibXml2 REQUIRED)
include_directories(${LIBXML2_INCLUDE_DIR})
#-----------------------------------------------------------------------------
#
# Setup settings and paths
#
# PHP
#-----------------------------------------------------------------------------
set(CUSTOMFILES
settings/phrase_settings.php
website/deletable.php
website/details.php
website/hierarchy.php
website/lookup.php
website/polygons.php
website/reverse.php
website/search.php
website/status.php
utils/blocks.php
utils/country_languages.php
utils/imports.php
utils/importWikipedia.php
utils/query.php
utils/server_compare.php
utils/setup.php
utils/specialphrases.php
utils/update.php
utils/warm.php
)
foreach (cfile ${CUSTOMFILES})
configure_file(${PROJECT_SOURCE_DIR}/${cfile} ${PROJECT_BINARY_DIR}/${cfile})
endforeach()
configure_file(${PROJECT_SOURCE_DIR}/settings/defaults.php ${PROJECT_BINARY_DIR}/settings/settings.php)
set(WEBPATHS css images js)
foreach (wp ${WEBPATHS})
execute_process(
COMMAND ln -sf ${PROJECT_SOURCE_DIR}/website/${wp} ${PROJECT_BINARY_DIR}/website/
)
endforeach()
# Setting PHP binary variable as to command line (prevailing) or auto detect
if (BUILD_API OR BUILD_IMPORTER)
if (NOT PHP_BIN)
find_program (PHP_BIN php)
endif()
# sanity check if PHP binary exists
if (NOT EXISTS ${PHP_BIN})
message(FATAL_ERROR "PHP binary not found. Install php or provide location with -DPHP_BIN=/path/php ")
else()
message (STATUS "Using PHP binary " ${PHP_BIN})
endif()
if (NOT PHPCGI_BIN)
find_program (PHPCGI_BIN php-cgi)
endif()
# sanity check if PHP binary exists
if (NOT EXISTS ${PHPCGI_BIN})
message(WARNING "php-cgi binary not found. nominatim tool will not provide query functions.")
set (PHPCGI_BIN "")
else()
message (STATUS "Using php-cgi binary " ${PHPCGI_BIN})
endif()
endif()
#-----------------------------------------------------------------------------
# import scripts and utilities (importer only)
#-----------------------------------------------------------------------------
if (BUILD_IMPORTER)
find_file(COUNTRY_GRID_FILE country_osm_grid.sql.gz
PATHS ${PROJECT_SOURCE_DIR}/data
NO_DEFAULT_PATH
DOC "Location of the country grid file."
)
if (NOT COUNTRY_GRID_FILE)
message(FATAL_ERROR "\nYou need to download the country_osm_grid first:\n"
" wget -O ${PROJECT_SOURCE_DIR}/data/country_osm_grid.sql.gz https://www.nominatim.org/data/country_grid.sql.gz")
endif()
configure_file(${PROJECT_SOURCE_DIR}/cmake/tool.tmpl
${PROJECT_BINARY_DIR}/nominatim)
endif()
#-----------------------------------------------------------------------------
#
# Tests
#
#-----------------------------------------------------------------------------
include(CTest)
if (BUILD_TESTS)
include(CTest)
set(TEST_BDD db osm2pgsql api)
set(TEST_BDD db osm2pgsql api)
foreach (test ${TEST_BDD})
add_test(NAME bdd_${test}
COMMAND lettuce features/${test}
WORKING_DIRECTORY ${PROJECT_SOURCE_DIR}/tests)
set_tests_properties(bdd_${test}
PROPERTIES ENVIRONMENT "NOMINATIM_DIR=${PROJECT_BINARY_DIR}")
endforeach()
find_program(PYTHON_BEHAVE behave)
find_program(PYLINT NAMES pylint3 pylint)
find_program(PYTEST NAMES pytest py.test-3 py.test)
find_program(PHPCS phpcs)
find_program(PHPUNIT phpunit)
add_test(NAME php
COMMAND phpunit ./
WORKING_DIRECTORY ${PROJECT_SOURCE_DIR}/tests-php)
if (PYTHON_BEHAVE)
message(STATUS "Using Python behave binary ${PYTHON_BEHAVE}")
foreach (test ${TEST_BDD})
add_test(NAME bdd_${test}
COMMAND ${PYTHON_BEHAVE} ${test}
WORKING_DIRECTORY ${PROJECT_SOURCE_DIR}/test/bdd)
set_tests_properties(bdd_${test}
PROPERTIES ENVIRONMENT "NOMINATIM_DIR=${PROJECT_BINARY_DIR}")
endforeach()
else()
message(WARNING "behave not found. BDD tests disabled." )
endif()
if (PHPUNIT)
message(STATUS "Using phpunit binary ${PHPUNIT}")
add_test(NAME php
COMMAND ${PHPUNIT} ./
WORKING_DIRECTORY ${PROJECT_SOURCE_DIR}/test/php)
else()
message(WARNING "phpunit not found. PHP unit tests disabled." )
endif()
if (PHPCS)
message(STATUS "Using phpcs binary ${PHPCS}")
add_test(NAME phpcs
COMMAND ${PHPCS} --report-width=120 --colors lib-php
WORKING_DIRECTORY ${PROJECT_SOURCE_DIR})
else()
message(WARNING "phpcs not found. PHP linting tests disabled." )
endif()
if (PYLINT)
message(STATUS "Using pylint binary ${PYLINT}")
add_test(NAME pylint
COMMAND ${PYLINT} nominatim
WORKING_DIRECTORY ${PROJECT_SOURCE_DIR})
else()
message(WARNING "pylint not found. Python linting tests disabled.")
endif()
if (PYTEST)
message(STATUS "Using pytest binary ${PYTEST}")
add_test(NAME pytest
COMMAND ${PYTEST} test/python
WORKING_DIRECTORY ${PROJECT_SOURCE_DIR})
else()
message(WARNING "pytest not found. Python tests disabled." )
endif()
endif()
#-----------------------------------------------------------------------------
# Postgres module
#-----------------------------------------------------------------------------
add_subdirectory(module)
add_subdirectory(nominatim)
add_subdirectory(docs)
if (BUILD_MODULE)
add_subdirectory(module)
endif()
#-----------------------------------------------------------------------------
# Documentation
#-----------------------------------------------------------------------------
if (BUILD_DOCS)
add_subdirectory(docs)
endif()
#-----------------------------------------------------------------------------
# Manual page
#-----------------------------------------------------------------------------
if (BUILD_MANPAGE)
add_subdirectory(man)
endif()
#-----------------------------------------------------------------------------
# Installation
#-----------------------------------------------------------------------------
include(GNUInstallDirs)
set(NOMINATIM_DATADIR ${CMAKE_INSTALL_FULL_DATADIR}/${PROJECT_NAME})
set(NOMINATIM_LIBDIR ${CMAKE_INSTALL_FULL_LIBDIR}/${PROJECT_NAME})
set(NOMINATIM_CONFIGDIR ${CMAKE_INSTALL_FULL_SYSCONFDIR}/${PROJECT_NAME})
set(NOMINATIM_MUNINDIR ${CMAKE_INSTALL_FULL_DATADIR}/munin/plugins)
if (BUILD_IMPORTER)
configure_file(${PROJECT_SOURCE_DIR}/cmake/tool-installed.tmpl installed.bin)
install(PROGRAMS ${PROJECT_BINARY_DIR}/installed.bin
DESTINATION ${CMAKE_INSTALL_BINDIR}
RENAME nominatim)
install(DIRECTORY nominatim
DESTINATION ${NOMINATIM_LIBDIR}/lib-python
FILES_MATCHING PATTERN "*.py"
PATTERN __pycache__ EXCLUDE)
install(DIRECTORY lib-sql DESTINATION ${NOMINATIM_LIBDIR})
install(FILES data/country_name.sql
${COUNTRY_GRID_FILE}
data/words.sql
DESTINATION ${NOMINATIM_DATADIR})
endif()
if (BUILD_OSM2PGSQL)
if (${CMAKE_VERSION} VERSION_LESS 3.13)
# Installation of subdirectory targets was only introduced in 3.13.
# So just copy the osm2pgsql file for older versions.
install(PROGRAMS ${PROJECT_BINARY_DIR}/osm2pgsql/osm2pgsql
DESTINATION ${NOMINATIM_LIBDIR})
else()
install(TARGETS osm2pgsql RUNTIME DESTINATION ${NOMINATIM_LIBDIR})
endif()
endif()
if (BUILD_MODULE)
install(PROGRAMS ${PROJECT_BINARY_DIR}/module/nominatim.so
DESTINATION ${NOMINATIM_LIBDIR}/module)
endif()
if (BUILD_API)
install(DIRECTORY lib-php DESTINATION ${NOMINATIM_LIBDIR})
endif()
install(FILES settings/env.defaults
settings/address-levels.json
settings/phrase-settings.json
settings/import-admin.style
settings/import-street.style
settings/import-address.style
settings/import-full.style
settings/import-extratags.style
settings/icu_tokenizer.yaml
settings/country_settings.yaml
DESTINATION ${NOMINATIM_CONFIGDIR})
install(DIRECTORY settings/icu-rules
DESTINATION ${NOMINATIM_CONFIGDIR})
if (INSTALL_MUNIN_PLUGINS)
install(FILES munin/nominatim_importlag
munin/nominatim_query_speed
munin/nominatim_requests
DESTINATION ${NOMINATIM_MUNINDIR})
endif()

View File

@@ -7,41 +7,9 @@ Please always open a separate issue for each problem. In particular, do
not add your bugs to closed issues. They may looks similar to you but
often are completely different from the maintainer's point of view.
### When Reporting Bad Search Results...
Please make sure to add the following information:
* the URL of the query that produces the bad result
* the result you are getting
* the expected result, preferably a link to the OSM object you want to find,
otherwise an address that is as precise as possible
To get the link to the OSM object, you can try the following:
* go to https://openstreetmap.org
* zoom to the area of the map where you expect the result and
zoom in as much as possible
* click on the question mark on the right side of the map,
then with the queston cursor on the map where your object is located
* find the object of interest in the list that appears on the left side
* click on the object and report the URL back that the browser shows
### When Reporting Problems with your Installation...
Please add the following information to your issue:
* hardware configuration: RAM size, CPUs, kind and size of disks
* Operating system (also mention if you are running on a cloud service)
* Postgres and Postgis version
* list of settings you changed in your Postgres configuration
* Nominatim version (release version or,
if you run from the git repo, the output of `git rev-parse HEAD`)
* (if applicable) exact command line of the command that was causing the issue
## Workflow for Pull Requests
We love to get pull reuqests from you. We operate the "Fork & Pull" model
We love to get pull requests from you. We operate the "Fork & Pull" model
explained at
https://help.github.com/articles/using-pull-requests
@@ -65,7 +33,7 @@ that duplicate work can be avoided.
## Coding style
Nominatim historically hasn't followed a particular coding style but we
are in process of consolodating the style. The following rules apply:
are in process of consolidating the style. The following rules apply:
* Python code uses the official Python style
* indention
@@ -78,23 +46,21 @@ are in process of consolodating the style. The following rules apply:
* no spaces after opening and before closing bracket
* leave out space between a function name and bracket
but add one between control statement(if, while, etc.) and bracket
* for PHP variables use CamelCase with a prefixing letter indicating the type
(i - integer, f - float, a - array, s - string, o - object)
The coding style is enforced with PHPCS and can be tested with:
The coding style is enforced with PHPCS and pylint. It can be tested with:
```
phpcs --report-width=120 --colors */**.php
phpcs --report-width=120 --colors .
pylint3 --extension-pkg-whitelist=osmium nominatim
```
## Testing
Before submitting a pull request make sure that the following tests pass:
Before submitting a pull request make sure that the tests pass:
```
cd test/bdd
behave -DBUILDDIR=<builddir> db osm2pgsql
```
```
cd test/php
phpunit ./
cd build
make test
```

249
ChangeLog
View File

@@ -1,4 +1,251 @@
3.0
4.0.0
* refactor name token computation and introduce ICU tokenizer
* name processing now happens in the indexer outside the DB
* reorganizes abbreviation handling and moves it to the indexing phases
* adds preprocessing of names
* add country-specific ranking for Spain, Slovakia
* partially switch to using SP-GIST indexes
* better updating of dependent addresses for name changes in streets
* remove unused/broken tables for external housenumbers
* move external postcodes to CSV format and no longer save them in tables
(adds support for postcodes for arbitrary countries)
* remove postcode helper entries from placex (thanks @AntoJvlt)
* change required format for TIGER data to CSV
* move configuration of default languages from wiki into config file
* expect customized configuration files in project directory by default
* disable search API for reverse-only import (thanks @darkshredder)
* port most of maintenance/import code to Python and remove PHP utils
* add catch-up mode for replication
* add updating of special phrases (thanks @AntoJvlt)
* add support for special phrases in CSV files (thanks @AntoJvlt)
* switch to case-independent matching between place and boundary names
* remove disabling of reverse query parsing
* minor tweaks to search algorithm to avoid more false positives
* major overhaul of the administrator and developer documentation
* add security disclosure policy
* add testing of installation scripts via CI
* drop support for Python < 3.6 and Postgresql < 9.5
3.7.2
* fix database check for reverse-only imports
* do not error out in status API result when import date is missing
* add array_key_last function for PHP < 7.3 (thanks to @woodpeck)
* fix more url when server name is unknown (thanks to @mogita)
* commit changes to replication log table
3.7.1
* fix smaller issues with special phrases import (thanks @AntoJvlt)
* add index to speed up continued indexing during import
* fix index on location_property_tiger(parent_place_id) (thanks @changpingc)
* make sure Python code is backward-compatible with Python 3.5
* various documentation fixes
3.7.0
* switch to dotenv for configuration file
* introduce 'make install' (reorganising most of the code)
* introduce nominatim tool as replacement for various php scripts
* introduce project directories and allow multiple installations from same build
* clean up BDD tests: drop nose, reorganise step code
* simplify test database for API BDD tests and autoinstall database
* port most of the code for command-line tools to Python
(thanks to @darkshredder and @AntoJvlt)
* add tests for all tooling
* replace pyosmium-get-changes with custom internal implementation using
pyosmium
* improve search for queries with housenumber and partial terms
* add database versioning
* use jinja2 for preprocessing SQL files
* introduce automatic migrations
* reverse fix preference of interpolations over housenumbers
* parallelize indexing of postcodes
* add non-key indexes to speed up housenumber + street searches
* switch housenumber field in placex to save transliterated names
3.6.0
* add full support for searching by and displaying of addr:* tags
* improve address output for large-area objects
* better use of country names from OSM data for search and display
* better debug output for reverse call
* add support for addr:place links without an place equivalent in OSM
* improve finding postcodes with normalisation artefacts
* batch object to index for rank 30, avoiding a wrap-around of transaction
IDs in PostgreSQL
* introduce dynamic address rank computation for administrative boundaries
depending on linked objects and their place in the admin level hierarchy
* add country-specific address ranking for Indonesia, Russia, Belgium and
the Netherlands (thanks @hendrikmoree)
* make sure wikidata/wikipedia tags are imported for all styles
* make POIs searchable by name and housenumber (thanks @joy-yyd)
* reverse geocoding now ignores places without an address rank (rivers etc.)
* installation of a webserver is no longer mandatory, for development
use the php internal webserver via 'make serve
* reduce the influence of place nodes in addresses
* drop support for the unspecific is_in tag
* various minor tweaks to supplied styles
* move HTML web frontend into its own project
* move scripts for processing external data sources into separate directories
* introduce separate configuration for website (thanks @krahulreddy)
* update documentation, in particular, clean up development docs
* update osm2pgsql to 1.4.0
3.5.2
* ensure that wikipedia tags are imported for all styles
* reinstate verbosity for indexing during updates
* make house number reappear in display name on named POIs
* introduce batch processing in indexer to avoid transaction ID overrun
* increase splitting for large geometries to improve indexing speed
* remove deprecated get_magic_quotes_gpc() function
* make sure that all postcodes have an entry in word and are thus searchable
* remove use of ST_Covers in conjunction woth ST_Intersects,
causes bad query planning and slow updates in Postgis3
* update osm2pgsql
3.5.1
* disable jit and parallel processing in PostgreSQL for osm2pgsql
* update libosmium to 2.15.6 (fixes an issue with processing hanging
on large multipolygons)
3.5.0
* structured select on HTML search page
* new PHP Nominatim\Shell class to wrap shell escaping
* remove polygon parameter from all API calls
* improve handling of postcode areas
* reorganise place linking algorithm, now using wikidata tag as well
* remove linkees from search_name and larger_area tables
* introduce country-specific address ranks
* reorganise rank address computation
* cleanup of partition function
* improve parenting for large POIs
* add support for Postgresql 12 and Postgis 3
* add earlier cleanup when --drop is given, to reduce memory usage
* remove use of place_id in URLs
* replace C nominatim indexer with a simpler Python implementation
* split up the huge sql/functions.sql file
* move osm2pgsql tests to osm2pgsql
* add new extratags style which imports all tags from OSM
* add new script for checking the import after completion
* update osm2pgsql, reducing memory usage
* use new wikipedia importance and add processing of wikidata tags
* add search form for details page
* use ExtraDataPath for country_grid table
* remove short_name from list of names to be displayed
* split up CMakeFile, so that all parts can be built separately
* update installation instructions for CentOS and Ubuntu
* add script for importing/updating multiple country extracts
* various documentation improvements
3.4.2
* fix security bug in /details endpoint where user input was not
properly sanitized
3.4.1
* update osm2pgsql to fix hans during updates and lost address numbers
during updates
3.4.0
* increase required version for PostgreSQL(9.3), PostGIS(2.2) and PHP(7.0)
* better error reporting for out-of-memory errors
* exclude postcode ranges separated by colon from centre point calculation
* update osm2pgsql, better handling of imports without flatnode file
* switch to more efficient algorithm for word set computation
* use only boundries for country and state parts of addresses
* improve updates of addresses with housenumbers and interpolations
* remove country from place_addressline table and use country_code instead
* optimise indexes on search_name partition tables
* improve searching of attached streets for large objects like airports
* drop support for python 2
* new scripts for importing Wikidata for importance
* create and drop indexes concurrently to not clash with auto vacuum
* various documentation improvements
3.3.0
* zoom 17 in reverse now zooms in on minor streets
* fix use of postcode relations in address
* support for housenumber 0 on interpolations
* replace database abstraction DB with PDO and switch to using exceptions
* exclude line features at rank 30 from reverse geocoding
* remove self-reference and country from place_addressline
* make json output more readable (less escaping)
* update conversion scripts for postcodes
* scripts in utils/ are no longer executable (always use scripts in build dir)
* remove Natural Earth country fallback (OSM is complete enough)
* make rank assignments configurable
* allow accept languages with underscore
* new reverse-only import mode (without search index table)
* rely on boundaries only for states and countries
* update osm2pgsql, now using a configurable style
* provide multiple import styles
* improve search when house number and postcodes are dropped
* overhaul of setup code
* add support for PHPUnit 6
* update test database
* various documentation improvements
3.2.0
* complete rewrite of reverse search algorithm
* add new geojson and geocodejson output formats
* add simple export script to exprot addresses to CSV
* remove is_in terms from address computation
* remove unused search_name_country tables
* various smaller fixes to query parsing
* convert Tokens and token types to class types
* correctly handle update when boundary object type is changed
* improve debug output for /search endpoint
* update to latest osm2pgsql and leaflet.js
* overhaul of /details endpoint:
* new class parameter when using osmtype/osmid parameters
* permalink to instance-independent osmtype/osmid parameter format
* new json output format
* update CentOS vagrant machine to use SELinux
* add vagrant scripts for Ubuntu 18.04
* fix build process for BSD
* enable running the database on a different host than the setup scripts
* allow to configure use of custom PHP binaries (PHP_BIN)
* extensive coding style improvements to PHP code
* more PHP unit tests for new classes
* increase coverage for API tests
* add documentation for API
3.1.0
* rework postcode handling and introduce location_postcode table
* make setup less verbose and print a summary in the end
* setup: error out when web site user does not exist
* add more API tests to complete code coverage
* reinstate key-value amenity search (special term [key=value])
* fix detection of coordinates in query
* various smaller tweaks to ranking of search interpretations
* complete overhaul of PHP frontend code using OOP
* add address rank to details page
* update Tiger scripts for 2017 data and clean up unused code
* various bug fixes and improvements to UI
* improve reverse geocoding performance close to coasts
* more PHP style cleanup (quoting)
* allow unnamed road in reverse geocoding to avoid too far off results
* add function to recalculate counts for full-word search term
* add function to check if new updates are available
* update documentation and switch to mkdocs for generating HTML
3.0.1
* fix bug in geometry building algorithm in osm2pgsql
* fix typos in build instructions
3.0.0
* move to cmake build system
* various fixes to HTML output

View File

@@ -1,4 +1,5 @@
[![Build Status](https://travis-ci.org/openstreetmap/Nominatim.svg?branch=master)](https://travis-ci.org/openstreetmap/Nominatim)
[![Build Status](https://github.com/osm-search/Nominatim/workflows/CI%20Tests/badge.svg)](https://github.com/osm-search/Nominatim/actions?query=workflow%3A%22CI+Tests%22)
[![codecov](https://codecov.io/gh/osm-search/Nominatim/branch/master/graph/badge.svg?token=8P1LXrhCMy)](https://codecov.io/gh/osm-search/Nominatim)
Nominatim
=========
@@ -6,23 +7,26 @@ Nominatim
Nominatim (from the Latin, 'by name') is a tool to search OpenStreetMap data
by name and address (geocoding) and to generate synthetic addresses of
OSM points (reverse geocoding). An instance with up-to-date data can be found
at http://nominatim.openstreetmap.org. Nominatim is also used as one of the
sources for the Search box on the OpenStreetMap home page and powers the search
on the MapQuest Open Initiative websites.
at https://nominatim.openstreetmap.org. Nominatim is also used as one of the
sources for the Search box on the OpenStreetMap home page.
Documentation
=============
More information about Nominatim, including usage and installation instructions,
can be found in the docs/ subdirectory and in the OSM wiki at:
http://wiki.openstreetmap.org/wiki/Nominatim
The documentation of the latest development version is in the
`docs/` subdirectory. A HTML version can be found at
https://nominatim.org/release-docs/develop/ .
Installation
============
There are detailed installation instructions in the /docs directory.
Here is a quick summary of the necessary steps.
The latest stable release can be downloaded from https://nominatim.org.
There you can also find [installation instructions for the release](https://nominatim.org/release-docs/latest/admin/Installation), as well as an extensive [Troubleshooting/FAQ section](https://nominatim.org/release-docs/latest/admin/Faq/).
[Detailed installation instructions for current master](https://nominatim.org/release-docs/develop/admin/Installation)
can be found at nominatim.org as well.
A quick summary of the necessary steps:
1. Compile Nominatim:
@@ -30,20 +34,15 @@ Here is a quick summary of the necessary steps.
cd build
cmake ..
make
sudo make install
For more detailed installation instructions see [docs/Installation.md](docs/Installation.md).
There are also step-by-step instructions for
[Ubuntu 16.04](docs/install-on-ubuntu-16.md) and
[CentOS 7.2](docs/install-on-centos-7.md).
2. Create a project directory, get OSM data and import:
2. Get OSM data and import:
./build/utils/setup.php --osm-file <your planet file> --all
Details can be found in [docs/Import_and_update.md](docs/Import_and_update.md)
3. Point your webserver to the ./build/website directory.
mkdir nominatim-project
cd nominatim-project
nominatim import --osm-file <your planet file>
3. Point your webserver to the nominatim-project/website directory.
License
@@ -51,11 +50,18 @@ License
The source code is available under a GPLv2 license.
Contact and Bugreports
======================
For questions you can join the geocoding mailinglist, see
http://lists.openstreetmap.org/listinfo/geocoding
Contributing
============
Bugs may be reported on the github project site:
https://github.com/openstreetmap/Nominatim
Contributions, bugreport and pull requests are welcome.
For details see [contribution guide](CONTRIBUTING.md).
Questions and help
==================
For questions, community help and discussions you can use the
[Github discussions forum](https://github.com/osm-search/Nominatim/discussions)
or join the
[geocoding mailing list](https://lists.openstreetmap.org/listinfo/geocoding).

39
SECURITY.md Normal file
View File

@@ -0,0 +1,39 @@
# Security Policy
## Supported Versions
All Nominatim releases receive security updates for two years.
The following table lists the end of support for all currently supported
versions.
| Version | End of support for security updates |
| ------- | ----------------------------------- |
| 3.7.x | 2023-04-05 |
| 3.6.x | 2022-12-12 |
| 3.5.x | 2022-06-05 |
| 3.4.x | 2021-10-24 |
## Reporting a Vulnerability
If you believe, you have found an issue in Nominatim that has implications on
security, please send a description of the issue to **security@nominatim.org**.
You will receive an acknowledgement of your mail within 3 work days where we
also notify you of the next steps.
## How we Disclose Security Issues
** The following section only applies to security issues found in released
versions. Issues that concern the master development branch only will be
fixed immediately on the branch with the corresponding PR containing the
description of the nature and severity of the issue. **
Patches for identified security issues are applied to all affected versions and
new minor versions are released. At the same time we release a statement at
the [Nominatim blog](https://nominatim.org/blog/) describing the nature of the
incident. Announcements will also be published at the
[geocoding mailinglist](https://lists.openstreetmap.org/listinfo/geocoding).
## List of Previous Incidents
* 2020-05-04 - [SQL injection issue on /details endpoint](https://lists.openstreetmap.org/pipermail/geocoding/2020-May/002012.html)

View File

@@ -1,11 +1,11 @@
# Install Nominatim in a virtual machine for development and testing
This document describes how you can install Nominatim inside a Ubuntu 14
This document describes how you can install Nominatim inside a Ubuntu 16
virtual machine on your desktop/laptop (host machine). The goal is to give
you a development environment to easily edit code and run the test suite
without affecting the rest of your system.
The installation can run largely unsupervised. You should expect 1-2h from
The installation can run largely unsupervised. You should expect 1h from
start to finish depending on how fast your computer and download speed
is.
@@ -19,7 +19,7 @@ is.
git clone --recursive https://github.com/openstreetmap/Nominatim.git
If you haven't used `--recursive`, then you can load the submodules using
If you forgot `--recursive`, it you can later load the submodules using
git submodule init
git submodule update
@@ -37,23 +37,19 @@ is.
vagrant ssh ubuntu
3. Import a small country (Monaco)
You need to give the virtual machine more memory (2GB) for an import,
see `Vagrantfile`. Otherwise 1GB is enough.
See the FAQ how to skip this step and point Nominatim to an existing database.
```
# inside the virtual machine:
mkdir data
cd build
wget --no-verbose --output-document=../data/monaco.osm.pbf http://download.geofabrik.de/europe/monaco-latest.osm.pbf
./utils/setup.php --osm-file ../data/monaco.osm.pbf --osm2pgsql-cache 1000 --all 2>&1 | tee monaco.$$.log
wget --no-verbose --output-document=/tmp/monaco.osm.pbf http://download.geofabrik.de/europe/monaco-latest.osm.pbf
./utils/setup.php --osm-file /tmp/monaco.osm.pbf --osm2pgsql-cache 1000 --all 2>&1 | tee monaco.$$.log
```
To repeat an import you'd need to delete the database first
dropdb -if-exists nominatim
dropdb --if-exists nominatim
@@ -65,45 +61,62 @@ see Nominatim in action on [locahost:8089](http://localhost:8089/nominatim/).
You edit code on your host machine in any editor you like. There is no need to
restart any software: just refresh your browser window.
Note that the webserver uses files from the /build directory. If you change
files in Nominatim/website or Nominatim/utils for example you first need to
copy them into the /build directory by running the `cmake` step from the
installation.
PHP errors are written to `/var/log/apache2/error.log`.
With `echo` and `var_dump()` you write into the output (HTML/XML/JSON) when
you either add `&debug=1` to the URL (preferred) or set
`@define('CONST_Debug', true);` in `settings/local.php`.
In the Python BDD test you can use `logger.info()` for temporary debug
statements.
## Running unit tests
cd ~/Nominatim/tests/php
phpunit ./
## Running PHP code style tests
cd ~/Nominatim
phpcs --colors .
## Running functional tests
Tests in `/features/db` and `/features/osm2pgsql` have to pass 100%. Other
Tests in `test/bdd/db` and `test/bdd/osm2pgsql` have to pass 100%. Other
tests might require full planet-wide data. Sadly even if you have your own
planet-wide data there will be enough differences to the openstreetmap.org
installation to cause false positives in the other tests (see FAQ).
To run the full test suite
cd ~/Nominatim/tests
NOMINATIM_SERVER=http://localhost:8089/nominatim lettuce features
cd ~/Nominatim/test/bdd
behave -DBUILDDIR=/home/vagrant/build/ db osm2pgsql
To run a single file
NOMINATIM_SERVER=http://localhost:8089/nominatim lettuce features/api/reverse.feature
behave -DBUILDDIR=/home/vagrant/build/ api/lookup/simple.feature
Or a single test by line number
behave -DBUILDDIR=/home/vagrant/build/ api/lookup/simple.feature:34
To run specific tests you can add tags just before the `Scenario line`, e.g.
To run specific groups of tests you can add tags just before the `Scenario line`, e.g.
@bug-34
Scenario: address lookup for non-existing or invalid node, way, relation
and then
NOMINATIM_SERVER=http://localhost:8089/nominatim lettuce -t bug-34
## Running unit tests
cd ~/Nominatim/tests-php
phpunit ./
behave -DBUILDDIR=/home/vagrant/build/ --tags @bug-34
@@ -128,27 +141,28 @@ No. Long running Nominatim installations will differ once new import features (o
bug fixes) get added since those usually only get applied to new/changed data.
Also this document skips the optional Wikipedia data import which affects ranking
of search results. See [Nominatim installation](http://wiki.openstreetmap.org/wiki/Nominatim/Installation) for details.
of search results. See [Nominatim installation](https://nominatim.org/release-docs/latest/admin/Installation) for details.
##### Why Ubuntu and CentOS, can I test CentOS/CoreOS/FreeBSD?
##### Why Ubuntu? Can I test CentOS/Fedora/CoreOS/FreeBSD?
There is a Vagrant script for CentOS available. Simply start your box
with `vagrant up centos` and then log in with `vagrant ssh centos`.
In general Nominatim will also run in the other environments. The installation steps
There is a Vagrant script for CentOS available, but the Nominatim directory
isn't symlinked/mounted to the host which makes development trickier. We used
it mainly for debugging installation with SELinux.
In general Nominatim will run in the other environments. The installation steps
are slightly different, e.g. the name of the package manager, Apache2 package
name, location of files. We chose Ubuntu because that is closest to the
nominatim.openstreetmap.org production environment.
You can configure/download other Vagrant boxes from [vagrantbox.es](http://www.vagrantbox.es/).
You can configure/download other Vagrant boxes from [https://app.vagrantup.com/boxes/search](https://app.vagrantup.com/boxes/search).
##### How can I connect to an existing database?
Let's say you have a Postgres database named `nominatim_it` on server `your-server.com` and port `5432`. The Postgres username is `postgres`. You can edit `settings/local.php` and point Nominatim to it.
pgsql://postgres@your-server.com:5432/nominatim_it
pgsql:host=your-server.com;port=5432;user=postgres;dbname=nominatim_it
No data import necessary, no restarting necessary.
No data import or restarting necessary.
If the Postgres installation is behind a firewall, you can try
@@ -157,14 +171,15 @@ If the Postgres installation is behind a firewall, you can try
inside the virtual machine. It will map the port to `localhost:9999` and then
you edit `settings/local.php` with
pgsql://postgres@localhost:9999/nominatim_it
@define('CONST_Database_DSN', 'pgsql:host=localhost;port=9999;user=postgres;dbname=nominatim_it');
To access postgres directly remember to specify the hostname, e.g. `psql --host localhost --port 9999 nominatim_it`
##### My computer is slow and the import takes too long. Can I start the virtual machine "in the cloud"?
Yes. It's possible to start the virtual machine on [Amazon AWS (plugin)](https://github.com/mitchellh/vagrant-aws) or [DigitalOcean (plugin)](https://github.com/smdahlen/vagrant-digitalocean).
Yes. It's possible to start the virtual machine on [Amazon AWS (plugin)](https://github.com/mitchellh/vagrant-aws)
or [DigitalOcean (plugin)](https://github.com/smdahlen/vagrant-digitalocean).

103
Vagrantfile vendored
View File

@@ -4,36 +4,92 @@
Vagrant.configure("2") do |config|
# Apache webserver
config.vm.network "forwarded_port", guest: 80, host: 8089
config.vm.network "forwarded_port", guest: 8088, host: 8088
# If true, then any SSH connections made will enable agent forwarding.
config.ssh.forward_agent = true
# Never sync the current directory to /vagrant.
config.vm.synced_folder ".", "/vagrant", disabled: true
checkout = "yes"
if ENV['CHECKOUT'] != 'y' then
config.vm.synced_folder ".", "/home/vagrant/Nominatim"
checkout = "no"
checkout = "no"
end
config.vm.provider "virtualbox" do |vb, override|
vb.gui = false
vb.memory = 2048
vb.customize ["setextradata", :id, "VBoxInternal2/SharedFoldersEnableSymlinksCreate//vagrant","0"]
if ENV['CHECKOUT'] != 'y' then
override.vm.synced_folder ".", "/home/vagrant/Nominatim"
end
end
config.vm.provider "libvirt" do |lv, override|
lv.memory = 2048
lv.nested = true
if ENV['CHECKOUT'] != 'y' then
override.vm.synced_folder ".", "/home/vagrant/Nominatim", type: 'nfs'
end
end
config.vm.define "ubuntu", primary: true do |sub|
sub.vm.box = "bento/ubuntu-16.04"
sub.vm.box = "generic/ubuntu2004"
sub.vm.provision :shell do |s|
s.path = "vagrant/Install-on-Ubuntu-16.sh"
s.path = "vagrant/Install-on-Ubuntu-20.sh"
s.privileged = false
s.args = [checkout]
end
end
config.vm.define "travis" do |sub|
sub.vm.box = "bento/ubuntu-14.04"
config.vm.define "ubuntu-apache" do |sub|
sub.vm.box = "generic/ubuntu2004"
sub.vm.provision :shell do |s|
s.path = "vagrant/install-on-travis-ci.sh"
s.path = "vagrant/Install-on-Ubuntu-20.sh"
s.privileged = false
s.args = [checkout, "install-apache"]
end
end
config.vm.define "ubuntu-nginx" do |sub|
sub.vm.box = "generic/ubuntu2004"
sub.vm.provision :shell do |s|
s.path = "vagrant/Install-on-Ubuntu-20.sh"
s.privileged = false
s.args = [checkout, "install-nginx"]
end
end
config.vm.define "ubuntu18" do |sub|
sub.vm.box = "generic/ubuntu1804"
sub.vm.provision :shell do |s|
s.path = "vagrant/Install-on-Ubuntu-18.sh"
s.privileged = false
s.args = [checkout]
end
end
config.vm.define "centos" do |sub|
sub.vm.box = "bento/centos-7.2"
config.vm.define "ubuntu18-apache" do |sub|
sub.vm.box = "generic/ubuntu1804"
sub.vm.provision :shell do |s|
s.path = "vagrant/Install-on-Ubuntu-18.sh"
s.privileged = false
s.args = [checkout, "install-apache"]
end
end
config.vm.define "ubuntu18-nginx" do |sub|
sub.vm.box = "generic/ubuntu1804"
sub.vm.provision :shell do |s|
s.path = "vagrant/Install-on-Ubuntu-18.sh"
s.privileged = false
s.args = [checkout, "install-nginx"]
end
end
config.vm.define "centos7" do |sub|
sub.vm.box = "centos/7"
sub.vm.provision :shell do |s|
s.path = "vagrant/Install-on-Centos-7.sh"
s.privileged = false
@@ -41,29 +97,14 @@ Vagrant.configure("2") do |config|
end
end
# configure shared package cache if possible
#if Vagrant.has_plugin?("vagrant-cachier")
# config.cache.enable :apt
# config.cache.scope = :box
#end
config.vm.provider "virtualbox" do |vb|
vb.gui = false
vb.customize ["modifyvm", :id, "--memory", "2048"]
config.vm.define "centos" do |sub|
sub.vm.box = "generic/centos8"
sub.vm.provision :shell do |s|
s.path = "vagrant/Install-on-Centos-8.sh"
s.privileged = false
s.args = [checkout]
end
end
# config.vm.provider :digital_ocean do |provider, override|
# override.ssh.private_key_path = '~/.ssh/id_rsa'
# override.vm.box = 'digital_ocean'
# override.vm.box_url = "https://github.com/smdahlen/vagrant-digitalocean/raw/master/box/digital_ocean.box"
# provider.token = ''
# # provider.token = 'YOUR TOKEN'
# provider.image = 'ubuntu-14-04-x64'
# provider.region = 'nyc2'
# provider.size = '512mb'
# end
end

17
cmake/tool-installed.tmpl Normal file
View File

@@ -0,0 +1,17 @@
#!/usr/bin/env python3
import sys
import os
sys.path.insert(1, '@NOMINATIM_LIBDIR@/lib-python')
os.environ['NOMINATIM_NOMINATIM_TOOL'] = os.path.abspath(__file__)
from nominatim import cli
exit(cli.nominatim(module_dir='@NOMINATIM_LIBDIR@/module',
osm2pgsql_path='@NOMINATIM_LIBDIR@/osm2pgsql',
phplib_dir='@NOMINATIM_LIBDIR@/lib-php',
sqllib_dir='@NOMINATIM_LIBDIR@/lib-sql',
data_dir='@NOMINATIM_DATADIR@',
config_dir='@NOMINATIM_CONFIGDIR@',
phpcgi_path='@PHPCGI_BIN@'))

17
cmake/tool.tmpl Executable file
View File

@@ -0,0 +1,17 @@
#!/usr/bin/env python3
import sys
import os
sys.path.insert(1, '@CMAKE_SOURCE_DIR@')
os.environ['NOMINATIM_NOMINATIM_TOOL'] = os.path.abspath(__file__)
from nominatim import cli
exit(cli.nominatim(module_dir='@CMAKE_BINARY_DIR@/module',
osm2pgsql_path='@CMAKE_BINARY_DIR@/osm2pgsql/osm2pgsql',
phplib_dir='@CMAKE_SOURCE_DIR@/lib-php',
sqllib_dir='@CMAKE_SOURCE_DIR@/lib-sql',
data_dir='@CMAKE_SOURCE_DIR@/data',
config_dir='@CMAKE_SOURCE_DIR@/settings',
phpcgi_path='@PHPCGI_BIN@'))

14
codecov.yml Normal file
View File

@@ -0,0 +1,14 @@
codecov:
require_ci_to_pass: yes
coverage:
status:
project: off
patch: off
comment:
require_changes: true
after_n_builds: 2
fixes:
- "Nominatim/::"

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

View File

@@ -1,26 +0,0 @@
-- This data contains Ordnance Survey data © Crown copyright and database right 2010.
-- Code-Point Open contains Royal Mail data © Royal Mail copyright and database right 2010.
-- OS data may be used under the terms of the OS OpenData licence:
-- http://www.ordnancesurvey.co.uk/oswebsite/opendata/licence/docs/licence.pdf
SET statement_timeout = 0;
SET client_encoding = 'UTF8';
SET standard_conforming_strings = off;
SET check_function_bodies = false;
SET client_min_messages = warning;
SET escape_string_warning = off;
SET search_path = public, pg_catalog;
SET default_tablespace = '';
SET default_with_oids = false;
CREATE TABLE gb_postcode (
id integer,
postcode character varying(9),
geometry geometry,
CONSTRAINT enforce_dims_geometry CHECK ((st_ndims(geometry) = 2)),
CONSTRAINT enforce_srid_geometry CHECK ((st_srid(geometry) = 4326))
);

File diff suppressed because it is too large Load Diff

View File

@@ -18,12 +18,13 @@ SET default_with_oids = false;
-- Name: word_frequencies; Type: TABLE; Schema: public; Owner: -; Tablespace:
--
DROP TABLE IF EXISTS word_frequencies;
CREATE TABLE word_frequencies (
word_token text,
count bigint
);
--
-- Data for Name: word_frequencies; Type: TABLE DATA; Schema: public; Owner: -
--
@@ -29786,8 +29787,7 @@ st 5557484
-- prefill word table
select count(make_keywords(v)) from (select distinct svals(name) as v from place) as w where v is not null;
select count(make_keywords(v)) from (select distinct address->'postcode' as v from place where address ? 'postcode') as w where v is not null;
select count(precompute_words(v)) from (select distinct svals(name) as v from place) as w where v is not null;
select count(getorcreate_housenumber_id(make_standard_name(v))) from (select distinct address->'housenumber' as v from place where address ? 'housenumber') as w;
-- copy the word frequencies

View File

@@ -1,32 +1,36 @@
# Auto-generated vagrant install documentation
set (INSTALLDOCFILES
Install-on-Centos-7
Install-on-Ubuntu-16
# build the actual documentation
configure_file(mkdocs.yml ../mkdocs.yml)
file(MAKE_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR}/appendix)
set (DOC_SOURCES
admin
develop
api
customize
index.md
extra.css
styles.css
)
foreach (df ${INSTALLDOCFILES})
ADD_CUSTOM_COMMAND( OUTPUT ${CMAKE_CURRENT_BINARY_DIR}/${df}.md
COMMAND ${CMAKE_CURRENT_SOURCE_DIR}/bash2md.sh ${PROJECT_SOURCE_DIR}/vagrant/${df}.sh ${CMAKE_CURRENT_BINARY_DIR}/${df}.md
MAIN_DEPENDENCY ${PROJECT_SOURCE_DIR}/vagrant/${df}.sh
DEPENDS ${CMAKE_CURRENT_SOURCE_DIR}/bash2md.sh
COMMENT "Creating markdown docs from vagrant/${df}.sh"
)
ADD_CUSTOM_TARGET( md_install_${df} ALL
DEPENDS ${CMAKE_CURRENT_BINARY_DIR}/${df}.md
)
endforeach()
# Copied static documentation
set (GENERALDOCFILES
Installation.md
Import-and-Update.md
Faq.md
foreach (src ${DOC_SOURCES})
execute_process(
COMMAND ${CMAKE_COMMAND} -E create_symlink ${CMAKE_CURRENT_SOURCE_DIR}/${src} ${CMAKE_CURRENT_BINARY_DIR}/${src}
)
foreach (df ${GENERALDOCFILES})
CONFIGURE_FILE(${df} ${df})
endforeach()
ADD_CUSTOM_TARGET(doc
COMMAND ${CMAKE_CURRENT_SOURCE_DIR}/bash2md.sh ${PROJECT_SOURCE_DIR}/vagrant/Install-on-Centos-7.sh ${CMAKE_CURRENT_BINARY_DIR}/appendix/Install-on-Centos-7.md
COMMAND ${CMAKE_CURRENT_SOURCE_DIR}/bash2md.sh ${PROJECT_SOURCE_DIR}/vagrant/Install-on-Centos-8.sh ${CMAKE_CURRENT_BINARY_DIR}/appendix/Install-on-Centos-8.md
COMMAND ${CMAKE_CURRENT_SOURCE_DIR}/bash2md.sh ${PROJECT_SOURCE_DIR}/vagrant/Install-on-Ubuntu-18.sh ${CMAKE_CURRENT_BINARY_DIR}/appendix/Install-on-Ubuntu-18.md
COMMAND ${CMAKE_CURRENT_SOURCE_DIR}/bash2md.sh ${PROJECT_SOURCE_DIR}/vagrant/Install-on-Ubuntu-20.sh ${CMAKE_CURRENT_BINARY_DIR}/appendix/Install-on-Ubuntu-20.md
COMMAND PYTHONPATH=${PROJECT_SOURCE_DIR} mkdocs build -d ${CMAKE_CURRENT_BINARY_DIR}/../site-html -f ${CMAKE_CURRENT_BINARY_DIR}/../mkdocs.yml
)
ADD_CUSTOM_TARGET(serve-doc
COMMAND PYTHONPATH=${PROJECT_SOURCE_DIR} mkdocs serve
WORKING_DIRECTORY ${PROJECT_BINARY_DIR}
)

View File

@@ -1,127 +0,0 @@
Frequently Asked Questions
==========================
Running Your Own Instance
-------------------------
### Can I import only a few countries and also keep them up to date?
You should use the extracts and updates from http://download.geofabrik.de.
For the intial import, download the countries you need and merge them.
See [OSM Help](https://help.openstreetmap.org/questions/48843/merging-two-or-more-geographical-areas-to-import-two-or-more-osm-files-in-nominatim)
for examples how to do that. Use the resulting single osm file when
running `setup.php`.
For updates you need to download the change files for each country
once per day and apply them **separately** using
./utils/update.php --import-diff <filename> --index
See [this issue](https://github.com/openstreetmap/Nominatim/issues/60#issuecomment-18679446)
for a script that runs the updates using osmosis.
### My website shows: `XML Parsing Error: XML or text declaration not at start of entity Location</code>.`
Make sure there are no spaces at the beginning of your `settings/local.php` file.
Installation
------------
### I accidentally killed the import process after it has been running for many hours. Can it be resumed?
It is possible if the import already got to the indexing stage.
Check the last line of output that was logged before the process
was killed. If it looks like this:
Done 844 in 13 @ 64.923080 per second - Rank 26 ETA (seconds): 7.886255
then you can resume with the following command:
./utils/setup.php --index --create-search-indices --create-country-names
If the reported rank is 26 or higher, you can also safely add `-index-noanalyse`.
### When running the setup.php script I get a warning:
`PHP Warning: file_get_contents(): open_basedir restriction in effect.`
You need to adjust the [open_basedir](http://www.php.net/manual/en/ini.core.php#ini.open-basedir) setting
in your PHP configuration (php.ini file). By default this setting may look like this:
open_basedir = /srv/http/:/home/:/tmp/:/usr/share/pear/
Either add reported directories to the list or disable this setting temporarily by
dding ";" at the beginning of the line. Don't forget to enable this setting again
once you are done with the PHP command line operations.
### The Apache log contains lots of PHP warnings like this:
`PHP Warning: date_default_timezone_set() function.`
You should set the default time zone as instructed in the warning in
your `php.ini` file. Find the entry about timezone and set it to
something like this:
; Defines the default timezone used by the date functions
; http://php.net/date.timezone
date.timezone = 'America/Denver'
Or
echo "date.timezone = 'America/Denver'" > /etc/php.d/timezone.ini
### When running the import I get a version mismatch:
`COPY_END for place failed: ERROR: incompatible library "/opt/Nominatim/module/nominatim.so": version mismatch`
pg_config seems to use bad includes sometimes when multiple versions
of PostgreSQL are available in the system. Make sure you remove the
server development libraries (`postgresql-server-dev-9.1` on Ubuntu)
and recompile (`cmake .. && make`).
### I see the error: `function transliteration(text) does not exist`
Reinstall the nominatim functions with `setup.php --create--functions`
and check for any errors, e.g. a missing `nominatim.so` file.
### The website shows: `Could not get word tokens`
The server cannot access your database. Add `&debug=1` to your URL
to get the full error message.
### On CentOS the website shows `could not connect to server: No such file or directory`
On CentOS v7 the PostgreSQL server is started with `systemd`.
Check if `/usr/lib/systemd/system/httpd.service` contains a line `PrivateTmp=true`.
If so then Apache cannot see the `/tmp/.s.PGSQL.5432` file. It's a good security feature,
so use the [[#PostgreSQL_UNIX_Socket_Location_on_CentOS|preferred solution]]
However, you can solve this the quick and dirty way by commenting out that line and then run
sudo systemctl daemon-reload
sudo systemctl restart httpd
### Setup.php fails with the message: `DB Error: extension not found`
Make sure you have the Postgres extensions hstore and postgis installed.
See the installation instruction for a full list of required packages.
### When running the setup.php script I get a error:
`Cannot redeclare getDB() (previously declared in /your/path/Nominatim/lib/db.php:4)`
The message is a bit misleading as PHP needs to load the file `DB.php` and
instead re-loads Nominatim's `db.php`. To solve this make sure you
have the [http://pear.php.net/package/DB/ Pear module 'DB'] installed.
sudo pear install DB
### I forgot to delete the flatnodes file before starting an import
That's fine. For each import the flatnodes file get overwritten.
See https://help.openstreetmap.org/questions/52419/nominatim-flatnode-storage
for more information.

View File

@@ -1,177 +0,0 @@
Importing a new database
========================
The following instructions explain how to create a Nominatim database
from an OSM planet file and how to keep the database up to date. It
is assumed that you have already successfully installed the Nominatim
software itself, if not return to the [installation page](Installation.md).
Configuration setup in settings/local.php
-----------------------------------------
There are lots of configuration settings you can tweak. Have a look
at `settings/settings.php` for a full list. Most should have a sensible default.
If you plan to import a large dataset (e.g. Europe, North America, planet),
you should also enable flatnode storage of node locations. With this
setting enabled, node coordinates are stored in a simple file instead
of the database. This will save you import time and disk storage.
Add to your `settings/local.php`:
@define('CONST_Osm2pgsql_Flatnode_File', '/path/to/flatnode.file');
Replace the second part with a suitable path on your system and make sure
the directory exists. There should be at least 35GB of free space.
Downloading additional data
---------------------------
### Wikipedia rankings
Wikipedia can be used as an optional auxiliary data source to help indicate
the importance of osm features. Nominatim will work without this information
but it will improve the quality of the results if this is installed.
This data is available as a binary download:
cd $NOMINATIM_SOURCE_DIR/data
wget http://www.nominatim.org/data/wikipedia_article.sql.bin
wget http://www.nominatim.org/data/wikipedia_redirect.sql.bin
Combined the 2 files are around 1.5GB and add around 30GB to the install
size of nominatim. They also increase the install time by an hour or so.
### UK postcodes
Nominatim can use postcodes from an external source to improve searches that involve a UK postcode. This data can be optionally downloaded:
cd $NOMINATIM_SOURCE_DIR/data
wget http://www.nominatim.org/data/gb_postcode_data.sql.gz
Initial import of the data
--------------------------
**Important:** first try the import with a small excerpt, for example from
[Geofabrik](http://download.geofabrik.de).
Download the data to import and load the data with the following command:
./utils/setup.php --osm-file <your data file> --all [--osm2pgsql-cache 28000] 2>&1 | tee setup.log
The `--osm2pgsql-cache` parameter is optional but strongly recommended for
planet imports. It sets the node cache size for the osm2pgsql import part
(see `-C` parameter in osm2pgsql help). 28GB are recommended for a full planet
import, for excerpts you can use less. Adapt to your available RAM to
avoid swapping, never give more than 2/3 of RAM to osm2pgsql.
Loading additional datasets
---------------------------
The following commands will create additional entries for POI searches:
./utils/specialphrases.php --wiki-import > specialphrases.sql
psql -d nominatim -f specialphrases.sql
Installing Tiger housenumber data for the US
============================================
Nominatim is able to use the official TIGER address set to complement the
OSM housenumber data in the US. You can add TIGER data to your own Nominatim
instance by following these steps:
1. Install the GDAL library and python bindings and the unzip tool
* Ubuntu: `sudo apt-get install python-gdal unzip`
* CentOS: `sudo yum install gdal-python unzip`
2. Get the TIGER 2015 data. You will need the EDGES files
(3,234 zip files, 11GB total). Choose one of the two sources:
wget -r ftp://ftp2.census.gov/geo/tiger/TIGER2015/EDGES/
wget -r ftp://mirror1.shellbot.com/census/geo/tiger/TIGER2015/EDGES/
The first one is the original source, the second a considerably faster
mirror.
3. Convert the data into SQL statements (stored in data/tiger):
./utils/imports.php --parse-tiger <tiger edge data directory>
4. Import the data into your Nominatim database:
./utils/setup.php --import-tiger-data
5. Enable use of the Tiger data in your `settings/local.php` by adding:
@define('CONST_Use_US_Tiger_Data', true);
6. Apply the new settings:
./utils/setup.php --create-functions --enable-diff-updates --create-partition-functions
Be warned that the import can take a very long time, especially if you
import all of the US. The entire US adds about 10GB to your database.
Updates
=======
There are many different possibilities to update your Nominatim database.
The following section describes how to keep it up-to-date with Pyosmium.
For a list of other methods see the output of `./utils/update.php --help`.
Installing the newest version of Pyosmium
-----------------------------------------
It is recommended to install Pyosmium via pip:
pip install --user osmium
Nominatim needs a tool called `pyosmium-get-updates` that comes with
Pyosmium. You need to tell Nominatim where to find it. Add the
following line to your `settings/local.php`:
@define('CONST_Pyosmium_Binary', '/home/user/.local/bin/pyosmium-get-changes');
The path above is fine if you used the `--user` parameter with pip.
Replace `user` with your user name.
Setting up the update process
-----------------------------
Next the update needs to be initialised. By default Nominatim is configured
to update using the global minutely diffs.
If you want a different update source you will need to add some settings
to `settings/local.php`. For example, to use the daily country extracts
diffs for Ireland from geofabrik add the following:
// base URL of the replication service
@define('CONST_Replication_Url', 'http://download.geofabrik.de/europe/ireland-and-northern-ireland-updates');
// How often upstream publishes diffs
@define('CONST_Replication_Update_Interval', '86400');
// How long to sleep if no update found yet
@define('CONST_Replication_Recheck_Interval', '900');
To set up the update process now run the following command:
./utils/update --init-updates
It outputs the date where updates will start. Recheck that this date is
what you expect.
The --init-updates command needs to be rerun whenever the replication service
is changed.
Updating Nominatim
------------------
The following command will keep your database constantly up to date:
./utils/update.php --import-osmosis-all
If you have imported multiple country extracts and want to keep them
up-to-date, have a look at the script in
[issue #60](https://github.com/openstreetmap/Nominatim/issues/60).

View File

@@ -1,181 +0,0 @@
*Note:* these installation instructions are also available in executable
form for use with vagrant under `vagrant/Install-on-Centos-7.sh`.
Installing the Required Software
================================
These instructions expect that you have a freshly installed CentOS version 7.
Make sure all packages are up-to-date by running:
sudo yum update -y
The standard CentOS repositories don't contain all the required packages,
you need to enable the EPEL repository as well. To enable it on CentOS,
install the epel-release RPM by running:
sudo yum install -y epel-release
Now you can install all packages needed for Nominatim:
sudo yum install -y postgresql-server postgresql-contrib postgresql-devel postgis postgis-utils \
git cmake make gcc gcc-c++ libtool policycoreutils-python \
php-pgsql php php-pear php-pear-DB php-intl libpqxx-devel proj-epsg \
bzip2-devel proj-devel geos-devel libxml2-devel boost-devel expat-devel zlib-devel
If you want to run the test suite, you need to install the following
additional packages:
sudo yum install -y python-pip python-Levenshtein python-psycopg2 \
python-numpy php-phpunit-PHPUnit
pip install --user --upgrade pip setuptools lettuce==0.2.18 six==1.9 \
haversine Shapely pytidylib
sudo pear install PHP_CodeSniffer
System Configuration
====================
The following steps are meant to configure a fresh CentOS installation
for use with Nominatim. You may skip some of the steps if you have your
OS already configured.
Creating Dedicated User Accounts
--------------------------------
Nominatim will run as a global service on your machine. It is therefore
best to install it under its own separate user account. In the following
we assume this user is called nominatim and the installation will be in
/srv/nominatim. To create the user and directory run:
sudo useradd -d /srv/nominatim -s /bin/bash -m nominatim
You may find a more suitable location if you wish.
To be able to copy and paste instructions from this manual, export
user name and home directory now like this:
export USERNAME=nominatim
export USERHOME=/srv/nominatim
**Never, ever run the installation as a root user.** You have been warned.
Make sure that system servers can read from the home directory:
chmod a+x $USERHOME
Setting up PostgreSQL
---------------------
CentOS does not automatically create a database cluster. Therefore, start
with initializing the database, then enable the server to start at boot:
sudo postgresql-setup initdb
sudo systemctl enable postgresql
Next tune the postgresql configuration, which is located in
`/var/lib/pgsql/data/postgresql.conf`. See section *Postgres Tuning* in
[the installation page](Installation.md) for the parameters to change.
Now start the postgresql service after updating this config file.
sudo systemctl restart postgresql
Finally, we need to add two postgres users: one for the user that does
the import and another for the webserver which should access the database
only for reading:
sudo -u postgres createuser -s $USERNAME
sudo -u postgres createuser apache
Setting up the Apache Webserver
-------------------------------
You need to create an alias to the website directory in your apache
configuration. Add a separate nominatim configuration to your webserver:
```
sudo tee /etc/httpd/conf.d/nominatim.conf << EOFAPACHECONF
<Directory "$USERHOME/Nominatim/build/website">
Options FollowSymLinks MultiViews
AddType text/html .php
DirectoryIndex search.php
Require all granted
</Directory>
Alias /nominatim $USERHOME/Nominatim/build/website
EOFAPACHECONF
```
Then reload apache
sudo systemctl restart httpd
Adding SELinux Security Settings
--------------------------------
It is a good idea to leave SELinux enabled and enforcing, particularly
with a web server accessible from the Internet. At a minimum the
following SELinux labeling should be done for Nominatim:
sudo semanage fcontext -a -t httpd_sys_content_t "$USERHOME/Nominatim/(website|lib|settings)(/.*)?"
sudo semanage fcontext -a -t lib_t "$USERHOME/Nominatim/module/nominatim.so"
sudo restorecon -R -v $USERHOME/Nominatim
Installing Nominatim
====================
Building and Configuration
--------------------------
Get the source code from Github and change into the source directory
cd $USERHOME
git clone --recursive git://github.com/openstreetmap/Nominatim.git
cd Nominatim
When installing the latest source from github, you also need to
download the country grid:
wget -O data/country_osm_grid.sql.gz http://www.nominatim.org/data/country_grid.sql.gz
The code must be built in a separate directory. Create this directory,
then configure and build Nominatim in there:
mkdir build
cd build
cmake $USERHOME/Nominatim
make
You need to create a minimal configuration file that tells nominatim
the name of your webserver user and the URL of the website:
```
tee settings/local.php << EOF
<?php
@define('CONST_Database_Web_User', 'apache');
@define('CONST_Website_BaseURL', '/nominatim/');
EOF
```
Nominatim is now ready to use. Continue with
[importing a database from OSM data](Import-and-Update.md).

View File

@@ -1,167 +0,0 @@
*Note:* these installation instructions are also available in executable
form for use with vagrant under vagrant/Install-on-Ubuntu-16.sh.
Installing the Required Software
================================
These instructions expect that you have a freshly installed Ubuntu 16.04.
Make sure all packages are are up-to-date by running:
sudo apt-get update -qq
Now you can install all packages needed for Nominatim:
sudo apt-get install -y build-essential cmake g++ libboost-dev libboost-system-dev \
libboost-filesystem-dev libexpat1-dev zlib1g-dev libxml2-dev\
libbz2-dev libpq-dev libgeos-dev libgeos++-dev libproj-dev \
postgresql-server-dev-9.5 postgresql-9.5-postgis-2.2 postgresql-contrib-9.5 \
apache2 php php-pgsql libapache2-mod-php php-pear php-db \
php-intl git
If you want to run the test suite, you need to install the following
additional packages:
sudo apt-get install -y python3-dev python3-pip python3-psycopg2 python3-tidylib phpunit
pip3 install --user behave nose # urllib3
sudo pear install PHP_CodeSniffer
System Configuration
====================
The following steps are meant to configure a fresh Ubuntu installation
for use with Nominatim. You may skip some of the steps if you have your
OS already configured.
Creating Dedicated User Accounts
--------------------------------
Nominatim will run as a global service on your machine. It is therefore
best to install it under its own separate user account. In the following
we assume this user is called nominatim and the installation will be in
/srv/nominatim. To create the user and directory run:
sudo useradd -d /srv/nominatim -s /bin/bash -m nominatim
You may find a more suitable location if you wish.
To be able to copy and paste instructions from this manual, export
user name and home directory now like this:
export USERNAME=nominatim
export USERHOME=/srv/nominatim
**Never, ever run the installation as a root user.** You have been warned.
Make sure that system servers can read from the home directory:
chmod a+x $USERHOME
Setting up PostgreSQL
---------------------
Tune the postgresql configuration, which is located in
`/etc/postgresql/9.5/main/postgresql.conf`. See section *Postgres Tuning* in
[the installation page](Installation.md) for the parameters to change.
Restart the postgresql service after updating this config file.
sudo systemctl restart postgresql
Finally, we need to add two postgres users: one for the user that does
the import and another for the webserver which should access the database
for reading only:
sudo -u postgres createuser -s $USERNAME
sudo -u postgres createuser www-data
Setting up the Apache Webserver
-------------------------------
You need to create an alias to the website directory in your apache
configuration. Add a separate nominatim configuration to your webserver:
```
sudo tee /etc/apache2/conf-available/nominatim.conf << EOFAPACHECONF
<Directory "$USERHOME/Nominatim/build/website">
Options FollowSymLinks MultiViews
AddType text/html .php
DirectoryIndex search.php
Require all granted
</Directory>
Alias /nominatim $USERHOME/Nominatim/build/website
EOFAPACHECONF
```
Then enable the configuration and restart apache
sudo a2enconf nominatim
sudo systemctl restart apache2
Installing Nominatim
====================
Building and Configuration
--------------------------
Get the source code from Github and change into the source directory
cd $USERHOME
git clone --recursive git://github.com/openstreetmap/Nominatim.git
cd Nominatim
When installing the latest source from github, you also need to
download the country grid:
wget -O data/country_osm_grid.sql.gz http://www.nominatim.org/data/country_grid.sql.gz
The code must be built in a separate directory. Create this directory,
then configure and build Nominatim in there:
mkdir build
cd build
cmake $USERHOME/Nominatim
make
You need to create a minimal configuration file that tells nominatim
where it is located on the webserver:
```
tee settings/local.php << EOF
<?php
@define('CONST_Website_BaseURL', '/nominatim/');
EOF
```
Nominatim is now ready to use. Continue with
[importing a database from OSM data](Import-and-Update.md).

View File

@@ -1,154 +0,0 @@
Nominatim installation
======================
This page contains generic installation instructions for Nominatim and its
prerequisites. There are also step-by-step instructions available for
the following operating systems:
* [Ubuntu 16.04](Install-on-Ubuntu-16.md)
* [CentOS 7.2](Install-on-Centos-7.md)
These OS-specific instructions can also be found in executable form
in the `vagrant/` directory.
Prerequisites
-------------
### Software
For compiling:
* [cmake](https://cmake.org/)
* [libxml2](http://xmlsoft.org/)
* a recent C++ compiler
Nominatim comes with its own version of osm2pgsql. See the
[osm2pgsql README](../osm2pgsql/README.md) for additional dependencies
required for compiling osm2pgsql.
For running tests:
* [behave](http://pythonhosted.org/behave/)
* [Psycopg2](http://initd.org/psycopg)
* [nose](https://nose.readthedocs.io)
* [phpunit](https://phpunit.de)
For running Nominatim:
* [PostgreSQL](http://www.postgresql.org) (9.1 or later)
* [PostGIS](http://postgis.refractions.net) (2.0 or later)
* [PHP](http://php.net) (5.4 or later)
* PHP-pgsql
* PHP-intl (bundled with PHP)
* [PEAR::DB](http://pear.php.net/package/DB)
* a webserver (apache or nginx are recommended)
For running continuous updates:
* [pyosmium](http://osmcode.org/pyosmium/)
### Hardware
A minimum of 2GB of RAM is required or installation will fail. For a full
planet import 32GB of RAM or more strongly are recommended.
For a full planet install you will need about 500GB of hard disk space (as of
June 2016, take into account that the OSM database is growing fast). SSD disks
will help considerably to speed up import and queries.
On a 6-core machine with 32GB RAM and SSDs the import of a full planet takes
a bit more than 2 days. Without SSDs 7-8 days are more realistic.
Setup of the server
-------------------
### PostgreSQL tuning
You might want to tune your PostgreSQL installation so that the later steps
make best use of your hardware. You should tune the following parameters in
your `postgresql.conf` file.
shared_buffers (2GB)
maintenance_work_mem (10GB)
work_mem (50MB)
effective_cache_size (24GB)
synchronous_commit = off
checkpoint_segments = 100 # only for postgresql <= 9.4
checkpoint_timeout = 10min
checkpoint_completion_target = 0.9
The numbers in brackets behind some parameters seem to work fine for
32GB RAM machine. Adjust to your setup.
For the initial import, you should also set:
fsync = off
full_page_writes = off
Don't forget to reenable them after the initial import or you risk database
corruption. Autovacuum must not be switched off because it ensures that the
tables are frequently analysed.
### Webserver setup
The `website/` directory in the build directory contains the configured
website. Include the directory into your webbrowser to serve php files
from there.
#### Configure for use with Apache
Make sure your Apache configuration contains the required permissions for the
directory and create an alias:
<Directory "/srv/nominatim/build/website">
Options FollowSymLinks MultiViews
AddType text/html .php
DirectoryIndex search.php
Require all granted
</Directory>
Alias /nominatim /srv/nominatim/build/website
`/srv/nominatim/build` should be replaced with the location of your
build directory.
After making changes in the apache config you need to restart apache.
The website should now be available on http://localhost/nominatim.
#### Configure for use with Nginx
Use php-fpm as a deamon for serving PHP cgi. Install php-fpm together with nginx.
By default php listens on a network socket. If you want it to listen to a
Unix socket instead, change the pool configuration (`pool.d/www.conf`) as
follows:
; Comment out the tcp listener and add the unix socket
;listen = 127.0.0.1:9000
listen = /var/run/php5-fpm.sock
; Ensure that the daemon runs as the correct user
listen.owner = www-data
listen.group = www-data
listen.mode = 0666
Tell nginx that php files are special and to fastcgi_pass to the php-fpm
unix socket by adding the location definition to the default configuration.
root /srv/nominatim/build/website;
index search.php index.html;
location ~ [^/]\.php(/|$) {
fastcgi_split_path_info ^(.+?\.php)(/.*)$;
if (!-f $document_root$fastcgi_script_name) {
return 404;
}
fastcgi_pass unix:/var/run/php5-fpm.sock;
fastcgi_index search.php;
include fastcgi.conf;
}
Restart the nginx and php5-fpm services and the website should now be available
on http://localhost/.
Now continue with [importing the database](Import-and-Update.md).

View File

@@ -0,0 +1,215 @@
# Advanced installations
This page contains instructions for setting up multiple countries in
your Nominatim database. It is assumed that you have already successfully
installed the Nominatim software itself, if not return to the
[installation page](Installation.md).
## Importing multiple regions (without updates)
To import multiple regions in your database you can simply give multiple
OSM files to the import command:
```
nominatim import --osm-file file1.pbf --osm-file file2.pbf
```
If you already have imported a file and want to add another one, you can
use the add-data function to import the additional data as follows:
```
nominatim add-data --file <FILE>
nominatim refresh --postcodes
nominatim index -j <NUMBER OF THREADS>
```
Please note that adding additional data is always significantly slower than
the original import.
## Importing multiple regions (with updates)
If you want to import multiple regions _and_ be able to keep them up-to-date
with updates, then you can use the scripts provided in the `utils` directory.
These scripts will set up an `update` directory in your project directory,
which has the following structure:
```bash
update
   ├── europe
   │   ├── andorra
   │   │   └── sequence.state
   │   └── monaco
   │   └── sequence.state
   └── tmp
└── europe
├── andorra-latest.osm.pbf
└── monaco-latest.osm.pbf
```
The `sequence.state` files contain the sequence ID for each region. They will
be used by pyosmium to get updates. The `tmp` folder is used for import dump and
can be deleted once the import is complete.
### Setting up multiple regions
Create a project directory as described for the
[simple import](Import.md#creating-the-project-directory). If necessary,
you can also add an `.env` configuration with customized options. In particular,
you need to make sure that `NOMINATIM_REPLICATION_UPDATE_INTERVAL` and
`NOMINATIM_REPLICATION_RECHECK_INTERVAL` are set according to the update
interval of the extract server you use.
Copy the scripts `utils/import_multiple_regions.sh` and `utils/update_database.sh`
into the project directory.
Now customize both files as per your requirements
1. List of countries. e.g.
COUNTRIES="europe/monaco europe/andorra"
2. URL to the service providing the extracts and updates. eg:
BASEURL="https://download.geofabrik.de"
DOWNCOUNTRYPOSTFIX="-latest.osm.pbf"
5. Followup in the update script can be set according to your installation.
E.g. for Photon,
FOLLOWUP="curl http://localhost:2322/nominatim-update"
will handle the indexing.
To start the initial import, change into the project directory and run
```
bash import_multiple_regions.sh
```
### Updating the database
Change into the project directory and run the following command:
bash update_database.sh
This will get diffs from the replication server, import diffs and index
the database. The default replication server in the
script([Geofabrik](https://download.geofabrik.de)) provides daily updates.
## Using an external PostgreSQL database
You can install Nominatim using a database that runs on a different server when
you have physical access to the file system on the other server. Nominatim
uses a custom normalization library that needs to be made accessible to the
PostgreSQL server. This section explains how to set up the normalization
library.
!!! note
The external module is only needed when using the legacy tokenizer.
If you have choosen the ICU tokenizer, then you can ignore this section
and follow the standard import documentation.
### Option 1: Compiling the library on the database server
The most sure way to get a working library is to compile it on the database
server. From the prerequisites you need at least cmake, gcc and the
PostgreSQL server package.
Clone or unpack the Nominatim source code, enter the source directory and
create and enter a build directory.
```sh
cd Nominatim
mkdir build
cd build
```
Now configure cmake to only build the PostgreSQL module and build it:
```
cmake -DBUILD_IMPORTER=off -DBUILD_API=off -DBUILD_TESTS=off -DBUILD_DOCS=off -DBUILD_OSM2PGSQL=off ..
make
```
When done, you find the normalization library in `build/module/nominatim.so`.
Copy it to a place where it is readable and executable by the PostgreSQL server
process.
### Option 2: Compiling the library on the import machine
You can also compile the normalization library on the machine from where you
run the import.
!!! important
You can only do this when the database server and the import machine have
the same architecture and run the same version of Linux. Otherwise there is
no guarantee that the compiled library is compatible with the PostgreSQL
server running on the database server.
Make sure that the PostgreSQL server package is installed on the machine
**with the same version as on the database server**. You do not need to install
the PostgreSQL server itself.
Download and compile Nominatim as per standard instructions. Once done, you find
the normalization library in `build/module/nominatim.so`. Copy the file to
the database server at a location where it is readable and executable by the
PostgreSQL server process.
### Running the import
On the client side you now need to configure the import to point to the
correct location of the library **on the database server**. Add the following
line to your your `.env` file:
```php
NOMINATIM_DATABASE_MODULE_PATH="<directory on the database server where nominatim.so resides>"
```
Now change the `NOMINATIM_DATABASE_DSN` to point to your remote server and continue
to follow the [standard instructions for importing](Import.md).
## Moving the database to another machine
For some configurations it may be useful to run the import on one machine, then
move the database to another machine and run the Nominatim service from there.
For example, you might want to use a large machine to be able to run the import
quickly but only want a smaller machine for production because there is not so
much load. Or you might want to do the import once and then replicate the
database to many machines.
The important thing to keep in mind when transferring the Nominatim installation
is that you need to transfer the database _and the project directory_. Both
parts are essential for your installation.
The Nominatim database can be transferred using the `pg_dump`/`pg_restore` tool.
Make sure to use the same version of PostgreSQL and PostGIS on source and
target machine.
!!! note
Before creating a dump of your Nominatim database, consider running
`nominatim freeze` first. Your database looses the ability to receive further
data updates but the resulting database is only about a third of the size
of a full database.
Next install Nominatim on the target machine by following the standard installation
instructions. Again make sure to use the same version as the source machine.
You can now copy the project directory from the source machine to the new machine.
If necessary, edit the `.env` file to point it to the restored database.
Finally run
nominatim refresh --website
to make sure that the local installation of Nominatim will be used.
If you are using the legacy tokenizer you might also have to switch to the
PostgreSQL module that was compiled on your target machine. If you get errors
that PostgreSQL cannot find or access `nominatim.so` then copy the installed
version into the `module` directory of your project directory. The installed
copy can usually be found under `/usr/local/lib/nominatim/module/nominatim.so`.

142
docs/admin/Deployment.md Normal file
View File

@@ -0,0 +1,142 @@
# Deploying Nominatim
The Nominatim API is implemented as a PHP application. The `website/` directory
in the project directory contains the configured website. You can serve this
in a production environment with any web server that is capable to run
PHP scripts.
This section gives a quick overview on how to configure Apache and Nginx to
serve Nominatim. It is not meant as a full system administration guide on how
to run a web service. Please refer to the documentation of
[Apache](http://httpd.apache.org/docs/current/) and
[Nginx](https://nginx.org/en/docs/)
for background information on configuring the services.
!!! Note
Throughout this page, we assume that your Nominatim project directory is
located in `/srv/nominatim-project` and that you have installed Nominatim
using the default installation prefix `/usr/local`. If you have put it
somewhere else, you need to adjust the commands and configuration
accordingly.
We further assume that your web server runs as user `www-data`. Older
versions of CentOS may still use the user name `apache`. You also need
to adapt the instructions in this case.
## Making the website directory accessible
You need to make sure that the `website` directory is accessible for the
web server user. You can check that the permissions are correct by accessing
on of the php files as the web server user:
``` sh
sudo -u www-data head -n 1 /srv/nominatim-project/website/search.php
```
If this shows a permission error, then you need to adapt the permissions of
each directory in the path so that it is executable for `www-data`.
If you have SELinux enabled, further adjustments may be necessary to give the
web server access. At a minimum the following SELinux labelling should be done
for Nominatim:
``` sh
sudo semanage fcontext -a -t httpd_sys_content_t "/usr/local/nominatim/lib/lib-php(/.*)?"
sudo semanage fcontext -a -t httpd_sys_content_t "/srv/nominatim-project/website(/.*)?"
sudo semanage fcontext -a -t lib_t "/srv/nominatim-project/module/nominatim.so"
sudo restorecon -R -v /usr/local/lib/nominatim
sudo restorecon -R -v /srv/nominatim-project
```
## Nominatim with Apache
### Installing the required packages
With Apache you can use the PHP module to run Nominatim.
Under Ubuntu/Debian install them with:
``` sh
sudo apt install apache2 libapache2-mod-php
```
### Configuring Apache
Make sure your Apache configuration contains the required permissions for the
directory and create an alias:
``` apache
<Directory "/srv/nominatim-project/website">
Options FollowSymLinks MultiViews
AddType text/html .php
DirectoryIndex search.php
Require all granted
</Directory>
Alias /nominatim /srv/nominatim-project/website
```
After making changes in the apache config you need to restart apache.
The website should now be available on `http://localhost/nominatim`.
## Nominatim with Nginx
### Installing the required packages
Nginx has no built-in PHP interpreter. You need to use php-fpm as a deamon for
serving PHP cgi.
On Ubuntu/Debian install nginx and php-fpm with:
``` sh
sudo apt install nginx php-fpm
```
### Configure php-fpm and Nginx
By default php-fpm listens on a network socket. If you want it to listen to a
Unix socket instead, change the pool configuration
(`/etc/php/<php version>/fpm/pool.d/www.conf`) as follows:
``` ini
; Replace the tcp listener and add the unix socket
listen = /var/run/php-fpm.sock
; Ensure that the daemon runs as the correct user
listen.owner = www-data
listen.group = www-data
listen.mode = 0666
```
Tell nginx that php files are special and to fastcgi_pass to the php-fpm
unix socket by adding the location definition to the default configuration.
``` nginx
root /srv/nominatim-project/website;
index search.php;
location / {
try_files $uri $uri/ @php;
}
location @php {
fastcgi_param SCRIPT_FILENAME "$document_root$uri.php";
fastcgi_param PATH_TRANSLATED "$document_root$uri.php";
fastcgi_param QUERY_STRING $args;
fastcgi_pass unix:/var/run/php-fpm.sock;
fastcgi_index index.php;
include fastcgi_params;
}
location ~ [^/]\.php(/|$) {
fastcgi_split_path_info ^(.+?\.php)(/.*)$;
if (!-f $document_root$fastcgi_script_name) {
return 404;
}
fastcgi_pass unix:/var/run/php-fpm.sock;
fastcgi_index search.php;
include fastcgi.conf;
}
```
Restart the nginx and php-fpm services and the website should now be available
at `http://localhost/`.

214
docs/admin/Faq.md Normal file
View File

@@ -0,0 +1,214 @@
# Troubleshooting Nominatim Installations
## Installation Issues
### Can a stopped/killed import process be resumed?
"I accidentally killed the import process after it has been running for many hours. Can it be resumed?"
It is possible if the import already got to the indexing stage.
Check the last line of output that was logged before the process
was killed. If it looks like this:
Done 844 in 13 @ 64.923080 per second - Rank 26 ETA (seconds): 7.886255
then you can resume with the following command:
```sh
nominatim import --continue indexing
```
If the reported rank is 26 or higher, you can also safely add `--index-noanalyse`.
### PostgreSQL crashed "invalid page in block"
Usually serious problem, can be a hardware issue, not all data written to disc
for example. Check PostgreSQL log file and search PostgreSQL issues/mailing
list for hints.
If it happened during index creation you can try rerunning the step with
```sh
nominatim import --continue indexing
```
Otherwise it's best to start the full setup from the beginning.
### PHP "open_basedir restriction in effect" warnings
PHP Warning: file_get_contents(): open_basedir restriction in effect.
You need to adjust the
[open_basedir](https://www.php.net/manual/en/ini.core.php#ini.open-basedir)
setting in your PHP configuration (`php.ini` file). By default this setting may
look like this:
open_basedir = /srv/http/:/home/:/tmp/:/usr/share/pear/
Either add reported directories to the list or disable this setting temporarily
by adding ";" at the beginning of the line. Don't forget to enable this setting
again once you are done with the PHP command line operations.
### PHP timezeone warnings
The Apache log may contain lots of PHP warnings like this:
`PHP Warning: date_default_timezone_set() function.`
You should set the default time zone as instructed in the warning in
your `php.ini` file. Find the entry about timezone and set it to
something like this:
; Defines the default timezone used by the date functions
; https://php.net/date.timezone
date.timezone = 'America/Denver'
Or
```
echo "date.timezone = 'America/Denver'" > /etc/php.d/timezone.ini
```
### nominatim.so version mismatch
When running the import you may get a version mismatch:
`COPY_END for place failed: ERROR: incompatible library "/srv/Nominatim/nominatim/build/module/nominatim.so": version mismatch`
pg_config seems to use bad includes sometimes when multiple versions
of PostgreSQL are available in the system. Make sure you remove the
server development libraries (`postgresql-server-dev-9.5` on Ubuntu)
and recompile (`cmake .. && make`).
### I see the error "ERROR: permission denied for language c"
`nominatim.so`, written in C, is required to be installed on the database
server. Some managed database (cloud) services like Amazon RDS do not allow
this. There is currently no work-around other than installing a database
on a non-managed machine.
### I see the error: "function transliteration(text) does not exist"
Reinstall the nominatim functions with `nominatim refresh --functions`
and check for any errors, e.g. a missing `nominatim.so` file.
### I see the error: "ERROR: mmap (remap) failed"
This may be a simple out-of-memory error. Try reducing the memory used
for `--osm2pgsql-cache`. Also make sure that overcommitting memory is
allowed: `cat /proc/sys/vm/overcommit_memory` should print 0 or 1.
If you are using a flatnode file, then it may also be that the underlying
filesystem does not fully support 'mmap'. A notable candidate is virtualbox's
vboxfs.
### I see the error: "clang: Command not found" on CentOS
On CentOS 7 users reported `/opt/rh/llvm-toolset-7/root/usr/bin/clang: Command not found`.
Double-check clang is installed. Instead of `make` try running `make CLANG=true`.
### nominatim UPDATE failed: ERROR: buffer 179261 is not owned by resource owner Portal
Several users [reported this](https://github.com/openstreetmap/Nominatim/issues/1168)
during the initial import of the database. It's
something PostgreSQL internal Nominatim doesn't control. And PostgreSQL forums
suggest it's threading related but definitely some kind of crash of a process.
Users reported either rebooting the server, different hardware or just trying
the import again worked.
### The website shows: "Could not get word tokens"
The server cannot access your database. Add `&debug=1` to your URL
to get the full error message.
### On CentOS the website shows "Could not connect to server"
`could not connect to server: No such file or directory`
On CentOS v7 the PostgreSQL server is started with `systemd`. Check if
`/usr/lib/systemd/system/httpd.service` contains a line `PrivateTmp=true`. If
so then Apache cannot see the `/tmp/.s.PGSQL.5432` file. It's a good security
feature, so use the
[preferred solution](../appendix/Install-on-Centos-7.md#adding-selinux-security-settings).
However, you can solve this the quick and dirty way by commenting out that line and then run
sudo systemctl daemon-reload
sudo systemctl restart httpd
### Website reports "DB Error: insufficient permissions"
The user the webserver, e.g. Apache, runs under needs to have access to the
Nominatim database. You can find the user like
[this](https://serverfault.com/questions/125865/finding-out-what-user-apache-is-running-as),
for default Ubuntu operating system for example it's `www-data`.
1. Repeat the `createuser` step of the installation instructions.
2. Give the user permission to existing tables
```
GRANT usage ON SCHEMA public TO "www-data";
GRANT SELECT ON ALL TABLES IN SCHEMA public TO "www-data";
```
### Website reports "Could not load library "nominatim.so"
Example error message
```
SELECT make_standard_name('3039 E MEADOWLARK LN') [nativecode=ERROR: could not
load library "/srv/nominatim/Nominatim-3.1.0/build/module/nominatim.so":
/srv/nominatim/Nominatim-3.1.0/build/module/nominatim.so: cannot open shared
object file: Permission denied
CONTEXT: PL/pgSQL function make_standard_name(text) line 5 at assignment]
```
The PostgreSQL database, i.e. user `postgres`, needs to have access to that file.
The permission need to be read & executable by everybody, but not writeable
by everybody, e.g.
```
-rwxr-xr-x 1 nominatim nominatim 297984 build/module/nominatim.so
```
Try `chmod a+r nominatim.so; chmod a+x nominatim.so`.
When running SELinux, make sure that the
[context is set up correctly](../appendix/Install-on-Centos-7.md#adding-selinux-security-settings).
When you recently updated your operating system, updated PostgreSQL to
a new version or moved files (e.g. the build directory) you should
recreate `nominatim.so`. Try
```
cd build
rm -r module/
cmake $main_Nominatim_path && make
```
### Setup.php fails with "DB Error: extension not found"
Make sure you have the PostgreSQL extensions "hstore" and "postgis" installed.
See the installation instructions for a full list of required packages.
### I forgot to delete the flatnodes file before starting an import.
That's fine. For each import the flatnodes file get overwritten.
See [https://help.openstreetmap.org/questions/52419/nominatim-flatnode-storage](https://help.openstreetmap.org/questions/52419/nominatim-flatnode-storage)
for more information.
## Running your own instance
### Can I import negative OSM ids into Nominatim?
See [this question of Stackoverflow](https://help.openstreetmap.org/questions/64662/nominatim-flatnode-with-negative-id).

288
docs/admin/Import.md Normal file
View File

@@ -0,0 +1,288 @@
# Importing the Database
The following instructions explain how to create a Nominatim database
from an OSM planet file. It is assumed that you have already successfully
installed the Nominatim software itself and the `nominatim` tool can be found
in your `PATH`. If this is not the case, return to the
[installation page](Installation.md).
## Creating the project directory
Before you start the import, you should create a project directory for your
new database installation. This directory receives all data that is related
to a single Nominatim setup: configuration, extra data, etc. Create a project
directory apart from the Nominatim software and change into the directory:
```
mkdir ~/nominatim-planet
cd ~/nominatim-planet
```
In the following, we refer to the project directory as `$PROJECT_DIR`. To be
able to copy&paste instructions, you can export the appropriate variable:
```
export PROJECT_DIR=~/nominatim-planet
```
The Nominatim tool assumes per default that the current working directory is
the project directory but you may explicitly state a different directory using
the `--project-dir` parameter. The following instructions assume that you run
all commands from the project directory.
!!! tip "Migration Tip"
Nominatim used to be run directly from the build directory until version 3.6.
Essentially, the build directory functioned as the project directory
for the database installation. This setup still works and can be useful for
development purposes. It is not recommended anymore for production setups.
Create a project directory that is separate from the Nominatim software.
### Configuration setup in `.env`
The Nominatim server can be customized via an `.env` configuration file in the
project directory. This is a file in [dotenv](https://github.com/theskumar/python-dotenv)
format which looks the same as variable settings in a standard shell environment.
You can also set the same configuration via environment variables. All
settings have a `NOMINATIM_` prefix to avoid conflicts with other environment
variables.
There are lots of configuration settings you can tweak. A full reference
can be found in the chapter [Configuration Settings](../customize/Settings.md).
Most should have a sensible default.
#### Flatnode files
If you plan to import a large dataset (e.g. Europe, North America, planet),
you should also enable flatnode storage of node locations. With this
setting enabled, node coordinates are stored in a simple file instead
of the database. This will save you import time and disk storage.
Add to your `.env`:
NOMINATIM_FLATNODE_FILE="/path/to/flatnode.file"
Replace the second part with a suitable path on your system and make sure
the directory exists. There should be at least 75GB of free space.
## Downloading additional data
### Wikipedia/Wikidata rankings
Wikipedia can be used as an optional auxiliary data source to help indicate
the importance of OSM features. Nominatim will work without this information
but it will improve the quality of the results if this is installed.
This data is available as a binary download. Put it into your project directory:
cd $PROJECT_DIR
wget https://www.nominatim.org/data/wikimedia-importance.sql.gz
The file is about 400MB and adds around 4GB to the Nominatim database.
!!! tip
If you forgot to download the wikipedia rankings, you can also add
importances after the import. Download the files, then run
`nominatim refresh --wiki-data --importance`. Updating importances for
a planet can take a couple of hours.
### External postcodes
Nominatim can use postcodes from an external source to improve searching with
postcodes. We provide precomputed postcodes sets for the US (using TIGER data)
and the UK (using the [CodePoint OpenData set](https://osdatahub.os.uk/downloads/open/CodePointOpen).
This data can be optionally downloaded into the project directory:
cd $PROJECT_DIR
wget https://www.nominatim.org/data/gb_postcodes.csv.gz
wget https://www.nominatim.org/data/us_postcodes.csv.gz
You can also add your own custom postcode sources, see
[Customization of postcodes](../customize/Postcodes.md).
## Choosing the data to import
In its default setup Nominatim is configured to import the full OSM data
set for the entire planet. Such a setup requires a powerful machine with
at least 64GB of RAM and around 900GB of SSD hard disks. Depending on your
use case there are various ways to reduce the amount of data imported. This
section discusses these methods. They can also be combined.
### Using an extract
If you only need geocoding for a smaller region, then precomputed OSM extracts
are a good way to reduce the database size and import time.
[Geofabrik](https://download.geofabrik.de) offers extracts for most countries.
They even have daily updates which can be used with the update process described
[in the next section](Update.md). There are also
[other providers for extracts](https://wiki.openstreetmap.org/wiki/Planet.osm#Downloading).
Please be aware that some extracts are not cut exactly along the country
boundaries. As a result some parts of the boundary may be missing which means
that Nominatim cannot compute the areas for some administrative areas.
### Dropping Data Required for Dynamic Updates
About half of the data in Nominatim's database is not really used for serving
the API. It is only there to allow the data to be updated from the latest
changes from OSM. For many uses these dynamic updates are not really required.
If you don't plan to apply updates, you can run the import with the
`--no-updates` parameter. This will drop the dynamic part of the database as
soon as it is not required anymore.
You can also drop the dynamic part later using the following command:
```
nominatim freeze
```
Note that you still need to provide for sufficient disk space for the initial
import. So this option is particularly interesting if you plan to transfer the
database or reuse the space later.
!!! warning
The datastructure for updates are also required when adding additional data
after the import, for example [TIGER housenumber data](../customize/Tiger.md).
If you plan to use those, you must not use the `--no-updates` parameter.
Do a normal import, add the external data and once you are done with
everything run `nominatim freeze`.
### Reverse-only Imports
If you only want to use the Nominatim database for reverse lookups or
if you plan to use the installation only for exports to a
[photon](https://photon.komoot.de/) database, then you can set up a database
without search indexes. Add `--reverse-only` to your setup command above.
This saves about 5% of disk space.
### Filtering Imported Data
Nominatim normally sets up a full search database containing administrative
boundaries, places, streets, addresses and POI data. There are also other
import styles available which only read selected data:
* **admin**
Only import administrative boundaries and places.
* **street**
Like the admin style but also adds streets.
* **address**
Import all data necessary to compute addresses down to house number level.
* **full**
Default style that also includes points of interest.
* **extratags**
Like the full style but also adds most of the OSM tags into the extratags
column.
The style can be changed with the configuration `NOMINATIM_IMPORT_STYLE`.
To give you an idea of the impact of using the different styles, the table
below gives rough estimates of the final database size after import of a
2020 planet and after using the `--drop` option. It also shows the time
needed for the import on a machine with 64GB RAM, 4 CPUS and NVME disks.
Note that the given sizes are just an estimate meant for comparison of
style requirements. Your planet import is likely to be larger as the
OSM data grows with time.
style | Import time | DB size | after drop
----------|--------------|------------|------------
admin | 4h | 215 GB | 20 GB
street | 22h | 440 GB | 185 GB
address | 36h | 545 GB | 260 GB
full | 54h | 640 GB | 330 GB
extratags | 54h | 650 GB | 340 GB
You can also customize the styles further.
A [description of the style format](../customize/Import-Styles.md)
can be found in the customization guide.
## Initial import of the data
!!! danger "Important"
First try the import with a small extract, for example from
[Geofabrik](https://download.geofabrik.de).
Download the data to import. Then issue the following command
from the **project directory** to start the import:
```sh
nominatim import --osm-file <data file> 2>&1 | tee setup.log
```
The **project directory** is the one that you have set up at the beginning.
See [creating the project directory](#creating-the-project-directory).
### Notes on full planet imports
Even on a perfectly configured machine
the import of a full planet takes around 2 days. Once you see messages
with `Rank .. ETA` appear, the indexing process has started. This part takes
the most time. There are 30 ranks to process. Rank 26 and 30 are the most complex.
They take each about a third of the total import time. If you have not reached
rank 26 after two days of import, it is worth revisiting your system
configuration as it may not be optimal for the import.
### Notes on memory usage
In the first step of the import Nominatim uses [osm2pgsql](https://osm2pgsql.org)
to load the OSM data into the PostgreSQL database. This step is very demanding
in terms of RAM usage. osm2pgsql and PostgreSQL are running in parallel at
this point. PostgreSQL blocks at least the part of RAM that has been configured
with the `shared_buffers` parameter during
[PostgreSQL tuning](Installation.md#postgresql-tuning)
and needs some memory on top of that. osm2pgsql needs at least 2GB of RAM for
its internal data structures, potentially more when it has to process very large
relations. In addition it needs to maintain a cache for node locations. The size
of this cache can be configured with the parameter `--osm2pgsql-cache`.
When importing with a flatnode file, it is best to disable the node cache
completely and leave the memory for the flatnode file. Nominatim will do this
by default, so you do not need to configure anything in this case.
For imports without a flatnode file, set `--osm2pgsql-cache` approximately to
the size of the OSM pbf file you are importing. The size needs to be given in
MB. Make sure you leave enough RAM for PostgreSQL and osm2pgsql as mentioned
above. If the system starts swapping or you are getting out-of-memory errors,
reduce the cache size or even consider using a flatnode file.
### Testing the installation
Run this script to verify that all required tables and indices got created
successfully.
```sh
nominatim admin --check-database
```
Now you can try out your installation by running:
```sh
nominatim serve
```
This runs a small test server normally used for development. You can use it
to verify that your installation is working. Go to
`http://localhost:8088/status.php` and you should see the message `OK`.
You can also run a search query, e.g. `http://localhost:8088/search.php?q=Berlin`.
Note that search query is not supported for reverse-only imports. You can run a
reverse query, e.g. `http://localhost:8088/reverse.php?lat=27.1750090510034&lon=78.04209025`.
To run Nominatim via webservers like Apache or nginx, please read the
[Deployment chapter](Deployment.md).
## Adding search through category phrases
If you want to be able to search for places by their type through
[special phrases](https://wiki.openstreetmap.org/wiki/Nominatim/Special_Phrases)
you also need to import these key phrases like this:
```sh
nominatim special-phrases --import-from-wiki
```
Note that this command downloads the phrases from the wiki link above. You
need internet access for the step.
You can also import special phrases from a csv file, for more
information please see the [Customization part](../customize/Special-Phrases.md).

170
docs/admin/Installation.md Normal file
View File

@@ -0,0 +1,170 @@
# Basic Installation
This page contains generic installation instructions for Nominatim and its
prerequisites. There are also step-by-step instructions available for
the following operating systems:
* [Ubuntu 20.04](../appendix/Install-on-Ubuntu-20.md)
* [Ubuntu 18.04](../appendix/Install-on-Ubuntu-18.md)
* [CentOS 8](../appendix/Install-on-Centos-8.md)
* [CentOS 7.2](../appendix/Install-on-Centos-7.md)
These OS-specific instructions can also be found in executable form
in the `vagrant/` directory.
Users have created instructions for other frameworks. We haven't tested those
and can't offer support.
* [Docker](https://github.com/mediagis/nominatim-docker)
* [Docker on Kubernetes](https://github.com/peter-evans/nominatim-k8s)
* [Kubernetes with Helm](https://github.com/robjuz/helm-charts/blob/master/charts/nominatim/README.md)
* [Ansible](https://github.com/synthesio/infra-ansible-nominatim)
## Prerequisites
### Software
!!! Warning
For larger installations you **must have** PostgreSQL 11+ and Postgis 3+
otherwise import and queries will be slow to the point of being unusable.
For compiling:
* [cmake](https://cmake.org/)
* [expat](https://libexpat.github.io/)
* [proj](https://proj.org/)
* [bzip2](http://www.bzip.org/)
* [zlib](https://www.zlib.net/)
* [ICU](http://site.icu-project.org/)
* [Boost libraries](https://www.boost.org/), including system and filesystem
* PostgreSQL client libraries
* a recent C++ compiler (gcc 5+ or Clang 3.8+)
For running Nominatim:
* [PostgreSQL](https://www.postgresql.org) (9.5+ will work, 11+ strongly recommended)
* [PostGIS](https://postgis.net) (2.2+ will work, 3.0+ strongly recommended)
* [Python 3](https://www.python.org/) (3.6+)
* [Psycopg2](https://www.psycopg.org) (2.7+)
* [Python Dotenv](https://github.com/theskumar/python-dotenv)
* [psutil](https://github.com/giampaolo/psutil)
* [Jinja2](https://palletsprojects.com/p/jinja/)
* [PyICU](https://pypi.org/project/PyICU/)
* [PyYaml](https://pyyaml.org/) (5.1+)
* [datrie](https://github.com/pytries/datrie)
* [PHP](https://php.net) (7.0 or later)
* PHP-pgsql
* PHP-intl (bundled with PHP)
* PHP-cgi (for running queries from the command line)
For running continuous updates:
* [pyosmium](https://osmcode.org/pyosmium/)
For dependencies for running tests and building documentation, see
the [Development section](../develop/Development-Environment.md).
### Hardware
A minimum of 2GB of RAM is required or installation will fail. For a full
planet import 64GB of RAM or more are strongly recommended. Do not report
out of memory problems if you have less than 64GB RAM.
For a full planet install you will need at least 900GB of hard disk space.
Take into account that the OSM database is growing fast.
Fast disks are essential. Using NVME disks is recommended.
Even on a well configured machine the import of a full planet takes
around 2 days. On traditional spinning disks, 7-8 days are more realistic.
## Tuning the PostgreSQL database
You might want to tune your PostgreSQL installation so that the later steps
make best use of your hardware. You should tune the following parameters in
your `postgresql.conf` file.
shared_buffers = 2GB
maintenance_work_mem = (10GB)
autovacuum_work_mem = 2GB
work_mem = (50MB)
effective_cache_size = (24GB)
synchronous_commit = off
checkpoint_segments = 100 # only for postgresql <= 9.4
max_wal_size = 1GB # postgresql > 9.4
checkpoint_timeout = 10min
checkpoint_completion_target = 0.9
The numbers in brackets behind some parameters seem to work fine for
64GB RAM machine. Adjust to your setup. A higher number for `max_wal_size`
means that PostgreSQL needs to run checkpoints less often but it does require
the additional space on your disk.
Autovacuum must not be switched off because it ensures that the
tables are frequently analysed. If your machine has very little memory,
you might consider setting:
autovacuum_max_workers = 1
and even reduce `autovacuum_work_mem` further. This will reduce the amount
of memory that autovacuum takes away from the import process.
For the initial import, you should also set:
fsync = off
full_page_writes = off
Don't forget to reenable them after the initial import or you risk database
corruption.
## Downloading and building Nominatim
### Downloading the latest release
You can download the [latest release from nominatim.org](https://nominatim.org/downloads/).
The release contains all necessary files. Just unpack it.
### Downloading the latest development version
If you want to install latest development version from github, make sure to
also check out the osm2pgsql subproject:
```
git clone --recursive git://github.com/openstreetmap/Nominatim.git
```
The development version does not include the country grid. Download it separately:
```
wget -O Nominatim/data/country_osm_grid.sql.gz https://www.nominatim.org/data/country_grid.sql.gz
```
### Building Nominatim
The code must be built in a separate directory. Create the directory and
change into it.
```
mkdir build
cd build
```
Nominatim uses cmake and make for building. Assuming that you have created the
build at the same level as the Nominatim source directory run:
```
cmake ../Nominatim
make
sudo make install
```
Nominatim installs itself into `/usr/local` per default. To choose a different
installation directory add `-DCMAKE_INSTALL_PREFIX=<install root>` to the
cmake command. Make sure that the `bin` directory is available in your path
in that case, e.g.
```
export PATH=<install root>/bin:$PATH
```
Now continue with [importing the database](Import.md).

51
docs/admin/Maintenance.md Normal file
View File

@@ -0,0 +1,51 @@
This chapter describes the various operations the Nominatim database administrator
may use to clean and maintain the database. None of these operations is mandatory
but they may help improve the performance and accuracy of results.
## Updating postcodes
Command: `nominatim refresh --postcodes`
Postcode centroids (aka 'calculated postcodes') are generated by looking at all
postcodes of a country, grouping them and calculating the geometric centroid.
There is currently no logic to deal with extreme outliers (typos or other
mistakes in OSM data). There is also no check if a postcodes adheres to a
country's format, e.g. if Swiss postcodes are 4 digits.
When running regular updates, postcodes results can be improved by running
this command on a regular basis. Note that only the postcode table and the
postcode search terms are updated. The postcode that is assigned to each place
is only updated when the place is updated.
The command takes around 70min to run on the planet and needs ca. 40GB of
temporary disk space.
## Updating word counts
Command: `nominatim refresh --word-counts`
Nominatim keeps frequency statistics about all search terms it indexes. These
statistics are currently used to optimise queries to the database. Thus better
statistics mean better performance. Word counts are created once after import
and are usually sufficient even when running regular updates. You might want
to rerun the statistics computation when adding larger amounts of new data,
for example, when adding an additional country via `nominatim add-data`.
## Removing large deleted objects
Nominatim refuses to delete very large areas because often these deletions are
accidental and are reverted within hours. Instead the deletions are logged in
the `import_polygon_delete` table and left to the administrator to clean up.
There is currently no command to do that. You can use the following SQL
query to force a deletion on all objects that have been deleted more than
a certain timespan ago (here: 1 month):
```sql
SELECT place_force_delete(p.place_id) FROM import_polygon_delete d, placex p
WHERE p.osm_type = d.osm_type and p.osm_id = d.osm_id
and age(p.indexed_date) > '1 month'::interval
```

373
docs/admin/Migration.md Normal file
View File

@@ -0,0 +1,373 @@
# Database Migrations
Since version 3.7.0 Nominatim offers automatic migrations. Please follow
the following steps:
* stop any updates that are potentially running
* update Nominatim to the newer version
* go to your project directory and run `nominatim admin --migrate`
* (optionally) restart updates
Below you find additional migrations and hints about other structural and
breaking changes. **Please read them before running the migration.**
!!! note
If you are migrating from a version <3.6, then you still have to follow
the manual migration steps up to 3.6.
## 3.7.0 -> 4.0.0
### NOMINATIM_PHRASE_CONFIG removed
Custom blacklist configurations for special phrases now need to be handed
with the `--config` parameter to `nominatim special-phrases`. Alternatively
you can put your custom configuration in the project directory in a file
named `phrase-settings.json`.
Version 3.8 also removes the automatic converter for the php format of
the configuration in older versions. If you are updating from Nominatim < 3.7
and still work with a custom `phrase-settings.php`, you need to manually
convert it into a json format.
### PHP utils removed
The old PHP utils have now been removed completely. You need to switch to
the appropriate functions of the nominatim command line tool. See
[Introducing `nominatim` command line tool](#introducing-nominatim-command-line-tool)
below.
## 3.6.0 -> 3.7.0
### New format and name of configuration file
The configuration for an import is now saved in a `.env` file in the project
directory. This file follows the dotenv format. For more information, see
the [installation chapter](Import.md#configuration-setup-in-env).
To migrate to the new system, create a new project directory, add the `.env`
file and port your custom configuration from `settings/local.php`. Most
settings are named similar and only have received a `NOMINATIM_` prefix.
Use the default settings in `settings/env.defaults` as a reference.
### New location for data files
External data files for Wikipedia importance, postcodes etc. are no longer
expected to reside in the source tree by default. Instead they will be searched
in the project directory. If you have an automated setup script you must
either adapt the download location or explicitly set the location of the
files to the old place in your `.env`.
### Introducing `nominatim` command line tool
The various php utilities have been replaced with a single `nominatim`
command line tool. Make sure to adapt any scripts. There is no direct 1:1
matching between the old utilities and the commands of nominatim CLI. The
following list gives you a list of nominatim sub-commands that contain
functionality of each script:
* ./utils/setup.php: `import`, `freeze`, `refresh`
* ./utils/update.php: `replication`, `add-data`, `index`, `refresh`
* ./utils/specialphrases.php: `special-phrases`
* ./utils/check_import_finished.php: `admin`
* ./utils/warm.php: `admin`
* ./utils/export.php: `export`
Try `nominatim <command> --help` for more information about each subcommand.
`./utils/query.php` no longer exists in its old form. `nominatim search`
provides a replacement but returns different output.
### Switch to normalized house numbers
The housenumber column in the placex table uses now normalized version.
The automatic migration step will convert the column but this may take a
very long time. It is advisable to take the machine offline while doing that.
## 3.5.0 -> 3.6.0
### Change of layout of search_name_* tables
The table need a different index for nearest place lookup. Recreate the
indexes using the following shell script:
```bash
for table in `psql -d nominatim -c "SELECT tablename FROM pg_tables WHERE tablename LIKE 'search_name_%'" -tA | grep -v search_name_blank`;
do
psql -d nominatim -c "DROP INDEX idx_${table}_centroid_place; CREATE INDEX idx_${table}_centroid_place ON ${table} USING gist (centroid) WHERE ((address_rank >= 2) AND (address_rank <= 25)); DROP INDEX idx_${table}_centroid_street; CREATE INDEX idx_${table}_centroid_street ON ${table} USING gist (centroid) WHERE ((address_rank >= 26) AND (address_rank <= 27))";
done
```
### Removal of html output
The debugging UI is no longer directly provided with Nominatim. Instead we
now provide a simple Javascript application. Please refer to
[Setting up the Nominatim UI](Setup-Nominatim-UI.md) for details on how to
set up the UI.
The icons served together with the API responses have been moved to the
nominatim-ui project as well. If you want to keep the `icon` field in the
response, you need to set `CONST_MapIcon_URL` to the URL of the `/mapicon`
directory of nominatim-ui.
### Change order during indexing
When reindexing places during updates, there is now a different order used
which needs a different database index. Create it with the following SQL command:
```sql
CREATE INDEX idx_placex_pendingsector_rank_address
ON placex
USING BTREE (rank_address, geometry_sector)
WHERE indexed_status > 0;
```
You can then drop the old index with:
```sql
DROP INDEX idx_placex_pendingsector;
```
### Unused index
This index has been unused ever since the query using it was changed two years ago. Saves about 12GB on a planet installation.
```sql
DROP INDEX idx_placex_geometry_reverse_lookupPoint;
```
### Switching to dotenv
As part of the work changing the configuration format, the configuration for
the website is now using a separate configuration file. To create the
configuration file, run the following command after updating:
```sh
./utils/setup.php --setup-website
```
### Update SQL code
To update the SQL code to the leatest version run:
```
./utils/setup.php --create-functions --enable-diff-updates --create-partition-functions
```
## 3.4.0 -> 3.5.0
### New Wikipedia/Wikidata importance tables
The `wikipedia_*` tables have a new format that also includes references to
Wikidata. You need to update the computation functions and the tables as
follows:
* download the new Wikipedia tables as described in the import section
* reimport the tables: `./utils/setup.php --import-wikipedia-articles`
* update the functions: `./utils/setup.php --create-functions --enable-diff-updates`
* create a new lookup index:
```sql
CREATE INDEX idx_placex_wikidata
ON placex
USING BTREE ((extratags -> 'wikidata'))
WHERE extratags ? 'wikidata'
AND class = 'place'
AND osm_type = 'N'
AND rank_search < 26;
```
* compute importance: `./utils/update.php --recompute-importance`
The last step takes about 10 hours on the full planet.
Remove one function (it will be recreated in the next step):
```sql
DROP FUNCTION create_country(hstore,character varying);
```
Finally, update all SQL functions:
```sh
./utils/setup.php --create-functions --enable-diff-updates --create-partition-functions
```
## 3.3.0 -> 3.4.0
### Reorganisation of location_area_country table
The table `location_area_country` has been optimized. You need to switch to the
new format when you run updates. While updates are disabled, run the following
SQL commands:
```sql
CREATE TABLE location_area_country_new AS
SELECT place_id, country_code, geometry FROM location_area_country;
DROP TABLE location_area_country;
ALTER TABLE location_area_country_new RENAME TO location_area_country;
CREATE INDEX idx_location_area_country_geometry ON location_area_country USING GIST (geometry);
CREATE INDEX idx_location_area_country_place_id ON location_area_country USING BTREE (place_id);
```
Finally, update all SQL functions:
```sh
./utils/setup.php --create-functions --enable-diff-updates --create-partition-functions
```
## 3.2.0 -> 3.3.0
### New database connection string (DSN) format
Previously database connection setting (`CONST_Database_DSN` in `settings/*.php`) had the format
* (simple) `pgsql://@/nominatim`
* (complex) `pgsql://johndoe:secret@machine1.domain.com:1234/db1`
The new format is
* (simple) `pgsql:dbname=nominatim`
* (complex) `pgsql:dbname=db1;host=machine1.domain.com;port=1234;user=johndoe;password=secret`
### Natural Earth country boundaries no longer needed as fallback
```sql
DROP TABLE country_naturalearthdata;
```
Finally, update all SQL functions:
```sh
./utils/setup.php --create-functions --enable-diff-updates --create-partition-functions
```
### Configurable Address Levels
The new configurable address levels require a new table. Create it with the
following command:
```sh
./utils/update.php --update-address-levels
```
## 3.1.0 -> 3.2.0
### New reverse algorithm
The reverse algorithm has changed and requires new indexes. Run the following
SQL statements to create the indexes:
```sql
CREATE INDEX idx_placex_geometry_reverse_lookupPoint
ON placex
USING gist (geometry)
WHERE (name IS NOT null or housenumber IS NOT null or rank_address BETWEEN 26 AND 27)
AND class NOT IN ('railway','tunnel','bridge','man_made')
AND rank_address >= 26
AND indexed_status = 0
AND linked_place_id IS null;
CREATE INDEX idx_placex_geometry_reverse_lookupPolygon
ON placex USING gist (geometry)
WHERE St_GeometryType(geometry) in ('ST_Polygon', 'ST_MultiPolygon')
AND rank_address between 4 and 25
AND type != 'postcode'
AND name is not null
AND indexed_status = 0
AND linked_place_id is null;
CREATE INDEX idx_placex_geometry_reverse_placeNode
ON placex USING gist (geometry)
WHERE osm_type = 'N'
AND rank_search between 5 and 25
AND class = 'place'
AND type != 'postcode'
AND name is not null
AND indexed_status = 0
AND linked_place_id is null;
```
You also need to grant the website user access to the `country_osm_grid` table:
```sql
GRANT SELECT ON table country_osm_grid to "www-user";
```
Replace the `www-user` with the user name of your website server if necessary.
You can now drop the unused indexes:
```sql
DROP INDEX idx_placex_reverse_geometry;
```
Finally, update all SQL functions:
```sh
./utils/setup.php --create-functions --enable-diff-updates --create-partition-functions
```
## 3.0.0 -> 3.1.0
### Postcode Table
A new separate table for artificially computed postcode centroids was introduced.
Migration to the new format is possible but **not recommended**.
Create postcode table and indexes, running the following SQL statements:
```sql
CREATE TABLE location_postcode
(place_id BIGINT, parent_place_id BIGINT, rank_search SMALLINT,
rank_address SMALLINT, indexed_status SMALLINT, indexed_date TIMESTAMP,
country_code varchar(2), postcode TEXT,
geometry GEOMETRY(Geometry, 4326));
CREATE INDEX idx_postcode_geometry ON location_postcode USING GIST (geometry);
CREATE UNIQUE INDEX idx_postcode_id ON location_postcode USING BTREE (place_id);
CREATE INDEX idx_postcode_postcode ON location_postcode USING BTREE (postcode);
GRANT SELECT ON location_postcode TO "www-data";
DROP TYPE IF EXISTS nearfeaturecentr CASCADE;
CREATE TYPE nearfeaturecentr AS (
place_id BIGINT,
keywords int[],
rank_address smallint,
rank_search smallint,
distance float,
isguess boolean,
postcode TEXT,
centroid GEOMETRY
);
```
Add postcode column to `location_area` tables with SQL statement:
```sql
ALTER TABLE location_area ADD COLUMN postcode TEXT;
```
Then reimport the functions:
```sh
./utils/setup.php --create-functions --enable-diff-updates --create-partition-functions
```
Create appropriate triggers with SQL:
```sql
CREATE TRIGGER location_postcode_before_update BEFORE UPDATE ON location_postcode
FOR EACH ROW EXECUTE PROCEDURE postcode_update();
```
Finally populate the postcode table (will take a while):
```sh
./utils/setup.php --calculate-postcodes --index --index-noanalyse
```
This will create a working database. You may also delete the old artificial
postcodes now. Note that this may be expensive and is not absolutely necessary.
The following SQL statement will remove them:
```sql
DELETE FROM place_addressline a USING placex p
WHERE a.address_place_id = p.place_id and p.osm_type = 'P';
ALTER TABLE placex DISABLE TRIGGER USER;
DELETE FROM placex WHERE osm_type = 'P';
ALTER TABLE placex ENABLE TRIGGER USER;
```

View File

@@ -0,0 +1,185 @@
# Setting up the Nominatim UI
Nominatim is a search API, it does not provide a website interface on its
own. [nominatim-ui](https://github.com/osm-search/nominatim-ui) offers a
small website for testing your setup and inspecting the database content.
This section provides a quick start how to use nominatim-ui with your
installation. For more details, please also have a look at the
[README of nominatim-ui](https://github.com/osm-search/nominatim-ui/blob/master/README.md).
## Installing nominatim-ui
We provide regular releases of nominatim-ui that contain the packaged website.
They do not need any special installation. Just download, configure
and run it. Grab the latest release from
[nominatim-ui's Github release page](https://github.com/osm-search/nominatim-ui/releases)
and unpack it. You can use `nominatim-ui-x.x.x.tar.gz` or `nominatim-ui-x.x.x.zip`.
Next you need to adapt the UI yo your installation. Custom settings need to be
put into `dist/theme/config.theme.js`. At a minimum you need to
set `Nominatim_API_Endpoint` to point to your Nominatim installation:
cd nominatim-ui
echo "Nominatim_Config.Nominatim_API_Endpoint='https:\\myserver.org\nominatim';" > dist/theme/config.theme.js
For the full set of available settings, have a look at `dist/config.defaults.js`.
Then you can just test it locally by spinning up a webserver in the `dist`
directory. For example, with Python:
cd nominatim-ui/dist
python3 -m http.server 8765
The website is now available at `http://localhost:8765`.
## Forwarding searches to nominatim-ui
Nominatim used to provide the search interface directly by itself when
`format=html` was requested. For all endpoints except for `/reverse` and
`/lookup` this even used to be the default.
The following section describes how to set up Apache or nginx, so that your
users are forwarded to nominatim-ui when they go to URL that formerly presented
the UI.
### Setting up forwarding in Nginx
First of all make nominatim-ui available under `/ui` on your webserver:
``` nginx
server {
# Here is the Nominatim setup as described in the Installation section
location /ui/ {
alias <full path to the nominatim-ui directory>/dist/;
index index.html;
}
}
```
Now we need to find out if a URL should be forwarded to the UI. Add the
following `map` commands *outside* the server section:
``` nginx
# Inspect the format parameter in the query arguments. We are interested
# if it is set to html or something else or if it is missing completely.
map $args $format {
default default;
~(^|&)format=html(&|$) html;
~(^|&)format= other;
}
# Determine from the URI and the format parameter above if forwarding is needed.
map $uri/$format $forward_to_ui {
default 1; # The default is to forward.
~^/ui 0; # If the URI point to the UI already, we are done.
~/other$ 0; # An explicit non-html format parameter. No forwarding.
~/reverse.*/default 0; # Reverse and lookup assume xml format when
~/lookup.*/default 0; # no format parameter is given. No forwarding.
}
```
The `$forward_to_ui` parameter can now be used to conditionally forward the
calls:
```
# When no endpoint is given, default to search.
# Need to add a rewrite so that the rewrite rules below catch it correctly.
rewrite ^/$ /search;
location @php {
# fastcgi stuff..
if ($forward_to_ui) {
rewrite ^(/[^/]*) https://yourserver.com/ui$1.html redirect;
}
}
location ~ [^/]\.php(/|$) {
# fastcgi stuff..
if ($forward_to_ui) {
rewrite (.*).php https://yourserver.com/ui$1.html redirect;
}
}
```
!!! warning
Be aware that the rewrite commands are slightly different for URIs with and
without the .php suffix.
Reload nginx and the UI should be available.
### Setting up forwarding in Apache
First of all make nominatim-ui available in the `ui/` subdirectory where
Nominatim is installed. For example, given you have set up an alias under
`nominatim` like this:
``` apache
Alias /nominatim /home/vagrant/build/website
```
you need to insert the following rules for nominatim-ui before that alias:
```
<Directory "/home/vagrant/nominatim-ui/dist">
DirectoryIndex search.html
Require all granted
</Directory>
Alias /nominatim/ui /home/vagrant/nominatim-ui/dist
```
Replace `/home/vagrant/nominatim-ui` with the directory where you have cloned
nominatim-ui.
!!! important
The alias for nominatim-ui must come before the alias for the Nominatim
website directory.
To set up forwarding, the Apache rewrite module is needed. Enable it with:
``` sh
sudo a2enmod rewrite
```
Then add rewrite rules to the `Directory` directive of the Nominatim website
directory like this:
``` apache
<Directory "/home/vagrant/build/website">
Options FollowSymLinks MultiViews
AddType text/html .php
Require all granted
RewriteEngine On
# This must correspond to the URL where nominatim can be found.
RewriteBase "/nominatim/"
# If no endpoint is given, then use search.
RewriteRule ^(/|$) "search.php"
# If format-html is explicity requested, forward to the UI.
RewriteCond %{QUERY_STRING} "format=html"
RewriteRule ^([^/]+).php ui/$1.html [R,END]
# Same but .php suffix is missing.
RewriteCond %{QUERY_STRING} "format=html"
RewriteRule ^([^/]+) ui/$1.html [R,END]
# If no format parameter is there then forward anything
# but /reverse and /lookup to the UI.
RewriteCond %{QUERY_STRING} "!format="
RewriteCond %{REQUEST_URI} "!/lookup"
RewriteCond %{REQUEST_URI} "!/reverse"
RewriteRule ^([^/]+).php ui/$1.html [R,END]
# Same but .php suffix is missing.
RewriteCond %{QUERY_STRING} "!format="
RewriteCond %{REQUEST_URI} "!/lookup"
RewriteCond %{REQUEST_URI} "!/reverse"
RewriteRule ^([^/]+) ui/$1.html [R,END]
</Directory>
```
Restart Apache and the UI should be available.

166
docs/admin/Update.md Normal file
View File

@@ -0,0 +1,166 @@
# Updating the Database
There are many different ways to update your Nominatim database.
The following section describes how to keep it up-to-date using
an [online replication service for OpenStreetMap data](https://wiki.openstreetmap.org/wiki/Planet.osm/diffs)
For a list of other methods to add or update data see the output of
`nominatim add-data --help`.
!!! important
If you have configured a flatnode file for the import, then you
need to keep this flatnode file around for updates.
### Installing the newest version of Pyosmium
The replication process uses
[Pyosmium](https://docs.osmcode.org/pyosmium/latest/updating_osm_data.html)
to download update data from the server.
It is recommended to install Pyosmium via pip.
Run (as the same user who will later run the updates):
```sh
pip3 install --user osmium
```
### Setting up the update process
Next the update process needs to be initialised. By default Nominatim is configured
to update using the global minutely diffs.
If you want a different update source you will need to add some settings
to `.env`. For example, to use the daily country extracts
diffs for Ireland from Geofabrik add the following:
# base URL of the replication service
NOMINATIM_REPLICATION_URL="https://download.geofabrik.de/europe/ireland-and-northern-ireland-updates"
# How often upstream publishes diffs (in seconds)
NOMINATIM_REPLICATION_UPDATE_INTERVAL=86400
# How long to sleep if no update found yet (in seconds)
NOMINATIM_REPLICATION_RECHECK_INTERVAL=900
To set up the update process now run the following command:
nominatim replication --init
It outputs the date where updates will start. Recheck that this date is
what you expect.
The `replication --init` command needs to be rerun whenever the replication
service is changed.
### Updating Nominatim
Nominatim supports different modes how to retrieve the update data from the
server. Which one you want to use depends on your exact setup and how often you
want to retrieve updates.
These instructions are for using a single source of updates. If you have
imported multiple country extracts and want to keep them
up-to-date, [Advanced installations section](Advanced-Installations.md)
contains instructions to set up and update multiple country extracts.
#### Continuous updates
This is the easiest mode. Simply run the replication command without any
parameters:
nominatim replication
The update application keeps running forever and retrieves and applies
new updates from the server as they are published.
You can run this command as a simple systemd service. Create a service
description like that in `/etc/systemd/system/nominatim-update.service`:
```
[Unit]
Description=Continuous updates of Nominatim
[Service]
WorkingDirectory=/srv/nominatim
ExecStart=nominatim replication
StandardOutput=append:/var/log/nominatim-updates.log
StandardError=append:/var/log/nominatim-updates.error.log
User=nominatim
Group=nominatim
Type=simple
[Install]
WantedBy=multi-user.target
```
Replace the `WorkingDirectory` with your project directory. Also adapt user
and group names as required.
Now activate the service and start the updates:
```
sudo systemctl daemon-reload
sudo systemctl enable nominatim-updates
sudo systemctl start nominatim-updates
```
#### One-time mode
When the `--once` parameter is given, then Nominatim will download exactly one
batch of updates and then exit. This one-time mode still respects the
`NOMINATIM_REPLICATION_UPDATE_INTERVAL` that you have set. If according to
the update interval no new data has been published yet, it will go to sleep
until the next expected update and only then attempt to download the next batch.
The one-time mode is particularly useful if you want to run updates continuously
but need to schedule other work in between updates. For example, the main
service at osm.org uses it, to regularly recompute postcodes -- a process that
must not be run while updates are in progress. Its update script
looks like this:
```sh
#!/bin/bash
# Switch to your project directory.
cd /srv/nominatim
while true; do
nominatim replication --once
if [ -f "/srv/nominatim/schedule-mainenance" ]; then
rm /srv/nominatim/schedule-mainenance
nominatim refresh --postcodes
fi
done
```
A cron job then creates the file `/srv/nominatim/need-mainenance` once per night.
#### Catch-up mode
With the `--catch-up` parameter, Nominatim will immediately try to download
all changes from the server until the database is up-to-date. The catch-up mode
still respects the parameter `NOMINATIM_REPLICATION_MAX_DIFF`. It downloads and
applies the changes in appropriate batches until all is done.
The catch-up mode is foremost useful to bring the database up to speed after the
initial import. Give that the service usually is not in production at this
point, you can temporarily be a bit more generous with the batch size and
number of threads you use for the updates by running catch-up like this:
```
cd /srv/nominatim
NOMINATIM_REPLICATION_MAX_DIFF=5000 nominatim replication --catch-up --threads 15
```
The catch-up mode is also useful when you want to apply updates at a lower
frequency than what the source publishes. You can set up a cron job to run
replication catch-up at whatever interval you desire.
!!! hint
When running scheduled updates with catch-up, it is a good idea to choose
a replication source with an update frequency that is an order of magnitude
lower. For example, if you want to update once a day, use an hourly updated
source. This makes sure that you don't miss an entire day of updates when
the source is unexpectely late to publish its update.
If you want to use the source with the same update frequency (e.g. a daily
updated source with daily updates), use the
continuous update mode. It ensures to re-request the newest update until it
is published.

154
docs/api/Details.md Normal file
View File

@@ -0,0 +1,154 @@
# Place details
Show all details about a single place saved in the database.
!!! warning
The details page exists for debugging only. You may not use it in scripts
or to automatically query details about a result.
See [Nominatim Usage Policy](https://operations.osmfoundation.org/policies/nominatim/).
## Parameters
The details API supports the following two request formats:
``` xml
https://nominatim.openstreetmap.org/details?osmtype=[N|W|R]&osmid=<value>&class=<value>
```
`osmtype` and `osmid` are required parameters. The type is one of node (N), way (W)
or relation (R). The id must be a number. The `class` parameter is optional and
allows to distinguish between entries, when the corresponding OSM object has more
than one main tag. For example, when a place is tagged with `tourism=hotel` and
`amenity=restaurant`, there will be two place entries in Nominatim, one for a
restaurant, one for a hotel. You need to specify `class=tourism` or `class=amentity`
to get exactly the one you want. If there are multiple places in the database
but the `class` parameter is left out, then one of the places will be chosen
at random and displayed.
``` xml
https://nominatim.openstreetmap.org/details?place_id=<value>
```
Place IDs are assigned sequentially during Nominatim data import. The ID
for a place is different between Nominatim installation (servers) and
changes when data gets reimported. Therefore it cannot be used as
a permanent id and shouldn't be used in bug reports.
Additional optional parameters are explained below.
### Output format
* `json_callback=<string>`
Wrap JSON output in a callback function (JSONP) i.e. `<string>(<json>)`.
* `pretty=[0|1]`
Add indentation to make it more human-readable. (Default: 0)
### Output details
* `addressdetails=[0|1]`
Include a breakdown of the address into elements. (Default: 0)
* `keywords=[0|1]`
Include a list of name keywords and address keywords (word ids). (Default: 0)
* `linkedplaces=[0|1]`
Include a details of places that are linked with this one. Places get linked
together when they are different forms of the same physical object. Nominatim
links two kinds of objects together: place nodes get linked with the
corresponding administrative boundaries. Waterway relations get linked together with their
members.
(Default: 1)
* `hierarchy=[0|1]`
Include details of places lower in the address hierarchy. (Default: 0)
* `group_hierarchy=[0|1]`
For JSON output will group the places by type. (Default: 0)
* `polygon_geojson=[0|1]`
Include geometry of result. (Default: 0)
### Language of results
* `accept-language=<browser language string>`
Preferred language order for showing result, overrides the value
specified in the "Accept-Language" HTTP header.
Either use a standard RFC2616 accept-language string or a simple
comma-separated list of language codes.
## Examples
##### JSON
[https://nominatim.openstreetmap.org/details.php?osmtype=W&osmid=38210407&format=json](https://nominatim.openstreetmap.org/details.php?osmtype=W&osmid=38210407&format=json)
```json
{
"place_id": 85993608,
"parent_place_id": 72765313,
"osm_type": "W",
"osm_id": 38210407,
"category": "place",
"type": "square",
"admin_level": "15",
"localname": "Pariser Platz",
"names": {
"name": "Pariser Platz",
"name:be": "Парыжская плошча",
"name:de": "Pariser Platz",
"name:es": "Plaza de París",
"name:he": "פאריזר פלאץ",
"name:ko": "파리저 광장",
"name:la": "Forum Parisinum",
"name:ru": "Парижская площадь",
"name:uk": "Паризька площа",
"name:zh": "巴黎廣場"
},
"addresstags": {
"postcode": "10117"
},
"housenumber": null,
"calculated_postcode": "10117",
"country_code": "de",
"indexed_date": "2018-08-18T17:02:45+00:00",
"importance": 0.339401620591472,
"calculated_importance": 0.339401620591472,
"extratags": {
"wikidata": "Q156716",
"wikipedia": "de:Pariser Platz"
},
"calculated_wikipedia": "de:Pariser_Platz",
"rank_address": 30,
"rank_search": 30,
"isarea": true,
"centroid": {
"type": "Point",
"coordinates": [
13.3786822618517,
52.5163654
]
},
"geometry": {
"type": "Point",
"coordinates": [
13.3786822618517,
52.5163654
]
}
}
```

61
docs/api/Faq.md Normal file
View File

@@ -0,0 +1,61 @@
# Frequently Asked Questions
## API Results
#### 1. The address of my search results contains far-away places that don't belong there.
Nominatim computes the address from two sources in the OpenStreetMap data:
from administrative boundaries and from place nodes. Boundaries are the more
useful source. They precisely describe an area. So it is very clear for
Nominatim if a point belongs to an area or not. Place nodes are more complicated.
These are only points without any precise extent. So Nominatim has to take a
guess and assume that an address belongs to the closest place node it can find.
In an ideal world, Nominatim would not need the place nodes but there are
many places on earth where there are no precise boundaries available for
all parts that make up an address. This is in particular true for the more
local address parts, like villages and suburbs. Therefore it is not possible
to completely dismiss place nodes. And sometimes they sneak in where they
don't belong.
As a OpenStreetMap mapper, you can improve the situation in two ways: if you
see a place node for which already an administrative area exists, then you
should _link_ the two by adding the node with a 'label' role to the boundary
relation. If there is no administrative area, you can add the approximate
extent of the place and tag it place=<something> as well.
#### 2. When doing reverse search, the address details have parts that don't contain the point I was looking up.
There is a common misconception how the reverse API call works in Nominatim.
Reverse does not give you the address of the point you asked for. Reverse
returns the closest object to the point you asked for and then returns the
address of that object. Now, if you are close to a border, then the closest
object may be across that border. When Nominatim then returns the address,
it contains the county/state/country across the border.
#### 3. I get different counties/states/countries when I change the zoom parameter in the reverse query. How is that possible?
This is basically the same problem as in the previous answer.
The zoom level influences at which [search rank](../customize/Ranking.md#search-rank) Nominatim starts looking
for the closest object. So the closest house number maybe on one side of the
border while the closest street is on the other. As the address details contain
the address of the closest object found, you might sometimes get one result,
sometimes the other for the closest point.
#### 4. Can you return the continent?
Nominatim assigns each map feature one country. Those outside any administrative
boundaries are assigned a special no-country. Continents or other super-national
administrations (e.g. European Union, NATO, Custom unions) are not supported,
see also [Administrative Boundary](https://wiki.openstreetmap.org/wiki/Tag:boundary%3Dadministrative#Super-national_administrations).
#### 5. Can you return the timezone?
See this separate OpenStreetMap-based project [Timezone Boundary Builder](https://github.com/evansiroky/timezone-boundary-builder).
#### 6. I want to download a list of streets/restaurants of a city/region
The [Overpass API](https://wiki.openstreetmap.org/wiki/Overpass_API) is more
suited for these kinds of queries.
That said if you installed your own Nominatim instance you can use the
`nominatim export` PHP script as basis to return such lists.

162
docs/api/Lookup.md Normal file
View File

@@ -0,0 +1,162 @@
# Address lookup
The lookup API allows to query the address and other details of one or
multiple OSM objects like node, way or relation.
## Parameters
The lookup API has the following format:
```
https://nominatim.openstreetmap.org/lookup?osm_ids=[N|W|R]<value>,…,…,&<params>
```
`osm_ids` is mandatory and must contain a comma-separated list of OSM ids each
prefixed with its type, one of node(N), way(W) or relation(R). Up to 50 ids
can be queried at the same time.
Additional optional parameters are explained below.
### Output format
* `format=[xml|json|jsonv2|geojson|geocodejson]`
See [Place Output Formats](Output.md) for details on each format. (Default: xml)
* `json_callback=<string>`
Wrap JSON output in a callback function (JSONP) i.e. `<string>(<json>)`.
Only has an effect for JSON output formats.
### Output details
* `addressdetails=[0|1]`
Include a breakdown of the address into elements. (Default: 0)
* `extratags=[0|1]`
Include additional information in the result if available,
e.g. wikipedia link, opening hours. (Default: 0)
* `namedetails=[0|1]`
Include a list of alternative names in the results. These may include
language variants, references, operator and brand. (Default: 0)
### Language of results
* `accept-language=<browser language string>`
Preferred language order for showing search results, overrides the value
specified in the "Accept-Language" HTTP header.
Either use a standard RFC2616 accept-language string or a simple
comma-separated list of language codes.
### Polygon output
* `polygon_geojson=1`
* `polygon_kml=1`
* `polygon_svg=1`
* `polygon_text=1`
Output geometry of results as a GeoJSON, KML, SVG or WKT. Only one of these
options can be used at a time. (Default: 0)
* `polygon_threshold=0.0`
Return a simplified version of the output geometry. The parameter is the
tolerance in degrees with which the geometry may differ from the original
geometry. Topology is preserved in the result. (Default: 0.0)
### Other
* `email=<valid email address>`
If you are making large numbers of request please include an appropriate email
address to identify your requests. See Nominatim's [Usage Policy](https://operations.osmfoundation.org/policies/nominatim/) for more details.
* `debug=[0|1]`
Output assorted developer debug information. Data on internals of Nominatim's
"Search Loop" logic, and SQL queries. The output is (rough) HTML format.
This overrides the specified machine readable format. (Default: 0)
## Examples
##### XML
[https://nominatim.openstreetmap.org/lookup?osm_ids=R146656,W104393803,N240109189](https://nominatim.openstreetmap.org/lookup?osm_ids=R146656,W104393803,N240109189)
```xml
<lookupresults timestamp="Mon, 29 Jun 15 18:01:33 +0000" attribution="Data © OpenStreetMap contributors, ODbL 1.0. https://www.openstreetmap.org/copyright" querystring="R146656,W104393803,N240109189" polygon="false">
<place place_id="127761056" osm_type="relation" osm_id="146656" place_rank="16" lat="53.4791466" lon="-2.2447445" display_name="Manchester, Greater Manchester, North West England, England, United Kingdom" class="boundary" type="administrative" importance="0.704893333438333">
<city>Manchester</city>
<county>Greater Manchester</county>
<state_district>North West England</state_district>
<state>England</state>
<country>United Kingdom</country>
<country_code>gb</country_code>
</place>
<place place_id="77769745" osm_type="way" osm_id="104393803" place_rank="30" lat="52.5162024" lon="13.3777343363579" display_name="Brandenburg Gate, 1, Pariser Platz, Mitte, Berlin, 10117, Germany" class="tourism" type="attraction" importance="0.443472858361592">
<attraction>Brandenburg Gate</attraction>
<house_number>1</house_number>
<pedestrian>Pariser Platz</pedestrian>
<suburb>Mitte</suburb>
<city_district>Mitte</city_district>
<city>Berlin</city>
<state>Berlin</state>
<postcode>10117</postcode>
<country>Germany</country>
<country_code>de</country_code>
</place>
<place place_id="2570600569" osm_type="node" osm_id="240109189" place_rank="15" lat="52.5170365" lon="13.3888599" display_name="Berlin, Germany" class="place" type="city" importance="0.822149797630868">
<city>Berlin</city>
<state>Berlin</state>
<country>Germany</country>
<country_code>de</country_code>
</place>
</lookupresults>
```
##### JSON with extratags
[https://nominatim.openstreetmap.org/lookup?osm_ids=W50637691&format=json](https://nominatim.openstreetmap.org/lookup?osm_ids=W50637691&format=json)
```json
[
{
"place_id": "84271358",
"licence": "Data © OpenStreetMap contributors, ODbL 1.0. https://osm.org/copyright",
"osm_type": "way",
"osm_id": "50637691",
"lat": "52.39955055",
"lon": "13.04806574678",
"display_name": "Brandenburger Tor, Brandenburger Straße, Nördliche Innenstadt, Innenstadt, Potsdam, Brandenburg, 14467, Germany",
"class": "historic",
"type": "city_gate",
"importance": "0.221233780277011",
"address": {
"address29": "Brandenburger Tor",
"pedestrian": "Brandenburger Straße",
"suburb": "Nördliche Innenstadt",
"city": "Potsdam",
"state": "Brandenburg",
"postcode": "14467",
"country": "Germany",
"country_code": "de"
},
"extratags": {
"image": "http://commons.wikimedia.org/wiki/File:Potsdam_brandenburger_tor.jpg",
"wikidata": "Q695045",
"wikipedia": "de:Brandenburger Tor (Potsdam)",
"wheelchair": "yes",
"description": "Kleines Brandenburger Tor in Potsdam"
}
}
]
```

296
docs/api/Output.md Normal file
View File

@@ -0,0 +1,296 @@
# Place Output
The [/reverse](Reverse.md), [/search](Search.md) and [/lookup](Lookup.md)
API calls produce very similar output which is explained in this section.
There is one section for each format. The format correspond to what was
selected via the `format` parameter.
## JSON
The JSON format returns an array of places (for search and lookup) or
a single place (for reverse) of the following format:
```
{
"place_id": "100149",
"licence": "Data © OpenStreetMap contributors, ODbL 1.0. https://osm.org/copyright",
"osm_type": "node",
"osm_id": "107775",
"boundingbox": ["51.3473219", "51.6673219", "-0.2876474", "0.0323526"],
"lat": "51.5073219",
"lon": "-0.1276474",
"display_name": "London, Greater London, England, SW1A 2DU, United Kingdom",
"class": "place",
"type": "city",
"importance": 0.9654895765402,
"icon": "https://nominatim.openstreetmap.org/images/mapicons/poi_place_city.p.20.png",
"address": {
"city": "London",
"state_district": "Greater London",
"state": "England",
"postcode": "SW1A 2DU",
"country": "United Kingdom",
"country_code": "gb"
},
"extratags": {
"capital": "yes",
"website": "http://www.london.gov.uk",
"wikidata": "Q84",
"wikipedia": "en:London",
"population": "8416535"
}
}
```
The possible fields are:
* `place_id` - reference to the Nominatim internal database ID ([see notes](#place_id-is-not-a-persistent-id))
* `osm_type`, `osm_id` - reference to the OSM object ([see notes](#osm-reference))
* `boundingbox` - area of corner coordinates ([see notes](#boundingbox))
* `lat`, `lon` - latitude and longitude of the centroid of the object
* `display_name` - full comma-separated address
* `class`, `type` - key and value of the main OSM tag
* `importance` - computed importance rank
* `icon` - link to class icon (if available)
* `address` - dictionary of address details (only with `addressdetails=1`,
[see notes](#addressdetails))
* `extratags` - dictionary with additional useful tags like website or maxspeed
(only with `extratags=1`)
* `namedetails` - dictionary with full list of available names including ref etc.
* `geojson`, `svg`, `geotext`, `geokml` - full geometry
(only with the appropriate `polygon_*` parameter)
## JSONv2
This is the same as the JSON format with two changes:
* `class` renamed to `category`
* additional field `place_rank` with the search rank of the object
## GeoJSON
This format follows the [RFC7946](https://geojson.org). Every feature includes
a bounding box (`bbox`).
The properties object has the following fields:
* `place_id` - reference to the Nominatim internal database ID ([see notes](#place_id-is-not-a-persistent-id))
* `osm_type`, `osm_id` - reference to the OSM object ([see notes](#osm-reference))
* `category`, `type` - key and value of the main OSM tag
* `display_name` - full comma-separated address
* `place_rank` - class search rank
* `importance` - computed importance rank
* `icon` - link to class icon (if available)
* `address` - dictionary of address details (only with `addressdetails=1`,
[see notes](#addressdetails))
* `extratags` - dictionary with additional useful tags like `website` or `maxspeed`
(only with `extratags=1`)
* `namedetails` - dictionary with full list of available names including ref etc.
Use `polygon_geojson` to output the full geometry of the object instead
of the centroid.
## GeocodeJSON
The GeocodeJSON format follows the
[GeocodeJSON spec 0.1.0](https://github.com/geocoders/geocodejson-spec).
The following feature attributes are implemented:
* `osm_type`, `osm_id` - reference to the OSM object (unofficial extension, [see notes](#osm-reference))
* `type` - value of the main tag of the object (e.g. residential, restaurant, ...)
* `label` - full comma-separated address
* `name` - localised name of the place
* `housenumber`, `street`, `locality`, `district`, `postcode`, `city`,
`county`, `state`, `country` -
provided when it can be determined from the address
* `admin` - list of localised names of administrative boundaries (only with `addressdetails=1`)
Use `polygon_geojson` to output the full geometry of the object instead
of the centroid.
## XML
The XML response returns one or more place objects in slightly different
formats depending on the API call.
### Reverse
```
<reversegeocode timestamp="Sat, 11 Aug 18 11:53:21 +0000"
attribution="Data © OpenStreetMap contributors, ODbL 1.0. https://www.openstreetmap.org/copyright"
querystring="lat=48.400381&lon=11.745876&zoom=5&format=xml">
<result place_id="179509537" osm_type="relation" osm_id="2145268" ref="BY" place_rank="15" address_rank="15"
lat="48.9467562" lon="11.4038717"
boundingbox="47.2701114,50.5647142,8.9763497,13.8396373">
Bavaria, Germany
</result>
<addressparts>
<state>Bavaria</state>
<country>Germany</country>
<country_code>de</country_code>
</addressparts>
<extratags>
<tag key="place" value="state"/>
<tag key="wikidata" value="Q980"/>
<tag key="wikipedia" value="de:Bayern"/>
<tag key="population" value="12520000"/>
<tag key="name:prefix" value="Freistaat"/>
</extratags>
</reversegeocode>
```
The attributes of the outer `reversegeocode` element return generic information
about the query, including the time when the response was sent (in UTC),
attribution to OSM and the original querystring.
The place information can be found in the `result` element. The attributes of that element contain:
* `place_id` - reference to the Nominatim internal database ID ([see notes](#place_id-is-not-a-persistent-id))
* `osm_type`, `osm_id` - reference to the OSM object ([see notes](#osm-reference))
* `ref` - content of `ref` tag if it exists
* `lat`, `lon` - latitude and longitude of the centroid of the object
* `boundingbox` - comma-separated list of corner coordinates ([see notes](#boundingbox))
The full address of the result can be found in the content of the
`result` element as a comma-separated list.
Additional information requested with `addressdetails=1`, `extratags=1` and
`namedetails=1` can be found in extra elements.
### Search and Lookup
```
<searchresults timestamp="Sat, 11 Aug 18 11:55:35 +0000"
attribution="Data © OpenStreetMap contributors, ODbL 1.0. https://www.openstreetmap.org/copyright"
querystring="london" polygon="false" exclude_place_ids="100149"
more_url="https://nominatim.openstreetmap.org/search.php?q=london&addressdetails=1&extratags=1&exclude_place_ids=100149&format=xml&accept-language=en-US%2Cen%3Bq%3D0.7%2Cde%3Bq%3D0.3">
<place place_id="100149" osm_type="node" osm_id="107775" place_rank="15" address_rank="15"
boundingbox="51.3473219,51.6673219,-0.2876474,0.0323526" lat="51.5073219" lon="-0.1276474"
display_name="London, Greater London, England, SW1A 2DU, United Kingdom"
class="place" type="city" importance="0.9654895765402"
icon="https://nominatim.openstreetmap.org/images/mapicons/poi_place_city.p.20.png">
<extratags>
<tag key="capital" value="yes"/>
<tag key="website" value="http://www.london.gov.uk"/>
<tag key="wikidata" value="Q84"/>
<tag key="wikipedia" value="en:London"/>
<tag key="population" value="8416535"/>
</extratags>
<city>London</city>
<state_district>Greater London</state_district>
<state>England</state>
<postcode>SW1A 2DU</postcode>
<country>United Kingdom</country>
<country_code>gb</country_code>
</place>
</searchresults>
```
The attributes of the outer `searchresults` or `lookupresults` element return
generic information about the query:
* `timestamp` - UTC time when the response was sent
* `attribution` - OSM licensing information
* `querystring` - original query
* `polygon` - true when extra geometry information was requested
* `exclude_place_ids` - IDs of places that should be ignored in a follow-up request
* `more_url` - search call that will yield additional results for the query
just sent
The place information can be found in the `place` elements, of which there may
be more than one. The attributes of that element contain:
* `place_id` - reference to the Nominatim internal database ID ([see notes](#place_id-is-not-a-persistent-id))
* `osm_type`, `osm_id` - reference to the OSM object ([see notes](#osm-reference))
* `ref` - content of `ref` tag if it exists
* `lat`, `lon` - latitude and longitude of the centroid of the object
* `boundingbox` - comma-separated list of corner coordinates ([see notes](#boundingbox))
* `place_rank` - class [search rank](../develop/Ranking#search-rank)
* `address_rank` - place [address rank](../develop/Ranking#address-rank)
* `display_name` - full comma-separated address
* `class`, `type` - key and value of the main OSM tag
* `importance` - computed importance rank
* `icon` - link to class icon (if available)
When `addressdetails=1` is requested, the localised address parts appear
as subelements with the type of the address part.
Additional information requested with `extratags=1` and `namedetails=1` can
be found in extra elements as sub-element of `extratags` and `namedetails`
respectively.
## Notes on field values
### place_id is not a persistent id
The `place_id` is an internal identifier that is assigned data is imported
into a Nominatim database. The same OSM object will have a different value
on another server. It may even change its ID on the same server when it is
removed and reimported while updating the database with fresh OSM data.
It is thus not useful to treat it as permanent for later use.
The combination `osm_type`+`osm_id` is slighly better but remember in
OpenStreetMap mappers can delete, split, recreate places (and those
get a new `osm_id`), there is no link between those old and new ids.
Places can also change their meaning without changing their `osm_id`,
e.g. when a restaurant is retagged as supermarket. For a more in-depth
discussion see [Permanent ID](https://wiki.openstreetmap.org/wiki/Permanent_ID).
If you need an ID that is consistent over multiple installations of Nominatim,
then you should use the combination of `osm_type`+`osm_id`+`class`.
### OSM reference
Nominatim may sometimes return special objects that do not correspond directly
to an object in OpenStreetMap. These are:
* **Postcodes**. Nominatim returns an postcode point created from all mapped
postcodes of the same name. The class and type of these object is `place=postcdode`.
No `osm_type` and `osm_id` are included in the result.
* **Housenumber interpolations**. Nominatim returns a single interpolated
housenumber from the interpolation way. The class and type are `place=house`
and `osm_type` and `osm_id` correspond to the interpolation way in OSM.
* **TIGER housenumber.** Nominatim returns a single interpolated housenumber
from the TIGER data. The class and type are `place=house`
and `osm_type` and `osm_id` correspond to the street mentioned in the result.
Please note that the `osm_type` and `osm_id` returned may be changed in the
future. You should not expect to only find `node`, `way` and `relation` for
the type.
### boundingbox
Comma separated list of min latitude, max latitude, min longitude, max longitude.
The whole planet would be `-90,90,-180,180`.
Can be used to pan and center the map on the result, for example with leafletjs
mapping library
`map.fitBounds([[bbox[0],bbox[2]],[bbox[1],bbox[3]]], {padding: [20, 20], maxzoom: 16});`
Bounds crossing the antimeridian have a min latitude -180 and max latitude 180,
essentially covering the entire planet
(see [issue 184](https://github.com/openstreetmap/Nominatim/issues/184)).
### addressdetails
Address details in the xml and json formats return a list of names together
with a designation label. Per default the following labels may appear:
* continent
* country, country_code
* region, state, state_district, county
* municipality, city, town, village
* city_district, district, borough, suburb, subdivision
* hamlet, croft, isolated_dwelling
* neighbourhood, allotments, quarter
* city_block, residental, farm, farmyard, industrial, commercial, retail
* road
* house_number, house_name
* emergency, historic, military, natural, landuse, place, railway,
man_made, aerialway, boundary, amenity, aeroway, club, craft, leisure,
office, mountain_pass, shop, tourism, bridge, tunnel, waterway
* postcode
They roughly correspond to the classification of the OpenStreetMap data
according to either the `place` tag or the main key of the object.

14
docs/api/Overview.md Normal file
View File

@@ -0,0 +1,14 @@
### Nominatim API
Nominatim indexes named (or numbered) features within the OpenStreetMap (OSM) dataset and a subset of other unnamed features (pubs, hotels, churches, etc).
Its API has the following endpoints for querying the data:
* __[/search](Search.md)__ - search OSM objects by name or type
* __[/reverse](Reverse.md)__ - search OSM object by their location
* __[/lookup](Lookup.md)__ - look up address details for OSM objects by their ID
* __[/status](Status.md)__ - query the status of the server
* __/deletable__ - list objects that have been deleted in OSM but are held
back in Nominatim in case the deletion was accidental
* __/polygons__ - list of broken polygons detected by Nominatim
* __[/details](Details.md)__ - show internal details for an object (for debugging only)

285
docs/api/Reverse.md Normal file
View File

@@ -0,0 +1,285 @@
# Reverse Geocoding
Reverse geocoding generates an address from a latitude and longitude.
## How it works
The reverse geocoding API does not exactly compute the address for the
coordinate it receives. It works by finding the closest suitable OSM object
and returning its address information. This may occasionally lead to
unexpected results.
First of all, Nominatim only includes OSM objects in
its index that are suitable for searching. Small, unnamed paths for example
are missing from the database and can therefore not be used for reverse
geocoding either.
The other issue to be aware of is that the closest OSM object may not always
have a similar enough address to the coordinate you were requesting. For
example, in dense city areas it may belong to a completely different street.
## Parameters
The main format of the reverse API is
```
https://nominatim.openstreetmap.org/reverse?lat=<value>&lon=<value>&<params>
```
where `lat` and `lon` are latitude and longitutde of a coordinate in WGS84
projection. The API returns exactly one result or an error when the coordinate
is in an area with no OSM data coverage.
Additional paramters are accepted as listed below.
!!! warning "Deprecation warning"
The reverse API used to allow address lookup for a single OSM object by
its OSM id. This use is now deprecated. Use the [Address Lookup API](../Lookup)
instead.
### Output format
* `format=[xml|json|jsonv2|geojson|geocodejson]`
See [Place Output Formats](Output.md) for details on each format. (Default: xml)
* `json_callback=<string>`
Wrap JSON output in a callback function ([JSONP](https://en.wikipedia.org/wiki/JSONP)) i.e. `<string>(<json>)`.
Only has an effect for JSON output formats.
### Output details
* `addressdetails=[0|1]`
Include a breakdown of the address into elements. (Default: 1)
* `extratags=[0|1]`
Include additional information in the result if available,
e.g. wikipedia link, opening hours. (Default: 0)
* `namedetails=[0|1]`
Include a list of alternative names in the results. These may include
language variants, references, operator and brand. (Default: 0)
### Language of results
* `accept-language=<browser language string>`
Preferred language order for showing search results, overrides the value
specified in the "Accept-Language" HTTP header.
Either use a standard RFC2616 accept-language string or a simple
comma-separated list of language codes.
### Result limitation
* `zoom=[0-18]`
Level of detail required for the address. Default: 18. This is a number that
corresponds roughly to the zoom level used in XYZ tile sources in frameworks
like Leaflet.js, Openlayers etc.
In terms of address details the zoom levels are as follows:
zoom | address detail
-----|---------------
3 | country
5 | state
8 | county
10 | city
14 | suburb
16 | major streets
17 | major and minor streets
18 | building
### Polygon output
* `polygon_geojson=1`
* `polygon_kml=1`
* `polygon_svg=1`
* `polygon_text=1`
Output geometry of results as a GeoJSON, KML, SVG or WKT. Only one of these
options can be used at a time. (Default: 0)
* `polygon_threshold=0.0`
Return a simplified version of the output geometry. The parameter is the
tolerance in degrees with which the geometry may differ from the original
geometry. Topology is preserved in the result. (Default: 0.0)
### Other
* `email=<valid email address>`
If you are making large numbers of request please include an appropriate email
address to identify your requests. See Nominatim's [Usage Policy](https://operations.osmfoundation.org/policies/nominatim/) for more details.
* `debug=[0|1]`
Output assorted developer debug information. Data on internals of Nominatim's
"Search Loop" logic, and SQL queries. The output is (rough) HTML format.
This overrides the specified machine readable format. (Default: 0)
## Examples
* [https://nominatim.openstreetmap.org/reverse?format=xml&lat=52.5487429714954&lon=-1.81602098644987&zoom=18&addressdetails=1](https://nominatim.openstreetmap.org/reverse?format=xml&lat=52.5487429714954&lon=-1.81602098644987&zoom=18&addressdetails=1)
```xml
<reversegeocode timestamp="Fri, 06 Nov 09 16:33:54 +0000" querystring="...">
<result place_id="1620612" osm_type="node" osm_id="452010817">
135, Pilkington Avenue, Wylde Green, City of Birmingham, West Midlands (county), B72, United Kingdom
</result>
<addressparts>
<house_number>135</house_number>
<road>Pilkington Avenue</road>
<village>Wylde Green</village>
<town>Sutton Coldfield</town>
<city>City of Birmingham</city>
<county>West Midlands (county)</county>
<postcode>B72</postcode>
<country>United Kingdom</country>
<country_code>gb</country_code>
</addressparts>
</reversegeocode>
```
##### Example with `format=jsonv2`
* [https://nominatim.openstreetmap.org/reverse?format=jsonv2&lat=-34.44076&lon=-58.70521](https://nominatim.openstreetmap.org/reverse?format=jsonv2&lat=-34.44076&lon=-58.70521)
```json
{
"place_id":"134140761",
"licence":"Data © OpenStreetMap contributors, ODbL 1.0. https:\/\/www.openstreetmap.org\/copyright",
"osm_type":"way",
"osm_id":"280940520",
"lat":"-34.4391708",
"lon":"-58.7064573",
"place_rank":"26",
"category":"highway",
"type":"motorway",
"importance":"0.1",
"addresstype":"road",
"display_name":"Autopista Pedro Eugenio Aramburu, El Triángulo, Partido de Malvinas Argentinas, Buenos Aires, 1.619, Argentina",
"name":"Autopista Pedro Eugenio Aramburu",
"address":{
"road":"Autopista Pedro Eugenio Aramburu",
"village":"El Triángulo",
"state_district":"Partido de Malvinas Argentinas",
"state":"Buenos Aires",
"postcode":"1.619",
"country":"Argentina",
"country_code":"ar"
},
"boundingbox":["-34.44159","-34.4370994","-58.7086067","-58.7044712"]
}
```
##### Example with `format=geojson`
* [https://nominatim.openstreetmap.org/reverse?format=geojson&lat=44.50155&lon=11.33989](https://nominatim.openstreetmap.org/reverse?format=geojson&lat=44.50155&lon=11.33989)
```json
{
"type": "FeatureCollection",
"licence": "Data © OpenStreetMap contributors, ODbL 1.0. https://osm.org/copyright",
"features": [
{
"type": "Feature",
"properties": {
"place_id": "18512203",
"osm_type": "node",
"osm_id": "1704756187",
"place_rank": "30",
"category": "place",
"type": "house",
"importance": "0",
"addresstype": "place",
"name": null,
"display_name": "71, Via Guglielmo Marconi, Saragozza-Porto, Bologna, BO, Emilia-Romagna, 40122, Italy",
"address": {
"house_number": "71",
"road": "Via Guglielmo Marconi",
"suburb": "Saragozza-Porto",
"city": "Bologna",
"county": "BO",
"state": "Emilia-Romagna",
"postcode": "40122",
"country": "Italy",
"country_code": "it"
}
},
"bbox": [
11.3397676,
44.5014307,
11.3399676,
44.5016307
],
"geometry": {
"type": "Point",
"coordinates": [
11.3398676,
44.5015307
]
}
}
]
}
```
##### Example with `format=geocodejson`
[https://nominatim.openstreetmap.org/reverse?format=geocodejson&lat=60.2299&lon=11.1663](https://nominatim.openstreetmap.org/reverse?format=geocodejson&lat=60.2299&lon=11.1663)
```json
{
"type": "FeatureCollection",
"geocoding": {
"version": "0.1.0",
"attribution": "Data © OpenStreetMap contributors, ODbL 1.0. https://osm.org/copyright",
"licence": "ODbL",
"query": "60.229917843587,11.16630979382"
},
"features": {
"type": "Feature",
"properties": {
"geocoding": {
"place_id": "42700574",
"osm_type": "node",
"osm_id": "3110596255",
"type": "house",
"accuracy": 0,
"label": "1, Løvenbergvegen, Mogreina, Ullensaker, Akershus, 2054, Norway",
"name": null,
"housenumber": "1",
"street": "Løvenbergvegen",
"postcode": "2054",
"county": "Akershus",
"country": "Norway",
"admin": {
"level7": "Ullensaker",
"level4": "Akershus",
"level2": "Norway"
}
}
},
"geometry": {
"type": "Point",
"coordinates": [
11.1658572,
60.2301296
]
}
}
}
```

356
docs/api/Search.md Normal file
View File

@@ -0,0 +1,356 @@
# Search queries
The search API allows you to look up a location from a textual description
or address. Nominatim supports structured and free-form search queries.
The search query may also contain
[special phrases](https://wiki.openstreetmap.org/wiki/Nominatim/Special_Phrases)
which are translated into specific OpenStreetMap (OSM) tags (e.g. Pub => `amenity=pub`).
This can be used to narrow down the kind of objects to be returned.
!!! warning
Special phrases are not suitable to query all objects of a certain type in an
area. Nominatim will always just return a collection of the best matches. To
download OSM data by object type, use the [Overpass API](https://overpass-api.de/).
## Parameters
The search API has the following format:
```
https://nominatim.openstreetmap.org/search?<params>
```
The search term may be specified with two different sets of parameters:
* `q=<query>`
Free-form query string to search for.
Free-form queries are processed first left-to-right and then right-to-left if that fails. So you may search for
[pilkington avenue, birmingham](https://nominatim.openstreetmap.org/search?q=pilkington+avenue,birmingham) as well as for
[birmingham, pilkington avenue](https://nominatim.openstreetmap.org/search?q=birmingham,+pilkington+avenue).
Commas are optional, but improve performance by reducing the complexity of the search.
* `street=<housenumber> <streetname>`
* `city=<city>`
* `county=<county>`
* `state=<state>`
* `country=<country>`
* `postalcode=<postalcode>`
Alternative query string format split into several parameters for structured requests.
Structured requests are faster but are less robust against alternative
OSM tagging schemas. **Do not combine with** `q=<query>` **parameter**.
Both query forms accept the additional parameters listed below.
### Output format
* `format=[xml|json|jsonv2|geojson|geocodejson]`
See [Place Output Formats](Output.md) for details on each format. (Default: jsonv2)
* `json_callback=<string>`
Wrap JSON output in a callback function ([JSONP](https://en.wikipedia.org/wiki/JSONP)) i.e. `<string>(<json>)`.
Only has an effect for JSON output formats.
### Output details
* `addressdetails=[0|1]`
Include a breakdown of the address into elements. (Default: 0)
* `extratags=[0|1]`
Include additional information in the result if available,
e.g. wikipedia link, opening hours. (Default: 0)
* `namedetails=[0|1]`
Include a list of alternative names in the results. These may include
language variants, references, operator and brand. (Default: 0)
### Language of results
* `accept-language=<browser language string>`
Preferred language order for showing search results, overrides the value
specified in the ["Accept-Language" HTTP header](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Accept-Language).
Either use a standard RFC2616 accept-language string or a simple
comma-separated list of language codes.
### Result limitation
* `countrycodes=<countrycode>[,<countrycode>][,<countrycode>]...`
Limit search results to one or more countries. `<countrycode>` must be the
[ISO 3166-1alpha2](https://en.wikipedia.org/wiki/ISO_3166-1_alpha-2) code,
e.g. `gb` for the United Kingdom, `de` for Germany.
Each place in Nominatim is assigned to one country code based
on OSM country boundaries. In rare cases a place may not be in any country
at all, for example, in international waters.
* `exclude_place_ids=<place_id,[place_id],[place_id]`
If you do not want certain OSM objects to appear in the search
result, give a comma separated list of the `place_id`s you want to skip.
This can be used to retrieve additional search results. For example, if a
previous query only returned a few results, then including those here would
cause the search to return other, less accurate, matches (if possible).
* `limit=<integer>`
Limit the number of returned results. (Default: 10, Maximum: 50)
* `viewbox=<x1>,<y1>,<x2>,<y2>`
The preferred area to find search results. Any two corner points of the box
are accepted as long as they span a real box. `x` is longitude,
`y` is latitude.
* `bounded=[0|1]`
When a viewbox is given, restrict the result to items contained within that
viewbox (see above). When `viewbox` and `bounded=1` are given, an amenity
only search is allowed. Give the special keyword for the amenity in square
brackets, e.g. `[pub]` and a selection of objects of this type is returned.
There is no guarantee that the result is complete. (Default: 0)
### Polygon output
* `polygon_geojson=1`
* `polygon_kml=1`
* `polygon_svg=1`
* `polygon_text=1`
Output geometry of results as a GeoJSON, KML, SVG or WKT. Only one of these
options can be used at a time. (Default: 0)
* `polygon_threshold=0.0`
Return a simplified version of the output geometry. The parameter is the
tolerance in degrees with which the geometry may differ from the original
geometry. Topology is preserved in the result. (Default: 0.0)
### Other
* `email=<valid email address>`
If you are making large numbers of request please include an appropriate email
address to identify your requests. See Nominatim's [Usage Policy](https://operations.osmfoundation.org/policies/nominatim/) for more details.
* `dedupe=[0|1]`
Sometimes you have several objects in OSM identifying the same place or
object in reality. The simplest case is a street being split into many
different OSM ways due to different characteristics. Nominatim will
attempt to detect such duplicates and only return one match unless
this parameter is set to 0. (Default: 1)
* `debug=[0|1]`
Output assorted developer debug information. Data on internals of Nominatim's
"Search Loop" logic, and SQL queries. The output is (rough) HTML format.
This overrides the specified machine readable format. (Default: 0)
## Examples
##### XML with kml polygon
* [https://nominatim.openstreetmap.org/search?q=135+pilkington+avenue,+birmingham&format=xml&polygon_geojson=1&addressdetails=1](https://nominatim.openstreetmap.org/search?q=135+pilkington+avenue,+birmingham&format=xml&polygon_geojson=1&addressdetails=1)
```xml
<searchresults timestamp="Sat, 07 Nov 09 14:42:10 +0000" querystring="135 pilkington, avenue birmingham" polygon="true">
<place
place_id="1620612" osm_type="node" osm_id="452010817"
boundingbox="52.548641204834,52.5488433837891,-1.81612110137939,-1.81592094898224"
lat="52.5487429714954" lon="-1.81602098644987"
display_name="135, Pilkington Avenue, Wylde Green, City of Birmingham, West Midlands (county), B72, United Kingdom"
class="place" type="house">
<geokml>
<Polygon>
<outerBoundaryIs>
<LinearRing>
<coordinates>-1.816513,52.548756599999997 -1.816434,52.548747300000002 -1.816429,52.5487629 -1.8163717,52.548756099999999 -1.8163464,52.548834599999999 -1.8164599,52.548848100000001 -1.8164685,52.5488213 -1.8164913,52.548824000000003 -1.816513,52.548756599999997</coordinates>
</LinearRing>
</outerBoundaryIs>
</Polygon>
</geokml>
<house_number>135</house_number>
<road>Pilkington Avenue</road>
<village>Wylde Green</village>
<town>Sutton Coldfield</town>
<city>City of Birmingham</city>
<county>West Midlands (county)</county>
<postcode>B72</postcode>
<country>United Kingdom</country>
<country_code>gb</country_code>
</place>
</searchresults>
```
##### JSON with SVG polygon
[https://nominatim.openstreetmap.org/search/Unter%20den%20Linden%201%20Berlin?format=json&addressdetails=1&limit=1&polygon_svg=1](https://nominatim.openstreetmap.org/search/Unter%20den%20Linden%201%20Berlin?format=json&addressdetails=1&limit=1&polygon_svg=1)
```json
{
"address": {
"city": "Berlin",
"city_district": "Mitte",
"construction": "Unter den Linden",
"continent": "European Union",
"country": "Deutschland",
"country_code": "de",
"house_number": "1",
"neighbourhood": "Scheunenviertel",
"postcode": "10117",
"public_building": "Kommandantenhaus",
"state": "Berlin",
"suburb": "Mitte"
},
"boundingbox": [
"52.5170783996582",
"52.5173187255859",
"13.3975105285645",
"13.3981599807739"
],
"class": "amenity",
"display_name": "Kommandantenhaus, 1, Unter den Linden, Scheunenviertel, Mitte, Berlin, 10117, Deutschland, European Union",
"importance": 0.73606775332943,
"lat": "52.51719785",
"licence": "Data \u00a9 OpenStreetMap contributors, ODbL 1.0. https://www.openstreetmap.org/copyright",
"lon": "13.3978352028938",
"osm_id": "15976890",
"osm_type": "way",
"place_id": "30848715",
"svg": "M 13.397511 -52.517283599999999 L 13.397829400000001 -52.517299800000004 13.398131599999999 -52.517315099999998 13.398159400000001 -52.517112099999999 13.3975388 -52.517080700000001 Z",
"type": "public_building"
}
```
##### JSON with address details
[https://nominatim.openstreetmap.org/?addressdetails=1&q=bakery+in+berlin+wedding&format=json&limit=1](https://nominatim.openstreetmap.org/?addressdetails=1&q=bakery+in+berlin+wedding&format=json&limit=1)
```json
{
"address": {
"bakery": "B\u00e4cker Kamps",
"city_district": "Mitte",
"continent": "European Union",
"country": "Deutschland",
"country_code": "de",
"footway": "Bahnsteig U6",
"neighbourhood": "Sprengelkiez",
"postcode": "13353",
"state": "Berlin",
"suburb": "Wedding"
},
"boundingbox": [
"52.5460929870605",
"52.5460968017578",
"13.3591794967651",
"13.3591804504395"
],
"class": "shop",
"display_name": "B\u00e4cker Kamps, Bahnsteig U6, Sprengelkiez, Wedding, Mitte, Berlin, 13353, Deutschland, European Union",
"icon": "https://nominatim.openstreetmap.org/images/mapicons/shopping_bakery.p.20.png",
"importance": 0.201,
"lat": "52.5460941",
"licence": "Data \u00a9 OpenStreetMap contributors, ODbL 1.0. https://www.openstreetmap.org/copyright",
"lon": "13.35918",
"osm_id": "317179427",
"osm_type": "node",
"place_id": "1453068",
"type": "bakery"
}
```
##### GeoJSON
[https://nominatim.openstreetmap.org/search?q=17+Strada+Pictor+Alexandru+Romano%2C+Bukarest&format=geojson](https://nominatim.openstreetmap.org/search?q=17+Strada+Pictor+Alexandru+Romano%2C+Bukarest&format=geojson)
```json
{
"type": "FeatureCollection",
"licence": "Data © OpenStreetMap contributors, ODbL 1.0. https://osm.org/copyright",
"features": [
{
"type": "Feature",
"properties": {
"place_id": "35811445",
"osm_type": "node",
"osm_id": "2846295644",
"display_name": "17, Strada Pictor Alexandru Romano, Bukarest, Bucharest, Sector 2, Bucharest, 023964, Romania",
"place_rank": "30",
"category": "place",
"type": "house",
"importance": 0.62025
},
"bbox": [
26.1156689,
44.4354754,
26.1157689,
44.4355754
],
"geometry": {
"type": "Point",
"coordinates": [
26.1157189,
44.4355254
]
}
}
]
}
```
##### GeocodeJSON
[https://nominatim.openstreetmap.org/search?q=%CE%91%CE%B3%CE%AF%CE%B1+%CE%A4%CF%81%CE%B9%CE%AC%CE%B4%CE%B1%2C+%CE%91%CE%B4%CF%89%CE%BD%CE%B9%CE%B4%CE%BF%CF%82%2C+Athens%2C+Greece&format=geocodejson](https://nominatim.openstreetmap.org/search?q=%CE%91%CE%B3%CE%AF%CE%B1+%CE%A4%CF%81%CE%B9%CE%AC%CE%B4%CE%B1%2C+%CE%91%CE%B4%CF%89%CE%BD%CE%B9%CE%B4%CE%BF%CF%82%2C+Athens%2C+Greece&format=geocodejson)
```json
{
"type": "FeatureCollection",
"geocoding": {
"version": "0.1.0",
"attribution": "Data © OpenStreetMap contributors, ODbL 1.0. https://osm.org/copyright",
"licence": "ODbL",
"query": "Αγία Τριάδα, Αδωνιδος, Athens, Greece"
},
"features": [
{
"type": "Feature",
"properties": {
"geocoding": {
"type": "place_of_worship",
"label": "Αγία Τριάδα, Αδωνιδος, Άγιος Νικόλαος, 5º Δημοτικό Διαμέρισμα Αθηνών, Athens, Municipality of Athens, Regional Unit of Central Athens, Region of Attica, Attica, 11472, Greece",
"name": "Αγία Τριάδα",
"admin": null
}
},
"geometry": {
"type": "Point",
"coordinates": [
23.72949633941,
38.0051697
]
}
}
]
}
```

66
docs/api/Status.md Normal file
View File

@@ -0,0 +1,66 @@
# Status
Useful for checking if the service and database is running. The JSON output also shows
when the database was last updated.
## Parameters
* `format=[text|json]` (defaults to 'text')
## Output
#### Text format
```
https://nominatim.openstreetmap.org/status.php
```
will return HTTP status code 200 and print `OK`.
On error it will return HTTP status code 500 and print a message, e.g.
`ERROR: Database connection failed`.
#### JSON format
```
https://nominatim.openstreetmap.org/status.php?format=json
```
will return HTTP code 200 and a structure
```json
{
"status": 0,
"message": "OK",
"data_updated": "2020-05-04T14:47:00+00:00",
"software_version": "3.6.0-0",
"database_version": "3.6.0-0"
}
```
The `software_version` field contains the version of Nominatim used to serve
the API. The `database_version` field contains the version of the data format
in the database.
On error will also return HTTP status code 200 and a structure with error
code and message, e.g.
```json
{
"status": 700,
"message": "Database connection failed"
}
```
Possible status codes are
| | message | notes |
|-----|----------------------|---------------------------------------------------|
| 700 | "No database" | connection failed |
| 701 | "Module failed" | database could not load nominatim.so |
| 702 | "Module call failed" | nominatim.so loaded but calling a function failed |
| 703 | "Query failed" | test query against a database table failed |
| 704 | "No value" | test query worked but returned no results |

View File

@@ -0,0 +1,153 @@
## Configuring the Import
Which OSM objects are added to the database and which of the tags are used
can be configured via the import style configuration file. This
is a JSON file which contains a list of rules which are matched against every
tag of every object and then assign the tag its specific role.
The style to use is given by the `NOMINATIM_IMPORT_STYLE` configuration
option. There are a number of default styles, which are explained in detail
in the [Import section](../admin/Import.md#filtering-imported-data). These
standard styles may be referenced by their name.
You can also create your own custom syle. Put the style file into your
project directory and then set `NOMINATIM_IMPORT_STYLE` to the name of the file.
It is always recommended to start with one of the standard styles and customize
those. You find the standard styles under the name `import-<stylename>.style`
in the standard Nominatim configuration path (usually `/etc/nominatim` or
`/usr/local/etc/nominatim`).
The remainder of the page describes the format of the file.
### Configuration Rules
A single rule looks like this:
```json
{
"keys" : ["key1", "key2", ...],
"values" : {
"value1" : "prop",
"value2" : "prop1,prop2"
}
}
```
A rule first defines a list of keys to apply the rule to. This is always a list
of strings. The string may have four forms. An empty string matches against
any key. A string that ends in an asterisk `*` is a prefix match and accordingly
matches against any key that starts with the given string (minus the `*`). A
suffix match can be defined similarly with a string that starts with a `*`. Any
other string constitutes an exact match.
The second part of the rules defines a list of values and the properties that
apply to a successful match. Value strings may be either empty, which
means that they match any value, or describe an exact match. Prefix
or suffix matching of values is not possible.
For a rule to match, it has to find a valid combination of keys and values. The
resulting property is that of the matched values.
The rules in a configuration file are processed sequentially and the first
match for each tag wins.
A rule where key and value are the empty string is special. This defines the
fallback when none of the rules match. The fallback is always used as a last
resort when nothing else matches, no matter where the rule appears in the file.
Defining multiple fallback rules is not allowed. What happens in this case,
is undefined.
### Tag Properties
One or more of the following properties may be given for each tag:
* `main`
A principal tag. A new row will be added for the object with key and value
as `class` and `type`.
* `with_name`
When the tag is a principal tag (`main` property set): only really add a new
row, if there is any name tag found (a reference tag is not sufficient, see
below).
* `with_name_key`
When the tag is a principal tag (`main` property set): only really add a new
row, if there is also a name tag that matches the key of the principal tag.
For example, if the main tag is `bridge=yes`, then it will only be added as
an extra row, if there is a tag `bridge:name[:XXX]` for the same object.
If this property is set, all other names that are not domain-specific are
ignored.
* `fallback`
When the tag is a principal tag (`main` property set): only really add a new
row, when no other principal tags for this object have been found. Only one
fallback tag can win for an object.
* `operator`
When the tag is a principal tag (`main` property set): also include the
`operator` tag in the list of names. This is a special construct for an
out-dated tagging practise in OSM. Fuel stations and chain restaurants
in particular used to have the name of the chain tagged as `operator`.
These days the chain can be more commonly found in the `brand` tag but
there is still enough old data around to warrant this special case.
* `name`
Add tag to the list of names.
* `ref`
Add tag to the list of names as a reference. At the moment this only means
that the object is not considered to be named for `with_name`.
* `address`
Add tag to the list of address tags. If the tag starts with `addr:` or
`is_in:`, then this prefix is cut off before adding it to the list.
* `postcode`
Add the value as a postcode to the address tags. If multiple tags are
candidate for postcodes, one wins out and the others are dropped.
* `country`
Add the value as a country code to the address tags. The value must be a
two letter country code, otherwise it is ignored. If there are multiple
tags that match, then one wins out and the others are dropped.
* `house`
If no principle tags can be found for the object, still add the object with
`class`=`place` and `type`=`house`. Use this for address nodes that have no
other function.
* `interpolation`
Add this object as an address interpolation (appears as `class`=`place` and
`type`=`houses` in the database).
* `extra`
Add tag to the list of extra tags.
* `skip`
Skip the tag completely. Useful when a custom default fallback is defined
or to define exceptions to rules.
A rule can define as many of these properties for one match as it likes. For
example, if the property is `"main,extra"` then the tag will open a new row
but also have the tag appear in the list of extra tags.
### Changing the Style of Existing Databases
There is normally no issue changing the style of a database that is already
imported and now kept up-to-date with change files. Just be aware that any
change in the style applies to updates only. If you want to change the data
that is already in the database, then a reimport is necessary.

View File

@@ -0,0 +1,20 @@
Nominatim comes with a predefined set of configuration options that should
work for most standard installations. If you have special requirements, there
are many places where the configuration can be adapted. This chapter describes
the following configurable parts:
* [Global Settings](Settings.md) has a detailed description of all parameters that
can be set in your local `.env` configuration
* [Import styles](Import-Styles.md) explains how to write your own import style
in order to control what kind of OSM data will be imported
* [Place ranking](Ranking.md) describes the configuration around classifing
places in terms of their importance and their role in an address
* [Tokenizers](Tokenizers.md) describes the configuration of the module
responsible for analysing and indexing names
* [Special Phrases](Special-Phrases.md) are common nouns or phrases that
can be used in search to identify a class of places
There are also guides for adding the following external data:
* [US house numbers from the TIGER dataset](Tiger.md)
* [External postcodes](Postcodes.md)

View File

@@ -0,0 +1,37 @@
# External postcode data
Nominatim creates a table of known postcode centroids during import. This table
is used for searches of postcodes and for adding postcodes to places where the
OSM data does not provide one. These postcode centroids are mainly computed
from the OSM data itself. In addition, Nominatim supports reading postcode
information from an external CSV file, to supplement the postcodes that are
missing in OSM.
To enable external postcode support, simply put one CSV file per country into
your project directory and name it `<CC>_postcodes.csv`. `<CC>` must be the
two-letter country code for which to apply the file. The file may also be
gzipped. Then it must be called `<CC>_postcodes.csv.gz`.
The CSV file must use commas as a delimiter and have a header line. Nominatim
expects three columns to be present: `postcode`, `lat` and `lon`. All other
columns are ignored. `lon` and `lat` must describe the x and y coordinates of the
postcode centroids in WGS84.
The postcode files are loaded only when there is data for the given country
in your database. For example, if there is a `us_postcodes.csv` file in your
project directory but you import only an excerpt of Italy, then the US postcodes
will simply be ignored.
As a rule, the external postcode data should be put into the project directory
**before** starting the initial import. Still, you can add, remove and update the
external postcode data at any time. Simply
run:
```
nominatim refresh --postcodes
```
to make the changes visible in your database. Be aware, however, that the changes
only have an immediate effect on searches for postcodes. Postcodes that were
added to places are only updated, when they are reindexed. That usually happens
only during replication updates.

139
docs/customize/Ranking.md Normal file
View File

@@ -0,0 +1,139 @@
# Place Ranking in Nominatim
Nominatim uses two metrics to rank a place: search rank and address rank.
This chapter explains what place ranking means and how it can be customized.
## Search rank
The search rank describes the extent and importance of a place. It is used
when ranking search results. Simply put, if there are two results for a
search query which are otherwise equal, then the result with the _lower_
search rank will be appear higher in the result list.
Search ranks are not so important these days because many well-known
places use the Wikipedia importance ranking instead.
The following table gives an overview of the kind of features that Nominatim
expects for each rank:
rank | typical place types | extent
-------|---------------------------------|-------
1-3 | oceans, continents | -
4 | countries | -
5-9 | states, regions, provinces | -
10-12 | counties | -
13-16 | cities, municipalities, islands | 15 km
17-18 | towns, boroughs | 4 km
19 | villages, suburbs | 2 km
20 | hamlets, farms, neighbourhoods | 1 km
21-25 | isolated dwellings, city blocks | 500 m
The extent column describes how far a feature is assumed to reach when it
is mapped only as a point. Larger features like countries and states are usually
available with their exact area in the OpenStreetMap data. That is why no extent
is given.
## Address rank
The address rank describes where a place shows up in an address hierarchy.
Usually only administrative boundaries and place nodes and areas are
eligible to be part of an address. Places that should not appear in the
address must have an address rank of 0.
The following table gives an overview how ranks are mapped to address parts:
rank | address part
-------------|-------------
1-3 | _unused_
4 | country
5-9 | state
10-12 | county
13-16 | city
17-21 | suburb
22-24 | neighbourhood
25 | squares, farms, localities
26-27 | street
28-30 | POI/house number
The country rank 4 usually doesn't show up in the address parts of an object.
The country is determined indirectly from the country code.
Ranks 5-24 can be assigned more or less freely. They make up the major part
of the address.
Rank 25 is also an addressing rank but it is special because while it can be
the parent to a POI with an addr:place of the same name, it cannot be a parent
to streets. Use it for place features that are technically on the same level
as a street (e.g. squares, city blocks) or for places that should not normally
appear in an address unless explicitly tagged so (e.g place=locality which
should be uninhabited and as such not addressable).
The street ranks 26 and 27 are handled slightly differently. Only one object
from these ranks shows up in an address.
For POI level objects like shops, buildings or house numbers always use rank 30.
Ranks 28 is reserved for house number interpolations. 29 is for internal use
only.
## Rank configuration
Search and address ranks are assigned to a place when it is first imported
into the database. There are a few hard-coded rules for the assignment:
* postcodes follow special rules according to their length
* boundaries that are not areas and railway=rail are dropped completely
* the following are always search rank 30 and address rank 0:
* highway nodes
* landuse that is not an area
Other than that, the ranks can be freely assigned via the JSON file according
to their type and the country they are in. The name of the config file to be
used can be changed with the setting `NOMINATIM_ADDRESS_LEVEL_CONFIG`.
The address level configuration must consist of an array of configuration
entries, each containing a tag definition and an optional country array:
```
[ {
"tags" : {
"place" : {
"county" : 12,
"city" : 16,
},
"landuse" : {
"residential" : 22,
"" : 30
}
}
},
{
"countries" : [ "ca", "us" ],
"tags" : {
"boundary" : {
"administrative8" : 18,
"administrative9" : 20
},
"landuse" : {
"residential" : [22, 0]
}
}
}
]
```
The `countries` field contains a list of countries (as ISO 3166-1 alpha 2 code)
for which the definition applies. When the field is omitted, then the
definition is used as a fallback, when nothing more specific for a given
country exists.
`tags` contains the ranks for key/value pairs. The ranks can be either a
single number, in which case they are the search and address rank, or an array
of search and address rank (in that order). The value may be left empty.
Then the rank is used when no more specific value is found for the given
key.
Countries and key/value combination may appear in multiple definitions. Just
make sure that each combination of country/key/value appears only once per
file. Otherwise the import will fail with a UNIQUE INDEX constraint violation
on import.

649
docs/customize/Settings.md Normal file
View File

@@ -0,0 +1,649 @@
This section provides a reference of all configuration parameters that can
be used with Nominatim.
# Configuring Nominatim
Nominatim uses [dotenv](https://github.com/theskumar/python-dotenv) to manage
its configuration settings. There are two means to set configuration
variables: through an `.env` configuration file or through an environment
variable.
The `.env` configuration file needs to be placed into the
[project directory](../admin/Import.md#creating-the-project-directory). It
must contain configuration parameters in `<parameter>=<value>` format.
Please refer to the dotenv documentation for details.
The configuration options may also be set in the form of shell environment
variables. This is particularly useful, when you want to temporarily change
a configuration option. For example, to force the replication serve to
download the next change, you can temporarily disable the update interval:
NOMINATIM_REPLICATION_UPDATE_INTERVAL=0 nominatim replication --once
If a configuration option is defined through .env file and environment
variable, then the latter takes precedence.
## Configuration Parameter Reference
### Import and Database Settings
#### NOMINATIM_DATABASE_DSN
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Database connection string |
| **Format:** | string: `pgsql:<param1>=<value1>;<param2>=<value2>;...` |
| **Default:** | pgsql:dbname=nominatim |
| **After Changes:** | run `nominatim refresh --website` |
Sets the connection parameters for the Nominatim database. At a minimum
the name of the database (`dbname`) is required. You can set any additional
parameter that is understood by libpq. See the [Postgres documentation](https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-PARAMKEYWORDS) for a full list.
!!! note
It is usually recommended not to set the password directly in this
configuration parameter. Use a
[password file](https://www.postgresql.org/docs/current/libpq-pgpass.html)
instead.
#### NOMINATIM_DATABASE_WEBUSER
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Database query user |
| **Format:** | string |
| **Default:** | www-data |
| **After Changes:** | cannot be changed after import |
Defines the name of the database user that will run search queries. Usually
this is the user under which the webserver is executed. When running Nominatim
via php-fpm, you can also define a separate query user. The Postgres user
needs to be set up before starting the import.
Nominatim grants minimal rights to this user to all tables that are needed
for running geocoding queries.
#### NOMINATIM_DATABASE_MODULE_PATH
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Directory where to find the PostgreSQL server module |
| **Format:** | path |
| **Default:** | _empty_ (use `<project_directory>/module`) |
| **After Changes:** | run `nominatim refresh --functions` |
| **Comment:** | Legacy tokenizer only |
Defines the directory in which the PostgreSQL server module `nominatim.so`
is stored. The directory and module must be accessible by the PostgreSQL
server.
For information on how to use this setting when working with external databases,
see [Advanced Installations](../admin/Advanced-Installations.md).
The option is only used by the Legacy tokenizer and ignored otherwise.
#### NOMINATIM_TOKENIZER
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Tokenizer used for normalizing and parsing queries and names |
| **Format:** | string |
| **Default:** | legacy |
| **After Changes:** | cannot be changed after import |
Sets the tokenizer type to use for the import. For more information on
available tokenizers and how they are configured, see
[Tokenizers](../customize/Tokenizers.md).
#### NOMINATIM_TOKENIZER_CONFIG
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Configuration file for the tokenizer |
| **Format:** | path |
| **Default:** | _empty_ (default file depends on tokenizer) |
| **After Changes:** | see documentation for each tokenizer |
Points to the file with additional configuration for the tokenizer.
See the [Tokenizer](../customize/Tokenizers.md) descriptions for details
on the file format.
If a relative path is given, then the file is searched first relative to the
project directory and then in the global settings directory.
#### NOMINATIM_MAX_WORD_FREQUENCY
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Number of occurrences before a word is considered frequent |
| **Format:** | int |
| **Default:** | 50000 |
| **After Changes:** | cannot be changed after import |
| **Comment:** | Legacy tokenizer only |
The word frequency count is used by the Legacy tokenizer to automatically
identify _stop words_. Any partial term that occurs more often then what
is defined in this setting, is effectively ignored during search.
#### NOMINATIM_LIMIT_REINDEXING
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Avoid invalidating large areas |
| **Format:** | bool |
| **Default:** | yes |
Nominatim computes the address of each place at indexing time. This has the
advantage to make search faster but also means that more objects needs to
be invalidated when the data changes. For example, changing the name of
the state of Florida would require recomputing every single address point
in the state to make the new name searchable in conjunction with addresses.
Setting this option to 'yes' means that Nominatim skips reindexing of contained
objects when the area becomes too large.
#### NOMINATIM_LANGUAGES
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Restrict search languages |
| **Format:** | string: comma-separated list of language codes |
| **Default:** | _empty_ |
Normally Nominatim will include all language variants of name:XX
in the search index. Set this to a comma separated list of language
codes, to restrict import to a subset of languages.
Currently only affects the initial import of country names and special phrases.
#### NOMINATIM_TERM_NORMALIZATION
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Rules for normalizing terms for comparisons |
| **Format:** | string: semicolon-separated list of ICU rules |
| **Default:** | :: NFD (); [[:Nonspacing Mark:] [:Cf:]] >; :: lower (); [[:Punctuation:][:Space:]]+ > ' '; :: NFC (); |
| **Comment:** | Legacy tokenizer only |
[Special phrases](Special-Phrases.md) have stricter matching requirements than
normal search terms. They must appear exactly in the query after this term
normalization has been applied.
Only has an effect on the Legacy tokenizer. For the ICU tokenizer the rules
defined in the
[normalization section](Tokenizers.md#normalization-and-transliteration)
will be used.
#### NOMINATIM_USE_US_TIGER_DATA
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Enable searching for Tiger house number data |
| **Format:** | boolean |
| **Default:** | no |
| **After Changes:** | run `nominatim --refresh --functions` |
When this setting is enabled, search and reverse queries also take data
from [Tiger house number data](Tiger.md) into account.
#### NOMINATIM_USE_AUX_LOCATION_DATA
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Enable searching in external house number tables |
| **Format:** | boolean |
| **Default:** | no |
| **After Changes:** | run `nominatim --refresh --functions` |
| **Comment:** | Do not use. |
When this setting is enabled, search queries also take data from external
house number tables into account.
*Warning:* This feature is currently unmaintained and should not be used.
#### NOMINATIM_HTTP_PROXY
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Use HTTP proxy when downloading data |
| **Format:** | boolean |
| **Default:** | no |
When this setting is enabled and at least
[NOMINATIM_HTTP_PROXY_HOST](#nominatim_http_proxy_host) and
[NOMINATIM_HTTP_PROXY_PORT](#nominatim_http_proxy_port) are set, the
configured proxy will be used, when downloading external data like
replication diffs.
#### NOMINATIM_HTTP_PROXY_HOST
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Host name of the proxy to use |
| **Format:** | string |
| **Default:** | _empty_ |
When [NOMINATIM_HTTP_PROXY](#nominatim_http_proxy) is enabled, this setting
configures the proxy host name.
#### NOMINATIM_HTTP_PROXY_PORT
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Port number of the proxy to use |
| **Format:** | integer |
| **Default:** | 3128 |
When [NOMINATIM_HTTP_PROXY](#nominatim_http_proxy) is enabled, this setting
configures the port number to use with the proxy.
#### NOMINATIM_HTTP_PROXY_LOGIN
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Username for proxies that require login |
| **Format:** | string |
| **Default:** | _empty_ |
When [NOMINATIM_HTTP_PROXY](#nominatim_http_proxy) is enabled, use this
setting to define the username for proxies that require a login.
#### NOMINATIM_HTTP_PROXY_PASSWORD
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Password for proxies that require login |
| **Format:** | string |
| **Default:** | _empty_ |
When [NOMINATIM_HTTP_PROXY](#nominatim_http_proxy) is enabled, use this
setting to define the password for proxies that require a login.
#### NOMINATIM_OSM2PGSQL_BINARY
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Location of the osm2pgsql binary |
| **Format:** | path |
| **Default:** | _empty_ (use binary shipped with Nominatim) |
| **Comment:** | EXPERT ONLY |
Nominatim uses [osm2pgsql](https://osm2pgsql.org) to load the OSM data
initially into the database. Nominatim comes bundled with a version of
osm2pgsql that is guaranteed to be compatible. Use this setting to use
a different binary instead. You should do this only when you know exactly
what you are doing. If the osm2pgsql version is not compatible, then the
result is undefined.
#### NOMINATIM_WIKIPEDIA_DATA_PATH
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Directory with the wikipedia importance data |
| **Format:** | path |
| **Default:** | _empty_ (project directory) |
Set a custom location for the
[wikipedia ranking file](../admin/Import.md#wikipediawikidata-rankings). When
unset, Nominatim expects the data to be saved in the project directory.
#### NOMINATIM_ADDRESS_LEVEL_CONFIG
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Configuration file for rank assignments |
| **Format:** | path |
| **Default:** | address-levels.json |
The _address level configuration_ defines the rank assignments for places. See
[Place Ranking](Ranking.md) for a detailed explanation what rank assignments
are and what the configuration file must look like.
When a relative path is given, then the file is searched first relative to the
project directory and then in the global settings directory.
#### NOMINATIM_IMPORT_STYLE
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Configuration to use for the initial OSM data import |
| **Format:** | string or path |
| **Default:** | extratags |
The _style configuration_ describes which OSM objects and tags are taken
into consideration for the search database. Nominatim comes with a set
of pre-configured styles, that may be configured here.
You can also write your own custom style and point the setting to the file
with the style. When a relative path is given, then the style file is searched
first relative to the project directory and then in the global settings
directory.
See [Import Styles](Import-Styles.md)
for more information on the available internal styles and the format of the
configuration file.
#### NOMINATIM_FLATNODE_FILE
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Location of osm2pgsql flatnode file |
| **Format:** | path |
| **Default:** | _empty_ (do not use a flatnote file) |
| **After Changes:** | Only change when moving the file physically. |
The `osm2pgsql flatnode file` is file that efficiently stores geographic
location for OSM nodes. For larger imports it can significantly speed up
the import. When this option is unset, then osm2pgsql uses a PsotgreSQL table
to store the locations.
When a relative path is given, then the flatnode file is created/searched
relative to the project directory.
!!! warning
The flatnode file is not only used during the initial import but also
when adding new data with `nominatim add-data` or `nominatim replication`.
Make sure you keep the flatnode file around and this setting unmodified,
if you plan to add more data or run regular updates.
#### NOMINATIM_TABLESPACE_*
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Group of settings for distributing the database over tablespaces |
| **Format:** | string |
| **Default:** | _empty_ (do not use a table space) |
| **After Changes:** | no effect after initial import |
Nominatim allows to distribute the search database over up to 10 different
[PostgreSQL tablespaces](https://www.postgresql.org/docs/current/manage-ag-tablespaces.html).
If you use this option, make sure that the tablespaces exist before starting
the import.
The available tablespace groups are:
NOMINATIM_TABLESPACE_SEARCH_DATA
: Data used by the geocoding frontend.
NOMINATIM_TABLESPACE_SEARCH_INDEX
: Indexes used by the geocoding frontend.
NOMINATIM_TABLESPACE_OSM_DATA
: Raw OSM data cache used for import and updates.
NOMINATIM_TABLESPACE_OSM_DATA
: Indexes on the raw OSM data cache.
NOMINATIM_TABLESPACE_PLACE_DATA
: Data table with the pre-filtered but still unprocessed OSM data.
Used only during imports and updates.
NOMINATIM_TABLESPACE_PLACE_INDEX
: Indexes on raw data table. Used only during imports and updates.
NOMINATIM_TABLESPACE_ADDRESS_DATA
: Data tables used for computing search terms and addresses of places
during import and updates.
NOMINATIM_TABLESPACE_ADDRESS_INDEX
: Indexes on the data tables for search term and address computation.
Used only for import and updates.
NOMINATIM_TABLESPACE_AUX_DATA
: Auxiliary data tables for non-OSM data, e.g. for Tiger house number data.
NOMINATIM_TABLESPACE_AUX_INDEX
: Indexes on auxiliary data tables.
### Replication Update Settings
#### NOMINATIM_REPLICATION_URL
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Base URL of the replication service |
| **Format:** | url |
| **Default:** | https://planet.openstreetmap.org/replication/minute |
| **After Changes:** | run `nominatim replication --init` |
Replication services deliver updates to OSM data. Use this setting to choose
which replication service to use. See [Updates](../admin/Update.md) for more
information on how to set up regular updates.
#### NOMINATIM_REPLICATION_MAX_DIFF
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Maximum amount of data to download per update cycle (in MB) |
| **Format:** | integer |
| **Default:** | 50 |
| **After Changes:** | restart the replication process |
At each update cycle Nominatim downloads diffs until either no more diffs
are available on the server (i.e. the database is up-to-date) or the limit
given in this setting is exceeded. Nominatim guarantees to downloads at least
one diff, if one is available, no matter how small the setting.
The default for this setting is fairly conservative because Nominatim keeps
all data downloaded in one cycle in RAM. Using large values in a production
server may interfere badly with the search frontend because it evicts data
from RAM that is needed for speedy answers to incoming requests. It is usually
a better idea to keep this setting lower and run multiple update cycles
to catch up with updates.
When catching up in non-production mode, for example after the initial import,
the setting can easily be changed temporarily on the command line:
NOMINATIM_REPLICATION_MAX_DIFF=3000 nominatim replication
#### NOMINATIM_REPLICATION_UPDATE_INTERVAL
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Publication interval of the replication service (in seconds) |
| **Format:** | integer |
| **Default:** | 75 |
| **After Changes:** | restart the replication process |
This setting determines when Nominatim will attempt to download again a new
update. The time is computed from the publication date of the last diff
downloaded. Setting this to a slightly higher value than the actual
publication interval avoids unnecessary rechecks.
#### NOMINATIM_REPLICATION_RECHECK_INTERVAL
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Wait time to recheck for a pending update (in seconds) |
| **Format:** | integer |
| **Default:** | 60 |
| **After Changes:** | restart the replication process |
When replication updates are run in continuous mode (using `nominatim replication`),
this setting determines how long Nominatim waits until it looks for updates
again when updates were not available on the server.
Note that this is different from
[NOMINATIM_REPLICATION_UPDATE_INTERVAL](#nominatim_replication_update_interval).
Nominatim will never attempt to query for new updates for UPDATE_INTERVAL
seconds after the current database date. Only after the update interval has
passed it asks for new data. If then no new data is found, it waits for
RECHECK_INTERVAL seconds before it attempts again.
### API Settings
#### NOMINATIM_CORS_NOACCESSCONTROL
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Send permissive CORS access headers |
| **Format:** | boolean |
| **Default:** | yes |
| **After Changes:** | run `nominatim refresh --website` |
When this setting is enabled, API HTTP responses include the HTTP
[CORS](https://en.wikipedia.org/wiki/CORS) headers
`access-control-allow-origin: *` and `access-control-allow-methods: OPTIONS,GET`.
#### NOMINATIM_MAPICON_URL
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | URL prefix for static icon images |
| **Format:** | url |
| **Default:** | _empty_ |
| **After Changes:** | run `nominatim refresh --website` |
When a mapicon URL is configured, then Nominatim includes an additional `icon`
field in the responses, pointing to an appropriate icon for the place type.
Map icons used to be included in Nominatim itself but now have moved to the
[nominatim-ui](https://github.com/osm-search/nominatim-ui/) project. If you
want the URL to be included in API responses, make the `/mapicon`
directory of the project available under a public URL and point this setting
to the directory.
#### NOMINATIM_DEFAULT_LANGUAGE
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Language of responses when no language is requested |
| **Format:** | language code |
| **Default:** | _empty_ (use the local language of the feature) |
| **After Changes:** | run `nominatim refresh --website` |
Nominatim localizes the place names in responses when the corresponding
translation is available. Users can request a custom language setting through
the HTTP accept-languages header or through the explicit parameter
[accept-languages](../api/Search.md#language-of-results). If neither is
given, it falls back to this setting. If the setting is also empty, then
the local languages (in OSM: the name tag without any language suffix) is
used.
#### NOMINATIM_SEARCH_BATCH_MODE
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Enable a special batch query mode |
| **Format:** | boolean |
| **Default:** | no |
| **After Changes:** | run `nominatim refresh --website` |
This feature is currently undocumented and potentially broken.
#### NOMINATIM_SEARCH_NAME_ONLY_THRESHOLD
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Threshold for switching the search index lookup strategy |
| **Format:** | integer |
| **Default:** | 500 |
| **After Changes:** | run `nominatim refresh --website` |
This setting defines the threshold over which a name is no longer considered
as rare. When searching for places with rare names, only the name is used
for place lookups. Otherwise the name and any address information is used.
This setting only has an effect after `nominatim refresh --word-counts` has
been called to compute the word frequencies.
#### NOMINATIM_LOOKUP_MAX_COUNT
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Maximum number of OSM ids accepted by /lookup |
| **Format:** | integer |
| **Default:** | 50 |
| **After Changes:** | run `nominatim refresh --website` |
The /lookup point accepts list of ids to look up address details for. This
setting restricts the number of places a user may look up with a single
request.
#### NOMINATIM_POLYGON_OUTPUT_MAX_TYPES
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Number of different geometry formats that may be returned |
| **Format:** | integer |
| **Default:** | 1 |
| **After Changes:** | run `nominatim refresh --website` |
Nominatim supports returning full geometries of places. The geometries may
be requested in different formats with one of the
[`polygon_*` parameters](../api/Search.md#polygon-output). Use this
setting to restrict the number of geometry types that may be requested
with a single query.
Setting this parameter to 0 disables polygon output completely.
### Logging Settings
#### NOMINATIM_LOG_DB
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Log requests into the database |
| **Format:** | boolean |
| **Default:** | no |
| **After Changes:** | run `nominatim refresh --website` |
Enable logging requests into a database table with this setting. The logs
can be found in the table `new_query_log`.
When using this logging method, it is advisable to set up a job that
regularly clears out old logging information. Nominatim will not do that
on its own.
Can be used as the same time as NOMINATIM_LOG_FILE.
#### NOMINATIM_LOG_FILE
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Log requests into a file |
| **Format:** | path |
| **Default:** | _empty_ (logging disabled) |
| **After Changes:** | run `nominatim refresh --website` |
Enable logging of requests into a file with this setting by setting the log
file where to log to. A relative file name is assumed to be relative to
the project directory.
The entries in the log file have the following format:
<request time> <execution time in s> <number of results> <type> "<query string>"
Request time is the time when the request was started. The execution time is
given in ms and corresponds to the time the query took executing in PHP.
type contains the name of the endpoint used.
Can be used as the same time as NOMINATIM_LOG_DB.

View File

@@ -0,0 +1,34 @@
# Special phrases
## Importing OSM user-maintained special phrases
As described in the [Import section](../admin/Import.md), it is possible to
import special phrases from the wiki with the following command:
```sh
nominatim special-phrases --import-from-wiki
```
## Importing custom special phrases
But, it is also possible to import some phrases from a csv file.
To do so, you have access to the following command:
```sh
nominatim special-phrases --import-from-csv <csv file>
```
Note that the two previous import commands will update the phrases from your database.
This means that if you import some phrases from a csv file, only the phrases
present in the csv file will be kept into the database. All other phrases will
be removed.
If you want to only add new phrases and not update the other ones you can add
the argument `--no-replace` to the import command. For example:
```sh
nominatim special-phrases --import-from-csv <csv file> --no-replace
```
This will add the phrases present in the csv file into the database without
removing the other ones.

28
docs/customize/Tiger.md Normal file
View File

@@ -0,0 +1,28 @@
# Installing TIGER housenumber data for the US
Nominatim is able to use the official [TIGER](https://www.census.gov/geographies/mapping-files/time-series/geo/tiger-line-file.html)
address set to complement the OSM house number data in the US. You can add
TIGER data to your own Nominatim instance by following these steps. The
entire US adds about 10GB to your database.
1. Get preprocessed TIGER 2021 data:
cd $PROJECT_DIR
wget https://nominatim.org/data/tiger2021-nominatim-preprocessed.csv.tar.gz
2. Import the data into your Nominatim database:
nominatim add-data --tiger-data tiger2021-nominatim-preprocessed.csv.tar.gz
3. Enable use of the Tiger data in your `.env` by adding:
echo NOMINATIM_USE_US_TIGER_DATA=yes >> .env
4. Apply the new settings:
nominatim refresh --functions
See the [TIGER-data project](https://github.com/osm-search/TIGER-data) for more
information on how the data got preprocessed.

View File

@@ -0,0 +1,302 @@
# Tokenizers
The tokenizer module in Nominatim is responsible for analysing the names given
to OSM objects and the terms of an incoming query in order to make sure, they
can be matched appropriately.
Nominatim offers different tokenizer modules, which behave differently and have
different configuration options. This sections describes the tokenizers and how
they can be configured.
!!! important
The use of a tokenizer is tied to a database installation. You need to choose
and configure the tokenizer before starting the initial import. Once the import
is done, you cannot switch to another tokenizer anymore. Reconfiguring the
chosen tokenizer is very limited as well. See the comments in each tokenizer
section.
## Legacy tokenizer
The legacy tokenizer implements the analysis algorithms of older Nominatim
versions. It uses a special Postgresql module to normalize names and queries.
This tokenizer is currently the default.
To enable the tokenizer add the following line to your project configuration:
```
NOMINATIM_TOKENIZER=legacy
```
The Postgresql module for the tokenizer is available in the `module` directory
and also installed with the remainder of the software under
`lib/nominatim/module/nominatim.so`. You can specify a custom location for
the module with
```
NOMINATIM_DATABASE_MODULE_PATH=<path to directory where nominatim.so resides>
```
This is in particular useful when the database runs on a different server.
See [Advanced installations](../admin/Advanced-Installations.md#importing-nominatim-to-an-external-postgresql-database) for details.
There are no other configuration options for the legacy tokenizer. All
normalization functions are hard-coded.
## ICU tokenizer
The ICU tokenizer uses the [ICU library](http://site.icu-project.org/) to
normalize names and queries. It also offers configurable decomposition and
abbreviation handling.
To enable the tokenizer add the following line to your project configuration:
```
NOMINATIM_TOKENIZER=icu
```
### How it works
On import the tokenizer processes names in the following three stages:
1. During the **Sanitizer step** incoming names are cleaned up and converted to
**full names**. This step can be used to regularize spelling, split multi-name
tags into their parts and tag names with additional attributes. See the
[Sanitizers section](#sanitizers) below for available cleaning routines.
2. The **Normalization** part removes all information from the full names
that are not relevant for search.
3. The **Token analysis** step takes the normalized full names and creates
all transliterated variants under which the name should be searchable.
See the [Token analysis](#token-analysis) section below for more
information.
During query time, only normalization and transliteration are relevant.
An incoming query is first split into name chunks (this usually means splitting
the string at the commas) and the each part is normalised and transliterated.
The result is used to look up places in the search index.
### Configuration
The ICU tokenizer is configured using a YAML file which can be configured using
`NOMINATIM_TOKENIZER_CONFIG`. The configuration is read on import and then
saved as part of the internal database status. Later changes to the variable
have no effect.
Here is an example configuration file:
``` yaml
normalization:
- ":: lower ()"
- "ß > 'ss'" # German szet is unimbigiously equal to double ss
transliteration:
- !include /etc/nominatim/icu-rules/extended-unicode-to-asccii.yaml
- ":: Ascii ()"
sanitizers:
- step: split-name-list
token-analysis:
- analyzer: generic
variants:
- !include icu-rules/variants-ca.yaml
- words:
- road -> rd
- bridge -> bdge,br,brdg,bri,brg
```
The configuration file contains four sections:
`normalization`, `transliteration`, `sanitizers` and `token-analysis`.
#### Normalization and Transliteration
The normalization and transliteration sections each define a set of
ICU rules that are applied to the names.
The **normalisation** rules are applied after sanitation. They should remove
any information that is not relevant for search at all. Usual rules to be
applied here are: lower-casing, removing of special characters, cleanup of
spaces.
The **transliteration** rules are applied at the end of the tokenization
process to transfer the name into an ASCII representation. Transliteration can
be useful to allow for further fuzzy matching, especially between different
scripts.
Each section must contain a list of
[ICU transformation rules](https://unicode-org.github.io/icu/userguide/transforms/general/rules.html).
The rules are applied in the order in which they appear in the file.
You can also include additional rules from external yaml file using the
`!include` tag. The included file must contain a valid YAML list of ICU rules
and may again include other files.
!!! warning
The ICU rule syntax contains special characters that conflict with the
YAML syntax. You should therefore always enclose the ICU rules in
double-quotes.
#### Sanitizers
The sanitizers section defines an ordered list of functions that are applied
to the name and address tags before they are further processed by the tokenizer.
They allows to clean up the tagging and bring it to a standardized form more
suitable for building the search index.
!!! hint
Sanitizers only have an effect on how the search index is built. They
do not change the information about each place that is saved in the
database. In particular, they have no influence on how the results are
displayed. The returned results always show the original information as
stored in the OpenStreetMap database.
Each entry contains information of a sanitizer to be applied. It has a
mandatory parameter `step` which gives the name of the sanitizer. Depending
on the type, it may have additional parameters to configure its operation.
The order of the list matters. The sanitizers are applied exactly in the order
that is configured. Each sanitizer works on the results of the previous one.
The following is a list of sanitizers that are shipped with Nominatim.
##### split-name-list
::: nominatim.tokenizer.sanitizers.split_name_list
selection:
members: False
rendering:
heading_level: 6
##### strip-brace-terms
::: nominatim.tokenizer.sanitizers.strip_brace_terms
selection:
members: False
rendering:
heading_level: 6
##### tag-analyzer-by-language
::: nominatim.tokenizer.sanitizers.tag_analyzer_by_language
selection:
members: False
rendering:
heading_level: 6
#### Token Analysis
Token analyzers take a full name and transform it into one or more normalized
form that are then saved in the search index. In its simplest form, the
analyzer only applies the transliteration rules. More complex analyzers
create additional spelling variants of a name. This is useful to handle
decomposition and abbreviation.
The ICU tokenizer may use different analyzers for different names. To select
the analyzer to be used, the name must be tagged with the `analyzer` attribute
by a sanitizer (see for example the
[tag-analyzer-by-language sanitizer](#tag-analyzer-by-language)).
The token-analysis section contains the list of configured analyzers. Each
analyzer must have an `id` parameter that uniquely identifies the analyzer.
The only exception is the default analyzer that is used when no special
analyzer was selected.
Different analyzer implementations may exist. To select the implementation,
the `analyzer` parameter must be set. Currently there is only one implementation
`generic` which is described in the following.
##### Generic token analyzer
The generic analyzer is able to create variants from a list of given
abbreviation and decomposition replacements. It takes one optional parameter
`variants` which lists the replacements to apply. If the section is
omitted, then the generic analyzer becomes a simple analyzer that only
applies the transliteration.
The variants section defines lists of replacements which create alternative
spellings of a name. To create the variants, a name is scanned from left to
right and the longest matching replacement is applied until the end of the
string is reached.
The variants section must contain a list of replacement groups. Each group
defines a set of properties that describes where the replacements are
applicable. In addition, the word section defines the list of replacements
to be made. The basic replacement description is of the form:
```
<source>[,<source>[...]] => <target>[,<target>[...]]
```
The left side contains one or more `source` terms to be replaced. The right side
lists one or more replacements. Each source is replaced with each replacement
term.
!!! tip
The source and target terms are internally normalized using the
normalization rules given in the configuration. This ensures that the
strings match as expected. In fact, it is better to use unnormalized
words in the configuration because then it is possible to change the
rules for normalization later without having to adapt the variant rules.
###### Decomposition
In its standard form, only full words match against the source. There
is a special notation to match the prefix and suffix of a word:
``` yaml
- ~strasse => str # matches "strasse" as full word and in suffix position
- hinter~ => hntr # matches "hinter" as full word and in prefix position
```
There is no facility to match a string in the middle of the word. The suffix
and prefix notation automatically trigger the decomposition mode: two variants
are created for each replacement, one with the replacement attached to the word
and one separate. So in above example, the tokenization of "hauptstrasse" will
create the variants "hauptstr" and "haupt str". Similarly, the name "rote strasse"
triggers the variants "rote str" and "rotestr". By having decomposition work
both ways, it is sufficient to create the variants at index time. The variant
rules are not applied at query time.
To avoid automatic decomposition, use the '|' notation:
``` yaml
- ~strasse |=> str
```
simply changes "hauptstrasse" to "hauptstr" and "rote strasse" to "rote str".
###### Initial and final terms
It is also possible to restrict replacements to the beginning and end of a
name:
``` yaml
- ^south => s # matches only at the beginning of the name
- road$ => rd # matches only at the end of the name
```
So the first example would trigger a replacement for "south 45th street" but
not for "the south beach restaurant".
###### Replacements vs. variants
The replacement syntax `source => target` works as a pure replacement. It changes
the name instead of creating a variant. To create an additional version, you'd
have to write `source => source,target`. As this is a frequent case, there is
a shortcut notation for it:
```
<source>[,<source>[...]] -> <target>[,<target>[...]]
```
The simple arrow causes an additional variant to be added. Note that
decomposition has an effect here on the source as well. So a rule
``` yaml
- "~strasse -> str"
```
means that for a word like `hauptstrasse` four variants are created:
`hauptstrasse`, `haupt strasse`, `hauptstr` and `haupt str`.
### Reconfiguration
Changing the configuration after the import is currently not possible, although
this feature may be added at a later time.

View File

@@ -0,0 +1,167 @@
# Database Layout
### Import tables
OSM data is initially imported using [osm2pgsql](https://osm2pgsql.org).
Nominatim uses its own data output style 'gazetteer', which differs from the
output style created for map rendering.
The import process creates the following tables:
![osm2pgsql tables](osm2pgsql-tables.svg)
The `planet_osm_*` tables are the usual backing tables for OSM data. Note
that Nominatim uses them to look up special relations and to find nodes on
ways.
The gazetteer style produces a single table `place` as output with the following
columns:
* `osm_type` - kind of OSM object (**N** - node, **W** - way, **R** - relation)
* `osm_id` - original OSM ID
* `class` - key of principal tag defining the object type
* `type` - value of principal tag defining the object type
* `name` - collection of tags that contain a name or reference
* `admin_level` - numerical value of the tagged administrative level
* `address` - collection of tags defining the address of an object
* `extratags` - collection of additional interesting tags that are not
directly relevant for searching
* `geometry` - geometry of the object (in WGS84)
A single OSM object may appear multiple times in this table when it is tagged
with multiple tags that may constitute a principal tag. Take for example a
motorway bridge. In OSM, this would be a way which is tagged with
`highway=motorway` and `bridge=yes`. This way would appear in the `place` table
once with `class` of `highway` and once with a `class` of `bridge`. Thus the
*unique key* for `place` is (`osm_type`, `osm_id`, `class`).
How raw OSM tags are mapped to the columns in the place table is to a certain
degree configurable. See [Customizing Import Styles](../customize/Import-Styles.md)
for more information.
### Search tables
The following tables carry all information needed to do the search:
![search tables](search-tables.svg)
The **placex** table is the central table that saves all information about the
searchable places in Nominatim. The basic columns are the same as for the
place table and have the same meaning. The placex tables adds the following
additional columns:
* `place_id` - the internal unique ID to identify the place
* `partition` - the id to use with partitioned tables (see below)
* `geometry_sector` - a location hash used for geographically close ordering
* `parent_place_id` - the next higher place in the address hierarchy, only
relevant for POI-type places (with rank 30)
* `linked_place_id` - place ID of the place this object has been merged with.
When this ID is set, then the place is invisible for search.
* `importance` - measure how well known the place is
* `rank_search`, `rank_address` - search and address rank (see [Customizing ranking](../customize/Ranking.md)
* `wikipedia` - the wikipedia page used for computing the importance of the place
* `country_code` - the country the place is located in
* `housenumber` - normalized housenumber, if the place has one
* `postcode` - computed postcode for the place
* `indexed_status` - processing status of the place (0 - ready, 1 - freshly inserted, 2 - needs updating, 100 - needs deletion)
* `indexed_date` - timestamp when the place was processed last
* `centroid` - a point feature for the place
The **location_property_osmline** table is a special table for
[address interpolations](https://wiki.openstreetmap.org/wiki/Addresses#Using_interpolation).
The columns have the same meaning and use as the columns with the same name in
the placex table. Only three columns are special:
* `startnumber` and `endnumber` - beginning and end of the number range
for the interpolation
* `interpolationtype` - a string `odd`, `even` or `all` to indicate
the interval between the numbers
Address interpolations are always ways in OSM, which is why there is no column
`osm_type`.
The **location_postcode** table holds computed centroids of all postcodes that
can be found in the OSM data. The meaning of the columns is again the same
as that of the placex table.
Every place needs an address, a set of surrounding places that describe the
location of the place. The set of address places is made up of OSM places
themselves. The **place_addressline** table cross-references for each place
all the places that make up its address. Two columns define the address
relation:
* `place_id` - reference to the place being addressed
* `address_place_id` - reference to the place serving as an address part
The most of the columns cache information from the placex entry of the address
part. The exceptions are:
* `fromarea` - is true if the address part has an area geometry and can
therefore be considered preceise
* `isaddress` - is true if the address part should show up in the address
output. Sometimes there are multiple places competing for for same address
type (e.g. multiple cities) and this field resolves the tie.
The **search_name** table contains the search index proper. It saves for each
place the terms with which the place can be found. The terms are split into
the name itself and all terms that make up the address. The table mirrors some
of the columns from placex for faster lookup.
Search terms are not saved as strings. Each term is assigned an integer and those
integers are saved in the name and address vectors of the search_name table. The
**word** table serves as the lookup table from string to such a word ID. The
exact content of the word table depends on the [tokenizer](Tokenizers.md) used.
## Address computation tables
Next to the main search tables, there is a set of secondary helper tables used
to compute the address relations between places. These tables are partitioned.
Each country is assigned a partition number in the country_name table (see
below) and the data is then split between a set of tables, one for each
partition. Note that Nominatim still manually manages partitioned tables.
Native support for partitions in PostgreSQL only became useable with version 13.
It will be a little while before Nominatim drops support for older versions.
![address tables](address-tables.svg)
The **search_name_X** tables are used to look up streets that appear in the
`addr:street` tag.
The **location_area_large_X** tables are used to look up larger areas
(administrative boundaries and place nodes) either through their geographic
closeness or through `addr:*` entries.
The **location_road_X** tables are used to find the closest street for a
dependent place.
All three table cache specific information from the placex table for their
selected subset of places:
* `keywords` and `name_vector` contain lists of term ids (from the word table)
that the full name of the place should match against
* `isguess` is true for places that are not described by an area
All other columns reflect their counterpart in the placex table.
## Static data tables
Nominatim also creates a number of static tables at import:
* `nominatim_properties` saves settings that must not be changed after
import
* `address_levels` save the rank information from the
[ranking configuration](../customize/Ranking.md)
* `country_name` contains a fallback of names for all countries, their
default languages and saves the assignment of countries to partitions.
* `country_osm_grid` provides a fallback for country geometries
## Auxilary data tables
Finally there are some table for auxillary data:
* `location_property_tiger` - saves housenumber from the Tiger import. Its
layout is similar to that of `location_propoerty_osmline`.
* `place_class_*` tables are helper tables to facilitate lookup of POIs
by their class and type. They exist because it is not possible to create
combined indexes with geometries.

View File

@@ -0,0 +1,129 @@
# Setting up Nominatim for Development
This chapter gives an overview how to set up Nominatim for developement
and how to run tests.
!!! Important
This guide assumes that you develop under the latest version of Ubuntu. You
can of course also use your favourite distribution. You just might have to
adapt the commands below slightly, in particular the commands for installing
additional software.
## Installing Nominatim
The first step is to install Nominatim itself. Please follow the installation
instructions in the [Admin section](../admin/Installation.md). You don't need
to set up a webserver for development, the webserver that is included with PHP
is sufficient.
If you want to run Nominatim in a VM via Vagrant, use the default `ubuntu` setup.
Vagrant's libvirt provider runs out-of-the-box under Ubuntu. You also need to
install an NFS daemon to enable directory sharing between host and guest. The
following packages should get you started:
sudo apt install vagrant vagrant-libvirt libvirt-daemon nfs-kernel-server
## Prerequisites for testing and documentation
The Nominatim test suite consists of behavioural tests (using behave) and
unit tests (using PHPUnit for PHP code and pytest for Python code).
It has the following additional requirements:
* [behave test framework](https://behave.readthedocs.io) >= 1.2.6
* [phpunit](https://phpunit.de) >= 7.3
* [PHP CodeSniffer](https://github.com/squizlabs/PHP_CodeSniffer)
* [Pylint](https://pylint.org/) (2.6.0 is used for the CI)
* [pytest](https://pytest.org)
The documentation is built with mkdocs:
* [mkdocs](https://www.mkdocs.org/) >= 1.1.2
* [mkdocstrings](https://mkdocstrings.github.io/)
### Installing prerequisites on Ubuntu/Debian
Some of the Python packages require the newest version which is not yet
available with the current distributions. Therefore it is recommended to
install pip to get the newest versions.
To install all necessary packages run:
```sh
sudo apt install php-cgi phpunit php-codesniffer \
python3-pip python3-setuptools python3-dev pylint
pip3 install --user behave mkdocs mkdocstrings pytest
```
The `mkdocs` executable will be located in `.local/bin`. You may have to add
this directory to your path, for example by running:
```
echo 'export PATH=~/.local/bin:$PATH' > ~/.profile
```
If your distribution does not have PHPUnit 7.3+, you can install it (as well
as CodeSniffer) via composer:
```
sudo apt-get install composer
composer global require "squizlabs/php_codesniffer=*"
composer global require "phpunit/phpunit=8.*"
```
The binaries are found in `.config/composer/vendor/bin`. You need to add this
to your PATH as well:
```
echo 'export PATH=~/.config/composer/vendor/bin:$PATH' > ~/.profile
```
## Executing Tests
All tests are located in the `/test` directory.
To run all tests just go to the build directory and run make:
```sh
cd build
make test
```
For more information about the structure of the tests and how to change and
extend the test suite, see the [Testing chapter](Testing.md).
## Documentation Pages
The [Nominatim documentation](https://nominatim.org/release-docs/develop/) is
built using the [MkDocs](https://www.mkdocs.org/) static site generation
framework. The master branch is automatically deployed every night on
[https://nominatim.org/release-docs/develop/](https://nominatim.org/release-docs/develop/)
To build the documentation, go to the build directory and run
```
make doc
INFO - Cleaning site directory
INFO - Building documentation to directory: /home/vagrant/build/site-html
```
This runs `mkdocs build` plus extra transformation of some files and adds
symlinks (see `CMakeLists.txt` for the exact steps).
Now you can start webserver for local testing
```
build> make serve-doc
[server:296] Serving on http://127.0.0.1:8000
[handlers:62] Start watching changes
```
If you develop inside a Vagrant virtual machine, use a port that is forwarded
to your host:
```
build> PYTHONPATH=$SRCDIR mkdocs serve --dev-addr 0.0.0.0:8088
[server:296] Serving on http://0.0.0.0:8088
[handlers:62] Start watching changes
```

152
docs/develop/Indexing.md Normal file
View File

@@ -0,0 +1,152 @@
# Indexing Places
In Nominatim, the word __indexing__ refers to the process that takes the raw
OpenStreetMap data from the place table, enriches it with address information
and creates the search indexes. This section explains the basic data flow.
## Initial import
After osm2pgsql has loaded the raw OSM data into the place table,
the data is copied to the final search tables placex and location_property_osmline.
While they are copied, some basic properties are added:
* country_code, geometry_sector and partition
* initial search and address rank
In addition the column `indexed_status` is set to `1` marking the place as one
that needs to be indexed.
All this happens in the triggers `placex_insert` and `osmline_insert`.
## Indexing
The main work horse of the data import is the indexing step, where Nominatim
takes every place from the placex and location_property_osmline tables where
the indexed_status != 0 and computes the search terms and the address parts
of the place.
The indexing happens in three major steps:
1. **Data preparation** - The indexer gets the data for the place to be indexed
from the database.
2. **Search name processing** - The prepared data is given to the
tokenizer which computes the search terms from the names
and potentially other information.
3. **Address processing** - The indexer then hands the prepared data and the
tokenizer information back to the database via an `INSERT` statement which
also sets the indexed_status to `0`. This triggers the update triggers
`placex_update`/`osmline_update` which do the work of computing address
parts and filling all the search tables.
When computing the address terms of a place, Nominatim relies on the processed
search names of all the address parts. That is why places are processed in rank
order, from smallest rank to largest. To ensure correct handling of linked
place nodes, administrative boundaries are processed before all other places.
Apart from these restrictions, each place can be indexed independently
from the others. This allows a large degree of parallelization during the indexing.
It also means that the indexing process can be interrupted at any time and
will simply pick up where it left of when restarted.
### Data preparation
The data preparation step computes and retrieves all data for a place that
might be needed for the next step of processing the search name. That includes
* location information (country code)
* place classification (class, type, ranks)
* names (including names of linked places)
* address information (`addr:*` tags)
Data preparation is implemented in pl/PgSQL mostly in the functions
`placex_indexing_prepare()` and `get_interpolation_address()`.
#### `addr:*` tag inheritance
Nominatim has limited support for inheriting address tags from a building
to POIs inside the building. This only works when the address tags are on the
building outline. Any rank 30 object inside such a building or on its outline
inherits all address tags when it does not have any address tags of its own.
The inheritance is computed in the data preparation step.
### Search name processing
The prepared place information is handed to the tokenizer next. This is a
Python module responsible for processing the names from both name and address
terms and building up the word index from them. The process is explained in
more detail in the [Tokenizer chapter](Tokenizer.md).
### Address processing
Finally, the preprocessed place information and the results of the search name
processing are written back to the database. At this point the update trigger
of the placex/location_property_osmline tables take over and fill all the
dependent tables. This makes up the most work-intensive part of the indexing.
Nominatim distinguishes between dependent and independent places.
**Dependent places** are all places on rank 30: house numbers, POIs etc. These
places don't have a full address of their own. Instead they are attached to
a parent street or place and use the information of the parent for searching
and displaying information. Everything else are **independent places**: streets,
parks, water bodies, suburbs, cities, states etc. They receive a full address
on their own.
The address processing for both types of places is very different.
#### Independent places
To compute the address of an independent place Nominatim searches for all
places that cover the place to compute the address for at least partially.
For places with an area, that area is used to check for coverage. For place
nodes an artificial square area is computed according to the rank of
the place. The lower the rank the lager the area. The `location_area_large_X`
tables are there to facilitate the lookup. All places that can function as
the address of another place are saved in those tables.
`addr:*` and `isin:*` tags are taken into account to compute the address, too.
Nominatim will give preference to places with the same name as in these tags
when looking for places in the vicinity. If there are no matching place names
at all, then the tags are at least added to the search index. That means that
the names will not be shown in the result as the 'address' of the place, but
searching by them still works.
Independent places are always added to the global search index `search_name`.
#### Dependent places
Dependent places skip the full address computation for performance reasons.
Instead they just find a parent place to attach themselves to.
![parenting of dependent places](parenting-flow.svg)
By default a POI
or house number will be attached to the closest street. That can be any major
or minor street indexed by Nominatim. In the default configuration that means
that it can attach itself to a footway but only when it has a name.
When the dependent place has an `addr:street` tag, then Nominatim will first
try to find a street with the same name before falling back to the closest
street.
There are also addresses in OSM, where the housenumber does not belong
to a street at all. These have an `addr:place` tag. For these places, Nominatim
tries to find a place with the given name in the indexed places with an
address rank between 16 and 25. If none is found, then the dependent place
is attached to the closest place in that category and the addr:place name is
added as *unlisted* place, which indicates to Nominatim that it needs to add
it to the address output, no matter what. This special case is necessary to
cover addresses that don't really refer to an existing object.
When an address has both the `addr:street` and `addr:place` tag, then Nominatim
assumes that the `addr:place` tag in fact should be the city part of the address
and give the POI the usual street number address.
Dependent places are only added to the global search index `search_name` when
they have either a name themselves or when they have address tags that are not
covered by the places that make up their address. The latter ensures that
addresses are always searchable by those address tags.

159
docs/develop/Testing.md Normal file
View File

@@ -0,0 +1,159 @@
# Nominatim Test Suite
This chapter describes the tests in the `/test` directory, how they are
structured and how to extend them. For a quick introduction on how to run
the tests, see the [Development setup chapter](Development-Environment.md).
## Overall structure
There are two kind of tests in this test suite. There are functional tests
which test the API interface using a BDD test framework and there are unit
tests for specific PHP functions.
This test directory is sturctured as follows:
```
-+- bdd Functional API tests
| \
| +- steps Step implementations for test descriptions
| +- osm2pgsql Tests for data import via osm2pgsql
| +- db Tests for internal data processing on import and update
| +- api Tests for API endpoints (search, reverse, etc.)
|
+- php PHP unit tests
+- python Python unit tests
+- scenes Geometry test data
+- testdb Base data for generating API test database
```
## PHP Unit Tests (`test/php`)
Unit tests for PHP code can be found in the `php/` directory. They test selected
PHP functions. Very low coverage.
To execute the test suite run
cd test/php
UNIT_TEST_DSN='pgsql:dbname=nominatim_unit_tests' phpunit ../
It will read phpunit.xml which points to the library, test path, bootstrap
strip and sets other parameters.
It will use (and destroy) a local database 'nominatim_unit_tests'. You can set
a different connection string with e.g. UNIT_TEST_DSN='pgsql:dbname=foo_unit_tests'.
## Python Unit Tests (`test/python`)
Unit tests for Python code can be found in the `python/` directory. The goal is
to have complete coverage of the Python library in `nominatim`.
To execute the tests run
py.test-3 test/python
or
pytest test/python
The name of the pytest binary depends on your installation.
## BDD Functional Tests (`test/bdd`)
Functional tests are written as BDD instructions. For more information on
the philosophy of BDD testing, see the
[Behave manual](http://pythonhosted.org/behave/philosophy.html).
The following explanation assume that the reader is familiar with the BDD
notations of features, scenarios and steps.
All possible steps can be found in the `steps` directory and should ideally
be documented.
### General Usage
To run the functional tests, do
cd test/bdd
behave
The tests can be configured with a set of environment variables (`behave -D key=val`):
* `BUILDDIR` - build directory of Nominatim installation to test
* `TEMPLATE_DB` - name of template database used as a skeleton for
the test databases (db tests)
* `TEST_DB` - name of test database (db tests)
* `API_TEST_DB` - name of the database containing the API test data (api tests)
* `API_TEST_FILE` - OSM file to be imported into the API test database (api tests)
* `DB_HOST` - (optional) hostname of database host
* `DB_PORT` - (optional) port of database on host
* `DB_USER` - (optional) username of database login
* `DB_PASS` - (optional) password for database login
* `SERVER_MODULE_PATH` - (optional) path on the Postgres server to Nominatim
module shared library file
* `REMOVE_TEMPLATE` - if true, the template and API database will not be reused
during the next run. Reusing the base templates speeds
up tests considerably but might lead to outdated errors
for some changes in the database layout.
* `KEEP_TEST_DB` - if true, the test database will not be dropped after a test
is finished. Should only be used if one single scenario is
run, otherwise the result is undefined.
Logging can be defined through command line parameters of behave itself. Check
out `behave --help` for details. Also have a look at the 'work-in-progress'
feature of behave which comes in handy when writing new tests.
### API Tests (`test/bdd/api`)
These tests are meant to test the different API endpoints and their parameters.
They require to import several datasets into a test database. This is normally
done automatically during setup of the test. The API test database is then
kept around and reused in subsequent runs of behave. Use `behave -DREMOVE_TEMPLATE`
to force a reimport of the database.
The official test dataset is saved in the file `test/testdb/apidb-test-data.pbf`
and compromises the following data:
* Geofabrik extract of Liechtenstein
* extract of Autauga country, Alabama, US (for tests against Tiger data)
* additional data from `test/testdb/additional_api_test.data.osm`
API tests should only be testing the functionality of the website PHP code.
Most tests should be formulated as BDD DB creation tests (see below) instead.
#### Code Coverage
The API tests also support code coverage tests. You need to install
[PHP_CodeCoverage](https://github.com/sebastianbergmann/php-code-coverage).
On Debian/Ubuntu run:
apt-get install php-codecoverage php-xdebug
Then run the API tests as follows:
behave api -DPHPCOV=<coverage output dir>
The output directory must be an absolute path. To generate reports, you can use
the [phpcov](https://github.com/sebastianbergmann/phpcov) tool:
phpcov merge --html=<report output dir> <coverage output dir>
### DB Creation Tests (`test/bdd/db`)
These tests check the import and update of the Nominatim database. They do not
test the correctness of osm2pgsql. Each test will write some data into the `place`
table (and optionally the `planet_osm_*` tables if required) and then run
Nominatim's processing functions on that.
These tests need to create their own test databases. By default they will be
called `test_template_nominatim` and `test_nominatim`. Names can be changed with
the environment variables `TEMPLATE_DB` and `TEST_DB`. The user running the tests
needs superuser rights for postgres.
### Import Tests (`test/bdd/osm2pgsql`)
These tests check that data is imported correctly into the place table. They
use the same template database as the DB Creation tests, so the same remarks apply.
Note that most testing of the gazetteer output of osm2pgsql is done in the tests
of osm2pgsql itself. The BDD tests are just there to ensure compatibility of
the osm2pgsql and Nominatim code.

332
docs/develop/Tokenizers.md Normal file
View File

@@ -0,0 +1,332 @@
# Tokenizers
The tokenizer is the component of Nominatim that is responsible for
analysing names of OSM objects and queries. Nominatim provides different
tokenizers that use different strategies for normalisation. This page describes
how tokenizers are expected to work and the public API that needs to be
implemented when creating a new tokenizer. For information on how to configure
a specific tokenizer for a database see the
[tokenizer chapter in the Customization Guide](../customize/Tokenizers.md).
## Generic Architecture
### About Search Tokens
Search in Nominatim is organised around search tokens. Such a token represents
string that can be part of the search query. Tokens are used so that the search
index does not need to be organised around strings. Instead the database saves
for each place which tokens match this place's name, address, house number etc.
To be able to distinguish between these different types of information stored
with the place, a search token also always has a certain type: name, house number,
postcode etc.
During search an incoming query is transformed into a ordered list of such
search tokens (or rather many lists, see below) and this list is then converted
into a database query to find the right place.
It is the core task of the tokenizer to create, manage and assign the search
tokens. The tokenizer is involved in two distinct operations:
* __at import time__: scanning names of OSM objects, normalizing them and
building up the list of search tokens.
* __at query time__: scanning the query and returning the appropriate search
tokens.
### Importing
The indexer is responsible to enrich an OSM object (or place) with all data
required for geocoding. It is split into two parts: the controller collects
the places that require updating, enriches the place information as required
and hands the place to Postgresql. The collector is part of the Nominatim
library written in Python. Within Postgresql, the `placex_update`
trigger is responsible to fill out all secondary tables with extra geocoding
information. This part is written in PL/pgSQL.
The tokenizer is involved in both parts. When the indexer prepares a place,
it hands it over to the tokenizer to inspect the names and create all the
search tokens applicable for the place. This usually involves updating the
tokenizer's internal token lists and creating a list of all token IDs for
the specific place. This list is later needed in the PL/pgSQL part where the
indexer needs to add the token IDs to the appropriate search tables. To be
able to communicate the list between the Python part and the pl/pgSQL trigger,
the `placex` table contains a special JSONB column `token_info` which is there
for the exclusive use of the tokenizer.
The Python part of the tokenizer returns a structured information about the
tokens of a place to the indexer which converts it to JSON and inserts it into
the `token_info` column. The content of the column is then handed to the PL/pqSQL
callbacks of the tokenizer which extracts the required information. Usually
the tokenizer then removes all information from the `token_info` structure,
so that no information is ever persistently saved in the table. All information
that went in should have been processed after all and put into secondary tables.
This is however not a hard requirement. If the tokenizer needs to store
additional information about a place permanently, it may do so in the
`token_info` column. It just may never execute searches over it and
consequently not create any special indexes on it.
### Querying
At query time, Nominatim builds up multiple _interpretations_ of the search
query. Each of these interpretations is tried against the database in order
of the likelihood with which they match to the search query. The first
interpretation that yields results wins.
The interpretations are encapsulated in the `SearchDescription` class. An
instance of this class is created by applying a sequence of
_search tokens_ to an initially empty SearchDescription. It is the
responsibility of the tokenizer to parse the search query and derive all
possible sequences of search tokens. To that end the tokenizer needs to parse
the search query and look up matching words in its own data structures.
## Tokenizer API
The following section describes the functions that need to be implemented
for a custom tokenizer implementation.
!!! warning
This API is currently in early alpha status. While this API is meant to
be a public API on which other tokenizers may be implemented, the API is
far away from being stable at the moment.
### Directory Structure
Nominatim expects two files for a tokenizer:
* `nominiatim/tokenizer/<NAME>_tokenizer.py` containing the Python part of the
implementation
* `lib-php/tokenizer/<NAME>_tokenizer.php` with the PHP part of the
implementation
where `<NAME>` is a unique name for the tokenizer consisting of only lower-case
letters, digits and underscore. A tokenizer also needs to install some SQL
functions. By convention, these should be placed in `lib-sql/tokenizer`.
If the tokenizer has a default configuration file, this should be saved in
the `settings/<NAME>_tokenizer.<SUFFIX>`.
### Configuration and Persistance
Tokenizers may define custom settings for their configuration. All settings
must be prefixed with `NOMINATIM_TOKENIZER_`. Settings may be transient or
persistent. Transient settings are loaded from the configuration file when
Nominatim is started and may thus be changed at any time. Persistent settings
are tied to a database installation and must only be read during installation
time. If they are needed for the runtime then they must be saved into the
`nominatim_properties` table and later loaded from there.
### The Python module
The Python module is expect to export a single factory function:
```python
def create(dsn: str, data_dir: Path) -> AbstractTokenizer
```
The `dsn` parameter contains the DSN of the Nominatim database. The `data_dir`
is a directory in the project directory that the tokenizer may use to save
database-specific data. The function must return the instance of the tokenizer
class as defined below.
### Python Tokenizer Class
All tokenizers must inherit from `nominatim.tokenizer.base.AbstractTokenizer`
and implement the abstract functions defined there.
::: nominatim.tokenizer.base.AbstractTokenizer
rendering:
heading_level: 4
### Python Analyzer Class
::: nominatim.tokenizer.base.AbstractAnalyzer
rendering:
heading_level: 4
### PL/pgSQL Functions
The tokenizer must provide access functions for the `token_info` column
to the indexer which extracts the necessary information for the global
search tables. If the tokenizer needs additional SQL functions for private
use, then these functions must be prefixed with `token_` in order to ensure
that there are no naming conflicts with the SQL indexer code.
The following functions are expected:
```sql
FUNCTION token_get_name_search_tokens(info JSONB) RETURNS INTEGER[]
```
Return an array of token IDs of search terms that should match
the name(s) for the given place. These tokens are used to look up the place
by name and, where the place functions as part of an address for another place,
by address. Must return NULL when the place has no name.
```sql
FUNCTION token_get_name_match_tokens(info JSONB) RETURNS INTEGER[]
```
Return an array of token IDs of full names of the place that should be used
to match addresses. The list of match tokens is usually more strict than
search tokens as it is used to find a match between two OSM tag values which
are expected to contain matching full names. Partial terms should not be
used for match tokens. Must return NULL when the place has no name.
```sql
FUNCTION token_get_housenumber_search_tokens(info JSONB) RETURNS INTEGER[]
```
Return an array of token IDs of house number tokens that apply to the place.
Note that a place may have multiple house numbers, for example when apartments
each have their own number. Must be NULL when the place has no house numbers.
```sql
FUNCTION token_normalized_housenumber(info JSONB) RETURNS TEXT
```
Return the house number(s) in the normalized form that can be matched against
a house number token text. If a place has multiple house numbers they must
be listed with a semicolon as delimiter. Must be NULL when the place has no
house numbers.
```sql
FUNCTION token_matches_street(info JSONB, street_tokens INTEGER[]) RETURNS BOOLEAN
```
Check if the given tokens (previously saved from `token_get_name_match_tokens()`)
match against the `addr:street` tag name. Must return either NULL or FALSE
when the place has no `addr:street` tag.
```sql
FUNCTION token_matches_place(info JSONB, place_tokens INTEGER[]) RETURNS BOOLEAN
```
Check if the given tokens (previously saved from `token_get_name_match_tokens()`)
match against the `addr:place` tag name. Must return either NULL or FALSE
when the place has no `addr:place` tag.
```sql
FUNCTION token_addr_place_search_tokens(info JSONB) RETURNS INTEGER[]
```
Return the search token IDs extracted from the `addr:place` tag. These tokens
are used for searches by address when no matching place can be found in the
database. Must be NULL when the place has no `addr:place` tag.
```sql
FUNCTION token_get_address_keys(info JSONB) RETURNS SETOF TEXT
```
Return the set of keys for which address information is provided. This
should correspond to the list of (relevant) `addr:*` tags with the `addr:`
prefix removed or the keys used in the `address` dictionary of the place info.
```sql
FUNCTION token_get_address_search_tokens(info JSONB, key TEXT) RETURNS INTEGER[]
```
Return the array of search tokens for the given address part. `key` can be
expected to be one of those returned with `token_get_address_keys()`. The
search tokens are added to the address search vector of the place, when no
corresponding OSM object could be found for the given address part from which
to copy the name information.
```sql
FUNCTION token_matches_address(info JSONB, key TEXT, tokens INTEGER[])
```
Check if the given tokens match against the address part `key`.
__Warning:__ the tokens that are handed in are the lists previously saved
from `token_get_name_search_tokens()`, _not_ from the match token list. This
is an historical oddity which will be fixed at some point in the future.
Currently, tokenizers are encouraged to make sure that matching works against
both the search token list and the match token list.
```sql
FUNCTION token_normalized_postcode(postcode TEXT) RETURNS TEXT
```
Return the normalized version of the given postcode. This function must return
the same value as the Python function `AbstractAnalyzer->normalize_postcode()`.
```sql
FUNCTION token_strip_info(info JSONB) RETURNS JSONB
```
Return the part of the `token_info` field that should be stored in the database
permanently. The indexer calls this function when all processing is done and
replaces the content of the `token_info` column with the returned value before
the trigger stores the information in the database. May return NULL if no
information should be stored permanently.
### PHP Tokenizer class
The PHP tokenizer class is instantiated once per request and responsible for
analyzing the incoming query. Multiple requests may be in flight in
parallel.
The class is expected to be found under the
name of `\Nominatim\Tokenizer`. To find the class the PHP code includes the file
`tokenizer/tokenizer.php` in the project directory. This file must be created
when the tokenizer is first set up on import. The file should initialize any
configuration variables by setting PHP constants and then require the file
with the actual implementation of the tokenizer.
The tokenizer class must implement the following functions:
```php
public function __construct(object &$oDB)
```
The constructor of the class receives a database connection that can be used
to query persistent data in the database.
```php
public function checkStatus()
```
Check that the tokenizer can access its persistent data structures. If there
is an issue, throw an `\Exception`.
```php
public function normalizeString(string $sTerm) : string
```
Normalize string to a form to be used for comparisons when reordering results.
Nominatim reweighs results how well the final display string matches the actual
query. Before comparing result and query, names and query are normalised against
this function. The tokenizer can thus remove all properties that should not be
taken into account for reweighing, e.g. special characters or case.
```php
public function tokensForSpecialTerm(string $sTerm) : array
```
Return the list of special term tokens that match the given term.
```php
public function extractTokensFromPhrases(array &$aPhrases) : TokenList
```
Parse the given phrases, splitting them into word lists and retrieve the
matching tokens.
The phrase array may take on two forms. In unstructured searches (using `q=`
parameter) the search query is split at the commas and the elements are
put into a sorted list. For structured searches the phrase array is an
associative array where the key designates the type of the term (street, city,
county etc.) The tokenizer may ignore the phrase type at this stage in parsing.
Matching phrase type and appropriate search token type will be done later
when the SearchDescription is built.
For each phrase in the list of phrases, the function must analyse the phrase
string and then call `setWordSets()` to communicate the result of the analysis.
A word set is a list of strings, where each string refers to a search token.
A phrase may have multiple interpretations. Therefore a list of word sets is
usually attached to the phrase. The search tokens themselves are returned
by the function in an associative array, where the key corresponds to the
strings given in the word sets. The value is a list of search tokens. Thus
a single string in the list of word sets may refer to multiple search tokens.

View File

@@ -0,0 +1,35 @@
@startuml
skinparam monochrome true
skinparam ObjectFontStyle bold
map search_name_X {
place_id => BIGINT
address_rank => SMALLINT
name_vector => INT[]
centroid => GEOMETRY
}
map location_area_large_X {
place_id => BIGINT
keywords => INT[]
partition => SMALLINT
rank_search => SMALLINT
rank_address => SMALLINT
country_code => VARCHR(2)
isguess => BOOLEAN
postcode => TEXT
centroid => POINT
geometry => GEOMETRY
}
map location_road_X {
place_id => BIGINT
partition => SMALLINT
country_code => VARCHR(2)
geometry => GEOMETRY
}
search_name_X -[hidden]> location_area_large_X
location_area_large_X -[hidden]> location_road_X
@enduml

File diff suppressed because one or more lines are too long

After

Width:  |  Height:  |  Size: 11 KiB

View File

@@ -0,0 +1,34 @@
# Additional Data Sources
This guide explains how data sources other than OpenStreetMap mentioned in
the install instructions got obtained and converted.
## Country grid
Nominatim uses pre-generated country borders data. In case one imports only
a subset of a country. And to assign each place a partition. Nominatim
database tables are split into partitions for performance.
More details in [osm-search/country-grid-data](https://github.com/osm-search/country-grid-data).
## US Census TIGER
For the United States you can choose to import additonal street-level data.
The data isn't mixed into OSM data but queried as fallback when no OSM
result can be found.
More details in [osm-search/TIGER-data](https://github.com/osm-search/TIGER-data).
## GB postcodes
For Great Britain you can choose to import Royalmail postcode centroids.
More details in [osm-search/gb-postcode-data](https://github.com/osm-search/gb-postcode-data).
## Wikipedia & Wikidata rankings
Nominatim can import "importance" data of place names. This greatly
improves ranking of results.
More details in [osm-search/wikipedia-wikidata](https://github.com/osm-search/wikipedia-wikidata).

View File

@@ -0,0 +1,44 @@
@startuml
skinparam monochrome true
skinparam ObjectFontStyle bold
map planet_osm_nodes #eee {
id => BIGINT
lat => INT
lon => INT
}
map planet_osm_ways #eee {
id => BIGINT
nodes => BIGINT[]
tags => TEXT[]
}
map planet_osm_rels #eee {
id => BIGINT
parts => BIGINT[]
members => TEXT[]
tags => TEXT[]
way_off => SMALLINT
rel_off => SMALLINT
}
map place {
osm_type => CHAR(1)
osm_id => BIGINT
class => TEXT
type => TEXT
name => HSTORE
address => HSTORE
extratags => HSTORE
admin_level => SMALLINT
geometry => GEOMETRY
}
planet_osm_nodes -[hidden]> planet_osm_ways
planet_osm_ways -[hidden]> planet_osm_rels
planet_osm_ways -[hidden]-> place
planet_osm_nodes::id <- planet_osm_ways::nodes
@enduml

File diff suppressed because one or more lines are too long

After

Width:  |  Height:  |  Size: 13 KiB

24
docs/develop/overview.md Normal file
View File

@@ -0,0 +1,24 @@
# Basic Architecture
Nominatim provides geocoding based on OpenStreetMap data. It uses a PostgreSQL
database as a backend for storing the data.
There are three basic parts to Nominatim's architecture: the data import,
the address computation and the search frontend.
The __data import__ stage reads the raw OSM data and extracts all information
that is useful for geocoding. This part is done by osm2pgsql, the same tool
that can also be used to import a rendering database. It uses the special
gazetteer output plugin in `osm2pgsql/src/output-gazetter.[ch]pp`. The result of
the import can be found in the database table `place`.
The __address computation__ or __indexing__ stage takes the data from `place`
and adds additional information needed for geocoding. It ranks the places by
importance, links objects that belong together and computes addresses and
the search index. Most of this work is done in PL/pgSQL via database triggers
and can be found in the files in the `sql/functions/` directory.
The __search frontend__ implements the actual API. It takes search
and reverse geocoding queries from the user, looks up the data and
returns the results in the requested format. This part is written in PHP
and can be found in the `lib/` and `website/` directories.

View File

@@ -0,0 +1,31 @@
@startuml
skinparam monochrome true
start
if (has 'addr:street'?) then (yes)
if (street with that name\n nearby?) then (yes)
:**Use closest street**
**with same name**;
kill
else (no)
:** Use closest**\n**street**;
kill
endif
elseif (has 'addr:place'?) then (yes)
if (place with that name\n nearby?) then (yes)
:**Use closest place**
**with same name**;
kill
else (no)
:add addr:place to adress;
:**Use closest place**\n**rank 16 to 25**;
kill
endif
else (otherwise)
:**Use closest**\n**street**;
kill
endif
@enduml

File diff suppressed because one or more lines are too long

After

Width:  |  Height:  |  Size: 9.8 KiB

View File

@@ -0,0 +1,99 @@
@startuml
skinparam monochrome true
skinparam ObjectFontStyle bold
left to right direction
map placex {
place_id => BIGINT
osm_type => CHAR(1)
osm_id => BIGINT
class => TEXT
type => TEXT
name => HSTORE
address => HSTORE
extratags => HSTORE
admin_level => SMALLINT
partition => SMALLINT
geometry_sector => INT
parent_place_id => BIGINT
linked_place_id => BIGINT
importance => DOUBLE
rank_search => SMALLINT
rank_address => SMALLINT
wikipedia => TEXT
country_code => VARCHAR(2)
housenumber => TEXT
postcode => TEXT
indexed_status => SMALLINT
indexed_date => TIMESTAMP
centroid => GEOMETRY
geometry => GEOMETRY
}
map search_name {
place_id => BIGINT
importance => DOUBLE
search_rank => SMALLINT
address_rank => SMALLINT
name_vector => INT[]
nameaddress_vector => INT[]
country_code => VARCHAR(2)
centroid => GEOMETRY
}
map word {
word_id => INT
word_token => TEXT
... =>
}
map location_property_osmline {
place_id => BIGINT
osm_id => BIGINT
startnumber => INT
endnumber => INT
interpolationtype => TEXT
address => HSTORE
partition => SMALLINT
geometry_sector => INT
parent_place_id => BIGINT
country_code => VARCHAR(2)
postcode => text
indexed_status => SMALLINT
indexed_date => TIMESTAMP
linegeo => GEOMETRY
}
map place_addressline {
place_id => BIGINT
address_place_id => BIGINT
distance => DOUBLE
cached_rank_address => SMALLINT
fromarea => BOOLEAN
isaddress => BOOLEAN
}
map location_postcode {
place_id => BIGINT
postcode => TEXT
parent_place_id => BIGINT
rank_search => SMALLINT
rank_address => SMALLINT
indexed_status => SMALLINT
indexed_date => TIMESTAMP
geometry => GEOMETRY
}
placex::place_id <-- search_name::place_id
placex::place_id <-- place_addressline::place_id
placex::place_id <-- place_addressline::address_place_id
search_name::name_vector --> word::word_id
search_name::nameaddress_vector --> word::word_id
place_addressline -[hidden]> location_property_osmline
search_name -[hidden]> place_addressline
location_property_osmline -[hidden]-> location_postcode
@enduml

File diff suppressed because one or more lines are too long

After

Width:  |  Height:  |  Size: 35 KiB

23
docs/extra.css Normal file
View File

@@ -0,0 +1,23 @@
.toctree-l3 {
display: none!important
}
table {
margin-bottom: 12pt
}
th, td {
padding: 1pt 12pt;
}
th {
background-color: #eee;
}
/* Indentation for mkdocstrings.
div.doc-contents:not(.first) {
padding-left: 25px;
border-left: 4px solid rgba(230, 230, 230);
margin-bottom: 60px;
}*/

10
docs/index.md Normal file
View File

@@ -0,0 +1,10 @@
Nominatim (from the Latin, 'by name') is a tool to search OSM data by name and address and to generate synthetic addresses of OSM points (reverse geocoding).
This guide comes in four parts:
* __[API reference](api/Overview.md)__ for users of Nominatim
* __[Administration Guide](admin/Installation.md)__ for those who want
to install their own Nominatim server
* __[Customization Guide](customize/Overview.md)__ for those who want to
adapt their own installation to their special requirements
* __[Developer's Guide](develop/overview.md)__ for developers of the software

63
docs/mkdocs.yml Normal file
View File

@@ -0,0 +1,63 @@
site_name: Nominatim 4.0.1
theme: readthedocs
docs_dir: ${CMAKE_CURRENT_BINARY_DIR}
site_url: https://nominatim.org
repo_url: https://github.com/openstreetmap/Nominatim
pages:
- 'Introduction' : 'index.md'
- 'API Reference':
- 'Overview': 'api/Overview.md'
- 'Search': 'api/Search.md'
- 'Reverse': 'api/Reverse.md'
- 'Address Lookup': 'api/Lookup.md'
- 'Details' : 'api/Details.md'
- 'Status' : 'api/Status.md'
- 'Place Output Formats': 'api/Output.md'
- 'FAQ': 'api/Faq.md'
- 'Administration Guide':
- 'Basic Installation': 'admin/Installation.md'
- 'Import' : 'admin/Import.md'
- 'Update' : 'admin/Update.md'
- 'Deploy' : 'admin/Deployment.md'
- 'Nominatim UI' : 'admin/Setup-Nominatim-UI.md'
- 'Advanced Installations' : 'admin/Advanced-Installations.md'
- 'Maintenance' : 'admin/Maintenance.md'
- 'Migration from older Versions' : 'admin/Migration.md'
- 'Troubleshooting' : 'admin/Faq.md'
- 'Customization Guide':
- 'Overview': 'customize/Overview.md'
- 'Import Styles': 'customize/Import-Styles.md'
- 'Configuration Settings': 'customize/Settings.md'
- 'Place Ranking' : 'customize/Ranking.md'
- 'Tokenizers' : 'customize/Tokenizers.md'
- 'Special Phrases': 'customize/Special-Phrases.md'
- 'External data: US housenumbers from TIGER': 'customize/Tiger.md'
- 'External data: Postcodes': 'customize/Postcodes.md'
- 'Developers Guide':
- 'Architecture Overview' : 'develop/overview.md'
- 'Database Layout' : 'develop/Database-Layout.md'
- 'Indexing' : 'develop/Indexing.md'
- 'Tokenizers' : 'develop/Tokenizers.md'
- 'Setup for Development' : 'develop/Development-Environment.md'
- 'Testing' : 'develop/Testing.md'
- 'External Data Sources': 'develop/data-sources.md'
- 'Appendix':
- 'Installation on CentOS 7' : 'appendix/Install-on-Centos-7.md'
- 'Installation on CentOS 8' : 'appendix/Install-on-Centos-8.md'
- 'Installation on Ubuntu 18' : 'appendix/Install-on-Ubuntu-18.md'
- 'Installation on Ubuntu 20' : 'appendix/Install-on-Ubuntu-20.md'
markdown_extensions:
- codehilite
- admonition
- def_list
- toc:
permalink:
extra_css: [extra.css, styles.css]
plugins:
- search
- mkdocstrings:
handlers:
python:
rendering:
show_source: false
show_signature_annotations: false

69
docs/styles.css Normal file
View File

@@ -0,0 +1,69 @@
.codehilite .hll { background-color: #ffffcc }
.codehilite { background: #f0f0f0; }
.codehilite .c { color: #60a0b0; font-style: italic } /* Comment */
.codehilite .err { /* border: 1px solid #FF0000 */ } /* Error */
.codehilite .k { color: #007020; font-weight: bold } /* Keyword */
.codehilite .o { color: #666666 } /* Operator */
.codehilite .ch { color: #60a0b0; font-style: italic } /* Comment.Hashbang */
.codehilite .cm { color: #60a0b0; font-style: italic } /* Comment.Multiline */
.codehilite .cp { color: #007020 } /* Comment.Preproc */
.codehilite .cpf { color: #60a0b0; font-style: italic } /* Comment.PreprocFile */
.codehilite .c1 { color: #60a0b0; font-style: italic } /* Comment.Single */
.codehilite .cs { color: #60a0b0; background-color: #fff0f0 } /* Comment.Special */
.codehilite .gd { color: #A00000 } /* Generic.Deleted */
.codehilite .ge { font-style: italic } /* Generic.Emph */
.codehilite .gr { color: #FF0000 } /* Generic.Error */
.codehilite .gh { color: #000080; font-weight: bold } /* Generic.Heading */
.codehilite .gi { color: #00A000 } /* Generic.Inserted */
.codehilite .go { color: #888888 } /* Generic.Output */
.codehilite .gp { color: #c65d09; font-weight: bold } /* Generic.Prompt */
.codehilite .gs { font-weight: bold } /* Generic.Strong */
.codehilite .gu { color: #800080; font-weight: bold } /* Generic.Subheading */
.codehilite .gt { color: #0044DD } /* Generic.Traceback */
.codehilite .kc { color: #007020; font-weight: bold } /* Keyword.Constant */
.codehilite .kd { color: #007020; font-weight: bold } /* Keyword.Declaration */
.codehilite .kn { color: #007020; font-weight: bold } /* Keyword.Namespace */
.codehilite .kp { color: #007020 } /* Keyword.Pseudo */
.codehilite .kr { color: #007020; font-weight: bold } /* Keyword.Reserved */
.codehilite .kt { color: #902000 } /* Keyword.Type */
.codehilite .m { color: #40a070 } /* Literal.Number */
.codehilite .s { color: #4070a0 } /* Literal.String */
.codehilite .na { color: #4070a0 } /* Name.Attribute */
.codehilite .nb { color: #007020 } /* Name.Builtin */
.codehilite .nc { color: #0e84b5; font-weight: bold } /* Name.Class */
.codehilite .no { color: #60add5 } /* Name.Constant */
.codehilite .nd { color: #555555; font-weight: bold } /* Name.Decorator */
.codehilite .ni { color: #d55537; font-weight: bold } /* Name.Entity */
.codehilite .ne { color: #007020 } /* Name.Exception */
.codehilite .nf { color: #06287e } /* Name.Function */
.codehilite .nl { color: #002070; font-weight: bold } /* Name.Label */
.codehilite .nn { color: #0e84b5; font-weight: bold } /* Name.Namespace */
.codehilite .nt { color: #062873; font-weight: bold } /* Name.Tag */
.codehilite .nv { color: #bb60d5 } /* Name.Variable */
.codehilite .ow { color: #007020; font-weight: bold } /* Operator.Word */
.codehilite .w { color: #bbbbbb } /* Text.Whitespace */
.codehilite .mb { color: #40a070 } /* Literal.Number.Bin */
.codehilite .mf { color: #40a070 } /* Literal.Number.Float */
.codehilite .mh { color: #40a070 } /* Literal.Number.Hex */
.codehilite .mi { color: #40a070 } /* Literal.Number.Integer */
.codehilite .mo { color: #40a070 } /* Literal.Number.Oct */
.codehilite .sa { color: #4070a0 } /* Literal.String.Affix */
.codehilite .sb { color: #4070a0 } /* Literal.String.Backtick */
.codehilite .sc { color: #4070a0 } /* Literal.String.Char */
.codehilite .dl { color: #4070a0 } /* Literal.String.Delimiter */
.codehilite .sd { color: #4070a0; font-style: italic } /* Literal.String.Doc */
.codehilite .s2 { color: #4070a0 } /* Literal.String.Double */
.codehilite .se { color: #4070a0; font-weight: bold } /* Literal.String.Escape */
.codehilite .sh { color: #4070a0 } /* Literal.String.Heredoc */
.codehilite .si { color: #70a0d0; font-style: italic } /* Literal.String.Interpol */
.codehilite .sx { color: #c65d09 } /* Literal.String.Other */
.codehilite .sr { color: #235388 } /* Literal.String.Regex */
.codehilite .s1 { color: #4070a0 } /* Literal.String.Single */
.codehilite .ss { color: #517918 } /* Literal.String.Symbol */
.codehilite .bp { color: #007020 } /* Name.Builtin.Pseudo */
.codehilite .fm { color: #06287e } /* Name.Function.Magic */
.codehilite .vc { color: #bb60d5 } /* Name.Variable.Class */
.codehilite .vg { color: #bb60d5 } /* Name.Variable.Global */
.codehilite .vi { color: #bb60d5 } /* Name.Variable.Instance */
.codehilite .vm { color: #bb60d5 } /* Name.Variable.Magic */
.codehilite .il { color: #40a070 } /* Literal.Number.Integer.Long */

169
lib-php/AddressDetails.php Normal file
View File

@@ -0,0 +1,169 @@
<?php
namespace Nominatim;
require_once(CONST_LibDir.'/ClassTypes.php');
/**
* Detailed list of address parts for a single result
*/
class AddressDetails
{
private $iPlaceID;
private $aAddressLines;
public function __construct(&$oDB, $iPlaceID, $sHousenumber, $mLangPref)
{
$this->iPlaceID = $iPlaceID;
if (is_array($mLangPref)) {
$mLangPref = $oDB->getArraySQL($oDB->getDBQuotedList($mLangPref));
}
if (!isset($sHousenumber)) {
$sHousenumber = -1;
}
$sSQL = 'SELECT *,';
$sSQL .= ' get_name_by_language(name,'.$mLangPref.') as localname';
$sSQL .= ' FROM get_addressdata('.$iPlaceID.','.$sHousenumber.')';
$sSQL .= ' ORDER BY rank_address DESC, isaddress DESC';
$this->aAddressLines = $oDB->getAll($sSQL);
}
private static function isAddress($aLine)
{
return $aLine['isaddress'] || $aLine['type'] == 'country_code';
}
public function getAddressDetails($bAll = false)
{
if ($bAll) {
return $this->aAddressLines;
}
return array_filter($this->aAddressLines, array(__CLASS__, 'isAddress'));
}
public function getLocaleAddress()
{
$aParts = array();
$sPrevResult = '';
foreach ($this->aAddressLines as $aLine) {
if ($aLine['isaddress'] && $sPrevResult != $aLine['localname']) {
$sPrevResult = $aLine['localname'];
$aParts[] = $sPrevResult;
}
}
return join(', ', $aParts);
}
public function getAddressNames()
{
$aAddress = array();
foreach ($this->aAddressLines as $aLine) {
if (!self::isAddress($aLine)) {
continue;
}
$sTypeLabel = ClassTypes\getLabelTag($aLine);
$sName = null;
if (isset($aLine['localname']) && $aLine['localname']!=='') {
$sName = $aLine['localname'];
} elseif (isset($aLine['housenumber']) && $aLine['housenumber']!=='') {
$sName = $aLine['housenumber'];
}
if (isset($sName)
&& (!isset($aAddress[$sTypeLabel])
|| $aLine['class'] == 'place')
) {
$aAddress[$sTypeLabel] = $sName;
}
}
return $aAddress;
}
/**
* Annotates the given json with geocodejson address information fields.
*
* @param array $aJson Json hash to add the fields to.
*
* Geocodejson has the following fields:
* street, locality, postcode, city, district,
* county, state, country
*
* Postcode and housenumber are added by type, district is not used.
* All other fields are set according to address rank.
*/
public function addGeocodeJsonAddressParts(&$aJson)
{
foreach (array_reverse($this->aAddressLines) as $aLine) {
if (!$aLine['isaddress']) {
continue;
}
if (!isset($aLine['localname']) || $aLine['localname'] == '') {
continue;
}
if ($aLine['type'] == 'postcode' || $aLine['type'] == 'postal_code') {
$aJson['postcode'] = $aLine['localname'];
continue;
}
if ($aLine['type'] == 'house_number') {
$aJson['housenumber'] = $aLine['localname'];
continue;
}
if ($this->iPlaceID == $aLine['place_id']) {
continue;
}
$iRank = (int)$aLine['rank_address'];
if ($iRank > 25 && $iRank < 28) {
$aJson['street'] = $aLine['localname'];
} elseif ($iRank >= 22 && $iRank <= 25) {
$aJson['locality'] = $aLine['localname'];
} elseif ($iRank >= 17 && $iRank <= 21) {
$aJson['district'] = $aLine['localname'];
} elseif ($iRank >= 13 && $iRank <= 16) {
$aJson['city'] = $aLine['localname'];
} elseif ($iRank >= 10 && $iRank <= 12) {
$aJson['county'] = $aLine['localname'];
} elseif ($iRank >= 5 && $iRank <= 9) {
$aJson['state'] = $aLine['localname'];
} elseif ($iRank == 4) {
$aJson['country'] = $aLine['localname'];
}
}
}
public function getAdminLevels()
{
$aAddress = array();
foreach (array_reverse($this->aAddressLines) as $aLine) {
if (self::isAddress($aLine)
&& isset($aLine['admin_level'])
&& $aLine['admin_level'] < 15
&& !isset($aAddress['level'.$aLine['admin_level']])
) {
$aAddress['level'.$aLine['admin_level']] = $aLine['localname'];
}
}
return $aAddress;
}
public function debugInfo()
{
return $this->aAddressLines;
}
}

568
lib-php/ClassTypes.php Normal file
View File

@@ -0,0 +1,568 @@
<?php
namespace Nominatim\ClassTypes;
/**
* Create a label tag for the given place that can be used as an XML name.
*
* @param array[] $aPlace Information about the place to label.
*
* A label tag groups various object types together under a common
* label. The returned value is lower case and has no spaces
*/
function getLabelTag($aPlace, $sCountry = null)
{
$iRank = (int) ($aPlace['rank_address'] ?? 30);
$sLabel;
if (isset($aPlace['place_type'])) {
$sLabel = $aPlace['place_type'];
} elseif ($aPlace['class'] == 'boundary' && $aPlace['type'] == 'administrative') {
$sLabel = getBoundaryLabel($iRank/2, $sCountry);
} elseif ($aPlace['type'] == 'postal_code') {
$sLabel = 'postcode';
} elseif ($iRank < 26) {
$sLabel = $aPlace['type'];
} elseif ($iRank < 28) {
$sLabel = 'road';
} elseif ($aPlace['class'] == 'place'
&& ($aPlace['type'] == 'house_number' ||
$aPlace['type'] == 'house_name' ||
$aPlace['type'] == 'country_code')
) {
$sLabel = $aPlace['type'];
} else {
$sLabel = $aPlace['class'];
}
return strtolower(str_replace(' ', '_', $sLabel));
}
/**
* Create a label for the given place.
*
* @param array[] $aPlace Information about the place to label.
*/
function getLabel($aPlace, $sCountry = null)
{
if (isset($aPlace['place_type'])) {
return ucwords(str_replace('_', ' ', $aPlace['place_type']));
}
if ($aPlace['class'] == 'boundary' && $aPlace['type'] == 'administrative') {
return getBoundaryLabel(($aPlace['rank_address'] ?? 30)/2, $sCountry ?? null);
}
// Return a label only for 'important' class/type combinations
if (getImportance($aPlace) !== null) {
return ucwords(str_replace('_', ' ', $aPlace['type']));
}
return null;
}
/**
* Return a simple label for an administrative boundary for the given country.
*
* @param int $iAdminLevel Content of admin_level tag.
* @param string $sCountry Country code of the country where the object is
* in. May be null, in which case a world-wide
* fallback is used.
* @param string $sFallback String to return if no explicit string is listed.
*
* @return string
*/
function getBoundaryLabel($iAdminLevel, $sCountry, $sFallback = 'Administrative')
{
static $aBoundaryList = array (
'default' => array (
1 => 'Continent',
2 => 'Country',
3 => 'Region',
4 => 'State',
5 => 'State District',
6 => 'County',
7 => 'Municipality',
8 => 'City',
9 => 'City District',
10 => 'Suburb',
11 => 'Neighbourhood',
12 => 'City Block'
),
'no' => array (
3 => 'State',
4 => 'County'
),
'se' => array (
3 => 'State',
4 => 'County'
)
);
if (isset($aBoundaryList[$sCountry])
&& isset($aBoundaryList[$sCountry][$iAdminLevel])
) {
return $aBoundaryList[$sCountry][$iAdminLevel];
}
return $aBoundaryList['default'][$iAdminLevel] ?? $sFallback;
}
/**
* Return an estimated radius of how far the object node extends.
*
* @param array[] $aPlace Information about the place. This must be a node
* feature.
*
* @return float The radius around the feature in degrees.
*/
function getDefRadius($aPlace)
{
$aSpecialRadius = array(
'place:continent' => 25,
'place:country' => 7,
'place:state' => 2.6,
'place:province' => 2.6,
'place:region' => 1.0,
'place:county' => 0.7,
'place:city' => 0.16,
'place:municipality' => 0.16,
'place:island' => 0.32,
'place:postcode' => 0.16,
'place:town' => 0.04,
'place:village' => 0.02,
'place:hamlet' => 0.02,
'place:district' => 0.02,
'place:borough' => 0.02,
'place:suburb' => 0.02,
'place:locality' => 0.01,
'place:neighbourhood'=> 0.01,
'place:quarter' => 0.01,
'place:city_block' => 0.01,
'landuse:farm' => 0.01,
'place:farm' => 0.01,
'place:airport' => 0.015,
'aeroway:aerodrome' => 0.015,
'railway:station' => 0.005
);
$sClassPlace = $aPlace['class'].':'.$aPlace['type'];
return $aSpecialRadius[$sClassPlace] ?? 0.00005;
}
/**
* Get the icon to use with the given object.
*/
function getIcon($aPlace)
{
$aIcons = array(
'boundary:administrative' => 'poi_boundary_administrative',
'place:city' => 'poi_place_city',
'place:town' => 'poi_place_town',
'place:village' => 'poi_place_village',
'place:hamlet' => 'poi_place_village',
'place:suburb' => 'poi_place_village',
'place:locality' => 'poi_place_village',
'place:airport' => 'transport_airport2',
'aeroway:aerodrome' => 'transport_airport2',
'railway:station' => 'transport_train_station2',
'amenity:place_of_worship' => 'place_of_worship_unknown3',
'amenity:pub' => 'food_pub',
'amenity:bar' => 'food_bar',
'amenity:university' => 'education_university',
'tourism:museum' => 'tourist_museum',
'amenity:arts_centre' => 'tourist_art_gallery2',
'tourism:zoo' => 'tourist_zoo',
'tourism:theme_park' => 'poi_point_of_interest',
'tourism:attraction' => 'poi_point_of_interest',
'leisure:golf_course' => 'sport_golf',
'historic:castle' => 'tourist_castle',
'amenity:hospital' => 'health_hospital',
'amenity:school' => 'education_school',
'amenity:theatre' => 'tourist_theatre',
'amenity:library' => 'amenity_library',
'amenity:fire_station' => 'amenity_firestation3',
'amenity:police' => 'amenity_police2',
'amenity:bank' => 'money_bank2',
'amenity:post_office' => 'amenity_post_office',
'tourism:hotel' => 'accommodation_hotel2',
'amenity:cinema' => 'tourist_cinema',
'tourism:artwork' => 'tourist_art_gallery2',
'historic:archaeological_site' => 'tourist_archaeological2',
'amenity:doctors' => 'health_doctors',
'leisure:sports_centre' => 'sport_leisure_centre',
'leisure:swimming_pool' => 'sport_swimming_outdoor',
'shop:supermarket' => 'shopping_supermarket',
'shop:convenience' => 'shopping_convenience',
'amenity:restaurant' => 'food_restaurant',
'amenity:fast_food' => 'food_fastfood',
'amenity:cafe' => 'food_cafe',
'tourism:guest_house' => 'accommodation_bed_and_breakfast',
'amenity:pharmacy' => 'health_pharmacy_dispensing',
'amenity:fuel' => 'transport_fuel',
'natural:peak' => 'poi_peak',
'natural:wood' => 'landuse_coniferous_and_deciduous',
'shop:bicycle' => 'shopping_bicycle',
'shop:clothes' => 'shopping_clothes',
'shop:hairdresser' => 'shopping_hairdresser',
'shop:doityourself' => 'shopping_diy',
'shop:estate_agent' => 'shopping_estateagent2',
'shop:car' => 'shopping_car',
'shop:garden_centre' => 'shopping_garden_centre',
'shop:car_repair' => 'shopping_car_repair',
'shop:bakery' => 'shopping_bakery',
'shop:butcher' => 'shopping_butcher',
'shop:apparel' => 'shopping_clothes',
'shop:laundry' => 'shopping_laundrette',
'shop:beverages' => 'shopping_alcohol',
'shop:alcohol' => 'shopping_alcohol',
'shop:optician' => 'health_opticians',
'shop:chemist' => 'health_pharmacy',
'shop:gallery' => 'tourist_art_gallery2',
'shop:jewelry' => 'shopping_jewelry',
'tourism:information' => 'amenity_information',
'historic:ruins' => 'tourist_ruin',
'amenity:college' => 'education_school',
'historic:monument' => 'tourist_monument',
'historic:memorial' => 'tourist_monument',
'historic:mine' => 'poi_mine',
'tourism:caravan_site' => 'accommodation_caravan_park',
'amenity:bus_station' => 'transport_bus_station',
'amenity:atm' => 'money_atm2',
'tourism:viewpoint' => 'tourist_view_point',
'tourism:guesthouse' => 'accommodation_bed_and_breakfast',
'railway:tram' => 'transport_tram_stop',
'amenity:courthouse' => 'amenity_court',
'amenity:recycling' => 'amenity_recycling',
'amenity:dentist' => 'health_dentist',
'natural:beach' => 'tourist_beach',
'railway:tram_stop' => 'transport_tram_stop',
'amenity:prison' => 'amenity_prison',
'highway:bus_stop' => 'transport_bus_stop2'
);
$sClassPlace = $aPlace['class'].':'.$aPlace['type'];
return $aIcons[$sClassPlace] ?? null;
}
/**
* Get an icon for the given object with its full URL.
*/
function getIconFile($aPlace)
{
if (CONST_MapIcon_URL === false) {
return null;
}
$sIcon = getIcon($aPlace);
if (!isset($sIcon)) {
return null;
}
return CONST_MapIcon_URL.'/'.$sIcon.'.p.20.png';
}
/**
* Return a class importance value for the given place.
*
* @param array[] $aPlace Information about the place.
*
* @return int An importance value. The lower the value, the more
* important the class.
*/
function getImportance($aPlace)
{
static $aWithImportance = null;
if ($aWithImportance === null) {
$aWithImportance = array_flip(array(
'boundary:administrative',
'place:country',
'place:state',
'place:province',
'place:county',
'place:city',
'place:region',
'place:island',
'place:town',
'place:village',
'place:hamlet',
'place:suburb',
'place:locality',
'landuse:farm',
'place:farm',
'highway:motorway_junction',
'highway:motorway',
'highway:trunk',
'highway:primary',
'highway:secondary',
'highway:tertiary',
'highway:residential',
'highway:unclassified',
'highway:living_street',
'highway:service',
'highway:track',
'highway:road',
'highway:byway',
'highway:bridleway',
'highway:cycleway',
'highway:pedestrian',
'highway:footway',
'highway:steps',
'highway:motorway_link',
'highway:trunk_link',
'highway:primary_link',
'landuse:industrial',
'landuse:residential',
'landuse:retail',
'landuse:commercial',
'place:airport',
'aeroway:aerodrome',
'railway:station',
'amenity:place_of_worship',
'amenity:pub',
'amenity:bar',
'amenity:university',
'tourism:museum',
'amenity:arts_centre',
'tourism:zoo',
'tourism:theme_park',
'tourism:attraction',
'leisure:golf_course',
'historic:castle',
'amenity:hospital',
'amenity:school',
'amenity:theatre',
'amenity:public_building',
'amenity:library',
'amenity:townhall',
'amenity:community_centre',
'amenity:fire_station',
'amenity:police',
'amenity:bank',
'amenity:post_office',
'leisure:park',
'amenity:park',
'landuse:park',
'landuse:recreation_ground',
'tourism:hotel',
'tourism:motel',
'amenity:cinema',
'tourism:artwork',
'historic:archaeological_site',
'amenity:doctors',
'leisure:sports_centre',
'leisure:swimming_pool',
'shop:supermarket',
'shop:convenience',
'amenity:restaurant',
'amenity:fast_food',
'amenity:cafe',
'tourism:guest_house',
'amenity:pharmacy',
'amenity:fuel',
'natural:peak',
'waterway:waterfall',
'natural:wood',
'natural:water',
'landuse:forest',
'landuse:cemetery',
'landuse:allotments',
'landuse:farmyard',
'railway:rail',
'waterway:canal',
'waterway:river',
'waterway:stream',
'shop:bicycle',
'shop:clothes',
'shop:hairdresser',
'shop:doityourself',
'shop:estate_agent',
'shop:car',
'shop:garden_centre',
'shop:car_repair',
'shop:newsagent',
'shop:bakery',
'shop:furniture',
'shop:butcher',
'shop:apparel',
'shop:electronics',
'shop:department_store',
'shop:books',
'shop:yes',
'shop:outdoor',
'shop:mall',
'shop:florist',
'shop:charity',
'shop:hardware',
'shop:laundry',
'shop:shoes',
'shop:beverages',
'shop:dry_cleaning',
'shop:carpet',
'shop:computer',
'shop:alcohol',
'shop:optician',
'shop:chemist',
'shop:gallery',
'shop:mobile_phone',
'shop:sports',
'shop:jewelry',
'shop:pet',
'shop:beauty',
'shop:stationery',
'shop:shopping_centre',
'shop:general',
'shop:electrical',
'shop:toys',
'shop:jeweller',
'shop:betting',
'shop:household',
'shop:travel_agency',
'shop:hifi',
'amenity:shop',
'tourism:information',
'place:house',
'place:house_name',
'place:house_number',
'place:country_code',
'leisure:pitch',
'highway:unsurfaced',
'historic:ruins',
'amenity:college',
'historic:monument',
'railway:subway',
'historic:memorial',
'leisure:nature_reserve',
'leisure:common',
'waterway:lock_gate',
'natural:fell',
'amenity:nightclub',
'highway:path',
'leisure:garden',
'landuse:reservoir',
'leisure:playground',
'leisure:stadium',
'historic:mine',
'natural:cliff',
'tourism:caravan_site',
'amenity:bus_station',
'amenity:kindergarten',
'highway:construction',
'amenity:atm',
'amenity:emergency_phone',
'waterway:lock',
'waterway:riverbank',
'natural:coastline',
'tourism:viewpoint',
'tourism:hostel',
'tourism:bed_and_breakfast',
'railway:halt',
'railway:platform',
'railway:tram',
'amenity:courthouse',
'amenity:recycling',
'amenity:dentist',
'natural:beach',
'place:moor',
'amenity:grave_yard',
'waterway:drain',
'landuse:grass',
'landuse:village_green',
'natural:bay',
'railway:tram_stop',
'leisure:marina',
'highway:stile',
'natural:moor',
'railway:light_rail',
'railway:narrow_gauge',
'natural:land',
'amenity:village_hall',
'waterway:dock',
'amenity:veterinary',
'landuse:brownfield',
'leisure:track',
'railway:historic_station',
'landuse:construction',
'amenity:prison',
'landuse:quarry',
'amenity:telephone',
'highway:traffic_signals',
'natural:heath',
'historic:house',
'amenity:social_club',
'landuse:military',
'amenity:health_centre',
'historic:building',
'amenity:clinic',
'highway:services',
'amenity:ferry_terminal',
'natural:marsh',
'natural:hill',
'highway:raceway',
'amenity:taxi',
'amenity:take_away',
'amenity:car_rental',
'place:islet',
'amenity:nursery',
'amenity:nursing_home',
'amenity:toilets',
'amenity:hall',
'waterway:boatyard',
'highway:mini_roundabout',
'historic:manor',
'tourism:chalet',
'amenity:bicycle_parking',
'amenity:hotel',
'waterway:weir',
'natural:wetland',
'natural:cave_entrance',
'amenity:crematorium',
'tourism:picnic_site',
'landuse:wood',
'landuse:basin',
'natural:tree',
'leisure:slipway',
'landuse:meadow',
'landuse:piste',
'amenity:care_home',
'amenity:club',
'amenity:medical_centre',
'historic:roman_road',
'historic:fort',
'railway:subway_entrance',
'historic:yes',
'highway:gate',
'leisure:fishing',
'historic:museum',
'amenity:car_wash',
'railway:level_crossing',
'leisure:bird_hide',
'natural:headland',
'tourism:apartments',
'amenity:shopping',
'natural:scrub',
'natural:fen',
'building:yes',
'mountain_pass:yes',
'amenity:parking',
'highway:bus_stop',
'place:postcode',
'amenity:post_box',
'place:houses',
'railway:preserved',
'waterway:derelict_canal',
'amenity:dead_pub',
'railway:disused_station',
'railway:abandoned',
'railway:disused'
));
}
$sClassPlace = $aPlace['class'].':'.$aPlace['type'];
return $aWithImportance[$sClassPlace] ?? null;
}

349
lib-php/DB.php Normal file
View File

@@ -0,0 +1,349 @@
<?php
namespace Nominatim;
require_once(CONST_LibDir.'/DatabaseError.php');
/**
* Uses PDO to access the database specified in the CONST_Database_DSN
* setting.
*/
class DB
{
protected $connection;
public function __construct($sDSN = null)
{
$this->sDSN = $sDSN ?? getSetting('DATABASE_DSN');
}
public function connect($bNew = false, $bPersistent = true)
{
if (isset($this->connection) && !$bNew) {
return true;
}
$aConnOptions = array(
\PDO::ATTR_ERRMODE => \PDO::ERRMODE_EXCEPTION,
\PDO::ATTR_DEFAULT_FETCH_MODE => \PDO::FETCH_ASSOC,
\PDO::ATTR_PERSISTENT => $bPersistent
);
// https://secure.php.net/manual/en/ref.pdo-pgsql.connection.php
try {
$conn = new \PDO($this->sDSN, null, null, $aConnOptions);
} catch (\PDOException $e) {
$sMsg = 'Failed to establish database connection:' . $e->getMessage();
throw new \Nominatim\DatabaseError($sMsg, 500, null, $e->getMessage());
}
$conn->exec("SET DateStyle TO 'sql,european'");
$conn->exec("SET client_encoding TO 'utf-8'");
$iMaxExecution = ini_get('max_execution_time');
if ($iMaxExecution > 0) {
$conn->setAttribute(\PDO::ATTR_TIMEOUT, $iMaxExecution); // seconds
}
$this->connection = $conn;
return true;
}
// returns the number of rows that were modified or deleted by the SQL
// statement. If no rows were affected returns 0.
public function exec($sSQL, $aInputVars = null, $sErrMessage = 'Database query failed')
{
$val = null;
try {
if (isset($aInputVars)) {
$stmt = $this->connection->prepare($sSQL);
$stmt->execute($aInputVars);
} else {
$val = $this->connection->exec($sSQL);
}
} catch (\PDOException $e) {
throw new \Nominatim\DatabaseError($sErrMessage, 500, null, $e, $sSQL);
}
return $val;
}
/**
* Executes query. Returns first row as array.
* Returns false if no result found.
*
* @param string $sSQL
*
* @return array[]
*/
public function getRow($sSQL, $aInputVars = null, $sErrMessage = 'Database query failed')
{
try {
$stmt = $this->getQueryStatement($sSQL, $aInputVars, $sErrMessage);
$row = $stmt->fetch();
} catch (\PDOException $e) {
throw new \Nominatim\DatabaseError($sErrMessage, 500, null, $e, $sSQL);
}
return $row;
}
/**
* Executes query. Returns first value of first result.
* Returns false if no results found.
*
* @param string $sSQL
*
* @return array[]
*/
public function getOne($sSQL, $aInputVars = null, $sErrMessage = 'Database query failed')
{
try {
$stmt = $this->getQueryStatement($sSQL, $aInputVars, $sErrMessage);
$row = $stmt->fetch(\PDO::FETCH_NUM);
if ($row === false) {
return false;
}
} catch (\PDOException $e) {
throw new \Nominatim\DatabaseError($sErrMessage, 500, null, $e, $sSQL);
}
return $row[0];
}
/**
* Executes query. Returns array of results (arrays).
* Returns empty array if no results found.
*
* @param string $sSQL
*
* @return array[]
*/
public function getAll($sSQL, $aInputVars = null, $sErrMessage = 'Database query failed')
{
try {
$stmt = $this->getQueryStatement($sSQL, $aInputVars, $sErrMessage);
$rows = $stmt->fetchAll();
} catch (\PDOException $e) {
throw new \Nominatim\DatabaseError($sErrMessage, 500, null, $e, $sSQL);
}
return $rows;
}
/**
* Executes query. Returns array of the first value of each result.
* Returns empty array if no results found.
*
* @param string $sSQL
*
* @return array[]
*/
public function getCol($sSQL, $aInputVars = null, $sErrMessage = 'Database query failed')
{
$aVals = array();
try {
$stmt = $this->getQueryStatement($sSQL, $aInputVars, $sErrMessage);
while (($val = $stmt->fetchColumn(0)) !== false) { // returns first column or false
$aVals[] = $val;
}
} catch (\PDOException $e) {
throw new \Nominatim\DatabaseError($sErrMessage, 500, null, $e, $sSQL);
}
return $aVals;
}
/**
* Executes query. Returns associate array mapping first value to second value of each result.
* Returns empty array if no results found.
*
* @param string $sSQL
*
* @return array[]
*/
public function getAssoc($sSQL, $aInputVars = null, $sErrMessage = 'Database query failed')
{
try {
$stmt = $this->getQueryStatement($sSQL, $aInputVars, $sErrMessage);
$aList = array();
while ($aRow = $stmt->fetch(\PDO::FETCH_NUM)) {
$aList[$aRow[0]] = $aRow[1];
}
} catch (\PDOException $e) {
throw new \Nominatim\DatabaseError($sErrMessage, 500, null, $e, $sSQL);
}
return $aList;
}
/**
* Executes query. Returns a PDO statement to iterate over.
*
* @param string $sSQL
*
* @return PDOStatement
*/
public function getQueryStatement($sSQL, $aInputVars = null, $sErrMessage = 'Database query failed')
{
try {
if (isset($aInputVars)) {
$stmt = $this->connection->prepare($sSQL);
$stmt->execute($aInputVars);
} else {
$stmt = $this->connection->query($sSQL);
}
} catch (\PDOException $e) {
throw new \Nominatim\DatabaseError($sErrMessage, 500, null, $e, $sSQL);
}
return $stmt;
}
/**
* St. John's Way => 'St. John\'s Way'
*
* @param string $sVal Text to be quoted.
*
* @return string
*/
public function getDBQuoted($sVal)
{
return $this->connection->quote($sVal);
}
/**
* Like getDBQuoted, but takes an array.
*
* @param array $aVals List of text to be quoted.
*
* @return array[]
*/
public function getDBQuotedList($aVals)
{
return array_map(function ($sVal) {
return $this->getDBQuoted($sVal);
}, $aVals);
}
/**
* [1,2,'b'] => 'ARRAY[1,2,'b']''
*
* @param array $aVals List of text to be quoted.
*
* @return string
*/
public function getArraySQL($a)
{
return 'ARRAY['.join(',', $a).']';
}
/**
* Check if a table exists in the database. Returns true if it does.
*
* @param string $sTableName
*
* @return boolean
*/
public function tableExists($sTableName)
{
$sSQL = 'SELECT count(*) FROM pg_tables WHERE tablename = :tablename';
return ($this->getOne($sSQL, array(':tablename' => $sTableName)) == 1);
}
/**
* Deletes a table. Returns true if deleted or didn't exist.
*
* @param string $sTableName
*
* @return boolean
*/
public function deleteTable($sTableName)
{
return $this->exec('DROP TABLE IF EXISTS '.$sTableName.' CASCADE') == 0;
}
/**
* Tries to connect to the database but on failure doesn't throw an exception.
*
* @return boolean
*/
public function checkConnection()
{
$bExists = true;
try {
$this->connect(true);
} catch (\Nominatim\DatabaseError $e) {
$bExists = false;
}
return $bExists;
}
/**
* e.g. 9.6, 10, 11.2
*
* @return float
*/
public function getPostgresVersion()
{
$sVersionString = $this->getOne('SHOW server_version_num');
preg_match('#([0-9]?[0-9])([0-9][0-9])[0-9][0-9]#', $sVersionString, $aMatches);
return (float) ($aMatches[1].'.'.$aMatches[2]);
}
/**
* e.g. 2, 2.2
*
* @return float
*/
public function getPostgisVersion()
{
$sVersionString = $this->getOne('select postgis_lib_version()');
preg_match('#^([0-9]+)[.]([0-9]+)[.]#', $sVersionString, $aMatches);
return (float) ($aMatches[1].'.'.$aMatches[2]);
}
/**
* Returns an associate array of postgresql database connection settings. Keys can
* be 'database', 'hostspec', 'port', 'username', 'password'.
* Returns empty array on failure, thus check if at least 'database' is set.
*
* @return array[]
*/
public static function parseDSN($sDSN)
{
// https://secure.php.net/manual/en/ref.pdo-pgsql.connection.php
$aInfo = array();
if (preg_match('/^pgsql:(.+)$/', $sDSN, $aMatches)) {
foreach (explode(';', $aMatches[1]) as $sKeyVal) {
list($sKey, $sVal) = explode('=', $sKeyVal, 2);
if ($sKey == 'host') {
$sKey = 'hostspec';
} elseif ($sKey == 'dbname') {
$sKey = 'database';
} elseif ($sKey == 'user') {
$sKey = 'username';
}
$aInfo[$sKey] = $sVal;
}
}
return $aInfo;
}
/**
* Takes an array of settings and return the DNS string. Key names can be
* 'database', 'hostspec', 'port', 'username', 'password' but aliases
* 'dbname', 'host' and 'user' are also supported.
*
* @return string
*
*/
public static function generateDSN($aInfo)
{
$sDSN = sprintf(
'pgsql:host=%s;port=%s;dbname=%s;user=%s;password=%s;',
$aInfo['host'] ?? $aInfo['hostspec'] ?? '',
$aInfo['port'] ?? '',
$aInfo['dbname'] ?? $aInfo['database'] ?? '',
$aInfo['user'] ?? '',
$aInfo['password'] ?? ''
);
$sDSN = preg_replace('/\b\w+=;/', '', $sDSN);
$sDSN = preg_replace('/;\Z/', '', $sDSN);
return $sDSN;
}
}

34
lib-php/DatabaseError.php Normal file
View File

@@ -0,0 +1,34 @@
<?php
namespace Nominatim;
class DatabaseError extends \Exception
{
public function __construct($message, $code, $previous, $oPDOErr, $sSql = null)
{
parent::__construct($message, $code, $previous);
// https://secure.php.net/manual/en/class.pdoexception.php
$this->oPDOErr = $oPDOErr;
$this->sSql = $sSql;
}
public function __toString()
{
return __CLASS__ . ": [{$this->code}]: {$this->message}\n";
}
public function getSqlError()
{
return $this->oPDOErr->getMessage();
}
public function getSqlDebugDump()
{
if (CONST_Debug) {
return var_export($this->oPDOErr, true);
} else {
return $this->sSql;
}
}
}

180
lib-php/DebugHtml.php Normal file
View File

@@ -0,0 +1,180 @@
<?php
namespace Nominatim;
class Debug
{
public static function newFunction($sHeading)
{
echo "<pre><h2>Debug output for $sHeading</h2></pre>\n";
}
public static function newSection($sHeading)
{
echo "<hr><pre><h3>$sHeading</h3></pre>\n";
}
public static function printVar($sHeading, $mVar)
{
echo '<pre><b>'.$sHeading. ':</b> ';
Debug::outputVar($mVar, str_repeat(' ', strlen($sHeading) + 3));
echo "</pre>\n";
}
public static function fmtArrayVals($aArr)
{
return array('__debug_format' => 'array_vals', 'data' => $aArr);
}
public static function printDebugArray($sHeading, $oVar)
{
if ($oVar === null) {
Debug::printVar($sHeading, 'null');
} else {
Debug::printVar($sHeading, $oVar->debugInfo());
}
}
public static function printDebugTable($sHeading, $aVar)
{
echo '<b>'.$sHeading.":</b>\n";
echo "<table border='1'>\n";
if (!empty($aVar)) {
echo " <tr>\n";
$aKeys = array();
$aInfo = reset($aVar);
if (!is_array($aInfo)) {
$aInfo = $aInfo->debugInfo();
}
foreach ($aInfo as $sKey => $mVal) {
echo ' <th><small>'.$sKey.'</small></th>'."\n";
$aKeys[] = $sKey;
}
echo " </tr>\n";
foreach ($aVar as $oRow) {
$aInfo = $oRow;
if (!is_array($oRow)) {
$aInfo = $oRow->debugInfo();
}
echo " <tr>\n";
foreach ($aKeys as $sKey) {
echo ' <td><pre>';
if (isset($aInfo[$sKey])) {
Debug::outputVar($aInfo[$sKey], '');
}
echo '</pre></td>'."\n";
}
echo " </tr>\n";
}
}
echo "</table>\n";
}
public static function printGroupedSearch($aSearches, $aWordsIDs)
{
echo '<table border="1">';
echo '<tr><th>rank</th><th>Name Tokens</th><th>Name Not</th>';
echo '<th>Address Tokens</th><th>Address Not</th>';
echo '<th>country</th><th>operator</th>';
echo '<th>class</th><th>type</th><th>postcode</th><th>housenumber</th></tr>';
foreach ($aSearches as $aRankedSet) {
foreach ($aRankedSet as $aRow) {
$aRow->dumpAsHtmlTableRow($aWordsIDs);
}
}
echo '</table>';
}
public static function printGroupTable($sHeading, $aVar)
{
echo '<b>'.$sHeading.":</b>\n";
echo "<table border='1'>\n";
if (!empty($aVar)) {
echo " <tr>\n";
echo ' <th><small>Group</small></th>'."\n";
$aKeys = array();
$aInfo = reset($aVar)[0];
if (!is_array($aInfo)) {
$aInfo = $aInfo->debugInfo();
}
foreach ($aInfo as $sKey => $mVal) {
echo ' <th><small>'.$sKey.'</small></th>'."\n";
$aKeys[] = $sKey;
}
echo " </tr>\n";
foreach ($aVar as $sGrpKey => $aGroup) {
foreach ($aGroup as $oRow) {
$aInfo = $oRow;
if (!is_array($oRow)) {
$aInfo = $oRow->debugInfo();
}
echo " <tr>\n";
echo ' <td><pre>'.$sGrpKey.'</pre></td>'."\n";
foreach ($aKeys as $sKey) {
echo ' <td><pre>';
if (!empty($aInfo[$sKey])) {
Debug::outputVar($aInfo[$sKey], '');
}
echo '</pre></td>'."\n";
}
echo " </tr>\n";
}
}
}
echo "</table>\n";
}
public static function printSQL($sSQL)
{
echo '<p><tt><font color="#aaa">'.$sSQL.'</font></tt></p>'."\n";
}
private static function outputVar($mVar, $sPreNL)
{
if (is_array($mVar) && !isset($mVar['__debug_format'])) {
$sPre = '';
foreach ($mVar as $mKey => $aValue) {
echo $sPre;
$iKeyLen = Debug::outputSimpleVar($mKey);
echo ' => ';
Debug::outputVar(
$aValue,
$sPreNL.str_repeat(' ', $iKeyLen + 4)
);
$sPre = "\n".$sPreNL;
}
} elseif (is_array($mVar) && isset($mVar['__debug_format'])) {
if (!empty($mVar['data'])) {
$sPre = '';
foreach ($mVar['data'] as $mValue) {
echo $sPre;
Debug::outputSimpleVar($mValue);
$sPre = ', ';
}
}
} elseif (is_object($mVar) && method_exists($mVar, 'debugInfo')) {
Debug::outputVar($mVar->debugInfo(), $sPreNL);
} elseif (is_a($mVar, 'stdClass')) {
Debug::outputVar(json_decode(json_encode($mVar), true), $sPreNL);
} else {
Debug::outputSimpleVar($mVar);
}
}
private static function outputSimpleVar($mVar)
{
if (is_bool($mVar)) {
echo '<i>'.($mVar ? 'True' : 'False').'</i>';
return $mVar ? 4 : 5;
}
if (is_string($mVar)) {
echo "'$mVar'";
return strlen($mVar) + 2;
}
echo (string)$mVar;
return strlen((string)$mVar);
}
}

11
lib-php/DebugNone.php Normal file
View File

@@ -0,0 +1,11 @@
<?php
namespace Nominatim;
class Debug
{
public static function __callStatic($name, $arguments)
{
// nothing
}
}

929
lib-php/Geocode.php Normal file
View File

@@ -0,0 +1,929 @@
<?php
namespace Nominatim;
require_once(CONST_LibDir.'/PlaceLookup.php');
require_once(CONST_LibDir.'/Phrase.php');
require_once(CONST_LibDir.'/ReverseGeocode.php');
require_once(CONST_LibDir.'/SearchDescription.php');
require_once(CONST_LibDir.'/SearchContext.php');
require_once(CONST_LibDir.'/SearchPosition.php');
require_once(CONST_LibDir.'/TokenList.php');
require_once(CONST_TokenizerDir.'/tokenizer.php');
class Geocode
{
protected $oDB;
protected $oPlaceLookup;
protected $oTokenizer;
protected $aLangPrefOrder = array();
protected $aExcludePlaceIDs = array();
protected $iLimit = 20;
protected $iFinalLimit = 10;
protected $iOffset = 0;
protected $bFallback = false;
protected $aCountryCodes = false;
protected $bBoundedSearch = false;
protected $aViewBox = false;
protected $aRoutePoints = false;
protected $aRouteWidth = false;
protected $iMaxRank = 20;
protected $iMinAddressRank = 0;
protected $iMaxAddressRank = 30;
protected $aAddressRankList = array();
protected $sAllowedTypesSQLList = false;
protected $sQuery = false;
protected $aStructuredQuery = false;
public function __construct(&$oDB)
{
$this->oDB =& $oDB;
$this->oPlaceLookup = new PlaceLookup($this->oDB);
$this->oTokenizer = new \Nominatim\Tokenizer($this->oDB);
}
public function setLanguagePreference($aLangPref)
{
$this->aLangPrefOrder = $aLangPref;
}
public function getMoreUrlParams()
{
if ($this->aStructuredQuery) {
$aParams = $this->aStructuredQuery;
} else {
$aParams = array('q' => $this->sQuery);
}
$aParams = array_merge($aParams, $this->oPlaceLookup->getMoreUrlParams());
if ($this->aExcludePlaceIDs) {
$aParams['exclude_place_ids'] = implode(',', $this->aExcludePlaceIDs);
}
if ($this->bBoundedSearch) {
$aParams['bounded'] = '1';
}
if ($this->aCountryCodes) {
$aParams['countrycodes'] = implode(',', $this->aCountryCodes);
}
if ($this->aViewBox) {
$aParams['viewbox'] = join(',', $this->aViewBox);
}
return $aParams;
}
public function setLimit($iLimit = 10)
{
if ($iLimit > 50) {
$iLimit = 50;
} elseif ($iLimit < 1) {
$iLimit = 1;
}
$this->iFinalLimit = $iLimit;
$this->iLimit = $iLimit + min($iLimit, 10);
}
public function setFeatureType($sFeatureType)
{
switch ($sFeatureType) {
case 'country':
$this->setRankRange(4, 4);
break;
case 'state':
$this->setRankRange(8, 8);
break;
case 'city':
$this->setRankRange(14, 16);
break;
case 'settlement':
$this->setRankRange(8, 20);
break;
}
}
public function setRankRange($iMin, $iMax)
{
$this->iMinAddressRank = $iMin;
$this->iMaxAddressRank = $iMax;
}
public function setViewbox($aViewbox)
{
$aBox = array_map('floatval', $aViewbox);
$this->aViewBox[0] = max(-180.0, min($aBox[0], $aBox[2]));
$this->aViewBox[1] = max(-90.0, min($aBox[1], $aBox[3]));
$this->aViewBox[2] = min(180.0, max($aBox[0], $aBox[2]));
$this->aViewBox[3] = min(90.0, max($aBox[1], $aBox[3]));
if ($this->aViewBox[2] - $this->aViewBox[0] < 0.000000001
|| $this->aViewBox[3] - $this->aViewBox[1] < 0.000000001
) {
userError("Bad parameter 'viewbox'. Not a box.");
}
}
private function viewboxImportanceFactor($fX, $fY)
{
if (!$this->aViewBox) {
return 1;
}
$fWidth = ($this->aViewBox[2] - $this->aViewBox[0])/2;
$fHeight = ($this->aViewBox[3] - $this->aViewBox[1])/2;
$fXDist = abs($fX - ($this->aViewBox[0] + $this->aViewBox[2])/2);
$fYDist = abs($fY - ($this->aViewBox[1] + $this->aViewBox[3])/2);
if ($fXDist <= $fWidth && $fYDist <= $fHeight) {
return 1;
}
if ($fXDist <= $fWidth * 3 && $fYDist <= 3 * $fHeight) {
return 0.5;
}
return 0.25;
}
public function setQuery($sQueryString)
{
$this->sQuery = $sQueryString;
$this->aStructuredQuery = false;
}
public function getQueryString()
{
return $this->sQuery;
}
public function loadParamArray($oParams, $sForceGeometryType = null)
{
$this->bBoundedSearch = $oParams->getBool('bounded', $this->bBoundedSearch);
$this->setLimit($oParams->getInt('limit', $this->iFinalLimit));
$this->iOffset = $oParams->getInt('offset', $this->iOffset);
$this->bFallback = $oParams->getBool('fallback', $this->bFallback);
// List of excluded Place IDs - used for more acurate pageing
$sExcluded = $oParams->getStringList('exclude_place_ids');
if ($sExcluded) {
foreach ($sExcluded as $iExcludedPlaceID) {
$iExcludedPlaceID = (int)$iExcludedPlaceID;
if ($iExcludedPlaceID) {
$aExcludePlaceIDs[$iExcludedPlaceID] = $iExcludedPlaceID;
}
}
if (isset($aExcludePlaceIDs)) {
$this->aExcludePlaceIDs = $aExcludePlaceIDs;
}
}
// Only certain ranks of feature
$sFeatureType = $oParams->getString('featureType');
if (!$sFeatureType) {
$sFeatureType = $oParams->getString('featuretype');
}
if ($sFeatureType) {
$this->setFeatureType($sFeatureType);
}
// Country code list
$sCountries = $oParams->getStringList('countrycodes');
if ($sCountries) {
foreach ($sCountries as $sCountryCode) {
if (preg_match('/^[a-zA-Z][a-zA-Z]$/', $sCountryCode)) {
$aCountries[] = strtolower($sCountryCode);
}
}
if (isset($aCountries)) {
$this->aCountryCodes = $aCountries;
}
}
$aViewbox = $oParams->getStringList('viewboxlbrt');
if ($aViewbox) {
if (count($aViewbox) != 4) {
userError("Bad parameter 'viewboxlbrt'. Expected 4 coordinates.");
}
$this->setViewbox($aViewbox);
} else {
$aViewbox = $oParams->getStringList('viewbox');
if ($aViewbox) {
if (count($aViewbox) != 4) {
userError("Bad parameter 'viewbox'. Expected 4 coordinates.");
}
$this->setViewBox($aViewbox);
} else {
$aRoute = $oParams->getStringList('route');
$fRouteWidth = $oParams->getFloat('routewidth');
if ($aRoute && $fRouteWidth) {
$this->aRoutePoints = $aRoute;
$this->aRouteWidth = $fRouteWidth;
}
}
}
$this->oPlaceLookup->loadParamArray($oParams, $sForceGeometryType);
$this->oPlaceLookup->setIncludeAddressDetails($oParams->getBool('addressdetails', false));
}
public function setQueryFromParams($oParams)
{
// Search query
$sQuery = $oParams->getString('q');
if (!$sQuery) {
$this->setStructuredQuery(
$oParams->getString('amenity'),
$oParams->getString('street'),
$oParams->getString('city'),
$oParams->getString('county'),
$oParams->getString('state'),
$oParams->getString('country'),
$oParams->getString('postalcode')
);
} else {
$this->setQuery($sQuery);
}
}
public function loadStructuredAddressElement($sValue, $sKey, $iNewMinAddressRank, $iNewMaxAddressRank, $aItemListValues)
{
$sValue = trim($sValue);
if (!$sValue) {
return false;
}
$this->aStructuredQuery[$sKey] = $sValue;
if ($this->iMinAddressRank == 0 && $this->iMaxAddressRank == 30) {
$this->iMinAddressRank = $iNewMinAddressRank;
$this->iMaxAddressRank = $iNewMaxAddressRank;
}
if ($aItemListValues) {
$this->aAddressRankList = array_merge($this->aAddressRankList, $aItemListValues);
}
return true;
}
public function setStructuredQuery($sAmenity = false, $sStreet = false, $sCity = false, $sCounty = false, $sState = false, $sCountry = false, $sPostalCode = false)
{
$this->sQuery = false;
// Reset
$this->iMinAddressRank = 0;
$this->iMaxAddressRank = 30;
$this->aAddressRankList = array();
$this->aStructuredQuery = array();
$this->sAllowedTypesSQLList = false;
$this->loadStructuredAddressElement($sAmenity, 'amenity', 26, 30, false);
$this->loadStructuredAddressElement($sStreet, 'street', 26, 30, false);
$this->loadStructuredAddressElement($sCity, 'city', 14, 24, false);
$this->loadStructuredAddressElement($sCounty, 'county', 9, 13, false);
$this->loadStructuredAddressElement($sState, 'state', 8, 8, false);
$this->loadStructuredAddressElement($sPostalCode, 'postalcode', 5, 11, array(5, 11));
$this->loadStructuredAddressElement($sCountry, 'country', 4, 4, false);
if (!empty($this->aStructuredQuery)) {
$this->sQuery = join(', ', $this->aStructuredQuery);
if ($this->iMaxAddressRank < 30) {
$this->sAllowedTypesSQLList = '(\'place\',\'boundary\')';
}
}
}
public function fallbackStructuredQuery()
{
$aParams = $this->aStructuredQuery;
if (!$aParams || count($aParams) == 1) {
return false;
}
$aOrderToFallback = array('postalcode', 'street', 'city', 'county', 'state');
foreach ($aOrderToFallback as $sType) {
if (isset($aParams[$sType])) {
unset($aParams[$sType]);
$this->setStructuredQuery(@$aParams['amenity'], @$aParams['street'], @$aParams['city'], @$aParams['county'], @$aParams['state'], @$aParams['country'], @$aParams['postalcode']);
return true;
}
}
return false;
}
public function getGroupedSearches($aSearches, $aPhrases, $oValidTokens)
{
/*
Calculate all searches using oValidTokens i.e.
'Wodsworth Road, Sheffield' =>
Phrase Wordset
0 0 (wodsworth road)
0 1 (wodsworth)(road)
1 0 (sheffield)
Score how good the search is so they can be ordered
*/
foreach ($aPhrases as $iPhrase => $oPhrase) {
$aNewPhraseSearches = array();
$oPosition = new SearchPosition(
$oPhrase->getPhraseType(),
$iPhrase,
count($aPhrases)
);
foreach ($oPhrase->getWordSets() as $aWordset) {
$aWordsetSearches = $aSearches;
// Add all words from this wordset
foreach ($aWordset as $iToken => $sToken) {
$aNewWordsetSearches = array();
$oPosition->setTokenPosition($iToken, count($aWordset));
foreach ($aWordsetSearches as $oCurrentSearch) {
foreach ($oValidTokens->get($sToken) as $oSearchTerm) {
if ($oSearchTerm->isExtendable($oCurrentSearch, $oPosition)) {
$aNewSearches = $oSearchTerm->extendSearch(
$oCurrentSearch,
$oPosition
);
foreach ($aNewSearches as $oSearch) {
if ($oSearch->getRank() < $this->iMaxRank) {
$aNewWordsetSearches[] = $oSearch;
}
}
}
}
}
// Sort and cut
usort($aNewWordsetSearches, array('Nominatim\SearchDescription', 'bySearchRank'));
$aWordsetSearches = array_slice($aNewWordsetSearches, 0, 50);
}
$aNewPhraseSearches = array_merge($aNewPhraseSearches, $aNewWordsetSearches);
usort($aNewPhraseSearches, array('Nominatim\SearchDescription', 'bySearchRank'));
$aSearchHash = array();
foreach ($aNewPhraseSearches as $iSearch => $aSearch) {
$sHash = serialize($aSearch);
if (isset($aSearchHash[$sHash])) {
unset($aNewPhraseSearches[$iSearch]);
} else {
$aSearchHash[$sHash] = 1;
}
}
$aNewPhraseSearches = array_slice($aNewPhraseSearches, 0, 50);
}
// Re-group the searches by their score, junk anything over 20 as just not worth trying
$aGroupedSearches = array();
foreach ($aNewPhraseSearches as $aSearch) {
$iRank = $aSearch->getRank();
if ($iRank < $this->iMaxRank) {
if (!isset($aGroupedSearches[$iRank])) {
$aGroupedSearches[$iRank] = array();
}
$aGroupedSearches[$iRank][] = $aSearch;
}
}
ksort($aGroupedSearches);
$iSearchCount = 0;
$aSearches = array();
foreach ($aGroupedSearches as $aNewSearches) {
$iSearchCount += count($aNewSearches);
$aSearches = array_merge($aSearches, $aNewSearches);
if ($iSearchCount > 50) {
break;
}
}
}
// Revisit searches, drop bad searches and give penalty to unlikely combinations.
$aGroupedSearches = array();
foreach ($aSearches as $oSearch) {
if (!$oSearch->isValidSearch()) {
continue;
}
$iRank = $oSearch->getRank();
if (!isset($aGroupedSearches[$iRank])) {
$aGroupedSearches[$iRank] = array();
}
$aGroupedSearches[$iRank][] = $oSearch;
}
ksort($aGroupedSearches);
return $aGroupedSearches;
}
/* Perform the actual query lookup.
Returns an ordered list of results, each with the following fields:
osm_type: type of corresponding OSM object
N - node
W - way
R - relation
P - postcode (internally computed)
osm_id: id of corresponding OSM object
class: general object class (corresponds to tag key of primary OSM tag)
type: subclass of object (corresponds to tag value of primary OSM tag)
admin_level: see https://wiki.openstreetmap.org/wiki/Admin_level
rank_search: rank in search hierarchy
(see also https://wiki.openstreetmap.org/wiki/Nominatim/Development_overview#Country_to_street_level)
rank_address: rank in address hierarchy (determines orer in address)
place_id: internal key (may differ between different instances)
country_code: ISO country code
langaddress: localized full address
placename: localized name of object
ref: content of ref tag (if available)
lon: longitude
lat: latitude
importance: importance of place based on Wikipedia link count
addressimportance: cumulated importance of address elements
extra_place: type of place (for admin boundaries, if there is a place tag)
aBoundingBox: bounding Box
label: short description of the object class/type (English only)
name: full name (currently the same as langaddress)
foundorder: secondary ordering for places with same importance
*/
public function lookup()
{
Debug::newFunction('Geocode::lookup');
if (!$this->sQuery && !$this->aStructuredQuery) {
return array();
}
Debug::printDebugArray('Geocode', $this);
$oCtx = new SearchContext();
if ($this->aRoutePoints) {
$oCtx->setViewboxFromRoute(
$this->oDB,
$this->aRoutePoints,
$this->aRouteWidth,
$this->bBoundedSearch
);
} elseif ($this->aViewBox) {
$oCtx->setViewboxFromBox($this->aViewBox, $this->bBoundedSearch);
}
if ($this->aExcludePlaceIDs) {
$oCtx->setExcludeList($this->aExcludePlaceIDs);
}
if ($this->aCountryCodes) {
$oCtx->setCountryList($this->aCountryCodes);
}
Debug::newSection('Query Preprocessing');
$sQuery = $this->sQuery;
if (!preg_match('//u', $sQuery)) {
userError('Query string is not UTF-8 encoded.');
}
// Do we have anything that looks like a lat/lon pair?
$sQuery = $oCtx->setNearPointFromQuery($sQuery);
if ($sQuery || $this->aStructuredQuery) {
// Start with a single blank search
$aSearches = array(new SearchDescription($oCtx));
if ($sQuery) {
$sQuery = $aSearches[0]->extractKeyValuePairs($sQuery);
}
$sSpecialTerm = '';
if ($sQuery) {
preg_match_all(
'/\\[([\\w ]*)\\]/u',
$sQuery,
$aSpecialTermsRaw,
PREG_SET_ORDER
);
if (!empty($aSpecialTermsRaw)) {
Debug::printVar('Special terms', $aSpecialTermsRaw);
}
foreach ($aSpecialTermsRaw as $aSpecialTerm) {
$sQuery = str_replace($aSpecialTerm[0], ' ', $sQuery);
if (!$sSpecialTerm) {
$sSpecialTerm = $aSpecialTerm[1];
}
}
}
if (!$sSpecialTerm && $this->aStructuredQuery
&& isset($this->aStructuredQuery['amenity'])) {
$sSpecialTerm = $this->aStructuredQuery['amenity'];
unset($this->aStructuredQuery['amenity']);
}
if ($sSpecialTerm && !$aSearches[0]->hasOperator()) {
$aTokens = $this->oTokenizer->tokensForSpecialTerm($sSpecialTerm);
if (!empty($aTokens)) {
$aNewSearches = array();
$oPosition = new SearchPosition('', 0, 1);
$oPosition->setTokenPosition(0, 1);
foreach ($aSearches as $oSearch) {
foreach ($aTokens as $oToken) {
$aNewSearches = array_merge(
$aNewSearches,
$oToken->extendSearch($oSearch, $oPosition)
);
}
}
$aSearches = $aNewSearches;
}
}
// Split query into phrases
// Commas are used to reduce the search space by indicating where phrases split
$aPhrases = array();
if ($this->aStructuredQuery) {
foreach ($this->aStructuredQuery as $iPhrase => $sPhrase) {
$aPhrases[] = new Phrase($sPhrase, $iPhrase);
}
} else {
foreach (explode(',', $sQuery) as $sPhrase) {
$aPhrases[] = new Phrase($sPhrase, '');
}
}
Debug::printDebugArray('Search context', $oCtx);
Debug::printDebugArray('Base search', empty($aSearches) ? null : $aSearches[0]);
Debug::newSection('Tokenization');
$oValidTokens = $this->oTokenizer->extractTokensFromPhrases($aPhrases);
if ($oValidTokens->count() > 0) {
$oCtx->setFullNameWords($oValidTokens->getFullWordIDs());
$aPhrases = array_filter($aPhrases, function ($oPhrase) {
return $oPhrase->getWordSets() !== null;
});
// Any words that have failed completely?
// TODO: suggestions
Debug::printGroupTable('Valid Tokens', $oValidTokens->debugInfo());
Debug::printDebugTable('Phrases', $aPhrases);
Debug::newSection('Search candidates');
$aGroupedSearches = $this->getGroupedSearches($aSearches, $aPhrases, $oValidTokens);
if (!$this->aStructuredQuery) {
// Reverse phrase array and also reverse the order of the wordsets in
// the first and final phrase. Don't bother about phrases in the middle
// because order in the address doesn't matter.
$aPhrases = array_reverse($aPhrases);
$aPhrases[0]->invertWordSets();
if (count($aPhrases) > 1) {
$aPhrases[count($aPhrases)-1]->invertWordSets();
}
$aReverseGroupedSearches = $this->getGroupedSearches($aSearches, $aPhrases, $oValidTokens);
foreach ($aGroupedSearches as $aSearches) {
foreach ($aSearches as $aSearch) {
if (!isset($aReverseGroupedSearches[$aSearch->getRank()])) {
$aReverseGroupedSearches[$aSearch->getRank()] = array();
}
$aReverseGroupedSearches[$aSearch->getRank()][] = $aSearch;
}
}
$aGroupedSearches = $aReverseGroupedSearches;
ksort($aGroupedSearches);
}
} else {
// Re-group the searches by their score, junk anything over 20 as just not worth trying
$aGroupedSearches = array();
foreach ($aSearches as $aSearch) {
if ($aSearch->getRank() < $this->iMaxRank) {
if (!isset($aGroupedSearches[$aSearch->getRank()])) {
$aGroupedSearches[$aSearch->getRank()] = array();
}
$aGroupedSearches[$aSearch->getRank()][] = $aSearch;
}
}
ksort($aGroupedSearches);
}
// Filter out duplicate searches
$aSearchHash = array();
foreach ($aGroupedSearches as $iGroup => $aSearches) {
foreach ($aSearches as $iSearch => $aSearch) {
$sHash = serialize($aSearch);
if (isset($aSearchHash[$sHash])) {
unset($aGroupedSearches[$iGroup][$iSearch]);
if (empty($aGroupedSearches[$iGroup])) {
unset($aGroupedSearches[$iGroup]);
}
} else {
$aSearchHash[$sHash] = 1;
}
}
}
Debug::printGroupedSearch(
$aGroupedSearches,
$oValidTokens->debugTokenByWordIdList()
);
// Start the search process
$iGroupLoop = 0;
$iQueryLoop = 0;
$aNextResults = array();
foreach ($aGroupedSearches as $iGroupedRank => $aSearches) {
$iGroupLoop++;
$aResults = $aNextResults;
foreach ($aSearches as $oSearch) {
$iQueryLoop++;
Debug::newSection("Search Loop, group $iGroupLoop, loop $iQueryLoop");
Debug::printGroupedSearch(
array($iGroupedRank => array($oSearch)),
$oValidTokens->debugTokenByWordIdList()
);
$aNewResults = $oSearch->query(
$this->oDB,
$this->iMinAddressRank,
$this->iMaxAddressRank,
$this->iLimit
);
// The same result may appear in different rounds, only
// use the one with minimal rank.
foreach ($aNewResults as $iPlace => $oRes) {
if (!isset($aResults[$iPlace])
|| $aResults[$iPlace]->iResultRank > $oRes->iResultRank) {
$aResults[$iPlace] = $oRes;
}
}
if ($iQueryLoop > 20) {
break;
}
}
if (!empty($aResults)) {
$aSplitResults = Result::splitResults($aResults);
Debug::printVar('Split results', $aSplitResults);
if ($iGroupLoop <= 4
&& reset($aSplitResults['head'])->iResultRank > 0
&& $iGroupedRank !== array_key_last($aGroupedSearches)) {
// Haven't found an exact match for the query yet.
// Therefore add result from the next group level.
$aNextResults = $aSplitResults['head'];
foreach ($aNextResults as $oRes) {
$oRes->iResultRank--;
}
foreach ($aSplitResults['tail'] as $oRes) {
$oRes->iResultRank--;
$aNextResults[$oRes->iId] = $oRes;
}
$aResults = array();
} else {
$aResults = $aSplitResults['head'];
}
}
if (!empty($aResults) && ($this->iMinAddressRank != 0 || $this->iMaxAddressRank != 30)) {
// Need to verify passes rank limits before dropping out of the loop (yuk!)
// reduces the number of place ids, like a filter
// rank_address is 30 for interpolated housenumbers
$aFilterSql = array();
$sPlaceIds = Result::joinIdsByTable($aResults, Result::TABLE_PLACEX);
if ($sPlaceIds) {
$sSQL = 'SELECT place_id FROM placex ';
$sSQL .= 'WHERE place_id in ('.$sPlaceIds.') ';
$sSQL .= ' AND (';
$sSQL .= " placex.rank_address between $this->iMinAddressRank and $this->iMaxAddressRank ";
$sSQL .= " OR placex.rank_search between $this->iMinAddressRank and $this->iMaxAddressRank ";
if ($this->aAddressRankList) {
$sSQL .= ' OR placex.rank_address in ('.join(',', $this->aAddressRankList).')';
}
$sSQL .= ')';
$aFilterSql[] = $sSQL;
}
$sPlaceIds = Result::joinIdsByTable($aResults, Result::TABLE_POSTCODE);
if ($sPlaceIds) {
$sSQL = ' SELECT place_id FROM location_postcode lp ';
$sSQL .= 'WHERE place_id in ('.$sPlaceIds.') ';
$sSQL .= " AND (lp.rank_address between $this->iMinAddressRank and $this->iMaxAddressRank ";
if ($this->aAddressRankList) {
$sSQL .= ' OR lp.rank_address in ('.join(',', $this->aAddressRankList).')';
}
$sSQL .= ') ';
$aFilterSql[] = $sSQL;
}
$aFilteredIDs = array();
if ($aFilterSql) {
$sSQL = join(' UNION ', $aFilterSql);
Debug::printSQL($sSQL);
$aFilteredIDs = $this->oDB->getCol($sSQL);
}
$tempIDs = array();
foreach ($aResults as $oResult) {
if (($this->iMaxAddressRank == 30 &&
($oResult->iTable == Result::TABLE_OSMLINE
|| $oResult->iTable == Result::TABLE_TIGER))
|| in_array($oResult->iId, $aFilteredIDs)
) {
$tempIDs[$oResult->iId] = $oResult;
}
}
$aResults = $tempIDs;
}
if (!empty($aResults) || $iGroupLoop > 4 || $iQueryLoop > 30) {
break;
}
}
} else {
// Just interpret as a reverse geocode
$oReverse = new ReverseGeocode($this->oDB);
$oReverse->setZoom(18);
$oLookup = $oReverse->lookupPoint($oCtx->sqlNear, false);
Debug::printVar('Reverse search', $oLookup);
if ($oLookup) {
$aResults = array($oLookup->iId => $oLookup);
}
}
// No results? Done
if (empty($aResults)) {
if ($this->bFallback && $this->fallbackStructuredQuery()) {
return $this->lookup();
}
return array();
}
if ($this->aAddressRankList) {
$this->oPlaceLookup->setAddressRankList($this->aAddressRankList);
}
$this->oPlaceLookup->setAllowedTypesSQLList($this->sAllowedTypesSQLList);
$this->oPlaceLookup->setLanguagePreference($this->aLangPrefOrder);
if ($oCtx->hasNearPoint()) {
$this->oPlaceLookup->setAnchorSql($oCtx->sqlNear);
}
$aSearchResults = $this->oPlaceLookup->lookup($aResults);
$aRecheckWords = preg_split('/\b[\s,\\-]*/u', $sQuery);
foreach ($aRecheckWords as $i => $sWord) {
if (!preg_match('/[\pL\pN]/', $sWord)) {
unset($aRecheckWords[$i]);
}
}
Debug::printVar('Recheck words', $aRecheckWords);
foreach ($aSearchResults as $iIdx => $aResult) {
$fRadius = ClassTypes\getDefRadius($aResult);
$aOutlineResult = $this->oPlaceLookup->getOutlines($aResult['place_id'], $aResult['lon'], $aResult['lat'], $fRadius);
if ($aOutlineResult) {
$aResult = array_merge($aResult, $aOutlineResult);
}
// Is there an icon set for this type of result?
$sIcon = ClassTypes\getIconFile($aResult);
if (isset($sIcon)) {
$aResult['icon'] = $sIcon;
}
$sLabel = ClassTypes\getLabel($aResult);
if (isset($sLabel)) {
$aResult['label'] = $sLabel;
}
$aResult['name'] = $aResult['langaddress'];
if ($oCtx->hasNearPoint()) {
$aResult['importance'] = 0.001;
$aResult['foundorder'] = $aResult['addressimportance'];
} else {
$aResult['importance'] = max(0.001, $aResult['importance']);
$aResult['importance'] *= $this->viewboxImportanceFactor(
$aResult['lon'],
$aResult['lat']
);
// secondary ordering (for results with same importance (the smaller the better):
// - approximate importance of address parts
if (isset($aResult['addressimportance']) && $aResult['addressimportance']) {
$aResult['foundorder'] = -$aResult['addressimportance']/10;
} else {
$aResult['foundorder'] = -$aResult['importance'];
}
// - number of exact matches from the query
$aResult['foundorder'] -= $aResults[$aResult['place_id']]->iExactMatches;
// - importance of the class/type
$iClassImportance = ClassTypes\getImportance($aResult);
if (isset($iClassImportance)) {
$aResult['foundorder'] += 0.0001 * $iClassImportance;
} else {
$aResult['foundorder'] += 0.01;
}
// - rank
$aResult['foundorder'] -= 0.00001 * (30 - $aResult['rank_search']);
// Adjust importance for the number of exact string matches in the result
$iCountWords = 0;
$sAddress = $aResult['langaddress'];
foreach ($aRecheckWords as $i => $sWord) {
if (stripos($sAddress, $sWord)!==false) {
$iCountWords++;
if (preg_match('/(^|,)\s*'.preg_quote($sWord, '/').'\s*(,|$)/', $sAddress)) {
$iCountWords += 0.1;
}
}
}
// 0.1 is a completely arbitrary number but something in the range 0.1 to 0.5 would seem right
$aResult['importance'] = $aResult['importance'] + ($iCountWords*0.1);
}
$aSearchResults[$iIdx] = $aResult;
}
uasort($aSearchResults, 'byImportance');
Debug::printVar('Pre-filter results', $aSearchResults);
$aOSMIDDone = array();
$aClassTypeNameDone = array();
$aToFilter = $aSearchResults;
$aSearchResults = array();
foreach ($aToFilter as $aResult) {
$this->aExcludePlaceIDs[$aResult['place_id']] = $aResult['place_id'];
if (!$this->oPlaceLookup->doDeDupe() || (!isset($aOSMIDDone[$aResult['osm_type'].$aResult['osm_id']])
&& !isset($aClassTypeNameDone[$aResult['osm_type'].$aResult['class'].$aResult['type'].$aResult['name'].$aResult['admin_level']]))
) {
$aOSMIDDone[$aResult['osm_type'].$aResult['osm_id']] = true;
$aClassTypeNameDone[$aResult['osm_type'].$aResult['class'].$aResult['type'].$aResult['name'].$aResult['admin_level']] = true;
$aSearchResults[] = $aResult;
}
// Absolute limit on number of results
if (count($aSearchResults) >= $this->iFinalLimit) {
break;
}
}
Debug::printVar('Post-filter results', $aSearchResults);
return $aSearchResults;
} // end lookup()
public function debugInfo()
{
return array(
'Query' => $this->sQuery,
'Structured query' => $this->aStructuredQuery,
'Name keys' => Debug::fmtArrayVals($this->aLangPrefOrder),
'Excluded place IDs' => Debug::fmtArrayVals($this->aExcludePlaceIDs),
'Limit (for searches)' => $this->iLimit,
'Limit (for results)'=> $this->iFinalLimit,
'Country codes' => Debug::fmtArrayVals($this->aCountryCodes),
'Bounded search' => $this->bBoundedSearch,
'Viewbox' => Debug::fmtArrayVals($this->aViewBox),
'Route points' => Debug::fmtArrayVals($this->aRoutePoints),
'Route width' => $this->aRouteWidth,
'Max rank' => $this->iMaxRank,
'Min address rank' => $this->iMinAddressRank,
'Max address rank' => $this->iMaxAddressRank,
'Address rank list' => Debug::fmtArrayVals($this->aAddressRankList)
);
}
} // end class

134
lib-php/ParameterParser.php Normal file
View File

@@ -0,0 +1,134 @@
<?php
namespace Nominatim;
class ParameterParser
{
private $aParams;
public function __construct($aParams = null)
{
$this->aParams = ($aParams === null) ? $_GET : $aParams;
}
public function getBool($sName, $bDefault = false)
{
if (!isset($this->aParams[$sName]) || strlen($this->aParams[$sName]) == 0) {
return $bDefault;
}
return (bool) $this->aParams[$sName];
}
public function getInt($sName, $bDefault = false)
{
if (!isset($this->aParams[$sName])) {
return $bDefault;
}
if (!preg_match('/^[+-]?[0-9]+$/', $this->aParams[$sName])) {
userError("Integer number expected for parameter '$sName'");
}
return (int) $this->aParams[$sName];
}
public function getFloat($sName, $bDefault = false)
{
if (!isset($this->aParams[$sName])) {
return $bDefault;
}
if (!preg_match('/^[+-]?[0-9]*\.?[0-9]+$/', $this->aParams[$sName])) {
userError("Floating-point number expected for parameter '$sName'");
}
return (float) $this->aParams[$sName];
}
public function getString($sName, $bDefault = false)
{
if (!isset($this->aParams[$sName]) || strlen($this->aParams[$sName]) == 0) {
return $bDefault;
}
return $this->aParams[$sName];
}
public function getSet($sName, $aValues, $sDefault = false)
{
if (!isset($this->aParams[$sName]) || strlen($this->aParams[$sName]) == 0) {
return $sDefault;
}
if (!in_array($this->aParams[$sName], $aValues)) {
userError("Parameter '$sName' must be one of: ".join(', ', $aValues));
}
return $this->aParams[$sName];
}
public function getStringList($sName, $aDefault = false)
{
$sValue = $this->getString($sName);
if ($sValue) {
// removes all NULL, FALSE and Empty Strings but leaves 0 (zero) values
return array_values(array_filter(explode(',', $sValue), 'strlen'));
}
return $aDefault;
}
public function getPreferredLanguages($sFallback = null)
{
if ($sFallback === null && isset($_SERVER['HTTP_ACCEPT_LANGUAGE'])) {
$sFallback = $_SERVER['HTTP_ACCEPT_LANGUAGE'];
}
$aLanguages = array();
$sLangString = $this->getString('accept-language', $sFallback);
if ($sLangString
&& preg_match_all('/(([a-z]{1,8})([-_][a-z]{1,8})?)\s*(;\s*q\s*=\s*(1|0\.[0-9]+))?/i', $sLangString, $aLanguagesParse, PREG_SET_ORDER)
) {
foreach ($aLanguagesParse as $iLang => $aLanguage) {
$aLanguages[$aLanguage[1]] = isset($aLanguage[5])?(float)$aLanguage[5]:1 - ($iLang/100);
if (!isset($aLanguages[$aLanguage[2]])) {
$aLanguages[$aLanguage[2]] = $aLanguages[$aLanguage[1]]/10;
}
}
arsort($aLanguages);
}
if (empty($aLanguages) && CONST_Default_Language) {
$aLanguages[CONST_Default_Language] = 1;
}
foreach ($aLanguages as $sLanguage => $fLanguagePref) {
$aLangPrefOrder['name:'.$sLanguage] = 'name:'.$sLanguage;
}
$aLangPrefOrder['name'] = 'name';
$aLangPrefOrder['brand'] = 'brand';
foreach ($aLanguages as $sLanguage => $fLanguagePref) {
$aLangPrefOrder['official_name:'.$sLanguage] = 'official_name:'.$sLanguage;
$aLangPrefOrder['short_name:'.$sLanguage] = 'short_name:'.$sLanguage;
}
$aLangPrefOrder['official_name'] = 'official_name';
$aLangPrefOrder['short_name'] = 'short_name';
$aLangPrefOrder['ref'] = 'ref';
$aLangPrefOrder['type'] = 'type';
return $aLangPrefOrder;
}
public function hasSetAny($aParamNames)
{
foreach ($aParamNames as $sName) {
if ($this->getBool($sName)) {
return true;
}
}
return false;
}
}

81
lib-php/Phrase.php Normal file
View File

@@ -0,0 +1,81 @@
<?php
namespace Nominatim;
/**
* Segment of a query string.
*
* The parts of a query strings are usually separated by commas.
*/
class Phrase
{
// Complete phrase as a string (guaranteed to have no leading or trailing
// spaces).
private $sPhrase;
// Element type for structured searches.
private $sPhraseType;
// Possible segmentations of the phrase.
private $aWordSets;
public function __construct($sPhrase, $sPhraseType)
{
$this->sPhrase = trim($sPhrase);
$this->sPhraseType = $sPhraseType;
}
/**
* Get the orginal phrase of the string.
*/
public function getPhrase()
{
return $this->sPhrase;
}
/**
* Return the element type of the phrase.
*
* @return string Pharse type if the phrase comes from a structured query
* or empty string otherwise.
*/
public function getPhraseType()
{
return $this->sPhraseType;
}
public function setWordSets($aWordSets)
{
$this->aWordSets = $aWordSets;
}
/**
* Return the array of possible segmentations of the phrase.
*
* @return string[][] Array of segmentations, each consisting of an
* array of terms.
*/
public function getWordSets()
{
return $this->aWordSets;
}
/**
* Invert the set of possible segmentations.
*
* @return void
*/
public function invertWordSets()
{
foreach ($this->aWordSets as $i => $aSet) {
$this->aWordSets[$i] = array_reverse($aSet);
}
}
public function debugInfo()
{
return array(
'Type' => $this->sPhraseType,
'Phrase' => $this->sPhrase,
'WordSets' => $this->aWordSets
);
}
}

585
lib-php/PlaceLookup.php Normal file
View File

@@ -0,0 +1,585 @@
<?php
namespace Nominatim;
require_once(CONST_LibDir.'/AddressDetails.php');
require_once(CONST_LibDir.'/Result.php');
class PlaceLookup
{
protected $oDB;
protected $aLangPrefOrderSql = "''";
protected $bAddressDetails = false;
protected $bExtraTags = false;
protected $bNameDetails = false;
protected $bIncludePolygonAsText = false;
protected $bIncludePolygonAsGeoJSON = false;
protected $bIncludePolygonAsKML = false;
protected $bIncludePolygonAsSVG = false;
protected $fPolygonSimplificationThreshold = 0.0;
protected $sAnchorSql = null;
protected $sAddressRankListSql = null;
protected $sAllowedTypesSQLList = null;
protected $bDeDupe = true;
public function __construct(&$oDB)
{
$this->oDB =& $oDB;
}
public function doDeDupe()
{
return $this->bDeDupe;
}
public function setIncludeAddressDetails($b)
{
$this->bAddressDetails = $b;
}
public function loadParamArray($oParams, $sGeomType = null)
{
$aLangs = $oParams->getPreferredLanguages();
$this->aLangPrefOrderSql =
'ARRAY['.join(',', $this->oDB->getDBQuotedList($aLangs)).']';
$this->bExtraTags = $oParams->getBool('extratags', false);
$this->bNameDetails = $oParams->getBool('namedetails', false);
$this->bDeDupe = $oParams->getBool('dedupe', $this->bDeDupe);
if ($sGeomType === null || $sGeomType == 'geojson') {
$this->bIncludePolygonAsGeoJSON = $oParams->getBool('polygon_geojson');
}
if ($oParams->getString('format', '') !== 'geojson') {
if ($sGeomType === null || $sGeomType == 'text') {
$this->bIncludePolygonAsText = $oParams->getBool('polygon_text');
}
if ($sGeomType === null || $sGeomType == 'kml') {
$this->bIncludePolygonAsKML = $oParams->getBool('polygon_kml');
}
if ($sGeomType === null || $sGeomType == 'svg') {
$this->bIncludePolygonAsSVG = $oParams->getBool('polygon_svg');
}
}
$this->fPolygonSimplificationThreshold
= $oParams->getFloat('polygon_threshold', 0.0);
$iWantedTypes =
($this->bIncludePolygonAsText ? 1 : 0) +
($this->bIncludePolygonAsGeoJSON ? 1 : 0) +
($this->bIncludePolygonAsKML ? 1 : 0) +
($this->bIncludePolygonAsSVG ? 1 : 0);
if ($iWantedTypes > CONST_PolygonOutput_MaximumTypes) {
if (CONST_PolygonOutput_MaximumTypes) {
userError('Select only '.CONST_PolygonOutput_MaximumTypes.' polgyon output option');
} else {
userError('Polygon output is disabled');
}
}
}
public function getMoreUrlParams()
{
$aParams = array();
if ($this->bAddressDetails) {
$aParams['addressdetails'] = '1';
}
if ($this->bExtraTags) {
$aParams['extratags'] = '1';
}
if ($this->bNameDetails) {
$aParams['namedetails'] = '1';
}
if ($this->bIncludePolygonAsText) {
$aParams['polygon_text'] = '1';
}
if ($this->bIncludePolygonAsGeoJSON) {
$aParams['polygon_geojson'] = '1';
}
if ($this->bIncludePolygonAsKML) {
$aParams['polygon_kml'] = '1';
}
if ($this->bIncludePolygonAsSVG) {
$aParams['polygon_svg'] = '1';
}
if ($this->fPolygonSimplificationThreshold > 0.0) {
$aParams['polygon_threshold'] = $this->fPolygonSimplificationThreshold;
}
if (!$this->bDeDupe) {
$aParams['dedupe'] = '0';
}
return $aParams;
}
public function setAnchorSql($sPoint)
{
$this->sAnchorSql = $sPoint;
}
public function setAddressRankList($aList)
{
$this->sAddressRankListSql = '('.join(',', $aList).')';
}
public function setAllowedTypesSQLList($sSql)
{
$this->sAllowedTypesSQLList = $sSql;
}
public function setLanguagePreference($aLangPrefOrder)
{
$this->aLangPrefOrderSql = $this->oDB->getArraySQL(
$this->oDB->getDBQuotedList($aLangPrefOrder)
);
}
private function addressImportanceSql($sGeometry, $sPlaceId)
{
if ($this->sAnchorSql) {
$sSQL = 'ST_Distance('.$this->sAnchorSql.','.$sGeometry.')';
} else {
$sSQL = '(SELECT max(ai_p.importance * (ai_p.rank_address + 2))';
$sSQL .= ' FROM place_addressline ai_s, placex ai_p';
$sSQL .= ' WHERE ai_s.place_id = '.$sPlaceId;
$sSQL .= ' AND ai_p.place_id = ai_s.address_place_id ';
$sSQL .= ' AND ai_s.isaddress ';
$sSQL .= ' AND ai_p.importance is not null)';
}
return $sSQL.' AS addressimportance,';
}
private function langAddressSql($sHousenumber)
{
if ($this->bAddressDetails) {
return ''; // langaddress will be computed from address details
}
return 'get_address_by_language(place_id,'.$sHousenumber.','.$this->aLangPrefOrderSql.') AS langaddress,';
}
public function lookupOSMID($sType, $iID)
{
$sSQL = 'select place_id from placex where osm_type = :type and osm_id = :id';
$iPlaceID = $this->oDB->getOne($sSQL, array(':type' => $sType, ':id' => $iID));
if (!$iPlaceID) {
return null;
}
$aResults = $this->lookup(array($iPlaceID => new Result($iPlaceID)));
return empty($aResults) ? null : reset($aResults);
}
public function lookup($aResults, $iMinRank = 0, $iMaxRank = 30)
{
Debug::newFunction('Place lookup');
if (empty($aResults)) {
return array();
}
$aSubSelects = array();
$sPlaceIDs = Result::joinIdsByTable($aResults, Result::TABLE_PLACEX);
if ($sPlaceIDs) {
Debug::printVar('Ids from placex', $sPlaceIDs);
$sSQL = 'SELECT ';
$sSQL .= ' osm_type,';
$sSQL .= ' osm_id,';
$sSQL .= ' class,';
$sSQL .= ' type,';
$sSQL .= ' admin_level,';
$sSQL .= ' rank_search,';
$sSQL .= ' rank_address,';
$sSQL .= ' min(place_id) AS place_id,';
$sSQL .= ' min(parent_place_id) AS parent_place_id,';
$sSQL .= ' -1 as housenumber,';
$sSQL .= ' country_code,';
$sSQL .= $this->langAddressSql('-1');
$sSQL .= ' get_name_by_language(name,'.$this->aLangPrefOrderSql.') AS placename,';
$sSQL .= " get_name_by_language(name, ARRAY['ref']) AS ref,";
if ($this->bExtraTags) {
$sSQL .= 'hstore_to_json(extratags)::text AS extra,';
}
if ($this->bNameDetails) {
$sSQL .= 'hstore_to_json(name)::text AS names,';
}
$sSQL .= ' avg(ST_X(centroid)) AS lon, ';
$sSQL .= ' avg(ST_Y(centroid)) AS lat, ';
$sSQL .= ' COALESCE(importance,0.75-(rank_search::float/40)) AS importance, ';
$sSQL .= $this->addressImportanceSql(
'ST_Collect(centroid)',
'min(CASE WHEN placex.rank_search < 28 THEN placex.place_id ELSE placex.parent_place_id END)'
);
$sSQL .= " COALESCE(extratags->'place', extratags->'linked_place') AS extra_place ";
$sSQL .= ' FROM placex';
$sSQL .= " WHERE place_id in ($sPlaceIDs) ";
$sSQL .= ' AND (';
$sSQL .= " placex.rank_address between $iMinRank and $iMaxRank ";
if (14 >= $iMinRank && 14 <= $iMaxRank) {
$sSQL .= " OR (extratags->'place') = 'city'";
}
if ($this->sAddressRankListSql) {
$sSQL .= ' OR placex.rank_address in '.$this->sAddressRankListSql;
}
$sSQL .= ' ) ';
if ($this->sAllowedTypesSQLList) {
$sSQL .= 'AND placex.class in '.$this->sAllowedTypesSQLList;
}
$sSQL .= ' AND linked_place_id is null ';
$sSQL .= ' GROUP BY ';
$sSQL .= ' osm_type, ';
$sSQL .= ' osm_id, ';
$sSQL .= ' class, ';
$sSQL .= ' type, ';
$sSQL .= ' admin_level, ';
$sSQL .= ' rank_search, ';
$sSQL .= ' rank_address, ';
$sSQL .= ' housenumber,';
$sSQL .= ' country_code, ';
$sSQL .= ' importance, ';
if (!$this->bDeDupe) {
$sSQL .= 'place_id,';
}
if (!$this->bAddressDetails) {
$sSQL .= 'langaddress, ';
}
$sSQL .= ' placename, ';
$sSQL .= ' ref, ';
if ($this->bExtraTags) {
$sSQL .= 'extratags, ';
}
if ($this->bNameDetails) {
$sSQL .= 'name, ';
}
$sSQL .= ' extra_place ';
$aSubSelects[] = $sSQL;
}
// postcode table
$sPlaceIDs = Result::joinIdsByTable($aResults, Result::TABLE_POSTCODE);
if ($sPlaceIDs) {
Debug::printVar('Ids from location_postcode', $sPlaceIDs);
$sSQL = 'SELECT';
$sSQL .= " 'P' as osm_type,";
$sSQL .= ' (SELECT osm_id from placex p WHERE p.place_id = lp.parent_place_id) as osm_id,';
$sSQL .= " 'place' as class, 'postcode' as type,";
$sSQL .= ' null::smallint as admin_level, rank_search, rank_address,';
$sSQL .= ' place_id, parent_place_id,';
$sSQL .= ' -1 as housenumber,';
$sSQL .= ' country_code,';
$sSQL .= $this->langAddressSql('-1');
$sSQL .= ' postcode as placename,';
$sSQL .= ' postcode as ref,';
if ($this->bExtraTags) {
$sSQL .= 'null::text AS extra,';
}
if ($this->bNameDetails) {
$sSQL .= 'null::text AS names,';
}
$sSQL .= ' ST_x(geometry) AS lon, ST_y(geometry) AS lat,';
$sSQL .= ' (0.75-(rank_search::float/40)) AS importance, ';
$sSQL .= $this->addressImportanceSql('geometry', 'lp.parent_place_id');
$sSQL .= ' null::text AS extra_place ';
$sSQL .= 'FROM location_postcode lp';
$sSQL .= " WHERE place_id in ($sPlaceIDs) ";
$sSQL .= " AND lp.rank_address between $iMinRank and $iMaxRank";
$aSubSelects[] = $sSQL;
}
// All other tables are rank 30 only.
if ($iMaxRank == 30) {
// TIGER table
if (CONST_Use_US_Tiger_Data) {
$sPlaceIDs = Result::joinIdsByTable($aResults, Result::TABLE_TIGER);
if ($sPlaceIDs) {
Debug::printVar('Ids from Tiger table', $sPlaceIDs);
$sHousenumbers = Result::sqlHouseNumberTable($aResults, Result::TABLE_TIGER);
// Tiger search only if a housenumber was searched and if it was found
// (realized through a join)
$sSQL = ' SELECT ';
$sSQL .= " 'T' AS osm_type, ";
$sSQL .= ' (SELECT osm_id from placex p WHERE p.place_id=blub.parent_place_id) as osm_id, ';
$sSQL .= " 'place' AS class, ";
$sSQL .= " 'house' AS type, ";
$sSQL .= ' null::smallint AS admin_level, ';
$sSQL .= ' 30 AS rank_search, ';
$sSQL .= ' 30 AS rank_address, ';
$sSQL .= ' place_id, ';
$sSQL .= ' parent_place_id, ';
$sSQL .= ' housenumber_for_place as housenumber,';
$sSQL .= " 'us' AS country_code, ";
$sSQL .= $this->langAddressSql('housenumber_for_place');
$sSQL .= ' null::text AS placename, ';
$sSQL .= ' null::text AS ref, ';
if ($this->bExtraTags) {
$sSQL .= 'null::text AS extra,';
}
if ($this->bNameDetails) {
$sSQL .= 'null::text AS names,';
}
$sSQL .= ' st_x(centroid) AS lon, ';
$sSQL .= ' st_y(centroid) AS lat,';
$sSQL .= ' -1.15 AS importance, ';
$sSQL .= $this->addressImportanceSql('centroid', 'blub.parent_place_id');
$sSQL .= ' null::text AS extra_place ';
$sSQL .= ' FROM (';
$sSQL .= ' SELECT place_id, '; // interpolate the Tiger housenumbers here
$sSQL .= ' ST_LineInterpolatePoint(linegeo, (housenumber_for_place-startnumber::float)/(endnumber-startnumber)::float) AS centroid, ';
$sSQL .= ' parent_place_id, ';
$sSQL .= ' housenumber_for_place';
$sSQL .= ' FROM (';
$sSQL .= ' location_property_tiger ';
$sSQL .= ' JOIN (values '.$sHousenumbers.') AS housenumbers(place_id, housenumber_for_place) USING(place_id)) ';
$sSQL .= ' WHERE ';
$sSQL .= ' housenumber_for_place >= startnumber';
$sSQL .= ' AND housenumber_for_place <= endnumber';
$sSQL .= ' ) AS blub'; //postgres wants an alias here
$aSubSelects[] = $sSQL;
}
}
// osmline - interpolated housenumbers
$sPlaceIDs = Result::joinIdsByTable($aResults, Result::TABLE_OSMLINE);
if ($sPlaceIDs) {
Debug::printVar('Ids from interpolation', $sPlaceIDs);
$sHousenumbers = Result::sqlHouseNumberTable($aResults, Result::TABLE_OSMLINE);
// interpolation line search only if a housenumber was searched
// (realized through a join)
$sSQL = 'SELECT ';
$sSQL .= " 'W' AS osm_type, ";
$sSQL .= ' osm_id, ';
$sSQL .= " 'place' AS class, ";
$sSQL .= " 'house' AS type, ";
$sSQL .= ' null::smallint AS admin_level, ';
$sSQL .= ' 30 AS rank_search, ';
$sSQL .= ' 30 AS rank_address, ';
$sSQL .= ' place_id, ';
$sSQL .= ' parent_place_id, ';
$sSQL .= ' housenumber_for_place as housenumber,';
$sSQL .= ' country_code, ';
$sSQL .= $this->langAddressSql('housenumber_for_place');
$sSQL .= ' null::text AS placename, ';
$sSQL .= ' null::text AS ref, ';
if ($this->bExtraTags) {
$sSQL .= 'null::text AS extra, ';
}
if ($this->bNameDetails) {
$sSQL .= 'null::text AS names, ';
}
$sSQL .= ' st_x(centroid) AS lon, ';
$sSQL .= ' st_y(centroid) AS lat, ';
// slightly smaller than the importance for normal houses
$sSQL .= ' -0.1 AS importance, ';
$sSQL .= $this->addressImportanceSql('centroid', 'blub.parent_place_id');
$sSQL .= ' null::text AS extra_place ';
$sSQL .= ' FROM (';
$sSQL .= ' SELECT ';
$sSQL .= ' osm_id, ';
$sSQL .= ' place_id, ';
$sSQL .= ' country_code, ';
$sSQL .= ' CASE '; // interpolate the housenumbers here
$sSQL .= ' WHEN startnumber != endnumber ';
$sSQL .= ' THEN ST_LineInterpolatePoint(linegeo, (housenumber_for_place-startnumber::float)/(endnumber-startnumber)::float) ';
$sSQL .= ' ELSE ST_LineInterpolatePoint(linegeo, 0.5) ';
$sSQL .= ' END as centroid, ';
$sSQL .= ' parent_place_id, ';
$sSQL .= ' housenumber_for_place ';
$sSQL .= ' FROM (';
$sSQL .= ' location_property_osmline ';
$sSQL .= ' JOIN (values '.$sHousenumbers.') AS housenumbers(place_id, housenumber_for_place) USING(place_id)';
$sSQL .= ' ) ';
$sSQL .= ' WHERE housenumber_for_place >= 0 ';
$sSQL .= ' ) as blub'; //postgres wants an alias here
$aSubSelects[] = $sSQL;
}
}
if (empty($aSubSelects)) {
return array();
}
$sSQL = join(' UNION ', $aSubSelects);
Debug::printSQL($sSQL);
$aPlaces = $this->oDB->getAll($sSQL, null, 'Could not lookup place');
foreach ($aPlaces as &$aPlace) {
$aPlace['importance'] = (float) $aPlace['importance'];
if ($this->bAddressDetails) {
// to get addressdetails for tiger data, the housenumber is needed
$aPlace['address'] = new AddressDetails(
$this->oDB,
$aPlace['place_id'],
$aPlace['housenumber'],
$this->aLangPrefOrderSql
);
$aPlace['langaddress'] = $aPlace['address']->getLocaleAddress();
}
if ($this->bExtraTags) {
if ($aPlace['extra']) {
$aPlace['sExtraTags'] = json_decode($aPlace['extra']);
} else {
$aPlace['sExtraTags'] = (object) array();
}
}
if ($this->bNameDetails) {
if ($aPlace['names']) {
$aPlace['sNameDetails'] = json_decode($aPlace['names']);
} else {
$aPlace['sNameDetails'] = (object) array();
}
}
$aPlace['addresstype'] = ClassTypes\getLabelTag(
$aPlace,
$aPlace['country_code']
);
$aResults[$aPlace['place_id']] = $aPlace;
}
$aResults = array_filter(
$aResults,
function ($v) {
return !($v instanceof Result);
}
);
Debug::printVar('Places', $aResults);
return $aResults;
}
/* returns an array which will contain the keys
* aBoundingBox
* and may also contain one or more of the keys
* asgeojson
* askml
* assvg
* astext
* lat
* lon
*/
public function getOutlines($iPlaceID, $fLon = null, $fLat = null, $fRadius = null, $fLonReverse = null, $fLatReverse = null)
{
$aOutlineResult = array();
if (!$iPlaceID) {
return $aOutlineResult;
}
// Get the bounding box and outline polygon
$sSQL = 'select place_id,0 as numfeatures,st_area(geometry) as area,';
if ($fLonReverse != null && $fLatReverse != null) {
$sSQL .= ' ST_Y(closest_point) as centrelat,';
$sSQL .= ' ST_X(closest_point) as centrelon,';
} else {
$sSQL .= ' ST_Y(centroid) as centrelat, ST_X(centroid) as centrelon,';
}
$sSQL .= ' ST_YMin(geometry) as minlat,ST_YMax(geometry) as maxlat,';
$sSQL .= ' ST_XMin(geometry) as minlon,ST_XMax(geometry) as maxlon';
if ($this->bIncludePolygonAsGeoJSON) {
$sSQL .= ',ST_AsGeoJSON(geometry) as asgeojson';
}
if ($this->bIncludePolygonAsKML) {
$sSQL .= ',ST_AsKML(geometry) as askml';
}
if ($this->bIncludePolygonAsSVG) {
$sSQL .= ',ST_AsSVG(geometry) as assvg';
}
if ($this->bIncludePolygonAsText) {
$sSQL .= ',ST_AsText(geometry) as astext';
}
if ($fLonReverse != null && $fLatReverse != null) {
$sFrom = ' from (SELECT * , CASE WHEN (class = \'highway\') AND (ST_GeometryType(geometry) = \'ST_LineString\') THEN ';
$sFrom .=' ST_ClosestPoint(geometry, ST_SetSRID(ST_Point('.$fLatReverse.','.$fLonReverse.'),4326))';
$sFrom .=' ELSE centroid END AS closest_point';
$sFrom .= ' from placex where place_id = '.$iPlaceID.') as plx';
} else {
$sFrom = ' from placex where place_id = '.$iPlaceID;
}
if ($this->fPolygonSimplificationThreshold > 0) {
$sSQL .= ' from (select place_id,centroid,ST_SimplifyPreserveTopology(geometry,'.$this->fPolygonSimplificationThreshold.') as geometry'.$sFrom.') as plx';
} else {
$sSQL .= $sFrom;
}
$aPointPolygon = $this->oDB->getRow($sSQL, null, 'Could not get outline');
if ($aPointPolygon && $aPointPolygon['place_id']) {
if ($aPointPolygon['centrelon'] !== null && $aPointPolygon['centrelat'] !== null) {
$aOutlineResult['lat'] = $aPointPolygon['centrelat'];
$aOutlineResult['lon'] = $aPointPolygon['centrelon'];
}
if ($this->bIncludePolygonAsGeoJSON) {
$aOutlineResult['asgeojson'] = $aPointPolygon['asgeojson'];
}
if ($this->bIncludePolygonAsKML) {
$aOutlineResult['askml'] = $aPointPolygon['askml'];
}
if ($this->bIncludePolygonAsSVG) {
$aOutlineResult['assvg'] = $aPointPolygon['assvg'];
}
if ($this->bIncludePolygonAsText) {
$aOutlineResult['astext'] = $aPointPolygon['astext'];
}
if (abs($aPointPolygon['minlat'] - $aPointPolygon['maxlat']) < 0.0000001) {
$aPointPolygon['minlat'] = $aPointPolygon['minlat'] - $fRadius;
$aPointPolygon['maxlat'] = $aPointPolygon['maxlat'] + $fRadius;
}
if (abs($aPointPolygon['minlon'] - $aPointPolygon['maxlon']) < 0.0000001) {
$aPointPolygon['minlon'] = $aPointPolygon['minlon'] - $fRadius;
$aPointPolygon['maxlon'] = $aPointPolygon['maxlon'] + $fRadius;
}
$aOutlineResult['aBoundingBox'] = array(
(string)$aPointPolygon['minlat'],
(string)$aPointPolygon['maxlat'],
(string)$aPointPolygon['minlon'],
(string)$aPointPolygon['maxlon']
);
}
// as a fallback we generate a bounding box without knowing the size of the geometry
if ((!isset($aOutlineResult['aBoundingBox'])) && isset($fLon)) {
$aBounds = array(
'minlat' => $fLat - $fRadius,
'maxlat' => $fLat + $fRadius,
'minlon' => $fLon - $fRadius,
'maxlon' => $fLon + $fRadius
);
$aOutlineResult['aBoundingBox'] = array(
(string)$aBounds['minlat'],
(string)$aBounds['maxlat'],
(string)$aBounds['minlon'],
(string)$aBounds['maxlon']
);
}
return $aOutlineResult;
}
}

121
lib-php/Result.php Normal file
View File

@@ -0,0 +1,121 @@
<?php
namespace Nominatim;
/**
* A single result of a search operation or a reverse lookup.
*
* This object only contains the id of the result. It does not yet
* have any details needed to format the output document.
*/
class Result
{
const TABLE_PLACEX = 0;
const TABLE_POSTCODE = 1;
const TABLE_OSMLINE = 2;
const TABLE_TIGER = 3;
/// Database table that contains the result.
public $iTable;
/// Id of the result.
public $iId;
/// House number (only for interpolation results).
public $iHouseNumber = -1;
/// Number of exact matches in address (address searches only).
public $iExactMatches = 0;
/// Subranking within the results (the higher the worse).
public $iResultRank = 0;
/// Address rank of the result.
public $iAddressRank;
public function debugInfo()
{
return array(
'Table' => $this->iTable,
'ID' => $this->iId,
'House number' => $this->iHouseNumber,
'Exact Matches' => $this->iExactMatches,
'Result rank' => $this->iResultRank
);
}
public function __construct($sId, $iTable = Result::TABLE_PLACEX)
{
$this->iTable = $iTable;
$this->iId = (int) $sId;
}
public static function joinIdsByTable($aResults, $iTable)
{
return join(',', array_keys(array_filter(
$aResults,
function ($aValue) use ($iTable) {
return $aValue->iTable == $iTable;
}
)));
}
public static function joinIdsByTableMinRank($aResults, $iTable, $iMinAddressRank)
{
return join(',', array_keys(array_filter(
$aResults,
function ($aValue) use ($iTable, $iMinAddressRank) {
return $aValue->iTable == $iTable && $aValue->iAddressRank >= $iMinAddressRank;
}
)));
}
public static function joinIdsByTableMaxRank($aResults, $iTable, $iMaxAddressRank)
{
return join(',', array_keys(array_filter(
$aResults,
function ($aValue) use ($iTable, $iMaxAddressRank) {
return $aValue->iTable == $iTable && $aValue->iAddressRank <= $iMaxAddressRank;
}
)));
}
public static function sqlHouseNumberTable($aResults, $iTable)
{
$sHousenumbers = '';
$sSep = '';
foreach ($aResults as $oResult) {
if ($oResult->iTable == $iTable) {
$sHousenumbers .= $sSep.'('.$oResult->iId.',';
$sHousenumbers .= $oResult->iHouseNumber.')';
$sSep = ',';
}
}
return $sHousenumbers;
}
/**
* Split a result array into highest ranked result and the rest
*
* @param object[] $aResults List of results to split.
*
* @return array[]
*/
public static function splitResults($aResults)
{
$aHead = array();
$aTail = array();
$iMinRank = 10000;
foreach ($aResults as $oRes) {
if ($oRes->iResultRank < $iMinRank) {
$aTail += $aHead;
$aHead = array($oRes->iId => $oRes);
$iMinRank = $oRes->iResultRank;
} elseif ($oRes->iResultRank == $iMinRank) {
$aHead[$oRes->iId] = $oRes;
} else {
$aTail[$oRes->iId] = $oRes;
}
}
return array('head' => $aHead, 'tail' => $aTail);
}
}

375
lib-php/ReverseGeocode.php Normal file
View File

@@ -0,0 +1,375 @@
<?php
namespace Nominatim;
require_once(CONST_LibDir.'/Result.php');
class ReverseGeocode
{
protected $oDB;
protected $iMaxRank = 28;
public function __construct(&$oDB)
{
$this->oDB =& $oDB;
}
public function setZoom($iZoom)
{
// Zoom to rank, this could probably be calculated but a lookup gives fine control
$aZoomRank = array(
0 => 2, // Continent / Sea
1 => 2,
2 => 2,
3 => 4, // Country
4 => 4,
5 => 8, // State
6 => 10, // Region
7 => 10,
8 => 12, // County
9 => 12,
10 => 17, // City
11 => 17,
12 => 18, // Town / Village
13 => 18,
14 => 22, // Suburb
15 => 22,
16 => 26, // major street
17 => 27, // minor street
18 => 30, // or >, Building
19 => 30, // or >, Building
);
$this->iMaxRank = (isset($iZoom) && isset($aZoomRank[$iZoom]))?$aZoomRank[$iZoom]:28;
}
/**
* Find the closest interpolation with the given search diameter.
*
* @param string $sPointSQL Reverse geocoding point as SQL
* @param float $fSearchDiam Search diameter
*
* @return Record of the interpolation or null.
*/
protected function lookupInterpolation($sPointSQL, $fSearchDiam)
{
Debug::newFunction('lookupInterpolation');
$sSQL = 'SELECT place_id, parent_place_id, 30 as rank_search,';
$sSQL .= ' ST_LineLocatePoint(linegeo,'.$sPointSQL.') as fraction,';
$sSQL .= ' startnumber, endnumber, interpolationtype,';
$sSQL .= ' ST_Distance(linegeo,'.$sPointSQL.') as distance';
$sSQL .= ' FROM location_property_osmline';
$sSQL .= ' WHERE ST_DWithin('.$sPointSQL.', linegeo, '.$fSearchDiam.')';
$sSQL .= ' and indexed_status = 0 and startnumber is not NULL ';
$sSQL .= ' ORDER BY distance ASC limit 1';
Debug::printSQL($sSQL);
return $this->oDB->getRow(
$sSQL,
null,
'Could not determine closest housenumber on an osm interpolation line.'
);
}
protected function lookupLargeArea($sPointSQL, $iMaxRank)
{
if ($iMaxRank > 4) {
$aPlace = $this->lookupPolygon($sPointSQL, $iMaxRank);
if ($aPlace) {
return new Result($aPlace['place_id']);
}
}
// If no polygon which contains the searchpoint is found,
// searches in the country_osm_grid table for a polygon.
return $this->lookupInCountry($sPointSQL, $iMaxRank);
}
protected function lookupInCountry($sPointSQL, $iMaxRank)
{
Debug::newFunction('lookupInCountry');
// searches for polygon in table country_osm_grid which contains the searchpoint
// and searches for the nearest place node to the searchpoint in this polygon
$sSQL = 'SELECT country_code FROM country_osm_grid';
$sSQL .= ' WHERE ST_CONTAINS(geometry, '.$sPointSQL.') LIMIT 1';
Debug::printSQL($sSQL);
$sCountryCode = $this->oDB->getOne(
$sSQL,
null,
'Could not determine country polygon containing the point.'
);
Debug::printVar('Country code', $sCountryCode);
if ($sCountryCode) {
if ($iMaxRank > 4) {
// look for place nodes with the given country code
$sSQL = 'SELECT place_id FROM';
$sSQL .= ' (SELECT place_id, rank_search,';
$sSQL .= ' ST_distance('.$sPointSQL.', geometry) as distance';
$sSQL .= ' FROM placex';
$sSQL .= ' WHERE osm_type = \'N\'';
$sSQL .= ' AND country_code = \''.$sCountryCode.'\'';
$sSQL .= ' AND rank_search < 26 '; // needed to select right index
$sSQL .= ' AND rank_search between 5 and ' .min(25, $iMaxRank);
$sSQL .= ' AND class = \'place\' AND type != \'postcode\'';
$sSQL .= ' AND name IS NOT NULL ';
$sSQL .= ' and indexed_status = 0 and linked_place_id is null';
$sSQL .= ' AND ST_DWithin('.$sPointSQL.', geometry, 1.8)) p ';
$sSQL .= 'WHERE distance <= reverse_place_diameter(rank_search)';
$sSQL .= ' ORDER BY rank_search DESC, distance ASC';
$sSQL .= ' LIMIT 1';
Debug::printSQL($sSQL);
$aPlace = $this->oDB->getRow($sSQL, null, 'Could not determine place node.');
Debug::printVar('Country node', $aPlace);
if ($aPlace) {
return new Result($aPlace['place_id']);
}
}
// still nothing, then return the country object
$sSQL = 'SELECT place_id, ST_distance('.$sPointSQL.', centroid) as distance';
$sSQL .= ' FROM placex';
$sSQL .= ' WHERE country_code = \''.$sCountryCode.'\'';
$sSQL .= ' AND rank_search = 4 AND rank_address = 4';
$sSQL .= ' AND class in (\'boundary\', \'place\')';
$sSQL .= ' AND linked_place_id is null';
$sSQL .= ' ORDER BY distance ASC';
Debug::printSQL($sSQL);
$aPlace = $this->oDB->getRow($sSQL, null, 'Could not determine place node.');
Debug::printVar('Country place', $aPlace);
if ($aPlace) {
return new Result($aPlace['place_id']);
}
}
return null;
}
/**
* Search for areas or nodes for areas or nodes between state and suburb level.
*
* @param string $sPointSQL Search point as SQL string.
* @param int $iMaxRank Maximum address rank of the feature.
*
* @return Record of the found feature or null.
*
* Searches first for polygon that contains the search point.
* If such a polygon is found, place nodes with a higher rank are
* searched inside the polygon.
*/
protected function lookupPolygon($sPointSQL, $iMaxRank)
{
Debug::newFunction('lookupPolygon');
// polygon search begins at suburb-level
if ($iMaxRank > 25) {
$iMaxRank = 25;
}
// no polygon search over country-level
if ($iMaxRank < 5) {
$iMaxRank = 5;
}
// search for polygon
$sSQL = 'SELECT place_id, parent_place_id, rank_address, rank_search FROM';
$sSQL .= '(select place_id, parent_place_id, rank_address, rank_search, country_code, geometry';
$sSQL .= ' FROM placex';
$sSQL .= ' WHERE ST_GeometryType(geometry) in (\'ST_Polygon\', \'ST_MultiPolygon\')';
$sSQL .= ' AND rank_address Between 5 AND ' .$iMaxRank;
$sSQL .= ' AND geometry && '.$sPointSQL;
$sSQL .= ' AND type != \'postcode\' ';
$sSQL .= ' AND name is not null';
$sSQL .= ' AND indexed_status = 0 and linked_place_id is null';
$sSQL .= ' ORDER BY rank_address DESC LIMIT 50 ) as a';
$sSQL .= ' WHERE ST_CONTAINS(geometry, '.$sPointSQL.' )';
$sSQL .= ' ORDER BY rank_address DESC LIMIT 1';
Debug::printSQL($sSQL);
$aPoly = $this->oDB->getRow($sSQL, null, 'Could not determine polygon containing the point.');
Debug::printVar('Polygon result', $aPoly);
if ($aPoly) {
// if a polygon is found, search for placenodes begins ...
$iRankAddress = $aPoly['rank_address'];
$iRankSearch = $aPoly['rank_search'];
$iPlaceID = $aPoly['place_id'];
if ($iRankAddress != $iMaxRank) {
$sSQL = 'SELECT place_id FROM ';
$sSQL .= '(SELECT place_id, rank_search, country_code, geometry,';
$sSQL .= ' ST_distance('.$sPointSQL.', geometry) as distance';
$sSQL .= ' FROM placex';
$sSQL .= ' WHERE osm_type = \'N\'';
// using rank_search because of a better differentiation
// for place nodes at rank_address 16
$sSQL .= ' AND rank_search > '.$iRankSearch;
$sSQL .= ' AND rank_search <= '.$iMaxRank;
$sSQL .= ' AND rank_search < 26 '; // needed to select right index
$sSQL .= ' AND rank_address > 0';
$sSQL .= ' AND class = \'place\'';
$sSQL .= ' AND type != \'postcode\'';
$sSQL .= ' AND name IS NOT NULL ';
$sSQL .= ' AND indexed_status = 0 AND linked_place_id is null';
$sSQL .= ' AND ST_DWithin('.$sPointSQL.', geometry, reverse_place_diameter('.$iRankSearch.'::smallint))';
$sSQL .= ' ORDER BY distance ASC,';
$sSQL .= ' rank_address DESC';
$sSQL .= ' limit 500) as a';
$sSQL .= ' WHERE ST_CONTAINS((SELECT geometry FROM placex WHERE place_id = '.$iPlaceID.'), geometry )';
$sSQL .= ' AND distance <= reverse_place_diameter(rank_search)';
$sSQL .= ' ORDER BY distance ASC, rank_search DESC';
$sSQL .= ' LIMIT 1';
Debug::printSQL($sSQL);
$aPlaceNode = $this->oDB->getRow($sSQL, null, 'Could not determine place node.');
Debug::printVar('Nearest place node', $aPlaceNode);
if ($aPlaceNode) {
return $aPlaceNode;
}
}
}
return $aPoly;
}
public function lookup($fLat, $fLon, $bDoInterpolation = true)
{
return $this->lookupPoint(
'ST_SetSRID(ST_Point('.$fLon.','.$fLat.'),4326)',
$bDoInterpolation
);
}
public function lookupPoint($sPointSQL, $bDoInterpolation = true)
{
Debug::newFunction('lookupPoint');
// Find the nearest point
$fSearchDiam = 0.006;
$oResult = null;
$aPlace = null;
// for POI or street level
if ($this->iMaxRank >= 26) {
// starts if the search is on POI or street level,
// searches for the nearest POI or street,
// if a street is found and a POI is searched for,
// the nearest POI which the found street is a parent of is choosen.
$sSQL = 'select place_id,parent_place_id,rank_address,country_code,';
$sSQL .= ' ST_distance('.$sPointSQL.', geometry) as distance';
$sSQL .= ' FROM ';
$sSQL .= ' placex';
$sSQL .= ' WHERE ST_DWithin('.$sPointSQL.', geometry, '.$fSearchDiam.')';
$sSQL .= ' AND';
$sSQL .= ' rank_address between 26 and '.$this->iMaxRank;
$sSQL .= ' and (name is not null or housenumber is not null';
$sSQL .= ' or rank_address between 26 and 27)';
$sSQL .= ' and (rank_address between 26 and 27';
$sSQL .= ' or ST_GeometryType(geometry) != \'ST_LineString\')';
$sSQL .= ' and class not in (\'boundary\')';
$sSQL .= ' and indexed_status = 0 and linked_place_id is null';
$sSQL .= ' and (ST_GeometryType(geometry) not in (\'ST_Polygon\',\'ST_MultiPolygon\') ';
$sSQL .= ' OR ST_DWithin('.$sPointSQL.', centroid, '.$fSearchDiam.'))';
$sSQL .= ' ORDER BY distance ASC limit 1';
Debug::printSQL($sSQL);
$aPlace = $this->oDB->getRow($sSQL, null, 'Could not determine closest place.');
Debug::printVar('POI/street level result', $aPlace);
if ($aPlace) {
$iPlaceID = $aPlace['place_id'];
$oResult = new Result($iPlaceID);
$iRankAddress = $aPlace['rank_address'];
}
if ($aPlace) {
// if street and maxrank > streetlevel
if ($iRankAddress <= 27 && $this->iMaxRank > 27) {
// find the closest object (up to a certain radius) of which the street is a parent of
$sSQL = ' select place_id,';
$sSQL .= ' ST_distance('.$sPointSQL.', geometry) as distance';
$sSQL .= ' FROM ';
$sSQL .= ' placex';
// radius ?
$sSQL .= ' WHERE ST_DWithin('.$sPointSQL.', geometry, 0.001)';
$sSQL .= ' AND parent_place_id = '.$iPlaceID;
$sSQL .= ' and rank_address > 28';
$sSQL .= ' and ST_GeometryType(geometry) != \'ST_LineString\'';
$sSQL .= ' and (name is not null or housenumber is not null)';
$sSQL .= ' and class not in (\'boundary\')';
$sSQL .= ' and indexed_status = 0 and linked_place_id is null';
$sSQL .= ' ORDER BY distance ASC limit 1';
Debug::printSQL($sSQL);
$aStreet = $this->oDB->getRow($sSQL, null, 'Could not determine closest place.');
Debug::printVar('Closest POI result', $aStreet);
if ($aStreet) {
$aPlace = $aStreet;
$oResult = new Result($aStreet['place_id']);
$iRankAddress = 30;
}
}
// In the US we can check TIGER data for nearest housenumber
if (CONST_Use_US_Tiger_Data
&& $iRankAddress <= 27
&& $aPlace['country_code'] == 'us'
&& $this->iMaxRank >= 28
) {
$sSQL = 'SELECT place_id,parent_place_id,30 as rank_search,';
$sSQL .= 'ST_LineLocatePoint(linegeo,'.$sPointSQL.') as fraction,';
$sSQL .= 'ST_distance('.$sPointSQL.', linegeo) as distance,';
$sSQL .= 'startnumber,endnumber,interpolationtype';
$sSQL .= ' FROM location_property_tiger WHERE parent_place_id = '.$oResult->iId;
$sSQL .= ' AND ST_DWithin('.$sPointSQL.', linegeo, 0.001)';
$sSQL .= ' ORDER BY distance ASC limit 1';
Debug::printSQL($sSQL);
$aPlaceTiger = $this->oDB->getRow($sSQL, null, 'Could not determine closest Tiger place.');
Debug::printVar('Tiger house number result', $aPlaceTiger);
if ($aPlaceTiger) {
$aPlace = $aPlaceTiger;
$oResult = new Result($aPlaceTiger['place_id'], Result::TABLE_TIGER);
$oResult->iHouseNumber = closestHouseNumber($aPlaceTiger);
$iRankAddress = 30;
}
}
}
if ($bDoInterpolation && $this->iMaxRank >= 30) {
$fDistance = $fSearchDiam;
if ($aPlace) {
// We can't reliably go from the closest street to an
// interpolation line because the closest interpolation
// may have a different street segments as a parent.
// Therefore allow an interpolation line to take precendence
// even when the street is closer.
$fDistance = $iRankAddress < 28 ? 0.001 : $aPlace['distance'];
}
$aHouse = $this->lookupInterpolation($sPointSQL, $fDistance);
Debug::printVar('Interpolation result', $aPlace);
if ($aHouse) {
$oResult = new Result($aHouse['place_id'], Result::TABLE_OSMLINE);
$oResult->iHouseNumber = closestHouseNumber($aHouse);
$aPlace = $aHouse;
}
}
if (!$aPlace) {
// if no POI or street is found ...
$oResult = $this->lookupLargeArea($sPointSQL, 25);
}
} else {
// lower than street level ($iMaxRank < 26 )
$oResult = $this->lookupLargeArea($sPointSQL, $this->iMaxRank);
}
Debug::printVar('Final result', $oResult);
return $oResult;
}
}

311
lib-php/SearchContext.php Normal file
View File

@@ -0,0 +1,311 @@
<?php
namespace Nominatim;
require_once(CONST_LibDir.'/lib.php');
/**
* Collection of search constraints that are independent of the
* actual interpretation of the search query.
*
* The search context is shared between all SearchDescriptions. This
* object mainly serves as context provider for the database queries.
* Therefore most data is directly cached as SQL statements.
*/
class SearchContext
{
/// Search radius around a given Near reference point.
private $fNearRadius = false;
/// True if search must be restricted to viewbox only.
public $bViewboxBounded = false;
/// Reference point for search (as SQL).
public $sqlNear = '';
/// Viewbox selected for search (as SQL).
public $sqlViewboxSmall = '';
/// Viewbox with a larger buffer around (as SQL).
public $sqlViewboxLarge = '';
/// Reference along a route (as SQL).
public $sqlViewboxCentre = '';
/// List of countries to restrict search to (as array).
public $aCountryList = null;
/// List of countries to restrict search to (as SQL).
public $sqlCountryList = '';
/// List of place IDs to exclude (as SQL).
private $sqlExcludeList = '';
/// Subset of word ids of full words in the query.
private $aFullNameWords = array();
public function setFullNameWords($aWordList)
{
$this->aFullNameWords = $aWordList;
}
public function getFullNameTerms()
{
return $this->aFullNameWords;
}
/**
* Check if a reference point is defined.
*
* @return bool True if a reference point is defined.
*/
public function hasNearPoint()
{
return $this->fNearRadius !== false;
}
/**
* Get radius around reference point.
*
* @return float Search radius around reference point.
*/
public function nearRadius()
{
return $this->fNearRadius;
}
/**
* Set search reference point in WGS84.
*
* If set, then only places around this point will be taken into account.
*
* @param float $fLat Latitude of point.
* @param float $fLon Longitude of point.
* @param float $fRadius Search radius around point.
*
* @return void
*/
public function setNearPoint($fLat, $fLon, $fRadius = 0.1)
{
$this->fNearRadius = $fRadius;
$this->sqlNear = 'ST_SetSRID(ST_Point('.$fLon.','.$fLat.'),4326)';
}
/**
* Check if the search is geographically restricted.
*
* Searches are restricted if a reference point is given or if
* a bounded viewbox is set.
*
* @return bool True, if the search is geographically bounded.
*/
public function isBoundedSearch()
{
return $this->hasNearPoint() || ($this->sqlViewboxSmall && $this->bViewboxBounded);
}
/**
* Set rectangular viewbox.
*
* The viewbox may be bounded which means that no search results
* must be outside the viewbox.
*
* @param float[4] $aViewBox Coordinates of the viewbox.
* @param bool $bBounded True if the viewbox is bounded.
*
* @return void
*/
public function setViewboxFromBox(&$aViewBox, $bBounded)
{
$this->bViewboxBounded = $bBounded;
$this->sqlViewboxCentre = '';
$this->sqlViewboxSmall = sprintf(
'ST_SetSRID(ST_MakeBox2D(ST_Point(%F,%F),ST_Point(%F,%F)),4326)',
$aViewBox[0],
$aViewBox[1],
$aViewBox[2],
$aViewBox[3]
);
$fHeight = abs($aViewBox[0] - $aViewBox[2]);
$fWidth = abs($aViewBox[1] - $aViewBox[3]);
$this->sqlViewboxLarge = sprintf(
'ST_SetSRID(ST_MakeBox2D(ST_Point(%F,%F),ST_Point(%F,%F)),4326)',
max($aViewBox[0], $aViewBox[2]) + $fHeight,
max($aViewBox[1], $aViewBox[3]) + $fWidth,
min($aViewBox[0], $aViewBox[2]) - $fHeight,
min($aViewBox[1], $aViewBox[3]) - $fWidth
);
}
/**
* Set viewbox along a route.
*
* The viewbox may be bounded which means that no search results
* must be outside the viewbox.
*
* @param object $oDB Nominatim::DB instance to use for computing the box.
* @param string[] $aRoutePoints List of x,y coordinates along a route.
* @param float $fRouteWidth Buffer around the route to use.
* @param bool $bBounded True if the viewbox bounded.
*
* @return void
*/
public function setViewboxFromRoute(&$oDB, $aRoutePoints, $fRouteWidth, $bBounded)
{
$this->bViewboxBounded = $bBounded;
$this->sqlViewboxCentre = "ST_SetSRID('LINESTRING(";
$sSep = '';
foreach ($aRoutePoints as $aPoint) {
$fPoint = (float)$aPoint;
$this->sqlViewboxCentre .= $sSep.$fPoint;
$sSep = ($sSep == ' ') ? ',' : ' ';
}
$this->sqlViewboxCentre .= ")'::geometry,4326)";
$sSQL = 'ST_BUFFER('.$this->sqlViewboxCentre.','.($fRouteWidth/69).')';
$sGeom = $oDB->getOne('select '.$sSQL, null, 'Could not get small viewbox');
$this->sqlViewboxSmall = "'".$sGeom."'::geometry";
$sSQL = 'ST_BUFFER('.$this->sqlViewboxCentre.','.($fRouteWidth/30).')';
$sGeom = $oDB->getOne('select '.$sSQL, null, 'Could not get large viewbox');
$this->sqlViewboxLarge = "'".$sGeom."'::geometry";
}
/**
* Set list of excluded place IDs.
*
* @param integer[] $aExcluded List of IDs.
*
* @return void
*/
public function setExcludeList($aExcluded)
{
$this->sqlExcludeList = ' not in ('.join(',', $aExcluded).')';
}
/**
* Set list of countries to restrict search to.
*
* @param string[] $aCountries List of two-letter lower-case country codes.
*
* @return void
*/
public function setCountryList($aCountries)
{
$this->sqlCountryList = '('.join(',', array_map('addQuotes', $aCountries)).')';
$this->aCountryList = $aCountries;
}
/**
* Extract a reference point from a query string.
*
* @param string $sQuery Query to scan.
*
* @return string The remaining query string.
*/
public function setNearPointFromQuery($sQuery)
{
$aResult = parseLatLon($sQuery);
if ($aResult !== false
&& $aResult[1] <= 90.1
&& $aResult[1] >= -90.1
&& $aResult[2] <= 180.1
&& $aResult[2] >= -180.1
) {
$this->setNearPoint($aResult[1], $aResult[2]);
$sQuery = trim(str_replace($aResult[0], ' ', $sQuery));
}
return $sQuery;
}
/**
* Get an SQL snippet for computing the distance from the reference point.
*
* @param string $sObj SQL variable name to compute the distance from.
*
* @return string An SQL string.
*/
public function distanceSQL($sObj)
{
return 'ST_Distance('.$this->sqlNear.", $sObj)";
}
/**
* Get an SQL snippet for checking if something is within range of the
* reference point.
*
* @param string $sObj SQL variable name to compute if it is within range.
*
* @return string An SQL string.
*/
public function withinSQL($sObj)
{
return sprintf('ST_DWithin(%s, %s, %F)', $sObj, $this->sqlNear, $this->fNearRadius);
}
/**
* Get an SQL snippet of the importance factor of the viewbox.
*
* The importance factor is computed by checking if an object is within
* the viewbox and/or the extended version of the viewbox.
*
* @param string $sObj SQL variable name of object to weight the importance
*
* @return string SQL snippet of the factor with a leading multiply sign.
*/
public function viewboxImportanceSQL($sObj)
{
$sSQL = '';
if ($this->sqlViewboxSmall) {
$sSQL = " * CASE WHEN ST_Contains($this->sqlViewboxSmall, $sObj) THEN 1 ELSE 0.5 END";
}
if ($this->sqlViewboxLarge) {
$sSQL = " * CASE WHEN ST_Contains($this->sqlViewboxLarge, $sObj) THEN 1 ELSE 0.5 END";
}
return $sSQL;
}
/**
* SQL snippet checking if a place ID should be excluded.
*
* @param string $sVariable SQL variable name of place ID to check,
* potentially prefixed with more SQL.
*
* @return string SQL snippet.
*/
public function excludeSQL($sVariable)
{
if ($this->sqlExcludeList) {
return $sVariable.$this->sqlExcludeList;
}
return '';
}
/**
* Check if the given country is covered by the search context.
*
* @param string $sCountryCode Country code of the country to check.
*
* @return True, if no country code restrictions are set or the
* country is included in the country list.
*/
public function isCountryApplicable($sCountryCode)
{
return $this->aCountryList === null || in_array($sCountryCode, $this->aCountryList);
}
public function debugInfo()
{
return array(
'Near radius' => $this->fNearRadius,
'Near point (SQL)' => $this->sqlNear,
'Bounded viewbox' => $this->bViewboxBounded,
'Viewbox (SQL, small)' => $this->sqlViewboxSmall,
'Viewbox (SQL, large)' => $this->sqlViewboxLarge,
'Viewbox (SQL, centre)' => $this->sqlViewboxCentre,
'Countries (SQL)' => $this->sqlCountryList,
'Excluded IDs (SQL)' => $this->sqlExcludeList
);
}
}

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,87 @@
<?php
namespace Nominatim;
/**
* Description of the position of a token within a query.
*/
class SearchPosition
{
private $sPhraseType;
private $iPhrase;
private $iNumPhrases;
private $iToken;
private $iNumTokens;
public function __construct($sPhraseType, $iPhrase, $iNumPhrases)
{
$this->sPhraseType = $sPhraseType;
$this->iPhrase = $iPhrase;
$this->iNumPhrases = $iNumPhrases;
}
public function setTokenPosition($iToken, $iNumTokens)
{
$this->iToken = $iToken;
$this->iNumTokens = $iNumTokens;
}
/**
* Check if the phrase can be of the given type.
*
* @param string $sType Type of phrse requested.
*
* @return True if the phrase is untyped or of the given type.
*/
public function maybePhrase($sType)
{
return $this->sPhraseType == '' || $this->sPhraseType == $sType;
}
/**
* Check if the phrase is exactly of the given type.
*
* @param string $sType Type of phrse requested.
*
* @return True if the phrase of the given type.
*/
public function isPhrase($sType)
{
return $this->sPhraseType == $sType;
}
/**
* Return true if the token is the very first in the query.
*/
public function isFirstToken()
{
return $this->iPhrase == 0 && $this->iToken == 0;
}
/**
* Check if the token is the final one in the query.
*/
public function isLastToken()
{
return $this->iToken + 1 == $this->iNumTokens && $this->iPhrase + 1 == $this->iNumPhrases;
}
/**
* Check if the current token is part of the first phrase in the query.
*/
public function isFirstPhrase()
{
return $this->iPhrase == 0;
}
/**
* Get the phrase position in the query.
*/
public function getPhrase()
{
return $this->iPhrase;
}
}

84
lib-php/Shell.php Normal file
View File

@@ -0,0 +1,84 @@
<?php
namespace Nominatim;
class Shell
{
public function __construct($sBaseCmd, ...$aParams)
{
if (!$sBaseCmd) {
throw new \Exception('Command missing in new() call');
}
$this->baseCmd = $sBaseCmd;
$this->aParams = array();
$this->aEnv = null; // null = use the same environment as the current PHP process
$this->stdoutString = null;
foreach ($aParams as $sParam) {
$this->addParams($sParam);
}
}
public function addParams(...$aParams)
{
foreach ($aParams as $sParam) {
if (isset($sParam) && $sParam !== null && $sParam !== '') {
array_push($this->aParams, $sParam);
}
}
return $this;
}
public function addEnvPair($sKey, $sVal)
{
if (isset($sKey) && $sKey && isset($sVal)) {
if (!isset($this->aEnv)) {
$this->aEnv = $_ENV;
}
$this->aEnv = array_merge($this->aEnv, array($sKey => $sVal), $_ENV);
}
return $this;
}
public function escapedCmd()
{
$aEscaped = array_map(function ($sParam) {
return $this->escapeParam($sParam);
}, array_merge(array($this->baseCmd), $this->aParams));
return join(' ', $aEscaped);
}
public function run($bExitOnFail = false)
{
$sCmd = $this->escapedCmd();
// $aEnv does not need escaping, proc_open seems to handle it fine
$aFDs = array(
0 => array('pipe', 'r'),
1 => STDOUT,
2 => STDERR
);
$aPipes = null;
$hProc = @proc_open($sCmd, $aFDs, $aPipes, null, $this->aEnv);
if (!is_resource($hProc)) {
throw new \Exception('Unable to run command: ' . $sCmd);
}
fclose($aPipes[0]); // no stdin
$iStat = proc_close($hProc);
if ($iStat != 0 && $bExitOnFail) {
exit($iStat);
}
return $iStat;
}
private function escapeParam($sParam)
{
return (preg_match('/^-*\w+$/', $sParam)) ? $sParam : escapeshellarg($sParam);
}
}

131
lib-php/SimpleWordList.php Normal file
View File

@@ -0,0 +1,131 @@
<?php
namespace Nominatim;
/**
* A word list creator based on simple splitting by space.
*
* Creates possible permutations of split phrases by finding all combination
* of splitting the phrase on space boundaries.
*/
class SimpleWordList
{
const MAX_WORDSET_LEN = 20;
const MAX_WORDSETS = 100;
// The phrase as a list of simple terms (without spaces).
private $aWords;
/**
* Create a new word list
*
* @param string sPhrase Phrase to create the word list from. The phrase is
* expected to be normalised, so that there are no
* subsequent spaces.
*/
public function __construct($sPhrase)
{
if (strlen($sPhrase) > 0) {
$this->aWords = explode(' ', $sPhrase);
} else {
$this->aWords = array();
}
}
/**
* Get all possible tokens that are present in this word list.
*
* @return array The list of string tokens in the word list.
*/
public function getTokens()
{
$aTokens = array();
$iNumWords = count($this->aWords);
for ($i = 0; $i < $iNumWords; $i++) {
$sPhrase = $this->aWords[$i];
$aTokens[$sPhrase] = $sPhrase;
for ($j = $i + 1; $j < $iNumWords; $j++) {
$sPhrase .= ' '.$this->aWords[$j];
$aTokens[$sPhrase] = $sPhrase;
}
}
return $aTokens;
}
/**
* Compute all possible permutations of phrase splits that result in
* words which are in the token list.
*/
public function getWordSets($oTokens)
{
$iNumWords = count($this->aWords);
if ($iNumWords == 0) {
return null;
}
// Caches the word set for the partial phrase up to word i.
$aSetCache = array_fill(0, $iNumWords, array());
// Initialise first element of cache. There can only be the word.
if ($oTokens->containsAny($this->aWords[0])) {
$aSetCache[0][] = array($this->aWords[0]);
}
// Now do the next elements using what we already have.
for ($i = 1; $i < $iNumWords; $i++) {
for ($j = $i; $j > 0; $j--) {
$sPartial = $j == $i ? $this->aWords[$j] : $this->aWords[$j].' '.$sPartial;
if (!empty($aSetCache[$j - 1]) && $oTokens->containsAny($sPartial)) {
$aPartial = array($sPartial);
foreach ($aSetCache[$j - 1] as $aSet) {
if (count($aSet) < SimpleWordList::MAX_WORDSET_LEN) {
$aSetCache[$i][] = array_merge($aSet, $aPartial);
}
}
if (count($aSetCache[$i]) > 2 * SimpleWordList::MAX_WORDSETS) {
usort(
$aSetCache[$i],
array('\Nominatim\SimpleWordList', 'cmpByArraylen')
);
$aSetCache[$i] = array_slice(
$aSetCache[$i],
0,
SimpleWordList::MAX_WORDSETS
);
}
}
}
// finally the current full phrase
$sPartial = $this->aWords[0].' '.$sPartial;
if ($oTokens->containsAny($sPartial)) {
$aSetCache[$i][] = array($sPartial);
}
}
$aWordSets = $aSetCache[$iNumWords - 1];
usort($aWordSets, array('\Nominatim\SimpleWordList', 'cmpByArraylen'));
return array_slice($aWordSets, 0, SimpleWordList::MAX_WORDSETS);
}
public static function cmpByArraylen($aA, $aB)
{
$iALen = count($aA);
$iBLen = count($aB);
if ($iALen == $iBLen) {
return 0;
}
return ($iALen < $iBLen) ? -1 : 1;
}
public function debugInfo()
{
return $this->aWords;
}
}

View File

@@ -0,0 +1,44 @@
<?php
namespace Nominatim;
/**
* Operators describing special searches.
*/
abstract class Operator
{
/// No operator selected.
const NONE = 0;
/// Search for POI of the given type.
const TYPE = 1;
/// Search for POIs near the given place.
const NEAR = 2;
/// Search for POIS in the given place.
const IN = 3;
/// Search for POIS named as given.
const NAME = 4;
/// Search for postcodes.
const POSTCODE = 5;
private static $aConstantNames = null;
public static function toString($iOperator)
{
if ($iOperator == Operator::NONE) {
return '';
}
if (Operator::$aConstantNames === null) {
$oReflector = new \ReflectionClass('Nominatim\Operator');
$aConstants = $oReflector->getConstants();
Operator::$aConstantNames = array();
foreach ($aConstants as $sName => $iValue) {
Operator::$aConstantNames[$iValue] = $sName;
}
}
return Operator::$aConstantNames[$iOperator];
}
}

51
lib-php/Status.php Normal file
View File

@@ -0,0 +1,51 @@
<?php
namespace Nominatim;
require_once(CONST_TokenizerDir.'/tokenizer.php');
use Exception;
class Status
{
protected $oDB;
public function __construct(&$oDB)
{
$this->oDB =& $oDB;
}
public function status()
{
if (!$this->oDB) {
throw new Exception('No database', 700);
}
try {
$this->oDB->connect();
} catch (\Nominatim\DatabaseError $e) {
throw new Exception('Database connection failed', 700);
}
$oTokenizer = new \Nominatim\Tokenizer($this->oDB);
$oTokenizer->checkStatus();
}
public function dataDate()
{
$sSQL = 'SELECT EXTRACT(EPOCH FROM lastimportdate) FROM import_status LIMIT 1';
$iDataDateEpoch = $this->oDB->getOne($sSQL);
if ($iDataDateEpoch === false) {
throw new Exception('Import date is not available', 705);
}
return $iDataDateEpoch;
}
public function databaseVersion()
{
$sSQL = 'SELECT value FROM nominatim_properties WHERE property = \'database_version\'';
return $this->oDB->getOne($sSQL);
}
}

74
lib-php/TokenCountry.php Normal file
View File

@@ -0,0 +1,74 @@
<?php
namespace Nominatim\Token;
/**
* A country token.
*/
class Country
{
/// Database word id, if available.
private $iId;
/// Two-letter country code (lower-cased).
private $sCountryCode;
public function __construct($iId, $sCountryCode)
{
$this->iId = $iId;
$this->sCountryCode = $sCountryCode;
}
public function getId()
{
return $this->iId;
}
/**
* Check if the token can be added to the given search.
* Derive new searches by adding this token to an existing search.
*
* @param object $oSearch Partial search description derived so far.
* @param object $oPosition Description of the token position within
the query.
*
* @return True if the token is compatible with the search configuration
* given the position.
*/
public function isExtendable($oSearch, $oPosition)
{
return !$oSearch->hasCountry()
&& $oPosition->maybePhrase('country')
&& $oSearch->getContext()->isCountryApplicable($this->sCountryCode);
}
/**
* Derive new searches by adding this token to an existing search.
*
* @param object $oSearch Partial search description derived so far.
* @param object $oPosition Description of the token position within
the query.
*
* @return SearchDescription[] List of derived search descriptions.
*/
public function extendSearch($oSearch, $oPosition)
{
$oNewSearch = $oSearch->clone($oPosition->isLastToken() ? 1 : 6);
$oNewSearch->setCountry($this->sCountryCode);
return array($oNewSearch);
}
public function debugInfo()
{
return array(
'ID' => $this->iId,
'Type' => 'country',
'Info' => $this->sCountryCode
);
}
public function debugCode()
{
return 'C';
}
}

View File

@@ -0,0 +1,108 @@
<?php
namespace Nominatim\Token;
/**
* A house number token.
*/
class HouseNumber
{
/// Database word id, if available.
private $iId;
/// Normalized house number.
private $sToken;
public function __construct($iId, $sToken)
{
$this->iId = $iId;
$this->sToken = $sToken;
}
public function getId()
{
return $this->iId;
}
/**
* Check if the token can be added to the given search.
* Derive new searches by adding this token to an existing search.
*
* @param object $oSearch Partial search description derived so far.
* @param object $oPosition Description of the token position within
the query.
*
* @return True if the token is compatible with the search configuration
* given the position.
*/
public function isExtendable($oSearch, $oPosition)
{
return !$oSearch->hasHousenumber()
&& !$oSearch->hasOperator(\Nominatim\Operator::POSTCODE)
&& $oPosition->maybePhrase('street');
}
/**
* Derive new searches by adding this token to an existing search.
*
* @param object $oSearch Partial search description derived so far.
* @param object $oPosition Description of the token position within
the query.
*
* @return SearchDescription[] List of derived search descriptions.
*/
public function extendSearch($oSearch, $oPosition)
{
$aNewSearches = array();
// sanity check: if the housenumber is not mainly made
// up of numbers, add a penalty
$iSearchCost = 1;
if (preg_match('/\\d/', $this->sToken) === 0
|| preg_match_all('/[^0-9]/', $this->sToken, $aMatches) > 2) {
$iSearchCost += strlen($this->sToken) - 1;
}
if (!$oSearch->hasOperator(\Nominatim\Operator::NONE)) {
$iSearchCost++;
}
if (empty($this->iId)) {
$iSearchCost++;
}
// also must not appear in the middle of the address
if ($oSearch->hasAddress() || $oSearch->hasPostcode()) {
$iSearchCost++;
}
$oNewSearch = $oSearch->clone($iSearchCost);
$oNewSearch->setHousenumber($this->sToken);
$aNewSearches[] = $oNewSearch;
// Housenumbers may appear in the name when the place has its own
// address terms.
if ($this->iId !== null
&& ($oSearch->getNamePhrase() >= 0 || !$oSearch->hasName())
&& !$oSearch->hasAddress()
) {
$oNewSearch = $oSearch->clone($iSearchCost);
$oNewSearch->setHousenumberAsName($this->iId);
$aNewSearches[] = $oNewSearch;
}
return $aNewSearches;
}
public function debugInfo()
{
return array(
'ID' => $this->iId,
'Type' => 'house number',
'Info' => array('nr' => $this->sToken)
);
}
public function debugCode()
{
return 'H';
}
}

126
lib-php/TokenList.php Normal file
View File

@@ -0,0 +1,126 @@
<?php
namespace Nominatim;
require_once(CONST_LibDir.'/TokenCountry.php');
require_once(CONST_LibDir.'/TokenHousenumber.php');
require_once(CONST_LibDir.'/TokenPostcode.php');
require_once(CONST_LibDir.'/TokenSpecialTerm.php');
require_once(CONST_LibDir.'/TokenWord.php');
require_once(CONST_LibDir.'/TokenPartial.php');
require_once(CONST_LibDir.'/SpecialSearchOperator.php');
/**
* Saves information about the tokens that appear in a search query.
*
* Tokens are sorted by their normalized form, the token word. There are different
* kinds of tokens, represented by different Token* classes. Note that
* tokens do not have a common base class. All tokens need to have a field
* with the word id that points to an entry in the `word` database table
* but otherwise the information saved about a token can be very different.
*/
class TokenList
{
// List of list of tokens indexed by their word_token.
private $aTokens = array();
/**
* Return total number of tokens.
*
* @return Integer
*/
public function count()
{
return count($this->aTokens);
}
/**
* Check if there are tokens for the given token word.
*
* @param string $sWord Token word to look for.
*
* @return bool True if there is one or more token for the token word.
*/
public function contains($sWord)
{
return isset($this->aTokens[$sWord]);
}
/**
* Check if there are partial or full tokens for the given word.
*
* @param string $sWord Token word to look for.
*
* @return bool True if there is one or more token for the token word.
*/
public function containsAny($sWord)
{
return isset($this->aTokens[$sWord]);
}
/**
* Get the list of tokens for the given token word.
*
* @param string $sWord Token word to look for.
*
* @return object[] Array of tokens for the given token word or an
* empty array if no tokens could be found.
*/
public function get($sWord)
{
return isset($this->aTokens[$sWord]) ? $this->aTokens[$sWord] : array();
}
public function getFullWordIDs()
{
$ids = array();
foreach ($this->aTokens as $aTokenList) {
foreach ($aTokenList as $oToken) {
if (is_a($oToken, '\Nominatim\Token\Word')) {
$ids[$oToken->getId()] = $oToken->getId();
}
}
}
return $ids;
}
/**
* Add a new token for the given word.
*
* @param string $sWord Word the token describes.
* @param object $oToken Token object to add.
*
* @return void
*/
public function addToken($sWord, $oToken)
{
if (isset($this->aTokens[$sWord])) {
$this->aTokens[$sWord][] = $oToken;
} else {
$this->aTokens[$sWord] = array($oToken);
}
}
public function debugTokenByWordIdList()
{
$aWordsIDs = array();
foreach ($this->aTokens as $sToken => $aWords) {
foreach ($aWords as $aToken) {
$iId = $aToken->getId();
if ($iId !== null) {
$aWordsIDs[$iId] = '#'.$sToken.'('.$aToken->debugCode().' '.$iId.')#';
}
}
}
return $aWordsIDs;
}
public function debugInfo()
{
return $this->aTokens;
}
}

119
lib-php/TokenPartial.php Normal file
View File

@@ -0,0 +1,119 @@
<?php
namespace Nominatim\Token;
/**
* A standard word token.
*/
class Partial
{
/// Database word id, if applicable.
private $iId;
/// Number of appearances in the database.
private $iSearchNameCount;
/// True, if the token consists exclusively of digits and spaces.
private $bNumberToken;
public function __construct($iId, $sToken, $iSearchNameCount)
{
$this->iId = $iId;
$this->bNumberToken = (bool) preg_match('#^[0-9 ]+$#', $sToken);
$this->iSearchNameCount = $iSearchNameCount;
}
public function getId()
{
return $this->iId;
}
/**
* Check if the token can be added to the given search.
* Derive new searches by adding this token to an existing search.
*
* @param object $oSearch Partial search description derived so far.
* @param object $oPosition Description of the token position within
the query.
*
* @return True if the token is compatible with the search configuration
* given the position.
*/
public function isExtendable($oSearch, $oPosition)
{
return !$oPosition->isPhrase('country');
}
/**
* Derive new searches by adding this token to an existing search.
*
* @param object $oSearch Partial search description derived so far.
* @param object $oPosition Description of the token position within
the query.
*
* @return SearchDescription[] List of derived search descriptions.
*/
public function extendSearch($oSearch, $oPosition)
{
$aNewSearches = array();
// Partial token in Address.
if (($oPosition->isPhrase('') || !$oPosition->isFirstPhrase())
&& $oSearch->hasName()
) {
$iSearchCost = $this->bNumberToken ? 2 : 1;
if ($this->iSearchNameCount >= CONST_Max_Word_Frequency) {
$iSearchCost += 1;
}
$oNewSearch = $oSearch->clone($iSearchCost);
$oNewSearch->addAddressToken(
$this->iId,
$this->iSearchNameCount < CONST_Max_Word_Frequency
);
$aNewSearches[] = $oNewSearch;
}
// Partial token in Name.
if ((!$oSearch->hasPostcode() && !$oSearch->hasAddress())
&& (!$oSearch->hasName(true)
|| $oSearch->getNamePhrase() == $oPosition->getPhrase())
) {
$iSearchCost = 1;
if (!$oSearch->hasName(true)) {
$iSearchCost += 1;
}
if ($this->bNumberToken) {
$iSearchCost += 1;
}
$oNewSearch = $oSearch->clone($iSearchCost);
$oNewSearch->addPartialNameToken(
$this->iId,
$this->iSearchNameCount < CONST_Max_Word_Frequency,
$this->iSearchNameCount > CONST_Search_NameOnlySearchFrequencyThreshold,
$oPosition->getPhrase()
);
$aNewSearches[] = $oNewSearch;
}
return $aNewSearches;
}
public function debugInfo()
{
return array(
'ID' => $this->iId,
'Type' => 'partial',
'Info' => array(
'count' => $this->iSearchNameCount
)
);
}
public function debugCode()
{
return 'w';
}
}

98
lib-php/TokenPostcode.php Normal file
View File

@@ -0,0 +1,98 @@
<?php
namespace Nominatim\Token;
/**
* A postcode token.
*/
class Postcode
{
/// Database word id, if available.
private $iId;
/// Full nomralized postcode (upper cased).
private $sPostcode;
// Optional country code the postcode belongs to (currently unused).
private $sCountryCode;
public function __construct($iId, $sPostcode, $sCountryCode = '')
{
$this->iId = $iId;
$this->sPostcode = $sPostcode;
$this->sCountryCode = empty($sCountryCode) ? '' : $sCountryCode;
}
public function getId()
{
return $this->iId;
}
/**
* Check if the token can be added to the given search.
* Derive new searches by adding this token to an existing search.
*
* @param object $oSearch Partial search description derived so far.
* @param object $oPosition Description of the token position within
the query.
*
* @return True if the token is compatible with the search configuration
* given the position.
*/
public function isExtendable($oSearch, $oPosition)
{
return !$oSearch->hasPostcode() && $oPosition->maybePhrase('postalcode');
}
/**
* Derive new searches by adding this token to an existing search.
*
* @param object $oSearch Partial search description derived so far.
* @param object $oPosition Description of the token position within
the query.
*
* @return SearchDescription[] List of derived search descriptions.
*/
public function extendSearch($oSearch, $oPosition)
{
$aNewSearches = array();
// If we have structured search or this is the first term,
// make the postcode the primary search element.
if ($oSearch->hasOperator(\Nominatim\Operator::NONE) && $oPosition->isFirstToken()) {
$oNewSearch = $oSearch->clone(1);
$oNewSearch->setPostcodeAsName($this->iId, $this->sPostcode);
$aNewSearches[] = $oNewSearch;
}
// If we have a structured search or this is not the first term,
// add the postcode as an addendum.
if (!$oSearch->hasOperator(\Nominatim\Operator::POSTCODE)
&& ($oPosition->isPhrase('postalcode') || $oSearch->hasName())
) {
$iPenalty = 1;
if (strlen($this->sPostcode) < 4) {
$iPenalty += 4 - strlen($this->sPostcode);
}
$oNewSearch = $oSearch->clone($iPenalty);
$oNewSearch->setPostcode($this->sPostcode);
$aNewSearches[] = $oNewSearch;
}
return $aNewSearches;
}
public function debugInfo()
{
return array(
'ID' => $this->iId,
'Type' => 'postcode',
'Info' => $this->sPostcode.'('.$this->sCountryCode.')'
);
}
public function debugCode()
{
return 'P';
}
}

Some files were not shown because too many files have changed in this diff Show More