Compare commits

..

4 Commits

Author SHA1 Message Date
Sarah Hoffmann
7eb263a3a4 only check for postgres major and minor version
fixes #192
2014-11-28 23:02:53 +01:00
Sarah Hoffmann
ed4e46c3d9 remove debug output 2014-11-28 23:02:44 +01:00
Sarah Hoffmann
a7ef9426d3 prepare 2.3.1 release 2014-11-27 22:37:05 +01:00
Sarah Hoffmann
0bc81fe59c more tolerant regexp for parsing replication state directories
Apache 2.4 has changed the date format, so that the current regexp
doesn't match anymore, so be more tolerant with the date format.
Also force less fancy output formatting without tables.
2014-11-27 22:33:41 +01:00
4074 changed files with 150522 additions and 132787 deletions

2
.github/FUNDING.yml vendored
View File

@@ -1,2 +0,0 @@
github: lonvia
custom: "https://nominatim.org/funding/"

View File

@@ -1,4 +0,0 @@
contact_links:
- name: Nominatim Discussions
url: https://github.com/osm-search/Nominatim/discussions
about: Ask questions, get support, share ideas and discuss with community members.

View File

@@ -1,22 +0,0 @@
---
name: Feature request
about: Suggest an idea for this project
title: ''
labels: ''
assignees: ''
---
<!-- Before opening a new feature request, please search through the open issue to check that your request hasn't been reported already. -->
**Is your feature request related to a problem? Please describe.**
<!-- A clear and concise description of what the problem is. Ex. I'm always frustrated when [...] -->
**Describe the solution you'd like**
<!-- A clear and concise description of what you want to happen. -->
**Describe alternatives you've considered**
<!-- A clear and concise description of any alternative solutions or features you've considered. -->
**Additional context**
<!-- Add any other context or screenshots about the feature request here. -->

View File

@@ -1,39 +0,0 @@
---
name: Report issues with search results
about: You have searched something with Nominatim and did not get the expected result.
title: ''
labels: ''
assignees: ''
---
<!-- Note: this template is for reporting problems with searching. If you have found an issue with the data, you need to report/fix the issue directly in OpenStreetMap. See https://www.openstreetmap.org/fixthemap for details. -->
## What did you search for?
<!-- Please try to provide a link to your search. You can go to https://nominatim.openstreetmap.org and repeat your search there. If you originally found the issue somewhere else, please tell us what software/website you were using. -->
## What result did you get?
## What result did you expect?
**When the result in the right place and just named wrongly:**
<!-- Please tell us the display name you expected. -->
**When the result missing completely:**
<!-- Make sure that the data you are looking for is in OpenStreetMap. Provide a link to the OpenStreetMap object or if you cannot get it, a link to the map on https://openstreetmap.org where you expect the result to be.
To get the link to the OSM object, you can try the following:
* Go to [https://openstreetmap.org](https://openstreetmap.org).
* Move to the area of the map where you expect the result and then zoom in as much as possible.
* Click on the question mark on the right side of the map. You get a question cursor. Use it to click on the map where your object is located.
* Find the object of interest in the list that appears on the left side.
* Click on the object and report back the URL that the browser shows.
-->
## Further details
<!-- Anything else we should know about the search. Particularities with addresses in the area etc. -->

View File

@@ -1,36 +0,0 @@
---
name: Report problems with the software
about: You have your own installation of Nominatim and found a bug.
title: ''
labels: ''
assignees: ''
---
<!-- Note: if you are installing Nominatim through a docker image, you should report issues with the installation process with the docker repository first. -->
**Describe the bug**
<!-- A clear and concise description of what the bug is. -->
**To Reproduce**
<!-- Please describe what you did to get to the issue. -->
**Software Environment (please complete the following information):**
- Nominatim version:
- Postgresql version:
- Postgis version:
- OS:
**Hardware Configuration (please complete the following information):**
- RAM:
- number of CPUs:
- type and size of disks:
- bare metal/AWS/other cloud service:
**Postgresql Configuration:**
<!-- List any configuration items you changed in your postgresql configuration. -->
**Additional context**
<!-- Add any other context about the problem here. -->

View File

@@ -1,51 +0,0 @@
name: 'Build Nominatim'
inputs:
ubuntu:
description: 'Version of Ubuntu to install on'
required: false
default: '20'
cmake-args:
description: 'Additional options to hand to cmake'
required: false
default: ''
lua:
description: 'Version of Lua to use'
required: false
default: '5.3'
runs:
using: "composite"
steps:
- name: Clean out the disk
run: |
sudo rm -rf /opt/hostedtoolcache/go /opt/hostedtoolcache/CodeQL /usr/lib/jvm /usr/local/share/chromium /usr/local/lib/android
df -h
shell: bash
- name: Install prerequisites
run: |
sudo apt-get install -y -qq libboost-system-dev libboost-filesystem-dev libexpat1-dev zlib1g-dev libbz2-dev libpq-dev libproj-dev libicu-dev liblua${LUA_VERSION}-dev lua${LUA_VERSION}
if [ "x$UBUNTUVER" == "x18" ]; then
pip3 install python-dotenv psycopg2==2.7.7 jinja2==2.8 psutil==5.4.2 pyicu==2.9 osmium PyYAML==5.1 datrie
else
sudo apt-get install -y -qq python3-icu python3-datrie python3-pyosmium python3-jinja2 python3-psutil python3-psycopg2 python3-dotenv python3-yaml
fi
shell: bash
env:
UBUNTUVER: ${{ inputs.ubuntu }}
CMAKE_ARGS: ${{ inputs.cmake-args }}
LUA_VERSION: ${{ inputs.lua }}
- name: Configure
run: mkdir build && cd build && cmake $CMAKE_ARGS ../Nominatim
shell: bash
env:
CMAKE_ARGS: ${{ inputs.cmake-args }}
- name: Build
run: |
make -j2 all
sudo make install
shell: bash
working-directory: build

View File

@@ -1,47 +0,0 @@
name: 'Setup Postgresql and Postgis'
inputs:
postgresql-version:
description: 'Version of PostgreSQL to install'
required: true
postgis-version:
description: 'Version of Postgis to install'
required: true
runs:
using: "composite"
steps:
- name: Remove existing PostgreSQL
run: |
sudo apt-get purge -yq postgresql*
sudo sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list'
sudo apt-get update -qq
shell: bash
- name: Install PostgreSQL
run: |
sudo apt-get install -y -qq --no-install-suggests --no-install-recommends postgresql-client-${PGVER} postgresql-${PGVER}-postgis-${POSTGISVER} postgresql-${PGVER}-postgis-${POSTGISVER}-scripts postgresql-contrib-${PGVER} postgresql-${PGVER}
shell: bash
env:
PGVER: ${{ inputs.postgresql-version }}
POSTGISVER: ${{ inputs.postgis-version }}
- name: Adapt postgresql configuration
run: |
echo 'fsync = off' | sudo tee /etc/postgresql/${PGVER}/main/conf.d/local.conf
echo 'synchronous_commit = off' | sudo tee -a /etc/postgresql/${PGVER}/main/conf.d/local.conf
echo 'full_page_writes = off' | sudo tee -a /etc/postgresql/${PGVER}/main/conf.d/local.conf
echo 'shared_buffers = 1GB' | sudo tee -a /etc/postgresql/${PGVER}/main/conf.d/local.conf
echo 'port = 5432' | sudo tee -a /etc/postgresql/${PGVER}/main/conf.d/local.conf
shell: bash
env:
PGVER: ${{ inputs.postgresql-version }}
- name: Setup database
run: |
sudo systemctl restart postgresql
sudo -u postgres createuser -S www-data
sudo -u postgres createuser -s runner
shell: bash

View File

@@ -1,316 +0,0 @@
name: CI Tests
on: [ push, pull_request ]
jobs:
create-archive:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
with:
submodules: true
- uses: actions/cache@v3
with:
path: |
data/country_osm_grid.sql.gz
key: nominatim-country-data-1
- name: Package tarball
run: |
if [ ! -f data/country_osm_grid.sql.gz ]; then
wget --no-verbose -O data/country_osm_grid.sql.gz https://www.nominatim.org/data/country_grid.sql.gz
fi
cd ..
tar czf nominatim-src.tar.bz2 Nominatim
mv nominatim-src.tar.bz2 Nominatim
- name: 'Upload Artifact'
uses: actions/upload-artifact@v3
with:
name: full-source
path: nominatim-src.tar.bz2
retention-days: 1
tests:
needs: create-archive
strategy:
matrix:
ubuntu: [18, 20, 22]
include:
- ubuntu: 18
postgresql: 9.6
postgis: 2.5
pytest: pytest
php: 7.2
- ubuntu: 20
postgresql: 13
postgis: 3
pytest: py.test-3
php: 7.4
- ubuntu: 22
postgresql: 15
postgis: 3
pytest: py.test-3
php: 8.1
runs-on: ubuntu-${{ matrix.ubuntu }}.04
steps:
- uses: actions/download-artifact@v3
with:
name: full-source
- name: Unpack Nominatim
run: tar xf nominatim-src.tar.bz2
- name: Setup PHP
uses: shivammathur/setup-php@v2
with:
php-version: ${{ matrix.php }}
tools: phpunit, phpcs, composer
ini-values: opcache.jit=disable
- uses: actions/setup-python@v4
with:
python-version: 3.6
if: matrix.ubuntu == 18
- uses: ./Nominatim/.github/actions/setup-postgresql
with:
postgresql-version: ${{ matrix.postgresql }}
postgis-version: ${{ matrix.postgis }}
- uses: ./Nominatim/.github/actions/build-nominatim
with:
ubuntu: ${{ matrix.ubuntu }}
- name: Install test prerequsites
run: sudo apt-get install -y -qq python3-pytest python3-behave
if: matrix.ubuntu == 20
- name: Install test prerequsites
run: pip3 install pylint pytest behave==1.2.6
if: ${{ (matrix.ubuntu == 18) || (matrix.ubuntu == 22) }}
- name: Install test prerequsites
run: sudo apt-get install -y -qq python3-pytest
if: matrix.ubuntu == 22
- name: Install latest pylint/mypy
run: pip3 install -U pylint mypy types-PyYAML types-jinja2 types-psycopg2 types-psutil types-requests typing-extensions
- name: PHP linting
run: phpcs --report-width=120 .
working-directory: Nominatim
- name: Python linting
run: pylint nominatim
working-directory: Nominatim
- name: Python static typechecking
run: mypy --strict nominatim
working-directory: Nominatim
- name: PHP unit tests
run: phpunit ./
working-directory: Nominatim/test/php
if: ${{ (matrix.ubuntu == 20) || (matrix.ubuntu == 22) }}
- name: Python unit tests
run: $PYTEST test/python
working-directory: Nominatim
env:
PYTEST: ${{ matrix.pytest }}
- name: BDD tests
run: |
behave -DREMOVE_TEMPLATE=1 -DBUILDDIR=$GITHUB_WORKSPACE/build --format=progress3
working-directory: Nominatim/test/bdd
legacy-test:
needs: create-archive
runs-on: ubuntu-20.04
steps:
- uses: actions/download-artifact@v3
with:
name: full-source
- name: Unpack Nominatim
run: tar xf nominatim-src.tar.bz2
- name: Setup PHP
uses: shivammathur/setup-php@v2
with:
php-version: 7.4
- uses: ./Nominatim/.github/actions/setup-postgresql
with:
postgresql-version: 13
postgis-version: 3
- name: Install Postgresql server dev
run: sudo apt-get install postgresql-server-dev-13
- uses: ./Nominatim/.github/actions/build-nominatim
with:
ubuntu: 20
cmake-args: -DBUILD_MODULE=on
- name: Install test prerequsites
run: sudo apt-get install -y -qq python3-behave
- name: BDD tests (legacy tokenizer)
run: |
behave -DREMOVE_TEMPLATE=1 -DBUILDDIR=$GITHUB_WORKSPACE/build -DTOKENIZER=legacy --format=progress3
working-directory: Nominatim/test/bdd
install:
runs-on: ubuntu-latest
needs: create-archive
strategy:
matrix:
name: [Ubuntu-18, Ubuntu-20, Ubuntu-22]
include:
- name: Ubuntu-18
flavour: ubuntu
image: "ubuntu:18.04"
ubuntu: 18
install_mode: install-nginx
- name: Ubuntu-20
flavour: ubuntu
image: "ubuntu:20.04"
ubuntu: 20
install_mode: install-apache
- name: Ubuntu-22
flavour: ubuntu
image: "ubuntu:22.04"
ubuntu: 22
install_mode: install-apache
container:
image: ${{ matrix.image }}
env:
LANG: en_US.UTF-8
defaults:
run:
shell: sudo -Hu nominatim bash --noprofile --norc -eo pipefail {0}
steps:
- name: Prepare container (Ubuntu)
run: |
export APT_LISTCHANGES_FRONTEND=none
export DEBIAN_FRONTEND=noninteractive
apt-get update -qq
apt-get install -y git sudo wget
ln -snf /usr/share/zoneinfo/$CONTAINER_TIMEZONE /etc/localtime && echo $CONTAINER_TIMEZONE > /etc/timezone
shell: bash
if: matrix.flavour == 'ubuntu'
- name: Prepare container (CentOS)
run: |
dnf update -y
dnf install -y sudo glibc-langpack-en
shell: bash
if: matrix.flavour == 'centos'
- name: Setup import user
run: |
useradd -m nominatim
echo 'nominatim ALL=(ALL:ALL) NOPASSWD: ALL' > /etc/sudoers.d/nominiatim
echo "/home/nominatim/Nominatim/vagrant/Install-on-${OS}.sh no $INSTALL_MODE" > /home/nominatim/vagrant.sh
shell: bash
env:
OS: ${{ matrix.name }}
INSTALL_MODE: ${{ matrix.install_mode }}
- uses: actions/download-artifact@v3
with:
name: full-source
path: /home/nominatim
- name: Install Nominatim
run: |
export USERNAME=nominatim
export USERHOME=/home/nominatim
export NOSYSTEMD=yes
export HAVE_SELINUX=no
tar xf nominatim-src.tar.bz2
. vagrant.sh
working-directory: /home/nominatim
- name: Prepare import environment
run: |
mv Nominatim/test/testdb/apidb-test-data.pbf test.pbf
rm -rf Nominatim
mkdir data-env-reverse
working-directory: /home/nominatim
- name: Prepare import environment (CentOS)
run: |
sudo ln -s /usr/local/bin/nominatim /usr/bin/nominatim
echo NOMINATIM_DATABASE_WEBUSER="apache" > nominatim-project/.env
cp nominatim-project/.env data-env-reverse/.env
working-directory: /home/nominatim
if: matrix.flavour == 'centos'
- name: Print version
run: nominatim --version
working-directory: /home/nominatim/nominatim-project
- name: Collect host OS information
run: nominatim admin --collect-os-info
working-directory: /home/nominatim/nominatim-project
- name: Import
run: nominatim import --osm-file ../test.pbf
working-directory: /home/nominatim/nominatim-project
- name: Import special phrases
run: nominatim special-phrases --import-from-wiki
working-directory: /home/nominatim/nominatim-project
- name: Check full import
run: nominatim admin --check-database
working-directory: /home/nominatim/nominatim-project
- name: Warm up database
run: nominatim admin --warm
working-directory: /home/nominatim/nominatim-project
- name: Prepare update (Ubuntu)
run: apt-get install -y python3-pip
shell: bash
if: matrix.flavour == 'ubuntu'
- name: Run update
run: |
pip3 install --user osmium
nominatim replication --init
NOMINATIM_REPLICATION_MAX_DIFF=1 nominatim replication --once
working-directory: /home/nominatim/nominatim-project
- name: Clean up database
run: nominatim refresh --postcodes --word-tokens
working-directory: /home/nominatim/nominatim-project
- name: Run reverse-only import
run : |
echo 'NOMINATIM_DATABASE_DSN="pgsql:dbname=reverse"' >> .env
nominatim import --osm-file ../test.pbf --reverse-only --no-updates
working-directory: /home/nominatim/data-env-reverse
- name: Check reverse-only import
run: nominatim admin --check-database
working-directory: /home/nominatim/data-env-reverse
- name: Clean up database (reverse-only import)
run: nominatim refresh --postcodes --word-tokens
working-directory: /home/nominatim/nominatim-project

30
.gitignore vendored
View File

@@ -1,9 +1,31 @@
*.log
*.pyc
docs/develop/*.png
nominatim/*.d
nominatim/*.o
nominatim/nominatim
module/nominatim.so
module/nominatim.o
settings/configuration.txt
settings/download.lock
settings/state.txt
settings/local.php
build
.deps/
autom4te.cache/
config.*
configure
Makefile
Makefile.in
stamp-h1
missing
INSTALL
aclocal.m4
depcomp
install-sh
compile
data/wiki_import.sql
data/wiki_specialphrases.sql
data/osmosischange.osc
.vagrant
data/country_osm_grid.sql.gz

View File

@@ -1,13 +0,0 @@
[mypy]
[mypy-icu.*]
ignore_missing_imports = True
[mypy-osmium.*]
ignore_missing_imports = True
[mypy-datrie.*]
ignore_missing_imports = True
[mypy-dotenv.*]
ignore_missing_imports = True

View File

@@ -1,18 +0,0 @@
[MASTER]
extension-pkg-whitelist=osmium
ignored-modules=icu,datrie
[MESSAGES CONTROL]
[TYPECHECK]
# closing added here because it sometimes triggers a false positive with
# 'with' statements.
ignored-classes=NominatimArgs,closing
# 'too-many-ancestors' is triggered already by deriving from UserDict
# 'not-context-manager' disabled because it causes false positives once
# typed Python is enabled. See also https://github.com/PyCQA/pylint/issues/5273
disable=too-few-public-methods,duplicate-code,too-many-ancestors,bad-option-value,no-self-use,not-context-manager
good-names=i,x,y,m,fd,db,cc

28
AUTHORS
View File

@@ -1,15 +1,17 @@
Nominatim was written by:
* Brian Quinion
* Sarah Hoffmann
* Marc Tobias Metten
* markigail
* AntoJvlt
* gemo1011
* darkshredder
and many more.
For a full list of contributors see the Git logs or visit
https://github.com/openstreetmap/Nominatim/graphs/contributors
Brian Quinion
Sarah Hoffmann
Frederik Ramm
Michael Spreng
Daniele Forsi
mfn
Grant Slater
Andree Klattenhoff
IrlJidel
appelflap
b3nn0
Spin0us
Kurt Roeckx
Rodolphe Quiédeville
Marc Tobias Metten

View File

@@ -1,286 +0,0 @@
#-----------------------------------------------------------------------------
#
# CMake Config
#
# Nominatim
#
#-----------------------------------------------------------------------------
cmake_minimum_required(VERSION 3.0 FATAL_ERROR)
list(APPEND CMAKE_MODULE_PATH "${CMAKE_SOURCE_DIR}/cmake")
#-----------------------------------------------------------------------------
#
# Project version
#
#-----------------------------------------------------------------------------
project(nominatim)
set(NOMINATIM_VERSION_MAJOR 4)
set(NOMINATIM_VERSION_MINOR 2)
set(NOMINATIM_VERSION_PATCH 0)
set(NOMINATIM_VERSION "${NOMINATIM_VERSION_MAJOR}.${NOMINATIM_VERSION_MINOR}.${NOMINATIM_VERSION_PATCH}")
add_definitions(-DNOMINATIM_VERSION="${NOMINATIM_VERSION}")
# Setting GIT_HASH
find_package(Git)
if (GIT_FOUND)
execute_process(
COMMAND "${GIT_EXECUTABLE}" log -1 --format=%h
WORKING_DIRECTORY ${CMAKE_CURRENT_LIST_DIR}
OUTPUT_VARIABLE GIT_HASH
OUTPUT_STRIP_TRAILING_WHITESPACE
ERROR_QUIET
)
endif()
#-----------------------------------------------------------------------------
# Configuration
#-----------------------------------------------------------------------------
set(BUILD_IMPORTER on CACHE BOOL "Build everything for importing/updating the database")
set(BUILD_API on CACHE BOOL "Build everything for the API server")
set(BUILD_MODULE off CACHE BOOL "Build PostgreSQL module for legacy tokenizer")
set(BUILD_TESTS on CACHE BOOL "Build test suite")
set(BUILD_DOCS on CACHE BOOL "Build documentation")
set(BUILD_MANPAGE on CACHE BOOL "Build Manual Page")
set(BUILD_OSM2PGSQL on CACHE BOOL "Build osm2pgsql (expert only)")
set(INSTALL_MUNIN_PLUGINS on CACHE BOOL "Install Munin plugins for supervising Nominatim")
#-----------------------------------------------------------------------------
# osm2pgsql (imports/updates only)
#-----------------------------------------------------------------------------
if (BUILD_IMPORTER AND BUILD_OSM2PGSQL)
if (NOT EXISTS "${CMAKE_SOURCE_DIR}/osm2pgsql/CMakeLists.txt")
message(FATAL_ERROR "The osm2pgsql directory is empty.\
Did you forget to check out Nominatim recursively?\
\nTry updating submodules with: git submodule update --init")
endif()
set(BUILD_TESTS_SAVED "${BUILD_TESTS}")
set(BUILD_TESTS off)
add_subdirectory(osm2pgsql)
set(BUILD_TESTS ${BUILD_TESTS_SAVED})
endif()
#-----------------------------------------------------------------------------
# python (imports/updates only)
#-----------------------------------------------------------------------------
if (BUILD_IMPORTER)
find_package(PythonInterp 3.6 REQUIRED)
endif()
#-----------------------------------------------------------------------------
# PHP
#-----------------------------------------------------------------------------
# Setting PHP binary variable as to command line (prevailing) or auto detect
if (BUILD_API OR BUILD_IMPORTER)
if (NOT PHP_BIN)
find_program (PHP_BIN php)
endif()
# sanity check if PHP binary exists
if (NOT EXISTS ${PHP_BIN})
message(FATAL_ERROR "PHP binary not found. Install php or provide location with -DPHP_BIN=/path/php ")
else()
message (STATUS "Using PHP binary " ${PHP_BIN})
endif()
if (NOT PHPCGI_BIN)
find_program (PHPCGI_BIN php-cgi)
endif()
# sanity check if PHP binary exists
if (NOT EXISTS ${PHPCGI_BIN})
message(WARNING "php-cgi binary not found. nominatim tool will not provide query functions.")
set (PHPCGI_BIN "")
else()
message (STATUS "Using php-cgi binary " ${PHPCGI_BIN})
endif()
endif()
#-----------------------------------------------------------------------------
# import scripts and utilities (importer only)
#-----------------------------------------------------------------------------
if (BUILD_IMPORTER)
find_file(COUNTRY_GRID_FILE country_osm_grid.sql.gz
PATHS ${PROJECT_SOURCE_DIR}/data
NO_DEFAULT_PATH
DOC "Location of the country grid file."
)
if (NOT COUNTRY_GRID_FILE)
message(FATAL_ERROR "\nYou need to download the country_osm_grid first:\n"
" wget -O ${PROJECT_SOURCE_DIR}/data/country_osm_grid.sql.gz https://www.nominatim.org/data/country_grid.sql.gz")
endif()
configure_file(${PROJECT_SOURCE_DIR}/cmake/tool.tmpl
${PROJECT_BINARY_DIR}/nominatim)
endif()
#-----------------------------------------------------------------------------
# Tests
#-----------------------------------------------------------------------------
if (BUILD_TESTS)
include(CTest)
set(TEST_BDD db osm2pgsql api)
find_program(PYTHON_BEHAVE behave)
find_program(PYLINT NAMES pylint3 pylint)
find_program(PYTEST NAMES pytest py.test-3 py.test)
find_program(PHPCS phpcs)
find_program(PHPUNIT phpunit)
if (PYTHON_BEHAVE)
message(STATUS "Using Python behave binary ${PYTHON_BEHAVE}")
foreach (test ${TEST_BDD})
add_test(NAME bdd_${test}
COMMAND ${PYTHON_BEHAVE} ${test}
WORKING_DIRECTORY ${PROJECT_SOURCE_DIR}/test/bdd)
set_tests_properties(bdd_${test}
PROPERTIES ENVIRONMENT "NOMINATIM_DIR=${PROJECT_BINARY_DIR}")
endforeach()
else()
message(WARNING "behave not found. BDD tests disabled." )
endif()
if (PHPUNIT)
message(STATUS "Using phpunit binary ${PHPUNIT}")
add_test(NAME php
COMMAND ${PHPUNIT} ./
WORKING_DIRECTORY ${PROJECT_SOURCE_DIR}/test/php)
else()
message(WARNING "phpunit not found. PHP unit tests disabled." )
endif()
if (PHPCS)
message(STATUS "Using phpcs binary ${PHPCS}")
add_test(NAME phpcs
COMMAND ${PHPCS} --report-width=120 --colors lib-php
WORKING_DIRECTORY ${PROJECT_SOURCE_DIR})
else()
message(WARNING "phpcs not found. PHP linting tests disabled." )
endif()
if (PYLINT)
message(STATUS "Using pylint binary ${PYLINT}")
add_test(NAME pylint
COMMAND ${PYLINT} nominatim
WORKING_DIRECTORY ${PROJECT_SOURCE_DIR})
else()
message(WARNING "pylint not found. Python linting tests disabled.")
endif()
if (PYTEST)
message(STATUS "Using pytest binary ${PYTEST}")
add_test(NAME pytest
COMMAND ${PYTEST} test/python
WORKING_DIRECTORY ${PROJECT_SOURCE_DIR})
else()
message(WARNING "pytest not found. Python tests disabled." )
endif()
endif()
#-----------------------------------------------------------------------------
# Postgres module
#-----------------------------------------------------------------------------
if (BUILD_MODULE)
add_subdirectory(module)
endif()
#-----------------------------------------------------------------------------
# Documentation
#-----------------------------------------------------------------------------
if (BUILD_DOCS)
add_subdirectory(docs)
endif()
#-----------------------------------------------------------------------------
# Manual page
#-----------------------------------------------------------------------------
if (BUILD_MANPAGE)
add_subdirectory(man)
endif()
#-----------------------------------------------------------------------------
# Installation
#-----------------------------------------------------------------------------
include(GNUInstallDirs)
set(NOMINATIM_DATADIR ${CMAKE_INSTALL_FULL_DATADIR}/${PROJECT_NAME})
set(NOMINATIM_LIBDIR ${CMAKE_INSTALL_FULL_LIBDIR}/${PROJECT_NAME})
set(NOMINATIM_CONFIGDIR ${CMAKE_INSTALL_FULL_SYSCONFDIR}/${PROJECT_NAME})
set(NOMINATIM_MUNINDIR ${CMAKE_INSTALL_FULL_DATADIR}/munin/plugins)
if (BUILD_IMPORTER)
configure_file(${PROJECT_SOURCE_DIR}/cmake/tool-installed.tmpl installed.bin)
install(PROGRAMS ${PROJECT_BINARY_DIR}/installed.bin
DESTINATION ${CMAKE_INSTALL_BINDIR}
RENAME nominatim)
install(DIRECTORY nominatim
DESTINATION ${NOMINATIM_LIBDIR}/lib-python
FILES_MATCHING PATTERN "*.py"
PATTERN __pycache__ EXCLUDE)
install(DIRECTORY lib-sql DESTINATION ${NOMINATIM_LIBDIR})
install(FILES ${COUNTRY_GRID_FILE}
data/words.sql
DESTINATION ${NOMINATIM_DATADIR})
endif()
if (BUILD_OSM2PGSQL)
if (${CMAKE_VERSION} VERSION_LESS 3.13)
# Installation of subdirectory targets was only introduced in 3.13.
# So just copy the osm2pgsql file for older versions.
install(PROGRAMS ${PROJECT_BINARY_DIR}/osm2pgsql/osm2pgsql
DESTINATION ${NOMINATIM_LIBDIR})
else()
install(TARGETS osm2pgsql RUNTIME DESTINATION ${NOMINATIM_LIBDIR})
endif()
endif()
if (BUILD_MODULE)
install(PROGRAMS ${PROJECT_BINARY_DIR}/module/nominatim.so
DESTINATION ${NOMINATIM_LIBDIR}/module)
endif()
if (BUILD_API)
install(DIRECTORY lib-php DESTINATION ${NOMINATIM_LIBDIR})
endif()
install(FILES settings/env.defaults
settings/address-levels.json
settings/phrase-settings.json
settings/import-admin.style
settings/import-street.style
settings/import-address.style
settings/import-full.style
settings/import-extratags.style
settings/icu_tokenizer.yaml
settings/country_settings.yaml
DESTINATION ${NOMINATIM_CONFIGDIR})
install(DIRECTORY settings/icu-rules
DESTINATION ${NOMINATIM_CONFIGDIR})
install(DIRECTORY settings/country-names
DESTINATION ${NOMINATIM_CONFIGDIR})
if (INSTALL_MUNIN_PLUGINS)
install(FILES munin/nominatim_importlag
munin/nominatim_query_speed
munin/nominatim_requests
DESTINATION ${NOMINATIM_MUNINDIR})
endif()

View File

@@ -1,102 +0,0 @@
# Nominatim contribution guidelines
## Reporting Bugs
Bugs can be reported at https://github.com/openstreetmap/Nominatim/issues.
Please always open a separate issue for each problem. In particular, do
not add your bugs to closed issues. They may looks similar to you but
often are completely different from the maintainer's point of view.
## Workflow for Pull Requests
We love to get pull requests from you. We operate the "Fork & Pull" model
explained at
https://help.github.com/articles/using-pull-requests
You should fork the project into your own repo, create a topic branch
there and then make one or more pull requests back to the openstreetmap repository.
Your pull requests will then be reviewed and discussed. Please be aware
that you are responsible for your pull requests. You should be prepared
to get change requests because as the maintainers we have to make sure
that your contribution fits well with the rest of the code. Please make
sure that you have time to react to these comments and amend the code or
engage in a conversion. Do not expect that others will pick up your code,
it will almost never happen.
Please open a separate pull request for each issue you want to address.
Don't mix multiple changes. In particular, don't mix style cleanups with
feature pull requests. If you plan to make larger changes, please open
an issue first or comment on the appropriate issue already existing so
that duplicate work can be avoided.
## Coding style
Nominatim historically hasn't followed a particular coding style but we
are in process of consolidating the style. The following rules apply:
* Python code uses the official Python style
* indentation
* SQL use 2 spaces
* all other file types use 4 spaces
* [BSD style](https://en.wikipedia.org/wiki/Indent_style#Allman_style) for braces
* spaces
* spaces before and after equal signs and operators
* no trailing spaces
* no spaces after opening and before closing bracket
* leave out space between a function name and bracket
but add one between control statement(if, while, etc.) and bracket
* for PHP variables use CamelCase with a prefixing letter indicating the type
(i - integer, f - float, a - array, s - string, o - object)
The coding style is enforced with PHPCS and pylint. It can be tested with:
```
phpcs --report-width=120 --colors .
pylint3 --extension-pkg-whitelist=osmium nominatim
```
## Testing
Before submitting a pull request make sure that the tests pass:
```
cd build
make test
```
## Releases
Nominatim follows semantic versioning. Major releases are done for large changes
that require (or at least strongly recommend) a reimport of the databases.
Minor releases can usually be applied to exisiting databases Patch releases
contain bug fixes only and are released from a separate branch where the
relevant changes are cherry-picked from the master branch.
Checklist for releases:
* [ ] increase version in `nominatim/version.py` and CMakeLists.txt
* [ ] update `ChangeLog` (copy information from patch releases from release branch)
* [ ] complete `docs/admin/Migration.md`
* [ ] update EOL dates in `SECURITY.md`
* [ ] commit and make sure CI tests pass
* [ ] test migration
* download, build and import previous version
* migrate using master version
* run updates using master version
* [ ] prepare tarball:
* `git clone --recursive https://github.com/osm-search/Nominatim` (switch to right branch!)
* `rm -r .git* osm2pgsql/.git*`
* copy country data into `data/`
* add version to base directory and package
* [ ] upload tarball to https://nominatim.org
* [ ] prepare documentation
* check out new docs branch
* change git checkout instructions to tarball download instructions or adapt version on existing ones
* build documentation and copy to https://github.com/osm-search/nominatim-org-site
* add new version to history
* [ ] check release tarball
* download tarball as per new documentation instructions
* compile and import Nominatim
* run `nominatim --version` to confirm correct version
* [ ] tag new release and add a release on github.com

14
COPYING
View File

@@ -1,12 +1,12 @@
GNU GENERAL PUBLIC LICENSE
Version 2, June 1991
GNU GENERAL PUBLIC LICENSE
Version 2, June 1991
Copyright (C) 1989, 1991 Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.
Preamble
Preamble
The licenses for most software are designed to take away your
freedom to share and change it. By contrast, the GNU General Public
@@ -56,7 +56,7 @@ patent must be licensed for everyone's free use or not licensed at all.
The precise terms and conditions for copying, distribution and
modification follow.
GNU GENERAL PUBLIC LICENSE
GNU GENERAL PUBLIC LICENSE
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
0. This License applies to any program or other work which contains
@@ -255,7 +255,7 @@ make exceptions for this. Our decision will be guided by the two goals
of preserving the free status of all derivatives of our free software and
of promoting the sharing and reuse of software generally.
NO WARRANTY
NO WARRANTY
11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
@@ -277,9 +277,9 @@ YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
POSSIBILITY OF SUCH DAMAGES.
END OF TERMS AND CONDITIONS
END OF TERMS AND CONDITIONS
How to Apply These Terms to Your New Programs
How to Apply These Terms to Your New Programs
If you develop a new program, and you want it to be of the greatest
possible use to the public, the best way to achieve this is to make it

436
ChangeLog
View File

@@ -1,401 +1,7 @@
4.2.0
2.0.1
* add experimental support for osm2pgsql flex style
* introduce secondary importance value to be retrieved from a raster data file
(currently still unused, to replace address importance, thanks to @tareqpi)
* add new report tool `nominatim admin --collect-os-info`
(thanks @micahcochran, @tareqpi)
* reorganise index to improve lookup performance and size
* run index creation after import in parallel
* run ANALYZE more selectively to speed up continuation of indexing
* fix crash on update when addr:interpolation receives an illegal value
* fix minimum number of retrieved results to be at least 10
* fix search for combinations of special term + name (e.g Hotel Bellevue)
* do not return interpolations without a parent street on reverse search
* improve invalidation of linked places on updates
* fix address parsing for interpolation lines
* make sure socket timeouts are respected during replication
(working around a bug in some versions of pyosmium)
* update bundled osm2pgsql to 1.7.1
* add support for PostgreSQL 15
* typing fixes to work with latest type annotations from typeshed
* smaller improvements to documentation (thanks to @mausch)
4.1.0
* switch to ICU tokenizer as default
* add housenumber normalization and support optional spaces during search
* add postcode format checking and support optional spaces during search
* add function for cleaning housenumbers in word table
* add updates/deletion of country names imported from OSM
* linked places no longer overwrite names from a place permanently
* move default country name configuration into yaml file (thanks @tareqpi)
* more compact layout for interpolation and TIGER tables
* introduce mutations to ICU tokenizer (used for German umlauts)
* support reinitializing a full project directory with refresh --website
* fix various issues with linked places on updates
* add support for external sanitizers and token analyzers
* add CLI commands for forced indexing
* add CLI command for version report
* add offline import mode
* change geocodejson to return a feature class in the 'type' field
* add ISO3166-2 to address output (thanks @I70l0teN4ik)
* improve parsing and matching of addr: tags
* support relations as street members of associatedStreet
* better ranking for address results from TIGER data
* adapt rank classification to changed tag usage in OSM
* update bundled osm2pgsql to 1.6.0
* add typing information to Python code
* improve unit test coverage
* reorganise and speed up code for BDD tests, drop support for scenes
* move PHP unit tests to PHP 9.5
* extensive typo fixes in documentation (thanks @woodpeck,@StephanGeorg,
@amandasaurus, @nslxndr, @stefkiourk, @Luflosi, @kianmeng)
* drop official support for installation on CentOS
* add installation instructions for Ubuntu 22.04
* add support for PHP8
* add setup instructions for updates and systemd
* drop support for PostgreSQL 9.5
4.0.1
* fix initialisation error in replication script
* ICU tokenizer: avoid any special characters in word tokens
* better error message when API php script does not exist
* fix quoting of house numbers in SQL queries
* small fixes and improvements in search query parsing
* add documentation for moving the database to a different machine
4.0.0
* refactor name token computation and introduce ICU tokenizer
* name processing now happens in the indexer outside the DB
* reorganizes abbreviation handling and moves it to the indexing phases
* adds preprocessing of names
* add country-specific ranking for Spain, Slovakia
* partially switch to using SP-GIST indexes
* better updating of dependent addresses for name changes in streets
* remove unused/broken tables for external housenumbers
* move external postcodes to CSV format and no longer save them in tables
(adds support for postcodes for arbitrary countries)
* remove postcode helper entries from placex (thanks @AntoJvlt)
* change required format for TIGER data to CSV
* move configuration of default languages from wiki into config file
* expect customized configuration files in project directory by default
* disable search API for reverse-only import (thanks @darkshredder)
* port most of maintenance/import code to Python and remove PHP utils
* add catch-up mode for replication
* add updating of special phrases (thanks @AntoJvlt)
* add support for special phrases in CSV files (thanks @AntoJvlt)
* switch to case-independent matching between place and boundary names
* remove disabling of reverse query parsing
* minor tweaks to search algorithm to avoid more false positives
* major overhaul of the administrator and developer documentation
* add security disclosure policy
* add testing of installation scripts via CI
* drop support for Python < 3.6 and Postgresql < 9.5
3.7.2
* fix database check for reverse-only imports
* do not error out in status API result when import date is missing
* add array_key_last function for PHP < 7.3 (thanks to @woodpeck)
* fix more url when server name is unknown (thanks to @mogita)
* commit changes to replication log table
3.7.1
* fix smaller issues with special phrases import (thanks @AntoJvlt)
* add index to speed up continued indexing during import
* fix index on location_property_tiger(parent_place_id) (thanks @changpingc)
* make sure Python code is backward-compatible with Python 3.5
* various documentation fixes
3.7.0
* switch to dotenv for configuration file
* introduce 'make install' (reorganising most of the code)
* introduce nominatim tool as replacement for various php scripts
* introduce project directories and allow multiple installations from same build
* clean up BDD tests: drop nose, reorganise step code
* simplify test database for API BDD tests and autoinstall database
* port most of the code for command-line tools to Python
(thanks to @darkshredder and @AntoJvlt)
* add tests for all tooling
* replace pyosmium-get-changes with custom internal implementation using
pyosmium
* improve search for queries with housenumber and partial terms
* add database versioning
* use jinja2 for preprocessing SQL files
* introduce automatic migrations
* reverse fix preference of interpolations over housenumbers
* parallelize indexing of postcodes
* add non-key indexes to speed up housenumber + street searches
* switch housenumber field in placex to save transliterated names
3.6.0
* add full support for searching by and displaying of addr:* tags
* improve address output for large-area objects
* better use of country names from OSM data for search and display
* better debug output for reverse call
* add support for addr:place links without an place equivalent in OSM
* improve finding postcodes with normalisation artefacts
* batch object to index for rank 30, avoiding a wrap-around of transaction
IDs in PostgreSQL
* introduce dynamic address rank computation for administrative boundaries
depending on linked objects and their place in the admin level hierarchy
* add country-specific address ranking for Indonesia, Russia, Belgium and
the Netherlands (thanks @hendrikmoree)
* make sure wikidata/wikipedia tags are imported for all styles
* make POIs searchable by name and housenumber (thanks @joy-yyd)
* reverse geocoding now ignores places without an address rank (rivers etc.)
* installation of a webserver is no longer mandatory, for development
use the php internal webserver via 'make serve
* reduce the influence of place nodes in addresses
* drop support for the unspecific is_in tag
* various minor tweaks to supplied styles
* move HTML web frontend into its own project
* move scripts for processing external data sources into separate directories
* introduce separate configuration for website (thanks @krahulreddy)
* update documentation, in particular, clean up development docs
* update osm2pgsql to 1.4.0
3.5.2
* ensure that wikipedia tags are imported for all styles
* reinstate verbosity for indexing during updates
* make house number reappear in display name on named POIs
* introduce batch processing in indexer to avoid transaction ID overrun
* increase splitting for large geometries to improve indexing speed
* remove deprecated get_magic_quotes_gpc() function
* make sure that all postcodes have an entry in word and are thus searchable
* remove use of ST_Covers in conjunction woth ST_Intersects,
causes bad query planning and slow updates in Postgis3
* update osm2pgsql
3.5.1
* disable jit and parallel processing in PostgreSQL for osm2pgsql
* update libosmium to 2.15.6 (fixes an issue with processing hanging
on large multipolygons)
3.5.0
* structured select on HTML search page
* new PHP Nominatim\Shell class to wrap shell escaping
* remove polygon parameter from all API calls
* improve handling of postcode areas
* reorganise place linking algorithm, now using wikidata tag as well
* remove linkees from search_name and larger_area tables
* introduce country-specific address ranks
* reorganise rank address computation
* cleanup of partition function
* improve parenting for large POIs
* add support for Postgresql 12 and Postgis 3
* add earlier cleanup when --drop is given, to reduce memory usage
* remove use of place_id in URLs
* replace C nominatim indexer with a simpler Python implementation
* split up the huge sql/functions.sql file
* move osm2pgsql tests to osm2pgsql
* add new extratags style which imports all tags from OSM
* add new script for checking the import after completion
* update osm2pgsql, reducing memory usage
* use new wikipedia importance and add processing of wikidata tags
* add search form for details page
* use ExtraDataPath for country_grid table
* remove short_name from list of names to be displayed
* split up CMakeFile, so that all parts can be built separately
* update installation instructions for CentOS and Ubuntu
* add script for importing/updating multiple country extracts
* various documentation improvements
3.4.2
* fix security bug in /details endpoint where user input was not
properly sanitized
3.4.1
* update osm2pgsql to fix hans during updates and lost address numbers
during updates
3.4.0
* increase required version for PostgreSQL(9.3), PostGIS(2.2) and PHP(7.0)
* better error reporting for out-of-memory errors
* exclude postcode ranges separated by colon from centre point calculation
* update osm2pgsql, better handling of imports without flatnode file
* switch to more efficient algorithm for word set computation
* use only boundries for country and state parts of addresses
* improve updates of addresses with housenumbers and interpolations
* remove country from place_addressline table and use country_code instead
* optimise indexes on search_name partition tables
* improve searching of attached streets for large objects like airports
* drop support for python 2
* new scripts for importing Wikidata for importance
* create and drop indexes concurrently to not clash with auto vacuum
* various documentation improvements
3.3.0
* zoom 17 in reverse now zooms in on minor streets
* fix use of postcode relations in address
* support for housenumber 0 on interpolations
* replace database abstraction DB with PDO and switch to using exceptions
* exclude line features at rank 30 from reverse geocoding
* remove self-reference and country from place_addressline
* make json output more readable (less escaping)
* update conversion scripts for postcodes
* scripts in utils/ are no longer executable (always use scripts in build dir)
* remove Natural Earth country fallback (OSM is complete enough)
* make rank assignments configurable
* allow accept languages with underscore
* new reverse-only import mode (without search index table)
* rely on boundaries only for states and countries
* update osm2pgsql, now using a configurable style
* provide multiple import styles
* improve search when house number and postcodes are dropped
* overhaul of setup code
* add support for PHPUnit 6
* update test database
* various documentation improvements
3.2.0
* complete rewrite of reverse search algorithm
* add new geojson and geocodejson output formats
* add simple export script to exprot addresses to CSV
* remove is_in terms from address computation
* remove unused search_name_country tables
* various smaller fixes to query parsing
* convert Tokens and token types to class types
* correctly handle update when boundary object type is changed
* improve debug output for /search endpoint
* update to latest osm2pgsql and leaflet.js
* overhaul of /details endpoint:
* new class parameter when using osmtype/osmid parameters
* permalink to instance-independent osmtype/osmid parameter format
* new json output format
* update CentOS vagrant machine to use SELinux
* add vagrant scripts for Ubuntu 18.04
* fix build process for BSD
* enable running the database on a different host than the setup scripts
* allow to configure use of custom PHP binaries (PHP_BIN)
* extensive coding style improvements to PHP code
* more PHP unit tests for new classes
* increase coverage for API tests
* add documentation for API
3.1.0
* rework postcode handling and introduce location_postcode table
* make setup less verbose and print a summary in the end
* setup: error out when web site user does not exist
* add more API tests to complete code coverage
* reinstate key-value amenity search (special term [key=value])
* fix detection of coordinates in query
* various smaller tweaks to ranking of search interpretations
* complete overhaul of PHP frontend code using OOP
* add address rank to details page
* update Tiger scripts for 2017 data and clean up unused code
* various bug fixes and improvements to UI
* improve reverse geocoding performance close to coasts
* more PHP style cleanup (quoting)
* allow unnamed road in reverse geocoding to avoid too far off results
* add function to recalculate counts for full-word search term
* add function to check if new updates are available
* update documentation and switch to mkdocs for generating HTML
3.0.1
* fix bug in geometry building algorithm in osm2pgsql
* fix typos in build instructions
3.0.0
* move to cmake build system
* various fixes to HTML output
* reverse endpoint now can return geometries
* add support for PHP7
* move to on-the-fly computation of interpolations
* improve handling of linked places (updating)
* improve test framework:
* replace lettuce with behave
* use smaller database for API tests
* drop support for postgres < 9.1, postgis < 2.0 and PHP < 5.4
* make external data use optional (useful for imports without US)
* detect postgres and postgis versions automatically
* clean up query logging and remove unused tables
* move installation documentation into this repo
* add self-documenting vagrant scripts
* remove --create-website, recommend to use website directory in build
* add accessor functions for URL parameters and improve erro checking
* remove IP blocking and rate-limiting code
* enable CI via travis
* reformatting for more consistent coding style
* make country search term creation part of setup
* update country names and country grid
* handle roads that cross boundaries better
* keep full information on address tags
* update to refactored osm2pgsql which use libosmium based types
* switch from osmosis to pyosmium for updates
* be more strict when matching against special search terms
* handle postcode entries with mutliple values correctly
2.5
* reverse geocoding includes looking up housenumbers from Tiger data
* added parameter to return simplified geometries
* new lookup call for getting address information for OSM objects
* new namedetails and extratags parameters that expose the name and extratags
fields of the placex table
* mobile website
* reverse web view
2.4
* drop support for postgres 8.4
* rewrite address interpolation
* switch to C++ version of osm2pgsql and rewrite tag filtering
* support for bridge:name and tunnel:name, man_made, junction
* drop way-node index (reduces database size by about 15%)
* add support for configuring tablespaces and webserver user
* better evaluation of search queries in right-to-left notation
* improve house number search for streets with many duplicate entries
* code cleanup (remove unused functions and tables)
2.3
* further improve ordering of results
* support for more lat/lon formats in search-as-reverse
* fix handling of GB postcodes
* new functional test suite
* support for waterway relations
* inherit postcodes from street to poi
* fix housenumber normalisation to find non-latin house numbers
* take viewbox into account for ordering of results
* pois may now inherit address tags from surrounding buildings
* improve what objects may participate in an address
* clean up handled class/type combinations to current OSM usage
* lots of bug fixes
2.2
* correct database rights for www-data
* add timestamps for update output
* load postgis via extension for postgis >= 2.0
* remove non-admin boundaries from addresses
* further improve ordering of results with same importance
* merge addr:postcode tags into object addresses
* include rank and importance in reverse geocode output
* replace ST_Line_Interpolate_Point with ST_LineInterpolatePoint
(for postgis >= 2.1)
* update osm2pgsql to latest version
* properly detect changes of admin_level
* remove landuses when name is removed
* smaller fixes
* delete outdated entries from location_area_country
* remove remaining uses of INTEGER, to allow node ids larger than 2^31
2.1
@@ -421,7 +27,37 @@
* refactoring of front-end PHP code
* lots of smaller bug fixes
2.0.1
2.2
* delete outdated entries from location_area_country
* remove remaining uses of INTEGER, to allow node ids larger than 2^31
* correct database rights for www-data
* add timestamps for update output
* load postgis via extension for postgis >= 2.0
* remove non-admin boundaries from addresses
* further improve ordering of results with same importance
* merge addr:postcode tags into object addresses
* include rank and importance in reverse geocode output
* replace ST_Line_Interpolate_Point with ST_LineInterpolatePoint
(for postgis >= 2.1)
* update osm2pgsql to latest version
* properly detect changes of admin_level
* remove landuses when name is removed
* smaller fixes
2.3
* further improve ordering of results
* support for more lat/lon formats in search-as-reverse
* fix handling of GB postcodes
* new functional test suite
* support for waterway relations
* inherit postcodes from street to poi
* fix housenumber normalisation to find non-latin house numbers
* take viewbox into account for ordering of results
* pois may now inherit address tags from surrounding buildings
* improve what objects may participate in an address
* clean up handled class/type combinations to current OSM usage
* lots of bug fixes
2.3.1
* fix parse error in replication state directories (fatal during setup)

27
Makefile.am Normal file
View File

@@ -0,0 +1,27 @@
ACLOCAL_AMFLAGS = -I osm2pgsql/m4
AUTOMAKE_OPTIONS = -Wno-portability
SUBDIRS = osm2pgsql module nominatim
NOMINATIM_SERVER ?= $(shell echo a | php -F lib/init.php -E 'echo CONST_Website_BaseURL."\n";')
NOMINATIM_DATABASE ?= $(shell echo a | php -F lib/init.php -E 'echo DB::parseDSN(CONST_Database_DSN)["database"];')
install:
@echo Nominatim needs to be executed directly from this directory. No install necessary.
test:
cd tests; NOMINATIM_SERVER=${NOMINATIM_SERVER} lettuce -t -Fail -t -poldi-only
test-fast:
cd tests; NOMINATIM_SERVER=${NOMINATIM_SERVER} NOMINATIM_REUSE_TEMPLATE=1 lettuce -t -Fail -t -poldi-only
test-db:
cd tests; NOMINATIM_SERVER=${NOMINATIM_SERVER} lettuce -t -Fail -t -poldi-only features/db
test-db-fast:
cd tests; NOMINATIM_SERVER=${NOMINATIM_SERVER} NOMINATIM_REUSE_TEMPLATE=1 lettuce -t -Fail -t -poldi-only features/db
test-api:
cd tests; NOMINATIM_SERVER=${NOMINATIM_SERVER} lettuce -t -Fail -t -poldi-only features/api
.PHONY: test test-fast test-db test-db-fast test-api

56
README Normal file
View File

@@ -0,0 +1,56 @@
Nominatim
=========
Nominatim (from the Latin, 'by name') is a tool to search OpenStreetMap data
by name and address (geocoding) and to generate synthetic addresses of
OSM points (reverse geocoding). An instance with up-to-date data can be found
at http://nominatim.openstreetmap.org. Nominatim is also used as one of the
sources for the Search box on the OpenStreetMap home page and powers the search
on the MapQuest Open Initiative websites.
Documentation
=============
More information about Nominatim, including usage and installation instructions,
can be found in the OSM wiki at:
http://wiki.openstreetmap.org/wiki/Nominatim
Installation
============
The following instructions is a quick guide to installation. A more detailed guide
how to set up your own instance of Nominatim can be found in the wiki:
http://wiki.openstreetmap.org/wiki/Nominatim/Installation
Installation steps:
0. If checking out from git run:
./autogen.sh
1. Compile Nominatim:
./configure [--enable-64bit-ids]
make
2. Get OSM data and import:
./utils/setup.php --osm-file <your planet file> --all
3. Point your webserver to the ./website directory.
License
=======
The source code is available under a GPLv2 license.
Contact and Bugreports
======================
For questions you can join the geocoding mailinglist, see
http://lists.openstreetmap.org/listinfo/geocoding
Bugs may be reported on the github project site:
https://github.com/twain47/Nominatim

View File

@@ -1,67 +0,0 @@
[![Build Status](https://github.com/osm-search/Nominatim/workflows/CI%20Tests/badge.svg)](https://github.com/osm-search/Nominatim/actions?query=workflow%3A%22CI+Tests%22)
[![codecov](https://codecov.io/gh/osm-search/Nominatim/branch/master/graph/badge.svg?token=8P1LXrhCMy)](https://codecov.io/gh/osm-search/Nominatim)
Nominatim
=========
Nominatim (from the Latin, 'by name') is a tool to search OpenStreetMap data
by name and address (geocoding) and to generate synthetic addresses of
OSM points (reverse geocoding). An instance with up-to-date data can be found
at https://nominatim.openstreetmap.org. Nominatim is also used as one of the
sources for the Search box on the OpenStreetMap home page.
Documentation
=============
The documentation of the latest development version is in the
`docs/` subdirectory. A HTML version can be found at
https://nominatim.org/release-docs/develop/ .
Installation
============
The latest stable release can be downloaded from https://nominatim.org.
There you can also find [installation instructions for the release](https://nominatim.org/release-docs/latest/admin/Installation), as well as an extensive [Troubleshooting/FAQ section](https://nominatim.org/release-docs/latest/admin/Faq/).
[Detailed installation instructions for current master](https://nominatim.org/release-docs/develop/admin/Installation)
can be found at nominatim.org as well.
A quick summary of the necessary steps:
1. Compile Nominatim:
mkdir build
cd build
cmake ..
make
sudo make install
2. Create a project directory, get OSM data and import:
mkdir nominatim-project
cd nominatim-project
nominatim import --osm-file <your planet file>
3. Point your webserver to the nominatim-project/website directory.
License
=======
The source code is available under a GPLv2 license.
Contributing
============
Contributions, bugreport and pull requests are welcome.
For details see [contribution guide](CONTRIBUTING.md).
Questions and help
==================
For questions, community help and discussions you can use the
[Github discussions forum](https://github.com/osm-search/Nominatim/discussions)
or join the
[geocoding mailing list](https://lists.openstreetmap.org/listinfo/geocoding).

View File

@@ -1,40 +0,0 @@
# Security Policy
## Supported Versions
All Nominatim releases receive security updates for two years.
The following table lists the end of support for all currently supported
versions.
| Version | End of support for security updates |
| ------- | ----------------------------------- |
| 4.2.x | 2024-11-24 |
| 4.1.x | 2024-08-05 |
| 4.0.x | 2023-11-02 |
| 3.7.x | 2023-04-05 |
| 3.6.x | 2022-12-12 |
## Reporting a Vulnerability
If you believe, you have found an issue in Nominatim that has implications on
security, please send a description of the issue to **security@nominatim.org**.
You will receive an acknowledgement of your mail within 3 work days where we
also notify you of the next steps.
## How we Disclose Security Issues
** The following section only applies to security issues found in released
versions. Issues that concern the master development branch only will be
fixed immediately on the branch with the corresponding PR containing the
description of the nature and severity of the issue. **
Patches for identified security issues are applied to all affected versions and
new minor versions are released. At the same time we release a statement at
the [Nominatim blog](https://nominatim.org/blog/) describing the nature of the
incident. Announcements will also be published at the
[geocoding mailinglist](https://lists.openstreetmap.org/listinfo/geocoding).
## List of Previous Incidents
* 2020-05-04 - [SQL injection issue on /details endpoint](https://lists.openstreetmap.org/pipermail/geocoding/2020-May/002012.html)

View File

@@ -1,186 +0,0 @@
# Install Nominatim in a virtual machine for development and testing
This document describes how you can install Nominatim inside a Ubuntu 16
virtual machine on your desktop/laptop (host machine). The goal is to give
you a development environment to easily edit code and run the test suite
without affecting the rest of your system.
The installation can run largely unsupervised. You should expect 1h from
start to finish depending on how fast your computer and download speed
is.
## Prerequisites
1. [Virtualbox](https://www.virtualbox.org/wiki/Downloads)
2. [Vagrant](https://www.vagrantup.com/downloads.html)
3. Nominatim
git clone --recursive https://github.com/openstreetmap/Nominatim.git
If you forgot `--recursive`, it you can later load the submodules using
git submodule init
git submodule update
## Installation
1. Start the virtual machine
vagrant up ubuntu
2. Log into the virtual machine
vagrant ssh ubuntu
3. Import a small country (Monaco)
See the FAQ how to skip this step and point Nominatim to an existing database.
```
# inside the virtual machine:
cd nominatim-project
wget --no-verbose --output-document=monaco.osm.pbf http://download.geofabrik.de/europe/monaco-latest.osm.pbf
nominatim import --osm-file monaco.osm.pbf 2>&1 | tee monaco.$$.log
```
To repeat an import you'd need to delete the database first
dropdb --if-exists nominatim
## Development
Vagrant maps the virtual machine's port 8089 to your host machine. Thus you can
see Nominatim in action on [localhost:8089](http://localhost:8089/nominatim/).
You edit code on your host machine in any editor you like. There is no need to
restart any software: just refresh your browser window.
Note that the webserver uses files from the /build directory. If you change
files in Nominatim/website or Nominatim/utils for example you first need to
copy them into the /build directory by running the `cmake` step from the
installation.
PHP errors are written to `/var/log/apache2/error.log`.
With `echo` and `var_dump()` you write into the output (HTML/XML/JSON) when
you either add `&debug=1` to the URL (preferred) or set
`@define('CONST_Debug', true);` in `settings/local.php`.
In the Python BDD test you can use `logger.info()` for temporary debug
statements.
## Running unit tests
cd ~/Nominatim/tests/php
phpunit ./
## Running PHP code style tests
cd ~/Nominatim
phpcs --colors .
## Running functional tests
Tests in `test/bdd/db` and `test/bdd/osm2pgsql` have to pass 100%. Other
tests might require full planet-wide data. Sadly even if you have your own
planet-wide data there will be enough differences to the openstreetmap.org
installation to cause false positives in the other tests (see FAQ).
To run the full test suite
cd ~/Nominatim/test/bdd
behave -DBUILDDIR=/home/vagrant/build/ db osm2pgsql
To run a single file
behave -DBUILDDIR=/home/vagrant/build/ api/lookup/simple.feature
Or a single test by line number
behave -DBUILDDIR=/home/vagrant/build/ api/lookup/simple.feature:34
To run specific groups of tests you can add tags just before the `Scenario line`, e.g.
@bug-34
Scenario: address lookup for non-existing or invalid node, way, relation
and then
behave -DBUILDDIR=/home/vagrant/build/ --tags @bug-34
## FAQ
##### Will it run on Windows?
Yes, Vagrant and Virtualbox can be installed on MS Windows just fine. You need a 64bit
version of Windows.
##### Why Monaco, can I use another country?
Of course! The Monaco import takes less than 30 minutes and works with 2GB RAM.
##### Will the results be the same as those from nominatim.openstreetmap.org?
No. Long running Nominatim installations will differ once new import features (or
bug fixes) get added since those usually only get applied to new/changed data.
Also this document skips the optional Wikipedia data import which affects ranking
of search results. See [Nominatim installation](https://nominatim.org/release-docs/latest/admin/Installation) for details.
##### Why Ubuntu? Can I test CentOS/Fedora/CoreOS/FreeBSD?
There is a Vagrant script for CentOS available, but the Nominatim directory
isn't symlinked/mounted to the host which makes development trickier. We used
it mainly for debugging installation with SELinux.
In general Nominatim will run in the other environments. The installation steps
are slightly different, e.g. the name of the package manager, Apache2 package
name, location of files. We chose Ubuntu because that is closest to the
nominatim.openstreetmap.org production environment.
You can configure/download other Vagrant boxes from [https://app.vagrantup.com/boxes/search](https://app.vagrantup.com/boxes/search).
##### How can I connect to an existing database?
Let's say you have a Postgres database named `nominatim_it` on server `your-server.com` and port `5432`. The Postgres username is `postgres`. You can edit `settings/local.php` and point Nominatim to it.
pgsql:host=your-server.com;port=5432;user=postgres;dbname=nominatim_it
No data import or restarting necessary.
If the Postgres installation is behind a firewall, you can try
ssh -L 9999:localhost:5432 your-username@your-server.com
inside the virtual machine. It will map the port to `localhost:9999` and then
you edit `settings/local.php` with
@define('CONST_Database_DSN', 'pgsql:host=localhost;port=9999;user=postgres;dbname=nominatim_it');
To access postgres directly remember to specify the hostname, e.g. `psql --host localhost --port 9999 nominatim_it`
##### My computer is slow and the import takes too long. Can I start the virtual machine "in the cloud"?
Yes. It's possible to start the virtual machine on [Amazon AWS (plugin)](https://github.com/mitchellh/vagrant-aws)
or [DigitalOcean (plugin)](https://github.com/smdahlen/vagrant-digitalocean).

110
Vagrantfile vendored
View File

@@ -1,110 +0,0 @@
# -*- mode: ruby -*-
# vi: set ft=ruby :
Vagrant.configure("2") do |config|
# Apache webserver
config.vm.network "forwarded_port", guest: 80, host: 8089
config.vm.network "forwarded_port", guest: 8088, host: 8088
# If true, then any SSH connections made will enable agent forwarding.
config.ssh.forward_agent = true
# Never sync the current directory to /vagrant.
config.vm.synced_folder ".", "/vagrant", disabled: true
checkout = "yes"
if ENV['CHECKOUT'] != 'y' then
checkout = "no"
end
config.vm.provider "virtualbox" do |vb, override|
vb.gui = false
vb.memory = 2048
vb.customize ["setextradata", :id, "VBoxInternal2/SharedFoldersEnableSymlinksCreate//vagrant","0"]
if ENV['CHECKOUT'] != 'y' then
override.vm.synced_folder ".", "/home/vagrant/Nominatim"
end
end
config.vm.provider "libvirt" do |lv, override|
lv.memory = 2048
lv.nested = true
if ENV['CHECKOUT'] != 'y' then
override.vm.synced_folder ".", "/home/vagrant/Nominatim", type: 'nfs'
end
end
config.vm.define "ubuntu", primary: true do |sub|
sub.vm.box = "generic/ubuntu2004"
sub.vm.provision :shell do |s|
s.path = "vagrant/Install-on-Ubuntu-20.sh"
s.privileged = false
s.args = [checkout]
end
end
config.vm.define "ubuntu-apache" do |sub|
sub.vm.box = "generic/ubuntu2004"
sub.vm.provision :shell do |s|
s.path = "vagrant/Install-on-Ubuntu-20.sh"
s.privileged = false
s.args = [checkout, "install-apache"]
end
end
config.vm.define "ubuntu-nginx" do |sub|
sub.vm.box = "generic/ubuntu2004"
sub.vm.provision :shell do |s|
s.path = "vagrant/Install-on-Ubuntu-20.sh"
s.privileged = false
s.args = [checkout, "install-nginx"]
end
end
config.vm.define "ubuntu18" do |sub|
sub.vm.box = "generic/ubuntu1804"
sub.vm.provision :shell do |s|
s.path = "vagrant/Install-on-Ubuntu-18.sh"
s.privileged = false
s.args = [checkout]
end
end
config.vm.define "ubuntu18-apache" do |sub|
sub.vm.box = "generic/ubuntu1804"
sub.vm.provision :shell do |s|
s.path = "vagrant/Install-on-Ubuntu-18.sh"
s.privileged = false
s.args = [checkout, "install-apache"]
end
end
config.vm.define "ubuntu18-nginx" do |sub|
sub.vm.box = "generic/ubuntu1804"
sub.vm.provision :shell do |s|
s.path = "vagrant/Install-on-Ubuntu-18.sh"
s.privileged = false
s.args = [checkout, "install-nginx"]
end
end
config.vm.define "centos7" do |sub|
sub.vm.box = "centos/7"
sub.vm.provision :shell do |s|
s.path = "vagrant/Install-on-Centos-7.sh"
s.privileged = false
s.args = [checkout]
end
end
config.vm.define "centos" do |sub|
sub.vm.box = "generic/centos8"
sub.vm.provision :shell do |s|
s.path = "vagrant/Install-on-Centos-8.sh"
s.privileged = false
s.args = [checkout]
end
end
end

2
autogen.sh Executable file
View File

@@ -0,0 +1,2 @@
#!/bin/sh
autoreconf -vfi

View File

@@ -1,20 +0,0 @@
#!/usr/bin/env python3
import sys
import os
sys.path.insert(1, '@NOMINATIM_LIBDIR@/lib-python')
os.environ['NOMINATIM_NOMINATIM_TOOL'] = os.path.abspath(__file__)
from nominatim import cli
from nominatim import version
version.GIT_COMMIT_HASH = '@GIT_HASH@'
exit(cli.nominatim(module_dir='@NOMINATIM_LIBDIR@/module',
osm2pgsql_path='@NOMINATIM_LIBDIR@/osm2pgsql',
phplib_dir='@NOMINATIM_LIBDIR@/lib-php',
sqllib_dir='@NOMINATIM_LIBDIR@/lib-sql',
data_dir='@NOMINATIM_DATADIR@',
config_dir='@NOMINATIM_CONFIGDIR@',
phpcgi_path='@PHPCGI_BIN@'))

View File

@@ -1,20 +0,0 @@
#!/usr/bin/env python3
import sys
import os
sys.path.insert(1, '@CMAKE_SOURCE_DIR@')
os.environ['NOMINATIM_NOMINATIM_TOOL'] = os.path.abspath(__file__)
from nominatim import cli
from nominatim import version
version.GIT_COMMIT_HASH = '@GIT_HASH@'
exit(cli.nominatim(module_dir='@CMAKE_BINARY_DIR@/module',
osm2pgsql_path='@CMAKE_BINARY_DIR@/osm2pgsql/osm2pgsql',
phplib_dir='@CMAKE_SOURCE_DIR@/lib-php',
sqllib_dir='@CMAKE_SOURCE_DIR@/lib-sql',
data_dir='@CMAKE_SOURCE_DIR@/data',
config_dir='@CMAKE_SOURCE_DIR@/settings',
phpcgi_path='@PHPCGI_BIN@'))

65
configure.ac Normal file
View File

@@ -0,0 +1,65 @@
AC_INIT(Nominatim,2.3.1)
if git rev-parse HEAD 2>/dev/null >/dev/null; then
AC_SUBST([PACKAGE_VERSION], [$PACKAGE_VERSION-git-`git rev-parse --short HEAD`])
fi
dnl Required autoconf version
AC_PREREQ(2.61)
AM_INIT_AUTOMAKE([1.9.6 dist-bzip2 std-options check-news])
dnl Additional macro definitions are in here
AC_CONFIG_MACRO_DIR([osm2pgsql/m4])
dnl Generate configuration header file
AC_CONFIG_HEADER(nominatim/config.h)
dnl Find C compiler
AC_PROG_CC
dnl Find C++ compiler
AC_PROG_CXX
dnl pthread
AX_PTHREAD([], [AC_MSG_ERROR([pthread library required])])
dnl Check for Geos library
AX_LIB_GEOS
if test "x$GEOS_VERSION" = "x"
then
AC_MSG_ERROR([required library not found]);
fi
dnl Check for Proj library
AX_LIB_PROJ
if test "$HAVE_PROJ" = "no"
then
AC_MSG_ERROR([required library not found]);
fi
dnl Check for PostgresSQL client library
AX_LIB_POSTGRESQL(8.4)
if test "x$POSTGRESQL_VERSION" = "x"
then
AC_MSG_ERROR([postgresql client library not found])
fi
if test ! -f "$POSTGRESQL_PGXS"
then
AC_MSG_ERROR([postgresql server development library not found])
fi
dnl Check for bzip2 library
AX_LIB_BZIP2
if test "$HAVE_BZIP2" = "no"
then
AC_MSG_ERROR([required library not found]);
fi
dnl Check for libxml2 library
AM_PATH_XML2
AC_CONFIG_SUBDIRS([osm2pgsql])
AC_OUTPUT(Makefile nominatim/Makefile module/Makefile)

36
contrib/openlayers.cfg Normal file
View File

@@ -0,0 +1,36 @@
# This file includes a small subset of OpenLayers code, designed to be
# integrated into another application. It includes only the Layer types
# neccesary to create tiled or untiled WMS, and does not include any Controls.
# This is the result of what was at the time called "Webmap.js" at the FOSS4G
# Web Mapping BOF.
[first]
[last]
[include]
OpenLayers/Map.js
OpenLayers/Kinetic.js
OpenLayers/Geometry/MultiLineString.js
OpenLayers/Geometry/MultiPolygon.js
OpenLayers/Format/WKT.js
OpenLayers/Layer/OSM.js
OpenLayers/Layer/Vector.js
OpenLayers/Layer/SphericalMercator.js
OpenLayers/Control/Attribution.js
OpenLayers/Control/KeyboardDefaults.js
OpenLayers/Control/Navigation.js
OpenLayers/Control/MousePosition.js
OpenLayers/Control/PanZoomBar.js
OpenLayers/Control/Permalink.js
OpenLayers/Control/TouchNavigation.js
OpenLayers/Style.js
OpenLayers/Protocol/HTTP.js
OpenLayers/Projection.js
OpenLayers/Renderer/SVG.js
OpenLayers/Renderer/VML.js
OpenLayers/Renderer/Canvas.js
[exclude]

304
data/country_name.sql Normal file

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

25524
data/country_osm_grid.sql Normal file

File diff suppressed because one or more lines are too long

View File

@@ -0,0 +1,27 @@
-- This data contains Ordnance Survey data © Crown copyright and database right 2010.
-- Code-Point Open contains Royal Mail data © Royal Mail copyright and database right 2010.
-- OS data may be used under the terms of the OS OpenData licence:
-- http://www.ordnancesurvey.co.uk/oswebsite/opendata/licence/docs/licence.pdf
SET statement_timeout = 0;
SET client_encoding = 'UTF8';
SET standard_conforming_strings = off;
SET check_function_bodies = false;
SET client_min_messages = warning;
SET escape_string_warning = off;
SET search_path = public, pg_catalog;
SET default_tablespace = '';
SET default_with_oids = false;
CREATE TABLE gb_postcode (
id integer,
postcode character varying(9),
geometry geometry,
CONSTRAINT enforce_dims_geometry CHECK ((st_ndims(geometry) = 2)),
CONSTRAINT enforce_srid_geometry CHECK ((st_srid(geometry) = 4326))
);
GRANT SELECT ON TABLE gb_postcode TO "www-data";

32430
data/us_postcode.sql Normal file

File diff suppressed because it is too large Load Diff

2930
data/us_state.sql Normal file

File diff suppressed because one or more lines are too long

6198
data/us_statecounty.sql Normal file

File diff suppressed because one or more lines are too long

File diff suppressed because it is too large Load Diff

View File

@@ -1,35 +0,0 @@
# Auto-generated vagrant install documentation
# build the actual documentation
configure_file(mkdocs.yml ../mkdocs.yml)
file(MAKE_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR}/appendix)
set (DOC_SOURCES
admin
develop
api
customize
index.md
extra.css
styles.css
)
foreach (src ${DOC_SOURCES})
execute_process(
COMMAND ${CMAKE_COMMAND} -E create_symlink ${CMAKE_CURRENT_SOURCE_DIR}/${src} ${CMAKE_CURRENT_BINARY_DIR}/${src}
)
endforeach()
ADD_CUSTOM_TARGET(doc
COMMAND ${CMAKE_CURRENT_SOURCE_DIR}/bash2md.sh ${PROJECT_SOURCE_DIR}/vagrant/Install-on-Ubuntu-18.sh ${CMAKE_CURRENT_BINARY_DIR}/appendix/Install-on-Ubuntu-18.md
COMMAND ${CMAKE_CURRENT_SOURCE_DIR}/bash2md.sh ${PROJECT_SOURCE_DIR}/vagrant/Install-on-Ubuntu-20.sh ${CMAKE_CURRENT_BINARY_DIR}/appendix/Install-on-Ubuntu-20.md
COMMAND ${CMAKE_CURRENT_SOURCE_DIR}/bash2md.sh ${PROJECT_SOURCE_DIR}/vagrant/Install-on-Ubuntu-22.sh ${CMAKE_CURRENT_BINARY_DIR}/appendix/Install-on-Ubuntu-22.md
COMMAND PYTHONPATH=${PROJECT_SOURCE_DIR} mkdocs build -d ${CMAKE_CURRENT_BINARY_DIR}/../site-html -f ${CMAKE_CURRENT_BINARY_DIR}/../mkdocs.yml
)
ADD_CUSTOM_TARGET(serve-doc
COMMAND PYTHONPATH=${PROJECT_SOURCE_DIR} mkdocs serve
WORKING_DIRECTORY ${PROJECT_BINARY_DIR}
)

View File

@@ -1,216 +0,0 @@
# Advanced installations
This page contains instructions for setting up multiple countries in
your Nominatim database. It is assumed that you have already successfully
installed the Nominatim software itself, if not return to the
[installation page](Installation.md).
## Importing multiple regions (without updates)
To import multiple regions in your database you can simply give multiple
OSM files to the import command:
```
nominatim import --osm-file file1.pbf --osm-file file2.pbf
```
If you already have imported a file and want to add another one, you can
use the add-data function to import the additional data as follows:
```
nominatim add-data --file <FILE>
nominatim refresh --postcodes
nominatim index -j <NUMBER OF THREADS>
```
Please note that adding additional data is always significantly slower than
the original import.
## Importing multiple regions (with updates)
If you want to import multiple regions _and_ be able to keep them up-to-date
with updates, then you can use the scripts provided in the `utils` directory.
These scripts will set up an `update` directory in your project directory,
which has the following structure:
```bash
update
   ├── europe
   │   ├── andorra
   │   │   └── sequence.state
   │   └── monaco
   │   └── sequence.state
   └── tmp
└── europe
├── andorra-latest.osm.pbf
└── monaco-latest.osm.pbf
```
The `sequence.state` files contain the sequence ID for each region. They will
be used by pyosmium to get updates. The `tmp` folder is used for import dump and
can be deleted once the import is complete.
### Setting up multiple regions
Create a project directory as described for the
[simple import](Import.md#creating-the-project-directory). If necessary,
you can also add an `.env` configuration with customized options. In particular,
you need to make sure that `NOMINATIM_REPLICATION_UPDATE_INTERVAL` and
`NOMINATIM_REPLICATION_RECHECK_INTERVAL` are set according to the update
interval of the extract server you use.
Copy the scripts `utils/import_multiple_regions.sh` and `utils/update_database.sh`
into the project directory.
Now customize both files as per your requirements
1. List of countries. e.g.
COUNTRIES="europe/monaco europe/andorra"
2. URL to the service providing the extracts and updates. eg:
BASEURL="https://download.geofabrik.de"
DOWNCOUNTRYPOSTFIX="-latest.osm.pbf"
5. Followup in the update script can be set according to your installation.
E.g. for Photon,
FOLLOWUP="curl http://localhost:2322/nominatim-update"
will handle the indexing.
To start the initial import, change into the project directory and run
```
bash import_multiple_regions.sh
```
### Updating the database
Change into the project directory and run the following command:
bash update_database.sh
This will get diffs from the replication server, import diffs and index
the database. The default replication server in the
script([Geofabrik](https://download.geofabrik.de)) provides daily updates.
## Using an external PostgreSQL database
You can install Nominatim using a database that runs on a different server when
you have physical access to the file system on the other server. Nominatim
uses a custom normalization library that needs to be made accessible to the
PostgreSQL server. This section explains how to set up the normalization
library.
!!! note
The external module is only needed when using the legacy tokenizer.
If you have chosen the ICU tokenizer, then you can ignore this section
and follow the standard import documentation.
### Option 1: Compiling the library on the database server
The most sure way to get a working library is to compile it on the database
server. From the prerequisites you need at least cmake, gcc and the
PostgreSQL server package.
Clone or unpack the Nominatim source code, enter the source directory and
create and enter a build directory.
```sh
cd Nominatim
mkdir build
cd build
```
Now configure cmake to only build the PostgreSQL module and build it:
```
cmake -DBUILD_IMPORTER=off -DBUILD_API=off -DBUILD_TESTS=off -DBUILD_DOCS=off -DBUILD_OSM2PGSQL=off ..
make
```
When done, you find the normalization library in `build/module/nominatim.so`.
Copy it to a place where it is readable and executable by the PostgreSQL server
process.
### Option 2: Compiling the library on the import machine
You can also compile the normalization library on the machine from where you
run the import.
!!! important
You can only do this when the database server and the import machine have
the same architecture and run the same version of Linux. Otherwise there is
no guarantee that the compiled library is compatible with the PostgreSQL
server running on the database server.
Make sure that the PostgreSQL server package is installed on the machine
**with the same version as on the database server**. You do not need to install
the PostgreSQL server itself.
Download and compile Nominatim as per standard instructions. Once done, you find
the normalization library in `build/module/nominatim.so`. Copy the file to
the database server at a location where it is readable and executable by the
PostgreSQL server process.
### Running the import
On the client side you now need to configure the import to point to the
correct location of the library **on the database server**. Add the following
line to your your `.env` file:
```php
NOMINATIM_DATABASE_MODULE_PATH="<directory on the database server where nominatim.so resides>"
```
Now change the `NOMINATIM_DATABASE_DSN` to point to your remote server and continue
to follow the [standard instructions for importing](Import.md).
## Moving the database to another machine
For some configurations it may be useful to run the import on one machine, then
move the database to another machine and run the Nominatim service from there.
For example, you might want to use a large machine to be able to run the import
quickly but only want a smaller machine for production because there is not so
much load. Or you might want to do the import once and then replicate the
database to many machines.
The important thing to keep in mind when transferring the Nominatim installation
is that you need to transfer the database _and the project directory_. Both
parts are essential for your installation.
The Nominatim database can be transferred using the `pg_dump`/`pg_restore` tool.
Make sure to use the same version of PostgreSQL and PostGIS on source and
target machine.
!!! note
Before creating a dump of your Nominatim database, consider running
`nominatim freeze` first. Your database looses the ability to receive further
data updates but the resulting database is only about a third of the size
of a full database.
Next install Nominatim on the target machine by following the standard installation
instructions. Again, make sure to use the same version as the source machine.
Create a project directory on your destination machine and set up the `.env`
file to match the configuration on the source machine. Finally run
nominatim refresh --website
to make sure that the local installation of Nominatim will be used.
If you are using the legacy tokenizer you might also have to switch to the
PostgreSQL module that was compiled on your target machine. If you get errors
that PostgreSQL cannot find or access `nominatim.so` then rerun
nominatim refresh --functions
on the target machine to update the the location of the module.

View File

@@ -1,148 +0,0 @@
# Deploying Nominatim
The Nominatim API is implemented as a PHP application. The `website/` directory
in the project directory contains the configured website. You can serve this
in a production environment with any web server that is capable to run
PHP scripts.
This section gives a quick overview on how to configure Apache and Nginx to
serve Nominatim. It is not meant as a full system administration guide on how
to run a web service. Please refer to the documentation of
[Apache](http://httpd.apache.org/docs/current/) and
[Nginx](https://nginx.org/en/docs/)
for background information on configuring the services.
!!! Note
Throughout this page, we assume that your Nominatim project directory is
located in `/srv/nominatim-project` and that you have installed Nominatim
using the default installation prefix `/usr/local`. If you have put it
somewhere else, you need to adjust the commands and configuration
accordingly.
We further assume that your web server runs as user `www-data`. Older
versions of CentOS may still use the user name `apache`. You also need
to adapt the instructions in this case.
## Making the website directory accessible
You need to make sure that the `website` directory is accessible for the
web server user. You can check that the permissions are correct by accessing
on of the php files as the web server user:
``` sh
sudo -u www-data head -n 1 /srv/nominatim-project/website/search.php
```
If this shows a permission error, then you need to adapt the permissions of
each directory in the path so that it is executable for `www-data`.
If you have SELinux enabled, further adjustments may be necessary to give the
web server access. At a minimum the following SELinux labelling should be done
for Nominatim:
``` sh
sudo semanage fcontext -a -t httpd_sys_content_t "/usr/local/nominatim/lib/lib-php(/.*)?"
sudo semanage fcontext -a -t httpd_sys_content_t "/srv/nominatim-project/website(/.*)?"
sudo semanage fcontext -a -t lib_t "/srv/nominatim-project/module/nominatim.so"
sudo restorecon -R -v /usr/local/lib/nominatim
sudo restorecon -R -v /srv/nominatim-project
```
## Nominatim with Apache
### Installing the required packages
With Apache you can use the PHP module to run Nominatim.
Under Ubuntu/Debian install them with:
``` sh
sudo apt install apache2 libapache2-mod-php
```
### Configuring Apache
Make sure your Apache configuration contains the required permissions for the
directory and create an alias:
``` apache
<Directory "/srv/nominatim-project/website">
Options FollowSymLinks MultiViews
AddType text/html .php
DirectoryIndex search.php
Require all granted
</Directory>
Alias /nominatim /srv/nominatim-project/website
```
After making changes in the apache config you need to restart apache.
The website should now be available on `http://localhost/nominatim`.
## Nominatim with Nginx
### Installing the required packages
Nginx has no built-in PHP interpreter. You need to use php-fpm as a daemon for
serving PHP cgi.
On Ubuntu/Debian install nginx and php-fpm with:
``` sh
sudo apt install nginx php-fpm
```
### Configure php-fpm and Nginx
By default php-fpm listens on a network socket. If you want it to listen to a
Unix socket instead, change the pool configuration
(`/etc/php/<php version>/fpm/pool.d/www.conf`) as follows:
``` ini
; Replace the tcp listener and add the unix socket
listen = /var/run/php-fpm-nominatim.sock
; Ensure that the daemon runs as the correct user
listen.owner = www-data
listen.group = www-data
listen.mode = 0666
```
Tell nginx that php files are special and to fastcgi_pass to the php-fpm
unix socket by adding the location definition to the default configuration.
``` nginx
root /srv/nominatim-project/website;
index search.php;
location / {
try_files $uri $uri/ @php;
}
location @php {
fastcgi_param SCRIPT_FILENAME "$document_root$uri.php";
fastcgi_param PATH_TRANSLATED "$document_root$uri.php";
fastcgi_param QUERY_STRING $args;
fastcgi_pass unix:/var/run/php-fpm-nominatim.sock;
fastcgi_index index.php;
include fastcgi_params;
}
location ~ [^/]\.php(/|$) {
fastcgi_split_path_info ^(.+?\.php)(/.*)$;
if (!-f $document_root$fastcgi_script_name) {
return 404;
}
fastcgi_pass unix:/var/run/php-fpm-nominatim.sock;
fastcgi_index search.php;
include fastcgi.conf;
}
```
Restart the nginx and php-fpm services and the website should now be available
at `http://localhost/`.
## Nominatim with other webservers
Users have created instructions for other webservers:
* [Caddy](https://github.com/osm-search/Nominatim/discussions/2580)

View File

@@ -1,190 +0,0 @@
# Troubleshooting Nominatim Installations
## Installation Issues
### Can a stopped/killed import process be resumed?
"I accidentally killed the import process after it has been running for many hours. Can it be resumed?"
It is possible if the import already got to the indexing stage.
Check the last line of output that was logged before the process
was killed. If it looks like this:
Done 844 in 13 @ 64.923080 per second - Rank 26 ETA (seconds): 7.886255
then you can resume with the following command:
```sh
nominatim import --continue indexing
```
If the reported rank is 26 or higher, you can also safely add `--index-noanalyse`.
### PostgreSQL crashed "invalid page in block"
Usually serious problem, can be a hardware issue, not all data written to disc
for example. Check PostgreSQL log file and search PostgreSQL issues/mailing
list for hints.
If it happened during index creation you can try rerunning the step with
```sh
nominatim import --continue indexing
```
Otherwise it's best to start the full setup from the beginning.
### PHP "open_basedir restriction in effect" warnings
PHP Warning: file_get_contents(): open_basedir restriction in effect.
You need to adjust the
[open_basedir](https://www.php.net/manual/en/ini.core.php#ini.open-basedir)
setting in your PHP configuration (`php.ini` file). By default this setting may
look like this:
open_basedir = /srv/http/:/home/:/tmp/:/usr/share/pear/
Either add reported directories to the list or disable this setting temporarily
by adding ";" at the beginning of the line. Don't forget to enable this setting
again once you are done with the PHP command line operations.
### PHP timezeone warnings
The Apache log may contain lots of PHP warnings like this:
`PHP Warning: date_default_timezone_set() function.`
You should set the default time zone as instructed in the warning in
your `php.ini` file. Find the entry about timezone and set it to
something like this:
; Defines the default timezone used by the date functions
; https://php.net/date.timezone
date.timezone = 'America/Denver'
Or
```
echo "date.timezone = 'America/Denver'" > /etc/php.d/timezone.ini
```
### nominatim.so version mismatch
When running the import you may get a version mismatch:
`COPY_END for place failed: ERROR: incompatible library "/srv/Nominatim/nominatim/build/module/nominatim.so": version mismatch`
pg_config seems to use bad includes sometimes when multiple versions
of PostgreSQL are available in the system. Make sure you remove the
server development libraries (`postgresql-server-dev-13` on Ubuntu)
and recompile (`cmake .. && make`).
### I see the error "ERROR: permission denied for language c"
`nominatim.so`, written in C, is required to be installed on the database
server. Some managed database (cloud) services like Amazon RDS do not allow
this. There is currently no work-around other than installing a database
on a non-managed machine.
### I see the error: "function transliteration(text) does not exist"
Reinstall the nominatim functions with `nominatim refresh --functions`
and check for any errors, e.g. a missing `nominatim.so` file.
### I see the error: "ERROR: mmap (remap) failed"
This may be a simple out-of-memory error. Try reducing the memory used
for `--osm2pgsql-cache`. Also make sure that overcommitting memory is
allowed: `cat /proc/sys/vm/overcommit_memory` should print 0 or 1.
If you are using a flatnode file, then it may also be that the underlying
filesystem does not fully support 'mmap'. A notable candidate is virtualbox's
vboxfs.
### nominatim UPDATE failed: ERROR: buffer 179261 is not owned by resource owner Portal
Several users [reported this](https://github.com/openstreetmap/Nominatim/issues/1168)
during the initial import of the database. It's
something PostgreSQL internal Nominatim doesn't control. And PostgreSQL forums
suggest it's threading related but definitely some kind of crash of a process.
Users reported either rebooting the server, different hardware or just trying
the import again worked.
### The website shows: "Could not get word tokens"
The server cannot access your database. Add `&debug=1` to your URL
to get the full error message.
### Website reports "DB Error: insufficient permissions"
The user the webserver, e.g. Apache, runs under needs to have access to the
Nominatim database. You can find the user like
[this](https://serverfault.com/questions/125865/finding-out-what-user-apache-is-running-as),
for default Ubuntu operating system for example it's `www-data`.
1. Repeat the `createuser` step of the installation instructions.
2. Give the user permission to existing tables
```
GRANT usage ON SCHEMA public TO "www-data";
GRANT SELECT ON ALL TABLES IN SCHEMA public TO "www-data";
```
### Website reports "Could not load library "nominatim.so"
Example error message
```
SELECT make_standard_name('3039 E MEADOWLARK LN') [nativecode=ERROR: could not
load library "/srv/nominatim/Nominatim-3.1.0/build/module/nominatim.so":
/srv/nominatim/Nominatim-3.1.0/build/module/nominatim.so: cannot open shared
object file: Permission denied
CONTEXT: PL/pgSQL function make_standard_name(text) line 5 at assignment]
```
The PostgreSQL database, i.e. user `postgres`, needs to have access to that file.
The permission need to be read & executable by everybody, but not writeable
by everybody, e.g.
```
-rwxr-xr-x 1 nominatim nominatim 297984 build/module/nominatim.so
```
Try `chmod a+r nominatim.so; chmod a+x nominatim.so`.
When you recently updated your operating system, updated PostgreSQL to
a new version or moved files (e.g. the build directory) you should
recreate `nominatim.so`. Try
```
cd build
rm -r module/
cmake $main_Nominatim_path && make
```
### Setup.php fails with "DB Error: extension not found"
Make sure you have the PostgreSQL extensions "hstore" and "postgis" installed.
See the installation instructions for a full list of required packages.
### I forgot to delete the flatnodes file before starting an import.
That's fine. For each import the flatnodes file get overwritten.
See [https://help.openstreetmap.org/questions/52419/nominatim-flatnode-storage](https://help.openstreetmap.org/questions/52419/nominatim-flatnode-storage)
for more information.
## Running your own instance
### Can I import negative OSM ids into Nominatim?
See [this question of Stackoverflow](https://help.openstreetmap.org/questions/64662/nominatim-flatnode-with-negative-id).

View File

@@ -1,288 +0,0 @@
# Importing the Database
The following instructions explain how to create a Nominatim database
from an OSM planet file. It is assumed that you have already successfully
installed the Nominatim software itself and the `nominatim` tool can be found
in your `PATH`. If this is not the case, return to the
[installation page](Installation.md).
## Creating the project directory
Before you start the import, you should create a project directory for your
new database installation. This directory receives all data that is related
to a single Nominatim setup: configuration, extra data, etc. Create a project
directory apart from the Nominatim software and change into the directory:
```
mkdir ~/nominatim-planet
cd ~/nominatim-planet
```
In the following, we refer to the project directory as `$PROJECT_DIR`. To be
able to copy&paste instructions, you can export the appropriate variable:
```
export PROJECT_DIR=~/nominatim-planet
```
The Nominatim tool assumes per default that the current working directory is
the project directory but you may explicitly state a different directory using
the `--project-dir` parameter. The following instructions assume that you run
all commands from the project directory.
!!! tip "Migration Tip"
Nominatim used to be run directly from the build directory until version 3.6.
Essentially, the build directory functioned as the project directory
for the database installation. This setup still works and can be useful for
development purposes. It is not recommended anymore for production setups.
Create a project directory that is separate from the Nominatim software.
### Configuration setup in `.env`
The Nominatim server can be customized via an `.env` configuration file in the
project directory. This is a file in [dotenv](https://github.com/theskumar/python-dotenv)
format which looks the same as variable settings in a standard shell environment.
You can also set the same configuration via environment variables. All
settings have a `NOMINATIM_` prefix to avoid conflicts with other environment
variables.
There are lots of configuration settings you can tweak. A full reference
can be found in the chapter [Configuration Settings](../customize/Settings.md).
Most should have a sensible default.
#### Flatnode files
If you plan to import a large dataset (e.g. Europe, North America, planet),
you should also enable flatnode storage of node locations. With this
setting enabled, node coordinates are stored in a simple file instead
of the database. This will save you import time and disk storage.
Add to your `.env`:
NOMINATIM_FLATNODE_FILE="/path/to/flatnode.file"
Replace the second part with a suitable path on your system and make sure
the directory exists. There should be at least 75GB of free space.
## Downloading additional data
### Wikipedia/Wikidata rankings
Wikipedia can be used as an optional auxiliary data source to help indicate
the importance of OSM features. Nominatim will work without this information
but it will improve the quality of the results if this is installed.
This data is available as a binary download. Put it into your project directory:
cd $PROJECT_DIR
wget https://nominatim.org/data/wikimedia-importance.sql.gz
The file is about 400MB and adds around 4GB to the Nominatim database.
!!! tip
If you forgot to download the wikipedia rankings, then you can
also add importances after the import. Download the SQL files, then
run `nominatim refresh --wiki-data --importance`. Updating
importances for a planet will take a couple of hours.
### External postcodes
Nominatim can use postcodes from an external source to improve searching with
postcodes. We provide precomputed postcodes sets for the US (using TIGER data)
and the UK (using the [CodePoint OpenData set](https://osdatahub.os.uk/downloads/open/CodePointOpen).
This data can be optionally downloaded into the project directory:
cd $PROJECT_DIR
wget https://nominatim.org/data/gb_postcodes.csv.gz
wget https://nominatim.org/data/us_postcodes.csv.gz
You can also add your own custom postcode sources, see
[Customization of postcodes](../customize/Postcodes.md).
## Choosing the data to import
In its default setup Nominatim is configured to import the full OSM data
set for the entire planet. Such a setup requires a powerful machine with
at least 64GB of RAM and around 900GB of SSD hard disks. Depending on your
use case there are various ways to reduce the amount of data imported. This
section discusses these methods. They can also be combined.
### Using an extract
If you only need geocoding for a smaller region, then precomputed OSM extracts
are a good way to reduce the database size and import time.
[Geofabrik](https://download.geofabrik.de) offers extracts for most countries.
They even have daily updates which can be used with the update process described
[in the next section](Update.md). There are also
[other providers for extracts](https://wiki.openstreetmap.org/wiki/Planet.osm#Downloading).
Please be aware that some extracts are not cut exactly along the country
boundaries. As a result some parts of the boundary may be missing which means
that Nominatim cannot compute the areas for some administrative areas.
### Dropping Data Required for Dynamic Updates
About half of the data in Nominatim's database is not really used for serving
the API. It is only there to allow the data to be updated from the latest
changes from OSM. For many uses these dynamic updates are not really required.
If you don't plan to apply updates, you can run the import with the
`--no-updates` parameter. This will drop the dynamic part of the database as
soon as it is not required anymore.
You can also drop the dynamic part later using the following command:
```
nominatim freeze
```
Note that you still need to provide for sufficient disk space for the initial
import. So this option is particularly interesting if you plan to transfer the
database or reuse the space later.
!!! warning
The data structure for updates are also required when adding additional data
after the import, for example [TIGER housenumber data](../customize/Tiger.md).
If you plan to use those, you must not use the `--no-updates` parameter.
Do a normal import, add the external data and once you are done with
everything run `nominatim freeze`.
### Reverse-only Imports
If you only want to use the Nominatim database for reverse lookups or
if you plan to use the installation only for exports to a
[photon](https://photon.komoot.io/) database, then you can set up a database
without search indexes. Add `--reverse-only` to your setup command above.
This saves about 5% of disk space.
### Filtering Imported Data
Nominatim normally sets up a full search database containing administrative
boundaries, places, streets, addresses and POI data. There are also other
import styles available which only read selected data:
* **admin**
Only import administrative boundaries and places.
* **street**
Like the admin style but also adds streets.
* **address**
Import all data necessary to compute addresses down to house number level.
* **full**
Default style that also includes points of interest.
* **extratags**
Like the full style but also adds most of the OSM tags into the extratags
column.
The style can be changed with the configuration `NOMINATIM_IMPORT_STYLE`.
To give you an idea of the impact of using the different styles, the table
below gives rough estimates of the final database size after import of a
2020 planet and after using the `--drop` option. It also shows the time
needed for the import on a machine with 64GB RAM, 4 CPUS and NVME disks.
Note that the given sizes are just an estimate meant for comparison of
style requirements. Your planet import is likely to be larger as the
OSM data grows with time.
style | Import time | DB size | after drop
----------|--------------|------------|------------
admin | 4h | 215 GB | 20 GB
street | 22h | 440 GB | 185 GB
address | 36h | 545 GB | 260 GB
full | 54h | 640 GB | 330 GB
extratags | 54h | 650 GB | 340 GB
You can also customize the styles further.
A [description of the style format](../customize/Import-Styles.md)
can be found in the customization guide.
## Initial import of the data
!!! danger "Important"
First try the import with a small extract, for example from
[Geofabrik](https://download.geofabrik.de).
Download the data to import. Then issue the following command
from the **project directory** to start the import:
```sh
nominatim import --osm-file <data file> 2>&1 | tee setup.log
```
The **project directory** is the one that you have set up at the beginning.
See [creating the project directory](#creating-the-project-directory).
### Notes on full planet imports
Even on a perfectly configured machine
the import of a full planet takes around 2 days. Once you see messages
with `Rank .. ETA` appear, the indexing process has started. This part takes
the most time. There are 30 ranks to process. Rank 26 and 30 are the most complex.
They take each about a third of the total import time. If you have not reached
rank 26 after two days of import, it is worth revisiting your system
configuration as it may not be optimal for the import.
### Notes on memory usage
In the first step of the import Nominatim uses [osm2pgsql](https://osm2pgsql.org)
to load the OSM data into the PostgreSQL database. This step is very demanding
in terms of RAM usage. osm2pgsql and PostgreSQL are running in parallel at
this point. PostgreSQL blocks at least the part of RAM that has been configured
with the `shared_buffers` parameter during
[PostgreSQL tuning](Installation.md#postgresql-tuning)
and needs some memory on top of that. osm2pgsql needs at least 2GB of RAM for
its internal data structures, potentially more when it has to process very large
relations. In addition it needs to maintain a cache for node locations. The size
of this cache can be configured with the parameter `--osm2pgsql-cache`.
When importing with a flatnode file, it is best to disable the node cache
completely and leave the memory for the flatnode file. Nominatim will do this
by default, so you do not need to configure anything in this case.
For imports without a flatnode file, set `--osm2pgsql-cache` approximately to
the size of the OSM pbf file you are importing. The size needs to be given in
MB. Make sure you leave enough RAM for PostgreSQL and osm2pgsql as mentioned
above. If the system starts swapping or you are getting out-of-memory errors,
reduce the cache size or even consider using a flatnode file.
### Testing the installation
Run this script to verify that all required tables and indices got created
successfully.
```sh
nominatim admin --check-database
```
Now you can try out your installation by running:
```sh
nominatim serve
```
This runs a small test server normally used for development. You can use it
to verify that your installation is working. Go to
`http://localhost:8088/status.php` and you should see the message `OK`.
You can also run a search query, e.g. `http://localhost:8088/search.php?q=Berlin`.
Note that search query is not supported for reverse-only imports. You can run a
reverse query, e.g. `http://localhost:8088/reverse.php?lat=27.1750090510034&lon=78.04209025`.
To run Nominatim via webservers like Apache or nginx, please read the
[Deployment chapter](Deployment.md).
## Adding search through category phrases
If you want to be able to search for places by their type through
[special phrases](https://wiki.openstreetmap.org/wiki/Nominatim/Special_Phrases)
you also need to import these key phrases like this:
```sh
nominatim special-phrases --import-from-wiki
```
Note that this command downloads the phrases from the wiki link above. You
need internet access for the step.
You can also import special phrases from a csv file, for more
information please see the [Customization part](../customize/Special-Phrases.md).

View File

@@ -1,180 +0,0 @@
# Basic Installation
This page contains generic installation instructions for Nominatim and its
prerequisites. There are also step-by-step instructions available for
the following operating systems:
* [Ubuntu 22.04](../appendix/Install-on-Ubuntu-22.md)
* [Ubuntu 20.04](../appendix/Install-on-Ubuntu-20.md)
* [Ubuntu 18.04](../appendix/Install-on-Ubuntu-18.md)
These OS-specific instructions can also be found in executable form
in the `vagrant/` directory.
Users have created instructions for other frameworks. We haven't tested those
and can't offer support.
* [Docker](https://github.com/mediagis/nominatim-docker)
* [Docker on Kubernetes](https://github.com/peter-evans/nominatim-k8s)
* [Kubernetes with Helm](https://github.com/robjuz/helm-charts/blob/master/charts/nominatim/README.md)
* [Ansible](https://github.com/synthesio/infra-ansible-nominatim)
## Prerequisites
### Software
!!! Warning
For larger installations you **must have** PostgreSQL 11+ and PostGIS 3+
otherwise import and queries will be slow to the point of being unusable.
Query performance has marked improvements with PostgreSQL 13+ and PostGIS 3.2+.
For compiling:
* [cmake](https://cmake.org/)
* [expat](https://libexpat.github.io/)
* [proj](https://proj.org/)
* [bzip2](http://www.bzip.org/)
* [zlib](https://www.zlib.net/)
* [ICU](http://site.icu-project.org/)
* [Boost libraries](https://www.boost.org/), including system and filesystem
* PostgreSQL client libraries
* a recent C++ compiler (gcc 5+ or Clang 3.8+)
For running Nominatim:
* [PostgreSQL](https://www.postgresql.org) (9.6+ will work, 11+ strongly recommended)
* [PostGIS](https://postgis.net) (2.2+ will work, 3.0+ strongly recommended)
* [Python 3](https://www.python.org/) (3.6+)
* [Psycopg2](https://www.psycopg.org) (2.7+)
* [Python Dotenv](https://github.com/theskumar/python-dotenv)
* [psutil](https://github.com/giampaolo/psutil)
* [Jinja2](https://palletsprojects.com/p/jinja/)
* [PyICU](https://pypi.org/project/PyICU/)
* [PyYaml](https://pyyaml.org/) (5.1+)
* [datrie](https://github.com/pytries/datrie)
* [PHP](https://php.net) (7.0 or later)
* PHP-pgsql
* PHP-intl (bundled with PHP)
* PHP-cgi (for running queries from the command line)
For running continuous updates:
* [pyosmium](https://osmcode.org/pyosmium/)
For dependencies for running tests and building documentation, see
the [Development section](../develop/Development-Environment.md).
### Hardware
A minimum of 2GB of RAM is required or installation will fail. For a full
planet import 128GB of RAM or more are strongly recommended. Do not report
out of memory problems if you have less than 64GB RAM.
For a full planet install you will need at least 1TB of hard disk space.
Take into account that the OSM database is growing fast.
Fast disks are essential. Using NVME disks is recommended.
Even on a well configured machine the import of a full planet takes
around 2 days. On traditional spinning disks, 7-8 days are more realistic.
## Tuning the PostgreSQL database
You might want to tune your PostgreSQL installation so that the later steps
make best use of your hardware. You should tune the following parameters in
your `postgresql.conf` file.
shared_buffers = 2GB
maintenance_work_mem = (10GB)
autovacuum_work_mem = 2GB
work_mem = (50MB)
effective_cache_size = (24GB)
synchronous_commit = off
max_wal_size = 1GB
checkpoint_timeout = 10min
checkpoint_completion_target = 0.9
The numbers in brackets behind some parameters seem to work fine for
64GB RAM machine. Adjust to your setup. A higher number for `max_wal_size`
means that PostgreSQL needs to run checkpoints less often but it does require
the additional space on your disk.
Autovacuum must not be switched off because it ensures that the
tables are frequently analysed. If your machine has very little memory,
you might consider setting:
autovacuum_max_workers = 1
and even reduce `autovacuum_work_mem` further. This will reduce the amount
of memory that autovacuum takes away from the import process.
For the initial import, you should also set:
fsync = off
full_page_writes = off
Don't forget to re-enable them after the initial import or you risk database
corruption.
## Downloading and building Nominatim
### Downloading the latest release
You can download the [latest release from nominatim.org](https://nominatim.org/downloads/).
The release contains all necessary files. Just unpack it.
### Downloading the latest development version
If you want to install latest development version from github, make sure to
also check out the osm2pgsql subproject:
```
git clone --recursive https://github.com/openstreetmap/Nominatim.git
```
The development version does not include the country grid. Download it separately:
```
wget -O Nominatim/data/country_osm_grid.sql.gz https://nominatim.org/data/country_grid.sql.gz
```
### Building Nominatim
The code must be built in a separate directory. Create the directory and
change into it.
```
mkdir build
cd build
```
Nominatim uses cmake and make for building. Assuming that you have created the
build at the same level as the Nominatim source directory run:
```
cmake ../Nominatim
make
sudo make install
```
!!! warning
The default installation no longer compiles the PostgreSQL module that
is needed for the legacy tokenizer from older Nominatim versions. If you
are upgrading an older database or want to run the
[legacy tokenizer](../customize/Tokenizers.md#legacy-tokenizer) for
some other reason, you need to enable the PostgreSQL module via
cmake: `cmake -DBUILD_MODULE=on ../Nominatim`. To compile the module
you need to have the server development headers for PostgreSQL installed.
On Ubuntu/Debian run: `sudo apt install postgresql-server-dev-<postgresql version>`
Nominatim installs itself into `/usr/local` per default. To choose a different
installation directory add `-DCMAKE_INSTALL_PREFIX=<install root>` to the
cmake command. Make sure that the `bin` directory is available in your path
in that case, e.g.
```
export PATH=<install root>/bin:$PATH
```
Now continue with [importing the database](Import.md).

View File

@@ -1,75 +0,0 @@
This chapter describes the various operations the Nominatim database administrator
may use to clean and maintain the database. None of these operations is mandatory
but they may help improve the performance and accuracy of results.
## Updating postcodes
Command: `nominatim refresh --postcodes`
Postcode centroids (aka 'calculated postcodes') are generated by looking at all
postcodes of a country, grouping them and calculating the geometric centroid.
There is currently no logic to deal with extreme outliers (typos or other
mistakes in OSM data). There is also no check if a postcodes adheres to a
country's format, e.g. if Swiss postcodes are 4 digits.
When running regular updates, postcodes results can be improved by running
this command on a regular basis. Note that only the postcode table and the
postcode search terms are updated. The postcode that is assigned to each place
is only updated when the place is updated.
The command takes around 70min to run on the planet and needs ca. 40GB of
temporary disk space.
## Updating word counts
Command: `nominatim refresh --word-counts`
Nominatim keeps frequency statistics about all search terms it indexes. These
statistics are currently used to optimise queries to the database. Thus better
statistics mean better performance. Word counts are created once after import
and are usually sufficient even when running regular updates. You might want
to rerun the statistics computation when adding larger amounts of new data,
for example, when adding an additional country via `nominatim add-data`.
## Forcing recomputation of places and areas
Command: `nominatim refresh --data-object [NWR]<id> --data-area [NWR]<id>`
When running replication updates, Nominatim tries to recompute the search
and address information for all places that are affected by a change. But it
needs to restrict the total number of changes to make sure it can keep up
with the minutely updates. Therefore it will refrain from propagating changes
that affect a lot of objects.
The administrator may force an update of places in the database.
`nominatim refresh --data-object` invalidates a single OSM object.
`nominatim refresh --data-area` invalidates an OSM object and all dependent
objects. That are usually the places that inside its area or around the
center of the object. Both commands expect the OSM object as an argument
of the form OSM type + OSM id. The type must be `N` (node), `W` (way) or
`R` (relation).
After invalidating the object, indexing must be run again. If continuous
update are running in the background, the objects will be recomputed together
with the next round of updates. Otherwise you need to run `nominatim index`
to finish the recomputation.
## Removing large deleted objects
Nominatim refuses to delete very large areas because often these deletions are
accidental and are reverted within hours. Instead the deletions are logged in
the `import_polygon_delete` table and left to the administrator to clean up.
There is currently no command to do that. You can use the following SQL
query to force a deletion on all objects that have been deleted more than
a certain timespan ago (here: 1 month):
```sql
SELECT place_force_delete(p.place_id) FROM import_polygon_delete d, placex p
WHERE p.osm_type = d.osm_type and p.osm_id = d.osm_id
and age(p.indexed_date) > '1 month'::interval
```

View File

@@ -1,398 +0,0 @@
# Database Migrations
Since version 3.7.0 Nominatim offers automatic migrations. Please follow
the following steps:
* stop any updates that are potentially running
* update Nominatim to the newer version
* go to your project directory and run `nominatim admin --migrate`
* (optionally) restart updates
Below you find additional migrations and hints about other structural and
breaking changes. **Please read them before running the migration.**
!!! note
If you are migrating from a version <3.6, then you still have to follow
the manual migration steps up to 3.6.
## 4.2.2 -> 4.2.3
### Update interpolation functions
When updating to this release, you need to run `nominatim refresh --functions`
after updating and before restarting updates. Otherwise you may see an error
`Splitting of Point geometries is unsupported` or similar.
## 4.0.0 -> 4.1.0
### ICU tokenizer is the new default
Nominatim now installs the [ICU tokenizer](../customize/Tokenizers.md#icu-tokenizer)
by default. This only has an effect on newly installed databases. When
updating older databases, it keeps its installed tokenizer. If you still
run with the legacy tokenizer, make sure to compile Nominatim with the
PostgreSQL module, see [Installation](Installation.md#building-nominatim).
### geocodejson output changed
The `type` field of the geocodejson output has changed. It now contains
the address class of the object instead of the value of the OSM tag. If
your client has used the `type` field, switch them to read `osm_value`
instead.
## 3.7.0 -> 4.0.0
### NOMINATIM_PHRASE_CONFIG removed
Custom blacklist configurations for special phrases now need to be handed
with the `--config` parameter to `nominatim special-phrases`. Alternatively
you can put your custom configuration in the project directory in a file
named `phrase-settings.json`.
Version 3.8 also removes the automatic converter for the php format of
the configuration in older versions. If you are updating from Nominatim < 3.7
and still work with a custom `phrase-settings.php`, you need to manually
convert it into a json format.
### PHP utils removed
The old PHP utils have now been removed completely. You need to switch to
the appropriate functions of the nominatim command line tool. See
[Introducing `nominatim` command line tool](#introducing-nominatim-command-line-tool)
below.
## 3.6.0 -> 3.7.0
### New format and name of configuration file
The configuration for an import is now saved in a `.env` file in the project
directory. This file follows the dotenv format. For more information, see
the [installation chapter](Import.md#configuration-setup-in-env).
To migrate to the new system, create a new project directory, add the `.env`
file and port your custom configuration from `settings/local.php`. Most
settings are named similar and only have received a `NOMINATIM_` prefix.
Use the default settings in `settings/env.defaults` as a reference.
### New location for data files
External data files for Wikipedia importance, postcodes etc. are no longer
expected to reside in the source tree by default. Instead they will be searched
in the project directory. If you have an automated setup script you must
either adapt the download location or explicitly set the location of the
files to the old place in your `.env`.
### Introducing `nominatim` command line tool
The various php utilities have been replaced with a single `nominatim`
command line tool. Make sure to adapt any scripts. There is no direct 1:1
matching between the old utilities and the commands of nominatim CLI. The
following list gives you a list of nominatim sub-commands that contain
functionality of each script:
* ./utils/setup.php: `import`, `freeze`, `refresh`
* ./utils/update.php: `replication`, `add-data`, `index`, `refresh`
* ./utils/specialphrases.php: `special-phrases`
* ./utils/check_import_finished.php: `admin`
* ./utils/warm.php: `admin`
* ./utils/export.php: `export`
Try `nominatim <command> --help` for more information about each subcommand.
`./utils/query.php` no longer exists in its old form. `nominatim search`
provides a replacement but returns different output.
### Switch to normalized house numbers
The housenumber column in the placex table uses now normalized version.
The automatic migration step will convert the column but this may take a
very long time. It is advisable to take the machine offline while doing that.
## 3.5.0 -> 3.6.0
### Change of layout of search_name_* tables
The table need a different index for nearest place lookup. Recreate the
indexes using the following shell script:
```bash
for table in `psql -d nominatim -c "SELECT tablename FROM pg_tables WHERE tablename LIKE 'search_name_%'" -tA | grep -v search_name_blank`;
do
psql -d nominatim -c "DROP INDEX idx_${table}_centroid_place; CREATE INDEX idx_${table}_centroid_place ON ${table} USING gist (centroid) WHERE ((address_rank >= 2) AND (address_rank <= 25)); DROP INDEX idx_${table}_centroid_street; CREATE INDEX idx_${table}_centroid_street ON ${table} USING gist (centroid) WHERE ((address_rank >= 26) AND (address_rank <= 27))";
done
```
### Removal of html output
The debugging UI is no longer directly provided with Nominatim. Instead we
now provide a simple Javascript application. Please refer to
[Setting up the Nominatim UI](Setup-Nominatim-UI.md) for details on how to
set up the UI.
The icons served together with the API responses have been moved to the
nominatim-ui project as well. If you want to keep the `icon` field in the
response, you need to set `CONST_MapIcon_URL` to the URL of the `/mapicon`
directory of nominatim-ui.
### Change order during indexing
When reindexing places during updates, there is now a different order used
which needs a different database index. Create it with the following SQL command:
```sql
CREATE INDEX idx_placex_pendingsector_rank_address
ON placex
USING BTREE (rank_address, geometry_sector)
WHERE indexed_status > 0;
```
You can then drop the old index with:
```sql
DROP INDEX idx_placex_pendingsector;
```
### Unused index
This index has been unused ever since the query using it was changed two years ago. Saves about 12GB on a planet installation.
```sql
DROP INDEX idx_placex_geometry_reverse_lookupPoint;
```
### Switching to dotenv
As part of the work changing the configuration format, the configuration for
the website is now using a separate configuration file. To create the
configuration file, run the following command after updating:
```sh
./utils/setup.php --setup-website
```
### Update SQL code
To update the SQL code to the leatest version run:
```
./utils/setup.php --create-functions --enable-diff-updates --create-partition-functions
```
## 3.4.0 -> 3.5.0
### New Wikipedia/Wikidata importance tables
The `wikipedia_*` tables have a new format that also includes references to
Wikidata. You need to update the computation functions and the tables as
follows:
* download the new Wikipedia tables as described in the import section
* reimport the tables: `./utils/setup.php --import-wikipedia-articles`
* update the functions: `./utils/setup.php --create-functions --enable-diff-updates`
* create a new lookup index:
```sql
CREATE INDEX idx_placex_wikidata
ON placex
USING BTREE ((extratags -> 'wikidata'))
WHERE extratags ? 'wikidata'
AND class = 'place'
AND osm_type = 'N'
AND rank_search < 26;
```
* compute importance: `./utils/update.php --recompute-importance`
The last step takes about 10 hours on the full planet.
Remove one function (it will be recreated in the next step):
```sql
DROP FUNCTION create_country(hstore,character varying);
```
Finally, update all SQL functions:
```sh
./utils/setup.php --create-functions --enable-diff-updates --create-partition-functions
```
## 3.3.0 -> 3.4.0
### Reorganisation of location_area_country table
The table `location_area_country` has been optimized. You need to switch to the
new format when you run updates. While updates are disabled, run the following
SQL commands:
```sql
CREATE TABLE location_area_country_new AS
SELECT place_id, country_code, geometry FROM location_area_country;
DROP TABLE location_area_country;
ALTER TABLE location_area_country_new RENAME TO location_area_country;
CREATE INDEX idx_location_area_country_geometry ON location_area_country USING GIST (geometry);
CREATE INDEX idx_location_area_country_place_id ON location_area_country USING BTREE (place_id);
```
Finally, update all SQL functions:
```sh
./utils/setup.php --create-functions --enable-diff-updates --create-partition-functions
```
## 3.2.0 -> 3.3.0
### New database connection string (DSN) format
Previously database connection setting (`CONST_Database_DSN` in `settings/*.php`) had the format
* (simple) `pgsql://@/nominatim`
* (complex) `pgsql://johndoe:secret@machine1.domain.com:1234/db1`
The new format is
* (simple) `pgsql:dbname=nominatim`
* (complex) `pgsql:dbname=db1;host=machine1.domain.com;port=1234;user=johndoe;password=secret`
### Natural Earth country boundaries no longer needed as fallback
```sql
DROP TABLE country_naturalearthdata;
```
Finally, update all SQL functions:
```sh
./utils/setup.php --create-functions --enable-diff-updates --create-partition-functions
```
### Configurable Address Levels
The new configurable address levels require a new table. Create it with the
following command:
```sh
./utils/update.php --update-address-levels
```
## 3.1.0 -> 3.2.0
### New reverse algorithm
The reverse algorithm has changed and requires new indexes. Run the following
SQL statements to create the indexes:
```sql
CREATE INDEX idx_placex_geometry_reverse_lookupPoint
ON placex
USING gist (geometry)
WHERE (name IS NOT null or housenumber IS NOT null or rank_address BETWEEN 26 AND 27)
AND class NOT IN ('railway','tunnel','bridge','man_made')
AND rank_address >= 26
AND indexed_status = 0
AND linked_place_id IS null;
CREATE INDEX idx_placex_geometry_reverse_lookupPolygon
ON placex USING gist (geometry)
WHERE St_GeometryType(geometry) in ('ST_Polygon', 'ST_MultiPolygon')
AND rank_address between 4 and 25
AND type != 'postcode'
AND name is not null
AND indexed_status = 0
AND linked_place_id is null;
CREATE INDEX idx_placex_geometry_reverse_placeNode
ON placex USING gist (geometry)
WHERE osm_type = 'N'
AND rank_search between 5 and 25
AND class = 'place'
AND type != 'postcode'
AND name is not null
AND indexed_status = 0
AND linked_place_id is null;
```
You also need to grant the website user access to the `country_osm_grid` table:
```sql
GRANT SELECT ON table country_osm_grid to "www-user";
```
Replace the `www-user` with the user name of your website server if necessary.
You can now drop the unused indexes:
```sql
DROP INDEX idx_placex_reverse_geometry;
```
Finally, update all SQL functions:
```sh
./utils/setup.php --create-functions --enable-diff-updates --create-partition-functions
```
## 3.0.0 -> 3.1.0
### Postcode Table
A new separate table for artificially computed postcode centroids was introduced.
Migration to the new format is possible but **not recommended**.
Create postcode table and indexes, running the following SQL statements:
```sql
CREATE TABLE location_postcode
(place_id BIGINT, parent_place_id BIGINT, rank_search SMALLINT,
rank_address SMALLINT, indexed_status SMALLINT, indexed_date TIMESTAMP,
country_code varchar(2), postcode TEXT,
geometry GEOMETRY(Geometry, 4326));
CREATE INDEX idx_postcode_geometry ON location_postcode USING GIST (geometry);
CREATE UNIQUE INDEX idx_postcode_id ON location_postcode USING BTREE (place_id);
CREATE INDEX idx_postcode_postcode ON location_postcode USING BTREE (postcode);
GRANT SELECT ON location_postcode TO "www-data";
DROP TYPE IF EXISTS nearfeaturecentr CASCADE;
CREATE TYPE nearfeaturecentr AS (
place_id BIGINT,
keywords int[],
rank_address smallint,
rank_search smallint,
distance float,
isguess boolean,
postcode TEXT,
centroid GEOMETRY
);
```
Add postcode column to `location_area` tables with SQL statement:
```sql
ALTER TABLE location_area ADD COLUMN postcode TEXT;
```
Then reimport the functions:
```sh
./utils/setup.php --create-functions --enable-diff-updates --create-partition-functions
```
Create appropriate triggers with SQL:
```sql
CREATE TRIGGER location_postcode_before_update BEFORE UPDATE ON location_postcode
FOR EACH ROW EXECUTE PROCEDURE postcode_update();
```
Finally populate the postcode table (will take a while):
```sh
./utils/setup.php --calculate-postcodes --index --index-noanalyse
```
This will create a working database. You may also delete the old artificial
postcodes now. Note that this may be expensive and is not absolutely necessary.
The following SQL statement will remove them:
```sql
DELETE FROM place_addressline a USING placex p
WHERE a.address_place_id = p.place_id and p.osm_type = 'P';
ALTER TABLE placex DISABLE TRIGGER USER;
DELETE FROM placex WHERE osm_type = 'P';
ALTER TABLE placex ENABLE TRIGGER USER;
```

View File

@@ -1,177 +0,0 @@
# Setting up the Nominatim UI
Nominatim is a search API, it does not provide a website interface on its
own. [nominatim-ui](https://github.com/osm-search/nominatim-ui) offers a
small website for testing your setup and inspecting the database content.
This section provides a quick start how to use nominatim-ui with your
installation. For more details, please also have a look at the
[README of nominatim-ui](https://github.com/osm-search/nominatim-ui/blob/master/README.md).
## Installing nominatim-ui
We provide regular releases of nominatim-ui that contain the packaged website.
They do not need any special installation. Just download, configure
and run it. Grab the latest release from
[nominatim-ui's Github release page](https://github.com/osm-search/nominatim-ui/releases)
and unpack it. You can use `nominatim-ui-x.x.x.tar.gz` or `nominatim-ui-x.x.x.zip`.
Next you need to adapt the UI to your installation. Custom settings need to be
put into `dist/theme/config.theme.js`. At a minimum you need to
set `Nominatim_API_Endpoint` to point to your Nominatim installation:
cd nominatim-ui
echo "Nominatim_Config.Nominatim_API_Endpoint='https://myserver.org/nominatim/';" > dist/theme/config.theme.js
For the full set of available settings, have a look at `dist/config.defaults.js`.
Then you can just test it locally by spinning up a webserver in the `dist`
directory. For example, with Python:
cd nominatim-ui/dist
python3 -m http.server 8765
The website is now available at `http://localhost:8765`.
## Forwarding searches to nominatim-ui
Nominatim used to provide the search interface directly by itself when
`format=html` was requested. For all endpoints except for `/reverse` and
`/lookup` this even used to be the default.
The following section describes how to set up Apache or nginx, so that your
users are forwarded to nominatim-ui when they go to URL that formerly presented
the UI.
### Setting up forwarding in Nginx
First of all make nominatim-ui available under `/ui` on your webserver:
``` nginx
server {
# Here is the Nominatim setup as described in the Installation section
location /ui/ {
alias <full path to the nominatim-ui directory>/dist/;
index index.html;
}
}
```
Now we need to find out if a URL should be forwarded to the UI. Add the
following `map` commands *outside* the server section:
``` nginx
# Inspect the format parameter in the query arguments. We are interested
# if it is set to html or something else or if it is missing completely.
map $args $format {
default default;
~(^|&)format=html(&|$) html;
~(^|&)format= other;
}
# Determine from the URI and the format parameter above if forwarding is needed.
map $uri/$format $forward_to_ui {
default 1; # The default is to forward.
~^/ui 0; # If the URI point to the UI already, we are done.
~/other$ 0; # An explicit non-html format parameter. No forwarding.
~/reverse.*/default 0; # Reverse and lookup assume xml format when
~/lookup.*/default 0; # no format parameter is given. No forwarding.
}
```
The `$forward_to_ui` parameter can now be used to conditionally forward the
calls:
```
# When no endpoint is given, default to search.
# Need to add a rewrite so that the rewrite rules below catch it correctly.
rewrite ^/$ /search;
location @php {
# fastcgi stuff..
if ($forward_to_ui) {
rewrite ^(/[^/]*) https://yourserver.com/ui$1.html redirect;
}
}
location ~ [^/]\.php(/|$) {
# fastcgi stuff..
if ($forward_to_ui) {
rewrite (.*).php https://yourserver.com/ui$1.html redirect;
}
}
```
!!! warning
Be aware that the rewrite commands are slightly different for URIs with and
without the .php suffix.
Reload nginx and the UI should be available.
### Setting up forwarding in Apache
First of all make nominatim-ui available in the `ui/` subdirectory where
Nominatim is installed. For example, given you have set up an alias under
`nominatim` like this:
``` apache
Alias /nominatim /home/vagrant/build/website
```
you need to insert the following rules for nominatim-ui before that alias:
```
<Directory "/home/vagrant/nominatim-ui/dist">
DirectoryIndex search.html
Require all granted
</Directory>
Alias /nominatim/ui /home/vagrant/nominatim-ui/dist
```
Replace `/home/vagrant/nominatim-ui` with the directory where you have cloned
nominatim-ui.
!!! important
The alias for nominatim-ui must come before the alias for the Nominatim
website directory.
To set up forwarding, the Apache rewrite module is needed. Enable it with:
``` sh
sudo a2enmod rewrite
```
Then add rewrite rules to the `Directory` directive of the Nominatim website
directory like this:
``` apache
<Directory "/home/vagrant/build/website">
Options FollowSymLinks MultiViews
AddType text/html .php
Require all granted
RewriteEngine On
# This must correspond to the URL where nominatim can be found.
RewriteBase "/nominatim/"
# If no endpoint is given, then use search.
RewriteRule ^(/|$) "search.php"
# If format-html is explicitly requested, forward to the UI.
RewriteCond %{QUERY_STRING} "format=html"
RewriteRule ^([^/]+)(.php)? ui/$1.html [R,END]
# If no format parameter is there then forward anything
# but /reverse and /lookup to the UI.
RewriteCond %{QUERY_STRING} "!format="
RewriteCond %{REQUEST_URI} "!/lookup"
RewriteCond %{REQUEST_URI} "!/reverse"
RewriteRule ^([^/]+)(.php)? ui/$1.html [R,END]
</Directory>
```
Restart Apache and the UI should be available.

View File

@@ -1,223 +0,0 @@
# Updating the Database
There are many different ways to update your Nominatim database.
The following section describes how to keep it up-to-date using
an [online replication service for OpenStreetMap data](https://wiki.openstreetmap.org/wiki/Planet.osm/diffs)
For a list of other methods to add or update data see the output of
`nominatim add-data --help`.
!!! important
If you have configured a flatnode file for the import, then you
need to keep this flatnode file around for updates.
### Installing the newest version of Pyosmium
The replication process uses
[Pyosmium](https://docs.osmcode.org/pyosmium/latest/updating_osm_data.html)
to download update data from the server.
It is recommended to install Pyosmium via pip.
Run (as the same user who will later run the updates):
```sh
pip3 install --user osmium
```
### Setting up the update process
Next the update process needs to be initialised. By default Nominatim is configured
to update using the global minutely diffs.
If you want a different update source you will need to add some settings
to `.env`. For example, to use the daily country extracts
diffs for Ireland from Geofabrik add the following:
# base URL of the replication service
NOMINATIM_REPLICATION_URL="https://download.geofabrik.de/europe/ireland-and-northern-ireland-updates"
# How often upstream publishes diffs (in seconds)
NOMINATIM_REPLICATION_UPDATE_INTERVAL=86400
# How long to sleep if no update found yet (in seconds)
NOMINATIM_REPLICATION_RECHECK_INTERVAL=900
To set up the update process now run the following command:
nominatim replication --init
It outputs the date where updates will start. Recheck that this date is
what you expect.
The `replication --init` command needs to be rerun whenever the replication
service is changed.
### Updating Nominatim
Nominatim supports different modes how to retrieve the update data from the
server. Which one you want to use depends on your exact setup and how often you
want to retrieve updates.
These instructions are for using a single source of updates. If you have
imported multiple country extracts and want to keep them
up-to-date, [Advanced installations section](Advanced-Installations.md)
contains instructions to set up and update multiple country extracts.
#### Continuous updates
This is the easiest mode. Simply run the replication command without any
parameters:
nominatim replication
The update application keeps running forever and retrieves and applies
new updates from the server as they are published.
You can run this command as a simple systemd service. Create a service
description like that in `/etc/systemd/system/nominatim-updates.service`:
```
[Unit]
Description=Continuous updates of Nominatim
[Service]
WorkingDirectory=/srv/nominatim
ExecStart=nominatim replication
StandardOutput=append:/var/log/nominatim-updates.log
StandardError=append:/var/log/nominatim-updates.error.log
User=nominatim
Group=nominatim
Type=simple
[Install]
WantedBy=multi-user.target
```
Replace the `WorkingDirectory` with your project directory. Also adapt user
and group names as required.
Now activate the service and start the updates:
```
sudo systemctl daemon-reload
sudo systemctl enable nominatim-updates
sudo systemctl start nominatim-updates
```
#### One-time mode
When the `--once` parameter is given, then Nominatim will download exactly one
batch of updates and then exit. This one-time mode still respects the
`NOMINATIM_REPLICATION_UPDATE_INTERVAL` that you have set. If according to
the update interval no new data has been published yet, it will go to sleep
until the next expected update and only then attempt to download the next batch.
The one-time mode is particularly useful if you want to run updates continuously
but need to schedule other work in between updates. For example, the main
service at osm.org uses it, to regularly recompute postcodes -- a process that
must not be run while updates are in progress. Its update script
looks like this:
```sh
#!/bin/bash
# Switch to your project directory.
cd /srv/nominatim
while true; do
nominatim replication --once
if [ -f "/srv/nominatim/schedule-maintenance" ]; then
rm /srv/nominatim/schedule-maintenance
nominatim refresh --postcodes
fi
done
```
A cron job then creates the file `/srv/nominatim/schedule-maintenance` once per night.
##### One-time mode with systemd
You can run the one-time mode with a systemd timer & service.
Create a timer description like `/etc/systemd/system/nominatim-updates.timer`:
```
[Unit]
Description=Timer to start updates of Nominatim
[Timer]
OnActiveSec=2
OnUnitActiveSec=1min
Unit=nominatim-updates.service
[Install]
WantedBy=multi-user.target
```
And then a similar service definition: `/etc/systemd/system/nominatim-updates.service`:
```
[Unit]
Description=Single updates of Nominatim
[Service]
WorkingDirectory=/srv/nominatim
ExecStart=nominatim replication --once
StandardOutput=append:/var/log/nominatim-updates.log
StandardError=append:/var/log/nominatim-updates.error.log
User=nominatim
Group=nominatim
Type=simple
[Install]
WantedBy=multi-user.target
```
Replace the `WorkingDirectory` with your project directory. Also adapt user and
group names as required. `OnUnitActiveSec` defines how often the individual
update command is run.
Now activate the service and start the updates:
```
sudo systemctl daemon-reload
sudo systemctl enable nominatim-updates.timer
sudo systemctl start nominatim-updates.timer
```
You can stop future data updates, while allowing any current, in-progress
update steps to finish, by running `sudo systemctl stop
nominatim-updates.timer` and waiting until `nominatim-updates.service` isn't
running (`sudo systemctl is-active nominatim-updates.service`). Current output
from the update can be seen like above (`systemctl status
nominatim-updates.service`).
#### Catch-up mode
With the `--catch-up` parameter, Nominatim will immediately try to download
all changes from the server until the database is up-to-date. The catch-up mode
still respects the parameter `NOMINATIM_REPLICATION_MAX_DIFF`. It downloads and
applies the changes in appropriate batches until all is done.
The catch-up mode is foremost useful to bring the database up to speed after the
initial import. Give that the service usually is not in production at this
point, you can temporarily be a bit more generous with the batch size and
number of threads you use for the updates by running catch-up like this:
```
cd /srv/nominatim
NOMINATIM_REPLICATION_MAX_DIFF=5000 nominatim replication --catch-up --threads 15
```
The catch-up mode is also useful when you want to apply updates at a lower
frequency than what the source publishes. You can set up a cron job to run
replication catch-up at whatever interval you desire.
!!! hint
When running scheduled updates with catch-up, it is a good idea to choose
a replication source with an update frequency that is an order of magnitude
lower. For example, if you want to update once a day, use an hourly updated
source. This makes sure that you don't miss an entire day of updates when
the source is unexpectedly late to publish its update.
If you want to use the source with the same update frequency (e.g. a daily
updated source with daily updates), use the
continuous update mode. It ensures to re-request the newest update until it
is published.

View File

@@ -1,154 +0,0 @@
# Place details
Show all details about a single place saved in the database.
!!! warning
The details page exists for debugging only. You may not use it in scripts
or to automatically query details about a result.
See [Nominatim Usage Policy](https://operations.osmfoundation.org/policies/nominatim/).
## Parameters
The details API supports the following two request formats:
``` xml
https://nominatim.openstreetmap.org/details?osmtype=[N|W|R]&osmid=<value>&class=<value>
```
`osmtype` and `osmid` are required parameters. The type is one of node (N), way (W)
or relation (R). The id must be a number. The `class` parameter is optional and
allows to distinguish between entries, when the corresponding OSM object has more
than one main tag. For example, when a place is tagged with `tourism=hotel` and
`amenity=restaurant`, there will be two place entries in Nominatim, one for a
restaurant, one for a hotel. You need to specify `class=tourism` or `class=amentity`
to get exactly the one you want. If there are multiple places in the database
but the `class` parameter is left out, then one of the places will be chosen
at random and displayed.
``` xml
https://nominatim.openstreetmap.org/details?place_id=<value>
```
Place IDs are assigned sequentially during Nominatim data import. The ID
for a place is different between Nominatim installation (servers) and
changes when data gets reimported. Therefore it cannot be used as
a permanent id and shouldn't be used in bug reports.
Additional optional parameters are explained below.
### Output format
* `json_callback=<string>`
Wrap JSON output in a callback function (JSONP) i.e. `<string>(<json>)`.
* `pretty=[0|1]`
Add indentation to make it more human-readable. (Default: 0)
### Output details
* `addressdetails=[0|1]`
Include a breakdown of the address into elements. (Default: 0)
* `keywords=[0|1]`
Include a list of name keywords and address keywords (word ids). (Default: 0)
* `linkedplaces=[0|1]`
Include a details of places that are linked with this one. Places get linked
together when they are different forms of the same physical object. Nominatim
links two kinds of objects together: place nodes get linked with the
corresponding administrative boundaries. Waterway relations get linked together with their
members.
(Default: 1)
* `hierarchy=[0|1]`
Include details of places lower in the address hierarchy. (Default: 0)
* `group_hierarchy=[0|1]`
For JSON output will group the places by type. (Default: 0)
* `polygon_geojson=[0|1]`
Include geometry of result. (Default: 0)
### Language of results
* `accept-language=<browser language string>`
Preferred language order for showing result, overrides the value
specified in the "Accept-Language" HTTP header.
Either use a standard RFC2616 accept-language string or a simple
comma-separated list of language codes.
## Examples
##### JSON
[https://nominatim.openstreetmap.org/details.php?osmtype=W&osmid=38210407&format=json](https://nominatim.openstreetmap.org/details.php?osmtype=W&osmid=38210407&format=json)
```json
{
"place_id": 85993608,
"parent_place_id": 72765313,
"osm_type": "W",
"osm_id": 38210407,
"category": "place",
"type": "square",
"admin_level": "15",
"localname": "Pariser Platz",
"names": {
"name": "Pariser Platz",
"name:be": "Парыжская плошча",
"name:de": "Pariser Platz",
"name:es": "Plaza de París",
"name:he": "פאריזר פלאץ",
"name:ko": "파리저 광장",
"name:la": "Forum Parisinum",
"name:ru": "Парижская площадь",
"name:uk": "Паризька площа",
"name:zh": "巴黎廣場"
},
"addresstags": {
"postcode": "10117"
},
"housenumber": null,
"calculated_postcode": "10117",
"country_code": "de",
"indexed_date": "2018-08-18T17:02:45+00:00",
"importance": 0.339401620591472,
"calculated_importance": 0.339401620591472,
"extratags": {
"wikidata": "Q156716",
"wikipedia": "de:Pariser Platz"
},
"calculated_wikipedia": "de:Pariser_Platz",
"rank_address": 30,
"rank_search": 30,
"isarea": true,
"centroid": {
"type": "Point",
"coordinates": [
13.3786822618517,
52.5163654
]
},
"geometry": {
"type": "Point",
"coordinates": [
13.3786822618517,
52.5163654
]
}
}
```

View File

@@ -1,85 +0,0 @@
# Frequently Asked Questions
## API Results
#### 1. The address of my search results contains far-away places that don't belong there.
Nominatim computes the address from two sources in the OpenStreetMap data:
from administrative boundaries and from place nodes. Boundaries are the more
useful source. They precisely describe an area. So it is very clear for
Nominatim if a point belongs to an area or not. Place nodes are more complicated.
These are only points without any precise extent. So Nominatim has to take a
guess and assume that an address belongs to the closest place node it can find.
In an ideal world, Nominatim would not need the place nodes but there are
many places on earth where there are no precise boundaries available for
all parts that make up an address. This is in particular true for the more
local address parts, like villages and suburbs. Therefore it is not possible
to completely dismiss place nodes. And sometimes they sneak in where they
don't belong.
As a OpenStreetMap mapper, you can improve the situation in two ways: if you
see a place node for which already an administrative area exists, then you
should _link_ the two by adding the node with a 'label' role to the boundary
relation. If there is no administrative area, you can add the approximate
extent of the place and tag it place=<something> as well.
#### 2. When doing reverse search, the address details have parts that don't contain the point I was looking up.
There is a common misconception how the reverse API call works in Nominatim.
Reverse does not give you the address of the point you asked for. Reverse
returns the closest object to the point you asked for and then returns the
address of that object. Now, if you are close to a border, then the closest
object may be across that border. When Nominatim then returns the address,
it contains the county/state/country across the border.
#### 3. I get different counties/states/countries when I change the zoom parameter in the reverse query. How is that possible?
This is basically the same problem as in the previous answer.
The zoom level influences at which [search rank](../customize/Ranking.md#search-rank) Nominatim starts looking
for the closest object. So the closest house number maybe on one side of the
border while the closest street is on the other. As the address details contain
the address of the closest object found, you might sometimes get one result,
sometimes the other for the closest point.
#### 4. Can you return the continent?
Nominatim assigns each map feature one country. Those outside any administrative
boundaries are assigned a special no-country. Continents or other super-national
administrations (e.g. European Union, NATO, Custom unions) are not supported,
see also [Administrative Boundary](https://wiki.openstreetmap.org/wiki/Tag:boundary%3Dadministrative#Super-national_administrations).
#### 5. Can you return the timezone?
See this separate OpenStreetMap-based project [Timezone Boundary Builder](https://github.com/evansiroky/timezone-boundary-builder).
#### 6. I want to download a list of streets/restaurants of a city/region
The [Overpass API](https://wiki.openstreetmap.org/wiki/Overpass_API) is more
suited for these kinds of queries.
That said if you installed your own Nominatim instance you can use the
`nominatim export` PHP script as basis to return such lists.
#### 7. My result has a wrong postcode. Where does it come from?
Most places in OSM don't have a postcode, so Nominatim tries to interpolate
one. It first look at all the places that make up the address of the place.
If one of them has a postcode defined, this is the one to be used. When
none of the address parts has a postcode either, Nominatim interpolates one
from the surrounding objects. If the postcode is for your result is one, then
most of the time there is an OSM object with the wrong postcode nearby.
To find the bad postcode, go to
[https://nominatim.openstreetmap.org](https://nominatim.openstreetmap.org)
and search for your place. When you have found it, click on the 'details' link
under the result to go to the details page. There is a field 'Computed Postcode'
which should display the bad postcode. Click on the 'how?' link. A small
explanation text appears. It contains a link to a query for Overpass Turbo.
Click on that and you get a map with all places in the area that have the bad
postcode. If none is displayed, zoom the map out a bit and then click on 'Run'.
Now go to [OpenStreetMap](https://openstreetmap.org) and fix the error you
have just found. It will take at least a day for Nominatim to catch up with
your data fix. Sometimes longer, depending on how much editing activity is in
the area.

View File

@@ -1,173 +0,0 @@
# Address lookup
The lookup API allows to query the address and other details of one or
multiple OSM objects like node, way or relation.
## Parameters
The lookup API has the following format:
```
https://nominatim.openstreetmap.org/lookup?osm_ids=[N|W|R]<value>,…,…,&<params>
```
`osm_ids` is mandatory and must contain a comma-separated list of OSM ids each
prefixed with its type, one of node(N), way(W) or relation(R). Up to 50 ids
can be queried at the same time.
Additional optional parameters are explained below.
### Output format
* `format=[xml|json|jsonv2|geojson|geocodejson]`
See [Place Output Formats](Output.md) for details on each format. (Default: xml)
* `json_callback=<string>`
Wrap JSON output in a callback function (JSONP) i.e. `<string>(<json>)`.
Only has an effect for JSON output formats.
### Output details
* `addressdetails=[0|1]`
Include a breakdown of the address into elements. (Default: 0)
* `extratags=[0|1]`
Include additional information in the result if available,
e.g. wikipedia link, opening hours. (Default: 0)
* `namedetails=[0|1]`
Include a list of alternative names in the results. These may include
language variants, references, operator and brand. (Default: 0)
### Language of results
* `accept-language=<browser language string>`
Preferred language order for showing search results, overrides the value
specified in the "Accept-Language" HTTP header.
Either use a standard RFC2616 accept-language string or a simple
comma-separated list of language codes.
### Polygon output
* `polygon_geojson=1`
* `polygon_kml=1`
* `polygon_svg=1`
* `polygon_text=1`
Output geometry of results as a GeoJSON, KML, SVG or WKT. Only one of these
options can be used at a time. (Default: 0)
* `polygon_threshold=0.0`
Return a simplified version of the output geometry. The parameter is the
tolerance in degrees with which the geometry may differ from the original
geometry. Topology is preserved in the result. (Default: 0.0)
### Other
* `email=<valid email address>`
If you are making large numbers of request please include an appropriate email
address to identify your requests. See Nominatim's [Usage Policy](https://operations.osmfoundation.org/policies/nominatim/) for more details.
* `debug=[0|1]`
Output assorted developer debug information. Data on internals of Nominatim's
"Search Loop" logic, and SQL queries. The output is (rough) HTML format.
This overrides the specified machine readable format. (Default: 0)
## Examples
##### XML
[https://nominatim.openstreetmap.org/lookup?osm_ids=R146656,W104393803,N240109189](https://nominatim.openstreetmap.org/lookup?osm_ids=R146656,W50637691,N240109189)
```xml
<lookupresults timestamp="Mon, 28 Mar 22 14:38:54 +0000" attribution="Data &#xA9; OpenStreetMap contributors, ODbL 1.0. http://www.openstreetmap.org/copyright" querystring="R146656,W50637691,N240109189" more_url="">
<place place_id="282236157" osm_type="relation" osm_id="146656" place_rank="16" address_rank="16" boundingbox="53.3401044,53.5445923,-2.3199185,-2.1468288" lat="53.44246175" lon="-2.2324547359718547" display_name="Manchester, Greater Manchester, North West England, England, United Kingdom" class="boundary" type="administrative" importance="0.35">
<city>Manchester</city>
<county>Greater Manchester</county>
<state_district>North West England</state_district>
<state>England</state>
<country>United Kingdom</country>
<country_code>gb</country_code>
</place>
<place place_id="115462561" osm_type="way" osm_id="50637691" place_rank="30" address_rank="30" boundingbox="52.3994612,52.3996426,13.0479574,13.0481754" lat="52.399550700000006" lon="13.048066846939687" display_name="Brandenburger Tor, Brandenburger Stra&#xDF;e, Historische Innenstadt, Innenstadt, Potsdam, Brandenburg, 14467, Germany" class="tourism" type="attraction" importance="0.29402874005524">
<tourism>Brandenburger Tor</tourism>
<road>Brandenburger Stra&#xDF;e</road>
<suburb>Historische Innenstadt</suburb>
<city>Potsdam</city>
<state>Brandenburg</state>
<postcode>14467</postcode>
<country>Germany</country>
<country_code>de</country_code>
</place>
<place place_id="567505" osm_type="node" osm_id="240109189" place_rank="15" address_rank="16" boundingbox="52.3586925,52.6786925,13.2396024,13.5596024" lat="52.5186925" lon="13.3996024" display_name="Berlin, 10178, Germany" class="place" type="city" importance="0.78753902824914">
<city>Berlin</city>
<state>Berlin</state>
<postcode>10178</postcode>
<country>Germany</country>
<country_code>de</country_code>
</place>
</lookupresults>
```
##### JSON with extratags
[https://nominatim.openstreetmap.org/lookup?osm_ids=W50637691&format=json&extratags=1](https://nominatim.openstreetmap.org/lookup?osm_ids=W50637691&format=json&extratags=1)
```json
[
{
"place_id": 115462561,
"licence": "Data © OpenStreetMap contributors, ODbL 1.0. https://osm.org/copyright",
"osm_type": "way",
"osm_id": 50637691,
"boundingbox": [
"52.3994612",
"52.3996426",
"13.0479574",
"13.0481754"
],
"lat": "52.399550700000006",
"lon": "13.048066846939687",
"display_name": "Brandenburger Tor, Brandenburger Straße, Historische Innenstadt, Innenstadt, Potsdam, Brandenburg, 14467, Germany",
"class": "tourism",
"type": "attraction",
"importance": 0.2940287400552381,
"address": {
"tourism": "Brandenburger Tor",
"road": "Brandenburger Straße",
"suburb": "Historische Innenstadt",
"city": "Potsdam",
"state": "Brandenburg",
"postcode": "14467",
"country": "Germany",
"country_code": "de"
},
"extratags": {
"image": "http://commons.wikimedia.org/wiki/File:Potsdam_brandenburger_tor.jpg",
"heritage": "4",
"wikidata": "Q695045",
"architect": "Carl von Gontard;Georg Christian Unger",
"wikipedia": "de:Brandenburger Tor (Potsdam)",
"wheelchair": "yes",
"description": "Kleines Brandenburger Tor in Potsdam",
"heritage:website": "http://www.bldam-brandenburg.de/images/stories/PDF/DML%202012/04-p-internet-13.pdf",
"heritage:operator": "bldam",
"architect:wikidata": "Q68768;Q95223",
"year_of_construction": "1771"
}
}
]
```

View File

@@ -1,302 +0,0 @@
# Place Output
The [/reverse](Reverse.md), [/search](Search.md) and [/lookup](Lookup.md)
API calls produce very similar output which is explained in this section.
There is one section for each format. The format correspond to what was
selected via the `format` parameter.
## JSON
The JSON format returns an array of places (for search and lookup) or
a single place (for reverse) of the following format:
```
{
"place_id": "100149",
"licence": "Data © OpenStreetMap contributors, ODbL 1.0. https://osm.org/copyright",
"osm_type": "node",
"osm_id": "107775",
"boundingbox": ["51.3473219", "51.6673219", "-0.2876474", "0.0323526"],
"lat": "51.5073219",
"lon": "-0.1276474",
"display_name": "London, Greater London, England, SW1A 2DU, United Kingdom",
"class": "place",
"type": "city",
"importance": 0.9654895765402,
"icon": "https://nominatim.openstreetmap.org/images/mapicons/poi_place_city.p.20.png",
"address": {
"city": "London",
"state_district": "Greater London",
"state": "England",
"ISO3166-2-lvl4": "GB-ENG",
"postcode": "SW1A 2DU",
"country": "United Kingdom",
"country_code": "gb"
},
"extratags": {
"capital": "yes",
"website": "http://www.london.gov.uk",
"wikidata": "Q84",
"wikipedia": "en:London",
"population": "8416535"
}
}
```
The possible fields are:
* `place_id` - reference to the Nominatim internal database ID ([see notes](#place_id-is-not-a-persistent-id))
* `osm_type`, `osm_id` - reference to the OSM object ([see notes](#osm-reference))
* `boundingbox` - area of corner coordinates ([see notes](#boundingbox))
* `lat`, `lon` - latitude and longitude of the centroid of the object
* `display_name` - full comma-separated address
* `class`, `type` - key and value of the main OSM tag
* `importance` - computed importance rank
* `icon` - link to class icon (if available)
* `address` - dictionary of address details (only with `addressdetails=1`,
[see notes](#addressdetails))
* `extratags` - dictionary with additional useful tags like website or maxspeed
(only with `extratags=1`)
* `namedetails` - dictionary with full list of available names including ref etc.
* `geojson`, `svg`, `geotext`, `geokml` - full geometry
(only with the appropriate `polygon_*` parameter)
## JSONv2
This is the same as the JSON format with two changes:
* `class` renamed to `category`
* additional field `place_rank` with the search rank of the object
## GeoJSON
This format follows the [RFC7946](https://geojson.org). Every feature includes
a bounding box (`bbox`).
The properties object has the following fields:
* `place_id` - reference to the Nominatim internal database ID ([see notes](#place_id-is-not-a-persistent-id))
* `osm_type`, `osm_id` - reference to the OSM object ([see notes](#osm-reference))
* `category`, `type` - key and value of the main OSM tag
* `display_name` - full comma-separated address
* `place_rank` - class search rank
* `importance` - computed importance rank
* `icon` - link to class icon (if available)
* `address` - dictionary of address details (only with `addressdetails=1`,
[see notes](#addressdetails))
* `extratags` - dictionary with additional useful tags like `website` or `maxspeed`
(only with `extratags=1`)
* `namedetails` - dictionary with full list of available names including ref etc.
Use `polygon_geojson` to output the full geometry of the object instead
of the centroid.
## GeocodeJSON
The GeocodeJSON format follows the
[GeocodeJSON spec 0.1.0](https://github.com/geocoders/geocodejson-spec).
The following feature attributes are implemented:
* `osm_type`, `osm_id` - reference to the OSM object (unofficial extension, [see notes](#osm-reference))
* `type` - the 'address level' of the object ('house', 'street', `district`, `city`,
`county`, `state`, `country`, `locality`)
* `osm_key`- key of the main tag of the OSM object (e.g. boundary, highway, amenity)
* `osm_value` - value of the main tag of the OSM object (e.g. residential, restaurant)
* `label` - full comma-separated address
* `name` - localised name of the place
* `housenumber`, `street`, `locality`, `district`, `postcode`, `city`,
`county`, `state`, `country` -
provided when it can be determined from the address
* `admin` - list of localised names of administrative boundaries (only with `addressdetails=1`)
Use `polygon_geojson` to output the full geometry of the object instead
of the centroid.
## XML
The XML response returns one or more place objects in slightly different
formats depending on the API call.
### Reverse
```
<reversegeocode timestamp="Sat, 11 Aug 18 11:53:21 +0000"
attribution="Data © OpenStreetMap contributors, ODbL 1.0. https://www.openstreetmap.org/copyright"
querystring="lat=48.400381&lon=11.745876&zoom=5&format=xml">
<result place_id="179509537" osm_type="relation" osm_id="2145268" ref="BY" place_rank="15" address_rank="15"
lat="48.9467562" lon="11.4038717"
boundingbox="47.2701114,50.5647142,8.9763497,13.8396373">
Bavaria, Germany
</result>
<addressparts>
<state>Bavaria</state>
<ISO3166-2-lvl4>DE-BY</ISO3166-2-lvl4>
<country>Germany</country>
<country_code>de</country_code>
</addressparts>
<extratags>
<tag key="place" value="state"/>
<tag key="wikidata" value="Q980"/>
<tag key="wikipedia" value="de:Bayern"/>
<tag key="population" value="12520000"/>
<tag key="name:prefix" value="Freistaat"/>
</extratags>
</reversegeocode>
```
The attributes of the outer `reversegeocode` element return generic information
about the query, including the time when the response was sent (in UTC),
attribution to OSM and the original querystring.
The place information can be found in the `result` element. The attributes of that element contain:
* `place_id` - reference to the Nominatim internal database ID ([see notes](#place_id-is-not-a-persistent-id))
* `osm_type`, `osm_id` - reference to the OSM object ([see notes](#osm-reference))
* `ref` - content of `ref` tag if it exists
* `lat`, `lon` - latitude and longitude of the centroid of the object
* `boundingbox` - comma-separated list of corner coordinates ([see notes](#boundingbox))
The full address of the result can be found in the content of the
`result` element as a comma-separated list.
Additional information requested with `addressdetails=1`, `extratags=1` and
`namedetails=1` can be found in extra elements.
### Search and Lookup
```
<searchresults timestamp="Sat, 11 Aug 18 11:55:35 +0000"
attribution="Data © OpenStreetMap contributors, ODbL 1.0. https://www.openstreetmap.org/copyright"
querystring="london" polygon="false" exclude_place_ids="100149"
more_url="https://nominatim.openstreetmap.org/search.php?q=london&addressdetails=1&extratags=1&exclude_place_ids=100149&format=xml&accept-language=en-US%2Cen%3Bq%3D0.7%2Cde%3Bq%3D0.3">
<place place_id="100149" osm_type="node" osm_id="107775" place_rank="15" address_rank="15"
boundingbox="51.3473219,51.6673219,-0.2876474,0.0323526" lat="51.5073219" lon="-0.1276474"
display_name="London, Greater London, England, SW1A 2DU, United Kingdom"
class="place" type="city" importance="0.9654895765402"
icon="https://nominatim.openstreetmap.org/images/mapicons/poi_place_city.p.20.png">
<extratags>
<tag key="capital" value="yes"/>
<tag key="website" value="http://www.london.gov.uk"/>
<tag key="wikidata" value="Q84"/>
<tag key="wikipedia" value="en:London"/>
<tag key="population" value="8416535"/>
</extratags>
<city>London</city>
<state_district>Greater London</state_district>
<state>England</state>
<ISO3166-2-lvl4>GB-ENG</ISO3166-2-lvl4>
<postcode>SW1A 2DU</postcode>
<country>United Kingdom</country>
<country_code>gb</country_code>
</place>
</searchresults>
```
The attributes of the outer `searchresults` or `lookupresults` element return
generic information about the query:
* `timestamp` - UTC time when the response was sent
* `attribution` - OSM licensing information
* `querystring` - original query
* `polygon` - true when extra geometry information was requested
* `exclude_place_ids` - IDs of places that should be ignored in a follow-up request
* `more_url` - search call that will yield additional results for the query
just sent
The place information can be found in the `place` elements, of which there may
be more than one. The attributes of that element contain:
* `place_id` - reference to the Nominatim internal database ID ([see notes](#place_id-is-not-a-persistent-id))
* `osm_type`, `osm_id` - reference to the OSM object ([see notes](#osm-reference))
* `ref` - content of `ref` tag if it exists
* `lat`, `lon` - latitude and longitude of the centroid of the object
* `boundingbox` - comma-separated list of corner coordinates ([see notes](#boundingbox))
* `place_rank` - class [search rank](../customize/Ranking.md#search-rank)
* `address_rank` - place [address rank](../customize/Ranking.md#address-rank)
* `display_name` - full comma-separated address
* `class`, `type` - key and value of the main OSM tag
* `importance` - computed importance rank
* `icon` - link to class icon (if available)
When `addressdetails=1` is requested, the localised address parts appear
as subelements with the type of the address part.
Additional information requested with `extratags=1` and `namedetails=1` can
be found in extra elements as sub-element of `extratags` and `namedetails`
respectively.
## Notes on field values
### place_id is not a persistent id
The `place_id` is an internal identifier that is assigned data is imported
into a Nominatim database. The same OSM object will have a different value
on another server. It may even change its ID on the same server when it is
removed and reimported while updating the database with fresh OSM data.
It is thus not useful to treat it as permanent for later use.
The combination `osm_type`+`osm_id` is slightly better but remember in
OpenStreetMap mappers can delete, split, recreate places (and those
get a new `osm_id`), there is no link between those old and new ids.
Places can also change their meaning without changing their `osm_id`,
e.g. when a restaurant is retagged as supermarket. For a more in-depth
discussion see [Permanent ID](https://wiki.openstreetmap.org/wiki/Permanent_ID).
If you need an ID that is consistent over multiple installations of Nominatim,
then you should use the combination of `osm_type`+`osm_id`+`class`.
### OSM reference
Nominatim may sometimes return special objects that do not correspond directly
to an object in OpenStreetMap. These are:
* **Postcodes**. Nominatim returns an postcode point created from all mapped
postcodes of the same name. The class and type of these object is `place=postcdode`.
No `osm_type` and `osm_id` are included in the result.
* **Housenumber interpolations**. Nominatim returns a single interpolated
housenumber from the interpolation way. The class and type are `place=house`
and `osm_type` and `osm_id` correspond to the interpolation way in OSM.
* **TIGER housenumber.** Nominatim returns a single interpolated housenumber
from the TIGER data. The class and type are `place=house`
and `osm_type` and `osm_id` correspond to the street mentioned in the result.
Please note that the `osm_type` and `osm_id` returned may be changed in the
future. You should not expect to only find `node`, `way` and `relation` for
the type.
### boundingbox
Comma separated list of min latitude, max latitude, min longitude, max longitude.
The whole planet would be `-90,90,-180,180`.
Can be used to pan and center the map on the result, for example with leafletjs
mapping library
`map.fitBounds([[bbox[0],bbox[2]],[bbox[1],bbox[3]]], {padding: [20, 20], maxzoom: 16});`
Bounds crossing the antimeridian have a min latitude -180 and max latitude 180,
essentially covering the entire planet
(see [issue 184](https://github.com/openstreetmap/Nominatim/issues/184)).
### addressdetails
Address details in the xml and json formats return a list of names together
with a designation label. Per default the following labels may appear:
* continent
* country, country_code
* region, state, state_district, county, ISO3166-2-lvl<admin_level>
* municipality, city, town, village
* city_district, district, borough, suburb, subdivision
* hamlet, croft, isolated_dwelling
* neighbourhood, allotments, quarter
* city_block, residential, farm, farmyard, industrial, commercial, retail
* road
* house_number, house_name
* emergency, historic, military, natural, landuse, place, railway,
man_made, aerialway, boundary, amenity, aeroway, club, craft, leisure,
office, mountain_pass, shop, tourism, bridge, tunnel, waterway
* postcode
They roughly correspond to the classification of the OpenStreetMap data
according to either the `place` tag or the main key of the object.

View File

@@ -1,14 +0,0 @@
### Nominatim API
Nominatim indexes named (or numbered) features within the OpenStreetMap (OSM) dataset and a subset of other unnamed features (pubs, hotels, churches, etc).
Its API has the following endpoints for querying the data:
* __[/search](Search.md)__ - search OSM objects by name or type
* __[/reverse](Reverse.md)__ - search OSM object by their location
* __[/lookup](Lookup.md)__ - look up address details for OSM objects by their ID
* __[/status](Status.md)__ - query the status of the server
* __/deletable__ - list objects that have been deleted in OSM but are held
back in Nominatim in case the deletion was accidental
* __/polygons__ - list of broken polygons detected by Nominatim
* __[/details](Details.md)__ - show internal details for an object (for debugging only)

View File

@@ -1,285 +0,0 @@
# Reverse Geocoding
Reverse geocoding generates an address from a latitude and longitude.
## How it works
The reverse geocoding API does not exactly compute the address for the
coordinate it receives. It works by finding the closest suitable OSM object
and returning its address information. This may occasionally lead to
unexpected results.
First of all, Nominatim only includes OSM objects in
its index that are suitable for searching. Small, unnamed paths for example
are missing from the database and can therefore not be used for reverse
geocoding either.
The other issue to be aware of is that the closest OSM object may not always
have a similar enough address to the coordinate you were requesting. For
example, in dense city areas it may belong to a completely different street.
## Parameters
The main format of the reverse API is
```
https://nominatim.openstreetmap.org/reverse?lat=<value>&lon=<value>&<params>
```
where `lat` and `lon` are latitude and longitude of a coordinate in WGS84
projection. The API returns exactly one result or an error when the coordinate
is in an area with no OSM data coverage.
Additional parameters are accepted as listed below.
!!! warning "Deprecation warning"
The reverse API used to allow address lookup for a single OSM object by
its OSM id. This use is now deprecated. Use the [Address Lookup API](Lookup.md)
instead.
### Output format
* `format=[xml|json|jsonv2|geojson|geocodejson]`
See [Place Output Formats](Output.md) for details on each format. (Default: xml)
* `json_callback=<string>`
Wrap JSON output in a callback function ([JSONP](https://en.wikipedia.org/wiki/JSONP)) i.e. `<string>(<json>)`.
Only has an effect for JSON output formats.
### Output details
* `addressdetails=[0|1]`
Include a breakdown of the address into elements. (Default: 1)
* `extratags=[0|1]`
Include additional information in the result if available,
e.g. wikipedia link, opening hours. (Default: 0)
* `namedetails=[0|1]`
Include a list of alternative names in the results. These may include
language variants, references, operator and brand. (Default: 0)
### Language of results
* `accept-language=<browser language string>`
Preferred language order for showing search results, overrides the value
specified in the "Accept-Language" HTTP header.
Either use a standard RFC2616 accept-language string or a simple
comma-separated list of language codes.
### Result limitation
* `zoom=[0-18]`
Level of detail required for the address. Default: 18. This is a number that
corresponds roughly to the zoom level used in XYZ tile sources in frameworks
like Leaflet.js, Openlayers etc.
In terms of address details the zoom levels are as follows:
zoom | address detail
-----|---------------
3 | country
5 | state
8 | county
10 | city
14 | suburb
16 | major streets
17 | major and minor streets
18 | building
### Polygon output
* `polygon_geojson=1`
* `polygon_kml=1`
* `polygon_svg=1`
* `polygon_text=1`
Output geometry of results as a GeoJSON, KML, SVG or WKT. Only one of these
options can be used at a time. (Default: 0)
* `polygon_threshold=0.0`
Return a simplified version of the output geometry. The parameter is the
tolerance in degrees with which the geometry may differ from the original
geometry. Topology is preserved in the result. (Default: 0.0)
### Other
* `email=<valid email address>`
If you are making a large number of requests, please include an appropriate email
address to identify your requests. See Nominatim's [Usage Policy](https://operations.osmfoundation.org/policies/nominatim/) for more details.
* `debug=[0|1]`
Output assorted developer debug information. Data on internals of Nominatim's
"Search Loop" logic, and SQL queries. The output is (rough) HTML format.
This overrides the specified machine readable format. (Default: 0)
## Examples
* [https://nominatim.openstreetmap.org/reverse?format=xml&lat=52.5487429714954&lon=-1.81602098644987&zoom=18&addressdetails=1](https://nominatim.openstreetmap.org/reverse?format=xml&lat=52.5487429714954&lon=-1.81602098644987&zoom=18&addressdetails=1)
```xml
<reversegeocode timestamp="Fri, 06 Nov 09 16:33:54 +0000" querystring="...">
<result place_id="1620612" osm_type="node" osm_id="452010817">
135, Pilkington Avenue, Wylde Green, City of Birmingham, West Midlands (county), B72, United Kingdom
</result>
<addressparts>
<house_number>135</house_number>
<road>Pilkington Avenue</road>
<village>Wylde Green</village>
<town>Sutton Coldfield</town>
<city>City of Birmingham</city>
<county>West Midlands (county)</county>
<postcode>B72</postcode>
<country>United Kingdom</country>
<country_code>gb</country_code>
</addressparts>
</reversegeocode>
```
##### Example with `format=jsonv2`
* [https://nominatim.openstreetmap.org/reverse?format=jsonv2&lat=-34.44076&lon=-58.70521](https://nominatim.openstreetmap.org/reverse?format=jsonv2&lat=-34.44076&lon=-58.70521)
```json
{
"place_id":"134140761",
"licence":"Data © OpenStreetMap contributors, ODbL 1.0. https:\/\/www.openstreetmap.org\/copyright",
"osm_type":"way",
"osm_id":"280940520",
"lat":"-34.4391708",
"lon":"-58.7064573",
"place_rank":"26",
"category":"highway",
"type":"motorway",
"importance":"0.1",
"addresstype":"road",
"display_name":"Autopista Pedro Eugenio Aramburu, El Triángulo, Partido de Malvinas Argentinas, Buenos Aires, 1.619, Argentina",
"name":"Autopista Pedro Eugenio Aramburu",
"address":{
"road":"Autopista Pedro Eugenio Aramburu",
"village":"El Triángulo",
"state_district":"Partido de Malvinas Argentinas",
"state":"Buenos Aires",
"postcode":"1.619",
"country":"Argentina",
"country_code":"ar"
},
"boundingbox":["-34.44159","-34.4370994","-58.7086067","-58.7044712"]
}
```
##### Example with `format=geojson`
* [https://nominatim.openstreetmap.org/reverse?format=geojson&lat=44.50155&lon=11.33989](https://nominatim.openstreetmap.org/reverse?format=geojson&lat=44.50155&lon=11.33989)
```json
{
"type": "FeatureCollection",
"licence": "Data © OpenStreetMap contributors, ODbL 1.0. https://osm.org/copyright",
"features": [
{
"type": "Feature",
"properties": {
"place_id": "18512203",
"osm_type": "node",
"osm_id": "1704756187",
"place_rank": "30",
"category": "place",
"type": "house",
"importance": "0",
"addresstype": "place",
"name": null,
"display_name": "71, Via Guglielmo Marconi, Saragozza-Porto, Bologna, BO, Emilia-Romagna, 40122, Italy",
"address": {
"house_number": "71",
"road": "Via Guglielmo Marconi",
"suburb": "Saragozza-Porto",
"city": "Bologna",
"county": "BO",
"state": "Emilia-Romagna",
"postcode": "40122",
"country": "Italy",
"country_code": "it"
}
},
"bbox": [
11.3397676,
44.5014307,
11.3399676,
44.5016307
],
"geometry": {
"type": "Point",
"coordinates": [
11.3398676,
44.5015307
]
}
}
]
}
```
##### Example with `format=geocodejson`
[https://nominatim.openstreetmap.org/reverse?format=geocodejson&lat=60.2299&lon=11.1663](https://nominatim.openstreetmap.org/reverse?format=geocodejson&lat=60.2299&lon=11.1663)
```json
{
"type": "FeatureCollection",
"geocoding": {
"version": "0.1.0",
"attribution": "Data © OpenStreetMap contributors, ODbL 1.0. https://osm.org/copyright",
"licence": "ODbL",
"query": "60.229917843587,11.16630979382"
},
"features": {
"type": "Feature",
"properties": {
"geocoding": {
"place_id": "42700574",
"osm_type": "node",
"osm_id": "3110596255",
"type": "house",
"accuracy": 0,
"label": "1, Løvenbergvegen, Mogreina, Ullensaker, Akershus, 2054, Norway",
"name": null,
"housenumber": "1",
"street": "Løvenbergvegen",
"postcode": "2054",
"county": "Akershus",
"country": "Norway",
"admin": {
"level7": "Ullensaker",
"level4": "Akershus",
"level2": "Norway"
}
}
},
"geometry": {
"type": "Point",
"coordinates": [
11.1658572,
60.2301296
]
}
}
}
```

View File

@@ -1,362 +0,0 @@
# Search queries
The search API allows you to look up a location from a textual description
or address. Nominatim supports structured and free-form search queries.
The search query may also contain
[special phrases](https://wiki.openstreetmap.org/wiki/Nominatim/Special_Phrases)
which are translated into specific OpenStreetMap (OSM) tags (e.g. Pub => `amenity=pub`).
This can be used to narrow down the kind of objects to be returned.
!!! warning
Special phrases are not suitable to query all objects of a certain type in an
area. Nominatim will always just return a collection of the best matches. To
download OSM data by object type, use the [Overpass API](https://overpass-api.de/).
## Parameters
The search API has the following format:
```
https://nominatim.openstreetmap.org/search?<params>
```
The search term may be specified with two different sets of parameters:
* `q=<query>`
Free-form query string to search for.
Free-form queries are processed first left-to-right and then right-to-left if that fails. So you may search for
[pilkington avenue, birmingham](https://nominatim.openstreetmap.org/search?q=pilkington+avenue,birmingham) as well as for
[birmingham, pilkington avenue](https://nominatim.openstreetmap.org/search?q=birmingham,+pilkington+avenue).
Commas are optional, but improve performance by reducing the complexity of the search.
* `street=<housenumber> <streetname>`
* `city=<city>`
* `county=<county>`
* `state=<state>`
* `country=<country>`
* `postalcode=<postalcode>`
Alternative query string format split into several parameters for structured requests.
Structured requests are faster but are less robust against alternative
OSM tagging schemas. **Do not combine with** `q=<query>` **parameter**.
Both query forms accept the additional parameters listed below.
### Output format
* `format=[xml|json|jsonv2|geojson|geocodejson]`
See [Place Output Formats](Output.md) for details on each format. (Default: jsonv2)
!!! note
The Nominatim service at
[https://nominatim.openstreetmap.org](https://nominatim.openstreetmap.org)
has a different default behaviour for historical reasons. When the
`format` parameter is omitted, the request will be forwarded to the Web UI.
* `json_callback=<string>`
Wrap JSON output in a callback function ([JSONP](https://en.wikipedia.org/wiki/JSONP)) i.e. `<string>(<json>)`.
Only has an effect for JSON output formats.
### Output details
* `addressdetails=[0|1]`
Include a breakdown of the address into elements. (Default: 0)
* `extratags=[0|1]`
Include additional information in the result if available,
e.g. wikipedia link, opening hours. (Default: 0)
* `namedetails=[0|1]`
Include a list of alternative names in the results. These may include
language variants, references, operator and brand. (Default: 0)
### Language of results
* `accept-language=<browser language string>`
Preferred language order for showing search results, overrides the value
specified in the ["Accept-Language" HTTP header](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Accept-Language).
Either use a standard RFC2616 accept-language string or a simple
comma-separated list of language codes.
### Result limitation
* `countrycodes=<countrycode>[,<countrycode>][,<countrycode>]...`
Limit search results to one or more countries. `<countrycode>` must be the
[ISO 3166-1alpha2](https://en.wikipedia.org/wiki/ISO_3166-1_alpha-2) code,
e.g. `gb` for the United Kingdom, `de` for Germany.
Each place in Nominatim is assigned to one country code based
on OSM country boundaries. In rare cases a place may not be in any country
at all, for example, in international waters.
* `exclude_place_ids=<place_id,[place_id],[place_id]`
If you do not want certain OSM objects to appear in the search
result, give a comma separated list of the `place_id`s you want to skip.
This can be used to retrieve additional search results. For example, if a
previous query only returned a few results, then including those here would
cause the search to return other, less accurate, matches (if possible).
* `limit=<integer>`
Limit the number of returned results. (Default: 10, Maximum: 50)
* `viewbox=<x1>,<y1>,<x2>,<y2>`
The preferred area to find search results. Any two corner points of the box
are accepted as long as they span a real box. `x` is longitude,
`y` is latitude.
* `bounded=[0|1]`
When a viewbox is given, restrict the result to items contained within that
viewbox (see above). When `viewbox` and `bounded=1` are given, an amenity
only search is allowed. Give the special keyword for the amenity in square
brackets, e.g. `[pub]` and a selection of objects of this type is returned.
There is no guarantee that the result is complete. (Default: 0)
### Polygon output
* `polygon_geojson=1`
* `polygon_kml=1`
* `polygon_svg=1`
* `polygon_text=1`
Output geometry of results as a GeoJSON, KML, SVG or WKT. Only one of these
options can be used at a time. (Default: 0)
* `polygon_threshold=0.0`
Return a simplified version of the output geometry. The parameter is the
tolerance in degrees with which the geometry may differ from the original
geometry. Topology is preserved in the result. (Default: 0.0)
### Other
* `email=<valid email address>`
If you are making large numbers of request please include an appropriate email
address to identify your requests. See Nominatim's [Usage Policy](https://operations.osmfoundation.org/policies/nominatim/) for more details.
* `dedupe=[0|1]`
Sometimes you have several objects in OSM identifying the same place or
object in reality. The simplest case is a street being split into many
different OSM ways due to different characteristics. Nominatim will
attempt to detect such duplicates and only return one match unless
this parameter is set to 0. (Default: 1)
* `debug=[0|1]`
Output assorted developer debug information. Data on internals of Nominatim's
"Search Loop" logic, and SQL queries. The output is (rough) HTML format.
This overrides the specified machine readable format. (Default: 0)
## Examples
##### XML with kml polygon
* [https://nominatim.openstreetmap.org/search?q=135+pilkington+avenue,+birmingham&format=xml&polygon_geojson=1&addressdetails=1](https://nominatim.openstreetmap.org/search?q=135+pilkington+avenue,+birmingham&format=xml&polygon_geojson=1&addressdetails=1)
```xml
<searchresults timestamp="Sat, 07 Nov 09 14:42:10 +0000" querystring="135 pilkington, avenue birmingham" polygon="true">
<place
place_id="1620612" osm_type="node" osm_id="452010817"
boundingbox="52.548641204834,52.5488433837891,-1.81612110137939,-1.81592094898224"
lat="52.5487429714954" lon="-1.81602098644987"
display_name="135, Pilkington Avenue, Wylde Green, City of Birmingham, West Midlands (county), B72, United Kingdom"
class="place" type="house">
<geokml>
<Polygon>
<outerBoundaryIs>
<LinearRing>
<coordinates>-1.816513,52.548756599999997 -1.816434,52.548747300000002 -1.816429,52.5487629 -1.8163717,52.548756099999999 -1.8163464,52.548834599999999 -1.8164599,52.548848100000001 -1.8164685,52.5488213 -1.8164913,52.548824000000003 -1.816513,52.548756599999997</coordinates>
</LinearRing>
</outerBoundaryIs>
</Polygon>
</geokml>
<house_number>135</house_number>
<road>Pilkington Avenue</road>
<village>Wylde Green</village>
<town>Sutton Coldfield</town>
<city>City of Birmingham</city>
<county>West Midlands (county)</county>
<postcode>B72</postcode>
<country>United Kingdom</country>
<country_code>gb</country_code>
</place>
</searchresults>
```
##### JSON with SVG polygon
[https://nominatim.openstreetmap.org/search/Unter%20den%20Linden%201%20Berlin?format=json&addressdetails=1&limit=1&polygon_svg=1](https://nominatim.openstreetmap.org/search/Unter%20den%20Linden%201%20Berlin?format=json&addressdetails=1&limit=1&polygon_svg=1)
```json
{
"address": {
"city": "Berlin",
"city_district": "Mitte",
"construction": "Unter den Linden",
"continent": "European Union",
"country": "Deutschland",
"country_code": "de",
"house_number": "1",
"neighbourhood": "Scheunenviertel",
"postcode": "10117",
"public_building": "Kommandantenhaus",
"state": "Berlin",
"suburb": "Mitte"
},
"boundingbox": [
"52.5170783996582",
"52.5173187255859",
"13.3975105285645",
"13.3981599807739"
],
"class": "amenity",
"display_name": "Kommandantenhaus, 1, Unter den Linden, Scheunenviertel, Mitte, Berlin, 10117, Deutschland, European Union",
"importance": 0.73606775332943,
"lat": "52.51719785",
"licence": "Data \u00a9 OpenStreetMap contributors, ODbL 1.0. https://www.openstreetmap.org/copyright",
"lon": "13.3978352028938",
"osm_id": "15976890",
"osm_type": "way",
"place_id": "30848715",
"svg": "M 13.397511 -52.517283599999999 L 13.397829400000001 -52.517299800000004 13.398131599999999 -52.517315099999998 13.398159400000001 -52.517112099999999 13.3975388 -52.517080700000001 Z",
"type": "public_building"
}
```
##### JSON with address details
[https://nominatim.openstreetmap.org/?addressdetails=1&q=bakery+in+berlin+wedding&format=json&limit=1](https://nominatim.openstreetmap.org/?addressdetails=1&q=bakery+in+berlin+wedding&format=json&limit=1)
```json
{
"address": {
"bakery": "B\u00e4cker Kamps",
"city_district": "Mitte",
"continent": "European Union",
"country": "Deutschland",
"country_code": "de",
"footway": "Bahnsteig U6",
"neighbourhood": "Sprengelkiez",
"postcode": "13353",
"state": "Berlin",
"suburb": "Wedding"
},
"boundingbox": [
"52.5460929870605",
"52.5460968017578",
"13.3591794967651",
"13.3591804504395"
],
"class": "shop",
"display_name": "B\u00e4cker Kamps, Bahnsteig U6, Sprengelkiez, Wedding, Mitte, Berlin, 13353, Deutschland, European Union",
"icon": "https://nominatim.openstreetmap.org/images/mapicons/shopping_bakery.p.20.png",
"importance": 0.201,
"lat": "52.5460941",
"licence": "Data \u00a9 OpenStreetMap contributors, ODbL 1.0. https://www.openstreetmap.org/copyright",
"lon": "13.35918",
"osm_id": "317179427",
"osm_type": "node",
"place_id": "1453068",
"type": "bakery"
}
```
##### GeoJSON
[https://nominatim.openstreetmap.org/search?q=17+Strada+Pictor+Alexandru+Romano%2C+Bukarest&format=geojson](https://nominatim.openstreetmap.org/search?q=17+Strada+Pictor+Alexandru+Romano%2C+Bukarest&format=geojson)
```json
{
"type": "FeatureCollection",
"licence": "Data © OpenStreetMap contributors, ODbL 1.0. https://osm.org/copyright",
"features": [
{
"type": "Feature",
"properties": {
"place_id": "35811445",
"osm_type": "node",
"osm_id": "2846295644",
"display_name": "17, Strada Pictor Alexandru Romano, Bukarest, Bucharest, Sector 2, Bucharest, 023964, Romania",
"place_rank": "30",
"category": "place",
"type": "house",
"importance": 0.62025
},
"bbox": [
26.1156689,
44.4354754,
26.1157689,
44.4355754
],
"geometry": {
"type": "Point",
"coordinates": [
26.1157189,
44.4355254
]
}
}
]
}
```
##### GeocodeJSON
[https://nominatim.openstreetmap.org/search?q=%CE%91%CE%B3%CE%AF%CE%B1+%CE%A4%CF%81%CE%B9%CE%AC%CE%B4%CE%B1%2C+%CE%91%CE%B4%CF%89%CE%BD%CE%B9%CE%B4%CE%BF%CF%82%2C+Athens%2C+Greece&format=geocodejson](https://nominatim.openstreetmap.org/search?q=%CE%91%CE%B3%CE%AF%CE%B1+%CE%A4%CF%81%CE%B9%CE%AC%CE%B4%CE%B1%2C+%CE%91%CE%B4%CF%89%CE%BD%CE%B9%CE%B4%CE%BF%CF%82%2C+Athens%2C+Greece&format=geocodejson)
```json
{
"type": "FeatureCollection",
"geocoding": {
"version": "0.1.0",
"attribution": "Data © OpenStreetMap contributors, ODbL 1.0. https://osm.org/copyright",
"licence": "ODbL",
"query": "Αγία Τριάδα, Αδωνιδος, Athens, Greece"
},
"features": [
{
"type": "Feature",
"properties": {
"geocoding": {
"type": "place_of_worship",
"label": "Αγία Τριάδα, Αδωνιδος, Άγιος Νικόλαος, 5º Δημοτικό Διαμέρισμα Αθηνών, Athens, Municipality of Athens, Regional Unit of Central Athens, Region of Attica, Attica, 11472, Greece",
"name": "Αγία Τριάδα",
"admin": null
}
},
"geometry": {
"type": "Point",
"coordinates": [
23.72949633941,
38.0051697
]
}
}
]
}
```

View File

@@ -1,67 +0,0 @@
# Status
Useful for checking if the service and database is running. The JSON output also shows
when the database was last updated.
## Parameters
* `format=[text|json]` (defaults to 'text')
## Output
#### Text format
```
https://nominatim.openstreetmap.org/status.php
```
will return HTTP status code 200 and print `OK`.
On error it will return HTTP status code 500 and print a message, e.g.
`ERROR: Database connection failed`.
#### JSON format
```
https://nominatim.openstreetmap.org/status.php?format=json
```
will return HTTP code 200 and a structure
```json
{
"status": 0,
"message": "OK",
"data_updated": "2020-05-04T14:47:00+00:00",
"software_version": "3.6.0-0",
"database_version": "3.6.0-0"
}
```
The `software_version` field contains the version of Nominatim used to serve
the API. The `database_version` field contains the version of the data format
in the database.
On error will also return HTTP status code 200 and a structure with error
code and message, e.g.
```json
{
"status": 700,
"message": "Database connection failed"
}
```
Possible status codes are
| | message | notes |
| --- | ------------------------------ | ----------------------------------------------------------------- |
| 700 | "No database" | connection failed |
| 701 | "Module failed" | database could not load nominatim.so |
| 702 | "Module call failed" | nominatim.so loaded but calling a function failed |
| 703 | "Query failed" | test query against a database table failed |
| 704 | "No value" | test query worked but returned no results |
| 705 | "Import date is not available" | No import dates were returned (enabling replication can fix this) |

View File

@@ -1,7 +0,0 @@
#!/bin/sh
#
# Extract markdown-formatted documentation from a source file
#
# Usage: bash2md.sh <infile> <outfile>
sed '/^#!/d;s:^#\( \|$\)::;s/.*#DOCS://' $1 > $2

View File

@@ -1,149 +0,0 @@
# Customizing Per-Country Data
Whenever an OSM is imported into Nominatim, the object is first assigned
a country. Nominatim can use this information to adapt various aspects of
the address computation to the local customs of the country. This section
explains how country assignment works and the principal per-country
localizations.
## Country assignment
Countries are assigned on the basis of country data from the OpenStreetMap
input data itself. Countries are expected to be tagged according to the
[administrative boundary schema](https://wiki.openstreetmap.org/wiki/Tag:boundary%3Dadministrative):
a OSM relation with `boundary=administrative` and `admin_level=2`. Nominatim
uses the country code to distinguish the countries.
If there is no country data available for a point, then Nominatim uses the
fallback data imported from `data/country_osm_grid.sql.gz`. This was computed
from OSM data as well but is guaranteed to cover all countries.
Some OSM objects may also be located outside any country, for example a buoy
in the middle of the ocean. These object do not get any country assigned and
get a default treatment when it comes to localized handling of data.
## Per-country settings
### Global country settings
The main place to configure settings per country is the file
`settings/country_settings.yaml`. This file has one section per country that
is recognised by Nominatim. Each section is tagged with the country code
(in lower case) and contains the different localization information. Only
countries which are listed in this file are taken into account for computations.
For example, the section for Andorra looks like this:
```
partition: 35
languages: ca
names: !include country-names/ad.yaml
postcode:
pattern: "(ddd)"
output: AD\1
```
The individual settings are described below.
#### `partition`
Nominatim internally splits the data into multiple tables to improve
performance. The partition number tells Nominatim into which table to put
the country. This is purely internal management and has no effect on the
output data.
The default is to have one partition per country.
#### `languages`
A comma-separated list of ISO-639 language codes of default languages in the
country. These are the languages used in name tags without a language suffix.
Note that this is not necessarily the same as the list of official languages
in the country. There may be officially recognised languages in a country
which are only ever used in name tags with the appropriate language suffixes.
Conversely, a non-official language may appear a lot in the name tags, for
example when used as an unofficial Lingua Franca.
List the languages in order of frequency of appearance with the most frequently
used language first. It is not recommended to add languages when there are only
very few occurrences.
If only one language is listed, then Nominatim will 'auto-complete' the
language of names without an explicit language-suffix.
#### `names`
List of names of the country and its translations. These names are used as
a baseline. It is always possible to search countries by the given names, no
matter what other names are in the OSM data. They are also used as a fallback
when a needed translation is not available.
!!! Note
The list of names per country is currently fairly large because Nominatim
supports translations in many languages per default. That is why the
name lists have been separated out into extra files. You can find the
name lists in the file `settings/country-names/<country code>.yaml`.
The names section in the main country settings file only refers to these
files via the special `!include` directive.
#### `postcode`
Describes the format of the postcode that is in use in the country.
When a country has no official postcodes, set this to no. Example:
```
ae:
postcode: no
```
When a country has a postcode, you need to state the postcode pattern and
the default output format. Example:
```
bm:
postcode:
pattern: "(ll)[ -]?(dd)"
output: \1 \2
```
The **pattern** is a regular expression that describes the possible formats
accepted as a postcode. The pattern follows the standard syntax for
[regular expressions in Python](https://docs.python.org/3/library/re.html#regular-expression-syntax)
with two extra shortcuts: `d` is a shortcut for a single digit([0-9])
and `l` for a single ASCII letter ([A-Z]).
Use match groups to indicate groups in the postcode that may optionally be
separated with a space or a hyphen.
For example, the postcode for Bermuda above always consists of two letters
and two digits. They may optionally be separated by a space or hyphen. That
means that Nominatim will consider `AB56`, `AB 56` and `AB-56` spelling variants
for one and the same postcode.
Never add the country code in front of the postcode pattern. Nominatim will
automatically accept variants with a country code prefix for all postcodes.
The **output** field is an optional field that describes what the canonical
spelling of the postcode should be. The format is the
[regular expression expand syntax](https://docs.python.org/3/library/re.html#re.Match.expand) referring back to the bracket groups in the pattern.
Most simple postcodes only have one spelling variant. In that case, the
**output** can be omitted. The postcode will simply be used as is.
In the Bermuda example above, the canonical spelling would be to have a space
between letters and digits.
!!! Warning
When your postcode pattern covers multiple variants of the postcode, then
you must explicitly state the canonical output or Nominatim will not
handle the variations correctly.
### Other country-specific configuration
There are some other configuration files where you can set localized settings
according to the assigned country. These are:
* [Place ranking configuration](Ranking.md)
Please see the linked documentation sections for more information.

View File

@@ -1,153 +0,0 @@
## Configuring the Import
Which OSM objects are added to the database and which of the tags are used
can be configured via the import style configuration file. This
is a JSON file which contains a list of rules which are matched against every
tag of every object and then assign the tag its specific role.
The style to use is given by the `NOMINATIM_IMPORT_STYLE` configuration
option. There are a number of default styles, which are explained in detail
in the [Import section](../admin/Import.md#filtering-imported-data). These
standard styles may be referenced by their name.
You can also create your own custom style. Put the style file into your
project directory and then set `NOMINATIM_IMPORT_STYLE` to the name of the file.
It is always recommended to start with one of the standard styles and customize
those. You find the standard styles under the name `import-<stylename>.style`
in the standard Nominatim configuration path (usually `/etc/nominatim` or
`/usr/local/etc/nominatim`).
The remainder of the page describes the format of the file.
### Configuration Rules
A single rule looks like this:
```json
{
"keys" : ["key1", "key2", ...],
"values" : {
"value1" : "prop",
"value2" : "prop1,prop2"
}
}
```
A rule first defines a list of keys to apply the rule to. This is always a list
of strings. The string may have four forms. An empty string matches against
any key. A string that ends in an asterisk `*` is a prefix match and accordingly
matches against any key that starts with the given string (minus the `*`). A
suffix match can be defined similarly with a string that starts with a `*`. Any
other string constitutes an exact match.
The second part of the rules defines a list of values and the properties that
apply to a successful match. Value strings may be either empty, which
means that they match any value, or describe an exact match. Prefix
or suffix matching of values is not possible.
For a rule to match, it has to find a valid combination of keys and values. The
resulting property is that of the matched values.
The rules in a configuration file are processed sequentially and the first
match for each tag wins.
A rule where key and value are the empty string is special. This defines the
fallback when none of the rules match. The fallback is always used as a last
resort when nothing else matches, no matter where the rule appears in the file.
Defining multiple fallback rules is not allowed. What happens in this case,
is undefined.
### Tag Properties
One or more of the following properties may be given for each tag:
* `main`
A principal tag. A new row will be added for the object with key and value
as `class` and `type`.
* `with_name`
When the tag is a principal tag (`main` property set): only really add a new
row, if there is any name tag found (a reference tag is not sufficient, see
below).
* `with_name_key`
When the tag is a principal tag (`main` property set): only really add a new
row, if there is also a name tag that matches the key of the principal tag.
For example, if the main tag is `bridge=yes`, then it will only be added as
an extra row, if there is a tag `bridge:name[:XXX]` for the same object.
If this property is set, all other names that are not domain-specific are
ignored.
* `fallback`
When the tag is a principal tag (`main` property set): only really add a new
row, when no other principal tags for this object have been found. Only one
fallback tag can win for an object.
* `operator`
When the tag is a principal tag (`main` property set): also include the
`operator` tag in the list of names. This is a special construct for an
out-dated tagging practise in OSM. Fuel stations and chain restaurants
in particular used to have the name of the chain tagged as `operator`.
These days the chain can be more commonly found in the `brand` tag but
there is still enough old data around to warrant this special case.
* `name`
Add tag to the list of names.
* `ref`
Add tag to the list of names as a reference. At the moment this only means
that the object is not considered to be named for `with_name`.
* `address`
Add tag to the list of address tags. If the tag starts with `addr:` or
`is_in:`, then this prefix is cut off before adding it to the list.
* `postcode`
Add the value as a postcode to the address tags. If multiple tags are
candidate for postcodes, one wins out and the others are dropped.
* `country`
Add the value as a country code to the address tags. The value must be a
two letter country code, otherwise it is ignored. If there are multiple
tags that match, then one wins out and the others are dropped.
* `house`
If no principle tags can be found for the object, still add the object with
`class`=`place` and `type`=`house`. Use this for address nodes that have no
other function.
* `interpolation`
Add this object as an address interpolation (appears as `class`=`place` and
`type`=`houses` in the database).
* `extra`
Add tag to the list of extra tags.
* `skip`
Skip the tag completely. Useful when a custom default fallback is defined
or to define exceptions to rules.
A rule can define as many of these properties for one match as it likes. For
example, if the property is `"main,extra"` then the tag will open a new row
but also have the tag appear in the list of extra tags.
### Changing the Style of Existing Databases
There is normally no issue changing the style of a database that is already
imported and now kept up-to-date with change files. Just be aware that any
change in the style applies to updates only. If you want to change the data
that is already in the database, then a reimport is necessary.

View File

@@ -1,49 +0,0 @@
## Importance
Search requests can yield multiple results which match equally well with
the original query. In such case Nominatim needs to order the results
according to a different criterion: importance. This is a measure for how
likely it is that a user will search for a given place. This section explains
the sources Nominatim uses for computing importance of a place and how to
customize them.
### How importance is computed
The main value for importance is derived from page ranking values for Wikipedia
pages for a place. For places that do not have their own
Wikipedia page, a formula is used that derives a static importance from the
places [search rank](../customize/Ranking.md#search-rank).
In a second step, a secondary importance value is added which is meant to
represent how well-known the general area is where the place is located. It
functions as a tie-breaker between places with very similar primary
importance values.
nominatim.org has preprocessed importance tables for the
[primary Wikipedia rankings](https://nominatim.org/data/wikimedia-importance.sql.gz)
and for a secondary importance based on the number of tile views on openstreetmap.org.
### Customizing secondary importance
The secondary importance is implemented as a simple
[Postgis raster](https://postgis.net/docs/raster.html) table, where Nominatim
looks up the value for the coordinates of the centroid of a place. You can
provide your own secondary importance raster in form of an SQL file named
`secondary_importance.sql.gz` in your project directory.
The SQL file needs to drop and (re)create a table `secondary_importance` which
must as a minimum contain a column `rast` of type `raster`. The raster must
be in EPSG:4326 and contain 16bit unsigned ints
(`raster_constraint_pixel_types(rast) = '{16BUI}'). Any other columns in the
table will be ignored. You must furthermore create an index as follows:
```
CREATE INDEX ON secondary_importance USING gist(ST_ConvexHull(gist))
```
The following raster2pgsql command will create a table that conforms to
the requirements:
```
raster2pgsql -I -C -Y -d -t 128x128 input.tiff public.secondary_importance
```

View File

@@ -1,20 +0,0 @@
Nominatim comes with a predefined set of configuration options that should
work for most standard installations. If you have special requirements, there
are many places where the configuration can be adapted. This chapter describes
the following configurable parts:
* [Global Settings](Settings.md) has a detailed description of all parameters that
can be set in your local `.env` configuration
* [Import styles](Import-Styles.md) explains how to write your own import style
in order to control what kind of OSM data will be imported
* [Place ranking](Ranking.md) describes the configuration around classifing
places in terms of their importance and their role in an address
* [Tokenizers](Tokenizers.md) describes the configuration of the module
responsible for analysing and indexing names
* [Special Phrases](Special-Phrases.md) are common nouns or phrases that
can be used in search to identify a class of places
There are also guides for adding the following external data:
* [US house numbers from the TIGER dataset](Tiger.md)
* [External postcodes](Postcodes.md)

View File

@@ -1,37 +0,0 @@
# External postcode data
Nominatim creates a table of known postcode centroids during import. This table
is used for searches of postcodes and for adding postcodes to places where the
OSM data does not provide one. These postcode centroids are mainly computed
from the OSM data itself. In addition, Nominatim supports reading postcode
information from an external CSV file, to supplement the postcodes that are
missing in OSM.
To enable external postcode support, simply put one CSV file per country into
your project directory and name it `<CC>_postcodes.csv`. `<CC>` must be the
two-letter country code for which to apply the file. The file may also be
gzipped. Then it must be called `<CC>_postcodes.csv.gz`.
The CSV file must use commas as a delimiter and have a header line. Nominatim
expects three columns to be present: `postcode`, `lat` and `lon`. All other
columns are ignored. `lon` and `lat` must describe the x and y coordinates of the
postcode centroids in WGS84.
The postcode files are loaded only when there is data for the given country
in your database. For example, if there is a `us_postcodes.csv` file in your
project directory but you import only an excerpt of Italy, then the US postcodes
will simply be ignored.
As a rule, the external postcode data should be put into the project directory
**before** starting the initial import. Still, you can add, remove and update the
external postcode data at any time. Simply
run:
```
nominatim refresh --postcodes
```
to make the changes visible in your database. Be aware, however, that the changes
only have an immediate effect on searches for postcodes. Postcodes that were
added to places are only updated, when they are reindexed. That usually happens
only during replication updates.

View File

@@ -1,139 +0,0 @@
# Place Ranking in Nominatim
Nominatim uses two metrics to rank a place: search rank and address rank.
This chapter explains what place ranking means and how it can be customized.
## Search rank
The search rank describes the extent and importance of a place. It is used
when ranking search results. Simply put, if there are two results for a
search query which are otherwise equal, then the result with the _lower_
search rank will be appear higher in the result list.
Search ranks are not so important these days because many well-known
places use the Wikipedia importance ranking instead.
The following table gives an overview of the kind of features that Nominatim
expects for each rank:
rank | typical place types | extent
-------|---------------------------------|-------
1-3 | oceans, continents | -
4 | countries | -
5-9 | states, regions, provinces | -
10-12 | counties | -
13-16 | cities, municipalities, islands | 15 km
17-18 | towns, boroughs | 4 km
19 | villages, suburbs | 2 km
20 | hamlets, farms, neighbourhoods | 1 km
21-25 | isolated dwellings, city blocks | 500 m
The extent column describes how far a feature is assumed to reach when it
is mapped only as a point. Larger features like countries and states are usually
available with their exact area in the OpenStreetMap data. That is why no extent
is given.
## Address rank
The address rank describes where a place shows up in an address hierarchy.
Usually only administrative boundaries and place nodes and areas are
eligible to be part of an address. Places that should not appear in the
address must have an address rank of 0.
The following table gives an overview how ranks are mapped to address parts:
rank | address part
-------------|-------------
1-3 | _unused_
4 | country
5-9 | state
10-12 | county
13-16 | city
17-21 | suburb
22-24 | neighbourhood
25 | squares, farms, localities
26-27 | street
28-30 | POI/house number
The country rank 4 usually doesn't show up in the address parts of an object.
The country is determined indirectly from the country code.
Ranks 5-24 can be assigned more or less freely. They make up the major part
of the address.
Rank 25 is also an addressing rank but it is special because while it can be
the parent to a POI with an addr:place of the same name, it cannot be a parent
to streets. Use it for place features that are technically on the same level
as a street (e.g. squares, city blocks) or for places that should not normally
appear in an address unless explicitly tagged so (e.g place=locality which
should be uninhabited and as such not addressable).
The street ranks 26 and 27 are handled slightly differently. Only one object
from these ranks shows up in an address.
For POI level objects like shops, buildings or house numbers always use rank 30.
Ranks 28 is reserved for house number interpolations. 29 is for internal use
only.
## Rank configuration
Search and address ranks are assigned to a place when it is first imported
into the database. There are a few hard-coded rules for the assignment:
* postcodes follow special rules according to their length
* boundaries that are not areas and railway=rail are dropped completely
* the following are always search rank 30 and address rank 0:
* highway nodes
* landuse that is not an area
Other than that, the ranks can be freely assigned via the JSON file according
to their type and the country they are in. The name of the config file to be
used can be changed with the setting `NOMINATIM_ADDRESS_LEVEL_CONFIG`.
The address level configuration must consist of an array of configuration
entries, each containing a tag definition and an optional country array:
```
[ {
"tags" : {
"place" : {
"county" : 12,
"city" : 16,
},
"landuse" : {
"residential" : 22,
"" : 30
}
}
},
{
"countries" : [ "ca", "us" ],
"tags" : {
"boundary" : {
"administrative8" : 18,
"administrative9" : 20
},
"landuse" : {
"residential" : [22, 0]
}
}
}
]
```
The `countries` field contains a list of countries (as ISO 3166-1 alpha 2 code)
for which the definition applies. When the field is omitted, then the
definition is used as a fallback, when nothing more specific for a given
country exists.
`tags` contains the ranks for key/value pairs. The ranks can be either a
single number, in which case they are the search and address rank, or an array
of search and address rank (in that order). The value may be left empty.
Then the rank is used when no more specific value is found for the given
key.
Countries and key/value combination may appear in multiple definitions. Just
make sure that each combination of country/key/value appears only once per
file. Otherwise the import will fail with a UNIQUE INDEX constraint violation
on import.

View File

@@ -1,672 +0,0 @@
This section provides a reference of all configuration parameters that can
be used with Nominatim.
# Configuring Nominatim
Nominatim uses [dotenv](https://github.com/theskumar/python-dotenv) to manage
its configuration settings. There are two means to set configuration
variables: through an `.env` configuration file or through an environment
variable.
The `.env` configuration file needs to be placed into the
[project directory](../admin/Import.md#creating-the-project-directory). It
must contain configuration parameters in `<parameter>=<value>` format.
Please refer to the dotenv documentation for details.
The configuration options may also be set in the form of shell environment
variables. This is particularly useful, when you want to temporarily change
a configuration option. For example, to force the replication serve to
download the next change, you can temporarily disable the update interval:
NOMINATIM_REPLICATION_UPDATE_INTERVAL=0 nominatim replication --once
If a configuration option is defined through .env file and environment
variable, then the latter takes precedence.
## Configuration Parameter Reference
### Import and Database Settings
#### NOMINATIM_DATABASE_DSN
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Database connection string |
| **Format:** | string: `pgsql:<param1>=<value1>;<param2>=<value2>;...` |
| **Default:** | pgsql:dbname=nominatim |
| **After Changes:** | run `nominatim refresh --website` |
Sets the connection parameters for the Nominatim database. At a minimum
the name of the database (`dbname`) is required. You can set any additional
parameter that is understood by libpq. See the [Postgres documentation](https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-PARAMKEYWORDS) for a full list.
!!! note
It is usually recommended not to set the password directly in this
configuration parameter. Use a
[password file](https://www.postgresql.org/docs/current/libpq-pgpass.html)
instead.
#### NOMINATIM_DATABASE_WEBUSER
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Database query user |
| **Format:** | string |
| **Default:** | www-data |
| **After Changes:** | cannot be changed after import |
Defines the name of the database user that will run search queries. Usually
this is the user under which the webserver is executed. When running Nominatim
via php-fpm, you can also define a separate query user. The Postgres user
needs to be set up before starting the import.
Nominatim grants minimal rights to this user to all tables that are needed
for running geocoding queries.
#### NOMINATIM_DATABASE_MODULE_PATH
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Directory where to find the PostgreSQL server module |
| **Format:** | path |
| **Default:** | _empty_ (use `<project_directory>/module`) |
| **After Changes:** | run `nominatim refresh --functions` |
| **Comment:** | Legacy tokenizer only |
Defines the directory in which the PostgreSQL server module `nominatim.so`
is stored. The directory and module must be accessible by the PostgreSQL
server.
For information on how to use this setting when working with external databases,
see [Advanced Installations](../admin/Advanced-Installations.md).
The option is only used by the Legacy tokenizer and ignored otherwise.
#### NOMINATIM_TOKENIZER
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Tokenizer used for normalizing and parsing queries and names |
| **Format:** | string |
| **Default:** | legacy |
| **After Changes:** | cannot be changed after import |
Sets the tokenizer type to use for the import. For more information on
available tokenizers and how they are configured, see
[Tokenizers](../customize/Tokenizers.md).
#### NOMINATIM_TOKENIZER_CONFIG
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Configuration file for the tokenizer |
| **Format:** | path |
| **Default:** | _empty_ (default file depends on tokenizer) |
| **After Changes:** | see documentation for each tokenizer |
Points to the file with additional configuration for the tokenizer.
See the [Tokenizer](../customize/Tokenizers.md) descriptions for details
on the file format.
If a relative path is given, then the file is searched first relative to the
project directory and then in the global settings directory.
#### NOMINATIM_MAX_WORD_FREQUENCY
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Number of occurrences before a word is considered frequent |
| **Format:** | int |
| **Default:** | 50000 |
| **After Changes:** | cannot be changed after import |
| **Comment:** | Legacy tokenizer only |
The word frequency count is used by the Legacy tokenizer to automatically
identify _stop words_. Any partial term that occurs more often then what
is defined in this setting, is effectively ignored during search.
#### NOMINATIM_LIMIT_REINDEXING
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Avoid invalidating large areas |
| **Format:** | bool |
| **Default:** | yes |
Nominatim computes the address of each place at indexing time. This has the
advantage to make search faster but also means that more objects needs to
be invalidated when the data changes. For example, changing the name of
the state of Florida would require recomputing every single address point
in the state to make the new name searchable in conjunction with addresses.
Setting this option to 'yes' means that Nominatim skips reindexing of contained
objects when the area becomes too large.
#### NOMINATIM_UPDATE_FORWARD_DEPENDENCIES
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Forward geometry changes to dependet objects |
| **Format:** | bool |
| **Default:** | no |
| **Comment:** | EXPERT ONLY. Must not be enabled after import. |
The geometry of OSM ways and relations may change when a node that is part
of the object is moved around. These changes are not propagated per default.
The geometry of ways/relations is only updated the next time that the object
itself is touched. When enabling this option, then dependent objects will
be marked for update when one of its member objects changes.
Enabling this option may slow down updates significantly.
!!! warning
If you want to enable this option, it must be set already on import.
Do not enable this option on an existing database that was imported with
NOMINATIM_UPDATE_FORWARD_DEPENDENCIES=no.
Updates will become unusably slow.
#### NOMINATIM_LANGUAGES
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Restrict search languages |
| **Format:** | string: comma-separated list of language codes |
| **Default:** | _empty_ |
Normally Nominatim will include all language variants of name:XX
in the search index. Set this to a comma separated list of language
codes, to restrict import to a subset of languages.
Currently only affects the initial import of country names and special phrases.
#### NOMINATIM_TERM_NORMALIZATION
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Rules for normalizing terms for comparisons |
| **Format:** | string: semicolon-separated list of ICU rules |
| **Default:** | :: NFD (); [[:Nonspacing Mark:] [:Cf:]] >; :: lower (); [[:Punctuation:][:Space:]]+ > ' '; :: NFC (); |
| **Comment:** | Legacy tokenizer only |
[Special phrases](Special-Phrases.md) have stricter matching requirements than
normal search terms. They must appear exactly in the query after this term
normalization has been applied.
Only has an effect on the Legacy tokenizer. For the ICU tokenizer the rules
defined in the
[normalization section](Tokenizers.md#normalization-and-transliteration)
will be used.
#### NOMINATIM_USE_US_TIGER_DATA
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Enable searching for Tiger house number data |
| **Format:** | boolean |
| **Default:** | no |
| **After Changes:** | run `nominatim refresh --functions` |
When this setting is enabled, search and reverse queries also take data
from [Tiger house number data](Tiger.md) into account.
#### NOMINATIM_USE_AUX_LOCATION_DATA
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Enable searching in external house number tables |
| **Format:** | boolean |
| **Default:** | no |
| **After Changes:** | run `nominatim refresh --functions` |
| **Comment:** | Do not use. |
When this setting is enabled, search queries also take data from external
house number tables into account.
*Warning:* This feature is currently unmaintained and should not be used.
#### NOMINATIM_HTTP_PROXY
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Use HTTP proxy when downloading data |
| **Format:** | boolean |
| **Default:** | no |
When this setting is enabled and at least
[NOMINATIM_HTTP_PROXY_HOST](#nominatim_http_proxy_host) and
[NOMINATIM_HTTP_PROXY_PORT](#nominatim_http_proxy_port) are set, the
configured proxy will be used, when downloading external data like
replication diffs.
#### NOMINATIM_HTTP_PROXY_HOST
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Host name of the proxy to use |
| **Format:** | string |
| **Default:** | _empty_ |
When [NOMINATIM_HTTP_PROXY](#nominatim_http_proxy) is enabled, this setting
configures the proxy host name.
#### NOMINATIM_HTTP_PROXY_PORT
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Port number of the proxy to use |
| **Format:** | integer |
| **Default:** | 3128 |
When [NOMINATIM_HTTP_PROXY](#nominatim_http_proxy) is enabled, this setting
configures the port number to use with the proxy.
#### NOMINATIM_HTTP_PROXY_LOGIN
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Username for proxies that require login |
| **Format:** | string |
| **Default:** | _empty_ |
When [NOMINATIM_HTTP_PROXY](#nominatim_http_proxy) is enabled, use this
setting to define the username for proxies that require a login.
#### NOMINATIM_HTTP_PROXY_PASSWORD
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Password for proxies that require login |
| **Format:** | string |
| **Default:** | _empty_ |
When [NOMINATIM_HTTP_PROXY](#nominatim_http_proxy) is enabled, use this
setting to define the password for proxies that require a login.
#### NOMINATIM_OSM2PGSQL_BINARY
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Location of the osm2pgsql binary |
| **Format:** | path |
| **Default:** | _empty_ (use binary shipped with Nominatim) |
| **Comment:** | EXPERT ONLY |
Nominatim uses [osm2pgsql](https://osm2pgsql.org) to load the OSM data
initially into the database. Nominatim comes bundled with a version of
osm2pgsql that is guaranteed to be compatible. Use this setting to use
a different binary instead. You should do this only when you know exactly
what you are doing. If the osm2pgsql version is not compatible, then the
result is undefined.
#### NOMINATIM_WIKIPEDIA_DATA_PATH
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Directory with the wikipedia importance data |
| **Format:** | path |
| **Default:** | _empty_ (project directory) |
Set a custom location for the
[wikipedia ranking file](../admin/Import.md#wikipediawikidata-rankings). When
unset, Nominatim expects the data to be saved in the project directory.
#### NOMINATIM_ADDRESS_LEVEL_CONFIG
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Configuration file for rank assignments |
| **Format:** | path |
| **Default:** | address-levels.json |
The _address level configuration_ defines the rank assignments for places. See
[Place Ranking](Ranking.md) for a detailed explanation what rank assignments
are and what the configuration file must look like.
When a relative path is given, then the file is searched first relative to the
project directory and then in the global settings directory.
#### NOMINATIM_IMPORT_STYLE
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Configuration to use for the initial OSM data import |
| **Format:** | string or path |
| **Default:** | extratags |
The _style configuration_ describes which OSM objects and tags are taken
into consideration for the search database. Nominatim comes with a set
of pre-configured styles, that may be configured here.
You can also write your own custom style and point the setting to the file
with the style. When a relative path is given, then the style file is searched
first relative to the project directory and then in the global settings
directory.
See [Import Styles](Import-Styles.md)
for more information on the available internal styles and the format of the
configuration file.
#### NOMINATIM_FLATNODE_FILE
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Location of osm2pgsql flatnode file |
| **Format:** | path |
| **Default:** | _empty_ (do not use a flatnote file) |
| **After Changes:** | Only change when moving the file physically. |
The `osm2pgsql flatnode file` is file that efficiently stores geographic
location for OSM nodes. For larger imports it can significantly speed up
the import. When this option is unset, then osm2pgsql uses a PsotgreSQL table
to store the locations.
When a relative path is given, then the flatnode file is created/searched
relative to the project directory.
!!! warning
The flatnode file is not only used during the initial import but also
when adding new data with `nominatim add-data` or `nominatim replication`.
Make sure you keep the flatnode file around and this setting unmodified,
if you plan to add more data or run regular updates.
#### NOMINATIM_TABLESPACE_*
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Group of settings for distributing the database over tablespaces |
| **Format:** | string |
| **Default:** | _empty_ (do not use a table space) |
| **After Changes:** | no effect after initial import |
Nominatim allows to distribute the search database over up to 10 different
[PostgreSQL tablespaces](https://www.postgresql.org/docs/current/manage-ag-tablespaces.html).
If you use this option, make sure that the tablespaces exist before starting
the import.
The available tablespace groups are:
NOMINATIM_TABLESPACE_SEARCH_DATA
: Data used by the geocoding frontend.
NOMINATIM_TABLESPACE_SEARCH_INDEX
: Indexes used by the geocoding frontend.
NOMINATIM_TABLESPACE_OSM_DATA
: Raw OSM data cache used for import and updates.
NOMINATIM_TABLESPACE_OSM_DATA
: Indexes on the raw OSM data cache.
NOMINATIM_TABLESPACE_PLACE_DATA
: Data table with the pre-filtered but still unprocessed OSM data.
Used only during imports and updates.
NOMINATIM_TABLESPACE_PLACE_INDEX
: Indexes on raw data table. Used only during imports and updates.
NOMINATIM_TABLESPACE_ADDRESS_DATA
: Data tables used for computing search terms and addresses of places
during import and updates.
NOMINATIM_TABLESPACE_ADDRESS_INDEX
: Indexes on the data tables for search term and address computation.
Used only for import and updates.
NOMINATIM_TABLESPACE_AUX_DATA
: Auxiliary data tables for non-OSM data, e.g. for Tiger house number data.
NOMINATIM_TABLESPACE_AUX_INDEX
: Indexes on auxiliary data tables.
### Replication Update Settings
#### NOMINATIM_REPLICATION_URL
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Base URL of the replication service |
| **Format:** | url |
| **Default:** | https://planet.openstreetmap.org/replication/minute |
| **After Changes:** | run `nominatim replication --init` |
Replication services deliver updates to OSM data. Use this setting to choose
which replication service to use. See [Updates](../admin/Update.md) for more
information on how to set up regular updates.
#### NOMINATIM_REPLICATION_MAX_DIFF
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Maximum amount of data to download per update cycle (in MB) |
| **Format:** | integer |
| **Default:** | 50 |
| **After Changes:** | restart the replication process |
At each update cycle Nominatim downloads diffs until either no more diffs
are available on the server (i.e. the database is up-to-date) or the limit
given in this setting is exceeded. Nominatim guarantees to downloads at least
one diff, if one is available, no matter how small the setting.
The default for this setting is fairly conservative because Nominatim keeps
all data downloaded in one cycle in RAM. Using large values in a production
server may interfere badly with the search frontend because it evicts data
from RAM that is needed for speedy answers to incoming requests. It is usually
a better idea to keep this setting lower and run multiple update cycles
to catch up with updates.
When catching up in non-production mode, for example after the initial import,
the setting can easily be changed temporarily on the command line:
NOMINATIM_REPLICATION_MAX_DIFF=3000 nominatim replication
#### NOMINATIM_REPLICATION_UPDATE_INTERVAL
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Publication interval of the replication service (in seconds) |
| **Format:** | integer |
| **Default:** | 75 |
| **After Changes:** | restart the replication process |
This setting determines when Nominatim will attempt to download again a new
update. The time is computed from the publication date of the last diff
downloaded. Setting this to a slightly higher value than the actual
publication interval avoids unnecessary rechecks.
#### NOMINATIM_REPLICATION_RECHECK_INTERVAL
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Wait time to recheck for a pending update (in seconds) |
| **Format:** | integer |
| **Default:** | 60 |
| **After Changes:** | restart the replication process |
When replication updates are run in continuous mode (using `nominatim replication`),
this setting determines how long Nominatim waits until it looks for updates
again when updates were not available on the server.
Note that this is different from
[NOMINATIM_REPLICATION_UPDATE_INTERVAL](#nominatim_replication_update_interval).
Nominatim will never attempt to query for new updates for UPDATE_INTERVAL
seconds after the current database date. Only after the update interval has
passed it asks for new data. If then no new data is found, it waits for
RECHECK_INTERVAL seconds before it attempts again.
### API Settings
#### NOMINATIM_CORS_NOACCESSCONTROL
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Send permissive CORS access headers |
| **Format:** | boolean |
| **Default:** | yes |
| **After Changes:** | run `nominatim refresh --website` |
When this setting is enabled, API HTTP responses include the HTTP
[CORS](https://en.wikipedia.org/wiki/CORS) headers
`access-control-allow-origin: *` and `access-control-allow-methods: OPTIONS,GET`.
#### NOMINATIM_MAPICON_URL
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | URL prefix for static icon images |
| **Format:** | url |
| **Default:** | _empty_ |
| **After Changes:** | run `nominatim refresh --website` |
When a mapicon URL is configured, then Nominatim includes an additional `icon`
field in the responses, pointing to an appropriate icon for the place type.
Map icons used to be included in Nominatim itself but now have moved to the
[nominatim-ui](https://github.com/osm-search/nominatim-ui/) project. If you
want the URL to be included in API responses, make the `/mapicon`
directory of the project available under a public URL and point this setting
to the directory.
#### NOMINATIM_DEFAULT_LANGUAGE
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Language of responses when no language is requested |
| **Format:** | language code |
| **Default:** | _empty_ (use the local language of the feature) |
| **After Changes:** | run `nominatim refresh --website` |
Nominatim localizes the place names in responses when the corresponding
translation is available. Users can request a custom language setting through
the HTTP accept-languages header or through the explicit parameter
[accept-languages](../api/Search.md#language-of-results). If neither is
given, it falls back to this setting. If the setting is also empty, then
the local languages (in OSM: the name tag without any language suffix) is
used.
#### NOMINATIM_SEARCH_BATCH_MODE
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Enable a special batch query mode |
| **Format:** | boolean |
| **Default:** | no |
| **After Changes:** | run `nominatim refresh --website` |
This feature is currently undocumented and potentially broken.
#### NOMINATIM_SEARCH_NAME_ONLY_THRESHOLD
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Threshold for switching the search index lookup strategy |
| **Format:** | integer |
| **Default:** | 500 |
| **After Changes:** | run `nominatim refresh --website` |
This setting defines the threshold over which a name is no longer considered
as rare. When searching for places with rare names, only the name is used
for place lookups. Otherwise the name and any address information is used.
This setting only has an effect after `nominatim refresh --word-counts` has
been called to compute the word frequencies.
#### NOMINATIM_LOOKUP_MAX_COUNT
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Maximum number of OSM ids accepted by /lookup |
| **Format:** | integer |
| **Default:** | 50 |
| **After Changes:** | run `nominatim refresh --website` |
The /lookup point accepts list of ids to look up address details for. This
setting restricts the number of places a user may look up with a single
request.
#### NOMINATIM_POLYGON_OUTPUT_MAX_TYPES
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Number of different geometry formats that may be returned |
| **Format:** | integer |
| **Default:** | 1 |
| **After Changes:** | run `nominatim refresh --website` |
Nominatim supports returning full geometries of places. The geometries may
be requested in different formats with one of the
[`polygon_*` parameters](../api/Search.md#polygon-output). Use this
setting to restrict the number of geometry types that may be requested
with a single query.
Setting this parameter to 0 disables polygon output completely.
### Logging Settings
#### NOMINATIM_LOG_DB
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Log requests into the database |
| **Format:** | boolean |
| **Default:** | no |
| **After Changes:** | run `nominatim refresh --website` |
Enable logging requests into a database table with this setting. The logs
can be found in the table `new_query_log`.
When using this logging method, it is advisable to set up a job that
regularly clears out old logging information. Nominatim will not do that
on its own.
Can be used as the same time as NOMINATIM_LOG_FILE.
#### NOMINATIM_LOG_FILE
| Summary | |
| -------------- | --------------------------------------------------- |
| **Description:** | Log requests into a file |
| **Format:** | path |
| **Default:** | _empty_ (logging disabled) |
| **After Changes:** | run `nominatim refresh --website` |
Enable logging of requests into a file with this setting by setting the log
file where to log to. A relative file name is assumed to be relative to
the project directory.
The entries in the log file have the following format:
<request time> <execution time in s> <number of results> <type> "<query string>"
Request time is the time when the request was started. The execution time is
given in seconds and corresponds to the time the query took executing in PHP.
type contains the name of the endpoint used.
Can be used as the same time as NOMINATIM_LOG_DB.

View File

@@ -1,34 +0,0 @@
# Special phrases
## Importing OSM user-maintained special phrases
As described in the [Import section](../admin/Import.md), it is possible to
import special phrases from the wiki with the following command:
```sh
nominatim special-phrases --import-from-wiki
```
## Importing custom special phrases
But, it is also possible to import some phrases from a csv file.
To do so, you have access to the following command:
```sh
nominatim special-phrases --import-from-csv <csv file>
```
Note that the two previous import commands will update the phrases from your database.
This means that if you import some phrases from a csv file, only the phrases
present in the csv file will be kept into the database. All other phrases will
be removed.
If you want to only add new phrases and not update the other ones you can add
the argument `--no-replace` to the import command. For example:
```sh
nominatim special-phrases --import-from-csv <csv file> --no-replace
```
This will add the phrases present in the csv file into the database without
removing the other ones.

View File

@@ -1,28 +0,0 @@
# Installing TIGER housenumber data for the US
Nominatim is able to use the official [TIGER](https://www.census.gov/geographies/mapping-files/time-series/geo/tiger-line-file.html)
address set to complement the OSM house number data in the US. You can add
TIGER data to your own Nominatim instance by following these steps. The
entire US adds about 10GB to your database.
1. Get preprocessed TIGER data:
cd $PROJECT_DIR
wget https://nominatim.org/data/tiger-nominatim-preprocessed-latest.csv.tar.gz
2. Import the data into your Nominatim database:
nominatim add-data --tiger-data tiger-nominatim-preprocessed-latest.csv.tar.gz
3. Enable use of the Tiger data in your existing `.env` file by adding:
echo NOMINATIM_USE_US_TIGER_DATA=yes >> .env
4. Apply the new settings:
nominatim refresh --functions --website
See the [TIGER-data project](https://github.com/osm-search/TIGER-data) for more
information on how the data got preprocessed.

View File

@@ -1,391 +0,0 @@
# Tokenizers
The tokenizer module in Nominatim is responsible for analysing the names given
to OSM objects and the terms of an incoming query in order to make sure, they
can be matched appropriately.
Nominatim offers different tokenizer modules, which behave differently and have
different configuration options. This sections describes the tokenizers and how
they can be configured.
!!! important
The use of a tokenizer is tied to a database installation. You need to choose
and configure the tokenizer before starting the initial import. Once the import
is done, you cannot switch to another tokenizer anymore. Reconfiguring the
chosen tokenizer is very limited as well. See the comments in each tokenizer
section.
## Legacy tokenizer
The legacy tokenizer implements the analysis algorithms of older Nominatim
versions. It uses a special Postgresql module to normalize names and queries.
This tokenizer is automatically installed and used when upgrading an older
database. It should not be used for new installations anymore.
### Compiling the PostgreSQL module
The tokeinzer needs a special C module for PostgreSQL which is not compiled
by default. If you need the legacy tokenizer, compile Nominatim as follows:
```
mkdir build
cd build
cmake -DBUILD_MODULE=on
make
```
### Enabling the tokenizer
To enable the tokenizer add the following line to your project configuration:
```
NOMINATIM_TOKENIZER=legacy
```
The Postgresql module for the tokenizer is available in the `module` directory
and also installed with the remainder of the software under
`lib/nominatim/module/nominatim.so`. You can specify a custom location for
the module with
```
NOMINATIM_DATABASE_MODULE_PATH=<path to directory where nominatim.so resides>
```
This is in particular useful when the database runs on a different server.
See [Advanced installations](../admin/Advanced-Installations.md#importing-nominatim-to-an-external-postgresql-database) for details.
There are no other configuration options for the legacy tokenizer. All
normalization functions are hard-coded.
## ICU tokenizer
The ICU tokenizer uses the [ICU library](http://site.icu-project.org/) to
normalize names and queries. It also offers configurable decomposition and
abbreviation handling.
This tokenizer is currently the default.
To enable the tokenizer add the following line to your project configuration:
```
NOMINATIM_TOKENIZER=icu
```
### How it works
On import the tokenizer processes names in the following three stages:
1. During the **Sanitizer step** incoming names are cleaned up and converted to
**full names**. This step can be used to regularize spelling, split multi-name
tags into their parts and tag names with additional attributes. See the
[Sanitizers section](#sanitizers) below for available cleaning routines.
2. The **Normalization** part removes all information from the full names
that are not relevant for search.
3. The **Token analysis** step takes the normalized full names and creates
all transliterated variants under which the name should be searchable.
See the [Token analysis](#token-analysis) section below for more
information.
During query time, only normalization and transliteration are relevant.
An incoming query is first split into name chunks (this usually means splitting
the string at the commas) and the each part is normalised and transliterated.
The result is used to look up places in the search index.
### Configuration
The ICU tokenizer is configured using a YAML file which can be configured using
`NOMINATIM_TOKENIZER_CONFIG`. The configuration is read on import and then
saved as part of the internal database status. Later changes to the variable
have no effect.
Here is an example configuration file:
``` yaml
normalization:
- ":: lower ()"
- "ß > 'ss'" # German szet is unimbigiously equal to double ss
transliteration:
- !include /etc/nominatim/icu-rules/extended-unicode-to-asccii.yaml
- ":: Ascii ()"
sanitizers:
- step: split-name-list
token-analysis:
- analyzer: generic
variants:
- !include icu-rules/variants-ca.yaml
- words:
- road -> rd
- bridge -> bdge,br,brdg,bri,brg
mutations:
- pattern: 'ä'
replacements: ['ä', 'ae']
```
The configuration file contains four sections:
`normalization`, `transliteration`, `sanitizers` and `token-analysis`.
#### Normalization and Transliteration
The normalization and transliteration sections each define a set of
ICU rules that are applied to the names.
The **normalisation** rules are applied after sanitation. They should remove
any information that is not relevant for search at all. Usual rules to be
applied here are: lower-casing, removing of special characters, cleanup of
spaces.
The **transliteration** rules are applied at the end of the tokenization
process to transfer the name into an ASCII representation. Transliteration can
be useful to allow for further fuzzy matching, especially between different
scripts.
Each section must contain a list of
[ICU transformation rules](https://unicode-org.github.io/icu/userguide/transforms/general/rules.html).
The rules are applied in the order in which they appear in the file.
You can also include additional rules from external yaml file using the
`!include` tag. The included file must contain a valid YAML list of ICU rules
and may again include other files.
!!! warning
The ICU rule syntax contains special characters that conflict with the
YAML syntax. You should therefore always enclose the ICU rules in
double-quotes.
#### Sanitizers
The sanitizers section defines an ordered list of functions that are applied
to the name and address tags before they are further processed by the tokenizer.
They allows to clean up the tagging and bring it to a standardized form more
suitable for building the search index.
!!! hint
Sanitizers only have an effect on how the search index is built. They
do not change the information about each place that is saved in the
database. In particular, they have no influence on how the results are
displayed. The returned results always show the original information as
stored in the OpenStreetMap database.
Each entry contains information of a sanitizer to be applied. It has a
mandatory parameter `step` which gives the name of the sanitizer. Depending
on the type, it may have additional parameters to configure its operation.
The order of the list matters. The sanitizers are applied exactly in the order
that is configured. Each sanitizer works on the results of the previous one.
The following is a list of sanitizers that are shipped with Nominatim.
##### split-name-list
::: nominatim.tokenizer.sanitizers.split_name_list
selection:
members: False
rendering:
heading_level: 6
##### strip-brace-terms
::: nominatim.tokenizer.sanitizers.strip_brace_terms
selection:
members: False
rendering:
heading_level: 6
##### tag-analyzer-by-language
::: nominatim.tokenizer.sanitizers.tag_analyzer_by_language
selection:
members: False
rendering:
heading_level: 6
##### clean-housenumbers
::: nominatim.tokenizer.sanitizers.clean_housenumbers
selection:
members: False
rendering:
heading_level: 6
##### clean-postcodes
::: nominatim.tokenizer.sanitizers.clean_postcodes
selection:
members: False
rendering:
heading_level: 6
##### clean-tiger-tags
::: nominatim.tokenizer.sanitizers.clean_tiger_tags
selection:
members: False
rendering:
heading_level: 6
#### Token Analysis
Token analyzers take a full name and transform it into one or more normalized
form that are then saved in the search index. In its simplest form, the
analyzer only applies the transliteration rules. More complex analyzers
create additional spelling variants of a name. This is useful to handle
decomposition and abbreviation.
The ICU tokenizer may use different analyzers for different names. To select
the analyzer to be used, the name must be tagged with the `analyzer` attribute
by a sanitizer (see for example the
[tag-analyzer-by-language sanitizer](#tag-analyzer-by-language)).
The token-analysis section contains the list of configured analyzers. Each
analyzer must have an `id` parameter that uniquely identifies the analyzer.
The only exception is the default analyzer that is used when no special
analyzer was selected. There are analysers with special ids:
* '@housenumber'. If an analyzer with that name is present, it is used
for normalization of house numbers.
* '@potcode'. If an analyzer with that name is present, it is used
for normalization of postcodes.
Different analyzer implementations may exist. To select the implementation,
the `analyzer` parameter must be set. The different implementations are
described in the following.
##### Generic token analyzer
The generic analyzer `generic` is able to create variants from a list of given
abbreviation and decomposition replacements and introduce spelling variations.
###### Variants
The optional 'variants' section defines lists of replacements which create alternative
spellings of a name. To create the variants, a name is scanned from left to
right and the longest matching replacement is applied until the end of the
string is reached.
The variants section must contain a list of replacement groups. Each group
defines a set of properties that describes where the replacements are
applicable. In addition, the word section defines the list of replacements
to be made. The basic replacement description is of the form:
```
<source>[,<source>[...]] => <target>[,<target>[...]]
```
The left side contains one or more `source` terms to be replaced. The right side
lists one or more replacements. Each source is replaced with each replacement
term.
!!! tip
The source and target terms are internally normalized using the
normalization rules given in the configuration. This ensures that the
strings match as expected. In fact, it is better to use unnormalized
words in the configuration because then it is possible to change the
rules for normalization later without having to adapt the variant rules.
###### Decomposition
In its standard form, only full words match against the source. There
is a special notation to match the prefix and suffix of a word:
``` yaml
- ~strasse => str # matches "strasse" as full word and in suffix position
- hinter~ => hntr # matches "hinter" as full word and in prefix position
```
There is no facility to match a string in the middle of the word. The suffix
and prefix notation automatically trigger the decomposition mode: two variants
are created for each replacement, one with the replacement attached to the word
and one separate. So in above example, the tokenization of "hauptstrasse" will
create the variants "hauptstr" and "haupt str". Similarly, the name "rote strasse"
triggers the variants "rote str" and "rotestr". By having decomposition work
both ways, it is sufficient to create the variants at index time. The variant
rules are not applied at query time.
To avoid automatic decomposition, use the '|' notation:
``` yaml
- ~strasse |=> str
```
simply changes "hauptstrasse" to "hauptstr" and "rote strasse" to "rote str".
###### Initial and final terms
It is also possible to restrict replacements to the beginning and end of a
name:
``` yaml
- ^south => s # matches only at the beginning of the name
- road$ => rd # matches only at the end of the name
```
So the first example would trigger a replacement for "south 45th street" but
not for "the south beach restaurant".
###### Replacements vs. variants
The replacement syntax `source => target` works as a pure replacement. It changes
the name instead of creating a variant. To create an additional version, you'd
have to write `source => source,target`. As this is a frequent case, there is
a shortcut notation for it:
```
<source>[,<source>[...]] -> <target>[,<target>[...]]
```
The simple arrow causes an additional variant to be added. Note that
decomposition has an effect here on the source as well. So a rule
``` yaml
- "~strasse -> str"
```
means that for a word like `hauptstrasse` four variants are created:
`hauptstrasse`, `haupt strasse`, `hauptstr` and `haupt str`.
###### Mutations
The 'mutation' section in the configuration describes an additional set of
replacements to be applied after the variants have been computed.
Each mutation is described by two parameters: `pattern` and `replacements`.
The pattern must contain a single regular expression to search for in the
variant name. The regular expressions need to follow the syntax for
[Python regular expressions](file:///usr/share/doc/python3-doc/html/library/re.html#regular-expression-syntax).
Capturing groups are not permitted.
`replacements` must contain a list of strings that the pattern
should be replaced with. Each occurrence of the pattern is replaced with
all given replacements. Be mindful of combinatorial explosion of variants.
###### Modes
The generic analyser supports a special mode `variant-only`. When configured
then it consumes the input token and emits only variants (if any exist). Enable
the mode by adding:
```
mode: variant-only
```
to the analyser configuration.
##### Housenumber token analyzer
The analyzer `housenumbers` is purpose-made to analyze house numbers. It
creates variants with optional spaces between numbers and letters. Thus,
house numbers of the form '3 a', '3A', '3-A' etc. are all considered equivalent.
The analyzer cannot be customized.
##### Postcode token analyzer
The analyzer `postcodes` is pupose-made to analyze postcodes. It supports
a 'lookup' varaint of the token, which produces variants with optional
spaces. Use together with the clean-postcodes sanitizer.
The analyzer cannot be customized.
### Reconfiguration
Changing the configuration after the import is currently not possible, although
this feature may be added at a later time.

View File

@@ -1,167 +0,0 @@
# Database Layout
### Import tables
OSM data is initially imported using [osm2pgsql](https://osm2pgsql.org).
Nominatim uses its own data output style 'gazetteer', which differs from the
output style created for map rendering.
The import process creates the following tables:
![osm2pgsql tables](osm2pgsql-tables.svg)
The `planet_osm_*` tables are the usual backing tables for OSM data. Note
that Nominatim uses them to look up special relations and to find nodes on
ways.
The gazetteer style produces a single table `place` as output with the following
columns:
* `osm_type` - kind of OSM object (**N** - node, **W** - way, **R** - relation)
* `osm_id` - original OSM ID
* `class` - key of principal tag defining the object type
* `type` - value of principal tag defining the object type
* `name` - collection of tags that contain a name or reference
* `admin_level` - numerical value of the tagged administrative level
* `address` - collection of tags defining the address of an object
* `extratags` - collection of additional interesting tags that are not
directly relevant for searching
* `geometry` - geometry of the object (in WGS84)
A single OSM object may appear multiple times in this table when it is tagged
with multiple tags that may constitute a principal tag. Take for example a
motorway bridge. In OSM, this would be a way which is tagged with
`highway=motorway` and `bridge=yes`. This way would appear in the `place` table
once with `class` of `highway` and once with a `class` of `bridge`. Thus the
*unique key* for `place` is (`osm_type`, `osm_id`, `class`).
How raw OSM tags are mapped to the columns in the place table is to a certain
degree configurable. See [Customizing Import Styles](../customize/Import-Styles.md)
for more information.
### Search tables
The following tables carry all information needed to do the search:
![search tables](search-tables.svg)
The **placex** table is the central table that saves all information about the
searchable places in Nominatim. The basic columns are the same as for the
place table and have the same meaning. The placex tables adds the following
additional columns:
* `place_id` - the internal unique ID to identify the place
* `partition` - the id to use with partitioned tables (see below)
* `geometry_sector` - a location hash used for geographically close ordering
* `parent_place_id` - the next higher place in the address hierarchy, only
relevant for POI-type places (with rank 30)
* `linked_place_id` - place ID of the place this object has been merged with.
When this ID is set, then the place is invisible for search.
* `importance` - measure how well known the place is
* `rank_search`, `rank_address` - search and address rank (see [Customizing ranking](../customize/Ranking.md)
* `wikipedia` - the wikipedia page used for computing the importance of the place
* `country_code` - the country the place is located in
* `housenumber` - normalized housenumber, if the place has one
* `postcode` - computed postcode for the place
* `indexed_status` - processing status of the place (0 - ready, 1 - freshly inserted, 2 - needs updating, 100 - needs deletion)
* `indexed_date` - timestamp when the place was processed last
* `centroid` - a point feature for the place
The **location_property_osmline** table is a special table for
[address interpolations](https://wiki.openstreetmap.org/wiki/Addresses#Using_interpolation).
The columns have the same meaning and use as the columns with the same name in
the placex table. Only three columns are special:
* `startnumber` and `endnumber` - beginning and end of the number range
for the interpolation
* `interpolationtype` - a string `odd`, `even` or `all` to indicate
the interval between the numbers
Address interpolations are always ways in OSM, which is why there is no column
`osm_type`.
The **location_postcode** table holds computed centroids of all postcodes that
can be found in the OSM data. The meaning of the columns is again the same
as that of the placex table.
Every place needs an address, a set of surrounding places that describe the
location of the place. The set of address places is made up of OSM places
themselves. The **place_addressline** table cross-references for each place
all the places that make up its address. Two columns define the address
relation:
* `place_id` - reference to the place being addressed
* `address_place_id` - reference to the place serving as an address part
The most of the columns cache information from the placex entry of the address
part. The exceptions are:
* `fromarea` - is true if the address part has an area geometry and can
therefore be considered preceise
* `isaddress` - is true if the address part should show up in the address
output. Sometimes there are multiple places competing for for same address
type (e.g. multiple cities) and this field resolves the tie.
The **search_name** table contains the search index proper. It saves for each
place the terms with which the place can be found. The terms are split into
the name itself and all terms that make up the address. The table mirrors some
of the columns from placex for faster lookup.
Search terms are not saved as strings. Each term is assigned an integer and those
integers are saved in the name and address vectors of the search_name table. The
**word** table serves as the lookup table from string to such a word ID. The
exact content of the word table depends on the [tokenizer](Tokenizers.md) used.
## Address computation tables
Next to the main search tables, there is a set of secondary helper tables used
to compute the address relations between places. These tables are partitioned.
Each country is assigned a partition number in the country_name table (see
below) and the data is then split between a set of tables, one for each
partition. Note that Nominatim still manually manages partitioned tables.
Native support for partitions in PostgreSQL only became usable with version 13.
It will be a little while before Nominatim drops support for older versions.
![address tables](address-tables.svg)
The **search_name_X** tables are used to look up streets that appear in the
`addr:street` tag.
The **location_area_large_X** tables are used to look up larger areas
(administrative boundaries and place nodes) either through their geographic
closeness or through `addr:*` entries.
The **location_road_X** tables are used to find the closest street for a
dependent place.
All three table cache specific information from the placex table for their
selected subset of places:
* `keywords` and `name_vector` contain lists of term ids (from the word table)
that the full name of the place should match against
* `isguess` is true for places that are not described by an area
All other columns reflect their counterpart in the placex table.
## Static data tables
Nominatim also creates a number of static tables at import:
* `nominatim_properties` saves settings that must not be changed after
import
* `address_levels` save the rank information from the
[ranking configuration](../customize/Ranking.md)
* `country_name` contains a fallback of names for all countries, their
default languages and saves the assignment of countries to partitions.
* `country_osm_grid` provides a fallback for country geometries
## Auxiliary data tables
Finally there are some table for auxiliary data:
* `location_property_tiger` - saves housenumber from the Tiger import. Its
layout is similar to that of `location_propoerty_osmline`.
* `place_class_*` tables are helper tables to facilitate lookup of POIs
by their class and type. They exist because it is not possible to create
combined indexes with geometries.

View File

@@ -1,133 +0,0 @@
# Setting up Nominatim for Development
This chapter gives an overview how to set up Nominatim for development
and how to run tests.
!!! Important
This guide assumes that you develop under the latest version of Ubuntu. You
can of course also use your favourite distribution. You just might have to
adapt the commands below slightly, in particular the commands for installing
additional software.
## Installing Nominatim
The first step is to install Nominatim itself. Please follow the installation
instructions in the [Admin section](../admin/Installation.md). You don't need
to set up a webserver for development, the webserver that is included with PHP
is sufficient.
If you want to run Nominatim in a VM via Vagrant, use the default `ubuntu` setup.
Vagrant's libvirt provider runs out-of-the-box under Ubuntu. You also need to
install an NFS daemon to enable directory sharing between host and guest. The
following packages should get you started:
sudo apt install vagrant vagrant-libvirt libvirt-daemon nfs-kernel-server
## Prerequisites for testing and documentation
The Nominatim test suite consists of behavioural tests (using behave) and
unit tests (using PHPUnit for PHP code and pytest for Python code).
It has the following additional requirements:
* [behave test framework](https://behave.readthedocs.io) >= 1.2.6
* [phpunit](https://phpunit.de) (9.5 is known to work)
* [PHP CodeSniffer](https://github.com/squizlabs/PHP_CodeSniffer)
* [Pylint](https://pylint.org/) (CI always runs the latest version from pip)
* [mypy](http://mypy-lang.org/) (plus typing information for external libs)
* [Python Typing Extensions](https://github.com/python/typing_extensions) (for Python < 3.9)
* [pytest](https://pytest.org)
The documentation is built with mkdocs:
* [mkdocs](https://www.mkdocs.org/) >= 1.1.2
* [mkdocstrings](https://mkdocstrings.github.io/) >= 0.16
* [mkdocstrings-python-legacy](https://mkdocstrings.github.io/python-legacy/)
### Installing prerequisites on Ubuntu/Debian
Some of the Python packages require the newest version which is not yet
available with the current distributions. Therefore it is recommended to
install pip to get the newest versions.
To install all necessary packages run:
```sh
sudo apt install php-cgi phpunit php-codesniffer \
python3-pip python3-setuptools python3-dev
pip3 install --user behave mkdocs mkdocstrings pytest pylint \
mypy types-PyYAML types-jinja2 types-psycopg2 types-psutil
```
The `mkdocs` executable will be located in `.local/bin`. You may have to add
this directory to your path, for example by running:
```
echo 'export PATH=~/.local/bin:$PATH' > ~/.profile
```
If your distribution does not have PHPUnit 7.3+, you can install it (as well
as CodeSniffer) via composer:
```
sudo apt-get install composer
composer global require "squizlabs/php_codesniffer=*"
composer global require "phpunit/phpunit=8.*"
```
The binaries are found in `.config/composer/vendor/bin`. You need to add this
to your PATH as well:
```
echo 'export PATH=~/.config/composer/vendor/bin:$PATH' > ~/.profile
```
## Executing Tests
All tests are located in the `/test` directory.
To run all tests just go to the build directory and run make:
```sh
cd build
make test
```
For more information about the structure of the tests and how to change and
extend the test suite, see the [Testing chapter](Testing.md).
## Documentation Pages
The [Nominatim documentation](https://nominatim.org/release-docs/develop/) is
built using the [MkDocs](https://www.mkdocs.org/) static site generation
framework. The master branch is automatically deployed every night on
[https://nominatim.org/release-docs/develop/](https://nominatim.org/release-docs/develop/)
To build the documentation, go to the build directory and run
```
make doc
INFO - Cleaning site directory
INFO - Building documentation to directory: /home/vagrant/build/site-html
```
This runs `mkdocs build` plus extra transformation of some files and adds
symlinks (see `CMakeLists.txt` for the exact steps).
Now you can start webserver for local testing
```
build> make serve-doc
[server:296] Serving on http://127.0.0.1:8000
[handlers:62] Start watching changes
```
If you develop inside a Vagrant virtual machine, use a port that is forwarded
to your host:
```
build> PYTHONPATH=$SRCDIR mkdocs serve --dev-addr 0.0.0.0:8088
[server:296] Serving on http://0.0.0.0:8088
[handlers:62] Start watching changes
```

View File

@@ -1,227 +0,0 @@
# Writing custom sanitizer and token analysis modules for the ICU tokenizer
The [ICU tokenizer](../customize/Tokenizers.md#icu-tokenizer) provides a
highly customizable method to pre-process and normalize the name information
of the input data before it is added to the search index. It comes with a
selection of sanitizers and token analyzers which you can use to adapt your
installation to your needs. If the provided modules are not enough, you can
also provide your own implementations. This section describes the API
of sanitizers and token analysis.
!!! warning
This API is currently in early alpha status. While this API is meant to
be a public API on which other sanitizers and token analyzers may be
implemented, it is not guaranteed to be stable at the moment.
## Using non-standard sanitizers and token analyzers
Sanitizer names (in the `step` property) and token analysis names (in the
`analyzer`) may refer to externally supplied modules. There are two ways
to include external modules: through a library or from the project directory.
To include a module from a library, use the absolute import path as name and
make sure the library can be found in your PYTHONPATH.
To use a custom module without creating a library, you can put the module
somewhere in your project directory and then use the relative path to the
file. Include the whole name of the file including the `.py` ending.
## Custom sanitizer modules
A sanitizer module must export a single factory function `create` with the
following signature:
``` python
def create(config: SanitizerConfig) -> Callable[[ProcessInfo], None]
```
The function receives the custom configuration for the sanitizer and must
return a callable (function or class) that transforms the name and address
terms of a place. When a place is processed, then a `ProcessInfo` object
is created from the information that was queried from the database. This
object is sequentially handed to each configured sanitizer, so that each
sanitizer receives the result of processing from the previous sanitizer.
After the last sanitizer is finished, the resulting name and address lists
are forwarded to the token analysis module.
Sanitizer functions are instantiated once and then called for each place
that is imported or updated. They don't need to be thread-safe.
If multi-threading is used, each thread creates their own instance of
the function.
### Sanitizer configuration
::: nominatim.tokenizer.sanitizers.config.SanitizerConfig
rendering:
show_source: no
heading_level: 6
### The main filter function of the sanitizer
The filter function receives a single object of type `ProcessInfo`
which has with three members:
* `place`: read-only information about the place being processed.
See PlaceInfo below.
* `names`: The current list of names for the place. Each name is a
PlaceName object.
* `address`: The current list of address names for the place. Each name
is a PlaceName object.
While the `place` member is provided for information only, the `names` and
`address` lists are meant to be manipulated by the sanitizer. It may add and
remove entries, change information within a single entry (for example by
adding extra attributes) or completely replace the list with a different one.
#### PlaceInfo - information about the place
::: nominatim.data.place_info.PlaceInfo
rendering:
show_source: no
heading_level: 6
#### PlaceName - extended naming information
::: nominatim.data.place_name.PlaceName
rendering:
show_source: no
heading_level: 6
### Example: Filter for US street prefixes
The following sanitizer removes the directional prefixes from street names
in the US:
``` python
import re
def _filter_function(obj):
if obj.place.country_code == 'us' \
and obj.place.rank_address >= 26 and obj.place.rank_address <= 27:
for name in obj.names:
name.name = re.sub(r'^(north|south|west|east) ',
'',
name.name,
flags=re.IGNORECASE)
def create(config):
return _filter_function
```
This is the most simple form of a sanitizer module. If defines a single
filter function and implements the required `create()` function by returning
the filter.
The filter function first checks if the object is interesting for the
sanitizer. Namely it checks if the place is in the US (through `country_code`)
and it the place is a street (a `rank_address` of 26 or 27). If the
conditions are met, then it goes through all available names and
removes any leading directional prefix using a simple regular expression.
Save the source code in a file in your project directory, for example as
`us_streets.py`. Then you can use the sanitizer in your `icu_tokenizer.yaml`:
``` yaml
...
sanitizers:
- step: us_streets.py
...
```
!!! warning
This example is just a simplified show case on how to create a sanitizer.
It is not really read for real-world use: while the sanitizer would
correcly transform `West 5th Street` into `5th Street`. it would also
shorten a simple `North Street` to `Street`.
For more sanitizer examples, have a look at the sanitizers provided by Nominatim.
They can be found in the directory
[`nominatim/tokenizer/sanitizers`](https://github.com/osm-search/Nominatim/tree/master/nominatim/tokenizer/sanitizers).
## Custom token analysis module
::: nominatim.tokenizer.token_analysis.base.AnalysisModule
rendering:
show_source: no
heading_level: 6
::: nominatim.tokenizer.token_analysis.base.Analyzer
rendering:
show_source: no
heading_level: 6
### Example: Creating acronym variants for long names
The following example of a token analysis module creates acronyms from
very long names and adds them as a variant:
``` python
class AcronymMaker:
""" This class is the actual analyzer.
"""
def __init__(self, norm, trans):
self.norm = norm
self.trans = trans
def get_canonical_id(self, name):
# In simple cases, the normalized name can be used as a canonical id.
return self.norm.transliterate(name.name).strip()
def compute_variants(self, name):
# The transliterated form of the name always makes up a variant.
variants = [self.trans.transliterate(name)]
# Only create acronyms from very long words.
if len(name) > 20:
# Take the first letter from each word to form the acronym.
acronym = ''.join(w[0] for w in name.split())
# If that leds to an acronym with at least three letters,
# add the resulting acronym as a variant.
if len(acronym) > 2:
# Never forget to transliterate the variants before returning them.
variants.append(self.trans.transliterate(acronym))
return variants
# The following two functions are the module interface.
def configure(rules, normalizer, transliterator):
# There is no configuration to parse and no data to set up.
# Just return an empty configuration.
return None
def create(normalizer, transliterator, config):
# Return a new instance of our token analysis class above.
return AcronymMaker(normalizer, transliterator)
```
Given the name `Trans-Siberian Railway`, the code above would return the full
name `Trans-Siberian Railway` and the acronym `TSR` as variant, so that
searching would work for both.
## Sanitizers vs. Token analysis - what to use for variants?
It is not always clear when to implement variations in the sanitizer and
when to write a token analysis module. Just take the acronym example
above: it would also have been possible to write a sanitizer which adds the
acronym as an additional name to the name list. The result would have been
similar. So which should be used when?
The most important thing to keep in mind is that variants created by the
token analysis are only saved in the word lookup table. They do not need
extra space in the search index. If there are many spelling variations, this
can mean quite a significant amount of space is saved.
When creating additional names with a sanitizer, these names are completely
independent. In particular, they can be fed into different token analysis
modules. This gives a much greater flexibility but at the price that the
additional names increase the size of the search index.

View File

@@ -1,152 +0,0 @@
# Indexing Places
In Nominatim, the word __indexing__ refers to the process that takes the raw
OpenStreetMap data from the place table, enriches it with address information
and creates the search indexes. This section explains the basic data flow.
## Initial import
After osm2pgsql has loaded the raw OSM data into the place table,
the data is copied to the final search tables placex and location_property_osmline.
While they are copied, some basic properties are added:
* country_code, geometry_sector and partition
* initial search and address rank
In addition the column `indexed_status` is set to `1` marking the place as one
that needs to be indexed.
All this happens in the triggers `placex_insert` and `osmline_insert`.
## Indexing
The main work horse of the data import is the indexing step, where Nominatim
takes every place from the placex and location_property_osmline tables where
the indexed_status != 0 and computes the search terms and the address parts
of the place.
The indexing happens in three major steps:
1. **Data preparation** - The indexer gets the data for the place to be indexed
from the database.
2. **Search name processing** - The prepared data is given to the
tokenizer which computes the search terms from the names
and potentially other information.
3. **Address processing** - The indexer then hands the prepared data and the
tokenizer information back to the database via an `INSERT` statement which
also sets the indexed_status to `0`. This triggers the update triggers
`placex_update`/`osmline_update` which do the work of computing address
parts and filling all the search tables.
When computing the address terms of a place, Nominatim relies on the processed
search names of all the address parts. That is why places are processed in rank
order, from smallest rank to largest. To ensure correct handling of linked
place nodes, administrative boundaries are processed before all other places.
Apart from these restrictions, each place can be indexed independently
from the others. This allows a large degree of parallelization during the indexing.
It also means that the indexing process can be interrupted at any time and
will simply pick up where it left of when restarted.
### Data preparation
The data preparation step computes and retrieves all data for a place that
might be needed for the next step of processing the search name. That includes
* location information (country code)
* place classification (class, type, ranks)
* names (including names of linked places)
* address information (`addr:*` tags)
Data preparation is implemented in pl/PgSQL mostly in the functions
`placex_indexing_prepare()` and `get_interpolation_address()`.
#### `addr:*` tag inheritance
Nominatim has limited support for inheriting address tags from a building
to POIs inside the building. This only works when the address tags are on the
building outline. Any rank 30 object inside such a building or on its outline
inherits all address tags when it does not have any address tags of its own.
The inheritance is computed in the data preparation step.
### Search name processing
The prepared place information is handed to the tokenizer next. This is a
Python module responsible for processing the names from both name and address
terms and building up the word index from them. The process is explained in
more detail in the [Tokenizer chapter](Tokenizers.md).
### Address processing
Finally, the preprocessed place information and the results of the search name
processing are written back to the database. At this point the update trigger
of the placex/location_property_osmline tables take over and fill all the
dependent tables. This makes up the most work-intensive part of the indexing.
Nominatim distinguishes between dependent and independent places.
**Dependent places** are all places on rank 30: house numbers, POIs etc. These
places don't have a full address of their own. Instead they are attached to
a parent street or place and use the information of the parent for searching
and displaying information. Everything else are **independent places**: streets,
parks, water bodies, suburbs, cities, states etc. They receive a full address
on their own.
The address processing for both types of places is very different.
#### Independent places
To compute the address of an independent place Nominatim searches for all
places that cover the place to compute the address for at least partially.
For places with an area, that area is used to check for coverage. For place
nodes an artificial square area is computed according to the rank of
the place. The lower the rank the lager the area. The `location_area_large_X`
tables are there to facilitate the lookup. All places that can function as
the address of another place are saved in those tables.
`addr:*` and `isin:*` tags are taken into account to compute the address, too.
Nominatim will give preference to places with the same name as in these tags
when looking for places in the vicinity. If there are no matching place names
at all, then the tags are at least added to the search index. That means that
the names will not be shown in the result as the 'address' of the place, but
searching by them still works.
Independent places are always added to the global search index `search_name`.
#### Dependent places
Dependent places skip the full address computation for performance reasons.
Instead they just find a parent place to attach themselves to.
![parenting of dependent places](parenting-flow.svg)
By default a POI
or house number will be attached to the closest street. That can be any major
or minor street indexed by Nominatim. In the default configuration that means
that it can attach itself to a footway but only when it has a name.
When the dependent place has an `addr:street` tag, then Nominatim will first
try to find a street with the same name before falling back to the closest
street.
There are also addresses in OSM, where the housenumber does not belong
to a street at all. These have an `addr:place` tag. For these places, Nominatim
tries to find a place with the given name in the indexed places with an
address rank between 16 and 25. If none is found, then the dependent place
is attached to the closest place in that category and the addr:place name is
added as *unlisted* place, which indicates to Nominatim that it needs to add
it to the address output, no matter what. This special case is necessary to
cover addresses that don't really refer to an existing object.
When an address has both the `addr:street` and `addr:place` tag, then Nominatim
assumes that the `addr:place` tag in fact should be the city part of the address
and give the POI the usual street number address.
Dependent places are only added to the global search index `search_name` when
they have either a name themselves or when they have address tags that are not
covered by the places that make up their address. The latter ensures that
addresses are always searchable by those address tags.

View File

@@ -1,159 +0,0 @@
# Nominatim Test Suite
This chapter describes the tests in the `/test` directory, how they are
structured and how to extend them. For a quick introduction on how to run
the tests, see the [Development setup chapter](Development-Environment.md).
## Overall structure
There are two kind of tests in this test suite. There are functional tests
which test the API interface using a BDD test framework and there are unit
tests for specific PHP functions.
This test directory is sturctured as follows:
```
-+- bdd Functional API tests
| \
| +- steps Step implementations for test descriptions
| +- osm2pgsql Tests for data import via osm2pgsql
| +- db Tests for internal data processing on import and update
| +- api Tests for API endpoints (search, reverse, etc.)
|
+- php PHP unit tests
+- python Python unit tests
+- testdb Base data for generating API test database
+- testdata Additional test data used by unit tests
```
## PHP Unit Tests (`test/php`)
Unit tests for PHP code can be found in the `php/` directory. They test selected
PHP functions. Very low coverage.
To execute the test suite run
cd test/php
UNIT_TEST_DSN='pgsql:dbname=nominatim_unit_tests' phpunit ../
It will read phpunit.xml which points to the library, test path, bootstrap
strip and sets other parameters.
It will use (and destroy) a local database 'nominatim_unit_tests'. You can set
a different connection string with e.g. UNIT_TEST_DSN='pgsql:dbname=foo_unit_tests'.
## Python Unit Tests (`test/python`)
Unit tests for Python code can be found in the `python/` directory. The goal is
to have complete coverage of the Python library in `nominatim`.
To execute the tests run
py.test-3 test/python
or
pytest test/python
The name of the pytest binary depends on your installation.
## BDD Functional Tests (`test/bdd`)
Functional tests are written as BDD instructions. For more information on
the philosophy of BDD testing, see the
[Behave manual](http://pythonhosted.org/behave/philosophy.html).
The following explanation assume that the reader is familiar with the BDD
notations of features, scenarios and steps.
All possible steps can be found in the `steps` directory and should ideally
be documented.
### General Usage
To run the functional tests, do
cd test/bdd
behave
The tests can be configured with a set of environment variables (`behave -D key=val`):
* `BUILDDIR` - build directory of Nominatim installation to test
* `TEMPLATE_DB` - name of template database used as a skeleton for
the test databases (db tests)
* `TEST_DB` - name of test database (db tests)
* `API_TEST_DB` - name of the database containing the API test data (api tests)
* `API_TEST_FILE` - OSM file to be imported into the API test database (api tests)
* `DB_HOST` - (optional) hostname of database host
* `DB_PORT` - (optional) port of database on host
* `DB_USER` - (optional) username of database login
* `DB_PASS` - (optional) password for database login
* `SERVER_MODULE_PATH` - (optional) path on the Postgres server to Nominatim
module shared library file
* `REMOVE_TEMPLATE` - if true, the template and API database will not be reused
during the next run. Reusing the base templates speeds
up tests considerably but might lead to outdated errors
for some changes in the database layout.
* `KEEP_TEST_DB` - if true, the test database will not be dropped after a test
is finished. Should only be used if one single scenario is
run, otherwise the result is undefined.
Logging can be defined through command line parameters of behave itself. Check
out `behave --help` for details. Also have a look at the 'work-in-progress'
feature of behave which comes in handy when writing new tests.
### API Tests (`test/bdd/api`)
These tests are meant to test the different API endpoints and their parameters.
They require to import several datasets into a test database. This is normally
done automatically during setup of the test. The API test database is then
kept around and reused in subsequent runs of behave. Use `behave -DREMOVE_TEMPLATE`
to force a reimport of the database.
The official test dataset is saved in the file `test/testdb/apidb-test-data.pbf`
and compromises the following data:
* Geofabrik extract of Liechtenstein
* extract of Autauga country, Alabama, US (for tests against Tiger data)
* additional data from `test/testdb/additional_api_test.data.osm`
API tests should only be testing the functionality of the website PHP code.
Most tests should be formulated as BDD DB creation tests (see below) instead.
#### Code Coverage
The API tests also support code coverage tests. You need to install
[PHP_CodeCoverage](https://github.com/sebastianbergmann/php-code-coverage).
On Debian/Ubuntu run:
apt-get install php-codecoverage php-xdebug
Then run the API tests as follows:
behave api -DPHPCOV=<coverage output dir>
The output directory must be an absolute path. To generate reports, you can use
the [phpcov](https://github.com/sebastianbergmann/phpcov) tool:
phpcov merge --html=<report output dir> <coverage output dir>
### DB Creation Tests (`test/bdd/db`)
These tests check the import and update of the Nominatim database. They do not
test the correctness of osm2pgsql. Each test will write some data into the `place`
table (and optionally the `planet_osm_*` tables if required) and then run
Nominatim's processing functions on that.
These tests need to create their own test databases. By default they will be
called `test_template_nominatim` and `test_nominatim`. Names can be changed with
the environment variables `TEMPLATE_DB` and `TEST_DB`. The user running the tests
needs superuser rights for postgres.
### Import Tests (`test/bdd/osm2pgsql`)
These tests check that data is imported correctly into the place table. They
use the same template database as the DB Creation tests, so the same remarks apply.
Note that most testing of the gazetteer output of osm2pgsql is done in the tests
of osm2pgsql itself. The BDD tests are just there to ensure compatibility of
the osm2pgsql and Nominatim code.

View File

@@ -1,332 +0,0 @@
# Tokenizers
The tokenizer is the component of Nominatim that is responsible for
analysing names of OSM objects and queries. Nominatim provides different
tokenizers that use different strategies for normalisation. This page describes
how tokenizers are expected to work and the public API that needs to be
implemented when creating a new tokenizer. For information on how to configure
a specific tokenizer for a database see the
[tokenizer chapter in the Customization Guide](../customize/Tokenizers.md).
## Generic Architecture
### About Search Tokens
Search in Nominatim is organised around search tokens. Such a token represents
string that can be part of the search query. Tokens are used so that the search
index does not need to be organised around strings. Instead the database saves
for each place which tokens match this place's name, address, house number etc.
To be able to distinguish between these different types of information stored
with the place, a search token also always has a certain type: name, house number,
postcode etc.
During search an incoming query is transformed into a ordered list of such
search tokens (or rather many lists, see below) and this list is then converted
into a database query to find the right place.
It is the core task of the tokenizer to create, manage and assign the search
tokens. The tokenizer is involved in two distinct operations:
* __at import time__: scanning names of OSM objects, normalizing them and
building up the list of search tokens.
* __at query time__: scanning the query and returning the appropriate search
tokens.
### Importing
The indexer is responsible to enrich an OSM object (or place) with all data
required for geocoding. It is split into two parts: the controller collects
the places that require updating, enriches the place information as required
and hands the place to Postgresql. The collector is part of the Nominatim
library written in Python. Within Postgresql, the `placex_update`
trigger is responsible to fill out all secondary tables with extra geocoding
information. This part is written in PL/pgSQL.
The tokenizer is involved in both parts. When the indexer prepares a place,
it hands it over to the tokenizer to inspect the names and create all the
search tokens applicable for the place. This usually involves updating the
tokenizer's internal token lists and creating a list of all token IDs for
the specific place. This list is later needed in the PL/pgSQL part where the
indexer needs to add the token IDs to the appropriate search tables. To be
able to communicate the list between the Python part and the pl/pgSQL trigger,
the `placex` table contains a special JSONB column `token_info` which is there
for the exclusive use of the tokenizer.
The Python part of the tokenizer returns a structured information about the
tokens of a place to the indexer which converts it to JSON and inserts it into
the `token_info` column. The content of the column is then handed to the PL/pqSQL
callbacks of the tokenizer which extracts the required information. Usually
the tokenizer then removes all information from the `token_info` structure,
so that no information is ever persistently saved in the table. All information
that went in should have been processed after all and put into secondary tables.
This is however not a hard requirement. If the tokenizer needs to store
additional information about a place permanently, it may do so in the
`token_info` column. It just may never execute searches over it and
consequently not create any special indexes on it.
### Querying
At query time, Nominatim builds up multiple _interpretations_ of the search
query. Each of these interpretations is tried against the database in order
of the likelihood with which they match to the search query. The first
interpretation that yields results wins.
The interpretations are encapsulated in the `SearchDescription` class. An
instance of this class is created by applying a sequence of
_search tokens_ to an initially empty SearchDescription. It is the
responsibility of the tokenizer to parse the search query and derive all
possible sequences of search tokens. To that end the tokenizer needs to parse
the search query and look up matching words in its own data structures.
## Tokenizer API
The following section describes the functions that need to be implemented
for a custom tokenizer implementation.
!!! warning
This API is currently in early alpha status. While this API is meant to
be a public API on which other tokenizers may be implemented, the API is
far away from being stable at the moment.
### Directory Structure
Nominatim expects two files for a tokenizer:
* `nominatim/tokenizer/<NAME>_tokenizer.py` containing the Python part of the
implementation
* `lib-php/tokenizer/<NAME>_tokenizer.php` with the PHP part of the
implementation
where `<NAME>` is a unique name for the tokenizer consisting of only lower-case
letters, digits and underscore. A tokenizer also needs to install some SQL
functions. By convention, these should be placed in `lib-sql/tokenizer`.
If the tokenizer has a default configuration file, this should be saved in
the `settings/<NAME>_tokenizer.<SUFFIX>`.
### Configuration and Persistence
Tokenizers may define custom settings for their configuration. All settings
must be prefixed with `NOMINATIM_TOKENIZER_`. Settings may be transient or
persistent. Transient settings are loaded from the configuration file when
Nominatim is started and may thus be changed at any time. Persistent settings
are tied to a database installation and must only be read during installation
time. If they are needed for the runtime then they must be saved into the
`nominatim_properties` table and later loaded from there.
### The Python module
The Python module is expect to export a single factory function:
```python
def create(dsn: str, data_dir: Path) -> AbstractTokenizer
```
The `dsn` parameter contains the DSN of the Nominatim database. The `data_dir`
is a directory in the project directory that the tokenizer may use to save
database-specific data. The function must return the instance of the tokenizer
class as defined below.
### Python Tokenizer Class
All tokenizers must inherit from `nominatim.tokenizer.base.AbstractTokenizer`
and implement the abstract functions defined there.
::: nominatim.tokenizer.base.AbstractTokenizer
rendering:
heading_level: 4
### Python Analyzer Class
::: nominatim.tokenizer.base.AbstractAnalyzer
rendering:
heading_level: 4
### PL/pgSQL Functions
The tokenizer must provide access functions for the `token_info` column
to the indexer which extracts the necessary information for the global
search tables. If the tokenizer needs additional SQL functions for private
use, then these functions must be prefixed with `token_` in order to ensure
that there are no naming conflicts with the SQL indexer code.
The following functions are expected:
```sql
FUNCTION token_get_name_search_tokens(info JSONB) RETURNS INTEGER[]
```
Return an array of token IDs of search terms that should match
the name(s) for the given place. These tokens are used to look up the place
by name and, where the place functions as part of an address for another place,
by address. Must return NULL when the place has no name.
```sql
FUNCTION token_get_name_match_tokens(info JSONB) RETURNS INTEGER[]
```
Return an array of token IDs of full names of the place that should be used
to match addresses. The list of match tokens is usually more strict than
search tokens as it is used to find a match between two OSM tag values which
are expected to contain matching full names. Partial terms should not be
used for match tokens. Must return NULL when the place has no name.
```sql
FUNCTION token_get_housenumber_search_tokens(info JSONB) RETURNS INTEGER[]
```
Return an array of token IDs of house number tokens that apply to the place.
Note that a place may have multiple house numbers, for example when apartments
each have their own number. Must be NULL when the place has no house numbers.
```sql
FUNCTION token_normalized_housenumber(info JSONB) RETURNS TEXT
```
Return the house number(s) in the normalized form that can be matched against
a house number token text. If a place has multiple house numbers they must
be listed with a semicolon as delimiter. Must be NULL when the place has no
house numbers.
```sql
FUNCTION token_matches_street(info JSONB, street_tokens INTEGER[]) RETURNS BOOLEAN
```
Check if the given tokens (previously saved from `token_get_name_match_tokens()`)
match against the `addr:street` tag name. Must return either NULL or FALSE
when the place has no `addr:street` tag.
```sql
FUNCTION token_matches_place(info JSONB, place_tokens INTEGER[]) RETURNS BOOLEAN
```
Check if the given tokens (previously saved from `token_get_name_match_tokens()`)
match against the `addr:place` tag name. Must return either NULL or FALSE
when the place has no `addr:place` tag.
```sql
FUNCTION token_addr_place_search_tokens(info JSONB) RETURNS INTEGER[]
```
Return the search token IDs extracted from the `addr:place` tag. These tokens
are used for searches by address when no matching place can be found in the
database. Must be NULL when the place has no `addr:place` tag.
```sql
FUNCTION token_get_address_keys(info JSONB) RETURNS SETOF TEXT
```
Return the set of keys for which address information is provided. This
should correspond to the list of (relevant) `addr:*` tags with the `addr:`
prefix removed or the keys used in the `address` dictionary of the place info.
```sql
FUNCTION token_get_address_search_tokens(info JSONB, key TEXT) RETURNS INTEGER[]
```
Return the array of search tokens for the given address part. `key` can be
expected to be one of those returned with `token_get_address_keys()`. The
search tokens are added to the address search vector of the place, when no
corresponding OSM object could be found for the given address part from which
to copy the name information.
```sql
FUNCTION token_matches_address(info JSONB, key TEXT, tokens INTEGER[])
```
Check if the given tokens match against the address part `key`.
__Warning:__ the tokens that are handed in are the lists previously saved
from `token_get_name_search_tokens()`, _not_ from the match token list. This
is an historical oddity which will be fixed at some point in the future.
Currently, tokenizers are encouraged to make sure that matching works against
both the search token list and the match token list.
```sql
FUNCTION token_get_postcode(info JSONB) RETURNS TEXT
```
Return the postcode for the object, if any exists. The postcode must be in
the form that should also be presented to the end-user.
```sql
FUNCTION token_strip_info(info JSONB) RETURNS JSONB
```
Return the part of the `token_info` field that should be stored in the database
permanently. The indexer calls this function when all processing is done and
replaces the content of the `token_info` column with the returned value before
the trigger stores the information in the database. May return NULL if no
information should be stored permanently.
### PHP Tokenizer class
The PHP tokenizer class is instantiated once per request and responsible for
analyzing the incoming query. Multiple requests may be in flight in
parallel.
The class is expected to be found under the
name of `\Nominatim\Tokenizer`. To find the class the PHP code includes the file
`tokenizer/tokenizer.php` in the project directory. This file must be created
when the tokenizer is first set up on import. The file should initialize any
configuration variables by setting PHP constants and then require the file
with the actual implementation of the tokenizer.
The tokenizer class must implement the following functions:
```php
public function __construct(object &$oDB)
```
The constructor of the class receives a database connection that can be used
to query persistent data in the database.
```php
public function checkStatus()
```
Check that the tokenizer can access its persistent data structures. If there
is an issue, throw an `\Exception`.
```php
public function normalizeString(string $sTerm) : string
```
Normalize string to a form to be used for comparisons when reordering results.
Nominatim reweighs results how well the final display string matches the actual
query. Before comparing result and query, names and query are normalised against
this function. The tokenizer can thus remove all properties that should not be
taken into account for reweighing, e.g. special characters or case.
```php
public function tokensForSpecialTerm(string $sTerm) : array
```
Return the list of special term tokens that match the given term.
```php
public function extractTokensFromPhrases(array &$aPhrases) : TokenList
```
Parse the given phrases, splitting them into word lists and retrieve the
matching tokens.
The phrase array may take on two forms. In unstructured searches (using `q=`
parameter) the search query is split at the commas and the elements are
put into a sorted list. For structured searches the phrase array is an
associative array where the key designates the type of the term (street, city,
county etc.) The tokenizer may ignore the phrase type at this stage in parsing.
Matching phrase type and appropriate search token type will be done later
when the SearchDescription is built.
For each phrase in the list of phrases, the function must analyse the phrase
string and then call `setWordSets()` to communicate the result of the analysis.
A word set is a list of strings, where each string refers to a search token.
A phrase may have multiple interpretations. Therefore a list of word sets is
usually attached to the phrase. The search tokens themselves are returned
by the function in an associative array, where the key corresponds to the
strings given in the word sets. The value is a list of search tokens. Thus
a single string in the list of word sets may refer to multiple search tokens.

View File

@@ -1,35 +0,0 @@
@startuml
skinparam monochrome true
skinparam ObjectFontStyle bold
map search_name_X {
place_id => BIGINT
address_rank => SMALLINT
name_vector => INT[]
centroid => GEOMETRY
}
map location_area_large_X {
place_id => BIGINT
keywords => INT[]
partition => SMALLINT
rank_search => SMALLINT
rank_address => SMALLINT
country_code => VARCHR(2)
isguess => BOOLEAN
postcode => TEXT
centroid => POINT
geometry => GEOMETRY
}
map location_road_X {
place_id => BIGINT
partition => SMALLINT
country_code => VARCHR(2)
geometry => GEOMETRY
}
search_name_X -[hidden]> location_area_large_X
location_area_large_X -[hidden]> location_road_X
@enduml

File diff suppressed because one or more lines are too long

Before

Width:  |  Height:  |  Size: 11 KiB

View File

@@ -1,34 +0,0 @@
# Additional Data Sources
This guide explains how data sources other than OpenStreetMap mentioned in
the install instructions got obtained and converted.
## Country grid
Nominatim uses pre-generated country borders data. In case one imports only
a subset of a country. And to assign each place a partition. Nominatim
database tables are split into partitions for performance.
More details in [osm-search/country-grid-data](https://github.com/osm-search/country-grid-data).
## US Census TIGER
For the United States you can choose to import additional street-level data.
The data isn't mixed into OSM data but queried as fallback when no OSM
result can be found.
More details in [osm-search/TIGER-data](https://github.com/osm-search/TIGER-data).
## GB postcodes
For Great Britain you can choose to import Royalmail postcode centroids.
More details in [osm-search/gb-postcode-data](https://github.com/osm-search/gb-postcode-data).
## Wikipedia & Wikidata rankings
Nominatim can import "importance" data of place names. This greatly
improves ranking of results.
More details in [osm-search/wikipedia-wikidata](https://github.com/osm-search/wikipedia-wikidata).

View File

@@ -1,44 +0,0 @@
@startuml
skinparam monochrome true
skinparam ObjectFontStyle bold
map planet_osm_nodes #eee {
id => BIGINT
lat => INT
lon => INT
}
map planet_osm_ways #eee {
id => BIGINT
nodes => BIGINT[]
tags => TEXT[]
}
map planet_osm_rels #eee {
id => BIGINT
parts => BIGINT[]
members => TEXT[]
tags => TEXT[]
way_off => SMALLINT
rel_off => SMALLINT
}
map place {
osm_type => CHAR(1)
osm_id => BIGINT
class => TEXT
type => TEXT
name => HSTORE
address => HSTORE
extratags => HSTORE
admin_level => SMALLINT
geometry => GEOMETRY
}
planet_osm_nodes -[hidden]> planet_osm_ways
planet_osm_ways -[hidden]> planet_osm_rels
planet_osm_ways -[hidden]-> place
planet_osm_nodes::id <- planet_osm_ways::nodes
@enduml

File diff suppressed because one or more lines are too long

Before

Width:  |  Height:  |  Size: 13 KiB

View File

@@ -1,24 +0,0 @@
# Basic Architecture
Nominatim provides geocoding based on OpenStreetMap data. It uses a PostgreSQL
database as a backend for storing the data.
There are three basic parts to Nominatim's architecture: the data import,
the address computation and the search frontend.
The __data import__ stage reads the raw OSM data and extracts all information
that is useful for geocoding. This part is done by osm2pgsql, the same tool
that can also be used to import a rendering database. It uses the special
gazetteer output plugin in `osm2pgsql/src/output-gazetter.[ch]pp`. The result of
the import can be found in the database table `place`.
The __address computation__ or __indexing__ stage takes the data from `place`
and adds additional information needed for geocoding. It ranks the places by
importance, links objects that belong together and computes addresses and
the search index. Most of this work is done in PL/pgSQL via database triggers
and can be found in the files in the `sql/functions/` directory.
The __search frontend__ implements the actual API. It takes search
and reverse geocoding queries from the user, looks up the data and
returns the results in the requested format. This part is written in PHP
and can be found in the `lib/` and `website/` directories.

View File

@@ -1,31 +0,0 @@
@startuml
skinparam monochrome true
start
if (has 'addr:street'?) then (yes)
if (street with that name\n nearby?) then (yes)
:**Use closest street**
**with same name**;
kill
else (no)
:** Use closest**\n**street**;
kill
endif
elseif (has 'addr:place'?) then (yes)
if (place with that name\n nearby?) then (yes)
:**Use closest place**
**with same name**;
kill
else (no)
:add addr:place to adress;
:**Use closest place**\n**rank 16 to 25**;
kill
endif
else (otherwise)
:**Use closest**\n**street**;
kill
endif
@enduml

File diff suppressed because one or more lines are too long

Before

Width:  |  Height:  |  Size: 9.8 KiB

View File

@@ -1,99 +0,0 @@
@startuml
skinparam monochrome true
skinparam ObjectFontStyle bold
left to right direction
map placex {
place_id => BIGINT
osm_type => CHAR(1)
osm_id => BIGINT
class => TEXT
type => TEXT
name => HSTORE
address => HSTORE
extratags => HSTORE
admin_level => SMALLINT
partition => SMALLINT
geometry_sector => INT
parent_place_id => BIGINT
linked_place_id => BIGINT
importance => DOUBLE
rank_search => SMALLINT
rank_address => SMALLINT
wikipedia => TEXT
country_code => VARCHAR(2)
housenumber => TEXT
postcode => TEXT
indexed_status => SMALLINT
indexed_date => TIMESTAMP
centroid => GEOMETRY
geometry => GEOMETRY
}
map search_name {
place_id => BIGINT
importance => DOUBLE
search_rank => SMALLINT
address_rank => SMALLINT
name_vector => INT[]
nameaddress_vector => INT[]
country_code => VARCHAR(2)
centroid => GEOMETRY
}
map word {
word_id => INT
word_token => TEXT
... =>
}
map location_property_osmline {
place_id => BIGINT
osm_id => BIGINT
startnumber => INT
endnumber => INT
interpolationtype => TEXT
address => HSTORE
partition => SMALLINT
geometry_sector => INT
parent_place_id => BIGINT
country_code => VARCHAR(2)
postcode => text
indexed_status => SMALLINT
indexed_date => TIMESTAMP
linegeo => GEOMETRY
}
map place_addressline {
place_id => BIGINT
address_place_id => BIGINT
distance => DOUBLE
cached_rank_address => SMALLINT
fromarea => BOOLEAN
isaddress => BOOLEAN
}
map location_postcode {
place_id => BIGINT
postcode => TEXT
parent_place_id => BIGINT
rank_search => SMALLINT
rank_address => SMALLINT
indexed_status => SMALLINT
indexed_date => TIMESTAMP
geometry => GEOMETRY
}
placex::place_id <-- search_name::place_id
placex::place_id <-- place_addressline::place_id
placex::place_id <-- place_addressline::address_place_id
search_name::name_vector --> word::word_id
search_name::nameaddress_vector --> word::word_id
place_addressline -[hidden]> location_property_osmline
search_name -[hidden]> place_addressline
location_property_osmline -[hidden]-> location_postcode
@enduml

File diff suppressed because one or more lines are too long

Before

Width:  |  Height:  |  Size: 35 KiB

View File

@@ -1,20 +0,0 @@
table {
margin-bottom: 12pt
}
th, td {
padding: 1pt 12pt;
}
th {
background-color: #eee;
}
.doc-object h6 {
margin-bottom: 0.8em;
font-size: 120%;
}
.doc-object {
margin-bottom: 1.3em;
}

View File

@@ -1,10 +0,0 @@
Nominatim (from the Latin, 'by name') is a tool to search OSM data by name and address and to generate synthetic addresses of OSM points (reverse geocoding).
This guide comes in four parts:
* __[API reference](api/Overview.md)__ for users of Nominatim
* __[Administration Guide](admin/Installation.md)__ for those who want
to install their own Nominatim server
* __[Customization Guide](customize/Overview.md)__ for those who want to
adapt their own installation to their special requirements
* __[Developer's Guide](develop/overview.md)__ for developers of the software

View File

@@ -1,65 +0,0 @@
site_name: Nominatim 4.2.4
theme: readthedocs
docs_dir: ${CMAKE_CURRENT_BINARY_DIR}
site_url: https://nominatim.org
repo_url: https://github.com/openstreetmap/Nominatim
nav:
- 'Introduction' : 'index.md'
- 'API Reference':
- 'Overview': 'api/Overview.md'
- 'Search': 'api/Search.md'
- 'Reverse': 'api/Reverse.md'
- 'Address Lookup': 'api/Lookup.md'
- 'Details' : 'api/Details.md'
- 'Status' : 'api/Status.md'
- 'Place Output Formats': 'api/Output.md'
- 'FAQ': 'api/Faq.md'
- 'Administration Guide':
- 'Basic Installation': 'admin/Installation.md'
- 'Import' : 'admin/Import.md'
- 'Update' : 'admin/Update.md'
- 'Deploy' : 'admin/Deployment.md'
- 'Nominatim UI' : 'admin/Setup-Nominatim-UI.md'
- 'Advanced Installations' : 'admin/Advanced-Installations.md'
- 'Maintenance' : 'admin/Maintenance.md'
- 'Migration from older Versions' : 'admin/Migration.md'
- 'Troubleshooting' : 'admin/Faq.md'
- 'Customization Guide':
- 'Overview': 'customize/Overview.md'
- 'Import Styles': 'customize/Import-Styles.md'
- 'Configuration Settings': 'customize/Settings.md'
- 'Per-Country Data': 'customize/Country-Settings.md'
- 'Place Ranking' : 'customize/Ranking.md'
- 'Importance' : 'customize/Importance.md'
- 'Tokenizers' : 'customize/Tokenizers.md'
- 'Special Phrases': 'customize/Special-Phrases.md'
- 'External data: US housenumbers from TIGER': 'customize/Tiger.md'
- 'External data: Postcodes': 'customize/Postcodes.md'
- 'Developers Guide':
- 'Architecture Overview' : 'develop/overview.md'
- 'Database Layout' : 'develop/Database-Layout.md'
- 'Indexing' : 'develop/Indexing.md'
- 'Tokenizers' : 'develop/Tokenizers.md'
- 'Custom modules for ICU tokenizer': 'develop/ICU-Tokenizer-Modules.md'
- 'Setup for Development' : 'develop/Development-Environment.md'
- 'Testing' : 'develop/Testing.md'
- 'External Data Sources': 'develop/data-sources.md'
- 'Appendix':
- 'Installation on Ubuntu 18' : 'appendix/Install-on-Ubuntu-18.md'
- 'Installation on Ubuntu 20' : 'appendix/Install-on-Ubuntu-20.md'
- 'Installation on Ubuntu 22' : 'appendix/Install-on-Ubuntu-22.md'
markdown_extensions:
- codehilite
- admonition
- def_list
- toc:
permalink:
extra_css: [extra.css, styles.css]
plugins:
- search
- mkdocstrings:
handlers:
python-legacy:
rendering:
show_source: false
show_signature_annotations: false

View File

@@ -1,69 +0,0 @@
.codehilite .hll { background-color: #ffffcc }
.codehilite { background: #f0f0f0; }
.codehilite .c { color: #60a0b0; font-style: italic } /* Comment */
.codehilite .err { /* border: 1px solid #FF0000 */ } /* Error */
.codehilite .k { color: #007020; font-weight: bold } /* Keyword */
.codehilite .o { color: #666666 } /* Operator */
.codehilite .ch { color: #60a0b0; font-style: italic } /* Comment.Hashbang */
.codehilite .cm { color: #60a0b0; font-style: italic } /* Comment.Multiline */
.codehilite .cp { color: #007020 } /* Comment.Preproc */
.codehilite .cpf { color: #60a0b0; font-style: italic } /* Comment.PreprocFile */
.codehilite .c1 { color: #60a0b0; font-style: italic } /* Comment.Single */
.codehilite .cs { color: #60a0b0; background-color: #fff0f0 } /* Comment.Special */
.codehilite .gd { color: #A00000 } /* Generic.Deleted */
.codehilite .ge { font-style: italic } /* Generic.Emph */
.codehilite .gr { color: #FF0000 } /* Generic.Error */
.codehilite .gh { color: #000080; font-weight: bold } /* Generic.Heading */
.codehilite .gi { color: #00A000 } /* Generic.Inserted */
.codehilite .go { color: #888888 } /* Generic.Output */
.codehilite .gp { color: #c65d09; font-weight: bold } /* Generic.Prompt */
.codehilite .gs { font-weight: bold } /* Generic.Strong */
.codehilite .gu { color: #800080; font-weight: bold } /* Generic.Subheading */
.codehilite .gt { color: #0044DD } /* Generic.Traceback */
.codehilite .kc { color: #007020; font-weight: bold } /* Keyword.Constant */
.codehilite .kd { color: #007020; font-weight: bold } /* Keyword.Declaration */
.codehilite .kn { color: #007020; font-weight: bold } /* Keyword.Namespace */
.codehilite .kp { color: #007020 } /* Keyword.Pseudo */
.codehilite .kr { color: #007020; font-weight: bold } /* Keyword.Reserved */
.codehilite .kt { color: #902000 } /* Keyword.Type */
.codehilite .m { color: #40a070 } /* Literal.Number */
.codehilite .s { color: #4070a0 } /* Literal.String */
.codehilite .na { color: #4070a0 } /* Name.Attribute */
.codehilite .nb { color: #007020 } /* Name.Builtin */
.codehilite .nc { color: #0e84b5; font-weight: bold } /* Name.Class */
.codehilite .no { color: #60add5 } /* Name.Constant */
.codehilite .nd { color: #555555; font-weight: bold } /* Name.Decorator */
.codehilite .ni { color: #d55537; font-weight: bold } /* Name.Entity */
.codehilite .ne { color: #007020 } /* Name.Exception */
.codehilite .nf { color: #06287e } /* Name.Function */
.codehilite .nl { color: #002070; font-weight: bold } /* Name.Label */
.codehilite .nn { color: #0e84b5; font-weight: bold } /* Name.Namespace */
.codehilite .nt { color: #062873; font-weight: bold } /* Name.Tag */
.codehilite .nv { color: #bb60d5 } /* Name.Variable */
.codehilite .ow { color: #007020; font-weight: bold } /* Operator.Word */
.codehilite .w { color: #bbbbbb } /* Text.Whitespace */
.codehilite .mb { color: #40a070 } /* Literal.Number.Bin */
.codehilite .mf { color: #40a070 } /* Literal.Number.Float */
.codehilite .mh { color: #40a070 } /* Literal.Number.Hex */
.codehilite .mi { color: #40a070 } /* Literal.Number.Integer */
.codehilite .mo { color: #40a070 } /* Literal.Number.Oct */
.codehilite .sa { color: #4070a0 } /* Literal.String.Affix */
.codehilite .sb { color: #4070a0 } /* Literal.String.Backtick */
.codehilite .sc { color: #4070a0 } /* Literal.String.Char */
.codehilite .dl { color: #4070a0 } /* Literal.String.Delimiter */
.codehilite .sd { color: #4070a0; font-style: italic } /* Literal.String.Doc */
.codehilite .s2 { color: #4070a0 } /* Literal.String.Double */
.codehilite .se { color: #4070a0; font-weight: bold } /* Literal.String.Escape */
.codehilite .sh { color: #4070a0 } /* Literal.String.Heredoc */
.codehilite .si { color: #70a0d0; font-style: italic } /* Literal.String.Interpol */
.codehilite .sx { color: #c65d09 } /* Literal.String.Other */
.codehilite .sr { color: #235388 } /* Literal.String.Regex */
.codehilite .s1 { color: #4070a0 } /* Literal.String.Single */
.codehilite .ss { color: #517918 } /* Literal.String.Symbol */
.codehilite .bp { color: #007020 } /* Name.Builtin.Pseudo */
.codehilite .fm { color: #06287e } /* Name.Function.Magic */
.codehilite .vc { color: #bb60d5 } /* Name.Variable.Class */
.codehilite .vg { color: #bb60d5 } /* Name.Variable.Global */
.codehilite .vi { color: #bb60d5 } /* Name.Variable.Instance */
.codehilite .vm { color: #bb60d5 } /* Name.Variable.Magic */
.codehilite .il { color: #40a070 } /* Literal.Number.Integer.Long */

View File

@@ -1,191 +0,0 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
namespace Nominatim;
require_once(CONST_LibDir.'/ClassTypes.php');
/**
* Detailed list of address parts for a single result
*/
class AddressDetails
{
private $iPlaceID;
private $aAddressLines;
public function __construct(&$oDB, $iPlaceID, $sHousenumber, $mLangPref)
{
$this->iPlaceID = $iPlaceID;
if (is_array($mLangPref)) {
$mLangPref = $oDB->getArraySQL($oDB->getDBQuotedList($mLangPref));
}
if (!isset($sHousenumber)) {
$sHousenumber = -1;
}
$sSQL = 'SELECT *,';
$sSQL .= ' get_name_by_language(name,'.$mLangPref.') as localname';
$sSQL .= ' FROM get_addressdata('.$iPlaceID.','.$sHousenumber.')';
$sSQL .= ' ORDER BY rank_address DESC, isaddress DESC';
$this->aAddressLines = $oDB->getAll($sSQL);
}
private static function isAddress($aLine)
{
return $aLine['isaddress'] || $aLine['type'] == 'country_code';
}
public function getAddressDetails($bAll = false)
{
if ($bAll) {
return $this->aAddressLines;
}
return array_filter($this->aAddressLines, array(__CLASS__, 'isAddress'));
}
public function getLocaleAddress()
{
$aParts = array();
$sPrevResult = '';
foreach ($this->aAddressLines as $aLine) {
if ($aLine['isaddress'] && $sPrevResult != $aLine['localname']) {
$sPrevResult = $aLine['localname'];
$aParts[] = $sPrevResult;
}
}
return join(', ', $aParts);
}
public function getAddressNames()
{
$aAddress = array();
foreach ($this->aAddressLines as $aLine) {
if (!self::isAddress($aLine)) {
continue;
}
$sTypeLabel = ClassTypes\getLabelTag($aLine);
$sName = null;
if (isset($aLine['localname']) && $aLine['localname']!=='') {
$sName = $aLine['localname'];
} elseif (isset($aLine['housenumber']) && $aLine['housenumber']!=='') {
$sName = $aLine['housenumber'];
}
if (isset($sName)
&& (!isset($aAddress[$sTypeLabel])
|| $aLine['class'] == 'place')
) {
$aAddress[$sTypeLabel] = $sName;
if (!empty($aLine['name'])) {
$this->addSubdivisionCode($aAddress, $aLine['admin_level'], $aLine['name']);
}
}
}
return $aAddress;
}
/**
* Annotates the given json with geocodejson address information fields.
*
* @param array $aJson Json hash to add the fields to.
*
* Geocodejson has the following fields:
* street, locality, postcode, city, district,
* county, state, country
*
* Postcode and housenumber are added by type, district is not used.
* All other fields are set according to address rank.
*/
public function addGeocodeJsonAddressParts(&$aJson)
{
foreach (array_reverse($this->aAddressLines) as $aLine) {
if (!$aLine['isaddress']) {
continue;
}
if (!isset($aLine['localname']) || $aLine['localname'] == '') {
continue;
}
if ($aLine['type'] == 'postcode' || $aLine['type'] == 'postal_code') {
$aJson['postcode'] = $aLine['localname'];
continue;
}
if ($aLine['type'] == 'house_number') {
$aJson['housenumber'] = $aLine['localname'];
continue;
}
if ($this->iPlaceID == $aLine['place_id']) {
continue;
}
$iRank = (int)$aLine['rank_address'];
if ($iRank > 25 && $iRank < 28) {
$aJson['street'] = $aLine['localname'];
} elseif ($iRank >= 22 && $iRank <= 25) {
$aJson['locality'] = $aLine['localname'];
} elseif ($iRank >= 17 && $iRank <= 21) {
$aJson['district'] = $aLine['localname'];
} elseif ($iRank >= 13 && $iRank <= 16) {
$aJson['city'] = $aLine['localname'];
} elseif ($iRank >= 10 && $iRank <= 12) {
$aJson['county'] = $aLine['localname'];
} elseif ($iRank >= 5 && $iRank <= 9) {
$aJson['state'] = $aLine['localname'];
} elseif ($iRank == 4) {
$aJson['country'] = $aLine['localname'];
}
}
}
public function getAdminLevels()
{
$aAddress = array();
foreach (array_reverse($this->aAddressLines) as $aLine) {
if (self::isAddress($aLine)
&& isset($aLine['admin_level'])
&& $aLine['admin_level'] < 15
&& !isset($aAddress['level'.$aLine['admin_level']])
) {
$aAddress['level'.$aLine['admin_level']] = $aLine['localname'];
}
}
return $aAddress;
}
public function debugInfo()
{
return $this->aAddressLines;
}
private function addSubdivisionCode(&$aAddress, $iAdminLevel, $nameDetails)
{
if (is_string($nameDetails)) {
$nameDetails = json_decode('{' . str_replace('"=>"', '":"', $nameDetails) . '}', true);
}
if (!empty($nameDetails['ISO3166-2'])) {
$aAddress["ISO3166-2-lvl$iAdminLevel"] = $nameDetails['ISO3166-2'];
}
}
}

View File

@@ -1,576 +0,0 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
namespace Nominatim\ClassTypes;
/**
* Create a label tag for the given place that can be used as an XML name.
*
* @param array[] $aPlace Information about the place to label.
*
* A label tag groups various object types together under a common
* label. The returned value is lower case and has no spaces
*/
function getLabelTag($aPlace, $sCountry = null)
{
$iRank = (int) ($aPlace['rank_address'] ?? 30);
$sLabel;
if (isset($aPlace['place_type'])) {
$sLabel = $aPlace['place_type'];
} elseif ($aPlace['class'] == 'boundary' && $aPlace['type'] == 'administrative') {
$sLabel = getBoundaryLabel($iRank/2, $sCountry);
} elseif ($aPlace['type'] == 'postal_code') {
$sLabel = 'postcode';
} elseif ($iRank < 26) {
$sLabel = $aPlace['type'];
} elseif ($iRank < 28) {
$sLabel = 'road';
} elseif ($aPlace['class'] == 'place'
&& ($aPlace['type'] == 'house_number' ||
$aPlace['type'] == 'house_name' ||
$aPlace['type'] == 'country_code')
) {
$sLabel = $aPlace['type'];
} else {
$sLabel = $aPlace['class'];
}
return strtolower(str_replace(' ', '_', $sLabel));
}
/**
* Create a label for the given place.
*
* @param array[] $aPlace Information about the place to label.
*/
function getLabel($aPlace, $sCountry = null)
{
if (isset($aPlace['place_type'])) {
return ucwords(str_replace('_', ' ', $aPlace['place_type']));
}
if ($aPlace['class'] == 'boundary' && $aPlace['type'] == 'administrative') {
return getBoundaryLabel(($aPlace['rank_address'] ?? 30)/2, $sCountry ?? null);
}
// Return a label only for 'important' class/type combinations
if (getImportance($aPlace) !== null) {
return ucwords(str_replace('_', ' ', $aPlace['type']));
}
return null;
}
/**
* Return a simple label for an administrative boundary for the given country.
*
* @param int $iAdminLevel Content of admin_level tag.
* @param string $sCountry Country code of the country where the object is
* in. May be null, in which case a world-wide
* fallback is used.
* @param string $sFallback String to return if no explicit string is listed.
*
* @return string
*/
function getBoundaryLabel($iAdminLevel, $sCountry, $sFallback = 'Administrative')
{
static $aBoundaryList = array (
'default' => array (
1 => 'Continent',
2 => 'Country',
3 => 'Region',
4 => 'State',
5 => 'State District',
6 => 'County',
7 => 'Municipality',
8 => 'City',
9 => 'City District',
10 => 'Suburb',
11 => 'Neighbourhood',
12 => 'City Block'
),
'no' => array (
3 => 'State',
4 => 'County'
),
'se' => array (
3 => 'State',
4 => 'County'
)
);
if (isset($aBoundaryList[$sCountry])
&& isset($aBoundaryList[$sCountry][$iAdminLevel])
) {
return $aBoundaryList[$sCountry][$iAdminLevel];
}
return $aBoundaryList['default'][$iAdminLevel] ?? $sFallback;
}
/**
* Return an estimated radius of how far the object node extends.
*
* @param array[] $aPlace Information about the place. This must be a node
* feature.
*
* @return float The radius around the feature in degrees.
*/
function getDefRadius($aPlace)
{
$aSpecialRadius = array(
'place:continent' => 25,
'place:country' => 7,
'place:state' => 2.6,
'place:province' => 2.6,
'place:region' => 1.0,
'place:county' => 0.7,
'place:city' => 0.16,
'place:municipality' => 0.16,
'place:island' => 0.32,
'place:postcode' => 0.16,
'place:town' => 0.04,
'place:village' => 0.02,
'place:hamlet' => 0.02,
'place:district' => 0.02,
'place:borough' => 0.02,
'place:suburb' => 0.02,
'place:locality' => 0.01,
'place:neighbourhood'=> 0.01,
'place:quarter' => 0.01,
'place:city_block' => 0.01,
'landuse:farm' => 0.01,
'place:farm' => 0.01,
'place:airport' => 0.015,
'aeroway:aerodrome' => 0.015,
'railway:station' => 0.005
);
$sClassPlace = $aPlace['class'].':'.$aPlace['type'];
return $aSpecialRadius[$sClassPlace] ?? 0.00005;
}
/**
* Get the icon to use with the given object.
*/
function getIcon($aPlace)
{
$aIcons = array(
'boundary:administrative' => 'poi_boundary_administrative',
'place:city' => 'poi_place_city',
'place:town' => 'poi_place_town',
'place:village' => 'poi_place_village',
'place:hamlet' => 'poi_place_village',
'place:suburb' => 'poi_place_village',
'place:locality' => 'poi_place_village',
'place:airport' => 'transport_airport2',
'aeroway:aerodrome' => 'transport_airport2',
'railway:station' => 'transport_train_station2',
'amenity:place_of_worship' => 'place_of_worship_unknown3',
'amenity:pub' => 'food_pub',
'amenity:bar' => 'food_bar',
'amenity:university' => 'education_university',
'tourism:museum' => 'tourist_museum',
'amenity:arts_centre' => 'tourist_art_gallery2',
'tourism:zoo' => 'tourist_zoo',
'tourism:theme_park' => 'poi_point_of_interest',
'tourism:attraction' => 'poi_point_of_interest',
'leisure:golf_course' => 'sport_golf',
'historic:castle' => 'tourist_castle',
'amenity:hospital' => 'health_hospital',
'amenity:school' => 'education_school',
'amenity:theatre' => 'tourist_theatre',
'amenity:library' => 'amenity_library',
'amenity:fire_station' => 'amenity_firestation3',
'amenity:police' => 'amenity_police2',
'amenity:bank' => 'money_bank2',
'amenity:post_office' => 'amenity_post_office',
'tourism:hotel' => 'accommodation_hotel2',
'amenity:cinema' => 'tourist_cinema',
'tourism:artwork' => 'tourist_art_gallery2',
'historic:archaeological_site' => 'tourist_archaeological2',
'amenity:doctors' => 'health_doctors',
'leisure:sports_centre' => 'sport_leisure_centre',
'leisure:swimming_pool' => 'sport_swimming_outdoor',
'shop:supermarket' => 'shopping_supermarket',
'shop:convenience' => 'shopping_convenience',
'amenity:restaurant' => 'food_restaurant',
'amenity:fast_food' => 'food_fastfood',
'amenity:cafe' => 'food_cafe',
'tourism:guest_house' => 'accommodation_bed_and_breakfast',
'amenity:pharmacy' => 'health_pharmacy_dispensing',
'amenity:fuel' => 'transport_fuel',
'natural:peak' => 'poi_peak',
'natural:wood' => 'landuse_coniferous_and_deciduous',
'shop:bicycle' => 'shopping_bicycle',
'shop:clothes' => 'shopping_clothes',
'shop:hairdresser' => 'shopping_hairdresser',
'shop:doityourself' => 'shopping_diy',
'shop:estate_agent' => 'shopping_estateagent2',
'shop:car' => 'shopping_car',
'shop:garden_centre' => 'shopping_garden_centre',
'shop:car_repair' => 'shopping_car_repair',
'shop:bakery' => 'shopping_bakery',
'shop:butcher' => 'shopping_butcher',
'shop:apparel' => 'shopping_clothes',
'shop:laundry' => 'shopping_laundrette',
'shop:beverages' => 'shopping_alcohol',
'shop:alcohol' => 'shopping_alcohol',
'shop:optician' => 'health_opticians',
'shop:chemist' => 'health_pharmacy',
'shop:gallery' => 'tourist_art_gallery2',
'shop:jewelry' => 'shopping_jewelry',
'tourism:information' => 'amenity_information',
'historic:ruins' => 'tourist_ruin',
'amenity:college' => 'education_school',
'historic:monument' => 'tourist_monument',
'historic:memorial' => 'tourist_monument',
'historic:mine' => 'poi_mine',
'tourism:caravan_site' => 'accommodation_caravan_park',
'amenity:bus_station' => 'transport_bus_station',
'amenity:atm' => 'money_atm2',
'tourism:viewpoint' => 'tourist_view_point',
'tourism:guesthouse' => 'accommodation_bed_and_breakfast',
'railway:tram' => 'transport_tram_stop',
'amenity:courthouse' => 'amenity_court',
'amenity:recycling' => 'amenity_recycling',
'amenity:dentist' => 'health_dentist',
'natural:beach' => 'tourist_beach',
'railway:tram_stop' => 'transport_tram_stop',
'amenity:prison' => 'amenity_prison',
'highway:bus_stop' => 'transport_bus_stop2'
);
$sClassPlace = $aPlace['class'].':'.$aPlace['type'];
return $aIcons[$sClassPlace] ?? null;
}
/**
* Get an icon for the given object with its full URL.
*/
function getIconFile($aPlace)
{
if (CONST_MapIcon_URL === false) {
return null;
}
$sIcon = getIcon($aPlace);
if (!isset($sIcon)) {
return null;
}
return CONST_MapIcon_URL.'/'.$sIcon.'.p.20.png';
}
/**
* Return a class importance value for the given place.
*
* @param array[] $aPlace Information about the place.
*
* @return int An importance value. The lower the value, the more
* important the class.
*/
function getImportance($aPlace)
{
static $aWithImportance = null;
if ($aWithImportance === null) {
$aWithImportance = array_flip(array(
'boundary:administrative',
'place:country',
'place:state',
'place:province',
'place:county',
'place:city',
'place:region',
'place:island',
'place:town',
'place:village',
'place:hamlet',
'place:suburb',
'place:locality',
'landuse:farm',
'place:farm',
'highway:motorway_junction',
'highway:motorway',
'highway:trunk',
'highway:primary',
'highway:secondary',
'highway:tertiary',
'highway:residential',
'highway:unclassified',
'highway:living_street',
'highway:service',
'highway:track',
'highway:road',
'highway:byway',
'highway:bridleway',
'highway:cycleway',
'highway:pedestrian',
'highway:footway',
'highway:steps',
'highway:motorway_link',
'highway:trunk_link',
'highway:primary_link',
'landuse:industrial',
'landuse:residential',
'landuse:retail',
'landuse:commercial',
'place:airport',
'aeroway:aerodrome',
'railway:station',
'amenity:place_of_worship',
'amenity:pub',
'amenity:bar',
'amenity:university',
'tourism:museum',
'amenity:arts_centre',
'tourism:zoo',
'tourism:theme_park',
'tourism:attraction',
'leisure:golf_course',
'historic:castle',
'amenity:hospital',
'amenity:school',
'amenity:theatre',
'amenity:public_building',
'amenity:library',
'amenity:townhall',
'amenity:community_centre',
'amenity:fire_station',
'amenity:police',
'amenity:bank',
'amenity:post_office',
'leisure:park',
'amenity:park',
'landuse:park',
'landuse:recreation_ground',
'tourism:hotel',
'tourism:motel',
'amenity:cinema',
'tourism:artwork',
'historic:archaeological_site',
'amenity:doctors',
'leisure:sports_centre',
'leisure:swimming_pool',
'shop:supermarket',
'shop:convenience',
'amenity:restaurant',
'amenity:fast_food',
'amenity:cafe',
'tourism:guest_house',
'amenity:pharmacy',
'amenity:fuel',
'natural:peak',
'waterway:waterfall',
'natural:wood',
'natural:water',
'landuse:forest',
'landuse:cemetery',
'landuse:allotments',
'landuse:farmyard',
'railway:rail',
'waterway:canal',
'waterway:river',
'waterway:stream',
'shop:bicycle',
'shop:clothes',
'shop:hairdresser',
'shop:doityourself',
'shop:estate_agent',
'shop:car',
'shop:garden_centre',
'shop:car_repair',
'shop:newsagent',
'shop:bakery',
'shop:furniture',
'shop:butcher',
'shop:apparel',
'shop:electronics',
'shop:department_store',
'shop:books',
'shop:yes',
'shop:outdoor',
'shop:mall',
'shop:florist',
'shop:charity',
'shop:hardware',
'shop:laundry',
'shop:shoes',
'shop:beverages',
'shop:dry_cleaning',
'shop:carpet',
'shop:computer',
'shop:alcohol',
'shop:optician',
'shop:chemist',
'shop:gallery',
'shop:mobile_phone',
'shop:sports',
'shop:jewelry',
'shop:pet',
'shop:beauty',
'shop:stationery',
'shop:shopping_centre',
'shop:general',
'shop:electrical',
'shop:toys',
'shop:jeweller',
'shop:betting',
'shop:household',
'shop:travel_agency',
'shop:hifi',
'amenity:shop',
'tourism:information',
'place:house',
'place:house_name',
'place:house_number',
'place:country_code',
'leisure:pitch',
'highway:unsurfaced',
'historic:ruins',
'amenity:college',
'historic:monument',
'railway:subway',
'historic:memorial',
'leisure:nature_reserve',
'leisure:common',
'waterway:lock_gate',
'natural:fell',
'amenity:nightclub',
'highway:path',
'leisure:garden',
'landuse:reservoir',
'leisure:playground',
'leisure:stadium',
'historic:mine',
'natural:cliff',
'tourism:caravan_site',
'amenity:bus_station',
'amenity:kindergarten',
'highway:construction',
'amenity:atm',
'amenity:emergency_phone',
'waterway:lock',
'waterway:riverbank',
'natural:coastline',
'tourism:viewpoint',
'tourism:hostel',
'tourism:bed_and_breakfast',
'railway:halt',
'railway:platform',
'railway:tram',
'amenity:courthouse',
'amenity:recycling',
'amenity:dentist',
'natural:beach',
'place:moor',
'amenity:grave_yard',
'waterway:drain',
'landuse:grass',
'landuse:village_green',
'natural:bay',
'railway:tram_stop',
'leisure:marina',
'highway:stile',
'natural:moor',
'railway:light_rail',
'railway:narrow_gauge',
'natural:land',
'amenity:village_hall',
'waterway:dock',
'amenity:veterinary',
'landuse:brownfield',
'leisure:track',
'railway:historic_station',
'landuse:construction',
'amenity:prison',
'landuse:quarry',
'amenity:telephone',
'highway:traffic_signals',
'natural:heath',
'historic:house',
'amenity:social_club',
'landuse:military',
'amenity:health_centre',
'historic:building',
'amenity:clinic',
'highway:services',
'amenity:ferry_terminal',
'natural:marsh',
'natural:hill',
'highway:raceway',
'amenity:taxi',
'amenity:take_away',
'amenity:car_rental',
'place:islet',
'amenity:nursery',
'amenity:nursing_home',
'amenity:toilets',
'amenity:hall',
'waterway:boatyard',
'highway:mini_roundabout',
'historic:manor',
'tourism:chalet',
'amenity:bicycle_parking',
'amenity:hotel',
'waterway:weir',
'natural:wetland',
'natural:cave_entrance',
'amenity:crematorium',
'tourism:picnic_site',
'landuse:wood',
'landuse:basin',
'natural:tree',
'leisure:slipway',
'landuse:meadow',
'landuse:piste',
'amenity:care_home',
'amenity:club',
'amenity:medical_centre',
'historic:roman_road',
'historic:fort',
'railway:subway_entrance',
'historic:yes',
'highway:gate',
'leisure:fishing',
'historic:museum',
'amenity:car_wash',
'railway:level_crossing',
'leisure:bird_hide',
'natural:headland',
'tourism:apartments',
'amenity:shopping',
'natural:scrub',
'natural:fen',
'building:yes',
'mountain_pass:yes',
'amenity:parking',
'highway:bus_stop',
'place:postcode',
'amenity:post_box',
'place:houses',
'railway:preserved',
'waterway:derelict_canal',
'amenity:dead_pub',
'railway:disused_station',
'railway:abandoned',
'railway:disused'
));
}
$sClassPlace = $aPlace['class'].':'.$aPlace['type'];
return $aWithImportance[$sClassPlace] ?? null;
}

View File

@@ -1,360 +0,0 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
namespace Nominatim;
require_once(CONST_LibDir.'/DatabaseError.php');
/**
* Uses PDO to access the database specified in the CONST_Database_DSN
* setting.
*/
class DB
{
protected $connection;
public function __construct($sDSN = null)
{
$this->sDSN = $sDSN ?? getSetting('DATABASE_DSN');
}
public function connect($bNew = false, $bPersistent = true)
{
if (isset($this->connection) && !$bNew) {
return true;
}
$aConnOptions = array(
\PDO::ATTR_ERRMODE => \PDO::ERRMODE_EXCEPTION,
\PDO::ATTR_DEFAULT_FETCH_MODE => \PDO::FETCH_ASSOC,
\PDO::ATTR_PERSISTENT => $bPersistent
);
// https://secure.php.net/manual/en/ref.pdo-pgsql.connection.php
try {
$conn = new \PDO($this->sDSN, null, null, $aConnOptions);
} catch (\PDOException $e) {
$sMsg = 'Failed to establish database connection:' . $e->getMessage();
throw new \Nominatim\DatabaseError($sMsg, 500, null, $e->getMessage());
}
$conn->exec("SET DateStyle TO 'sql,european'");
$conn->exec("SET client_encoding TO 'utf-8'");
// Disable JIT and parallel workers. They interfere badly with search SQL.
$conn->exec("UPDATE pg_settings SET setting = -1 WHERE name = 'jit_above_cost'");
$conn->exec("UPDATE pg_settings SET setting = 0 WHERE name = 'max_parallel_workers_per_gather'");
$iMaxExecution = ini_get('max_execution_time');
if ($iMaxExecution > 0) {
$conn->setAttribute(\PDO::ATTR_TIMEOUT, $iMaxExecution); // seconds
}
$this->connection = $conn;
return true;
}
// returns the number of rows that were modified or deleted by the SQL
// statement. If no rows were affected returns 0.
public function exec($sSQL, $aInputVars = null, $sErrMessage = 'Database query failed')
{
$val = null;
try {
if (isset($aInputVars)) {
$stmt = $this->connection->prepare($sSQL);
$stmt->execute($aInputVars);
} else {
$val = $this->connection->exec($sSQL);
}
} catch (\PDOException $e) {
throw new \Nominatim\DatabaseError($sErrMessage, 500, null, $e, $sSQL);
}
return $val;
}
/**
* Executes query. Returns first row as array.
* Returns false if no result found.
*
* @param string $sSQL
*
* @return array[]
*/
public function getRow($sSQL, $aInputVars = null, $sErrMessage = 'Database query failed')
{
try {
$stmt = $this->getQueryStatement($sSQL, $aInputVars, $sErrMessage);
$row = $stmt->fetch();
} catch (\PDOException $e) {
throw new \Nominatim\DatabaseError($sErrMessage, 500, null, $e, $sSQL);
}
return $row;
}
/**
* Executes query. Returns first value of first result.
* Returns false if no results found.
*
* @param string $sSQL
*
* @return array[]
*/
public function getOne($sSQL, $aInputVars = null, $sErrMessage = 'Database query failed')
{
try {
$stmt = $this->getQueryStatement($sSQL, $aInputVars, $sErrMessage);
$row = $stmt->fetch(\PDO::FETCH_NUM);
if ($row === false) {
return false;
}
} catch (\PDOException $e) {
throw new \Nominatim\DatabaseError($sErrMessage, 500, null, $e, $sSQL);
}
return $row[0];
}
/**
* Executes query. Returns array of results (arrays).
* Returns empty array if no results found.
*
* @param string $sSQL
*
* @return array[]
*/
public function getAll($sSQL, $aInputVars = null, $sErrMessage = 'Database query failed')
{
try {
$stmt = $this->getQueryStatement($sSQL, $aInputVars, $sErrMessage);
$rows = $stmt->fetchAll();
} catch (\PDOException $e) {
throw new \Nominatim\DatabaseError($sErrMessage, 500, null, $e, $sSQL);
}
return $rows;
}
/**
* Executes query. Returns array of the first value of each result.
* Returns empty array if no results found.
*
* @param string $sSQL
*
* @return array[]
*/
public function getCol($sSQL, $aInputVars = null, $sErrMessage = 'Database query failed')
{
$aVals = array();
try {
$stmt = $this->getQueryStatement($sSQL, $aInputVars, $sErrMessage);
while (($val = $stmt->fetchColumn(0)) !== false) { // returns first column or false
$aVals[] = $val;
}
} catch (\PDOException $e) {
throw new \Nominatim\DatabaseError($sErrMessage, 500, null, $e, $sSQL);
}
return $aVals;
}
/**
* Executes query. Returns associate array mapping first value to second value of each result.
* Returns empty array if no results found.
*
* @param string $sSQL
*
* @return array[]
*/
public function getAssoc($sSQL, $aInputVars = null, $sErrMessage = 'Database query failed')
{
try {
$stmt = $this->getQueryStatement($sSQL, $aInputVars, $sErrMessage);
$aList = array();
while ($aRow = $stmt->fetch(\PDO::FETCH_NUM)) {
$aList[$aRow[0]] = $aRow[1];
}
} catch (\PDOException $e) {
throw new \Nominatim\DatabaseError($sErrMessage, 500, null, $e, $sSQL);
}
return $aList;
}
/**
* Executes query. Returns a PDO statement to iterate over.
*
* @param string $sSQL
*
* @return PDOStatement
*/
public function getQueryStatement($sSQL, $aInputVars = null, $sErrMessage = 'Database query failed')
{
try {
if (isset($aInputVars)) {
$stmt = $this->connection->prepare($sSQL);
$stmt->execute($aInputVars);
} else {
$stmt = $this->connection->query($sSQL);
}
} catch (\PDOException $e) {
throw new \Nominatim\DatabaseError($sErrMessage, 500, null, $e, $sSQL);
}
return $stmt;
}
/**
* St. John's Way => 'St. John\'s Way'
*
* @param string $sVal Text to be quoted.
*
* @return string
*/
public function getDBQuoted($sVal)
{
return $this->connection->quote($sVal);
}
/**
* Like getDBQuoted, but takes an array.
*
* @param array $aVals List of text to be quoted.
*
* @return array[]
*/
public function getDBQuotedList($aVals)
{
return array_map(function ($sVal) {
return $this->getDBQuoted($sVal);
}, $aVals);
}
/**
* [1,2,'b'] => 'ARRAY[1,2,'b']''
*
* @param array $aVals List of text to be quoted.
*
* @return string
*/
public function getArraySQL($a)
{
return 'ARRAY['.join(',', $a).']';
}
/**
* Check if a table exists in the database. Returns true if it does.
*
* @param string $sTableName
*
* @return boolean
*/
public function tableExists($sTableName)
{
$sSQL = 'SELECT count(*) FROM pg_tables WHERE tablename = :tablename';
return ($this->getOne($sSQL, array(':tablename' => $sTableName)) == 1);
}
/**
* Deletes a table. Returns true if deleted or didn't exist.
*
* @param string $sTableName
*
* @return boolean
*/
public function deleteTable($sTableName)
{
return $this->exec('DROP TABLE IF EXISTS '.$sTableName.' CASCADE') == 0;
}
/**
* Tries to connect to the database but on failure doesn't throw an exception.
*
* @return boolean
*/
public function checkConnection()
{
$bExists = true;
try {
$this->connect(true);
} catch (\Nominatim\DatabaseError $e) {
$bExists = false;
}
return $bExists;
}
/**
* e.g. 9.6, 10, 11.2
*
* @return float
*/
public function getPostgresVersion()
{
$sVersionString = $this->getOne('SHOW server_version_num');
preg_match('#([0-9]?[0-9])([0-9][0-9])[0-9][0-9]#', $sVersionString, $aMatches);
return (float) ($aMatches[1].'.'.$aMatches[2]);
}
/**
* e.g. 2, 2.2
*
* @return float
*/
public function getPostgisVersion()
{
$sVersionString = $this->getOne('select postgis_lib_version()');
preg_match('#^([0-9]+)[.]([0-9]+)[.]#', $sVersionString, $aMatches);
return (float) ($aMatches[1].'.'.$aMatches[2]);
}
/**
* Returns an associate array of postgresql database connection settings. Keys can
* be 'database', 'hostspec', 'port', 'username', 'password'.
* Returns empty array on failure, thus check if at least 'database' is set.
*
* @return array[]
*/
public static function parseDSN($sDSN)
{
// https://secure.php.net/manual/en/ref.pdo-pgsql.connection.php
$aInfo = array();
if (preg_match('/^pgsql:(.+)$/', $sDSN, $aMatches)) {
foreach (explode(';', $aMatches[1]) as $sKeyVal) {
list($sKey, $sVal) = explode('=', $sKeyVal, 2);
if ($sKey == 'host') {
$sKey = 'hostspec';
} elseif ($sKey == 'dbname') {
$sKey = 'database';
} elseif ($sKey == 'user') {
$sKey = 'username';
}
$aInfo[$sKey] = $sVal;
}
}
return $aInfo;
}
/**
* Takes an array of settings and return the DNS string. Key names can be
* 'database', 'hostspec', 'port', 'username', 'password' but aliases
* 'dbname', 'host' and 'user' are also supported.
*
* @return string
*
*/
public static function generateDSN($aInfo)
{
$sDSN = sprintf(
'pgsql:host=%s;port=%s;dbname=%s;user=%s;password=%s;',
$aInfo['host'] ?? $aInfo['hostspec'] ?? '',
$aInfo['port'] ?? '',
$aInfo['dbname'] ?? $aInfo['database'] ?? '',
$aInfo['user'] ?? '',
$aInfo['password'] ?? ''
);
$sDSN = preg_replace('/\b\w+=;/', '', $sDSN);
$sDSN = preg_replace('/;\Z/', '', $sDSN);
return $sDSN;
}
}

View File

@@ -1,42 +0,0 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
namespace Nominatim;
class DatabaseError extends \Exception
{
public function __construct($message, $code, $previous, $oPDOErr, $sSql = null)
{
parent::__construct($message, $code, $previous);
// https://secure.php.net/manual/en/class.pdoexception.php
$this->oPDOErr = $oPDOErr;
$this->sSql = $sSql;
}
public function __toString()
{
return __CLASS__ . ": [{$this->code}]: {$this->message}\n";
}
public function getSqlError()
{
return $this->oPDOErr->getMessage();
}
public function getSqlDebugDump()
{
if (CONST_Debug) {
return var_export($this->oPDOErr, true);
} else {
return $this->sSql;
}
}
}

View File

@@ -1,188 +0,0 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
namespace Nominatim;
class Debug
{
public static function newFunction($sHeading)
{
echo "<pre><h2>Debug output for $sHeading</h2></pre>\n";
}
public static function newSection($sHeading)
{
echo "<hr><pre><h3>$sHeading</h3></pre>\n";
}
public static function printVar($sHeading, $mVar)
{
echo '<pre><b>'.$sHeading. ':</b> ';
Debug::outputVar($mVar, str_repeat(' ', strlen($sHeading) + 3));
echo "</pre>\n";
}
public static function fmtArrayVals($aArr)
{
return array('__debug_format' => 'array_vals', 'data' => $aArr);
}
public static function printDebugArray($sHeading, $oVar)
{
if ($oVar === null) {
Debug::printVar($sHeading, 'null');
} else {
Debug::printVar($sHeading, $oVar->debugInfo());
}
}
public static function printDebugTable($sHeading, $aVar)
{
echo '<b>'.$sHeading.":</b>\n";
echo "<table border='1'>\n";
if (!empty($aVar)) {
echo " <tr>\n";
$aKeys = array();
$aInfo = reset($aVar);
if (!is_array($aInfo)) {
$aInfo = $aInfo->debugInfo();
}
foreach ($aInfo as $sKey => $mVal) {
echo ' <th><small>'.$sKey.'</small></th>'."\n";
$aKeys[] = $sKey;
}
echo " </tr>\n";
foreach ($aVar as $oRow) {
$aInfo = $oRow;
if (!is_array($oRow)) {
$aInfo = $oRow->debugInfo();
}
echo " <tr>\n";
foreach ($aKeys as $sKey) {
echo ' <td><pre>';
if (isset($aInfo[$sKey])) {
Debug::outputVar($aInfo[$sKey], '');
}
echo '</pre></td>'."\n";
}
echo " </tr>\n";
}
}
echo "</table>\n";
}
public static function printGroupedSearch($aSearches, $aWordsIDs)
{
echo '<table border="1">';
echo '<tr><th>rank</th><th>Name Tokens</th><th>Name Not</th>';
echo '<th>Address Tokens</th><th>Address Not</th>';
echo '<th>country</th><th>operator</th>';
echo '<th>class</th><th>type</th><th>postcode</th><th>housenumber</th></tr>';
foreach ($aSearches as $aRankedSet) {
foreach ($aRankedSet as $aRow) {
$aRow->dumpAsHtmlTableRow($aWordsIDs);
}
}
echo '</table>';
}
public static function printGroupTable($sHeading, $aVar)
{
echo '<b>'.$sHeading.":</b>\n";
echo "<table border='1'>\n";
if (!empty($aVar)) {
echo " <tr>\n";
echo ' <th><small>Group</small></th>'."\n";
$aKeys = array();
$aInfo = reset($aVar)[0];
if (!is_array($aInfo)) {
$aInfo = $aInfo->debugInfo();
}
foreach ($aInfo as $sKey => $mVal) {
echo ' <th><small>'.$sKey.'</small></th>'."\n";
$aKeys[] = $sKey;
}
echo " </tr>\n";
foreach ($aVar as $sGrpKey => $aGroup) {
foreach ($aGroup as $oRow) {
$aInfo = $oRow;
if (!is_array($oRow)) {
$aInfo = $oRow->debugInfo();
}
echo " <tr>\n";
echo ' <td><pre>'.$sGrpKey.'</pre></td>'."\n";
foreach ($aKeys as $sKey) {
echo ' <td><pre>';
if (!empty($aInfo[$sKey])) {
Debug::outputVar($aInfo[$sKey], '');
}
echo '</pre></td>'."\n";
}
echo " </tr>\n";
}
}
}
echo "</table>\n";
}
public static function printSQL($sSQL)
{
echo '<p><tt><font color="#aaa">'.$sSQL.'</font></tt></p>'."\n";
}
private static function outputVar($mVar, $sPreNL)
{
if (is_array($mVar) && !isset($mVar['__debug_format'])) {
$sPre = '';
foreach ($mVar as $mKey => $aValue) {
echo $sPre;
$iKeyLen = Debug::outputSimpleVar($mKey);
echo ' => ';
Debug::outputVar(
$aValue,
$sPreNL.str_repeat(' ', $iKeyLen + 4)
);
$sPre = "\n".$sPreNL;
}
} elseif (is_array($mVar) && isset($mVar['__debug_format'])) {
if (!empty($mVar['data'])) {
$sPre = '';
foreach ($mVar['data'] as $mValue) {
echo $sPre;
Debug::outputSimpleVar($mValue);
$sPre = ', ';
}
}
} elseif (is_object($mVar) && method_exists($mVar, 'debugInfo')) {
Debug::outputVar($mVar->debugInfo(), $sPreNL);
} elseif (is_a($mVar, 'stdClass')) {
Debug::outputVar(json_decode(json_encode($mVar), true), $sPreNL);
} else {
Debug::outputSimpleVar($mVar);
}
}
private static function outputSimpleVar($mVar)
{
if (is_bool($mVar)) {
echo '<i>'.($mVar ? 'True' : 'False').'</i>';
return $mVar ? 4 : 5;
}
if (is_string($mVar)) {
echo "'$mVar'";
return strlen($mVar) + 2;
}
echo (string)$mVar;
return strlen((string)$mVar);
}
}

View File

@@ -1,19 +0,0 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
namespace Nominatim;
class Debug
{
public static function __callStatic($name, $arguments)
{
// nothing
}
}

View File

@@ -1,938 +0,0 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
namespace Nominatim;
require_once(CONST_LibDir.'/PlaceLookup.php');
require_once(CONST_LibDir.'/Phrase.php');
require_once(CONST_LibDir.'/ReverseGeocode.php');
require_once(CONST_LibDir.'/SearchDescription.php');
require_once(CONST_LibDir.'/SearchContext.php');
require_once(CONST_LibDir.'/SearchPosition.php');
require_once(CONST_LibDir.'/TokenList.php');
require_once(CONST_TokenizerDir.'/tokenizer.php');
class Geocode
{
protected $oDB;
protected $oPlaceLookup;
protected $oTokenizer;
protected $aLangPrefOrder = array();
protected $aExcludePlaceIDs = array();
protected $iLimit = 20;
protected $iFinalLimit = 10;
protected $iOffset = 0;
protected $bFallback = false;
protected $aCountryCodes = false;
protected $bBoundedSearch = false;
protected $aViewBox = false;
protected $aRoutePoints = false;
protected $aRouteWidth = false;
protected $iMaxRank = 20;
protected $iMinAddressRank = 0;
protected $iMaxAddressRank = 30;
protected $aAddressRankList = array();
protected $sAllowedTypesSQLList = false;
protected $sQuery = false;
protected $aStructuredQuery = false;
public function __construct(&$oDB)
{
$this->oDB =& $oDB;
$this->oPlaceLookup = new PlaceLookup($this->oDB);
$this->oTokenizer = new \Nominatim\Tokenizer($this->oDB);
}
public function setLanguagePreference($aLangPref)
{
$this->aLangPrefOrder = $aLangPref;
}
public function getMoreUrlParams()
{
if ($this->aStructuredQuery) {
$aParams = $this->aStructuredQuery;
} else {
$aParams = array('q' => $this->sQuery);
}
$aParams = array_merge($aParams, $this->oPlaceLookup->getMoreUrlParams());
if ($this->aExcludePlaceIDs) {
$aParams['exclude_place_ids'] = implode(',', $this->aExcludePlaceIDs);
}
if ($this->bBoundedSearch) {
$aParams['bounded'] = '1';
}
if ($this->aCountryCodes) {
$aParams['countrycodes'] = implode(',', $this->aCountryCodes);
}
if ($this->aViewBox) {
$aParams['viewbox'] = join(',', $this->aViewBox);
}
return $aParams;
}
public function setLimit($iLimit = 10)
{
if ($iLimit > 50) {
$iLimit = 50;
} elseif ($iLimit < 1) {
$iLimit = 1;
}
$this->iFinalLimit = $iLimit;
$this->iLimit = $iLimit + max($iLimit, 10);
}
public function setFeatureType($sFeatureType)
{
switch ($sFeatureType) {
case 'country':
$this->setRankRange(4, 4);
break;
case 'state':
$this->setRankRange(8, 8);
break;
case 'city':
$this->setRankRange(14, 16);
break;
case 'settlement':
$this->setRankRange(8, 20);
break;
}
}
public function setRankRange($iMin, $iMax)
{
$this->iMinAddressRank = $iMin;
$this->iMaxAddressRank = $iMax;
}
public function setViewbox($aViewbox)
{
$aBox = array_map('floatval', $aViewbox);
$this->aViewBox[0] = max(-180.0, min($aBox[0], $aBox[2]));
$this->aViewBox[1] = max(-90.0, min($aBox[1], $aBox[3]));
$this->aViewBox[2] = min(180.0, max($aBox[0], $aBox[2]));
$this->aViewBox[3] = min(90.0, max($aBox[1], $aBox[3]));
if ($this->aViewBox[2] - $this->aViewBox[0] < 0.000000001
|| $this->aViewBox[3] - $this->aViewBox[1] < 0.000000001
) {
userError("Bad parameter 'viewbox'. Not a box.");
}
}
private function viewboxImportanceFactor($fX, $fY)
{
if (!$this->aViewBox) {
return 1;
}
$fWidth = ($this->aViewBox[2] - $this->aViewBox[0])/2;
$fHeight = ($this->aViewBox[3] - $this->aViewBox[1])/2;
$fXDist = abs($fX - ($this->aViewBox[0] + $this->aViewBox[2])/2);
$fYDist = abs($fY - ($this->aViewBox[1] + $this->aViewBox[3])/2);
if ($fXDist <= $fWidth && $fYDist <= $fHeight) {
return 1;
}
if ($fXDist <= $fWidth * 3 && $fYDist <= 3 * $fHeight) {
return 0.5;
}
return 0.25;
}
public function setQuery($sQueryString)
{
$this->sQuery = $sQueryString;
$this->aStructuredQuery = false;
}
public function getQueryString()
{
return $this->sQuery;
}
public function loadParamArray($oParams, $sForceGeometryType = null)
{
$this->bBoundedSearch = $oParams->getBool('bounded', $this->bBoundedSearch);
$this->setLimit($oParams->getInt('limit', $this->iFinalLimit));
$this->iOffset = $oParams->getInt('offset', $this->iOffset);
$this->bFallback = $oParams->getBool('fallback', $this->bFallback);
// List of excluded Place IDs - used for more accurate pageing
$sExcluded = $oParams->getStringList('exclude_place_ids');
if ($sExcluded) {
foreach ($sExcluded as $iExcludedPlaceID) {
$iExcludedPlaceID = (int)$iExcludedPlaceID;
if ($iExcludedPlaceID) {
$aExcludePlaceIDs[$iExcludedPlaceID] = $iExcludedPlaceID;
}
}
if (isset($aExcludePlaceIDs)) {
$this->aExcludePlaceIDs = $aExcludePlaceIDs;
}
}
// Only certain ranks of feature
$sFeatureType = $oParams->getString('featureType');
if (!$sFeatureType) {
$sFeatureType = $oParams->getString('featuretype');
}
if ($sFeatureType) {
$this->setFeatureType($sFeatureType);
}
// Country code list
$sCountries = $oParams->getStringList('countrycodes');
if ($sCountries) {
foreach ($sCountries as $sCountryCode) {
if (preg_match('/^[a-zA-Z][a-zA-Z]$/', $sCountryCode)) {
$aCountries[] = strtolower($sCountryCode);
}
}
if (isset($aCountries)) {
$this->aCountryCodes = $aCountries;
}
}
$aViewbox = $oParams->getStringList('viewboxlbrt');
if ($aViewbox) {
if (count($aViewbox) != 4) {
userError("Bad parameter 'viewboxlbrt'. Expected 4 coordinates.");
}
$this->setViewbox($aViewbox);
} else {
$aViewbox = $oParams->getStringList('viewbox');
if ($aViewbox) {
if (count($aViewbox) != 4) {
userError("Bad parameter 'viewbox'. Expected 4 coordinates.");
}
$this->setViewBox($aViewbox);
} else {
$aRoute = $oParams->getStringList('route');
$fRouteWidth = $oParams->getFloat('routewidth');
if ($aRoute && $fRouteWidth) {
$this->aRoutePoints = $aRoute;
$this->aRouteWidth = $fRouteWidth;
}
}
}
$this->oPlaceLookup->loadParamArray($oParams, $sForceGeometryType);
$this->oPlaceLookup->setIncludeAddressDetails($oParams->getBool('addressdetails', false));
}
public function setQueryFromParams($oParams)
{
// Search query
$sQuery = $oParams->getString('q');
if (!$sQuery) {
$this->setStructuredQuery(
$oParams->getString('amenity'),
$oParams->getString('street'),
$oParams->getString('city'),
$oParams->getString('county'),
$oParams->getString('state'),
$oParams->getString('country'),
$oParams->getString('postalcode')
);
} else {
$this->setQuery($sQuery);
}
}
public function loadStructuredAddressElement($sValue, $sKey, $iNewMinAddressRank, $iNewMaxAddressRank, $aItemListValues)
{
$sValue = trim($sValue);
if (!$sValue) {
return false;
}
$this->aStructuredQuery[$sKey] = $sValue;
if ($this->iMinAddressRank == 0 && $this->iMaxAddressRank == 30) {
$this->iMinAddressRank = $iNewMinAddressRank;
$this->iMaxAddressRank = $iNewMaxAddressRank;
}
if ($aItemListValues) {
$this->aAddressRankList = array_merge($this->aAddressRankList, $aItemListValues);
}
return true;
}
public function setStructuredQuery($sAmenity = false, $sStreet = false, $sCity = false, $sCounty = false, $sState = false, $sCountry = false, $sPostalCode = false)
{
$this->sQuery = false;
// Reset
$this->iMinAddressRank = 0;
$this->iMaxAddressRank = 30;
$this->aAddressRankList = array();
$this->aStructuredQuery = array();
$this->sAllowedTypesSQLList = false;
$this->loadStructuredAddressElement($sAmenity, 'amenity', 26, 30, false);
$this->loadStructuredAddressElement($sStreet, 'street', 26, 30, false);
$this->loadStructuredAddressElement($sCity, 'city', 14, 24, false);
$this->loadStructuredAddressElement($sCounty, 'county', 9, 13, false);
$this->loadStructuredAddressElement($sState, 'state', 8, 8, false);
$this->loadStructuredAddressElement($sPostalCode, 'postalcode', 5, 11, array(5, 11));
$this->loadStructuredAddressElement($sCountry, 'country', 4, 4, false);
if (!empty($this->aStructuredQuery)) {
$this->sQuery = join(', ', $this->aStructuredQuery);
if ($this->iMaxAddressRank < 30) {
$this->sAllowedTypesSQLList = '(\'place\',\'boundary\')';
}
}
}
public function fallbackStructuredQuery()
{
$aParams = $this->aStructuredQuery;
if (!$aParams || count($aParams) == 1) {
return false;
}
$aOrderToFallback = array('postalcode', 'street', 'city', 'county', 'state');
foreach ($aOrderToFallback as $sType) {
if (isset($aParams[$sType])) {
unset($aParams[$sType]);
$this->setStructuredQuery(@$aParams['amenity'], @$aParams['street'], @$aParams['city'], @$aParams['county'], @$aParams['state'], @$aParams['country'], @$aParams['postalcode']);
return true;
}
}
return false;
}
public function getGroupedSearches($aSearches, $aPhrases, $oValidTokens)
{
/*
Calculate all searches using oValidTokens i.e.
'Wodsworth Road, Sheffield' =>
Phrase Wordset
0 0 (wodsworth road)
0 1 (wodsworth)(road)
1 0 (sheffield)
Score how good the search is so they can be ordered
*/
foreach ($aPhrases as $iPhrase => $oPhrase) {
$aNewPhraseSearches = array();
$oPosition = new SearchPosition(
$oPhrase->getPhraseType(),
$iPhrase,
count($aPhrases)
);
foreach ($oPhrase->getWordSets() as $aWordset) {
$aWordsetSearches = $aSearches;
// Add all words from this wordset
foreach ($aWordset as $iToken => $sToken) {
$aNewWordsetSearches = array();
$oPosition->setTokenPosition($iToken, count($aWordset));
foreach ($aWordsetSearches as $oCurrentSearch) {
foreach ($oValidTokens->get($sToken) as $oSearchTerm) {
if ($oSearchTerm->isExtendable($oCurrentSearch, $oPosition)) {
$aNewSearches = $oSearchTerm->extendSearch(
$oCurrentSearch,
$oPosition
);
foreach ($aNewSearches as $oSearch) {
if ($oSearch->getRank() < $this->iMaxRank) {
$aNewWordsetSearches[] = $oSearch;
}
}
}
}
}
// Sort and cut
usort($aNewWordsetSearches, array('Nominatim\SearchDescription', 'bySearchRank'));
$aWordsetSearches = array_slice($aNewWordsetSearches, 0, 50);
}
$aNewPhraseSearches = array_merge($aNewPhraseSearches, $aNewWordsetSearches);
usort($aNewPhraseSearches, array('Nominatim\SearchDescription', 'bySearchRank'));
$aSearchHash = array();
foreach ($aNewPhraseSearches as $iSearch => $aSearch) {
$sHash = serialize($aSearch);
if (isset($aSearchHash[$sHash])) {
unset($aNewPhraseSearches[$iSearch]);
} else {
$aSearchHash[$sHash] = 1;
}
}
$aNewPhraseSearches = array_slice($aNewPhraseSearches, 0, 50);
}
// Re-group the searches by their score, junk anything over 20 as just not worth trying
$aGroupedSearches = array();
foreach ($aNewPhraseSearches as $aSearch) {
$iRank = $aSearch->getRank();
if ($iRank < $this->iMaxRank) {
if (!isset($aGroupedSearches[$iRank])) {
$aGroupedSearches[$iRank] = array();
}
$aGroupedSearches[$iRank][] = $aSearch;
}
}
ksort($aGroupedSearches);
$iSearchCount = 0;
$aSearches = array();
foreach ($aGroupedSearches as $aNewSearches) {
$iSearchCount += count($aNewSearches);
$aSearches = array_merge($aSearches, $aNewSearches);
if ($iSearchCount > 50) {
break;
}
}
}
// Revisit searches, drop bad searches and give penalty to unlikely combinations.
$aGroupedSearches = array();
foreach ($aSearches as $oSearch) {
if (!$oSearch->isValidSearch()) {
continue;
}
$iRank = $oSearch->getRank();
if (!isset($aGroupedSearches[$iRank])) {
$aGroupedSearches[$iRank] = array();
}
$aGroupedSearches[$iRank][] = $oSearch;
}
ksort($aGroupedSearches);
return $aGroupedSearches;
}
/* Perform the actual query lookup.
Returns an ordered list of results, each with the following fields:
osm_type: type of corresponding OSM object
N - node
W - way
R - relation
P - postcode (internally computed)
osm_id: id of corresponding OSM object
class: general object class (corresponds to tag key of primary OSM tag)
type: subclass of object (corresponds to tag value of primary OSM tag)
admin_level: see https://wiki.openstreetmap.org/wiki/Admin_level
rank_search: rank in search hierarchy
(see also https://wiki.openstreetmap.org/wiki/Nominatim/Development_overview#Country_to_street_level)
rank_address: rank in address hierarchy (determines orer in address)
place_id: internal key (may differ between different instances)
country_code: ISO country code
langaddress: localized full address
placename: localized name of object
ref: content of ref tag (if available)
lon: longitude
lat: latitude
importance: importance of place based on Wikipedia link count
addressimportance: cumulated importance of address elements
extra_place: type of place (for admin boundaries, if there is a place tag)
aBoundingBox: bounding Box
label: short description of the object class/type (English only)
name: full name (currently the same as langaddress)
foundorder: secondary ordering for places with same importance
*/
public function lookup()
{
Debug::newFunction('Geocode::lookup');
if (!$this->sQuery && !$this->aStructuredQuery) {
return array();
}
Debug::printDebugArray('Geocode', $this);
$oCtx = new SearchContext();
if ($this->aRoutePoints) {
$oCtx->setViewboxFromRoute(
$this->oDB,
$this->aRoutePoints,
$this->aRouteWidth,
$this->bBoundedSearch
);
} elseif ($this->aViewBox) {
$oCtx->setViewboxFromBox($this->aViewBox, $this->bBoundedSearch);
}
if ($this->aExcludePlaceIDs) {
$oCtx->setExcludeList($this->aExcludePlaceIDs);
}
if ($this->aCountryCodes) {
$oCtx->setCountryList($this->aCountryCodes);
}
Debug::newSection('Query Preprocessing');
$sQuery = $this->sQuery;
if (!preg_match('//u', $sQuery)) {
userError('Query string is not UTF-8 encoded.');
}
// Do we have anything that looks like a lat/lon pair?
$sQuery = $oCtx->setNearPointFromQuery($sQuery);
if ($sQuery || $this->aStructuredQuery) {
// Start with a single blank search
$aSearches = array(new SearchDescription($oCtx));
if ($sQuery) {
$sQuery = $aSearches[0]->extractKeyValuePairs($sQuery);
}
$sSpecialTerm = '';
if ($sQuery) {
preg_match_all(
'/\\[([\\w ]*)\\]/u',
$sQuery,
$aSpecialTermsRaw,
PREG_SET_ORDER
);
if (!empty($aSpecialTermsRaw)) {
Debug::printVar('Special terms', $aSpecialTermsRaw);
}
foreach ($aSpecialTermsRaw as $aSpecialTerm) {
$sQuery = str_replace($aSpecialTerm[0], ' ', $sQuery);
if (!$sSpecialTerm) {
$sSpecialTerm = $aSpecialTerm[1];
}
}
}
if (!$sSpecialTerm && $this->aStructuredQuery
&& isset($this->aStructuredQuery['amenity'])) {
$sSpecialTerm = $this->aStructuredQuery['amenity'];
unset($this->aStructuredQuery['amenity']);
}
if ($sSpecialTerm && !$aSearches[0]->hasOperator()) {
$aTokens = $this->oTokenizer->tokensForSpecialTerm($sSpecialTerm);
if (!empty($aTokens)) {
$aNewSearches = array();
$oPosition = new SearchPosition('', 0, 1);
$oPosition->setTokenPosition(0, 1);
foreach ($aSearches as $oSearch) {
foreach ($aTokens as $oToken) {
$aNewSearches = array_merge(
$aNewSearches,
$oToken->extendSearch($oSearch, $oPosition)
);
}
}
$aSearches = $aNewSearches;
}
}
// Split query into phrases
// Commas are used to reduce the search space by indicating where phrases split
$aPhrases = array();
if ($this->aStructuredQuery) {
foreach ($this->aStructuredQuery as $iPhrase => $sPhrase) {
$aPhrases[] = new Phrase($sPhrase, $iPhrase);
}
} else {
foreach (explode(',', $sQuery) as $sPhrase) {
$aPhrases[] = new Phrase($sPhrase, '');
}
}
Debug::printDebugArray('Search context', $oCtx);
Debug::printDebugArray('Base search', empty($aSearches) ? null : $aSearches[0]);
Debug::newSection('Tokenization');
$oValidTokens = $this->oTokenizer->extractTokensFromPhrases($aPhrases);
if ($oValidTokens->count() > 0) {
$oCtx->setFullNameWords($oValidTokens->getFullWordIDs());
$aPhrases = array_filter($aPhrases, function ($oPhrase) {
return $oPhrase->getWordSets() !== null;
});
// Any words that have failed completely?
// TODO: suggestions
Debug::printGroupTable('Valid Tokens', $oValidTokens->debugInfo());
Debug::printDebugTable('Phrases', $aPhrases);
Debug::newSection('Search candidates');
$aGroupedSearches = $this->getGroupedSearches($aSearches, $aPhrases, $oValidTokens);
if (!$this->aStructuredQuery) {
// Reverse phrase array and also reverse the order of the wordsets in
// the first and final phrase. Don't bother about phrases in the middle
// because order in the address doesn't matter.
$aPhrases = array_reverse($aPhrases);
$aPhrases[0]->invertWordSets();
if (count($aPhrases) > 1) {
$aPhrases[count($aPhrases)-1]->invertWordSets();
}
$aReverseGroupedSearches = $this->getGroupedSearches($aSearches, $aPhrases, $oValidTokens);
foreach ($aReverseGroupedSearches as $aSearches) {
foreach ($aSearches as $aSearch) {
if (!isset($aGroupedSearches[$aSearch->getRank()])) {
$aGroupedSearches[$aSearch->getRank()] = array();
}
$aGroupedSearches[$aSearch->getRank()][] = $aSearch;
}
}
ksort($aGroupedSearches);
}
} else {
// Re-group the searches by their score, junk anything over 20 as just not worth trying
$aGroupedSearches = array();
foreach ($aSearches as $aSearch) {
if ($aSearch->getRank() < $this->iMaxRank) {
if (!isset($aGroupedSearches[$aSearch->getRank()])) {
$aGroupedSearches[$aSearch->getRank()] = array();
}
$aGroupedSearches[$aSearch->getRank()][] = $aSearch;
}
}
ksort($aGroupedSearches);
}
// Filter out duplicate searches
$aSearchHash = array();
foreach ($aGroupedSearches as $iGroup => $aSearches) {
foreach ($aSearches as $iSearch => $aSearch) {
$sHash = serialize($aSearch);
if (isset($aSearchHash[$sHash])) {
unset($aGroupedSearches[$iGroup][$iSearch]);
if (empty($aGroupedSearches[$iGroup])) {
unset($aGroupedSearches[$iGroup]);
}
} else {
$aSearchHash[$sHash] = 1;
}
}
}
Debug::printGroupedSearch(
$aGroupedSearches,
$oValidTokens->debugTokenByWordIdList()
);
// Start the search process
$iGroupLoop = 0;
$iQueryLoop = 0;
$aNextResults = array();
foreach ($aGroupedSearches as $iGroupedRank => $aSearches) {
$iGroupLoop++;
$aResults = $aNextResults;
foreach ($aSearches as $oSearch) {
$iQueryLoop++;
Debug::newSection("Search Loop, group $iGroupLoop, loop $iQueryLoop");
Debug::printGroupedSearch(
array($iGroupedRank => array($oSearch)),
$oValidTokens->debugTokenByWordIdList()
);
$aNewResults = $oSearch->query(
$this->oDB,
$this->iMinAddressRank,
$this->iMaxAddressRank,
$this->iLimit
);
// The same result may appear in different rounds, only
// use the one with minimal rank.
foreach ($aNewResults as $iPlace => $oRes) {
if (!isset($aResults[$iPlace])
|| $aResults[$iPlace]->iResultRank > $oRes->iResultRank) {
$aResults[$iPlace] = $oRes;
}
}
if ($iQueryLoop > 20) {
break;
}
}
if (!empty($aResults)) {
$aSplitResults = Result::splitResults($aResults);
Debug::printVar('Split results', $aSplitResults);
if ($iGroupLoop <= 4
&& reset($aSplitResults['head'])->iResultRank > 0
&& $iGroupedRank !== array_key_last($aGroupedSearches)) {
// Haven't found an exact match for the query yet.
// Therefore add result from the next group level.
$aNextResults = $aSplitResults['head'];
foreach ($aNextResults as $oRes) {
$oRes->iResultRank--;
}
foreach ($aSplitResults['tail'] as $oRes) {
$oRes->iResultRank--;
$aNextResults[$oRes->iId] = $oRes;
}
$aResults = array();
} else {
$aResults = $aSplitResults['head'];
}
}
if (!empty($aResults) && ($this->iMinAddressRank != 0 || $this->iMaxAddressRank != 30)) {
// Need to verify passes rank limits before dropping out of the loop (yuk!)
// reduces the number of place ids, like a filter
// rank_address is 30 for interpolated housenumbers
$aFilterSql = array();
$sPlaceIds = Result::joinIdsByTable($aResults, Result::TABLE_PLACEX);
if ($sPlaceIds) {
$sSQL = 'SELECT place_id FROM placex ';
$sSQL .= 'WHERE place_id in ('.$sPlaceIds.') ';
$sSQL .= ' AND (';
$sSQL .= " placex.rank_address between $this->iMinAddressRank and $this->iMaxAddressRank ";
$sSQL .= " OR placex.rank_search between $this->iMinAddressRank and $this->iMaxAddressRank ";
if ($this->aAddressRankList) {
$sSQL .= ' OR placex.rank_address in ('.join(',', $this->aAddressRankList).')';
}
$sSQL .= ')';
$aFilterSql[] = $sSQL;
}
$sPlaceIds = Result::joinIdsByTable($aResults, Result::TABLE_POSTCODE);
if ($sPlaceIds) {
$sSQL = ' SELECT place_id FROM location_postcode lp ';
$sSQL .= 'WHERE place_id in ('.$sPlaceIds.') ';
$sSQL .= " AND (lp.rank_address between $this->iMinAddressRank and $this->iMaxAddressRank ";
if ($this->aAddressRankList) {
$sSQL .= ' OR lp.rank_address in ('.join(',', $this->aAddressRankList).')';
}
$sSQL .= ') ';
$aFilterSql[] = $sSQL;
}
$aFilteredIDs = array();
if ($aFilterSql) {
$sSQL = join(' UNION ', $aFilterSql);
Debug::printSQL($sSQL);
$aFilteredIDs = $this->oDB->getCol($sSQL);
}
$tempIDs = array();
foreach ($aResults as $oResult) {
if (($this->iMaxAddressRank == 30 &&
($oResult->iTable == Result::TABLE_OSMLINE
|| $oResult->iTable == Result::TABLE_TIGER))
|| in_array($oResult->iId, $aFilteredIDs)
) {
$tempIDs[$oResult->iId] = $oResult;
}
}
$aResults = $tempIDs;
}
if (!empty($aResults) || $iGroupLoop > 4 || $iQueryLoop > 30) {
break;
}
}
} else {
// Just interpret as a reverse geocode
$oReverse = new ReverseGeocode($this->oDB);
$oReverse->setZoom(18);
$oLookup = $oReverse->lookupPoint($oCtx->sqlNear, false);
Debug::printVar('Reverse search', $oLookup);
if ($oLookup) {
$aResults = array($oLookup->iId => $oLookup);
}
}
// No results? Done
if (empty($aResults)) {
if ($this->bFallback && $this->fallbackStructuredQuery()) {
return $this->lookup();
}
return array();
}
if ($this->aAddressRankList) {
$this->oPlaceLookup->setAddressRankList($this->aAddressRankList);
}
$this->oPlaceLookup->setAllowedTypesSQLList($this->sAllowedTypesSQLList);
$this->oPlaceLookup->setLanguagePreference($this->aLangPrefOrder);
if ($oCtx->hasNearPoint()) {
$this->oPlaceLookup->setAnchorSql($oCtx->sqlNear);
}
$aSearchResults = $this->oPlaceLookup->lookup($aResults);
$aRecheckWords = preg_split('/\b[\s,\\-]*/u', $sQuery);
foreach ($aRecheckWords as $i => $sWord) {
if (!preg_match('/[\pL\pN]/', $sWord)) {
unset($aRecheckWords[$i]);
}
}
Debug::printVar('Recheck words', $aRecheckWords);
foreach ($aSearchResults as $iIdx => $aResult) {
$fRadius = ClassTypes\getDefRadius($aResult);
$aOutlineResult = $this->oPlaceLookup->getOutlines($aResult['place_id'], $aResult['lon'], $aResult['lat'], $fRadius);
if ($aOutlineResult) {
$aResult = array_merge($aResult, $aOutlineResult);
}
// Is there an icon set for this type of result?
$sIcon = ClassTypes\getIconFile($aResult);
if (isset($sIcon)) {
$aResult['icon'] = $sIcon;
}
$sLabel = ClassTypes\getLabel($aResult);
if (isset($sLabel)) {
$aResult['label'] = $sLabel;
}
$aResult['name'] = $aResult['langaddress'];
if ($oCtx->hasNearPoint()) {
$aResult['importance'] = 0.001;
$aResult['foundorder'] = $aResult['addressimportance'];
} else {
if ($aResult['importance'] == 0) {
$aResult['importance'] = 0.0001;
}
$aResult['importance'] *= $this->viewboxImportanceFactor(
$aResult['lon'],
$aResult['lat']
);
// secondary ordering (for results with same importance (the smaller the better):
// - approximate importance of address parts
if (isset($aResult['addressimportance']) && $aResult['addressimportance']) {
$aResult['foundorder'] = -$aResult['addressimportance']/10;
} else {
$aResult['foundorder'] = -$aResult['importance'];
}
// - number of exact matches from the query
$aResult['foundorder'] -= $aResults[$aResult['place_id']]->iExactMatches;
// - importance of the class/type
$iClassImportance = ClassTypes\getImportance($aResult);
if (isset($iClassImportance)) {
$aResult['foundorder'] += 0.0001 * $iClassImportance;
} else {
$aResult['foundorder'] += 0.01;
}
// - rank
$aResult['foundorder'] -= 0.00001 * (30 - $aResult['rank_search']);
// Adjust importance for the number of exact string matches in the result
$iCountWords = 0;
$sAddress = $aResult['langaddress'];
foreach ($aRecheckWords as $i => $sWord) {
if (stripos($sAddress, $sWord)!==false) {
$iCountWords++;
if (preg_match('/(^|,)\s*'.preg_quote($sWord, '/').'\s*(,|$)/', $sAddress)) {
$iCountWords += 0.1;
}
}
}
// 0.1 is a completely arbitrary number but something in the range 0.1 to 0.5 would seem right
$aResult['importance'] = $aResult['importance'] + ($iCountWords*0.1);
}
$aSearchResults[$iIdx] = $aResult;
}
uasort($aSearchResults, 'byImportance');
Debug::printVar('Pre-filter results', $aSearchResults);
$aOSMIDDone = array();
$aClassTypeNameDone = array();
$aToFilter = $aSearchResults;
$aSearchResults = array();
foreach ($aToFilter as $aResult) {
$this->aExcludePlaceIDs[$aResult['place_id']] = $aResult['place_id'];
if (!$this->oPlaceLookup->doDeDupe() || (!isset($aOSMIDDone[$aResult['osm_type'].$aResult['osm_id']])
&& !isset($aClassTypeNameDone[$aResult['osm_type'].$aResult['class'].$aResult['type'].$aResult['name'].$aResult['admin_level']]))
) {
$aOSMIDDone[$aResult['osm_type'].$aResult['osm_id']] = true;
$aClassTypeNameDone[$aResult['osm_type'].$aResult['class'].$aResult['type'].$aResult['name'].$aResult['admin_level']] = true;
$aSearchResults[] = $aResult;
}
// Absolute limit on number of results
if (count($aSearchResults) >= $this->iFinalLimit) {
break;
}
}
Debug::printVar('Post-filter results', $aSearchResults);
return $aSearchResults;
} // end lookup()
public function debugInfo()
{
return array(
'Query' => $this->sQuery,
'Structured query' => $this->aStructuredQuery,
'Name keys' => Debug::fmtArrayVals($this->aLangPrefOrder),
'Excluded place IDs' => Debug::fmtArrayVals($this->aExcludePlaceIDs),
'Limit (for searches)' => $this->iLimit,
'Limit (for results)'=> $this->iFinalLimit,
'Country codes' => Debug::fmtArrayVals($this->aCountryCodes),
'Bounded search' => $this->bBoundedSearch,
'Viewbox' => Debug::fmtArrayVals($this->aViewBox),
'Route points' => Debug::fmtArrayVals($this->aRoutePoints),
'Route width' => $this->aRouteWidth,
'Max rank' => $this->iMaxRank,
'Min address rank' => $this->iMinAddressRank,
'Max address rank' => $this->iMaxAddressRank,
'Address rank list' => Debug::fmtArrayVals($this->aAddressRankList)
);
}
} // end class

View File

@@ -1,157 +0,0 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
namespace Nominatim;
class ParameterParser
{
private $aParams;
public function __construct($aParams = null)
{
$this->aParams = ($aParams === null) ? $_GET : $aParams;
}
public function getBool($sName, $bDefault = false)
{
if (!isset($this->aParams[$sName])
|| !is_string($this->aParams[$sName])
|| strlen($this->aParams[$sName]) == 0
) {
return $bDefault;
}
return (bool) $this->aParams[$sName];
}
public function getInt($sName, $bDefault = false)
{
if (!isset($this->aParams[$sName]) || is_array($this->aParams[$sName])) {
return $bDefault;
}
if (!preg_match('/^[+-]?[0-9]+$/', $this->aParams[$sName])) {
userError("Integer number expected for parameter '$sName'");
}
return (int) $this->aParams[$sName];
}
public function getFloat($sName, $bDefault = false)
{
if (!isset($this->aParams[$sName]) || is_array($this->aParams[$sName])) {
return $bDefault;
}
if (!preg_match('/^[+-]?[0-9]*\.?[0-9]+$/', $this->aParams[$sName])) {
userError("Floating-point number expected for parameter '$sName'");
}
return (float) $this->aParams[$sName];
}
public function getString($sName, $bDefault = false)
{
if (!isset($this->aParams[$sName])
|| !is_string($this->aParams[$sName])
|| strlen($this->aParams[$sName]) == 0
) {
return $bDefault;
}
return $this->aParams[$sName];
}
public function getSet($sName, $aValues, $sDefault = false)
{
if (!isset($this->aParams[$sName])
|| !is_string($this->aParams[$sName])
|| strlen($this->aParams[$sName]) == 0
) {
return $sDefault;
}
if (!in_array($this->aParams[$sName], $aValues, true)) {
userError("Parameter '$sName' must be one of: ".join(', ', $aValues));
}
return $this->aParams[$sName];
}
public function getStringList($sName, $aDefault = false)
{
$sValue = $this->getString($sName);
if ($sValue) {
// removes all NULL, FALSE and Empty Strings but leaves 0 (zero) values
return array_values(array_filter(explode(',', $sValue), 'strlen'));
}
return $aDefault;
}
public function getPreferredLanguages($sFallback = null)
{
if ($sFallback === null && isset($_SERVER['HTTP_ACCEPT_LANGUAGE'])) {
$sFallback = $_SERVER['HTTP_ACCEPT_LANGUAGE'];
}
$aLanguages = array();
$sLangString = $this->getString('accept-language', $sFallback);
if ($sLangString
&& preg_match_all('/(([a-z]{1,8})([-_][a-z]{1,8})?)\s*(;\s*q\s*=\s*(1|0\.[0-9]+))?/i', $sLangString, $aLanguagesParse, PREG_SET_ORDER)
) {
foreach ($aLanguagesParse as $iLang => $aLanguage) {
$aLanguages[$aLanguage[1]] = isset($aLanguage[5])?(float)$aLanguage[5]:1 - ($iLang/100);
if (!isset($aLanguages[$aLanguage[2]])) {
$aLanguages[$aLanguage[2]] = $aLanguages[$aLanguage[1]]/10;
}
}
arsort($aLanguages);
}
if (empty($aLanguages) && CONST_Default_Language) {
$aLanguages[CONST_Default_Language] = 1;
}
foreach ($aLanguages as $sLanguage => $fLanguagePref) {
$this->addNameTag($aLangPrefOrder, 'name:'.$sLanguage);
}
$this->addNameTag($aLangPrefOrder, 'name');
$this->addNameTag($aLangPrefOrder, 'brand');
foreach ($aLanguages as $sLanguage => $fLanguagePref) {
$this->addNameTag($aLangPrefOrder, 'official_name:'.$sLanguage);
$this->addNameTag($aLangPrefOrder, 'short_name:'.$sLanguage);
}
$this->addNameTag($aLangPrefOrder, 'official_name');
$this->addNameTag($aLangPrefOrder, 'short_name');
$this->addNameTag($aLangPrefOrder, 'ref');
$this->addNameTag($aLangPrefOrder, 'type');
return $aLangPrefOrder;
}
private function addNameTag(&$aLangPrefOrder, $sTag)
{
$aLangPrefOrder[$sTag] = $sTag;
$aLangPrefOrder['_place_'.$sTag] = '_place_'.$sTag;
}
public function hasSetAny($aParamNames)
{
foreach ($aParamNames as $sName) {
if ($this->getBool($sName)) {
return true;
}
}
return false;
}
}

View File

@@ -1,89 +0,0 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
namespace Nominatim;
/**
* Segment of a query string.
*
* The parts of a query strings are usually separated by commas.
*/
class Phrase
{
// Complete phrase as a string (guaranteed to have no leading or trailing
// spaces).
private $sPhrase;
// Element type for structured searches.
private $sPhraseType;
// Possible segmentations of the phrase.
private $aWordSets;
public function __construct($sPhrase, $sPhraseType)
{
$this->sPhrase = trim($sPhrase);
$this->sPhraseType = $sPhraseType;
}
/**
* Get the original phrase of the string.
*/
public function getPhrase()
{
return $this->sPhrase;
}
/**
* Return the element type of the phrase.
*
* @return string Pharse type if the phrase comes from a structured query
* or empty string otherwise.
*/
public function getPhraseType()
{
return $this->sPhraseType;
}
public function setWordSets($aWordSets)
{
$this->aWordSets = $aWordSets;
}
/**
* Return the array of possible segmentations of the phrase.
*
* @return string[][] Array of segmentations, each consisting of an
* array of terms.
*/
public function getWordSets()
{
return $this->aWordSets;
}
/**
* Invert the set of possible segmentations.
*
* @return void
*/
public function invertWordSets()
{
foreach ($this->aWordSets as $i => $aSet) {
$this->aWordSets[$i] = array_reverse($aSet);
}
}
public function debugInfo()
{
return array(
'Type' => $this->sPhraseType,
'Phrase' => $this->sPhrase,
'WordSets' => $this->aWordSets
);
}
}

View File

@@ -1,618 +0,0 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
namespace Nominatim;
require_once(CONST_LibDir.'/AddressDetails.php');
require_once(CONST_LibDir.'/Result.php');
class PlaceLookup
{
protected $oDB;
protected $aLangPrefOrderSql = "''";
protected $bAddressDetails = false;
protected $bExtraTags = false;
protected $bNameDetails = false;
protected $bIncludePolygonAsText = false;
protected $bIncludePolygonAsGeoJSON = false;
protected $bIncludePolygonAsKML = false;
protected $bIncludePolygonAsSVG = false;
protected $fPolygonSimplificationThreshold = 0.0;
protected $sAnchorSql = null;
protected $sAddressRankListSql = null;
protected $sAllowedTypesSQLList = null;
protected $bDeDupe = true;
public function __construct(&$oDB)
{
$this->oDB =& $oDB;
}
public function doDeDupe()
{
return $this->bDeDupe;
}
public function setIncludeAddressDetails($b)
{
$this->bAddressDetails = $b;
}
public function loadParamArray($oParams, $sGeomType = null)
{
$aLangs = $oParams->getPreferredLanguages();
$this->aLangPrefOrderSql =
'ARRAY['.join(',', $this->oDB->getDBQuotedList($aLangs)).']';
$this->bExtraTags = $oParams->getBool('extratags', false);
$this->bNameDetails = $oParams->getBool('namedetails', false);
$this->bDeDupe = $oParams->getBool('dedupe', $this->bDeDupe);
if ($sGeomType === null || $sGeomType == 'geojson') {
$this->bIncludePolygonAsGeoJSON = $oParams->getBool('polygon_geojson');
}
if ($oParams->getString('format', '') !== 'geojson') {
if ($sGeomType === null || $sGeomType == 'text') {
$this->bIncludePolygonAsText = $oParams->getBool('polygon_text');
}
if ($sGeomType === null || $sGeomType == 'kml') {
$this->bIncludePolygonAsKML = $oParams->getBool('polygon_kml');
}
if ($sGeomType === null || $sGeomType == 'svg') {
$this->bIncludePolygonAsSVG = $oParams->getBool('polygon_svg');
}
}
$this->fPolygonSimplificationThreshold
= $oParams->getFloat('polygon_threshold', 0.0);
$iWantedTypes =
($this->bIncludePolygonAsText ? 1 : 0) +
($this->bIncludePolygonAsGeoJSON ? 1 : 0) +
($this->bIncludePolygonAsKML ? 1 : 0) +
($this->bIncludePolygonAsSVG ? 1 : 0);
if ($iWantedTypes > CONST_PolygonOutput_MaximumTypes) {
if (CONST_PolygonOutput_MaximumTypes) {
userError('Select only '.CONST_PolygonOutput_MaximumTypes.' polgyon output option');
} else {
userError('Polygon output is disabled');
}
}
}
public function getMoreUrlParams()
{
$aParams = array();
if ($this->bAddressDetails) {
$aParams['addressdetails'] = '1';
}
if ($this->bExtraTags) {
$aParams['extratags'] = '1';
}
if ($this->bNameDetails) {
$aParams['namedetails'] = '1';
}
if ($this->bIncludePolygonAsText) {
$aParams['polygon_text'] = '1';
}
if ($this->bIncludePolygonAsGeoJSON) {
$aParams['polygon_geojson'] = '1';
}
if ($this->bIncludePolygonAsKML) {
$aParams['polygon_kml'] = '1';
}
if ($this->bIncludePolygonAsSVG) {
$aParams['polygon_svg'] = '1';
}
if ($this->fPolygonSimplificationThreshold > 0.0) {
$aParams['polygon_threshold'] = $this->fPolygonSimplificationThreshold;
}
if (!$this->bDeDupe) {
$aParams['dedupe'] = '0';
}
return $aParams;
}
public function setAnchorSql($sPoint)
{
$this->sAnchorSql = $sPoint;
}
public function setAddressRankList($aList)
{
$this->sAddressRankListSql = '('.join(',', $aList).')';
}
public function setAllowedTypesSQLList($sSql)
{
$this->sAllowedTypesSQLList = $sSql;
}
public function setLanguagePreference($aLangPrefOrder)
{
$this->aLangPrefOrderSql = $this->oDB->getArraySQL(
$this->oDB->getDBQuotedList($aLangPrefOrder)
);
}
private function addressImportanceSql($sGeometry, $sPlaceId)
{
if ($this->sAnchorSql) {
$sSQL = 'ST_Distance('.$this->sAnchorSql.','.$sGeometry.')';
} else {
$sSQL = '(SELECT max(ai_p.importance * (ai_p.rank_address + 2))';
$sSQL .= ' FROM place_addressline ai_s, placex ai_p';
$sSQL .= ' WHERE ai_s.place_id = '.$sPlaceId;
$sSQL .= ' AND ai_p.place_id = ai_s.address_place_id ';
$sSQL .= ' AND ai_s.isaddress ';
$sSQL .= ' AND ai_p.importance is not null)';
}
return $sSQL.' AS addressimportance,';
}
private function langAddressSql($sHousenumber)
{
if ($this->bAddressDetails) {
return ''; // langaddress will be computed from address details
}
return 'get_address_by_language(place_id,'.$sHousenumber.','.$this->aLangPrefOrderSql.') AS langaddress,';
}
public function lookupOSMID($sType, $iID)
{
$sSQL = 'select place_id from placex where osm_type = :type and osm_id = :id';
$iPlaceID = $this->oDB->getOne($sSQL, array(':type' => $sType, ':id' => $iID));
if (!$iPlaceID) {
return null;
}
$aResults = $this->lookup(array($iPlaceID => new Result($iPlaceID)), 0, 30, true);
return empty($aResults) ? null : reset($aResults);
}
public function lookup($aResults, $iMinRank = 0, $iMaxRank = 30, $bAllowLinked = false)
{
Debug::newFunction('Place lookup');
if (empty($aResults)) {
return array();
}
$aSubSelects = array();
$sPlaceIDs = Result::joinIdsByTable($aResults, Result::TABLE_PLACEX);
if ($sPlaceIDs) {
Debug::printVar('Ids from placex', $sPlaceIDs);
$sSQL = 'SELECT ';
$sSQL .= ' osm_type,';
$sSQL .= ' osm_id,';
$sSQL .= ' class,';
$sSQL .= ' type,';
$sSQL .= ' admin_level,';
$sSQL .= ' rank_search,';
$sSQL .= ' rank_address,';
$sSQL .= ' min(place_id) AS place_id,';
$sSQL .= ' min(parent_place_id) AS parent_place_id,';
$sSQL .= ' -1 as housenumber,';
$sSQL .= ' country_code,';
$sSQL .= $this->langAddressSql('-1');
$sSQL .= ' get_name_by_language(name,'.$this->aLangPrefOrderSql.') AS placename,';
$sSQL .= " get_name_by_language(name, ARRAY['ref']) AS ref,";
if ($this->bExtraTags) {
$sSQL .= 'hstore_to_json(extratags)::text AS extra,';
}
if ($this->bNameDetails) {
$sSQL .= 'hstore_to_json(name)::text AS names,';
}
$sSQL .= ' avg(ST_X(centroid)) AS lon, ';
$sSQL .= ' avg(ST_Y(centroid)) AS lat, ';
$sSQL .= ' COALESCE(importance,0.75-(rank_search::float/40)) AS importance, ';
$sSQL .= $this->addressImportanceSql(
'ST_Collect(centroid)',
'min(CASE WHEN placex.rank_search < 28 THEN placex.place_id ELSE placex.parent_place_id END)'
);
$sSQL .= " COALESCE(extratags->'place', extratags->'linked_place') AS extra_place ";
$sSQL .= ' FROM placex';
$sSQL .= " WHERE place_id in ($sPlaceIDs) ";
$sSQL .= ' AND (';
$sSQL .= " placex.rank_address between $iMinRank and $iMaxRank ";
if (14 >= $iMinRank && 14 <= $iMaxRank) {
$sSQL .= " OR (extratags->'place') = 'city'";
}
if ($this->sAddressRankListSql) {
$sSQL .= ' OR placex.rank_address in '.$this->sAddressRankListSql;
}
$sSQL .= ' ) ';
if ($this->sAllowedTypesSQLList) {
$sSQL .= 'AND placex.class in '.$this->sAllowedTypesSQLList;
}
if (!$bAllowLinked) {
$sSQL .= ' AND linked_place_id is null ';
}
$sSQL .= ' GROUP BY ';
$sSQL .= ' osm_type, ';
$sSQL .= ' osm_id, ';
$sSQL .= ' class, ';
$sSQL .= ' type, ';
$sSQL .= ' admin_level, ';
$sSQL .= ' rank_search, ';
$sSQL .= ' rank_address, ';
$sSQL .= ' housenumber,';
$sSQL .= ' country_code, ';
$sSQL .= ' importance, ';
if (!$this->bDeDupe) {
$sSQL .= 'place_id,';
}
if (!$this->bAddressDetails) {
$sSQL .= 'langaddress, ';
}
$sSQL .= ' placename, ';
$sSQL .= ' ref, ';
if ($this->bExtraTags) {
$sSQL .= 'extratags, ';
}
if ($this->bNameDetails) {
$sSQL .= 'name, ';
}
$sSQL .= ' extra_place ';
$aSubSelects[] = $sSQL;
}
// postcode table
$sPlaceIDs = Result::joinIdsByTable($aResults, Result::TABLE_POSTCODE);
if ($sPlaceIDs) {
Debug::printVar('Ids from location_postcode', $sPlaceIDs);
$sSQL = 'SELECT';
$sSQL .= " 'P' as osm_type,";
$sSQL .= ' (SELECT osm_id from placex p WHERE p.place_id = lp.parent_place_id) as osm_id,';
$sSQL .= " 'place' as class, 'postcode' as type,";
$sSQL .= ' null::smallint as admin_level, rank_search, rank_address,';
$sSQL .= ' place_id, parent_place_id,';
$sSQL .= ' -1 as housenumber,';
$sSQL .= ' country_code,';
$sSQL .= $this->langAddressSql('-1');
$sSQL .= ' postcode as placename,';
$sSQL .= ' postcode as ref,';
if ($this->bExtraTags) {
$sSQL .= 'null::text AS extra,';
}
if ($this->bNameDetails) {
$sSQL .= 'null::text AS names,';
}
$sSQL .= ' ST_x(geometry) AS lon, ST_y(geometry) AS lat,';
$sSQL .= ' (0.75-(rank_search::float/40)) AS importance, ';
$sSQL .= $this->addressImportanceSql('geometry', 'lp.parent_place_id');
$sSQL .= ' null::text AS extra_place ';
$sSQL .= 'FROM location_postcode lp';
$sSQL .= " WHERE place_id in ($sPlaceIDs) ";
$sSQL .= " AND lp.rank_address between $iMinRank and $iMaxRank";
$aSubSelects[] = $sSQL;
}
// All other tables are rank 30 only.
if ($iMaxRank == 30) {
// TIGER table
if (CONST_Use_US_Tiger_Data) {
$sPlaceIDs = Result::joinIdsByTable($aResults, Result::TABLE_TIGER);
if ($sPlaceIDs) {
Debug::printVar('Ids from Tiger table', $sPlaceIDs);
$sHousenumbers = Result::sqlHouseNumberTable($aResults, Result::TABLE_TIGER);
// Tiger search only if a housenumber was searched and if it was found
// (realized through a join)
$sSQL = ' SELECT ';
$sSQL .= " 'T' AS osm_type, ";
$sSQL .= ' (SELECT osm_id from placex p WHERE p.place_id=blub.parent_place_id) as osm_id, ';
$sSQL .= " 'place' AS class, ";
$sSQL .= " 'house' AS type, ";
$sSQL .= ' null::smallint AS admin_level, ';
$sSQL .= ' 30 AS rank_search, ';
$sSQL .= ' 30 AS rank_address, ';
$sSQL .= ' place_id, ';
$sSQL .= ' parent_place_id, ';
$sSQL .= ' housenumber_for_place as housenumber,';
$sSQL .= " 'us' AS country_code, ";
$sSQL .= $this->langAddressSql('housenumber_for_place');
$sSQL .= ' null::text AS placename, ';
$sSQL .= ' null::text AS ref, ';
if ($this->bExtraTags) {
$sSQL .= 'null::text AS extra,';
}
if ($this->bNameDetails) {
$sSQL .= 'null::text AS names,';
}
$sSQL .= ' st_x(centroid) AS lon, ';
$sSQL .= ' st_y(centroid) AS lat,';
$sSQL .= ' -1.15 AS importance, ';
$sSQL .= $this->addressImportanceSql('centroid', 'blub.parent_place_id');
$sSQL .= ' null::text AS extra_place ';
$sSQL .= ' FROM (';
$sSQL .= ' SELECT place_id, '; // interpolate the Tiger housenumbers here
$sSQL .= ' CASE WHEN startnumber != endnumber';
$sSQL .= ' THEN ST_LineInterpolatePoint(linegeo, (housenumber_for_place-startnumber::float)/(endnumber-startnumber)::float)';
$sSQL .= ' ELSE ST_LineInterpolatePoint(linegeo, 0.5) END AS centroid, ';
$sSQL .= ' parent_place_id, ';
$sSQL .= ' housenumber_for_place';
$sSQL .= ' FROM (';
$sSQL .= ' location_property_tiger ';
$sSQL .= ' JOIN (values '.$sHousenumbers.') AS housenumbers(place_id, housenumber_for_place) USING(place_id)) ';
$sSQL .= ' WHERE ';
$sSQL .= ' housenumber_for_place >= startnumber';
$sSQL .= ' AND housenumber_for_place <= endnumber';
$sSQL .= ' ) AS blub'; //postgres wants an alias here
$aSubSelects[] = $sSQL;
}
}
// osmline - interpolated housenumbers
$sPlaceIDs = Result::joinIdsByTable($aResults, Result::TABLE_OSMLINE);
if ($sPlaceIDs) {
Debug::printVar('Ids from interpolation', $sPlaceIDs);
$sHousenumbers = Result::sqlHouseNumberTable($aResults, Result::TABLE_OSMLINE);
// interpolation line search only if a housenumber was searched
// (realized through a join)
$sSQL = 'SELECT ';
$sSQL .= " 'W' AS osm_type, ";
$sSQL .= ' osm_id, ';
$sSQL .= " 'place' AS class, ";
$sSQL .= " 'house' AS type, ";
$sSQL .= ' null::smallint AS admin_level, ';
$sSQL .= ' 30 AS rank_search, ';
$sSQL .= ' 30 AS rank_address, ';
$sSQL .= ' place_id, ';
$sSQL .= ' parent_place_id, ';
$sSQL .= ' housenumber_for_place as housenumber,';
$sSQL .= ' country_code, ';
$sSQL .= $this->langAddressSql('housenumber_for_place');
$sSQL .= ' null::text AS placename, ';
$sSQL .= ' null::text AS ref, ';
if ($this->bExtraTags) {
$sSQL .= 'null::text AS extra, ';
}
if ($this->bNameDetails) {
$sSQL .= 'null::text AS names, ';
}
$sSQL .= ' st_x(centroid) AS lon, ';
$sSQL .= ' st_y(centroid) AS lat, ';
// slightly smaller than the importance for normal houses
$sSQL .= ' -0.1 AS importance, ';
$sSQL .= $this->addressImportanceSql('centroid', 'blub.parent_place_id');
$sSQL .= ' null::text AS extra_place ';
$sSQL .= ' FROM (';
$sSQL .= ' SELECT ';
$sSQL .= ' osm_id, ';
$sSQL .= ' place_id, ';
$sSQL .= ' country_code, ';
$sSQL .= ' CASE '; // interpolate the housenumbers here
$sSQL .= ' WHEN startnumber != endnumber ';
$sSQL .= ' THEN ST_LineInterpolatePoint(linegeo, (housenumber_for_place-startnumber::float)/(endnumber-startnumber)::float) ';
$sSQL .= ' ELSE linegeo ';
$sSQL .= ' END as centroid, ';
$sSQL .= ' parent_place_id, ';
$sSQL .= ' housenumber_for_place ';
$sSQL .= ' FROM (';
$sSQL .= ' location_property_osmline ';
$sSQL .= ' JOIN (values '.$sHousenumbers.') AS housenumbers(place_id, housenumber_for_place) USING(place_id)';
$sSQL .= ' ) ';
$sSQL .= ' WHERE housenumber_for_place >= 0 ';
$sSQL .= ' ) as blub'; //postgres wants an alias here
$aSubSelects[] = $sSQL;
}
}
if (empty($aSubSelects)) {
return array();
}
$sSQL = join(' UNION ', $aSubSelects);
Debug::printSQL($sSQL);
$aPlaces = $this->oDB->getAll($sSQL, null, 'Could not lookup place');
foreach ($aPlaces as &$aPlace) {
$aPlace['importance'] = (float) $aPlace['importance'];
if ($this->bAddressDetails) {
// to get addressdetails for tiger data, the housenumber is needed
$aPlace['address'] = new AddressDetails(
$this->oDB,
$aPlace['place_id'],
$aPlace['housenumber'],
$this->aLangPrefOrderSql
);
$aPlace['langaddress'] = $aPlace['address']->getLocaleAddress();
}
if ($this->bExtraTags) {
if ($aPlace['extra']) {
$aPlace['sExtraTags'] = json_decode($aPlace['extra'], true);
} else {
$aPlace['sExtraTags'] = (object) array();
}
}
if ($this->bNameDetails) {
$aPlace['sNameDetails'] = $this->extractNames($aPlace['names']);
}
$aPlace['addresstype'] = ClassTypes\getLabelTag(
$aPlace,
$aPlace['country_code']
);
$aResults[$aPlace['place_id']] = $aPlace;
}
$aResults = array_filter(
$aResults,
function ($v) {
return !($v instanceof Result);
}
);
Debug::printVar('Places', $aResults);
return $aResults;
}
private function extractNames($sNames)
{
if (!$sNames) {
return (object) array();
}
$aFullNames = json_decode($sNames, true);
$aNames = array();
foreach ($aFullNames as $sKey => $sValue) {
if (strpos($sKey, '_place_') === 0) {
$sSubKey = substr($sKey, 7);
if (array_key_exists($sSubKey, $aFullNames)) {
$aNames[$sKey] = $sValue;
} else {
$aNames[$sSubKey] = $sValue;
}
} else {
$aNames[$sKey] = $sValue;
}
}
return $aNames;
}
/* returns an array which will contain the keys
* aBoundingBox
* and may also contain one or more of the keys
* asgeojson
* askml
* assvg
* astext
* lat
* lon
*/
public function getOutlines($iPlaceID, $fLon = null, $fLat = null, $fRadius = null, $fLonReverse = null, $fLatReverse = null)
{
$aOutlineResult = array();
if (!$iPlaceID) {
return $aOutlineResult;
}
// Get the bounding box and outline polygon
$sSQL = 'select place_id,0 as numfeatures,st_area(geometry) as area,';
if ($fLonReverse != null && $fLatReverse != null) {
$sSQL .= ' ST_Y(closest_point) as centrelat,';
$sSQL .= ' ST_X(closest_point) as centrelon,';
} else {
$sSQL .= ' ST_Y(centroid) as centrelat, ST_X(centroid) as centrelon,';
}
$sSQL .= ' ST_YMin(geometry) as minlat,ST_YMax(geometry) as maxlat,';
$sSQL .= ' ST_XMin(geometry) as minlon,ST_XMax(geometry) as maxlon';
if ($this->bIncludePolygonAsGeoJSON) {
$sSQL .= ',ST_AsGeoJSON(geometry) as asgeojson';
}
if ($this->bIncludePolygonAsKML) {
$sSQL .= ',ST_AsKML(geometry) as askml';
}
if ($this->bIncludePolygonAsSVG) {
$sSQL .= ',ST_AsSVG(geometry) as assvg';
}
if ($this->bIncludePolygonAsText) {
$sSQL .= ',ST_AsText(geometry) as astext';
}
if ($fLonReverse != null && $fLatReverse != null) {
$sFrom = ' from (SELECT * , CASE WHEN (class = \'highway\') AND (ST_GeometryType(geometry) = \'ST_LineString\') THEN ';
$sFrom .=' ST_ClosestPoint(geometry, ST_SetSRID(ST_Point('.$fLatReverse.','.$fLonReverse.'),4326))';
$sFrom .=' ELSE centroid END AS closest_point';
$sFrom .= ' from placex where place_id = '.$iPlaceID.') as plx';
} else {
$sFrom = ' from placex where place_id = '.$iPlaceID;
}
if ($this->fPolygonSimplificationThreshold > 0) {
$sSQL .= ' from (select place_id,centroid,ST_SimplifyPreserveTopology(geometry,'.$this->fPolygonSimplificationThreshold.') as geometry'.$sFrom.') as plx';
} else {
$sSQL .= $sFrom;
}
$aPointPolygon = $this->oDB->getRow($sSQL, null, 'Could not get outline');
if ($aPointPolygon && $aPointPolygon['place_id']) {
if ($aPointPolygon['centrelon'] !== null && $aPointPolygon['centrelat'] !== null) {
$aOutlineResult['lat'] = $aPointPolygon['centrelat'];
$aOutlineResult['lon'] = $aPointPolygon['centrelon'];
}
if ($this->bIncludePolygonAsGeoJSON) {
$aOutlineResult['asgeojson'] = $aPointPolygon['asgeojson'];
}
if ($this->bIncludePolygonAsKML) {
$aOutlineResult['askml'] = $aPointPolygon['askml'];
}
if ($this->bIncludePolygonAsSVG) {
$aOutlineResult['assvg'] = $aPointPolygon['assvg'];
}
if ($this->bIncludePolygonAsText) {
$aOutlineResult['astext'] = $aPointPolygon['astext'];
}
if (abs($aPointPolygon['minlat'] - $aPointPolygon['maxlat']) < 0.0000001) {
$aPointPolygon['minlat'] = $aPointPolygon['minlat'] - $fRadius;
$aPointPolygon['maxlat'] = $aPointPolygon['maxlat'] + $fRadius;
}
if (abs($aPointPolygon['minlon'] - $aPointPolygon['maxlon']) < 0.0000001) {
$aPointPolygon['minlon'] = $aPointPolygon['minlon'] - $fRadius;
$aPointPolygon['maxlon'] = $aPointPolygon['maxlon'] + $fRadius;
}
$aOutlineResult['aBoundingBox'] = array(
(string)$aPointPolygon['minlat'],
(string)$aPointPolygon['maxlat'],
(string)$aPointPolygon['minlon'],
(string)$aPointPolygon['maxlon']
);
}
// as a fallback we generate a bounding box without knowing the size of the geometry
if ((!isset($aOutlineResult['aBoundingBox'])) && isset($fLon)) {
$aBounds = array(
'minlat' => $fLat - $fRadius,
'maxlat' => $fLat + $fRadius,
'minlon' => $fLon - $fRadius,
'maxlon' => $fLon + $fRadius
);
$aOutlineResult['aBoundingBox'] = array(
(string)$aBounds['minlat'],
(string)$aBounds['maxlat'],
(string)$aBounds['minlon'],
(string)$aBounds['maxlon']
);
}
return $aOutlineResult;
}
}

View File

@@ -1,129 +0,0 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
namespace Nominatim;
/**
* A single result of a search operation or a reverse lookup.
*
* This object only contains the id of the result. It does not yet
* have any details needed to format the output document.
*/
class Result
{
const TABLE_PLACEX = 0;
const TABLE_POSTCODE = 1;
const TABLE_OSMLINE = 2;
const TABLE_TIGER = 3;
/// Database table that contains the result.
public $iTable;
/// Id of the result.
public $iId;
/// House number (only for interpolation results).
public $iHouseNumber = -1;
/// Number of exact matches in address (address searches only).
public $iExactMatches = 0;
/// Subranking within the results (the higher the worse).
public $iResultRank = 0;
/// Address rank of the result.
public $iAddressRank;
public function debugInfo()
{
return array(
'Table' => $this->iTable,
'ID' => $this->iId,
'House number' => $this->iHouseNumber,
'Exact Matches' => $this->iExactMatches,
'Result rank' => $this->iResultRank
);
}
public function __construct($sId, $iTable = Result::TABLE_PLACEX)
{
$this->iTable = $iTable;
$this->iId = (int) $sId;
}
public static function joinIdsByTable($aResults, $iTable)
{
return join(',', array_keys(array_filter(
$aResults,
function ($aValue) use ($iTable) {
return $aValue->iTable == $iTable;
}
)));
}
public static function joinIdsByTableMinRank($aResults, $iTable, $iMinAddressRank)
{
return join(',', array_keys(array_filter(
$aResults,
function ($aValue) use ($iTable, $iMinAddressRank) {
return $aValue->iTable == $iTable && $aValue->iAddressRank >= $iMinAddressRank;
}
)));
}
public static function joinIdsByTableMaxRank($aResults, $iTable, $iMaxAddressRank)
{
return join(',', array_keys(array_filter(
$aResults,
function ($aValue) use ($iTable, $iMaxAddressRank) {
return $aValue->iTable == $iTable && $aValue->iAddressRank <= $iMaxAddressRank;
}
)));
}
public static function sqlHouseNumberTable($aResults, $iTable)
{
$sHousenumbers = '';
$sSep = '';
foreach ($aResults as $oResult) {
if ($oResult->iTable == $iTable) {
$sHousenumbers .= $sSep.'('.$oResult->iId.',';
$sHousenumbers .= $oResult->iHouseNumber.')';
$sSep = ',';
}
}
return $sHousenumbers;
}
/**
* Split a result array into highest ranked result and the rest
*
* @param object[] $aResults List of results to split.
*
* @return array[]
*/
public static function splitResults($aResults)
{
$aHead = array();
$aTail = array();
$iMinRank = 10000;
foreach ($aResults as $oRes) {
if ($oRes->iResultRank < $iMinRank) {
$aTail += $aHead;
$aHead = array($oRes->iId => $oRes);
$iMinRank = $oRes->iResultRank;
} elseif ($oRes->iResultRank == $iMinRank) {
$aHead[$oRes->iId] = $oRes;
} else {
$aTail[$oRes->iId] = $oRes;
}
}
return array('head' => $aHead, 'tail' => $aTail);
}
}

View File

@@ -1,396 +0,0 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
namespace Nominatim;
require_once(CONST_LibDir.'/Result.php');
class ReverseGeocode
{
protected $oDB;
protected $iMaxRank = 28;
public function __construct(&$oDB)
{
$this->oDB =& $oDB;
}
public function setZoom($iZoom)
{
// Zoom to rank, this could probably be calculated but a lookup gives fine control
$aZoomRank = array(
0 => 2, // Continent / Sea
1 => 2,
2 => 2,
3 => 4, // Country
4 => 4,
5 => 8, // State
6 => 10, // Region
7 => 10,
8 => 12, // County
9 => 12,
10 => 17, // City
11 => 17,
12 => 18, // Town / Village
13 => 18,
14 => 22, // Suburb
15 => 22,
16 => 26, // major street
17 => 27, // minor street
18 => 30, // or >, Building
19 => 30, // or >, Building
);
$this->iMaxRank = (isset($iZoom) && isset($aZoomRank[$iZoom]))?$aZoomRank[$iZoom]:28;
}
/**
* Find the closest interpolation with the given search diameter.
*
* @param string $sPointSQL Reverse geocoding point as SQL
* @param float $fSearchDiam Search diameter
*
* @return Record of the interpolation or null.
*/
protected function lookupInterpolation($sPointSQL, $fSearchDiam)
{
Debug::newFunction('lookupInterpolation');
$sSQL = 'SELECT place_id, parent_place_id, 30 as rank_search,';
$sSQL .= ' (CASE WHEN endnumber != startnumber';
$sSQL .= ' THEN (endnumber - startnumber) * ST_LineLocatePoint(linegeo,'.$sPointSQL.')';
$sSQL .= ' ELSE startnumber END) as fhnr,';
$sSQL .= ' startnumber, endnumber, step,';
$sSQL .= ' ST_Distance(linegeo,'.$sPointSQL.') as distance';
$sSQL .= ' FROM location_property_osmline';
$sSQL .= ' WHERE ST_DWithin('.$sPointSQL.', linegeo, '.$fSearchDiam.')';
$sSQL .= ' and indexed_status = 0 and startnumber is not NULL ';
$sSQL .= ' and parent_place_id != 0';
$sSQL .= ' ORDER BY distance ASC limit 1';
Debug::printSQL($sSQL);
return $this->oDB->getRow(
$sSQL,
null,
'Could not determine closest housenumber on an osm interpolation line.'
);
}
protected function lookupLargeArea($sPointSQL, $iMaxRank)
{
if ($iMaxRank > 4) {
$aPlace = $this->lookupPolygon($sPointSQL, $iMaxRank);
if ($aPlace) {
return new Result($aPlace['place_id']);
}
}
// If no polygon which contains the searchpoint is found,
// searches in the country_osm_grid table for a polygon.
return $this->lookupInCountry($sPointSQL, $iMaxRank);
}
protected function lookupInCountry($sPointSQL, $iMaxRank)
{
Debug::newFunction('lookupInCountry');
// searches for polygon in table country_osm_grid which contains the searchpoint
// and searches for the nearest place node to the searchpoint in this polygon
$sSQL = 'SELECT country_code FROM country_osm_grid';
$sSQL .= ' WHERE ST_CONTAINS(geometry, '.$sPointSQL.') LIMIT 1';
Debug::printSQL($sSQL);
$sCountryCode = $this->oDB->getOne(
$sSQL,
null,
'Could not determine country polygon containing the point.'
);
Debug::printVar('Country code', $sCountryCode);
if ($sCountryCode) {
if ($iMaxRank > 4) {
// look for place nodes with the given country code
$sSQL = 'SELECT place_id FROM';
$sSQL .= ' (SELECT place_id, rank_search,';
$sSQL .= ' ST_distance('.$sPointSQL.', geometry) as distance';
$sSQL .= ' FROM placex';
$sSQL .= ' WHERE osm_type = \'N\'';
$sSQL .= ' AND country_code = \''.$sCountryCode.'\'';
$sSQL .= ' AND rank_search < 26 '; // needed to select right index
$sSQL .= ' AND rank_search between 5 and ' .min(25, $iMaxRank);
$sSQL .= ' AND class = \'place\' AND type != \'postcode\'';
$sSQL .= ' AND name IS NOT NULL ';
$sSQL .= ' and indexed_status = 0 and linked_place_id is null';
$sSQL .= ' AND ST_DWithin('.$sPointSQL.', geometry, 1.8)) p ';
$sSQL .= 'WHERE distance <= reverse_place_diameter(rank_search)';
$sSQL .= ' ORDER BY rank_search DESC, distance ASC';
$sSQL .= ' LIMIT 1';
Debug::printSQL($sSQL);
$aPlace = $this->oDB->getRow($sSQL, null, 'Could not determine place node.');
Debug::printVar('Country node', $aPlace);
if ($aPlace) {
return new Result($aPlace['place_id']);
}
}
// still nothing, then return the country object
$sSQL = 'SELECT place_id, ST_distance('.$sPointSQL.', centroid) as distance';
$sSQL .= ' FROM placex';
$sSQL .= ' WHERE country_code = \''.$sCountryCode.'\'';
$sSQL .= ' AND rank_search = 4 AND rank_address = 4';
$sSQL .= ' AND class in (\'boundary\', \'place\')';
$sSQL .= ' AND linked_place_id is null';
$sSQL .= ' ORDER BY distance ASC';
Debug::printSQL($sSQL);
$aPlace = $this->oDB->getRow($sSQL, null, 'Could not determine place node.');
Debug::printVar('Country place', $aPlace);
if ($aPlace) {
return new Result($aPlace['place_id']);
}
}
return null;
}
/**
* Search for areas or nodes for areas or nodes between state and suburb level.
*
* @param string $sPointSQL Search point as SQL string.
* @param int $iMaxRank Maximum address rank of the feature.
*
* @return Record of the found feature or null.
*
* Searches first for polygon that contains the search point.
* If such a polygon is found, place nodes with a higher rank are
* searched inside the polygon.
*/
protected function lookupPolygon($sPointSQL, $iMaxRank)
{
Debug::newFunction('lookupPolygon');
// polygon search begins at suburb-level
if ($iMaxRank > 25) {
$iMaxRank = 25;
}
// no polygon search over country-level
if ($iMaxRank < 5) {
$iMaxRank = 5;
}
// search for polygon
$sSQL = 'SELECT place_id, parent_place_id, rank_address, rank_search FROM';
$sSQL .= '(select place_id, parent_place_id, rank_address, rank_search, country_code, geometry';
$sSQL .= ' FROM placex';
$sSQL .= ' WHERE ST_GeometryType(geometry) in (\'ST_Polygon\', \'ST_MultiPolygon\')';
// Ensure that query planner doesn't use the index on rank_search.
$sSQL .= ' AND coalesce(rank_search, 0) between 5 and ' .$iMaxRank;
$sSQL .= ' AND rank_address between 4 and 25'; // needed for index selection
$sSQL .= ' AND geometry && '.$sPointSQL;
$sSQL .= ' AND type != \'postcode\' ';
$sSQL .= ' AND name is not null';
$sSQL .= ' AND indexed_status = 0 and linked_place_id is null';
$sSQL .= ' ORDER BY rank_search DESC LIMIT 50 ) as a';
$sSQL .= ' WHERE ST_Contains(geometry, '.$sPointSQL.' )';
$sSQL .= ' ORDER BY rank_search DESC LIMIT 1';
Debug::printSQL($sSQL);
$aPoly = $this->oDB->getRow($sSQL, null, 'Could not determine polygon containing the point.');
Debug::printVar('Polygon result', $aPoly);
if ($aPoly) {
// if a polygon is found, search for placenodes begins ...
$iRankAddress = $aPoly['rank_address'];
$iRankSearch = $aPoly['rank_search'];
$iPlaceID = $aPoly['place_id'];
if ($iRankSearch != $iMaxRank) {
$sSQL = 'SELECT place_id FROM ';
$sSQL .= '(SELECT place_id, rank_search, country_code, geometry,';
$sSQL .= ' ST_distance('.$sPointSQL.', geometry) as distance';
$sSQL .= ' FROM placex';
$sSQL .= ' WHERE osm_type = \'N\'';
// using rank_search because of a better differentiation
// for place nodes at rank_address 16
$sSQL .= ' AND rank_search > '.$iRankSearch;
$sSQL .= ' AND rank_search <= '.$iMaxRank;
$sSQL .= ' AND rank_search < 26 '; // needed to select right index
$sSQL .= ' AND rank_address > 0';
$sSQL .= ' AND class = \'place\'';
$sSQL .= ' AND type != \'postcode\'';
$sSQL .= ' AND name IS NOT NULL ';
$sSQL .= ' AND indexed_status = 0 AND linked_place_id is null';
$sSQL .= ' AND ST_DWithin('.$sPointSQL.', geometry, reverse_place_diameter('.$iRankSearch.'::smallint))';
$sSQL .= ' ORDER BY distance ASC,';
$sSQL .= ' rank_address DESC';
$sSQL .= ' limit 500) as a';
$sSQL .= ' WHERE ST_CONTAINS((SELECT geometry FROM placex WHERE place_id = '.$iPlaceID.'), geometry )';
$sSQL .= ' AND distance <= reverse_place_diameter(rank_search)';
$sSQL .= ' ORDER BY distance ASC, rank_search DESC';
$sSQL .= ' LIMIT 1';
Debug::printSQL($sSQL);
$aPlaceNode = $this->oDB->getRow($sSQL, null, 'Could not determine place node.');
Debug::printVar('Nearest place node', $aPlaceNode);
if ($aPlaceNode) {
return $aPlaceNode;
}
}
}
return $aPoly;
}
public function lookup($fLat, $fLon, $bDoInterpolation = true)
{
return $this->lookupPoint(
'ST_SetSRID(ST_Point('.$fLon.','.$fLat.'),4326)',
$bDoInterpolation
);
}
public function lookupPoint($sPointSQL, $bDoInterpolation = true)
{
Debug::newFunction('lookupPoint');
// Find the nearest point
$fSearchDiam = 0.006;
$oResult = null;
$aPlace = null;
// for POI or street level
if ($this->iMaxRank >= 26) {
// starts if the search is on POI or street level,
// searches for the nearest POI or street,
// if a street is found and a POI is searched for,
// the nearest POI which the found street is a parent of is chosen.
$sSQL = 'select place_id,parent_place_id,rank_address,country_code,';
$sSQL .= ' ST_distance('.$sPointSQL.', geometry) as distance';
$sSQL .= ' FROM ';
$sSQL .= ' placex';
$sSQL .= ' WHERE ST_DWithin('.$sPointSQL.', geometry, '.$fSearchDiam.')';
$sSQL .= ' AND';
$sSQL .= ' rank_address between 26 and '.$this->iMaxRank;
$sSQL .= ' and (name is not null or housenumber is not null';
$sSQL .= ' or rank_address between 26 and 27)';
$sSQL .= ' and (rank_address between 26 and 27';
$sSQL .= ' or ST_GeometryType(geometry) != \'ST_LineString\')';
$sSQL .= ' and class not in (\'boundary\')';
$sSQL .= ' and indexed_status = 0 and linked_place_id is null';
$sSQL .= ' and (ST_GeometryType(geometry) not in (\'ST_Polygon\',\'ST_MultiPolygon\') ';
$sSQL .= ' OR ST_DWithin('.$sPointSQL.', centroid, '.$fSearchDiam.'))';
$sSQL .= ' ORDER BY distance ASC limit 1';
Debug::printSQL($sSQL);
$aPlace = $this->oDB->getRow($sSQL, null, 'Could not determine closest place.');
Debug::printVar('POI/street level result', $aPlace);
if ($aPlace) {
$iPlaceID = $aPlace['place_id'];
$oResult = new Result($iPlaceID);
$iRankAddress = $aPlace['rank_address'];
}
if ($aPlace) {
// if street and maxrank > streetlevel
if ($iRankAddress <= 27 && $this->iMaxRank > 27) {
// find the closest object (up to a certain radius) of which the street is a parent of
$sSQL = ' select place_id,';
$sSQL .= ' ST_distance('.$sPointSQL.', geometry) as distance';
$sSQL .= ' FROM ';
$sSQL .= ' placex';
// radius ?
$sSQL .= ' WHERE ST_DWithin('.$sPointSQL.', geometry, 0.001)';
$sSQL .= ' AND parent_place_id = '.$iPlaceID;
$sSQL .= ' and rank_address > 28';
$sSQL .= ' and ST_GeometryType(geometry) != \'ST_LineString\'';
$sSQL .= ' and (name is not null or housenumber is not null)';
$sSQL .= ' and class not in (\'boundary\')';
$sSQL .= ' and indexed_status = 0 and linked_place_id is null';
$sSQL .= ' ORDER BY distance ASC limit 1';
Debug::printSQL($sSQL);
$aStreet = $this->oDB->getRow($sSQL, null, 'Could not determine closest place.');
Debug::printVar('Closest POI result', $aStreet);
if ($aStreet) {
$aPlace = $aStreet;
$oResult = new Result($aStreet['place_id']);
$iRankAddress = 30;
}
}
// In the US we can check TIGER data for nearest housenumber
if (CONST_Use_US_Tiger_Data
&& $iRankAddress <= 27
&& $aPlace['country_code'] == 'us'
&& $this->iMaxRank >= 28
) {
$sSQL = 'SELECT place_id,parent_place_id,30 as rank_search,';
$sSQL .= ' (endnumber - startnumber) * ST_LineLocatePoint(linegeo,'.$sPointSQL.') as fhnr,';
$sSQL .= ' startnumber, endnumber, step,';
$sSQL .= ' ST_Distance('.$sPointSQL.', linegeo) as distance';
$sSQL .= ' FROM location_property_tiger WHERE parent_place_id = '.$oResult->iId;
$sSQL .= ' AND ST_DWithin('.$sPointSQL.', linegeo, 0.001)';
$sSQL .= ' ORDER BY distance ASC limit 1';
Debug::printSQL($sSQL);
$aPlaceTiger = $this->oDB->getRow($sSQL, null, 'Could not determine closest Tiger place.');
Debug::printVar('Tiger house number result', $aPlaceTiger);
if ($aPlaceTiger) {
$aPlace = $aPlaceTiger;
$oResult = new Result($aPlaceTiger['place_id'], Result::TABLE_TIGER);
$iRndNum = max(0, round($aPlaceTiger['fhnr'] / $aPlaceTiger['step']) * $aPlaceTiger['step']);
$oResult->iHouseNumber = $aPlaceTiger['startnumber'] + $iRndNum;
if ($oResult->iHouseNumber > $aPlaceTiger['endnumber']) {
$oResult->iHouseNumber = $aPlaceTiger['endnumber'];
}
$iRankAddress = 30;
}
}
}
if ($bDoInterpolation && $this->iMaxRank >= 30) {
$fDistance = $fSearchDiam;
if ($aPlace) {
// We can't reliably go from the closest street to an
// interpolation line because the closest interpolation
// may have a different street segments as a parent.
// Therefore allow an interpolation line to take precedence
// even when the street is closer.
$fDistance = $iRankAddress < 28 ? 0.001 : $aPlace['distance'];
}
$aHouse = $this->lookupInterpolation($sPointSQL, $fDistance);
Debug::printVar('Interpolation result', $aPlace);
if ($aHouse) {
$oResult = new Result($aHouse['place_id'], Result::TABLE_OSMLINE);
$iRndNum = max(0, round($aHouse['fhnr'] / $aHouse['step']) * $aHouse['step']);
$oResult->iHouseNumber = $aHouse['startnumber'] + $iRndNum;
if ($oResult->iHouseNumber > $aHouse['endnumber']) {
$oResult->iHouseNumber = $aHouse['endnumber'];
}
$aPlace = $aHouse;
}
}
if (!$aPlace) {
// if no POI or street is found ...
$oResult = $this->lookupLargeArea($sPointSQL, 25);
}
} else {
// lower than street level ($iMaxRank < 26 )
$oResult = $this->lookupLargeArea($sPointSQL, $this->iMaxRank);
}
Debug::printVar('Final result', $oResult);
return $oResult;
}
}

View File

@@ -1,319 +0,0 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
namespace Nominatim;
require_once(CONST_LibDir.'/lib.php');
/**
* Collection of search constraints that are independent of the
* actual interpretation of the search query.
*
* The search context is shared between all SearchDescriptions. This
* object mainly serves as context provider for the database queries.
* Therefore most data is directly cached as SQL statements.
*/
class SearchContext
{
/// Search radius around a given Near reference point.
private $fNearRadius = false;
/// True if search must be restricted to viewbox only.
public $bViewboxBounded = false;
/// Reference point for search (as SQL).
public $sqlNear = '';
/// Viewbox selected for search (as SQL).
public $sqlViewboxSmall = '';
/// Viewbox with a larger buffer around (as SQL).
public $sqlViewboxLarge = '';
/// Reference along a route (as SQL).
public $sqlViewboxCentre = '';
/// List of countries to restrict search to (as array).
public $aCountryList = null;
/// List of countries to restrict search to (as SQL).
public $sqlCountryList = '';
/// List of place IDs to exclude (as SQL).
private $sqlExcludeList = '';
/// Subset of word ids of full words in the query.
private $aFullNameWords = array();
public function setFullNameWords($aWordList)
{
$this->aFullNameWords = $aWordList;
}
public function getFullNameTerms()
{
return $this->aFullNameWords;
}
/**
* Check if a reference point is defined.
*
* @return bool True if a reference point is defined.
*/
public function hasNearPoint()
{
return $this->fNearRadius !== false;
}
/**
* Get radius around reference point.
*
* @return float Search radius around reference point.
*/
public function nearRadius()
{
return $this->fNearRadius;
}
/**
* Set search reference point in WGS84.
*
* If set, then only places around this point will be taken into account.
*
* @param float $fLat Latitude of point.
* @param float $fLon Longitude of point.
* @param float $fRadius Search radius around point.
*
* @return void
*/
public function setNearPoint($fLat, $fLon, $fRadius = 0.1)
{
$this->fNearRadius = $fRadius;
$this->sqlNear = 'ST_SetSRID(ST_Point('.$fLon.','.$fLat.'),4326)';
}
/**
* Check if the search is geographically restricted.
*
* Searches are restricted if a reference point is given or if
* a bounded viewbox is set.
*
* @return bool True, if the search is geographically bounded.
*/
public function isBoundedSearch()
{
return $this->hasNearPoint() || ($this->sqlViewboxSmall && $this->bViewboxBounded);
}
/**
* Set rectangular viewbox.
*
* The viewbox may be bounded which means that no search results
* must be outside the viewbox.
*
* @param float[4] $aViewBox Coordinates of the viewbox.
* @param bool $bBounded True if the viewbox is bounded.
*
* @return void
*/
public function setViewboxFromBox(&$aViewBox, $bBounded)
{
$this->bViewboxBounded = $bBounded;
$this->sqlViewboxCentre = '';
$this->sqlViewboxSmall = sprintf(
'ST_SetSRID(ST_MakeBox2D(ST_Point(%F,%F),ST_Point(%F,%F)),4326)',
$aViewBox[0],
$aViewBox[1],
$aViewBox[2],
$aViewBox[3]
);
$fHeight = abs($aViewBox[0] - $aViewBox[2]);
$fWidth = abs($aViewBox[1] - $aViewBox[3]);
$this->sqlViewboxLarge = sprintf(
'ST_SetSRID(ST_MakeBox2D(ST_Point(%F,%F),ST_Point(%F,%F)),4326)',
max($aViewBox[0], $aViewBox[2]) + $fHeight,
max($aViewBox[1], $aViewBox[3]) + $fWidth,
min($aViewBox[0], $aViewBox[2]) - $fHeight,
min($aViewBox[1], $aViewBox[3]) - $fWidth
);
}
/**
* Set viewbox along a route.
*
* The viewbox may be bounded which means that no search results
* must be outside the viewbox.
*
* @param object $oDB Nominatim::DB instance to use for computing the box.
* @param string[] $aRoutePoints List of x,y coordinates along a route.
* @param float $fRouteWidth Buffer around the route to use.
* @param bool $bBounded True if the viewbox bounded.
*
* @return void
*/
public function setViewboxFromRoute(&$oDB, $aRoutePoints, $fRouteWidth, $bBounded)
{
$this->bViewboxBounded = $bBounded;
$this->sqlViewboxCentre = "ST_SetSRID('LINESTRING(";
$sSep = '';
foreach ($aRoutePoints as $aPoint) {
$fPoint = (float)$aPoint;
$this->sqlViewboxCentre .= $sSep.$fPoint;
$sSep = ($sSep == ' ') ? ',' : ' ';
}
$this->sqlViewboxCentre .= ")'::geometry,4326)";
$sSQL = 'ST_BUFFER('.$this->sqlViewboxCentre.','.($fRouteWidth/69).')';
$sGeom = $oDB->getOne('select '.$sSQL, null, 'Could not get small viewbox');
$this->sqlViewboxSmall = "'".$sGeom."'::geometry";
$sSQL = 'ST_BUFFER('.$this->sqlViewboxCentre.','.($fRouteWidth/30).')';
$sGeom = $oDB->getOne('select '.$sSQL, null, 'Could not get large viewbox');
$this->sqlViewboxLarge = "'".$sGeom."'::geometry";
}
/**
* Set list of excluded place IDs.
*
* @param integer[] $aExcluded List of IDs.
*
* @return void
*/
public function setExcludeList($aExcluded)
{
$this->sqlExcludeList = ' not in ('.join(',', $aExcluded).')';
}
/**
* Set list of countries to restrict search to.
*
* @param string[] $aCountries List of two-letter lower-case country codes.
*
* @return void
*/
public function setCountryList($aCountries)
{
$this->sqlCountryList = '('.join(',', array_map('addQuotes', $aCountries)).')';
$this->aCountryList = $aCountries;
}
/**
* Extract a reference point from a query string.
*
* @param string $sQuery Query to scan.
*
* @return string The remaining query string.
*/
public function setNearPointFromQuery($sQuery)
{
$aResult = parseLatLon($sQuery);
if ($aResult !== false
&& $aResult[1] <= 90.1
&& $aResult[1] >= -90.1
&& $aResult[2] <= 180.1
&& $aResult[2] >= -180.1
) {
$this->setNearPoint($aResult[1], $aResult[2]);
$sQuery = trim(str_replace($aResult[0], ' ', $sQuery));
}
return $sQuery;
}
/**
* Get an SQL snippet for computing the distance from the reference point.
*
* @param string $sObj SQL variable name to compute the distance from.
*
* @return string An SQL string.
*/
public function distanceSQL($sObj)
{
return 'ST_Distance('.$this->sqlNear.", $sObj)";
}
/**
* Get an SQL snippet for checking if something is within range of the
* reference point.
*
* @param string $sObj SQL variable name to compute if it is within range.
*
* @return string An SQL string.
*/
public function withinSQL($sObj)
{
return sprintf('ST_DWithin(%s, %s, %F)', $sObj, $this->sqlNear, $this->fNearRadius);
}
/**
* Get an SQL snippet of the importance factor of the viewbox.
*
* The importance factor is computed by checking if an object is within
* the viewbox and/or the extended version of the viewbox.
*
* @param string $sObj SQL variable name of object to weight the importance
*
* @return string SQL snippet of the factor with a leading multiply sign.
*/
public function viewboxImportanceSQL($sObj)
{
$sSQL = '';
if ($this->sqlViewboxSmall) {
$sSQL = " * CASE WHEN ST_Contains($this->sqlViewboxSmall, $sObj) THEN 1 ELSE 0.5 END";
}
if ($this->sqlViewboxLarge) {
$sSQL = " * CASE WHEN ST_Contains($this->sqlViewboxLarge, $sObj) THEN 1 ELSE 0.5 END";
}
return $sSQL;
}
/**
* SQL snippet checking if a place ID should be excluded.
*
* @param string $sVariable SQL variable name of place ID to check,
* potentially prefixed with more SQL.
*
* @return string SQL snippet.
*/
public function excludeSQL($sVariable)
{
if ($this->sqlExcludeList) {
return $sVariable.$this->sqlExcludeList;
}
return '';
}
/**
* Check if the given country is covered by the search context.
*
* @param string $sCountryCode Country code of the country to check.
*
* @return True, if no country code restrictions are set or the
* country is included in the country list.
*/
public function isCountryApplicable($sCountryCode)
{
return $this->aCountryList === null || in_array($sCountryCode, $this->aCountryList);
}
public function debugInfo()
{
return array(
'Near radius' => $this->fNearRadius,
'Near point (SQL)' => $this->sqlNear,
'Bounded viewbox' => $this->bViewboxBounded,
'Viewbox (SQL, small)' => $this->sqlViewboxSmall,
'Viewbox (SQL, large)' => $this->sqlViewboxLarge,
'Viewbox (SQL, centre)' => $this->sqlViewboxCentre,
'Countries (SQL)' => $this->sqlCountryList,
'Excluded IDs (SQL)' => $this->sqlExcludeList
);
}
}

View File

@@ -1,985 +0,0 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
namespace Nominatim;
require_once(CONST_LibDir.'/SpecialSearchOperator.php');
require_once(CONST_LibDir.'/SearchContext.php');
require_once(CONST_LibDir.'/Result.php');
/**
* Description of a single interpretation of a search query.
*/
class SearchDescription
{
/// Ranking how well the description fits the query.
private $iSearchRank = 0;
/// Country code of country the result must belong to.
private $sCountryCode = '';
/// List of word ids making up the name of the object.
private $aName = array();
/// True if the name is rare enough to force index use on name.
private $bRareName = false;
/// True if the name requires to be accompanied by address terms.
private $bNameNeedsAddress = false;
/// List of word ids making up the address of the object.
private $aAddress = array();
/// List of word ids that appear in the name but should be ignored.
private $aNameNonSearch = array();
/// List of word ids that appear in the address but should be ignored.
private $aAddressNonSearch = array();
/// Kind of search for special searches, see Nominatim::Operator.
private $iOperator = Operator::NONE;
/// Class of special feature to search for.
private $sClass = '';
/// Type of special feature to search for.
private $sType = '';
/// Housenumber of the object.
private $sHouseNumber = '';
/// Postcode for the object.
private $sPostcode = '';
/// Global search constraints.
private $oContext;
// Temporary values used while creating the search description.
/// Index of phrase currently processed.
private $iNamePhrase = -1;
/**
* Create an empty search description.
*
* @param object $oContext Global context to use. Will be inherited by
* all derived search objects.
*/
public function __construct($oContext)
{
$this->oContext = $oContext;
}
/**
* Get current search rank.
*
* The higher the search rank the lower the likelihood that the
* search is a correct interpretation of the search query.
*
* @return integer Search rank.
*/
public function getRank()
{
return $this->iSearchRank;
}
/**
* Extract key/value pairs from a query.
*
* Key/value pairs are recognised if they are of the form [<key>=<value>].
* If multiple terms of this kind are found then all terms are removed
* but only the first is used for search.
*
* @param string $sQuery Original query string.
*
* @return string The query string with the special search patterns removed.
*/
public function extractKeyValuePairs($sQuery)
{
// Search for terms of kind [<key>=<value>].
preg_match_all(
'/\\[([\\w_]*)=([\\w_]*)\\]/',
$sQuery,
$aSpecialTermsRaw,
PREG_SET_ORDER
);
foreach ($aSpecialTermsRaw as $aTerm) {
$sQuery = str_replace($aTerm[0], ' ', $sQuery);
if (!$this->hasOperator()) {
$this->setPoiSearch(Operator::TYPE, $aTerm[1], $aTerm[2]);
}
}
return $sQuery;
}
/**
* Check if the combination of parameters is sensible.
*
* @return bool True, if the search looks valid.
*/
public function isValidSearch()
{
if (empty($this->aName)) {
if ($this->sHouseNumber) {
return false;
}
if (!$this->sClass && !$this->sCountryCode) {
return false;
}
}
if ($this->bNameNeedsAddress && empty($this->aAddress)) {
return false;
}
return true;
}
/////////// Search building functions
/**
* Create a copy of this search description adding to search rank.
*
* @param integer $iTermCost Cost to add to the current search rank.
*
* @return object Cloned search description.
*/
public function clone($iTermCost)
{
$oSearch = clone $this;
$oSearch->iSearchRank += $iTermCost;
return $oSearch;
}
/**
* Check if the search currently includes a name.
*
* @param bool bIncludeNonNames If true stop-word tokens are taken into
* account, too.
*
* @return bool True, if search has a name.
*/
public function hasName($bIncludeNonNames = false)
{
return !empty($this->aName)
|| (!empty($this->aNameNonSearch) && $bIncludeNonNames);
}
/**
* Check if the search currently includes an address term.
*
* @return bool True, if any address term is included, including stop-word
* terms.
*/
public function hasAddress()
{
return !empty($this->aAddress) || !empty($this->aAddressNonSearch);
}
/**
* Check if a country restriction is currently included in the search.
*
* @return bool True, if a country restriction is set.
*/
public function hasCountry()
{
return $this->sCountryCode !== '';
}
/**
* Check if a postcode is currently included in the search.
*
* @return bool True, if a postcode is set.
*/
public function hasPostcode()
{
return $this->sPostcode !== '';
}
/**
* Check if a house number is set for the search.
*
* @return bool True, if a house number is set.
*/
public function hasHousenumber()
{
return $this->sHouseNumber !== '';
}
/**
* Check if a special type of place is requested.
*
* param integer iOperator When set, check for the particular
* operator used for the special type.
*
* @return bool True, if speial type is requested or, if requested,
* a special type with the given operator.
*/
public function hasOperator($iOperator = null)
{
return $iOperator === null ? $this->iOperator != Operator::NONE : $this->iOperator == $iOperator;
}
/**
* Add the given token to the list of terms to search for in the address.
*
* @param integer iID ID of term to add.
* @param bool bSearchable Term should be used to search for result
* (i.e. term is not a stop word).
*/
public function addAddressToken($iId, $bSearchable = true)
{
if ($bSearchable) {
$this->aAddress[$iId] = $iId;
} else {
$this->aAddressNonSearch[$iId] = $iId;
}
}
/**
* Add the given full-word token to the list of terms to search for in the
* name.
*
* @param integer iId ID of term to add.
* @param bool bRareName True if the term is infrequent enough to not
* require other constraints for efficient search.
*/
public function addNameToken($iId, $bRareName)
{
$this->aName[$iId] = $iId;
$this->bRareName = $bRareName;
$this->bNameNeedsAddress = false;
}
/**
* Add the given partial token to the list of terms to search for in
* the name.
*
* @param integer iID ID of term to add.
* @param bool bSearchable Term should be used to search for result
* (i.e. term is not a stop word).
* @param bool bNeedsAddress True if the term is too unspecific to be used
* in a stand-alone search without an address
* to narrow down the search.
* @param integer iPhraseNumber Index of phrase, where the partial term
* appears.
*/
public function addPartialNameToken($iId, $bSearchable, $bNeedsAddress, $iPhraseNumber)
{
if (empty($this->aName)) {
$this->bNameNeedsAddress = $bNeedsAddress;
} elseif ($bSearchable && count($this->aName) >= 2) {
$this->bNameNeedsAddress = false;
} else {
$this->bNameNeedsAddress &= $bNeedsAddress;
}
if ($bSearchable) {
$this->aName[$iId] = $iId;
} else {
$this->aNameNonSearch[$iId] = $iId;
}
$this->iNamePhrase = $iPhraseNumber;
}
/**
* Set country restriction for the search.
*
* @param string sCountryCode Country code of country to restrict search to.
*/
public function setCountry($sCountryCode)
{
$this->sCountryCode = $sCountryCode;
$this->iNamePhrase = -1;
}
/**
* Set postcode search constraint.
*
* @param string sPostcode Postcode the result should have.
*/
public function setPostcode($sPostcode)
{
$this->sPostcode = $sPostcode;
$this->iNamePhrase = -1;
}
/**
* Make this search a search for a postcode object.
*
* @param integer iId Token Id for the postcode.
* @param string sPostcode Postcode to look for.
*/
public function setPostcodeAsName($iId, $sPostcode)
{
$this->iOperator = Operator::POSTCODE;
$this->aAddress = array_merge($this->aAddress, $this->aName);
$this->aName = array($iId => $sPostcode);
$this->bRareName = true;
$this->iNamePhrase = -1;
}
/**
* Set house number search cnstraint.
*
* @param string sNumber House number the result should have.
*/
public function setHousenumber($sNumber)
{
$this->sHouseNumber = $sNumber;
$this->iNamePhrase = -1;
}
/**
* Make this search a search for a house number.
*
* @param integer iId Token Id for the house number.
*/
public function setHousenumberAsName($iId)
{
$this->aAddress = array_merge($this->aAddress, $this->aName);
$this->bRareName = false;
$this->bNameNeedsAddress = true;
$this->aName = array($iId => $iId);
$this->iNamePhrase = -1;
}
/**
* Make this search a POI search.
*
* In a POI search, objects are not (only) searched by their name
* but also by the primary OSM key/value pair (class and type in Nominatim).
*
* @param integer $iOperator Type of POI search
* @param string $sClass Class (or OSM tag key) of POI.
* @param string $sType Type (or OSM tag value) of POI.
*
* @return void
*/
public function setPoiSearch($iOperator, $sClass, $sType)
{
$this->iOperator = $iOperator;
$this->sClass = $sClass;
$this->sType = $sType;
$this->iNamePhrase = -1;
}
public function getNamePhrase()
{
return $this->iNamePhrase;
}
/**
* Get the global search context.
*
* @return object Objects of global search constraints.
*/
public function getContext()
{
return $this->oContext;
}
/////////// Query functions
/**
* Query database for places that match this search.
*
* @param object $oDB Nominatim::DB instance to use.
* @param integer $iMinRank Minimum address rank to restrict search to.
* @param integer $iMaxRank Maximum address rank to restrict search to.
* @param integer $iLimit Maximum number of results.
*
* @return mixed[] An array with two fields: IDs contains the list of
* matching place IDs and houseNumber the houseNumber
* if applicable or -1 if not.
*/
public function query(&$oDB, $iMinRank, $iMaxRank, $iLimit)
{
$aResults = array();
if ($this->sCountryCode
&& empty($this->aName)
&& !$this->iOperator
&& !$this->sClass
&& !$this->oContext->hasNearPoint()
) {
// Just looking for a country - look it up
if (4 >= $iMinRank && 4 <= $iMaxRank) {
$aResults = $this->queryCountry($oDB);
}
} elseif (empty($this->aName) && empty($this->aAddress)) {
// Neither name nor address? Then we must be
// looking for a POI in a geographic area.
if ($this->oContext->isBoundedSearch()) {
$aResults = $this->queryNearbyPoi($oDB, $iLimit);
}
} elseif ($this->iOperator == Operator::POSTCODE) {
// looking for postcode
$aResults = $this->queryPostcode($oDB, $iLimit);
} else {
// Ordinary search:
// First search for places according to name and address.
$aResults = $this->queryNamedPlace(
$oDB,
$iMinRank,
$iMaxRank,
$iLimit
);
// finally get POIs if requested
if ($this->sClass && !empty($aResults)) {
$aResults = $this->queryPoiByOperator($oDB, $aResults, $iLimit);
}
}
Debug::printDebugTable('Place IDs', $aResults);
if (!empty($aResults) && $this->sPostcode) {
$sPlaceIds = Result::joinIdsByTable($aResults, Result::TABLE_PLACEX);
if ($sPlaceIds) {
$sSQL = 'SELECT place_id FROM placex';
$sSQL .= ' WHERE place_id in ('.$sPlaceIds.')';
$sSQL .= " AND postcode != '".$this->sPostcode."'";
Debug::printSQL($sSQL);
$aFilteredPlaceIDs = $oDB->getCol($sSQL);
if ($aFilteredPlaceIDs) {
foreach ($aFilteredPlaceIDs as $iPlaceId) {
$aResults[$iPlaceId]->iResultRank++;
}
}
}
}
return $aResults;
}
private function queryCountry(&$oDB)
{
$sSQL = 'SELECT place_id FROM placex ';
$sSQL .= "WHERE country_code='".$this->sCountryCode."'";
$sSQL .= ' AND rank_search = 4';
if ($this->oContext->bViewboxBounded) {
$sSQL .= ' AND ST_Intersects('.$this->oContext->sqlViewboxSmall.', geometry)';
}
$sSQL .= ' ORDER BY st_area(geometry) DESC LIMIT 1';
Debug::printSQL($sSQL);
$iPlaceId = $oDB->getOne($sSQL);
$aResults = array();
if ($iPlaceId) {
$aResults[$iPlaceId] = new Result($iPlaceId);
}
return $aResults;
}
private function queryNearbyPoi(&$oDB, $iLimit)
{
if (!$this->sClass) {
return array();
}
$aDBResults = array();
$sPoiTable = $this->poiTable();
if ($oDB->tableExists($sPoiTable)) {
$sSQL = 'SELECT place_id FROM '.$sPoiTable.' ct';
if ($this->oContext->sqlCountryList) {
$sSQL .= ' JOIN placex USING (place_id)';
}
if ($this->oContext->hasNearPoint()) {
$sSQL .= ' WHERE '.$this->oContext->withinSQL('ct.centroid');
} elseif ($this->oContext->bViewboxBounded) {
$sSQL .= ' WHERE ST_Contains('.$this->oContext->sqlViewboxSmall.', ct.centroid)';
}
if ($this->oContext->sqlCountryList) {
$sSQL .= ' AND country_code in '.$this->oContext->sqlCountryList;
}
$sSQL .= $this->oContext->excludeSQL(' AND place_id');
if ($this->oContext->sqlViewboxCentre) {
$sSQL .= ' ORDER BY ST_Distance(';
$sSQL .= $this->oContext->sqlViewboxCentre.', ct.centroid) ASC';
} elseif ($this->oContext->hasNearPoint()) {
$sSQL .= ' ORDER BY '.$this->oContext->distanceSQL('ct.centroid').' ASC';
}
$sSQL .= " LIMIT $iLimit";
Debug::printSQL($sSQL);
$aDBResults = $oDB->getCol($sSQL);
}
if ($this->oContext->hasNearPoint()) {
$sSQL = 'SELECT place_id FROM placex WHERE ';
$sSQL .= 'class = :class and type = :type';
$sSQL .= ' AND '.$this->oContext->withinSQL('geometry');
$sSQL .= ' AND linked_place_id is null';
if ($this->oContext->sqlCountryList) {
$sSQL .= ' AND country_code in '.$this->oContext->sqlCountryList;
}
$sSQL .= ' ORDER BY '.$this->oContext->distanceSQL('centroid').' ASC';
$sSQL .= " LIMIT $iLimit";
Debug::printSQL($sSQL);
$aDBResults = $oDB->getCol(
$sSQL,
array(':class' => $this->sClass, ':type' => $this->sType)
);
}
$aResults = array();
foreach ($aDBResults as $iPlaceId) {
$aResults[$iPlaceId] = new Result($iPlaceId);
}
return $aResults;
}
private function queryPostcode(&$oDB, $iLimit)
{
$sSQL = 'SELECT p.place_id FROM location_postcode p ';
if (!empty($this->aAddress)) {
$sSQL .= ', search_name s ';
$sSQL .= 'WHERE s.place_id = p.parent_place_id ';
$sSQL .= 'AND array_cat(s.nameaddress_vector, s.name_vector)';
$sSQL .= ' @> '.$oDB->getArraySQL($this->aAddress).' AND ';
} else {
$sSQL .= 'WHERE ';
}
$sSQL .= "p.postcode = '".reset($this->aName)."'";
$sSQL .= $this->countryCodeSQL(' AND p.country_code');
if ($this->oContext->bViewboxBounded) {
$sSQL .= ' AND ST_Intersects('.$this->oContext->sqlViewboxSmall.', geometry)';
}
$sSQL .= $this->oContext->excludeSQL(' AND p.place_id');
$sSQL .= " LIMIT $iLimit";
Debug::printSQL($sSQL);
$aResults = array();
foreach ($oDB->getCol($sSQL) as $iPlaceId) {
$aResults[$iPlaceId] = new Result($iPlaceId, Result::TABLE_POSTCODE);
}
return $aResults;
}
private function queryNamedPlace(&$oDB, $iMinAddressRank, $iMaxAddressRank, $iLimit)
{
$aTerms = array();
$aOrder = array();
if (!empty($this->aName)) {
$aTerms[] = 'name_vector @> '.$oDB->getArraySQL($this->aName);
}
if (!empty($this->aAddress)) {
// For infrequent name terms disable index usage for address
if ($this->bRareName) {
$aTerms[] = 'array_cat(nameaddress_vector,ARRAY[]::integer[]) @> '.$oDB->getArraySQL($this->aAddress);
} else {
$aTerms[] = 'nameaddress_vector @> '.$oDB->getArraySQL($this->aAddress);
}
}
$sCountryTerm = $this->countryCodeSQL('country_code');
if ($sCountryTerm) {
$aTerms[] = $sCountryTerm;
}
if ($this->sHouseNumber) {
$aTerms[] = 'address_rank between 16 and 30';
} elseif (!$this->sClass || $this->iOperator == Operator::NAME) {
if ($iMinAddressRank > 0) {
$aTerms[] = "((address_rank between $iMinAddressRank and $iMaxAddressRank) or (search_rank between $iMinAddressRank and $iMaxAddressRank))";
}
}
if ($this->oContext->hasNearPoint()) {
$aTerms[] = $this->oContext->withinSQL('centroid');
$aOrder[] = $this->oContext->distanceSQL('centroid');
} elseif ($this->sPostcode) {
if (empty($this->aAddress)) {
$aTerms[] = "EXISTS(SELECT place_id FROM location_postcode p WHERE p.postcode = '".$this->sPostcode."' AND ST_DWithin(search_name.centroid, p.geometry, 0.12))";
} else {
$aOrder[] = "(SELECT min(ST_Distance(search_name.centroid, p.geometry)) FROM location_postcode p WHERE p.postcode = '".$this->sPostcode."')";
}
}
$sExcludeSQL = $this->oContext->excludeSQL('place_id');
if ($sExcludeSQL) {
$aTerms[] = $sExcludeSQL;
}
if ($this->oContext->bViewboxBounded) {
$aTerms[] = 'centroid && '.$this->oContext->sqlViewboxSmall;
}
if ($this->sHouseNumber) {
$sImportanceSQL = '- abs(26 - address_rank) + 3';
} else {
$sImportanceSQL = '(CASE WHEN importance = 0 OR importance IS NULL THEN 0.75001-(search_rank::float/40) ELSE importance END)';
}
$sImportanceSQL .= $this->oContext->viewboxImportanceSQL('centroid');
$aOrder[] = "$sImportanceSQL DESC";
$aFullNameAddress = $this->oContext->getFullNameTerms();
if (!empty($aFullNameAddress)) {
$sExactMatchSQL = ' ( ';
$sExactMatchSQL .= ' SELECT count(*) FROM ( ';
$sExactMatchSQL .= ' SELECT unnest('.$oDB->getArraySQL($aFullNameAddress).')';
$sExactMatchSQL .= ' INTERSECT ';
$sExactMatchSQL .= ' SELECT unnest(nameaddress_vector)';
$sExactMatchSQL .= ' ) s';
$sExactMatchSQL .= ') as exactmatch';
$aOrder[] = 'exactmatch DESC';
} else {
$sExactMatchSQL = '0::int as exactmatch';
}
if (empty($aTerms)) {
return array();
}
if ($this->hasHousenumber()) {
$sHouseNumberRegex = $oDB->getDBQuoted('\\\\m'.$this->sHouseNumber.'\\\\M');
// Housenumbers on streets and places.
$sPlacexSql = 'SELECT array_agg(place_id) FROM placex';
$sPlacexSql .= ' WHERE parent_place_id = sin.place_id AND sin.address_rank < 30';
$sPlacexSql .= $this->oContext->excludeSQL(' AND place_id');
$sPlacexSql .= ' and housenumber ~* E'.$sHouseNumberRegex;
// Interpolations on streets and places.
$sInterpolSql = 'null';
$sTigerSql = 'null';
if (preg_match('/^[0-9]+$/', $this->sHouseNumber)) {
$sIpolHnr = 'WHERE parent_place_id = sin.place_id ';
$sIpolHnr .= ' AND startnumber is not NULL AND sin.address_rank < 30';
$sIpolHnr .= ' AND '.$this->sHouseNumber.' between startnumber and endnumber';
$sIpolHnr .= ' AND ('.$this->sHouseNumber.' - startnumber) % step = 0';
$sInterpolSql = 'SELECT array_agg(place_id) FROM location_property_osmline '.$sIpolHnr;
if (CONST_Use_US_Tiger_Data) {
$sTigerSql = 'SELECT array_agg(place_id) FROM location_property_tiger '.$sIpolHnr;
$sTigerSql .= " and sin.country_code = 'us'";
}
}
if ($this->sClass) {
$iLimit = 40;
}
$sSelfHnr = 'SELECT * FROM placex WHERE place_id = search_name.place_id';
$sSelfHnr .= ' AND housenumber ~* E'.$sHouseNumberRegex;
$aTerms[] = '(address_rank < 30 or exists('.$sSelfHnr.'))';
$sSQL = 'SELECT sin.*, ';
$sSQL .= '('.$sPlacexSql.') as placex_hnr, ';
$sSQL .= '('.$sInterpolSql.') as interpol_hnr, ';
$sSQL .= '('.$sTigerSql.') as tiger_hnr ';
$sSQL .= ' FROM (';
$sSQL .= ' SELECT place_id, address_rank, country_code,'.$sExactMatchSQL.',';
$sSQL .= ' CASE WHEN importance = 0 OR importance IS NULL';
$sSQL .= ' THEN 0.75001-(search_rank::float/40) ELSE importance END as importance';
$sSQL .= ' FROM search_name';
$sSQL .= ' WHERE '.join(' and ', $aTerms);
$sSQL .= ' ORDER BY '.join(', ', $aOrder);
$sSQL .= ' LIMIT 40000';
$sSQL .= ') as sin';
$sSQL .= ' ORDER BY address_rank = 30 desc, placex_hnr, interpol_hnr, tiger_hnr,';
$sSQL .= ' importance';
$sSQL .= ' LIMIT '.$iLimit;
} else {
if ($this->sClass) {
$iLimit = 40;
}
$sSQL = 'SELECT place_id, address_rank, '.$sExactMatchSQL;
$sSQL .= ' FROM search_name';
$sSQL .= ' WHERE '.join(' and ', $aTerms);
$sSQL .= ' ORDER BY '.join(', ', $aOrder);
$sSQL .= ' LIMIT '.$iLimit;
}
Debug::printSQL($sSQL);
$aDBResults = $oDB->getAll($sSQL, null, 'Could not get places for search terms.');
$aResults = array();
foreach ($aDBResults as $aResult) {
$oResult = new Result($aResult['place_id']);
$oResult->iExactMatches = $aResult['exactmatch'];
$oResult->iAddressRank = $aResult['address_rank'];
$bNeedResult = true;
if ($this->hasHousenumber() && $aResult['address_rank'] < 30) {
if ($aResult['placex_hnr']) {
foreach (explode(',', substr($aResult['placex_hnr'], 1, -1)) as $sPlaceID) {
$iPlaceID = intval($sPlaceID);
$oHnrResult = new Result($iPlaceID);
$oHnrResult->iExactMatches = $aResult['exactmatch'];
$oHnrResult->iAddressRank = 30;
$aResults[$iPlaceID] = $oHnrResult;
$bNeedResult = false;
}
}
if ($aResult['interpol_hnr']) {
foreach (explode(',', substr($aResult['interpol_hnr'], 1, -1)) as $sPlaceID) {
$iPlaceID = intval($sPlaceID);
$oHnrResult = new Result($iPlaceID, Result::TABLE_OSMLINE);
$oHnrResult->iExactMatches = $aResult['exactmatch'];
$oHnrResult->iAddressRank = 30;
$oHnrResult->iHouseNumber = intval($this->sHouseNumber);
$aResults[$iPlaceID] = $oHnrResult;
$bNeedResult = false;
}
}
if ($aResult['tiger_hnr']) {
foreach (explode(',', substr($aResult['tiger_hnr'], 1, -1)) as $sPlaceID) {
$iPlaceID = intval($sPlaceID);
$oHnrResult = new Result($iPlaceID, Result::TABLE_TIGER);
$oHnrResult->iExactMatches = $aResult['exactmatch'];
$oHnrResult->iAddressRank = 30;
$oHnrResult->iHouseNumber = intval($this->sHouseNumber);
$aResults[$iPlaceID] = $oHnrResult;
$bNeedResult = false;
}
}
if ($aResult['address_rank'] < 26) {
$oResult->iResultRank += 2;
} else {
$oResult->iResultRank++;
}
}
if ($bNeedResult) {
$aResults[$aResult['place_id']] = $oResult;
}
}
return $aResults;
}
private function queryPoiByOperator(&$oDB, $aParentIDs, $iLimit)
{
$aResults = array();
$sPlaceIDs = Result::joinIdsByTable($aParentIDs, Result::TABLE_PLACEX);
if (!$sPlaceIDs) {
return $aResults;
}
if ($this->iOperator == Operator::TYPE || $this->iOperator == Operator::NAME) {
// If they were searching for a named class (i.e. 'Kings Head pub')
// then we might have an extra match
$sSQL = 'SELECT place_id FROM placex ';
$sSQL .= " WHERE place_id in ($sPlaceIDs)";
$sSQL .= " AND class='".$this->sClass."' ";
$sSQL .= " AND type='".$this->sType."'";
$sSQL .= ' AND linked_place_id is null';
$sSQL .= $this->oContext->excludeSQL(' AND place_id');
$sSQL .= ' ORDER BY rank_search ASC ';
$sSQL .= " LIMIT $iLimit";
Debug::printSQL($sSQL);
foreach ($oDB->getCol($sSQL) as $iPlaceId) {
$aResults[$iPlaceId] = new Result($iPlaceId);
}
}
// NEAR and IN are handled the same
if ($this->iOperator == Operator::TYPE || $this->iOperator == Operator::NEAR) {
$sClassTable = $this->poiTable();
$bCacheTable = $oDB->tableExists($sClassTable);
$sSQL = "SELECT min(rank_search) FROM placex WHERE place_id in ($sPlaceIDs)";
Debug::printSQL($sSQL);
$iMaxRank = (int) $oDB->getOne($sSQL);
// For state / country level searches the normal radius search doesn't work very well
$sPlaceGeom = false;
if ($iMaxRank < 9 && $bCacheTable) {
// Try and get a polygon to search in instead
$sSQL = 'SELECT geometry FROM placex';
$sSQL .= " WHERE place_id in ($sPlaceIDs)";
$sSQL .= " AND rank_search < $iMaxRank + 5";
$sSQL .= ' AND ST_Area(Box2d(geometry)) < 20';
$sSQL .= " AND ST_GeometryType(geometry) in ('ST_Polygon','ST_MultiPolygon')";
$sSQL .= ' ORDER BY rank_search ASC ';
$sSQL .= ' LIMIT 1';
Debug::printSQL($sSQL);
$sPlaceGeom = $oDB->getOne($sSQL);
}
if ($sPlaceGeom) {
$sPlaceIDs = false;
} else {
$iMaxRank += 5;
$sSQL = 'SELECT place_id FROM placex';
$sSQL .= " WHERE place_id in ($sPlaceIDs) and rank_search < $iMaxRank";
Debug::printSQL($sSQL);
$aPlaceIDs = $oDB->getCol($sSQL);
$sPlaceIDs = join(',', $aPlaceIDs);
}
if ($sPlaceIDs || $sPlaceGeom) {
$fRange = 0.01;
if ($bCacheTable) {
// More efficient - can make the range bigger
$fRange = 0.05;
$sOrderBySQL = '';
if ($this->oContext->hasNearPoint()) {
$sOrderBySQL = $this->oContext->distanceSQL('l.centroid');
} elseif ($sPlaceIDs) {
$sOrderBySQL = 'ST_Distance(l.centroid, f.geometry)';
} elseif ($sPlaceGeom) {
$sOrderBySQL = "ST_Distance(st_centroid('".$sPlaceGeom."'), l.centroid)";
}
$sSQL = 'SELECT distinct i.place_id';
if ($sOrderBySQL) {
$sSQL .= ', i.order_term';
}
$sSQL .= ' from (SELECT l.place_id';
if ($sOrderBySQL) {
$sSQL .= ','.$sOrderBySQL.' as order_term';
}
$sSQL .= ' from '.$sClassTable.' as l';
if ($sPlaceIDs) {
$sSQL .= ',placex as f WHERE ';
$sSQL .= "f.place_id in ($sPlaceIDs) ";
$sSQL .= " AND ST_DWithin(l.centroid, f.centroid, $fRange)";
} elseif ($sPlaceGeom) {
$sSQL .= " WHERE ST_Contains('$sPlaceGeom', l.centroid)";
}
$sSQL .= $this->oContext->excludeSQL(' AND l.place_id');
$sSQL .= 'limit 300) i ';
if ($sOrderBySQL) {
$sSQL .= 'order by order_term asc';
}
$sSQL .= " limit $iLimit";
Debug::printSQL($sSQL);
foreach ($oDB->getCol($sSQL) as $iPlaceId) {
$aResults[$iPlaceId] = new Result($iPlaceId);
}
} else {
if ($this->oContext->hasNearPoint()) {
$fRange = $this->oContext->nearRadius();
}
$sOrderBySQL = '';
if ($this->oContext->hasNearPoint()) {
$sOrderBySQL = $this->oContext->distanceSQL('l.geometry');
} else {
$sOrderBySQL = 'ST_Distance(l.geometry, f.geometry)';
}
$sSQL = 'SELECT distinct l.place_id';
if ($sOrderBySQL) {
$sSQL .= ','.$sOrderBySQL.' as orderterm';
}
$sSQL .= ' FROM placex as l, placex as f';
$sSQL .= " WHERE f.place_id in ($sPlaceIDs)";
$sSQL .= " AND ST_DWithin(l.geometry, f.centroid, $fRange)";
$sSQL .= " AND l.class='".$this->sClass."'";
$sSQL .= " AND l.type='".$this->sType."'";
$sSQL .= $this->oContext->excludeSQL(' AND l.place_id');
if ($sOrderBySQL) {
$sSQL .= 'ORDER BY orderterm ASC';
}
$sSQL .= " limit $iLimit";
Debug::printSQL($sSQL);
foreach ($oDB->getCol($sSQL) as $iPlaceId) {
$aResults[$iPlaceId] = new Result($iPlaceId);
}
}
}
}
return $aResults;
}
private function poiTable()
{
return 'place_classtype_'.$this->sClass.'_'.$this->sType;
}
private function countryCodeSQL($sVar)
{
if ($this->sCountryCode) {
return $sVar.' = \''.$this->sCountryCode."'";
}
if ($this->oContext->sqlCountryList) {
return $sVar.' in '.$this->oContext->sqlCountryList;
}
return '';
}
/////////// Sort functions
public static function bySearchRank($a, $b)
{
if ($a->iSearchRank == $b->iSearchRank) {
return $a->iOperator + strlen($a->sHouseNumber)
- $b->iOperator - strlen($b->sHouseNumber);
}
return $a->iSearchRank < $b->iSearchRank ? -1 : 1;
}
//////////// Debugging functions
public function debugInfo()
{
return array(
'Search rank' => $this->iSearchRank,
'Country code' => $this->sCountryCode,
'Name terms' => $this->aName,
'Name terms (stop words)' => $this->aNameNonSearch,
'Address terms' => $this->aAddress,
'Address terms (stop words)' => $this->aAddressNonSearch,
'Address terms (full words)' => $this->aFullNameAddress ?? '',
'Special search' => $this->iOperator,
'Class' => $this->sClass,
'Type' => $this->sType,
'House number' => $this->sHouseNumber,
'Postcode' => $this->sPostcode
);
}
public function dumpAsHtmlTableRow(&$aWordIDs)
{
$kf = function ($k) use (&$aWordIDs) {
return $aWordIDs[$k] ?? '['.$k.']';
};
echo '<tr>';
echo "<td>$this->iSearchRank</td>";
echo '<td>'.join(', ', array_map($kf, $this->aName)).'</td>';
echo '<td>'.join(', ', array_map($kf, $this->aNameNonSearch)).'</td>';
echo '<td>'.join(', ', array_map($kf, $this->aAddress)).'</td>';
echo '<td>'.join(', ', array_map($kf, $this->aAddressNonSearch)).'</td>';
echo '<td>'.$this->sCountryCode.'</td>';
echo '<td>'.Operator::toString($this->iOperator).'</td>';
echo '<td>'.$this->sClass.'</td>';
echo '<td>'.$this->sType.'</td>';
echo '<td>'.$this->sPostcode.'</td>';
echo '<td>'.$this->sHouseNumber.'</td>';
echo '</tr>';
}
}

View File

@@ -1,95 +0,0 @@
<?php
/**
* SPDX-License-Identifier: GPL-2.0-only
*
* This file is part of Nominatim. (https://nominatim.org)
*
* Copyright (C) 2022 by the Nominatim developer community.
* For a full list of authors see the git log.
*/
namespace Nominatim;
/**
* Description of the position of a token within a query.
*/
class SearchPosition
{
private $sPhraseType;
private $iPhrase;
private $iNumPhrases;
private $iToken;
private $iNumTokens;
public function __construct($sPhraseType, $iPhrase, $iNumPhrases)
{
$this->sPhraseType = $sPhraseType;
$this->iPhrase = $iPhrase;
$this->iNumPhrases = $iNumPhrases;
}
public function setTokenPosition($iToken, $iNumTokens)
{
$this->iToken = $iToken;
$this->iNumTokens = $iNumTokens;
}
/**
* Check if the phrase can be of the given type.
*
* @param string $sType Type of phrse requested.
*
* @return True if the phrase is untyped or of the given type.
*/
public function maybePhrase($sType)
{
return $this->sPhraseType == '' || $this->sPhraseType == $sType;
}
/**
* Check if the phrase is exactly of the given type.
*
* @param string $sType Type of phrse requested.
*
* @return True if the phrase of the given type.
*/
public function isPhrase($sType)
{
return $this->sPhraseType == $sType;
}
/**
* Return true if the token is the very first in the query.
*/
public function isFirstToken()
{
return $this->iPhrase == 0 && $this->iToken == 0;
}
/**
* Check if the token is the final one in the query.
*/
public function isLastToken()
{
return $this->iToken + 1 == $this->iNumTokens && $this->iPhrase + 1 == $this->iNumPhrases;
}
/**
* Check if the current token is part of the first phrase in the query.
*/
public function isFirstPhrase()
{
return $this->iPhrase == 0;
}
/**
* Get the phrase position in the query.
*/
public function getPhrase()
{
return $this->iPhrase;
}
}

Some files were not shown because too many files have changed in this diff Show More