When a associatedStreet relation has multiple street members
always take the closest one. Avoid geometry operations for
the frequent case that there is only one street.
When deciding if an address interpolation has address information, only
look for addr:street and addr:place. If they are not there go looking
for the address on the address nodes. Ignores irrelevant tags like
addr:inclusion.
Fixes#2797.
When a boundary or place changes its address rank, all places where
it participates as address need to be potentially reindexed.
Also use the computed rank when testing place nodes against
boundaries. Boundaries are computed earlier.
Fixes#2794.
Resolves a couple of situations where a mixed use of places areas and
administrative boundaries would result in a hierarchy that did not
properly respect the contains relation.
When moving the finding of linked places to the precomputation stage,
it was also moved before the statement where the linked_place_id was
removed from the linkee. The result was that the current linkee was
excluded when looking for a linked place on updates because it was
still linked to the boundary to be updated.
Fixed by allowing to either keep the linkage or change to an unlinked
place.
This is needed for pedestrian areas mapped as multipolygons
and consequently as relations. The lookup in placex guarantees
that the referenced OSM object is indeed a street.
Fixes#2669.
The inherited housenumber is needed for display output. We can't
take the one from the housenumber field because it is already
normalized. Remove the inherited address only when reindexing.
Fixes#2683.
Convert the '_place_*' entries back to normal entries before
returning them in the 'namedetails' section. If the name field is
duplicated, kept the '_place_*' notation. This preserves the previous
behaviour before _place_ names were introduces but adds the additional
names from the linked place for reference.
This keeps the names tracable and ensures that all names are searchable
when they differ. Do not keep names when they are exactly the same
to save some space. Linked names are cleaned out before relinking.
Mutations are regular-expression-based replacements that are applied
after variants have been computed. They are meant to be used for
variations on character level.
Add spelling variations for German umlauts.
The highway key is being used more and more for non-ways these
days. This clashes with Nominatim's assumption that essentially
everything that has a highway tag can be used as the street part
of the address.
Change the default rank of highway objects to 30 to avoid this.
Only the known values for streets keep the rank 26 and are now
listed explicitly.
Adds a tagger for names by language so that the analyzer of that
language is used. Thus variants are now only applied to names
in the specific language and only tag name tags, no longer to
reference-like tags.
When matching address parts from addr:* tags against place names,
the address names where so far converted to full names and compared
those to the place names. This can become problematic with the new
ICU tokenizer once we introduce creation of different variants
depending on the place name context. It wouldn't be clear which
variant to produce to get a match, so we would have to create all of
them. To work around this issue, switch to using the partial terms
for matching. This introduces a larger fuzziness between matches but
that shouldn't be a problem because matching is always geographically
restricted.
The search terms created for address parts have a different problem:
they are already created before we even know if they are going to be
used. This can lead to spurious entries in the word table, which slows
down searching. This problem can also be circumvented by using only
partial terms for the search terms. In terms of searching that means
that the address terms would not get the full-word boost, but given
that the case where an address part does not exist as an OSM object
should be the exception, this is likely acceptable.
A boolean check for dynamic changes of address parts is not
sufficient. The order of choice should be:
1. an addr:* part matches the name
2. the address part surrounds the object
3. the address part was declared as isaddress
The implementation uses a slightly different ordering
to avoid geometry checks unless strictly necessary (isaddress
is false and no matching address).
See #2446.
Linked places may bring in extra names. These names need to be
processed by the tokenizer. That means that the linking needs
to be done before the data is handed to the tokenizer. Move finding
the linked place into the preparation stage and update the name
fields. Everything else is still done in the indexing stage.
The BDD tests cannot make assumptions about the structure of the
word table anymore because it depends on the tokenizer. Use more
abstract descriptions instead that ask for specific kinds of
tokens.
Compound decomposition now creates a full name variant on
import just like abbreviations. This simplifies query time
normalization and opens a path for changing abbreviation
and compund decomposition lists for an existing database.
We've previously added searching through rank 30 in a house
number search to enable searches for house number+name.
This had the unintended side effect that rank 30 objects
are also returned in s search that dropped the house number
from the query. This is wrong because POIs cannot function
as a parent to a house number.
This fix drops all rank 30 objects from the results for a
house number search if they do not match the requested house
number.