CONST_BasePath is split into separate configuration variables
for binaries, libraries and data. These variables as well as
the installation path are now set in the executable directly and
no longer configurable via project settings.
This is the first step towards installable software. The
executables should know per installation where to find their
necessary data to execute. Project configuration needs to be
restricted to settings that really concern the specific Nominatim
installation.
Multi-word partial terms had an undue advantage over separate partial
terms because they only need to pay the penalty once. This changes
the behaviour by setting the penalty according to the number of
words in the token. This should get rid of search interpretations
with a low chance of matching.
This also fixes handling of exact term matching. We now match against
all exact terms of the query, not just a couple of them collected
while building the interpretations.
Also adds a penalty to very short postcodes.
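
The per-word penalty described above can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: `PartialToken`, `PARTIAL_PENALTY` and both functions are hypothetical names, and the cost of 1 per word is an assumption.

```python
from dataclasses import dataclass
from typing import List

# Assumed cost for each word covered by a partial term (hypothetical value).
PARTIAL_PENALTY = 1


@dataclass
class PartialToken:
    """Hypothetical stand-in for a partial-term token of the query."""
    word: str

    def word_count(self) -> int:
        # A multi-word token such as "main street" counts as two words.
        return len(self.word.split())


def partial_penalty(token: PartialToken) -> int:
    """Penalty grows with the number of words in the token."""
    return PARTIAL_PENALTY * token.word_count()


def interpretation_penalty(tokens: List[PartialToken]) -> int:
    """Total penalty of one search interpretation."""
    return sum(partial_penalty(t) for t in tokens)
```

With this scheme, one token covering "main street" costs the same as the two separate tokens "main" and "street", so the multi-word token no longer gets an advantage.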
House numbers need special handling because they may appear after
the street term. That means we cannot just use them as the main name
for searches where the address has its own search term entries.
As it stands, we are able to find '40, Main St, Town' but not
'Main St 40, Town'.
This switches to using the housenumber token as the name term instead.
House number tokens can get special handling when building the search
query that covers the case where they come after the street.
The main disadvantage is that this once more increases the number
of possible search interpretations, of which we already have too many.
No penalty for housenumber searches.
In structured queries we should only assume that it is
a postcode search when only the postcode and optionally
the country is given. If any other term is present, it
is better to avoid the search for postcode as it yields
too many bad searches. Given that the terms in a structured
query are ordered, this means that the postcode must be
the first token just like in the unstructured query.
Fixes #1988.
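
The rule for structured queries can be illustrated with a small check. This is only a sketch under assumptions: the function name and the phrase kind strings are hypothetical, and the input is assumed to be the ordered list of phrase kinds actually present in the query.

```python
from typing import List


def may_be_postcode_search(phrase_kinds: List[str], index: int) -> bool:
    """Return True if the phrase at `index` may start a postcode search.

    `phrase_kinds` is the ordered list of phrase kinds present in the
    query (hypothetical labels such as "street", "postcode", "country").
    A postcode search is only considered when the postcode is the very
    first phrase, i.e. nothing more specific precedes it.
    """
    return index == 0 and phrase_kinds[index] == "postcode"
```

For example, `["postcode", "country"]` allows a postcode search, while `["street", "postcode"]` does not, because the street term comes first.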
Now that we have result ranking, we can keep the street results
for housenumber searches and reuse them in the next group round
if required. Also fixes an issue where postcode and housenumber
are in the query and one of them is wrong.
Fixes #1200.
The word frequency hash was only used to determine if the
name of a SearchDescription is rare. Do this already when
building the SearchDescription (when the word frequency
is still available) and get rid of the extra hash.
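
A rough sketch of the idea: compute the "rare name" flag once, while the word frequencies are at hand, and store only the flag. All names here are hypothetical, and both the threshold and the definition of "rare" (any name token below the threshold) are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Assumed cut-off below which a word counts as rare (hypothetical value).
RARE_NAME_THRESHOLD = 1000


@dataclass
class SearchDescription:
    """Hypothetical, heavily reduced search description."""
    name_tokens: List[str] = field(default_factory=list)
    has_rare_name: bool = False


def build_search_description(name_tokens: List[str],
                             word_frequency: Dict[str, int]) -> SearchDescription:
    """Decide rarity at build time so no frequency hash is needed later."""
    rare = any(word_frequency.get(tok, 0) < RARE_NAME_THRESHOLD
               for tok in name_tokens)
    return SearchDescription(list(name_tokens), rare)
```

After construction, later stages only read `has_rare_name` and never need the frequency table again.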
Searches for house numbers are already limited by the
number of parent places. In fact, the limit assumed that
every parent place has exactly one match against the
given housenumber. That is not true in reality and so
we were dropping relevant results.
Fixes #329.
Adding a penalty to a search description because there
is a term at the beginning which looks like a country
turned out to be a bad idea as there are too many
abbreviations around that match against frequently
matched words.
Drop country tokens that do not match the country code list
early. Remove in turn the special country code check for
structured phrases. It is sufficient to do this during
word list building.
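
Dropping country tokens early could look roughly like the following filter applied while the word list is built. The `Token` class, its fields and the function name are hypothetical; treating an empty country code list as "no restriction" is also an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Token:
    """Hypothetical query token."""
    word: str
    kind: str                        # e.g. "country", "partial", "postcode"
    country_code: Optional[str] = None


def drop_foreign_country_tokens(tokens: List[Token],
                                country_codes: Set[str]) -> List[Token]:
    """Keep non-country tokens and country tokens from the allowed list."""
    if not country_codes:
        # Assumption: no list given means no restriction.
        return list(tokens)
    return [t for t in tokens
            if t.kind != "country" or t.country_code in country_codes]
```

Because the filter runs once during word list building, later stages such as structured phrase handling would not need a separate country code check.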
The NearPoint is actually common to all SearchDescriptions,
and there is other context data as well, like the viewbox, that
needs to be available to the search object but is common.
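
One way to share such common data is a single context object referenced by every search description. The sketch below only illustrates that idea; `SearchContext`, its fields and the reduced `SearchDescription` are hypothetical names, not the actual classes.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class SearchContext:
    """Hypothetical container for data common to all searches of a query."""
    near_point: Optional[Tuple[float, float]] = None               # (lat, lon)
    viewbox: Optional[Tuple[float, float, float, float]] = None    # bounding box


@dataclass
class SearchDescription:
    """Hypothetical, reduced search description holding a shared context."""
    context: SearchContext
    penalty: int = 0


# All descriptions of one query reference the same context instance.
ctx = SearchContext(near_point=(52.5, 13.4))
searches = [SearchDescription(ctx), SearchDescription(ctx, penalty=1)]
```

Making the context immutable (`frozen=True`) is a design choice of this sketch: it prevents one search from changing data that the others rely on.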