60 Commits

Author SHA1 Message Date
Sarah Hoffmann
340e7f7210 bdd: complete coverage for API tests
Also removes some functions that are no longer used and
fixes debug output where the tests found an issue.
2021-01-17 16:12:06 +01:00
Sarah Hoffmann
b5480f6e36 reorganise path settings in config
CONST_BasePath is split into separate configuration variables
for binaries, libraries and data. These variables as well as
the installation path are now set in the executable directly and
no longer configurable via project settings.

This is the first step towards installable software. The
executables should know per installation where to find their
necessary data to execute. Project configuration needs to be
restricted to settings that really concern the specific Nominatim
installation.
2020-12-19 14:33:04 +01:00
Sarah Hoffmann
ea844db847 do not classify housenumbers as rare
House numbers are highly redundant, so don't even attempt to
do it as a rare name search. Greatly improves speed of such
queries.
2020-12-08 17:25:15 +01:00
Sarah Hoffmann
df12954312 fix use of term count in partial terms
Term count for partial words is one less than the actual number
of words. Take that into account when adding to the search rank.

Fixes #2081.
2020-12-01 17:21:01 +01:00
Sarah Hoffmann
c5d98effc0 Merge pull request #2074 from lonvia/add-housenumber-to-unknown-places
Improve finding addresses that have their own search_name entry because of unknown addr:* parts
2020-11-25 16:57:09 +01:00
Sarah Hoffmann
3cf763475f do not use artificial housenumbers as names
If they are artificial they cannot have a search_name entry.
2020-11-25 16:11:32 +01:00
Sarah Hoffmann
0f87da017f improve handling of multi-word partials in SearchDescription
Multi-word partial terms had an undue advantage over separate partial
terms because they only need to pay the penalty once. This changes
the behaviour by setting the penalty according to the number of
words in the token. This should get rid of search interpretations
with low chance of matching.

This also fixes handling of exact term matching. We now match against
all exact terms of the query, not just a couple of them collected
while building the interpretations.

Also adds a penalty to very short postcodes.
2020-11-25 12:07:04 +01:00
Sarah Hoffmann
22800d7d59 Search housenumbers with unknown address parts by housenumber term
House numbers need special handling because they may appear after
the street term. That means we cannot just use them as the main name
for searches where the address has its own search term entries.
Doing this right now, we are able to find '40, Main St, Town' but not
'Main St 40, Town'.

This switches to using the housenumber token as the name term instead.
House number tokens can get special handling when building the search
query that covers the case where they come after the street.

The main disadvantage is that this once more increases the number
of possible search interpretations, of which we already have too many.

no penalty for housenumber searches
2020-11-25 11:36:10 +01:00
Sarah Hoffmann
7d2b6879c8 restrict postcode searches to postcode in first token
In structured queries we should only assume that it is
a postcode search when only the postcode and optionally
the country is given. If any other term is present, it
is better to avoid the search for postcode as it yields
too many bad searches. Given that the terms in a structured
query are ordered, this means that the postcode must be
the first token just like in the unstructured query.

Fixes #1988.
2020-10-06 14:08:31 +02:00
(Joy) Yuanyue Ding
cac8a8df18 Modify the range of address_rank, fix for issue #164 2020-07-08 17:47:38 +02:00
Sarah Hoffmann
acd8ca2ebd add testing for rank adaption while linking 2020-02-28 15:22:48 +01:00
marc tobias
eeb26aaa6f Addresses with housenumber 0 are found 2019-10-31 13:52:10 +01:00
marc tobias
c9a6350894 On postcode searches observe given bounded viewbox 2019-04-02 14:49:31 +02:00
marc tobias
890d415e1f Nominatim::DB support input variables, custom error messages 2019-03-10 16:56:36 +01:00
marc tobias
d4b633bfc5 replace database abstraction DB with PDO 2019-03-09 00:18:15 +01:00
Sarah Hoffmann
8f0c628310 downgrade housenumbers without numbers
Fixes #1312.
2019-02-24 14:39:14 +01:00
Sarah Hoffmann
85f32d6c0f Keep matches without house number
Now that we have result ranking, we can keep the street results
for housenumber searches and reuse them in the next group round
if required. Also fixes an issue where postcode and housenumber
are in the query and one of them is wrong.

Fixes #1200.
2018-11-17 00:35:38 +01:00
Sarah Hoffmann
9908c93d4c Add result ranking for missing housenumber and postcode
Fixes #988.
2018-11-17 00:00:01 +01:00
Sarah Hoffmann
c5109d39d0 increase limit when searching for street w/ house number
Increase the chance that the correct street is found.
2018-10-20 17:26:45 +02:00
Sarah Hoffmann
119ffbab40 address tokens get a double search rank also as full terms
Fixes #1170.
2018-09-18 21:54:08 +02:00
Sarah Hoffmann
f29c7bf910 introduce classes for token list and token types 2018-05-14 23:04:15 +02:00
Sarah Hoffmann
c555b60b36 narrow down search by house number when postcode is given
Fixes #1034.
2018-05-07 21:34:11 +02:00
Sarah Hoffmann
115792d1db replace word frequency hash
The word frequency hash was only used to determine if the
name of a SearchDescription is rare. Do this already when
building the SearchDescription (when the word frequency
is still available) and get rid of the extra hash.
2018-05-06 22:35:31 +02:00
Sarah Hoffmann
efac4a135a do not apply limit to house number place searches
Searches for house numbers are already limited by the
number of parent places. In fact, the limit assumed that
every parent place has exactly one match against the
given housenumber. That is not true in reality and so
we were dropping relevant results.

Fixes #329.
2018-04-06 22:20:21 +02:00
Sarah Hoffmann
2c42bda9ce nicer formatting for Geocode debug output 2018-03-25 22:28:18 +02:00
marc tobias
27bc8d4f7b replace PHP sizeof() with either count() or empty() 2018-03-22 12:36:24 +01:00
Sarah Hoffmann
f23a860b33 second attempt at strict names in structured queries
If a term does not go into names it should go into
address terms.
2018-03-02 00:26:48 +01:00
Sarah Hoffmann
df008d99f5 do not allow importance to become 0
Importance is weighed against a viewbox factor which disappears
when the importance is 0.

Fixes #930.
2018-03-01 22:37:45 +01:00
Sarah Hoffmann
fd920fba9b for structured search only accept name terms from the first phrase
Fixes #952.
2018-03-01 22:35:34 +01:00
Edward Betts
7e00a6e2ff Correct spelling mistakes. 2018-02-18 13:11:35 +00:00
Sarah Hoffmann
c7b903f4b0 assume name for special operator in bounded search
With bounded=1 we already have a restricted area, so it does
not make sense to interpret the query as a near search.

Fixes #311.
2017-12-17 23:50:16 +01:00
Sarah Hoffmann
cdfa31c390 Gives preference to special terms like postcode and housenumber
Fixes #846.
2017-12-17 20:23:34 +01:00
Sarah Hoffmann
b94229fb8e Give higher penalty to partial search terms
Prevents the interpretation of a term as a partial term
from being ranked higher than its interpretation as a special
term like postcode or house number.

Fixes #847.
2017-12-17 16:00:44 +01:00
Sarah Hoffmann
6c1977b448 replace double-quoting with single quotes where applicable 2017-10-26 21:40:33 +02:00
Sarah Hoffmann
f78d094483 fix variable typo when filtering results
Fixes #830 and #832.
2017-10-25 20:25:23 +02:00
Sarah Hoffmann
7caa67d8ec penalize housenumber after the postcode 2017-10-24 23:30:41 +02:00
Sarah Hoffmann
760807c5e0 revert use of global penalty for a search direction
Adding a penalty to a search description because there
is a term at the beginning which looks like a country
turned out to be a bad idea as there are too many
abbreviations around that match against frequently
matched words.
2017-10-24 22:42:29 +02:00
Sarah Hoffmann
42f079c355 introduce Result class in Geocode and SearchDescription 2017-10-23 23:30:53 +02:00
Sarah Hoffmann
cdf8c67898 fix CodeSniffer offences 2017-10-13 23:11:09 +02:00
Sarah Hoffmann
00265af528 move word recheck into token collection
Drop tokens for special and postcode searches already when
collecting them for ValidTokens when they cannot be found
in the normalized query.
2017-10-13 23:04:12 +02:00
Sarah Hoffmann
77b76ae51b simplify cross-check of country tokens
Drop country tokens that do not match the country code list
early. Remove in turn the special country code check for
structured phrases. It is sufficient to do this during
word list building.
2017-10-13 22:23:39 +02:00
Sarah Hoffmann
77abe882ab take frequency scores from token description
No need to hand them in separately.
2017-10-12 22:59:07 +02:00
Sarah Hoffmann
c8780da19c documentation for SearchContext and SearchDescription 2017-10-10 00:15:56 +02:00
Sarah Hoffmann
c02bf4986f coding style and some documentation 2017-10-09 23:13:04 +02:00
Sarah Hoffmann
9a5d5d9aec move complete search query code into SearchDescription 2017-10-09 22:55:50 +02:00
Sarah Hoffmann
55629a4891 move country list to SearchContext 2017-10-08 23:33:54 +02:00
Sarah Hoffmann
907133a38c move excluded place list to SearchContext 2017-10-08 23:15:06 +02:00
Sarah Hoffmann
86c0858130 move viewbox sql to new SearchContext 2017-10-08 22:44:01 +02:00
Sarah Hoffmann
30511fd3ab replace NearPoint with a more generic context object
The NearPoint is actually common to all SearchDescriptions
and there is other context data as well, like viewbox, that
needs to be available to the search object but is common.
2017-10-08 21:23:31 +02:00
Sarah Hoffmann
614a6ab861 don't trust words from word table to be sanitized 2017-10-08 17:36:38 +02:00