reorganize keyword creation for legacy tokenizer

- only save partial words without internal spaces
- treat comma and semicolon as separators between full words
- consider the part before an opening bracket a full word
  (but not the part after the bracket)

Fixes #244.
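
The three rules above can be illustrated with a rough Python sketch. This is not the legacy tokenizer's code: the names `normalize` and `make_keywords` are hypothetical, and the normalization (lowercasing, replacing punctuation with spaces) is a simplified stand-in assumed only for this example.

```python
import re


def normalize(text: str) -> str:
    """Simplified stand-in for name normalization (an assumption):
    lowercase, turn punctuation into spaces, collapse whitespace."""
    return " ".join(re.sub(r"[^\w\s]", " ", text.lower()).split())


def make_keywords(name: str):
    """Return (full_words, partial_words) for a raw name string."""
    full_words = set()
    partial_words = set()

    # Rule 2: comma and semicolon separate full words.
    for part in re.split(r"[,;]", name):
        norm = normalize(part)
        if not norm:
            continue
        full_words.add(norm)

        # Rule 1: partial words are single terms, never containing a space.
        partial_words.update(norm.split())

        # Rule 3: the part before an opening bracket is an extra full word;
        # the part after the bracket is not added as a full word on its own.
        if "(" in part:
            before = normalize(part.split("(", 1)[0])
            if before:
                full_words.add(before)

    return full_words, partial_words


if __name__ == "__main__":
    print(make_keywords("New York;Big Apple"))
    print(make_keywords("Halle (Saale)"))
```

Under these assumptions, "Halle (Saale)" yields the full words "halle saale" and "halle", which mirrors the `#Halle Saale, #Halle` expectation in the feature test added by this commit.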
Sarah Hoffmann
2021-05-23 23:58:58 +02:00
parent fa3e48c59f
commit 4f4d15c28a
4 changed files with 85 additions and 29 deletions

@@ -2,6 +2,29 @@
Feature: Creation of search terms
Tests that search_name table is filled correctly
Scenario Outline: Comma- and semicolon separated names appear as full names
Given the places
| osm | class | type | name+alt_name |
| N1 | place | city | New York<sep>Big Apple |
When importing
Then search_name contains
| object | name_vector |
| N1 | #New York, #Big Apple |
Examples:
| sep |
| , |
| ; |
Scenario Outline: Name parts before brackets appear as full names
Given the places
| osm | class | type | name+name |
| N1 | place | city | Halle (Saale) |
When importing
Then search_name contains
| object | name_vector |
| N1 | #Halle Saale, #Halle |
Scenario: Unnamed POIs have no search entry
Given the scene roads-with-pois
And the places
@@ -49,7 +72,7 @@ Feature: Creation of search terms
When importing
Then search_name contains
| object | nameaddress_vector |
-| N1 | Rose Street, Little, Big, Town |
+| N1 | #Rose Street, rose, Little, Big, Town |
When searching for "23 Rose Street, Little Big Town"
Then results contain
| osm_type | osm_id | name |