Sarah Hoffmann
49bd18b048
replace PhraseType enum with simple int constants
2025-02-21 16:44:12 +01:00
Sarah Hoffmann
31412e0674
replace TokenType enum with simple char constants
2025-02-21 10:23:41 +01:00
Sarah Hoffmann
4577669213
replace BreakType enum with simple char constants
2025-02-21 09:57:48 +01:00
Sarah Hoffmann
9bf1428d81
consistently use query module as qmod
2025-02-21 09:31:21 +01:00
Sarah Hoffmann
b56edf3d0a
avoid yielding when extracting words from query
2025-02-20 23:32:39 +01:00
Sarah Hoffmann
abc911079e
remove word_number counting for phrases
...
We can just examine the break types to know if we are dealing
with a partial token.
2025-02-20 17:36:50 +01:00
Sarah Hoffmann
adabfee3be
Merge pull request #3655 from lonvia/remove-name-ranking-in-postcode-search
...
Tweak penalties for postcode searches
2025-02-20 14:32:43 +01:00
Sarah Hoffmann
46c4446dc2
remove address penalty for postcode search
...
Searches of the form <postcode> <city> are in fact quite common.
2025-02-20 11:11:45 +01:00
Sarah Hoffmann
add9244a2f
do not rerank address by full match in postcode search
...
The reranking result will not be completely correct because
the address of a postcode refers to the address _and_ name
of the parent and reranking was only done against the
address. We assume here that the postcode is precise enough
as to not require a penalty for partial matches.
2025-02-20 10:29:03 +01:00
Sarah Hoffmann
96d7a8e8f6
Merge pull request #3653 from lonvia/trailing-spaces-in-normalization
...
Strip leading and trailing space markers during normalization
2025-02-19 17:25:59 +01:00
Sarah Hoffmann
55c3176957
strip normalisation results of normal and special spaces
2025-02-19 14:40:35 +01:00
Sarah Hoffmann
e29823e28f
add test for structured query with leading spaces
2025-02-19 10:31:36 +01:00
Sarah Hoffmann
97ed168996
Merge pull request #3652 from lonvia/update-variants
...
Cleanup and updates of tokenizer variant configuration
2025-02-18 19:47:45 +01:00
Sarah Hoffmann
9b8ef97d4b
Merge pull request #3649 from lonvia/actions-move-to-ubuntu22
...
Move GitHub actions to Ubuntu-22 image
2025-02-18 13:21:09 +01:00
Sarah Hoffmann
4f3c88f0c1
remove e-ë mutation, this is taken care of by transliteration
2025-02-18 10:31:44 +01:00
mhsr21
7781186f3c
Add USPS Standard Suffix Abbreviation
2025-02-18 09:28:13 +01:00
Sarah Hoffmann
f78686edb8
fix Norwegian variants
...
More cases of 'no' being interpreted as false by YAML.
2025-02-18 09:28:13 +01:00
Sarah Hoffmann
e330cd3162
remove ineffective and duplicate variants
2025-02-18 09:28:13 +01:00
Sarah Hoffmann
671af4cff2
Merge pull request #3555 from IvanShift/patch-1
...
Fixed Russian abbreviation list
2025-02-17 18:44:11 +01:00
Sarah Hoffmann
e612b7d550
actions: use Debian's script for adding the Postgres apt repo
2025-02-17 17:56:23 +01:00
Sarah Hoffmann
0b49d01703
actions: move tests to Ubuntu-20
2025-02-17 17:54:49 +01:00
Sarah Hoffmann
f6bc8e153f
Merge pull request #3648 from lonvia/extratags-for-geocodejson
...
Enable output of extratags for geocodejson format
2025-02-17 11:14:52 +01:00
Sarah Hoffmann
f143ecaf1c
add documentation for new extra field
2025-02-17 10:04:23 +01:00
Sarah Hoffmann
6730c8bac8
add optional output of extratags to geocodejson
2025-02-16 10:16:40 +01:00
Sarah Hoffmann
ee8915f2b6
prepare 5.0.0 release
v5.0.0
2025-02-05 10:54:38 +01:00
Sarah Hoffmann
5475bf7b9c
Merge pull request #3635 from lonvia/replace-wikimedia-importance-test-data
...
Update wikimedia importance file for test database
2025-01-14 16:49:52 +01:00
Sarah Hoffmann
95e2d8c846
adapt tests to changed wikimedia importance test table
2025-01-14 14:19:17 +01:00
Sarah Hoffmann
7552818866
replace wikimedia importance file for test data with CSV version
2025-01-14 09:16:25 +01:00
Sarah Hoffmann
db3991af74
Merge pull request #3626 from lonvia/import-performance
...
Import performance
2025-01-10 16:44:33 +01:00
Sarah Hoffmann
4523b9aaed
Merge pull request #3631 from lonvia/avoid-transactions
...
Creating tables and indexes in autocommit mode
2025-01-10 16:44:18 +01:00
Sarah Hoffmann
8b1cabebd6
Merge pull request #3633 from lonvia/restrict-long-ways
...
Ignore overly long ways during import
2025-01-10 16:06:37 +01:00
Sarah Hoffmann
0cf636a80c
ignore overly long ways during import
2025-01-10 13:55:43 +01:00
Sarah Hoffmann
c2cb6722fe
use autocommit when creating tables and indexes
...
Might avoid some deadlock situations with autovacuum.
2025-01-09 17:14:37 +01:00
Sarah Hoffmann
f8337bedb2
Merge pull request #3629 from lonvia/additional-breaks
...
Introduce new break types and phrase splitting for Japanese addresses
2025-01-09 13:55:29 +01:00
Sarah Hoffmann
efc09a5cfc
add Japanese phrase preprocessing
...
Code adapted from GSOC code by @miku.
2025-01-09 09:24:10 +01:00
Sarah Hoffmann
86ad9efa8a
keep break indicators [:-] during normalisation
...
All punctuation will be converted to '-'. Soft breaks (':') may be
added by preprocessors. The break signs are only used during
query analysis and are ignored during import token analysis.
2025-01-09 09:21:55 +01:00
Sarah Hoffmann
d984100e23
add inner word break penalty
2025-01-07 21:42:25 +01:00
Sarah Hoffmann
499110f549
add SOFT_PHRASE break and enable parsing
...
Also enables parsing of PART breaks.
2025-01-06 17:10:24 +01:00
Sarah Hoffmann
267e5dac0d
split up MultiPolygons before adding them to large_areas table
2024-12-22 09:15:16 +01:00
Sarah Hoffmann
32d3eb46d5
move geometry split into insertLocationAreaLarge()
...
thus insert only needs to be called once.
2024-12-22 09:15:16 +01:00
Sarah Hoffmann
c8a0dc8af1
more efficient belongs-to-address determination
2024-12-22 09:15:16 +01:00
Sarah Hoffmann
14ecfc7834
Merge pull request #3619 from lonvia/demote-farms
...
Remove farms and isolated dwellings from computed addresses
2024-12-22 09:13:42 +01:00
Sarah Hoffmann
cad44eb00c
remove farms and isolated dwellings from computed addresses
...
Farms and isolated dwellings are usually confined to a very small
area. It does not make sense if they are automatically used in
addressing surrounding features. They can still be used for
parenting when combined with addr:place.
2024-12-20 22:59:02 +01:00
Sarah Hoffmann
f76dbb0a16
docs: update Update docs for virtualenv use
2024-12-20 11:27:45 +01:00
Sarah Hoffmann
8dd218a1d0
Merge pull request #3618 from osm-search/settings-md-table-space-osm-index
...
Settings.md - one setting was repeated
2024-12-19 08:40:31 +01:00
mtmail
501e13483e
Settings.md - one setting was repeated
2024-12-18 21:58:51 +01:00
Sarah Hoffmann
b1d25e404f
Merge pull request #3617 from mtmail/pr-3615-wording
...
Slight wording changes for Import-Styles.md
2024-12-18 11:04:21 +01:00
marc tobias
71fceb6854
Slight wording changes for Import-Styles.md
2024-12-18 01:02:46 +01:00
Sarah Hoffmann
a06e123d70
Merge pull request #3616 from osm-search/tokenizers-md-typo
...
fix typo in Tokenizers.md
2024-12-17 08:43:16 +01:00
mtmail
df6f70d223
fix typo in Tokenizers.md
2024-12-16 23:38:18 +01:00