Commit Graph

73 Commits

Author SHA1 Message Date
Sarah Hoffmann
8e1d4818ac use yaml config loader for country info 2021-09-04 00:22:55 +02:00
Sarah Hoffmann
7e7dd769fd remove language and partition from name import 2021-09-02 14:41:11 +02:00
Sarah Hoffmann
79da96b369 read partition and languages from config file 2021-09-02 14:41:11 +02:00
Sarah Hoffmann
78fcabade8 move country name generation to country_info module 2021-09-02 14:41:11 +02:00
Sarah Hoffmann
284645f505 move generation of country tables in own module 2021-09-02 14:41:11 +02:00
Sarah Hoffmann
75a5c7013f split up large setup function 2021-08-15 12:24:13 +02:00
Sarah Hoffmann
87dedde5d6 allow multiple files for the import command
The files are forwarded to osm2pgsql which is now able to merge
them correctly.
2021-08-14 21:42:21 +02:00
Sarah Hoffmann
e42349c963 replace add-data function with native Python code 2021-07-26 10:41:37 +02:00
Sarah Hoffmann
878835e4bd move add-data subcommand into a separate file 2021-07-25 18:14:12 +02:00
Sarah Hoffmann
cf98cff2a1 more formatting fixes
Found by flake8.
2021-07-12 17:45:42 +02:00
AntoJvlt
3676310efe Improved performance of the postcodes query and some code cleaning 2021-06-12 15:46:08 +02:00
AntoJvlt
a4733eed90 Use place instead of placex to compute postcodes 2021-06-09 09:31:32 +02:00
Sarah Hoffmann
72625dc72a call freeze after running a non-updateable import
Some of the tables will have already been removed but
the tables for indexing are still there and should be
dropped.
2021-06-02 11:08:48 +02:00
Sarah Hoffmann
cc2f152d70 commit changes to replication log table
Fixes #2350.
2021-05-26 11:47:08 +02:00
Sarah Hoffmann
a0e85cc17c only initialise tokenizer for refresh functions where needed
Fixes #2347.
2021-05-25 19:16:22 +02:00
AntoJvlt
3206bf59df Resolve conflicts 2021-05-17 13:52:35 +02:00
AntoJvlt
8b8dfc46eb Added --no-replace command for special phrases importation and added corresponding tests 2021-05-17 13:25:06 +02:00
AntoJvlt
06aab389ed Code cleaning and SPLoader deleted 2021-05-16 16:59:12 +02:00
Darkshredder
e5ffc59cd5 feat: Added reverse-only-search validation 2021-05-14 02:36:21 +05:30
Sarah Hoffmann
bf864b2c54 index postcodes after refreshing 2021-05-13 14:15:42 +02:00
Sarah Hoffmann
a4aba23a83 move filling of postcode table to python
The Python code now takes care of reading postcodes from placex,
enhancing them with potentially existing external postcodes and
updating location_postcodes accordingly. The initial setup and
updates use exactly the same function.

External postcode handling has been generalized. External postcodes
for any country are now accepted. The format of the external postcode
file has changed. We now expect CSV, potentially gzipped. The
postcodes are no longer saved in the database.
2021-05-13 14:15:42 +02:00
AntoJvlt
9d83da830f Introduction of SPCsvLoader to load special phrases from a csv file 2021-05-10 23:26:39 +02:00
AntoJvlt
00959fac57 Refactoring loading of external special phrases and importation process by introducing SPLoader and SPWikiLoader 2021-05-10 21:49:31 +02:00
Sarah Hoffmann
ced8f0f4a2 fix linting issues 2021-04-30 17:59:50 +02:00
Sarah Hoffmann
388ebcbae2 move index creation for word table to tokenizer
This introduces a finalization routine for the tokenizer
where it can post-process the import if necessary.
2021-04-30 17:41:08 +02:00
Sarah Hoffmann
7cb7cf848d move amenity creation to tokenizer
The BDD tests still use the old-style amenity creation scripts
because we don't have simple means to import a hand-crafted
test file of special phrases right now.
2021-04-30 11:30:51 +02:00
Sarah Hoffmann
bef300305e move default country name creation to tokenizer
The new function is also used when a country is updated. All SQL
functions related to country names have been removed.
2021-04-30 11:30:51 +02:00
Sarah Hoffmann
ffc2d82b0e move postcode normalization into tokenizer 2021-04-30 11:30:51 +02:00
Sarah Hoffmann
e1c5673ac3 require tokenizer for indexer 2021-04-30 11:30:51 +02:00
Sarah Hoffmann
fbbdd31399 move word table and normalisation SQL into tokenizer
Creating and populating the word table is now the responsibility
of the tokenizer.

The get_maxwordfreq() function has been replaced with a
simple template parameter to the SQL during function installation.
The number is taken from the parameter list in the database to
ensure that it is not changed after installation.
2021-04-30 11:30:51 +02:00
Sarah Hoffmann
296a66558f move module installation to legacy tokenizer 2021-04-30 11:29:57 +02:00
Sarah Hoffmann
af968d4903 introduce tokenizer modules
This adds the boilerplate for selecting configurable tokenizers.
A tokenizer can be chosen at import time and will then install
itself such that it is fixed for the given database import even
when the software itself is updated.

The legacy tokenizer implements Nominatim's traditional algorithms.
2021-04-30 11:29:57 +02:00
AntoJvlt
1b68152fb2 reorganization of folder/file for the special phrases importer 2021-04-25 17:57:42 +02:00
Sarah Hoffmann
89c90bedb9 pylint: disable check too-few-public-methods 2021-04-24 11:39:44 +02:00
Sarah Hoffmann
79d55357e8 simplify sql and website creation functions 2021-04-19 10:53:30 +02:00
Sarah Hoffmann
d74ae669e3 add support index when continuing import at index phase
The indexer scans the placex table sequentially
on the initial import. That is okay because we know that all
rows need to be processed anyway. When continuing the import,
however, a large part might already be indexed, so that the
process spends a lot of time going through rows that are no
longer of interest. Create a supporting index for all unindexed
rows to speed up the scan. This is the same index as used later
for updates.
2021-04-17 11:07:04 +02:00
Sarah Hoffmann
da98a2102a remove transition functions from Python 2021-04-16 18:41:14 +02:00
Sarah Hoffmann
886a01c796 port function to compute initial postcodes to Python 2021-04-16 16:11:20 +02:00
Sarah Hoffmann
76b1885595 use absolute imports in Python code
Relative imports are no longer officially recommended.
2021-04-16 14:20:09 +02:00
Darkshredder
21b1b75b08 Rebase with master 2021-03-29 14:00:45 +05:30
Sarah Hoffmann
09b2510219 Merge pull request #2228 from AntoJvlt/import-special-phrases-porting-python
Import special phrases porting python
2021-03-29 09:49:35 +02:00
AntoJvlt
57ce75eb67 Change command 'import-special-phrases --from-wiki' to 'special-phrases --import-from-wiki'. 2021-03-26 02:22:38 +01:00
AntoJvlt
2c19bd5ea3 Encapsulation of tools/special_phrases.py into SpecialPhrasesImporter class and add new tests. 2021-03-25 21:13:57 +01:00
AntoJvlt
6d56cbb3e8 Changed phrase_settings.py to phrase-settings.json and added migration function for old php settings file. 2021-03-23 23:30:39 +01:00
marc tobias
87d5883ddb nominatim -h was printing wrong text for lookup and details 2021-03-21 16:06:41 +01:00
AntoJvlt
17cb59efbd Ported functions for the import of special phrases from php to python.
- the command is now --import-special-phrases
- the output is no longer an SQL file; data is imported directly into the database.
- the short passage in the documentation (section data import) has been modified.
2021-03-20 19:11:50 +01:00
Darkshredder
7a874d5b97 Ported createCountryNames() to python and added tests 2021-03-12 10:28:41 +05:30
Sarah Hoffmann
9086a794a1 Merge pull request #2204 from darkshredder/tiger-data
Ported tiger-data-import to Python and Added Tarball Support
2021-03-11 22:48:38 +01:00
Darkshredder
122c4618b9 Linting fixes 2021-03-08 22:59:51 +05:30
Darkshredder
2af82975cd Ported tiger-data-import to python and Added Tarball Support 2021-03-08 21:57:56 +05:30