Sarah Hoffmann
e1c5673ac3
require tokenizer for indexer
2021-04-30 11:30:51 +02:00
Sarah Hoffmann
fbbdd31399
move word table and normalisation SQL into tokenizer
...
Creating and populating the word table is now the responsibility
of the tokenizer.
The get_maxwordfreq() function has been replaced with a
simple template parameter to the SQL during function installation.
The number is taken from the parameter list in the database to
ensure that it is not changed after installation.
2021-04-30 11:30:51 +02:00
Sarah Hoffmann
296a66558f
move module installation to legacy tokenizer
2021-04-30 11:29:57 +02:00
Sarah Hoffmann
af968d4903
introduce tokenizer modules
...
This adds the boilerplate for selecting configurable tokenizers.
A tokenizer can be chosen at import time and will then install
itself such that it is fixed for the given database import even
when the software itself is updated.
The legacy tokenizer implements Nominatim's traditional algorithms.
2021-04-30 11:29:57 +02:00
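The idea of a tokenizer that is chosen at import time and then stays fixed for that database can be sketched roughly as below. This is only an illustration under assumptions: the `TOKENIZERS` registry, the `properties` dictionary standing in for a database property table, and both function names are hypothetical and not taken from the Nominatim code base.

```python
# Hypothetical sketch: none of these names are Nominatim's actual API.


class LegacyTokenizer:
    """Stand-in for a tokenizer implementing the traditional algorithms."""
    name = "legacy"


# Assumed registry mapping a configurable name to a tokenizer class.
TOKENIZERS = {"legacy": LegacyTokenizer}


def create_tokenizer(properties, requested):
    """At import time: pick the tokenizer and record the choice.

    `properties` is a plain dict standing in for settings stored
    inside the database.
    """
    if requested not in TOKENIZERS:
        raise ValueError(f"unknown tokenizer: {requested}")
    properties["tokenizer"] = requested
    return TOKENIZERS[requested]()


def get_tokenizer(properties):
    """After import: use the recorded choice, not the current configuration."""
    name = properties.get("tokenizer")
    if name is None:
        raise RuntimeError("database has no tokenizer recorded")
    return TOKENIZERS[name]()


properties = {}
create_tokenizer(properties, "legacy")
tokenizer = get_tokenizer(properties)
```

Reading the name back from the stored properties, rather than from the current configuration, is what would keep the choice stable across software updates in this sketch.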
AntoJvlt
1b68152fb2
reorganization of folder/file for the special phrases importer
2021-04-25 17:57:42 +02:00
Sarah Hoffmann
89c90bedb9
pylint: disable check too-few-public-methods
2021-04-24 11:39:44 +02:00
Sarah Hoffmann
79d55357e8
simplify sql and website creation functions
2021-04-19 10:53:30 +02:00
Sarah Hoffmann
d74ae669e3
add support index when continuing import at index phase
...
Indexing scans the placex table sequentially on the initial
import. That is okay because we know that all
rows need to be processed anyway. When continuing the import,
however, a large part might already be indexed, so that the
process spends a lot of time going through rows that are no
longer of interest. Create a supporting index for all unindexed
rows to speed up the scan. This is the same index as used later
for updates.
2021-04-17 11:07:04 +02:00
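A supporting index that covers only the unindexed rows is a partial index. The sketch below shows the technique with SQLite as a stand-in so that it runs without a server; the commit itself targets PostgreSQL. The column name `indexed_status`, the index name, and the convention that a value above zero means "still to be indexed" are assumptions made for illustration.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the real PostgreSQL database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE placex (place_id INTEGER PRIMARY KEY, indexed_status INTEGER)"
)

# 1000 rows, of which every hundredth is still marked as unindexed (status 1).
conn.executemany(
    "INSERT INTO placex VALUES (?, ?)",
    [(i, 0 if i % 100 else 1) for i in range(1, 1001)],
)

# Partial index: only rows that still need processing are part of it,
# so it stays small when most of the table is already indexed.
conn.execute(
    "CREATE INDEX IF NOT EXISTS idx_placex_pending "
    "ON placex (place_id) WHERE indexed_status > 0"
)

# A query with the same condition can be answered from the small index
# instead of a sequential scan over the whole table.
pending = [
    row[0]
    for row in conn.execute(
        "SELECT place_id FROM placex WHERE indexed_status > 0 ORDER BY place_id"
    )
]
conn.close()
```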
Sarah Hoffmann
da98a2102a
remove transition functions from Python
2021-04-16 18:41:14 +02:00
Sarah Hoffmann
886a01c796
port function to compute initial postcodes to Python
2021-04-16 16:11:20 +02:00
Sarah Hoffmann
76b1885595
use absolute imports in Python code
...
Relative imports are no longer officially recommended.
2021-04-16 14:20:09 +02:00
Darkshredder
21b1b75b08
Rebase with master
2021-03-29 14:00:45 +05:30
Sarah Hoffmann
09b2510219
Merge pull request #2228 from AntoJvlt/import-special-phrases-porting-python
...
Import special phrases porting python
2021-03-29 09:49:35 +02:00
AntoJvlt
57ce75eb67
Change command 'import-special-phrases --from-wiki' to 'special-phrases --import-from-wiki'.
2021-03-26 02:22:38 +01:00
AntoJvlt
2c19bd5ea3
Encapsulation of tools/special_phrases.py into SpecialPhrasesImporter class and add new tests.
2021-03-25 21:13:57 +01:00
AntoJvlt
6d56cbb3e8
Changed phrase_settings.py to phrase-settings.json and added migration function for old php settings file.
2021-03-23 23:30:39 +01:00
marc tobias
87d5883ddb
nominatim -h was printing wrong text for lookup and details
2021-03-21 16:06:41 +01:00
AntoJvlt
17cb59efbd
Ported functions for the import of special phrases from php to python.
...
- the command is now --import-special-phrases
- the output is not an sql file anymore, data are directly imported to the database.
- the short passage in the documentation (section data import) has been updated.
2021-03-20 19:11:50 +01:00
Darkshredder
7a874d5b97
Ported createCountryNames() to python and added tests
2021-03-12 10:28:41 +05:30
Sarah Hoffmann
9086a794a1
Merge pull request #2204 from darkshredder/tiger-data
...
Ported tiger-data-import to Python and Added Tarball Support
2021-03-11 22:48:38 +01:00
Darkshredder
122c4618b9
Linting fixes
2021-03-08 22:59:51 +05:30
Darkshredder
2af82975cd
Ported tiger-data-import to python and Added Tarball Support
2021-03-08 21:57:56 +05:30
Sarah Hoffmann
764a41b973
automatic migration from 3.6 release
...
Adds an 'admin --migrate' command that checks for the current
database version and runs any necessary migrations. Also
has migrations going back to 3.6.
2021-03-06 16:36:57 +01:00
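One way to picture a version-checked migration run is sketched below. This is an assumption-laden illustration, not the code from the commit: the `migration` decorator, the `MIGRATIONS` list, the example migration functions, and the dict standing in for the database are all hypothetical.

```python
# Hypothetical sketch of version-gated migrations; all names are illustrative.

MIGRATIONS = []  # list of (version tuple, migration function)


def migration(version):
    """Register a function as the migration that leads to `version`."""
    def decorator(func):
        MIGRATIONS.append((version, func))
        return func
    return decorator


@migration((3, 6, 1))
def add_example_property(db):
    db["example_property"] = True


@migration((3, 7, 0))
def rename_example_property(db):
    db["renamed_property"] = db.pop("example_property", None)


def migrate(db, software_version):
    """Run every migration newer than the version stored in `db`."""
    current = db["version"]
    applied = []
    for version, func in sorted(MIGRATIONS, key=lambda item: item[0]):
        if current < version <= software_version:
            func(db)
            applied.append(func.__name__)
    # Record the new version so that a second run has nothing left to do.
    db["version"] = software_version
    return applied


database = {"version": (3, 6, 0)}
applied_first = migrate(database, (3, 7, 0))
applied_second = migrate(database, (3, 7, 0))
```

Comparing version tuples keeps the ordering check simple, and storing the version after the run makes the command safe to repeat.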
Sarah Hoffmann
09f4d767e4
port index creation to python
...
Also switches to jinja-based preprocessing, which allows
simplifying the SQL files. Use 'if not exists' where possible
so that the step can be rerun to fix missing indexes.
2021-03-04 11:11:47 +01:00
Sarah Hoffmann
eacabb0e96
move table creation to jinja-based preprocessing
2021-03-03 22:07:51 +01:00
Sarah Hoffmann
3a0a4b9175
save software version in the database
...
The version represents the software version that was used to
import the data.
2021-03-01 20:35:15 +01:00
Sarah Hoffmann
b4f64aa770
make sure that calls to PHP legacy scripts are fatal on error
2021-03-01 16:10:45 +01:00
Sarah Hoffmann
dd03aeb966
bdd: use python library where possible
...
Replace calls to PHP scripts with direct calls into the
nominatim Python library where possible. This speeds up
tests quite a bit.
2021-02-26 16:14:29 +01:00
Sarah Hoffmann
15b5906790
move setup function to python
...
There are still back-calls to PHP for some of the sub-steps.
These need some larger refactoring to be moved to Python.
2021-02-26 15:02:39 +01:00
Sarah Hoffmann
57db5819ef
port load-data function to python
2021-02-25 21:32:40 +01:00
Sarah Hoffmann
3c186f8030
add a function for the initial indexing run
...
Also moves postcodes to fully parallel indexing.
2021-02-25 18:42:54 +01:00
Sarah Hoffmann
c7fd0a7af4
port wikipedia importance functions to python
2021-02-25 18:42:54 +01:00
Sarah Hoffmann
32683f73c7
move import-data option to native python
...
This adds a new dependency on the Python psutil package.
2021-02-25 18:42:54 +01:00
Sarah Hoffmann
7222235579
introduce custom object for cmdline arguments
...
Allows defining special functions over the arguments.
Also splits CLI tests in two files as they have become too many.
2021-02-25 18:42:54 +01:00
Sarah Hoffmann
f6e894a53a
port database setup function to python
...
Hide the former PHP functions in a transition command until
they are removed.
2021-02-25 18:42:54 +01:00
Sarah Hoffmann
e520613362
convert connect() into a context manager
2021-02-25 18:42:54 +01:00
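Turning a `connect()` helper into a context manager generally means the connection is closed automatically when the `with` block ends, even on error. A minimal sketch of the pattern follows, using SQLite from the standard library as a stand-in for the real database driver; the function body is an assumption and not the code from the commit.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def connect(dsn):
    """Open a connection and guarantee that it is closed afterwards.

    SQLite is used here only so that the example runs without a server.
    """
    conn = sqlite3.connect(dsn)
    try:
        yield conn
    finally:
        # Runs on normal exit and when the block raises an exception.
        conn.close()


with connect(":memory:") as conn:
    value = conn.execute("SELECT 1").fetchone()[0]
```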
Sarah Hoffmann
389138abfe
port setup-website to python
2021-02-19 17:51:06 +01:00
Sarah Hoffmann
b169e4c88c
port check-database function to python
...
This change also adapts the hints to use the nominatim tool.
Slightly changed checks, so that they are just as effective on
a frozen database.
2021-02-18 17:32:30 +01:00
Sarah Hoffmann
101a1f895d
port freeze function to python
2021-02-17 21:43:15 +01:00
Sarah Hoffmann
7cc4c53adb
always return 0 for updates unless there is an error
...
This is more in line with previous behaviour than returning
a status code when no updates are available.
2021-02-11 10:33:49 +01:00
Sarah Hoffmann
de37dc9300
forgot to replace one occurrence of sql_dir
2021-02-09 19:32:05 +01:00
Sarah Hoffmann
b9517c99ae
rename sql directory to lib-sql
...
Also introduces a separate constant for the sql directory, so that
it can be put separately from the rest of the data if required.
2021-02-09 15:26:56 +01:00
Sarah Hoffmann
d81e152804
integrate analysis of indexing into nominatim tool
2021-02-08 22:22:49 +01:00
Sarah Hoffmann
0cbf98c020
consolidate warm and db-check into single admin command
2021-02-08 21:05:06 +01:00
Sarah Hoffmann
195f9f5ef3
split cli.py by subcommands
...
Reduces file size below 1000 lines.
2021-02-08 17:23:05 +01:00