Sarah Hoffmann
fc995ea6b9
move database check for module to tokenizer
2021-04-30 17:41:08 +02:00
Sarah Hoffmann
3eb4d88057
boilerplate for PHP code of tokenizer
...
This adds an installation step for PHP code for the tokenizer. The
PHP code is split into two parts. The updateable code is found in
lib-php. The tokenizer installs an additional script in the
project directory which then includes the code from lib-php and
defines all settings that are static to the database. The website
code then always includes the PHP from the project directory.
2021-04-30 11:31:52 +02:00
Sarah Hoffmann
7cb7cf848d
move amenity creation to tokenizer
...
The BDD tests still use the old-style amenity creation scripts
because we don't have simple means to import a hand-crafted
test file of special phrases right now.
2021-04-30 11:30:51 +02:00
Sarah Hoffmann
bef300305e
move default country name creation to tokenizer
...
The new function is also used when a country is updated. All SQL
functions related to country names have been removed.
2021-04-30 11:30:51 +02:00
Sarah Hoffmann
ffc2d82b0e
move postcode normalization into tokenizer
2021-04-30 11:30:51 +02:00
Sarah Hoffmann
d8ed1bfc60
move housenumber handling to tokenizer
...
Normalization and token computation are now done in the tokenizer.
The tokenizer keeps a cache of the hundred most used house numbers
to keep the number of calls to the database low.
2021-04-30 11:30:51 +02:00
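Note on d8ed1bfc60: the house number cache described above could look roughly like the sketch below. It is not the code from the commit. The names `make_cached_lookup` and `db_lookup` are hypothetical, and `functools.lru_cache` evicts the least recently used entry, which only approximates "most used".

```python
from functools import lru_cache


def make_cached_lookup(db_lookup, maxsize=100):
    """Wrap a database lookup so repeated house numbers skip the database.

    db_lookup is a hypothetical callable that takes a normalized house
    number and returns its token. At most `maxsize` results are kept.
    """
    @lru_cache(maxsize=maxsize)
    def lookup(housenumber):
        # Only reached on a cache miss.
        return db_lookup(housenumber)

    return lookup
```

A caller would build the wrapper once per tokenizer instance and route every house number through it, so frequent values such as "1" or "2" cost a single database call.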
Sarah Hoffmann
a73711f3cd
add extra column for tokenizer
...
Add a jsonb column to the placex and location_property_osmline tables
which can be used by the installed tokenizer as required. No other
part of the software will use or otherwise rely on this column.
2021-04-30 11:30:51 +02:00
Sarah Hoffmann
fbbdd31399
move word table and normalisation SQL into tokenizer
...
Creating and populating the word table is now the responsibility
of the tokenizer.
The get_maxwordfreq() function has been replaced with a
simple template parameter to the SQL during function installation.
The number is taken from the parameter list in the database to
ensure that it is not changed after installation.
2021-04-30 11:30:51 +02:00
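Note on fbbdd31399: replacing a SQL helper function with a template parameter can be sketched as follows. This is an illustration under assumptions, not the project's code. The SQL text, the `max_word_freq` key and the `render_function_sql` helper are hypothetical, and the standard library's `string.Template` stands in for whatever templating the project actually uses.

```python
from string import Template

# Hypothetical SQL fragment; $max_word_freq is filled in once at install time.
FUNCTION_SQL = Template(
    "SELECT word_id FROM word WHERE search_name_count < $max_word_freq"
)


def render_function_sql(template, db_properties):
    """Substitute the stored frequency limit into the SQL text.

    db_properties is a hypothetical dict of values saved in the database
    at import time. Reading from it, rather than from the current
    configuration, keeps the limit fixed after installation.
    """
    max_word_freq = int(db_properties["max_word_freq"])
    return template.substitute(max_word_freq=max_word_freq)
```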
Sarah Hoffmann
b5540dc35c
add migration for configurable tokenizer
...
Adds a migration that initialises a legacy tokenizer for
an existing database. The migration is not active yet as
it will need completion when more functionality is added
to the legacy tokenizer.
2021-04-30 11:29:57 +02:00
Sarah Hoffmann
296a66558f
move module installation to legacy tokenizer
2021-04-30 11:29:57 +02:00
Sarah Hoffmann
185d369404
remove support for AUX housenumber tables
...
These tables have never been actively maintained and the code is
completely untested. With the upcoming changes, it is unlikely
that the code remains usable.
This removes the aux tables and all code that references them.
2021-04-30 10:08:29 +02:00
Sarah Hoffmann
51d20b19b6
Merge pull request #2299 from lonvia/update-actions
...
Fix database check for reverse-only
2021-04-27 12:18:45 +02:00
Sarah Hoffmann
46e8c6b112
Merge pull request #2291 from AntoJvlt/special-phrases-statistics
...
Special phrases statistics
2021-04-27 11:57:05 +02:00
Sarah Hoffmann
c8fb25201a
do not check for extra housenumber index for reverse-only
...
Also adds a database check for reverse only import to the CI.
2021-04-27 10:14:26 +02:00
Sarah Hoffmann
4457bf7528
avoid Path in subprocess parameters
...
Not supported by Python 3.5.
2021-04-26 10:55:23 +02:00
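Note on 4457bf7528: a minimal sketch of the workaround, assuming the fix amounts to converting `pathlib.Path` objects to strings before they reach `subprocess`. The helper names `to_cmd` and `run_python_script` are hypothetical.

```python
import subprocess
import sys
from pathlib import Path


def to_cmd(parts):
    """Convert every command part to str before handing it to subprocess."""
    return [str(part) for part in parts]


def run_python_script(script, *args):
    """Run a Python script; `script` may be a str or a pathlib.Path."""
    # str() conversion avoids passing Path objects, which the commit
    # message says Python 3.5 does not support.
    cmd = to_cmd([sys.executable, script] + list(args))
    return subprocess.run(cmd, check=True)
```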
AntoJvlt
abb3d56b20
Switching to log info and only send warning for invalid phrases
2021-04-25 17:57:43 +02:00
AntoJvlt
c5ecb9bae0
Implemented statistics for the import of special phrases through the SpecialPhrasesImporterStatistics class
2021-04-25 17:57:43 +02:00
AntoJvlt
1b68152fb2
reorganization of folder/file for the special phrases importer
2021-04-25 17:57:42 +02:00
Sarah Hoffmann
b951b11336
fix pylint complaints
2021-04-24 11:59:32 +02:00
Sarah Hoffmann
89c90bedb9
pylint: disable check too-few-public-methods
2021-04-24 11:39:44 +02:00
Sarah Hoffmann
9c51c133f7
indexes with includes are not available for postgresql < 11
2021-04-23 22:50:08 +02:00
Sarah Hoffmann
280406c0d7
use pathlib version of open
2021-04-23 22:50:08 +02:00
Sarah Hoffmann
d5fc3b5e99
subprocess needs string argument
...
Compatibility change for Python 3.5.
2021-04-23 22:50:08 +02:00
Sarah Hoffmann
3a642d50a4
use more generic ImportError to check for module
...
ModuleNotFoundError was only introduced in Python 3.6.
2021-04-23 22:50:08 +02:00
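Note on 3a642d50a4: the pattern can be sketched as below. `module_available` is a hypothetical helper, not a function from the repository. Because `ModuleNotFoundError` is a subclass of `ImportError`, catching the parent class works on Python 3.5 as well as on later versions.

```python
import importlib


def module_available(name):
    """Return True when the named module can be imported."""
    try:
        importlib.import_module(name)
    except ImportError:
        # Also catches ModuleNotFoundError on Python 3.6 and later.
        return False
    return True
```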
Sarah Hoffmann
79d55357e8
simplify sql and website creation functions
2021-04-19 10:53:30 +02:00
Sarah Hoffmann
4fa6c0ad53
simplify constructor for SQL preprocessor
...
Use sql path from config.
2021-04-19 10:26:25 +02:00
Sarah Hoffmann
8f63f9516b
simplify interface for adding tiger data
...
Also simplifies tests using existing fixtures.
2021-04-19 10:26:25 +02:00
AntoJvlt
b2ae715699
Only log a warning if a wrong input is detected on the wiki while importing special phrases
2021-04-17 20:19:39 +02:00
AntoJvlt
a95c748363
Fix occurrence regex
2021-04-17 19:24:13 +02:00
Sarah Hoffmann
886a01c796
port function to compute initial postcodes to Python
2021-04-16 16:11:20 +02:00
Sarah Hoffmann
76b1885595
use absolute imports in Python code
...
Relative imports are no longer officially recommended.
2021-04-16 14:20:09 +02:00
Sarah Hoffmann
c64193f839
Merge pull request #2263 from AntoJvlt/special-phrases-autoupdate
...
Implemented auto update of special phrases while importing them
2021-04-15 10:13:25 +02:00
Sarah Hoffmann
e90adfc7c3
adapt database check to new index layout
2021-04-14 17:52:59 +02:00
Sarah Hoffmann
16267dc021
add migration for new placenode geometry index
2021-04-14 17:52:59 +02:00
Darkshredder
49ee7505ed
Fix: Removed error if endstatement is wrong and improved tests
2021-04-13 15:44:12 +05:30
AntoJvlt
ae2b2cb9a5
Tests added for the auto update of special phrases during import
2021-04-12 14:35:29 +02:00
AntoJvlt
8c2f287ce4
Implemented auto update of special phrases while importing them
2021-04-12 14:30:48 +02:00
AntoJvlt
5ecae10713
Fix default languages loading
2021-04-11 22:26:31 +02:00
Sarah Hoffmann
71564fa1de
split LANGUAGES parameter before use
...
The user supplies the languages as a comma-separated list.
2021-04-09 17:48:28 +02:00
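Note on 71564fa1de: splitting a comma-separated setting before use might look like this. The function name `split_languages` is hypothetical, and stripping whitespace and dropping empty entries are assumptions that the commit message does not confirm.

```python
def split_languages(value):
    """Turn a comma-separated setting into a list of language codes."""
    if not value:
        return []
    # Assumption: surrounding whitespace and empty entries are ignored.
    return [lang.strip() for lang in value.split(',') if lang.strip()]
```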
Sarah Hoffmann
96b0699621
add migration for transliterated housenumbers
2021-04-04 15:26:47 +02:00
AntoJvlt
cde9389e75
Error fixes, code cleanup, improvement and addition of tests
2021-03-26 01:53:33 +01:00
AntoJvlt
2c19bd5ea3
Encapsulation of tools/special_phrases.py into SpecialPhrasesImporter class and add new tests.
2021-03-25 21:13:57 +01:00
AntoJvlt
ff34198569
Code cleaning, tests simplification and use of python3-icu package
2021-03-23 23:56:39 +01:00
AntoJvlt
1ce8b530cd
Introduction of PyICU for transliteration in python. Reversed changes in normalization.sql.
2021-03-23 23:34:16 +01:00
AntoJvlt
6d56cbb3e8
Changed phrase_settings.py to phrase-settings.json and added migration function for old php settings file.
2021-03-23 23:30:39 +01:00
AntoJvlt
17cb59efbd
Ported functions for the import of special phrases from php to python.
...
- the command is now --import-special-phrases
- the output is no longer an SQL file; the data is imported directly into the database.
- the short passage in the documentation (section data import) has been updated.
2021-03-20 19:11:50 +01:00
Sarah Hoffmann
81a6b746b8
Merge pull request #2212 from darkshredder/country-name
...
Ported createCountryNames() to python and Added tests
2021-03-15 09:36:06 +01:00
Sarah Hoffmann
7212fa8630
fix template variable name
2021-03-13 12:05:53 +01:00
Darkshredder
b108bd1c1e
Linting fix
2021-03-12 18:28:47 +05:30
Darkshredder
077a8c1f95
refactored tests and made changes to code for better readability
2021-03-12 18:23:20 +05:30