Sarah Hoffmann
bef1aebf1c
add function for parallel execution of SQL scripts
2022-09-19 11:52:17 +02:00
Sarah Hoffmann
bc94318d83
mypy: fix new warnings due to external type updates
2022-09-05 17:39:35 +02:00
Sarah Hoffmann
8d082c13e0
adapt to new type annotations from typeshed
...
Some more functions from psycopg are now properly annotated.
No ignoring necessary anymore.
2022-08-09 11:06:54 +02:00
Kian-Meng Ang
f5e52e748f
docs: fix typos
2022-07-20 22:05:31 +08:00
Sarah Hoffmann
83054af46f
remove typing_extensions requirement
...
The typing_extensions package is only necessary now when running mypy.
It won't be used at runtime anymore.
2022-07-18 09:55:58 +02:00
Sarah Hoffmann
17bbe2637a
add type annotations to tool functions
2022-07-18 09:54:27 +02:00
Sarah Hoffmann
18b16e06ca
add type annotations for legacy tokenizer
2022-07-18 09:47:57 +02:00
Sarah Hoffmann
e37cfc64d2
add type annotations to ICU tokenizer helper modules
2022-07-18 09:47:57 +02:00
Sarah Hoffmann
5617bffe2f
add type annotations for indexer
2022-07-18 09:47:57 +02:00
Sarah Hoffmann
7a1d22ff15
type annotations for non-blocking DB connection
2022-07-18 09:47:57 +02:00
Sarah Hoffmann
0dff71a410
add type annotations for SQL preprocessor
2022-07-18 09:47:57 +02:00
Sarah Hoffmann
26f30bff28
add type annotation to DB utils
...
As a cursor is needed as type, make this a public type.
2022-07-18 09:47:57 +02:00
Sarah Hoffmann
e6775e713c
add typing information to DB properties
2022-07-18 09:47:57 +02:00
Sarah Hoffmann
69f9122bef
add typing annotations for DB status module
...
Requires TypedDict, which is only available from Python 3.8. Therefore
require typing_extensions to make the functions available for
earlier Python versions.
2022-07-18 09:47:57 +02:00
Sarah Hoffmann
845c43137a
add type annotations to freeze functions
2022-07-18 09:47:57 +02:00
Sarah Hoffmann
c4928c646d
define type for environment dictionaries
2022-07-18 09:47:57 +02:00
Sarah Hoffmann
f12fe54d2b
restrict return type more
2022-07-18 09:47:57 +02:00
Sarah Hoffmann
681aad7e0d
avoid issues with Python < 3.9 and linting
2022-07-18 09:47:57 +02:00
Sarah Hoffmann
f22fa992f7
move complex typing annotations to extra file
2022-07-18 09:47:57 +02:00
Sarah Hoffmann
992e6f72cf
type annotations for DB utils
2022-07-18 09:47:57 +02:00
Sarah Hoffmann
e6ee3c772c
type annotations for DB connection
2022-07-18 09:47:57 +02:00
Sarah Hoffmann
7f7a7df3a2
solve assorted issues with newer pylint versions
...
Includes more use of 'with', adding encodings to open statements
and a couple of issues with parameter renaming.
2022-05-11 10:22:14 +02:00
Sarah Hoffmann
ae6b029543
remove redundant 'u' prefixes for unicode strings
2022-05-11 09:48:56 +02:00
Sarah Hoffmann
bb2bd76f91
pylint: avoid explicit use of format() function
...
Use psycopg2 SQL formatters for SQL and formatted string literals
everywhere else.
2022-05-11 09:48:56 +02:00
Sarah Hoffmann
36a1560117
add migration to mark internal country names
2022-03-31 15:55:20 +02:00
Sarah Hoffmann
a0ed80d821
restore the tokenizer directory when missing
...
Automatically repopulate the tokenizer/ directory with the PHP stub
and the postgresql module, when the directory is missing. This allows
switching working directories and in particular running the service
from a different machine than the one where it was installed.
Users still need to make sure that .env files are set up correctly
or they will shoot themselves in the foot.
See #2515.
2022-03-20 11:31:42 +01:00
Sarah Hoffmann
c3788d765e
add consistent SPDX copyright headers
2022-01-03 16:23:58 +01:00
Sarah Hoffmann
552fb16cb2
fix template expressions for tablespaces
2021-10-15 15:11:09 +02:00
Sarah Hoffmann
3649487f5e
use SP-GIST index for building index where available
...
Point-in-polygon queries are much faster with an SP-GIST geometry
index, so use that for the index used to check if a housenumber
is inside a building.
Only available with Postgis 3. There is an automatic fallback to
GIST for Postgis 2.
2021-10-10 21:55:38 +02:00
Sarah Hoffmann
eb6814d74e
convert word info column to json before copying
2021-07-28 11:31:47 +02:00
Sarah Hoffmann
2c8242c8df
remove special code for pre-9.5 postgresql
...
9.5 is now the minimum requirement.
2021-07-19 10:24:57 +02:00
Sarah Hoffmann
6f6681ce67
add helper function for execute_values
...
Make psycopg2's convenience function accessible through
the cursor.
2021-07-12 21:08:20 +02:00
Sarah Hoffmann
06602b4ec0
provide wrapper function for DROP TABLE
...
Use psycopg2 formatting to ensure correct quoting.
2021-07-12 20:32:46 +02:00
Sarah Hoffmann
cf98cff2a1
more formatting fixes
...
Found by flake8.
2021-07-12 17:45:42 +02:00
Sarah Hoffmann
f8b5a63de3
factor out connection reset code
2021-07-12 14:58:44 +02:00
Sarah Hoffmann
a08ef43e40
simplify if statements
2021-07-12 11:28:47 +02:00
Sarah Hoffmann
a0a7b05c9f
correctly quote strings when copying in data
...
Encapsulate the copy string in a class that ensures that
copy lines are written with correct quoting.
2021-07-04 10:28:20 +02:00
Sarah Hoffmann
72625dc72a
call freeze after running a non-updateable import
...
Some of the tables will have already been removed but
the tables for indexing are still there and should be
dropped.
2021-06-02 11:08:48 +02:00
Sarah Hoffmann
5feece64c1
use WorkerPool for Tiger data import
...
Requires adding an option that SQL errors are ignored.
2021-05-13 20:36:50 +02:00
Sarah Hoffmann
b9a09129fa
move WorkerPool into db module
...
The pool is independent of the indexer and may also be used
by other parts of the software.
2021-05-13 17:11:17 +02:00
Sarah Hoffmann
6ce6f62b8e
fetch place info asynchronously
2021-04-30 17:41:08 +02:00
Sarah Hoffmann
fbbdd31399
move word table and normalisation SQL into tokenizer
...
Creating and populating the word table is now the responsibility
of the tokenizer.
The get_maxwordfreq() function has been replaced with a
simple template parameter to the SQL during function installation.
The number is taken from the parameter list in the database to
ensure that it is not changed after installation.
2021-04-30 11:30:51 +02:00
Sarah Hoffmann
89c90bedb9
pylint: disable check too-few-public-methods
2021-04-24 11:39:44 +02:00
Sarah Hoffmann
91d2fb6b1c
use group() for regex matches
...
Needed for compatibility with Python 3.5.
2021-04-23 22:50:08 +02:00
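Note on the commit above: subscripting a match object (`match[1]`) only works from Python 3.6 onwards, whereas `group()` also works on 3.5. A minimal sketch of the compatible form follows; the function name and the regular expression are made up for illustration and are not taken from the repository.

```python
import re

# Hypothetical helper, for illustration only.
def split_version(text):
    """Return (major, minor) parsed from a string like '9.5', or None."""
    match = re.match(r'(\d+)\.(\d+)', text)
    if match is None:
        return None
    # match.group(n) works on Python 3.5; match[n] needs Python 3.6+.
    return int(match.group(1)), int(match.group(2))
```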
Sarah Hoffmann
3a642d50a4
use more generic ImportError to check for module
...
ModuleNotFoundError was only introduced in Python 3.6.
2021-04-23 22:50:08 +02:00
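Note on the commit above: `ModuleNotFoundError` is a subclass of `ImportError`, so catching `ImportError` covers both old and new Python versions. A minimal sketch of such a check follows; the helper name is an assumption for illustration, not the code from the commit.

```python
import importlib

# Hypothetical helper, for illustration only.
def module_available(name):
    """Return True if the named module can be imported."""
    try:
        importlib.import_module(name)
    except ImportError:
        # Also catches ModuleNotFoundError on Python 3.6+.
        return False
    return True
```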
Sarah Hoffmann
9685c68e30
replace usages of fromisoformat() with strptime()
...
fromisoformat was only introduced with Python 3.7 while we
still support Python 3.5.
Fixes #2292.
2021-04-23 22:50:08 +02:00
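Note on the commit above: a timestamp can be parsed without `datetime.fromisoformat()` by giving `strptime()` an explicit format string. The sketch below assumes a UTC timestamp of the form `YYYY-MM-DDTHH:MM:SSZ`; the helper name and the format are assumptions for illustration and may differ from what the repository uses.

```python
from datetime import datetime, timezone

# Hypothetical helper; the timestamp format is an assumption.
def parse_utc_timestamp(value):
    """Parse 'YYYY-MM-DDTHH:MM:SSZ' into a timezone-aware datetime."""
    parsed = datetime.strptime(value, '%Y-%m-%dT%H:%M:%SZ')
    # strptime() returns a naive datetime here, so attach UTC explicitly.
    return parsed.replace(tzinfo=timezone.utc)
```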
Sarah Hoffmann
4fa6c0ad53
simplify constructor for SQL preprocessor
...
Use sql path from config.
2021-04-19 10:26:25 +02:00
Sarah Hoffmann
76b1885595
use absolute imports in Python code
...
Relative imports are no longer officially recommended.
2021-04-16 14:20:09 +02:00
Sarah Hoffmann
8d8b1d4307
use non-key index to speed up housenumber search
...
On Postgresql versions 11+ add an index to speed up the lookup
of housenumbers for terms found in search_name. This is really
just a band-aid around the query planner's interpretation of the
query.
2021-04-01 17:10:44 +02:00
Sarah Hoffmann
09f4d767e4
port index creation to python
...
Also switches to jinja-based preprocessing, which allows
simplifying the SQL files. Use 'if not exists' where possible
so that the step can be rerun to fix missing indexes.
2021-03-04 11:11:47 +01:00