update to modern mkdocstrings python handler

2026-03-13 22:34:07 +00:00 · 2023-08-18 17:28:45 +02:00
parent d5b6042118
commit d3372e69ec
9 changed files with 75 additions and 71 deletions
--- a/docs/CMakeLists.txt
+++ b/docs/CMakeLists.txt
@@ -25,10 +25,10 @@ endforeach()
 ADD_CUSTOM_TARGET(doc
   COMMAND ${CMAKE_CURRENT_SOURCE_DIR}/bash2md.sh ${PROJECT_SOURCE_DIR}/vagrant/Install-on-Ubuntu-20.sh ${CMAKE_CURRENT_BINARY_DIR}/appendix/Install-on-Ubuntu-20.md
   COMMAND ${CMAKE_CURRENT_SOURCE_DIR}/bash2md.sh ${PROJECT_SOURCE_DIR}/vagrant/Install-on-Ubuntu-22.sh ${CMAKE_CURRENT_BINARY_DIR}/appendix/Install-on-Ubuntu-22.md
-   COMMAND PYTHONPATH=${PROJECT_SOURCE_DIR} mkdocs build -d ${CMAKE_CURRENT_BINARY_DIR}/../site-html -f ${CMAKE_CURRENT_BINARY_DIR}/../mkdocs.yml
+   COMMAND mkdocs build -d ${CMAKE_CURRENT_BINARY_DIR}/../site-html -f ${CMAKE_CURRENT_BINARY_DIR}/../mkdocs.yml
 )

 ADD_CUSTOM_TARGET(serve-doc
-    COMMAND PYTHONPATH=${PROJECT_SOURCE_DIR} mkdocs serve
-    WORKING_DIRECTORY ${PROJECT_BINARY_DIR}
+    COMMAND mkdocs serve -f ${CMAKE_CURRENT_BINARY_DIR}/../mkdocs.yml
+    WORKING_DIRECTORY ${PROJECT_SOURCE_DIR}
 )
--- a/docs/customize/Tokenizers.md
+++ b/docs/customize/Tokenizers.md
@@ -178,64 +178,72 @@ The following is a list of sanitizers that are shipped with Nominatim.
 ::: nominatim.tokenizer.sanitizers.split_name_list
    selection:
        members: False
-    rendering:
+    options:
        heading_level: 6
+        docstring_section_style: spacy

 ##### strip-brace-terms

 ::: nominatim.tokenizer.sanitizers.strip_brace_terms
    selection:
        members: False
-    rendering:
+    options:
        heading_level: 6
+        docstring_section_style: spacy

 ##### tag-analyzer-by-language

 ::: nominatim.tokenizer.sanitizers.tag_analyzer_by_language
    selection:
        members: False
-    rendering:
+    options:
        heading_level: 6
+        docstring_section_style: spacy

 ##### clean-housenumbers

 ::: nominatim.tokenizer.sanitizers.clean_housenumbers
    selection:
        members: False
-    rendering:
+    options:
        heading_level: 6
+        docstring_section_style: spacy

 ##### clean-postcodes

 ::: nominatim.tokenizer.sanitizers.clean_postcodes
    selection:
        members: False
-    rendering:
+    options:
        heading_level: 6
+        docstring_section_style: spacy

 ##### clean-tiger-tags

 ::: nominatim.tokenizer.sanitizers.clean_tiger_tags
    selection:
        members: False
-    rendering:
+    options:
        heading_level: 6
+        docstring_section_style: spacy

 #### delete-tags

 ::: nominatim.tokenizer.sanitizers.delete_tags
    selection:
        members: False
-    rendering:
+    options:
        heading_level: 6
+        docstring_section_style: spacy

 #### tag-japanese

 ::: nominatim.tokenizer.sanitizers.tag_japanese
    selection:
        members: False
-    rendering:
+    options:
        heading_level: 6
+        docstring_section_style: spacy

 #### Token Analysis

--- a/docs/develop/Development-Environment.md
+++ b/docs/develop/Development-Environment.md
@@ -47,8 +47,8 @@ depending on your choice of webserver framework:
 The documentation is built with mkdocs:

 * [mkdocs](https://www.mkdocs.org/) >= 1.1.2
-* [mkdocstrings](https://mkdocstrings.github.io/) >= 0.16
-* [mkdocstrings-python-legacy](https://mkdocstrings.github.io/python-legacy/)
+* [mkdocstrings](https://mkdocstrings.github.io/) >= 0.18
+* [mkdocstrings-python](https://mkdocstrings.github.io/python/)

 ### Installing prerequisites on Ubuntu/Debian

--- a/docs/develop/ICU-Tokenizer-Modules.md
+++ b/docs/develop/ICU-Tokenizer-Modules.md
@@ -53,21 +53,18 @@ the function.
 ### Sanitizer configuration

 ::: nominatim.tokenizer.sanitizers.config.SanitizerConfig
-    rendering:
-        show_source: no
-        heading_level: 6
+    options:
+        heading_level: 3

 ### The main filter function of the sanitizer

 The filter function receives a single object of type `ProcessInfo`
 which has three members:

- * `place`: read-only information about the place being processed.
+ * `place: PlaceInfo`: read-only information about the place being processed.
   See PlaceInfo below.
- * `names`: The current list of names for the place. Each name is a
-   PlaceName object.
- * `address`: The current list of address names for the place. Each name
-   is a PlaceName object.
+ * `names: List[PlaceName]`: The current list of names for the place.
+ * `address: List[PlaceName]`: The current list of address names for the place.

 While the `place` member is provided for information only, the `names` and
 `address` lists are meant to be manipulated by the sanitizer. It may add and
@@ -77,17 +74,15 @@ adding extra attributes) or completely replace the list with a different one.
 #### PlaceInfo - information about the place

 ::: nominatim.data.place_info.PlaceInfo
-    rendering:
-        show_source: no
-        heading_level: 6
+    options:
+        heading_level: 3


 #### PlaceName - extended naming information

 ::: nominatim.data.place_name.PlaceName
-    rendering:
-        show_source: no
-        heading_level: 6
+    options:
+        heading_level: 3


 ### Example: Filter for US street prefixes
@@ -145,15 +140,13 @@ They can be found in the directory
 ## Custom token analysis module

 ::: nominatim.tokenizer.token_analysis.base.AnalysisModule
-    rendering:
-        show_source: no
-        heading_level: 6
+    options:
+        heading_level: 3


 ::: nominatim.tokenizer.token_analysis.base.Analyzer
-    rendering:
-        show_source: no
-        heading_level: 6
+    options:
+        heading_level: 3

 ### Example: Creating acronym variants for long names

--- a/docs/develop/Tokenizers.md
+++ b/docs/develop/Tokenizers.md
@@ -134,14 +134,14 @@ All tokenizers must inherit from `nominatim.tokenizer.base.AbstractTokenizer`
 and implement the abstract functions defined there.

 ::: nominatim.tokenizer.base.AbstractTokenizer
-    rendering:
-        heading_level: 4
+    options:
+        heading_level: 3

 ### Python Analyzer Class

 ::: nominatim.tokenizer.base.AbstractAnalyzer
-    rendering:
-        heading_level: 4
+    options:
+        heading_level: 3

 ### PL/pgSQL Functions

--- a/docs/mkdocs.yml
+++ b/docs/mkdocs.yml
@@ -59,7 +59,8 @@ plugins:
    - search
    - mkdocstrings:
        handlers:
-          python-legacy:
-            rendering:
-              show_source: false
-              show_signature_annotations: false
+          python:
+            paths: ["${PROJECT_SOURCE_DIR}"]
+            options:
+              show_source: False
+              show_bases: False
--- a/nominatim/tokenizer/base.py
+++ b/nominatim/tokenizer/base.py
@@ -144,8 +144,6 @@ class AbstractTokenizer(ABC):
                tables should be skipped. This option is only required for
                migration purposes and can be safely ignored by custom
                tokenizers.
-
-            TODO: can we move the init_db parameter somewhere else?
        """


@@ -236,8 +234,12 @@ class AbstractTokenizer(ABC):

    @abstractmethod
    def most_frequent_words(self, conn: Connection, num: int) -> List[str]:
-        """ Return a list of the `num` most frequent full words
-            in the database.
+        """ Return a list of the most frequent full words in the database.
+
+            Arguments:
+              conn: Open connection to the database which may be used to
+                    retrieve the words.
+              num: Maximum number of words to return.
        """