add sanitizer for TIGER tags

Currently it only takes over cleaning the tiger:county data, which
was done by the import until now.
Sarah Hoffmann
2022-11-22 17:10:21 +01:00
parent 55ee08f42b
commit fd3dec8efe
5 changed files with 100 additions and 1 deletions

@@ -35,6 +35,7 @@ sanitizers:
     - step: clean-postcodes
       convert-to-address: yes
       default-pattern: "[A-Z0-9- ]{3,12}"
+    - step: clean-tiger-tags
     - step: split-name-list
     - step: strip-brace-terms
     - step: tag-analyzer-by-language
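
The commit message says the new step cleans tiger:county data, but the sanitizer's code is not part of this hunk. As a rough illustration only, the sketch below assumes that tiger:county values have the form "<county name>, <two-letter state code>" and that cleaning means dropping the state suffix. The function name `clean_tiger_county`, the regular expression, and the use of a plain dict are hypothetical and not taken from the commit.

```python
import re

# Assumption: raw values look like "Jefferson, AL" (name, comma, state code).
COUNTY_WITH_STATE = re.compile(r'(.*), [A-Z][A-Z]')


def clean_tiger_county(tags):
    """Return a copy of `tags` with the state suffix dropped from tiger:county.

    Hypothetical helper for illustration; it works on a plain dict of
    tag names to values rather than on any real importer data structure.
    """
    cleaned = dict(tags)
    value = cleaned.get('tiger:county')
    if value is not None:
        match = COUNTY_WITH_STATE.fullmatch(value)
        if match:
            # Keep only the part before ", ST".
            cleaned['tiger:county'] = match.group(1)
    return cleaned
```

Values that do not match the assumed pattern are left untouched, so the sketch is safe to apply to tags that were already cleaned.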