Make the penalty dependent on the length of the token: no penalty for one-letter house numbers and a penalty that increases with the number of letters.
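
A minimal sketch of such a length-dependent penalty. The function name `housenumber_penalty`, the linear growth and the step of 0.1 are assumptions for illustration, not the values used in the actual change:

```python
def housenumber_penalty(token: str, step: float = 0.1) -> float:
    """Return a penalty that grows with the token length.

    Hypothetical helper: a one-letter token costs nothing and every
    further letter adds `step` (an assumed value).
    """
    # max() guards against an empty token producing a negative penalty.
    return max(0, len(token) - 1) * step
```

With these assumptions, a token such as `a` is not penalised, while `abc` is penalised twice the step.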
Saves allocating an empty array.
Moving the logic for extending the SearchDescription into the token classes splits up the code and makes it more readable. More importantly, it allows tokenizers to define custom token classes in the future.
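
A rough sketch of the idea, not the actual implementation: each token class knows how to extend the search itself, so the caller needs no per-type branching and a tokenizer can add its own subclass. The `SearchDescription` fields and the `Token`, `NameToken`, `HousenumberToken`, `extend_search` and `build_search` names are hypothetical stand-ins:

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Iterable, List


@dataclass
class SearchDescription:
    """Hypothetical stand-in for the real SearchDescription."""
    penalty: float = 0.0
    names: List[str] = field(default_factory=list)
    housenumbers: List[str] = field(default_factory=list)


class Token(ABC):
    """Base class: every token type extends the search in its own way."""

    def __init__(self, word: str) -> None:
        self.word = word

    @abstractmethod
    def extend_search(self, search: SearchDescription) -> None:
        """Add this token's contribution to the search description."""


class NameToken(Token):
    def extend_search(self, search: SearchDescription) -> None:
        search.names.append(self.word)


class HousenumberToken(Token):
    def extend_search(self, search: SearchDescription) -> None:
        search.housenumbers.append(self.word)
        # Assumed length-dependent penalty: free for one letter.
        search.penalty += 0.1 * max(0, len(self.word) - 1)


def build_search(tokens: Iterable[Token]) -> SearchDescription:
    """Let each token extend the search; no type checks needed here."""
    search = SearchDescription()
    for token in tokens:
        token.extend_search(search)
    return search
```

Under this design, a tokenizer-specific token type would only need to subclass `Token` and implement `extend_search`; `build_search` stays unchanged.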