Nominatim

Author	SHA1	Message	Date
Sarah Hoffmann	07c2907064	split normalized word when transliteration is split up	2025-09-08 22:58:01 +02:00
Sarah Hoffmann	621d8e785b	Merge pull request #3779 from lonvia/fix-zero-devision-direction Fix direction factor computation on empty strings	2025-07-11 14:51:00 +02:00
Sarah Hoffmann	21ef3be433	fix direction factor computation on empty strings	2025-07-11 11:25:14 +02:00
Sarah Hoffmann	fe30663b21	remove penalty from TokenRanges The parameter is no longer needed.	2025-07-11 11:01:22 +02:00
Sarah Hoffmann	4634ad0720	rebalance word transition penalties	2025-07-11 11:01:21 +02:00
Sarah Hoffmann	4a9253a0a9	simplify QueryNode penalty and initial assignment	2025-07-11 11:01:09 +02:00
Sarah Hoffmann	7f710d2394	add a comment about the precomputed denominator	2025-04-15 09:38:05 +02:00
Sarah Hoffmann	06e39e42d8	add direction penalties Direction penalties are estimated by getting the name to address ratio usage for each partial term in the query and computing the linear regression of that ratio over the entire phrase. Or to put it in other words: we try to determine if the terms at the beginning or the end of the query are more likely to constitute a name. Direction penalties are currently used only in classic name queries.	2025-04-11 20:41:06 +02:00
Sarah Hoffmann	2ef0e20a3f	reorganise token reranking As the reranking is about changing penalties in presence of other tokens, change the data structure to have the other tokens readily available.	2025-04-11 13:38:34 +02:00
Sarah Hoffmann	e0e067b1d6	replace use of range when computing word list	2025-04-11 09:59:04 +02:00
Sarah Hoffmann	3980791cfd	use iterator instead of list to go over partials	2025-04-11 09:38:24 +02:00
Sarah Hoffmann	497e27bb9a	move partial token into a separate field in the query struct There is exactly one token to be expected and the token is usually present.	2025-04-11 08:57:34 +02:00
Sarah Hoffmann	6759edfb5d	make word generation from query a class method	2025-03-04 08:57:37 +01:00
Sarah Hoffmann	e362a965e1	search: merge QueryPart array with QueryNodes The basic information on terms is pretty much always used together with the node information. Merging them together saves some allocation while making lookup easier at the same time.	2025-03-04 08:57:37 +01:00
Sarah Hoffmann	49bd18b048	replace PhraseType enum with simple int constants	2025-02-21 16:44:12 +01:00
Sarah Hoffmann	31412e0674	replace TokenType enum with simple char constants	2025-02-21 10:23:41 +01:00
Sarah Hoffmann	4577669213	replace BreakType enum with simple char constants	2025-02-21 09:57:48 +01:00
Sarah Hoffmann	d984100e23	add inner word break penalty	2025-01-07 21:42:25 +01:00
Sarah Hoffmann	499110f549	add SOFT_PHRASE break and enable parsing Also enables parsing of PART breaks.	2025-01-06 17:10:24 +01:00
Sarah Hoffmann	1f07967787	fix style issue found by flake8	2024-11-10 22:47:14 +01:00
Sarah Hoffmann	a690605a96	remove support for unindexed tokens This was a special feature of the legacy tokenizer, which would not index very frequent tokens.	2024-09-22 10:39:10 +02:00
Sarah Hoffmann	6e89310a92	split code into submodules	2024-06-26 11:52:47 +02:00

22 Commits
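
The message of commit 06e39e42d8 describes direction penalties as a linear regression of the name-to-address usage ratio over the phrase. The following is a minimal sketch of that idea, not the code from the repository: the function name `direction_slope`, the input shape (one `(name_count, address_count)` pair per partial term), the definition of the ratio as `name / (name + address)` and the neutral value 0.5 for terms without counts are all assumptions made for illustration. The early return for short inputs and the closed-form denominator only echo the themes of commits 21ef3be433 and 7f710d2394 and may differ from what those commits actually do.

```python
from typing import Sequence, Tuple


def direction_slope(counts: Sequence[Tuple[int, int]]) -> float:
    """Least-squares slope of the name-usage ratio over term position.

    `counts` holds one hypothetical (name_count, address_count) pair per
    partial term, in query order. A negative slope suggests that name-like
    terms sit at the beginning of the query, a positive slope that they
    sit at the end.
    """
    ratios = []
    for name_count, addr_count in counts:
        total = name_count + addr_count
        # Assumption: a term without any usage data is treated as neutral.
        ratios.append(name_count / total if total > 0 else 0.5)

    n = len(ratios)
    if n < 2:
        # No slope can be fitted to zero or one point; this also avoids
        # dividing by zero for empty input.
        return 0.0

    mean_x = (n - 1) / 2
    mean_y = sum(ratios) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ratios))
    # For positions 0..n-1, sum((x - mean_x)**2) equals n * (n**2 - 1) / 12,
    # so the denominator depends only on the number of terms.
    denominator = n * (n * n - 1) / 12
    return numerator / denominator
```

With this sketch, `direction_slope([(10, 0), (0, 10)])` is negative (the first term is used only in names), while an empty query yields 0.0 and therefore no preference for either direction.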