This repository was archived by the owner on Sep 29, 2023. It is now read-only.
README.md: 4 additions & 5 deletions
@@ -233,8 +233,7 @@ Completing on the same expressions (`2`, `2+`, `(10*2`) now leads to the followi

 ## Fuzzy completion

-This library also provides special parsers which support fuzzy completion, present in the `FuzzyParsers` trait, by means of the `oneOfTerms` method capable of fuzzing completion on the input to match a set of terms.
-(note that parsing itself obviously requires an exact match and is really fast thanks to a prefix trie lookup on each input char). For instance, with the following dummy grammar:
+This library also provides special parsers which support fuzzy completion, present in the `FuzzyParsers` trait, by means of the `oneOfTerms` method capable of fuzzing completion on the input to match a set of terms (note that parsing itself obviously requires an exact match and is really fast thanks to a prefix trie lookup on each input char). For instance, with the following dummy grammar:

 ```scala
 object Grammar extends FuzzyParsers {
@@ -313,14 +312,14 @@ Below the signature of the `oneOfTerms` method:
 ```

 - `terms`: the list of terms to build the parser for
-- `similarityMeasure`: the string similarity metric to be used. Any `(String, String) => Double` function can be passed in, but the library provides DiceSorensen (default), JaroWinkler, Leenshtein & NgramDistance. Metric choice depends on factors such as type of terms, performance, etc. See below for more information about the underlying data structure.
+- `similarityMeasure`: the string similarity metric to be used. Any `(String, String) => Double` function can be passed in, but the library provides DiceSorensen (default), JaroWinkler, Levenshtein & NgramDistance. Metric choice depends on factors such as type of terms, performance, etc. See below for more information about the underlying data structure.
 - `similarityThreshold`: the minimum similarity score for an entry to be considered as a completion candidate
 - `maxCompletionsCount`: maximum number of completions returned by the parser
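
Since `similarityMeasure` accepts any `(String, String) => Double` function, a custom metric can be a plain function value. The sketch below is only an illustration: `SimilaritySketch` and `commonPrefixSimilarity` are hypothetical names that are not part of the library, and the metric itself (shared prefix length divided by the longer length) is an assumption chosen for simplicity rather than quality.

```scala
object SimilaritySketch {
  // Hypothetical metric, not provided by the library:
  // length of the common prefix divided by the length of the longer string.
  // Returns a score in [0.0, 1.0], where 1.0 means the strings are identical.
  val commonPrefixSimilarity: (String, String) => Double = (a, b) => {
    val longest = math.max(a.length, b.length)
    if (longest == 0) 1.0 // two empty strings are treated as identical
    else a.zip(b).takeWhile { case (x, y) => x == y }.size.toDouble / longest
  }
}
```

A function of this shape could presumably be supplied wherever `similarityMeasure` is expected, with `similarityThreshold` then interpreted on the same 0 to 1 scale.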

 ### Fuzzy matching technique
-For fuzzy completion, terms are decomposed in their trigrams and stored in a map indexed by the corresponding trigrams. This allows fast lookup of a set of completion candidates which share the same trigrams as the input. These candidates are ranked by the number of shared trigrams with the input, and a subset of the highest ranked candidates are kept. These candidates are then re-evaluated with the specified similarity metric (`similarityMeasure`), which is assumed to be more precise (and thus slower).
+For fuzzy completion, terms are decomposed in their trigrams and stored in a map which indexes terms per trigram. This allows fast lookup of a set of completion candidates which share the same trigrams as the input. These candidates are ranked by the number of shared trigrams with the input, and a subset of the highest ranked candidates are kept. This selection of candidates is then re-evaluated with the specified similarity metric (`similarityMeasure`), which is assumed to be more precise (and thus slower).

 The top candidates according to `maxCompletionsCount` are returned as completions.

-Note that terms are affixed so that the starting and ending two characters count more than the others in order to favor completions which start or end with the same characters as the input.
+Note that terms are affixed so that the starting and ending two characters count more than the others, in order to favor completions which start or end with the same characters as the input.
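
The technique described in this hunk can be sketched roughly as follows. This is a minimal illustration and not the library's implementation: the `TrigramSketch` object, its method names, the `preselect` parameter and the two-space padding used as the affix are all assumptions made for the example, and `dice` is a simple Dice-Sorensen coefficient over trigram sets standing in for whichever `similarityMeasure` is configured.

```scala
object TrigramSketch {
  // Assumed affixing scheme: pad with two spaces on each side so that the
  // leading and trailing characters yield boundary-specific trigrams.
  def trigrams(s: String): Set[String] =
    ("  " + s.toLowerCase + "  ").sliding(3).toSet

  // Index: each trigram maps to the set of terms that contain it.
  def buildIndex(terms: Seq[String]): Map[String, Set[String]] =
    terms
      .flatMap(t => trigrams(t).toSeq.map(g => g -> t))
      .groupBy(_._1)
      .map { case (g, pairs) => g -> pairs.map(_._2).toSet }

  // Dice-Sorensen coefficient computed on the trigram sets of both strings.
  def dice(a: String, b: String): Double = {
    val (ta, tb) = (trigrams(a), trigrams(b))
    2.0 * (ta intersect tb).size / (ta.size + tb.size)
  }

  def complete(
      index: Map[String, Set[String]],
      input: String,
      similarity: (String, String) => Double = dice,
      threshold: Double = 0.2,
      preselect: Int = 50, // hypothetical size of the first-pass selection
      maxCompletions: Int = 15
  ): Seq[String] = {
    // First pass: count how many trigrams each candidate shares with the input.
    val shared: Map[String, Int] = trigrams(input).toSeq
      .flatMap(g => index.getOrElse(g, Set.empty[String]).toSeq)
      .groupBy(identity)
      .map { case (term, hits) => term -> hits.size }

    shared.toSeq
      .sortBy { case (term, n) => (-n, term) } // most shared trigrams first
      .take(preselect)
      .map { case (term, _) => term -> similarity(input, term) } // second pass
      .filter(_._2 >= threshold)
      .sortWith((x, y) => x._2 > y._2) // highest similarity first
      .take(maxCompletions)
      .map(_._1)
  }
}
```

For example, with `val index = TrigramSketch.buildIndex(Seq("levenshtein", "jaro", "dice"))`, calling `TrigramSketch.complete(index, "leven")` only considers `levenshtein`, since the other terms share no trigram with the input and are never scored by the slower metric.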