This repository was archived by the owner on Sep 29, 2023. It is now read-only.

Commit c1618dd

author
Jonas Chapuis
committed
improvements in README.md
1 parent e1e9b35 commit c1618dd

1 file changed

+4
-5
lines changed

README.md

Lines changed: 4 additions & 5 deletions
@@ -233,8 +233,7 @@ Completing on the same expressions (`2`, `2+`, `(10*2`) now leads to the followi
 
 ## Fuzzy completion
 
-This library also provides special parsers which support fuzzy completion, present in the `FuzzyParsers` trait, by means of the `oneOfTerms` method capable of fuzzing completion on the input to match a set of terms.
-(note that parsing itself obviously requires an exact match and is really fast thanks to a prefix trie lookup on each input char). For instance, with the following dummy grammar:
+This library also provides special parsers which support fuzzy completion, present in the `FuzzyParsers` trait, by means of the `oneOfTerms` method capable of fuzzing completion on the input to match a set of terms (note that parsing itself obviously requires an exact match and is really fast thanks to a prefix trie lookup on each input char). For instance, with the following dummy grammar:
 
 ```scala
 object Grammar extends FuzzyParsers {
@@ -313,14 +312,14 @@ Below the signature of the `oneOfTerms` method:
 ```
 
 - `terms`: the list of terms to build the parser for
-- `similarityMeasure`: the string similarity metric to be used. Any `(String, String) => Double` function can be passed in, but the library provides DiceSorensen (default), JaroWinkler, Leenshtein & NgramDistance. Metric choice depends on factors such as type of terms, performance, etc. See below for more information about the underlying data structure.
+- `similarityMeasure`: the string similarity metric to be used. Any `(String, String) => Double` function can be passed in, but the library provides DiceSorensen (default), JaroWinkler, Levenshtein & NgramDistance. Metric choice depends on factors such as type of terms, performance, etc. See below for more information about the underlying data structure.
 - `similarityThreshold`: the minimum similarity score for an entry to be considered as a completion candidate
 - `maxCompletionsCount`: maximum number of completions returned by the parser
 
 ### Fuzzy matching technique
-For fuzzy completion, terms are decomposed in their trigrams and stored in a map indexed by the corresponding trigrams. This allows fast lookup of a set of completion candidates which share the same trigrams as the input. These candidates are ranked by the number of shared trigrams with the input, and a subset of the highest ranked candidates are kept. These candidates are then re-evaluated with the specified similarity metric (`similarityMeasure`), which is assumed to be more precise (and thus slower).
+For fuzzy completion, terms are decomposed in their trigrams and stored in a map which indexes terms per trigram. This allows fast lookup of a set of completion candidates which share the same trigrams as the input. These candidates are ranked by the number of shared trigrams with the input, and a subset of the highest ranked candidates are kept. This selection of candidates is then re-evaluated with the specified similarity metric (`similarityMeasure`), which is assumed to be more precise (and thus slower).
 
 The top candidates according to `maxCompletionsCount` are returned as completions.
 
-Note that terms are affixed so that the starting and ending two characters count more than the others in order to favor completions which start or end with the same characters as the input.
+Note that terms are affixed so that the starting and ending two characters count more than the others, in order to favor completions which start or end with the same characters as the input.
 
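For readers of this diff, the fuzzy matching technique described in the changed README text can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the library's implementation: the object `TrigramIndexSketch`, the helpers `trigrams`, `buildIndex`, `diceOnTrigrams` and `complete`, the `^^`/`$$` padding, the `maxCandidates` parameter and all default values are hypothetical. Only the parameter names `similarityMeasure`, `similarityThreshold` and `maxCompletionsCount` come from the README.

```scala
// Illustrative sketch only: every name here is an assumption, not the library's code.
object TrigramIndexSketch {

  // Pad the term so that its start and end produce extra, boundary-specific trigrams.
  // The padding characters are an arbitrary choice made for this sketch.
  def trigrams(term: String): Set[String] =
    ("^^" + term.toLowerCase + "$$").sliding(3).toSet

  // Build a map which indexes terms per trigram.
  def buildIndex(terms: Seq[String]): Map[String, Set[String]] =
    terms
      .flatMap(term => trigrams(term).toSeq.map(trigram => trigram -> term))
      .groupBy(_._1)
      .map { case (trigram, pairs) => trigram -> pairs.map(_._2).toSet }

  // Stand-in similarity measure: Dice coefficient over the padded trigram sets.
  // Any (String, String) => Double function could be used instead.
  def diceOnTrigrams(a: String, b: String): Double = {
    val (ta, tb) = (trigrams(a), trigrams(b))
    2.0 * (ta intersect tb).size / (ta.size + tb.size)
  }

  def complete(
      index: Map[String, Set[String]],
      input: String,
      similarityMeasure: (String, String) => Double = diceOnTrigrams,
      similarityThreshold: Double = 0.2,
      maxCandidates: Int = 50, // hypothetical size of the intermediate shortlist
      maxCompletionsCount: Int = 3
  ): Seq[String] = {
    // 1. Look up every term sharing at least one trigram with the input,
    //    counting how many trigrams each term shares.
    val sharedCounts: Map[String, Int] =
      trigrams(input).toSeq
        .flatMap(trigram => index.getOrElse(trigram, Set.empty[String]).toSeq)
        .groupBy(identity)
        .map { case (term, hits) => term -> hits.size }

    // 2. Keep only the candidates sharing the most trigrams.
    val shortlist: Seq[String] =
      sharedCounts.toSeq
        .sortBy { case (term, count) => (-count, term) }
        .take(maxCandidates)
        .map(_._1)

    // 3. Re-evaluate the shortlist with the (presumably slower) similarity measure,
    //    drop scores below the threshold and return the best completions.
    shortlist
      .map(term => term -> similarityMeasure(input, term))
      .filter { case (_, score) => score >= similarityThreshold }
      .sortWith { case ((t1, s1), (t2, s2)) => s1 > s2 || (s1 == s2 && t1 < t2) }
      .take(maxCompletionsCount)
      .map(_._1)
  }
}
```

With an index built from `Seq("voltage", "voltmeter", "volume", "current")`, calling `TrigramIndexSketch.complete(index, "voltge")` in this sketch ranks `voltage` first, while `current` is never scored because it shares no trigram with the input. The two-stage design keeps the expensive metric off most of the term list: the trigram lookup is a cheap filter, and only the shortlist pays for the precise comparison.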