0.3.1 2026-02-11
- Introduce hyphenated abbreviations in German tokenizer.
- Support Wikipedia templates.
- Introduced multiple gender forms for nouns
in German tokenizer.
(from KorAP-Tokenizer)
- Added short forms for determiners, adjectives, pronouns
"eine(n)", "gute:r", "ihm/r", "diese(r)", "ein(e)"
0.2.2 2023-09-06
- Fix behaviour for end of text character positions
when no end of sentence occurred before.
0.2.1 2023-09-05
- Add English tokenizer.
- Fix buffer bug.
- Improve Readme.
- Minor performance improvements.
0.1.7 2023-02-28
- Add dependabot checks.
- Add update command.
0.1.6 2022-04-14
- Rename TOKEN_SYMBOL to TOKEN_BOUND.
0.1.5 2022-03-28
- Improve Emoticon-List.
0.1.4 2022-03-27
- Improved handling of ellipsis.
- Make algorithm more robust so that it never fails.
- Remove match option.
0.1.3 2022-03-08
- Introduced refined handling of sentences including speech.
0.1.2 2021-12-07
- Improve performance of rune to symbol conversion in transduction
method.
- Support Plusampersand word list in compounds.