Build the language model based on n-grams frequencies #31
Open
iosadchiy wants to merge 5 commits into bakwc:master
Owner
Thanks for the PR, good feature! Let me know when you finish; I'll be glad to merge it.
bakwc reviewed May 27, 2018
@@ -0,0 +1,19 @@
# encoding: UTF-8
Owner
Better to rewrite it in Python and put it in the evaluate folder; all useful scripts are stored there for now.
Author
Yep, sure, the model is here
@iosadchiy, did you have time to fix the PR checks error?
This is not intended to be merged, but to discuss whether this feature could be useful.
I was experimenting with n-gram frequencies from Ruscorpora. The idea was to load the frequency files directly into the model:
You can see some short samples of the .csv files here
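For discussion, here is a minimal Python sketch of what loading a frequency file could look like. It is not the script from this PR and does not use any API of this project. It assumes each row holds a count followed by the n-gram's words, separated by tabs; the actual Ruscorpora column layout and delimiter may differ. The name load_ngram_counts and the sample rows are hypothetical.

```python
import csv
import io
from collections import Counter


def load_ngram_counts(lines, delimiter="\t"):
    """Accumulate n-gram counts from rows of the ASSUMED form:
    count, word_1, ..., word_n (the real file layout may differ)."""
    counts = Counter()
    for row in csv.reader(lines, delimiter=delimiter):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        try:
            freq = int(row[0])
        except ValueError:
            continue  # skip a header row or a non-numeric count
        ngram = tuple(w.strip().lower() for w in row[1:] if w.strip())
        if ngram:
            counts[ngram] += freq
    return counts


# Hypothetical sample rows, not taken from the real frequency files.
sample = io.StringIO("1200\tв\tтом\n300\tв\tтом\tчисле\n")
counts = load_ngram_counts(sample)
```

With a real file, the same function could be called on a handle opened with open(path, encoding="utf-8", newline=""). Keying the counter by word tuples lets unigrams, bigrams and trigrams share one table.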
Let me know if this feature can be useful.