rameshagowda/multilabel-text-clssifier

High-level summary for a multi-label, multi-class classifier

1 - Data preprocessing and data wrangling:

  • Use pivoting to convert multi-class, multi-label data into multi-label, binary-class data. This can be done with MultiLabelBinarizer, which acts as a one-hot style encoder that produces one 0/1 column per label.
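As a minimal sketch of this step, assuming scikit-learn's `MultiLabelBinarizer` and a hypothetical toy list of tags (neither the data nor the variable names come from the repository), each label becomes its own binary column:

```python
from sklearn.preprocessing import MultiLabelBinarizer

# Hypothetical toy labels: each document can carry zero or more tags.
labels = [
    ["python", "nlp"],
    ["nlp"],
    ["python", "web"],
    [],
]

mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(labels)  # indicator matrix, shape (n_samples, n_labels)

print(list(mlb.classes_))  # label names, sorted alphabetically
print(Y)                   # one row per document, one 0/1 column per label

# The mapping is reversible, which helps when decoding predictions.
decoded = mlb.inverse_transform(Y)
print(decoded)
```

Each column of `Y` can then be treated as a separate binary classification target.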

2 - Data split: use a train/test split, k-fold cross-validation, or a stratified (balanced) shuffle split:

  • Keep a single training dataset and a single testing dataset.
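The sketch below shows one way to do the split, assuming scikit-learn's `train_test_split` and `KFold`. The texts and label matrix are made-up placeholders, and the split sizes are arbitrary choices. Stratification is left out here because balancing several label columns at once needs extra care.

```python
import numpy as np
from sklearn.model_selection import KFold, train_test_split

# Placeholder data: 10 documents and a random 0/1 label matrix with 3 labels.
texts = np.array([f"document {i}" for i in range(10)])
Y = np.random.RandomState(0).randint(0, 2, size=(10, 3))

# A single held-out test set (20% of the data).
X_train, X_test, Y_train, Y_test = train_test_split(
    texts, Y, test_size=0.2, random_state=42
)

# k-fold cross-validation over the training portion only.
kf = KFold(n_splits=4, shuffle=True, random_state=42)
fold_sizes = [(len(train_idx), len(val_idx)) for train_idx, val_idx in kf.split(X_train)]

print(len(X_train), len(X_test))
print(fold_sizes)
```

Keeping the test set untouched until the end means the cross-validation folds are the only data used for model selection.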

3 - Use OneVsRestClassifier to handle multi-label data by wrapping a base algorithm such as a linear SVM or Naive Bayes.

  • Use a TF-IDF vectorizer to convert the text into numeric values, with appropriate parameters set.
  • Remove English stop words taken from an English corpus.
  • Tune parameters with GridSearchCV.
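The pieces in this step can be combined in one pipeline. The following is a rough sketch, not the repository's actual code: the tiny corpus, the tag names, the parameter grid, and the choice of `LinearSVC` with scikit-learn's built-in `stop_words="english"` list are all assumptions made for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.svm import LinearSVC

# Hypothetical toy corpus with one or two tags per document.
texts = [
    "python pandas dataframe tutorial",
    "train a neural network in python",
    "gradient descent for machine learning",
    "python list comprehension tips",
    "machine learning model evaluation",
    "python scikit-learn machine learning pipeline",
    "deep learning with neural networks",
    "python string formatting guide",
]
tags = [["python"], ["python", "ml"], ["ml"], ["python"],
        ["ml"], ["python", "ml"], ["ml"], ["python"]]

mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(tags)  # columns follow mlb.classes_

pipeline = Pipeline([
    # Text -> TF-IDF features, dropping scikit-learn's built-in English stop words.
    ("tfidf", TfidfVectorizer(stop_words="english")),
    # One binary LinearSVC is fitted per label column.
    ("clf", OneVsRestClassifier(LinearSVC())),
])

# Illustrative grid only; real grids depend on the data.
param_grid = {
    "tfidf__ngram_range": [(1, 1), (1, 2)],
    "clf__estimator__C": [0.1, 1.0],
}

search = GridSearchCV(pipeline, param_grid, scoring="f1_micro", cv=2)
search.fit(texts, Y)

pred = search.predict(["python machine learning"])
print(search.best_params_)
print(mlb.inverse_transform(pred))
```

Note the `clf__estimator__C` name: parameters of the wrapped algorithm are reached through the `estimator` attribute of `OneVsRestClassifier`. Swapping in `MultinomialNB` would follow the same pattern with its own parameter names.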

4 - Model Evaluation

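  • Use precision, recall, and F1 score to evaluate the model.
  • Accuracy is not a suitable measure in the multi-label scenario, because a sample counts as correct only when every one of its labels is predicted exactly.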
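To illustrate the difference between these metrics, here is a small sketch using scikit-learn's metric functions on hand-written label matrices. The matrices are invented for the example, and micro-averaging is just one of several averaging options.

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

# Invented ground truth and predictions: 4 samples, 3 labels.
Y_true = np.array([[1, 0, 1],
                   [0, 1, 0],
                   [1, 1, 0],
                   [0, 0, 1]])
Y_pred = np.array([[1, 0, 0],   # misses one label
                   [0, 1, 0],   # exact match
                   [1, 1, 1],   # one extra label
                   [0, 0, 1]])  # exact match

# Micro-averaging pools true/false positives over all label columns.
precision = precision_score(Y_true, Y_pred, average="micro")
recall = recall_score(Y_true, Y_pred, average="micro")
f1 = f1_score(Y_true, Y_pred, average="micro")

# For multi-label input, accuracy_score is subset accuracy:
# a row counts only if all of its labels match.
subset_accuracy = accuracy_score(Y_true, Y_pred)

print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
print(f"subset accuracy={subset_accuracy:.3f}")
```

Here 5 of the 6 true labels are recovered with a single false positive, so micro precision, recall, and F1 are all 5/6, yet subset accuracy is only 0.5 because two rows are not exact matches.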

About

Multi-label and multi-class text/NLP classifier using Python
