Text-processing-topic-labelling

The author implemented logistic regression for topic labelling with two feature-extraction methods, Bag-of-Words (CountVectorizer) and TF-IDF (TfidfVectorizer), and then analysed the results for both. Both methods achieved an accuracy of 96%.
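The setup described above can be sketched with scikit-learn as follows. This is a minimal illustration, not the author's code: the toy corpus, the `sport`/`tech` labels, and the `max_iter=1000` setting are assumptions made for the example, and the 96% figure comes from the project's own dataset, which is not reproduced here.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.pipeline import make_pipeline

# Hypothetical toy corpus; the project's real dataset is not shown here.
train_texts = [
    "the team won the football match",
    "the striker scored a late goal",
    "fans cheered as the team lifted the trophy",
    "the new phone has a faster processor",
    "the laptop ships with more memory and the latest chip",
    "the software update fixes the security bug",
]
train_labels = ["sport", "sport", "sport", "tech", "tech", "tech"]

test_texts = [
    "the team scored a goal in the match",
    "the phone update improves the processor",
]
test_labels = ["sport", "tech"]

# One pipeline per feature extraction, with the same classifier settings.
pipelines = {
    "Bag-of-Words": make_pipeline(CountVectorizer(), LogisticRegression(max_iter=1000)),
    "TF-IDF": make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000)),
}

accuracies = {}
for name, pipeline in pipelines.items():
    pipeline.fit(train_texts, train_labels)
    predicted = pipeline.predict(test_texts)
    accuracies[name] = accuracy_score(test_labels, predicted)
    print(f"{name}: accuracy = {accuracies[name]:.2f}")
```

Wrapping the vectorizer and the classifier in one pipeline means the vocabulary is learned only from the training texts, so the two feature extractions are compared under the same conditions.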

The author improved on the result of the previous approach by implementing a different machine learning classifier (support vector machine) with the same two feature-extraction methods, Bag-of-Words (CountVectorizer) and TF-IDF (TfidfVectorizer). The results will be analysed and discussed.
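A support vector machine can be swapped in for the classifier while keeping the feature extraction unchanged. The sketch below assumes a linear SVM (`LinearSVC` from scikit-learn); the text does not say which kernel or SVM variant the author used, and the toy corpus and labels are hypothetical.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy corpus; the project's real dataset is not shown here.
train_texts = [
    "the team won the football match",
    "the striker scored a late goal",
    "fans cheered as the team lifted the trophy",
    "the new phone has a faster processor",
    "the laptop ships with more memory and the latest chip",
    "the software update fixes the security bug",
]
train_labels = ["sport", "sport", "sport", "tech", "tech", "tech"]

new_texts = [
    "the team scored a goal in the match",
    "the phone update improves the processor",
]

# Assumption: a linear SVM; the source only says "support vector machine".
svm_predictions = {}
for name, vectorizer in (
    ("Bag-of-Words", CountVectorizer()),
    ("TF-IDF", TfidfVectorizer()),
):
    model = make_pipeline(vectorizer, LinearSVC())
    model.fit(train_texts, train_labels)
    svm_predictions[name] = model.predict(new_texts).tolist()
    print(name, svm_predictions[name])
```

A linear kernel is a common choice for sparse, high-dimensional text features, but a kernelised `SVC` could be substituted in the same pipeline.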

The author will further analyse and critically appraise the performance of the logistic regression and support vector machine methods for topic labelling using the same feature extractions, and discuss the suitability, advantages and drawbacks of these methods for text analysis.
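One way to put the two classifiers and the two feature extractions on an equal footing is to cross-validate all four combinations on the same data. The sketch below is an assumption about how such a comparison might be set up, not the author's evaluation procedure; the corpus, the labels, the 3-fold split, and the use of `LinearSVC` as the SVM are all hypothetical choices.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy corpus; the project's real dataset is not shown here.
texts = [
    "the team won the football match",
    "the striker scored a late goal",
    "fans cheered as the team lifted the trophy",
    "the new phone has a faster processor",
    "the laptop ships with more memory and the latest chip",
    "the software update fixes the security bug",
]
labels = ["sport", "sport", "sport", "tech", "tech", "tech"]

# Factories so that every combination gets fresh, unfitted estimators.
vectorizers = {"Bag-of-Words": CountVectorizer, "TF-IDF": TfidfVectorizer}
classifiers = {
    "Logistic regression": lambda: LogisticRegression(max_iter=1000),
    "Linear SVM": LinearSVC,
}

mean_accuracy = {}
for clf_name, make_clf in classifiers.items():
    for vec_name, make_vec in vectorizers.items():
        pipeline = make_pipeline(make_vec(), make_clf())
        # Stratified 3-fold cross-validation; the vectorizer is refitted
        # on the training part of each fold because it is in the pipeline.
        scores = cross_val_score(pipeline, texts, labels, cv=3, scoring="accuracy")
        mean_accuracy[(clf_name, vec_name)] = scores.mean()
        print(f"{clf_name} + {vec_name}: {scores.mean():.2f}")
```

On a corpus this small the scores are not meaningful; the point is only that every combination is trained and scored on identical folds, which makes differences between methods easier to attribute.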