Pearlina16/SC4020-Data-Mining

SC4020-Data-Mining

Project1: Similarity Search

Implemented two similarity search methods, Sentence2Vec (Sent2Vec) and Bidirectional Encoder Representations from Transformers (BERT), on two datasets taken from Kaggle. The models were trained on the training data, fine-tuned using the validation data, and assessed on the test data to compare the performance of the two methods. Our findings highlight the strengths and limitations of both models for similarity search tasks: BERT's contextual embeddings generally lead to better performance on complex datasets, while Sent2Vec provides a computationally efficient alternative with comparable performance on simpler datasets.
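
Both methods end up comparing sentences through their embedding vectors. The sketch below shows one common way to do that, assuming cosine similarity as the metric (the README does not say which metric the project used). The toy 3-dimensional vectors and the helper names `cosine_similarity` and `top_k_similar` are illustrative only; real Sent2Vec or BERT embeddings would have hundreds of dimensions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention for zero vectors, chosen for this sketch
    return dot / (norm_u * norm_v)


def top_k_similar(query_vec, corpus_vecs, k=2):
    """Return (index, score) pairs for the k corpus vectors closest to the query."""
    scored = [
        (cosine_similarity(query_vec, vec), idx)
        for idx, vec in enumerate(corpus_vecs)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(idx, score) for score, idx in scored[:k]]


# Hypothetical embeddings standing in for encoded sentences.
corpus = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0]]
query = [1.0, 0.05, 0.0]

results = top_k_similar(query, corpus, k=2)
for idx, score in results:
    print(f"corpus[{idx}] similarity={score:.4f}")
```

The same ranking step works regardless of which model produced the vectors, so the two methods can be compared by swapping only the embedding stage.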

Project2: Apriori Algorithm

a. Analysis of Co-occurrence Patterns of Points of Interest (POI)

b. Mining Sequential Patterns

c. Open Advanced Task

    Problem Statement: Predict the user's next location
    Dataset: Triplegs generated in task 2
    Solution: Feed the triplegs generated in task 2 into an LSTM model to predict the user's next location.
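
To make the Apriori step behind this project concrete, here is a minimal sketch of frequent-itemset mining as it might apply to the POI co-occurrence analysis in part a. The `visits` transactions, the POI category names, and the `min_support` threshold are invented for illustration and are not taken from the project data; the project's own implementation may differ.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: count} for every itemset seen in >= min_support transactions."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count individual items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset of a candidate must be frequent.
            if all(frozenset(sub) in current for sub in combinations(union, k - 1)):
                candidates.add(union)

        # Count how many transactions contain each surviving candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1

    return frequent


# Hypothetical transactions: POI categories found near each other.
visits = [
    {"cafe", "park", "gym"},
    {"cafe", "park"},
    {"cafe", "gym"},
    {"park", "museum"},
    {"cafe", "park", "museum"},
]

frequent = apriori(visits, min_support=3)
for itemset, count in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

With a threshold of 3, only `cafe`, `park`, and the pair `{cafe, park}` survive, which shows how the prune step keeps the candidate set small as itemsets grow.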
