Produce basic plots/graphs for understanding the data
Apply simple k-means methods
Apply hierarchical clustering
Apply expectation maximization (EM)
Apply a new method based on research
Test and Compare methods
Produce visualizations
Create Final Report
Create presentation
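As a rough illustration of the "apply, then test and compare" steps in the checklist above, the sketch below runs k-means, hierarchical clustering, and EM (via a Gaussian mixture) on the same data and compares them with a silhouette score. The data is synthetic, the two-feature layout is an assumption, and the silhouette score is only one possible comparison metric; none of this is the project's actual pipeline.

```python
import numpy as np
from sklearn.cluster import KMeans, AgglomerativeClustering
from sklearn.mixture import GaussianMixture
from sklearn.metrics import silhouette_score

# Synthetic stand-in for per-stock features (assumption: two features per
# stock); real stock data would replace this array.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
features = np.vstack([c + rng.normal(scale=0.5, size=(30, 2)) for c in centers])

n_clusters = 3  # assumed; in practice this would be tuned
labels = {
    "kmeans": KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(features),
    "hierarchical": AgglomerativeClustering(n_clusters=n_clusters).fit_predict(features),
    "em": GaussianMixture(n_components=n_clusters, random_state=0).fit_predict(features),
}

# Silhouette score ranges from -1 to 1; higher means tighter, better
# separated clusters.
scores = {name: silhouette_score(features, lab) for name, lab in labels.items()}
for name, score in scores.items():
    print(f"{name}: silhouette = {score:.3f}")
```

On real stock data the methods will likely disagree more than they do on these well-separated synthetic blobs, which is where the comparison becomes informative.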
Project Plan
Week of 6/7
Change topic from sentiment analysis to stock market clustering and predictions.
Create informal project proposal.
Week of 6/14
Determine as a group which stocks we would like to perform our analysis on. Currently, we plan to analyze S&P 500 stocks.
Determine as a group what time periods we would like to look at in order to avoid outlier years.
Get all group members familiar with scikit-learn and R through individual exploration.
Gather data for all selected stocks and convert it into the format needed for analysis.
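One possible version of the "convert into a format needed for analysis" step is sketched below: a table of closing prices is turned into daily returns and transposed so that each stock becomes one row. The tickers, dates, and prices are made up for illustration, and working with returns instead of raw prices is an assumption, not a decision recorded in this plan.

```python
import pandas as pd

# Hypothetical closing prices; tickers, dates, and values are illustrative only.
prices = pd.DataFrame(
    {
        "AAA": [100.0, 102.0, 101.0, 103.0],
        "BBB": [50.0, 49.0, 50.0, 52.0],
    },
    index=pd.to_datetime(["2000-01-03", "2000-01-04", "2000-01-05", "2000-01-06"]),
)

# Daily percentage returns; the first row has no previous day, so it is NaN
# and gets dropped.
returns = prices.pct_change().dropna()

# One row per stock, one column per trading day: the samples-by-features
# layout that scikit-learn estimators expect.
stock_matrix = returns.T

print(stock_matrix.shape)
```

The resulting table could then be written out with `stock_matrix.to_csv(...)` so that later tasks start from the same file.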
Week of 6/21
Produce visualization graphics using dummy data
Create and test models built with different algorithms
Week of 6/28
Create a visualization that demonstrates our results
If we get good results, experiment with the data and attempt to predict a stock's price from the prices of related stocks. This would be a form of supervised learning: the data is restructured so that the prices of the other stocks in a cluster are the inputs and the price of the target stock is the value to predict.
Begin work on the project progress report.
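The prediction idea planned for this week could look roughly like the sketch below: returns of the other stocks in a cluster are the inputs, and the return of the stock of interest is the target. Everything here is an assumption made for illustration: the data is synthetic, the target is deliberately constructed from its peers so there is something to learn, and linear regression is just one simple choice of model.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n_days = 200

# Synthetic daily returns for three hypothetical peer stocks in one cluster.
peers = rng.normal(scale=0.01, size=(n_days, 3))

# The target is built as a noisy mix of its peers purely so that the sketch
# has signal; real stocks will not be this well behaved.
true_weights = np.array([0.5, 0.3, 0.2])
target = peers @ true_weights + rng.normal(scale=0.001, size=n_days)

# Chronological split: fit on the earlier days, evaluate on the later ones.
split = 150
model = LinearRegression().fit(peers[:split], target[:split])
r2 = model.score(peers[split:], target[split:])

print(f"held-out R^2: {r2:.3f}")
```

Note that this uses same-day peer returns as inputs, so it measures how well the cluster explains the target, not how well it forecasts it. A forecasting version would need to shift the inputs back by at least one day.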
Week of 7/5
Finish project progress report.
Attempt to use alternative algorithms to cluster the data.
Begin work on final project report.
Week of 7/12
Finish final project report.
Begin working on the project presentation.
Individual Tasks
Task 1: Gathering data using R or Python (everyone)
Gather data using R or Python techniques
Save the data in the correct file formats for future analysis
Task 2: Determining the most important attributes to use and what types of machine learning techniques should be implemented (in short, data manipulation)
Analyze the importance of each attribute
Add or remove attributes
Determine what type of algorithms would work best
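For the attribute analysis in Task 2, one option (an assumption, since the plan does not name a technique) is to standardize the attributes and look at how much variance each principal component explains. A component with almost no variance hints that some attribute is redundant and could be removed. The attribute names and data below are hypothetical.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(3)
n_stocks = 100
names = ["attr_a", "attr_b", "attr_c"]  # hypothetical attribute names

# attr_b is constructed to nearly duplicate attr_a; attr_c is independent.
base = rng.normal(size=n_stocks)
attributes = np.column_stack([
    base + rng.normal(scale=0.1, size=n_stocks),
    base + rng.normal(scale=0.1, size=n_stocks),
    rng.normal(size=n_stocks),
])

# Standardize first so no attribute dominates just because of its scale.
scaled = StandardScaler().fit_transform(attributes)
pca = PCA().fit(scaled)

ratios = pca.explained_variance_ratio_
for i, ratio in enumerate(ratios, start=1):
    print(f"component {i}: {ratio:.1%} of variance")
```

Here the last component carries very little variance because two of the three attributes are nearly identical, which is the kind of signal that would justify dropping one of them.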
Task 3: Generating and testing models.
Design and create the optimal models using basic and advanced algorithms
Support Vector Machines
K-Nearest Neighbor
Expectation Maximization
Density-Based Clustering
Test the methods on the data
Modify and optimize the methods based on the testing
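The test-then-optimize loop in Task 3 could be sketched as follows for density-based clustering, using scikit-learn's `DBSCAN` as one concrete choice. The data is synthetic, and the candidate `eps` values, `min_samples=5`, and the use of the silhouette score as the selection criterion are all assumptions made for illustration.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.metrics import silhouette_score

# Two synthetic blobs standing in for real per-stock features.
rng = np.random.default_rng(1)
centers = np.array([[0.0, 0.0], [4.0, 4.0]])
features = np.vstack([c + rng.normal(scale=0.3, size=(40, 2)) for c in centers])

results = {}
for eps in (0.1, 0.3, 0.5, 1.0):
    labels = DBSCAN(eps=eps, min_samples=5).fit_predict(features)
    clustered = labels != -1  # DBSCAN marks noise points with -1
    n_found = len(set(labels[clustered]))
    # The silhouette score needs at least two clusters, so settings that
    # find fewer are skipped.
    if n_found >= 2:
        results[eps] = silhouette_score(features[clustered], labels[clustered])

best_eps = max(results, key=results.get)
print(f"best eps: {best_eps}, silhouette: {results[best_eps]:.3f}")
```

Because the score is computed only on points that were assigned to a cluster, a setting that discards many points as noise can look better than it is. Tracking the fraction of noise points alongside the score would be a sensible addition.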
Task 4: Visualizing results
Find trends in the results
Create charts and graphs to visualize the trends
Create a network structure to represent similarities between different stocks
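A minimal sketch of the network structure mentioned in Task 4, assuming that similarity is measured by the correlation of daily returns and that two stocks are linked when their correlation reaches a cutoff. The tickers, the synthetic returns, and the `0.7` threshold are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
n_days = 250
tickers = ["AAA", "BBB", "CCC", "DDD"]  # hypothetical tickers

# Two hidden "sector" factors drive the synthetic returns: AAA and BBB
# follow the first factor, CCC and DDD follow the second.
factors = rng.normal(size=(n_days, 2))
loadings = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)
returns = factors @ loadings.T + rng.normal(scale=0.3, size=(n_days, 4))

# Pairwise correlation between stocks (columns are stocks, rows are days).
corr = np.corrcoef(returns, rowvar=False)

threshold = 0.7  # assumed cutoff for drawing an edge
edges = [
    (tickers[i], tickers[j], round(float(corr[i, j]), 2))
    for i in range(len(tickers))
    for j in range(i + 1, len(tickers))
    if corr[i, j] >= threshold
]

for a, b, weight in edges:
    print(f"{a} -- {b} (correlation {weight})")
```

The edge list can be handed to a graph library for drawing, for example via networkx's `Graph.add_weighted_edges_from`, if the group decides to use one.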
Task 5: Writing the final report
Combine the visual results with our conclusions to form the final report
Task 6: Create the Presentation
Use charts and graphs to present the trends found in our data and analysis results