In this project we experimented with Apache Spark queries on large datasets such as the MovieLens dataset (https://grouplens.org/datasets/movielens/) and tried to optimise their performance, both on a local cluster and in cloud/server scenarios such as an Apache Livy server (https://livy.apache.org/).
OperaDevelop07/Big-Data-Querying-with-Apache-Spark