Yangxulei/Big-Data-Specialization


Big-Data-Specialization

EECS598 Course Reading Summaries

| No | Readings | Presenter (original course group) | Critic (original course group) | My summary (in Chinese) |
| --- | --- | --- | --- | --- |
| 1 | Introduction | Mosharaf | | |
| | **Background** | | | |
| 2 | The Datacenter as a Computer (Chapters 1 and 2) | Mosharaf | | |
| | VL2: A Scalable and Flexible Data Center Network (Optional) | | | |
| 3 | The Google File System | Mosharaf | | YangXuLei: (link 1) (2) |
| | MapReduce: Simplified Data Processing on Large Clusters | | | |
| | GFS: Evolution on Fast-Forward (Optional) | | | |
| | **Resource Management** | | | |
| 4 | YARN: Yet Another Resource Negotiator | Matthew-Ayush-HyunJong | ChunJung-TingWei-Vandit* | |
| 5 | Borg, Omega, and Kubernetes (Companion) | | | |
| 6 | Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center (Optional) | | | |
| 7 | Dominant Resource Fairness: Fair Allocation of Multiple Resource Types | Fan-Hasan-Henry | Andrew-William-Zhao* | |
| | Altruistic Scheduling in Multi-Resource Clusters (Companion) | | | |
| | **Dataflow Programming Frameworks and Execution Engines** | | | |
| 8 | Resilient Distributed Datasets: A Fault-tolerant Abstraction for In-memory Cluster Computing | Dong-Jinxiaoyu-Huanyu | Qiyang-Ruying | |
| | Apache Tez: A Unifying Framework for Modeling and Building Data Processing Applications (Companion) | | | |
| 9 | Naiad: A Timely Dataflow System | Die-Chi-Shaowen | Matthew-Ayush-HyunJong | |
| | **Batch Processing** | | | |
| 10 | Spark SQL: Relational Data Processing in Spark | Bor-ChungWen-Hongyu | Dong-Jinxiaoyu-Huanyu | |
| | Major Technical Advancements in Apache Hive (Companion) | | | |
| 11 | Global Analytics in the Face of Bandwidth and Regulatory Constraints | Fan-Hasan-Henry* | Boyu-Rui-Haojun* | |
| | Clarinet: WAN-Aware Optimization for Analytics Queries (Companion) | | | |
| | **Stream Processing** | | | |
| 12 | Discretized Streams: Fault-Tolerant Streaming Computation at Scale | ChunJung-TingWei-Vandit | TaiYing-PeiXuan-Changfeng | |
| | Storm @Twitter (Companion) | | | |
| 13 | Realtime Data Processing at Facebook | TaiYing-PeiXuan-Changfeng | Wen-Eric-Kevin | |
| | Twitter Heron: Stream Processing at Scale (Companion) | | | |
| 14 | StreamScope: Continuous Reliable Distributed Processing of Big Data Streams | Die-Chi-Shaowen* | Bor-ChungWen-Hongyu | |
| | **Graph Processing** | | | |
| 15 | PowerGraph: Distributed Graph-Parallel Computation on Natural Graphs | Wen-Eric-Kevin* | Dong-Jinxiaoyu-Huanyu* | |
| | GraphX: Graph Processing in a Distributed Dataflow Framework (Companion) | | | |
| | **Machine Learning** | | | |
| 16 | Scaling Distributed Machine Learning with the Parameter Server | Qiyang-Ruying | Die-Chi-Shaowen | |
| | Project Adam: Building an Efficient and Scalable Deep Learning Training System (Optional) | | | |
| 17 | TensorFlow: A System for Large-Scale Machine Learning | Wen-Eric-Kevin | Boyu-Rui-Haojun | |
| 18 | TuX2: Distributed Graph Computation for Machine Learning | Andrew-William-Zhao | Wenting-Peter | |
| 19 | Gaia: Geo-Distributed Machine Learning Approaching LAN Speeds | Matthew-Ayush-HyunJong* | Fan-Hasan-Henry | |
| 20 | Mid-Semester Presentations | | | |
| | **Approximate Query Processing** | | | |
| 21 | BlinkDB: Queries with Bounded Errors and Bounded Response Times on Very Large Data | Boyu-Rui-Haojun | Wenting-Peter* | |
| 22 | Quickr: Lazily Approximating Complex AdHoc Queries in BigData Clusters | Andrew-William-Zhao | Bor-ChungWen-Hongyu* | |
| | **RDMA-Enabled Systems** | | | |
| 23 | FaRM: Fast Remote Memory | Qiyang-Ruying* Yiwen** | TaiYing-PeiXuan-Changfeng* | |
| | No Compromises: Distributed Transactions with Consistency, Availability, and Performance (Companion) | | | |
| 24 | Efficient Memory Disaggregation with Infiniswap | Wenting-Peter | ChunJung-TingWei-Vandit | |

About

Big data for Data Engineers Specialization
