<!DOCTYPE html>
<html lang="en-US">
<head>
<title>Chronos: Case Bibliography</title>
<meta charset="utf-8" />
<meta content="width=device-width, initial-scale=1" name="viewport" />
<link rel="stylesheet" href="css/bibliography.css">
<link rel="shortcut icon" type="image/png" href="/favicon.png" />
<script src="https://cdn.jsdelivr.net/npm/waypoints@4.0.1/lib/noframework.waypoints.min.js"></script>
<script src="javascripts/bibWaypoints.js"></script>
</head>
<body>
<nav>
<a href="index.html" class="hvr-bob"></a>
<ul>
<li><a href="casestudy.html" class="hvr-float-shadow">Case Study</a></li>
<li><a href="bibliography.html" class="hvr-float-shadow">Bibliography</a></li>
<li><a href="team.html" class="hvr-float-shadow">Team</a></li>
<li><a href="https://github.com/chronos-project" class="hvr-float-shadow"><img src="images/icons/chronos_github_gray.png" alt="GitHub logo" /></a></li>
</ul>
</nav>
<main>
<nav id="toc">
<ul>
<li class="h2 active" id="devops-toc" data-element="devops">DevOps
<ul>
<li class="h3" id="devops-blog-posts-toc" data-element="devops-blog-posts">Blog Posts</li>
</ul>
</li>
<li class="h2" id="docker-toc" data-element="docker">Docker
<ul>
<li class="h3" id="docker-blog-posts-toc" data-element="docker-blog-posts">Blog Posts</li>
<li class="h3" id="docker-courses-toc" data-element="docker-courses">Courses</li>
<li class="h3" id="docker-videos-toc" data-element="docker-videos">Videos</li>
</ul>
</li>
<li class="h2" id="event-data-metadata-analytics-toc" data-element="event-data-metadata-analytics">Event Data/Analytics
<ul>
<li class="h3" id="event-data-blog-posts-toc" data-element="event-data-blog-posts">Blog Posts</li>
<li class="h3" id="event-data-videos-toc" data-element="event-data-videos">Videos</li>
<li class="h3" id="event-data-whitepapers-toc" data-element="event-data-whitepapers">Whitepapers</li>
</ul>
</li>
<li class="h2" id="event-series-time-series-databases-toc" data-element="event-series-time-series-databases">Event/Time DBs
<ul>
<li class="h3" id="event-time-blog-posts-toc" data-element="event-time-blog-posts">Blog Posts</li>
<li class="h3" id="event-time-slideshows-toc" data-element="event-time-slideshows">Slideshows</li>
<li class="h3" id="event-time-videos-toc" data-element="event-time-videos">Videos</li>
<li class="h3" id="event-time-whitepapers-toc" data-element="event-time-whitepapers">Whitepapers</li>
</ul>
</li>
<li class="h2" id="event-streaming-architecture-toc" data-element="event-streaming-architecture">Streaming Architecture
<ul>
<li class="h3" id="streaming-books-toc" data-element="streaming-books">Books</li>
<li class="h3" id="streaming-blog-posts-toc" data-element="streaming-blog-posts">Blog Posts</li>
<li class="h3" id="streaming-videos-toc" data-element="streaming-videos">Videos</li>
<li class="h3" id="streaming-podcasts-toc" data-element="streaming-podcasts">Podcasts</li>
<li class="h3" id="streaming-whitepapers-toc" data-element="streaming-whitepapers">Whitepapers</li>
</ul>
</li>
<li class="h2" id="kafka-toc" data-element="kafka">Kafka
<ul>
<li class="h3" id="kafka-books-toc" data-element="kafka-books">Books</li>
<li class="h3" id="kafka-blog-posts-toc" data-element="kafka-blog-posts">Blog Posts</li>
<li class="h3" id="kafka-courses-toc" data-element="kafka-courses">Courses</li>
<li class="h3" id="kafka-videos-toc" data-element="kafka-videos">Videos</li>
</ul>
</li>
<li class="h2" id="testing-and-benchmarking-toc" data-element="testing-and-benchmarking">Testing/Benchmarking
<ul>
<li class="h3" id="testing-blog-posts-toc" data-element="testing-blog-posts">Blog Posts</li>
<li class="h3" id="testing-projects-toc" data-element="testing-projects">Projects</li>
</ul>
</li>
</ul>
</nav>
<article>
<div id='markdown'>
<h1 id="bibliography">Bibliography</h1>
<h2 id="devops">DevOps</h2>
<h3 id="devops-blog-posts">Blog Posts</h3>
<p class="resource"><a href="https://blog.gruntwork.io/5-lessons-learned-from-writing-over-300-000-lines-of-infrastructure-code-36ba7fadeac1">5 Lessons Learned From Writing Over 300,000 Lines of Infrastructure Code: A concise masterclass on how to write infrastructure code</a> (Yevgeniy Brikman)</p>
<h2 id="docker">Docker</h2>
<p class="resource"><em>Note: Many of these resources were shared with us by the <a href="https://spacecraft-repl.com/">SpaceCraft REPL team</a>. A hat tip to them!</em></p>
<h3 id="docker-blog-posts">Blog Posts</h3>
<p class="resource"><a href="https://medium.com/@deeksha.sharma25/import-sql-dump-in-postgres-docker-container-in-10-min-ca16d9264f86">Import SQL dump in postgres docker container in 10 min</a> (Deeksha Sharma)</p>
<h3 id="docker-courses">Courses</h3>
<p class="resource"><a href="https://www.udemy.com/docker-and-kubernetes-the-complete-guide/learn/v4/overview">Docker and Kubernetes: The Complete Guide</a> (Stephen Grider)</p>
<h3 id="docker-videos">Videos</h3>
<p class="resource"><a href="https://www.youtube.com/watch?v=TvnZTi_gaNc">Virtual Machines vs Docker Containers - Dive Into Docker</a> (Nick Janetakis)</p>
<p><em>Succinct and clear explanation; analogy at the end is quite useful</em></p>
<p class="resource"><a href="https://www.youtube.com/watch?v=EnJ7qX9fkcU&list=PL7bmigfV0EqQt5_pBPQ8tsZjI1w68-e0H&index=1">What is a Container?</a> (Ben Corrie)</p>
<p><em>Deeper dive into what a container is, which is useful given that it is an overloaded term. Visuals are very helpful</em></p>
<h2 id="event-data-metadata-analytics">Event Data / Metadata / Analytics</h2>
<h3 id="event-data-blog-posts">Blog Posts</h3>
<p class="resource"><a href="https://blog.keen.io/analytics-for-hackers-how-to-think-about-event-data/">Analytics For Hackers: How To Think About Event Data</a> (Michelle Wetzler)</p>
<p><em>Great overview of event data and how it differs from entity data</em></p>
<p class="resource"><a href="https://blog.keen.io/event-data-vs-entity-data-how-to-store-user-properties-in-keen-io/">Event Data vs Entity Data — How to store user properties in Keen IO</a> (Michelle Wetzler)</p>
<p><em>Talks about how to store entity data in an event storage database (specifically Keen.io). Also has a brief description of event data vs entity data</em></p>
<p class="resource"><a href="https://snowplowanalytics.com/blog/2016/03/16/introduction-to-event-data-modeling/">An introduction to event data modeling</a> (Yali Sassoon)</p>
<p><em>Good for its distinction between atomic data and modeled data. Lots of info on how we work with aggregated modeled data</em></p>
<h3 id="event-data-videos">Videos</h3>
<p class="resource"><a href="https://www.youtube.com/watch?v=tBLWw-C3OdM">(Event) Data is Everywhere</a> (Taylor Barnett)</p>
<p><em>Brief introduction to event data. Largely useful for her clear definition of event data and how it differs from entity data</em></p>
<h3 id="event-data-whitepapers">Whitepapers</h3>
<p class="resource"><a href="https://learn.keen.io/build-vs-buy">Build vs. Buy Gets Easier with APIs: A CTO's Guide to Getting Data Strategy Right</a> (Keen IO)</p>
<p class="resource"><a href="http://info.heapanalytics.com/rs/622-XIP-837/images/Heap-Build-Your-Own-CDP-White-Paper.pdf">The Death of Web Analytics</a> (Heap Analytics)</p>
<p class="resource"><a href="https://mixpanel.com/solutions/infrastructure/">Mixpanel System Architecture</a> (Vijay Jayaram)</p>
<p class="resource"><a href="https://arxiv.org/pdf/1406.2015.pdf">MOOCdb: Developing Standards and Systems to Support MOOC Data Science</a> (Kalyan Veeramachaneni et al.)</p>
<p><em>An account of a "solution to centralizing and generalizing MOOC data organization". Essentially, the group develops a set of schemas to handle event data, i.e. online students' engagement with their web-based courses</em></p>
<p class="resource"><a href="https://www.researchgate.net/publication/286732446_Time_Series_Databases">Time Series Databases</a> (Dmitry Namiot)</p>
<h2 id="event-series-time-series-databases">Event-series/Time-series Databases</h2>
<h3 id="event-time-blog-posts">Blog Posts</h3>
<p class="resource"><a href="https://www.percona.com/blog/2016/12/14/row-store-and-column-store-databases">Row Store and Column Store Databases</a> (Rick Golba)</p>
<p><em>Clear discussion of the relative advantages and common use cases of columnar and row-based databases. Emphasis on transaction and query type (and not on, e.g., scalability or consistency)</em></p>
<p class="resource"><a href="https://blog.slicingdice.com/">SlicingDice.com Blog</a> (SlicingDice)</p>
<p><em>SlicingDice is a competitor of Keen.io. The company's blog has a series of quite dense posts about their infrastructure. They also have a post with a lot of links to concepts (usually wiki pages), books, and white papers that helped them to build their time-based database</em></p>
<p class="resource"><a href="https://dzone.com/articles/dzone-research-sql-or-nosql-that-is-the-question">SQL or NoSQL, That Is the Question</a> (Jordan Baker)</p>
<p><em>Survey results showing that most developers still use SQL, but that NoSQL is on the rise</em></p>
<p class="resource"><a href="https://www.xaprb.com/blog/2014/06/08/time-series-database-requirements/">Time-Series Database Requirements</a> (Baron Schwartz)</p>
<p><em>Baron Schwartz talks about requirements for time series databases</em></p>
<p class="resource"><a href="https://www.irondb.io/2018/08/tsdbs-at-scale-part-one/">TSDBs at Scale - Part One</a> (Fred Moyer)</p>
<p class="resource"><a href="https://www.irondb.io/2018/08/tsdbs-at-scale-part-two/">TSDBs at Scale - Part Two</a> (Fred Moyer)</p>
<p class="resource"><a href="https://www.outlyer.com/blog/why-not-to-build-a-time-series-database/">Why Not to Build a Time Series Database</a> (David Gildeh)</p>
<h3 id="event-time-slideshows">Slideshows</h3>
<p class="resource"><a href="https://speakerdeck.com/benbjohnson/behavioral-databases">Behavior Databases - Next Generation NoSQL Analytics</a> (Ben Johnson)</p>
<p><em>Slides describing Ben Johnson's behavior database. The main distinction between entity and event data that Keen uses was first made here</em></p>
<p class="resource"><a href="https://speakerdeck.com/dzello/store-json-in-cassandra-the-hard-way">Store JSON in Cassandra the Hard Way</a> (Josh Dzielak)</p>
<p><em>Explains Keen.io's method for getting JSON data into Cassandra. Even though it's just the slides, you can get the overall gist from them</em></p>
<h3 id="event-time-videos">Videos</h3>
<p class="resource"><a href="https://www.youtube.com/watch?v=OoCsY8odmpM">Intro to Time Series Databases & Data | Getting Started 1 of 7</a> (Michael DeSa)</p>
<p><em>Introduction to time series data and a demonstration of one time series database, InfluxDB</em></p>
<h3 id="event-time-whitepapers">Whitepapers</h3>
<p class="resource"><a href="http://db.csail.mit.edu/pubs/abadi-column-stores.pdf">The Design and Implementation of Modern Column-Oriented Database Systems</a> (Daniel Abadi, Peter Boncz, Stavros Harizopoulos, Stratos Idreos, Samuel Madden)</p>
<p class="resource"><a href="https://www.vertica.com/wp-content/uploads/2018/05/why_all_column_stores_are_not_the_same_wp.pdf">Why All Column Stores Are Not the Same: Twelve Low-Level Features That Offer High Value to Analysts</a> (Vertica)</p>
<h2 id="event-streaming-architecture">Event Streaming Architecture</h2>
<h3 id="streaming-books">Books</h3>
<p class="resource"><a href="https://www.oreilly.com/data/free/stream-processing.csp">Making Sense of Stream Processing: The Philosophy Behind Apache Kafka and Scalable Stream Data Platforms</a> (Martin Kleppmann)</p>
<p><em>More high-level than Streaming Systems, but don't let that fool you: Kleppmann focuses on drawing out the big-picture, paradigm-shifting implications of stream data platforms</em></p>
<p class="resource"><a href="https://mapr.com/streaming-architecture-using-apache-kafka-mapr-streams/">Streaming Architecture: New Designs Using Apache Kafka and MapR Streams</a> (Ted Dunning & Ellen Friedman)</p>
<p class="resource"><a href="https://www.amazon.com/Streaming-Systems-Where-Large-Scale-Processing/dp/1491983876/">Streaming Systems: The What, Where, When, and How of Large-Scale Data Processing</a> (Tyler Akidau, Slava Chernyak, & Reuven Lax)</p>
<p><em>This is <strong>the</strong> book to read if you are interested in this topic</em></p>
<h3 id="streaming-blog-posts">Blog Posts</h3>
<p class="resource"><a href="https://www.confluent.io/blog/making-sense-of-stream-processing/">Stream processing, Event sourcing, Reactive, CEP… and making sense of it all</a> (Martin Kleppmann)</p>
<p><em>Very informative post about stream processing and event sourcing. Also talks about the use cases and trade-offs of storing raw event data vs aggregated data</em></p>
<p class="resource"><a href="https://www.outlyer.com/blog/top10-open-source-time-series-databases/">Top 10 Time Series Databases</a> (Outlyer)</p>
<p><em>Spreadsheet included in this blog post is very useful for a comparison of features</em></p>
<p class="resource"><a href="https://www.oreilly.com/ideas/the-world-beyond-batch-streaming-101">The world beyond batch: Streaming 101</a> (Tyler Akidau)</p>
<p class="resource"><a href="https://www.oreilly.com/ideas/the-world-beyond-batch-streaming-102">The world beyond batch: Streaming 102</a> (Tyler Akidau)</p>
<p class="resource"><a href="http://www.complexevents.com/2018/07/29/trends-in-event-stream-processing-products/">Trends in Event Stream Processing Products</a> (Roy Schulte & David Luckham)</p>
<p><em>Covers 6 major trends in CEP services and gives a large listing of them</em></p>
<p class="resource"><a href="https://www.oreilly.com/ideas/questioning-the-lambda-architecture">Questioning the Lambda Architecture</a> (Jay Kreps)</p>
<p><em>Briefly describes the Lambda Architecture and then compares it with the "Kappa" architecture, which just uses a stream processor that runs an additional job</em></p>
<p class="resource"><a href="https://blog.keen.io/a-note-about-rate-limits/">A Note About Rate Limits</a> (Keen IO)</p>
<p><em>Brief post that describes some of the (older) rate limits at Keen and what they're for. May be useful when we have to implement our own rate limiting</em></p>
<p class="resource"><a href="https://blog.keen.io/architecture-of-giants-data-stacks-at-facebook-netflix-airbnb-and-pinterest/">Architecture of Giants: Data Stacks at Facebook, Netflix, Airbnb, and Pinterest</a> (Michelle Wetzler)</p>
<p><em>Gives simplified layouts of the data stacks at a few major internet giants. What's relevant is that there is a visual diagram of Keen.io's architecture as well as a brief description. Not as thorough as some of the other links, but it's nice to have a visual</em></p>
<p class="resource"><a href="https://www.tbray.org/ongoing/When/201x/2018/11/18/Post-REST">Post-REST</a> (Tim Bray)</p>
<p><em>Notes some of the problems with RESTful APIs and then details some of the post-REST APIs coming in the future</em></p>
<h3 id="streaming-videos">Videos</h3>
<p class="resource"><a href="https://www.youtube.com/watch?v=QqK_KZryoGM">Handling trillions of events daily and conquering scaling issues with Keen CTO</a> (Christophe Limpalair)</p>
<p><em>Interview with Keen.io's CTO Dan Kador. Lots of good information on Keen.io's architecture here</em></p>
<h3 id="streaming-podcasts">Podcasts</h3>
<p class="resource"><a href="https://softwareengineeringdaily.com/2016/05/23/kafka-storm-cassandra-keen-ios-analytics-architecture-dan-kador/">Kafka, Storm, and Cassandra: Keen IO's Analytic Architecture with Dan Kador</a> (Software Engineering Daily)</p>
<p><em>Another interview with Dan Kador; I think the video is more useful, but this one is still really good. A bit more technical detail about data transformation here than in the video</em></p>
<h3 id="streaming-whitepapers">Whitepapers</h3>
<p class="resource"><a href="https://www.researchgate.net/publication/220622910_Complex_Event_Processing">Complex Event Processing</a> (Alejandro Buchmann & Boris Koldehofe, TU Darmstadt)</p>
<p class="resource"><a href="https://www.unix.com/pdf/CEP_in_distributed_systems.pdf">Complex Event Processing in Distributed Systems</a> (David C. Luckham & Brian Frasca)</p>
<p class="resource"><a href="http://leavcom.com/articles/ieee_april09.htm">Complex-Event Processing Poised for Growth</a> (Neal Leavitt)</p>
<p class="resource"><a href="http://ilpubs.stanford.edu:8090/527/">Continuous Queries over Data Streams</a> (Shivnath Babu & Jennifer Widom)</p>
<p class="resource"><a href="http://msdl.cs.mcgill.ca/people/istvan/pub/mtcps2016">Distributed and Heterogeneous Event-based Monitoring in Smart Cyber-Physical Systems</a> (László Balogh, István Dávid, István Ráth, Dániel Varró, András Vörös)</p>
<p class="resource"><a href="https://arxiv.org/pdf/1204.3362">Event based classification of Web 2.0 text streams</a> (Andreas Bauer & Christian Wolff)</p>
<h2 id="kafka">Kafka</h2>
<h3 id="kafka-books">Books</h3>
<p class="resource"><a href="https://www.confluent.io/designing-event-driven-systems">Designing Event-Driven Systems: Concepts and Patterns for Streaming Services with Apache Kafka</a> (Ben Stopford)</p>
<p class="resource"><a href="https://www.confluent.io/resources/kafka-the-definitive-guide/">Kafka: The Definitive Guide</a> (Neha Narkhede, Gwen Shapira, & Todd Palino)</p>
<h3 id="kafka-blog-posts">Blog Posts</h3>
<p class="resource"><a href="https://blog.keen.io/apache-kafka-vs-amazon-kinesis-to-build-a-high-performance-distributed-system/">Apache Kafka vs Amazon Kinesis to Build a High Performance Distributed System</a> (Kyle Wild)</p>
<p><em>Comparison between Kafka and Kinesis for building something akin to Keen.io</em></p>
<p class="resource"><a href="https://sookocheff.com/post/kafka/kafka-in-a-nutshell/">Kafka in a Nutshell</a> (Kevin Sookocheff)</p>
<p><em>Explains how Kafka works (Kafka topic, replication, Producers and Consumers, Partitions and Brokers, etc)</em></p>
<p class="resource"><a href="https://martin.kleppmann.com/2018/01/18/event-types-in-kafka-topic.html">Should you put several event types in the same Kafka topic?</a> (Martin Kleppmann)</p>
<p><em>Great article where Kleppmann describes the different situations when you should or shouldn't combine events into the same topic. Make sure to read the <a href="https://grokbase.com/t/kafka/users/15a7k5f1rr/mapping-events-to-topics">grokbase</a> post he links to in point #4</em></p>
<p class="resource"><a href="https://content.pivotal.io/rabbitmq/understanding-when-to-use-rabbitmq-or-apache-kafka">Understanding When to use RabbitMQ or Apache Kafka</a> (Pieter Humphrey)</p>
<p><em>This post offers an assessment of the most popular messaging choices today, RabbitMQ and Apache Kafka, along with their use cases</em></p>
<p class="resource"><a href="https://www.confluent.io/blog/introducing-kafka-streams-stream-processing-made-simple/">Introducing Kafka Streams: Stream Processing Made Simple</a> (Jay Kreps)</p>
<p><em>Introduction to the Kafka Streams API with discussions on Table/Stream theory, windowing, etc.</em></p>
<h3 id="kafka-courses">Courses</h3>
<p class="resource"><a href="https://www.udemy.com/apache-kafka/learn/v4/overview">Apache Kafka Series - Learn Apache Kafka for Beginners v2</a> (Stéphane Maarek)</p>
<p><em>Phenomenal course; if it's on sale, make sure to grab it!</em></p>
<p class="resource"><a href="https://www.udemy.com/kafka-connect/learn/v4/overview">Apache Kafka Series - Kafka Connect Hands-on Learning</a> (Stéphane Maarek)</p>
<h3 id="kafka-videos">Videos</h3>
<p class="resource"><a href="https://www.youtube.com/watch?v=v2RJQELoM6Y">Is Kafka a Database?</a> (Martin Kleppmann)</p>
<p class="resource"><a href="https://www.youtube.com/watch?v=HeNegOzjnJY">Kafka and Event-Oriented Architecture</a> (Jay Kreps)</p>
<p class="resource"><a href="https://www.confluent.io/thank-you/kafka-scale-cloud/">Kafka at Scale in the Cloud</a> (Allen Wang)</p>
<p><em>2016 presentation explaining some of the challenges Netflix had with scaling Kafka in the cloud and their solutions. Slides can be found <a href="https://www.slideshare.net/ConfluentInc/kafka-at-scale-in-the-cloud">here</a></em></p>
<p class="resource"><a href="https://www.youtube.com/watch?v=lIGFH9TkL2w&feature=youtu.be">What's New in Kafka 2.1?</a> (Stéphane Maarek)</p>
<p><em>New features in the recent Kafka 2.1 release</em></p>
<h2 id="testing-and-benchmarking">Testing and Benchmarking</h2>
<h3 id="testing-blog-posts">Blog Posts</h3>
<p class="resource"><a href="https://www.arangodb.com/2018/02/nosql-performance-benchmark-2018-mongodb-postgresql-orientdb-neo4j-arangodb/">NoSQL Performance Benchmark 2018 – MongoDB, PostgreSQL, OrientDB, Neo4j and ArangoDB</a> (ArangoDB)</p>
<p><em>Post describing how ArangoDB benchmarked its product, with instructions for using AWS, and scripts</em></p>
<h3 id="testing-projects">Projects</h3>
<p class="resource"><a href="http://snap.stanford.edu/">Stanford Network Analysis Project</a> (Jure Leskovec)</p>
<p><em>Big data sets that may be useful in automated testing. Used by the benchmarking process for ArangoDB</em></p>
</div>
</article>
</main>
<footer>
<p>Current Version: 0.9.0</p>
<p><a href="index.html" class="hvr-float-shadow">Home</a> | <a href="casestudy.html" class="hvr-float-shadow">Case Study</a> | <a href="bibliography.html" class="hvr-float-shadow">Bibliography</a> | <a href="team.html" class="hvr-float-shadow">Team</a></p>
<small>Site design by <a href="https://www.instagram.com/linzimurray.creative/" target="_blank">linzimurray.creative</a></small>
</footer>
</body>
</html>