I read Karolina Alexiou's excellent blog post about The Top Mistakes Developers Make When Using Python for Big Data Analytics with great interest. I have made - and partially learned from - all of the mistakes she warned about. I was particularly eager to try out and extend some of the code snippets she provided to illustrate two of the mistakes:

  • Mistake #1: Reinventing the wheel
  • Mistake #2: Not tuning for performance

I started composing a rather lengthy comment on the blog post, highlighting some aspects I especially appreciated and seeking clarification on others. Whenever I notice myself getting a bit voluminous in a comment on someone else's blog, I typically compose a separate post on my own blog (Gumption), and then substitute a link (with a brief summary) on the original blog post.

In this instance, it seemed more appropriate - and constructive - to create an IPython Notebook to illustrate and/or investigate some of the issues I was raising in that comment ... and thereby find some of the clarifications I was initially seeking.

I am sharing those investigations here in case they are of interest or use to others ... and because it's been a while since I created and shared an IPython Notebook about Python and data science.