A_Few_Optimizations_for_Data_Analysis_in_Python/README.md at master · gumption/A_Few_Optimizations_for_Data_Analysis_in_Python

I read Karolina Alexiou's excellent blog post, "The Top Mistakes Developers Make When Using Python for Big Data Analytics", with great interest. I have made - and partially learned from - all of the mistakes she warned about. I was particularly eager to try out and extend some of the code snippets she provided to illustrate two of the mistakes:

- Mistake #1: Reinventing the wheel
- Mistake #2: Not tuning for performance

I started composing a rather lengthy comment on the blog post, highlighting some aspects I especially appreciated and seeking clarification on others. Whenever I notice myself getting a bit voluminous in a comment on someone else's blog, I typically compose a separate post on my own blog (Gumption), and then substitute a link (with a brief summary) on the original blog post.

In this instance, it seemed more appropriate - and constructive - to create an IPython Notebook to illustrate and/or investigate some of the issues I was raising in that comment ... and thereby find some of the clarifications I was initially seeking.

I am sharing those investigations here in case they are of interest or use to others ... and because it's been a while since I created and shared an IPython Notebook about Python and data science.
