This repository shares my experience in Codenation's acelera-dev Data Science program.
It is a free training program for technology professionals. Over 10 weeks, devs and data scientists get access to challenges, content, and the Codenation Community to acquire and practice the technical skills most used by technology companies around the world.
- Week 1: Introduction to Data Science
- Week 2: Data Preprocessing in Python
- Week 3: Exploratory Data Analysis
- Week 4: Exploratory Data Analysis (continued)
- Week 5: Statistical Thinking in Python
- Week 6: Statistical Thinking in Python (continued)
- Week 7: Feature Engineering
- Week 8: Regression
- Week 9: Classification
- Python Data Science Handbook
- Minimally Sufficient Pandas
- Why and How to Use Pandas with Large Data
- Getting started with Data Analysis with Python Pandas
- Python Pandas: Tricks & Features You May Not Know
- Pandas Tutorial: Essentials of Data Science in Pandas Library
- Python Pandas Tutorial: A Complete Introduction for Beginners
- Basic Time Series Manipulation with Pandas
- Tidy Data
- Cheat Sheet Pandas Basics
- How to self-learn statistics of data science
- Statistics Done Wrong
- Exploratory Data Analysis
- A Gentle Introduction to Exploratory Data Analysis
- A Simple Tutorial on Exploratory Data Analysis
- Introduction to Hypothesis Testing
- The Power of Visualization in Data Science
- https://visme.co/blog/examples-data-visualizations/
- http://tylervigen.com/spurious-correlations
- Probability Theory Review for Machine Learning
- Understanding Probability Distributions
- Probability Distribution
- Statistical Modeling: The Two Cultures
- Variáveis Aleatórias Unidimensionais
- Probability and Information Theory
- A Gentle Introduction to Statistical Hypothesis Testing
- How to Correctly Interpret P Values
- A Dirty Dozen: Twelve P-Value Misconceptions
- An investigation of the false discovery rate and the misinterpretation of p-values
- Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations
- Why Are P Values Misinterpreted So Frequently?
- Statistical Significance Explained
- Definition of Power
- The Math Behind A/B Testing with Example Python Code
- Handy Functions for A/B Testing in Python
- StackExchange - Relationship between SVD and PCA. How to use SVD to perform PCA?
- In Depth: Principal Component Analysis
- In-Depth: Manifold Learning
- Recursive Feature Elimination
- A Tutorial on Principal Component Analysis
- Principal Component Analysis Explained
- Step Forward Feature Selection: A Practical Example in Python
- Feature Engineering Book
- Feature Scaling with scikit-learn
- Anthony Goldbloom gives you the secret to winning Kaggle competitions
- What are some best practices in Feature Engineering?
- Machine Learning Mastery
- Fundamental Techniques of Feature Engineering for Machine Learning
- Feature Engineering Cookbook for Machine Learning
- Outlier detection with Scikit Learn
- Working With Text Data
- WTF is TF-IDF?
- Gentle Introduction to the Bias-Variance Trade-Off in Machine Learning
- Understanding the Bias-Variance Tradeoff
- Introduction to Machine Learning Algorithms: Linear Regression
- 7 Classical Assumptions of Ordinary Least Squares (OLS) Linear Regression
- Statistics By Jim
- Tikhonov regularization
- Ridge Regression for Better Usage
- Lasso (statistics)
- Understanding Linear Regression and Regression Error Metrics
- Understand Regression Performance Metrics
- Confusion matrix and other metrics in machine learning
- Let’s learn about AUC ROC Curve!
- Classification Algorithms Comparison
- Having an Imbalanced Dataset? Here Is How You Can Fix It
- FOUNDATIONS OF IMBALANCED LEARNING
- DATA MINING FOR IMBALANCED DATASETS: AN OVERVIEW
- An Introduction to Logistic Regression
- Explaining the Success of Nearest Neighbor Methods in Prediction
- Classification: Basic concepts, decision trees, and model evaluation
