This repository captures a hands-on Python-to-Data-Science learning journey through Jupyter notebooks, practice exercises, mini-projects, and real-world datasets.
It starts with core Python programming, moves into object-oriented programming and file handling, and then expands into NumPy and Pandas for data analysis workflows.
- Overview
- Learning Path
- Repository Structure
- Datasets Included
- How to Use This Repository
- Quick Start
- Tech Stack
- Repository Notes
- Contributing
- Goal
This repo is organized as a practical notebook collection for learning:
- Python fundamentals
- Intermediate and advanced Python concepts
- Exception handling and file handling
- Object-oriented programming
- NumPy basics and advanced operations
- Pandas Series, DataFrames, and practice exercises
- Small Python projects for reinforcement
If you are using this repo as a study roadmap, a good order is:
- Fundamentals of Python
- Advance Python
- Exception Handling in Python
- File Handling in Python
- OOPS in Python
- Python Fundamental Questions
- My Projects
- NumPy For Python
- NumPy For Practice
- NumPy Advance
- Pandas in Python
Foundational Python notebooks covering:
- `if-else`, `for`, `while`, and nested loops
- strings, lists, tuples, sets, frozensets, and dictionaries
- functions, lambda expressions, and modules
- operators, sequence sum patterns, and practice exercises
- list and dictionary comprehensions
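As a small illustration of the comprehension and lambda topics listed above, here is a minimal sketch. The sample data and the names `scores`, `passed`, `even_squares`, and `ranked` are hypothetical and are not taken from the notebooks.

```python
# Hypothetical sample data, for illustration only.
scores = {"asha": 82, "ben": 47, "chen": 91}

# Dictionary comprehension: keep only entries with a score of 50 or more.
passed = {name: score for name, score in scores.items() if score >= 50}

# List comprehension: square the even numbers from 0 to 9.
even_squares = [n * n for n in range(10) if n % 2 == 0]

# Lambda expression used as a sort key: order names by descending score.
ranked = sorted(scores, key=lambda name: scores[name], reverse=True)

print(passed)        # {'asha': 82, 'chen': 91}
print(even_squares)  # [0, 4, 16, 36, 64]
print(ranked)        # ['chen', 'asha', 'ben']
```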
Focused notebooks on:
- decorators
- namespace and scope
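The two topics above meet naturally in a closure-based decorator. The following is a minimal sketch, not code from the notebooks; `count_calls` and `greet` are hypothetical names chosen for illustration.

```python
import functools


def count_calls(func):
    """Decorator that counts how many times the wrapped function is called."""
    calls = 0  # lives in the enclosing scope of `wrapper`

    @functools.wraps(func)  # preserve the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        nonlocal calls  # rebind the enclosing variable instead of creating a local
        calls += 1
        wrapper.calls = calls  # expose the count as a function attribute
        return func(*args, **kwargs)

    wrapper.calls = 0
    return wrapper


@count_calls
def greet(name):
    return f"Hello, {name}!"


greet("Ada")
greet("Linus")
print(greet.calls)     # 2
print(greet.__name__)  # greet
```

Without `nonlocal`, the assignment `calls += 1` would raise `UnboundLocalError`, because Python would treat `calls` as a local variable of `wrapper`.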
Concepts and examples for:
- Python error types
- `try`, `except`, `else`, and `finally`
- custom exception creation and handling
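The sketch below shows how the four clauses and a custom exception fit together. `InsufficientFundsError`, `withdraw`, and `try_withdraw` are hypothetical names for illustration and may differ from what the notebooks use.

```python
class InsufficientFundsError(Exception):
    """Hypothetical custom exception: a withdrawal exceeds the balance."""

    def __init__(self, balance, amount):
        super().__init__(f"Cannot withdraw {amount}; balance is {balance}")
        self.balance = balance
        self.amount = amount


def withdraw(balance, amount):
    if amount > balance:
        raise InsufficientFundsError(balance, amount)
    return balance - amount


def try_withdraw(balance, amount):
    """Return the new balance, or the unchanged balance if the withdrawal fails."""
    try:
        new_balance = withdraw(balance, amount)
    except InsufficientFundsError as exc:
        print(f"Rejected: {exc}")
        return balance
    else:
        # Runs only when the try block raised no exception.
        print(f"Withdrew {amount}")
        return new_balance
    finally:
        # Runs in every case, before the function returns.
        print("Transaction finished")


print(try_withdraw(100, 30))   # 70
print(try_withdraw(100, 500))  # 100
```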
Notebook-based coverage of:
- reading, writing, and appending files
- `with` statement usage
- binary file handling
- serialization, deserialization, and pickling
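A minimal sketch of the file-handling topics above, using only the standard library. The file names and the `record` dictionary are hypothetical; the example writes into a temporary directory so it leaves nothing behind.

```python
import pickle
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    text_path = Path(tmp) / "notes.txt"

    # "w" creates or truncates the file; "a" appends to the end.
    # The `with` statement closes the file even if an error occurs.
    with open(text_path, "w", encoding="utf-8") as f:
        f.write("first line\n")
    with open(text_path, "a", encoding="utf-8") as f:
        f.write("second line\n")
    with open(text_path, "r", encoding="utf-8") as f:
        lines = f.read().splitlines()

    # Pickling: serialize a Python object to a binary file ("wb"),
    # then deserialize it again ("rb").
    record = {"name": "sample", "marks": [70, 85]}
    pickle_path = Path(tmp) / "record.pkl"
    with open(pickle_path, "wb") as f:
        pickle.dump(record, f)
    with open(pickle_path, "rb") as f:
        restored = pickle.load(f)

print(lines)               # ['first line', 'second line']
print(restored == record)  # True
```

Only unpickle files you trust: `pickle.load` can execute arbitrary code embedded in a malicious file.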
Object-oriented programming topics including:
- classes and objects
- reference variables and user-defined data types
- inheritance
- encapsulation
- abstraction
- polymorphism
- aggregation
- `super()` usage
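Several of the concepts above can be seen together in one short sketch. The `Account`, `SavingsAccount`, and `CurrentAccount` classes are hypothetical examples, not classes from the notebooks.

```python
from abc import ABC, abstractmethod


class Account(ABC):
    """Abstraction: a base class that cannot be instantiated directly."""

    def __init__(self, owner, balance=0):
        self.owner = owner
        self._balance = balance  # encapsulation by convention (leading underscore)

    @property
    def balance(self):
        # Read-only access to the protected attribute.
        return self._balance

    def deposit(self, amount):
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self._balance += amount

    @abstractmethod
    def describe(self):
        """Each subclass must provide its own description."""


class SavingsAccount(Account):  # inheritance
    def __init__(self, owner, balance=0, rate=0.04):
        super().__init__(owner, balance)  # reuse the parent initializer
        self.rate = rate

    def describe(self):
        return f"Savings account of {self.owner} at {self.rate:.0%}"


class CurrentAccount(Account):
    def describe(self):
        return f"Current account of {self.owner}"


accounts = [SavingsAccount("Ada", 100), CurrentAccount("Linus")]
accounts[0].deposit(50)

# Polymorphism: the same call dispatches to each subclass's method.
for account in accounts:
    print(account.describe(), account.balance)
```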
Practice notebooks for:
- beginner-level problem solving
- dictionary and list exercises
- list comprehension practice
- decorators and exception handling practice
- OOP practice
Mini-project notebooks such as:
- calculator
- calculator v2
- ATM system
- library project
- DinosaursPedia
- Google account create/login simulation
Core NumPy notebooks covering:
- NumPy fundamentals
- array attributes and helper functions
- indexing and slicing
- iterating and reshaping arrays
- stacking and splitting arrays
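The operations listed above can be sketched on a single small array. This is an illustrative example with made-up values, assuming NumPy is installed; it is not taken from the notebooks.

```python
import numpy as np

a = np.arange(12)    # 1-D array holding 0..11
m = a.reshape(3, 4)  # reshaped to 3 rows, 4 columns

# Array attributes
print(m.shape, m.ndim, m.size)  # (3, 4) 2 12

# Indexing and slicing
row = m[1]          # second row: [4 5 6 7]
block = m[:2, 1:3]  # rows 0-1, columns 1-2: [[1 2] [5 6]]

# Stacking
stacked_v = np.vstack([m, m])  # shape (6, 4)
stacked_h = np.hstack([m, m])  # shape (3, 8)

# Splitting into two equal halves along the columns
left, right = np.hsplit(m, 2)  # two arrays of shape (3, 2)

# Iterating element by element
total = 0
for value in np.nditer(m):
    total += int(value)
print(total)  # 66
```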
Practice notebooks for reinforcing NumPy basics.
Advanced NumPy topics including:
- advanced indexing
- broadcasting
- missing value handling
- plotting workflows
- set functions
- additional NumPy utility methods
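Here is a hedged sketch of several of the advanced topics above (plotting is left out). The arrays are made-up sample data, and filling missing values with the column mean is just one common strategy, not necessarily the one the notebooks use.

```python
import numpy as np

data = np.array([[1.0, 2.0, np.nan],
                 [4.0, np.nan, 6.0]])

# Boolean indexing: keep only the non-missing values.
valid = data[~np.isnan(data)]  # [1. 2. 4. 6.]

# Missing value handling: NaN-aware mean of each column.
col_means = np.nanmean(data, axis=0)  # [2.5 2.  6. ]

# Replace each NaN with its column mean. `col_means` has shape (3,)
# and is broadcast against the (2, 3) array.
filled = np.where(np.isnan(data), col_means, data)

# Broadcasting: subtract the column means from every row.
centered = filled - col_means

# Fancy (integer-array) indexing: elements at (0, 2) and (1, 1).
picked = filled[[0, 1], [2, 1]]  # [6. 2.]

# Set functions
a = np.array([1, 2, 3, 4])
b = np.array([3, 4, 5])
common = np.intersect1d(a, b)  # [3 4]
either = np.union1d(a, b)      # [1 2 3 4 5]
only_a = np.setdiff1d(a, b)    # [1 2]
```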
Structured Pandas learning content with subfolders:
- `Series in Pandas/`: Series creation, indexing, slicing, math methods, and plotting
- `DataFrames in Pandas/`: DataFrame creation, filtering, selection, index editing, math methods, grouping, merging, joining, concatenation, and reference notebooks
- `DateTime in Pandas/`: Date/time handling, multi-index objects, and vectorized string operations
- `Practice in Pandas/`: Applied notebooks using real datasets such as YouTube analytics, cities by GDP, and student academics
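The sketch below touches a few of the Pandas topics listed above: Series filtering, merging, grouping, date handling, and vectorized string methods. It uses small made-up DataFrames rather than the repository's datasets, so every column name here is a hypothetical placeholder.

```python
import pandas as pd

# Series: creation with a custom index, then boolean filtering.
marks = pd.Series([72, 88, 95], index=["asha", "ben", "chen"], name="marks")
top = marks[marks > 80]

# DataFrames (hypothetical sample data).
students = pd.DataFrame({
    "student_id": [1, 2, 3, 4],
    "name": ["asha", "ben", "chen", "dev"],
    "course_id": [10, 10, 20, 20],
    "score": [72, 88, 95, 61],
})
courses = pd.DataFrame({
    "course_id": [10, 20],
    "course": ["Python", "NumPy"],
})

# Merging: attach the course name to each student row.
merged = students.merge(courses, on="course_id", how="left")

# Grouping: mean score per course.
avg_by_course = merged.groupby("course")["score"].mean()

# Filtering and selection with .loc
high = merged.loc[merged["score"] >= 80, ["name", "course"]]

# Date/time handling and vectorized string operations.
dates = pd.to_datetime(pd.Series(["2024-01-15", "2024-03-01"]))
months = dates.dt.month_name()
upper_names = students["name"].str.upper()

print(avg_by_course)
print(high)
```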
The `Database from Kaggle/` folder contains the datasets used across the NumPy and Pandas notebooks.
Main datasets include:
- `batsman_runs_ipl.csv`
- `bollywood.csv`
- `cities_by_gdp.csv`
- `deliveries.csv`
- `diabetes.csv`
- `expense_data.csv`
- `global_top2000.csv`
- `imdb-top-1000.csv`
- `ipl-matches.csv`
- `kohli_ipl.csv`
- `movies.csv`
- `student_performance_finalscore.csv`
- `subs.csv`
- `titanic.csv`
It also includes a supplemental datasets/ folder with additional CSV and Excel files such as course, student, registration, and match data.
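To see which data files are available locally, a small helper like the one below can walk the dataset folder. This is a sketch, not part of the repository: `list_datasets` and `DATA_EXTENSIONS` are hypothetical names, and the folder path assumes the script runs from the repository root.

```python
from pathlib import Path

# Assumed set of data file extensions; adjust if the folder holds other types.
DATA_EXTENSIONS = {".csv", ".xlsx", ".xls"}


def list_datasets(folder):
    """Return data files under `folder` (recursively) as sorted relative paths."""
    root = Path(folder)
    return sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in DATA_EXTENSIONS
    )


# Assumes the current working directory is the repository root.
folder = Path("Database from Kaggle")
if folder.is_dir():
    for name in list_datasets(folder):
        print(name)
else:
    print(f"Folder not found: {folder}")
```

Any CSV path found this way can be passed to `pandas.read_csv`. Using `pathlib.Path` avoids quoting problems caused by the spaces in the folder name.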
- Clone the repository.
- Set up Python 3 and Jupyter in your local environment.
- Open the notebooks in Jupyter Notebook or JupyterLab.
- Move through the folders in the recommended learning order.
- Use the datasets in `Database from Kaggle/` while practicing NumPy and Pandas notebooks.
- Revisit the mini-projects and practice notebooks to reinforce concepts.
git clone https://github.com/aayushmanz/Python-For-Data-Science.git
cd Python-For-Data-Science
python -m venv .venv
source .venv/bin/activate
pip install jupyter numpy pandas matplotlib
jupyter notebook
On Windows (PowerShell), activate the environment with:
.venv\Scripts\Activate.ps1
The package installation above is a minimal setup for running the notebooks in this repo.
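After installing, one way to confirm that the packages are present in the active environment is a short standard-library check such as the sketch below. `check_packages` and `REQUIRED` are hypothetical names; the package list mirrors the `pip install` command above.

```python
from importlib import metadata

# Distribution names taken from the pip install command above.
REQUIRED = ["jupyter", "numpy", "pandas", "matplotlib"]


def check_packages(names=REQUIRED):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in check_packages().items():
    print(f"{name:<12} {version or 'NOT INSTALLED'}")
```

Run it with the virtual environment activated; a `NOT INSTALLED` entry usually means the environment was not active when `pip install` ran.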
| Tool | Purpose |
|---|---|
| Python 3 | Core programming language |
| Jupyter Notebook | Interactive notebook environment |
| NumPy | Numerical computing |
| Pandas | Data manipulation and analysis |
| Matplotlib | Visualization support |
| Git & GitHub | Version control and hosting |
- Some folder names in this repository include spaces (and a few include trailing spaces), so copy paths carefully when working in the terminal.
- Most content is notebook-based (`.ipynb`) and designed for interactive learning.
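Because a stray trailing space is easy to miss when typing a path, one option is to look folders up by their stripped name. The helper below is a hypothetical sketch (`find_folder` is not part of the repository), and it makes no assumption about which folders actually carry trailing spaces.

```python
from pathlib import Path


def find_folder(root, name):
    """Return the first directory in `root` whose name equals `name`
    once surrounding whitespace is ignored, or None if there is no match."""
    wanted = name.strip()
    for entry in sorted(Path(root).iterdir()):
        if entry.is_dir() and entry.name.strip() == wanted:
            return entry
    return None


# Example: look up a folder from the learning path, relative to the repo root.
match = find_folder(".", "NumPy Advance")
print(repr(match.name) if match else "no matching folder")
```

Printing with `repr` makes any trailing space visible in the output.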
Contributions are welcome. If you want to improve notebooks, fix typos, or add new exercises:
- Fork the repository.
- Create a feature branch.
- Make your changes and commit them.
- Open a pull request with a short summary of the update.
To build a strong Python foundation for data science by combining conceptual learning, repeated practice, notebook-based experimentation, and small project work.
Maintained by Ayush Suthar