aayushmanz/Python-For-Data-Science

Python for Data Science

This repository captures a hands-on Python-to-Data-Science learning journey through Jupyter notebooks, practice exercises, mini-projects, and real-world datasets.

It starts with core Python programming, moves into object-oriented programming and file handling, and then expands into NumPy and Pandas for data analysis workflows.


Overview

This repo is organized as a practical notebook collection for learning:

  • Python fundamentals
  • Intermediate and advanced Python concepts
  • Exception handling and file handling
  • Object-oriented programming
  • NumPy basics and advanced operations
  • Pandas Series, DataFrames, and practice exercises
  • Small Python projects for reinforcement

Learning Path

If you are using this repo as a study roadmap, a good order is:

  1. Fundamentals of Python
  2. Advance Python
  3. Exception Handling in Python
  4. File Handling in Python
  5. OOPS in Python
  6. Python Fundamental Questions
  7. My Projects
  8. NumPy For Python
  9. NumPy For Practice
  10. NumPy Advance
  11. Pandas in Python

Repository Structure

Fundamentals of Python/

Foundational Python notebooks covering:

  • if-else, for, while, and nested loops
  • strings, lists, tuples, sets, frozensets, and dictionaries
  • functions, lambda expressions, and modules
  • operators, sequence sum patterns, and practice exercises
  • list and dictionary comprehensions
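As a rough illustration of the comprehension and lambda topics listed above, here is a minimal sketch. The sample list `words` is hypothetical and is not taken from the notebooks.

```python
# Hypothetical sample data, used only for illustration.
words = ["numpy", "pandas", "jupyter"]

# List comprehension: compute the length of each word.
lengths = [len(w) for w in words]

# Dictionary comprehension: map each word to its length.
length_by_word = {w: len(w) for w in words}

# Lambda expression used as a sort key (shortest word first).
by_length = sorted(words, key=lambda w: len(w))

print(lengths)
print(length_by_word)
print(by_length)
```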

Advance Python/

Focused notebooks on:

  • decorators
  • namespace and scope
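The two topics above meet in a typical decorator: the wrapper is a nested function that reads `func` from the enclosing scope. The sketch below is one possible example; `count_calls` and `square` are hypothetical names, not code from the notebooks.

```python
import functools


def count_calls(func):
    """Hypothetical decorator that counts how often func is called."""

    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        wrapper.calls += 1  # attribute stored on the wrapper function itself
        return func(*args, **kwargs)  # func comes from the enclosing scope

    wrapper.calls = 0
    return wrapper


@count_calls
def square(x):
    return x * x


square(3)
square(4)
print(square.calls)
```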

Exception Handling in Python/

Concepts and examples for:

  • Python error types
  • try, except, else, and finally
  • custom exception creation and handling
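A small sketch of how these pieces might fit together follows. `InsufficientFundsError`, `withdraw`, and `try_withdraw` are hypothetical names chosen for illustration.

```python
class InsufficientFundsError(Exception):
    """Hypothetical custom exception for illustration."""


def withdraw(balance, amount):
    if amount > balance:
        raise InsufficientFundsError(f"cannot withdraw {amount} from {balance}")
    return balance - amount


def try_withdraw(balance, amount):
    log = []
    try:
        new_balance = withdraw(balance, amount)
    except InsufficientFundsError as exc:
        # Runs only when the custom exception is raised.
        log.append(f"error: {exc}")
        new_balance = balance
    else:
        # Runs only when the try block raised nothing.
        log.append("ok")
    finally:
        # Runs in both cases.
        log.append("done")
    return new_balance, log


print(try_withdraw(100, 30))
print(try_withdraw(100, 300))
```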

File Handling in Python/

Notebook-based coverage of:

  • reading, writing, and appending files
  • with statement usage
  • binary file handling
  • serialization, deserialization, and pickling
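The operations above can be sketched with the standard library alone. The file names and the `record` dictionary are hypothetical, and a temporary directory is used so nothing is written into the repository.

```python
import pickle
import tempfile
from pathlib import Path

record = {"name": "example", "scores": [1, 2, 3]}  # hypothetical data

with tempfile.TemporaryDirectory() as tmp:
    text_path = Path(tmp) / "notes.txt"

    # Write, then append, then read back. Each `with` closes the file for us.
    with open(text_path, "w", encoding="utf-8") as f:
        f.write("first line\n")
    with open(text_path, "a", encoding="utf-8") as f:
        f.write("second line\n")
    with open(text_path, encoding="utf-8") as f:
        lines = f.read().splitlines()

    # Pickling requires binary mode ("wb" / "rb").
    pickle_path = Path(tmp) / "record.pkl"
    with open(pickle_path, "wb") as f:
        pickle.dump(record, f)
    with open(pickle_path, "rb") as f:
        restored = pickle.load(f)

print(lines)
print(restored == record)
```

Only unpickle data from sources you trust, since loading a pickle can execute arbitrary code.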

OOPS in Python/

Object-oriented programming topics including:

  • classes and objects
  • reference variables and user-defined data types
  • inheritance
  • encapsulation
  • abstraction
  • polymorphism
  • aggregation
  • super() usage
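Several of these ideas can be shown in one short sketch: abstraction through an abstract base class, encapsulation by convention, inheritance with super(), and polymorphism through a shared method name. The `Shape`, `Rectangle`, and `Square` classes are hypothetical and not taken from the notebooks.

```python
from abc import ABC, abstractmethod


class Shape(ABC):
    """Abstract base class: cannot be instantiated directly."""

    def __init__(self, name):
        self._name = name  # leading underscore marks it as non-public by convention

    @property
    def name(self):
        return self._name

    @abstractmethod
    def area(self):
        ...


class Rectangle(Shape):
    def __init__(self, width, height, name="rectangle"):
        super().__init__(name)  # run the parent initializer
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height


class Square(Rectangle):
    def __init__(self, side):
        super().__init__(side, side, name="square")


shapes = [Rectangle(2, 3), Square(4)]
areas = [s.area() for s in shapes]  # same call, behaviour depends on the class
print(areas)
```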

Python Fundamental Questions/

Practice notebooks for:

  • beginner-level problem solving
  • dictionary and list exercises
  • list comprehension practice
  • decorators and exception handling practice
  • OOP practice

My Projects/

Mini-project notebooks such as:

  • calculator
  • calculator v2
  • ATM system
  • library project
  • DinosaursPedia
  • Google account create/login simulation

NumPy For Python/

Core NumPy notebooks covering:

  • NumPy fundamentals
  • array attributes and helper functions
  • indexing and slicing
  • iterating and reshaping arrays
  • stacking and splitting arrays
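For orientation, a minimal sketch of these core operations on a small array is shown below. The array values are arbitrary and chosen only for illustration.

```python
import numpy as np

a = np.arange(12)        # 1-D array holding 0..11
m = a.reshape(3, 4)      # reshape into 3 rows and 4 columns

print(m.shape, m.ndim, m.dtype)  # basic array attributes

row = m[1]               # second row
col = m[:, 2]            # third column
block = m[:2, 1:3]       # first two rows, columns 1 and 2

# Stacking: append the first row again as a new bottom row.
stacked = np.vstack([m, m[0]])

# Splitting: cut the 4 columns into two equal halves.
left, right = np.hsplit(m, 2)

# Iterating over a 2-D array yields its rows.
row_sums = [int(r.sum()) for r in m]
print(row_sums)
```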

NumPy For Practice/

Practice notebooks for reinforcing NumPy basics.

NumPy Advance/

Advanced NumPy topics including:

  • advanced indexing
  • broadcasting
  • missing value handling
  • plotting workflows
  • set functions
  • additional NumPy utility methods
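A short sketch of how broadcasting, missing values, advanced indexing, and set functions can look in practice follows. The `data` array is hypothetical, and filling gaps with column means is just one possible strategy, not necessarily the one used in the notebooks.

```python
import numpy as np

# Hypothetical 2x3 array with two missing values.
data = np.array([[1.0, 2.0, np.nan],
                 [4.0, np.nan, 6.0]])

# Column means that ignore NaN.
col_means = np.nanmean(data, axis=0)

# Broadcasting: col_means has shape (3,) and is stretched across both rows.
filled = np.where(np.isnan(data), col_means, data)

# Broadcasting again: subtract each column's mean from every row.
centered = filled - filled.mean(axis=0)

# Advanced (integer array) indexing: pick elements (0, 2) and (1, 0).
picked = filled[[0, 1], [2, 0]]

# Set functions.
common = np.intersect1d([1, 2, 3], [2, 3, 4])
combined = np.union1d([1, 2, 3], [2, 3, 4])

print(filled)
print(centered)
```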

Pandas in Python/

Structured Pandas learning content with subfolders:

  • Series in Pandas/
    Series creation, indexing, slicing, math methods, and plotting

  • DataFrames in Pandas/
    DataFrame creation, filtering, selection, index editing, math methods, grouping, merging, joining, concatenation, and reference notebooks

  • DateTime in Pandas/
    Date/time handling, multi-index objects, and vectorized string operations

  • Practice in Pandas/
    Applied notebooks using real datasets such as YouTube analytics, cities by GDP, and student academics
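
To give a feel for the Series and DataFrame topics above, here is a minimal sketch of indexing, grouping, merging, and filtering. The `students` and `scores` frames and their column names are hypothetical and do not reflect the actual datasets.

```python
import pandas as pd

# Series with a custom index, boolean filtering, and a math method.
s = pd.Series([10, 20, 30], index=["a", "b", "c"])
large_total = int(s[s > 15].sum())

# Hypothetical tables for illustration.
students = pd.DataFrame({"student_id": [1, 2, 3], "name": ["A", "B", "C"]})
scores = pd.DataFrame({"student_id": [1, 1, 2, 3], "score": [80, 90, 70, 60]})

# Grouping: average score per student, kept as a regular column.
avg = scores.groupby("student_id", as_index=False)["score"].mean()

# Merging: attach the averages to the student table.
merged = students.merge(avg, on="student_id", how="left")

# Filtering: names of students whose average is above 65.
top = merged.loc[merged["score"] > 65, "name"].tolist()

print(merged)
print(top)
```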


Datasets Included

The Database from Kaggle/ folder contains datasets used across the NumPy and Pandas notebooks.

Main datasets include:

  • batsman_runs_ipl.csv
  • bollywood.csv
  • cities_by_gdp.csv
  • deliveries.csv
  • diabetes.csv
  • expense_data.csv
  • global_top2000.csv
  • imdb-top-1000.csv
  • ipl-matches.csv
  • kohli_ipl.csv
  • movies.csv
  • student_performance_finalscore.csv
  • subs.csv
  • titanic.csv

It also includes a supplemental datasets/ folder with additional CSV and Excel files such as course, student, registration, and match data.
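
One way to load these files from a notebook is sketched below. The helper `load_dataset` is hypothetical, and the folder name is copied from this README, so adjust it if the actual directory name differs (for example, if it carries a trailing space).

```python
from pathlib import Path

import pandas as pd

DATA_DIR_NAME = "Database from Kaggle"  # folder name as written in this README


def load_dataset(repo_root, filename):
    """Read one CSV file from the datasets folder into a DataFrame."""
    csv_path = Path(repo_root) / DATA_DIR_NAME / filename
    return pd.read_csv(csv_path)


# Example call, assuming the current directory is the repository root:
# titanic = load_dataset(".", "titanic.csv")
# print(titanic.shape)
```

Building the path with pathlib avoids having to quote or escape the spaces in the folder name.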


How to Use This Repository

  1. Clone the repository.
  2. Set up Python 3 and Jupyter in your local environment.
  3. Open the notebooks in Jupyter Notebook or JupyterLab.
  4. Move through the folders in the recommended learning order.
  5. Use the datasets in Database from Kaggle/ while practicing NumPy and Pandas notebooks.
  6. Revisit the mini-projects and practice notebooks to reinforce concepts.

Quick Start

git clone https://github.com/aayushmanz/Python-For-Data-Science.git
cd Python-For-Data-Science
python -m venv .venv
source .venv/bin/activate
pip install jupyter numpy pandas matplotlib
jupyter notebook

On Windows (PowerShell), activate the environment with: .venv\Scripts\Activate.ps1
The packages installed above are a minimal setup for running the notebooks in this repo.
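
After installing, a quick check like the following sketch can confirm that the packages are importable from the active environment. The helper `missing_packages` is hypothetical, and the module name `notebook` is an assumption based on `pip install jupyter` pulling in the classic Notebook package.

```python
import importlib.util

# Importable module names. "notebook" is assumed to come with `pip install jupyter`.
REQUIRED = ["notebook", "numpy", "pandas", "matplotlib"]


def missing_packages(names=REQUIRED):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    print("All set" if not missing else f"Missing: {missing}")
```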


Tech Stack

| Tool             | Purpose                          |
|------------------|----------------------------------|
| Python 3         | Core programming language        |
| Jupyter Notebook | Interactive notebook environment |
| NumPy            | Numerical computing              |
| Pandas           | Data manipulation and analysis   |
| Matplotlib       | Visualization support            |
| Git & GitHub     | Version control and hosting      |

Repository Notes

  • Some folder names in this repository include spaces (and a few include trailing spaces), so copy paths carefully when working in the terminal.
  • Most content is notebook-based (.ipynb) and designed for interactive learning.
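
To see which folder names need careful quoting, a small script along these lines can list them. Both helper functions are hypothetical, and using repr() when printing makes a trailing space visible.

```python
from pathlib import Path


def folders_with_spaces(root="."):
    """Return the names of sub-folders of root that contain a space."""
    return sorted(
        p.name for p in Path(root).iterdir() if p.is_dir() and " " in p.name
    )


def has_trailing_space(name):
    """True if the name ends with whitespace."""
    return name != name.rstrip()


if __name__ == "__main__":
    for name in folders_with_spaces("."):
        # repr() shows quotes around the name, so trailing spaces stand out.
        print(repr(name), "trailing space" if has_trailing_space(name) else "")
```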

Contributing

Contributions are welcome. If you want to improve notebooks, fix typos, or add new exercises:

  1. Fork the repository.
  2. Create a feature branch.
  3. Make your changes and commit them.
  4. Open a pull request with a short summary of the update.

Goal

To build a strong Python foundation for data science by combining conceptual learning, repeated practice, notebook-based experimentation, and small project work.


Maintained by Ayush Suthar
