Commit 1d78b7f

Add website
0 parents  commit 1d78b7f

33 files changed

Lines changed: 10828 additions & 0 deletions

docs/_config.yml

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
1+
title: Introduction to data manipulation in Python
2+
logo:
3+
description: proefcentrum Hoogstraten
4+
show_downloads: true
5+
theme: jekyll-theme-minimal

docs/contributing.md

Lines changed: 46 additions & 0 deletions
@@ -0,0 +1,46 @@
1+
---
2+
layout: default
3+
---
4+
5+
# Contributing guide
6+
7+
First of all, thanks for considering contributing to the course! 👍
8+
9+
## How you can contribute
10+
11+
There are several ways you can contribute to this course.
12+
13+
### Share the love ❤️
14+
15+
Think this course is useful? Let others discover it, by telling them in person, via Twitter or a blog post.
16+
17+
### Ask a question ⁉️
18+
19+
Trying out the material and got stuck? Post your question as an [issue on GitHub](https://github.com/plovercode/course-python-data/issues). While we cannot offer user support, we'll try to do our best to address it, as questions often lead to the discovery of bugs.
20+
21+
Want to ask a question in private? Contact the course maintainer by [email](mailto:jorisvandenbossche@gmail.com).
22+
23+
### Propose an idea 💡
24+
25+
Have an idea to improve the course? Take a look at the [issue list](https://github.com/plovercode/course-python-data/issues) to see if it isn't included or suggested yet. If not, suggest your idea as an [issue on GitHub](https://github.com/plovercode/course-python-data/issues/new).
26+
27+
### Report a bug 🐛
28+
29+
Using the course and discovered a bug or a typo? That's annoying! Don't let others have the same experience and report it as an [issue on GitHub](https://github.com/plovercode/course-python-data/issues/new)
30+
so we can fix it. A good bug report makes it easier for us to do so, so please include:
31+
32+
* Your operating system name and version (e.g. Mac OS 10.13.6).
33+
* Any details about your local setup that might be helpful in troubleshooting.
34+
* Detailed steps to reproduce the bug.
35+
36+
### Contribute code 📝
37+
38+
Care to fix issues or typos? Awesome! 👏
39+
40+
Some notes to take into account:
41+
42+
- The course material is developed in the [course-python-data](https://github.com/plovercode/course-python-data) repository. When updating course material, edit the notebooks in that repository; the other ones (the ones used in the tutorial) are generated automatically.
43+
- The exercises are cleared using the `nbtutor` notebook extension: <https://github.com/plovercode/nbtutor>
44+
45+
46+

docs/index.md

Lines changed: 56 additions & 0 deletions
@@ -0,0 +1,56 @@
1+
# Introduction to data manipulation in Python
2+
3+
## Introduction
4+
5+
The handling of data is a recurring task for data analysts. Reading in experimental data, checking its properties,
6+
and creating visualisations are crucial steps in the research process. Hence, increasing the efficiency in this process is beneficial for professionals
7+
handling data. Spreadsheet-based software lacks the ability to properly support this process, due to the lack of automation and repeatability.
8+
The usage of a high-level scripting language such as Python is ideal for these tasks.
9+
10+
This course trains participants to use Python effectively to do these tasks. The course focuses on data manipulation and cleaning of tabular data,
11+
explorative analysis and visualisation using important packages such as Pandas, Matplotlib and Seaborn.
12+
13+
The course does not cover statistics, data mining, machine learning, or predictive modelling. It aims to provide participants the means to effectively
14+
tackle commonly encountered data handling tasks in order to increase the overall efficiency. These skills are useful for both data cleaning and
15+
feature engineering.
16+
17+
The course has been developed as a course for the proefcentrum Hoogstraten, but can be taught to others upon request.
18+
19+
## Course info
20+
21+
### Aim & scope
22+
23+
This course is intended for researchers who have at least basic programming skills. A basic (scientific) programming course that is part of the regular curriculum should suffice.
24+
25+
The course is intended for professionals who wish to enhance their general data manipulation and visualization skills in Python, with a specific focus on tabular data. The course is NOT intended to be a course on statistics or machine learning.
26+
27+
### Program
28+
29+
After setting up the programming environment with the required packages using the conda package manager and an introduction of the Jupyter notebook environment, the data analysis package Pandas and the plotting packages Matplotlib and Seaborn are introduced. Advanced usage of Pandas
30+
for different data cleaning and manipulation tasks is taught and the acquired skills will immediately be brought into practice to handle real-world
31+
data sets. Applications include time series handling, categorical data, merging data, tidy data, etc.
32+
33+
The course closes with a discussion of the scientific Python ecosystem and the visualisation landscape, teaching
34+
participants to create interactive charts.
35+
36+
## Getting started
37+
38+
The course uses Python 3 and some data analysis packages such as Pandas, Seaborn, NumPy and Matplotlib. To install the required libraries, we recommend Anaconda or Miniconda ([https://conda-forge.org/download/](https://conda-forge.org/download/)) or another Python distribution that includes the scientific libraries (this recommendation applies to all platforms: Windows, Linux and Mac).
39+
40+
For detailed instructions to get started on your local machine, see the [setup instructions](./setup.html).
41+
42+
In case you do not want to install everything and just want to try out the course material, use the environment set up by
43+
Binder [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/plovercode/proefcentrum-python-data/HEAD) and open the notebooks
44+
right away (inside the `notebooks` directory).
45+
46+
## Slides
47+
48+
For the course slides, click [here](https://plovercode.github.io/proefcentrum-python-data/slides.html).
49+
50+
## Contributing
51+
52+
Found a typo or have a suggestion? See [how to contribute](./contributing.html).
53+
54+
## Meta
55+
56+
Authors: Joris Van den Bossche, Stijn Van Hoey

docs/setup.md

Lines changed: 197 additions & 0 deletions
@@ -0,0 +1,197 @@
1+
---
2+
layout: default
3+
---
4+
5+
# Course setup
6+
7+
To get started, complete the following steps:
8+
9+
1. Download the course material to your computer
10+
2. Install Python and the required Python packages using `conda`
11+
3. Test your configuration and installation
12+
4. Start Jupyter Lab
13+
14+
In the following sections, more details are provided for each of these steps. When all four are done, you are ready to start coding!
15+
16+
## 1. Getting the course materials
17+
18+
### Option 1: You are already a git user
19+
20+
As the course has been set up as a [git](https://git-scm.com/) repository managed on [GitHub](https://github.com/plovercode/proefcentrum-python-data),
21+
you can clone the entire course to your local machine. Use the command line to clone the repository and go into the course folder:
22+
23+
```
24+
git clone https://github.com/plovercode/proefcentrum-python-data.git
25+
cd proefcentrum-python-data
26+
```
27+
28+
In case you would prefer using GitHub Desktop,
29+
see [this tutorial](https://help.github.com/desktop/guides/contributing-to-projects/cloning-a-repository-from-github-to-github-desktop/).
30+
31+
### Option 2: You are not a git user
32+
33+
To download the repository to your local machine as a zip file, go to the
34+
repository page <https://github.com/plovercode/proefcentrum-python-data>, click the green "Code" button and choose `Download ZIP`:
35+
36+
![Download button](./static/img/download-button.png)
37+
38+
After the download, unzip the file in a location of your choice within your user account (e.g. `My Documents`, not `C:\`). Watch out for a nested `proefcentrum-python-data/proefcentrum-python-data` folder structure after unzipping; if you get one, move the inner `proefcentrum-python-data` folder to your preferred location.
39+
40+
__Note:__ Make sure you know where you stored the course material, e.g. `C:/Users/yourusername/Documents/proefcentrum-python-data`.
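
The nested-folder situation described above can also be checked from Python. The following is a minimal sketch, not part of the course material: the helper name `find_course_root` and the example path are assumptions made for illustration, and the check relies only on the `environment.yml` file that the course folder is expected to contain.

```python
from pathlib import Path
from typing import Optional


def find_course_root(folder: Path) -> Optional[Path]:
    """Return the folder that holds environment.yml, or None if it is not found."""
    folder = Path(folder)
    if (folder / "environment.yml").is_file():
        return folder
    # Unzipping can produce <name>/<name>, so look one level deeper.
    nested = folder / folder.name
    if (nested / "environment.yml").is_file():
        return nested
    return None


if __name__ == "__main__":
    # Hypothetical location; replace it with the path you chose.
    candidate = Path.home() / "Documents" / "proefcentrum-python-data"
    print(find_course_root(candidate))
```

If the function returns the inner folder, that is the one to move to your preferred location; if it returns `None`, the path probably does not point at the course material.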
41+
42+
## 2. Install Python and the required Python packages using `conda`
43+
44+
For scientific computing and data analysis, we recommend using `conda`, a command line tool for package and environment management (<https://docs.conda.io/projects/conda/>).
45+
`conda` allows us to install a Python distribution with the scientific libraries we will use in this course (this recommendation applies to all platforms: Windows, Linux and Mac).
46+
47+
### 2.1 Install `conda`
48+
49+
#### Option 1: I do not have `conda` installed
50+
51+
We recommend using the installer provided by the conda-forge community: <https://conda-forge.org/download/>.
52+
53+
Follow the instructions on that page, i.e. first download the appropriate installer (depending on your operating system), and then run that installer.
54+
55+
On Windows, this means double-clicking the downloaded `.exe` file and following the instructions. During installation, select the following options (tick the checkboxes):
56+
57+
- '_Register Miniforge3 as my default Python 3.12_' (in case this returns an error about an existing Python 3.12 installation, remove the existing Python installation using the [Windows Control Panel](https://support.microsoft.com/en-us/windows/uninstall-or-remove-apps-and-programs-in-windows-4b55f974-2cc6-2d2b-d092-5905080eaf98)).
58+
- '_Clear the package cache upon completion_'.
59+
60+
On macOS or Linux, open a terminal and run `bash Miniforge3-$(uname)-$(uname -m).sh`.
61+
62+
#### Option 2: I already have `conda`, Anaconda or Miniconda installed
63+
64+
When you already have an installation of `conda` or Anaconda, you have to make sure you are working with a recent version. If you installed it only a
65+
few months ago, this step is probably not needed; otherwise, follow the next steps:
66+
67+
1. Open a terminal window (on Windows, use the dedicated "Anaconda Prompt" or "Miniforge Prompt", via the Start Menu).
68+
2. Type `conda update conda` and press ENTER
69+
(make sure you have an internet connection), then respond with *Yes* by typing `y`.
70+
3. Type `conda config --add channels conda-forge` and press ENTER.
71+
4. Type `conda config --set channel_priority strict` and press ENTER.
72+
73+
If you are using Anaconda on Windows, use "Anaconda Prompt" wherever the following sections mention "Miniforge Prompt".
74+
75+
### 2.2 Setup after `conda` installation
76+
77+
Now we will use `conda` to install the Python packages we are going to use
78+
throughout this course.
79+
As a good practice, we will create a new _conda environment_ to work with.
80+
81+
The packages used in the course are listed in
82+
an [`environment.yml` file](https://raw.githubusercontent.com/plovercode/proefcentrum-python-data/main/environment.yml). The file looks as follows:
83+
84+
```
85+
name: DS-python
86+
channels:
87+
- conda-forge
88+
dependencies:
89+
- python=3.12
90+
- geopandas
91+
- ...
92+
```
93+
94+
The file contains the following information:
95+
- `name`: the name used for the environment
96+
- `channels`: where to download the packages from
97+
- `dependencies`: the packages to install
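
To make the three keys above concrete, here is a small sketch that reads them from the example file shown earlier. It is an illustration under stated assumptions: the function `read_simple_environment` is hypothetical and only understands top-level keys and `- item` lists, which is all the example uses. It is not a YAML parser, and `conda` itself does not work this way; the trailing `- ...` placeholder of the example is left out.

```python
def read_simple_environment(text):
    """Read top-level 'key: value' pairs and 'key:' followed by '- item' lines."""
    parsed = {}
    current_list = None  # list being filled by '- item' lines, if any
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("- "):
            if current_list is None:
                raise ValueError(f"list item without a key: {line!r}")
            current_list.append(line[2:].strip())
            continue
        key, _, value = line.partition(":")
        value = value.strip()
        if value:
            # 'name: DS-python' style entry
            parsed[key.strip()] = value
            current_list = None
        else:
            # 'channels:' style entry; the following '- item' lines belong to it
            current_list = parsed.setdefault(key.strip(), [])
    return parsed


EXAMPLE = """\
name: DS-python
channels:
- conda-forge
dependencies:
- python=3.12
- geopandas
"""

environment = read_simple_environment(EXAMPLE)
print("environment name:", environment["name"])
print("channels:", environment["channels"])
print("dependencies:", environment["dependencies"])
```

For real files with nesting or other YAML features, a proper YAML library is the safer choice.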
98+
99+
The environment.yml file for this course is included in the course material you
100+
downloaded.
101+
102+
Now we can create the environment:
103+
104+
1. Open the terminal window (on Windows use "Miniforge Prompt", open it via Start Menu > 'Miniforge Prompt')
105+
2. Navigate to the directory where you downloaded the course materials (that directory should contain an `environment.yml` file; double-check in your file explorer):
106+
107+
```
108+
cd FOLDER_PATH_TO_COURSE_MATERIAL
109+
```
110+
(Make sure to hit the ENTER-button to run the command)
111+
112+
3. Create the environment by typing the following command and pressing ENTER (make sure you have an internet connection):
113+
114+
```
115+
conda env create -f environment.yml
116+
```
117+
118+
__!__ `FOLDER_PATH_TO_COURSE_MATERIAL` should be replaced by the path to the folder containing the downloaded course materials (in the example above, `C:/Users/yourusername/Documents/proefcentrum-python-data`).
119+
120+
__!__ You can safely ignore the warning `FutureWarning: 'remote_definition'...`.
121+
122+
Respond with *Yes* by typing `y` when asked. Output will be printed and if no error occurs, you should have the environment configured with all packages installed.
123+
124+
When finished, keep the terminal window (or "Miniforge Prompt") open (or reopen it). Execute the following commands to check your installation:
125+
126+
```
127+
conda activate DS-python
128+
ipython
129+
```
130+
131+
Within the terminal, a Python session will start in which you can write Python code. Type the following commands:
132+
133+
```
134+
import pandas
135+
import matplotlib
136+
```
137+
138+
If no message is returned, you're all set! If a message (probably an error) is returned, copy the message and contact the instructors.
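
A silent import only shows that the packages can be loaded. To also see which versions were installed, something like the following sketch can be run in the same session. The helper name `installed_version` is an assumption made for illustration; `importlib.metadata` is part of the standard library from Python 3.8 onwards.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# The two packages imported in the check above.
for name in ["pandas", "matplotlib"]:
    print(name, installed_version(name) or "not installed")
```

Including this output when contacting the instructors can make a problem easier to diagnose.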
139+
140+
To get out of the Python session, type:
141+
142+
```
143+
quit
144+
```
145+
146+
## 3. Test your configuration
147+
148+
To check if your packages are properly installed, open the terminal (or "Miniforge Prompt") again (see above) and navigate to the course directory:
149+
150+
```
151+
cd FOLDER_PATH_TO_COURSE_MATERIAL
152+
```
153+
154+
With `FOLDER_PATH_TO_COURSE_MATERIAL` replaced by the path to the folder with the downloaded
155+
course material (in the example above, `C:/Users/yourusername/Documents/proefcentrum-python-data`).
156+
157+
Activate the newly created conda environment:
158+
159+
```
160+
conda activate DS-python
161+
```
162+
163+
Then, run the `check_environment.py` script:
164+
165+
```
166+
python check_environment.py
167+
```
168+
169+
When all checkmarks are ok, you're ready to go!
170+
171+
172+
## 4. (_start of day during course_) Starting Jupyter Notebook with Jupyter Lab
173+
174+
Each of the course modules is set up as a [Jupyter notebook](http://jupyter.org/), an interactive environment to write and run code. It is no problem if you have never used Jupyter notebooks before, as an introduction to notebooks is part of the course.
175+
176+
177+
* In the terminal (or "Miniforge Prompt"), navigate to the `proefcentrum-python-data` directory (downloaded or cloned in step 1):
178+
179+
```
180+
cd FOLDER_PATH_TO_COURSE_MATERIAL
181+
```
182+
183+
* Ensure that the correct environment is activated.
184+
185+
```
186+
conda activate DS-python
187+
```
188+
189+
* Start a Jupyter server by typing:
190+
191+
```
192+
jupyter lab
193+
```
194+
195+
## Next?
196+
197+
Running `jupyter lab` opens a browser window automatically. Navigate to the course directory (if not already there) and choose the `notebooks` folder to access the individual notebooks containing the course material.
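
Once a notebook is open, you can double-check from a code cell that it runs in the `DS-python` environment. This is a minimal sketch that assumes Jupyter Lab was started from a terminal in which `conda activate DS-python` had been run, so that the notebook inherits the `CONDA_DEFAULT_ENV` variable conda sets on activation. The helper name `active_conda_environment` is hypothetical.

```python
import os
import sys


def active_conda_environment():
    """Return the environment name conda reports as active, or None if unset."""
    # conda sets CONDA_DEFAULT_ENV when an environment is activated.
    return os.environ.get("CONDA_DEFAULT_ENV")


print("Active conda environment:", active_conda_environment())
print("Python interpreter:", sys.executable)
```

If the printed name is not `DS-python`, close Jupyter Lab, activate the environment in the terminal and start `jupyter lab` again.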
