Handling the missing data.py
import pandas as pd
import numpy as np

# Create a DataFrame with some missing values (np.nan marks a gap)
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, np.nan, 30, np.nan],
    'Score': [85, 90, np.nan, 88]
}
df = pd.DataFrame(data)
print("Original Data:")
print(df)

# Detect missing values: isnull() returns a boolean mask of the same shape
print("\nMissing Data (True = Missing):")
print(df.isnull())

# Fill missing values with each column's mean (mean() skips NaN by default)
df_filled = df.fillna({
    'Age': df['Age'].mean(),
    'Score': df['Score'].mean()
})
print("\nAfter Filling Missing Values:")
print(df_filled)

# Drop rows with any missing data; df itself is left unchanged
df_dropped = df.dropna()
print("\nAfter Dropping Rows with Missing Data:")
print(df_dropped)

# OUTPUT:
#
# Original Data:
#       Name   Age  Score
# 0    Alice  25.0   85.0
# 1      Bob   NaN   90.0
# 2  Charlie  30.0    NaN
# 3    David   NaN   88.0
#
# Missing Data (True = Missing):
#     Name    Age  Score
# 0  False  False  False
# 1  False   True  False
# 2  False  False   True
# 3  False   True  False
#
# After Filling Missing Values:
#       Name   Age      Score
# 0    Alice  25.0  85.000000
# 1      Bob  27.5  90.000000
# 2  Charlie  30.0  87.666667
# 3    David  27.5  88.000000
#
# After Dropping Rows with Missing Data:
#     Name   Age  Score
# 0  Alice  25.0   85.0
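
# Optional sketch, not part of the original script.
# The steps above fill or drop every gap at once. The lines below are a
# minimal sketch of more selective handling, assuming the same three columns
# as the example. The names `sample`, `missing_counts`, and `score_known` are
# illustrative assumptions; the frame is rebuilt here only so the sketch
# runs on its own.
import numpy as np
import pandas as pd

sample = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, np.nan, 30, np.nan],
    'Score': [85, 90, np.nan, 88],
})

# Count missing values per column to help decide how to treat each one
missing_counts = sample.isnull().sum()
print("\nMissing Values per Column:")
print(missing_counts)

# Drop only the rows where 'Score' is missing; rows that lack 'Age' are kept
score_known = sample.dropna(subset=['Score'])
print("\nRows with a Known Score:")
print(score_known)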