<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<link
href="https://fonts.googleapis.com/css?family=Montserrat:100,200,300,400,900|Ubuntu&display=swap"
rel="stylesheet"
/>
<link
rel="stylesheet"
href="https://maxcdn.bootstrapcdn.com/bootstrap/4.0.0/css/bootstrap.min.css"
integrity="sha384-Gn5384xqQ1aoWXA+058RXPxPg6fy4IWvTNh0E263XmFcJlSAwiGgFAW/dAiS6JXm"
crossorigin="anonymous"
/>
<link rel="stylesheet" href="css/style.css" />
<title>NYCHA</title>
</head>
<body>
<section class="top container-fluid">
<header>
<h1 class="title" id="colorMove">NYCHA's Analysis</h1>
<h2 class="title-h2">(New York City Housing Authority)</h2>
</header>
</section>
<section class="middle">
<div class="row">
<div class="col-md-3" id="particles-js"></div>
<div class="col-md-6 middle-content">
<h1 class="middle-title">Part 2</h1>
<div class="intro">
<h2>Measures of Spread and Center</h2>
<table class="table">
<thead>
<tr>
<th class="table-head">Measure</th>
<th class="table-head">Female Household Headed Families</th>
<th class="table-head">Male Household Headed Families</th>
</tr>
</thead>
<tbody>
<tr>
<td>Mean</td>
<td>22378.82</td>
<td>6580.667</td>
</tr>
<tr>
<td>Median</td>
<td>6654</td>
<td>2467</td>
</tr>
<tr>
<td>Variance</td>
<td>1304129049</td>
<td>110608654</td>
</tr>
<tr>
<td>Standard Deviation</td>
<td>36112.73</td>
<td>10517.06</td>
</tr>
</tbody>
</table>
<p class="intro-desc">
The mean is well above the median in both cases. This is not
surprising since, as seen in the previous graphs, the female-headed
household graph covers a greater range on the x-axis than the
male-headed one. Furthermore, both distributions are skewed to the
right, but the Total Female Headed Household Families graph has a
greater mean than the male one; this might be attributed to the
disparity in income, which would explain the large difference in
numbers. Additionally, the Total Male Headed Household Families
has a standard deviation that is fairly close to the mean, again
showing that its graph has a smaller spread across the data set.
</p>
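<p class="intro-desc">
The measures in the table can be reproduced with a few lines of
Python. This is only a minimal sketch: the function name
<code>summarize</code> and the sample values are hypothetical, not
the NYCHA data, and it assumes the sample (n - 1) versions of
variance and standard deviation were used.
</p>
<pre>
```python
import statistics


def summarize(values):
    """Return the mean, median, sample variance and sample standard deviation."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "variance": statistics.variance(values),  # sample variance (n - 1)
        "std_dev": statistics.stdev(values),      # sample standard deviation
    }


# Hypothetical right-skewed counts, NOT the NYCHA data.
example_counts = [120, 340, 560, 900, 2400, 6600, 41000]

summary = summarize(example_counts)

# A mean well above the median is a common sign of right skew.
right_skewed = summary["mean"] > summary["median"]
print(summary, right_skewed)
```
</pre>
<p class="intro-desc">
In this made-up sample a single large value pulls the mean far above
the median, which is the same pattern the table shows for both
household groups.
</p>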
</div>
<hr />
<!-- ScatterPlot -->
<div class="scatterplots">
<h2>Scatterplots and Correlation</h2>
<div class="scatterplot-img">
<img
class="img-fluid page2-img"
src="img/project_regression_line.png"
alt="Total Families"
/>
</div>
<p class="var-description">
From the graph we can see that there appears to be a linear
association between the Total Female Headed Household Families and
the Total Male Headed Household Families. The strong correlation of
0.9973634 between the two variables can be attributed to families'
need to rent a house or apartment: as that need grows, the number
of applicants to the NYCHA increases in both groups. Additionally,
there is a gap on the x-axis between 6000 and 8000. This is the
same gap that was discussed previously, and it might be caused by
a lack of data for a period of time, or by values so small that
they were left out.
</p>
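<p class="var-description">
As a rough illustration of how a correlation coefficient like the
one above is obtained, the sketch below computes the Pearson
correlation from its definition. The helper <code>pearson_r</code>
and the paired values are hypothetical and are not taken from the
NYCHA data.
</p>
<pre>
```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical paired counts, NOT the NYCHA data.
female = [100, 250, 400, 900, 1500]
male = [30, 80, 115, 270, 440]

r = pearson_r(female, male)
print(round(r, 4))
```
</pre>
<p class="var-description">
A value close to 1 means the points lie close to an upward-sloping
straight line, which is what the scatterplot suggests.
</p>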
</div>
<hr />
<!-- Confidence Interval -->
<div class="confidence-interval">
<h2>Confidence Interval</h2>
<table class="table">
<thead>
<tr>
<th class="table-head">Measure</th>
<th class="table-head">Female Household Headed Families</th>
<th class="table-head">Male Household Headed Families</th>
</tr>
</thead>
<tbody>
<tr>
<td>Mean</td>
<td>22378.82</td>
<td>6580.667</td>
</tr>
<tr>
<td>Median</td>
<td>6654</td>
<td>2467</td>
</tr>
<tr>
<td>Variance</td>
<td>1304129049</td>
<td>110608654</td>
</tr>
<tr>
<td>Standard Deviation</td>
<td>36112.73</td>
<td>10517.06</td>
</tr>
<tr>
<td>Confidence Interval</td>
<td>[ 22189.65, 22567.99 ]</td>
<td>[ 6477.597, 6683.737 ]</td>
</tr>
</tbody>
</table>
<p class="var-description">
In both cases the bounds of the confidence interval are positive,
which is expected since the data are counts of families and the
sample means are large compared to the margins of error. Each
interval is centered on its sample mean, and we can be 95%
confident that it contains the true mean. Another observation is
that the female interval is wider than the male one, which
reflects the greater spread of the data in the female household
graph.
</p>
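<p class="var-description">
An interval of this kind can be sketched as the sample mean plus or
minus a margin of error. The sketch below assumes a
normal-approximation interval with z = 1.96 for the 95% level; the
page does not state the sample size or whether a z or t interval was
used, so the function <code>z_confidence_interval</code> and its
inputs are hypothetical.
</p>
<pre>
```python
import math


def z_confidence_interval(mean, std_dev, n, z=1.96):
    """Normal-approximation interval: mean +/- z * std_dev / sqrt(n)."""
    margin = z * std_dev / math.sqrt(n)
    return mean - margin, mean + margin


# Hypothetical inputs, NOT the NYCHA data.
low, high = z_confidence_interval(mean=100.0, std_dev=10.0, n=25)
print(low, high)
```
</pre>
<p class="var-description">
With these made-up inputs the margin is 1.96 * 10 / 5 = 3.92, so the
interval runs from roughly 96.08 to 103.92. A larger standard
deviation widens the interval and a larger sample size narrows it.
</p>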
</div>
<hr />
<div class="regression-line">
<h2>Linear Regression</h2>
<p>
<b>Equation of the Regression Line</b>:
<em>y = 0.2905x + 80.4947</em>, where x is the Total Female Headed
Household Families and the line is used to predict the ratio of
female to male household heads:
</p>
<div class="scatterplot-img">
<img
class="img-fluid page2-img"
src="img/project_regression_line.png"
alt="Total Families"
/>
</div>
<p class="var-description">
The graph on the left is the same one shown earlier when we
looked at the correlation in the scatterplot. On the right is the
residual plot, which shows a roughly normal shape, suggesting that
the line is a good fit. This is confirmed by the
<b>R-Squared</b> value of 0.99868, which indicates a strong
linear association between the variables. In the plot of the
observed response value (x-axis) against the residual (y-axis),
the residuals do not increase as the observed response value
increases.
</p>
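<p class="var-description">
The fit and the R-Squared check described above can be sketched with
ordinary least squares. The helpers <code>fit_line</code> and
<code>r_squared</code> and the small data set are hypothetical. Only
the coefficients 0.2905 and 80.4947 come from this page, and the
input x = 10000 used with them is an arbitrary example value.
</p>
<pre>
```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    ss_xx = sum((x - mean_x) ** 2 for x in xs)
    ss_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = ss_xy / ss_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Share of the variance in ys that the fitted line explains."""
    mean_y = sum(ys) / len(ys)
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    ss_res = sum(e ** 2 for e in residuals)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Hypothetical data, NOT the NYCHA data.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(xs, ys)
r2 = r_squared(xs, ys, slope, intercept)
print(slope, intercept, r2)

# Evaluating the line reported on this page at an arbitrary x.
reported_prediction = 0.2905 * 10000 + 80.4947
print(reported_prediction)
```
</pre>
<p class="var-description">
Residuals that scatter around zero without a trend, together with an
R-Squared close to 1, are the two signs of a good fit that the
paragraph above relies on.
</p>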
</div>
<hr />
<div class="hypothesis">
<!-- Test 1 -->
<h2>Hypothesis</h2>
<h3>Test 1</h3>
<p class="var-description">
Based on the NYCHA data, suppose we want to know whether the
percentage of females in charge of the family is higher than that
of males. Taking a subset of the Total Male Headed Household
Families and the Total Female Headed Household Families, we found
means of 1626.792 and 5366.042 and sample sizes of 14438 and
19068, respectively. Can we conclude that there are more females
in charge of families than males? We assume a significance level
of 0.05:
</p>
<p class="var-description">
<span
><em
>Let μ<sub>x</sub> be the mean of the Total Female Headed
Household Families</em
></span
>
<span
><em
>Let μ<sub>y</sub> be the mean of the Total Male Headed
Household Families</em
></span
>
</p>
<p class="var-description">
<span>Null and Alternative Hypothesis:</span>
<span>H<sub>0</sub>: μ<sub>x</sub> = μ<sub>y</sub></span>
<span>H<sub>1</sub>: μ<sub>x</sub> > μ<sub>y</sub></span>
<span>Test Statistic = 0.01598</span>
Since the significance level is 0.05, α = 0.05, with a critical
value of z<sub>α</sub> = z<sub>0.05</sub> = −1.64. With this
information we reject the null hypothesis and conclude that, in
the NYCHA programs, more families are headed by females than by
males.
</p>
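<p class="var-description">
For reference, a right-tailed two-sample z-test can be sketched as
follows. This is a generic illustration rather than the code behind
the numbers above: the functions <code>two_sample_z</code> and
<code>reject_right_tailed</code> are hypothetical, and the standard
deviations of the subsets are not given on this page, so the inputs
are made up.
</p>
<pre>
```python
import math
from statistics import NormalDist


def two_sample_z(mean_x, mean_y, sd_x, sd_y, n_x, n_y):
    """z statistic for the difference of two independent sample means."""
    standard_error = math.sqrt(sd_x ** 2 / n_x + sd_y ** 2 / n_y)
    return (mean_x - mean_y) / standard_error


def reject_right_tailed(z, alpha=0.05):
    """Right-tailed test (H1: mu_x > mu_y): reject H0 when z exceeds z_alpha."""
    critical = NormalDist().inv_cdf(1 - alpha)  # about 1.645 for alpha = 0.05
    return z > critical


# Hypothetical inputs, NOT the NYCHA subsets.
z = two_sample_z(mean_x=52.0, mean_y=50.0, sd_x=5.0, sd_y=5.0, n_x=100, n_y=100)
print(z, reject_right_tailed(z))
```
</pre>
<p class="var-description">
In the usual right-tailed rule at α = 0.05, the null hypothesis is
rejected only when the statistic is larger than about 1.645.
</p>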
<!-- Test 2 -->
<h3>Test 2</h3>
<p class="var-description">
Based on the NYCHA data, suppose we want to know whether the
percentage of females in charge of the family differs from that of
males. Taking a subset of the Total Male Headed Household Families
and the Total Female Headed Household Families, we found means of
342.375 and 469.6667 and sample sizes of 187 and 656,
respectively. Can we conclude that the two percentages are
different? We assume a significance level of 0.05:
</p>
<p class="var-description">
<span
><em
>Let μ<sub>x</sub> be the mean of the Total Female Headed
Household Families</em
></span
>
<span
><em
>Let μ<sub>y</sub> be the mean of the Total Male Headed
Household Families</em
></span
>
</p>
<p class="var-description">
<span>Null and Alternative Hypothesis:</span>
<span>H<sub>0</sub>: μ<sub>x</sub> = μ<sub>y</sub></span>
<span>H<sub>1</sub>: μ<sub>x</sub> ≠ μ<sub>y</sub></span>
<span>Test Statistic = 0.01896</span>
Since the significance level is 0.05, α = 0.05, with a critical
value of z<sub>α</sub> = z<sub>0.05</sub> = −1.64. With this we
can reject the null hypothesis, since the test statistic is not in
the critical value range. We thus conclude that, based on the
NYCHA data, the percentages of females and males in charge of the
household are not the same.
</p>
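<p class="var-description">
A two-sided decision can be sketched in the same way. This is a
generic illustration of the standard rule, not the code used for
this page; the function names <code>two_sided_p_value</code> and
<code>reject_two_sided</code> are hypothetical.
</p>
<pre>
```python
from statistics import NormalDist


def two_sided_p_value(z):
    """Two-sided p-value for a standard normal test statistic."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


def reject_two_sided(z, alpha=0.05):
    """Two-sided test (H1: mu_x != mu_y): reject H0 when |z| exceeds z_(alpha/2)."""
    critical = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    return abs(z) > critical


# Hypothetical statistic, NOT a value from the NYCHA data.
example_z = 2.5
print(two_sided_p_value(example_z), reject_two_sided(example_z))
```
</pre>
<p class="var-description">
Because the alternative is two-sided, α is split between both tails,
so the usual cutoff at α = 0.05 is about 1.96 in absolute value.
</p>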
<p class="github">
Code available on <a href="https://github.com/mnti125">GitHub</a>
</p>
<a href="index.html"
><button class="btn bnt-lg btn-outline-dark continue-back">
Back
</button></a
>
</div>
</div>
<div class="col-md-3" id="particles2-js"></div>
</div>
</section>
<!-- Footer -->
<footer>
<p class="foot">@2020 MAT 327</p>
</footer>
<script src="particles.js"></script>
<script src="app.js"></script>
</body>
</html>