Statistical-Inference/01-samplingdistr.Rmd at master · WdeNooy/Statistical-Inference · GitHub

# Sampling Distribution: How Different Could My Sample Have Been? {#samp-dist}

> Key concepts: inferential statistics, generalization, population, random sample, sample statistic, sampling space, random variable, sampling distribution, probability, probability distribution, discrete probability distribution, expected value/expectation, unbiased estimator, parameter, (downward) biased, representative sample, continuous variable, continuous probability distribution, probability density, (left-hand/right-hand) probability.

Watch this micro lecture on sampling distributions for an overview of the chapter.

```{r, echo=FALSE, out.width="640px", fig.pos='H', fig.align='center', dev="png", screenshot.opts = list(delay = 5)}
knitr::include_url("https://www.youtube.com/embed/GuJWgkSHywg", height = "360px")
```

### Summary {-}

```{block2, type='rmdimportant'}
What does our sample tell us about the population from which it was drawn?
```

Statistical inference is about estimation and null hypothesis testing. We have collected data on a random sample and we want to draw conclusions (make inferences) about the population from which the sample was drawn. From the proportion of yellow candies in our sample bag, for instance, we want to estimate a plausible range of values for the proportion of yellow candies in a factory's stock (confidence interval). Alternatively, we may want to test the null hypothesis that one fifth of the candies in a factory's stock is yellow.

The sample does not offer a perfect miniature image of the population. If we drew another random sample, it would most likely have different characteristics. For instance, it could contain more or fewer yellow candies than the previous sample. To make an informed decision on the confidence interval or null hypothesis, we must compare the characteristic of the sample that we have drawn to the characteristics of the samples that we could have drawn.

The characteristics of the samples that we could have drawn constitute a sampling distribution. Sampling distributions are the central element in estimation and null hypothesis testing. In this chapter, we simulate sampling distributions to understand what they are. Here, _simulation_ means that we let a computer draw many random samples from a population.

In Communication Science, we usually work with samples of human beings, for instance, users of social media, people looking for health information or entertainment, citizens preparing to cast a political vote, an organization's stakeholders, or samples of media content such as tweets, tv advertisements, or newspaper articles. In the current and two subsequent chapters, however, we avoid the complexities of these samples.

We focus on a very tangible kind of sample, namely a bag of candies, which helps us understand the basic concepts of statistical inference: sampling distributions (the current chapter), probability distributions (Chapter \@ref(probmodels)), and estimation (Chapter \@ref(param-estim)). Once we thoroughly understand these concepts, we turn to Communication Science examples.

## Statistical Inference: Making the Most of Your Data

Statistics is a tool for scientific research. It offers a range of techniques to check whether statements about the observable world are supported by data collected from that world. Scientific theories strive for general statements, that is, statements that apply to many situations. Checking these statements requires lots of data covering all situations addressed by theory.

Collecting data, however, is expensive, so we would like to collect as little data as possible and still be able to draw conclusions about a much larger set. The cost and time involved in collecting large sets of data are also relevant to applied research, such as market research. In this context, too, we want to collect no more data than necessary.

_Inferential statistics_ offers techniques for making statements about a larger set of observations from data collected for a smaller set of observations. The large set of observations about which we want to make a statement is called the _population_. The smaller set is called a _sample_. We want to _generalize_ a statement about the sample to a statement about the population from which the sample was drawn.

Traditionally, statistical inference is generalization from the data collected in a _random sample_ to the population from which the sample was drawn. This approach is the focus of the present book because it is currently the most widely used type of statistical inference in the social sciences. We will, however, point out other approaches in Chapter \@ref(crit-discus).

Statistical inference is conceptually complicated and for that reason quite often used incorrectly. We will therefore spend quite some time on the principles of statistical inference. A good understanding of the principles should help you to recognize and avoid incorrect use of statistical inference. In addition, it should help you to understand the controversies surrounding statistical inference and the current developments in how it is applied. Investing time and energy in fully understanding the principles of statistical inference really pays off later.

## A Discrete Random Variable: How Many Yellow Candies in My Bag? {#discreterandomvariable}

An obvious but key insight in statistical inference is this: If we draw random samples from the same population, we are likely to obtain different samples. Two random samples from the same population need not be identical, even though they can be.

### Sample statistic
We are usually interested in a particular characteristic of the sample rather than in the exact nature of each observation within the sample. For instance, I happen to be very fond of yellow candies. If I buy a bag of candies, my first impulse is to tear the bag open and count the number of yellow candies. Am I lucky today? Does my bag contain a lot of yellow candies?

```{r random-variable, fig.pos='H', fig.align='center', fig.cap="How many yellow candies will our sample bag contain?", echo=FALSE, out.width="420px", screenshot.opts = list(delay = 5), dev="png"}
#interactive content: a button to draw a sample from a population of points uniformly distributed over five colours, display the sample (as a set of coloured circles) and the number of yellow candies with each sample ; relevant expectations: (1) number of yellow candies per sample varies (variable), (2) this number depends on chance (random variable), (3) the number may range from 0 to 10 (sampling space), (4) the most likely number of yellow candies is two (expected value, expectation).
knitr::include_app("http://82.196.4.233:3838/apps/random-variable/", height="360px")
```

<A name="question1.2.1"></A>
```{block2, type='rmdquestion'}
1. Figure \@ref(fig:random-variable) shows a population of candies. What do you expect the number of yellow candies to be in a random sample of ten candies from this population? Draw several samples and check whether your expectation comes true. [<img src="icons/2answer.png" width=115px align="right">](#answer1.2.1)
```

<A name="question1.2.2"></A>
```{block2, type='rmdquestion'}
2. What are all the possible outcomes for the number of yellow candies in our sample? The collection of all possible outcome scores is called the _sampling space_. [<img src="icons/2answer.png" width=115px align="right">](#answer1.2.2)
```

The number of yellow candies in a bag is an example of a _sample statistic_: a value describing a characteristic of the sample. Each bag, that is, each sample, has one outcome score on the sample statistic. For instance, one bag contains four yellow candies, another bag contains seven, and so on. All possible outcome scores constitute the _sampling space_. A bag of ten candies may contain 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 yellow candies. The numbers 0 to 10 are the sampling space of the sample statistic _number of yellow candies in a bag_.

The sample statistic is called a _random variable_. It is a variable because different samples can have different scores. The value of a variable may vary from sample to sample. It is a random variable because the score depends on chance, namely the chance that a particular sample is drawn.
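The terms introduced above can be tried out in a few lines of R. This is only a minimal sketch: the four non-yellow colour names, the seed, and the use of sampling with replacement (to mimic a very large stock) are assumptions made for illustration, not part of the interactive app.

```r
# Hypothetical population: five colours in equal shares, so 20% is yellow.
# Only "yellow" matters here; the other colour names are made up.
colours <- c("yellow", "red", "green", "blue", "orange")

set.seed(123)  # arbitrary seed, only to make the sketch reproducible

# One bag (sample) of ten candies. replace = TRUE mimics a very large stock,
# in which taking out one candy hardly changes the colour shares.
bag <- sample(colours, size = 10, replace = TRUE)

# The sample statistic: the number of yellow candies in this bag.
n_yellow <- sum(bag == "yellow")

# The sampling space: all outcome scores that the sample statistic can take.
sampling_space <- 0:length(bag)

print(bag)
print(n_yellow)
print(sampling_space)
```

Running the sketch again with a different seed will usually give a different value of `n_yellow`, which is exactly why the sample statistic is a random variable.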

### Sampling distribution
Some sample statistic outcomes occur more often than other outcomes. We can see this if we draw very many random samples from a population and collect the frequencies of all outcome scores in a table or chart. We call the distribution of the outcome scores of very many samples a _sampling distribution_.
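As a rough illustration, we can let R draw the samples for us. The sketch below assumes that each candy in a bag is yellow with probability .2, independently of the other candies, so that `rbinom()` can generate the number of yellow candies per bag directly. The number of samples and the seed are arbitrary choices.

```r
set.seed(2024)  # arbitrary seed for reproducibility

n_samples   <- 1000  # number of simulated bags (samples)
bag_size    <- 10    # candies per bag
prop_yellow <- 0.2   # assumed share of yellow candies in the population

# For each simulated bag, rbinom() returns the number of yellow candies.
yellow_counts <- rbinom(n_samples, size = bag_size, prob = prop_yellow)

# Frequency of every outcome score in the sampling space 0, 1, ..., 10.
# factor(..., levels = ) keeps outcomes that never occurred, with count 0.
freq <- table(factor(yellow_counts, levels = 0:bag_size))
print(freq)

# Optional chart of the simulated sampling distribution:
# barplot(freq, xlab = "Number of yellow candies", ylab = "Number of samples")
```

Note that the cases counted in `freq` are samples (bags), not candies.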

```{r sampling-distribution, fig.pos='H', fig.align='center', fig.cap="What is a sampling distribution?", echo=FALSE, screenshot.opts = list(delay = 5), dev="png", out.width="775px"}
#interactive content: three histograms: a uniformly distributed discrete population of five colours on top, a sample in the middle (initially empty), and a sampling distribution in the bottom (initially empty) ; first button allows to draw one sample, simulating drawing one sample from the population and adding the number of yellow candies to the bottom histogram (ideally, the candies 'drop' from the population to the sample, then the number of yellow candies appears below the sample and this number 'drops' from the sample to the sampling distribution) ; second button (becomes active after the first button has been used) draws 1,000 samples and adds the yellow candy counts for all samples to the sampling distribution in one go
knitr::include_app("http://82.196.4.233:3838/apps/sampling-distribution/", height="425px")
```

<A name="question1.2.3"></A>
```{block2, type='rmdquestion'}
3. Draw a random sample of ten candies in Figure \@ref(fig:sampling-distribution). What do the numbers on the horizontal axis of the bottom-right histogram mean? And what does the vertical axis of this histogram represent? [<img src="icons/2answer.png" width=115px align="right">](#answer1.2.3)
```

<A name="question1.2.4"></A>
```{block2, type='rmdquestion'}
4. What are the cases (the 'things' that are counted) in the three histograms? Hint: There are two different types of cases. [<img src="icons/2answer.png" width=115px align="right">](#answer1.2.4)
```

<A name="question1.2.5"></A>
```{block2, type='rmdquestion'}
5. Guess the most likely and most unlikely outcome scores for the number of yellow candies in a sample bag containing ten candies in Figure \@ref(fig:sampling-distribution). Check your intuitions by drawing 1,000 samples. [<img src="icons/2answer.png" width=115px align="right">](#answer1.2.5)
```

### Probability and probability distribution {#probdistribution}

What is the probability of buying a bag with exactly five yellow candies? In statistical terminology, what is the probability of drawing a sample with five yellow candies as sample statistic outcome? This probability is the proportion of all possible samples that we could have drawn that happen to contain five yellow candies.

Of course, the probability of a sample bag with exactly five yellow candies depends on the share of yellow candies in the population of all candies. Figure \@ref(fig:probability-distribution) displays the probabilities of a sample bag with a particular number of yellow candies if twenty per cent of the candies in the population are yellow. You can adjust the population share of yellow candies to see what happens.

```{r probability-distribution, echo=FALSE, fig.pos='H', fig.align='center', fig.cap="How does the probability of drawing a sample bag with two out of ten candies yellow depend on the proportion of yellow candies in the population?", out.width="420px", screenshot.opts = list(delay = 5), dev="png"}
#Generate a binomial probability distribution for the number of yellow candies in a random sample of ten from a population with the specified proportion of yellow candies and display this as a table ; the user is able to change the population proportion (range [0.0 - 1.0]), which is initially set to .2.
knitr::include_app("http://82.196.4.233:3838/apps/probability-distribution/", height="590px")
```

<A name="question1.2.6"></A>
```{block2, type='rmdquestion'}
6. In Figure \@ref(fig:probability-distribution), what is the sample statistic and what is the sampling space? [<img src="icons/2answer.png" width=115px align="right">](#answer1.2.6)
```

<A name="question1.2.7"></A>
```{block2, type='rmdquestion'}
7. Which number of yellow candies is most likely to be found in a sample bag of ten candies (before you change the slider setting)? How does this relate to the proportion of yellow candies in the population? [<img src="icons/2answer.png" width=115px align="right">](#answer1.2.7)
```

<A name="question1.2.8"></A>
```{block2, type='rmdquestion'}
8. What is the probability that a sample bag of ten candies contains at most three yellow candies if the proportion in the population is .2? [<img src="icons/2answer.png" width=115px align="right">](#answer1.2.8)
```

<A name="question1.2.9"></A>
```{block2, type='rmdquestion'}
9. What do you expect to happen to the probabilities if you increase the proportion of yellow candies in the population (factory stock)? Use the slider to check your answer. [<img src="icons/2answer.png" width=115px align="right">](#answer1.2.9)
```

<A name="question1.2.10"></A>
```{block2, type='rmdquestion'}
10. What is special about the distribution if the proportion of yellow candies in the population is .5? [<img src="icons/2answer.png" width=115px align="right">](#answer1.2.10)
```

The sampling distribution shows the outcomes of all samples that we could have drawn. We can use it to get the probability of buying a bag with exactly five yellow candies: We divide the number of samples with five yellow candies by the total number of samples. For example, if 26 out of 1,000 samples have five yellow candies, the proportion of samples with five yellow candies is 26 / 1000 = 0.026. Then, the probability of drawing a sample with five yellow candies is 0.026 (we usually write: .026).

If we change the frequencies in the sampling distribution into proportions, we obtain the _probability distribution_ of the sample statistic: A sampling space with a probability (between 0 and 1) for each outcome of the sample statistic. Because we are usually interested in probabilities, sampling distributions tend to have proportions, that is, probabilities, instead of frequencies on the vertical axis. See Figure \@ref(fig:expected-value) for an example.
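The step from frequencies to proportions can be sketched in R as follows. The simulation again assumes a binomial model with ten candies per bag and a population proportion of .2. The exact probabilities from `dbinom()` are shown next to the simulated proportions for comparison. The seed and the number of samples are arbitrary.

```r
set.seed(1)  # arbitrary seed for reproducibility

# Simulate 1,000 bags of ten candies; each value is a number of yellow candies.
yellow_counts <- rbinom(1000, size = 10, prob = 0.2)
freq <- table(factor(yellow_counts, levels = 0:10))

# Frequencies -> proportions: an estimated probability for each outcome.
simulated_prob <- as.numeric(freq) / sum(freq)

# Exact binomial probabilities for the same outcomes.
exact_prob <- dbinom(0:10, size = 10, prob = 0.2)

comparison <- data.frame(
  yellow    = 0:10,
  simulated = simulated_prob,
  exact     = round(exact_prob, 3)
)
print(comparison)
```

The simulated proportions only approximate the exact probabilities. The approximation tends to improve as we draw more samples.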

Figure \@ref(fig:probability-distribution) displays the probability distribution of the number of yellow candies per bag of ten candies. This is an example of a _discrete probability distribution_ because only a limited number of outcomes are possible. It is feasible to list the probability of each outcome separately.

The sampling distribution as a probability distribution conveys very important information. It tells us which outcomes we can expect: in our example, how many yellow candies we may find in our bag of ten candies. Moreover, it tells us the probability that a particular outcome occurs. If the sample is drawn from a population in which 20% of candies are yellow, we are quite likely to find zero, one, two, three, or four yellow candies in our bag. A bag with five yellow candies would be rare, six or seven yellow candies would be very rare, and a bag with more than seven yellow candies is extremely unlikely but not impossible. If we buy such a bag, we know that we have been extremely lucky.
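To put numbers on "quite likely", "rare", and "extremely unlikely", we can add up binomial probabilities in R. This is a sketch under the assumption that the number of yellow candies follows a binomial distribution with ten candies per bag and a population proportion of .2. The variable names are our own.

```r
bag_size    <- 10   # candies per bag
prop_yellow <- 0.2  # assumed share of yellow candies in the population

# Zero to four yellow candies: cumulative probability up to and including 4.
p_zero_to_four <- pbinom(4, size = bag_size, prob = prop_yellow)

# Exactly five yellow candies.
p_five <- dbinom(5, size = bag_size, prob = prop_yellow)

# Six or seven yellow candies: sum of two separate outcome probabilities.
p_six_or_seven <- sum(dbinom(6:7, size = bag_size, prob = prop_yellow))

# More than seven yellow candies: upper tail beyond 7.
p_more_than_seven <- pbinom(7, size = bag_size, prob = prop_yellow,
                            lower.tail = FALSE)

print(round(c(zero_to_four    = p_zero_to_four,
              five            = p_five,
              six_or_seven    = p_six_or_seven,
              more_than_seven = p_more_than_seven), 5))
```

The four probabilities cover the entire sampling space, so together they add up to one.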

We may refer to probabilities both as a proportion, that is, a number between 0 and 1, and as a percentage: a number between 0% and 100%. Proportions are commonly considered to be the correct way to express probabilities. When we talk about probabilities, however, we tend to use percentages; we may, for example, say that the probabilities are fifty-fifty.

### Expected value or expectation {#expectedvalue}

We haven't yet thought about the value that we are most likely to encounter in the sample that we are going to draw. Intuitively, it must be related to the distribution of colours in the population of candies from which the sample was drawn. In other words, the share of yellow candies in the factory's stock from which the bag was filled, or in the machine that produces the candies, seems to be relevant to what we may expect to find in our sample.

```{r expected-value, echo=FALSE, out.width="420px", fig.pos='H', fig.align='center', fig.cap="What is the expected value of a probability distribution?", screenshot.opts = list(delay = 5), dev="png"}
#Generate a binomial probability distribution for the number of yellow candies in a random sample of ten from a population with the specified proportion of yellow candies and display this as a histogram ; the user is able to change the population proportion, which is initially set to .2 ; add a button to reveal the average of the probability distribution in the histogram as a vertical line with associated value.
knitr::include_app("http://82.196.4.233:3838/apps/expected-value/", height="420px")
```

<A name="question1.2.11"></A>
```{block2, type='rmdquestion'}
11. In Figure \@ref(fig:expected-value), which number of yellow candies is most likely to occur in a sample bag of ten candies (before you change the slider setting)? How does this number change if you change the proportion of yellow candies in the population? [<img src="icons/2answer.png" width=115px align="right">](#answer1.2.11)
```

<A name="question1.2.12"></A>
```{block2, type='rmdquestion'}
12. How does the mean of the sampling distribution relate to the expected value? Experiment with different values for the population proportion. [<img src="icons/2answer.png" width=115px align="right">](#answer1.2.12)
```

<A name="question1.2.13"></A>
```{block2, type='rmdquestion'}
13. How does the mean of the sampling distribution relate to the population proportion? Experiment with different values for the population proportion. [<img src="icons/2answer.png" width=115px align="right">](#answer1.2.13)
```

If the share of yellow candies in the population is 0.20 (or 20%), we expect one out of each five candies in a bag (sample) to be yellow. In a bag with 10 candies, we would expect two candies to be yellow: one out of each five candies or the population proportion times the total number of candies in the sample = 0.20 * 10 = 2.0. This is the expected value.

The expected value of the proportion of yellow candies in the sample is equal to the proportion of yellow candies in the population. If you carefully inspect a sampling distribution (Figure \@ref(fig:expected-value)), you will see that the expected value also equals the mean of the sampling distribution. This makes sense: Excess yellow candies in some bags must be compensated for by a shortage in other bags.

Thus we arrive at the definition of the _expected value_ of a random variable:

```{block2, type='rmdimportant'}
The expected value is the average of the sampling distribution of a random variable.
```

In our example, the random variable is a sample statistic, more specifically, the number of yellow candies in a sample.

The sampling distribution is an example of a probability distribution, so, more generally, the expected value is the average of a probability distribution. The expected value is also called the _expectation_ of a probability distribution.
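As a small check of this definition, the sketch below computes the average of the probability distribution of the number of yellow candies: each outcome is weighted by its probability. It assumes the binomial distribution with ten candies per bag and a population proportion of .2 that is used in the figures. The variable names are our own.

```r
bag_size    <- 10
prop_yellow <- 0.2

# The sampling space and the probability of each outcome.
outcomes <- 0:bag_size
probs    <- dbinom(outcomes, size = bag_size, prob = prop_yellow)

# Expected value: every outcome weighted by its probability, then summed.
expected_value <- sum(outcomes * probs)

# Shortcut from the text: population proportion times sample size.
shortcut <- prop_yellow * bag_size

print(expected_value)
print(shortcut)
```

Both calculations give an expected value of two yellow candies per bag, apart from rounding in floating-point arithmetic.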

### Unbiased estimator {#unbiased-est}

Note that the expected value of the proportion of yellow candies in the bag (sample statistic) equals the true proportion of yellow candies in the candy factory (population statistic). For this reason, the sample proportion is an unbiased estimator of the proportion in the population. More generally, a sample statistic is called an _unbiased estimator_ of the population statistic if the expected value (mean of the sampling distribution) is equal to the population statistic. By the way, we usually refer to the population statistic as a _parameter_.

Most but not all sample statistics are unbiased estimators of the population statistic. Think, for instance, of the actual number of yellow candies in the sample. This is certainly not an unbiased estimator of the number of yellow candies in the population. Because the population is so much larger than the sample, the population must contain many more yellow candies than the sample. If we were to estimate the number in the population (the parameter) from the number in the sample---for instance, we estimate that there are two yellow candies in the population of all candies because we have two in our sample of ten---we are going to vastly underestimate the number in the population. This estimate is _downward biased_: It is too low.

In contrast, the proportion in the sample is an unbiased estimator of the population proportion. That is why we do not use the number of yellow candies to generalize from our sample to the population. Instead, we use the proportion of yellow candies. You probably already did this intuitively.
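The difference between the two estimators can be made visible with a simulation. In this sketch, the size of the factory stock (100,000 candies), the number of simulated bags, and the seed are hypothetical values chosen only for illustration.

```r
set.seed(42)  # arbitrary seed for reproducibility

bag_size        <- 10
prop_yellow     <- 0.2     # population proportion (the parameter)
population_size <- 100000  # hypothetical size of the factory stock

# Number of yellow candies in the hypothetical population.
n_yellow_population <- prop_yellow * population_size

# Simulate many bags and compute both sample statistics for every bag.
yellow_counts <- rbinom(10000, size = bag_size, prob = prop_yellow)
sample_props  <- yellow_counts / bag_size

# Means of the two simulated sampling distributions.
mean_prop  <- mean(sample_props)   # close to the population proportion
mean_count <- mean(yellow_counts)  # close to 2, far below the population count

print(c(mean_prop = mean_prop, population_prop = prop_yellow))
print(c(mean_count = mean_count, population_count = n_yellow_population))
```

The average sample proportion ends up near the population proportion. The average sample count stays near two, however large the population is.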

Sometimes, we have to adjust the way in which we calculate a sample statistic to get an unbiased estimator. For instance, we must calculate the variance in the sample in a special way to obtain an unbiased estimate of the population variance, and the sample standard deviation is derived from this adjusted variance. The exact calculation need not bother us, because our statistical software takes care of that.
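One well-known adjustment of this kind, which we assume is the one meant here, is dividing by the number of observations minus one instead of by the number of observations when calculating the sample variance. R's `var()` divides by _n_ - 1. The sketch below compares both versions on simulated samples from a hypothetical normally distributed population with variance 4. The population, sample size, and seed are illustrative assumptions.

```r
set.seed(7)  # arbitrary seed for reproducibility

pop_variance <- 4  # variance of the hypothetical population
n            <- 5  # observations per sample

# Variance of each simulated sample as computed by var(): divides by n - 1.
var_n_minus_1 <- replicate(
  20000,
  var(rnorm(n, mean = 0, sd = sqrt(pop_variance)))
)

# The same sample variances rescaled as if we had divided by n instead.
var_n <- var_n_minus_1 * (n - 1) / n

# Means of the two simulated sampling distributions.
print(c(divide_by_n_minus_1 = mean(var_n_minus_1),
        divide_by_n         = mean(var_n),
        population          = pop_variance))
```

On average, the version that divides by _n_ - 1 is close to the population variance, whereas the version that divides by _n_ is too low.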

### Representative sample {#representative}

Because the share of yellow candies in the population represents the probability of drawing a yellow candy, we also expect 20% of the candies in our bag to be yellow. For the same reason we expect the shares of all other colours in our sample bag to be equal to their shares in the population. As a consequence, we expect a random sample to resemble the population from which it is drawn.

A sample is _representative_ of a population (in the strict sense) if variables in the sample are distributed in the same way as in the population. Of course, we know that a random sample is likely to differ from the population due to chance, so the actual sample that we have drawn is usually not representative of the population in the strict sense.

But we should expect it to be representative, so we say that it is _in principle representative_ or _representative in the statistical sense_ of the population. We can use probability theory to account for the misrepresentation in the actual sample that we draw. This is what we do when we use statistical inference to construct confidence intervals and test null hypotheses, as we will learn in later chapters.
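The difference between a single sample and what we should expect can be sketched in R. The colour names other than yellow, the number of simulated bags, and the seed are assumptions made for illustration.

```r
set.seed(99)  # arbitrary seed for reproducibility

# Hypothetical population: five colours in equal shares (.2 each).
colours <- c("yellow", "red", "green", "blue", "orange")

# Colour shares in one bag of ten candies: usually not exactly .2 each.
bag <- sample(colours, size = 10, replace = TRUE)
bag_shares <- table(factor(bag, levels = colours)) / length(bag)
print(bag_shares)

# Colour shares in 5,000 bags: one column per bag, one row per colour.
many_shares <- replicate(5000, {
  b <- sample(colours, size = 10, replace = TRUE)
  as.numeric(table(factor(b, levels = colours))) / 10
})

# Average share of each colour over all bags: what we should expect.
average_shares <- rowMeans(many_shares)
names(average_shares) <- colours
print(round(average_shares, 3))
```

A single bag rarely reproduces the population shares exactly, but the shares averaged over many bags are close to .2 for every colour.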

### Answers {-}

<A name="answer1.2.1"></A>
```{block2, type='rmdanswer'}
Answer to Question 1.

* The colours are equally distributed in the population, so one out of each
five candies in the population is yellow. In other words, the proportion of
yellow candies in the population is .2.
* This is the proportion that we would also expect in the sample. A sample
contains ten candies, so two out of these ten are expected to be yellow. If we
draw several samples, we notice that only a minority of our samples contain
exactly two yellow candies. [<img src="icons/2question.png" width=161px align="right">](#question1.2.1)
```

<A name="answer1.2.2"></A>
```{block2, type='rmdanswer'}
Answer to Question 2.

* In a sample of ten candies, zero to ten candies can be yellow.
* The numbers 0, 1, 2, ..., 9, 10 constitute all possible outcomes for the
sample statistic 'Number of yellow candies'. This is called the sampling
space. [<img src="icons/2question.png" width=161px align="right">](#question1.2.2)
```

<A name="answer1.2.3"></A>
```{block2, type='rmdanswer'}
Answer to Question 3.

* The numbers on the horizontal axis constitute the sampling space, that is, all
values that the sample statistic "Number of yellow candies in the sample" can take.
* The left-hand vertical axis shows the number of samples that have been drawn with a particular value for the sample statistic, that is, with a particular number of yellow candies in the sample.
* The right-hand vertical axis gives the proportion of previously drawn samples with a particular number of yellow candies. The proportions can be interpreted as probabilities, namely the probability that a previously drawn sample contains a particular number of yellow candies. [<img src="icons/2question.png" width=161px align="right">](#question1.2.3)
```

<A name="answer1.2.4"></A>
```{block2, type='rmdanswer'}
Answer to Question 4.

* In the two graphs to the left, candies are the cases. We count how many candies have a particular colour.
* In the right-hand graph showing the sampling distribution, samples (candy bags) are the cases. We count how many samples (bags) contain a particular number of yellow candies. [<img src="icons/2question.png" width=161px align="right">](#question1.2.4)
```

<A name="answer1.2.5"></A>
```{block2, type='rmdanswer'}
Answer to Question 5.

* If twenty per cent of candies in the population are yellow, we expect about
twenty per cent of candies in our sample to be yellow. Our sample contains ten
candies, so we expect two yellow candies in our sample. Indeed, samples with
two yellow candies are most frequent if we draw 1,000 random samples.
* If we expect two candies, samples become less likely as the number of yellow
candies they contain moves further away from two. So we expect the sample
counts to decrease if we move away from two in the sampling distribution. Ten
yellow candies is furthest away from two in our sampling space, so a sample
bag with ten yellow candies is most unlikely. [<img src="icons/2question.png" width=161px align="right">](#question1.2.5)
```

<A name="answer1.2.6"></A>
```{block2, type='rmdanswer'}
Answer to Question 6.

* The number of yellow candies in the sample (bag) is the sample statistic.
This is the characteristic of the sample (bag) that we are interested in.
* The set of all possible outcomes of the sample statistic is the sampling
space. In this example, the sampling space is the set of (integer) numbers
from 0 to 10. [<img src="icons/2question.png" width=161px align="right">](#question1.2.6)
```

<A name="answer1.2.7"></A>
```{block2, type='rmdanswer'}
Answer to Question 7.

* The numbers and horizontal bars in the _Probability_ column represent the
probabilities of outcomes. In the initial situation, the highest probability
is found for a sample bag containing two yellow candies (p = .302). This
amounts to two out of the ten candies in the sample bag, that is, twenty per
cent.
* This percentage is equal to the percentage of yellow candies in the
population. We are most likely to draw a sample with a percentage or
proportion that is equal to the population percentage (the probability of such
a sample is _p_ = .302 here), even though the total probability of drawing a
sample with another percentage of yellow candies is higher (_p_ = 1 - .302 = .698). [<img src="icons/2question.png" width=161px align="right">](#question1.2.7)
```

<A name="answer1.2.8"></A>
```{block2, type='rmdanswer'}
Answer to Question 8.

* At most three out of ten candies means that we have to sum the probabilities
of zero, one, two, and three yellow candies. This probability equals .107 +
.268 + .302 + .201 = .878. That is a fair chance. [<img src="icons/2question.png" width=161px align="right">](#question1.2.8)
```

<A name="answer1.2.9"></A>
```{block2, type='rmdanswer'}
Answer to Question 9.

* If a larger part of the candies is yellow in the population, we should
expect more yellow candies in our sample bag. The probabilities of small
numbers of yellow candies (low outcome values) should go down whereas the
probabilities of large numbers (high outcome scores) should go up.
* If you move the slider to the right, you will see that the distribution
shifts down in the table. [<img src="icons/2question.png" width=161px align="right">](#question1.2.9)
```

<A name="answer1.2.10"></A>
```{block2, type='rmdanswer'}
Answer to Question 10.

* If the population proportion is .5, the probability distribution is
symmetric. The probability of a sample bag with four candies is equal to the
probability of a sample bag with six candies. Probabilities are equal for
three and seven yellow candies, two and eight, one and nine, zero and ten
yellow candies.
* The distribution resembles the classic symmetric bell shape of a normal
distribution that we will encounter when we discuss continuous probability
distributions. [<img src="icons/2question.png" width=161px align="right">](#question1.2.10)
```

<A name="answer1.2.11"></A>
```{block2, type='rmdanswer'}
Answer to Question 11.

* We expect that the proportion of yellow candies in the sample equals the
population proportion, which initially is .2. In a sample of ten candies,
then, the expected number of yellow candies is two.
* Note that two is the outcome with the highest probability in the sampling
distribution.
* The expected number of yellow candies in a sample bag of ten candies is ten
times the population proportion, so the expected number of yellow candies in the
sample changes in accordance with changes in the population proportion. [<img src="icons/2question.png" width=161px align="right">](#question1.2.11)
```
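
The expected value in the answer above is the probability-weighted mean of all possible outcomes. A minimal sketch, assuming a binomial sampling distribution; `expected_yellow()` is a hypothetical helper written for this illustration.

```r
# Assumption: number of yellow candies ~ binomial(size = n_candies, prob = p_yellow)
expected_yellow <- function(p_yellow, n_candies = 10) {
  outcomes <- 0:n_candies
  probs <- dbinom(outcomes, size = n_candies, prob = p_yellow)
  sum(outcomes * probs)  # each outcome weighted by its probability
}

expected_yellow(0.2)  # ten times the population proportion .2
expected_yellow(0.5)  # ten times the population proportion .5
```

Changing the first argument mimics moving the slider for the population proportion: the expected number of yellow candies changes along with it.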

<A name="answer1.2.12"></A>
```{block2, type='rmdanswer'}
Answer to Question 12.

* The mean of the sampling distribution is equal to the expected value of the
sample statistic. In the initial example, both are two.
* This makes sense: Samples with fewer yellow candies than expected should
balance samples with more yellow candies than expected. The mean represents the
"balance point" of a distribution. [<img src="icons/2question.png" width=161px align="right">](#question1.2.12)
```

<A name="answer1.2.13"></A>
```{block2, type='rmdanswer'}
Answer to Question 13.

* The mean of the sampling distribution of the sample proportion is equal to
the population proportion. If we had created the sampling distribution
of the proportion of yellow candies in our sample, the mean of the sampling
distribution would be equal to the proportion of yellow candies in the
population.
* Again, samples with a smaller proportion of yellow candies than expected
should balance samples with a larger proportion of yellow candies than
expected, each weighted by its probability. The mean represents the "balance
point" of a distribution.
* Note that the expected value (mean of the sampling distribution) only equals
the population value if the sample statistic is an unbiased estimate of the
population value (parameter). See the next section. [<img src="icons/2question.png" width=161px align="right">](#question1.2.13)
```

## A Continuous Random Variable: Overweight And Underweight {#cont-random-var}

Let us now look at another variable: the weight of candies in a bag. The weight of candies is perhaps more interesting to the average consumer than candy colour because candy weight is related to calories.

### Continuous variable

Weight is a _continuous variable_ because we can always think of a new weight between two other weights. For instance, consider two candy weights: 2.8 and 2.81 grams. It is easy to see that there can be a weight in between these two values, for instance, 2.803 grams. Between 2.8 and 2.803 we can discern an intermediate value such as 2.802. In principle, we could continue doing this endlessly, e.g., find a weight between 2.80195661 and 2.80195662 grams even if our scales may not be sufficiently precise to measure any further differences. It is the principle that counts. If we can always think of a new value in between two values, the variable is continuous.

```{block2 type='rmdimportant'}
Continuous variable: We can always think of a new value in between two values.
```

### Continuous sample statistic {#cont_sample_stat}

We are not interested in the weight of a single candy. If a relatively light candy is compensated for by a relatively heavy candy in the same bag, we still get the calories that we want. We are interested in the average weight of all candies in our sample bag, so average candy weight in our sample bag is our key sample statistic. We want to say something about the probabilities of average candy weight in the samples of candies that we can draw. Can we do that?

```{r cont-prob, fig.cap="A continuous sampling distribution.", eval=FALSE, echo=FALSE}
#SKIP: Histogram assumes binning but binning principle is explained later.

#Generate a continuous normally distributed sampling distribution representing average candy weight (using a fixed y scale) ; display it as a histogram with average candy weight in a sample bag on the x-axis and probability on the y-axis ; allow the user to decrease histogram bin width to see how smaller bins reduce the probabilities and with very narrow bins, the probabilities approach zero ("negligible probabilities")

# Figure \@ref(fig:cont-prob) shows a typical probability distribution of average candy weight.
#
# 1. What do you expect to happen if you decrease the bin width of the histogram? Use the slider to check your expectations.
```

When we turn to the probabilities of getting samples with a particular average candy weight, we run into problems with a continuous sample statistic. If we wanted to know the probability of drawing a sample bag with an average candy weight of 2.8 grams, we would have to exclude sample bags with an average candy weight of 2.81 grams, or 2.801 grams, or 2.8000000001 grams, and so on. In fact, we are very unlikely to draw a sample bag with an average candy weight of exactly 2.8 grams, that is, with an infinite number of zeros trailing 2.8. In other words, the probability of such a sample bag is for all practical purposes zero and negligible.

This applies to every average candy weight, so all probabilities are virtually zero. The probability distribution of the sampling space, that is, of all possible outcomes, is going to be very boring: just (nearly) zeros. And it will take forever to list all possible outcomes within the sampling space, because we have an infinite number of possible outcomes. After all, we can always find a new average candy weight between two selected weights.
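
To get a feeling for these negligible probabilities, we can compute the probability that average candy weight falls within an ever narrower interval around 2.8 grams. This sketch assumes a normal sampling distribution with mean 2.8 and standard deviation 0.6, the settings of the interactive figure later in this section; the interval widths are arbitrary choices.

```r
# Assumption: average candy weight per bag ~ normal(mean = 2.8, sd = 0.6)
half_widths <- c(0.05, 0.005, 0.0005, 0.00005)

# Probability of an average candy weight within 2.8 plus or minus half_width
p_near <- pnorm(2.8 + half_widths, mean = 2.8, sd = 0.6) -
  pnorm(2.8 - half_widths, mean = 2.8, sd = 0.6)

data.frame(half_width = half_widths, probability = p_near)
```

Each time the interval becomes ten times narrower, the probability becomes roughly ten times smaller, so it approaches zero for a single exact value.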

### Probability density

With a continuous sample statistic, we must look at a range of values instead of a single value. We can meaningfully talk about the probability of having a sample bag with an average candy weight of at least 2.8 grams or at most 2.8 grams. We choose a threshold, in this example 2.8 grams, and determine the probability of values above or below this threshold. We can also use two thresholds, for example the probability of an average candy weight between 2.75 and 2.85 grams. This is probably what you were thinking of when I referred to a bag with 2.8 grams as average candy weight.
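
Probabilities for such thresholds can be computed with R's cumulative normal distribution function `pnorm()`. The sketch below assumes a normal sampling distribution with mean 2.8 and standard deviation 0.6, as in the interactive figure in this section; the variable names are illustrative.

```r
# Assumption: average candy weight per bag ~ normal(mean = 2.8, sd = 0.6)
mean_weight <- 2.8
sd_weight <- 0.6

# One threshold: at most or at least 2.8 grams
p_at_most <- pnorm(2.8, mean = mean_weight, sd = sd_weight)
p_at_least <- pnorm(2.8, mean = mean_weight, sd = sd_weight, lower.tail = FALSE)

# Two thresholds: between 2.75 and 2.85 grams
p_between <- pnorm(2.85, mean = mean_weight, sd = sd_weight) -
  pnorm(2.75, mean = mean_weight, sd = sd_weight)

c(at_most = p_at_most, at_least = p_at_least, between = p_between)
```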

We cannot determine the probability of a single value, which we used to depict on the vertical axis in a plot of a sampling distribution. Instead, we have to link probabilities to a range of values on the horizontal axis, for example, average candy weight above or below 2.8 grams. How can we display such probabilities? We have to display a probability as an area between the horizontal axis and a curve. This curve is called a _probability density function_, so if there is a label to the vertical axis of a continuous probability distribution, it is "Probability density" instead of "Probability".

Figure \@ref(fig:p-values) shows an example of a continuous probability distribution for the average weight of candies in a sample bag. This is the familiar normal distribution so we could say that the normal curve is the probability density function here. The total area under this curve is set to one, so the area belonging to a range of sample outcomes (average candy weight) is 1 or less, as probabilities should be.
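
The idea of probability as an area can be illustrated by numerically integrating the density curve. This is a sketch under the assumption of a normal density with mean 2.8 and standard deviation 0.6; `density_curve()` is a hypothetical helper and the range from 2.2 to 3.4 grams is an arbitrary example.

```r
# Assumption: probability density of average candy weight is normal(2.8, 0.6)
density_curve <- function(x) dnorm(x, mean = 2.8, sd = 0.6)

# Total area under the curve
total_area <- integrate(density_curve, lower = -Inf, upper = Inf)$value

# Area for one range of outcomes: the mean plus or minus one standard deviation
area_within_one_sd <- integrate(density_curve, lower = 2.2, upper = 3.4)$value

c(total = total_area, within_one_sd = area_within_one_sd)
```

The area for a range is the same probability that `pnorm(3.4, 2.8, 0.6) - pnorm(2.2, 2.8, 0.6)` returns.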

```{r p-values, fig.pos='H', fig.align='center', fig.cap="How do we display probabilities in a continuous sampling distribution? Tip: Click on a slider handle and use your keyboard arrow keys to make small changes to the slider handle position.", echo=FALSE, screenshot.opts = list(delay = 5), dev="png", out.width="420px"}
# Generate a normal sampling distribution representing average candy weight in a sample bag (M = 2.8, SD = 0.6) ; add a range slider to the x-axis linked to vertical lines, showing the proportions of the probabilities to left and to right and between the lines ; initial setting is 2.8 for the right-hand slider
knitr::include_app("http://82.196.4.233:3838/apps/p-values/", height="260px")
```

<A name="question1.3.1"></A>
```{block2, type='rmdquestion'}
1. In Figure \@ref(fig:p-values), what is the probability of buying a bag with average candy weight of 2.8 grams or more? [<img src="icons/2answer.png" width=115px align="right">](#answer1.3.1)
```

<A name="question1.3.2"></A>
```{block2, type='rmdquestion'}
2. Use the sliders to find the probability of buying a bag with average candy weight between 2.6 and 3.7 grams. [<img src="icons/2answer.png" width=115px align="right">](#answer1.3.2)
```

<A name="question1.3.3"></A>
```{block2, type='rmdquestion'}
3. What is the minimum average weight of the 10% heaviest candy bags? [<img src="icons/2answer.png" width=115px align="right">](#answer1.3.3)
```

A probability density function can give us the probability of values between two thresholds. It can also give us the probability of values up to (and including) a threshold value, which is known as a _left-hand probability_, or the probability of values above (and including) a threshold value, which is called a _right-hand probability_. In a null hypothesis significance test (Chapter \@ref(hypothesis)), right-hand and left-hand probabilities are used to calculate _p_ values.

Why did I put _(and including)_ between parentheses? It does not really matter whether we add the exact boundary value (2.8 grams) to the probability on the left or on the right because the probability of getting a bag with average candy weight at exactly 2.8 grams (with a very long trail of zero decimals) is negligible.

Are you struggling with the idea of areas instead of heights (values on the vertical axis) as probabilities? Just realize that we could use the area of a bar in a histogram instead of the height as indication of the probability in discrete probability distributions, for example, Figure \@ref(fig:expected-value). The bars in a histogram are all equally wide, so (relative) differences between bar areas are equal to differences in bar height.

### Probabilities always sum to 1

While you were playing with Figure \@ref(fig:p-values), you may have noticed that displayed probabilities always add up to one. This is true for every probability distribution because it is part of the definition of a probability distribution.

In addition, you may have realized that we can use probability distributions in two ways. We can use them to say how likely or unlikely we are to draw a sample with the sample statistic value in a particular range. For example, what is the chance that we draw a sample bag with average candy weight over 2.9 grams? But we can also use a probability distribution to find the threshold values that separate the top ten per cent or the bottom five per cent in a distribution. If we want a sample bag that belongs to the ten per cent of bags with the highest average candy weight, what should be the minimum average candy weight in the sample bag?
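
In R, these two uses correspond to two functions: `pnorm()` goes from a threshold to a probability and `qnorm()` goes from a probability to a threshold. A minimal sketch, assuming the normal sampling distribution of the interactive figure (mean 2.8, standard deviation 0.6); the variable names are illustrative.

```r
# Assumption: average candy weight per bag ~ normal(mean = 2.8, sd = 0.6)
mean_weight <- 2.8
sd_weight <- 0.6

# Use 1: from a threshold to a probability (right-hand probability over 2.9 grams)
p_over <- pnorm(2.9, mean = mean_weight, sd = sd_weight, lower.tail = FALSE)

# Use 2: from a probability to a threshold
min_top_ten <- qnorm(0.90, mean = mean_weight, sd = sd_weight)     # top ten per cent
max_bottom_five <- qnorm(0.05, mean = mean_weight, sd = sd_weight) # bottom five per cent

# Round trip: the right-hand probability of the threshold is the ten per cent we started with
p_check <- pnorm(min_top_ten, mean = mean_weight, sd = sd_weight, lower.tail = FALSE)
```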

### Answers {-}

<A name="answer1.3.1"></A>
```{block2, type='rmdanswer'}
Answer to Question 1.

* The probability is .5. It is represented by the green surface under the
curve. This is exactly half of the total surface under the curve because 2.8
grams is the average candy weight in the population and, as a result, the
average value of the sampling distribution of average sample candy weight. [<img src="icons/2question.png" width=161px align="right">](#question1.3.1)
```

<A name="answer1.3.2"></A>
```{block2, type='rmdanswer'}
Answer to Question 2.

* If you set the slider handles to 2.6 and 3.7, the blue area represents the
probability that you are looking for. Its value is reported as .564.
* This is neither a left-hand nor a right-hand probability because it does not
include either the left-hand or right-hand tail of the sampling distribution. [<img src="icons/2question.png" width=161px align="right">](#question1.3.2)
```

<A name="answer1.3.3"></A>
```{block2, type='rmdanswer'}
Answer to Question 3.

* Drag the right-hand slider handle to the maximum value (6) and then adjust
the left-hand slider handle so the (blue) area in the right tail of the
sampling distribution represents a probability of .100. This area contains the
ten per cent of samples with the largest average candy weight scores. The value
displayed with the left-hand slider (3.57) is the minimum average candy weight
of the 10% heaviest candy bags. [<img src="icons/2question.png" width=161px align="right">](#question1.3.3)
```

## Concluding Remarks

A communication scientist wants to know whether children are sufficiently aware of the dangers of media use. On a media literacy scale from one to ten, an average score of 5.5 or higher is assumed to be sufficient.

If we translate this to the simple candy bag example, we realize that the outcome in our sample does not have to be the true population value, for example twenty per cent. If twenty per cent of all candies in the population are yellow, we could very well draw a sample bag with fewer or more than twenty per cent yellow candies.

Average media literacy, then, can exceed 5.5 in our sample of children, even if average media literacy is below 5.5 in the population or the other way around. How we decide on this is discussed in later chapters.

### Sample characteristics as observations
Perhaps the most confusing aspect of sampling distributions is the fact that samples are our cases (units of analysis) and sample characteristics are our observations. We are accustomed to thinking of observations as measurements on empirical _things_ such as people or candies. We perceive each person or each candy as a case and we observe a characteristic that may change across cases (a variable), for instance the colour or weight of a candy.

In a sampling distribution, however, we observe samples (cases) and measure a sample statistic as the (random) variable. Each sample adds one observation to the sampling distribution and its sample statistic value is the value added to the sampling distribution.

### Means at three levels

If we are dealing with the proportion of yellow candies in a sample (bag), the sample statistic is a proportion and we want to know the proportion of yellow candies in the population. The sampling distribution collects a large number of sample proportions. The mean of the proportions in the sampling distribution (expected value) equals the proportion of yellow candies in the population, because a sample proportion is an unbiased estimator of the population proportion.

Things become a little confusing if we are interested in a sample mean, such as the average weight of candies in a sample bag. Now we have means at three levels: the population, the sampling distribution, and the sample.

```{r three-means, echo=FALSE, out.width="420px", fig.pos='H', fig.align='center', fig.cap="What is the relation between the three distributions?", screenshot.opts = list(delay = 5), dev="png"}
#Generate a population and sample distribution of candy weight, and (in the middle)  sampling distribution of average candy weight. Add the average of each distribution as a vertical line. Add two sliders, one for adjusting the population mean (also adjusts mean of sampling distribution) and one for the sample mean.
knitr::include_app("http://82.196.4.233:3838/apps/three-means/", height="560px")
```

<A name="question1.4.1"></A>
```{block2, type='rmdquestion'}
1. In Figure \@ref(fig:three-means), explain the meaning of the three means (dotted red lines). Which mean is a mean of means? [<img src="icons/2answer.png" width=115px align="right">](#answer1.4.1)
```

<A name="question1.4.2"></A>
```{block2, type='rmdquestion'}
2. Is it a coincidence that the means of the population and the sampling distribution are the same? Use a slider to check if these means are the same. [<img src="icons/2answer.png" width=115px align="right">](#answer1.4.2)
```

<A name="question1.4.3"></A>
```{block2, type='rmdquestion'}
3. How does the sample mean relate to the population mean and the mean of the sampling distribution? [<img src="icons/2answer.png" width=115px align="right">](#answer1.4.3)
```

The sampling distribution, here, is a distribution of sample means but the sampling distribution itself also has a mean, which is called the expected value or expectation of the sampling distribution. Don't let this confuse you. The mean of the sampling distribution is the average of the average weight of candies in every possible sample bag. This mean of means has the same value as our first mean, namely the average weight of the candies in the population because a sample mean is an unbiased estimator of the population mean.

Remember this: The population and the sample consist of the same type of observations. In the current example, we are dealing with a sample and a population of candies. In contrast, the sampling distribution is based on a different type of observation, namely samples, for example, sample bags of candies.

The sampling distribution is the crucial link between the sample and the population. On the one hand the sampling distribution is connected to the population because the population statistic (parameter), for example, average weight of all candies, is equal to the mean of the sampling distribution. On the other hand, it is linked to the sample because it tells us which sample means we will find with what probabilities. We need the sampling distribution to make statements about the population based on our sample.
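
The three means can be made concrete with a small simulation. This is a sketch, not the code behind the interactive figure: the population (100,000 candies, normally distributed weights with mean 2.8 and standard deviation 0.6), the bag size of ten, and the 5,000 simulated bags are all assumptions chosen for illustration.

```r
set.seed(123)  # fixed seed so the illustration is reproducible

# Level 1: a hypothetical population of candy weights (grams)
population <- rnorm(100000, mean = 2.8, sd = 0.6)
population_mean <- mean(population)

# Level 3: one sample bag of ten candies and its mean
sample_bag <- sample(population, size = 10)
sample_mean <- mean(sample_bag)

# Level 2: approximate the sampling distribution with 5,000 sample bags;
# each bag contributes one observation, namely its average candy weight
sample_means <- replicate(5000, mean(sample(population, size = 10)))
mean_of_means <- mean(sample_means)

c(population = population_mean,
  sampling_distribution = mean_of_means,
  sample = sample_mean)
```

With these settings, the mean of means should be very close to the population mean, whereas the mean of the single sample bag can deviate from both.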

### Answers {-}

<A name="answer1.4.1"></A>
```{block2, type='rmdanswer'}
Answer to Question 1.

* The dotted red line in the top graph represents the average of candy weights
in the population.
* The dotted red line in the middle graph represents the average of sample
means. So it is the average of average candy weight in a sample. It is a mean
of means.
* The dotted red line in the bottom graph represents the average of candy
weights in the sample. [<img src="icons/2question.png" width=161px align="right">](#question1.4.1)
```

<A name="answer1.4.2"></A>
```{block2, type='rmdanswer'}
Answer to Question 2.

* This is not a coincidence because the population mean is the expected value
of the sampling distribution and the expected value of the sampling
distribution is the average of the sampling distribution. [<img src="icons/2question.png" width=161px align="right">](#question1.4.2)
```

<A name="answer1.4.3"></A>
```{block2, type='rmdanswer'}
Answer to Question 3.

* The sample mean is one of the sample means included in the sampling
distribution. The sample is, so to speak, one of the possible samples that are
listed in the sampling distribution.
* The sampling distribution depends on the population distribution. For
instance, the mean or centre of the sampling distribution of sample means
equals the population mean (because the sample mean is an unbiased estimator
of the population mean).
* If the population changes, the sampling distribution changes, so the list of
possible samples (better: the probabilities of sample outcomes) changes, so we
expect a different sample. But this sample need not have the same mean as the
population and the sampling distribution because it is drawn at random. The
sample mean is more likely to be close to than very distant from the
population mean but it still can be quite different from the population mean. [<img src="icons/2question.png" width=161px align="right">](#question1.4.3)
```

## Test Your Understanding

Figure \@ref(fig:sampling-distribution-summary1) simulates drawing random samples from a candy factory's stock of candies. We are interested in the colour of the candies in our sample.
The top-left histogram shows the distribution of candies according to colour in the population. Draw some samples, have a look at the number of yellow candies in each sample, and inspect the sampling distribution.

```{r sampling-distribution-summary1, echo=FALSE, fig.pos='H', fig.align='center', fig.cap="A discrete sampling distribution.", screenshot.opts = list(delay = 5), dev="png", out.width="775px"}
# App sampling-distribution.
knitr::include_app("http://82.196.4.233:3838/apps/sampling-distribution/", height="425px")
```

<A name="question1.5.1"></A>
```{block2, type='rmdquestion'}
1. The top-left histogram in Figure \@ref(fig:sampling-distribution-summary1) shows a simulated population distribution. What would be the real-world population? [<img src="icons/2answer.png" width=115px align="right">](#answer1.5.1)
```

<A name="question1.5.2"></A>
```{block2, type='rmdquestion'}
2. Use the button in Figure \@ref(fig:sampling-distribution-summary1) to draw one random sample of ten candies from the population. What do the numbers on the horizontal axis of the bottom-right histogram represent? What is the statistical name of the variable *Number of yellow candies in the sample*? What is the unit of analysis for this characteristic? [<img src="icons/2answer.png" width=115px align="right">](#answer1.5.2)
```

<A name="question1.5.3"></A>
```{block2, type='rmdquestion'}
3. Which values can the sample characteristic take here and what is the statistical name for this set of values? [<img src="icons/2answer.png" width=115px align="right">](#answer1.5.3)
```

<A name="question1.5.4"></A>
```{block2, type='rmdquestion'}
4. If you would draw many samples from this population each containing ten candies, what is the number of yellow candies per sample that appears most frequently? Draw 1,000 samples to verify your answer. [<img src="icons/2answer.png" width=115px align="right">](#answer1.5.4)
```

<A name="question1.5.5"></A>
```{block2, type='rmdquestion'}
5. Is the colour distribution in each sample that you draw representative (in the strict sense) of the colour distribution in the stock of candies? [<img src="icons/2answer.png" width=115px align="right">](#answer1.5.5)
```

<A name="question1.5.6"></A>
```{block2, type='rmdquestion'}
6. Why, do you think, is a sample characteristic (sample statistic) called a _random variable_? [<img src="icons/2answer.png" width=115px align="right">](#answer1.5.6)
```

```{r sampling-distribution-summary2, echo=FALSE, fig.pos='H', fig.align='center', fig.cap="A continuous sampling distribution.", screenshot.opts = list(delay = 5), dev="png", out.width="420px"}
# App p-values.
knitr::include_app("http://82.196.4.233:3838/apps/p-values/", height="260px")
```

<A name="question1.5.7"></A>
```{block2, type='rmdquestion'}
7. Use your own words to explain what the sampling distribution in Figure \@ref(fig:sampling-distribution-summary2) represents. [<img src="icons/2answer.png" width=115px align="right">](#answer1.5.7)
```

<A name="question1.5.8"></A>
```{block2, type='rmdquestion'}
8. What do you think is the average weight of all candies in the population? Justify your answer using the concepts _expected value_ and _unbiased estimator_. [<img src="icons/2answer.png" width=115px align="right">](#answer1.5.8)
```

<A name="question1.5.9"></A>
```{block2, type='rmdquestion'}
9. Use the sliders to find the probability of drawing a sample with average candy weight between 2.0 and 2.9 grams. [<img src="icons/2answer.png" width=115px align="right">](#answer1.5.9)
```

<A name="question1.5.10"></A>
```{block2, type='rmdquestion'}
10. What, do you expect, is the probability of drawing a sample with average candy weight of exactly 2.9 grams? Use the sliders to check your expectation. [<img src="icons/2answer.png" width=115px align="right">](#answer1.5.10)
```

<A name="question1.5.11"></A>
```{block2, type='rmdquestion'}
11. Why is this graph an example of a continuous probability distribution? [<img src="icons/2answer.png" width=115px align="right">](#answer1.5.11)
```

<A name="question1.5.12"></A>
```{block2, type='rmdquestion'}
12. Why is the vertical axis labelled with "Probability density" instead of "Probability"? [<img src="icons/2answer.png" width=115px align="right">](#answer1.5.12)
```

### Answers {-}

```{block2, type='rmdanswer', echo=!ch1}
Answers to the Test Your Understanding questions will be shown in the web book when the last tutor group has discussed this chapter.
```

<A name="answer1.5.1"></A>
```{block2, type='rmdanswer', echo=ch1}
Answer to Question 1.

* If we would really sample candies, the population could be a candy factory's
stock of candies or the factory's machine producing the candies from which we
sample. [<img src="icons/2question.png" width=161px align="right">](#question1.5.1)
```

<A name="answer1.5.2"></A>
```{block2, type='rmdanswer', echo=ch1}
Answer to Question 2.

* The numbers on the horizontal axis of the bottom-right histogram represent the number of yellow candies in the sample(s) that we have drawn. The variable for which these numbers are values is called a sample statistic. Together, these numbers constitute the sampling space, that is, all values that the sample statistic “Number of yellow candies in the sample” can take.
* A sample statistic is a characteristic of a sample; we have a sample
statistic score for each sample that we draw. The sample, then, is the unit of
analysis or case for this variable and this histogram. [<img src="icons/2question.png" width=161px align="right">](#question1.5.2)
```

<A name="answer1.5.3"></A>
```{block2, type='rmdanswer', echo=ch1}
Answer to Question 3.

* The sample characteristic is the number of yellow candies in our sample.
Because our sample contains ten candies, this number can vary between zero and
ten. This range of values is called the sampling space. [<img src="icons/2question.png" width=161px align="right">](#question1.5.3)
```

<A name="answer1.5.4"></A>
```{block2, type='rmdanswer', echo=ch1}
Answer to Question 4.

* If the proportion of yellow candies is .2 in the population, we expect that
this is the proportion of yellow candies that is most likely in our sample.
Therefore, samples with this proportion of yellow candies are most frequent if
we draw many samples. Each sample contains ten candies, so a proportion of .2
equals two yellow candies in a sample. [<img src="icons/2question.png" width=161px align="right">](#question1.5.4)
```

<A name="answer1.5.5"></A>
```{block2, type='rmdanswer', echo=ch1}
Answer to Question 5.

* No.
* Let us assume that all colours have equal shares in the candy population,
like the example in the figure. A sample is representative of the population (in the strict sense) with respect to candy colour if the colours have equal shares also in the
sample. This is the only sample that is representative of the candy
population with respect to candy colours. But most of the samples that we
draw have an unequal distribution of colours. These samples are not
representative of the population. [<img src="icons/2question.png" width=161px align="right">](#question1.5.5)
```

<A name="answer1.5.6"></A>
```{block2, type='rmdanswer', echo=ch1}
Answer to Question 6.

* As we see when we draw several samples, the number of yellow candies varies
across samples. This variation arises because we draw samples at random. So
the scores of samples on the sample statistic are random. That is a good
reason for calling it a random variable. [<img src="icons/2question.png" width=161px align="right">](#question1.5.6)
```

<A name="answer1.5.7"></A>
```{block2, type='rmdanswer', echo=ch1}
Answer to Question 7.

* The sampling distribution in this figure shows average candy weights for all
possible or a very large number of random samples drawn from a population of
candies, for example, a factory's stock. [<img src="icons/2question.png" width=161px align="right">](#question1.5.7)
```

<A name="answer1.5.8"></A>
```{block2, type='rmdanswer', echo=ch1}
Answer to Question 8.

* The population is not depicted here, so we must infer average candy weight
in the population from the sampling distribution. If the sample statistic is
an unbiased estimator of the population statistic (this is the case for a
sample mean), the average of the sampling distribution (this is called the
expected value) is equal to the population statistic.
* In the current example, the average of the sampling distribution of average
sample bag candy weights is equal to average candy weight in the population.
* The sampling distribution depicted here is symmetrical, so the average
equals the median value, so half of the observed average sample candy weights
are below this value and the other half is above this value.
* If one of the slider handles demarcates half of the probability from the
other half, this handle indicates the average of the sampling distribution. In
this example, the average is 2.8 (grams). [<img src="icons/2question.png" width=161px align="right">](#question1.5.8)
```

<A name="answer1.5.9"></A>
```{block2, type='rmdanswer', echo=ch1}
Answer to Question 9.

* If you set the left-hand slider handle to 2.0 and the right-hand handle to
2.9, the blue area represents the probability of drawing a sample with average
candy weight between 2.0 and 2.9 grams. The value of the probability is
depicted in the blue box within the graph. It is .475. [<img src="icons/2question.png" width=161px align="right">](#question1.5.9)
```

<A name="answer1.5.10"></A>
```{block2, type='rmdanswer', echo=ch1}
Answer to Question 10.

* This probability is (virtually) zero. If we measured weight with very
high precision, no candy bag would have an average candy weight of exactly 2.9
grams, that is, 2.90000000000000000000000000(and so on) grams. [<img src="icons/2question.png" width=161px align="right">](#question1.5.10)
```

<A name="answer1.5.11"></A>
```{block2, type='rmdanswer', echo=ch1}
Answer to Question 11.

* See the answer to Question 10. A variable is continuous if we can, in
principle, always find a new value between two values. This applies to average weight
as a variable, because we can in principle always use more decimal places in
our measurement to find a weight that is between two other weights. For
example, between 2.90000 and 2.90001 we can think of the weight 2.900005.
* If the sample statistic is a continuous variable, such as average candy
weight in the sample, the probability distribution for this sample statistic
is continuous. [<img src="icons/2question.png" width=161px align="right">](#question1.5.11)
```

<A name="answer1.5.12"></A>
```{block2, type='rmdanswer', echo=ch1}
Answer to Question 12.

* It makes no sense to speak of the probability (vertical axis) that average
candy weight in a sample (horizontal axis) has one particular value. Because
average weight is a continuous variable (see Question 11), the probability of
one particular outcome value is (virtually) zero (see Question 10). If we would
draw the probabilities on the vertical axis, we would have a flat line (at
zero) instead of a curve.
* Instead of probabilities of single outcome values, we are interested in
probabilities of ranges or intervals of outcome values if we have a continuous
random variable. For example, the probability of a sample with average candy
weight between 2.8 and 2.9 grams. The probability of a range of outcome values
is depicted as a surface below a probability density function. This is what
our graph of a continuous sampling distribution shows, so the vertical axis is
labelled 'probability density' instead of 'probability'. [<img src="icons/2question.png" width=161px align="right">](#question1.5.12)
```

## Take-Home Points

* Values of a sample statistic vary across random samples from the same population. But some values are more probable than other values.

* The sampling distribution of a sample statistic tells us the probability of drawing a sample with a particular value of the sample statistic or a particular minimum and/or maximum value.

* If a sample statistic is an unbiased estimator of a parameter, the parameter value equals the average of the sampling distribution, which is called the expected value or expectation.

* For discrete sample statistics, the sampling distribution tells us the probability of individual sample outcomes. For continuous sample statistics, it tells us the probability density, which gives us the probability of drawing a sample with an outcome that is at least or at most a particular value, or an outcome that is between two values.