standard error of accuracy estimates #11

in your [random forest notebook](https://github.com/raviolli77/machineLearning_breastCancer_Python/blob/master/notebooks/02_random_forest.ipynb), the function `cross_val_metrics` reports its results like this:
```python
    ...
    if print_results:
        for i in range(0, len(scores)):
            print("Cross validation run {0}: {1: 0.3f}".format(i, scores[i]))
        print("Accuracy: {0: 0.3f} (+/- {1: 0.3f})"\
              .format(scores.mean(), scores.std() / 2))
    else:
        return scores.mean(), scores.std() / 2
```
you divide the standard deviation of the fold scores by 2 and present that as the... standard error? shouldn't the standard deviation be divided by the square root of the number of folds, in this case `sqrt(10)`? alternatively, you could just report the standard deviation itself rather than half of it. better yet, report/return `scores.std() * 1.96 / np.sqrt(n_folds)`, which is the half-width of an approximate 95% confidence interval. with 10 folds that scales the standard deviation by about 0.62 (`1.96 / sqrt(10)`) as opposed to 0.5, so the numerical results are not drastically different.
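to make the suggestion concrete, here is a minimal sketch of what the computation could look like. the helper name `summarize_cv_scores`, the `z` parameter, and the example scores are all hypothetical and not taken from the notebook. it also assumes the sample standard deviation (`ddof=1`), whereas the notebook's `scores.std()` uses the NumPy default of `ddof=0`.

```python
import numpy as np


def summarize_cv_scores(scores, z=1.96):
    """Return (mean, standard error, CI half-width) for fold scores.

    Hypothetical helper: z=1.96 corresponds to an approximate 95%
    interval under a normal approximation.
    """
    scores = np.asarray(scores, dtype=float)
    n_folds = scores.size
    mean = scores.mean()
    # Assumption: ddof=1 (sample standard deviation). The original
    # notebook code calls scores.std(), which defaults to ddof=0.
    std = scores.std(ddof=1)
    # Standard error of the mean: std divided by sqrt(number of folds).
    std_err = std / np.sqrt(n_folds)
    # Half-width of the approximate confidence interval.
    half_width = z * std_err
    return mean, std_err, half_width


# Made-up scores for 10 folds, purely for illustration.
example_scores = [0.93, 0.95, 0.91, 0.97, 0.94,
                  0.96, 0.92, 0.95, 0.93, 0.94]
mean, std_err, half_width = summarize_cv_scores(example_scores)
print("Accuracy: {0:0.3f} (+/- {1:0.3f})".format(mean, half_width))
```

returning the standard error and the half-width separately lets the caller decide which one to print, instead of baking a fixed `/ 2` into the function.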
