- The problem: given a set of features that describe housing in a Boston town or suburb, our machine learning model must predict the house price.
- The model is trained on the Boston housing dataset.
- We use the Random Forest regression algorithm to build the prediction model.
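As a rough illustration of the approach, the sketch below fits a random forest regressor with scikit-learn and reports a hold-out error. It is a minimal sketch, not the project's actual training script: the helper name `fit_random_forest`, the 80/20 split, the hyperparameters, and the synthetic stand-in data are all assumptions made for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split


def fit_random_forest(X, y, n_estimators=100, random_state=0):
    """Fit a random forest regressor and return it with its hold-out RMSE.

    Hypothetical helper: the split ratio and hyperparameters are
    illustrative defaults, not values taken from the project.
    """
    # Hold back 20% of the rows to estimate error on unseen data.
    X_train, X_val, y_train, y_val = train_test_split(
        X, y, test_size=0.2, random_state=random_state
    )
    model = RandomForestRegressor(
        n_estimators=n_estimators, random_state=random_state
    )
    model.fit(X_train, y_train)
    # Root mean squared error on the held-out rows.
    rmse = float(np.sqrt(mean_squared_error(y_val, model.predict(X_val))))
    return model, rmse


# Synthetic stand-in data (NOT the Boston data): 200 rows, 13 numeric features.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 13))
y_demo = 3.0 * X_demo[:, 0] - 2.0 * X_demo[:, 1] + rng.normal(scale=0.1, size=200)

model, rmse = fit_random_forest(X_demo, y_demo)
print(f"hold-out RMSE: {rmse:.3f}")
```

With the real data, `X_demo` and `y_demo` would be replaced by the feature columns and the `MEDV` column of the training dataset.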
The project includes two datasets (a training dataset and a testing dataset). In each dataset, every row describes a Boston town or suburb. The training dataset has 14 attributes (an ID plus 13 features) and a target column (MEDV, the price).
| Attributes | Details |
|---|---|
| ID | ID for each row |
| CRIM | per capita crime rate by town |
| ZN | proportion of residential land zoned for lots over 25,000 sq. ft. |
| INDUS | proportion of non-retail business acres per town |
| CHAS | Charles River dummy variable (1 if tract bounds river; else 0) |
| NOX | nitric oxides concentration (parts per 10 million) |
| RM | average number of rooms per dwelling |
| AGE | proportion of owner-occupied units built prior to 1940 |
| DIS | weighted distances to five Boston employment centres |
| RAD | index of accessibility to radial highways |
| TAX | full-value property-tax rate per $10,000 |
| PTRATIO | pupil-teacher ratio by town |
| BLACK | 1000(Bk - 0.63)^2 where Bk is the proportion of blacks by town |
| LSTAT | percentage of the population of lower status |
| MEDV | median value of owner-occupied homes in $1000s (the target) |
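The table above maps naturally onto a feature matrix and a target vector. The following sketch shows one way to separate them with pandas. The helper `split_features_target` is hypothetical, and it assumes the column names match the table exactly (the actual files may use different casing) and that `ID` is only a row identifier and should be left out of the features.

```python
import pandas as pd

# Feature columns as listed in the attribute table; ID is deliberately
# excluded on the assumption that it is only a row identifier.
FEATURE_COLUMNS = [
    "CRIM", "ZN", "INDUS", "CHAS", "NOX", "RM", "AGE",
    "DIS", "RAD", "TAX", "PTRATIO", "BLACK", "LSTAT",
]
TARGET_COLUMN = "MEDV"


def split_features_target(df):
    """Return (X, y) from a frame that follows the attribute table."""
    required = FEATURE_COLUMNS + [TARGET_COLUMN]
    missing = [c for c in required if c not in df.columns]
    if missing:
        raise KeyError(f"missing columns: {missing}")
    return df[FEATURE_COLUMNS], df[TARGET_COLUMN]


# Tiny hand-made frame with placeholder values (not real rows from the dataset).
demo = pd.DataFrame(
    {
        "ID": [1, 2],
        **{c: [0.0, 1.0] for c in FEATURE_COLUMNS},
        "MEDV": [20.0, 30.0],
    }
)
X, y = split_features_target(demo)
print(X.shape, y.tolist())
```

The testing dataset would presumably need only the feature half of this split, since the target is what the model predicts.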