Model Evaluation#

Cross Validation#

In k-fold cross validation, the data is partitioned into k folds of roughly equal size, and k models are trained, each using a different fold as the test set and the remaining folds as the training set. With five folds:

  • Model-1: the first model uses fold 1 as the test set and trains on folds 2–5.

  • Model-2: the second model uses fold 2 as the test set and trains on the remaining folds.

  • Models 3, 4, and 5 follow the same pattern, with folds 3, 4, and 5 each serving as the test set in turn.

  • Accuracy is computed on the held-out fold of each of the five partitions, and the five scores are averaged into a single performance estimate (see the sketch below).
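
A minimal sketch of this loop, written with scikit-learn's KFold splitter on a small synthetic dataset (the dataset and model here are illustrative placeholders, not part of the examples that follow):

# five-fold cross validation written out as an explicit loop
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import KFold

X, y = make_classification(n_samples=200, random_state=0)
kf = KFold(n_splits=5)
scores = []
for fold, (train_idx, test_idx) in enumerate(kf.split(X), start=1):
    model = RandomForestClassifier(random_state=0)
    model.fit(X[train_idx], y[train_idx])                 # train on the other four folds
    scores.append(model.score(X[test_idx], y[test_idx]))  # evaluate on the held-out fold
    print(f'Model-{fold}: accuracy = {scores[-1]:.3f}')
print(f'mean accuracy = {sum(scores) / len(scores):.3f}')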

CV on Breast Cancer Dataset#

# import breast cancer data
from sklearn.datasets import load_breast_cancer
dataset_bc = load_breast_cancer()
# X_bc, y_bc and shapes
X_bc = dataset_bc.data
y_bc = dataset_bc.target
X_bc.shape, y_bc.shape
((569, 30), (569,))
# training and test sets
from sklearn.model_selection import train_test_split
X_bc_train, X_bc_test, y_bc_train, y_bc_test = train_test_split(X_bc, y_bc, random_state=42)
# instantiate the class into an object
from sklearn.ensemble import RandomForestClassifier
rfc = RandomForestClassifier()
# fit the model
rfc.fit(X_bc_train, y_bc_train)
RandomForestClassifier()
# accuracy on the training and test sets; the perfect training score is
# typical of random forests, which tend to fit the training data exactly
rfc.score(X_bc_train, y_bc_train), rfc.score(X_bc_test, y_bc_test)
(1.0, 0.965034965034965)
# cross validation
from sklearn.model_selection import cross_val_score
cross_val_score(rfc, X_bc, y_bc, cv=5)
array([0.92982456, 0.93859649, 0.99122807, 0.98245614, 0.96460177])
cross_val_score(rfc, X_bc, y_bc, cv=5).mean()
0.9578326346840551
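
Because RandomForestClassifier is randomized, each call to cross_val_score refits the model and returns slightly different scores, which is why the mean printed above need not match the mean of the displayed array. A small sketch of the more economical pattern, computing the scores once and reusing them (the fixed random_state is an illustrative choice for reproducibility):

# score once, then report both the per-fold accuracies and their mean
scores = cross_val_score(RandomForestClassifier(random_state=42), X_bc, y_bc, cv=5)
print(scores)         # five per-fold accuracies
print(scores.mean())  # single summary number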

CV on Iris Data#

# import iris data
from sklearn.datasets import load_iris
dataset_iris = load_iris()
# X_iris, y_iris and shapes
X_iris = dataset_iris.data
y_iris = dataset_iris.target
X_iris.shape, y_iris.shape
((150, 4), (150,))
# training and test sets
from sklearn.model_selection import train_test_split
X_iris_train, X_iris_test, y_iris_train, y_iris_test = train_test_split(X_iris, y_iris, random_state=42)
# instantiate the class into an object
from sklearn.ensemble import RandomForestClassifier
rfc = RandomForestClassifier()
# fit the model
rfc.fit(X_iris_train, y_iris_train)
RandomForestClassifier()
# accuracy on the training and test sets
rfc.score(X_iris_train, y_iris_train), rfc.score(X_iris_test, y_iris_test)
(1.0, 1.0)
# cross validation
from sklearn.model_selection import cross_val_score
cross_val_score(rfc, X_iris, y_iris, cv=5)
array([0.96666667, 0.96666667, 0.93333333, 0.93333333, 1.        ])
cross_val_score(rfc, X_iris, y_iris, cv=5).mean()
0.9533333333333334
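
As with the breast cancer data, the two calls above refit the model, so the mean comes from a different run than the array. One further detail worth knowing: when the estimator is a classifier and cv is an integer, scikit-learn uses stratified folds, so every fold keeps the original class proportions. This matters for iris, whose 150 rows are sorted by class. A minimal sketch making that default explicit:

# the integer cv=5 above is shorthand for a StratifiedKFold splitter
from sklearn.model_selection import StratifiedKFold
skf = StratifiedKFold(n_splits=5)  # each fold preserves the class proportions
cross_val_score(rfc, X_iris, y_iris, cv=skf).mean()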

Validation Set#

  • The test set should not be used for parameter selection; doing so leaks information about the test set into the model and amounts to overfitting on it.

  • Once the test set has influenced model choices, the accuracy it reports is optimistically biased and may not carry over to new, unseen data.

  • A common solution is to split the data into three distinct subsets (demonstrated on the housing data below):

    1. Training Set: Used to train the model.

    2. Validation Set: Used to tune model parameters and select the best model.

    3. Test Set: Used for the final evaluation of model performance on unseen data.

Validation Set for Housing Data#

from sklearn.datasets import fetch_california_housing
Xh, yh = fetch_california_housing(return_X_y=True)

# first hold out the final test set (25% by default)
Xh_train_valid, Xh_test, yh_train_valid, yh_test = train_test_split(Xh, yh, random_state=0)

# then split the remainder into training and validation sets
Xh_train, Xh_valid, yh_train, yh_valid = train_test_split(Xh_train_valid, yh_train_valid, random_state=0)

Xh_train.shape, Xh_valid.shape, Xh_test.shape
((11610, 8), (3870, 8), (5160, 8))
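These shapes follow from train_test_split's default test_size of 0.25: the 20,640 housing rows first give 20640 × 0.25 = 5160 test rows, and the remaining 15,480 split again into 15480 × 0.25 = 3870 validation rows and 15480 × 0.75 = 11,610 training rows.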
from xgboost import XGBRegressor
# try increasing tree depths; score() reports R^2 on the validation set
for md in [1, 2, 3, 4, 5]:
    xgb = XGBRegressor(max_depth=md)
    xgb.fit(Xh_train, yh_train)
    print(f'max_depth = {md}  --->  Validation Score = {xgb.score(Xh_valid, yh_valid):.2f}')
max_depth = 1  --->  Validation Score = 0.70
max_depth = 2  --->  Validation Score = 0.78
max_depth = 3  --->  Validation Score = 0.80
max_depth = 4  --->  Validation Score = 0.81
max_depth = 5  --->  Validation Score = 0.81
# retrain the selected model (max_depth=4, the smallest depth reaching the best
# validation score) on the combined training + validation data
xgb = XGBRegressor(max_depth=4)
xgb.fit(Xh_train_valid, yh_train_valid)
# final, single evaluation on the held-out test set
xgb.score(Xh_train_valid, yh_train_valid), xgb.score(Xh_test, yh_test)
(0.8812581464722442, 0.8328913896385808)
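
The single validation split above can also be replaced by cross validation over the train+validation data, which lets every row take a turn in both roles. A minimal sketch of that variant, reusing the splits and imports above (scores will differ slightly from the single-split numbers):

# select max_depth by cross validation, keeping the test set untouched until the end
for md in [1, 2, 3, 4, 5]:
    cv_scores = cross_val_score(XGBRegressor(max_depth=md), Xh_train_valid, yh_train_valid, cv=5)
    print(f'max_depth = {md}  --->  Mean CV Score = {cv_scores.mean():.2f}')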