No unbiased Estimator of the Variance of K-Fold Cross-Validation

In statistical machine learning, the standard measure of accuracy for models is the prediction error, i.e. the expected loss on future examples. When the data distribution is unknown, the prediction error cannot be computed, but several resampling methods, such as K-fold cross-validation, can be used to obtain an unbiased estimator of it. To compare learning algorithms, however, one also needs to estimate the uncertainty around the cross-validation estimator, which matters because that uncertainty can be very large. The usual variance estimates for means of independent samples cannot be used, because the cross-validation estimator is formed by reusing the same data. The main result of this paper is that there is no universal (distribution-independent) unbiased estimator of the variance of the K-fold cross-validation estimator based only on the empirical error measurements obtained through the cross-validation procedure. The analysis gives a theoretical account of why this estimation is difficult. These results generalize to other resampling methods, as long as data are reused for training or testing.
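
The setting described above can be illustrated with a small simulation. The sketch below is not taken from the paper: the toy learner (predict the mean of the training folds), the squared loss, the standard normal data, and the names `kfold_errors` and `simulate` are all assumptions chosen for illustration. It computes the K-fold cross-validation estimate on many independent datasets, so that the Monte Carlo variance of the estimator can be set beside the naive variance estimate that treats the per-example errors as independent.

```python
import numpy as np

def kfold_errors(x, k):
    """Per-example squared errors of a toy 'predict the training mean' learner under K-fold CV."""
    n = len(x)
    folds = np.array_split(np.arange(n), k)  # contiguous, near-equal folds
    errors = np.empty(n)
    for test_idx in folds:
        train_mask = np.ones(n, dtype=bool)
        train_mask[test_idx] = False
        prediction = x[train_mask].mean()    # "train" on the other K-1 folds
        errors[test_idx] = (x[test_idx] - prediction) ** 2
    return errors

def simulate(n=40, k=5, n_datasets=2000, seed=0):
    """Return CV estimates and naive variance estimates over independent datasets."""
    rng = np.random.default_rng(seed)
    cv_estimates = np.empty(n_datasets)
    naive_vars = np.empty(n_datasets)
    for r in range(n_datasets):
        x = rng.standard_normal(n)           # assumed data distribution: N(0, 1)
        e = kfold_errors(x, k)
        cv_estimates[r] = e.mean()           # K-fold CV estimate of prediction error
        naive_vars[r] = e.var(ddof=1) / n    # naive estimate: pretends errors are i.i.d.
    return cv_estimates, naive_vars

cv_estimates, naive_vars = simulate()
print("mean CV estimate:", cv_estimates.mean())
print("Monte Carlo variance of the CV estimate:", cv_estimates.var(ddof=1))
print("average naive variance estimate:", naive_vars.mean())
```

The Monte Carlo variance is only available here because the simulation can draw fresh datasets at will. With a single real dataset, only the error measurements from one cross-validation run are observed, which is the situation the paper's result addresses.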

Release date May 1, 2003

Reference number 2003s-22

Author(s) Yoshua Bengio and Yves Grandvalet

Publication type Working Papers

Keywords Prediction error, cross-validation, multivariate variance estimators, statistical comparison of algorithms

Bibliographic reference Bengio, Y., & Grandvalet, Y. (2003). No unbiased Estimator of the Variance of K-Fold Cross-Validation (2003s-22, Working Papers, CIRANO). https://www.cirano.qc.ca/en/summaries/2003s-22
