Benchmarking the LOOCV
Benchmarking fastloocv
In this post I will benchmark the fastloocv code I presented in my previous post.
LOO cross-validation with python
There is a type of cross-validation procedure called leave-one-out cross-validation (LOOCV). It is very similar to the more commonly used $k$-fold cross-validation. In fact, LOOCV can be seen as a special case of $k$-fold CV with $k=n$, where $n$ is the number of data points. In other words, LOOCV trains the statistical model on every possible subset of $n-1$ data points and, each time, tests it on the one point that was left out.
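To make the procedure concrete, here is a minimal sketch of plain (not fast) LOOCV in pure Python. It is not the fastloocv code from the post; the helper names `fit_line` and `loocv_mse`, the choice of a one-variable least-squares line as the model, and the toy data are all assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def loocv_mse(xs, ys):
    """Mean squared prediction error over the n leave-one-out splits."""
    n = len(xs)
    squared_errors = []
    for i in range(n):
        # Train on every point except the i-th one (n - 1 points).
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        a, b = fit_line(train_x, train_y)
        # Test on the single held-out point.
        prediction = a + b * xs[i]
        squared_errors.append((ys[i] - prediction) ** 2)
    return sum(squared_errors) / n


# Hypothetical toy data: the first set lies exactly on y = 1 + 2x,
# the second adds a little noise.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys_exact = [1.0, 3.0, 5.0, 7.0, 9.0]
ys_noisy = [1.0, 3.5, 4.5, 7.2, 9.1]

print(loocv_mse(xs, ys_exact))
print(loocv_mse(xs, ys_noisy))
```

The loop refits the model $n$ times, once per held-out point, which is why naive LOOCV gets expensive as $n$ grows.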
ENEM project, part 1
I am going to explore data from Brazil's "Exame Nacional do Ensino Médio (Enem)", a school-leaving exam (I am Brazilian, even though I have lived most of my life abroad). It is roughly equivalent to the American SAT, the Swiss Matura, the French Bac, etc.
This notebook is a cleaned-up version of the actual project; to get to this point, I had to run a lot of tests on the data. Nonetheless, by following what is here, one should be able to get from the raw data to my final results.