Felipe's Data Science Blog


Thu 02 November 2017

LOO cross-validation with python

Posted by Felipe in posts   

There is a type of cross-validation procedure called leave-one-out cross-validation (LOOCV). It is very similar to the more commonly used $k$-fold cross-validation. In fact, LOOCV can be seen as a special case of $k$-fold CV with $k=n$, where $n$ is the number of data points. In other words, LOOCV trains the statistical model on every possible subset containing $n-1$ data points and, each time, tests it on the single point that was left out, so the model is fitted $n$ times in total.
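To make the procedure concrete, here is a minimal sketch in plain Python. It is not the code from the full post: the helper names (`loo_splits`, `fit_line`, `loocv_mse`), the choice of a simple least-squares line as the model, and the toy data are all assumptions made for illustration.

```python
def loo_splits(n):
    """Yield (train_indices, test_index) for each of the n leave-one-out folds."""
    for i in range(n):
        train = [j for j in range(n) if j != i]
        yield train, i


def fit_line(xs, ys):
    """Fit y = a + b*x by ordinary least squares and return (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def loocv_mse(xs, ys):
    """Average squared error over the n held-out points."""
    n = len(xs)
    squared_errors = []
    for train, test in loo_splits(n):
        # Train on the n-1 remaining points...
        a, b = fit_line([xs[j] for j in train], [ys[j] for j in train])
        # ...and test on the single point that was left out.
        prediction = a + b * xs[test]
        squared_errors.append((ys[test] - prediction) ** 2)
    return sum(squared_errors) / n


# Hypothetical toy data lying exactly on y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
score = loocv_mse(xs, ys)
```

Because the toy data are perfectly linear, every held-out point is predicted essentially exactly and `score` is zero up to floating-point error. With real data, scikit-learn's `LeaveOneOut` splitter in `sklearn.model_selection` produces the same kind of train/test index pairs, so the hand-written generator above is only there to show what happens under the hood.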

Read more...


Sun 27 August 2017

ENEM project, part 1

Posted by Felipe in posts   

I am going to explore data from Brazil's "Exame Nacional do Ensino Médio" (Enem), a school-leaving exam. (I am Brazilian, even though I have lived most of my life abroad.) It is roughly equivalent to the American SAT, the Swiss Matura, the French Bac, and so on.

This notebook is a cleaned-up version of the actual project; to get to this point, I had to run a lot of tests on the data. Nonetheless, by following what is here, one should be able to get from the raw data to my final results.

Read more...