Contents tagged with sklearn

  • Cross validation: sklearn.model_selection, sklearn.cross_validation

    Hello everybody,

    today I want to say a bit more about cross validation and how to work with it in Python. Here I describe how to split the training set once and how to use different cross validation strategies.

    Before I continue, I'd like to describe the different ways of splitting data for training.

    The data can be split in the following ways:

    1. Split the data 70/30 (sometimes 80/20) into two sets: training data and validation data.

    You train the model on the training data and validate it on the holdout data.

    This approach has the following pros and cons:

    (+) model is trained only once

    (-) the result depends on how the data happens to be split

    (+/-) works fine for big data sets
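
    To make the single holdout split above concrete, here is a minimal sketch using train_test_split from sklearn.model_selection. The synthetic data, the LogisticRegression model, and the random_state values are my assumptions for illustration only; they are not part of the approach itself.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic data (an assumption for illustration): 100 samples, 4 features,
# with a binary label derived from the first two features.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Split once: 70% for training, 30% held out for validation.
# random_state fixes the shuffle so the split is reproducible.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# The model is trained only once, on the training part.
model = LogisticRegression()
model.fit(X_train, y_train)

# Validate on the holdout part.
val_accuracy = model.score(X_val, y_val)
print(X_train.shape, X_val.shape)
print(f"validation accuracy: {val_accuracy:.3f}")
```

    Changing random_state produces a different split and usually a slightly different score, which is exactly the dependence on the splitting mentioned in the cons.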

    In order to understand one more case take a look at …