Python Machine Learning with the KDD Cup 1999 Attack Data Set
by Security Dude
Training set and testing set
Machine learning is about learning some properties of a data set and applying them to new data. This is why a common practice in machine learning to evaluate an algorithm is to split the data at hand in two sets, one that we call a training set on which we learn data properties, and one that we call a testing set, on which we test these properties.(from sklearn website)
The main API implemented by scikit-learn is that of the estimator. An estimator is any object that learns from data; it may a classification, regression or clustering algorithm or a transformer that extracts/filters useful features from raw data.