The Iris flower data set, also known as Fisher’s Iris data set, is a multivariate data set. It consists of 50 samples from each of three species of Iris (Iris setosa, Iris virginica, and Iris versicolor), 150 samples in total. Four features were measured from each sample: the length and the width of the sepals and of the petals, in centimeters. Machine learning models can be trained on the combination of these four features to predict the species.
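As a quick check of the structure described above, the following minimal sketch loads the data and counts the samples per species. It assumes the copy of the data set bundled with scikit-learn (`load_iris`), which the text itself does not name; any other source with the same four columns would work equally well.

```python
from collections import Counter

from sklearn.datasets import load_iris

# Assumption: use the copy of the Iris data bundled with scikit-learn.
iris = load_iris()
X, y = iris.data, iris.target

print(X.shape)             # (150, 4): 150 samples, 4 features
print(iris.feature_names)  # sepal/petal length and width, in cm

# Count how many samples belong to each species.
counts = Counter(str(iris.target_names[label]) for label in y)
print(counts)
```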
Fisher used the data to illustrate his linear discriminant model, and it has since become a typical test case for statistical classification techniques in machine learning, such as support vector machines. Because the species labels are known while two of the three species overlap in the measurements, the data set is also a good example for explaining the difference between supervised and unsupervised techniques in data mining.
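The supervised/unsupervised distinction can be made concrete with a short sketch. It assumes scikit-learn and uses `LinearDiscriminantAnalysis` as a stand-in for Fisher's linear discriminant and `KMeans` as one possible unsupervised method; neither choice comes from the text.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)

# Supervised: the model sees the species labels y while fitting.
lda = LinearDiscriminantAnalysis().fit(X, y)
supervised_accuracy = lda.score(X, y)  # accuracy on the training data
print(f"LDA training accuracy: {supervised_accuracy:.2f}")

# Unsupervised: only the measurements X are used. The resulting
# cluster ids are arbitrary and need not match the species codes.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
cluster_ids = kmeans.labels_
print(cluster_ids[:10])
```

The classifier can be scored directly against the labels, whereas the clusters have to be matched to species after the fact before any comparison is possible.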
Data Set Information
This is perhaps the best-known database to be found in the pattern recognition literature. Fisher’s paper is a classic in the field and is referenced frequently to this day. The data set contains 3 classes of 50 instances each, where each class refers to a type of iris plant. One class is linearly separable from the other 2; the latter are NOT linearly separable from each other.
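The separability statement can be explored with a small sketch, again assuming scikit-learn's copy of the data, where class 0 is setosa. A gap in a single feature is enough to show that setosa is linearly separable. For the other two classes, a linear SVM that fails to fit the training data perfectly is consistent with the claim, although one classifier's score is not a proof.

```python
from sklearn.datasets import load_iris
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
PETAL_LENGTH = 2  # column index of petal length in scikit-learn's copy

# Setosa (class 0) vs. the rest: if the largest setosa petal length is
# below the smallest petal length of the other species, a threshold on
# that one feature is already a linear separator.
setosa_max = X[y == 0, PETAL_LENGTH].max()
others_min = X[y != 0, PETAL_LENGTH].min()
print(setosa_max < others_min)

# Versicolor vs. virginica: fit a linear SVM on all four features and
# check how well it reproduces its own training labels.
mask = y != 0
pair_accuracy = SVC(kernel="linear").fit(X[mask], y[mask]).score(X[mask], y[mask])
print(f"linear SVM training accuracy: {pair_accuracy:.2f}")
```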
- Exploratory Data Analysis – create various plots
- Machine Learning algorithms used: Decision tree, Support vector machine, Naive Bayes, and K-nearest neighbors
- Visualize the Iris data set
- Machine learning algorithms – Decision tree, Support vector machine, Naive Bayes and K-nearest neighbors
- Matplotlib for plots
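The plotting and modelling steps listed above could be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual code: it assumes scikit-learn for the four classifiers, a Gaussian variant of Naive Bayes, default hyperparameters, a 70/30 stratified split, and a hypothetical output file name `iris_petals.png`.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X, y = iris.data, iris.target

# Exploratory plot: petal length vs. petal width, one colour per species.
fig, ax = plt.subplots()
for label, name in enumerate(iris.target_names):
    ax.scatter(X[y == label, 2], X[y == label, 3], label=name)
ax.set_xlabel(iris.feature_names[2])
ax.set_ylabel(iris.feature_names[3])
ax.legend()
fig.savefig("iris_petals.png")  # hypothetical output file name

# Hold out 30% of the samples for testing, keeping species proportions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# The four algorithms named in the list, with default settings.
models = {
    "Decision tree": DecisionTreeClassifier(random_state=0),
    "Support vector machine": SVC(),
    "Naive Bayes": GaussianNB(),  # assumption: Gaussian variant
    "K-nearest neighbors": KNeighborsClassifier(n_neighbors=5),
}

# Fit each model on the training split and score it on the held-out split.
scores = {
    name: model.fit(X_train, y_train).score(X_test, y_test)
    for name, model in models.items()
}
for name, accuracy in scores.items():
    print(f"{name}: {accuracy:.2f}")
```

With only 45 held-out samples, differences of a few percentage points between the models are within the noise of a single split; cross-validation would give a steadier comparison.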