Create a Scatter Plot Using Sepal Length and Petal Width to Separate the Species Classes Using scikit-learn
In this article, we will see how to create a scatter plot of sepal length against petal width to separate the species classes using scikit-learn in Python.
The Iris dataset contains 150 samples: 50 from each of three Iris species, each described by four characteristics (the length and width of the sepals and petals). The three species are Iris setosa, Iris virginica, and Iris versicolor. These measurements were originally used by Ronald Fisher to develop a linear discriminant model that classifies the species. The dataset is frequently used in data mining, classification, clustering, and algorithm testing.
Now, let’s create a scatter plot of sepal length against petal width to separate the species classes using scikit-learn.
Import the data
First, let’s import the packages and load the “iris.csv” file. The .head() method returns the first five rows of the dataset. The columns in our dataset are ‘sepal_length’, ‘sepal_width’, ‘petal_length’, ‘petal_width’ and ‘species’.
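The loading step can be sketched as follows. This is a minimal example, not necessarily the exact code the article used: the helper name load_iris_frame is hypothetical, and the fallback that rebuilds the table from scikit-learn's bundled copy of the dataset (with columns renamed to match the ones listed above) is an assumption added so the snippet runs even when “iris.csv” is not in the working directory.

```python
from pathlib import Path

import pandas as pd
from sklearn.datasets import load_iris

COLUMNS = ["sepal_length", "sepal_width", "petal_length", "petal_width"]


def load_iris_frame(csv_path="iris.csv"):
    """Return the Iris data with the column names used in this article."""
    if Path(csv_path).exists():
        # Preferred path: read the CSV file described in the article.
        return pd.read_csv(csv_path)
    # Fallback (assumption): rebuild the same table from scikit-learn's copy.
    bunch = load_iris()
    frame = pd.DataFrame(bunch.data, columns=COLUMNS)
    # Map the integer targets back to species names so the column holds strings.
    frame["species"] = bunch.target_names[bunch.target]
    return frame


df = load_iris_frame()
print(df.head())   # first five rows
print(df.shape)    # number of rows and columns
```

Printing df.shape alongside df.head() is a quick check that all samples were loaded.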
Label encoding the ‘species’ column of the dataset
sklearn.preprocessing.LabelEncoder() converts string labels to integer labels, numbered from 0 in sorted order of the class names. After encoding, the ‘species’ column holds the integers 0, 1 and 2 instead of the species names.
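A minimal sketch of the encoding step is shown below. To keep the snippet self-contained, it assumes the data is rebuilt from scikit-learn's bundled Iris dataset rather than read from “iris.csv”; the variable names encoder and label_map are illustrative.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.preprocessing import LabelEncoder

# Assumption: rebuild the table from scikit-learn instead of reading iris.csv.
bunch = load_iris()
df = pd.DataFrame(
    bunch.data,
    columns=["sepal_length", "sepal_width", "petal_length", "petal_width"],
)
df["species"] = bunch.target_names[bunch.target]  # string labels

# Fit the encoder on the string labels and replace them with integers.
encoder = LabelEncoder()
df["species"] = encoder.fit_transform(df["species"])

# encoder.classes_ keeps the original names, indexed by their new label.
label_map = {name: code for code, name in enumerate(encoder.classes_)}
print(label_map)
print(df.head())
```

Keeping the fitted encoder around is useful because encoder.inverse_transform() can turn the integers back into species names later.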
Creating a scatterplot
We use the matplotlib and seaborn libraries to create the scatter plot. sns.scatterplot() draws the plot: to visualize sepal length against petal width, we pass ‘sepal_length’ for the x-axis and ‘petal_width’ for the y-axis. The hue parameter controls the color of the points; we pass the column name ‘species’ so that the species can be told apart by color. Because that column is already label encoded, the legend shows the new labels 0, 1 and 2, and the points in the scatter plot are colored according to those labels.
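The plotting step might look like the sketch below. It is self-contained, so it assumes the data is rebuilt from scikit-learn's bundled Iris dataset and label encoded in place; the figure size, title and output filename are illustrative choices, not values from the article.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
from sklearn.datasets import load_iris
from sklearn.preprocessing import LabelEncoder

# Assumption: rebuild the table from scikit-learn instead of reading iris.csv.
bunch = load_iris()
df = pd.DataFrame(
    bunch.data,
    columns=["sepal_length", "sepal_width", "petal_length", "petal_width"],
)
# Label encode the species names to 0, 1, 2.
df["species"] = LabelEncoder().fit_transform(bunch.target_names[bunch.target])

# Sepal length on the x-axis, petal width on the y-axis, colored by species.
fig, ax = plt.subplots(figsize=(8, 6))
sns.scatterplot(data=df, x="sepal_length", y="petal_width", hue="species", ax=ax)
ax.set_title("Sepal length vs. petal width by species")

# Save to a file; in an interactive session plt.show() can be used instead.
fig.savefig("iris_scatter.png")
```

Because the encoded labels are numeric, seaborn treats hue as a numeric variable and picks its colors from a sequential colormap by default.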