
Create a Scatter Plot Using Sepal Length and Petal Width to Separate the Species Classes Using scikit-learn

Last Updated : 02 Jun, 2022

In this article, we will see how to create a scatter plot of sepal length against petal width to separate the species classes, using pandas, scikit-learn, and seaborn in Python.

The Iris dataset contains 150 samples: 50 from each of three Iris species, each described by four characteristics (the length and width of the sepals and petals). The three species are Iris setosa, Iris virginica, and Iris versicolor. These measurements were originally used to develop a linear discriminant model to classify the species. The dataset is frequently used in data mining, classification, clustering, and algorithm testing.

Now, let’s create a scatter plot of sepal length and petal width to separate the species classes, using scikit-learn to encode the species labels.

Import the data

First, let’s import the packages and load the “iris.csv” file. The .head() method returns the first five rows of the dataset. The columns in our dataset are ‘sepal_length’, ‘sepal_width’, ‘petal_length’, ‘petal_width’ and ‘species’.

To view and download the csv file click here.
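If the CSV file is not at hand, an equivalent DataFrame can be built from the copy of the Iris data that ships with scikit-learn. The following is a minimal sketch under that assumption; the helper name load_iris_frame is hypothetical, and the columns are renamed only to match the names used in this article.

```python
import pandas as pd
from sklearn.datasets import load_iris


def load_iris_frame():
    """Build a DataFrame shaped like iris.csv from scikit-learn's bundled data.

    Hypothetical helper for illustration, not part of the original article.
    """
    bunch = load_iris(as_frame=True)
    frame = bunch.frame.copy()
    # scikit-learn names the columns 'sepal length (cm)' and so on;
    # rename them to the names used in this article
    frame = frame.rename(columns={
        "sepal length (cm)": "sepal_length",
        "sepal width (cm)": "sepal_width",
        "petal length (cm)": "petal_length",
        "petal width (cm)": "petal_width",
    })
    # 'target' holds the integers 0..2; map them to the species names
    frame["species"] = bunch.target_names[frame.pop("target").to_numpy()]
    return frame


iris = load_iris_frame()
print(iris.head())
# Number of samples per species
print(iris["species"].value_counts())
```

The value_counts() call is a quick sanity check that each species contributes the same number of rows.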

Python3




# importing packages
import pandas as pd
import matplotlib.pyplot as plt
from sklearn import preprocessing
import seaborn as sns
  
# loading data
iris = pd.read_csv("iris.csv")
print(iris.head())


Output:

 

Label encoding the ‘species’ column of the dataset

sklearn.preprocessing.LabelEncoder() converts string labels to numerical labels. After encoding the ‘species’ column the dataset looks like this:
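As a small standalone illustration of what the encoder does, the sketch below fits a LabelEncoder on a short hand-written list of species names (made up for this example, not read from the dataset). The classes are sorted and numbered from 0, and inverse_transform recovers the original strings.

```python
from sklearn.preprocessing import LabelEncoder

# Hand-written sample labels, for illustration only
species = ["virginica", "setosa", "versicolor", "setosa"]

le = LabelEncoder()
codes = le.fit_transform(species)

# Classes are stored in sorted order; a label's code is its index in this list
print(list(le.classes_))                   # ['setosa', 'versicolor', 'virginica']
print(codes.tolist())                      # [2, 0, 1, 0]

# inverse_transform maps the integer codes back to the original strings
print(list(le.inverse_transform(codes)))
```

Keeping the fitted encoder around is what makes it possible to translate 0, 1, 2 back to species names later.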

Python3




le = preprocessing.LabelEncoder()
  
# Converting string labels of
# the 'species' column into numbers.
iris.species = le.fit_transform(iris.species)
print(iris.head())


Output:

 

Creating a scatterplot

We use the matplotlib and seaborn libraries to create the scatter plot. sns.scatterplot() draws the plot: since we want to visualize sepal length against petal width, we pass ‘sepal_length’ as x and ‘petal_width’ as y. The hue parameter controls the color of the points; we pass the column name ‘species’ so that each point is colored by its species. Because that column is already label encoded, the legend shows the new labels 0, 1, and 2. In the scatter plot, the points are grouped by color according to these labels.

Python3




# plotting a scatterplot using seaborn
sns.scatterplot(data=iris, x='sepal_length',
                y='petal_width', hue='species')
# display the figure
plt.show()
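Because the encoded column appears as 0, 1, 2 in the legend, it can be handy to map the integers back to species names before plotting. The sketch below shows one way to do that, assuming the data is taken from scikit-learn's bundled copy of the dataset rather than iris.csv; the column name species_name is hypothetical.

```python
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.datasets import load_iris
from sklearn.preprocessing import LabelEncoder

# Assumption: use scikit-learn's bundled Iris data instead of iris.csv.
# Feature columns are ordered: sepal length, sepal width, petal length, petal width.
bunch = load_iris()
iris = pd.DataFrame({
    "sepal_length": bunch.data[:, 0],
    "petal_width": bunch.data[:, 3],
    "species": bunch.target_names[bunch.target],
})

# Encode the species as in the article, keeping the fitted encoder
le = LabelEncoder()
iris["species"] = le.fit_transform(iris["species"])

# Decode the integers back to names in a separate (hypothetical) column
iris["species_name"] = le.inverse_transform(iris["species"])

# Color by the decoded names so the legend is readable
ax = sns.scatterplot(data=iris, x="sepal_length",
                     y="petal_width", hue="species_name")
ax.set_title("Sepal length vs. petal width by species")
```

Calling plt.show() afterwards displays the figure in an interactive session; ax.figure.savefig() with a file name writes it to disk instead.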


Output:

 


