Emotion Based Music Player – Python Project

  • Difficulty Level : Hard
  • Last Updated : 04 Jul, 2021

In this article, we are going to build an emotion-based music player using Python, OpenCV, Android Studio, and the FisherFace algorithm. Music has the power to heal, in words attributed to Ray Charles. Music is closely tied to a person’s emotions and state of mind; it is a way for people to express themselves as well as an important medium of entertainment for music lovers and listeners. Listening to music helps us relax and calm down, and it can evoke deep feelings and carry a message. As technology advances, the number of artists, songs, and listeners keeps growing, and with it the problem of manually browsing for music that fits one’s mood. This is where our project comes in. The face plays a vital role in revealing a person’s behavior and state of mind. Our project detects the mood of the user and plays a song or playlist that matches it. It uses a web camera to capture an image of the user, classifies the facial expression as happy, sad, neutral, or angry, and then plays a song according to the detected emotion. The major advantage of this project is that the user doesn’t need to browse and choose songs manually.

Tools and technologies

We need to have Android Studio installed on our system. We have also used OpenCV, Jupyter notebooks, and a Convolutional Neural Network (CNN) for detecting the user’s mood. We have made use of the FisherFace algorithm along with Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA).

Required Skillset 

Basic knowledge of Python and intermediate to advanced knowledge of OpenCV, Android Studio, and the FisherFace algorithm.


Detecting moods is quite challenging. We trained our model on two datasets: JAFFE and the Cohn-Kanade database. Both are available online, and we also used them to evaluate the model. The FisherFace algorithm takes labeled images, performs dimensionality reduction on the data, and computes statistics for each class. It then processes the input image in the same way, compares the result with the training data, and returns the closest matching class.

LDA is a supervised learning technique: it learns from labeled data. It performs dimensionality reduction by finding the directions that best separate the classes, which shortens the time needed to process and classify the data. PCA is unsupervised: it transforms a set of possibly correlated variables into a smaller set of uncorrelated variables (the principal components) that retain most of the variance in the data. In FisherFace, PCA first reduces the dimensionality of the images and LDA is then applied to the reduced data.
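
The PCA-then-LDA idea described above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the implementation OpenCV uses: the function names `fit_fisherfaces` and `predict`, the parameter `n_pca`, and the nearest-neighbour comparison are assumptions made for the example.

```python
import numpy as np

def fit_fisherfaces(X, y, n_pca):
    """Illustrative PCA + LDA fit.

    X: (n_samples, n_features) array of flattened grayscale images.
    y: integer class label for each row of X.
    n_pca: number of principal components to keep (an assumed parameter).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    mean = X.mean(axis=0)
    Xc = X - mean

    # PCA: the rows of Vt are the directions of largest variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    W_pca = Vt[:n_pca].T               # (n_features, n_pca)
    Z = Xc @ W_pca                     # data expressed in PCA space

    # LDA: within-class (Sw) and between-class (Sb) scatter in PCA space.
    Sw = np.zeros((n_pca, n_pca))
    Sb = np.zeros((n_pca, n_pca))
    overall = Z.mean(axis=0)
    for c in classes:
        Zc = Z[y == c]
        mu = Zc.mean(axis=0)
        Sw += (Zc - mu).T @ (Zc - mu)
        d = (mu - overall)[:, None]
        Sb += len(Zc) * (d @ d.T)

    # Keep the directions that maximise between-class over within-class scatter.
    evals, evecs = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)
    order = np.argsort(-evals.real)[: len(classes) - 1]
    W_lda = evecs[:, order].real       # (n_pca, n_classes - 1)

    W = W_pca @ W_lda                  # combined projection
    return mean, W, Xc @ W, y

def predict(model, x):
    """Project x and return the label of the nearest training sample."""
    mean, W, train_proj, labels = model
    p = (np.asarray(x, dtype=float) - mean) @ W
    nearest = np.argmin(np.linalg.norm(train_proj - p, axis=1))
    return labels[nearest]
```

In practice the rows of `X` would be equally sized face images flattened to vectors, and `y` would hold the index of each image's emotion.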

We trained with the Cohn-Kanade dataset and divided it into classes to train and test our model. Because the input comes from the user, the model gives a good accuracy rate while needing a smaller dataset and less memory. It also produces output quickly, with a short response time.

The steps involved in processing the data and detecting the moods are discussed below:

  • Step 1: The user provides input in the form of an image captured by the user’s web camera.
  • Step 2: Our model analyses the image and classifies it as a happy, sad, neutral, or angry emotion.
  • Step 3: Features are extracted from the image and compared against the training datasets, JAFFE and Cohn-Kanade.
  • Step 4: The playlist or songs are chosen according to the detected mood of the user.
  • Step 5: The music is played to boost the user’s mood after the emotion has been detected successfully.
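
Step 4 can be sketched as a simple lookup from the detected emotion to a list of songs. The mapping, the file paths, and the function name `pick_song` below are hypothetical placeholders, not part of the original project.

```python
import random

# Hypothetical mapping from emotion to song files; the paths are placeholders.
PLAYLISTS = {
    "happy": ["songs/happy/track1.mp3", "songs/happy/track2.mp3"],
    "sad": ["songs/sad/track1.mp3", "songs/sad/track2.mp3"],
    "neutral": ["songs/neutral/track1.mp3"],
    "angry": ["songs/angry/track1.mp3"],
}

def pick_song(emotion, playlists=PLAYLISTS, rng=random):
    """Return one song for the detected emotion, falling back to 'neutral'."""
    songs = playlists.get(emotion) or playlists["neutral"]
    return rng.choice(songs)
```

Falling back to the neutral playlist is a design choice for the sketch: it keeps the player working even if the classifier returns a label that has no playlist.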

The flowchart representing the same is shown below:

Code Snippet:

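For displaying the source input, use this code. Tkinter ships with most Python installations, so it does not need to be installed with pip. Install the remaining libraries using the command prompt (opencv-contrib-python provides the cv2.face module that contains the FisherFace recognizer):

pip -V
pip install opencv-contrib-python pillow numpy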


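import tkinter as tk
import cv2
from PIL import Image, ImageTk

width, height = 800, 600
cap = cv2.VideoCapture(0)  # open the default web camera
cap.set(cv2.CAP_PROP_FRAME_WIDTH, width)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, height)

root = tk.Tk()
root.bind('<Escape>', lambda e: root.quit())  # press Esc to close the window
lmain = tk.Label(root)
lmain.pack()

def show_frame():
    ok, frame = cap.read()  # grab one frame from the camera
    if ok:
        frame = cv2.flip(frame, 1)  # mirror the image horizontally
        cv2image = cv2.cvtColor(frame, cv2.COLOR_BGR2RGBA)
        img = Image.fromarray(cv2image)
        imgtk = ImageTk.PhotoImage(image=img)
        lmain.imgtk = imgtk  # keep a reference so the image is not garbage collected
        lmain.configure(image=imgtk)
    lmain.after(10, show_frame)  # schedule the next frame in 10 ms

show_frame()
root.mainloop()
cap.release()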

Import the set of libraries and use the following code for detecting emotion:


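import glob
import cv2
import numpy as np

# Requires opencv-contrib-python, which provides the cv2.face module.
fishface = cv2.face.FisherFaceRecognizer_create()

def update(emotions):
    # Train on the dataset, then save the trained model to disk.
    run_recognizer(emotions)
    print("Saving model...")
    fishface.save("model.xml")
    print("Model saved!!")

def make_sets(emotions):
    training_data = []
    training_label = []
    for emotion in emotions:
        training = sorted(glob.glob("dataset/%s/*" % emotion))
        for item in training:
            image = cv2.imread(item)
            gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
            # FisherFace expects grayscale images that all share the same size.
            training_data.append(gray)
            training_label.append(emotions.index(emotion))
    return training_data, training_label

def run_recognizer(emotions):
    training_data, training_label = make_sets(emotions)
    print("Training model...")
    print("The size of the dataset is " + str(len(training_data)) + " images")
    fishface.train(training_data, np.asarray(training_label))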
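
Once a model has been trained, a new face image can be classified by mapping the predicted integer label back to an emotion name. The helper below is a hedged sketch: the function name `label_to_emotion` and the order of the `EMOTIONS` list are assumptions, and the list must match the order used during training. It relies only on the recognizer's `predict()` method, which in OpenCV's Python bindings returns a (label, confidence) pair.

```python
# Assumed label order; it must match the list passed to the training code.
EMOTIONS = ["happy", "sad", "neutral", "angry"]

def label_to_emotion(recognizer, gray_face, emotions=EMOTIONS):
    """Return (emotion name, confidence) for one grayscale face image.

    `recognizer` is assumed to behave like an OpenCV face recognizer:
    predict() returns an integer label and a confidence value, where the
    confidence is a distance (smaller means a closer match).
    """
    label, confidence = recognizer.predict(gray_face)
    return emotions[label], confidence
```

With the recognizer from the training code, a call would look like `label_to_emotion(fishface, gray)`, where `gray` is a grayscale face image of the same size as the training images.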

Results and Outcomes:

Static images were used for facial expression recognition. The photo to be classified is taken from the photos stored for testing. The Cohn-Kanade database has a total of 890 photos and the JAFFE database has a total of 213 photos. Each database is divided into two distinct parts: a training set and an evaluation set.

For training and evaluation, we split each database in an 80/20 ratio. Both sets contain all seven expression labels.
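
An 80/20 split that keeps every expression in both sets can be sketched as follows. The function name `split_80_20`, its parameters, and the per-emotion shuffling are assumptions made for illustration; the original article does not show how the split was performed.

```python
import random

def split_80_20(files_by_emotion, train_fraction=0.8, seed=0):
    """Split each emotion's files into training and evaluation parts.

    files_by_emotion: dict mapping an emotion name to a list of file paths.
    Returns two lists of (file, emotion) pairs.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for emotion, files in files_by_emotion.items():
        files = list(files)
        rng.shuffle(files)
        cut = int(len(files) * train_fraction)
        train += [(f, emotion) for f in files[:cut]]
        test += [(f, emotion) for f in files[cut:]]
    return train, test
```

Splitting inside each emotion, rather than over the whole dataset at once, keeps the class proportions roughly the same in both sets.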

  • Cohn-Kanade Database: To evaluate the model, we gave a single captured photo to the system. The system first determines the positions of the distinctive facial marks and then locates those marks.
  • For training and evaluation, the database was given to the model. After evaluation, the confusion matrix and the classification report are generated.
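
The confusion matrix and a basic classification report can be computed from the true and predicted labels as sketched below. This is a minimal pure-Python illustration with assumed function names; it is not the code that produced the tables in this article.

```python
from collections import Counter

def confusion_matrix(true_labels, predicted_labels, classes):
    """Rows are true classes, columns are predicted classes."""
    counts = Counter(zip(true_labels, predicted_labels))
    return [[counts[(t, p)] for p in classes] for t in classes]

def classification_report(matrix, classes):
    """Per-class precision and recall derived from a confusion matrix."""
    report = {}
    for i, name in enumerate(classes):
        tp = matrix[i][i]                        # correctly predicted as this class
        predicted = sum(row[i] for row in matrix)  # everything predicted as this class
        actual = sum(matrix[i])                  # everything truly in this class
        report[name] = {
            "precision": tp / predicted if predicted else 0.0,
            "recall": tp / actual if actual else 0.0,
        }
    return report
```

Precision answers "of the images labeled angry, how many really were angry", while recall answers "of the truly angry images, how many did the model find".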

Some of the states of mind detected by the system are shown below:


In this picture, the user’s state of mind is angry.


In this image, the user’s state of mind is sad or depressed.


In this image, the user’s state of mind is happy or joyful.


In this image captured, the user’s state of mind is neutral.

Table 1: Classification Report of CK+ Database

Table 2: Accuracy Report of CK+ Database

Real-Life applications and Future Scope

In its present state, our work plays music from a database that is available online. In the near future, we can add many more options, such as selecting a particular language or choosing between the latest songs and older songs from the 80s or 90s. Future work also aims at supporting music therapy sessions, which could help people who are suffering from mental illness or who are sad and want to feel happier. A mobile application for the emotion-based music player has to be user-friendly and offer more options. We can also create apps and websites with this feature and add speech recognition alongside facial emotion detection.
