Project Title: Video Classification based on teaching styles
Every professor has a unique style of teaching, and based on that style he or she falls into one of three major classes: Verbal, Visual, and Lively. The main aim of the project is to classify NPTEL and Coursera videos into these three classes. This will help in building a strong recommendation system, which will recommend a video to a learner based on the learner's learning style and the professor's teaching style in the video.
Goal: We aim to classify a given video as Verbal, Visual, or Lively.
Description: Given a video, our prototype will process it and extract textual and audio features. Based on the feature vector for the video, a threshold value will be computed that decides the video's class.
Dataset used: Manually curated videos for each class, namely Visual and Verbal. These videos were picked from various playlists, namely Khan Academy, Coursera, Kudvenkant Tutorials, Programming Knowledge, Ravindrababu Tutorials, Techtud, and Tushar Roy's Coding Made Simple. In all, 600 videos have been processed.
Project details: Each video that was manually classified was processed as follows:
- Video to frame conversion and extracting the .wav (audio) file
- Feature extraction performed using Ocropus on the frames selected through uniform sampling
Extracted features:
- Number of lines per slide
- Number of figures
- Face detection for speaker presence
- Audio features extracted using Praat tool from a .wav file
Extracted features:
- Total number of syllables
- Speaking time
- Articulation time
- Phonation time
- Speech rate
- Pafy used to obtain YouTube metadata for the video, such as view count, number of likes, and duration
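The frame-sampling and audio-extraction steps above can be sketched in Python. This is an illustrative sketch only: the source does not name the exact tools used for frame and audio extraction, so the index formula, the ffmpeg flags, and the 16 kHz mono settings below are assumptions.

```python
# Sketch of the preprocessing step: pick uniformly spaced frame indices,
# and build an ffmpeg command that would extract the .wav track.
# (Illustrative assumptions; the project does not specify its extraction tools.)

def uniform_sample_indices(total_frames, n_samples):
    """Return n_samples frame indices spread uniformly across the video."""
    if n_samples <= 1 or total_frames <= 1:
        return [0]
    return [i * (total_frames - 1) // (n_samples - 1) for i in range(n_samples)]

def wav_extraction_cmd(video_path, wav_path):
    """Build an ffmpeg command for a 16 kHz mono PCM .wav (not executed here)."""
    return ["ffmpeg", "-i", video_path,
            "-vn",                               # drop the video stream
            "-acodec", "pcm_s16le", "-ac", "1",  # 16-bit PCM, mono
            "-ar", "16000",                      # 16 kHz sample rate
            wav_path]

if __name__ == "__main__":
    print(uniform_sample_indices(1000, 5))  # [0, 249, 499, 749, 999]
    print(" ".join(wav_extraction_cmd("lecture.mp4", "lecture.wav")))
```

The sampled indices would then be passed to a frame grabber (e.g. OpenCV's `VideoCapture`) before running Ocropus and face detection on the selected frames.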
- A value was computed from the feature set for each video
- The threshold was determined by averaging this value over all videos in the training set
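The score-and-threshold scheme above can be sketched as follows. Only the "average the training scores to get the threshold" step comes from the description; the weighted-sum combination rule, the feature names, and the above-threshold-means-Visual convention are all hypothetical placeholders.

```python
# Hedged sketch of the threshold classifier: combine a video's features into
# a single score, set the threshold to the mean score over the training set,
# and classify by comparing against it. Weights and feature names are
# hypothetical; the project does not publish its actual combination rule.

def video_score(features, weights):
    """Weighted sum over the features named in `weights`."""
    return sum(weights[k] * features[k] for k in weights)

def compute_threshold(training_videos, weights):
    """Average score over all training videos, per the project description."""
    scores = [video_score(f, weights) for f in training_videos]
    return sum(scores) / len(scores)

def classify(features, weights, threshold):
    """Above-threshold scores lean Visual, below lean Verbal (assumed convention)."""
    return "Visual" if video_score(features, weights) >= threshold else "Verbal"
```

For example, with toy weights `{"lines_per_slide": -1.0, "figures": 2.0, "speech_rate": 0.5}`, a figure-heavy video scores above a text-heavy one, so it lands on the Visual side of the averaged threshold.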
Accuracy and Testing: The audio and textual features were extracted with an accuracy of 80 percent.
The prototype was tested against roughly 1,000 videos, again curated manually with an equal distribution of videos between the two classes, and the accuracy obtained was 75 percent.
Future Scope: Our main focus will be to expand our dataset in both volume and diversity, to improve accuracy, and to devise a more comprehensive threshold-computation algorithm. The first step towards this will be to choose the number of frames processed per video based on its duration, along with uniform sampling. We also plan to use the YouTube metadata to establish a correlation between a video's class and its popularity on the platform.
Literature studied: The bigger goal is to design a recommendation engine for users based on the teaching style they prefer, so we studied existing literature on recommender systems. Since no classifier for videos based on teaching style existed, and such a classifier is the first step towards the bigger goal, the current video-classification prototype has been proposed.
- An old, highly cited paper mapping teaching and learning styles in engineering education: http://www4.ncsu.edu/unity/lockers/users/f/felder/public/Papers/LS-1988.pdf
- Coursera specialization we studied: https://www.coursera.org/specializations/recommender-systems
- About Ocropus: http://www.danvk.org/2015/01/09/extracting-text-from-an-image-using-ocropus.html/
- YouTube metadata Libraries: http://pythonhosted.org/Pafy/
- Praat code: https://github.com/timmahrt/praatIO
- Finding figures in image: http://www.pyimagesearch.com/2014/10/20/finding-shapes-images-using-python-opencv/
GitHub Link: https://github.com/acdha/image-mining
Note: This project idea is contributed for ProGeek Cup 2.0- A project competition by GeeksforGeeks.