MOODIFY – Suggestion of Songs on the basis of Facial Emotion Recognition Project

Moodify is a song suggester that recommends songs to the user according to their mood. 'Moodify' does the selection work, leaving the user free to get carried away with the music.

I/We, student(s) of B.Tech, hereby declare that the project entitled “MOODIFY (Suggestion of Songs on the basis of Facial Emotion Recognition)” is submitted to the Department of CSE in partial fulfillment of the requirements for the award of the degree of Bachelor of Technology in CSE. The roles of the team members involved in the project are listed below:

  • Training the model for facial emotion recognition.
  • Designing the algorithm for image segregation.
  • Designing the algorithm for the music player.
  • Designing the graphical user interface.
  • Testing the model.
  • Collecting data for the model and the music player.
  • Preprocessing the data and images.

Dataset:

The dataset we have used is “Cohn-Kanade”. 
This dataset is restricted, so we cannot provide it directly, but you can request and download it from:
http://www.consortium.ri.cmu.edu/ckagree/index.cgi
To read more about the dataset, refer to:
http://www.pitt.edu/~emotion/ck-spread.htm

Feature Extraction and Selection:

1. Lips
2. Eyes
3. Forehead
4. Nose

These features are processed by the CNN layers, selected by the algorithm, and converted to a NumPy array. The model is then trained on that array, and the three mood classes used later (Happy, Excited, and Sad) are predicted.
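To make this concrete, a small Keras CNN of the kind described above might look like the following sketch. The 48×48 grayscale input size, the layer widths, and the variable names are illustrative assumptions, not the exact architecture used in the project.

```python
# Minimal sketch of a CNN for 3-class facial emotion recognition (Keras).
# Input size (48x48 grayscale) and layer widths are illustrative assumptions.
from tensorflow.keras import layers, models

def build_fer_model(num_classes=3):
    model = models.Sequential([
        layers.Input(shape=(48, 48, 1)),
        layers.Conv2D(32, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dropout(0.5),
        layers.Dense(num_classes, activation="softmax"),  # Happy / Excited / Sad
    ])
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# X: NumPy array of face crops, y: one-hot labels (assumed to be prepared earlier)
# model = build_fer_model()
# model.fit(X, y, epochs=30, validation_split=0.2)
```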

How this project works:

  • First, open the application and CHOOSE THE MODE IN WHICH YOU WANT TO LISTEN TO THE SONG.
  • The screen then shows “YOUR MOOD, YOUR MUSIC”.
  • Press “OKAY TO CAPTURE THE IMAGE”.
  • After that, press “c” to capture the image (see the capture sketch after this list).
  • If you seem Happy, you are asked to select your favorite genre.
  • If you seem Excited, you are asked to select your favorite genre.
  • If you seem Sad, you are asked to select your favorite genre.
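The capture step could be implemented with OpenCV roughly as in the sketch below; only the “press ‘c’ to capture” behaviour comes from the steps above, while the camera index, window title, and output file name are illustrative assumptions.

```python
# Sketch of the "press 'c' to capture" step using OpenCV (cv2).
# Camera index, window name, and output path are illustrative assumptions.
import cv2

cap = cv2.VideoCapture(0)            # default webcam
while True:
    ok, frame = cap.read()
    if not ok:
        break
    cv2.imshow("YOUR MOOD, YOUR MUSIC", frame)
    key = cv2.waitKey(1) & 0xFF
    if key == ord("c"):              # capture the current frame
        cv2.imwrite("capture.jpg", frame)
        break
    if key == ord("q"):              # quit without capturing
        break
cap.release()
cv2.destroyAllWindows()
```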

CODE DESCRIPTION

  • Importing all required libraries.
  • Model initialization and building.
  • Splitting the data into training and test sets, and testing.
  • Training our model.
  • Model building, splitting of the train and test sets, and training of the model.
  • Saving the model.
  • Loading a saved model.
  • Saving the image with OpenCV after cropping, loading it, and then making the prediction (see the sketch after this list).
  • Suggesting songs in offline mode.
  • Suggesting songs online (YouTube).
  • The rest of the GUI.
  • Variable Explorer.
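The saving/loading, crop-and-predict, and online-suggestion items above could be sketched roughly as follows. The file names, label order, Haar-cascade face detector, and the use of the webbrowser module for the YouTube mode are assumptions for illustration, not necessarily what the project code does.

```python
# Sketch of saving/loading the model, cropping the face, predicting the mood,
# and opening a YouTube search in online mode. File names, label order, and
# the webbrowser-based "online mode" are illustrative assumptions.
import cv2
import numpy as np
import webbrowser
from tensorflow.keras.models import load_model

# model.save("fer_model.h5")                 # saving a trained model
model = load_model("fer_model.h5")           # loading the saved model

LABELS = ["Happy", "Excited", "Sad"]         # assumed label order

face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

img = cv2.imread("capture.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Assumes at least one face is detected; take the first one.
x, y, w, h = face_cascade.detectMultiScale(gray, 1.3, 5)[0]
face = cv2.resize(gray[y:y + h, x:x + w], (48, 48)) / 255.0   # crop + normalise
mood = LABELS[int(np.argmax(model.predict(face.reshape(1, 48, 48, 1))))]

# Online mode: open a YouTube search for the detected mood and chosen genre.
genre = "pop"
webbrowser.open(f"https://www.youtube.com/results?search_query={mood}+{genre}+songs")
```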

IPython Console

  • Importing Libraries
  • Model Training
  • Model Summary
  • Online Mode
  • Offline Mode

GUI

  • Splash Screen
  • Main Screen
  • Selection screen
  • Song list screen: select a song and it will play (a minimal GUI sketch follows this list).
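As a rough illustration of the splash-screen-to-main-screen flow, assuming the GUI is built with Tkinter (the toolkit is not stated above, so this is an assumption), widget text and timings below are illustrative only.

```python
# Minimal Tkinter sketch of the splash -> main screen flow (assumed toolkit).
import tkinter as tk

def show_main_screen(root, splash):
    splash.destroy()
    main = tk.Frame(root)
    main.pack(fill="both", expand=True)
    tk.Label(main, text="YOUR MOOD, YOUR MUSIC", font=("Arial", 18)).pack(pady=20)
    tk.Button(main, text="OKAY TO CAPTURE THE IMAGE").pack(pady=10)  # capture callback goes here

root = tk.Tk()
root.title("MOODIFY")
splash = tk.Frame(root)
splash.pack(fill="both", expand=True)
tk.Label(splash, text="MOODIFY", font=("Arial", 24)).pack(pady=40)
root.after(2000, lambda: show_main_screen(root, splash))  # show splash for 2 seconds
root.mainloop()
```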

Summary

We successfully built a model for Facial Emotion Recognition (FER) and trained it, achieving an average accuracy of over 75% across various test sets. We then built a desktop application that suggests songs on the basis of the user's facial expression, completing our project. This FER model can be widely used for various purposes such as home automation, social media, and e-commerce, and we are motivated to take this project to the next level.

Download the complete project code and report for MOODIFY – Suggestion of Songs on the basis of Facial Emotion Recognition.

Audio Classification of Cats and Dogs – Python Project

Our audio classification project illustrates a straightforward audio classification model based on deep learning. We address the problem of classifying the type of sound from short audio signals and the spectrograms generated from them, distinguishing dog audio from cat audio during model training. To meet this challenge, we use a model based on a Convolutional Neural Network (CNN). The audio was processed with Mel-Frequency Cepstral Coefficients (MFCC) into what are commonly called Mel spectrograms, and was thus transformed into images. Our final CNN model achieved 89% accuracy on the testing dataset.

Project Overview:

The input to our model in this project is cat and dog audio recordings in WAV format. The task falls under supervised machine learning, so a dataset with target classes is used. The goal is to classify whether a given input WAV file is that of a cat or a dog. Cat and dog sounds are quite distinguishable, for example in their pitch and frequency, and different sounds have different sample rates. By default, Librosa mixes all audio down to mono and resamples it to 22,050 Hz at load time. Librosa is an open-source Python package for music and audio analysis; it provides both the audio data and the sampling rate. Audio in its raw form must be pre-processed to extract significant and meaningful features, so we applied the MFCC (Mel-Frequency Cepstral Coefficients) algorithm. After feature extraction, the data is fed to the model and the dataset is split into training and test sets. After this preprocessing, a Convolutional Neural Network model is designed using TensorFlow; the Keras API was used for all the code and model building, implemented in Google Colab.
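A minimal sketch of the Librosa loading and MFCC extraction step described above might look like this; the file path and the number of MFCC coefficients are illustrative assumptions.

```python
# Sketch of loading a WAV file with Librosa and extracting MFCC features.
# File path and n_mfcc are illustrative assumptions.
import librosa
import numpy as np

path = "cat_1.wav"                                        # example input file
signal, sr = librosa.load(path)                           # mono, resampled to 22050 Hz by default
mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=40)   # shape: (40, n_frames)
features = np.mean(mfcc.T, axis=0)                        # one 40-dim feature vector per clip
print(features.shape)                                     # (40,)
```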

Motivation

Machine learning can be used in image processing, speech understanding, musical instrument recognition, speech-to-text, environmental sound classification, and many more areas. For our project, we implemented one class of speech processing, namely audio classification: converting sound waves into audio features and spectrograms, which are visual representations of frequencies, with the help of functions provided by machine learning libraries.

There are many techniques to classify images, as many ready-made CNN-based neural network architectures already exist, especially for image data. It is straightforward to extract features from images because images already come in the form of numbers: an image is a collection of pixels, and pixels are numbers. When we have data as text, we use sequential encoder- and decoder-based techniques to find features. But sound or audio recognition is more difficult than text because it is based on frequency and time. Therefore, a proper model has to be built to extract the frequency and pitch of the audio so that it becomes easier to recognize later.
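To make that last point concrete, a CNN of the kind used here could be defined roughly as below, treating the MFCC/Mel-spectrogram matrix as a single-channel image. The input shape and layer sizes are assumptions for illustration, not the project's exact architecture.

```python
# Sketch of a Keras CNN that classifies MFCC/Mel-spectrogram "images" as cat vs dog.
# Input shape and layer widths are illustrative assumptions.
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(40, 173, 1)),        # (n_mfcc, n_frames, 1), assumed dimensions
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dropout(0.3),
    layers.Dense(2, activation="softmax"),   # cat vs dog
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
model.summary()
```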

Flow Chart:

Preliminaries and Background 

Related work

Machine learning: Image classification of cats and dogs – A decade ago, many problems in computer vision had saturated in terms of their accuracy. However, the accuracy on these problems improved significantly with the rise of deep learning techniques. Image classification is defined as predicting the distinct category an image belongs to. For the given input image, with the aim of achieving high precision, a state-of-the-art approach was used: a convolutional neural network was built for the image classification task of dogs and cats. The dataset was taken from Kaggle and comprised a total of 25,000 images of dogs and cats.

Machine learning: Audio classification of different bird species – Here, the methodology and results of using deep learning to assist in the classification of birds by their sounds are presented. As birds indicate the health of an ecosystem, this topic is of high importance. Random Forest classification and six custom CNN models from the literature were run on a dataset of ten bird species compiled from xeno-canto.org. The highest accuracy achieved was around 65% with the Random Forest and about 58% with the CNN model.

Conclusion and Future Work

In this report, we first briefly explained the overview of this project and discussed some related, already established work. Then we precisely described our task, including the learning task and the performance task. After that, we explained the approach we took to classify the dataset. The model we used is a neural network, a trainable deep-learning model with which we were able to classify the dog and cat audio. The highest accuracy we achieved was 89.6%.

  1. In the future, we will try to implement different high-level models in order to achieve much higher accuracy.
  2. We will build a system that can directly take in live raw audio as input.