Spamming is the posting of unwanted, unrelated comments on posts on any social sharing or video-sharing platform. These messages are often posted by bots to disrupt the viewing experience or to lower the ranking of the post and of the website that hosts it. Spam is also posted manually, mostly on highly competitive pages.
A few existing methods remove spam using data mining techniques. In this project, we automate spam comment detection with machine learning: we take a dataset of YouTube spam messages and apply CountVectorizer and the Naive Bayes algorithm to classify the comments, using Python.
In this project, CountVectorizer is used to extract features from the dataset, and the data is split into training and test sets to build the model. The Naive Bayes classifier is then trained on the training set and checked against the test set. Based on the trained model, a given message is tested to determine whether it is spam or not.
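The steps described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the comments and labels are made-up stand-ins for the YouTube dataset, `MultinomialNB` is assumed as the Naive Bayes variant (the text does not say which one is used), and `is_spam` is a hypothetical helper name.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB

# Made-up stand-in for the YouTube comment dataset (assumption, not real data).
comments = [
    "Check out my channel for free gift cards",
    "Subscribe to my channel and win a free phone",
    "Click this link to earn money fast",
    "Free followers, visit my page now",
    "Win money now, click the link below",
    "Visit my channel for free subscribers",
    "This song brings back so many memories",
    "Great video, the editing is really good",
    "I love the guitar solo in this song",
    "Thanks for the tutorial, it helped a lot",
    "The chorus of this song is amazing",
    "Really good explanation, thanks a lot",
]
labels = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]  # 1 = spam, 0 = not spam

# Generate the training and test sets from the given data.
X_train_text, X_test_text, y_train, y_test = train_test_split(
    comments, labels, test_size=0.25, stratify=labels, random_state=0
)

# Extract word-count features; the vocabulary is learned from the training set only.
vectorizer = CountVectorizer()
X_train = vectorizer.fit_transform(X_train_text)
X_test = vectorizer.transform(X_test_text)

# Train the Naive Bayes classifier (MultinomialNB is an assumed choice).
classifier = MultinomialNB()
classifier.fit(X_train, y_train)
print("Test accuracy:", classifier.score(X_test, y_test))


def is_spam(message):
    """Hypothetical helper: return True if the trained model labels the message as spam."""
    features = vectorizer.transform([message])
    return bool(classifier.predict(features)[0] == 1)


print(is_spam("Click the link to win free money"))
print(is_spam("I love this song"))
```

With a dataset this small the accuracy figure means very little. The sketch only shows how the vectorizer, the split, and the classifier fit together.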
In the existing system, data mining techniques are used to detect spam messages. Most of these methods work only after a message has been posted. There is a need for a system that can automate detection before the message is posted.
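One way to picture a check that runs before posting is a small gate function that scores a comment and accepts it only if the predicted spam probability is below a threshold. The sketch below is an assumption about how such a gate might look, not a description of the project's implementation. The training examples are made up, and `spam_filter`, `submit_comment`, and the `threshold` value of 0.5 are hypothetical.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Made-up training examples; a real system would load the labelled dataset.
train_comments = [
    "Check out my channel for free gift cards",
    "Subscribe to my channel and win a free phone",
    "Click this link to earn money fast",
    "Free followers, visit my page now",
    "This song brings back so many memories",
    "Great video, the editing is really good",
    "I love the guitar solo in this song",
    "Thanks for the tutorial, it helped a lot",
]
train_labels = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = spam, 0 = not spam

# Chain feature extraction and classification so raw text can be scored directly.
spam_filter = make_pipeline(CountVectorizer(), MultinomialNB())
spam_filter.fit(train_comments, train_labels)


def submit_comment(text, threshold=0.5):
    """Hypothetical gate: return (accepted, spam_probability) before a comment is posted."""
    spam_index = list(spam_filter.classes_).index(1)
    spam_probability = float(spam_filter.predict_proba([text])[0][spam_index])
    accepted = bool(spam_probability < threshold)
    return accepted, spam_probability


for candidate in ["Win free money, click my link", "Great song, thanks for the video"]:
    accepted, probability = submit_comment(candidate)
    print(candidate, "->", "posted" if accepted else "rejected", round(probability, 3))
```

Lowering the threshold rejects more borderline comments at the cost of blocking some legitimate ones. The right value would have to be tuned on real data.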
- Operating system: Windows 7 or 10
- Tool: Anaconda (Jupyter)
- Software: Python 3.5
- Dependencies: numpy, OpenCV
- Libraries: pandas, keras, scipy, sklearn