Cyber Bullying Detection Using Machine Learning

Abstract:

Cyberbullying is the act of sending abusive or threatening messages to a person or community, which often provokes heated exchanges among users. It is mostly seen on social networking sites, where users reply to posts with bullying words to threaten or insult other users, and it is considered a misuse of technology. According to recent surveys from around the world, cyberbullying cases are increasing day by day. Various authors have proposed natural language processing techniques to address this problem, but these are time-consuming and not automatic. With advances in machine learning and artificial intelligence, models can be trained to detect cyberbullying automatically. To demonstrate this, a live chat application with multiple clients and one server is developed in Python. A Naive Bayes model is trained on a Twitter dataset and used to detect cyberbullying in live messages, and alert messages are shown in the chat application.

Problem statement:

Social networking and online chat applications give any user a platform to share knowledge and talent, but a few users misuse these platforms to threaten others through cyberbullying, which makes the platforms harder for everyone to use.

Objective:

To give users a better platform for sharing knowledge on social networking sites, an effective detection system is needed that automates cyberbullying detection and acts on the result.

Existing system:

  • Unsupervised labeling methods that use N-gram and TF-IDF features have been applied to a YouTube dataset to detect cyberbullying.
  • A support vector classifier is used to train the detection models.
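
The N-gram and TF-IDF features mentioned above can be sketched in a few lines of plain Python. This is only an illustration of the general technique, not the code of any existing system: the function names, the choice of unigrams plus bigrams, and the sample messages are assumptions made for this example.

```python
import math
from collections import Counter


def word_ngrams(text, n_max=2):
    """Return the lowercase word n-grams of length 1..n_max found in text."""
    words = text.lower().split()
    grams = []
    for n in range(1, n_max + 1):
        for i in range(len(words) - n + 1):
            grams.append(" ".join(words[i:i + n]))
    return grams


def tfidf_vectors(documents, n_max=2):
    """Return one {ngram: tf-idf weight} dictionary per document."""
    tokenized = [word_ngrams(doc, n_max) for doc in documents]

    # Document frequency: in how many documents does each n-gram occur?
    doc_freq = Counter()
    for grams in tokenized:
        doc_freq.update(set(grams))

    total_docs = len(documents)
    vectors = []
    for grams in tokenized:
        counts = Counter(grams)
        vectors.append({
            # term frequency * inverse document frequency
            gram: (count / len(grams)) * math.log(total_docs / doc_freq[gram])
            for gram, count in counts.items()
        })
    return vectors


# Hypothetical messages, invented for illustration only.
messages = ["you are stupid", "you are kind", "have a nice day"]
for message, vector in zip(messages, tfidf_vectors(messages)):
    top = max(vector, key=vector.get)
    print(f"{message!r}: highest-weighted n-gram is {top!r}")
```

N-grams that occur in many messages (such as "you") receive a low weight, while rarer ones (such as "stupid") receive a higher weight, which is what makes these features useful as classifier input.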

Disadvantages:

The techniques used in the existing system are not automated; they need time to process each request and update the response.

Social networking and chat sites require automated detection and processing methods.

Proposed system:

Cyberbullying detection is designed using machine learning techniques. A Twitter dataset with features and labels is collected, a model is trained on it using the Naive Bayes algorithm, and the trained model is applied to a live chat application that has multiple clients and a single server. Each message is checked for cyberbullying using the model, and alert messages are then posted on the chat board.
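
The training and per-message detection steps described above might look roughly like the following sketch. It is a minimal multinomial Naive Bayes written in plain Python with add-one smoothing; the class name, the `check_message` helper, the label names, and the six training messages are all hypothetical and stand in for the Twitter dataset and the chat server logic, which are not shown in this synopsis.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesTextClassifier:
    """Multinomial Naive Bayes over word counts with add-one smoothing."""

    def fit(self, messages, labels):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.label_counts = Counter(labels)      # label -> number of messages
        self.vocabulary = set()
        for message, label in zip(messages, labels):
            words = message.lower().split()
            self.word_counts[label].update(words)
            self.vocabulary.update(words)
        return self

    def predict(self, message):
        words = message.lower().split()
        total_messages = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.label_counts.items():
            # Log prior plus the sum of log likelihoods of the known words.
            score = math.log(count / total_messages)
            total_words = sum(self.word_counts[label].values())
            for word in words:
                if word not in self.vocabulary:
                    continue  # words never seen in training are ignored
                likelihood = (self.word_counts[label][word] + 1) / (
                    total_words + len(self.vocabulary))
                score += math.log(likelihood)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def check_message(model, message):
    """Return the alert text to post on the chat board, or None."""
    if model.predict(message) == "bullying":
        return "ALERT: possible cyberbullying detected"
    return None


# Hypothetical training data, invented for illustration only.
TRAINING_DATA = [
    ("you are so stupid and ugly", "bullying"),
    ("nobody likes you loser", "bullying"),
    ("shut up you idiot", "bullying"),
    ("have a great day", "normal"),
    ("thanks for sharing this", "normal"),
    ("see you at the meeting", "normal"),
]
texts, labels = zip(*TRAINING_DATA)
model = NaiveBayesTextClassifier().fit(texts, labels)

print(check_message(model, "you are an idiot"))
print(check_message(model, "thanks for the great meeting"))
```

In a chat application of the kind described, the server would call something like `check_message` on every incoming message before broadcasting it to the other clients.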

Advantages:

The cyberbullying detection process is automatic, takes little time, and works in a live environment.

Machine learning is used to train the detection model, with the aim of producing accurate results.

Software Requirement:

Programming language: Python

Front end GUI: Tkinter

Dataset: Twitter cyberbullying dataset

Algorithm: Naive Bayes

Students Marks Prediction Using Linear Regression

Abstract:

Analyzing and predicting academic performance is important for any educational institution. Predicting student performance helps teachers develop strategies for improving performance at an early stage. With advances in supervised and unsupervised machine learning, such applications help teachers analyze students better than existing methods do. In this project, students' academic performance is predicted with linear regression: marks from previous subjects are taken as input, the marks for the next subject are predicted, and the accuracy of the model is calculated.

Problem statement:

Students' marks were previously analyzed and predicted by guesswork, and students' individual marks were not considered in academic evaluation.

Objective:

Machine learning based data mining techniques are used to automate student performance prediction, using linear regression.

Existing system:

  • Researchers have worked on grading systems in which final examination marks are used to assign grades and evaluate each student.
  • Association rule mining and the Apriori algorithm are used to classify students based on their marks.

Disadvantages:

  • Most of these methods rely on data mining techniques that work only on complete data, after all marks are available.
  • Early-stage evaluation is not possible with these methods.

Proposed system:

  • Students' marks in other subjects are taken as input for evaluating their performance. The dataset is pre-processed, features and labels are extracted, the dataset is split into training and test sets, and linear regression is then applied for prediction.
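
The pipeline above (extract features and labels, split, fit, evaluate) can be sketched in plain Python as follows. This is a simplified illustration under stated assumptions: the marks table is invented, the single feature is the mean of the earlier subject marks, the fit is ordinary least squares on one variable, and R squared on the test set is used as the accuracy measure. The real project may use several features and a library implementation instead.

```python
import random


def split_features_labels(rows):
    """Feature: mean of the earlier subject marks. Label: mark in the last subject."""
    features = [sum(row[:-1]) / len(row[:-1]) for row in rows]
    labels = [row[-1] for row in rows]
    return features, labels


def train_test_split(features, labels, test_ratio=0.25, seed=0):
    """Shuffle row indices with a fixed seed and hold out a share for testing."""
    indices = list(range(len(features)))
    random.Random(seed).shuffle(indices)
    test_size = max(1, int(len(indices) * test_ratio))
    test_idx, train_idx = indices[:test_size], indices[test_size:]
    return ([features[i] for i in train_idx], [labels[i] for i in train_idx],
            [features[i] for i in test_idx], [labels[i] for i in test_idx])


def fit_line(x, y):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    variance = sum((a - mean_x) ** 2 for a in x)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


def r_squared(x, y, slope, intercept):
    """Share of the variance in y that the fitted line explains."""
    mean_y = sum(y) / len(y)
    residual = sum((b - (slope * a + intercept)) ** 2 for a, b in zip(x, y))
    total = sum((b - mean_y) ** 2 for b in y)
    return 1 - residual / total


# Hypothetical rows: marks in three earlier subjects, then the next subject.
MARKS = [
    (62, 58, 65, 61), (75, 80, 78, 79), (45, 50, 48, 47), (88, 92, 85, 90),
    (55, 60, 58, 57), (70, 68, 72, 71), (95, 90, 93, 94), (38, 42, 40, 41),
]

features, labels = split_features_labels(MARKS)
x_train, y_train, x_test, y_test = train_test_split(features, labels)
slope, intercept = fit_line(x_train, y_train)
print("Predicted next mark for an average of 66:", slope * 66 + intercept)
print("R squared on the test set:", r_squared(x_test, y_test, slope, intercept))
```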

Advantages:

  • Predictions can be made before the final marks of all subjects are evaluated.
  • Marks prediction can be automated using machine learning.

SOFTWARE REQUIREMENTS:

  • Operating system : Windows XP/7/10
  • Coding language : Python
  • Development environment : Anaconda, Jupyter
  • Dataset : students marks dataset
  • IDE : Jupyter Notebook

COVID-19 Data Analysis And Cases Prediction Using CNN

ABSTRACT:

Coronavirus (COVID-19) is creating panic all over the world with fast-growing case numbers. Various datasets are available that provide information on how the disease has spread worldwide. COVID-19 has affected all counties with a large number of cases, with varying numbers of deaths, survivors, and affected people. In this project we use a dataset that has county-wise details of cases with various combined features and labels. The project analyzes the data for various counties over time and other factors, creates a model for survival and death cases, and predicts future cases. Deep learning, a branch of machine learning, offers methods such as the convolutional neural network (CNN), which is used here to create the model and predict cases for the next few months.

PROBLEM STATEMENT:

With COVID-19 cases increasing all over the world, daily prediction and analysis are required for effective control of the pandemic.

OBJECTIVE:

Data is collected from Kaggle and the New York dataset, then pre-processed and analyzed, and a machine learning model is generated for predicting future cases.

EXISTING SYSTEM:

  • COVID-19 prediction has been performed with different machine learning techniques based on X-ray datasets collected from COVID-19 patients.
  • Disease prediction from X-ray images is done using deep learning techniques.

Disadvantages:

  • The dataset used for predicting the disease is different from the one used in this project.
  • These approaches depend on image processing techniques.

PROPOSED SYSTEM:

The collected dataset is pre-processed, the steps for building a deep learning model are carried out, and future cases are predicted. Data analysis is then performed on various factors.
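
The synopsis does not spell out the pre-processing or the network itself, so the following is only a sketch of two building blocks that a one-dimensional CNN forecaster of this kind typically relies on: framing a case series into sliding windows, and the forward pass of a single convolution filter with ReLU. The function names, the window size, the hand-picked kernel, and the case counts are assumptions for illustration; nothing here is trained, and a real model would normally be built with a deep learning library.

```python
def min_max_scale(series):
    """Scale values to the range 0..1 so that inputs are comparable."""
    low, high = min(series), max(series)
    span = (high - low) or 1  # avoid division by zero for a constant series
    return [(value - low) / span for value in series]


def make_windows(series, window):
    """Turn a daily series into (input window, next-day target) pairs."""
    inputs, targets = [], []
    for start in range(len(series) - window):
        inputs.append(series[start:start + window])
        targets.append(series[start + window])
    return inputs, targets


def conv1d(signal, kernel, bias=0.0):
    """One 'valid' 1D convolution filter followed by ReLU.

    As in most CNN libraries, the kernel is slid over the signal
    without being flipped (cross-correlation).
    """
    width = len(kernel)
    outputs = []
    for start in range(len(signal) - width + 1):
        total = bias + sum(
            s * k for s, k in zip(signal[start:start + width], kernel))
        outputs.append(max(0.0, total))  # ReLU activation
    return outputs


# Hypothetical daily case counts, invented for illustration only.
daily_cases = [120, 135, 160, 210, 260, 330, 410, 520]

scaled = min_max_scale(daily_cases)
windows, targets = make_windows(scaled, window=5)

# A hand-picked kernel that responds to day-over-day growth (not learned).
growth_kernel = [-1.0, 1.0]
for window, target in zip(windows, targets):
    print(conv1d(window, growth_kernel), "-> target", round(target, 3))
```

In a trained network the kernel weights are learned from the training windows, and the filter outputs are passed through further layers to produce the predicted case count.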

Advantages:

  • Data analysis and prediction are performed on textual data.
  • Deep learning models are generated for predicting future cases.
  • Data analysis is performed for various factors.

SOFTWARE REQUIREMENTS

  • Operating system : Windows XP/7/10
  • Coding language : Python
  • Development kit : Anaconda
  • IDE : Anaconda Prompt

File Security Using Elliptic Curve Cryptography (ECC) in Cloud

Abstract:

Data security in cloud computing is a widely researched topic with various solutions, such as encrypting data and using a multi-cloud environment. However, many data security issues remain. In this project, the ECC digital signature method is used to sign user data when it is uploaded to the cloud, and the same digital signature is used when the data is downloaded.

Elliptic Curve Cryptography (ECC) is a modern family of public-key cryptosystems; an elliptic curve algorithm can be used for public/private key cryptography. To use ECC well, cryptographic signatures, hash functions, and the other primitives that help secure messages or files need to be studied at a deeper level.

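ECC provides all the major capabilities of asymmetric cryptosystems: encryption, signatures, and key exchange. Its main advantage is that keys are much smaller than RSA keys of comparable strength; a 256-bit ECC key, for example, offers security roughly comparable to a 3072-bit RSA key. As with RSA, public keys still have to be shared with whoever needs to verify a signature, and the smaller key size makes them easier to store and exchange.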

In Python, the method described above can be implemented using the ECDSA algorithm.
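
To make the sign-on-upload, verify-on-download idea concrete, here is an educational sketch of ECDSA in plain Python. Several things are assumptions: the choice of the secp256k1 curve and SHA-256, all function names, and the sample file contents. The code is neither constant-time nor hardened, so it illustrates the arithmetic only; a real deployment should rely on a vetted cryptography library.

```python
import hashlib
import secrets

# secp256k1 domain parameters: curve y^2 = x^3 + 7 over GF(P), base point G of order N.
P = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEFFFFFC2F
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(p1, p2):
    """Add two curve points; None represents the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # a point plus its negation
    if p1 == p2:
        slope = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P  # tangent (a = 0)
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P  # chord
    x3 = (slope * slope - x1 - x2) % P
    return x3, (slope * (x1 - x3) - y1) % P


def scalar_mult(k, point):
    """Compute k * point with the double-and-add method."""
    result, addend = None, point
    while k:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result


def digest_int(data):
    """SHA-256 hash of the data, as an integer reduced modulo N."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def generate_keypair():
    private_key = secrets.randbelow(N - 1) + 1
    return private_key, scalar_mult(private_key, G)


def sign(private_key, data):
    z = digest_int(data)
    while True:
        k = secrets.randbelow(N - 1) + 1  # fresh random nonce per signature
        r = scalar_mult(k, G)[0] % N
        s = pow(k, -1, N) * (z + r * private_key) % N
        if r and s:
            return r, s


def verify(public_key, data, signature):
    r, s = signature
    if not (1 <= r < N and 1 <= s < N):
        return False
    w = pow(s, -1, N)
    point = point_add(scalar_mult(digest_int(data) * w % N, G),
                      scalar_mult(r * w % N, public_key))
    return point is not None and point[0] % N == r


# Hypothetical file contents: sign on upload, verify on download.
private_key, public_key = generate_keypair()
file_bytes = b"contents of the uploaded file"
signature = sign(private_key, file_bytes)
print("Unmodified file verifies:", verify(public_key, file_bytes, signature))
print("Tampered file verifies:", verify(public_key, b"tampered", signature))
```

Note that a signature proves who signed the file and that it has not changed; it does not hide the contents, so confidentiality would need encryption in addition to the signature.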

Objective:

  • Public-key cryptosystems, which use both a public and a private key, can secure data better than single-key encryption. In this project, the ECC algorithm is used to secure data that is uploaded to the cloud.

Existing system:

  • AES and DES are the most commonly used cryptographic algorithms for securing data. Most applications that use them rely on a single key for both encryption and decryption.

Disadvantages:

  • These are older methods that are used in most applications.
  • They use a single key for encryption and decryption.

Proposed system:

  • In a cloud environment, data security is very important because data is stored on third-party servers, so effective multi-key encryption techniques such as ECC are needed. In this project, the ECC algorithm is implemented in Python and the cloud is used to store the encrypted data.

Advantages:

  • The encryption process takes less time.
  • Multiple keys are used for the encryption and decryption process.

Software Requirement:

  • Operating system : Windows XP/7/10
  • Coding language : HTML, JavaScript
  • Development kit : Flask framework
  • Database : SQLite
  • IDE : Anaconda Prompt

Crop Yield Prediction using KNN classification

ABSTRACT:

Agriculture is considered an important field all over the world, and it faces many challenges in estimating crops based on growing conditions. This is a particular challenge for developing countries. Many companies use IoT-based services and mechanical technology to reduce manual work. These methods are useful mainly for reducing manual work, not for prediction. In this project, machine learning with the KNN classification algorithm is used to predict crop yield based on soil and temperature factors. The dataset is prepared with various soil conditions as features, and each label corresponds to a particular crop. In the prediction process, the user provides soil features as input and the result is the type of crop suitable for those conditions; the application also suggests the best crops with their yield per hectare.

PROBLEM STATEMENT:

  • In our country a large share of the population depends on agriculture. Although the government is taking financial steps to help farmers, they still face problems due to the lack of data analysis and prediction for crops.

OBJECTIVE:

  • Our objective is to develop a machine learning application that predicts which crop to grow based on soil conditions, using k-nearest neighbor classification.

Existing system:

Image-based analysis was one of the methods previously used to detect the land type, after which further analysis was done.

Disadvantages:

Because the process is based on image analysis, the results are not accurate, as soil conditions are not considered.

Image processing is a time-consuming process.

Proposed system:

Machine learning, for which the Python programming language offers a wide range of algorithms, is used to predict crop yield from the input dataset. The KNN classification algorithm is used for prediction. Training and testing are performed on the given text dataset, which includes soil and temperature conditions as features and the type of crop as labels.
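
A k-nearest neighbor classifier of the kind described above can be sketched in plain Python as follows. The feature columns (soil pH, soil moisture in percent, temperature in degrees Celsius), the crop names, the sample values, and the function names are all hypothetical and chosen only to show the mechanics; they are not agronomic data and do not come from the project's dataset.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Straight-line distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_features, train_labels, query, k=3):
    """Return the majority label among the k training rows closest to query."""
    neighbours = sorted(
        zip(train_features, train_labels),
        key=lambda pair: euclidean(pair[0], query),
    )[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical rows: (soil pH, soil moisture %, temperature in C) -> crop.
SAMPLES = [
    ((6.0, 80.0, 30.0), "rice"),
    ((5.8, 85.0, 28.0), "rice"),
    ((6.2, 78.0, 29.0), "rice"),
    ((6.8, 40.0, 18.0), "wheat"),
    ((7.0, 35.0, 16.0), "wheat"),
    ((6.9, 42.0, 20.0), "wheat"),
]
features = [row[0] for row in SAMPLES]
labels = [row[1] for row in SAMPLES]

# User input: conditions measured on a field.
print(knn_predict(features, labels, (6.1, 82.0, 29.0)))
print(knn_predict(features, labels, (7.0, 38.0, 17.0)))
```

Because KNN compares raw distances, features measured on very different scales (pH versus moisture percentage here) are usually normalized first so that no single column dominates the result.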

Advantages:

Crop yield prediction is performed on a textual dataset, and any user can check which type of crop best suits the given conditions and get crop suggestions.

System Requirement:

  • Operating system : Windows XP/7/10
  • Coding language : HTML, JavaScript
  • Development kit : Flask framework
  • Programming language : Python
  • IDE : Anaconda Prompt