Students Marks Prediction Using Linear Regression

Abstract:

Educational institutions use new technologies to improve the quality of education, but most of the applications used in colleges relate to services and development; for example, web applications help students take online training and tests. Very few methods help teachers understand students’ performance. To address this problem, machine learning techniques are used to predict students’ marks from their previous marks. A linear regression model predicts student performance and the marks for the next subject.

Problem statement:

Educational institutions use web applications to train students and check performance based on marks, but no specific steps are followed to predict students’ performance and take measures to improve it.

Objective:

Design a machine learning model that predicts students’ marks so that measures can be taken to improve student performance. The linear regression algorithm is used for both training and prediction.

Existing system:

Researchers have worked on automated grading techniques in which previous marks are used to assign grades to students.

Algorithms such as association rule mining and the Apriori algorithm are used to classify students by their marks.

Disadvantages:

Existing methods mostly work on marks already obtained from exams.

The algorithms classify students based on marks rather than predicting future marks.

Proposed system:

Marks from other subjects are taken as input. The dataset is processed into features and labels and then split into training and test sets. The machine learning model is fitted to the training data and then used for prediction.
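The fitting and prediction steps described above can be sketched in a few lines. This is a minimal, dependency-free illustration of ordinary least squares with a single feature, not the project's actual code. The marks, the helper names `fit_line` and `predict`, and the choice of one input subject are all assumptions made for the example; a real implementation would more likely rely on a library such as scikit-learn.

```python
from statistics import mean


def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def predict(model, x):
    """Apply a fitted (slope, intercept) pair to a new feature value."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical data: feature = marks in an earlier subject,
# label = marks in the next subject. These values are made up.
previous_marks = [35, 48, 52, 60, 67, 74, 81, 90]
next_marks = [40, 50, 55, 58, 70, 72, 85, 88]

model = fit_line(previous_marks, next_marks)
print(f"slope={model[0]:.3f}, intercept={model[1]:.3f}")
print(f"predicted next mark for a student who scored 65: {predict(model, 65):.1f}")
```

The closed-form solution is used here because a single feature needs no iterative optimization; with several input subjects the same idea generalizes to multiple linear regression.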

Advantages:

Prediction can be performed before the final marks of all subjects are evaluated.

Marks prediction can be automated using a machine learning process.

SOFTWARE REQUIREMENTS:

  • Operating system: Windows XP/7/10
  • Coding Language: Python
  • Development environment: Anaconda, Jupyter
  • Dataset: students’ marks dataset
  • IDE: Jupyter Notebook

Students Marks Prediction Using Linear Regression Project

Abstract:

Analyzing and predicting academic performance is important for any educational institution. Predicting student performance can help teachers develop strategies for improving performance at an early stage. With the advancement of supervised and unsupervised machine learning techniques, applications of this kind help teachers analyze students better than existing methods do. In this project, students’ academic performance is predicted by taking previous marks as input and predicting the marks for the next subject, and the accuracy of the model is calculated.

Problem statement:

Students’ marks were previously analyzed and predicted by guesswork, and students’ individual marks details were not considered in academic evaluation.

Objective:

Machine learning-based data mining techniques, specifically linear regression, are used to automate student performance prediction.

Existing system:

  • Researchers have worked on grading systems in which final examination marks are used to assign grades and evaluate each student.
  • Association rule mining and the Apriori algorithm are used to classify students based on their marks.

Disadvantages:

  • Most of these methods rely on data mining techniques that require complete data.
  • Early-stage evaluation is not possible in these methods.

Proposed system:

  • Students’ marks in other subjects are taken as input for evaluating students’ performance. The dataset is pre-processed, features and labels are extracted, and the data is split into training and test sets. Linear regression is then fitted to the training set and used for prediction.
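The split-then-evaluate workflow in the proposed system can be illustrated as follows. This is a sketch under stated assumptions, not the project's code: the marks are synthetic, the 80/20 split ratio is assumed, and the helpers `solve`, `fit`, `predict`, and `r2_score` are hypothetical names written for this example. It fits a multiple linear regression on two input subjects through the normal equations and reports the coefficient of determination (R²) on the held-out test set.

```python
import random


def solve(a, b):
    """Solve the linear system a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit(rows, targets):
    """Least-squares weights [intercept, w1, w2, ...] from the normal equations."""
    x = [[1.0] + list(r) for r in rows]
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(x, targets)) for i in range(k)]
    return solve(xtx, xty)


def predict(weights, row):
    """Predict a mark from the fitted weights and one row of feature marks."""
    return weights[0] + sum(w * v for w, v in zip(weights[1:], row))


def r2_score(actual, predicted):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_y = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_y) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Synthetic data (an assumption for illustration): marks in two earlier
# subjects as features, and a noisy linear combination as the target mark.
rng = random.Random(42)
rows = [(rng.randint(30, 95), rng.randint(30, 95)) for _ in range(50)]
targets = [0.5 * a + 0.4 * b + 5 + rng.uniform(-3, 3) for a, b in rows]

# Shuffle the indices and hold out 20% of the records for testing.
indices = list(range(len(rows)))
rng.shuffle(indices)
cut = int(0.8 * len(indices))
train, test = indices[:cut], indices[cut:]

weights = fit([rows[i] for i in train], [targets[i] for i in train])
predictions = [predict(weights, rows[i]) for i in test]
score = r2_score([targets[i] for i in test], predictions)
print(f"weights={weights}")
print(f"R^2 on held-out set={score:.3f}")
```

Scoring on records the model has not seen gives a more honest estimate of how well it would predict marks for new students than scoring on the training data.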

Advantages:

  • Prediction can be performed before the final marks of all subjects are evaluated.
  • Marks prediction can be automated using a machine learning process.

SOFTWARE REQUIREMENTS:

  • Operating system: Windows XP/7/10
  • Coding Language: Python
  • Development environment: Anaconda, Jupyter
  • Dataset: students’ marks dataset
  • IDE: Jupyter Notebook

File Security Using Elliptic Curve Cryptography (ECC) in Cloud

Abstract:

Data security in cloud computing is a widely researched topic with various solutions, such as encrypting data and using multi-cloud environments. Still, many data security issues remain. In this project, we use the ECC digital signature method to sign user data when it is uploaded to the cloud and use the same digital signature when the data is downloaded.

Elliptic Curve Cryptography (ECC) is a modern family of public-key cryptosystems; an elliptic curve algorithm can be used for public/private key cryptography. To use ECC well, cryptographic signatures, hash functions, and the other primitives that help secure messages or files need to be studied in more depth.

It implements all major capabilities of asymmetric cryptosystems: encryption, signatures, and key exchange. The main advantage is that keys are much smaller than RSA keys at a comparable security level. As with RSA, public keys still have to be distributed to the parties that verify signatures; the smaller ECC keys are simply cheaper to store and transmit.

In Python, the method described above can be implemented using the ECDSA algorithm.
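To make the sign-on-upload, verify-on-download idea concrete, the following is an educational sketch of ECDSA written with only the Python standard library. It is not the project's implementation and is not suitable for production use (it is not constant-time, among other things); a real system should use a vetted library. The choice of the secp256k1 curve, SHA-256 as the hash, and all function names here are assumptions made for illustration.

```python
import hashlib
import secrets

# secp256k1 domain parameters: curve y^2 = x^3 + 7 over the field of P elements.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (
    0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
    0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8,
)


def point_add(p1, p2):
    """Add two curve points; None represents the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # a point plus its inverse
    if p1 == p2:
        slope = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P  # tangent (a = 0)
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P  # chord
    x3 = (slope * slope - x1 - x2) % P
    y3 = (slope * (x1 - x3) - y1) % P
    return (x3, y3)


def scalar_mult(k, point):
    """Compute k * point with the double-and-add method."""
    result, addend = None, point
    while k:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result


def digest_to_int(data):
    """Hash the file bytes with SHA-256 and reduce the digest modulo the group order."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def generate_keypair():
    """Return (private_key, public_key); the public key is a curve point."""
    private_key = secrets.randbelow(N - 1) + 1
    return private_key, scalar_mult(private_key, G)


def sign(private_key, data):
    """Return an ECDSA signature (r, s) over data."""
    z = digest_to_int(data)
    while True:
        k = secrets.randbelow(N - 1) + 1  # fresh secret nonce per signature
        r = scalar_mult(k, G)[0] % N
        s = pow(k, -1, N) * (z + r * private_key) % N
        if r != 0 and s != 0:
            return (r, s)


def verify(public_key, data, signature):
    """Return True if signature is valid for data under public_key."""
    r, s = signature
    if not (1 <= r < N and 1 <= s < N):
        return False
    z = digest_to_int(data)
    w = pow(s, -1, N)
    point = point_add(scalar_mult(z * w % N, G), scalar_mult(r * w % N, public_key))
    return point is not None and point[0] % N == r


# Sign at upload time, verify at download time (file contents are made up).
private_key, public_key = generate_keypair()
uploaded = b"contents of the user's file"
signature = sign(private_key, uploaded)
print("untampered download verifies:", verify(public_key, uploaded, signature))
print("tampered download verifies:", verify(public_key, uploaded + b"!", signature))
```

Note that a signature protects integrity and authenticity but does not hide the file contents; confidentiality would need a separate encryption step.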

Objective:

  • Public-key cryptosystems use a public and a private key, so no single secret key has to be shared, unlike single-key encryption. In this project, the ECC algorithm is used to secure data as it is uploaded to the cloud.

Existing system:

  • AES and DES are the most widely used cryptographic algorithms for securing data. They are used in most applications and rely on a single key for both encryption and decryption.

Disadvantages:

  • These are older methods that are used in most applications.
  • They use a single key for encryption and decryption.

Proposed system:

  • In a cloud environment, data security is very important because data is stored on third-party servers, so effective multi-key encryption techniques such as ECC are needed. In this project, we use the ECC algorithm in Python and store the encrypted data in the cloud.

Advantages:

  • The encryption process takes less time.
  • Multiple keys are used for the encryption and decryption process.

Architecture:

Software Requirement: 

  • Operating system: Windows XP/7/10
  • Coding Language: HTML, JavaScript
  • Development Kit:  Flask Framework
  • Database: SQLite
  • IDE: Anaconda prompt

Crop Yield Prediction using KNN classification

ABSTRACT:

Agriculture is an important field all over the world, and estimating which crops suit given conditions poses many challenges, especially for developing countries. Many companies use the latest technologies, such as IoT-based services and mechanization, to reduce manual work. These methods help reduce manual work but do not help with prediction. In this project, the KNN classification algorithm is used to predict crop yield from soil and temperature factors. The dataset is prepared with various soil conditions as features, and each label corresponds to a particular crop. For prediction, the user supplies soil features as input, and the result is the type of crop suitable for those conditions; the application also suggests the best crops together with their yield per hectare.

PROBLEM STATEMENT:

  • In our country, a large part of the population depends on agriculture. Although the government is taking financial steps to help farmers, they still face problems due to a lack of data analysis and prediction for crops.

OBJECTIVE:

  • Our objective is to develop a machine learning application that predicts which crop to grow based on soil conditions, using k-nearest neighbor classification.

Existing system:

  • Image-based analysis was one of the methods previously used to detect the land type before further analysis was done.

Disadvantages:

  • Because the process is based on image analysis and does not consider soil conditions, the results are not accurate.
  • Image processing is time-consuming.

Proposed system:

  • Machine learning is used through the Python programming language, which makes a range of algorithms available for crop yield prediction from an input dataset. In this project, the KNN classification algorithm is used for prediction. Training and testing are performed on a text dataset that includes soil and temperature conditions as features and the type of crop as the label.
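As a rough illustration of how k-nearest neighbor classification maps soil and temperature features to a crop label, consider the sketch below. It is not the project's code: the feature choice (soil pH and temperature in °C), the sample values, the crop labels, and the function name `knn_predict` are all hypothetical and chosen only to make the voting step visible.

```python
from collections import Counter
from math import dist


def knn_predict(samples, labels, query, k=3):
    """Return the majority label among the k training samples nearest to query."""
    nearest = sorted(range(len(samples)), key=lambda i: dist(samples[i], query))[:k]
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Made-up training data: (soil pH, temperature in degrees Celsius) -> crop.
samples = [
    (5.5, 24), (5.8, 26), (6.0, 25),
    (6.5, 18), (6.8, 16), (7.0, 17),
    (7.5, 30), (7.8, 32), (8.0, 31),
]
labels = ["rice"] * 3 + ["wheat"] * 3 + ["cotton"] * 3

print(knn_predict(samples, labels, (6.7, 17)))  # the three nearest samples are all "wheat"
print(knn_predict(samples, labels, (7.6, 31)))  # the three nearest samples are all "cotton"
```

Because Euclidean distance is sensitive to units, features on different scales (here temperature spans a wider range than pH) are usually normalized before KNN is applied; that step is omitted here to keep the example short.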

Advantages:

  • Crop yield prediction is performed on a textual dataset, and any user can check which type of crop best suits the conditions and get crop suggestions.

 

System Requirement:

  • Operating system: Windows XP/7/10
  • Coding Language: HTML, JavaScript
  • Development Kit: Flask Framework
  • Programming language: Python
  • IDE: Anaconda prompt

Medical Data Analysis Python Project

Abstract:

The idea of this project is to visualize data by applying machine learning and pandas in Python. The dataset is drawn from the medical records of different people (the Pima Indians Diabetes dataset from the UCI repository). It contains information about each person, such as age, sex, and symptoms related to diabetes. Training and test sets are prepared to predict the chance of a patient developing diabetes within the coming five years. The data is classified and shown in the form of different graphs.

Project Objective:

To analyze an existing user dataset and predict the chance of diabetes within the coming five years. The information is shown in the form of different graphs.

Introduction:

Data analysis plays an important part in examining datasets and predicting what situations may arise in the coming years. Such analysis gives departments and organizations the option to take steps to deal with these problems. In this project, predicting diabetes in the coming years is the main problem considered.

Existing System:

Existing studies offered no means of prediction; analysis was done manually on existing data, and analyzing large datasets was not considered.

Proposed System:

Data analysis and machine learning libraries and algorithms are used to predict diabetes, and the information is shown in detail in the form of different types of graphs (histograms, density plots, box-and-whisker plots, and correlation matrix plots).
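The numbers behind a correlation matrix plot can be computed as in the sketch below, which uses only the standard library. It is an illustration under assumptions rather than the project's code: the column names and the values are invented for the example, and `pearson` and `correlation_matrix` are hypothetical helper names.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric columns."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Return a nested dict with the correlation of every pair of named columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Invented sample values; a real run would load the columns from the dataset.
columns = {
    "glucose": [85, 168, 122, 99, 150, 110, 137, 180],
    "bmi": [26.6, 33.1, 28.4, 25.0, 34.2, 27.9, 31.0, 36.5],
    "age": [31, 50, 33, 21, 45, 29, 40, 53],
    "outcome": [0, 1, 0, 0, 1, 0, 1, 1],
}

matrix = correlation_matrix(columns)
for name, row in matrix.items():
    print(name.ljust(8), "  ".join(f"{value:+.2f}" for value in row.values()))
```

When the data is already in a pandas DataFrame, `DataFrame.corr()` produces the same matrix, which a plotting library can then render as a heatmap.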

SOFTWARE & HARDWARE REQUIREMENT:

OS: Windows 7 or above
Processor: i3 or above
Programming language: Python 3.6
Distribution tool: Anaconda
RAM: 4 GB
Hard Disk: 160 GB

Spam Comments Detection Project in Python

Abstract

Spamming is the posting of unwanted, unrelated comments on posts in any social sharing or video-sharing medium. These messages are posted by bots to lower a page's ranking or to disrupt users’ viewing experience, which ultimately reduces the rank of the website and the post. Spamming is also done manually, which is mostly seen on highly competitive pages.

A few existing methods remove spam using data mining techniques. In this project, we automate spam comment detection with machine learning by taking a dataset of YouTube spam messages and applying CountVectorizer and the Naive Bayes algorithm for classification, using Python.

Proposed system:

In this project, CountVectorizer is used to extract features from the given dataset, and training and test sets are generated from the data to build the model. The Naive Bayes classifier is then trained on the training set, and a given message is tested to determine whether it is spam.
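The two steps named above, counting words and then applying Naive Bayes, can be sketched without external libraries as follows. This is an illustrative approximation and not the project's code: the comments and labels are invented, whitespace tokenization stands in for what a count vectorizer does, and the functions `tokenize`, `train`, and `classify` are hypothetical names. The classifier is a multinomial Naive Bayes with add-one (Laplace) smoothing.

```python
from collections import Counter
from math import log


def tokenize(text):
    """Lowercase and split on whitespace; a simple stand-in for a count vectorizer."""
    return text.lower().split()


def train(messages, labels):
    """Count word occurrences per class and the number of messages per class."""
    word_counts = {label: Counter() for label in set(labels)}
    class_counts = Counter(labels)
    for text, label in zip(messages, labels):
        word_counts[label].update(tokenize(text))
    vocabulary = {word for counts in word_counts.values() for word in counts}
    return word_counts, class_counts, vocabulary


def classify(model, text):
    """Return the class with the highest log-probability under multinomial Naive Bayes."""
    word_counts, class_counts, vocabulary = model
    total_messages = sum(class_counts.values())
    scores = {}
    for label, counts in word_counts.items():
        denominator = sum(counts.values()) + len(vocabulary)  # add-one smoothing
        score = log(class_counts[label] / total_messages)  # class prior
        for word in tokenize(text):
            if word in vocabulary:  # words never seen in training are ignored
                score += log((counts[word] + 1) / denominator)
        scores[label] = score
    return max(scores, key=scores.get)


# Invented example comments; a real run would load the labelled dataset.
messages = [
    "check out my channel",
    "subscribe to my channel for free gift",
    "free money click this link",
    "click here to win free phone",
    "great song love it",
    "this video made my day",
    "love the guitar solo",
    "best performance ever",
]
labels = ["spam"] * 4 + ["not spam"] * 4

model = train(messages, labels)
print(classify(model, "free gift click my link"))  # spam
print(classify(model, "love this song"))  # not spam
```

Working with log-probabilities avoids numeric underflow when many small word probabilities are multiplied together, and the smoothing keeps a single unseen word from forcing a class probability to zero.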

Existing system:

In the existing system, data mining techniques are used to detect spam messages. Most of these methods work only after messages have been posted. There is a need for a system that can automate detection before a message is posted.

SYSTEM REQUIREMENTS:

HARDWARE REQUIREMENTS:

  • Operating system: Windows 7 or 10
  • Tool: Anaconda (Jupyter)

SOFTWARE REQUIREMENTS:

  • Software: Python 3.5
  • Dependencies: NumPy, OpenCV
  • Libraries: pandas, Keras, SciPy, scikit-learn