Students Marks Prediction Using Linear Regression


Analyzing and predicting academic performance is important for any educational institution. Predicting student performance at an early stage helps teachers develop strategies for improving it. With advances in supervised and unsupervised machine learning, applications of this kind help teachers analyze students better than existing methods do. In this project, a student's marks in previous subjects are taken as input to a linear regression model, the marks for the next subject are predicted, and the accuracy of the model is calculated.

Problem statement:

Until now, students' marks were analyzed and predicted by guesswork, and each student's own marks history was not considered in academic evaluation.


Machine learning based data mining techniques are used to automate student performance prediction with linear regression.

Existing system:

  • Researchers have worked on grading systems in which final examination marks are used to assign grades and to evaluate each student.
  • Association rule mining and the Apriori algorithm are used to classify students based on their marks.


  • Most of these methods rely on data mining techniques that can be applied only once the complete data is available.
  • Early-stage evaluation is not possible with these methods.

Proposed system:

  • Students' marks in other subjects are taken as input for evaluating their performance. The dataset is pre-processed, features and labels are extracted, the dataset is split into training and test sets, and linear regression is then applied for prediction.


  • Prediction can be performed before the final marks of all subjects have been evaluated.
  • Machine learning makes it possible to automate marks prediction.
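
The pipeline described above (extract features and labels, split into training and test sets, fit a linear regression, measure accuracy) can be sketched in a few lines. This is only an illustration under stated assumptions: the data is synthetic, the single feature is the average of three previous marks, and the helper names `fit_line` and `r_squared` are hypothetical. A real project would more likely load the marks dataset and use a library such as scikit-learn.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(xs, ys, slope, intercept):
    """Coefficient of determination of the fitted line on (xs, ys)."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Synthetic stand-in for the students' marks dataset (an assumption):
# feature = average of three previous marks, label = next subject's mark.
rng = random.Random(42)  # fixed seed so the sketch is repeatable
rows = []
for _ in range(200):
    previous = [rng.randint(35, 95) for _ in range(3)]
    average = sum(previous) / len(previous)
    next_mark = 0.9 * average + 5 + rng.uniform(-3, 3)
    rows.append((average, next_mark))

# 80/20 split into training and test sets.
rng.shuffle(rows)
split = int(0.8 * len(rows))
train, test = rows[:split], rows[split:]

slope, intercept = fit_line([x for x, _ in train], [y for _, y in train])
score = r_squared([x for x, _ in test], [y for _, y in test], slope, intercept)
print(f"slope={slope:.2f} intercept={intercept:.2f} test R^2={score:.3f}")
```

The R^2 score on the held-out test set plays the role of the "accuracy" mentioned in the abstract; for a regression task it is a more natural measure than classification accuracy.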


  • Operating system: Windows XP/7/10
  • Coding language: Python
  • Development environment: Anaconda, Jupyter
  • Dataset: students' marks dataset
  • IDE: Jupyter Notebook

File Security Using Elliptic Curve Cryptography (ECC) in Cloud


Data security in cloud computing is a widely researched topic with various proposed solutions, such as encrypting data and using a multi-cloud environment. Still, many data security issues remain. In this project, the ECC digital signature method is used to sign user data when it is uploaded to the cloud, and the same digital signature is used when the data is downloaded.

Elliptic Curve Cryptography (ECC) is a modern family of public-key cryptosystems; an elliptic curve algorithm can be used for public/private key cryptography. To use ECC, cryptographic signatures, hash functions, and the other primitives that help secure messages or files need to be studied at a deeper level.

It implements all major capabilities of asymmetric cryptosystems: encryption, signatures, and key exchange. The main advantage is that keys are much smaller than RSA keys of comparable strength; for example, a 256-bit elliptic curve key is generally considered comparable to a 3072-bit RSA key. Such short public keys are easy to hand out directly.

In Python, the method described above can be implemented using the ECDSA algorithm.
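
To make the sign-on-upload, verify-on-download idea concrete, here is a minimal, educational sketch of ECDSA written with only the Python standard library. Several things are assumptions not stated in the project description: the curve (secp256k1), the hash (SHA-256), and all function names. It is not hardened against side channels and does not validate public keys, so a real application should rely on a vetted cryptography library instead.

```python
import hashlib
import secrets

# secp256k1 domain parameters: curve y^2 = x^3 + 7 over the prime field F_P.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(p1, p2):
    """Add two curve points; None represents the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # p2 is the inverse of p1
    if p1 == p2:
        slope = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (slope * slope - x1 - x2) % P
    y3 = (slope * (x1 - x3) - y1) % P
    return (x3, y3)


def scalar_mult(k, point):
    """Compute k * point with the double-and-add method."""
    result, addend = None, point
    while k:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result


def digest_int(data):
    """SHA-256 digest of the data as an integer modulo the group order."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def generate_keypair():
    private_key = secrets.randbelow(N - 1) + 1
    return private_key, scalar_mult(private_key, G)


def sign(private_key, data):
    z = digest_int(data)
    while True:
        k = secrets.randbelow(N - 1) + 1  # fresh random nonce per signature
        r = scalar_mult(k, G)[0] % N
        s = pow(k, -1, N) * (z + r * private_key) % N
        if r and s:
            return r, s


def verify(public_key, data, signature):
    r, s = signature
    if not (1 <= r < N and 1 <= s < N):
        return False
    z = digest_int(data)
    w = pow(s, -1, N)
    point = point_add(scalar_mult(z * w % N, G),
                      scalar_mult(r * w % N, public_key))
    return point is not None and point[0] % N == r


# Sign the file bytes at upload time, verify them again at download time.
private_key, public_key = generate_keypair()
file_bytes = b"example file contents"
signature = sign(private_key, file_bytes)
print(verify(public_key, file_bytes, signature))
print(verify(public_key, file_bytes + b"!", signature))
```

Note that a signature proves the file was not altered and who signed it; it does not hide the contents. Keeping the uploaded data confidential would need encryption in addition to signing.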


  • Public-key cryptosystems, which use both a public and a private key, can secure data better than single-key encryption. In this project, the ECC algorithm is used to secure data as it is uploaded to the cloud.

Existing system:

  • AES and DES are the most widely used cryptographic algorithms for securing data. Most applications that use these methods rely on a single key for both encryption and decryption.


  • These are older methods that are used in most applications.
  • They use a single key for both encryption and decryption.

Proposed system:

  • In a cloud environment, data security is very important because data is stored on third-party servers, so effective multi-key encryption techniques such as ECC algorithms are needed. In this project, the ECC algorithm is implemented in Python and the cloud is used to store the encrypted data.


  • The encryption process takes less time.
  • Multiple keys are used for encryption and decryption.


Software Requirement: 

  • Operating system: Windows XP/7/10
  • Coding language: HTML, JavaScript
  • Development kit: Flask framework
  • Database: SQLite
  • IDE: Anaconda Prompt

Crop Yield Prediction using KNN classification


Agriculture is an important field all over the world, and estimating which crops suit given conditions involves many challenges. This has become a particular challenge for developing countries. Many companies now use IoT-based services and mechanical technology to reduce manual work. These methods are useful for reducing manual labor but not for prediction. In this project, machine learning with the KNN classification algorithm is used to predict crop yield based on soil and temperature factors. The dataset is prepared with various soil conditions as features, and each label corresponds to a certain crop. In the prediction process, the user provides soil features as input and the result is the type of crop suitable for those conditions; the application also suggests the best crops along with their yield per hectare.


  • In our country, a large part of the population depends on agriculture. Although the government is taking financial steps to help farmers, they still face problems due to the lack of data analysis and prediction for crops.


  • Our objective is to develop a machine learning application that predicts which crop to grow based on soil conditions, using k-nearest neighbor classification.

Existing system:

  • Image-based analysis was one of the methods previously used to detect the land type, after which the analysis was carried out.


  • Because the process is based on image analysis, the results are not accurate, as soil conditions are not considered.

  • Image processing is time-consuming.

Proposed system:

  • Python offers a range of machine learning algorithms that can be used for crop yield prediction based on an input dataset. In this project, the KNN classification algorithm is used for prediction. Training and testing are performed on a text dataset that includes soil and temperature conditions as features and the type of crop as labels.


  • Crop yield prediction is performed on a textual dataset, and any user can check which type of crop best suits the given conditions and get crop suggestions.
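
The KNN step can be sketched as follows. Everything concrete here is an assumption made for illustration: the three features (soil pH, soil moisture, temperature), the crop names, and the sample values are invented and carry no agronomic meaning, and `knn_predict` is a hypothetical helper rather than the project's actual code.

```python
import math
from collections import Counter


def knn_predict(train_rows, query, k=3):
    """Return the majority label among the k training rows closest to query."""
    nearest = sorted(train_rows, key=lambda row: math.dist(row[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Made-up samples: (soil pH, soil moisture %, temperature in C) -> crop label.
samples = [
    ((6.0, 80.0, 27.0), "rice"),
    ((6.2, 75.0, 29.0), "rice"),
    ((5.8, 85.0, 26.0), "rice"),
    ((6.8, 40.0, 18.0), "wheat"),
    ((7.0, 35.0, 16.0), "wheat"),
    ((6.6, 45.0, 20.0), "wheat"),
    ((7.5, 20.0, 32.0), "millet"),
    ((7.8, 15.0, 34.0), "millet"),
    ((7.3, 25.0, 31.0), "millet"),
]

# The user supplies soil and temperature conditions; the result is a crop label.
print(knn_predict(samples, (6.1, 78.0, 28.0)))
```

Because KNN relies on distances, features measured on very different scales (here pH versus moisture percentage) are usually normalized first so that one feature does not dominate the vote.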


System Requirement:

  • Operating system: Windows XP/7/10
  • Coding language: HTML, JavaScript
  • Development kit: Flask framework
  • Programming language: Python
  • IDE: Anaconda Prompt

Medical Data Analysis Python Project


The idea is to visualize data by applying machine learning and pandas in Python. The dataset covers the medical background of different people (the Pima Indians Diabetes dataset from the UCI repository). It contains information about each person, such as age, sex, and the types of symptoms related to diabetes. Training and test sets are designed to predict the chances of patients developing diabetes in the coming five years. The data is classified and shown in the form of different graphs.

Project Objective:

To analyze data from an existing user dataset and predict the chances of diabetes in the coming five years. The information is shown in the form of different graphs.


Data analysis plays an important part in examining datasets and predicting what situations may arise in the coming years. This analysis gives departments and organizations the option to take steps to deal with these problems. In this project, predicting diabetes in the coming years is considered the main problem.

Existing System:

Existing studies offered no means of prediction; they relied on manual analysis of existing data, and the analysis of large datasets was not considered.

Proposed System:

Data analysis and machine learning libraries and algorithms are used to predict diabetes, and the information is shown in detail in different types of graphs (histograms, density plots, box-and-whisker plots, and correlation matrix plots).
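
As a rough illustration of the numbers behind two of these graphs, the sketch below computes histogram bins and a Pearson correlation matrix using only the standard library. The column names follow a commonly distributed version of the dataset, but the eight rows are invented placeholders, and `text_histogram`, `pearson`, and `correlation_matrix` are hypothetical helpers. In the project itself, pandas and a plotting library would normally produce the actual charts.

```python
import math
from collections import Counter


def text_histogram(values, bin_width):
    """Count values per bin; keys are the lower edge of each bin."""
    bins = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(bins.items()))


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length columns."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Nested dict of pairwise correlations, as plotted in a correlation matrix."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Placeholder rows for illustration only, not real patient data.
columns = {
    "Glucose": [85, 168, 110, 140, 95, 180, 120, 155],
    "BMI": [24.5, 33.1, 27.0, 31.2, 23.8, 35.6, 28.4, 30.9],
    "Age": [25, 52, 31, 45, 23, 58, 36, 49],
    "Outcome": [0, 1, 0, 1, 0, 1, 0, 1],
}

# Histogram of ages in 10-year bins, drawn as text bars.
for lower_edge, count in text_histogram(columns["Age"], 10).items():
    print(f"{lower_edge:>3}-{lower_edge + 9}: {'#' * count}")

# Correlation of every column with every other column.
matrix = correlation_matrix(columns)
for name, row in matrix.items():
    print(name.ljust(8), " ".join(f"{value:6.2f}" for value in row.values()))
```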


OS: Windows 7 or above
Processor: i3 or above
Programming language: Python 3.6
Distribution tool: Anaconda
Hard disk: 160 GB

Spam Comments Detection Project in Python


Spamming is the posting of unwanted, unrelated comments on posts in any social sharing or video-sharing medium. These messages are posted by bots to lower rankings or to disrupt the users' viewing experience, which ultimately reduces the rank of the website and the post. Spamming is also done manually, mostly on highly competitive pages.

A few existing methods remove spam using data mining techniques. In this project, we automate spam comment detection with machine learning: we take a dataset of YouTube spam messages and apply CountVectorizer and the Naive Bayes algorithm to classify the comments, using Python.

Proposed system:

In this project, CountVectorizer is used to extract features from the given dataset, and the model is designed by generating training and test sets from the data. The Naive Bayes classifier is then trained on the training set and evaluated on the test set; based on this model, a given message is tested to determine whether it is spam or not.
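
The two steps (word-count features, then a Naive Bayes classifier) can be sketched from scratch as below. This is a standard-library counterpart to the idea, not the scikit-learn API: `tokenize`, `train`, and `classify` are hypothetical helpers, the eight training comments are invented, and Laplace (add-one) smoothing is an assumption about how unseen words are handled.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (bag of words)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def train(messages):
    """Count words per class from a list of (text, label) pairs."""
    word_counts = {}
    class_counts = Counter()
    for text, label in messages:
        class_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokenize(text))
    vocabulary = {word for counts in word_counts.values() for word in counts}
    return word_counts, class_counts, vocabulary


def classify(text, model):
    """Pick the class with the highest log-probability (multinomial Naive Bayes)."""
    word_counts, class_counts, vocabulary = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        score = math.log(class_counts[label] / total)  # class prior
        denominator = sum(word_counts[label].values()) + len(vocabulary)
        for word in tokenize(text):
            if word in vocabulary:  # words never seen in training are skipped
                score += math.log((word_counts[label][word] + 1) / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented example comments standing in for the YouTube spam dataset.
comments = [
    ("Check out my channel and subscribe", "spam"),
    ("Free gift cards click this link", "spam"),
    ("Subscribe to my channel for free giveaway", "spam"),
    ("Click here to win free money", "spam"),
    ("Great song, I love the guitar solo", "ham"),
    ("This video made my day", "ham"),
    ("The lyrics are beautiful", "ham"),
    ("I love this song so much", "ham"),
]

model = train(comments)
print(classify("subscribe to my channel for free stuff", model))
print(classify("I love the lyrics of this song", model))
```

Working in log-probabilities avoids numeric underflow when many word probabilities are multiplied together, and the add-one smoothing keeps a single unseen word from forcing a class probability to zero.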

Existing system:

In the existing system, data mining techniques are used to detect spam messages. Most of these methods work only after messages have been posted. There is a need for a system that can automate this process before a message is posted.



  • Operating system: Windows 7 or 10
  • Tool: Anaconda (Jupyter)


  • Software: Python 3.5
  • Dependencies: NumPy, OpenCV
  • Libraries: pandas, Keras, SciPy, scikit-learn