Abstract:
This project visualizes and analyzes medical data using machine learning and pandas in Python. The data comes from the Pima Indians Diabetes dataset from the UCI repository, which records medical information about each patient, such as age, sex, and symptoms related to diabetes. The data is divided into training and testing sets, and a model predicts each patient's chance of having diabetes within the coming five years. The data is classified and the results are shown in the form of different graphs.
Project Objective:
To analyze an existing patient data set and predict the chances of diabetes in the coming five years. The information is shown in the form of different graphs.
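The training, testing, and prediction steps of this objective could be sketched as follows. This is only a minimal illustration under stated assumptions: the data is synthetic placeholder data rather than the real data set, the column names (age, glucose, bmi, outcome) are hypothetical, and logistic regression from scikit-learn stands in for whichever algorithm the project finally uses.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic placeholder data, NOT the real data set.
# Column names and value ranges are assumptions for illustration only.
rng = np.random.default_rng(seed=7)
n_rows = 200
data = pd.DataFrame({
    "age": rng.integers(21, 80, size=n_rows),
    "glucose": rng.normal(120, 30, size=n_rows),
    "bmi": rng.normal(32, 6, size=n_rows),
})
# Placeholder label: 1 = diabetic, 0 = not diabetic (made-up rule plus noise).
noisy_glucose = data["glucose"] + rng.normal(0, 20, size=n_rows)
data["outcome"] = (noisy_glucose > 130).astype(int)

features = data[["age", "glucose", "bmi"]]
labels = data["outcome"]

# Hold out 25% of the rows as the testing set.
X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.25, random_state=7, stratify=labels
)

# Fit a simple classifier on the training set only.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Estimated probability of the positive class for each held-out patient.
probabilities = model.predict_proba(X_test)[:, 1]
accuracy = model.score(X_test, y_test)
print(f"Held-out accuracy: {accuracy:.2f}")
```

With the real data set, the synthetic DataFrame would be replaced by the loaded data, and the feature and label column names adjusted to match it.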
Introduction:
Data analysis plays an important part in examining datasets and predicting what situations may arise in the coming years. This analysis gives departments and organizations the option to take steps to deal with these problems. In this project, the prediction of diabetes in the coming years is the main problem considered.
Existing System:
Existing studies offered no means of prediction. They relied on manual analysis of existing data, and the analysis of large datasets was not considered.
Proposed System:
Data analysis and machine learning libraries and algorithms are used to predict diabetes, and the information is shown in detail in the form of different types of graphs (histograms, density plots, box-and-whisker plots, and correlation matrix plots).
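The four graph types named above could be produced with pandas and matplotlib roughly as follows. This is a sketch under assumptions: the data is synthetic placeholder data, the column names are hypothetical, and the density plot relies on SciPy being installed (it ships with Anaconda).

```python
import matplotlib
matplotlib.use("Agg")  # draw off-screen so no display is required
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic placeholder data, NOT the real data set.
# Column names and value ranges are assumptions for illustration only.
rng = np.random.default_rng(seed=7)
n_rows = 200
data = pd.DataFrame({
    "age": rng.integers(21, 80, size=n_rows),
    "glucose": rng.normal(120, 30, size=n_rows),
    "bmi": rng.normal(32, 6, size=n_rows),
})

# Histograms: one panel per column.
data.hist(figsize=(8, 6))

# Density plots: a smoothed view of each column's distribution.
data.plot(kind="density", subplots=True, layout=(2, 2),
          sharex=False, figsize=(8, 6))

# Box-and-whisker plots: median, quartiles, and outliers per column.
data.plot(kind="box", subplots=True, layout=(2, 2),
          sharex=False, sharey=False, figsize=(8, 6))

# Correlation matrix plot: pairwise Pearson correlations as a colored grid.
corr = data.corr()
fig, ax = plt.subplots(figsize=(6, 5))
image = ax.matshow(corr.values, vmin=-1, vmax=1)
fig.colorbar(image)
ax.set_xticks(range(len(corr.columns)))
ax.set_yticks(range(len(corr.columns)))
ax.set_xticklabels(corr.columns, rotation=45)
ax.set_yticklabels(corr.columns)

# Number of figures created so far (one per graph type).
figure_count = len(plt.get_fignums())

# In an interactive session, call plt.show() here; fig.savefig(...) writes a file.
plt.close("all")
```

The `Agg` backend line is only needed when running without a display; in a Jupyter notebook it can be dropped so that the figures render inline.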
Software & Hardware Requirements:
OS: Windows 7 or above
Processor: Intel Core i3 or above
Programming language: Python 3.6
Distribution tool: Anaconda
RAM: 4 GB
Hard Disk: 160 GB