Decision Model for Prediction of Movie Success Rate Data Mining J Component Project

ABSTRACT

The purpose of this Movie Success Rate Prediction project is to predict the success of any upcoming movie using Data Mining Tools. For this purpose, we have proposed a method that will analyze the cast and crew of the movie to find the success rate of the film using existing knowledge. Many factors like the cast (actors, actresses, directors, producers), budget, worldwide gross, and language will be considered for the algorithm to train and test the data. Two algorithms will be tested on our dataset and their accuracy will be checked.

LITERATURE REVIEW

  • They developed a model to predict the success of upcoming movies based on certain factors. The size of the audience plays a vital role in a movie becoming successful.
  • The Factorization Machines approach was used to predict movie success by predicting IMDb ratings for newly released movies, combining movie metadata with social media data.
  • The gross attribute is used as a training element for the model. The data are converted into .csv files after pre-processing is done.
  • Using S-PLSA on the sentiment information from online reviews and tweets, the ARSA model was used to predict the sales performance of movies from sentiment information and past box office performance.
  • A mathematical model is used to predict the success or failure of upcoming movies depending on certain criteria. Their work makes use of historical data to predict the ratings of movies yet to be released.
  • According to them, Twitter is a platform that can provide geographical as well as timely information, making it a perfect source for spatiotemporal models.
  • Their data was gathered from Box Office Mojo and Wikipedia and comprised movies released in 2016.
  • Starting from a dataset of 3183 movies, they removed movies whose budget could not be found or which were missing key features. After key feature extraction was completed, a dataset of 755 movies remained.
  • They performed some useful data mining on the IMDb data and uncovered information that cannot be seen by browsing the regular web frontend to the database.
  • According to their conclusion, brand power, actors, and directors are not strong enough to affect the box office.
  • Their neural network obtained an accuracy of 36.9%, rising to 75.2% when predictions that miss by one category are counted as correct.
  • They divided the movies into three classes (rise, stay, and fall) and found that the support vector machine SMO can give up to 60% correct predictions.
  • The data was taken from the Internet Movie Database (IMDb) and covered the years 1945 to 2017.
  • A more accurate classifier is also well within the realm of possibility, and could even lead to an intelligent system capable of making suggestions for a movie in preproduction, such as a change to a particular director or actor, which would be likely to increase the rating of the resulting film.
  • In this study, we proposed a movie investor assurance system (MIAS) to aid movie investment decisions at the early stage of movie production. MIAS learns from freely available historical data derived from various sources and tries to predict movie success based on profitability.
  • The data they gathered from movie databases was cleaned, integrated, and transformed before the data mining techniques were applied.
  • They used feature extraction techniques and polarity scores to create a list of successful or unsuccessful movies. The data was gathered from IMDb and YouTube.

PROBLEM STATEMENT

In this Movie Success Rate Prediction project, using the ratings of the films that the cast and crew have worked on is an innovative and original way to solve a dilemma that film producers face. Film producers often have trouble casting successful actors and directors while still keeping to a budget. Looking at the average ratings of each actor and director across all the films they participated in should give the producer a good idea of who to cast and who not to cast in an upcoming film.
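One way to read the averaging idea above, as a minimal sketch: each film is a record with a rating and a list of people, and a person's score is the mean rating of the films they took part in. The film titles, names, ratings, and the helper names `average_rating_per_person` and `cast_score` are made-up placeholders for illustration, not values or code from the project.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (title, rating, people involved). Illustrative only.
films = [
    ("Film A", 8.0, ["Actor X", "Director P"]),
    ("Film B", 6.0, ["Actor X", "Actor Y", "Director Q"]),
    ("Film C", 7.0, ["Actor Y", "Director P"]),
]

def average_rating_per_person(records):
    """Map each person to the mean rating of the films they took part in."""
    ratings = defaultdict(list)
    for _title, rating, people in records:
        for person in people:
            ratings[person].append(rating)
    return {person: mean(values) for person, values in ratings.items()}

def cast_score(people, averages):
    """Score a proposed cast and crew as the mean of the known averages."""
    known = [averages[p] for p in people if p in averages]
    return mean(known) if known else None

averages = average_rating_per_person(films)
print(averages["Actor X"])
print(cast_score(["Actor Y", "Director P"], averages))
```

People with no film history are simply skipped here; a real system would need an explicit policy for newcomers.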

Implementation:

  • Data Preprocessing & Correlation Analysis
  • Application of Decision Tree Algorithm
  • Application of Random Forest Algorithm
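The correlation analysis step can be sketched as follows. The source does not say which coefficient was used, so the Pearson coefficient here is an assumption, and the two columns are toy values rather than rows from the IMDb dataset.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / sqrt(var_x * var_y)

# Toy columns (in millions); placeholders, not project data.
budget = [10, 20, 30, 40, 50]
gross = [25, 35, 70, 80, 120]
print(round(pearson(budget, gross), 3))
```

Attributes whose coefficient against the target is close to zero are candidates for removal before training.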

RESULTS & CONCLUSION

After testing both algorithms, Decision Tree and Random Forest, on the IMDb dataset, we found that the Random Forest algorithm achieved better accuracy (99.6%) than the Decision Tree algorithm, which obtained just 60% accuracy.
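The comparison amounts to fitting both models and measuring accuracy on held-out rows. The sketch below is a deliberately simplified stand-in: a depth-1 tree (a "stump") and a bagged ensemble of stumps with majority voting, written in plain Python. The source does not say which implementation was used, and the toy rows (budget, average cast rating) are invented and trivially separable, so the accuracies here say nothing about the project's results.

```python
import random
from collections import Counter

def majority(labels):
    """Most common label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]

def fit_stump(rows, labels, feature_ids):
    """Fit a depth-1 tree: pick the (feature, threshold) with fewest errors."""
    best = None
    for f in feature_ids:
        for threshold in sorted({row[f] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[f] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[f] > threshold]
            if not left or not right:
                continue
            l_lab, r_lab = majority(left), majority(right)
            errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, threshold, l_lab, r_lab)
    if best is None:  # no usable split: always predict the majority label
        label = majority(labels)
        return (0, float("inf"), label, label)
    return best[1:]

def predict_stump(stump, row):
    f, threshold, l_lab, r_lab = stump
    return l_lab if row[f] <= threshold else r_lab

def fit_forest(rows, labels, n_trees=25, seed=0):
    """Bagging: each stump sees a bootstrap sample and one random feature."""
    rng = random.Random(seed)
    n_features = len(rows[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        sample = [rows[i] for i in idx]
        sample_labels = [labels[i] for i in idx]
        forest.append(fit_stump(sample, sample_labels, [rng.randrange(n_features)]))
    return forest

def predict_forest(forest, row):
    """Majority vote over all stumps in the ensemble."""
    return majority([predict_stump(s, row) for s in forest])

def accuracy(predict, rows, labels):
    """Fraction of rows whose predicted label matches the true label."""
    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(rows)

# Toy training rows: (budget, average cast rating). Invented values.
rows = [(10, 4.0), (20, 5.0), (30, 5.5), (70, 7.0), (80, 7.5), (90, 8.0)]
labels = ["flop", "flop", "flop", "hit", "hit", "hit"]
test_rows, test_labels = [(5, 3.5), (85, 7.8)], ["flop", "hit"]

tree = fit_stump(rows, labels, feature_ids=[0, 1])
forest = fit_forest(rows, labels)
print(accuracy(lambda r: predict_stump(tree, r), test_rows, test_labels))
print(accuracy(lambda r: predict_forest(forest, r), test_rows, test_labels))
```

On real data the two accuracies would differ, and the test rows must be kept out of training for the comparison to mean anything.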

IOT Solution for Vehicle Maintenance and Report Generation System

INTRODUCTION

  • Many automotive manufacturers are now moving towards an IoT platform for manufacturing and for service purposes.
  • The main advantages of using IoT in cars are optimized maintenance and logistics.
  • Our idea is to monitor vehicle status (fuel, efficiency/km, battery, oil levels, etc.) and report it to the customer as well as the manufacturer.

CONCEPT

  • The main aim of every car manufacturer is to increase the life of the car and it’s crucial to maintain the car in a good condition to achieve it.
  • Many problems in vehicles arise due to improper maintenance. Many lose track of their service status and it’s a tiring process to keep in touch with every customer in a large automotive industry.
  • If we maintain a system that automatically sends the vehicle’s condition to a specified server at regular intervals and generates a report that is forwarded to the customer and the service team, a lot of manual work will be removed.
  • We as a team provide an IoT solution for vehicle maintenance and report generation system.

FLOW DIAGRAM-FUNCTIONAL DECOMPOSITION

  • Our Vehicle Maintenance and Report Generation system collects data from the sensors available in the car itself and reports it to a transceiver module (ESP8266), which is connected to a database in the cloud.
  • When new data is updated or inserted into the table, an event is triggered. This event updates the information in the dashboard, which will be displayed to the customer and manufacturer.
  • Then a weekly/monthly/yearly report generation event is triggered, which will mail the report to the specified recipient.
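The periodic report step could look roughly like the sketch below, which summarizes one period of readings and wraps the summary in an email message. The field names, recipient address, and the function `build_report_email` are hypothetical, and actual sending (for example via `smtplib`) is left out so the example runs offline.

```python
from email.message import EmailMessage
from statistics import mean

def build_report_email(vehicle_id, readings, recipient, period="weekly"):
    """Summarize readings for one period and wrap them in an email message."""
    lines = [f"{period.capitalize()} report for vehicle {vehicle_id}"]
    for name in sorted(readings):
        values = readings[name]
        lines.append(
            f"{name}: min={min(values)}, max={max(values)}, avg={mean(values):.2f}"
        )
    message = EmailMessage()
    message["To"] = recipient
    message["Subject"] = f"{period.capitalize()} vehicle report: {vehicle_id}"
    message.set_content("\n".join(lines))
    return message

# Hypothetical readings collected over one week.
readings = {"battery_v": [12.4, 12.2, 12.6], "fuel_l": [40.0, 32.5, 25.0]}
message = build_report_email("CAR-001", readings, "owner@example.com")
print(message["Subject"])
print(message.get_content())
```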

FUNCTIONAL DECOMPOSITION

Data collection:

The data is collected from the sensor stream of the car and redirected to the ESP8266 module. The ESP8266 is connected to the server that is allotted to the car. When all the data is collected, the ESP8266 converts it into JSON and sends a POST request to the server.
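The data collection step can be illustrated with a small sketch that packages readings as JSON and prepares the POST request. On the device this logic would run as Arduino C++ on the ESP8266; Python is used here only to show the shape of the payload. The endpoint URL, field names, and `build_status_request` are assumptions, since the source does not specify them.

```python
import json
import urllib.request

# Hypothetical endpoint; the source does not name one.
SERVER_URL = "http://example.com/api/vehicle-status"

def build_status_request(vehicle_id, readings, url=SERVER_URL):
    """Package sensor readings as JSON and wrap them in a POST request."""
    payload = {"vehicle_id": vehicle_id, "readings": readings}
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

request = build_status_request(
    "CAR-001", {"fuel_l": 32.5, "battery_v": 12.4, "oil_level_pct": 80}
)
# Sending is left out so the sketch runs offline:
# with urllib.request.urlopen(request, timeout=10) as response:
#     print(response.status)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```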

Event trigger:

Many database servers support triggers. Here, an insert and update trigger is created for the table on an Oracle server, which provides a wide range of PL/SQL functions. The ESP8266 connects to the Oracle server by IP, and each periodic update it makes to the table fires the trigger.
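The trigger idea can be tried out locally with a stand-in. The sketch below uses SQLite through Python's `sqlite3` module instead of Oracle PL/SQL, so the trigger syntax differs from what the project would use; the table and column names are hypothetical. Each insert or update on the status table writes a row to an events table, which a dashboard could then read.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE vehicle_status (
    vehicle_id TEXT PRIMARY KEY,
    fuel_l REAL,
    battery_v REAL
);
CREATE TABLE status_events (
    event_id INTEGER PRIMARY KEY AUTOINCREMENT,
    vehicle_id TEXT,
    kind TEXT
);
-- Fires once for every inserted row.
CREATE TRIGGER status_inserted AFTER INSERT ON vehicle_status
BEGIN
    INSERT INTO status_events (vehicle_id, kind) VALUES (NEW.vehicle_id, 'insert');
END;
-- Fires once for every updated row.
CREATE TRIGGER status_updated AFTER UPDATE ON vehicle_status
BEGIN
    INSERT INTO status_events (vehicle_id, kind) VALUES (NEW.vehicle_id, 'update');
END;
""")

conn.execute("INSERT INTO vehicle_status VALUES ('CAR-001', 32.5, 12.4)")
conn.execute("UPDATE vehicle_status SET fuel_l = 30.1 WHERE vehicle_id = 'CAR-001'")
events = conn.execute(
    "SELECT vehicle_id, kind FROM status_events ORDER BY event_id"
).fetchall()
print(events)
```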

Dashboard:

The dashboard is created using HTML and CSS and deployed in the cloud using Node.js.

FUNCTIONAL SPECIFICATION

Hardware:

ESP8266 CP2101 module (CAR)
ESP8266 CP2101 module (HOME)

Programming Language:

SQL
JavaScript (Node.js)
C++ (Arduino .ino)
HTML, CSS

Dashboard

The Vehicle Maintenance and Report Generation System dashboard is developed using Adafruit.io. This website provides dashboard development for MQTT-based devices.

University Leave and Outing Pass Automated System Application

Purpose of the Project:

This project is an automated leave/outing pass system designed for universities. The system is an end-to-end module that enables a user (Student) to raise a request and an admin (Mentor) to approve or decline it. It is a robust system in which Parent Verification, In-Out Time recording, and Data Security have been taken care of. The project is built to be a secure, flexible, unique, transparent, and user-friendly environment that aims to digitize the whole process, thus removing fake paper trails.

Feasibility Study:

The project has been undertaken after a feasibility study, which paves the way for phased development and deployment.

Scope of the Project:

The Automated System is designed to run on the University server and to allow students to raise leave requests, track the request status, and modify requests. On the Mentor Dashboard, the software allows the Mentor/Mentor Coordinator to view requests and approve or decline them. On the Hostel Dashboard, the Warden/Deputy Warden and Hostel Supervisors will be able to view requests and grant leave passes to students.

This Automated System will make things easier for all the actors (students, mentors, hostel authorities, and security services) with regard to leave/outpass sanctions and will ultimately eliminate the paperwork.

Overall Product Description:

Product Perspective:

It will provide a way in which existing paper-based work can be supplemented with a robust end-to-end leave management system. The system can be used independently of the platform and device, be it a smartphone, tablet, or computer.

Product Functionality:

The server will be responsible for storing each request generated, generating one-time passwords, generating QR-code e-passes for authorized requests, receiving and authenticating requests, generating statistics as needed for each audit, and maintaining and verifying security and user privacy. The server can also potentially contact all authorized students by email to give them their username, password, server address, and OTP code, as well as updates from the Mentors/Hostel Services.
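As an illustration of the one-time password responsibility, here is a minimal sketch using Python's `secrets` module. The six-digit length and the function names `generate_otp` and `verify_otp` are assumptions; the source does not describe the OTP format, expiry, or delivery.

```python
import secrets

def generate_otp(digits=6):
    """Return a random numeric one-time password with the given length."""
    if digits < 1:
        raise ValueError("digits must be positive")
    return "".join(secrets.choice("0123456789") for _ in range(digits))

def verify_otp(expected, supplied):
    """Constant-time comparison to avoid leaking matching prefixes."""
    return secrets.compare_digest(expected, supplied)

otp = generate_otp()
print(len(otp), verify_otp(otp, otp))
```

A production system would also store an expiry time with each code and invalidate it after one successful use.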

Process Flow:

  • The student flow depicts how the student raises a request and the activity that follows the mentor’s review.
  • The approval flow depicts how the Mentor/Warden/Supervisor approves or declines requests.
  • The security flow depicts how the Security guard can verify the leave request that the student displays.
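The three flows can be read as one request moving through a small state machine. The sketch below is one interpretation only: the status names, the action names, and the exact ordering between supervisor and warden are assumptions that the source does not spell out.

```python
from enum import Enum

class Status(Enum):
    RAISED = "raised"
    MENTOR_APPROVED = "mentor_approved"
    DECLINED = "declined"
    PASS_ISSUED = "pass_issued"
    UNDER_RECONSIDERATION = "under_reconsideration"
    VERIFIED_AT_GATE = "verified_at_gate"

# (current status, actor, action) -> next status; hypothetical names.
TRANSITIONS = {
    (Status.RAISED, "mentor", "approve"): Status.MENTOR_APPROVED,
    (Status.RAISED, "mentor", "decline"): Status.DECLINED,
    (Status.MENTOR_APPROVED, "supervisor", "issue"): Status.PASS_ISSUED,
    (Status.MENTOR_APPROVED, "supervisor", "deny"): Status.DECLINED,
    (Status.MENTOR_APPROVED, "supervisor", "escalate"): Status.UNDER_RECONSIDERATION,
    (Status.UNDER_RECONSIDERATION, "warden", "approve"): Status.PASS_ISSUED,
    (Status.UNDER_RECONSIDERATION, "warden", "decline"): Status.DECLINED,
    (Status.PASS_ISSUED, "security", "verify"): Status.VERIFIED_AT_GATE,
}

def advance(status, actor, action):
    """Return the next status, or raise if the actor may not take the action."""
    try:
        return TRANSITIONS[(status, actor, action)]
    except KeyError:
        raise ValueError(
            f"{actor} cannot {action} a request in state {status.value}"
        ) from None

status = Status.RAISED
for actor, action in [("mentor", "approve"), ("supervisor", "issue"), ("security", "verify")]:
    status = advance(status, actor, action)
print(status)
```

Keeping the allowed moves in one table makes it easy to reject out-of-order actions, such as a pass being verified before it is issued.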

User Classes and Characteristics:

It is anticipated that three types of users will use the Licensed Software defined in this SRS.

  1. IT staff/ Software Development Cells are expected to deploy and configure the System using the defined system interfaces. This will include running the whole system and maintaining software after the handover and deployment of the project.
  2. The second type of user, comprising Hostel Wardens, Deputy Wardens, Hostel Supervisors, and Managers, is expected to understand and correctly use the software interfaces defined in the appropriate design documentation.
  3. Finally, it is expected that any student provided with Hostel Services at Vellore Institute of Technology may access all of the leave request information, so that each request is independently verifiable. This will include a web application, presented using HyperText Markup Language (HTML), that allows a user to raise a request and to see requests that are under review, including the history of previous requests.

Working Environment:

The Automated System is built as a web application, so the computer used to access it must be capable of rendering HTML and must have an internet connection. The system will be hosted on the University server in order to make it accessible to all students, faculty mentors, and wardens.

Design and Implementation Constraints:

The Application provides an end-to-end leave management system that copes with malicious attacks provided certain constraints are met. Principally, all necessary steps should be taken to protect the System from unguarded attacks by using physical, network, storage, and user security protection. These safeguards should be penetration tested by the SDC to ensure viability.

User Documentation:

The users are the students and the faculty/staff of the university who are authorized by SDC. They will be able to raise, view, approve, or disapprove requests on the server. The application client will be available free of charge, and any purchaser of the server software will be authorized to distribute it to their users.

Assumptions and Dependencies:

In its initial phases of development, this software depends on a few third-party commercial applications and assumptions. The Student Development Team will take care of all the assumptions and dependencies. It will be the responsibility of SDC to purchase or develop each dependency as per the University IT norms.

System Features:

Login:

This is used to log in and maintain security by authenticating the users.

1. Should accept the user name and password.
2. A case-insensitive comparison is done for the user name and a case-sensitive comparison is done for the password.
3. If the correct user ID and password are supplied, the Main Menu should be displayed.
4. If an invalid user ID or password is entered, the system should display the error message “Invalid ID or password” and should quit the application.
5. Username: Students use their Registration ID; Faculty and Staff use their Employee ID.
6. VTop login credentials can be used in future enhancements.
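The comparison rules in the list above can be sketched as follows: the user name is normalized before lookup, so it is case-insensitive, while the password is compared exactly. Storing a salted PBKDF2 hash rather than the password itself is an addition of this sketch, not something the source specifies, and the registration ID and password shown are made up.

```python
import hashlib
import hmac
import os

def hash_password(password, salt=None):
    """Return (salt, digest) for a password using PBKDF2-HMAC-SHA256."""
    salt = salt if salt is not None else os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)
    return salt, digest

def check_login(users, user_id, password):
    """Case-insensitive user ID lookup, case-sensitive password check."""
    record = users.get(user_id.strip().casefold())
    if record is None:
        return False
    salt, stored = record
    _, candidate = hash_password(password, salt)
    return hmac.compare_digest(stored, candidate)

# Hypothetical account, keyed by the lower-cased registration ID.
users = {"21bce0001": hash_password("S3cret!")}

print(check_login(users, "21BCE0001", "S3cret!"))   # same ID in upper case
print(check_login(users, "21bce0001", "s3cret!"))   # password case differs
```

On failure, the caller would show the single message “Invalid ID or password” without revealing which of the two fields was wrong.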

Mentor’s Portal:

This feature allows mentors to process leave/vacation requests.

1. Can approve a request.
2. Can decline a request.
3. Can edit the request.
4. Can verify the request.

Warden’s Portal:

This feature allows the warden to authenticate the requests.

1. Can view all requests for an outpass.
2. Can reconsider requests.

Hostel Supervisor’s Portal:

This feature allows supervisors to issue outpasses to the students.

1. Can issue outpasses to students.
2. Can deny the issue of an outpass.
3. Can send a request for reconsideration to the warden.

Student Portal:

This feature will allow students to raise a request for an outing/extended outing/leave.

1. Can raise requests in the respective categories.
2. Receives a system-generated outpass.

External Interface Requirement:

User Interfaces:

Login Interface – The login interface consists of the student username and password fields. Students can log in with their VTop credentials.

The login interface for the faculty and staff consists of the faculty/ staff employee id and password. Their credentials will also be the same as those of VTop.

Hardware Interfaces – Hardware requirements include a laptop, desktop, or smartphone with proper connectivity to access the system. Other than the above, no hardware is required.

Software Interfaces – The software is based on an application interface. The Application will interact with the University Server with regard to user verification and information retrieval.

  • Operating System: Ubuntu
  • Programming Language: HTML, PHP, CSS, JavaScript
  • IDE: Visual Studio Code
  • Database: InnoDB
  • Hosting Base: Amazon Web Services

Communication Interfaces – This software will work over an Ethernet connection or a wireless connection.

Cost Calculation:

SOFTWARE COST ESTIMATION:

For any software project under development, it is indispensable to know how much it will cost to develop and how much development time it will take. The project scope must be established in advance, and software metrics are used as a basis from which the evaluation is made. The project is broken into small pieces, which are estimated individually. Several estimation procedures have been developed to monitor the project’s progress, so that developers and product managers can assess whether the project is progressing according to plan and take corrective action if necessary.

STATIC, MULTIVARIABLE MODELS:

Static, multivariable models depend on several variables describing various aspects of the software development environment. In some models, several variables are needed to describe the software development process, and the selected equation combines these variables to give an estimate of time and cost.

Walston and Felix developed their model at IBM; it provides equations relating the lines of source code to the effort and duration of development.

For our software project, the lines of code (LOC) sum up to 5223, which becomes 5.223 KLOC.

So, according to the Walston-Felix model, our project requires an effort of about 24 person-months and about seven and a half months of development time.
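The estimate can be reproduced with a short calculation. The sketch assumes the commonly cited form of the Walston-Felix equations, effort E = 5.2 × KLOC^0.91 person-months and duration D = 4.1 × KLOC^0.36 months; the source does not state the constants it used.

```python
def walston_felix(kloc):
    """Return (effort in person-months, duration in months) for a size in KLOC."""
    effort = 5.2 * kloc ** 0.91
    duration = 4.1 * kloc ** 0.36
    return effort, duration

effort, duration = walston_felix(5223 / 1000)
print(f"effort   ~ {effort:.1f} person-months")
print(f"duration ~ {duration:.1f} months")
print(f"average staffing ~ {effort / duration:.1f} people")
```

For 5.223 KLOC this gives roughly 23.4 person-months and 7.4 months, in line with the rounded figures quoted above. Dividing effort by duration gives the average team size over the project, which is a little over three people.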