Big Data Hadoop Projects Titles

The following are projects on Big Data Hadoop.

1) Twitter data sentiment analysis using Flume and Hive

2) Business insights of User usage records of data cards

3) Wiki page ranking with Hadoop

4) Health care Data Management using Apache Hadoop ecosystem

5) Sensex Log Data Processing using BigData tools

6) Retail data analysis using BigData

7) Facebook data analysis using Hadoop and Hive

8) Archiving LFS (Local File System) & CIFS Data to Hadoop

9) Aadhaar Based Analysis using Hadoop

10) Web Based Data Management of Apache Hive

11) Automated RDBMS Data Archiving and Dearchiving using Hadoop and Sqoop

12) BigData PDF Printer

13) Airline on-time performance

14) Climatic Data analysis using Hadoop (NCDC)

15) MovieLens Data processing and analysis

16) Two-Phase Approach for Data Anonymization Using MapReduce

17) Migrating Different Sources To BigData And Its Performance

18) Flight History Analysis

19) Pseudo-distributed Hadoop cluster in script


  • 12-24 hard disks of 1-4 TB each in a JBOD (Just a Bunch Of Disks) configuration
  • 2 quad-/hex-/octo-core CPUs, running at 2-2.5 GHz or faster
  • 64-512 GB of RAM
  • Bonded Gigabit Ethernet or 10 Gigabit Ethernet (the higher the storage density, the higher the network throughput needed)


  • FRONT END: Jetty server, web UI in JSP
  • BACK END: Apache Hadoop, Apache Flume, Apache Hive, Apache Pig, JDK 1.6
  • OS: Ubuntu Linux
  • IDE: Eclipse

A Signature-Based Indexing Method for Efficient Content-Based Retrieval of Relative data’s Report

Rule discovery algorithms in data mining generate a large array of rules and patterns. The number of discovered rules can even exceed the size of the underlying database, and only a fraction of them proves useful to users. Interpreting the discovered rules and patterns is an important part of the knowledge discovery process, but when their number is very large it becomes difficult to choose and analyze the most interesting ones.

For example, it may not be a good idea to present the user with a list of association rules ranked by their support and confidence. Such a list is not a good way of organizing the rule set, and it can also overwhelm users. Moreover, not all rules are interesting, and whether a rule is interesting depends on a variety of factors.
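To make the two ranking measures concrete, here is a minimal sketch of how support and confidence are usually computed for an association rule. The function names and the `baskets` sample data are illustrative assumptions, not part of the project report.

```python
from typing import FrozenSet, Iterable, List


def support(transactions: List[FrozenSet[str]], items: Iterable[str]) -> float:
    """Fraction of transactions that contain every item in `items`."""
    wanted = frozenset(items)
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if wanted <= t)
    return hits / len(transactions)


def confidence(
    transactions: List[FrozenSet[str]],
    antecedent: Iterable[str],
    consequent: Iterable[str],
) -> float:
    """support(antecedent and consequent together) / support(antecedent)."""
    antecedent = frozenset(antecedent)
    base = support(transactions, antecedent)
    if base == 0.0:
        return 0.0  # the rule never fires, so treat its confidence as zero
    both = support(transactions, antecedent | frozenset(consequent))
    return both / base


# Hypothetical market-basket data, used only for illustration.
baskets = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]

if __name__ == "__main__":
    print(support(baskets, {"bread", "milk"}))
    print(confidence(baskets, {"bread"}, {"milk"}))
```

Sorting thousands of rules by these two numbers alone still produces one long flat list, which is the problem the paragraph above describes.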

A useful data mining system must be able to generate rules feasibly while also providing flexible tools for rule selection. In association rule mining, various approaches to processing the discovered rules have been discussed. One approach groups similar rules, which works well for a moderate number of rules. When there are too many rules, clusters can be created instead.

A more flexible approach lets users identify the rules that have special value to them. This is done through data queries or templates. This approach also complements the rule grouping approach well. The concept of the inductive database has highlighted the importance of data mining: it lets clients query the patterns and rules as well as the data and the models extracted from it.
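Template-based selection can be sketched as a simple filter over the discovered rules. The `Rule` and `Template` classes below are hypothetical names chosen for illustration; the report does not specify how templates are represented.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class Rule:
    antecedent: FrozenSet[str]
    consequent: FrozenSet[str]
    support: float
    confidence: float


@dataclass(frozen=True)
class Template:
    """A user-supplied pattern: items a rule must mention plus thresholds."""
    antecedent_has: FrozenSet[str] = frozenset()
    consequent_has: FrozenSet[str] = frozenset()
    min_support: float = 0.0
    min_confidence: float = 0.0

    def matches(self, rule: Rule) -> bool:
        return (
            self.antecedent_has <= rule.antecedent
            and self.consequent_has <= rule.consequent
            and rule.support >= self.min_support
            and rule.confidence >= self.min_confidence
        )


def select_rules(rules: Iterable[Rule], template: Template) -> List[Rule]:
    """Return only the rules that satisfy the template, in their original order."""
    return [r for r in rules if template.matches(r)]
```

A user interested only in what leads to milk purchases could pass `Template(consequent_has=frozenset({"milk"}), min_confidence=0.6)` and ignore every other rule.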

Port Knocking C++ Computer Science Project with Report

Introduction to Port Knocking C++ Project:

Port knocking is a host-to-host communication method in which data is sent to closed ports. It has several variants: the information can be encoded in the port sequence or in the packet payload. Typically, the data sent to the closed ports is received by the target host or by a monitoring daemon, which checks the information without sending any acknowledgement to the sender.

Port knocking is a way for two or more computers, typically a client and a server, to communicate by encoding and encrypting information in a sequence of port numbers. This sequence is called the knock. The server monitors the client's connection requests and initially exposes no open ports. The client makes connection attempts to the server by sending SYN packets to the ports specified in the knock, hence the name port knocking. The server does not respond during the knocking phase; it only monitors the SYN packets and processes the port sequence. Once the server has decoded the knock, it grants the client access.

Authorized users are allowed through the firewall, while the ports stay closed to all other users. Our project set out to find ways of providing an authentication service and settled on port knocking. The project examines the existing system, highlights its demerits, and modifies it with a novel port knocking architecture that provides a stronger authentication service.

The Proposed System 

  1. Port knocking provides highly secure authentication and information transmission to the host without exposing open ports.
  2. The client is not aware that the server is decoding the knocking sequence.
  3. The server monitors client requests.
  4. The port is made available to the requester for a specific period of time.
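The server-side behaviour listed above can be sketched as a small per-client state machine. This is an illustration under stated assumptions, not the project's implementation: the class name `KnockTracker`, the timeout parameters, and the explicit `now` timestamps are all hypothetical, and capturing real SYN packets is out of scope here.

```python
from typing import Dict, Sequence


class KnockTracker:
    """Tracks each client's progress through a secret port sequence."""

    def __init__(self, sequence: Sequence[int], open_seconds: float,
                 knock_timeout: float) -> None:
        if not sequence:
            raise ValueError("knock sequence must not be empty")
        self.sequence = tuple(sequence)
        self.open_seconds = open_seconds      # how long access stays granted
        self.knock_timeout = knock_timeout    # max gap between two knocks
        self._progress: Dict[str, int] = {}
        self._last_knock: Dict[str, float] = {}
        self._open_until: Dict[str, float] = {}

    def observe_syn(self, client_ip: str, port: int, now: float) -> None:
        """Record a SYN seen on a closed port. Nothing is sent back to the client."""
        idx = self._progress.get(client_ip, 0)
        last = self._last_knock.get(client_ip)
        if last is not None and now - last > self.knock_timeout:
            idx = 0  # the client was too slow, so start over
        if port == self.sequence[idx]:
            idx += 1
        else:
            # A wrong port resets progress; it may still begin a new attempt.
            idx = 1 if port == self.sequence[0] else 0
        self._last_knock[client_ip] = now
        if idx == len(self.sequence):
            self._open_until[client_ip] = now + self.open_seconds
            idx = 0
        self._progress[client_ip] = idx

    def is_open(self, client_ip: str, now: float) -> bool:
        """True while the protected port is available to this client."""
        return now < self._open_until.get(client_ip, float("-inf"))
```

A firewall wrapper would call `observe_syn` for every dropped SYN and consult `is_open` before admitting a connection to the protected service. Passing `now` explicitly keeps the logic easy to test.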

Leader Election in Mobile AD HOC Network Final Year CSE Project Report for NIT Students

Introduction to  Leader Election in Mobile AD HOC Network Project:

We present leader election algorithms for mobile ad hoc networks. The algorithms guarantee that eventually every connected component of the topology graph has exactly one leader. They are based on a routing algorithm called TORA and require nodes to communicate only with their current neighbors. The algorithm described here handles a single topology change: it assumes that only one topology change occurs in the network at a time.
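The guarantee stated above, exactly one leader per connected component, can be illustrated with a small centralized sketch. This is not the TORA-based distributed algorithm from the report; it computes the desired end state directly from a global view of the topology. Choosing the highest node id as leader is an assumption made for illustration.

```python
from typing import Dict, Iterable, List, Set


def elect_leaders(links: Dict[int, Iterable[int]]) -> Dict[int, int]:
    """Map every node to the leader (highest id) of its connected component."""
    # Build a symmetric adjacency map, since wireless links are treated as two-way.
    adj: Dict[int, Set[int]] = {}
    for node, neighbours in links.items():
        adj.setdefault(node, set())
        for n in neighbours:
            adj[node].add(n)
            adj.setdefault(n, set()).add(node)

    leader: Dict[int, int] = {}
    for start in adj:
        if start in leader:
            continue
        # Depth-first search collects one connected component.
        component: List[int] = []
        seen = {start}
        stack = [start]
        while stack:
            cur = stack.pop()
            component.append(cur)
            for n in adj[cur]:
                if n not in seen:
                    seen.add(n)
                    stack.append(n)
        chief = max(component)
        for member in component:
            leader[member] = chief
    return leader
```

When a link breaks and a component splits, rerunning the function gives each part its own leader. When two components merge, they end up sharing one. A distributed algorithm has to reach the same end state using only messages between current neighbors.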

An ad hoc network is usually defined as an infrastructure-less network, meaning a network without the usual routing infrastructure such as fixed routers and routing backbones. Typically the ad hoc nodes are mobile and the underlying communication medium is wireless. Each ad hoc node may be capable of acting as a router. Such ad hoc networks may arise in personal area networking, meeting rooms and conferences, disaster relief and rescue operations, battlefield operations, and so on.

Leader election is a useful building block in distributed systems, whether wired or wireless, especially when failures can occur. Leader election can also be used in communication protocols to choose a new coordinator when the group membership changes. Developing distributed algorithms for ad hoc networks is a particularly challenging task because the topology may change very frequently and unpredictably.


Cache Compression in the Linux Memory Module NIT Computer Science Project Report

Introduction to Cache Compression in the Linux Memory Module Project:

Cache compression configures part of RAM to cache pages and files in compressed form, which adds a significant new capability to the existing system. Every level of the compressed cache aims for much higher performance than random disk access. The current version is verified through the virtual memory subsystem. The system's performance depends on the speed of the machine, which in turn depends on the total amount of RAM.

Various documents and files reside in memory, in databases, and on disk. Files and documents held in the memory cache are compressed and then made available to the user when required.
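The idea of keeping pages compressed in RAM and decompressing them on demand can be sketched in user space. This is only an illustration, assuming 4096-byte pages, zlib compression, and least-recently-used eviction. The class name `CompressedPageCache` is hypothetical, and the actual project works inside the Linux kernel rather than in Python.

```python
import zlib
from collections import OrderedDict
from typing import Optional

PAGE_SIZE = 4096  # assumed page size in bytes


class CompressedPageCache:
    """Holds pages in compressed form, evicting least recently used pages."""

    def __init__(self, capacity_bytes: int) -> None:
        self.capacity_bytes = capacity_bytes
        self._pages: "OrderedDict[int, bytes]" = OrderedDict()
        self._used = 0

    @property
    def used_bytes(self) -> int:
        """Total size of the compressed data currently held."""
        return self._used

    def put(self, page_no: int, data: bytes) -> None:
        if len(data) != PAGE_SIZE:
            raise ValueError("expected exactly one page of data")
        blob = zlib.compress(data)
        old = self._pages.pop(page_no, None)
        if old is not None:
            self._used -= len(old)
        self._pages[page_no] = blob
        self._used += len(blob)
        # Evict the oldest entries until the cache fits its budget again.
        while self._used > self.capacity_bytes and self._pages:
            _, evicted = self._pages.popitem(last=False)
            self._used -= len(evicted)

    def get(self, page_no: int) -> Optional[bytes]:
        blob = self._pages.get(page_no)
        if blob is None:
            return None  # a miss: the caller would fall back to disk
        self._pages.move_to_end(page_no)  # mark as recently used
        return zlib.decompress(blob)
```

The trade-off is CPU time for compression against the much higher cost of a disk read on a miss, so the gain depends on how compressible the cached pages are.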

Some nodes run an operating system such as Linux. Linux can manage high memory on suitable hardware configurations. Watermarks are thresholds on the number of free pages in a zone that tell the kernel when to start reclaiming memory. Further details of the system are given in the reference books.

The system is developed on Linux because it works properly and gives better results there. Memory is divided into several zones that are relevant to the system. ZONE_DMA covers the low physical memory ranges that ISA devices require. ZONE_NORMAL is the region that the kernel maps directly.
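As a rough illustration of the zone split, the sketch below classifies a physical address by zone. It assumes the classic 32-bit x86 layout (ZONE_DMA below 16 MiB, ZONE_NORMAL up to 896 MiB, ZONE_HIGHMEM above that). The report itself names only the DMA and normal zones, and other architectures use different boundaries.

```python
MIB = 1024 * 1024

# Assumed boundaries for 32-bit x86; not taken from the project report.
ZONE_DMA_LIMIT = 16 * MIB
ZONE_NORMAL_LIMIT = 896 * MIB


def zone_for_address(phys_addr: int) -> str:
    """Return the name of the memory zone that contains a physical address."""
    if phys_addr < 0:
        raise ValueError("physical addresses are non-negative")
    if phys_addr < ZONE_DMA_LIMIT:
        return "ZONE_DMA"      # reachable by legacy ISA DMA devices
    if phys_addr < ZONE_NORMAL_LIMIT:
        return "ZONE_NORMAL"   # directly mapped by the kernel
    return "ZONE_HIGHMEM"      # mapped by the kernel only on demand
```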


Linux Project CSE Final Year Project Report

Introduction to Linux Project:

The aim of the project is to field production Linux clusters that provide Livermore Computing (LC) users with the Livermore Model programming environment at LLNL. In the past, this was accomplished with variants of the UNIX operating system on proprietary hardware. By using Linux on near-commodity hardware, we believed that we could procure more cost-effective clusters, provide better support to our users, and produce a platform from which open source system software could be developed with strategic partners and leveraged by the high-performance computing (HPC) community at large.

The focus of the Linux Project is twofold. First, we address the technical gaps between Linux software and the Livermore Model through a combination of local software development and collaboration. Second, we participate in the design, acquisition, and support of these clusters.

The Livermore Model:

The strategy of the LC scalable systems, known as the Livermore Model (Seager, 2002), is to provide a common application environment on all LC production clusters. This allows highly complex scientific simulation applications to be portable across generations of hardware architectures and between currently available operating systems. The strategy has provided a stable application development environment since about 1992, when the Meiko CS-2 MPP was introduced at LLNL.

The Linux Project, although successful, still has considerable areas for development. The Lustre Lite parallel file system and the SLURM resource manager will both be deployed in fall 2002. These are significant efforts that must cause little disruption under production pressures. The CHAOS effort will continue to produce releases driven by design evolution.

Project Report on Linux From Scratch

Introduction to Linux From Scratch Project:

The purpose of LFS (Linux From Scratch) is to give the learner an understanding of the Linux system and its features. One appealing aspect of learning Linux this way is building a Linux environment that matches our own needs and preferences. An important feature is that it gives us control over the system rather than depending on someone else's Linux distribution.

Another advantage of LFS is that it produces a compact Linux system. When installing a typical distribution, many programs are installed that are not actually required. These programs occupy disk space and consume CPU resources.

Another very important feature is the security of a custom-built Linux system. The whole system is compiled from source code, which lets the user audit everything and apply security fixes wherever required.

Module: The new Linux system is built from within an already installed Linux distribution, such as Red Hat, Debian, Mandriva, or SUSE. The existing Linux system serves as the starting point, providing the required programs, including a compiler, a linker, and a shell, for creating the new Linux system. The modules for creating an LFS system are as follows.

  1. Create a new Linux native partition and file system.
  2. Download the required packages and patches.
  3. Construct a temporary system.
  4. Construct the LFS system.

The Existing System 

The existing system installs many programs, some of which are not useful. It also occupies a large amount of storage. The system is not tailored to the user's interests, and its security is a greater concern.

The Proposed System 

The proposed system has several advantages. The LFS system is compact. It is user-oriented because it is customized. It is more secure and can be audited easily.

The hardware required is a Pentium IV processor, 512 MB of RAM, and an 80 GB hard disk drive. The software required is the Linux operating system.


A Linux Device Driver for USB to USB Direct Link Device Project Report


USB (Universal Serial Bus) was introduced in 1996, and its second version, released in 2000, can transfer data at 480 Mbps. Its low cost and high data transfer rate have made it increasingly important in computing. Today USB interfaces let users connect many kinds of devices to a PC, and every major operating system supports them. USB direct link cables allow various kinds of data transfer between two systems, but Linux does not support them. Linux-based operating systems cannot provide remote access between two PCs over such a cable, whereas transferring or sharing files between two computers or laptops this way is easily done on Windows.

The main objective of this study is to create a system that supports remote access over USB for transferring data and files between two systems. It consists of two parts: a kernel module, which is the USB driver, and a user interface. In the kernel module part, a device driver for Linux is designed for the direct link cable that connects the two systems. This low-level layer lets users transfer files between the systems accurately and reliably. In the user interface part, users access the USB device remotely and get transparent access to the data through a client-server design with three components: a server-side program, a client-side program, and a Qt GUI.

Hardware Requirements:

  • USB Direct link cable
  • Intel P4 processor
  • 128 MB RAM

Software Requirements:

  • Linux Kernel 2.6
  • GCC
  • Qt


Linux Projects For Engineering Students

List of Linux projects for engineering students:

CSE and IT final-year students can download the latest collection of Linux projects for engineering students, with source code and project reports. Students can find minor and major Linux projects with full information. The Linux projects listed here come from previous years' final-year projects.


Links to download Linux projects for engineering students:

  1. Android Operating System a C++ Project
  2. V3 Mail Server a Java Project
  3. McAfee Network Access Control a Linux C++ Project
  4. LINUX From Scratch a Linux Mini Project
  5. Design of Intranet Mail System Project
  6. Online Tendering CSE Project.
