The main goal of the Healthcare Hospital System project is to let directors, management, and other authorized users view hospital statistics. The project covers user registration, payments, consultation statistics, pharmacy, surgery, transplants, equipment, patients and doctors, healthcare bills, multi-valued diagnoses, preventive health care, corporate counseling, welfare programs, claims administration, patient care coordination, doctor achievements, staff details, and Human Resources.

The attributes included in the hospital system are street address, city, county, zip code, birth date, admission and discharge dates, date of death, Social Security numbers, record numbers, health plan beneficiary numbers, account numbers, telephone numbers, fax numbers, and e-mail addresses. Because patient information is already stored in the database, it is easy to retrieve a patient's record and to identify the doctor responsible for that patient.

The best way to achieve the cost goal is to analyze and interpret the data and to produce visualization reports.

Project Requirements and Thoughts

  • Identify a topic of interest and run with it! Pick a topic that will keep you interested and motivated over the next 14 weeks.
  • Ask the following question: What answers am I trying to mine, or what problem am I trying to solve?
  • Formulate a hypothesis question. Additional hypothesis questions (maximum of 3 questions) are encouraged. A hypothesis question contains both a null and an alternative hypothesis.
  • What kind of data is being mined? Where does the data originate? Can the data be imported into a database (i.e. for formulating updated queries)?
  • List classes and/or concepts related to characterization and discrimination.
  • Is your project identifying patterns, associations, and/or correlations?
  • Within the project, statistics (e.g. central tendency, range, IQR, variance, etc.) should be presented.
  • Does your dataset contain outliers? If so, please identify them.
  • Projects should list attributes (e.g. nominal, binary, ordinal, and/or numeric).
  • If possible, data should be presented visually (IPython, R, MATLAB, Excel, etc.).
  • Was the collected data considered to be of good or poor quality?
  • Does the data contain missing values or noisy data?
  • Does the dataset contain redundant data? If so, how did you remove duplicated values?
  • When analyzing the data, did you perform OLAP or OLTP processing?
  • Which data models were used for your project (e.g. Stars, Snowflakes, or Fact Constellations)?
  • If OLAP processing was used, did you incorporate bitmap or join indexing into the processing?
  • Did you determine the size of your dataset?
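
Several of the items above (duplicate removal, missing values, summary statistics, and outliers) can be checked with a few lines of code. The following is only a minimal sketch using the Python standard library. The sample records, the `bill_amount` field, and the helper names `deduplicate` and `summarize` are hypothetical and do not come from the project data. The 1.5 × IQR rule used for outliers is one common convention, assumed here for illustration.

```python
import statistics

# Hypothetical sample: (record_number, bill_amount); None marks a missing value.
records = [
    (101, 250.0),
    (102, 310.0),
    (103, None),
    (104, 295.0),
    (102, 310.0),   # exact duplicate of record 102
    (105, 280.0),
    (106, 5000.0),  # unusually large bill
    (107, 265.0),
]


def deduplicate(rows):
    """Drop exact duplicate rows while keeping first-seen order."""
    seen = set()
    unique = []
    for row in rows:
        if row not in seen:
            seen.add(row)
            unique.append(row)
    return unique


def summarize(values):
    """Return central tendency, spread, and 1.5*IQR outliers for numeric values."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "range": max(values) - min(values),
        "variance": statistics.variance(values),  # sample variance
        "iqr": iqr,
        "outliers": [v for v in values if v < low or v > high],
    }


unique_rows = deduplicate(records)
# Record numbers whose bill amount is missing.
missing = [rec_id for rec_id, amount in unique_rows if amount is None]
# Statistics are computed only over the values that are present.
amounts = [amount for _, amount in unique_rows if amount is not None]
summary = summarize(amounts)

print("rows after de-duplication:", len(unique_rows))
print("records with missing amount:", missing)
for name, value in summary.items():
    print(name, "=", value)
```

In a real project the rows would come from the hospital database rather than a hard-coded list, and the decision to drop, impute, or flag missing values and outliers would need to be justified in the report.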