Big Data Spain

15th ~ 16th OCT 2015 MADRID, SPAIN #BDS15




Thursday 15th

from 12:15 to 13:00

Room 25



Geriatric medicine is a clinical research field in which big data, statistical analysis, machine learning and visualization techniques can provide relevant, solid and lasting benefits, including performance optimization and improved quality of care.
Those benefits translate into improvement of patients' quality of life, cost rationalization and better use of resources in the public health system.

In this talk, the authors will explain how they used these tools with elderly patients' data to realize those benefits.

The project covered four key areas: development of predictive models that connect admission-related data with key target variables such as length of stay (LOS) or exitus (death); development of a Hadoop-based visualization environment that gives domain experts a tool for exploratory analysis; efficiency analysis of programs that provide assistance to nursing homes; and detection of the key factors that lead to higher admission rates from nursing homes.

For this purpose, anonymized patient data were mined with data analytics tools and big data platforms. These data included details about demographics, diagnosis, blood-related indicators, functional and mental status, admission-related complications, prescribed drugs and other variables. One of the datasets, containing thousands of records from the past six years, was used without sampling, in order to ensure that the advantages of big data were realized.

Our toolset included SAS Enterprise Miner, IBM SPSS Modeler, R, Weka and Hadoop.

Our goal was to answer the following questions:

  • How can big data improve the process of building predictive models compared with traditional approaches, such as hypothesis-based clinical studies?
  • Can big data help identify which management or research initiatives provide a true return on investment, measured as improved quality of life for patients or lower caregiving costs?
  • Can big data bring new knowledge to the medical community in order to optimize scarce resources (such as doctors) and improve planning and forecasting processes?
  • How can exploratory analysis be delivered through big data platforms like Hadoop?
  • Can big data help improve the prediction of clinical performance by using a high volume of data, instead of relying on inference techniques from small samples?

The results obtained in each of the four research topics are outlined below.

Through our big data project, one drug traditionally used with elderly patients (digoxin) was shown to be significantly associated with patient death. Since medication is among the leading causes of death in several developed countries, discovering which drugs can negatively impact survival rates becomes a key research goal.

A statistically significant model to predict patients' length of stay was built. The model is based on variables that are typically gathered during the admission process. This model can be implemented to improve forecasting processes in hospitals.

Key features of nursing homes that lower hospital admission rates were identified. This finding allows healthcare managers to use informed criteria to choose which nursing homes will lead to fewer admissions, hence making more efficient use of available resources. It also helps nursing home managers make better decisions about which features they should invest in to make these organizations more productive.

Lastly, the exploratory environment built on Hadoop made it possible to answer questions such as: What is the patient profile for the most common diseases? How has a certain disease evolved over time? Which diseases have a higher cost of treatment?

By capitalizing on these results, the ultimate goal was to show that big data and data analytics tools and techniques can optimize any clinical research initiative, since exploratory features and a "data discovery" approach (as opposed to the traditional testing of a single hypothesis) lead to more insights, more findings and the detection of non-intuitive or complex relationships between variables.

Rafael San Miguel

UNIR, Data Scientist

Dr. Javier Gómez Pavón

Hospital Central de la Cruz Roja, Data Scientist