
Machine Learning Platform at Mount Sinai Health Systems

By February 27, 2020


Arash Kia
Data Science Lead
Mount Sinai


Prem Timsina
Lead Data Engineer
Mount Sinai


Prem Timsina and Arash Kia are part of a collaborative data science team at Mount Sinai whose goal is to bring the power of machine learning to clinicians in order to better serve patients. In this presentation, they explain how their team of mathematicians, DevOps engineers, data engineers, data scientists, and clinicians has created prediction algorithms to aid in patient treatment.
Watch Prem and Arash's full presentation here

The Challenges

Implementing machine learning models typically involves many challenges. On top of these, Prem and Arash faced two additional hurdles: their models' results had to inform real-time questions and problems, and their work had to remain within the laws surrounding patient confidentiality in the medical industry.

The typical challenges included data quality, managing different data sources, and a lack of standardization. These can be particularly difficult in a medical setting, where patients are the priority and documentation and data collection are considered secondary. Solving this problem required many years’ worth of data: the team had to go back to 2011 to retrieve enough data logs to build a database diverse enough for modeling.

"There are multiple types of data: structured data, unstructured data, clinical data, imaging, and genomics data, all of which needed to be accounted for."

Finding Solutions

To build an automated process, the team needed a centralized interface. It allowed different data sources to be aggregated, cleaned, normalized, and standardized into the format needed to build accurate models. It also preserved data integrity: because no individual data source ever interacted directly with another, data sets could not contaminate one another. Finally, the centralized interface allowed scalable and consistent results to be produced from several different data sources.
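The idea of a centralized interface can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Mount Sinai's implementation: the two feeds (`lab`, `nursing`), their field names, and the `Observation` schema are all hypothetical. Each source gets its own adapter that converts rows into one shared format, so the sources never interact with each other directly.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class Observation:
    """Hypothetical common schema that every source is converted into."""
    patient_id: str
    name: str
    value: float
    unit: str
    taken_at: datetime


def from_lab_feed(row: dict) -> Observation:
    # Assumed lab export: weight in pounds, US-style timestamps.
    return Observation(
        patient_id=str(row["mrn"]).strip(),
        name="weight",
        value=round(float(row["weight_lb"]) * 0.45359237, 2),  # lb -> kg
        unit="kg",
        taken_at=datetime.strptime(row["ts"], "%m/%d/%Y %H:%M"),
    )


def from_nursing_feed(row: dict) -> Observation:
    # Assumed nursing export: weight already in kg, ISO 8601 timestamps.
    return Observation(
        patient_id=str(row["patient"]).strip(),
        name="weight",
        value=round(float(row["kg"]), 2),
        unit="kg",
        taken_at=datetime.fromisoformat(row["recorded"]),
    )


# One adapter per source; sources only ever meet in the common schema.
ADAPTERS: Dict[str, Callable[[dict], Observation]] = {
    "lab": from_lab_feed,
    "nursing": from_nursing_feed,
}


def standardize(feeds: Dict[str, Iterable[dict]]) -> List[Observation]:
    """Aggregate all feeds, skip malformed rows, and return sorted records."""
    observations: List[Observation] = []
    for source, rows in feeds.items():
        adapter = ADAPTERS[source]
        for row in rows:
            try:
                observations.append(adapter(row))
            except (KeyError, ValueError):
                continue  # missing field or unparseable value: drop the row
    return sorted(observations, key=lambda o: (o.patient_id, o.taken_at))
```

Adding a new data source in this sketch means writing one more adapter and registering it, without touching the code for any existing source.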

Solving these issues was necessary to create a data pipeline that could operate efficiently and accurately. It also allowed consistent visuals to be generated, which is essential for clinicians to quickly and correctly identify which outcomes are being predicted.


The original purpose for integrating machine learning at Mount Sinai was twofold. Using time series models, the team made predictions for malnutrition reporting and for whether patients would be discharged within the next 48 hours. These specific goals introduced the second problem unique to implementing these machine learning models.
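As an illustration of how the discharge question can be framed for a model, the sketch below turns "discharged within the next 48 hours" into a yes/no label. The function name, its parameters, and the handling of patients who are still admitted are assumptions made for this example; the presentation summary does not describe how the team built its labels.

```python
from datetime import datetime, timedelta
from typing import Optional


def discharged_within(
    prediction_time: datetime,
    discharge_time: Optional[datetime],
    window_hours: int = 48,
) -> bool:
    """Return True if discharge falls inside the window after prediction_time.

    Simplification (assumption): a patient with no recorded discharge is
    labeled False, and a discharge before prediction_time is also False.
    """
    if discharge_time is None:
        return False
    elapsed = discharge_time - prediction_time
    return timedelta(0) <= elapsed <= timedelta(hours=window_hours)
```

For example, a prediction made at 08:00 on one day with a discharge at 20:00 the next day (36 hours later) is labeled `True`, while a discharge 72 hours later is labeled `False`.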

These models were designed to be time sensitive so that clinicians could use the predictions as a valuable resource when interacting with their patients on a daily basis and in real time. The models were accurate enough that, after implementation, clinicians consulted the predicted results when prioritizing patient needs.

Time series models work by looking at historical data points spaced evenly in chronological order. A successful time series algorithm identifies trends from patterns that repeat at discrete time intervals and attempts to project future events that mimic what it has observed. Often, after a prediction is made, the predicted values are compared with actual values as they occur to determine the accuracy of the model.
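The loop described above (learn a repeating pattern, project it forward, then score the projection against what actually happened) can be shown with a deliberately simple baseline. The sketch uses a seasonal-naive forecast, which just repeats the last observed cycle, together with mean absolute error. It is a stand-in chosen for brevity, not the algorithm used at Mount Sinai, and the numbers in the usage note are made up.

```python
from typing import List, Sequence


def seasonal_naive_forecast(
    history: Sequence[float], period: int, horizon: int
) -> List[float]:
    """Forecast by repeating the most recent full cycle of evenly spaced points."""
    if period <= 0 or len(history) < period:
        raise ValueError("need at least one full period of history")
    last_cycle = list(history[-period:])
    return [last_cycle[i % period] for i in range(horizon)]


def mean_absolute_error(
    predicted: Sequence[float], actual: Sequence[float]
) -> float:
    """Average absolute gap between predicted and observed values."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and the same length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)
```

With a history of `[10, 12, 14, 11, 13, 15]` and a period of 3, a four-step forecast is `[11, 13, 15, 11]`. If the values later observed are `[12, 13, 14, 11]`, the mean absolute error is 0.5.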


In addition to creating models that operated and returned results in real time, Prem, Arash, and their team built time series models that took into account additional features that clinicians would not typically consider when making decisions. By creating a feedback loop between the clinicians and the time series modeling, Mount Sinai was able to keep false positive and false negative diagnoses at rates below the medical industry averages. The credit goes to looking at additional features that the models indicated had greater significance for outcomes, and to adjusting the algorithms based on the clinicians' firsthand observations.
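For readers unfamiliar with the two error types mentioned above, the following sketch shows how false positive and false negative rates are commonly computed once predictions can be checked against observed outcomes. The function name and the sample values are illustrative assumptions, not figures from the presentation.

```python
from typing import Dict, Optional, Sequence


def error_rates(
    predicted: Sequence[bool], observed: Sequence[bool]
) -> Dict[str, Optional[float]]:
    """Compute false positive and false negative rates.

    false positive rate = FP / (FP + TN)
    false negative rate = FN / (FN + TP)
    A rate is None when its denominator is zero.
    """
    if len(predicted) != len(observed):
        raise ValueError("inputs must be the same length")
    tp = fp = tn = fn = 0
    for p, o in zip(predicted, observed):
        if p and o:
            tp += 1
        elif p and not o:
            fp += 1
        elif not p and o:
            fn += 1
        else:
            tn += 1
    return {
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else None,
        "false_negative_rate": fn / (fn + tp) if (fn + tp) else None,
    }
```

In a feedback loop like the one described, the `observed` values would come from what clinicians actually saw, and the resulting rates would indicate whether an adjustment to the model helped.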

This process of data collection, knowledge, and learning was the building block for implementing machine learning at Mount Sinai. The data science team overcame the hurdles on the machine learning side and collaborated with clinicians to tackle the challenges on the medical side. To continue to improve, Mount Sinai plans to move the data pipeline to a hybrid cloud, incorporate more imaging, and publish the results in a form that can be used more broadly within health systems. Continual improvement and learning have proven successful and look set to continue into the future.

