End-to-End Machine Learning Frameworks for Medicine: Data Imputation, Model Interpretation and Synthetic Data Generation

Speaker: Jinsung Yoon
Affiliation: Ph.D. Candidate - UCLA

Via Zoom Only: https://ucla.zoom.us/j/210298181

Abstract: Machine learning has achieved tremendous success in applications such as image classification and language translation through supervised learning frameworks. With the rapid growth of electronic health records (EHR), machine learning researchers now have immense opportunities to adapt these successful supervised learning frameworks to diverse clinical applications. To properly employ machine learning frameworks for medicine, we need to handle the special properties of EHR data and clinical applications: (1) extensive missing data, (2) the need for model interpretation, and (3) data privacy. This dissertation addresses these properties to construct end-to-end machine learning frameworks for clinical decision support.

We focus on the following three problems: (1) how to deal with incomplete data (data imputation); (2) how to explain the decisions of the trained model (model interpretation); and (3) how to generate synthetic data so that private clinical data can be shared more safely (synthetic data generation). To handle these problems, we propose novel machine learning algorithms for both static and longitudinal settings. For data imputation, we propose modified Generative Adversarial Networks and Recurrent Neural Networks that accurately impute the missing values and return complete data to which state-of-the-art supervised learning models can be applied. For model interpretation, we use the actor-critic framework to estimate the feature importance of the trained model’s decisions at the instance level. We extend this algorithm to an active sensing framework that recommends which observations to measure and when. For synthetic data generation, we extend well-known Generative Adversarial Network frameworks from the static setting to the longitudinal setting and propose a novel differentially private synthetic data generation framework. To demonstrate the utility of the proposed models, we evaluate them on various real-world medical datasets, including cohorts from intensive care units, wards, and primary care hospitals. We show that the proposed algorithms consistently outperform state-of-the-art methods for handling missing data, understanding the trained model, and generating private synthetic data, all of which are critical for building end-to-end machine learning frameworks for medicine.

Biography: Jinsung Yoon is currently a PhD candidate in the ECE Department at UCLA. He received his B.S. degree in ECE from Seoul National University in 2014 and his M.S. degree in ECE from UCLA in 2016. He has authored or co-authored more than 25 publications in peer-reviewed journals and international conferences. His research interests include deep generative models, semi-supervised learning, and interpretability.

For more information, contact Prof. Mihaela van der Schaar.

Date/Time: Apr 13, 2020, 9:00 am - 11:00 am
