Colloquium Details

Leveraging Structure for Intelligent Representation Learning in Health and Biomedicine

Speaker: Matthew McDermott, Harvard Medical School

Location: 60 Fifth Avenue Room 150

Date: March 13, 2024, 2 p.m.

Host: Rajesh Ranganath

Synopsis:

Machine learning today is undergoing a “foundation model” revolution. Emboldened by successes in natural language processing and computer vision such as GPT-4 and Stable Diffusion, more and more machine learning problems are beginning to leverage the incredible potential of these state-of-the-art representation learning technologies to challenge the kinds of problems we can solve and the methods we can use to solve them. In the high-impact, high-risk domain of health and biomedicine, the appeal of foundation models is especially poignant, given the prevalence of tasks that only have small and/or noisy datasets available for training. However, there remains significant uncertainty about how we can build foundation model systems most effectively in healthcare domains given the significant data heterogeneity, limited dataset sizes, and the notable differences in health data structure versus the comparatively simple structure of natural language. In this talk, I will describe the relevant portions of my prior research and my research vision to solve these problems and drive the state of foundation model research forward within health and biomedicine. Building on my extensive history of research on high-capacity representation learning systems, specifically those empowered by external structure and knowledge, I will describe what a “medical foundation model” really is and how we can build them, scale them to new dataset sizes, and leverage them in concert with prior successes to incorporate existing medical and modeling expertise.

Speaker Bio:

Dr. Matthew McDermott received his PhD in Computer Science from MIT, studying representation learning algorithms within machine learning for health and biomedicine in Professor Pete Szolovits’ clinical decision making group. Now, as a Berkowitz Postdoctoral Fellow at Harvard Medical School in Professor Isaac Kohane’s lab, he builds high-capacity “foundation models” and other representation learning systems over structured electronic health record (EHR) data to help build the next-generation learning health system. Historically, his research has produced seminal results including one of the earliest and the most widely used pre-trained clinical language models, one of the first theoretical frameworks that identifies how pre-training losses used in representation learning induce structural constraints that motivate fine-tuning task performance, and multiple widely used software packages for performing machine learning analyses over structured EHR data. Prior to his PhD, Dr. McDermott studied mathematics at Harvey Mudd College, worked as a software engineer in data engineering at Google, and co-founded the startup Guesstimate. Throughout his academic career, as evidenced both in his research and in his work organizing conference venues in machine learning for healthcare such as the ML4H symposium and the CHIL conference, Dr. McDermott has also maintained strong interests in reproducibility, robustness, equity, and improving the quality of research in the field of machine learning in health generally.

Notes:

In-person attendance is available only to those with active NYU ID cards.
