However, a
precise understanding of the generalization performance of interpolants in
contemporary high-dimensional settings is far from complete. In this talk, we
will focus on min-L1-norm interpolants and present a theory for their
generalization behavior on high-dimensional binary data that is linearly
separable (in an asymptotic sense). We will establish this in the common modern
context where the number of features and samples are both large and comparable.
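As a rough illustration of the central object, the following minimal sketch computes a min-L1-norm interpolant on synthetic linearly separable data via the standard linear-programming reformulation (writing w = u - v with u, v >= 0). The Gaussian data model, the dimensions n and p, and the sparse w_star are illustrative assumptions for this sketch, not the precise setting of the talk.

```python
import numpy as np
from scipy.optimize import linprog

# Illustrative sketch: solve  min ||w||_1  s.t.  y_i <x_i, w> >= 1  for all i,
# via the LP reformulation w = u - v, u >= 0, v >= 0.
rng = np.random.default_rng(0)
n, p = 50, 200                      # overparametrized regime: p > n (illustrative)
X = rng.standard_normal((n, p))
w_star = np.zeros(p)
w_star[:5] = 1.0                    # hypothetical sparse signal
y = np.sign(X @ w_star)             # linearly separable labels by construction

D = y[:, None] * X                  # rows are y_i * x_i
c = np.ones(2 * p)                  # objective: sum(u) + sum(v) = ||w||_1
A_ub = np.hstack([-D, D])           # encodes -(y_i x_i)^T (u - v) <= -1
b_ub = -np.ones(n)

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=(0, None), method="highs")
w = res.x[:p] - res.x[p:]
print("min L1 norm:", np.abs(w).sum())
print("all margins >= 1:", np.all(D @ w >= 1 - 1e-8))
```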
Subsequently, we will study the celebrated AdaBoost algorithm. Utilizing its classical
connection to min-L1-norm interpolants, we will establish an asymptotically
exact characterization of the generalization performance of AdaBoost. Our
characterization relies on specific modeling assumptions on the underlying
data; however, we will discuss a universality phenomenon that allows one to
apply our results to certain settings precluded by these assumptions. As a
byproduct, our results formalize the following crucial fact for AdaBoost:
overparametrization helps optimization. Furthermore, these results improve upon
existing upper bounds in the boosting literature in our setting, and can be
extended to min-norm interpolants under geometries beyond the L1 norm. Our analysis
is relatively general and has potential applications for other ensembling
approaches. Time permitting, I will discuss some of these extensions. This is
based on joint work with Tengyuan Liang.
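One classical bridge behind the talk is that AdaBoost run with coordinate weak learners and a vanishing step size performs coordinate descent on the exponential loss and, on separable data, its normalized iterate approaches the max-L1-margin direction, i.e. the min-L1-norm interpolant up to scaling. The sketch below renders this epsilon-boosting view; the data model, step size, and iteration count are illustrative choices, not the algorithmic setup analyzed in the talk.

```python
import numpy as np

# Same illustrative data model as in the previous sketch.
rng = np.random.default_rng(0)
n, p = 50, 200
X = rng.standard_normal((n, p))
w_star = np.zeros(p)
w_star[:5] = 1.0
y = np.sign(X @ w_star)             # linearly separable by construction

def epsilon_adaboost(X, y, epsilon=0.01, n_iter=20000):
    """Coordinate descent on the exponential loss with step size epsilon:
    one classical rendering of AdaBoost with coordinates as weak learners."""
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        u = np.exp(-y * (X @ w))    # exponential-loss example weights
        corr = (y * u) @ X          # negative gradient, one entry per coordinate
        j = np.argmax(np.abs(corr)) # weak learner most correlated with residuals
        w[j] += epsilon * np.sign(corr[j])
    return w

w_boost = epsilon_adaboost(X, y)
margins = y * (X @ w_boost)
# The normalized L1-margin of the boosting iterate approaches the
# max-L1-margin attained by the LP interpolant as epsilon -> 0.
print("normalized L1-margin:", margins.min() / np.abs(w_boost).sum())
```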
Information about the speaker
Pragya Sur is an Assistant Professor in the Statistics Department at Harvard University.
Her research broadly spans high-dimensional statistics and statistical machine
learning. A major part of her work focuses on developing the theoretical
underpinnings of statistical inference procedures applicable for
high-dimensional data. She simultaneously works on the statistical properties
of modern machine learning algorithms, in particular, ensemble learning
algorithms. Recently, she has been interested in developing theory and methods
for causal inference in high dimensions. On the applied side, she finds
interest in developing computationally scalable statistical methods with a
focus on problems arising from statistical genetics. Her current research is
supported by an NSF DMS Award and a William F. Milton Fund Award. Previously,
she spent a year as a postdoctoral fellow at the Center for Research on
Computation and Society at Harvard. She completed a Ph.D. in Statistics in 2019
from Stanford University, where she received the Ric Weiland Graduate
Fellowship (2017-2019) and the 2019 Theodore W. Anderson Theory of Statistics
Dissertation Award.
The event is hybrid: zoom-link, password: mondays23
Category:
Seminar
Location:
Analysen, meeting room, EDIT staircase D, E and F, Campus Johanneberg
Starts:
07 December, 2022, 14:00
Ends:
07 December, 2022, 15:00