Abstract
Fitting a model to a collection of observations is one of the quintessential
problems in statistics. The typical assumption is that the data was generated by a model
of a given type (e.g., a mixture model). This is a simplifying assumption that is only
approximately valid, as real datasets are typically exposed to some source of contamination.
Hence, any estimator designed for a particular model must also be robust in the presence
of corrupted data. Until recently, even for the most basic problem of robustly estimating
the mean of a high-dimensional dataset, all known robust estimators were hard to compute.
A recent line of work in theoretical computer science obtained the first computationally
efficient robust estimators for a range of high-dimensional estimation tasks. In this tutorial
talk, we will survey the algorithmic techniques underlying these estimators and the connections
between them. We will illustrate these techniques for the problems of robust mean and covariance
estimation. Finally, we will discuss new directions and opportunities for future work.
Slides
[ppt] [pdf]
Video
[lecture_1] [lecture_2]
Key References