Advanced Learning Theory (CS880)
Fall 2019
This course is a graduate introduction to the information-theoretic and computational aspects of machine learning.
Our focus will be on the design and analysis of statistically and computationally efficient learning and inference algorithms in a variety
of well-defined models.
The course is targeted towards graduate students who want to gain familiarity with mathematical and algorithmic tools for the analysis of large datasets. Advanced undergraduate students with sufficient mathematical maturity are welcome (subject to instructor's approval).
Course Information
- Instructor: Ilias Diakonikolas
Office Hours: After class and by appointment.
- Lectures: Mon/Wed/Fri 14:30-15:45, 1257 Computer Sciences and Statistics.
Prerequisites
Mathematical maturity. Solid background in algorithms, linear algebra, and undergraduate probability/statistics.
Course Outline
The course will have a significant dynamic component, depending on the audience interests and current research in the area.
Here is a rough outline of the course material:
- Distribution Property Testing (Hypothesis Testing), Nonparametric Estimation
- Robust High-Dimensional Statistics, Adversarially Robust Learning
- Learning Latent Variable Models: Spectral Algorithms, Method of Moments, Tensor Methods
- PAC Learning with Noise, Statistical Queries, Agnostic Learning, Fourier Analysis, Noise Sensitivity and Influences
Lectures
- Lectures 1 and 2 Introduction.
Supervised versus Unsupervised Learning. Examples: Learning Linear Separators, Learning Discrete Distributions. Introduction to Property Testing.
Scribe notes here.
- Lecture 3 Algorithm for Testing Uniformity via Collisions.
Scribe notes here.
- Lecture 4 Introduction to Sample Complexity Lower Bounds: Overview of Lower Bounds for Density Estimation and Uniformity Testing.
Scribe notes here.
- Lecture 5 Basics of Information-Theory. Proof of Lower Bound for Uniformity Testing.
Scribe notes here.
- Lecture 6 Testing Equivalence Between Unknown Distributions. L2 Testing and the Flattening Technique.
Scribe notes here.
- Lecture 7 Analysis of Tolerant L2 Testing. Testing Equivalence with Unequal Sized Samples via the Flattening Technique.
Scribe notes here.
- Lecture 8 Overview of First Problem Set. Testing Independence via the Flattening Technique. Introduction
to Sample Lower Bound for Equivalence Testing.
- Lecture 9 Sample Complexity Lower Bound for Equivalence Testing.
Scribe notes here.
- Lecture 10 Overview of Current Research in Distribution Property Testing: Testing Structured Distributions in Low and High-Dimensions,
Privacy, Communication. Scribe notes here.
- Lecture 11 Tolerant Testing under Various Metrics. Sample Complexity Lower Bound for Tolerant Uniformity Testing in L1-Distance.
Scribe notes here.
- Lecture 12 Sample Complexity Upper Bound for Tolerant Uniformity Testing in L1-Distance.
- Lecture 13 Introduction to Robust Learning and Statistics.
- Lecture 14 Robust Mean Estimation in One Dimension: Median, Trimmed Mean, and Information-Theoretic Limits
- Lecture 15 Analysis of Truncated Mean; Introduction to High-Dimensional Mean Estimation.
- Lecture 16 Sample-Efficient Robust Estimators for Mean; Intuition for Efficient Algorithms
Course Evaluation
Homework Assignments: There will be 3 homework assignments that will count for 30% of the grade. The assignments will be proof-based and are intended to be challenging.
Collaboration and discussion among students is allowed, though students must write up their solutions independently.
- First Problem Set: [pdf] (due on October 7)
Course Project: A major part of the course (50% of the grade) is a research project on a topic related to algorithmic aspects of learning theory. Projects can be completed individually or in groups of two students.
The goal of the project is to become an expert in an area related to the class material, and potentially contribute to the state of the art. There are two aspects to the course project. The first is a literature review: Students must decide on a topic and a list of papers, understand these papers in depth, and write a survey presentation of the results in their own words. The second aspect is to identify a research problem/direction on the chosen topic, think about it, and describe the progress they make.
Students must consult with the instructor during the first half of the course for help in forming project teams, selecting a suitable project topic, and selecting a suitable set of research papers.
Students will be graded on the project proposal (10%), the progress report (10%), and the final report (20%). Each student will give a project presentation (10%) during the last week of classes.
The remaining part of the grade will be based on class participation (10%) and scribing one or more lecture notes (10%).
Possible project topics can be found here.
Readings
There is no required textbook for the course.
A list of readings will be given during the semester.
A few online books on the topic that we will occasionally
use as references are: