Introduction to Computational Learning Theory (COMP SCI 639)
Spring 2023
This course will develop the core concepts and techniques of computational learning theory. We will examine the inherent abilities and limitations of learning algorithms in well-defined learning models. In particular, we will study algorithmic problems in supervised learning, where the goal is to infer a function from a set of labeled observations. We will study algorithms for learning Boolean functions from labeled examples in a number of models (online learning, PAC learning, SQ learning, learning with noise, etc.).
Course Information
- Instructor: Ilias Diakonikolas
- Office Hours: TBA
- Lectures: Tuesday and Thursday, 11:00-12:15, Engineering 2534.
Prerequisites
Mathematical maturity. Background in undergraduate algorithms.
Course Outline
Here is an outline of the course material:
- Online Learning: Winnow, Best Experts, Weighted Majority
- PAC Learning, Relation to Online Learning, Occam's Razor
- VC Dimension and Sample Complexity
- Learning Decision Trees and DNFs
- Learning with Noise: Classification Noise, Malicious Noise
- Statistical Query Learning
- Distribution Dependent Learning, Fourier Transform
- Computational Hardness of Learning
- Learning with Membership and Equivalence Queries
- Other Models of Learning: Semi-supervised Learning, Active Learning
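As a concrete taste of the online-learning unit above, here is a minimal sketch of the deterministic Weighted Majority algorithm (predict by weighted vote, multiplicatively penalize experts that err). The function name, toy expert setup, and penalty parameter are illustrative assumptions, not course materials.

```python
def weighted_majority(expert_preds, labels, beta=0.5):
    """Deterministic Weighted Majority over a sequence of rounds.

    expert_preds: one list of {0,1} expert predictions per round.
    labels: the true {0,1} outcome for each round.
    Returns (number of mistakes made, final expert weights).
    """
    n = len(expert_preds[0])
    weights = [1.0] * n
    mistakes = 0
    for preds, y in zip(expert_preds, labels):
        # Predict with the side holding more total weight.
        vote1 = sum(w for w, p in zip(weights, preds) if p == 1)
        vote0 = sum(w for w, p in zip(weights, preds) if p == 0)
        guess = 1 if vote1 >= vote0 else 0
        if guess != y:
            mistakes += 1
        # Multiplicatively penalize every expert that erred this round.
        weights = [w * beta if p != y else w
                   for w, p in zip(weights, preds)]
    return mistakes, weights
```

The mistake bound covered in class says the algorithm's mistakes are within a constant factor of the best single expert's mistakes, plus a logarithmic dependence on the number of experts.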
Lectures
- Lecture 1 Introduction to Computational Learning Theory. Supervised Learning.
- Lecture 2 Introduction to Online Learning.
- Lecture 3 Online Learning of Disjunctions and Decision Lists.
- Lecture 4 Winnow and Perceptron Algorithms.
- Lecture 5 More on Winnow and Perceptron. Introduction to VC dimension.
- Lecture 6 VC dimension and lower bound on online learning. Weighted Majority Algorithm.
- Lecture 7 Analysis of Weighted Majority. Randomized Weighted Majority and Its Analysis.
- Lecture 8 Introduction to PAC Learning.
- Lecture 9 PAC Learning Continued. Learning Intervals and Rectangles. Reduction from Online Learning to PAC Learning.
- Lecture 10 Finding a Consistent Hypothesis and Occam’s Razor. Cardinality Version of Occam’s Razor.
- Lecture 11 Greedy Set Cover Heuristic. Using the Cardinality Version of Occam’s Razor to PAC Learn Sparse Disjunctions with Near-Optimal Sample Complexity.
- Lecture 12 Hypothesis Testing. Basic Concentration Inequalities. Proper vs. Non-proper PAC Learning.
- Lecture 13 Proper vs. Non-proper Learning Continued. NP-hardness of Properly Learning 3-term DNFs. Efficient Algorithm for Non-proper Learning of 3-term DNFs.
- Lecture 14 VC Dimension Characterizes Sample Complexity of PAC Learning. Proof of Sample Complexity Lower Bound.
- Lecture 15 VC Dimension Characterizes Sample Complexity of PAC Learning. Sauer’s Lemma and Start of Sample Complexity Upper Bound Proof.
- Lecture 16 VC Dimension Characterizes Sample Complexity of PAC Learning. Proof of Sauer’s Lemma and Upper Bound Proof Continued.
- Lecture 17 Efficient Learning of Linear Threshold Functions. Introduction to Boosting. Schapire’s Three-Stage Booster.
- Lecture 18 Schapire’s Three-Stage Booster Continued.
- Lecture 19 Introduction to Boosting via the Sampling Approach. AdaBoost.
- Lecture 20 AdaBoost Algorithm and Analysis Continued.
- Lecture 21 Introduction to Learning with Noise. Random Classification Noise, Malicious Noise.
- Lecture 22 Information-Theoretic Lower Bound on Learning with Malicious Noise. General Approach for Learning with Malicious Noise.
- Lecture 23 Random Classification Noise (RCN). Learning Monotone Disjunctions with RCN. Introduction to the Statistical Query Model.
- Lecture 24 Statistical Query (SQ) Learning Model. SQ Learning Implies Learning with Random Classification Noise.
- Lecture 25 Hardness of Learning: Representation-Dependent vs. Representation-Independent Hardness. Worst-Case vs. Average-Case Assumptions.
- Lecture 26 Hardness of Learning Continued. Cryptographic Hardness.
- Lecture 27 Hardness of Learning Assuming Hardness of Refuting Random CSPs.
- Lecture 28 Additional Topics: Active Learning, Unsupervised Learning.
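To illustrate the flavor of the algorithms in Lectures 4-5, here is a minimal sketch of the classic Perceptron algorithm for linear threshold functions through the origin. The function signature and the toy dataset are hypothetical, for illustration only; the mistake-bound analysis is what the lectures develop.

```python
def perceptron(examples, max_passes=100):
    """Classic Perceptron. examples are (x, y) pairs with x a tuple of
    floats and y in {-1, +1}. Returns a weight vector w such that
    sign(w . x) == y for all examples, assuming the data are linearly
    separable through the origin (otherwise it stops after max_passes)."""
    dim = len(examples[0][0])
    w = [0.0] * dim
    for _ in range(max_passes):
        updated = False
        for x, y in examples:
            if y * sum(wi * xi for wi, xi in zip(w, x)) <= 0:
                # Mistake: nudge w toward the misclassified point y * x.
                w = [wi + y * xi for wi, xi in zip(w, x)]
                updated = True
        if not updated:  # A full pass with no mistakes: done.
            break
    return w
```

The Perceptron convergence theorem covered in class bounds the number of updates by (R / gamma)^2, where R bounds the example norms and gamma is the margin of the separating hyperplane.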
Course Evaluation
Homework Assignments: There will be 5 to 6 homework assignments that will count for 60% of the grade.
The assignments will be proof-based and are intended to be challenging.
Collaboration and discussion among students is allowed, though students must write up their solutions independently.
Exam: There will be a take-home exam that will count for 30% of the grade.
The remaining part of the grade will be based on class participation (10%).
Readings
The textbook for this course is:
An additional textbook (available online) we will use is: