Introduction to Computational Learning Theory (COMP SCI 639)
Spring 2020
This course will develop the core concepts and techniques of computational learning theory. We will examine the inherent abilities and limitations of learning algorithms in well-defined learning models, focusing on algorithmic problems in supervised learning. The goal of supervised learning is to infer a function from a set of labeled observations. We will study algorithms for learning Boolean functions from labeled examples in a number of models (online learning, PAC learning, SQ learning, learning with noise, etc.).
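To give a concrete flavor of the kind of problem studied in the course, here is a minimal Python sketch (not taken from the course materials) of the classic elimination algorithm for learning a monotone disjunction from labeled Boolean examples; the function names and toy data below are illustrative only, and the sketch assumes the target concept really is a monotone disjunction.

# Minimal illustrative sketch: elimination algorithm for monotone disjunctions.
# Each example is a pair (x, y), where x is a list of 0/1 bits and y is its label.

def learn_monotone_disjunction(examples, n):
    # Start with the disjunction of all n variables; every negative example rules out
    # each variable set to 1 in it (otherwise that example would have been positive).
    relevant = set(range(n))
    for x, y in examples:
        if y == 0:
            relevant -= {i for i in range(n) if x[i] == 1}
    return relevant  # hypothesis: OR of the surviving variables

def predict(relevant, x):
    # Evaluate the learned disjunction on a new point x.
    return int(any(x[i] == 1 for i in relevant))

# Toy run: the hidden target is x0 OR x2 over n = 4 variables.
data = [([1, 0, 0, 0], 1), ([0, 1, 0, 0], 0), ([0, 0, 1, 1], 1), ([0, 0, 0, 1], 0)]
h = learn_monotone_disjunction(data, n=4)
print(h)                        # {0, 2}
print(predict(h, [0, 1, 1, 0])) # 1

If the target is indeed a monotone disjunction, the returned hypothesis is consistent with every training example; a central theme of the course is when and why such consistency yields guarantees on unseen data (e.g., via Occam's Razor and VC dimension).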
Course Information
- Instructor: Ilias Diakonikolas
Office Hours: Tuesday/Thursday, 12-1pm.
- Teaching Assistant: Nikos Zarifis (zarifis@wisc.edu)
Office Hours: Monday/Friday, 11:30-12:30, CS 4384
- Lectures: Tuesday, Thursday 1:00-2:15, COMP SCI 1325.
Prerequisites
Mathematical maturity. Background in undergraduate algorithms.
Course Outline
Here is an outline of the course material:
- Online Learning: Winnow, Best Experts, Weighted Majority
- PAC Learning, Relation to Online Learning, Occam's Razor
- VC Dimension and Sample Complexity
- Learning Decision Trees and DNFs
- Learning with Noise: Classification Noise, Malicious Noise
- Statistical Query Learning
- Distribution Dependent Learning, Fourier Transform
- Computational Hardness of Learning
- Learning with Membership and Equivalence Queries
- Other Models of Learning: Semi-supervised Learning, Active Learning
Lectures
- Lecture 1 (January 21) Introduction to computational learning theory.
Supervised Learning.
- Lecture 2 (January 23) Introduction to Online Learning.
- Lecture 3 (January 28) Online Learning of Disjunctions and Decision Lists.
- Lecture 4 (January 30) Winnow and Perceptron Algorithms.
- Lecture 5 (February 4) More on Winnow and Perceptron. Introduction to VC dimension.
- Lecture 6 (February 6) VC dimension and lower bound on online learning. Weighted Majority Algorithm.
- Lecture 7 (February 11) Analysis of Weighted Majority. Randomized Weighted Majority and its Analysis.
- Lecture 8 (February 13) Introduction to PAC Learning.
- Lecture 9 (February 18) PAC Learning continued. Learning Intervals
and Rectangles. Reduction from Online Learning to PAC Learning.
- Lecture 10 (February 20) Finding a Consistent Hypothesis and Occam’s Razor.
Cardinality version of Occam’s Razor.
- Lecture 11 (February 25) Greedy Set Cover Heuristic. Using cardinality version
of Occam’s razor to PAC learn sparse disjunctions with near-optimal sample complexity.
- Lecture 12 (February 27) Hypothesis Testing. Basic Concentration Inequalities.
Proper vs Non-proper PAC Learning.
- Lecture 13 (March 3) Proper vs Non-proper Learning Continued. NP-hardness of
properly learning 3-term DNFs. Efficient Algorithm for Non-proper Learning of 3-term DNFs.
- Lecture 14 (March 5) VC Dimension Characterizes Sample Complexity of PAC Learning.
Proof of Sample Complexity Lower Bound.
- Lecture 15 (March 10) VC Dimension Characterizes Sample Complexity of PAC Learning.
Sauer’s Lemma and Start of Sample Complexity Upper Bound Proof.
- Lecture 16 (March 12) VC Dimension Characterizes Sample Complexity of PAC Learning.
Proof of Sauer’s Lemma and Upper Bound Proof Continued.
- Lecture 17 (March 24) Efficient Learning of Linear Threshold Functions. Introduction to Boosting.
Schapire’s Three-Stage Booster.
- Lecture 18 (March 26) Schapire’s Three-Stage Booster Continued.
- Lecture 19 (March 31) Introduction to Boosting via the Sampling Approach. AdaBoost.
- Lecture 20 (April 2) AdaBoost Algorithm and Analysis Continued.
- Lecture 21 (April 7) Introduction to Learning with Noise. Random Classification Noise, Malicious Noise.
- Lecture 22 (April 9) Information-Theoretic Lower Bound on Learning with Malicious Noise.
General Approach for Learning with Malicious Noise.
- Lecture 23 (April 14) Random Classification Noise (RCN). Learning Monotone Disjunctions
with RCN. Introduction to Statistical Query Model.
- Lecture 24 (April 16) Statistical Query (SQ) Learning Model.
SQ Learning Implies Learning with Random Classification Noise.
- Lecture 25 (April 21) Hardness of Learning: Representation-Dependent vs. Representation-Independent Hardness.
Worst-Case vs. Average-Case Assumptions.
- Lecture 26 (April 23) Hardness of Learning Continued. Cryptographic Hardness.
- Lecture 27 (April 28) Hardness of Learning Assuming Hardness of Refuting Random CSPs.
- Lecture 28 (April 30) Additional Topics: Active Learning, Unsupervised Learning.
Course Evaluation
Homework Assignments: There will be 4 homework assignments that will count for 60% of the grade.
The assignments will be proof-based and are intended to be challenging.
Collaboration and discussion among students are allowed, though students must write up their solutions independently.
Course Project: Part of the course grade (25%) comes from an independent project
on a topic related to learning theory. Projects can be completed individually or in groups of two students.
The goal of the project is to become an expert in an area related to the class material, and potentially contribute to the state of the art.
There are two aspects to the course project. The first is a literature review: Students must decide on a topic and a list of papers,
understand these papers in depth, and write a survey presentation of the results in their own words.
The second aspect is to identify a research problem or direction on the chosen topic, work on it,
and describe any progress made.
Students must consult with the instructor during the first half of the course for
help in forming project teams, selecting a suitable project topic, and choosing an appropriate set of research papers.
Students will be graded on the project proposal (5%), the progress report (5%), and the final report (15%).
The remaining part of the grade will be based on class participation (15%).
Readings
The textbook for this course is:
An additional textbook (available online) we will use is: