Introduction to Computational Learning Theory (COMP SCI 639)
Fall 2020
This course will focus on developing the core concepts and techniques of computational learning theory. We will examine the inherent abilities and limitations of learning algorithms in well-defined learning models. Specifically, the course will focus on algorithmic problems in supervised learning. The goal of supervised learning is to infer a function from a set of labeled observations. We will study algorithms for learning Boolean functions from labeled examples in a number of models (including online mistake-bound learning, PAC learning, SQ learning, and learning with various types of noise).
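As a concrete illustration of this setup, here is a minimal sketch of the classic elimination algorithm for learning a monotone disjunction of Boolean attributes from labeled examples, a topic covered early in the lectures; the toy data set and the function name are illustrative assumptions, not course-supplied code.

```python
# Minimal sketch: learning a monotone disjunction (an OR of some subset of
# Boolean attributes) from labeled examples, assuming the data really is
# labeled by some monotone disjunction (the realizable case).

def learn_monotone_disjunction(examples):
    """examples: list of (x, y) with x a tuple of 0/1 attributes, y in {0, 1}."""
    n = len(examples[0][0])
    candidates = set(range(n))          # start with every attribute as a candidate
    for x, y in examples:
        if y == 0:
            # A negative example rules out every attribute it sets to 1:
            # that attribute cannot appear in the target disjunction.
            candidates -= {i for i in range(n) if x[i] == 1}
    # Hypothesis: OR of the surviving attributes.
    return lambda x: int(any(x[i] == 1 for i in candidates))

# Toy data labeled by the disjunction x0 OR x2 (hypothetical example).
data = [((1, 0, 0, 1), 1), ((0, 1, 0, 0), 0), ((0, 0, 1, 1), 1), ((0, 1, 0, 1), 0)]
h = learn_monotone_disjunction(data)
print([h(x) for x, _ in data])  # consistent with the labels: [1, 0, 1, 0]
```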
Course Information
- Instructor: Ilias Diakonikolas
Office Hours: TBD
- Teaching Assistants: Yuxin Sun (yxsun@wisc.edu), Nikos Zarifis (zarifis@wisc.edu)
Office Hours: TBD
- Lectures: The course is currently offered asynchronously.
Prerequisites
Mathematical maturity. Background in undergraduate algorithms.
Course Outline
Here is an outline of the course material:
- Online Learning: Winnow, Best Experts, Weighted Majority
- PAC Learning, Relation to Online Learning, Occam's Razor
- VC Dimension and Sample Complexity
- Learning Decision Trees and DNFs
- Learning with Noise: Classification Noise, Malicious Noise
- Statistical Query Learning
- Distribution Dependent Learning, Fourier Transform
- Computational Hardness of Learning
- Learning with Membership and Equivalence Queries
- Other Models of Learning: Semi-supervised Learning, Active Learning
Lectures
- Lecture 1 Introduction to Computational Learning Theory.
Supervised Learning.
- Lecture 2 Introduction to Online Learning.
- Lecture 3 Online Learning of Disjunctions and Decision Lists.
- Lecture 4 Winnow and Perceptron Algorithms.
- Lecture 5 More on Winnow and Perceptron. Introduction to VC dimension.
- Lecture 6 VC dimension and lower bound on online learning. Weighted Majority Algorithm.
- Lecture 7 Analysis of Weighted Majority. Randomized Weighted Majority and its Analysis (a code sketch of Weighted Majority follows this lecture list).
- Lecture 8 Introduction to PAC Learning.
- Lecture 9 PAC Learning continued. Learning Intervals
and Rectangles. Reduction from Online Learning to PAC Learning.
- Lecture 10 Finding a Consistent Hypothesis and Occam’s Razor.
Cardinality version of Occam’s Razor.
- Lecture 11 Greedy Set Cover Heuristic. Using cardinality version
of Occam’s razor to PAC learn sparse disjunctions with near-optimal sample complexity.
- Lecture 12 Hypothesis Testing. Basic Concentration Inequalities.
Proper vs Non-proper PAC Learning.
- Lecture 13 Proper vs Non-proper Learning Continued. NP-hardness of
properly learning 3-term DNFs. Efficient Algorithm for Non-proper Learning of 3-term DNFs.
- Lecture 14 VC Dimension Characterizes Sample Complexity of PAC Learning.
Proof of Sample Complexity Lower Bound.
- Lecture 15 VC Dimension Characterizes Sample Complexity of PAC Learning.
Sauer’s Lemma and Start of Sample Complexity Upper Bound Proof.
- Lecture 16 VC Dimension Characterizes Sample Complexity of PAC Learning.
Proof of Sauer’s Lemma and Upper Bound Proof Continued.
- Lecture 17 Efficient Learning of Linear Threshold Functions. Introduction to Boosting.
Schapire’s Three-Stage Booster.
- Lecture 18 Schapire’s Three-Stage Booster Continued.
- Lecture 19 Introduction to Boosting via the Sampling Approach. AdaBoost.
- Lecture 20 AdaBoost Algorithm and Analysis Continued.
- Lecture 21 Introduction to Learning with Noise. Random Classification Noise, Malicious Noise.
- Lecture 22 Information-Theoretic Lower Bound on Learning with Malicious Noise.
General Approach for Learning with Malicious Noise.
- Lecture 23 Random Classification Noise (RCN). Learning Monotone Disjunctions
with RCN. Introduction to Statistical Query Model.
- Lecture 24 Statistical Query (SQ) Learning Model.
SQ Learning Implies Learning with Random Classification Noise.
- Lecture 25 Hardness of Learning: Representation-Dependent vs. Representation-Independent Hardness.
Worst-Case vs. Average-Case Assumptions.
- Lecture 26 Hardness of Learning Continued. Cryptographic Hardness.
- Lecture 27 Hardness of Learning Assuming Hardness of Refuting Random CSPs.
- Lecture 28 Additional Topics: Active Learning, Unsupervised Learning.
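To give a flavor of the online-learning material at the start of the course, here is a minimal sketch of the deterministic Weighted Majority algorithm for prediction with expert advice (see Lectures 6 and 7); the penalty parameter beta = 1/2 and the toy expert streams are illustrative choices, not fixed by the course.

```python
# Minimal sketch of the deterministic Weighted Majority algorithm for
# prediction with expert advice (0/1 predictions). The halving penalty
# (beta = 0.5) and the toy data below are illustrative choices.

def weighted_majority(expert_predictions, outcomes, beta=0.5):
    """expert_predictions: list of T rounds, each a list of 0/1 expert predictions.
    outcomes: list of T true 0/1 labels. Returns the number of mistakes made."""
    n = len(expert_predictions[0])
    weights = [1.0] * n
    mistakes = 0
    for preds, y in zip(expert_predictions, outcomes):
        # Predict with the weighted majority vote of the experts.
        vote_one = sum(w for w, p in zip(weights, preds) if p == 1)
        vote_zero = sum(w for w, p in zip(weights, preds) if p == 0)
        guess = 1 if vote_one >= vote_zero else 0
        if guess != y:
            mistakes += 1
        # Multiplicatively penalize every expert that was wrong this round.
        weights = [w * beta if p != y else w for w, p in zip(weights, preds)]
    return mistakes

# Toy run: 3 experts over 4 rounds (hypothetical data).
preds = [[1, 0, 1], [0, 0, 1], [1, 1, 1], [0, 0, 0]]
truth = [1, 0, 1, 0]
print(weighted_majority(preds, truth))  # mistakes made by the combined predictor
```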
Course Evaluation
Homework Assignments: There will be 4 homework assignments that will count for 60% of the grade.
The assignments will be proof-based and are intended to be challenging.
Collaboration and discussion among students are allowed, though students must write up their solutions independently.
Course Project: A part of the course (40% of the grade) is an independent project
on a topic related to learning theory. Projects can be completed individually or in groups of two students.
The goal of the project is to become an expert in an area related to the class material, and potentially contribute to the state of the art.
There are two aspects to the course project. The first is a literature review: Students must decide on a topic and a list of papers,
understand these papers in depth, and write a survey presentation of the results in their own words.
The second aspect is to identify a research problem/direction on the chosen topic, think about it,
and describe the progress they make.
Students must consult with the instructor during the first half of the course for
help with forming project teams, selecting a suitable project topic, and choosing an appropriate set of research papers.
Students will be graded on the project proposal (5%), the progress report (5%), and the final report (30%).
Readings
The textbook for this course is:
An additional textbook (available online) we will use is: