CMPSCI 697L
Deep Learning
Fall 2015
Course description:
Deep learning is a recent breakthrough in the field of machine
learning that has become highly popular, due in large part to its
success on extremely difficult high-dimensional problems, ranging
from computer vision and speech recognition to natural language
processing and reinforcement learning. Large research groups have
formed at companies such as Baidu, Facebook, Google, IBM, and
Microsoft, as well as at scores of smaller startups.
This course will provide a state-of-the-art introduction to both the
theory and practice of deep learning. The course can be categorized
broadly into the following topics:
- Historical background and previous work on deep neural networks
- The problem of representation discovery, and linear/nonlinear approaches
- Deep learning models: autoencoders, restricted Boltzmann machines, deep belief networks, convolutional networks, and feedforward networks
- Algorithms for training deep networks
- Applications of deep learning to vision, speech, natural language, and reinforcement learning
- Software packages for training deep networks
- GPU-based methods for training deep networks
- Theory of deep networks, including algorithmic stability and theoretical capabilities
Lectures: Friday 9:00-12:00, Room 142, CS Building
Course Schedule, Readings, etc.
Prerequisites: Graduate-level exposure to machine learning
and artificial intelligence; undergraduate-level linear algebra,
probability theory and statistics, and algorithmic analysis;
familiarity with high-level programming languages such as Python,
C++, or Java; knowledge of Python or MATLAB is helpful, but not
required. Please talk with the instructor if you want to take the
course but have doubts about your qualifications.
Textbooks:
Deep learning is a new field, and there are no textbooks yet.
We will rely on a number of tutorial papers as well as research papers for background reading. A few of them are listed below.
- Learning Deep Architectures for AI, by Yoshua Bengio, Foundations and Trends in Machine Learning, vol. 2, no. 1, pp. 1-127, 2009.
- Learning representations by back-propagating errors, by David Rumelhart, Geoffrey Hinton, and Ronald Williams, Nature, vol. 323, October 1986.
- Reducing the dimensionality of data with neural networks, by Geoffrey Hinton and Ruslan Salakhutdinov, Science, vol. 313, July 2006.
- Extracting and Composing Robust Features with Denoising Autoencoders, by Pascal Vincent et al., ICML 2008.
- Human-level control through deep reinforcement learning, by Volodymyr Mnih et al., Nature, vol. 518, February 2015.
- Learning with Pseudo-Ensembles, by Philip Bachman, Ouais Alsharif, and Doina Precup, NIPS 2014.
Moodle:
All students should get access to Moodle (see
moodle.umass.edu) using their OIT login accounts. All lectures,
assignments, and grades will be posted on Moodle.
Credit: 3 units
Instructor:
Professor Sridhar Mahadevan (mahadeva AT cs DOT umass DOT edu)
- Office hours: Monday/Wednesday 10:00-11:00 a.m. (or by
appointment), Room 204, CS Building
Piazza Discussion Forum: All students should sign up for an account on the class forum at piazza.com.
Assigned work:
There will be one midterm mini-project and a group final project,
in addition to readings and class participation. Each mini-project
must be completed by students working on their own, with no help
from other students, apart from general discussion of the questions.
Grading:
- Mini Projects (40%)
- Independent activities (20%)
- Final Group Project (40%)