CMPSCI 697L
Deep Learning
Fall 2015
Course description:
Deep learning is a recent breakthrough in the field of machine
learning that has become highly popular, due in large part to its
success on extremely difficult high-dimensional problems, ranging
from computer vision and speech recognition to natural language
processing and reinforcement learning. Large research groups have
formed at companies such as Baidu, Facebook, Google, IBM, and
Microsoft, as well as at scores of smaller startups.
This course will provide a state-of-the-art introduction to both the
theory and practice of deep learning. The course can be categorized
broadly into the following topics:
- Historical background and previous work on deep neural networks
- The problem of representation discovery, and linear/nonlinear approaches
- Deep learning models: autoencoders, restricted Boltzmann machines, deep belief networks, convolutional networks, and feedforward networks
- Algorithms for training deep networks
- Applications of deep learning to vision, speech, natural language, and reinforcement learning
- Software packages for training deep networks
- GPU-based methods for training deep networks
- Theory of deep networks, including algorithmic stability and theoretical capabilities
Lectures: Friday 9:00-12:00, Room 142, CS Building
Course Schedule, Readings, etc.
Prerequisites: Graduate-level exposure to machine learning
and artificial intelligence; undergraduate-level linear algebra,
probability theory and statistics, and algorithmic analysis;
familiarity with high-level programming languages such as Python,
C++, or Java; knowledge of Python or MATLAB is helpful, but not
required. Please talk with the instructor if you want to take the
course but have doubts about your qualifications.
Textbooks:
Deep learning is a new field, and there are no textbooks yet.
We will rely on a number of tutorial papers as well as research papers for background reading. A few of them are listed below.
- Learning Deep Architectures for AI, by Yoshua Bengio, Foundations and Trends in Machine Learning, vol. 2, no. 1, pp. 1-127, 2009.
- Learning representations by back-propagating errors, by David Rumelhart, Geoffrey Hinton, and Ronald Williams, Nature, vol. 323, October 1986.
- Reducing the dimensionality of data with neural networks, by Geoffrey Hinton and Ruslan Salakhutdinov, Science, vol. 313, July 2006.
- Extracting and Composing Robust Features with Denoising Autoencoders, by Pascal Vincent et al., ICML 2008.
- Human-level control through deep reinforcement learning, by Volodymyr Mnih et al., Nature, vol. 518, February 2015.
- Learning with Pseudo-Ensembles, by Philip Bachman, Ouais Alsharif, and Doina Precup, NIPS 2014.
Moodle:
All students should get access to Moodle (see
moodle.umass.edu) using their OIT login accounts. All lectures,
assignments, and grades will be posted on Moodle.
Credit: 3 units
Instructor:
Professor Sridhar Mahadevan (mahadeva AT cs DOT umass DOT edu)
- Office hours: Monday/Wednesday 10:00-11:00 a.m. (or by
appointment), Room 204, CS Building
Piazza Discussion Forum: All students should sign up for an account on the class forum at piazza.com.
Assigned work:
There will be one midterm mini-project and a group final project,
in addition to readings and class participation. Each mini-project
must be completed by students working on their own, with no help
from other students, apart from general discussion of the questions.
Grading:
- Mini Projects (40%)
- Independent activities (20%)
- Final Group Project (40%)