Deep Learning for Computer Vision, Speech, and Language

Time & Location
7:00-9:30pm, Tuesday, Fall 2018
Mudd Building 633

Co-taught by

Liangliang Cao (liangliang.cao_at_gmail.com)
Xiaodong Cui (xdcuibruin_at_gmail.com)
Kapil Thadani (kapil_at_cs.columbia.edu)

Guest Lecturers

Teaching Assistants

Rajath Kumar (rm3497@columbia.edu): Handling Assignments 1 & 2
Qiao Zhang (qz2301@columbia.edu): Handling Assignments 3 & 4

Office Hours

Location: TA Room (Mudd 122A)
Friday - Qiao: 4:00 - 6:00 p.m.
Thursday - Rajath: 4:00 - 6:00 p.m.

Course Introduction

This graduate-level research class focuses on deep learning techniques for vision, speech, and natural language processing problems. It gives an overview of deep learning models and techniques and surveys recent advances in these fields. Expect four homework assignments and one final project, with a heavy programming workload.

Programming

This course uses TensorFlow as the primary programming tool. Other toolkits, including PyTorch and MXNet, are also welcome.

All programming problems in the homework should be done in an IPython Notebook. Both code and experimental results are required.

Google Cloud will be used as the main programming platform. You can also try Colab, which provides notebooks with GPU access. Students are also encouraged to equip their own computers with GPU cards.

Grading

~~40%~~ 30% homework
20% paper presentation and course attendance
~~40%~~ 50% final project

Submission

Homework should be uploaded to CourseWorks.
Upload the IPython notebook (.ipynb) file rather than a plain Python file.
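
Since both code and experimental results are required, it may help to confirm before uploading that the notebook was saved with its outputs. The following is a minimal sketch, not an official course tool: it assumes the standard nbformat 4 JSON layout (a top-level "cells" list whose code cells carry "source" and "outputs"), and the function names are hypothetical.

```python
import json


def unexecuted_code_cells(notebook):
    """Return indices of non-empty code cells that have no saved outputs."""
    missing = []
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # nbformat stores source either as one string or as a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        # Skip empty cells; flag cells with code but an empty outputs list.
        if source.strip() and not cell.get("outputs"):
            missing.append(index)
    return missing


def check_notebook_file(path):
    """Load an .ipynb file (plain JSON) and report code cells without outputs."""
    with open(path, encoding="utf-8") as handle:
        return unexecuted_code_cells(json.load(handle))
```

For example, check_notebook_file("hw1.ipynb") (a hypothetical file name) returns the positions of code cells with no saved output. Cells that only define functions or import modules legitimately produce no output, so a non-empty result is a prompt to double-check, not necessarily an error.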

Recommended Books
There is no required book for this class, but the following books are recommended as further reading:

Ian Goodfellow, Yoshua Bengio, and Aaron Courville, Deep Learning
Yoav Goldberg, Neural Network Methods for Natural Language Processing