== Machine Learning 2016 ==

=== Instructors ===

Prof: Dr. Thomas Trappenberg (tt@cs.dal.ca)

Office: Room 4216 in Mona Campbell Building

TA: TBA

Office hours: By appointment (write email)
=== Course Description ===

This course is an introduction to machine learning, including its practical use and theoretical foundations. This year we will emphasize deep learning and will be using Python with advanced implementations such as scikit-learn and Google's TensorFlow. We will start by showing how to apply pre-programmed algorithms in Python to get some practical experience before unpacking some of the theory behind them. The course includes introductory reviews of scientific programming with Python. The course requires knowledge of mathematical concepts such as calculus and linear algebra, as well as the formalism of describing uncertainty with probability theory.
 
=== Course Textbook ===

There are many good textbooks on machine learning, some of which are listed below. We will not follow a single textbook; instead, I will provide lecture notes for most of the material.

Recommended textbooks for further study are:

* Kevin Murphy; Machine Learning: A Probabilistic Perspective; MIT Press

* Ethem Alpaydin; Introduction to Machine Learning; MIT Press

* Trevor Hastie, Robert Tibshirani, and Jerome Friedman; The Elements of Statistical Learning: Data Mining, Inference, and Prediction; Springer
 
=== Assignments ===

Assignments are posted in the schedule below. Late assignments are not accepted.
 
=== Grading Scheme ===

Assignments 50%, Midterm 20%, Final 30%.

Some of the assignments differ between the undergraduate and graduate versions of the course. Some of the assignments may be group work, but you have to pass all individual components in order to pass the course.
  
=== Background resources ===

The course assumes background in mathematics, probability theory, and programming. Some reviews will be included in the lectures to ensure we are on the same page, though if necessary you will need to study this further on your own. As an example of the assumed level of background, and to prepare if necessary, I recommend the following sections from Khan Academy:
 +
* Matrices: https://www.khanacademy.org/math/precalculus/precalc-matrices
 +
* Differential calculus: https://www.khanacademy.org/math/differential-calculus/taking-derivatives
 +
* Probability theory: https://www.khanacademy.org/math/probability/random-variables-topic
 +
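As a rough self-check of this assumed background, the short Python sketch below (a made-up example, assuming NumPy is installed) touches all three areas; if each line makes sense to you, you are roughly at the expected level:

```python
import numpy as np

# Linear algebra: multiply a 2x2 matrix with a vector.
A = np.array([[1.0, 2.0], [3.0, 4.0]])
x = np.array([1.0, 1.0])
print(A @ x)  # -> [3. 7.]

# Calculus: numerically approximate the derivative of f(x) = x**2 at x = 3
# with a central difference (the exact answer is 6).
f = lambda x: x ** 2
h = 1e-6
print((f(3 + h) - f(3 - h)) / (2 * h))

# Probability: expected value of a fair six-sided die.
values = np.arange(1, 7)
print(np.mean(values))  # -> 3.5
```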
 
The Python environment is installed on CS computers and we will provide a VirtualBox installation. This will be somewhat experimental, and you might need to port some material to other installations. We provide help through TA support and the help desk, but making sure that the necessary tools are available is the responsibility of the students in this advanced CS class. In general, we will be using Python 3 through the Spyder IDE with scientific packages such as NumPy, together with some specific machine learning packages such as
* scikit-learn: http://scikit-learn.org/stable/install.html

* TensorFlow and TFLearn: http://tflearn.org/installation/

and maybe

* Lea: https://bitbucket.org/piedenis/lea/wiki/Installation
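To check that the scikit-learn installation works, a first contact can be as short as the following sketch (the toy dataset and the choice of a k-nearest-neighbour classifier are purely illustrative, not part of the course material):

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Toy data: two small clusters of 2D points with labels 0 and 1.
X = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
              [1.0, 1.0], [0.9, 1.1], [1.1, 0.9]])
y = np.array([0, 0, 0, 1, 1, 1])

# Fit a 3-nearest-neighbour classifier and predict two new points.
clf = KNeighborsClassifier(n_neighbors=3)
clf.fit(X, y)
print(clf.predict([[0.05, 0.1], [1.05, 0.95]]))  # -> [0 1]
```

This fit/predict pattern is the same for essentially every scikit-learn estimator, which is why the course can swap in more advanced models later without changing the surrounding code.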
There are tutorials at many different levels available at https://wiki.python.org/moin/BeginnersGuide/Programmers

A good overview of machine learning to prepare for this course is http://www.cs.ubc.ca/~murphyk/MLbook/pml-intro-22may12.pdf. An older overview of machine learning from my perspective can be found at [[Media:MLreview.pdf|A Brief Introduction to Probabilistic Machine Learning and Its Relation to Neuroscience]], in Growing Adaptive Machines, T. Kowaliw et al. (eds.), Studies in Computational Intelligence 557, DOI: 10.1007/978-3-642-55337-0_2, Springer.
  
=== Schedule ===

The schedule below will be updated frequently with assignments, announcements, and course material. Please check it frequently, as it provides links to relevant resources and assignments.
{| class="wikitable"
!Date !! Content !! Reference !! Assignment
|-
| Sept 6 || Intro || 5414416a.pdf ||
|-
| Sept 7 || Intro || ML2016v21.pdf || Install Python for next class
|-
| Sept 14 || Python || second chapter of ML2016v21.pdf || A1: Exercises 1 & 2 on page 13. Send programs to dalhousieml2016@gmail.com with subject line A1. Due Monday Sept 19 before class.
|-
| Sept 19 || Practical classification || second chapter of ML2016v21.pdf || [[Media:A22016.pdf|A2]]
|-
| Sept 21 || Probability refresher || [[Media:ML2016v32.pdf|ML2016v32.pdf]] || Prepare presentation for next week (max 5 min)
|-
| Sept 26 || Presentations || || Remember to think about what chance level is for your application
|-
| Sept 28 || Probabilistic regression and Bayesian networks || [[Media:ML2016v4.1.pdf|4th chapter]] ||
|-
| Oct 3 || MAP & ANN 1 || [[Media:ML2016v5.1.pdf|5th chapter]] || [[Media:A32016.pdf|A3]] [[Media:train.txt|train.txt]]
|-
| Oct 5 || Probabilistic ANN || ||
|-
| Oct 12 || ANN tricks and filters || || [[Media:A42016.pdf|A4]]
|-
| Oct 17 || Convolutional deep networks || www.tensorflow.org ||
|-
| Oct 19 || Midterm || || [[Media:A52016.pdf|A5]]
|-
| Oct 24 || Deep networks with TensorFlow || [[Media:ML2016v62.pdf|6th chapter]] || [[Media:A52016.pdf|A5]]
|-
| Oct 26 || Regularization, autoencoders || [[Media:ML2016v62.pdf|6th chapter]] ||
|-
| Oct 31 || Generative models, Naive Bayes || [[Media:ML2016v71.pdf|7th chapter]] ||
|-
| Nov 2 || Unsupervised learning, expectation maximization || [[Media:ML2016v81.pdf|8th chapter]] || [[Media:A62016.pdf|A6]]
|-
| Nov 14 || Reinforcement learning || [[Media:ML2016v91.pdf|9th chapter]] [[Media:RL1.txt|RL1]] ||
|-
| Nov 21 || Reinforcement learning || updated script: [[Media:ML2016v91.pdf|9th chapter]] and [[Media:Minh2015.pdf|Minh et al. 2015]] || updated program: [[Media:RL1.txt|RL1]] and [[Media:TmazeGridTF.txt|TmazeGridTF]]; [[Media:A72016.pdf|A7]]
|}
=== Further resources ===

We created an Oracle Virtual Machine (VirtualBox) image for the programming environment, in case you have difficulties with a native installation (Windows): http://subzero.cs.dal.ca:10571/downloads/4155.ova

Some useful websites for visualizing convolutional networks are http://cs231n.github.io/convolutional-networks/ and http://scs.ryerson.ca/~aharley/vis/conv/

[[Media:DeepLearningBook.pdf.zip| DeepLearningBook.pdf.zip]]
  
 
Latest revision as of 14:02, 2 December 2016

== Academic Integrity & Plagiarism ==

(Based on the sample statement provided at http://academicintegrity.dal.ca. Written by Dr. Alex Brodsky.)

Please familiarize yourself with the university policy on Intellectual Honesty. Every suspected case will be reported.

At Dalhousie University, we respect the values of academic integrity: honesty, trust, fairness, responsibility and respect. As a student, adherence to the values of academic integrity and related policies is a requirement of being part of the academic community at Dalhousie University.


=== What does academic integrity mean? ===

Academic integrity means being honest in the fulfillment of your academic responsibilities thus establishing mutual trust. Fairness is essential to the interactions of the academic community and is achieved through respect for the opinions and ideas of others. Violations of intellectual honesty are offensive to the entire academic community, not just to the individual faculty member and students in whose class an offence occurs. (see Intellectual Honesty section of University Calendar)


=== How can you achieve academic integrity? ===

• Make sure you understand Dalhousie's policies on academic integrity.

• Give appropriate credit to the sources used in your assignment such as written or oral work, computer codes/programs, artistic or architectural works, scientific projects, performances, web page designs, graphical representations, diagrams, videos, and images. Use RefWorks to keep track of your research and edit and format bibliographies in the citation style required by the instructor (http://www.library.dal.ca/How/RefWorks)

• Do not download the work of another from the Internet and submit it as your own.

• Do not submit work that has been completed through collaboration or previously submitted for another assignment without permission from your instructor.

• Do not write an examination or test for someone else.

• Do not falsify data or lab results.

These examples should be considered only as a guide and not an exhaustive list.


=== What will happen if an allegation of an academic offence is made against you? ===

I am required to report a suspected offence. The full process is outlined in the Discipline flow chart, which can be found at http://academicintegrity.dal.ca/Files/AcademicDisciplineProcess.pdf and includes the following:

1. Each Faculty has an Academic Integrity Officer (AIO) who receives allegations from instructors.

2. The AIO decides whether to proceed with the allegation and you will be notified of the process.

3. If the case proceeds, you will receive an INC (incomplete) grade until the matter is resolved.

4. If you are found guilty of an academic offence, a penalty will be assigned ranging from a warning to a suspension or expulsion from the University and can include a notation on your transcript, failure of the assignment or failure of the course. All penalties are academic in nature.


=== Where can you turn for help? ===

• If you are ever unsure about ANYTHING, contact me.

• The Academic Integrity website (http://academicintegrity.dal.ca) has links to policies, definitions, online tutorials, and tips on citing and paraphrasing.

• The Writing Center provides assistance with proofreading, writing styles, and citations.

• Dalhousie Libraries have workshops, online tutorials, citation guides, the Assignment Calculator, RefWorks, etc.

• The Dalhousie Student Advocacy Service assists students with academic appeals and student discipline procedures.

• The Senate Office provides links to a list of Academic Integrity Officers, discipline flow chart, and Senate Discipline Committee.

== Request for special accommodation ==

Students may request accommodation as a result of barriers related to disability, religious obligation, or any characteristic under the Nova Scotia Human Rights Act. Students who require academic accommodation for either classroom participation or the writing of tests and exams should make their request to the Advising and Access Services Center (AASC) prior to or at the outset of the regular academic year. Please visit www.dal.ca/access for more information and to obtain the Request for Accommodation – Form A.

A note taker may be required as part of a student’s accommodation. There is an honorarium of $75/course/term (with some exceptions). If you are interested, please contact AASC at 494-2836 for more information.

Please note that your classroom may contain specialized accessible furniture and equipment. It is important that these items remain in the classroom, untouched, so that students who require their usage will be able to participate in the class.