M 10-10:50, Arts Hall D

T 11-11:50, Arts Hall C

- Wikipedia article on Machine Learning.
- Wikipedia article on Cross-validation.
- Wikipedia article on the Receiver operating characteristic curve.
- Wikipedia article on the Perceptron Learning Rule and related material, complete with highly revisionist history.
- Wikipedia article on the Backpropagation learning algorithm for multilayer perceptrons (MLPs).
- Wikipedia article on something we did not cover: Recurrent neural networks.
- Someone else's slides on Graphical Models.
- Wikipedia article on Support Vector Machines.
- Strong learning, weak learning, PAC learning, boosting.
- Wikipedia article on Boosting.
- Wikipedia article on AdaBoost.
- Paper: A Short Introduction to Boosting by Yoav Freund and Robert E. Schapire.
- Wikipedia article on VC Dimension.
- Wikipedia article on Probably approximately correct (PAC) learning.

- Wikipedia article on Non-negative matrix factorization (NMF).
- Multiplicative updates
- Wikipedia article on the Winnow algorithm.
- Paper on Tracking the best expert.
- Paper on The Multiplicative Weights Update method.
- Paper on The Multiplicative Weights Update Method: a Meta Algorithm and Applications which shows multiplicative weight updates in many settings outside machine learning.
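The experts setting from that survey is small enough to sketch in a few lines. This is a minimal illustration, not taken from the papers above: the loss matrix and learning rate are made up, and `mwu` is a hypothetical name.

```python
def mwu(losses, eta=0.5):
    """Multiplicative Weights Update in the experts setting.

    losses[t][i] is the loss of expert i in round t, assumed in [0, 1].
    Each round, every expert's weight is multiplied by (1 - eta * loss),
    so consistently lossy experts decay exponentially.
    Returns the final weights, normalized to sum to 1.
    """
    n = len(losses[0])
    w = [1.0] * n
    for round_losses in losses:
        w = [wi * (1.0 - eta * li) for wi, li in zip(w, round_losses)]
    total = sum(w)
    return [wi / total for wi in w]

# Illustrative losses: expert 0 is always right, expert 1 always wrong,
# expert 2 is right half the time.
losses = [[0, 1, 0], [0, 1, 1], [0, 1, 0], [0, 1, 1]]
weights = mwu(losses)
```

After four rounds the mass concentrates on the best expert, which is the whole point of the method: total loss tracks the best single expert up to an additive regret term.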

- The Viola-Jones cascade method.
- Convolutional neural networks, in 2D for OCR of handwritten digits.
- Wikipedia article on Time Delay Neural Networks for speech recognition.

- Wikipedia article on k-means clustering.
- Wikipedia article on k nearest neighbor classification.
- EM
- Wikipedia article on the Expectation Maximization (EM) algorithm, note the pretty animation.
- Wikipedia article on Hidden Markov Models.
- My own HMM Cheat Sheet

- Reinforcement Learning
- The on-line textbook and reference (slightly dated) Reinforcement Learning: An Introduction by Richard S. Sutton and Andrew G. Barto.
- Wikipedia article on Reinforcement learning
- Wikipedia article on Policy/Value Iteration.
- Wikipedia article on the Q Learning algorithm.

- The on-line textbook Information Theory, Inference, and Learning Algorithms, by David J.C. MacKay, is excellent but quite hard core.

Your goal is to train a machine learning system using the given data,
label the test data as well as you can, and send me these labels. The
format for your labels should be a 1000 line ASCII file, each line of
which is one character long: either A or B, corresponding to the label
given to the input pattern in the corresponding line of the test set
file. I will measure your error rate on the test set by comparing
your labels to the truth. Your score will be based in part on this.
You should also give me *your estimate* of the error rate you
expect me to see; I will give points for accuracy of that estimate,
subject to some reasonableness conditions. (E.g., if you just guess
random labels, and estimate an accuracy of 50%, I will not be
impressed.) And lastly, turn in a report showing what you did:
include code you wrote, graphs generated (please label the X and Y
axes) to determine parameters for algorithms, brief English
descriptions of what you did, and anything else you want to show me.
If you worked in a team, please *very briefly* state what each
person on the team did.
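The mechanics of the deliverable can be sketched as follows. This is only an illustration of the required output format and of one reasonable way to produce the error-rate estimate (a held-out validation split); the `predict` classifier and the data here are placeholders, not the actual assignment data.

```python
import random

random.seed(0)

def predict(x):
    # Placeholder classifier: threshold on the first feature.
    # Your actual trained model goes here.
    return 'A' if x[0] < 0.5 else 'B'

# Stand-ins for the real train/test files.
train = [([random.random()], random.choice('AB')) for _ in range(200)]
test = [[random.random()] for _ in range(1000)]

# Estimate the test error you expect the grader to see, e.g. by
# holding out part of the training data as a validation set.
split = int(0.8 * len(train))
val = train[split:]
est_error = sum(predict(x) != y for x, y in val) / len(val)

# Write the 1000-line ASCII label file: one character per line, A or B,
# in the same order as the test set.
labels = [predict(x) for x in test]
with open('labels.txt', 'w') as f:
    f.write('\n'.join(labels) + '\n')
```

Cross-validation over several splits would give a steadier estimate than a single hold-out, at the cost of retraining the model a few times.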

Deadline: before class, Mon 17-Oct-2011.

- Forty ROC curves, one for each of the 40 input dimensions, using the values in that dimension as the value to use for discrimination.
- One ROC curve (or more if you so desire) for the discriminator you actually used for HW1. (If you need to do something funny for this to make sense, e.g., if you used a 1-nearest-neighbour classifier, which by itself gives only one point on the ROC curve, do something to get the remainder of the curve: for 1-NN, vary a ratio between the distances to the nearest A and the nearest B to weight towards As or Bs.)
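Building an ROC curve from any scalar score is the same computation in every case above. A minimal sketch, assuming a higher score means "more likely A" (the scores and labels below are made-up illustrations; for the per-dimension curves the score is just the raw value in that dimension, and for 1-NN a usable score is the ratio of the distance to the nearest B over the distance to the nearest A):

```python
def roc_points(scores, labels, positive='A'):
    """Return (false_positive_rate, true_positive_rate) pairs obtained
    by sweeping the decision threshold over all observed scores.
    Examples scoring above the threshold are classified as `positive`.
    """
    pairs = sorted(zip(scores, labels), reverse=True)
    n_pos = sum(1 for _, y in pairs if y == positive)
    n_neg = len(pairs) - n_pos
    tp = fp = 0
    pts = [(0.0, 0.0)]          # threshold above every score
    for _, y in pairs:          # lower the threshold one example at a time
        if y == positive:
            tp += 1
        else:
            fp += 1
        pts.append((fp / n_neg, tp / n_pos))
    return pts

# Illustrative data: 10 examples with scores and true labels.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.54, 0.53, 0.52, 0.51, 0.4]
labels = ['A', 'A', 'B', 'A', 'A', 'A', 'B', 'B', 'A', 'B']
curve = roc_points(scores, labels)
```

Plot the resulting points with FPR on the X axis and TPR on the Y axis (and label both axes, per the report instructions); the curve always runs from (0, 0) to (1, 1).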