CS 434: Machine Learning and Data Mining

Fall 2009

MWF 15:00 - 15:50 Kelly 1001




Instructor: Xiaoli Fern

Email: xfern@eecs.oregonstate.edu
Office: Kelly 3073
Office hours: MWF 2-3pm, or by appointment
Class email list: cs434-f09@engr.oregonstate.edu


Machine learning and data mining is a subfield of artificial intelligence that develops computer programs that can learn from past experience and find useful patterns in data. The field has produced many tools that are widely used and have had a significant impact in both industrial and research settings. Application domains include personalized spam filters, HIV vaccine design, handwritten digit recognition, face recognition, credit card fraud detection, unmanned vehicle control, medical diagnosis, and intelligent web search.

This course provides a basic introduction to this dynamic and fast-advancing field. Topics cover its three basic branches: (1) supervised learning for prediction problems (learn to predict); (2) unsupervised learning for clustering data and discovering interesting patterns in data (learn to understand); and (3) reinforcement learning for learning to select actions based on positive and negative feedback (learn to act). The course has a special focus on the practical side: students will not only learn various machine learning and data mining techniques, but also learn how to apply them to real problems in practice.

Syllabus

Course Policy


Course materials


Learning objectives

Upon completing the course, students are expected to:
1) be able to apply supervised learning algorithms to prediction problems and evaluate the results.
2) be able to apply unsupervised learning algorithms to data analysis problems and evaluate the results.
3) be able to apply reinforcement learning algorithms to control problems and evaluate the results.
4) be able to take a description of a new problem and decide what kind of problem (supervised, unsupervised, or reinforcement) it is.


Lecture Schedule

See previous offerings of the class for a rough lecture schedule: CS 434 Fall 2007; CS 434 Fall 2008.

| Date | Topics | Lecture Notes | Reading | Assignments |
|---|---|---|---|---|
| 9/28 M | Introduction to basic concepts | slides | TM Chapter 1 | |
| 9/30 W | The perceptron algorithm | slides | notes on perceptron by William Cohen | hw1; solution |
| 10/2 F | Linear regression | slides | | |
| 10/5 M | K-nearest neighbor, model selection | slides | | |
| 10/7 W | Decision trees | slides | J. R. Quinlan, Induction of decision trees, Machine Learning 1: 81-106, 1986 | |
| 10/9 F | Decision trees cont. | slides | | |
| 10/12 M | Review of probability theory | slides | | hw2; solution |
| 10/14 W | Bayes classifier, naive Bayes | slides | | |
| 10/16 F | Bayes classifier cont. | slides | | |
| 10/19 M | Logistic regression | slides | Generative vs discriminative models | Final project information |
| 10/21 W | Support vector machines | slides | | |
| 10/23 F | SVM cont. | | | hw3; solution |
| 10/26 M | Ensemble learning | slides | A short introduction to boosting | |
| 10/28 W | Ensemble learning cont. | | | |
| 10/30 F | Case study | slides | | |
| 11/2 M | Case study cont. Clustering. | slides | | |
| 11/4 W | Hierarchical agglomerative clustering | | | Project proposal due |
| 11/6 F | K-means | slides | | |
| 11/9 M | Review | | sample midterm; solution | |
| 11/11 W | Mixture of Gaussians | slides | | |
| 11/13 F | Midterm | | | |
| 11/16 M | Association rule mining | slides | | |
| 11/18 W | Dimensionality reduction | slides | | hw4, cluster.csv; rdata1; rdata2; rdata3; rdata4; rdata5 |
| 11/20 F | Markov decision process | slides | | |
| 11/23 M | MDP cont. | slides | | |
| 11/25 W | Reinforcement learning | slides | | |
| 11/27 F | Thanksgiving holiday, no class | | | |
| 11/30 M | Project presentation | slides | | |
| 12/2 W | TBD | slides | | |
| 12/4 F | TBD | | | |