CPS 271
Machine Learning

Required Text

Pattern Recognition and Machine Learning by Christopher M. Bishop. Be sure to download the errata!

Other Texts

  • Neural Networks for Pattern Recognition, Christopher M. Bishop. (It's not as much about neural networks as you might expect given the title. This is a good overview of many machine learning topics, but it is largely superseded by the required text.)
  • An Introduction to Computational Learning Theory, Michael J. Kearns and Umesh V. Vazirani. (A good introduction to computational learning theory - not really the focus of the class though.)
  • Reinforcement Learning: An Introduction, Richard S. Sutton and Andrew G. Barto. (An accessible introduction to reinforcement learning that is also available online.)
  • Machine Learning, Tom M. Mitchell. (An introduction to classic concepts in machine learning - a little dated now.)
  • The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Trevor Hastie, Robert Tibshirani, and Jerome Friedman. (Good coverage of machine learning with more of a statistician's perspective than the others in this list.)
  • Convex Optimization, Stephen Boyd and Lieven Vandenberghe. (Very thorough treatment by real experts and the full text is available online.)

Tutorials

  • An Introduction to Computational Complexity, by Kimon Spiliopoulos. (Parts A-D give a concise overview of some key concepts - a good read for non-computer scientists who want to get up to speed quickly.)
  • An Introduction to Lagrange Multipliers (The classic "milkmaid problem"!)

Useful Code and Programs

  • matlab is an extremely convenient and useful language for machine learning work. It is installed on computer science department machines and Duke has a site license that should permit any student to run matlab while connected to the University's network.
  • octave is an open source alternative to matlab. It is included with many linux distributions and is now included with the latest version of cygwin, a linux-like environment for Windows.
  • Weka is a great environment for learning and experimenting with a variety of machine learning algorithms.
  • An extensive list of SVM related software is available.
  • ghostview is a free postscript interpreter that will let you view older papers stored in postscript format. (Newer papers tend to be stored in pdf.)
  • VirtualBox is free virtualization software that will allow you to run a second OS inside a window on your current Windows, Linux, Solaris, or OS X machine. This is useful if you want to try software tools that are not available for your native OS.
  • gnuplot is a useful program for plotting arbitrary functions.
  • CutePDF Writer is a free tool for generating PDFs from within Windows. It appears as a printer and has no ads or other distractions, so it's quite painless to use. (From Mac OS or linux, you can use the tools included with the OS distribution.)
  • SVM applet from AT&T research.

Other Resources

  • The UC Irvine Machine Learning Repository contains many benchmark data sets.
  • The Reinforcement Learning Repository at U. Mass. has code and implementations of benchmark domains for reinforcement learning.
  • The RL Glue project aims to provide a standard interface for reinforcement learning agents and environments.
  • videolectures.net has videos and slides covering many academic subjects, but is heavily weighted towards CS and has talks from several machine learning conferences.
  • The matlab neural network toolbox is an easy way to experiment with neural networks in matlab.
  • Yann LeCun's handwritten digit database is a classic benchmark for machine learning algorithms.