The second and third books, by Hastie et al. and MacKay, are available online.
Christopher Bishop, Pattern Recognition and Machine Learning,
Trevor Hastie, Robert Tibshirani, and Jerome Friedman, The Elements of
Statistical Learning, Springer, 2009 (2nd edition). (PDF version available here).
David MacKay, Information Theory, Inference, and Learning
Algorithms, Cambridge University Press, 2003. (PDF version available here).
Bernhard Schölkopf and Alexander Smola, Learning with Kernels, MIT
Duda, Hart, and Stork: Pattern Classification (Wiley), 2nd edition.
Richard Sutton and Andrew Barto, Reinforcement Learning, MIT
Press, 1998. (Online version available here).
Mitchell, Tom: Machine Learning, McGraw-Hill, 1997.
Additionally, you will find it helpful to
consult background texts on mathematical foundations, including linear
algebra (e.g., Strang), statistics (e.g., Casella and Berger), and
convex optimization (e.g., Boyd and Vandenberghe).
Casella and Berger, Statistical Inference, Duxbury Press, 2001.
Statistics, a concise overview of statistics from a course taught
at Cambridge University, 2000.
Gilbert Strang, Introduction to Linear Algebra, Wellesley Press. (See
this link for online lectures by Strang).
Stephen Boyd and Lieven Vandenberghe, Convex Optimization,
Cambridge University Press, 2004. (There is an online PDF version at
this website).
Fan Chung Graham, Spectral Graph Theory, American Mathematical