Michael Jordan once recommended a reading list for getting started in machine learning, along with his reason for each pick. Textbooks are not the subject of this interview, but I list them here anyway. If you want to build a professional body of knowledge in machine learning rather than broaden your thinking, it is best to start with the following books (or similar ones).

Frequentist Statistics
Casella, G. and Berger, R.L. (2001). “Statistical Inference.” Duxbury Press. Intermediate-level statistics book.
Ferguson, T. (1996). “A Course in Large Sample Theory.” Chapman & Hall/CRC. A slightly more advanced book that’s quite clear on mathematical techniques.
Lehmann, E. (2004). “Elements of Large-Sample Theory.” Springer. About asymptotics; a good starting place.
van der Vaart, A.W. (1998). “Asymptotic Statistics.” Cambridge. Shows how many ideas in inference (M-estimation, the bootstrap, semiparametrics, etc.) rest on empirical process theory.
Tsybakov, Alexandre B. (2008). “Introduction to Nonparametric Estimation.” Springer. Tools for obtaining lower bounds on estimators.
Efron, B. (2010). “Large-Scale Inference: Empirical Bayes Methods for Estimation, Testing, and Prediction.” Cambridge. A thought-provoking book.

Bayesian Statistics
Gelman, A. et al. (2003). “Bayesian Data Analysis.” Chapman & Hall/CRC. About Bayesian statistics.
Robert, C. and Casella, G. (2005). “Monte Carlo Statistical Methods.” Springer. About Bayesian computation.

Probability Theory
Grimmett, G. and Stirzaker, D. (2001). “Probability and Random Processes.” Oxford. Intermediate-level probability book.
Pollard, D. (2001). “A User’s Guide to Measure Theoretic Probability.” Cambridge. More advanced probability book.
Durrett, R. (2005). “Probability: Theory and Examples.” Duxbury. Standard advanced probability book.

Optimization
Bertsimas, D. and Tsitsiklis, J. (1997). “Introduction to Linear Optimization.” Athena. A good starting book on linear optimization that will prepare you for convex optimization.
Boyd, S. and Vandenberghe, L. (2004). “Convex Optimization.” Cambridge.
Nesterov, Y. (2003). “Introductory Lectures on Convex Optimization.” Springer. A starting point for understanding lower bounds in optimization.

Linear Algebra
Golub, G. and Van Loan, C. (1996). “Matrix Computations.” Johns Hopkins. Getting a full understanding of algorithmic linear algebra is also important.

Information Theory
Cover, T. and Thomas, J. “Elements of Information Theory.” Wiley. Classic information theory.

Functional Analysis
Kreyszig, E. (1989). “Introductory Functional Analysis with Applications.” Wiley. Functional analysis is essentially linear algebra in infinite dimensions, and it’s necessary for kernel methods, for nonparametric Bayesian methods, and for various other topics.

Ian Goodfellow, another leading figure in the field, has also recommended several machine learning textbooks, including, of course, his own “Deep Learning”.