Complete bibliography

Agor, J., & Özaltın, O. Y. (2019). Feature selection for classification models via bilevel optimization. Computers & Operations Research, 106, 156-168. https://doi.org/10.1016/j.cor.2018.05.005
Becker, M., Binder, M., Bischl, B., Lang, M., Pfisterer, F., Reich, N. G., Richter, J., Schratz, P., Sonabend, R., & Pulatov, D. (2021). mlr3 book. https://mlr3book.mlr-org.com
Bellman, R. (1961). Adaptive Control Processes: A Guided Tour. Princeton University Press.
Breiman, L. (1996). Bagging predictors. Machine Learning, 24(2), 123-140. https://doi.org/10.1007/bf00058655
Breiman, L. (2001a). Random forests. Machine Learning, 45(1), 5-32. https://doi.org/10.1023/A:1010933404324
Breiman, L. (2001b). Statistical modeling: The two cultures (with comments and a rejoinder by the author). Statistical Science, 16(3), 199-231. https://doi.org/10.1214/ss/1009213726
Breiman, L., Friedman, J. H., Stone, C. J., & Olshen, R. A. (1984). Classification and Regression Trees. Taylor & Francis.
Chen, T., & Guestrin, C. (2016). XGBoost: A scalable tree boosting system. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 785-794. https://doi.org/10.1145/2939672.2939785
Chollet, F., & Allaire, J. J. (2018). Deep Learning with R. Manning Publications.
Cortes, C., & Vapnik, V. (1995). Support-vector networks. Machine Learning, 20(3), 273-297. https://doi.org/10.1007/bf00994018
Cortez, P., Cerdeira, A., Almeida, F., Matos, T., & Reis, J. (2009). Modeling wine preferences by data mining from physicochemical properties. Decision Support Systems, 47(4), 547-553. https://doi.org/10.1016/j.dss.2009.05.016
Craven, P., & Wahba, G. (1978). Smoothing noisy data with spline functions. Numerische Mathematik, 31(4), 377-403. https://doi.org/10.1007/bf01404567
Culp, M., Johnson, K., & Michailidis, G. (2006). ada: An R Package for Stochastic Boosting. Journal of Statistical Software, 17(2), 1-27. https://doi.org/10.18637/jss.v017.i02
De Boor, C. (1978). A Practical Guide to Splines (Vol. 27). Springer-Verlag New York. https://doi.org/10.1007/978-1-4612-6333-3
Dietterich, T. G. (2000). An experimental comparison of three methods for constructing ensembles of decision trees: Bagging, boosting, and randomization. Machine Learning, 40(2), 139-157.
Drucker, H., Burges, C. J., Kaufman, L., Smola, A., & Vapnik, V. (1997). Support Vector Regression Machines. Advances in Neural Information Processing Systems, 9, 155-161.
Dunn, P. K., & Smyth, G. K. (2018). Generalized Linear Models with Examples in R (Vol. 53). Springer.
Dunson, D. B. (2018). Statistics in the big data era: Failures of the machine. Statistics & Probability Letters, 136, 4-9. https://doi.org/10.1016/j.spl.2018.02.028
Efron, B., Hastie, T., Johnstone, I., & Tibshirani, R. (2004). Least angle regression. The Annals of Statistics, 32(2), 407-499. https://doi.org/10.1214/009053604000000067
Eilers, P. H., & Marx, B. D. (1996). Flexible smoothing with B-splines and penalties. Statistical Science, 11(2), 89-121. https://doi.org/10.1214/ss/1038425655
Fan, J., & Gijbels, I. (1996). Local Polynomial Modelling and Its Applications. Chapman & Hall.
Faraway, J. J. (2016). Linear Models with R (2nd ed.). CRC Press.
Fernández-Casal, R., & Cao, R. (2020). Simulación Estadística. https://rubenfcasal.github.io/simbook
Fernández-Casal, R., Roca-Pardiñas, J., & Costa, J. (2019). Introducción al Análisis de Datos con R. https://rubenfcasal.github.io/intror
Fisher, R. A. (1936). The use of multiple measurements in taxonomic problems. Annals of Eugenics, 7(2), 179-188. https://doi.org/10.1111/j.1469-1809.1936.tb02137.x
Freund, Y., & Schapire, R. E. (1996). Experiments with a new boosting algorithm. Proceedings of the Thirteenth International Conference on Machine Learning, 148-156.
Friedman, J. H. (1989). Regularized discriminant analysis. Journal of the American Statistical Association, 84(405), 165-175. https://doi.org/10.1080/01621459.1989.10478752
Friedman, J. H. (1991). Multivariate Adaptive Regression Splines. The Annals of Statistics, 19(1), 1-67. https://doi.org/10.1214/aos/1176347963
Friedman, J. H. (2001). Greedy function approximation: A gradient boosting machine. The Annals of Statistics, 29(5), 1189-1232. https://doi.org/10.1214/aos/1013203451
Friedman, J. H. (2002). Stochastic gradient boosting. Computational Statistics & Data Analysis, 38(4), 367-378. https://doi.org/10.1016/S0167-9473(01)00065-2
Friedman, J. H., & Popescu, B. E. (2008). Predictive learning via rule ensembles. The Annals of Applied Statistics, 2(3), 916-954. https://doi.org/10.1214/07-aoas148
Friedman, J. H., & Stuetzle, W. (1981). Projection pursuit regression. Journal of the American Statistical Association, 76(376), 817-823. https://doi.org/10.1080/01621459.1981.10477729
Friedman, J. H., & Tukey, J. W. (1974). A projection pursuit algorithm for exploratory data analysis. IEEE Transactions on Computers, C-23(9), 881-890. https://doi.org/10.1109/t-c.1974.224051
Friedman, J., Hastie, T., & Tibshirani, R. (2000). Additive logistic regression: A statistical view of boosting (with discussion and a rejoinder by the authors). The Annals of Statistics, 28(2), 337-407. https://doi.org/10.1214/aos/1016218223
Goldstein, A., Kapelner, A., Bleich, J., & Pitkin, E. (2015). Peeking Inside the Black Box: Visualizing Statistical Learning With Plots of Individual Conditional Expectation. Journal of Computational and Graphical Statistics, 24(1), 44-65. https://doi.org/10.1080/10618600.2014.907095
Greenwell, B. M. (2017). pdp: An R Package for Constructing Partial Dependence Plots. The R Journal, 9(1), 421-436. https://doi.org/10.32614/RJ-2017-016
Hair, J. F., Anderson, R. E., Tatham, R. L., & Black, W. (1998). Multivariate Data Analysis. Prentice Hall.
Hastie, T. J., & Pregibon, D. (2017). Generalized linear models. In J. M. Chambers & T. J. Hastie (Eds.), Statistical Models in S (pp. 195-247). Routledge.
Hastie, T., Rosset, S., Tibshirani, R., & Zhu, J. (2004). The entire regularization path for the support vector machine. Journal of Machine Learning Research, 5(Oct), 1391-1415.
Hastie, T., & Tibshirani, R. (1990). Generalized Additive Models. Chapman & Hall. https://doi.org/10.1201/9780203753781
Hastie, T., & Tibshirani, R. (1996). Discriminant Analysis by Gaussian Mixtures. Journal of the Royal Statistical Society: Series B (Methodological), 58(1), 155-176. https://doi.org/10.1111/j.2517-6161.1996.tb02073.x
Hothorn, T., Hornik, K., & Zeileis, A. (2006). Unbiased recursive partitioning: A conditional inference framework. Journal of Computational and Graphical Statistics, 15(3), 651-674. https://doi.org/10.1198/106186006X133933
Ichimura, H. (1993). Semiparametric least squares (SLS) and weighted SLS estimation of single-index models. Journal of Econometrics, 58(1), 71-120. https://doi.org/10.1016/0304-4076(93)90114-K
James, G., Witten, D., Hastie, T., & Tibshirani, R. (2021). An Introduction to Statistical Learning: With Applications in R (2nd ed.). Springer. https://www.statlearning.com
Karatzoglou, A., Smola, A., Hornik, K., & Zeileis, A. (2004). kernlab - An S4 Package for Kernel Methods in R. Journal of Statistical Software, 11(9), 1-20. https://doi.org/10.18637/jss.v011.i09
Kass, G. V. (1980). An exploratory technique for investigating large quantities of categorical data. Journal of the Royal Statistical Society: Series C (Applied Statistics), 29(2), 119-127. https://doi.org/10.2307/2986296
Kearns, M., & Valiant, L. (1994). Cryptographic limitations on learning Boolean formulae and finite automata. Journal of the ACM, 41(1), 67-95. https://doi.org/10.1145/174644.174647
Kruskal, J. B. (1969). Toward a practical method which helps uncover the structure of a set of multivariate observations by finding the linear transformation which optimizes a new "index of condensation". Statistical Computation, 427-440. https://doi.org/10.1016/b978-0-12-498150-8.50024-0
Kuhn, M. (2008). Building Predictive Models in R Using the caret Package. Journal of Statistical Software, 28(5), 1-26. https://doi.org/10.18637/jss.v028.i05
Kuhn, M. (2022). caret: Classification and Regression Training. https://github.com/topepo/caret/
Kuhn, M., & Johnson, K. (2013). Applied Predictive Modeling (Vol. 26). Springer. https://doi.org/10.1007/978-1-4614-6849-3
Kuhn, M., & Silge, J. (2022). Tidy Modeling with R. O’Reilly Media. https://www.tmwr.org
Kuhn, M., & Wickham, H. (2022). tidymodels: Easily Install and Load the Tidymodels Packages. https://CRAN.R-project.org/package=tidymodels
Kvålseth, T. O. (1985). Cautionary note about R². The American Statistician, 39(4), 279-285. https://doi.org/10.1080/00031305.1985.10479448
Lauro, C. (1996). Computational statistics or statistical computing, is that the question? Computational Statistics & Data Analysis, 23(1), 191-193. https://doi.org/10.1016/0167-9473(96)88920-1
Liaw, A., & Wiener, M. (2002). Classification and Regression by randomForest. R News, 2(3), 18-22. https://www.r-project.org/doc/Rnews/Rnews_2002-3.pdf
Loh, W.-Y. (2002). Regression trees with unbiased variable selection and interaction detection. Statistica Sinica, 12, 361-386.
Massy, W. F. (1965). Principal components regression in exploratory statistical research. Journal of the American Statistical Association, 60(309), 234-256. https://doi.org/10.1080/01621459.1965.10480787
McCullagh, P. (2019). Generalized linear models. Routledge.
McCulloch, W. S., & Pitts, W. (1943). A logical calculus of the ideas immanent in nervous activity. The Bulletin of Mathematical Biophysics, 5(4), 115-133. https://doi.org/10.1007/bf02459570
Mevik, B.-H., & Wehrens, R. (2007). The pls Package: Principal Component and Partial Least Squares Regression in R. Journal of Statistical Software, 18(2), 1-23. https://doi.org/10.18637/jss.v018.i02
Meyer, D., Dimitriadou, E., Hornik, K., Weingessel, A., & Leisch, F. (2020). e1071: Misc Functions of the Department of Statistics, Probability Theory Group (Formerly: E1071), TU Wien. https://CRAN.R-project.org/package=e1071
Molnar, C. (2020). Interpretable Machine Learning. Lulu.com. https://christophm.github.io/interpretable-ml-book
Quinlan, J. R. (1992). Learning with continuous classes. Proceedings of the 5th Australian Joint Conference on Artificial Intelligence, 92, 343-348.
Quinlan, J. R. (1993). C4.5: Programs for Machine Learning. Elsevier Science.
Shannon, C. E. (1948). A mathematical theory of communication. The Bell System Technical Journal, 27(3), 379-423. https://doi.org/10.1002/j.1538-7305.1948.tb01338.x
Spinoza, B. (1677). Ethics.
Štrumbelj, E., & Kononenko, I. (2010). An efficient explanation of individual classifications using game theory. The Journal of Machine Learning Research, 11, 1-18.
Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological), 58(1), 267-288. https://doi.org/10.1111/j.2517-6161.1996.tb02080.x
Valiant, L. G. (1984). A Theory of the Learnable. Communications of the ACM, 27(11), 1134-1142. https://doi.org/10.1145/1968.1972
Vapnik, V. (2000). Statistical Learning Theory. Wiley.
Vapnik, V. (2013). The Nature of Statistical Learning Theory. Springer.
Venables, W. N., & Ripley, B. D. (2002). Modern Applied Statistics with S. Springer New York. https://doi.org/10.1007/978-0-387-21706-2
Vinayak, R. K., & Gilad-Bachrach, R. (2015). DART: Dropouts meet multiple additive regression trees. Artificial Intelligence and Statistics, 489-497.
Wand, M. (2021). KernSmooth: Functions for Kernel Smoothing Supporting Wand & Jones (1995). https://CRAN.R-project.org/package=KernSmooth
Welch, B. L. (1939). Note on Discriminant Functions. Biometrika, 31(1/2), 218-220. https://doi.org/10.2307/2334985
Werbos, P. (1974). Beyond regression: New tools for prediction and analysis in the behavioral sciences. Ph.D. dissertation, Harvard University.
Williams, G. (2011). Data mining with Rattle and R: The art of excavating data for knowledge discovery. Springer Science & Business Media.
Williams, G. (2022). rattle: Graphical User Interface for Data Science in R. https://rattle.togaware.com/
Wold, S., Martens, H., & Wold, H. (1983). The multivariate calibration problem in chemistry solved by the PLS method. In Matrix Pencils (pp. 286-293). Springer. https://doi.org/10.1007/bfb0062108
Wolpert, D. H., & Macready, W. G. (1997). No free lunch theorems for optimization. IEEE Transactions on Evolutionary Computation, 1(1), 67-82. https://doi.org/10.1109/4235.585893
Wood, S. N. (2017). Generalized Additive Models: An Introduction with R (2nd ed.). CRC Press.
Zou, H., & Hastie, T. (2005). Regularization and Variable Selection via the Elastic Net. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 67(2), 301-320. https://doi.org/10.1111/j.1467-9868.2005.00503.x