Bibliografía
Agor, J., y Özaltın, O. Y. (2019). Feature selection for classification models via bilevel optimization. Computers & Operations Research, 106, 156-168. https://doi.org/10.1016/j.cor.2018.05.005
Beck, M. W. (2018). NeuralNetTools: Visualization and Analysis Tools for Neural Networks. Journal of Statistical Software, 85(11), 1-20. https://doi.org/10.18637/jss.v085.i11
Bellman, R. (1961). Adaptive Control Processes: a guided tour. Princeton University Press.
Bertrand, F., y Maumy, M. (2023). Partial Least Squares Regression for Generalized Linear Models. https://fbertran.github.io/homepage/
Biecek, P. (2018). DALEX: Explainers for Complex Predictive Models in R. Journal of Machine Learning Research, 19(84), 1-5. https://jmlr.org/papers/v19/18-416.html
Bischl, B., Sonabend, R., Kotthoff, L., y Lang, M. (2024). Applied machine learning using mlr3 in R. CRC Press.
Boser, B. E., Guyon, I. M., y Vapnik, V. N. (1992). A training algorithm for optimal margin classifiers. Proceedings of the Fifth Annual Workshop on Computational Learning Theory, 144-152. https://doi.org/10.1145/130385.130401
Breiman, L. (1996). Bagging predictors. Machine Learning, 24(2), 123-140. https://doi.org/10.1007/bf00058655
Breiman, L. (2001a). Random forests. Machine Learning, 45(1), 5-32. https://doi.org/10.1023/a:1010933404324
Breiman, L. (2001b). Statistical modeling: The two cultures (with comments and a rejoinder by the author). Statistical Science, 16(3), 199-231. https://doi.org/10.1214/ss/1009213726
Breiman, L., Friedman, J., Stone, C. J., y Olshen, R. A. (1984). Classification and Regression Trees. Taylor & Francis.
Canty, A., y Ripley, B. D. (2024). boot: Bootstrap R (S-Plus) Functions. https://cran.r-project.org/package=boot
Cao Abad, R., Vilar Fernández, J. M., Presedo Quindimil, M. A., Vilar Fernández, J. A., Francisco Fernández, M., Salvador, N., y Vázquez Brage, M. (2001). Introducción a la estadística y sus aplicaciones. Ediciones Pirámide. https://www.edicionespiramide.es/libro.php?id=242639
Chen, T., y Guestrin, C. (2016). XGBoost: A scalable tree boosting system. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 785-794. https://doi.org/10.1145/2939672.2939785
Chen, T., He, T., Benesty, M., Khotilovich, V., Tang, Y., Cho, H., Chen, K., Mitchell, R., Cano, I., Zhou, T., et al. (2023). xgboost: Extreme Gradient Boosting. https://cran.r-project.org/package=xgboost
Chollet, F., y Allaire, J. J. (2018). Deep Learning with R. Manning Publications.
Cohen, J. (1960). A Coefficient of Agreement for Nominal Scales. Educational and Psychological Measurement, 20(1), 37-46. https://doi.org/10.1177/001316446002000104
Comon, P. (1994). Independent component analysis, a new concept? Signal Processing, 36(3), 287-314. https://doi.org/10.1016/0165-1684(94)90029-9
Cortes, C., y Vapnik, V. (1995). Support-vector networks. Machine Learning, 20(3), 273-297. https://doi.org/10.1007/bf00994018
Cortez, P., Cerdeira, A., Almeida, F., Matos, T., y Reis, J. (2009). Modeling wine preferences by data mining from physicochemical properties. Decision Support Systems, 47(4), 547-553. https://doi.org/10.1016/j.dss.2009.05.016
Craven, P., y Wahba, G. (1978). Smoothing noisy data with spline functions. Numerische Mathematik, 31(4), 377-403. https://doi.org/10.1007/bf01404567
Culp, M., Johnson, K., y Michailidis, G. (2006). ada: An R Package for Stochastic Boosting. Journal of Statistical Software, 17(2), 1-27. https://doi.org/10.18637/jss.v017.i02
Dalpiaz, D. (2020). R for Statistical Learning. https://daviddalpiaz.github.io/r4sl
De Boor, C. (1978). A practical guide to splines. Springer-Verlag. https://doi.org/10.1007/978-1-4612-6333-3
Dietterich, T. G. (2000). An experimental comparison of three methods for constructing ensembles of decision trees: Bagging, boosting, and randomization. Machine Learning, 40(2), 139-157.
Drucker, H., Burges, C. J., Kaufman, L., Smola, A., y Vapnik, V. (1997). Support Vector Regression Machines. Advances in Neural Information Processing Systems, 9, 155-161.
Dunn, P. K., y Smyth, G. K. (2018). Generalized linear models with examples in R. Springer.
Dunson, D. B. (2018). Statistics in the big data era: Failures of the machine. Statistics and Probability Letters, 136, 4-9. https://doi.org/10.1016/j.spl.2018.02.028
Efron, B., Hastie, T., Johnstone, I., y Tibshirani, R. (2004). Least angle regression. The Annals of Statistics, 32(2), 407-499. https://doi.org/10.1214/009053604000000067
Eilers, P. H., y Marx, B. D. (1996). Flexible smoothing with B-splines and penalties. Statistical Science, 11(2), 89-121. https://doi.org/10.1214/ss/1038425655
Everitt, B., y Hothorn, T. (2011). An Introduction to Applied Multivariate Analysis with R. Springer. https://link.springer.com/book/10.1007/978-1-4419-9650-3
Fan, J., y Gijbels, I. (1996). Local Polynomial Modelling and Its Applications. Chapman & Hall.
Faraway, J. J. (2016). Linear Models with R (2a. ed.). Chapman & Hall/CRC.
Fasola, S., Muggeo, V. M. R., y Kuchenhoff, H. (2018). A heuristic, iterative algorithm for change-point detection in abrupt change models. Computational Statistics, 33(2), 997-1015.
Febrero-Bande, M., González-Manteiga, W., y Oviedo de la Fuente, M. (2019). Variable selection in functional additive regression models. Computational Statistics, 34, 469-487. https://doi.org/10.1007/s00180-018-0844-5
Febrero-Bande, M., y Oviedo de la Fuente, M. (2012). Statistical Computing in Functional Data Analysis: The R Package fda.usc. Journal of Statistical Software, 51(4), 1-28. https://www.jstatsoft.org/v51/i04/
Fernández-Casal, R., Cao, R., y Costa, J. (2023). Técnicas de simulación y remuestreo. https://rubenfcasal.github.io/simbook.
Fernández-Casal, R., Oviedo-de la Fuente, M., y Costa-Bouzas, J. (2024). mpae: Metodos Predictivos de Aprendizaje Estadistico (Statistical Learning Predictive Methods). https://github.com/rubenfcasal/mpae
Fernández-Casal, R., Roca-Pardiñas, J., Costa, J., y Oviedo-de la Fuente, M. (2022). Introducción al Análisis de Datos con R. https://rubenfcasal.github.io/intror.
Fisher, R. A. (1936). The use of multiple measurements in taxonomic problems. Annals of Eugenics, 7(2), 179-188. https://doi.org/10.1111/j.1469-1809.1936.tb02137.x
Fox, J., Marquez, M. M., y Bouchet-Valat, M. (2024). Rcmdr: R Commander. https://socialsciences.mcmaster.ca/jfox/Misc/Rcmdr/
Fox, J., y Monette, G. (2024). cv: Cross-Validation of Regression Models. https://cran.r-project.org/package=cv
Freund, Y., y Schapire, R. E. (1996). Experiments with a new boosting algorithm. Proceedings of the Thirteenth International Conference on Machine Learning.
Friedman, J. (1989). Regularized discriminant analysis. Journal of the American Statistical Association, 84(405), 165-175. https://doi.org/10.1080/01621459.1989.10478752
Friedman, J. (1991). Multivariate Adaptive Regression Splines. The Annals of Statistics, 19(1), 1-67. https://doi.org/10.1214/aos/1176347963
Friedman, J. (2001). Greedy function approximation: a gradient boosting machine. The Annals of Statistics, 29(5), 1189-1232. https://doi.org/10.1214/aos/1013203451
Friedman, J. (2002). Stochastic gradient boosting. Computational Statistics & Data Analysis, 38(4), 367-378. https://doi.org/10.1016/S0167-9473(01)00065-2
Friedman, J., Hastie, T., y Tibshirani, R. (2000). Additive logistic regression: a statistical view of boosting (with discussion and a rejoinder by the authors). The Annals of Statistics, 28(2), 337-407. https://doi.org/10.1214/aos/1016218223
Friedman, J., Hastie, T., Tibshirani, R., Narasimhan, B., Tay, K., Simon, N., Qian, J., y Yang, J. (2023). glmnet: Lasso and Elastic-Net Regularized Generalized Linear Models. https://cran.r-project.org/package=glmnet
Friedman, J., y Popescu, B. E. (2008). Predictive learning via rule ensembles. The Annals of Applied Statistics, 2(3), 916-954. https://doi.org/10.1214/07-aoas148
Friedman, J., y Stuetzle, W. (1981). Projection pursuit regression. Journal of the American Statistical Association, 76(376), 817-823. https://doi.org/10.1080/01621459.1981.10477729
Friedman, J., y Tukey, J. (1974). A projection pursuit algorithm for exploratory data analysis. IEEE Transactions on Computers, C-23(9), 881-890. https://doi.org/10.1109/t-c.1974.224051
Fritsch, S., Guenther, F., y Wright, M. N. (2019). neuralnet: Training of Neural Networks. https://CRAN.R-project.org/package=neuralnet
Goldstein, A., Kapelner, A., Bleich, J., y Pitkin, E. (2015). Peeking Inside the Black Box: Visualizing Statistical Learning With Plots of Individual Conditional Expectation. Journal of Computational and Graphical Statistics, 24(1), 44-65. https://doi.org/10.1080/10618600.2014.907095
Greenwell, B. M. (2017). pdp: An R Package for Constructing Partial Dependence Plots. The R Journal, 9(1), 421-436. https://doi.org/10.32614/RJ-2017-016
Greenwell, B. M. (2022). pdp: Partial Dependence Plots. https://cran.r-project.org/package=pdp
Greenwell, B. M., y Boehmke, B. C. (2020). Variable Importance Plots–An Introduction to the vip Package. The R Journal, 12(1), 343-366. https://doi.org/10.32614/RJ-2020-013
Greenwell, B., Boehmke, B., Cunningham, J., y Developers, G. (2022). gbm: Generalized Boosted Regression Models. https://cran.r-project.org/package=gbm
Hair, J. F., Anderson, R. E., Tatham, R. L., y Black, W. (1998). Multivariate Data Analysis. Prentice Hall.
Härdle, W. K., y Simar, L. (2013). Applied Multivariate Statistical Analysis. Springer. https://link.springer.com/book/10.1007/978-3-662-45171-7
Hastie, T., y Pregibon, D. (1991). Generalized linear models. En J. M. Chambers y T. Hastie (Eds.), Statistical models in S (pp. 195-247). Routledge.
Hastie, T., Rosset, S., Tibshirani, R., y Zhu, J. (2004). The entire regularization path for the support vector machine. Journal of Machine Learning Research, 5, 1391-1415.
Hastie, T., y Tibshirani, R. (1990). Generalized Additive Models. Chapman & Hall. https://doi.org/10.1201/9780203753781
Hastie, T., y Tibshirani, R. (1996). Discriminant Analysis by Gaussian Mixtures. Journal of the Royal Statistical Society. Series B (Methodological), 58(1), 155-176. https://doi.org/10.1111/j.2517-6161.1996.tb02073.x
Hoerl, A. E., y Kennard, R. W. (1970). Ridge regression: Biased estimation for nonorthogonal problems. Technometrics, 12(1), 55-67. https://doi.org/10.1080/00401706.1970.10488634
Hornik, K., Buchta, C., y Zeileis, A. (2009). Open-Source Machine Learning: R Meets Weka. Computational Statistics, 24(2), 225-232. https://doi.org/10.1007/s00180-008-0119-7
Hothorn, T., Hornik, K., Strobl, C., y Zeileis, A. (2010). party: A laboratory for recursive partytioning. https://cran.r-project.org/web/packages/party/
Hothorn, T., Hornik, K., y Zeileis, A. (2006). Unbiased recursive partitioning: A conditional inference framework. Journal of Computational and Graphical Statistics, 15(3), 651-674. https://doi.org/10.1198/106186006x133933
Husson, F., Josse, J., y Le, S. (2023). RcmdrPlugin.FactoMineR: Graphical User Interface for FactoMineR. https://cran.r-project.org/package=RcmdrPlugin.FactoMineR
Hvitfeldt, E., Pedersen, T. L., y Benesty, M. (2022). lime: Local Interpretable Model-Agnostic Explanations. https://CRAN.R-project.org/package=lime
Hyndman, R. J., y Athanasopoulos, G. (2021). Forecasting: principles and practice (3a. ed.). OTexts. https://otexts.com/fpp3
Ichimura, H. (1993). Semiparametric least squares (SLS) and weighted SLS estimation of single-index models. Journal of Econometrics, 58(1), 71-120. https://doi.org/10.1016/0304-4076(93)90114-K
Inglis, A., Parnell, A., y Hurley, C. (2023). vivid: Variable Importance and Variable Interaction Displays. https://cran.r-project.org/package=vivid
James, G., Witten, D., Hastie, T., y Tibshirani, R. (2021). An Introduction to Statistical Learning: With Applications in R (2a. ed.). Springer. https://www.statlearning.com
Karatzoglou, A., Smola, A., Hornik, K., y Zeileis, A. (2004). kernlab - An S4 Package for Kernel Methods in R. Journal of Statistical Software, 11(9), 1-20. https://doi.org/10.18637/jss.v011.i09
Kass, G. V. (1980). An exploratory technique for investigating large quantities of categorical data. Journal of the Royal Statistical Society: Series C (Applied Statistics), 29(2), 119-127. https://doi.org/10.2307/2986296
Kearns, M., y Valiant, L. (1994). Cryptographic limitations on learning Boolean formulae and finite automata. Journal of the ACM, 41(1), 67-95. https://doi.org/10.1145/174644.174647
Kruskal, J. B. (1969). Toward a practical method which helps uncover the structure of a set of multivariate observations by finding the linear transformation which optimizes a new «index of condensation». En R. C. Milton y J. A. Nelder (Eds.), Statistical Computation (pp. 427-440). Academic Press. https://doi.org/10.1016/b978-0-12-498150-8.50024-0
Kuhn, M. (2008). Building Predictive Models in R Using the caret Package. Journal of Statistical Software, 28(5), 1-26. https://doi.org/10.18637/jss.v028.i05
Kuhn, M. (2019). The caret Package. https://topepo.github.io/caret.
Kuhn, M. (2023). caret: Classification and Regression Training. https://cran.r-project.org/package=caret
Kuhn, M., y Johnson, K. (2013). Applied predictive modeling. Springer. http://appliedpredictivemodeling.com. https://doi.org/10.1007/978-1-4614-6849-3
Kuhn, M., y Johnson, K. (2019). Feature Engineering and Selection: A Practical Approach for Predictive Models. Chapman & Hall/CRC. http://www.feat.engineering/
Kuhn, M., y Quinlan, J. R. (2023). Cubist: Rule- And Instance-Based Regression Modeling. https://cran.r-project.org/web/packages/Cubist/
Kuhn, M., y Silge, J. (2022). Tidy Modeling with R. O’Reilly. https://www.tmwr.org
Kuhn, M., Weston, S., Coulter, N., y Quinlan, J. R. (2014). C50: C5.0 Decision Trees and Rule-Based Models. https://cran.r-project.org/web/packages/C50/
Kuhn, M., y Wickham, H. (2020). Tidymodels: a collection of packages for modeling and machine learning using tidyverse principles. https://www.tidymodels.org
Kuhn, M., y Wickham, H. (2023). tidymodels: Easily Install and Load the Tidymodels Packages. https://cran.r-project.org/package=tidymodels
Kvålseth, T. O. (1985). Cautionary note about R2. The American Statistician, 39(4), 279-285. https://doi.org/10.1080/00031305.1985.10479448
Lang, M., Binder, M., Richter, J., Schratz, P., Pfisterer, F., Coors, S., Au, Q., Casalicchio, G., Kotthoff, L., y Bischl, B. (2019). mlr3: A modern object-oriented machine learning framework in R. Journal of Open Source Software, 4(44), 1903. https://doi.org/10.21105/joss.01903
Lauro, C. (1996). Computational statistics or statistical computing, is that the question? Computational Statistics & Data Analysis, 23(1), 191-193. https://doi.org/10.1016/0167-9473(96)88920-1
Lawson, J. (2014). Design and Analysis of Experiments with R. Chapman & Hall/CRC Press. https://www.taylorfrancis.com/books/mono/10.1201/b17883/design-analysis-experiments-john-lawson
LeDell, E., y Poirier, S. (2020). H2O AutoML: Scalable automatic machine learning. Proceedings of the AutoML Workshop at ICML.
Liaw, A., y Wiener, M. (2002). Classification and Regression by randomForest. R News, 2(3), 18-22. https://www.r-project.org/doc/Rnews/Rnews_2002-3.pdf
Loh, W.-Y. (2002). Regression trees with unbiased variable selection and interaction detection. Statistica Sinica, 12, 361-386.
Marchini, J. L., Heaton, C., y Ripley, B. D. (2021). fastICA: FastICA Algorithms to Perform ICA and Projection Pursuit. https://cran.r-project.org/package=fastICA
Massy, W. F. (1965). Principal components regression in exploratory statistical research. Journal of the American Statistical Association, 60(309), 234-256. https://doi.org/10.1080/01621459.1965.10480787
McCullagh, P., y Nelder, J. A. (2019). Generalized linear models. Routledge.
McCulloch, W. S., y Pitts, W. (1943). A logical calculus of the ideas immanent in nervous activity. The Bulletin of Mathematical Biophysics, 5(4), 115-133. https://doi.org/10.1007/bf02459570
Mevik, B.-H., y Wehrens, R. (2007). The pls Package: Principal Component and Partial Least Squares Regression in R. Journal of Statistical Software, 18(2), 1-23. https://doi.org/10.18637/jss.v018.i02
Meyer, D., Dimitriadou, E., Hornik, K., Weingessel, A., y Leisch, F. (2020). e1071: Misc Functions of the Department of Statistics, Probability Theory Group (Formerly: E1071), TU Wien. https://cran.r-project.org/package=e1071
Milborrow, S. (2019). rpart.plot: Plot ’rpart’ Models: An Enhanced Version of ’plot.rpart’. https://cran.r-project.org/package=rpart.plot
Milborrow, S. (2022). plotmo: Plot a Model’s Residuals, Response, and Partial Dependence Plots. https://cran.r-project.org/package=plotmo
Milborrow, S. (2023). earth: Multivariate Adaptive Regression Splines. https://cran.r-project.org/package=earth
Miller, I., Freund, J. E., y Romero, C. O. (1973). Probabilidad y estadística para ingenieros. Reverté. https://www.reverte.com/libro/probabilidad-y-estadistica-para-ingenieros_89225/
Molnar, C. (2023). Interpretable Machine Learning: A Guide for Making Black Box Models Explainable. Lulu.com. https://christophm.github.io/interpretable-ml-book
Molnar, C., Bischl, B., y Casalicchio, G. (2018). iml: An R package for Interpretable Machine Learning. Journal of Open Source Software, 3(26), 786. https://doi.org/10.21105/joss.00786
Ng, A., y Jordan, M. (2001). On Discriminative vs. Generative Classifiers: A comparison of logistic regression and naive Bayes. En T. Dietterich, S. Becker, y Z. Ghahramani (Eds.), Advances in Neural Information Processing Systems. MIT Press. https://proceedings.neurips.cc/paper_files/paper/2001/file/7b7a53e239400a13bd6be6c91c4f6c4e-Paper.pdf
Nijs, V. (2023). radiant: Business Analytics using R and Shiny. https://CRAN.R-project.org/package=radiant
Paluszynska, A., Biecek, P., y Jiang, Y. (2017). randomForestExplainer: Explaining and visualizing random forests in terms of variable importance. R package version 0.9.
Penrose, K. W., Nelson, A., y Fisher, A. (1985). Generalized body composition prediction equation for men using simple measurement techniques. Medicine & Science in Sports & Exercise, 17(2), 189. https://doi.org/10.1249/00005768-198504000-00037
Pinheiro, J., Bates, D., y R Core Team. (2023). nlme: Linear and Nonlinear Mixed Effects Models. https://cran.r-project.org/package=nlme
Quinlan, J. R. (1986). Induction of decision trees. Machine Learning, 1, 81-106. https://doi.org/10.1007/bf00116251
Quinlan, J. R. (1992). Learning with continuous classes. Proceedings of the 5th Australian Joint Conference on Artificial Intelligence, 343-348.
Quinlan, J. R. (1993). C4.5: Programs for Machine Learning. Elsevier.
R Core Team. (2023). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. https://www.R-project.org/
Racine, J. S., y Hayfield, T. (2023). np: Nonparametric Kernel Smoothing Methods for Mixed Data Types. https://cran.r-project.org/package=np
Ripley, B. (2023). MASS: Support Functions and Datasets for Venables and Ripley’s MASS. https://cran.r-project.org/package=MASS
Robin, X., Turck, N., Hainard, A., Tiberti, N., Lisacek, F., Sanchez, J.-C., y Müller, M. (2011). pROC: an open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinformatics, 12, 77. https://doi.org/10.1186/1471-2105-12-77
Ruppert, D., Sheather, S. J., y Wand, M. P. (1995). An effective bandwidth selector for local least squares regression. Journal of the American Statistical Association, 90(432), 1257-1270.
Shannon, C. E. (1948). A mathematical theory of communication. The Bell System Technical Journal, 27(3), 379-423. https://doi.org/10.1002/j.1538-7305.1948.tb01338.x
Spinoza, B. (1677). Ethics.
Štrumbelj, E., y Kononenko, I. (2010). An efficient explanation of individual classifications using game theory. The Journal of Machine Learning Research, 11, 1-18.
Székely, G. J., Rizzo, M. L., y Bakirov, N. K. (2007). Measuring and testing dependence by correlation of distances. The Annals of Statistics, 35(6), 2769-2794. https://doi.org/10.1214/009053607000000505
Therneau, T. M., Atkinson, E. J., y Ripley, B. (2013). rpart: Recursive Partitioning and Regression Trees. https://cran.r-project.org/package=rpart
Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological), 58(1), 267-288. https://doi.org/10.1111/j.2517-6161.1996.tb02080.x
Valiant, L. G. (1984). A Theory of the Learnable. Communications of the ACM, 27(11), 1134-1142. https://doi.org/10.1145/1968.1972
Van Rossum, G., y Drake Jr., F. L. (1991). Python reference manual. Instituto Nacional de Investigación en Matemáticas e Informática (CWI), Holanda.
Vapnik, V. N. (1998). Statistical Learning Theory. Wiley. https://www.wiley.com/en-us/Statistical+Learning+Theory-p-9780471030034
Vapnik, V. N. (2000). The Nature of Statistical Learning Theory. Springer. https://link.springer.com/book/10.1007/978-1-4757-3264-1
Venables, W. N., y Ripley, B. D. (2002). Modern Applied Statistics with S (4a. ed.). Springer. https://www.stats.ox.ac.uk/pub/MASS4/
Vinayak, R. K., y Gilad-Bachrach, R. (2015). DART: Dropouts meet multiple additive regression trees. Artificial Intelligence and Statistics, 489-497.
Wand, M. (2023). KernSmooth: Functions for Kernel Smoothing Supporting Wand and Jones (1995). https://cran.r-project.org/package=KernSmooth
Welch, B. L. (1939). Note on Discriminant Functions. Biometrika, 31(1/2), 218-220. https://doi.org/10.2307/2334985
Werbos, P. (1974). Beyond regression: New tools for prediction and analysis in the behavioral sciences. Tesis doctoral, Harvard University.
Williams, G. (2011). Data mining with Rattle and R: The art of excavating data for knowledge discovery. Springer Science & Business Media.
Williams, G. (2022). rattle: Graphical User Interface for Data Science in R. https://cran.r-project.org/package=rattle
Wold, S., Martens, H., y Wold, H. (1983). The multivariate calibration problem in chemistry solved by the PLS method. En Matrix pencils (pp. 286-293). Springer. https://doi.org/10.1007/bfb0062108
Wolpert, D. H., y Macready, W. G. (1997). No free lunch theorems for optimization. IEEE Transactions on Evolutionary Computation, 1(1), 67-82. https://doi.org/10.1109/4235.585893
Wood, S. N. (2017). Generalized Additive Models: An Introduction with R (2a. ed.). Chapman & Hall/CRC.
Zou, H., y Hastie, T. (2005). Regularization and Variable Selection via the Elastic Net. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 67(2), 301-320. https://doi.org/10.1111/j.1467-9868.2005.00503.x