Modification of the bodyfat dataset for classification.
The response bfan is a factor with levels No and Yes indicating a body fat
value above the normal range (bodyfat > 24).
The variable bodyfat was dropped for convenience (bfan takes its place), and
two new variables were computed: bmi (body mass index, in kg/m^2) and bmi2
(alternate body mass index, in kg^1.2/m^3.3); see the examples below.
bfan
A data frame with 246 rows and 16 columns:
Body fat above normal range
Age (years)
Weight (kg)
Height (cm)
Neck circumference (cm)
Chest circumference (cm)
Abdomen circumference (cm)
Hip circumference (cm)
Thigh circumference (cm)
Knee circumference (cm)
Ankle circumference (cm)
Biceps (extended) circumference (cm)
Forearm circumference (cm)
Wrist circumference (cm)
Body mass index (kg/m^2)
Alternate body mass index (kg^1.2/m^3.3)
StatLib Datasets Archive: https://lib.stat.cmu.edu/datasets/bodyfat.
See bodyfat and bodyfat.raw for details.
Penrose, K., Nelson, A. and Fisher, A. (1985). Generalized Body Composition Prediction Equation for Men Using Simple Measurement Techniques. Medicine and Science in Sports and Exercise, 17(2), 189. doi:10.1249/00005768-198504000-00037.
bfan <- bodyfat
# Body fat above normal: recode bodyfat (column 1) > 24 as a No/Yes factor
bfan[1] <- factor(bfan$bodyfat > 24, levels = c(FALSE, TRUE),
                  labels = c("No", "Yes"))
names(bfan)[1] <- "bfan"
bfan$bmi <- with(bfan, weight/(height/100)^2)
bfan$bmi2 <- with(bfan, weight^1.2/(height/100)^3.3)
fit <- glm(bfan ~ abdomen, family = binomial, data = bfan)
summary(fit)
#>
#> Call:
#> glm(formula = bfan ~ abdomen, family = binomial, data = bfan)
#>
#> Coefficients:
#>              Estimate Std. Error z value Pr(>|z|)
#> (Intercept) -23.76718    3.06157  -7.763 8.29e-15 ***
#> abdomen       0.24109    0.03172   7.600 2.95e-14 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> (Dispersion parameter for binomial family taken to be 1)
#>
#> Null deviance: 300.88 on 245 degrees of freedom
#> Residual deviance: 171.37 on 244 degrees of freedom
#> AIC: 175.37
#>
#> Number of Fisher Scoring iterations: 6
#>
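The fitted coefficients above are on the log-odds scale, so a short sketch of how they might be read can help. The block below copies the two estimates from the printed summary (so it runs without the dataset) and derives the odds ratio per centimetre of abdomen circumference, the predicted probability at one abdomen value, and the abdomen value at which the predicted probability is 0.5. The names b0, b1, odds_ratio, p_100 and abdomen_50 are illustrative, and the value 100 cm is an arbitrary choice, not something taken from the data.

```r
# Coefficients copied from the summary(fit) output above
b0 <- -23.76718  # intercept (log-odds at abdomen = 0)
b1 <- 0.24109    # change in log-odds per 1 cm of abdomen circumference

# Odds ratio for a 1 cm increase in abdomen circumference
odds_ratio <- exp(b1)

# Predicted probability of bfan == "Yes" at abdomen = 100 cm
# (100 is an arbitrary illustration value)
p_100 <- plogis(b0 + b1 * 100)

# Abdomen circumference at which the predicted probability equals 0.5,
# i.e. where the linear predictor b0 + b1 * abdomen is zero
abdomen_50 <- -b0 / b1

round(c(odds_ratio = odds_ratio, p_100 = p_100, abdomen_50 = abdomen_50), 3)
```

With the fitted object at hand, exp(coef(fit)) and predict(fit, newdata = data.frame(abdomen = 100), type = "response") should give the same quantities without copying numbers by hand.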