Modification of the dataset analysed in Penrose et al. (1985). Lists estimates of the percentage of body fat determined by underwater weighing and various body measurements for 246 men.
bodyfat
A data frame with 246 rows and 14 columns:
Percent body fat (from Siri's 1956 equation)
Age (years)
Weight (kg)
Height (cm)
Neck circumference (cm)
Chest circumference (cm)
Abdomen circumference (cm)
Hip circumference (cm)
Thigh circumference (cm)
Knee circumference (cm)
Ankle circumference (cm)
Biceps (extended) circumference (cm)
Forearm circumference (cm)
Wrist circumference (cm)
StatLib Datasets Archive: https://lib.stat.cmu.edu/datasets/bodyfat.
This data set can be used to illustrate multiple regression techniques (e.g. Johnson 1996). Body fat percentage is usually estimated from body density, which is not easy to measure, so it is desirable to have a simpler method that allows it to be estimated from body measurements.
bodyfat.raw contains the original data.
According to Johnson (1996), there were data entry errors (cases 42, 48, 76, 96 and 182 of the original data) and he suggested some rules to correct them. These outliers were removed in the bodyfat dataset, as well as an influential observation (case 39, which has a big effect on regression estimates). Additionally, the variable density was dropped for convenience, and variables height and weight were transformed into metric units (centimetres and kilograms) for consistency.
See bodyfat.raw for more details.
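The modifications described above can be sketched on a toy data frame. This is a minimal illustration, not the code used to build the dataset: the data frame `raw`, its `case` column and all of its values are made up, and the raw units (pounds and inches) are an assumption inferred from the conversion to kilograms and centimetres. Siri's (1956) equation is taken in its usual form, percent body fat = 495 / density - 450.

```r
# Toy stand-in for the original data; all values are made up for illustration.
raw <- data.frame(
  case    = c(38, 39, 40, 42),          # hypothetical case numbers
  density = c(1.07, 1.05, 1.03, 1.09),  # body density (g/cm^3)
  weight  = c(150, 180, 200, 165),      # assumed to be in pounds
  height  = c(68, 70, 72, 66)           # assumed to be in inches
)

# Cases listed in the documentation: five data entry errors plus the
# influential observation (case 39).
drop_cases <- c(39, 42, 48, 76, 96, 182)
clean <- raw[!raw$case %in% drop_cases, ]

# Siri's (1956) equation: percent body fat from body density.
clean$bodyfat <- 495 / clean$density - 450

# Convert to metric units.
clean$weight <- clean$weight * 0.45359237  # pounds -> kilograms
clean$height <- clean$height * 2.54        # inches -> centimetres

# Drop the density column, as in the modified dataset.
clean$density <- NULL

print(clean)
```

In this sketch only the two toy rows whose case numbers are not in `drop_cases` survive, and the result no longer has a `density` column.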
Johnson, R. W. (1996). Fitting Percentage of Body Fat to Simple Body Measurements. Journal of Statistics Education, 4(1). doi:10.1080/10691898.1996.11910505.
Penrose, K., Nelson, A. and Fisher, A. (1985). Generalized Body Composition Prediction Equation for Men Using Simple Measurement Techniques. Medicine and Science in Sports and Exercise, 17(2), 189. doi:10.1249/00005768-198504000-00037.
# Simple linear regression of percent body fat on abdomen circumference
fit <- lm(bodyfat ~ abdomen, data = bodyfat)
summary(fit)
#>
#> Call:
#> lm(formula = bodyfat ~ abdomen, data = bodyfat)
#>
#> Residuals:
#>      Min       1Q   Median       3Q      Max
#> -10.9377  -3.5413   0.1526   3.1426  12.7569
#>
#> Coefficients:
#>              Estimate Std. Error t value Pr(>|t|)
#> (Intercept) -42.56252    2.75889  -15.43   <2e-16 ***
#> abdomen       0.66779    0.02967   22.51   <2e-16 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 4.699 on 244 degrees of freedom
#> Multiple R-squared: 0.675, Adjusted R-squared: 0.6736
#> F-statistic: 506.7 on 1 and 244 DF, p-value: < 2.2e-16
#>
# Scatter plot of the data with the fitted regression line
plot(bodyfat ~ abdomen, data = bodyfat)
abline(fit)
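Since the data are meant to illustrate multiple regression, the single-predictor fit above can be extended with further measurements. The sketch below is one possible way to do this: `compare_bodyfat_models()` is a hypothetical helper that is not part of any package, and the column names `weight` and `height` are assumed to match the variables named in this documentation.

```r
# Hypothetical helper (not part of the package): fits the single-predictor
# model and a multiple regression that adds weight and height, then compares
# the two nested models.
compare_bodyfat_models <- function(data) {
  simple   <- lm(bodyfat ~ abdomen, data = data)
  multiple <- lm(bodyfat ~ abdomen + weight + height, data = data)
  list(
    simple     = simple,
    multiple   = multiple,
    # F-test for the nested models: do weight and height improve the fit?
    comparison = anova(simple, multiple)
  )
}

# Usage, assuming the bodyfat data frame documented here is loaded:
if (exists("bodyfat") && is.data.frame(bodyfat)) {
  models <- compare_bodyfat_models(bodyfat)
  print(summary(models$multiple))
  print(models$comparison)
}
```

The `anova()` comparison is only one way to judge the larger model. Which predictors to include is a modelling decision that this sketch does not settle.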