A modification of the dataset analysed in Penrose et al. (1985). It lists estimates of the percentage of body fat, determined by underwater weighing, together with various body measurements for 246 men.

`bodyfat`

A data frame with 246 rows and 14 columns:

- bodyfat
Percent body fat (from Siri's 1956 equation)

- age
Age (years)

- weight
Weight (kg)

- height
Height (cm)

- neck
Neck circumference (cm)

- chest
Chest circumference (cm)

- abdomen
Abdomen circumference (cm)

- hip
Hip circumference (cm)

- thigh
Thigh circumference (cm)

- knee
Knee circumference (cm)

- ankle
Ankle circumference (cm)

- biceps
Biceps (extended) circumference (cm)

- forearm
Forearm circumference (cm)

- wrist
Wrist circumference (cm)
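The `bodyfat` column is derived from body density through Siri's (1956) equation, percent fat = 495 / density - 450, with density in g/cm³. A minimal sketch of that conversion follows; the function name `siri_percent_fat` is hypothetical and not part of this package.

```r
# Siri's (1956) equation: percent body fat from body density (g/cm^3).
# `siri_percent_fat` is an illustrative helper, not a package function.
siri_percent_fat <- function(density) {
  stopifnot(is.numeric(density), all(density > 0))
  495 / density - 450
}

# Vectorised over densities
siri_percent_fat(c(1.03, 1.05, 1.07))
```

Higher densities map to lower fat percentages, which is why underwater weighing (a density measurement) can be used to estimate body fat.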

StatLib Datasets Archive: https://lib.stat.cmu.edu/datasets/bodyfat.

This dataset can be used to illustrate multiple regression techniques (e.g. Johnson 1996). Body fat percentage is usually estimated from body density, which is not easy to measure, so it is desirable to have a simpler method that allows it to be estimated from body measurements.
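As a rough illustration of the multiple regression idea, the sketch below fits body fat on several measurements at once. To stay self-contained it uses a simulated stand-in data frame `sim`; the simulated values and coefficients are arbitrary assumptions and say nothing about the real data. Only the column names are borrowed from `bodyfat`.

```r
# Simulated stand-in data (NOT the real bodyfat data); values are arbitrary
set.seed(1)
n <- 100
sim <- data.frame(
  abdomen = rnorm(n, mean = 92, sd = 10),  # cm
  weight  = rnorm(n, mean = 80, sd = 12),  # kg
  wrist   = rnorm(n, mean = 18, sd = 1)    # cm
)
sim$bodyfat <- -40 + 0.9 * sim$abdomen - 0.2 * sim$weight -
  1.0 * sim$wrist + rnorm(n, sd = 4)

# Multiple regression: several body measurements as predictors
fit_multi <- lm(bodyfat ~ abdomen + weight + wrist, data = sim)
coef(fit_multi)
```

With the package data loaded, replacing `data = sim` with `data = bodyfat` should fit the same model form to the real measurements.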

`bodyfat.raw` contains the original data.
According to Johnson (1996), there were data entry errors (cases 42, 48, 76,
96 and 182 of the original data) and he suggested some rules to correct them.
These outliers were removed in the `bodyfat` dataset, as well as an influential
observation (case 39, which has a big effect on regression estimates).
Additionally, the variable `density` was dropped for convenience, and variables
`height` and `weight` were transformed into metric units (centimetres and
kilograms) for consistency.

See `bodyfat.raw` for more details.
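The cleaning steps described above (dropping flagged cases, dropping `density`, converting to metric units) could look roughly like the following. This is a sketch on a toy data frame, not the code used to build `bodyfat`: the `case` column, the toy values, the excluded case numbers, and the assumption that the raw `weight` and `height` are in pounds and inches are all illustrative.

```r
# Toy stand-in for the raw data. Column names, values, and the imperial
# units (pounds, inches) are assumptions made for illustration only.
raw <- data.frame(
  case    = 1:5,
  density = c(1.07, 1.05, 1.03, 1.06, 1.04),
  weight  = c(150, 180, 200, 165, 220),  # pounds (assumed)
  height  = c(68, 70, 72, 66, 74)        # inches (assumed)
)

# Hypothetical case numbers to exclude (stand-ins for the cases listed above)
drop_cases <- c(2, 5)

clean <- raw[!raw$case %in% drop_cases, ]  # remove flagged cases
clean$density <- NULL                      # drop the density variable
clean$weight <- clean$weight * 0.45359237  # pounds -> kilograms
clean$height <- clean$height * 2.54        # inches -> centimetres
clean$case <- NULL                         # helper column no longer needed
clean
```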

Johnson, R. W. (1996). Fitting Percentage of Body Fat to Simple Body
Measurements. *Journal of Statistics Education*, 4(1).
doi:10.1080/10691898.1996.11910505.

Penrose, K., Nelson, A. and Fisher, A. (1985). Generalized Body Composition
Prediction Equation for Men Using Simple Measurement Techniques.
*Medicine and Science in Sports and Exercise*, 17(2), 189.
doi:10.1249/00005768-198504000-00037.

```
fit <- lm(bodyfat ~ abdomen, bodyfat)
summary(fit)
#>
#> Call:
#> lm(formula = bodyfat ~ abdomen, data = bodyfat)
#>
#> Residuals:
#>      Min       1Q   Median       3Q      Max
#> -10.9377  -3.5413   0.1526   3.1426  12.7569
#>
#> Coefficients:
#>              Estimate Std. Error t value Pr(>|t|)
#> (Intercept) -42.56252    2.75889  -15.43   <2e-16 ***
#> abdomen       0.66779    0.02967   22.51   <2e-16 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 4.699 on 244 degrees of freedom
#> Multiple R-squared: 0.675, Adjusted R-squared: 0.6736
#> F-statistic: 506.7 on 1 and 244 DF, p-value: < 2.2e-16
#>
plot(bodyfat ~ abdomen, bodyfat)
abline(fit)
```