Analysis for Built-in Dataset mtcars

We load the dataset

data(mtcars)
head(mtcars)

Suppose we consider a linear regression of mpg against disp. We first check the scatter plot.

plot(mtcars$disp, mtcars$mpg)

It shows a nonlinear trend. So we create two new columns by taking the log.

mtcars['logmpg'] = log(mtcars$mpg)
mtcars['logdisp'] = log(mtcars$disp)
plot(mtcars$logdisp, mtcars$logmpg)

Now the trend is linear. We run a linear regression model.

fit = lm(logmpg ~ logdisp, data=mtcars)
print(summary(fit))
## 
## Call:
## lm(formula = logmpg ~ logdisp, data = mtcars)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.22758 -0.08874 -0.00791  0.07970  0.32143 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  5.38097    0.20803   25.87  < 2e-16 ***
## logdisp     -0.45857    0.03913  -11.72 1.01e-12 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1282 on 30 degrees of freedom
## Multiple R-squared:  0.8207, Adjusted R-squared:  0.8148 
## F-statistic: 137.3 on 1 and 30 DF,  p-value: 1.006e-12

Both intercept and the slope are significant. \(R^2=0.8148\). The fitted model is \[logmpg = 5.38 - 0.459\times logdisp + \epsilon\quad\text{with}\ \epsilon\sim N(0, 0.1282^2).\] The residual plot is checked.

plot(fit$fitted.values, fit$residuals)

The residual plot is not too bad, though some unequal variances.

Then we check the QQ-plot.

qqnorm(fit$residuals)
qqline(fit$residuals)

The QQ-plot indicates the residuals are normal.

Or we can call plot directly.

plot(fit)