Consider a random variable \(X\) that has a \(t\) distribution with \(7\) degrees of freedom. Calculate \(P[X > 1.3]\).

`1 - pt(1.3, df = 7)`

`## [1] 0.1173839`

`pt(1.3, df = 7, lower.tail = FALSE)`

`## [1] 0.1173839`

Consider a random variable \(Y\) that has a \(t\) distribution with \(9\) degrees of freedom. Find \(c\) such that \(P[Y > c] = 0.025\).

`qt(1 - 0.025, df = 9)`

`## [1] 2.262157`

`qt(0.025, df = 9, lower.tail = FALSE)`

`## [1] 2.262157`
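As a quick sanity check, the critical value can be plugged back into the upper-tail probability, which should recover \(0.025\). The name `c_val` is introduced here only for illustration (it avoids masking `R`'s `c()` function).

```r
# Critical value found above (upper-tail area of 0.025, 9 degrees of freedom)
c_val = qt(0.025, df = 9, lower.tail = FALSE)

# Plugging it back into the upper-tail probability should return 0.025
pt(c_val, df = 9, lower.tail = FALSE)
```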

For this Exercise, use the built-in `trees` dataset in `R`. Fit a simple linear regression model with `Girth` as the response and `Height` as the predictor. What is the p-value for testing \(H_0: \beta_1 = 0\) vs \(H_1: \beta_1 \neq 0\)?

```
tree_model = lm(Girth ~ Height, data = trees)
summary(tree_model)$coefficients["Height", "Pr(>|t|)"]
```

`## [1] 0.002757815`
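The same p-value can be rebuilt by hand from the estimate and its standard error, which may help show where the number comes from. This is a minimal sketch; the names `coefs`, `t_stat`, `df_resid`, and `p_manual` are illustrative and not part of the original solution.

```r
tree_model = lm(Girth ~ Height, data = trees)
coefs = summary(tree_model)$coefficients

# Test statistic: estimate divided by its standard error
t_stat = coefs["Height", "Estimate"] / coefs["Height", "Std. Error"]

# Residual degrees of freedom, n - 2 for simple linear regression
df_resid = df.residual(tree_model)

# Two-sided p-value from the t distribution
p_manual = 2 * pt(abs(t_stat), df = df_resid, lower.tail = FALSE)
p_manual
```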

Continue using the SLR model you fit in Exercise 3. What is the length of a 90% confidence interval for \(\beta_1\)?

```
tree_model = lm(Girth ~ Height, data = trees)
ci_beta_1 = confint(tree_model, parm = "Height", level = 0.90)
ci_beta_1[2] - ci_beta_1[1]
```

`## [1] 0.2656018`
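Because the interval is symmetric about \(\hat{\beta}_1\), its length should equal twice the margin of error. A sketch of that check, with `se_beta_1`, `crit`, and `ci_length` as illustrative names:

```r
tree_model = lm(Girth ~ Height, data = trees)

# Standard error of the slope estimate
se_beta_1 = summary(tree_model)$coefficients["Height", "Std. Error"]

# Critical value for a 90% interval: 5% in each tail
crit = qt(0.95, df = df.residual(tree_model))

# Length of the interval = 2 * critical value * standard error
ci_length = 2 * crit * se_beta_1
ci_length
```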

Continue using the SLR model you fit in Exercise 3. Calculate a 95% confidence interval for the mean tree girth of a tree that is 79 feet tall. Report the upper bound of this interval.

```
tree_model = lm(Girth ~ Height, data = trees)
predict(tree_model, newdata = data.frame(Height = 79), interval = "confidence")[, "upr"]
```

`## [1] 15.12646`

Consider a random variable \(X\) that has a \(t\) distribution with \(5\) degrees of freedom. Calculate \(P[|X| > 2.1]\).

`pt(-2.1, df = 5) + pt(2.1, df = 5, lower.tail = FALSE)`

`## [1] 0.08975325`

`2 * pt(2.1, df = 5, lower.tail = FALSE)`

`## [1] 0.08975325`

Calculate the critical value used for a 90% confidence interval about the slope parameter of a simple linear regression model that is fit to 10 observations. (Your answer should be a positive value.)

```
conf_level = 0.90
sig_level = 1 - conf_level
n = 10
abs(qt(sig_level / 2, df = n - 2))
```

`## [1] 1.859548`

Consider the true simple linear regression model

\[ Y_i = 5 + 4 x_i + \epsilon_i \qquad \epsilon_i \sim N(0, \sigma^2 = 4) \qquad i = 1, 2, \ldots, 20 \]

Given \(S_{xx} = 1.5\), calculate the probability of observing data according to this model, fitting the SLR model, and obtaining an estimate of the slope parameter greater than 4.2. In other words, calculate

\[ P[\hat{\beta}_1 > 4.2] \]

```
Sxx = 1.5
beta_1 = 4   # true slope
sigma = 2    # sigma ^ 2 = 4
e_beta_1_hat = beta_1   # the slope estimator is unbiased
sd_beta_1_hat = sqrt(sigma ^ 2 / Sxx)
pnorm(4.2, mean = e_beta_1_hat, sd = sd_beta_1_hat, lower.tail = FALSE)
```

`## [1] 0.4512616`

This calculation uses the sampling distribution of the slope estimate:

\[ \hat{\beta}_1 \sim N\left( \beta_1, \frac{\sigma^2}{S_{xx}} \right) \]
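The probability can also be approximated by simulation. The sketch below assumes a hypothetical set of predictor values, rescaled so that \(S_{xx} = 1.5\) (the exercise does not give the \(x_i\)); the seed, the number of replications, and the helper `sim_slope()` are likewise illustrative choices.

```r
set.seed(42)
n = 20

# Hypothetical predictor values, rescaled so that Sxx = 1.5
x_raw = seq(0, 1, length.out = n)
x = x_raw * sqrt(1.5 / sum((x_raw - mean(x_raw)) ^ 2))
Sxx = sum((x - mean(x)) ^ 2)

# Simulate one dataset from the true model and return the fitted slope
sim_slope = function() {
  y = 5 + 4 * x + rnorm(n, mean = 0, sd = 2)
  coef(lm(y ~ x))[["x"]]
}

# Proportion of simulated slope estimates that exceed 4.2
beta_1_hats = replicate(5000, sim_slope())
mean(beta_1_hats > 4.2)
```

The simulated proportion should land close to the exact value from `pnorm()`, with some Monte Carlo error.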

For Exercises 9 - 13, use the `faithful` dataset, which is built into `R`.

Suppose we would like to predict the duration of an eruption of the Old Faithful geyser in Yellowstone National Park based on the waiting time before an eruption. Fit a simple linear model in `R` that accomplishes this task.

What is the value of \(\text{SE}[\hat{\beta}_1]\)?

```
faithful_model = lm(eruptions ~ waiting, data = faithful)
summary(faithful_model)$coefficients["waiting", "Std. Error"]
```

`## [1] 0.002218541`

What is the value of the test statistic for testing \(H_0: \beta_0 = 0\) vs \(H_1: \beta_0 \neq 0\)?

```
faithful_model = lm(eruptions ~ waiting, data = faithful)
summary(faithful_model)$coefficients["(Intercept)", "t value"]
```

`## [1] -11.70212`

What is the value of the test statistic for testing \(H_0: \beta_1 = 0\) vs \(H_1: \beta_1 \neq 0\)?

```
faithful_model = lm(eruptions ~ waiting, data = faithful)
summary(faithful_model)$coefficients["waiting", "t value"]
```

`## [1] 34.08904`

Test \(H_0: \beta_1 = 0\) vs \(H_1: \beta_1 \neq 0\) with \(\alpha = 0.01\). What decision do you make?

- Fail to reject \(H_0\)
- Reject \(H_0\)
- Reject \(H_1\)
- Not enough information

```
faithful_model = lm(eruptions ~ waiting, data = faithful)
summary(faithful_model)$coefficients["waiting", "Pr(>|t|)"]
```

`## [1] 8.129959e-100`

- Fail to reject \(H_0\)
- **Reject \(H_0\)**
- Reject \(H_1\)
- Not enough information
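The decision follows from comparing the p-value to \(\alpha\). A minimal sketch of that rule, where `p_val`, `alpha`, and `decision` are illustrative names:

```r
faithful_model = lm(eruptions ~ waiting, data = faithful)
p_val = summary(faithful_model)$coefficients["waiting", "Pr(>|t|)"]
alpha = 0.01

# Reject H0 when the p-value falls below the significance level
decision = if (p_val < alpha) "Reject H0" else "Fail to reject H0"
decision
```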

Calculate a 90% confidence interval for \(\beta_0\). Report the upper bound of this interval.

```
faithful_model = lm(eruptions ~ waiting, data = faithful)
confint(faithful_model, parm = "(Intercept)", level = 0.90)[, 2]
```

`## [1] -1.609697`

For this Exercise, use the `Orange` dataset, which is built into `R`.

Use a simple linear regression model to create a 90% confidence interval for the change in mean circumference of orange trees in millimeters when age is increased by 1 day. Report the lower bound of this interval.

```
orange_model = lm(circumference ~ age, data = Orange)
confint(orange_model, parm = "age", level = 0.90)[, 1]
```

`## [1] 0.0927633`

For this Exercise, use the `Orange` dataset, which is built into `R`.

Use a simple linear regression model to create a 90% confidence interval for the mean circumference of orange trees in millimeters when the age is 250 days. Report the lower bound of this interval.

```
orange_model = lm(circumference ~ age, data = Orange)
predict(orange_model, interval = "confidence", newdata = data.frame(age = 250), level = 0.90)[, "lwr"]
```

`## [1] 32.48418`

For this Exercise, use the `cats` dataset from the `MASS` package.

Use a simple linear regression model to create a 99% prediction interval for a cat's heart weight in grams if its body weight is 2.5 kilograms. Report the upper bound of this interval.

```
library(MASS)
cat_model = lm(Hwt ~ Bwt, data = cats)
predict(cat_model, interval = "prediction", level = 0.99,
        newdata = data.frame(Bwt = 2.5))[, "upr"]
```

`## [1] 13.53644`
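The same bound can be assembled by hand from the usual prediction interval formula, \(\hat{y}(x) \pm t_{\alpha/2, n-2} \cdot s_e \sqrt{1 + \frac{1}{n} + \frac{(x - \bar{x})^2}{S_{xx}}}\). This is a sketch for illustration; the names `x_new`, `s_e`, `y_hat`, `crit`, and `upr_manual` are not from the original solution.

```r
library(MASS)
cat_model = lm(Hwt ~ Bwt, data = cats)

n = nrow(cats)
x_new = 2.5
s_e = summary(cat_model)$sigma                 # residual standard error
Sxx = sum((cats$Bwt - mean(cats$Bwt)) ^ 2)

# Point estimate at x_new and the 99% critical value
y_hat = sum(coef(cat_model) * c(1, x_new))
crit = qt(0.995, df = n - 2)

# Upper bound of the prediction interval
upr_manual = y_hat + crit * s_e * sqrt(1 + 1 / n + (x_new - mean(cats$Bwt)) ^ 2 / Sxx)
upr_manual
```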

Consider a 90% confidence interval for the mean response and a 90% prediction interval, both at the same \(x\) value. Which interval is narrower?

- Confidence interval
- Prediction interval
- Not enough information, it depends on the value of \(x\)

- **Confidence interval**
- Prediction interval
- Not enough information, it depends on the value of \(x\)
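This can be checked numerically on any fitted model. The sketch below uses the `faithful` model from the earlier exercises with a few arbitrarily chosen waiting times; `new_obs`, `conf_int`, `pred_int`, `ci_width`, and `pi_width` are illustrative names.

```r
faithful_model = lm(eruptions ~ waiting, data = faithful)

# A few arbitrary x values at which to compare the two intervals
new_obs = data.frame(waiting = c(50, 70, 90))

conf_int = predict(faithful_model, newdata = new_obs, interval = "confidence", level = 0.90)
pred_int = predict(faithful_model, newdata = new_obs, interval = "prediction", level = 0.90)

ci_width = conf_int[, "upr"] - conf_int[, "lwr"]
pi_width = pred_int[, "upr"] - pred_int[, "lwr"]

# TRUE at every x: the confidence interval is the narrower one
ci_width < pi_width
```

The prediction interval accounts for the variability of a new observation in addition to the uncertainty in the estimated mean, so it is wider at every \(x\).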

Suppose you obtain a 99% confidence interval for \(\beta_1\) that is \((-0.4, 5.2)\). Now test \(H_0: \beta_1 = 0\) vs \(H_1: \beta_1 \neq 0\) with \(\alpha = 0.01\). What decision do you make?

- Fail to reject \(H_0\)
- Reject \(H_0\)
- Reject \(H_1\)
- Not enough information

- **Fail to reject \(H_0\)**
- Reject \(H_0\)
- Reject \(H_1\)
- Not enough information
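The reasoning is that a two-sided test at level \(\alpha\) rejects \(H_0\) exactly when the hypothesized value falls outside the matching \((1 - \alpha)\) confidence interval. A small sketch of that check, with `ci`, `null_value`, and `contains_null` as illustrative names:

```r
# 99% confidence interval from the question
ci = c(lower = -0.4, upper = 5.2)
null_value = 0

# The test at alpha = 0.01 fails to reject when the interval covers the null value
contains_null = ci[["lower"]] <= null_value && null_value <= ci[["upper"]]
if (contains_null) "Fail to reject H0" else "Reject H0"
```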

Suppose you test \(H_0: \beta_1 = 0\) vs \(H_1: \beta_1 \neq 0\) with \(\alpha = 0.01\) and fail to reject \(H_0\). Indicate all of the following that must always be true:

- There is no relationship between the response and the predictor.
- The probability of observing the estimated value of \(\beta_1\) (or something more extreme) is greater than \(0.01\) if we assume that \(\beta_1 = 0\).
- The value of \(\hat{\beta}_1\) is very small. For example, it could not be 1.2.
- The probability that \(\beta_1 = 0\) is very high.
- We would also fail to reject at \(\alpha = 0.05\).

- There is no relationship between the response and the predictor.
- **The probability of observing the estimated value of \(\beta_1\) (or something more extreme) is greater than \(0.01\) if we assume that \(\beta_1 = 0\).**
- The value of \(\hat{\beta}_1\) is very small. For example, it could not be 1.2.
- The probability that \(\beta_1 = 0\) is very high.
- We would also fail to reject at \(\alpha = 0.05\).

Consider a 95% confidence interval for the mean response calculated at \(x = 6\). If instead we calculate the interval at \(x = 7\), mark each value that would change:

- Point Estimate
- Critical Value
- Standard Error

- **Point Estimate**
- Critical Value
- **Standard Error**
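Writing out the usual form of the interval makes the three pieces explicit. The symbols \(s_e\) (residual standard error) and \(t_{\alpha/2, n-2}\) (critical value) are notation assumed here rather than taken from the exercise:

\[ \underbrace{\hat{\beta}_0 + \hat{\beta}_1 x}_{\text{point estimate}} \pm \underbrace{t_{\alpha/2, n - 2}}_{\text{critical value}} \cdot \underbrace{s_e \sqrt{\frac{1}{n} + \frac{(x - \bar{x})^2}{S_{xx}}}}_{\text{standard error}} \]

The point estimate and the standard error both involve \(x\), while the critical value depends only on the confidence level and the sample size.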