We will use the Sepal.Length column from the iris dataset in R.

x = iris$Sepal.Length 
print(c(mean(x), sd(x)))
## [1] 5.8433333 0.8280661

The sample mean is 5.84 and the sample standard deviation is 0.828.

Two-sided T-test

We suppose the population distribution for the sepal length is normal with unknown mean \(\mu\) and unknown standard deviation.

Consider the hypothesis testing: \[H_0: \mu = 6\quad\text{ v.s. }\quad H_a:\mu\neq 6\]

Manual Implementation

We first compute the T statistic, whose value is computed as -2.317 below.

xbar = mean(x) # sample mean
s = sd(x) # sample s.d.
n = length(x) # sample size
mu0 = 6 # null hypothesis 

z = (xbar - mu0) / (s / sqrt(n)) # z statistic
print(z)
## [1] -2.317166

Then we compute the p-value, which is 0.0219 < 0.05.

p.value = 2 * (1 - pt(abs(z), df=n-1))
print(p.value)
## [1] 0.02185662

Therefore, we reject the null with significance level 0.05.

Use t.test function

The t.test function is built-in in R.

test = t.test(x, alternative="two.sided", mu=6, conf.level=0.95)
print(test)
## 
##  One Sample t-test
## 
## data:  x
## t = -2.3172, df = 149, p-value = 0.02186
## alternative hypothesis: true mean is not equal to 6
## 95 percent confidence interval:
##  5.709732 5.976934
## sample estimates:
## mean of x 
##  5.843333

The output of the t-test gives the confidence interval of the population mean, which dose not include the null hypothesis. Also, the p-value is computed. So we reject the null.

Sample size determination

Power and sample size determination are included in the built-in function ``power.t.test” in R. To compute the power at \(\mu_1=5.8\):

t.power = power.t.test(n=n, delta=-0.2, sd=s, 
                       type="one.sample", alternative="two.sided")
print(t.power)
## 
##      One-sample t test power calculation 
## 
##               n = 150
##           delta = 0.2
##              sd = 0.8280661
##       sig.level = 0.05
##           power = 0.8362106
##     alternative = two.sided

To find the minimum sample size to ensure a power of 0.8 at \(\mu=5.8\), we compute

n.min = power.t.test(delta=-0.2, sd=s, power=0.8, 
                     type="one.sample", alternative="two.sided")
print(n.min)
## 
##      One-sample t test power calculation 
## 
##               n = 136.4813
##           delta = 0.2
##              sd = 0.8280661
##       sig.level = 0.05
##           power = 0.8
##     alternative = two.sided

Therefore, the minimum sample size is 137.

Or, we can compute it by the approximated value below.

n.size = ceiling((s * (qt(0.025, n-1) - qt(0.8, n-1)) / (5.8 - 6) ) ** 2)
print(n.size)
## [1] 137

One-sided T-test

Now we consider the following test: \[H_0: \mu = 6\quad\text{ v.s. }\quad H_a:\mu < 6\]

Manual Implementation

We will reuse the T-statistic we computed before. We adjust the p-value to the new hypothesis testing.

p.value.new = pt(z, n-1)
print(p.value.new)
## [1] 0.01092831

The p-value is 0.011 < 0.05. So we reject the null.

Use t.test function

test.new = t.test(x, alternative="less", mu=6, conf.level=0.95)
print(test.new)
## 
##  One Sample t-test
## 
## data:  x
## t = -2.3172, df = 149, p-value = 0.01093
## alternative hypothesis: true mean is less than 6
## 95 percent confidence interval:
##     -Inf 5.95524
## sample estimates:
## mean of x 
##  5.843333

It gives the same p-value and same conclusion — reject null.

Sample Size Determination

We use the ``power.t.test” function to compute the minimum sample size to ensure a power of 0.8 at \(\mu_1=5.8\).

n.min.new = power.t.test(delta=0.2, sd=s, power=0.8, 
                     type="one.sample", alternative="one.sided")
print(n.min.new)
## 
##      One-sample t test power calculation 
## 
##               n = 107.3493
##           delta = 0.2
##              sd = 0.8280661
##       sig.level = 0.05
##           power = 0.8
##     alternative = one.sided

The minimum sample size is 108.

Two-sample T-test

Unpaired T-test

First, we want to compare whether the average Sepal.Length for Setosa is the same as that of versicolor. We prepare the two samples:

data1 = iris[iris$Species=="setosa", 1]
data2 = iris[iris$Species=="versicolor", 1]

Now we run the built-in t.test function, assuming unequal variance.

out.test = t.test(data1, data2, alternative="two.sided", paired=F, var.equal=F)
print(out.test)
## 
##  Welch Two Sample t-test
## 
## data:  data1 and data2
## t = -10.521, df = 86.538, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -1.1057074 -0.7542926
## sample estimates:
## mean of x mean of y 
##     5.006     5.936

The p-value is smaller than 0.05. Therefore, there is a significant difference in sepal length between the two species.

We can also run the pooled t-test by assuming equal variance.

out.test = t.test(data1, data2, alternative="two.sided", paired=F, var.equal=T)
print(out.test)
## 
##  Two Sample t-test
## 
## data:  data1 and data2
## t = -10.521, df = 98, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -1.1054165 -0.7545835
## sample estimates:
## mean of x mean of y 
##     5.006     5.936

The result is the same.

Paired T-test

Now we test whether there is any difference between the average Sepal.Length and the average Petal.Length for iris. First, we prepare the data.

data1 = iris$Sepal.Length
data2 = iris$Petal.Length
dif = data1 - data2

We can run a paired t-test using the built-in t.test function.

out.test = t.test(data1, data2, alternative="two.sided", paired=T)
print(out.test)
## 
##  Paired t-test
## 
## data:  data1 and data2
## t = 22.813, df = 149, p-value < 2.2e-16
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  1.904708 2.265959
## sample estimates:
## mean difference 
##        2.085333

The outcome says there is significant difference between the two lengths.

Equivalently, we can run a one-sample t-test on the difference.

out.test = t.test(dif, alternative="two.sided")
print(out.test)
## 
##  One Sample t-test
## 
## data:  dif
## t = 22.813, df = 149, p-value < 2.2e-16
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
##  1.904708 2.265959
## sample estimates:
## mean of x 
##  2.085333

The output is exactly the same as the paired t-test.