Copyright Marek Rychlik, 2009

Creation of the data frame

We define a dataset which contains the results of the following experiment: 12 steaks are randomly divided into 4 equal treatment groups, and they are packaged under various conditions (commercial wrap only, vacuum, mixed gas atmosphere, and CO2 atmosphere). The measured quantity is the logarithm of the number of bacteria on the surface of the steak.

> Treatment <- rep(c("Commercial", "Vacuum", "Mixed.Gas", "CO2"), each=3)
> LogCount <- c(7.66, 6.98, 7.80, 5.26, 5.44, 5.80, 7.41, 7.33, 7.04, 3.51, 2.91, 3.66)
> # stringsAsFactors = TRUE keeps Treatment a factor in R >= 4.0, where it is no longer the default
> steak.data <- data.frame(Treatment, LogCount, stringsAsFactors = TRUE)
> steak.data
    Treatment LogCount
1  Commercial     7.66
2  Commercial     6.98
3  Commercial     7.80
4      Vacuum     5.26
5      Vacuum     5.44
6      Vacuum     5.80
7   Mixed.Gas     7.41
8   Mixed.Gas     7.33
9   Mixed.Gas     7.04
10        CO2     3.51
11        CO2     2.91
12        CO2     3.66
> X11(pointsize = 6); opts <- options(); opts$texmacs$height <- 7.5; opts$texmacs$width=7.5; opts$texmacs$nox11=F; options(opts);plot(steak.data); v()
[Figure: plot of steak.data]

Picking of the contrast

We will test the hypothesis that CO2 is better than the average of all other treatments. We note that for Scheffé's test it is not necessary that the contrast is a planned contrast, i.e. that it has been devised in advance of the experiment to support a particular research hypothesis. We could perform the experiment, and then test the conjecture that the CO2 treatment is better. The test provides for testing such research hypotheses.

> steak.contrast <- c(-3, 1, 1, 1)

We note that with Scheffé's test we do not test the one-directional hypothesis, but the bi-directional one, i.e. we test whether the contrast is zero or non-zero. A small modification is needed to test the sign of the contrast.
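Because the coefficients of the contrast are matched to the group means by position, it may help to check which treatment each coefficient multiplies. The following is a minimal sketch, not part of the original session: the name `contrast.check` is hypothetical, and the sketch assumes R's default alphabetical ordering of factor levels.

```r
# Sketch only: label the contrast coefficients with the factor levels.
# Assumption: factor() sorts levels alphabetically (R's default), so the
# group means come out in the order CO2, Commercial, Mixed.Gas, Vacuum.
treatments <- factor(c("Commercial", "Vacuum", "Mixed.Gas", "CO2"))

# Hypothetical name; same coefficients as steak.contrast.
contrast.check <- setNames(c(-3, 1, 1, 1), levels(treatments))
print(contrast.check)

# A contrast must have coefficients summing to zero.
stopifnot(sum(contrast.check) == 0)

# The coefficient -3 should fall on CO2: the contrast is then the sum of
# the other three means minus three times the CO2 mean.
stopifnot(names(contrast.check)[1] == "CO2")
```

With this ordering, a positive value of the contrast means that the CO2 log count lies below the average of the other three treatments.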
Let $C$ be an unplanned contrast. Thus, $C$ is a linear combination of the means:

\[ C = \sum_{i=1}^{t} k_i \mu_i \]

Let $\hat{C}$ be the estimator of the contrast $C$:

\[ \hat{C} = \sum_{i=1}^{t} k_i \hat{\mu}_i \]

where $\hat{\mu}_i = \frac{1}{r}\sum_{j=1}^{r} X_{ij}$ is the group mean. Scheffé's test may be performed by first calculating the Studentized value of the contrast estimator $\hat{C}$:

\[ U = \frac{\hat{C}}{\sqrt{MSE \sum_{i=1}^{t} k_i^2 / r}} = \frac{\hat{C}}{\hat{\sigma}\,\|C\|} \]

where $\hat{\sigma} = \sqrt{MSE}$. We note that the denominator is the square root of the variance estimator, as

\[ Var(\hat{C}) = \sum_{i=1}^{t} k_i^2\, Var(\hat{\mu}_i) = \sum_{i=1}^{t} k_i^2 \frac{\sigma^2}{r} = \sigma^2 \|C\|^2 \]

where $\|C\|$ is the norm of the contrast with respect to the inner product

\[ \langle C, D \rangle = \sum_{i=1}^{t} \frac{k_i d_i}{r} \]

where $D = \sum_{i=1}^{t} d_i \mu_i$ is another arbitrary contrast. Scheffé's theorem says that, under the null hypothesis $C = 0$,

\[ P\left(|U| > U_\alpha\right) \le \alpha \]

if

\[ U_\alpha = \sqrt{(t-1)\, F_{\alpha,\,t-1,\,N-t}} \]

where $F_{\alpha,\nu_1,\nu_2}$ is the upper-$\alpha$ critical value of the Fisher $F$-distribution with degrees of freedom $\nu_1$ and $\nu_2$, i.e. the value exceeded with probability $\alpha$. We should note that Scheffé's critical value (defined by the last equation) gives only a bound: it is calibrated to hold simultaneously for all contrasts, and thus for a single contrast it is generally not true that $P(|U| > U_\alpha) = \alpha$.

Calculating groups and group means

Our data is given as a data frame, but it is easy to divide it into treatment groups and compute the group means:

> (steak.groups <- split(steak.data$LogCount, steak.data$Treatment))
$CO2
[1] 3.51 2.91 3.66

$Commercial
[1] 7.66 6.98 7.80

$Mixed.Gas
[1] 7.41 7.33 7.04

$Vacuum
[1] 5.26 5.44 5.80

> (steak.means <- as.numeric(lapply(steak.groups, mean)))
[1] 3.36 7.48 7.26 5.50

The usual parameters

We define the typical parameters for a completely randomized design: the number of treatments t, the total number of observations N, and the number of replicates per treatment r.

> (t <- length(levels(steak.data$Treatment)))
[1] 4
> (N <- length(steak.data$LogCount))
[1] 12
> (r <- N / t)
[1] 3

Degrees of freedom for Scheffé's Test

> (df1 <- t - 1)
[1] 3
> (df2 <- N - t)
[1] 8

Calculating the projection vector

The following two commands arrange for the group means to be in the correct order. We note that steak.means contains the group means in the sorted order of the factor levels that R internally computes.
Thus, we first convert the factor to a numeric vector, which results in a vector of indices within the array of factor levels. Then we apply indexing to list the means in the correct order, one per observation.

> (idx <- as.numeric(steak.data$Treatment))
 [1] 2 2 2 4 4 4 3 3 3 1 1 1
> (steak.proj <- steak.means[idx])
 [1] 7.48 7.48 7.48 5.50 5.50 5.50 7.26 7.26 7.26 3.36 3.36 3.36

Estimating standard error

> (SSE <- sum((steak.data$LogCount - steak.proj)^2))
[1] 0.9268
> (MSE <- SSE / (N - t))
[1] 0.11585
> (steak.se.estimator <- sqrt(MSE))
[1] 0.3403674

Calculating the contrast estimator

> (steak.contrast.estimator <- as.numeric(crossprod(steak.contrast, steak.means)))
[1] 10.16
> (steak.contrast.norm <- sqrt(sum(steak.contrast^2 / r)))
[1] 2
> (steak.contrast.se.estimator <- steak.se.estimator * steak.contrast.norm)
[1] 0.6807349

Calculating the Scheffé statistic

The first step is just to calculate the Studentized value of the contrast.

> (sheffe.statistic <- steak.contrast.estimator / steak.contrast.se.estimator)
[1] 14.92505

Finding the critical level

The critical level requires the upper quantile of the F-distribution, so we use the quantile function qf rather than the distribution function pf.

> (alpha.E <- 0.05)
[1] 0.05
> (F.value <- qf(alpha.E, lower.tail = FALSE, df1 = df1, df2 = df2))
[1] 4.066181
> (sheffe.critical.level <- sqrt((t - 1) * F.value))
[1] 3.492641
> abs(sheffe.statistic) > sheffe.critical.level
[1] TRUE

Thus, we reject the null hypothesis. Let us calculate the significance level (p-value) corresponding to the value of the statistic:

> pf(sheffe.statistic^2 / (t - 1), df1 = df1, df2 = df2, lower.tail = F)
[1] 3.50535e-06

Comparison with the Student t-test

If our contrast were a planned contrast, i.e. if we had known in advance that we will test its equality to 0, we could use the Student t-test.
The calculations are as follows. The critical level is the upper quantile of the t-distribution, so we use the quantile function qt rather than the distribution function pt.

> t.statistic <- sheffe.statistic
> (t.critical.level <- qt(alpha.E / 2, df = df2, lower.tail = FALSE))
[1] 2.306004
> pt(abs(t.statistic), df = df2, lower.tail = F)
[1] 2.003003e-07

As we can see, the significance level (p-value) resulting from this t-test is smaller by an order of magnitude. The value printed above is the one-sided tail probability; the two-sided p-value is twice as large, which does not change the conclusion. It is clear that there are situations when $H_0$ is rejected by the t-test, and not rejected by Scheffé's test. Careful theoretical analysis would show that the opposite cannot happen.
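The steps of this session can be collected into a single function. The following is a minimal sketch under stated assumptions, not the author's code: the function name `scheffe.contrast.test` and its arguments are hypothetical, the design is assumed to be balanced, and the contrast coefficients are assumed to be given in the sorted order of the factor levels.

```r
# Sketch only, not part of the original session. The function name
# scheffe.contrast.test and its arguments are hypothetical.
# Assumptions: balanced design (every group has the same number r of
# replicates) and contrast coefficients given in factor-level order.
scheffe.contrast.test <- function(y, group, contrast, alpha = 0.05) {
  group <- factor(group)
  n.groups <- nlevels(group)                 # t in the text
  N <- length(y)
  r <- N / n.groups                          # replicates per group
  stopifnot(all(table(group) == r),
            length(contrast) == n.groups,
            abs(sum(contrast)) < 1e-12)

  means <- as.numeric(tapply(y, group, mean))  # group means, level order
  fitted <- means[as.integer(group)]           # projection vector
  MSE <- sum((y - fitted)^2) / (N - n.groups)

  estimate <- sum(contrast * means)            # contrast estimator
  se <- sqrt(MSE) * sqrt(sum(contrast^2 / r))  # sigma-hat times norm
  U <- estimate / se                           # Studentized contrast

  df1 <- n.groups - 1
  df2 <- N - n.groups
  critical <- sqrt(df1 * qf(alpha, df1, df2, lower.tail = FALSE))
  p.value <- pf(U^2 / df1, df1, df2, lower.tail = FALSE)

  list(estimate = estimate, se = se, statistic = U,
       critical = critical, p.value = p.value,
       reject = abs(U) > critical)
}

# Usage with the steak data from this document.
Treatment <- rep(c("Commercial", "Vacuum", "Mixed.Gas", "CO2"), each = 3)
LogCount <- c(7.66, 6.98, 7.80, 5.26, 5.44, 5.80,
              7.41, 7.33, 7.04, 3.51, 2.91, 3.66)
result <- scheffe.contrast.test(LogCount, Treatment, c(-3, 1, 1, 1))
print(result$statistic)   # the session reports 14.92505 for this value
print(result$reject)
```

Returning a list keeps the intermediate quantities (estimate, standard error, critical level) available for comparison with the step-by-step values computed in the session.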