<\body> ||<\author-address> >> Let >, ,k> be a sequence of vectors in >. Let us consider the sequence of vector subspaces spanned by the consecutive vectors: <\eqnarray*> >||},>>|>||,v},>>||>|>|>||,v\,v}.>>>> For simplicity, we assume that the set }> is linearly independent, although it is important to consider linearly dependent sets, too. The consequence of independence is that: <\equation*> V\V\\\V is a strictly increasing sequence of vector spaces. \; We can find a sequence of mutually orthogonal vectors >, ,p> such that: <\equation*> V= span{q,q,\,q}, \ \ \ \ i=1,2,\,k. That is, the sequence ,v,\,v}> can be replaced with a sequence of orthogonal vectors. Sometimes one assumes that the vectors are normalized, which is easy to achieve as an extra step at the end. Let (v)> denote the projection of a vector on a vector subspace >. We define the vectors > by induction as follows: <\eqnarray*> >||,>>|>||-proj> (v),>>||>|>|>||-proj>( v),>>||>|>|>||-proj>(v).>>>> The projection on a space > spanned by orthogonal vectors ,q,\,q>} can be expressed explicitly as follows: <\equation*> proj>(v)=v,q\|\q,q\>q. The angle bracket notation u,v\> denotes the dot product, although could be replaced by any that is . For the purpose of this article, we assume the standard dot product: <\equation*> \u,v\=uv=uv=>|>|>|>>>>>>>|>>|>>|>>>>>. The above version of the algorithm does not produce normalized vectors. However, these three formulas do: <\eqnarray*> >||v\<\|\|\>>v,>>|>||-\v,q\q,>>|>||\<\|\|\>>.>>>> An easy analysis shows that the vectors it produces are the normalized versions of the previous algorithm. Since q>,q\=1> at every step, there is no need to divide by q,q\>. The results of Gram-Schmidt process may be compactly written down as the QR-decomposition. First, we express > through >: <\eqnarray*> >||q,>>|>||q+r q,>>||>|>|>||q+rq+\+rq,>>||>|>|>||q+rq+\+rq.>>>> By comparing with the expressions for the projections we find that <\equation*> r=v,q\|\q,q\>,1\i\k, \ 1\j\i. We define matrix by combining vectors > as columns of : <\equation*> A=\|v\|\\|v>>>>> Similarly, we combine vectors > into a matrix : <\equation*> Q=\|q\|\\|q>>>>>. The matrix is formed from coefficients >: <\equation*> R=>>>>>. It is easy to check that: <\equation*> A=Q R. The fact that the columns of are orthogonal may be expressed as: <\equation*> QQ=diag(\<\|\|\>q\<\|\|\>,\<\|\|\>q\<\|\|\>,\,\<\|\|\>q\<\|\|\>). Moreover, if we use the alternative version of the algorithm that normalizes vectors >, we obtain a more easy to remember equation: <\equation*> QQ=I where > is the k> identity matrix. <\example> Let us consider the following problem: Given three normal, mean 0, independent, identically distributed random variables (IID's) >, > and >. Let =3>=X+X+X> be the sample mean. We ask for two complementary normal variables > and > which are independent of > and of each other. \; We define consider three vectors =(1,1,1)>, =(0,1,0)> and =(0,0,1)>. > is chosen so that =vX>, where ,X,X)>. Vectors > and > can be chosen more or less at random, but for the purpose of this example we need them to be simple, so we chose two elements of the standard basis of >. \; We apply the Gram-Shmidt process to our vectors. We obtain: <\eqnarray*> >||>|>||v,q\|\q,q\>q=(0,1,0)-(1,1,1)>>|||-,,-.>>|>||v,q\|\q,\>q-v,q\|\q,q\>q>>|||(1,1,1)-|-++->-,,->>|||1,1,1+-,,->>|||,,+-,,-=-,0,.>>>> We instantiate matrices , and : <\eqnarray*> |||>|>>||>|>||>|>>>>>,>>||||>|>>|||>>|||>>>>,>>|||||>|||>|||>>>>.>>|||>>> It can be verified directly that and Q=diag3,,>. If we normalize the columns of , we must multiply the rows of by the corresponding norms of the columns of , i.e. factors 3>, >> and >>: <\eqnarray*> ||>>|>>|>>>|>>||>>|>|>>|>>|>>>>>>>>|||>|>>|>>>|||>>|>>>|||>>>>>>>>|||>>> It is obvious why we avoid normalization in by-hand calculations. The answer to our statistics problem is that >, > and > are defined using the coefficients of >, > and > (the normalized variant): <\eqnarray*> >||X=>X+>X+>X,>>|>||X=->X+|>X->X,>>|>||X=->X+>X.>>>> A direct application is to rewrite the Studentized mean >> as a ratio with independent numerator end denominator: <\equation*> t=>|>>>>=|+Y)/2>>. This is the classical representation of the Student -statistic as ratio of a normally distributed random variable by the square root of a >-distributed variable with 2 degrees of freedom, divided degrees of freedom. To show this, we need the equation: <\equation*> CovcX, dX=cd=\c,d\. Using this identity we verify that: <\equation*> Cov(Y,Y)=\ \ \ \ \ (Kronecker delta). We leave the remaining easy calculations to the reader. We need to express >>> using > and >. We have: <\equation*> s>>=+Z+Z |2> where: <\eqnarray> >||-(X+X+X),>>|>||-(X+X+X),>>|>||-(X+X+X).>>>> We rewrite this as a single matrix equation where <\eqnarray*> >|>|>>|>|>|>>|>|>|>>>>>.>||>>> It is good to understand on a more conceptual level. We first note that <\equation*> Z=X->qX,i=1,2,3. These equations can be rewritten in matrix form: <\equation*> Z=X->>|>|>>>>qX=X-qqX=(I-qq)X. For any normalized vector >, the matrix q> is the matrix of the linear transformation: the orthogonal projection onto the 1-dimensional space =span{v}>. (We denoted this transformation >> before.) Indeed, for any vector > we have: <\equation*> P v=qq>v=q(qv)=(qv)q=\q,v\q. Thus V>. Also, q> because of the following calculation: <\equation*> \v-P v,q>\=\v,q\-\\q,v\q,q\=\v,q\-\q,v\\q,q\=0. In the above calculation, we used the fact that \,\\> is a , i.e. it is linear in each argument. By definition, the bilinearity means that for all scalars ,\\> and all vectors >: <\enumerate> \v+\w,z\=\\v,z\+\\w,z\.> v,\w+\z\=\\v,w\+\\v,z\.> We also used q,q\=1>. \; The next observation is that if we have an orthogonal basis ,q,q}> of > then the following holds: <\equation*> I=qq>. (Applying both sides to an arbitrary vector > we obtain a longer version of the above: <\equation*> v=\q,v\q This expresses the fact that every vector is the sum of its projections onto vectors >, . Last two equations are, of course equivalent.) Thus, <\equation*> B=I-qq=qq+qq is the projection onto ,q}>, the orthogonal complement of > (denoted }>)>. \; In order to rewrite >>> in terms of > and >, we neeed to express Z\<\|\|\>> in terms of >. We observe that X> by definition (equivalent to =qX)>. Thus, . Hence . We find the product : <\equation*> B Q=(B q\|B q\|B q)=\|q>>>>>. (Projection onto ,q}> maps > to the zero vector, and it maps > and > to themselves. Also, a matrix product like can be computed by applying to columns of separately.) \; Let > denote the matrix with the first column dropped: <\equation*> Q=>|>>>>>. Thus, >>>>>>>. Let =(Y,Y)> denote with > dropped. We have: <\equation*> Z=B Q Y=>>>>>>>|>>>>>= Clearly: <\equation*> \<\|\|\>Z\<\|\|\>=\<\|\|\>\<\|\|\>=()()=|~>(|~>)= I==\<\|\|\>\<\|\|\>. In other words: <\equation*> Z+Z+Z=Y+Y. Finally, <\equation*> s>>=+Z+Z|2>=+Y|2>. and <\equation*> t=|+Y)/2>>. The function computes the QR-decomposition. <\input| >> A \- matrix(c(1,1,1,0,1,0,0,0,1),nrow=3,ncol=3) <\input| >> A <\output> \ \ \ \ \ [,1] [,2] [,3] [1,] \ \ \ 1 \ \ \ 0 \ \ \ 0 [2,] \ \ \ 1 \ \ \ 1 \ \ \ 0 [3,] \ \ \ 1 \ \ \ 0 \ \ \ 1 <\input| >> qr \- qr(A) <\input| >> qr <\output> \$qr \ \ \ \ \ \ \ \ \ \ \ [,1] \ \ \ \ \ \ [,2] \ \ \ \ \ \ [,3] [1,] -1.7320508 -0.5773503 -0.5773503 [2,] \ 0.5773503 -0.8164966 \ 0.4082483 [3,] \ 0.5773503 -0.2588190 \ 0.7071068 \; \$rank [1] 3 \; \$qraux [1] 1.5773503 1.9659258 0.7071068 \; \$pivot [1] 1 2 3 \; attr(,"class") [1] "qr" <\input| >> Q \- qr.Q(qr) <\input| >> R \- qr.R(qr) <\input| >> Q <\output> \ \ \ \ \ \ \ \ \ \ \ [,1] \ \ \ \ \ \ [,2] \ \ \ \ \ \ [,3] [1,] -0.5773503 \ 0.4082483 -0.7071068 [2,] -0.5773503 -0.8164966 \ 0.0000000 [3,] -0.5773503 \ 0.4082483 \ 0.7071068 <\input| >> R <\output> \ \ \ \ \ \ \ \ \ \ [,1] \ \ \ \ \ \ [,2] \ \ \ \ \ \ [,3] [1,] -1.732051 -0.5773503 -0.5773503 [2,] \ 0.000000 -0.8164966 \ 0.4082483 [3,] \ 0.000000 \ 0.0000000 \ 0.7071068 <\input| >> A - Q %*% R <\output> \ \ \ \ \ \ \ \ \ \ \ \ \ [,1] \ \ \ \ \ \ \ \ \ [,2] \ \ \ \ \ \ \ \ \ [,3] [1,] 1.110223e-16 -1.110223e-16 \ 1.110223e-16 [2,] 0.000000e+00 \ 0.000000e+00 -1.110223e-16 [3,] 0.000000e+00 -5.551115e-17 \ 0.000000e+00 <\input| >> \; > We notice that there is at least sign difference between our answer calculated by hand and the answer given by the computer. This is an inherent ambiguity of the QR-decomposition, that is determined up to flipping the sign of the columns (if we assume Q=I>). Let us check our answer further: <\input| >> Q.byhand = matrix(c(1/sqrt(3),-1/sqrt(6),-1/sqrt(2),1/sqrt(3),sqrt(2)/sqrt(3),0,1/sqrt(3),-1/sqrt(6),1/sqrt(2)),nrow=3,ncol=3,byrow=T) <\input| >> Q.byhand <\output> \ \ \ \ \ \ \ \ \ \ [,1] \ \ \ \ \ \ [,2] \ \ \ \ \ \ [,3] [1,] 0.5773503 -0.4082483 -0.7071068 [2,] 0.5773503 \ 0.8164966 \ 0.0000000 [3,] 0.5773503 -0.4082483 \ 0.7071068 <\input| >> R.byhand = matrix(c(sqrt(3),1/sqrt(3),1/sqrt(3),0,sqrt(2)/sqrt(3),-1/sqrt(6),0,0,1/sqrt(2)),nrow=3,ncol=3,byrow=T) <\input| >> R.byhand <\output> \ \ \ \ \ \ \ \ \ [,1] \ \ \ \ \ [,2] \ \ \ \ \ \ [,3] [1,] 1.732051 0.5773503 \ 0.5773503 [2,] 0.000000 0.8164966 -0.4082483 [3,] 0.000000 0.0000000 \ 0.7071068 <\input| >> crossprod(Q.byhand,Q.byhand) <\output> \ \ \ \ \ [,1] [,2] [,3] [1,] \ \ \ 1 \ \ \ 0 \ \ \ 0 [2,] \ \ \ 0 \ \ \ 1 \ \ \ 0 [3,] \ \ \ 0 \ \ \ 0 \ \ \ 1 <\input| >> A-Q.byhand %*% R.byhand <\output> \ \ \ \ \ [,1] \ \ \ \ \ \ \ \ \ [,2] \ \ \ \ \ \ \ \ \ [,3] [1,] \ \ \ 0 \ 0.000000e+00 -2.220446e-16 [2,] \ \ \ 0 -2.220446e-16 \ 0.000000e+00 [3,] \ \ \ 0 \ 0.000000e+00 \ 0.000000e+00 <\input| >> \; > So, indeed, the only difference is that of the sign of the first column of , and the resulting differences in . We see by both calculations lead to equation within the machine round-off error. \; <\initial> <\collection> <\references> <\collection> > > > > > > <\auxiliary> <\collection> <\associate|toc> |math-font-series||Gram-Schmidt Process> |.>>>>|> |math-font-series||The algorithm> |.>>>>|> |math-font-series||An alternative form of the algorithm> |.>>>>|> |math-font-series||Rewriting the result as QR-decomposition> |.>>>>|> |math-font-series||A statistics example> |.>>>>|> |math-font-series||Computing QR-decompositions with R> |.>>>>|>