1 Introduction to Tensors - Sorin Mitranmitran-lab.amath.unc.edu/.../introduction_to_tensors.pdf · 2020. 11. 19. · 1 Introduction to Tensors In elementary physics, we often come


“CM˙Final” — 2015/3/13 — 11:09 — page 1 — #20 ii


1 Introduction to Tensors

In elementary physics, we often come across two classes of quantities, namely scalars and vectors. Mass, density and temperature are examples of scalar quantities, while velocity and acceleration are examples of vector quantities. A scalar quantity's value does not depend on the choice of the coordinate system. Similarly, although the components of a vector depend on a particular choice of coordinate system, the vector itself is invariant, and has an existence independent of the choice of coordinate system. In this chapter, we generalize the concept of a scalar and vector to that of a tensor. In this general framework, scalars are considered as zeroth-order tensors and vectors as first-order tensors. Tensor quantities of order two and greater, similar to scalars and vectors, have an existence independent of the coordinate system. Their components, however, just as in the case of vectors, depend on the choice of coordinate system. We will see that the governing field equations of continuum mechanics can be written as tensorial equations. The advantage of writing the field equations in such ‘coordinate-free’ notation is that it is immediately obvious that these equations are valid no matter what the choice of coordinate system is. A particular coordinate system is invoked only while solving a particular problem, at which point the appropriate form of the differential operators and the components of the tensors with respect to the chosen coordinate system are used. It must be borne in mind, however, that although using tensorial notation shows the ‘coordinate-free’ nature of the governing equations in a given frame of reference, it does not address the issue of how the equations transform under a change of frame of reference. This aspect will be discussed in greater detail later in this book.

We now present a review of tensors. Throughout the text, scalars are denoted by lightface letters, vectors by boldface lower-case letters, and second and higher-order tensors by boldface capital letters. As a notational convention, summation over repeated indices is assumed, with the indices ranging from 1 to 3. Thus, for example, $u_i v_i$ represents $u_1 v_1 + u_2 v_2 + u_3 v_3$, and $T_{ij} n_j$ represents $T_{i1} n_1 + T_{i2} n_2 + T_{i3} n_3$. The quantity on the right-hand side of a ‘:=’ symbol defines the quantity on its left-hand side. A function on ‘$V \times V$ to $V$’ means that the function is defined in terms of two elements that belong to $V$, and the result is also in $V$.

1.1 Vector Spaces

In what follows, we consider only real vector spaces. We denote the set of real numbers by $\mathbb{R}$. A vector space (or linear space) is a set, say $V$, equipped with an addition function on $V \times V$ to $V$ (denoted by $+$), and a scalar multiplication function on $\mathbb{R} \times V$ to $V$, which satisfy the following conditions:

Downloaded from https://www.cambridge.org/core. The University of North Carolina Chapel Hill Libraries, on 20 Aug 2019 at 17:43:31, subject to the Cambridge Core terms of use, available at https://www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781316134054.002


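1. Commutativity: For all $\mathbf{u}, \mathbf{v} \in V$,

$$\mathbf{u} + \mathbf{v} = \mathbf{v} + \mathbf{u}.$$

2. Associativity: For all $\mathbf{u}, \mathbf{v}, \mathbf{w} \in V$,

$$(\mathbf{u} + \mathbf{v}) + \mathbf{w} = \mathbf{u} + (\mathbf{v} + \mathbf{w}).$$

3. Existence of a zero element: There exists $\mathbf{0} \in V$ such that, for all $\mathbf{u} \in V$,

$$\mathbf{u} + \mathbf{0} = \mathbf{u}.$$

4. Existence of negative elements: For each $\mathbf{u} \in V$, there exists a negative element denoted $-\mathbf{u}$ in $V$ such that

$$\mathbf{u} + (-\mathbf{u}) = \mathbf{0}.$$

5. Distributivity with respect to addition of vectors: For all $\alpha \in \mathbb{R}$, and $\mathbf{u}, \mathbf{v} \in V$,

$$\alpha(\mathbf{u} + \mathbf{v}) = \alpha\mathbf{u} + \alpha\mathbf{v}.$$

6. Distributivity with respect to scalar addition: For all $\alpha, \beta \in \mathbb{R}$, and for all $\mathbf{u} \in V$,

$$(\alpha + \beta)\mathbf{u} = \alpha\mathbf{u} + \beta\mathbf{u}.$$

7. Associativity of scalar multiplication: For all $\alpha, \beta \in \mathbb{R}$, and for all $\mathbf{u} \in V$,

$$\alpha(\beta\mathbf{u}) = (\alpha\beta)\mathbf{u}.$$

8. Identity in scalar multiplication: For all $\mathbf{u} \in V$,

$$1\mathbf{u} = \mathbf{u}.$$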

Since a vector space is a group with respect to addition (see Section 1.12), with $\mathbf{0}$ and $-\mathbf{u}$ playing the roles of the neutral and inverse elements, respectively, all the results derived for groups are applicable to vector spaces. In particular, the zero element of $V$ and the negative element $-\mathbf{u}$ corresponding to a given $\mathbf{u} \in V$ are unique. Note also that $\alpha\mathbf{u} + \beta\mathbf{v} \in V$ for all $\alpha, \beta \in \mathbb{R}$ and all $\mathbf{u}, \mathbf{v} \in V$.

Perhaps the best-known example of a vector space is the $n$-dimensional coordinate space $\mathbb{R}^n$, which is defined by

$$\mathbb{R}^n := \{(u_1, u_2, \ldots, u_n) : u_i \in \mathbb{R}\}.$$

For $\mathbb{R}^n$, addition and scalar multiplication are defined by the relations

$$\mathbf{u} + \mathbf{v} := (u_1 + v_1, u_2 + v_2, \ldots, u_n + v_n),$$
$$\alpha\mathbf{u} := (\alpha u_1, \alpha u_2, \ldots, \alpha u_n).$$




If $p$ is a positive real number, then another example of a vector space is the space

$$L^p[0, 1] := \left\{ f : \int_0^1 |f|^p \, dx < \infty \right\},$$

with addition and scalar multiplication defined by

$$(f + g)(x) := f(x) + g(x), \quad x \in [0, 1],$$
$$(\alpha f)(x) := \alpha [f(x)], \quad x \in [0, 1].$$

To show that $L^p[0, 1]$ is a vector space, we need to show that $f + g \in L^p[0, 1]$ if $f, g \in L^p[0, 1]$. This follows from the inequality

$$|f + g|^p \le [2 \max(|f|, |g|)]^p \le 2^p \left[ |f|^p + |g|^p \right].$$

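A subset $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_m\}$ of $V$ is said to be linearly dependent if and only if there exist scalars $\alpha_1, \alpha_2, \ldots, \alpha_m$, not all zero, such that

$$\alpha_1\mathbf{u}_1 + \alpha_2\mathbf{u}_2 + \cdots + \alpha_m\mathbf{u}_m = \mathbf{0}.$$

Thus, a subset $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_m\}$ of $V$ is linearly independent if and only if the equation

$$\alpha_1\mathbf{u}_1 + \alpha_2\mathbf{u}_2 + \cdots + \alpha_m\mathbf{u}_m = \mathbf{0}$$

implies that $\alpha_1 = \alpha_2 = \cdots = \alpha_m = 0$. A subset, say $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_m, \mathbf{0}\}$, which includes the zero element is always linearly dependent, even when the subset $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_m\}$ is linearly independent, since the coefficient of the zero element can be taken to be nonzero.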

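A subset $\{\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n\}$ of $V$ is said to be a basis for $V$ if

1. $\{\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n\}$ is linearly independent, and

2. Any element of $V$ can be expressed as a linear combination of $\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n$, i.e., if $\mathbf{u} \in V$, then

$$\mathbf{u} = u_1\mathbf{e}_1 + u_2\mathbf{e}_2 + \cdots + u_n\mathbf{e}_n,$$

where the scalars $u_1, u_2, \ldots, u_n$ are known as the components of $\mathbf{u}$ with respect to the basis $\{\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n\}$.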

If the bases have a finite number of elements, we have the following theorem:

Theorem 1.1.1. All bases for a given vector space contain the same number of elements.

Proof. Suppose that $\{\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n\}$ and $\{\mathbf{e}^*_1, \mathbf{e}^*_2, \ldots, \mathbf{e}^*_m\}$ are bases for a vector space. Every $\mathbf{e}^*_i$, since it is an element of the vector space, can be expressed in terms of the basis $\{\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n\}$ as $\mathbf{e}^*_i = \beta_{ij}\mathbf{e}_j$, $\beta_{ij} \in \mathbb{R}$ (in this proof, a repeated index $i$ is summed from 1 to $m$, and a repeated index $j$ from 1 to $n$). Thus,

$$\alpha_i\mathbf{e}^*_i = \beta_{ij}\alpha_i\mathbf{e}_j = \mathbf{0}$$

implies, by the linear independence of $\{\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n\}$, that

$$\beta_{ij}\alpha_i = 0, \quad j = 1, 2, \ldots, n.$$

Let $m > n$. Then the number of unknowns $\alpha_i$ exceeds the number of equations, so that it is possible to find a nontrivial solution. Thus, there exist $\alpha_1, \alpha_2, \ldots, \alpha_m$, not all zero, such that

$$\alpha_i\mathbf{e}^*_i = \mathbf{0},$$

i.e., $\{\mathbf{e}^*_1, \mathbf{e}^*_2, \ldots, \mathbf{e}^*_m\}$ is linearly dependent, contradicting the fact that it is a basis. Hence, $m \le n$. Reversing the roles of $\mathbf{e}^*_i$ and $\mathbf{e}_i$ in the above argument shows in the same way that $m \ge n$. Hence, $m = n$.

In view of the above result, a vector space $V$ is said to be $n$-dimensional if it has a basis with $n$ elements. If no such finite integer $n$ exists, then the vector space is said to be infinite-dimensional. For example, $\mathbb{R}^n$ is finite-dimensional with

$$\mathbf{e}_1 = (1, 0, 0, \ldots, 0), \quad \mathbf{e}_2 = (0, 1, 0, \ldots, 0), \quad \ldots, \quad \mathbf{e}_n = (0, 0, 0, \ldots, 1),$$

as the ‘canonical’ or natural basis. On the other hand, $L^p[0, 1]$ is an infinite-dimensional vector space. Note that even in the finite-dimensional case, the basis need not be unique. When $V$ is finite-dimensional, using the linear independence of $\{\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n\}$, it is easy to show that the components of any element $\mathbf{u}$ are unique.

Let $V$ be an $n$-dimensional vector space. A subset $\{\mathbf{f}_1, \mathbf{f}_2, \ldots, \mathbf{f}_m\}$ of $V$, where

1. $m > n$ cannot be a basis for $V$, since, as can be shown by following the same methodology as used in the proof of Theorem 1.1.1, it is linearly dependent.

2. $m < n$ cannot be a basis for $V$, since by Theorem 1.1.1, any basis has $n$ elements. Although the subset $\{\mathbf{f}_1, \mathbf{f}_2, \ldots, \mathbf{f}_m\}$ can be linearly independent in this case, an arbitrary element of $V$ cannot be expressed in terms of the elements of this subset.

Thus, $m = n$ is a necessary condition for the subset $\{\mathbf{f}_1, \mathbf{f}_2, \ldots, \mathbf{f}_m\}$ to be a basis. The following theorem is useful in finding whether a given subset with $n$ elements is a basis for an $n$-dimensional vector space:

Theorem 1.1.2. Let $\{\mathbf{f}_1, \mathbf{f}_2, \ldots, \mathbf{f}_n\}$ be a subset of an $n$-dimensional vector space $V$, one of whose bases is given by $\{\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n\}$. Since $\{\mathbf{f}_1, \mathbf{f}_2, \ldots, \mathbf{f}_n\}$ is a subset of $V$, each $\mathbf{f}_i$ can be expressed as a linear combination of the basis vectors $\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n$, i.e., $\mathbf{f}_i = \beta_{ij}\mathbf{e}_j$, $i, j = 1, 2, \ldots, n$, $\beta_{ij} \in \mathbb{R}$. The following statements are equivalent:

1. $\{\mathbf{f}_1, \mathbf{f}_2, \ldots, \mathbf{f}_n\}$ is a basis.

2. $\{\mathbf{f}_1, \mathbf{f}_2, \ldots, \mathbf{f}_n\}$ is linearly independent.

3. $\det[\beta_{ij}] \neq 0$.

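Proof. We prove that (1) $\Rightarrow$ (2) $\Rightarrow$ (3) $\Rightarrow$ (1).

If $\{\mathbf{f}_1, \mathbf{f}_2, \ldots, \mathbf{f}_n\}$ is a basis, then it is linearly independent by definition.

To prove that (2) $\Rightarrow$ (3), we show that ‘not (3)’ $\Rightarrow$ ‘not (2)’. Thus, let $\det[\beta_{ij}] = 0$. From matrix algebra, it follows that there exist $\alpha_i$, $i = 1, 2, \ldots, n$, not all zero, such that $\beta_{ij}\alpha_i = 0$. Thus, there exist $\alpha_i$, not all zero, such that

$$\alpha_i\mathbf{f}_i = \alpha_i\beta_{ij}\mathbf{e}_j = \mathbf{0},$$

which implies that the set $\{\mathbf{f}_1, \mathbf{f}_2, \ldots, \mathbf{f}_n\}$ is linearly dependent.

To prove that (3) $\Rightarrow$ (1), note that $\alpha_i\mathbf{f}_i = \beta_{ij}\alpha_i\mathbf{e}_j = \mathbf{0}$ implies, since $\{\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n\}$ is a basis (and hence a linearly independent set), that $\beta_{ij}\alpha_i = 0$, $j = 1, 2, \ldots, n$, and since $\det[\beta_{ij}] \neq 0$, it follows that all the $\alpha_i$ are zero, thus showing that $\{\mathbf{f}_1, \mathbf{f}_2, \ldots, \mathbf{f}_n\}$ is linearly independent. In addition, since $\det[\beta_{ij}] \neq 0$, the matrix $[\beta_{ij}]$ is invertible. Let $[\gamma_{ij}]$ denote the components of this inverse, i.e., $\gamma_{ij}\beta_{jk} = \delta_{ik}$. Then

$$\gamma_{ij}\mathbf{f}_j = \gamma_{ij}\beta_{jk}\mathbf{e}_k = \delta_{ik}\mathbf{e}_k = \mathbf{e}_i.$$

Since $\{\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n\}$ is a basis of $V$, every element $\mathbf{u}$ of $V$ can be expressed as $u_i\mathbf{e}_i$, so that on using the above relation, we have

$$\mathbf{u} = u_i\mathbf{e}_i = \gamma_{ij}u_i\mathbf{f}_j.$$

Thus, the set $\{\mathbf{f}_1, \mathbf{f}_2, \ldots, \mathbf{f}_n\}$ satisfies the two conditions for qualifying as a basis.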
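As a small numerical illustration of the equivalence between the last two statements of Theorem 1.1.2, the following Python sketch tests whether three vectors form a basis of three-dimensional space by checking that the determinant of their component matrix is nonzero. The helper names `det3` and `is_basis`, the restriction to $n = 3$, and the tolerance `tol` are assumptions made for this example and do not come from the text.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def is_basis(beta, tol=1e-12):
    """Row i of beta holds the components beta_ij of f_i in the basis e_j."""
    return abs(det3(beta)) > tol


# f_1 = e_1 + e_2, f_2 = e_2 + e_3, f_3 = e_1 + e_3: the determinant is 2.
independent = [[1.0, 1.0, 0.0], [0.0, 1.0, 1.0], [1.0, 0.0, 1.0]]
# The second row is twice the first, so the determinant vanishes.
dependent = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [0.0, 1.0, 0.0]]

print(is_basis(independent))  # True
print(is_basis(dependent))    # False
```

In floating-point arithmetic an exact comparison with zero is fragile, which is why the sketch compares against a small tolerance instead.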

A nonempty subset $V_s$ of a vector space $V$ is said to be a linear subspace if a linear combination of any two of its elements also lies in $V_s$, i.e., if $(\alpha\mathbf{u} + \beta\mathbf{v}) \in V_s$ for arbitrary $\mathbf{u}, \mathbf{v} \in V_s$ and $\alpha, \beta \in \mathbb{R}$. For example, $\mathbb{R}^2$, identified with the set of elements of the form $(u_1, u_2, 0)$, is a linear subspace of $\mathbb{R}^3$. By assuming the operations of addition and scalar multiplication to be the same as those for the ‘parent space’ $V$, it can be shown by verifying the defining axioms that a linear subspace is itself a vector space.

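The set of all linear combinations of a subset $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_m\}$ of $V$ is called the linear span of $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_m\}$, i.e.,

$$\mathrm{Lsp}\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_m\} := \{\mathbf{u} : \mathbf{u} = \alpha_1\mathbf{u}_1 + \alpha_2\mathbf{u}_2 + \cdots + \alpha_m\mathbf{u}_m, \ \alpha_i \in \mathbb{R}\}.$$

From this definition, it immediately follows that $\mathrm{Lsp}\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_m\}$ is a linear subspace of $V$.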

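An inner product space (or Euclidean space) is a vector space $V$ equipped with a function on $V \times V$ to $\mathbb{R}$, denoted by $(\mathbf{u}, \mathbf{v})$ and called the inner product (also called scalar product or dot product) of $\mathbf{u}$ and $\mathbf{v}$, that satisfies the following conditions:

$$(\mathbf{u}, \mathbf{v}) = (\mathbf{v}, \mathbf{u}) \quad \forall \mathbf{u}, \mathbf{v} \in V, \tag{1.1a}$$
$$(\alpha\mathbf{u} + \beta\mathbf{v}, \mathbf{w}) = \alpha(\mathbf{u}, \mathbf{w}) + \beta(\mathbf{v}, \mathbf{w}) \quad \forall \alpha, \beta \in \mathbb{R} \text{ and } \mathbf{u}, \mathbf{v}, \mathbf{w} \in V, \tag{1.1b}$$
$$(\mathbf{u}, \mathbf{u}) \ge 0 \quad \forall \mathbf{u} \in V, \text{ with } (\mathbf{u}, \mathbf{u}) = 0 \text{ if and only if } \mathbf{u} = \mathbf{0}. \tag{1.1c}$$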

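The magnitude $|\mathbf{u}|$ is defined by

$$|\mathbf{u}| := (\mathbf{u}, \mathbf{u})^{1/2}.$$

From the above definition and the properties of the inner product, it follows that

$$|\alpha\mathbf{u}| = |\alpha|\,|\mathbf{u}| \quad \forall \alpha \in \mathbb{R} \text{ and } \mathbf{u} \in V, \qquad |\mathbf{u}| \ge 0 \text{ with } |\mathbf{u}| = 0 \text{ if and only if } \mathbf{u} = \mathbf{0}. \tag{1.2}$$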

Both $\mathbb{R}^n$ and $L^2[0, 1]$ are inner product spaces. In the case of $\mathbb{R}^n$, the inner product can be defined by

$$(\mathbf{u}, \mathbf{v}) := u_1 v_1 + u_2 v_2 + \cdots + u_n v_n.$$

The choice of inner product is not unique. If $\mathbf{S}$ is a symmetric, positive definite $n \times n$ matrix, then another choice of inner product for $\mathbb{R}^n$ is

$$(\mathbf{u}, \mathbf{v}) := \sum_{i=1}^{n} \sum_{j=1}^{n} S_{ij} u_i v_j.$$

The canonical inner product for $L^2[0, 1]$ is defined by

$$(f, g) := \int_0^1 f(x) g(x) \, dx.$$

We now prove the following important inequality:

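Theorem 1.1.3 (Cauchy–Schwarz Inequality). Let $V$ be an inner product space. Then

$$(\mathbf{u}, \mathbf{v})^2 \le (\mathbf{u}, \mathbf{u})(\mathbf{v}, \mathbf{v}) \quad \forall \mathbf{u}, \mathbf{v} \in V, \tag{1.3}$$

with equality if and only if $\mathbf{u}$ and $\mathbf{v}$ are linearly dependent.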

Proof. If $(\mathbf{u}, \mathbf{v}) = 0$, then the above inequality is obvious. If it is not zero, then $\mathbf{u}$ and $\mathbf{v}$ are nonzero vectors, so that $(\mathbf{u}, \mathbf{u})$ and $(\mathbf{v}, \mathbf{v})$ are nonzero. By Eqn. (1.1c), we have

$$(\mathbf{u} - \alpha\mathbf{v}, \mathbf{u} - \alpha\mathbf{v}) \ge 0 \quad \forall \alpha \in \mathbb{R}.$$

Expanding this inequality using the properties of the inner product, we get

$$(\mathbf{u}, \mathbf{u}) - 2\alpha(\mathbf{u}, \mathbf{v}) + \alpha^2(\mathbf{v}, \mathbf{v}) \ge 0 \quad \forall \alpha \in \mathbb{R}. \tag{1.4}$$

The left-hand side of the above inequality is a quadratic function in $\alpha$, with a minimum at $\alpha = (\mathbf{u}, \mathbf{v})/(\mathbf{v}, \mathbf{v})$. Substituting this value of $\alpha$ into Eqn. (1.4), we get $(\mathbf{u}, \mathbf{u}) - (\mathbf{u}, \mathbf{v})^2/(\mathbf{v}, \mathbf{v}) \ge 0$, which is Eqn. (1.3). Alternatively, one can use a modified form of Eqn. (5.128a) that includes equality to directly obtain Eqn. (1.3).

If $\mathbf{u}$ and/or $\mathbf{v}$ is the zero element, then by the remark following the definition of linear independence, $\mathbf{u}$ and $\mathbf{v}$ are linearly dependent, and we also have equality in Eqn. (1.3). If they are linearly dependent and nonzero, then there exists $\alpha \in \mathbb{R}$ such that $\mathbf{u} = \alpha\mathbf{v}$, and equality in Eqn. (1.3) follows immediately. Conversely, if there is equality in Eqn. (1.3), and if $\mathbf{u}$ and/or $\mathbf{v}$ is the zero element, then they are linearly dependent. If there is equality, and both $\mathbf{u}$ and $\mathbf{v}$ are nonzero, then by letting $\alpha = (\mathbf{u}, \mathbf{v})/(\mathbf{v}, \mathbf{v})$, we see that

$$(\mathbf{u} - \alpha\mathbf{v}, \mathbf{u} - \alpha\mathbf{v}) = 0,$$

which in turn implies that $\mathbf{u} = \alpha\mathbf{v}$.

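Applying the Cauchy–Schwarz inequality to $\mathbb{R}^n$ and $L^2[0, 1]$, we get

$$\left( \sum_{i=1}^{n} u_i v_i \right)^2 \le \left( \sum_{i=1}^{n} u_i^2 \right) \left( \sum_{i=1}^{n} v_i^2 \right),$$

$$\left( \int_0^1 f(x) g(x) \, dx \right)^2 \le \left( \int_0^1 f(x)^2 \, dx \right) \left( \int_0^1 g(x)^2 \, dx \right).$$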

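Using the Cauchy–Schwarz inequality, we now prove the following:

Theorem 1.1.4 (Triangle inequality). Let $V$ be an inner product space. Then

$$|\mathbf{u} + \mathbf{v}| \le |\mathbf{u}| + |\mathbf{v}| \quad \forall \mathbf{u}, \mathbf{v} \in V.$$

Proof.

$$
\begin{aligned}
|\mathbf{u} + \mathbf{v}|^2 &= (\mathbf{u} + \mathbf{v}, \mathbf{u} + \mathbf{v}) \\
&= |\mathbf{u}|^2 + 2(\mathbf{u}, \mathbf{v}) + |\mathbf{v}|^2 \\
&\le |\mathbf{u}|^2 + 2|(\mathbf{u}, \mathbf{v})| + |\mathbf{v}|^2 \\
&\le |\mathbf{u}|^2 + 2|\mathbf{u}|\,|\mathbf{v}| + |\mathbf{v}|^2 \quad \text{(by the Cauchy–Schwarz inequality)} \\
&= (|\mathbf{u}| + |\mathbf{v}|)^2.
\end{aligned}
$$

Taking the square root of both sides, we get the desired result.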
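The two inequalities above hold for any inner product, not only the canonical one. The following Python sketch checks them numerically for the weighted inner product on three-component vectors introduced earlier in this section. The function names `inner` and `magnitude`, the matrix `S`, and the sample vectors are hypothetical choices made for this illustration.

```python
import math


def inner(u, v, S):
    """Weighted inner product (u, v) = S_ij u_i v_j, S symmetric positive definite."""
    n = len(u)
    return sum(S[i][j] * u[i] * v[j] for i in range(n) for j in range(n))


def magnitude(u, S):
    """Magnitude |u| = (u, u)^(1/2) induced by the inner product."""
    return math.sqrt(inner(u, u, S))


# Hypothetical symmetric positive definite matrix (eigenvalues 1, 1 and 3).
S = [[2.0, 1.0, 0.0], [1.0, 2.0, 0.0], [0.0, 0.0, 1.0]]
u = [1.0, -2.0, 3.0]
v = [0.5, 4.0, -1.0]

cs_lhs = inner(u, v, S) ** 2
cs_rhs = inner(u, u, S) * inner(v, v, S)
u_plus_v = [a + b for a, b in zip(u, v)]

cauchy_schwarz_holds = cs_lhs <= cs_rhs
triangle_holds = magnitude(u_plus_v, S) <= magnitude(u, S) + magnitude(v, S)

# Equality in the first inequality for linearly dependent vectors (2u and u).
two_u = [2.0 * a for a in u]
equality_case = math.isclose(
    inner(two_u, u, S) ** 2, inner(two_u, two_u, S) * inner(u, u, S)
)

print(cauchy_schwarz_holds, triangle_holds, equality_case)
```

A check on sample vectors is of course no substitute for the proofs; it only makes the role of the matrix `S` in the inner product concrete.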

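A vector space $V$ is a normed vector space if we assign a nonnegative real number $\|\mathbf{u}\|$, called the norm of $\mathbf{u}$, to each $\mathbf{u} \in V$ such that

$$\|\alpha\mathbf{u}\| = |\alpha|\,\|\mathbf{u}\| \quad \forall \alpha \in \mathbb{R} \text{ and } \forall \mathbf{u} \in V,$$
$$\|\mathbf{u}\| \ge 0 \quad \forall \mathbf{u} \in V, \text{ with } \|\mathbf{u}\| = 0 \text{ if and only if } \mathbf{u} = \mathbf{0},$$
$$\|\mathbf{u} + \mathbf{v}\| \le \|\mathbf{u}\| + \|\mathbf{v}\| \quad \forall \mathbf{u}, \mathbf{v} \in V.$$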





One can show that $L^p[0, 1]$, $p \ge 1$, is a normed vector space. From Eqn. (1.2) and the triangle inequality, it is clear that an inner product space is a normed vector space with the norm defined by

$$\|\mathbf{u}\| := |\mathbf{u}| = (\mathbf{u}, \mathbf{u})^{1/2}.$$

Since continuum mechanics is primarily the study of deformable bodies in three-dimensional space, we now specialize the results of this section to the case $n = 3$, and henceforth present results only for this case.

1.2 Vectors in $\mathbb{R}^3$

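From now on, $V$ denotes the three-dimensional Euclidean space $\mathbb{R}^3$. Let $\{\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3\}$ be a fixed set of orthonormal vectors that constitute the Cartesian basis. We have

$$\mathbf{e}_i \cdot \mathbf{e}_j = \delta_{ij},$$

where $\delta_{ij}$, known as the Kronecker delta, is defined by

$$\delta_{ij} := \begin{cases} 0 & \text{when } i \neq j, \\ 1 & \text{when } i = j. \end{cases} \tag{1.5}$$

The Kronecker delta is also known as the substitution operator, since, from the definition, we can see that $x_i = \delta_{ij}x_j$, $\tau_{ij} = \tau_{ik}\delta_{kj}$, and so on. Note that $\delta_{ij} = \delta_{ji}$, and $\delta_{ii} = \delta_{11} + \delta_{22} + \delta_{33} = 3$.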

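Any vector $\mathbf{u}$ can be written as

$$\mathbf{u} = u_1\mathbf{e}_1 + u_2\mathbf{e}_2 + u_3\mathbf{e}_3, \tag{1.6}$$

or, using the summation convention, as

$$\mathbf{u} = u_i\mathbf{e}_i.$$

The inner product of two vectors is given by

$$(\mathbf{u}, \mathbf{v}) = \mathbf{u} \cdot \mathbf{v} := u_i v_i = u_1 v_1 + u_2 v_2 + u_3 v_3. \tag{1.7}$$

Using Eqn. (1.6), the components of the vector $\mathbf{u}$ can be written as

$$u_i = \mathbf{u} \cdot \mathbf{e}_i. \tag{1.8}$$

Substituting Eqn. (1.8) into Eqn. (1.6), we have

$$\mathbf{u} = (\mathbf{u} \cdot \mathbf{e}_i)\mathbf{e}_i. \tag{1.9}$$

We define the cross product of two base vectors $\mathbf{e}_j$ and $\mathbf{e}_k$ by

$$\mathbf{e}_j \times \mathbf{e}_k := \varepsilon_{ijk}\mathbf{e}_i, \tag{1.10}$$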





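where $\varepsilon_{ijk}$ are the components of a third-order tensor $\mathbf{E}$ known as the alternate tensor (which we will discuss in greater detail in Section 1.7), and are given by

$$\varepsilon_{123} = \varepsilon_{231} = \varepsilon_{312} = 1, \qquad \varepsilon_{132} = \varepsilon_{213} = \varepsilon_{321} = -1, \qquad \varepsilon_{ijk} = 0 \text{ otherwise.}$$

Taking the dot product of both sides of Eqn. (1.10) with $\mathbf{e}_m$, we get

$$\mathbf{e}_m \cdot (\mathbf{e}_j \times \mathbf{e}_k) = \varepsilon_{ijk}\delta_{im} = \varepsilon_{mjk}.$$

Using the index $i$ in place of $m$, we have

$$\varepsilon_{ijk} = \mathbf{e}_i \cdot (\mathbf{e}_j \times \mathbf{e}_k). \tag{1.11}$$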

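The cross product of two vectors is assumed to be bilinear, i.e., distributive over vector addition and such that

$$(\alpha\mathbf{u}) \times (\beta\mathbf{v}) = \alpha\beta(\mathbf{u} \times \mathbf{v}) \quad \forall \alpha, \beta \in \mathbb{R} \text{ and } \mathbf{u}, \mathbf{v} \in V.$$

If $\mathbf{w}$ denotes the cross product of $\mathbf{u}$ and $\mathbf{v}$, then by using this property and Eqn. (1.10), we have

$$
\begin{aligned}
\mathbf{w} &= \mathbf{u} \times \mathbf{v} \\
&= (u_j\mathbf{e}_j) \times (v_k\mathbf{e}_k) \\
&= \varepsilon_{ijk}u_j v_k\mathbf{e}_i.
\end{aligned}
\tag{1.12}
$$

It is clear from Eqn. (1.12) that

$$\mathbf{u} \times \mathbf{v} = -\mathbf{v} \times \mathbf{u}.$$

Taking $\mathbf{v} = \mathbf{u}$, we get $\mathbf{u} \times \mathbf{u} = \mathbf{0}$.

The scalar triple product of three vectors $\mathbf{u}$, $\mathbf{v}$, $\mathbf{w}$, denoted by $[\mathbf{u}, \mathbf{v}, \mathbf{w}]$, is defined by

$$[\mathbf{u}, \mathbf{v}, \mathbf{w}] := \mathbf{u} \cdot (\mathbf{v} \times \mathbf{w}).$$

In indicial notation, we have

$$[\mathbf{u}, \mathbf{v}, \mathbf{w}] = \varepsilon_{ijk}u_i v_j w_k. \tag{1.13}$$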

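From Eqn. (1.13), it is clear that

$$[\mathbf{u}, \mathbf{v}, \mathbf{w}] = [\mathbf{v}, \mathbf{w}, \mathbf{u}] = [\mathbf{w}, \mathbf{u}, \mathbf{v}] = -[\mathbf{v}, \mathbf{u}, \mathbf{w}] = -[\mathbf{u}, \mathbf{w}, \mathbf{v}] = -[\mathbf{w}, \mathbf{v}, \mathbf{u}] \quad \forall \mathbf{u}, \mathbf{v}, \mathbf{w} \in V. \tag{1.14}$$

If any two elements in the scalar triple product are the same, then its value is zero, as can be seen by interchanging the identical elements and using the above formula. From Eqn. (1.13), it is also clear that the scalar triple product is linear in each of its arguments, so that, for example,

$$[\alpha\mathbf{u} + \beta\mathbf{v}, \mathbf{x}, \mathbf{y}] = \alpha[\mathbf{u}, \mathbf{x}, \mathbf{y}] + \beta[\mathbf{v}, \mathbf{x}, \mathbf{y}] \quad \forall \alpha, \beta \in \mathbb{R} \text{ and } \mathbf{u}, \mathbf{v}, \mathbf{x}, \mathbf{y} \in V. \tag{1.15}$$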





As can be easily verified, Eqn. (1.13) can be written in determinant form as

$$[\mathbf{u}, \mathbf{v}, \mathbf{w}] = \det\begin{bmatrix} u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{bmatrix}. \tag{1.16}$$
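The indicial formulas (1.12) and (1.13) and the determinant form (1.16) can be compared directly in code. The Python sketch below is one possible way of doing so; the function names `eps`, `cross`, `triple` and `det3`, the use of zero-based indices, and the closed-form expression used for the alternating symbol are assumptions of this example rather than part of the text.

```python
def eps(i, j, k):
    """Alternating symbol for zero-based indices 0, 1, 2 (returns +1, -1 or 0)."""
    return (i - j) * (j - k) * (k - i) // 2


def cross(u, v):
    """Cross product from Eqn. (1.12): w_i = eps_ijk u_j v_k."""
    R = range(3)
    return [sum(eps(i, j, k) * u[j] * v[k] for j in R for k in R) for i in R]


def triple(u, v, w):
    """Scalar triple product from Eqn. (1.13): eps_ijk u_i v_j w_k."""
    R = range(3)
    return sum(eps(i, j, k) * u[i] * v[j] * w[k] for i in R for j in R for k in R)


def det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


u, v, w = [1, 2, 3], [4, 5, 6], [7, 8, 10]

print(cross(u, v))        # [-3, 6, -3]
print(cross(u, u))        # [0, 0, 0]
print(triple(u, v, w))    # -3
print(det3([u, v, w]))    # -3, in agreement with Eqn. (1.16)
print(triple(v, u, w))    # 3, the sign flips when two arguments are swapped
```

Integer components are used so that all comparisons are exact.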

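Using Eqns. (1.11) and (1.16), the components of the alternate tensor can be written in determinant form as follows:

$$\varepsilon_{ijk} = [\mathbf{e}_i, \mathbf{e}_j, \mathbf{e}_k] = \det\begin{bmatrix} \mathbf{e}_i \cdot \mathbf{e}_1 & \mathbf{e}_i \cdot \mathbf{e}_2 & \mathbf{e}_i \cdot \mathbf{e}_3 \\ \mathbf{e}_j \cdot \mathbf{e}_1 & \mathbf{e}_j \cdot \mathbf{e}_2 & \mathbf{e}_j \cdot \mathbf{e}_3 \\ \mathbf{e}_k \cdot \mathbf{e}_1 & \mathbf{e}_k \cdot \mathbf{e}_2 & \mathbf{e}_k \cdot \mathbf{e}_3 \end{bmatrix} = \det\begin{bmatrix} \delta_{i1} & \delta_{i2} & \delta_{i3} \\ \delta_{j1} & \delta_{j2} & \delta_{j3} \\ \delta_{k1} & \delta_{k2} & \delta_{k3} \end{bmatrix}. \tag{1.17}$$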

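Thus, we have

$$
\begin{aligned}
\varepsilon_{ijk}\varepsilon_{pqr}
&= \det\begin{bmatrix} \delta_{i1} & \delta_{i2} & \delta_{i3} \\ \delta_{j1} & \delta_{j2} & \delta_{j3} \\ \delta_{k1} & \delta_{k2} & \delta_{k3} \end{bmatrix}
\det\begin{bmatrix} \delta_{p1} & \delta_{p2} & \delta_{p3} \\ \delta_{q1} & \delta_{q2} & \delta_{q3} \\ \delta_{r1} & \delta_{r2} & \delta_{r3} \end{bmatrix} \\
&= \det\begin{bmatrix} \delta_{i1} & \delta_{i2} & \delta_{i3} \\ \delta_{j1} & \delta_{j2} & \delta_{j3} \\ \delta_{k1} & \delta_{k2} & \delta_{k3} \end{bmatrix}
\det\begin{bmatrix} \delta_{p1} & \delta_{q1} & \delta_{r1} \\ \delta_{p2} & \delta_{q2} & \delta_{r2} \\ \delta_{p3} & \delta_{q3} & \delta_{r3} \end{bmatrix}
\quad (\text{since } \det\mathbf{T} = \det(\mathbf{T}^T)) \\
&= \det\left( \begin{bmatrix} \delta_{i1} & \delta_{i2} & \delta_{i3} \\ \delta_{j1} & \delta_{j2} & \delta_{j3} \\ \delta_{k1} & \delta_{k2} & \delta_{k3} \end{bmatrix}
\begin{bmatrix} \delta_{p1} & \delta_{q1} & \delta_{r1} \\ \delta_{p2} & \delta_{q2} & \delta_{r2} \\ \delta_{p3} & \delta_{q3} & \delta_{r3} \end{bmatrix} \right)
\quad (\text{since } (\det\mathbf{R})(\det\mathbf{S}) = \det(\mathbf{R}\mathbf{S})) \\
&= \det\begin{bmatrix} \delta_{im}\delta_{mp} & \delta_{im}\delta_{mq} & \delta_{im}\delta_{mr} \\ \delta_{jm}\delta_{mp} & \delta_{jm}\delta_{mq} & \delta_{jm}\delta_{mr} \\ \delta_{km}\delta_{mp} & \delta_{km}\delta_{mq} & \delta_{km}\delta_{mr} \end{bmatrix} \\
&= \det\begin{bmatrix} \delta_{ip} & \delta_{iq} & \delta_{ir} \\ \delta_{jp} & \delta_{jq} & \delta_{jr} \\ \delta_{kp} & \delta_{kq} & \delta_{kr} \end{bmatrix}.
\end{aligned}
\tag{1.18}
$$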

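From Eqn. (1.18) and the relation $\delta_{ii} = 3$, we obtain the following identities (the first of which is known as the $\varepsilon$–$\delta$ identity):

$$\varepsilon_{ijk}\varepsilon_{iqr} = \delta_{jq}\delta_{kr} - \delta_{jr}\delta_{kq}, \tag{1.19a}$$

$$\varepsilon_{ijk}\varepsilon_{ijm} = 2\delta_{km}, \tag{1.19b}$$

$$\varepsilon_{ijk}\varepsilon_{ijk} = 6. \tag{1.19c}$$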
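Because each index takes only three values, the three identities above can be confirmed by exhaustive enumeration. The following Python sketch does this; the helper names `eps` and `delta`, the zero-based indexing, and the closed-form expression for the alternating symbol are assumptions of this example.

```python
from itertools import product


def eps(i, j, k):
    """Alternating symbol for zero-based indices 0, 1, 2 (returns +1, -1 or 0)."""
    return (i - j) * (j - k) * (k - i) // 2


def delta(i, j):
    """Kronecker delta."""
    return 1 if i == j else 0


R = range(3)

# Eqn. (1.19a): eps_ijk eps_iqr = delta_jq delta_kr - delta_jr delta_kq
identity_a = all(
    sum(eps(i, j, k) * eps(i, q, r) for i in R)
    == delta(j, q) * delta(k, r) - delta(j, r) * delta(k, q)
    for j, k, q, r in product(R, repeat=4)
)

# Eqn. (1.19b): eps_ijk eps_ijm = 2 delta_km
identity_b = all(
    sum(eps(i, j, k) * eps(i, j, m) for i in R for j in R) == 2 * delta(k, m)
    for k, m in product(R, repeat=2)
)

# Eqn. (1.19c): eps_ijk eps_ijk = 6
full_contraction = sum(eps(i, j, k) ** 2 for i, j, k in product(R, repeat=3))

print(identity_a, identity_b, full_contraction)  # True True 6
```

Each repeated index is summed explicitly, while the free indices are looped over all of their values.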

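Using Eqn. (1.12) and the $\varepsilon$–$\delta$ identity, we get

$$
\begin{aligned}
(\mathbf{u} \times \mathbf{v}) \cdot (\mathbf{u} \times \mathbf{v}) &= \varepsilon_{ijk}\varepsilon_{imn}u_j v_k u_m v_n \\
&= (\delta_{jm}\delta_{kn} - \delta_{jn}\delta_{km})u_j v_k u_m v_n \\
&= u_m v_n u_m v_n - u_j v_m u_m v_j \\
&= (\mathbf{u} \cdot \mathbf{u})(\mathbf{v} \cdot \mathbf{v}) - (\mathbf{u} \cdot \mathbf{v})^2.
\end{aligned}
\tag{1.20}
$$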

We now have the following result:

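Theorem 1.2.1. For any two vectors $\mathbf{u}, \mathbf{v} \in V$, $\mathbf{u} \times \mathbf{v} = \mathbf{0}$ if and only if $\mathbf{u}$ and $\mathbf{v}$ are linearly dependent.

Proof. If $\mathbf{u}$ and $\mathbf{v}$ are linearly dependent, and $\mathbf{u}$ and/or $\mathbf{v}$ is zero, then it is obvious that $\mathbf{u} \times \mathbf{v} = \mathbf{0}$. If they are linearly dependent and nonzero, then there exists a scalar $\alpha$ such that $\mathbf{v} = \alpha\mathbf{u}$. Hence

$$\mathbf{u} \times \mathbf{v} = \alpha\,\mathbf{u} \times \mathbf{u} = \mathbf{0}.$$

Conversely, if $\mathbf{u} \times \mathbf{v} = \mathbf{0}$, then

$$
\begin{aligned}
0 &= (\mathbf{u} \times \mathbf{v}) \cdot (\mathbf{u} \times \mathbf{v}) \\
&= (\mathbf{u} \cdot \mathbf{u})(\mathbf{v} \cdot \mathbf{v}) - (\mathbf{u} \cdot \mathbf{v})^2 \quad \text{(by Eqn. (1.20))},
\end{aligned}
$$

which implies that

$$(\mathbf{u}, \mathbf{v})^2 = (\mathbf{u}, \mathbf{u})(\mathbf{v}, \mathbf{v}).$$

But by Theorem 1.1.3, this implies that $\mathbf{u}$ and $\mathbf{v}$ are linearly dependent.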

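The vector triple products $\mathbf{u} \times (\mathbf{v} \times \mathbf{w})$ and $(\mathbf{u} \times \mathbf{v}) \times \mathbf{w}$, defined as the cross product of $\mathbf{u}$ with $\mathbf{v} \times \mathbf{w}$, and the cross product of $\mathbf{u} \times \mathbf{v}$ with $\mathbf{w}$, respectively, are different in general, and are given by

$$\mathbf{u} \times (\mathbf{v} \times \mathbf{w}) = (\mathbf{u} \cdot \mathbf{w})\mathbf{v} - (\mathbf{u} \cdot \mathbf{v})\mathbf{w}, \tag{1.21a}$$
$$(\mathbf{u} \times \mathbf{v}) \times \mathbf{w} = (\mathbf{u} \cdot \mathbf{w})\mathbf{v} - (\mathbf{v} \cdot \mathbf{w})\mathbf{u}. \tag{1.21b}$$

The first relation is proved by noting that

$$
\begin{aligned}
\mathbf{u} \times (\mathbf{v} \times \mathbf{w}) &= \varepsilon_{ijk}u_j(\mathbf{v} \times \mathbf{w})_k\mathbf{e}_i \\
&= \varepsilon_{ijk}\varepsilon_{kmn}u_j v_m w_n\mathbf{e}_i \\
&= \varepsilon_{kij}\varepsilon_{kmn}u_j v_m w_n\mathbf{e}_i \\
&= (\delta_{im}\delta_{jn} - \delta_{in}\delta_{jm})u_j v_m w_n\mathbf{e}_i \\
&= (u_n w_n v_i - u_m v_m w_i)\mathbf{e}_i \\
&= (\mathbf{u} \cdot \mathbf{w})\mathbf{v} - (\mathbf{u} \cdot \mathbf{v})\mathbf{w}.
\end{aligned}
$$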
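As a spot check of Eqns. (1.20), (1.21a) and (1.21b), the following Python sketch evaluates both sides of each identity for one set of integer vectors. The helper names `dot`, `cross` and `combine` and the sample vectors are hypothetical; a single numerical example illustrates the identities but does not prove them.

```python
def dot(u, v):
    """Inner product u . v = u_i v_i."""
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    """Cross product written out in Cartesian components."""
    return [
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    ]


def combine(a, x, b, y):
    """Linear combination a x + b y of two vectors."""
    return [a * xi + b * yi for xi, yi in zip(x, y)]


u, v, w = [1, 2, 3], [4, 5, 6], [7, 8, 10]

# Eqn. (1.21a): u x (v x w) = (u . w) v - (u . v) w
lhs_a = cross(u, cross(v, w))
rhs_a = combine(dot(u, w), v, -dot(u, v), w)

# Eqn. (1.21b): (u x v) x w = (u . w) v - (v . w) u
lhs_b = cross(cross(u, v), w)
rhs_b = combine(dot(u, w), v, -dot(v, w), u)

# Eqn. (1.20): (u x v) . (u x v) = (u . u)(v . v) - (u . v)^2
lhs_c = dot(cross(u, v), cross(u, v))
rhs_c = dot(u, u) * dot(v, v) - dot(u, v) ** 2

print(lhs_a, rhs_a)  # [-12, 9, -2] [-12, 9, -2]
print(lhs_b, rhs_b)  # [84, 9, -66] [84, 9, -66]
print(lhs_c, rhs_c)  # 54 54
```

The two triple products differ for these vectors, which illustrates that the cross product is not associative.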

The second relation is proved in an analogous manner.

For scalar triple products, we have the following useful result:





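Theorem 1.2.2. The scalar triple product $[\mathbf{u}, \mathbf{v}, \mathbf{w}]$ is zero if and only if $\mathbf{u}$, $\mathbf{v}$ and $\mathbf{w}$ are linearly dependent.

Proof. Since

$$
\begin{aligned}
\mathbf{u} &= u_1\mathbf{e}_1 + u_2\mathbf{e}_2 + u_3\mathbf{e}_3, \\
\mathbf{v} &= v_1\mathbf{e}_1 + v_2\mathbf{e}_2 + v_3\mathbf{e}_3, \\
\mathbf{w} &= w_1\mathbf{e}_1 + w_2\mathbf{e}_2 + w_3\mathbf{e}_3,
\end{aligned}
$$

by Theorem 1.1.2, $\{\mathbf{u}, \mathbf{v}, \mathbf{w}\}$ is linearly independent if and only if (see Eqn. (1.16))

$$[\mathbf{u}, \mathbf{v}, \mathbf{w}] = \det\begin{bmatrix} u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{bmatrix} \neq 0,$$

which is equivalent to the statement of the theorem.

Another proof of the above theorem may be found in [43].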

1.3 Second-Order Tensors

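A second-order tensor is a linear transformation that maps vectors to vectors. We shall denote the set of second-order tensors by Lin. If $\mathbf{T}$ is a second-order tensor that maps a vector $\mathbf{u}$ to a vector $\mathbf{v}$, then we write it as

$$\mathbf{v} = \mathbf{T}\mathbf{u}. \tag{1.22}$$

$\mathbf{T}$ satisfies the property

$$\mathbf{T}(a\mathbf{x} + b\mathbf{y}) = a\mathbf{T}\mathbf{x} + b\mathbf{T}\mathbf{y} \quad \forall \mathbf{x}, \mathbf{y} \in V \text{ and } a, b \in \mathbb{R}.$$

By choosing $a = 1$, $b = -1$, and $\mathbf{x} = \mathbf{y}$, we get

$$\mathbf{T}(\mathbf{0}) = \mathbf{0}.$$

From the definition of a second-order tensor, it follows that the sum of two second-order tensors, defined by

$$(\mathbf{R} + \mathbf{S})\mathbf{u} := \mathbf{R}\mathbf{u} + \mathbf{S}\mathbf{u} \quad \forall \mathbf{u} \in V,$$

and the scalar multiple of $\mathbf{T}$ by $\alpha \in \mathbb{R}$, defined by

$$(\alpha\mathbf{T})\mathbf{u} := \alpha(\mathbf{T}\mathbf{u}) \quad \forall \mathbf{u} \in V,$$

are both second-order tensors. The two second-order tensors $\mathbf{R}$ and $\mathbf{S}$ are said to be equal if

$$\mathbf{R}\mathbf{u} = \mathbf{S}\mathbf{u} \quad \forall \mathbf{u} \in V. \tag{1.23}$$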





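The above condition is equivalent to the condition

$$(\mathbf{v}, \mathbf{R}\mathbf{u}) = (\mathbf{v}, \mathbf{S}\mathbf{u}) \quad \forall \mathbf{u}, \mathbf{v} \in V. \tag{1.24}$$

To see this, note that if Eqn. (1.23) holds, then clearly Eqn. (1.24) holds. On the other hand, if Eqn. (1.24) holds, then using the bilinearity property of the inner product, we have

$$(\mathbf{v}, (\mathbf{R}\mathbf{u} - \mathbf{S}\mathbf{u})) = 0 \quad \forall \mathbf{u}, \mathbf{v} \in V.$$

Choosing $\mathbf{v} = \mathbf{R}\mathbf{u} - \mathbf{S}\mathbf{u}$, and using Eqn. (1.1c), we get Eqn. (1.23).

If we define the element $\mathbf{Z}$ as

$$\mathbf{Z}\mathbf{u} = \mathbf{0} \quad \forall \mathbf{u} \in V,$$

then we see that $\mathbf{Z} \in \text{Lin}$, and is the zero element of Lin since

$$(\mathbf{T} + \mathbf{Z})\mathbf{u} = \mathbf{T}\mathbf{u} + \mathbf{Z}\mathbf{u} = \mathbf{T}\mathbf{u} \quad \forall \mathbf{u} \in V,$$

which, by the definition of the equality of tensors, implies that

$$\mathbf{T} + \mathbf{Z} = \mathbf{T}.$$

Henceforth, we simply use the symbol $\mathbf{0}$ to denote the zero elements of both Lin and $V$. With the above definitions of addition, scalar multiplication, and zero, Lin is a vector space, and hence, all the results that we have derived for vector spaces in Section 1.1 are valid for Lin.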

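If we define the function $\mathbf{I} : V \to V$ by

$$\mathbf{I}\mathbf{u} := \mathbf{u} \quad \forall \mathbf{u} \in V, \tag{1.25}$$

then it is clear that $\mathbf{I} \in \text{Lin}$. $\mathbf{I}$ is called the identity tensor.

Choosing $\mathbf{u} = \mathbf{e}_1$, $\mathbf{e}_2$ and $\mathbf{e}_3$ in Eqn. (1.22), we get three vectors that can be expressed as a linear combination of the base vectors $\mathbf{e}_i$ as

$$
\begin{aligned}
\mathbf{T}\mathbf{e}_1 &= \alpha_1\mathbf{e}_1 + \alpha_2\mathbf{e}_2 + \alpha_3\mathbf{e}_3, \\
\mathbf{T}\mathbf{e}_2 &= \alpha_4\mathbf{e}_1 + \alpha_5\mathbf{e}_2 + \alpha_6\mathbf{e}_3, \\
\mathbf{T}\mathbf{e}_3 &= \alpha_7\mathbf{e}_1 + \alpha_8\mathbf{e}_2 + \alpha_9\mathbf{e}_3,
\end{aligned}
\tag{1.26}
$$

where $\alpha_i$, $i = 1$ to 9, are scalar constants. Renaming the $\alpha_i$ as $T_{ij}$, $i, j = 1, 2, 3$, we get

$$\mathbf{T}\mathbf{e}_j = T_{ij}\mathbf{e}_i. \tag{1.27}$$

The elements $T_{ij}$ are called the components of the tensor $\mathbf{T}$ with respect to the base vectors $\mathbf{e}_j$; as seen from Eqn. (1.27), $T_{ij}$ is the component of $\mathbf{T}\mathbf{e}_j$ in the $\mathbf{e}_i$ direction. Taking the dot product of both sides of Eqn. (1.27) with $\mathbf{e}_k$ for some particular $k$, we get

$$\mathbf{e}_k \cdot \mathbf{T}\mathbf{e}_j = T_{ij}\delta_{ik} = T_{kj},$$

or, replacing $k$ by $i$,

$$T_{ij} = \mathbf{e}_i \cdot \mathbf{T}\mathbf{e}_j. \tag{1.28}$$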





By choosing $\mathbf{v} = \mathbf{e}_i$ and $\mathbf{u} = \mathbf{e}_j$ in Eqn. (1.24), it is clear that the components of two equal tensors are equal. From Eqn. (1.28), the components of the identity tensor in any orthonormal coordinate system $\{\mathbf{e}_i\}$ are

$$I_{ij} = \mathbf{e}_i \cdot \mathbf{I}\mathbf{e}_j = \mathbf{e}_i \cdot \mathbf{e}_j = \delta_{ij}. \tag{1.29}$$

Thus, the components of the identity tensor are scalars that are independent of the Cartesian basis. Using Eqn. (1.27), we write Eqn. (1.22) in component form (where the components are with respect to a particular orthonormal basis $\{\mathbf{e}_i\}$) as

$$v_i\mathbf{e}_i = \mathbf{T}(u_j\mathbf{e}_j) = u_j\mathbf{T}\mathbf{e}_j = u_j T_{ij}\mathbf{e}_i,$$

which, by virtue of the uniqueness of the components of any element of a vector space, yields

$$v_i = T_{ij}u_j. \tag{1.30}$$

Thus, the components of the vector $\mathbf{v}$ are obtained by a matrix multiplication of the components of $\mathbf{T}$ and the components of $\mathbf{u}$.
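To make the passage from a linear map to its component matrix concrete, the following Python sketch extracts the components of a tensor from its action on the Cartesian base vectors using Eqn. (1.28), and then checks Eqn. (1.30). The particular map `T` (a cross product with a fixed vector plus a scaling) and the names `dot`, `E`, `T_comp`, `v_direct` and `v_matrix` are hypothetical choices made for this illustration.

```python
def dot(u, v):
    """Inner product u . v = u_i v_i."""
    return sum(a * b for a, b in zip(u, v))


def T(u):
    """Hypothetical linear map T u = a x u + 2 u, with a = (1, 2, 3)."""
    a = (1.0, 2.0, 3.0)
    return [
        a[1] * u[2] - a[2] * u[1] + 2.0 * u[0],
        a[2] * u[0] - a[0] * u[2] + 2.0 * u[1],
        a[0] * u[1] - a[1] * u[0] + 2.0 * u[2],
    ]


# Cartesian basis e_1, e_2, e_3
E = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# Eqn. (1.28): T_ij = e_i . (T e_j), so column j holds the components of T e_j.
T_comp = [[dot(E[i], T(E[j])) for j in range(3)] for i in range(3)]

u = [1.0, -1.0, 2.0]
v_direct = T(u)
# Eqn. (1.30): v_i = T_ij u_j
v_matrix = [sum(T_comp[i][j] * u[j] for j in range(3)) for i in range(3)]

print(T_comp)    # [[2.0, -3.0, 2.0], [3.0, 2.0, -1.0], [-2.0, 1.0, 2.0]]
print(v_direct)  # [9.0, -1.0, 1.0]
print(v_matrix)  # [9.0, -1.0, 1.0]
```

The map is applied only to the three base vectors in order to build the matrix; linearity then fixes its action on every other vector.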

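The transpose of $\mathbf{T}$, denoted by $\mathbf{T}^T$, is defined using the inner product as

$$(\mathbf{T}^T\mathbf{u}, \mathbf{v}) := (\mathbf{u}, \mathbf{T}\mathbf{v}) \quad \forall \mathbf{u}, \mathbf{v} \in V. \tag{1.31}$$

Once again, it follows from the definition that $\mathbf{T}^T$ is a second-order tensor. The transpose has the following properties:

$$(\mathbf{T}^T)^T = \mathbf{T},$$

$$(\alpha\mathbf{T})^T = \alpha\mathbf{T}^T,$$

$$(\mathbf{R} + \mathbf{S})^T = \mathbf{R}^T + \mathbf{S}^T.$$

If $(T_{ij})$ represent the components of the tensor $\mathbf{T}$, then the components of $\mathbf{T}^T$ are

$$
\begin{aligned}
(\mathbf{T}^T)_{ij} &= \mathbf{e}_i \cdot \mathbf{T}^T\mathbf{e}_j \\
&= \mathbf{T}\mathbf{e}_i \cdot \mathbf{e}_j \\
&= T_{ji}.
\end{aligned}
\tag{1.32}
$$

The tensor $\mathbf{T}$ is said to be symmetric if

$$\mathbf{T}^T = \mathbf{T},$$

and skew-symmetric (or anti-symmetric) if

$$\mathbf{T}^T = -\mathbf{T}.$$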

Any tensor $\mathbf{T}$ can be decomposed uniquely into a symmetric and a skew-symmetric part as (see Problem 5)

$$\mathbf{T} = \mathbf{T}_s + \mathbf{T}_{ss}, \tag{1.33}$$

where

$$\mathbf{T}_s = \tfrac{1}{2}(\mathbf{T} + \mathbf{T}^T),$$

$$\mathbf{T}_{ss} = \tfrac{1}{2}(\mathbf{T} - \mathbf{T}^T).$$
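The decomposition (1.33) is easy to carry out on component matrices. The Python sketch below computes the symmetric and skew-symmetric parts of a sample matrix and checks their defining properties; the function names `transpose`, `sym_part` and `skew_part` and the sample components are assumptions of this example.

```python
def transpose(T):
    """Components of the transpose: (T^T)_ij = T_ji."""
    return [[T[j][i] for j in range(3)] for i in range(3)]


def sym_part(T):
    """Symmetric part T_s = (T + T^T) / 2."""
    return [[0.5 * (T[i][j] + T[j][i]) for j in range(3)] for i in range(3)]


def skew_part(T):
    """Skew-symmetric part T_ss = (T - T^T) / 2."""
    return [[0.5 * (T[i][j] - T[j][i]) for j in range(3)] for i in range(3)]


T = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 10.0]]
Ts = sym_part(T)
Tss = skew_part(T)

print(Ts)   # [[1.0, 3.0, 5.0], [3.0, 5.0, 7.0], [5.0, 7.0, 10.0]]
print(Tss)  # [[0.0, -1.0, -2.0], [1.0, 0.0, -1.0], [2.0, 1.0, 0.0]]

# The two parts add up to T, and have the expected symmetry properties.
recombined = [[Ts[i][j] + Tss[i][j] for j in range(3)] for i in range(3)]
print(recombined == T, transpose(Ts) == Ts)
```

Note that the diagonal of the skew-symmetric part is always zero, since $T_{ii} - T_{ii} = 0$ for each fixed $i$.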

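The product $\mathbf{R}\mathbf{S}$ of two second-order tensors is the composition of the two operations $\mathbf{R}$ and $\mathbf{S}$, with $\mathbf{S}$ operating first, and is defined by the relation

$$(\mathbf{R}\mathbf{S})\mathbf{u} := \mathbf{R}(\mathbf{S}\mathbf{u}) \quad \forall \mathbf{u} \in V. \tag{1.34}$$

Since $\mathbf{R}\mathbf{S}$ is a linear transformation that maps vectors to vectors, we conclude that the product of two second-order tensors is also a second-order tensor. From the definition of the identity tensor given by Eqn. (1.25), it follows that $\mathbf{R}\mathbf{I} = \mathbf{I}\mathbf{R} = \mathbf{R}$. If $\mathbf{T}$ represents the product $\mathbf{R}\mathbf{S}$, then its components are given by

$$
\begin{aligned}
T_{ij} &= \mathbf{e}_i \cdot (\mathbf{R}\mathbf{S})\mathbf{e}_j \\
&= \mathbf{e}_i \cdot \mathbf{R}(\mathbf{S}\mathbf{e}_j) \\
&= \mathbf{e}_i \cdot \mathbf{R}(S_{kj}\mathbf{e}_k) \\
&= \mathbf{e}_i \cdot S_{kj}\mathbf{R}\mathbf{e}_k \\
&= S_{kj}(\mathbf{e}_i \cdot \mathbf{R}\mathbf{e}_k) \\
&= S_{kj}R_{ik} \\
&= R_{ik}S_{kj},
\end{aligned}
\tag{1.35}
$$

which is consistent with matrix multiplication. Also consistent with the results from matrix theory, we have $(\mathbf{R}\mathbf{S})^T = \mathbf{S}^T\mathbf{R}^T$, which follows from Eqns. (1.24), (1.31) and (1.34).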
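The agreement between composition of maps and matrix multiplication can be checked numerically. The following Python sketch builds the components of the composed map from Eqns. (1.28) and (1.34), compares them with $R_{ik}S_{kj}$, and also checks the transpose rule. The helper names `apply`, `matmul` and `transpose` and the sample matrices are hypothetical choices for this example.

```python
def apply(M, u):
    """Action of a tensor on a vector: v_i = M_ij u_j."""
    return [sum(M[i][j] * u[j] for j in range(3)) for i in range(3)]


def matmul(R, S):
    """Matrix product with components R_ik S_kj."""
    return [[sum(R[i][k] * S[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def transpose(T):
    """Components of the transpose: (T^T)_ij = T_ji."""
    return [[T[j][i] for j in range(3)] for i in range(3)]


E = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]  # Cartesian basis
R = [[1, 2, 0], [0, 1, 3], [4, 0, 1]]
S = [[2, 0, 1], [1, 1, 0], [0, 3, 1]]

# Components of the composition: (RS)_ij = e_i . R(S e_j), with S acting first.
composed = [[apply(R, apply(S, E[j]))[i] for j in range(3)] for i in range(3)]

print(composed)                # [[4, 2, 1], [1, 10, 3], [8, 3, 5]]
print(composed == matmul(R, S))
print(transpose(matmul(R, S)) == matmul(transpose(S), transpose(R)))
print(matmul(R, S) == matmul(S, R))  # False: the product does not commute in general
```

Since the basis vectors have a single nonzero component, taking the $i$-th entry of `apply(R, apply(S, E[j]))` is the same as taking its dot product with $\mathbf{e}_i$.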

1.3.1 The tensor product

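We now introduce the concept of a tensor product, which is convenient for working with tensors of rank higher than two. We first define the dyadic or tensor product of two vectors $\mathbf{a}$ and $\mathbf{b}$ by

$$(\mathbf{a} \otimes \mathbf{b})\mathbf{c} := (\mathbf{b} \cdot \mathbf{c})\mathbf{a} \quad \forall \mathbf{c} \in V. \tag{1.36}$$

Note that the tensor product $\mathbf{a} \otimes \mathbf{b}$ cannot be defined except in terms of its operation on a vector $\mathbf{c}$. We now prove that $\mathbf{a} \otimes \mathbf{b}$ defines a second-order tensor. The above rule obviously maps a vector into another vector. All that we need to do is to prove that it is a linear map. For arbitrary scalars $c$ and $d$, and arbitrary vectors $\mathbf{x}$ and $\mathbf{y}$, we have

$$
\begin{aligned}
(\mathbf{a} \otimes \mathbf{b})(c\mathbf{x} + d\mathbf{y}) &= [\mathbf{b} \cdot (c\mathbf{x} + d\mathbf{y})]\mathbf{a} \\
&= [c\,\mathbf{b} \cdot \mathbf{x} + d\,\mathbf{b} \cdot \mathbf{y}]\mathbf{a} \\
&= c(\mathbf{b} \cdot \mathbf{x})\mathbf{a} + d(\mathbf{b} \cdot \mathbf{y})\mathbf{a} \\
&= c[(\mathbf{a} \otimes \mathbf{b})\mathbf{x}] + d[(\mathbf{a} \otimes \mathbf{b})\mathbf{y}],
\end{aligned}
$$

which proves that $\mathbf{a} \otimes \mathbf{b}$ is a linear function. Hence, $\mathbf{a} \otimes \mathbf{b}$ is a second-order tensor. Any second-order tensor $\mathbf{T}$ can be written as

$$\mathbf{T} = T_{ij}\,\mathbf{e}_i \otimes \mathbf{e}_j, \tag{1.37}$$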





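where the components of the tensor, $T_{ij}$, are given by Eqn. (1.28). To see this, we consider the action of $\mathbf{T}$ on an arbitrary vector $\mathbf{u}$:

$$
\begin{aligned}
\mathbf{T}\mathbf{u} = (\mathbf{T}\mathbf{u})_i\mathbf{e}_i &= [\mathbf{e}_i \cdot (\mathbf{T}\mathbf{u})]\,\mathbf{e}_i \\
&= \{\mathbf{e}_i \cdot [\mathbf{T}(u_j\mathbf{e}_j)]\}\,\mathbf{e}_i \\
&= \{u_j[\mathbf{e}_i \cdot (\mathbf{T}\mathbf{e}_j)]\}\,\mathbf{e}_i \\
&= \{(\mathbf{u} \cdot \mathbf{e}_j)[\mathbf{e}_i \cdot (\mathbf{T}\mathbf{e}_j)]\}\,\mathbf{e}_i \\
&= [\mathbf{e}_i \cdot (\mathbf{T}\mathbf{e}_j)]\,[(\mathbf{u} \cdot \mathbf{e}_j)\mathbf{e}_i] \\
&= [\mathbf{e}_i \cdot (\mathbf{T}\mathbf{e}_j)]\,[(\mathbf{e}_i \otimes \mathbf{e}_j)\mathbf{u}] \\
&= \{[\mathbf{e}_i \cdot (\mathbf{T}\mathbf{e}_j)]\,\mathbf{e}_i \otimes \mathbf{e}_j\}\,\mathbf{u}.
\end{aligned}
$$

Hence, we conclude that any second-order tensor admits the representation given by Eqn. (1.37), with the nine components $T_{ij}$, $i = 1, 2, 3$, $j = 1, 2, 3$, given by Eqn. (1.28). The dyadic products $\mathbf{e}_i \otimes \mathbf{e}_j$, $i = 1, 2, 3$, $j = 1, 2, 3$, constitute a basis of Lin since, as just mentioned, any element of Lin can be expressed in terms of them, and since they are linearly independent ($T_{ij}\,\mathbf{e}_i \otimes \mathbf{e}_j = \mathbf{0}$ implies that $T_{ij} = \mathbf{e}_i \cdot (\mathbf{0}\mathbf{e}_j) = 0$).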

From Eqns. (1.29) and (1.37), it follows that

$$\mathbf{I} = \mathbf{e}_i \otimes \mathbf{e}_i, \tag{1.38}$$

where $\{\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3\}$ is any orthonormal coordinate frame. If $\mathbf{T}$ is represented as given by Eqn. (1.37), it follows from Eqn. (1.32) that the transpose of $\mathbf{T}$ can be represented as

$$\mathbf{T}^T = T_{ji}\,\mathbf{e}_i \otimes \mathbf{e}_j. \tag{1.39}$$

From Eqns. (1.37) and (1.39), we deduce that a tensor is symmetric ($\mathbf{T} = \mathbf{T}^T$) if and only if $T_{ij} = T_{ji}$ for all possible $i$ and $j$. We now show how all the properties of a second-order tensor derived so far can be derived using the dyadic product.

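Using Eqn. (1.28), we see that the components of a dyad $\mathbf{a} \otimes \mathbf{b}$ are given by

$$
\begin{aligned}
(\mathbf{a} \otimes \mathbf{b})_{ij} &= \mathbf{e}_i \cdot (\mathbf{a} \otimes \mathbf{b})\mathbf{e}_j \\
&= \mathbf{e}_i \cdot (\mathbf{b} \cdot \mathbf{e}_j)\mathbf{a} \\
&= a_i b_j.
\end{aligned}
\tag{1.40}
$$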
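The definition (1.36) and the component formula (1.40) can be compared directly. In the Python sketch below, the dyad is first applied as an operator, and its component matrix is then recovered in two ways; the function names `dot`, `dyad_apply` and `dyad_components` and the sample vectors are assumptions of this illustration.

```python
def dot(u, v):
    """Inner product u . v = u_i v_i."""
    return sum(x * y for x, y in zip(u, v))


def dyad_apply(a, b, c):
    """Eqn. (1.36): (a (x) b) c = (b . c) a."""
    s = dot(b, c)
    return [s * ai for ai in a]


def dyad_components(a, b):
    """Eqn. (1.40): (a (x) b)_ij = a_i b_j."""
    return [[ai * bj for bj in b] for ai in a]


E = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]  # Cartesian basis
a, b, c = [1, 2, 3], [4, 5, 6], [1, 0, -1]

M = dyad_components(a, b)
# Components obtained from the operator definition: e_i . (a (x) b) e_j
from_definition = [[dot(E[i], dyad_apply(a, b, E[j])) for j in range(3)] for i in range(3)]
# Action of the component matrix on c, to be compared with dyad_apply(a, b, c)
M_times_c = [sum(M[i][j] * c[j] for j in range(3)) for i in range(3)]

print(M)                        # [[4, 5, 6], [8, 10, 12], [12, 15, 18]]
print(from_definition == M)
print(dyad_apply(a, b, c), M_times_c)  # [-2, -4, -6] [-2, -4, -6]
```

Every row of the component matrix is a multiple of `b`, which reflects the fact that a single dyad maps all vectors onto the line spanned by `a`.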

The components of a vector $\mathbf{v}$ obtained by a second-order tensor $\mathbf{T}$ operating on a vector $\mathbf{u}$ are obtained by noting that

$$v_i\mathbf{e}_i = T_{ij}(\mathbf{e}_i \otimes \mathbf{e}_j)\mathbf{u} = T_{ij}(\mathbf{u} \cdot \mathbf{e}_j)\mathbf{e}_i = T_{ij}u_j\mathbf{e}_i, \tag{1.41}$$

which is equivalent to Eqn. (1.30).

Convention regarding complex vectors: In what follows, we will often encounter the dyadic product of two complex-valued vectors. We note here that we use the same definition for the dyadic product of complex-valued vectors as that for real-valued vectors, namely, the definition given by Eqn. (1.36). Using Eqn. (1.40), we get the components of $\mathbf{a} \otimes \mathbf{b}$ as $a_i b_j$. However, it is possible to follow another convention in defining the dyadic product, namely $(\mathbf{a} \otimes \mathbf{b})\mathbf{c} = (\hat{\mathbf{b}} \cdot \mathbf{c})\mathbf{a}$ (with the hat denoting complex conjugation), whereby one obtains the components as $a_i\hat{b}_j$. We will follow this rule of using the same definitions for the real and complex case even when we define the dyadic products $\mathbf{A} \otimes \mathbf{B}$ and A B of two complex-valued second-order tensors $\mathbf{A}$ and $\mathbf{B}$. The advantage of our convention is that all relations that we derive for the real case are also valid for the complex case, since the definitions are the same. If we used the alternate convention, we would need to re-derive all the relations for the complex case, and this can be quite cumbersome, since we will be deriving dozens of relations (including rules for differentiation) involving tensor products. We emphasize that both conventions are correct, and, as long as one sticks to one convention consistently in all the derivations, the particular choice made is just a matter of convenience.

1.3.2 Principal invariants of a second-order tensor

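Theorem 1.3.1. If $(\mathbf{u}, \mathbf{v}, \mathbf{w})$ and $(\mathbf{a}, \mathbf{b}, \mathbf{c})$ are two sets of three linearly independent vectors, and $\mathbf{T}$ is an arbitrary tensor, then

$$\frac{[\mathbf{T}\mathbf{u}, \mathbf{v}, \mathbf{w}] + [\mathbf{u}, \mathbf{T}\mathbf{v}, \mathbf{w}] + [\mathbf{u}, \mathbf{v}, \mathbf{T}\mathbf{w}]}{[\mathbf{u}, \mathbf{v}, \mathbf{w}]} = \frac{[\mathbf{T}\mathbf{a}, \mathbf{b}, \mathbf{c}] + [\mathbf{a}, \mathbf{T}\mathbf{b}, \mathbf{c}] + [\mathbf{a}, \mathbf{b}, \mathbf{T}\mathbf{c}]}{[\mathbf{a}, \mathbf{b}, \mathbf{c}]}, \tag{1.42a}$$

$$\frac{[\mathbf{T}\mathbf{u}, \mathbf{T}\mathbf{v}, \mathbf{w}] + [\mathbf{u}, \mathbf{T}\mathbf{v}, \mathbf{T}\mathbf{w}] + [\mathbf{T}\mathbf{u}, \mathbf{v}, \mathbf{T}\mathbf{w}]}{[\mathbf{u}, \mathbf{v}, \mathbf{w}]} = \frac{[\mathbf{T}\mathbf{a}, \mathbf{T}\mathbf{b}, \mathbf{c}] + [\mathbf{a}, \mathbf{T}\mathbf{b}, \mathbf{T}\mathbf{c}] + [\mathbf{T}\mathbf{a}, \mathbf{b}, \mathbf{T}\mathbf{c}]}{[\mathbf{a}, \mathbf{b}, \mathbf{c}]}, \tag{1.42b}$$

$$\frac{[\mathbf{T}\mathbf{u}, \mathbf{T}\mathbf{v}, \mathbf{T}\mathbf{w}]}{[\mathbf{u}, \mathbf{v}, \mathbf{w}]} = \frac{[\mathbf{T}\mathbf{a}, \mathbf{T}\mathbf{b}, \mathbf{T}\mathbf{c}]}{[\mathbf{a}, \mathbf{b}, \mathbf{c}]}. \tag{1.42c}$$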

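Proof. Since $\mathbf{u} = u_i\mathbf{e}_i$, $\mathbf{v} = v_j\mathbf{e}_j$ and $\mathbf{w} = w_k\mathbf{e}_k$, the left-hand side of Eqn. (1.42a) can be written as

$$
\begin{aligned}
\text{LHS} &= \frac{u_i v_j w_k \left\{ [\mathbf{T}\mathbf{e}_i, \mathbf{e}_j, \mathbf{e}_k] + [\mathbf{e}_i, \mathbf{T}\mathbf{e}_j, \mathbf{e}_k] + [\mathbf{e}_i, \mathbf{e}_j, \mathbf{T}\mathbf{e}_k] \right\}}{[\mathbf{u}, \mathbf{v}, \mathbf{w}]} \\
&= \frac{\varepsilon_{ijk}u_i v_j w_k \left\{ [\mathbf{T}\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3] + [\mathbf{e}_1, \mathbf{T}\mathbf{e}_2, \mathbf{e}_3] + [\mathbf{e}_1, \mathbf{e}_2, \mathbf{T}\mathbf{e}_3] \right\}}{[\mathbf{u}, \mathbf{v}, \mathbf{w}]} \\
&= [\mathbf{T}\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3] + [\mathbf{e}_1, \mathbf{T}\mathbf{e}_2, \mathbf{e}_3] + [\mathbf{e}_1, \mathbf{e}_2, \mathbf{T}\mathbf{e}_3].
\end{aligned}
$$

The second step uses the fact that, by Eqn. (1.14), the sum in braces changes sign when any two of the indices $i$, $j$, $k$ are interchanged, so that it equals $\varepsilon_{ijk}$ times its value for $(i, j, k) = (1, 2, 3)$; the third step uses Eqn. (1.13). Since the choice of vectors $\mathbf{u}$, $\mathbf{v}$ and $\mathbf{w}$ is arbitrary, Eqn. (1.42a) follows. Equations (1.42b) and (1.42c) are proved in a similar manner.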

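As a consequence of Theorem 1.3.1, corresponding to an arbitrary tensor $\mathbf{T}$, there exist three scalars $I_1$, $I_2$ and $I_3$ such that

$$[\mathbf{T}\mathbf{u}, \mathbf{v}, \mathbf{w}] + [\mathbf{u}, \mathbf{T}\mathbf{v}, \mathbf{w}] + [\mathbf{u}, \mathbf{v}, \mathbf{T}\mathbf{w}] = I_1[\mathbf{u}, \mathbf{v}, \mathbf{w}], \tag{1.43a}$$
$$[\mathbf{T}\mathbf{u}, \mathbf{T}\mathbf{v}, \mathbf{w}] + [\mathbf{u}, \mathbf{T}\mathbf{v}, \mathbf{T}\mathbf{w}] + [\mathbf{T}\mathbf{u}, \mathbf{v}, \mathbf{T}\mathbf{w}] = I_2[\mathbf{u}, \mathbf{v}, \mathbf{w}], \tag{1.43b}$$
$$[\mathbf{T}\mathbf{u}, \mathbf{T}\mathbf{v}, \mathbf{T}\mathbf{w}] = I_3[\mathbf{u}, \mathbf{v}, \mathbf{w}]. \tag{1.43c}$$

The above equations are valid even when $[\mathbf{u}, \mathbf{v}, \mathbf{w}] = 0$. To see this, consider Eqn. (1.43a). Following the proof of Theorem 1.3.1, we see that if $[\mathbf{u}, \mathbf{v}, \mathbf{w}] = 0$, then

$$[\mathbf{T}\mathbf{u}, \mathbf{v}, \mathbf{w}] + [\mathbf{u}, \mathbf{T}\mathbf{v}, \mathbf{w}] + [\mathbf{u}, \mathbf{v}, \mathbf{T}\mathbf{w}] = [\mathbf{u}, \mathbf{v}, \mathbf{w}] \left\{ [\mathbf{T}\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3] + [\mathbf{e}_1, \mathbf{T}\mathbf{e}_2, \mathbf{e}_3] + [\mathbf{e}_1, \mathbf{e}_2, \mathbf{T}\mathbf{e}_3] \right\} = 0.$$

Equations (1.43b) and (1.43c) can be proved analogously.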
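The statement that the three ratios do not depend on the choice of vectors can be illustrated numerically. The Python sketch below evaluates $I_1$, $I_2$ and $I_3$ from Eqns. (1.43a) to (1.43c) for two different linearly independent triples, using exact rational arithmetic. The function names `triple`, `apply` and `invariants`, the sample tensor components, and the two triples are hypothetical choices made for this example.

```python
from fractions import Fraction


def triple(u, v, w):
    """Scalar triple product [u, v, w] = u . (v x w)."""
    return (
        u[0] * (v[1] * w[2] - v[2] * w[1])
        + u[1] * (v[2] * w[0] - v[0] * w[2])
        + u[2] * (v[0] * w[1] - v[1] * w[0])
    )


def apply(T, u):
    """Action of a tensor on a vector: v_i = T_ij u_j."""
    return [sum(T[i][j] * u[j] for j in range(3)) for i in range(3)]


def invariants(T, u, v, w):
    """I_1, I_2, I_3 from Eqns. (1.43a)-(1.43c); requires [u, v, w] != 0."""
    Tu, Tv, Tw = apply(T, u), apply(T, v), apply(T, w)
    vol = triple(u, v, w)
    I1 = Fraction(triple(Tu, v, w) + triple(u, Tv, w) + triple(u, v, Tw), vol)
    I2 = Fraction(triple(Tu, Tv, w) + triple(u, Tv, Tw) + triple(Tu, v, Tw), vol)
    I3 = Fraction(triple(Tu, Tv, Tw), vol)
    return I1, I2, I3


T = [[2, 1, 0], [0, 3, 1], [1, 0, 1]]
cartesian = ([1, 0, 0], [0, 1, 0], [0, 0, 1])
skewed = ([1, 1, 0], [0, 1, 1], [1, 0, 2])  # [u, v, w] = 3, so linearly independent

from_cartesian = invariants(T, *cartesian)
from_skewed = invariants(T, *skewed)

# Both triples give I_1 = 6, I_2 = 11 and I_3 = 7 for this T.
print(from_cartesian == from_skewed == (6, 11, 7))
```

For this sample tensor, $I_1 = 6$ is the sum of the diagonal components and $I_3 = 7$ is the determinant of the component matrix.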





The scalars I1, I2 and I3 are called the principal invariants of T. They are called principal invariants because any other scalar invariant of T can be expressed in terms of them. The first and third invariants are referred to as the trace and determinant of T, respectively, and written as

I1 = tr T , I3 = det T .

Using the properties of the scalar triple product, it is clear that the trace is a linear operation, i.e.,

tr (αR + βS) = α tr R + β tr S ∀α, β ∈ ℜ and R, S ∈ Lin.

To get the component form for tr T , choose u = e1, v = e2 and w = e3 in Eqn. (1.43a):

tr T = e1 · Te1 + e2 · Te2 + e3 · Te3 = T11 + T22 + T33 = Tii. (1.44)

Similarly, we have

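\[
\begin{aligned}
\operatorname{tr} T^T &= e_1 \cdot T^T e_1 + e_2 \cdot T^T e_2 + e_3 \cdot T^T e_3 \\
&= e_1 \cdot T e_1 + e_2 \cdot T e_2 + e_3 \cdot T e_3 \\
&= \operatorname{tr} T.
\end{aligned} \tag{1.45}
\]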

By letting T = a⊗ b in Eqn. (1.44), we obtain

tr (a⊗ b) = a1b1 + a2b2 + a3b3 = aibi = a · b. (1.46)

Using the linearity of the trace operator, and Eqn. (1.37), we get

\[
\operatorname{tr} T = \operatorname{tr}(T_{ij}\, e_i \otimes e_j) = T_{ij} \operatorname{tr}(e_i \otimes e_j) = T_{ij}\, e_i \cdot e_j = T_{ii},
\]

which agrees with Eqn. (1.44).

To prove

tr (RS) = tr (SR),

we choose u, v and w to be an orthonormal basis, so that

\[
\begin{aligned}
\operatorname{tr}(RS) &= (RSu, u) + (RSv, v) + (RSw, w) \\
&= (Su, R^T u) + (Sv, R^T v) + (Sw, R^T w).
\end{aligned}
\]

Substituting

\[
\begin{aligned}
R^T u &= (R^T u, u)u + (R^T u, v)v + (R^T u, w)w \\
&= (u, Ru)u + (u, Rv)v + (u, Rw)w,
\end{aligned}
\]

and similar expressions for $R^T v$ and $R^T w$, we get

\[
\begin{aligned}
\operatorname{tr}(RS) = {} & (u, Ru)(u, Su) + (u, Rv)(v, Su) + (u, Rw)(w, Su) \\
& + (v, Ru)(u, Sv) + (v, Rv)(v, Sv) + (v, Rw)(w, Sv) \\
& + (w, Ru)(u, Sw) + (w, Rv)(v, Sw) + (w, Rw)(w, Sw).
\end{aligned}
\]

Interchanging R and S in the above equation gives an expression for tr (SR), which is the same as that for tr (RS), thus yielding the desired result. The proof using indicial notation is left as an exercise (Problem 6).

Similar to the vector inner product given by Eqn. (1.7), we can define a tensor inner product of two second-order tensors R and S, denoted by R : S, by

\[
(R, S) = R : S := \operatorname{tr}(R^T S) = \operatorname{tr}(R S^T) = \operatorname{tr}(S R^T) = \operatorname{tr}(S^T R) = R_{ij} S_{ij}. \tag{1.47}
\]

The fact that R : S satisfies the inner product conditions in Eqn. (1.1) can be easily verified. Thus, Lin equipped with the above inner product is an inner product space, and enjoys all the properties derived in Section 1.1. In particular, if the magnitude associated with the inner product is given by

\[
|T| = (T : T)^{1/2},
\]

then by the Cauchy–Schwarz inequality, we get

\[
|(R, S)| \le |R|\,|S|.
\]

We have the following useful property:

\[
R : (ST) = (S^T R) : T = (R T^T) : S = (T R^T) : S^T, \tag{1.48}
\]

since

\[
\begin{aligned}
R : (ST) &= \operatorname{tr}\left[(ST)^T R\right] = \operatorname{tr}\left[T^T (S^T R)\right] = (S^T R) : T = (R^T S) : T^T \\
&= \operatorname{tr}\left[S^T (R T^T)\right] = (R T^T) : S = (T R^T) : S^T.
\end{aligned}
\]
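The identities in Eqns. (1.47) and (1.48), together with tr (RS) = tr (SR), are easy to spot-check numerically. The sketch below is only illustrative: the random tensors (fixed seed) and the helper name `inner` are assumptions made for the example.

```python
import numpy as np

def inner(R, S):
    """Tensor inner product R : S = tr(R^T S) = R_ij S_ij, Eqn. (1.47)."""
    return float(np.trace(R.T @ S))

rng = np.random.default_rng(0)  # fixed seed; the data are arbitrary
R, S, T = (rng.standard_normal((3, 3)) for _ in range(3))

lhs = inner(R, S @ T)                  # R : (ST)
forms = [inner(S.T @ R, T),            # (S^T R) : T
         inner(R @ T.T, S),            # (R T^T) : S
         inner(T @ R.T, S.T)]          # (T R^T) : S^T

# Magnitudes used in the Cauchy-Schwarz inequality |(R, S)| <= |R| |S|.
norm_R = np.sqrt(inner(R, R))
norm_S = np.sqrt(inner(S, S))
```

All entries of forms should match lhs up to round-off, and |inner(R, S)| should not exceed norm_R * norm_S.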

From Eqn. (1.43c), it can be seen that

det I = 1,

det(αT) = α³ det T .

It also follows from Eqn. (1.43c) that

det(RS) [u, v, w] = [RSu, RSv, RSw]

= det R [Su, Sv, Sw]

= (det R)(det S) [u, v, w] .

Choosing u, v and w to be linearly independent, the above equation leads us to the relation

det(RS) = (det R)(det S), (1.49)

from which we also conclude that det(RS) = det(SR).

By choosing $u = e_j \times e_k = \varepsilon_{ijk} e_i$, $v = e_j$ and $w = e_k$ in Eqn. (1.43c), so that $[u, v, w] = \varepsilon_{ijk}\varepsilon_{ijk} = 6$ (by Eqn. (1.19c)), and using the fact that $T e_i = T_{pi} e_p$, we get

\[
\det T = \frac{1}{6}\varepsilon_{ijk}\varepsilon_{pqr} T_{pi} T_{qj} T_{rk} = \frac{1}{6}\varepsilon_{ijk}\varepsilon_{pqr} T_{ip} T_{jq} T_{kr}, \tag{1.50}
\]

where the second relation is obtained by interchanging the dummy indices. From Eqn. (1.50), it is immediately obvious that (see Section 1.3.4 for a proof without using indicial notation)

det TT = det T . (1.51)

By choosing u = ep, v = eq and w = er in Eqn. (1.43c), and using Eqn. (1.51), we also have

εpqr(det T) = εijkTipTjqTkr = εijkTpiTqjTrk. (1.52)

By choosing (p, q, r) = (1, 2, 3) in the above equation, we get

det T = εijkTi1Tj2Tk3 = εijkT1iT2jT3k.
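The indicial formulas for the determinant translate almost literally into `numpy.einsum` calls. The following sketch builds the permutation symbol explicitly and evaluates Eqn. (1.50) and the column form above; the test tensor and function name are arbitrary choices for illustration.

```python
import numpy as np

def levi_civita():
    """Return the 3x3x3 permutation symbol eps[i, j, k] (zero-based indices)."""
    eps = np.zeros((3, 3, 3))
    for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
        eps[i, j, k] = 1.0    # even permutations
        eps[i, k, j] = -1.0   # odd permutations
    return eps

eps = levi_civita()
T = np.array([[2.0, 1.0, 0.0],
              [0.5, 3.0, -1.0],
              [1.0, 0.0, 4.0]])

# Eqn. (1.50): det T = (1/6) eps_ijk eps_pqr T_pi T_qj T_rk
det_150 = np.einsum('ijk,pqr,pi,qj,rk->', eps, eps, T, T, T) / 6.0

# Column form: det T = eps_ijk T_i1 T_j2 T_k3
det_cols = np.einsum('ijk,i,j,k->', eps, T[:, 0], T[:, 1], T[:, 2])
```

Both values should agree with a library determinant such as `np.linalg.det(T)`.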

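Using Eqns. (1.16), (1.49), (1.51), (1.340) and (1.341), we have

\[
\begin{aligned}
[u, v, w]\,[p, q, r] &= \det\left[e_1 \otimes u + e_2 \otimes v + e_3 \otimes w\right] \det\left[e_1 \otimes p + e_2 \otimes q + e_3 \otimes r\right] \\
&= \det\left\{\left[u \otimes e_1 + v \otimes e_2 + w \otimes e_3\right]\left[e_1 \otimes p + e_2 \otimes q + e_3 \otimes r\right]\right\} \\
&= \det\left[u \otimes p + v \otimes q + w \otimes r\right].
\end{aligned} \tag{1.53}
\]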

We now prove the following important theorem:

Theorem 1.3.2. Given a tensor T, there exists a nonzero vector n such that Tn = 0 if and only if det T = 0.

Proof. If det T = 0, then by Eqn. (1.43c),

[Tu, Tv, Tw] = 0 ∀u, v, w ∈ V,

which, by Theorem 1.2.2, implies that Tu, Tv, Tw are linearly dependent. This means that there exist scalars α, β and γ, not all zero, such that

αTu + βTv + γTw = T(αu + βv + γw) = 0.

Thus, Tn = 0, where n = αu + βv + γw. Choosing u, v and w to be linearly independent ensures that n is nonzero.

Conversely, if there exists a nonzero vector n such that Tn = 0, choose v and w such that n, v and w are linearly independent, i.e., [n, v, w] ≠ 0. Then, by Eqn. (1.43c), we have

det T [n, v, w] = [Tn, Tv, Tw] = [0, Tv, Tw] = 0,

which implies that det T = 0.

1.3.3 Inverse of a tensor

In order to define the concept of an inverse of a tensor, it is convenient to first introduce the cofactor tensor, denoted by cof T, and defined by the relation

cof T(u× v) := Tu× Tv ∀u, v ∈ V. (1.54)





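cof T obviously maps a vector to a vector. To prove linearity, note that every nonzero vector w can be expressed as u × v for some nonzero u, v ∈ V (e.g., take v as a unit vector perpendicular to w, and u = v × w). We now show that if $w_1$ and $w_2$ are two arbitrary nonzero vectors, they can be expressed as $u_1 \times v$ and $u_2 \times v$. If $w_1$ and $w_2$ are linearly dependent, then $w_2 = \alpha w_1$, α ∈ ℜ, and then $u_2 = \alpha u_1$. If $w_1$ and $w_2$ are linearly independent, then let $v = (w_1 \times w_2)/|w_1 \times w_2|^2$. Using Eqn. (1.21a), we can show that $u_1 = -(w_1 \cdot w_2) w_1 + |w_1|^2 w_2$ and $u_2 = -|w_2|^2 w_1 + (w_1 \cdot w_2) w_2$ yield $u_1 \times v = w_1$ and $u_2 \times v = w_2$. Noting that T ∈ Lin, we now have

\[
\begin{aligned}
\operatorname{cof} T(\alpha w_1 + \beta w_2) &= \operatorname{cof} T\left[(\alpha u_1 + \beta u_2) \times v\right] \\
&= \left[T(\alpha u_1 + \beta u_2)\right] \times Tv \\
&= \left[\alpha T u_1 + \beta T u_2\right] \times Tv \\
&= \alpha (T u_1 \times Tv) + \beta (T u_2 \times Tv) \\
&= \alpha \operatorname{cof} T(u_1 \times v) + \beta \operatorname{cof} T(u_2 \times v) \\
&= \alpha \operatorname{cof} T\, w_1 + \beta \operatorname{cof} T\, w_2,
\end{aligned}
\]

thus proving that cof T ∈ Lin.

Using Eqn. (1.54), we now prove the following explicit formula for the cofactor:

\[
(\operatorname{cof} T)^T = I_2 I - (\operatorname{tr} T)\, T + T^2. \tag{1.55}
\]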

Replacing u in Eqn. (1.43a) by Tu, we get

\[
\operatorname{tr} T\, [Tu, v, w] = [T^2 u, v, w] + [Tu, Tv, w] + [Tu, v, Tw],
\]

which on rearranging yields

\[
[Tu, Tv, w] + [Tu, v, Tw] = \operatorname{tr} T\, [Tu, v, w] - [T^2 u, v, w]. \tag{1.56}
\]

Using Eqns. (1.43b) and (1.54), we have

\[
I_2 [u, v, w] = [Tu, Tv, w] + [Tu, v, Tw] + \left[(\operatorname{cof} T)^T u, v, w\right]. \tag{1.57}
\]

Substituting Eqn. (1.56) into (1.57), using the fact that u, v and w are arbitrary, and invoking equality of tensors, we get Eqn. (1.55).

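It immediately follows from Eqn. (1.55) that cof T corresponding to a given T is unique. We also observe that

\[
\operatorname{cof} T^T = (\operatorname{cof} T)^T, \tag{1.58}
\]

and that

\[
(\operatorname{cof} T)^T T = T (\operatorname{cof} T)^T. \tag{1.59}
\]

The components of the cofactor tensor are given by

\[
(\operatorname{cof} T)_{ij} = \frac{1}{2}\varepsilon_{imn}\varepsilon_{jpq} T_{mp} T_{nq}. \tag{1.60}
\]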





To prove this, let $u = e_q \times e_j$ and $v = e_q$ in Eqn. (1.54), and use Eqn. (1.21b) to get $u \times v = 2e_j$. On the right-hand side of Eqn. (1.54), use the relations $e_q \times e_j = \varepsilon_{pqj} e_p$, $T e_p = T_{mp} e_m$ and $T e_q = T_{nq} e_n$. Taking the dot product with $e_i$ of both sides of the relation so obtained, we get the desired result. Equation (1.60) when written out explicitly reads

\[
[\operatorname{cof} T] =
\begin{bmatrix}
T_{22}T_{33} - T_{23}T_{32} & T_{23}T_{31} - T_{21}T_{33} & T_{21}T_{32} - T_{22}T_{31} \\
T_{32}T_{13} - T_{33}T_{12} & T_{33}T_{11} - T_{31}T_{13} & T_{31}T_{12} - T_{32}T_{11} \\
T_{12}T_{23} - T_{13}T_{22} & T_{13}T_{21} - T_{11}T_{23} & T_{11}T_{22} - T_{12}T_{21}
\end{bmatrix}. \tag{1.61}
\]

Equation (1.54) can be used to get a simple expression for $I_2$, the second invariant of a tensor. We first write Eqn. (1.43b) as

\[
I_2 [u, v, w] = [u, Tv, Tw] + [v, Tw, Tu] + [w, Tu, Tv],
\]

and then use Eqn. (1.54) to write the above equation as

\[
\begin{aligned}
I_2 [u, v, w] &= \left[(\operatorname{cof} T)^T u, v, w\right] + \left[(\operatorname{cof} T)^T v, w, u\right] + \left[(\operatorname{cof} T)^T w, u, v\right] \\
&= \left[(\operatorname{cof} T)^T u, v, w\right] + \left[u, (\operatorname{cof} T)^T v, w\right] + \left[u, v, (\operatorname{cof} T)^T w\right] \\
&= \operatorname{tr} (\operatorname{cof} T)^T\, [u, v, w],
\end{aligned}
\]

where the last step follows from Eqn. (1.43a). Using Eqn. (1.45), and choosing u, v and w to be linearly independent (so that [u, v, w] ≠ 0), we get

\[
I_2 = \operatorname{tr}(\operatorname{cof} T). \tag{1.62}
\]

An alternative expression for $I_2$ can be found by directly taking the trace of both sides of Eqn. (1.55). Using Eqns. (1.45) and (1.62), and the fact that the trace is a linear operator, we get

\[
I_2 = 3 I_2 - (\operatorname{tr} T)^2 + \operatorname{tr} T^2,
\]

which implies that

\[
I_2 = \frac{1}{2}\left[(\operatorname{tr} T)^2 - \operatorname{tr} T^2\right]. \tag{1.63}
\]

The equivalence of the two formulae for $I_2$ given by Eqns. (1.62) and (1.63) can also be shown using indicial notation (see Problem 10).
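The relations between Eqns. (1.54), (1.55), (1.62) and (1.63) can be illustrated numerically. The sketch below computes the cofactor from the explicit formula (1.55) and then checks the defining property (1.54); the tensor, the vectors and the function name `cofactor` are arbitrary assumptions for the example.

```python
import numpy as np

def cofactor(T):
    """Cofactor via Eqn. (1.55): (cof T)^T = I2 I - (tr T) T + T^2."""
    I1 = np.trace(T)
    I2 = 0.5 * (I1**2 - np.trace(T @ T))   # Eqn. (1.63)
    return (I2 * np.eye(3) - I1 * T + T @ T).T

T = np.array([[2.0, 1.0, 0.0],
              [0.5, 3.0, -1.0],
              [1.0, 0.0, 4.0]])
u = np.array([1.0, 2.0, 0.0])
v = np.array([0.0, 1.0, -1.0])

C = cofactor(T)
lhs = C @ np.cross(u, v)             # cof T (u x v)
rhs = np.cross(T @ u, T @ v)         # Tu x Tv, Eqn. (1.54)
I2_trace_cof = np.trace(C)           # Eqn. (1.62)
I2_traces = 0.5 * (np.trace(T)**2 - np.trace(T @ T))  # Eqn. (1.63)
```

Here lhs and rhs should coincide, and the two values of the second invariant should agree; individual entries of C can also be compared against Eqn. (1.61).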

Substituting Eqn. (1.54) into Eqn. (1.43c), we get

(Tu, cof T(v×w)) = det T(Iu, v×w) ∀u, v, w ∈ V,

which implies that

\[
\left((\operatorname{cof} T)^T T u, v \times w\right) = \left((\det T) I u, v \times w\right) \quad \forall u, v, w \in V.
\]

Any arbitrary vector z can be expressed as v × w for some v, w ∈ V (take w as a unit vector perpendicular to z, and v = w × z), so that from Eqn. (1.24), it follows that $(\operatorname{cof} T)^T T = (\det T) I$, which when combined with Eqn. (1.59) yields

\[
T (\operatorname{cof} T)^T = (\operatorname{cof} T)^T T = (\det T) I. \tag{1.64}
\]





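Similar to the result for determinants, the cofactor of the product of two tensors is the product of the cofactors of the tensors (see also Problem 9), i.e.,

\[
\operatorname{cof}(RS) = (\operatorname{cof} R)(\operatorname{cof} S).
\]

This is because

\[
\begin{aligned}
\left[\operatorname{cof} R \operatorname{cof} S\right]^T RS &= (\operatorname{cof} S)^T (\operatorname{cof} R)^T R S \\
&= (\det R)(\operatorname{cof} S)^T I S && \text{(by Eqn. (1.64))} \\
&= (\det R)(\det S) I && \text{(by Eqn. (1.64))} \\
&= \det(RS) I, && \text{(by Eqn. (1.49))}
\end{aligned}
\]

and since the cofactor matrix is unique, the result follows.

The inverse of a second-order tensor T, denoted by $T^{-1}$, is defined by

\[
T^{-1} T = I, \tag{1.65}
\]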

where I is the identity tensor. A characterization of an invertible tensor is the following:

Theorem 1.3.3. A tensor T is invertible if and only if det T ≠ 0. The inverse, if it exists, is unique.

Proof. Assuming $T^{-1}$ exists, from Eqns. (1.49) and (1.65), we have $(\det T)(\det T^{-1}) = 1$, and hence det T ≠ 0.

Conversely, if det T ≠ 0, then from Eqn. (1.64), we see that at least one inverse exists, and is given by

\[
T^{-1} = \frac{1}{\det T} (\operatorname{cof} T)^T. \tag{1.66}
\]

Let $T_1^{-1}$ and $T_2^{-1}$ be two inverses that satisfy $T_1^{-1} T = T_2^{-1} T = I$, from which it follows that $(T_1^{-1} - T_2^{-1}) T = 0$. Choose $T_2^{-1}$ to be given by the expression in Eqn. (1.66) so that, by virtue of Eqn. (1.64), we also have $T T_2^{-1} = I$. Multiplying both sides of $(T_1^{-1} - T_2^{-1}) T = 0$ on the right by $T_2^{-1}$, we get $T_1^{-1} = T_2^{-1}$, which establishes the uniqueness of $T^{-1}$.

From Eqns. (1.64) and (1.66), we have

\[
T^{-1} T = T T^{-1} = I. \tag{1.67}
\]
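Equation (1.66) gives a direct recipe for the inverse in three dimensions. The sketch below is one possible implementation under stated assumptions: the cofactor is assembled column by column from the definition (1.54) applied to the basis vectors (so that its first column is $Te_2 \times Te_3$, and so on), and the singularity tolerance `tol` is an arbitrary illustrative choice.

```python
import numpy as np

def cofactor(T):
    """Cofactor from Eqn. (1.54): cof T (e_j x e_k) = T e_j x T e_k."""
    c1, c2, c3 = T[:, 0], T[:, 1], T[:, 2]   # columns T e_1, T e_2, T e_3
    return np.column_stack([np.cross(c2, c3),
                            np.cross(c3, c1),
                            np.cross(c1, c2)])

def inverse(T, tol=1e-12):
    """Inverse via Eqn. (1.66): T^{-1} = (cof T)^T / det T."""
    d = np.linalg.det(T)
    if abs(d) < tol:
        raise ValueError("tensor is singular (det T = 0)")
    return cofactor(T).T / d

T = np.array([[2.0, 1.0, 0.0],
              [0.5, 3.0, -1.0],
              [1.0, 0.0, 4.0]])
T_inv = inverse(T)
```

In line with Eqn. (1.67), T_inv should act as both a left and a right inverse. For larger or ill-conditioned systems a library solver would normally be preferred over this explicit formula.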

Another characterization of an invertible tensor is the following:

Theorem 1.3.4. A tensor T is invertible if and only if the equation

Tu = v,

has a unique solution u ∈ V for each v ∈ V.





Proof. If T is invertible, then multiplying v = Tu by $T^{-1}$, we get

\[
T^{-1} v = T^{-1} T u = I u = u.
\]

Since $T^{-1}$ is unique, u is unique.

To prove the converse, note that if $u_1$ and $u_2$ are two solutions of the equation Tu = v, then, by the assumed uniqueness, $u_1 = u_2$. Also, by assumption, for every v ∈ V, there exists a u such that Tu = v. Thus, the mapping T : V → V is one-to-one and onto, and hence, $T^{-1}$ : V → V exists.

Summarizing, if T is invertible, we have

Tu = v ⇐⇒ u = T−1v, u, v ∈ V.

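By the above property, $T^{-1}$ clearly maps vectors to vectors. Hence, to prove that $T^{-1}$ is a second-order tensor, we just need to prove linearity. Let a, b ∈ V be two arbitrary vectors, and let $u = T^{-1} a$ and $v = T^{-1} b$. Since $I = T^{-1} T$, we have

\[
\begin{aligned}
I(\alpha u + \beta v) &= T^{-1} T(\alpha u + \beta v) \\
&= T^{-1}\left[T(\alpha u + \beta v)\right] \\
&= T^{-1}\left[\alpha T u + \beta T v\right] \\
&= T^{-1}(\alpha a + \beta b),
\end{aligned}
\]

which implies that

\[
T^{-1}(\alpha a + \beta b) = \alpha T^{-1} a + \beta T^{-1} b \quad \forall a, b \in V \text{ and } \alpha, \beta \in \Re.
\]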

The inverse of the product of two invertible tensors R and S is

\[
(RS)^{-1} = S^{-1} R^{-1}, \tag{1.68}
\]

since the inverse is unique, and

\[
S^{-1} R^{-1} R S = S^{-1} I S = S^{-1} S = I.
\]

Similarly, if T is invertible, then $T^T$ is invertible since $\det T^T = \det T \ne 0$. The inverse of the transpose is given by

\[
(T^T)^{-1} = (T^{-1})^T, \tag{1.69}
\]

since

\[
(T^{-1})^T T^T = (T T^{-1})^T = I^T = I.
\]

Hence, without fear of ambiguity, we can write

\[
T^{-T} := (T^T)^{-1} = (T^{-1})^T.
\]





From Eqn. (1.69), it follows that if T ∈ Sym, then $T^{-1}$ ∈ Sym.

Although no easy expression for $(T_1 + T_2)^{-1}$ is known, the following Woodbury formula gives an expression for a particular kind of perturbation of an invertible tensor T, and holds for any underlying space dimension n:

\[
(T + UCV)^{-1} = T^{-1} - T^{-1} U (C^{-1} + V T^{-1} U)^{-1} V T^{-1}. \tag{1.70}
\]

To prove this formula, consider the inversion of the matrix $\begin{bmatrix} T & -U \\ V & C^{-1} \end{bmatrix}$. If $\begin{bmatrix} A_1 & B_1 \\ C_1 & D_1 \end{bmatrix}$ represents the inverse of this matrix, then the condition

\[
\begin{bmatrix} A_1 & B_1 \\ C_1 & D_1 \end{bmatrix}
\begin{bmatrix} T & -U \\ V & C^{-1} \end{bmatrix}
=
\begin{bmatrix} I & 0 \\ 0 & I \end{bmatrix},
\]

leads to the equations

\[
\begin{aligned}
A_1 T + B_1 V &= I, && (1.71a) \\
-A_1 U + B_1 C^{-1} &= 0. && (1.71b)
\end{aligned}
\]

From Eqn. (1.71b), we get $B_1 = A_1 U C$, which on substituting into Eqn. (1.71a) yields

\[
A_1 = (T + UCV)^{-1}. \tag{1.72}
\]

On the other hand, from Eqn. (1.71a), we have

\[
A_1 = T^{-1} - B_1 V T^{-1}, \tag{1.73}
\]

which on substituting into Eqn. (1.71b) yields

\[
B_1 = T^{-1} U (C^{-1} + V T^{-1} U)^{-1}.
\]

Substituting this expression into Eqn. (1.73) yields

\[
A_1 = T^{-1} - T^{-1} U (C^{-1} + V T^{-1} U)^{-1} V T^{-1}. \tag{1.74}
\]

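Comparing Eqns. (1.72) and (1.74), we get Eqn. (1.70). If C = 1, and U and V reduce to a single column u and a single row $v^T$, respectively, then we get the following Sherman–Morrison formula:

\[
(T + u \otimes v)^{-1} = T^{-1} - \frac{T^{-1}(u \otimes v) T^{-1}}{1 + v \cdot T^{-1} u}. \tag{1.75}
\]

$1 + v \cdot T^{-1} u$ should be nonzero for $T + u \otimes v$ to be invertible.

Let S ∈ Sym be an invertible tensor, and let $\bar{u} = S^{-1} u$ and $\bar{v} = S^{-1} v$. Note that because of the symmetry of S, we have $\bar{u} \cdot v = u \cdot \bar{v}$. By applying Eqn. (1.75) twice, we get

\[
(S + u \otimes v + v \otimes u)^{-1} = S^{-1} + \frac{1}{D}\left[(u \cdot \bar{u})\, \bar{v} \otimes \bar{v} + (v \cdot \bar{v})\, \bar{u} \otimes \bar{u} - (1 + u \cdot \bar{v})(\bar{u} \otimes \bar{v} + \bar{v} \otimes \bar{u})\right],
\]

where

\[
D = (1 + u \cdot \bar{v})^2 - (u \cdot \bar{u})(v \cdot \bar{v}).
\]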





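D should be nonzero for $S + u \otimes v + v \otimes u$ to be invertible.

As a corollary, for S = I, we get

\[
(I + u \otimes v + v \otimes u)^{-1} = I + \frac{1}{D}\left[|u|^2\, v \otimes v + |v|^2\, u \otimes u - (1 + u \cdot v)(u \otimes v + v \otimes u)\right],
\]

where

\[
D = (1 + u \cdot v)^2 - |u|^2 |v|^2.
\]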
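The update formulas above lend themselves to a quick numerical check. The sketch below implements the Sherman–Morrison formula (1.75) and the symmetric rank-two update with $\bar{u} = S^{-1}u$, $\bar{v} = S^{-1}v$; the symmetric tensor S, the vectors and the function names are arbitrary assumptions chosen so that all denominators are nonzero.

```python
import numpy as np

def sherman_morrison(T_inv, u, v):
    """Eqn. (1.75): inverse of T + u (x) v, given the inverse of T."""
    denom = 1.0 + v @ T_inv @ u           # must be nonzero
    return T_inv - (T_inv @ np.outer(u, v) @ T_inv) / denom

def sym_rank_two_inverse(S_inv, u, v):
    """Inverse of S + u (x) v + v (x) u for symmetric invertible S."""
    ub, vb = S_inv @ u, S_inv @ v         # u-bar and v-bar
    D = (1.0 + u @ vb)**2 - (u @ ub) * (v @ vb)   # must be nonzero
    correction = ((u @ ub) * np.outer(vb, vb)
                  + (v @ vb) * np.outer(ub, ub)
                  - (1.0 + u @ vb) * (np.outer(ub, vb) + np.outer(vb, ub)))
    return S_inv + correction / D

S = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 5.0]])           # symmetric, invertible
u = np.array([1.0, 0.0, 2.0])
v = np.array([0.0, 1.0, -1.0])
S_inv = np.linalg.inv(S)

rank_one = sherman_morrison(S_inv, u, v)
rank_two = sym_rank_two_inverse(S_inv, u, v)
```

Each result can be compared with a direct inversion of the perturbed tensor; the practical appeal of such formulas is that a known inverse can be updated without refactorizing.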

Consider the tensor

H = I + u⊗ v, (1.76)

and let the underlying space dimension be n. Let $e_1^*, e_2^*, \ldots, e_{n-1}^*$ be n − 1 unit vectors that are perpendicular to v, and mutually perpendicular to each other. Then from Eqn. (1.76), it is immediately evident that $(1, e_1^*), (1, e_2^*), \ldots, (1, e_{n-1}^*), (1 + u \cdot v, u/|u|)$ are eigenvalues/eigenvectors of H. If u · v ≠ 0, then all the eigenvectors are linearly independent, while if u · v = 0, then the first n − 1 eigenvectors are linearly independent, while the last can be expressed in terms of them. The characteristic equation is

\[
(\lambda - 1)^{n-1}(\lambda - 1 - u \cdot v) = 0.
\]

On applying the Cayley–Hamilton theorem (see Theorem 1.3.5), we get

\[
[H - I]^{n-1}\,[H - (1 + u \cdot v) I] = 0.
\]

The minimal polynomial is

[H − I][H − (1 + u · v)I] = 0,

which can be immediately verified by using Eqn. (1.76). Multiplying the eigenvalues of H, we get

\[
\det H = 1 + u \cdot v.
\]

Now consider the tensor that occurs in Eqn. (1.75), which is a generalization of the tensor H, namely,

\[
M = T + u \otimes v,
\]

which can be written as $T[I + (T^{-1}u) \otimes v]$, assuming T to be invertible. Thus,

\[
\det M = \det T\, [1 + (T^{-1}u) \cdot v].
\]

The result for arbitrary T is obtained by taking the limit in the above equation. We get

\[
\det M = \det T + u \cdot [(\operatorname{cof} T) v],
\]

with cof T given by Eqn. (J.12).

Consider a symmetric perturbation to I:

\[
K = I + u \otimes v + v \otimes u.
\]





Let $e_1^*, e_2^*, \ldots, e_{n-2}^*$ be n − 2 unit vectors that are perpendicular to u and v. Then $(1, e_1^*), (1, e_2^*), \ldots, (1, e_{n-2}^*)$, $(1 + u \cdot v + |u||v|,\ u/|u| + v/|v|)$, $(1 + u \cdot v - |u||v|,\ u/|u| - v/|v|)$ are the eigenvalues/eigenvectors of K, so that

\[
\det K = (1 + u \cdot v + |u||v|)(1 + u \cdot v - |u||v|) = (1 + u \cdot v)^2 - |u|^2 |v|^2. \tag{1.77}
\]

By virtue of the Cauchy–Schwarz inequality,

\[
\det K \le 1 + 2 u \cdot v.
\]

Let S be a symmetric positive definite tensor, and let $U = \sqrt{S}$ be its symmetric positive definite square-root. If Z is defined by

\[
Z := S + u \otimes v + v \otimes u,
\]

then it can be written as $U[I + (U^{-1}u) \otimes (U^{-1}v) + (U^{-1}v) \otimes (U^{-1}u)]U$, so that on using Eqn. (1.77), we get

\[
\det Z = \det S \left[\left(1 + (S^{-1}u) \cdot v\right)^2 - (u \cdot S^{-1}u)(v \cdot S^{-1}v)\right].
\]

Although our proof assumed S to be positive definite, actually, the above result holds for any invertible S ∈ Sym.
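The determinant formulas for rank-one and rank-two perturbations can be spot-checked numerically, including the case of a singular T where the limiting form det M = det T + u · [(cof T)v] is needed. In the sketch below, the tensors, vectors and the column-wise `cofactor` helper are illustrative assumptions.

```python
import numpy as np

def cofactor(T):
    """Cofactor of a 3x3 tensor: columns are cross products of columns of T."""
    c1, c2, c3 = T[:, 0], T[:, 1], T[:, 2]
    return np.column_stack([np.cross(c2, c3),
                            np.cross(c3, c1),
                            np.cross(c1, c2)])

u = np.array([1.0, 0.0, 2.0])
v = np.array([0.0, 1.0, -1.0])

# A singular T (second row is twice the first), so T^{-1} does not exist.
T_sing = np.array([[1.0, 2.0, 3.0],
                   [2.0, 4.0, 6.0],
                   [0.0, 1.0, 1.0]])
det_M_formula = np.linalg.det(T_sing) + u @ cofactor(T_sing) @ v
det_M_direct = np.linalg.det(T_sing + np.outer(u, v))

# Symmetric perturbation of the identity, Eqn. (1.77).
K = np.eye(3) + np.outer(u, v) + np.outer(v, u)
det_K_formula = (1.0 + u @ v)**2 - (u @ u) * (v @ v)
det_K_direct = np.linalg.det(K)
```

For these particular values a hand calculation gives det M = −4 and det K = −9, and det K also satisfies the bound det K ≤ 1 + 2u · v.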

1.3.4 Eigenvalues and eigenvectors of tensors

If T is an arbitrary tensor, a vector n is said to be an eigenvector of T if there exists λ such that

\[
Tn = \lambda n. \tag{1.78}
\]

Writing the above equation as (T − λI)n = 0, we see from Theorem 1.3.2 that a nontrivial eigenvector n exists if and only if

\[
\det(T - \lambda I) = 0.
\]

This is known as the characteristic equation of T. Using Eqn. (1.43c), the characteristic equation can be written as

\[
[Tu - \lambda u, Tv - \lambda v, Tw - \lambda w] = 0,
\]

for linearly independent vectors u, v and w. Using Eqns. (1.14) and (1.15), we can write the characteristic equation as

\[
\lambda^3 - I_1 \lambda^2 + I_2 \lambda - I_3 = 0, \tag{1.79}
\]

where $I_1$, $I_2$ and $I_3$ are the principal invariants given by Eqns. (1.43). Since the principal invariants are real, Eqn. (1.79) has either one or three real roots. If one of the eigenvalues is complex, then it follows from Eqn. (1.78) that the corresponding eigenvector is also complex. By taking the complex conjugate of both sides of Eqn. (1.78), we see that the complex conjugate of the complex eigenvalue, and the corresponding complex eigenvector are also eigenvalues and eigenvectors, respectively. Thus, eigenvalues and eigenvectors, if complex, occur in complex conjugate pairs. If $\lambda_1$, $\lambda_2$, $\lambda_3$ are the roots of the characteristic equation, then from Eqn. (1.79), it follows that

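\[
\begin{aligned}
I_1 &= \operatorname{tr} T = T_{11} + T_{22} + T_{33} \\
&= \lambda_1 + \lambda_2 + \lambda_3,
\end{aligned} \tag{1.80a}
\]

\[
\begin{aligned}
I_2 &= \operatorname{tr} \operatorname{cof} T = \frac{1}{2}\left[(\operatorname{tr} T)^2 - \operatorname{tr}(T^2)\right] \\
&= \begin{vmatrix} T_{11} & T_{12} \\ T_{21} & T_{22} \end{vmatrix}
 + \begin{vmatrix} T_{22} & T_{23} \\ T_{32} & T_{33} \end{vmatrix}
 + \begin{vmatrix} T_{11} & T_{13} \\ T_{31} & T_{33} \end{vmatrix} \\
&= \lambda_1 \lambda_2 + \lambda_2 \lambda_3 + \lambda_1 \lambda_3,
\end{aligned} \tag{1.80b}
\]

\[
\begin{aligned}
I_3 &= \det(T) = \frac{1}{6}\left[(\operatorname{tr} T)^3 - 3(\operatorname{tr} T)(\operatorname{tr} T^2) + 2 \operatorname{tr} T^3\right] = \varepsilon_{ijk} T_{i1} T_{j2} T_{k3} \\
&= \lambda_1 \lambda_2 \lambda_3,
\end{aligned} \tag{1.80c}
\]

where |.| denotes the determinant. The set of eigenvalues $\{\lambda_1, \lambda_2, \lambda_3\}$ is known as the spectrum of T.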

If λ is an eigenvalue, and n is the associated eigenvector of T, then $\lambda^2$ is the eigenvalue of $T^2$, and n is the associated eigenvector, since

\[
T^2 n = T(Tn) = T(\lambda n) = \lambda T n = \lambda^2 n.
\]

In general, $\lambda^n$ is an eigenvalue of $T^n$ with associated eigenvector $\mathbf{n}$ (the exponent n being a positive integer, not to be confused with the eigenvector). The eigenvalues of $T^T$ and T are the same since their characteristic equations are the same.

An extremely important result is the following:

Theorem 1.3.5 (Cayley–Hamilton Theorem). A tensor T satisfies an equation having the same form as its characteristic equation, i.e.,

\[
T^3 - I_1 T^2 + I_2 T - I_3 I = 0 \quad \forall T. \tag{1.81}
\]

Proof. Multiplying Eqn. (1.55) by T, we get

\[
(\operatorname{cof} T)^T T = I_2 T - I_1 T^2 + T^3.
\]

Since by Eqn. (1.64), $(\operatorname{cof} T)^T T = (\det T) I = I_3 I$, the result follows.

By taking the trace of both sides of Eqn. (1.81), and using Eqn. (1.63), we get

\[
\det T = \frac{1}{6}\left[(\operatorname{tr} T)^3 - 3(\operatorname{tr} T)(\operatorname{tr} T^2) + 2 \operatorname{tr} T^3\right]. \tag{1.82}
\]

From the above expression and the properties of the trace operator, Eqn. (1.51) follows.

We have

\[
\lambda_i = 0,\ i = 1, 2, 3 \iff I_T = 0 \iff \operatorname{tr}(T) = \operatorname{tr}(T^2) = \operatorname{tr}(T^3) = 0. \tag{1.83}
\]

The proof is as follows. If all the invariants are zero, then from the characteristic equation given by Eqn. (1.79), it follows that all the eigenvalues are zero. If all the eigenvalues are zero, then from Eqns. (1.80a)–(1.80c), it follows that all the principal invariants $I_T$ are zero. If $\operatorname{tr}(T) = \operatorname{tr}(T^2) = \operatorname{tr}(T^3) = 0$, then again from Eqns. (1.80a)–(1.80c) it follows that the principal invariants are zero. Conversely, if all the principal invariants are zero, then all the eigenvalues are zero from which it follows that $\operatorname{tr} T^j = \sum_{i=1}^{3} \lambda_i^j$, j = 1, 2, 3, are zero.
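The Cayley–Hamilton theorem (1.81), the trace formula (1.82) for the determinant, and the eigenvalue expressions (1.80a)–(1.80c) can all be checked on a sample tensor. The sketch below is purely illustrative; the tensor T is an arbitrary choice.

```python
import numpy as np

T = np.array([[2.0, 1.0, 0.0],
              [0.5, 3.0, -1.0],
              [1.0, 0.0, 4.0]])
T2 = T @ T
T3 = T2 @ T

# Principal invariants, Eqns. (1.80a)-(1.80c).
I1 = np.trace(T)
I2 = 0.5 * (I1**2 - np.trace(T2))
I3 = np.linalg.det(T)

# Cayley-Hamilton residual, Eqn. (1.81); should vanish up to round-off.
residual = T3 - I1 * T2 + I2 * T - I3 * np.eye(3)

# Determinant from traces only, Eqn. (1.82).
det_from_traces = (I1**3 - 3.0 * I1 * np.trace(T2) + 2.0 * np.trace(T3)) / 6.0

# Eigenvalues (possibly complex) for comparison with the invariants.
lam = np.linalg.eigvals(T)
I2_from_eigs = lam[0] * lam[1] + lam[1] * lam[2] + lam[0] * lam[2]
```

Since complex eigenvalues occur in conjugate pairs, the sums and products of the eigenvalues are real up to round-off and should reproduce I1, I2 and I3.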

Consider the second-order tensor u ⊗ v. By Eqn. (1.60) or by writing u ⊗ v as u ⊗ v + 0 ⊗ 0 + 0 ⊗ 0 and using Eqn. (1.333), it follows that

cof (u ⊗ v) = 0, (1.84)

so that the second invariant, which is the trace of the above tensor, is zero. Similarly, on using Eqn. (1.53), we get the third invariant as zero. The first invariant is given by u · v. Thus, from the characteristic equation, it follows that the eigenvalues of u ⊗ v are (0, 0, u · v). If u and v are perpendicular, u ⊗ v is an example of a nonzero tensor all of whose eigenvalues are zero.

1.4 Skew-Symmetric Tensors

Let W ∈ Skw and let u, v ∈ V. Then

\[
(u, Wv) = (W^T u, v) = -(Wu, v) = -(v, Wu). \tag{1.85}
\]

On setting v = u, we get

(u, Wu) = −(u, Wu),

which implies that

(u, Wu) = 0. (1.86)

Thus, Wu is always orthogonal to u for any arbitrary vector u. By choosing $u = e_i$ and $v = e_j$, we see from the above results that any skew-symmetric tensor W has only three independent components (in each coordinate frame), which suggests that it might be replaced by a vector. This observation leads us to the following result:

Theorem 1.4.1. Given any skew-symmetric tensor W, there exists a unique vector w, known as the axial vector or dual vector, corresponding to W such that

Wu = w × u ∀u ∈ V. (1.87)

Conversely, given any vector w, there exists a unique skew-symmetric second-order tensor W such that Eqn. (1.87) holds.

Proof. A skew-symmetric tensor W, like any other tensor, has at least one real eigenvalue, say λ. Let p be the associated eigenvector. Then Wp = λp, and on taking the dot product of both sides with p, we get λ = p · (Wp). By virtue of Eqn. (1.86), λ = 0 (we show later in this section that this is the only real eigenvalue). Let q, r be unit vectors that form an orthonormal basis with p. We have

p = q × r, q = r × p, r = p × q. (1.88)





Since Wp = λp = 0, we have (q, Wp) = (r, Wp) = 0, and by Eqn. (1.85), we also have (p, Wq) = (p, Wr) = 0. Thus, using $W = W_{ij}\, e_i \otimes e_j$ referred to the basis {p, q, r}, we get

\[
W = \gamma (r \otimes q - q \otimes r),
\]

where γ = r · (Wq) must be nonzero if W ≠ 0. Let w = γp, and let u be an arbitrary vector. Then

\[
\begin{aligned}
Wu - w \times u &= \gamma \left\{ (r \otimes q)u - (q \otimes r)u - p \times \left[(u \cdot p)p + (u \cdot q)q + (u \cdot r)r\right] \right\} \\
&= \gamma \left[(u \cdot q)(r - p \times q) - (u \cdot r)(q + p \times r)\right] \\
&= 0,
\end{aligned}
\]

where the last step follows from Eqn. (1.88). Thus, associated with every skew-symmetric tensor W, there is a vector w such that Eqn. (1.87) holds.

Conversely, given w, let q and r be unit vectors such that w/|w|, q and r form an orthonormal basis. Then

\[
W = |w| (r \otimes q - q \otimes r), \tag{1.89}
\]

is a skew-symmetric tensor with w as the axial vector of W, a fact that can be checked easily by verifying that Wu = w × u for any arbitrary vector u.

Given W, uniqueness of the axial vector w can be shown by assuming the existence of two vectors $w_1$ and $w_2$. Now we have $Wu = w_1 \times u$ and $Wu = w_2 \times u$ for all vectors u. Hence,

\[
(w_1 - w_2) \times u = 0 \quad \forall u \in V.
\]

If $w_1 - w_2$ is a nonzero vector, then we can choose u to be perpendicular to $w_1 - w_2$, so that $(w_1 - w_2) \times u$ is nonzero. But this leads to a contradiction with the above property. Hence $w_1 = w_2$, and the uniqueness of the axial vector is established.

Similarly, given w, uniqueness of the corresponding W can be established by assuming the existence of two tensors $W_1$ and $W_2$, such that $W_1 u = w \times u$ and $W_2 u = w \times u$ for all vectors u. Now we have

\[
(W_1 - W_2) u = 0 \quad \forall u \in V,
\]

which implies that $W_1 = W_2$.

Note that Wu = 0 if and only if u = αw, α ∈ ℜ, since if u = αw then Wu = αw × w = 0, while conversely, if Wu = w × u = 0, then u is a scalar multiple of w by Theorem 1.2.1. This result justifies the use of the terminology ‘axial vector’ used for w. Also note that by virtue of the uniqueness of w, the set of vectors αw, α ∈ ℜ, is a one-dimensional subspace of V. Similarly, $W^2 u = 0$ if and only if u = αw, α ∈ ℜ, since if $W^2 u = 0$, by the above result, Wu = w × u = βw, which is only possible when u = αw and β = 0.





By choosing $u = e_j$ and taking the dot product of both sides with $e_i$, Eqn. (1.87) can be expressed in component form as

\[
\begin{aligned}
W_{ij} &= -\varepsilon_{ijk} w_k, \\
w_i &= -\frac{1}{2}\varepsilon_{ijk} W_{jk}.
\end{aligned} \tag{1.90}
\]

More explicitly, if $w = (w_1, w_2, w_3)$, then

\[
W = \begin{bmatrix} 0 & -w_3 & w_2 \\ w_3 & 0 & -w_1 \\ -w_2 & w_1 & 0 \end{bmatrix}.
\]

From Eqns. (1.89) or (1.90), it follows that $W : W = -\operatorname{tr}(W^2) = 2 w \cdot w$. Since $\operatorname{tr} W = \operatorname{tr} W^T = -\operatorname{tr} W$ and $\det W = \det W^T = -\det W$, we have tr W = det W = 0. The second invariant is given by $I_2 = [(\operatorname{tr} W)^2 - \operatorname{tr}(W^2)]/2 = (W : W)/2 = w \cdot w$. Thus, from the characteristic equation, we get the eigenvalues of W as (0, i|w|, −i|w|), and the ‘spectral resolution’ as

\[
W = i|w| (n \otimes \bar{n} - \bar{n} \otimes n),
\]

where n is the eigenvector corresponding to i|w|, and $\bar{n}$ is its complex conjugate. Note that we have used the convention outlined at the end of Section 1.3.1 in writing the above result.
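The correspondence between a skew-symmetric tensor and its axial vector, Eqns. (1.87) and (1.90), is straightforward to code. The sketch below is illustrative only: the helper names `skew` and `axial` and the sample vectors are assumptions made for the example.

```python
import numpy as np

def skew(w):
    """Skew-symmetric tensor W with axial vector w, so that W u = w x u."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def axial(W):
    """Axial vector from Eqn. (1.90): w_i = -(1/2) eps_ijk W_jk."""
    return 0.5 * np.array([W[2, 1] - W[1, 2],
                           W[0, 2] - W[2, 0],
                           W[1, 0] - W[0, 1]])

w = np.array([1.0, -2.0, 2.0])      # |w| = 3
u = np.array([0.5, 1.0, -1.5])
W = skew(w)

Wu = W @ u                           # should equal w x u
eigenvalues = np.linalg.eigvals(W)   # should be 0 and +/- i|w|
```

For this w the eigenvalues should come out as 0 and ±3i, and W applied to w itself should give the zero vector, in line with the discussion of the axial vector above.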

1.5 Orthogonal Tensors

A second-order tensor Q is said to be orthogonal if $Q^T = Q^{-1}$, or, alternatively by Eqn. (1.67), if

\[
Q^T Q = Q Q^T = I, \tag{1.91}
\]

where I is the identity tensor.

Theorem 1.5.1. A tensor Q is orthogonal if and only if it has any of the following properties of preserving inner products, lengths and distances:

\[
\begin{aligned}
(Qu, Qv) &= (u, v) \quad \forall u, v \in V, && (1.92a) \\
|Qu| &= |u| \quad \forall u \in V, && (1.92b) \\
|Qu - Qv| &= |u - v| \quad \forall u, v \in V. && (1.92c)
\end{aligned}
\]

Proof. Assuming that Q is orthogonal, Eqn. (1.92a) follows since

\[
(Qu, Qv) = (Q^T Q u, v) = (Iu, v) = (u, v) \quad \forall u, v \in V.
\]





Conversely, if Eqn. (1.92a) holds, then

\[
0 = (Qu, Qv) - (u, v) = (u, Q^T Q v) - (u, v) = (u, (Q^T Q - I)v) \quad \forall u, v \in V,
\]

which implies that $Q^T Q = I$ (by Eqn. (1.24)), and hence Q is orthogonal.

By choosing v = u in Eqn. (1.92a), we get Eqn. (1.92b). Conversely, if Eqn. (1.92b) holds, i.e., if (Qu, Qu) = (u, u) for all u ∈ V, then

\[
((Q^T Q - I)u, u) = 0 \quad \forall u \in V,
\]

which, by virtue of Problem 29, leads us to the conclusion that Q ∈ Orth.

By replacing u by (u − v) in Eqn. (1.92b), we obtain Eqn. (1.92c), and, conversely, by setting v to zero in Eqn. (1.92c), we get Eqn. (1.92b).

As a corollary of the above results, it follows that the ‘angle’ between two vectors u and v, defined by $\theta := \cos^{-1}\left[(u \cdot v)/(|u||v|)\right]$, is also preserved. Thus, physically speaking, multiplying the position vectors of all points in a domain by Q corresponds to rigid body rotation of the domain about the origin.

From Eqns. (1.49), (1.51) and (1.91), we have det Q = ±1. Orthogonal tensors with determinant +1 are said to be proper orthogonal or rotations (henceforth, this set is denoted by Orth+). Since $\det(-R) = (-1)^3 \det R = -1$ for R ∈ Orth+, −R is an orthogonal tensor that is not a rotation. On the other hand, if Q is an orthogonal tensor that is not a rotation, −Q is a rotation. Thus, when the underlying vector space dimension is odd (which includes n = 3), every orthogonal tensor is either a rotation, or the product of a rotation with −I:

\[
\text{Orth} = \{\pm R;\ R \in \text{Orth}^+\}.
\]

This result is not true when the dimension n is even, as the example $Q = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$, which cannot be written as −R, where R ∈ Orth+, shows. For Q ∈ Orth+, we have

\[
\operatorname{cof} Q = (\det Q)\, Q^{-T} = Q, \tag{1.93}
\]

so that by Eqn. (1.54),

\[
Q(u \times v) = (Qu) \times (Qv) \quad \forall u, v \in V. \tag{1.94}
\]

A characterization of a rotation is as follows:

Theorem 1.5.2. Let $\{e_1, e_2, e_3\}$ and $\{e_1^*, e_2^*, e_3^*\}$ be two orthonormal bases. Then

\[
Q = e_1 \otimes e_1^* + e_2 \otimes e_2^* + e_3 \otimes e_3^*,
\]

is a proper orthogonal tensor. Conversely, every Q ∈ Orth+ can be represented in the above manner.

Proof. If $\{e_1, e_2, e_3\}$ and $\{e_1^*, e_2^*, e_3^*\}$ are two orthonormal bases, then

\[
\begin{aligned}
Q Q^T &= \left[e_1 \otimes e_1^* + e_2 \otimes e_2^* + e_3 \otimes e_3^*\right]\left[e_1^* \otimes e_1 + e_2^* \otimes e_2 + e_3^* \otimes e_3\right] \\
&= \left[e_1 \otimes e_1 + e_2 \otimes e_2 + e_3 \otimes e_3\right] && \text{(by Eqn. (1.341))} \\
&= I, && \text{(by Eqn. (1.38))}
\end{aligned}
\]

and, by Eqn. (1.53),

\[
\det Q = [e_1, e_2, e_3]\,[e_1^*, e_2^*, e_3^*] = 1.
\]

Conversely, if Q ∈ Orth+, and $\{e_1^*, e_2^*, e_3^*\}$ is an orthonormal basis, then

\[
\begin{aligned}
Q &= Q I \\
&= Q\left[e_1^* \otimes e_1^* + e_2^* \otimes e_2^* + e_3^* \otimes e_3^*\right] && \text{(by Eqn. (1.38))} \\
&= (Q e_1^*) \otimes e_1^* + (Q e_2^*) \otimes e_2^* + (Q e_3^*) \otimes e_3^* && \text{(by Eqn. (1.343))} \\
&= e_1 \otimes e_1^* + e_2 \otimes e_2^* + e_3 \otimes e_3^*,
\end{aligned}
\]

where $\{e_1, e_2, e_3\} = \{Q e_1^*, Q e_2^*, Q e_3^*\}$ is an orthonormal basis since $|e_i| = |Q e_i^*| = |e_i^*| = 1$ for each i, and since, by virtue of Eqn. (1.94),

\[
e_j \times e_k = (Q e_j^*) \times (Q e_k^*) = Q(e_j^* \times e_k^*) = \varepsilon_{ijk}\, Q e_i^* = \varepsilon_{ijk}\, e_i.
\]

As a result of the above characterization, we have the following theorem.

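Theorem 1.5.3. Every T ∈ Lin can be expressed as a linear combination of proper orthogonal tensors, i.e.,

\[
\text{Lin} = \operatorname{span}\{\text{Orth}^+\}.
\]

Proof. Let $\{e_1, e_2, e_3\}$ denote the canonical Cartesian basis vectors. Then

\[
\begin{aligned}
e_1 \otimes e_1 &= \frac{1}{2}\left[I + (e_1 \otimes e_1 - e_2 \otimes e_2 - e_3 \otimes e_3)\right], \\
e_1 \otimes e_2 &= \frac{1}{2}\left[(e_1 \otimes e_2 + e_2 \otimes e_3 + e_3 \otimes e_1) + (e_1 \otimes e_2 - e_2 \otimes e_3 - e_3 \otimes e_1)\right],
\end{aligned}
\]

with similar expressions for the remaining combinations of dyadic products. Thus, the tensor $T = T_{ij}\, e_i \otimes e_j$ can be expressed as $\sum_i \phi_i Q_i$, where $Q_i$ ∈ Orth+.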
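The two decompositions used in the proof of Theorem 1.5.3 can be verified by checking that each bracketed tensor is a rotation and that the half-sums reproduce the dyads. This is only an illustrative sketch; the helper names `dyad` and `is_rotation` are hypothetical.

```python
import numpy as np

e = np.eye(3)  # rows are the canonical basis vectors e_1, e_2, e_3

def dyad(i, j):
    """Dyadic product e_i (x) e_j (zero-based indices)."""
    return np.outer(e[i], e[j])

def is_rotation(Q):
    """True if Q^T Q = I and det Q = +1, i.e., Q is proper orthogonal."""
    return bool(np.allclose(Q.T @ Q, np.eye(3)) and np.isclose(np.linalg.det(Q), 1.0))

# Tensors appearing in the proof.
Q_diag = dyad(0, 0) - dyad(1, 1) - dyad(2, 2)
Q_plus = dyad(0, 1) + dyad(1, 2) + dyad(2, 0)
Q_minus = dyad(0, 1) - dyad(1, 2) - dyad(2, 0)

e1_e1 = 0.5 * (np.eye(3) + Q_diag)   # should equal e_1 (x) e_1
e1_e2 = 0.5 * (Q_plus + Q_minus)     # should equal e_1 (x) e_2
```

The identity tensor is itself a rotation, so both half-sums are linear combinations of elements of Orth+.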

If $e_i$ and $\bar{e}_i$ are two sets of orthonormal basis vectors, then they are related as

\[
\bar{e}_i = Q^T e_i, \quad i = 1, 2, 3, \tag{1.95}
\]

where $Q = e_k \otimes \bar{e}_k$ is a proper orthogonal tensor by virtue of Theorem 1.5.2. The components of Q with respect to the $e_i$ basis are given by $Q_{ij} = e_i \cdot (e_k \otimes \bar{e}_k) e_j = \delta_{ik}\, \bar{e}_k \cdot e_j = \bar{e}_i \cdot e_j$. Thus, if e and $\bar{e}$ are two unit vectors, we can always find Q ∈ Orth+ (not necessarily unique), which rotates $\bar{e}$ to e, i.e., $e = Q\bar{e}$. Let u and v be two vectors. Since u/|u| and v/|v| are unit vectors, there exists Q ∈ Orth+ such that

\[
\frac{v}{|v|} = Q\left(\frac{u}{|u|}\right).
\]

Thus, if u and v have the same magnitude, i.e., if |u| = |v|, then there exists Q ∈ Orth+ such that v = Qu.

We now study the transformation laws for the components of tensors under an orthogonal transformation of the basis vectors. Let $e_i$ and $\bar{e}_i$ represent the original and new orthonormal basis vectors, and let Q be the proper orthogonal tensor in Eqn. (1.95). From Eqn. (1.9), we have

\[
\begin{aligned}
\bar{e}_i &= (\bar{e}_i \cdot e_j)\, e_j = Q_{ij}\, e_j, && (1.96a) \\
e_i &= (e_i \cdot \bar{e}_j)\, \bar{e}_j = Q_{ji}\, \bar{e}_j. && (1.96b)
\end{aligned}
\]

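Using Eqn. (1.8) and Eqn. (1.96a), we get the transformation law for the components of a vector as

\[
\bar{v}_i = v \cdot \bar{e}_i = v \cdot (Q_{ij}\, e_j) = Q_{ij}\, v \cdot e_j = Q_{ij} v_j. \tag{1.97}
\]

In a similar fashion, using Eqn. (1.28), Eqn. (1.96a), and the fact that a tensor is a linear transformation, we get the transformation law for the components of a second-order tensor as

\[
\bar{T}_{ij} = \bar{e}_i \cdot T \bar{e}_j = Q_{im} Q_{jn} T_{mn}. \tag{1.98}
\]

Conversely, if the components of a matrix transform according to Eqn. (1.98), then they all generate the same tensor. To see this, let $\bar{T} = \bar{T}_{ij}\, \bar{e}_i \otimes \bar{e}_j$ and $T = T_{mn}\, e_m \otimes e_n$. Then

\[
\begin{aligned}
\bar{T} &= \bar{T}_{ij}\, \bar{e}_i \otimes \bar{e}_j \\
&= Q_{im} Q_{jn} T_{mn}\, \bar{e}_i \otimes \bar{e}_j \\
&= T_{mn} (Q_{im}\, \bar{e}_i) \otimes (Q_{jn}\, \bar{e}_j) \\
&= T_{mn}\, e_m \otimes e_n && \text{(by Eqn. (1.96b))} \\
&= T.
\end{aligned}
\]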

Due to this property, many authors follow an alternate viewpoint and take the transformation law given by Eqn. (1.98) as the definition of second-order tensors.

We can write Eqns. (1.97) and (1.98) as

\[
[\bar{v}] = Q[v], \tag{1.99}
\]

\[
[\bar{T}] = Q[T]Q^T, \tag{1.100}
\]

where $[\bar{v}]$ and $[\bar{T}]$ represent the components of the vector v and tensor T, respectively, with respect to the $\bar{e}_i$ coordinate system. Using the orthogonality property of Q, we can write the reverse transformations as

\[
[v] = Q^T[\bar{v}], \tag{1.101}
\]

\[
[T] = Q^T[\bar{T}]Q. \tag{1.102}
\]

Fig. 1.1 Example of a coordinate system obtained from an existing one by a rotation about the 3-axis. (The figure shows the axes $e_1$, $e_2$ and the rotated axes $\bar{e}_1$, $\bar{e}_2$, with θ the angle of rotation.)

As an example, the Q matrix for the configuration shown in Fig. 1.1 is

\[
Q = \begin{bmatrix} \bar{e}_1 \\ \bar{e}_2 \\ \bar{e}_3 \end{bmatrix}
= \begin{bmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}.
\]
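The transformation laws (1.97)–(1.102) can be exercised with the rotation about the 3-axis shown above. In the following sketch the angle, the vector and tensor components, and the function name `rotation_about_3` are arbitrary illustrative assumptions; the rows of Q hold the components of the new basis vectors in the old basis.

```python
import numpy as np

def rotation_about_3(theta):
    """Matrix Q_ij = (new basis vector i) . (old basis vector j) for Fig. 1.1."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, s, 0.0],
                     [-s, c, 0.0],
                     [0.0, 0.0, 1.0]])

Q = rotation_about_3(np.pi / 6)

v = np.array([1.0, 2.0, 3.0])            # components in the original basis
T = np.array([[2.0, 1.0, 0.0],
              [0.5, 3.0, -1.0],
              [1.0, 0.0, 4.0]])

v_bar = Q @ v                             # Eqn. (1.99)
T_bar = Q @ T @ Q.T                       # Eqn. (1.100)

# Indicial form of Eqn. (1.98): T_bar_ij = Q_im Q_jn T_mn
T_bar_indicial = np.einsum('im,jn,mn->ij', Q, Q, T)

# Reverse transformations, Eqns. (1.101) and (1.102).
v_back = Q.T @ v_bar
T_back = Q.T @ T_bar @ Q
```

Although the components change, the length of v and the trace and determinant of T should be the same in both coordinate systems, reflecting the coordinate-independent nature of the vector and tensor themselves.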

The only real eigenvalues of Q ∈ Orth can be either +1 or −1, since if λ and n denote the eigenvalue and eigenvector of Q, i.e., Qn = λn, then

\[
(n, n) = (Qn, Qn) = \lambda^2 (n, n),
\]

which implies that $(n, n)(\lambda^2 - 1) = 0$. If λ and n are real, then (n, n) ≠ 0 and λ = ±1, while if λ is complex, then (n, n) = 0. Let $\bar{\lambda}$ and $\bar{n}$ denote the complex conjugates of λ and n, respectively. To see that the complex eigenvalues have a magnitude of unity observe that $\lambda\bar{\lambda}(n, \bar{n}) = (Qn, Q\bar{n}) = (n, \bar{n})$, which implies that $\lambda\bar{\lambda} = 1$ since $(n, \bar{n}) \ne 0$. Now, using a similar argument as used in showing (n, n) = 0, we also have $(n_1, n_2) = 0$ for $n_1$ and $n_2$ corresponding to distinct complex eigenvalues of Q (complex conjugates not being considered as ‘distinct’; thus, this result is relevant only when the dimension n > 3), and $(n, e) = (\bar{n}, e) = 0$, where e is the eigenvector of Q corresponding to the eigenvalue of +1 or −1.

If R ≠ I is a rotation, then the set of all vectors e such that

\[
Re = e \tag{1.103}
\]

forms a one-dimensional subspace of V called the axis of R. To prove that such a vector always exists, we first show that +1 is always an eigenvalue of R, and −1 is always an eigenvalue of an improper rotation. Since det R = 1,

\[
\begin{aligned}
\det(R - I) &= \det(R - RR^T) = (\det R)\det(I - R^T) = \det(I - R^T)^T \\
&= \det(I - R) = -\det(R - I),
\end{aligned}
\]

which implies that det(R − I) = 0, or that +1 is an eigenvalue. If e is the eigenvector corresponding to the eigenvalue +1, then Re = e. Since every improper rotation is a product of −I times a proper orthogonal tensor, −1 is an eigenvalue of an improper orthogonal tensor, with the same eigenvector as the proper orthogonal tensor from which it is generated.





To show that the set of vectors $e$ satisfying Eqn. (1.103) is a one-dimensional subspace of $V$, premultiply both sides of this equation with $R^T$ to get $R^Te = e$. Combining this relation with Eqn. (1.103), we get $(R - R^T)e = 0$. There are two possibilities: (i) $R$ is unsymmetric, or (ii) $R$ is symmetric. If $R$ is unsymmetric, then $R - R^T$ is a nonzero skew-symmetric tensor with $e$ a scalar multiple of the axial vector of $R - R^T$. Since we have shown that the scalar multiples of the axial vector of a skew-symmetric tensor constitute a one-dimensional subspace, $\{\alpha e,\ \alpha \in \Re\}$ is a one-dimensional subspace of $V$. Now consider the other possibility where $R$ is symmetric. By Theorem 1.6.1 all the eigenvalues of $R$ are real, and as shown above the only real eigenvalues can be $\pm 1$. Taking into account that $\det R = 1$, $R \neq I$ and that one of the real eigenvalues is $+1$, the set of eigenvalues in this case has to be $\{+1, -1, -1\}$. Since the eigenvalue $+1$ is distinct from the other two, again invoking Theorem 1.6.1, it follows that the corresponding eigenvector $e$ is unique. Using Eqn. (1.123), it also follows that $R \in \mathrm{Orth}^+ \cap \mathrm{Sym} - \{I\}$ has to be of the form $2e \otimes e - I$, where $e$ is a unit vector.

Conversely, given a vector $w$, there exists a proper orthogonal tensor $R$, and an improper one $R'$, such that $Rw = w$ and $R'w = -w$. To see this, consider the family of tensors

\[ R(w, \alpha) = I + \frac{1}{|w|}\sin\alpha\, W + \frac{1}{|w|^2}(1 - \cos\alpha)\, W^2, \tag{1.104} \]

where $W$ is the skew-symmetric tensor with $w$ as its axial vector, i.e., $Ww = 0$. Using the Cayley–Hamilton theorem, we have $W^3 = -|w|^2 W$, from which it follows that $W^4 = -|w|^2 W^2$. Using this result, we get

\[
R^TR = \left[ I - \frac{\sin\alpha}{|w|} W + \frac{(1 - \cos\alpha)}{|w|^2} W^2 \right]
\left[ I + \frac{\sin\alpha}{|w|} W + \frac{(1 - \cos\alpha)}{|w|^2} W^2 \right] = I.
\]

Since $R$ has now been shown to be orthogonal, $\det R = \pm 1$. However, since $\det[R(w, 0)] = \det I = 1$, by continuity, we have $\det[R(w, \alpha)] = 1$ for any $\alpha$. Moreover, since $Ww = 0$, Eqn. (1.104) immediately gives $Rw = w$. Thus, $R$ is a proper orthogonal tensor that satisfies $Rw = w$. Taking $R' = -R$, we get the required improper orthogonal tensor. Essentially, $R$ rotates any vector in the plane perpendicular to $w$ through an angle $\alpha$. By Eqn. (1.89), $W^2 = w \otimes w - |w|^2 I$. Thus, from Eqn. (1.104), we get $R(w, 0) = I$ and $R(w, \pi) = 2e \otimes e - I$, where $e = w/|w|$.
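The properties of the family $R(w, \alpha)$ derived above (orthogonality, unit determinant, $Rw = w$) can be verified numerically. The sketch below is a plain-Python illustration under the assumption that the skew tensor acts as $Wu = w \times u$; the function names and the sample values of $w$ and $\alpha$ are hypothetical.

```python
import math

def matmul(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def skew(w):
    """Skew tensor W with axial vector w (assumed convention: W u = w x u)."""
    return [[0.0, -w[2], w[1]],
            [w[2], 0.0, -w[0]],
            [-w[1], w[0], 0.0]]

def rotation(w, alpha):
    """R(w, alpha) = I + sin(alpha)/|w| W + (1 - cos(alpha))/|w|^2 W^2."""
    norm = math.sqrt(sum(c * c for c in w))
    W = skew(w)
    W2 = matmul(W, W)
    a = math.sin(alpha) / norm
    b = (1.0 - math.cos(alpha)) / norm ** 2
    return [[(1.0 if i == j else 0.0) + a * W[i][j] + b * W2[i][j]
             for j in range(3)] for i in range(3)]

def det3(A):
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

w = [1.0, 2.0, 2.0]      # sample axis vector, |w| = 3
alpha = 0.7              # sample rotation angle in radians
R = rotation(w, alpha)
Rt = [list(row) for row in zip(*R)]
RtR = matmul(Rt, R)                                   # expected: identity
Rw = [sum(R[i][j] * w[j] for j in range(3)) for i in range(3)]  # expected: w
trace_R = R[0][0] + R[1][1] + R[2][2]                 # expected: 1 + 2 cos(alpha)
```

The trace check anticipates the relation $\operatorname{tr} R = 1 + 2\cos\alpha$ used later in this section.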

Using this result, we have the following representation theorem for proper orthogonal tensors.

Theorem 1.5.4. Let $W$ be a skew-symmetric tensor with an axial vector of unit magnitude, and let $e$, $q$, $r$ be an orthonormal set of vectors. Then tensors of the form

\[ R = I + \sin\alpha\, W + (1 - \cos\alpha) W^2, \tag{1.105a} \]

\[ R = e \otimes e + \cos\alpha\,(I - e \otimes e) + \sin\alpha\,(r \otimes q - q \otimes r) \tag{1.105b} \]

are proper orthogonal. Conversely, for every proper orthogonal tensor there exist $W \in \mathrm{Skw}$ and orthonormal vectors $e$, $q$, $r$ such that the tensor can be expressed in the above forms.





Proof. The tensor $R$ in Eqn. (1.105a) is just a special case of the tensor $R$ in Eqn. (1.104), which we have already shown to be proper orthogonal. Let $e$, $q$, $r$ be an orthonormal set of vectors. Choose $W = r \otimes q - q \otimes r$, so that $W^2 = e \otimes e - I$. Substituting for $W$ and $W^2$ into Eqn. (1.105a), we get the representation in Eqn. (1.105b). Note that since $Re = e$, the vector $e$ is the axis of $R$.

Conversely, we now show that every proper orthogonal tensor can be represented as in Eqns. (1.105a) and (1.105b). Let $e$ be the axis of $R$, and let $q$, $r$ be such that $e$, $q$, $r$ form an orthonormal basis. Using the fact that $Re = R^Te = e$, we have $(Re, q) = (Re, r) = (e, Rq) = (e, Rr) = 0$. Thus, $Rq$ and $Rr$ lie in the plane perpendicular to $e$. If $(Rq, q) = (Rr, r) = \cos\alpha$, and correspondingly $(Rq, r) = -(Rr, q) = \sin\alpha$, then

\[
\begin{aligned}
R &= RI \\
&= R(e \otimes e + q \otimes q + r \otimes r) \\
&= (Re) \otimes e + (Rq) \otimes q + (Rr) \otimes r \\
&= e \otimes e + [(Rq, e)e + (Rq, q)q + (Rq, r)r] \otimes q + [(Rr, e)e + (Rr, q)q + (Rr, r)r] \otimes r \\
&= e \otimes e + \cos\alpha\,(q \otimes q + r \otimes r) + \sin\alpha\,(r \otimes q - q \otimes r) \\
&= e \otimes e + \cos\alpha\,(I - e \otimes e) + \sin\alpha\,(r \otimes q - q \otimes r),
\end{aligned}
\]

which is Eqn. (1.105b). By taking $W = r \otimes q - q \otimes r$, we have $W^2 = e \otimes e - I$, from which we get Eqn. (1.105a).

Using the results for the eigenvalues/eigenvectors of $R$, it is also possible to write the following "spectral resolutions" (using the convention outlined at the end of Section 1.3.1):

I = e⊗ e + n⊗ n + n⊗ n,

R = e⊗ e + λn⊗ n + λn⊗ n, (1.106)

R2 = e⊗ e + λ2n⊗ n + λ2n⊗ n,

where $\lambda = \bar{\lambda} = -1$ in case all eigenvalues of $R \neq I$ are real (so that $R = 2e \otimes e - I$), and $n \cdot n = \bar{n} \cdot \bar{n} = e \cdot n = e \cdot \bar{n} = 0$ and $n \cdot \bar{n} = 1$ in case $\lambda$ is complex.

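Note from Eqn. (1.105b) that $\operatorname{tr} R = 1 + 2\cos\alpha$. It is easily seen by multiplying the expression

\[ \phi_1\, e \otimes e + \phi_2 (q \otimes q + r \otimes r) + \phi_3 (r \otimes q - q \otimes r) = 0 \]

successively by $e$, $q$ and $r$ that $\phi_1 = \phi_2 = \phi_3 = 0$, which implies that $\{e \otimes e,\ q \otimes q + r \otimes r,\ r \otimes q - q \otimes r\}$ are linearly independent. Using the representation given by Eqn. (1.105b), we have

\[
\begin{bmatrix} I \\ R \\ R^2 \end{bmatrix} =
\begin{bmatrix} 1 & 1 & 0 \\ 1 & \cos\alpha & \sin\alpha \\ 1 & \cos 2\alpha & \sin 2\alpha \end{bmatrix}
\begin{bmatrix} e \otimes e \\ q \otimes q + r \otimes r \\ r \otimes q - q \otimes r \end{bmatrix}.
\]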

The determinant of the square matrix in the above equation is $2\sin\alpha(1 - \cos\alpha)$. Thus, from Theorem 1.1.2 it follows that $\{I, R, R^2\}$ are linearly independent for all $\alpha$ except when (i) $\alpha = 0$, in which case $R = R^2 = I$, and (ii) $\alpha = \pi$, in which case $R = 2e \otimes e - I \in \mathrm{Sym}$ and $R^2 = I$ (in this case $\{I, R\}$ are linearly independent).

Using the representation in Eqn. (1.104), let

\[ R_1(w_1, \alpha) = I + \frac{1}{|w_1|}\sin\alpha\, W_1 + \frac{1}{|w_1|^2}(1 - \cos\alpha)\, W_1^2, \]

\[ R_2(w_2, \phi) = I + \frac{1}{|w_2|}\sin\phi\, W_2 + \frac{1}{|w_2|^2}(1 - \cos\phi)\, W_2^2, \]

be two proper orthogonal tensors. If the axes of these two tensors coincide, i.e., if $e_1 = w_1/|w_1| = w_2/|w_2| = e_2$, then we have $W_1/|w_1| = W_2/|w_2|$, and hence $R_1$ and $R_2$ commute, i.e., $R_1R_2 = R_2R_1$. However, the converse is not true since, for example, $R_1 = \mathrm{diag}[1, -1, -1]$ and $R_2 = \mathrm{diag}[-1, 1, -1]$ commute, but do not have the same axes. If $e_1$ and $e_2$ are the axes of $R_1$ and $R_2$, respectively, then we assume the (unnormalized) axis of $R_2R_1$ to be of the form $w = c_1 e_1 + c_2 e_2 + c_3\, e_1 \times e_2$ (since $\{e_1, e_2, e_1 \times e_2\}$ constitute a basis when $e_1$ and $e_2$ are linearly independent) and find the coefficients $(c_1, c_2, c_3)$ using the condition $R_2R_1w = w$. Using this procedure, we obtain the axis of $R_2R_1$ as $e = w/|w|$, where

\[ w = (1 - \cos\alpha)(1 + \cos\phi)\, e_1 + \sin\alpha \sin\phi\, e_2 - (1 - \cos\alpha)\sin\phi\, e_1 \times e_2. \tag{1.107} \]
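Equation (1.107) can be spot-checked by composing two rotations numerically and confirming that $R_2R_1w = w$. The following plain-Python sketch assumes the convention $Wu = e \times u$ for the skew tensor of a unit axis $e$; the axes, angles and function names are illustrative assumptions.

```python
import math

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rotation(e, angle):
    """Rotation about the unit vector e: Eqn. (1.104) with |w| = 1."""
    W = [[0.0, -e[2], e[1]], [e[2], 0.0, -e[0]], [-e[1], e[0], 0.0]]
    W2 = matmul(W, W)
    s, c = math.sin(angle), 1.0 - math.cos(angle)
    return [[(1.0 if i == j else 0.0) + s * W[i][j] + c * W2[i][j]
             for j in range(3)] for i in range(3)]

e1 = [1.0, 0.0, 0.0]      # axis of R1 (sample unit vector)
e2 = [0.6, 0.8, 0.0]      # axis of R2 (sample unit vector, not parallel to e1)
alpha, phi = 0.9, 1.7     # sample rotation angles of R1 and R2

R21 = matmul(rotation(e2, phi), rotation(e1, alpha))
n = cross(e1, e2)
ca, sa = math.cos(alpha), math.sin(alpha)
cp, sp = math.cos(phi), math.sin(phi)

# Unnormalized axis of R2 R1 from Eqn. (1.107).
w = [(1 - ca) * (1 + cp) * e1[i] + sa * sp * e2[i] - (1 - ca) * sp * n[i]
     for i in range(3)]

# Residual of R2 R1 w = w; it should vanish up to round-off.
residual = [sum(R21[i][j] * w[j] for j in range(3)) - w[i] for i in range(3)]
```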

In the representation given by Eqn. (1.104), $w$ is a fixed vector; however, in problems of rigid-body dynamics, we shall need to consider the case when the angular velocity vector is a function of time and, in general, does not coincide with the axis of $Q$. In such a situation, it is helpful to express $Q \in \mathrm{Orth}^+$ in terms of generalized coordinates. Problem 26 shows that 3 parameters are required to characterize an orthogonal tensor in three-dimensional space, and presents one such characterization. However, the most commonly used characterization is the one using the Euler angles $\theta$, $\phi$ and $\psi$. The representation of $Q$ in terms of these angles is

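\[
\begin{aligned}
Q &= \begin{bmatrix} \cos\psi & -\sin\psi & 0 \\ \sin\psi & \cos\psi & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta \\ 0 & \sin\theta & \cos\theta \end{bmatrix}
\begin{bmatrix} \cos\phi & -\sin\phi & 0 \\ \sin\phi & \cos\phi & 0 \\ 0 & 0 & 1 \end{bmatrix} \\
&= \begin{bmatrix}
\cos\phi\cos\psi - \cos\theta\sin\psi\sin\phi & -\sin\phi\cos\psi - \cos\theta\sin\psi\cos\phi & \sin\theta\sin\psi \\
\cos\phi\sin\psi + \cos\theta\cos\psi\sin\phi & -\sin\phi\sin\psi + \cos\theta\cos\psi\cos\phi & -\sin\theta\cos\psi \\
\sin\phi\sin\theta & \cos\phi\sin\theta & \cos\theta
\end{bmatrix}.
\end{aligned} \tag{1.108}
\]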

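The axis of $Q$ is $[0, 0, 1]$ when $\theta = 0$, and

\[
e = \begin{bmatrix} (1 - \cos\theta)(\sin\phi + \sin\psi) \\ (1 - \cos\theta)(\cos\phi - \cos\psi) \\ \sin\theta\,[1 - \cos(\phi + \psi)] \end{bmatrix}
\quad (\theta \neq 0). \tag{1.109}
\]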
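As a numerical check of Eqns. (1.108) and (1.109), the sketch below builds $Q$ as the product of the three elementary rotations and confirms that the vector in Eqn. (1.109), treated here as unnormalized, is left unchanged by $Q$. It is a plain-Python illustration; the function names and the sample angles are assumptions made for this example.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def euler_rotation(theta, phi, psi):
    """Q as the product of the three elementary rotations in Eqn. (1.108)."""
    def about_3(a):
        return [[math.cos(a), -math.sin(a), 0.0],
                [math.sin(a), math.cos(a), 0.0],
                [0.0, 0.0, 1.0]]
    about_1 = [[1.0, 0.0, 0.0],
               [0.0, math.cos(theta), -math.sin(theta)],
               [0.0, math.sin(theta), math.cos(theta)]]
    return matmul(matmul(about_3(psi), about_1), about_3(phi))

def euler_axis(theta, phi, psi):
    """Unnormalized axis from Eqn. (1.109); requires theta != 0."""
    return [(1.0 - math.cos(theta)) * (math.sin(phi) + math.sin(psi)),
            (1.0 - math.cos(theta)) * (math.cos(phi) - math.cos(psi)),
            math.sin(theta) * (1.0 - math.cos(phi + psi))]

theta, phi, psi = 0.8, 0.3, 1.1          # sample Euler angles in radians
Q = euler_rotation(theta, phi, psi)
e = euler_axis(theta, phi, psi)

# Residual of Q e = e; it should vanish up to round-off.
residual = [sum(Q[i][j] * e[j] for j in range(3)) - e[i] for i in range(3)]
```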

We now present an expression for the eigenvalues of $Q$. Since from Eqn. (1.105b), $\operatorname{tr} Q = 1 + 2\cos\alpha$, and since we have shown that the magnitude of the eigenvalues is 1, the eigenvalues of $Q \in \mathrm{Orth}^+$ are $\{1, e^{i\alpha}, e^{-i\alpha}\}$. Since $-1 \le \cos\alpha \le 1$, we have $-1 \le \operatorname{tr} Q \le 3$, with $\operatorname{tr} Q = 3$ if and only if $Q = I$ ($\alpha = 0$ in Eqn. (1.105b)). The other bound $\operatorname{tr} Q = -1$ is attained when $\alpha = \pi$ (or when $\theta$ or $\phi + \psi$ is equal to $\pi$ in Eqn. (1.108)), and from Eqn. (1.105b), we see that $Q = 2e \otimes e - I \in \mathrm{Sym}$ with eigenvalues $\{1, -1, -1\}$. When $-1 < \operatorname{tr} Q < 3$, two eigenvalues and the corresponding eigenvectors are complex.

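Since for $Q \in \mathrm{Orth}^+$, $I_2 = \operatorname{tr}(\operatorname{cof} Q) = \operatorname{tr}[(\det Q)Q^{-T}] = \operatorname{tr} Q$, by the Cayley–Hamilton theorem, we get

\[ Q^{-1} = Q^T = Q^2 - (\operatorname{tr} Q)Q + (\operatorname{tr} Q)I. \tag{1.110} \]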

If $Q_1, Q_2 \in \mathrm{Orth}^+$ have the representations (see Theorem 1.5.2)

\[ Q_1 = a_1 \otimes b_1 + a_2 \otimes b_2 + a_3 \otimes b_3, \qquad Q_2 = c_1 \otimes d_1 + c_2 \otimes d_2 + c_3 \otimes d_3, \]

where $\{a_i\}$, $\{b_i\}$, $\{c_i\}$, $\{d_i\}$ are orthonormal sets of vectors, then there exist proper orthogonal tensors $R_1 = c_k \otimes a_k$ and $R_2 = d_k \otimes b_k$ such that $Q_2 = R_1Q_1R_2^T$. However, if the traces of $Q_1$ and $Q_2$ are the same (which implies that all their principal invariants are also the same), the following stronger result holds.

Theorem 1.5.5. Two proper orthogonal tensors $Q_1$ and $Q_2$ have the same trace, i.e., $\operatorname{tr} Q_1 = \operatorname{tr} Q_2$, if and only if there exists an orthogonal tensor $Q_0$ (which means that either $Q_0$ or $-Q_0$ is a rotation) such that $Q_2 = Q_0Q_1Q_0^T$.

Proof. If there exists $Q_0 \in \mathrm{Orth}$ such that $Q_2 = Q_0Q_1Q_0^T$, then using the same method of proof as used to prove Eqn. (1.137), we see that all the principal invariants of $Q_1$ and $Q_2$ are the same. To prove the converse, note that if all the principal invariants of $Q_1$ and $Q_2$ are the same, then their characteristic equations, and hence their eigenvalues, are the same. Thus, using Eqn. (1.106), we can write

Q1 = e⊗ e + λn⊗ n + λn⊗ n,

Q2 = f ⊗ f + λg⊗ g + λg⊗ g.

The tensor

Q0 = f ⊗ e + g⊗ n + g⊗ n

is orthogonal since QT0 Q0 = e⊗ e + n⊗ n + n⊗ n = I, and using the relations

following Eqn. (1.106), we also see that $Q_2 = Q_0Q_1Q_0^T$. In the above treatment, we have assumed that two eigenvalues are complex. If $(1, -1, -1)$ are the eigenvalues of $Q_1$ and $Q_2$, then they have the representations $Q_1 = 2e \otimes e - I$ and $Q_2 = 2f \otimes f - I$, so that any tensor that rotates $e$ into $f$ can be taken to be $Q_0$.

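An alternative proof can also be given using Eqn. (1.105b) (recall that $\operatorname{tr} R = 1 + 2\cos\alpha$) by writing

\[ Q_1 = e \otimes e + \cos\alpha\,(I - e \otimes e) + \sin\alpha\,(r \otimes q - q \otimes r), \]

\[ Q_2 = f \otimes f + \cos\alpha\,(I - f \otimes f) \pm \sin\alpha\,(\bar{r} \otimes \bar{q} - \bar{q} \otimes \bar{r}), \]

where $\{e, q, r\}$ and $\{f, \bar{q}, \bar{r}\}$ are orthonormal sets of axes, and then taking $Q_0$ as $f \otimes e + \bar{q} \otimes q + \bar{r} \otimes r$.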





1.6 Symmetric Tensors

In this section, we examine some properties of symmetric second-order tensors. We first discuss the properties of the principal values (eigenvalues) and principal directions (eigenvectors) of a symmetric second-order tensor.

1.6.1 Principal values and principal directions

We have the following result:

Theorem 1.6.1. Every symmetric tensor $S$ has at least one principal frame, i.e., a right-handed triplet of orthogonal principal directions, and at most three distinct principal values. The principal values are always real. For the principal directions three possibilities exist:

• If all the three principal values are distinct, the principal axes are unique (modulo sign reversal).

• If two eigenvalues are equal, then there is one unique principal direction, and the remaining two principal directions can be chosen arbitrarily in the plane perpendicular to the first one, and mutually perpendicular to each other.

• If all three eigenvalues are the same, then every right-handed frame is a principal frame, and $S$ is of the form $S = \lambda I$.

The components of the tensor in the principal frame are

\[ S^* = \begin{bmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{bmatrix}. \tag{1.111} \]

Proof. We seek $\lambda$ and $n$ such that

\[ (S - \lambda I)n = 0. \tag{1.112} \]

This is an eigenvalue problem. For a nontrivial solution, we need to satisfy the condition that

\[ \det(S - \lambda I) = 0, \]

or, by Eqn. (1.79),

\[ \lambda^3 - I_1\lambda^2 + I_2\lambda - I_3 = 0, \tag{1.113} \]

where $I_1$, $I_2$ and $I_3$ are the principal invariants of $S$.

We now show that the principal values given by the three roots of the cubic equation Eqn. (1.113) are real. Suppose that two roots, and hence the eigenvectors associated with them, are complex. Denoting the complex conjugates of $\lambda$ and $n$ by $\bar{\lambda}$ and $\bar{n}$, we have

\[ Sn = \lambda n, \tag{1.114a} \]

\[ S\bar{n} = \bar{\lambda}\bar{n}, \tag{1.114b} \]

where Eqn. (1.114b) is obtained by taking the complex conjugate of Eqn. (1.114a) ($S$, being a real matrix, is not affected). Taking the dot product of both sides of Eqn. (1.114a) with $\bar{n}$, and of both sides of Eqn. (1.114b) with $n$, we get

\[ \bar{n} \cdot Sn = \lambda\, \bar{n} \cdot n, \tag{1.115} \]

\[ n \cdot S\bar{n} = \bar{\lambda}\, n \cdot \bar{n}. \tag{1.116} \]

Using the definition of a transpose of a tensor, and subtracting the second relation from the first, we get

\[ S^T\bar{n} \cdot n - n \cdot S\bar{n} = (\lambda - \bar{\lambda})\, n \cdot \bar{n}. \tag{1.117} \]

Since $S$ is symmetric, $S^T = S$, and we have

\[ (\lambda - \bar{\lambda})\, n \cdot \bar{n} = 0. \]

Since $n \cdot \bar{n} \neq 0$, $\lambda = \bar{\lambda}$, and hence the eigenvalues are real.

The principal directions $n_1$, $n_2$ and $n_3$, corresponding to distinct eigenvalues $\lambda_1$, $\lambda_2$ and $\lambda_3$, are mutually orthogonal and unique (modulo sign reversal). We now prove this. Taking the dot product of

\[ Sn_1 = \lambda_1 n_1, \tag{1.118} \]

\[ Sn_2 = \lambda_2 n_2, \tag{1.119} \]

with $n_2$ and $n_1$, respectively, and subtracting, we get

\[ 0 = (\lambda_1 - \lambda_2)\, n_1 \cdot n_2, \]

where we have used the fact that, $S$ being symmetric, $n_2 \cdot Sn_1 - n_1 \cdot Sn_2 = 0$. Thus, since we assumed that $\lambda_1 \neq \lambda_2$, we get $n_1 \perp n_2$. Similarly, we have $n_2 \perp n_3$ and $n_1 \perp n_3$. If $n_1$ satisfies $Sn_1 = \lambda_1 n_1$, then we see that $-n_1$ also satisfies the same equation. This is the only other choice possible that satisfies $Sn_1 = \lambda_1 n_1$. To see this, let $r_1$, $r_2$ and $r_3$ be another set of mutually perpendicular eigenvectors corresponding to the distinct eigenvalues $\lambda_1$, $\lambda_2$ and $\lambda_3$. Then $r_1$ has to be perpendicular to not only $r_2$ and $r_3$, but to $n_2$ and $n_3$ as well. Similar comments apply to $r_2$ and $r_3$. This is only possible when $r_1 = \pm n_1$, $r_2 = \pm n_2$ and $r_3 = \pm n_3$. Thus, the principal axes are unique modulo sign reversal.

To prove that the components of $S$ in the principal frame are given by Eqn. (1.111), assume that $n_1$, $n_2$, $n_3$ have been normalized to unit length, and then let $e_1^* = n_1$, $e_2^* = n_2$ and $e_3^* = n_3$. Using Eqn. (1.28), and taking into account the orthonormality of $e_1^*$ and $e_2^*$, the components $S_{11}^*$ and $S_{12}^*$ are given by

\[ S_{11}^* = e_1^* \cdot Se_1^* = e_1^* \cdot (\lambda_1 e_1^*) = \lambda_1, \qquad S_{12}^* = e_1^* \cdot Se_2^* = e_1^* \cdot (\lambda_2 e_2^*) = 0. \]

Similarly, on computing the other components, we see that the matrix representation of $S$ with respect to $\{e_i^*\}$ is given by Eqn. (1.111).

If there are two repeated roots, say, $\lambda_2 = \lambda_3$, and the third root $\lambda_1 \neq \lambda_2$, then let $e_1^*$ coincide with $n_1$, so that $Se_1^* = \lambda_1 e_1^*$. Choose $e_2^*$ and $e_3^*$ such that $e_1^*$, $e_2^*$, $e_3^*$ form a right-handed orthogonal coordinate system. The components of $S$ with respect to this coordinate system are

\[ S^* = \begin{bmatrix} \lambda_1 & 0 & 0 \\ 0 & S_{22}^* & S_{23}^* \\ 0 & S_{23}^* & S_{33}^* \end{bmatrix}. \tag{1.120} \]

By Eqn. (1.80), we have

\[
\begin{aligned}
S_{22}^* + S_{33}^* &= 2\lambda_2, \\
\lambda_1 S_{22}^* + \left(S_{22}^* S_{33}^* - (S_{23}^*)^2\right) + \lambda_1 S_{33}^* &= 2\lambda_1\lambda_2 + \lambda_2^2, \\
\lambda_1\left[S_{22}^* S_{33}^* - (S_{23}^*)^2\right] &= \lambda_1\lambda_2^2.
\end{aligned} \tag{1.121}
\]

Substituting for $\lambda_2$ from the first equation into the second, we get

\[ (S_{22}^* - S_{33}^*)^2 = -4(S_{23}^*)^2. \]

Since the components of $S$ are real, the above equation implies that $S_{23}^* = 0$ and $\lambda_2 = S_{22}^* = S_{33}^*$. This shows that Eqn. (1.120) reduces to Eqn. (1.111), and that $Se_2^* = S_{12}^* e_1^* + S_{22}^* e_2^* + S_{32}^* e_3^* = \lambda_2 e_2^*$ and $Se_3^* = \lambda_2 e_3^*$ (thus, $e_2^*$ and $e_3^*$ are eigenvectors corresponding to the eigenvalue $\lambda_2$). However, in this case the choice of the principal frame $\{e_i^*\}$ is not unique, since any vector lying in the plane of $e_2^*$ and $e_3^*$, given by $n^* = c_1 e_2^* + c_2 e_3^*$ where $c_1$ and $c_2$ are arbitrary constants, is also an eigenvector. The choice of $e_1^*$ is unique (modulo sign reversal), since it has to be perpendicular to $e_2^*$ and $e_3^*$. Though the choice of $e_2^*$ and $e_3^*$ is not unique, we can choose $e_2^*$ and $e_3^*$ arbitrarily in the plane perpendicular to $e_1^*$, and such that $e_2^* \perp e_3^*$.

Finally, if $\lambda_1 = \lambda_2 = \lambda_3 = \lambda$, then the tensor $S$ is of the form $S = \lambda I$. To show this, choose $e_1^*$, $e_2^*$, $e_3^*$, and follow a procedure analogous to that in the previous case. We now get $S_{22}^* = S_{33}^* = \lambda$ and $S_{23}^* = 0$, so that $[S^*] = \lambda I$. Using the transformation law for second-order tensors, we have

\[ [S] = Q^T[S^*]Q = \lambda Q^TQ = \lambda I. \]

Thus, any arbitrary vector $n$ is a solution of $Sn = \lambda n$, and hence every right-handed frame is a principal frame.

As a result of Eqn. (1.111), we can write

\[ S = \lambda_1 e_1^* \otimes e_1^* + \lambda_2 e_2^* \otimes e_2^* + \lambda_3 e_3^* \otimes e_3^*, \tag{1.122} \]

which is called the spectral resolution of $S$. The spectral resolution of $S$ is unique since

• If all the eigenvalues are distinct, then the eigenvectors are unique, and consequently the representation given by Eqn. (1.122) is unique.

• If two eigenvalues are repeated, then, by virtue of Eqn. (1.38), Eqn. (1.122) reduces to

\[ S = \lambda_1 e_1^* \otimes e_1^* + \lambda_2 e_2^* \otimes e_2^* + \lambda_2 e_3^* \otimes e_3^* = \lambda_1 e_1^* \otimes e_1^* + \lambda_2 (I - e_1^* \otimes e_1^*), \tag{1.123} \]

from which the asserted uniqueness follows, since $e_1^*$ is unique.

• If all the eigenvalues are the same then $S = \lambda I$.

Although the eigenvalues of $S \in \mathrm{Sym}$ are real, its eigenvectors can be complex (note that in this case, real eigenvectors, obtained by combining the complex eigenvectors, also exist). For example, if $S = \mathrm{diag}[1, 2, 3]$, then $n = (1 + i)[1, 0, 0]/\sqrt{2}$ and $\bar{n} = (1 - i)[1, 0, 0]/\sqrt{2}$ are eigenvectors corresponding to the eigenvalue 1. If $S = \mathrm{diag}[1, 1, 2]$, then $n = [1, i, 0]/\sqrt{2}$ and $\bar{n} = [1, -i, 0]/\sqrt{2}$ are eigenvectors corresponding to the repeated eigenvalue 1. Note that the eigenvectors in such a case occur in complex conjugate pairs (which is proved simply by taking the complex conjugate of the relation $Sn = \lambda n$, so that $(n, \bar{n})$ are the eigenvectors corresponding to the eigenvalue $\lambda$). Note, however, that although the eigenvectors can be complex, the eigenprojections $P_i$ as given by Eqn. (J.4) are real.

Theorem 1.6.2. Let $S$ be given by Eqn. (1.122), and let $\{e_k\}$ denote the canonical basis of $V$. Then $Q = e_k \otimes e_k^*$ is a proper orthogonal tensor that diagonalizes $S$, i.e., the matrix representation of $QSQ^T$ with respect to the canonical basis is diagonal.

Conversely, if $Q \in \mathrm{Orth}^+$ diagonalizes a tensor $S$, i.e., $QSQ^T = \Lambda$ where $\Lambda$ is a diagonal matrix, then $S$ is symmetric, the diagonal matrix $\Lambda$ comprises the eigenvalues of $S$, and the rows of $Q$ are the eigenvectors of $S$.

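Proof. By Theorem 1.5.2, $Q = e_k \otimes e_k^* \in \mathrm{Orth}^+$, and since $e_i = Qe_i^*$, we have

\[ QSQ^T = Q\left[\sum_i \lambda_i e_i^* \otimes e_i^*\right]Q^T = \sum_i \lambda_i (Qe_i^*) \otimes (Qe_i^*) = \sum_i \lambda_i e_i \otimes e_i, \]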

which is a diagonal matrix with $\lambda_i$ along the diagonal.

Conversely, if $QSQ^T = \Lambda$ where $\Lambda$ is a diagonal matrix, then $S = Q^T\Lambda Q$ is clearly a symmetric tensor. By Theorem 1.6.10, $S$ and $\Lambda$ have the same eigenvalues, and hence $\Lambda$ has the eigenvalues of $S$ along the diagonal. It follows that

\[ S = Q^T\Lambda Q = Q^T\left[\sum_i \lambda_i e_i \otimes e_i\right]Q = \sum_i \lambda_i (Q^Te_i) \otimes (Q^Te_i), \]

which implies that $S(Q^Te_i) = \lambda_i (Q^Te_i)$ (no sum on $i$); thus, $Q^Te_i$ are the eigenvectors of $S$. Since $Q = IQ$, we have

\[ Q = (e_k \otimes e_k)Q = e_k \otimes (Q^Te_k) = e_k \otimes e_k^*, \]

which shows that the rows of $Q$ are the eigenvectors of $S$.

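The Cayley–Hamilton theorem (Theorem 1.3.5) applied to $S \in \mathrm{Sym}$ yields

\[ S^3 - I_1 S^2 + I_2 S - I_3 I = 0. \tag{1.124} \]

We have already proved this result for any arbitrary tensor. However, the following simpler proof can be given for symmetric tensors. Using Eqn. (1.341), the spectral resolutions of $S$, $S^2$ and $S^3$ are

\[
\begin{aligned}
S &= \lambda_1 e_1^* \otimes e_1^* + \lambda_2 e_2^* \otimes e_2^* + \lambda_3 e_3^* \otimes e_3^*, \\
S^2 &= \lambda_1^2 e_1^* \otimes e_1^* + \lambda_2^2 e_2^* \otimes e_2^* + \lambda_3^2 e_3^* \otimes e_3^*, \\
S^3 &= \lambda_1^3 e_1^* \otimes e_1^* + \lambda_2^3 e_2^* \otimes e_2^* + \lambda_3^3 e_3^* \otimes e_3^*.
\end{aligned} \tag{1.125}
\]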

Substituting these expressions into the left-hand side of Eqn. (1.124), we get

\[
\begin{aligned}
\text{LHS} ={}& (\lambda_1^3 - I_1\lambda_1^2 + I_2\lambda_1 - I_3)(e_1^* \otimes e_1^*) + (\lambda_2^3 - I_1\lambda_2^2 + I_2\lambda_2 - I_3)(e_2^* \otimes e_2^*) \\
&+ (\lambda_3^3 - I_1\lambda_3^2 + I_2\lambda_3 - I_3)(e_3^* \otimes e_3^*) = 0,
\end{aligned}
\]

since $\lambda_i^3 - I_1\lambda_i^2 + I_2\lambda_i - I_3 = 0$ for $i = 1, 2, 3$.

$\{S - I_1 I/3 : S \in \mathrm{Sym}\}$ constitutes the set of traceless symmetric tensors, while tensors of the form $(a \otimes b + b \otimes a)/2$, $a, b \in V$, have determinant zero (a fact that can be verified by writing $(a \otimes b + b \otimes a)/2$ as $(a \otimes b + b \otimes a + 0 \otimes 0)/2$ and then using Eqn. (1.53)). However, not all symmetric tensors with determinant zero can be represented as $(a \otimes b + b \otimes a)/2$ since, from the Cauchy–Schwarz inequality, it follows that the second invariant of $(a \otimes b + b \otimes a)/2$ computed using Eqn. (1.63) is never positive.

The following explicit expressions can be given for the eigenvalues of $S \in \mathrm{Sym}$ when it is not of the form $\lambda I$ [298]:

\[
\begin{aligned}
\lambda_1 &= \frac{I_1}{3} + 2\sqrt{p}\cos\phi, \\
\lambda_2 &= \frac{I_1}{3} - \sqrt{p}\,(\cos\phi + \sqrt{3}\sin\phi), \\
\lambda_3 &= \frac{I_1}{3} - \sqrt{p}\,(\cos\phi - \sqrt{3}\sin\phi),
\end{aligned}
\]

where

\[
\begin{aligned}
p &= \frac{1}{6}\left(S - \frac{I_1}{3}I\right) : \left(S - \frac{I_1}{3}I\right), \\
q &= \frac{1}{2}\det\left(S - \frac{I_1}{3}I\right), \\
\phi &= \frac{1}{3}\tan^{-1}\frac{\sqrt{p^3 - q^2}}{q}, \qquad 0 \le \phi \le \pi.
\end{aligned}
\]

As can be seen from the above expressions, $\lambda_2 = \lambda_3$ when $p^3 = q^2$.

Alternative expressions for finding the eigenvalues are

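\[
\begin{aligned}
\lambda_1 &= \frac{I_1}{3} - 2\sqrt{Q}\cos\alpha, \\
\lambda_2 &= \frac{I_1}{3} - 2\sqrt{Q}\cos\left(\alpha + \frac{2\pi}{3}\right), \\
\lambda_3 &= \frac{I_1}{3} - 2\sqrt{Q}\cos\left(\alpha - \frac{2\pi}{3}\right),
\end{aligned} \tag{1.126}
\]

where

\[
Q = \frac{I_1^2 - 3I_2}{9}, \qquad
U = \frac{1}{54}\left[9I_1I_2 - 2I_1^3 - 27I_3\right], \qquad
\alpha = \frac{1}{3}\cos^{-1}\left(\frac{U}{\sqrt{Q^3}}\right).
\]

If $Q = 0$ (i.e., if $I_1^2 = 3I_2$), then all the eigenvalues are repeated, $\lambda_1 = \lambda_2 = \lambda_3 = I_1/3$. If $Q \neq 0$ and $Q^3 = U^2$ (i.e., if $I_1^2I_2^2 - 4I_2^3 - 4I_1^3I_3 + 18I_1I_2I_3 - 27I_3^2 = 0$), then $\lambda_1 = I_1/3 - 2\sqrt{Q}$ and $\lambda_2 = \lambda_3 = I_1/3 + \sqrt{Q}$ if $U > 0$, and $\lambda_2 = I_1/3 + 2\sqrt{Q}$ and $\lambda_1 = \lambda_3 = I_1/3 - \sqrt{Q}$ if $U < 0$. If $Q \neq 0$ and $Q^3 \neq U^2$, then the eigenvalues are distinct.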

We saw in Theorem 1.6.1 that the component form of a symmetric tensor is a diagonal matrix in a particular basis. The following theorem states the conditions under which the matrix representation has all the diagonal terms zero [18], [27], [117]:

Theorem 1.6.3. There exists an orthonormal basis ni, i = 1, 2, 3, for which

n1 · Sn1 = n2 · Sn2 = n3 · Sn3 = 0, (1.127)

if and only if tr S = 0.

Proof. If Eqn. (1.127) holds then clearly $\operatorname{tr} S = n_1 \cdot Sn_1 + n_2 \cdot Sn_2 + n_3 \cdot Sn_3 = 0$. Thus, we just need to prove the converse. If $\operatorname{tr} S = \lambda_1 + \lambda_2 + \lambda_3 = 0$, then the spectral resolution of $S$ can be written as

\[ S = \lambda_1 (e_1^* \otimes e_1^* - e_3^* \otimes e_3^*) + \lambda_2 (e_2^* \otimes e_2^* - e_3^* \otimes e_3^*). \]

We have seen that $e_i^*$, $i = 1, 2, 3$, can always be chosen to be an orthonormal basis, and we assume that such a choice has been made. Note that when $\operatorname{tr} S = 0$, at most one eigenvalue can be zero, since if two or more are zero, the tensor $S$ is the zero tensor. Thus, we can assume, without loss of generality, that both $\lambda_1$ and $\lambda_2$ are nonzero, and $\lambda_3 \equiv -(\lambda_1 + \lambda_2)$ can be either zero or nonzero. If we choose

\[ n_1 = (e_1^* + e_2^* + e_3^*)/\sqrt{3}, \tag{1.128} \]

it is clear that $n_1 \cdot Sn_1 = 0$. We now try to find $n_2$ and $n_3$ in the plane perpendicular to $n_1$ such that $n_2 \cdot Sn_2 = n_3 \cdot Sn_3 = 0$. Since $n_1 \cdot n_2 = 0$, $n_2$ and $n_3$ are of the form

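\[
\begin{aligned}
n_2 &= \frac{\gamma_1 e_1^* + \gamma_2 e_2^* - (\gamma_1 + \gamma_2) e_3^*}{\sqrt{\gamma_1^2 + \gamma_2^2 + (\gamma_1 + \gamma_2)^2}}, \\
n_3 &= n_1 \times n_2 = \frac{-(\gamma_1 + 2\gamma_2) e_1^* + (2\gamma_1 + \gamma_2) e_2^* + (\gamma_2 - \gamma_1) e_3^*}{\sqrt{3}\sqrt{\gamma_1^2 + \gamma_2^2 + (\gamma_1 + \gamma_2)^2}},
\end{aligned} \tag{1.129}
\]

where $\gamma_1$, $\gamma_2$ are constants to be determined. Both the conditions $n_2 \cdot Sn_2 = 0$ and $n_3 \cdot Sn_3 = 0$ yield the equation

\[ \lambda_1\gamma_2^2 + 2(\lambda_1 + \lambda_2)\gamma_1\gamma_2 + \lambda_2\gamma_1^2 = 0, \]

which on solving yields (recall that both $\lambda_1$ and $\lambda_2$ are nonzero)

\[ \gamma_2 = \left[\frac{-(\lambda_1 + \lambda_2) \pm \sqrt{\lambda_1^2 + \lambda_1\lambda_2 + \lambda_2^2}}{\lambda_1}\right]\gamma_1. \]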

It suffices to consider only the solution with the positive sign, since it can be shown that the solution obtained with the negative sign is simply a rearrangement of the one obtained with the positive one, i.e., if $(n_1, n_2, n_3)$ is the solution obtained with the positive sign, then $(n_1, -n_3, n_2)$ is the solution obtained with the negative one. Substituting for $\gamma_2$ into Eqns. (1.129), we get

\[
n_2 = \frac{g_1 e_1^* + g_2 e_2^* + g_3 e_3^*}{\sqrt{g_1^2 + g_2^2 + g_3^2}}, \qquad
n_3 = \frac{h_1 e_1^* + h_2 e_2^* + h_3 e_3^*}{\sqrt{h_1^2 + h_2^2 + h_3^2}}, \tag{1.130}
\]

where

\[
\begin{aligned}
g_1 &= \lambda_1, \\
g_2 &= \sqrt{\lambda_1^2 + \lambda_1\lambda_2 + \lambda_2^2} - (\lambda_1 + \lambda_2), \\
g_3 &= \lambda_2 - \sqrt{\lambda_1^2 + \lambda_1\lambda_2 + \lambda_2^2}, \\
h_1 &= \lambda_1 + 2\lambda_2 - 2\sqrt{\lambda_1^2 + \lambda_1\lambda_2 + \lambda_2^2}, \\
h_2 &= \lambda_1 - \lambda_2 + \sqrt{\lambda_1^2 + \lambda_1\lambda_2 + \lambda_2^2}, \\
h_3 &= -2\lambda_1 - \lambda_2 + \sqrt{\lambda_1^2 + \lambda_1\lambda_2 + \lambda_2^2}.
\end{aligned}
\]

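Equations (1.128) and (1.130) are the desired solutions.

The solution that has been presented above is only one choice in an infinite one-parameter family of solutions in terms of a parameter $\theta$. To find this infinite one-parameter family, assume a unit vector in the principal basis $\{e_i^*\}$ to be given by

\[ n_1 = (\sin\theta\cos\phi,\ \sin\theta\sin\phi,\ \cos\theta), \]

and also assume without loss of generality that $\lambda_1$ and $\lambda_2$ are chosen such that $\lambda_1 \neq \lambda_2$. Using the fact that $S$ in this coordinate system is given by $\mathrm{diag}[\lambda_1, \lambda_2, -(\lambda_1 + \lambda_2)]$, and eliminating $\phi$ by using the condition $n_1 \cdot Sn_1 = 0$ yields

\[
n_1(\theta) = \begin{bmatrix}
\pm\sqrt{\dfrac{(\lambda_1 + 2\lambda_2)\cos^2\theta - \lambda_2}{\lambda_1 - \lambda_2}} \\[2ex]
\pm\sqrt{\dfrac{(\lambda_2 + 2\lambda_1)\cos^2\theta - \lambda_1}{\lambda_2 - \lambda_1}} \\[2ex]
\pm\cos\theta
\end{bmatrix}.
\]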

The second vector $n_2$ also satisfies $n_2 \cdot Sn_2 = 0$, and hence has to be of the form $n_1(\alpha)$, where $\alpha$ is found in terms of $\theta$ by imposing the condition that $n_1 \cdot n_2 = 0$. Denoting $t \equiv \cos^2\theta$, we get

\[ \cos^2\alpha = \frac{h_1 \pm \sqrt{h_2}}{D}, \tag{1.131} \]

where

\[
\begin{aligned}
h_1 &= \lambda_1\lambda_2(-2 + 7t - 5t^2) + (\lambda_1^2 + \lambda_2^2)(-1 + 3t - 2t^2), \\
h_2 &= (\lambda_1^4 + \lambda_2^4)(1 - 2t)^2 t^2 + (\lambda_1^3\lambda_2 + \lambda_1\lambda_2^3)(-1 + 8t - 22t^2 + 20t^3)t + \lambda_1^2\lambda_2^2(-2 + 15t - 38t^2 + 33t^3)t, \\
D &= 2(\lambda_1^2 + \lambda_2^2)(-1 + 2t) + 2\lambda_1\lambda_2(-2 + 5t).
\end{aligned}
\]

Denoting the two roots for $\cos^2\alpha$ corresponding to the plus and minus signs in Eqn. (1.131) by $\cos^2\alpha_1$ and $\cos^2\alpha_2$, we get $n_2 = n_1(\alpha_1)$ and $n_3 = n_1(\alpha_2)$. The plus or minus signs for the components of $n_1$ can be chosen arbitrarily; the corresponding signs for $n_2$ are obtained by imposing the condition $n_1 \cdot n_2 = 0$, and then $n_3$ is found as $n_1 \times n_2$.
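The particular solution given by Eqns. (1.128) and (1.130) is easy to verify numerically. The sketch below works with components in the principal basis, where $S = \mathrm{diag}[\lambda_1, \lambda_2, -(\lambda_1 + \lambda_2)]$; it is a plain-Python illustration, and the function names and the sample eigenvalues are assumptions made for this example.

```python
import math

def unit(v):
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]

def zero_diagonal_basis(lam1, lam2):
    """Components of n1, n2, n3 in the principal basis, Eqns. (1.128), (1.130).

    Assumes lam1 and lam2 are nonzero and lam3 = -(lam1 + lam2).
    """
    root = math.sqrt(lam1 ** 2 + lam1 * lam2 + lam2 ** 2)
    g = [lam1, root - (lam1 + lam2), lam2 - root]
    h = [lam1 + 2 * lam2 - 2 * root, lam1 - lam2 + root, -2 * lam1 - lam2 + root]
    return unit([1.0, 1.0, 1.0]), unit(g), unit(h)

def diagonal_terms(lams, basis):
    """n . S n for each basis vector, with S = diag(lams) in the principal basis."""
    return [sum(l * c * c for l, c in zip(lams, n)) for n in basis]

lam1, lam2 = 2.0, 3.0                       # sample nonzero eigenvalues
lams = [lam1, lam2, -(lam1 + lam2)]         # traceless by construction
basis = zero_diagonal_basis(lam1, lam2)
diag = diagonal_terms(lams, basis)          # expected: three zeros (round-off)

# Case lam1 = 1, lam2 = -1, i.e., S = diag[1, -1, 0].
n1, n2, n3 = zero_diagonal_basis(1.0, -1.0)
```

For $\lambda_1 = 1$, $\lambda_2 = -1$ the routine returns $n_1 = (1, 1, 1)/\sqrt{3}$, $n_2 = (1, 1, -2)/\sqrt{6}$ and $n_3 = (-1, 1, 0)/\sqrt{2}$.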





The result in Theorem 1.6.3 actually holds for an arbitrary traceless tensor $T$. This is seen by splitting $T$ into its symmetric and antisymmetric parts; since the diagonal elements of the antisymmetric part are zero in any basis, the basis found for the symmetric part works. As a simple example of the above theorem, let the matrix representation of $S$ with respect to a basis be $\mathrm{diag}[1, -1, 0]$. Then, by convention, we choose $\lambda_1$ and $\lambda_2$ to be the nonzero eigenvalues, i.e., $\lambda_1 = 1$, $\lambda_2 = -1$. The corresponding eigenvectors are $e_1^* = (1, 0, 0)$, $e_2^* = (0, 1, 0)$ and $e_3^* = (0, 0, 1)$. From Eqns. (1.128) and (1.130), we get $n_1 = (1, 1, 1)/\sqrt{3}$, $n_2 = (1, 1, -2)/\sqrt{6}$, and $n_3 = (-1, 1, 0)/\sqrt{2}$. A more complicated example is given in Section 3.7.

The decomposition of a matrix into a product of two symmetric factors, i.e., $T = S_1S_2$, where $S_1, S_2 \in \mathrm{Sym}$, and where one of the factors can be taken to be invertible [26], can be found as follows. Assuming $S_2$ to be invertible, we get $S_1 = TS_2^{-1}$. Since $S_1 \in \mathrm{Sym}$, we get $S_2^{-1}T^T = TS_2^{-1}$, or, alternatively, $T^TS_2 = S_2T$. This results in at most $n(n - 1)/2$ independent equations for the $n(n + 1)/2$ components of $S_2$. For example, for dimension $n = 2$, we get

\[
\begin{bmatrix} -T_{12} & T_{21} & T_{11} - T_{22} \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}
\begin{bmatrix} S_{11} \\ S_{22} \\ S_{12} \end{bmatrix} = 0, \tag{1.132}
\]

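while for dimension $n = 3$, we get

\[
\begin{bmatrix}
-T_{12} & T_{21} & 0 & T_{11} - T_{22} & T_{31} & -T_{32} \\
-T_{13} & 0 & T_{31} & -T_{23} & T_{21} & T_{11} - T_{33} \\
0 & -T_{23} & T_{32} & -T_{13} & T_{22} - T_{33} & T_{12} \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
\begin{bmatrix} S_{11} \\ S_{22} \\ S_{33} \\ S_{12} \\ S_{23} \\ S_{13} \end{bmatrix} = 0.
\]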

The decomposition is obviously not unique since, e.g., $I$ can be factorized as $(I)(I)$ or as $(-I)(-I)$.

In general, it is not possible to express the factors $S_1$ and $S_2$ as a function of $T$. To see this, let $T = \begin{bmatrix} 0 & w \\ -w & 0 \end{bmatrix}$, whose determinant is nonzero, so that both $S_1$ and $S_2$ are invertible. Using Eqn. (1.132), we get the most general solution as

\[
\begin{bmatrix} 0 & w \\ -w & 0 \end{bmatrix} = \frac{1}{a^2 + b^2}
\begin{bmatrix} bw & -aw \\ -aw & -bw \end{bmatrix}
\begin{bmatrix} a & b \\ b & -a \end{bmatrix}, \qquad a, b \in \Re.
\]

One can see that it is not possible to express either $S_1$ or $S_2$ as a function of $T$.
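The two-parameter family of symmetric factors given above can be checked by direct multiplication. The following plain-Python sketch evaluates the factors for a few sample parameter pairs $(a, b)$, not both zero; the function names and the sample values are illustrative assumptions.

```python
def matmul2(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def symmetric_factors(w, a, b):
    """Symmetric factors S1, S2 of T = [[0, w], [-w, 0]] for (a, b) != (0, 0)."""
    scale = 1.0 / (a * a + b * b)
    S1 = [[scale * b * w, -scale * a * w],
          [-scale * a * w, -scale * b * w]]
    S2 = [[a, b], [b, -a]]
    return S1, S2

w = 2.0
T = [[0.0, w], [-w, 0.0]]
samples = [(1.0, 0.0), (0.0, 1.0), (3.0, -4.0)]   # sample (a, b) pairs
products = [matmul2(*symmetric_factors(w, a, b)) for a, b in samples]
```

Every pair $(a, b)$ reproduces the same $T$ from a different pair of symmetric factors, which illustrates why neither factor can be written as a function of $T$ alone.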

1.6.2 Positive definite tensors and the polar decomposition

A second-order symmetric tensor S is positive definite if

(u, Su) ≥ 0 ∀u ∈ V with (u, Su) = 0 if and only if u = 0.





We denote the set of symmetric, positive definite tensors by Psym. Since by virtue of Eqn. (1.33), all tensors $T$ can be decomposed into a symmetric part $T_s$ and a skew-symmetric part $T_{ss}$, we have

\[ (u, Tu) = (u, T_su) + (u, T_{ss}u) = (u, T_su), \]

because $(u, T_{ss}u) = 0$ by Eqn. (1.86). Thus, the positive definiteness of a tensor is decided by the positive definiteness of its symmetric part. In Theorem 1.6.4, we show that a symmetric tensor is positive definite if and only if its eigenvalues are positive. Although the eigenvalues of the symmetric part of $T$ should be positive in order for $T$ to be positive definite, positiveness of the eigenvalues of $T$ itself does not ensure its positive definiteness, as the following counterexample shows. If $T = \begin{bmatrix} 1 & -10 \\ 0 & 1 \end{bmatrix}$, then $T$ is not positive definite since $u \cdot Tu < 0$ for $u = (1, 1)$, but the eigenvalues of $T$ are $(1, 1)$. Conversely, if $T$ is positive definite, then by choosing $u$ to be the real unit eigenvectors $n$ of $T$, it follows that its real eigenvalues $\lambda = (n \cdot Tn)$ are positive.
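The counterexample can be worked through numerically: the quadratic form is negative for $u = (1, 1)$, and the symmetric part of $T$ has a negative eigenvalue even though both eigenvalues of $T$ equal 1. The sketch below is plain Python; the variable names are illustrative, and the eigenvalues of the symmetric part are computed with the standard closed form for a symmetric $2 \times 2$ matrix.

```python
import math

T = [[1.0, -10.0], [0.0, 1.0]]   # upper triangular: both eigenvalues equal 1
u = [1.0, 1.0]

# Quadratic form u . T u.
Tu = [T[0][0] * u[0] + T[0][1] * u[1], T[1][0] * u[0] + T[1][1] * u[1]]
quad = u[0] * Tu[0] + u[1] * Tu[1]

# Symmetric part Ts = (T + T^T)/2 and its eigenvalues (closed form for 2x2).
Ts = [[(T[i][j] + T[j][i]) / 2.0 for j in range(2)] for i in range(2)]
mean = (Ts[0][0] + Ts[1][1]) / 2.0
radius = math.hypot((Ts[0][0] - Ts[1][1]) / 2.0, Ts[0][1])
eigs_Ts = [mean - radius, mean + radius]
```

Here `quad` evaluates to $-8$ and the symmetric part has eigenvalues $-4$ and $6$, consistent with the statement that positive definiteness is decided by the symmetric part.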

Theorem 1.6.4. Let S ∈ Sym. Then the following are equivalent:

1. S is positive definite.

2. The principal values of S are strictly positive.

3. The principal invariants of S are strictly positive.

Proof. We first prove the equivalence of (1) and (2). Suppose $S$ is positive definite. If $\lambda$ and $n$ denote a principal value and the corresponding unit principal direction of $S$, then $Sn = \lambda n$, which implies that $\lambda = (n, Sn) > 0$ since $n \neq 0$.

Conversely, suppose that the principal values of $S$ are greater than 0. Assuming that $e_1^*$, $e_2^*$ and $e_3^*$ denote the principal axes, the representation of $S$ in the principal coordinate frame is (see Eqn. (1.122))

\[ S = \lambda_1 e_1^* \otimes e_1^* + \lambda_2 e_2^* \otimes e_2^* + \lambda_3 e_3^* \otimes e_3^*. \]

Then

\[
\begin{aligned}
Su &= (\lambda_1 e_1^* \otimes e_1^* + \lambda_2 e_2^* \otimes e_2^* + \lambda_3 e_3^* \otimes e_3^*)u \\
&= \lambda_1 (e_1^* \cdot u) e_1^* + \lambda_2 (e_2^* \cdot u) e_2^* + \lambda_3 (e_3^* \cdot u) e_3^* \\
&= \lambda_1 u_1^* e_1^* + \lambda_2 u_2^* e_2^* + \lambda_3 u_3^* e_3^*,
\end{aligned}
\]

and

\[ (u, Su) = u \cdot Su = \lambda_1 (u_1^*)^2 + \lambda_2 (u_2^*)^2 + \lambda_3 (u_3^*)^2, \tag{1.133} \]

which is greater than or equal to zero since $\lambda_i > 0$. Suppose that $(u, Su) = 0$. Then by Eqn. (1.133), $u_i^* = 0$, which implies that $u = 0$. Thus, $S$ is a positive definite tensor.

To prove the equivalence of (2) and (3), note that, by Eqn. (1.80), if all the principal values are strictly positive, then the principal invariants are also strictly positive. Conversely, if all the principal invariants are positive, then $I_3 = \lambda_1\lambda_2\lambda_3$ is positive, so that all the $\lambda_i$ are nonzero in addition to being real. Each $\lambda_i$ has to satisfy the characteristic equation

\[ \lambda_i^3 - I_1\lambda_i^2 + I_2\lambda_i - I_3 = 0, \qquad i = 1, 2, 3. \]

If $\lambda_i$ is negative, then, since $I_1$, $I_2$, $I_3$ are positive, the left-hand side of the above equation is negative, and hence the above equation cannot be satisfied. We have already mentioned that $\lambda_i$ cannot be zero. Hence, each $\lambda_i$ has to be positive.

Theorem 1.6.5. If $S \in \mathrm{Psym}$, then there exists a unique $H \in \mathrm{Psym}$ such that $H^2 := HH = S$. The tensor $H$ is called the positive definite square root of $S$, and we write $H = \sqrt{S}$.

Proof. Before we begin the proof, we note that a positive definite, symmetric tensor can have square roots that are not positive definite. For example, $\mathrm{diag}[1, -1, 1]$ is a non-positive definite square root of $I$. Here, we are interested only in those square roots that are positive definite. The short proof of uniqueness given below is due to Stephenson [305].

Since

\[ S = \sum_{i=1}^{3} \lambda_i e_i^* \otimes e_i^* \]

is positive definite, by Theorem 1.6.4, all $\lambda_i > 0$. Define

\[ H := \sum_{i=1}^{3} \sqrt{\lambda_i}\, e_i^* \otimes e_i^*. \tag{1.134} \]

Since the $\{e_i^*\}$ are orthonormal, it is easily seen that $HH = S$. Since the eigenvalues of $H$ given by $\sqrt{\lambda_i}$ are all positive, $H$ is positive definite. Thus, we have shown that a positive definite square root tensor of $S$ given by Eqn. (1.134) exists. We now prove uniqueness.

With $\lambda$ and $n$ denoting a principal value and the corresponding principal direction of $S$, and $H$ denoting any positive definite square root of $S$, we have

$$0 = (S - \lambda I)n = (H^2 - \lambda I)n = (H + \sqrt{\lambda}\, I)(H - \sqrt{\lambda}\, I)n.$$

Calling $(H - \sqrt{\lambda}\, I)n = \tilde{n}$, we have

$$(H + \sqrt{\lambda}\, I)\tilde{n} = 0.$$

This implies that $\tilde{n} = 0$. For, if not, $-\sqrt{\lambda}$ is a principal value of $H$, which contradicts the fact that $H$ is positive definite. Therefore,

$$(H - \sqrt{\lambda}\, I)n = 0;$$

i.e., $n$ is also a principal direction of $H$ with associated principal value $\sqrt{\lambda}$. If $H = \sqrt{S}$, then it must have the form given by Eqn. (1.134) (since the spectral decomposition is unique), which establishes its uniqueness.
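As a worked illustration of Eqn. (1.134), consider the following component matrix with respect to the canonical basis. The particular matrix is an arbitrary choice made here for illustration and does not come from the text.

```latex
S = \begin{bmatrix} 2 & 1 & 0 \\ 1 & 2 & 0 \\ 0 & 0 & 4 \end{bmatrix},
\qquad
\lambda_1 = 3,\; e_1^* = \tfrac{1}{\sqrt{2}}(1, 1, 0), \quad
\lambda_2 = 1,\; e_2^* = \tfrac{1}{\sqrt{2}}(1, -1, 0), \quad
\lambda_3 = 4,\; e_3^* = (0, 0, 1),

\sqrt{S} = \sum_{i=1}^{3} \sqrt{\lambda_i}\, e_i^* \otimes e_i^*
= \begin{bmatrix}
\tfrac{1}{2}(\sqrt{3} + 1) & \tfrac{1}{2}(\sqrt{3} - 1) & 0 \\
\tfrac{1}{2}(\sqrt{3} - 1) & \tfrac{1}{2}(\sqrt{3} + 1) & 0 \\
0 & 0 & 2
\end{bmatrix}.
```

Squaring the last matrix recovers $S$; for instance, the $(1,1)$ entry is $\tfrac{1}{4}[(\sqrt{3}+1)^2 + (\sqrt{3}-1)^2] = 2$ and the $(1,2)$ entry is $\tfrac{1}{2}(\sqrt{3}+1)(\sqrt{3}-1) = 1$.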

Now we prove the polar decomposition theorem, which plays a crucial role in the characterization of frame-indifferent constitutive relations.

Theorem 1.6.6 (Polar Decomposition Theorem). Let $F$ be an invertible tensor. Then, it can be factored in a unique fashion as

$$F = RU = VR,$$

where $R$ is an orthogonal tensor, and $U$, $V$ are symmetric and positive definite tensors. One has

$$U = \sqrt{F^T F}, \qquad V = \sqrt{F F^T}.$$

Proof. The tensor $F^T F$ is obviously symmetric. It is positive definite since

$$(u, F^T F u) = (Fu, Fu) \ge 0,$$

with equality if and only if $u = 0$ ($Fu = 0$ implies that $u = 0$, since $F$ is invertible). Let $U = \sqrt{F^T F}$. $U$ is unique, symmetric and positive definite by Theorem 1.6.5. Define $R = FU^{-1}$, so that $F = RU$. The tensor $R$ is orthogonal, because

$$\begin{aligned}
R^T R &= (FU^{-1})^T (FU^{-1}) \\
&= U^{-T} F^T F U^{-1} \\
&= U^{-1} (F^T F) U^{-1} \quad \text{(since $U$ is symmetric)} \\
&= U^{-1} (UU) U^{-1} \\
&= I.
\end{aligned}$$

Since $\det U > 0$, we have $\det U^{-1} > 0$. Hence, $\det R$ and $\det F$ have the same sign. Usually, the polar decomposition theorem is applied to the deformation gradient $F$ satisfying $\det F > 0$. In such a case $\det R = 1$, and $R$ is a rotation.

Next, let $V = FUF^{-1} = FR^{-1} = RUR^{-1} = RUR^T$. Thus, $V$ is symmetric since $U$ is symmetric. $V$ is positive definite since

$$(u, Vu) = (u, RUR^T u) = (R^T u, U R^T u) \ge 0,$$

with equality if and only if $u = 0$ (again since $R^T$ is invertible). Note that $FF^T = VV = V^2$, so that $V = \sqrt{FF^T}$.

Finally, to prove the uniqueness of the polar decomposition, we note that since $U$ is unique, $R = FU^{-1}$ is unique, and hence so is $V$.
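The factorization can also be checked numerically. The sketch below does not follow the constructive proof (which needs the square root of $F^T F$); instead it uses the Newton iteration $R_{k+1} = \tfrac{1}{2}(R_k + R_k^{-T})$ started from $R_0 = F$, a standard alternative for computing the orthogonal factor of an invertible matrix. All function names, the sample matrix `F` and the iteration count are assumptions made for illustration.

```python
def matmul(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def det3(A):
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

def inverse3(A):
    """Inverse via the cofactor matrix: A^{-1} = cof(A)^T / det(A)."""
    d = det3(A)
    cof = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            r = [k for k in range(3) if k != i]
            c = [k for k in range(3) if k != j]
            minor = A[r[0]][c[0]] * A[r[1]][c[1]] - A[r[0]][c[1]] * A[r[1]][c[0]]
            cof[i][j] = (-1) ** (i + j) * minor
    return [[cof[j][i] / d for j in range(3)] for i in range(3)]

def polar_decomposition(F, iterations=30):
    """Return (R, U, V) with F = R U = V R, using Newton's iteration for R."""
    R = [row[:] for row in F]
    for _ in range(iterations):
        R_inv_T = transpose(inverse3(R))
        R = [[0.5 * (R[i][j] + R_inv_T[i][j]) for j in range(3)]
             for i in range(3)]
    U = matmul(transpose(R), F)   # U = R^T F
    V = matmul(F, transpose(R))   # V = F R^T
    return R, U, V

# Hypothetical deformation gradient with det F > 0.
F = [[1.0, 0.5, 0.0],
     [0.0, 1.0, 0.2],
     [0.1, 0.0, 1.0]]
R, U, V = polar_decomposition(F)
```

Since $\det F > 0$ for this sample, the computed `R` should be a rotation, and `U`, `V` should come out symmetric, in line with the theorem.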



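Let $(\lambda_i, e_i^*)$ denote the eigenvalues/eigenvectors of $U$. Then, since $V R e_i^* = R U e_i^* = \lambda_i (R e_i^*)$, the pairs $(\lambda_i, f_i^*)$, where $f_i^* \equiv R e_i^*$, are the eigenvalues/eigenvectors of $V$. Thus, $F$ and $R$ can be represented as

$$\begin{aligned}
F &= RU = R \sum_{i=1}^{3} \lambda_i\, e_i^* \otimes e_i^* = \sum_{i=1}^{3} \lambda_i\, (R e_i^*) \otimes e_i^* = \sum_{i=1}^{3} \lambda_i\, f_i^* \otimes e_i^*, \\
R &= RI = R \sum_{i=1}^{3} e_i^* \otimes e_i^* = \sum_{i=1}^{3} (R e_i^*) \otimes e_i^* = \sum_{i=1}^{3} f_i^* \otimes e_i^*.
\end{aligned} \tag{1.135}$$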

When $F$ is invertible, the polar decomposition can be given in explicit form in terms of the eigenvalues $\lambda_1, \lambda_2, \lambda_3$ of $U$, and powers of $C$ or $B$. First, one determines the eigenvalues $\lambda_1^2, \lambda_2^2, \lambda_3^2$ of $C = F^T F$ (which are also the eigenvalues of $B = FF^T = RCR^T$), using Eqn. (1.126). Then, depending on whether the eigenvalues are repeated or distinct, the relevant expressions for the component matrices in the polar decomposition are ([126], [153], [326])

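All eigenvalues distinct ($\lambda_1 \ne \lambda_2 \ne \lambda_3 \ne \lambda_1$):

$$\begin{aligned}
R &= F\left[\frac{a_1}{a} I + \frac{a_2}{a} C + \frac{a_3}{a} C^2\right], \\
U^{-1} &= \frac{a_1}{a} I + \frac{a_2}{a} C + \frac{a_3}{a} C^2, \\
V^{-1} &= \frac{a_1}{a} I + \frac{a_2}{a} B + \frac{a_3}{a} B^2, \\
U &= \frac{b_1}{b} I + \frac{b_2}{b} C + \frac{b_3}{b} C^2, \\
V &= \frac{b_1}{b} I + \frac{b_2}{b} B + \frac{b_3}{b} B^2,
\end{aligned}$$

where

$$\begin{aligned}
a_1 &= (\lambda_1 + \lambda_2 + \lambda_3)(\lambda_1^2\lambda_2^2 + \lambda_2^2\lambda_3^2 + \lambda_1^2\lambda_3^2) + \lambda_1\lambda_2\lambda_3(\lambda_1^2 + \lambda_2^2 + \lambda_3^2 + \lambda_1\lambda_2 + \lambda_2\lambda_3 + \lambda_1\lambda_3), \\
a_2 &= -\left[(\lambda_1 + \lambda_2 + \lambda_3)(\lambda_1^2 + \lambda_2^2 + \lambda_3^2) + \lambda_1\lambda_2\lambda_3\right], \\
a_3 &= \lambda_1 + \lambda_2 + \lambda_3, \\
a &= \lambda_1\lambda_2\lambda_3\left[(\lambda_1 + \lambda_2 + \lambda_3)(\lambda_1\lambda_2 + \lambda_2\lambda_3 + \lambda_1\lambda_3) - \lambda_1\lambda_2\lambda_3\right], \\
b_1 &= \lambda_1\lambda_2\lambda_3(\lambda_1 + \lambda_2 + \lambda_3), \\
b_2 &= \lambda_1^2 + \lambda_2^2 + \lambda_3^2 + \lambda_1\lambda_2 + \lambda_2\lambda_3 + \lambda_1\lambda_3, \\
b_3 &= -1, \\
b &= (\lambda_1 + \lambda_2 + \lambda_3)(\lambda_1\lambda_2 + \lambda_2\lambda_3 + \lambda_1\lambda_3) - \lambda_1\lambda_2\lambda_3.
\end{aligned}$$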

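Two distinct eigenvalues ($\lambda_1 \ne \lambda_2 = \lambda_3$):

$$\begin{aligned}
R &= F\left[\frac{a_1}{a} I + \frac{a_2}{a} C\right], \\
U^{-1} &= \frac{a_1}{a} I + \frac{a_2}{a} C, \\
V^{-1} &= \frac{a_1}{a} I + \frac{a_2}{a} B, \\
U &= \frac{b_1}{b} I + \frac{b_2}{b} C, \\
V &= \frac{b_1}{b} I + \frac{b_2}{b} B,
\end{aligned}$$

where

$$\begin{aligned}
a_1 &= \lambda_1^2 + \lambda_1\lambda_2 + \lambda_2^2, & a_2 &= -1, & a &= \lambda_1\lambda_2(\lambda_1 + \lambda_2), \\
b_1 &= \lambda_1\lambda_2, & b_2 &= 1, & b &= \lambda_1 + \lambda_2.
\end{aligned}$$

All eigenvalues repeated ($\lambda_1 = \lambda_2 = \lambda_3 \equiv \lambda$):

$$R = \frac{1}{\lambda} F, \qquad U = V = \lambda I.$$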
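Because $C = U^2$ shares its eigenvectors with $U$, each of the explicit formulas above reduces to a scalar identity on every eigenvalue: $(b_1 + b_2\lambda^2 + b_3\lambda^4)/b = \lambda$ and $(a_1 + a_2\lambda^2 + a_3\lambda^4)/a = 1/\lambda$ in the distinct case. The following sketch evaluates the residuals of these two identities; the function names and the sample eigenvalues are assumptions made for illustration.

```python
def distinct_case_coefficients(l1, l2, l3):
    """Coefficients (a1, a2, a3, a) and (b1, b2, b3, b) for distinct eigenvalues of U."""
    s1 = l1 + l2 + l3                      # lambda_1 + lambda_2 + lambda_3
    s2 = l1 * l2 + l2 * l3 + l1 * l3       # sum of pairwise products
    s3 = l1 * l2 * l3                      # product of the eigenvalues
    sq = l1 ** 2 + l2 ** 2 + l3 ** 2       # sum of squares
    a1 = s1 * (l1**2 * l2**2 + l2**2 * l3**2 + l1**2 * l3**2) + s3 * (sq + s2)
    a2 = -(s1 * sq + s3)
    a3 = s1
    a = s3 * (s1 * s2 - s3)
    b1 = s3 * s1
    b2 = sq + s2
    b3 = -1.0
    b = s1 * s2 - s3
    return (a1, a2, a3, a), (b1, b2, b3, b)

def max_residual(l1, l2, l3):
    """Largest violation of the scalar identities for U and U^{-1}."""
    (a1, a2, a3, a), (b1, b2, b3, b) = distinct_case_coefficients(l1, l2, l3)
    worst = 0.0
    for lam in (l1, l2, l3):
        c = lam ** 2                               # eigenvalue of C = U^2
        u = (b1 + b2 * c + b3 * c * c) / b         # should equal lam
        u_inv = (a1 + a2 * c + a3 * c * c) / a     # should equal 1 / lam
        worst = max(worst, abs(u - lam), abs(u_inv - 1.0 / lam))
    return worst

residual_example = max_residual(1.0, 2.0, 3.0)
```

A residual at round-off level for positive, distinct eigenvalues is what the formulas predict; the same check with $V$ and $B$ is identical, since $B$ has the same eigenvalues as $C$.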

Rates of the stretch and rotation tensors are discussed in Section 1.9.4.

1.6.3 Isotropic functions

We shall frequently make use of the following theorem:

Theorem 1.6.7. Let $X$, $Y$ and $Z$ be normed vector spaces, and let $f : X \to Z$ and $g : X \to Y$ be two mappings such that

$$x, x' \in X \text{ and } g(x) = g(x') \implies f(x) = f(x').$$

Then, there exists a mapping $h : g(X) \subset Y \to Z$ such that

$$f(x) = h(g(x)) \quad \forall x \in X.$$

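A scalar function $\phi : V \to \Re$ is isotropic if $\phi(v) = \phi(Qv)$ for all $v \in V$ and $Q \in \mathrm{Orth}^+$. We have the following characterization:

Theorem 1.6.8. $\phi$ is isotropic if and only if there exists a function $\tilde{\phi}$ such that

$$\phi(v) = \tilde{\phi}(|v|) \quad \forall v \in V.$$

Proof. If $\tilde{\phi}$ exists, then

$$\phi(Qv) = \tilde{\phi}(|Qv|) = \tilde{\phi}(|v|) = \phi(v) \quad \forall v \in V.$$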

To prove the converse, we need to show that $|u| = |v|$ implies $\phi(u) = \phi(v)$, and then the result follows from Theorem 1.6.7. As noted after the proof of Theorem 1.5.2, if $|u| = |v|$, there exists a proper orthogonal tensor that rotates $u$ into $v$, i.e.,

$$v = Qu.$$

Now by the assumed isotropy, we have

$$\phi(v) = \phi(Qu) = \phi(u).$$

A vector function $g : V \to V$ is isotropic if $Qg(v) = g(Qv)$ for all $v \in V$ and $Q \in \mathrm{Orth}^+$. We have the following characterization:

Theorem 1.6.9. $g$ is isotropic if and only if there exists a function $\phi$ such that

$$g(v) = \phi(|v|)\,v \quad \forall v \in V.$$

Proof. If the function $\phi$ exists, then

$$g(Qv) = \phi(|Qv|)\,Qv = \phi(|v|)\,Qv = Qg(v).$$

To prove the converse, let $v$ be an arbitrary vector, and let $Q \ne I$ be a proper orthogonal tensor with $v/|v|$ as its axis so that $Qv = v$. By the assumed isotropy, we have

$$Qg(v) = g(Qv) = g(v).$$

Since the axis of $Q$ is a one-dimensional subspace, $g(v)$ is parallel to $v$, i.e., $g(v) = \alpha(v)\,v$. Again by the assumed isotropy, $\alpha(v) = \alpha(Qv)$, and the desired result follows by Theorem 1.6.8.

In what follows, we restrict the domain of the functions under consideration to a subset of Sym, and denote the list of principal invariants $(I_1, I_2, I_3)$ by $\mathcal{I}$. A scalar function $\phi : \mathrm{Psym} \to \Re$ is an isotropic function if it remains invariant under orthogonal transformations, i.e.,

$$\phi(S) = \phi(QSQ^T) \quad \forall S \in \mathrm{Psym},\ Q \in \mathrm{Orth}^+. \tag{1.136}$$

An isotropic tensor function $G : \mathrm{Psym} \to \mathrm{Sym}$ is one that satisfies

$$QG(S)Q^T = G(QSQ^T) \quad \forall S \in \mathrm{Psym},\ Q \in \mathrm{Orth}^+.$$

Although we have restricted $S$ to be in Psym in the above definitions, all the following results will hold even if $S$ lies in any subset $\mathcal{A}$ of Sym that is invariant under $\mathrm{Orth}^+$, i.e., if

$$QAQ^T \in \mathcal{A} \text{ when } A \in \mathcal{A},\ Q \in \mathrm{Orth}^+.$$

This condition is imposed in order that $\phi(QSQ^T)$ and $G(QSQ^T)$ make sense. As is obvious from the above definition, Sym itself is invariant under $\mathrm{Orth}^+$. To show that Psym is invariant under $\mathrm{Orth}^+$, let $S \in \mathrm{Psym}$. By Theorem 1.6.4, all the eigenvalues of $S$ are strictly positive. The following theorem shows that the eigenvalues of $QSQ^T$ are the same as the eigenvalues of $S$. Then, again appealing to Theorem 1.6.4, we conclude that $QSQ^T \in \mathrm{Psym}$.

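Theorem 1.6.10. The list of principal invariants $\mathcal{I}$ is isotropic, i.e.,

$$\mathcal{I}_S = \mathcal{I}_{QSQ^T} \quad \forall S \in \mathrm{Psym},\ Q \in \mathrm{Orth}^+. \tag{1.137}$$

Proof. Note that

$$\begin{aligned}
\det(QSQ^T - \lambda I) &= \det(QSQ^T - \lambda QQ^T) \\
&= \det[Q(S - \lambda I)Q^T] \\
&= (\det Q)(\det Q^T)\det(S - \lambda I) \\
&= [\det(QQ^T)]\det(S - \lambda I) \\
&= \det(S - \lambda I).
\end{aligned}$$

Thus, the characteristic equation for $QSQ^T$ and $S$ is the same, leading to the same principal invariants (and hence, also to the same eigenvalues).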
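Theorem 1.6.10 is easy to confirm numerically. The sketch below computes $I_1 = \operatorname{tr} S$, $I_2 = \tfrac{1}{2}[(\operatorname{tr} S)^2 - \operatorname{tr} S^2]$ and $I_3 = \det S$ for a sample symmetric matrix and for its rotated counterpart $QSQ^T$. The sample matrix, the rotation angle and all function names are assumptions chosen for illustration.

```python
import math

def matmul(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def det3(A):
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

def principal_invariants(S):
    """Return (I1, I2, I3) of a 3x3 matrix."""
    tr_S = sum(S[i][i] for i in range(3))
    S2 = matmul(S, S)
    tr_S2 = sum(S2[i][i] for i in range(3))
    return (tr_S, 0.5 * (tr_S ** 2 - tr_S2), det3(S))

def rotation_about_z(theta):
    """Proper orthogonal matrix for a rotation by theta about the third axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

# Hypothetical symmetric, positive definite sample.
S = [[4.0, 1.0, 0.5],
     [1.0, 3.0, 0.2],
     [0.5, 0.2, 2.0]]
Q = rotation_about_z(0.7)
S_rotated = matmul(matmul(Q, S), transpose(Q))

invariants_S = principal_invariants(S)
invariants_rotated = principal_invariants(S_rotated)
```

The two tuples should agree to round-off, although the individual components of `S` and `S_rotated` differ.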

Theorem 1.6.11. The tensors $A, B \in \mathrm{Sym}$ have the same principal invariants if and only if there exists a proper orthogonal tensor $Q$ such that $A = QBQ^T$.

Proof. If there exists $Q \in \mathrm{Orth}^+$ such that $A = QBQ^T$, then by Eqn. (1.137) (which also holds when $S \in \mathrm{Sym}$), $A$ and $B$ have the same principal invariants. Conversely, if $A$ and $B$ have the same principal invariants, then by Eqn. (1.113) they have the same eigenvalues. By the spectral resolution theorem, we can write

$$A = \sum_i \lambda_i\, e_i^* \otimes e_i^*, \qquad B = \sum_i \lambda_i\, e_i \otimes e_i,$$

where $e_i^*$ and $e_i$ here denote orthonormal eigenvectors of $A$ and $B$, respectively. Since $e_i^*$ and $e_i$ are orthonormal sets of vectors, by Eqn. (1.95) there exists $Q = e_k^* \otimes e_k \in \mathrm{Orth}^+$ such that $e_i^* = Qe_i$. Using Eqn. (1.344), we now have

$$QBQ^T = \sum_i \lambda_i\, Q(e_i \otimes e_i)Q^T = \sum_i \lambda_i\, (Qe_i) \otimes (Qe_i) = \sum_i \lambda_i\, e_i^* \otimes e_i^* = A.$$

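Theorem 1.6.12. A function $\phi : \mathrm{Psym} \to \Re$ is isotropic if and only if there exists a function $\tilde{\phi} : \mathcal{I}(S) \to \Re$ such that

$$\phi(S) = \tilde{\phi}(\mathcal{I}_S) \quad \forall S \in \mathrm{Psym}. \tag{1.138}$$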



Proof. Assuming Eqn. (1.138) holds, we conclude that $\phi$ is isotropic from Eqn. (1.137).

To prove the converse, we need to show that

$$\mathcal{I}_A = \mathcal{I}_B \implies \phi(A) = \phi(B),$$

for $A, B \in \mathrm{Psym}$, and then the desired result follows from Theorem 1.6.7. Since $A$ and $B$ have the same invariants, by Theorem 1.6.11 there exists $Q \in \mathrm{Orth}^+$ such that $A = QBQ^T$. Since $\phi$ is isotropic,

$$\phi(A) = \phi(QBQ^T) = \phi(B).$$

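The generalization of the above theorem to arbitrary tensors is given by the following [246, 294]:

Theorem 1.6.13. A function $\phi : \mathrm{Lin} \to \Re$ is isotropic if and only if there exists a function $\tilde{\phi}$ such that

$$\phi(T) = \tilde{\phi}\left(\operatorname{tr} T,\ \operatorname{tr} T^2,\ \operatorname{tr} TT^T\right) \quad \forall T \in \mathrm{Lin} \quad (n = 2),$$

$$\phi(T) = \tilde{\phi}\left(\operatorname{tr} T,\ \operatorname{tr} T^2,\ \operatorname{tr} T^3,\ \operatorname{tr} TT^T,\ \operatorname{tr} T^2T^T,\ \operatorname{tr} T^2(T^T)^2,\ \operatorname{tr} TT^TT^2(T^T)^2\right) \quad \forall T \in \mathrm{Lin} \quad (n = 3).$$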

Now we state and prove the Rivlin–Ericksen representation theorem for isotropic tensor functions (the generalization to an isotropic symmetric tensor-valued function of two symmetric tensors is stated in Eqn. (7.56)).

Theorem 1.6.14 (Rivlin–Ericksen Representation Theorem). A function $G : \mathrm{Psym} \to \mathrm{Sym}$ is isotropic if and only if there exist scalar functions $\phi_0, \phi_1, \phi_2 : \mathcal{I}(S) \to \Re$, such that

$$G(S) = \phi_0(\mathcal{I}_S)\,I + \phi_1(\mathcal{I}_S)\,S + \phi_2(\mathcal{I}_S)\,S^2 \quad \forall S \in \mathrm{Psym}. \tag{1.139}$$

Proof. Assume Eqn. (1.139) holds. Choose $S \in \mathrm{Psym}$ and $Q \in \mathrm{Orth}^+$. By Eqn. (1.137), we have

$$\begin{aligned}
G(QSQ^T) &= \phi_0(\mathcal{I}_{QSQ^T})\,I + \phi_1(\mathcal{I}_{QSQ^T})\,QSQ^T + \phi_2(\mathcal{I}_{QSQ^T})\,QSQ^TQSQ^T \\
&= \phi_0(\mathcal{I}_S)\,QQ^T + \phi_1(\mathcal{I}_S)\,QSQ^T + \phi_2(\mathcal{I}_S)\,QS^2Q^T \\
&= QG(S)Q^T,
\end{aligned}$$

which proves that $G$ is isotropic.

To prove the converse, assume that $G$ is isotropic, i.e., $G(QSQ^T) = QG(S)Q^T$ for all $Q \in \mathrm{Orth}^+$. We first prove that every eigenvector of $S$ is an eigenvector of $G(S)$, or, alternatively, any matrix that diagonalizes $S$ also diagonalizes $G(S)$. Let the spectral resolution of $S$ be given by $S = \sum_i \lambda_i\, e_i^* \otimes e_i^*$. If we form a matrix $Q$ with the eigenvectors of $S$ along its rows, i.e., if $Q = e_k \otimes e_k^*$, then by Theorem 1.6.2, $Q$ diagonalizes $S$, and is proper orthogonal. Let

$$Q_1 = e_1 \otimes e_1 - e_2 \otimes e_2 - e_3 \otimes e_3, \qquad Q_2 = -e_1 \otimes e_1 + e_2 \otimes e_2 - e_3 \otimes e_3.$$

Then, we see that $Q_1$, $Q_2$, $Q_1Q$ and $Q_2Q$ are all proper orthogonal tensors. In addition,

$$Q_\beta QSQ^TQ_\beta^T = \sum_i \lambda_i\, e_i \otimes e_i = QSQ^T, \quad \beta = 1, 2 \quad \text{(no sum on $\beta$)}.$$

By the assumed property of isotropy, we have

$$Q_\beta QG(S)Q^TQ_\beta^T = G(Q_\beta QSQ^TQ_\beta^T) = G(QSQ^T) = QG(S)Q^T, \quad \beta = 1, 2. \tag{1.140}$$

A straightforward computation shows that the above equations imply that $QG(S)Q^T$ is diagonal, or, in other words, again appealing to Theorem 1.6.2, the eigenvectors of $S$ and $G(S)$ are the same. Hence

$$G(S) = \mu_1\, e_1^* \otimes e_1^* + \mu_2\, e_2^* \otimes e_2^* + \mu_3\, e_3^* \otimes e_3^*, \tag{1.141}$$

where $\mu_i$ are the eigenvalues of $G(S)$.

Let us now establish that $G(S)$ can be written as

$$G(S) = \alpha_0(S)\,I + \alpha_1(S)\,S + \alpha_2(S)\,S^2 \quad \forall S \in \mathrm{Psym}, \tag{1.142}$$

where $\alpha_0$, $\alpha_1$, $\alpha_2$ are real-valued functions of $S$. Assume first that $S$ has three distinct eigenvalues $\lambda_i$, with associated eigenvectors $e_i^*$. Then the two sets $\{I, S, S^2\}$ and $\{e_1^* \otimes e_1^*, e_2^* \otimes e_2^*, e_3^* \otimes e_3^*\}$ span the same subspace of the vector space of symmetric tensors. To see this, we first observe that $e_1^* \otimes e_1^*$, $e_2^* \otimes e_2^*$ and $e_3^* \otimes e_3^*$ are symmetric tensors. They are also linearly independent, since if

$$\alpha\, e_1^* \otimes e_1^* + \beta\, e_2^* \otimes e_2^* + \gamma\, e_3^* \otimes e_3^* = 0,$$

then carrying out the operation $e_1^* \cdot (\ )e_1^*$, where $(\ )$ denotes the left- and right-hand sides of the above equation, and using the orthogonality of $e_1^*$, $e_2^*$ and $e_3^*$, we get $\alpha = 0$; similarly, the operations $e_2^* \cdot (\ )e_2^*$ and $e_3^* \cdot (\ )e_3^*$ yield $\beta = 0$ and $\gamma = 0$. Thus, $\mathrm{Lsp}\{e_1^* \otimes e_1^*, e_2^* \otimes e_2^*, e_3^* \otimes e_3^*\}$ is a three-dimensional subspace of the vector space of symmetric tensors, and, hence, is a vector space itself. Since, by Eqns. (1.38) and (1.125),

$$\begin{aligned}
I &= e_1^* \otimes e_1^* + e_2^* \otimes e_2^* + e_3^* \otimes e_3^*, \\
S &= \lambda_1\, e_1^* \otimes e_1^* + \lambda_2\, e_2^* \otimes e_2^* + \lambda_3\, e_3^* \otimes e_3^*, \\
S^2 &= \lambda_1^2\, e_1^* \otimes e_1^* + \lambda_2^2\, e_2^* \otimes e_2^* + \lambda_3^2\, e_3^* \otimes e_3^*,
\end{aligned}$$

and since the Vandermonde determinant

$$\det \begin{bmatrix} 1 & 1 & 1 \\ \lambda_1 & \lambda_2 & \lambda_3 \\ \lambda_1^2 & \lambda_2^2 & \lambda_3^2 \end{bmatrix} = (\lambda_1 - \lambda_2)(\lambda_2 - \lambda_3)(\lambda_3 - \lambda_1) \ne 0,$$

by virtue of the eigenvalues being assumed to be distinct, Theorem 1.1.2 yields the result

$$\mathrm{Lsp}\{e_1^* \otimes e_1^*, e_2^* \otimes e_2^*, e_3^* \otimes e_3^*\} = \mathrm{Lsp}\{I, S, S^2\}.$$

Hence, Eqn. (1.141) can be written in the form given by Eqn. (1.142).

Assume, next, that the tensor $S$ has a double eigenvalue, say $\lambda_2 = \lambda_3 \ne \lambda_1$. Then, the two sets $\{I, S\}$ and $\{e_1^* \otimes e_1^*, e_2^* \otimes e_2^* + e_3^* \otimes e_3^*\}$ span the same subspace, since, in this case, we can write

$$\begin{aligned}
I &= e_1^* \otimes e_1^* + (e_2^* \otimes e_2^* + e_3^* \otimes e_3^*), \\
S &= \lambda_1\, e_1^* \otimes e_1^* + \lambda_2\, (e_2^* \otimes e_2^* + e_3^* \otimes e_3^*),
\end{aligned}$$

and

$$\det \begin{bmatrix} 1 & 1 \\ \lambda_1 & \lambda_2 \end{bmatrix} = \lambda_2 - \lambda_1 \ne 0.$$

All the vectors in the subspace spanned by $e_2^*$ and $e_3^*$ are eigenvectors of $S$, and hence those of $G(S)$ as already proved. Thus,

$$G(S)e_2^* = \mu_2 e_2^*, \qquad G(S)e_3^* = \mu_3 e_3^*, \qquad G(S)(e_2^* + e_3^*) = \mu(e_2^* + e_3^*),$$

which implies that $\mu = \mu_2 = \mu_3$. Therefore, we can write

$$G(S) = \mu_1\, e_1^* \otimes e_1^* + \mu\, (e_2^* \otimes e_2^* + e_3^* \otimes e_3^*)$$

as

$$G(S) = \alpha_0(S)\,I + \alpha_1(S)\,S.$$

Assume, finally, that $S$ has three repeated eigenvalues. Now, the eigenvalues of $G(S)$ are also all repeated (proved in the same way as in the previous case), and all unit vectors are eigenvectors of $S$, and hence also of $G(S)$. Thus, by using the spectral resolution of $G(S)$, we conclude that it is a multiple of $I$, i.e.,

$$G(S) = \alpha_0(S)\,I,$$

proving the assertion in all cases. Note from the above discussion that $\alpha_2(S) = 0$ when there are two repeated eigenvalues, and $\alpha_1(S) = \alpha_2(S) = 0$ when all eigenvalues are identical.



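It remains to show that the functions $\alpha_0$, $\alpha_1$ and $\alpha_2$ are only functions of $\mathcal{I}_S$. By virtue of Theorem 1.6.12, it suffices to prove that $\alpha_0$, $\alpha_1$ and $\alpha_2$ are isotropic functions. This follows from the isotropy of $G(S)$:

$$\begin{aligned}
G(QSQ^T) &= Q\left[\alpha_0(QSQ^T)\,I + \alpha_1(QSQ^T)\,S + \alpha_2(QSQ^T)\,S^2\right]Q^T \\
&= QG(S)Q^T = Q\left[\alpha_0(S)\,I + \alpha_1(S)\,S + \alpha_2(S)\,S^2\right]Q^T \quad \forall S \in \mathrm{Psym},\ Q \in \mathrm{Orth}^+,
\end{aligned}$$

which implies that

$$\alpha_i(QSQ^T) = \alpha_i(S) \quad \forall S \in \mathrm{Psym},\ Q \in \mathrm{Orth}^+; \quad i = 0, 1, 2,$$

again by the uniqueness of the expansion of $G(S)$ in the spaces spanned by $\{I, S, S^2\}$, $\{I, S\}$ or $\{I\}$ according to which case is being considered. Thus, by Theorem 1.6.12, there exist functions $\phi_i(\mathcal{I}_S)$ such that

$$\alpha_i(S) = \phi_i(\mathcal{I}_S) \quad \forall S \in \mathrm{Psym}; \quad i = 0, 1, 2.$$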

Note that the above theorem does not say that an isotropic function is a quadratic function of the elements of $S$, since the functions $\phi_i(\mathcal{I}_S)$, $i = 0, 1, 2$, can be arbitrary.
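A simple instance of this remark, offered here as an illustration rather than as an example from the text, is the isotropic function $G(S) = S^{-1}$ on Psym. Its representation follows from the Cayley–Hamilton theorem, and its coefficients are rational rather than polynomial functions of the invariants:

```latex
S^3 - I_1 S^2 + I_2 S - I_3 I = 0
\quad\Longrightarrow\quad
S^{-1} = \frac{I_2}{I_3}\, I - \frac{I_1}{I_3}\, S + \frac{1}{I_3}\, S^2,
\qquad\text{i.e.,}\qquad
\phi_0 = \frac{I_2}{I_3}, \quad \phi_1 = -\frac{I_1}{I_3}, \quad \phi_2 = \frac{1}{I_3}.
```

The division by $I_3$ is legitimate because $I_3 > 0$ for $S \in \mathrm{Psym}$, and isotropy is immediate since $(QSQ^T)^{-1} = QS^{-1}Q^T$.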

For linear functions, we have the following simpler result:

Theorem 1.6.15 (Representation Theorem for Linear Isotropic Tensor Functions). A linear function $G(S) : \mathrm{Sym} \to \mathrm{Sym}$ is isotropic if and only if there exist scalars $\mu$ and $\lambda$ such that

$$G(S) = \lambda(\operatorname{tr} S)\,I + 2\mu S \quad \forall S \in \mathrm{Sym}.$$

Proof. Although one could try and specialize the result from the previous theorem to the present case, it is easier to proceed directly; the following short proof is due to Gurtin [108]. Let $V_1$ be the set of all unit vectors. For $e \in V_1$, the tensor $e \otimes e$ has eigenvalues $(0, 0, 1)$ (since there exists $Q \in \mathrm{Orth}^+$ such that $e = Qe_1$, which implies that $e \otimes e = Q(e_1 \otimes e_1)Q^T$, so that by Eqn. (1.137), the principal invariants, and hence the eigenvalues, of $e \otimes e$ and $e_1 \otimes e_1$ are the same). Since two eigenvalues of $e \otimes e$ are the same, analogous to the case of repeated eigenvalues in the previous theorem's proof, $G(e \otimes e)$ also has a repeated eigenvalue, and thus by Eqn. (1.123) we have

$$G(e \otimes e) = \lambda(e)\,I + 2\mu(e)\,e \otimes e \quad \forall e \in V_1. \tag{1.143}$$

Choose $e, f \in V_1$, and let $Q$ be a proper orthogonal tensor such that $Qe = f$. Since

$$Q(e \otimes e)Q^T = f \otimes f,$$

and since $G$ is isotropic,

$$0 = QG(e \otimes e)Q^T - G(f \otimes f) = [\lambda(e) - \lambda(f)]\,I + 2[\mu(e) - \mu(f)]\,f \otimes f.$$



By multiplying both sides of $\alpha I + \beta\, f \otimes f = 0$ by a vector $g$ perpendicular to $f$, it follows that $\alpha = \beta = 0$, implying in turn that $I$ and $f \otimes f$ are linearly independent; thus,

$$\lambda(e) = \lambda(f), \qquad \mu(e) = \mu(f).$$

Therefore, $\lambda$ and $\mu$ are scalar constants, and we conclude from Eqn. (1.143) that

$$G(e \otimes e) = \lambda I + 2\mu\, e \otimes e. \tag{1.144}$$

The spectral resolution of $S \in \mathrm{Sym}$ is

$$S = \sum_i \alpha_i\, e_i^* \otimes e_i^*,$$

with $e_i^*$ orthonormal; therefore, in view of Eqn. (1.144), and the linearity of $G$,

$$G(S) = \sum_i \alpha_i\, G(e_i^* \otimes e_i^*) = \lambda(\alpha_1 + \alpha_2 + \alpha_3)\,I + 2\mu S.$$

Since $\operatorname{tr} S = \alpha_1 + \alpha_2 + \alpha_3$, the above equation implies that

$$G(S) = \lambda(\operatorname{tr} S)\,I + 2\mu S.$$

The converse assertion that the above representation is an isotropic function is left as an exercise (see Problem 37).
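The converse can at least be spot-checked numerically before attempting the exercise. The sketch below evaluates $G(QSQ^T)$ and $QG(S)Q^T$ for $G(S) = \lambda(\operatorname{tr} S)I + 2\mu S$; the values of `lam` and `mu`, the sample matrix and the rotation are arbitrary assumptions, and a numerical check of this kind is of course not a proof.

```python
import math

def matmul(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def conjugate(Q, S):
    """Return Q S Q^T."""
    return matmul(matmul(Q, S), transpose(Q))

def linear_isotropic(S, lam, mu):
    """G(S) = lam (tr S) I + 2 mu S."""
    tr_S = sum(S[i][i] for i in range(3))
    return [[lam * tr_S * (1.0 if i == j else 0.0) + 2.0 * mu * S[i][j]
             for j in range(3)] for i in range(3)]

def rotation_about_z(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rotation_about_x(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

# Hypothetical data: a symmetric S, a composite rotation Q, and two constants.
S = [[1.0, 0.3, -0.2],
     [0.3, 2.0, 0.5],
     [-0.2, 0.5, -1.0]]
Q = matmul(rotation_about_z(0.4), rotation_about_x(1.1))
lam, mu = 1.5, 0.8

lhs = linear_isotropic(conjugate(Q, S), lam, mu)   # G(Q S Q^T)
rhs = conjugate(Q, linear_isotropic(S, lam, mu))   # Q G(S) Q^T
max_difference = max(abs(lhs[i][j] - rhs[i][j])
                     for i in range(3) for j in range(3))
```

`max_difference` should be at round-off level, reflecting $\operatorname{tr}(QSQ^T) = \operatorname{tr} S$ and $QIQ^T = I$.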

1.7 Higher-Order Tensors

A detailed discussion about the decomposition of higher-order tensors into irreducible tensors, and their eigenvalues and eigentensors can be found in [8], [150], [151], [260], [304] and [360].

Similar to the definition of a second-order tensor, a third-order tensor, which we represent by $B$, is defined as a linear transformation that transforms a vector $v$ to a second-order tensor $T$, i.e.,

$$Bv = T. \tag{1.145}$$

In addition, the map is linear, i.e.,

$$B(av_1 + bv_2) = aBv_1 + bBv_2 \quad \forall a, b \in \Re \text{ and } v_1, v_2 \in V.$$

In order to find the representation for a third-order tensor in terms of its components, we define the dyadic or tensor product of three vectors $a$, $b$ and $c$ as

$$(a \otimes b \otimes c)d := (a \otimes b)(c \cdot d) \quad \forall d \in V. \tag{1.146}$$

Since $a \otimes b$ is a second-order tensor, and since the above transformation is linear, the tensor product of three vectors is a linear transformation that transforms vectors into second-order tensors, and hence is a third-order tensor. Just as a second-order tensor can be expressed as the sum of 9 dyads (Eqn. (1.37)), a third-order tensor can be expressed as the sum of 27 dyads as follows:

$$B = B_{ijk}\, e_i \otimes e_j \otimes e_k, \tag{1.147}$$

where

$$B_{ijk} = e_i \cdot [(Be_k)e_j]. \tag{1.148}$$

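In order to prove this, we consider the action of $B$ on an arbitrary vector $u$:

$$\begin{aligned}
Bu &= (Bu)_{ij}\,(e_i \otimes e_j) \\
&= \{e_i \cdot [(Bu)e_j]\}\,(e_i \otimes e_j) \\
&= \{e_i \cdot [B(u_k e_k)]e_j\}\,(e_i \otimes e_j) \\
&= \{e_i \cdot [B((e_k \cdot u)e_k)]e_j\}\,(e_i \otimes e_j) \\
&= \{e_i \cdot [(Be_k)e_j]\}\,(e_k \cdot u)\,(e_i \otimes e_j) \\
&= \{e_i \cdot [(Be_k)e_j]\}\,[(e_i \otimes e_j \otimes e_k)u].
\end{aligned}$$

Since $u$ is arbitrary, we have proved that the tensor $B$ can be represented by Eqn. (1.147), with the components $B_{ijk}$ of this tensor given by Eqn. (1.148). Note that

$$Be_m = B_{ijm}\, e_i \otimes e_j.$$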

In [360], using the identity $\epsilon_{pjr}\epsilon_{pbr} = 2\delta_{jb}$, the authors write

$$B_{ijk} = \frac{1}{2}\epsilon_{pjr}\epsilon_{pbr}B_{ibk},$$

then, assuming all eigenvalues to be distinct, use the spectral resolution of a fourth-order tensor to write $B_{ibk}\epsilon_{pbr}$ in the form $\sum_{m=1}^{9} \lambda_m (P_m)_{ipkr}$, from which it follows that

$$B_{ijk} = \sum_{m=1}^{9} \lambda_m \left[\frac{1}{2}\epsilon_{pjr}(P_m)_{ipkr}\right]. \tag{1.149}$$

Based on this representation, the authors claim that the $\lambda_i$'s (not all independent, since $B_{ibi}\epsilon_{pbp} = \sum_{m=1}^{9} \lambda_m = 0$) are eigenvalues of $B$. However, since the eigenvalue problem for a third-order tensor is never defined in this work, it is not clear how the eigenvalues of the fourth-order tensor are also the eigenvalues of the third-order tensor.

In general, an $n$th-order tensor is defined as a linear transformation that transforms a vector to an $(n-1)$th-order tensor. It can be proved by induction that an $n$th-order tensor can be represented by the sum of $3^n$ dyads as

$$T = T_{i_1 i_2 i_3 \ldots i_n}\, e_{i_1} \otimes e_{i_2} \otimes e_{i_3} \otimes \cdots \otimes e_{i_n},$$

where

$$T_{i_1 i_2 i_3 \ldots i_n} = e_{i_1} \cdot \left[(((Te_{i_n})e_{i_{n-1}}) \cdots )e_{i_2}\right].$$



As in the case of second-order tensors, we define the sum of two $n$th-order tensors $S$ and $T$ as

$$(S + T)u := Su + Tu \quad \forall u \in V,$$

and the scalar multiple of $T$ by $\alpha \in \Re$ as

$$(\alpha T)u := \alpha(Tu) \quad \forall u \in V.$$

It is easily seen from the definition of a tensor as a linear transformation that $S + T$ and $\alpha T$ are $n$th-order tensors.

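The transformation law for third-order tensors can be deduced from Eqn. (1.148) as follows. With an overbar denoting components with respect to the transformed basis $\bar{e}_i = Q_{ij}e_j$,

$$\begin{aligned}
\bar{B}_{ijk} &= Q_{il}e_l \cdot [B(Q_{kn}e_n)]Q_{jm}e_m \\
&= Q_{il}Q_{jm}Q_{kn}\, e_l \cdot [(Be_n)e_m] \\
&= Q_{il}Q_{jm}Q_{kn}B_{lmn}.
\end{aligned}$$

The alternate tensor $E$ whose components are given by Eqn. (1.11) is an example of a third-order tensor. This can be proved by noting that

$$\begin{aligned}
\epsilon_{ijk} &= \bar{e}_i \cdot (\bar{e}_j \times \bar{e}_k) \\
&= Q_{il}Q_{jm}Q_{kn}\, e_l \cdot (e_m \times e_n) \\
&= Q_{il}Q_{jm}Q_{kn}\epsilon_{lmn},
\end{aligned}$$

which is nothing but the transformation law for the components of a third-order tensor. In fact, it is an isotropic third-order tensor, as we show in Section 1.8. Note that $[[Ew]v]u = [u, v, w]$.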
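The identity $[[Ew]v]u = [u, v, w]$ can be traced through components: $(Ew)_{ij} = \epsilon_{ijk}w_k$, then $((Ew)v)_i = \epsilon_{ijk}v_jw_k$, and contracting with $u$ gives $\epsilon_{ijk}u_iv_jw_k$. The sketch below carries out exactly these steps with zero-based indices; the helper names and the sample vectors are assumptions made for illustration.

```python
def levi_civita(i, j, k):
    """Permutation symbol for zero-based indices i, j, k in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

def apply_third_order(B, v):
    """(Bv)_ij = B_ijk v_k: a third-order tensor maps a vector to a second-order tensor."""
    return [[sum(B[i][j][k] * v[k] for k in range(3)) for j in range(3)]
            for i in range(3)]

def apply_second_order(T, v):
    """(Tv)_i = T_ij v_j."""
    return [sum(T[i][j] * v[j] for j in range(3)) for i in range(3)]

def dot(a, b):
    return sum(a[i] * b[i] for i in range(3))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

# Components of the alternate tensor.
E = [[[levi_civita(i, j, k) for k in range(3)] for j in range(3)]
     for i in range(3)]

# Hypothetical integer vectors, so the comparison below is exact.
u, v, w = [1, 2, 3], [0, 1, 4], [5, 6, 0]

successive_action = dot(u, apply_second_order(apply_third_order(E, w), v))
scalar_triple_product = dot(u, cross(v, w))
```

Both quantities are computed in integer arithmetic, so they can be compared with `==` directly.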

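Similarly, the transformation law for a fourth-order tensor, with an overbar denoting components with respect to the transformed basis, is

$$\bar{C}_{ijkl} = Q_{ip}Q_{jq}Q_{kr}Q_{ls}C_{pqrs}. \tag{1.150}$$

Generalizing the above arguments, the transformation law for an $n$th-order tensor is given by

$$\bar{T}_{i_1 i_2 i_3 \ldots i_n} = Q_{i_1 j_1}Q_{i_2 j_2}Q_{i_3 j_3} \cdots Q_{i_n j_n}\, T_{j_1 j_2 j_3 \ldots j_n}. \tag{1.151}$$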

We have seen how to construct fourth-order tensors as a dyadic product of vectors. However, an alternative that is very widely used is to construct them as a dyadic product of two second-order tensors. This procedure is analogous to constructing second-order tensors using the dyadic product of vectors. We now discuss this methodology, which should be compared with the one for second-order tensors presented in Section 1.3.

A fourth-order tensor is defined as a linear transformation that maps second-order tensors to second-order tensors. Thus, if $\mathbb{C}$ is a fourth-order tensor that maps a second-order tensor $A$ to a second-order tensor $B$, we write it as

$$B = \mathbb{C}A. \tag{1.152}$$



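The fourth-order tensor $\mathbb{C}$ satisfies the property $\mathbb{C}(\alpha A_1 + \beta A_2) = \alpha\mathbb{C}A_1 + \beta\mathbb{C}A_2$, for all $\alpha, \beta \in \Re$ and $A_1, A_2 \in \mathrm{Lin}$. The sum, scalar multiple and composition of fourth-order tensors are defined as

$$\begin{aligned}
(\mathbb{C} + \mathbb{D})A &= \mathbb{C}A + \mathbb{D}A \quad \forall A \in \mathrm{Lin}, \\
(\lambda\mathbb{C})A &= \lambda(\mathbb{C}A) \quad \forall A \in \mathrm{Lin}, \\
(\mathbb{C}\mathbb{D})A &= \mathbb{C}(\mathbb{D}A) \quad \forall A \in \mathrm{Lin}.
\end{aligned}$$

For presenting the component form of fourth-order tensors, it is convenient to introduce the notation $E_{ij} \equiv e_i \otimes e_j$. We have

$$T : E_{ij} = \operatorname{tr}(T(e_j \otimes e_i)) = \operatorname{tr}((Te_j) \otimes e_i) = T_{kj}\operatorname{tr}(e_k \otimes e_i) = T_{kj}\delta_{ki} = T_{ij}. \tag{1.153}$$

By letting $T \equiv E_{kl}$, we get

$$E_{ij} : E_{kl} = \delta_{ik}\delta_{jl}. \tag{1.154}$$

Analogous to the definition of the components of a second-order tensor (see Eqn. (1.27)), we now define the components via Eqn. (1.152) as

$$\mathbb{C}E_{kl} = C_{ijkl}E_{ij}. \tag{1.155}$$

Taking the dot product on either side with $E_{mn}$ and using Eqn. (1.154), we get $C_{mnkl} = E_{mn} : \mathbb{C}E_{kl}$, which, on replacing the indices $(m, n)$ with $(i, j)$, can be written as

$$C_{ijkl} = E_{ij} : \mathbb{C}E_{kl}. \tag{1.156}$$

Two tensors $\mathbb{C}$ and $\mathbb{D}$ are said to be equal if

$$\mathbb{C}B = \mathbb{D}B \quad \forall B \in \mathrm{Lin}, \tag{1.157}$$

or, alternatively, if

$$(A, \mathbb{C}B) = (A, \mathbb{D}B) \quad \forall A, B \in \mathrm{Lin}.$$

By choosing $A \equiv E_{ij}$ and $B \equiv E_{kl}$ in the above equation, we see that equal tensors have equal components.

Using Eqn. (1.155) and the linearity of fourth-order tensors, Eqn. (1.152) can be written as

$$B_{ij}E_{ij} = \mathbb{C}(A_{kl}E_{kl}) = A_{kl}\mathbb{C}E_{kl} = C_{ijkl}A_{kl}E_{ij},$$

which by virtue of the linear independence of the $E_{ij}$ implies that

$$B_{ij} = C_{ijkl}A_{kl}.$$

Similarly, we have

$$(\mathbb{C}\mathbb{D})_{ijkl} = C_{ijmn}D_{mnkl}.$$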

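We now give some examples of fourth-order tensors. The fourth-order identity tensor $\mathbb{I}$ is defined as one that maps a second-order tensor into itself. The transposer $\mathbb{T}$, symmetrizer $\mathbb{S}$ and skew-symmetrizer $\mathbb{W}$ are defined as ones that map a second-order tensor into its transpose, symmetric and skew-symmetric part, respectively, i.e.,

$$\begin{aligned}
\mathbb{I}A &= A \quad \forall A \in \mathrm{Lin}, \\
\mathbb{T}A &= A^T \quad \forall A \in \mathrm{Lin}, \\
\mathbb{S}A &= \tfrac{1}{2}(A + A^T) \quad \forall A \in \mathrm{Lin}, \\
\mathbb{W}A &= \tfrac{1}{2}(A - A^T) \quad \forall A \in \mathrm{Lin}.
\end{aligned}$$

Note that

$$\mathbb{S} = (\mathbb{I} + \mathbb{T})/2, \quad \mathbb{I} = \mathbb{S} + \mathbb{W}, \quad \mathbb{W} = (\mathbb{I} - \mathbb{T})/2, \quad \mathbb{T} = \mathbb{S} - \mathbb{W}. \tag{1.158}$$

We have

$$\mathbb{S}\mathbb{T} = \tfrac{1}{2}(\mathbb{I} + \mathbb{T})\mathbb{T} = \tfrac{1}{2}(\mathbb{T} + \mathbb{T}\mathbb{T}) = \tfrac{1}{2}(\mathbb{T} + \mathbb{I}) = \mathbb{S} = \mathbb{T}\mathbb{S}, \tag{1.159}$$

$$\begin{aligned}
\mathbb{W}\mathbb{T} &= \tfrac{1}{2}(\mathbb{I} - \mathbb{T})\mathbb{T} = \tfrac{1}{2}(\mathbb{T} - \mathbb{T}\mathbb{T}) = \tfrac{1}{2}(\mathbb{T} - \mathbb{I}) = -\mathbb{W} = \mathbb{T}\mathbb{W}, \\
\mathbb{S}\mathbb{S} &= \tfrac{1}{2}(\mathbb{I} + \mathbb{T})\mathbb{S} = \tfrac{1}{2}(\mathbb{S} + \mathbb{T}\mathbb{S}) = \tfrac{1}{2}(\mathbb{S} + \mathbb{S}) = \mathbb{S}, \\
\mathbb{W}\mathbb{W} &= \tfrac{1}{2}(\mathbb{I} - \mathbb{T})\mathbb{W} = \tfrac{1}{2}(\mathbb{W} - \mathbb{T}\mathbb{W}) = \tfrac{1}{2}(\mathbb{W} + \mathbb{W}) = \mathbb{W},
\end{aligned}$$

$$\mathbb{S}\mathbb{W} = \tfrac{1}{2}(\mathbb{I} + \mathbb{T})\mathbb{W} = \tfrac{1}{2}(\mathbb{W} + \mathbb{T}\mathbb{W}) = \tfrac{1}{2}(\mathbb{W} - \mathbb{W}) = 0 = \mathbb{W}\mathbb{S}. \tag{1.160}$$

The components of $\mathbb{I}$, $\mathbb{T}$, $\mathbb{S}$ and $\mathbb{W}$ are obtained using Eqn. (1.156), and are given by

$$\begin{aligned}
\mathbb{I}_{ijkl} &= \delta_{ik}\delta_{jl}, \\
\mathbb{T}_{ijkl} &= \delta_{il}\delta_{jk}, \\
\mathbb{S}_{ijkl} &= \tfrac{1}{2}(\delta_{ik}\delta_{jl} + \delta_{il}\delta_{jk}), \\
\mathbb{W}_{ijkl} &= \tfrac{1}{2}(\delta_{ik}\delta_{jl} - \delta_{il}\delta_{jk}).
\end{aligned} \tag{1.161}$$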
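The component forms in Eqn. (1.161), together with the rules $B_{ij} = C_{ijkl}A_{kl}$ and $(\mathbb{C}\mathbb{D})_{ijkl} = C_{ijmn}D_{mnkl}$, are enough to verify the products listed in Eqns. (1.159) and (1.160) by direct summation. The sketch below stores a fourth-order tensor as a nested list of its 81 components; all names are assumptions made for illustration.

```python
R3 = range(3)

def delta(i, j):
    """Kronecker delta."""
    return 1.0 if i == j else 0.0

def fourth_order(component):
    """Build the 3x3x3x3 component array from a function of (i, j, k, l)."""
    return [[[[component(i, j, k, l) for l in R3] for k in R3] for j in R3]
            for i in R3]

IDENTITY = fourth_order(lambda i, j, k, l: delta(i, k) * delta(j, l))
TRANSPOSER = fourth_order(lambda i, j, k, l: delta(i, l) * delta(j, k))
SYMMETRIZER = fourth_order(
    lambda i, j, k, l: 0.5 * (delta(i, k) * delta(j, l) + delta(i, l) * delta(j, k)))
SKEW_SYMMETRIZER = fourth_order(
    lambda i, j, k, l: 0.5 * (delta(i, k) * delta(j, l) - delta(i, l) * delta(j, k)))
ZERO = fourth_order(lambda i, j, k, l: 0.0)

def compose(C, D):
    """(CD)_ijkl = C_ijmn D_mnkl."""
    return fourth_order(
        lambda i, j, k, l: sum(C[i][j][m][n] * D[m][n][k][l] for m in R3 for n in R3))

def apply_fourth(C, A):
    """B_ij = C_ijkl A_kl."""
    return [[sum(C[i][j][k][l] * A[k][l] for k in R3 for l in R3) for j in R3]
            for i in R3]

def max_abs_difference(C, D):
    return max(abs(C[i][j][k][l] - D[i][j][k][l])
               for i in R3 for j in R3 for k in R3 for l in R3)

# Hypothetical second-order tensor used to exercise the transposer.
A = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
A_transposed = apply_fourth(TRANSPOSER, A)
```

For example, `max_abs_difference(compose(SYMMETRIZER, SKEW_SYMMETRIZER), ZERO)` evaluates the product in Eqn. (1.160), and `compose(TRANSPOSER, TRANSPOSER)` can be compared against `IDENTITY`.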

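Using the above component form of $\mathbb{S}$, we have

$$[\mathbb{S}\mathbb{A}\mathbb{S}]_{ijkl} = \tfrac{1}{4}\left[A_{ijkl} + A_{ijlk} + A_{jikl} + A_{jilk}\right], \tag{1.162}$$

where $\mathbb{A}$ is any fourth-order tensor.

The transpose of a fourth-order tensor $\mathbb{C}$, denoted by $\mathbb{C}^T$, is defined using the inner product as

$$(A, \mathbb{C}^T B) := (B, \mathbb{C}A) \quad \forall A, B \in \mathrm{Lin}. \tag{1.163}$$

The components of $\mathbb{C}^T$ are obtained by taking $A \equiv E_{ij}$ and $B \equiv E_{kl}$ in the above equation, and are given by

$$(\mathbb{C}^T)_{ijkl} = C_{klij}. \tag{1.164}$$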



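Similar to second-order tensors, one has for fourth-order tensors $\mathbb{C}$ and $\mathbb{D}$

$$(\mathbb{C}^T)^T = \mathbb{C}, \tag{1.165a}$$

$$(\mathbb{C}\mathbb{D})^T = \mathbb{D}^T\mathbb{C}^T. \tag{1.165b}$$

Analogous to the dyadic product of two vectors, the dyadic product of two tensors $A$ and $B$ can be defined as

$$(A \otimes B)C := (B : C)A \quad \forall C \in \mathrm{Lin}. \tag{1.166}$$

The component form of this dyadic product is

$$(A \otimes B)_{ijkl} = E_{ij} : (A \otimes B)E_{kl} = (A : E_{ij})(B : E_{kl}) = A_{ij}B_{kl}.$$

The transpose of $A \otimes B$ is $B \otimes A$ since

$$(C, (A \otimes B)^T D) = (D, (A \otimes B)C) = (B : C)(A : D) = (C, (B \otimes A)D).$$

By taking $C \equiv u \otimes v$ in Eqn. (1.166), we get

$$(A \otimes B)(u \otimes v) = (u \cdot Bv)A.$$

A fourth-order tensor $\mathbb{C}$ can be written as $C_{ijkl}\, E_{ij} \otimes E_{kl}$ since, for an arbitrary $A \in \mathrm{Lin}$, we have

$$\begin{aligned}
\mathbb{C}A &= (\mathbb{C}A)_{ij}E_{ij} \\
&= [(\mathbb{C}A) : E_{ij}]E_{ij} \\
&= A_{kl}[E_{ij} : (\mathbb{C}E_{kl})]E_{ij} \\
&= C_{ijkl}(A : E_{kl})E_{ij} \\
&= C_{ijkl}[E_{ij} \otimes E_{kl}]A.
\end{aligned}$$

Also analogous to the results for dyadic products of vectors, we have

$$\mathbb{C}(A \otimes B) = (\mathbb{C}A) \otimes B, \qquad (A \otimes B)\mathbb{C} = A \otimes (\mathbb{C}^T B).$$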

Another tensor product that is of great use is the "square tensor product", denoted here by $\boxtimes$ and defined as

$$(A \boxtimes B)C := ACB^T \quad \forall C \in \mathrm{Lin}.$$

The components of $A \boxtimes B$ are given by

$$\begin{aligned}
(A \boxtimes B)_{ijkl} &= E_{ij} : (A \boxtimes B)E_{kl} = E_{ij} : (AE_{kl}B^T) = (e_i \otimes e_j) : [(Ae_k) \otimes (Be_l)] \\
&= (e_i \cdot Ae_k)(e_j \cdot Be_l) = A_{ik}B_{jl}.
\end{aligned}$$

Since $(I \boxtimes I)A = IAI^T = A$ for all $A \in \mathrm{Lin}$, it is clear from Eqn. (1.157) that the fourth-order identity tensor $\mathbb{I}$ can be written as

$$\mathbb{I} = I \boxtimes I, \tag{1.167}$$

with components $(\mathbb{I})_{ijkl} = \delta_{ik}\delta_{jl}$.

The transpose of $A \boxtimes B$ is given by $A^T \boxtimes B^T$ since

$$(C, (A \boxtimes B)^T D) = (D, (A \boxtimes B)C) = (D, ACB^T) = (C, A^T DB) = (C, (A^T \boxtimes B^T)D).$$

From the definition of the square tensor product, it follows that

$$(H \boxtimes I)A = HA, \qquad (I \boxtimes H^T)A = AH.$$

As an example of the use of these equations, note that the tensor equation $HA + AH = 0$ can be written as $(H \boxtimes I + I \boxtimes H^T)A = 0$.

The composition rule is given by

$$(A \boxtimes B)(C \boxtimes D) = (AC) \boxtimes (BD), \tag{1.168}$$

since

$$(A \boxtimes B)(C \boxtimes D)E = (A \boxtimes B)(CED^T) = ACED^TB^T = [(AC) \boxtimes (BD)]E \quad \forall E \in \mathrm{Lin}.$$

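Some additional properties are

$$(A \otimes B)(C \boxtimes D) = A \otimes (C^TBD), \tag{1.169}$$

$$(A \boxtimes B)(C \otimes D) = (ACB^T) \otimes D, \tag{1.170}$$

$$(A \boxtimes B)(u \otimes v) = (Au) \otimes (Bv), \tag{1.171}$$

$$(a \otimes b) \boxtimes (u \otimes v) = (a \otimes u) \otimes (b \otimes v). \tag{1.172}$$

As per our convention outlined at the end of Section 1.3.1, the above relations are valid even if the involved second-order tensors and vectors are complex-valued. The first relation is proved as follows:

$$\begin{aligned}
(A \otimes B)(C \boxtimes D)E &= (A \otimes B)(CED^T) = [B : (CED^T)]A = [(C^TBD) : E]A \\
&= [A \otimes (C^TBD)]E \quad \forall E \in \mathrm{Lin}.
\end{aligned}$$

The remaining relations are proved in a similar manner.

The transposer $\mathbb{T}$ commutes with the tensor product $A \boxtimes B$ in the following way:

$$\mathbb{T}(A \boxtimes B) = (B \boxtimes A)\mathbb{T}, \tag{1.173}$$

since

$$\mathbb{T}(A \boxtimes B)E = \mathbb{T}(AEB^T) = BE^TA^T = (B \boxtimes A)E^T = (B \boxtimes A)\mathbb{T}E \quad \forall E \in \mathrm{Lin}.$$

In an identical fashion, one can show that, with $\mathbb{S}$ denoting the symmetrizer,

$$\mathbb{S}(A \boxtimes B)\mathbb{S} = \mathbb{S}(B \boxtimes A)\mathbb{S}, \tag{1.174}$$

$$\mathbb{S}(A \boxtimes B + B \boxtimes A) = (A \boxtimes B + B \boxtimes A)\mathbb{S} = \mathbb{S}(A \boxtimes B + B \boxtimes A)\mathbb{S} = 2\mathbb{S}(A \boxtimes B)\mathbb{S}, \tag{1.175}$$

$$\mathbb{S}(A \boxtimes A) = (A \boxtimes A)\mathbb{S} = \mathbb{S}(A \boxtimes A)\mathbb{S}, \tag{1.176}$$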



where the second relation in Eqn. (1.175) follows from the first by multiplying both sides by $\mathbb{S}$, and using the fact that $\mathbb{S}\mathbb{S} = \mathbb{S}$, while the third relation in Eqn. (1.175) follows from Eqn. (1.174). Equation (1.176) follows from Eqn. (1.175) by taking $B = A$.

The component form of $(A \boxtimes B)\mathbb{T}$ is $A_{il}B_{jk}$ while that of $\mathbb{T}(A \boxtimes B)$ is $A_{jk}B_{il}$. A summary of the component forms of the various dyadic products is

$$\begin{aligned}
[A \otimes B]_{ijkl} &= A_{ij}B_{kl}, \\
[A \boxtimes B]_{ijkl} &= A_{ik}B_{jl}, \\
[(A \boxtimes B)\mathbb{T}]_{ijkl} &= A_{il}B_{jk}, \\
[\mathbb{T}(A \boxtimes B)]_{ijkl} &= A_{jk}B_{il}.
\end{aligned} \tag{1.177}$$
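The component form $[A \boxtimes B]_{ijkl} = A_{ik}B_{jl}$ can be cross-checked against the defining relation $(A \boxtimes B)C = ACB^T$ and against the composition rule of Eqn. (1.168). The sketch below does this with small integer matrices so that equality is exact; the matrices and function names are arbitrary assumptions.

```python
R3 = range(3)

def matmul(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in R3) for j in R3] for i in R3]

def transpose(A):
    return [[A[j][i] for j in R3] for i in R3]

def square_product(A, B):
    """Components of the square tensor product: [A box B]_ijkl = A_ik B_jl."""
    return [[[[A[i][k] * B[j][l] for l in R3] for k in R3] for j in R3]
            for i in R3]

def apply_fourth(C4, X):
    """(C4 X)_ij = C4_ijkl X_kl."""
    return [[sum(C4[i][j][k][l] * X[k][l] for k in R3 for l in R3) for j in R3]
            for i in R3]

def compose(C4, D4):
    """(C4 D4)_ijkl = C4_ijmn D4_mnkl."""
    return [[[[sum(C4[i][j][m][n] * D4[m][n][k][l] for m in R3 for n in R3)
               for l in R3] for k in R3] for j in R3] for i in R3]

# Hypothetical integer matrices.
A = [[1, 2, 0], [0, 1, 3], [4, 0, 1]]
B = [[2, 0, 1], [1, 1, 0], [0, 3, 1]]
C = [[0, 1, 2], [3, 0, 1], [1, 1, 1]]
D = [[1, 0, 0], [2, 1, 0], [0, 1, 2]]
X = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]

# Definition: (A box B) X = A X B^T.
definition_holds = (apply_fourth(square_product(A, B), X)
                    == matmul(matmul(A, X), transpose(B)))

# Composition rule: (A box B)(C box D) = (AC) box (BD).
composition_holds = (compose(square_product(A, B), square_product(C, D))
                     == square_product(matmul(A, C), matmul(B, D)))
```

Both flags should evaluate to `True`; the same `compose` helper can be combined with the components of the transposer to check the last two rows of Eqn. (1.177).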

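The fourth-order tensor $\mathbb{C}$ is said to have the first minor symmetry if $\mathbb{C} = \mathbb{T}\mathbb{C}$, or, in indicial notation, if

$$C_{ijkl} = \mathbb{T}_{ijmn}C_{mnkl} = \delta_{in}\delta_{jm}C_{mnkl} = C_{jikl},$$

and is said to have the second minor symmetry if $\mathbb{C} = \mathbb{C}\mathbb{T}$, which in indicial notation reads

$$C_{ijkl} = C_{ijlk}.$$

Note that the first minor symmetry condition can be written as $(\mathbb{I} - \mathbb{T})\mathbb{C} = 2\mathbb{W}\mathbb{C} = 0$, while the second minor symmetry condition can be written as $\mathbb{C}\mathbb{W} = 0$. Equivalently, using $\mathbb{S} = (\mathbb{I} + \mathbb{T})/2$, one can also write the two symmetry conditions as $\mathbb{C} = \mathbb{S}\mathbb{C}$ and $\mathbb{C} = \mathbb{C}\mathbb{S}$, respectively. One has

$$\mathbb{C} = \mathbb{S}\mathbb{C} \text{ and } \mathbb{C} = \mathbb{C}\mathbb{S} \iff \mathbb{C} = \mathbb{S}\mathbb{C}\mathbb{S},$$

since given the conditions on the left, we directly get $\mathbb{C} = \mathbb{S}\mathbb{C}\mathbb{S}$, while, conversely, if $\mathbb{C} = \mathbb{S}\mathbb{C}\mathbb{S}$, then $\mathbb{S}\mathbb{C} = \mathbb{S}\mathbb{S}\mathbb{C}\mathbb{S} = \mathbb{S}\mathbb{C}\mathbb{S} = \mathbb{C}$, and $\mathbb{C}\mathbb{S} = \mathbb{S}\mathbb{C}\mathbb{S}\mathbb{S} = \mathbb{S}\mathbb{C}\mathbb{S} = \mathbb{C}$. Thus a tensor $\mathbb{C}$ has both the minor symmetries if and only if $\mathbb{C} = \mathbb{S}\mathbb{C}\mathbb{S}$.

The tensor $\mathbb{C}$ is said to have major symmetry if $\mathbb{C} = \mathbb{C}^T$, or, in indicial notation (using Eqn. (1.164)), if $C_{ijkl} = C_{klij}$. The symmetric and skew-symmetric parts of $\mathbb{C}$ are given by $(\mathbb{C} + \mathbb{C}^T)/2$ and $(\mathbb{C} - \mathbb{C}^T)/2$. Using Eqn. (1.161) or directly, it can be shown that $\mathbb{I}$, $\mathbb{T}$, $\mathbb{S}$ and $\mathbb{W}$ are all symmetric.

Assume that $\mathbb{C}$ has the first minor symmetry, i.e., $\mathbb{C} = \mathbb{T}\mathbb{C}$. By transposing this equation, and using Eqn. (1.165b) and the symmetry of $\mathbb{T}$, we get

$$\mathbb{C}^T = \mathbb{C}^T\mathbb{T}.$$

Thus, if $\mathbb{C}$ has the first minor symmetry, then $\mathbb{C}^T$ has the second minor symmetry. It also follows that if $\mathbb{C}$ has major symmetry and one of the minor symmetries, it also has the other minor symmetry.

The eigenvalue problem for a fourth-order tensor $\mathbb{A}$ that does not possess minor symmetries involves finding the eigenvalue and eigentensor pair $(\lambda, A)$, $A \in \mathrm{Lin}$, which satisfies

$$\mathbb{A}A = \lambda A. \tag{1.178}$$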



By applying the $\Psi$-mapping as per Eqn. (I.1e) to the above equation, we get

$$[\Psi(\mathbb{A}) - \lambda I]\,\Psi(A) = 0.$$

This equation has a nontrivial solution $\Psi(A) \ne 0$ if and only if the characteristic equation $\det[\Psi(\mathbb{A}) - \lambda I] = 0$ is satisfied. The characteristic equation can also be written in expanded form as

$$\lambda^9 - I_1(\mathbb{A})\lambda^8 + I_2(\mathbb{A})\lambda^7 + \cdots - I_9(\mathbb{A}) = 0,$$

where (see Eqn. (J.17))

$$\begin{aligned}
I_1(\mathbb{A}) &= \operatorname{tr}\Psi(\mathbb{A}) = \lambda_1 + \lambda_2 + \cdots + \lambda_9, \\
I_2(\mathbb{A}) &= \tfrac{1}{2}\left[(\operatorname{tr}\Psi(\mathbb{A}))^2 - \operatorname{tr}(\Psi(\mathbb{A}))^2\right] = \lambda_1\lambda_2 + \lambda_2\lambda_3 + \cdots + \lambda_1\lambda_9, \\
&\ \,\vdots \\
I_9(\mathbb{A}) &= \det\Psi(\mathbb{A}) = \lambda_1\lambda_2 \cdots \lambda_9,
\end{aligned}$$

are a strict subset of the complete set of the principal invariants of $\mathbb{A}$ (see the references below for the case where $\mathbb{A}$ has minor symmetries, denoted by $\mathbb{C}$ there). Similar to the result for second-order tensors,

$$\det\mathbb{A}^T = \det\Psi(\mathbb{A}^T) = \det[\Psi(\mathbb{A})^T] = \det\Psi(\mathbb{A}) = \det\mathbb{A}. \tag{1.179}$$

If $\mathbb{A} = \mathbb{A}_1\mathbb{A}_2$, then, again, similar to the result for second-order tensors,

$$\det\mathbb{A} = \det\Psi(\mathbb{A}) = \det[\Psi(\mathbb{A}_1)\Psi(\mathbb{A}_2)] = [\det\Psi(\mathbb{A}_1)][\det\Psi(\mathbb{A}_2)] = (\det\mathbb{A}_1)(\det\mathbb{A}_2). \tag{1.180}$$

By virtue of Eqn. (1.171), we have

Theorem 1.7.1. If $(\lambda_i, u_i)$ and $(\gamma_i, v_i)$, $i = 1, 2, \ldots, n$, where $n$ is the underlying space dimension, are the (possibly complex-valued) eigenvalues/eigenvectors of diagonalizable tensors $A$, $B$, then $(\lambda_i\gamma_j, u_i \otimes v_j)$, $i, j = 1, 2, \ldots, n$, are the eigenvalues/eigentensors of $A \boxtimes B$.

The theorem may not hold for non-diagonalizable tensors. For example, if $A$ is non-diagonalizable but invertible, and $B = A^{-T}$, then $(1, I)$ is an eigenvalue/eigentensor of $A \boxtimes A^{-T}$, but $u_i \otimes v_j$ (or their linear combinations) corresponding to $\lambda_i\gamma_j = 1$ may not equal $I$. An application of the above theorem is presented in Theorem 1.7.2.
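For the diagonalizable case, the one-line computation behind Theorem 1.7.1 can be written out explicitly, with $\boxtimes$ denoting the square tensor product $(A \boxtimes B)C = ACB^T$ and using only Eqn. (1.171) together with $Au_i = \lambda_i u_i$ and $Bv_j = \gamma_j v_j$ (no sum on $i$, $j$):

```latex
(A \boxtimes B)(u_i \otimes v_j)
= (A u_i) \otimes (B v_j)
= (\lambda_i u_i) \otimes (\gamma_j v_j)
= \lambda_i \gamma_j \, (u_i \otimes v_j).
```

Diagonalizability enters only in guaranteeing that the $n^2$ dyads $u_i \otimes v_j$ are linearly independent and hence account for all eigentensors; the remark above about $A \boxtimes A^{-T}$ shows what can go wrong otherwise.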

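An orthogonal fourth-order tensor $\mathbb{Q}$ is defined as one for which

$$\mathbb{Q}^T\mathbb{Q} = \mathbb{Q}\mathbb{Q}^T = \mathbb{I}. \tag{1.181}$$

A fourth-order tensor is orthogonal if and only if it preserves inner products, i.e., $\mathbb{Q}$ is orthogonal if and only if

$$(\mathbb{Q}A, \mathbb{Q}B) = (A, B) \quad \forall A, B \in \mathrm{Lin},$$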



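the proof being identical to that for second-order orthogonal tensors. An example of an orthogonal tensor is $Q_1 \boxtimes Q_2$, where $Q_1, Q_2 \in \mathrm{Orth}$. This follows since

$$(Q_1 \boxtimes Q_2)^T(Q_1 \boxtimes Q_2) = (Q_1^T \boxtimes Q_2^T)(Q_1 \boxtimes Q_2) = (Q_1^TQ_1) \boxtimes (Q_2^TQ_2) = I \boxtimes I = \mathsf{I}.$$

A special class of fourth-order orthogonal tensors are fourth-order rotations obtained by choosing $Q_1 = Q_2 = Q$, where $Q \in \mathrm{Orth}^+$. Equation (1.150) can be expressed as

$$[\bar{\mathsf{A}}] = [\mathsf{Q}][\mathsf{A}][\mathsf{Q}]^T, \tag{1.182}$$

where $\mathsf{Q} := Q \boxtimes Q$, $Q \in \mathrm{Orth}^+$, is a fourth-order rotation.

Since $\Psi(\mathsf{I}) = I_{9\times 9}$, we have $\det \mathsf{I} = 1$. It follows from Eqns. (1.179)–(1.181) that

$$\det \mathsf{Q} = \pm 1.$$

From Eqns. (1.180)–(1.182), we get

$$\det(\bar{\mathsf{A}} - \lambda\mathsf{I}) = \det[\mathsf{Q}(\mathsf{A} - \lambda\mathsf{I})\mathsf{Q}^T] = \det(\mathsf{Q}\mathsf{Q}^T)\det(\mathsf{A} - \lambda\mathsf{I}) = (\det \mathsf{I})\det(\mathsf{A} - \lambda\mathsf{I}) = \det(\mathsf{A} - \lambda\mathsf{I}), \tag{1.183}$$

showing that the invariants that occur in the characteristic equation (and hence the eigenvalues) of $\mathsf{A}$ are independent (as they should be) of the choice of basis.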

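The Cayley–Hamilton theorem reads

$$0 = \mathsf{A}^9 - I_1\mathsf{A}^8 + I_2\mathsf{A}^7 + \cdots - I_9\mathsf{I} = \prod_{i=1}^{9}(\mathsf{A} - \lambda_i\mathsf{I}).$$

The minimal polynomial of $\mathsf{A}$ is given by

$$q(\mathsf{A}) = \prod_{i=1}^{k}(\mathsf{A} - \lambda_i\mathsf{I})^{m_i},$$

where $k$ is the number of distinct eigenvalues, and $1 \le m_i \le n_i$, with $n_i$ denoting the algebraic multiplicity of each eigenvalue $\lambda_i$. If $\mathsf{A}$ is “normal”, i.e., if $\mathsf{A}^T\mathsf{A} = \mathsf{A}\mathsf{A}^T$, or, in particular, symmetric, then each $m_i = 1$.

The tensor $\mathsf{A}$ is said to be invertible if there exists another tensor, denoted by $\mathsf{A}^{-1}$, such that

$$\mathsf{A}^{-1}\mathsf{A} = \mathsf{A}\mathsf{A}^{-1} = \mathsf{I}.$$

Similar to Eqns. (1.68) and (1.69), we have

$$(\mathsf{U}\mathsf{V})^{-1} = \mathsf{V}^{-1}\mathsf{U}^{-1}, \tag{1.184a}$$

$$(\mathsf{C}^T)^{-1} = (\mathsf{C}^{-1})^T =: \mathsf{C}^{-T}. \tag{1.184b}$$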





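From Eqn. (1.184b), it follows that if $\mathsf{C}$ has major symmetry, then so does its inverse. The inverse of $A \boxtimes B$ (assuming that $A$ and $B$ are invertible) is given by

$$(A \boxtimes B)^{-1} = A^{-1} \boxtimes B^{-1}, \tag{1.185}$$

since, by Eqns. (1.167) and (1.168), we have

$$(A^{-1} \boxtimes B^{-1})(A \boxtimes B) = (A^{-1}A) \boxtimes (B^{-1}B) = I \boxtimes I = \mathsf{I}.$$

A fourth-order tensor $\mathsf{A}$ is invertible if and only if $I_9(\mathsf{A}) = \prod_{i=1}^{9}\lambda_i$ is nonzero, in which case,

$$\mathsf{A}^{-1} = \frac{1}{I_9}\left[\mathsf{A}^8 - I_1\mathsf{A}^7 + \cdots + I_8\mathsf{I}\right].$$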

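By virtue of the relation $(A^T)^T = A$, we have $\mathsf{T}\mathsf{T}A = A$ for all $A \in \mathrm{Lin}$, which implies that

$$\mathsf{T}\mathsf{T} = \mathsf{I}. \tag{1.186}$$

Thus, the inverse of the transposer tensor $\mathsf{T}$ is $\mathsf{T}$ itself.

If $\mathsf{A}$ has major symmetry (but not necessarily the minor symmetries), then $[\Psi(\mathsf{A})]^T = \Psi(\mathsf{A}^T) = \Psi(\mathsf{A})$, so that, similar to the case of second-order symmetric tensors, all the nine eigenvalues are real, and the corresponding eigentensors $A_i$, $i = 1, 2, \ldots, 9$, are orthogonal in the sense that

$$(A_i, A_j) = A_i : A_j = \delta_{ij}. \tag{1.187}$$

Thus, we can write the spectral representation of $\mathsf{A}$ as

$$\mathsf{A} = \sum_{i=1}^{9}\lambda_i A_i \otimes A_i, \quad \text{with} \quad \sum_{i=1}^{9} A_i \otimes A_i = \mathsf{I}.$$

If $k$ is the number of distinct eigenvalues, then, similar to Eqn. (J.4), we have the following explicit formula:

$$A_i \otimes A_i = \begin{cases} \displaystyle\prod_{\substack{j=1 \\ j \ne i}}^{k} \frac{\mathsf{A} - \lambda_j\mathsf{I}}{\lambda_i - \lambda_j}, & k > 1, \\[2ex] \mathsf{I}, & k = 1. \end{cases} \tag{1.188}$$

When $\mathsf{A}$ is invertible, the spectral decomposition of $\mathsf{A}^{-1}$ is given by

$$\mathsf{A}^{-1} = \sum_{i=1}^{9}\frac{1}{\lambda_i} A_i \otimes A_i.$$

$\mathsf{A}$ is said to be positive definite if $T : \mathsf{A}T > 0$ for all nonzero $T \in \mathrm{Lin}$. Similar to symmetric second-order tensors (see Theorem 1.6.4), one can show that $\mathsf{A}$ is positive definite if and only if all the eigenvalues $\lambda_i$ (or all the invariants that occur in the characteristic equation) are strictly positive. If $\mathsf{A}$ is positive semi-definite, the square root of $\mathsf{A}$ defined via the relation $\mathsf{H}\mathsf{H} = \mathsf{A}$ is given by

$$\mathsf{H} = \sum_{i=1}^{9}\sqrt{\lambda_i}\, A_i \otimes A_i.$$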





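Now consider the case when the fourth-order tensor $\mathsf{C}$ has both minor symmetries, i.e., $\mathsf{C} = \mathsf{S}\mathsf{C}\mathsf{S}$. The transpose is still defined by Eqn. (1.163), except that $A, B \in \mathrm{Sym}$. Using this definition, one can show that when $\mathsf{C}$ and $\mathsf{D}$ have both minor symmetries, then Eqns. (1.165) still hold. The eigenvalue problem given by Eqn. (1.178) is now written as $\mathsf{C}S = \lambda S = \lambda\mathsf{S}S$, $S \in \mathrm{Sym}$, or, alternatively, using Eqn. (I.14), as

$$\left[\chi(\mathsf{C})H^{-1} - \lambda I_{6\times 6}\right]\chi(S) = 0.$$

Thus, $\mathsf{C}$ has only six eigenvalues, with the remaining three eigenvalues zero, and the corresponding eigentensors being linearly independent skew-symmetric tensors. The above equation has a nontrivial solution $\chi(S) \ne 0$ if and only if $\det[\chi(\mathsf{C})H^{-1} - \lambda I] = 0$. This characteristic equation can be written in expanded form as

$$\lambda^6 - I_1(\mathsf{C})\lambda^5 + I_2(\mathsf{C})\lambda^4 + \cdots + I_6(\mathsf{C}) = 0,$$

where

$$\begin{aligned}
I_1(\mathsf{C}) &= \operatorname{tr}[\chi(\mathsf{C})H^{-1}] = \lambda_1 + \lambda_2 + \cdots + \lambda_6, \\
I_2(\mathsf{C}) &= \tfrac{1}{2}\left[(\operatorname{tr}[\chi(\mathsf{C})H^{-1}])^2 - \operatorname{tr}(\chi(\mathsf{C})H^{-1})^2\right] = \lambda_1\lambda_2 + \lambda_2\lambda_3 + \cdots + \lambda_1\lambda_6, \\
&\;\;\vdots \\
I_6(\mathsf{C}) &= \det[\chi(\mathsf{C})H^{-1}] = 8\det\chi(\mathsf{C}) = \lambda_1\lambda_2\cdots\lambda_6,
\end{aligned}$$

are a strict subset of the set of principal invariants of $\mathsf{C}$ [4, 20, 141, 238] (the results of Ahmad and Norris on the one hand, and Betten on the other regarding the number of quadratic principal invariants seem to be contradictory). As an example,

$$\det \mathsf{S} = \det[\chi(\mathsf{S})H^{-1}] = \det[HH^{-1}] = 1. \tag{1.189}$$

Using Eqn. (I.11c), we now have, similar to Eqn. (1.179),

$$\det \mathsf{C}^T = \det \mathsf{C}. \tag{1.190}$$

If $\mathsf{C} = \mathsf{C}_1\mathsf{C}_2$, then using Eqn. (I.11f), we again have, similar to Eqn. (1.180),

$$\det \mathsf{C} = (\det \mathsf{C}_1)(\det \mathsf{C}_2). \tag{1.191}$$

If $\mathsf{C}$ has both the minor symmetries, then $[\bar{\mathsf{C}}] = [\mathsf{S}][\bar{\mathsf{C}}][\mathsf{S}]$ and $[\mathsf{C}] = [\mathsf{S}][\mathsf{C}][\mathsf{S}]$, so that Eqn. (1.182) can be written as

$$[\bar{\mathsf{C}}] = [\mathsf{Q}][\mathsf{C}][\mathsf{Q}]^T, \tag{1.192}$$

where $\mathsf{Q} := [\mathsf{S}][Q \boxtimes Q][\mathsf{S}]$, $Q \in \mathrm{Orth}^+$. Now applying Eqns. (1.165b), (I.11c) and (I.11f), and using the symmetry of $\mathsf{S}$ and $H$, we get

$$\chi([\bar{\mathsf{C}}]) = \chi(\mathsf{Q})H^{-1}\chi([\mathsf{C}])H^{-1}\chi(\mathsf{Q})^T = A\chi([\mathsf{C}])A^T,$$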





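where $A := \chi(\mathsf{Q})H^{-1}$ is found by using Eqn. (I.5); the final expression for $A$ is given by Eqn. (I.12) with $J \equiv Q^T$. Using Eqns. (1.165b) and (1.176), we get

$$\mathsf{Q}\mathsf{Q}^T = \mathsf{Q}^T\mathsf{Q} = \mathsf{S}(I \boxtimes I)\mathsf{S} = \mathsf{S}\mathsf{I}\mathsf{S} = \mathsf{S}. \tag{1.193}$$

It follows from Eqns. (1.189)–(1.191) that

$$\det \mathsf{Q} = \pm 1.$$

From Eqns. (1.189) and (1.191)–(1.193), we get

$$\det(\bar{\mathsf{C}} - \lambda\mathsf{S}) = \det[\mathsf{Q}(\mathsf{C} - \lambda\mathsf{S})\mathsf{Q}^T] = \det(\mathsf{Q}\mathsf{Q}^T)\det(\mathsf{C} - \lambda\mathsf{S}) = (\det \mathsf{S})\det(\mathsf{C} - \lambda\mathsf{S}) = \det(\mathsf{C} - \lambda\mathsf{S}), \tag{1.194}$$

showing that the invariants that occur in the characteristic equation (and hence the eigenvalues) of $\mathsf{C}$ are independent of the choice of basis.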

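The spectral decomposition of $\mathsf{C}$ reads

$$\mathsf{C} = \sum_{i=1}^{6}\lambda_i A_i \otimes A_i, \quad \text{with } A_i = A_i^T \text{ and } \sum_{i=1}^{6} A_i \otimes A_i = \mathsf{S}.$$

In place of Eqn. (1.188), we now have

$$A_i \otimes A_i = \begin{cases} \displaystyle\prod_{\substack{j=1 \\ j \ne i}}^{k} \frac{\mathsf{C} - \lambda_j\mathsf{S}}{\lambda_i - \lambda_j}, & k > 1, \\[2ex] \mathsf{S}, & k = 1. \end{cases}$$

$\mathsf{C}$ is invertible if there exists another tensor, denoted by $\mathsf{C}^{-1}$, such that

$$\mathsf{C}^{-1}\mathsf{C} = \mathsf{C}\mathsf{C}^{-1} = \mathsf{S}.$$

The necessary and sufficient condition for $\mathsf{C}$ to be invertible is that $I_6(\mathsf{C})$ be nonzero. If this condition is satisfied, then

$$\mathsf{C}^{-1} = \sum_{i=1}^{6}\frac{1}{\lambda_i} A_i \otimes A_i.$$

As an example, since $\mathsf{S}\mathsf{S} = \mathsf{S}$, we have $\mathsf{S}^{-1} = \mathsf{S}$. Let $\mathsf{U}$, $\mathsf{V}$ and $\mathsf{C}$ be tensors that have minor symmetries. Then, since $\mathsf{S}^T = \mathsf{S}$, both relations given by Eqns. (1.184) hold. From Eqn. (1.176), it follows (provided $A$ is invertible) that

$$[\mathsf{S}(A \boxtimes A)\mathsf{S}]^{-1} = \mathsf{S}(A^{-1} \boxtimes A^{-1})\mathsf{S}.$$

Note that, in general, $[\mathsf{S}(A \boxtimes B)\mathsf{S}]^{-1} \ne \mathsf{S}(A^{-1} \boxtimes B^{-1})\mathsf{S}$.

$\mathsf{C}$ is said to be positive semi-definite if $S : \mathsf{C}S \ge 0$ for all $S \in \mathrm{Sym}$. If $\mathsf{C}$ is positive semi-definite, its square root $\mathsf{H}$ is given by

$$\mathsf{H} = \sum_{i=1}^{6}\sqrt{\lambda_i}\, A_i \otimes A_i, \quad \text{with } A_i = A_i^T.$$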





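$\mathsf{C}$ is positive definite if and only if all the $\lambda_i$, $i = 1, 2, \ldots, 6$, are strictly positive.

Examples of some spectral resolutions are

$$\begin{aligned}
\mathsf{I} &= \sum_{i=1}^{9} A_i \otimes A_i, \\
\mathsf{T} &= \sum_{i=1}^{6} A_i \otimes A_i - \sum_{i=7}^{9} A_i \otimes A_i, \\
\mathsf{S} &= \sum_{i=1}^{6} A_i \otimes A_i, \quad A_i^T = A_i, \\
\mathsf{W} &= \sum_{i=1}^{3} A_i \otimes A_i, \quad A_i^T = -A_i, \\
I \otimes I &= 3\left(\frac{I}{\sqrt{3}} \otimes \frac{I}{\sqrt{3}}\right).
\end{aligned}$$

Let $e$ be the axial vector of $Q \in \mathrm{Orth}^+\backslash\mathrm{Sym}$, and let $(\lambda, n)$ and $(\hat{\lambda}, \hat{n})$ denote the remaining eigenvalue/eigenvector pairs, with the “hat” denoting a complex conjugate. The following result, which follows directly from Theorem 1.7.1, holds:

Theorem 1.7.2. The eigenvalue/eigentensor pairs of $\mathsf{Q} := Q \boxtimes Q$, where $Q \in \mathrm{Orth}^+\backslash\mathrm{Sym}$, are $(1, I)$, $(1, Q)$, $(1, Q^2)$, $(\lambda, e \otimes n)$, $(\lambda, n \otimes e)$, $(\hat{\lambda}, e \otimes \hat{n})$, $(\hat{\lambda}, \hat{n} \otimes e)$, $(\lambda^2, n \otimes n)$, and $(\hat{\lambda}^2, \hat{n} \otimes \hat{n})$.

If the dimension of the underlying space is four, and $(\lambda_1, n_1)$, $(\hat{\lambda}_1, \hat{n}_1)$, $(\lambda_2, n_2)$, $(\hat{\lambda}_2, \hat{n}_2)$ denote the eigenvalue/eigenvector pairs of $Q$, then the eigenvalue/eigentensor pairs of $\mathsf{Q}$ are $(1, I)$, $(1, Q)$, $(1, Q^2)$, $(1, Q^3)$, $(\lambda_1^2, n_1 \otimes n_1)$, $(\hat{\lambda}_1^2, \hat{n}_1 \otimes \hat{n}_1)$, $(\lambda_2^2, n_2 \otimes n_2)$, $(\hat{\lambda}_2^2, \hat{n}_2 \otimes \hat{n}_2)$, $(\lambda_1\lambda_2, n_1 \otimes n_2)$, $(\lambda_1\lambda_2, n_2 \otimes n_1)$, $(\lambda_1\hat{\lambda}_2, n_1 \otimes \hat{n}_2)$, $(\lambda_1\hat{\lambda}_2, \hat{n}_2 \otimes n_1)$, $(\hat{\lambda}_1\lambda_2, \hat{n}_1 \otimes n_2)$, $(\hat{\lambda}_1\lambda_2, n_2 \otimes \hat{n}_1)$, $(\hat{\lambda}_1\hat{\lambda}_2, \hat{n}_1 \otimes \hat{n}_2)$ and $(\hat{\lambda}_1\hat{\lambda}_2, \hat{n}_2 \otimes \hat{n}_1)$. In general, when the underlying space dimension is $n$, the eigenvalue one is repeated $n$ times, with the corresponding eigentensors given by $I, Q, \ldots, Q^{n-1}$, and the remaining $n(n-1)$ eigenvalue/eigentensor pairs are formed by combinations of each complex eigenvalue/eigenvector of $Q$ with itself and with all the remaining eigenvalues/eigenvectors of $Q$ barring the complex conjugate of that eigenvalue/eigenvector.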

We now discuss the solution of the tensor equation $AXB + CXD = H$, where $A$, $B$, $C$, $D$ and $H$ are given second-order tensors. We write this equation as

$$[A \boxtimes B^T + C \boxtimes D^T]X = H.$$

In component form, the above equation reads

$$T_{ijkl}X_{kl} = H_{ij},$$

where

$$T_{ijkl} = A_{ik}B_{lj} + C_{ik}D_{lj}. \tag{1.195}$$





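Applying Eqn. (I.1e), we get

$$\begin{bmatrix} X_{11} \\ X_{22} \\ X_{33} \\ X_{12} \\ X_{23} \\ X_{31} \\ X_{21} \\ X_{32} \\ X_{13} \end{bmatrix} =
\begin{bmatrix}
T_{1111} & T_{1122} & T_{1133} & T_{1112} & T_{1123} & T_{1131} & T_{1121} & T_{1132} & T_{1113} \\
T_{2211} & T_{2222} & T_{2233} & T_{2212} & T_{2223} & T_{2231} & T_{2221} & T_{2232} & T_{2213} \\
T_{3311} & T_{3322} & T_{3333} & T_{3312} & T_{3323} & T_{3331} & T_{3321} & T_{3332} & T_{3313} \\
T_{1211} & T_{1222} & T_{1233} & T_{1212} & T_{1223} & T_{1231} & T_{1221} & T_{1232} & T_{1213} \\
T_{2311} & T_{2322} & T_{2333} & T_{2312} & T_{2323} & T_{2331} & T_{2321} & T_{2332} & T_{2313} \\
T_{3111} & T_{3122} & T_{3133} & T_{3112} & T_{3123} & T_{3131} & T_{3121} & T_{3132} & T_{3113} \\
T_{2111} & T_{2122} & T_{2133} & T_{2112} & T_{2123} & T_{2131} & T_{2121} & T_{2132} & T_{2113} \\
T_{3211} & T_{3222} & T_{3233} & T_{3212} & T_{3223} & T_{3231} & T_{3221} & T_{3232} & T_{3213} \\
T_{1311} & T_{1322} & T_{1333} & T_{1312} & T_{1323} & T_{1331} & T_{1321} & T_{1332} & T_{1313}
\end{bmatrix}^{-1}
\begin{bmatrix} H_{11} \\ H_{22} \\ H_{33} \\ H_{12} \\ H_{23} \\ H_{31} \\ H_{21} \\ H_{32} \\ H_{13} \end{bmatrix}, \tag{1.196}$$

with the components $T_{ijkl}$ given by Eqn. (1.195). It is clear that this method can be extended to solve a matrix equation of the type $\sum_i A_iXB_i = H$ of any dimension $n$ by inverting an $n^2 \times n^2$ matrix.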
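The numerical procedure of Eqns. (1.195) and (1.196) can be sketched as follows for $n = 3$. This is a minimal illustration rather than a production solver: the function names are hypothetical, exact rational arithmetic is used so the result can be compared without tolerances, and the 9×9 system is solved by elimination instead of forming the inverse explicitly.

```python
from fractions import Fraction

# Row/column ordering of Eqn. (1.196): 11, 22, 33, 12, 23, 31, 21, 32, 13 (0-based here).
PAIRS = [(0, 0), (1, 1), (2, 2), (0, 1), (1, 2), (2, 0), (1, 0), (2, 1), (0, 2)]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def lhs(A, B, C, D, X):
    """Evaluate A X B + C X D."""
    P, Q = matmul(matmul(A, X), B), matmul(matmul(C, X), D)
    return [[P[i][j] + Q[i][j] for j in range(3)] for i in range(3)]

def solve_linear(M, b):
    """Gauss-Jordan elimination in exact rational arithmetic.
    Raises StopIteration if M is singular (no nonzero pivot is found)."""
    n = len(b)
    aug = [[Fraction(x) for x in row] + [Fraction(bi)] for row, bi in zip(M, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def solve_axb_cxd(A, B, C, D, H):
    """Solve A X B + C X D = H via the 9x9 system T_ijkl X_kl = H_ij,
    with T_ijkl = A_ik B_lj + C_ik D_lj as in Eqn. (1.195)."""
    M = [[A[i][k] * B[l][j] + C[i][k] * D[l][j] for (k, l) in PAIRS]
         for (i, j) in PAIRS]
    x = solve_linear(M, [H[i][j] for (i, j) in PAIRS])
    X = [[Fraction(0)] * 3 for _ in range(3)]
    for (k, l), value in zip(PAIRS, x):
        X[k][l] = value
    return X
```

A quick way to exercise the sketch is to pick a known $X$, build $H$ with `lhs`, and confirm that `solve_axb_cxd` recovers $X$; the system must of course be nonsingular for this to work.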

Of particular importance is the case when $B = C = I$ and $D = A \equiv S \in \mathrm{Sym}$. In this case, $T_{ijkl} = S_{ik}\delta_{lj} + \delta_{ik}S_{lj}$.

Although the above approach is suitable for numerical purposes, a tensorial solution is more suitable for proving properties of the solution $X$. We now present a method, based on [155], for determining $(S \boxtimes I + I \boxtimes S)^{-1}$, $S \in \mathrm{Sym}$, which is applicable for any dimension $n$ (see Rosati [273] and references therein for other methods).

Consider the tensor $T \boxtimes I + I \boxtimes T^T$, $T \in \mathrm{Lin}$. If $(\lambda_i, u_i)$ and $(\lambda_i, v_i)$ are the eigenvalues/eigenvectors of $T$ and $T^T$, respectively, then

$$\begin{aligned}
(T \boxtimes I + I \boxtimes T^T)(u_i \otimes v_j) &= T(u_i \otimes v_j) + (u_i \otimes v_j)T \\
&= (Tu_i) \otimes v_j + u_i \otimes (T^Tv_j) \\
&= (\lambda_i + \lambda_j)(u_i \otimes v_j),
\end{aligned}$$

which shows that $(\lambda_i + \lambda_j, u_i \otimes v_j)$, $i, j = 1, 2, \ldots, n$, are eigenvalue/eigentensor pairs of $T \boxtimes I + I \boxtimes T^T$. However, unfortunately, these are not the only eigenvalue/eigentensor pairs, especially when the eigenvectors $u_i$ are not linearly independent. For example, when $T = e_1 \otimes e_2 + e_3 \otimes e_4 + e_4 \otimes e_5$, with $n = 5$, all the 25 eigenvalues of $T \boxtimes I + I \boxtimes T^T$ are zero, and there are nine linearly independent eigentensors given by $e_1 \otimes e_2$, $e_1 \otimes e_5$, $e_3 \otimes e_2$, $e_3 \otimes e_5$, $e_3 \otimes e_3 - e_4 \otimes e_4 + e_5 \otimes e_5$, $-e_3 \otimes e_4 + e_4 \otimes e_5$, $e_3 \otimes e_1 + e_4 \otimes e_2$, $-e_1 \otimes e_4 + e_2 \otimes e_5$ and $-e_1 \otimes e_1 + e_2 \otimes e_2$. Although the first four eigentensors are of the form $u_i \otimes v_j$, $i, j = 1, 2$, the remaining five are not. A similar situation arises even in the case of the tensor $T \boxtimes I + I \boxtimes T$, which again has nine linearly independent eigentensors. Nevertheless, as we now show, $\lambda_i + \lambda_j$, $i, j = 1, 2, \ldots, n$, are the eigenvalues of $T \boxtimes I + I \boxtimes T^T$ (or $T \boxtimes I + I \boxtimes T$). Thus, the necessary and sufficient condition for $T \boxtimes I + I \boxtimes T^T$ to be invertible is that $(\det T)\prod_{\substack{i=1 \\ j=i+1}}^{n}(\lambda_i + \lambda_j) \ne 0$. For example, if $n = 3$, this condition is given by $\lambda_1\lambda_2\lambda_3(\lambda_1 + \lambda_2)(\lambda_2 + \lambda_3)(\lambda_1 + \lambda_3) \ne 0$.





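Let $T = H_1J_1H_1^{-1}$ and $T^T = H_2J_2H_2^{-1}$, where $J_1$ and $J_2$ are matrices in Jordan normal form. Then

$$\begin{aligned}
T \boxtimes I + I \boxtimes T^T &= (H_1J_1H_1^{-1}) \boxtimes I + I \boxtimes (H_2J_2H_2^{-1}) \\
&= (H_1 \boxtimes H_2)(J_1 \boxtimes I + I \boxtimes J_2)(H_1^{-1} \boxtimes H_2^{-1}) \\
&= (H_1 \boxtimes H_2)(J_1 \boxtimes I + I \boxtimes J_2)(H_1 \boxtimes H_2)^{-1}.
\end{aligned}$$

Thus, the eigenvalues of $T \boxtimes I + I \boxtimes T^T$ and $J_1 \boxtimes I + I \boxtimes J_2$ are the same. Since the matrix form of $J_1 \boxtimes I + I \boxtimes J_2$ is an upper triangular matrix with $\lambda_i + \lambda_j$, $i, j = 1, 2, \ldots, n$, on the diagonals, we get the desired result. A similar proof can be devised for $T \boxtimes I + I \boxtimes T$.

Let $(\lambda_i, n_i)$ be the eigenvalues/eigenvectors of $S \in \mathrm{Sym}$. Then, the eigenvalue/eigentensor pairs of $\mathsf{S}(S \boxtimes I + I \boxtimes S)\mathsf{S}$ when all the $\lambda_i$ are distinct are $(\lambda_i + \lambda_j, n_i \otimes n_j + n_j \otimes n_i)$, $i, j = 1, 2, \ldots, n$ (the eigenvalue/eigentensors when the eigenvalues of $S$ are repeated can be found similar to Eqns. (1.258) and (1.259)). Thus, the necessary and sufficient condition that $\mathsf{S}(S \boxtimes I + I \boxtimes S)\mathsf{S}$ be invertible is that $(\det S)\prod_{\substack{i=1 \\ j=i+1}}^{n}(\lambda_i + \lambda_j) \ne 0$.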

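Let $\lambda_i$, $i = 1, 2, \ldots, k$, be the distinct eigenvalues of $S \in \mathrm{Sym}$, and let $P_i$ be the associated projections, which are given by Eqn. (J.4). We have

$$\begin{aligned}
\mathsf{S}(S \boxtimes I + I \boxtimes S)\mathsf{S} &= \mathsf{S}\left[\left(\sum_{i=1}^{k}\lambda_iP_i\right) \boxtimes \left(\sum_{j=1}^{k}P_j\right) + \left(\sum_{i=1}^{k}P_i\right) \boxtimes \left(\sum_{j=1}^{k}\lambda_jP_j\right)\right]\mathsf{S} \\
&= \mathsf{S}\left[\sum_{i=1}^{k}\sum_{j=1}^{k}(\lambda_i + \lambda_j)P_i \boxtimes P_j\right]\mathsf{S}.
\end{aligned}$$

Assuming that the necessary and sufficient condition for invertibility is satisfied, the desired inverse is

$$[\mathsf{S}(S \boxtimes I + I \boxtimes S)\mathsf{S}]^{-1} = \mathsf{S}\left[\sum_{i=1}^{k}\sum_{j=1}^{k}\frac{P_i \boxtimes P_j}{\lambda_i + \lambda_j}\right]\mathsf{S}, \tag{1.197}$$

since, on using Eqn. (1.175),

$$\begin{aligned}
\mathsf{S}\left[\sum_{i=1}^{k}\sum_{j=1}^{k}\frac{P_i \boxtimes P_j}{\lambda_i + \lambda_j}\right]\mathsf{S}\,[S \boxtimes I + I \boxtimes S]\,\mathsf{S} &= \mathsf{S}\left[\sum_{i=1}^{k}\sum_{j=1}^{k}\frac{(P_iS) \boxtimes P_j + P_i \boxtimes (P_jS)}{\lambda_i + \lambda_j}\right]\mathsf{S} \\
&= \mathsf{S}\left[\sum_{i=1}^{k}\sum_{j=1}^{k}P_i \boxtimes P_j\right]\mathsf{S} \\
&= \mathsf{S}\left[\left(\sum_{i=1}^{k}P_i\right) \boxtimes \left(\sum_{j=1}^{k}P_j\right)\right]\mathsf{S} \\
&= \mathsf{S}[I \boxtimes I]\mathsf{S} \\
&= \mathsf{S},
\end{aligned}$$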





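where we have used the properties $P_mP_i = \delta_{im}P_i$ (no sum on $i$), $\sum_{i=1}^{k}P_i = I$ and $SP_i = (\sum_{m=1}^{k}\lambda_mP_m)P_i = \lambda_iP_i = P_iS$ (no sum on $i$). Thus, the solution of the tensor equation $\mathsf{S}(S \boxtimes I + I \boxtimes S)\mathsf{S}X = H$ is

$$X = \sum_{i=1}^{k}\sum_{j=1}^{k}\frac{P_i(H + H^T)P_j}{2(\lambda_i + \lambda_j)}. \tag{1.198}$$

The above method can obviously be generalized to the case when $T$ is diagonalizable. Thus, if the tensor equation to be solved is $(T \boxtimes I + I \boxtimes T)X = TX + XT^T = H$, where $T = \sum_{i=1}^{k}\lambda_iP_i$, with $P_i$ given by Eqn. (J.26), then assuming that $T$ is such that the conditions for invertibility of $T \boxtimes I + I \boxtimes T$ are satisfied, we have

$$\begin{aligned}
X &= (T \boxtimes I + I \boxtimes T)^{-1}H \\
&= \left[\sum_{i=1}^{k}\sum_{j=1}^{k}\frac{P_i \boxtimes P_j}{\lambda_i + \lambda_j}\right]H \\
&= \sum_{i=1}^{k}\sum_{j=1}^{k}\frac{P_iHP_j^T}{\lambda_i + \lambda_j}.
\end{aligned}$$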
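As a small check of the projection formula above, the following sketch solves $SX + XS = H$ for a symmetric 2×2 tensor $S$ with known, distinct eigenvalues. It assumes the projections can be built from the product formula $P_i = \prod_{j \ne i}(S - \lambda_jI)/(\lambda_i - \lambda_j)$ (the second-order analogue of Eqn. (1.188)); the function names are hypothetical and the eigenvalues are supplied by hand rather than computed.

```python
from fractions import Fraction

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def projections(S, eigenvalues):
    """P_i = prod_{j != i} (S - λ_j I)/(λ_i - λ_j) for distinct eigenvalues λ_i."""
    n = len(S)
    identity = [[Fraction(int(r == c)) for c in range(n)] for r in range(n)]
    result = []
    for i, lam_i in enumerate(eigenvalues):
        P = identity
        for j, lam_j in enumerate(eigenvalues):
            if j != i:
                factor = [[(S[r][c] - lam_j * identity[r][c]) / Fraction(lam_i - lam_j)
                           for c in range(n)] for r in range(n)]
                P = matmul(P, factor)
        result.append(P)
    return result

def solve_sx_plus_xs(S, eigenvalues, H):
    """X = sum_ij P_i H P_j / (λ_i + λ_j); P_j^T = P_j because S is symmetric."""
    n = len(S)
    P = projections(S, eigenvalues)
    X = [[Fraction(0)] * n for _ in range(n)]
    for i, lam_i in enumerate(eigenvalues):
        for j, lam_j in enumerate(eigenvalues):
            term = matmul(matmul(P[i], H), P[j])
            for r in range(n):
                for c in range(n):
                    X[r][c] += term[r][c] / Fraction(lam_i + lam_j)
    return X

S = [[2, 1], [1, 2]]          # symmetric, with eigenvalues 1 and 3
H = [[1, 2], [3, 4]]          # arbitrary right-hand side
X = solve_sx_plus_xs(S, [1, 3], H)
SX, XS = matmul(S, X), matmul(X, S)
residual = [[SX[r][c] + XS[r][c] - H[r][c] for c in range(2)] for r in range(2)]
```

With exact rational arithmetic the residual of $SX + XS - H$ should vanish identically, which is what the formula predicts.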

We now show how the solutions given by Eqns. (1.198) reduce to some of the solutions presented in the literature for the case $H = W \in \mathrm{Skw}$. When $n = 3$, and when the eigenvalues are distinct, for a given $i$, we have $P_iWP_i = (e_i^* \cdot We_i^*)P_i = 0$. Multiplying both sides of Eqn. (1.198) by the determinant of

$$\tilde{S} := (\operatorname{tr} S)I - S = (\lambda_2 + \lambda_3)P_1 + (\lambda_1 + \lambda_3)P_2 + (\lambda_1 + \lambda_2)P_3,$$

which is given by $(\lambda_1 + \lambda_2)(\lambda_2 + \lambda_3)(\lambda_1 + \lambda_3)$, we get

$$(\det \tilde{S})X = \tilde{S}W\tilde{S}, \tag{1.199}$$

which is the solution presented by Scheidler [282]. Since the solution has no explicit dependence on the eigenvalues $\lambda_i$, it holds even under the case when the eigenvalues are repeated. Alternatively, by writing $W$ as $(P_1 + P_2 + P_3)W(P_1 + P_2 + P_3)$, $S^2W$ as $(\lambda_1^2P_1 + \lambda_2^2P_2 + \lambda_3^2P_3)W(P_1 + P_2 + P_3)$, etc., and noting that $(\lambda_1 + \lambda_2)(\lambda_2 + \lambda_3)(\lambda_1 + \lambda_3) = I_1(S)I_2(S) - I_3(S)$, one can also show that when $H = W$, Eqn. (1.198) reduces to

$$(I_1I_2 - I_3)X = (I_1^2 - I_2)W - (S^2W + WS^2),$$

which is the solution presented by Guo [104].

Finally, consider the case when the dimension of the underlying vector space $n$ is two, and $H$ is arbitrary. Using $I_1(S) = \lambda_1 + \lambda_2$, $I_2(S) = \lambda_1\lambda_2$, and writing $H = (P_1 + P_2)H(P_1 + P_2)$, $SH = (\lambda_1P_1 + \lambda_2P_2)H(P_1 + P_2)$, etc., one can show that Eqn. (1.198) can be written as

$$(2I_1I_2)X = (I_1^2 + I_2)H - I_1(SH + HS) + SHS,$$





which is the solution presented by Hoger and Carlson [127]. Alternatively, if $S$ is invertible, writing $S^{-1} = P_1/\lambda_1 + P_2/\lambda_2$, we can also show that

$$2I_1X = H + I_2(S^{-1}HS^{-1}),$$

which is the solution presented by Ting [327].
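The two closed-form expressions for $n = 2$ can be compared directly. The sketch below evaluates both for one symmetric, invertible $S$ and an arbitrary $H$, and checks that each result satisfies $SX + XS = H$. It is only a spot check on a single example, with hypothetical function names and exact rational arithmetic.

```python
from fractions import Fraction

def matmul2(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def invariants2(S):
    """I1 = tr S and I2 = det S for a 2x2 tensor."""
    return S[0][0] + S[1][1], S[0][0] * S[1][1] - S[0][1] * S[1][0]

def inverse2(S):
    det = S[0][0] * S[1][1] - S[0][1] * S[1][0]
    return [[Fraction(S[1][1], det), Fraction(-S[0][1], det)],
            [Fraction(-S[1][0], det), Fraction(S[0][0], det)]]

def ting_solution(S, H):
    """X from 2 I1 X = H + I2 S^{-1} H S^{-1} (S invertible, I1 != 0)."""
    I1, I2 = invariants2(S)
    Sinv = inverse2(S)
    M = matmul2(matmul2(Sinv, H), Sinv)
    return [[(H[r][c] + I2 * M[r][c]) / Fraction(2 * I1) for c in range(2)]
            for r in range(2)]

def hoger_carlson_solution(S, H):
    """X from (2 I1 I2) X = (I1^2 + I2) H - I1 (SH + HS) + SHS."""
    I1, I2 = invariants2(S)
    SH, HS = matmul2(S, H), matmul2(H, S)
    SHS = matmul2(SH, S)
    return [[((I1 * I1 + I2) * H[r][c] - I1 * (SH[r][c] + HS[r][c]) + SHS[r][c])
             / Fraction(2 * I1 * I2) for c in range(2)] for r in range(2)]

S = [[2, 1], [1, 2]]
H = [[1, 2], [3, 4]]
X_ting = ting_solution(S, H)
X_hc = hoger_carlson_solution(S, H)
```

Both formulas require $I_1 \ne 0$ and $I_2 \ne 0$, which is the $n = 2$ form of the invertibility condition stated earlier.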

1.8 Isotropic Tensors

A second-order tensor is said to be isotropic if its components are the same in all orthonormal bases, i.e., if $T = T_{ij}e_i \otimes e_j = T_{ij}\bar{e}_i \otimes \bar{e}_j$. Since $\bar{e}_i = Q^Te_i$, this condition can also be written as

$$QTQ^T = T \quad \forall Q \in \mathrm{Orth}^+, \tag{1.200}$$

or, alternatively, as

$$QT = TQ \quad \forall Q \in \mathrm{Orth}^+. \tag{1.201}$$

We have the following characterization for isotropic tensors.

Theorem 1.8.1. A tensor $T$ is isotropic if and only if it is of the form $T = \lambda I$, where $\lambda \in \Re$.

Proof. If $T = \lambda I$, then obviously Eqn. (1.200) holds.

To prove the converse, let $v$ be an arbitrary vector, and let $R$ be a proper orthogonal tensor for which $v$ is along its axis, i.e., $Rv = v$ (see Eqn. (1.104)). Since $RT = TR$, we have

$$RTv = TRv = Tv,$$

i.e., $Tv$ also lies along the axis of $R$. Since the axis of $R$ is a one-dimensional subspace, we have

$$Tv = \lambda v = \lambda Iv.$$

Since the vector $v$ is arbitrary, we have $T = \lambda I$ by the definition of equality of tensors.

Another proof can be given as follows. Express Eqn. (1.200) as

$$\mathsf{Q}T = T,$$

where $\mathsf{Q} = Q \boxtimes Q$. By Theorem 1.7.2, $I$, $Q$ and $Q^2$ are the eigentensors of $\mathsf{Q}$ corresponding to the eigenvalue one. Hence, the above equation implies that

$$T = \alpha_0(Q)I + \alpha_1(Q)Q + \alpha_2(Q)Q^2,$$





where $\alpha_0$, $\alpha_1$ and $\alpha_2$ are, in general, constants that depend on $Q$. Since $T$ is independent of $Q$, we have

$$T = \alpha_0I + \alpha_1Q_0 + \alpha_2Q_0^2,$$

where, now, $\alpha_0$, $\alpha_1$ and $\alpha_2$ are constants. Substituting this expression into Eqn. (1.201), we get $\alpha_1 = \alpha_2 = 0$, since, in general, rotations do not commute. This methodology can be extended to other dimensions as well. For example, if the dimension $n$ is four, then we get $T = \alpha_0I + \alpha_1Q_0 + \alpha_2Q_0^2 + \alpha_3Q_0^3$, but again, since rotations do not commute, we get all the constants other than $\alpha_0$ to be zero. Now consider the case when $n$ is two. In this case, $T = \alpha_0I + \alpha_1Q_0$, but now, since rotations do commute when $n = 2$, both $\alpha_0$ and $\alpha_1$ can be nonzero. Using the representation of $Q_0$ in terms of $\cos\theta_0$ and $\sin\theta_0$, we can write the most general form of an isotropic second-order tensor when $n = 2$ as

$$T = \begin{bmatrix} \alpha & \beta \\ -\beta & \alpha \end{bmatrix}.$$
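The two-dimensional result is easy to confirm numerically. The sketch below rotates a tensor of the form $\alpha I + \beta(\text{skew part})$ by an arbitrary planar rotation and checks that $QTQ^T = T$ to rounding error, while a generic diagonal tensor with unequal entries is changed by the same rotation. The numerical values and function names are arbitrary illustrative choices.

```python
import math

def matmul2(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose2(X):
    return [[X[j][i] for j in range(2)] for i in range(2)]

def rotation2(theta):
    """Planar rotation through the angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def rotate_tensor(Q, T):
    """Return Q T Q^T."""
    return matmul2(matmul2(Q, T), transpose2(Q))

def max_abs_diff(X, Y):
    return max(abs(X[i][j] - Y[i][j]) for i in range(2) for j in range(2))

alpha, beta = 1.5, -0.7
T_iso = [[alpha, beta], [-beta, alpha]]     # form derived in the text for n = 2
T_other = [[1.0, 0.0], [0.0, 2.0]]          # not of that form
Q = rotation2(0.83)

change_iso = max_abs_diff(rotate_tensor(Q, T_iso), T_iso)
change_other = max_abs_diff(rotate_tensor(Q, T_other), T_other)
```

Here `change_iso` should be at the level of floating-point rounding, whereas `change_other` is of order one.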

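A third-order tensor is isotropic if and only if it is of the form $\lambda\mathcal{E}$, where $\mathcal{E}$ is the alternate tensor with components $\epsilon_{ijk}$. To see the ‘if’ part, note that

$$\bar{\mathcal{E}}_{ijk} = \bar{e}_i \cdot (\bar{e}_j \times \bar{e}_k) = Q^Te_i \cdot [\operatorname{cof} Q^T(e_j \times e_k)] = Q^Te_i \cdot [Q^T(e_j \times e_k)] = e_i \cdot (e_j \times e_k) = \mathcal{E}_{ijk}.$$

The reverse implication is proved by making special choices of $Q$ (such as $e_1 \otimes e_1 + e_2 \otimes e_3 - e_3 \otimes e_2$, $e_2 \otimes e_2 + e_3 \otimes e_1 - e_1 \otimes e_3$, etc.) in the relation

$$B_{ijk} = Q_{ip}Q_{jq}Q_{kr}B_{pqr},$$

and showing that $B_{ijk}$ has to be a scalar multiple of $\epsilon_{ijk}$.

A fourth-order tensor is said to be isotropic if it has the same components in all orthonormal bases, i.e., if

$$\mathsf{C} = \mathsf{Q}\mathsf{C}\mathsf{Q}^T, \tag{1.202}$$

where $\mathsf{Q} := Q \boxtimes Q$, with $Q \in \mathrm{Orth}^+$. Alternatively, the above condition can also be written as

$$\mathsf{Q}(\mathsf{C}A) = \mathsf{C}(QAQ^T) \quad \forall A \in \mathrm{Lin}. \tag{1.203}$$

We have the following characterization for isotropic fourth-order tensors:

Theorem 1.8.2. A fourth-order tensor $\mathsf{C} : \mathrm{Lin} \to \mathrm{Lin}$ is isotropic if and only if it is of the form

$$\begin{aligned}
\mathsf{C} &= \lambda I \otimes I + (\mu + \gamma)\mathsf{I} + (\mu - \gamma)\mathsf{T} \\
&= \lambda I \otimes I + 2\mu\mathsf{S} + 2\gamma\mathsf{W},
\end{aligned} \tag{1.204}$$

or, in other words, a fourth-order isotropic tensor is a linear combination of tensors with components $\delta_{ij}\delta_{kl}$, $\delta_{ik}\delta_{jl}$ and $\delta_{il}\delta_{jk}$.

If $\mathsf{C}$ has the first or second minor symmetry, i.e., if $\mathsf{C} = \mathsf{S}\mathsf{C}$ or $\mathsf{C} = \mathsf{C}\mathsf{S}$, then using Eqn. (1.160), it follows as a corollary that

$$\mathsf{C} = \lambda I \otimes I + 2\mu\mathsf{S}. \tag{1.205}$$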
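The ‘if’ part of Theorem 1.8.2 can be illustrated in components. The sketch below builds $C_{ijkl} = \lambda\delta_{ij}\delta_{kl} + (\mu + \gamma)\delta_{ik}\delta_{jl} + (\mu - \gamma)\delta_{il}\delta_{jk}$, applies the component transformation $\bar{C}_{ijkl} = Q_{ip}Q_{jq}Q_{kr}Q_{ls}C_{pqrs}$ with a rotation assembled from two elementary rotations, and measures the change. The numerical constants and function names are arbitrary assumptions made for the illustration.

```python
import math
from itertools import product

def delta(i, j):
    return 1.0 if i == j else 0.0

def isotropic_tensor(lam, mu, gam):
    """Components λ δ_ij δ_kl + (μ+γ) δ_ik δ_jl + (μ-γ) δ_il δ_jk."""
    return {(i, j, k, l): lam * delta(i, j) * delta(k, l)
            + (mu + gam) * delta(i, k) * delta(j, l)
            + (mu - gam) * delta(i, l) * delta(j, k)
            for i, j, k, l in product(range(3), repeat=4)}

def rotation3(a, b):
    """Rotation about e3 by the angle a composed with a rotation about e1 by b."""
    ca, sa, cb, sb = math.cos(a), math.sin(a), math.cos(b), math.sin(b)
    Rz = [[ca, -sa, 0.0], [sa, ca, 0.0], [0.0, 0.0, 1.0]]
    Rx = [[1.0, 0.0, 0.0], [0.0, cb, -sb], [0.0, sb, cb]]
    return [[sum(Rz[i][k] * Rx[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rotate4(Q, C):
    """Transformed components Q_ip Q_jq Q_kr Q_ls C_pqrs."""
    return {(i, j, k, l): sum(Q[i][p] * Q[j][q] * Q[k][r] * Q[l][s] * C[p, q, r, s]
                              for p, q, r, s in product(range(3), repeat=4))
            for i, j, k, l in product(range(3), repeat=4)}

def max_change(Q, C):
    rotated = rotate4(Q, C)
    return max(abs(rotated[key] - C[key]) for key in C)

Q = rotation3(0.4, 1.1)
C_iso = isotropic_tensor(1.2, 0.7, 0.3)
C_aniso = dict(C_iso)
C_aniso[0, 0, 0, 0] += 1.0        # perturbation that destroys isotropy
```

`max_change(Q, C_iso)` should vanish to rounding error, while `max_change(Q, C_aniso)` is clearly nonzero. This only tests sufficiency for one rotation; it says nothing about the converse, which is the substance of the proof.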





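Proof. The proof is based on [157]. To see the “if” part, note that

$$\begin{aligned}
\mathsf{Q}(I \otimes I)\mathsf{Q}^T &= (\mathsf{Q}I) \otimes (\mathsf{Q}I) = [(Q \boxtimes Q)I] \otimes [(Q \boxtimes Q)I] = I \otimes I, \\
\mathsf{Q}\mathsf{I}\mathsf{Q}^T &= \mathsf{Q}\mathsf{Q}^T = \mathsf{I}, \\
\mathsf{Q}\mathsf{T}\mathsf{Q}^T &= (Q \boxtimes Q)\mathsf{T}\mathsf{Q}^T = \mathsf{T}(Q \boxtimes Q)\mathsf{Q}^T = \mathsf{T}\mathsf{Q}\mathsf{Q}^T = \mathsf{T} \quad \text{(by Eqn. (1.173))}.
\end{aligned}$$

To prove the converse, take $A$ to be $Q \in \mathrm{Orth}^+\backslash\mathrm{Sym}$ in Eqn. (1.203) to get

$$\mathsf{Q}(\mathsf{C}Q) = \mathsf{C}Q.$$

Thus, $\mathsf{C}Q$ is a linear combination of the eigentensors of $\mathsf{Q}$ corresponding to the eigenvalue one, i.e., by virtue of Theorem 1.7.2,

$$\mathsf{C}Q = \alpha_0(Q)I + \alpha_1(Q)Q + \alpha_2(Q)Q^2 \quad \forall Q \in \mathrm{Orth}^+\backslash\mathrm{Sym}.$$

In the limiting cases following Eqn. (1.105b), namely, (i) $\alpha \to 0$, we have $Q = I$, so that $\alpha_1$ and $\alpha_2$ can be taken to be zero, and (ii) $\alpha \to \pi$, we have $Q^2 = I$, so that $\alpha_2$ can be taken to be zero in the above expression. Thus, by virtue of the continuous dependence of $Q$ on $\alpha$, we can write

$$\mathsf{C}Q = \alpha_0(Q)I + \alpha_1(Q)Q + \alpha_2(Q)Q^2 \quad \forall Q \in \mathrm{Orth}^+, \tag{1.206}$$

with $\alpha_1 = \alpha_2 = 0$ when $Q = I$, and $\alpha_2 = 0$ when $Q = 2e \otimes e - I$.

We now show that the functions $\alpha_i$, $i = 0, 1, 2$, are in fact functions of the principal invariants of $Q$ (which is just the trace, since, for proper orthogonal tensors, the determinant is equal to unity, and the second invariant is equal to the trace). By letting $A \equiv R_0 \in \mathrm{Orth}^+$ in Eqn. (1.203), we get

$$\mathsf{C}R_0 = Q^T\mathsf{C}(QR_0Q^T)Q \quad \forall Q \in \mathrm{Orth}^+.$$

Substituting the expressions for $\mathsf{C}R_0$ and $\mathsf{C}(QR_0Q^T)$ obtained from Eqn. (1.206) into the above equation, we get

$$\left[\alpha_0(R_0) - \alpha_0(QR_0Q^T)\right]I + \left[\alpha_1(R_0) - \alpha_1(QR_0Q^T)\right]R_0 + \left[\alpha_2(R_0) - \alpha_2(QR_0Q^T)\right]R_0^2 = 0.$$

Since, as shown in the discussion on page 38, $\{I, R_0, R_0^2\}$, $\{I, R_0\}$ and $\{I\}$ are linearly independent sets, depending on whether $R_0 \in \mathrm{Orth}^+\backslash\mathrm{Sym}$, $R_0 \in \mathrm{Orth}^+ \cap \mathrm{Sym} - \{I\}$ ($\alpha_2 = 0$), or $R_0 = I$ ($\alpha_1 = \alpha_2 = 0$), we have

$$\alpha_i(R_0) = \alpha_i(QR_0Q^T), \quad i = 0, 1, 2, \quad \forall Q \in \mathrm{Orth}^+, \tag{1.207}$$

i.e., the $\alpha_i$'s are isotropic scalar-valued functions of $R_0$. To show that they are functions of $I_1(R_0)$, we have to show that

$$I_1(Q_1) = I_1(Q_2) \implies \alpha_i(Q_1) = \alpha_i(Q_2), \quad i = 0, 1, 2,$$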





and then the conclusion follows from Theorem 1.6.7. If $I_1(Q_1) = I_1(Q_2)$, then by Theorem 1.5.5, there exists $Q_0 \in \mathrm{Orth}^+$ such that $Q_2 = Q_0Q_1Q_0^T$. Thus, using Eqn. (1.207), we get $\alpha_i(Q_2) = \alpha_i(Q_0Q_1Q_0^T) = \alpha_i(Q_1)$. It follows from Theorem 1.6.7 that there exist functions $\alpha_i$ such that $\alpha_i(Q) = \alpha_i(I_1(Q))$.

Equation (1.206) can now be written as

$$\mathsf{C}Q = \alpha_0(I_1)I + \alpha_1(I_1)Q + \alpha_2(I_1)Q^2. \tag{1.208}$$

Eliminating $Q^2$ from Eqn. (1.208) using Eqn. (1.110), we get

$$\mathsf{C}Q = [\alpha_0(I_1) - \alpha_2(I_1)I_1]I + [\alpha_1(I_1) + \alpha_2(I_1)I_1]Q + \alpha_2(I_1)Q^T.$$

Since $\mathsf{C}$ is independent of $Q$, and a linear transformation from Lin to Lin, the right-hand side of the above expression has to be linear in $Q$, i.e., the coefficients of $Q$ and $Q^T$ are constants, and the coefficient of $I$ has to be a multiple of $\operatorname{tr} Q$. Thus, set $\alpha_0(I_1) \equiv (\lambda + \mu - \gamma)I_1$, $\alpha_1(I_1) \equiv (\mu + \gamma) - (\mu - \gamma)I_1$, and $\alpha_2(I_1) \equiv \mu - \gamma$, where $\lambda$, $\mu$ and $\gamma$ are constants. This leads to

$$\begin{aligned}
\mathsf{C}Q &= \lambda(\operatorname{tr} Q)I + (\mu + \gamma)Q + (\mu - \gamma)Q^T \\
&= [\lambda(I \otimes I) + (\mu + \gamma)\mathsf{I} + (\mu - \gamma)\mathsf{T}]\,Q \quad \forall Q \in \mathrm{Orth}^+.
\end{aligned} \tag{1.209}$$

The last step is to show that the above equation implies the expression given in Eqn. (1.204). Thus, by virtue of Eqn. (1.157), we have to show that Eqn. (1.209) holds for an arbitrary tensor $T$ in place of $Q$. Let $\mathsf{C}_{\mathrm{iso}} := [\lambda(I \otimes I) + (\mu + \gamma)\mathsf{I} + (\mu - \gamma)\mathsf{T}]$. Since fourth-order tensors are linear transformations from Lin to Lin, Eqn. (1.209) implies that

$$[\mathsf{C} - \mathsf{C}_{\mathrm{iso}}]\left(\sum_i \phi_iQ_i\right) = 0 \quad \forall \phi_i \in \Re,\; Q_i \in \mathrm{Orth}^+.$$

Since by Theorem 1.5.3, any $T \in \mathrm{Lin}$ can be expressed as a linear combination of $Q \in \mathrm{Orth}^+$, the proof is complete.

For various other proofs, see, e.g., [24], [102], [103], [105], [110], [142], [148], [175], [210], [250] and [251]. Note that this result does not hold when the space dimension $n = 2$, as the counterexample $\mathsf{C} = T \boxtimes T$, where $T = \begin{bmatrix} \alpha & \beta \\ -\beta & \alpha \end{bmatrix}$, shows (the above proof fails in the last step for this case, since an arbitrary tensor cannot be expressed as a linear combination of rotations). However, the result is applicable for any $n$ if we assume $\mathsf{C}$ to be symmetric.

1.9 Differentiation of Tensors

In the subsequent chapters, we shall deal with tensor fields, where a tensor $T$ is a function of a scalar quantity $t$, and a vector quantity $x$ (e.g., $t$ representing time, and $x$ the position vector). In this section, we define the concept of the derivative of a tensor quantity. Let $X$ and $Y$ denote normed vector spaces, and let $f : \Omega \subset X \to Y$. We say that $f(u)$ approaches zero faster than $u$, and write $f(u) = o(u)$, if

$$\lim_{u \to 0}\frac{\|f(u)\|}{\|u\|} = 0.$$

1.9.1 The directional derivative

Let $g$ be a function whose values are scalars, vectors or tensors, and whose domain $\Omega$ is an open interval of $\Re$. The derivative $Dg(t)$ is defined by

$$Dg(t) := \frac{d}{dt}g(t) = \lim_{\alpha \to 0}\frac{1}{\alpha}\left[g(t + \alpha) - g(t)\right]. \tag{1.210}$$

From Eqn. (1.210), it follows that

$$\lim_{\alpha \to 0}\frac{1}{\alpha}\left[g(t + \alpha) - g(t) - \alpha Dg(t)\right] = 0,$$

or, equivalently,

$$g(t + \alpha) = g(t) + \alpha Dg(t) + o(\alpha). \tag{1.211}$$

Thus, we see that the derivative of a vector function is a vector, and the derivative of a tensor function is a tensor.

We now extend the above concept of a derivative to domains that lie in spaces of dimensions higher than one. Let $L(X, Y)$ denote the vector space of all continuous linear mappings from $X$ to $Y$, and let $g : \Omega \subset X \to Y$. We say that $g$ is differentiable at $x \in \Omega$ if there exists a linear transformation $Dg(x) \in L(X, Y)$ such that

$$g(x + u) = g(x) + Dg(x)[u] + o(u). \tag{1.212}$$

The quantity $Dg(x) \in L(X, Y)$ is called the Fréchet derivative, while the quantity $Dg(x)[u] \in Y$ is called the directional derivative or Gâteaux derivative. Note that the directional derivative is computed by the action of the Fréchet derivative on elements of $X$. If the directional derivative exists, it is unique, and can be written as

$$Dg(x)[u] = \lim_{\alpha \to 0}\frac{1}{\alpha}\left[g(x + \alpha u) - g(x)\right] = \left.\frac{d}{d\alpha}g(x + \alpha u)\right|_{\alpha=0}. \tag{1.213}$$

When $\Omega \subset \Re$, by comparing Eqns. (1.211) and (1.212), we see that

$$Dg(t)[\alpha] = \alpha\frac{dg}{dt} \quad \forall \alpha \in \Re.$$

Often, the easiest way of computing the directional derivative is to appeal directly to its definition. We now consider some examples. Consider the function $\phi(v) = v \cdot v$. Then

$$\begin{aligned}
\phi(v + u) &= (v + u) \cdot (v + u) \\
&= v \cdot v + 2v \cdot u + u \cdot u \\
&= \phi(v) + 2v \cdot u + o(u),
\end{aligned}$$





so that

$$D\phi(v)[u] = 2v \cdot u.$$

Now consider the function $G : \mathrm{Lin} \to \mathrm{Lin}$ defined by $G(T) = T^2$, where $T$ is a second-order tensor. Then we have

$$G(T + U) = (T + U)^2 = T^2 + TU + UT + U^2 = G(T) + TU + UT + o(U),$$

which implies that $DG(T)[U] = TU + UT$.

Similarly, for $G(T) = T^3$, we have

$$G(T + U) = G(T) + T^2U + TUT + UT^2 + o(U),$$

which implies that $DG(T)[U] = T^2U + TUT + UT^2$.

For $\phi(T) = \operatorname{tr}(T)$, we have $\phi(T + U) = \phi(T) + \operatorname{tr} U$, which yields

$$D(\operatorname{tr} T)[U] = \operatorname{tr} U = I : U. \tag{1.214}$$
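The directional derivatives of $T^2$ and $T^3$ obtained above can be compared against a finite-difference approximation of the limit in Eqn. (1.213). The sketch below uses a central difference with a small step; the matrices, the step size and the function names are arbitrary choices made for illustration.

```python
def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add_scaled(T, h, U):
    """Return T + h U."""
    n = len(T)
    return [[T[i][j] + h * U[i][j] for j in range(n)] for i in range(n)]

def square(T):
    return matmul(T, T)

def cube(T):
    return matmul(T, matmul(T, T))

def central_difference(G, T, U, h=1e-5):
    """Approximate DG(T)[U] by [G(T + hU) - G(T - hU)] / (2h)."""
    plus, minus = G(add_scaled(T, h, U)), G(add_scaled(T, -h, U))
    n = len(T)
    return [[(plus[i][j] - minus[i][j]) / (2 * h) for j in range(n)] for i in range(n)]

def max_abs_diff(X, Y):
    n = len(X)
    return max(abs(X[i][j] - Y[i][j]) for i in range(n) for j in range(n))

T = [[1.0, 2.0, 0.0], [0.0, 1.0, 3.0], [4.0, 0.0, 1.0]]
U = [[0.0, 1.0, 1.0], [2.0, 0.0, 1.0], [1.0, 1.0, 0.0]]

TU, UT, T2 = matmul(T, U), matmul(U, T), matmul(T, T)
exact_square = add_scaled(TU, 1.0, UT)                        # TU + UT
exact_cube = add_scaled(add_scaled(matmul(T2, U), 1.0, matmul(TU, T)),
                        1.0, matmul(U, T2))                   # T²U + TUT + UT²

err_square = max_abs_diff(central_difference(square, T, U), exact_square)
err_cube = max_abs_diff(central_difference(cube, T, U), exact_cube)
```

Both errors should be small compared with the entries of the derivatives themselves; they are limited by the truncation and rounding error of the difference quotient, not by the formulas.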

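If $\phi(T) = \operatorname{tr} T^k$, then (with $T^0 \equiv I$)

$$(T + U)^k = T^k + \sum_{i=1}^{k}T^{k-i}UT^{i-1} + o(U),$$

and on using the linearity of the trace operator, and the fact that $\operatorname{tr} AB = \operatorname{tr} BA$, we get

$$\begin{aligned}
\phi(T + U) &= \phi(T) + \sum_{i=1}^{k}\operatorname{tr}[T^{k-i}UT^{i-1}] + o(U) \\
&= \phi(T) + \sum_{i=1}^{k}\operatorname{tr}[T^{k-1}U] + o(U) \\
&= \phi(T) + k(T^{k-1})^T : U + o(U),
\end{aligned}$$

so that

$$D\phi(T)[U] = k(T^{k-1})^T : U. \tag{1.215}$$

If $\phi(T) = \det T$, then from Problem 14, we have

$$\begin{aligned}
\det(T + U) &= \det T + \operatorname{cof} T : U + T : \operatorname{cof} U + \det U \\
&= \det T + \operatorname{cof} T : U + o(U),
\end{aligned}$$

which yields

$$D\phi(T)[U] = \operatorname{cof} T : U. \tag{1.216}$$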
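The expansion of $\det(T + U)$ used above is an exact polynomial identity in three dimensions, so it can be verified with integer arithmetic. In the sketch below the cofactor is computed entry by entry as the signed minor (which agrees with $\operatorname{cof} T = (\det T)T^{-T}$ for invertible $T$); the matrices and function names are arbitrary illustrative choices.

```python
def det3(M):
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

def cof3(M):
    """Cofactor tensor: entry (i, j) is (-1)^(i+j) times the (i, j) minor."""
    C = [[0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            r = [a for a in range(3) if a != i]
            c = [b for b in range(3) if b != j]
            minor = M[r[0]][c[0]] * M[r[1]][c[1]] - M[r[0]][c[1]] * M[r[1]][c[0]]
            C[i][j] = (-1) ** (i + j) * minor
    return C

def ddot(A, B):
    """Double contraction A : B = A_ij B_ij."""
    return sum(A[i][j] * B[i][j] for i in range(3) for j in range(3))

T = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]
U = [[1, 0, 2], [0, 1, 1], [3, 1, 0]]
T_plus_U = [[T[i][j] + U[i][j] for j in range(3)] for i in range(3)]

lhs_value = det3(T_plus_U)
rhs_value = det3(T) + ddot(cof3(T), U) + ddot(T, cof3(U)) + det3(U)
directional_derivative = ddot(cof3(T), U)    # D(det)(T)[U] by Eqn. (1.216)
```

The terms $T : \operatorname{cof} U$ and $\det U$ are quadratic and cubic in $U$, which is why only $\operatorname{cof} T : U$ survives in the directional derivative.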

Recall that when $T$ is invertible, $\operatorname{cof} T = (\det T)T^{-T}$.

We now discuss the invariance properties of the derivative of a tensor function.





Theorem 1.9.1. Let $G : \mathrm{Sym} \to \mathrm{Sym}$ be an isotropic tensor function. Then the derivative is invariant in the following sense:

$$Q\,DG(S)[U]\,Q^T = DG(QSQ^T)[QUQ^T] \quad \forall Q \in \mathrm{Orth}^+,\; S, U \in \mathrm{Sym}. \tag{1.217}$$

Proof. We have already remarked that Sym is invariant under $\mathrm{Orth}^+$. Thus, we simply need to prove Eqn. (1.217). Since $G(S)$ is isotropic, we have

$$\begin{aligned}
G\left(Q(S + U)Q^T\right) &= QG(S + U)Q^T \\
&= QG(S)Q^T + Q\,DG(S)[U]\,Q^T + o(U) \\
&= G(QSQ^T) + Q\,DG(S)[U]\,Q^T + o(U).
\end{aligned}$$

On the other hand, we also have

$$\begin{aligned}
G\left(Q(S + U)Q^T\right) &= G(QSQ^T + QUQ^T) \\
&= G(QSQ^T) + DG(QSQ^T)[QUQ^T] + o(U).
\end{aligned}$$

From the above equations, and the uniqueness of the derivative, we get Eqn. (1.217).

1.9.2 Product rule

Quite often, we will be required to compute the directional derivative of a bilinear map $\pi(f, g)$. Examples of bilinear maps are [108]

• the product of a scalar and a vector

$$\pi(\phi, v) = \phi v,$$

• the inner product of two vectors

$$\pi(u, v) = u \cdot v,$$

• the tensor product of two vectors

$$\pi(u, v) = u \otimes v,$$

• the action of a tensor on a vector

$$\pi(T, v) = Tv,$$

and so forth. Assuming that $f$ and $g$ are differentiable at a point $x$ in the common domain of $f$ and $g$, the product $\pi(f(x), g(x))$ is differentiable, and its directional derivative is given by (see [108] for the proof)

$$D\pi(f(x), g(x))[u] = \pi(f(x), Dg(x)[u]) + \pi(Df(x)[u], g(x)) \quad \forall u. \tag{1.218}$$





When $x \in \Re$, then we simply have (with $x$ replaced by $t$, and a superposed dot denoting $d/dt$)

$$\frac{d}{dt}\pi(f(t), g(t)) = \pi(f(t), \dot{g}(t)) + \pi(\dot{f}(t), g(t)).$$

Another result that is used frequently is the chain rule, which we now state.

1.9.3 Chain rule

Let $f : \Omega_2 \to \Omega_3$ and $g : \Omega_1 \to \Omega_2$, and let $g(x)$ be differentiable at $x \in \Omega_1$. Further, let $f$ be differentiable at $y = g(x)$. Then the composition $f \circ g$ is differentiable at $x$, and

$$D(f \circ g)(x)[u] = Df(g(x))[Dg(x)[u]]. \tag{1.219}$$

In case $x \in \Re$, then writing $t$ in place of $x$, we have

$$D(f \circ g)(t)[\alpha] = \alpha\frac{d}{dt}(f \circ g), \qquad Dg(t)[\alpha] = \alpha\frac{dg}{dt},$$

and hence,

$$\frac{d}{dt}f(g(t)) = Df(g(t))\left[\frac{dg}{dt}\right]. \tag{1.220}$$

We have already shown how to find the directional derivatives of the first and third invariants. To illustrate the product and chain rules, we now compute the directional derivative of the second invariant of a second-order tensor, $I_2(T) = (\det T)(\operatorname{tr} T^{-1})$, given by

$$DI_2(T)[U] = \det T\left[\operatorname{tr}(T^{-1}U)\operatorname{tr} T^{-1} - \operatorname{tr}(T^{-1}UT^{-1})\right] = \left[(\operatorname{tr} T)I - T^T\right] : U. \tag{1.221}$$

First we calculate the derivative of $T^{-1}$. Since $TT^{-1} = I$, we have

$$DT[U]\,T^{-1} + T\,DT^{-1}[U] = 0,$$

and since $DT[U] = U$, we get $DT^{-1}[U] = -T^{-1}UT^{-1}$. Next, denoting $\operatorname{tr} T^{-1}$ by $\phi(G(T))$, where $G(T) = T^{-1}$, we have

$$D\phi(T)[U] = D\phi(G)[DG(T)[U]] = -D\phi(G)[T^{-1}UT^{-1}] = -\operatorname{tr}(T^{-1}UT^{-1}).$$

Finally, writing $I_2$ as $(\det T)\phi(T)$, and applying the product rule, we get

$$\begin{aligned}
DI_2(T)[U] &= D(\det T)(T)[U]\,\phi(T) + \det T\,D\phi(T)[U] \\
&= \det T\operatorname{tr}(T^{-1}U)\operatorname{tr} T^{-1} - \det T\operatorname{tr}(T^{-1}UT^{-1}),
\end{aligned}$$

which proves the first relation in Eqn. (1.221). To get the second relation, we note from Eqn. (1.80) that $I_2$ is also given by

$$\frac{1}{2}\left[(\operatorname{tr} T)^2 - \operatorname{tr}(T^2)\right].$$





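Hence, we have

$$\begin{aligned}
DI_2[U] &= \frac{1}{2}\left[2(\operatorname{tr} T)\operatorname{tr} U - 2\operatorname{tr}(TU)\right] \\
&= \left[(\operatorname{tr} T)I - T^T\right] : U.
\end{aligned}$$

The second relation can also be obtained directly from the first using the Cayley–Hamilton theorem.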
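Because $I_2 = \tfrac{1}{2}[(\operatorname{tr} T)^2 - \operatorname{tr}(T^2)]$ is quadratic in $T$, its expansion about $T$ terminates: $I_2(T + U) = I_2(T) + [(\operatorname{tr} T)I - T^T] : U + I_2(U)$. The sketch below checks this identity exactly for one pair of integer matrices; the matrices and function names are arbitrary illustrative choices.

```python
from fractions import Fraction

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def trace(X):
    return sum(X[i][i] for i in range(3))

def second_invariant(T):
    """I2 = [(tr T)^2 - tr(T^2)] / 2."""
    return Fraction(trace(T) ** 2 - trace(matmul(T, T)), 2)

def grad_second_invariant(T):
    """(tr T) I - T^T, the gradient appearing in Eqn. (1.221)."""
    tr = trace(T)
    return [[(tr if i == j else 0) - T[j][i] for j in range(3)] for i in range(3)]

def ddot(A, B):
    return sum(A[i][j] * B[i][j] for i in range(3) for j in range(3))

T = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]
U = [[1, 0, 2], [0, 1, 1], [3, 1, 0]]
T_plus_U = [[T[i][j] + U[i][j] for j in range(3)] for i in range(3)]

linear_term = ddot(grad_second_invariant(T), U)          # DI2(T)[U]
lhs_value = second_invariant(T_plus_U)
rhs_value = second_invariant(T) + linear_term + second_invariant(U)
```

The remainder $I_2(U)$ is quadratic in $U$, so it is $o(U)$ and the linear term is indeed the directional derivative.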

One other useful result is the following:

$$\frac{d}{dt}(\det T) = \operatorname{cof} T : \frac{dT}{dt}. \tag{1.222}$$

To see this, note that

$$\begin{aligned}
\frac{d}{dt}(\det T) &= D\det(T)\left[\frac{dT}{dt}\right] && \text{(by Eqn. (1.220))} \\
&= \operatorname{cof} T : \frac{dT}{dt} && \text{(by Eqn. (1.216))}.
\end{aligned}$$

When $T$ is invertible, then using $\operatorname{cof} T = (\det T)T^{-T}$, we can also write Eqn. (1.222) as

$$\frac{d}{dt}(\det T) = \det T\,\operatorname{tr}\left(\frac{dT}{dt}T^{-1}\right). \tag{1.223}$$

By taking the directional derivative of the Cayley–Hamilton equation, we get the following “Rivlin’s identity” [68]:

$$\begin{aligned}
T^2U + TUT + UT^2 &- (\operatorname{tr} U)T^2 - (\operatorname{tr} T)(TU + UT) + \frac{1}{2}\left[(\operatorname{tr} T)^2 - \operatorname{tr} T^2\right]U \\
&+ \left[(\operatorname{tr} T)(\operatorname{tr} U) - \operatorname{tr}(TU)\right]T - (\operatorname{cof} T : U)I = 0.
\end{aligned}$$

Similar identities can be derived in other space dimensions by taking the directional derivatives of the relevant Cayley–Hamilton equation. Identities involving three tensors can be derived by taking the directional derivative twice. Other identities can be derived by taking the directional derivative of the Cayley–Hamilton equation with a tensor function [68].

1.9.4 Gradient, divergence and curl

When $X$ is a Hilbert space, the directional derivative of a real-valued function $\phi : \Omega \subset X \to \Re$ can be identified with an element of the space $X$. The derivative $D\phi(x)$ is an element of the dual space $X' = L(X, \Re)$ and thus, since the space $X$ is a Hilbert space, there exists by the Riesz representation theorem [63] a unique element $\nabla\phi(x)$ in the space $X$ that satisfies

$$(\nabla\phi(x), u) = D\phi(x)[u], \tag{1.224}$$

where $(\cdot, \cdot)$ denotes the inner product of the space $X$. The element $\nabla\phi(x)$ is called the gradient of $\phi$. Note that while $D\phi(x)$ belongs to the dual space $X'$, the gradient $\nabla\phi(x)$ belongs to the space $X$.





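As an example, consider the case when $X = \Re^3$ equipped with the Euclidean inner product $(u, v) = u \cdot v$. Then we have

$$\phi(x + u) = \phi(x) + \nabla\phi(x) \cdot u + o(u).$$

To find the component form of the gradient, we take $u = e_i$ to get

$$(\nabla\phi)_i = D\phi(x)[e_i] = \lim_{\alpha \to 0}\frac{1}{\alpha}\left[\phi(x + \alpha e_i) - \phi(x)\right] = \frac{\partial\phi(x)}{\partial x_i}.$$

Thus, we have

$$D\phi(x)[u] = \nabla\phi \cdot u = \frac{\partial\phi}{\partial x_i}u_i.$$

As another example, consider the case when $X = \mathrm{Lin}$, with the matrix inner product $(A, B) = A : B$. Then the gradient of $\phi(T)$ is the matrix $(\partial\phi/\partial T_{ij})$. By definition, we have

$$D\phi(T)[U] = \frac{\partial\phi}{\partial T} : U = \frac{\partial\phi}{\partial T_{ij}}U_{ij}.$$

Similarly, if $V(T)$ is a tensor-valued function of $T$, then the gradient and directional derivative of $V$ are related as

$$DV(T)[U] = \frac{\partial V}{\partial T}U. \tag{1.225}$$

The gradients of the invariants $I_1$, $I_2$ and $I_3$ are obtained from Eqns. (1.214), (1.216) and (1.221), and are given by

$$\frac{\partial I_1}{\partial T} = I, \tag{1.226a}$$

$$\frac{\partial I_2}{\partial T} = (\operatorname{tr} T)I - T^T, \tag{1.226b}$$

$$\frac{\partial I_3}{\partial T} = \operatorname{cof} T = \left[I_2I - I_1T + T^2\right]^T. \tag{1.226c}$$

The gradient of $\phi(T) = \operatorname{tr} T^k$ is obtained using Eqn. (1.215), and is given by

$$\frac{\partial}{\partial T}(\operatorname{tr} T^k) = k(T^{k-1})^T. \tag{1.227}$$

Since $\partial T_{ij}/\partial T_{kl} = \partial T_{ji}/\partial T_{lk} = \delta_{ik}\delta_{jl}$ and $\partial T_{ji}/\partial T_{kl} = \partial T_{ij}/\partial T_{lk} = \delta_{jk}\delta_{il}$, we have

$$\frac{\partial T}{\partial T} = \frac{\partial T^T}{\partial T^T} = \mathsf{I}, \qquad \frac{\partial T^T}{\partial T} = \frac{\partial T}{\partial T^T} = \mathsf{T}. \tag{1.228}$$

In general, if $\phi_1(T)$ and $\phi_2(T)$ are scalar-valued functions of $T$, then

$$\frac{\partial(\phi_1\phi_2)}{\partial T} = \phi_1\frac{\partial\phi_2}{\partial T} + \phi_2\frac{\partial\phi_1}{\partial T}.$$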





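If φ(T) and U(T) are scalar and tensor-valued functions of T, then

\begin{align*}
\frac{\partial\phi}{\partial T^T} &= \mathbb{T}\frac{\partial\phi}{\partial T}, \tag{1.229}\\
\frac{\partial U}{\partial T^T} &= \frac{\partial U}{\partial T}\mathbb{T}. \tag{1.230}
\end{align*}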

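Since $(\phi U)_{ij} = \phi U_{ij}$, we have

\[ \frac{\partial(\phi U)}{\partial T} = U\otimes\frac{\partial\phi}{\partial T} + \phi\frac{\partial U}{\partial T}. \tag{1.231} \]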

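If U(T) and V(T) are tensor-valued functions of T, then writing U : V as $U_{mn}V_{mn}$, we get

\[ \frac{\partial(U : V)}{\partial T} = \left(\frac{\partial U}{\partial T}\right)^T V + \left(\frac{\partial V}{\partial T}\right)^T U. \tag{1.232} \]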

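Similarly, using the relation $(UV)_{ij} = U_{im}V_{mj}$, we get

\begin{align*}
\left[\frac{\partial(UV)}{\partial T}\right]_{ijkl} &= U_{im}\frac{\partial V_{mj}}{\partial T_{kl}} + \frac{\partial U_{im}}{\partial T_{kl}}V_{mj}\\
&= U_{im}\delta_{nj}\frac{\partial V_{mn}}{\partial T_{kl}} + \frac{\partial U_{nm}}{\partial T_{kl}}\delta_{in}V_{mj}\\
&= U_{im}\delta_{nj}\frac{\partial V_{mn}}{\partial T_{kl}} + \frac{\partial U_{mn}}{\partial T_{kl}}\delta_{im}V_{nj},
\end{align*}

which in tensorial form, with $(A\boxtimes B)_{ijkl} = A_{ik}B_{jl}$, reads

\[ \frac{\partial(UV)}{\partial T} = (U\boxtimes I)\frac{\partial V}{\partial T} + (I\boxtimes V^T)\frac{\partial U}{\partial T}. \tag{1.233} \]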

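Using Eqns. (1.158), (1.173), (1.228) and (1.233), we have the following results:

\begin{align*}
\frac{\partial(T^2)}{\partial T} &= T\boxtimes I + I\boxtimes T^T,\\
\frac{\partial(T^TT)}{\partial T} &= (I\boxtimes T^T)\mathbb{T} + T^T\boxtimes I = 2\mathbb{S}(T^T\boxtimes I), \tag{1.234}\\
\frac{\partial(TT^T)}{\partial T} &= (I\boxtimes T) + (T\boxtimes I)\mathbb{T} = 2\mathbb{S}(I\boxtimes T), \tag{1.235}\\
\frac{\partial(T^{-1})}{\partial T} &= -T^{-1}\boxtimes T^{-T} = -(T\boxtimes T^T)^{-1}, \tag{1.236}\\
\frac{\partial(T^{-T})}{\partial T} &= -\mathbb{T}(T^{-1}\boxtimes T^{-T}) = -(T^{-T}\boxtimes T^{-1})\mathbb{T} = -\mathbb{T}(T\boxtimes T^T)^{-1}. \tag{1.237}
\end{align*}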

The fourth relation is obtained by taking U and V to be $T^{-1}$ and T, respectively, and then differentiating $T^{-1}T = I$. We get

\begin{align*}
0 &= T^{-1}\boxtimes I + (I\boxtimes T^T)\frac{\partial T^{-1}}{\partial T}\\
\implies \frac{\partial T^{-1}}{\partial T} &= -(I\boxtimes T^T)^{-1}(T^{-1}\boxtimes I)\\
&= -(I\boxtimes T^{-T})(T^{-1}\boxtimes I) && \text{(by Eqn. (1.185))}\\
&= -T^{-1}\boxtimes T^{-T}. && \text{(by Eqn. (1.168))}
\end{align*}
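As a numerical illustration of Eqns. (1.236) and (1.237), the sketch below compares the closed-form directional derivatives with central differences. It assumes that the fourth-order product appearing in Eqn. (1.233) acts on a second-order tensor as $(A\boxtimes B)U = AUB^T$, which is what the component form $(A\boxtimes B)_{ijkl} = A_{ik}B_{jl}$ implies; the test matrices and step size are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(1)
# A well-conditioned invertible T and an arbitrary direction U.
T = 3.0 * np.eye(3) + 0.3 * rng.standard_normal((3, 3))
U = rng.standard_normal((3, 3))
h = 1e-6

Tinv = np.linalg.inv(T)
inv_plus = np.linalg.inv(T + h * U)
inv_minus = np.linalg.inv(T - h * U)

# Eqn. (1.236): D(T^{-1})[U] = -(T^{-1} ⊠ T^{-T}) U = -T^{-1} U T^{-1}.
closed_inv = -Tinv @ U @ Tinv
fd_inv = (inv_plus - inv_minus) / (2 * h)
err_inv = np.abs(fd_inv - closed_inv).max()

# Eqn. (1.237): D(T^{-T})[U] is the transpose of the expression above.
closed_invT = closed_inv.T
fd_invT = (inv_plus.T - inv_minus.T) / (2 * h)
err_invT = np.abs(fd_invT - closed_invT).max()

print(err_inv, err_invT)
```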

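The fifth relation is obtained by writing $T^{-T}$ as $\mathbb{T}T^{-1}$ and then using Eqn. (1.236).

If φ(G(T)) is a scalar-valued function of a tensor function G(T), then by the chain rule, we have

\[ \frac{\partial\phi}{\partial T_{ij}} = \frac{\partial\phi}{\partial G_{kl}}\frac{\partial G_{kl}}{\partial T_{ij}}, \]

which implies that

\[ \frac{\partial\phi}{\partial T} = \left(\frac{\partial G}{\partial T}\right)^T\frac{\partial\phi}{\partial G}. \tag{1.238} \]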

Similarly, if H(G(T)) is a tensor-valued function of a tensor function G(T), then

\[ \frac{\partial H}{\partial T} = \frac{\partial H}{\partial G}\frac{\partial G}{\partial T}. \tag{1.239} \]

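Special care has to be exercised in computing the gradients with respect to a symmetric tensor S. As a specific example, let us consider the computation of the first and second-order derivatives of a scalar-valued function W(C), where C ∈ Sym. Let S and ℂ denote these derivatives. We have

\begin{align*}
S(C) &= \frac{1}{2}\left(\frac{\partial W}{\partial C} + \frac{\partial W}{\partial C^T}\right) = \frac{1}{2}\left(\frac{\partial W}{\partial C} + \mathbb{T}\frac{\partial W}{\partial C}\right) && \text{(by Eqn. (1.229))}\\
&= \frac{1}{2}(\mathbb{I} + \mathbb{T})\frac{\partial W}{\partial C} = \mathbb{S}\frac{\partial W}{\partial C}. && \text{(by Eqn. (1.158))}
\end{align*}

In indicial notation, the above equation reads

\[ S_{ij} = \frac{1}{2}\left(\frac{\partial W}{\partial C_{ij}} + \frac{\partial W}{\partial C_{ji}}\right). \]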

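The second-order gradient ℂ = ∂S/∂C is computed as follows:

\begin{align*}
\mathbb{C} &= \frac{1}{2}\left(\frac{\partial(\mathbb{S}S)}{\partial C} + \frac{\partial(\mathbb{S}S)}{\partial C^T}\right) = \frac{1}{2}\left(\frac{\partial(\mathbb{S}S)}{\partial C} + \frac{\partial(\mathbb{S}S)}{\partial C}\mathbb{T}\right) && \text{(by Eqn. (1.230))}\\
&= \frac{1}{2}\mathbb{S}\frac{\partial S}{\partial C}(\mathbb{I} + \mathbb{T})\\
&= \mathbb{S}\frac{\partial S}{\partial C}\mathbb{S} \tag{1.240}\\
&= \mathbb{S}\frac{\partial^2 W}{\partial C\,\partial C}\mathbb{S}. \tag{1.241}
\end{align*}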

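In other words, to find ∂²W/∂C∂C when C ∈ Sym, we compute it as if C is not symmetric, and then pre- and post-multiply the result with the symmetrizer 𝕊. In indicial notation, the above equations read

\begin{align*}
\mathbb{C}_{ijkl} &= \frac{1}{4}\left(\frac{\partial S_{ij}}{\partial C_{kl}} + \frac{\partial S_{ji}}{\partial C_{kl}} + \frac{\partial S_{ij}}{\partial C_{lk}} + \frac{\partial S_{ji}}{\partial C_{lk}}\right)\\
&= \frac{1}{4}\left(\frac{\partial^2 W}{\partial C_{ij}\partial C_{kl}} + \frac{\partial^2 W}{\partial C_{ji}\partial C_{kl}} + \frac{\partial^2 W}{\partial C_{ij}\partial C_{lk}} + \frac{\partial^2 W}{\partial C_{ji}\partial C_{lk}}\right).
\end{align*}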

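Now consider the evaluation of the stretch and rotation tensors of F, and their derivatives for arbitrary space dimension n. Let $\lambda_i^2$, i = 1, 2, . . . , k be the distinct eigenvalues, and $P_i$ be the corresponding projections of C ∈ Psym, so that $C = \sum_{i=1}^k \lambda_i^2 P_i$ and $U = \sum_{i=1}^k \lambda_i P_i$, with $P_i$ given via Eqn. (J.4) as

\[ P_i = \begin{cases} \displaystyle\prod_{\substack{j=1\\ j\neq i}}^{k}\frac{C - \lambda_j^2 I}{\lambda_i^2 - \lambda_j^2} = \prod_{\substack{j=1\\ j\neq i}}^{k}\frac{U - \lambda_j I}{\lambda_i - \lambda_j}, & k > 1,\\[2ex] I, & k = 1. \end{cases} \tag{1.242} \]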

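The rotation tensor R is now obtained as $FU^{-1} = F\sum_{i=1}^k P_i/\lambda_i$. Note that $|\det F| = \det U = \sqrt{\det C}$.

Now we evaluate the derivatives of these tensors. Differentiating the relation C = UU using Eqn. (1.233), we get

\[ \mathbb{S} = \mathbb{S}\left[U\boxtimes I + I\boxtimes U\right]\mathbb{S}\frac{\partial U}{\partial C}\mathbb{S}. \]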

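Since the eigenvalues $\lambda_i$, i = 1, 2, . . . , n, of U are positive, the necessary and sufficient condition that [U ⊠ I + I ⊠ U] be invertible is satisfied¹. Thus, using (1.197), we get the unique solution ∂U/∂C = 𝕊/(2λ) for k = 1, while for k > 1, we have

\[ \frac{\partial U}{\partial C} = \mathbb{S}\left[U\boxtimes I + I\boxtimes U\right]^{-1}\mathbb{S} = \mathbb{S}\left[\sum_{i=1}^{k}\sum_{j=1}^{k}\frac{P_i\boxtimes P_j}{\lambda_i + \lambda_j}\right]\mathbb{S}. \tag{1.243} \]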

Now using Eqns. (1.175), (1.234) and (1.239), we get

\begin{align*}
\frac{\partial U}{\partial F} &= \mathbb{S}\frac{\partial U}{\partial C}\mathbb{S}\frac{\partial C}{\partial F}\\
&= 2\mathbb{S}\left[\sum_{i=1}^{k}\sum_{j=1}^{k}\frac{P_i\boxtimes P_j}{\lambda_i + \lambda_j}\right]\mathbb{S}\left[F^T\boxtimes I\right]\\
&= 2\mathbb{S}\left[\sum_{i=1}^{k}\sum_{j=1}^{k}\frac{(P_iF^T)\boxtimes P_j}{\lambda_i + \lambda_j}\right]. \tag{1.244}
\end{align*}

¹ If C is positive semi-definite instead of positive definite, then ∂U/∂C does not exist when C is singular.

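By differentiating F = RU, and using Eqns. (1.158), (1.168), (1.185), (1.173), we get

\begin{align*}
\frac{\partial R}{\partial F} &= (I\boxtimes U)^{-1}\left[\mathbb{I} - (R\boxtimes I)\frac{\partial U}{\partial F}\right]\\
&= I\boxtimes U^{-1} - 2\left[R\boxtimes U^{-1}\right]\mathbb{S}\sum_{i=1}^{k}\sum_{j=1}^{k}\frac{(P_iF^T)\boxtimes P_j}{\lambda_i + \lambda_j}\\
&= I\boxtimes U^{-1} - \sum_{i=1}^{k}\sum_{j=1}^{k}\frac{(RP_iF^T)\boxtimes(U^{-1}P_j) + \mathbb{T}\left[(U^{-1}P_iF^T)\boxtimes(RP_j)\right]}{\lambda_i + \lambda_j}\\
&= \sum_{i=1}^{k}\sum_{j=1}^{k}\frac{(\lambda_i + \lambda_j)(RP_iR^T)\boxtimes(U^{-1}P_j) - \lambda_i(RP_iR^T)\boxtimes(U^{-1}P_j) - \left[(RP_i)\boxtimes(P_jR^T)\right]\mathbb{T}}{\lambda_i + \lambda_j}\\
&= \sum_{i=1}^{k}\sum_{j=1}^{k}\frac{(RP_iR^T)\boxtimes P_j - \left[(RP_i)\boxtimes(P_jR^T)\right]\mathbb{T}}{\lambda_i + \lambda_j}. \tag{1.245}
\end{align*}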

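Note that we have interchanged i and j in the last term in the second-to-last step. Now using Eqn. (1.243), and the fact that $\dot{U} = (\partial U/\partial C)\dot{C}$ (or, alternatively, Eqn. (1.244) and the fact that $\dot{U} = (\partial U/\partial F)\dot{F}$), and in a similar manner using (1.245), we get $\dot{U} = \dot{C}/(2\lambda)$ and $R^T\dot{R} = (R^T\dot{F} - \dot{F}^TR)/(2\lambda)$ for k = 1, while for k > 1,

\begin{align*}
\dot{U} &= \sum_{i=1}^{k}\sum_{j=1}^{k}\frac{P_i\dot{C}P_j}{\lambda_i + \lambda_j} = \sum_{i=1}^{k}\sum_{j=1}^{k}\frac{P_i(\dot{F}^TF + F^T\dot{F})P_j}{\lambda_i + \lambda_j}, \tag{1.246}\\
\dot{R} &= R\sum_{i=1}^{k}\sum_{j=1}^{k}\frac{P_i(R^T\dot{F} - \dot{F}^TR)P_j}{\lambda_i + \lambda_j}. \tag{1.247}
\end{align*}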
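The rate formulas (1.246) and (1.247) lend themselves to a direct numerical check. The sketch below is one possible implementation, not the book's: it builds U and R from the spectral decomposition of $C = F^TF$ (using NumPy's `eigh`, so the projections are the rank-one tensors $n_i\otimes n_i$, which is equivalent to the distinct-eigenvalue case of Eqn. (1.242)), evaluates the double sums with the rate of F, written `Fdot`, taken as an arbitrary test direction, and compares with central differences. The helper names `polar` and `weighted` are hypothetical.

```python
import numpy as np

def polar(F):
    """Right polar decomposition F = R U from the spectral form of C = F^T F."""
    lam2, N = np.linalg.eigh(F.T @ F)          # eigenvalues lam_i^2, eigenvectors n_i
    lam = np.sqrt(lam2)
    P = [np.outer(N[:, i], N[:, i]) for i in range(len(lam))]
    U = sum(l * p for l, p in zip(lam, P))     # U = sum_i lam_i P_i
    R = F @ sum(p / l for l, p in zip(lam, P)) # R = F U^{-1}
    return R, U, lam, P

rng = np.random.default_rng(2)
F = np.eye(3) + 0.3 * rng.standard_normal((3, 3))
Fdot = rng.standard_normal((3, 3))             # arbitrary rate of F
R, U, lam, P = polar(F)

def weighted(X):
    """sum_i sum_j P_i X P_j / (lam_i + lam_j)."""
    n = len(lam)
    return sum(P[i] @ X @ P[j] / (lam[i] + lam[j]) for i in range(n) for j in range(n))

Udot = weighted(Fdot.T @ F + F.T @ Fdot)       # Eqn. (1.246)
Rdot = R @ weighted(R.T @ Fdot - Fdot.T @ R)   # Eqn. (1.247)

# Central-difference rates along the same direction.
h = 1e-6
Rp, Up, _, _ = polar(F + h * Fdot)
Rm, Um, _, _ = polar(F - h * Fdot)
err_U = np.abs((Up - Um) / (2 * h) - Udot).max()
err_R = np.abs((Rp - Rm) / (2 * h) - Rdot).max()
err_orth = np.abs(R.T @ R - np.eye(3)).max()

print(err_U, err_R, err_orth)
```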

In a similar fashion, one can find $D_F U(F)[Z] = (\partial U/\partial F)Z$. If F is the deformation gradient, then see Eqns. (2.70) and (2.77).

The corresponding results for the left stretch tensor V are

\begin{align*}
\frac{\partial V}{\partial B} &= \mathbb{S}\left[\sum_{i=1}^{k}\sum_{j=1}^{k}\frac{G_i\boxtimes G_j}{\lambda_i + \lambda_j}\right]\mathbb{S},\\
\frac{\partial V}{\partial F} &= 2\mathbb{S}\left[\sum_{i=1}^{k}\sum_{j=1}^{k}\frac{G_i\boxtimes(G_jF)}{\lambda_i + \lambda_j}\right],\\
\dot{V} &= \sum_{i=1}^{k}\sum_{j=1}^{k}\frac{G_i\dot{B}G_j}{\lambda_i + \lambda_j} = \sum_{i=1}^{k}\sum_{j=1}^{k}\frac{G_i(\dot{F}F^T + F\dot{F}^T)G_j}{\lambda_i + \lambda_j}, \tag{1.248}\\
\dot{R} &= \left[\sum_{i=1}^{k}\sum_{j=1}^{k}\frac{G_i(\dot{F}R^T - R\dot{F}^T)G_j}{\lambda_i + \lambda_j}\right]R, \tag{1.249}
\end{align*}

where $G_i = RP_iR^T$ are the projections of V, which are obtained using Eqn. (1.242) by replacing C and U by B and V, respectively. If F is the deformation gradient, then see Eqns. (2.71) and (2.76).

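We now verify the following relations derived by Chen and Wheeler [47] and Wheeler [352] when n = 3:

\begin{align*}
\left(\frac{\partial U}{\partial F}\right)^T U &= F, \qquad \left(\frac{\partial R}{\partial F}\right)^T F = 0,\\
\frac{\partial U}{\partial F}L &= R^TL - \frac{1}{\det\tilde{U}}\tilde{U}(R^TL - L^TR)\tilde{U}U,\\
\frac{\partial R}{\partial F}L &= \frac{1}{\det\tilde{U}}R\tilde{U}(R^TL - L^TR)\tilde{U},
\end{align*}

where L ∈ Lin, and

\[ \tilde{U} := (\mathrm{tr}\,U)I - U = (\lambda_2 + \lambda_3)P_1 + (\lambda_1 + \lambda_3)P_2 + (\lambda_1 + \lambda_2)P_3. \]

Note that $\det\tilde{U} = (\lambda_1 + \lambda_2)(\lambda_2 + \lambda_3)(\lambda_1 + \lambda_3)$, and that

\[ \tilde{U}U = \lambda_1(\lambda_2 + \lambda_3)P_1 + \lambda_2(\lambda_1 + \lambda_3)P_2 + \lambda_3(\lambda_1 + \lambda_2)P_3. \tag{1.250} \]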

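Using the facts $(A\boxtimes B)^T = A^T\boxtimes B^T$, $(\mathbb{C}_1\mathbb{C}_2)^T = \mathbb{C}_2^T\mathbb{C}_1^T$, $\mathbb{T}A = A^T$, $\mathbb{S}^T = \mathbb{S}$, $\mathbb{T}^T = \mathbb{T}$, $P_iP_j = \delta_{ij}P_j$ (no sum on j), $P_i^T = P_i$, $\sum_{i=1}^k P_i = I$, $UP_j = \left[\sum_{m=1}^k \lambda_mP_m\right]P_j = \lambda_jP_j = P_jU$ (no sum on j), we have

\begin{align*}
\left(\frac{\partial U}{\partial F}\right)^T U &= \sum_{i=1}^{k}\sum_{j=1}^{k}\frac{FP_jUP_i + FP_iUP_j}{\lambda_i + \lambda_j}\\
&= F\left[\sum_{i=1}^{k}\sum_{j=1}^{k}\frac{P_jUP_i + P_iUP_j}{\lambda_i + \lambda_j}\right]\\
&= F\left[\sum_{i=1}^{k}\sum_{j=1}^{k}\frac{\lambda_i(P_jP_i + P_iP_j)}{\lambda_i + \lambda_j}\right]\\
&= F\sum_{i=1}^{k}\frac{\lambda_iP_i}{\lambda_i}\\
&= F,\\
\left(\frac{\partial R}{\partial F}\right)^T F &= \sum_{i=1}^{k}\sum_{j=1}^{k}\frac{RP_iR^TFP_j - \mathbb{T}(P_iR^TFP_jR^T)}{\lambda_i + \lambda_j}\\
&= \sum_{i=1}^{k}\sum_{j=1}^{k}\frac{RP_iUP_j - RP_jUP_i}{\lambda_i + \lambda_j}\\
&= \sum_{i=1}^{k}\sum_{j=1}^{k}\frac{\lambda_jR(P_iP_j - P_jP_i)}{\lambda_i + \lambda_j}\\
&= \sum_{i=1}^{k}\frac{\lambda_iR(P_i - P_i)}{2\lambda_i}\\
&= 0,
\end{align*}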

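\begin{align*}
\frac{\partial U}{\partial F}L &= \sum_{i=1}^{k}\sum_{j=1}^{k}\frac{P_iL^TFP_j + P_iF^TLP_j}{\lambda_i + \lambda_j}\\
&= \sum_{i=1}^{k}\sum_{j=1}^{k}\frac{P_iL^TRUP_j + P_iUR^TLP_j}{\lambda_i + \lambda_j}\\
&= \sum_{i=1}^{k}\sum_{j=1}^{k}\frac{\lambda_jP_iL^TRP_j + \lambda_iP_iR^TLP_j}{\lambda_i + \lambda_j}\\
&= \sum_{i=1}^{k}\sum_{j=1}^{k}P_iR^TLP_j - \sum_{i=1}^{k}\sum_{j=1}^{k}\frac{\lambda_jP_i(R^TL - L^TR)P_j}{\lambda_i + \lambda_j}\\
&= \left(\sum_{i=1}^{k}P_i\right)R^TL\left(\sum_{j=1}^{k}P_j\right) - \sum_{i=1}^{k}\sum_{j=1}^{k}\frac{P_i(R^TL - L^TR)UP_j}{\lambda_i + \lambda_j}\\
&= R^TL - \sum_{i=1}^{k}\sum_{j=1}^{k}\frac{P_i(R^TL - L^TR)UP_j}{\lambda_i + \lambda_j}\\
&= R^TL - \frac{1}{\det\tilde{U}}\tilde{U}(R^TL - L^TR)\tilde{U}U, && \text{(by Eqn. (1.250))}
\end{align*}

where $\tilde{U} := (\mathrm{tr}\,U)I - U$, and where the fifth line uses $\lambda_jP_j = UP_j$ (no sum on j).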

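\begin{align*}
\frac{\partial R}{\partial F}L &= (I\boxtimes U^{-1})L - (R\boxtimes U^{-1})\left[R^TL - \frac{1}{\det\tilde{U}}\tilde{U}(R^TL - L^TR)\tilde{U}U\right]\\
&= LU^{-1} - RR^TLU^{-1} + \frac{1}{\det\tilde{U}}R\tilde{U}(R^TL - L^TR)\tilde{U}\\
&= \frac{1}{\det\tilde{U}}R\tilde{U}(R^TL - L^TR)\tilde{U}.
\end{align*}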

We now discuss the computation of the gradients of the eigenvalues $\lambda_1$, $\lambda_2$, $\lambda_3$ of a diagonalizable tensor T for the case n = 3, and then specialize the results to the case of a symmetric tensor; a more comprehensive discussion of the derivatives of the eigenvalues of a symmetric tensor for the case n = 3 may be found in [344]. By Eqn. (J.25), T can be expressed as $\sum_{i=1}^k \lambda_iP_i$, where k is the number of distinct eigenvalues, $P_i$ are projections given by Eqn. (J.26), and $\sum_{i=1}^k P_i = I$. Consider the case when all eigenvalues are distinct, i.e., $\lambda_1 \neq \lambda_2 \neq \lambda_3 \neq \lambda_1$. It is known (see, e.g., the remark following (2.2.4) in [37]) that the eigenvalues and associated eigenprojections are differentiable in this case. Using Eqn. (1.60), we can show, for example, that $\mathrm{cof}(T - \lambda_1I) = (T - \lambda_2I)^T(T - \lambda_3I)^T = (\lambda_1 - \lambda_2)(\lambda_1 - \lambda_3)P_1^T$. Since $\det(T - \gamma I - (\lambda_i - \gamma)I) = \det(T - \lambda_iI) = 0$, i = 1, 2, 3 and γ ∈ ℜ, we see that $\lambda_i - \gamma$, i = 1, 2, 3 are the eigenvalues of T − γI, γ ∈ ℜ. Thus, $(\lambda_1 - \lambda_1)$, $(\lambda_2 - \lambda_1)$ and $(\lambda_3 - \lambda_1)$ are the eigenvalues of $(T - \lambda_1I)$, which implies that $(0, 0, (\lambda_2 - \lambda_1)(\lambda_3 - \lambda_1))$ are the eigenvalues of $\mathrm{cof}(T - \lambda_1I)$, which in turn leads to $\mathrm{tr}\,\mathrm{cof}(T - \lambda_1I) = (\lambda_1 - \lambda_2)(\lambda_1 - \lambda_3)$. By differentiating the relation $\det(T - \lambda_iI) = 0$, i = 1, 2, 3, and using the chain rule, we get

\[ \left(\mathbb{I} - \frac{\partial\lambda_i}{\partial T}\otimes I\right)\mathrm{cof}(T - \lambda_iI) = 0, \quad i = 1, 2, 3, \tag{1.251} \]

or, in other words,

\[ \frac{\partial\lambda_1}{\partial T} = \frac{\mathrm{cof}(T - \lambda_1I)}{\mathrm{tr}\,\mathrm{cof}(T - \lambda_1I)} = \frac{(T - \lambda_2I)^T(T - \lambda_3I)^T}{(\lambda_1 - \lambda_2)(\lambda_1 - \lambda_3)} = P_1^T. \tag{1.252} \]
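Equation (1.252) can be illustrated on a non-symmetric but diagonalizable tensor. In the sketch below, T is constructed (as an assumption made purely for the test) to have the distinct real eigenvalues 1, 2 and 4; the projection $P_1$ is evaluated from the product formula, and the gradient of $\lambda_1$ is approximated entry by entry with central differences. The helper `lam1` is hypothetical and simply tracks the eigenvalue closest to $\lambda_1$.

```python
import numpy as np

rng = np.random.default_rng(3)
V = np.eye(3) + 0.3 * rng.standard_normal((3, 3))   # assumed eigenvector basis
lam = np.array([1.0, 2.0, 4.0])                      # assumed distinct eigenvalues
T = V @ np.diag(lam) @ np.linalg.inv(V)
I = np.eye(3)

# P_1 = (T - lam_2 I)(T - lam_3 I) / [(lam_1 - lam_2)(lam_1 - lam_3)]
P1 = (T - lam[1] * I) @ (T - lam[2] * I) / ((lam[0] - lam[1]) * (lam[0] - lam[2]))

def lam1(A):
    """Eigenvalue of A closest to lam_1 (valid for small perturbations of T)."""
    ev = np.linalg.eigvals(A)
    return ev[np.argmin(np.abs(ev - lam[0]))].real

h = 1e-6
grad = np.zeros((3, 3))
for i in range(3):
    for j in range(3):
        E = np.zeros((3, 3))
        E[i, j] = h
        grad[i, j] = (lam1(T + E) - lam1(T - E)) / (2 * h)

err_grad = np.abs(grad - P1.T).max()        # Eqn. (1.252): d(lam_1)/dT = P_1^T
err_proj = np.abs(P1 @ P1 - P1).max()       # P_1 is a projection
print(err_grad, err_proj)
```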

The corresponding results for $\partial\lambda_2/\partial T$ and $\partial\lambda_3/\partial T$ are obtained by permuting the indices 1, 2 and 3. Equation (1.252) can also be directly obtained by differentiating the relation $\lambda_1^3 - I_1\lambda_1^2 + I_2\lambda_1 - I_3 = 0$ with respect to T, and using Eqns. (1.226a)–(1.226c); this shows that the second expression for $\partial\lambda_1/\partial T$ in Eqn. (1.252) is valid even for an arbitrary tensor, provided the denominator is nonzero.

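By differentiating the expression $P_1 = (T - \lambda_2I)(T - \lambda_3I)/[(\lambda_1 - \lambda_2)(\lambda_1 - \lambda_3)]$ using Eqns. (1.170), (1.231) and (1.233), we get

\begin{align*}
\frac{\partial P_1}{\partial T} = \frac{1}{(\lambda_1 - \lambda_2)(\lambda_1 - \lambda_3)}\Big[&T\boxtimes I + I\boxtimes T^T - (\lambda_2 + \lambda_3)\mathbb{I}\\
&+ (\lambda_2 + \lambda_3 - 2\lambda_1)P_1\otimes P_1^T - (\lambda_2 - \lambda_3)\left[P_2\otimes P_2^T - P_3\otimes P_3^T\right]\Big].
\end{align*}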

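By using Eqn. (J.29) and the fact that $I = \sum_{i=1}^3 P_i = \sum_{i=1}^3 P_i^T$, the above relation simplifies to

\[ \frac{\partial P_1}{\partial T} = \frac{P_1\boxtimes P_2^T + P_2\boxtimes P_1^T}{\lambda_1 - \lambda_2} + \frac{P_1\boxtimes P_3^T + P_3\boxtimes P_1^T}{\lambda_1 - \lambda_3}. \tag{1.253} \]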

The gradients of $P_2$ and $P_3$ are obtained by cyclically permuting the indices 1, 2 and 3. The above expressions are valid for an arbitrary tensor with distinct eigenvalues (since such a tensor is diagonalizable).

Now consider the case when two of the eigenvalues are repeated, i.e., $\lambda_1 \neq \lambda_2 = \lambda_3$. Note that now, Eqn. (J.4) yields $P_1 = (T - \lambda_2I)/(\lambda_1 - \lambda_2)$. Since $P_1P_1 = P_1$, and since $\lambda_2 = \lambda_3$, (1.252) also holds. Using the fact that $\lambda_2 + \lambda_3 = I_1 - \lambda_1$, we get

\[ \frac{\partial}{\partial T}(\lambda_2 + \lambda_3) = I - P_1^T = \frac{\lambda_1I - T^T}{\lambda_1 - \lambda_2}. \tag{1.254} \]

The gradient of $P_1$ is obtained by setting $\lambda_2 = \lambda_3$ and using $P_2 + P_3 = I - P_1$ in Eqn. (1.253) as

\[ \frac{\partial P_1}{\partial T} = \frac{P_1\boxtimes(I - P_1)^T + (I - P_1)\boxtimes P_1^T}{\lambda_1 - \lambda_2}. \]

Since $P_2 + P_3 = I - P_1$, we get

\[ \frac{\partial}{\partial T}(P_2 + P_3) = -\frac{\partial P_1}{\partial T}. \]





When all three eigenvalues are repeated, the gradient of the sum of the eigenvalues is I, but the individual gradients are undefined. Since symmetric second-order tensors are diagonalizable, all the above results can be applied to them by setting $P_i = P_i^T$. For a generalization to any underlying space dimension n, see Eqn. (J.27).

The foregoing development can be used to find the gradients of scalar or tensor-valued functions of a diagonalizable tensor $T = \sum_{i=1}^k \lambda_iP_i$, where $(\lambda_i, P_i)$, i = 1, 2, . . . , k, are the distinct eigenvalues and corresponding eigenprojections of T (see, e.g., [37], [47], [104], [127], [244], [272], [352] and [355] for various other methods). For example, if $G(T) = \sum_{i=1}^k f(\lambda_i)P_i$ is a tensor-valued function of T, then using the above results, we get ∂G/∂T = f′(λ)𝕀 for the case k = 1, while for k > 1,

\[ \frac{\partial G}{\partial T} = \sum_{i=1}^{k}\frac{\partial f}{\partial\lambda_i}P_i\boxtimes P_i^T + \sum_{i=1}^{k}\sum_{\substack{j=1\\ j\neq i}}^{k}\frac{f(\lambda_i) - f(\lambda_j)}{\lambda_i - \lambda_j}P_i\boxtimes P_j^T. \tag{1.255} \]

The above formula is derived by first considering the case of distinct eigenvalues (i.e., k = n), and then appropriate limits are taken (e.g., $\lambda_3 \to \lambda_2$ in case $\lambda_2 = \lambda_3$) to obtain the results for repeated eigenvalues. By virtue of Eqn. (J.27), the above result holds for any arbitrary underlying space dimension n. In case $G(S) = \sum_{i=1}^k f(\lambda_i)P_i$ is a symmetric-valued function of S ∈ Sym, then the above result reduces to ∂G/∂S = f′(λ)𝕊 for the case k = 1, while for k > 1,

\[ \frac{\partial G}{\partial S} = \mathbb{S}\left[\sum_{i=1}^{k}\frac{\partial f}{\partial\lambda_i}P_i\boxtimes P_i + \sum_{i=1}^{k}\sum_{\substack{j=1\\ j\neq i}}^{k}\frac{f(\lambda_i) - f(\lambda_j)}{\lambda_i - \lambda_j}P_i\boxtimes P_j\right]\mathbb{S}. \tag{1.256} \]
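A quick numerical check of Eqn. (1.256) is sketched below for the illustrative choice $f = \exp$ (so that G(S) is the tensor exponential of S). It assumes that $(P_i\boxtimes P_j)H = P_iHP_j^T$ and restricts the direction H to Sym, in which case the outer symmetrizers act as the identity. The matrix S is built with the assumed distinct eigenvalues 0.5, 1 and 2.

```python
import numpy as np

rng = np.random.default_rng(4)
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))   # orthonormal eigenvectors n_i
lam = np.array([0.5, 1.0, 2.0])                    # assumed distinct eigenvalues
S = Q @ np.diag(lam) @ Q.T
H = rng.standard_normal((3, 3))
H = 0.5 * (H + H.T)                                # symmetric direction

f, df = np.exp, np.exp                             # f and its derivative

def G(A):
    """G(A) = sum_i f(mu_i) n_i (x) n_i for symmetric A."""
    mu, N = np.linalg.eigh(A)
    return (N * f(mu)) @ N.T

# Directional derivative DG(S)[H] from Eqn. (1.256).
P = [np.outer(Q[:, i], Q[:, i]) for i in range(3)]
DG = np.zeros((3, 3))
for i in range(3):
    for j in range(3):
        if i == j:
            coef = df(lam[i])
        else:
            coef = (f(lam[i]) - f(lam[j])) / (lam[i] - lam[j])
        DG += coef * (P[i] @ H @ P[j])

h = 1e-6
fd = (G(S + h * H) - G(S - h * H)) / (2 * h)
err_G = np.abs(fd - DG).max()
print(err_G)
```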

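An application of Eqn. (1.256) can be found towards the end of Section 2.4. We now find the eigenvalue/eigentensors of ∂G/∂S in the above equation. First consider the case when all the eigenvalues $\lambda_i$ of S are distinct. Then, if $n_i$ represent the eigenvectors of S, each $P_i$ can be represented as $n_i\otimes n_i$ (no sum on i). Using Eqns. (1.172) and (1.342), the eigenvalues/eigentensors (the eigentensors have not been normalized) are found to be

\begin{align*}
&\left(\frac{\partial f}{\partial\lambda_1},\; n_1\otimes n_1\right), \quad \left(\frac{\partial f}{\partial\lambda_2},\; n_2\otimes n_2\right), \quad \left(\frac{\partial f}{\partial\lambda_3},\; n_3\otimes n_3\right),\\
&\left(\frac{f(\lambda_1) - f(\lambda_2)}{\lambda_1 - \lambda_2},\; n_1\otimes n_2 + n_2\otimes n_1\right), \quad \left(\frac{f(\lambda_2) - f(\lambda_3)}{\lambda_2 - \lambda_3},\; n_2\otimes n_3 + n_3\otimes n_2\right), \tag{1.257}\\
&\left(\frac{f(\lambda_1) - f(\lambda_3)}{\lambda_1 - \lambda_3},\; n_1\otimes n_3 + n_3\otimes n_1\right), \quad \lambda_1 \neq \lambda_2 \neq \lambda_3 \neq \lambda_1.
\end{align*}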

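Next consider the case when two eigenvalues of S are repeated, say, $\lambda_1 \neq \lambda_2 = \lambda_3$. We now have

\begin{align*}
\frac{\partial G}{\partial S} = \mathbb{S}\Big[&\frac{\partial f}{\partial\lambda_1}P_1\boxtimes P_1 + \frac{\partial f}{\partial\lambda_2}(I - P_1)\boxtimes(I - P_1)\\
&+ \frac{f(\lambda_1) - f(\lambda_2)}{\lambda_1 - \lambda_2}\left[P_1\boxtimes(I - P_1) + (I - P_1)\boxtimes P_1\right]\Big]\mathbb{S}.
\end{align*}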





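The distinct eigenvalue/eigentensor pairs of ∂G/∂S in this case are

\begin{align*}
&\left(\frac{\partial f}{\partial\lambda_1},\; n_1\otimes n_1\right), \quad \left(\frac{\partial f}{\partial\lambda_2},\; I - n_1\otimes n_1\right),\\
&\left(\frac{f(\lambda_1) - f(\lambda_2)}{\lambda_1 - \lambda_2},\; n_1\otimes[n - (n\cdot n_1)n_1] + [n - (n\cdot n_1)n_1]\otimes n_1\right), \quad \lambda_1 \neq \lambda_2 = \lambda_3, \tag{1.258}
\end{align*}

where n is such that $n\times n_1 \neq 0$, but is otherwise arbitrary. Finally, for the case when all eigenvalues are repeated, i.e., $\lambda_1 = \lambda_2 = \lambda_3 \equiv \lambda$, the distinct eigenvalue/eigentensor pair is

\[ \left(\frac{\partial f}{\partial\lambda},\; A\right), \quad \lambda_1 = \lambda_2 = \lambda_3 \equiv \lambda, \tag{1.259} \]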

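where A ∈ Sym.

The derivatives of the eigenvectors $e_i^*$ of S ∈ Sym are given by [244]

\begin{align*}
\frac{\partial e_1^*}{\partial S} &= \frac{e_2^*\otimes e_1^*\otimes e_2^* + e_2^*\otimes e_2^*\otimes e_1^*}{2(\lambda_1 - \lambda_2)} + \frac{e_3^*\otimes e_1^*\otimes e_3^* + e_3^*\otimes e_3^*\otimes e_1^*}{2(\lambda_1 - \lambda_3)},\\
\frac{\partial e_2^*}{\partial S} &= \frac{e_1^*\otimes e_1^*\otimes e_2^* + e_1^*\otimes e_2^*\otimes e_1^*}{2(\lambda_2 - \lambda_1)} + \frac{e_3^*\otimes e_2^*\otimes e_3^* + e_3^*\otimes e_3^*\otimes e_2^*}{2(\lambda_2 - \lambda_3)},\\
\frac{\partial e_3^*}{\partial S} &= \frac{e_1^*\otimes e_1^*\otimes e_3^* + e_1^*\otimes e_3^*\otimes e_1^*}{2(\lambda_3 - \lambda_1)} + \frac{e_2^*\otimes e_2^*\otimes e_3^* + e_2^*\otimes e_3^*\otimes e_2^*}{2(\lambda_3 - \lambda_2)}.
\end{align*}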

Similar to the gradient of a scalar field, we define the gradient of a vector field v as

\[ \nabla v(x)u := Dv(x)[u]. \tag{1.260} \]

The terminology ‘gradient’ used for the above quantity is misleading since the gradient was defined in terms of the inner product, whereas the above quantity is not [58]. The quantity ∇v is in reality simply the matrix representation of the Fréchet derivative of v, and hence a second-order tensor. Taking $u = e_j$, and taking the dot product of the above equation with $e_i$, we get the components of ∇v as

\[ (\nabla v)_{ij} = \frac{\partial v_i}{\partial x_j}. \]

Thus, we have

\[ \nabla v = \frac{\partial v_i}{\partial x_j}e_i\otimes e_j. \]

The scalar field

\[ \nabla\cdot v := \mathrm{tr}\,\nabla v = \frac{\partial v_i}{\partial x_j}\mathrm{tr}\,(e_i\otimes e_j) = \frac{\partial v_i}{\partial x_j}e_i\cdot e_j = \frac{\partial v_i}{\partial x_j}\delta_{ij} = \frac{\partial v_i}{\partial x_i}, \tag{1.261} \]

is called the divergence of v.





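The gradient of a second-order tensor T is a third-order tensor defined in a way similar to the gradient of a vector field as

\[ \nabla T(x)u := DT(x)[u]. \tag{1.262} \]

∇T can be written in terms of its components as

\[ \nabla T = \frac{\partial T_{ij}}{\partial x_k}e_i\otimes e_j\otimes e_k. \]

The divergence of a second-order tensor T, denoted as ∇ · T, is defined as

\[ (\nabla\cdot T)\cdot u := \nabla\cdot(T^Tu) \quad \forall\ \text{constant } u \in V, \tag{1.263} \]

leading to the component form

\[ \nabla\cdot T = \frac{\partial T_{ij}}{\partial x_j}e_i. \]

The curl of a vector v, denoted as ∇ × v, is defined by

\[ (\nabla\times v)\times u := [\nabla v - (\nabla v)^T]u \quad \forall u \in V. \tag{1.264} \]

Thus, ∇ × v is the axial vector corresponding to the skew tensor $[\nabla v - (\nabla v)^T]$. In component form, we have

\[ \nabla\times v = \epsilon_{ijk}(\nabla v)_{kj}e_i = \epsilon_{ijk}\frac{\partial v_k}{\partial x_j}e_i = \frac{\partial v_k}{\partial x_j}e_j\times e_k. \]

The curl of a tensor T, denoted by ∇ × T, is defined by

\[ (\nabla\times T)u := \nabla\times(T^Tu) \quad \forall\ \text{constant } u \in V. \tag{1.265} \]

In component form,

\[ \nabla\times T = \epsilon_{irs}\frac{\partial T_{js}}{\partial x_r}e_i\otimes e_j = \frac{\partial T_{js}}{\partial x_r}(e_r\times e_s)\otimes e_j. \tag{1.266} \]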

The Laplacian of a scalar function φ(x) is defined by

\[ \nabla^2\phi := \nabla\cdot(\nabla\phi). \tag{1.267} \]

In component form, the Laplacian is given by

\[ \nabla^2\phi = \frac{\partial^2\phi}{\partial x_i\partial x_i}. \]

If ∇²φ = 0, then φ is said to be harmonic.

The Laplacian of a tensor function T(x), denoted by ∇²T, is defined by

\[ (\nabla^2T) : H := \nabla^2(T : H) \quad \forall\ \text{constant } H \in \text{Lin}. \tag{1.268} \]

In component form,

\[ (\nabla^2T)_{ij} = \frac{\partial^2T_{ij}}{\partial x_k\partial x_k}. \]

In the remaining chapters, we shall assume that a function is sufficiently smooth (without explicitly stating what these smoothness properties are), so that all the equations presented make sense. For example, in the equation

\[ \nabla\cdot\tau + \rho b = 0, \]

if we are given that ρ and b are continuous functions of the position vector x, then we implicitly assume τ to be continuously differentiable.

1.9.5 Examples

Although it is possible to derive tensor identities involving differentiation using the above definitions of the operators, the proofs can be quite cumbersome, and hence we prefer to use indicial notation instead. In what follows, u and v are vector fields, and $\nabla \equiv \frac{\partial}{\partial x_i}e_i$ (this is to be interpreted as the ‘del’ operator acting on a scalar, vector or tensor-valued field, e.g., $\nabla\phi = \frac{\partial\phi}{\partial x_i}e_i$):

1. Show that

\[ \nabla\times\nabla\phi = 0. \tag{1.269} \]

2. Show that

\[ \frac{1}{2}\nabla(u\cdot u) = (\nabla u)^Tu. \tag{1.270} \]

3. Show that $\nabla\cdot[(\nabla u)v] = (\nabla u)^T : \nabla v + v\cdot[\nabla(\nabla\cdot u)]$.

4. Show that

\begin{align*}
\nabla\cdot(\nabla u)^T &= \nabla(\nabla\cdot u), \tag{1.271a}\\
\nabla^2u := \nabla\cdot(\nabla u) &= \nabla(\nabla\cdot u) - \nabla\times(\nabla\times u). \tag{1.271b}
\end{align*}

From Eqns. (1.271a) and (1.271b), it follows that

\[ \nabla\cdot[(\nabla u) - (\nabla u)^T] = -\nabla\times(\nabla\times u). \]

From Eqn. (1.271b), it follows that if ∇ · u = 0 and ∇ × u = 0, then ∇²u = 0, i.e., u is harmonic.

5. Show that $\nabla\cdot(u\times v) = v\cdot(\nabla\times u) - u\cdot(\nabla\times v)$.

6. Let W ∈ Skw, and let w be its axial vector. Then show that

\[ \nabla\cdot W = -\nabla\times w, \qquad \nabla\times W = (\nabla\cdot w)I - \nabla w. \tag{1.272} \]





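Solution:

1. Consider the ith component of the left-hand side:

\begin{align*}
(\nabla\times\nabla\phi)_i &= \epsilon_{ijk}\frac{\partial^2\phi}{\partial x_j\partial x_k}\\
&= \epsilon_{ikj}\frac{\partial^2\phi}{\partial x_k\partial x_j} && \text{(interchanging j and k)}\\
&= -\epsilon_{ijk}\frac{\partial^2\phi}{\partial x_j\partial x_k},
\end{align*}

which implies that ∇ × ∇φ = 0.

2. We have

\[ \frac{1}{2}\nabla(u\cdot u) = \frac{1}{2}\frac{\partial(u_iu_i)}{\partial x_j}e_j = u_i\frac{\partial u_i}{\partial x_j}e_j = (\nabla u)^Tu. \]

3. We have

\begin{align*}
\nabla\cdot[(\nabla u)v] &= \nabla_j((\nabla u)v)_j\\
&= \frac{\partial}{\partial x_j}\left(\frac{\partial u_j}{\partial x_i}v_i\right)\\
&= \frac{\partial v_i}{\partial x_j}\frac{\partial u_j}{\partial x_i} + v_i\frac{\partial^2u_j}{\partial x_i\partial x_j}\\
&= (\nabla u)^T : \nabla v + v\cdot\nabla(\nabla\cdot u).
\end{align*}

4. The first identity is proved as follows:

\[ [\nabla\cdot(\nabla u)^T]_i = \frac{\partial}{\partial x_j}\left(\frac{\partial u_j}{\partial x_i}\right) = \frac{\partial}{\partial x_i}\left(\frac{\partial u_j}{\partial x_j}\right) = [\nabla(\nabla\cdot u)]_i. \]

To prove the second identity, consider the last term

\begin{align*}
\nabla\times(\nabla\times u) &= \epsilon_{ijk}\nabla_j(\nabla\times u)_ke_i\\
&= \epsilon_{ijk}\nabla_j(\epsilon_{kmn}\nabla_mu_n)e_i\\
&= \epsilon_{ijk}\epsilon_{mnk}\frac{\partial^2u_n}{\partial x_j\partial x_m}e_i\\
&= (\delta_{im}\delta_{jn} - \delta_{in}\delta_{jm})\frac{\partial^2u_n}{\partial x_j\partial x_m}e_i\\
&= \left[\frac{\partial^2u_j}{\partial x_i\partial x_j} - \frac{\partial^2u_i}{\partial x_j\partial x_j}\right]e_i\\
&= \nabla(\nabla\cdot u) - \nabla\cdot(\nabla u).
\end{align*}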





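5. We have

\begin{align*}
\nabla\cdot(u\times v) &= \frac{\partial(u\times v)_i}{\partial x_i}\\
&= \epsilon_{ijk}\frac{\partial(u_jv_k)}{\partial x_i}\\
&= \epsilon_{ijk}v_k\frac{\partial u_j}{\partial x_i} + \epsilon_{ijk}u_j\frac{\partial v_k}{\partial x_i}\\
&= \epsilon_{kij}v_k\frac{\partial u_j}{\partial x_i} - \epsilon_{jik}u_j\frac{\partial v_k}{\partial x_i}\\
&= v\cdot(\nabla\times u) - u\cdot(\nabla\times v).
\end{align*}

6. Using the relation $W_{ij} = -\epsilon_{ijk}w_k$, we have

\begin{align*}
(\nabla\cdot W) &= -\epsilon_{ijk}\frac{\partial w_k}{\partial x_j}e_i\\
&= -\nabla\times w,\\
(\nabla\times W)_{ij} &= \epsilon_{imn}\frac{\partial W_{jn}}{\partial x_m}\\
&= -\epsilon_{imn}\epsilon_{jnr}\frac{\partial w_r}{\partial x_m}\\
&= (\delta_{ij}\delta_{mr} - \delta_{ir}\delta_{mj})\frac{\partial w_r}{\partial x_m}\\
&= \frac{\partial w_r}{\partial x_r}\delta_{ij} - \frac{\partial w_i}{\partial x_j},
\end{align*}

which is the indicial version of Eqn. (1.272).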
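Identities such as Eqn. (1.270) and the one in Example 5 can also be sanity-checked numerically at a point. The sketch below is illustrative only: the fields u and v, the evaluation point and the step size are arbitrary assumptions, and the hypothetical helpers `grad_s`, `grad_v`, `div` and `curl` approximate the operators by central differences using the component forms $(\nabla v)_{ij} = \partial v_i/\partial x_j$ and $(\nabla\times v)_i = \epsilon_{ijk}\,\partial v_k/\partial x_j$.

```python
import numpy as np

H = 1e-5  # finite-difference step

def grad_s(phi, x):
    """Gradient of a scalar field: (grad phi)_j = d phi / d x_j."""
    g = np.zeros(3)
    for j in range(3):
        e = np.zeros(3)
        e[j] = H
        g[j] = (phi(x + e) - phi(x - e)) / (2 * H)
    return g

def grad_v(v, x):
    """Gradient of a vector field: (grad v)_ij = d v_i / d x_j."""
    J = np.zeros((3, 3))
    for j in range(3):
        e = np.zeros(3)
        e[j] = H
        J[:, j] = (v(x + e) - v(x - e)) / (2 * H)
    return J

def div(v, x):
    return np.trace(grad_v(v, x))

def curl(v, x):
    J = grad_v(v, x)
    return np.array([J[2, 1] - J[1, 2], J[0, 2] - J[2, 0], J[1, 0] - J[0, 1]])

# Arbitrary smooth test fields (assumptions for the check).
u = lambda x: np.array([x[0] * x[1], np.sin(x[2]), x[0] ** 2 * x[2]])
v = lambda x: np.array([np.cos(x[1]), x[0] * x[2], x[1] ** 3])
x0 = np.array([0.4, -0.7, 1.1])

# Example 5: div(u x v) = v . curl(u) - u . curl(v)
lhs5 = div(lambda y: np.cross(u(y), v(y)), x0)
rhs5 = v(x0) @ curl(u, x0) - u(x0) @ curl(v, x0)

# Example 2, Eqn. (1.270): (1/2) grad(u . u) = (grad u)^T u
lhs2 = 0.5 * grad_s(lambda y: u(y) @ u(y), x0)
rhs2 = grad_v(u, x0).T @ u(x0)

print(abs(lhs5 - rhs5), np.abs(lhs2 - rhs2).max())
```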

1.10 The Exponential and Logarithmic Functions

The exponential of a tensor T(t) can be defined either in terms of its series representation as

\[ e^{T(t)} := I + T(t) + \frac{1}{2!}[T(t)]^2 + \cdots, \tag{1.273} \]

or in terms of a solution of the initial value problem

\begin{align*}
\dot{X}(\xi) &= T(t)X(\xi) = X(\xi)T(t), \quad \xi > 0, \tag{1.274}\\
X(0) &= I, \tag{1.275}
\end{align*}

for the tensor function X(ξ), where the parameter ξ is independent of t. Note that the superposed dot in the above equation denotes differentiation with respect to ξ. The existence theorem for linear differential equations tells us that this problem has exactly one solution X : [0, ∞) → Lin, which we write in the form

\[ X(\xi) = e^{T(t)\xi}. \]





From Eqn. (1.273), it is immediately evident that

\[ e^{[T(t)]^T} = (e^{T(t)})^T, \tag{1.276} \]

and that if A ∈ Lin is invertible, then $e^{(A^{-1}BA)} = A^{-1}e^BA$ for all B ∈ Lin.

Theorem 1.10.1. For each t ≥ 0, $e^{T(t)}$ belongs to Lin⁺, and

\[ \det(e^{T(t)}) = e^{\mathrm{tr}\,T(t)}. \tag{1.277} \]

Proof. If $(\lambda_i, n)$ is an eigenvalue/eigenvector pair of T(t), then from Eqn. (1.273), it follows that $(e^{\lambda_i(t)}, n)$ is an eigenvalue/eigenvector pair of $e^{T(t)}$. Hence, the determinant of $e^{T(t)}$, which is just the product of the eigenvalues, is given by

\[ \det(e^{T(t)}) = \prod_{i=1}^{n}e^{\lambda_i(t)} = e^{\sum_{i=1}^{n}\lambda_i(t)} = e^{\mathrm{tr}\,T(t)}. \]

Since $e^{\mathrm{tr}\,T(t)} > 0$ for all t, $e^{T(t)} \in$ Lin⁺.

From Eqn. (1.277), it directly follows that

\[ \det(e^Ae^B) = \det(e^A)\det(e^B) = e^{\mathrm{tr}\,A}e^{\mathrm{tr}\,B} = e^{\mathrm{tr}\,(A+B)} = \det(e^{A+B}). \]
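The series definition (1.273) and the properties (1.276) and (1.277) can be illustrated with a few lines of code. The sketch below is a minimal one, assuming a truncated partial sum of the series is adequate for a matrix of moderate norm; the helper name `expm_series` and the number of terms are illustrative choices, not part of the text.

```python
import numpy as np

def expm_series(T, terms=40):
    """Partial sum I + T + T^2/2! + ... of the series in Eqn. (1.273)."""
    n = T.shape[0]
    result = np.eye(n)
    term = np.eye(n)
    for k in range(1, terms):
        term = term @ T / k        # term now holds T^k / k!
        result = result + term
    return result

rng = np.random.default_rng(5)
T = 0.5 * rng.standard_normal((3, 3))
A = np.eye(3) + 0.2 * rng.standard_normal((3, 3))   # invertible similarity
Ainv = np.linalg.inv(A)

E = expm_series(T)
err_det = abs(np.linalg.det(E) - np.exp(np.trace(T)))              # Eqn. (1.277)
err_transpose = np.abs(expm_series(T.T) - E.T).max()                # Eqn. (1.276)
err_similarity = np.abs(expm_series(Ainv @ T @ A) - Ainv @ E @ A).max()

print(err_det, err_transpose, err_similarity)
```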

Theorem 1.10.2. Let A(t), B(t) ∈ Lin. If

\[ e^{[A(t)+B(t)]\xi} = e^{A(t)\xi}e^{B(t)\xi} \quad \forall\xi \tag{1.278} \]

then A(t)B(t) = B(t)A(t). Conversely, if A(t)B(t) = B(t)A(t), then

\[ e^{[A(t)+B(t)]\xi} = e^{A(t)\xi}e^{B(t)\xi} = e^{B(t)\xi}e^{A(t)\xi} \quad \forall\xi \tag{1.279} \]

Proof. We shall write A(t) and B(t) simply as A and B for notational convenience. Let $X_A(\xi) := e^{A\xi}$, $X_B(\xi) := e^{B\xi}$, and $X_{A+B} := e^{(A+B)\xi}$. Then, we have

\begin{align*}
\dot{X}_{A+B} &= (A + B)X_{A+B}, \tag{1.280}\\
\dot{\overline{X_AX_B}} &= \dot{X}_AX_B + X_A\dot{X}_B\\
&= AX_AX_B + X_ABX_B, \tag{1.281}
\end{align*}

where, as before, the superposed dot denotes differentiation with respect to ξ. Let $Z := X_{A+B} - X_AX_B$. From Eqns. (1.280) and (1.281), we get

\[ \dot{Z} = (A + B)X_{A+B} - AX_AX_B - X_ABX_B. \tag{1.282} \]

If AB = BA, then $X_AB = BX_A$, so that Eqn. (1.282) becomes

\[ \dot{Z} = (A + B)\left[X_{A+B} - X_AX_B\right] = (A + B)Z. \tag{1.283} \]





The above differential equation is to be solved subject to the initial condition Z(0) = 0. The general solution of the differential equation is $Z = e^{(A+B)\xi}C$, where the tensor C is a constant tensor. On using the initial condition, we get C = 0, from which it follows that Z = 0, thus proving the first equality in Eqn. (1.279). In an analogous fashion, by taking $Z := X_{A+B} - X_BX_A$, one can also show that if AB = BA, then $e^{(A+B)\xi} = e^{B\xi}e^{A\xi}$. Thus, Eqn. (1.279) follows.

Conversely, if Eqn. (1.278) holds, i.e., if $X_{A+B} = X_AX_B$, then Eqn. (1.282) reduces to

\[ BX_AX_B = X_ABX_B \quad \forall\xi. \]

By Theorem 1.10.1, $X_B \in$ Lin⁺. Hence multiplying the above equation by $X_B^{-1}$, we get $BX_A = X_AB$ for all ξ. Differentiating this relation with respect to ξ, and evaluating at ξ = 0, we get BA = AB.

As a corollary of the above theorem, it follows, by taking ξ = 1 in Eqn. (1.279), that

\[ A(t)B(t) = B(t)A(t) \implies e^{A(t)+B(t)} = e^{A(t)}e^{B(t)} = e^{B(t)}e^{A(t)}. \tag{1.284} \]

However, the converse of the above statement may not be true. Indeed, if AB ≠ BA, one can have $e^{A+B} = e^A = e^B = e^Ae^B$, or $e^Ae^B = e^{A+B} \neq e^Be^A$ or even $e^Ae^B = e^Be^A \neq e^{A+B}$ as the following examples show:

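1. Wood [354] presents the following example:

\[ A = 2\pi\begin{bmatrix} 0 & 0 & \frac{\sqrt{3}}{2}\\ 0 & 0 & -\frac{1}{2}\\ -\frac{\sqrt{3}}{2} & \frac{1}{2} & 0 \end{bmatrix}, \qquad B = 2\pi\begin{bmatrix} 0 & 0 & -\frac{\sqrt{3}}{2}\\ 0 & 0 & -\frac{1}{2}\\ \frac{\sqrt{3}}{2} & \frac{1}{2} & 0 \end{bmatrix}, \]

so that $e^{A+B} = e^A = e^B = e^Ae^B = I$. The eigenvalues of A and B are (0, 2πi, −2πi). Another example presented by Horn and Johnson [136] is

\[ A = \begin{bmatrix} 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0\\ 0 & 0 & 0 & -2\pi\\ 0 & 0 & 2\pi & 0 \end{bmatrix}, \qquad B = \begin{bmatrix} 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1\\ 0 & 0 & 0 & -2\pi\\ 0 & 0 & 2\pi & 0 \end{bmatrix}, \]

for which, again, one has $e^{A+B} = e^A = e^B = e^Ae^B = I$. The eigenvalues of A and B are (0, 0, 2πi, −2πi).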

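2. Wermuth [350] presents the following example where $e^Ae^B = e^{A+B} \neq e^Be^A$: if z = a + ib is a solution of $e^z - z = 1$, e.g., a = 2.088843 . . . and b = 7.461489 . . ., then for

\[ A = \begin{bmatrix} 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0\\ 0 & 0 & a & -b\\ 0 & 0 & b & a \end{bmatrix}, \qquad B = \begin{bmatrix} 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1\\ 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 \end{bmatrix}, \]

we have

\[ e^A = \begin{bmatrix} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & a+1 & -b\\ 0 & 0 & b & a+1 \end{bmatrix}, \qquad e^B = \begin{bmatrix} 1 & 0 & 1 & 0\\ 0 & 1 & 0 & 1\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{bmatrix}, \tag{1.285} \]

so that

\[ e^Ae^B = e^{A+B} = \begin{bmatrix} 1 & 0 & 1 & 0\\ 0 & 1 & 0 & 1\\ 0 & 0 & a+1 & -b\\ 0 & 0 & b & a+1 \end{bmatrix}, \qquad e^Be^A = \begin{bmatrix} 1 & 0 & a+1 & -b\\ 0 & 1 & b & a+1\\ 0 & 0 & a+1 & -b\\ 0 & 0 & b & a+1 \end{bmatrix}. \]

3. An example for the case $e^Ae^B = e^Be^A \neq e^{A+B}$, again provided in [136], is

\[ A = \begin{bmatrix} 0 & -\pi & 0 & 0\\ \pi & 0 & 0 & 0\\ 0 & 0 & 0 & \pi\\ 0 & 0 & -\pi & 0 \end{bmatrix}, \qquad B = \begin{bmatrix} 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1\\ 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 \end{bmatrix}, \]

which yields

\[ e^A = e^{A+B} = -I, \qquad e^B = \begin{bmatrix} 1 & 0 & 1 & 0\\ 0 & 1 & 0 & 1\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{bmatrix}, \]

so that $e^Ae^B = e^Be^A \neq e^{A+B}$.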
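Two of the counterexamples above (Wood's example from item 1 and the example in item 3) can be reproduced numerically. The sketch below uses a hypothetical helper `expm` that applies the series (1.273) to a scaled-down matrix and then squares the result repeatedly; this scaling-and-squaring step is a numerical convenience introduced here for accuracy with entries of size 2π, not a method taken from the text.

```python
import numpy as np

def expm(M, squarings=8, terms=25):
    """Series (1.273) applied to M / 2^s, followed by s repeated squarings."""
    n = M.shape[0]
    X = M / 2.0 ** squarings
    E, term = np.eye(n), np.eye(n)
    for k in range(1, terms):
        term = term @ X / k
        E = E + term
    for _ in range(squarings):
        E = E @ E
    return E

close = lambda X, Y: np.allclose(X, Y, atol=1e-9)

# Item 1 (Wood): AB != BA, yet e^{A+B} = e^A = e^B = e^A e^B = I.
s = np.sqrt(3.0) / 2.0
A1 = 2 * np.pi * np.array([[0, 0, s], [0, 0, -0.5], [-s, 0.5, 0]])
B1 = 2 * np.pi * np.array([[0, 0, -s], [0, 0, -0.5], [s, 0.5, 0]])
I3 = np.eye(3)
wood_commute = close(A1 @ B1, B1 @ A1)
wood_all_identity = (close(expm(A1), I3) and close(expm(B1), I3)
                     and close(expm(A1 + B1), I3))

# Item 3: e^A e^B = e^B e^A, but both differ from e^{A+B} = -I.
p = np.pi
A3 = np.array([[0, -p, 0, 0], [p, 0, 0, 0], [0, 0, 0, p], [0, 0, -p, 0]])
B3 = np.array([[0, 0, 1, 0], [0, 0, 0, 1], [0, 0, 0, 0], [0, 0, 0, 0]], dtype=float)
eA, eB, eAB = expm(A3), expm(B3), expm(A3 + B3)
ex3_products_equal = close(eA @ eB, eB @ eA)
ex3_sum_is_minus_identity = close(eAB, -np.eye(4))
ex3_product_differs = not close(eA @ eB, eAB)

print(wood_commute, wood_all_identity,
      ex3_products_equal, ex3_sum_is_minus_identity, ex3_product_differs)
```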

As an application of Eqn. (1.284), since T(t) and −T(t) commute, we have $e^{T(t)-T(t)} = I = e^{T(t)}e^{-T(t)}$. Thus,

\[ (e^{T(t)})^{-1} = e^{-T(t)}. \tag{1.286} \]

In fact, one can extend this result to get

\[ (e^{T(t)})^n = e^{nT(t)} \quad \forall\ \text{integer } n. \]

As an application of Eqn. (1.286), consider the solution of the tensorial differential equation

\[ \dot{X} + AX + XB^T = F(t), \tag{1.287} \]

subject to the initial condition X(0) = C. The differentiation is with respect to t, and A and B are assumed to be independent of t. Multiplying Eqn. (1.287) by $e^{At}\boxtimes e^{Bt}$, we get

\[ (e^{At}\boxtimes e^{Bt})\dot{X} + Ae^{At}Xe^{B^Tt} + e^{At}XB^Te^{B^Tt} = e^{At}F(t)e^{B^Tt}. \]





The above equation can be written as

\[ \frac{d}{dt}\left[\left(e^{At}\boxtimes e^{Bt}\right)X\right] = e^{At}F(t)e^{B^Tt}. \]

Integrating the above equation between the limits [0, t], we get

\[ (e^{At}\boxtimes e^{Bt})X(t) = \int_0^t e^{A\xi}F(\xi)e^{B^T\xi}\,d\xi + C, \]

which, by virtue of Eqn. (1.286), leads to

\begin{align*}
X(t) &= \int_0^t e^{-A(t-\xi)}F(\xi)e^{-B^T(t-\xi)}\,d\xi + e^{-At}Ce^{-B^Tt}\\
&= \int_0^t e^{-A\xi}F(t-\xi)e^{-B^T\xi}\,d\xi + e^{-At}Ce^{-B^Tt}. \tag{1.288}
\end{align*}

What we have effectively shown is that

\[ e^{(A\boxtimes I + I\boxtimes B)t} = e^{At}\boxtimes e^{Bt} \quad \forall t. \]

The result given by Eqn. (1.288) also holds if B = 0, and X, F and C are vectors. This observation allows us to solve the tensorial differential equation

\[ \dot{X} + AXB = F(t), \]

which can be written as

\[ \dot{X} + (A\boxtimes B^T)X = F(t). \]

Using the Ψ mapping of Appendix I, the above equation can be written as

\[ \Psi(\dot{X}) + \Psi(A\boxtimes B^T)\Psi(X) = \Psi(F), \]

whose solution is given by

\[ \Psi[X(t)] = \int_0^t e^{-\Psi(A\boxtimes B^T)\xi}\Psi[F(t-\xi)]\,d\xi + e^{-\Psi(A\boxtimes B^T)t}\Psi(C). \]

For the exponential of a skew-symmetric tensor, we have the following theorem:

Theorem 1.10.3. Let W(t) ∈ Skw for all t. Then $e^{W(t)}$ is a rotation for each t ≥ 0. Conversely, for R(t) ∈ Orth⁺, there exists a W(t) ∈ Skw such that $R(t) = e^{W(t)}$.

Proof. By Eqn. (1.286),

\[ (e^{W(t)})^{-1} = e^{-W(t)} = e^{W^T(t)} = (e^{W(t)})^T, \]

where the last step follows from Eqn. (1.276). Thus, $e^{W(t)}$ is an orthogonal tensor. By Theorem 1.10.1, $\det(e^{W(t)}) = e^{\mathrm{tr}\,W(t)} = e^0 = 1$, and hence $e^{W(t)}$ is a rotation.

The converse is proved simply by taking W(t) = log R(t), which we show to be a skew-symmetric tensor later in this section.

In the three-dimensional case, by using the Cayley–Hamilton theorem, we get $W^3(t) = -|w(t)|^2W(t)$, where w(t) is the axial vector of W(t). Thus, $W^4(t) = -|w(t)|^2W^2(t)$, $W^5(t) = |w(t)|^4W(t)$, and so on. Substituting these terms into the series expansion of the exponential function, and using the representations of sine and cosine functions, we get²

\[ R(t) = e^{W(t)} = I + \frac{\sin(|w(t)|)}{|w(t)|}W(t) + \frac{[1 - \cos(|w(t)|)]}{|w(t)|^2}W^2(t). \tag{1.289} \]

The directional derivative DR(W)[U] can be obtained from the above expression using the fact that $|w|^2 = w\cdot w = W : W/2$, so that $2|w|\,D|w|(W)[U] = W : U$.
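Rodrigues formula (1.289) is easy to compare against the defining series (1.273). The sketch below builds W from an arbitrarily chosen axial vector w using the convention $W_{ij} = -\epsilon_{ijk}w_k$ employed in this chapter (so that Wu = w × u), evaluates the closed form, and checks it against a truncated series as well as against the rotation properties of Theorem 1.10.3. The helper `expm_series` is a hypothetical name.

```python
import numpy as np

def expm_series(T, terms=40):
    """Partial sum of the series in Eqn. (1.273)."""
    n = T.shape[0]
    result, term = np.eye(n), np.eye(n)
    for k in range(1, terms):
        term = term @ T / k
        result = result + term
    return result

w = np.array([0.3, -1.2, 0.7])                     # arbitrary axial vector
W = np.array([[0.0, -w[2], w[1]],
              [w[2], 0.0, -w[0]],
              [-w[1], w[0], 0.0]])                 # W_ij = -eps_ijk w_k
nw = np.linalg.norm(w)

# Closed form, Eqn. (1.289).
R = np.eye(3) + np.sin(nw) / nw * W + (1.0 - np.cos(nw)) / nw**2 * (W @ W)

err_series = np.abs(R - expm_series(W)).max()
err_orth = np.abs(R @ R.T - np.eye(3)).max()
err_cayley = np.abs(W @ W @ W + nw**2 * W).max()   # W^3 = -|w|^2 W
err_norm = abs(nw**2 - 0.5 * np.sum(W * W))        # |w|^2 = W : W / 2

print(err_series, err_orth, np.linalg.det(R), err_cayley, err_norm)
```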

Another proof of Eqn. (1.289) is obtained by assuming $R(\xi) \equiv e^{W(t)\xi}$ to be of the form $h_1(\xi)I + h_2(\xi)W + h_3(\xi)W^2$, and then determining the unknown functions $h_1(\xi)$, $h_2(\xi)$, and $h_3(\xi)$ by using the governing equation $\dot{R} = RW$ (where the dot denotes differentiation with respect to ξ) subject to the initial conditions R(0) = I and $\dot{R}(0) = W$. By using the fact that $W^3(t) = -|w(t)|^2W(t)$, and that I, W, W² is a linearly independent set (see Problem 4), we get

\begin{align*}
\dot{h}_1(\xi) &= 0,\\
\dot{h}_2(\xi) &= h_1 - |w(t)|^2h_3(\xi),\\
\dot{h}_3(\xi) &= h_2(\xi),
\end{align*}

which are to be solved subject to the initial conditions

\begin{align*}
h_1(0) &= 1, & \dot{h}_1(0) &= 0,\\
h_2(0) &= 0, & \dot{h}_2(0) &= 1,\\
h_3(0) &= 0, & \dot{h}_3(0) &= 0.
\end{align*}

We get the solution as

\[ h_1(\xi) = 1, \qquad h_2(\xi) = \frac{\sin(|w(t)\xi|)}{|w(t)|}, \qquad h_3(\xi) = \frac{[1 - \cos(|w(t)\xi|)]}{|w(t)|^2}, \]

which, on setting ξ = 1, yields Eqn. (1.289).

² Similarly, in the two-dimensional case, if

\[ W(t) = \begin{bmatrix} 0 & \gamma(t)\\ -\gamma(t) & 0 \end{bmatrix}, \]

where γ is a parameter which is a function of t, then

\[ R(t) = e^{W(t)} = \cos\gamma(t)\,I + \frac{\sin\gamma(t)}{\gamma(t)}W(t) = \begin{bmatrix} \cos\gamma(t) & \sin\gamma(t)\\ -\sin\gamma(t) & \cos\gamma(t) \end{bmatrix}. \]

Let $\bar{\gamma}(t) = \int_0^t \gamma(\xi)\,d\xi$. Then the solution of dz/dt = W(t)z in this two-dimensional case is

\[ z(t) = \begin{bmatrix} \cos\bar{\gamma}(t) & \sin\bar{\gamma}(t)\\ -\sin\bar{\gamma}(t) & \cos\bar{\gamma}(t) \end{bmatrix}z(0). \]

This result is a special case of the result in Theorem 1.10.4.

Yet another proof is obtained by using Eqn. (1.292) with T ≡ W(t), λ = 0, µ = i|w(t)|, ν = −i|w(t)| and ξ = 1 (which implies that the eigenvalues of $e^{W(t)}$ are $(1, e^{i|w(t)|}, e^{-i|w(t)|})$).

Not surprisingly, Eqn. (1.289) has the same form as Eqn. (1.104) with α = |w(t)|. Equation (1.289) is known as Rodrigues formula. If W(t) is a complex-valued skew-symmetric tensor, replace |w(t)| in Eqn. (1.289) by $\sqrt{w(t)\cdot w(t)}$ if it is nonzero; if it is zero, then W is nilpotent, and using the series expansion, we get $e^{W(t)} = I + W(t) + W^2(t)/2$. A generalization of Rodrigues formula to the case when n ≥ 4 was developed by Gallier and Xu [86]; we shall present this formula later in this section (see Eqn. (1.295)), although we use a different method for deriving it than the one that they have used.

Let $a = \sin(|w(t)|)/|w(t)|$, $b = [1 - \cos(|w(t)|)]/|w(t)|^2$ and $c = |w(t)|^2$, and let R(t) be given by Eqn. (1.289). Using the facts $\dot{R}R^T + R\dot{R}^T = 0$, $2b - b^2c - a^2 = 0$, $W\dot{W}W = -\dot{c}W/2$, and $W\dot{W}W^2 - W^2\dot{W}W = 0$ (where now the superposed dot denotes differentiation with respect to t), we get

\begin{align*}
\dot{R}R^T &= \frac{1}{c}(w\cdot\dot{w})(1 - a)W + a\dot{W} + b(W\dot{W} - \dot{W}W), \tag{1.290a}\\
R^T\dot{R} &= \frac{1}{c}(w\cdot\dot{w})(1 - a)W + a\dot{W} - b(W\dot{W} - \dot{W}W). \tag{1.290b}
\end{align*}

The axial vectors of $\dot{R}R^T$ and $R^T\dot{R}$ are, thus, $(w\cdot\dot{w})(1 - a)w/c + a\dot{w} + b(w\times\dot{w})$ and $(w\cdot\dot{w})(1 - a)w/c + a\dot{w} - b(w\times\dot{w})$, respectively. Of course, for the special case when $W(t) = W_0t$, where $W_0$ is independent of t, the above formulae reduce to $\dot{R}R^T = R^T\dot{R} = W_0$. In this connection, one has the following theorem [354]:

Theorem 1.10.4. For T(t) ∈ Lin,

\[ \frac{d(e^{T(t)})}{dt} = \dot{T}e^{T(t)} = e^{T(t)}\dot{T}, \tag{1.291} \]

if and only if $T\dot{T} = \dot{T}T$. Thus, the solution of $\dot{Z} = T(t)Z$ is $Z(t) = \left(e^{\int_0^t T(\xi)\,d\xi}\right)Z(0)$ if and only if $T\dot{T} = \dot{T}T$.

Proof. If $T\dot{T} = \dot{T}T$, then $\dot{\overline{T^2}} = \dot{T}T + T\dot{T} = 2T\dot{T} = 2\dot{T}T$, and so on for derivatives of higher powers of T. Thus, by simply differentiating Eqn. (1.273), we get Eqn. (1.291).

To prove the converse, note from Eqn. (1.273) that $e^TT = Te^T$. Differentiating both sides of this relation, and using the fact that Eqn. (1.291) holds, we get

\[ \dot{T}e^TT = T\dot{T}e^T. \]

Noting again that $e^TT = Te^T$, and multiplying both sides by $(e^T)^{-1} = e^{-T}$, we get the desired result.

By replacing T(t) by $\int_0^t T(\xi)\,d\xi$ in the first relation in Eqn. (1.291), and post-multiplying by Z(0), we get the stated solution of the differential equation $\dot{Z} = T(t)Z$.





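In what follows T(t) is written simply as T.

Explicit formulae for $e^{T\xi}$ (where T can be complex-valued) when n = 3 and n = 4 are given by Cheng and Yau [52] in terms of the eigenvalues and powers of T.

Theorem 1.10.5. For n = 3, we have the following cases:

1. $\lambda_1 = \lambda_2 = \lambda_3 \equiv \lambda$

(a) If q(T) = (T − λI), then

\[ e^{T\xi} = e^{\lambda\xi}I. \]

(b) If q(T) = (T − λI)², then

\[ e^{T\xi} = e^{\lambda\xi}\left[\xi(T - \lambda I) + I\right]. \]

(c) If q(T) = (T − λI)³, then

\[ e^{T\xi} = e^{\lambda\xi}\left[\frac{\xi^2}{2}(T - \lambda I)^2 + \xi(T - \lambda I) + I\right]. \]

2. $\lambda_1 \equiv \mu \neq \lambda_2 = \lambda_3 \equiv \lambda$

(a) If q(T) = (T − λI)(T − µI), then

\[ e^{T\xi} = \frac{e^{\lambda\xi} - e^{\mu\xi}}{\lambda - \mu}T + \frac{\lambda e^{\mu\xi} - \mu e^{\lambda\xi}}{\lambda - \mu}I. \]

(b) If q(T) = (T − µI)(T − λI)², then

\[ e^{T\xi} = \left[\frac{\xi e^{\lambda\xi}}{\lambda - \mu} - \frac{e^{\lambda\xi} - e^{\mu\xi}}{(\lambda - \mu)^2}\right](T - \lambda I)^2 + e^{\lambda\xi}\left[\xi(T - \lambda I) + I\right]. \]

3. $\lambda_1 \equiv \lambda \neq \lambda_2 \equiv \mu \neq \lambda_3 \equiv \nu \neq \lambda_1$. We have q(T) = (T − λI)(T − µI)(T − νI), and

\[ e^{T\xi} = e^{\lambda\xi}\frac{(T - \mu I)(T - \nu I)}{(\lambda - \mu)(\lambda - \nu)} + e^{\mu\xi}\frac{(T - \lambda I)(T - \nu I)}{(\mu - \lambda)(\mu - \nu)} + e^{\nu\xi}\frac{(T - \lambda I)(T - \mu I)}{(\nu - \lambda)(\nu - \mu)}. \tag{1.292} \]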

We now present two methods for the explicit determination of the exponential function for arbitrary n (for a detailed survey, see [225]). The first method is based on the Jordan canonical form given by Eqn. (J.37), while the second method is based on the definition given by Eqn. (1.274). The first method yields the formula

\[ e^{T\xi} = \sum_{i=1}^{k}\left[e^{\lambda_i\xi}\left(P_i + \sum_{j=1}^{m_i-1}\frac{\xi^j}{j!}N_i^j\right)\right], \tag{1.293} \]





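where $m_i$ is as in Eqn. (J.24). To prove this result, we first note that $P_i$ and $N_i$ commute since they are polynomials in T. Using Eqns. (J.37) and (1.279), and the properties $P_iP_j = \delta_{ij}P_i$ (no sum on i), $N_iN_j = P_iN_j = 0$ for i ≠ j, $P_iN_i = N_i$ (no sum on i), and $N_i^j = 0$, $j \geq m_i$, we have

\begin{align*}
e^{T\xi} &= \left(e^{\sum_{i=1}^{k}\lambda_iP_i\xi}\right)\left(e^{\sum_{i=1}^{k}N_i\xi}\right)\\
&= \left[\sum_{i=1}^{k}e^{\lambda_i\xi}P_i\right]\left[I + \sum_{i=1}^{k}\sum_{j=1}^{m_i-1}\frac{\xi^j}{j!}N_i^j\right]\\
&= \sum_{i=1}^{k}\left[e^{\lambda_i\xi}\left(P_i + \sum_{j=1}^{m_i-1}\frac{\xi^j}{j!}N_i^j\right)\right].
\end{align*}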

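As an example, if T is given by Eqn. (J.39), we get

$$e^{T} = e^{\alpha}P_1 + e^{\beta}(P_2 + N_2) = \begin{bmatrix} e^{\alpha} & \dfrac{e^{\alpha} - e^{\beta}}{\alpha - \beta} & \dfrac{e^{\alpha} + (\beta - \alpha - 1)e^{\beta}}{(\alpha - \beta)^2} \\ 0 & e^{\beta} & e^{\beta} \\ 0 & 0 & e^{\beta} \end{bmatrix}.$$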
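As a quick numerical check of the closed form above, the following sketch compares it with a truncated Taylor series of the exponential. It assumes that the tensor of Eqn. (J.39), which is not reproduced in this section, has the upper triangular matrix [[α, 1, 0], [0, β, 1], [0, 0, β]]; the helper names and the sample values of α and β are illustrative only.

```python
import math

def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm_series(T, terms=40):
    """Truncated Taylor series I + T + T^2/2! + ... (adequate for small norms)."""
    n = len(T)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for m in range(1, terms):
        # term now holds T^m / m!
        term = [[x / m for x in row] for row in matmul(term, T)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def exp_closed_form(alpha, beta):
    """Closed form quoted in the text, for the assumed matrix of Eqn. (J.39)."""
    ea, eb = math.exp(alpha), math.exp(beta)
    d = alpha - beta
    return [[ea, (ea - eb) / d, (ea + (beta - alpha - 1.0) * eb) / d**2],
            [0.0, eb, eb],
            [0.0, 0.0, eb]]

alpha, beta = 0.7, -0.3   # illustrative sample values with alpha != beta
T = [[alpha, 1.0, 0.0],
     [0.0, beta, 1.0],
     [0.0, 0.0, beta]]    # assumed form of Eqn. (J.39)
series = expm_series(T)
closed = exp_closed_form(alpha, beta)
max_diff = max(abs(series[i][j] - closed[i][j]) for i in range(3) for j in range(3))
print(max_diff)           # should be at round-off level
```

The series is used here only as an independent reference; for matrices of large norm a scaling-and-squaring routine would be a more robust choice.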

The exponential tensor $e^{T}$ for a diagonalizable (in particular, symmetric) tensor T is obtained either as a special case of Eqn. (1.293), or by substituting Eqn. (J.25) into Eqn. (1.273), and is given by

$$e^{T\xi} = \sum_{i=1}^{k} e^{\lambda_i\xi}P_i, \tag{1.294}$$

with $P_i$ given by Eqn. (J.26). The method proposed by Putzer [259] essentially yields the same result as that given by Eqn. (1.293), although more indirectly.

Writing $\lambda_m$ as $(\lambda_r)_m + i(\lambda_s)_m$, we can write Eqn. (1.293) as

$$e^{T} = \sum_{m=1}^{k} e^{(\lambda_r)_m}\Big[\sin((\lambda_s)_m)\,B_m - \cos((\lambda_s)_m)\,B_m^2 + \left[\cos((\lambda_s)_m) + i\sin((\lambda_s)_m)\right]G_m\Big],$$

where $B_m = iP_m$ and $G_m = \sum_{j=1}^{n_m-1}\frac{1}{j!}N_m^j$. Note that $B_m^3 = -iP_m = -B_m$. The above form allows us to write a Rodrigues-type formula for any dimension n when T is a skew-symmetric tensor [86]. As discussed earlier, the eigenvalues of a real-valued W ∈ Skw are either zero or of the form $\pm i\lambda_m$, m = 1, 2, . . . , k, where k is the number of distinct nonzero eigenvalues. Also, since W is normal, it is diagonalizable. Let $B_m := i[P_m^{(1)} - P_m^{(2)}]$, where $P_m^{(1)}$ and $P_m^{(2)}$ are the projections associated with the eigenvalues $i\lambda_m$ and $-i\lambda_m$, respectively. Since $B_m^2 = -[P_m^{(1)} + P_m^{(2)}]$, it follows that $I = -\sum_{m=1}^{k} B_m^2$. From Eqn. (1.294), we get

$$e^{W} = I + \sum_{i=1}^{k}\left[\sin(\lambda_i)\,B_i + [1 - \cos(\lambda_i)]\,B_i^2\right], \tag{1.295}$$





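where $W = \sum_{i=1}^{k}\lambda_i B_i$, with each $B_i$ being a real-valued skew tensor. The $\lambda_i$ are determined by taking the square roots of the eigenvalues of $-W^2$, while the matrices $B_i$ are determined by solving the following system of k equations

$$\begin{aligned}
W &= \sum_{i=1}^{k}\lambda_i B_i, \\
-W^3 &= \sum_{i=1}^{k}\lambda_i^3 B_i, \\
&\;\;\vdots \\
(-1)^{k+1}W^{2k-1} &= \sum_{i=1}^{k}\lambda_i^{2k-1} B_i,
\end{aligned}$$

by inverting the van der Monde matrix [153]

$$\begin{bmatrix} \lambda_1 & \lambda_2 & \dots & \lambda_k \\ \lambda_1^3 & \lambda_2^3 & \dots & \lambda_k^3 \\ \vdots & \vdots & & \vdots \\ \lambda_1^{2k-1} & \lambda_2^{2k-1} & \dots & \lambda_k^{2k-1} \end{bmatrix},$$

to obtain

$$B_i(W) = \begin{cases} (-1)^{k+1}\,\dfrac{W}{\lambda_i}\displaystyle\prod_{\substack{j=1 \\ j \neq i}}^{k}\dfrac{W^2 + \lambda_j^2 I}{\lambda_i^2 - \lambda_j^2}, & k > 1, \\[2ex] \dfrac{W}{\lambda}, & k = 1. \end{cases}$$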
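The Rodrigues-type formula can be exercised numerically for n = 4, where k = 2. The sketch below is an illustration under stated assumptions: the skew matrix entries are arbitrary sample values, the helper names are hypothetical, and the λ's are obtained from the fact that for a 4 × 4 skew matrix the eigenvalues of −W² are the roots of x² − s x + pf² = 0, with s the sum of the squared upper-triangular entries and pf the Pfaffian. The result of Eqn. (1.295) is compared with a truncated Taylor series.

```python
import math

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def lincomb(*pairs):
    """Sum of c * M over (c, M) pairs of equally sized square matrices."""
    n = len(pairs[0][1])
    return [[sum(c * M[i][j] for c, M in pairs) for j in range(n)]
            for i in range(n)]

def expm_series(T, terms=60):
    """Truncated Taylor series of the exponential, used as a reference."""
    n = len(T)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for m in range(1, terms):
        term = lincomb((1.0 / m, matmul(term, T)))
        result = lincomb((1.0, result), (1.0, term))
    return result

def max_abs_diff(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

# Illustrative real skew matrix for n = 4 (entries chosen arbitrarily).
w12, w13, w14, w23, w24, w34 = 0.9, -0.4, 0.3, 0.5, 0.2, -1.1
W = [[0.0, w12, w13, w14],
     [-w12, 0.0, w23, w24],
     [-w13, -w23, 0.0, w34],
     [-w14, -w24, -w34, 0.0]]

# Square roots of the eigenvalues of -W^2 (two distinct nonzero values here).
s = w12**2 + w13**2 + w14**2 + w23**2 + w24**2 + w34**2
pf = w12 * w34 - w13 * w24 + w14 * w23
disc = math.sqrt(s * s - 4.0 * pf * pf)
lam = [math.sqrt((s + disc) / 2.0), math.sqrt((s - disc) / 2.0)]
k = len(lam)

I4 = [[float(i == j) for j in range(4)] for i in range(4)]
W2 = matmul(W, W)

# B_i(W) = (-1)^(k+1) (W / lam_i) prod_{j != i} (W^2 + lam_j^2 I) / (lam_i^2 - lam_j^2)
B = []
for i in range(k):
    Bi = lincomb(((-1.0) ** (k + 1) / lam[i], W))
    for j in range(k):
        if j != i:
            factor = lincomb((1.0, W2), (lam[j] ** 2, I4))
            Bi = lincomb((1.0 / (lam[i] ** 2 - lam[j] ** 2), matmul(Bi, factor)))
    B.append(Bi)

# Eqn. (1.295): e^W = I + sum_i [sin(lam_i) B_i + (1 - cos(lam_i)) B_i^2]
expW = I4
for lam_i, Bi in zip(lam, B):
    expW = lincomb((1.0, expW), (math.sin(lam_i), Bi),
                   (1.0 - math.cos(lam_i), matmul(Bi, Bi)))

err_exp = max_abs_diff(expW, expm_series(W))
err_W = max_abs_diff(W, lincomb(*[(l, Bi) for l, Bi in zip(lam, B)]))
print(err_exp, err_W)
```

Both differences should be at round-off level: the first checks Eqn. (1.295) against the series definition, the second checks the decomposition W = Σ λᵢBᵢ.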

An identical procedure can be used to find the exponential of a skew-Hermitian tensor (the exponential being unitary) since, again, in this case, the real parts of the eigenvalues and the $G_i$'s are zero. We have already noted that some complex-valued skew-symmetric tensors can be nilpotent, and hence, in general, no Rodrigues-type formula can be derived for their exponential (which is an orthogonal tensor).

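The second method that we discuss is the one devised by Fulmer [85]. Note that since $d^k/d\xi^k(e^{T\xi}) = T^k e^{T\xi}$, for k = 1, 2, . . . , n, we have

$$e^{T\xi}\Big|_{\xi=0} = I \quad\text{and}\quad \left[\frac{d^k}{d\xi^k}\left(e^{T\xi}\right)\right]_{\xi=0} = T^k. \tag{1.296}$$

Also, by virtue of the Cayley–Hamilton theorem,

$$\left(T^n - I_1T^{n-1} + \cdots + (-1)^n I_n I\right)e^{T\xi} = 0,$$

or, alternatively,

$$\left(\frac{d^n}{d\xi^n} - I_1\frac{d^{n-1}}{d\xi^{n-1}} + \cdots + (-1)^n I_n\right)e^{T\xi} = 0. \tag{1.297}$$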





As a result of Eqns. (1.296) and (1.297), we see that $e^{T\xi}$ is the unique solution of the nth order initial value problem

$$DG(\xi) = 0, \quad G(0) = I, \quad \frac{d^kG(\xi)}{d\xi^k}\bigg|_{\xi=0} = T^k, \quad k = 1, 2, \dots, n-1, \tag{1.298}$$

where D is the differential operator in parentheses in Eqn. (1.297). If T has the characteristic polynomial given by Eqn. (J.23), then the general solution of $DG(\xi) = 0$ is

$$G(\xi) = e^{\lambda_1\xi}\left(C_{11} + \xi C_{12} + \cdots + \xi^{n_1-1}C_{1n_1}\right) + \cdots + e^{\lambda_k\xi}\left(C_{k1} + \xi C_{k2} + \cdots + \xi^{n_k-1}C_{kn_k}\right), \tag{1.299}$$

where the $C_{ij}$ are n matrices to be determined from the n initial conditions

$$\begin{aligned}
I &= C_{11} + C_{21} + \cdots + C_{k1}, \\
T &= (\lambda_1C_{11} + C_{12}) + \cdots + (\lambda_kC_{k1} + C_{k2}), \\
T^2 &= (\lambda_1^2C_{11} + 2\lambda_1C_{12}) + \cdots + (\lambda_k^2C_{k1} + 2\lambda_kC_{k2}), \\
&\;\;\vdots \\
T^{n-1} &= \left(\lambda_1^{n-1}C_{11} + (n-1)\lambda_1^{n-2}C_{12} + \cdots + \frac{(n-1)!}{(n-n_1)!}\lambda_1^{n-n_1}C_{1n_1}\right) + \cdots \\
&\quad + \left(\lambda_k^{n-1}C_{k1} + (n-1)\lambda_k^{n-2}C_{k2} + \cdots + \frac{(n-1)!}{(n-n_k)!}\lambda_k^{n-n_k}C_{kn_k}\right).
\end{aligned}$$

The $C_{ij}$ are obtained as polynomials in T by inverting the n × n coefficient matrix in the above equations; substituting these expressions into Eqn. (1.299), we get the desired expression for $e^{T\xi}$. Note that the above method is based on the characteristic polynomial,³ whereas the first method is based on the minimal polynomial.

As an illustration, let us again find the expression for $e^{T}$, with T given by Eqn. (J.39). Since the characteristic polynomial is given by $(T - \alpha I)(T - \beta I)^2$, the general solution is $G(\xi) = e^{\alpha\xi}C_{11} + e^{\beta\xi}(C_{21} + \xi C_{22})$, where the $C_{ij}$ are obtained from the initial conditions

$$\begin{aligned}
I &= C_{11} + C_{21}, \\
T &= \alpha C_{11} + \beta C_{21} + C_{22}, \\
T^2 &= \alpha^2C_{11} + \beta^2C_{21} + 2\beta C_{22}.
\end{aligned}$$

Solving for $C_{ij}$ and substituting into $e^{T} = e^{\alpha}C_{11} + e^{\beta}(C_{21} + C_{22})$, we get the same expression as before.
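For reference, the three initial conditions can be solved by elimination. The following worked sketch uses only the three linear equations: substituting $C_{21} = I - C_{11}$ into the second and third equations and forming $T^2 - \beta^2 I - 2\beta(T - \beta I)$ isolates $C_{11}$, after which $C_{22}$ follows from the second equation:

$$C_{11} = \frac{(T - \beta I)^2}{(\alpha - \beta)^2}, \qquad C_{21} = I - C_{11}, \qquad C_{22} = (T - \beta I) - (\alpha - \beta)C_{11} = \frac{(T - \alpha I)(T - \beta I)}{\beta - \alpha}.$$

Comparing $e^{T} = e^{\alpha}C_{11} + e^{\beta}(C_{21} + C_{22})$ with the first method suggests the identification $C_{11} = P_1$, $C_{21} = P_2$ and $C_{22} = N_2$.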

³ Actually, the method can be modified quite easily to work with the minimal polynomial as well; in this case the number of $C_{ij}$ matrices to be determined is $m = \sum_{i=1}^{k} m_i$ instead of n. This is particularly advantageous, for example, if T is diagonalizable, since then $m_i = 1$ for i = 1, 2, . . . , k, so that $C_{ij} = 0$ for i = 1, 2, . . . , k and j ≥ 2.

Now we discuss the differentiation of $e^{T}$ with respect to T, where T ∈ Lin, for any underlying space dimension n; the treatment is based on [155], and alternative treatments may be found in [143] and [144]. We shall first give an alternate derivation of a formula that appears in Mathias [211] and Ortiz et al. [241]. Let $H \equiv \partial X/\partial T$, where $X = e^{T\xi}$. Differentiating the expression $dX/d\xi = TX$ with respect to T using Eqn. (1.233), we get

$$\frac{dH}{d\xi} = (T \boxtimes I)H + I \boxtimes X^T.$$

Multiplying this equation by the “integrating factor” $e^{-T\xi} \boxtimes I$, we get

$$\frac{d}{d\xi}\left[(e^{-T\xi} \boxtimes I)H\right] = e^{-T\xi} \boxtimes (e^{T\xi})^T.$$

Integrating this equation subject to the condition $H|_{\xi=0} = 0$, and using Eqn. (1.286), we get

$$\begin{aligned}
H(\xi) &= (e^{T\xi} \boxtimes I)\int_0^{\xi}\left[e^{-T\eta} \boxtimes (e^{T\eta})^T\right]d\eta \\
&= \int_0^{\xi}\left[e^{T(\xi-\eta)} \boxtimes (e^{T\eta})^T\right]d\eta.
\end{aligned}$$

Similarly, by differentiating $dX/d\xi = XT$, we also get

$$H(\xi) = \int_0^{\xi}\left[e^{T\eta} \boxtimes e^{T^T(\xi-\eta)}\right]d\eta.$$

Since $\partial e^{T}/\partial T = H|_{\xi=1}$, we obtain

$$\begin{aligned}
\frac{\partial(e^{T})}{\partial T} &= \int_0^1\left[e^{T(1-\eta)} \boxtimes (e^{T\eta})^T\right]d\eta \\
&= \int_0^1\left[e^{T\eta} \boxtimes \left(e^{T(1-\eta)}\right)^T\right]d\eta.
\end{aligned} \tag{1.300}$$

Explicit expressions for $e^{T\eta}$ such as those given by Eqn. (1.293) (or those presented in Theorem 1.10.5 for n = 3) should be substituted into the above expressions to get explicit formulae for $\partial e^{T}/\partial T$. Note that these formulae are valid for any T ∈ Lin and any n.

Simpler expressions for a diagonalizable tensor T can be obtained using Eqns. (1.294) and (1.300), or directly by using Eqn. (1.255) with $f(\lambda_i) = e^{\lambda_i}$. The final result that we obtain for a diagonalizable tensor (for any n) is $e^{\lambda}I$ for k = 1, and

$$\frac{\partial(e^{T})}{\partial T} = \left[\sum_{i=1}^{k} e^{\lambda_i}\,P_i \boxtimes P_i^T + \sum_{i=1}^{k}\sum_{\substack{j=1 \\ j \neq i}}^{k}\frac{e^{\lambda_i} - e^{\lambda_j}}{\lambda_i - \lambda_j}\,P_i \boxtimes P_j^T\right], \quad k > 1,$$

where k is the number of distinct eigenvalues of T, and $P_i$ is given by Eqn. (J.26). When T ∈ Sym, we obviously have $P_i^T = P_i$, and the above result should be pre- and post-multiplied by S.

In what follows, we shall continue to denote T(t) as T; the treatment is based on [158]. Recall that difficulties arise in defining a unique scalar-valued logarithmic function since $e^{\lambda} = e^{\lambda + 2\pi i n}$, where n is an integer. Similar difficulties arise in defining the logarithm of a tensor. If one defines the solution X of $e^{X} = T$ as log T, then a non-singular T may have an infinite number of real and complex logarithms. In order to avoid this nonuniqueness and ensure that the logarithm of a real tensor is real, we define the principal logarithmic function, denoted by log T, of a matrix T with no eigenvalues that are negative or zero (i.e., $\lambda_i \notin (-\infty, 0]$, i = 1, 2, . . . , n), as the unique function such that $e^{\log T} = T$. The eigenvalues of log T have imaginary parts that lie strictly between −π and π since, as we shall show below, if $re^{i\theta}$, −π < θ < π, are the eigenvalues of T, then $\log(re^{i\theta}) = \log r + i\theta$, −π < θ < π, are the eigenvalues of log T. Thus, if $\lambda_i$ denote the eigenvalues of T, then

1. $e^{\log T} = T$ if and only if $\lambda_i \notin (-\infty, 0]$ ∀i.

2. $\log e^{T} = T$ if and only if $-\pi < \mathrm{Im}(\lambda_i) < \pi$ ∀i.

In series form the logarithmic function is given by

$$\log(I + T) = \sum_{i=1}^{\infty}\frac{(-1)^{i-1}}{i}\,T^i, \tag{1.301}$$

or, alternatively, as

$$\log(T) = \sum_{i=1}^{\infty}\frac{(-1)^{i-1}}{i}\,(T - I)^i. \tag{1.302}$$

The series given by Eqn. (1.301) is absolutely convergent only if ‖T‖ < 1. It is immediately evident from the above representation that log(I) = 0 and $\log(T^T) = (\log T)^T$. Since, for all invertible A ∈ Lin, $e^{A^{-1}BA} = A^{-1}e^{B}A$ for all B ∈ Lin, we get $\log(A^{-1}BA) = A^{-1}(\log B)A$.
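A minimal sketch of the series in Eqn. (1.301), assuming a sample matrix T whose norm is well below 1 (the entries and helper names are illustrative). It checks that exponentiating the partial sum recovers I + T and that the logarithm of the transpose is the transpose of the logarithm.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(A):
    return [list(col) for col in zip(*A)]

def log_series(T, terms=80):
    """Partial sum of Eqn. (1.301): log(I + T) = sum_{i>=1} (-1)^(i-1) T^i / i."""
    n = len(T)
    result = [[0.0] * n for _ in range(n)]
    power = [[float(i == j) for j in range(n)] for i in range(n)]
    for i in range(1, terms + 1):
        power = matmul(power, T)                      # power = T^i
        sign = 1.0 if i % 2 == 1 else -1.0
        result = [[result[r][c] + sign * power[r][c] / i for c in range(n)]
                  for r in range(n)]
    return result

def expm_series(T, terms=40):
    """Truncated Taylor series of the exponential."""
    n = len(T)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for m in range(1, terms):
        term = [[x / m for x in row] for row in matmul(term, T)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def max_abs_diff(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

# Illustrative matrix with small norm, so that the series converges quickly.
T = [[0.10, 0.20, 0.00],
     [-0.05, 0.15, 0.10],
     [0.00, 0.05, -0.20]]
I_plus_T = [[T[i][j] + float(i == j) for j in range(3)] for i in range(3)]

L = log_series(T)
err_roundtrip = max_abs_diff(expm_series(L), I_plus_T)            # e^{log(I+T)} = I + T
err_transpose = max_abs_diff(log_series(transpose(T)), transpose(L))
print(err_roundtrip, err_transpose)
```

For matrices that do not satisfy ‖T‖ < 1 the partial sums need not converge, which is one motivation for the integral definition that follows in the text.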

If H(ξ) : [0, 1] → Lin is a tensor-valued function of ξ with no eigenvalues on the closed negative real axis for any ξ ∈ [0, 1], an alternative definition can be given as the solution of the differential equation

$$H(\xi)\frac{dX}{d\xi} = \frac{dH}{d\xi}, \quad X(0) = 0, \tag{1.303}$$

with the solution denoted by X(ξ) = log H. For example, if H(ξ) = I + (T − I)ξ, with no eigenvalues of T on the closed negative real axis, then, from the above definition, we get

$$\log[I + (T - I)\xi] = \int_0^{\xi}[I + (T - I)\eta]^{-1}(T - I)\,d\eta. \tag{1.304}$$

If one violates the restriction on the eigenvalues of T, then, from the above definition, one can get a complex-valued logarithm even for a real tensor as is seen, for example, by taking T = diag[1, −1, 1]. Let (λ, n) denote an eigenvalue/eigenvector pair of T. Then, by using the fact that

$$In = [I + (T - I)\eta]^{-1}[I + (T - I)\eta]\,n = (1 + (\lambda - 1)\eta)\,[I + (T - I)\eta]^{-1}n,$$

we get from Eqn. (1.304),

$$\log[I + (T - I)\xi]\,n = \left[\int_0^{\xi}\frac{\lambda - 1}{1 + (\lambda - 1)\eta}\,d\eta\right]n = \log[1 + (\lambda - 1)\xi]\,n,$$

so that (log[1 + (λ − 1)ξ], n) is the corresponding eigenvalue/eigenvector pair of log[I + (T − I)ξ]. In particular, for ξ = 1, (log λ, n) is the eigenvalue/eigenvector pair of log T. Similarly, one can show that (− log λ, n) is the eigenvalue/eigenvector pair of $\log(T^{-1})$. Analogous to Eqn. (1.277), we get

$$\operatorname{tr}\log T = \sum_{i=1}^{n}\log\lambda_i = \log(\det T). \tag{1.305}$$

We have the following analogue of Theorem 1.10.2:

Theorem 1.10.6. Let A, B be such that no eigenvalues of A, B and AB are on the closed negative real axis. If

$$\log\{[I + (A - I)\xi][I + (B - I)\xi]\} = \log[I + (A - I)\xi] + \log[I + (B - I)\xi] \quad \forall\xi \in [0, 1], \tag{1.306}$$

then AB = BA. Conversely, if AB = BA (so that A, B and AB have a common set of eigenvectors), and if $-\pi < \mathrm{Im}(\log\alpha_i + \log\beta_i) < \pi$ for each i,ᵃ where $\alpha_i$ and $\beta_i$ denote the eigenvalues of A and B corresponding to the same eigenvector, then

$$\log\{[I + (A - I)\xi][I + (B - I)\xi]\} = \log\{[I + (B - I)\xi][I + (A - I)\xi]\} = \log[I + (A - I)\xi] + \log[I + (B - I)\xi] \quad \forall\xi \in [0, 1]. \tag{1.307}$$

Proof. If Eqn. (1.306) holds, then by differentiating it with respect to ξ using Eqn. (1.303), we get

$$[I + (B - I)\xi]^{-1}[I + (A - I)\xi]^{-1}[A + B - 2I + 2\xi(A - I)(B - I)] = [I + (A - I)\xi]^{-1}(A - I) + [I + (B - I)\xi]^{-1}(B - I).$$

Differentiating the above relation once again with respect to ξ (using the fact that $d(T^{-1})/d\xi = -T^{-1}[dT/d\xi]T^{-1}$), and evaluating the resulting expression at ξ = 0, we get AB = BA.

Conversely, given that AB = BA, using Eqn. (1.303), we have

$$\begin{aligned}
\log\{[I + (A - I)\xi][I + (B - I)\xi]\}
&= \int_0^{\xi}[I + (B - I)\eta]^{-1}[I + (A - I)\eta]^{-1}[A + B - 2I + 2\eta(A - I)(B - I)]\,d\eta \\
&= \int_0^{\xi}[I + (B - I)\eta]^{-1}[I + (A - I)\eta]^{-1}\{[I + (A - I)\eta](B - I) + [I + (B - I)\eta](A - I)\}\,d\eta \\
&= \int_0^{\xi}[I + (B - I)\eta]^{-1}(B - I)\,d\eta + \int_0^{\xi}[I + (A - I)\eta]^{-1}(A - I)\,d\eta \\
&= \log[I + (A - I)\xi] + \log[I + (B - I)\xi].
\end{aligned}$$

In a similar manner, one gets the same expression for $\log\{[I + (B - I)\xi][I + (A - I)\xi]\}$, leading to Eqn. (1.307).

ᵃ This condition is imposed since, as stated, the imaginary part of the eigenvalues of the log of any tensor has to lie between −π and π.

As a corollary, it follows by taking ξ = 1 in Eqn. (1.307), that

$$AB = BA \ \text{ and } \ -\pi < \mathrm{Im}(\log\alpha_i + \log\beta_i) < \pi \ \ \forall i \implies \log(AB) = \log(BA) = \log A + \log B. \tag{1.308}$$

The above implication need not be true if AB = BA, but the constraints on the eigenvalues of A and B are violated. Cheng et al. [53] have presented the following (complex-valued) scalar example: if $a = b = e^{(\pi - \epsilon)i}$, where ε is small and positive, then

$$\log(ab) = -2\epsilon i \neq 2(\pi - \epsilon)i = \log(a) + \log(b).$$
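The scalar example can be reproduced with the principal logarithm from Python's standard `cmath` module. This is only an illustration; the value ε = 0.01 is an arbitrary choice.

```python
import cmath
import math

eps = 0.01                                  # small positive number (illustrative)
a = b = cmath.exp(1j * (math.pi - eps))

log_ab = cmath.log(a * b)                   # principal value: imaginary part in (-pi, pi]
sum_of_logs = cmath.log(a) + cmath.log(b)

print(log_ab)                               # approximately -2*eps*i
print(sum_of_logs)                          # approximately 2*(pi - eps)*i
print(sum_of_logs - log_ab)                 # approximately 2*pi*i
```

The two values differ by 2πi, which is the branch jump that the constraint on Im(log αᵢ + log βᵢ) in Eqn. (1.308) rules out.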

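We present a (real-valued) matrix example that is a modification of the example of Wood [354]. If

$$A = (\pi - 0.01)\begin{bmatrix} 0 & 0 & \frac{\sqrt{3}}{2} \\ 0 & 0 & -\frac{1}{2} \\ -\frac{\sqrt{3}}{2} & \frac{1}{2} & 0 \end{bmatrix}, \quad H = e^{A} \approx \begin{bmatrix} -0.499963 & 0.866004 & 0.00866011 \\ 0.866004 & 0.500012 & -0.00499992 \\ -0.00866011 & 0.00499992 & -0.99995 \end{bmatrix},$$

then

$$\log H \approx \begin{bmatrix} 0 & 0 & 2.71204 \\ 0 & 0 & -1.5658 \\ -2.71204 & 1.5658 & 0 \end{bmatrix}, \quad \log H^2 \approx \begin{bmatrix} 0 & 0 & -0.0173205 \\ 0 & 0 & 0.01 \\ 0.0173205 & -0.01 & 0 \end{bmatrix},$$

so that $\log H^2 \neq 2\log H$. The eigenvalues of log H and log H² are (approximately) (0, (π − 0.01)i, −(π − 0.01)i) and (0, 0.02i, −0.02i), respectively.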

The converse assertion of Eqn. (1.308) is true for dimension n = 2, as seen in [226, 227], and also for symmetric tensors (for any n), as is evident by taking the transpose. For n = 4, it may appear that by taking $X = e^{A}$, $Y = e^{B}$, where $e^{A}$ and $e^{B}$ are given by Eqn. (1.285), we have log(XY) = log X + log Y, but XY ≠ YX. However, since neither A nor A + B satisfies the constraint $-\pi < \mathrm{Im}(\lambda_i) < \pi$, we have $\log(e^{A}) \neq A$ and $\log(e^{A+B}) \neq A + B$. Thus, the question of whether the converse assertion of Eqn. (1.308) is true for n > 2 remains unresolved.

Let T be a tensor with no eigenvalues $\lambda_i$ on the closed negative real axis. Since T and $T^{-1}$ commute, and since $\log\lambda_i + \log(\lambda_i)^{-1} = 0$, by taking A ≡ T and B ≡ $T^{-1}$, we get

$$\log(T^{-1}) = -\log T. \tag{1.309}$$

If R ∈ Orth⁺ is such that it does not have any eigenvalues on the closed negative real axis, i.e., if R ∈ Orth⁺/Sym + I, then log R ∈ Skw. This is proved by noting that

$$[\log R]^T = \log(R^T) = \log(R^{-1}) = -\log(R),$$

where the last step follows from Eqn. (1.309).





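We now discuss the explicit determination of the logarithm of a tensor; an alternative method may be found in [34]. Analogous to Eqn. (J.38), one can prove that

$$[I + (T - I)\eta]^{-1} = \sum_{i=1}^{k}\frac{1}{[1 + (\lambda_i - 1)\eta]}\left[P_i + \sum_{j=1}^{m_i-1}\left(\frac{-\eta N_i}{[1 + (\lambda_i - 1)\eta]}\right)^j\right].$$

Substituting this relation into Eqn. (1.304), we get

$$\begin{aligned}
\log[I + (T - I)\xi] &= \int_0^{\xi}\sum_{i=1}^{k}\left\{\frac{\lambda_i - 1}{[1 + (\lambda_i - 1)\eta]}P_i + \sum_{j=1}^{m_i-1}\frac{(-1)^{j-1}\eta^{j-1}N_i^j}{[1 + (\lambda_i - 1)\eta]^{j+1}}\right\}d\eta \\
&= \sum_{i=1}^{k}\left\{\log[1 + (\lambda_i - 1)\xi]\,P_i + \sum_{j=1}^{m_i-1}\frac{(-1)^{j-1}\xi^jN_i^j}{j[1 + (\lambda_i - 1)\xi]^j}\right\}.
\end{aligned} \tag{1.310}$$

In particular, for ξ = 1, we get

$$\log T = \sum_{i=1}^{k}\left\{(\log\lambda_i)P_i + \sum_{j=1}^{m_i-1}\frac{(-1)^{j-1}N_i^j}{j\lambda_i^j}\right\}. \tag{1.311}$$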

An alternative derivation of Eqn. (1.310) can be given as follows. Since $\{P_1, P_2, \dots, P_k, N_1, \dots, N_1^{m_1-1}, \dots, N_k, \dots, N_k^{m_k-1}\}$ constitutes a basis for any tensor, we write

$$\log[I + (T - I)\xi] = \sum_{i=1}^{k}\left[f_i(\xi)P_i + \sum_{j=1}^{m_i-1}f_{ij}(\xi)(N_i)^j\right],$$

where the $f_i(\xi)$ and $f_{ij}(\xi)$ are functions to be determined. Substituting into the definition given by Eqn. (1.303), and using the linear independence of the basis, we get for each i the equations

$$\begin{aligned}
[1 + (\lambda_i - 1)\xi]\frac{df_i}{d\xi} &= (\lambda_i - 1), \\
[1 + (\lambda_i - 1)\xi]\frac{df_{i1}}{d\xi} + \xi\frac{df_i}{d\xi} &= 1, \\
[1 + (\lambda_i - 1)\xi]\frac{df_{i2}}{d\xi} + \xi\frac{df_{i1}}{d\xi} &= 0, \\
&\;\;\vdots \\
[1 + (\lambda_i - 1)\xi]\frac{df_{i(m_i-1)}}{d\xi} + \xi\frac{df_{i(m_i-2)}}{d\xi} &= 0,
\end{aligned}$$

which are to be solved subject to the condition that $f_i(0) = f_{ij}(0) = 0$. We first solve for $f_i$ using the first equation, then for $f_{i1}$ using the second one, and so on. The solution is given by $f_i = \log[1 + (\lambda_i - 1)\xi]$ and $f_{ij} = (-1)^{j-1}\xi^j/[j(1 + (\lambda_i - 1)\xi)^j]$, which yields Eqn. (1.310).

For a diagonalizable tensor T, Eqn. (1.311) simplifies to

$$\log T = \sum_{i=1}^{k}(\log\lambda_i)P_i, \tag{1.312}$$

with $P_i$ given by Eqn. (J.26). By specializing Eqn. (1.311) to the case n = 3, we get the following analogue of Theorem 1.10.5:





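Theorem 1.10.7. For n = 3, we have the following cases:

1. $\lambda_1 = \lambda_2 = \lambda_3 \equiv \lambda$

(a) If $q(T) = (T - \lambda I)$, then

$$\log T = (\log\lambda)I.$$

(b) If $q(T) = (T - \lambda I)^2$, then

$$\log T = (\log\lambda)I + \frac{T - \lambda I}{\lambda}.$$

(c) If $q(T) = (T - \lambda I)^3$, then

$$\log T = (\log\lambda)I + \frac{T - \lambda I}{\lambda} - \frac{(T - \lambda I)^2}{2\lambda^2}.$$

2. $\lambda_1 \equiv \mu \neq \lambda_2 = \lambda_3 \equiv \lambda$

(a) If $q(T) = (T - \lambda I)(T - \mu I)$, then

$$\log T = \log\mu\left(\frac{T - \lambda I}{\mu - \lambda}\right) + \log\lambda\left(\frac{T - \mu I}{\lambda - \mu}\right).$$

(b) If $q(T) = (T - \mu I)(T - \lambda I)^2$, then

$$\log T = \log\mu\,\frac{(T - \lambda I)^2}{(\lambda - \mu)^2} - \log\lambda\,\frac{(T - \mu I)(T - (2\lambda - \mu)I)}{(\lambda - \mu)^2} + \frac{(T - \lambda I)(T - \mu I)}{\lambda(\lambda - \mu)}.$$

3. $\lambda_1 \equiv \lambda \neq \lambda_2 \equiv \mu \neq \lambda_3 \equiv \nu \neq \lambda_1$. We have $q(T) = (T - \lambda I)(T - \mu I)(T - \nu I)$, and

$$\log T = \log\lambda\,\frac{(T - \mu I)(T - \nu I)}{(\lambda - \mu)(\lambda - \nu)} + \log\mu\,\frac{(T - \lambda I)(T - \nu I)}{(\mu - \lambda)(\mu - \nu)} + \log\nu\,\frac{(T - \lambda I)(T - \mu I)}{(\nu - \lambda)(\nu - \mu)}.$$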

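Some examples are

$$T = \begin{bmatrix} e & 1 & 0 \\ 0 & e & 0 \\ 0 & 0 & e \end{bmatrix}, \quad \log T = \begin{bmatrix} 1 & \frac{1}{e} & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad \text{(Case 1(b))},$$

$$T = \begin{bmatrix} e & 1 & 1 \\ 0 & e & 1 \\ 0 & 0 & e \end{bmatrix}, \quad \log T = \begin{bmatrix} 1 & \frac{1}{e} & \frac{2e-1}{2e^2} \\ 0 & 1 & \frac{1}{e} \\ 0 & 0 & 1 \end{bmatrix}, \quad \text{(Case 1(c))},$$

$$T = \begin{bmatrix} e & 1 & 1 \\ 0 & e^2 & 1 \\ 0 & 0 & e^2 \end{bmatrix}, \quad \log T = \begin{bmatrix} 1 & \frac{1}{e(e-1)} & \frac{e^3 - e^2 - 1}{e^3(e-1)^2} \\ 0 & 2 & \frac{1}{e^2} \\ 0 & 0 & 2 \end{bmatrix}, \quad \text{(Case 2(b))}.$$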
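The three examples can be checked by exponentiating the listed logarithms and comparing with T. The sketch below uses a truncated Taylor series for the exponential; the helper names and the dictionary layout are illustrative choices.

```python
import math

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm_series(T, terms=60):
    """Truncated Taylor series of the exponential."""
    n = len(T)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for m in range(1, terms):
        term = [[x / m for x in row] for row in matmul(term, T)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

e = math.e
# Each entry: case label -> (T, log T), copied from the examples in the text.
examples = {
    "1(b)": ([[e, 1, 0], [0, e, 0], [0, 0, e]],
             [[1, 1 / e, 0], [0, 1, 0], [0, 0, 1]]),
    "1(c)": ([[e, 1, 1], [0, e, 1], [0, 0, e]],
             [[1, 1 / e, (2 * e - 1) / (2 * e**2)], [0, 1, 1 / e], [0, 0, 1]]),
    "2(b)": ([[e, 1, 1], [0, e**2, 1], [0, 0, e**2]],
             [[1, 1 / (e * (e - 1)), (e**3 - e**2 - 1) / (e**3 * (e - 1)**2)],
              [0, 2, 1 / e**2],
              [0, 0, 2]]),
}

errors = {}
for case, (T, log_T) in examples.items():
    back = expm_series(log_T)               # should reproduce T
    errors[case] = max(abs(back[i][j] - T[i][j]) for i in range(3) for j in range(3))
print(errors)                               # all entries should be at round-off level
```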





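As an application, the logarithm of R ∈ Orth⁺/Sym, with eigenvalues $1, e^{i\alpha}, e^{-i\alpha}$ (and hence tr R = 1 + 2 cos α) is obtained using Case 3 in the above theorem as

$$\begin{aligned}
\log R &= i\alpha(P_2 - P_3) \\
&= \frac{i\alpha(R - I)}{e^{i\alpha} - e^{-i\alpha}}\left[\frac{R - e^{-i\alpha}I}{e^{i\alpha} - 1} + \frac{R - e^{i\alpha}I}{e^{-i\alpha} - 1}\right] \\
&= \frac{\alpha}{2\sin\alpha}(R - R^T) \\
&= \sqrt{2}\,\alpha\,\frac{R - R^T}{\left\|R - R^T\right\|}.
\end{aligned}$$

By substituting W ≡ log R (|w| = α) into the right-hand side of Eqn. (1.289), we can easily verify that $e^{\log R} = R$.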
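The formula log R = α/(2 sin α)(R − Rᵀ) can be tried on the matrix example given earlier, where H is a rotation through π − 0.01. The sketch below is illustrative: it builds H from the standard axis-angle (Rodrigues) form of a rotation, with the axis read off from the skew matrix A of that example, and the function names are hypothetical.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rodrigues(w, angle):
    """Rotation about the unit axis w: cos(a) I + sin(a) K + (1 - cos(a)) w w^T."""
    K = [[0.0, -w[2], w[1]], [w[2], 0.0, -w[0]], [-w[1], w[0], 0.0]]
    c, s = math.cos(angle), math.sin(angle)
    return [[c * (i == j) + s * K[i][j] + (1.0 - c) * w[i] * w[j]
             for j in range(3)] for i in range(3)]

def log_rotation(R):
    """Principal logarithm alpha / (2 sin alpha) (R - R^T), valid for 0 < alpha < pi."""
    cos_alpha = (R[0][0] + R[1][1] + R[2][2] - 1.0) / 2.0   # tr R = 1 + 2 cos(alpha)
    alpha = math.acos(max(-1.0, min(1.0, cos_alpha)))
    factor = alpha / (2.0 * math.sin(alpha))
    return [[factor * (R[i][j] - R[j][i]) for j in range(3)] for i in range(3)]

axis = (0.5, math.sqrt(3.0) / 2.0, 0.0)     # axial vector of A / (pi - 0.01)
H = rodrigues(axis, math.pi - 0.01)          # H = e^A
log_H = log_rotation(H)
log_H2 = log_rotation(matmul(H, H))

print(log_H[0][2], log_H[1][2])              # approximately 2.71204, -1.5658
print(log_H2[0][2], log_H2[1][2])            # approximately -0.0173205, 0.01
```

The principal logarithm of H² corresponds to a rotation through 0.02 about the reversed axis, which is why it is not equal to 2 log H.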

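By differentiating Eqn. (1.304) with respect to T and using Eqns. (1.233) and (1.236), we get (for any underlying space dimension n)

$$\frac{\partial}{\partial T}\log(I + (T - I)\xi) = \int_0^{\xi}\left[(I + (T - I)\eta)^{-1} \boxtimes (I + (T - I)\eta)^{-T}\right]d\eta.$$

For a diagonalizable tensor, the above equation simplifies to

$$\frac{\partial}{\partial T}\log(I + (T - I)\xi) = \sum_{i=1}^{k}\frac{\xi}{1 + (\lambda_i - 1)\xi}\,P_i \boxtimes P_i^T + \sum_{i=1}^{k}\sum_{\substack{j=1 \\ j \neq i}}^{k}\frac{\log[1 + (\lambda_i - 1)\xi] - \log[1 + (\lambda_j - 1)\xi]}{\lambda_i - \lambda_j}\,P_i \boxtimes P_j^T,$$

a result that we could also have obtained directly by differentiating Eqn. (1.312) using Eqn. (1.255). By setting ξ = 1 in the above equation, we get

$$\frac{\partial}{\partial T}\log T = \sum_{i=1}^{k}\frac{1}{\lambda_i}\,P_i \boxtimes P_i^T + \sum_{i=1}^{k}\sum_{\substack{j=1 \\ j \neq i}}^{k}\frac{\log\lambda_i - \log\lambda_j}{\lambda_i - \lambda_j}\,P_i \boxtimes P_j^T.$$

For A ∈ Sym,

$$\frac{\partial}{\partial A}\log A = \begin{cases} S\left[\displaystyle\sum_{i=1}^{k}\frac{1}{\lambda_i}\,P_i \boxtimes P_i + \sum_{i=1}^{k}\sum_{\substack{j=1 \\ j \neq i}}^{k}\frac{\log\lambda_i - \log\lambda_j}{\lambda_i - \lambda_j}\,P_i \boxtimes P_j\right]S, & k > 1, \\[3ex] \dfrac{1}{\lambda}\,S, & \lambda_1 = \lambda_2 = \lambda_3 \equiv \lambda, \end{cases} \tag{1.313}$$

with $P_i$ given by Eqn. (J.4). An application is presented in Section 2.4.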





1.11 Divergence and Stokes’ Theorems

We state the divergence, Stokes', potential and localization theorems that are used quite frequently in the following development. The divergence theorem relates a volume integral to a surface integral, while Stokes' theorem relates a contour integral to a surface integral. Let S represent the surface of a volume V, n represent the unit outward normal to the surface, φ a scalar field, u a vector field, and T a second-order tensor field. Then we have

Divergence theorem (also known as Gauss' theorem)

$$\int_V \nabla\phi\,dV = \int_S \phi\,n\,dS. \tag{1.314}$$

Applying Eqn. (1.314) to the components $u_i$ of a vector u, we get

$$\int_V \nabla\cdot u\,dV = \int_S u\cdot n\,dS, \tag{1.315}$$

$$\int_V \nabla\times u\,dV = \int_S n\times u\,dS,$$

$$\int_V \nabla u\,dV = \int_S u\otimes n\,dS.$$

Similarly, on applying Eqn. (1.314) to ∇ · T, we get the vector equation

$$\int_V \nabla\cdot T\,dV = \int_S Tn\,dS. \tag{1.316}$$

Note that the divergence theorem is applicable even for multiply connected domains provided the surfaces are closed.

Stokes' theorem

Let C be a contour, and S be the area of any arbitrary surface enclosed by the contour C. Then

$$\oint_C u\cdot dx = \int_S (\nabla\times u)\cdot n\,dS, \tag{1.317}$$

$$\oint_C u\times dx = \int_S \left[(\nabla\cdot u)n - (\nabla u)^Tn\right]dS. \tag{1.318}$$

In what follows, we assume n to be $C^0$ continuous over the entire surface S (so that ∇ · n makes sense). By taking u as w × n in Eqn. (1.317), using (w × n) · λ = w · (n × λ), where λ = dx/ds is the unit tangent to C, and then applying Eqn. (1.354), we get

$$\oint_C w\cdot(n\times\lambda)\,ds = \int_S\left[(\nabla\cdot n)(w\cdot n) - \nabla\cdot w + n\cdot(\nabla w)n\right]dS. \tag{1.319}$$

Thus, for a closed surface S, we get

$$\int_S (\nabla\cdot n)(w\cdot n)\,dS = \int_S [I - n\otimes n] : \nabla w\,dS. \tag{1.320}$$





By choosing w to be a constant vector c in the above equation, we get $c\cdot\int_S(\nabla\cdot n)\,n\,dS = 0$. Since the choice of c is arbitrary, we obtain

$$\int_S \kappa\,n\,dS = 0, \tag{1.321}$$

where κ = ∇ · n is the curvature. For a more detailed discussion of this result, see [22]. By choosing w = x, w = c × x, w = (x · x)c and w = (x ⊗ x)c in Eqn. (1.320), we get

$$\int_S \kappa(x\cdot n)\,dS = 2S, \tag{1.322a}$$

$$\int_S \kappa(x\times n)\,dS = 0, \tag{1.322b}$$

$$\int_S \kappa(x\cdot x)\,n\,dS = 2\int_S [I - n\otimes n]\,x\,dS, \tag{1.322c}$$

$$\int_S \kappa(x\cdot n)\,x\,dS = \int_S [3I - n\otimes n]\,x\,dS. \tag{1.322d}$$
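As a simple check of Eqns. (1.321) and (1.322a), consider a sphere of radius a centered at the origin (the symbol a is introduced here only for this illustration). On its surface n = x/a, so that x · n = a and κ = ∇ · n = 2/a. Then

$$\int_S \kappa(x\cdot n)\,dS = \frac{2}{a}\,a\,(4\pi a^2) = 2(4\pi a^2) = 2S, \qquad \int_S \kappa\,n\,dS = \frac{2}{a}\int_S n\,dS = 0,$$

where the last integral vanishes by symmetry. Since [I − n ⊗ n]x = 0 and x × n = 0 on the sphere, both sides of Eqns. (1.322b) and (1.322c) vanish as well.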

Note with relation to Eqns. (1.321) and (1.322b) that $\int_S n\,dS$ and $\int_S (x\times n)\,dS$ are also zero, as can be seen by using the divergence theorem.

Potential Theorems

1. Let V be a simply connected region, and let u = ∇φ be a vector field on V. Then ∇ × u = 0. Conversely, if ∇ × u = 0, then there exists a scalar potential φ such that u = ∇φ.
We have already proved the forward assertion (see Eqn. (1.269)). To prove the converse, we note from Eqn. (1.317) that $\oint_C u\cdot dx = 0$, and thus, the value of $\int_{x_0}^{x} u(y)\cdot dy$ evaluated along any curve in V joining a fixed point $x_0$ and x, just depends on x. Let

$$\phi(x) = \int_{x_0}^{x} u(y)\cdot dy.$$

Then u = ∇φ.

2. If T is a tensor field such that ∇ × T = 0, then there exists a vector field u such that T = ∇u.
To see this, using the definition of the curl of a tensor, we have $\nabla\times(T^Tw) = (\nabla\times T)w = 0$, where w is a constant unit vector. By the above result this implies that there exists a scalar field φ such that $T^Tw = \nabla\phi$, or, alternatively, T = w ⊗ ∇φ = ∇(φw). Letting u = φw, we get the desired result.

3. If T is a tensor field such that ∇ × T = 0, and in addition, if tr T = 0, then there exists W ∈ Skw such that T = ∇ × W.
To prove this, we take the trace of the relation T = ∇u to get ∇ · u = 0. Let $W^T = -W$ be the skew-symmetric tensor whose axial vector is u. Then by Eqn. (1.272), we get T = ∇u = ∇ × W.

4. If v and T are vector and tensor fields such that

$$\int_S v\cdot n\,dS = 0, \qquad \int_S T^Tn\,dS = 0,$$

for every closed surface S in V, then there exist vector and tensor fields u and U such that v = ∇ × u and T = ∇ × U, respectively.

To prove the second statement from the first, take the dot product of the equation $\int_S T^Tn\,dS = 0$ with a constant unit vector w. Then, by the definition of the transpose, we get $\int_S Tw\cdot n\,dS = 0$, which by virtue of the first result implies the existence of a vector field u such that Tw = ∇ × u, or, alternatively, T = (∇ × u) ⊗ w = ∇ × U, where U = w ⊗ u.

If the region V does not contain any ‘holes’ in the interior (i.e., the boundary of V is comprised of only the outer surface), then by means of the divergence theorem, we conclude that $\int_S v\cdot n\,dS = 0$ for every closed surface S is equivalent to ∇ · v = 0. Thus, for a vector field v that satisfies ∇ · v = 0 on such a body, there exists a vector field u such that v = ∇ × u.

Now we state the localization theorem, which is used in Chapter 3 to obtain the differential equations from the integral form of the governing equations.

Theorem 1.11.1 (Localization theorem). Let φ be a continuous scalar, vector or tensor field on V. Then for any given $x_0 \in V$,

$$\phi(x_0) = \lim_{r\to 0}\frac{1}{V(B_r)}\int_{B_r}\phi\,dV,$$

where $B_r$ is the closed ball of radius r > 0 centered at $x_0$, and $V(B_r)$ denotes its volume. It follows that if $\int_B \phi\,dV = 0$ for every closed ball B, then φ = 0.

Proof. We have

$$\begin{aligned}
\left|\phi(x_0) - \frac{1}{V(B_r)}\int_{B_r}\phi\,dV\right| &= \frac{1}{V(B_r)}\left|\int_{B_r}[\phi(x_0) - \phi(x)]\,dV\right| \\
&\leq \frac{1}{V(B_r)}\int_{B_r}|\phi(x_0) - \phi(x)|\,dV \\
&\leq \sup_{x\in B_r}|\phi(x_0) - \phi(x)|,
\end{aligned}$$

which tends to zero as r → 0 since φ is continuous.

Note that there is no localization theorem for surfaces, i.e., $\int_S \phi\,dS = 0$ for every closed surface S does not imply that φ = 0. To see this, take, for example, φ = a · n, where a is a constant vector.

1.12 Groups

The treatment in this section is based on [35]. A group is a set, say G, equipped with a function from G × G to G, called combination and denoted by

$$(a, b) \to a * b,$$

which satisfies the following axioms:

1. Associativity: For all a, b, c ∈ G

$$(a * b) * c = a * (b * c).$$

2. Existence of a neutral element: There exists n ∈ G such that

$$n * a = a * n = a \quad \forall a \in G.$$

3. Existence of reverse elements: For each a ∈ G, there exists $\overset{r}{a} \in G$ such that

$$\overset{r}{a} * a = a * \overset{r}{a} = n.$$

If, in addition, the commutative property

$$a * b = b * a \quad \forall a, b \in G$$

is satisfied, then G is said to be a commutative or Abelian group.

Since the combination function in the definition of a group may be thought of as taking two elements from G and producing an element also in G, it is often referred to as a closed binary operation. The closure property, i.e., the fact that the function value a ∗ b is also in G, is often called the fundamental closure property.

As an example, the set of real numbers ℜ is a group with respect to addition. We write a + b instead of a ∗ b, i.e., the combination function is defined on ℜ × ℜ to ℜ by

$$(a, b) \to a + b.$$

The three defining axioms are satisfied since

$$\begin{aligned}
(a + b) + c &= a + (b + c) \quad \forall a, b, c \in \Re, \\
0 + a &= a + 0 = a \quad \forall a \in \Re, \\
(-a) + a &= a + (-a) = 0 \quad \forall a \in \Re.
\end{aligned}$$

Thus, 0 is the neutral element, and reverse elements are the negatives. In fact, ℜ is a commutative group since

$$a + b = b + a \quad \forall a, b \in \Re.$$

We have the following useful results:

Theorem 1.12.1. Let G be a group. Then for any a, b ∈ G, there exists a unique x ∈ G, such that a ∗ x = b. In fact, $x = \overset{r}{a} * b$. Similarly, there exists a unique y ∈ G such that y ∗ a = b. In fact, $y = b * \overset{r}{a}$.

Proof. Suppose, tentatively, that there exists an x ∈ G such that a ∗ x = b. Then combining on the left with $\overset{r}{a}$, we have

$$\begin{aligned}
\overset{r}{a} * b &= \overset{r}{a} * (a * x) \\
&= (\overset{r}{a} * a) * x && \text{(associativity)} \\
&= n * x && \text{(reverse)} \\
&= x. && \text{(neutral)}
\end{aligned}$$

Thus, if x exists, it must be precisely $\overset{r}{a} * b$. This establishes the uniqueness. To prove existence, one must show that this x has the desired property. First of all, we note that by closure $\overset{r}{a} * b \in G$. Next, consider

$$\begin{aligned}
a * (\overset{r}{a} * b) &= (a * \overset{r}{a}) * b && \text{(associativity)} \\
&= n * b && \text{(reverse)} \\
&= b. && \text{(neutral)}
\end{aligned}$$

Hence, $x = \overset{r}{a} * b$ does the job. The result for y can be proved similarly.

Theorem 1.12.2 (Cancellation Property). Let G be a group. Then for a, b, c ∈ G

$$\begin{aligned}
b * a = c * a &\implies b = c, \\
a * b = a * c &\implies b = c.
\end{aligned}$$

Proof. We have

$$(b * a) * \overset{r}{a} = (c * a) * \overset{r}{a}.$$

Using the associativity, reverse and neutral properties, we get

$$b = c.$$

The second statement can be proved in a similar manner.

Theorem 1.12.3. A given group has only one neutral element.

Proof. Suppose n and n′ are both neutral elements for a group G, i.e., n, n′ ∈ G and for all a ∈ G

$$a * n = n * a = a \quad\text{and}\quad n' * a = a * n' = a.$$

Consider a ∗ n′ = a. We have

$$n' = \overset{r}{a} * a = n.$$





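Theorem 1.12.4. A given element of a group has only one reverse.

Proof. Let a ∈ G. Suppose $\overset{r}{a}$ and $\overset{r}{a}{}'$ are both reverses of a, i.e.,

$$\overset{r}{a} * a = a * \overset{r}{a} = n \quad\text{and}\quad \overset{r}{a}{}' * a = a * \overset{r}{a}{}' = n.$$

Consider $a * \overset{r}{a}{}' = n$. We have

$$\overset{r}{a}{}' = \overset{r}{a} * n = \overset{r}{a}.$$

Theorem 1.12.5. Let n be a neutral element of a group. Then

$$\overset{r}{n} = n.$$

Proof. For every element a ∈ G, the reverse axiom requires $a * \overset{r}{a} = n$. Taking a = n, we get $n * \overset{r}{n} = n$. Hence,

$$\overset{r}{n} = \overset{r}{n} * n = n.$$

Theorem 1.12.6. Let a be any element of a group. Then

$$\overset{r}{\overset{r}{a}} = a.$$

Proof. We have $\overset{r}{a} * a = n$. Hence,

$$a = \overset{r}{\overset{r}{a}} * n = \overset{r}{\overset{r}{a}}.$$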

Theorem 1.12.7. Let a and b be any two elements of a group. Then

$$\overset{r}{(a * b)} = \overset{r}{b} * \overset{r}{a}.$$

Proof. We have

$$\begin{aligned}
(a * b) * (\overset{r}{b} * \overset{r}{a}) &= a * (b * (\overset{r}{b} * \overset{r}{a})) && \text{(associativity)} \\
&= a * ((b * \overset{r}{b}) * \overset{r}{a}) && \text{(associativity)} \\
&= a * (n * \overset{r}{a}) && \text{(reverse)} \\
&= a * \overset{r}{a} && \text{(neutral)} \\
&= n. && \text{(reverse)}
\end{aligned}$$

In the same way

$$(\overset{r}{b} * \overset{r}{a}) * (a * b) = n.$$

Thus, $\overset{r}{b} * \overset{r}{a}$ is a reverse element of a ∗ b. By the uniqueness of the reverse (Theorem 1.12.4), it is the reverse.
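The axioms and Theorem 1.12.7 can be checked by brute force on a small non-commutative group. The sketch below uses the six permutations of three objects under composition; this example and the function names `combine` and `reverse` are illustrative choices, not taken from the text.

```python
from itertools import permutations

# The six permutations of (0, 1, 2); a permutation a maps i to a[i].
G = list(permutations(range(3)))
n = (0, 1, 2)                                # neutral element (identity permutation)

def combine(a, b):
    """Combination a * b: apply b first, then a."""
    return tuple(a[b[i]] for i in range(3))

def reverse(a):
    """Reverse element of a, i.e., the permutation that undoes a."""
    r = [0, 0, 0]
    for i, ai in enumerate(a):
        r[ai] = i
    return tuple(r)

closed = all(combine(a, b) in G for a in G for b in G)
associative = all(combine(combine(a, b), c) == combine(a, combine(b, c))
                  for a in G for b in G for c in G)
neutral_ok = all(combine(n, a) == a == combine(a, n) for a in G)
reverse_ok = all(combine(reverse(a), a) == n == combine(a, reverse(a)) for a in G)

# Theorem 1.12.7: the reverse of a * b is (reverse of b) * (reverse of a).
reverse_of_product_ok = all(reverse(combine(a, b)) == combine(reverse(b), reverse(a))
                            for a in G for b in G)
commutative = all(combine(a, b) == combine(b, a) for a in G for b in G)

print(closed, associative, neutral_ok, reverse_ok, reverse_of_product_ok, commutative)
```

Because this group is not commutative, the order of the factors in Theorem 1.12.7 matters: replacing the right-hand side by the combination of the reverses in the original order fails for some pairs.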

In general, subsets of groups will not be groups, e.g., the subset G − {n}. We introduce the following terminology:

Definition: Let H be a nonempty [proper] subset of a group G. Then, if H is a group with respect to the combination function of G, it is called a [proper] subgroup of G.⁴

Since H × H ⊂ G × G, the combination function for H is the restriction of the combination function for G to H × H. However, the function values, which necessarily belong to G, are not automatically in H, i.e., for an arbitrary subset, closure could fail. An example of a subgroup of a group G with neutral element n is the singleton set {n}. The following theorem provides a sufficient condition for a subset to be a subgroup.

Theorem 1.12.8. Let H be a nonempty subset of a group G. If for all a, b ∈ H, $a * \overset{r}{b} \in H$, then H is a subgroup of G.

Proof. As noted, the combination function for G makes sense for H. Hence, the associativity requirement is automatically satisfied. We have to show that the neutral and reverse elements of all a ∈ H lie in H, and that H is closed under the combination operation.

Assume that for a, b ∈ H

$$a * \overset{r}{b} \in H. \tag{1.323}$$

Choosing b = a, we get $a * \overset{r}{a} \in H$. But $a * \overset{r}{a} = n$. Thus, n ∈ H.

Next, choose a = n. Then, we have $n * \overset{r}{b} \in H$. But $n * \overset{r}{b} = \overset{r}{b}$. Thus, b ∈ H implies $\overset{r}{b} \in H$. So H contains the reverses of all its elements. The reverses necessarily exist because of the properties of the ‘parent’ group G.

Finally, let a, b ∈ H. We have shown that $\overset{r}{b} \in H$. Letting $\overset{r}{b}$ play the role of b in Eqn. (1.323), we have

$$a * \overset{r}{\overset{r}{b}} \in H.$$

But $\overset{r}{\overset{r}{b}} = b$. Thus, a ∗ b ∈ H, and this proves that H is closed under the combination operation.
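The one-step test of Theorem 1.12.8 is easy to apply to finite examples. The sketch below, again using permutations of three objects as an assumed illustrative group, applies the test to three subsets; the function name `passes_subgroup_test` is hypothetical.

```python
from itertools import permutations

G = set(permutations(range(3)))              # parent group: permutations of (0, 1, 2)
n = (0, 1, 2)                                # neutral element

def combine(a, b):
    """Combination a * b: apply b first, then a."""
    return tuple(a[b[i]] for i in range(3))

def reverse(a):
    """Reverse element of a."""
    r = [0, 0, 0]
    for i, ai in enumerate(a):
        r[ai] = i
    return tuple(r)

def passes_subgroup_test(H):
    """Condition of Theorem 1.12.8: H nonempty and a * reverse(b) in H for all a, b in H."""
    return len(H) > 0 and all(combine(a, reverse(b)) in H for a in H for b in H)

cyclic = {(0, 1, 2), (1, 2, 0), (2, 0, 1)}   # cyclic shifts
two_swaps = {(0, 1, 2), (1, 0, 2), (0, 2, 1)}  # identity and two transpositions
without_neutral = G - {n}                    # the subset G - {n} mentioned in the text

print(passes_subgroup_test(cyclic))          # True: a subgroup
print(passes_subgroup_test({n}))             # True: the singleton {n}
print(passes_subgroup_test(two_swaps))       # False: closure fails
print(passes_subgroup_test(without_neutral)) # False: the neutral element is missing
```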

⁴ The bracket device above is used to make two statements simultaneously. To get the first statement, include the bracketed material. To get the second, leave out the bracketed material.





We now come to a result that plays a key role in the classification of materials. First, we define the set of unimodular tensors Unim, and the set of proper unimodular tensors Unim⁺ as

$$\begin{aligned}
\mathrm{Unim} &:= \{H : |\det H| = 1\}, \\
\mathrm{Unim}^+ &:= \{H : \det H = 1\}.
\end{aligned}$$

Theorem 1.12.9. Unim and Unim⁺ are groups with respect to tensor multiplication, and Unim⁺ is a proper subgroup of Unim. Hence, Unim is called the unimodular group of tensors, while Unim⁺ is called the proper unimodular group.

Proof. Closure holds since $H_1, H_2 \in \mathrm{Unim}$ implies that $|\det(H_1H_2)| = |\det H_1|\,|\det H_2| = 1$. Associativity follows since tensor multiplication is associative. The reverse element is simply the inverse, while the neutral element is the identity tensor. Similar arguments can be made for Unim⁺. Unim⁺ is a proper subgroup of Unim, since there are elements in Unim which are not in Unim⁺, e.g., diag[−1, 1, 1].

Theorem 1.12.10. The orthogonal group Orth is a proper subgroup of the unimodular group Unim, and the proper orthogonal group Orth⁺ is a proper subgroup of the proper unimodular group Unim⁺.

Proof. The closure property of Orth follows since $Q_1, Q_2 \in \mathrm{Orth}$ implies that $Q_1Q_2 \in \mathrm{Orth}$. The other properties are proved as in the previous proof. Orth is a proper subgroup since there are unimodular tensors that are not orthogonal, e.g.,

$$\begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.$$

A similar proof can be given for Orth⁺.

Before stating the next theorem, we prove the following lemma:Lemma: Let S0 ∈ Psym ∩Unim+, and let it have at least two distinct eigenvalues, say λu

and λl , with λu > λl . Then, for every ξ ∈[(

λlλu

)2,(

λuλl

)2], there exists R ∈ Orth+ such

that the symmetric and proper unimodular tensor

T = S−10 RS2

0R−1S−10

has the spectrum ξ, ξ−1, 1.Proof. Since S0 ∈ Psym ∩Unim+, all the eigenvalues of S0 are positive, and we can writethe spectral resolution of S0 as

S0 = λue∗1 ⊗ e∗1 + λle∗2 ⊗ e∗2 +

1λlλu

e∗3 ⊗ e∗3 .



https://doi.org/10.1017/CBO9781316134054.002


ii

“CM˙Final” — 2015/3/13 — 11:09 — page 125 — #144 ii

ii

ii


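Let $R(\theta)$ be the usual rotation matrix about the $e_3^*$ axis, i.e.,

\[
R(\theta) = \begin{bmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}.
\]

In terms of its action on the basis $e_1^*, e_2^*, e_3^*$, we have

\[
R(\theta)e_1^* = \cos\theta\, e_1^* + \sin\theta\, e_2^*, \qquad
R(\theta)e_2^* = -\sin\theta\, e_1^* + \cos\theta\, e_2^*, \qquad
R(\theta)e_3^* = e_3^*.
\]

Note that $R^{-1}(\theta) = R^T(\theta) = R(-\theta)$. Now, consider

\[
T(\theta) = S_0^{-1} R(\theta) S_0^2 R^{-1}(\theta) S_0^{-1}. \tag{1.324}
\]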

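That $T(\theta)$ is symmetric and proper unimodular follows from straightforward calculations. Hence, its eigenvalues are all real. Moreover, +1 is an eigenvalue of $T(\theta)$ for all $\theta$ since

\[
\begin{aligned}
T(\theta)e_3^* &= S_0^{-1} R(\theta) S_0^2 R(-\theta) S_0^{-1} e_3^* \\
&= \lambda_l\lambda_u\, S_0^{-1} R(\theta) S_0^2 R(-\theta) e_3^* \\
&= \lambda_l\lambda_u\, S_0^{-1} R(\theta) S_0^2 e_3^* \\
&= \frac{1}{\lambda_l\lambda_u}\, S_0^{-1} R(\theta) e_3^* \\
&= \frac{1}{\lambda_l\lambda_u}\, S_0^{-1} e_3^* \\
&= e_3^*.
\end{aligned}
\]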

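For $\theta = 0$, we have $R = I$, which when substituted into Eqn. (1.324) yields $T = I$, i.e., the spectrum of $T(0)$ is 1, 1, 1. At $\theta = \pi/2$, we have $R(\pi/2)e_1^* = e_2^*$, $R(\pi/2)e_2^* = -e_1^*$, and $R(\pi/2)e_3^* = e_3^*$. Hence,

\[
\begin{aligned}
T\!\left(\tfrac{\pi}{2}\right)e_1^* &= S_0^{-1} R\!\left(\tfrac{\pi}{2}\right) S_0^2 R\!\left(-\tfrac{\pi}{2}\right) S_0^{-1} e_1^* \\
&= \frac{1}{\lambda_u}\, S_0^{-1} R\!\left(\tfrac{\pi}{2}\right) S_0^2 R\!\left(-\tfrac{\pi}{2}\right) e_1^* \\
&= -\frac{1}{\lambda_u}\, S_0^{-1} R\!\left(\tfrac{\pi}{2}\right) S_0^2 e_2^* \\
&= -\frac{\lambda_l^2}{\lambda_u}\, S_0^{-1} R\!\left(\tfrac{\pi}{2}\right) e_2^* \\
&= \frac{\lambda_l^2}{\lambda_u}\, S_0^{-1} e_1^* \\
&= \frac{\lambda_l^2}{\lambda_u^2}\, e_1^*.
\end{aligned}
\]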




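Similarly, $T(\pi/2)e_2^* = (\lambda_u^2/\lambda_l^2)\, e_2^*$. Thus, the ordered spectrum of $T(\pi/2)$ is

\[
\frac{\lambda_l^2}{\lambda_u^2}, \quad 1, \quad \frac{\lambda_u^2}{\lambda_l^2}.
\]

Since +1 is always an eigenvalue and $\det T(\theta) = 1$, the remaining two eigenvalues of $T(\theta)$ are reciprocals of each other. Since $T(\theta)$ depends continuously on $\theta$, these eigenvalues vary continuously from 1, 1 at $\theta = 0$ to $\lambda_l^2/\lambda_u^2$, $\lambda_u^2/\lambda_l^2$ at $\theta = \pi/2$, so that given any $\xi \in \left[ \lambda_l^2/\lambda_u^2, \lambda_u^2/\lambda_l^2 \right]$, there exists $\theta \in [0, \pi/2]$ for which the lemma holds. Thus, if $e_i'$ are the eigenvectors of $T$, then the spectral resolution of $T$ is

\[
T = \xi\, e_1' \otimes e_1' + \frac{1}{\xi}\, e_2' \otimes e_2' + e_3' \otimes e_3', \tag{1.325}
\]

where $\xi \in \left[ \lambda_l^2/\lambda_u^2, \lambda_u^2/\lambda_l^2 \right]$. Now we are in a position to prove the main result of this section. The proof that we present is due to Noll (see [237], page 200).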
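The existence argument in the Lemma can be illustrated numerically. The sketch below works in the eigenbasis of $S_0$, builds $R(\theta)$ from its stated action on the basis vectors, and locates by bisection an angle at which the larger in-plane eigenvalue of $T(\theta)$ equals a prescribed $\xi \ge 1$ (a value $\xi < 1$ gives the same spectrum as $1/\xi$). The function names and the sample values $\lambda_u = 2$, $\lambda_l = 1/2$ are assumptions chosen for illustration only.

```python
import math


def matmul(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def diag(a, b, c):
    return [[a, 0.0, 0.0], [0.0, b, 0.0], [0.0, 0.0, c]]


def rotation(theta):
    """Matrix whose columns are the images R e1*, R e2*, R e3*."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def T_of_theta(lam_u, lam_l, theta):
    """T = S0^-1 R S0^2 R^-1 S0^-1 in the eigenbasis of S0."""
    lam_3 = 1.0 / (lam_u * lam_l)          # third eigenvalue, since det S0 = 1
    S_inv = diag(1.0 / lam_u, 1.0 / lam_l, 1.0 / lam_3)
    S_sq = diag(lam_u ** 2, lam_l ** 2, lam_3 ** 2)
    T = S_inv
    for M in (rotation(theta), S_sq, rotation(-theta), S_inv):
        T = matmul(T, M)
    return T


def in_plane_eigenvalue(T):
    """Larger eigenvalue of the 2x2 block of T acting on span{e1*, e2*}."""
    tr = T[0][0] + T[1][1]
    det = T[0][0] * T[1][1] - T[0][1] * T[1][0]
    return 0.5 * (tr + math.sqrt(max(tr * tr - 4.0 * det, 0.0)))


def find_theta(lam_u, lam_l, xi):
    """Bisection on [0, pi/2] for the angle where the eigenvalue equals xi."""
    lo, hi = 0.0, math.pi / 2
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if in_plane_eigenvalue(T_of_theta(lam_u, lam_l, mid)) < xi:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


theta = find_theta(2.0, 0.5, 3.0)
print(theta, in_plane_eigenvalue(T_of_theta(2.0, 0.5, theta)))
```

Bisection only needs continuity and a sign change between the endpoints, which is exactly what the proof uses.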

Theorem 1.12.11. (Maximality of the Orthogonal Group in the Unimodular Group). If G is a group with respect to tensor multiplication such that

Orth ⊂ G ⊂ Unim,

then either G = Orth or G = Unim. In other words, there is no group between the orthogonal group and the unimodular group. The corresponding result for Orth+ and Unim+ is that Orth+ is maximal in Unim+.

Proof. There are two possibilities:

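1. G contains a tensor $S_0 \in \mathrm{Psym} \cap \mathrm{Unim}$ with at least two distinct eigenvalues, say $\lambda_u > \lambda_l$.

2. G does not contain such a tensor, i.e., the only tensor from Psym ∩ Unim that it contains is the identity tensor.

Consider the first possibility. Using the polar decomposition, any tensor $H \in \mathrm{Unim}$ can be decomposed as $H = QU$, where $Q$ is an orthogonal tensor and $U \in \mathrm{Psym} \cap \mathrm{Unim}$. Hence, if $e_i$ are the eigenvectors of $U$, we can write the spectral resolution of $U$ as

\[
U = \gamma_1\, e_1 \otimes e_1 + \gamma_2\, e_2 \otimes e_2 + \frac{1}{\gamma_1\gamma_2}\, e_3 \otimes e_3,
\]

where $\gamma_1, \gamma_2 > 0$. We can decompose $U$ as

\[
U = U_1 U_2,
\]

where

\[
\begin{aligned}
U_1 &= \gamma_1\, e_1 \otimes e_1 + \frac{1}{\gamma_1}\, e_2 \otimes e_2 + e_3 \otimes e_3, \\
U_2 &= e_1 \otimes e_1 + \gamma_1\gamma_2\, e_2 \otimes e_2 + \frac{1}{\gamma_1\gamma_2}\, e_3 \otimes e_3.
\end{aligned}
\]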




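Now choose integers $m_1, m_2$ large enough so that $\xi_1 := \gamma_1^{1/m_1}$ and $\xi_2 := (\gamma_1\gamma_2)^{1/m_2}$ satisfy

\[
\frac{\lambda_l^2}{\lambda_u^2} \le \xi_i \le \frac{\lambda_u^2}{\lambda_l^2}.
\]

(Choosing $m_1 \ge \left| \ln\gamma_1 / [2\ln(\lambda_u/\lambda_l)] \right|$ and $m_2 \ge \left| \ln(\gamma_1\gamma_2) / [2\ln(\lambda_u/\lambda_l)] \right|$ does the job.) By the Lemma, there exist tensors $T_1$ and $T_2$ whose spectral resolution is given by

\[
\begin{aligned}
T_1 &= \xi_1\, \bar{e}_1 \otimes \bar{e}_1 + \frac{1}{\xi_1}\, \bar{e}_2 \otimes \bar{e}_2 + \bar{e}_3 \otimes \bar{e}_3, \\
T_2 &= e_1' \otimes e_1' + \xi_2\, e_2' \otimes e_2' + \frac{1}{\xi_2}\, e_3' \otimes e_3'.
\end{aligned}
\]

Thus,

\[
\begin{aligned}
T_1^{m_1} &= \gamma_1\, \bar{e}_1 \otimes \bar{e}_1 + \frac{1}{\gamma_1}\, \bar{e}_2 \otimes \bar{e}_2 + \bar{e}_3 \otimes \bar{e}_3, \\
T_2^{m_2} &= e_1' \otimes e_1' + \gamma_1\gamma_2\, e_2' \otimes e_2' + \frac{1}{\gamma_1\gamma_2}\, e_3' \otimes e_3'.
\end{aligned}
\]

Let $R_1$ and $R_2$ be the two proper orthogonal tensors that rotate $\bar{e}_i$ and $e_i'$ into $e_i$. Then, we have

\[
\begin{aligned}
R_1 T_1^{m_1} R_1^T &= R_1 \left( \gamma_1\, \bar{e}_1 \otimes \bar{e}_1 + \frac{1}{\gamma_1}\, \bar{e}_2 \otimes \bar{e}_2 + \bar{e}_3 \otimes \bar{e}_3 \right) R_1^T \\
&= \gamma_1 (R_1\bar{e}_1) \otimes (R_1\bar{e}_1) + \frac{1}{\gamma_1} (R_1\bar{e}_2) \otimes (R_1\bar{e}_2) + (R_1\bar{e}_3) \otimes (R_1\bar{e}_3) \\
&= \gamma_1\, e_1 \otimes e_1 + \frac{1}{\gamma_1}\, e_2 \otimes e_2 + e_3 \otimes e_3 \\
&= U_1.
\end{aligned}
\]

Similarly,

\[
R_2 T_2^{m_2} R_2^T = U_2.
\]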

Hence,

\[
\begin{aligned}
H &= QU \\
&= Q U_1 U_2 \\
&= Q \left( R_1 T_1^{m_1} R_1^T \right) \left( R_2 T_2^{m_2} R_2^T \right) \\
&= Q R_1 \left[ S_0^{-1} R(\theta_1) S_0^2 R^{-1}(\theta_1) S_0^{-1} \right]^{m_1} R_1^T R_2 \left[ S_0^{-1} R(\theta_2) S_0^2 R^{-1}(\theta_2) S_0^{-1} \right]^{m_2} R_2^T,
\end{aligned}
\tag{1.326}
\]

for some $\theta_1, \theta_2 \in [0, \pi/2]$. Thus, any $H \in \mathrm{Unim}$ can be expressed in terms of powers of $S_0 \in \mathrm{Psym} \cap \mathrm{Unim}$, $Q \in \mathrm{Orth}$, and $R(\theta_1), R(\theta_2), R_1, R_2 \in \mathrm{Orth}^+$. But Orth ⊂ G and $S_0 \in G$, and since G is a group, any element generated by powers of $S_0$ and elements of Orth has to lie in G. Thus, we have Unim ⊂ G. But, by hypothesis, G ⊂ Unim. Hence, for the case when $S_0$ has at least two distinct eigenvalues, G = Unim.

Now consider the case when the only member from Psym ∩ Unim in G is the identity tensor. For any $H \in G$, we have $H = QU$, or, alternatively, $U = Q^{-1}H$, where $Q \in \mathrm{Orth} \subset G$, and $U \in \mathrm{Psym} \cap \mathrm{Unim}$. Since G is a group, and since $H, Q \in G$, we have $U = Q^{-1}H \in G$. However, since the only member from Psym ∩ Unim in G is $I$, we have $U = I$. Therefore, $H = Q \in \mathrm{Orth}$, i.e., G ⊂ Orth. But, by hypothesis, Orth ⊂ G. Hence, in this case, we have G = Orth.

To prove that Orth+ is maximal in Unim+, note that if $H \in \mathrm{Unim}^+$, then $Q$ in Eqn. (1.326) belongs to Orth+. Thus, any $H \in \mathrm{Unim}^+$ can be expressed in terms of powers of $S_0 \in \mathrm{Psym} \cap \mathrm{Unim}$ and tensors in Orth+, and the result follows.
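The choice of the exponents $m_1, m_2$ in the proof can be made concrete. The following sketch computes the smallest admissible integer for a given stretch $\gamma$ and confirms that the corresponding root falls inside the interval covered by the Lemma. The function names and the sample values ($\lambda_u = 1.5$, $\lambda_l = 1$, $\gamma = 50$) are assumptions for illustration.

```python
import math


def smallest_exponent(gamma, lam_u, lam_l):
    """Smallest positive integer m with |ln(gamma)| / m <= 2 ln(lam_u / lam_l)."""
    bound = 2.0 * math.log(lam_u / lam_l)
    return max(1, math.ceil(abs(math.log(gamma)) / bound))


def root_in_window(gamma, lam_u, lam_l):
    """Return (m, xi, ok), where xi = gamma**(1/m) and ok flags whether
    (lam_l/lam_u)^2 <= xi <= (lam_u/lam_l)^2."""
    m = smallest_exponent(gamma, lam_u, lam_l)
    xi = gamma ** (1.0 / m)
    lower, upper = (lam_l / lam_u) ** 2, (lam_u / lam_l) ** 2
    return m, xi, lower <= xi <= upper


m, xi, ok = root_in_window(50.0, 1.5, 1.0)
print(m, xi, ok)
```

A large stretch such as $\gamma = 50$ is thus reached by repeating a small admissible stretch $\xi$ several times, which is why only powers of tensors of the form given in the Lemma are needed.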

EXERCISES

1. Show that

\[
(w \times u) \times (w \times v) = [w \otimes w](u \times v), \tag{1.327}
\]

\[
[u \times v,\; v \times w,\; w \times u] = [u, v, w]^2. \tag{1.328}
\]

From Eqn. (1.327), it follows that if $n$ is a unit vector, then

\[
[n,\; n \times u,\; n \times v] = [n, u, v].
\]
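A numerical spot check of Eqns. (1.327) and (1.328) can be useful before attempting the proof. The sketch below uses integer vectors so that the arithmetic is exact; the helper names and the sample vectors are illustrative assumptions, and a check on one set of vectors is of course not a proof.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def triple(a, b, c):
    """Scalar triple product [a, b, c] = a . (b x c)."""
    return dot(a, cross(b, c))


u, v, w = [1, 2, 3], [-2, 0, 5], [4, -1, 1]

# Eqn. (1.327): (w (x) w)(u x v) is the vector (w . (u x v)) w.
lhs_327 = cross(cross(w, u), cross(w, v))
rhs_327 = [dot(w, cross(u, v)) * wi for wi in w]

# Eqn. (1.328)
lhs_328 = triple(cross(u, v), cross(v, w), cross(w, u))
rhs_328 = triple(u, v, w) ** 2

print(lhs_327 == rhs_327, lhs_328 == rhs_328)
```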

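2. Which of the spaces Sym, Psym, Skw, Orth+ are linear subspaces of Lin? Justify. Evaluate if the matrices

\[
\begin{bmatrix} 0 & 1 & -1 \\ -1 & 0 & 0 \\ 1 & 0 & 0 \end{bmatrix}, \quad
\begin{bmatrix} 0 & 0 & 1 \\ 0 & 0 & -1 \\ -1 & 1 & 0 \end{bmatrix}, \quad
\begin{bmatrix} 0 & -1 & 0 \\ 1 & 0 & 1 \\ 0 & -1 & 0 \end{bmatrix},
\]

constitute a basis for Skw, and if the matrices

\[
\begin{bmatrix} 1 & 1 & 0 \\ 1 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \quad
\begin{bmatrix} 1 & 0 & 1 \\ 0 & 0 & 0 \\ 1 & 0 & 1 \end{bmatrix}, \quad
\begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 1 & 1 \end{bmatrix},
\]

constitute a basis for Sym. If not, then give a ‘canonical’ basis for Skw and Sym. Also give a basis for deviatoric symmetric tensors.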




3. Let $p_i$ and $q_i$, $i = 1, 2, \ldots, n$, be sets of vectors in $\Re^n$. Determine if the set $p_i \otimes q_i$, $i = 1, 2, \ldots, n$, is linearly dependent or independent if

(a) pi is linearly independent, and qi is linearly independent;

(b) pi is linearly independent, but qi is not;

(c) qi is linearly independent, but pi is not;

(d) pi is linearly dependent, and qi is also linearly dependent.

4. Let $W \in \mathrm{Skw}$. Determine if the set $I, W, W^2$ is linearly dependent or linearly independent. Deduce from this result and Eqn. (1.105a) if $I, Q, Q^T$, where $Q \in \mathrm{Orth}^+$, is linearly dependent or independent.

5. Show that the decomposition of a tensor into a symmetric and skew-symmetric part as given by Eqn. (1.33) is unique.

6. Show that $\operatorname{tr}(RS) = \operatorname{tr}(SR) = \operatorname{tr}(R^T S^T) = \operatorname{tr}(S^T R^T)$.

7. If $S$ and $W$ are symmetric and skew-symmetric tensors, respectively, and $T$ is an arbitrary second-order tensor, prove that

\[
S : T = S : T^T = S : \left[ \tfrac{1}{2}(T^T + T) \right], \tag{1.329}
\]

\[
W : T = -W : T^T = W : \left[ \tfrac{1}{2}(T - T^T) \right],
\]

\[
S : W = 0. \tag{1.330}
\]

8. Prove the following:
(i) If $A : B = 0$ for every symmetric tensor $B$, then $A \in \mathrm{Skw}$.
(ii) If $A : B = 0$ for every skew tensor $B$, then $A \in \mathrm{Sym}$.

9. Using Eqns. (1.35) and (1.52) prove that

cof (RS) = (cof R)(cof S),

det(RS) = (det R)(det S).

10. Using indicial notation prove that

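\[
\operatorname{tr}(\operatorname{cof} T) = \tfrac{1}{2}\left[ (\operatorname{tr} T)^2 - \operatorname{tr}(T^2) \right], \tag{1.331}
\]

\[
[(\operatorname{cof} T)u] \times v = T(u \times T^T v), \tag{1.332}
\]

\[
\operatorname{cof}(\operatorname{cof} T) = (\det T)\,T.
\]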

11. Show that an arbitrary tensor T cannot be represented as

αa⊗ b + βb⊗ c + γc⊗ a,

by showing, in particular, that the identity tensor I cannot be represented in this way.




12. Let T = u⊗ p + v⊗ q + w⊗ r. Show using Eqn. (1.60) that

cof T = (u× v)⊗ (p× q) + (v×w)⊗ (q× r) + (w× u)⊗ (r× p). (1.333)

Using Eqn. (1.64), deduce that

(u× v)⊗w + (v×w)⊗ u + (w× u)⊗ v = [u, v, w] I.

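Using Eqn. (1.53), show that Eqn. (1.328) is recovered for the case $[u, v, w] \neq 0$ (Eqn. (1.328) holds even when $[u, v, w] = 0$). Note that the principal invariants of $T$ are

\[
\begin{aligned}
I_1 &= \operatorname{tr} T = u \cdot p + v \cdot q + w \cdot r, \\
I_2 &= \operatorname{tr}(\operatorname{cof} T) = (u \times v) \cdot (p \times q) + (v \times w) \cdot (q \times r) + (w \times u) \cdot (r \times p), \\
I_3 &= \det T = [u, v, w]\,[p, q, r],
\end{aligned}
\]

and that $T^{-1}$ (when it exists) is given by $(\operatorname{cof} T)^T / \det T$.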

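13. Let $\lambda_i$, $i = 1, 2, \ldots, n$, denote the eigenvalues of $T$ for space-dimension $n$. Using Eqn. (J.11), show that the eigenvalues of $\operatorname{cof} T$ are $\prod_{j=1,\, j \neq i}^{n} \lambda_j$, $i = 1, 2, \ldots, n$. For the case $n = 3$, deduce Eqn. (1.331), $I_2(\operatorname{cof} T) = \operatorname{tr} T\,(\det T)$ and $I_3(\operatorname{cof} T) = (\det T)^2$. Now consider the case when $T$ is diagonalizable with distinct eigenvalues, so that $T = \sum_{i=1}^{n} \lambda_i P_i$ (thus, $T - \gamma I = \sum_{i=1}^{n} (\lambda_i - \gamma) P_i$; $\operatorname{tr} P_i = e^* \cdot e^* = 1$). Using Eqn. (J.11), show that

\[
\operatorname{cof} T = \sum_{i=1}^{n} \Bigl( \prod_{j=1,\, j \neq i}^{n} \lambda_j \Bigr) P_i^T,
\qquad
\operatorname{cof}(T - \gamma I) = \sum_{i=1}^{n} \Bigl[ \prod_{j=1,\, j \neq i}^{n} (\lambda_j - \gamma) \Bigr] P_i^T.
\]

Thus, for $\gamma = \lambda_1$ say, $\operatorname{cof}(T - \lambda_1 I) = \prod_{j=2}^{n} (\lambda_j - \lambda_1)\, P_1^T$, and $\operatorname{tr} \operatorname{cof}(T - \lambda_1 I) = \prod_{j=2}^{n} (\lambda_j - \lambda_1)$.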

14. If T , R and S are second-order tensors, then show that

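\[
\det T = \tfrac{1}{3} \operatorname{tr}\left[ (\operatorname{cof} T) T^T \right] = \tfrac{1}{3}\, T : \operatorname{cof} T, \tag{1.334}
\]

\[
\det(R + S) = \det R + \operatorname{cof} R : S + R : \operatorname{cof} S + \det S, \tag{1.335}
\]

\[
\begin{aligned}
\operatorname{cof}(R + S) ={}& \operatorname{cof} R + \operatorname{cof} S + [(\operatorname{tr} R)(\operatorname{tr} S) - \operatorname{tr}(RS)]\, I - (\operatorname{tr} R) S^T - (\operatorname{tr} S) R^T \\
&+ (SR)^T + (RS)^T.
\end{aligned}
\tag{1.336}
\]

By putting $R = I$ in Eqn. (1.336), deduce that

\[
\operatorname{cof}(I + S) = (1 + \operatorname{tr} S) I - S^T + \operatorname{cof} S. \tag{1.337}
\]

By putting $S = u \otimes v$ in the above equation and using Eqn. (1.84), deduce that

\[
\operatorname{cof}(I + u \otimes v) = (1 + u \cdot v) I - v \otimes u.
\]

Using Eqns. (1.335) and (1.337), show that for $W \in \mathrm{Skw}$,

\[
(I + W)^{-1} = \frac{I - W + w \otimes w}{1 + |w|^2}.
\]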
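The closing formula of Problem 14 can be sanity-checked by multiplying it with $I + W$. The sketch below uses exact rational arithmetic and assumes the convention that the axial vector $w$ of $W$ satisfies $Wu = w \times u$ (the same layout as the matrix $W$ written out in Problem 26); the helper names and the sample vector are illustrative.

```python
from fractions import Fraction

IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]


def skew(w):
    """Skew tensor W with axial vector w, i.e. W u = w x u (assumed convention)."""
    return [[0, -w[2], w[1]], [w[2], 0, -w[0]], [-w[1], w[0], 0]]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


w = [Fraction(1, 2), Fraction(-2), Fraction(3)]
W = skew(w)
norm_sq = sum(c * c for c in w)

I_plus_W = [[IDENTITY[i][j] + W[i][j] for j in range(3)] for i in range(3)]
# Candidate inverse (I - W + w (x) w) / (1 + |w|^2)
candidate = [[(IDENTITY[i][j] - W[i][j] + w[i] * w[j]) / (1 + norm_sq)
              for j in range(3)] for i in range(3)]

product = matmul(I_plus_W, candidate)
print(product == IDENTITY)
```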




15. Apply the Cayley–Hamilton theorem to $W \in \mathrm{Skw}$, and find the axial vector of $W^3$.

16. If $w$ is the axial vector of $W \in \mathrm{Skw}$, show that $w \cdot [u \times Wu] \ge 0 \;\forall u$, with equality if and only if $(u, w)$ are linearly dependent.

17. Let $W \in \mathrm{Skw}$, and let $w$ be its axial vector. Using Eqn. (1.332), show that $(\operatorname{cof} T)w$ is the axial vector of $TWT^T$. If $T \equiv Q \in \mathrm{Orth}^+$, then deduce using Eqn. (1.93) that $Qw$ is the axial vector of $QWQ^T$. Also, find the axial vector of $QW^3Q^T$.

18. Show that a (not necessarily symmetric) tensor $T$ commutes with every skew tensor $W$ if and only if $T = \lambda I$.

19. Either using the result of the previous problem, or independently, show that a (not necessarily symmetric) tensor $T$ commutes with every orthogonal tensor $Q$ if and only if $T = \lambda I$.

20. Show that a (not necessarily symmetric) tensor $T$ commutes with every symmetric tensor $S$ if and only if $T = \lambda I$.

21. Show that $(u, Tv) = 0 \;\forall\, u, v$ that are mutually orthogonal, if and only if $T = \lambda I$.

22. Let $a, b$ be arbitrary vectors, and $u, v$ be unit vectors such that $u \cdot v \neq -1$.

(a) Show that $a \times b$ is the axial vector of the skew tensor $b \otimes a - a \otimes b$. It follows that if $b \otimes a = a \otimes b$, then $a \times b = 0$, so that by Theorem 1.2.1, $a$ and $b$ are linearly dependent.

(b) Using (a) and Eqn. (1.104), show that the unique orthogonal tensor that rotates $u$ into $v$ about an axis perpendicular to $u$ and $v$ is given by

\[
R = I + r\, v \otimes u - s\,(u \otimes u + u \otimes v + v \otimes v),
\]

where $r = (1 + 2u \cdot v)/(1 + u \cdot v)$ and $s = 1/(1 + u \cdot v)$.

23. If $w$ is the axial vector of $W \in \mathrm{Skw}$, then show using Eqns. (1.54) and (1.327) (or Eqns. (1.60) and (1.90)) that $\operatorname{cof} W = w \otimes w$. Deduce using Eqn. (1.273) that

\[
e^{\operatorname{cof} W} = I + \frac{e^{|w|^2} - 1}{|w|^2}\, w \otimes w.
\]

The determinant of the above tensor is $e^{\operatorname{tr}(\operatorname{cof} W)} = e^{|w|^2}$.

24. Let $W_1$ and $W_2$ be skew-symmetric tensors with axial vectors $w_1$ and $w_2$. Show using Eqns. (1.21a) and (1.87) that

\[
W_1 W_2 = w_2 \otimes w_1 - (w_1 \cdot w_2) I, \tag{1.338}
\]

from which it follows that $W_1 W_2 - W_2 W_1 = w_2 \otimes w_1 - w_1 \otimes w_2$. By using the result of Problem 22a (or, independently, by means of Eqns. (1.21)), we see that $w_1 \times w_2$ is the axial vector of $W_1 W_2 - W_2 W_1$. If $w_1 = \alpha w_2$, where $\alpha \in \Re$, then, by Eqn. (1.90), $W_1 = \alpha W_2$, and hence $W_1 W_2 = W_2 W_1$. Conversely, if $W_1 W_2 = W_2 W_1$, then $w_1 \times w_2$, the axial vector of $W_1 W_2 - W_2 W_1$, is $0$, and by Theorem 1.2.1, it follows that $w_1$ and $w_2$ are linearly dependent. Thus, two skew-symmetric tensors $W_1$ and $W_2$ commute if and only if their axial vectors are linearly dependent.

It also follows from Eqn. (1.338) that

\[
W_1 W_2 + W_2 W_1 = w_1 \otimes w_2 + w_2 \otimes w_1 - 2(w_1 \cdot w_2) I.
\]

The eigenvalues/eigenvectors of the above symmetric tensor are

\[
\begin{aligned}
&(|w_1|\,|w_2| - w_1 \cdot w_2), &\quad& \frac{w_1}{|w_1|} + \frac{w_2}{|w_2|}, \\
&-(|w_1|\,|w_2| + w_1 \cdot w_2), &\quad& \frac{w_1}{|w_1|} - \frac{w_2}{|w_2|}, \\
&-2(w_1 \cdot w_2), &\quad& w_1 \times w_2.
\end{aligned}
\]

One can now easily compute the principal invariants of $W_1 W_2 + W_2 W_1$ using these eigenvalues.
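The statements of Problem 24 lend themselves to a quick exact check with integer vectors. The sketch below assumes the convention $Wu = w \times u$ for the axial vector; the helper names and the sample vectors are illustrative, not taken from the text.

```python
def skew(w):
    """Skew tensor W with axial vector w, i.e. W u = w x u (assumed convention)."""
    return [[0, -w[2], w[1]], [w[2], 0, -w[0]], [-w[1], w[0], 0]]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


w1, w2 = [1, -2, 3], [4, 0, -1]
W1, W2 = skew(w1), skew(w2)
d = sum(x * y for x, y in zip(w1, w2))          # w1 . w2

W1W2, W2W1 = matmul(W1, W2), matmul(W2, W1)

# Right-hand side of Eqn. (1.338): w2 (x) w1 - (w1 . w2) I
rhs_338 = [[w2[i] * w1[j] - (d if i == j else 0) for j in range(3)]
           for i in range(3)]

commutator = [[W1W2[i][j] - W2W1[i][j] for j in range(3)] for i in range(3)]
anticommutator = [[W1W2[i][j] + W2W1[i][j] for j in range(3)] for i in range(3)]

n = cross(w1, w2)
# Action of the anticommutator on w1 x w2; expected to equal -2 (w1 . w2) n.
action_on_n = [sum(anticommutator[i][j] * n[j] for j in range(3)) for i in range(3)]

print(W1W2 == rhs_338, commutator == skew(n), action_on_n == [-2 * d * c for c in n])
```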

25. For $W \in \mathrm{Skw}$, find the polar decomposition of $I + W$ in terms of $W$ and its axial vector $w$.

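26. Prove that there is a one-to-one correspondence between Skw and members of Orth+ with no eigenvalue equal to −1,⁵ by showing that

(a) For every $W \in \mathrm{Skw}$, $Q = (I - W)^{-1}(I + W) \in \mathrm{Orth}^+$. If $w$ is the axial vector of $W$, then for $n = 3$, show using Eqn. (1.66) and Problem 14 that the above expression simplifies to

\[
Q = \frac{1}{1 + w \cdot w} \left[ (1 + w \cdot w) I + 2W + 2W^2 \right], \tag{1.339}
\]

with $w/|w|$ as the axis. In expanded form, if

\[
W = \begin{bmatrix} 0 & -\gamma & \beta \\ \gamma & 0 & -\alpha \\ -\beta & \alpha & 0 \end{bmatrix},
\]

then

\[
Q = \frac{1}{1 + \alpha^2 + \beta^2 + \gamma^2}
\begin{bmatrix}
1 + \alpha^2 - \beta^2 - \gamma^2 & 2(\alpha\beta - \gamma) & 2(\beta + \alpha\gamma) \\
2(\alpha\beta + \gamma) & 1 - \alpha^2 + \beta^2 - \gamma^2 & -2(\alpha - \beta\gamma) \\
2(-\beta + \alpha\gamma) & 2(\alpha + \beta\gamma) & 1 - \alpha^2 - \beta^2 + \gamma^2
\end{bmatrix}.
\]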

⁵ For members of Orth+ with at least one eigenvalue equal to −1, the correspondence when the dimension $n$ is three is given by

\[
Q = I + 2W^2,
\]

where $W$ is of the form

\[
W = \begin{bmatrix}
0 & -\sin\theta & \cos\theta \sin\phi \\
\sin\theta & 0 & -\cos\theta \cos\phi \\
-\cos\theta \sin\phi & \cos\theta \cos\phi & 0
\end{bmatrix}.
\]




By comparing Eqn. (1.104) with Eqn. (1.339), we see that $\sin\alpha \equiv 2|w|/(1 + |w|^2)$ and $\cos\alpha = (1 - |w|^2)/(1 + |w|^2)$, which corresponds to letting $\alpha \in [0, \pi)$ in Eqn. (1.104). If $w_1/|w_1|$ and $w_2/|w_2|$ are the axes of $R_1$ and $R_2$, respectively, then, from Eqn. (1.107) and the above expressions for $\cos\alpha$ and $\sin\alpha$, the axis of $R_2 R_1$ is $e = w/|w|$, where

\[
w = w_1 + w_2 - w_1 \times w_2.
\]

(b) For every $Q \in \mathrm{Orth}^+$ such that −1 is not an eigenvalue of $Q$, $W = (Q - I)(Q + I)^{-1} \in \mathrm{Skw}$.

Thus, the number of parameters required to define an orthogonal tensor is the number of independent components in a skew-symmetric matrix, namely $n(n-1)/2$; e.g., for $n = 3$, the number of parameters is 3.
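Equation (1.339) in Problem 26 can be checked without computing a matrix inverse: if $Q$ equals $(I - W)^{-1}(I + W)$, then $(I - W)Q = I + W$. The sketch below evaluates Eqn. (1.339) in exact rational arithmetic for a sample $(\alpha, \beta, \gamma) = (1, 2, 3)$ and also confirms orthogonality. The function names and sample values are assumptions for illustration.

```python
from fractions import Fraction

IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]


def skew(w):
    """W with axial vector w = (alpha, beta, gamma), laid out as in Problem 26."""
    return [[0, -w[2], w[1]], [w[2], 0, -w[0]], [-w[1], w[0], 0]]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def cayley_rotation(alpha, beta, gamma):
    """Q = [(1 + w.w) I + 2 W + 2 W^2] / (1 + w.w), following Eqn. (1.339)."""
    w = [Fraction(alpha), Fraction(beta), Fraction(gamma)]
    W = skew(w)
    W2 = matmul(W, W)
    s = 1 + sum(c * c for c in w)
    return [[(s * IDENTITY[i][j] + 2 * W[i][j] + 2 * W2[i][j]) / s
             for j in range(3)] for i in range(3)]


Q = cayley_rotation(1, 2, 3)
W = skew([1, 2, 3])
I_minus_W = [[IDENTITY[i][j] - W[i][j] for j in range(3)] for i in range(3)]
I_plus_W = [[IDENTITY[i][j] + W[i][j] for j in range(3)] for i in range(3)]

lhs = matmul(I_minus_W, Q)                       # should equal I + W
QtQ = [[sum(Q[k][i] * Q[k][j] for k in range(3)) for j in range(3)]
       for i in range(3)]                        # should equal I

print(lhs == I_plus_W, QtQ == IDENTITY)
```

The first row of `Q` agrees with the expanded matrix in part (a); for instance, the (1,1) entry is $(1 + \alpha^2 - \beta^2 - \gamma^2)/15 = -11/15$.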

27. If $u$ is an arbitrary vector, show that there exists $W \in \mathrm{Skw}$ and a vector $v$ (both dependent on $u$) such that $Wv = u$. Use this result to show that if $WT_1 = WT_2 \;\forall W \in \mathrm{Skw}$, then $T_1 = T_2$.

28. Show that a scalar-valued function $\phi : \mathrm{Skw} \to \Re$ is isotropic if and only if there exists a function $\hat{\phi} : \operatorname{tr}(W^2) \to \Re$ such that

\[
\phi(W) = \hat{\phi}(\operatorname{tr} W^2) \quad \forall W \in \mathrm{Skw}.
\]

The above result follows directly as a corollary of Theorem 1.6.13, or can be proved independently by using the representation given by Eqn. (1.89), and mimicking the proof of Theorem 1.6.12.

29. For S ∈ Sym, show using the spectral resolution of S or otherwise that

(u, Su) = 0 ∀u ∈ V,

if and only if S = 0. It follows that (u, Tu) = 0 ∀u ∈ V, if and only if T ∈ Skw.

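30. A set of Cartesian axes is rotated about the origin to coincide with the unit vectors

\[
e_1 = \left( \frac{\sqrt{3}}{4}, \frac{1}{4}, \frac{\sqrt{3}}{2} \right), \qquad
e_2 = \left( \frac{3}{4}, \frac{\sqrt{3}}{4}, -\frac{1}{2} \right), \qquad
e_3 = \left( -\frac{1}{2}, \frac{\sqrt{3}}{2}, 0 \right).
\]

Write down the rotation matrix corresponding to this rotation, and transform the components of the tensor $T$ with respect to $e_i$ given by

\[
[T] = \begin{bmatrix} 8 & -4 & 0 \\ -4 & -3 & -1 \\ 0 & -1 & -2 \end{bmatrix}.
\]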
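For Problem 30, the following sketch checks that the three given unit vectors form an orthonormal set and then applies a change of basis to $[T]$. The convention used here is an assumption that should be compared with the one adopted in the text: the rows of `Q` hold the components of the rotated unit vectors in the original axes, $[T]$ is taken as given in the original axes, and the transformed components are computed as $Q[T]Q^T$.

```python
import math

r3 = math.sqrt(3.0)

# Rows: components of the rotated unit vectors in the original axes (assumed layout).
Q = [[r3 / 4, 1 / 4, r3 / 2],
     [3 / 4, r3 / 4, -1 / 2],
     [-1 / 2, r3 / 2, 0.0]]

T = [[8.0, -4.0, 0.0],
     [-4.0, -3.0, -1.0],
     [0.0, -1.0, -2.0]]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]


QQt = matmul(Q, transpose(Q))                       # identity if rows are orthonormal
T_new = matmul(matmul(Q, T), transpose(Q))          # assumed convention: [T]' = Q [T] Q^T

for row in T_new:
    print([round(x, 6) for x in row])
```

Whatever the convention, the trace (here 3) and the symmetry of the component matrix must be preserved by the transformation, which gives a simple check on a hand calculation.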




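31. Using direct notation show that (also verify using indicial notation)

\[
\begin{aligned}
(a \otimes b)^T &= b \otimes a, && (1.340) \\
(a \otimes b)(c \otimes d) &= (b \cdot c)\, a \otimes d, && (1.341) \\
(a \otimes b) : (u \otimes v) &= (a \cdot u)(b \cdot v), && (1.342) \\
T(a \otimes b) &= (Ta) \otimes b, && (1.343) \\
(a \otimes b)\,T &= a \otimes (T^T b), && (1.344) \\
T : (a \otimes b) &= a \cdot Tb. && (1.345)
\end{aligned}
\]

Use Eqn. (1.341) to show that for a symmetric $S$, the spectral decompositions of $S^n$ and $S^{-1}$ (when it exists) are

\[
\begin{aligned}
S^n &= \lambda_1^n\, e_1^* \otimes e_1^* + \lambda_2^n\, e_2^* \otimes e_2^* + \lambda_3^n\, e_3^* \otimes e_3^*, \\
S^{-1} &= \lambda_1^{-1}\, e_1^* \otimes e_1^* + \lambda_2^{-1}\, e_2^* \otimes e_2^* + \lambda_3^{-1}\, e_3^* \otimes e_3^*.
\end{aligned}
\]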

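32. The Fibonacci numbers satisfy the recurrence relation $F_{n+1} = F_n + F_{n-1}$ with $F_1 = F_2 = 1$. In matrix form this recurrence relation can be written as

\[
\begin{bmatrix} F_{n+1} \\ F_n \end{bmatrix} = S \begin{bmatrix} F_n \\ F_{n-1} \end{bmatrix},
\]

where $S = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix}$. Use the above matrix form to first express $\begin{bmatrix} F_{n+1} \\ F_n \end{bmatrix}$ in terms of $\begin{bmatrix} F_2 \\ F_1 \end{bmatrix}$, and then use this to find an explicit expression for $F_n$ in terms of the eigenvalues $\lambda_1$ and $\lambda_2$ of $S$.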
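The two routes in Problem 32 can be compared numerically. The sketch below computes $F_n$ once from powers of $S$ (using repeated squaring) and once from the eigenvalues of $S$, assuming the well-known closed form $F_n = (\lambda_1^n - \lambda_2^n)/(\lambda_1 - \lambda_2)$ with $\lambda_{1,2} = (1 \pm \sqrt{5})/2$. The function names are illustrative, and the floating-point route is only reliable for moderate $n$.

```python
import math


def mat_mul2(A, B):
    """Product of two 2x2 integer matrices."""
    return [[A[0][0] * B[0][0] + A[0][1] * B[1][0], A[0][0] * B[0][1] + A[0][1] * B[1][1]],
            [A[1][0] * B[0][0] + A[1][1] * B[1][0], A[1][0] * B[0][1] + A[1][1] * B[1][1]]]


def mat_pow2(A, k):
    """A**k for k >= 0 by repeated squaring."""
    result = [[1, 0], [0, 1]]
    while k > 0:
        if k % 2 == 1:
            result = mat_mul2(result, A)
        A = mat_mul2(A, A)
        k //= 2
    return result


def fib_matrix(n):
    """F_n from [F_{n+1}, F_n]^T = S^(n-1) [F_2, F_1]^T, valid for n >= 1."""
    P = mat_pow2([[1, 1], [1, 0]], n - 1)
    return P[1][0] * 1 + P[1][1] * 1      # second row applied to [F_2, F_1] = [1, 1]


def fib_eigen(n):
    """F_n from the eigenvalues of S (floating point, rounded to the nearest integer)."""
    lam1 = (1.0 + math.sqrt(5.0)) / 2.0
    lam2 = (1.0 - math.sqrt(5.0)) / 2.0
    return round((lam1 ** n - lam2 ** n) / (lam1 - lam2))


print([fib_matrix(n) for n in range(1, 11)])
print([fib_eigen(n) for n in range(1, 11)])
```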

33. Let $B$ be an invertible tensor. Show that if $n$ is an eigenvector of $A$, then $Bn$ is an eigenvector of $BAB^{-1}$ corresponding to the same eigenvalue. It follows that if $Q \in \mathrm{Orth}^+$ and if $n$ is an eigenvector of a tensor $T$, then $Qn$ is an eigenvector of $QTQ^T$ corresponding to the same eigenvalue.

34. If the eigenvalues of a symmetric tensor $S$ are ordered such that $\lambda_1 \le \lambda_2 \le \lambda_3$, show using the spectral decomposition of $S$ that

\[
\lambda_1 = \min_{u} \frac{u \cdot Su}{u \cdot u}, \qquad \lambda_3 = \max_{u} \frac{u \cdot Su}{u \cdot u}.
\]

35. If $T \in \mathrm{Lin}$ and $u \in V$, show that

\[
(Tu, Tu) \le (T, T)(u, u).
\]

(Hint: $T^T T$ is a symmetric, positive semi-definite tensor.)

36. If $R \in \mathrm{Lin}$, show that $\left| R^T R \right| \le |R|^2$ (Hint: $R^T R$ is a symmetric, positive semi-definite tensor). Use this result and the Cauchy–Schwarz inequality to show that if $R, S \in \mathrm{Lin}$, then

\[
|RS| \le |R|\,|S|.
\]




As a result of the Cauchy–Schwarz inequality, we also have $|(R, S)| \le |R|\,|S|$. However, among the two quantities $|(R, S)|$ and $|RS|$, either can be greater. For example, let $R = \begin{bmatrix} 1 & 0 \\ -1 & 2 \end{bmatrix}$; if $S = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$, then $|(R, S)| = 3$ and $|RS| = 2$, while if $S = \begin{bmatrix} -1 & 1 \\ 0 & 1 \end{bmatrix}$, then $|(R, S)| = 1$ and $|RS| = 2$.
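The numbers quoted in the example of Problem 36 can be reproduced directly. The sketch below assumes that $(R, S)$ denotes the inner product $R : S = \sum_{ij} R_{ij} S_{ij}$ and that $|R| = \sqrt{(R, R)}$; the helper names are illustrative.

```python
import math


def inner(A, B):
    """Tensor inner product (A, B) = sum_ij A_ij B_ij (assumed meaning of (R, S))."""
    return sum(A[i][j] * B[i][j] for i in range(2) for j in range(2))


def norm(A):
    return math.sqrt(inner(A, A))


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


R = [[1, 0], [-1, 2]]
S_a = [[1, 1], [0, 1]]
S_b = [[-1, 1], [0, 1]]

for S in (S_a, S_b):
    print(abs(inner(R, S)), norm(matmul(R, S)), norm(R) * norm(S))
```

In both cases the two quantities stay below $|R|\,|S|$, while their relative order flips between the two choices of $S$.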

37. Show that $S^n$, $S^{-1}$ and $\lambda(\operatorname{tr} S)I + 2\mu S$ are isotropic functions.

38. Show that for any orthogonal $Q(t)$, the tensor $\dot{Q}Q^T$ is skew at each $t$.

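39. Show using Eqn. (1.336) that if $G(T) = \operatorname{cof} T$, then

\[
DG(T)[U] = [\operatorname{tr} T \operatorname{tr} U - \operatorname{tr}(TU)]\, I + [T^T - (\operatorname{tr} T) I]\, U^T + [U^T - (\operatorname{tr} U) I]\, T^T. \tag{1.346}
\]

Alternatively, using Eqn. (1.60), show that

\[
[DG(T)[U]]_{ij} = \epsilon_{imn}\epsilon_{jpq} T_{mp} U_{nq}, \tag{1.347}
\]

\[
\frac{\partial G_{ij}}{\partial T_{kl}} = \epsilon_{ikm}\epsilon_{jln} T_{mn}. \tag{1.348}
\]

Equation (1.348) in direct tensorial notation, obtained using Eqns. (1.55), (1.226a), (1.226b) and (1.233), is given by

\[
\frac{\partial}{\partial T}(\operatorname{cof} T) = \operatorname{tr} T\, [I \otimes I - \mathbb{T}] - (I \otimes T^T + T^T \otimes I) + (I \boxtimes T)\mathbb{T} + (T^T \boxtimes I)\mathbb{T}. \tag{1.349}
\]

If $T \equiv C \in \mathrm{Sym}$, then, by virtue of Eqn. (1.241), we just pre- and post-multiply the above result with $\mathbb{S}$, and use Eqn. (1.175) to get

\[
\frac{\partial}{\partial C}(\operatorname{cof} C) = \operatorname{tr} C\, [I \otimes I - \mathbb{S}] - (I \otimes C + C \otimes I) + \mathbb{S}\, [I \boxtimes C + C \boxtimes I]\, \mathbb{S}. \tag{1.350}
\]

The indicial notation form of the above equation, obtained using Eqns. (1.161) and (1.348), is (for the engineering form, see Eqn. (I.6))

\[
\left[ \frac{\partial}{\partial C}(\operatorname{cof} C) \right]_{ijkl} = \frac{1}{4} \left[ \epsilon_{ikr}\epsilon_{jls} + \epsilon_{ilr}\epsilon_{jks} + \epsilon_{jkr}\epsilon_{ils} + \epsilon_{jlr}\epsilon_{iks} \right] C_{rs}. \tag{1.351}
\]

Finally, consider the case when $C \in \mathrm{Sym}$ is invertible, i.e., $\operatorname{cof} C = (\det C)\, C^{-1}$. Show using Eqns. (1.236) and (1.240) that

\[
\frac{\partial}{\partial C}(\operatorname{cof} C) = \det C \left[ C^{-1} \otimes C^{-1} - \mathbb{S}\, (C^{-1} \boxtimes C^{-1})\, \mathbb{S} \right]. \tag{1.352}
\]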

40. Let $\phi_1(T) = \operatorname{tr}(T^{-1}T^{-1}) = T^{-T} : T^{-1}$, $\phi_2(T) = (\det T)\, T^{-1} : T^{-1}$, and $\phi_3(T) = (\operatorname{cof} T) : (\operatorname{cof} T) = \left[ (T : T)^2 - (T^T T) : (T^T T) \right]/2$. Using either Eqns. (1.228)–(1.237) or by means of the directional derivative, show that

\[
\begin{aligned}
\frac{\partial \phi_1}{\partial T} &= -2\,(T^{-T})^3, \\
\frac{\partial \phi_2}{\partial T} &= (T^{-1} : T^{-1}) \operatorname{cof} T - 2(\det T)\, T^{-T} T^{-1} T^{-T}, \\
\frac{\partial \phi_3}{\partial T} &= 2(T : T)\, T - 2\, T T^T T.
\end{aligned}
\]

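41. Prove that

\[
\begin{aligned}
\nabla(\phi v) &= \phi \nabla v + v \otimes \nabla\phi, \\
\nabla \cdot (\phi v) &= \phi(\nabla \cdot v) + v \cdot (\nabla\phi), && (1.353) \\
\nabla \times (u \times v) &= (\nabla \cdot v)u - (\nabla v)u - (\nabla \cdot u)v + (\nabla u)v, && (1.354) \\
\nabla(u \cdot v) &= (\nabla u)^T v + (\nabla v)^T u, \\
\nabla[(u \cdot v)w] &= (u \cdot v)\nabla w + w \otimes [(\nabla v)^T u] + w \otimes [(\nabla u)^T v], \\
\nabla \cdot (u \otimes v) &= u\, \nabla \cdot v + (\nabla u)v, \\
\nabla \cdot (T^T v) &= T : \nabla v + v \cdot (\nabla \cdot T), && (1.355) \\
\nabla \cdot (\phi T) &= \phi \nabla \cdot T + T \nabla\phi, \\
\nabla(\phi T) &= \phi \nabla T + T \otimes \nabla\phi, \\
\nabla(\phi T v) &= \phi \nabla(Tv) + (Tv) \otimes \nabla\phi, \\
\nabla^2(u \cdot v) &= u \cdot \nabla^2 v + v \cdot \nabla^2 u + 2\nabla u : \nabla v, \\
\nabla \cdot [a \times (\nabla \times b)] &= (\nabla \times a) \cdot (\nabla \times b) - a \cdot [\nabla \times (\nabla \times b)]. && (1.356)
\end{aligned}
\]

By integrating Eqn. (1.356) over a closed volume $V$ with surface $S$ show that

\[
-\int_S (\nabla \times b) \cdot (a \times n)\, dS = \int_V \left\{ (\nabla \times a) \cdot (\nabla \times b) - a \cdot [\nabla \times (\nabla \times b)] \right\} dV.
\]

42. Show that

\[
\begin{aligned}
\nabla \cdot [(\nabla u)u] &= \nabla u : (\nabla u)^T + u \cdot [\nabla(\nabla \cdot u)], \\
\nabla u : (\nabla u)^T &= \nabla \cdot [(\nabla u)u - (\nabla \cdot u)u] + (\nabla \cdot u)^2.
\end{aligned}
\]

43. Show that

\[
\int_V \left[ u \otimes (\nabla \cdot T^T) + (\nabla u)T \right] dV = \int_S u \otimes (T^T n)\, dS.
\]

44. Show that $\Re - \{0\}$ is a commutative group with respect to multiplication.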


