Phd Thesis Burckner Zeilinger


Information in Individual

Quantum Systems

Dissertation submitted for the academic degree of
Doctor of Technical Sciences (Doktor der technischen Wissenschaften)

under the supervision of o.Univ.Prof. Dr. Anton Zeilinger
E141, Atominstitut der Österreichischen Universitäten

submitted to the Technische Universität Wien,
Faculty of Natural Sciences (Naturwissenschaftliche Fakultät)

by Mag. Časlav Brukner
9108742
Pulverturmgasse 15/22, 1090 Wien

Vienna, 16 September 1999

Supported by the Fonds zur Förderung der wissenschaftlichen Forschung,
Project Nos. S6502 and F1506


Contents

Introduction

From quantum theory to an information invariant ...

1 Information Acquired in a Quantum Experiment
  1.1 Unbestimmtheit vs Unbekanntheit in a Quantum Experiment
  1.2 Conceptual Inadequacy of the Shannon Information in a Quantum Measurement
    1.2.1 An Operational Approach
    1.2.2 An Axiomatic Approach
    1.2.3 A Physical Approach
  1.3 Measure of Information Acquired in a Quantum Experiment

2 Information Content of a Quantum System
  2.1 A Qubit Carries One Bit
    2.1.1 Complementary Propositions
    2.1.2 Invariant Information in a Qubit
  2.2 Two Qubits Carry Two Bits: Entanglement
    2.2.1 Pairs of Complementary Propositions
    2.2.2 Invariant Information in Two Qubits
  2.3 N Qubits Carry N Bits
  A.1 Information Content of a Classical System

... and back.

3 Information and the Structure of Quantum Theory
  3.1 The Principle of Quantization of Information
  3.2 The Number of Mutually Complementary Propositions
  3.3 Malus' Law in Quantum Mechanics
  3.4 The de Broglie Wavelength
  3.5 Dynamics of Information
  3.6 Linearity and Arbitrarily Fast Communication
  3.7 Change of Information in Measurement: "Reduction of the Wave Packet"
  B.1 Continuity of Information Implies Analyticity of Information
  B.2 A General Transformation in the Space of Information

Conclusions

Preprint from Phys. Rev. Lett.

References

Summary (Zusammenfassung)

In any quantum experiment a finite number of different outcomes is possible, for example the individual spin outcomes "spin up" and "spin down". Before the experiment is performed, an observer knows only the specific probabilities of all possible individual outcomes. We define a new measure of information for an individual measurement. It is based on the fact that in an individual quantum measurement the only properties of the system that are defined before the measurement is performed are the specific probabilities for all possible individual outcomes.

According to the Copenhagen interpretation of quantum mechanics, elaborated in particular by Niels Bohr, it makes no sense to speak of a property of a quantum system independently of the experimental arrangement in which that property manifests itself. The observer is in any case free to choose different experimental arrangements, which may even completely exclude one another, for example measurements of orthogonal components of spin. This quantum complementarity of variables occurs when the corresponding operators do not commute. One variable, for example one component of spin, can be precisely defined at the expense of maximal uncertainty about the other orthogonal components. We define the total information content of a quantum system as the sum of the measures of information of the individual variables of a complete set of mutually complementary (completely mutually exclusive) variables.

The observer may decide to measure a different set of complementary variables and thereby gains knowledge about one or more variables at the expense of less knowledge about others. In the case of spin measurements these could be the projections along rotated directions, where the uncertainty in one component is reduced and that in another component (or in several components) is correspondingly increased. Intuitively one expects that the total uncertainty or, equivalently, the total information contained in the system remains unchanged under such a transformation from one complete set of complementary variables to another.

We show that the total information of a system, defined according to our new measure, has exactly this invariance property. We interpret the existence of this property of the total information as an indication that information is the most fundamental notion in quantum mechanics. In the first part of this thesis we show, based on the results of quantum theory, the validity of the invariance property of the total information and suggest ideas for the foundational principle of quantum mechanics.

In the second part we argue for a new foundational principle of quantum mechanics, which assumes that the most elementary system is characterized by one bit of information. Likewise, a composite system consisting, for example, of two elementary systems represents two bits. Starting from this foundational principle, we then derive some essential elements of the logical structure of quantum theory. The total information of a system (consisting of a finite number of bits) manifests itself only in certain measurements. Since a quantum system cannot carry more information than is contained in its bits, the randomness of the individual outcomes in the other (complementary) measurements is then a necessary consequence. This kind of randomness is irreducible, i.e. it cannot be traced back to hidden properties of the system. Otherwise the elementary system would carry more than one bit of information. The most natural function relating the probability for a specific outcome to occur and the laboratory parameters, consistent with the foundational principle that an elementary system carries only one bit of information, must be the sinusoidal dependence.

Entanglement results from the fact that the information of a composite multi-particle system can be distributed over joint properties. For a two-particle system, for example, we obtain maximal entanglement when the two bits have been exhausted in specifying joint properties and no further possibility exists of encoding information in the individual particles.

Abstract

A new measure of information in quantum mechanics is proposed which takes into account that for quantum systems the only features known before an experiment is performed are the probabilities for various events to occur. The sum of the individual measures of information for mutually complementary observations is invariant under the choice of the particular set of complementary observations and conserved in time if there is no information exchange with an environment. This operational quantum information invariant results in k bits of information for a system consisting of k qubits. For a composite system, maximal entanglement results if the total information carried by the system is exhausted in specifying joint properties, with no individual qubit carrying any information on its own.

We interpret our results as implying that information is the most fundamental notion in quantum mechanics. Based on this observation we suggest ideas for a foundational principle for quantum theory. It is proposed here that the foundational principle for quantum theory may be identified through the assumption that the most elementary system carries one bit of information only. Therefore an elementary system can only give a definite answer in one specific measurement. The irreducible randomness of individual outcomes in other measurements and quantum complementarity are then necessary consequences. The most natural function between probabilities for outcomes to occur and the experimental parameters, consistent with the foundational principle proposed, is the well-known sinusoidal dependence.


Introduction

The ongoing debate about the interpretation of quantum mechanics, including the meaning of specific phenomena like the measurement problem, indicates that the foundations of quantum theory are not understood to the same degree as those of classical mechanics or special relativity. While the basic concepts of classical mechanics coincide well with our intuition, special relativity lies outside our immediate insight. Yet this theory is based on the principle of relativity, which asserts that the laws of physics must be the same in all inertial systems, including the constancy of the speed of light. However, even though the theory itself is based on such simple and in part intuitively clear principles, it nevertheless predicts some surprising and even counter-intuitive consequences.

The foundational principles for special relativity imply an invariance of the specific interval (eigenzeit) between two events with respect to all inertial frames of reference. Data on pure time intervals obtained with respect to two relatively moving inertial frames of reference will differ, and so will data on spatial distances. It is possible, however, to form a single expression from time intervals and space distances that will have the same value with respect to all inertial frames of reference. If the time interval between two distant events is denoted by $\Delta t$ and their space distance from each other by $\Delta l$, an expression involving a quantity symbolized by $\Delta s$ can be derived in which $(\Delta s)^2$ equals the square of the time interval minus the fraction of distance squared over speed of light squared, $(\Delta s)^2 = (\Delta t)^2 - (\Delta l)^2/c^2$. This will have the same value as $(\Delta t')^2 - (\Delta l')^2/c^2$, with $\Delta t'$ and $\Delta l'$ having been obtained in another inertial frame of reference.
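The invariance of this interval can be checked numerically. The following is a minimal Python sketch, not part of the original text: it assumes the standard Lorentz boost along the line joining the two events, and the function names `boost` and `interval_squared` as well as the sample numbers are illustrative choices.

```python
from math import sqrt, isclose

C = 299_792_458.0  # speed of light in m/s


def boost(dt, dl, v):
    """Transform a time interval dt (s) and a distance dl (m) into a frame
    moving with velocity v (m/s) along the line joining the two events.
    Assumes the standard Lorentz boost."""
    gamma = 1.0 / sqrt(1.0 - (v / C) ** 2)
    dt_prime = gamma * (dt - v * dl / C**2)
    dl_prime = gamma * (dl - v * dt)
    return dt_prime, dl_prime


def interval_squared(dt, dl):
    """(Delta s)^2 = (Delta t)^2 - (Delta l)^2 / c^2."""
    return dt**2 - (dl / C) ** 2


# Illustrative numbers: events 1 microsecond and 100 m apart, frame at 0.6 c.
dt, dl = 1e-6, 100.0
dt_prime, dl_prime = boost(dt, dl, 0.6 * C)

print("time intervals differ:", dt, dt_prime)
print("distances differ:     ", dl, dl_prime)
print("interval is invariant:",
      isclose(interval_squared(dt, dl),
              interval_squared(dt_prime, dl_prime), rel_tol=1e-9))
```

The time interval and the distance each change under the boost, while the combination $(\Delta t)^2 - (\Delta l)^2/c^2$ agrees in both frames up to rounding.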

Quantum mechanics lacks such invariants and principles to this day. Possibly the lack of generally accepted invariants and foundational principles for quantum mechanics is the main reason for the problem in understanding quantum mechanics¹ and thus for the coexistence of philosophically quite different interpretations of the theory. In fact, we have a number of coexisting interpretations utilizing mutually contradictory concepts [Zeilinger, 1996]. A very incomplete list of the many interpretations of quantum mechanics includes the original Copenhagen Interpretation [Bohr, 1935], the Many-World Interpretation [Everett, 1957], the Statistical Interpretation [Ballentine, 1970], Bohm's interpretation [Bohm, 1952], the Transactional Interpretation [Cramer, 1986], the Consistent Histories Interpretation [Griffiths, 1984] and Mermin's Ithaca interpretation [Mermin, 1998(a), 1998(b), 1998(c)].

¹ In his book [1967] Richard Feynman makes the following statement: "There was a time the newspaper said that only twelve men understood the theory of relativity. I do not believe there ever was such a time. There might have been a time when only one man did, because he was the only guy who caught on, before he wrote his paper. But after people read the paper a lot of people understood the theory of relativity in some way or other, certainly more than twelve. On the other hand, I think I can safely say that nobody understands quantum mechanics."

In any quantum experiment with discrete variables a number of different outcomes are possible, for example, the individual spin outcomes "spin up" and "spin down". Before the experiment is performed an experimentalist only knows the specific probabilities for all possible individual outcomes. In chapter 1 we define a new measure of the experimentalist's information for an individual measurement, based on the fact that the only features defined before the measurement is performed are the specific probabilities for all possible individual outcomes.

The observer is free to choose different experiments which might even completely exclude each other, for example measurements of orthogonal components of spin. This quantum complementarity of variables occurs when the corresponding operators do not commute. One quantity, for example the z-component of spin, might be well defined at the expense of maximal uncertainty about the other orthogonal components. In chapter 2 we define the total information content in a quantum system to be the sum over all individual measures for a complete set of mutually complementary experiments.

The experimentalist may decide to measure a different set of complementary variables, thus gaining certainty about one or more variables at the expense of losing certainty about the other(s). In the case of spin these could be the projections along rotated directions, for example, where the uncertainty in one component is reduced but the one in another component is increased correspondingly. Intuitively one expects that the total uncertainty or, equivalently, the total information carried by the system is invariant under such a transformation from one complete set of complementary variables to another one. In chapter 2 we show that the total information defined according to our new measure has exactly that invariance property. We interpret the existence of that quantum information invariant as implying that in quantum mechanics information is the most fundamental notion. In the first part of the thesis, "From Quantum Theory to an Information Invariant ..." (chapters 1 and 2), we argue, based on the known features of quantum physics, for the validity of the quantum information invariant and we suggest ideas for a foundational principle for quantum theory.

In the second part of the thesis, "... and back" (chapter 3), we will turn the reasoning around and, based on the suggested foundational principle for quantum mechanics, derive some essential features of the logical structure of quantum theory. In a similar fashion as the foundational principles for special relativity imply invariance of the specific measure of distance (eigenzeit) in space-time with respect to all observers in inertial frames of reference, the suggested foundational principle for quantum mechanics will imply invariance of a specific operational information measure with respect to all possible observers' choices for a complete set of complementary experiments.

By a foundational principle we do not mean an axiomatic formalization of the mathematical foundation of quantum mechanics, but a foundational conceptual principle which answers Wheeler's [1983] question "Why the Quantum?" This principle is then the reason for some essential features of quantum mechanics, like the irreducible randomness of an individual quantum event, quantum complementarity, the sinusoidal relation between probabilities and laboratory parameters, and entanglement. In this view we will discuss precisely the empirical significance of the terms involved in formulating quantum theory, particularly the notion of a quantum state, in a way which leads clearly to an understanding of the theory. However, we are aware of the possibility that this might not carry the same degree of emotional appeal for everyone. The conceptual groundwork for the ideas presented here has been prepared most notably by Bohr [1958], von Weizsäcker [1985] and Wheeler [1983].


From quantum theory to an information invariant ...

Chapter 1

Information Acquired in a Quantum Experiment

In a review article about the role of information in physics W. T. Grandy, Jr. [1997] writes that "... an unambiguous clear-cut definition of information remains slippery as that of randomness, say, or complexity. Is it merely a set of data? Or is it itself physical? If the latter, as Einstein once commented upon the ether, it has no definite spacetime coordinates." He continues further in the text: "The difficulty is somewhat similar to that of attempting to explain the origin and meaning of inertia to beginning students. While the term can seem a bit obscure in its meaning, there is no ambiguity in defining inertial mass as its measure, and the concept becomes scientifically useful." Similarly, the general notion of information becomes a scientific one only if it is made measurable. The question arises: How to measure information? In particular we ask: How to measure information acquired in a quantum experiment?

Assume we want to find out the position of the moon in the sky on a full-

moon night. Before we look at the sky we are completely uncertain about the

position of the moon. When we look at the sky, we find out where the moon is

and it is certainly safe to assume that the property of the moon to be there is

independent of whether anyone looks or not. Our ignorance about the position

of the moon given before we look at the sky is the ignorance about a property

already existing in the outside world.

The situation in quantum measurement is drastically different. With the sole exception of a system being in an eigenstate of the measured observable, an individual quantum event is intrinsically random and therefore cannot be assumed to just reveal a property of the system existing before the experiment is performed. This we interpret in Sec. 1.1 as implying that the notion of our ignorance, or information, as to which specific experimental result will be obtained in an individual run of the experiment plays a more fundamental role in quantum measurement than in classical measurement.

Based on the fact that in an individual quantum measurement the only features defined before the measurement is performed are the probabilities for all possible individual outcomes to occur, we propose a new measure of information for an individual quantum measurement in Sec. 1.3. For clarity we emphasize that our measure of information is not equivalent to Shannon's information. In fact, we show in Sec. 1.2 that because of the completely different root of a quantum measurement as compared to that of a classical measurement, certain conceptual difficulties arise when we try to define information gain in a quantum measurement by the notion of Shannon's information. While Shannon's information is applicable when a measurement reveals a pre-existing property, our measure of information takes into account that, in general, a quantum measurement does not reveal a pre-existing property.

1.1 Unbestimmtheit vs Unbekanntheit in a Quantum Experiment

We begin with a brief survey of the usual textbook examples. Perhaps the archetypical example is Einstein's recoiling-slit experiment [Bohr, 1949]. By this example Einstein hoped to give a gedanken double-slit experiment which would yield both which-path information and also show the wave-like interference phenomenon. In a famous paper [1949], Bohr analyzed two arrangements related to the recoiling-slit experiment. In the first arrangement, the diaphragm placed in front of the diaphragm pierced with two slits can recoil (Fig. 1.1b) and reveal through which slit of the second diaphragm the photon reached the screen, inasmuch as only one of the momenta of a photon passing through one or the other slit is consistent with a known amount of recoil momentum. In the second arrangement (Fig. 1.1a), the diaphragm is fixed so that the path cannot be determined. One finds that only in the latter arrangement an interference pattern is exhibited. Bohr concluded: "... we are presented with a choice of either tracing the path of a particle or observing interference effects."

Another example along these lines is Feynman's [Feynman et al., 1965] version of Einstein's gedanken experiment. In this scheme the interfering electron is observed by light scattering. The scattering of a photon is used to detect the electron position just behind the slits, revealing through which slit the electron reached the screen. Feynman explained that this observation procedure destroys the interference pattern. He concluded his analysis with the following statement: "If an apparatus is capable of determining which hole the electron goes through, it cannot be so delicate that it does not disturb the pattern in an essential way. No one has ever found (or even thought of) a way around the uncertainty principle."

Figure 1.1: Two mutually exclusive experimental arrangements to observe the interference pattern (Fig. a) and the path of the particle (Fig. b) in the double-slit experiment. The figures are taken from [Bohr, 1949]. If the diaphragm with two slits is fixed, an interference pattern is exhibited, as given in Fig. a). In the experimental situation in Fig. b), when the diaphragm can recoil, no interference pattern is observed. Bohr [1949] writes: "Since, however, any reading of the scale, in whatever way performed, will involve an uncontrollable change in the momentum of the diaphragm, there will always be, in conformity with the indeterminacy principle, a reciprocal relationship between our knowledge of the position of the slit and the accuracy of the momentum control." The lack of our knowledge of the position of the slit then excludes the appearance of the interference phenomena.

In the experimental situations discussed so far, as in most other usual textbook examples, the which-path information is obtained by exposing the interfering particle to uncontrollable scattering effects. This initiated a number of misconceptions being put forward in the literature. According to the most significant misconception, loss of interference is due to an uncontrollable transfer of energy and/or momentum to the particle associated with any attempt to observe the particle's path. Unavoidable disturbances might in turn be attributed to the intrinsic clumsiness of any macroscopic measuring apparatus. Over the last few years experiments have been considered, and some already performed, in which the reason why no interference pattern arises is not any uncontrollable disturbance of the quantum system or the clumsiness of the apparatus. Rather, the lack of interference is due to the fact that the quantum state is prepared in such a way as to permit path information to be obtained, in principle, independent of whether the experimenter cares to read it out or not.


Figure 1.2: An arrangement for two-particle interferometry. The source emits two particles in the entangled state (1.1). Particle 1 traverses the Mach-Zehnder interferometer starting with the beams A and B, while particle 2 traverses the Mach-Zehnder interferometer starting with the beams C and D. Phase shifters in both interferometers permit continuous variations of the phases $\phi_1$ and $\phi_2$.

One line of such research considers the use of pairs of particles which are strongly entangled. Consider a setup where a source emits two particles with antiparallel momenta which then feed two Mach-Zehnder interferometers [Horne et al., 1989], [Rarity and Tapster, 1990], [Herzog et al., 1995] as shown in Fig. 1.2. Then whenever particle 1 is found in beam A, particle 2 is found in beam C, and whenever particle 1 is found in beam B, particle 2 is found in beam D. The quantum state is

$$|\psi\rangle = \frac{1}{\sqrt{2}} \left( |A\rangle_1 |C\rangle_2 + |B\rangle_1 |D\rangle_2 \right). \qquad (1.1)$$

Will we now observe an interference pattern for particle 2, i.e. the well-known sinusoidal variation of the intensities registered in the detectors $U_2$ and $L_2$ upon variation of the phase $\phi_2$? The answer has to be negative, because by simply placing detectors in the beams A and B of particle 1 we can determine which path particle 1 took and hence, through the correlations in the state (1.1), which path particle 2 took. The lack of interference can easily be calculated starting from the state (1.1). Yet, if we recombine the two paths of particle 1 as indicated in Fig. 1.2, and if we register both particle 1 in either detector $U_1$ or $L_1$ and particle 2 in either detector $U_2$ or $L_2$, we have forgone any possibility of obtaining path information. Therefore we conclude that an interference pattern should arise in coincidence counts between the detectors for particle 1 and for particle 2 shown in Fig. 1.2. This indeed follows from quantum mechanical calculations [Horne et al., 1989].
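The reasoning above can be illustrated with a small amplitude calculation. The Python sketch below is an illustration under stated assumptions rather than the calculation of [Horne et al., 1989]: it assumes 50/50 beam splitters with amplitude $i/\sqrt{2}$ for reflection and $1/\sqrt{2}$ for transmission, a phase shifter $\phi_1$ in beam A and $\phi_2$ in beam D, and a particular (hypothetical) choice of which output port is labeled U or L. The function name `joint_probabilities` is likewise an assumption.

```python
import cmath
from math import cos, isclose, sqrt


def joint_probabilities(phi1, phi2):
    """Joint detection probabilities {(d1, d2): p} for the state
    (|A>|C> + |B>|D>)/sqrt(2), with d1, d2 in {'U', 'L'}."""
    r, t = 1j / sqrt(2), 1 / sqrt(2)  # reflection / transmission amplitudes
    # Assumed port labeling: amplitude for an input beam to reach a detector.
    bs1 = {"A": {"U": t, "L": r}, "B": {"U": r, "L": t}}
    bs2 = {"C": {"U": r, "L": t}, "D": {"U": t, "L": r}}
    # Assumed phase shifter positions: phi1 in beam A, phi2 in beam D.
    terms = [("A", "C", cmath.exp(1j * phi1) / sqrt(2)),
             ("B", "D", cmath.exp(1j * phi2) / sqrt(2))]
    probs = {}
    for d1 in "UL":
        for d2 in "UL":
            amp = sum(c * bs1[b1][d1] * bs2[b2][d2] for b1, b2, c in terms)
            probs[(d1, d2)] = abs(amp) ** 2
    return probs


phi1 = 0.3
for phi2 in (0.0, 1.0, 2.0, 3.0):
    p = joint_probabilities(phi1, phi2)
    single_u2 = p[("U", "U")] + p[("L", "U")]  # particle 2 alone at U2
    print(f"phi2={phi2:.1f}  P(U2)={single_u2:.3f}  "
          f"P(U1,U2)={p[('U', 'U')]:.3f}")

# Coincidences follow (1 + cos(phi2 - phi1)) / 4 under these conventions.
p = joint_probabilities(0.3, 1.1)
assert isclose(p[("U", "U")], (1 + cos(1.1 - 0.3)) / 4)
```

With these conventions the single-particle rate at $U_2$ stays at 1/2 for every phase setting, while the coincidence rate between $U_1$ and $U_2$ varies sinusoidally with $\phi_2 - \phi_1$.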

Another independent approach to complementarity in an interference experiment considers the use of micromasers in atomic beam experiments [Scully et al., 1991]. Typically in such an experiment, an atom passes through a cavity such that it exchanges exactly one photon with the cavity without changing momentum. Thus by investigating the cavity, one has information on whether or not an atom passed through it without influencing the momentum of the atom. Now, if we place one cavity into each of the two paths of the interference experiment, we may obtain information on which path the atom took. The interference pattern does not arise. It is the mere possibility of obtaining path information which guarantees that no interference occurs¹. On the other hand, we can read the information in the micromasers in such a way as to erase all information on which micromaser the photon has been stored in. Then we have just the information that the atom passed through the apparatus, but not along which path. In this case the atoms counted in coincidence with the photons are members of an ensemble defining an interference pattern.

¹ Scully et al. [1991] wrote: "... it is simply the information contained in a functioning measuring apparatus that changes the outcome of the experiment, and not uncontrollable alterations of the spatial wave function, resulting from the action of the measuring apparatus on the system under observation."

These two experiments underline clearly that complementarity does not originate in some uncontrollable disturbance of pre-assigned properties of a quantum system in a measurement process. In fact, as theorems like those of Bell [Bell, 1964] and Greenberger, Horne and Zeilinger [Greenberger et al. 1989, 1990] show, it is in principle not possible to assign to a quantum system simultaneously properties that both correspond to complementary measurements and, in order to be in agreement with special relativity, are local. The impossibility in principle of local realism will now be briefly demonstrated for our example of the two-particle interference experiment given in Fig. 1.2.

As the two particles in our example might be widely separated, it is natural to assume validity of the locality condition suggested by EPR [Einstein, Podolsky and Rosen, 1935]: "Since at the time of measurement the two systems no longer interact, no real change can take place in the second system in consequence of anything that may be done to the first system." Then, whether detector $U_2$ or $L_2$ is triggered for a specific phase $\phi_2$ must be independent of which measurement we actually perform on the other particle (e.g., independent of the phase $\phi_1$) and even independent of whether we care to perform any measurement at all on that particle. This assumption implies that certain combinations of expectation values have definite bounds. The mathematical expression of that bound is called Bell's inequality, of which many variants exist. For example, a version given by Clauser, Horne, Shimony and Holt [1969] is

$$|E(\phi_1, \phi_2) - E(\phi_1', \phi_2)| + |E(\phi_1, \phi_2') + E(\phi_1', \phi_2')| \leq 2 \qquad (1.2)$$

where

$$E(\phi_1, \phi_2) = P_{++}(\phi_1, \phi_2) + P_{--}(\phi_1, \phi_2) - P_{+-}(\phi_1, \phi_2) - P_{-+}(\phi_1, \phi_2). \qquad (1.3)$$

For the quantum state (1.1) this becomes

$$E^{QM}(\phi_1, \phi_2) = \cos(\phi_2 - \phi_1),$$

where we suppose a phase factor of $i$ for reflection and 1 for transmission at the beam splitter. Here we assume that particle 1 gives result $+$ ($-$) when it triggers detector $U_1$ ($L_1$) and particle 2 gives result $+$ ($-$) when it triggers detector $U_2$ ($L_2$). Then, e.g., $P_{++}(\phi_1, \phi_2)$ is the joint probability that particle 1 gives $+$ and particle 2 gives $+$. Maximal violation occurs for $\phi_1 = 45^\circ$, $\phi_2 = 0$, $\phi_1' = 135^\circ$, $\phi_2' = 90^\circ$, where the left-hand side of Eq. (1.2) will be $2\sqrt{2}$, in clear violation of the inequality. Thus, the assumption of local realism is in conflict with quantum physics itself.
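The numbers quoted above can be reproduced with a few lines of Python. This is a sketch, not part of the original text: it takes the correlation function $\cos(\phi_2 - \phi_1)$ as given and evaluates the CHSH combination with the difference term taken between the two settings of particle 1, which is the assignment under which the quoted angles give $2\sqrt{2}$. The helper names `e_qm`, `chsh` and `local_bound` are illustrative. The last helper enumerates all pre-assigned outcomes $\pm 1$, one simple way (assumed here for illustration) to see where the bound of 2 comes from.

```python
from itertools import product
from math import cos, radians, sqrt


def e_qm(phi1, phi2):
    """Quantum correlation for the state (1.1); angles in radians."""
    return cos(phi2 - phi1)


def chsh(e, phi1, phi2, phi1p, phi2p):
    """CHSH combination of four correlations for a correlation function e."""
    return (abs(e(phi1, phi2) - e(phi1p, phi2))
            + abs(e(phi1, phi2p) + e(phi1p, phi2p)))


def local_bound():
    """Largest value of the same combination when every outcome (+1 or -1)
    is fixed in advance and depends only on the local setting."""
    best = 0
    for a, ap, b, bp in product((+1, -1), repeat=4):
        best = max(best, abs(a * b - ap * b) + abs(a * bp + ap * bp))
    return best


# phi1, phi2, phi1', phi2' in degrees, as quoted in the text.
angles = [radians(x) for x in (45, 0, 135, 90)]
s_qm = chsh(e_qm, *angles)

print("quantum value:", s_qm)
print("2*sqrt(2):    ", 2 * sqrt(2))
print("local bound:  ", local_bound())
```

Every deterministic local assignment gives exactly 2 for this combination, since $|a - a'| + |a + a'| = 2$ whenever $a, a' = \pm 1$, whereas the quantum correlations reach $2\sqrt{2}$.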

From this we learn that we cannot speak of complementarity as a consequence of some disturbance of a system in the measurement if there are no objective properties to disturb². An important feature of the analysis so far is that we have to base our concept of complementarity on the much more fundamental concept of information. Any firm foundation of complementarity has to have recourse to the property of mutual exclusiveness of different classes of information of a quantum system. As stated by Pauli [1958] in the analysis of the uncertainty relations³: "... diese Relationen enthalten die Aussage, daß jede genaue Kenntnis des Teilchenortes zugleich eine prinzipielle Unbestimmtheit, nicht nur Unbekanntheit des Impulses zur Folge hat und umgekehrt. Die Unterscheidung zwischen (prinzipieller) Unbestimmtheit und Unbekanntheit und der Zusammenhang beider Begriffe sind für die ganze Quantentheorie entscheidend."

We note that a view of information as the most fundamental concept in

quantum mechanics also leads to the most natural understanding of new phe-

nomena in quantum computation [Barenco et al., 1995(a)], entanglement swap-

ping [Zukowski et al., 1993], [Pan et al., 1998], quantum cloning [Wootters and

Zurek, 1982], [Buzek and Hillery, 1996] and quantum communication such as

quantum dense coding [Mattle et al., 1996], quantum cryptography [Bennett et

al., 1992] and quantum teleportation [Bennett et al., 1993], [Bouwmeester et al., 1997].

² Bohr dislikes phrases like "disturbing phenomena by observations" exactly because of their potential for confusion. He stresses [Bohr, 1958] the use of the word "phenomenon" exclusively to refer to observations made under specific circumstances, including an account of the whole experimental arrangement.

³ Translated: "... these relations contain the statement that any precise knowledge of the position of a particle implies a fundamental indefiniteness, not just an unknownness, of the momentum as a consequence, and vice versa. The distinction between (fundamental) indefiniteness and unknownness, and the connection between these two notions, are decisive for the whole of quantum theory."



1.2 Conceptual Inadequacy of the Shannon Information in a Quantum Measurement

Shannon's measure of information is generally considered to be very useful to describe information in a physical observation. Here we will see that, while this is rather natural in classical physics, it becomes problematic and even untenable in quantum physics.

There are various ways to motivate the Shannon measure of information. In an operational approach Shannon's information is introduced as the expected minimal number of binary questions, i.e. questions with "yes" or "no" answers only, required to discern the outcome of an experiment. In an axiomatic approach the Shannon measure is uniquely specified by Shannon's postulates, which establish some intuitively clear relations between individual amounts of information gained in different individual observations. And in a physical approach Shannon's information is characterized in terms of some natural properties which are essential from the point of view of the physics considered.

When investigating these three approaches in the next sections we will no-

tice that each approach contains an element that escapes complete and full

description in quantum mechanics. This element is always associated with the

objective randomness of individual quantum events and with quantum comple-

mentarity.

1.2.1 An Operational Approach

For classical observations the case for Shannon's information can be strengthened through an operational approach to the question. To carry this out, consider the following example. An urn is filled with colored balls. The proportions in which the different colors are present are known. Now the urn is shaken, and we draw

a single ball. To what extent can we predict the color of the drawn ball? If

all the balls in the urn are of the same color, we can completely predict the

outcome of the draw. On the other hand, if the various colors are present in

equal proportions, we are completely uncertain about the outcome. One can

think of these situations as extreme cases on a varying scale of predictability.

As a specific example consider an urn containing balls of four colors: black, white, red, and green, with the proportions $p_1 = \frac{1}{2}$, $p_2 = \frac{1}{4}$, $p_3 = \frac{1}{8}$ and $p_4 = \frac{1}{8}$, respectively. Suppose now that one wishes to learn the color of the drawn ball by asking questions to which only "yes" or "no" can be given as an answer. Of course, the number of questions needed will depend on the questioning strategy




Figure 1.3: Binary question tree to determine the color of a drawn ball. The proportions in which black, white, red and green colors are present are $p_1 = \frac{1}{2}$, $p_2 = \frac{1}{4}$, $p_3 = \frac{1}{8}$ and $p_4 = \frac{1}{8}$, respectively.

adopted. To make this strategy optimal, that is, to gain the maximal expected information from each yes-or-no question, we evidently have to ask questions whose answers will strike out half of the possibilities.

Indeed, a good question to start with is to ask "Is the color of the drawn ball black?" (Fig. 1.3), the virtue being that, regardless of the answer "yes" or "no", we will be able to strike out a weighted half of the possibilities. If the answer is "yes", then we are done. If the answer is "no", one may divide the set that remains after this first round into two parts of equal probability, {white} and {red, green}, and proceed by posing the question "Is the color of the drawn ball white?". Again, if the answer is "yes", we are done, and if the answer is "no" we proceed in a similar fashion until the identity of the outcome is at hand. A particular outcome is specified by writing down, in order, the "yes"s and "no"s encountered in travelling from the root to the specific leaf of the tree schematically depicted in Fig. 1.3. It is easy to see that, following the above optimal strategy, the mean minimal number of binary questions needed to determine the color of the drawn ball is

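$$p_1 \cdot 1 + p_2 \cdot 2 + (p_3 + p_4) \cdot 3 = \frac{1}{2} \cdot 1 + \frac{1}{4} \cdot 2 + \left(\frac{1}{8} + \frac{1}{8}\right) \cdot 3 = \frac{7}{4}.$$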

Notice that this may be written as

$$-\frac{1}{2}\log\frac{1}{2} - \frac{1}{4}\log\frac{1}{4} - \frac{1}{8}\log\frac{1}{8} - \frac{1}{8}\log\frac{1}{8} = -\sum_{i=1}^{4} p_i \log p_i,$$

where the logarithm is taken to base 2.
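The count of 7/4 questions can be checked numerically. The following Python sketch is only an illustration and not part of the original argument. It assumes that the optimal question tree can be built with Huffman's merging procedure (a standard construction that we substitute for the verbal "strike out half" rule of the text), and the function names `shannon_information` and `mean_questions` are hypothetical.

```python
import heapq
import math
from fractions import Fraction


def shannon_information(probs):
    """Return -sum p_i log2 p_i in bits; outcomes with p_i = 0 contribute nothing."""
    return -sum(float(p) * math.log2(float(p)) for p in probs if p > 0)


def mean_questions(probs):
    """Mean number of yes/no questions for a Huffman-built question tree (assumed optimal)."""
    # Heap entries: (probability of subtree, tie-breaker, expected depth accumulated so far).
    heap = [(Fraction(p), index, Fraction(0)) for index, p in enumerate(probs)]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        p_a, _, cost_a = heapq.heappop(heap)
        p_b, _, cost_b = heapq.heappop(heap)
        # Merging two subtrees adds one more question for every outcome below them.
        heapq.heappush(heap, (p_a + p_b, counter, cost_a + cost_b + p_a + p_b))
        counter += 1
    return heap[0][2]


# Proportions of black, white, red and green balls from the example in the text.
urn = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 8)]
print(mean_questions(urn))       # 7/4
print(shannon_information(urn))  # 1.75
```

For these proportions, which are all powers of 1/2, the two numbers coincide. For general proportions the mean number of questions for a single draw can exceed the Shannon value, which is the reason for passing to N draws in what follows.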

Now of course for an arbitrary probability distribution $p_1$, $p_2$, $p_3$ and $p_4$ over a set of colors, a division into two sets of equal probability is not always possible. One may then consider a generalized situation where we draw a ball N times without replacing the drawn ball. We assume again that we wish to learn the colors of the N drawn balls by asking questions to which only "yes" or "no" can be given as an answer. Now, however, questions of a mixed type may be asked, like "Is the color of the first drawn ball black or white, of the second drawn ball red, ..., and of the Nth black or white or green?". In this manner it becomes easier to find questions for which the probabilities of "yes" and "no" are approximately equal, and thus the total number of questions needed can be reduced.

Suppose p1N, p2N, p3N and p4N are all integers, then the probability of

obtaining the sequence containing p1N black balls, p2N white balls, p3N red

balls and p4N green balls is [Shannon, 1949]

$$p(\text{sequence}) = p_1^{N p_1} p_2^{N p_2} p_3^{N p_3} p_4^{N p_4} = \frac{1}{2^{NH}}$$

where

$$H = -\sum_{i=1}^{4} p_i \log p_i$$ (1.4)

is the Shannon information expressed in bits when the logarithm is taken to base 2. Such a sequence is called a typical sequence⁴. Notice that a particular

typical sequence is specified by the particular order of the balls distinguishable

by the particular color sequence. The total number of typical sequences can

be obtained as the number of distinguishable permutations of N balls made up

of 4 groups of black, white, red and green balls indistinguishable within each

group. If N is sufficiently large then

$$\frac{N!}{(p_1 N)!\,(p_2 N)!\,(p_3 N)!\,(p_4 N)!} \approx 2^{NH},$$ (1.5)

where we use the Stirling approximation $N! \approx \sqrt{2\pi N}\, N^N e^{-N}$. Hence, the typical sequences all have equal probability, and there are $2^{NH}$ of them.
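The approximation (1.5) can be illustrated numerically. The sketch below is an illustration under our own assumptions, not part of the original derivation: it evaluates the logarithm of the multinomial coefficient with the log-gamma function and compares the number of bits per draw with H for growing N. The helper names `log2_typical_count` and `bits_per_draw` are hypothetical.

```python
import math


def shannon_information(probs):
    """Return -sum p_i log2 p_i in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def log2_typical_count(counts):
    """log2 of N! / (n_1! n_2! ... n_k!), computed via log-gamma to avoid huge integers."""
    n_total = sum(counts)
    log_e = math.lgamma(n_total + 1) - sum(math.lgamma(n + 1) for n in counts)
    return log_e / math.log(2)


def bits_per_draw(probs, n_draws):
    """log2(number of sequences with exactly p_i * N balls of color i), divided by N."""
    counts = [round(p * n_draws) for p in probs]  # assumes every p_i * N is an integer
    return log2_typical_count(counts) / n_draws


probs = [0.5, 0.25, 0.125, 0.125]
for n_draws in (8, 80, 800, 8000, 80000):
    print(n_draws, bits_per_draw(probs, n_draws), shannon_information(probs))
```

The printed ratio approaches H = 1.75 bits from below as N grows, which is the sense in which (1.5) holds: the two sides agree to leading order in the exponent.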

Let us now turn back to our problem. We wish to learn the colors of N drawn balls by asking questions to which only "yes" or "no" can be given as the answer.

⁴ To be specific, we define the set of typical sequences to be all sequences such that

$$2^{-N(H+\epsilon)} \le p(\text{sequence}) \le 2^{-N(H-\epsilon)}, \quad \epsilon > 0.$$

Now, it can be shown that the probability that N outcomes actually form a typical sequence is greater than $1 - \epsilon$, for sufficiently large N, no matter how small $\epsilon$ might be.




Figure 1.4: Binary question tree to determine the specific sequence of outcomes (color of the drawn balls) in a sufficiently large number N of experimental trials (number of drawings). An urn is filled with black and white balls with proportions $p_1$ and $p_2$, respectively. The expected number of questions needed to determine the actual sequence of outcomes is $N \cdot H$, where $H = -p_1 \log p_1 - p_2 \log p_2$.

If we address this problem in a piece-wise manner, determining the colors of

the drawn balls one after another, the number of questions needed will just be

N times that needed for a single ball.

However, we may use another strategy. Suppose N is sufficiently large that the sequence of N drawn balls contains close to $p_1 N$ black balls, $p_2 N$ white balls, $p_3 N$ red balls and $p_4 N$ green balls. In other words, suppose the N drawn balls form a typical sequence. Now, in order to learn the colors of the drawn balls we need only to identify which particular typical sequence is actually drawn. Since there are $2^{NH}$ possible typical sequences and all of them have equal probability to be drawn, the minimal number of yes-no questions needed is just $NH$. Or equivalently, the Shannon information⁵ expressed in bits is the

minimal number of yes-no questions necessary to determine which particular

sequence of outcomes occurs, divided by N [Feinstein, 1958], [Uffink, 1990].

This is known as the noiseless coding theorem. An explicit example with an

urn containing balls of two different colors is given in Fig. 1.4. A generalization

⁵ The Shannon information therefore refers to the information about an individual outcome

of an experiment. This should be contrasted to the cases where the notion of information refers

to knowledge about an unknown parameter in a probability distribution [Fisher, 1925], or the

information for discriminating between two probability distributions [Kullback, 1959], or the

information that one event provides about another event [Gelfand and Yaglom, 1957].




for the probability distribution p1, p2, ..., pn over a finite set of n colors may

easily be obtained.
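The statement that N outcomes almost surely form a typical sequence can be made concrete for the two-color urn. The sketch below is a hedged illustration: it assumes independent draws with fixed proportions (our simplifying assumption for the calculation), uses the $\epsilon$-criterion of the footnote on typical sequences, and introduces the hypothetical function name `typical_probability`.

```python
import math


def typical_probability(p_black, n_draws, eps):
    """Probability that n_draws independent draws form an eps-typical sequence."""
    p_white = 1.0 - p_black
    h = -(p_black * math.log2(p_black) + p_white * math.log2(p_white))
    total = 0.0
    for k in range(n_draws + 1):  # k = number of black balls in the sequence
        # -log2 of the probability of one particular sequence with k black balls, per draw.
        rate = -(k * math.log2(p_black) + (n_draws - k) * math.log2(p_white)) / n_draws
        if abs(rate - h) <= eps:
            # Add the probability of all sequences with exactly k black balls (in log form).
            log_binom = (math.lgamma(n_draws + 1) - math.lgamma(k + 1)
                         - math.lgamma(n_draws - k + 1))
            log_prob = log_binom + k * math.log(p_black) + (n_draws - k) * math.log(p_white)
            total += math.exp(log_prob)
    return total


for n_draws in (100, 1000, 10000):
    print(n_draws, typical_probability(0.25, n_draws, 0.05))
```

With these illustrative values (one quarter black balls, $\epsilon$ = 0.05) the printed probability grows towards 1 as N increases, so that for large N it suffices to identify one of roughly $2^{NH}$ typical sequences.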

We now analyze Shannon's notion of information in a quantum measurement. In particular we consider a beam of photons prepared with vertical polarization and analyzed by a filter polarized at an angle of 45° from the vertical position. Each individual photon, when it encounters the polarization filter, has exactly two equally probable options: to pass straight through the filter (we call this the outcome "1") or to be absorbed by the filter (the outcome "0"). Now suppose we perform the polarization experiment a sufficiently large number N of times so that the sequence of actual outcomes forms a typical sequence.

We observe a particular sequence of 1s and 0s. An individual outcome observed in a single experimental trial is fundamentally random and cannot be assumed to reveal a property of an individual photon, assigned before the measurement is performed, to pass through the filter or to be absorbed by the filter. The principal indefiniteness, in the sense of the fundamental nonexistence of a detailed description of and prediction for the individual quantum event resulting in the particular measurement result, implies that the particular outcome sequence of 1s and 0s, specified by writing down, in order, the "yes"s and "no"s encountered in a row of yes/no questions asked, is not defined before the measurement is performed. Consequently, Shannon's information, defined as the number of yes/no questions needed to determine the particular order of 1s and 0s in the actual sequence of outcomes, cannot be assumed to describe our ignorance about the future measurement results that exists before the measurements are performed and is removed afterwards: no individual outcome, and hence no particular order of 1s and 0s observed in the sequence of measurements, is defined before the measurements are performed.

Of course, after the measurement is performed and its actual result becomes known, the information necessary to specify the measurement result is quantified by the Shannon measure of information. Yet, this information has no reference to the particular experimental situation given before the experiment is performed, and therefore it is not appropriate to define the information about the system that is gained by the performance of the experiment. In the sense that an individual quantum event manifests itself only in the measurement process and is not precisely defined before the measurement is performed, we may speak of a "creation" of Shannon's information in the measurement. In our explicit example, the amount of created information is maximal because vertical polarization and polarization at 45° are maximally complementary attributes. It is interesting to contrast this with Shannon's [1949] writing of information as being "produced by a source".

The Shannon information is surely adequate for the situation in classical physics, where we can always mentally split the ensemble into its constituents and where the stochastic behavior of the whole ensemble follows from the behavior of its intrinsically different individual constituents, which can be thought of as being defined to any precision. In classical physics, this can be done even in situations where we have no way to distinguish the individual constituents and their behavior experimentally. If we perform a sequence of measurements on the ensemble, the particular order of individual events that is recorded is predetermined and originates in the intrinsic properties that the individual constituents possess before the measurements. The Shannon information may then be assumed to measure the information necessary to reveal the property of an individual system of the ensemble given before measurements are performed. Again, this cannot be assumed in a quantum measurement, because a quantum measurement, with the only exception being that of a system in an eigenstate of the measured observable, changes the state of the system into a new state in a fundamentally unpredictable way, and thus cannot be claimed to reveal a property existing before the measurement is performed. In fact, as theorems like that of Kochen and Specker [Kochen and Specker, 1967] show, in quantum mechanics it is not possible, not even in principle, to assign to a quantum system properties corresponding to all possible measurements.

1.2.2 An Axiomatic Approach

An important reason given in the literature for preferring the Shannon measure of information is that it is uniquely characterized by Shannon's intuitively reasonable postulates, so that alternative expressions should be rejected. This has been expressed strongly by Jaynes [1957]: "One ... important reason for preferring the Shannon measure is that it is the only one that satisfies ... [Shannon's postulates]. Therefore one expects that any deduction made from other information measures, if carried far enough, will eventually lead to contradiction." A good way to continue our discussion is by reviewing how Shannon, using his postulates, arrived at his famous expression. He writes [1949]:

Suppose we have a set of possible events whose probabilities of occurrence are $p_1, p_2, \ldots, p_n$. These probabilities are known but that is all we know concerning which event will occur. Can we find a measure of how much choice is involved in the selection of the event or how uncertain we are of the outcome? If there is such a measure, say $H(p_1, p_2, \ldots, p_n)$, it is reasonable to require of


Figure 1.5: Decomposition of a choice from three possibilities. Figure taken from [Shannon, 1949].

it the following properties:

1. H should be continuous in the pi.

2. If all the $p_i$ are equal, $p_i = \frac{1}{n}$, then H should be a monotonic increasing function of n. With equally likely events there is more choice, or

uncertainty, when there are more possible events.

3. If a choice be broken down into two successive choices, the original H

should be the weighted sum of the individual values of H. The meaning of

this is illustrated in Fig. 1.5. At the left we have three possibilities $p_1 = \frac{1}{2}$, $p_2 = \frac{1}{3}$, $p_3 = \frac{1}{6}$. On the right we first choose between two possibilities each with probability $\frac{1}{2}$, and if the second occurs make another choice with probabilities $\frac{2}{3}$, $\frac{1}{3}$. The final results have the same probabilities as before. We require, in this special case, that

$$H\left(\frac{1}{2}, \frac{1}{3}, \frac{1}{6}\right) = H\left(\frac{1}{2}, \frac{1}{2}\right) + \frac{1}{2} H\left(\frac{2}{3}, \frac{1}{3}\right).$$

The coefficient $\frac{1}{2}$ is the weighting factor introduced because this second choice occurs half the time.

Shannon then shows that only the function (1.4) satisfies all three postulates.
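That the function (1.4) indeed satisfies the special case quoted in the third postulate can be verified directly. The following short Python check is an illustration added here, with the hypothetical helper name `H`; it evaluates both sides of the decomposition of Fig. 1.5 in bits.

```python
import math


def H(*probs):
    """Shannon information -sum p_i log2 p_i in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


# Direct choice among three possibilities.
left = H(1 / 2, 1 / 3, 1 / 6)
# Choice between two halves, followed (half of the time) by a 2/3 : 1/3 choice.
right = H(1 / 2, 1 / 2) + (1 / 2) * H(2 / 3, 1 / 3)

print(left, right)
```

Both expressions evaluate to about 1.459 bits, as the postulate requires.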

It is clear from the way Shannon formulated the problem, that H is in-

troduced as an uncertainty about the outcome of an experiment based on a

given probability distribution. The uncertainty arises, of course, because the

probability distribution does not enable us to predict exactly what the actual

outcome will be. This uncertainty is, of course, removed when the experiment is

performed and its actual outcome becomes known. Thus, we may think of H as the amount of information that is gained by the performance of the experiment.

We now turn to the discussion of Shannon's postulates. While the first two postulates are purely qualitative and natural for every meaningful measure of information, the last postulate might appear to have no immediate intuitive appeal. The third Shannon postulate, originally formulated as an example, was reformulated as an exact rule by Faddeev [1957]: For every $n \ge 2$

$$H(p_1, \ldots, p_{n-1}, q_1, q_2) = H(p_1, \ldots, p_{n-1}, p_n) + p_n H\left(\frac{q_1}{p_n}, \frac{q_2}{p_n}\right),$$ (1.6)

where $p_n = q_1 + q_2$.

Without physical interpretation the recursion postulate (1.6) is merely a mathematical expression which is certainly necessary for the uniqueness of the function (1.4) but has no further physical significance. We adopt the following well-known interpretation [Uffink, 1990], [Jaynes, 1996]. Assume the possible outcomes of the experiment to be $a_1, \ldots, a_n$ and $H(p_1, \ldots, p_n)$ to represent the amount of information that is gained by the performance of the experiment. Now, decompose event $a_n$ into two distinct events $a_n \wedge b_1$ and $a_n \wedge b_2$ ($\wedge$ denotes "and", thus $a \wedge b$ denotes a joint event). Denote the probabilities of outcomes $a_n \wedge b_1$ and $a_n \wedge b_2$ by $q_1$ and $q_2$, respectively. Then the left-hand side $H(p_1, \ldots, p_{n-1}, q_1, q_2)$ of Eq. (1.6) represents the amount of information that is gained by the performance of the experiment with outcomes $a_1, \ldots, a_{n-1}, a_n \wedge b_1, a_n \wedge b_2$. When the outcome $a_n$ occurs, the conditional probabilities for $b_1$ and $b_2$ are $\frac{q_1}{p_n}$ and $\frac{q_2}{p_n}$ respectively, and the amount of information gained by the performance of the conditional experiment is $H\left(\frac{q_1}{p_n}, \frac{q_2}{p_n}\right)$. Hence the recursion requirement states that the information gained in the experiment with outcomes $a_1, \ldots, a_{n-1}, a_n \wedge b_1, a_n \wedge b_2$ equals the sum of the information gained in the experiment with outcomes $a_1, \ldots, a_n$ and the information gained in the conditional experiment with outcomes $b_1$ or $b_2$, given that the outcome $a_n$ occurred with probability $p_n$. This interpretation implies that the third postulate can be rewritten as

$$H(p(a_1), \ldots, p(a_{n-1}), p(a_n \wedge b_1), p(a_n \wedge b_2)) = H(p(a_1), \ldots, p(a_{n-1}), p(a_n)) + p(a_n) H(p(b_1|a_n), p(b_2|a_n)),$$ (1.7)

where

$$p(a_n) = p(a_n \wedge b_1) + p(a_n \wedge b_2), \quad p(a_n \wedge b_1) = p(a_n)\, p(b_1|a_n) \quad \text{and} \quad p(a_n \wedge b_2) = p(a_n)\, p(b_2|a_n).$$

Here $p(b_i|a_n)$, $i = 1, 2$, denotes the conditional probability for outcome $b_i$ given that the outcome $a_n$ occurred, and $p(a_n \wedge b_i)$ denotes the joint probability that outcome $a_n \wedge b_i$ occurs.

If we analyze the generalized situation with n outcomes $a_i$ of the first experiment A, m outcomes $b_j$ of the conditional experiment B and mn outcomes $a_i \wedge b_j$ of the joint experiment $A \wedge B$, we may then rewrite the recursion postulate in a short form as

$$H(A \wedge B) = H(A) + H(B|A)$$ (1.8)

where $H(B|A) = \sum_{j=1}^{n} p(a_j) H(b_1|a_j, \ldots, b_m|a_j)$ is the average of the information gained by observation B given that the conditional outcome $a_j$ occurred, weighted by the probability $p(a_j)$ for $a_j$ to occur.
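For a classical joint probability table the short form (1.8) can be checked numerically. The sketch below is illustrative only: the joint table is an arbitrary example chosen by us, not taken from the text, and the function name `chain_rule_sides` is hypothetical.

```python
import math


def H(probs):
    """Shannon information -sum p_i log2 p_i in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def chain_rule_sides(joint):
    """joint[i][j] = p(a_i and b_j). Returns (H(A and B), H(A) + H(B|A))."""
    p_a = [sum(row) for row in joint]
    h_joint = H([p for row in joint for p in row])
    # H(B|A): information of each conditional distribution, weighted by p(a_i).
    h_b_given_a = sum(pa * H([p / pa for p in row])
                      for pa, row in zip(p_a, joint) if pa > 0)
    return h_joint, H(p_a) + h_b_given_a


# Arbitrary illustrative table with n = 2 outcomes of A and m = 3 outcomes of B.
joint = [[0.10, 0.20, 0.10],
         [0.30, 0.05, 0.25]]
print(chain_rule_sides(joint))
```

The two returned numbers agree up to rounding. The check presupposes that a joint probability $p(a_i \wedge b_j)$ exists for every pair of outcomes, which is exactly the classical assumption discussed next.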

It is essential to note that the recursion postulate is inevitably related to the manner in which we gain information in a classical measurement. In fact, in classical measurements it is always possible to assign to a system simultaneously attributes corresponding to all possible measurements, here $a_i$, $b_j$ and $a_i \wedge b_j$. Also, the interaction between measuring apparatus and classical system can be thought to be made arbitrarily small so that the experimental determination of A has no influence on our possibility to predict the outcomes of the possible future experiment B. In conclusion, the information expected from the joint experiment $A \wedge B$ is simply the sum of the information expected from the first experiment A and the conditional information of the second experiment B with respect to the first, as predicted by Eq. (1.8).

In contrast we know that in a quantum measurement it is not possible to

assign to a system simultaneously complementary attributes, like position and

momentum, or the path of the system and the position of appearance in the

interference pattern in the double-slit experiment, or the spin values along orthogonal directions. Therefore Shannon's crucial third postulate (1.8), necessary for uniqueness of Shannon's measure of information, is not well-defined in quantum mechanics when A and B are measurements of mutually complementary

attributes. Consequently, the Shannon measure loses its preferential status with

respect to alternative expressions when applied to define information gain in

quantum measurements.

Here a certain misconception might be put forward that arises from a certain operational point of view. According to that view, for example, complementarity between interference pattern and information about the path of the system in the double-slit experiment arises from the fact that any attempt to observe the particle path would be associated with an uncontrollable disturbance of the particle. Such a disturbance in itself would then be the reason for the loss of the interference pattern. In such a view it would be possible to define Shannon's information for all attributes of the system simultaneously, and the third Shannon postulate would be violated because of the unavoidable disturbance of the system occurring whenever the subsequently measured property B is incompatible with the previous one A. Yet, this is a misconception, not only because it was shown [Bell, 1964], [Greenberger et al., 1989, 1990] that it is in principle impossible to assign to a quantum system simultaneously observation-independent properties (which in order to be in agreement with special relativity have to be local), but also because some experiments have already been performed [Herzog et al., 1995] where the reason why no interference pattern arises is not an uncontrollable disturbance of the quantum system (see also Sec. 1.1).

We next introduce two requirements that are immediate consequences of

Shannon's postulate and in which all the probabilities that appear are well-defined in quantum mechanics. We will show that the two requirements are

violated by the information gained in quantum measurements.

1. Every new observation reduces our ignorance and increases our knowledge.

In his work Shannon [1949] offers a list of properties to substantiate that H is a reasonable measure of information. He writes: "It is easily shown that

$$H(A \wedge B) \le H(A) + H(B)$$

with equality only if the events are independent (i.e., $p(a_i \wedge b_j) = p(a_i)p(b_j)$). The uncertainty of a joint event is less than or equal to the sum of the individual uncertainties." He continues further in the text: "... we have

$$H(A) + H(B) \ge H(A \wedge B) = H(A) + H(B|A).$$

Hence,

$$H(B) \ge H(B|A).$$ (1.9)

The uncertainty of B is never increased by knowledge of A. It will be decreased unless A and B are independent events, in which case it is not changed" (we have changed Shannon's notation to coincide with that of our work).

2. Information is indifferent to the order of acquisition. The total amount of information gained in successive measurements is independent of the order in which it is acquired, so that the amount of information gained by the observation of A followed by the observation of B is equivalent to the amount of information gained from the observation of B followed by the observation of A

$$H(A) + H(B|A) = H(B) + H(A|B).$$ (1.10)

This is an immediate consequence of the recursion postulate, which can be obtained when we write the recursion postulate in two different ways


Figure 1.6: Indifference of information to the order of its acquisition in classical measurements. A box is filled with balls of different compositions (plastic and wooden balls) and different colors (black and white balls). Now, the box is shaken. In Fig. a) we first draw a ball asking about the color of the drawn ball and gain H(color) = 1 bit of information. Subsequently, we put the black and white balls in separate boxes, draw a ball from each box separately and ask about the composition of the drawn ball. We gain $H_{bl}$(comp.) = 0 bits for the black balls and $H_{wh}$(comp.) = 1 bit for the white balls. In Fig. b) we pose the two questions in the opposite order. We first ask about the composition of the drawn ball and gain H(comp.) = 0.81 bits. In a conditional drawing we ask about the color of the drawn ball and gain $H_{wo}$(color) = 0 bits for the wooden balls and $H_{pl}$(color) = 0.92 bits for the plastic balls. The total information gained is independent of the order of the two questions asked, i.e. H(color) + 1/2 $H_{bl}$(comp.) + 1/2 $H_{wh}$(comp.) = H(comp.) + 1/4 $H_{wo}$(color) + 3/4 $H_{pl}$(color) = 1.5 bits.

depending on whether the observation of A is followed by the observa-

tion of B or vice versa. An explicit example for a sequence of classical

measurements is given in Fig. 1.6.
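Both requirements can be checked for the classical example of Fig. 1.6. The sketch below is an illustration; the joint proportions (one half black plastic, one quarter white plastic, one quarter white wooden balls) are read off the values quoted in the figure caption, and the helper names `marginal` and `conditional_information` are hypothetical.

```python
import math


def H(probs):
    """Shannon information -sum p_i log2 p_i in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


# (color, composition) -> proportion, as inferred from the caption of Fig. 1.6.
joint = {("black", "plastic"): 1 / 2,
         ("white", "plastic"): 1 / 4,
         ("white", "wood"): 1 / 4}


def marginal(joint, axis):
    """Distribution of the attribute at position `axis` (0 = color, 1 = composition)."""
    out = {}
    for key, p in joint.items():
        out[key[axis]] = out.get(key[axis], 0.0) + p
    return out


def conditional_information(joint, given_axis):
    """Average information about the other attribute once `given_axis` is known."""
    total = 0.0
    for value, p_value in marginal(joint, given_axis).items():
        branch = [p / p_value for key, p in joint.items() if key[given_axis] == value]
        total += p_value * H(branch)
    return total


color_first = H(marginal(joint, 0).values()) + conditional_information(joint, 0)
composition_first = H(marginal(joint, 1).values()) + conditional_information(joint, 1)
print(color_first, composition_first)
```

Both orders give 1.5 bits, in agreement with (1.10), and H(comp.) of about 0.81 bits is not smaller than the conditional value of 0.5 bits obtained once the color is known, in agreement with (1.9).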

Are these two requirements satisfied by information gained in quantum measurements? Consider a beam of randomly polarized photons. Filters $F_{\updownarrow}$, $F_{\nearrow}$ and $F_{\leftrightarrow}$ are oriented vertically, at +45°, and horizontally respectively, and can be placed so as to intersect the beam of photons (Fig. 1.7). If we insert filter $F_{\updownarrow}$ the intensity at the detection plate will be half of the intensity of the incoming beam. The outgoing photons are now all vertically polarized. Notice that the function of filter $F_{\updownarrow}$ cannot be explained as a "sieve" that only lets those photons pass that are already vertically polarized in the incoming beam. If that were the case, only a certain small number of randomly polarized incoming photons would be vertically polarized, so we would expect a much larger attenuation of the intensity of the beam as it passes the filter.

Insertion of filters $F_{\nearrow}$ and $F_{\leftrightarrow}$ corresponds to the measurements of A (polarization at +45°) and B (horizontal polarization), respectively. Now, when filter $F_{\leftrightarrow}$ is inserted behind the filter $F_{\updownarrow}$, the intensity of the outgoing beam drops to zero. None of the photons with vertical polarization can pass through the


Figure 1.7: New observation (of polarization at 45°) reduces our knowledge (of the vertical/horizontal polarization) at hand from a previous observation. Filters $F_{\updownarrow}$, $F_{\nearrow}$ and $F_{\leftrightarrow}$ are oriented vertically, at +45° and horizontally, respectively. If filter $F_{\leftrightarrow}$ is inserted behind the filter $F_{\updownarrow}$ (Fig. a), no photons are observed at the detector plate. In that case we have complete knowledge of the vertical/horizontal polarization of the photon. After filter $F_{\nearrow}$ is inserted between $F_{\updownarrow}$ and $F_{\leftrightarrow}$ (Fig. b), a certain number of photons will be observed at the detection plate. Here acquisition of information about the polarization of the photon at 45° leads to a decrease of our knowledge about vertical/horizontal polarization of the photon.

horizontal filter as shown in Fig. 1.7a. In this case we have complete knowledge of the property B, i.e. H(B) = 0. Notice that a sieve model where $F_{\leftrightarrow}$ ($F_{\updownarrow}$) only lets those photons pass that have already horizontal (vertical) polarization in the incoming beam could explain this behaviour. Now, after filter $F_{\nearrow}$ is inserted between $F_{\updownarrow}$ and $F_{\leftrightarrow}$, a certain intensity will be visible at the detection plate, exactly $\frac{1}{4}$ of the intensity of the beam passed through $F_{\updownarrow}$, as shown in Fig. 1.7b. In that case, a certain number of photons that passed through $F_{\updownarrow}$ will also pass through $F_{\leftrightarrow}$. Therefore, acquisition of information about the polarization of the photon at 45° leads to a decrease of our knowledge about horizontal polarization of the photon, implying H(B|A) > 0. Consequently, $0 = H(B) \ngeq H(B|A) > 0$, which clearly violates requirement (1.9). Now, imagine after $F_{\leftrightarrow}$ we insert the filter $F_{\nearrow}$ in Fig. 1.7a (this, of course, does not make any essential change compared with the situation without the additional filter). We may consider this new experimental situation as a sequence of measurements BA. Now, information gained in the sequence BA in Fig. 1.7a differs from the information gained in the sequence AB in Fig. 1.7b, i.e. $0 = H(B) + H(A|B) \neq H(A) + H(B|A) > 0$, thus violating the requirement (1.10). Another independent example where requirement (1.10) is violated is given in Fig. 1.8.
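The intensities quoted for the filter sequences of Fig. 1.7 can be reproduced with a few lines of Python. The sketch is an illustration under stated assumptions: ideal filters, an unpolarized incoming beam of which one half passes the first filter, and a passing probability of $\cos^2$ of the angle between successive filter axes. The angle labels and the function name `transmitted_fraction` are ours.

```python
import math

# Filter axes in degrees, measured from the horizontal (labels chosen for this sketch).
VERTICAL, DIAGONAL, HORIZONTAL = 90.0, 45.0, 0.0


def transmitted_fraction(filter_angles):
    """Fraction of an unpolarized beam that passes ideal filters in the given order."""
    if not filter_angles:
        return 1.0
    fraction = 0.5  # an unpolarized photon passes the first ideal filter with probability 1/2
    for previous, current in zip(filter_angles, filter_angles[1:]):
        # After a filter the photon is polarized along its axis; the next filter
        # passes it with probability cos^2 of the relative angle.
        fraction *= math.cos(math.radians(current - previous)) ** 2
    return fraction


after_vertical = transmitted_fraction([VERTICAL])
crossed = transmitted_fraction([VERTICAL, HORIZONTAL])
with_diagonal = transmitted_fraction([VERTICAL, DIAGONAL, HORIZONTAL])
print(after_vertical, crossed, with_diagonal, with_diagonal / after_vertical)
```

Up to rounding, the crossed filters transmit nothing, while inserting the 45° filter between them lets one quarter of the beam that passed the vertical filter reach the detection plate.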

Here we have an effect which cannot be explained by a sieve model. Classical

experience suggests that the addition of a filter should only be able to decrease

the intensity of the beam getting through. In a sieve model where the filter

does not change the object, adding a new filter will always reduce the intensity

of the beam. For completeness we note that a classical wave model can explain


Figure 1.8: Dependence of information on the order of its acquisition in successive quantum measurements. A spin-1/2 particle is in the state $|z+\rangle$, spin-up along the z-axis. Spin along the x-axis and spin along the direction in the x-z plane tilted at an angle $\theta$ from the z-axis are successively measured, in the order in Fig. a) and opposite to that in Fig. b). Whereas we obtain an equal portion $H(\cos^2(\pi/4 - \theta/2), \sin^2(\pi/4 - \theta/2))$ of information in the conditional (subsequent) measurement both in Fig. a) and in Fig. b), the amounts of information $H(\cos^2 \theta/2, \sin^2 \theta/2)$ and $H(\frac{1}{2}, \frac{1}{2}) = 1$ we gain in the first measurement in Fig. a) and in the first measurement in Fig. b) respectively, can be significantly different. Specifically, for $\theta \to 0$ we have complete knowledge about spin along the direction at the angle $\theta$ in Fig. a) but absolutely no knowledge about the spin along the x-axis in Fig. b). We emphasize that we do not assume any specific functional dependence for the measure of information H.

the increase of the intensity of the wave transmitted through the filters.
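The order dependence described in Fig. 1.8 can be made quantitative if, purely for illustration, H is taken to be Shannon's expression (the figure itself does not assume any specific functional form). The following Python sketch uses the probabilities quoted in the caption; the function names `H` and `total_information` are hypothetical.

```python
import math


def H(p):
    """Shannon information in bits of a two-outcome distribution (p, 1 - p)."""
    return -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0)


def total_information(theta):
    """Return (total for Fig. a, total for Fig. b) for a tilt angle theta in radians."""
    # Conditional (second) measurement: same distribution in both orders.
    conditional = H(math.cos(math.pi / 4 - theta / 2) ** 2)
    first_a = H(math.cos(theta / 2) ** 2)  # tilted direction measured first on |z+>
    first_b = H(0.5)                       # x-axis measured first on |z+>
    return first_a + conditional, first_b + conditional


for theta in (0.0, math.pi / 8, math.pi / 4, math.pi / 2):
    print(theta, total_information(theta))
```

For small $\theta$ the two totals differ by almost one bit, and they coincide only when the tilted direction is itself the x-axis ($\theta = \pi/2$).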

In contrast to the sieve model, where adding a new filter just adds some new knowledge of the object and never decreases our knowledge at hand from the previous measurements, a quantum measurement can decrease our knowledge collected from previous measurements. This originates from the distinction between "maximal" and "complete" information in quantum physics. In classical physics the maximal information about a system is complete. In quantum physics the maximal information, represented by the state vector, is never complete in the sense that all possible future measurement results are precisely defined. Yet, we do not hesitate to emphasize that it certainly is complete in the sense that it is not possible to have more information about a system than what can be specified in its quantum state. In fact, the state vector represents that information which is necessary to arrive at the maximum possible set of probabilistic predictions for all possible future observations of the system.

In our explicit example the state vector of the polarization of a photon can be expressed as $|\psi\rangle = a|{\updownarrow}\rangle + b|{\leftrightarrow}\rangle$ (a and b are complex numbers) in the basis of vertical $|{\updownarrow}\rangle$ and horizontal $|{\leftrightarrow}\rangle$ polarization. The probability to observe vertical polarization is $|a|^2$ and the probability to observe horizontal polarization is $|b|^2$. Measurement of vertical/horizontal polarization will change the state to an eigenstate associated with the result of the measurement. In our example, if measurement by filter $F_{\updownarrow}$ results in vertical polarization, then the state changes to $|{\updownarrow}\rangle$, and when the polarization is measured again with respect to the same basis by $F_{\leftrightarrow}$, it will return vertical polarization with probability one. Thus, no photon will have the property of horizontal polarization, as indicated in Fig. 1.7a, implying H(B) = 0. In Fig. 1.7b, a photon passing through $F_{\updownarrow}$ with the state $|{\updownarrow}\rangle$ will pass filter $F_{\nearrow}$ with a probability of 1/2, and so 50% of the photons will pass through $F_{\nearrow}$. A photon passing through $F_{\nearrow}$ changes the state from $|{\updownarrow}\rangle$ to $|{\nearrow}\rangle = \frac{1}{\sqrt{2}}(|{\updownarrow}\rangle + |{\leftrightarrow}\rangle)$, indicating gain of the new knowledge (about polarization at 45°) at the expense of unavoidable and irrecoverable loss of the prior knowledge (about vertical/horizontal polarization). As before, this photon will pass $F_{\leftrightarrow}$ with a probability of 1/2. Thus, the probability for a photon to pass the sequence of filters $F_{\nearrow}F_{\leftrightarrow}$ is 1/4, implying H(B|A) = 0.56.
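The change of the state vector in successive filter measurements can be traced explicitly. The Python sketch below is an illustration with our own conventions: states are pairs of amplitudes (vertical, horizontal), a filter is specified by the angle of its axis from the vertical, and the function name `pass_filter` is hypothetical.

```python
import math


def pass_filter(state, angle_deg):
    """Project a polarization state onto an ideal linear filter.

    state: (amplitude for vertical, amplitude for horizontal), possibly complex.
    angle_deg: filter axis, measured from the vertical.
    Returns (probability of passing, state after passing).
    """
    axis = (math.cos(math.radians(angle_deg)), math.sin(math.radians(angle_deg)))
    overlap = axis[0] * state[0] + axis[1] * state[1]  # <axis|state>; the axis is real
    return abs(overlap) ** 2, axis


vertical = (1.0, 0.0)  # state after the vertical filter

# Fig. 1.7a: the horizontal filter directly behind the vertical one.
p_blocked, _ = pass_filter(vertical, 90.0)

# Fig. 1.7b: the 45 degree filter first, then the horizontal filter.
p_diagonal, diagonal = pass_filter(vertical, 45.0)
p_horizontal, _ = pass_filter(diagonal, 90.0)

print(p_blocked, p_diagonal, p_horizontal, p_diagonal * p_horizontal)
```

Up to rounding, the probabilities are 0 for the crossed filters, 1/2 for each step of the second sequence, and 1/4 for passing both, as stated in the text. The returned state after the 45° filter has equal vertical and horizontal amplitudes, so the earlier certainty about vertical polarization is gone.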

Being a summary representation of the observer's in general probabilistic predictions for future observations, the quantum state normally changes in a measurement process into one of the new states defined by the measurement apparatus. After the measurement the old summary of the observer's information is at least partially lost, and a new one, established to be in accord with the change of the state, is indifferent to the knowledge collected from the previous measurements in the whole history of the system. Such a view was assumed by Pauli [1958] who writes⁶: "Bei Unbestimmtheit einer Eigenschaft eines Systems bei einer bestimmten Anordnung (bei einem bestimmten Zustand des Systems) vernichtet jeder Versuch, die betreffende Eigenschaft zu messen, (mindestens teilweise) den Einfluß der früheren Kenntnisse vom System auf die (eventuell statistischen) Aussagen über spätere mögliche Messungsergebnisse."

1.2.3 A Physical Approach

A specific measure of information becomes a meaningful concept in physics only when it can be characterized by the properties which naturally follow from the physics considered. Such a property can be, for example, invariance of the total information content of the system under variation of modes of observation or conservation of the total information in time if there is no information exchange with an environment. We will show that for a quantum system the total information defined according to Shannon's measure does not have these properties.

The classical world appears to be composed of particles and fields, and the nature of each one of these constituents could be specified quite independently of the particular phenomenon discussed or of the experimental procedure a physicist chooses. In other words, any concept introduced in classical physics is totally noncontextual. In particular, the total information content of a classical pointlike system (with no rotational or internal degrees of freedom), defined as Shannon's information associated with the probability distribution over the phase space, is independent of the specific set of variables (such as position and momentum, or angle and angular momentum, etc.) considered and conserved in time if there is no information exchange with an environment⁷. Operationally the total information content of a classical system can be obtained in the joint measurement of position and momentum, or in successive measurements in which the observation of position is followed by the observation of momentum or vice versa⁸.

⁶ Translated: "In the case of indefiniteness of a property of a system for a certain experimental arrangement (for a certain state of the system) any attempt to measure that property destroys (at least partially) the influence of earlier knowledge of the system on (possibly statistical) statements about later possible measurement results."

In quantum physics any concept is limited to the description of phenomena taking place within some well-defined experimental context, that is, always restricted to a specific experimental procedure the physicist chooses. This implies the question: How to define the total information content of a quantum system if, in order to be in reasonable agreement with common sense, it has to be invariant under variation of modes of observation and conserved in time if there is no information exchange with an environment?

For a given density matrix $\rho$ the von Neumann entropy

$$S(\rho) = -\mathrm{Tr}(\rho \log \rho) \tag{1.12}$$

is widely accepted as a suitable definition for the information content of a quantum system. For a system described in an N-dimensional Hilbert space this ranges from log N for a completely mixed state down to 0 for a pure state. Also, the von Neumann entropy is invariant under unitary transformations $\rho \to U \rho U^{\dagger}$. That is, it is invariant under the change of the representation (basis) of $\rho$ and also conserved in time if there is no information exchange with an environment. However, we observe that any function⁹ of the form $\mathrm{Tr}(f(\rho))$ can possess these properties for a suitably defined function $f$ and can, therefore, serve as indices of the measure of the information content of a system. We also observe that the von Neumann entropy is a property of the quantum state as a whole without explicit reference to information contained in individual measurements. The question arises: How to define and how to obtain the information content of a quantum system operationally? Here we ask precisely: What set of individual measurements should we perform, and how should we combine individual measures of information gained in different individual measurements to arrive at the total information content of a quantum system?

⁷ We discuss this in detail in Appendix A.1. Here, we note that given the probability distribution $\rho(\mathbf{r}, \mathbf{p}, t)$ over the phase space, the total lack of information of a classical system is defined by [Jaynes, 1962]

$$H_{total}(t) = -\int d^3r \, d^3p \; \rho(\mathbf{r}, \mathbf{p}, t) \log \frac{\rho(\mathbf{r}, \mathbf{p}, t)}{\mu(\mathbf{r}, \mathbf{p})}, \tag{1.11}$$

where a background measure $\mu(\mathbf{r}, \mathbf{p})$ is an additional ingredient that has to be added to the formalism to ensure invariance under variable change when we consider continuous probability distributions. The conservation of $H_{total}$ in time for a system with no information exchange with an environment is implied by the Hamiltonian evolution of a point in the phase space.

⁸ In full analogy with (1.10) we may write $H_{total}(\mathbf{r}, \mathbf{p}) = H(\mathbf{r}) + H(\mathbf{p}|\mathbf{r}) = H(\mathbf{p}) + H(\mathbf{r}|\mathbf{p})$.

⁹ The operator $f(\rho)$ is identified by having the same eigenstates as $\rho$ and the eigenvalues $f(w_j)$, equal to the function values taken at the eigenvalues $w_j$ of $\rho$.

We observe that, unlike the classical case, information carried by a quantum system cannot be obtained through a set of successive measurements in a consistent way, because information gained in successive quantum measurements depends on the order of its acquisition (see Fig. 1.8 and discussion above). This suggests that any attempt to obtain the total information content of a quantum system has to be related to the specific set of different possible experiments performed on an ensemble of equally prepared systems.

For a quantum system in the state $\rho$ different experiments correspond to different probabilities for possible outcomes and therefore to different Shannon information. How are individual measures of information obtained in different individual experiments related to the total information carried by a quantum system? It can be shown that the optimal experiment, which minimizes Shannon's information, is the one which corresponds to the orthonormal basis $\{|i\rangle\}$ formed by the eigenvectors of the density matrix $\rho$: $\rho|i\rangle = w_i|i\rangle$. The corresponding Shannon information is then equal to the von Neumann entropy, i.e.

$$H = -\sum_i w_i \log w_i = -\mathrm{Tr}(\rho \log \rho). \tag{1.13}$$

Clearly this is invariant under unitary transformations. Again this implies invariance of $H$ under the change of the representation basis of $\rho$ and also its conservation in time if there is no information exchange with an environment. That is, if we perform the optimal experiments both at time $t_0$ and at some future time $t$, the Shannon information measures associated with the optimal experiments at the two times

$$H(t) = -\sum_i w_i(t) \log w_i(t) = -\sum_i w_i \log w_i = H(t_0) \tag{1.14}$$

will be the same. Here, the eigenvalues of the density matrix at time $t$ are $w_i(t)$.
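The statement that the eigenbasis measurement minimizes the Shannon information, and that the minimum equals the von Neumann entropy, can be illustrated for a single qubit. The sketch below assumes the Bloch-vector parametrization, in which a projective measurement along a unit vector n yields probabilities (1 ± r·n)/2 and the eigenvalues of the density matrix are (1 ± |r|)/2; the Bloch vector `r` is a hypothetical example state, and logarithms are taken to base 2.

```python
import math

def h2(p):
    """Binary Shannon entropy in bits, with 0 log 0 taken as 0."""
    return -sum(x * math.log2(x) for x in (p, 1.0 - p) if x > 0.0)

def shannon_for_direction(bloch, direction):
    """Shannon information of a two-outcome projective measurement along a unit vector.

    For a qubit with Bloch vector r the outcome probabilities
    along direction n are (1 +/- r.n)/2.
    """
    dot = sum(r_k * n_k for r_k, n_k in zip(bloch, direction))
    return h2((1.0 + dot) / 2.0)

def von_neumann(bloch):
    """von Neumann entropy: the eigenvalues of rho are (1 +/- |r|)/2."""
    length = math.sqrt(sum(r_k * r_k for r_k in bloch))
    return h2((1.0 + length) / 2.0)

r = (0.3, 0.0, 0.4)  # hypothetical mixed state with |r| = 0.5

# Scan measurement directions in the x-z plane in one-degree steps.
scan = []
for k in range(360):
    phi = math.radians(k)
    n = (math.sin(phi), 0.0, math.cos(phi))
    scan.append(shannon_for_direction(r, n))

eigen_direction = tuple(r_k / 0.5 for r_k in r)  # unit vector along r
print(f"von Neumann entropy:    {von_neumann(r):.4f}")
print(f"Shannon, eigenbasis:    {shannon_for_direction(r, eigen_direction):.4f}")
print(f"Shannon, min over scan: {min(scan):.4f}")
print(f"Shannon, max over scan: {max(scan):.4f}")
```

Every scanned direction gives a Shannon information at least as large as the von Neumann entropy, with equality only for the measurement along the Bloch vector itself, which is the eigenbasis of the density matrix.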

However, without the additional knowledge of the eigenbasis of the density matrix we cannot find the optimal experiment and determine experimentally the Shannon information associated with it. Also, all the statistical predictions that can be made for the optimal measurement are the same as if we had an ordinary (classical) mixture, with fractions $w_i$ of the systems giving with certainty results that are associated with the eigenvectors $|i\rangle$. In this sense the optimal measurement is a classical type measurement, and therefore in this particular case, and only then, Shannon's measure defines the information gain in a measurement appropriately. It is thus not surprising that Shannon's measure is useful only when applied to measurements which can be understood as classical measurements. Again the question arises: How to combine individual measures of information obtained in different individual measurements in order to arrive at the information content of a quantum system if the individual measurements are incompatible with the density operator (non-optimal measurements)?

One may be tempted to define the total information content of a quantum system in a constructive fashion, namely as a sum of individual measures of information over a complete set of mutually complementary experiments. These are experiments with the property that complete knowledge of the outcome in one of the experiments excludes any knowledge of the outcomes in others. For example, a set of measurements of (1) vertical/horizontal polarization, (2) polarization at +45°/−45°, and (3) left/right circular polarization is a complete set of mutually complementary measurements for a photon's polarization.

Consider a photon's polarization state $|\psi\rangle = \cos\frac{\theta}{2}|V\rangle + \sin\frac{\theta}{2}|H\rangle$. We sum up the individual measures of Shannon's information for the mutually complementary observations (1), (2) and (3) and obtain

$$H_{total} = H_1 + H_2 + H_3 = -\left[\cos^2\frac{\theta}{2}\log\cos^2\frac{\theta}{2} + \sin^2\frac{\theta}{2}\log\sin^2\frac{\theta}{2} + \frac{1-\sin\theta}{2}\log\frac{1-\sin\theta}{2} + \frac{1+\sin\theta}{2}\log\frac{1+\sin\theta}{2}\right] + \log 2 \tag{1.15}$$

for the total Shannon information carried by the photon's polarization. The first two terms in the bracket are $H_1$, the last two are $H_2$, and the constant $\log 2$ is $H_3$, since for real amplitudes both circular polarizations are equally probable. Our result clearly depends on the parameter $\theta$ and thus is not invariant under unitary transformations. This further associates certain features with our candidate $H_{total}$ for the total information carried by the photon's polarization that strongly disagree with our intuition. Firstly, $H_{total}$ is not equal for each polarization state of the same purity. Secondly, $H_{total}$ is not specified by the polarization state alone but depends on the particular set of mutually complementary observations. If we choose another set of mutually complementary observations, e.g., (1) polarization along the direction at an angle $\varphi$ with respect to the vertical direction, (2) polarization along the direction at an angle $(\varphi + 45^\circ)$ with respect to the vertical direction, and (3) left/right circular polarization, the total information carried by the photon's polarization might not be the same (it depends on the particular value of the angle $\varphi$). And thirdly, $H_{total}$ is not conserved in time for a system completely isolated from its environment.
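The dependence of the summed Shannon information on the state can be made concrete with a few numbers. This is a minimal sketch, assuming the real-amplitude state cos(θ/2)|V⟩ + sin(θ/2)|H⟩, base-2 logarithms, and the three measurements (1) to (3) listed above; the function names are illustrative only.

```python
import math

def h2(p):
    """Binary Shannon entropy in bits, with 0 log 0 taken as 0."""
    return -sum(x * math.log2(x) for x in (p, 1.0 - p) if x > 0.0)

def h_total(theta):
    """Sum of Shannon informations over three mutually complementary measurements.

    Assumes the pure state cos(theta/2)|V> + sin(theta/2)|H> with real amplitudes.
    """
    h_vh = h2(math.cos(theta / 2.0) ** 2)        # (1) vertical/horizontal
    h_diag = h2((1.0 + math.sin(theta)) / 2.0)   # (2) +45/-45 degrees
    h_circ = h2(0.5)                             # (3) circular: always 50/50 for this state
    return h_vh + h_diag + h_circ

for deg in (0, 30, 45, 60, 90):
    print(f"theta = {deg:2d} deg -> H_total = {h_total(math.radians(deg)):.4f} bits")
```

All of these states are pure, yet the sum is 2 bits at θ = 0° and θ = 90° and larger in between, which is the lack of invariance described in the text.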

In this section we have stressed some conceptual difficulties arising when we apply Shannon's notion of information to define information gain in a quantum measurement. Investigating three different approaches to the concept of Shannon's information, we argued that these difficulties arise whenever it is not possible, not even in principle, to assume that attributes observed are assigned to the system before the observation is performed. The question arises: Are there other physical situations where the use of Shannon's measure of information might be justified in quantum mechanics? Obviously, there are.

Suppose that there is a set of different possible preparations of the initial state and that the a priori probabilities for the different preparations are known to the observer. The observer is not told which one of the states is actually implemented. Suppose now that the observer wants to determine the actual state. Here the observer's ignorance about the possible prepared states can be quantified by Shannon's measure of information because the possible states, in principle, can be thought of as being objectively present before the measurement is performed.

We briefly review an explicit example analyzed by Peres [1995]. Let $\mathbf{n}_1$, $\mathbf{n}_2$ and $\mathbf{n}_3$ denote three unit vectors defined in a plane and separated by angles of 120°. Consider a spin-1/2 particle and define normalized states $|\psi_i\rangle$ by $\boldsymbol{\sigma}\cdot\mathbf{n}_i|\psi_i\rangle = |\psi_i\rangle$ ($i = 1, 2, 3$). The spin-1/2 particle can be prepared in one of the three states $|\psi_i\rangle$ defined above, and these three preparations have equal a priori probability, i.e. $H = \log 3$. Which one of these states is actually prepared? Since the states are not orthogonal, the answer cannot be unambiguous. The procedure giving the maximal possible information (that is, reducing $H$ as much as possible) is obtained in a POVM (positive-operator-valued measurement) by ruling out one of the three allowed states and leaving equal a posteriori probabilities for the two others. The value of $H$ is reduced to $\log 2 = 1$, so that the actual information gain is $\log(3/2)$.
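The numbers in this example can be reproduced with a short Bayesian update. The sketch below assumes one standard construction of such a POVM, with elements proportional to the projectors onto the states orthogonal to each preparation; the text does not spell out the operators, so this choice, the Bloch-vector shortcut for the overlaps, and all function names are assumptions made for illustration.

```python
import math

def entropy(probs):
    """Shannon entropy in bits, with 0 log 0 taken as 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0.0)

# Bloch vectors of the three preparations, 120 degrees apart in a plane.
trine = [(math.sin(a), math.cos(a))
         for a in (0.0, 2.0 * math.pi / 3.0, 4.0 * math.pi / 3.0)]

def likelihood(outcome, prepared):
    """P(outcome j | preparation i) for assumed elements E_j = (2/3)|psi_j^perp><psi_j^perp|.

    The state orthogonal to psi_j has Bloch vector -n_j, so its squared
    overlap with preparation i is (1 - n_i . n_j)/2.
    """
    dot = sum(a * b for a, b in zip(trine[outcome], trine[prepared]))
    return max(0.0, (2.0 / 3.0) * (1.0 - dot) / 2.0)  # clamp rounding noise

prior = [1.0 / 3.0] * 3

def posterior(outcome):
    """Bayes update of the preparation probabilities after one POVM outcome."""
    joint = [likelihood(outcome, i) * prior[i] for i in range(3)]
    total = sum(joint)
    return [x / total for x in joint]

h_before = entropy(prior)
h_after = entropy(posterior(0))
print(f"H before: {h_before:.4f} bits")
print(f"H after:  {h_after:.4f} bits")
print(f"gain:     {h_before - h_after:.4f} bits")
```

With this construction each outcome excludes exactly one preparation and leaves the other two equally likely, so the entropy drops from log 3 to log 2 = 1 bit and the gain is log(3/2), in agreement with the values quoted above.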



1.3 Measure of Information Acquired in a Quantum Experiment

Quantum mechanics is an intrinsically probabilistic description of Nature. All an experimentalist can know before a quantum experiment is performed are the probabilities for all possible outcomes to occur. In general, which specific outcome occurs is objectively random. We define a new measure of information for an individual measurement which is based on the fact that the probabilistic predictions an experimentalist can make have no empirical significance for any individual experiment but only as predictions about the number of occurrences of a specific outcome in future repetitions of the experiment.

Consider a stationary experimental arrangement with two detectors, where only one detector fires at a time, i.e. in each experimental trial. Detector 1, say, fires (we call this the "yes" outcome) with probability $p$. If it does not fire (the "no" outcome) the other detector will fire with probability $q = 1 - p$. When exactly one detector has fired, the experiment is over. Examples would be the Stern-Gerlach experiment with a spin-1/2 particle or an interference experiment with an interferometer of the Mach-Zehnder type.

Knowing the probabilities for the two outcomes to occur, all an experimenter can predict is how many times a specific detector fires. In making her prediction she has only a limited number of systems to work with. Then, because of the statistical fluctuations associated with any finite number of experimental trials, the number of occurrences of a specific outcome in $N$ future repetitions of the experiment is not precisely predictable. In $N$ independent experimental trials, the particular ordered sequence of results "yes", "no", "no", ..., "yes" containing "yes" exactly $n$ times and "no" exactly $N - n$ times occurs with probability

$$p \cdot (1-p) \cdot (1-p) \cdots p = p^n (1-p)^{N-n}. \tag{1.16}$$

The various different permutations of the sequence are mutually exclusive events, and so we can add their probabilities to obtain¹⁰

$$P_N(n) = \binom{N}{n} p^n (1-p)^{N-n}, \tag{1.17}$$

the probability that from $N$ independent experimental trials we observe $n$ times "yes" and $N - n$ times "no". This is known as the binomial distribution [Gnedenko, 1976]. Note that if one bets on a specific result, e.g. that the number of "yes" outcomes will be the one with highest probability, which is $n_{max} \approx pN$, the probability of success still depends on $p$. With an underlying probability of $p = 0.5$, the probability of 5 "yes" outcomes in 10 trials is only 0.25, but with $p = 0.9$ the probability of 9 "yes" outcomes in 10 trials is 0.39. It is a peculiar feature of the binomial distribution that the future number of occurrences is less specified when $p$ is around 0.5.

¹⁰ We are ignorant about different possible orders of individual outcomes because, in a quantum measurement, the particular order of individual outcomes is not defined before the experiment is performed. In contrast, classical measurements reveal pre-existing properties of individual systems, and therefore the particular sequence of individual outcomes is of importance. Information that is gained about a particular sequence observed is adequately defined by Shannon's measure of information (see Sec. 1.2).
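The two probabilities quoted for 10 trials follow directly from the binomial distribution. A minimal check, using only the Python standard library (`math.comb` needs Python 3.8 or later); the helper name `binomial_pmf` is illustrative:

```python
import math

def binomial_pmf(n, trials, p):
    """Probability of exactly n "yes" outcomes in the given number of independent trials."""
    return math.comb(trials, n) * p ** n * (1.0 - p) ** (trials - n)

for p, n in ((0.5, 5), (0.9, 9)):
    print(f"p = {p}: P_10({n}) = {binomial_pmf(n, 10, p):.3f}")
```

The script prints 0.246 and 0.387, which round to the values 0.25 and 0.39 given in the text.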

An experimenter's uncertainty¹¹, or lack of information, in the value $n$ is given by the mean-square deviation $\sigma^2$, defined as the expectation of the square of the deviation of $n$ from the mean value $pN$ [Gnedenko, 1976]

$$\sigma^2 := \sum_{n=0}^{N} P_N(n)(n - pN)^2 = p(1-p)N. \tag{1.19}$$

In fact, if $\sigma$ is small, then each term in the sum in Eq. (1.19) is small. A value $n$ for which $|n - pN|$ is large must therefore have a small probability $P_N(n)$. In other words, in the case of small $\sigma$, large deviations of the number of occurrences of the "yes" outcome from the mean $pN$ are improbable. In this case an experimenter knows the future number of occurrences with a high certainty. Conversely, a large variance indicates that not all highly probable values of $n$ lie near the mean $pN$. In that case the experimenter knows much less about the future number of occurrences.
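The closed form p(1 − p)N for the mean-square deviation can be compared with the explicit sum over all possible numbers of occurrences, from n = 0 to n = N. The following sketch uses an arbitrary illustrative choice of N = 100 trials; the function names are not from the text.

```python
import math

def binomial_pmf(n, trials, p):
    """Probability of exactly n "yes" outcomes in the given number of independent trials."""
    return math.comb(trials, n) * p ** n * (1.0 - p) ** (trials - n)

def mean_square_deviation(trials, p):
    """Explicit sum over n = 0..N of P_N(n) * (n - pN)^2."""
    mean = p * trials
    return sum(binomial_pmf(n, trials, p) * (n - mean) ** 2
               for n in range(trials + 1))

N = 100  # illustrative number of trials
for p in (0.1, 0.5, 0.9):
    summed = mean_square_deviation(N, p)
    closed = p * (1.0 - p) * N
    print(f"p = {p}: summed = {summed:.6f}, p(1-p)N = {closed:.6f}")
```

The two columns agree to rounding accuracy, and the deviation is largest at p = 0.5, where the experimenter knows least about the future number of occurrences.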

For a sufficiently large number $N$ of experimental trials, the confidence interval within which the number of occurrences of the "yes" outcome can be found in 68% of cases is given as [Gnedenko, 1976]

$$(pN - \sigma, \; pN + \sigma). \tag{1.20}$$

Therefore, if an observer just plans to perform the experiment N times, he

knows in advance, before the experiments are performed and their outcomes

¹¹ Since the binomial distribution has a finite deviation, it fulfills Chebyshev's inequality [Gnedenko, 1976]:

$$\mathrm{Prob}\{|n - pN| > k\sigma\} \le \frac{1}{k^2}. \tag{1.18}$$

This inequality means tha
