Upload
magdalene-allison
View
218
Download
0
Embed Size (px)
Citation preview
CS252Graduate Computer Architecture
Lecture 28
Esoteric Computer ArchitectureDNA Computing & Quantum
Computing
Prof John D. Kubiatowicz
http://www.cs.berkeley.edu/~kubitron/cs252
5/11/2009 cs252-S09, Lecture 28 2
DNA Computing• Can we use DNA to do
massive computations?– Organisms do it– DNA has very high
information density:» 4 different base pairs:
• Adenine/Thymine• Guanine/Cytosine
» Always paired on opposite strands Energetically favorable
– Active operations:» Copy: Split strands of DNA
apart in solution, gain 2 copies
» Concatenate: eg:GTAATCCT will combine XXXXXCATT with AGGAYYYYY
» Polymerase Chain Reaction (PCR): amplifies region of molecule between two marker molecules
©1999 Access Excellence @ the National Health Museum
5/11/2009 cs252-S09, Lecture 28 3
DNA Computing and Hamiltonian Path
• Given a set of cities and costs between them (possibly directedpaths):– Find shortest path to all cities– Simpler: find single path that
visits all cities
• DNA Computing example is latter version:– Every city represented by unique 20 base-pair strand– Every path between cities represented by
complementary pairs: 10 pairs from source city, 10 pairs from destination
– Shorter example: AAGT for city 1, TTCG for city 2Path 1->2: CAAAWill build: AAGTTTCG
..CAAA..– Dump “city molecules” and “path molecules” into
testtube. Select and amplify paths of right length. Analyze for result.
– Been done for 6 cities! (Adleman, ~1998!)
5/11/2009 cs252-S09, Lecture 28 4
Even more promising uses of DNA
• Self-assembly of components– DNA serves as substrate– Attach active elements in middle of components.– Final step – metal deposited over DNA
• Other interesting structures could be built
Active Region
Active Region
DN
A
Bon
ding
5/11/2009 cs252-S09, Lecture 28 5
Use Quantum Mechanics to Compute?
• Weird but useful properties of quantum mechanics:– Quantization: Only certain values or orbits are good
» Remember orbitals from chemistry???– Superposition: Schizophrenic physical elements don’t quite know
whether they are one thing or another
• All existing digital abstractions try to eliminate QM– Transistors/Gates designed with classical behavior– Binary abstraction: a “1” is a “1” and a “0” is a “0”
• Quantum Computing: Use of Quantization and Superposition to compute.
• Interesting results:– Shor’s algorithm: factors in polynomial time!– Grover’s algorithm: Finds items in unsorted database in time
proportional to square-root of n.– Materials simulation: exponential classically, linear-time QM
5/11/2009 cs252-S09, Lecture 28 6
Quantization: Use of “Spin”
• Particles like Protons have an intrinsic “Spin” when defined with respect to an external magnetic field
• Quantum effect gives “1” and “0”:– Either spin is “UP” or “DOWN” nothing between
North
South
Spin ½ particle:(Proton/Electron)
Representation: |0> or |1>
5/11/2009 cs252-S09, Lecture 28 7
Kane Proposal II (First one didn’t quite work)
• Bits Represented by combination of proton/electron spin
• Operations performed by manipulating control gates– Complex sequences of pulses perform NMR-like operations
• Temperature < 1° Kelvin!
PhosphorusImpurity Atoms
Single SpinControl Gates
Inter-bit Control Gates
5/11/2009 cs252-S09, Lecture 28 8
Now add Superposition!• The bit can be in a combination of “1” and
“0”:– Written as: = C0|0> + C1|1>– The C’s are complex numbers!
– Important Constraint: |C0|2 + |C1|2 =1
• If measure bit to see what looks like, – With probability |C0|2 we will find |0> (say “UP”)
– With probability |C1|2 we will find |1> (say “DOWN”)
• Is this a real effect? Options:– This is just statistical – given a large number of protons, a
fraction of them (|C0|2 ) are “UP” and the rest are down.– This is a real effect, and the proton is really both things
until you try to look at it
• Reality: second choice! – There are experiments to prove it!
5/11/2009 cs252-S09, Lecture 28 9
A register can have many values!• Implications of superposition:
– An n-bit register can have 2n values simultaneously!– 3-bit example:
= C000|000>+ C001|001>+ C010|010>+ C011|011>+ C100|100>+ C101|101>+ C110|110>+ C111|111>
• Probabilities of measuring all bits are set by coefficients:– So, prob of getting |000> is |C000|2, etc.– Suppose we measure only one bit (first):
» We get “0” with probability: P0=|C000|2+ |C001|2+ |C010|2+ |C011|2
Result: = (C000|000>+ C001|001>+ C010|010>+ C011|011>)» We get “1” with probability: P1=|C100|2+ |C101|2+ |C110|2+ |C111|2
Result: = (C100|100>+ C101|101>+ C110|110>+ C111|111>)
• Problem: Don’t want environment to measure before ready!– Solution: Quantum Error Correction Codes!
5/11/2009 cs252-S09, Lecture 28 10
Spooky action at a distance• Consider the following simple 2-bit state:
= C00|00>+ C11|11>– Called an “EPR” pair for “Einstein, Podolsky, Rosen”
• Now, separate the two bits:
• If we measure one of them, it instantaneously sets other one!– Einstein called this a “spooky action at a distance”– In particular, if we measure a |0> at one side, we get a |0> at the
other (and vice versa)
• Teleportation– Can “pre-transport” an EPR pair (say bits X and Y)– Later to transport bit A from one side to the other we:
» Perform operation between A and X, yielding two classical bits» Send the two bits to the other side» Use the two bits to operate on Y» Poof! State of bit A appears in place of Y
Light-Years?
5/11/2009 cs252-S09, Lecture 28 11
Model: Operations on coefficients + measurements
• Basic Computing Paradigm:– Input is a register with superposition of many values
» Possibly all 2n inputs equally probable!– Unitary transformations compute on coefficients
» Must maintain probability property (sum of squares = 1)» Looks like doing computation on all 2n inputs simultaneously!
– Output is one result attained by measurement
• If do this poorly, just like probabilistic computation:– If 2n inputs equally probable, may be 2n outputs equally probable.– After measure, like picked random input to classical function!– All interesting results have some form of “fourier transform”
computation being done in unitary transformation
Unitary Transformations
InputComplex
StateMeasure
OutputClassicalAnswer
5/11/2009 cs252-S09, Lecture 28 12
• The Security of RSA Public-key cryptosystems depends on the difficult of factoring a number N=pq (product of two primes)– Classical computer: sub-exponential time factoring– Quantum computer: polynomial time factoring
• Shor’s Factoring Algorithm (for a quantum computer)1) Choose random x : 2 x N-1.2) If gcd(x,N) 1, Bingo!3) Find smallest integer r : xr 1 (mod N)4) If r is odd, GOTO 15) If r is even, a x r/2 (mod N) (a-1)(a+1) =
kN6) If a = N-1 GOTO 17) ELSE gcd(a ± 1,N) is a non trivial factor of
N.
Hard
Security of Factoring
EasyEasy
EasyEasyEasyEasy
5/11/2009 cs252-S09, Lecture 28 13
Finding r with xr 1 (mod N)
k
/\k /
\xkk
/\k /
\1
/\
/\x
y
r yw0w
w1r
( ) /\x
r0
r r1 k
0ww
1r
QuantumFourier
Transform
• Finally: Perform measurement– Find out r with high probability– Get |y>|aw’> where y is of form k/r and w’ is related
5/11/2009 cs252-S09, Lecture 28 14
ION Trap Quantum Computer: Promising technology
• IONS of Be+ trapped in oscillating quadrature field– Internal electronic modes of
IONS used for quantum bits– MEMs technology – Target? 50,000 ions– ROOM Temperature!
• Ions moved to interaction regions– Ions interactions with one
another moderated by lasers
Cross-Sectional
View
Top View
Top
Proposal: NIST Group
5/11/2009 cs252-S09, Lecture 28 15
Ion Trap Quantum Computer
- Data = an ion
- Gate = a location
Tw
o-Qubit G
ate
• Ballistic Movement
- Apply pulse sequences to electrodes- Electrostatic forces move ion
Q1
Q2
- Intersections similar, but more complicated pulse sequences
• Major Components
5/11/2009 cs252-S09, Lecture 28 16
Ballistic Movement Network
One-Qubit Gate
Two-Qubit Gate
Two-Qubit Gate
Memory Cell
One-Qubit Gate
Memory Cell
R R
RR
Interconnection Network
Q2 Q3
Q4 Q5
Q1
• Problem: Noise accumulation!
5/11/2009 cs252-S09, Lecture 28 17
Noise Accumulation from Movement
• Noise may increase error by factor of 100
Qu
bit
Err
or
Distance Moved in Gates
1.0E-08
1.0E-07
1.0E-06
1.0E-05
1.0E-04
1.0E-03
1.0E-02
0 16 32 48 64
1.0E-4 inital error
1.0E-5 inital error
1.0E-6 inital error
1.0E-7 inital error
1.0E-8 inital error
5/11/2009 cs252-S09, Lecture 28 18
D?
Movement Option 2: Teleportation
D
D
E2E1
Source Location Target Location
1. Generate EPR pair
3. Transmit two classical bits
• Teleportation Benefits
• Problem: EPR pairs become noisy
- Error Correction of data (arbitrary state): ~100 ms Purification of EPR pair (known state): ~120 µs
- Pre-distribution of EPR pairs
• Goal: Transfer the state, not the data ion
2. LocalOps
4. Local OpsEntanglement
5/11/2009 cs252-S09, Lecture 28 19
One-Qubit Gate
Q2
Two-Qubit Gate
Q3
Two-Qubit Gate
Q4 Q5
Memory Cell
Q1
One-Qubit Gate
Memory Cell
R R
RR
Interconnection Network
EPR Pair Distribution Network
EPR PairGenerators
5/11/2009 cs252-S09, Lecture 28 20
Setting Up a Teleportation Link
G PP
EPR QubitsEPR Qubits
Recycled QubitsRecycled Qubits
For Data Teleportation
• Purification = Amplification of EPR pair link- Two EPR pairs One “purer” pair, one junk pair- Chance of failure
• Need to send multiple pairs
STRONGER Entanglement
5/11/2009 cs252-S09, Lecture 28 21
Chained Teleportation
G TT G TGT G TGT
PP
TeleportationTeleportation
• Adjacent T nodes linked for teleportation
• Positive Features- T node linking not on critical path- Pre-purification (Link Amplification): part of link setup
5/11/2009 cs252-S09, Lecture 28 22
Quantum Network Architecture
• Grid of T nodes
• Packet-switched network - Dimension-order routing
T T T T
T T T T
P P P P
P P P P
G
G
G
G
G
G G
G G G
, linked by G nodes
• Each qubit has associated message
Gate Gate Gate Gate
Gate Gate Gate Gate
5/11/2009 cs252-S09, Lecture 28 23
Classical Control• Quantum Datapath Layer
- T Nodes and G Nodes (P Nodes and Gates not shown)• Classical Control Layers
- Messages Associated with Qubits- Teleportation and Purification Bits
5/11/2009 cs252-S09, Lecture 28 24
Running a Quantum Circuit
• Simple gates (transversal)• More complex gates (non-transversal)
– Exist in any universal set
• Quantum Error Correction (QEC)– 10-8 to 10-6 error rates from gates, movement and
idleness– Data must be encoded and periodically error corrected
• Ancilla (helper) qubits– Necessary for complex gates and for QEC– Computation with ancilla qubits > 90% of quantum
programH
CX
HT
T
QEC QEC QEC
QEC QEC QECT
T QEC
QEC
QEC
QEC
QEC
QECQ0
Q1
time Serial Circuit Latency
5/11/2009 cs252-S09, Lecture 28 25
Running a Quantum Circuit
• Ancilla qubits are independent of data– Preparation may be pulled offline– Ancilla qubits should be ready just in time to avoid noise
from idleness
HCX
H
T
T QEC
QEC
QEC
QEC
QEC
QECQ0
Q1
Parallel Circuit Latency
5/11/2009 cs252-S09, Lecture 28 26
Running at The Speed of Data
• Ideally, execution time determined solely by data
time
hardware
Ancilla encoding
Operationsinvolving
data qubits
5/11/2009 cs252-S09, Lecture 28 27
Limited Ancilla Bandwidth
0
100000
200000
300000
400000
500000
600000
700000
800000
900000
1 10 100 1000
• 32-bit Quantum Carry-Lookahead Adder– Varying rate at which encoded zero ancillae are provided
for QEC– Conclusion: design architecture with “ancilla factories”
Encoded Ancilla Bandwidth Available (Ancillae per ms)
Exe
cutio
n T
ime
of a
3
2-B
it Q
CL
A (
μs)
5/11/2009 cs252-S09, Lecture 28 28
Ancilla Factory Design I
• “In-place” ancilla preparation
Encoded Ancilla Verification Qubits
• Ancilla factory consists of many of these– Encoded ancilla prepared
in many places, thenmoved to output port
– Movement is costly!
In-placePrep
In-placePrep
In-placePrep
In-placePrep
0 Prep
Cat Prep
0 Prep
Cat Prep
0 Prep
Cat Prep
Verify
Verify
Verify
?
?
?
BitCorrect
PhaseCorrect
QEC QEC
Ancilla Generation Circuit
5/11/2009 cs252-S09, Lecture 28 29
Idealized Qalypso Architecture
• Dense data region– Data qubits only– Local communication
• Shared Ancilla Factories– Distributed to data as needed– Fully multiplexed to all data– Output ports ( ): close to data– Input ports ( ): may be far from data, since recycled qubits have irrelevant state
• Goals– Design ancilla factories– Answer Question: How much hardware is needed for
ancilla generation to run at the speed of data?
5/11/2009 cs252-S09, Lecture 28 30
Discussion of ISCA 2009 paper
• “A Fault Tolerant, Area Efficient Architecture for Shor’s Factoring Algorithm”– Mark Whitney, Nemanja Isailovic, Yatish Patel, and John
Kubiatowicz– ISCA 2009
• How to compare layouts? (what is good)?– Probabilistic circuits need metric that includes probability of
failure– ADCR = Probabilistic version of area-delay product
– Lower is better
• What to optimize?– Many different “datapath organizations”– Far too much error correction
success
single
P
LatencyAreaADCR
5/11/2009 cs252-S09, Lecture 28 31
Datapath Organizations
• QLA: “Quantum Logic Array”– Every compute region has space for 2 bits and ancilla
generation for 2 bits (to correct after every operation
• CQLA: “Compressed Quantum Logic Array”– Same compute regions as QLA, but ability to have less
ancilla generation/bit for memory (idle bits less prone to error)
• Qalypso – Matching ancilla generation to needs
5/11/2009 cs252-S09, Lecture 28 32
Error Correction Optimization
• Selectively error correction placement– Standard techniques: correct after every error– Instead – only correct bits that are particularly “dirty”
• Error correction modeled after retiming optimization– Only place correction when approximate “EDist” parameter reaches
threshold– Then, perform full mapping– Choose EDist by optimizing ADCR
• Very successful at reducing area/latency• Even improves probability of success in some cases!
– Why? Because error correction involves operations which can introduce error
5/11/2009 cs252-S09, Lecture 28 33
Shor’s Factoring Circuit
• Most of time spent in modular exponentiation– QFT is much smaller fraction of time– Easiest way to build modular exponentiation: with adders
» Build multiplier instead? Not studied yet – would be very large
• Paper result: can factor in 7659 mm2
– Previous result was 0.9 m2
5/11/2009 cs252-S09, Lecture 28 34
Conclusion• Computing can be done in a variety of ways
– Normal silicon gates not required
• DNA Computing– Limited use “demonstration of concept”– Form of massive parallelism– Interesting consequences: self assembly
• Quantum Computing:– Computing using superposition and quantization– Ion Traps: a particularly promising technology
• CAD Tools for Quantum Computing– Can actually optimize circuits – just like classical case– ADCR = probabilistic version of Area-Delay product