Verifiable ASICs: trustworthy hardware with untrusted components

Riad S. Wahby◦*, Max Howald†*, Siddharth Garg*, abhi shelat‡, and Michael Walfish*

◦Stanford University   *New York University   †The Cooper Union   ‡The University of Virginia
June 10th, 2016
Setting: ASICs with mutually distrusting designer, manufacturer

The Principal (a government or chip designer) sends a chip design to the Manufacturer (a "foundry" or "fab").

Here we are thinking about ASICs, not CPUs:

[diagram: a CPU, with RAM, cache, register file, and ALU, versus an ASIC, with inputs in[0] . . . in[n] feeding fixed logic and D-Q flip-flops]
Setting: ASICs with mutually distrusting designer, manufacturer

e.g., a network firewall appliance, with a custom chip for packet processing.

What if our packet processing chip has a back door?

Threat: incorrect execution of the packet filter.
(Other concerns, e.g., secret state, are important but orthogonal.)

US DoD controls the supply chain with trusted foundries.
Untrusted manufacturers can craft hardware Trojans.
Trusted fabs are the only way to get strong guarantees.

For example, stealthy trojans can thwart post-fab detection
[A2: Analog Malicious Hardware, Yang et al., IEEE S&P 2016; Stealthy Dopant-Level Trojans, Becker et al., CHES 2013]

But trusted fabrication is not a panacea:

✗ Only 5 countries have cutting-edge fabs on-shore
✗ Building a new fab takes $$$$$$ and years of R&D
✗ Semiconductor scaling: chip area and energy go with the square and cube of transistor length ("critical dimension")
✗ So using an old fab means an enormous performance hit; e.g., India's best on-shore fab is 10⁸× behind the state of the art

Can we get trust more cheaply?
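As a quick worked instance of the scaling claim above (the shrink factor s is illustrative, not a number from the talk):

```latex
% If a trusted fab's critical dimension is s times larger than the
% state of the art, the slide's scaling laws give roughly
\text{area overhead} \approx s^{2}, \qquad \text{energy overhead} \approx s^{3}
% e.g., s = 10 yields about 100x the area and 1000x the energy per operation.
```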
Verifiable ASICs

Principal: F → designs for P, V

An untrusted fab (fast) builds P; a trusted fab (slow) builds V.

Integrator: combines V and P. Given input x, V and P exchange x, the output y, and a proof that y = F(x).
Can we build Verifiable ASICs?

[diagram: V sends input x to P; P returns output y and a proof that y = F(x); compared against simply running F]

Makes sense if V + P are cheaper than trusted F.
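One way to write down this slide's break-even condition (our notation, not the talk's), with cost measured per execution of F:

```latex
\mathrm{Cost}_{\text{trusted fab}}(V) \;+\; \mathrm{Cost}_{\text{untrusted fab}}(P)
\;<\; \mathrm{Cost}_{\text{trusted fab}}(F)
```

That is, the slow trusted verifier plus the fast untrusted prover must together beat running F natively in the trusted technology.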
[figure: a timeline of theory (Babai85, GMR85, BCC86, BFLS91, FGLSS91, Kilian92, ALMSS92, AS92, Micali94, BG02, GOS06, IKO07, GKR08, KR09, GGP10, Groth10, GLR11, Lipmaa11, BCCT12, GGPR13, BCCT13, KRR14, . . . ) and of systems (SBW11, CMT12, SMBW12, TRMP12, SVPBBW12, SBVBPW13, VSBW13, PGHR13, Thaler13, BCGTV13, BFRSBW13, BFR13, DFKP13, BCTV14a, BCTV14b, BCGGMTV14, FL14, KPPSST14, FTP14, WSRHBW15, BBFR15, CFHKNPZ15, CTV15, KZMQCPPsS15), alongside a chart captioned "Figure 5: Prover overhead normalized to native execution cost for two computations. Prover overheads are generally enormous." The chart plots worker's cost normalized to native C (log scale, 10¹ to 10¹³) for matrix multiplication (m=128) and PAM clustering (m=20, d=128) on Pepper, Ginger, Pinocchio, Zaatar, CMT, native C, Allspice, TinyRAM, and Thaler, with a 10⁸× gap annotated]
Reasons for hope:

• running time of V < F (asymptotically)
• implementations exist
• P overheads are massive, but using an advanced fab might offset these costs
Can we build Verifiable ASICs?

Reasons for caution:

• Theory is silent about feasibility
• Onus is heavier than in prior work
• Hardware issues: energy, chip area
• Need a physically realizable circuit design
• Need V to save work at plausible computation sizes
A qualified success

Zebra: a hardware design that saves costs . . . sometimes.
Probabilistic proof protocols, briefly

[diagram: V sends input x to P; P returns output y and a proof that y = F(x)]

F must be expressed as an arithmetic circuit (AC): a generalized boolean circuit over Fp (∧ → ×, ∨ → +).

AC satisfiable ⟺ F was executed correctly.
P convinces V that the AC is satisfiable.

Arguments [GGPR13, SBVBPW13, PGHR13, BCTV14], e.g., Zaatar, Pinocchio, libsnark:
+ nondeterministic ACs, arbitrary connectivity
+ few rounds (≤ 3)
✗ unsuited to hardware implementation

IPs [GKR08, CMT12, VSBW13], e.g., Muggles, CMT, Allspice:
– deterministic ACs; layered, low depth
– many rounds
✓ suited to hardware implementation

What about other schemes? e.g., FHE [GGP10], MIP+FHE [BC12], MIP [BTWV14], PCIP [RRR16], IOP [BCS16], PIR [BHK16], . . .
These all seem a bit further from practicality.
Zebra builds on IPs of GKR [GKR08, CMT12, VSBW13]

F must be expressed as a layered arithmetic circuit.

[diagram: V and P exchange x, y, a sum-check [LFKN90], and more sum-checks, both parties "thinking . . ." between messages]

1. V sends input x
2. P evaluates the AC, returns output y
3. V constructs a polynomial relating y to the last layer's input wires
4. V engages P in a sum-check [LFKN90], gets a claim about the second-to-last layer
5. V iterates, layer by layer, until it gets a claim about the inputs, which it can check itself
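To make steps 4 and 5 concrete, here is a minimal sum-check sketch in Python (a toy, not Zebra's implementation): P claims the value of a sum of g over {0,1}^n, where g has degree at most 2 in each variable, which is why three evaluation points k ∈ {0, 1, 2} suffice (these reappear on the per-layer slides below). The prime and the example g are hypothetical.

```python
import random
from itertools import product

P = 2**61 - 1  # an illustrative prime, not necessarily Zebra's field

def interp3(h, r):
    """Evaluate the degree-<=2 polynomial taking values h[0], h[1], h[2]
    at nodes 0, 1, 2, at the point r (Lagrange interpolation mod P)."""
    inv2 = pow(2, P - 2, P)
    l0 = (r - 1) * (r - 2) % P * inv2 % P      # basis poly for node 0
    l1 = r * (r - 2) % P * (P - 1) % P         # basis poly for node 1
    l2 = r * (r - 1) % P * inv2 % P            # basis poly for node 2
    return (h[0] * l0 + h[1] * l1 + h[2] * l2) % P

def bools(n):
    return product((0, 1), repeat=n)

def sumcheck(g, n):
    """One full sum-check: returns True iff V accepts P's claim."""
    claim = sum(g(*b) for b in bools(n)) % P   # honest P's claimed sum
    rs = []
    for j in range(n):
        # P sends h_j via its values at k = 0, 1, 2: the sum over the
        # remaining boolean variables, with variable j fixed to k
        h = [sum(g(*rs, k, *b) for b in bools(n - j - 1)) % P
             for k in (0, 1, 2)]
        if (h[0] + h[1]) % P != claim:         # V's consistency check
            return False
        r = random.randrange(P)                # V's random coin r_j
        claim = interp3(h, r)                  # new claim: h_j(r_j)
        rs.append(r)
    return g(*rs) % P == claim                 # V checks the last claim itself

# toy example: g(x1, x2, x3) = x1*x2 + 2*x3 (degree <= 2 per variable)
assert sumcheck(lambda a, b, c: a * b + 2 * c, 3)
```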
Zebra builds on IPs of GKR [GKR08, CMT12, VSBW13]

Soundness error ∝ p⁻¹

Cost to execute F directly: O(depth · width)

V's sequential running time: O(depth · log width + |x| + |y|) (assuming precomputed queries)

P's sequential running time: O(depth · width · log width)
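Placing the costs above side by side (a restatement of the slide, not a new analysis): V's cost grows with log width while direct execution grows with width, so

```latex
O(\mathit{depth}\cdot\log\mathit{width} + |x| + |y|)
\;\ll\;
O(\mathit{depth}\cdot\mathit{width})
\qquad \text{once } \mathit{width} \gg \log\mathit{width}.
```

In asymptotic terms, P pays only a log width factor over direct execution; the constants, per the earlier figure, are another matter.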
Extracting parallelism in Zebra

P executing the AC: layers are sequential, but all gates at a layer can be executed in parallel.

Proving step: can V and P interact about all of F's layers at once?

No. V must ask its questions in order, or soundness is lost.

But: there is still parallelism to be extracted . . .
Extracting parallelism in Zebra's P

V questions P about F(x1)'s output layer. Simultaneously, P returns F(x2).

Next, V questions P about F(x1)'s next layer and F(x2)'s output layer. Meanwhile, P returns F(x3).

This process continues until V and P interact about every layer simultaneously, but for different computations: F(x1), F(x2), F(x3), F(x4), . . .

V and P can complete one proof in each time step.
Extracting parallelism in Zebra's P with pipelining

[diagram: inside P, one sub-prover per layer (layer 0, layer 1, . . . , layer d−1), each running "prove" and exchanging queries and responses with V; input x enters the pipeline and output y leaves it]

This approach is just a standard hardware technique, pipelining; it is possible because the protocol is naturally staged.

There are other opportunities to leverage the protocol's structure.
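A toy schedule illustrating the pipeline above (hypothetical helper code, not Zebra's): at time t, the sub-prover for layer l works on computation F(x_{t-l+1}), so once the pipeline fills, one proof completes per time step.

```python
# Print which computation each layer's sub-prover handles at each step.
def pipeline_schedule(depth, steps):
    for t in range(steps):
        busy = [(l, t - l + 1) for l in range(depth) if t - l >= 0]
        print(f"t={t}: " + ", ".join(f"layer {l} -> F(x{i})" for l, i in busy))

pipeline_schedule(depth=3, steps=5)
# t=0: layer 0 -> F(x1)
# t=1: layer 0 -> F(x2), layer 1 -> F(x1)
# t=2: layer 0 -> F(x3), layer 1 -> F(x2), layer 2 -> F(x1)
# t=3: layer 0 -> F(x4), layer 1 -> F(x3), layer 2 -> F(x2)
# t=4: layer 0 -> F(x5), layer 1 -> F(x4), layer 2 -> F(x3)
```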
Per-layer computations

For each sum-check round j, P sums over the gates in a layer to evaluate H[k] for k ∈ {0, 1, 2}:

    H[k] = ∑_{g ∈ layer} δ(g, k)

In software:

    // compute H[0], H[1], H[2]
    for k ∈ {0, 1, 2}:
        H[k] ← 0
        for g ∈ layer:
            H[k] ← H[k] + δ(g, k)    // δ uses state[g]

    // update lookup table with V's random coin
    for g ∈ layer:
        state[g] ← δ(g, rj)

In hardware: one gate prover per gate evaluates δ(g, 0), δ(g, 1), δ(g, 2), and δ(g, rj) in parallel, and an adder tree sums the per-gate results into H[k]. Rather than keeping the lookup table in a shared RAM, each gate prover holds state[g] in a local register.
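A runnable rendering of the software loop above may help. This is a sketch, not Zebra's code: δ is a stand-in function (in the protocol it is determined by the layer's wiring and the table state[]), and the prime p is illustrative.

    # Sketch of P's work in one sum-check round for one layer.
    # delta() and p are stand-ins; see the caveats above.
    p = 2**61 - 1

    def sumcheck_round(gates, state, delta, r_j):
        # compute H[0], H[1], H[2]: each a sum of delta(g, k) over the layer
        H = [sum(delta(g, k, state) for g in gates) % p for k in range(3)]
        # update the lookup table with V's random coin r_j
        for g in gates:
            state[g] = delta(g, r_j, state)
        return H

    # Toy stand-in for delta, just to make the sketch executable:
    def toy_delta(g, k, state):
        return (state[g] + g * k) % p

    state = {g: g + 7 for g in range(4)}
    print(sumcheck_round(range(4), state, toy_delta, r_j=12345))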
Zebra’s design approach
3 Extract parallelism
e.g., pipelined provinge.g., parallel evaluation of δ by gate provers
3 Exploit locality: distribute data and control
e.g., no RAM: data is kept close to places it is needed
e.g., latency-insensitive design: localized control
3 Reduce, reuse, recycle
e.g., computation: save energy by adding memoization to Pe.g., hardware: save chip area by reusing the same circuits
Zebra’s design approach
3 Extract parallelism
e.g., pipelined provinge.g., parallel evaluation of δ by gate provers
3 Exploit locality: distribute data and control
e.g., no RAM: data is kept close to places it is needede.g., latency-insensitive design: localized control
3 Reduce, reuse, recycle
e.g., computation: save energy by adding memoization to Pe.g., hardware: save chip area by reusing the same circuits
Zebra’s design approach
3 Extract parallelism
e.g., pipelined provinge.g., parallel evaluation of δ by gate provers
3 Exploit locality: distribute data and control
e.g., no RAM: data is kept close to places it is needede.g., latency-insensitive design: localized control
3 Reduce, reuse, recycle
e.g., computation: save energy by adding memoization to Pe.g., hardware: save chip area by reusing the same circuits
Architectural challenges

Interaction between V and P requires a lot of bandwidth
    ✗ V and P on a circuit board? Too much energy and circuit area
    ✓ Zebra uses 3D integration

Protocol requires input-independent precomputation [VSBW13]
    ✓ Zebra amortizes precomputations over many V-P pairs

Precomputations need secrecy and integrity
    ✗ Give V trusted storage? The cost would be prohibitive
    ✓ Zebra uses untrusted storage + authenticated encryption

[Diagram: an operator supplies input x to V; P returns output y with a proof that y = F(x); V fetches its i-th precomputation from untrusted storage as authenticated ciphertext Ek(prei).]
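As a sketch of the last point, the snippet below protects a precomputation in untrusted storage with authenticated encryption. The cipher (AES-GCM via the pyca/cryptography package), key handling, and nonce policy are assumptions for illustration, not Zebra's specification.

    # Authenticated encryption for precomputations in untrusted storage.
    # AES-GCM is a stand-in AEAD; key/nonce management here is illustrative.
    import os
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM

    key = AESGCM.generate_key(bit_length=128)   # k, held by V's trusted hardware
    aead = AESGCM(key)

    pre_i = b"precomputation bytes for instance i"
    nonce = os.urandom(12)                      # must never repeat under one key
    ct = aead.encrypt(nonce, pre_i, b"instance-i")

    # V later fetches (nonce, ct) from untrusted storage; tampering with
    # either the ciphertext or the associated data makes decrypt() raise
    # InvalidTag, so integrity and secrecy both hold.
    assert aead.decrypt(nonce, ct, b"instance-i") == pre_i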
Implementation

Zebra's implementation includes
    • a compiler that produces synthesizable Verilog for P
    • two V implementations: hardware (Verilog) and software (C++)
    • a library to generate V's precomputations
    • Verilog simulator extensions to model a software or hardware V's interactions with P

. . . and it seemed to work really well! Zebra can produce 10k–100k proofs per second, while existing systems take tens of seconds per proof!

But that's not a serious evaluation. . .
Evaluation method

[Diagram: Zebra (V paired with P, which returns output y and a proof that y = F(x)) vs. a direct implementation of F.]

Baseline: direct implementation of F in the same technology as V

Metrics: energy and chip size per unit of throughput (discussed in the paper)

Measurements: based on circuit synthesis and simulation, published chip designs, and CMOS scaling models

Costs charged: V, P, and their communication; retrieving and decrypting precomputations; PRNG; the operator's communication with V

Constraints: trusted fab = 350 nm; untrusted fab = 7 nm; 200 mm² max chip area; 150 W max total power

(350 nm: 1997, the Pentium II. 7 nm: ≈ 2017 [TSMC]. That is roughly a 20-year gap between the trusted and untrusted fabs.)
Application #1: number theoretic transform (NTT)

NTT: a Fourier transform over Fp. Widely used, e.g., in computer algebra.

[Plot: ratio of baseline energy to Zebra energy (higher is better for Zebra) vs. log2(NTT size) from 6 to 13; y-axis from 0.1 to 3, log scale.]
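For readers unfamiliar with the primitive, here is a minimal NTT sketch. The modulus p = 17 and root of unity ω = 4 are illustrative; real uses pick large primes and O(n log n) FFT-style algorithms.

    # Naive number theoretic transform: a DFT over F_p.
    # p = 17 and omega = 4 (a primitive 4th root of unity mod 17:
    # 4^4 = 256 ≡ 1, while 4^2 = 16 ≠ 1) keep the example easy to check.
    def ntt(a, omega, p):
        n = len(a)
        return [sum(a[j] * pow(omega, j * k, p) for j in range(n)) % p
                for k in range(n)]

    def intt(y, omega, p):
        n = len(y)
        inv_omega = pow(omega, -1, p)   # modular inverse (Python 3.8+)
        inv_n = pow(n, -1, p)
        return [(x * inv_n) % p for x in ntt(y, inv_omega, p)]

    a = [1, 2, 3, 4]
    y = ntt(a, omega=4, p=17)
    assert intt(y, omega=4, p=17) == a   # round-trips exactly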
Application #2: Curve25519 point multiplication

Curve25519: a commonly used elliptic curve. Point multiplication is a primitive, e.g., for ECDH key exchange.

[Plot: ratio of baseline energy to Zebra energy (higher is better for Zebra) vs. the number of parallel Curve25519 point multiplications (84, 170, 340, 682, 1147); y-axis from 0.1 to 3, log scale.]
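To show where the primitive sits in practice (this illustrates ECDH generally, not Zebra's hardware), the sketch below uses the pyca/cryptography package's X25519 binding; each exchange() call performs a Curve25519 scalar (point) multiplication.

    # ECDH over Curve25519: each exchange() is a point multiplication.
    from cryptography.hazmat.primitives.asymmetric.x25519 import X25519PrivateKey

    alice = X25519PrivateKey.generate()
    bob = X25519PrivateKey.generate()

    # Each party multiplies the other's public point by its own secret scalar.
    shared_a = alice.exchange(bob.public_key())
    shared_b = bob.exchange(alice.public_key())
    assert shared_a == shared_b   # both sides derive the same shared secret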
A qualified success
Zebra: a hardware design that saves costs. . .
. . . sometimes.
Summary of Zebra’s applicability
Applies to IPs, but not arguments
1. Computation F must have a layered, shallow, deterministic AC
2. Must have a wide gap between cutting-edge fab (for P)and trusted fab (for V)
3. Amortizes precomputations over many instances
4. Computation F must be very large for V to save work
5. Computation F must be efficient as an arithmetic circuit
Common to essentially all built proof systems
[Figure 5 from the CACM survey article reproduced on this slide: prover overhead, normalized to native execution cost, for two computations (matrix multiplication, m = 128, and PAM clustering, m = 20, d = 128) across PepperGinger, Pinocchio, Zaatar, CMT, Allspice, TinyRAM, and Thaler; the y-axis spans 10^1 to 10^13. Caption: "Prover overhead normalized to native execution cost for two computations. Prover overheads are generally enormous."]
you’re sending Will Smith and Jeff Goldblum into space to saveEarth; then you care a lot more about correctness than costs (a calcu-lus that applies to ordinary satellites too). More prosaically, there isa scenario with an abundance of server CPU cycles, many instancesof the same computation to verify, and remotely stored inputs: data-parallel cloud computing. Verifiable MapReduce [15] is thereforean encouraging application.
OPEN QUESTIONS AND NEXT STEPSThe main issue in this area is performance, and the biggest problemis the prover’s overhead. The verifier’s costs are also quantitativelyhigher than ideal; qualitatively, we would ideally eliminate the veri-fier’s setup phase while retaining expressivity (such schemes existin theory [11] but have very high overheads, even in principle).
The computational model is also a critical area of focus. OnlyTinyRAM handles data-dependent loops, only Pantry handles re-motely stored inputs, and only TinyRAM and Pantry handle com-putations that work with RAM. Unfortunately, TinyRAM adds highoverhead to the circuit representation for every operation in the givencomputation; Pantry, on the other hand, adds even higher overheadto its constraint representation but only to operations that interactwith state. While improving the overhead of either representationwould be worthwhile, a more general research direction is to movebeyond the circuit and constraint model.
There are also questions in systems and programming languages.For instance, can we develop programming languages that are well-tailored to the circuit or constraint formalism? We might also be ableto co-design the language, computational model, and verification ma-chinery: many of the protocols naturally work with parallel computa-tions, and the current verification machinery is already amenable toa parallel implementation. Another systems question is to develop arealistic database application, which requires concurrency, relationalstructures, etc. More generally, an important test for this area—sofar unmet—is to run experiments at realistic scale.
Another interesting area of investigation concerns privacy. Byleveraging Pinocchio, Pantry has experimented with simple applica-tions that hide the prover’s state from the verifier, but there is morework to be done here and other notions of privacy that are worthproviding. For example, we can provide verifiability while conceal-ing the program that is executed (by composing techniques fromPantry, Pinocchio, and TinyRAM). A speculative application is toproduce versions of Bitcoin in which transactions can be conductedanonymously, in contrast to the status quo [18, 22].
REFLECTIONS AND PREDICTIONS
It is worth recalling that the intellectual foundations of this researcharea really had nothing to do with practice. For example, the PCPtheorem is a landmark achievement of complexity theory, but if wewere to implement the theory as proposed, generating the proofs,even for simple computations, would have taken longer than theage of the universe. In contrast, the projects described in this articlehave not only built systems from this theory but also performedexperimental evaluations that terminate before publication deadlines.
So that’s the encouraging news. The sobering news, of course, isthat these systems are basically toys. Part of the reason that we arewilling to label them near-practical is painful experience with whatthe theory used to cost. (As a rough analogy, imagine a graduatestudent’s delight in discovering hexadecimal machine code afteryears spent programming one-tape Turing machines.)
Still, these systems are arguably useful in some scenarios. In high-assurance contexts, we might be willing to pay a lot to know that aremotely deployed machine is executing correctly. In the streamingcontext, the verifier may not have space to compute locally, so wecould use CMT [21] to check that the outputs are correct, in concertwith Thaler’s refinements [57] to make the prover truly inexpensive.Finally, data parallel cloud computations (like MapReduce jobs)perfectly match the regimes in which the general-purpose schemesperform well: abundant CPU cycles for the prover and many in-stances of the same computation with different inputs.
Moreover, the gap separating the performance of the currentresearch prototypes and plausible deployment in the cloud is a feworders of magnitude—which is certainly daunting, but, given thecurrent pace of improvement, might be bridged in a few years.
More speculatively, if the machinery becomes truly low overhead,the effects will go far beyond verifying cloud computations: we willhave new ways of building systems. In any situation in which onemodule performs a task for another, the delegating module will beable to check the answers. This could apply at the micro level (ifthe CPU could check the results of the GPU, this would exposehardware errors) and the macro level (distributed systems could bebuilt under very different trust assumptions).
But even if none of the above comes to pass, there are excitingintellectual currents here. Across computer systems, we are startingto see a new style of work: reducing sophisticated cryptographyand other achievements of theoretical computer science to prac-tice [32, 43, 47, 60]. These developments are likely a product ofour times: the preoccupation with strong security of various kinds,and the computers powerful enough to run previously “paper-only”algorithms. Whatever the cause, proof-based verifiable computationis an excellent example of this tendency: not only does it composetheoretical refinements with systems techniques, it also raises re-search questions in other sub-disciplines of Computer Science. Thiscross-pollination is the best news of all.
AcknowledgmentsWe thank Srinath Setty both for his help with this article, includingthe experimental aspect, and for his deep influence on our under-standing of this area. This article has also benefited from many pro-ductive conversations with Justin Thaler, whose patient explanationshave been most helpful. This draft was improved by experimen-tal assistance from Riad Wahby; by detailed feedback from AlexisGallagher and the anonymous CACM reviewers; and by commentsfrom Boaz Barak, William Blumberg, Oded Goldreich, Yuval Ishai,Guy Rothblum, Riad Wahby, Eleanor Walfish, and Mark Walfish.This work was supported by AFOSR grant FA9550-10-1-0073; NSF
System                                 Amortization regime                      Advice
Zebra                                  many V-P pairs                           short
Allspice [VSBW13]                      batch of instances of a particular F     short
Bootstrapped SNARKs [BCTV14a, CTV15]   all computations                         long
BCTV [BCTV14b]                         all computations of the same length      long
Pinocchio [PGHR13]                     all future instances of a particular F   long
Zaatar [SBVBPW13]                      batch of instances of a particular F     long
Exception: [CMT12] with logspace-uniform ACs
For example, take libsnark [BCTV14b], a highly optimized implementation of [GGPR13] and Pinocchio [PGHR13]:

V's work: 6 ms + (|x| + |y|) · 3 µs on a 2.7 GHz CPU
⇒ break-even point ≥ 16 × 10^6 CPU ops (6 ms × 2.7 GHz ≈ 16 × 10^6 cycles)

With 32 GB of RAM, libsnark handles ACs with ≤ 16 × 10^6 gates
⇒ breaking even requires > 1 CPU op per AC gate, e.g., computations over Fp rather than over machine integers
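A back-of-the-envelope rendering of that arithmetic (the numbers come from the lines above; nothing here is measured from libsnark itself):

    # Break-even arithmetic for the verifier (illustrative only).
    cpu_hz = 2.7e9            # verifier's CPU clock
    fixed_verify_s = 6e-3     # 6 ms fixed verification cost
    max_gates = 16e6          # largest AC that fits in 32 GB of RAM

    fixed_ops = fixed_verify_s * cpu_hz      # ~1.62e7 native CPU ops
    print(f"F must cost > {fixed_ops:.2e} native ops to break even,")
    print(f"i.e., > {fixed_ops / max_gates:.1f} native op(s) per AC gate")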
Summary of Zebra’s applicability
Applies to IPs, but not arguments
1. Computation F must have a layered, shallow, deterministic AC
2. Must have a wide gap between cutting-edge fab (for P)and trusted fab (for V)
3. Amortizes precomputations over many instances
4. Computation F must be very large for V to save work
5. Computation F must be efficient as an arithmetic circuit
Common to essentially all built proof systems
101
105
0
109
103
107
1011
wor
ker’
s co
st
norm
aliz
ed to
nat
ive
C
matrix multiplication (m=128) PAM clustering (m=20, d=128)
N/A
1013
Pepp
erG
inge
r
Pin
occh
io
Zaa
tar
CM
T
nativ
e C
Alls
pice
Tin
yRA
M
Tha
ler
Pepp
erG
inge
r
Pin
occh
io
Zaa
tar
CM
T
nativ
e C
Alls
pice
Tin
yRA
M
Tha
ler
Figure 5—Prover overhead normalized to native execution cost for twocomputations. Prover overheads are generally enormous.
you’re sending Will Smith and Jeff Goldblum into space to saveEarth; then you care a lot more about correctness than costs (a calcu-lus that applies to ordinary satellites too). More prosaically, there isa scenario with an abundance of server CPU cycles, many instancesof the same computation to verify, and remotely stored inputs: data-parallel cloud computing. Verifiable MapReduce [15] is thereforean encouraging application.
OPEN QUESTIONS AND NEXT STEPSThe main issue in this area is performance, and the biggest problemis the prover’s overhead. The verifier’s costs are also quantitativelyhigher than ideal; qualitatively, we would ideally eliminate the veri-fier’s setup phase while retaining expressivity (such schemes existin theory [11] but have very high overheads, even in principle).
The computational model is also a critical area of focus. OnlyTinyRAM handles data-dependent loops, only Pantry handles re-motely stored inputs, and only TinyRAM and Pantry handle com-putations that work with RAM. Unfortunately, TinyRAM adds highoverhead to the circuit representation for every operation in the givencomputation; Pantry, on the other hand, adds even higher overheadto its constraint representation but only to operations that interactwith state. While improving the overhead of either representationwould be worthwhile, a more general research direction is to movebeyond the circuit and constraint model.
There are also questions in systems and programming languages.For instance, can we develop programming languages that are well-tailored to the circuit or constraint formalism? We might also be ableto co-design the language, computational model, and verification ma-chinery: many of the protocols naturally work with parallel computa-tions, and the current verification machinery is already amenable toa parallel implementation. Another systems question is to develop arealistic database application, which requires concurrency, relationalstructures, etc. More generally, an important test for this area—sofar unmet—is to run experiments at realistic scale.
Another interesting area of investigation concerns privacy. Byleveraging Pinocchio, Pantry has experimented with simple applica-tions that hide the prover’s state from the verifier, but there is morework to be done here and other notions of privacy that are worthproviding. For example, we can provide verifiability while conceal-ing the program that is executed (by composing techniques fromPantry, Pinocchio, and TinyRAM). A speculative application is toproduce versions of Bitcoin in which transactions can be conductedanonymously, in contrast to the status quo [18, 22].
REFLECTIONS AND PREDICTIONS
It is worth recalling that the intellectual foundations of this researcharea really had nothing to do with practice. For example, the PCPtheorem is a landmark achievement of complexity theory, but if wewere to implement the theory as proposed, generating the proofs,even for simple computations, would have taken longer than theage of the universe. In contrast, the projects described in this articlehave not only built systems from this theory but also performedexperimental evaluations that terminate before publication deadlines.
So that’s the encouraging news. The sobering news, of course, isthat these systems are basically toys. Part of the reason that we arewilling to label them near-practical is painful experience with whatthe theory used to cost. (As a rough analogy, imagine a graduatestudent’s delight in discovering hexadecimal machine code afteryears spent programming one-tape Turing machines.)
Still, these systems are arguably useful in some scenarios. In high-assurance contexts, we might be willing to pay a lot to know that aremotely deployed machine is executing correctly. In the streamingcontext, the verifier may not have space to compute locally, so wecould use CMT [21] to check that the outputs are correct, in concertwith Thaler’s refinements [57] to make the prover truly inexpensive.Finally, data parallel cloud computations (like MapReduce jobs)perfectly match the regimes in which the general-purpose schemesperform well: abundant CPU cycles for the prover and many in-stances of the same computation with different inputs.
Moreover, the gap separating the performance of the currentresearch prototypes and plausible deployment in the cloud is a feworders of magnitude—which is certainly daunting, but, given thecurrent pace of improvement, might be bridged in a few years.
More speculatively, if the machinery becomes truly low overhead,the effects will go far beyond verifying cloud computations: we willhave new ways of building systems. In any situation in which onemodule performs a task for another, the delegating module will beable to check the answers. This could apply at the micro level (ifthe CPU could check the results of the GPU, this would exposehardware errors) and the macro level (distributed systems could bebuilt under very different trust assumptions).
But even if none of the above comes to pass, there are excitingintellectual currents here. Across computer systems, we are startingto see a new style of work: reducing sophisticated cryptographyand other achievements of theoretical computer science to prac-tice [32, 43, 47, 60]. These developments are likely a product ofour times: the preoccupation with strong security of various kinds,and the computers powerful enough to run previously “paper-only”algorithms. Whatever the cause, proof-based verifiable computationis an excellent example of this tendency: not only does it composetheoretical refinements with systems techniques, it also raises re-search questions in other sub-disciplines of Computer Science. Thiscross-pollination is the best news of all.
AcknowledgmentsWe thank Srinath Setty both for his help with this article, includingthe experimental aspect, and for his deep influence on our under-standing of this area. This article has also benefited from many pro-ductive conversations with Justin Thaler, whose patient explanationshave been most helpful. This draft was improved by experimen-tal assistance from Riad Wahby; by detailed feedback from AlexisGallagher and the anonymous CACM reviewers; and by commentsfrom Boaz Barak, William Blumberg, Oded Goldreich, Yuval Ishai,Guy Rothblum, Riad Wahby, Eleanor Walfish, and Mark Walfish.This work was supported by AFOSR grant FA9550-10-1-0073; NSF
System Amortization regime Advice
Zebra many V-P pairs short
Allspice[VSBW13]
batch of instancesof a particular F
short
BootstrappedSNARKs[BCTV14a,CTV15]
all computations long
BCTV[BCTV14b]
all computationsof the same length
long
Pinocchio[PGHR13]
all future instancesof a particular F
long
Zaatar[SBVBPW13]
batch of instancesof a particular F
long
Exception: [CMT12] with logspace-uniform ACs
For example, libsnark [BCTV14b], a highly optimized implementa-tion of [GGPR13] and Pinocchio [PGHR13]:
V ’s work: 6 ms + (|x |+ |y |) · 3 µs on a 2.7 GHz CPU
⇒ break-even point ≥ 16× 106 CPU ops
With 32 GB RAM, libsnark handles ACs with ≤ 16× 106 gates
⇒ breaking even requires > 1 CPU op per AC gate, e.g.,computations over Fp rather than machine integers
Arguments versus IPs, redux

Design principle          IPs [GKR08, CMT12, VSBW13]   Arguments [GGPR13, SBVBPW13, PGHR13, BCTV14]
Extract parallelism       ✓                            ✓
Exploit locality          ✓                            ✗
Reduce, reuse, recycle    ✓                            ✗

Argument protocols seem unfriendly to hardware:

P computes over the entire AC at once  ⇒  needs RAM
P does crypto for every gate in the AC ⇒  needs special crypto circuits

. . . but we hope these issues are surmountable!
Summary of Zebra’s applicability
Applies to IPs, but not arguments
1. Computation F must have a layered, shallow, deterministic AC
2. Must have a wide gap between cutting-edge fab (for P)and trusted fab (for V)
3. Amortizes precomputations over many instances
4. Computation F must be very large for V to save work
5. Computation F must be efficient as an arithmetic circuit
Common to essentially all built proof systems
101
105
0
109
103
107
1011
wor
ker’
s co
st
norm
aliz
ed to
nat
ive
C
matrix multiplication (m=128) PAM clustering (m=20, d=128)
N/A
1013
Pepp
erG
inge
r
Pin
occh
io
Zaa
tar
CM
T
nativ
e C
Alls
pice
Tin
yRA
M
Tha
ler
Pepp
erG
inge
r
Pin
occh
io
Zaa
tar
CM
T
nativ
e C
Alls
pice
Tin
yRA
M
Tha
ler
Figure 5—Prover overhead normalized to native execution cost for twocomputations. Prover overheads are generally enormous.
you’re sending Will Smith and Jeff Goldblum into space to saveEarth; then you care a lot more about correctness than costs (a calcu-lus that applies to ordinary satellites too). More prosaically, there isa scenario with an abundance of server CPU cycles, many instancesof the same computation to verify, and remotely stored inputs: data-parallel cloud computing. Verifiable MapReduce [15] is thereforean encouraging application.
OPEN QUESTIONS AND NEXT STEPSThe main issue in this area is performance, and the biggest problemis the prover’s overhead. The verifier’s costs are also quantitativelyhigher than ideal; qualitatively, we would ideally eliminate the veri-fier’s setup phase while retaining expressivity (such schemes existin theory [11] but have very high overheads, even in principle).
The computational model is also a critical area of focus. OnlyTinyRAM handles data-dependent loops, only Pantry handles re-motely stored inputs, and only TinyRAM and Pantry handle com-putations that work with RAM. Unfortunately, TinyRAM adds highoverhead to the circuit representation for every operation in the givencomputation; Pantry, on the other hand, adds even higher overheadto its constraint representation but only to operations that interactwith state. While improving the overhead of either representationwould be worthwhile, a more general research direction is to movebeyond the circuit and constraint model.
There are also questions in systems and programming languages.For instance, can we develop programming languages that are well-tailored to the circuit or constraint formalism? We might also be ableto co-design the language, computational model, and verification ma-chinery: many of the protocols naturally work with parallel computa-tions, and the current verification machinery is already amenable toa parallel implementation. Another systems question is to develop arealistic database application, which requires concurrency, relationalstructures, etc. More generally, an important test for this area—sofar unmet—is to run experiments at realistic scale.
Another interesting area of investigation concerns privacy. Byleveraging Pinocchio, Pantry has experimented with simple applica-tions that hide the prover’s state from the verifier, but there is morework to be done here and other notions of privacy that are worthproviding. For example, we can provide verifiability while conceal-ing the program that is executed (by composing techniques fromPantry, Pinocchio, and TinyRAM). A speculative application is toproduce versions of Bitcoin in which transactions can be conductedanonymously, in contrast to the status quo [18, 22].
REFLECTIONS AND PREDICTIONS
It is worth recalling that the intellectual foundations of this researcharea really had nothing to do with practice. For example, the PCPtheorem is a landmark achievement of complexity theory, but if wewere to implement the theory as proposed, generating the proofs,even for simple computations, would have taken longer than theage of the universe. In contrast, the projects described in this articlehave not only built systems from this theory but also performedexperimental evaluations that terminate before publication deadlines.
So that’s the encouraging news. The sobering news, of course, isthat these systems are basically toys. Part of the reason that we arewilling to label them near-practical is painful experience with whatthe theory used to cost. (As a rough analogy, imagine a graduatestudent’s delight in discovering hexadecimal machine code afteryears spent programming one-tape Turing machines.)
Still, these systems are arguably useful in some scenarios. In high-assurance contexts, we might be willing to pay a lot to know that aremotely deployed machine is executing correctly. In the streamingcontext, the verifier may not have space to compute locally, so wecould use CMT [21] to check that the outputs are correct, in concertwith Thaler’s refinements [57] to make the prover truly inexpensive.Finally, data parallel cloud computations (like MapReduce jobs)perfectly match the regimes in which the general-purpose schemesperform well: abundant CPU cycles for the prover and many in-stances of the same computation with different inputs.
Moreover, the gap separating the performance of the currentresearch prototypes and plausible deployment in the cloud is a feworders of magnitude—which is certainly daunting, but, given thecurrent pace of improvement, might be bridged in a few years.
More speculatively, if the machinery becomes truly low overhead,the effects will go far beyond verifying cloud computations: we willhave new ways of building systems. In any situation in which onemodule performs a task for another, the delegating module will beable to check the answers. This could apply at the micro level (ifthe CPU could check the results of the GPU, this would exposehardware errors) and the macro level (distributed systems could bebuilt under very different trust assumptions).
But even if none of the above comes to pass, there are excitingintellectual currents here. Across computer systems, we are startingto see a new style of work: reducing sophisticated cryptographyand other achievements of theoretical computer science to prac-tice [32, 43, 47, 60]. These developments are likely a product ofour times: the preoccupation with strong security of various kinds,and the computers powerful enough to run previously “paper-only”algorithms. Whatever the cause, proof-based verifiable computationis an excellent example of this tendency: not only does it composetheoretical refinements with systems techniques, it also raises re-search questions in other sub-disciplines of Computer Science. Thiscross-pollination is the best news of all.
AcknowledgmentsWe thank Srinath Setty both for his help with this article, includingthe experimental aspect, and for his deep influence on our under-standing of this area. This article has also benefited from many pro-ductive conversations with Justin Thaler, whose patient explanationshave been most helpful. This draft was improved by experimen-tal assistance from Riad Wahby; by detailed feedback from AlexisGallagher and the anonymous CACM reviewers; and by commentsfrom Boaz Barak, William Blumberg, Oded Goldreich, Yuval Ishai,Guy Rothblum, Riad Wahby, Eleanor Walfish, and Mark Walfish.This work was supported by AFOSR grant FA9550-10-1-0073; NSF
System Amortization regime Advice
Zebra many V-P pairs short
Allspice[VSBW13]
batch of instancesof a particular F
short
BootstrappedSNARKs[BCTV14a,CTV15]
all computations long
BCTV[BCTV14b]
all computationsof the same length
long
Pinocchio[PGHR13]
all future instancesof a particular F
long
Zaatar[SBVBPW13]
batch of instancesof a particular F
long
Exception: [CMT12] with logspace-uniform ACs
For example, libsnark [BCTV14b], a highly optimized implementa-tion of [GGPR13] and Pinocchio [PGHR13]:
V ’s work: 6 ms + (|x |+ |y |) · 3 µs on a 2.7 GHz CPU
⇒ break-even point ≥ 16× 106 CPU ops
With 32 GB RAM, libsnark handles ACs with ≤ 16× 106 gates
⇒ breaking even requires > 1 CPU op per AC gate, e.g.,computations over Fp rather than machine integers
Summary of Zebra’s applicability
Applies to IPs, but not arguments
1. Computation F must have a layered, shallow, deterministic AC
2. Must have a wide gap between cutting-edge fab (for P)and trusted fab (for V)
3. Amortizes precomputations over many instances
4. Computation F must be very large for V to save work
5. Computation F must be efficient as an arithmetic circuit
Common to essentially all built proof systems
101
105
0
109
103
107
1011
wor
ker’
s co
st
norm
aliz
ed to
nat
ive
C
matrix multiplication (m=128) PAM clustering (m=20, d=128)
N/A
1013
Pepp
erG
inge
r
Pin
occh
io
Zaa
tar
CM
T
nativ
e C
Alls
pice
Tin
yRA
M
Tha
ler
Pepp
erG
inge
r
Pin
occh
io
Zaa
tar
CM
T
nativ
e C
Alls
pice
Tin
yRA
M
Tha
ler
Figure 5—Prover overhead normalized to native execution cost for twocomputations. Prover overheads are generally enormous.
you’re sending Will Smith and Jeff Goldblum into space to saveEarth; then you care a lot more about correctness than costs (a calcu-lus that applies to ordinary satellites too). More prosaically, there isa scenario with an abundance of server CPU cycles, many instancesof the same computation to verify, and remotely stored inputs: data-parallel cloud computing. Verifiable MapReduce [15] is thereforean encouraging application.
OPEN QUESTIONS AND NEXT STEPSThe main issue in this area is performance, and the biggest problemis the prover’s overhead. The verifier’s costs are also quantitativelyhigher than ideal; qualitatively, we would ideally eliminate the veri-fier’s setup phase while retaining expressivity (such schemes existin theory [11] but have very high overheads, even in principle).
The computational model is also a critical area of focus. OnlyTinyRAM handles data-dependent loops, only Pantry handles re-motely stored inputs, and only TinyRAM and Pantry handle com-putations that work with RAM. Unfortunately, TinyRAM adds highoverhead to the circuit representation for every operation in the givencomputation; Pantry, on the other hand, adds even higher overheadto its constraint representation but only to operations that interactwith state. While improving the overhead of either representationwould be worthwhile, a more general research direction is to movebeyond the circuit and constraint model.
There are also questions in systems and programming languages.For instance, can we develop programming languages that are well-tailored to the circuit or constraint formalism? We might also be ableto co-design the language, computational model, and verification ma-chinery: many of the protocols naturally work with parallel computa-tions, and the current verification machinery is already amenable toa parallel implementation. Another systems question is to develop arealistic database application, which requires concurrency, relationalstructures, etc. More generally, an important test for this area—sofar unmet—is to run experiments at realistic scale.
Another interesting area of investigation concerns privacy. Byleveraging Pinocchio, Pantry has experimented with simple applica-tions that hide the prover’s state from the verifier, but there is morework to be done here and other notions of privacy that are worthproviding. For example, we can provide verifiability while conceal-ing the program that is executed (by composing techniques fromPantry, Pinocchio, and TinyRAM). A speculative application is toproduce versions of Bitcoin in which transactions can be conductedanonymously, in contrast to the status quo [18, 22].
REFLECTIONS AND PREDICTIONS
It is worth recalling that the intellectual foundations of this researcharea really had nothing to do with practice. For example, the PCPtheorem is a landmark achievement of complexity theory, but if wewere to implement the theory as proposed, generating the proofs,even for simple computations, would have taken longer than theage of the universe. In contrast, the projects described in this articlehave not only built systems from this theory but also performedexperimental evaluations that terminate before publication deadlines.
So that’s the encouraging news. The sobering news, of course, isthat these systems are basically toys. Part of the reason that we arewilling to label them near-practical is painful experience with whatthe theory used to cost. (As a rough analogy, imagine a graduatestudent’s delight in discovering hexadecimal machine code afteryears spent programming one-tape Turing machines.)
Still, these systems are arguably useful in some scenarios. In high-assurance contexts, we might be willing to pay a lot to know that aremotely deployed machine is executing correctly. In the streamingcontext, the verifier may not have space to compute locally, so wecould use CMT [21] to check that the outputs are correct, in concertwith Thaler’s refinements [57] to make the prover truly inexpensive.Finally, data parallel cloud computations (like MapReduce jobs)perfectly match the regimes in which the general-purpose schemesperform well: abundant CPU cycles for the prover and many in-stances of the same computation with different inputs.
Moreover, the gap separating the performance of the currentresearch prototypes and plausible deployment in the cloud is a feworders of magnitude—which is certainly daunting, but, given thecurrent pace of improvement, might be bridged in a few years.
More speculatively, if the machinery becomes truly low overhead,the effects will go far beyond verifying cloud computations: we willhave new ways of building systems. In any situation in which onemodule performs a task for another, the delegating module will beable to check the answers. This could apply at the micro level (ifthe CPU could check the results of the GPU, this would exposehardware errors) and the macro level (distributed systems could bebuilt under very different trust assumptions).
But even if none of the above comes to pass, there are excitingintellectual currents here. Across computer systems, we are startingto see a new style of work: reducing sophisticated cryptographyand other achievements of theoretical computer science to prac-tice [32, 43, 47, 60]. These developments are likely a product ofour times: the preoccupation with strong security of various kinds,and the computers powerful enough to run previously “paper-only”algorithms. Whatever the cause, proof-based verifiable computationis an excellent example of this tendency: not only does it composetheoretical refinements with systems techniques, it also raises re-search questions in other sub-disciplines of Computer Science. Thiscross-pollination is the best news of all.
AcknowledgmentsWe thank Srinath Setty both for his help with this article, includingthe experimental aspect, and for his deep influence on our under-standing of this area. This article has also benefited from many pro-ductive conversations with Justin Thaler, whose patient explanationshave been most helpful. This draft was improved by experimen-tal assistance from Riad Wahby; by detailed feedback from AlexisGallagher and the anonymous CACM reviewers; and by commentsfrom Boaz Barak, William Blumberg, Oded Goldreich, Yuval Ishai,Guy Rothblum, Riad Wahby, Eleanor Walfish, and Mark Walfish.This work was supported by AFOSR grant FA9550-10-1-0073; NSF
System Amortization regime Advice
Zebra many V-P pairs short
Allspice[VSBW13]
batch of instancesof a particular F
short
BootstrappedSNARKs[BCTV14a,CTV15]
all computations long
BCTV[BCTV14b]
all computationsof the same length
long
Pinocchio[PGHR13]
all future instancesof a particular F
long
Zaatar[SBVBPW13]
batch of instancesof a particular F
long
Exception: [CMT12] with logspace-uniform ACs
For example, libsnark [BCTV14b], a highly optimized implementa-tion of [GGPR13] and Pinocchio [PGHR13]:
V ’s work: 6 ms + (|x |+ |y |) · 3 µs on a 2.7 GHz CPU
⇒ break-even point ≥ 16× 106 CPU ops
With 32 GB RAM, libsnark handles ACs with ≤ 16× 106 gates
⇒ breaking even requires > 1 CPU op per AC gate, e.g.,computations over Fp rather than machine integers
Summary of Zebra’s applicability
Applies to IPs, but not arguments
1. Computation F must have a layered, shallow, deterministic AC
2. Must have a wide gap between cutting-edge fab (for P)and trusted fab (for V)
3. Amortizes precomputations over many instances
4. Computation F must be very large for V to save work
5. Computation F must be efficient as an arithmetic circuit
Common to essentially all built proof systems
101
105
0
109
103
107
1011
wor
ker’
s co
st
norm
aliz
ed to
nat
ive
C
matrix multiplication (m=128) PAM clustering (m=20, d=128)
N/A
1013
Pepp
erG
inge
r
Pin
occh
io
Zaa
tar
CM
T
nativ
e C
Alls
pice
Tin
yRA
M
Tha
ler
Pepp
erG
inge
r
Pin
occh
io
Zaa
tar
CM
T
nativ
e C
Alls
pice
Tin
yRA
M
Tha
ler
Figure 5—Prover overhead normalized to native execution cost for twocomputations. Prover overheads are generally enormous.
you’re sending Will Smith and Jeff Goldblum into space to saveEarth; then you care a lot more about correctness than costs (a calcu-lus that applies to ordinary satellites too). More prosaically, there isa scenario with an abundance of server CPU cycles, many instancesof the same computation to verify, and remotely stored inputs: data-parallel cloud computing. Verifiable MapReduce [15] is thereforean encouraging application.
OPEN QUESTIONS AND NEXT STEPSThe main issue in this area is performance, and the biggest problemis the prover’s overhead. The verifier’s costs are also quantitativelyhigher than ideal; qualitatively, we would ideally eliminate the veri-fier’s setup phase while retaining expressivity (such schemes existin theory [11] but have very high overheads, even in principle).
The computational model is also a critical area of focus. OnlyTinyRAM handles data-dependent loops, only Pantry handles re-motely stored inputs, and only TinyRAM and Pantry handle com-putations that work with RAM. Unfortunately, TinyRAM adds highoverhead to the circuit representation for every operation in the givencomputation; Pantry, on the other hand, adds even higher overheadto its constraint representation but only to operations that interactwith state. While improving the overhead of either representationwould be worthwhile, a more general research direction is to movebeyond the circuit and constraint model.
There are also questions in systems and programming languages.For instance, can we develop programming languages that are well-tailored to the circuit or constraint formalism? We might also be ableto co-design the language, computational model, and verification ma-chinery: many of the protocols naturally work with parallel computa-tions, and the current verification machinery is already amenable toa parallel implementation. Another systems question is to develop arealistic database application, which requires concurrency, relationalstructures, etc. More generally, an important test for this area—sofar unmet—is to run experiments at realistic scale.
Another interesting area of investigation concerns privacy. Byleveraging Pinocchio, Pantry has experimented with simple applica-tions that hide the prover’s state from the verifier, but there is morework to be done here and other notions of privacy that are worthproviding. For example, we can provide verifiability while conceal-ing the program that is executed (by composing techniques fromPantry, Pinocchio, and TinyRAM). A speculative application is toproduce versions of Bitcoin in which transactions can be conductedanonymously, in contrast to the status quo [18, 22].
REFLECTIONS AND PREDICTIONS
It is worth recalling that the intellectual foundations of this researcharea really had nothing to do with practice. For example, the PCPtheorem is a landmark achievement of complexity theory, but if wewere to implement the theory as proposed, generating the proofs,even for simple computations, would have taken longer than theage of the universe. In contrast, the projects described in this articlehave not only built systems from this theory but also performedexperimental evaluations that terminate before publication deadlines.
So that’s the encouraging news. The sobering news, of course, isthat these systems are basically toys. Part of the reason that we arewilling to label them near-practical is painful experience with whatthe theory used to cost. (As a rough analogy, imagine a graduatestudent’s delight in discovering hexadecimal machine code afteryears spent programming one-tape Turing machines.)
Still, these systems are arguably useful in some scenarios. In high-assurance contexts, we might be willing to pay a lot to know that aremotely deployed machine is executing correctly. In the streamingcontext, the verifier may not have space to compute locally, so wecould use CMT [21] to check that the outputs are correct, in concertwith Thaler’s refinements [57] to make the prover truly inexpensive.Finally, data parallel cloud computations (like MapReduce jobs)perfectly match the regimes in which the general-purpose schemesperform well: abundant CPU cycles for the prover and many in-stances of the same computation with different inputs.
Moreover, the gap separating the performance of the currentresearch prototypes and plausible deployment in the cloud is a feworders of magnitude—which is certainly daunting, but, given thecurrent pace of improvement, might be bridged in a few years.
More speculatively, if the machinery becomes truly low overhead,the effects will go far beyond verifying cloud computations: we willhave new ways of building systems. In any situation in which onemodule performs a task for another, the delegating module will beable to check the answers. This could apply at the micro level (ifthe CPU could check the results of the GPU, this would exposehardware errors) and the macro level (distributed systems could bebuilt under very different trust assumptions).
But even if none of the above comes to pass, there are excitingintellectual currents here. Across computer systems, we are startingto see a new style of work: reducing sophisticated cryptographyand other achievements of theoretical computer science to prac-tice [32, 43, 47, 60]. These developments are likely a product ofour times: the preoccupation with strong security of various kinds,and the computers powerful enough to run previously “paper-only”algorithms. Whatever the cause, proof-based verifiable computationis an excellent example of this tendency: not only does it composetheoretical refinements with systems techniques, it also raises re-search questions in other sub-disciplines of Computer Science. Thiscross-pollination is the best news of all.
AcknowledgmentsWe thank Srinath Setty both for his help with this article, includingthe experimental aspect, and for his deep influence on our under-standing of this area. This article has also benefited from many pro-ductive conversations with Justin Thaler, whose patient explanationshave been most helpful. This draft was improved by experimen-tal assistance from Riad Wahby; by detailed feedback from AlexisGallagher and the anonymous CACM reviewers; and by commentsfrom Boaz Barak, William Blumberg, Oded Goldreich, Yuval Ishai,Guy Rothblum, Riad Wahby, Eleanor Walfish, and Mark Walfish.This work was supported by AFOSR grant FA9550-10-1-0073; NSF
System Amortization regime Advice
Zebra many V-P pairs short
Allspice[VSBW13]
batch of instancesof a particular F
short
BootstrappedSNARKs[BCTV14a,CTV15]
all computations long
BCTV[BCTV14b]
all computationsof the same length
long
Pinocchio[PGHR13]
all future instancesof a particular F
long
Zaatar[SBVBPW13]
batch of instancesof a particular F
long
Exception: [CMT12] with logspace-uniform ACs
For example, libsnark [BCTV14b], a highly optimized implementa-tion of [GGPR13] and Pinocchio [PGHR13]:
V ’s work: 6 ms + (|x |+ |y |) · 3 µs on a 2.7 GHz CPU
⇒ break-even point ≥ 16× 106 CPU ops
With 32 GB RAM, libsnark handles ACs with ≤ 16× 106 gates
⇒ breaking even requires > 1 CPU op per AC gate, e.g.,computations over Fp rather than machine integers
Summary of Zebra's applicability

Applies to IPs, but not arguments.

1. Computation F must have a layered, shallow, deterministic AC
2. Must have a wide gap between cutting-edge fab (for P) and trusted fab (for V)
3. Amortizes precomputations over many instances
4. Computation F must be very large for V to save work
5. Computation F must be efficient as an arithmetic circuit

(The last three limitations are common to essentially all built proof systems.)
[Figure 5—Prover overhead normalized to native execution cost for two computations: matrix multiplication (m=128) and PAM clustering (m=20, d=128). The y-axis shows the worker's cost normalized to native C on a log scale from 10^0 to 10^13; bars cover Pepper, Ginger, Pinocchio, Zaatar, CMT, native C, Allspice, TinyRAM, and Thaler (N/A where a system does not apply). Prover overheads are generally enormous.]
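None of the plotted systems works this way, but the classic Freivalds check for matrix multiplication (one of the two benchmarks in the figure) is a self-contained illustration of the underlying bargain: checking a claimed result can be asymptotically cheaper than recomputing it. A minimal sketch, assuming square integer matrices:

    import random

    def freivalds_check(A, B, C, rounds: int = 20) -> bool:
        """Probabilistically check that C == A*B. Each round costs O(n^2)
        (three matrix-vector products) versus O(n^3) to recompute A*B;
        an incorrect C survives each round with probability at most 1/2."""
        n = len(A)
        for _ in range(rounds):
            x = [random.randint(0, 1) for _ in range(n)]
            Bx  = [sum(B[i][j] * x[j]  for j in range(n)) for i in range(n)]
            ABx = [sum(A[i][j] * Bx[j] for j in range(n)) for i in range(n)]
            Cx  = [sum(C[i][j] * x[j]  for j in range(n)) for i in range(n)]
            if ABx != Cx:
                return False  # caught an incorrect product
        return True

    # freivalds_check([[1,2],[3,4]], [[5,6],[7,8]], [[19,22],[43,50]]) -> True

The general-purpose systems in the figure pay their enormous prover overheads precisely to get this kind of cheap checking for arbitrary F, not just matrix multiplication.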
…you’re sending Will Smith and Jeff Goldblum into space to save Earth; then you care a lot more about correctness than costs (a calculus that applies to ordinary satellites too). More prosaically, there is a scenario with an abundance of server CPU cycles, many instances of the same computation to verify, and remotely stored inputs: data-parallel cloud computing. Verifiable MapReduce [15] is therefore an encouraging application.
OPEN QUESTIONS AND NEXT STEPS

The main issue in this area is performance, and the biggest problem is the prover's overhead. The verifier's costs are also quantitatively higher than ideal; qualitatively, we would ideally eliminate the verifier's setup phase while retaining expressivity (such schemes exist in theory [11] but have very high overheads, even in principle).
The computational model is also a critical area of focus. Only TinyRAM handles data-dependent loops, only Pantry handles remotely stored inputs, and only TinyRAM and Pantry handle computations that work with RAM. Unfortunately, TinyRAM adds high overhead to the circuit representation for every operation in the given computation; Pantry, on the other hand, adds even higher overhead to its constraint representation but only to operations that interact with state. While improving the overhead of either representation would be worthwhile, a more general research direction is to move beyond the circuit and constraint model.
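To make the circuit model's difficulty with data-dependent loops concrete, here is a sketch of the standard workaround: unrolling to a static worst-case bound, so that every execution pays for the maximum number of iterations. This is hypothetical, not the transformation any particular system's compiler performs.

    MAX_ITERS = 64  # worst-case bound fixed at compile time (hypothetical)

    def steps_native(n: int) -> int:
        steps = 0
        while n != 1:               # data-dependent: cheap for easy inputs
            n = n // 2 if n % 2 == 0 else 3 * n + 1
            steps += 1
        return steps

    def steps_unrolled(n: int) -> int:
        steps = 0
        for _ in range(MAX_ITERS):  # circuit analogue: always MAX_ITERS "gates"
            done = (n == 1)
            n = n if done else (n // 2 if n % 2 == 0 else 3 * n + 1)
            steps = steps if done else steps + 1
        return steps

TinyRAM's per-step overhead has a similar flavor: each simulated step must include logic for every possible instruction, just as every unrolled iteration here includes both branches.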
There are also questions in systems and programming languages. For instance, can we develop programming languages that are well-tailored to the circuit or constraint formalism? We might also be able to co-design the language, computational model, and verification machinery: many of the protocols naturally work with parallel computations, and the current verification machinery is already amenable to a parallel implementation. Another systems question is to develop a realistic database application, which requires concurrency, relational structures, etc. More generally, an important test for this area—so far unmet—is to run experiments at realistic scale.
Another interesting area of investigation concerns privacy. By leveraging Pinocchio, Pantry has experimented with simple applications that hide the prover's state from the verifier, but there is more work to be done here and other notions of privacy that are worth providing. For example, we can provide verifiability while concealing the program that is executed (by composing techniques from Pantry, Pinocchio, and TinyRAM). A speculative application is to produce versions of Bitcoin in which transactions can be conducted anonymously, in contrast to the status quo [18, 22].
REFLECTIONS AND PREDICTIONS
It is worth recalling that the intellectual foundations of this research area really had nothing to do with practice. For example, the PCP theorem is a landmark achievement of complexity theory, but if we were to implement the theory as proposed, generating the proofs, even for simple computations, would have taken longer than the age of the universe. In contrast, the projects described in this article have not only built systems from this theory but also performed experimental evaluations that terminate before publication deadlines.
So that's the encouraging news. The sobering news, of course, is that these systems are basically toys. Part of the reason that we are willing to label them near-practical is painful experience with what the theory used to cost. (As a rough analogy, imagine a graduate student's delight in discovering hexadecimal machine code after years spent programming one-tape Turing machines.)
Still, these systems are arguably useful in some scenarios. In high-assurance contexts, we might be willing to pay a lot to know that a remotely deployed machine is executing correctly. In the streaming context, the verifier may not have space to compute locally, so we could use CMT [21] to check that the outputs are correct, in concert with Thaler's refinements [57] to make the prover truly inexpensive. Finally, data-parallel cloud computations (like MapReduce jobs) perfectly match the regimes in which the general-purpose schemes perform well: abundant CPU cycles for the prover and many instances of the same computation with different inputs.
Moreover, the gap between the performance of current research prototypes and plausible deployment in the cloud is a few orders of magnitude—which is certainly daunting but, given the current pace of improvement, might be bridged in a few years.
More speculatively, if the machinery becomes truly low overhead, the effects will go far beyond verifying cloud computations: we will have new ways of building systems. In any situation in which one module performs a task for another, the delegating module will be able to check the answers. This could apply at the micro level (if the CPU could check the results of the GPU, this would expose hardware errors) and the macro level (distributed systems could be built under very different trust assumptions).
But even if none of the above comes to pass, there are exciting intellectual currents here. Across computer systems, we are starting to see a new style of work: reducing sophisticated cryptography and other achievements of theoretical computer science to practice [32, 43, 47, 60]. These developments are likely a product of our times: the preoccupation with strong security of various kinds, and the computers powerful enough to run previously "paper-only" algorithms. Whatever the cause, proof-based verifiable computation is an excellent example of this tendency: not only does it compose theoretical refinements with systems techniques, it also raises research questions in other sub-disciplines of Computer Science. This cross-pollination is the best news of all.
Acknowledgments

We thank Srinath Setty both for his help with this article, including the experimental aspect, and for his deep influence on our understanding of this area. This article has also benefited from many productive conversations with Justin Thaler, whose patient explanations have been most helpful. This draft was improved by experimental assistance from Riad Wahby; by detailed feedback from Alexis Gallagher and the anonymous CACM reviewers; and by comments from Boaz Barak, William Blumberg, Oded Goldreich, Yuval Ishai, Guy Rothblum, Riad Wahby, Eleanor Walfish, and Mark Walfish. This work was supported by AFOSR grant FA9550-10-1-0073; NSF
Recap

[Diagram: verifier V sends input x to prover P; P returns output y together with a proof that y = F(x).]

+ Verifiable ASICs: a new approach to building trustworthy hardware under a strong threat model
+ First hardware design for a probabilistic proof protocol
+ Improves performance compared to trusted baseline

– Improvement compared to the baseline is modest
– Applicability is limited:
    precomputations must be amortized
    computation needs to be "big enough"
    large gap between trusted and untrusted technology
    does not apply to all computations

Bottom line: Zebra is plausible—when it applies
https://www.pepper-project.org/