
To Detect, Locate, and Mask Hardware Trojans in Digital Circuits by Reverse Engineering and Functional ECO

Xing Wei, Yi Diao, and Yu-Liang Wu
Easy-Logic Technology Ltd.
Hong Kong Science Park, Hong Kong
{xwei, ydiao, ylw}@easylogic.hk

Abstract— During the EDA process, a design may be tampered with directly by dishonest engineers (or “industry spies”), or indirectly through the use of malicious modules from a third-party Intellectual Property (3PIP) block vendor. During integration and fabrication, the chips may also be tampered with by an untrusted system integrator or even the foundry. Particularly for high-end commercial or classified military chips, Hardware Trojan (HT) Detect-Locate-and-Mask (DL&M) is crucially necessary to make sure a design is produced exactly as the original (golden) specification. Our objectives are (1) to detect any functionality difference which might be caused by bugs or HTs, (2) to locate and output the difference circuitry in order to correct the bugs or to investigate the intention or purpose of the tampering, and (3) to “kill” (mask) the HTs by restoring the chip’s functionality back to golden with a minimum circuitry change. Besides blocking the plotted damage at an early stage and pointing to the spy source by revealing the HT intention, the masking circuit revision must also be minimized to avoid affecting the chip performance (timing) too much. In this paper, we propose a scheme that integrates reverse engineering, formal verification, functional ECO, and logic rewiring to detect, locate and mask Hardware Trojans with minimized cost. This formal verification based scheme can guarantee catching 100% of the hidden combinational circuit HTs and can handle multiple HTs (no number limit) automatically in one run. Some techniques within our scheme won first place in the CAD Contests at ICCAD 2012, 2013, and 2014 [1–3].

I. Introduction

A. Background

Due to the increasingly complicated IC design process and the high cost of Electronic Design Automation (EDA or CAD) tools, design out-sourcing, similar to chip fabrication out-sourcing, may become more and more attractive to design houses. Moreover, circuit design is complex and a circuit may contain billions of gates, so it is virtually impractical for a single organization to complete a whole design alone. Using third party Intellectual Properties (3PIPs) from various vendors is a common practice in the industry. This practice not only promotes knowledge reuse and facilitates hierarchical design, but unfortunately also introduces serious security concerns. Dishonest engineers may change the design before fabrication. Any one of the vendors may provide modules together with secretly injected Trojans. It is also possible for unexpected functions to be fitted to the chips by an untrusted foundry and/or distributors.

Several surveys [4–6] have been published on Hardware Trojan detection and masking. A Trojan can affect a circuit in many ways: a “functional” Trojan changes the functionality of the circuit by adding, deleting or modifying the circuit’s components; a “parametric” Trojan changes the circuit function indirectly by modifying the parameters fed into the circuit. Trojans can also be classified according to their activation characteristics. They may be activated externally after receiving signals from a sensor or antenna. They can also be activated when certain conditions of the hosting circuit are fulfilled. Another classification factor is what actions the HT will perform, or in other words, the threat that it brings. Some may simply be designed for leaking information. Some may be designed to change the hosting circuitry specifications such as delay, power consumption and reliability. For instance, a chip may be designed to work correctly for ten years, but if it is implanted with an HT, it may be reliable for only one year. Among all of these Trojan purposes, functional HTs could be one of the most malicious, for they can virtually take over control of the chip from its owners and could cause the worst and most sophisticated problems. Therefore, in this paper, we mainly focus on functional Hardware Trojans in combinational logic, which may change the original design or inject logic into it for some malicious purpose.

It is easy to inject HTs into designs, but it is difficult to detect them [7–9]. They are hard to detect in the traditional circuit design and verification phases. Firstly, they are usually active only under certain circumstances. For example, they may reside within the testing circuit of the chip to avoid being detected during normal operation and be activated occasionally to carry out malicious operations. Secondly, the number of gates in modern chips is so large that exhaustive testing is infeasible. Since simulation based approaches cannot guarantee 100% detection of all hidden HTs, and the currently known formal verification techniques cannot perform well for certain NP-complete Circuit-SAT cases, we propose a so-called Complementary Greedy Coupling (CGC) formal verification scheme [10] to achieve the 100% HT detection goal. Together with an adapted functional ECO and a very powerful rewiring engine, we are able to detect, locate and mask multiple HTs (with no limit on the number of HTs) automatically in the same run, and nearly 1/5 of the test cases are actually solved by logic rewiring only for HT masking (“killing”). The runtime of our tool is within minutes for million-gate circuits. To our knowledge, there is no published work or commercial product tackling all three of these jobs (to detect, locate and mask HTs) at the same time yet.

B. An Example of Functional HT

Figure 1 illustrates a scenario in which HTs could be injected. In today’s circuit design flow, the design is first specified in an RTL specification language (like Verilog or VHDL), then it has to go through a long chain of EDA processes (logic synthesis, physical P&R, etc.), with some parts out-sourced too. During this process, HTs can be injected at any intermediate stage either by the company’s own employees or by some 3PIP vendors.

Fig. 1. A typical scenario of Hardware Trojan injection

Figure 2 shows an HT injection in a gate-level circuit. Figure 2(a) is the original netlist. Assume that input s is a redundant internal signal that is always stuck at logic 1 during the normal working mode. In Figure 2(b), some malicious logic is added into the circuit and the original “and” logic is replaced by a multiplexer. It is clear that the added logic will not be triggered in the normal working mode and cannot be tested by simulation using normal input test vectors. In our proposed scheme, this kind of HT can be detected and masked (killed and corrected) by adding some quite simple patch logic. Figure 2(c) is the produced after-masking circuit with the patch inserted. In this masking step, many valid solutions can be generated; we tend to choose the smallest patch to minimize the timing perturbation of the target chip. For other considerations (like P&R), alternative patch solutions could also be considered.

Figure 2(d) shows the HT diagnostic report, which clearly gives the suspected HT-tampered gate-level netlist, marked with red or green colors for different indications.

(a) Original netlist (b) HT tampered netlist (c) Patched netlist (d) HT tampered netlist in Verilog

Fig. 2. Functional HT injection and masking in gate-level circuit
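The scenario of Figure 2 can be mimicked in a few lines. The sketch below is a minimal, hypothetical illustration: the functions `golden` and `tampered` and the payload `a ^ b` are assumptions made for this example, not the actual circuit of Figure 2. The payload is selected by a multiplexer only when `s` is 0, so vectors drawn from the normal working mode (`s = 1`) never expose it, while a comparison over all inputs does.

```python
from itertools import product

def golden(a, b, s):
    # Original logic: a plain AND gate; s does not influence the output.
    return a & b

def tampered(a, b, s):
    # Hypothetical Trojan: a multiplexer selects the payload when s == 0.
    payload = a ^ b
    return (a & b) if s == 1 else payload

# Normal-mode simulation: s is stuck at 1, so the Trojan never fires.
normal_mode_mismatches = [
    (a, b) for a, b in product((0, 1), repeat=2)
    if golden(a, b, 1) != tampered(a, b, 1)
]

# Exhaustive comparison over every input vector, including s == 0.
all_mismatches = [
    (a, b, s) for a, b, s in product((0, 1), repeat=3)
    if golden(a, b, s) != tampered(a, b, s)
]

print(normal_mode_mismatches)  # no mismatch is visible in normal mode
print(all_mismatches)          # the Trojan shows up only for s == 0
```

Exhaustive enumeration is only feasible for toy circuits like this one; for realistic sizes the comparison has to be done by formal equivalence checking, which is the role of the verification engine described in this paper.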

C. Related Works and Our Contribution

HT related issues have drawn more and more attention in recent years. The DARPA [11] Integrity and Reliability of Integrated Circuits (IRIS) program provided a $49 million research fund for this research. However, there is no solid solution to this problem yet. In [12], reverse engineering techniques are used to extract some specific logic patterns from a fabricated gate-level netlist. However, only very simple logic patterns can be extracted. Complex but basic arithmetic blocks like adders and multipliers in data paths are not handled yet. In other recent works on detecting HTs (FANCI [13], VeriTrust [14]), the golden design (R1) is used by a simulation tool to generate simulation input patterns. FANCI and VeriTrust then apply the simulation input patterns to the new HT-injected circuit (R2). Gates with low activation probability are considered potential HT logic by FANCI. VeriTrust analyzes the fan-in cone of each signal and considers a gate suspicious if some of its inputs are redundant under the simulation test for its normal (without HT) chip (g1). In [15], the authors proposed a scheme to formally identify corruption of critical registers. So far, no published work addresses how to locate exactly where the logical body and boundary of an HT are, so that the chip owner can analyze the intended damage of the HT.

Previous works on HT detection can be generally classified into (1) code/structural analysis techniques [12–14] and (2) formal verification techniques [15]. The main disadvantages of the first kind are that they are not automatic and give no 100% guarantee of catching all HTs. The main weakness of the formal verification techniques is that current SAT-based functional formal verification has been shown to be incapable of verifying certain arithmetic logic designed in different styles [10] (e.g., Non-Booth versus Booth multipliers), which has led to the common misperception that formal verification techniques cannot be applied to the HT detection problem.

In this paper, we propose a scheme coupling reverse engineering and formal verification to overcome this incapability in arithmetic verification. Our scheme can formally compare the functionality of the golden design (G1) and the “examined” design (G2), which may have HTs injected. If any logic difference (HT) is detected, we adapt a state-of-the-art functional ECO technique to locate and mask Hardware Trojans by inserting rectification patch logic, followed by a powerful logic rewiring technique to optimize the patch logic.

II. Functional Formal Verification

Given the golden (the originally specified) and the examined netlists, functional formal verification can be performed to check the equivalence of these two netlists. An early formal verification work of this kind was proposed in [16]. The Ordered Binary Decision Diagram (OBDD), which is canonical for a logic function, was later proposed [17] and was found to be powerful for functional verification of small circuits. Techniques for solving Boolean Satisfiability (SAT) have progressed greatly in the last decade, and many high quality SAT solvers such as MiniSat [18] have been developed.

It was demonstrated [19] that SAT-based methods are robust and flexible for formal verification, and are more scalable than OBDD-based methods. However, it has been shown that regular formal verification methods are unsuitable for arithmetic logic or large circuits [20]. In [21], the experimental results show that the performance of SAT-based methods in verifying multipliers was not as good as the authors’ algebraic approach. However, the algebraic approach cannot work without knowing the module IO positions in advance.

In this paper, we adopt a novel coupling idea [10] that uses reverse engineering technology [22] to extract arithmetic logic while using a regular SAT solver to handle the other circuit logic. This coupling verification engine achieves an enormous speed-up and performance jump on this formal verification task [10], which makes HT solutions based on formal verification realistic and practical. Only seconds or minutes are needed to solve large circuits with this technique, while it might take hours or days with other tools.

III. Algorithm Flow

The whole flow is shown in Figure 3. We first synthesize the golden RTL circuit (R1) into a gate-level netlist (G1). The objective of our scheme is to compare the correct netlist (G1) and the examined netlist (G2) to locate and kill any Trojan. We first apply a reverse engineering technique to extract all arithmetic macros, like adders and multipliers, with their formula forms for verification. After extracting the arithmetic macros, a global HT locating process is performed by trimming out all equivalent parts between both circuits. Then, the potential HTs can only exist in the left-over inequivalent sub-circuits. Next, a detailed HT locate-and-mask (rectification) process is performed by using functional ECO to locate the malicious logic and mask its ill effect by adding rectification patch logic into the infected circuit. Lastly, the added patch is optimized by a logic rewiring technique to reduce its size or to fit better in P&R and timing.

Fig. 3. Algorithm flow of HT locating and masking

A. Arithmetic Extraction

We proposed a powerful reverse engineering scheme [10, 22] to extract and compare arithmetic logic that may contain arithmetic components such as adders/multipliers implemented in various styles (CLA, Ripple, Booth or Non-Booth). Figure 4 [10] shows a multiplier designed in two styles. “HA” represents a 1-bit half adder while “FA” represents a 1-bit full adder. Both design styles share some common structural units like 1-bit adders.

(a) 4 × 4 Booth multiplier (b) 4 × 4 Non-Booth Wallace tree multiplier

Fig. 4. Multipliers in different design styles [10]

In our proposed scheme, all 1-bit adders with their connections are extracted first. The functionality of a 1-bit full adder is as follows:

$$\mathit{FA}_z = a \oplus b \oplus c_i, \qquad \mathit{FA}_{co} = ab + bc_i + ac_i \tag{1}$$

where $a$, $b$, $c_i$ are inputs and $z$, $co$ are outputs. Adders/multipliers are all composed of 1-bit adders.

For example, the functionality of the 3rd output of a 4-bit multiplier in the non-Booth style can be expressed as:

$$z_2 = \mathit{HA}_z(\mathit{FA}_z(a_0b_2, a_1b_1, a_2b_0), \mathit{HA}_{co}(a_0b_1, a_1b_0)) = a_0b_2 \oplus a_1b_1 \oplus a_2b_0 \oplus \mathit{HA}_{co}(a_0b_1, a_1b_0) \tag{2}$$

Similarly, the 4th output is:

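$$z_3 = \mathit{HA}_z(\mathit{FA}_z(\mathit{FA}_z(a_0b_3, a_1b_2, a_2b_1), a_3b_0, \mathit{FA}_{co}), \mathit{HA}_{co}) = a_0b_3 \oplus a_1b_2 \oplus a_2b_1 \oplus a_3b_0 \oplus \mathit{HA}_{co} \oplus \mathit{FA}_{co} \tag{3}$$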

where FAco and HAco are the carry-out signals from the lower-weight bit positions.
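Equations (1) to (3) can be checked numerically. The following sketch is an illustration under stated assumptions: the helper names (`half_adder`, `full_adder`, `bits`, `z2_z3`) are hypothetical, and the adder arrangement is the straightforward column-wise one implied by Equations (2) and (3). It compares the XOR-tree expressions with bits 2 and 3 of the integer product for all 4-bit operand pairs.

```python
from itertools import product

def half_adder(a, b):
    # Returns (sum, carry-out) of a 1-bit half adder.
    return a ^ b, a & b

def full_adder(a, b, ci):
    # Equation (1): z = a XOR b XOR ci, co = ab + b*ci + a*ci.
    return a ^ b ^ ci, (a & b) | (b & ci) | (a & ci)

def bits(value, width=4):
    # Little-endian bit list, e.g. bits(5) gives [1, 0, 1, 0].
    return [(value >> i) & 1 for i in range(width)]

def z2_z3(a, b):
    # Partial products pp[i][j] = a_i AND b_j.
    pp = [[ai & bj for bj in b] for ai in a]
    _, ha_co1 = half_adder(pp[0][1], pp[1][0])             # carry out of column 1
    s2, fa_co2 = full_adder(pp[0][2], pp[1][1], pp[2][0])  # column 2 products
    z2, ha_co2 = half_adder(s2, ha_co1)                    # Equation (2)
    s3, _ = full_adder(pp[0][3], pp[1][2], pp[2][1])       # column 3 products
    t3, _ = full_adder(s3, pp[3][0], fa_co2)
    z3, _ = half_adder(t3, ha_co2)                         # Equation (3)
    return z2, z3

mismatches = 0
for x, y in product(range(16), repeat=2):
    z2, z3 = z2_z3(bits(x), bits(y))
    p = x * y
    if z2 != (p >> 2) & 1 or z3 != (p >> 3) & 1:
        mismatches += 1

print(mismatches)  # number of operand pairs where the formulas disagree with x * y
```

The check passes because both carries produced in column 2 have the weight of column 3, so the sum bit of column 3 is simply the XOR of its four bit products and the two incoming carries.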

We note that for such arithmetic logic, the inputs of an exclusive-or (XOR) operation can be the outputs of other XOR operations. Thus XOR operators can be grouped into different “XOR trees”. The inputs of these XOR trees are either bit products of adders/multipliers or carry-out signals of internal 1-bit adders. We can deduce the carry-out signals based on the XOR trees and connect these trees to form an “XOR forest”, which is actually a network of 1-bit adders. The construction of an XOR forest is shown in Figure 5.

Fig. 5. Construction of an XOR forest

After building the network of 1-bit adders, we determine the arithmetic functions such as additions, subtractions and multiplications from the XOR forests. Complex arithmetic logic such as a combination of adders and multipliers ((a + b) × c, a × b + c × d, ...) can then be built bottom-up. The whole algorithm flow is shown in Algorithm 1. (The above techniques won first place in the 2013 and 2014 CAD Contests at ICCAD.)

Algorithm 1: Procedure Identify adder multiplier

input : gate-level circuit
output: identified arithmetic macros

begin
    Identify 2-input XOR sub-circuits;
    Build XOR trees;
    Find the connections (cout signals) among XOR trees;
    Build XOR forests;
    Determine the functionality of each XOR forest;
    Figure out the arithmetic boundaries of each XOR forest;
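The first two steps of Algorithm 1 can be sketched on a toy netlist. This is a simplified illustration, not the authors' implementation: the netlist format (a dictionary from output name to gate type and inputs) and the function names are assumptions, XOR gates are assumed to be explicit gates rather than XOR functions hidden in AND/OR/NOT logic, and an XOR gate is assumed to feed at most one other XOR gate.

```python
# Hypothetical netlist format: output name -> (gate type, input names).
netlist = {
    "p0": ("AND", ["a0", "b1"]),
    "p1": ("AND", ["a1", "b0"]),
    "c1": ("AND", ["p0", "p1"]),   # carry-out of a half adder
    "x1": ("XOR", ["p0", "p1"]),   # sum of the same half adder
    "p2": ("AND", ["a0", "b2"]),
    "p3": ("AND", ["a1", "b1"]),
    "x2": ("XOR", ["p2", "p3"]),
    "x3": ("XOR", ["x2", "c1"]),   # x2 feeds x3, so both belong to one tree
}

def xor_gates(nl):
    # Collect the 2-input XOR gates of the netlist.
    return {n for n, (kind, ins) in nl.items() if kind == "XOR" and len(ins) == 2}

def xor_tree_roots(nl):
    # A root is an XOR gate whose output feeds no other XOR gate.
    xors = xor_gates(nl)
    feeding = {i for n in xors for i in nl[n][1] if i in xors}
    return sorted(xors - feeding)

def tree_leaves(nl, root):
    # Expand XOR inputs recursively; non-XOR signals are the leaves of the tree.
    xors = xor_gates(nl)
    leaves = []
    for i in nl[root][1]:
        leaves.extend(tree_leaves(nl, i) if i in xors else [i])
    return leaves

trees = {r: tree_leaves(netlist, r) for r in xor_tree_roots(netlist)}
print(trees)
```

In this toy case the leaf `c1` of the tree rooted at `x3` is the carry-out that belongs to the half adder whose sum is `x1`; finding such carry-out links is what connects separate XOR trees into an XOR forest.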

B. Global HT Locating

It is inefficient to compare the whole circuits directly without proper trimming. After trimming, all the potential Trojans exist only inside the non-equivalent sub-circuits. Trimming itself is a quite complicated NP-hard research topic. We adapted the algorithm proposed in [23], in which new equivalent sub-circuit pairs are identified and stripped from the circuits iteratively in our global HT locating process. Figure 6 gives a simple example of this trimming process. Figure 6(a) and (b) are the golden and examined netlists respectively. We find a sub-circuit pair in which both sub-circuits implement a 6-input XOR function. Thus we can strip away these equivalent sub-circuits from both circuits as shown in Figure 6(c). We iteratively perform the above stripping process as far as we can to leave a non-equivalent circuit pair that is as small as possible.

(a) Examined netlist (G2) (b) Golden netlist (G1) (c) Trimmed netlist

Fig. 6. An example of global HT locating
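The idea of trimming can be illustrated on a toy pair of circuits. The sketch below does not implement the algorithm of [23]; as a stand-in it matches signals by exhaustive truth-table signatures, which only works for very small circuits. The circuit encoding, the gate helpers and the tampered output gate are all hypothetical.

```python
from itertools import product

def signatures(circuit, pis):
    # circuit: list of (signal, function, operand names) in topological order.
    # Returns the truth table of every signal over all primary-input vectors.
    table = {name: [] for name, _, _ in circuit}
    for vec in product((0, 1), repeat=len(pis)):
        values = dict(zip(pis, vec))
        for name, fn, ops in circuit:
            values[name] = fn(*(values[o] for o in ops))
            table[name].append(values[name])
    return {name: tuple(tt) for name, tt in table.items()}

def residual(circuit, outputs, matched):
    # Walk back from the outputs and stop at matched signals (cut points).
    fanin = {name: ops for name, _, ops in circuit}
    todo, region = list(outputs), set()
    while todo:
        n = todo.pop()
        if n in matched or n in region or n not in fanin:
            continue
        region.add(n)
        todo.extend(fanin[n])
    return sorted(region)

def AND(x, y): return x & y
def OR(x, y): return x | y
def XOR(x, y): return x ^ y
def ANDN(x, y): return x & (1 - y)   # x AND (NOT y)

pis = ["a", "b", "c"]
golden = [("n1", XOR, ["a", "b"]), ("o", AND, ["n1", "c"])]
examined = [
    ("m1", OR, ["a", "b"]),
    ("m2", AND, ["a", "b"]),
    ("m3", ANDN, ["m1", "m2"]),   # XOR rebuilt in a different style
    ("o", OR, ["m3", "c"]),       # hypothetical tampering: AND replaced by OR
]

golden_functions = set(signatures(golden, pis).values())
matched = {n for n, s in signatures(examined, pis).items() if s in golden_functions}

print(sorted(matched))                      # signals with an equivalent in the golden netlist
print(residual(examined, ["o"], matched))   # what is left after trimming
```

Once `m3` is recognized as equivalent to `n1`, its whole fan-in cone can be stripped, and only the output gate remains as the non-equivalent region to be examined in detail.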

C. Detail HT Locating & Masking

Here we apply a technique adapted from a functional ECO process [24, 25] to locate and mask Hardware Trojans. A multiple-level Complementary Greedy Coupling (CGC) optimization scheme [26] is also embedded to smooth the “algorithm decaying effect” of this clearly intractable problem and boost the results.

Here the “old” design is the tampered implementation and the “new” design is the originally specified golden design. Since the HT may be found very late in the EDA design process or even after the foundry masks have been completed, we extended the method in [27] to generate a minimized patch to mask the HT, so as to minimize the perturbation of the already fully optimized circuit.

In a circuit, a set of Boolean variables $X = \{x_1, \ldots, x_n\}$ denotes the set of primary inputs (PIs). The functions of the primary outputs (POs) in the old and new specifications are denoted by $F(X) = \{f_1(X), f_2(X), \ldots, f_m(X)\}$ and $G(X) = \{g_1(X), g_2(X), \ldots, g_m(X)\}$ respectively.

For an old and new function pair, $f_i$ and $g_i$, a diff-set characterizes the set of input assignments for which the functions $f_i$ and $g_i$ have opposite values and is defined as follows:

$$\mathit{diff}_i(X) = f_i(X) \oplus g_i(X) \tag{4}$$

Our target is to minimize the diff-set for every function pair by changing the old functions incrementally until all diff-sets are empty (the old and new functions are equivalent). In our algorithm, we replace the existing function at an internal signal with some patch function to minimize the diff-set.

Consider an internal signal $r$ whose original function is $t(X)$, and a PO $PO_i$ driven by $r$ whose function is $f_i$. We can express $f_i(X, r)$ in terms of $X$ and $r$; the care-set for $r$ is then defined as follows.

$$\mathit{care}_i^r(X) = f_i(X, t(X)) \oplus f_i(X, \neg t(X)) \tag{5}$$

The care-set characterizes the set of input assignments for which any change at signal $r$ can be observed at the output function $f_i$. It may overlap with the diff-set and can be divided into two partitions:

• care-out-diff: contains min-terms in the care-set but not in the diff-set, $\mathit{care}_i^r \wedge \neg \mathit{diff}_i$.

• care-in-diff: contains min-terms in both the care-set and the diff-set, $\mathit{care}_i^r \wedge \mathit{diff}_i$.
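As a small worked instance of Equations (4) and (5), the sketch below enumerates the diff-set and care-set for a three-input toy circuit. The functions `t`, `f_of_r` and `g` are hypothetical and chosen only for illustration; min-terms are represented as tuples (a, b, c).

```python
from itertools import product

X = list(product((0, 1), repeat=3))   # all assignments to (a, b, c)

def t(a, b, c):
    # Hypothetical current function at the internal signal r.
    return a & b

def f_of_r(a, b, c, r):
    # Old output expressed in terms of the primary inputs and r.
    return r | c

def g(a, b, c):
    # Hypothetical new (golden) output function.
    return (a ^ b) | c

# Equation (4): assignments on which the old and new outputs disagree.
diff = {x for x in X if f_of_r(*x, t(*x)) != g(*x)}

# Equation (5): assignments on which flipping r changes the old output.
care = {x for x in X if f_of_r(*x, t(*x)) != f_of_r(*x, 1 - t(*x))}

care_out_diff = care - diff   # observable at the output, currently correct
care_in_diff = care & diff    # observable at the output, currently wrong

print(sorted(diff))
print(sorted(care_out_diff))
print(sorted(care_in_diff))
```

Here every min-term of the diff-set lies inside the care-set, so a change at `r` alone is able to reach all the erroneous assignments.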

Changing the values of the min-terms in care-out-diff changes the value of $f_i$ and enlarges the diff-set. Hence, these min-terms in the function $t$ must be preserved and the following constraint must be satisfied by the patch function $p(X)$:

$$p(X) \supseteq t(X) \wedge \mathit{care}_i^r(X) \wedge \neg \mathit{diff}_i(X) \tag{6}$$

On the other hand, in order to minimize the diff-set, $t$’s min-terms inside care-in-diff should evaluate to the opposite values:

$$p(X) \supseteq \neg t(X) \wedge \mathit{care}_i^r(X) \wedge \mathit{diff}_i(X) \tag{7}$$

Therefore, if $p(X)$ and the diff-set satisfy the following condition,

$$p(X) \supseteq \neg t(X) \wedge \mathit{diff}_i(X) \tag{8}$$

which implies that

$$\mathit{care}_i^r(X) \supseteq \mathit{diff}_i(X) \tag{9}$$

then $p(X)$ can completely empty $\mathit{diff}_i(X)$ and accomplish the new function $g_i$.
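The constraints can be exercised on a toy circuit. The sketch below is one possible way to build a patch, under stated assumptions: the functions `t`, `f_of_r` and `g` are hypothetical, and the patch simply flips `t` on care-in-diff and keeps `t` everywhere else (including the don't-care region, where any value would do). It then checks Equations (6), (7) and (9) as set inclusions over min-terms.

```python
from itertools import product

X = list(product((0, 1), repeat=3))   # all assignments to (a, b, c)

def t(a, b, c):
    return a & b                      # hypothetical current function at r

def f_of_r(a, b, c, r):
    return r | c                      # old output in terms of X and r

def g(a, b, c):
    return (a ^ b) | c                # hypothetical golden output

diff = {x for x in X if f_of_r(*x, t(*x)) != g(*x)}
care = {x for x in X if f_of_r(*x, t(*x)) != f_of_r(*x, 1 - t(*x))}

def p(a, b, c):
    # Patch: flip t inside care-in-diff, keep t everywhere else.
    x = (a, b, c)
    return 1 - t(*x) if x in (care & diff) else t(*x)

def on_set(fn):
    # On-set of a function, as a set of min-terms.
    return {x for x in X if fn(*x)}

off_t = set(X) - on_set(t)

eq6 = on_set(p) >= (on_set(t) & care) - diff   # Equation (6)
eq7 = on_set(p) >= (off_t & care & diff)       # Equation (7)
eq9 = care >= diff                             # Equation (9)

# Diff-set of the output after r is replaced by the patch function.
new_diff = {x for x in X if f_of_r(*x, p(*x)) != g(*x)}

print(eq6, eq7, eq9)
print(sorted(new_diff))
```

Because the care-set covers the whole diff-set in this example, the patch at `r` empties the diff-set and the patched output realizes the golden function.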

Specifically, when $r = PO_i$, we always have $\mathit{care}_i^r(X) \supseteq \mathit{diff}_i(X)$, which implies that we can always find a patch function satisfying constraint Equation 8 that completely empties $\mathit{diff}_i(X)$ and accomplishes the new function $g_i$ (for example, we can directly use $g_i$ as the patch function). This type of patch is called a trivial patch and is represented by $I_t$. In our algorithm, we focus mainly on the generation of non-trivial patches.

Figure 7 shows a simple example for creating a patch function. The diff-set can be reduced after generating the patch.

(a) Before patching (b) After patching

Fig. 7. An example of patch creation

Constraints Equation 6 ∼ Equation 8 are considered when creating patch functions. If the signal r only drives a single output, the corresponding patch function must satisfy both Equation 6 and Equation 7. However, an internal signal often drives multiple primary outputs, and a number of patch candidates can be obtained according to the various constraints applied to each PO. To enhance the possibility of creating an effective patch while avoiding exhaustive searches, our algorithm selects two special subsets of patch candidates, namely Conservative patches and Aggressive patches.

C.1 Conservative Patch

In the conservative strategy, any patch at signal r must guarantee that no PO’s diff-set is worsened. Thus constraint Equation 6 must be satisfied for all POs.

We select a subset of $l$ POs from the PO set $\{PO_1, PO_2, \ldots, PO_m\}$. The subset $\{PO_{i_1}, PO_{i_2}, \ldots, PO_{i_l}\}$ is called the improved PO set. The patch created at r must cut down the diff-sets of the POs in the improved PO set. In other words, for each PO in this set, constraint Equation 7 must be satisfied too.

The selection of POs and the size of the improved PO set can be adjusted dynamically as the logic patching proceeds. The smaller the improved PO set is, the easier it is to create a satisfying patch.

Figure 8 shows a conceptual example for creating a conservative patch at an internal signal driving two primary outputs. The diff-sets of both outputs are minimized as shown in the figure.


Fig. 8. An example of conservative patch

C.2 Aggressive Patch

Unlike the conservative strategy, which guarantees that no PO’s diff-set is worsened, our aggressive strategy tries to improve the diff-sets of some POs while ignoring the diff-sets of some other POs, because sometimes it is difficult to find a conservative patch.

Such a PO set can be divided into three subsets.

(i) Ignored Set: POs in this set are not considered by the patch; their diff-sets may become worse after patching. Some heuristics could be used to select certain POs into the ignored set.

(ii) No Change Set: The diff-sets of the POs in this set will not become worse (they may not be improved either). Constraint Equation 6 must be satisfied for every PO in this set. The POs that have been fixed in previous iterations (their diff-sets are already empty) must be assigned to this set, because we do not want them to become unfixed again.

(iii) Improved Set: The diff-sets of the POs in this set must be improved by the patch created. Both constraints Equation 6 and Equation 7 must be satisfied. Moreover, for at least one PO in this set, constraint Equation 8 must be satisfied. This means that the patch created must be able to fix at least one PO completely.

Figure 9 shows a simple conceptual example for creating an aggressive patch. The diff-set of o1 can be completely eliminated while the diff-set of o2 is enlarged. Our algorithm may still generate the patch in this situation. (The above technique helped us win first place in the 2012 CAD Contest at ICCAD.)


Fig. 9. An example of aggressive patch
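The difference between the two strategies can be made concrete with a toy signal that drives two outputs. The sketch below is a simplified reading of the rules above, not the authors' selection procedure: the circuit, the golden functions, the candidate patches `p1` and `p2`, and the `classify` helper are all hypothetical. A candidate counts as conservative if it worsens no diff-set and improves at least one, and as aggressive if it fixes at least one output completely while worsening another.

```python
from itertools import product

X = list(product((0, 1), repeat=3))   # all assignments to (a, b, c)

# Hypothetical circuit: signal r drives two outputs, o1 = r | c and o2 = r & c.
outputs = {"o1": lambda x, r: r | x[2], "o2": lambda x, r: r & x[2]}
golden = {"o1": lambda x: (x[0] ^ x[1]) | x[2],
          "o2": lambda x: (x[0] | x[1]) & x[2]}

def t(x):
    return x[0] & x[1]                # current function at r

def diff_sets(fn_r):
    # Diff-set of every PO when signal r computes fn_r.
    return {o: {x for x in X if outputs[o](x, fn_r(x)) != golden[o](x)}
            for o in outputs}

def classify(patch):
    old, new = diff_sets(t), diff_sets(patch)
    worsened = [o for o in outputs if new[o] - old[o]]    # new errors introduced
    improved = [o for o in outputs if new[o] < old[o]]    # strictly smaller diff-set
    fixed = [o for o in outputs if old[o] and not new[o]]
    if improved and not worsened:
        return "conservative"
    if fixed and worsened:
        return "aggressive"
    return "rejected"

def p1(x):
    # Changes r only where c == 0, which o2 cannot observe.
    return (x[0] ^ x[1]) if x[2] == 0 else (x[0] & x[1])

def p2(x):
    # Fixes o1 completely but adds an error to o2.
    return (x[0] ^ x[1]) & (1 - x[2])

print(classify(p1), classify(p2))
```

With `p1`, the diff-set of `o1` becomes empty and that of `o2` is untouched; with `p2`, `o1` is also fixed but `o2` gains a new erroneous min-term, so `o2` would have to be placed in the ignored set for the patch to be accepted.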

D. Patch Optimization

Logic rewiring has been shown to be a general and powerful logic transformation scheme in which certain wires/logic can be removed by adding alternative wires/logic without changing the functionality [28–32]. It has also been proved to be a theoretically complete logic transformation process: any logic transformation achievable by another scheme can also be achieved by a sequence of rewiring steps [33].

In our patch optimization procedure, two well coupled rewiring transformation algorithms named Add-First and Cut-First are selected to form a coupling optimization procedure.

(a) Patch before logic rewiring (b) Patch after logic rewiring

Fig. 10. An example of Add-First logic rewiring transformation

Fig. 10 shows an example of the Add-First rewiring transformation. A certain redundant wire is first added into the patch circuit (a wire from g5 to g9 in the figure) by redundancy addition and removal algorithms [29]. Then several wires, and consequently several gates (g4, g6, and g7), become redundant and thus removable. The optimized patch is shown in Fig. 10(b). We can see that the optimized patch is fully minimized.

(a) Patch before logic rewiring (b) Patch after logic rewiring

Fig. 11. An example of Cut-First logic rewiring transformation

Fig. 11 shows an example of the Cut-First rewiring transformation [32], where the wire from b to g6 in Fig. 11(a) is removed first. The removal then causes observable errors propagating from g6 to o2. By an error cancellation analysis [31], it can be shown that all errors can be corrected by adding additional logic at g8 and g9. The simplified resultant patch is shown in Fig. 11(b), which requires fewer gates and wires.

The Add-First and Cut-First rewiring techniques cover different parts of the optimization solution space, so they are selected as the coupling algorithms and performed after a patch has been generated.

E. Can a “Short-cut” design flow now become feasible?

A side benefit of developing a very powerful functional ECO engine is that it may make the following “short-cut” design flow practical. For example, suppose a series of Internet of Things (IoT) chips with just very minor differences is to be designed, and each chip requires 3 months to complete the P&R process. Under the current flow, completing 2 chips requires 3 + 3 = 6 months. But by applying functional ECO, we can: (1) complete P&R of chip A; (2) Functional-ECO (after-PR-of-A, before-PR-of-B) → patch; (3) after-PR-of-A + patch → after-PR-of-B (10 mins). Thus, this new short-cut flow can complete the same job in 3 months + 10 minutes.

IV. Experimental Results

In this section we show the capability of the scheme and the effectiveness of each stage. The experiments were performed on a Linux machine (Ubuntu 12.04 with Linux kernel 3.2) powered by a 2.33 GHz CPU and 2 GB of memory. Our tool can be downloaded from [34].

TABLE I
Extract arithmetic logics from “sea of gates”

Case    #gates           extracted arithmetics                      style*   width
ut1     280 ∼ 1261       a × c + b × c, (a + b) × c                 NB       6 ∼ 8
ut2     1197 ∼ 1994      a × b                                      B        16 ∼ 16
ut3     2727 ∼ 4226      a × b                                      B        32 ∼ 48
ut5     1025 ∼ 2261      s(a × b) ∪ s̄(c × d),                       NB       12 ∼ 12
                         (sa ∪ s̄c) × (sb ∪ s̄d)
ut7     474 ∼ 2301       (signed) a × b                             B, NB    9 ∼ 24
ut8     1061 ∼ 2308      (signed) a × b                             B, NB    23 ∼ 24
ut13    697 ∼ 2385       a × b                                      B, NB    11 ∼ 17
ut14    1402 ∼ 3442      a × b                                      B, NB    17 ∼ 19
ut15    851 ∼ 3023       a × b                                      B, NB    12 ∼ 17
ut20    584 ∼ 22600      (signed) a × b, a × b − c × d              B, NB    10 ∼ 45
ut26    564 ∼ 10383      a × b, a × b + c, a × b + c × d + e × f    B, NB    9 ∼ 28
ut32    711 ∼ 2480       (signed) a × b                             B, NB    10 ∼ 18
ut36    2855 ∼ 25489     fail to find any arithmetic macro
ut41    1103 ∼ 5463      a × b                                      B, NB    13 ∼ 30
hid1    1157 ∼ 21467     a × c + b × c, (a + b) × c                 B        11 ∼ 32
hid2    563 ∼ 24520      a × b                                      B, NB    10 ∼ 64
hid3    462 ∼ 8672       a + b + ... + y + z                        NB       4 ∼ 13
hid4    789 ∼ 5262       (signed) a × b                             B, NB    8 ∼ 24
hid5    479 ∼ 5127       (signed) a × b                             B, NB    6 ∼ 11
hid6    697 ∼ 2385       a × b                                      B, NB    11 ∼ 17
hid7    712 ∼ 2773       a × b                                      B, NB    11 ∼ 19
hid8    588 ∼ 22629      a × b + c, a × b − c × d,                  B, NB    10 ∼ 33
                         a × b + ... + e × f
hid9    1004 ∼ 3327      a × b                                      B, NB    12 ∼ 16
hid10   18911 ∼ 61496    fail to find any arithmetic macro

*B/NB stands for Booth/non-Booth

A. Arithmetic Extraction

The set of open benchmarks released by Cadence’s logic verification team for the 2014 CAD Contest at ICCAD [3] was adopted to evaluate our reverse engineering approach. Each of these benchmarks is a gate-level combinational circuit containing arithmetic logic. We use our reverse engineering technique to locate the arithmetic logic in flattened circuits (like a “sea of gates”) without knowing the component IOs and boundaries. The formulae successfully extracted by our tool are shown in Table I.

The first column is the name of a case suite. Each suite contains 13 benchmarks which may implement similar arithmetic functions but with different operand bit-widths. The extracted arithmetic logic, its design styles (Booth or Non-Booth) and operand bit-widths are shown in columns 3–5. Our technique can extract most (97%) of the contest benchmarks, with only suites ut36 and hid10 failing. With the arithmetic logic successfully extracted, regular formal verification techniques such as SAT solvers can then be called to detect the existence of any HT.

B. HTs Locating & Masking

The set of open benchmarks released by Cadence’s logic verification team for the 2012 CAD Contest at ICCAD [1] was used to evaluate our HT locating & masking scheme. Each benchmark has 2 circuits, g1 and g2, which have logic differences. We assumed that g1 is the “tampered” circuit while g2 is the golden one. We then ran our scheme on these cases; the results are shown in Table II. The first 3 columns show the benchmark information and the next 2 columns show the patch size in gates and the runtime of our scheme. The results produced by our contest version are given in the last 2 columns. Our current scheme (25th Oct. 2014 version) generates patches that are 40% smaller with CPU time reduced by 86%.

V. Conclusion

The Hardware Trojan issue has become a more and more sensitive security concern for design services nowadays. However, achieving 100% detection, locating, and masking requires tackling several difficult NP-hard or NP-complete sub-problems. In [26], a so-called “algorithm decaying effect” was observed when applying polynomial algorithms to an NP-hard problem. A complementary scheme was then designed to smooth this decaying effect and yield better results. Similarly, we also observed a “computation black hole” (“exponential trap”) when applying a single algorithm to the NP-complete Circuit-SAT problem, and hence proposed this RE@SAT coupling scheme. The jump in improvement is unusual. It would be interesting to study whether this CGC scheme, which proved to work for this HT DL&M problem, can also be useful for other NP-complete or NP-hard problems.

In this paper, we propose an empirical solution built by integrating reverse engineering, formal verification, functional ECO, and a state-of-the-art logic rewiring technique with different levels of embedded CGC schemes. Some of the related techniques contributed to our winning first place in the 2012∼2014 CAD Contests at ICCAD [1–3]. We hope our proposed scheme may provide an alternative solution to this very challenging HT problem.

TABLE II
HTs locating and masking

                           Ours***             1st at [1]
case      #g1*    #g2      #patch   time       #patch   time
                           size     (sec)      size     (sec)
open01    54      51       0        0          0        0
open02    54      51       0        0          0        0
open03    206     215      1        18         0        10
open04    220     226      0        21         0        10
open05    220     226      0        1          0        65
open06    272     211      0        24         0        235
open07    272     211      20       18         12       278
open08    758     519      38       1          55       109
open09    223     214      1        2          5        108
open10    223     214      11       3          9        154
open11    161     193      13       8          12       108
open12    161     193      21       2          20       205
open13    338     436      8        1          43       99
open14    101     65       16       11         16       61
open15    770     448      2        0          29       796
open16    748     453      1        1          21       626
hid01     26      44       0        0          0        1
hid02     26      44       0        0          0        0
hid03     170     169      2        14         1        16
hid04     176     179      1        17         1        16
hid05     177     175      3        15         3        77
hid06     212     188      10       38         29       293
hid07     210     187      18       32         19       210
hid08     460     379      108      245        107      169
hid09     137     157      1        4          1        79
hid10     137     157      10       34         8        79
hid11     84      93       9        17         10       51
hid12     84      93       18       27         25       126
hid13     637     498      75       19         64       151
hid14     41      35       10       5          10       14
hid15     422     417      35       5          129      519
hid16     442     430      3        2          96       469
hid17     460     379      54       27         111      383
hid18     96      108      55       45         55       126
hid19     188     172      65       48         FAIL     FAIL
hid20     758     519      53       78         117      456
hid21     161     184      73       146        86       498
hid22     224     174      22       28         FAIL     FAIL
total     10109   8707     757      957        1094     6597
ratio**                    61%      14%        1        1

*#g1/g2/patches: the number of gates of the g1/g2/patch circuits
**ratio is calculated excluding hid19 and hid22 for fair comparison
***Easy-HT-Killer 25th Oct. 2014 version results

References

[1] W. Jong, H.-T. Wang, C. Hsieh, and K.-Y. Khoo, “ICCAD-2012 CAD contestin finding the minimal logic difference for functional ECO and benchmarksuite: Cad contest,” in Proc. International Conference on Computer-Aided Design,2012.

[2] C.-J. Hsu, W.-H. Lin, H.-T. Wang, F. Lu, and K.-Y. Khoo, “ICCAD-2013 CADcontest in technology mapping for macro blocks and benchmark suite,” in Proc.International Conference on Computer-Aided Design, 2013.

[3] C.-J. Hsu, “ICCAD-2014 CAD contest in simultaneous CNF encoder opti-mization with SAT solver setting selection,” in Proc. International Conference onComputer-Aided Design, 2014.

[4] X. Wang, M. Tehranipoor, and J. Plusquellic, “Detecting malicious inclusions in secure hardware: Challenges and solutions,” in Proc. IEEE Intl Workshop Hardware-Oriented Security and Trust (HOST), IEEE CS Press, pp. 15–19, 2008.

[5] M. Tehranipoor and F. Koushanfar, “A survey of hardware trojan taxonomy and detection,” IEEE Design & Test of Computers, vol. 27, no. 1, pp. 10–25, 2010.

[6] B. M. B. Hopkins and T. Newby, “Hardware trojans-prevention, detection, countermeasures (a literature review),” Defence Science And Technology Organisation Edinburgh (Australia) Command Control Communications And Intelligence Div, 2011.

[7] M. Tehranipoor and F. Koushanfar, “Confronting the hardware trustworthiness problem,” Guest Editorial, IEEE Design and Test of Computers, Jan 2010.

[8] J. Zhang and Q. Xu, “On hardware trojan design and implementation at register-transfer level,” in Proc. IEEE Intl Workshop Hardware-Oriented Security and Trust (HOST), IEEE CS Press, 2013.

[9] J. Zhang, F. Yuan, and Q. Xu, “DeTrust: Defeating hardware trust verification with stealthy implicitly-triggered Hardware Trojans,” in Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security, 2014.

[10] Y. Diao, X. Wei, K.-K. Lam, and Y.-L. Wu, “Coupling reverse engineering and SAT to tackle NP-Complete arithmetic circuitry verification in O(# of gates),” in Asia and South Pacific Design Automation Conference (ASP-DAC) 2016, 2016.

[11] “DARPA, Arlington, VA, USA. Integrity and Reliability of Integrated Circuits (IRIS).” http://www.darpa.mil/Our_Work/MTO/Programs/Integrity_and_Reliability_of_Integrated_Circuits_(IRIS).aspx, 2012.

[12] P. Subramanyan, N. Tsiskaridze, W. Li, A. Gascón, W. Y. Tan, A. Tiwari, N. Shankar, S. A. Seshia, and S. Malik, “Reverse engineering digital circuits using structural and functional analyses,” Emerging Topics in Computing, IEEE Transactions on, vol. 2, pp. 63–80, 2014.

[13] A. Waksman, M. Suozzo, and S. Sethumadhavan, “FANCI: Identification of stealthy malicious logic using Boolean functional analysis,” in ACM Conference on Computer and Communications Security, pp. 697–708, 2013.

[14] J. Zhang, F. Yuan, L. Wei, Z. Sun, and Q. Xu, “VeriTrust: Verification for hardware trust,” in Proc. Design Automation Conference, 2013.

[15] J. Rajendran, V. Vedula, and R. Karri, “Detecting malicious modifications of data in third-party intellectual property cores,” in Proc. Design Automation Conference, 2015.

[16] G. L. Smith, R. J. Bahnsen, and H. Halliwell, “Boolean comparison of hardware and flowcharts,” IBM Journal of Research and Development, vol. 26, no. 1, pp. 106–116, 1982.

[17] R. Bryant, “Graph-based algorithms for Boolean function manipulation,” Computers, IEEE Transactions on, vol. C-35, no. 8, pp. 677–691, 1986.

[18] N. Eén and N. Sörensson, “An extensible SAT-solver,” in Proc. SAT, pp. 502–518, 2003.

[19] E. Goldberg, M. Prasad, and R. Brayton, “Using SAT for combinational equivalence checking,” in DATE’01, pp. 114–121, 2001.

[20] M. Järvisalo, “Equivalence checking hardware multiplier designs,” SAT Competition 2007 - benchmark description, 2007.

[21] M. Ciesielski, C. Yu, W. Brown, and D. Liu, “Verification of gate-level arithmetic circuits by function extraction,” in Proc. Design Automation Conference, 2015.

[22] X. Wei, Y. Diao, T.-K. Lam, and Y.-L. Wu, “A universal macro block mapping scheme for arithmetic circuits,” in Proceedings of the 2015 Design, Automation & Test in Europe Conference & Exhibition, pp. 1629–1634, EDA Consortium, 2015.

[23] S. Krishnaswamy, H. Ren, N. Modi, and R. Puri, “DeltaSyn: An efficient logic difference optimizer for ECO synthesis,” in Proc. International Conference on Computer-Aided Design, pp. 789–796, Nov. 2009.

[24] C.-C. Lin, K.-C. Chen, S.-C. Chang, M. Marek-Sadowska, and K.-T. Cheng, “Logic synthesis for engineering change,” in Proc. Design Automation Conference, pp. 647–652, 1995.

[25] K.-F. Tang, C.-A. Wu, P.-K. Huang, and C.-Y. R. Huang, “Interpolation-based incremental ECO synthesis for multi-error logic rectification,” in Proc. Design Automation Conference, 2011.

[26] Y.-L. Wu and M. Marek-Sadowska, “Orthogonal greedy coupling - a new optimization approach to 2-D FPGA routing,” in Proc. Design Automation Conference, 1995.

[27] S.-Y. Huang, K.-C. Chen, and K.-T. Cheng, “Autofix: A hybrid tool for automatic logic rectification,” IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 18, pp. 1376–1384, July 1999.

[28] C.-W. Chang and M. Marek-Sadowska, “Theory of wire addition and removal in combinational Boolean networks,” Microelectronic Engineering, vol. 84, no. 2, pp. 229–243, 2007.

[29] S.-C. Chang, M. Marek-Sadowska, and K.-T. Cheng, “Perturb and simplify: Multilevel Boolean network optimizer,” IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 15, pp. 1494–1504, Dec 1996.

[30] X. Yang, T.-K. Lam, and Y.-L. Wu, “ECR: A low complexity generalized error cancellation rewiring scheme,” in Proc. Design Automation Conference, pp. 511–516, 2010.

[31] X. Yang, T.-K. Lam, W.-C. Tang, and Y.-L. Wu, “Almost every wire is removable: A modeling and solution for removing any circuit wire,” in Proc. Design, Automation and Test in Europe Conference and Exhibition, 2012.

[32] X. Wei, T.-K. Lam, X. Yang, W.-C. Tang, Y. Diao, and Y.-L. Wu, “Delete and Correct (DaC): An atomic logic operation for removing any unwanted wire,” in VLSI Design and 2014 13th International Conference on Embedded Systems, 2014 27th International Conference on. IEEE, 2014.

[33] W. Kunz, D. Stoffel, and P. R. Menon, “Logic optimization and equivalence checking by implication analysis,” IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 16, Nov. 1997.

[34] “Easy-HT-Killer (25th Oct, 2014 version).” Easy-logic Technology Ltd. http://www.easylogic.hk/download/tool/Easy_HT_Killer_V20141025.tar.gz.
