Design of a Reconfigurable Hardware For Efficient Implementation of Secret Key and Public Key Cryptography

Page 1: Design of a Reconfigurable Hardware For Efficient Implementation of Secret Key and Public Key Cryptography

Design of a Reconfigurable Hardware for Efficient Implementation of Secret Key and Public Key Cryptography

Page 2: Design of a Reconfigurable Hardware For Efficient Implementation of Secret Key and Public Key Cryptography

Presentation Outline

Introduction & Motivation
Related Work
Design Methodology
Design Description
Algorithm Implementations
Comparison with other Work
Programming Paradigm
Conclusion/Work in Progress

Page 3: Design of a Reconfigurable Hardware For Efficient Implementation of Secret Key and Public Key Cryptography

Motivating Factors

Need for high speed cryptography
Need for algorithm independence
Need for more secure implementations
Need for implementing both symmetric and asymmetric key encryption

Page 4: Design of a Reconfigurable Hardware For Efficient Implementation of Secret Key and Public Key Cryptography

Need for High Speed Implementations

Software implementations cannot provide real time rates
Hardware implementations essential for
  IPSec end points
  SSL servers
  VPN at rates exceeding ATM
Algorithm implementation must be able to sustain the network bandwidth

Page 5: Design of a Reconfigurable Hardware For Efficient Implementation of Secret Key and Public Key Cryptography

Need for Algorithm Independence

IPSec
  Cipher algorithm specified in Security Association (SA)
SSL Transactions
  Algorithm negotiable for both key exchange & encryption
Need for Both Secret Key and Public Key Encryption
  Session establishment: large number of transactions
Dedicated hardware not cheap!

Page 6: Design of a Reconfigurable Hardware For Efficient Implementation of Secret Key and Public Key Cryptography

Hardware Implementation Benefits

More secure implementations
Implementing both algorithms in hardware removes the bottleneck associated with slow computations in key establishment
A single hardware implementation supporting both algorithms reduces the cost of separate hardware

Page 7: Design of a Reconfigurable Hardware For Efficient Implementation of Secret Key and Public Key Cryptography

Advantages of Reconfigurable Hardware Implementations

Algorithm Agility
Algorithm Upload/Modification
Architecture Efficiency/Throughput
Cost Efficiency

Page 8: Design of a Reconfigurable Hardware For Efficient Implementation of Secret Key and Public Key Cryptography

Comparison of Different Approaches

Page 9: Design of a Reconfigurable Hardware For Efficient Implementation of Secret Key and Public Key Cryptography

FPGAs?

Post Fabrication Customization
Low Cost Design Cycle
Fast turnaround time
Potential for Parallelism
  Instruction-level: multiple operations
  Data-level: multiple blocks of data
  Task-level: parallel tasks (e.g. secret key)

Page 10: Design of a Reconfigurable Hardware For Efficient Implementation of Secret Key and Public Key Cryptography

FPGA: The basics

General purpose logic elements (LUTs)

Very flexible interconnect

Basically fine grained to support both data paths and random logic

Page 11: Design of a Reconfigurable Hardware For Efficient Implementation of Secret Key and Public Key Cryptography

FPGA: Disadvantages

Too flexible: leads to inefficiencies
Too fine grained: again inefficiencies
Block ciphers are primarily data flow oriented, yet are implemented using a large number of small elements
Ciphers have a well defined data flow, so general purpose interconnect ends up being slow and overkill in terms of area

Page 12: Design of a Reconfigurable Hardware For Efficient Implementation of Secret Key and Public Key Cryptography

FPGA vs. Specialized Reconfigurable Logic

Coarse grained vs. fine grained
Specialized interconnect vs. generic interconnect
Reduced reconfiguration times
End result
  Faster performance with reduced area while maintaining enough flexibility to support the application domain

Page 13: Design of a Reconfigurable Hardware For Efficient Implementation of Secret Key and Public Key Cryptography

Issues in Reconfigurable Hardware Designs

How much of what to support?
  How many functional units?
  What kinds of functional units?
  How much support for random logic?
  How much interconnect flexibility to allow?
Programming/CAD tools
  What kind of programming model to target
  How to design efficient automated tools

Page 14: Design of a Reconfigurable Hardware For Efficient Implementation of Secret Key and Public Key Cryptography

Custom Reconfigurable Hardware Design: What's Involved?

Looking for commonalities/overlaps as well as disjoint elements
  Identify crucial components
  Utilize potential overlap or partial reuse
  Generic enough but fast components
  Minimizing the differences in component types
Balancing the resources
  Upper bounds/lower bounds
  Logic units vs. memory blocks
  Determining exact number of each type of unit
Make the common case fast: IMPORTANT ALWAYS!

Page 15: Design of a Reconfigurable Hardware For Efficient Implementation of Secret Key and Public Key Cryptography

Related Work

Cavium Networks’ SSL & IPSEC Protocol Aware Security Processor

USC Mark II's Advanced Cryptographic Engine for IPsec

Worcester Polytechnic Institute’s COBRA Architecture

Page 16: Design of a Reconfigurable Hardware For Efficient Implementation of Secret Key and Public Key Cryptography

SSL/IPsec Security Processor

Support for both public key and secret key encryption
Not reconfigurable
Dedicated hardware blocks for each operation

Page 17: Design of a Reconfigurable Hardware For Efficient Implementation of Secret Key and Public Key Cryptography

Advanced Cryptographic Engine (ACE)

Designed to implement flexible cipher needs of IPsec

Only supports block ciphers

Support for any algorithm through a library of general purpose FPGA implementations

Page 18: Design of a Reconfigurable Hardware For Efficient Implementation of Secret Key and Public Key Cryptography

COBRA Architecture

Custom Reconfigurable Hardware for block ciphers

Each RCE is a macro block supporting various component operations

Configured using VLIW instructions

Page 19: Design of a Reconfigurable Hardware For Efficient Implementation of Secret Key and Public Key Cryptography

Design Methodology

Literature Survey
  Block cipher implementations
  Public key cipher implementations
Identifying essential components of efficient implementations
Iterative Development of Architecture
Validation by mapping several representative algorithms
Identification of Programming Methodology

Page 20: Design of a Reconfigurable Hardware For Efficient Implementation of Secret Key and Public Key Cryptography

Categorizing Implementation Requirements

Essential step to handle the design complexity
  Logic Requirements
  Interconnection Requirements
  Memory (RAM/ROM) Requirements

Area and Performance directly affected by these

Page 21: Design of a Reconfigurable Hardware For Efficient Implementation of Secret Key and Public Key Cryptography

Prioritizing Support

Ordered by importance and then by relative hardware complexity

AES (Rijndael)
DES
Modular Exponentiation (RSA)
Serpent
Twofish
RC6, MARS, and others

Page 22: Design of a Reconfigurable Hardware For Efficient Implementation of Secret Key and Public Key Cryptography

Block Ciphers: Key Elements

Bitwise XOR, AND, OR.
Addition or subtraction modulo 2^n.
Shift or rotation by a constant number of bits.
Data-dependent rotation by a variable number of bits.
Multiplication modulo the table entry value.
Multiplication in the Galois field specified by the table entry value.
Inversion modulo the table entry value.
Look-up-table substitution.
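A few of the primitives listed above can be sketched in software to make their behavior concrete. This is only an illustration, not the hardware described in the talk: the 32-bit word size, the function names, and the choice of the AES polynomial 0x11B for the Galois field example are all assumptions made for the sketch.

```python
def add_mod(a, b, n=32):
    """Addition modulo 2**n (n = 32 is an assumed word size)."""
    return (a + b) & ((1 << n) - 1)


def rotl(x, r, n=32):
    """Rotate the n-bit word x left by r bits; r may be a constant or data dependent."""
    r %= n
    mask = (1 << n) - 1
    return ((x << r) | (x >> (n - r))) & mask


def gf256_mul(a, b, poly=0x11B):
    """Multiply two bytes in GF(2^8), reducing by poly (0x11B is the AES polynomial)."""
    result = 0
    while b:
        if b & 1:
            result ^= a        # add (XOR) the current multiple of a
        a <<= 1                # multiply a by x
        if a & 0x100:
            a ^= poly          # reduce when the degree reaches 8
        b >>= 1
    return result


def substitute(x, table):
    """Look-up-table substitution: the table plays the role of an S-box."""
    return table[x]
```

Every function except the table look-up reduces to XOR, AND, shifts, and additions, which is the kind of overlap between ciphers that a shared functional unit can exploit.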

Page 23: Design of a Reconfigurable Hardware For Efficient Implementation of Secret Key and Public Key Cryptography

Block Cipher: Core Operations

Page 24: Design of a Reconfigurable Hardware For Efficient Implementation of Secret Key and Public Key Cryptography

Modular Multiplication and Exponentiation

Modular exponentiation implemented with the square-and-multiply algorithm
Montgomery multiplication is the most popular algorithm for modular multiplication
Various approaches for implementation
  Systolic array
  Word based
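The two algorithms named on this slide can be sketched together in a few lines. This is a plain software model, not the systolic or word-based hardware organizations the slide refers to; the function names and the `bits` parameter (which sets R = 2**bits) are assumptions, the modulus must be odd and smaller than R, and the modular inverse via `pow(x, -1, m)` needs Python 3.8 or later.

```python
def montgomery_setup(modulus, bits):
    """Precompute constants for an odd modulus smaller than R = 2**bits."""
    r = 1 << bits
    n_prime = (-pow(modulus, -1, r)) % r   # modulus * n_prime == -1 (mod R)
    r2 = (r * r) % modulus                 # R^2 mod modulus, for entering Montgomery form
    return n_prime, r2


def mont_mul(a, b, modulus, bits, n_prime):
    """Return a * b * R^-1 mod modulus without dividing by the modulus."""
    mask = (1 << bits) - 1
    t = a * b
    m = ((t & mask) * n_prime) & mask      # makes t + m * modulus divisible by R
    u = (t + m * modulus) >> bits
    return u - modulus if u >= modulus else u


def mod_exp(base, exponent, modulus, bits):
    """Left-to-right square-and-multiply built on Montgomery products."""
    n_prime, r2 = montgomery_setup(modulus, bits)
    x = mont_mul(base % modulus, r2, modulus, bits, n_prime)   # base in Montgomery form
    acc = mont_mul(1, r2, modulus, bits, n_prime)              # 1 in Montgomery form
    for bit in bin(exponent)[2:]:
        acc = mont_mul(acc, acc, modulus, bits, n_prime)       # square
        if bit == "1":
            acc = mont_mul(acc, x, modulus, bits, n_prime)     # multiply
    return mont_mul(acc, 1, modulus, bits, n_prime)            # leave Montgomery form
```

The inner product needs only multiplications, additions, masks, and shifts by `bits`, which is why the operation maps well onto adder-based hardware.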

Page 25: Design of a Reconfigurable Hardware For Efficient Implementation of Secret Key and Public Key Cryptography

ME & MM

ME primarily requires fast adders
CSA based implementation most common
The highest throughput implementation used redundant representation with carry save adders for computation of partial results

The same implementation style thus selected for ME
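The idea of a redundant representation can be shown with a small software model. A carry save adder reduces three operands to a sum word and a carry word with no carry propagation, so many partial results can be accumulated quickly and the carries resolved once at the end. The function names here are assumptions for illustration, and the operands are taken to be non-negative integers.

```python
def csa(x, y, z):
    """3:2 compression: return (sum, carry) with sum + carry == x + y + z."""
    s = x ^ y ^ z                                 # bitwise sum without carries
    c = ((x & y) | (x & z) | (y & z)) << 1        # carries, moved to the next bit position
    return s, c


def add_many(operands):
    """Accumulate in redundant (sum, carry) form, then do one carry-propagate add."""
    s, c = 0, 0
    for v in operands:
        s, c = csa(s, c, v)
    return s + c
```

Each `csa` step has the delay of a single full adder regardless of word width, which is what makes the style attractive for long-operand modular arithmetic.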

Page 26: Design of a Reconfigurable Hardware For Efficient Implementation of Secret Key and Public Key Cryptography

Our Design: Key Insight

CSA made up of 2 half adders with 1 OR gate
Each half adder itself 1 XOR & 1 AND
Add some configurability to the basic CSA
Result: A fast basic element with support for most of the primitive operations

[Figure: carry save adder drawn as two half adders; signal labels A, B, Ci, X1, X2, Co, SUM]
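The gate-level structure described on this slide can be modeled bit by bit. The sketch below builds a full adder from two half adders and one OR gate and also returns the intermediate XOR and AND values, to suggest how the same gates could serve the bitwise primitives. Which internal signals the actual cell exposes is not stated in the slide text, so the `taps` dictionary and all names are assumptions.

```python
def half_adder(a, b):
    """One XOR gate and one AND gate: returns (sum, carry)."""
    return a ^ b, a & b


def full_adder(a, b, ci):
    """Two half adders plus one OR gate; inputs are single bits (0 or 1)."""
    x, c1 = half_adder(a, b)       # x = a XOR b, c1 = a AND b
    s, c2 = half_adder(x, ci)      # s = a XOR b XOR ci
    co = c1 | c2                   # the single OR gate
    taps = {"xor": x, "and": c1}   # hypothetical taps for bitwise operations
    return s, co, taps
```

With `ci` held at 0 the sum output equals `a XOR b` and the carry output equals `a AND b`, so a small amount of output selection is enough to reuse the adder for logic operations.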

Page 27: Design of a Reconfigurable Hardware For Efficient Implementation of Secret Key and Public Key Cryptography

So What Else is needed?

Shifts between rounds of addition (for modular exponentiation)

Support for fixed length shifts, rotates & arbitrary permutes of 32-bit operands (for symmetric key)

Solution: A Permutation Unit!
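A single bit-routing primitive can cover all three needs listed above. The sketch models a permutation as a table that says which input bit feeds each output bit, with `None` meaning a constant 0 so that shifts fit the same scheme. The table encoding and the helper names are assumptions for illustration and say nothing about how the actual permutation unit is configured.

```python
def permute(word, table):
    """Output bit i takes input bit table[i]; a None entry forces that bit to 0."""
    out = 0
    for i, src in enumerate(table):
        if src is not None and (word >> src) & 1:
            out |= 1 << i
    return out


def rotl_table(r, n=32):
    """Routing table for a left rotation of an n-bit word by r bits."""
    return [(i - r) % n for i in range(n)]


def shl_table(r, n=32):
    """Routing table for a left shift of an n-bit word by r bits."""
    return [i - r if i >= r else None for i in range(n)]
```

For example, `permute(0x80000001, rotl_table(1))` moves the top bit around to bit 0, while `shl_table(4)` drops the bits shifted out of the word. An arbitrary cipher permutation is just another table.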

Page 28: Design of a Reconfigurable Hardware For Efficient Implementation of Secret Key and Public Key Cryptography

Structure of Proposed Design

Final design arrived at by iterative refinement

Hierarchical Design
  Cell
  Block/Cluster
  Groups
  Top of Hierarchy

Page 29: Design of a Reconfigurable Hardware For Efficient Implementation of Secret Key and Public Key Cryptography

The Cell

[Figure: the cell, with inputs A, B, C, D; four half adders; MUX M; output select logic; outputs O1, O2. Internal signal labels: X1, X2, Ci, Co, SUM]

Page 30: Design of a Reconfigurable Hardware For Efficient Implementation of Secret Key and Public Key Cryptography

The Block/Cluster

[Figure: the block/cluster, containing a permute unit, 4-bit random logic, 64 carry save adders with inputs A, B, C, D and outputs O1, O2, and 64-bit registered outputs; buses are labeled 64 and 32 bits wide]

Page 31: Design of a Reconfigurable Hardware For Efficient Implementation of Secret Key and Public Key Cryptography

Group

[Figure: a group, made up of Block 1 through Block 5 and a Memory unit]

Page 32: Design of a Reconfigurable Hardware For Efficient Implementation of Secret Key and Public Key Cryptography

Interconnects In a Group

[Figure: interconnects among Block 1 through Block 5 and Memory within a group, with external inputs from other blocks]

Page 33: Design of a Reconfigurable Hardware For Efficient Implementation of Secret Key and Public Key Cryptography

Overall Structure

Page 34: Design of a Reconfigurable Hardware For Efficient Implementation of Secret Key and Public Key Cryptography

Random Logic Support

[Figure: random logic element with 16 configuration bits; labeled input widths of 4, 6, and 8 bits and output widths of 1, 2, and 4 bits]
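Sixteen configuration bits are exactly enough to hold the truth table of one 4-input, 1-bit-output function, which is how a conventional lookup table works. The sketch below assumes that pairing of labels; the slide text does not spell out how the 6-input and 8-input modes use the same bits, so those are not modeled. The names `lut4`, `XOR4`, and `AND4` are hypothetical.

```python
def lut4(config, a, b, c, d):
    """Evaluate a 4-input function stored as a 16-bit truth table in config."""
    index = (d << 3) | (c << 2) | (b << 1) | a   # the inputs select one of 16 bits
    return (config >> index) & 1


XOR4 = 0x6996   # truth table of a XOR b XOR c XOR d
AND4 = 0x8000   # 1 only when all four inputs are 1
```

Changing the function means rewriting only the 16-bit constant, which is the sense in which such an element supports arbitrary random logic.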