Secure Systems

Turing is from Mars, Shannon is from Venus: Computer Science and Computer Engineering

S.W. Smith, Dartmouth College

The aim of this department is to look at security issues in the context of larger systems. When thinking about systems, it’s tempting to only envision computational elements such as machines, operating systems (OSs), and programming languages, or human elements such as user interfaces, business practices, and public policy. However, to mangle an analogy from physics, the observer is also part of the system. When reasoning about or designing (or breaking into) secure systems, it’s important to remember the tools, mindset, and background we bring to the table.

Computer security’s primary background fields are computer science and computer engineering (although some might make a case for mathematical logic). These fields sometimes bring very different approaches to the same basic security problems. In this installment, we take a lighthearted look at these differences.

Within universities, distinguishing the computer science department from the computer engineering department can be tricky. At my undergraduate institution, the two programs came from the same ancestor, electrical engineering and computer science. At Dartmouth College, however, computer science emerged from the mathematics department. Engineering is in a completely different school. As a consequence, although computer science and computer engineering are natural partners for computer security courses and grants, we must move up several levels in the hierarchy of deans before we find one in common, thus complicating cooperation. Nevertheless, most courses are cross-listed, leading many undergraduates starting my operating systems class to think they’re taking an engineering course. (In fact, a “computer science” major is often a liberal arts program that lets students take courses in less technical subjects such as literature and history. Each camp seems to regard this difference as a weakness of the other camp.)

However murky the organizational roots, each discipline takes its own distinctive approach to computer and computation problems. At a very coarse (and, hence, wrong) granularity, computer science looks at computation as an abstract, almost mathematical process, drawing from thinkers such as Alan Turing on the fundamental nature of computation; computer engineering considers the physical systems that realize this process, holding up Claude Shannon’s work on the fundamental nature of information. These roots can lead to different perspectives on security problems.

Computability theory

First, we might ask the basic question: what does computation do?

Computer scientists, drawing on Turing, think of a function as a map that takes each element of an input set to some output value. Because we’re talking about computers, we can assume that the input set consists of strings of bits, and that the output values are 1 or 0. (Although this assumption might sound limiting, it’s not—we can still express any reasonable problem this way.) The Church-Turing thesis characterizes computing as running a program on a device such as a modern computer, with the minor difference that the thesis assumes the computer might have unbounded memory. For some simple functions (such as “print a 1 if the input has an odd number of ones, and 0 otherwise”), it’s pretty clear how we can write a program to compute them. We might naturally assume that we can write a program to compute any function.
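To make the parity example concrete, here is a minimal sketch in C (the function name odd_ones is just illustrative):

#include <stdio.h>

/* The simple computable function from the text: return 1 if the input
   bitstring contains an odd number of ones, and 0 otherwise. */
int odd_ones(const char *bits)
{
    int parity = 0;
    for (const char *p = bits; *p != '\0'; p++)
        if (*p == '1')
            parity ^= 1;
    return parity;
}

int main(void)
{
    printf("%d\n", odd_ones("10110"));  /* three ones, so this prints 1 */
    return 0;
}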

A classic (and surprising) result of theoretical computer science is that this assumption is wrong! Functions exist that aren’t computable. The way to construct troublesome examples is to realize that programs themselves can be represented as bitstrings—and then to start asking about programs that compute functions about programs. The standard example is the halting problem—that is, will a given program on a given input actually halt, or will it run forever? Fundamental work in computational theory shows that we can’t write a program that will get this problem right on every possible input.

Freshly armed with a PhD in computer science (and looking at the world through Turing-colored lenses), I set out as a post-doc doing security consulting at Los Alamos National Laboratory, in the early days of the World Wide Web. A large government agency approached us with a problem. They planned to construct what they called a subweb; users would enter the agency’s site via an official entrance page, but would then wander through a large collection of pages, hosted by many different third parties (typically universities), but branded with the agency’s name. The agency was worried that bored system administrators might install “Easter egg” functionality that would entertain them and their friends, but would embarrass the agency and damage its reputation. For example, suppose a Web form asked for a zip code and fed this back to a complex server-side program that produced some response to send back to the user. The agency worried that a user entering a certain zip code (say, 12345) would get a response that took the user to someplace embarrassing (say, playboy.com).

The agency asked us if we could write a Web spider that would crawl through their subweb and determine if such functionality existed. We could have certainly written a basic crawler, but these sites might have had arbitrary programs on the back end that would take Web form requests and such as input and return HTML as output.

Suppose we could write an analyzer program A that would take such a program P, determine whether there was some input to P that would return a link to playboy.com, and answer “yes” or “no.” For any arbitrary program Q and input I, we could automatically translate it to a program Q′ that runs Q on I and outputs “playboy.com” whenever it halts. If Q halts on I, our program A will flag Q′ as unacceptable; if Q doesn’t halt, our program A will approve Q′. We could thus use our analyzer program A to solve the halting problem. Because that’s impossible, such an A (that works correctly for every program) can’t exist.
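A rough illustration of that reduction follows. This is a sketch only; the helper names and the assumption that Q is available as C source defining a function q() that takes a single string are mine, not part of the agency’s setup. The point is that translating (Q, I) into Q′ is itself an easy, mechanical program to write:

#include <stdio.h>

/* Sketch of the reduction: given the source of a program Q (as a function
   q() taking one string) and an input I, emit the source of Q', a program
   that runs Q on I and prints the forbidden link only if that run halts.
   An analyzer A that correctly flagged Q' would therefore decide whether
   Q halts on I, which no program can do for every (Q, I). */
void emit_q_prime(FILE *out, const char *q_source, const char *input)
{
    fprintf(out, "%s\n", q_source);                /* definition of q(), copied from Q */
    fprintf(out, "int main(void) {\n");
    fprintf(out, "    q(\"%s\");\n", input);       /* run Q on I; this may never return */
    fprintf(out, "    puts(\"playboy.com\");\n");  /* reached only if Q halts on I */
    fprintf(out, "    return 0;\n");
    fprintf(out, "}\n");
}

int main(void)
{
    /* A toy Q that loops forever regardless of its input. */
    const char *q_source =
        "#include <stdio.h>\n"
        "void q(const char *s) { for (;;) { (void)s; } }";
    emit_q_prime(stdout, q_source, "12345");
    return 0;
}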

Chuckling, I cited Turing’s 1937 paper [1] on the halting problem in my report for this agency. (I’m not sure if they renewed their contract with us; they probably wished I had given a practical answer instead of a correct one.)

Information theory

I had another, more recent, experience with computer science undermining a security project. When trying to attack an Internet system, it helps an adversary to know the details of the system’s OS and underlying architecture because the adversary can then pick appropriate attack strategies. A student suggested defending systems by keeping adversaries from obtaining this knowledge, and (as an engineer) proposed measuring these techniques’ effectiveness via entropy.

Entropy, in Shannon’s information theory, measures how much information a message carries. (Some scientists characterize information content as how surprising a message is.) Although information theory is a beautiful way to reason about things such as transmission over noisy channels, it neglects to take into account the difficulty of actually extracting information. Consequently, in many practical scenarios, trying to minimize what the adversary might learn by minimizing entropy gives exactly the wrong answer because a substantial gap might exist between how much information a message contains and what the receiver can extract. (University lecturers certainly appreciate this phenomenon.)
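For reference, the quantity at issue is Shannon’s entropy: a source X that emits symbol x with probability p(x) carries

    H(X) = -\sum_x p(x) \log_2 p(x)

bits of information per symbol. A message is informative exactly to the extent that it is improbable; nothing in this definition accounts for how hard the information is to extract computationally, which is the gap the rest of this example turns on.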

For example, think about encryption with standard symmetric block ciphers such as the Triple-mode Data Encryption Standard (TDES) or the Advanced Encryption Standard (AES). An eavesdropping adversary can see the ciphertext, but knows neither the plaintext nor the key; to get this information, the adversary might try to guess the key and then use this guess to decrypt the ciphertext to a potential plaintext. If the plaintext might be any possible bitstring, the adversary can never know if he or she guessed the right key. However, if the plaintext space is sufficiently sparse (for example, English text in ASCII characters instead of arbitrary binary) and if the ciphertext decrypts to valid plaintext, the adversary can be more confident that he or she guessed correctly. Shannon formalized this relation: the longer the ciphertext gets, the more confident the adversary can be that a guess is correct, if the ciphertext decrypts to sensible plaintext. Once the ciphertext is sufficiently long, it uniquely determines the plaintext. For encrypt–decrypt–encrypt TDES and English ASCII plaintext, as few as 18 characters are enough.
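A back-of-the-envelope version of Shannon’s argument (the numbers here are my own rough assumptions): the ciphertext length at which the plaintext becomes essentially unique is about

    H(K) / D,

where H(K) is the key’s entropy and D is the plaintext’s redundancy per character. If the encrypt–decrypt–encrypt construction uses two independent 56-bit keys, H(K) = 112 bits; English in 8-bit ASCII carries very roughly 1.3 bits of real information per character, leaving D of about 6.7 bits of redundancy, so the threshold works out to roughly 112 / 6.7, or about 17 characters, which is in the same ballpark as the 18 quoted above.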

Suppose then that we want to keep the adversary from learning some data, such as machine configuration, that we express as 18 or more characters of English text, and we want to choose between two strategies. Should we delete every other character and send the remainder to the adversary? Or should we encrypt the entire string with TDES and a secret random key, and send the ciphertext to the adversary? If we use entropy to decide, the former is a better strategy. If we use common sense, the latter is better. Yes, we might have given the adversary sufficient information to compute the plaintext, but (by current intractability assumptions) the computation isn’t feasible. Computer science and computer engineering give apparently contradictory answers—Turing is from Mars, Shannon is from Venus.

On the other hand, Shannon’s information theory can also enable security projects. For example, consider trying to build a system that decrypts a ciphertext, encrypted under an unknown key, by systematically trying every key. The typical computer science student could build a decryption module and determine how to coordinate the activity of a vast collection of modules in parallel. Skilled in complexity theory, a computer science student might also be able to estimate the expected time and space complexity necessary for this search. However, a computer science education where Turing eclipsed Shannon wouldn’t help the student tell, automatically, whether a guessed key is right.

For this problem, information theory gives the right answer. The point where the information in the ciphertext dominates the adversary’s uncertainty about the key (for example, the approximately 18 characters in the previous case) is called the unicity distance. In the late 1990s, the Electronic Frontier Foundation (EFF) made a technical (and political) statement by building Deep Crack, a machine that used a massive number of such decryption modules in parallel to break DES by brute force [2]. It takes information theory to appreciate the technical slickness of the EFF design: essentially, each decryption module uses unicity distance to limit how much ciphertext it has to deal with, and a bit-vector to test for plaintext such as ASCII.
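The core loop of such a search might look roughly like the following C sketch. This is not the EFF design: the toy_decrypt stand-in, the 24-byte ciphertext window, and the printable-ASCII test are assumptions made only to show the shape of the unicity-distance and bit-vector ideas.

#include <stdint.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-in for a real DES decryption of one 8-byte block under a 56-bit key.
   (A real key-search engine would wire in an actual DES core here.) */
static void toy_decrypt(uint64_t key, const uint8_t in[8], uint8_t out[8])
{
    for (int i = 0; i < 8; i++)
        out[i] = in[i] ^ (uint8_t)(key >> (8 * (i % 7)));
}

/* Bit vector marking byte values that look like plausible plaintext:
   printable ASCII plus common whitespace. */
static uint8_t ok[256 / 8];

static void init_ok(void)
{
    for (int c = 0; c < 256; c++)
        if ((c >= 0x20 && c < 0x7f) || c == '\n' || c == '\r' || c == '\t')
            ok[c / 8] |= (uint8_t)(1u << (c % 8));
}

static int plausible(const uint8_t *p, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (!(ok[p[i] / 8] & (1u << (p[i] % 8))))
            return 0;
    return 1;
}

/* Try every 56-bit key against just a few blocks of ciphertext: comfortably
   past the unicity distance, so a key whose output is all plausible ASCII is
   very likely correct, and no more ciphertext ever needs to be touched. */
static void search(const uint8_t ct[24])
{
    init_ok();
    for (uint64_t key = 0; key < (1ULL << 56); key++) {
        uint8_t pt[24];
        for (int b = 0; b < 3; b++)
            toy_decrypt(key, ct + 8 * b, pt + 8 * b);
        if (plausible(pt, 24))
            printf("candidate key: %016llx\n", (unsigned long long)key);
    }
}

int main(void)
{
    uint8_t ct[24] = {0};  /* placeholder; a real run would load actual ciphertext */
    search(ct);            /* hopeless on one machine, which is why EFF built hardware */
    return 0;
}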

A dangerous fondness for clean abstractions

Computer science educations can also leave students ill-equipped in other ways. My colleague Michael Caloyannides (who has also written for S&P) has an entire list. One that stayed with me is the tendency of nonengineers to believe physical specifications (that is, I suppose, if they read them at all). For example, students tend to believe that if the spec for the wireless card says that the signal goes 50 meters, it can’t be detected at 51 meters, and it can be detected perfectly at 49 meters.

This becomes a problem when such students use these assumptions to guide subsequent operational and policy decisions. The system they’re securing isn’t an abstraction following clean rules, but a mess of physical devices interconnected in the real world.

Overemphasis on clean abstractions hampers computer science students’ security education in other ways. A continuing debate exists in computer science departments about what programming languages to use in the curriculum. Usually, this debate centers around which high-level language provides the cleanest abstraction to let new students learn basic good design principles and algorithms. Maybe I’m just being a middle-aged fogey here (having come of age in an era when the only way you could get a computer as a teenager was to design and build one yourself). However, by the time these students arrive in my operating systems class, I have to undo the clean abstractions they learned in their introductory courses, and instead teach them C and explain that their high-level languages reduce to compilers, OSs, syscalls, machine code, and transistors.

Students need to understand how these abstract functions, methods, variables, and so on—which have always magically appeared for them—are actual bytes living in actual memory. The abstraction of high-level object-oriented languages might be necessary for learning good design principles, but stripping the abstraction away is necessary for understanding many important issues in performance, efficiency, implementation, and security. Students need both levels of understanding to be effective computer professionals.
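As one small illustration of what “actual bytes in actual memory” means (a sketch of the kind of classroom exercise involved; the struct and names are made up), a few lines of C can show students the raw bytes behind variables they have been taking for granted:

#include <stddef.h>
#include <stdio.h>

/* Dump the raw bytes behind an object exactly as they sit in memory on
   this machine; layout, padding, and byte order are implementation-defined. */
struct point { char tag; int x; };

static void dump(const void *p, size_t n, const char *label)
{
    const unsigned char *b = p;
    printf("%-8s", label);
    for (size_t i = 0; i < n; i++)
        printf(" %02x", b[i]);
    putchar('\n');
}

int main(void)
{
    int n = 0x01020304;
    struct point pt = { 'A', 42 };
    dump(&n, sizeof n, "int:");      /* the byte order reveals endianness            */
    dump(&pt, sizeof pt, "struct:"); /* padding bytes usually appear between members */
    return 0;
}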

For example, buffer overflow is still a major source of security problems in the information infrastructure. The “return-to-libc” variant (which caused a compromise at Dartmouth recently) is particularly nasty; the attacker doesn’t even need to inject code, so marking the stack as nonexecutable doesn’t help. Students suffer from these attacks today; when they graduate into the real world as computer professionals, their employers and customers will face these attacks. Without understanding how it all comes down to bytes in memory, students won’t be equipped to handle these challenges.
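The shape of the underlying bug is worth seeing in a handful of lines. This is a deliberately vulnerable sketch with invented names, for illustration only:

#include <stdio.h>
#include <string.h>

/* Classic stack buffer overflow: copying with no bounds check lets input
   longer than 64 bytes overwrite the saved return address. In the
   return-to-libc variant, the attacker overwrites that address with the
   address of an existing library function (such as system()) and supplies
   a crafted argument, so no injected code ever runs from the stack. */
void greet(const char *name)
{
    char buf[64];
    strcpy(buf, name);           /* no length check: the overflow */
    printf("hello, %s\n", buf);
}

int main(int argc, char **argv)
{
    if (argc > 1)
        greet(argv[1]);
    return 0;
}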

Light bulbs needed

Moving from stack-smashing to abstraction-smashing, a few years ago, Princeton’s Sudhakar Govindavajhala and Andrew Appel developed a wonderful example of the interplay between perspectives of understanding and security issues [3]. At an Internet level, a fundamental security challenge in the infrastructure is that many parties want to run code on your computer. How do you permit this functionality without also letting malicious or merely clumsy programmers trash your machine? One way to make it easier for programmers to do the right thing—and to assure the end user that these programs won’t do the wrong thing—is to work in a programming language designed to make it (hopefully) impossible to write code that damages the rest of the system. This idea was behind Java Virtual Machines, but the Princeton researchers defeated this language-based security mechanism by filling memory with devious data structures that broke the protection if any one bit in a large fraction of memory spontaneously flipped. They then induced such spontaneous bit flips using a light bulb—the heat gently induced memory errors. For other examples of the real world breaking clean abstractions, we need only consider the continuing effectiveness of side-channel attacks on cryptographic systems, such as a smart card on a lab table—or a Secure Sockets Layer server on the other end of the Internet.

Curmudgeons might observe that we could defeat the light bulb attacks on Java using error-correcting memory. Still, too few students see the whole picture—that is, the problem that type-safety can solve, the ways it can solve that problem, and the fact that light bulbs might be a risk. Turing is from Mars, Shannon is from Venus, and never the twain shall meet. We need to change that.

References

1. A. Turing, “On Computable Numbers, with an Application to the Entscheidungsproblem,” Proc. London Mathematical Society, ser. 2, vol. 42, 1937.

2. Electronic Frontier Foundation, Cracking DES, O’Reilly, 1998.

3. S. Govindavajhala and A. Appel, “Using Memory Errors to Attack a Virtual Machine,” Proc. IEEE Symp. Security and Privacy, IEEE CS Press, 2003, p. 154.

S.W. Smith is an assistant professor of computer science at Dartmouth College. Previously, he was a research staff member at IBM Watson, working on secure coprocessor design and validation, and a staff member at Los Alamos National Laboratory, doing security reviews and designs for public-sector clients. He received a BA in mathematics from Princeton University and an MSc and a PhD in computer science from Carnegie Mellon University. Contact him at [email protected]; www.cs.dartmouth.edu/~sws/.
