CS/IT 138 THEORY OF COMPUTATION
Chapter 1
Introduction to the Theory of Computation
Topic of course
• What are the fundamental capabilities and limitations of computers?
• To answer this, we will study abstract mathematical models of computers
• These mathematical models abstract away many of the details of computers to allow us to focus on the essential aspects of computation
• This abstraction allows us to develop a mathematical theory of computation
Review of set theory
Can specify a set in two ways:
- list of elements: A = {6, 12, 28}
- characteristic property: B = {x | x is a positive, even integer}
Set membership: 12 ∈ A, 9 ∉ A
Set inclusion: A ⊆ B (A is a subset of B)
A ⊂ B (A is a proper subset of B)
Set operations:
union: A ∪ {9, 12} = {6, 9, 12, 28}
intersection: A ∩ {9, 12} = {12}
difference: A - {9, 12} = {6, 28}
Set theory (continued)
Another set operation, called “taking the complement of a set”, assumes a universal set.
Let U = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9} be the universal set.
Let A = {2, 4, 6, 8}
Then Ā = U - A = {0, 1, 3, 5, 7, 9}
The empty set: ∅
Set theory (continued)
The cardinality of a set is the number of elements in the set.
Let S = {2, 4, 6}
Then |S| = 3
The powerset of S, represented by 2^S, is the set of all subsets of S.
2^S = {{}, {2}, {4}, {6}, {2,4}, {2,6}, {4,6}, {2,4,6}}
The number of elements in a powerset is |2^S| = 2^|S|
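The set operations reviewed above can be tried out directly. The following is a minimal sketch using Python's built-in sets; the helper name `powerset` is an illustrative choice, not something from the slides.

```python
from itertools import chain, combinations

A = {6, 12, 28}
print(12 in A, 9 in A)      # set membership
print(A | {9, 12})          # union
print(A & {9, 12})          # intersection
print(A - {9, 12})          # difference

# Complement is taken relative to a universal set U.
U = set(range(10))
complement = U - {2, 4, 6, 8}

def powerset(s):
    """Return all subsets of s as a list of frozensets (hypothetical helper)."""
    items = sorted(s)
    subsets = chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1)
    )
    return [frozenset(c) for c in subsets]

S = {2, 4, 6}
P = powerset(S)
print(len(P) == 2 ** len(S))  # checks |2^S| = 2^|S| for this S
```

Subsets are stored as frozensets because ordinary Python sets cannot be elements of other sets.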
What does the title of this course mean?
• Formal language
– a subset of the set of all possible strings from a set of symbols
– example: the set of all syntactically correct C programs
• Automata
– abstract, mathematical models of computers
– examples: finite automata, pushdown automata, Turing machine
Alphabet = finite set of symbols or characters
examples: Σ = {a, b}, binary, ASCII
String = finite sequence of symbols from an alphabet
examples: aab, bbaba, also computer programs
A formal language is a set of strings over an alphabet
Examples of formal languages over alphabet Σ = {a, b}:
L1 = {aa, aba, aababa, aa}
L2 = {all strings containing just two a’s and any number of b’s}
A formal language can be finite or infinite.
Formal language
We often use string variables; u = aab, v = bbaba
Operations on strings
length: |u| = 3
reversal: u^R = baa
concatenation: uv = aabbbaba
The empty string, denoted λ, has some special properties:
|λ| = 0
λw = wλ = w
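As a quick check of the string operations above, here is a small sketch that models strings over {a, b} as Python strings. Using Python's empty string `""` to stand in for the empty string is an assumption of this sketch.

```python
u = "aab"
v = "bbaba"
empty = ""  # stands in for the empty string

length = len(u)       # |u|
reversal = u[::-1]    # reversal of u
concat = u + v        # concatenation uv

print(length, reversal, concat)
# The empty string has length 0 and is an identity for concatenation.
print(len(empty) == 0, empty + u == u, u + empty == u)
```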
Formal languages (continued)
Operations on languages
Set operations:
L1 ∪ L2 = {x | x ∈ L1 or x ∈ L2} is union
L1 ∩ L2 = {x | x ∈ L1 and x ∈ L2} is intersection
L1 - L2 = {x | x ∈ L1 and x ∉ L2} is difference
L̄ = Σ* - L is complement
(L1 - L2) ∪ (L2 - L1) is the “symmetric difference” of L1 and L2
String operations:
L^R = {w^R | w ∈ L} is “reverse of language”
L1L2 = {xy | x ∈ L1, y ∈ L2} is “concatenation of languages”
L* = {x = x1…xk | k ≥ 0 and x1, …, xk ∈ L} = L^0 ∪ L^1 ∪ L^2 ∪ . . . is “Kleene star” or "star closure"
L+ = L^1 ∪ L^2 ∪ . . . is positive closure
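The language operations can be illustrated on small finite languages. This is only a sketch: the two example languages and the function names `concat`, `power`, `star_up_to`, and `reverse` are made up for illustration, and since the star closure is infinite for most languages, `star_up_to` computes just the finite union of the powers 0 through `max_k`.

```python
def concat(l1, l2):
    """Concatenation of languages: every xy with x in l1 and y in l2."""
    return {x + y for x in l1 for y in l2}

def power(lang, k):
    """k-fold concatenation of lang with itself; power 0 is {empty string}."""
    result = {""}
    for _ in range(k):
        result = concat(result, lang)
    return result

def star_up_to(lang, max_k):
    """Finite approximation of the star closure: powers 0..max_k only."""
    result = set()
    for k in range(max_k + 1):
        result |= power(lang, k)
    return result

def reverse(lang):
    """Reverse of a language: reverse every string in it."""
    return {w[::-1] for w in lang}

# Two small example languages over {a, b} (illustrative only).
La = {"a", "ab"}
Lb = {"b", ""}

print(La | Lb, La & Lb, La - Lb)   # union, intersection, difference
print(La ^ Lb)                     # symmetric difference
print(concat(La, Lb))
print(star_up_to(La, 2))
```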
Important example of a formal language
• alphabet: ASCII symbols
• string: a particular C++ program
• formal language: set of all legal C++ programs
Language-recognition problem
• There are many types of computational problem. We will focus on the simplest, called the “language-recognition problem.”
• Given a string, determine whether it belongs to a language or not. (Practical application for compilers: Is this a valid C++ program?)
• We study simple models of computation called “automata,” and measure their computational power in terms of the class of languages they can recognize.
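A language-recognition problem can be pictured as a function from strings to yes/no answers. The sketch below decides membership in the earlier example language L2 (strings over {a, b} containing just two a's and any number of b's); the function name `in_L2` is hypothetical.

```python
def in_L2(w):
    """Decide whether w belongs to L2: exactly two a's, any number of b's."""
    # Reject strings that use symbols outside the alphabet {a, b}.
    if any(ch not in "ab" for ch in w):
        return False
    return w.count("a") == 2

for w in ["aa", "babab", "bab", "aaa"]:
    print(w, in_L2(w))
```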
Grammars
A grammar G is defined as a quadruple:
G = (V, T, S, P)
Where V is a finite set of objects called variables
T is a finite set of objects called terminal symbols
S ∈ V is a special symbol called the Start symbol
P is a finite set of productions or "production rules"
Sets V and T are nonempty and disjoint
Grammars
Production rules have the form:
x → y
where x is an element of (V ∪ T)+ and y is in (V ∪ T)*
Given a string of the form
w = uxv
and a production rule
x → y
we can apply the rule, replacing x with y, giving z = uyv
We can then say that w ⇒ z
Read as "w derives z", or "z is derived from w"
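A single derivation step can be sketched as a string rewrite. In this illustration, sentential forms are plain Python strings, and `apply_rule` (a hypothetical helper) returns every z = uyv obtainable from w = uxv by replacing one occurrence of x with y.

```python
def apply_rule(w, x, y):
    """Return all strings obtained from w by replacing one occurrence of x with y."""
    results = []
    start = w.find(x)
    while start != -1:
        # w = u x v with u = w[:start] and v = w[start + len(x):]
        results.append(w[:start] + y + w[start + len(x):])
        start = w.find(x, start + 1)
    return results

print(apply_rule("aSb", "S", "aSb"))  # one step using the rule S -> aSb
print(apply_rule("SS", "S", "a"))     # either occurrence of S may be rewritten
```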
Grammars
If u ⇒ v, v ⇒ w, w ⇒ x, x ⇒ y, and y ⇒ z, then we say: u ⇒* z
This says that u derives z in an unspecified number of steps.
Along the way, we may generate strings which contain variables as well as terminals. These are called sentential forms.
Grammars
What is the relationship between a language and a grammar?
Let G = (V, T, S, P)
The set
L(G) = {w ∈ T* : S ⇒* w}
is the language generated by G.
Grammars
Consider the grammar G = (V, T, S, P), where:
V = {S}
T = {a, b}
S = S (the start symbol),
P = {S → aSb,
S → λ}
Grammars
What are some of the strings in this language?
S ⇒ aSb ⇒ ab
S ⇒ aSb ⇒ aaSbb ⇒ aabb
S ⇒ aSb ⇒ aaSbb ⇒ aaaSbbb ⇒ aaabbb
It is easy to see that the language generated by this grammar is:
L(G) = {a^n b^n : n ≥ 0}
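The claim about L(G) can be spot-checked by enumerating derivations mechanically. The sketch below does a breadth-first search over sentential forms for the grammar with productions S → aSb and S → λ, keeping only terminal strings up to a length bound. The names `PRODUCTIONS`, `VARIABLES`, and `generate` are illustrative, and the search is only guaranteed to terminate for grammars like this one, where every sentential form stays short.

```python
from collections import deque

PRODUCTIONS = [("S", "aSb"), ("S", "")]  # "" stands in for the empty string
VARIABLES = {"S"}

def generate(max_len):
    """Collect terminal strings of length <= max_len derivable from S."""
    language = set()
    seen = {"S"}
    queue = deque(["S"])
    while queue:
        form = queue.popleft()
        if not any(sym in VARIABLES for sym in form):
            language.add(form)  # no variables left: a string of the language
            continue
        for left, right in PRODUCTIONS:
            pos = form.find(left)
            while pos != -1:
                new_form = form[:pos] + right + form[pos + len(left):]
                terminals = sum(1 for sym in new_form if sym not in VARIABLES)
                if terminals <= max_len and new_form not in seen:
                    seen.add(new_form)
                    queue.append(new_form)
                pos = form.find(left, pos + 1)
    return language

print(sorted(generate(6), key=len))
```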
Grammars
Let's go the other way, from a description of a language to a grammar that generates it. Find a grammar that generates:
L = {a^n b^(n+1) : n ≥ 0}
So the strings of this language will be:
b (0 a's and 1 b)
abb (1 a and 2 b's)
aabbb (2 a's and 3 b's)
. . .
In order to generate a string with no a's and 1 b, you might want to write rules for the grammar that say:
S → ab
a → λ
But you can't do this; a is a terminal, and you can't change a terminal, only variables.
Grammars
So, instead of:
S → ab
a → λ
we create another variable, A (we often use capital letters to stand for variables), to use in place of the terminal, a:
S → Ab
A → λ
Now you might think that we can use another S rule here to generate the other part of the string, the a^n b^n part:
S → aSb
But you can't, because that will generate ab, aabb, etc.
Note, however, that if we use A in place of S, that will solve our problem:
A → aAb
Grammars
So, here are our rules:
S → Ab
A → aAb
A → λ
The S → Ab rule creates a single b terminal on the right, preceded by other strings (including possibly the empty string) on the left.
The A → λ rule allows the single b string to be generated.
The A → aAb rule and the A → λ rule allow ab, aabb, aaabbb, etc. to be generated on the left side of the string.
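To see how the three rules cooperate, the following sketch builds one derivation step by step: S → Ab once, then A → aAb n times, then A → λ. The function name `derive` is hypothetical, and Python's empty string stands in for λ.

```python
def derive(n):
    """Return the sentential forms in a derivation of a^n b^(n+1)."""
    forms = ["S"]
    current = "Ab"                                  # apply S -> Ab
    forms.append(current)
    for _ in range(n):
        current = current.replace("A", "aAb", 1)    # apply A -> aAb
        forms.append(current)
    current = current.replace("A", "", 1)           # apply A -> lambda
    forms.append(current)
    return forms

print(" => ".join(derive(2)))
```

Each form except the last contains a variable, so the intermediate entries are sentential forms rather than strings of the language.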
Automata
“Computer” or Turing machine (Alan Turing, 1936)
[Figure: a Turing machine, showing a finite-state control (labeled 0, 1, 2, 3), a read/write head, and an infinite tape or “memory” with sample contents X 0 X B 0]
Finite automata
• Developed in 1940’s and 1950’s for neural net models of brain and computer hardware design
• Finite memory!
• Many applications:
– text-editing software: search and replace
– many forms of pattern-recognition (including use in WWW search engines)
– compilers: recognizing keywords (lexical analysis)
– sequential circuit design
– software specification and design
– communications protocols
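To make "finite memory" concrete, here is a minimal sketch of a table-driven finite automaton. The example language (strings over {a, b} that contain the substring ab) and the names `TRANSITIONS`, `START`, `ACCEPTING`, and `accepts` are assumptions chosen for illustration; the only memory the machine has is its current state.

```python
# States: 0 = nothing useful seen, 1 = last symbol was a, 2 = "ab" has been seen.
TRANSITIONS = {
    (0, "a"): 1, (0, "b"): 0,
    (1, "a"): 1, (1, "b"): 2,
    (2, "a"): 2, (2, "b"): 2,
}
START = 0
ACCEPTING = {2}

def accepts(w):
    """Run the automaton on w and report whether it ends in an accepting state."""
    state = START
    for symbol in w:
        if (state, symbol) not in TRANSITIONS:
            return False  # symbol outside the alphabet {a, b}
        state = TRANSITIONS[(state, symbol)]
    return state in ACCEPTING

for w in ["bbaab", "bbaa", "ba", ""]:
    print(repr(w), accepts(w))
```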
Pushdown automata
• Noam Chomsky’s work in 1950’s and 1960’s on grammars for natural languages
• infinite memory, organized as a stack
• Applications:
– compilers: parsing computer programs
– programming language design
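The value of stack memory can be sketched with a recognizer for the language of strings with n a's followed by n b's, which requires counting without a fixed bound. This is an informal illustration in the spirit of a pushdown automaton, not a formal definition of one; the name `accepts_anbn` is hypothetical.

```python
def accepts_anbn(w):
    """Accept strings of n a's followed by n b's (n >= 0), using a stack."""
    stack = []
    seen_b = False
    for symbol in w:
        if symbol == "a":
            if seen_b:
                return False      # an a after a b is out of order
            stack.append("a")     # push one marker per a
        elif symbol == "b":
            seen_b = True
            if not stack:
                return False      # more b's than a's
            stack.pop()           # match this b against one a
        else:
            return False          # symbol outside the alphabet {a, b}
    return not stack              # every a must have been matched

for w in ["", "ab", "aabb", "aab", "abab"]:
    print(repr(w), accepts_anbn(w))
```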
Automata, languages, and grammars
• In this course, we will study the relationship between automata, languages, and grammars
• Recall that a formal language is a set of strings over a finite alphabet
• Automata are used to recognize languages
• Grammars are used to generate languages
• (All these concepts fit together)
Classification of automata, languages, and grammars
Automata                               Language                 Grammar
Turing machine                         Recursively enumerable   Recursively enumerable
Linear-bounded automaton               Context sensitive        Context sensitive
Nondeterministic push-down automaton   Context free             Context free
Finite-state automaton                 Regular                  Regular
Besides developing a theory of classes of languages and automata, we will study the limits of computation. We will consider the following two important questions:
– What problems are impossible for a computer to solve?
– What problems are too difficult for a computer to solve in practice (although possible to solve in principle)?
Uncomputable (undecidable) problems
• Many well-defined (and apparently simple) problems cannot be solved by any computer
• Examples:
– For any program x, does x have an infinite loop?
– For any two programs x and y, do these two programs have the same input/output behavior?
– For any program x, does x meet its specification? (i.e., does it have any bugs?)
Intractable problems
• We will learn how to mathematically characterize the difficulty of computational problems.
• There is a class of problems that can be solved in a reasonable amount of time and another class that cannot. (What good is it for a problem to be solvable, if it cannot be solved in the lifetime of the universe?)
• The field of cryptography, for example, relies on the assumption that the computational problem of “breaking a code” is intractable
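A back-of-the-envelope calculation shows what "solvable in principle but not in practice" can look like. The numbers below are assumptions chosen only for illustration: a 128-bit key, an exhaustive search over all keys, a machine testing 10^12 keys per second, and an approximate age of the universe of 1.38 × 10^10 years.

```python
KEY_BITS = 128                      # assumed key size
GUESSES_PER_SECOND = 10**12         # assumed search speed
SECONDS_PER_YEAR = 365 * 24 * 60 * 60
AGE_OF_UNIVERSE_YEARS = 1.38e10     # approximate figure

keys = 2**KEY_BITS
years = keys / (GUESSES_PER_SECOND * SECONDS_PER_YEAR)

print(f"keys to try: {keys:.3e}")
print(f"years needed: {years:.3e}")
print(f"ages of the universe: {years / AGE_OF_UNIVERSE_YEARS:.3e}")
```

Under these assumptions the search takes on the order of 10^19 years, far longer than the age of the universe, even though the procedure itself is trivial to describe.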
Why study the theory of computing?
• Core mathematics of CS (has not changed in over 30 years)
• Many applications, especially in design of compilers and programming languages
• Important to be able to recognize uncomputable and intractable problems
• Need to know this in order to be a computer scientist, not simply a computer programmer