Textbook: Introduction to Automata Theory, Languages and Computation by John E. Hopcroft and Jeffrey D. Ullman (1979)

Course Outline:

- there exists a unique r element of V with no predecessors (called root)
- For every v_i \elementof V - r
a. there exists a directed path from r to v_1
b. v_i has exactly one predecessor
###### Special terms for directed (digraph) trees

- father, son
- ancestor, descendant If you can find a path from the ancestor to the descendant, then we have an ancestor-descendant relationship
- leaf, interior
A leaf is a node with no sons; everything else is in the interior.
#### Relations

Definition: A binary relation from A to B is a subset of A x B. A relation on A is a subset of A x A. Ex. 1.: N = { 0, 1, 2, 3, ... } N x N = { (), (0, 0), (0, 1), (0, 2), ... }

The relation "=" = { (0,0), (1,1), (2,2), (3,3), ... } The relation ">" = { (1,0), (2,0), (3,0), (4,0), ... } The relation "<" = { (0,1), (0,2), (0,3), (0,4), ... }

The relation \modulo{}_3 = { (0,0), (0,3), (0,6), (0,9), ...

(1,1), (1,4), (1,7), (1,10), ... (2,2), (2,5), (2,8), (2,11), ... (3,0), (3,3), (3,6), (3,9), ... ... }

Definition: Suppose we let R be a relation on a set A. We will be writing aRb instead of (a,b) \elementof R to indicate a is related to b via the relation R.

- R is reflexive. For every element a in the set A, aRa is true.
- R is symmetric. aRb implies bRa.
- R is transitive. For every element a, b and c in the set A, if aRb and bRc, then aRc irreflexive != not reflexive: there is no element a such that aRa is true. asymmetric != not symmetric: there is no element a, b such that aRb -> bRa
- R is an equivalence relation on A if R is reflexive, symmetric, and transitive.

Example 2: Consider A = {1, 2, 3} and a given relation R on A. a. R = {(1,3), (3,1), (2,2)} symmetric, not reflexive, not transitive b. R = {(1,1), (2,2), (3,3), (1,2)} not symmetric, reflexive, transitive

If R is symmetric and transitive, is R reflexive? Nope. Theorem: If a relation R on a set S is an equivalence relation, then R partitions S into disjoint non-empty equivalence classes, i.e. there exists S1, S2, S3, ... such that S = S1 U S2 U S3 U ... and for every i,j (i != j)

- Si does not intersect with Sj
- For every a,b element of Si, aRb is true
- For every a element of Si, for every b element of Sj, aRb is false
ex4: N = {0,1,2,3,...}
modulo 3 partitions N into 3 equivalenc classes
S1 = {0,3,6,9,12,...}
S2 = {1,4,7,10,13,...}
S3 = {2,5,8,11,14,...}
##### Closure of relations

Let P be a set of properties of relations. The P-closure of a relation R is the smallest relation R' that includes all the pairs of R and possess the properties of P.

Ex. Let R = {(1,2),(2,2),(2,3)} be a relation on {1,2,3} => transitive closure of R: R^+ = {(1,2),(2,2),(2,3),(1,3),(2,3)} => symmetric-transitive closure of R: R^/ = {(1,2),(2,2),(2,3),(2,1),(3,2),(1,3),(3,1),(3,3),(1,1)}

E | = 0 | ||||

abc | = | aba | = | aaa | = 3 |

substring

prefix - initial substring (includes E) suffix - final substring (includes E) \sigma^* = set of all possible strings over \sigma {E, a, b, c, aa, ab, ac, ba, bb, bc, ...}

If the alphabet is {0,1}, then {0,1}* can be thought of as a generator. Given this operation, we can now define what a (formal) language is.

L recognizes

L_2 = {x element of {a,b}* x < 3}

= { [e], a, b, aa, ab, ba, bb }

L_3 = {x element of {a, b}* | n_a(x) >= n_b(x)} |

where n_i = # of i's in a string = {a, ab, ba, aab, aba, baa, aabb, ... }

In fact the most interesting languages are those that are infinite.

xy = abbbab Language: L_1 = {hope, fear} L_2 = {ful, less} L_1 L_2 = {hopeful, hopeless, fearful, fearless} Definition of concatenation for languages: Suppose we let L_1 and L_2 be languages over an alphabet [Sigma]. The concatenation of L_1 and L_2 is given as follows: L_1 L_2 = cartesian product of L_1 and L_2

## Exponential notation

Suppose we let a [element of] [Sigma_1] x [element of] [Sigma^*] L [element of] [Sigma]

a^k = aaaaa...a a^k = k x^k = xxxxx...x k terms x^k = k x L^k = LLLLL...L k terms { [x_1][x_2][x_3]...[x_k] x_1 [element of] L, x_2 [element of] L, x_3 [element of] L, ... x_k [element of] L} [sigma^k] = {x [element of] [Sigma^*] x = k }

if k = 0 a^0 = [e] x^0 = [e] [Sigma^0] = {[e]} L^0 = {[e]}

L is empty != L = {[e]}

{0,1}*{2}{0,1}*

- P(n_0) is true (base case)
- For each k >= n_0, P(k) => P(k+1) (Inductive step)

example> Let [Sigma] be an alphabet, L be a language over [sigma]. Prove L^2 [subset=] L => L+ [subset=] L

This statement is equivalent to the statement L^2 [subset=] L => U(1, infinity)L^n [subset=] L

n = 1 L^1 [s=] L assume it's true for n = k L^2 [s=] L => L^k [s=] L L^2 [s=] L => LL^k [s=] LL L^2 [s=] L => L^(k+1) [s=] L^2

Principles of Strong Math Induction Suppose P(n) is a statement about n element of Z To prove P(n) is true for n >= n_0, it is sufficient to show 2 things: 1) P(n) is true 2) for every k >= n_0, if P(n) is true for all n, n_0 <= n <= k, then P(k+1) is true.

Another one is the Fibonacci function f(n + 1) = f(n) + f(n - 1) where f(0) = f(1) = 1. f(4) = 5

Union(i = 1, n, A_i) = { A if n = 0, union(i = 1, n - 1, A_i) union A_n if n elementof Z+ }

axa elementof pal iv. No other string is in pal.

## Parenthesized expressions

Let [Sigma] = { i, (, ), +, - } The language AE of fully parenthesized algebraic expressions involving operations +, - and the identifier i can be defined as: i. i elementof AE ii. For every string s1 and s2 in AE, (s1 + s2) elementof AE iii. For every string s1 and s2 in AE, (s1 - s2) elementof AE iv. Nothing else is in AE.## Example 1

Suppose a language L, a subset of {0, 1}^*, is defined as follows:

- [Epsilon] elementof L
- For any x elementof L, both 0x and 0x1 are in L
- No other string is in L
0^i 1^j where i >= j >= 0
#### Example 2

Let L subset of {a,b}^* defined as follows: - a elementof L
- For any x elementof L, ax elementof L
- For any x, y elementof L, bxy, xby, and xyb are in L.
- No other string is in L.
number of A > number of B >= 0
#### Define recursively

##### a. N - all natural numbers

1 elementof S for any x elementof S, S+1 elementof S no other element is in S##### b. Set S of all integers divisible by 7.

7 elementof S for any x elementof S, 7 + x elementof S no other element is in S##### c. Set T of all integers divisible by 2 or 7.

2 elementof S, 7 elementof S for any x elementof S such that x mod 2 = 0, x + 2 elementof S for any x elementof S such that x mod 7 = 0, x + 7 elementof S no other element is in S##### d. The set W of all strings of the form 0^i 1^j where i >= 2j >= 0

[Epsilon] is in W For any x in W, then 0x is in W For any x in W, then 00x1 is in W## B. Regular expressions and finite automata

A regular language over an alphabet [Sigma] is one that can be obtained

- from languages that are simple (i.e., one-string, w/length 0 or 1)
- using the following operations
union
concatenation
kleene*
### Example:

- {001} = {0}{0}{1}
- {0,1} = {0} U {1}
- {0,10} = {0} U {1}{0}
- {110}^*{0,1} = {{1}{1}{0}}^*{{0}U{1}}
(110)^*(0+1)
### Definition

The class R of regular languages over the [Sigma] and corresponding regular expression r are defined as follows: - 0 elementof R, and the corresponding regexp is 0
- {[Epsilon]} elementof R, regexp is [Epsilon]
- for every a elementof [Sigma], {a} elementof R, and regexp is a
- L_1, L_2 elementof R, and regexp r_1 & r_2
a. L_1 U L_2 elementof R, regexp r_1 + r_2
b. L_1 L_2 elementof R, regexp r_1 r_2
c. L_1^* elementof R, regexp r_1^*
#### Ex 1:

{0,01}^* is a regular language - corresponding regex - (0+10)^*#### Ex 2:

The language over {0,1} described as composed of strings where a. there is exactly one 0 {1}^*{0}{1}^* -- 1*01* b. there are exactly three 0s 1*01*01*01* c. even length ((0+1)(0+1))* d. string with odd number of 1s 0*10*(10*10*)* e. length = 6 (0+1+e)(0+1+e)(0+1+e)(0+1+e)(0+1+e)(0+1+e) f. ends in one, no consecutive 0s 1*(01)*1*(10)*1 1*(1*011*)*1### HOMEWORK #1

Consider r_1 = 0*+1* r_2 = 01* + 10* + 1*0 + (0*1)* Find a string such that a. r_1 is true and r_2 is not true 00 b. r_1 is not true and r_1 is true 01 c. both r_1 and r_2 are true 11 d. both r_1 and r_2 are not true 010 considering [Sigma] = {0,1} - Simplify the following:
a. 0*1(0*10*1)*0*
b. ((0+1)*)*(e+0+1)
c. (1+01)*(0+01)*
### Definition: recognize

(see definition under Languages) single 'pass' from left to right remembering modeling by directed graphs finite state machine / automaton### Exercise

Write a regular expression that generates all strings over {0,1} that do not contain 3 consecutive 0s.### Definition: finite automaton

A finite automaton (FA) or a finite state machine is a 5-tuple (Q, Sigma, delta, q_0, F) where Q is a finite set of states Sigma is a finite alphabet of symbols delta is a (transition) function Q x Sigma -> Q q_0 elementof Q is an initial state F subsetof Q is a set of final/accepting states#### Example 1:

language over the alphabet {0,1} that ends in 10. A regular expression that will generate this will be (0+1)*10

Q = {q0, q1, q2, q3, q4, q5, q6} Sigma = {0, 1} delta =

\ 0 1 +------

q0 q3 q1 q1 q2 q6 q2 q4 q5 q3 q4 q5 q4 q4 q5 q5 q4 q6 q6 q2 q6

q_0 = q_0 F = { q4 }

simplified version q0, q1, q2

0 1 q0 q0 q1 q1 q2 q1 q2 q0 q1 **

## Example 2:

Describe using regular expressions the FA below 0 1 q0 q1 q3 q1 q0 q2 q2 q2 q2 q3 q2 q4 final q4 q2 q3 initial, final

(00)*(11)*

0 1 q0 q4 q1 none, 110 q1 qX q2 1 q2 q0 q1 11 q3 q0 q1 final q4 qX qX final qX qX qX

q0 q1 q2 q1 q3 q3 final q2 q3 q4 q3 q3 q3 q4 q4 q2 q5 q1 q2 final

--- (a+b)*babaa

q1 q1 q2 initial q2 q3 q1 b q3 q1 q4 ba q4 q5 q1 bab q5 q6 q4 baba q6 q1 q2 final babaa

delta hat : Q x Sigma -> 2^Q

delta hat : Q x Sigma* -> 2^Q defined (recursively) as follows

- delta hat (q, epsilon) = { q }
- For every x elementof Sigma*, a elementof Sigma, q elementof Q delta hat (q, xa) = delta(delta hat(q,x), a) = U delta(p, a) where p elementof delta hat (q, x)

delta hat(q, x) -> set of all reachable states from q via x

There exists a q elementof delta hat (q0, x) such that q elementof F

The string x elementof Sigma* is accepted by M iff delta hat(q0, x) intersection F is not the null set.

The language accepted (or recognized) by M is the set

L(M) = { x elementof Sigma* | x is accepted by M } |

Q is a finite set of states

Sigma is a finite alphabet of symbols q0 elementof Q is the initial state F subsetof Q is a set of final states delta is a transition function delta: Q x Sigma -> 2^Q (all the subsets of Q)

## Conversion

NFA q0 0+1->q0 0->q1 1->q2 q1 0->q1 1->q3 q2 0->q4 1->q2 (q3) (q4)

DFA 0 1 q0 {q0,1} {q0,2} {q0,1} {q0,1} {q0,2,3} {q0,2} {q0,1,4} {q0,2} [{q0,2,3}] {q0,1,4} {q0,2} [{q0,1,4}] {q0,1} {q0,2,3}

0*(01)*

q0 ---0---> q1 ----1----> (q2)

U ^ | 0 +------0-------+ |

0 1 q0 q0,1 q3 q0,1 q0,1 q2,3 q2,3 q1 q3 q1 q0,1

NFA-epsilon

- Every element of P is an element of E(P)
- For any q elementof Epsilon(P), every element of delta(q, Epsilon) will also be in the closure.
- No other state is in E(P).
### Definition: delta hat for NFA-Epsilon

Let M = (Q, Sigma, delta, q0, F) be an NFA-Epsilon. The extended transition function delta hat: Q x Sigma* -> 2^Q is defined: - For any q elementof Q, delta hat(q, Epsilon) = Epsilon({q})
- For any state q elementof Q and any string x elementof Sigma* and any alphabet a elementof Sigma delta hat(q, xa) = Epsilon(U delta(p, a)) where p elementof delta hat (q, x)

Example: Let r = 0*(01)*0*

Our idea will be the following. Suppose we have the following state diagram: p -0->q -e-> r -e-> s . To remove all epsilons, we simply copy the arrows to all of them, so p -0-> q

-0-> r -0-> s

A A,B,C,E D,E B C E C B D E D E

A {ABCE} {DE} {ABCE} {ABCE} {BDE} {DE} {E} {D} {BDE} {CE} {DE} {D} {DE} {CE} {B}

- Given a language L,
L is accepted by an NFA-Epsilon
NFA
DFA
### Kleene's Theorem

The language generated by regular expression L(r) is equivalent to the language accepted by FA machine M, L(M).

L(R) is the language generated by the regular expression R L(M) is the language recognized by DFA M

part 1: recall regular expression: built over simple expressions, 0, epsilon, a elementof Sigma operations Union (+), concatenation and Kleene closure

ii) assuming true for some ri exists Mi such that L(ri) = L(Mi) in particular: r1: ----[ () M1 (F, F') ]--epsilon-->(()) r2: ----[ () M2 (F, F') ]--epsilon-->(()) Show that

r1r2

[ () M1 (F, F') ]--epsilon-->[ () M2 (F, F') ]--epsilon-->(())

r1 + r2 +----[ () M1 (F, F') ]--epsilon-->(()) --+

+----[ () M2 (F, F') ]--epsilon-->(())

r1* ---+---------------------+--epsilon-->(())

^ |
| +--[ () M1 (F, F') ]--+ |

or -----(())---[ () M1 (F, F') ]--+

+------------------------+

((0+1)*(10)+(00)*(11)*)*

(o)<------------------------------+

+-+--+-----+--1-->--0--------+---+ | ||||||||

^ | +--0--+ | +--1--+ | +--+--------+--+--------+--+ | |||||

^ | ^ |
| +--0--0--+ +--1--1--+ |

Let Rij^k = { x elementof Sigma* | deltahat(i,x) = j without "going through" a state labeled k+1 or more } |

- Submit today: +20pts (4:30, SEC-A 321)
- Submit Friday: +10pts (4:30, SEC-A 321)
### Grading System: Components:

A 92-100 Midterm exam 35% B+ 86-91 Project 30% B 77-85 Problem sets 20%C+ 69-76 Quizzes/Assign 15% C 60-68 D 50-59F below 50### Project

Lexical analyzer

Due January 10, 2002 Bonus +10 points if submitted on or before Options - Groups of at most 3

- C or Java

input file -> [ lexical analyzer ] -> output file containing tokens
For example:
int myNum = 7;
if (x == 1.4E-9) then
y = 2;
output:

www.math.grin.edu/~rebelsky/Courses/CS362/2001S/Project/project.01.html view this by tomorrow afternoon: www.math.admu.edu.ph/~jonjon/CS130/main.htm Submit before 12:00 nn tomorrow decisions: group, language Output:

- written report (containing the DFAs for each of the tokens)

example: ---()-"i"->()-"f"->()-blank->(())

- (implementation) program

J. Martin, _Introduction to Languages and the Theory of Computation (2nd Ed)_, McGraw-Hill Companies, Inc., 1997.

B. Mikolajczak (ed), Algebraic and Structural Automata Theory, _Annals of Discrete Mathematics_ 44, Elsevier Science Publishers B.V., 1985

The majority of the textbooks are available in the Cubao branch. Need this by Thursday.

Assignments will be given regularly; however, assignments will be checked randomly.

Dr. Mccluskey will arrive in January and handle the second half of CS130 for both classes.

When we're given, for example, some input: 5, 11 and the output: 16 What we've done between the input and the output is computing the sum.

In this sense we will be using 'computer' and 'algorithm' in the same sense as CS110 and CS101

Just as an overview, we will be discussing the following models for computation

- Finite automata (FA)
- Pushdown automata
- Turing machine, which is the strongest model and generally accepted as capable of doing all computations

Actually, the first part of the textbook deals with preliminaries only, and I will be assigning you to read Chapter 1 of the textbook. Submit by tomorrow a 3x5 index card with the following information: [Front] Chua, Sandra Jean V. Sacha 3 BS CS Philippine Science High School 12 August 1983 [email protected]

[Back] Schedules Subject Room Time Days

---

The telephone of the Math department: 4266001 5680-83