+CS 154 Notes: January 30, 2012

+==============================

+Preliminary definitions:

+- An __alphabet__ is a finite set of symbols. An alphabet is *not* a symbol.

+- A __string__ over an alphabet is Σ is a finite sequence of symbols from Σ.

+- A __language__ over the alphabet Σ is a set of strings over Σ.

+ - Also, alphabets and strings must be finite but languages needn't

+***NOTE*** Thus, programming languages are sets of programs.

+- Capital letters are used for languages. (L, M, N...)

+- Capital greek are used for alphabets. (Σ, Μ, Ν...)

+- Lower case leters near the beginning of the english alphabet are symbols from

+- Lowercase letters near the end of the english alphabet are used for

+- λ will denote the empty string.

+We'll write the string whose only symbol is a as a.

+With this notation x**a** = x.**a**.

+Why is This Interesting?

+------------------------

+Many questions in programming reduce to questions of membership in languages.

+- Is `x2!y` a legal identifier in language PQR?

+- Is `null` a legal statement in language QRS?

+- Is 10101010101 a prime number?

+- Is `Buffalo buffalo.` a legal English sentence.

+Strings, an enhanced and expanded definition

+--------------------------------------------

+A string over alphabet Σ is one of of the following:

+- has the form x.a where x is a string over Σ and a is an element of Σ.

+Intuition shows that x.a results from adding the symbol a to the end of the

+string x. The empty string is written as λ.

+String Concatenation and Length

+-------------------------------

+If x and y are strings then the concatenation xy (or x∙y) of x and y is

+*Recursion is your friend!*

+x is a __prefix__ of y iff there exists __z__ with y = x**z**. In the previous

+case, x is a prefix of y.

+If x is a string then the length of the string |x| is

+*Recursion is your friend!*

+So the length of the character string abc is:

+ = 1 + (1 + (1 + length(λ)))

+1. If x is an empty string, then xy = y.

+2. If x and y are strings, then |xy| = |x| + |y|.

+***Warning***: We'll soon define a notion of language concatenation for which

+the analog of the second theorem does not hold true.

+3. **Preliminary**: The first symbol of a nonempty string xa is

+ - the first symbol of x otherwise.

+**Theorem**: The first symbol of xy is

+ - the first symbol of x if x is nonempty.

+ - the first symbol of y if x is empty but y is nonempty.

+ - undefined if both x and y are empty.

+**Corollary** If x is a string over an alphabet Σ and b is a symbol of Σ, then

+the first symbol of bx is b. Also, |bx| = 1 + |x|.

+- The language LM is equal to {xy | x ε L and y ε M}.

+- An alternate notation for LM is L∙M.

+The string x^n is obtained from x by concatenating n copies of x. Formally it

+The Language L^n contains all concatenations of n members of L. Formally it

+If L is a language then,

+- L\* is the union of L^n over all n >= 0

+- L+ is the union of L^n over all n >= 1 (Thus excluding the empty string)

+- If Σ is an alphabet then:

+ - Σ\* is the set of all strings over Σ.

+ - Σ+ is the set of all nonempty strings over Σ.

+- The second definition follows from the first if Σ is considered as a set of

+Sometimes the \* operator is called the Kleene closure operator.

+λ is a member of L\* for all L. (L\*)\* = L\*, so L\* is closed under the

+Properties of Concatenation

+---------------------------

+- Theorem: (xy)w = x(yw)

+- Theorem x^m x^n = x^(m+n)

+- Corollary: x^n = x x^(n-1)

+ - So the defintion of x^n could also have been stated this way.

+ - Also, L^m L^n = L^(m+n), so L^n = L L^(n-1)

+***Warning***: |LM| needn't euqal |L||M|, as L might be {a, ab} and M might be

+The reverse w^R of a string w is

+*Recursion is your friend!*

+If x is an empty string, then xy = y.

+Proof: by induction on |y|

+Basis: If |y| = 0, then y is empty. And then by definition of concatenation,

+xy = x, which is empty by assumption.

+Inductive Step: If |y| > 0, then y = z∙a for some z and a, Here

+ xy = (xz).a (by definition of concatenation)

+Proof: by induction on |y|

+Basis: If |y| = 0 then y is empty so we need to show that |xy| = |x| But in this

+Induction step: If |y| > 0 then y = z.a for some z and a, so by definition of

+concatenation, xy = (xz).a

+> "Induction proofs are fair game for the exam but they will be simple."

+If x and y are strings, the first symbol of xy is the first symbol

+of x if x is nonempty, the first symbol of y if x is empty but y is not, and

+undefined if both x and y are empty.

+Proof: If x is empty, then xy = y, so hte last two cases follow immediately from

+If x is not empty we may proceed by induction on |y|.

+Induction step: If |y| > 0, then y = z.a for some z and a, so

+ the first symbol of xy = the first symbol of (xz).a (by defintion of concatenation)

+ = the first symbol of xz (by definition of first symbol)

+ = the first symbol of x (by induction)

+Proof: by induction on |w|

+Basis: If |w| = 0 then w is empty, so that both (xy)w and x(yw) equal xy.

+Inductive Step: If |w| > 0 then w is empty then w = z.a for some z and a, so

+ = ((xy)z).a (by definition of concatenation)

+ = ((xy)z).a (by definition of concatenation)