Unit 2 - Notes
Unit 2: REGULAR EXPRESSIONS AND REGULAR SETS
1. Regular Expressions (RE)
A Regular Expression is a algebraic notation used to describe a Regular Language. It serves as a declarative method to define strings accepted by a Finite Automata (FA).
1.1 Recursive Definition
For a finite alphabet :
- is a regular expression denoting the empty set.
- (epsilon) is a regular expression denoting the set (language containing only the empty string).
- is a regular expression denoting the set , for all .
- If and are regular expressions denoting languages and , then:
- (Union) is an RE denoting .
- (Concatenation) is an RE denoting (often written as ).
- (Kleene Closure) is an RE denoting .
1.2 Operator Precedence
To avoid excessive parentheses, the precedence order (highest to lowest) is:
- Kleene Star ()
- Concatenation ()
- Union ( or )
1.3 Identities for Regular Expressions
Two regular expressions and are equivalent () if they represent the same language.
Basic Identities:
- (Idempotent Law)
Distributive Laws:
Important Identity (Shifting Rule):
Identities involving Union and Closure:
2. Finite Automata and Regular Expressions
Regular expressions and Finite Automata are equivalent in power.
- For every Regular Expression, there exists an FA (NFA-) that accepts it.
- For every FA, there exists a Regular Expression that describes the language accepted by it.
2.1 Transition Systems Containing Null Moves (NFA-)
An NFA with -moves allows the automaton to transition from one state to another without consuming any input symbol.
Formal Definition:
An NFA- is a 5-tuple where:
- : Finite set of states.
- : Finite set of input symbols.
- : Initial state.
- : Set of final states.
- : Transition function defined as .
2.2 -Closure
The -Closure of a state , denoted as , is the set of all states reachable from using only -transitions (including itself).
Algorithm to find -closure(q):
- Start with set .
- For every state in , if there is a transition , add to .
- Repeat step 2 until no new states can be added.
2.3 Conversion of NFA- to DFA
To convert an NFA with null moves to a Deterministic Finite Automata (DFA), we use the Subset Construction Algorithm modified for -closures.
Steps:
- Start State: The start state of the DFA is .
- Transitions: For a set of states (which is a single state in the resulting DFA) and input symbol , calculate the next state:
Translation: From all states in A, find where input 'a' takes you, then take the -closure of those results. - Final States: Any set in the DFA containing at least one final state from the NFA is a final state in the DFA.
3. Algebraic Methods using Arden's Theorem
Arden's Theorem is a powerful tool for finding the Regular Expression corresponding to a Finite Automata by solving a system of linear equations.
3.1 Arden's Theorem Statement
Let and be two regular expressions over .
If does not contain (null string), then the equation:
has a unique solution given by:
3.2 Application of Arden's Theorem (FA to RE)
Method:
- Write an equation for every state in the FA.
Note: If is the start state, add to the equation. - Solve the system of equations for the Final State(s) using substitution and Arden's Theorem.
- If there are multiple final states, the resulting RE is the sum (Union) of the expressions derived for each final state.
4. Construction of Finite Automata Equivalent to a Regular Expression
We typically use Thompson’s Construction Method to convert a Regular Expression into an NFA-. This method is structural and builds the FA inductively.
Basic Building Blocks
- For RE = :
[q0] --a--> ((qf)) - For RE = (Union):
Create a new start state and new final state. Add -transitions from the new start to the start of and . Add -transitions from the final states of and to the new final state. - For RE = (Concatenation):
Merge the final state of with the start state of (or add an -transition between them). - For RE = (Iteration):
- Add new start state () and new final state ().
- Transition via .
- Transition via .
- Loop back: via .
- Skip: via .
5. Equivalence of Two Finite Automata and Two Regular Expressions
5.1 Equivalence of Two Regular Expressions
Two REs, and , are equivalent iff .
Testing Method:
- Convert to DFA .
- Convert to DFA .
- Minimize both and .
- Check if the minimized DFAs are Isomorphic (identical structure except for state names).
5.2 Equivalence of Two Finite Automata
Two FAs, and , are equivalent if they accept the same language.
Method 1: Product Construction (Difference Method)
Construct a new machine accepting . If this language is empty, the FAs are equivalent.
Method 2: Equivalence Algorithm
- Treat and as a single disconnected graph.
- Apply the DFA minimization algorithm.
- If the start state of and the start state of end up in the same equivalence class, the machines are equivalent.
6. Closure Properties of Regular Sets
If a class of languages is closed under an operation, applying that operation to languages in the class results in a language that is also in the class. Regular languages are closed under:
- Union: If and are regular, is regular. ()
- Intersection: If and are regular, is regular. (Proof via De Morgan's laws or Product Automata).
- Complement: If is regular, is regular. (Flip final and non-final states in DFA).
- Concatenation: If and are regular, is regular.
- Kleene Closure: If is regular, is regular.
- Difference: . Since regular sets are closed under intersection and complement, they are closed under difference.
- Reversal: If is regular, is regular.
- Homomorphism and Inverse Homomorphism.
7. Pumping Lemma for Regular Sets
The Pumping Lemma is a tool used primarily to prove that a language is NOT regular. It describes a property that all regular languages must possess (necessary condition).
7.1 Formal Statement
Let be a regular language. There exists a constant (the pumping length) such that for any string with length , we can split into three parts, , satisfying the following conditions:
- (y is not empty)
- For all , the string . (We can "pump" any number of times).
7.2 Steps to Prove Non-Regularity
To prove is not regular (Proof by Contradiction):
- Assume is regular. Let be the pumping length.
- Choose a string such that . (Choose strategically to create a contradiction).
- According to the lemma, can be split into .
- Analyze cases for the split based on and .
- Find an such that .
- This contradicts the Pumping Lemma. Therefore, the assumption is false, and is not regular.
Common Example: is not regular.
8. Myhill-Nerode Theorem
The Myhill-Nerode theorem provides a necessary and sufficient condition for a language to be regular. It characterizes regular languages based on the number of equivalence classes of a specific relation.
8.1 Indistinguishability
Two strings and are distinguishable with respect to language if there exists a string such that exactly one of or is in . If no such exists, and are indistinguishable (equivalent), denoted .
8.2 Theorem Statement
The following statements are equivalent:
- is a regular language.
- is the union of some of the equivalence classes of a right-invariant equivalence relation of finite index.
- The relation has a finite number of equivalence classes (finite index).
8.3 Applications
- Minimization of DFA: The number of states in the minimal DFA for is exactly the number of equivalence classes of .
- Proving Non-Regularity: If the relation has infinite equivalence classes, then is not regular.
9. Summary of Conversions (Equivalence Chain)
To ensure mastery of the unit, understand that all these models describe the exact same class of languages (Regular Languages):
- RE NFA-: Thompson's Construction.
- NFA- NFA/DFA: Subset Construction with -closure.
- DFA RE: Arden's Theorem or State Elimination Method.
- Minimization: Myhill-Nerode or Table Filling Algorithm.