Equivalence of regular expressions and regular languages (Theory of Computation)
The Big Idea (in simple words)
Think of a regular expression (RE) as a recipe.
It tells you how to build strings using symbols, choices, repetitions, etc.
Think of a regular language as the collection of all dishes made using that recipe.
Now imagine a DFA or NFA as a machine that checks whether a dish matches the recipe.
The amazing fact:
Whatever recipe (regular expression) you write…
there is always a matching machine (finite automaton) that understands it.
And whatever machine (DFA/NFA) you build…
you can always create a recipe (regular expression) that describes the exact same set of strings.
So they are two different representations of the same idea.
Why is this equivalence important?
Because it gives us two ways to describe languages:
- REs are simple and compact: good for writing patterns, specifying tokens, searching, etc.
- Automata are great for analysis: you can simulate them, minimize them, and prove things using them.
Since both express the same languages, you can switch from one to the other whenever needed.
Part 1: From Regular Expression → NFA
Every regular expression can be turned step-by-step into an NFA.
This is commonly done using Thompson's construction.
Here's the idea:
For a single symbol
Expression: a
┌───┐   a    ┌───┐
│ S │ -----> │ F │
└───┘        └───┘
For alternation (choice)
Expression: a | b
You branch:
         ε          a          ε
     ┌─────> ○ ---------> ○ ─────┐
     │                           ▼
    (S)                         (F)
     │                           ▲
     └─────> ○ ---------> ○ ─────┘
         ε          b          ε
For concatenation
Expression: ab
S ──a→ ○ ──b→ F
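For Kleene star
Expression: a*
 ┌──────────── ε ─────────────┐
 │                            ▼
(S) --ε--> ○ --a--> ○ --ε--> (F)
           ▲        │
           └─── ε ──┘
The top ε-edge skips a entirely (zero repetitions); the bottom ε-edge loops back so a can repeat.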
Every part of an RE has a small NFA structure.
By combining them, the whole regular expression becomes an NFA.
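To make the building blocks above concrete, here is a minimal Python sketch of Thompson-style fragments plus a tiny ε-NFA simulator. All the names (Frag, symbol, union, concat, star, accepts) are made up for this illustration, and the expression is assembled by calling the combinators directly rather than by parsing a pattern string. Treat it as one possible way to code the idea, not a reference implementation.

```python
from itertools import count

_ids = count()   # hands out fresh state numbers (hypothetical helper)
EPS = None       # edge label used for ε-moves

class Frag:
    """An NFA fragment with exactly one start and one accept state."""
    def __init__(self, start, accept, edges):
        self.start, self.accept, self.edges = start, accept, edges

def symbol(ch):
    # S --ch--> F
    s, f = next(_ids), next(_ids)
    return Frag(s, f, [(s, ch, f)])

def union(x, y):
    # New start branches by ε into x and y; both rejoin by ε at a new accept.
    s, f = next(_ids), next(_ids)
    extra = [(s, EPS, x.start), (s, EPS, y.start),
             (x.accept, EPS, f), (y.accept, EPS, f)]
    return Frag(s, f, x.edges + y.edges + extra)

def concat(x, y):
    # Glue x's accept state to y's start state with an ε-move.
    return Frag(x.start, y.accept, x.edges + y.edges + [(x.accept, EPS, y.start)])

def star(x):
    s, f = next(_ids), next(_ids)
    extra = [(s, EPS, x.start), (x.accept, EPS, f),
             (s, EPS, f),               # skip: zero repetitions
             (x.accept, EPS, x.start)]  # loop: one more repetition
    return Frag(s, f, x.edges + extra)

def eps_closure(states, edges):
    """Every state reachable from `states` using only ε-moves."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for p, label, r in edges:
            if p == q and label is EPS and r not in seen:
                seen.add(r)
                stack.append(r)
    return seen

def accepts(frag, text):
    """Simulate the NFA by tracking the set of states it could be in."""
    current = eps_closure({frag.start}, frag.edges)
    for ch in text:
        moved = {r for p, label, r in frag.edges if p in current and label == ch}
        current = eps_closure(moved, frag.edges)
    return frag.accept in current

# The expression (a|b)*ab, built bottom-up from the four combinators.
nfa = concat(star(union(symbol("a"), symbol("b"))),
             concat(symbol("a"), symbol("b")))
```

With this sketch, accepts(nfa, "bbab") is true and accepts(nfa, "ba") is false, which matches what (a|b)*ab should do. The simulator scans every edge on each step, which is fine for toy examples but not tuned for speed.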
Part 2: From DFA/NFA → Regular Expression
Going the other way is a bit like peeling onions.
We remove states one by one and update transitions with equivalent regular expressions.
This is called state elimination.
Simple idea:
- Pick a state to remove.
- Re-route all transitions around it by combining labels into regular expressions.
- Continue until only start and final states remain.
- The final label between them is the regular expression.
A rough sketch:
Before elimination:
(Start) --a--> [Q] --b--> (Final)
After removing Q:
(Start) --ab--> (Final)
The transition label becomes the regular expression describing paths through Q.
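The re-routing step can be sketched in a few lines of Python. This is only an illustration under some assumptions: edges are stored in a dict mapping (source, target) to a regex string, the function name eliminate is made up here, the start state has no incoming edges, and the final state has no outgoing edges. The self-loop labelled c in the second example is an extra added for this sketch. If the removed state has a self-loop, that loop becomes a starred piece in the middle of each detour.

```python
import re

def eliminate(edges, q):
    """Remove state q from an automaton whose edges carry regex strings.

    edges maps (source, target) -> regex string. For every path p -> q -> t
    the new label is  in . loop* . out, unioned with any existing p -> t label.
    """
    loop = edges.get((q, q))
    loop_part = f"(?:{loop})*" if loop is not None else ""
    incoming = {p: r for (p, t), r in edges.items() if t == q and p != q}
    outgoing = {t: r for (s, t), r in edges.items() if s == q and t != q}
    # Keep every edge that does not touch q at all.
    result = {k: r for k, r in edges.items() if q not in k}
    for p, r_in in incoming.items():
        for t, r_out in outgoing.items():
            detour = f"(?:{r_in}){loop_part}(?:{r_out})"
            if (p, t) in result:
                # A direct edge already exists: take the union of both routes.
                result[(p, t)] = f"(?:{result[(p, t)]})|(?:{detour})"
            else:
                result[(p, t)] = detour
    return result

# The example from the text: (Start) --a--> [Q] --b--> (Final)
simple = eliminate({("Start", "Q"): "a", ("Q", "Final"): "b"}, "Q")

# Same machine with an added self-loop on Q labelled c (hypothetical extra).
looped = eliminate(
    {("Start", "Q"): "a", ("Q", "Q"): "c", ("Q", "Final"): "b"}, "Q")
```

After the call, simple has a single Start-to-Final edge whose label matches exactly "ab", and the label in looped matches "ab", "acb", "accb", and so on. The generated strings are wrapped in non-capturing groups so they stay correct when nested; they are not simplified, so they get long quickly on bigger machines.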
Why does the equivalence hold?
Because both REs and finite automata describe patterns built from:
- sequences (concatenation)
- choices (union)
- repetitions (Kleene star)
These are exactly the tools needed for defining regular languages.
REs do it as written patterns.
Automata do it using states and transitions.
But the power is the same.
Visual Summary Diagram
┌──────────────────────────┐
│ Regular Expressions (RE) │
│        (patterns)        │
└─────────────┬────────────┘
              │ Convert (Thompson)
              ▼
        ┌────────────┐
        │    NFA     │
        │(ε-moves ok)│
        └─────┬──────┘
              │ Subset Construction
              ▼
        ┌────────────┐
        │    DFA     │
        │(no ε-moves)│
        └─────┬──────┘
              │ State Elimination
              ▼
┌──────────────────────────┐
│ Regular Expressions (RE) │
└──────────────────────────┘
The loop shows that you can always move between RE ↔ Automata.
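The middle arrow of the loop, subset construction, is the one step not walked through above, so here is a rough Python sketch of it. Each DFA state is a set of NFA states the machine could be in at once. The function names and the small example NFA for a*b are assumptions made up for this illustration.

```python
def eps_closure(states, eps):
    """All NFA states reachable from `states` using only ε-moves."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for r in eps.get(q, ()):
            if r not in seen:
                seen.add(r)
                stack.append(r)
    return frozenset(seen)

def subset_construction(start, accepting, delta, eps, alphabet):
    """Build a DFA whose states are frozensets of NFA states.

    delta maps (state, symbol) -> set of states; eps maps state -> set of states.
    """
    d_start = eps_closure({start}, eps)
    d_delta, todo, seen = {}, [d_start], {d_start}
    while todo:
        current = todo.pop()
        for a in alphabet:
            moved = set()
            for q in current:
                moved |= delta.get((q, a), set())
            target = eps_closure(moved, eps)
            d_delta[(current, a)] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    # A DFA state accepts if it contains at least one accepting NFA state.
    d_accepting = {s for s in seen if s & accepting}
    return d_start, d_accepting, d_delta

def dfa_accepts(dfa, text):
    """Run the DFA; assumes every character of text is in the alphabet."""
    state, accepting, delta = dfa
    for ch in text:
        state = delta[(state, ch)]
    return state in accepting

# Hypothetical NFA for a*b: 0 --a--> 0, 0 --ε--> 1, 1 --b--> 2 (accepting).
dfa = subset_construction(
    start=0, accepting={2},
    delta={(0, "a"): {0}, (1, "b"): {2}},
    eps={0: {1}}, alphabet="ab")
```

The resulting DFA has no ε-moves: every (state, symbol) pair has exactly one target, with the empty set acting as the dead state. dfa_accepts(dfa, "aab") is true and dfa_accepts(dfa, "ba") is false.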
