Let’s explore this using a famous non-regular language, then see how to build a context-free grammar (CFG) for its complement.
1. Start with a Classic Non-Regular Language
One of the most well-known non-regular languages is:
[
L = { a^n b^n \mid n \ge 0 }
]
This language contains strings where:
- All the a’s come first
- All the b’s come after
- And the number of a’s equals the number of b’s
Examples in L:
""(the empty string)abaabbaaabbb
This language is not regular, but it is context-free.
2. What Is Its Complement?
The complement means: “All strings made of a’s and b’s that are NOT in L.”
So, any string that breaks the rule aⁿbⁿ belongs to the complement.
Examples NOT in L:
b…(because b appears before a)aab(counts don’t match)abba(a and b are mixed)aaaabb(4 a’s, 2 b’s — mismatch)ababab(completely unordered)
Let’s call the complement:
[
L’ = \Sigma^* – L,\ \text{where }\Sigma = {a, b}
]
**3. How to Think About L’?
Break the Mistakes into Categories**
A string is not in ( L ) if it breaks at least one rule.
The rules for L were:
- All a’s must come first.
- b’s must come after.
- Numbers of a’s and b’s must match.
So a string belongs to the complement ( L’ ) if:
Case 1: It starts with ‘b’
Examples:b, bb, ba, baba
Case 2: It has a ‘b’ followed later by an ‘a’
This breaks the “all a’s before b’s” rule.
Examples:abba, aababb, baaa
Case 3: All a’s are before all b’s, but counts do NOT match
Examples:aaab, aabbb, aaaaabbb
By handling these cases one by one, we can build a clean grammar.
4. A Simple CFG for the Complement
We now construct a context-free grammar that produces exactly the strings from (L’).
Case 1 Grammar: Strings that start with b
S → Bstart
Bstart → bA
A → aA | bA | ε
This generates any string beginning with b.
Case 2 Grammar: A ‘b’ followed later by an ‘a’
We force a b first, then ensure that an a appears after that b.
S → WrongOrder
WrongOrder → a* b a Tail
Tail → aTail | bTail | ε
(To write it in pure CFG form, we expand a* using recursion.)
Case 3 Grammar: Right order but mismatch in counts
Here, the string looks like a^m b^n but m ≠ n.
We use two productions:
- More a’s than b’s
- More b’s than a’s
S → MoreA | MoreB
MoreA → a MoreA b | a MoreA | a
MoreB → a MoreB b | MoreB b | b
This ensures the counts never match.
Combined CFG (Unified Version)
To make it clean, we define:
S → S1 | S2 | S3
Where each part handles a different case:
1. Strings starting with b
S1 → bA
A → aA | bA | ε
2. Strings where b appears before a (wrong order)
S2 → X b Y a Z
X → aX | ε
Y → aY | bY | ε
Z → aZ | bZ | ε
3. Strings with correct order but mismatched counts
S3 → MoreA | MoreB
MoreA → a MoreA b | a MoreA | a
MoreB → a MoreB b | MoreB b | b
This CFG generates exactly the complement of the non-regular language (a^n b^n).
5. Simple Diagram (Conceptual View)
Below is a simple diagram showing the 3 major branches of the grammar:
+-------------+
| S |
+------+------+
|
---------------------------------
| | |
+-----+ +-----+ +-------+
| S1 | | S2 | | S3 |
+-----+ +-----+ +-------+
| | |
Strings starting Strings with Correct order
with 'b' b…a pattern but wrong counts
This visual simply shows how our grammar “splits” the complement into understandable categories.
6. Why Is This Useful?
Understanding CFGs for complements teaches you:
- how to break complicated languages into manageable parts
- how to think like a language designer
- and how CFGs can describe languages that DFAs and NFAs can’t handle
It’s a great way to build intuition before moving toward deeper topics like PDA complement properties and closure operations.