Parsing A Simple Ompass Compiler
A Simple Onepass Compiler — Understanding Parsing
Now that we have a tiny language called Ompass, imagine someone writes this in it:
x = a + 3
Before the compiler can translate this, it must make sense of the structure.
It must decide:
- What is the left side?
- What is the right side?
- Which part is an expression?
- What is a term?
- What is a number?
This careful checking and structuring of code is called parsing.
🌟 What Is Parsing? (The Friendly Way)
Think of parsing like reading a sentence in English.
If someone says:
“Dog the fast very runs.”
You might recognize all the words, but the order is strange.
Parsing helps the compiler understand the correct order of programming language elements.
In simple words:
Parsing is the stage where the compiler reads the code and organizes it into a tree-like structure based on grammar rules.
It’s like turning messy puzzle pieces into a completed picture.
🧠 Why Does Parsing Matter?
Parsing tells the compiler:
- whether the program is valid
- which pieces belong together
- how to build meaning from the code
- where to report errors if the code is wrong
Without parsing, the compiler would be like a person trying to read a book in a language they don’t know — the words mean nothing.
🧩 Parsing in a Simple Ompass Compiler
For our tiny Ompass language, let’s use a simple grammar:
<statement> → <id> = <expression>
<expression> → <expression> + <term> | <term>
<term> → <term> * <factor> | <factor>
<factor> → <number> | <id> | ( <expression> )
When the parser reads the Ompass program, it uses rules like these to figure out:
- What counts as an expression
- What belongs inside parentheses
- What is a valid assignment
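Before any of these rules can be applied, the input has to be split into the pieces the grammar talks about: identifiers, numbers, and symbols. Here is a minimal sketch of that step in Python. The article does not say what an <id> or a <number> looks like in Ompass, so the token shapes (letter-based names, unsigned integers) and the names TOKEN_RE and tokenize are assumptions made only for illustration.

```python
import re

# Assumed token shapes (not defined in the article):
#   number -> one or more digits
#   id     -> a letter or underscore followed by letters, digits, underscores
#   sym    -> one of the symbols used by the grammar: = + * ( )
TOKEN_RE = re.compile(r"(?P<number>\d+)|(?P<id>[A-Za-z_]\w*)|(?P<sym>[=+*()])")

def tokenize(text):
    """Split a line of source text into (kind, text) pairs."""
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1  # whitespace only separates tokens
            continue
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at position {pos}")
        kind = match.lastgroup  # name of the alternative that matched
        tokens.append((kind, match.group(kind)))
        pos = match.end()
    return tokens

print(tokenize("x = a + 3"))
```

The parser can then work with these labelled pieces instead of raw characters, which is what lets a rule like <factor> → <number> | <id> be checked with a simple comparison.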
🌳 The Parse Tree
A parser takes the input and builds a tree.
This tree shows how the program fits the grammar.
Let’s take the example:
x = a + 3
Here is a simple text-style diagram of its parse tree:
                 <statement>
                      |
        -----------------------------
        |                           |
      <id>                    <expression>
        |                           |
        x                -----------------------
                         |                     |
                   <expression>     +       <term>
                         |                     |
                      <term>               <factor>
                         |                     |
                     <factor>                  3
                         |
                         a
This tree tells the compiler:
- x is an identifier
- a + 3 is an expression
- a is a factor
- 3 is a factor
- + connects two subexpressions
It’s a map of the whole statement.
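To make the idea of a tree concrete, here is a small Python sketch that writes the parse tree of x = a + 3 by hand as nested nodes. The Node class and the helper names show and leaves are hypothetical and chosen only for this illustration. Unlike the diagram, the sketch also stores the = sign as a leaf so that the leaves spell out the whole statement.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    label: str                                   # a grammar symbol or a piece of input
    children: list = field(default_factory=list)

def show(node, depth=0):
    """Return the tree as indented lines, one node per line."""
    lines = ["  " * depth + node.label]
    for child in node.children:
        lines.extend(show(child, depth + 1))
    return lines

def leaves(node):
    """Collect the leaf labels from left to right."""
    if not node.children:
        return [node.label]
    found = []
    for child in node.children:
        found.extend(leaves(child))
    return found

# The parse tree for: x = a + 3 (built by hand, not by a parser)
tree = Node("<statement>", [
    Node("<id>", [Node("x")]),
    Node("="),
    Node("<expression>", [
        Node("<expression>", [Node("<term>", [Node("<factor>", [Node("a")])])]),
        Node("+"),
        Node("<term>", [Node("<factor>", [Node("3")])]),
    ]),
])

print("\n".join(show(tree)))
print(" ".join(leaves(tree)))
```

Reading the leaves from left to right gives back the original statement, which is a quick way to check that a parse tree really covers the whole input.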
🧠 Two Main Ways the Parser Works
To keep things fun and light, let’s compare them to two types of readers:
1. Top-Down Parsing – “The Guessing Reader”
Starts from the top rule and tries to predict what the input should look like.
Like a teacher saying:
“Hmm… you’re probably trying to write an expression here. Let me check…”
2. Bottom-Up Parsing – “The Detective Reader”
Starts from the smallest pieces and works upward.
Like solving a puzzle:
“These little pieces look like numbers… these look like operators… let’s connect them…”
For an unambiguous grammar like ours, either approach ends up with the same parse tree.
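Here is a rough Python sketch of the "detective" style, limited to the expression part of the grammar. It pushes input pieces onto a stack ("shift") and, whenever the top of the stack looks like the right side of a rule, replaces it with the left side ("reduce"). The function names classify, reduce_once, and shift_reduce are made up for this example, the token shapes are assumed, and the hand-written reduce order is a simplification of what real bottom-up parser tables do.

```python
def classify(tok):
    """Map a raw token to a grammar symbol (token shapes are assumed)."""
    if tok.isdigit():
        return "<number>"
    if tok.isidentifier():
        return "<id>"
    return tok  # operators and parentheses stand for themselves

def reduce_once(stack, lookahead):
    """Replace a rule's right side on top of the stack with its left side."""
    top = stack[-1] if stack else None
    if top in ("<number>", "<id>"):
        stack[-1:] = ["<factor>"]
    elif stack[-3:] == ["(", "<expression>", ")"]:
        stack[-3:] = ["<factor>"]
    elif stack[-3:] == ["<term>", "*", "<factor>"]:
        stack[-3:] = ["<term>"]
    elif top == "<factor>":
        stack[-1:] = ["<term>"]
    # A term followed by * must wait: the multiplication binds first.
    elif lookahead != "*" and stack[-3:] == ["<expression>", "+", "<term>"]:
        stack[-3:] = ["<expression>"]
    elif lookahead != "*" and top == "<term>":
        stack[-1:] = ["<expression>"]
    else:
        return False
    return True

def shift_reduce(tokens):
    """Return (accepted, trace) for a list of expression tokens."""
    symbols = [classify(t) for t in tokens]
    stack, trace = [], []
    for i, sym in enumerate(symbols):
        stack.append(sym)
        trace.append(f"shift {tokens[i]}")
        lookahead = symbols[i + 1] if i + 1 < len(symbols) else None
        while reduce_once(stack, lookahead):
            trace.append("reduce -> " + " ".join(stack))
    return stack == ["<expression>"], trace

accepted, trace = shift_reduce(["a", "+", "3"])
print("\n".join(trace))
print("accepted:", accepted)
```

Every reduce step is one grammar rule applied backwards, so if the stack ends up holding a single <expression>, the input really does fit the grammar.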
🪄 Parsing in Our Ompass Compiler: A Simple Walkthrough
Let’s slowly parse the example line x = a + 3:
Step 1:
See x → treat it as <id>
Step 2:
See = → matches assignment rule
Step 3:
See a + 3 → check if it matches <expression>
Step 4:
Inside the expression:
- a becomes a <factor>
- 3 becomes a <factor>
- + shows we’re combining terms
Step 5:
Build the full parse tree
When the parser reaches the end of the input successfully, it knows:
“This program follows the grammar. I can continue!”
If something is wrong, like:
x = + 5 a
the parser immediately catches it and reports:
“Hey! That doesn’t match any valid rule.”
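The walkthrough and the error case above can be sketched as a small top-down parser in Python. This is only an illustration under a few assumptions: the class name Parser, its method names, and the token shapes are not from the article. One detail worth knowing is that rules such as <expression> → <expression> + <term> refer to themselves on the left, which a plain top-down parser cannot follow directly, so the sketch uses the equivalent loop form "a term, then any number of + term". It also returns a compact tree of tuples instead of the full parse tree.

```python
import re

def tokenize(text):
    # Assumed token shapes: names, unsigned integers, and the symbols = + * ( ).
    # Any other character is simply skipped in this sketch.
    return re.findall(r"[A-Za-z_]\w*|\d+|[=+*()]", text)

class Parser:
    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def fail(self, expected):
        found = self.peek()
        shown = "end of input" if found is None else repr(found)
        raise SyntaxError(f"token {self.pos}: expected {expected}, found {shown}")

    def statement(self):
        # <statement> -> <id> = <expression>   (Steps 1 to 3)
        name = self.peek()
        if name is None or not name.isidentifier():
            self.fail("a name")
        self.take()
        if self.peek() != "=":
            self.fail("'='")
        self.take()
        value = self.expression()
        if self.peek() is not None:  # leftover input means the line is invalid
            self.fail("end of input")
        return ("=", name, value)

    def expression(self):
        # Loop form of: <expression> -> <expression> + <term> | <term>
        node = self.term()
        while self.peek() == "+":
            self.take()
            node = ("+", node, self.term())
        return node

    def term(self):
        # Loop form of: <term> -> <term> * <factor> | <factor>
        node = self.factor()
        while self.peek() == "*":
            self.take()
            node = ("*", node, self.factor())
        return node

    def factor(self):
        # <factor> -> <number> | <id> | ( <expression> )
        tok = self.peek()
        if tok is not None and (tok.isdigit() or tok.isidentifier()):
            return self.take()
        if tok == "(":
            self.take()
            node = self.expression()
            if self.peek() != ")":
                self.fail("')'")
            self.take()
            return node
        self.fail("a number, a name, or '('")

print(Parser("x = a + 3").statement())
try:
    Parser("x = + 5 a").statement()
except SyntaxError as err:
    print("rejected:", err)
```

For x = a + 3 the parser returns the tuple ('=', 'x', ('+', 'a', '3')). For x = + 5 a it stops at the + sign, because no <factor> can start with +, and the error message names the token where things went wrong.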
