BNF Notation in Compiler Design
BNF stands for Backus Naur Form notation. It is a formal method for describing the syntax of programming language which is understood as Backus Naur Formas introduced by John Bakus and Peter Naur in 1960. BNF and CFG (Context Free Grammar) were nearly identical. BNF may be a meta-language (a language that cannot describe another language) for primary languages.
For human consumption, a proper notation for encoding grammars intended and called Backus Naur Form (BNF). Different languages have different description and rules but the general structure of BNF is given below –
name ::= expansion
The symbol ::= means “may expand into” and “may get replaced with.” In some texts, a reputation is additionally called a non-terminal symbol.
- Every name in Backus-Naur form is surrounded by angle brackets, < >, whether it appears on the left- or right-hand side of the rule.
- An expansion is an expression containing terminal symbols and non-terminal symbols, joined together by sequencing and selection.
- A terminal symbol may be a literal like (“+” or “function”) or a category of literals (like integer).
- Simply juxtaposing expressions indicates sequencing.
- A vertical bar | indicates choice.
<expr> ::= <term> "+" <expr> | <term> <term> ::= <factor> "*" <term> | <factor> <factor> ::= "(" <expr> ")" | <const> <const> ::= integer
Rules For making BNF :
Naturally, we will define a grammar for rules in BNF –
rule → name ::= expansion name → < identifier > expansion → expansion expansion expansion → expansion | expansion expansion → name expansion → terminal
- We might define identifiers as using the regular expression [-A-Za-z_0-9]+.
- A terminal could be a quoted literal (like “+”, “switch” or ” “<<=”) or the name of a category of literals (like integer).
- The name of a category of literals is typically defined by other means, like a daily expression or maybe prose.
It is common to seek out regular-expression-like operations inside grammars. as an example, the Python lexical specification uses them. In these grammars:
postfix * means "repeated 0 or more times" postfix + means "repeated 1 or more times" postfix ? means "0 or 1 times"
The definition of floating-point literals in Python may be an exemplar of mixing several notations –
floatnumber ::= pointfloat | exponentfloat pointfloat ::= [intpart] fraction | intpart "." exponentfloat ::= (intpart | pointfloat) exponent intpart ::= digit+ fraction ::= "." digit+ exponent ::= ("e" | "E") ["+" | "-"] digit+
It does not use angle brackets around names (like many EBNF notations and ABNF), yet does use ::= (like BNF). It mixes regular operations like + for non-empty repetition with EBNF conventions like [ ] for option. The grammar for the whole Python language uses a rather different (but still regular) notation.