One problem I ran into is that C must be context-sensitive and cannot be parsed with a single token of lookahead. For instance:
int main1;
int main() {}
This is the simplest example I can think of in which a function definition and a variable declaration both start with the same kind of token. You have to look all the way ahead to the left parenthesis or the semicolon to determine what is being parsed.
My question is: how do real compilers handle this? Does the lexical analyzer have some tricks up its sleeve to peek ahead and emit an invisible token that distinguishes the two cases? Or do modern parsers use many tokens of lookahead?
Scott