I am taking a course on building a compiler, and my current task is to write a lexer for the language that we are implementing. I cannot figure out how to satisfy the requirement that the lexer must recognize concatenated tokens, that is, tokens that are not separated by spaces. For example, the string 39if should be recognized as the number 39 followed by the keyword if. At the same time, the lexer should also exit(1) when it encounters invalid input.
A simplified version of the code that I have:
%{
%}
%option main warn debug
%%
if |
then |
else printf("keyword: %s\n", yytext);
[[:digit:]]+ printf("number: %s\n", yytext);
[[:alpha:]][[:alnum:]]* printf("identifier: %s\n", yytext);
[[:space:]]+ // skip whitespace
[[:^space:]]+ { printf("ERROR: %s\n", yytext); exit(1); }
%%
When I run this (or my full version) and give it the input 39if, the error rule matches and the output is ERROR: 39if, whereas I would like to get:
number: 39
keyword: if
(I.e. the same as if I had entered 39 if as input.)
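As far as I understand, the error rule wins here because it matches all of 39if, and flex prefers the longest match. However, I have not been able to come up with a regexp that works as a "catch-all" rule for the lexer without having this effect.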
Edit: as it turns out, the catch-all rule should be . { exit(1); }, that is, it should match just 1 character.
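For illustration, here is a minimal sketch of the simplified lexer with the error rule reduced to a single character. It assumes the same rules as in the code above. The explicit includes, the aligned layout, and dropping the debug option (to keep the output free of trace messages) are my own choices, not requirements. Because `.` can never match more than one character, it only wins when no other rule matches at the current position, so 39 is still picked up by the number rule.

```lex
%{
/* Spelled out for clarity; printf() and exit() are used in the actions. */
#include <stdio.h>
#include <stdlib.h>
%}
%option main warn
%%
if      |
then    |
else                     printf("keyword: %s\n", yytext);
[[:digit:]]+             printf("number: %s\n", yytext);
[[:alpha:]][[:alnum:]]*  printf("identifier: %s\n", yytext);
[[:space:]]+             /* skip whitespace */
    /* Catch-all: exactly one character, so any longer match above wins. */
.                        { printf("ERROR: %s\n", yytext); exit(1); }
%%
```

Assuming the file is saved as lexer.l (a hypothetical name), it can be built with `flex lexer.l && cc lex.yy.c -o lexer`. Feeding it 39if should then print number: 39 followed by keyword: if. The keyword rules are listed before the identifier rule, so when both match the same length, as with if, the keyword rule is the one chosen.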