How to make Lex / flex recognize tokens not separated by spaces?

I am taking a course on compiler construction, and my current task is to write a lexer for the language we are implementing. I cannot figure out how to satisfy the requirement that the lexer must recognize concatenated tokens, that is, tokens that are not separated by spaces. For example, the string 39if should be recognized as the number 39 and the keyword if. At the same time, the lexer should exit(1) when it encounters invalid input.

A simplified version of the code that I have:

%{
#include <stdio.h>
%}

%option main warn debug

%%

if      |
then    |
else    printf("keyword: %s\n", yytext);

[[:digit:]]+    printf("number: %s\n", yytext);

[[:alpha:]][[:alnum:]]*     printf("identifier: %s\n", yytext);

[[:space:]]+    // skip whitespace
[[:^space:]]+   { printf("ERROR: %s\n", yytext); exit(1); }

%%

When I run this (or my full version) on the input 39if, the error rule matches and the output is ERROR: 39if, whereas I would like to get:

number: 39
keyword: if

(I.e. the same as if I had entered 39 if as input.)

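flex always chooses the rule that matches the longest stretch of input and, when several rules match the same length, the one listed first. Your keyword, number and identifier rules are fine; the problem is the error rule. On the input 39if the number rule matches 2 characters, but the regexp [[:^space:]]+ matches all 4, so the "catch-all" rule wins and the lexer exits.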

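Solution: the catch-all rule should be just . { exit(1); }, placed after all the other rules, so that it matches only 1 character and wins only when nothing else matches.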
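As a minimal sketch, here is the lexer from the question with only the error rule changed to a single-character catch-all. Dropping the debug option (to keep the output free of trace messages) and including <stdlib.h> for exit() are my own adjustments, not something the answer asks for.

```lex
%{
#include <stdio.h>
#include <stdlib.h>   /* exit() */
%}

%option main warn

%%

if      |
then    |
else    printf("keyword: %s\n", yytext);

[[:digit:]]+    printf("number: %s\n", yytext);

[[:alpha:]][[:alnum:]]*     printf("identifier: %s\n", yytext);

[[:space:]]+    /* skip whitespace */

    /* Catch-all, kept last: it matches exactly one character, so it can
       only win when no other rule matches at the current position. */
.       { printf("ERROR: %s\n", yytext); exit(1); }

%%
```

Built with flex and a C compiler (for example flex lexer.l && cc lex.yy.c -o lexer, where the file name lexer.l is hypothetical), the input 39if should print number: 39 and keyword: if, while an input such as 39$ prints number: 39 followed by ERROR: $ and exits with status 1.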


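A bare exit(1) gives no hint about what went wrong. A "proper" error report can be arranged in several ways, for example through the parser when the lexer is used with bison (say, with --bison-bridge), but if you just want a useful message before exiting, the following is, IMHO, the simplest: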

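%option yylineno
%x LEXING_ERROR
%%
    /* all your rules; the following *must* be at the end */
.                 { BEGIN(LEXING_ERROR); yyless(0); }
<LEXING_ERROR>.+  { fprintf(stderr,
                            "Invalid character '%c' found at line %d,"
                            " just before '%s'\n",
                            *yytext, yylineno, yytext+1);
                    exit(1);
                  }

The first rule has to come last: . matches any single character except a newline, so it only wins when no other rule matches. It does not consume anything itself. yyless(n) returns all but the first n characters of the current token to the input, so yyless(0) pushes the offending character back, and BEGIN switches to the exclusive start condition LEXING_ERROR, where only the second rule is active. There, .+ matches the offending character and the rest of the line (in flex, . never matches a newline), so *yytext is the invalid character and yytext+1 is the text that follows it. %option yylineno is needed for yylineno to be kept up to date. (The comment line is indented because an unindented line in the rules section would be read as a pattern.)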

See the flex manual's section on start conditions for details on %x and BEGIN.
