Parsing strings using JavaCC

I am trying to find a good way to parse strings using JavaCC without mistakenly matching them as another token. These strings should be able to contain spaces, letters and numbers.

My identifier and number tokens are defined as follows:

<IDENTIFIER: (["a"-"z", "A"-"Z"])+>
<NUMBER: (["0"-"9"])+>

My current string token:

<STRING: "\"" (<IDENTIFIER> | <NUMBER> | " ")+ "\"">

Ideally, I only want to keep the text inside the quotation marks. I have a separate file in which I actually save variables and values. Should I strip the quotes there?

Originally, I had a production like this in the parser file:

variable=<IDENTIFIER> <ASSIGN> <QUOTE> message=<IDENTIFIER> <QUOTE>
{File.saveVariable(variable.image, message.image);}

But, as you might have guessed, this did not allow spaces or numbers. For identifiers, such as variable names, I only want to allow letters.

So, I would like some tips on how to collect string literals. In particular, I would like to be able to handle strings such as:

" hello", "hello ", " hello " and "\nhello", "hello\n", "\nhello\n"

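One option is to use lexical states: switch to a separate state (STRING_STATE) when the opening quote is matched, and back to DEFAULT at the closing quote. The token definitions: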

TOKEN:
{
  <QUOTE:"\""> : STRING_STATE
}

<STRING_STATE> MORE:
{
  "\\" : ESC_STATE
}

<STRING_STATE> TOKEN:
{
  <ENDQUOTE:<QUOTE>> : DEFAULT
| <CHAR:~["\"","\\"]>
}

<ESC_STATE> TOKEN:
{
  <CNTRL_ESC:["\"","\\","/","b","f","n","r","t"]> : STRING_STATE
}

The parser productions that use these tokens:

/**
 * Match a quoted string.
 */
String string() :
{
  StringBuilder builder = new StringBuilder();
}
{
  <QUOTE> ( getChar(builder) )* <ENDQUOTE>
  {
    return builder.toString();
  }
}

/**
 * Match char inside quoted string.
 */
void getChar(StringBuilder builder):
{
  Token t;
}
{
  ( t = <CHAR> | t = <CNTRL_ESC> )
  {
    if (t.image.length() < 2)
    {
      // CHAR
      builder.append(t.image.charAt(0));
    }
    else if (t.image.length() < 6)
    {
      // ESC
      char c = t.image.charAt(1);
      switch (c)
      {
        case 'b': builder.append((char) 8); break;
        case 'f': builder.append((char) 12); break;
        case 'n': builder.append((char) 10); break;
        case 'r': builder.append((char) 13); break;
        case 't': builder.append((char) 9); break;
        default: builder.append(c);
      }
    }
  }
}
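
To see what string() would return for the inputs in the question, the same state transitions can be sketched in plain Java, without JavaCC. This is only an illustration under assumptions: the class QuotedStringSketch and the methods parse and unescape are hypothetical names, not part of the grammar or of JavaCC, and the enum simply mirrors the DEFAULT, STRING_STATE and ESC_STATE lexical states.

```java
public class QuotedStringSketch {
    // Hypothetical enum mirroring the lexical states of the grammar.
    enum State { DEFAULT, STRING_STATE, ESC_STATE }

    /** Takes a quoted string (quotes included) and returns its unescaped contents. */
    static String parse(String input) {
        StringBuilder builder = new StringBuilder();
        State state = State.DEFAULT;
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (state) {
                case DEFAULT:
                    // Corresponds to <QUOTE> : STRING_STATE
                    if (c != '"') {
                        throw new IllegalArgumentException("expected opening quote at " + i);
                    }
                    state = State.STRING_STATE;
                    break;
                case STRING_STATE:
                    if (c == '"') {
                        // Corresponds to <ENDQUOTE> : DEFAULT; the quotes are never appended.
                        if (i != input.length() - 1) {
                            throw new IllegalArgumentException("input after closing quote");
                        }
                        return builder.toString();
                    } else if (c == '\\') {
                        // Corresponds to the MORE rule that enters ESC_STATE.
                        state = State.ESC_STATE;
                    } else {
                        // Corresponds to <CHAR>: spaces, letters, digits, anything else.
                        builder.append(c);
                    }
                    break;
                case ESC_STATE:
                    // Corresponds to <CNTRL_ESC> : STRING_STATE
                    builder.append(unescape(c, i));
                    state = State.STRING_STATE;
                    break;
            }
        }
        throw new IllegalArgumentException("unterminated string");
    }

    /** Maps the character after a backslash to the character it stands for. */
    static char unescape(char c, int pos) {
        switch (c) {
            case 'b': return '\b';
            case 'f': return '\f';
            case 'n': return '\n';
            case 'r': return '\r';
            case 't': return '\t';
            case '"':
            case '\\':
            case '/': return c;
            default:
                throw new IllegalArgumentException("unknown escape \\" + c + " at " + pos);
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("\" hello \"").equals(" hello "));        // true
        System.out.println(parse("\"\\nhello\\n\"").equals("\nhello\n")); // true
    }
}
```

Because the quote characters only trigger state changes and are never appended, the returned value already has the quotes removed, so the code that saves variables would not need to strip them.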
