Flex

flex++

Either I'm not seeing it, or the documentation for flex++ is somewhat typical for a GNU tool: it exists, somewhere, and it's solid, but it's also cryptic and scattered.

I would be very appreciative of comments or corrections on any of this.

Changing Scanner Classes

The following is a trivial demonstration of using flex++, and in particular of changing the generated class name. This sounds like it should be really easy, but for some reason it is slightly awkward. Most of this is derived from PUMA documentation.

I'll assume the below scanner/lexer specification is in calc.l:

%option c++
%option noyywrap
%option outfile="calc.cc"
%option yyclass="Scanner"
%option prefix="Scanner"

%{
#include "calc.h"
#include <iostream>

using namespace std;
%}

DIGIT   [0-9]
DIGIT1  [1-9]

%%

"+"               { cout << "operator <" << yytext[0] << ">" << endl; }
"-"               { cout << "operator <" << yytext[0] << ">" << endl; }
"="               { cout << "operator <" << yytext[0] << ">" << endl; }
{DIGIT1}{DIGIT}*  { cout << "  number <" << yytext    << ">" << endl; }
.                 { cout << " UNKNOWN <" << yytext[0] << ">" << endl; }

%%

int main(int argc, char ** argv)
{
	Scanner scanner;
	scanner.yylex();
	return 0;
}

The c++ option is not necessary if you invoke flex through an executable whose name ends in '+' (such as flex++), but it can't hurt. The noyywrap option tells the generated lexer not to call yywrap() at end of file: it assumes there is no further input and simply stops, which is what we want when processing a single input. The outfile option has the same function as the -o command line option.

Most pertinent here, the yyclass option tells flex++ that "Scanner" is the class whose yylex() method it should generate, so the rule actions end up in Scanner::yylex() rather than in the base class's yylex(). Similarly, the prefix option renames the base lexer class from yyFlexLexer to ScannerFlexLexer. These options barely change the generated file; they simply redefine the yylex() function signature and the class name.

The preamble block in %{ ... %} includes iostream for this example, along with our own calc header. The latter is important and somewhat non-obvious: the generated lexer code doesn't actually declare the Scanner class, it only defines that class's yylex() method. The class declaration is given below.

Following those opening elements is a trivial lexical specification for identifying numbers and operators. The specification concludes with a main() function that just runs the lexer over standard input, which is the default.

The missing, non-obvious piece then is declaring the lexer class named by yyclass. This could be done in the preamble block, but it is just as easy to put it in a separate header. In this example I've assumed it's in calc.h:

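#ifndef CALC_H
#define CALC_H

// Relies on the generated calc.cc having already included FlexLexer.h,
// with yyFlexLexer renamed to ScannerFlexLexer by the prefix option.
class Scanner : public ScannerFlexLexer {
public:
  Scanner() {}

  // Declared here; the body is generated by flex++ from the rules in calc.l.
  int yylex();
};

#endif // CALC_H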

The key point here is declaring the yylex() method of our lexer class ("Scanner"), which is the method for which flex++ generates code. This class must extend ScannerFlexLexer, which provides all of the various flex++ generated and library methods for reading characters, tracking state, etc. Note that the symbol for this base class is xxFlexLexer, where xx is the prefix we set, with a default of yy. The real meat is in turn defined in the global FlexLexer.h and the generated code; the custom symbols are all handled through two macros (as far as I can tell, yyFlexLexer and YY_DECL).

Once all of this is together, the example can be compiled and run as follows:

> flex++ calc.l
> g++ -o calc calc.cc 
> ./calc 
1+1+1
  number <1>
operator <+>
  number <1>
operator <+>
  number <1>

Page last modified on December 21, 2009, at 12:50 PM