With both %define api.value.type variant and %define api.token.constructor, the parser defines the type symbol_type, and expects yylex to have the following prototype.
parser::symbol_type yylex ()
parser::symbol_type yylex (type1 arg1, …)
Return a complete symbol, aggregating its kind (i.e., the traditional value returned by yylex), its semantic value, and possibly its location. Invocations of ‘%lex-param {type1 arg1}’ yield additional arguments.
parser::symbol_type
A “complete symbol”, that binds together its kind, value and (when applicable) location.
symbol_kind_type symbol_type::kind () const
The kind of this symbol.
const char * symbol_type::name () const
The name of the kind of this symbol. Returns a std::string when parse.error is verbose.
For each token kind, Bison generates symbol_type constructors as follows.
parser::symbol_type::symbol_type (int token, const value_type& value, const location_type& location)
parser::symbol_type::symbol_type (int token, const location_type& location)
parser::symbol_type::symbol_type (int token, const value_type& value)
parser::symbol_type::symbol_type (int token)
Build a complete terminal symbol for the token kind token (including the api.token.prefix), whose semantic value, if it has one, is value of adequate value_type. Pass the location iff location tracking is enabled.
Consistency between token and value_type is checked via an assert.
For instance, given the following declarations:
    %define api.token.prefix {TOK_}
    %token <std::string> IDENTIFIER;
    %token <int> INTEGER;
    %token ':';
you may use these constructors:
    symbol_type (int token, const std::string&, const location_type&);
    symbol_type (int token, const int&, const location_type&);
    symbol_type (int token, const location_type&);
Correct matching between token kinds and value types is checked via assert; for instance, ‘symbol_type (ID, 42)’ would abort. Named constructors are preferable (see below), as they offer better type safety (for instance ‘make_ID (42)’ would not even compile), but symbol_type constructors may help when token kinds are discovered at run-time, e.g.,
    [a-z]+ {
      if (auto i = lookup_keyword (yytext))
        return yy::parser::symbol_type (i, loc);
      else
        return yy::parser::make_ID (yytext, loc);
    }
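The rule above calls lookup_keyword, which this section does not define. A minimal sketch of such a helper, assuming it returns the token kind of a keyword and 0 otherwise (so that its result can be tested directly in the if). The keyword set and the numeric token kinds below are hypothetical; in a real scanner the kinds would come from the generated parser.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical token kind values, for illustration only. A real scanner
// would use the enumerators emitted by Bison in the generated parser.
enum keyword_kind { NOT_A_KEYWORD = 0, TOK_IF = 300, TOK_ELSE = 301, TOK_WHILE = 302 };

// Return the token kind of the keyword spelled `text`, or 0 when `text`
// is not a keyword.
int lookup_keyword(const std::string& text) {
  static const std::unordered_map<std::string, int> keywords = {
      {"if", TOK_IF}, {"else", TOK_ELSE}, {"while", TOK_WHILE}};
  const auto it = keywords.find(text);
  return it == keywords.end() ? NOT_A_KEYWORD : it->second;
}
```

Using 0 as the "not found" value only works if no keyword maps to token kind 0, which is conventionally reserved for the end-of-file token.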
Note that it is possible to generate and compile type-incorrect code (e.g., ‘symbol_type (':', yytext, loc)’). It will fail at run time, provided the assertions are enabled (i.e., -DNDEBUG was not passed to the compiler). Bison supports an alternative that guarantees that type-incorrect code will not even compile. Indeed, it generates named constructors as follows.
symbol_type parser::make_token (const value_type& value, const location_type& location)
symbol_type parser::make_token (const location_type& location)
symbol_type parser::make_token (const value_type& value)
symbol_type parser::make_token ()
Build a complete terminal symbol for the token kind token (not including the api.token.prefix), whose semantic value, if it has one, is value of adequate value_type. Pass the location iff location tracking is enabled.
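The difference between the two families of constructors can be illustrated outside of Bison. The following is a hand-written stand-in, not the code Bison generates: the names symbol, token_kind and checked_symbol, and the numeric token values, are assumptions made for this sketch, and locations are left out. It shows a constructor-style function that only validates the kind/value pairing with assert, next to named constructors whose signatures fix the value type.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <variant>

// Hypothetical token kinds; real values are assigned by Bison.
enum token_kind { TOK_EOF = 0, TOK_IDENTIFIER = 258, TOK_INTEGER = 259, TOK_COLON = 260 };

// Stand-in for a complete symbol: a kind plus an optional semantic value.
struct symbol {
  token_kind kind;
  std::variant<std::monostate, std::string, int> value;
};

// Run-time checked: any kind may be passed, the pairing is verified by assert
// (and therefore not verified at all when NDEBUG is defined).
inline symbol checked_symbol(token_kind kind, std::string text) {
  assert(kind == TOK_IDENTIFIER);
  return {kind, std::move(text)};
}
inline symbol checked_symbol(token_kind kind, int number) {
  assert(kind == TOK_INTEGER);
  return {kind, number};
}
inline symbol checked_symbol(token_kind kind) {
  assert(kind == TOK_COLON || kind == TOK_EOF);
  return {kind, std::monostate{}};
}

// Named constructors: the kind is fixed by the function name and the value
// type by the signature, so make_IDENTIFIER(42) is rejected by the compiler.
inline symbol make_IDENTIFIER(std::string text) { return {TOK_IDENTIFIER, std::move(text)}; }
inline symbol make_INTEGER(int number) { return {TOK_INTEGER, number}; }
inline symbol make_COLON() { return {TOK_COLON, std::monostate{}}; }
```

With this sketch, checked_symbol (TOK_IDENTIFIER, 42) compiles and aborts at run time when assertions are enabled, whereas the corresponding mistake with a named constructor is a compile-time error.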
For instance, given the following declarations:
    %define api.token.prefix {TOK_}
    %token <std::string> IDENTIFIER;
    %token <int> INTEGER;
    %token COLON;
    %token EOF 0;
Bison generates:
    symbol_type make_IDENTIFIER (const std::string&, const location_type&);
    symbol_type make_INTEGER (const int&, const location_type&);
    symbol_type make_COLON (const location_type&);
    symbol_type make_EOF (const location_type&);
which should be used in a scanner as follows.
    [a-z]+   return yy::parser::make_IDENTIFIER (yytext, loc);
    [0-9]+   return yy::parser::make_INTEGER (text_to_int (yytext), loc);
    ":"      return yy::parser::make_COLON (loc);
    <<EOF>>  return yy::parser::make_EOF (loc);
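The scanner excerpt relies on text_to_int, which is not defined in this section. One possible implementation is sketched below, assuming the helper takes the matched text as a C string and should reject values that do not fit in an int; the choice to report errors by throwing is an assumption of this sketch, not something the scanner above requires.

```cpp
#include <cerrno>
#include <climits>
#include <cstdlib>
#include <stdexcept>
#include <string>

// Hypothetical helper: convert a run of decimal digits to int.
// Throws std::invalid_argument if `text` is not entirely a number, and
// std::out_of_range if the value does not fit in an int.
int text_to_int(const char* text) {
  errno = 0;
  char* end = nullptr;
  const long value = std::strtol(text, &end, 10);
  if (end == text || *end != '\0')
    throw std::invalid_argument(std::string("not an integer: ") + text);
  if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
    throw std::out_of_range(std::string("integer out of range: ") + text);
  return static_cast<int>(value);
}
```

Since the pattern [0-9]+ only matches digits, the range check is the one that matters in practice; the syntax check merely guards against the helper being called on other input.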
Tokens that do not have an identifier are not accessible: you cannot simply use characters such as ':'. Every token, including the end-of-file token, must be declared with %token.