RE/flex scanner generator replacement for Flex/Lex.
#include "reflex.h"
Macros
#define WITH_BOOST_PARTIAL_MATCH_BUG
Work around the Boost.Regex partial_match bug by forcing the generated scanner to buffer all input.
Functions
int fopen_s (FILE **file, const char *name, const char *mode)
Safer fopen_s()
char char_tolower (char c)
Convert to lower case.
static std::string file_ext (std::string &name, const char *ext)
Add file extension if not present, modifies the string argument and returns a copy.
int main (int argc, char **argv)
Main program instantiates Reflex class and runs Reflex::main(argc, argv)
Variables
static const char * options_table []
Table with command-line reflex options and lex specification %options.
static const Reflex::Library library_table []
Table with regex library properties.
RE/flex scanner generator replacement for Flex/Lex.
#define WITH_BOOST_PARTIAL_MATCH_BUG
Work around the Boost.Regex partial_match bug by forcing the generated scanner to buffer all input.
inline char char_tolower (char c)
Convert to lower case.
static std::string file_ext (std::string &name, const char *ext)
Add file extension if not present, modifies the string argument and returns a copy.
name
string with extension ext
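The behavior described for file_ext can be sketched as follows. This is a minimal illustration, not the reflex implementation: the name file_ext_sketch is hypothetical, and it assumes "extension not present" means the name does not already end in a dot followed by ext.

```cpp
#include <string>

// Hypothetical sketch (not the reflex source): append "." + ext to name
// unless name already ends with that suffix. The argument is modified in
// place and a copy of the result is returned, as the description states.
static std::string file_ext_sketch(std::string& name, const char *ext)
{
  const std::string suffix = std::string(".") + ext;
  const bool has_suffix =
      name.size() >= suffix.size() &&
      name.compare(name.size() - suffix.size(), suffix.size(), suffix) == 0;
  if (!has_suffix)
    name += suffix;
  return name;
}
```

Because the argument is taken by non-const reference, a caller sees the extended name both in its own variable and in the returned copy.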
inline int fopen_s (FILE **file, const char *name, const char *mode)
Safer fopen_s()
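On platforms without a native fopen_s, a wrapper with this signature could look roughly like the sketch below. It is an assumption about the intent, not the reflex code: the name fopen_s_sketch is hypothetical (chosen to avoid clashing with a platform-provided fopen_s), and returning errno on failure is one possible convention.

```cpp
#include <cerrno>
#include <cstdio>

// Hypothetical sketch: open a file, store the handle in *file, and return 0
// on success or a nonzero error code on failure.
inline int fopen_s_sketch(FILE **file, const char *name, const char *mode)
{
  errno = 0;
  *file = std::fopen(name, mode);
  if (*file != nullptr)
    return 0;
  // fall back to EIO if the C library did not set errno
  return errno != 0 ? errno : EIO;
}
```

The caller checks the return value instead of testing the FILE pointer, which matches the calling convention of the signature shown above.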
int main (int argc, char **argv)
Main program instantiates Reflex class and runs Reflex::main(argc, argv)
static const Reflex::Library library_table []
Table with regex library properties.
This table is extensible: new regex libraries may be added. Each regex library is described by its properties, including the matcher=NAME option.
A regex library signature is a string of the form "decls:escapes?+.", see reflex::convert.
The optional "decls:" part specifies which modifiers and other special (?...) constructs are supported:
- (?:...) is supported
- (?i...) case-insensitive matching is supported
- (?m...) multiline mode is supported for the ^ and $ anchors
- (?s...) dotall mode is supported
- (?x...) freespace mode is supported
- # specifies that (?#...) comments are supported
- = specifies that (?=...) lookahead is supported
- < specifies that (?<...) lookbehind is supported
- ! specifies that (?!=...) and (?!<...) are supported
- ^ specifies that (?^...) negative (reflex) patterns are supported
The "escapes" characters specify which standard escapes are supported:
- a for \a (BEL U+0007)
- b for \b (BS U+0008) in brackets [\b] only AND the \b word boundary
- c for \cX control character specified by X modulo 32
- d for \d ASCII digit [0-9]
- e for \e ESC U+001B
- f for \f FF U+000C
- h for \h ASCII blank [ \t] (SP U+0020 or TAB U+0009)
- i for \i reflex indent anchor
- j for \j reflex dedent anchor
- k for \k reflex undent anchor
- l for \l ASCII lower case letter [a-z]
- n for \n LF U+000A
- p for \p{C} Unicode character classes, also implies Unicode \x{X} and related escapes
- r for \r CR U+000D
- s for \s space (SP, TAB, LF, VT, FF, or CR)
- t for \t TAB U+0009
- u for \u ASCII upper case letter [A-Z] (when not followed by {XXXX})
- v for \v VT U+000B
- w for \w ASCII word-like character [0-9A-Z_a-z]
- x for \xXX 8-bit character encoding in hexadecimal
- y for \y word boundary
- z for \z end of input anchor
- ` for \` begin of input anchor
- ' for \' end of input anchor
- < for \< left word boundary
- > for \> right word boundary
- A for \A begin of input anchor
- B for \B non-word boundary
- D for \D ASCII non-digit [^0-9]
- H for \H ASCII non-blank [^ \t]
- L for \L ASCII non-lower case letter [^a-z]
- N for \N not a newline
- P for \P{C} Unicode inverse character classes, see 'p'
- Q for \Q...\E quotations
- R for \R Unicode line break
- S for \S ASCII non-space (no SP, TAB, LF, VT, FF, or CR)
- U for \U ASCII non-upper case letter [^A-Z]
- W for \W ASCII non-word-like character [^0-9A-Z_a-z]
- X for \X any Unicode character
- Z for \Z end of input anchor, before the final line break
- 0 for \0nnn 8-bit character encoding in octal, requires a leading 0
Note that 'p' is a special case to support Unicode-based matchers that natively support UTF-8 patterns and Unicode classes \p{C}, \P{C}, the related shorthand classes, and \x{X}. In effect, 'p' prevents conversion of Unicode patterns to UTF-8. This special case does not support {NAME} expansions in bracket lists such as [a-z||{upper}] and {lower}{+}{upper} used in lexer specifications.
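The \cX entry in the escapes list maps X to a control character by taking its code modulo 32. A minimal sketch of that arithmetic, with a hypothetical helper name control_char that is not part of reflex:

```cpp
// Hypothetical helper: the control character denoted by \cX is the
// character code of X modulo 32.
constexpr char control_char(char x)
{
  return static_cast<char>(static_cast<unsigned char>(x) % 32);
}
```

Under this rule \cA and \ca both give U+0001, and \c[ gives U+001B (ESC), since 65, 97 and 91 leave remainders 1, 1 and 27 when divided by 32.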
The optional "?+" part specifies lazy and possessive quantifier support:
- ? lazy quantifiers for repeats are supported
- + possessive quantifiers for repeats are supported
The optional "." (dot) specifies that dot matches any character except newline. A dot is implied by the presence of the 's' modifier, and can be omitted in that case.
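To make the "decls:escapes?+." layout concrete, the sketch below splits a signature string into its parts. It is an illustration under stated assumptions, not the parser in reflex::convert: the struct and function names are hypothetical, the first ':' is assumed to end the decls part, and the trailing ".", "+" and "?" are peeled off from the end in that order.

```cpp
#include <string>

// Hypothetical container for the parts of a regex library signature.
struct SignatureSketch {
  std::string decls;        // modifiers and (?...) constructs, may be empty
  std::string escapes;      // supported escape characters
  bool lazy = false;        // "?" present: lazy quantifiers supported
  bool possessive = false;  // "+" present: possessive quantifiers supported
  bool dot = false;         // "." present, or implied by the 's' modifier
};

// Hypothetical parser for strings of the form "decls:escapes?+.".
inline SignatureSketch parse_signature_sketch(const std::string& signature)
{
  SignatureSketch result;
  std::string rest = signature;
  // optional "decls:" prefix, assumed to end at the first ':'
  const std::string::size_type colon = rest.find(':');
  if (colon != std::string::npos) {
    result.decls = rest.substr(0, colon);
    rest.erase(0, colon + 1);
  }
  // optional trailing flags, removed from the end: ".", then "+", then "?"
  if (!rest.empty() && rest.back() == '.') { result.dot = true; rest.pop_back(); }
  if (!rest.empty() && rest.back() == '+') { result.possessive = true; rest.pop_back(); }
  if (!rest.empty() && rest.back() == '?') { result.lazy = true; rest.pop_back(); }
  result.escapes = rest;
  // the 's' modifier implies the dot
  if (result.decls.find('s') != std::string::npos)
    result.dot = true;
  return result;
}
```

For a made-up signature such as "imsx#=^:abcdnrtw?+", this yields decls "imsx#=^", escapes "abcdnrtw", lazy and possessive support, and an implied dot because 's' is among the modifiers.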
static const char * options_table []
Table with command-line reflex options and lex specification %options.
The table consists of option names with hyphens replaced by underscores.
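Since the table stores names with underscores, a hyphenated option name has to be normalized before it is looked up. A minimal sketch of that step follows. The table entries and function names here are illustrative assumptions, not the contents of options_table or the lookup code in reflex.

```cpp
#include <algorithm>
#include <string>

// Illustrative entries only (assumed for this sketch), terminated by nullptr.
static const char *options_table_sketch[] = {
  "case_insensitive",
  "header_file",
  "outfile",
  nullptr
};

// Replace hyphens by underscores, matching the table's naming convention.
inline std::string normalize_option(std::string name)
{
  std::replace(name.begin(), name.end(), '-', '_');
  return name;
}

// Return true if the normalized name occurs in the sketch table.
inline bool is_known_option(const std::string& name)
{
  const std::string key = normalize_option(name);
  for (const char *const *entry = options_table_sketch; *entry != nullptr; ++entry)
    if (key == *entry)
      return true;
  return false;
}
```

With this normalization, both the hyphenated and the underscored spelling of an option name resolve to the same table entry.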