RE/flex UCS to UTF-8 converters.
#include <cstddef>
#include <cstring>
#include <string>
#define REFLEX_NONCHAR (0x200000)
Invalid UTF-8 is replaced with the non-character code point U+200000, which lies beyond the Unicode range (U+0000 to U+10FFFF), for guaranteed error detection. Substituting U+FFFD instead makes errors harder to detect and easy to miss.

#define REFLEX_NONCHAR_UTF8 "\xf8\x88\x80\x80\x80"

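The relationship between the two macros can be checked by hand: U+200000 does not fit in the four-byte UTF-8 form, so it needs the obsolete five-byte form. The sketch below is a minimal illustration, not RE/flex code. The helper names `encode_5byte` and `check_nonchar_macros` are hypothetical, and the two macros are copied from the listing so that the block compiles without the library.

```cpp
#include <cassert>
#include <string>

// Copied from the listing above so the sketch compiles without RE/flex.
#define REFLEX_NONCHAR (0x200000)
#define REFLEX_NONCHAR_UTF8 "\xf8\x88\x80\x80\x80"

// Hypothetical helper: encode a value in 0x200000..0x3FFFFFF with the obsolete
// five-byte form 111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx.
std::string encode_5byte(int c)
{
  std::string s;
  s += static_cast<char>(0xF8 | ((c >> 24) & 0x03)); // lead byte, top 2 bits
  s += static_cast<char>(0x80 | ((c >> 18) & 0x3F)); // continuation bytes,
  s += static_cast<char>(0x80 | ((c >> 12) & 0x3F)); // 6 payload bits each
  s += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
  s += static_cast<char>(0x80 | (c & 0x3F));
  return s;
}

// Asserts that the five-byte encoding of REFLEX_NONCHAR is the macro string.
void check_nonchar_macros()
{
  assert(encode_5byte(REFLEX_NONCHAR) == REFLEX_NONCHAR_UTF8);
}
```

Since 0x200000 >> 18 is 8 and all other bit groups are zero, the bytes come out as F8 88 80 80 80. No valid UTF-8 input can decode to this value, which is what makes it an unambiguous error marker.
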
std::string reflex::latin1(int a, int b, int esc = 'x', bool brackets = true)
Convert an 8-bit ASCII + Latin-1 Supplement range [a,b] to a regex pattern.

std::string reflex::utf8(int a, int b, int esc = 'x', const char *par = "(", bool strict = true)
Convert a UCS-4 range [a,b] to a UTF-8 regex pattern.

size_t reflex::utf8(int c, char *s)
Convert the UCS-4 code point c to UTF-8 in s. Fills s with REFLEX_NONCHAR_UTF8 when c is out of range, or with unrestricted UTF-8 with WITH_UTF8_UNRESTRICTED.

int reflex::utf8(const char *s, const char **r = NULL)
Convert UTF-8 to UCS. Returns REFLEX_NONCHAR for invalid UTF-8, except that MUTF-8 U+0000 and the surrogate halves 0xD800-0xDFFF are not treated as invalid (use WITH_UTF8_UNRESTRICTED to remove any limits on UTF-8 encodings up to 6 bytes).

std::wstring reflex::wcs(const char *s, size_t n)
Convert a UTF-8 string to a wide string.

std::wstring reflex::wcs(const std::string &s)
Convert a UTF-8 string to a wide string.

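To make the contract of the two single-code-point converters concrete, here is a minimal standalone sketch of an encoder and a decoder with the same parameter shapes. It is not the RE/flex implementation. The names `encode_utf8`, `decode_utf8`, `kNonChar`, and `self_check` are hypothetical, the return value of 5 for out-of-range input is an assumption, and the decoder is stricter than the documented one: it rejects both the MUTF-8 form of U+0000 and surrogate-free overlong forms, and it does not model WITH_UTF8_UNRESTRICTED.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Sentinel value, same as REFLEX_NONCHAR in the listing above.
const int kNonChar = 0x200000;

// Hypothetical encoder: writes code point c to s as UTF-8 and returns the
// byte count. s needs room for 5 bytes because values outside 0..0x10FFFF
// are written as the 5-byte REFLEX_NONCHAR_UTF8 sequence.
// Surrogate halves are not rejected here.
std::size_t encode_utf8(int c, char *s)
{
  if (c < 0 || c > 0x10FFFF)
  {
    std::memcpy(s, "\xf8\x88\x80\x80\x80", 5);
    return 5; // assumed return value for the out-of-range case
  }
  if (c < 0x80)
  {
    s[0] = static_cast<char>(c);
    return 1;
  }
  static const int lead[] = {0, 0, 0xC0, 0xE0, 0xF0};
  std::size_t len = c < 0x800 ? 2 : c < 0x10000 ? 3 : 4;
  // Fill continuation bytes from the last down to the second, 6 bits each.
  for (std::size_t i = len - 1; i > 0; --i)
  {
    s[i] = static_cast<char>(0x80 | (c & 0x3F));
    c >>= 6;
  }
  s[0] = static_cast<char>(lead[len] | c); // marker bits plus remaining payload
  return len;
}

// Hypothetical decoder: returns the code point encoded at s (a non-empty
// NUL-terminated string), or kNonChar when the bytes are not valid UTF-8.
// If r is non-null, *r is set just past the bytes that were consumed.
int decode_utf8(const char *s, const char **r = nullptr)
{
  const unsigned char *p = reinterpret_cast<const unsigned char*>(s);
  int c = p[0];
  std::size_t n = 1; // bytes consumed so far
  if (c >= 0x80)
  {
    // Sequence length from the lead byte; 0 marks an invalid lead byte.
    std::size_t len = c >= 0xC2 && c < 0xE0 ? 2
                    : c >= 0xE0 && c < 0xF0 ? 3
                    : c >= 0xF0 && c < 0xF5 ? 4 : 0;
    static const int min_value[] = {0, 0, 0x80, 0x800, 0x10000};
    bool ok = len != 0;
    if (ok)
      c &= 0xFF >> (len + 1); // keep only the payload bits of the lead byte
    for (; ok && n < len; ++n)
    {
      if ((p[n] & 0xC0) != 0x80) // not a continuation byte (or end of string)
      {
        ok = false;
        break;
      }
      c = (c << 6) | (p[n] & 0x3F);
    }
    // Reject malformed, overlong, and out-of-range sequences.
    if (!ok || c < min_value[len] || c > 0x10FFFF)
      c = kNonChar;
  }
  if (r != nullptr)
    *r = s + n;
  return c;
}

// Usage: round-trip U+20AC and show that bad input yields the sentinel.
void self_check()
{
  char buf[5];
  assert(std::string(buf, encode_utf8(0x20AC, buf)) == "\xE2\x82\xAC");
  assert(decode_utf8("\xE2\x82\xAC") == 0x20AC);
  assert(decode_utf8("\xC0\xAF") == kNonChar); // overlong form is rejected
}
```

Because the sentinel 0x200000 is larger than any value a well-formed sequence can produce, a caller can test the decoder's result against it without risk of confusing an error with real input, which would not hold for U+FFFD.
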
- Author
- Robert van Engelen - engelen@genivia.com
- Copyright
- (c) 2016-2020, Robert van Engelen, Genivia Inc. All rights reserved.
- (c) BSD-3 License - see LICENSE.txt