Migration guide v0.4 -> v1.0
This guide outlines the breaking changes between 0.4 and 1.0, and how to migrate to 1.0.
Format strings
The new syntax for format string specifiers is
[[fill]align][width][L][type]
.
This means, that the old localization flag l
has been superseded by L
, that must come in a specific point in the format string.
“Type” encompasses all type-specific options.
Integers
n
is now used to accept locale-specific digits, implying L
.
b
now means binary. Custom base can be specified with Bnn
.
i
no longer requires the argument to be signed. It is now the only option, which detects the base of the number based on the input.
u
no longer detects the base of the number, but expects it to be decimal. It also does not accept any preceding sign.
The default for integers is now d
(decimal), not i
/u
(detect base).
Accepted base prefixes now include 0b
and 0B
for binary, and 0o
and 0O
for octal.
v0.4 |
v1.0 |
---|---|
|
|
|
|
(default) |
|
|
|
|
|
|
|
|
|
(1): i
in 1.0 accepts both signed and unsigned values
Floats
a
, A
, e
, E
, f
, F
, g
, and G
had no effect in 0.4.
In 1.0 they each have a meaning, see API docs for more.
To get the old behavior, replace the above flag with g
or G
.
This is also the same behavior if no type flag is given.
v0.4 |
v1.0 |
---|---|
|
|
(default) |
(default, no changes) |
Buffers/spans
Spans now behave like strings, except they have a built-in size limitation in them.
In 0.4, reading a span would just mean reading as many bytes as is the size of the span.
// 0.4
std::vector<char> vec(5, '\0');
auto span = scn::make_span(vec);
auto result = scn::scan("ab cde", "{}", span);
// result.range() == "e"
// vec == "ab cd"
In 1.0, reading a span is like reading a string, except it can’t grow
// 1.0, same code
std::vector<char> vec(5, '\0');
auto span = scn::make_span(vec);
auto result = scn::scan("ab cde", "{}", span);
// result.range() == " cde"
// vec == "ab"
To achieve the same behavior, use the new width field in the format string, alongside the new string set functionality:
// 1.0, same behavior
std::vector<char> vec(5, '\0');
auto span = scn::make_span(vec);
// max 5 chars, accept all chars
auto result = scn::scan("ab cde", "{:5[:all:]}", span);
// result.range() == "e"
// vec == "ab cd"
Or, if the range is direct, just use std::copy
:
// 1.0, same behavior
std::vector<char> vec(5, '\0');
auto span = scn::make_span(vec);
auto source = std::string{"ab cde"};
auto it = std::copy(source.begin(), source.end(), span.begin());
// it = "e"
// vec == "ab cd"
Whitespace skipping
In 0.4, the rules whether whitespace was to be skipped before a value were based on the format string, and they were complicated.
In 1.0, whitespace is always skipped, except when scanning characters or a code_point, or when using string set scanning with strings (the [option]
)
Encoding
In 0.4, narrow strings (format strings, ranges) were assumed to be ASCII, and wide strings whatever the current locale said.
In 1.0, all strings are assumed to be Unicode, regardless of locale: narrow strings are UTF-8, wide strings are UTF-16 (Windows) or UTF-32 (POSIX).
Having the global locale, or the locale passed to scn::scan_localized
, be something other than UTF-8, is unsupported.
Lists
scn::scan_list
and scn::scan_list_until
are replaced by scn::scan_list
and scn::scan_list_ex
, augmented by scn::scan_list_options
and scn::scan_list_localized
.
scn::scan_list
without a separator and an until-character work the same in 0.4 and 1.0.
If a separator and/or an until-character support is desired, scn::scan_list_ex
is to be used.
It takes an instance of a scn::scan_list_options
, which can be made with scn::list_separator
, scn::list_until
, and scn::list_separator_and_until
std::vector<int> vec{};
// basic scan_list staying the same
// 0.4
auto result = scn::scan_list("123 456", vec);
// 1.0
auto result = scn::scan_list("123 456", vec);
// vec == [123, 456]
// result.empty() == true
// separator
// 0.4
auto result = scn::scan_list("123, 456", vec);
// 1.0
auto result = scn::scan_list_ex("123, 456", scn::list_separator(','));
// vec == [123, 456]
// result.empty() == true
// until
// 0.4
auto result = scn::scan_list_until("123 456\n789", vec, '\n');
// 1.0
auto result = scn::scan_list_ex("123 456\n789", vec, scn::list_until('\n'));
// vec == [123, 456]
// result.range() == "789"
// separator + until
// 0.4
auto result = scn::scan_list_until("123, 456,\n789", vec, '\n', ',');
// 1.0
auto result = scn::scan_list_ex("123, 456,\n789", vec, scn::list_separator_and_until(',', '\n'));
// vec == [123, 456]
// result.range() == "789"
Remaining range in results
Following member functions in result
were renamed:
string()
->range_as_string()
span()
->range_as_span()
string_view()
->range_as_string_view()
Other, more minor stuff
vscan
signature
The signature of vscan
was changed: stuff previously done in scan
is now done in vscan
.
In 0.4, vscan
had the signature (Context&, ParseContext&, args) -> error
.
In 1.0, vscan
has the signature (WrappedRange, format_string, args) -> vscan_result
Newly stabilized scn::wrap
, scn::make_context
, and scn::make_parse_context
will help with implementing new scan
and vscan
functions.
Implementing scan
was not really possible in 0.4 without using internal APIs, but now it is.
Its functionality was also split into vscan
, vscan_default
and vscan_localized
.
// scan in 1.0
template <typename Range, typename... Args>
auto scan(Range&& r, string_view f, Args&... a) {
auto range = scn::wrap(std::forward<Range>(r));
auto args = scn::make_args_for(range, f, a...);
auto ret = scn::vscan(std::move(range), f, {args});
return scn::make_scan_result(std::move(ret));
}
read_char()
-> read_code_unit()
read_char
was renamed to better reflect the Unicode-awareness of the library.
Its behavior is identical.
scn::scan_usertype
While not a breaking change, scn::scan_usertype
increases the ergonomics of scanning a user-defined type. See the API docs for more.