Source release 19.1.0
226
third_party/libcppbor/README.md
vendored
Normal file
@@ -0,0 +1,226 @@
LibCppBor: A Modern C++ CBOR Parser and Generator
==============================================

TODO(b/254108623):
This is a modified version of LibCppBor that is C++14 compliant. The released
version can be found at
https://android.googlesource.com/platform/external/libcppbor, and it requires
C++17. This note is a reminder to refresh the library from the latest source
above once we officially move to C++17.

LibCppBor provides a natural and easy-to-use syntax for constructing and
parsing CBOR messages. It does not (yet) support all features of
CBOR, nor (yet) support validation against CDDL schemata, though both
are planned. CBOR features that aren't supported include:

* Indefinite length values
* Semantic tagging
* Floating point

Upstream LibCppBor requires C++17; this modified version is C++14 compliant.

## CBOR representation

LibCppBor represents CBOR data items as instances of the `Item` class or,
more precisely, as instances of subclasses of `Item`, since `Item` is a
pure interface. The subclasses of `Item` correspond almost one-to-one
with CBOR major types, and are named to match the CDDL names to which
they correspond. They are:

* `Uint` corresponds to major type 0, and can hold unsigned integers
  up through (2^64 - 1).
* `Nint` corresponds to major type 1. It can only hold values from -1
  to -(2^63 - 1), since its internal representation is an `int64_t`.
  This can be fixed, but it seems unlikely that applications will need
  the omitted range from -(2^63) to -(2^64), since it's
  inconvenient to represent those values in many programming languages.
* `Int` is an abstract base of `Uint` and `Nint` that facilitates
  working with all signed integers representable with `int64_t`.
* `Bstr` corresponds to major type 2, a byte string.
* `Tstr` corresponds to major type 3, a text string.
* `Array` corresponds to major type 4, an Array. It holds a
  variable-length array of `Item`s.
* `Map` corresponds to major type 5, a Map. It holds a
  variable-length array of pairs of `Item`s.
* `Simple` corresponds to major type 7. It's an abstract class since
  items require a more specific type.
* `Bool` is the only currently-implemented subclass of `Simple`.

Note that major type 6, semantic tag, is not yet implemented.
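
The split between `Uint` and `Nint` follows the CBOR rule that major type 0 carries the value itself, while major type 1 carries -1 minus the value. A minimal sketch of that mapping for an `int64_t` input is shown here. `IntParts` and `splitInt` are hypothetical names used only for illustration and are not part of LibCppBor:

```cpp
#include <cstdint>

// Hypothetical helper type: the CBOR major type plus the unsigned
// argument that would follow it in the encoded header.
struct IntParts {
    uint8_t majorType;  // 0 for non-negative values, 1 for negative values
    uint64_t argument;  // v for major type 0, (-1 - v) for major type 1
};

// Splits a signed 64-bit value the way CBOR splits integers.
IntParts splitInt(int64_t v) {
    if (v >= 0) {
        return IntParts{0, static_cast<uint64_t>(v)};
    }
    // -(v + 1) equals -1 - v and cannot overflow, even for INT64_MIN.
    return IntParts{1, static_cast<uint64_t>(-(v + 1))};
}
```

For example, `splitInt(-100)` yields major type 1 with argument 99, and `splitInt(-1)` yields major type 1 with argument 0.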

In practice, users of LibCppBor will rarely use most of these classes
when generating CBOR encodings. This is because LibCppBor provides
straightforward conversions from the obvious normal C++ types.
Specifically, the following conversions are provided in appropriate
contexts:

* Signed and unsigned integers convert to `Uint` or `Nint`, as
  appropriate.
* `std::string`, `std::string_view`, `const char*` and
  `std::pair<char iterator, char iterator>` convert to `Tstr`.
* `std::vector<uint8_t>`, `std::pair<uint8_t iterator, uint8_t iterator>`
  and `std::pair<uint8_t*, size_t>` convert to `Bstr`.
* `bool` converts to `Bool`.
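
The conversion rules above can be pictured as a set of overloads that map each C++ argument type to an item kind. The following is only a sketch of that idea under assumed names (`Kind`, `kindOf`); it is not how LibCppBor is implemented, and it covers only some of the listed source types:

```cpp
#include <cstdint>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical names throughout; this only mirrors the conversion rules
// listed above.
enum class Kind { Uint, Nint, Tstr, Bstr, Bool };

inline Kind kindOf(bool) { return Kind::Bool; }
inline Kind kindOf(const char*) { return Kind::Tstr; }
inline Kind kindOf(const std::string&) { return Kind::Tstr; }
inline Kind kindOf(const std::vector<uint8_t>&) { return Kind::Bstr; }

// Any integer type other than bool: the sign of the value picks the kind.
template <typename T,
          typename = std::enable_if_t<std::is_integral<T>::value &&
                                      !std::is_same<T, bool>::value>>
Kind kindOf(T v) {
    // The is_signed check short-circuits for unsigned types, so large
    // unsigned values are never misread as negative.
    return (std::is_signed<T>::value && static_cast<int64_t>(v) < 0)
               ? Kind::Nint
               : Kind::Uint;
}
```

The explicit `const char*` overload matters in this sketch: without it, a string literal would prefer the standard pointer-to-`bool` conversion over the user-defined conversion to `std::string`, and would be classified as `Bool`.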

## CBOR generation

### Complete tree generation

The set of `encode` methods in `Item` provides the interface for
producing encoded CBOR. The basic process for "complete tree"
generation (as opposed to "incremental" generation, which is discussed
below) is to construct an `Item` which models the data to be encoded,
and then call one of the `encode` methods, whichever is convenient for
the encoding destination. A trivial example:

```
cppbor::Uint val(0);
std::vector<uint8_t> encoding = val.encode();
```

It's relatively rare that single values are encoded as above. More often, the
"root" data item will be an `Array` or `Map` which contains a more complex
structure. For example:

```
using cppbor::Map;
using cppbor::Array;

std::vector<uint8_t> vec = // ...
Map val("key1", Array(Map("key_a", 99, "key_b", vec), "foo"), "key2", true);
std::vector<uint8_t> encoding = val.encode();
```

This creates a map with two entries, with `Tstr` keys "key1" and
"key2", respectively. The "key1" entry has as its value an
`Array` containing a `Map` and a `Tstr`. The "key2" entry has a
`Bool` value.

This example demonstrates how automatic conversion of C++ types to
LibCppBor `Item` subclass instances is done. Where the caller provides a
C++ or C string, a `Tstr` entry is added. Where the caller provides
an integer literal or variable, a `Uint` or `Nint` is added, depending
on whether the value is non-negative or negative.

As an alternative, a more fluent-style API is provided for building up
structures. For example:

```
using cppbor::Map;
using cppbor::Array;

std::vector<uint8_t> vec = // ...
Map val;
val.add("key1", Array().add(Map().add("key_a", 99).add("key_b", vec)).add("foo")).add("key2", true);
std::vector<uint8_t> encoding = val.encode();
```

An advantage of this interface over the constructor-based creation
approach above is that it need not be done all at once. The `add`
methods return a reference to the object that was added to, which
allows calls to be chained, but chaining is not necessary; calls can
be made sequentially, as the data to add becomes available.
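
The chaining behaviour described above comes from `add` returning a reference to the container itself. A minimal sketch of that pattern, using a hypothetical `StringList` class rather than LibCppBor's `Array` or `Map`, might look like this:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a container item; not LibCppBor's Array.
class StringList {
  public:
    // Returning *this by reference is what makes add(...).add(...) work.
    StringList& add(const std::string& s) {
        items_.push_back(s);
        return *this;
    }
    std::size_t size() const { return items_.size(); }

  private:
    std::vector<std::string> items_;
};

// Chained calls: everything is added in one expression.
inline StringList buildChained() {
    StringList list;
    list.add("a").add("b").add("c");
    return list;
}

// Sequential calls: each add can happen whenever its data is available.
inline StringList buildSequential() {
    StringList list;
    list.add("a");
    list.add("b");
    list.add("c");
    return list;
}
```

Both functions produce a list with the same three entries; the sequential form is the one that suits data arriving over time.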

#### `encode` methods

There are several variations of `Item::encode`, all of which
accomplish the same task but output the encoded data in different
ways, and with somewhat different performance characteristics. The
provided options are:

* `bool encode(uint8_t** pos, const uint8_t* end)` encodes into the
  buffer referenced by the range [`*pos`, `end`). `*pos` is advanced
  as bytes are written. If the encoding runs out of buffer space before
  finishing, the method returns false. This is the most efficient way
  to encode when the buffer is already allocated.
* `void encode(EncodeCallback encodeCallback)` calls `encodeCallback`
  for each encoded byte. It's the responsibility of the implementor
  of the callback to behave safely in the event that the output buffer
  (if applicable) is exhausted. This is less efficient than the prior
  method because it imposes an additional function call for each byte.
* `template </*...*/> void encode(OutputIterator i)`
  encodes into the provided iterator. SFINAE ensures that the
  template doesn't match for non-iterators. The implementation
  actually uses the callback-based method, plus has whatever overhead
  the iterator adds.
* `std::vector<uint8_t> encode()` creates a new `std::vector`, reserves
  sufficient capacity to hold the encoding, and inserts the encoded
  bytes with a `std::back_insert_iterator` and the previous method.
* `std::string toString()` does the same as the previous method, but
  returns a string instead of a vector.
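
The difference between the buffer-based and callback-based styles can be sketched without the library. The functions below (`emitToBuffer`, `emitToCallback`, `emitToVector`) are hypothetical illustrations of the calling conventions described in the list, operating on bytes that are assumed to be already encoded. One assumption in this sketch is that the buffer variant checks for space up front and writes nothing on failure; the text above does not say what LibCppBor leaves in the buffer in that case.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical stand-in for an item's already-encoded bytes.
using Bytes = std::vector<uint8_t>;

// Buffer style: write into [*pos, end), advance *pos, and report
// failure if the remaining space is too small.
bool emitToBuffer(const Bytes& encoded, uint8_t** pos, const uint8_t* end) {
    if (static_cast<std::size_t>(end - *pos) < encoded.size()) {
        return false;  // Assumption of this sketch: nothing is written.
    }
    for (uint8_t b : encoded) {
        *(*pos)++ = b;
    }
    return true;
}

// Callback style: one call per byte, so the callee owns all bounds checks.
void emitToCallback(const Bytes& encoded,
                    const std::function<void(uint8_t)>& callback) {
    for (uint8_t b : encoded) {
        callback(b);
    }
}

// Vector style, layered on the callback style as the text describes.
Bytes emitToVector(const Bytes& encoded) {
    Bytes out;
    out.reserve(encoded.size());
    emitToCallback(encoded, [&out](uint8_t b) { out.push_back(b); });
    return out;
}
```

Because `*pos` is advanced on each successful call, several items can be written back to back into one buffer by passing the same `pos` repeatedly.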

### Incremental generation

Incremental generation requires deeper understanding of CBOR, because
the library can't do as much to ensure that the output is valid. The
basic tool for incremental generation is the `encodeHeader`
function. There are two variations, one which writes into a buffer,
and one which uses a callback. Both simply write out the bytes of a
header. To construct the same map as in the above examples,
incrementally, one might write:

```
using namespace cppbor; // For example brevity

// vec is the std::vector<uint8_t> from the examples above.
std::vector<uint8_t> encoding;
auto iter = std::back_inserter(encoding);
encodeHeader(MAP, 2 /* # of map entries */, iter);
std::string s = "key1";
encodeHeader(TSTR, s.size(), iter);
std::copy(s.begin(), s.end(), iter);
encodeHeader(ARRAY, 2 /* # of array entries */, iter);
Map().add("key_a", 99).add("key_b", vec).encode(iter);
s = "foo";
encodeHeader(TSTR, s.size(), iter);
std::copy(s.begin(), s.end(), iter);
s = "key2";
encodeHeader(TSTR, s.size(), iter);
std::copy(s.begin(), s.end(), iter);
encodeHeader(SIMPLE, TRUE, iter);
```

As the above example demonstrates, the styles can be mixed. Note the
creation and encoding of the inner `Map` using the fluent style.
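
To make the byte-level result of incremental generation concrete, here is a self-contained sketch that writes the same structure by hand. `writeHeader`, `writeText` and `buildExample` are hypothetical helpers, not LibCppBor's `encodeHeader`, and the two-byte value chosen for `vec` is an arbitrary assumption. The header layout (major type in the top three bits, shortest-form argument in big-endian order) follows the CBOR specification.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: appends a CBOR header for the given major type
// (0..7) and argument, using the shortest encoding of the argument.
void writeHeader(uint8_t major, uint64_t arg, std::vector<uint8_t>& out) {
    const uint8_t top = static_cast<uint8_t>(major << 5);
    if (arg < 24) {
        out.push_back(static_cast<uint8_t>(top | arg));
        return;
    }
    int extraBytes = 8;
    uint8_t info = 27;
    if (arg <= 0xFF) { extraBytes = 1; info = 24; }
    else if (arg <= 0xFFFF) { extraBytes = 2; info = 25; }
    else if (arg <= 0xFFFFFFFF) { extraBytes = 4; info = 26; }
    out.push_back(static_cast<uint8_t>(top | info));
    for (int i = extraBytes - 1; i >= 0; --i) {
        out.push_back(static_cast<uint8_t>(arg >> (8 * i)));  // big-endian
    }
}

// Hypothetical helper: text string header (major type 3) plus its bytes.
void writeText(const std::string& s, std::vector<uint8_t>& out) {
    writeHeader(3, s.size(), out);
    out.insert(out.end(), s.begin(), s.end());
}

// Builds {"key1": [{"key_a": 99, "key_b": h'0102'}, "foo"], "key2": true}.
std::vector<uint8_t> buildExample() {
    const std::vector<uint8_t> vec = {0x01, 0x02};  // assumed payload
    std::vector<uint8_t> out;
    writeHeader(5, 2, out);           // outer map with 2 entries
    writeText("key1", out);
    writeHeader(4, 2, out);           // array with 2 elements
    writeHeader(5, 2, out);           // inner map with 2 entries
    writeText("key_a", out);
    writeHeader(0, 99, out);          // unsigned integer 99
    writeText("key_b", out);
    writeHeader(2, vec.size(), out);  // byte string header
    out.insert(out.end(), vec.begin(), vec.end());
    writeText("foo", out);
    writeText("key2", out);
    out.push_back(0xF5);              // simple value "true"
    return out;
}
```

Nothing in this style checks that the declared entry counts match what is actually written, which is why incremental generation demands more care than building a complete tree.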

## Parsing

LibCppBor also supports parsing of encoded CBOR data, with the same
feature set as encoding. There are two basic approaches to parsing,
"full" and "stream".

### Full parsing

Full parsing means completely parsing a (possibly-compound) data
item from a byte buffer. The `parse` functions that do not take a
`ParseClient` pointer do this. They return a `ParseResult` which is a
tuple of three values:

* `std::unique_ptr<Item>` that points to the parsed item, or is `nullptr`
  if there was a parse error.
* `const uint8_t*` that points to the byte after the end of the decoded
  item, or to the first unparseable byte in the event of an error.
* `std::string` that is empty on success or contains an error message if
  a parse error occurred.

Assuming a successful parse, you can then use `Item::type()` to
discover the type of the parsed item (e.g. MAP), and then use the
appropriate `Item::as*()` method (e.g. `Item::asMap()`) to get a
pointer to an interface which allows you to retrieve specific values.
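
The three-value result shape can be illustrated with a small stand-alone parser that handles only unsigned integers (major type 0). Everything here is a hypothetical sketch: `UintParseResult`, `failAt` and `parseUintSketch` are invented names, the parsed value is a plain `uint64_t` rather than an `Item`, and the choice of which pointer to return on error is this sketch's own.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <tuple>

// Hypothetical result type mirroring the tuple described above:
// parsed value (or nullptr), position after the item, error message.
using UintParseResult =
    std::tuple<std::unique_ptr<uint64_t>, const uint8_t*, std::string>;

inline UintParseResult failAt(const uint8_t* at, const char* message) {
    return std::make_tuple(std::unique_ptr<uint64_t>(), at,
                           std::string(message));
}

// Parses one definite-length unsigned integer from [begin, end).
UintParseResult parseUintSketch(const uint8_t* begin, const uint8_t* end) {
    if (begin == end) return failAt(begin, "empty buffer");
    const uint8_t initial = *begin;
    if ((initial >> 5) != 0) return failAt(begin, "not an unsigned integer");

    const uint8_t info = initial & 0x1F;
    uint64_t value = info;  // values below 24 live in the header byte
    const uint8_t* pos = begin + 1;
    if (info >= 24) {
        if (info > 27) return failAt(begin, "unsupported header form");
        const std::size_t n = std::size_t{1} << (info - 24);  // 1, 2, 4 or 8
        if (static_cast<std::size_t>(end - pos) < n) {
            return failAt(end, "truncated integer");
        }
        value = 0;
        for (std::size_t i = 0; i < n; ++i) {
            value = (value << 8) | *pos++;  // big-endian argument
        }
    }
    return std::make_tuple(std::make_unique<uint64_t>(value), pos,
                           std::string());
}
```

A caller would check the first element against `nullptr` before using it, and can resume parsing the next item from the returned pointer.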

### Stream parsing

Stream parsing is more complex, but more flexible. To use
stream parsing, you must create your own subclass of `ParseClient` and
call one of the `parse` functions that accepts it. See the
`ParseClient` method docstrings for details.

One unusual feature of stream parsing is that the `ParseClient`
callback methods not only provide the parsed `Item`, but also pointers
to the portion of the buffer that encodes that `Item`. This is useful
if, for example, you want to find an element inside of a structure,
and then copy the encoding of that sub-structure, without bothering to
parse the rest.
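
The idea of handing the client pointers into the original buffer can be sketched as follows. This is not the `ParseClient` interface: `SpanClient`, `CollectSpans`, `readHeader` and `walkFlatArray` are hypothetical names, and the walker is deliberately limited to a top-level array whose elements are integers or definite-length strings.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical callback interface (LibCppBor's real one is ParseClient).
class SpanClient {
  public:
    virtual ~SpanClient() = default;
    // Called once per element with the byte range [begin, end) encoding it.
    virtual void element(uint8_t majorType, const uint8_t* begin,
                         const uint8_t* end) = 0;
};

// Example client that just records each element's byte range.
struct CollectSpans : SpanClient {
    std::vector<std::pair<const uint8_t*, const uint8_t*>> spans;
    void element(uint8_t, const uint8_t* begin, const uint8_t* end) override {
        spans.emplace_back(begin, end);
    }
};

// Decodes a definite-length header at p; returns false if it is malformed.
inline bool readHeader(const uint8_t* p, const uint8_t* end, uint8_t* major,
                       uint64_t* arg, std::size_t* hdrLen) {
    if (p == end) return false;
    *major = static_cast<uint8_t>(*p >> 5);
    const uint8_t info = *p & 0x1F;
    if (info < 24) { *arg = info; *hdrLen = 1; return true; }
    if (info > 27) return false;
    const std::size_t n = std::size_t{1} << (info - 24);
    if (static_cast<std::size_t>(end - p) < 1 + n) return false;
    uint64_t v = 0;
    for (std::size_t i = 0; i < n; ++i) v = (v << 8) | p[1 + i];
    *arg = v;
    *hdrLen = 1 + n;
    return true;
}

// Walks a flat array and reports each element's encoded range.
bool walkFlatArray(const uint8_t* begin, const uint8_t* end,
                   SpanClient* client) {
    uint8_t major = 0;
    uint64_t count = 0;
    std::size_t hdrLen = 0;
    if (!readHeader(begin, end, &major, &count, &hdrLen) || major != 4) {
        return false;
    }
    const uint8_t* p = begin + hdrLen;
    for (uint64_t i = 0; i < count; ++i) {
        uint64_t arg = 0;
        if (!readHeader(p, end, &major, &arg, &hdrLen)) return false;
        std::size_t len = hdrLen;
        if (major == 2 || major == 3) {  // strings carry a payload
            if (arg > static_cast<uint64_t>(end - p) - hdrLen) return false;
            len += static_cast<std::size_t>(arg);
        } else if (major > 1) {
            return false;  // nested or simple items are out of scope here
        }
        client->element(major, p, p + len);
        p += len;
    }
    return true;
}
```

With the recorded ranges, a caller can copy the raw encoding of a single element out of the buffer without building an object for any of the others.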

The full parser is implemented with the stream parser.

### Disclaimer

This is not an officially supported Google product.