CBOR is a compact binary data serialization and messaging format. This specification defines CBOR-LD 1.0, a CBOR-based format to serialize Linked Data. The encoding is designed to leverage the existing JSON-LD ecosystem, which is deployed on hundreds of millions of systems today, to provide a compact serialization format for those seeking efficient encoding schemes for Linked Data. Semantic compression schemes can achieve compression ratios more than 60% better than those of general-purpose compression schemes. This format is primarily intended for using Linked Data in storage- and bandwidth-constrained programming environments, for building interoperable semantic wire-level protocols, and for efficiently storing Linked Data in CBOR-based storage engines.
This document is experimental.
There is a reference implementation that is capable of demonstrating the features described in this document.
This document is a detailed specification for a serialization of Linked Data in CBOR. The document is primarily intended for the following audiences:
There are a number of ways that one may participate in the development of this specification:
CBOR-LD satisfies the following design goals:
Similarly, the following are non-goals.
The following minefields have been identified while working on this specification:
The general CBOR-LD encoding algorithm takes a JSON-LD Document and does the following:
The first step in decoding a CBOR-LD payload is to recreate the term codec map that was used to encode it by processing the contexts in the payload. However, the contexts needed to create the term codec map can have their URLs encoded as integers by CBOR-LD. If a CBOR-LD payload contains context URLs compressed in such a way, the consumer of the CBOR-LD needs to know what compression tables (maps from JSON-LD terms to integers) were used to compress the context URLs during creation to be able to reconstruct the term codec map. The following sections define the exact mechanism by which this can be accomplished, allowing an arbitrary CBOR-LD consumer to decompress any CBOR-LD payload that conforms to this specification.
To this end, this specification uses the CBOR tag `0xCB1D` (tag value 51997) to identify CBOR-LD payloads. The data that follows this tag value is used to look up which compression table(s) are needed to decompress the CBOR-LD context URLs.
This exact tag value has not yet been officially registered with the IANA CBOR Tag Registry. The exact value is subject to change.
So that a single CBOR tag value can serve an unbounded number of CBOR-LD use cases, each of which may require different compression table material, we define the following.
CBOR-LD payloads MUST be structured such that the item tagged with tag `0xCB1D` is a two-element array, and the first element MUST be a major type 0 integer. This integer is a CBOR-LD Registry Entry ID.
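The structure required above can be illustrated with a minimal sketch that hand-encodes the CBOR heads defined in RFC 8949. The helper names `encode_head` and `wrap_cborld` are hypothetical and not part of this specification, and the body is assumed to be already CBOR-encoded (an empty map here).

```python
CBORLD_TAG = 0xCB1D  # 51997; provisional per this document


def encode_head(major_type: int, value: int) -> bytes:
    """Encode a CBOR initial byte plus its argument (RFC 8949, section 3)."""
    if value < 0:
        raise ValueError("argument must be non-negative")
    prefix = major_type << 5
    if value < 24:
        # Small arguments are stored directly in the initial byte.
        return bytes([prefix | value])
    for additional_info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if value < (1 << (8 * size)):
            return bytes([prefix | additional_info]) + value.to_bytes(size, "big")
    raise ValueError("argument does not fit in 64 bits")


def wrap_cborld(registry_entry_id: int, encoded_body: bytes) -> bytes:
    """Hypothetical helper: build tag(0xCB1D, [registry_entry_id, body])."""
    return (
        encode_head(6, CBORLD_TAG)            # major type 6: tag
        + encode_head(4, 2)                   # major type 4: array of two items
        + encode_head(0, registry_entry_id)   # major type 0: unsigned integer
        + encoded_body                        # second array element
    )


empty_map = bytes([0xA0])  # CBOR encoding of {}
payload = wrap_cborld(1, empty_map)
print(payload.hex())  # d9cb1d8201a0
```

The first three bytes, `d9 cb 1d`, are the tag; `82` opens the two-element array; `01` is the CBOR-LD Registry Entry ID.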
The value of the CBOR-LD Registry Entry ID is then used to look up a CBOR-LD Registry Entry in the CBOR-LD Registry.
The CBOR-LD Registry is a global list that provides consumers of CBOR-LD payloads the information they need to reconstruct the term codec map required for decompression. A CBOR-LD Registry Entry contains the following:
The `typeTables` associated with a CBOR-LD Registry Entry MUST be an array of URLs or JSON objects. The only exception is the string "callerProvidedTable", which may appear in this array to denote that the use case requires a type table that is not globally defined.
Dereferencing one of these URLs MUST result in a JSON object with the following properties:
If a JSON object is present in the `typeTables` array, it MUST be in the above format.
The following is the current CBOR-LD registry:
Registry Entry Id | Use Case | Processing Model | Provisional | typeTables |
---|---|---|---|---|
0 | Uncompressed CBORLD | DEFAULT | No | None |
1 | Compressed CBORLD, default use case. | DEFAULT | No | DEFAULT |
100 | Verifiable Credential Barcodes Specification Test Vectors | DEFAULT | Yes | [ { type: "context", table: { "https://www.w3.org/ns/credentials/v2": 32768, "https://w3id.org/vc-barcodes/v1": 32769, "https://w3id.org/utopia/v2": 32770 } }, { type: "https://w3id.org/security#cryptosuiteString", table: { "ecdsa-rdfc-2019": 1, "ecdsa-sd-2023": 2, "eddsa-rdfc-2022": 3, "ecdsa-xi-2023": 4 } } ] |
10001 | Provisional California DMV Credentials | DEFAULT | Yes | [ { type: "context", table: { "https://www.w3.org/ns/credentials/v2": 1, "https://w3id.org/vc-barcodes/v1": 2, "https://w3id.org/vc-dpp/v1rc1": 3, "https://w3id.org/vdl/v1": 4 } }, { type: "https://w3id.org/security#cryptosuiteString", table: { "ecdsa-rdfc-2019": 1 } }, { type: "url", table: { "did:key:zDnaeW9VZZs7NH1ykvS5EMFmdodu2wj4dPcrV3DzTAadrXJee": 1, "did:key:zDnaeW9VZZs7NH1ykvS5EMFmdodu2wj4dPcrV3DzTAadrXJee#zDnaeW9VZZs7NH1ykvS5EMFmdodu2wj4dPcrV3DzTAadrXJee": 2, "https://dmv.ca.gov/statuses/12345/status-lists": 3 } } ] |
10002 | Provisional First Responder Credentials | DEFAULT | Yes | [ { type: "context", table: { "https://www.w3.org/ns/credentials/v2": 1, "https://w3id.org/vc-barcodes/v1": 2, "https://w3id.org/first-responder/sap/v1rc1": 3, "https://w3id.org/first-responder/v1": 4, "https://w3id.org/first-responder/v2rc1": 5 } }, { type: "https://w3id.org/security#cryptosuiteString", table: { "ecdsa-rdfc-2019": 1 } }, { type: "url", table: { "did:key:zDnaeW9VZZs7NH1ykvS5EMFmdodu2wj4dPcrV3DzTAadrXJee": 1, "did:key:zDnaeW9VZZs7NH1ykvS5EMFmdodu2wj4dPcrV3DzTAadrXJee#zDnaeW9VZZs7NH1ykvS5EMFmdodu2wj4dPcrV3DzTAadrXJee": 2, "https://caloes.ca.gov/statuses/12345/status-lists": 3 } } ] |
31000000 | California DL/ID Barcodes | DEFAULT | Yes | [ { type: "context", table: { "https://www.w3.org/ns/credentials/v2": 1, "https://w3id.org/vc-barcodes/v1": 2 } }, { type: "https://w3id.org/security#cryptosuiteString", table: { "ecdsa-xi-2023": 1 } }, { type: "url", table: { "did:web:credentials.dmv.ca.gov": 1, "https://api.credentials.dmv.ca.gov/status/dlid/1/status-lists": 2, "https://api.credentials.dmv.ca.gov/status/dlid/2/status-lists": 3, "https://api.credentials.dmv.ca.gov/status/dlid/3/status-lists": 4, "did:web:credentials.dmv.ca.gov#vm-vcb-1": 5, "did:web:credentials.dmv.ca.gov#vm-vcb-2": 6, "did:web:credentials.dmv.ca.gov#vm-vcb-3": 7, "did:web:credentials.dmv.ca.gov#vm-vcb-4": 8, "did:web:credentials.dmv.ca.gov#vm-vcb-5": 9, "did:web:credentials.dmv.ca.gov#vm-vcb-6": 10, "did:web:credentials.dmv.ca.gov#vm-vcb-7": 11, "did:web:credentials.dmv.ca.gov#vm-vcb-8": 12, "did:web:credentials.dmv.ca.gov#vm-vcb-9": 13, "did:web:credentials.dmv.ca.gov#vm-vcb-10": 14, "did:web:credentials.dmv.ca.gov#vm-vcb-11": 15, "did:web:credentials.dmv.ca.gov#vm-vcb-12": 16, "did:web:credentials.dmv.ca.gov#vm-vcb-13": 17, "did:web:credentials.dmv.ca.gov#vm-vcb-14": 18, "did:web:credentials.dmv.ca.gov#vm-vcb-15": 19, "did:web:uat-credentials.dmv.ca.gov": 20, "https://api.uat-credentials.dmv.ca.gov/status/dlid/1/status-lists": 21, "did:web:uat-credentials.dmv.ca.gov#vm-vcb-1": 22, "did:web:uat-credentials.dmv.ca.gov#vm-vcb-2": 23, "did:web:uat-credentials.dmv.ca.gov#vm-vcb-3": 24, "did:web:uat-credentials.dmv.ca.gov#vm-vcb-4": 25, "did:web:uat-credentials.dmv.ca.gov#vm-vcb-5": 26, "https://api.uat-credentials.dmv.ca.gov/status/dlid/2/status-lists": 27, "https://api.uat-credentials.dmv.ca.gov/status/dlid/3/status-lists": 28 } } ] |
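As an illustration of how a consumer might use a registry entry, the following sketch loads the `typeTables` of entry 100 from the registry above into a lookup map and inverts it for decompression. The function names and the pass-through behavior for values missing from a table are assumptions made for this example, not behavior defined by this specification.

```python
# typeTables copied from registry entry 100 above.
ENTRY_100_TYPE_TABLES = [
    {"type": "context", "table": {
        "https://www.w3.org/ns/credentials/v2": 32768,
        "https://w3id.org/vc-barcodes/v1": 32769,
        "https://w3id.org/utopia/v2": 32770,
    }},
    {"type": "https://w3id.org/security#cryptosuiteString", "table": {
        "ecdsa-rdfc-2019": 1,
        "ecdsa-sd-2023": 2,
        "eddsa-rdfc-2022": 3,
        "ecdsa-xi-2023": 4,
    }},
]


def build_type_table(entries):
    """Hypothetical helper: index the tables by their `type` value."""
    return {entry["type"]: dict(entry["table"]) for entry in entries}


def build_reverse_type_table(type_table):
    """Hypothetical helper: invert each table (integer -> original string)."""
    return {
        value_type: {number: text for text, number in table.items()}
        for value_type, table in type_table.items()
    }


def compress_value(type_table, value_type, value):
    # Assumption: values with no table entry pass through unchanged.
    if isinstance(value, str):
        return type_table.get(value_type, {}).get(value, value)
    return value


def decompress_value(reverse_type_table, value_type, value):
    # Assumption: integers with no table entry pass through unchanged.
    if isinstance(value, int):
        return reverse_type_table.get(value_type, {}).get(value, value)
    return value


type_table = build_type_table(ENTRY_100_TYPE_TABLES)
reverse_type_table = build_reverse_type_table(type_table)

compressed = compress_value(
    type_table, "context", "https://w3id.org/vc-barcodes/v1")
print(compressed)  # 32769
print(decompress_value(reverse_type_table, "context", compressed))
```

Both sides must use the same registry entry: an integer such as `1` refers to different strings under different entries.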
This algorithm takes a map `typeTable`, an integer `registryEntryId`, and a JSON-LD document `jsonldDocument` as inputs, and returns a hexadecimal string `cborldBytes`.
This algorithm takes a CBOR-LD payload `cborldBytes`, and returns a JSON-LD document `jsonldDocument`.
The algorithms in this section describe the behavior of a "converter" for abstractly converting inputs between data forms. When used in conjunction with a "strategy", such as the "compression" and "decompression" strategies defined later in this section, these algorithms can be instantiated to convert between concrete data forms. The "compression" strategy converts from JSON-LD to CBOR-LD, while the "decompression" strategy converts from CBOR-LD to JSON-LD.
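The converter and strategy split described above can be pictured with a toy sketch: a single traversal that is parameterized by a strategy deciding how each key is converted. Everything here is hypothetical, including the `Strategy` class, the `convert` function, and the term ID `100` for `name`. The sketch converts map keys only and omits state, active contexts, and value codecs, all of which the real algorithms handle.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Strategy:
    """Hypothetical strategy: supplies the direction-specific key conversion."""
    name: str
    convert_key: Callable[[Any], Any]


def convert(value: Any, strategy: Strategy) -> Any:
    """Shared traversal; only the strategy knows the conversion direction."""
    if isinstance(value, dict):
        return {
            strategy.convert_key(key): convert(item, strategy)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [convert(item, strategy) for item in value]
    return value


# Illustrative term table; the ID for "name" is made up for this example.
TERM_TO_ID = {"@type": 2, "@id": 4, "name": 100}
ID_TO_TERM = {number: term for term, number in TERM_TO_ID.items()}

COMPRESSION = Strategy("compression", lambda key: TERM_TO_ID.get(key, key))
DECOMPRESSION = Strategy("decompression", lambda key: ID_TO_TERM.get(key, key))

document = {"@id": "urn:example:1", "name": "Example", "knows": [{"name": "Other"}]}
compressed = convert(document, COMPRESSION)
restored = convert(compressed, DECOMPRESSION)
print(compressed)
print(restored == document)
```

Keeping the traversal in one place means the two directions cannot drift apart structurally; only the per-key and per-value behavior differs between strategies.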
This algorithm takes and returns a map `state`.
This algorithm takes a map `state` and a map or array of maps `inputDocuments`, and returns a map containing a map `state` and a map or array of maps `outputMaps`.
This algorithm takes maps `input`, `output`, `state`, and `activeContext` as inputs, and returns a map containing maps `state` and `output`.
This algorithm takes maps `state`, `activeContext`, `termInfo`, and values `value` and `termType`. It returns a `result` object containing maps `state` and `output`.
The algorithms in this section define the "compression" strategy to be used with the "conversion" algorithms defined previously to convert JSON-LD to CBOR-LD.
This algorithm takes maps `state`, `activeContext`, `input`, and `output`, and returns a map `result` containing maps `output`, `state`, and `activeContext`.
This algorithm takes maps `state` and `termInfo`, and values `valueToEncode` and `termType`, and returns a map `encoderData`.
This algorithm takes maps `state`, `activeContext`, and `input`, and returns a map `state` and an array `entries`.
This algorithm takes maps `activeContext` and `input`, and returns a set `objectTypes`.
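A minimal sketch of collecting object types, assuming the active context exposes the set of `'@type'` aliases under a hypothetical `typeAliases` key. The key name and the choice to ignore non-string type values are assumptions made for this example.

```python
def get_object_types(active_context: dict, input_map: dict) -> set:
    """Collect the string values found under '@type' or any of its aliases."""
    object_types = set()
    # `typeAliases` is a hypothetical key holding terms that alias '@type'.
    for alias in active_context.get("typeAliases", {"@type"}):
        value = input_map.get(alias)
        if value is None:
            continue
        values = value if isinstance(value, list) else [value]
        object_types.update(item for item in values if isinstance(item, str))
    return object_types


active_context = {"typeAliases": {"@type", "type"}}
credential = {"type": ["VerifiableCredential", "ExampleCredential"], "id": "urn:example:1"}
print(sorted(get_object_types(active_context, credential)))
```

Returning a set rather than a list reflects that the order of types does not matter for this step, and duplicates collapse naturally.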
The algorithms in this section define the "decompression" strategy to be used with the "conversion" algorithms defined previously to convert CBOR-LD to JSON-LD.
This algorithm takes maps `state`, `activeContext`, `input`, and `output`, and returns a map `result` containing maps `output`, `state`, and `activeContext`.
This algorithm takes maps `state` and `termInfo`, and values `termType` and `valueToDecode`, and returns a value `decodedValue`.
This algorithm takes maps `state`, `activeContext`, and `input`, and returns a map `state` and an array `entries`.
This algorithm takes maps `state`, `activeContext`, and `input` as inputs, and returns a map `state` and a set `objectTypes`.
The algorithms in this section describe how to determine what components of the context documents associated with a JSON-LD document are in use at any point during compression or decompression. These algorithms include how to apply embedded, type-scoped, and property-scoped contexts with CBOR-LD. This is in contrast to the Context Loading algorithms defined later in this specification, which describe how to construct the mappings from terms to integers that are the core CBOR-LD compression technique. Together, the Active Context Processing and Context Loading algorithms specify how JSON-LD context documents should be processed when converting to and from CBOR-LD.
This algorithm takes maps `previousActiveContext` and `termMap`, and returns a map `activeContext`. It updates the active context in use and finds all aliases for `'@type'`.
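Finding the aliases for `'@type'` can be sketched as follows. In JSON-LD, a term aliases a keyword when its definition is the keyword itself, either as a plain string or as the `@id` of an expanded term definition. The function name is hypothetical, and including `'@type'` itself in the result is an assumption of this sketch.

```python
def find_type_aliases(term_map: dict) -> set:
    """Return '@type' plus every term whose definition maps to '@type'."""
    aliases = {"@type"}
    for term, definition in term_map.items():
        if isinstance(definition, dict):
            # Expanded term definition: the alias target is its '@id'.
            target = definition.get("@id")
        else:
            # Simple term definition: the string is the target.
            target = definition
        if target == "@type":
            aliases.add(term)
    return aliases


term_map = {
    "type": "@type",
    "id": "@id",
    "kind": {"@id": "@type"},
    # '@type' here is a value type coercion, not an alias for '@type'.
    "homepage": {"@id": "https://schema.org/url", "@type": "@id"},
}
print(sorted(find_type_aliases(term_map)))  # ['@type', 'kind', 'type']
```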
This algorithm takes maps `state`, `activeContext`, and `input` as inputs, and returns a map `result` containing maps `state` and `activeContext`.
This algorithm takes maps `state`, `activeContext`, and a string `term` as inputs and returns a map `result` containing maps `state` and `activeContext`.
This algorithm takes maps `state` and `activeContext`, and a set `objectTypes` as inputs, and returns a map `result` containing maps `state` and `activeContext`.
This algorithm takes maps `state` and `activeTermMap`, a map or array `contexts`, and booleans `typeScope` and `propertyScope` as inputs. Both booleans default to `false` if not provided. It returns maps `state` and `activeTermMap`.
This algorithm takes as input a map `activeContext`, and returns a map `newTermMap`.
The algorithms in this section define how to construct the mappings between terms and integers that are used as the core CBOR-LD compression technique.
This algorithm takes and returns a map `state`.
{ '@context' => 0, '@type' => 2, '@id' => 4, '@value' => 6, '@direction' => 8, '@graph' => 10, '@included' => 12, '@index' => 14, '@json' => 16, '@language' => 18, '@list' => 20, '@nest' => 22, '@reverse' => 24, '@base' => 26, '@container' => 28, '@default' => 30, '@embed' => 32, '@explicit' => 34, '@none' => 36, '@omitDefault' => 38, '@prefix' => 40, '@preserve' => 42, '@protected' => 44, '@requireAll' => 46, '@set' => 48, '@version' => 50, '@vocab' => 52, '@propagate' => 54 }
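The keyword table above assigns consecutive even integers to the keywords in the order listed, so it can be regenerated from the ordered list alone. The sketch below does that and also builds the inverse map that decompression would need; the variable names are illustrative and not taken from this specification.

```python
# Keywords in the order they appear in the table above.
KEYWORDS = [
    "@context", "@type", "@id", "@value", "@direction", "@graph",
    "@included", "@index", "@json", "@language", "@list", "@nest",
    "@reverse", "@base", "@container", "@default", "@embed", "@explicit",
    "@none", "@omitDefault", "@prefix", "@preserve", "@protected",
    "@requireAll", "@set", "@version", "@vocab", "@propagate",
]

# Each keyword receives twice its position: 0, 2, 4, ..., 54.
keyword_table = {keyword: 2 * index for index, keyword in enumerate(KEYWORDS)}

# Inverse mapping (integer -> keyword) for use when decoding.
reverse_keyword_table = {number: keyword for keyword, number in keyword_table.items()}

print(keyword_table["@id"])          # 4
print(keyword_table["@propagate"])   # 54
print(reverse_keyword_table[0])      # @context
```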
This algorithm takes a map `state` and a context map or URL `contextIdentifier`, and returns `result`, a map containing maps `state` and `entry`.
This algorithm takes a map `state`, a context object `context`, and a context URL `contextUrl`, and returns `result`, a map containing maps `state` and `entry`.
The codecs in this section specify exactly how individual values in JSON-LD should be converted to CBOR and vice versa. They are used by the algorithms in the previous section, and allow CBOR-LD to efficiently encode both primitive and non-primitive types as CBOR.
This algorithm takes a map `typeTable` and a value `contextValue` and returns a map `encoderData`.
This algorithm takes a map `encoderData`, and returns CBOR binary data.
This algorithm takes a map `reverseTypeTable`, and returns a map `decoderData`.
This algorithm takes a map `decoderData` and a value `value`, and returns a value.
This algorithm takes maps `state` and `termInfo`, and values `termType` and `valueToEncode`, and returns a map `encoderData` or `valueToEncode`.
This algorithm takes a map `encoderData`, and returns CBOR binary data.
This algorithm takes maps `state` and `termInfo`, and values `termType` and `valueToDecode`, and returns a map `decoderData`.
This algorithm takes a map `decoderData`, and returns a value.
This algorithm takes an encoded CBOR-LD payload `cborldBytes` as input, and returns `suffix`, the main data to be decoded, as well as the `registryEntryId` value that should be used to decompress `suffix`.
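A minimal sketch of this header split, assuming the payload layout stated earlier in this document: tag `0xCB1D` wrapping a two-element array whose first element is an unsigned integer. The function names are hypothetical, indefinite-length items are not handled, and any additional payload forms the full algorithm may accept are out of scope here.

```python
CBORLD_TAG = 0xCB1D  # 51997; provisional per this document


def read_head(data: bytes, offset: int):
    """Return (major_type, argument, next_offset) for the CBOR head at offset."""
    if offset >= len(data):
        raise ValueError("unexpected end of input")
    initial = data[offset]
    major_type, info = initial >> 5, initial & 0x1F
    offset += 1
    if info < 24:
        return major_type, info, offset
    if info > 27:
        raise ValueError("indefinite or reserved lengths are not handled here")
    size = 1 << (info - 24)  # 1, 2, 4, or 8 argument bytes
    if offset + size > len(data):
        raise ValueError("truncated argument")
    argument = int.from_bytes(data[offset:offset + size], "big")
    return major_type, argument, offset + size


def split_cborld(cborld_bytes: bytes):
    """Hypothetical helper: return (suffix, registry_entry_id)."""
    major, tag, pos = read_head(cborld_bytes, 0)
    if (major, tag) != (6, CBORLD_TAG):
        raise ValueError("missing CBOR-LD tag")
    major, length, pos = read_head(cborld_bytes, pos)
    if (major, length) != (4, 2):
        raise ValueError("tagged item must be a two-element array")
    major, registry_entry_id, pos = read_head(cborld_bytes, pos)
    if major != 0:
        raise ValueError("registry entry ID must be a major type 0 integer")
    # Everything after the ID is the second array element, still CBOR-encoded.
    return cborld_bytes[pos:], registry_entry_id


suffix, registry_entry_id = split_cborld(bytes.fromhex("d9cb1d8201a0"))
print(registry_entry_id, suffix.hex())  # 1 a0
```

The returned `registry_entry_id` selects the registry entry, and therefore the type tables, needed before `suffix` can be decompressed.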
This specification registers a CBOR tag to allow consumers to identify CBOR-LD payloads. The following is provisional, and has not yet been ratified by IANA.
Tag: 51997
Registry: https://www.iana.org/assignments/cbor-tags/cbor-tags.xhtml
Data item: array
Semantics: a tag value of 51997 indicates that the payload is CBOR-LD.
Description of semantics: https://json-ld.github.io/cbor-ld-spec/#cbor-tags-for-cbor-ld
Point of contact: Wesley Smith (wsmith@digitalbazaar.com)