CBOR is a compact binary data serialization and messaging format. This specification defines CBOR-LD 1.0, a CBOR-based format to serialize Linked Data. The encoding is designed to leverage the existing JSON-LD ecosystem, which is deployed on hundreds of millions of systems today, to provide a compact serialization format for those seeking efficient encoding schemes for Linked Data. By utilizing semantic compression schemes, compression ratios in excess of 60% better than generalized compression schemes are possible. This format is primarily intended to be a way to use Linked Data in storage and bandwidth constrained programming environments, to build interoperable semantic wire-level protocols, and to efficiently store Linked Data in CBOR-based storage engines.
This document is experimental.
There is a reference implementation that is capable of demonstrating the features described in this document.
This document is a detailed specification for a serialization of Linked Data in CBOR. The document is primarily intended for the following audiences:
There are a number of ways that one may participate in the development of this specification:
CBOR-LD satisfies the following design goals:
Similarly, the following are non-goals.
The following minefields have been identified while working on this specification:
The general CBOR-LD encoding algorithm takes a JSON-LD Document and does the following:
The first step in decoding a CBOR-LD payload is to recreate the term codec map that was used to encode it by processing the contexts in the payload. However, the contexts needed to create the term codec map can have their URLs encoded as integers by CBOR-LD. If a CBOR-LD payload contains context URLs compressed in such a way, the consumer of the CBOR-LD needs to know what compression tables (maps from JSON-LD terms to integers) were used to compress the context URLs during creation to be able to reconstruct the term codec map. The following sections define the exact mechanism by which this can be accomplished, allowing an arbitrary CBOR-LD consumer to decompress any CBOR-LD payload that conforms to this specification.
To this end, we have registered the range of CBOR tags 1536-1791 (0x0600-0x06FF) to be used for CBOR-LD, where data that includes the tag value is used to look up which compression table(s) are needed to decompress the CBOR-LD context URLs.
This exact range of tag values has not yet been officially registered with the IANA CBOR Tag Registry. The exact range is subject to change.
To allow an unbounded number of CBOR-LD use cases, each of which may require different compression table material for consumption, while working within a fixed number of CBOR tag values, we define the following.
Implementers MUST interpret the last byte of the two-byte CBOR tag value on a CBOR-LD payload as the beginning of a varint. If the CBOR tag is in the range `0x0600`–`0x067F`, the last byte of the CBOR tag is a one-byte varint. If the CBOR tag is `0x0680` or greater, the first item in the CBOR payload MUST be a major type 2 byte string containing the rest of the varint. See Algorithm for more information.
The value of this varint is then used to look up a CBOR-LD Varint Registry Entry in the CBOR-LD Varint Registry.
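The tag interpretation described above can be sketched as follows. This is a minimal illustration, not the normative algorithm: it assumes the varint is an unsigned, LEB128-style varint (the low 7 bits of each byte carry data, least significant group first, and the high bit signals that another byte follows), and the function name `registry_entry_id` and its parameters are hypothetical.

```python
CBORLD_TAG_MIN = 0x0600
CBORLD_TAG_MAX = 0x06FF


def registry_entry_id(tag: int, rest: bytes = b"") -> int:
    """Recover the registry entry ID from a CBOR-LD tag.

    `tag` is the two-byte CBOR tag value. `rest` is the content of the
    leading byte string in the payload, which is only present when the
    tag is 0x0680 or greater.

    Assumption: unsigned LEB128-style varint encoding.
    """
    if not CBORLD_TAG_MIN <= tag <= CBORLD_TAG_MAX:
        raise ValueError("not a CBOR-LD tag")

    first = tag & 0xFF  # last byte of the tag starts the varint
    if first < 0x80:
        # Tags 0x0600-0x067F: the varint is complete in one byte.
        if rest:
            raise ValueError("unexpected varint continuation bytes")
        return first

    # Tags 0x0680 and above: the remaining varint bytes follow.
    if not rest:
        raise ValueError("missing varint continuation bytes")
    value = first & 0x7F
    shift = 7
    for index, byte in enumerate(rest):
        value |= (byte & 0x7F) << shift
        shift += 7
        if byte < 0x80:  # high bit clear: this is the final byte
            if index != len(rest) - 1:
                raise ValueError("trailing bytes after varint")
            return value
    raise ValueError("truncated varint")
```

Under these assumptions, tag `0x0664` yields registry entry 100 directly, while a larger ID such as 300 would be carried as tag `0x06AC` followed by the byte string `0x02`.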
The CBOR-LD Varint Registry is a global list that gives consumers of CBOR-LD payloads the information they need to reconstruct the term codec map required for decompression. A CBOR-LD Varint Registry Entry contains the following:
The `typeTables` associated with a CBOR-LD Varint Registry Entry MUST be an array of URLs or JSON objects. The only exception is the string "callerProvidedTable", which may appear in this array to denote that this use case requires a `Type Table` that is not globally defined.
Dereferencing one of these URLs MUST result in a JSON object with the following properties:
If a JSON object is present in the `typeTables` array, it MUST be in the above format.
The following is the current CBOR-LD registry:
Registry Entry Id | Use Case | typeTables | Processing Model |
---|---|---|---|
0 | Uncompressed CBOR-LD | None | DEFAULT |
1 | Compressed CBOR-LD, default use case. | DEFAULT | DEFAULT |
100 | Verifiable Credential Barcodes Specification Test Vectors | [ { type: "context", table: { "https://www.w3.org/ns/credentials/v2": 32768, "https://w3id.org/vc-barcodes/v1": 32769, "https://w3id.org/utopia/v2": 32770 } }, { type: "https://w3id.org/security#cryptosuiteString", table: { "ecdsa-rdfc-2019": 1, "ecdsa-sd-2023": 2, "eddsa-rdfc-2022": 3, "ecdsa-xi-2023": 4 } } ] | DEFAULT |
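As an illustration of how a consumer might use the `typeTables` of a registry entry, the sketch below merges the inline tables of entry 100 into a single lookup keyed by type and uses it to compress and decompress values. The helper names (`build_type_table`, `compress_value`, `decompress_value`) are hypothetical, and skipping "callerProvidedTable" entries rather than resolving them is a simplifying assumption; the specification's own algorithms remain authoritative.

```python
# Inline typeTables copied from registry entry 100 above.
ENTRY_100_TYPE_TABLES = [
    {
        "type": "context",
        "table": {
            "https://www.w3.org/ns/credentials/v2": 32768,
            "https://w3id.org/vc-barcodes/v1": 32769,
            "https://w3id.org/utopia/v2": 32770,
        },
    },
    {
        "type": "https://w3id.org/security#cryptosuiteString",
        "table": {
            "ecdsa-rdfc-2019": 1,
            "ecdsa-sd-2023": 2,
            "eddsa-rdfc-2022": 3,
            "ecdsa-xi-2023": 4,
        },
    },
]


def build_type_table(type_tables):
    """Merge inline typeTables entries into {type: {value: integer}}."""
    merged = {}
    for item in type_tables:
        if item == "callerProvidedTable":
            # Assumption: the caller supplies this table out of band.
            continue
        merged.setdefault(item["type"], {}).update(item["table"])
    return merged


def compress_value(type_table, value_type, value):
    """Return the integer for a known value, else the value unchanged."""
    return type_table.get(value_type, {}).get(value, value)


def decompress_value(type_table, value_type, encoded):
    """Reverse lookup: map an integer back to its original string."""
    for value, integer in type_table.get(value_type, {}).items():
        if integer == encoded:
            return value
    return encoded


type_table = build_type_table(ENTRY_100_TYPE_TABLES)
```

With this table, the context URL `https://www.w3.org/ns/credentials/v2` compresses to 32768, and a value with no table entry passes through unchanged.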
This algorithm takes JSON-LD objects `jsonldDocument` and `options` as well as an integer `registryEntryId` as input.
This algorithm takes a JSON-LD object `jsonldDocument`, integer `registryEntryId` and `options` as input.
This algorithm takes a JSON-LD object `jsonldDocument` and `options` as input. The `options` MUST contain:
This algorithm takes a map `typeTable` and returns a CBOR-LD term codec map that maps JSON-LD terms to their associated byte values and value compression functions.
Note: This term codec registry is deprecated and has been replaced by the CBOR-LD Varint Registry.
The following is a registry of well-known term codecs. These will be registered on a first-come, first-served basis.
Value | Context URL | Context Name |
---|---|---|
0x00 - 0x0F | RESERVED | Reserved for future use. |
0x10 | https://www.w3.org/ns/activitystreams | ActivityStreams 2.0 |
0x11 | https://www.w3.org/2018/credentials/v1 | Verifiable Credentials Data Model v1 |
0x12 | https://www.w3.org/ns/did/v1 | Decentralized Identifiers (DID) Core Spec v1 |
0x13 | https://w3id.org/security/suites/ed25519-2018/v1 | Ed25519Signature2018 Suite |
0x14 | https://w3id.org/security/suites/ed25519-2020/v1 | Ed25519Signature2020 Suite |
0x15 | https://w3id.org/cit/v1 | Concealed Id Token |
0x16 | https://w3id.org/age/v1 | Age Verification |
0x17 | https://w3id.org/security/suites/x25519-2020/v1 | X25519KeyAgreementKey2020 Suite |
0x18 | https://w3id.org/veres-one/v1 | Veres One DID Method |
0x19 | https://w3id.org/webkms/v1 | WebKMS (Key Management System) |
0x1A | https://w3id.org/zcap/v1 | Authorization Capabilities (zCap) |
0x1B | https://w3id.org/security/suites/hmac-2019/v1 | Sha256HmacKey2019 Crypto Suite |
0x1C | https://w3id.org/security/suites/aes-2019/v1 | AesKeyWrappingKey2019 Crypto Suite |
0x1D | https://w3id.org/vaccination/v1 | Vaccination Certificate Vocabulary v0.1 |
0x1E | https://w3id.org/vc-revocation-list-2020/v1 | Verifiable Credentials Revocation List 2020 |
0x1F | https://w3id.org/dcc/v1 | DCC (Decentralized Credentials Consortium) Core Context |
0x20 | https://w3id.org/vc/status-list/v1 | Verifiable Credentials Status List |
0x21 | https://www.w3.org/ns/credentials/v2 | Verifiable Credentials Data Model v2 |
0x22 - 0x2F | Available for use. | |
0x30 | https://w3id.org/security/data-integrity/v1 | Data Integrity v1.0 |
0x31 | https://w3id.org/security/multikey/v1 | Multikey v1.0 |
0x32 | Reserved for future use. | |
0x33 | https://w3id.org/security/data-integrity/v2 | Data Integrity v2.0 |
0x34 - 0x36 | RESERVED | Reserved for future use. |