Introduction

When building a database system designed for very long-term use, there arises a need for permanent identifiers for its objects. Usage of provisional identifiers will eventually lead to a situation where a name ceases to refer to the object it has named, effectively making references incorrect until said identifiers are updated. This implies the need for constant database monitoring and maintenance. Users of the system would also need to be informed of any changes made.

This specification defines abstract data objects called ‘resource descriptors’ and a URI scheme for naming them. It also defines an interface for interacting with these objects. Anything else is outside the scope of this document, including mapping said interface to a communication protocol.

Resource descriptors contain knowledge about a topic designated by the URI. They are abstract in the sense that the content of their representation differs depending on the current time and the contacted host. A resource descriptor is neither a specific object nor a network location. You may think of the URI as a precise search term supplied in a query and of the resource descriptor as an answer to that query.

Foreword

The key words ‘MUST,’ ‘MUST NOT,’ ‘REQUIRED,’ ‘SHALL,’ ‘SHALL NOT,’ ‘SHOULD,’ ‘SHOULD NOT,’ ‘RECOMMENDED,’ ‘NOT RECOMMENDED,’ ‘MAY,’ and ‘OPTIONAL’ in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

Descriptor identifier

Resource descriptors are identified by the `rd` URI scheme defined herein.

Syntax

The syntax of these URIs is defined by the following `rd-URI` ABNF rule. It follows the generic URI syntax defined in RFC3986. The `reg-name`, `segment-nz`, `query` and `fragment` rules are imported from that document.

```
rd-URI   = "rd://" rd-auth [ rd-path [ "?" query ] [ "#" fragment ] ]
rd-auth  = reg-name / RDGN
RDGN     = 1*RDGN-blk
RDGN-blk = 24b32-char
b32-char = ALPHA / "2" / "3" / "4" / "5" / "6" / "7"
rd-path  = 1*( "/" segment-nz )
```

Scheme

The `rd` in the scheme stands for ‘resource descriptor.’ It is expected that these identifiers will be stored in large numbers. A two-letter abbreviation was chosen in order to save space and to shorten the computation time of URI comparisons.

Authority

The authority component contains either the canonical Resource-descriptor Graph Number (RDGN) or a registered name for ease of human input.

Resource-descriptor Graph Number (RDGN)

The canonical authority is a Resource-descriptor Graph Number (RDGN). It is a randomly-generated unsigned integer, which identifies a graph (collection) of closely-related resource descriptors.

The maximum value of an RDGN is determined by a variable called its length. The unit of an RDGN length is a 120-bit block.

The initial length is 1 block. The length is increased in steps of blocks, i.e. by 120 bits.

At least one bit of the last block MUST be set. An RDGN with all bits cleared is invalid.

This block size ensures that both base64 and base32 encodings of the binary representation produce strings without any padding, because 15 octets is a multiple of both 3 and 5. It also leaves one free octet in a 16-octet buffer for use by software, where a last-block marker or the number of remaining blocks could be stored.

The textual representation of an RDGN is constructed by representing the number as a sequence of octets in ascending order of octet significance. The resulting sequence of octets (its length is a multiple of 15) is then encoded into text using an octet-to-text encoding. Within the URI, RDGNs are encoded with the base32 encoding [RFC4648].

Note: One block (15 octets) produces 24 base32 characters.

Note: base64 cannot be used because URI authorities are case-insensitive.
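The construction of the textual representation can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function names `rdgn_to_text` and `text_to_rdgn` are hypothetical, the number of blocks is passed in explicitly, and the octet order follows the rule stated above (ascending order of significance).

```python
import base64

BLOCK_OCTETS = 15  # one 120-bit block


def rdgn_to_text(number: int, blocks: int = 1) -> str:
    """Encode an RDGN as base32 text, least significant octet first."""
    if number <= 0:
        raise ValueError("an RDGN with all bits cleared is invalid")
    # Raises OverflowError if the number does not fit into the given length.
    octets = number.to_bytes(blocks * BLOCK_OCTETS, "little")
    return base64.b32encode(octets).decode("ascii")


def text_to_rdgn(text: str) -> int:
    """Decode base32 text back into the RDGN integer."""
    if not text or len(text) % 24 != 0:
        raise ValueError("RDGN text must be a non-empty multiple of 24 characters")
    # casefold=True accepts lower-case input, since authorities are case-insensitive.
    octets = base64.b32decode(text, casefold=True)
    return int.from_bytes(octets, "little")
```

For instance, `rdgn_to_text(1)` yields `AEAAAAAAAAAAAAAAAAAAAAAA`: 24 characters with no `=` padding.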

Empty blocks

The RDGN `EEEEEEEEEEEEEEEEEEEEEEEEAAAAAAAAAAAAAAAAAAAAAAAA` is treated the same as `EEEEEEEEEEEEEEEEEEEEEEEE`. In other words, all empty blocks are stripped from the end.
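Stripping empty blocks can be sketched as below. The helper name `strip_empty_blocks` is hypothetical, and folding the text to upper case is an assumption based on the case-insensitivity of URI authorities. An empty block is all zero bits, which base32 renders as 24 `A` characters.

```python
BLOCK_CHARS = 24  # base32 characters per 120-bit block


def strip_empty_blocks(rdgn_text: str) -> str:
    """Drop trailing blocks that consist solely of 'A' (all bits cleared)."""
    if not rdgn_text or len(rdgn_text) % BLOCK_CHARS != 0:
        raise ValueError("RDGN text must be a non-empty multiple of 24 characters")
    text = rdgn_text.upper()
    empty = "A" * BLOCK_CHARS
    # Always keep the first block; only blocks after it can be stripped.
    while len(text) > BLOCK_CHARS and text.endswith(empty):
        text = text[:-BLOCK_CHARS]
    return text
```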

Anonymous Graph

RDGN 17459368367494596302128254670616013 in decimal (`ANONANONANONANONANONANON` in base32) is reserved for the Anonymous Graph.

The Anonymous Graph SHOULD be used in examples.

Implementations MAY define special processing for the Anonymous Graph.

RDGN registry

This technology was created for use in a Kueea Network [KUEEA]. It is a peer-to-peer network in which a node may advertise that it wishes to take on a given network role.

Taking on the role of an RDGN registry means that the node will be contacted by other nodes in order to determine the allocation state of an RDGN. The role of a registry is thus to keep a list of allocated numbers. The minimum amount of information a registry needs to store about a given RDGN is a boolean value indicating whether the number has been allocated or not.

Registries also keep track of the current RDGN length. The length is independently increased by each registry when 1% of all numbers of the current length have been allocated. In other words, the second block is added after 2^120 / 100 numbers have been allocated, the third after 2^240 / 100 numbers, etc.
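The growth rule can be expressed as a short calculation. The sketch below assumes that the threshold is rounded down to an integer and that a registry counts only its own allocations; neither detail is stated in this document, and the function names are hypothetical.

```python
BLOCK_BITS = 120


def growth_threshold(blocks: int) -> int:
    """Number of allocations (1% of the space) after which another block is added."""
    if blocks < 1:
        raise ValueError("the length is at least one block")
    return 2 ** (BLOCK_BITS * blocks) // 100  # assumption: rounded down


def next_length(blocks: int, allocated: int) -> int:
    """Return the length, in blocks, that a registry should use from now on."""
    return blocks + 1 if allocated >= growth_threshold(blocks) else blocks
```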

It is an error if a node attempts to allocate a number that is too short. The length of a new RDGN MUST be equal to or greater than the current length.

There SHOULD be multiple nodes functioning as an RDGN registry within a given network because nodes MAY resign from their role at any time. Nodes may also unexpectedly disappear from the network.

Definition of a registration protocol is out of scope of this document, although the numbers MUST always be provided by the registrant. If the number has not been allocated yet, it is marked as allocated, in a first-come, first-served fashion.

The registrant, rather than the registry, generates the number so that the registrant can be sure that it has really been randomly generated. If a remote node controlled the generation of numbers, it could present numbers which only appear to be random.

Registered name

The authority component is treated as a registered name if, and only if, its length is not a multiple of 24 characters or it contains a character not matched by the `b32-char` rule.
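A minimal sketch of this classification follows. The function name `classify_authority` and its return values are illustrative only; rejecting an empty authority is an assumption, since an empty string matches neither `RDGN` nor a usable name.

```python
import re

# 1*RDGN-blk: one or more blocks of 24 base32 characters (ALPHA / "2"-"7").
_RDGN_RE = re.compile(r"(?:[A-Za-z2-7]{24})+")


def classify_authority(authority: str) -> str:
    """Return "rdgn" or "reg-name" for the authority component of an rd URI."""
    if not authority:
        raise ValueError("empty authority")
    return "rdgn" if _RDGN_RE.fullmatch(authority) else "reg-name"
```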

These names are only for ease of human input. They are never stored.

DNS names [RFC1035] are the only domain of registered names defined.

This document may be updated in the future in order to define additional domains of registered names, although it is believed that such a need will never arise. The authors believe that no other system for storing human-readable, registered names than DNS is required, because any such system would ultimately have exactly the same problems as those identified with DNS. Any human-readable naming system requires a global registry, which must be centrally managed in order to solve disputes over names. Most problems with the public DNS are not with the database itself, but are rather problems with the management (such as lack of trust in it) or with the protocol used for transferring the data over a network.

In order to resolve a domain name to an RDGN, issue a query with QNAME set to the domain name and QTYPE set to TXT. Then, look for a TXT record in the answer section that matches the following `rd-DNS` rule and extract the value of the RDGN from the first record that matches.

`rd-DNS   = %s"RDGN " <RDGN>`
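The record-matching step can be sketched as follows. The sketch performs no DNS query: it assumes the TXT record data has already been fetched and that each record is available as a single string. The function name `rdgn_from_txt` is hypothetical.

```python
import re
from typing import Iterable, Optional

# %s"RDGN " is case-sensitive; the RDGN is one or more 24-character blocks.
_RD_DNS_RE = re.compile(r"RDGN ((?:[A-Za-z2-7]{24})+)")


def rdgn_from_txt(records: Iterable[str]) -> Optional[str]:
    """Return the RDGN from the first TXT record matching rd-DNS, if any."""
    for text in records:
        match = _RD_DNS_RE.fullmatch(text)
        if match:
            return match.group(1)
    return None
```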

Path

The path is a human-readable, case-sensitive name of a resource descriptor.

Although the syntax defines paths as hierarchical, resource descriptors are considered to be nodes of a graph (i.e. not of a tree). There is no concept of parents and children within the namespace. In other words, the existence of `/a/b/c` does not imply the existence of `/a` nor the existence of `/a/b`.

For clarity: Paths MUST NOT end with a U+002F SOLIDUS character (`/`) and empty path segments are not permitted, i.e. there MUST NOT be any two consecutive U+002F SOLIDUS characters (`//`) within the path.
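These constraints follow directly from the `rd-path` rule. A minimal validation sketch, with the `pchar` character set transcribed from RFC 3986 and a hypothetical function name, could look like this:

```python
import re

# pchar = unreserved / pct-encoded / sub-delims / ":" / "@"   (RFC 3986)
_PCHAR = r"(?:[A-Za-z0-9\-._~!$&'()*+,;=:@]|%[0-9A-Fa-f]{2})"
# rd-path = 1*( "/" segment-nz ), where segment-nz = 1*pchar
_RD_PATH_RE = re.compile("(?:/" + _PCHAR + "+)+")


def is_valid_rd_path(path: str) -> bool:
    """Check a path against the rd-path rule (no empty or trailing segments)."""
    return _RD_PATH_RE.fullmatch(path) is not None
```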

Each resource descriptor should contain knowledge that is narrow in scope. Descriptors are to form a graph structure that applications traverse. The names ought to be chosen so that the amount of knowledge within one descriptor is concise and can be processed relatively fast. Large amounts of information should be split among multiple descriptors so that applications do not waste time processing unnecessary data.

For example, if a Compact Disc is to be described, only information about the disc itself should be present and nothing else. Even if there are songs on the disc, a song is not a disc – it should have its own descriptor.

The `/index` descriptor

Every graph MUST contain at least the descriptor under the path `/index`.

It is the first descriptor an application retrieves. It contains information about the classes of the graph and references to other descriptors that exist within the graph.

The list of referenced descriptors is not required to be exhaustive. Only nearby nodes SHOULD be referenced, i.e. the minimal set required to reach all other nodes.

It is RECOMMENDED that other descriptors with the same purpose also use the path segment `index`, preferably as the last one.

Query

The query component is a serialized list of name-value pairs.

The serialization algorithm is as follows: For each pair:

1. Encode both the name and value using UTF-8. [RFC3629]
2. If output is not empty, append a U+0026 AMPERSAND character.
3. Append the name.
4. Append a U+003D EQUALS SIGN character.
5. Append the value.
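The steps above can be sketched as follows. The sketch implements them literally and therefore performs no percent-encoding; whether names and values are expected to be percent-encoded beforehand is not stated here, so that is left to the caller. The function name `serialize_query` is hypothetical.

```python
from typing import Iterable, Tuple


def serialize_query(pairs: Iterable[Tuple[str, str]]) -> bytes:
    """Serialize name-value pairs into the octets of a query component."""
    output = bytearray()
    for name, value in pairs:
        name_octets = name.encode("utf-8")    # step 1
        value_octets = value.encode("utf-8")
        if output:                            # step 2
            output += b"&"
        output += name_octets                 # step 3
        output += b"="                        # step 4
        output += value_octets                # step 5
    return bytes(output)
```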

The space of query parameters is defined separately for the retrieval function and for the submission function.

Normalization

Do the following in order to normalize an `rd` URI.

1. If the authority is a registered name, dereference the name and modify the authority component accordingly.
2. Remove the query component.
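A minimal sketch of the two normalization steps is given below. Dereferencing is delegated to a caller-supplied `resolve_name` callback, which is a hypothetical stand-in for the DNS lookup described under Registered name; the fragment is kept because only the query component is removed.

```python
import re
from typing import Callable
from urllib.parse import urlsplit, urlunsplit

_RDGN_RE = re.compile(r"(?:[A-Za-z2-7]{24})+")


def normalize_rd_uri(uri: str, resolve_name: Callable[[str], str]) -> str:
    """Normalize an rd URI: RDGN authority, no query component."""
    parts = urlsplit(uri)  # urlsplit lower-cases the scheme
    if parts.scheme != "rd":
        raise ValueError("not an rd URI")
    authority = parts.netloc
    if not _RDGN_RE.fullmatch(authority):
        authority = resolve_name(authority)   # step 1: dereference the name
    # Step 2: rebuild the URI with an empty query component.
    return urlunsplit(("rd", authority, parts.path, "", parts.fragment))
```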

URI comparison

When comparing URIs according to this framework:

• if the input is a URI reference, resolve it to the full URI;
• if the scheme is `rd`, normalization is REQUIRED, otherwise normalization is RECOMMENDED;
• compare URIs by components, not by the whole string; for example, the URI `http://example.com/?#` is equal to `http://example.com/`.
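Component-wise comparison can be sketched with Python's standard URI splitter. The sketch assumes both inputs are already full URIs and, for `rd` URIs, already normalized. Lower-casing the whole authority is a simplification: it also folds any userinfo part, which is strictly case-sensitive.

```python
from urllib.parse import urlsplit


def uris_equal(a: str, b: str) -> bool:
    """Compare two full URIs component by component."""
    pa, pb = urlsplit(a), urlsplit(b)
    return (
        pa.scheme == pb.scheme                      # already lower-cased by urlsplit
        and pa.netloc.lower() == pb.netloc.lower()  # simplification, see above
        and pa.path == pb.path
        and pa.query == pb.query                    # "?" with nothing after it is empty
        and pa.fragment == pb.fragment              # "#" with nothing after it is empty
    )
```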

Descriptor definition

```
descriptor
|-piece-id => RDF-graph
|             |-signer-id => signature
|             |-signer-id => signature
|-piece-id => RDF-graph
              |-signer-id => signature
              |-signer-id => signature
```

A resource descriptor is a unique mapping of a piece identifier to a descriptor piece.

A piece identifier is a character string which has the same syntax as the `Message-ID` field of Internet Messages. [RFC2822]

A descriptor piece is a pair of an RDF graph and a unique mapping of a (signer) URI to a signature.

An RDF graph is a set of RDF statements. [RDF] The set MUST NOT be empty.

A signature is a tuple of: the time of expiration, a character string identifying the signature scheme, and an opaque data object (sometimes called a blob). The object contains the result of applying the scheme.

A signature is an assessment by the signer that all of the statements contained within the signed RDF graph are correct and true.

A piece expires when all of its signatures expire, i.e. all assessments of information truthfulness expire.
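The definitions above map naturally onto a small data model. The sketch below is one possible in-memory shape, not a prescribed one: the class names are hypothetical, RDF statements are simplified to plain string triples, and `None` stands for a signature that never expires.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, FrozenSet, Optional, Tuple

Statement = Tuple[str, str, str]  # assumption: an RDF statement as a plain triple


@dataclass
class Signature:
    expires: Optional[datetime]  # None: the signature never expires
    scheme: str                  # identifier of the signature scheme
    blob: bytes                  # opaque result of applying the scheme


@dataclass
class Piece:
    graph: FrozenSet[Statement]
    signatures: Dict[str, Signature] = field(default_factory=dict)  # signer URI -> signature

    def expired(self, now: datetime) -> bool:
        """A piece expires once every one of its signatures has expired."""
        # Note: a piece without any signatures counts as expired here.
        return all(
            s.expires is not None and s.expires <= now
            for s in self.signatures.values()
        )


Descriptor = Dict[str, Piece]  # piece identifier -> descriptor piece
```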

Graph requirements

All `rd` URIs in the graph MUST be in the normalized form.

All `ni` URIs MUST have an empty authority component.

There MUST NOT be any `data` URIs in the graph.

Signatures

This document does not define any signature schemes. It only defines how signatures are expressed and what data is signed.

```
sig-scheme = 1*(ALPHA / DIGIT / "-")
sig-expire = date-time / sig-expkey
sig-expkey = %s"never"
```

The `date-time` rule is imported from [RFC3339].

The identifier of a signature scheme MUST match the `sig-scheme` rule.

The time of expiration is a character string. It MUST match the `sig-expire` rule. It is either a specific date and time or a keyword.

The only defined keyword is `never`, indicating that the signature never expires.
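Checking these two fields can be sketched as follows. The date-time parsing is only an approximation of the RFC 3339 `date-time` rule: it relies on `datetime.fromisoformat`, which accepts some strings the rule does not and, in older Python versions, rejects some fractional-second forms. Function names are hypothetical.

```python
import re
from datetime import datetime
from typing import Optional

_SIG_SCHEME_RE = re.compile(r"[A-Za-z0-9-]+")


def is_sig_scheme(text: str) -> bool:
    """Check a scheme identifier against the sig-scheme rule."""
    return _SIG_SCHEME_RE.fullmatch(text) is not None


def parse_sig_expire(text: str) -> Optional[datetime]:
    """Return None for "never", otherwise the expiration as an aware datetime."""
    if text == "never":  # %s"never" is case-sensitive
        return None
    candidate = text.upper()  # RFC 3339 also permits lower-case "t" and "z"
    if candidate.endswith("Z"):
        candidate = candidate[:-1] + "+00:00"
    moment = datetime.fromisoformat(candidate)  # raises ValueError if malformed
    if moment.tzinfo is None:
        raise ValueError("an RFC 3339 date-time requires a time offset")
    return moment
```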

Data necessary for verifying a signature is obtained via the signer URI. The signature scheme defines how to utilize the URI. In general, these URIs SHOULD identify a user account or similar object.

The opaque data object contains the result of applying the signature scheme over an output sequence of octets, generated as follows.

1. Let output be an empty sequence of octets.
2. Let graph be the RDF graph of the piece.
3. Append the piece identifier to output.
4. Append the URI of the signer to output.
5. Append the time of expiration to output.
6. Append the scheme identifier to output.
7. Encode graph into its `application/prs.inumi.rdg-graph` representation [RDG-GRAPH] and append the result to output.
8. Return output.

This algorithm permits re-encoding of the RDF graph into another representation without invalidating a signature.
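The generation of the signed octet sequence can be sketched as below. Two points are assumptions, as the steps do not state them: character strings are turned into octets with UTF-8, and the fields are appended without any delimiter. The `encode_graph` callback is a hypothetical placeholder for an encoder of the `application/prs.inumi.rdg-graph` representation, which is defined elsewhere.

```python
from typing import Callable, TypeVar

Graph = TypeVar("Graph")


def signing_input(
    piece_id: str,
    signer_uri: str,
    expires: str,
    scheme: str,
    graph: Graph,
    encode_graph: Callable[[Graph], bytes],
) -> bytes:
    """Build the octet sequence over which a signature scheme is applied."""
    output = bytearray()                     # step 1
    output += piece_id.encode("utf-8")       # step 3 (UTF-8 is an assumption)
    output += signer_uri.encode("utf-8")     # step 4
    output += expires.encode("utf-8")        # step 5
    output += scheme.encode("utf-8")         # step 6
    output += encode_graph(graph)            # step 7
    return bytes(output)                     # step 8
```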

Resolution / Interface

The interface of a resolver has two functions: retrieval and submission.

This document does not define any resolution mechanism for identifiers without a path component (for resolving graphs).

Both of these functions take a descriptor identifier as input. Parameters are extracted from the query component of the identifier and then the URI is normalized, removing the query component.

Graph data is stored at Resource-descriptor Graph Endpoints. Endpoints are referenced by a URI or a domain name. Documents that define protocols for these endpoints also define the syntax of their URIs and how to obtain a URI given a domain name.

The `rd-graph.home.arpa` domain name

This document defines the DNS domain name `rd-graph.home.arpa`.

It is a name within a residential home network [RFC8375]. Availability of this name is REQUIRED to comply with this document. It is configured locally per site as an opt-in to the RDG system and MUST locate at least one Resource-descriptor Graph Endpoint, according to the specifications of said endpoint.

Common parameters

This section lists parameters for both retrieval and submission.

The `endpoints` parameter

The value of the `endpoints` parameter is a space-separated list of Resource-descriptor-Graph-Endpoint URIs and domain names.

By default, the list consists of the `rd-graph.home.arpa` domain name. If the authority was a domain name, the name is also included in the list.

For example, for `rd://example.com/desc`, the endpoints are `rd-graph.home.arpa` and `example.com`; for `rd://ANONANONANONANONANONANON/desc`, the only endpoint is `rd-graph.home.arpa`; for `rd://example.com/desc?endpoints=example.org+http://example.com/rdg`, the endpoints are as given: `example.org` and `http://example.com/rdg`.
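The derivation of the endpoint list can be sketched as follows. The sketch assumes that `+` in the query stands for a space, as the last example suggests, and that the first `endpoints` parameter wins if several are present; the function name `endpoints_for` is hypothetical.

```python
import re
from typing import List
from urllib.parse import parse_qs, urlsplit

DEFAULT_ENDPOINT = "rd-graph.home.arpa"
_RDGN_RE = re.compile(r"(?:[A-Za-z2-7]{24})+")


def endpoints_for(uri: str) -> List[str]:
    """Return the endpoints to contact for a (not yet normalized) rd URI."""
    parts = urlsplit(uri)
    # parse_qs decodes "+" as a space, which yields the space-separated list.
    given = parse_qs(parts.query).get("endpoints")
    if given:
        return given[0].split()
    endpoints = [DEFAULT_ENDPOINT]
    if not _RDGN_RE.fullmatch(parts.netloc):
        endpoints.append(parts.netloc)  # the authority was a domain name
    return endpoints
```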

Retrieval

Input is a descriptor identifier.

Output is a data object whose format depends on the `format` parameter.

The `format` parameter

The `format` parameter identifies the format of the representation. The value is either a media type or a URI of the data format.

By default, the value is `multipart/prs.inumi.rdg-descriptor`. [RDG-MIME]

The `signers` parameter

The `signers` parameter is a space-separated list of signer URIs.

The output MUST contain only those pieces which are signed by at least one signer whose URI is listed in the `signers` list.

The `schemes` parameter

The `schemes` parameter is a space-separated list of signature schemes.

The output MUST contain only those pieces which have at least one signature generated using a scheme listed in the `schemes` list.
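The effect of the `signers` and `schemes` parameters can be sketched as a filter over the pieces of a descriptor. The in-memory shape used here is an assumption, as is the reading that an absent parameter applies no filter and that, when both are given, each condition is checked independently.

```python
from typing import Dict, Iterable, Optional

# Assumed in-memory shape: piece identifier -> {signer URI: signature scheme}.
Pieces = Dict[str, Dict[str, str]]


def select_pieces(
    pieces: Pieces,
    signers: Optional[Iterable[str]] = None,
    schemes: Optional[Iterable[str]] = None,
) -> Pieces:
    """Keep only the pieces that satisfy the given signers and schemes lists."""
    signer_set = None if signers is None else set(signers)
    scheme_set = None if schemes is None else set(schemes)
    selected: Pieces = {}
    for piece_id, signatures in pieces.items():
        if signer_set is not None and signer_set.isdisjoint(signatures.keys()):
            continue  # no listed signer signed this piece
        if scheme_set is not None and scheme_set.isdisjoint(signatures.values()):
            continue  # no signature uses a listed scheme
        selected[piece_id] = signatures
    return selected
```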

Submission

Input is a descriptor identifier and a descriptor piece.

Output is a list of pairs of an endpoint URI and a character string.

The first character of the string indicates the status. It is one of: success (S), partial (P) or failure (F). Subsequent characters SHOULD contain a human-readable message to the user.

Success means that all submitted data was accepted.

Partial means that only a portion of the submitted data was accepted.

Failure means that none of the submitted data was accepted.

The resolver contacts endpoints one after another and submits the received descriptor piece via the endpoint’s API. For each endpoint, a pair indicating the result is appended to the list.

Endpoint behaviour

Endpoints MUST process the submitted descriptor piece as follows.

A piece with an empty graph is interpreted as a piece reference, in case the protocol used does not allow for explicit references.

If the graph in the piece is a reference:

1. Find a descriptor by its identifier; if not found, return failure.
2. Find a piece by reference; if not found, return failure.
3. Verify the signatures in the submitted piece over the referenced graph.
4. Discard all signatures that failed to verify.
5. Return failure if no signatures remain.
6. Update the signatures under the piece with the submitted ones.
7. Return success (all signatures valid) or partial.

Otherwise (if the graph is not a reference):

1. Find a descriptor by its identifier; if not found, create it.
2. Find a piece with an identifier equal to that of the submitted piece; if found, return failure.
3. Verify the signatures in the submitted piece over the submitted graph.
4. Discard all signatures that failed to verify.
5. Return failure if no signatures remain.
6. Insert the piece into the descriptor.
7. Return success (all signatures valid) or partial.
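Both procedures can be sketched with a small in-memory endpoint. The class and its storage layout are hypothetical, signatures are simplified to opaque blobs, and verification is delegated to a caller-supplied `verify` callback because no signature scheme is defined here. As a design choice of this sketch, a new descriptor is only created once a piece is actually inserted, so that a failed submission leaves no empty descriptor behind.

```python
from typing import Callable, Dict


class Endpoint:
    def __init__(self, verify: Callable[[object, str, bytes], bool]):
        self.verify = verify  # verify(graph, signer URI, signature) -> bool
        # descriptor identifier -> piece identifier -> {"graph", "signatures"}
        self.store: Dict[str, Dict[str, dict]] = {}

    def submit(self, descriptor_id: str, piece_id: str, graph,
               signatures: Dict[str, bytes]) -> str:
        """Process a submitted piece and return "S", "P" or "F"."""
        is_reference = not graph  # an empty graph is a piece reference
        if is_reference:
            descriptor = self.store.get(descriptor_id)
            if descriptor is None or piece_id not in descriptor:
                return "F"                                    # steps 1-2
            target_graph = descriptor[piece_id]["graph"]
        else:
            if piece_id in self.store.get(descriptor_id, {}):
                return "F"                                    # step 2
            target_graph = graph
        valid = {signer: sig for signer, sig in signatures.items()
                 if self.verify(target_graph, signer, sig)}   # steps 3-4
        if not valid:
            return "F"                                        # step 5
        if is_reference:
            descriptor[piece_id]["signatures"].update(valid)  # step 6
        else:
            self.store.setdefault(descriptor_id, {})[piece_id] = {
                "graph": graph, "signatures": dict(valid)}    # steps 1 and 6
        return "S" if len(valid) == len(signatures) else "P"  # step 7
```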

Note that a new signature may have an expiration time equal to the current time or in the past, which effectively revokes it.

Pieces that have expired SHOULD be removed and those that have not SHOULD be kept until they do.

It is up to the specific endpoint when and which pieces are kept. Users have no guarantee that endpoints will keep storing their data. It is most desirable that users have their own endpoints, instead of relying on a third party for storing and serving the data.

Security considerations

RDGN registries

Registries SHOULD be protected against flood registration attacks. Such an attack would unnecessarily pollute the RDGN space.

Registries MAY also deregister numbers for which no information is available within the network in which they operate, provided that the numbers were not registered by means of interchanging data with a registry from another network. For numbers that were registered that way, the other network MUST be contacted to verify the RDGN status.

Endpoints

If an endpoint stores RDF graph data in the submitted format, it ought to remove all comments in the serialization, in order to prevent malicious users from using the endpoint as data storage by including comments with arbitrary data.

The statements in the graph SHOULD also be processed and validated. This document recommends putting pieces from unrecognized users into a quarantine to be later reviewed by a human being. Data from trusted users could skip the quarantine in general, but it is recommended to also quarantine it once in a while.

To be written.