Decentralized identity

Current state, its issues and our choices

Subject identifiers, DIDs and VCs

The goal of digital identity is to avoid collisions between identifiers (and their related attributes) by assigning unique identifiers. Various kinds of identifiers exist, from opaque to global identifiers (see subject_identifiers for more details on the types we support within IETF GNAP).

The need for decentralized public key infrastructure (DPKI) was introduced by organizations such as Sovrin and, more recently, the W3C and DIF. The identity model is organized around DIDs (decentralized identifiers), an attempt to provide globally resolvable identifiers as an alternative to the federated identity model (OpenID Connect, for instance). The interested reader may refer to the DID primer for an introduction.

One of the core problems with previous identity systems was the requirement for a centralized database (such as Active Directory) to store these unique identifiers. At the start of these projects, blockchain technology appeared both to guarantee that identifiers do not collide and to remove the need for centralized control, while enabling a seemingly infinite number of identifiers to be issued (in practice, many DID methods exist). For instance, the Sidetree protocol anchors identity proofs on top of Bitcoin, and is used by Microsoft. DIDs can also be used for messaging, with DIDComm as an alternative to email, for instance.

The W3C further proposes Verifiable Credentials, i.e. “a tamper-evident credential that has authorship that can be cryptographically verified”, or in other words, a message signed by a reputable issuer using cryptographic algorithms. However, rather than simply signing a byte string or a JSON document, Verifiable Credentials (VCs) present a linked data model for the idea of claims, which are any list of attributes and values pertaining to a subject, the “entity about which claims are made”.

Example of a simple Verifiable Credential, using JSON-LD:

{
  "@context": [
    "https://www.w3.org/2018/credentials/v1",
    "https://www.w3.org/2018/credentials/examples/v1"
  ],
  "id": "http://example.edu/credentials/3732",
  "type": ["VerifiableCredential", "UniversityDegreeCredential"],
  "credentialSubject": {
    "id": "did:example:ebfeb1f712ebc6f1c276e12ec21",
    "degree": {
      "type": "BachelorDegree",
      "name": "Bachelor of Science and Arts"
    }
  },
  "proof": { ... }
}
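
To make the subject/claims split concrete, here is a minimal Python sketch that reads a trimmed copy of the credential above (without @context and proof) and separates the subject's DID from the claims made about it. The helper name split_did is hypothetical, and the parsing only assumes the generic did:<method>:<method-specific-id> layout; it is not a full DID syntax validator.

```python
import json
from typing import Tuple

# Trimmed copy of the example credential (no @context, no proof).
VC_JSON = """
{
  "id": "http://example.edu/credentials/3732",
  "type": ["VerifiableCredential", "UniversityDegreeCredential"],
  "credentialSubject": {
    "id": "did:example:ebfeb1f712ebc6f1c276e12ec21",
    "degree": {"type": "BachelorDegree", "name": "Bachelor of Science and Arts"}
  }
}
"""


def split_did(did: str) -> Tuple[str, str]:
    """Split 'did:<method>:<method-specific-id>' into (method, identifier).

    Hypothetical helper: it checks the overall shape only.
    """
    parts = did.split(":", 2)
    if len(parts) != 3 or parts[0] != "did" or not parts[1] or not parts[2]:
        raise ValueError(f"not a DID: {did!r}")
    return parts[1], parts[2]


credential = json.loads(VC_JSON)
subject = credential["credentialSubject"]

# The subject is identified by its DID; every other key is a claim about it.
method, identifier = split_did(subject["id"])
claims = {key: value for key, value in subject.items() if key != "id"}

print(method, identifier)
print(claims)
```

Nothing in this sketch verifies the credential: the proof is left out, so the claims are only as trustworthy as the channel that delivered them.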

Should JSON-LD be used for identity?

Short answer: we think it's a bad idea. While semantic web technologies are very useful in the area of knowledge graphs, their diffusion has stalled due to the complexity of the underlying technologies, i.e. RDF and web ontologies (such as OWL). More importantly, the widespread use of JSON-LD in decentralized identity standards seems odd. Here's why.

JSON-LD is justified as the technology required to resolve the claim document from the DID (as explained in this video). JSON-LD is organised around @context, which maps the terms of a document to URIs (Uniform Resource Identifiers, such as http://example.org), so that the same graph can be serialized in different ways. The problem is that it isn't targeted at security or privacy applications:

  • A known problem is that there is no way to enforce the privacy of any attributes attached to the Verifiable Credentials produced by service end-points accessed via DID documents and DIDs. It is only stated that “it is strongly recommended that DID documents contain no PII”, but correlation is likely to happen in practice. One-time use DIDs are not enough, as the W3C DID standard notes: “the anti-correlation protections of pseudonymous DIDs are easily defeated if the data in the corresponding DID Documents can be correlated”.

  • Linked Data puts security behind semantics (aka RDF expansion and normalization first, signature verification second). Using hash links on the schema.org schema does not guarantee the proper form of immutability that ensures an issued VC is verifiable.

  • Linked Data doesn't correspond to a real world representation of trust. When an issuer generates a VC from a document verification or a live face detection algorithm, the outcome generally isn't a binary decision. Linked Data provides a false sense of assurance that assertions from different issuers can be chained, but it isn't able to account for uncertainties and the risks related to profile validation.

  • VC issuance is based on trusting organizational issuers, but there are still very few advances in cross-corporate identity management (except for preliminary work being done on vLEI by GLEIF, based on KERI).

  • Issues with TLS are problematic.
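
The @context mapping described before this list, and the "semantics first" concern raised in it, can be illustrated with a toy Python sketch. It is not a JSON-LD processor: the expand function is a hypothetical helper that only rewrites top-level keys, and the term names, the URIs under http://example.org and the value "Alice" are made up for illustration.

```python
def expand(document: dict, context: dict) -> dict:
    """Replace each short key with the URI the context maps it to.

    Toy stand-in for JSON-LD expansion: top-level keys only.
    """
    return {context.get(key, key): value for key, value in document.items()}


# Two issuers use different short names for the same property URIs.
context_a = {"name": "http://example.org/name", "degree": "http://example.org/degree"}
context_b = {"nom": "http://example.org/name", "diplome": "http://example.org/degree"}

doc_a = {"name": "Alice", "degree": "BachelorDegree"}
doc_b = {"nom": "Alice", "diplome": "BachelorDegree"}

# Different serializations, same statements once expanded.
same_graph = expand(doc_a, context_a) == expand(doc_b, context_b)

# The same document under a different context makes a different statement,
# which is why what a verifier checks depends on how @context is resolved.
context_c = {"name": "http://example.org/nickname", "degree": "http://example.org/degree"}
meaning_changed = expand(doc_a, context_c) != expand(doc_a, context_a)

print(same_graph, meaning_changed)
```

In this sketch the bytes of doc_a never change, yet its meaning does when the context changes. A verifier therefore has to fix the context before it can say what was signed.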

JSON-LD and RDF are also prone to performance issues (see https://linkeddatafragments.org for the continuum of choices). For our use case on embedded devices, which consume binary formats such as CBOR and cannot always rely on fetching additional online resources, they become impractical.
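
To give a rough sense of why binary formats suit constrained devices, the sketch below hand-encodes a small map in CBOR and compares its size with compact JSON. It is a deliberately tiny encoder written for illustration (unsigned integers, text strings and maps only), not a replacement for a CBOR library, and the sample reading with its "temp" and "unit" keys is hypothetical.

```python
import json
import struct


def _head(major: int, n: int) -> bytes:
    """Encode a CBOR initial byte (major type + argument n)."""
    if n < 24:
        return bytes([(major << 5) | n])
    for info, fmt, limit in ((24, ">B", 1 << 8), (25, ">H", 1 << 16),
                             (26, ">I", 1 << 32), (27, ">Q", 1 << 64)):
        if n < limit:
            return bytes([(major << 5) | info]) + struct.pack(fmt, n)
    raise ValueError("integer too large for CBOR")


def encode(value) -> bytes:
    """Encode unsigned ints, text strings and maps (sketch only)."""
    if isinstance(value, bool):
        raise TypeError("booleans are not handled in this sketch")
    if isinstance(value, int) and value >= 0:
        return _head(0, value)                      # major type 0: unsigned int
    if isinstance(value, str):
        raw = value.encode("utf-8")
        return _head(3, len(raw)) + raw             # major type 3: text string
    if isinstance(value, dict):
        body = b"".join(encode(k) + encode(v) for k, v in value.items())
        return _head(5, len(value)) + body          # major type 5: map
    raise TypeError(f"unsupported type: {type(value).__name__}")


reading = {"temp": 21, "unit": "C"}  # hypothetical device payload
as_cbor = encode(reading)
as_json = json.dumps(reading, separators=(",", ":")).encode("utf-8")

print(len(as_cbor), len(as_json), as_cbor.hex())
```

The encoded message is also self-contained: decoding it requires no remote context document, which matters when a device cannot fetch additional online resources.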

In general, critics have observed that self-sovereign identity concepts lack a broader view of socio-technical impacts.

What do we use then?

IETF GNAP doesn't rely on JSON-LD. Instead, we propose to use:

  • a cryptographic event management layer using DIF KERI (see dedicated section)

  • a schema for attributes for full data points, for instance JSON Schema or Ion Schema (Ion is built by Amazon as an alternative to JSON, and brings the benefit of both text and binary representations)

  • advanced data processing using scalable labeled graphs (such as Dgraph or Nebula) and event processing techniques to account for uncertainty, as well as confidential computing

For compatibility reasons, we still include DIDs as a subject format, through the did:keri method.
