Skip to content

Generic Dataset Assurance Levels

Trust Framework members can indicate the level of assurance in a dataset by describing it with an Assurance Level.

This specification defines generic Assurance Levels, identified by URLs in the IB1 Root Registry with the https://registry.trust.ib1.org/dataset-assurance-level/ prefix.

Dataset Assurance Level 1

https://registry.trust.ib1.org/dataset-assurance-level/Level1

Assurance that:

  • D1.1. The metadata for the dataset is available publicly on the web in a location recorded in the organization’s Trust Framework Directory entry, and conforms to the specification adopted by the Scheme
  • D1.2. The datasets are in machine-readable formats
  • D1.3. The member has the right to share data with the members permitted to access the data under the license specified by the Scheme, either through ownership of the data or by having obtained a license permitting the transfer and subsequent use.
  • D1.4. The dataset contains no personal data and is not subject to UK GDPR, EU GDPR or related UK and European data protection regulations
  • D1.5. For Open Data
    • D1.5.1. The dataset is published on the web with a license that is compatible with Open Data
    • D1.5.2. The metadata specifies IB1-O for the Data Sensitivity Class
    • D1.5.3. Anonymous downloads of open data is strongly preferred, but where the dataset requires compulsory registration before download:
      • D1.5.3.1. Registration is only conditional on completion of a lightweight challenge necessary for technical measures to minimise spam and bot abuse, such as verifying receipt of an email
      • D1.5.3.2. Acceptance of registration is automatic and immediate
      • D1.5.3.3. Registration may only be denied or withdrawn for misuse
      • D1.5.3.4. The registration process does not introduce any barriers to automated downloads or API access. Access to data is identical in all respects to a simple HTTP download of a published URL or API, except for the addition of a static credential or token that does not need renewing
  • D1.6. For Shared Data
    • D1.6.1. The metadata specifies the license, conformance and access control for the data
    • D1.6.2. The metadata specifies IB1-SA or IB1-SB for the Data Sensitivity Class
    • D1.6.3. Access is restricted to Scheme members using the Scheme’s standards for identification and access control

Dataset Assurance Level 2

https://registry.trust.ib1.org/dataset-assurance-level/Level2

The dataset meets all the requirements of Level 1, plus assurance that:

  • D2.1. Legal
    • D2.1.1. The dataset is accompanied by documentation of the process and decision-making justifying how the dataset is originated (including the rights the publisher has in the data), published, accessed, and licensed (for example data triage or equivalent)
    • D2.1.2. Metadata includes licenses for data sources and commercially reasonable citations and/or provenance (processes of data collection, processing and quality assurance)
    • D2.1.3. Metadata is reviewed annually to ensure that the license, API version and data schema meets the latest version of the standards defined in Registry Scheme Catalog Requirements where they are defined and available within the Scheme Registry
  • D2.2. Operational
    • D2.2.1. Metadata includes dates of creation and publication
    • D2.2.2. Where a dataset covers a temporal range, this is defined in the metadata
    • D2.2.3. Where a dataset covers a geographical location or region, this is defined in the metadata
    • D2.2.4. The dataset will be maintained and available for a minimum of one calendar year, except where legal or regulatory requirements require shorter availability or immediate removal
    • D2.2.5. Where a dataset is withdrawn, if required by the Scheme, the member will follow the Scheme’s process for notifying users
    • D2.2.6. Data is documented on publicly available URLs
    • D2.2.7. A mechanism is available for people to provide feedback and ask questions (e.g. human technical support)
    • D2.2.8. Responses to feedback and questions are guaranteed to be provided within a reasonable timeframe that is transparent to users, not exceeding 30 days
  • D2.3. Technical
    • D2.3.1. Data is published in content-appropriate formats that enable data to be used in an interoperable manner by machine-based systems
    • D2.3.2. For Shared Data, the dataset is immediately available via an endpoint implementing the Scheme's secure data interchange requirements to any Scheme-registered application that meets the terms of its license, conformance and access control metadata

Dataset Assurance Level 3

https://registry.trust.ib1.org/dataset-assurance-level/Level3

The dataset meets all the requirements of Level 2, plus assurance that:

  • D3.2. Operational
    • D3.2.1. A schedule is published at a public URL documenting the process of maintaining the data’s availability
    • D3.2.2. Published data is monitored for unauthorised changes and processes in place for restoration within the 99.5% uptime commitment (see below)
    • D3.2.3. A document detailing the dataset provenance, excepting any information that is security- or commercially-sensitive, is authored alongside the dataset, and
      • D3.2.3.1. for Open Data, published via a public URL, accessed with the same method as the dataset,
      • D3.2.3.2. for Shared Data, made available to the specific roles pre-authorised to access the dataset within the relevant Trust Framework and/or Scheme, except where there is no commercially reasonable reason to restrict publication, when it is made available to all roles.
    • D3.2.4. The process for assessing and prioritising data requests or new use cases is published
  • D3.3. Technical
    • D3.3.1. Metadata cites, in a machine-readable format, the underlying content-appropriate open standard(s) used for publishing the dataset
    • D3.3.2. Publication of a single consistent URL (a “permalink”), or clear rules for how URLs are constructed, for accessing the dataset
    • D3.3.3. Machine-readable metadata describing the contents of the dataset is provided (e.g. JSON-LD, CSVW)
    • D3.3.4. Where data is provided by an API, the API has a machine-readable definition (e.g. OpenAPI)
    • D3.3.5. The service hosting the dataset has availability of at least 99.5%

Dataset Assurance Level 4

https://registry.trust.ib1.org/dataset-assurance-level/Level4

The dataset meets all the requirements of Level 3, plus assurance that:

  • D4.1. Legal
    • D4.1.1. The full set of license terms are machine-readable and available at a persistent URL in a consistent manner (e.g. https://creativecommons.org/publicdomain/zero/1.0/)
  • D4.2. Operational
    • D4.2.1. Data quality parameters and processes are published in a machine-readable format at a persistent URL in a consistent manner
  • D4.3. Technical
    • D4.3.1. Provenance is published in a machine-readable format. This means that:
    • D4.3.2. Shared Data provenance is made available, as a minimum, to the specific roles pre-authorised to access the dataset
    • D4.3.3. Open Data provenance is made available to all roles, published via a public URL
    • Uniform Resource Identifiers (URIs) are used as identifiers within data
    • D4.3.4. The service hosting the dataset has availability of at least 99.9%