RFD: Redefining a Reference

Status: Draft
Authors: Jay Jacobs
Created: 2026-02-24
Updated: 2026-02-24
Target: CVE Record Format v6.x (or later)
Related: Discuss Forum RFD (PR #462)
Affected Section: References

Summary

This RFD proposes improving the CVE Record references section (in both CNA and ADP containers) so that references are not merely unlabeled URLs but machine-usable, typed, and contextualized external anchors. The current structure requires only a url, with optional name and optional tags; in practice this often results in lists of raw pointers whose purpose, authority, and content type are unclear to consumers. This proposal explores a replacement/extension of the current reference object to include structured fields such as type, title, publisher, retrieval/publication timestamps, media_type, archive_urls, and status, while maintaining compatibility paths from the current url / name / tags model.

Problem Statement

The current references object in the CVE Record Format is minimally constrained and not sufficiently expressive for either human understanding or machine use. It currently supports three fields:

url (required)
name (optional free text)
tags (optional array; includes defined enums and custom values via x_*)

In practice, references in CVEs are often reduced to a list of unlabeled URLs. When only url is present, the reference is a pointer to unknown content with no structured indication of what the consumer should expect (vendor advisory, technical analysis, patch, issue tracker, etc.), whether the link is authoritative, whether it is still reachable, or whether archival alternatives exist.

This creates problems in at least three categories:

Data quality deficiency: lack of contextualized reference metadata weakens the usefulness and interpretability of references.
Schema expressiveness deficiency: the schema cannot capture common, practical distinctions that consumers care about.
Consumer interoperability deficiency: consumers cannot reliably automate workflows based on references because meaning is hidden in free text, tags, or external page content.

Evidence of current limitations (over the past 12 months as of 2026-02-23)

71% of URLs (92,736 / 130,906) did not include a name value.
70% of CVEs (34,857 / 49,722) did not include name values for any references.
50% of URLs (65,475 / 130,906) did not include a tag value.
48% of CVEs (23,914 / 49,722) did not include tag values for any references.
55% of tags used (786,038 / 1,428,322) were x_* custom tags, indicating high variance and weak standardization.
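
Shares like the ones above could be computed along the following lines. This is a minimal sketch, assuming each CVE record has already been reduced to its list of reference objects; the `missing_field_shares` helper and the `sample` data are hypothetical and are not the tooling used to produce the figures in this RFD. Only the `name` and `tags` field names come from the current schema.

```python
from typing import Iterable


def missing_field_shares(records: Iterable[list[dict]], field: str) -> tuple[float, float]:
    """Return (share of references lacking `field`,
    share of records in which no reference carries `field`)."""
    total_refs = refs_missing = 0
    total_records = records_missing = 0
    for refs in records:
        total_records += 1
        total_refs += len(refs)
        missing = sum(1 for ref in refs if not ref.get(field))
        refs_missing += missing
        if missing == len(refs):  # every reference in this record lacks the field
            records_missing += 1
    ref_share = refs_missing / total_refs if total_refs else 0.0
    record_share = records_missing / total_records if total_records else 0.0
    return ref_share, record_share


# Tiny made-up sample: two records, three references in total.
sample = [
    [{"url": "https://vendor.example/a", "name": "Advisory A"},
     {"url": "https://vendor.example/b"}],
    [{"url": "https://research.example/c"}],
]
ref_share, record_share = missing_field_shares(sample, "name")
print(f"{ref_share:.0%} of references and {record_share:.0%} of records lack a name")
```

Calling the same helper with `"tags"` instead of `"name"` gives the corresponding tag shares.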

Who is affected

CNA / ADP producers: can insert references without context, which pushes the burden onto consumers. CNAs that want to identify and differentiate the authoritative material they produce struggle to separate important references from merely adjacent ones (e.g., a reference to issue-level details vs. a link to a product page vs. a link to a general article that mentions the CVE as an example).
CVE consumers (tool vendors, VM teams, researchers, data platforms): must manually inspect URLs or rely on heuristics to infer meaning and importance.
UI builders and automation workflows: cannot reliably prioritize or label references (e.g., “show vendor advisory first”, “collect patch links”, “attempt CSAF ingestion”).

What happens if we do nothing?

If no changes are made, references will continue to function primarily as a weak pointer list with inconsistent tagging. Consumers will continue to spend manual effort following links to discover their purpose, and CVE records will continue to under-deliver on a key coordination function: structured external anchors that bolster confidence and support downstream automation.

Proposed Solution

This RFD proposes a structured replacement/extension for items in the references array in the CNA and ADP containers. The intent is to preserve the core role of references while making them more machine-usable and more informative for humans.

Design goals

Contextualize references so consumers know what kind of content a URL points to, when it was last active, how to retrieve it (media type), etc.
Improve machine usability for reference selection, retrieval, filtering, prioritization, and general information retrieval automation.
Improve data quality checks by making reference purpose and status explicit.
Fit within the existing CVE record model: everything is contained in a single references section, as an array of reference objects.
Support operational reality where some metadata is unknown or difficult to determine.

Proposed reference object fields

The following fields are proposed for discussion as a new structured reference object shape.

Core identity and content location

url (string - required)
- Single URL field
type (enum - required, draft list below)
- Structured classification of the referenced resource (replacing overreliance on free-text names and tags).
title (string - optional, free text)
- Human-readable label/title for the referenced content.

Publisher metadata

publisher (object)
- name (string) - needs a clear definition: is it the host/domain, the organization that authored the content, the organization that published the page, or the author's name?
- role (enum; draft values are authoritative, non_authoritative)
- domain (string)

Timing metadata

published_at (datetime, optional) OR first_retrieved_at (datetime, optional)
- At least one of these should be supported and available when possible.
last_retrieved_at (datetime)
- Intended to be required in the proposed model (subject to QWG discussion).
- May be set and refreshed by the CVE Program / Secretariat over time.

Content and availability metadata

media_type (string)
- Ideally inferred and populated automatically where feasible, with support for human correction/override.
archive_urls (array of string URLs)
- Intended to be populated by the CVE Program archival process (details out of scope for this RFD).
status (enum; initial draft: reachable, moved, archived_only)
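
The field list above can be pictured as a single typed object. The following is a minimal sketch, assuming Python dataclasses as the modeling tool; the class names, the `problems` method, and the `_is_datetime` helper are hypothetical. Only `url` and `type` are treated as required here, since the final required set is left open by this RFD, and `last_retrieved_at` is kept optional even though it is intended to become required.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional

PUBLISHER_ROLES = {"authoritative", "non_authoritative"}  # draft values
STATUS_VALUES = {"reachable", "moved", "archived_only"}   # initial draft


def _is_datetime(value: str) -> bool:
    """True if `value` parses as an ISO 8601 datetime (a trailing Z is accepted)."""
    try:
        datetime.fromisoformat(value.replace("Z", "+00:00"))
    except ValueError:
        return False
    return True


@dataclass
class Publisher:
    name: Optional[str] = None    # exact meaning still to be defined
    role: Optional[str] = None
    domain: Optional[str] = None


@dataclass
class Reference:
    url: str                                  # required
    type: str                                 # required; vocabulary is still a draft
    title: Optional[str] = None
    publisher: Optional[Publisher] = None
    published_at: Optional[str] = None
    first_retrieved_at: Optional[str] = None
    last_retrieved_at: Optional[str] = None   # intended to be required, pending discussion
    media_type: Optional[str] = None
    archive_urls: list[str] = field(default_factory=list)
    status: Optional[str] = None

    def problems(self) -> list[str]:
        """List violations of the draft constraints; an empty list means none found."""
        found = []
        if not self.url:
            found.append("url is empty")
        if self.status is not None and self.status not in STATUS_VALUES:
            found.append(f"unknown status: {self.status}")
        role = self.publisher.role if self.publisher else None
        if role is not None and role not in PUBLISHER_ROLES:
            found.append(f"unknown publisher role: {role}")
        for name in ("published_at", "first_retrieved_at", "last_retrieved_at"):
            value = getattr(self, name)
            if value is not None and not _is_datetime(value):
                found.append(f"{name} is not an ISO 8601 datetime: {value}")
        return found
```

Keeping every field other than `url` and `type` optional reflects the design goal of supporting cases where metadata is unknown or difficult to determine.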

Required fields (intentionally unresolved for discussion)

This initial draft does not fix the final required field set. The RFD should invite QWG discussion on the value and burden of each field before finalizing minimum requirements.

`type` vocabulary (draft for discussion)

This RFD proposes introducing a single explicit type field for references. The starting point for discussion includes current tag semantics and possible consolidation. The submitter specifically recommends:

no x_* extensions for type, at least until presentation can be separated from storage and a version of the data exists with all x_* values stripped.
include a single other value instead, denoting that the reference was evaluated and deemed to match none of the defined types.

Draft values for discussion (mix of current and possible future simplifications):

broken-link
customer-entitlement
exploit
government-resource
issue-tracking
mailing-list
mitigation
not-applicable
patch
permissions-required
media-coverage
product
related
release-notes
signature
technical-description
third-party-advisory
vendor-advisory
vdb-entry
other
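
The two recommendations above (no x_* extensions, a single other value) could be enforced with a check like the following. This is a minimal sketch: the `check_type` function is hypothetical, and `DRAFT_TYPES` simply copies the draft list above, which is itself still under discussion.

```python
DRAFT_TYPES = frozenset({
    "broken-link", "customer-entitlement", "exploit", "government-resource",
    "issue-tracking", "mailing-list", "mitigation", "not-applicable", "patch",
    "permissions-required", "media-coverage", "product", "related",
    "release-notes", "signature", "technical-description",
    "third-party-advisory", "vendor-advisory", "vdb-entry", "other",
})


def check_type(value: str) -> str:
    """Return `value` if it is an allowed draft type, otherwise raise ValueError."""
    if value.startswith("x_"):
        # Unlike tags, type would not accept custom extensions.
        raise ValueError(f"x_* extensions are not allowed for type: {value!r}")
    if value not in DRAFT_TYPES:
        raise ValueError(f"unknown type {value!r}; use 'other' if no listed type applies")
    return value
```

For example, `check_type("patch")` returns `"patch"`, while `check_type("x_internal-note")` raises `ValueError`.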

Alternative consolidation candidates to discuss:

advisory - (note: removing the “third-party” vs “vendor” distinction)
technical_analysis
commit
issue_tracker
bulletin
vendor_notice
blog_post
exploit_writeup
forum_post (discussion?)

And specific doc types to consider:

csaf
cvrf
vulnerability_record - (e.g., osv/ghsa/gcve - should we have a way to explicitly reference a structured data resource from another registry?)
(there are definitely more)

Relationship to existing `tags`

This RFD does not yet finalize the fate of tags. It should explicitly discuss:

whether tags are retained for legacy purposes, slowly deprecated, or immediately replaced
how the existing tags map to type, status, or other structured fields
whether some existing tags reflect semantics better represented elsewhere
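
To make the mapping question concrete, one possible migration rule is sketched below. Everything in it is an assumption for discussion rather than a proposed rule: the `migrate_reference` function is hypothetical, `name` is carried over as `title`, the first legacy tag that appears in the draft type vocabulary becomes `type`, and all other tags are retained. When no tag matches, `type` is left unset rather than set to `other`, because `other` is meant to signal that the reference was actually evaluated.

```python
# Subset of the draft type vocabulary, shortened for brevity.
DRAFT_TYPES = frozenset({"vendor-advisory", "third-party-advisory", "patch",
                         "issue-tracking", "technical-description", "other"})


def migrate_reference(legacy: dict, draft_types: frozenset = DRAFT_TYPES) -> dict:
    """Map a current url/name/tags reference onto the proposed shape (illustrative only)."""
    migrated = {"url": legacy["url"]}
    if legacy.get("name"):
        migrated["title"] = legacy["name"]
    tags = legacy.get("tags", [])
    # The first tag that is also a draft type is promoted to `type`.
    chosen = next((tag for tag in tags if tag in draft_types), None)
    if chosen is not None:
        migrated["type"] = chosen
    # Remaining tags (including x_* extensions) are kept until their fate is decided.
    leftover = [tag for tag in tags if tag != chosen]
    if leftover:
        migrated["tags"] = leftover
    return migrated
```

A rule like this leaves open what to do with references that carry several standard tags, which is one of the cases the discussion would need to settle.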

Examples

The examples below are illustrative and intended to support QWG discussion of structure and semantics.

Example 1: Current vs proposed (minimal contextualization)

Current style (today)

{
  "references": [
    {
      "url": "https://vendor.example/advisories/abc-2026"
    }
  ]
}

Proposed style (draft minimal)

{
  "references": [
    {
      "url": "https://vendor.example/advisories/abc-2026",
      "type": "advisory",
      "title": "Acme Advisory ABC-2026",
      "publisher": {
        "name": "Example Vendor",
        "role": "authoritative",
        "domain": "vendor.example"
      },
      "retrieved_at": "2026-02-24T18:00:00Z",
      "status": "reachable"
    }
  ]
}

Example 2: Third-party technical analysis blog

{
  "references": [
    {
      "url": "https://research.example/blog/deep-analysis-of-cve-2026-9999",
      "type": "technical-description",
      "title": "Deep Analysis of CVE-2026-9999",
      "publisher": {
        "name": "Research Example",
        "role": "non_authoritative",
        "domain": "research.example"
      },
      "published_at": "2026-01-12T09:30:00Z",
      "retrieved_at": "2026-02-24T18:00:00Z",
      "media_type": "text/html",
      "status": "reachable"
    }
  ]
}

Example 3: Git commit / patch

{
  "references": [
    {
      "url": "https://github.com/example/project/commit/abcdef123456",
      "type": "patch",
      "title": "Fix bounds check in parser",
      "publisher": {
        "name": "example/project",
        "role": "authoritative",
        "domain": "github.com"
      },
      "retrieved_at": "2026-02-24T18:00:00Z",
      "media_type": "text/html",
      "status": "reachable"
    }
  ]
}

Example 4: Removed URL with archive URL

{
  "references": [
    {
      "url": "https://nonexistant.example/security/advisory-123",
      "type": "vendor-advisory",
      "title": "Security Advisory 123",
      "publisher": {
        "name": "Old Example Vendor",
        "role": "authoritative",
        "domain": "old.example"
      },
      "first_retrieved_at": "2024-03-01T00:00:00Z",
      "retrieved_at": "2026-02-24T18:00:00Z",
      "status": "archived_only",
      "archive_urls": [
        "https://archive.example/snapshots/old-example-advisory-123"
      ]
    }
  ]
}
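
A consumer could use status and archive_urls from Example 4 to decide which URL to fetch. The following is a minimal sketch; the `retrieval_candidates` function is hypothetical, and treating an absent status as unspecified (rather than unreachable) is an assumption taken from the unresolved questions in this RFD, not a settled rule.

```python
def retrieval_candidates(ref: dict) -> list[str]:
    """Return the URLs to try, in order, when fetching a reference's content."""
    archives = list(ref.get("archive_urls", []))
    if ref.get("status") == "archived_only":
        # The original URL is recorded as gone, so only archived copies are useful.
        return archives
    # "reachable", "moved", or absent (treated as unspecified): try the URL first.
    return [ref["url"], *archives]


removed = {
    "url": "https://nonexistant.example/security/advisory-123",
    "status": "archived_only",
    "archive_urls": ["https://archive.example/snapshots/old-example-advisory-123"],
}
print(retrieval_candidates(removed))
```

How a consumer should handle `moved` is not defined in this draft; the sketch simply tries the recorded URL first.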

Impact Assessment

Expected benefits

Improved consumer usability: explicit type and status support filtering, prioritization, and automation.
Better UI/UX: references can be labeled consistently in tools and portals.
Improved data quality checks: records with only low-value or non-authoritative references can be detected more easily.
Better archival resilience: structured status and archive_urls support long-term utility of records.
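
The filtering and prioritization benefit can be illustrated with a short consumer-side sketch. The `TYPE_PRIORITY` order, the `prioritize` function, and the `urls_of_type` function are all hypothetical; this RFD does not prescribe a display order, and the type values are taken from the draft vocabulary.

```python
# Hypothetical display order; types not listed here sort last.
TYPE_PRIORITY = ["vendor-advisory", "patch", "third-party-advisory", "technical-description"]


def prioritize(refs: list[dict]) -> list[dict]:
    """Sort references by type priority, with authoritative publishers first within a type."""
    def sort_key(ref: dict) -> tuple[int, bool]:
        ref_type = ref.get("type")
        rank = TYPE_PRIORITY.index(ref_type) if ref_type in TYPE_PRIORITY else len(TYPE_PRIORITY)
        authoritative = ref.get("publisher", {}).get("role") == "authoritative"
        return (rank, not authoritative)  # False sorts before True
    return sorted(refs, key=sort_key)


def urls_of_type(refs: list[dict], ref_type: str) -> list[str]:
    """Collect the URLs of all references with the given type (e.g., patch links)."""
    return [ref["url"] for ref in refs if ref.get("type") == ref_type]


refs = [
    {"url": "https://research.example/blog/deep-analysis-of-cve-2026-9999",
     "type": "technical-description", "publisher": {"role": "non_authoritative"}},
    {"url": "https://github.com/example/project/commit/abcdef123456",
     "type": "patch", "publisher": {"role": "authoritative"}},
    {"url": "https://vendor.example/advisories/abc-2026",
     "type": "vendor-advisory", "publisher": {"role": "authoritative"}},
]
ordered = prioritize(refs)
patch_links = urls_of_type(refs, "patch")
```

With today's url / name / tags model, the same behavior would require heuristics over free text or inspection of the linked page.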

Risks / costs (to be refined during QWG discussion)

Increased producer burden if too many fields are required at publication time.
Inconsistent classification for type and publisher.role without clear guidance; validation might be enhanced with LLMs and autoclassification approaches.
False precision in dates (published_at) or media_type if populated heuristically (e.g., “I think this was published last May” becomes “2024-05-01T00:00:00Z”).
Implementation complexity if CVE Services automates status checks, archive handling, or media type inference.

Compatibility and Migration

This section is intentionally incomplete in this initial draft and should be completed based on QWG discussion.

Success Metrics

To be finalized in a later revision of this RFD.

Proposed evaluation timeline

To be finalized in a later revision of this RFD.

Candidate success criteria

To be finalized in a later revision of this RFD.

Supporting Data or Research

There is substantial incompleteness and inconsistency in current references usage (missing name, missing tags, heavy x_* tag usage, see above) across a large sample of recent CVE records.

What the CNA Rules say about references

CNA Operational Rules v4.1.0

A CVE Record must include at least one public reference

CNA Rules §5.1.10 says a CVE Record MUST contain at least one public reference, and that reference MUST NOT be the CVE Record itself.

The public reference must exist before or at publication time

CNA Rules §5.3.1 says CNAs MUST ensure a public reference exists on the internet before or concurrently with publication of the CVE Record. It also states a CVE Record MUST NOT be the first public disclosure of the vulnerability.

If there are multiple public references, include the most freely available one

CNA Rules §5.3.1.2 says if multiple public references exist, CNAs MUST include the most freely available public reference in the CVE Record.

CNAs should think about long-term availability / archival

CNA Rules §5.3.2 says CNAs SHOULD consider long-term availability of public references, including archival services (e.g., Internet Archive) or other mechanisms determined by the CVE Program.

At least one reference must meet quality/access criteria

CNA Rules §5.3.3 says at least one public reference:
- SHOULD NOT require registration/login (§5.3.3.1)
- SHOULD NOT impose restrictive terms that conflict with CVE Program use (§5.3.3.2)
- MUST contain info about the specific vulnerability (§5.3.3.3)
- MUST provide minimum information supporting the CVE Record content (§5.3.3.4)
- MUST NOT be the CVE Record itself (§5.3.3.5)

What the schema says about references (format / fields / limits)

“references” is required in the CNA container

In the CVE schema docs, the CNA container’s references field is marked Required and is an array. It must contain at least 1 item and at most 512 items, and items must be unique.

Each reference item is a constrained object

Each reference entry is an object with No Additional Properties (i.e., schema-constrained shape).

“url” is the only required field in each reference item

Within a reference item, url is Required; name and tags are optional. The schema describes url as the URL used to retrieve the referenced resource.

“name” is optional and user-created

The schema defines optional name as a user-created name for the reference, often the page title.

“tags” are optional, but if present they are structured

If tags is present, it is an array with min 1 item, unique values, and items must be either:
- a standard enum from reference-tags.json, or
- an extension tag (tagExtension, i.e., x_...)

The schema docs also enumerate the standard reference tags (e.g., vendor-advisory, patch, issue-tracking, technical-description, vdb-entry, etc.).
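
The constraints summarized in this section can be expressed as a small checker. This is a minimal sketch that covers only the limits listed above; the `check_references` function is hypothetical, the real JSON schema may impose further constraints (for example on URL format), and `STANDARD_TAGS` holds only the example tags named above rather than the full contents of reference-tags.json.

```python
ALLOWED_KEYS = {"url", "name", "tags"}
# Only the example tags named above; the full enum lives in reference-tags.json.
STANDARD_TAGS = {"vendor-advisory", "patch", "issue-tracking",
                 "technical-description", "vdb-entry"}


def check_references(refs: list[dict], standard_tags: set = STANDARD_TAGS) -> list[str]:
    """Return a list of violations of the current references constraints."""
    problems = []
    if not 1 <= len(refs) <= 512:
        problems.append(f"expected 1 to 512 references, got {len(refs)}")
    for index, ref in enumerate(refs):
        if ref in refs[:index]:
            problems.append(f"reference {index} duplicates an earlier item")
        extra = set(ref) - ALLOWED_KEYS
        if extra:
            problems.append(f"reference {index} has additional properties: {sorted(extra)}")
        if "url" not in ref:
            problems.append(f"reference {index} is missing url")
        if "tags" in ref:
            tags = ref["tags"]
            if not tags:
                problems.append(f"reference {index}: tags must contain at least 1 item")
            if len(set(tags)) != len(tags):
                problems.append(f"reference {index}: tags must be unique")
            for tag in tags:
                # A tag is either a standard enum value or an x_ extension.
                if tag not in standard_tags and not tag.startswith("x_"):
                    problems.append(f"reference {index}: unknown tag {tag!r}")
    return problems
```

Note that a reference carrying any of the proposed fields (such as type or status) would fail the additional-properties check today, which is why backward compatibility needs explicit treatment.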

Related Issues or Proposals

Pull Request #462 (add a field to the reference type)

Unresolved Questions

What semantics should consumers apply when metadata fields are absent (e.g., absent status = unspecified, not unreachable)? Should absent fields be read as unaddressed?
Does retrieved_at reflect CNA/ADP retrieval, registry retrieval, or either (with provenance implied by container/source)?
What is the final required field set for a structured reference object?
What should the final type vocabulary be, and should it consolidate current tag semantics?
Should existing tags be retained, deprecated, or replaced?
What exactly does publisher represent (host vs author vs publisher org)?
How much automated enrichment is feasible without creating operational burden?
- Who sets and maintains retrieved_at, status, archive_urls, and media_type (CNA/ADP vs CVE Program)?
How should backward compatibility be handled in schema and CVE Services?

Future Possibilities

This RFD is intentionally scoped as a targeted improvement to the CVE Record references section. Future work may explore:

Stronger normalization of external resource types and publisher roles
More explicit relationships between references and specific CVE record assertions (analyzed_by, described_by, announced_by, fixed_by (for commits/patch refs), mentions, disputes, etc.)
Richer provenance for reference metadata changes
Cross-field consistency checks between reference types and other record content (e.g., affectedness, remediation, vendor metadata)
- There is a reference for a patch, but the record has no patch information or claims there is no patch.
- A reference claims its publisher is “authoritative”, but the record never mentions that company (vendor).
