- Status: Draft
- Authors: Jay Jacobs
- Created: 2026-02-24
- Updated: 2026-02-24
- Target: CVE Record Format v6.x (or later)
- Related: Discuss Forum RFD (PR #462)
- Affected Section: References
This RFD proposes improving the CVE Record references section (in both CNA and ADP containers) so that references are not merely unlabeled URLs but machine-usable, typed, and contextualized external anchors. The current structure requires only a url, with optional name and optional tags; in practice this often results in lists of raw pointers whose purpose, authority, and content type are unclear to consumers. This proposal explores a replacement/extension of the current reference object to include structured fields such as type, title, publisher, retrieval/publication timestamps, media_type, archive_urls, and status, while maintaining compatibility paths from the current url / name / tags model.
The current references object in the CVE Record Format is minimally constrained and not sufficiently expressive for either human understanding or machine use. It currently supports three fields:
- url (required)
- name (optional free text)
- tags (optional array; includes defined enums and custom values via x_*)
In practice, references in CVEs are often reduced to a list of unlabeled URLs. When only url is present, the reference is a pointer to unknown content with no structured indication of what the consumer should expect (vendor advisory, technical analysis, patch, issue tracker, etc.), whether the link is authoritative, whether it is still reachable, or whether archival alternatives exist.
This creates problems in at least three categories:
- Data quality deficiency: lack of contextualized reference metadata weakens the usefulness and interpretability of references.
- Schema expressiveness deficiency: the schema cannot capture common, practical distinctions that consumers care about.
- Consumer interoperability deficiency: consumers cannot reliably automate workflows based on references because meaning is hidden in free text, tags, or external page content.
- 71% of URLs (92,736 / 130,906) did not include a name value.
- 70% of CVEs (34,857 / 49,722) did not include name values for any references.
- 50% of URLs (65,475 / 130,906) did not include a tag value.
- 48% of CVEs (23,914 / 49,722) did not include tag values for any references.
- 55% of tag values (786,038 / 1,428,322) were x_* custom tags, indicating high variance and weak standardization.
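Figures of this kind can be recomputed from any sample of records. The following is a minimal sketch, assuming each record is a parsed CVE JSON document with references under containers.cna.references; the function name and the returned keys are illustrative, and this is not necessarily how the numbers above were produced.

```python
from typing import Any


def reference_coverage(records: list[dict[str, Any]]) -> dict[str, float]:
    """Share of references, and of records, that lack `name` or `tags`."""
    total_refs = refs_without_name = refs_without_tags = 0
    records_without_any_name = records_without_any_tags = 0

    for record in records:
        # Assumption: only the CNA container is inspected.
        refs = record.get("containers", {}).get("cna", {}).get("references", [])
        named = [bool(ref.get("name")) for ref in refs]
        tagged = [bool(ref.get("tags")) for ref in refs]

        total_refs += len(refs)
        refs_without_name += named.count(False)
        refs_without_tags += tagged.count(False)
        # A record counts here when none of its references carries the field.
        records_without_any_name += not any(named)
        records_without_any_tags += not any(tagged)

    n_refs = max(total_refs, 1)        # avoid division by zero on empty input
    n_records = max(len(records), 1)
    return {
        "urls_without_name": refs_without_name / n_refs,
        "urls_without_tags": refs_without_tags / n_refs,
        "cves_without_any_name": records_without_any_name / n_records,
        "cves_without_any_tags": records_without_any_tags / n_records,
    }
```

Counting ADP containers as well would only require extending the lookup of `refs`.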
- CNA / ADP producers: can add references without context, which pushes the burden onto consumers. CNAs that want to identify and differentiate the authoritative material they produce struggle to separate important references from merely adjacent ones (e.g., a reference to issue-level details vs a link to a product page vs a link to a general article that mentions the CVE as an example).
- CVE consumers (tool vendors, VM teams, researchers, data platforms): must manually inspect URLs or rely on heuristics to infer meaning and importance.
- UI builders and automation workflows: cannot reliably prioritize or label references (e.g., “show vendor advisory first”, “collect patch links”, “attempt CSAF ingestion”).
If no changes are made, references will continue to function primarily as a weak pointer list with inconsistent tagging. Consumers will continue to spend manual effort following links to discover their purpose, and CVE records will continue to under-deliver on a key coordination function: structured external anchors that bolster confidence and support downstream automation.
This RFD proposes a structured replacement/extension for items in the references array in the CNA and ADP containers. The intent is to preserve the core role of references while making them more machine-usable and more informative for humans.
- Contextualize references so consumers know what kind of content a URL points to, when it was last active, how to retrieve it (media type), etc.
- Improve machine usability for reference selection, retrieval, filtering, prioritization, and general information retrieval automation.
- Improve data quality checks by making reference purpose and status explicit.
- Fit within the existing CVE record model: everything is contained in a single references section as an array of reference objects.
- Support operational reality where some metadata is unknown or difficult to determine.
The following fields are proposed for discussion as a new structured reference object shape.
- url (string - required)
  - Single URL field.
- type (enum - required, draft list below)
  - Structured classification of the referenced resource (replacing overreliance on free-text names and tags).
- title (string - optional, free text)
  - Human-readable label/title for the referenced content.
- publisher (object)
  - name (string) - needs a clear definition: is it the host/domain, the org that authored the content, the org that published the page, or the author's name?
  - role (enum; draft values are authoritative, non_authoritative)
  - domain (string)
- published_at (datetime, optional) OR first_retrieved_at (datetime, optional)
  - At least one of these should be supported and available when possible.
- last_retrieved_at (datetime)
  - Intended to be required in the proposed model (subject to QWG discussion).
  - May be set and refreshed by the CVE Program / Secretariat over time.
- media_type (string)
  - Ideally inferred and populated automatically where feasible, with support for human correction/override.
- archive_urls (array of string URLs)
  - Intended to be populated by the CVE Program archival process (details out of scope for this RFD).
- status (enum; initial draft: reachable, moved, archived_only)
This initial draft does not fix the final required field set. The RFD should invite QWG discussion on the value and burden of each field before finalizing minimum requirements.
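To make the proposed shape concrete, here is a minimal sketch of the draft fields as Python dataclasses. The class names and the `to_record_json` helper are hypothetical; only url and type are treated as required here, and every other field is left optional because the required set is still open.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Optional


@dataclass
class Publisher:
    name: Optional[str] = None
    role: Optional[str] = None    # draft values: "authoritative", "non_authoritative"
    domain: Optional[str] = None


@dataclass
class Reference:
    url: str
    type: str
    title: Optional[str] = None
    publisher: Optional[Publisher] = None
    published_at: Optional[str] = None        # RFC 3339 timestamp string
    first_retrieved_at: Optional[str] = None
    last_retrieved_at: Optional[str] = None
    media_type: Optional[str] = None
    archive_urls: list[str] = field(default_factory=list)
    status: Optional[str] = None  # draft values: "reachable", "moved", "archived_only"


def _prune(value: Any) -> Any:
    """Recursively drop keys whose value is None or an empty container."""
    if isinstance(value, dict):
        pruned = {key: _prune(item) for key, item in value.items()}
        return {k: v for k, v in pruned.items() if v not in (None, [], {})}
    return value


def to_record_json(ref: Reference) -> dict[str, Any]:
    """Serialize a reference, omitting fields that were never set."""
    return _prune(asdict(ref))
```

Omitting unset fields, rather than emitting nulls, keeps "unknown" distinguishable from an explicit value, which matters for the absent-field semantics this RFD leaves open.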
This RFD proposes introducing a single explicit type field for references. The starting point for discussion includes current tag semantics and possible consolidation. The submitter specifically recommends:
- no x_* extensions for type, until presentation can be separated from storage, and there is a version of data where all x_* is stripped.
- include a single other value instead to denote that the type was evaluated and this reference was deemed to not meet any of them.
Draft values for discussion (mix of current and possible future simplifications):
- broken-link
- customer-entitlement
- exploit
- government-resource
- issue-tracking
- mailing-list
- mitigation
- not-applicable
- patch
- permissions-required
- media-coverage
- product
- related
- release-notes
- signature
- technical-description
- third-party-advisory
- vendor-advisory
- vdb-entry
- other
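The recommendation above (a closed vocabulary, no x_* extensions, and a single other value) could be enforced with a check along these lines. This is a sketch only: `DRAFT_TYPES` copies the draft list above, which is not final, and `validate_type` is a hypothetical helper name.

```python
DRAFT_TYPES = frozenset({
    "broken-link", "customer-entitlement", "exploit", "government-resource",
    "issue-tracking", "mailing-list", "mitigation", "not-applicable", "patch",
    "permissions-required", "media-coverage", "product", "related",
    "release-notes", "signature", "technical-description",
    "third-party-advisory", "vendor-advisory", "vdb-entry", "other",
})


def validate_type(value: str) -> str:
    """Return the value unchanged if it is an allowed draft type, else raise."""
    if value.startswith("x_"):
        # Unlike tags, the proposed type field has no extension mechanism.
        raise ValueError(f"x_* extensions are not allowed for type: {value!r}")
    if value not in DRAFT_TYPES:
        raise ValueError(f"unknown reference type {value!r}; use 'other' if none fits")
    return value
```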
Alternative consolidation candidates to discuss:
- advisory - (note: removing the “third-party” vs “vendor” distinction)
- technical_analysis
- commit
- issue_tracker
- bulletin
- vendor_notice
- blog_post
- exploit_writeup
- forum_post (discussion?)
And specific doc types to consider:
- csaf
- cvrf
- vulnerability_record - (e.g., osv/ghsa/gcve - should we have a way to explicitly reference a structured data resource from another registry?)
- (there are definitely more)
This RFD does not yet finalize the fate of tags. It should explicitly discuss:
- whether tags are retained for legacy purposes, slowly deprecated, or immediately replaced
- how the existing tags map to type, status, or other structured fields
- whether some existing tags reflect semantics better represented elsewhere
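One way to explore the mapping question is to split a legacy tags array into candidate type values, tags that may belong in another field, and x_* extensions. The sketch below is illustrative only: the `NON_TYPE_TAGS` grouping is an assumption made for this example, not a decision of this RFD, and `split_legacy_tags` is a hypothetical helper.

```python
# Assumption: these tags describe access or link state rather than content type.
NON_TYPE_TAGS = frozenset({
    "broken-link", "customer-entitlement", "permissions-required", "not-applicable",
})


def split_legacy_tags(tags: list[str]) -> dict[str, object]:
    """Partition legacy tags into type candidates, non-type tags, and x_* tags."""
    custom = [tag for tag in tags if tag.startswith("x_")]
    standard = [tag for tag in tags if not tag.startswith("x_")]
    type_candidates = [tag for tag in standard if tag not in NON_TYPE_TAGS]
    return {
        # Only propose a type when the legacy tags are unambiguous.
        "type": type_candidates[0] if len(type_candidates) == 1 else None,
        "type_candidates": type_candidates,
        "non_type": [tag for tag in standard if tag in NON_TYPE_TAGS],
        "custom": custom,
    }
```

References with zero or several type candidates come back with `type` set to `None`, which marks them for human or automated review instead of guessing.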
The examples below are illustrative and intended to support QWG discussion of structure and semantics.
{
"references": [
{
"url": "https://vendor.example/advisories/abc-2026"
}
]
}

{
"references": [
{
"url": "https://vendor.example/advisories/abc-2026",
"type": "advisory",
"title": "Acme Advisory ABC-2026",
"publisher": {
"name": "Example Vendor",
"role": "authoritative",
"domain": "vendor.example"
},
"retrieved_at": "2026-02-24T18:00:00Z",
"status": "reachable"
}
]
}

{
"references": [
{
"url": "https://research.example/blog/deep-analysis-of-cve-2026-9999",
"type": "technical-description",
"title": "Deep Analysis of CVE-2026-9999",
"publisher": {
"name": "Research Example",
"role": "non_authoritative",
"domain": "research.example"
},
"published_at": "2026-01-12T09:30:00Z",
"retrieved_at": "2026-02-24T18:00:00Z",
"media_type": "text/html",
"status": "reachable"
}
]
}

{
"references": [
{
"url": "https://github.com/example/project/commit/abcdef123456",
"type": "patch",
"title": "Fix bounds check in parser",
"publisher": {
"name": "example/project",
"role": "authoritative",
"domain": "github.com"
},
"retrieved_at": "2026-02-24T18:00:00Z",
"media_type": "text/html",
"status": "reachable"
}
]
}

{
"references": [
{
"url": "https://nonexistant.example/security/advisory-123",
"type": "vendor-advisory",
"title": "Security Advisory 123",
"publisher": {
"name": "Old Example Vendor",
"role": "authoritative",
"domain": "old.example"
},
"first_retrieved_at": "2024-03-01T00:00:00Z",
"retrieved_at": "2026-02-24T18:00:00Z",
"status": "archived_only",
"archive_urls": [
"https://archive.example/snapshots/old-example-advisory-123"
]
}
]
}

- Improved consumer usability: explicit type and status support filtering, prioritization, and automation.
- Better UI/UX: references can be labeled consistently in tools and portals.
- Improved data quality checks: records with only low-value or non-authoritative references can be detected more easily.
- Better archival resilience: structured status and archive_urls support long-term utility of records.
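As an illustration of the filtering and prioritization benefit, a consumer could order references by publisher role and type, and fall back to an archive URL when a reference is archived_only. This is a sketch under assumptions: `TYPE_ORDER` is an arbitrary example ranking, the function names are hypothetical, and an absent status is treated as unspecified, not as unreachable.

```python
from typing import Any, Optional

# Assumption: an example display order; any consumer would choose its own.
TYPE_ORDER = ["vendor-advisory", "patch", "technical-description"]


def sort_key(ref: dict[str, Any]) -> tuple[bool, int]:
    """Authoritative publishers first, then by position in TYPE_ORDER."""
    role = ref.get("publisher", {}).get("role")
    ref_type = ref.get("type")
    rank = TYPE_ORDER.index(ref_type) if ref_type in TYPE_ORDER else len(TYPE_ORDER)
    return (role != "authoritative", rank)


def best_url(ref: dict[str, Any]) -> Optional[str]:
    """Pick the URL to fetch: the archive copy when only an archive remains."""
    if ref.get("status") == "archived_only":
        archives = ref.get("archive_urls", [])
        return archives[0] if archives else None
    return ref["url"]


def prioritized_urls(refs: list[dict[str, Any]]) -> list[str]:
    ordered = sorted(refs, key=sort_key)  # stable: ties keep record order
    return [url for url in map(best_url, ordered) if url is not None]
```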
- Increased producer burden if too many fields are required at publication time.
- Inconsistent classification for type and publisher.role without clear guidance; validation might be enhanced with LLMs and autoclassification approaches.
- False precision in dates (published_at) or media_type if populated heuristically (e.g., “I think this was published last May” becomes “2024-05-01T00:00:00Z”).
- Implementation complexity if CVE Services automates status checks, archive handling, or media type inference.
This section is intentionally incomplete in this initial draft and should be completed based on QWG discussion.
To be finalized in later revision of this RFD
There is substantial incompleteness and inconsistency in current references usage (missing name, missing tags, heavy x_* tag usage, see above) across a large sample of recent CVE records.
A CVE Record must include at least one public reference
- CNA Rules §5.1.10 says a CVE Record MUST contain at least one public reference, and that reference MUST NOT be the CVE Record itself.
The public reference must exist before or at publication time
- CNA Rules §5.3.1 says CNAs MUST ensure a public reference exists on the internet before or concurrently with publication of the CVE Record. It also states a CVE Record MUST NOT be the first public disclosure of the vulnerability.
If there are multiple public references, include the most freely available one
- CNA Rules §5.3.1.2 says if multiple public references exist, CNAs MUST include the most freely available public reference in the CVE Record.
CNAs should think about long-term availability / archival
- CNA Rules §5.3.2 says CNAs SHOULD consider long-term availability of public references, including archival services (e.g., Internet Archive) or other mechanisms determined by the CVE Program.
At least one reference must meet quality/access criteria
- CNA Rules §5.3.3 sets criteria for at least one public reference, which:
  - SHOULD NOT require registration/login (§5.3.3.1)
  - SHOULD NOT impose restrictive terms that conflict with CVE Program use (§5.3.3.2)
  - MUST contain info about the specific vulnerability (§5.3.3.3)
  - MUST provide minimum information supporting the CVE Record content (§5.3.3.4)
  - MUST NOT be the CVE Record itself (§5.3.3.5)
“references” is required in the CNA container
- In the CVE schema docs, the CNA container’s references field is marked Required and is an array. It must contain at least 1 item and at most 512 items, and items must be unique.
Each reference item is a constrained object
- Each reference entry is an object with No Additional Properties (i.e., schema-constrained shape).
“url” is the only required field in each reference item
- Within a reference item, url is Required; name and tags are optional. The schema describes url as the URL used to retrieve the referenced resource.
“name” is optional and user-created
- The schema defines optional name as a user-created name for the reference, often the page title.
“tags” are optional, but if present they are structured
- If tags is present, it is an array with at least 1 item, unique values, and items must be either:
  - a standard enum from reference-tags.json, or
  - an extension tag (tagExtension, i.e., x_...)
The schema docs also enumerate the standard reference tags (e.g., vendor-advisory, patch, issue-tracking, technical-description, vdb-entry, etc.).
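The current constraints summarized above can be expressed as a small checker. This is a hand-written sketch of those rules for discussion purposes, not the official JSON Schema; `STANDARD_TAGS` is assumed to match the current entries in the draft type list earlier in this RFD, and `check_references` is a hypothetical name.

```python
import json
from typing import Any

# Assumption: taken from this RFD's draft list, minus the proposed "other".
STANDARD_TAGS = frozenset({
    "broken-link", "customer-entitlement", "exploit", "government-resource",
    "issue-tracking", "mailing-list", "mitigation", "not-applicable", "patch",
    "permissions-required", "media-coverage", "product", "related",
    "release-notes", "signature", "technical-description",
    "third-party-advisory", "vendor-advisory", "vdb-entry",
})
ALLOWED_KEYS = {"url", "name", "tags"}


def check_references(refs: list[dict[str, Any]]) -> list[str]:
    """Return human-readable problems; an empty list means the array passes."""
    problems: list[str] = []
    if not 1 <= len(refs) <= 512:
        problems.append(f"expected 1 to 512 references, found {len(refs)}")
    seen: set[str] = set()
    for index, ref in enumerate(refs):
        extra = set(ref) - ALLOWED_KEYS
        if extra:
            problems.append(f"reference {index}: additional properties {sorted(extra)}")
        if "url" not in ref:
            problems.append(f"reference {index}: url is required")
        fingerprint = json.dumps(ref, sort_keys=True)
        if fingerprint in seen:
            problems.append(f"reference {index}: duplicate item")
        seen.add(fingerprint)
        if "tags" in ref:
            tags = ref["tags"]
            if not tags:
                problems.append(f"reference {index}: tags must not be empty")
            if len(set(tags)) != len(tags):
                problems.append(f"reference {index}: tags must be unique")
            for tag in tags:
                if tag not in STANDARD_TAGS and not tag.startswith("x_"):
                    problems.append(f"reference {index}: unknown tag {tag!r}")
    return problems
```

Note that a record can pass every one of these checks while carrying nothing but a bare url, which is the gap this RFD targets.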
- Pull Request #462 (add a field to the reference type)
- What consumers should infer when fields are absent (e.g., absent status = unspecified, not unreachable)
- Whether retrieved_at reflects CNA/ADP retrieval, registry retrieval, or either (with provenance implied by container/source)
- What is the final required field set for a structured reference object?
- What should the final type vocabulary be, and should it consolidate current tag semantics?
- Should existing tags be retained, deprecated, or replaced?
- What exactly does publisher represent (host vs author vs publisher org)?
- How much automated enrichment is feasible without creating operational burden?
  - Who sets and maintains retrieved_at, status, archive_urls, and media_type (CNA/ADP vs CVE Program)?
- What semantics should consumers apply when metadata fields are absent? Unaddressed?
- How should backward compatibility be handled in schema and CVE Services?
This RFD is intentionally scoped as a targeted improvement to the CVE Record references section. Future work may explore:
- Stronger normalization of external resource types and publisher roles
- More explicit relationships between references and specific CVE record assertions (analyzed_by, described_by, announced_by, fixed_by (for commits/patch refs), mentions, disputes, etc.)
- Richer provenance for reference metadata changes
- Cross-field consistency checks between reference types and other record content (e.g., affectedness, remediation, vendor metadata), for example:
  - There is a reference for a patch, but the record has no patch information or claims there is no patch.
  - A reference claims its publisher is “authoritative”, but the record does not mention that company as a vendor.
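The two consistency examples above could be prototyped as follows. This is a rough sketch under assumptions: it takes a CNA container as a dict, treats a non-empty solutions array as "patch information", and compares publisher.name to affected[].vendor by case-insensitive string equality. All three choices are placeholders for discussion, and `consistency_warnings` is a hypothetical name.

```python
from typing import Any


def consistency_warnings(cna: dict[str, Any]) -> list[str]:
    """Flag references whose claims are not backed by the rest of the record."""
    warnings: list[str] = []
    refs = cna.get("references", [])

    # Check 1: a patch reference without any remediation text in the record.
    if any(ref.get("type") == "patch" for ref in refs) and not cna.get("solutions"):
        warnings.append("patch reference present but record has no solutions")

    # Check 2: an authoritative publisher that is not a listed vendor.
    vendors = {(entry.get("vendor") or "").lower() for entry in cna.get("affected", [])}
    for ref in refs:
        publisher = ref.get("publisher", {})
        name = publisher.get("name") or ""
        if publisher.get("role") == "authoritative" and name.lower() not in vendors:
            warnings.append(f"authoritative publisher {name!r} is not an affected vendor")
    return warnings
```

Exact name matching will produce false positives, for example when a patch is hosted under a project name; a real check would need the clearer publisher definition this RFD asks for.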