Reporting Affected Artifacts in CVE

Field	Value
RFD Submitter	Andrew Lilley Brinker
RFD Pull Request	RFD #0000

Summary

Today, CVE supports identifying affected products or packages using three "identifier-like" constructs, with one more proposed in RFD #2, "Supporting Package URLs in CVE". They are:

CPE, Common Platform Enumeration
Vendor and product names, provided as a pair
Collection URL and package names, provided as a pair
(If accepted) Package URLs, also called "purls"

While these coarse-grained identifiers are great for identifying affected products or packages, they are insufficiently granular for identifying affected artifacts. This makes it difficult for CNAs to report fine-grained applicability information when they otherwise could.

For example, a CNA may know that specific binaries they build and ship to users are affected by a vulnerability. Today, there is not a clear, structured mechanism for reporting identifiers for these affected binaries to CVE consumers.

This RFD proposes introducing support for reporting affected artifacts, by adding a new optional affectedArtifacts field to containers.cna, which would contain an array of objects specifying identifiers for artifacts affected by a vulnerability.

Problem Statement

While CVE records today can contain substantial information about affected products or packages, there isn't a clear and structured way to report information about specific artifacts affected or not affected by a vulnerability.

This deficiency means CNAs who publish artifacts—such as prebuilt binaries, archive files such as .zips or .tar.gzs, script files, or configuration files—lack a means to communicate when those artifacts are known to be vulnerable or to not be vulnerable.

For vulnerability managers, reacting to vulnerability disclosures with coarse-grained identifiers for affected software requires maintaining accurate software inventories, whether through Software Bills of Materials, package manifests (such as package.json or Cargo.toml), lockfiles (such as package-lock.json or Cargo.lock), or other means. Without some method for tracking what software is deployed in a production system, vulnerability managers may struggle to turn identifiers provided in a CVE record into a clear determination of applicability, and therefore also to respond quickly to vulnerabilities when they're disclosed. Reducing the time-to-react for vulnerability managers is clearly in the interest of the CVE program.

Proposed Solution

The presence of artifact identifiers in CVE Records would provide an additional mechanism to vulnerability managers to identify applicable vulnerabilities. For example, a hash of a known-vulnerable binary could be searched for on production systems in addition to any deployed software inventories.

Artifact identifiers also have the benefit of a low false-positive rate. Coarse-grained identifiers for products or packages may be refined with additional fields on objects in the affected array, such as platforms, versions, programFiles, programModules, and more. Because many of these fields are ambiguous or complex to check, applicability decisions based on coarse-grained identifiers can easily require human intervention to assess, and may remain uncertain even after that intervention.

By comparison, identifiers for affected artifacts, which are often based on hashing file contents, are unlikely to produce false positives. Cryptographic hash functions are designed to provide collision resistance, preimage resistance, and second-preimage resistance, so two different files are extremely unlikely to share an identifier, whether by accident or by deliberate construction. As a result, if a vulnerability manager finds a file in their system whose artifact identifier matches an artifact identifier provided in a CVE Record, that manager can act quickly with high confidence that the match is correct.

Because of their low false-positive rate and content-based construction, artifact identifiers are also easy to automate and check at scale.
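As a rough illustration of what such automated checking might look like, the following minimal Python sketch walks a directory tree and reports files whose SHA-256 digest appears in a set of known-affected hashes. The function names `sha256_of_file` and `find_affected_files` are hypothetical and not part of this proposal; the sketch assumes the consumer has already extracted the hashes from CVE Records.

```python
import hashlib
import os


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the lowercase hex SHA-256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large binaries do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_affected_files(root, affected_hashes):
    """Walk `root` and return sorted paths whose SHA-256 is in `affected_hashes`."""
    matches = []
    for directory, _subdirs, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(directory, name)
            try:
                if sha256_of_file(path) in affected_hashes:
                    matches.append(path)
            except OSError:
                # Unreadable files (permissions, broken links) are skipped.
                continue
    return sorted(matches)
```

A real deployment would likely cache digests or reuse an existing file-integrity inventory rather than rehashing every file on each scan.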

The following is the actual proposed change for the Record Format:

Add an `affectedArtifacts` field

Add an affectedArtifacts field to the cnaPublishedContainer object, found at the path containers.cna within a CVE Record. This new field would be an array containing affectedArtifact objects. The specific edits to the schema would be as follows:

First, the introduction of the affectedArtifacts field within the cnaPublishedContainer object:

"affectedArtifacts": {
    "type": "array",
    "description": "List of affected artifacts.",
    "minItems": 1,
    "items": {"$ref": "#/definitions/affectedArtifact"}
}

Second, the definition of the affectedArtifact type within the "definitions" portion of the schema:

"affectedArtifact": {
    "type": "object",
    "description": "Provides information about a specific artifact affected by a vulnerability.",
    "allOf": [
        {
            "description": "An identifier-like field, to identify the artifact.",
            "anyOf": [
                {"required": ["omniborArtifactID", "omniborArtifactType"]},
                {"required": ["sha256"]}
            ]
        },
        {
            "description": "The status of the artifact.",
            "anyOf": [
                {"required": ["status"]}
            ]
        }
    ],
    "properties": {
        "omniborArtifactID": {
            "type": "string",
            "pattern": "^gitoid:blob:sha256:[0-9a-f]{64}$",
            "description": "The OmniBOR Artifact ID of the artifact to be matched against.",
            "examples": [
                "gitoid:blob:sha256:9f64df92367881be21e23567a31a8ce01994d98b69d28917b5c132ce32a8e6c8",
                "gitoid:blob:sha256:09c825ac02df9150e4f93d12ba1da5d1ff5846c3e62503c814aa3a300c535772",
                "gitoid:blob:sha256:230f3515d1306690815bd9c3da0d15d8b6fcf43894d17100eb44b6d329a92f61"
            ]
        },
        "omniborArtifactType": {
            "type": "string",
            "enum": ["artifact", "buildInput"],
            "description": "Specifies how consumers of the Artifact ID should search for matches. If the 'omniborArtifactType' is 'artifact', then the Artifact ID is identifying an artifact which should be searched for directly (for example, within a file system by matching against Artifact IDs for files). If the 'omniborArtifactType' is 'buildInput' then the Artifact ID is identifying a build input, and consumers should match the Artifact ID against IDs found in OmniBOR Input Manifests for their software."
        },
        "sha256": {
            "type": "string",
            "pattern": "^[a-f0-9]{64}$",
            "description": "The SHA-256 hash of the artifact.",
            "examples": [
                "68e656b251e67e8358bef8483ab0d51c6619f3e7a1a9f0e75838d41ff368f728",
                "2cc620f8a156b986806bc2757c0572d978d8cbfc4d25f0dfa7c552291bf68279",
                "97272dc1b6ac7ca84735b797b4a04233b17fd55707f9c728fc3747e3f935f02c"
            ]
        },
        "status": {
            "description": "The vulnerability status of the identified artifact.",
            "$ref": "#/definitions/status"
        },
        "version": {
            "description": "The version applicable to the identified artifact, if relevant.",
            "$ref": "#/definitions/version"
        },
        "versionType": {
            "type": "string",
            "description": "The version numbering system used for specifying the range. This defines the exact semantics of the comparison (less-than) operation on versions, which is required to understand the range itself. 'Custom' indicates that the version type is unspecified and should be avoided whenever possible. It is included primarily for use in conversion of older data files.",
            "minLength": 1,
            "maxLength": 128,
            "examples": [
                "custom",
                "git",
                "maven",
                "python",
                "rpm",
                "semver"
            ]
        },
        "platforms": {
            "description": "List of specific platforms if the vulnerability is only relevant in the context of these platforms (optional). Platforms may include execution environments, operating systems, virtualization technologies, hardware models, or computing architectures. The lack of this field implies that the other fields are applicable to all relevant platforms.",
            "type": "array",
            "minItems": 1,
            "uniqueItems": true,
            "items": {
                "type": "string",
                "examples": ["iOS", "Android", "Windows", "macOS", "x86", "ARM", "64 bit", "Big Endian", "iPad", "Chromebook", "Docker", "Model T"],
                "maxLength": 1024
            }
        }
    }
}

The explanations of the fields for affectedArtifact objects are as follows:

omniborArtifactID: An OmniBOR Artifact Identifier, used to identify either an artifact itself, such as a binary file, or to identify build inputs used to produce the artifact.
omniborArtifactType: The type associated with the omniborArtifactID field; either "artifact" or "buildInput". If "artifact" is used, then omniborArtifactID is the Artifact ID of an artifact itself, such as a binary file. If "buildInput" is used, then omniborArtifactID is the Artifact ID of a build input. This field tells CVE consumers how to match the identifier. For artifacts, they should search their systems and/or inventories for files with a matching Artifact ID. For build inputs, they should search their OmniBOR Input Manifests for IDs which match.
sha256: The SHA-256 hash of the artifact in question.
status: Indicates whether the identified artifact is affected, not affected, or has an unknown affected status.
version: The version applicable to the identified artifact, if relevant.
versionType: If "version" is used, this indicates what type of version is present, and should be used by CVE consumers to validate and interpret the "version" field.
platforms: A list of the specific platforms the identified artifact is intended for.

Additionally, the data constraints on the affectedArtifact object ensure that at least one set of identifier-like fields is present per object, and that each object always includes a "status" field.
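To make these constraints concrete, here is a minimal sketch of how a CNA-side tool might check a single affectedArtifact object before submission. It mirrors only the "required" and "pattern" rules of the proposed schema using the Python standard library; the function name `check_affected_artifact` is hypothetical, and a production tool would more likely validate against the full JSON schema.

```python
import re

# Patterns copied from the proposed schema; fullmatch() is used below, so
# the ^ and $ anchors are not needed here.
OMNIBOR_ID = re.compile(r"gitoid:blob:sha256:[0-9a-f]{64}")
SHA256_HEX = re.compile(r"[a-f0-9]{64}")
OMNIBOR_TYPES = {"artifact", "buildInput"}


def check_affected_artifact(entry):
    """Return a list of problems found in one affectedArtifact object."""
    problems = []

    # At least one complete set of identifier-like fields must be present.
    has_omnibor = "omniborArtifactID" in entry and "omniborArtifactType" in entry
    has_sha256 = "sha256" in entry
    if not (has_omnibor or has_sha256):
        problems.append("no complete identifier set present")

    # Every object must carry a status.
    if "status" not in entry:
        problems.append("missing status")

    # Pattern and enum checks apply only to fields that are present.
    value = entry.get("omniborArtifactID")
    if value is not None and not (isinstance(value, str) and OMNIBOR_ID.fullmatch(value)):
        problems.append("omniborArtifactID does not match the gitoid pattern")
    value = entry.get("omniborArtifactType")
    if value is not None and value not in OMNIBOR_TYPES:
        problems.append("omniborArtifactType must be 'artifact' or 'buildInput'")
    value = entry.get("sha256")
    if value is not None and not (isinstance(value, str) and SHA256_HEX.fullmatch(value)):
        problems.append("sha256 is not 64 lowercase hex characters")
    return problems
```

An empty list means the object satisfies the constraints this sketch checks; it does not validate status values, version, versionType, or platforms.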

Note that identifiers found in the same affectedArtifact object should be interpreted as synonyms, identifying the same artifact. For example, an entry in the affectedArtifacts array which contains both an omniborArtifactID and sha256 value should be interpreted as identifying only one artifact, for which either identifier is valid. The presence of multiple identifiers is intended only to make matching easier for CVE consumers by providing them with options which may be more convenient depending on what identifiers or tooling the consumer has available in their systems to support matching.
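As an illustration of two synonymous identifiers for one artifact, the sketch below derives both a plain SHA-256 and a gitoid-style identifier from the same bytes. It assumes the Artifact ID follows the Git blob construction (hashing the header "blob <length>\0" followed by the content) with SHA-256 and no content normalization; the OmniBOR specification is the authority on the exact rules, and the function names here are hypothetical.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Plain SHA-256 of the artifact contents, as lowercase hex."""
    return hashlib.sha256(data).hexdigest()


def omnibor_artifact_id(data: bytes) -> str:
    """Gitoid-style identifier, ASSUMING the Git blob construction.

    The digest covers b"blob <decimal length>\\0" followed by the content.
    """
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return "gitoid:blob:sha256:" + hashlib.sha256(header + data).hexdigest()


def describe_artifact(data: bytes, status: str) -> dict:
    """Build one affectedArtifact object carrying both identifiers."""
    return {
        "omniborArtifactID": omnibor_artifact_id(data),
        "omniborArtifactType": "artifact",
        "sha256": sha256_hex(data),
        "status": status,
    }
```

The two digests differ because the gitoid form hashes a length-prefixed header along with the content, but both name the same file, so a consumer can match on whichever one its tooling already computes.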

Use of this as a template for future identifiers

This proposal is intended as a template for the introduction of more fine-grained identifier types intended for identifying artifacts in the future. Specifically, future identifiers should be added as new fields within the affectedArtifact object inside the affectedArtifacts array.

Vendoring of the relevant specifications

To ensure that newly added identifier types are interpreted consistently, the CVE project should "vendor" (that is, maintain its own public copy of) any relevant specifications when those specifications are not versioned upstream.

Examples

The following is an example affectedArtifacts field, identifying three binaries, one for each of Windows, macOS, and Linux systems on x86:

"affectedArtifacts": [
    {
        "omniborArtifactID": "gitoid:blob:sha256:9f64df92367881be21e23567a31a8ce01994d98b69d28917b5c132ce32a8e6c8",
        "omniborArtifactType": "artifact",
        "sha256": "2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824",
        "status": "affected",
        "version": "0.18.1",
        "versionType": "semver",
        "platforms": ["macOS", "x86"]
    },
    {
        "omniborArtifactID": "gitoid:blob:sha256:4043df92367881be21e23567a31a8ce01994d98b69d28917b5c132ce32a8e6c8",
        "omniborArtifactType": "artifact",
        "sha256": "40414dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824",
        "status": "affected",
        "version": "0.18.1",
        "versionType": "semver",
        "platforms": ["Windows", "x86"]
    },
    {
        "omniborArtifactID": "gitoid:blob:sha256:ccc4df92367881be21e23567a31a8ce01994d98b69d28917b5c132ce32a8e6c8",
        "omniborArtifactType": "artifact",
        "sha256": "ddd24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824",
        "status": "affected",
        "version": "0.18.1",
        "versionType": "semver",
        "platforms": ["Linux", "x86"]
    }
]
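On the consumer side, a field like this could be turned into lookup sets grouped by how each identifier should be matched. The following is a minimal sketch under the assumption that the record has been parsed into a Python dictionary; the function name `index_affected_artifacts` and the keys of the returned dictionary are hypothetical.

```python
def index_affected_artifacts(record):
    """Group identifiers of affected artifacts by how they should be matched."""
    index = {
        "sha256": set(),             # compare against file SHA-256 digests
        "omniborArtifact": set(),    # compare against Artifact IDs of files
        "omniborBuildInput": set(),  # compare against OmniBOR Input Manifests
    }
    # .get() with defaults keeps this working on records without the field.
    cna = record.get("containers", {}).get("cna", {})
    for entry in cna.get("affectedArtifacts", []):
        if entry.get("status") != "affected":
            continue
        if "sha256" in entry:
            index["sha256"].add(entry["sha256"])
        artifact_id = entry.get("omniborArtifactID")
        kind = entry.get("omniborArtifactType")
        if artifact_id and kind == "artifact":
            index["omniborArtifact"].add(artifact_id)
        elif artifact_id and kind == "buildInput":
            index["omniborBuildInput"].add(artifact_id)
    return index
```

Because identifiers within one object are synonyms, a match on any one of them is enough to flag the corresponding file.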

Impact Assessment

The addition of this new field would enable CNAs to report affected artifacts, such as known-vulnerable prebuilt binaries shipped for versions of software affected by a vulnerability, and would be complementary to the existing ability in the Record Format to identify affected products and packages.

For CVE consumers, the addition of this field would provide the ability to search for the presence of known-vulnerable artifacts in their systems when reported by CNAs.

Compatibility and Migration

This would be a minor change, as the addition of new optional fields is considered non-breaking.

CVE consumers could, if they wanted, gain the benefit of the new field by updating their consumption logic to recognize the field and make use of its contents. CVE consumers would break only if their consumption logic incorrectly assumes that no new optional fields will ever be added to the cnaPublishedContainer object.

Success Metrics

The success of this proposal will depend on the adoption of the new field, and the degree to which the new field provides value for CVE consumers.

CNA adoption can be measured in reported CVEs. Six months after the publication of the first Record Format version to include the new field, the QWG must assess the prevalence of the new field in CVEs published during those six months. If the new field is present in at least 5% of those CVEs, this RFD will be considered successful and the new field will not be rolled back.
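The adoption measurement could be computed along the following lines. This is only a sketch: it assumes the records published in the assessment window are already loaded as Python dictionaries, it reads the 5% figure as "at least 5%", and the function names are hypothetical.

```python
def has_affected_artifacts(record):
    """True if the CNA container carries a non-empty affectedArtifacts array."""
    cna = record.get("containers", {}).get("cna", {})
    return bool(cna.get("affectedArtifacts"))


def adoption_summary(records, threshold_percent=5):
    """Count adopting records and compare against the success threshold."""
    total = len(records)
    adopted = sum(1 for record in records if has_affected_artifacts(record))
    # Integer comparison avoids floating-point rounding at the boundary.
    meets = total > 0 and adopted * 100 >= threshold_percent * total
    return {"total": total, "adopted": adopted, "meets_threshold": meets}
```

For example, 1 adopting record out of 20 sits exactly at the threshold and would count as meeting it under this reading.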

CVE may consider making inclusion of affected artifacts a requirement for CNA recognition with the Enrichment Recognition List.

Measuring use by CVE consumers is a significantly larger challenge. A potential path would be to interview vulnerability management tool vendors, since many of these ingest and process the CVE list. Enquiring as to the role affected artifacts play in their processes would provide a strong indication of the value these identifiers provide. Of course, it will take vendors some time to adjust their processes. As such, the measure might be to look for at least two vendors using the new software identifier formats within a year of the adoption of the new formats.

Supporting Data or Research

Demand for OmniBOR was identified specifically in the most recent CVE user survey. Responses to Question 16 showed positive demand, strongest among self-identified data aggregators and integrators.

More generally, demand for identifying affected artifacts in CVE is unclear. Beyond the question on future priorities, which included OmniBOR, there were no specific questions in the survey around demand for identifying affected artifacts.

That said, this lack of support has been identified as a gap in discussions among the QWG, and there is interest in addressing it, whether through this proposal or a future alternative proposal.

Related Issues or Proposals

None identified.

Recommended Priority

Medium

Unresolved Questions

There are no remaining unresolved questions.

Future Possibilities

More identifier types may be desirable to add in the future. Any question of what those types may be, or what they may look like within the CVE Record Format, is not addressed here.
