kernel: enable CONFIG_CRYPTO_ECDSA for H100 confidential compute #14

Merged
kvinwang merged 1 commit into main from kvin/fix-h100-cc-crypto-ecdsa on May 27, 2026

@kvinwang
Collaborator

Summary

Add CONFIG_CRYPTO_ECDSA=y to the dstack kernel defconfigs (6.17 and 6.18).
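A quick way to confirm the option is present in both defconfigs is sketched below. The helper names `has_kconfig_y` and `check_defconfigs` are hypothetical and not part of the dstack tree. The defconfig paths are the ones listed in this PR's changed files, taken relative to a repository root passed as an argument.

```shell
# Sketch: check that a Kconfig option is set to =y in a defconfig.
# has_kconfig_y and check_defconfigs are hypothetical helper names.

has_kconfig_y() {
    # $1 = option name (e.g. CONFIG_CRYPTO_ECDSA), $2 = path to a defconfig
    # -x matches the whole line, so "# CONFIG_X is not set" does not count.
    grep -qx "$1=y" "$2"
}

check_defconfigs() {
    # $1 = repository root that contains meta-dstack/
    rc=0
    for ver in 6.17 6.18; do
        f="$1/meta-dstack/recipes-kernel/linux/files/$ver/defconfig"
        if has_kconfig_y CONFIG_CRYPTO_ECDSA "$f"; then
            echo "$ver: ok"
        else
            echo "$ver: missing"
            rc=1
        fi
    done
    return $rc
}
```

Running `check_defconfigs .` from the repository root prints one line per kernel version and returns non-zero if either defconfig lacks the option.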

Why

NVIDIA's open kernel driver (nvidia.ko) gates its LKCA-backed libspdm
crypto provider on CONFIG_CRYPTO_ECDSA being defined at compile time
(see kernel-open/nvidia/internal_crypt_lib.h — the USE_LKCA macro
requires the kernel to advertise ECDSA, ECDH, RSA, HMAC, AKCIPHER, etc.).

When CONFIG_CRYPTO_ECDSA is missing, libspdm is wired to stubs at
compile time. At runtime the driver prints:

libspdm_check_crypto_backend: Error - libspdm expects LKCA but found stubs!
NVRM: spdmContextInit_IMPL: SPDM cannot boot without proper crypto backend!
NVRM: GPU 0000:04:00.0: RmInitAdapter failed!

…and the H100 never finishes init in Confidential Compute mode
(e.g. GCP TDX + a3-highgpu-1g). nvidia-smi reports no devices.
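To tell the two cases apart on a running system, the kernel log can be scanned for the messages quoted in this PR. The following is a minimal sketch: `classify_spdm_backend` is a hypothetical helper, and it assumes the driver emits exactly the strings shown here, which may differ between driver versions.

```shell
# Sketch: classify driver log text read from stdin.
# classify_spdm_backend is a hypothetical helper; the matched strings
# are the ones quoted in this PR and may vary across driver versions.

classify_spdm_backend() {
    log=$(cat)
    case "$log" in
        *"libspdm expects LKCA but found stubs"*) echo "stubs" ;;
        *"LKCA wrappers found"*)                  echo "lkca" ;;
        *)                                        echo "unknown" ;;
    esac
}

# Usage on a live system (reading the kernel log may need root):
#   dmesg | classify_spdm_backend
```

A result of `stubs` corresponds to the failure described above, and `lkca` corresponds to the message reported in the verification run.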

meta-nvidia/recipes-kernel/linux/files/nvidia.cfg already declares
this option, but it ships as a linux-yocto%.bbappend, which does
not attach to the in-tree linux-custom_*.bb recipes that build
the dstack kernel from a defconfig. So the setting never reached the
final kernel. Adding it directly to the defconfigs ensures all
flavors (incl. nvidia) pick it up.

Verification

End-to-end on GCP a3-highgpu-1g + TDX + --confidential-compute-type=TDX,
after rebuilding the kernel + nvidia kernel modules with this change:

  • SPDM session establishes; nvidia.ko logs libspdm_check_crypto_backend: LKCA wrappers found.
  • nvidia-smi shows NVIDIA H100 80GB HBM3
  • nvidia-smi conf-compute -f reports CC status: ON
  • nvidia-smi conf-compute -grs reports Confidential Compute GPUs Ready state: ready
  • PyTorch (pytorch/pytorch:2.5.1-cuda12.4-cudnn9-runtime) runs a 4096³ matmul at ~38 TFLOPs FP32
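The `nvidia-smi` checks in the list above could be scripted roughly as follows. This is a sketch under assumptions: `cc_ready` is a hypothetical helper, and the matched strings are the ones reported in this PR, so other driver versions may word their output differently.

```shell
# Sketch: succeed only if all three captured nvidia-smi outputs contain
# the strings reported in this PR. cc_ready is a hypothetical helper.

cc_ready() {
    # $1 = output of `nvidia-smi`
    # $2 = output of `nvidia-smi conf-compute -f`
    # $3 = output of `nvidia-smi conf-compute -grs`
    case "$1" in *"NVIDIA H100"*) ;; *) return 1 ;; esac
    case "$2" in *"CC status: ON"*) ;; *) return 1 ;; esac
    case "$3" in
        *"Confidential Compute GPUs Ready state: ready"*) ;;
        *) return 1 ;;
    esac
    return 0
}

# Usage on the deployed VM:
#   cc_ready "$(nvidia-smi)" \
#            "$(nvidia-smi conf-compute -f)" \
#            "$(nvidia-smi conf-compute -grs)" && echo "CC checks passed"
```

Taking the outputs as arguments keeps the check separate from the commands, so it can be exercised with captured text on a machine without a GPU.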

Test plan

  • Build mc:nvidia:dstack-image-uki with this change
  • Deploy on GCP TDX a3-highgpu-1g and verify nvidia-smi works
  • Confirm CC status: ON
  • Run a CUDA workload through Docker

NVIDIA's open kernel driver (nvidia.ko) gates its LKCA-backed libspdm
crypto provider on `CONFIG_CRYPTO_ECDSA` being defined when the driver
is built (see `kernel-open/nvidia/internal_crypt_lib.h`: the
`USE_LKCA` macro requires the kernel to advertise ECDSA, ECDH, RSA,
HMAC, AKCIPHER, etc.). When `CONFIG_CRYPTO_ECDSA` is missing, libspdm
falls back to stubs and at runtime prints
`libspdm expects LKCA but found stubs!` then fails
`spdmEstablishSession`, so H100 in Confidential Compute mode (e.g. GCP
TDX + a3-highgpu-1g) never finishes init and `nvidia-smi` reports no
devices.

`meta-nvidia/recipes-kernel/linux/files/nvidia.cfg` already sets this
config, but it ships as a `linux-yocto%.bbappend`, which does not
attach to the in-tree `linux-custom_*.bb` recipes that build the
dstack kernel from a defconfig. Add the option directly to the 6.17
and 6.18 defconfigs so all flavors (incl. nvidia) pick it up.

Verified end-to-end on GCP a3-highgpu-1g + TDX after rebuilding the
kernel + nvidia kernel modules with this change: SPDM session
establishes, `nvidia-smi conf-compute -f` reports `CC status: ON`,
and a PyTorch matmul runs at ~38 TFLOPs.
Copilot AI review requested due to automatic review settings May 27, 2026 04:06

Copilot AI left a comment

Pull request overview

Enables CONFIG_CRYPTO_ECDSA=y in the dstack kernel defconfigs so NVIDIA’s open kernel driver can use the kernel crypto (LKCA) backend required for H100 Confidential Compute initialization.

Changes:

  • Add CONFIG_CRYPTO_ECDSA=y to the Linux 6.17 defconfig.
  • Add CONFIG_CRYPTO_ECDSA=y to the Linux 6.18 defconfig.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| meta-dstack/recipes-kernel/linux/files/6.18/defconfig | Enables ECDSA in the 6.18 defconfig to satisfy NVIDIA driver LKCA requirements for CC. |
| meta-dstack/recipes-kernel/linux/files/6.17/defconfig | Enables ECDSA in the 6.17 defconfig to satisfy NVIDIA driver LKCA requirements for CC. |

@kvinwang merged commit 110991a into main on May 27, 2026 (1 check passed)
@kvinwang mentioned this pull request on May 27, 2026 (2 tasks)