Skip to content

fix(moq-native): defer TLS root resolution so http:// fingerprint bootstrap works without roots#1949

Closed
kixelated wants to merge 2 commits into
mainfrom
claude/elastic-rhodes-cfe5a7
Closed

fix(moq-native): defer TLS root resolution so http:// fingerprint bootstrap works without roots#1949
kixelated wants to merge 2 commits into
mainfrom
claude/elastic-rhodes-cfe5a7

Conversation

@kixelated

Copy link
Copy Markdown
Collaborator

Fixes #1907.

Problem

A roots-less client configuration (--client-tls-system-roots=false with no --client-tls-root and no --client-tls-fingerprint) failed at construction, before any connect() could see the URL scheme:

  • client.rs::new() called config.tls.build() (→ tls::Client::verification()) eagerly for every backend.
  • QuicheClient::new() resolved verification() too.

Both returned NoRoots and aborted construction, even though an http:// URL would fetch /certificate.sha256 and pin it for that connection (bypassing CA roots). Net effect: the http:// insecure-bootstrap path was unusable when no system/custom roots were configured. Fail-closed and niche, but a real functional limitation, consistent across quinn/noq/quiche.

Fix

Resolve the verification policy once but defer the "no roots" decision until the URL scheme is known:

  • tls::Client::verification() now returns Ok(None) for the no-roots case instead of Err(NoRoots). The eager public build() keeps its contract by mapping None → NoRoots; a new pub(crate) build_with(verification) builds a rustls config from an already-resolved policy.
  • moq-native::Client stores the prebuilt rustls config as Option (None = deferred) plus the tls::Client to rebuild a pinned config per connection.
  • quinn / noq / quiche build their per-connection config once the scheme is known: an http:// bootstrap pins a freshly fetched fingerprint (no roots needed), anything else reuses the resolved policy and surfaces NoRoots at connect time.
  • The fingerprint fetch stays on the QUIC side, so the WebSocket race can still fall back over plaintext ws:// if the fetch fails.
  • The WebSocket path only needs roots for secure wss://; plaintext ws:// connects without them (deferred NoRoots only on a secure scheme).

The fingerprint-fetch helper, previously duplicated inline in quinn/noq, is now a small fetch_fingerprint() in each backend returning a length-validated [u8; 32] (matching quiche).

Public API

No public breaking changes. Additive only:

  • quinn::Error / noq::Error: new FingerprintLength(usize) variant (both #[non_exhaustive]).
  • websocket::Error: new Tls(#[from] tls::Error) variant (#[non_exhaustive]).

tls::Client::build() (used by moq-relay) is unchanged in signature and behavior. verification() / build_with() are pub(crate).

Test plan

  • cargo test -p moq-native (57 integration + 46 lib) green.
  • New tls::verification_defers_no_roots (resolution returns Ok(None), build() still NoRoots).
  • New client::init_defers_no_roots (roots-less init() now succeeds instead of erroring at construction).
  • cargo check/clippy across default, noq, and quiche feature sets; cargo check -p moq-relay.
  • Not e2e-tested against a live self-signed relay over http://.

(Written by Claude Opus 4.8)

… without roots

A roots-less client config (`--client-tls-system-roots=false` with no
`--client-tls-root` and no `--client-tls-fingerprint`) used to fail at
construction: `Client::new` built the rustls config (and `QuicheClient::new`
resolved verification) eagerly, before any `connect()` saw the URL scheme, so
`NoRoots` fired even for an `http://` URL that would have fetched and pinned
`/certificate.sha256` (bypassing CA roots) for that connection.

Resolve the verification policy once but defer the "no roots" decision:

- `tls::Client::verification()` now returns `Ok(None)` for the no-roots case
  instead of `Err(NoRoots)`. `build()` keeps its eager contract by mapping
  `None -> NoRoots`; the new `build_with()` builds a rustls config from an
  already-resolved policy.
- `moq-native::Client` stores the prebuilt config as `Option` (None = deferred)
  plus the `tls::Client` to rebuild a pinned config per connection.
- quinn/noq/quiche build their per-connection config once the scheme is known:
  an `http://` bootstrap pins a freshly fetched fingerprint (no roots needed),
  anything else reuses the resolved policy and surfaces `NoRoots` at connect
  time. The fingerprint fetch stays on the QUIC side so the WebSocket race can
  still fall back over plaintext `ws://`.
- The WebSocket path only needs roots for secure `wss://`; plaintext `ws://`
  connects without them.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

@sourcery-ai sourcery-ai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry @kixelated, you have reached your weekly rate limit of 500000 diff characters.

Please try again later or upgrade to continue using Sourcery

@coderabbitai

coderabbitai Bot commented Jun 30, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 631f15be-c7e4-4cde-9e5e-3c5f2d99bd32

📥 Commits

Reviewing files that changed from the base of the PR and between 29644f0 and b3d2fe1.

📒 Files selected for processing (6)
  • rs/moq-native/src/lib.rs
  • rs/moq-native/src/noq.rs
  • rs/moq-native/src/quiche.rs
  • rs/moq-native/src/quinn.rs
  • rs/moq-native/src/tls.rs
  • rs/moq-native/src/websocket.rs
🚧 Files skipped from review as they are similar to previous changes (5)
  • rs/moq-native/src/websocket.rs
  • rs/moq-native/src/quiche.rs
  • rs/moq-native/src/quinn.rs
  • rs/moq-native/src/tls.rs
  • rs/moq-native/src/noq.rs

Walkthrough

TLS root resolution is deferred from client initialization to per-connection handling. Client::verification now returns Result<Option<Verification>>, Client::build still fails on None, and Client stores optional TLS state plus a base TLS client for backend rebuilding. QUIC, noq, quiche, and WebSocket paths now accept optional TLS configs and surface missing roots at connect time. Fingerprint fetching was extracted into helpers, and a shared fetch timeout was added.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly states the main change: deferring TLS root resolution for http:// fingerprint bootstrap.
Description check ✅ Passed The description matches the changeset and explains the deferred root-resolution fix across backends.
Linked Issues check ✅ Passed The changes satisfy #1907 by deferring root resolution and updating quinn, noq, quiche, websocket, and tls flow.
Out of Scope Changes check ✅ Passed No unrelated changes stand out; the timeout and error additions support the stated bootstrap fix.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
✨ Finishing Touches
✨ Simplify code
  • Create PR with simplified code
  • Commit simplified code in branch claude/elastic-rhodes-cfe5a7

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (3)
rs/moq-native/src/websocket.rs (1)

41-43: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Add rustdoc for the new public Error::Tls variant.

This enum is public, so the added Tls variant should be documented too.

Proposed fix
+	/// TLS configuration or validation failed for a secure WebSocket connection.
 	#[error(transparent)]
 	Tls(#[from] crate::tls::Error),

As per coding guidelines, **/*.{rs,ts,tsx,js,jsx}: "Document every exported public API symbol: each exported Rust item and each exported JS/TS symbol, including notable public members, must have a doc comment."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@rs/moq-native/src/websocket.rs` around lines 41 - 43, Add rustdoc for the new
public Error::Tls variant in the Error enum so the exported API is documented
consistently. Update the enum definition in websocket.rs near the Error type and
the #[error(transparent)] Tls variant to include a brief doc comment describing
that it wraps TLS-related failures via crate::tls::Error, matching the existing
public API documentation style.

Source: Coding guidelines

rs/moq-native/src/quinn.rs (1)

160-190: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Document the changed connect TLS contract and align the bootstrap comment.

The new base: Option<_>/tls_config contract is subtle, and the inline comment says bootstrap is only used when no stronger verification exists, but allows_http_bootstrap() can still allow an http:// opt-in when CA roots are configured.

Proposed docs/comment update
+	/// Connect to a WebTransport endpoint using Quinn.
+	///
+	/// `base` is the already-resolved TLS configuration. When it is `None`,
+	/// non-bootstrap schemes fail with `NoRoots`, while eligible `http://` URLs
+	/// fetch `/certificate.sha256`, pin that fingerprint, and are upgraded to
+	/// `https://` for the QUIC connection.
 	pub async fn connect(
 		&self,
 		base: Option<&rustls::ClientConfig>,
 		tls_config: &crate::tls::Client,
 		url: Url,
 	) -> Result<web_transport_quinn::Session> {
-			// Insecure per-connection bootstrap: only honored when no stronger
-			// verification is configured, so an attacker controlling the plaintext
-			// fetch can't weaken an explicit pin or re-enable disabled verification.
+			// Insecure per-connection bootstrap: not honored for explicit pins or
+			// disabled verification. An `http://` URL is an explicit opt-in to
+			// replace CA validation with a freshly fetched pin for this connection.

As per coding guidelines, **/*.{rs,ts,tsx,js,jsx}: Document every exported public API symbol.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@rs/moq-native/src/quinn.rs` around lines 160 - 190, The public API contract
for connect in quinn.rs is now subtle because base is optional and tls_config
drives verification, so update the exported method docs to explain how those
inputs interact with HTTP bootstrap and resolved policy. Also align the inline
bootstrap comment near fetch_fingerprint and the http_bootstrap branch with
allows_http_bootstrap(), making it clear that http:// can still be opted in even
when CA roots are configured, rather than implying bootstrap is only allowed
when no stronger verification exists.

Source: Coding guidelines

rs/moq-native/src/noq.rs (1)

162-191: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Document the changed connect TLS contract and align the bootstrap comment.

The new base: Option<_>/tls_config contract is subtle, and the inline comment says bootstrap is only used when no stronger verification exists, but allows_http_bootstrap() can still allow an http:// opt-in when CA roots are configured.

Proposed docs/comment update
+	/// Connect to a WebTransport endpoint using Noq.
+	///
+	/// `base` is the already-resolved TLS configuration. When it is `None`,
+	/// non-bootstrap schemes fail with `NoRoots`, while eligible `http://` URLs
+	/// fetch `/certificate.sha256`, pin that fingerprint, and are upgraded to
+	/// `https://` for the QUIC connection.
 	pub async fn connect(
 		&self,
 		base: Option<&rustls::ClientConfig>,
 		tls_config: &crate::tls::Client,
 		url: Url,
 	) -> Result<web_transport_noq::Session> {
-			// Insecure per-connection bootstrap: only honored when no stronger
-			// verification is configured, so an attacker controlling the plaintext
-			// fetch can't weaken an explicit pin or re-enable disabled verification.
+			// Insecure per-connection bootstrap: not honored for explicit pins or
+			// disabled verification. An `http://` URL is an explicit opt-in to
+			// replace CA validation with a freshly fetched pin for this connection.

As per coding guidelines, **/*.{rs,ts,tsx,js,jsx}: Document every exported public API symbol.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@rs/moq-native/src/noq.rs` around lines 162 - 191, Update the public API docs
for `Noq::connect` to describe the new `base: Option<&rustls::ClientConfig>` and
`tls_config` handshake contract, including when each is used and how bootstrap
policy is resolved. Also revise the inline bootstrap comment in `connect` to
match `allows_http_bootstrap()`, clarifying that an `http://` opt-in can still
be permitted even when CA roots are configured. Keep the wording aligned with
the existing `connect` and `allows_http_bootstrap` behavior so the documentation
matches the actual TLS decision flow.

Source: Coding guidelines

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@rs/moq-native/src/noq.rs`:
- Around line 53-54: The new public Error variant FingerprintLength in
noq::Error needs rustdoc so callers know when it can occur. Add a doc comment on
the FingerprintLength variant itself, and if needed on the Error enum near the
variant, describing that it is returned when a certificate fingerprint is not
exactly 32 bytes (SHA-256). Keep the wording consistent with the existing Error
variants and ensure the public API symbol FingerprintLength is documented.

In `@rs/moq-native/src/quinn.rs`:
- Around line 51-52: Add rustdoc for the new public `Error` variant
`FingerprintLength` in `quinn.rs` so the exported API is documented. Update the
`Error` enum (and any nearby docs if needed) to explain that callers can observe
this variant when a provided certificate fingerprint is not exactly 32 bytes,
keeping the wording consistent with the existing `FingerprintLength(usize)`
symbol and public error surface.
- Around line 268-272: The fingerprint fetch in quinn.rs currently uses
reqwest::get without any overall timeout, so the connect path can hang
indefinitely on a stalled /certificate.sha256 response. Update the fingerprint
request in the connect flow to use the client’s existing connect budget/timeout
rather than the unbounded helper, and keep the existing Error::FetchFingerprint
and Error::FingerprintStatus handling around the request.

---

Nitpick comments:
In `@rs/moq-native/src/noq.rs`:
- Around line 162-191: Update the public API docs for `Noq::connect` to describe
the new `base: Option<&rustls::ClientConfig>` and `tls_config` handshake
contract, including when each is used and how bootstrap policy is resolved. Also
revise the inline bootstrap comment in `connect` to match
`allows_http_bootstrap()`, clarifying that an `http://` opt-in can still be
permitted even when CA roots are configured. Keep the wording aligned with the
existing `connect` and `allows_http_bootstrap` behavior so the documentation
matches the actual TLS decision flow.

In `@rs/moq-native/src/quinn.rs`:
- Around line 160-190: The public API contract for connect in quinn.rs is now
subtle because base is optional and tls_config drives verification, so update
the exported method docs to explain how those inputs interact with HTTP
bootstrap and resolved policy. Also align the inline bootstrap comment near
fetch_fingerprint and the http_bootstrap branch with allows_http_bootstrap(),
making it clear that http:// can still be opted in even when CA roots are
configured, rather than implying bootstrap is only allowed when no stronger
verification exists.

In `@rs/moq-native/src/websocket.rs`:
- Around line 41-43: Add rustdoc for the new public Error::Tls variant in the
Error enum so the exported API is documented consistently. Update the enum
definition in websocket.rs near the Error type and the #[error(transparent)] Tls
variant to include a brief doc comment describing that it wraps TLS-related
failures via crate::tls::Error, matching the existing public API documentation
style.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c263b37f-4d65-45a8-943d-c7e0ea21bd56

📥 Commits

Reviewing files that changed from the base of the PR and between 370dfd0 and 29644f0.

📒 Files selected for processing (6)
  • rs/moq-native/src/client.rs
  • rs/moq-native/src/noq.rs
  • rs/moq-native/src/quiche.rs
  • rs/moq-native/src/quinn.rs
  • rs/moq-native/src/tls.rs
  • rs/moq-native/src/websocket.rs

Comment thread rs/moq-native/src/noq.rs
Comment thread rs/moq-native/src/quinn.rs
Comment thread rs/moq-native/src/quinn.rs Outdated
- build() rustdoc no longer links the pub(crate) verification()/build_with()
  (rustdoc errored: public item linking a private one).
- Bound the http:// fingerprint fetch with a 10s timeout (new
  FINGERPRINT_FETCH_TIMEOUT) so a stalled /certificate.sha256 can't hang
  connect; applied to quinn, noq, and quiche.
- Document the new public error variants (quinn/noq FingerprintLength,
  websocket Tls) and the quinn/noq connect() base/tls_config contract.
- Reword the bootstrap comment: http:// is a deliberate per-connection opt-in
  to replace CA validation, refused only for explicit pins / disabled verify.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@kixelated

Copy link
Copy Markdown
Collaborator Author

Closing unmerged. The fix works, but on reflection the config it unblocks is too niche to justify the added indirection in the TLS-verification path (the one module where less surface area is better). See the issue for the won't-fix rationale and workarounds.

(Written by Claude Opus 4.8)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

TLS: defer root resolution so http:// fingerprint bootstrap works without configured roots

1 participant