Update TypeTree chapter to reflect newer understanding of it by ZuseZ4 · Pull Request #2911 · rust-lang/rustc-dev-guide

ZuseZ4 · 2026-06-26T20:49:09Z

@scottmcm @workingjubilee does that match your understanding after the last discussions?

rustbot · 2026-06-26T20:49:11Z

Thanks for the PR. If you have write access, feel free to merge this PR if it does not need reviews. You can request a review using r? rustc-dev-guide or r? <username>.

workingjubilee

some typos and some concept-level thoughts

workingjubilee · 2026-07-03T23:02:52Z

-Memory layout descriptors for Enzyme. Tell Enzyme exactly how types are structured in memory so it can compute derivatives efficiently.
+Memory layout descriptors for Enzyme. They tell Enzyme what "type" bytes are, with the main categories being Float, Integer, or Pointer. In Rust, memory is conceptually untyped, so it is possible to store a float into 4 bytes, and later read the bytes back as an integer. This is generally true in Rust even in the absence of `enum` or `union` types. We therefore can not directly put typetree metadata on allocations. We can also not accept Enzyme's default behaviour, which incorrectly assumes that LLVM-IR follows `strict aliasing` rules (known from C/C++). As a solution, we disable Enzyme's strict-aliasing behaviour and only generate TypeTree metadata in selected locations.
+
+## Where we generate TypeTree


Suggested change

-## Where we generate TypeTree
+## Where we generate TypeTrees

workingjubilee · 2026-07-03T23:04:36Z


 ## What are TypeTrees?
-Memory layout descriptors for Enzyme. Tell Enzyme exactly how types are structured in memory so it can compute derivatives efficiently.
+Memory layout descriptors for Enzyme. They tell Enzyme what "type" bytes are, with the main categories being Float, Integer, or Pointer. In Rust, memory is conceptually untyped, so it is possible to store a float into 4 bytes, and later read the bytes back as an integer. This is generally true in Rust even in the absence of `enum` or `union` types. We therefore can not directly put typetree metadata on allocations. We can also not accept Enzyme's default behaviour, which incorrectly assumes that LLVM-IR follows `strict aliasing` rules (known from C/C++). As a solution, we disable Enzyme's strict-aliasing behaviour and only generate TypeTree metadata in selected locations.


This can be split across multiple lines since Markdown (proper Markdown, like the book's) gets concatenated across newlines (but not double-newlines). This allows diffing individual sentences of a paragraph.

You do not have to do this if you do not like how it looks; it is just a suggestion.

Suggested change

-Memory layout descriptors for Enzyme. They tell Enzyme what "type" bytes are, with the main categories being Float, Integer, or Pointer. In Rust, memory is conceptually untyped, so it is possible to store a float into 4 bytes, and later read the bytes back as an integer. This is generally true in Rust even in the absence of `enum` or `union` types. We therefore can not directly put typetree metadata on allocations. We can also not accept Enzyme's default behaviour, which incorrectly assumes that LLVM-IR follows `strict aliasing` rules (known from C/C++). As a solution, we disable Enzyme's strict-aliasing behaviour and only generate TypeTree metadata in selected locations.
+Memory layout descriptors for Enzyme. They tell Enzyme what "type" bytes are, with the main categories being Float, Integer, or Pointer. In Rust, memory is conceptually untyped, so it is possible to store a float into 4 bytes, and later read the bytes back as an integer. This is generally true in Rust even in the absence of `enum` or `union` types. We therefore can not directly put typetree metadata on allocations. We can also not accept Enzyme's default behaviour, which incorrectly assumes that LLVM-IR follows `strict aliasing` rules (known from C/C++).
+As a solution, we disable Enzyme's strict-aliasing behaviour and only generate TypeTree metadata where Rust actively asserts a type.
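
To make the "store a float, read it back as an integer" point concrete, here is a minimal Rust sketch. The function name `reinterpret_in_place` is made up for illustration and nothing here comes from the PR itself. It only shows that the same four bytes can be written as an `f32` and read as a `u32`, with no `enum` or `union` in sight.

```rust
/// Writes an `f32` into four bytes of memory, then reads those same bytes
/// back as a `u32`. The name is hypothetical, chosen for this sketch only.
fn reinterpret_in_place(x: f32) -> u32 {
    // Four bytes of memory. `u32` and `f32` share size and alignment.
    let mut slot: u32 = 0;
    let p: *mut u32 = &mut slot;
    unsafe {
        // Store: the bytes now hold the bit pattern of a float.
        p.cast::<f32>().write(x);
        // Load: the very same bytes are read as an integer.
        p.read()
    }
}
```

Calling `reinterpret_in_place(1.0)` yields the IEEE 754 bit pattern of `1.0f32`. Metadata attached to the allocation of `slot` saying "this is a float" would be wrong for the final read, which is the reason given above for not putting TypeTrees on allocations.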

workingjubilee · 2026-07-03T23:12:34Z

+Memory layout descriptors for Enzyme. They tell Enzyme what "type" bytes are, with the main categories being Float, Integer, or Pointer. In Rust, memory is conceptually untyped, so it is possible to store a float into 4 bytes, and later read the bytes back as an integer. This is generally true in Rust even in the absence of `enum` or `union` types. We therefore can not directly put typetree metadata on allocations. We can also not accept Enzyme's default behaviour, which incorrectly assumes that LLVM-IR follows `strict aliasing` rules (known from C/C++). As a solution, we disable Enzyme's strict-aliasing behaviour and only generate TypeTree metadata in selected locations.
+
+## Where we generate TypeTree
+The underlying idea is that memory "at rest" is untyped, but plenty of usages interprete bytes in a way that we can communicate to Enzyme. For example, when we call a function, the memory passed to it is interpreted according to the function's signature, so we can add TypeTrees to the LLVM-IR function definitions. We currently only do that for the outermost functions differentiated (those that have a `#[autodiff]` macro on them), but we plan to extend it to all functions which are called from them. We currently also generate TypeTree information for all calls to mem{cpy|move|set}. Finally, we started to add TypeTrees to the input or return values of certain instructions, for now that mainly is `extractvalue`.


Suggested change

-The underlying idea is that memory "at rest" is untyped, but plenty of usages interprete bytes in a way that we can communicate to Enzyme. For example, when we call a function, the memory passed to it is interpreted according to the function's signature, so we can add TypeTrees to the LLVM-IR function definitions. We currently only do that for the outermost functions differentiated (those that have a `#[autodiff]` macro on them), but we plan to extend it to all functions which are called from them. We currently also generate TypeTree information for all calls to mem{cpy|move|set}. Finally, we started to add TypeTrees to the input or return values of certain instructions, for now that mainly is `extractvalue`.
+The underlying idea is that while the memory of a place is untyped, plenty of usages impose a type assertion on bytes in ways that we can communicate to Enzyme.
+For example, when we call a function, its arguments and return values are passed by typed copies matching the function's signature, so we can add TypeTrees to the LLVM-IR function definitions.
+We currently only do that for the outermost functions differentiated (those that have a `#[autodiff]` macro on them), but we plan to extend it to all functions which are called from them. We currently also generate TypeTree information for all calls to mem{cpy|move|set}. Finally, we started to add TypeTrees to the input or return values of certain instructions, for now that mainly is `extractvalue`.

So this one is not just nitpicking the spelling of "interpret"... the interpretation is a very specific kind, and it applies to the values that receive what are often referred to as "typed copies".

Saying "memory" risks being vague for people because "memory" can mean both the values that receive typed copies and the memory in a place, and an argument can be a pointer to a place.

Much of the idea of the validity here can be considered as matching https://doc.rust-lang.org/std/mem/fn.transmute.html
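
A small Rust sketch of the distinction, in case it helps. Both function names are hypothetical and exist only for this illustration. The first function receives its argument by typed copy, the second receives only a pointer by typed copy while the bytes behind it stay untyped until they are read.

```rust
/// The argument arrives by typed copy: the callee may rely on it being an `f64`.
fn square_by_value(x: f64) -> f64 {
    x * x
}

/// Only the pointer itself arrives by typed copy. The bytes behind it carry
/// no type until the read below asserts `f64` for eight of them.
///
/// # Safety
/// `p` must point to at least eight readable, initialized bytes.
unsafe fn square_behind_pointer(p: *const u8) -> f64 {
    // The typed read happens here, not at the call boundary.
    let x = unsafe { p.cast::<f64>().read_unaligned() };
    x * x
}
```

Under this reading, a signature like `fn(f64) -> f64` says something about the value, while `fn(*const u8) -> f64` says something about the pointer and the return value only, not about the place the pointer refers to.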

workingjubilee · 2026-07-03T23:14:09Z

+The underlying idea is that memory "at rest" is untyped, but plenty of usages interprete bytes in a way that we can communicate to Enzyme. For example, when we call a function, the memory passed to it is interpreted according to the function's signature, so we can add TypeTrees to the LLVM-IR function definitions. We currently only do that for the outermost functions differentiated (those that have a `#[autodiff]` macro on them), but we plan to extend it to all functions which are called from them. We currently also generate TypeTree information for all calls to mem{cpy|move|set}. Finally, we started to add TypeTrees to the input or return values of certain instructions, for now that mainly is `extractvalue`.
+
+## How we add TypeTrees
+If we determined that a value has a meaningfull type, then we walk the MIR `Ty` of that value in the middle-end and generate a Rust TypeTree out of it. In the codegen\_llvm backend we lower our Rust TypeTree to LLVM/Enzyme TypeTrees. We then attach them to one of three locations:


Suggested change

-If we determined that a value has a meaningfull type, then we walk the MIR `Ty` of that value in the middle-end and generate a Rust TypeTree out of it. In the codegen\_llvm backend we lower our Rust TypeTree to LLVM/Enzyme TypeTrees. We then attach them to one of three locations:
+If we determine that a value has a meaningful type, then we walk the MIR `Ty` of that value in the middle-end and generate a Rust TypeTree out of it. In the `codegen_llvm` backend we lower our Rust TypeTree to Enzyme TypeTrees. We then attach them to one of three locations:

Hm. Calling them LLVM/Enzyme TypeTrees confuses the matter, I think? What makes them "LLVM/Enzyme TypeTrees"? Is it because they are directly embedded in the LLVM IR? I think it could just say that, then?

This should probably explain that first, actually: "A TypeTree is an Enzyme concept that gets smuggled through LLVM IR metadata" (https://llvm.org/docs/LangRef.html#metadata).
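
For readers who have not seen one, here is a rough sketch of how a TypeTree could be modelled and printed in the textual form that shows up in the `"enzyme_type"` attributes quoted in this PR. Every type and function name below (`Kind`, `Entry`, `TypeTree`, `pointer_to_doubles`) is hypothetical and is not rustc's or Enzyme's actual data structure. The reading of `-1` as "every offset" is my understanding of Enzyme's convention, so treat it as an assumption.

```rust
use std::fmt;

/// Hypothetical stand-in for the categories named in the chapter.
#[derive(Clone, Copy)]
enum Kind {
    Pointer,
    Integer,
    Double,
}

/// One entry: a path of byte offsets (assumed: -1 means "every offset")
/// followed by what is found there.
struct Entry {
    path: Vec<i64>,
    kind: Kind,
}

/// A flat list of entries, printed as `{[path]:Kind, ...}`.
struct TypeTree(Vec<Entry>);

impl fmt::Display for TypeTree {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let entries: Vec<String> = self
            .0
            .iter()
            .map(|e| {
                let path: Vec<String> = e.path.iter().map(|o| o.to_string()).collect();
                let kind = match e.kind {
                    Kind::Pointer => "Pointer",
                    Kind::Integer => "Integer",
                    Kind::Double => "Float@double",
                };
                format!("[{}]:{}", path.join(","), kind)
            })
            .collect();
        write!(f, "{{{}}}", entries.join(", "))
    }
}

/// Describes "a pointer whose pointee bytes are all doubles".
fn pointer_to_doubles() -> TypeTree {
    TypeTree(vec![
        Entry { path: vec![-1], kind: Kind::Pointer },
        Entry { path: vec![-1, -1], kind: Kind::Double },
    ])
}
```

Formatting `pointer_to_doubles()` with `to_string()` gives the same string that appears on the function arguments in the chapter's LLVM IR example. How the real lowering in `codegen_llvm` builds and attaches this is not shown here.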

workingjubilee · 2026-07-03T23:14:39Z

+define internal void @_RNvCs7tI50jyFEig_3foo1f(ptr align 8 "enzyme_type"="{[-1]:Pointer, [-1,-1]:Float@double}" %0, ptr align 8 "enzyme_type"="{[-1]:Pointer, [-1,-1]:Float@double}" %1, ptr align 8 "enzyme_type"="{[-1]:Pointer, [-1,-1]:Float@double}" %2) unnamed_addr #0 !dbg !1089 {
+```
+
+Argument to calls:


Suggested change

-Argument to calls:
+Arguments to calls:

workingjubilee · 2026-07-03T23:20:17Z

- Tells Enzyme which bytes are differentiable vs metadata
+## Why are they needed?
+- Plenty of LLVM types are opaque (e.g. `ptr`), but types are needed to compute the correct derivatives.
+- They tell Enzyme which bytes are differentiable (e.g. the pointer to float within a slice) vs metadata (e.g. the integer length of a slice)


or "a float", but the pointer of a slice can also be to zero floats, so...

Suggested change

-- They tell Enzyme which bytes are differentiable (e.g. the pointer to float within a slice) vs metadata (e.g. the integer length of a slice)
+- They tell Enzyme which bytes are differentiable (e.g. the pointer to floats within a slice) vs. metadata (e.g. the integer length of a slice)
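
As a quick illustration of the slice example, the sketch below splits a `&[f64]` into the two parts the bullet talks about. The helper name `slice_parts` is made up for this comment. It also covers the "zero floats" case: an empty slice still has a pointer and a length.

```rust
/// Splits a slice reference into its two components: the pointer to the
/// floats (the differentiable data) and the integer length (metadata).
/// The helper name is hypothetical.
fn slice_parts(xs: &[f64]) -> (*const f64, usize) {
    (xs.as_ptr(), xs.len())
}
```

For a three-element array this returns a pointer to the first element and `3`. For an empty slice the length is `0` and the pointer must not be dereferenced, which is why "pointer to floats" reads better than "pointer to float".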

workingjubilee · 2026-07-03T23:20:49Z

+## Why are they needed?
+- Plenty of LLVM types are opaque (e.g. `ptr`), but types are needed to compute the correct derivatives.
+- They tell Enzyme which bytes are differentiable (e.g. the pointer to float within a slice) vs metadata (e.g. the integer length of a slice)
+- Enzyme can't deduce all types from LLVM IR, but can (to some extend) deduce them from usage (Type Analysis).


Suggested change

-- Enzyme can't deduce all types from LLVM IR, but can (to some extend) deduce them from usage (Type Analysis).
+- Enzyme can't deduce all types from LLVM IR, but can (to some extent) deduce them from usage (Type Analysis).

workingjubilee · 2026-07-03T23:21:59Z

+  call void @llvm.memcpy.p0.p0.i64(ptr align 8 "enzyme_type"="{[0]:Pointer, [0,0]:Pointer, [0,0,-1]:Float@double}" %6, ptr align 8 "enzyme_type"="{[0]:Pointer, [0,0]:Pointer, [0,0,-1]:Float@double}" %0, i64 24, i1 false), !dbg !669
+```
+
+Input or return values of instructions, via debug metadata:


that's what these are, right? https://llvm.org/docs/LangRef.html#metadata-nodes-mdnode

Suggested change

-Input or return values of instructions, via debug metadata:
+Input or return values of instructions rustc uses for typed copies, via metadata nodes:

Update TypeTree chapter to reflect newer understanding of it

f0cfd5b

rustbot added the S-waiting-on-review label (Status: this PR is waiting for a reviewer to verify its content) on Jun 26, 2026

ZuseZ4 mentioned this pull request Jun 26, 2026

Fix typetree generation for differentiated functions rust-lang/rust#158333

Open

workingjubilee reviewed Jul 3, 2026

ZuseZ4 wants to merge 1 commit into main from typetree-updates
