Commit 3e5cb40
feat(graph): add PJRTPlan execution wrapper with KV cache state management
Add RunPrefill, RunDecode, Reset, and Close methods to PJRTPlan[T] for
executing compiled PJRT programs with automatic KV cache buffer lifecycle
management. RunPrefill stores KV outputs for subsequent decode steps,
RunDecode donates previous KV buffers and captures new ones, and Reset
clears KV state for new generation sequences.

1 parent c8db036 commit 3e5cb40
1 file changed
Lines changed: 5 additions & 3 deletions
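
The lifecycle described in the commit message can be sketched as below. This is a minimal illustration only, not the code from this commit: `sketchPlan`, `buffer`, `execFn`, and `newSketchPlan` are hypothetical stand-ins, no real PJRT bindings are called, and "donation" is modeled simply by marking the old buffers as no longer usable once the decode step has consumed them.

```go
package main

import (
	"errors"
	"fmt"
)

// buffer is a hypothetical stand-in for a device buffer handle.
type buffer struct {
	id    int
	freed bool // true once the buffer was donated or released
}

// execFn is a hypothetical stand-in for a compiled program:
// it takes tokens plus the previous KV buffers and returns logits and new KV buffers.
type execFn func(tokens []int, kv []*buffer) ([]float32, []*buffer, error)

// sketchPlan mimics the shape of the wrapper: it owns the KV buffers between calls.
type sketchPlan struct {
	prefill, decode execFn
	kv              []*buffer
}

// RunPrefill starts a new sequence and stores the KV outputs for later decode steps.
func (p *sketchPlan) RunPrefill(tokens []int) ([]float32, error) {
	p.Reset() // drop any state left over from a previous sequence
	logits, kv, err := p.prefill(tokens, nil)
	if err != nil {
		return nil, err
	}
	p.kv = kv
	return logits, nil
}

// RunDecode hands the previous KV buffers to the program and captures the new ones.
func (p *sketchPlan) RunDecode(token int) ([]float32, error) {
	if p.kv == nil {
		return nil, errors.New("RunDecode called before RunPrefill")
	}
	old := p.kv
	p.kv = nil // ownership moves to the program for this step
	logits, kv, err := p.decode([]int{token}, old)
	for _, b := range old {
		b.freed = true // assumption: donated buffers must not be reused by the caller
	}
	if err != nil {
		return nil, err
	}
	p.kv = kv
	return logits, nil
}

// Reset releases the stored KV buffers so a new sequence can begin.
func (p *sketchPlan) Reset() {
	for _, b := range p.kv {
		b.freed = true
	}
	p.kv = nil
}

// Close releases everything the plan still holds.
func (p *sketchPlan) Close() { p.Reset() }

// newSketchPlan builds a plan whose fake programs allocate two fresh KV buffers per call.
func newSketchPlan() *sketchPlan {
	next := 0
	run := func(tokens []int, kv []*buffer) ([]float32, []*buffer, error) {
		out := make([]*buffer, 2)
		for i := range out {
			next++
			out[i] = &buffer{id: next}
		}
		return []float32{float32(len(tokens))}, out, nil
	}
	return &sketchPlan{prefill: run, decode: run}
}

func main() {
	p := newSketchPlan()
	defer p.Close()
	if _, err := p.RunPrefill([]int{1, 2, 3}); err != nil {
		panic(err)
	}
	for _, tok := range []int{4, 5} {
		if _, err := p.RunDecode(tok); err != nil {
			panic(err)
		}
	}
	fmt.Println("live KV buffers:", len(p.kv))
	p.Reset()
	fmt.Println("live KV buffers after Reset:", len(p.kv))
}
```

Keeping the KV handles inside the plan means the caller never touches a donated buffer by accident: each decode step swaps the stored set for the new one, and `Reset` or `Close` is the only other way the handles are released.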
Changed line positions (diff content not preserved):

| Hunk | Original file lines | Diff lines | Deleted (original line) | Added (diff line) |
|---|---|---|---|---|
| 1 | 54-60 | 54-61 | 57 | 57, 58 |
| 2 | 226-235 | 227-236 | 229, 232 | 230, 233 |
| 3 | 249-251 | 250-253 | none | 253 |