Enable register spilling to shared memory#1132

Draft
stephenswat wants to merge 1 commit into acts-project:main from stephenswat:perf/spill_to_smem

@stephenswat
Member

CUDA 13.0 enables the PTX assembler to spill registers to shared memory instead of local memory. This should be much faster, and it should also reduce the local memory usage of our fitting and finding kernels, which currently bottlenecks our throughput.

@stephenswat stephenswat added the performance Performance-relevant changes label Aug 20, 2025
@stephenswat
Member Author

I'm not 100% certain this works as intended in its current form, since the pragma is meant to be attached at function scope. But we can try.
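For reference, a minimal sketch of what function-scope placement looks like, assuming the `enable_smem_spilling` PTX pragma that ships with CUDA 13.0. The kernel `scale_kernel` and the helper `run_scale` are hypothetical and not taken from this PR; the version guard and the explicit launch bound are assumptions about how one would use the pragma safely, not a description of what the PR does.

```cuda
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel for illustration only; it is not part of this PR.
// The explicit launch bound tells ptxas how many threads share a block's
// shared memory when it sizes the spill area.
__global__ void __launch_bounds__(256)
scale_kernel(float* data, int n, float factor) {
#if defined(__CUDACC_VER_MAJOR__) && (__CUDACC_VER_MAJOR__ >= 13)
    // Function-scope opt-in: the pragma is emitted inside the kernel body.
    asm volatile(".pragma \"enable_smem_spilling\";");
#endif
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] *= factor;
    }
}

// Hypothetical host helper: launches the kernel and checks the result.
bool run_scale(int n) {
    std::vector<float> host(n, 1.0f);
    float* dev = nullptr;
    if (cudaMalloc(&dev, n * sizeof(float)) != cudaSuccess) {
        return false;
    }
    cudaMemcpy(dev, host.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    scale_kernel<<<(n + 255) / 256, 256>>>(dev, n, 2.0f);
    cudaError_t err = cudaDeviceSynchronize();
    cudaMemcpy(host.data(), dev, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dev);
    if (err != cudaSuccess) {
        return false;
    }
    for (float v : host) {
        if (v != 2.0f) {
            return false;
        }
    }
    return true;
}
```

This toy kernel does not spill at all, so the pragma has no effect here; it only shows where the opt-in sits. Whether the same effect can be achieved from outside the function body is exactly the open question above.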

@beomki-yeo
Contributor

beomki-yeo commented Aug 20, 2025

This is interesting, as we are not actively using shared memory in our finding and fitting kernels.
On the other hand, I hope the compiler is smart enough not to overuse shared memory, as that can reduce the number of concurrent blocks. (It would be great if we could limit how much shared memory register spilling is allowed to use.) Please let us know if there is any noticeable performance change.
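One way to check the occupancy concern is to query the compiled kernel's resource footprint before and after enabling the pragma. The following is a rough sketch using the CUDA runtime calls `cudaFuncGetAttributes` and `cudaOccupancyMaxActiveBlocksPerMultiprocessor`. The kernel `dummy_kernel`, the struct `KernelFootprint`, and the helper `query_footprint` are hypothetical names; it is also an assumption, not verified here, that shared-memory spill space shows up in `sharedSizeBytes`.

```cuda
#include <cstddef>
#include <cuda_runtime.h>

// Hypothetical stand-in for a finding or fitting kernel.
__global__ void __launch_bounds__(256) dummy_kernel(float* data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] += 1.0f;
    }
}

// Hypothetical summary of the numbers relevant to the spilling trade-off.
struct KernelFootprint {
    int regs = 0;                  // registers per thread
    std::size_t local_bytes = 0;   // local memory per thread (classic spill target)
    std::size_t shared_bytes = 0;  // static shared memory per block
    int blocks_per_sm = 0;         // resident blocks per SM at this block size
    bool ok = false;               // true if both runtime queries succeeded
};

KernelFootprint query_footprint(int block_size) {
    KernelFootprint fp;
    cudaFuncAttributes attr{};
    if (cudaFuncGetAttributes(&attr, dummy_kernel) != cudaSuccess) {
        return fp;
    }
    fp.regs = attr.numRegs;
    fp.local_bytes = attr.localSizeBytes;
    fp.shared_bytes = attr.sharedSizeBytes;
    // Last argument is the dynamic shared memory size per block (none here).
    if (cudaOccupancyMaxActiveBlocksPerMultiprocessor(
            &fp.blocks_per_sm, dummy_kernel, block_size, 0) != cudaSuccess) {
        return fp;
    }
    fp.ok = true;
    return fp;
}
```

Comparing `local_bytes`, `shared_bytes`, and `blocks_per_sm` for builds with and without the pragma would indicate whether the spill space costs resident blocks, although only a throughput measurement answers the performance question.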

@sonarqubecloud

sonarqubecloud Bot commented Jan 9, 2026

@stephenswat

This comment was marked as outdated.

@stephenswat stephenswat force-pushed the perf/spill_to_smem branch 2 times, most recently from d9595e0 to 8c1878e Compare May 7, 2026 15:27
@stephenswat stephenswat force-pushed the perf/spill_to_smem branch from 8c1878e to b22a9c4 Compare May 7, 2026 15:35
@sonarqubecloud

sonarqubecloud Bot commented May 7, 2026
