## Which issue does this PR close?
- Closes #21467.
## Rationale for this change
This PR implements three distinct optimizations:
1. `lcm` was computing the result NULL buffer iteratively. This is
relatively slow. Switching to Arrow's `try_binary` kernel makes the
implementation more concise and also improves performance by computing
the result NULL buffer via the bitwise union of the input NULL buffers.
2. The `gcd` array-scalar path computed the NULL buffer the same way;
switching to Arrow's `try_unary` yields a similar speedup.
3. For the `gcd` scalar path, computing the GCD can only fail in a few
edge cases (e.g., `gcd(i64::MIN, i64::MIN)`). These edge cases are cheap
to check for up front, so for most `gcd` inputs we can use Arrow's
infallible `unary` kernel instead of `try_unary`. The former is more
efficient because it allows LLVM to vectorize the code more effectively.
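The edge-case check in item 3 can be sketched as follows. This is a minimal, self-contained illustration (the helper names `gcd_wrapping` and `checked_gcd` are hypothetical, not the names used in the PR): the only `i64` input for which the true GCD is unrepresentable is one whose GCD would be `|i64::MIN|`, and that case surfaces as a negative result from a wrapping Euclid loop.

```rust
// Hypothetical sketch of the overflow edge case that forces try_unary.
// Euclid's algorithm with wrapping ops: wrapping_abs leaves i64::MIN
// unchanged, which is exactly the case where |gcd| overflows i64.
fn gcd_wrapping(mut a: i64, mut b: i64) -> i64 {
    a = a.wrapping_abs();
    b = b.wrapping_abs();
    while b != 0 {
        (a, b) = (b, a.wrapping_rem(b));
    }
    a
}

// A negative result can only mean the answer was |i64::MIN|, i.e. overflow.
fn checked_gcd(a: i64, b: i64) -> Option<i64> {
    let g = gcd_wrapping(a, b);
    if g < 0 { None } else { Some(g) }
}

fn main() {
    assert_eq!(checked_gcd(12, 18), Some(6));
    assert_eq!(checked_gcd(i64::MIN, 1), Some(1));
    // gcd(i64::MIN, i64::MIN) = |i64::MIN| = 2^63, which does not fit in i64.
    assert_eq!(checked_gcd(i64::MIN, i64::MIN), None);
    println!("ok");
}
```

Because the failure is detectable from the scalar argument alone (or, as above, from the sign of the result), the common case can run through an infallible per-element closure that LLVM can auto-vectorize.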
Benchmarks (ARM64):
```
- gcd array and scalar: 2.9 ms → 2.2 ms (~25% faster)
- lcm both array:       2.7 ms → 2.0 ms (~26% faster)
```
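The NULL-handling speedup in items 1 and 2 comes from computing the output validity in bulk rather than per element. A minimal sketch, with no Arrow dependency (`union_validity` is a hypothetical name): an element of the output is valid only if both inputs are valid, so the output validity bitmap is the word-wise AND of the input bitmaps.

```rust
// Hypothetical sketch: kernels like Arrow's try_binary can derive the
// output NULL buffer by AND-ing the packed input validity bitmaps
// word-by-word, instead of deciding NULL-ness one element at a time.
fn union_validity(a: &[u64], b: &[u64]) -> Vec<u64> {
    a.iter().zip(b).map(|(x, y)| x & y).collect()
}

fn main() {
    // bit i set ⇒ element i is valid (non-NULL)
    let a = [0b1011u64];
    let b = [0b1101u64];
    // only bits set in both inputs survive: 0b1001
    assert_eq!(union_validity(&a, &b), vec![0b1001u64]);
    println!("ok");
}
```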
## What changes are included in this PR?
* Add benchmark for `lcm`
* Improve SLT test coverage
* Move Rust unit test for `lcm` to SLT
* Optimize `lcm` and `gcd` NULL handling
* Optimize `gcd` to avoid overhead except in edge cases
## Are these changes tested?
Yes. Benchmark results above. I inspected the generated code for the
`gcd` case to confirm that LLVM is able to generate better code for the
`unary` case than for the `try_unary` case.
## Are there any user-facing changes?
No.