bpf2c array map lookup optimization.#5127
bpf2c array map lookup optimization.#5127saxena-anurag wants to merge 20 commits intomicrosoft:mainfrom
Conversation
1 similar comment
1703f65 to
fcc6d15
Compare
…//github.com/saxena-anurag/ebpf-for-windows-1 into user/anusa/bpf2c_direct_map_lookup_array_map
1 similar comment
2 similar comments
Alan-Jowett
left a comment
There was a problem hiding this comment.
Thanks for the detailed design doc and implementation! I have a few concerns — one blocker and two correctness issues.
| [submodule "external/ebpf-verifier"] | ||
| path = external/ebpf-verifier | ||
| url = https://github.com/vbpf/ebpf-verifier.git | ||
| url = https://github.com/saxena-anurag/ebpf-verifier.git |
There was a problem hiding this comment.
Blocker: This changes the ebpf-verifier submodule URL from the upstream vbpf/ebpf-verifier.git to a personal fork (saxena-anurag/ebpf-verifier.git). This PR cannot merge with this change — it would make the entire repo depend on a personal fork.
The new verifier APIs (get_map_fd_range, get_map_type, Call::is_map_lookup) need to be upstreamed to vbpf/ebpf-verifier first, and this submodule should continue pointing to the upstream repo.
There was a problem hiding this comment.
Updated. Was waiting for prevails PRs to be merged.
| if (annotations != nullptr) { | ||
| for (size_t i = 0; i < count; i++) { | ||
| _map_annotations[annotations[i].instruction_offset] = annotations[i]; | ||
| } |
There was a problem hiding this comment.
Correctness concern — annotation offset collision with BPF-to-BPF subprograms
_map_annotations is keyed solely by instruction_offset and is shared across all programs in the generator. In extract_program(), subprograms are recursively extracted and each calls generate() — all sharing the same _map_annotations map.
Subprogram instruction offsets restart at 0 (see extract_program line 398: uint32_t offset = 0), while the annotations contain main program offsets (also starting at 0). If a subprogram has a bpf_map_lookup_elem call at the same local offset as an annotated call in the main program, it would incorrectly match the main program's annotation and inline the wrong map.
Suggested fix: key annotations by (program_name, instruction_offset) pair, or scope the annotation map per-program rather than sharing it globally.
| @@ -0,0 +1,49 @@ | |||
| // Copyright (c) eBPF for Windows contributors | |||
There was a problem hiding this comment.
Missing test coverage for the optimized (inline) path
This test validates the fallback (ambiguous map → no inline), which is great. But I don't see a targeted runtime test that validates the optimized inline path end-to-end:
- Create an array map and write a known value
- Load a native program that uses the inline
array_datalookup path - Verify the correct value is returned
The existing map tests provide some indirect coverage, but a test that specifically exercises the map_data[idx].array_data + key * value_size code path would catch regressions in the array_data pointer population (e.g., if ebpf_map_get_value_address returns a wrong pointer).
Fixes #5177
Description
This pull request introduces a verifier-assisted optimization to enable bpf2c to inline array map lookups for
BPF_MAP_TYPE_ARRAYmaps, eliminating unnecessary indirect helper calls when the verifier can prove which map is being accessed. It achieves this by extracting precise per-instruction map annotations from the PREVAIL verifier and extending runtime support for direct array data access. The changes include new APIs, data structure extensions, runtime loader logic, and documentation.Verifier and API Enhancements:
ebpf_get_map_annotations_from_verifierand its export, which provides per-instruction map annotations (map identity, type, name, etc.) from the verifier for use by bpf2c and other consumers. (include/ebpf_api.h,ebpfapi/Source.def,libs/api_common/api_common.cpp) [1] [2] [3] [4] [5] [6]bpf2c and Loader Runtime Support:
map_data_tstructure to include a direct pointer to array map data (array_data) and bumped the version for compatibility; updated size/version tables accordingly. (include/bpf2c.h,libs/shared/shared_common.c) [1] [2] [3]array_datafield for array maps usingebpf_map_get_value_address, enabling inlined lookups at runtime. (libs/execution_context/ebpf_native.c) [1] [2]Verifier-Assisted Optimization Logic:
libs/api_common/api_common.cpp)Documentation:
docs/VerifierAssistedMapOptimization.md) detailing the problem, approach, architecture, runtime support, and correctness guarantees for this optimization.These changes together enable safer and more efficient code generation for array map lookups, with clear fallback to standard helper calls when the verifier cannot prove map identity.
Testing
Existing CICD tests.
Perf numbers:
There is ~40% improvement in array map read numbers.
Before
After
Do any existing tests cover this change? Are new tests needed?
If new tests were added:
Documentation
Is there any documentation impact for this change?
Installation
Is there any installer impact for this change?