Pull requests: UKGovernmentBEIS/inspect_ai
Pull requests list
test(extensions): correct ChatMessage/GenerateConfig types and add hook-ordering regression
#3865
opened May 7, 2026 by
sjawhar
Contributor
1 task done
test(conftest): dynamic ThreadedMotoServer port and xdist worker session-finish guard
#3864
opened May 7, 2026 by
sjawhar
Contributor
2 tasks done
fix(model): GenerateConfig() shared mutable default in get_model and resolve_models
#3863
opened May 7, 2026 by
sjawhar
Contributor
1 task done
fix(log/eval): preflight ETag check on S3 conditional write
#3862
opened May 7, 2026 by
sjawhar
Contributor
1 task done
fix(view): support FastAPI 0.118+ where fastapi._compat.v2 is removed
#3861
opened May 7, 2026 by
sjawhar
Contributor
1 task done
feat(model): per-attempt ModelEvent retry accounting and timing
#3860
opened May 7, 2026 by
sjawhar
Contributor
1 task done
Stop operator-interrupted samples from failing eval on scorer error
#3859
opened May 7, 2026 by
rasmusfaber
Contributor
1 task done
Add direct link to one-time front-end submodule setup in project README
#3856
opened May 6, 2026 by
glasnt
1 of 5 tasks
bedrock: drop unsupported sampling params for Claude 4.7+ (#3766)
#3855
opened May 6, 2026 by
WatchTree-19
1 of 5 tasks
feat(sagemaker): Add prompt_logprobs support for SageMaker perplexity scoring
#3853
opened May 6, 2026 by
avadali-amzn
Contributor
2 of 5 tasks
Fix MMLU CLI command on Evals page (#3834)
#3852
opened May 6, 2026 by
antnewman
2 of 5 tasks
Anthropic: skip top-level cache_control on Bedrock/Vertex
#3851
opened May 6, 2026 by
jon-aisi
1 of 5 tasks
Add aggregate(key, agg=...) metric factory (#3735)
#3850
opened May 6, 2026 by
antnewman
2 of 5 tasks
fix(eval-set): bump retry log timestamp to avoid clobbering failed log
#3837
opened May 5, 2026 by
ransomr
Collaborator
1 task done
hf_dataset: retry transient HF errors
#3836
opened May 5, 2026 by
FazeelUsmani
Contributor
2 of 5 tasks
Fetch pending sample data directly from S3 in viewer
#3835
opened May 5, 2026 by
rasmusfaber
Contributor
3 of 5 tasks
fix: store and aggregate results for cancelled eval runs
#3828
opened May 4, 2026 by
PranshuSrivastava
1 of 5 tasks
Agent bridge: keep ChatMessageSystem.id stable across content mutations
#3806
opened Apr 30, 2026 by
ezra-apollo
3 tasks done
Agent bridge: preserve ChatMessage and ToolCall ids across turns
#3805
opened Apr 30, 2026 by
ezra-apollo
3 tasks done
Fix(scorer): strip % in numeric match for face-value comparisons
#3782
opened Apr 28, 2026 by
RecreationalMath
Contributor
2 of 5 tasks
add redirect to profiles.ini
#3702
opened Apr 17, 2026 by
anthonyduong9
Contributor
Draft
1 of 5 tasks
Per-sandbox server dir for inspect-sandbox-tools (fixes sandbox="local")
#3695
opened Apr 16, 2026 by
jon-aisi
5 tasks done