Last Session: 2026-04-20
| Version | Status | Branch |
|---|---|---|
| v0.3.0 | ✅ Released on PyPI | main |
| v0.4.0 | ✅ Released on PyPI | main |
| v0.4.2 | ✅ Released on PyPI | main |
| v0.4.3 | ✅ Released on PyPI | main |
| v0.4.4 | ✅ Released on PyPI | main |
| v0.4.5 | ✅ Released on PyPI | main |
| v0.4.6 | ✅ Released on PyPI | main |
| v0.4.7 | ✅ Released on PyPI | main |
| v0.4.8 | ✅ Released on PyPI | main |
| v0.5.0 | 📋 Not yet released — codebase on main is ahead of PyPI | main |
| Artifact | Status |
|---|---|
| Source Code | ✅ Complete |
| Test Suite | ✅ 372 tests passing |
| README.md | ✅ Comprehensive — tagline, features, architecture updated for current state |
| CHANGELOG.md | ✅ v0.1.0-v0.4.0 documented (needs v0.4.8 entry) |
| CONTRIBUTING.md | ✅ Contributor guidelines |
| LICENSE | ✅ Elastic License 2.0 (Sivan Grünberg, Vitakka Consulting) |
| CI/CD | ✅ GitHub Actions (test 3.11, test 3.12, lint, typecheck) — all green |
| Documentation | ✅ User Guide + Technical Guide + Multi-agent Guide + Sixth Sense (3 guides) + Agent Sentinel (5 guides) + PR #4 Fixes walkthrough |
| Wiki | ✅ https://github.com/sivang/bedsheet/wiki (Home + PR-4-Fixes-Explained) |
| Copyright | ✅ All 14 HTML docs have footer + 11 sidebar docs have sidebar copyright |
| Examples | ✅ Investment advisor demo + Agent Sentinel security demo + Cloud Monitor demo |
| Demo | ✅ uvx bedsheet demo (requires GEMINI_API_KEY, uses REAL DATA from Yahoo Finance + DuckDuckGo) |
| pyproject.toml | ✅ PyPI ready |
- Sixth Sense — distributed agent communication (
bedsheet/sense/) - LLM Recording & Replay —
RecordingLLMClient+ReplayLLMClient(bedsheet/recording.py) - GeminiClient — first-class Gemini support with thought-signature handling (
bedsheet/llm/gemini.py) - LLM Factory —
make_llm_client()picks provider from env vars (bedsheet/llm/factory.py) - Transport Factory —
make_sense_transport()picks transport from env vars (bedsheet/sense/factory.py) - Agent Sentinel — full security demo with Action Gateway, sentinels, commander, live dashboard
- Cloud Monitor — second multi-agent example
- Annotated schema support —
Annotated[T, "desc"]for tool parameter descriptions - Verbose logging —
print_event()framework-level stdout logging - Empty response handling — agent loop explicitly catches "no text, no tool calls"
- 9 bug fixes from two rounds of automated code review (B1/B2/B3/H1/M1 + 5 hardening items)
- +46 new tests (326 → 372)
| Issue | Description |
|---|---|
| #5 | v0.5.x cleanup: provider-state refactor, recording dataclass, Signal validation, test gaps, heartbeat dead code, Agent internals encapsulation |
-
Recording replay fix —
ReplayLLMClientnow usesstop_reason="replay_exhausted"instead oftext=None(which triggered the agent's empty-response guard) or synthetic text (which leaked into PubNub, memory, and streaming). Custom stop_reason handled explicitly in bothAgent.invoke()andSupervisor.invoke(). -
Agent Sentinel™ branding — Added ™ to all user-facing titles across 12 files (HTML docs, README, start.sh banner, run.py log, example README). Inline text unchanged.
-
Supplemental license terms — Agent Sentinel™ and Sixth Sense added as protected technology (Sections 1.3, 1.4) with formal definitions.
-
PubNub error logging — Replaced bare
except Exception: passin all 6 sentinel agent scripts withdebug-level logging. -
Supervisor hardening — Added
replay_exhaustedguard and empty-response guard toSupervisor.invoke()(which overridesAgent.invoke()entirely). -
PR review toolkit rounds — 3 review cycles on recording fix (PR #9), 3 on sentinel presenter (PR #8). All critical/high findings fixed before merge.
-
War Narrator feature (PR #10, branch
feature/war-narrator, NOT merged):- Colonel Eli Vance — named narrator character (intelligence analyst, cerebral, military bearing)
- 16 pre-rendered narration beats via Gemini 3.1 Flash TTS with Director's Notes + inline audio tags
NarrationManagerJS module with ambient duck/unduck, M-key mute, chapter-jump cleanup- New
audiocue type in MovieEngine with linter validation and preloading generate.pystandalone TTS generator with retry, duration measurement, CLIscript.jsonstructured narration schema (version 1)- Total narration: ~7 minutes across all chapters
| PR | Title | Branch |
|---|---|---|
| #8 | feat: Agent Sentinel Presenter — cinematic demo playback + movie mode | feature/sentinel-presenter |
| #9 | fix: Supervisor replay guards + PubNub error logging | fix/review-findings |
| PR | Title | Branch | Status |
|---|---|---|---|
| #10 | feat: War Narrator — Colonel Vance TTS narration | feature/war-narrator | Ready for testing/polish |
main: up to date, all merged work + design spec + license updatefeature/war-narrator: 7 commits ahead, PR #10 open, Tasks 1-6 complete, Tasks 7-8 pendingfeature/nats-transport: parked, synced to origin
- War narrator total duration is ~7min vs movie's ~2:44 — chapter timings will need extending to match narration length (Task 7)
- Gemini TTS key: new key
AIzaSyAc...under "Bedsheet test and demo" project with billing. Old key hit free-tier quota. - Worktree cleanup:
.worktrees/sentinel-presenter/can be removed (branch merged). Has arecordings.bakdirectory from the recording session. - Task 7 remaining: full movie playback test, timing adjustments, presenter guide update
- Task 8 remaining: final push + review toolkit before merge
Movie mode — a third peer playback mode added to docs/sentinel-presenter.html, alongside live (PubNub) and replay (recordings). Movie mode runs a ~2:44 fully-scripted, synthetic demo end-to-end with no agents, no PubNub, and no recording dependency. Activated via ./start.sh --movie or ?mode=movie.
8 scripted chapters covering the full Agent Sentinel story:
- Ch 0: Intro + Bedsheet pitch + two-plane architecture diagram (~44s)
- Ch 1: Network Startup (7 agents come online)
- Ch 2: Normal Operations (routine tool calls through the gateway)
- Ch 3: Malicious Install Blocked (supply-chain-sentinel hash mismatch)
- Ch 4: Rogue Burst (5 tool calls in 2s)
- Ch 5: Gateway Block (rate-limit response)
- Ch 6: Sentinel Alert (behavior-sentinel queries gateway ledger)
- Ch 7: Quarantine Issued (commander issues quarantine)
- Ch 8: Stable State Restored (6 agents on mission)
Architecture:
MovieEngineclass with timer registry — cancelAll/restart/jumpToChapter/setSpeed all tear down pending timers atomically- 10-cue schema:
chapter-card,spotlight,signal,commentary,line,reset, plus 4 Chapter-0 overlay cues - Synthetic signals rendered via
buildEventCard+driveMapEffectForSignal— bypassesdrainMapEvents800ms pacing so Chapter 4's rapid burst renders at burst rate - Boot-time
lintMovieScriptvalidator covers 5 schema rules - Chapter 0 introduces Bedsheet's "Sixth Sense" pitch (DARPA-white-paper register) and an inline SVG of the two-plane architecture (operational + control planes separated by the Sixth Sense bus)
Pitch copy emphasizes Bedsheet's defining invention: the first real-time, high-availability, general-purpose communication bus in an agentic framework (transport-agnostic: PubNub, NATS). Explicit contrast with A2A ("not HPC"). Pitch and arch diagram are locked verbatim in §3.4 and §3.5 of the design spec.
docs/sentinel-presenter.html— +1100 lines (MovieEngine, cue dispatcher, 9 chapters, pitch copy, arch diagram SVG, CSS)examples/agent-sentinel/start.sh—--movieflagdocs/sentinel-presenter-guide.html— Movie Mode sectiondocs/superpowers/specs/2026-04-14-sentinel-presenter-movie-mode-design.md— full designdocs/superpowers/plans/2026-04-14-sentinel-presenter-movie-mode.md— implementation plan
Design doc went through 3 brainstorming + spec review iterations. Plan went through 3 review iterations. Issues caught during reviews that would have derailed implementation:
handleSignalwas assumed to exist — actuallyhandleMessage(event)with PubNub wrapper; fixed by callingbuildEventCarddirectly.- Cue schema grew from 6 → 10 types to keep pitch/arch overlays flowing through MovieEngine timer registry (not leak through state transitions).
spotlightcue must callshowFocusOverlay— without it, event cards append to an invisible (opacity-0) container.driveMapEffectForSignalmust resolve colour fromROLE_COLORS— passingnullas colour sets literalstroke="null".MovieEngine.setSpeedmust re-schedule the chapter-advance timer aftercancelAll— otherwise the movie stalls at the end of the current chapter after a speed change.- PITCH_LINES is 1270 chars (not the 950 initially assumed); pitch takes 37.5s; Chapter 0 timing re-paced to 44s.
- Branch:
feature/sentinel-presenter - Worktree:
.worktrees/sentinel-presenter/ - Implementation commits: Phases 1–6 (one commit per task, ~18 commits total)
- Pushed to
origin/feature/sentinel-presenter
-
Sentinel Presenter — new cinematic demo presentation page (
docs/sentinel-presenter.html)- Reuses dashboard visual DNA (CSS vars, world map, SVG agent nodes, signal animations)
- Scene-based playback: collects events from PubNub, groups by agent, plays one at a time
- Smooth cinematic map zoom to active agent with overlay focus panel
- Director mode: press 1-9 to jump to any agent's scene
- Military-style typed intro crawl overlaid on the map ("// CLASSIFIED — SENTINEL NETWORK")
- Per-agent asset briefings (typed dossier: role, mission, protects against, future ops)
- Scene commentary (typed "INTELLIGENCE BRIEFING" explaining each chapter)
- Chapter detection: Network Startup, Skill Acquisition, Malicious Install Blocked, Rogue Burst, Gateway Block, Sentinel Alert, Quarantine Issued
- Cinematic mode (C key): hides all chrome, movie-style chapter cards
- Ambient audio support (drop
docs/ambient.mp3) - Keyboard: Space, arrows, 1-9 director, Shift+1-5 speed, F fullscreen, C cinematic, T timestamp
- Collision-avoiding overlay positioning (briefing + activity panels avoid each other and agent nodes)
- Signal lines animate during scene playback (gateway tools + inter-agent requests)
-
start.sh updated —
--presentflag (implies --replay),--cinematicflag -
Comprehensive guide —
docs/sentinel-presenter-guide.htmlwith features, JS architecture walkthroughs, recording workflow -
Worktree setup —
.worktrees/added to gitignore, worktree at.worktrees/sentinel-presenter/
- Branch:
feature/sentinel-presenter(LOCAL ONLY — not pushed) - Worktree:
.worktrees/sentinel-presenter/ - 16 commits ahead of main
- Files:
docs/sentinel-presenter.html(2422 lines),docs/sentinel-presenter-guide.html(1093 lines),start.sh(+35 lines) - Untracked:
docs/ambient.mp3(user-provided audio) - Modified (unstaged):
examples/agent-sentinel/data/calendar.json(from test run)
- Recordings are short (49 total events) — need re-recording with billable Gemini key (60-90s window)
- sentinel-commander recording triggers "empty response" error — recording quality, not presenter bug
- Overlay positioning still needs live testing after latest collision avoidance changes
- TTS integration deferred — Web Speech API (zero setup) or Edge TTS (better quality) for voiceover
- Mode toggle in dashboard — test adding presenter as a mode within existing dashboard (future)
- Interactive walkthrough mode — audience C (deferred)
-
PR #4 merged (
feature/sixth-sense→main, squash merge)- 60 original commits + 9 fix commits from review
- Two rounds of automated code review (7 agent passes)
- All critical/important findings fixed before merge
- CI green on all 4 checks
-
9 bug fixes applied (B1 Gemini double-call, B2 asyncio task GC, B3 gateway audit lies, H1 empty response loop, M1 ledger integration test, Defer #1-5 hardening)
-
Transport factory (
make_sense_transport()) — decoupled Action Gateway from PubNubTransport, enabled future NATS support -
PR #6 merged —
docs/pr-4-fixes-explained.mdwalkthrough -
PR #7 merged — README doc links + copyright footers + review fix (dashboard HUD overlay + security-arch dark-theme colors)
-
Wiki bootstrapped — Home + PR-4-Fixes-Explained
-
README fully updated — new tagline, features, architecture tree, comparison table, installation, roadmap, FAQ
-
Copyright footers — all 14 HTML docs (footer) + all 11 sidebar docs (sidebar)
-
Branch cleanup — all 6 merged branches verified and deleted (local + remote)
-
Issue #5 filed — v0.5.x cleanup tracker for deferred review items
- 372 passed, 1 pre-existing failure (
test_memory_exports— missingredismodule locally, passes in CI) - CI: test (3.11) ✅, test (3.12) ✅, lint ✅, typecheck ✅
-
Framework-level
print_event()for verbose agent stdout- Added
print_event(agent_name, event, verbose=None)tobedsheet/events.py - Prints LLM events to stdout with
[agent-name]prefixes (Docker Compose-style) - Gated by
BEDSHEET_VERBOSEenv var or per-callverbose=True/Falseoverride - Handles all event types: ThinkingEvent, ToolCallEvent, ToolResultEvent, CompletionEvent, ErrorEvent, DelegationEvent, CollaboratorStartEvent, CollaboratorCompleteEvent
- Exported from
bedsheet/__init__for convenience
- Added
-
All 4 LLM sentinel agents wired up
- scheduler, web_researcher, skill_acquirer, sentinel_commander
- Each calls
print_event(agent.name, event)in their invoke loop - Existing output preserved (commander arrows, threat assessments, rogue mode prints)
-
start.sh updated
- Sets
BEDSHEET_VERBOSE=1by default - Added
--quietflag to disable verbose output - No regression to existing
--record,--replay,--no-dashflags
- Sets
-
Session resurrection from 6 handoff documents
- Read all SESSION_HANDOFF files, MEMORY.md, design docs, plans
- Reconstructed full project context across feature/sixth-sense branch
| File | Changes |
|---|---|
bedsheet/events.py |
Added print_event(), _truncate() |
bedsheet/__init__.py |
Export print_event |
examples/agent-sentinel/agents/scheduler.py |
Import + call print_event |
examples/agent-sentinel/agents/web_researcher.py |
Import + call print_event |
examples/agent-sentinel/agents/skill_acquirer.py |
Import + call print_event |
examples/agent-sentinel/agents/sentinel_commander.py |
Import + call print_event |
examples/agent-sentinel/start.sh |
--quiet flag, BEDSHEET_VERBOSE=1 default |
- 326 passed, 1 pre-existing failure (
test_memory_exports— missingredismodule)
-
Replaced ALL Mock Data with REAL APIs
- All 3 demo locations now use real data (no mocks, no simulations)
- Yahoo Finance (yfinance): Stock prices, PE ratios, market caps, 52-week ranges
- DuckDuckGo (ddgs): Real news articles with sources and dates
- Calculated metrics: RSI-14, MACD, SMA-20/50, beta vs SPY, volatility, max drawdown, Sharpe ratio
-
GitHub Release v0.4.7 "Hermes" Published
- Codename: Hermes (swift messenger god = deploy anywhere)
- URL: https://github.com/sivang/bedsheet/releases/tag/v0.4.7
- Comprehensive release notes with real data capabilities
-
Dependencies Added
yfinance>=0.2.40- Yahoo Finance (no API key required)ddgs>=6.0.0- DuckDuckGo search (no API key required)- Available via:
pip install bedsheet[demo]
-
Verified with Real Data
- NVDA: $184.61, RSI 47.09, Beta 1.84, 5 real news articles
- AAPL: $249.03, RSI 13.77 (oversold), Beta 1.26
- All 265 tests passing
| File | Changes |
|---|---|
bedsheet/__main__.py |
Real data tools for uvx bedsheet demo |
examples/investment-advisor/agents.py |
Real yfinance/ddgs tools |
examples/investment-advisor/deploy/gcp/agent/agent.py |
Real tools for GCP deployment |
examples/investment-advisor/deploy/gcp/pyproject.toml |
Added yfinance, ddgs deps |
examples/investment-advisor/pyproject.toml |
Added yfinance, ddgs deps |
pyproject.toml |
Added [demo] optional dependency group |
-
Released v0.4.3 through v0.4.7 - Six releases in one session!
Version Feature v0.4.3 Dynamic CLI version (importlib.metadata) v0.4.4 Credential preflight check (warns if GOOGLE_APPLICATION_CREDENTIALS set) v0.4.5 Project consistency check (terraform.tfvars vs gcloud config) v0.4.6 make uicommand (one-command Dev UI access)v0.4.7 Improved first-time UX (check cloud-run-proxy component) -
Complete E2E Testing Validated
- Fresh agent with
uvx bedsheet@latest init test-agent - Generated with
bedsheet generate --target gcp - Deployed to Cloud Run (since deleted)
- Dev UI accessible via
make uiathttp://localhost:8080/dev-ui/
- Fresh agent with
-
Documentation Mega Update
- Updated
docs/gcp-deployment-deep-dive.mdwith:- New DX Safeguards section
- Testing Deployed Agents section
- Release History section
- Executive Summary for stakeholders
- Updated PROJECT_STATUS.md with current session
- Updated
"Every time something happens and you are setting something manually instead of analyzing and providing a solution...you are leaving a bug to explore in the user's hands."
This session embodied this principle by converting manual workarounds into automated safeguards.
| File | Changes |
|---|---|
bedsheet/deploy/templates/gcp/Makefile.j2 |
Credential check, project check, make ui |
bedsheet/deploy/templates/gcp/DEPLOYMENT_GUIDE.md.j2 |
Warning documentation |
bedsheet/cli/main.py |
Dynamic version via importlib.metadata |
docs/gcp-deployment-deep-dive.md |
Comprehensive update |
PROJECT_STATUS.md |
Session summary |
pyproject.toml |
Version 0.4.7 |
CHANGELOG.md |
v0.4.3-v0.4.7 entries |
-
Fixed GCP E2E Testing - ROOT CAUSE FOUND!
GOOGLE_APPLICATION_CREDENTIALSenv var pointed to wrong project's service account- Python SDK prioritizes this env var over ADC
- Fix:
unset GOOGLE_APPLICATION_CREDENTIALS
-
Investment Advisor Deployed to Cloud Run
- Deployed to Cloud Run (since deleted)
- Model:
gemini-3-flash-previewvia global Vertex AI endpoint - Multi-agent system working: MarketAnalyst, NewsResearcher, RiskAnalyst
- All tools functional
-
ADK Dev UI Enabled
- Changed Dockerfile template from
api_servertowebmode - Dev UI accessible at
/dev-ui/on both local and Cloud Run
- Changed Dockerfile template from
-
Comprehensive Documentation Created
docs/gcp-deployment-deep-dive.mdand.html- Architecture diagrams, troubleshooting guides, credential flow explanations
- Sanitized sensitive info from docs and git history
SDK Credential Priority:
GOOGLE_APPLICATION_CREDENTIALSenv var (highest priority)- Application Default Credentials (ADC)
- Compute Engine / Cloud Run service account
If GOOGLE_APPLICATION_CREDENTIALS points to project A's SA, but you're accessing project B, you get 403 even with correct IAM roles.
| File | Changes |
|---|---|
bedsheet/deploy/templates/gcp/Dockerfile.j2 |
Use web mode for Dev UI |
docs/gcp-deployment-deep-dive.md |
New comprehensive docs |
docs/gcp-deployment-deep-dive.html |
Styled HTML version |
-
GCP Templates Updated for ADC (Application Default Credentials)
- Removed API key requirement - now uses user's GCP credentials
make initauto-triggers browser auth if needed- ADC quota project set explicitly to target project
- GOOGLE_CLOUD_PROJECT passed to all Terraform commands
-
Removed google_project_service from Terraform
- APIs now enabled via gcloud CLI in Makefile (avoids ADC permission issues)
- Terraform focuses on resource creation only
- Fixed dependency issues between resources
-
Added IAM API to enabled services
- Prevents permission errors during service account creation
-
Published Multiple Release Candidates
- v0.4.2rc3: Initial ADC changes
- v0.4.2rc4: Quota project + ADC validation
- v0.4.2rc5: Removed google_project_service from Terraform
-
GCP E2E Test Progress
- search-assistant agent configured with Google Search grounding
- Authentication working (gcloud auth + ADC)
- APIs enabled successfully
- Terraform still encountering permission issues (ADC vs gcloud auth difference)
ADC vs gcloud CLI auth difference:
gcloud auth login→ full account permissions for gcloud CLIgcloud auth application-default login→ limited OAuth scopes for SDKs- Terraform google provider uses ADC, which has fewer permissions than gcloud CLI
- Solution: Enable APIs with gcloud, create resources with Terraform
| File | Changes |
|---|---|
bedsheet/deploy/templates/gcp/Makefile.j2 |
ADC validation, quota project, GOOGLE_CLOUD_PROJECT |
bedsheet/deploy/templates/gcp/main.tf.j2 |
Removed google_project_service resources |
pyproject.toml |
Version bumped to 0.4.2rc5 |
- Complete GCP E2E Test - Terraform apply still failing, may need service account key
- Commit template changes - All changes ready for commit
- Release v0.4.2 - After E2E validation
Option A: Use service account key for Terraform (traditional approach) Option B: Further investigate ADC permission scopes Option C: Test with user who has simpler GCP setup
-
AgentCore Deployment Target - COMPLETE! (EXPERIMENTAL)
- Added new
agentcoretarget for Amazon Bedrock AgentCore ⚠️ Experimental: AgentCore is in preview, APIs may change- Full stack: Runtime + Gateway + Lambda for tools
- Terraform-based infrastructure
- 16 template files created
- 26 unit tests added (all passing)
- Added new
-
Strands vs Bedsheet Research
- Comprehensive feature comparison analysis
- Strands has more features: Swarms, Graphs, Workflows, multi-provider LLM
- Bedsheet has unique strengths: multi-cloud deployment, CLI, structured outputs
- Decision: Keep Bedsheet simple, document patterns
-
Multi-Agent Patterns Documentation
- Created
docs/multi-agent-patterns.md - Shows how to implement Swarms, Graphs, Workflows, A2A with current constructs
- No new features needed - just creative use of Supervisor + @action + asyncio
- Created
-
Roadmap Update
- Added "Advanced Orchestration (v0.9+)" section
- ReWOO, Reflexion, Autonomous Loops planned for future
- Multi-agent patterns documented as achievable today
| File | Description |
|---|---|
bedsheet/deploy/targets/agentcore.py |
AgentCore target implementation |
bedsheet/deploy/templates/agentcore/ |
16 Jinja2 templates |
tests/test_deploy_targets_agentcore.py |
26 unit tests |
docs/multi-agent-patterns.md |
Pattern implementation guide |
docs/strands-advanced-patterns.md |
Detailed pattern explanations |
| File | Changes |
|---|---|
bedsheet/deploy/config.py |
Added AgentCoreTargetConfig |
bedsheet/deploy/__init__.py |
Exported new config class |
bedsheet/deploy/targets/__init__.py |
Exported AgentCoreTarget |
bedsheet/cli/main.py |
Added agentcore to TARGETS |
PROJECT_STATUS.md |
Updated roadmap with orchestration styles |
feature/agentcore-target - Ready for merge
-
Package Rename: bedsheet-agents → bedsheet
- Simplified PyPI package name for cleaner
uvx bedsheetexperience - Updated pyproject.toml package name
- Simplified PyPI package name for cleaner
-
CLI Demo Command
- Added
bedsheet democommand to CLI - Fixes "Missing command" error when running
uvx bedsheet - Demo runs the multi-agent investment advisor
- Added
-
Documentation Updates (pip → uv)
- Updated all documentation to use modern
uv/uvxtooling - Files updated: README.md, CONTRIBUTING.md, CLAUDE.md, PROJECT_STATUS.md
- Files updated: docs/user-guide.md, docs/user-guide.html, bedsheet/cli/README.md
- Updated all documentation to use modern
-
Multi-Agent Guide HTML
- Created HTML version of multi-agent-guide.md
- Matches styling of other documentation files
-
README Image Optimization
- Updated image from logo.png to Pythonic.jpg
- Optimized file size: 3.9MB → 652KB (JPEG 85% quality)
- No visible quality loss
-
Git History Cleanup
- Removed all Claude attributions from commit messages
- Used git-filter-repo for safe history rewrite
- Verified integrity via tree hash comparison before force push
-
Project Conventions
- Created
.claude/rules/dont.mdwith lessons learned - Documents: image backup before edits, GitHub Pages links, no Claude attributions
- Created
-
Roadmap Update
- Enhanced v0.6: Added "classification models for high-speed validation"
- Added v0.8: WASM/Spin support (browser agents, edge deployment, Fermyon Cloud)
52aad0d Reduce size even more
c7344f7 chore: optimize README image size (3.9MB → 652KB)
82a4a6d chore: crop README image to 16:9 widescreen
9cf81f2 feat(cli): add demo command and update README image
e308f35 docs: update remaining files to use uv tooling
09d2caf docs: update README to use uvx/uv instead of pip
798b5ff docs: add HTML version of multi-agent guide
7c885a5 chore: rename package from bedsheet-agents to bedsheet
dd7fb6d Fix link format for LICENSE in README.md
059f2bd Update LICENSE link to LICENSE.md
Python Code:
bedsheet/cli/main.py- Added demo command, version bump to 0.4.0pyproject.toml- Package rename
Documentation:
README.md- uv tooling, new image, roadmap updateCONTRIBUTING.md- uv toolingCLAUDE.md- uv toolingdocs/user-guide.md- uv toolingdocs/user-guide.html- uv toolingdocs/multi-agent-guide.html- New filebedsheet/cli/README.md- uv tooling
Assets:
Pythonic.jpg- New optimized README image (652KB)
Local Config:
.claude/rules/dont.md- Project conventions (not committed)
-
Published v0.4.0 to PyPI - GA Release!
- Bumped version from 0.4.0rc4 to 0.4.0
- Fixed build: Added node_modules exclusion to pyproject.toml
- Removed all
--prereleaseflags from docs and templates - https://pypi.org/project/bedsheet/
-
License Cleanup
- Updated all Apache 2.0 references to Elastic License 2.0
- Files updated: PROJECT_STATUS.md, README.md, CONTRIBUTING.md, all HTML docs
-
PR #1 Merged
- CI/CD fixes merged to main
- All GitHub Actions passing (test 3.11, 3.12, lint, typecheck)
-
uvx Support
- Package now installable via
uvx bedsheet --help - No more
--prereleaseneeded
- Package now installable via
uv pip install bedsheet
uv add bedsheet
uvx bedsheet --helppyproject.toml- Version bump + build exclusionsdocs/deployment-guide.html- Removed --prereleasebedsheet/deploy/templates/gcp/Dockerfile.j2- Removed --prereleasebedsheet/deploy/templates/local/Dockerfile.j2- Removed --prerelease
- GCP Cloud Run E2E Test - Still pending
- Knowledge bases and RAG - v0.5 roadmap item
- Guardrails and safety - v0.6 roadmap item
-
AWS Terraform Thinking Events - COMPLETE!
- Solved thinking/rationale extraction for AWS Bedrock Debug UI
- Option A prompt injection extracts XML
<thinking>tags from model responses - Backported to Bedsheet templates:
aws-terraform/debug-ui/server.py.j2 - Fixed duplicate thinking events in UI (deduplication logic)
- Fixed
<answer>tag content appearing in thinking panel
-
AWS Terraform Target - COMPLETE!
- Added
aws-terraformtarget to CLI (bedsheet/deploy/targets/aws_terraform.py) - Updated CLI main.py to support aws-terraform in generate and deploy commands
- Updated config.py to include aws-terraform in TargetType enum
- Full E2E tested with wisdom-council multi-agent deployment
- Added
-
Documentation Review - COMPLETE!
- Added comprehensive v0.4.0 entry to CHANGELOG.md
- Reviewed README.md roadmap (already accurate)
- Updated PROJECT_STATUS.md with session summaries
- All documentation now reflects v0.4 features
Python Code:
bedsheet/cli/main.py- Added aws-terraform targetbedsheet/deploy/config.py- Added aws-terraform to TargetType enum
Templates:
bedsheet/deploy/templates/aws-terraform/debug-ui/server.py.j2- Thinking extraction
Documentation:
CHANGELOG.md- Added v0.4.0 sectionPROJECT_STATUS.md- Updated session history
- GCP Cloud Run E2E Test - Still pending
- Final testing across all targets
- Release v0.4.0 to PyPI
-
AWS Terraform @action Translation Fix - COMPLETE!
- Fixed critical bug: delegate action was being translated to Lambda/OpenAPI for supervisors
- AWS Bedrock has NATIVE collaboration via
aws_bedrockagent_agent_collaborator - Delegate @action should only exist for LOCAL execution, not AWS deployment
-
Delegate Action Filtering
- Modified
aws.pyandaws_terraform.pyto filter delegate BEFORE creating template context - For supervisors with collaborators:
if is_supervisor and collaborators: filter delegate - Single agents and supervisors without collaborators: no filtering applied
- Result: NO Lambda handler, NO OpenAPI endpoint for delegate action
- Modified
-
Resource Identification with bedsheet- Prefix
- Added
bedsheet-prefix to infrastructure resources for easy identification - IAM roles:
bedsheet-${local.name_prefix}-agent-role - IAM policies:
bedsheet-${local.name_prefix}-agent-permissions - Lambda functions (when generated):
bedsheet-${local.name_prefix}-actions - User-facing resources (Bedrock agents, aliases) kept clean without prefix
- Added
-
Resource Tagging
- Fixed incorrect
agent_resource_tagsattribute →tags(correct Terraform syntax) - Added comprehensive tags to ALL resources:
ManagedBy = "Bedsheet"BedsheetVersion = "0.4.0"Project = var.project_nameEnvironment = local.workspaceAgentType = "Supervisor|Collaborator|SingleAgent"(for Bedrock agents)
- Tags support governance, cost allocation, and resource filtering
- Fixed incorrect
-
Verified with wisdom-council
- Generated deployment artifacts with all fixes applied
- Confirmed NO Lambda files generated (delegate was only tool, filtered out)
- Confirmed NO
/delegateendpoint in openapi.yaml (only/health) - Confirmed
bedsheet-prefix on IAM resources - Confirmed correct
tagsattribute in all resources
Problem: User explicitly requested in previous session: "translate the @action decorator of bedsheet to the implementation in AWS, just as it does for GCP"
What was happening:
- Supervisor auto-registers
delegateaction in__init__() - Introspection extracts ALL tools including delegate
- AWS templates blindly generated Lambda + OpenAPI for ALL tools
- Result: Redundant delegate Lambda that conflicts with Bedrock's native collaboration
Solution:
- Filter delegate at generation time for multi-agent scenarios
- Bedrock handles delegation via
aws_bedrockagent_agent_collaboratorresources - GCP translates @actions to ADK tool stubs (platform idiom)
- AWS now translates by filtering delegate for supervisors (platform idiom)
Python Code (filtering logic):
bedsheet/deploy/targets/aws.py:40-51- Filter delegate before context creationbedsheet/deploy/targets/aws_terraform.py:40-48- Filter delegate before context creation
Terraform Template (naming, tagging):
bedsheet/deploy/templates/aws-terraform/main.tf.j2:40, 71- bedsheet- prefix for IAMbedsheet/deploy/templates/aws-terraform/main.tf.j2:195-201, 226-232, 296-302- Fixed tags attribute
Generated wisdom-council/deploy/aws-terraform/:
- ✅ 11 files generated (NO lambda directory)
- ✅
openapi.yamlcontains only/healthendpoint - ✅
main.tfhas NO Lambda resource definitions - ✅ IAM resources named
bedsheet-wisdom_council-dev-agent-role - ✅ All resources properly tagged with ManagedBy=Bedsheet
-
Deploy with Terraform (blocked by aws-vault credentials issue)
- Need to restart session for aws-vault to work properly
- Run:
cd deploy/aws-terraform && aws-vault exec personal -- terraform plan - Then:
aws-vault exec personal -- terraform apply
-
Test with Debug UI
- Start debug UI:
aws-vault exec personal -- python debug-ui/server.py - Verify multi-agent collaboration works via Bedrock native delegation
- Check traces show collaborator invocations (NOT Lambda delegate calls)
- Start debug UI:
-
Add to examples/ (if successful)
- Copy wisdom-council to BedsheetAgents/examples/
- Document as canonical multi-agent AWS deployment example
- ✅ AWS @action translation now matches user's original intent
- ✅ Resource naming conventions established (bedsheet- prefix)
- ✅ Resource tagging strategy implemented
- ✅ Multi-agent translation correctly handles platform idioms
-
AWS Bedrock Debug UI - COMPLETE!
- Built comprehensive debug UI for AWS Bedrock agents
- FastAPI server that proxies to Bedrock Agent Runtime API
- SSE streaming for real-time event updates
- Multi-agent collaboration tracing (collaborator_start/complete)
- Tested with Judge/Sage/Oracle multi-agent system
-
Debug UI Features
- Collapsible event items with badge icons and summaries
- Thinking/rationale trace visualization
- Tool call and result tracking
- Environment variable configuration for agent ID/alias
- Filter out redundant long thinking events (final synthesis)
-
Template Updates
- Lambda handler simplified to use standard library only (removed aws_lambda_powertools dependency)
- CDK stack improvements for multi-agent deployments
- Debug UI template added to AWS target
- Events panel now starts collapsed for cleaner UX
-
AWS E2E Test Complete
- Successfully deployed agent to Bedrock via CDK
- Invoked agent through debug UI
- Verified multi-agent orchestration tracing works
bedsheet/deploy/templates/aws/debug-ui/server.py.j2- Debug UI serverbedsheet/deploy/templates/aws/lambda/handler.py.j2- Simplified handlerbedsheet/deploy/templates/aws/stacks/agent_stack.py.j2- CDK improvements
- Add Bedsheet @action compilation to Lambda (pending)
- Update roadmap: AWS Debug UI now DONE (was deferred)
-
GCP ADK Dev UI Integration - WORKING!
- Fixed ADK agent discovery:
adk web .(notadk web agent) - Added
root_agentexport to__init__.py.j2template - ADK requires agent directory structure with
root_agentvariable
- Fixed ADK agent discovery:
-
Gemini Model Compatibility Testing
- Tested multiple models for free-tier API key support
gemini-2.0-flash- quota errors (regional restrictions)gemini-1.5-flash- also didn't workgemini-2.5-flash- WORKS with free tier!gemini-3-pro-preview- requires billing- Updated default model in
config.pytogemini-2.5-flash
-
Improved Developer Experience
- Added QUICK START guide to
.env.exampletemplate - Clear instructions: get API key → copy .env → run
make dev-ui-local - Updated CLI to show GCP-specific next steps after
bedsheet generate
- Added QUICK START guide to
-
Template Fixes
Makefile.j2: Fixeddev-ui-localtarget to useadk web .__init__.py.j2: Exportroot_agentfor ADK discoveryenv.example.j2: Added step-by-step QUICK START comments
bedsheet/deploy/config.py- Default model →gemini-2.5-flashbedsheet/deploy/templates/gcp/Makefile.j2-adk web .fixbedsheet/deploy/templates/gcp/__init__.py.j2-root_agentexportbedsheet/deploy/templates/gcp/env.example.j2- QUICK START guidebedsheet/cli/main.py- GCP next steps in CLI output
-
Published v0.4.0rc4 to PyPI - End-to-end tested and working!
bedsheet init myagentscaffolds complete projectbedsheet generate --target localcreates Docker deploymentmake build && make rundeploys working agent with real LLM calls
-
Fixed Local Deploy Template Issues
- Dockerfile: Use
uv pip install -r pyproject.toml(not editable install) - docker-compose: Build context from project root, proper volume mounts
- app.py: Correct
agent.invoke(session_id, message)signature - app.py: Use
CompletionEvent.response(not.text)
- Dockerfile: Use
-
Removed build-system from Scaffolded Projects
- Agent projects aren't installable packages, just dependency declarations
- Fixes
uv sync/uv runerrors with hatchling
-
Release Candidates Published
- rc1: Initial CLI deps fix
- rc2: CLI deps in main dependencies
- rc3: Wired up agent invocation
- rc4: Fixed session_id, event attributes, volume mounts
-
v0.4 "Build Once, Deploy Anywhere" - Full implementation on development branch
- CLI:
bedsheet init,bedsheet generate,bedsheet validate,bedsheet deploy - 3 deployment targets: Local (Docker), GCP (Terraform), AWS (CDK)
- Multi-environment support: dev → staging → prod
- GitHub Actions CI/CD for both GCP and AWS
- CLI:
-
GCP Target Generator
- ADK-compatible
agent.pygeneration - Terraform IaC (Cloud Run, IAM, Secret Manager)
- GitHub Actions with Terraform workspaces
- cloudbuild.yaml for Cloud Build
- ADK-compatible
-
AWS Target Generator
- AWS CDK stack (Bedrock Agent, Lambda, IAM)
- Lambda handlers with AWS Powertools
- OpenAPI schema generation from @action decorators
- GitHub Actions with CDK contexts
-
Local Target Generator
- Docker Compose + FastAPI wrapper
- Hot reload support
- Redis for session persistence
-
Agent Introspection API
extract_agent_metadata()for deployment compilation- Tool schema extraction from @action decorators
bedsheet/
├── cli/
│ ├── __init__.py
│ └── main.py # Typer CLI
├── deploy/
│ ├── config.py # bedsheet.yaml Pydantic schema
│ ├── introspect.py # Agent metadata extraction
│ └── targets/
│ ├── base.py # DeploymentTarget protocol
│ ├── local.py # Docker/FastAPI
│ ├── gcp.py # ADK/Terraform
│ └── aws.py # CDK/Bedrock
│ └── templates/
│ ├── local/ # 6 Jinja2 templates
│ ├── gcp/ # 13 Jinja2 templates
│ └── aws/ # 12 Jinja2 templates
| Feature | Status | Notes |
|---|---|---|
| Structured Outputs | ✅ Done | OutputSchema from Pydantic or dict |
| Anthropic Beta Integration | ✅ Done | structured-outputs-2025-11-13 |
| LLMResponse.parsed_output | ✅ Done | Validated structured data |
| MockLLMClient support | ✅ Done | Testing with output schemas |
| Optional Redis Import | ✅ Done | Works without redis installed |
| Feature | Status | Notes |
|---|---|---|
| Supervisor Agent | ✅ Done | Extends Agent, manages collaborators |
| Supervisor Mode | ✅ Done | Orchestration with synthesis |
| Router Mode | ✅ Done | Direct handoff, no synthesis |
| Parallel Delegation | ✅ Done | Delegate to multiple agents at once |
| Multi-Agent Events | ✅ Done | RoutingEvent, DelegationEvent, etc. |
| Feature | Status | Notes |
|---|---|---|
| Single Agent with ReAct loop | ✅ Done | Agent class with tool calling |
| ActionGroup + @action decorator | ✅ Done | Auto schema inference |
| Streaming Events | ✅ Done | 11 event types |
| Parallel Tool Execution | ✅ Done | asyncio.gather |
| Pluggable Memory | ✅ Done | InMemory, RedisMemory |
| AnthropicClient | ✅ Done | Claude integration |
Latest: v0.4.7 on PyPI
| Feature | Status | Notes |
|---|---|---|
CLI (bedsheet command) |
✅ Done | init, generate, validate, deploy |
| bedsheet.yaml config schema | ✅ Done | Pydantic validation |
| Agent introspection API | ✅ Done | Extract metadata from agents |
| Local target (Docker) | ✅ Done | FastAPI + Docker Compose |
| GCP target (Terraform) | ✅ Done | ADK + Cloud Run + Terraform |
| AWS target (CDK) | ✅ Done | Bedrock + Lambda + CDK |
| GitHub Actions CI/CD | ✅ Done | Multi-environment workflows |
| Multi-env (dev/staging/prod) | ✅ Done | Terraform workspaces / CDK contexts |
| Streaming SSE endpoint | ✅ Done | /invoke/stream exposes Bedsheet's event stream |
| Debug UI (React SPA) | ✅ Done | Chat + live event stream + expand/collapse |
| Debug UI: Local target | ✅ Done | Included by default, env flag to disable |
| Debug UI: GCP Cloud Run | ✅ Done | ADK Dev UI via make ui |
| Debug UI: AWS Bedrock | ✅ Done | FastAPI proxy to Bedrock Agent Runtime with tracing |
| GCP Cloud Run E2E Test | ✅ Done | test-agent deployed, Dev UI verified via make ui |
| AWS Bedrock E2E Test | ✅ Done | Deployed Judge/Sage/Oracle, verified via Debug UI |
| Credential preflight check | ✅ Done | v0.4.4 - warns if GOOGLE_APPLICATION_CREDENTIALS set |
| Project consistency check | ✅ Done | v0.4.5 - validates terraform.tfvars vs gcloud config |
make ui command |
✅ Done | v0.4.6 - one-command access to deployed Dev UI |
| First-time UX improvements | ✅ Done | v0.4.7 - checks for cloud-run-proxy component |
| Real data demo tools | ✅ Done | v0.4.7 - yfinance + ddgs, no mocks |
Tests: 265 passing GitHub Release: v0.4.7 "Hermes"
| Feature | Status | Priority |
|---|---|---|
| Knowledge Base Protocol | 🔮 Planned | High |
| RAG Integration | 🔮 Planned | High |
| Vector store abstraction | 🔮 Planned | Medium |
| Feature | Status | Priority |
|---|---|---|
| Classification models for high-speed validation | 🔮 Planned | High |
| Content Filtering | 🔮 Planned | Medium |
| PII Detection | 🔮 Planned | Medium |
| Prompt injection detection | 🔮 Planned | Medium |
Approach: Use lightweight classification models (not full LLMs) for input/output validation. Fast inference for real-time safety checks.
Sentinel Architecture (from Agent Sentinel security analysis):
| Feature | Status | Priority |
|---|---|---|
| Private worker-gateway channels | 🔮 Planned | High |
| Commander receives structured telemetry only (no free-text) | 🔮 Planned | High |
| Sentinel as local sidecar (process supervisor) | 🔮 Planned | High |
| All agent I/O routed through gateway (fs, network, tools) | 🔮 Planned | Medium |
| Cryptographic signal signing | 🔮 Planned | Medium |
| Sentinel process kill switch (stop runaway LLM loops) | 🔮 Planned | Medium |
Layered security model (two orthogonal enforcement planes):
Key insight: Gateway controls what the agent can do (capability plane — tool access, rate limits, keyword blocks). Sentinel controls whether the agent exists (existence plane — process lifecycle, kill switch). These are orthogonal: gateway restricts actions, sentinel restricts the process itself. Together they provide defense-in-depth where neither alone is sufficient.
- Gateway (Layer 1 — Capability Plane): Deterministic enforcement — rate limits, keyword blocks, quarantine. Owns all tool execution. Not an LLM, cannot be prompt-injected. Cutting gateway access = full quarantine of the agent's capabilities.
- Sentinels (Layer 2 — Existence Plane): LLM-based local sidecars — deep pattern recognition on gateway telemetry + local process monitoring (CPU, memory, token burn, runaway loops). One per worker/compute node. OS-level kill switch to terminate the agent process entirely.
- Commander (Layer 3): LLM-based judgment — correlates structured findings from sentinels, decides quarantine for ambiguous cases the gateway's rules didn't catch. Receives only structured data, no free-text signal payloads.
| Feature | Status | Priority |
|---|---|---|
| Agent Engine deployment target | 🔮 Planned | High |
| A2A protocol support | 🔮 Planned | High |
| Managed sessions/memory | 🔮 Planned | Medium |
| ADK wrapper generation | 🔮 Planned | Medium |
Why: Agent Engine provides built-in A2A (Agent-to-Agent) protocol, managed session state, enterprise security (VPC-SC, CMEK), and interop with other enterprise agents (SAP Joule, Microsoft Copilot, etc.). Cloud Run remains the "flexible" option; Agent Engine is the "managed" option.
| Feature | Status | Priority |
|---|---|---|
| Browser-based agents via WASM | 🔮 Planned | High |
| Edge deployment (Cloudflare Workers, Deno Deploy) | 🔮 Planned | High |
| Fermyon Spin deployment target | 🔮 Planned | High |
| Sandboxed tool execution | 🔮 Planned | Medium |
| Plugin system via WASM modules | 🔮 Planned | Low |
Why: WASM enables running agents in browsers, edge runtimes, and with near-instant cold starts. Spin provides serverless WASM deployment to Fermyon Cloud.
| Feature | Description | Priority |
|---|---|---|
| ReWOO | Plan-Execute-Synthesize pattern (fewer LLM calls) | Medium |
| Reflexion | Self-critique and iterative improvement loop | Medium |
| Autonomous Loops | Long-running agents with checkpointing | Medium |
Note: Multi-agent patterns (Swarms, Graphs, Workflows, A2A) are already achievable with current constructs. See docs/multi-agent-patterns.md.
| Feature | Status | Priority |
|---|---|---|
| AMAZON.UserInput equivalent | 🔮 Planned | Medium |
| Code Interpreter | 🔮 Planned | Medium |
| Inline Agents (runtime config) | 🔮 Planned | Low |
| MCP Integration | 🔮 Planned | Low |
Tasks identified but postponed for future consideration:
| Task | Reason | Priority |
|---|---|---|
| ASP Terraform Module Integration | Use Agent Starter Pack's battle-tested Terraform modules as optional terraform_source: "asp" |
Medium |
| Observability Templates | Cloud Trace, Logging dashboards pre-configured | Low |
| Load Testing Integration | Locust templates like ASP | Low |
| Azure Target | Add Azure Bot Framework / Azure OpenAI target | Low |
bedsheet/
├── __init__.py # Exports: Agent, Supervisor, ActionGroup
├── __main__.py # Demo: uvx bedsheet
├── agent.py # Single agent with ReAct loop
├── supervisor.py # Multi-agent coordination
├── action_group.py # @action decorator, tool registration
├── events.py # 11 event types for streaming
├── exceptions.py # Custom exceptions
├── testing.py # MockLLMClient for tests
├── llm/
│ ├── base.py # LLMClient protocol
│ └── anthropic.py # Claude integration
├── memory/
│ ├── base.py # Memory protocol
│ ├── in_memory.py # Dict-based (dev)
│ └── redis.py # Redis-based (prod)
├── cli/ # NEW in v0.4
│ └── main.py # Typer CLI app
└── deploy/ # NEW in v0.4
├── config.py # bedsheet.yaml schema
├── introspect.py # Agent metadata extraction
├── targets/ # Deployment generators
└── templates/ # Jinja2 templates
- AWS CDK over Terraform for AWS - CDK is Pythonic, has native Bedrock L2 constructs, and generates CloudFormation (ejectable)
- Terraform for GCP - GCP has no Python CDK equivalent; Terraform is industry standard
- Reuse, don't reinvent - Designed to integrate with ASP's Terraform modules (deferred)
- User-choice ejectability - Users can choose managed (Bedrock, Agent Engine) or ejectable (containers, serverless)
- Multi-environment via workspaces/contexts - Terraform workspaces for GCP, CDK contexts for AWS
- Structured Outputs via Anthropic Beta - Use constrained decoding for 100% schema compliance
- Pydantic integration - OutputSchema.from_pydantic() for familiar DX
- Supervisor IS-A Agent - Extend Agent rather than separate class hierarchy
- Single delegate tool - Match AWS's AgentCommunication::sendMessage pattern
- Isolated memory - Collaborators don't share supervisor's conversation
- Bedrock-like API - Mirror AWS Bedrock concepts for familiarity
- Streaming-first - Async iterator with events, not batch responses
- Protocol-based extensibility - Memory and LLMClient as protocols
Copyright © 2025-2026 Sivan Grünberg, Vitakka Consulting