This is a Kernel application that implements a CUA (computer use agent) loop using Tzafon's Northstar CUA Fast model with Kernel's Computer Controls API. The model is accessed via Tzafon's Lightcone API platform.
Northstar CUA Fast is a vision language model trained with reinforcement learning for computer use tasks.
Get your API keys:
- Kernel: dashboard.onkernel.com
- Tzafon: tzafon.ai
Deploy the app:
kernel login
cp .env.example .env # Add your TZAFON_API_KEY
kernel deploy main.py --env-file .env
Invoke the app:
kernel invoke python-tzafon-cua cua-task --payload '{"query": "Go to wikipedia.org and search for Alan Turing"}'
Note: Replay recording is only available to Kernel users on paid plans.
Add "record_replay": true to your payload to capture a video of the browser session:
kernel invoke python-tzafon-cua cua-task --payload '{"query": "Navigate to https://example.com", "record_replay": true}'
When enabled, the response includes a replay_url field with a link to the recorded session.
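If you invoke the app from a script, building the payload with `json.dumps` avoids hand-quoting mistakes. The following is a minimal sketch: the helper names `build_invoke_command` and `get_replay_url` are hypothetical, and it assumes the invocation result has already been parsed into a Python dict containing the `replay_url` field described above.

```python
import json
import shlex
from typing import Optional


def build_invoke_command(query: str, record_replay: bool = False) -> list:
    """Return the argv list for `kernel invoke` with a JSON payload (hypothetical helper)."""
    payload = {"query": query}
    if record_replay:
        # Only include the flag when recording is requested.
        payload["record_replay"] = True
    return [
        "kernel", "invoke", "python-tzafon-cua", "cua-task",
        "--payload", json.dumps(payload),
    ]


def get_replay_url(result: dict) -> Optional[str]:
    """Return the replay link if present (assumes the result is already a parsed dict)."""
    return result.get("replay_url")


if __name__ == "__main__":
    argv = build_invoke_command("Navigate to https://example.com", record_replay=True)
    # shlex.join produces a shell-safe command line you can paste into a terminal.
    print(shlex.join(argv))
```

The argv list can also be passed directly to `subprocess.run`, which sidesteps shell quoting entirely.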
Northstar CUA Fast works well with a 1280x800 viewport, which is the default.
| Action | Description |
|---|---|
| click | Left or right mouse click at coordinates |
| double_click | Double-click at coordinates |
| point_and_type | Click at coordinates, then type text (with optional Enter) |
| key | Press a key combo (e.g. Enter, ctrl+a) |
| scroll | Scroll at coordinates |
| drag | Click-and-drag from start to end coordinates |
| done | Signal task completion with a result summary |
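To illustrate how these actions drive the agent loop, here is a rough sketch, not the code in main.py. It assumes each action arrives as a dict with a `type` key and that `done` carries a `result` key; both field names are hypothetical. The `next_action` callable stands in for "take a screenshot and ask the model", and `execute` stands in for the call into Kernel's Computer Controls API.

```python
from typing import Any, Callable, Dict, Optional

# Action names taken from the table above.
SUPPORTED_ACTIONS = {
    "click", "double_click", "point_and_type", "key", "scroll", "drag", "done",
}

Action = Dict[str, Any]


def run_cua_loop(
    next_action: Callable[[], Action],   # hypothetical: screenshot + model call
    execute: Callable[[Action], None],   # hypothetical: performs the action in the browser
    max_steps: int = 50,                 # assumed step budget to avoid looping forever
) -> Optional[str]:
    """Execute model actions until `done` is emitted or the step budget runs out."""
    for _ in range(max_steps):
        action = next_action()
        name = action.get("type")  # "type" is an assumed field name
        if name not in SUPPORTED_ACTIONS:
            raise ValueError(f"unsupported action: {name!r}")
        if name == "done":
            # "result" is an assumed field name for the summary.
            return action.get("result")
        execute(action)
    return None  # budget exhausted without a `done` action


if __name__ == "__main__":
    # Scripted stand-in for the model, for demonstration only.
    script = iter([
        {"type": "click", "x": 100, "y": 200},
        {"type": "done", "result": "Clicked the button"},
    ])
    print(run_cua_loop(lambda: next(script), lambda a: print("executing", a)))
```

Capping the number of steps is a design choice in this sketch; it keeps a confused model from running indefinitely.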