Live editing of an agent's config by the AI itself
Workshop stand — open in browser, no local setup needed:
Fill in .env.example — add your OpenAI and Anthropic API keys, then:
After python run.py, copy the network address from the terminal:
Open the network URL in your browser (your IP will differ):
What's on screen, what it does, what to click, and what happens. Opens at the address from the terminal.
3 visible zones + 2 slide-out panels:
| Element | What it is | Click / result |
|---|---|---|
| ⚙︎ Config Tuning | Logo / title | Not clickable |
| Badges anthropic ✓ / openai ✕ | Which providers are available (key in .env). ✓ = key found, ✕ = missing | Not clickable. If both show ✕, optimization and chat will error |
| 📊 | Hidden Usage panel | Click → token usage and cost estimate (JSON). Close with ✕ |
| ⚙ Settings | Settings | Click → settings panel (section 4) |
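
The provider badges can be pictured as a simple check for API keys in the environment. The sketch below is only an illustration: the variable names `ANTHROPIC_API_KEY` and `OPENAI_API_KEY` and both helper functions are assumptions, not names taken from the app.

```python
import os

# Assumed environment variable names; the app's .env may use different ones.
PROVIDER_KEYS = {
    "anthropic": "ANTHROPIC_API_KEY",
    "openai": "OPENAI_API_KEY",
}


def provider_badges(env=None):
    """Return {provider: "✓" or "✕"} depending on whether a non-empty key is set."""
    env = os.environ if env is None else env
    return {
        name: "✓" if env.get(var, "").strip() else "✕"
        for name, var in PROVIDER_KEYS.items()
    }


def any_provider_available(badges):
    """Optimization and chat need at least one provider with a key."""
    return "✓" in badges.values()
```

Calling `provider_badges()` with no argument reads the real environment; passing a plain dict makes the check easy to try without touching .env.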
Main zone: Agent B runs here and you see the full process.
| Element | What it does |
|---|---|
| Optimization goal field | Natural-language task: what Agent A should start doing. Enter = run |
| ▶ Optimize | Starts the optimization loop. Button → ⏳ Running… and disabled until done |
| ↺ Reset | Resets Agent A config to starting state and clears step feed, final diff, chat, and progress |
Example goals:
The step feed shows up to max_steps steps (from settings). Each step is a collapsible card (click the header to expand or collapse). The header shows the step number, a self N chip, and the verdict: continue / ✓ done.
| Block | What it shows |
|---|---|
| Agent B reasoning | Optimizer reasoning: why config is good/bad and what to change |
| Config changes | Diff: System prompt (green/red), temperature (before → after), Few-shot (new examples) |
| Probe questions for Agent A | Questions B invented for your goal, and Agent A's streamed answers |
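
A config diff like the one in the Config changes block could be computed roughly as follows. This is a sketch under assumptions: the config is modelled as a dict with hypothetical keys `system_prompt`, `temperature` and `few_shot`, and the standard-library `difflib` stands in for whatever the app actually uses.

```python
import difflib


def diff_config(before, after):
    """Return readable lines describing what changed between two configs."""
    lines = []

    # System prompt: keep only removed ("- ") and added ("+ ") lines.
    old_prompt = before["system_prompt"].splitlines()
    new_prompt = after["system_prompt"].splitlines()
    for line in difflib.ndiff(old_prompt, new_prompt):
        if line.startswith(("+ ", "- ")):
            lines.append(line)

    # Temperature: shown as before → after, only when it changed.
    if before["temperature"] != after["temperature"]:
        lines.append(f"temperature: {before['temperature']} → {after['temperature']}")

    # Few-shot: list only examples that are new in the updated config.
    for example in after["few_shot"]:
        if example not in before["few_shot"]:
            lines.append(f"few-shot added: {example}")

    return lines
```

Lines starting with `- ` would map to the red side of the prompt diff and lines starting with `+ ` to the green side.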
Loop runs until B says done or hits max_steps. At the bottom — Final config diff: starting vs final config.
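
The stopping rule above can be sketched as a small loop. Everything here is illustrative: `optimize`, `Step` and the `propose` callback are hypothetical names, and `fake_agent_b` is a stub standing in for the real optimizer call.

```python
from dataclasses import dataclass


@dataclass
class Step:
    number: int
    verdict: str  # "continue" or "done"
    config: dict


def optimize(config, propose, max_steps=5):
    """Run the loop until the optimizer says done or max_steps is reached.

    propose(config, step_number) is a stand-in for Agent B and must
    return (new_config, verdict).
    """
    steps = []
    for number in range(1, max_steps + 1):
        config, verdict = propose(config, number)
        steps.append(Step(number, verdict, dict(config)))
        if verdict == "done":
            break
    return config, steps


def fake_agent_b(config, step_number):
    """Stub optimizer: lowers temperature each step and stops at step 3."""
    new_config = {**config, "temperature": round(config["temperature"] - 0.1, 2)}
    verdict = "done" if step_number >= 3 else "continue"
    return new_config, verdict
```

With `max_steps=2` the stub never reaches its done verdict, so the loop ends on a continue step, which matches the "hits max_steps" case.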
Always available. Talk to Agent A on its current config (after optimization — the improved one).
| Element | What it does |
|---|---|
| Line under title | Current config: provider · model · reasoning · t=… · few-shot N |
| Input + → | Enter — send, Shift+Enter — newline. Response streams |
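
For reference, the config line under the title could be assembled like this. The function name and the choice to drop the temperature part when no temperature is set are assumptions, not confirmed behavior.

```python
def config_line(provider, model, reasoning, temperature, few_shot_count):
    """Build a line such as 'provider · model · reasoning · t=… · few-shot N'."""
    parts = [provider, model, reasoning]
    # Assumption: the t=… part is omitted when the model takes no temperature.
    if temperature is not None:
        parts.append(f"t={temperature}")
    parts.append(f"few-shot {few_shot_count}")
    return " · ".join(parts)
```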
Workflow: optimize for "be funny" → done → test in chat whether the agent actually got funnier.
API keys live in .env. The panel is for models and parameters.

| Field | Sets | Notes |
|---|---|---|
| Provider | anthropic / openai | (no key) if key missing |
| Model | Agent A model | List fetched live from provider API |
| Reasoning | Reasoning level | Reasoning models only. GPT-5.x — none/low/medium/high/xhigh; Claude — none/on |
| Temperature | 0–1 | Only when model accepts temperature; hidden when reasoning is on |
| System prompt (starting) | Agent A starting prompt | Initial config that B improves |
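
The rules in this table can be expressed as a small validated config object. This is a minimal sketch, not the app's data model: the class name `AgentAConfig` and its field names are assumptions, and the defaults are borrowed from the example settings later in this document.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AgentAConfig:
    provider: str
    model: str
    reasoning: str = "none"
    temperature: Optional[float] = 0.7
    system_prompt: str = "You are a helpful assistant."

    def __post_init__(self):
        if self.provider not in ("anthropic", "openai"):
            raise ValueError(f"unknown provider: {self.provider}")
        if self.reasoning != "none":
            # Mirrors the UI: the temperature field is hidden when reasoning is on.
            self.temperature = None
        elif self.temperature is not None and not 0.0 <= self.temperature <= 1.0:
            raise ValueError("temperature must be between 0 and 1")
```

Turning reasoning on clears the temperature instead of raising, which mirrors a hidden field rather than a rejected one.
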
| Field | Sets |
|---|---|
| Provider / Model | Who runs the optimizer |
| Reasoning | Agent B reasoning level (if model supports it) |
Agent B system prompt is not editable — it's built in.
| Field | Sets |
|---|---|
| Probe questions per step | How many probe questions B asks A per step (1–6) |
| Max steps | Maximum loop steps |
| Save | Saves settings, closes panel, updates config line in chat |
Summary: calls, tokens, cost estimate, breakdown by model (JSON). Full log in usage.jsonl. Intentionally hidden from main UI.
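
A per-model breakdown like the one in the Usage panel could be derived from usage.jsonl along these lines. The record fields `model`, `input_tokens` and `output_tokens` are assumptions about the log format, and cost estimation is left out because pricing is not described here.

```python
import json
from collections import defaultdict


def summarize_usage(lines):
    """Aggregate calls and token counts per model from JSON Lines records."""
    totals = defaultdict(lambda: {"calls": 0, "input_tokens": 0, "output_tokens": 0})
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        entry = totals[record["model"]]
        entry["calls"] += 1
        entry["input_tokens"] += record.get("input_tokens", 0)
        entry["output_tokens"] += record.get("output_tokens", 0)
    return dict(totals)
```

To try it on a real log, open usage.jsonl and pass the file object to `summarize_usage`, since a file iterates line by line.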
1. Agent A: openai, gpt-4.1-mini, prompt "You are a helpful assistant.", temperature 0.7. Agent B: gpt-5.4-mini, reasoning none. Probe 3, Max steps 5. Save.
2. self 55, continue.
3. ✓ done.
4. "You are a helpful assistant." to entomologist prompt with examples.