Agents
Agents are autonomous browser-based workers. Each agent gets its own browser session, receives a prompt describing what to do, and runs an agentic loop until the task is complete. Results, conversation logs, and artifacts are available via the API.
Overview
The agent is the most fundamental entity in Aiqaramba. Everything else in the system (projects, journeys, schedules, personas) exists to configure and organize agents. An agent represents a single autonomous execution: one browser session, one prompt, run to completion.
When you create an agent, it enters a state machine with a well-defined lifecycle: pending (queued for execution), running (actively performing actions in a browser), and a terminal state of either completed (finished successfully or with findings) or failed (encountered an unrecoverable error). An agent that is paused for human clarification reports a waiting status until it is resumed. Once an agent reaches a terminal state, its results, conversation log, and any recorded artifacts are available via the API.
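The lifecycle can be modelled client-side as a small enum. This is a minimal sketch: the status strings come from this page (waiting and stopped are only named in the endpoint descriptions further down), while the AgentStatus and is_terminal names are illustrative and not part of the API.

```python
from enum import Enum


class AgentStatus(str, Enum):
    PENDING = "pending"      # queued for execution
    RUNNING = "running"      # actively performing actions in a browser
    WAITING = "waiting"      # paused for human input (named by the stop and clarification endpoints)
    COMPLETED = "completed"  # terminal: finished successfully or with findings
    FAILED = "failed"        # terminal: unrecoverable error, or stopped via the API
    STOPPED = "stopped"      # listed as terminal in the summarize endpoint description


TERMINAL_STATUSES = frozenset(
    {AgentStatus.COMPLETED, AgentStatus.FAILED, AgentStatus.STOPPED}
)


def is_terminal(status: str) -> bool:
    """Return True once results, logs, and artifacts should be available."""
    try:
        return AgentStatus(status) in TERMINAL_STATUSES
    except ValueError:
        # Unknown status strings are treated as non-terminal.
        return False
```

Treating unknown strings as non-terminal is a design choice for this sketch; a stricter client might raise instead.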
During execution, the agent operates in an agentic loop: it observes the current state of the browser, decides what to do next, performs an action, and repeats. This loop continues until the agent determines the task is complete or it runs out of iterations.
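The loop described above can be sketched in a few lines. This is an illustration of the observe, decide, act cycle only, not the platform's implementation: the observe, decide, and act callables are hypothetical, and the default budget of 115 mirrors the max_iterations default documented for the create endpoint.

```python
from typing import Any, Callable, Dict, Optional


def run_agent_loop(
    observe: Callable[[], Any],
    decide: Callable[[Any], Optional[Any]],
    act: Callable[[Any], None],
    max_iterations: int = 115,
) -> Dict[str, Any]:
    """Repeat observe, decide, act until decide returns None or the budget is spent."""
    for iteration in range(1, max_iterations + 1):
        state = observe()        # look at the current browser state
        action = decide(state)   # choose the next action, or None when the task is done
        if action is None:
            return {"finished": True, "iterations": iteration}
        act(action)              # perform the chosen action
    # The iteration budget ran out before the task was judged complete.
    return {"finished": False, "iterations": max_iterations}
```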
Capabilities
Agents have capabilities that go well beyond simple page navigation. The exact set may change as the platform evolves, but at a high level agents can do the following:
- Browser interaction. Navigate to URLs, read page content as structured text or raw HTML, take screenshots, and inspect the accessibility tree. Agents can click elements, type into inputs, select dropdown options, handle native dialogs, and manage multiple browser tabs.
- Low-level input. Beyond simple clicks and typing, agents can perform raw pointer and keyboard actions for complex interactions like drag-and-drop, modifier key combinations, precise mouse movements, and scroll gestures. This is what allows agents to interact with complex JavaScript widgets that break under simpler automation approaches.
- HTTP requests. Agents can make direct HTTP calls independent of the browser. This is useful for API testing, verifying endpoints, or interacting with services that don't have a web UI.
- File system. Each agent gets an isolated workspace where it can read, write, copy, move, and delete files. Files can be uploaded to the workspace before execution via the API, downloaded from the web during execution, and uploaded to file inputs on pages. The workspace is sandboxed per agent.
- JavaScript execution. Agents can execute arbitrary JavaScript in the browser context. This is useful for inspecting application state, reading values that aren't visible in the DOM, or triggering client-side behavior.
- Console and network inspection. Agents can read browser console logs (including JavaScript errors) and inspect network events (HTTP requests and responses) to debug failed API calls or understand what a page action triggered behind the scenes.
- Memory. Agents can save operational insights (workarounds, efficient navigation paths, timing requirements) that persist across runs. Future agents working on the same project can recall these memories to avoid repeating mistakes.
- Human handoff. When an agent encounters something it cannot handle (CAPTCHA, 2FA, payment flows), it can pause execution and hand browser control to a human user. Once the user completes the manual step, the agent resumes.
Create a new test run
Creates a new test run and enqueues it for execution. The test run will be assigned a browser session and begin following the prompt instructions.
Parameters
| Parameter | Type | In | Required | Description |
|---|---|---|---|---|
| project_id | uuid | body | Yes | ID of the project this test run belongs to |
| persona_id | uuid | body | No | Optional persona for the browser session (cookies, credentials, email) |
| role_id | uuid | body | No | Optional role for prompt injection context |
| journey_id | uuid | body | No | Optional user journey this test run executes |
| name | string | body | No | Human-readable name for the test run |
| prompt | string | body | Yes | Instructions for the test run to follow |
| model | string | body | No | LLM to use (default: server default) |
| browser_type | string | body | No | Browser to use: chrome, firefox, or edge (default: chrome) |
| max_iterations | integer | body | No | Maximum number of LLM iterations before the test run stops (default: 115) |
| record_video | boolean | body | No | Save recordings for all runs. When false, only failed runs are saved (default: false) |
| file_paths | string[] | body | No | Paths of tenant files to copy into the test run workspace |
Status Codes
| Code | Description |
|---|---|
| 201 | Agent created and enqueued |
| 400 | Validation error |
| 401 | Unauthorized |
| 402 | Agent limit reached |
| 404 | Project or persona not found |
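A create request could be assembled as follows. This is a sketch under stated assumptions: the path /api/v1/agents and the body fields come from this page, but the POST method, the Bearer token header, and the BASE_URL host are assumptions, since the page does not state them. build_create_request is an illustrative helper name.

```python
import json
import urllib.request

BASE_URL = "https://aiqaramba.example.com"  # hypothetical host

# Optional body fields from the parameter table.
OPTIONAL_FIELDS = {
    "persona_id", "role_id", "journey_id", "name", "model",
    "browser_type", "max_iterations", "record_video", "file_paths",
}


def build_create_request(token: str, project_id: str, prompt: str, **optional):
    """Build (but do not send) a request that creates a test run."""
    if not project_id or not prompt:
        raise ValueError("project_id and prompt are required")
    unknown = set(optional) - OPTIONAL_FIELDS
    if unknown:
        raise ValueError(f"undocumented fields: {sorted(unknown)}")
    body = {"project_id": project_id, "prompt": prompt, **optional}
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/agents",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",  # assumed method
    )
```

Passing the returned object to urllib.request.urlopen would send it; a 201 response then carries the new test run with status pending.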
Response Body
{
"id": "550e8400-e29b-41d4-a716-446655440000",
"tenant_id": "660e8400-e29b-41d4-a716-446655440000",
"project_id": "770e8400-e29b-41d4-a716-446655440000",
"persona_id": "880e8400-e29b-41d4-a716-446655440000",
"status": "pending",
"name": "Checkout Flow Test",
"prompt": "Navigate to the homepage and click the sign-up button",
"model": "gemini-2.5-flash",
"browser_type": "chrome",
"record_video": false,
"agent_type": "direct",
"messages": [],
"iteration": 0,
"max_iterations": 115,
"tokens_used": 0,
"source": "api",
"created_at": "2025-01-15T10:30:00Z",
"updated_at": "2025-01-15T10:30:00Z"
}
Endpoint: /api/v1/agents
List test runs
Parameters
| Parameter | Type | In | Required | Description |
|---|---|---|---|---|
| limit | integer | query | No | Number of results to return (default: 20) |
| cursor | uuid | query | No | Cursor for pagination (test run ID) |
| project_id | uuid | query | No | Filter by project ID (alias: project) |
| failure_source | string | query | No | Filter by failure source: app, platform, agent, ambiguous |
Status Codes
| Code | Description |
|---|---|
| 200 | OK |
| 401 | Unauthorized |
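Cursor pagination over this endpoint might look like the sketch below. The query parameters and the agents and next_cursor response keys come from this page; the assumption that next_cursor is absent or null on the last page is not confirmed here. fetch_page is a hypothetical callable that performs the HTTP call and returns the decoded JSON.

```python
from typing import Callable, Dict, Iterator, Optional
from urllib.parse import urlencode


def list_agents_path(
    limit: int = 20,
    cursor: Optional[str] = None,
    project_id: Optional[str] = None,
    failure_source: Optional[str] = None,
) -> str:
    """Build the path and query string for one page of results."""
    params = {
        "limit": limit,
        "cursor": cursor,
        "project_id": project_id,
        "failure_source": failure_source,
    }
    # Drop unset filters so only explicit values are sent.
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"/api/v1/agents?{query}"


def iter_agents(fetch_page: Callable[[Optional[str]], Dict]) -> Iterator[Dict]:
    """Yield every test run, following next_cursor until it is missing or empty."""
    cursor = None
    while True:
        page = fetch_page(cursor)
        yield from page.get("agents", [])
        cursor = page.get("next_cursor")
        if not cursor:  # assumed end-of-list signal
            return
```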
Response Body
{
"agents": [
{
"id": "550e8400-e29b-41d4-a716-446655440000",
"tenant_id": "660e8400-e29b-41d4-a716-446655440000",
"project_id": "770e8400-e29b-41d4-a716-446655440000",
"persona_id": "880e8400-e29b-41d4-a716-446655440000",
"status": "completed",
"prompt": "Navigate to the homepage and click the sign-up button",
"model": "gemini-2.5-flash",
"browser_type": "chrome",
"record_video": false,
"agent_type": "direct",
"iteration": 12,
"max_iterations": 115,
"tokens_used": 4500,
"source": "api",
"created_at": "2025-01-15T10:30:00Z",
"updated_at": "2025-01-15T10:35:00Z",
"started_at": "2025-01-15T10:30:05Z",
"completed_at": "2025-01-15T10:35:00Z"
}
],
"next_cursor": "990e8400-e29b-41d4-a716-446655440000"
}
Endpoint: /api/v1/agents
Get test run by ID
Parameters
| Parameter | Type | In | Required | Description |
|---|---|---|---|---|
| id | uuid | path | Yes | Test run ID |
Status Codes
| Code | Description |
|---|---|
| 200 | OK |
| 400 | Invalid UUID |
| 401 | Unauthorized |
| 404 | Agent not found |
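Because a test run executes asynchronously, a client will typically poll this endpoint until the status is terminal. The sketch below assumes a hypothetical fetch_agent callable that performs the GET and returns the decoded JSON; the interval and timeout defaults are arbitrary, and the sleep and clock parameters exist only to make the function easy to test.

```python
import time
from typing import Callable, Dict

# Terminal statuses named on this page.
TERMINAL = {"completed", "failed", "stopped"}


def wait_for_terminal(
    fetch_agent: Callable[[str], Dict],
    agent_id: str,
    interval: float = 5.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict:
    """Poll a test run until it reaches a terminal status or the timeout expires."""
    deadline = clock() + timeout
    while True:
        agent = fetch_agent(agent_id)
        if agent.get("status") in TERMINAL:
            return agent
        if clock() >= deadline:
            raise TimeoutError(
                f"test run {agent_id} still {agent.get('status')!r} after {timeout}s"
            )
        sleep(interval)
```

A run in the waiting status will keep this loop polling until someone responds or the timeout fires, so callers may want to inspect intermediate statuses as well.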
Response Body
{
"id": "550e8400-e29b-41d4-a716-446655440000",
"tenant_id": "660e8400-e29b-41d4-a716-446655440000",
"project_id": "770e8400-e29b-41d4-a716-446655440000",
"persona_id": "880e8400-e29b-41d4-a716-446655440000",
"status": "completed",
"prompt": "Navigate to the homepage and click the sign-up button",
"model": "gemini-2.5-flash",
"browser_type": "chrome",
"record_video": false,
"agent_type": "direct",
"messages": [],
"iteration": 12,
"max_iterations": 115,
"tokens_used": 4500,
"result": {"outcome": "success", "findings": []},
"summary": {"outcome": "success", "steps_taken": ["..."]},
"source": "api",
"created_at": "2025-01-15T10:30:00Z",
"updated_at": "2025-01-15T10:35:00Z",
"started_at": "2025-01-15T10:30:05Z",
"completed_at": "2025-01-15T10:35:00Z"
}
Endpoint: /api/v1/agents/{id}
Delete a test run
Parameters
| Parameter | Type | In | Required | Description |
|---|---|---|---|---|
| id | uuid | path | Yes | Test run ID |
Status Codes
| Code | Description |
|---|---|
| 204 | Agent deleted |
| 400 | Invalid UUID |
| 401 | Unauthorized |
| 404 | Agent not found |
Endpoint: /api/v1/agents/{id}
Generate test run summary
Uses an LLM to generate a summary of what the test run did. Only works on terminal test runs (completed, failed, stopped).
Parameters
| Parameter | Type | In | Required | Description |
|---|---|---|---|---|
| id | uuid | path | Yes | Test run ID |
| force | boolean | query | No | Force regeneration of existing summary (default: false) |
Status Codes
| Code | Description |
|---|---|
| 200 | Summary generated |
| 400 | Agent not in terminal state or has no messages |
| 401 | Unauthorized |
| 404 | Agent not found |
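The helpers below sketch how a client might build the summarize path and render the returned summary as text. The path, the force parameter, and the outcome, steps_taken, and findings keys come from this page; sending the boolean as the literal string "true" is an assumption, and both function names are illustrative.

```python
from typing import Dict
from urllib.parse import urlencode


def summarize_path(agent_id: str, force: bool = False) -> str:
    """Build the summarize path, adding force only when regeneration is requested."""
    path = f"/api/v1/agents/{agent_id}/summarize"
    if force:
        # Assumption: the boolean query parameter is sent as "true".
        path += "?" + urlencode({"force": "true"})
    return path


def render_summary(summary: Dict) -> str:
    """Turn a summary response into a short plain-text report."""
    lines = [f"Outcome: {summary.get('outcome', 'unknown')}"]
    for number, step in enumerate(summary.get("steps_taken", []), start=1):
        lines.append(f"{number}. {step}")
    lines.append(f"Findings: {len(summary.get('findings', []))}")
    return "\n".join(lines)
```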
Response Body
{
"outcome": "success",
"steps_taken": [
"Navigated to homepage",
"Clicked sign-up button",
"Filled registration form"
],
"findings": [],
"checkpoints": {}
}
Endpoint: /api/v1/agents/{id}/summarize
Re-run a test run
Creates a new test run by cloning a terminal test run's configuration. The original test run is preserved. Also available at /api/v1/agents/{id}/rerun. Accepts an optional JSON body with override fields.
Parameters
| Parameter | Type | In | Required | Description |
|---|---|---|---|---|
| id | uuid | path | Yes | Parent test run ID |
| model | string | body | No | Override LLM model |
| max_iterations | integer | body | No | Override max iterations |
| prompt | string | body | No | Override prompt |
| record_video | boolean | body | No | Override record video |
| browser_type | string | body | No | Override browser (chrome, firefox, edge) |
Status Codes
| Code | Description |
|---|---|
| 201 | New agent created |
| 400 | Agent not in terminal state or invalid override |
| 401 | Unauthorized |
| 402 | Agent limit reached |
| 404 | Agent not found |
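Since an invalid override yields a 400, a client may prefer to validate the optional body before sending it. The sketch below checks field names and types against the parameter table; the server's actual validation rules are not documented here, so treat these checks as an approximation. build_rerun_body is an illustrative name.

```python
from typing import Any, Dict

# Override fields and types from the parameter table.
OVERRIDE_TYPES = {
    "model": str,
    "max_iterations": int,
    "prompt": str,
    "record_video": bool,
    "browser_type": str,
}
BROWSERS = {"chrome", "firefox", "edge"}


def build_rerun_body(**overrides: Any) -> Dict[str, Any]:
    """Return a JSON-ready override body, or raise ValueError on a bad field."""
    for key, value in overrides.items():
        expected = OVERRIDE_TYPES.get(key)
        if expected is None:
            raise ValueError(f"{key!r} is not a documented override")
        # bool is a subclass of int in Python, so reject True/False for integer fields.
        if not isinstance(value, expected) or (
            expected is int and isinstance(value, bool)
        ):
            raise ValueError(f"{key!r} must be of type {expected.__name__}")
    if overrides.get("browser_type", "chrome") not in BROWSERS:
        raise ValueError("browser_type must be chrome, firefox, or edge")
    return dict(overrides)
```

An empty result is valid: the body is optional, and omitting it clones the parent configuration unchanged.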
Response Body
{
"id": "880e8400-e29b-41d4-a716-446655440000",
"tenant_id": "660e8400-e29b-41d4-a716-446655440000",
"project_id": "770e8400-e29b-41d4-a716-446655440000",
"status": "pending",
"prompt": "...",
"model": "gemini-2.5-flash",
"browser_type": "chrome",
"record_video": false,
"agent_type": "direct",
"messages": [],
"iteration": 0,
"max_iterations": 115,
"tokens_used": 0,
"source": "rerun",
"parent_agent_id": "550e8400-e29b-41d4-a716-446655440000",
"created_at": "2025-01-15T10:45:00Z",
"updated_at": "2025-01-15T10:45:00Z"
}
Endpoint: /api/v1/agents/{id}/retry
Stop a running test run
Stops a test run that is currently pending, running, or waiting. The test run status will be set to failed.
Parameters
| Parameter | Type | In | Required | Description |
|---|---|---|---|---|
| id | uuid | path | Yes | Test run ID |
Status Codes
| Code | Description |
|---|---|
| 200 | Agent stopped |
| 400 | Agent already in terminal state |
| 401 | Unauthorized |
| 404 | Agent not found |
Response Body
{
"id": "550e8400-e29b-41d4-a716-446655440000",
"tenant_id": "660e8400-e29b-41d4-a716-446655440000",
"project_id": "770e8400-e29b-41d4-a716-446655440000",
"status": "failed",
"prompt": "...",
"model": "gemini-2.5-flash",
"browser_type": "chrome",
"record_video": false,
"agent_type": "direct",
"messages": [],
"iteration": 5,
"max_iterations": 115,
"tokens_used": 1200,
"source": "api",
"created_at": "2025-01-15T10:30:00Z",
"updated_at": "2025-01-15T10:32:00Z"
}
Endpoint: /api/v1/agents/{id}/stop
Respond to test run clarification
Sends a response to a test run that is waiting for human clarification. The test run must have status 'waiting' with a clarification wait condition.
Parameters
| Parameter | Type | In | Required | Description |
|---|---|---|---|---|
| id | uuid | path | Yes | Test run ID |
| response | string | body | Yes | The user's response to the clarification request |
Status Codes
| Code | Description |
|---|---|
| 200 | Clarification accepted, agent resumed |
| 400 | Agent not waiting for clarification |
| 401 | Unauthorized |
| 404 | Agent not found |
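A client can avoid the 400 above by checking the status it last fetched before replying. This sketch builds the request without sending it; the path and the response field come from this page, while the POST method, the Bearer token header, and the BASE_URL host are assumptions. build_clarification_request is an illustrative name.

```python
import json
import urllib.request
from typing import Dict

BASE_URL = "https://aiqaramba.example.com"  # hypothetical host


def build_clarification_request(token: str, agent: Dict, response: str):
    """Build a clarification reply for a test run that is waiting for input."""
    if agent.get("status") != "waiting":
        raise ValueError("test run is not waiting for clarification")
    if not response.strip():
        raise ValueError("response must not be empty")
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/agents/{agent['id']}/clarification",
        data=json.dumps({"response": response}).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",  # assumed method
    )
```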
Response Body
{
"id": "550e8400-e29b-41d4-a716-446655440000",
"tenant_id": "660e8400-e29b-41d4-a716-446655440000",
"project_id": "770e8400-e29b-41d4-a716-446655440000",
"status": "running",
"prompt": "...",
"model": "gemini-2.5-flash",
"browser_type": "chrome",
"record_video": false,
"agent_type": "direct",
"messages": [],
"iteration": 3,
"max_iterations": 115,
"tokens_used": 800,
"source": "api",
"created_at": "2025-01-15T10:30:00Z",
"updated_at": "2025-01-15T10:31:00Z"
}
Endpoint: /api/v1/agents/{id}/clarification