Agents

Agents are autonomous browser-based workers. Each agent gets its own browser session, receives a prompt describing what to do, and runs an agentic loop until the task is complete. Results, conversation logs, and artifacts are available via the API.

Overview

The agent is the most fundamental entity in Aiqaramba. Everything else in the system (projects, journeys, schedules, personas) exists to configure and organize agents. An agent represents a single autonomous execution: one browser session, one prompt, run to completion.

When you create an agent, it enters a state machine with a well-defined lifecycle: pending (queued for execution), running (actively performing actions in a browser), and a terminal state of either completed (finished successfully or with findings) or failed (encountered an unrecoverable error, or was stopped via the API). A running agent can also enter waiting while it is paused for human input, such as a clarification request. Once an agent reaches a terminal state, its results, conversation log, and any recorded artifacts are available via the API.
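
The lifecycle can be modeled client-side as a small enumeration. This is only a sketch: the status strings come from this page (`waiting` appears in the stop and clarification endpoints, and `stopped` is listed as terminal by the summarize endpoint), while the class and helper names are hypothetical.

```python
from enum import Enum


class AgentStatus(str, Enum):
    PENDING = "pending"      # queued for execution
    RUNNING = "running"      # actively performing actions in a browser
    WAITING = "waiting"      # paused for human input (clarification endpoint)
    COMPLETED = "completed"  # finished successfully or with findings
    FAILED = "failed"        # unrecoverable error, or stopped via the API
    STOPPED = "stopped"      # listed as terminal by the summarize endpoint


TERMINAL = {AgentStatus.COMPLETED, AgentStatus.FAILED, AgentStatus.STOPPED}


def is_terminal(status: str) -> bool:
    """Return True once results, logs, and artifacts should be available."""
    try:
        return AgentStatus(status) in TERMINAL
    except ValueError:
        # Assumption: unknown status strings are treated as non-terminal.
        return False
```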

During execution, the agent operates in an agentic loop: it observes the current state of the browser, decides what to do next, performs an action, and repeats. This loop continues until the agent determines the task is complete or it runs out of iterations.
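
As an illustration of the loop described above, here is a minimal sketch. It is not the platform's implementation: the `observe`, `decide`, and `act` callables are hypothetical stand-ins, and counting the final "task complete" decision as an iteration is an assumption.

```python
from typing import Callable, Optional, Tuple


def run_agentic_loop(
    observe: Callable[[], str],
    decide: Callable[[str], Optional[str]],
    act: Callable[[str], None],
    max_iterations: int = 115,
) -> Tuple[str, int]:
    """Observe, decide, act, repeat. Returns (reason, iterations_used)."""
    for iteration in range(1, max_iterations + 1):
        state = observe()          # look at the current browser state
        action = decide(state)     # None signals "the task is complete"
        if action is None:
            return "done", iteration
        act(action)                # perform the chosen action
    return "out_of_iterations", max_iterations


# Toy usage: stop after three "click" actions have been performed.
clicks = []
outcome = run_agentic_loop(
    observe=lambda: f"{len(clicks)} clicks so far",
    decide=lambda state: None if len(clicks) >= 3 else "click",
    act=clicks.append,
)
```

The iteration cap plays the role of `max_iterations` in the create endpoint: the loop ends either because the decision step reports completion or because the budget is exhausted.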

Capabilities

Agents have access to a broad set of capabilities that go well beyond simple page navigation. These capabilities are subject to change as the platform evolves, but at a high level agents can do the following:

  • Browser interaction. Navigate to URLs, read page content as structured text or raw HTML, take screenshots, and inspect the accessibility tree. Agents can click elements, type into inputs, select dropdown options, handle native dialogs, and manage multiple browser tabs.
  • Low-level input. Beyond simple clicks and typing, agents can perform raw pointer and keyboard actions for complex interactions like drag-and-drop, modifier key combinations, precise mouse movements, and scroll gestures. This is what allows agents to interact with complex JavaScript widgets that break under simpler automation approaches.
  • HTTP requests. Agents can make direct HTTP calls independent of the browser. This is useful for API testing, verifying endpoints, or interacting with services that don't have a web UI.
  • File system. Each agent gets an isolated workspace where it can read, write, copy, move, and delete files. Files can be uploaded to the workspace before execution via the API, downloaded from the web during execution, and uploaded to file inputs on pages. The workspace is sandboxed per agent.
  • JavaScript execution. Agents can execute arbitrary JavaScript in the browser context. This is useful for inspecting application state, reading values that aren't visible in the DOM, or triggering client-side behavior.
  • Console and network inspection. Agents can read browser console logs (including JavaScript errors) and inspect network events (HTTP requests and responses) to debug failed API calls or understand what a page action triggered behind the scenes.
  • Memory. Agents can save operational insights (workarounds, efficient navigation paths, timing requirements) that persist across runs. Future agents working on the same project can recall these memories to avoid repeating mistakes.
  • Human handoff. When an agent encounters something it cannot handle (CAPTCHA, 2FA, payment flows), it can pause execution and hand browser control to a human user. Once the user completes the manual step, the agent resumes.
POST /api/v1/agents

Create a new test run

Creates a new test run (the endpoint descriptions on this page use "test run" for an agent) and enqueues it for execution. The test run is assigned a browser session and begins following the prompt instructions.

Parameters

| Parameter | Type | In | Required | Description |
|---|---|---|---|---|
| project_id | uuid | body | Yes | ID of the project this test run belongs to |
| persona_id | uuid | body | No | Optional persona for the browser session (cookies, credentials, email) |
| role_id | uuid | body | No | Optional role for prompt injection context |
| journey_id | uuid | body | No | Optional user journey this test run executes |
| name | string | body | No | Human-readable name for the test run |
| prompt | string | body | Yes | Instructions for the test run to follow |
| model | string | body | No | LLM model to use (default: server default) |
| browser_type | string | body | No | Browser to use: chrome, firefox, or edge (default: chrome) |
| max_iterations | integer | body | No | Maximum number of LLM iterations before the test run stops (default: 115) |
| record_video | boolean | body | No | When true, recordings are saved for all runs. When false, only recordings of failed runs are saved (default: false) |
| file_paths | string[] | body | No | Paths of tenant files to copy into the test run workspace |

Status Codes

| Code | Description |
|---|---|
| 201 | Agent created and enqueued |
| 400 | Validation error |
| 401 | Unauthorized |
| 402 | Agent limit reached |
| 404 | Project or persona not found |

Response Body

{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "tenant_id": "660e8400-e29b-41d4-a716-446655440000",
  "project_id": "770e8400-e29b-41d4-a716-446655440000",
  "persona_id": "880e8400-e29b-41d4-a716-446655440000",
  "status": "pending",
  "name": "Checkout Flow Test",
  "prompt": "Navigate to the homepage and click the sign-up button",
  "model": "gemini-2.5-flash",
  "browser_type": "chrome",
  "record_video": false,
  "agent_type": "direct",
  "messages": [],
  "iteration": 0,
  "max_iterations": 115,
  "tokens_used": 0,
  "source": "api",
  "created_at": "2025-01-15T10:30:00Z",
  "updated_at": "2025-01-15T10:30:00Z"
}
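
A request for this endpoint could be assembled as in the sketch below, using only the Python standard library. The host in `BASE_URL` and the function name are hypothetical, and because this page does not describe the authentication scheme, any auth header is left to the caller via `headers`.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # hypothetical host, replace with yours

# Optional body fields taken from the parameter table above.
OPTIONAL_FIELDS = {
    "persona_id", "role_id", "journey_id", "name", "model",
    "browser_type", "max_iterations", "record_video", "file_paths",
}


def build_create_agent_request(project_id, prompt, headers=None, **optional):
    """Build (but do not send) a POST /api/v1/agents request."""
    unknown = set(optional) - OPTIONAL_FIELDS
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    if not project_id or not prompt:
        raise ValueError("project_id and prompt are required")
    body = {"project_id": project_id, "prompt": prompt, **optional}
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/agents",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", **(headers or {})},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. Validating field names locally is a design choice in this sketch that catches typos before they reach the server as a 400.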
GET /api/v1/agents

List test runs

Parameters

| Parameter | Type | In | Required | Description |
|---|---|---|---|---|
| limit | integer | query | No | Number of results to return (default: 20) |
| cursor | uuid | query | No | Cursor for pagination (test run ID) |
| project_id | uuid | query | No | Filter by project ID (alias: project) |
| failure_source | string | query | No | Filter by failure source: app, platform, agent, ambiguous |

Status Codes

| Code | Description |
|---|---|
| 200 | OK |
| 401 | Unauthorized |

Response Body

{
  "agents": [
    {
      "id": "550e8400-e29b-41d4-a716-446655440000",
      "tenant_id": "660e8400-e29b-41d4-a716-446655440000",
      "project_id": "770e8400-e29b-41d4-a716-446655440000",
      "persona_id": "880e8400-e29b-41d4-a716-446655440000",
      "status": "completed",
      "prompt": "Navigate to the homepage and click the sign-up button",
      "model": "gemini-2.5-flash",
      "browser_type": "chrome",
      "record_video": false,
      "agent_type": "direct",
      "iteration": 12,
      "max_iterations": 115,
      "tokens_used": 4500,
      "source": "api",
      "created_at": "2025-01-15T10:30:00Z",
      "updated_at": "2025-01-15T10:35:00Z",
      "started_at": "2025-01-15T10:30:05Z",
      "completed_at": "2025-01-15T10:35:00Z"
    }
  ],
  "next_cursor": "990e8400-e29b-41d4-a716-446655440000"
}
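
Cursor pagination over this endpoint might look like the following sketch. It assumes that `next_cursor` is passed back as the `cursor` query parameter and that it is absent or null on the last page; neither detail is stated on this page. `fetch_page` is a hypothetical callable that performs the GET and returns the decoded JSON body.

```python
from typing import Callable, Dict, Iterator, Optional


def iter_agents(
    fetch_page: Callable[[Dict[str, str]], dict],
    project_id: Optional[str] = None,
    limit: int = 20,
) -> Iterator[dict]:
    """Yield every agent, following next_cursor until it runs out."""
    params = {"limit": str(limit)}
    if project_id:
        params["project_id"] = project_id
    while True:
        page = fetch_page(dict(params))   # copy so callers cannot mutate state
        yield from page.get("agents", [])
        cursor = page.get("next_cursor")
        if not cursor:                    # assumption: missing/null ends paging
            break
        params["cursor"] = cursor


# Toy usage with canned pages instead of real HTTP calls.
_pages = {
    None: {"agents": [{"id": "a"}, {"id": "b"}], "next_cursor": "b"},
    "b": {"agents": [{"id": "c"}]},
}
all_ids = [agent["id"] for agent in iter_agents(lambda p: _pages[p.get("cursor")])]
```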
GET /api/v1/agents/{id}

Get test run by ID

Parameters

| Parameter | Type | In | Required | Description |
|---|---|---|---|---|
| id | uuid | path | Yes | Test run ID |

Status Codes

| Code | Description |
|---|---|
| 200 | OK |
| 400 | Invalid UUID |
| 401 | Unauthorized |
| 404 | Agent not found |

Response Body

{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "tenant_id": "660e8400-e29b-41d4-a716-446655440000",
  "project_id": "770e8400-e29b-41d4-a716-446655440000",
  "persona_id": "880e8400-e29b-41d4-a716-446655440000",
  "status": "completed",
  "prompt": "Navigate to the homepage and click the sign-up button",
  "model": "gemini-2.5-flash",
  "browser_type": "chrome",
  "record_video": false,
  "agent_type": "direct",
  "messages": [],
  "iteration": 12,
  "max_iterations": 115,
  "tokens_used": 4500,
  "result": {"outcome": "success", "findings": []},
  "summary": {"outcome": "success", "steps_taken": ["..."]},
  "source": "api",
  "created_at": "2025-01-15T10:30:00Z",
  "updated_at": "2025-01-15T10:35:00Z",
  "started_at": "2025-01-15T10:30:05Z",
  "completed_at": "2025-01-15T10:35:00Z"
}
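
Since results only appear once a run is terminal, a client will typically poll this endpoint. The sketch below shows one way to do that; the polling interval, timeout, and function names are assumptions, and `fetch_agent` is a hypothetical callable that performs the GET and returns the decoded JSON body.

```python
import time
from typing import Callable

# Terminal statuses as named on this page (summarize endpoint).
TERMINAL_STATUSES = {"completed", "failed", "stopped"}


def wait_for_agent(
    fetch_agent: Callable[[], dict],
    poll_seconds: float = 5.0,
    timeout_seconds: float = 1800.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the agent reaches a terminal status, then return it."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        agent = fetch_agent()
        if agent.get("status") in TERMINAL_STATUSES:
            return agent
        if time.monotonic() >= deadline:
            raise TimeoutError(f"agent still {agent.get('status')!r} at timeout")
        sleep(poll_seconds)


# Toy usage with canned responses and a no-op sleep.
_responses = iter([
    {"status": "pending"},
    {"status": "running"},
    {"status": "completed", "result": {"outcome": "success", "findings": []}},
])
final = wait_for_agent(lambda: next(_responses), sleep=lambda seconds: None)
```

An agent in the waiting status needs human input before it can progress, so a caller may want to surface that state rather than keep polling silently.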
DELETE /api/v1/agents/{id}

Delete a test run

Parameters

| Parameter | Type | In | Required | Description |
|---|---|---|---|---|
| id | uuid | path | Yes | Test run ID |

Status Codes

| Code | Description |
|---|---|
| 204 | Agent deleted |
| 400 | Invalid UUID |
| 401 | Unauthorized |
| 404 | Agent not found |
POST /api/v1/agents/{id}/summarize

Generate test run summary

Uses an LLM to generate a summary of what the test run did. Only works on terminal test runs (completed, failed, stopped).

Parameters

| Parameter | Type | In | Required | Description |
|---|---|---|---|---|
| id | uuid | path | Yes | Test run ID |
| force | boolean | query | No | Force regeneration of an existing summary (default: false) |

Status Codes

| Code | Description |
|---|---|
| 200 | Summary generated |
| 400 | Agent not in terminal state or has no messages |
| 401 | Unauthorized |
| 404 | Agent not found |

Response Body

{
  "outcome": "success",
  "steps_taken": [
    "Navigated to homepage",
    "Clicked sign-up button",
    "Filled registration form"
  ],
  "findings": [],
  "checkpoints": {}
}
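
The `force` flag travels in the query string rather than the body. A sketch of building such a request follows; the host, the function name, and the encoding of the boolean as the literal `true` are all assumptions, and auth headers are left to the caller.

```python
import urllib.parse
import urllib.request

BASE_URL = "https://api.example.com"  # hypothetical host


def build_summarize_request(agent_id, force=False, headers=None):
    """Build (but do not send) a POST /api/v1/agents/{id}/summarize request."""
    safe_id = urllib.parse.quote(agent_id, safe="")
    url = f"{BASE_URL}/api/v1/agents/{safe_id}/summarize"
    if force:
        # Assumption: the server accepts "true" for boolean query parameters.
        url += "?" + urllib.parse.urlencode({"force": "true"})
    return urllib.request.Request(url, headers=headers or {}, method="POST")
```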
POST /api/v1/agents/{id}/retry

Re-run a test run

Creates a new test run by cloning a terminal test run's configuration. The original test run is preserved. Also available at /api/v1/agents/{id}/rerun. Accepts an optional JSON body with override fields.

Parameters

| Parameter | Type | In | Required | Description |
|---|---|---|---|---|
| id | uuid | path | Yes | Parent test run ID |
| model | string | body | No | Override LLM model |
| max_iterations | integer | body | No | Override max iterations |
| prompt | string | body | No | Override prompt |
| record_video | boolean | body | No | Override record video |
| browser_type | string | body | No | Override browser (chrome, firefox, edge) |

Status Codes

| Code | Description |
|---|---|
| 201 | New agent created |
| 400 | Agent not in terminal state or invalid override |
| 401 | Unauthorized |
| 402 | Agent limit reached |
| 404 | Agent not found |

Response Body

{
  "id": "880e8400-e29b-41d4-a716-446655440000",
  "tenant_id": "660e8400-e29b-41d4-a716-446655440000",
  "project_id": "770e8400-e29b-41d4-a716-446655440000",
  "status": "pending",
  "prompt": "...",
  "model": "gemini-2.5-flash",
  "browser_type": "chrome",
  "record_video": false,
  "agent_type": "direct",
  "messages": [],
  "iteration": 0,
  "max_iterations": 115,
  "tokens_used": 0,
  "source": "rerun",
  "parent_agent_id": "550e8400-e29b-41d4-a716-446655440000",
  "created_at": "2025-01-15T10:45:00Z",
  "updated_at": "2025-01-15T10:45:00Z"
}
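
Because the body is optional and limited to a handful of override fields, a client helper can check field names before sending. The following is a sketch under the same assumptions as the other examples (hypothetical host and function name, auth headers supplied by the caller).

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # hypothetical host

# Override fields taken from the parameter table above.
OVERRIDE_FIELDS = {"model", "max_iterations", "prompt", "record_video", "browser_type"}


def build_retry_request(parent_id, overrides=None, headers=None):
    """Build (but do not send) a POST /api/v1/agents/{id}/retry request."""
    overrides = overrides or {}
    unknown = set(overrides) - OVERRIDE_FIELDS
    if unknown:
        raise ValueError(f"unsupported overrides: {sorted(unknown)}")
    request_headers = dict(headers or {})
    data = None
    if overrides:
        # Only send a JSON body when there is something to override.
        data = json.dumps(overrides).encode("utf-8")
        request_headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/agents/{parent_id}/retry",
        data=data,
        headers=request_headers,
        method="POST",
    )
```

The new run's `parent_agent_id` in the response links it back to the run passed as `parent_id`.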
POST /api/v1/agents/{id}/stop

Stop a running test run

Stops a test run that is currently pending, running, or waiting. The test run status will be set to failed.

Parameters

| Parameter | Type | In | Required | Description |
|---|---|---|---|---|
| id | uuid | path | Yes | Test run ID |

Status Codes

| Code | Description |
|---|---|
| 200 | Agent stopped |
| 400 | Agent already in terminal state |
| 401 | Unauthorized |
| 404 | Agent not found |

Response Body

{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "tenant_id": "660e8400-e29b-41d4-a716-446655440000",
  "project_id": "770e8400-e29b-41d4-a716-446655440000",
  "status": "failed",
  "prompt": "...",
  "model": "gemini-2.5-flash",
  "browser_type": "chrome",
  "record_video": false,
  "agent_type": "direct",
  "messages": [],
  "iteration": 5,
  "max_iterations": 115,
  "tokens_used": 1200,
  "source": "api",
  "created_at": "2025-01-15T10:30:00Z",
  "updated_at": "2025-01-15T10:32:00Z"
}
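
A client can avoid a predictable 400 by checking the last known status before asking to stop a run. This is a sketch only: the host and function name are hypothetical, and the local check is advisory because the status may change between the check and the request.

```python
import urllib.request

BASE_URL = "https://api.example.com"  # hypothetical host

# Statuses this page describes as stoppable.
STOPPABLE = {"pending", "running", "waiting"}


def build_stop_request(agent: dict, headers=None):
    """Build (but do not send) a POST /api/v1/agents/{id}/stop request."""
    if agent.get("status") not in STOPPABLE:
        raise ValueError(f"agent is {agent.get('status')!r}; nothing to stop")
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/agents/{agent['id']}/stop",
        headers=headers or {},
        method="POST",
    )
```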
POST /api/v1/agents/{id}/clarification

Respond to test run clarification

Sends a response to a test run that is waiting for human clarification. The test run must have status 'waiting' with a clarification wait condition.

Parameters

| Parameter | Type | In | Required | Description |
|---|---|---|---|---|
| id | uuid | path | Yes | Test run ID |
| response | string | body | Yes | The user's response to the clarification request |

Status Codes

| Code | Description |
|---|---|
| 200 | Clarification accepted, agent resumed |
| 400 | Agent not waiting for clarification |
| 401 | Unauthorized |
| 404 | Agent not found |

Response Body

{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "tenant_id": "660e8400-e29b-41d4-a716-446655440000",
  "project_id": "770e8400-e29b-41d4-a716-446655440000",
  "status": "running",
  "prompt": "...",
  "model": "gemini-2.5-flash",
  "browser_type": "chrome",
  "record_video": false,
  "agent_type": "direct",
  "messages": [],
  "iteration": 3,
  "max_iterations": 115,
  "tokens_used": 800,
  "source": "api",
  "created_at": "2025-01-15T10:30:00Z",
  "updated_at": "2025-01-15T10:31:00Z"
}
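
Answering a clarification could be wrapped as in the sketch below, which refuses to build the request unless the last known status is `waiting`. The host and function name are hypothetical, auth headers are left to the caller, and the sketch does not inspect the wait condition itself because its field name is not documented on this page.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # hypothetical host


def build_clarification_request(agent: dict, response: str, headers=None):
    """Build (but do not send) a POST /api/v1/agents/{id}/clarification request."""
    if agent.get("status") != "waiting":
        raise ValueError("agent is not waiting for clarification")
    if not response:
        raise ValueError("response is required")
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/agents/{agent['id']}/clarification",
        data=json.dumps({"response": response}).encode("utf-8"),
        headers={"Content-Type": "application/json", **(headers or {})},
        method="POST",
    )
```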