The AI Agent Index

Documenting the technical and safety features of deployed agentic AI systems

Basepilot


Basic information

Website: https://web.archive.org/web/20241219234653/https://www.basepilot.com/

Short description: Basepilot is designed to automate repetitive, resource-intensive, browser-based back-office tasks across industries such as logistics, insurance, finance, and real estate [source].

Intended uses: What does the developer say it’s for? The platform uses agentic AI “employees” to handle workflows like data entry, form processing, invoicing, compliance tasks, and document processing, allowing human teams to focus on higher-value tasks that require more strategic input [source].

Date(s) deployed: April 2024 [source]


Developer

Website: https://web.archive.org/web/20241219234653/https://www.basepilot.com/

Legal name: Basepilot, Inc [source]

Entity type: Corporation [source]

Country (location of developer or first author’s first affiliation): Incorporation: Delaware, USA (BASEPILOT, INC (2895569)) [source]

Safety policies: What safety and/or responsibility policies are in place? Unknown


System components

Backend model: What model(s) are used to power the system? Unknown

Publicly available model specification: Is there formal documentation on the system’s intended uses and how it is designed to behave in them? None

Reasoning, planning, and memory implementation: How does the system ‘think’? Basepilot learns directly from user demonstrations within the browser. Users train it and build their workflows by demonstrating each step; Basepilot then plans and executes based on the observed actions [source] [source].

Observation space: What is the system able to observe while ‘thinking’? Operating as a Chrome extension, Basepilot observes and interacts with content in the user's browser tabs, enabling it to work across existing web tools without altering workflows. Because it runs in the browser, users can also monitor each action in real time [source].

Action space/tools: What direct actions can the system take? Basepilot’s action space is the browser environment, accessed through the Chrome extension. Within this space, Basepilot interacts with elements on web pages and applications: filling out forms, clicking buttons, navigating pages, extracting data, and entering information across the platforms the user already employs [source].

User interface: How do users interact with the system? Basepilot’s user interface is embedded directly in the Chrome extension, allowing users to interact with it from within their browser. Through this interface, users can teach Basepilot tasks by demonstrating actions directly on web pages. The UI likely includes simple controls for recording actions, monitoring task progress, and overseeing automated workflows in real time. Users can also interact with the Basepilot copilot via chat [source].

Development cost and compute: What is known about the development costs? Unknown


Guardrails and oversight

Accessibility of components:

  • Weights: Are model parameters available? N/A; the system uses various backend models
  • Data: Is data available? N/A; the system uses various backend models
  • Code: Is code available? Closed source
  • Scaffolding: Is system scaffolding available? Closed source
  • Documentation: Is documentation available? Unavailable

Controls and guardrails: What notable methods are used to protect against harmful actions? Unknown

Customer and usage restrictions: Are there know-your-customer measures or other restrictions on customers? None

Monitoring and shutdown procedures: Are there any notable methods or protocols that allow for the system to be shut down if it is observed to behave harmfully? Unknown


Evaluation

Notable benchmark evaluations: Unknown

Bespoke testing: Basepilot AI Sales Assistant Demo [source]

Safety: Have safety evaluations been conducted by the developers? What were the results? None

Publicly reported external red-teaming or comparable auditing:

  • Personnel: Who were the red-teamers/auditors? None
  • Scope, scale, access, and methods: What access did red-teamers/auditors have and what actions did they take? None
  • Findings: What did the red-teamers/auditors conclude? None

Ecosystem information

Interoperability with other systems: What tools or integrations are available? Chrome [source] and all insurance software stacks [source]

Usage statistics and patterns: Are there any notable observations about usage? None


Additional notes

None