The AI Agent Index

Documenting the technical and safety features of deployed agentic AI systems

What information is in each agent card?

Basic information
  • Website
  • Short description
  • Intended uses: What does the developer state that the system is intended for?
  • Date(s) deployed
Developer
  • Website
  • Legal name
  • Entity type
  • Country (location of developer or first author’s first affiliation)
  • Safety policies: What safety and/or responsibility policies are in place?
System components
  • Backend model: What model(s) are used to power the system?
  • Publicly available model specification: Is there formal documentation on the system’s intended uses and how it is designed to behave in them?
  • Reasoning, planning, and memory implementation: How does the system ‘think’?
  • Observation space: What is the system able to observe while ‘thinking’?
  • Action space/tools: What direct actions can the system take?
  • User interface: How do users interact with the system?
  • Development cost and compute: What is known about the development costs?
Guardrails and oversight
  • Accessibility of components:
    • Weights: Are model parameters available?
    • Data: Is data available?
    • Code: Is code available?
    • Scaffolding: Is system scaffolding available?
    • Documentation: Is documentation available?
  • Controls and guardrails: What notable methods are used to protect against harmful actions?
  • Customer and usage restrictions: Are there know-your-customer measures or other restrictions on customers?
  • Monitoring and shutdown procedures: Are there any notable methods or protocols that allow for the system to be shut down if it is observed to behave harmfully?
Evaluations
  • Notable benchmark evaluations (e.g., on SWE-Bench Verified)
  • Bespoke testing (e.g., demos)
  • Safety: Have safety evaluations been conducted by the developers? What were the results?
  • Publicly reported external red-teaming or comparable auditing:
    • Personnel: Who were the red-teamers/auditors?
    • Scope, scale, access, and methods: What access did red-teamers/auditors have and what actions did they take?
    • Findings: What did the red-teamers/auditors conclude?
Ecosystem
  • Interoperability with other systems: What tools or integrations are available?
  • Usage statistics and patterns: Are there any notable observations about usage?
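
As an illustration only, the card fields listed above could be represented as a small typed record. The sketch below is not the index's actual schema or tooling: the class names, field names, and the `undocumented_fields` helper are all hypothetical, only a subset of the fields is modeled, and `None` is assumed to mean "not publicly documented".

```python
from dataclasses import dataclass, field, asdict
from typing import List, Optional


@dataclass
class Accessibility:
    """Hypothetical record for the 'Accessibility of components' sub-list."""
    weights: Optional[str] = None
    data: Optional[str] = None
    code: Optional[str] = None
    scaffolding: Optional[str] = None
    documentation: Optional[str] = None


@dataclass
class AgentCard:
    """Hypothetical, partial model of one agent card (field names assumed)."""
    # Basic information
    name: str
    developer_name: str
    website: Optional[str] = None
    short_description: Optional[str] = None
    intended_uses: Optional[str] = None
    dates_deployed: Optional[str] = None
    # System components
    backend_model: Optional[str] = None
    action_space: Optional[str] = None
    # Guardrails and oversight
    accessibility: Accessibility = field(default_factory=Accessibility)
    controls_and_guardrails: Optional[str] = None
    # Evaluations
    benchmark_evaluations: Optional[str] = None
    safety_evaluations: Optional[str] = None


def undocumented_fields(card: AgentCard) -> List[str]:
    """Return dotted names of every field that is still None."""
    missing = []
    for key, value in asdict(card).items():
        if isinstance(value, dict):
            # Nested section: report each empty sub-field separately.
            missing += [f"{key}.{sub}" for sub, v in value.items() if v is None]
        elif value is None:
            missing.append(key)
    return missing


# Only the developer and system name come from the list of cards;
# every other field is deliberately left empty rather than guessed.
card = AgentCard(name="Devin", developer_name="Cognition Labs")
print(undocumented_fields(card))
```

Keeping each field optional makes it easy to see which parts of a card have no public documentation, which is the kind of gap the card fields are meant to surface.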

Agent cards (as of December 31, 2024)

11x AI, Alice
AIAgent, AIAgent.app
All Hands AI, CodeAct 2.1
Amazon, Amazon Q Developer
Anthropic, Claude 3.5 Sonnet (2024-10-22)
Assaf Elovic, GPT Researcher
Babel, Gru
Bardeen, Bardeen
Basepilot, Basepilot
Beijing Baichuan Intelligent Technology, Sibyl System
BlackBox AI, Coding Agent
Carnegie Mellon University, Agent Workflow Memory
Codebuff, Codebuff
CodeStory, Aide
Cognition Labs, Devin
Colony Labs, ScribeAgent
Composio, SWE Agent
Cosine, Genie
Cursor, Cursor Agent
Cykel, Lucy
DeepSeek, DeepSeek-V3
DeepWisdom AI, MetaGPT
Dosu, Dosu
Erik Bjäreholt, gptme
Factory, Code Droid
Google DeepMind, Astra
Google DeepMind, Jules
Google DeepMind, Mariner
H2O AI, h2oGPTe
H Company, Runner H
HeyNeo, Neo
Iuvo AI, Grit Agent
Kodu AI, Claude Coder
LinkedIn, LinkedIn Talent Agents
Microsoft, Magentic One
MultiOn AI, Agent Q
MyShell AI, AIlice
National University of Singapore, AutoCodeRover
National University of Singapore, ShowUI
OpenAI, ChatGPT-OpenAI o1
OpenAI, OpenAI o3
OthersideAI, HyperWrite
Princeton University, SWE-Agent
Pythagora AI, Pythagora-v1 (GPT-Pilot)
Replit, Replit Agent
Sakana AI, The AI Scientist
Salesforce, Agentforce Agents
Shanghai AI Laboratory, OS-Copilot
Simple AI, Simple AI
Simular Research, Agent S
Stanford University, OpenVLA
Stanford University, Virtual Lab
SuperAGI, SuperCoder 2.0
Technion–Israel Institute of Technology, data-to-paper
Tel-Aviv University, SeePlanAct
Trase, Trase Agent
Tsinghua University, AutoWebGLM
Tsinghua University, WebRL
University of California Berkeley, Proposer-Agent-Evaluator
University of California Berkeley, Octo
University of Hong Kong, Aguvis
University of Illinois Urbana-Champaign, CodeActAgent
University of Maryland, DynaSaur
Weco AI, Weco
XBOW, XBOW
Zhejiang University, OpenWebVoyager
Zhipu AI, AutoGLM