The AI Agent Index

Documenting the technical and safety features of deployed agentic AI systems

Dosu


Basic information

Website: https://web.archive.org/web/20241205190358/https://dosu.dev/

Short description: Dosu is an AI-powered teammate that automates issue responses, bug triage, and documentation updates. [source]

Intended uses: What does the developer say it’s for? Dosu’s features include Auto-Labeling, which organizes tasks and tickets by automatically applying relevant labels based on context and past activity; Issue Triage + Q&A, which draws on a knowledge base and agent workflows to provide expert guidance directly in GitHub or Slack discussions; and Changelog Generation (currently in beta), which automates the creation of detailed changelogs to simplify team communication. [source]

Date(s) deployed: June 24, 2023 (first commit) [source]


Developer

Website: https://web.archive.org/web/20241205190358/https://dosu.dev/

Legal name: Dosu, Inc. [source]

Entity type: Corporation [source]

Country (location of developer or first author’s first affiliation): Incorporation: Delaware, USA (DOSU, INC. 7438522) [source]. HQ: San Francisco, USA [source]

Safety policies: What safety and/or responsibility policies are in place? Unknown


System components

Backend model: What model(s) are used to power the system? “We use many different models, mostly hosted ones from OpenAI, Anthropic, and others. We use different models at different steps in our pipeline. In the future, we hope to use more and more open-source models as the company matures.” [source]

Publicly available model specification: Is there formal documentation on the system’s intended uses and how it is designed to behave in them? None

Reasoning, planning, and memory implementation: How does the system ‘think’? Dosu uses continual in-context learning to dynamically adapt its reasoning and planning, drawing on user-provided corrections stored in an example store. At inference time, it retrieves the most relevant examples for a task, enabling it to “think” and adjust based on evolving organizational workflows and specific needs. [source]
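The retrieve-and-adapt loop described above can be sketched in miniature. Everything here (the `Example` record, the lexical similarity function, the prompt format) is an illustrative assumption, not Dosu’s actual implementation; a production system would typically use embedding-based retrieval rather than word overlap:

```python
from dataclasses import dataclass

@dataclass
class Example:
    task_text: str    # the task as originally posed
    correction: str   # the user-provided fix, stored for reuse

def similarity(a: str, b: str) -> float:
    # Toy lexical overlap (Jaccard); stands in for embedding similarity.
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / max(len(wa | wb), 1)

def retrieve(store: list[Example], task: str, k: int = 2) -> list[Example]:
    # Pick the k stored corrections most relevant to the new task.
    return sorted(store, key=lambda e: similarity(e.task_text, task),
                  reverse=True)[:k]

def build_prompt(store: list[Example], task: str) -> str:
    # Prepend retrieved corrections as in-context examples for the model.
    shots = retrieve(store, task)
    demos = "\n".join(f"Task: {e.task_text}\nCorrection: {e.correction}"
                      for e in shots)
    return f"{demos}\nTask: {task}\nAnswer:"

store = [
    Example("label bug report about login crash", "apply label: bug"),
    Example("label feature request for dark mode", "apply label: enhancement"),
]
print(build_prompt(store, "label bug report about signup crash"))
```

The key property the sketch captures is that corrections accumulate in the store and shape future behavior without any weight updates.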

Observation space: What is the system able to observe while ‘thinking’? Dosu’s observation space consists of the data available from its configured data sources within a workspace, including target-specific details and integrated knowledge bases. For example, a workspace could be a GitHub repository, Jira project, or Slack channel, while data sources could include the repository files, commit history, issue trackers, or message archives from these platforms. [source]

Action space/tools: What direct actions can the system take? Dosu has internal tools to search code, commits, tickets, and documentation and user-facing tools to comment on threads, label tickets, and close/open tickets. [source]
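A tool-dispatch pattern of this kind can be sketched as follows. The registry, tool names, and return values are hypothetical, chosen only to mirror the split described above between internal (read-only search) and user-facing (state-changing) tools:

```python
from typing import Callable

# Hypothetical tool registry; not Dosu's actual interface.
TOOLS: dict[str, Callable[..., str]] = {}

def tool(name: str):
    # Decorator that registers a function under a tool name.
    def register(fn: Callable[..., str]) -> Callable[..., str]:
        TOOLS[name] = fn
        return fn
    return register

@tool("search_code")
def search_code(query: str) -> str:
    # Internal tool: read-only lookup, no side effects.
    return f"results for {query!r}"

@tool("label_ticket")
def label_ticket(ticket_id: int, label: str) -> str:
    # User-facing tool: would mutate tracker state in a real system.
    return f"ticket {ticket_id} labeled {label!r}"

def act(name: str, **kwargs) -> str:
    # The agent selects a tool by name and invokes it with arguments.
    return TOOLS[name](**kwargs)

print(act("label_ticket", ticket_id=42, label="bug"))
```

Separating read-only search tools from state-changing action tools is a common design choice in agent scaffolds, since the latter are the natural place to attach guardrails or human review.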

User interface: How do users interact with the system? Users interact with Dosu as a bot, using natural language, through its integrations with platforms such as GitHub, Slack, and Linear. [source]

Development cost and compute: What is known about the development costs? Unknown


Guardrails and oversight

Accessibility of components:

  • Weights: Are model parameters available? N/A; the system relies on external model(s) accessed via API
  • Data: Is data available? N/A; the system relies on external model(s) accessed via API
  • Code: Is code available? Closed source
  • Scaffolding: Is system scaffolding available? Closed source
  • Documentation: Is documentation available? Available [source]

Controls and guardrails: What notable methods are used to protect against harmful actions? Unknown

Customer and usage restrictions: Are there know-your-customer measures or other restrictions on customers? None

Monitoring and shutdown procedures: Are there any notable methods or protocols that allow for the system to be shut down if it is observed to behave harmfully? Dosu’s Response Previews feature lets users observe its commenting behavior non-intrusively, showing the responses Dosu would generate without affecting live threads. [source]


Evaluation

Notable benchmark evaluations: N/A; the system relies on external model(s) accessed via API

Bespoke testing: A free demo is available upon request [source]

Safety: Have safety evaluations been conducted by the developers? What were the results? None

Publicly reported external red-teaming or comparable auditing:

  • Personnel: Who were the red-teamers/auditors? None
  • Scope, scale, access, and methods: What access did red-teamers/auditors have and what actions did they take? None
  • Findings: What did the red-teamers/auditors conclude? None


Ecosystem information

Interoperability with other systems: What tools or integrations are available? Integrations with platforms including GitHub, Slack, and Linear. [source]

Usage statistics and patterns: Are there any notable observations about usage? None


Additional notes

None