Cursor Agent
Basic information
Website: https://web.archive.org/web/20250101102215/https://www.cursor.com/
Short description: Cursor is a coding assistant designed to help draft and review code [source]
Intended uses: What does the developer say it’s for? General-purpose code development [source]
Date(s) deployed: March 14, 2023; however, a coding “agent” was not introduced within Cursor until November 24, 2024 [source]
Developer
Website: https://www.cursor.com/
Legal name: Anysphere, Inc. [source]
Entity type: Corporation
Country (location of developer or first author’s first affiliation): Incorporation: Delaware, USA (Anysphere, Inc. 6524309) [source]
Safety policies: What safety and/or responsibility policies are in place? Unknown
System components
Backend model: What model(s) are used to power the system? Varies; includes GPT and Claude models [source]
Publicly available model specification: Is there formal documentation on the system’s intended uses and how it is designed to behave in them? None
Reasoning, planning, and memory implementation: How does the system ‘think’? The Cursor agent works in a “shadow” workspace, in which it can iteratively develop draft code before suggesting it to the user [source] [source]
Observation space: What is the system able to observe while ‘thinking’? Cursor can see all files in a codebase [source] and is able to browse the web [source]
Action space/tools: What direct actions can the system take? Cursor is designed to only suggest pieces of code to the user. Cursor can run code in its “shadow workspace”, but that workspace is sandboxed [source]
User interface: How do users interact with the system? Cursor is an IDE created as a fork of VS Code [source]
Development cost and compute: What is known about the development costs? Unknown
Guardrails and oversight
Accessibility of components:
- Weights: Are model parameters available? N/A; the backend uses external model(s) via API
- Data: Is data available? N/A; the backend uses external model(s) via API
- Code: Is code available? Closed source
- Scaffolding: Is system scaffolding available? Closed source
- Documentation: Is documentation available? Available [source] [source]
Controls and guardrails: What notable methods are used to protect against harmful actions? Users can define specific rules for Cursor [source]
Customer and usage restrictions: Are there know-your-customer measures or other restrictions on customers? None
Monitoring and shutdown procedures: Are there any notable methods or protocols that allow for the system to be shut down if it is observed to behave harmfully? Cursor can only propose actions to a user for them to accept or reject. Cursor can run code in the shadow workspace, but that workspace is sandboxed [source]
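The propose-then-approve pattern described under Monitoring and shutdown procedures can be sketched as follows. This is a minimal illustration only, not Cursor's implementation (which is closed source): `ProposedEdit`, `review_proposals`, and the `approve` callback are hypothetical names, and an in-memory dictionary stands in for a real workspace.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ProposedEdit:
    """A hypothetical suggested change; not Cursor's actual data model."""
    path: str
    new_text: str


def review_proposals(
    files: Dict[str, str],
    proposals: List[ProposedEdit],
    approve: Callable[[ProposedEdit], bool],
) -> Dict[str, str]:
    """Return a copy of `files` with only the approved edits applied."""
    result = dict(files)  # work on a copy; the original mapping is not mutated
    for edit in proposals:
        if approve(edit):  # nothing is applied without an explicit approval
            result[edit.path] = edit.new_text
    return result


files = {"app.py": "print('hi')\n"}
proposals = [
    ProposedEdit("app.py", "print('hello')\n"),
    ProposedEdit("new_module.py", "import os\n"),
]
# Stand-in for the user's decision: accept edits to existing files only.
accepted = review_proposals(files, proposals, lambda e: e.path in files)
```

In this sketch the agent's output is only ever a list of proposals; the one place where file contents change is gated by the `approve` callback, which models the user accepting or rejecting each suggestion.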
Evaluation
Notable benchmark evaluations: None
Bespoke testing: Demos [source]
Safety: Have safety evaluations been conducted by the developers? What were the results? None
Publicly reported external red-teaming or comparable auditing:
- Personnel: Who were the red-teamers/auditors? None
- Scope, scale, access, and methods: What access did red-teamers/auditors have and what actions did they take? None
- Findings: What did the red-teamers/auditors conclude? None
Additional notes
None