ChatGPT-OpenAI o1
Basic information
Website: https://perma.cc/GEC6-ETBX
Short description: A series of frontier large language models “trained with large-scale reinforcement learning to reason using chain of thought” [source].
Intended uses: What does the developer say it’s for? o1 is intended for use in any domain that requires long-form reasoning. The developers particularly highlight its ability to power software development agents and to aid scientific research [source].
Date(s) deployed: September 12, 2024 [source]
Developer
Website: https://web.archive.org/web/20241231025312/https://www.openai.com/
Legal name: OpenAI Inc. (parent company). OpenAI Global, LLC (for-profit subsidiary) [source]
Entity type: OpenAI’s structure is complex: a parent 501(c)(3) non-profit with a for-profit subsidiary [source]. OpenAI is restructuring its business into a for-profit benefit corporation that will no longer be controlled by its non-profit board [source]
Country (location of developer or first author’s first affiliation): Incorporation: Delaware, USA (OPENAI GLOBAL, LLC (7208772)) [source]. HQ: California, USA
Safety policies: What safety and/or responsibility policies are in place? The OpenAI (non-profit entity) charter states: “We are committed to doing the research required to make AGI safe, and to driving the broad adoption of such research across the AI community” [source]. They also have a “Preparedness Framework” for tracking the dangerous capabilities of the models they develop, and state that “Only models with a post-mitigation score of ‘medium’ or below can be deployed” [source]
System components
Backend model: What model(s) are used to power the system? o1 is itself the foundation system (model plus scaffolding) rather than a wrapper around a separate backend model. It is trained with “reinforcement learning to perform complex reasoning” [source]
Publicly available model specification: Is there formal documentation on the system’s intended uses and how it is designed to behave in them? OpenAI publishes a general Model Spec that applies to all of its models [source]
Reasoning, planning, and memory implementation: How does the system ‘think’? Unknown
Observation space: What is the system able to observe while ‘thinking’? Textual inputs and images from users [source].
Action space/tools: What direct actions can the system take? The system outputs natural language; it takes no direct actions beyond generating text.
User interface: How do users interact with the system? Users provide instructions and images via a chat interface or the API. In the ChatGPT interface, the user is presented with a summary of the model’s chain of thought followed by a complete final answer to their query [source] [source]
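For illustration, a minimal sketch of querying o1 through the API, assuming the official openai Python SDK and an OPENAI_API_KEY environment variable; the model name and prompt below are examples, not drawn from the source:

```python
# Minimal sketch: querying o1 via the OpenAI API. Assumes the official
# `openai` Python SDK (v1+) and an OPENAI_API_KEY environment variable.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# At launch the o1-series models accepted a restricted parameter set
# (e.g., no system message), so a single user message is the baseline.
response = client.chat.completions.create(
    model="o1-preview",  # example model name; "o1-mini" is the smaller variant
    messages=[
        {"role": "user", "content": "How many primes are there below 100?"}
    ],
)

# The chain of thought itself is not returned; only the final answer is.
print(response.choices[0].message.content)
```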
Development cost and compute: What is known about the development costs? Unknown
Guardrails and oversight
Accessibility of components:
- Weights: Are model parameters available? Closed source
- Data: Is data available? Closed source. However, the developers report that they use a mixture of public and proprietary data. This data is filtered and refined “to maintain data quality and mitigate potential risk,” including removing personal information and harmful or sensitive content from the training data [source]
- Code: Is code available? Closed source
- Scaffolding: Is system scaffolding available? Closed source
- Documentation: Is documentation available? The API to access the model is documented [source]
Controls and guardrails: What notable methods are used to protect against harmful actions? Unknown
Customer and usage restrictions: Are there know-your-customer measures or other restrictions on customers? o1 is available with the ChatGPT Plus and Pro paid tiers [source].
Monitoring and shutdown procedures: Are there any notable methods or protocols that allow for the system to be shut down if it is observed to behave harmfully? Unknown
Evaluation
Notable benchmark evaluations: A summary of benchmark results can be found at [source]. See also MLE-bench [source].
Bespoke testing: Various demos have been presented by the developers [source].
Safety: Have safety evaluations been conducted by the developers? What were the results? A safety report was published [source], with a summary [source]. On the OpenAI Preparedness scorecard the model was rated low risk for cybersecurity, medium risk for CBRN and persuasion, and low risk for model autonomy [source].
Publicly reported external red-teaming or comparable auditing:
- Personnel: Who were the red-teamers/auditors? Apollo Research, METR, Faculty, Haize Labs, Gray Swan AI [source].
- Scope, scale, access, and methods: What access did red-teamers/auditors have and what actions did they take? Teams had access to the models “via a sampling interface or via the API” [source]
- Findings: What did the red-teamers/auditors conclude? Apollo Research tested o1-preview and o1-mini for ‘scheming,’ broadly defined as “AIs gaming their oversight mechanisms as a means to achieve a goal.” They found “that o1-preview sometimes instrumentally faked alignment during testing.” Based on this red-teaming, “Apollo Research believes that o1-preview has the basic capabilities needed to do simple in-context scheming—scheming which tends to be legible in the model output.” However, the “Apollo team subjectively believes o1-preview cannot engage in scheming that can lead to catastrophic harms, although current evals aren’t designed to definitively rule this out.” METR tested o1-preview and o1-mini for “autonomous capabilities”; neither model performed better than the best existing public model, Claude 3.5 Sonnet [source]
Ecosystem information
Interoperability with other systems: What tools or integrations are available? Other products have started to integrate o1-mini, such as the AI-enabled coding IDE Cursor [source]. o1 can be connected to tools by products such as Devin [source].
Usage statistics and patterns: Are there any notable observations about usage? Unknown
Additional notes
None