
OpenAI unveils GPT-5.4, a model designed for complex reasoning and coding

GPT-5.4 is available in two main versions: GPT-5.4 Thinking and GPT-5.4 Pro. Both versions are based on the same architecture but differ in terms of performance, speed, and pricing.

One of the model’s major advancements lies in the integration of programming capabilities derived from GPT-5.3-Codex, combined with a more robust reasoning system. Specifically, the model can analyze complex problems, structure its reasoning into multiple steps, and generate executable code in various languages.

This development is part of a trend observed across the entire industry: AI models are no longer merely conversational assistants, but are gradually becoming agents capable of producing professional-grade outputs.

To that end, GPT-5.4 is designed specifically for applications in software engineering, data analysis, and financial modeling.

One of the most notable innovations in GPT-5.4 is the model’s ability to interact directly with a computer environment. Using a “computer use” system, the AI can analyze screenshots, navigate through an interface, click on elements, or enter text into various software applications.

In other words, GPT-5.4 is no longer limited to generating instructions or code. It can also perform a sequence of actions on a workstation, just as a human user would.

Benchmark results illustrate this change. On OSWorld-Verified, a test that evaluates an agent’s ability to perform tasks in a computing environment, GPT-5.4 achieved a 75% success rate, compared to 47.3% for GPT-5.2 and 72.4% for the human operators tested [1].

These results show that AI models are beginning to achieve a level of performance comparable to that of a human user in certain structured digital tasks.

The most noticeable improvements are seen in scenarios related to office work. On the GDPval metric, which measures a model’s ability to produce professional-quality outputs across 44 different professions, GPT-5.4 achieved a score of 83%, compared to 70.9% for GPT-5.2.

In financial modeling exercises similar to those performed by a junior analyst at an investment bank, the model achieved an 87.3% success rate, representing a significant improvement over previous versions.

OpenAI also notes that the reliability of responses has improved significantly. The model reportedly produces 33% fewer false claims and 18% fewer factual errors than GPT-5.2 [2].

In terms of coding performance, the improvement is more modest. On the SWE-Bench Pro benchmark, GPT-5.4 achieves 57.7%, compared to 56.8% for GPT-5.3-Codex. The difference lies more in the model’s speed and stability, particularly thanks to a new “/fast” mode that speeds up token generation by about 50%.

GPT-5.4 also introduces a new approach to managing external tools. The system, called Tool Search, allows the model to load only the tools needed for a given task, rather than including all tool definitions in the initial prompt.

This strategy significantly reduces token consumption. In some internal evaluations, OpenAI reports an average reduction of 47% in tokens used, without any loss of accuracy in task performance.
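The principle can be illustrated with a toy sketch. It is not OpenAI's Tool Search implementation or API: the registry, the `search_tools` keyword matcher, and the word-count token estimate are all hypothetical stand-ins, chosen only to show why sending fewer tool definitions shrinks the prompt.

```python
# Toy illustration of on-demand tool loading. All names are hypothetical;
# this is not the OpenAI API, and word counts stand in for real token counts.

TOOL_REGISTRY = {
    "get_weather": "Return the current weather forecast for a given city name.",
    "send_email": "Send an email message to a recipient with subject and body.",
    "query_database": "Run a read-only SQL query against the sales database.",
    "create_invoice": "Create a PDF invoice for a customer and an amount.",
}


def estimate_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: count whitespace-separated words."""
    return len(text.split())


def search_tools(task: str, registry: dict) -> dict:
    """Keep only tools whose description shares a keyword with the task."""
    task_words = {w.strip(".,?!").lower() for w in task.split() if len(w) > 3}
    selected = {}
    for name, description in registry.items():
        desc_words = {w.strip(".,?!").lower() for w in description.split()}
        if task_words & desc_words:
            selected[name] = description
    return selected


def prompt_cost(tools: dict) -> int:
    """Estimated tokens spent on tool definitions in the prompt."""
    return sum(estimate_tokens(d) for d in tools.values())


task = "What is the weather forecast for Paris?"
needed = search_tools(task, TOOL_REGISTRY)
all_cost = prompt_cost(TOOL_REGISTRY)
needed_cost = prompt_cost(needed)
print(sorted(needed), all_cost, needed_cost)
```

In this sketch only the weather tool matches the task, so the prompt carries one definition instead of four. A production system would presumably rely on something more robust than keyword overlap, but the token-saving logic is the same.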

Another notable development: GPT-5.4 supports an experimental context window of up to one million tokens via the API and the Codex environment. This capability enables the analysis of large volumes of information, such as entire codebases or lengthy documents.

GPT-5.4 is already available within the OpenAI ecosystem, but it is being rolled out in phases. The GPT-5.4 Thinking version is available to ChatGPT Plus, Team, and Pro subscribers, while GPT-5.4 Pro is reserved for Pro and Enterprise plans.

The model is available in ChatGPT environments and via the OpenAI API. Developers can also use it in tools related to the Codex ecosystem and in certain enterprise integrations.

The model is available worldwide, including in the United States and Europe (France included), subject to subscription terms and the availability of OpenAI services.

Users can view detailed information and access technical documentation directly on the official website: https://openai.com

GPT-5.2 Thinking will remain available as a legacy model until June 5, 2026, to give users time to migrate to the new version.

As is often the case with new AI models, improved performance comes at a higher cost.

GPT-5.4 is priced at $2.50 per million input tokens and $15 per million output tokens, compared to $1.75 and $14, respectively, for GPT-5.2.

The GPT-5.4 Pro version, designed for heavy-duty use and businesses, costs $30 per million input tokens and $180 per million output tokens.

OpenAI justifies this increase by pointing to the model’s improved overall efficiency. According to the company, GPT-5.4 requires fewer tokens to complete an equivalent task, which could reduce the total cost in certain use cases.
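A quick calculation shows what that efficiency claim would need to deliver. The sketch below uses the per-million-token prices quoted above; the workload (one million input and one million output tokens) and the assumption that input and output shrink by the same factor are hypothetical, and the dictionary keys are plain labels rather than confirmed API model identifiers.

```python
# Prices in USD per million tokens, as quoted in the article. The keys are
# labels for this sketch, not guaranteed API model identifiers.
PRICES = {
    "gpt-5.2": {"input": 1.75, "output": 14.00},
    "gpt-5.4": {"input": 2.50, "output": 15.00},
}


def cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Total cost of a workload at the listed per-million-token prices."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000


def break_even_ratio(input_tokens: int, output_tokens: int) -> float:
    """Fraction of GPT-5.2's token usage at which GPT-5.4 costs the same.

    Assumes input and output tokens shrink by the same factor.
    """
    old = cost_usd("gpt-5.2", input_tokens, output_tokens)
    new = cost_usd("gpt-5.4", input_tokens, output_tokens)
    return old / new


# Hypothetical workload: one million input and one million output tokens.
old_cost = cost_usd("gpt-5.2", 1_000_000, 1_000_000)
new_cost = cost_usd("gpt-5.4", 1_000_000, 1_000_000)
ratio = break_even_ratio(1_000_000, 1_000_000)
print(f"GPT-5.2: ${old_cost:.2f}, GPT-5.4: ${new_cost:.2f}, break-even: {ratio:.2f}")
```

For this mix, the same workload costs $15.75 on GPT-5.2 and $17.50 on GPT-5.4, so GPT-5.4 would need to use about 10% fewer tokens to break even. Workloads dominated by input tokens face a steeper threshold, since the input price rose proportionally more than the output price.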

The launch of GPT-5.4 comes amid intense competition among the major players in the field of artificial intelligence. In early 2026, Anthropic unveiled Claude Opus 4.6, while Google introduced Gemini 3.1 Pro.

These successive announcements highlight the particularly rapid pace of innovation in the field of language models. Release cycles are now measured in weeks rather than years.

In this context, GPT-5.4 represents not so much a standalone technological breakthrough as another step forward in the evolution of AI models into true software agents capable of reasoning, programming, and acting within complex digital environments.

Despite rapid advances in AI models, their integration into professional environments raises several questions. Systems capable of interacting directly with a computer or producing technical deliverables can boost productivity, but they also pose challenges in terms of human oversight and accountability.

For example, an AI agent’s ability to perform actions on a workstation requires strict control mechanisms to prevent critical errors or unexpected behavior. Issues related to IT security and access management thus become central.
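One common form such a control mechanism can take is a policy gate that reviews each proposed action before it runs. The sketch below is purely illustrative: the action names, the two lists, and the `review_action` function are hypothetical and do not describe any safeguard OpenAI ships.

```python
# Hypothetical policy gate for agent actions. The action names and the
# contents of both sets are illustrative assumptions.
ALLOWED = {"click", "type", "scroll"}
NEEDS_APPROVAL = {"delete_file", "send_payment"}


def review_action(kind: str, approved_by_human: bool = False) -> str:
    """Return 'allow', 'hold', or 'deny' for a proposed agent action."""
    if kind in ALLOWED:
        return "allow"
    if kind in NEEDS_APPROVAL:
        # Sensitive actions wait until a person signs off.
        return "allow" if approved_by_human else "hold"
    return "deny"  # anything unknown is refused by default


for proposed in ["click", "send_payment", "format_disk"]:
    print(proposed, "->", review_action(proposed))
```

Refusing unknown actions by default, rather than allowing them, keeps the agent's reach limited to what has been explicitly reviewed.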

Furthermore, performance observed in benchmarks does not always guarantee flawless reliability in real-world environments. Models may still produce logical errors, hallucinations, or incorrect interpretations of certain instructions.

In this context, the integration of AI into organizations will likely need to be accompanied by new practices for oversight and technology governance.

With GPT-5.4, OpenAI is pursuing a clear path: transforming language models into systems capable of acting within the digital world. The combination of advanced reasoning, programming, and interaction with computer interfaces is gradually bringing AI systems closer to becoming true autonomous agents capable of managing entire workflows.

While these technologies still have room for improvement, they could profoundly change the way certain professional tasks are carried out. AI tools would no longer be limited to assisting users, but would become digital collaborators capable of actively participating in the creation and execution of work.

Technology Framework

How does GPT-5.4 work?

GPT-5.4 is based on a multimodal language model architecture designed for reasoning, capable of combining natural language understanding, code generation, and interaction with software environments. The model relies on large-scale transformer-based neural networks, trained on vast corpora of text, technical data, and code repositories.

One of the major advancements introduced with GPT-5.4 is the integration of a system for structured reasoning and the execution of computational tasks, enabling the model not only to analyze a query but also to perform a series of logical steps to produce a usable result.

The model can also interact with a computing environment through a computer-use mechanism, which enables it to interpret screenshots, identify interface elements, and perform actions (such as clicking, typing, and switching between applications). This capability brings language models closer to true software agents capable of operating within complex digital workflows.
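The mechanism described above follows an observe, decide, act cycle, which can be sketched in a few lines. Everything here is a hypothetical stand-in: `FakeScreen` simulates an interface, and the rule-based `decide` function takes the place of the model, which in a real system would receive an actual screenshot.

```python
# Hypothetical observe-decide-act loop. FakeScreen and decide() are invented
# for illustration; a real agent would send a screenshot to a model.
from dataclasses import dataclass, field


@dataclass
class Action:
    kind: str          # "click", "type", or "done"
    target: str = ""   # name of the interface element
    text: str = ""     # text to enter for "type" actions


@dataclass
class FakeScreen:
    """Minimal simulated interface: one text field and one submit button."""
    field_value: str = ""
    submitted: bool = False
    log: list = field(default_factory=list)

    def screenshot(self) -> dict:
        # A real system would return pixels; here we return the visible state.
        return {"field_value": self.field_value, "submitted": self.submitted}

    def apply(self, action: Action) -> None:
        self.log.append(action.kind)
        if action.kind == "type" and action.target == "name_field":
            self.field_value = action.text
        elif action.kind == "click" and action.target == "submit_button":
            self.submitted = True


def decide(observation: dict, goal_text: str) -> Action:
    """Rule-based stand-in for the model's choice of next action."""
    if observation["submitted"]:
        return Action("done")
    if observation["field_value"] != goal_text:
        return Action("type", target="name_field", text=goal_text)
    return Action("click", target="submit_button")


def run_agent(screen: FakeScreen, goal_text: str, max_steps: int = 10) -> bool:
    """Repeat observe, decide, act until done or the step budget runs out."""
    for _ in range(max_steps):
        action = decide(screen.screenshot(), goal_text)
        if action.kind == "done":
            return True
        screen.apply(action)
    return False


screen = FakeScreen()
finished = run_agent(screen, "Ada Lovelace")
print(finished, screen.log)
```

The `max_steps` budget is a deliberate design choice in this sketch: it guarantees the loop terminates even if the agent never reaches its goal.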

Features available via GPT-5.4
  • Advanced reasoning: solving complex problems through a series of logical steps
  • Code generation and analysis: creating, debugging, and optimizing programs in multiple languages
  • Computer use: interacting with a computer environment using screenshots
  • Analysis of large documents: utilizing a context window of up to one million tokens
  • Dynamic Tool Search: intelligent loading of the tools required for a given task

Technical constraints and current limitations
  • High computational demand: advanced reasoning models require significant computing power
  • Limited reliability: despite improvements, some responses may contain errors or approximations
  • Supervision required: actions performed by AI in a computing environment must be monitored
  • Variable latency: complex tasks involving multiple steps of reasoning can increase response time

The emergence of increasingly powerful models for reasoning and programming is part of an intense competition among AI labs to push the boundaries of so-called state-of-the-art systems. On a related topic, check out our article “Claude Opus 4.6 and GPT-5.3 Codex Unveiled on the Same Day: The Race for Frontier Models Accelerates”, which puts into perspective the technological strategies of major players and the industrial challenges associated with the development of these advanced models.

1. OpenAI. (2026). GPT-5.4 Technical Report.
https://openai.com/research/gpt-5-4

2. OpenAI. (2026). Benchmark results for GPT-5.4.
https://platform.openai.com/docs/models/gpt-5-4
