OpenAI launches GPT-5.4 mini, nano for high-volume, low-latency workloads

According to the company, the models are tailored for environments where response time directly impacts user experience

by e4m Staff
Published: Mar 18, 2026 1:09 PM

OpenAI has introduced GPT-5.4 mini and GPT-5.4 nano, its latest small-format models aimed at delivering faster performance and efficiency for high-volume use cases, according to a company release.

The new models are positioned as more capable and cost-effective alternatives within the GPT-5.4 family, bringing several of the flagship model’s strengths to lighter deployments where speed and scalability are critical.

Focus on speed and efficiency

GPT-5.4 mini marks a significant upgrade over its predecessor, GPT-5 mini, with improvements across coding, reasoning, multimodal understanding and tool use. The company said the model runs more than twice as fast as its predecessor while approaching the performance of the larger GPT-5.4 model on several benchmarks, including SWE-Bench Pro and OSWorld-Verified.

GPT-5.4 nano, meanwhile, is designed as the smallest and most cost-efficient variant in the series. It is positioned for tasks where latency and cost take priority, such as classification, data extraction, ranking and supporting coding subagents.

Built for real-time applications

According to the company, the models are tailored for environments where response time directly impacts user experience. These include coding assistants that require quick interactions, subagents handling routine tasks, and systems that process screenshots or interpret visual inputs in real time.

The release also highlights use cases in multimodal applications, where models are expected to reason over both text and images with minimal delay.

Shift toward practical performance

With the launch, OpenAI emphasised that model selection is increasingly driven by responsiveness and efficiency rather than size alone.

In high-frequency production environments, the company noted, smaller models capable of fast responses, reliable tool use and solid performance on professional tasks may be more suitable than larger, more resource-intensive systems.
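
For teams weighing that trade-off, the selection logic described in the release could be sketched roughly as follows. This is a minimal illustration, not OpenAI's guidance or API: the `Request` type, the `pick_tier` function, the task labels and the rule ordering are all hypothetical, and the tier names are plain labels taken from this article rather than confirmed API model identifiers.

```python
from dataclasses import dataclass

# Tier labels as named in the article. These are display labels only,
# NOT confirmed API model identifiers (assumption for illustration).
NANO = "GPT-5.4 nano"
MINI = "GPT-5.4 mini"
FLAGSHIP = "GPT-5.4"

# Task types the article lists for the nano tier. The string labels
# themselves are hypothetical names chosen for this sketch.
NANO_TASKS = {"classification", "data_extraction", "ranking", "coding_subagent"}


@dataclass(frozen=True)
class Request:
    task: str                # hypothetical task label, e.g. "ranking"
    latency_sensitive: bool  # True when response time shapes the user experience


def pick_tier(req: Request) -> str:
    """Pick a model tier from task type and latency sensitivity."""
    # Cheapest tier for the task types the article associates with nano.
    if req.task in NANO_TASKS:
        return NANO
    # Other latency-sensitive work goes to the faster small model.
    if req.latency_sensitive:
        return MINI
    # Everything else falls through to the larger model.
    return FLAGSHIP


if __name__ == "__main__":
    for r in (
        Request("classification", latency_sensitive=True),
        Request("coding_assistant", latency_sensitive=True),
        Request("long_form_analysis", latency_sensitive=False),
    ):
        print(f"{r.task}: {pick_tier(r)}")
```

In practice, any such routing rules would need to be tuned against measured latency, cost and quality for the workload in question.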

In ChatGPT, GPT-5.4 mini is available to Free and Go users via the “Thinking” feature in the + menu. For all other users, GPT-5.4 mini is available as a rate limit fallback for GPT-5.4 Thinking.