GPT-5.2
16 articles found in this topic.
OpenAI Introduces FrontierScience Benchmark to Evaluate AI Scientific Reasoning
OpenAI has launched FrontierScience, a new benchmark to evaluate AI's scientific reasoning in physics, chemistry, and biology at a Ph.D. level. This initiative moves beyond fact recall, focusing on expert-level scientific thought and problem-solving. The benchmark includes competition and research-oriented challenges, with questions developed and graded by experts.
GPT-5.2 Faces Widespread Criticism and Underperforms Against Gemini 3 Pro in Benchmarks
OpenAI's GPT-5.2 faces significant criticism and underperforms against Google's Gemini 3 Pro in various benchmarks, including Epoch AI's ECI score. Third-party evaluations show it falling short, leading OpenAI to issue a "red alert" and re-prioritize. Google, meanwhile, re-emerges as an AI frontrunner with Gemini 3 Pro's superior performance.
AI Models Struggle with Six-Fingered Hands, Exposing Architectural Limitations
AI models consistently fail to count fingers accurately, especially when shown hands with more than five digits, a phenomenon dubbed the "finger problem." The failure points to architectural limitations and biases in AI visual reasoning that stem from training data dominated by five-fingered hands.
OpenAI Unveils GPT-5.2 Models, Disney Invests $1 Billion in Partnership
OpenAI launches new GPT-5.2 models, including Instant, Thinking, and Pro versions, demonstrating superior performance in various tasks. Concurrently, Disney invests $1 billion in OpenAI, forming a partnership to bring AI-generated content that draws on its vast character library to platforms like Disney+.
OpenAI's GPT-5.2 Benchmarks Questioned Over Token Usage and Performance Claims
OpenAI's GPT-5.2 faces accusations of "false marketing" as its benchmark scores are questioned due to excessive token usage compared to competitors. Critics suggest the reported superior performance might stem from a "brute-force computation" advantage rather than genuine advancement. User experiences also indicate a potential mismatch between benchmark results and actual model performance.
OpenAI Launches GPT-5.2 Series, Enhancing Professional Task Performance and Programming
OpenAI has unveiled its GPT-5.2 model series (Instant, Thinking, Pro), designed to enhance professional task performance and programming. These models reportedly outperform industry professionals in 44 knowledge-based tasks, offering significant upgrades in intelligence and efficiency.
OpenAI Releases GPT-5.2 Model Family Amidst Rising Competition
OpenAI has launched its GPT-5.2 model family, including Instant, Thinking, and Pro versions, to reassert leadership amidst rising competition. These models offer enhanced capabilities for various tasks, from routine operations to complex problem-solving, setting new performance benchmarks.
OpenAI Releases GPT-5.2 Models, Citing Expert-Level Performance
OpenAI has launched GPT-5.2 models (Instant, Thinking, Pro) with expert-level capabilities for complex tasks. These models outperform competitors like Gemini 3 Pro and demonstrate significant advancements in various professional domains, including coding and knowledge work, despite higher pricing.
OpenAI Introduces GPT-5.2 Models, Achieving "Human Expert Level" in Knowledge Work
OpenAI has launched its GPT-5.2 series, including Instant, Thinking, and Pro models, which are touted as its most capable models for professional knowledge work. These models achieve "human expert level" in various tasks, demonstrating significant advancements in benchmarks covering mathematics, abstract reasoning, and coding. GPT-5.2 Thinking also shows enhanced professional capabilities, faster output generation, and reduced hallucinations.
OpenAI Launches GPT-5.2, Integrating into Microsoft Products and Targeting Professional Tasks
OpenAI has released GPT-5.2, its latest large language model, available in Instant, Thinking, and Pro versions for paid ChatGPT users and developers. This new model demonstrates enhanced performance across various benchmarks, excelling in professional tasks, coding, and mathematical problems. It surpasses industry professionals in well-defined knowledge work and offers significant improvements over previous versions.
OpenAI Launches GPT-5.2 in Three Versions, Excelling in 44 Professional Tasks
OpenAI has released GPT-5.2 in three versions: Instant, Thinking, and Pro. This new iteration of its large language model excels in 44 professional knowledge tasks, with the Thinking version outperforming human experts in many areas. It offers significant upgrades in intelligence, context understanding, and efficiency.
OpenAI Releases GPT-5.2, Achieving Human Expert Performance in Key Benchmarks
OpenAI has launched GPT-5.2, achieving human expert-level performance in key benchmarks like GDPval and AIME 2025. This rapid update follows a "Code Red" and significantly enhances abstract reasoning, programming, and multimodal capabilities. It's available in Instant, Thinking, and Pro versions.