OpenAI has unveiled its advanced AI model GPT-5, which the company says offers conversational abilities akin to talking with a highly educated scholar.
OpenAI has released GPT-5, a new AI model intended to change the way we interact with artificial intelligence.
The release is less a revolution than an evolution, building on the capabilities of its predecessors. Even so, the improvements are substantial: OpenAI presents GPT-5 as smarter, faster, and more useful.
People are interacting with AI tools as if they were human friends, a phenomenon that Sam Altman, the CEO of OpenAI, has noted. This trend is particularly evident with GPT-5, which OpenAI describes as performing at or above PhD-level human expertise in coding, writing, and reasoning.
GPT-5 is being positioned as a serious assistant for developers, similar to Anthropic's Claude Code. It is proficient in coding and can build software from scratch. On SWE-bench Verified, a benchmark for solving real-world GitHub issues, GPT-5 achieves 74.9% accuracy, outperforming both GPT-4o (30.8%) and OpenAI's o3 reasoning model (69.1%).
Moreover, GPT-5's reasoning abilities are boosted by features like "thinking mode." On GPQA, a benchmark of graduate-level science questions, enabling thinking raises accuracy from 77.8% to 85.7%, indicating strong multi-step scientific reasoning and domain knowledge. GPT-5 Pro (with Python) scores up to 89.4% on the same benchmark, well above GPT-4o's 70.1%.
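To put the quoted figures in perspective, the gap between two scores can be read either as an absolute gain in percentage points or as the share of remaining errors eliminated. The short sketch below works through that arithmetic using only the numbers cited in this article. The function names and the dictionary layout are illustrative assumptions, not anything published by OpenAI.

```python
# Accuracies (in percent) exactly as quoted in this article.
swe_bench_verified = {"GPT-5": 74.9, "o3": 69.1, "GPT-4o": 30.8}
thinking_mode = {"off": 77.8, "on": 85.7}


def point_gain(new: float, old: float) -> float:
    """Absolute improvement in percentage points (hypothetical helper)."""
    return round(new - old, 1)


def error_reduction(new: float, old: float) -> float:
    """Share of the old model's errors that the new score eliminates, in percent."""
    return round(100 * (new - old) / (100 - old), 1)


if __name__ == "__main__":
    gpt5 = swe_bench_verified["GPT-5"]
    for name in ("o3", "GPT-4o"):
        baseline = swe_bench_verified[name]
        print(f"GPT-5 vs {name}: +{point_gain(gpt5, baseline)} points")
    print(
        "Thinking mode:",
        f"+{point_gain(thinking_mode['on'], thinking_mode['off'])} points,",
        f"{error_reduction(thinking_mode['on'], thinking_mode['off'])}% fewer errors",
    )
```

Read this way, the 7.9-point gain from thinking mode removes roughly a third of the errors made without it, which is a larger change than the raw difference suggests.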
GPT-5 also excels across various multimodal benchmarks, enabling it to interpret charts, diagrams, and photos more accurately than prior models. It significantly improves at following complex instructions and using external tools in an agentic manner, allowing it to carry out multi-step tasks reliably and adapt dynamically to new contexts.
OpenAI claims that GPT-5 hallucinates less and is more honest and less deceptive. If the claim holds up, it is a welcome improvement that would make the AI assistant more reliable and trustworthy.
However, the potential competence of GPT-5 as an AI assistant is cause for both excitement and caution. Altman acknowledges the potential problem of parasocial relationships people are forming with these tools. To mitigate this, OpenAI is modifying ChatGPT to respond more thoughtfully to emotionally sensitive questions.
ChatGPT will also help users consider decisions instead of directly answering questions like "Should I break up with my girlfriend?" This approach encourages critical thinking and promotes responsible use of the AI assistant.
The real test for GPT-5 will be in daily use. The author will be testing the update to see if the upgrades are perceptible. One noticeable improvement is that GPT-5 is better at showing its logic and workings, making it feel more transparent and human.
As the AI wars heat up, GPT-5 is OpenAI's latest weapon. Anthropic has revoked OpenAI's access to its API, indicating a competitive atmosphere among AI labs. The race is on to develop the most advanced and useful AI assistant, and GPT-5 is a significant step forward in this quest.
Taken together, GPT-5's gains in coding, writing, and reasoning strengthen the role of artificial intelligence across many fields and narrow the gap between AI assistance and human expertise, particularly in multi-step scientific reasoning and domain knowledge.