Vulnerabilities in a critical piece of Nvidia enterprise software could have allowed attackers to install malware on Windows and Linux machines.
The open-source Nvidia Triton Inference Server, a popular tool for running AI models efficiently on servers, has been found to contain three critical vulnerabilities that pose a significant risk to organizations using it for AI/ML workloads. The flaws, discovered by cloud security firm Wiz, can be chained to give remote, unauthenticated attackers remote code execution (RCE) and complete control of the server.
The vulnerabilities, tracked as CVE-2025-23319, CVE-2025-23320, and CVE-2025-23334, are all located in Triton's Python backend. They involve out-of-bounds writes and reads and a shared memory limit that can be exceeded, enabling information disclosure, denial of service, data tampering, and potentially a total takeover of AI servers running Triton.
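For context on the component involved, the sketch below shows how a client would legitimately register a system shared memory region with Triton over its HTTP API, the general interface area these flaws touch. The endpoint paths follow Triton's documented shared memory extension, while the host, port, region name, key, and byte size are illustrative assumptions rather than details taken from the advisory.

```python
# Illustrative sketch only: a legitimate client registering a system shared
# memory region with Triton over its HTTP API, the interface area the flaws
# involve. Host, port, region name, key, and byte size are assumed values.
import requests

TRITON_URL = "http://localhost:8000"  # assumes Triton's default HTTP port

payload = {
    "key": "/example_shm_region",   # POSIX shared memory key created by the client
    "offset": 0,
    "byte_size": 64 * 1024,         # size of the region the server will read/write
}

# Register the region under a server-side name so inference requests can reference it.
resp = requests.post(
    f"{TRITON_URL}/v2/systemsharedmemory/region/example_region/register",
    json=payload,
    timeout=5,
)
resp.raise_for_status()

# Listing the registered regions confirms the call succeeded.
status = requests.get(f"{TRITON_URL}/v2/systemsharedmemory/status", timeout=5)
print(status.json())
```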
The potential impact on AI/ML security is significant. A successful attack could lead to the theft of valuable AI models, exposure of sensitive data, manipulation of AI models' responses, and a foothold for attackers to move deeper into a network. Given Triton's widespread use in enterprise environments, this vulnerability chain poses a serious risk to AI infrastructure security.
Nvidia has addressed these issues in Triton Inference Server version 25.07, released on August 4, 2025, specifically to patch these critical flaws. The update mitigates the out-of-bounds and shared memory vulnerabilities in the Python backend.
Key recommendations include immediately updating Nvidia Triton Inference Server to version 25.07 or later, reviewing AI/ML infrastructure for unusual activity or signs of compromise if earlier versions were exposed, and implementing defense-in-depth measures around AI deployment layers to mitigate similar attacks in the future; a simple way to confirm what a running server reports about itself is shown below.
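As a first step in that review, this minimal sketch queries a Triton server's standard HTTP metadata and health endpoints and prints what the server reports about itself. The URL assumes a local deployment on the default HTTP port, and the reported version string is the Triton core release rather than the container tag, so map it to the corresponding release (such as 25.07) using Nvidia's release notes.

```python
# Minimal check of what a running Triton server reports about itself.
# Assumes the HTTP endpoint is reachable on the default port 8000; adjust
# TRITON_URL for your deployment.
import requests

TRITON_URL = "http://localhost:8000"

# GET /v2 returns server metadata: name, version, and supported extensions.
meta = requests.get(f"{TRITON_URL}/v2", timeout=5)
meta.raise_for_status()
info = meta.json()

print(f"Server:  {info.get('name')}")
print(f"Version: {info.get('version')}")  # Triton core version string

# The standard health endpoints help confirm which instances are actually serving.
live = requests.get(f"{TRITON_URL}/v2/health/live", timeout=5)
ready = requests.get(f"{TRITON_URL}/v2/health/ready", timeout=5)
print(f"Live: {live.status_code == 200}, Ready: {ready.status_code == 200}")
```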
Nvidia and the Wiz researchers collaborated closely on a swift fix, and there is currently no evidence that these vulnerabilities have been exploited in the wild. Prompt patching is nevertheless critical to prevent future attacks.
In summary, to mitigate these critical vulnerabilities and protect your AI systems, upgrade to Triton 25.07 immediately and follow security best practices across your AI/ML infrastructure. Triton Inference Server runs on Windows and Linux and works with many popular AI frameworks, so the flaws are relevant to a wide range of deployments.
Cybersecurity professionals should prioritize sound cloud and data security practices when addressing the critical vulnerabilities in the Python backend of the Nvidia Triton Inference Server. The flaws, tracked as CVE-2025-23319, CVE-2025-23320, and CVE-2025-23334, could lead to unauthorized access, data tampering, and total control of affected AI servers.
To safeguard AI systems against these risks, update Triton Inference Server to version 25.07 or later, implement defense-in-depth strategies around inference workloads, and stay alert to ongoing security threats targeting AI infrastructure.