AMD reveals a new Stable Diffusion 3.0 Medium text-to-image AI model optimized for XDNA 2 NPUs, capable of running locally on Ryzen AI-equipped laptops.
**New Text-to-Image Generator Unveiled: AMD's Stable Diffusion 3.0 Medium AI Model**
AMD has unveiled a new text-to-image generator based on Stable Diffusion 3.0 Medium. The tool is designed to create customizable, stock-quality visuals for design and marketing applications, and it runs entirely on-device with no need for Internet access or cloud services.
The system requirements for using the AMD XDNA 2 Stable Diffusion 3.0 Medium AI model are as follows:
1. A laptop equipped with an AMD Ryzen AI 300 series or Ryzen AI MAX+ processor, featuring the AMD XDNA 2 Neural Processing Unit (NPU) rated at 50 TOPS (tera operations per second) or more.
2. A minimum of 24GB of system RAM to run the model locally, although memory optimizations keep the model's footprint to roughly 9GB during execution (a minimal pre-flight check sketch follows this list).
3. The model uses a block FP16 (BF16) precision format, which balances accuracy and performance and is tuned for the XDNA 2 NPU hardware.
4. For certain enhanced features, such as AMD XDNA Super Resolution, AMD recommends a Ryzen 8040 series processor with 32GB of RAM and the latest OEM MCDM and NPU driver updates.
5. The AI image generation software that supports this model is Amuse 3.1 by Tensorstack, which runs locally on supported Ryzen AI laptops and exposes a toggle for the XDNA 2 Stable Diffusion 3.0 setting.
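The 24GB RAM figure is easy to verify before installing anything. Below is a minimal, hypothetical pre-flight check in Python using the third-party psutil package; the 24GB and ~9GB numbers come from the requirements above, while the helper name and printout are illustrative assumptions, not part of AMD's or Tensorstack's tooling.

```python
# Hypothetical pre-flight check -- not part of AMD's or Tensorstack's tooling.
# Verifies that total system RAM meets the 24GB minimum cited for running the
# Stable Diffusion 3.0 Medium model locally (the model itself is said to use
# only about 9GB during execution thanks to memory optimizations).
import psutil

MIN_TOTAL_RAM_GB = 24      # minimum system RAM from AMD's requirements
APPROX_MODEL_USE_GB = 9    # approximate working-set size cited by AMD


def check_system_ram() -> bool:
    total_gb = psutil.virtual_memory().total / (1024 ** 3)
    print(f"Detected {total_gb:.1f} GB of system RAM "
          f"(model needs ~{APPROX_MODEL_USE_GB} GB at runtime).")
    return total_gb >= MIN_TOTAL_RAM_GB


if __name__ == "__main__":
    if check_system_ram():
        print("RAM requirement met.")
    else:
        print(f"At least {MIN_TOTAL_RAM_GB} GB of RAM is recommended.")
```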
The model interprets written prompts and produces images at 1024×1024 resolution, which are then upscaled to 2048×2048 for roughly 4MP outputs. It also supports advanced prompting features for fine control over image composition.
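To make those resolution figures concrete, here is a small illustrative sketch: a plain 2× bicubic resize from 1024×1024 to 2048×2048 with Pillow, plus the megapixel arithmetic. This only illustrates the output sizes; AMD's XDNA Super Resolution is a separate, NPU-accelerated upscaler, not the simple resize shown here.

```python
# Illustration of the output sizes only -- NOT AMD's XDNA Super Resolution,
# which is a separate NPU-accelerated upscaler.
from PIL import Image

base = Image.new("RGB", (1024, 1024))                        # stand-in for a generated image
upscaled = base.resize((2048, 2048), Image.Resampling.BICUBIC)

megapixels = (upscaled.width * upscaled.height) / 1_000_000
print(f"{upscaled.width}x{upscaled.height} -> {megapixels:.1f} MP")  # ~4.2 MP
```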
To use the text-to-image generator, users must install the latest AMD Adrenalin Edition drivers and the Amuse 3.1 Beta software from Tensorstack. In Amuse, users should switch to EZ Mode, move the slider to HQ, and enable the 'XDNA 2 Stable Diffusion Offload' option.
Use of the model is subject to the Stability AI Community License; it is free for individuals and businesses with under $1 million in annual revenue, though licensing terms may change in the future.
An example of a prompt for the model is: "Close up, award-winning wildlife photography, vibrant and exotic face of a toucan against a black background, focusing on the colorful beak, vibrant color, best shot, 8k, photography, high res."
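For readers who prefer a scriptable workflow over the Amuse GUI, the same base model is published by Stability AI and can be run with Hugging Face's diffusers library. The sketch below generates a 1024×1024 image from the example prompt above; note that this is a generic GPU/CPU path using the public Stable Diffusion 3 Medium weights, not AMD's block FP16, XDNA 2 NPU-optimized build, which is only exposed through Amuse. The model ID and parameter values are reasonable defaults, not AMD-specified settings.

```python
# Generic diffusers path for Stable Diffusion 3 Medium -- NOT the NPU-optimized
# build AMD ships through Amuse; shown only as a scriptable illustration of the
# same base model.
import torch
from diffusers import StableDiffusion3Pipeline

pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3-medium-diffusers",
    torch_dtype=torch.bfloat16,   # bfloat16 keeps memory use modest
)
pipe = pipe.to("cuda" if torch.cuda.is_available() else "cpu")

prompt = (
    "Close up, award-winning wildlife photography, vibrant and exotic face "
    "of a toucan against a black background, focusing on the colorful beak, "
    "vibrant color, best shot, 8k, photography, high res"
)

image = pipe(
    prompt=prompt,
    height=1024,                  # native generation resolution per the article
    width=1024,
    num_inference_steps=28,       # typical SD3 Medium defaults, not AMD-specified
    guidance_scale=7.0,
).images[0]
image.save("toucan_1024.png")
```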
This new text-to-image generator promises to revolutionize content creation and design, offering a fast, offline image generation solution suitable for a wide range of applications.
- Because this text-to-image generator, based on Stable Diffusion 3.0 Medium, creates customizable stock-quality visuals entirely on-device, smartphones and other gadgets equipped with AI processors could eventually leverage the same approach for local image generation, extending this style of content creation to mobile devices as well.
- As the text-to-image generator continues to evolve, it is foreseeable that artificial intelligence will play an increasingly significant role in refining the images produced, possibly allowing for more intricate details and nuances and further blurring the line between traditional photography and AI-generated imagery.