Google DeepMind releases DiffusionGemma, a model that runs local AI 4x faster

🤖 Artificial Intelligence 📰 Ars Technica 🕐 4 hours ago

Another day, another AI model from Google. This time, Google DeepMind has released a new member of the Gemma 4 open model family, but it's fundamentally different from the rest of the lineup. DiffusionGemma doesn't generate outputs linearly like most AI models. Instead, it can produce an entire block of text in parallel. Google says this makes it faster and more efficient when running on local hardware like an Nvidia DGX or a humble gaming GPU. Most AI models are designed to

Google DeepMind has introduced DiffusionGemma, a new AI model that diverges from traditional text generation methods. Unlike autoregressive models, which produce text one token at a time, DiffusionGemma generates entire blocks of text in parallel, drawing inspiration from image generation techniques. Google says this parallel approach makes the model faster and more efficient on local hardware, including gaming GPUs.
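To make the contrast concrete, here is a minimal Python sketch that counts model forward passes under the two decoding styles. It is a toy illustration, not DiffusionGemma's actual algorithm: the function names, the block size, and the number of refinement steps per block are all hypothetical and do not come from Google or the original article. It also assumes every forward pass costs the same, which real hardware will not honor exactly.

```python
import math


def autoregressive_passes(num_tokens: int) -> int:
    """Sequential decoding: one forward pass per generated token."""
    return num_tokens


def block_parallel_passes(num_tokens: int, block_size: int, refine_steps: int) -> int:
    """Block-parallel decoding (hypothetical model of the idea):
    each block of `block_size` tokens is produced together, but is
    refined over `refine_steps` passes before it is considered final."""
    num_blocks = math.ceil(num_tokens / block_size)
    return num_blocks * refine_steps


# Hypothetical settings, chosen only for illustration.
tokens = 256
sequential = autoregressive_passes(tokens)
parallel = block_parallel_passes(tokens, block_size=16, refine_steps=8)

print(f"sequential passes: {sequential}")
print(f"block-parallel passes: {parallel}")
print(f"ratio: {sequential / parallel:.1f}x fewer passes")
```

Under these made-up settings the block-parallel decoder needs half as many passes. The real-world gain depends on how many refinement steps a block needs and how well the hardware exploits the parallelism, neither of which this summary specifies.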

The model uses a Mixture of Experts architecture with 26 billion parameters and activates only a subset of them during inference, which lets it fit within the memory constraints of high-end graphics cards. Benchmarks indicate DiffusionGemma can run up to four times faster than comparable autoregressive Gemma models, producing over 1,000 tokens per second on specialized AI accelerators.
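The general idea behind a Mixture of Experts layer can be sketched in a few lines of Python. This is a generic top-k routing example, not DiffusionGemma's implementation: the expert count, the router scores, the value of `k`, and the helper names `top_k_route` and `moe_layer` are all assumptions made for illustration, since the article gives no such details.

```python
import math


def top_k_route(router_scores, k):
    """Pick the k highest-scoring experts and softmax-normalize their scores.

    Returns a list of (expert_index, weight) pairs whose weights sum to 1.
    """
    ranked = sorted(range(len(router_scores)),
                    key=lambda i: router_scores[i], reverse=True)
    chosen = ranked[:k]
    exps = [math.exp(router_scores[i]) for i in chosen]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(chosen, exps)]


def moe_layer(x, experts, router_scores, k):
    """Run only the selected experts and mix their outputs by router weight."""
    return sum(w * experts[i](x) for i, w in top_k_route(router_scores, k))


# Four toy "experts" that simply scale their input (hypothetical).
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]

# Hypothetical router scores for a single token.
scores = [0.1, 2.0, -1.0, 2.0]

routed = top_k_route(scores, k=2)
output = moe_layer(10.0, experts, scores, k=2)

print(routed)   # experts 1 and 3 are selected; experts 0 and 2 never run
print(output)
```

Only two of the four experts do any work for this token, which is the sense in which a Mixture of Experts model activates a smaller subset of its parameters per step.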

This development could make AI text generation faster and more accessible on consumer-grade hardware.


📌 Source

This summary was automatically compiled from Ars Technica. See the original article for the full story.
