Researchers say they trained a foundation model from scratch for about $1,500

🤖 Yapay Zekâ 📰 VentureBeat 🕐 2 saat önce

Training a foundation LLM from scratch costs millions and requires internet-scale data — which is why most enterprises don't bother. Sapient thinks it has a cheaper path. To overcome this brute-force scaling dogma, researchers at Sapient developed HRM-Text , which replaces standard Transformers with a highly sample-efficient Hierarchical Recurrent Model (HRM), an architecture they first introduced last year . HRM decouples computation into slow-evolving strategic and fast-evo

Researchers have developed a new method for training large language models (LLMs) that significantly reduces costs and computational resources. Their approach, called HRM-Text, utilizes a Hierarchical Recurrent Model (HRM) architecture instead of traditional Transformers. This model is trained exclusively on instruction-response pairs, mimicking real-world enterprise use cases. A 1-billion parameter version was trained from scratch for approximately $1,500, demonstrating performance competitive with much larger models.

This development democratizes access to powerful AI models, allowing organizations with limited resources to train their own capable reasoning systems affordably.

#neural network#llm#chatgpt#environment#research

📌 Kaynak

Bu özet VentureBeat kaynağından otomatik derlenmiştir. Tamamı için orijinal habere gidin.

Orijinal haberi oku →

📱

News AI World — Mobil uygulama

Bu haberleri 45 dilde, anlık çeviriyle cebinde. Erken erişim için Gmail adresini bırak.

← Tüm haberlere dön

Researchers say they trained a foundation model from scratch for about $1,500

📌 Kaynak

📰 Önerilen haberler