Orbit Open-Source RL Framework Enables Single-Node Trillion-Parameter Model Training

🤖 Artificial Intelligence 📰 Pandaily 🕐 5 days ago

Sphere AI Lab has officially open-sourced Orbit, a reinforcement learning (RL) post-training framework that enables trillion-parameter models like DeepSeek-V4 and Kimi-K2.6 to run RL fine-tuning on a single 8xB200 GPU node, a task previously requiring multi-node distributed systems. The core innovation of Orbit lies in its adapter-first system design. By freezing a low-precision base model and training only a lightweight adapter, Orbit compresses the GPU memory requirements f

Sphere AI Lab has released Orbit, an open-source reinforcement learning (RL) framework for post-training very large AI models. With Orbit, trillion-parameter models such as DeepSeek-V4 and Kimi-K2.6 can undergo RL fine-tuning on a single 8xB200 GPU node, a task that previously required multi-node distributed systems.

The framework achieves this through an adapter-first system design. It freezes a low-precision base model and trains only a lightweight adapter, which sharply reduces GPU memory requirements. The approach is also said to sidestep precision issues common in other RL post-training systems, and benchmarks reportedly show stable performance improvements.
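The freeze-the-base, train-the-adapter idea can be illustrated with a small sketch. This is not Orbit's code or API, which the summary does not show. It is a generic low-rank (LoRA-style) adapter on one linear layer, written in plain Python. The class name `AdapterLinear`, its methods, the layer sizes, and the squared-error loss (a stand-in for an RL objective) are all assumptions made for illustration.

```python
import random


def matvec(M, x):
    """Multiply matrix M (list of rows) by vector x."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]


class AdapterLinear:
    """Hypothetical layer computing y = W x + B (A x).

    W is the frozen base weight; only the low-rank adapter (A, B) is updated.
    """

    def __init__(self, d_in, d_out, rank, seed=0):
        rng = random.Random(seed)
        # Frozen base weights: never touched by sgd_step.
        self.W = [[rng.gauss(0, 0.1) for _ in range(d_in)] for _ in range(d_out)]
        # Trainable adapter: A is random, B starts at zero so the adapter
        # initially leaves the base model's output unchanged.
        self.A = [[rng.gauss(0, 0.1) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]

    def forward(self, x):
        base = matvec(self.W, x)
        delta = matvec(self.B, matvec(self.A, x))
        return [b + d for b, d in zip(base, delta)]

    def sgd_step(self, x, target, lr=0.1):
        """One gradient step on 0.5 * ||y - target||^2, updating A and B only.

        Returns the loss measured before the update.
        """
        h = matvec(self.A, x)
        y = self.forward(x)
        err = [yi - ti for yi, ti in zip(y, target)]
        # Gradient flowing back into the adapter's hidden vector h = A x.
        back = [sum(self.B[i][k] * err[i] for i in range(len(err)))
                for k in range(len(h))]
        for i in range(len(self.B)):
            for k in range(len(h)):
                self.B[i][k] -= lr * err[i] * h[k]
        for k in range(len(self.A)):
            for j in range(len(x)):
                self.A[k][j] -= lr * back[k] * x[j]
        return 0.5 * sum(e * e for e in err)

    def param_counts(self):
        """Return (frozen, trainable) parameter counts."""
        frozen = len(self.W) * len(self.W[0])
        trainable = (len(self.A) * len(self.A[0])
                     + len(self.B) * len(self.B[0]))
        return frozen, trainable


layer = AdapterLinear(d_in=8, d_out=8, rank=2)
frozen, trainable = layer.param_counts()
print(f"frozen={frozen}, trainable={trainable}")
```

Because gradients and optimizer state are needed only for `A` and `B`, training memory scales with the adapter size rather than with the base model, and the frozen base can be stored in a lower-precision format. How Orbit realizes this at trillion-parameter scale is not described in the summary.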

This development lowers the hardware barrier to advanced model fine-tuning, letting researchers and developers without access to multi-node clusters work with state-of-the-art large language models.

#orbit

📌 Source

This summary was automatically compiled from Pandaily. See the original article for the full story.
