USTC Open-Sources Agent-Driven Long-Context Training Paradigm: 30B Matches Qwen3-235B
Researchers at the University of Science and Technology of China (USTC) have open-sourced an agent-driven long-context training paradigm with which a 30-billion-parameter model matches the performance of Alibaba's Qwen3-235B, a model nearly eight times larger. The core innovation lies in how the training data is sourced and structured.

Traditional approaches to building long-context capabilities fall into two camps, both with significant drawbacks. The first is expensive manual labeling, in which human annotators craft long-context examples by hand, a process that does not scale well. The second is heuristic short-text concatenation, which stitches together unrelated snippets but fails to produce the coherent, dependency-rich sequences that models need to learn genuine long-range reasoning.

The USTC team took a different approach: they turned to AI agent trajectories. Instead of constructing long-context data artificially, they compiled the multi-turn interaction histories produced by autonomous agents as they work through real tasks. These trajectories naturally contain the extended, context-dependent exchanges that long-context training requires: sequences of observations, reasoning steps, and actions that build on information introduced many turns earlier.

By treating agent trajectories as a first-class data source, the paradigm directly addresses what the researchers identify as the long-context capability bottleneck for AI agents. The resulting training data teaches models to maintain and manipulate information across extended contexts through dependencies that arise naturally rather than being manufactured.

With only 30 billion parameters, the USTC-trained model achieves performance parity with Qwen3-235B on a range of long-context benchmarks, a roughly 8× reduction in model size without sacrificing capability on those benchmarks. For practitioners, this means that long-context reasoning that previously required enormous, compute-intensive models is now accessible with a far smaller footprint.

The open-source release allows the broader AI community to build on this work, potentially accelerating progress in agent-based systems, long-document understanding, multi-turn dialogue, and any application where maintaining coherence across extended interactions is critical. By demonstrating that data quality can substitute for raw scale, the USTC team has offered a path toward making capable long-context agents more widely accessible: rather than scaling models ever larger to handle longer contexts, train smaller models on better data, using the natural structure of agent behavior as the teacher.
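To make the data-sourcing idea concrete, here is a minimal Python sketch of how a multi-turn agent trajectory might be serialized into one long training sequence. The article does not describe the USTC team's actual data format or pipeline, so everything here is an assumption made for illustration: the `Turn` structure, the role names, the bracketed tags, and the whitespace-based length filter are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Turn:
    # Hypothetical role names; the real release may use a different schema.
    role: str      # e.g. "observation", "thought", "action"
    content: str


def flatten_trajectory(task: str, turns: List[Turn]) -> str:
    """Serialize one agent trajectory into a single sequence, preserving turn order.

    Keeping the turns in their original order is what preserves the
    long-range dependencies (a late turn referring to an early one).
    """
    lines = [f"[task] {task}"]
    for index, turn in enumerate(turns, start=1):
        lines.append(f"[{index}:{turn.role}] {turn.content}")
    return "\n".join(lines)


def approx_token_count(text: str) -> int:
    """Crude whitespace token count; a real pipeline would use the model's tokenizer."""
    return len(text.split())


def keep_long(samples: List[str], min_tokens: int) -> List[str]:
    """Keep only samples long enough to exercise long-context behavior."""
    return [s for s in samples if approx_token_count(s) >= min_tokens]


# Toy trajectory: the final thought depends on the observation from turn 1.
trajectory = [
    Turn("observation", "config.yaml sets port=8080"),
    Turn("thought", "I should check which service reads the port"),
    Turn("action", "grep -r port src/"),
    Turn("observation", "server.py reads the port from config.yaml"),
    Turn("thought", "The port from turn 1 is the value to change"),
]
sample = flatten_trajectory("Change the server port", trajectory)
```

In this toy example the last turn only makes sense in light of the first observation, which is the kind of dependency that concatenating unrelated snippets would not produce. Real trajectories would be far longer, and the length threshold passed to `keep_long` would be chosen to suit the target context window.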
📌 Source
This summary was automatically compiled from Pandaily. See the original article for the full story.