RingX: Scalable Parallel Attention for Long-Context Learning on HPC

Junqi Yin, M. Palash, M. Shankar, Feiyi Wang

Published 2025 in International Conference on Software Composition

ABSTRACT

The attention mechanism has become foundational for remarkable AI breakthroughs since the introduction of the Transformer, driving the demand for increasingly longer context to power frontier models such as large-scale reasoning language models and high-resolution image/video generators. However, its quadratic computational and memory complexities present substantial challenges. Current state-of-the-art parallel attention methods, such as ring attention, are widely adopted for long-context training but utilize a point-to-point communication strategy that fails to fully exploit the capabilities of modern HPC network architectures. In this work, we propose ringX, a scalable family of parallel attention methods optimized explicitly for HPC systems. By enhancing workload partitioning, refining communication patterns, and improving load balancing, ringX achieves up to 3.4× speedup compared to conventional ring attention on the Frontier supercomputer. Optimized for both bi-directional and causal attention mechanisms, ringX demonstrates its effectiveness through training benchmarks of a Vision Transformer (ViT) on a climate dataset and a Generative Pre-Trained Transformer (GPT) model, Llama3 8B. Our method attains an end-to-end training speedup of approximately 1.5× in both scenarios. To our knowledge, the achieved 38% model FLOPs utilization (MFU) for training Llama3 8B with a 1M-token sequence length on 4,096 GPUs represents one of the highest training efficiencies reported for long-context learning on HPC systems. Our code implementation is available at https://github.com/jqyin/ringX-attention.
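To make the baseline concrete: the conventional ring attention that ringX improves upon shards the sequence across devices and rotates key/value blocks around a ring with point-to-point sends, merging partial results via an online softmax so the full quadratic score matrix is never materialized. The sketch below simulates this block rotation on a single process with NumPy; it illustrates the communication pattern only and is not the authors' ringX implementation (function names and shapes are illustrative assumptions).

```python
import numpy as np

def full_attention(q, k, v):
    # Reference: dense softmax attention over the whole sequence.
    s = q @ k.T / np.sqrt(q.shape[-1])
    p = np.exp(s - s.max(axis=-1, keepdims=True))
    return (p / p.sum(axis=-1, keepdims=True)) @ v

def ring_attention(q, k, v, n_dev):
    # Each "device" i holds one Q block permanently and, over n_dev steps,
    # receives successive K/V blocks from its ring neighbor (emulated here
    # by indexing), folding each block in with a streaming softmax update.
    d = q.shape[-1]
    qs, ks, vs = np.split(q, n_dev), np.split(k, n_dev), np.split(v, n_dev)
    outs = []
    for i in range(n_dev):
        acc = np.zeros_like(qs[i])                 # running weighted-value sum
        m = np.full(qs[i].shape[0], -np.inf)       # running row-wise max score
        l = np.zeros(qs[i].shape[0])               # running softmax denominator
        for step in range(n_dev):
            j = (i + step) % n_dev                 # K/V block arriving this step
            s = qs[i] @ ks[j].T / np.sqrt(d)
            m_new = np.maximum(m, s.max(axis=-1))
            p = np.exp(s - m_new[:, None])
            scale = np.exp(m - m_new)              # rescale old partial results
            acc = acc * scale[:, None] + p @ vs[j]
            l = l * scale + p.sum(axis=-1)
            m = m_new
        outs.append(acc / l[:, None])
    return np.concatenate(outs)
```

In a real multi-GPU deployment each inner-loop iteration is a point-to-point send/recv overlapped with the local block computation; ringX's contribution, per the abstract, is replacing this strictly neighbor-to-neighbor pattern with partitioning and communication schedules better matched to HPC interconnect topologies.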

PUBLICATION RECORD


  • Publication date

    2025-11-15

  • Fields of study

    Computer Science, Engineering

