Eagle-7B: A New LLM Architecture Challenging Transformers

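There’s a new LLM bird in town! Meet Eagle-7B, the latest model built on the RWKV-v5 architecture. According to the RWKV team's reported results, it beats other 7B-class models on multi-lingual benchmarks and comes close to models like Mistral and Falcon on English ones.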
What’s Cool About It
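Unlike transformer-based LLMs, Eagle-7B processes a sequence one token at a time, carrying a fixed-size state forward instead of a key-value cache that grows with every token. That keeps inference memory low and roughly constant as the context gets longer, which is especially helpful when GPUs are scarce.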
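To put rough numbers on that memory difference, here is a back-of-the-envelope sketch. Everything in it is an assumption for illustration rather than Eagle-7B's published configuration: the layer count, hidden size, head size, and 16-bit precision are typical 7B-class values, the transformer is assumed to use plain multi-head attention with no cache compression, and the recurrent state is assumed to be one `d_model x head_size` matrix per layer.

```python
def kv_cache_bytes(seq_len, n_layers=32, d_model=4096, bytes_per_elem=2):
    """Transformer decoding: one key and one value vector per token, per layer."""
    return 2 * n_layers * seq_len * d_model * bytes_per_elem


def recurrent_state_bytes(n_layers=32, d_model=4096, head_size=64, bytes_per_elem=2):
    """Recurrent decoding: a fixed-size state per layer (assumed d_model x head_size)."""
    return n_layers * d_model * head_size * bytes_per_elem


# Compare the two as the context grows (all sizes are hypothetical).
for seq_len in (1_024, 8_192, 65_536):
    kv_mib = kv_cache_bytes(seq_len) / 2**20
    state_mib = recurrent_state_bytes() / 2**20
    print(f"{seq_len:>6} tokens: KV cache {kv_mib:>6.0f} MiB, recurrent state {state_mib:.0f} MiB")
```

Under these assumptions the cache is already 512 MiB at 1,024 tokens and grows linearly from there, while the recurrent state stays at 16 MiB no matter how long the sequence gets.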
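How RWKV-v5 Differs from Transformers
While the two share common building blocks, they handle attention and sequences differently. A transformer's self-attention compares each token with every token before it, so the cost grows quadratically with sequence length. RWKV replaces that with a linear, recurrent formulation that folds past tokens into a running state. As a result, RWKV-v5 is more efficient on longer sequences, both for language modeling and for generation.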
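The contrast is easier to see in code. The sketch below is a deliberately simplified illustration, not RWKV's actual formula: it sets ordinary causal softmax attention next to a generic linear-attention recurrence with a single scalar `decay` (a hypothetical stand-in for RWKV's learned, per-channel decay). The function names are made up for this example, and plain Python lists are used so it runs without extra libraries.

```python
import math


def causal_softmax_attention(q, k, v):
    """Transformer-style: step t re-reads every earlier key and value."""
    dim_v = len(v[0])
    outputs = []
    for t in range(len(q)):
        scores = [sum(a * b for a, b in zip(q[t], k[i])) for i in range(t + 1)]
        peak = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        outputs.append([
            sum(w * v[i][j] for i, w in enumerate(weights)) / total
            for j in range(dim_v)
        ])
    return outputs


def recurrent_linear_attention(q, k, v, decay=0.9):
    """Recurrent-style: step t only touches a fixed-size state matrix."""
    dim_k, dim_v = len(k[0]), len(v[0])
    state = [[0.0] * dim_v for _ in range(dim_k)]
    outputs = []
    for q_t, k_t, v_t in zip(q, k, v):
        # Fade the old state, then add the outer product of this key and value.
        for a in range(dim_k):
            for b in range(dim_v):
                state[a][b] = decay * state[a][b] + k_t[a] * v_t[b]
        # Read out by projecting the query through the state.
        outputs.append([
            sum(q_t[a] * state[a][b] for a in range(dim_k))
            for b in range(dim_v)
        ])
    return outputs
```

In the first function the work at step `t` grows with `t`, so a full pass is quadratic in sequence length. In the second, each step costs the same regardless of position, and the only thing carried forward is `state`.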
Where Eagle-7B Fits In
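Eagle-7B is trained on a multi-lingual dataset covering more than 100 languages. In the RWKV team's reported evaluations, it outperforms other 7B-class models on multi-lingual benchmarks and approaches, rather than beats, the strongest 7B transformers on English ones.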
Is the Run for Transformers Over?
Not by a long shot. Architectures like RWKV-v5 that challenge transformers are a promising line of research into alternative designs, but transformers remain dominant across most use cases.