Eagle-7B: A New LLM Architecture Challenging Transformers


There’s a new LLM bird in town! Meet Eagle-7B, the latest model based on the RWKV-v5 architecture, which reportedly outperforms state-of-the-art 7B models like Mistral and Falcon.

What’s Cool About It

Unlike transformer-based LLMs, Eagle-7B processes sequences incrementally, one token at a time, which lowers its memory requirements. This is especially helpful when GPUs are scarce.
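To make the memory point concrete, here is a toy sketch in Python. It is not the actual RWKV or transformer math: the decay-based update rule and all function names are assumptions made up for illustration. It only shows that a recurrent model carries a fixed-size state, while attention-style decoding keeps every past token around.

```python
# Toy comparison only: the update rule below is a hypothetical stand-in,
# not the real RWKV formula.

def recurrent_step(state, x, decay=0.9):
    """Fold one token vector into a fixed-size state."""
    return [decay * s + (1.0 - decay) * xi for s, xi in zip(state, x)]

def run_recurrent(tokens, dim):
    """Process tokens one at a time, keeping only `dim` numbers of state."""
    state = [0.0] * dim
    for x in tokens:
        state = recurrent_step(state, x)
    return state

def run_with_cache(tokens):
    """Attention-style decoding: every past token vector stays in the cache."""
    cache = []
    for x in tokens:
        cache.append(x)
    return cache

def make_tokens(n, dim):
    """Build n dummy token vectors of width dim."""
    return [[float(i + j) for j in range(dim)] for i in range(n)]

DIM = 4
for n in (10, 1000):
    tokens = make_tokens(n, DIM)
    state = run_recurrent(tokens, DIM)
    cache = run_with_cache(tokens)
    # Numbers held in memory: constant for the state, linear for the cache.
    print(n, len(state), len(cache) * DIM)
```

The state stays at 4 numbers whether the sequence has 10 tokens or 1,000, while the cache grows with every token.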

How RWKV-v5 Differs from Transformers

While the two share common patterns, they handle attention and sequences differently. RWKV-v5 replaces quadratic self-attention with a recurrent formulation whose cost grows linearly with sequence length, which makes it more efficient on longer sequences and well suited to language modeling and generation.
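A rough way to see why sequence length matters is to count token-to-token interactions. The sketch below is back-of-the-envelope arithmetic under simplifying assumptions (causal self-attention looks at every earlier token, a recurrent model does one state update per token). The function names are hypothetical, and it ignores constant factors such as hidden size.

```python
def attention_interactions(seq_len):
    """Causal self-attention: token t looks at itself and all earlier tokens."""
    return seq_len * (seq_len + 1) // 2

def recurrent_interactions(seq_len):
    """Recurrent processing: one state update per token."""
    return seq_len

for n in (1_000, 10_000):
    print(n, attention_interactions(n), recurrent_interactions(n))
```

Going from 1,000 to 10,000 tokens multiplies the recurrent count by 10 but the attention count by roughly 100, so the gap widens as sequences get longer.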

Where Eagle-7B Fits In

Eagle-7B is trained on a multi-lingual dataset and reportedly beats comparable state-of-the-art 7B models on both English and multi-lingual evaluations.

Is the Run for Transformers Over?

Not by a long stretch. Architectures like RWKV-v5 that challenge transformers are a promising development in research into alternative designs, but transformers remain dominant across most use cases.
