SEA-LION-Embedding (Latest)

SEA-LION-Embedding, released in March 2026, is a suite of high-performance encoder models specifically architected for Southeast Asian languages. The collection features two primary product lines:

  • SEA-LION-ModernBERT (Efficiency & Context): Built on the ModernBERT architecture, these models feature a native 8,192-token context window and are optimized for high-throughput production environments and long-document RAG.

  • SEA-LION-Embedding-E5 (Semantic Precision): Fine-tuned from the E5-large foundation, these models are designed for maximum semantic accuracy in retrieval and similarity tasks across the region's diverse linguistic landscape.
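
To make the retrieval and similarity use case concrete, the sketch below shows how embedding vectors are typically compared once an encoder has produced them. It does not call any SEA-LION model: the three-dimensional vectors, the document IDs, and the helper names `cosine_similarity` and `rank_documents` are all hypothetical, and real embeddings would have far more dimensions. Treat it as an illustration of cosine-similarity ranking under those assumptions, not as the models' documented API.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # A zero vector has no direction; treat it as dissimilar.
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Return (doc_id, score) pairs sorted from most to least similar."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy stand-ins for encoder output (hypothetical values, not real embeddings).
query = [1.0, 0.0, 1.0]
docs = {
    "doc_a": [1.0, 0.0, 1.0],
    "doc_b": [0.0, 1.0, 0.0],
    "doc_c": [1.0, 1.0, 0.0],
}

ranking = rank_documents(query, docs)
for doc_id, score in ranking:
    print(f"{doc_id}: {score:.3f}")
```

In a real pipeline the vectors would come from one of the encoders above, and the ranking step would usually be delegated to a vector index rather than a linear scan.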

For detailed information on specific models and checkpoints, please refer to their individual documentation pages.
