Apertus-SEA-LION-v4-8B
Last updated: 2026-02-05
SEA-LION is a collection of Large Language Models (LLMs) which have been pretrained and instruct-tuned for the Southeast Asia (SEA) region.
Introduction
SEA-LION stands for Southeast Asian Languages In One Network.
Apertus-SEA-LION-v4-8B-IT is an 8-billion-parameter model built upon the Apertus-8B-Instruct architecture. To ensure domain adaptation for the region, the model underwent rigorous post-training on a curated dataset of approximately 8.54 million instruction-text pairs.
This extensive post-training instills multilingual and multicultural fluency, covering key SEA languages such as Burmese, Malay, Tagalog and Tamil. The dataset also includes 347,000 tool-calling instruction-text pairs, imparting tool-calling capabilities alongside linguistic fluency.
Apertus-SEA-LION-v4-8B-IT is designed as a fully open model; to align with this core philosophy, we have released the datasets used for post-training, as well as the evaluation code and datasets used to evaluate the model.
These resources can be accessed via the link below.
Open post-training datasets we used.
Model Details
Model Description
We performed post-training in English and SEA languages on Apertus-8B-Instruct-2509, a decoder model using the Apertus architecture, to create Apertus-SEA-LION-v4-8B-IT.
For tokenization, the model employs the default tokenizer used in Apertus-8B-Instruct-2509.
Developed by: AI Products Pillar, AI Singapore
Funded by: Singapore NRF
Shared by: AI Products Pillar, AI Singapore
Model type: Decoder
Context length: 65k
Language(s): fine-tuned on English, Burmese, Tagalog, Malay and Tamil
License: Apache-2.0
Finetuned from model: Apertus-8B-Instruct
Model Sources
Uses
Out-of-Scope Use
The model has not been aligned for safety. Developers and users should perform their own safety fine-tuning and related security measures. In no event shall the authors be held liable for any claims, damages, or other liabilities arising from the use of the released weights and codes.
Bias, Risks, and Limitations
The models were not tested for robustness against adversarial prompting. It is important for users to be aware that our models exhibit certain limitations that warrant consideration. Like many LLMs, the models can hallucinate and occasionally generate irrelevant content, introducing fictional elements that are not grounded in the provided context. Users should also exercise caution in interpreting and validating the models' responses due to potential inconsistencies.
How to Get Started with the Model
Use the code below to get started with the model using the 🤗 Transformers library.
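A minimal quickstart sketch with 🤗 Transformers. The repository id below is an assumption inferred from the model name (check the actual HuggingFace repository before use); the prompt is illustrative.

```python
model_id = "aisingapore/Apertus-SEA-LION-v4-8B-IT"  # assumed repo id, verify on HuggingFace

# The model is instruction-tuned, so prompts should go through the chat template.
messages = [
    {"role": "user", "content": "What are the official languages of Singapore?"}
]

if __name__ == "__main__":
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.bfloat16, device_map="auto"
    )

    # Apply the chat template and generate a response.
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    outputs = model.generate(inputs, max_new_tokens=128)
    print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```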
Tool Calling
The prompt in the example is in Malay and translates to “Please help me find a 4-room flat near Tampines, budget under $500,000. I also want to know the estimated monthly loan payment.”
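A hedged sketch of how tools can be passed to the model via the chat template's `tools` argument. The tool schema (`find_flats`) is hypothetical, invented for illustration only; the actual tools used in the example above are not specified in this card. The English translation of the Malay prompt is used as the message content.

```python
# Hypothetical tool definition in the OpenAI-style function schema that
# transformers chat templates accept via the `tools` argument.
tools = [
    {
        "type": "function",
        "function": {
            "name": "find_flats",  # hypothetical tool name
            "description": "Search for HDB flats near a location within a budget.",
            "parameters": {
                "type": "object",
                "properties": {
                    "flat_type": {"type": "string", "description": "e.g. '4-room'"},
                    "location": {"type": "string"},
                    "max_price_sgd": {"type": "integer"},
                },
                "required": ["flat_type", "location", "max_price_sgd"],
            },
        },
    }
]

# English translation of the Malay prompt from the example above.
messages = [
    {
        "role": "user",
        "content": (
            "Please help me find a 4-room flat near Tampines, budget under "
            "$500,000. I also want to know the estimated monthly loan payment."
        ),
    }
]

if __name__ == "__main__":
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "aisingapore/Apertus-SEA-LION-v4-8B-IT"  # assumed repo id
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    # Pass the tool schemas so the template renders them into the prompt.
    inputs = tokenizer.apply_chat_template(
        messages, tools=tools, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    outputs = model.generate(inputs, max_new_tokens=256)
    print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

The model is expected to respond with a structured tool call naming the function and its arguments, which the caller then executes and feeds back as a tool-role message.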
Training Details
Training Data
The instruction fine-tuning text dataset comprises a collection of OSS and synthetic data. The datasets used for post-training can be accessed via the links below.
Datasets for Instruction Fine Tuning:
Datasets for Tool-calling:
Datasets for Reinforcement Learning:
Training Procedure
Training Hyperparameters
Training regime: Our post-training workflow consists of instruction fine-tuning and model merging.
Training hyperparameters: The following hyperparameters were used during training:
| Category     | Hyperparameter                  | Value                                        |
|--------------|---------------------------------|----------------------------------------------|
| Optimization | Optimizer                       | ADAMW_TORCH_FUSED (β1=0.9, β2=0.999, ε=1e-08) |
| Batch Size   | Train Batch Size (per device)   | 1                                            |
| Batch Size   | Eval Batch Size (per device)    | 1                                            |
| Hardware     | Distributed Type                | multi-GPU                                    |
| Hardware     | Number of Devices               | 64                                           |
| Schedule     | LR Scheduler Type               | constant_with_warmup                         |
| Schedule     | LR Scheduler Warmup Steps       | 269                                          |
| Other        | Training Steps                  | 5397                                         |
| Other        | Seed                            | 42                                           |
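The reported hyperparameters can be collected into a single configuration dict, e.g. for constructing `transformers.TrainingArguments`. This is a sketch assuming standard Transformers argument names; values not reported above (such as the learning rate) are deliberately omitted rather than guessed.

```python
# Reported hyperparameters mapped onto transformers.TrainingArguments field names.
training_config = {
    "optim": "adamw_torch_fused",          # ADAMW_TORCH_FUSED
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "per_device_train_batch_size": 1,
    "per_device_eval_batch_size": 1,
    "lr_scheduler_type": "constant_with_warmup",
    "warmup_steps": 269,
    "max_steps": 5397,
    "seed": 42,
}
# Usage (sketch): TrainingArguments(output_dir="out", **training_config)
```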
Evaluation
Testing Data, Factors & Metrics
We evaluated Apertus-SEA-LION-v4-8B-IT on general language capabilities and LLM-specific capabilities using SEA-HELM.
Results
For details on Apertus-SEA-LION-v4-8B-IT performance, please refer to the Leaderboard results on SEA-HELM.
Download the Models
The Apertus-SEA-LION-v4-8B models are available for download via the 🤗 HuggingFace Apertus-SEA-LION-v4-8B-IT repository. You can also explore more models in the same collection at 🤗 HuggingFace SEA-LION v4 Collection.
More Information
This is the repository for the commercial instruction-tuned model. The models have not been aligned for safety. Developers and users should perform their own safety fine-tuning and related security measures. In no event shall the authors be held liable for any claims, damages, or other liabilities arising from the use of the released weights and codes.
For more info, please contact us at SEA-LION Inquiry Form or [email protected]