Apertus-SEA-LION-v4-8B

Last updated: 2026-02-05

SEA-LION is a collection of Large Language Models (LLMs) which have been pretrained and instruct-tuned for the Southeast Asia (SEA) region.

Introduction

SEA-LION stands for Southeast Asian Languages In One Network.

Apertus-SEA-LION-v4-8B-IT is an 8-billion parameter model built upon the Apertus-8B-Instruct architecture. To ensure domain adaptation for the region, the model underwent rigorous post-training on a curated dataset of approximately 8.54 million instruction-text pairs.

This extensive post-training instills multilingual and multicultural fluency, covering key SEA languages such as Burmese, Malay, Tagalog and Tamil. The dataset also includes 347,000 tool-calling instruction-text pairs to impart tool-use capabilities in addition to linguistic fluency.

Apertus-SEA-LION-v4-8B-IT is designed as a fully open model; to align with this core philosophy, we have released the datasets used for post-training, as well as the evaluation codes and datasets used to evaluate the model.

These resources can be accessed via the link below.

Model Details

Model Description

We performed post-training in English and SEA languages on Apertus-8B-Instruct-2509, a decoder model using the Apertus architecture, to create Apertus-SEA-LION-v4-8B-IT.

For tokenization, the model employs the default tokenizer used in Apertus-8B-Instruct-2509.

  • Developed by: AI Products Pillar, AI Singapore

  • Funded by: Singapore NRF

  • Shared by: AI Products Pillar, AI Singapore

  • Model type: Decoder

  • Context length: 65k

  • Language(s): fine-tuned on English, Burmese, Tagalog, Malay and Tamil

  • Finetuned from model: Apertus-8B-Instruct

Model Sources

Uses

Out-of-Scope Use

The model has not been aligned for safety. Developers and users should perform their own safety fine-tuning and related security measures. In no event shall the authors be held liable for any claims, damages, or other liabilities arising from the use of the released weights and codes.

Bias, Risks, and Limitations

The models were not tested for robustness against adversarial prompting. Users should be aware that our models exhibit certain limitations that warrant consideration. Like many LLMs, the models can hallucinate and occasionally generate irrelevant content, introducing fictional elements that are not grounded in the provided context. Users should also exercise caution when interpreting and validating the models' responses due to these potential inconsistencies.

How to Get Started with the Model

Use the code below to get started with the model using the 🤗 Transformers library.
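A minimal sketch of loading the model and generating a response with 🤗 Transformers. The repository id below is an assumption based on this model card's title; check the HuggingFace collection for the exact id.

```python
# Minimal sketch: load Apertus-SEA-LION-v4-8B-IT and generate a chat response.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "aisingapore/Apertus-SEA-LION-v4-8B-IT"  # assumed repo id


def build_chat(prompt: str) -> list:
    """Wrap a user prompt in the chat-message format expected by apply_chat_template."""
    return [{"role": "user", "content": prompt}]


def generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Load the model, render the chat template, and decode only the new tokens."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
    )
    inputs = tokenizer.apply_chat_template(
        build_chat(prompt), add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    outputs = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated portion of the sequence.
    return tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)


# Usage (downloads the weights; requires a GPU for reasonable speed):
# print(generate("Explain the concept of inflation in economics."))
```

Calling `generate(...)` downloads the full model weights, so it is left as a commented usage line here.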

Tool Calling

The prompt in the example is in Malay and translates to “Please help me find a 4-room flat near Tampines, budget under $500,000. I also want to know the estimated monthly loan payment.”
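The tool-calling flow can be sketched with the `tools` argument of `tokenizer.apply_chat_template`. The tool schemas and repository id below are illustrative assumptions, not part of this model's released API; the user prompt is the English translation of the Malay example above.

```python
# Hypothetical sketch of a tool-calling request. The tool names and schemas
# (search_flats, estimate_monthly_payment) are illustrative assumptions.
from transformers import AutoTokenizer

MODEL_ID = "aisingapore/Apertus-SEA-LION-v4-8B-IT"  # assumed repo id

# Two illustrative tools matching the example prompt: a flat search and a
# monthly-loan-payment estimator, in the JSON-schema format Transformers expects.
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "search_flats",
            "description": "Search for flats by town, flat type and budget.",
            "parameters": {
                "type": "object",
                "properties": {
                    "town": {"type": "string"},
                    "flat_type": {"type": "string"},
                    "max_price": {"type": "number"},
                },
                "required": ["town", "flat_type", "max_price"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "estimate_monthly_payment",
            "description": "Estimate the monthly loan payment for a given price.",
            "parameters": {
                "type": "object",
                "properties": {"price": {"type": "number"}},
                "required": ["price"],
            },
        },
    },
]


def build_tool_prompt(tokenizer, user_message: str) -> str:
    """Render a chat with tool definitions into the model's prompt format."""
    messages = [{"role": "user", "content": user_message}]
    return tokenizer.apply_chat_template(
        messages, tools=TOOLS, add_generation_prompt=True, tokenize=False
    )


# Usage (English translation of the Malay example prompt):
# tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
# prompt = build_tool_prompt(
#     tokenizer,
#     "Please help me find a 4-room flat near Tampines, budget under $500,000. "
#     "I also want to know the estimated monthly loan payment.",
# )
```

The rendered prompt is then tokenized and passed to `model.generate` as in the basic example; the model is expected to emit a tool call that the application executes before returning the result in a follow-up turn.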

Training Details

Training Data

The instruction fine-tuning text dataset comprises a collection of open-source and synthetic data. The datasets used for post-training can be accessed via the link below.

Datasets for Instruction Fine Tuning:

Datasets for Tool-calling:

Datasets for Reinforcement Learning:

Training Procedure

Training Hyperparameters

  • Training regime: Our post-training workflow consists of instruction fine-tuning and model merging.

  • Training hyperparameters: The following hyperparameters were used during training:

Category     | Hyperparameter                | Value
-------------|-------------------------------|----------------------------------------------
Optimization | Optimizer                     | ADAMW_TORCH_FUSED (β1=0.9, β2=0.999, ε=1e-08)
Batch Size   | Train Batch Size (per device) | 1
Batch Size   | Eval Batch Size (per device)  | 1
Hardware     | Distributed Type              | multi-GPU
Hardware     | Number of Devices             | 64
Schedule     | LR Scheduler Type             | constant_with_warmup
Schedule     | LR Scheduler Warmup Steps     | 269
Other        | Training Steps                | 5397
Other        | Seed                          | 42

Evaluation

Testing Data, Factors & Metrics

We evaluated Apertus-SEA-LION-v4-8B-IT on general language capabilities and LLM-specific capabilities using SEA-HELM.

Results

For details on Apertus-SEA-LION-v4-8B-IT performance, please refer to the Leaderboard results on SEA-HELM.

Download the Models

The Apertus-SEA-LION-v4-8B models are available for download via the 🤗 HuggingFace Apertus-SEA-LION-v4-8B-IT repository. You can also explore more models in the same collection at the 🤗 HuggingFace SEA-LION v4 Collection.

More Information

This is the repository for the commercial instruction-tuned model. As noted under Out-of-Scope Use, the models have not been aligned for safety; developers and users should perform their own safety fine-tuning and related security measures.

For more info, please contact us via the SEA-LION Inquiry Form or at [email protected]