# Motivations

Large Language Models (LLMs) are a type of artificial intelligence model designed to understand and generate human language. Recent LLMs have shown remarkable capabilities, with applications spanning translation, summarization, coding assistance, question answering, and more.

Many existing LLMs, however, are trained on massive internet-based datasets, which are often disproportionately influenced by western, industrialized, rich, educated, and democratic (WIRED) societies, because people from non-WIRED societies are less likely to be literate, to use the Internet, and to have their output easily accessed. Such an imbalance in the training data can lead to model outputs that display strong bias in terms of cultural values, political beliefs and social attitudes.

LLMs trained on predominantly WIRED-centric content risk neglecting the linguistic and cultural diversity inherent in non-WIRED populations. These biases become evident not only in cultural references or local idioms, but also in more critical domains such as decision-making, social attitudes and moral reasoning, which can vary significantly across global communities. By overlooking these variations, mainstream LLMs may inadvertently perpetuate inaccurate assumptions or exclude large segments of the global population.

Our work on SEA-LION, now part of Singapore’s National Multi-Modal Large Language Model project, aims to address these disparities by creating LLMs that cater to under-represented population groups and low-resource languages in the Southeast Asia (SEA) region.

SEA-LION is trained on more content produced in Southeast Asian languages such as Thai, Vietnamese and Bahasa Indonesia, to ensure better representation in data and alignment than Western or Chinese models offer. SEA-LION models understand nuances in SEA languages and demonstrate greater awareness of cultural context specific to the region.

This lowers the barrier for governments, industries, and academia seeking LLM solutions that fit local languages and reflect local cultural norms, since WIRED-centric models can pose language barriers and misalign with local sensibilities in the SEA region.

