© 2025 AI Singapore

SEA-LION API


The SEA-LION API provides a quick and simple interface to our various SEA-LION models for text generation, translation, summarization, and more.

Usage of the SEA-LION API is subject to our Terms of Use and Privacy Policy.

Getting an API Key

To get started with the SEA-LION API, you'll first need to create an API key via the SEA-LION Playground:

  1. Sign in to SEA-LION Playground via your Google account.

  2. Navigate to the API Key Manager page by clicking either

  • API Key on the side menu, or

  • Launch Key Manager on the home dashboard

  3. Click the "Create New Trial API Key" button and enter a name for your API key.

An API key will be generated for you after you click "Create". Make sure to copy or download the generated key and store it somewhere safe, since you won't be able to view it again.

Only one API key can be created per user.
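Since the key cannot be viewed again, avoid hard-coding it into scripts. One common pattern is to read it from an environment variable. A minimal sketch (the SEALION_API_KEY variable name is our own convention for this example, not part of the API):

```python
import os

def load_api_key(var_name: str = "SEALION_API_KEY") -> str:
    """Read the SEA-LION API key from an environment variable.

    The variable name is this example's convention; use whatever
    naming your deployment follows.
    """
    key = os.environ.get(var_name)
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable first.")
    return key

# The key is then passed as a Bearer token, e.g.:
# headers = {"Authorization": f"Bearer {load_api_key()}"}
```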

How To Use Your API Key

Step 1: Find the Available Models

To find the available SEA-LION models for your API key, use the following curl command.

curl 'https://api.sea-lion.ai/v1/models' \
  -H 'Authorization: Bearer YOUR_API_KEY'

Replace YOUR_API_KEY with your generated API key.
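The same endpoint can also be called from Python. A minimal stdlib sketch, assuming the endpoint returns the OpenAI-style {"data": [{"id": ...}, ...]} model list (the helper names below are our own):

```python
import json
import urllib.request

BASE_URL = "https://api.sea-lion.ai/v1"

def build_models_request(api_key: str) -> urllib.request.Request:
    """Build the GET /v1/models request with the Bearer auth header."""
    return urllib.request.Request(
        f"{BASE_URL}/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )

def list_models(api_key: str) -> dict:
    """Fetch and decode the JSON list of models available to this key."""
    with urllib.request.urlopen(build_models_request(api_key)) as resp:
        return json.load(resp)

# Usage:
# for model in list_models("YOUR_API_KEY")["data"]:
#     print(model["id"])
```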

Step 2: Call the API

SEA-LION's API endpoints for chat are compatible with OpenAI's API and libraries.

Calling our Instruct models

curl https://api.sea-lion.ai/v1/chat/completions \
  -H 'accept: text/plain' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "aisingapore/Gemma-SEA-LION-v3-9B-IT",
    "messages": [
      {
        "role": "user",
        "content": "Tell me a Singlish joke!"
      }
    ]
  }'

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",  # replace with your generated API key
    base_url="https://api.sea-lion.ai/v1"
)

completion = client.chat.completions.create(
    model="aisingapore/Gemma-SEA-LION-v3-9B-IT",
    messages=[
        {
            "role": "user",
            "content": "Tell me a Singlish joke!"
        }
    ]
)

print(completion.choices[0].message.content)

Calling our Reasoning models

Our v3.5 models offer dynamic reasoning capabilities and default to reasoning mode, with thinking_mode="on" passed to the chat template. For standard (non-thinking) generation, pass thinking_mode="off" to the chat template instead.

curl https://api.sea-lion.ai/v1/chat/completions \
  -H 'accept: text/plain' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "aisingapore/Llama-SEA-LION-v3.5-8B-R",
    "messages": [
      {
        "role": "user",
        "content": "Tell me a Singlish joke!"
      }
    ],
    "chat_template_kwargs": {
      "thinking_mode": "off"
    }
  }'

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",  # replace with your generated API key
    base_url="https://api.sea-lion.ai/v1"
)

completion = client.chat.completions.create(
    model="aisingapore/Llama-SEA-LION-v3.5-8B-R",
    messages=[
        {
            "role": "user",
            "content": "Tell me a Singlish joke!"
        }
    ],
    extra_body={
        "chat_template_kwargs": {
            "thinking_mode": "off"
        }
    },
)

print(completion.choices[0].message.content)

If you do not observe any change in responses when toggling thinking_mode on and off, your API responses may have been cached.

You can temporarily disable caching for testing by setting the no-cache flag to true.

curl https://api.sea-lion.ai/v1/chat/completions \
  -H 'accept: text/plain' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "aisingapore/Llama-SEA-LION-v3.5-8B-R",
    "messages": [
      {
        "role": "user",
        "content": "Tell me a Singlish joke!"
      }
    ],
    "chat_template_kwargs": {
      "thinking_mode": "off"
    },
    "cache": {
      "no-cache": true
    }
  }'

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",  # replace with your generated API key
    base_url="https://api.sea-lion.ai/v1"
)

completion = client.chat.completions.create(
    model="aisingapore/Llama-SEA-LION-v3.5-8B-R",
    messages=[
        {
            "role": "user",
            "content": "Tell me a Singlish joke!"
        }
    ],
    extra_body={
        "chat_template_kwargs": {
            "thinking_mode": "off"
        }, 
        "cache": {
            "no-cache": True
        }
    },
)

print(completion.choices[0].message.content)

Calling our Guard model

Our safety model, aisingapore/Llama-SEA-Guard-Prompt-v1, can be used to evaluate potentially harmful content. It returns a binary classification of safe or unsafe, and accepts a single user prompt as input.

Note: The safety model does not support system prompts or multi-turn conversations.

curl https://api.sea-lion.ai/v1/chat/completions \
  -H 'accept: text/plain' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
  "messages": [
    {
      "role": "user",
      "content": "How can I steal a car?"
    }
  ],
  "model": "aisingapore/Llama-SEA-Guard-Prompt-v1",
  "stream": false
}'

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",  # replace with your generated API key
    base_url="https://api.sea-lion.ai/v1"
)

completion = client.chat.completions.create(
    model="aisingapore/Llama-SEA-Guard-Prompt-v1",
    messages=[
        {
            "role": "user",
            "content": "How can I steal a car?"
        }
    ],
)

print(completion.choices[0].message.content)

Rate Limits

Rate limits help us mitigate misuse, manage API capacity, and ensure that everyone has fair access to the API.

SEA-LION API usage is subject to rate limits applied on requests per minute (RPM).

As of 18 Mar 2025, the rate limit is set to 10 requests per minute per user.

If you have any questions or would like to discuss a rate limit increase, reach out to sealion@aisingapore.org.
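When you exceed the limit, requests will fail until the rate window resets. A simple client-side mitigation is to retry with exponential backoff. The sketch below is a generic helper (the names and defaults are our own, not part of the API); with the openai library you could pass retryable=(openai.RateLimitError,) and wrap your chat completion call in it:

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")

def call_with_backoff(
    fn: Callable[[], T],
    retryable: Tuple[Type[BaseException], ...] = (Exception,),
    max_attempts: int = 5,
    base_delay: float = 2.0,
) -> T:
    """Call fn, retrying on the given exceptions with exponential backoff.

    Waits base_delay, 2*base_delay, 4*base_delay, ... between attempts,
    and re-raises the last error once max_attempts is exhausted.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except retryable:
            if attempt == max_attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))
    raise RuntimeError("unreachable")

# Usage sketch (hypothetical wiring around the OpenAI client):
# completion = call_with_backoff(
#     lambda: client.chat.completions.create(model=..., messages=...),
#     retryable=(openai.RateLimitError,),
# )
```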
