Tool-Calling
Introduction to Tool Calling
Tool calling is a powerful feature that enables Large Language Models (LLMs) to interact with external functions and APIs, extending their capabilities beyond text generation. SEA-LION models support tool calling with different implementations depending on the model version.
Tool calling allows models to:
Access real-time information (weather, time, web search)
Perform calculations and data processing
Interact with external systems and APIs
Execute specific functions based on user requests
This guide covers tool calling implementation for the SEA-LION model variants hosted on SEA-LION API, each with distinct behaviors and requirements. For demonstration purposes, the tools suggested in the tool implementation page will be used in the sample code snippets.
Model-Specific Tool Calling Guides
Gemma-SEA-LION-v4-27B-IT
Key Characteristics:
Uses text-based tool calling format
Requires parsing tool calls from response content
Does not utilize the standard tool_calls parameter
Follows system prompt instructions for tool call formatting
Following the Gemma 3 chat template, Gemma-SEA-LION-v4-27B-IT does not parse the tools parameter; it is therefore recommended to handle tool calling by parsing the model's message response, similar to this example by Google DeepMind engineer Philipp Schmid.
When tool_choice is configured to enforce usage of a specific tool, the tool_calls parameter will be returned, but this removes the LLM's flexibility to decide whether a tool call is required.
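Because tools are not passed in the request in the default mode, the available functions and the expected tool_code output format must be spelled out in the system prompt. Below is a minimal sketch, assuming the demo tools from the tool implementation page; the exact wording and signatures are illustrative, not taken from that page:
# Hypothetical system prompt: tool descriptions and format are illustrative
TOOL_SYSTEM_PROMPT = """You have access to the following functions:

get_time(timezone: str): Get the current time in the given IANA timezone.
get_current_weather(location: str): Get the current weather for a location.

If a function call is needed, reply with ONLY a code block in this format:
```tool_code
function_name(param="value")
```"""

messages = [
    {"role": "system", "content": TOOL_SYSTEM_PROMPT},
    {"role": "user", "content": "What time is it in Singapore?"},
]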
API Request Configuration
# For Gemma-SEA-LION-v4-27B-IT, DO NOT include tools in request
request_data = {
"model": "aisingapore/Gemma-SEA-LION-v4-27B-IT",
"messages": messages,
"temperature": 0,
# Note: No tools or tool_choice parameters
}
If enforcing a tool call:
# To enforce a specific tool with Gemma-SEA-LION-v4-27B-IT, include tools and tool_choice
request_data = {
"model": "aisingapore/Gemma-SEA-LION-v4-27B-IT",
"messages": messages,
"temperature": 0,
"tools": build_tool_schema(),
"tool_choice": {
"type": "function",
"function": {"name": "get_current_weather"} # Force use of specific tool
}
}
# Alternative: Force any tool call (not a specific one)
request_data_any_tool = {
"model": "aisingapore/Gemma-SEA-LION-v4-27B-IT",
"messages": messages,
"temperature": 0,
"tools": build_tool_schema(),
"tool_choice": "required" # Force model to use any available tool
}
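The build_tool_schema() helper referenced above comes from the tool implementation page. As a rough sketch, it is expected to return OpenAI-style function definitions along these lines (the actual tools and parameters there may differ):
def build_tool_schema():
    """Return OpenAI-style function definitions for the demo tools.

    A sketch of what the tool implementation page provides; only get_time
    is shown here.
    """
    return [
        {
            "type": "function",
            "function": {
                "name": "get_time",
                "description": "Get the current time in a given timezone.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "timezone": {
                            "type": "string",
                            "description": "IANA timezone, e.g. Asia/Singapore",
                        }
                    },
                    "required": ["timezone"],
                },
            },
        }
    ]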
Example Response (Tool-calling not enforced)
{
"choices": [{
"message": {
"content": "```tool_code\nget_time(timezone=\"Asia/Singapore\")\n```",
"role": "assistant"
}
}]
}
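A response like this can be parsed with a regular expression over the tool_code block. Below is one possible implementation of the parse_tool_calls_from_text() helper used later in this guide; it is a minimal sketch that only handles double-quoted string keyword arguments, as in the response above:
import json
import re

def parse_tool_calls_from_text(message_content):
    """Extract tool calls from ```tool_code blocks in the message content.

    A minimal sketch: only keyword arguments with double-quoted string
    values are handled, e.g. get_time(timezone="Asia/Singapore").
    """
    pattern = r"```tool_code\s*(\w+)\((.*?)\)\s*```"
    tool_calls = []
    for name, raw_args in re.findall(pattern, message_content, re.DOTALL):
        arguments = dict(re.findall(r'(\w+)\s*=\s*"([^"]*)"', raw_args))
        tool_calls.append({
            "type": "function",
            "function": {"name": name, "arguments": json.dumps(arguments)},
        })
    return tool_calls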
Llama-SEA-LION-v3-70B-IT
Key Characteristics:
Supports standard OpenAI-style function calling
Uses the tool_calls parameter in responses
Requires tools configuration in the API request
Works with the tool_choice: "auto" setting
API Request Configuration
# For Llama-SEA-LION-v3-70B-IT, include tools and tool_choice
request_data = {
"model": "aisingapore/Llama-SEA-LION-v3-70B-IT",
"messages": messages,
"temperature": 0,
"tools": build_tool_schema(),
"tool_choice": "auto"
}
Response Handling
def extract_tool_calls(data):
"""Extract tool calls from the response data."""
choice = data.get("choices", [{}])[0] if data.get("choices") else {}
return choice.get("message", {}).get("tool_calls")
# Usage
tool_calls = extract_tool_calls(response_data)
if tool_calls:
# Execute tool calls directly
tool_results = await execute_tool_calls(tool_calls, session)
Example Response
{
"choices": [{
"message": {
"content": null,
"role": "assistant",
"tool_calls": [{
"function": {
"arguments": "{\"timezone\": \"Asia/Singapore\"}",
"name": "get_time"
},
"id": "chatcmpl-tool-920019c71dd14d96a262ec798b778ccd",
"type": "function"
}]
}
}]
}
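The execute_tool_calls() helper is provided in the tool implementation page. As a simplified sketch of what it might do, assuming a local registry of async tool functions (the get_time stub below is illustrative only):
import json
from datetime import datetime
from zoneinfo import ZoneInfo

async def get_time(timezone="UTC", **kwargs):
    # Illustrative stub for the demo tool
    return {"time": datetime.now(ZoneInfo(timezone)).isoformat()}

TOOL_REGISTRY = {"get_time": get_time}

async def execute_tool_calls(tool_calls, session):
    """Run each tool call and return messages carrying the results.

    Note: for Gemma-SEA-LION-v4-27B-IT the result role should be "user"
    instead of "tool" (see Points to Take Note Of below).
    """
    results = []
    for call in tool_calls:
        name = call["function"]["name"]
        arguments = json.loads(call["function"]["arguments"] or "{}")
        output = await TOOL_REGISTRY[name](**arguments)
        results.append({
            "role": "tool",
            "tool_call_id": call.get("id", ""),
            "content": json.dumps(output),
        })
    return results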
Llama-SEA-LION-v3.5-70B-R
Key Characteristics:
Reasoning model without tool calling capability
Tool calling can be done by parsing the message response
Handles tool calling via message content, similar to Gemma-SEA-LION-v4-27B-IT
It is recommended not to add tools or tool_choice in the API call
API Request Configuration
# For reasoning models, do NOT include tools to avoid errors
def is_reasoning_model(model_name):
return model_name.endswith('-R')
# Request configuration
if is_reasoning_model(api_config["model"]):
request_data = {
"model": "aisingapore/Llama-SEA-LION-v3.5-70B-R",
"messages": messages,
"temperature": 0,
# No tools or tool_choice parameters
}
else:
request_data = {
"model": api_config["model"],
"messages": messages,
"temperature": 0,
"tools": tools,
"tool_choice": "auto"
}
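When parsing tool calls from a reasoning model's output, drop the reasoning segment first; otherwise tool calls the model merely considered while thinking may be executed. A small helper along these lines (mirroring the implementation example below):
def strip_reasoning_segment(content):
    """Keep only the text after the final </think> tag, if one is present."""
    return content.split("</think>")[-1].strip() if content else ""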
Implementation Example
Here's an example that handles all three models, making use of the components provided in the tool implementation page:
async def process_user_message(user_message, messages, api_config, session):
"""Process a user message and handle tool calls for different model types."""
messages.append({"role": "user", "content": user_message})
tools = build_tool_schema()
# Check model type
is_reasoning_model = api_config["model"].endswith('-R')
# Configure request based on model type
request_data = {
"model": api_config["model"],
"messages": messages,
"temperature": 0,
}
# Only add tools for non-reasoning models
if not is_reasoning_model:
request_data["tools"] = tools
request_data["tool_choice"] = "auto"
headers = {"Authorization": f"Bearer {api_config['api_key']}"}
async with session.post(
api_config["api_url"],
json=request_data,
headers=headers,
timeout=30
) as response:
data = await response.json()
        assistant_message = (data.get("choices") or [{}])[0].get("message")
if not assistant_message:
return
messages.append(assistant_message)
# Handle tool calls based on model type
tool_calls = extract_tool_calls(data)
if not tool_calls:
# Tool call not found, parse tool call from message content
            message_content = assistant_message.get("content") or ""
if is_reasoning_model:
                # Check for tool calls only in the non-reasoning content to prevent excess calls
                message_content = message_content.split("</think>")[-1].strip()
tool_calls = parse_tool_calls_from_text(message_content)
if tool_calls:
# Execute tools and get final response
tool_results = await execute_tool_calls(tool_calls, session)
messages.extend(tool_results)
# Get final response with tool results
final_response = await session.post(
api_config["api_url"],
json={"model": api_config["model"], "messages": messages, "temperature": 0},
headers=headers,
timeout=30
)
final_data = await final_response.json()
            final_message = (final_data.get("choices") or [{}])[0].get("message")
if final_message:
print(final_message["content"])
messages.append(final_message)
else:
# No tool calls, show direct response
print(assistant_message.get("content", ""))
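Here is a sketch of how the function above might be driven end to end; the endpoint URL is an assumption, so check the SEA-LION API documentation for the actual value:
import asyncio
import aiohttp

async def main():
    api_config = {
        "model": "aisingapore/Llama-SEA-LION-v3-70B-IT",
        "api_url": "https://api.sea-lion.ai/v1/chat/completions",  # assumed endpoint
        "api_key": "YOUR_API_KEY",
    }
    messages = []
    async with aiohttp.ClientSession() as session:
        await process_user_message(
            "What time is it in Singapore?", messages, api_config, session
        )

asyncio.run(main())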
Points to Take Note Of
Model-Specific Considerations
Gemma-SEA-LION-v4-27B-IT:
Typically uses text parsing instead of standard tool calling
System prompt should explicitly define tool call format
The tools parameter in API requests is only utilized when tool_choice is set to "required" or a specific tool is enforced
Tool calls are wrapped in ```tool_code blocks
Regex patterns needed for extraction
Llama-SEA-LION-v3-70B-IT:
Fully supports OpenAI-style tool calling
Uses tools and tool_choice in API requests
Returns structured tool_calls in the response
Reliable for production tool calling applications
Llama-SEA-LION-v3.5-70B-R:
Reasoning model without tool calling capability
Tool calling can be done by parsing the message response
Can reason about tool usage
Parse the message content after the reasoning segment to prevent multiple redundant tool calls
Best used for complex reasoning tasks
General Best Practices
Error Handling: Always implement proper error handling for tool execution failures and API timeouts.
Model Detection: Use model name suffixes to determine the appropriate tool calling approach:
is_reasoning_model = model_name.endswith('-R')
Timeout Management: Set appropriate timeouts for both LLM API calls and tool execution.
Response Validation: Always validate tool call responses before processing:
if not tool_calls or not isinstance(tool_calls, list): # Handle no tool calls case
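For example, a fuller defensive filter might look like this (a sketch, not part of the tool implementation page):
def validate_tool_calls(tool_calls):
    """Return only well-formed function tool calls (a defensive sketch)."""
    if not tool_calls or not isinstance(tool_calls, list):
        return []
    valid = []
    for call in tool_calls:
        function = call.get("function") if isinstance(call, dict) else None
        if isinstance(function, dict) and isinstance(function.get("name"), str):
            valid.append(call)
    return valid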
Conversation Flow: Maintain proper conversation history by adding all messages (user, assistant, tool results) to the messages array.
The Gemma 3 chat template enforces alternating user and assistant roles in the message history; hence, in the example function execute_tool_calls, the role returned is user and not tool for the tool result.
Platform Considerations: Some models may behave differently on different platforms (e.g., Ollama vs cloud APIs). Test your implementation on your target platform.
Token Efficiency: The text-based approach may use more tokens than standard function calling. Monitor usage accordingly.
Security Considerations
Validate all tool parameters before execution
Implement rate limiting for external API calls
Sanitize user inputs that will be passed to tools
Consider implementing tool execution sandboxing for production environments
Performance Optimization
Cache tool results where appropriate (e.g., weather data for short periods)
Implement parallel tool execution when multiple tools are called (see the sketch after this list)
Use connection pooling for HTTP requests in tool implementations
Consider implementing tool call batching for efficiency
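As an example of the parallel-execution point above, independent tool calls can run concurrently with asyncio.gather; execute_single_tool_call here is a hypothetical helper that runs one call and returns its result message:
import asyncio

async def execute_tool_calls_parallel(tool_calls, session):
    """Run independent tool calls concurrently.

    A sketch: execute_single_tool_call is a hypothetical helper that
    executes one call and returns its tool-result message.
    """
    tasks = [execute_single_tool_call(call, session) for call in tool_calls]
    return await asyncio.gather(*tasks)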