Tool-Calling
Introduction to Tool Calling
Tool calling is a powerful feature that enables Large Language Models (LLMs) to interact with external functions and APIs, extending their capabilities beyond text generation. SEA-LION models support tool calling with different implementations depending on the model version.
Tool calling allows models to:
Access real-time information (weather, time, web search)
Perform calculations and data processing
Interact with external systems and APIs
Execute specific functions based on user requests
This guide covers tool calling implementation for the SEA-LION model variants hosted on SEA-LION API, each with distinct behaviors and requirements. For demonstration purposes, the tools suggested in the tool implementation page will be used in the sample code snippets.
Model-Specific Tool Calling Guides
Gemma-SEA-LION-v4-27B-IT
Key Characteristics:
Uses text-based tool calling format
Requires parsing tool calls from response content
Does not utilize the standard tool_calls parameter
Follows system prompt instructions for tool call formatting
Following the Gemma 3 chat template, Gemma-SEA-LION-v4-27B-IT does not parse the tools parameter, so it is recommended to handle tool calling by parsing the model's message response, similar to this example by Google DeepMind engineer Philipp Schmid.
When tool_choice is configured to enforce usage of a specific tool, the tool_calls parameter will be returned, but this removes the LLM's flexibility to determine whether a tool call is required.
API Request Configuration
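A minimal sketch of a request payload for this text-based approach (the model id and the system-prompt wording are assumptions; adapt them to your deployment and to the tools from the tool implementation page):

```python
# Hypothetical system prompt -- Gemma-SEA-LION-v4-27B-IT follows whatever
# tool-call format the system prompt specifies; the exact wording here is
# an assumption, as is the get_weather tool itself.
SYSTEM_PROMPT = (
    "You have access to the following tool:\n"
    "- get_weather(location: str): current weather for a location\n"
    "To call it, respond only with a ```tool_code block containing a JSON "
    'object with "name" and "arguments" keys.'
)

# OpenAI-compatible chat completions payload. Note: no "tools" or
# "tool_choice" keys -- the model relies on the system prompt instead.
payload = {
    "model": "aisingapore/Gemma-SEA-LION-v4-27B-IT",  # assumed model id
    "messages": [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": "What's the weather in Singapore?"},
    ],
}

print("tools" in payload)  # → False
```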
If enforcing tool call:
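A sketch of enforcing a specific tool via tool_choice (the model id and the get_weather tool schema are assumptions). As noted above, this makes the API return a structured tool_calls field, at the cost of letting the model decide whether a tool is needed:

```python
# OpenAI-style tool schema for a hypothetical get_weather tool.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location",
            "parameters": {
                "type": "object",
                "properties": {"location": {"type": "string"}},
                "required": ["location"],
            },
        },
    }
]

payload = {
    "model": "aisingapore/Gemma-SEA-LION-v4-27B-IT",  # assumed model id
    "messages": [{"role": "user", "content": "What's the weather in Singapore?"}],
    "tools": tools,
    # Forcing the model to call get_weather:
    "tool_choice": {"type": "function", "function": {"name": "get_weather"}},
}
```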
Example Response (Tool-calling not enforced)
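When tool calling is not enforced, the tool call arrives as text inside the message content rather than in tool_calls. A sketch of extracting it with a regex (the sample content below is illustrative, not actual model output):

```python
import json
import re

FENCE = "`" * 3  # literal triple backtick, built this way to keep this snippet fence-safe

# Illustrative message content -- not actual model output.
content = (
    "I can look that up for you.\n"
    + FENCE + "tool_code\n"
    + '{"name": "get_weather", "arguments": {"location": "Singapore"}}\n'
    + FENCE
)

# Extract the first tool_code block from the message content.
match = re.search(FENCE + r"tool_code\s*(\{.*?\})\s*" + FENCE, content, re.DOTALL)
tool_call = json.loads(match.group(1)) if match else None

print(tool_call["name"])  # → get_weather
```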
Llama-SEA-LION-v3-70B-IT
Key Characteristics:
Supports standard OpenAI-style function calling
Uses tool_calls parameter in responses
Requires tools configuration in API request
Works with tool_choice: "auto" setting
API Request Configuration
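A sketch of a standard OpenAI-style request for this model (the model id is an assumption, and get_weather is a hypothetical tool standing in for those from the tool implementation page):

```python
# OpenAI-style function-calling schema for a hypothetical get_weather tool.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location",
            "parameters": {
                "type": "object",
                "properties": {"location": {"type": "string"}},
                "required": ["location"],
            },
        },
    }
]

payload = {
    "model": "aisingapore/Llama-SEA-LION-v3-70B-IT",  # assumed model id
    "messages": [{"role": "user", "content": "What's the weather in Singapore?"}],
    "tools": tools,
    "tool_choice": "auto",  # let the model decide whether a tool call is needed
}
```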
Response Handling
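A sketch of handling the structured tool_calls field: execute each call and append a tool-role result message to the conversation (the tool registry is hypothetical; supply your own implementations):

```python
import json

def handle_tool_calls(message: dict, registry: dict) -> list:
    """Execute each structured tool call and build the tool-result messages
    to append to the conversation history. `registry` maps tool names to
    callables (hypothetical -- supply your own implementations)."""
    results = []
    for call in message.get("tool_calls") or []:
        name = call["function"]["name"]
        args = json.loads(call["function"]["arguments"])  # arguments arrive as a JSON string
        output = registry[name](**args)
        results.append({
            "role": "tool",
            "tool_call_id": call.get("id"),  # ties the result back to the call
            "content": str(output),
        })
    return results
```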
Example Response
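An illustrative response shape, following the OpenAI chat completions format (field values here are made up, not actual model output):

```python
import json

# Illustrative response -- shaped like an OpenAI chat completions response.
response = {
    "choices": [{
        "finish_reason": "tool_calls",
        "message": {
            "role": "assistant",
            "content": None,
            "tool_calls": [{
                "id": "call_0",
                "type": "function",
                "function": {
                    "name": "get_weather",
                    "arguments": '{"location": "Singapore"}',
                },
            }],
        },
    }]
}

call = response["choices"][0]["message"]["tool_calls"][0]
args = json.loads(call["function"]["arguments"])
print(call["function"]["name"], args["location"])  # → get_weather Singapore
```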
Llama-SEA-LION-v3.5-70B-R
Key Characteristics:
Reasoning model without native tool calling capability
Tool calling can be done by parsing the message response
Similar to Gemma-SEA-LION-v4-27B-IT, which handles tool calling via message content
Recommend not adding tools or tool_choice in the API call
API Request Configuration
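A sketch of a request for the reasoning model: no tools or tool_choice keys, with tool instructions carried in the system prompt, and the tool call parsed from the content after the reasoning segment. The model id is an assumption, and the </think> delimiter is an assumption about how the reasoning segment is marked; verify against actual model output:

```python
# Request sketch: no "tools" or "tool_choice" keys for the reasoning model.
payload = {
    "model": "aisingapore/Llama-SEA-LION-v3.5-70B-R",  # assumed model id
    "messages": [
        {"role": "system", "content": "Describe the available tools here, as in the Gemma case."},
        {"role": "user", "content": "What's the weather in Singapore?"},
    ],
}

def strip_reasoning(content: str) -> str:
    """Keep only the text after the reasoning segment, so tool calls the model
    mentions while 'thinking aloud' are not extracted redundantly.
    Assumes the reasoning segment ends with a </think> marker."""
    _, sep, tail = content.partition("</think>")
    return tail.strip() if sep else content.strip()

# Illustrative content, not actual model output:
sample = "<think>The user wants weather, I should call get_weather.</think>\nFinal answer text."
print(strip_reasoning(sample))  # → Final answer text.
```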
Implementation Example
Here's an example that handles all three models, making use of the components provided in the tool implementation page:
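A hedged sketch of such a dispatcher. The model-name checks, the tool_code convention, and the </think> reasoning marker are all assumptions; adapt them to the actual components from the tool implementation page:

```python
import json
import re

def extract_tool_call(model: str, message: dict):
    """Route tool-call extraction by model family. The name checks and the
    tool_code / </think> conventions are assumptions -- adapt as needed."""
    if "Llama-SEA-LION-v3-70B-IT" in model:
        # Structured OpenAI-style tool calls.
        calls = message.get("tool_calls") or []
        if calls:
            fn = calls[0]["function"]
            return {"name": fn["name"], "arguments": json.loads(fn["arguments"])}
        return None

    content = message.get("content") or ""
    if "v3.5" in model and model.endswith("-R"):
        # Reasoning model: only look after the reasoning segment.
        content = content.split("</think>")[-1]

    # Gemma-style text-based tool call wrapped in a tool_code block.
    fence = "`" * 3  # literal triple backtick, kept fence-safe
    match = re.search(fence + r"tool_code\s*(\{.*?\})\s*" + fence, content, re.DOTALL)
    return json.loads(match.group(1)) if match else None

# Illustrative structured message (not actual output):
structured = {"tool_calls": [{"function": {"name": "get_weather",
                                           "arguments": '{"location": "Bangkok"}'}}]}
call = extract_tool_call("aisingapore/Llama-SEA-LION-v3-70B-IT", structured)
print(call["arguments"]["location"])  # → Bangkok
```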
Points to Take Note Of
Model-Specific Considerations
Gemma-SEA-LION-v4-27B-IT:
Typically uses text parsing instead of standard tool calling
System prompt should explicitly define tool call format
tools parameter in API requests is only utilized when tool_choice is set to "required" or a specific tool is enforced
Tool calls are wrapped in ```tool_code blocks
Regex patterns needed for extraction
Llama-SEA-LION-v3-70B-IT:
Fully supports OpenAI-style tool calling
Uses tools and tool_choice in API requests
Returns structured tool_calls in response
Reliable for production tool calling applications
Llama-SEA-LION-v3.5-70B-R:
Reasoning model without native tool calling capability
Tool calling can be done by parsing the message response
Can reason about tool usage
Parse the message content after the reasoning segment to avoid extracting multiple redundant tool calls
Best used for complex reasoning tasks
General Best Practices
Error Handling: Always implement proper error handling for tool execution failures and API timeouts.
Model Detection: Use model name suffixes to determine the appropriate tool calling approach:
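For instance (a heuristic sketch; the suffix checks are assumptions derived from the model names in this guide):

```python
def tool_calling_mode(model: str) -> str:
    """Pick a tool-calling strategy from the model name (heuristic sketch)."""
    if model.endswith("-R"):
        return "text-parsing-after-reasoning"  # e.g. Llama-SEA-LION-v3.5-70B-R
    if "Gemma" in model:
        return "text-parsing"                  # e.g. Gemma-SEA-LION-v4-27B-IT
    return "structured"                        # e.g. Llama-SEA-LION-v3-70B-IT

print(tool_calling_mode("aisingapore/Gemma-SEA-LION-v4-27B-IT"))  # → text-parsing
```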
Timeout Management: Set appropriate timeouts for both LLM API calls and tool execution
Response Validation: Always validate tool call responses before processing:
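For example, a minimal validation sketch to run on a parsed tool call before dispatching it (the allowed-tool registry is hypothetical):

```python
ALLOWED_TOOLS = {"get_weather", "get_time"}  # hypothetical registry

def validate_tool_call(call) -> bool:
    """Check a parsed tool call before executing it."""
    if not isinstance(call, dict):
        return False
    if call.get("name") not in ALLOWED_TOOLS:
        return False  # unknown or disallowed tool
    if not isinstance(call.get("arguments"), dict):
        return False  # arguments must already be decoded to a dict
    return True

print(validate_tool_call({"name": "get_weather", "arguments": {"location": "Hanoi"}}))  # → True
print(validate_tool_call({"name": "rm_rf", "arguments": {}}))  # → False
```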
Conversation Flow: Maintain proper conversation history by adding all messages (user, assistant, tool results) to the messages array.
The Gemma 3 chat template enforces alternating user and assistant roles in the message history; hence, in the example function execute_tool_calls, the role returned for the tool result is user and not tool.
Platform Considerations: Some models may behave differently on different platforms (e.g., Ollama vs cloud APIs). Test your implementation on your target platform.
Token Efficiency: The text-based approach may use more tokens than standard function calling. Monitor usage accordingly.
Security Considerations
Validate all tool parameters before execution
Implement rate limiting for external API calls
Sanitize user inputs that will be passed to tools
Consider implementing tool execution sandboxing for production environments
Performance Optimization
Cache tool results where appropriate (e.g., weather data for short periods)
Implement parallel tool execution when multiple tools are called
Use connection pooling for HTTP requests in tool implementations
Consider implementing tool call batching for efficiency
Relevant Links