Tool Calling
Tool calling is supported in all types of Realtime sessions. For client-sdk usage, tool calling is configured when generating the connection details using startRealtimeSession.
For phone calls, tool calling is configured when creating the phone number attachment.
Responding to Tool Calls In A Realtime Session
For maximum flexibility, the AI in a realtime session does not automatically respond with tool call results.
This behavior was chosen because:
- When a user message triggers multiple tool calls, it's unclear how to inject the results into the assistant's response message.
- Some tools can take a long time to complete. In such cases, you might want the conversation to continue in the meantime.
This means it's up to you, the developer, to respond to tool calls in the Realtime Session. You can "speak" into the session using the speak API.
The following complete example walks through a tool calling flow of reasonable complexity.
Complete Tool Calling Example
A realtime session with id realtime-session-id is running with a send_message tool available to it.
send_message tool definition object
{
  "id": "tool-definition-id",
  "name": "send_message",
  "description": "called when the user asks to send a message to someone",
  "parameters": [
    {
      "name": "recipient",
      "description": "the name of the person to send the message to",
      "type": "string",
      "required": true
    },
    {
      "name": "msg",
      "description": "the message to send",
      "type": "string",
      "required": true
    }
  ],
  "call_settings": {
    "destination": {
      "type": "web_request",
      "url": "https://api.your-backend.com/gabber_tool_calls/send_message"
    }
  }
}
The user says:
"Send a message to Anne saying hello and John saying call me later."
At this point, the Realtime Session does the following sequence:
1. Triggers a webhook of type tool.calls_started with the following payload:
POST Webhook with type tool.calls_started
POST https://api.your-backend.com/gabber_webhook
Content-Type: application/json
X-Webhook-Signature: <hmac-sha256 signature of body signed with Gabber API key>
{
  "type": "tool.calls_started",
  "payload": {
    "realtime_session": "realtime-session-id",
    "group": "tool-call-group-id",
    "tool_calls": [
      {
        "id": "tool-call-id-1",
        "tool_definition_id": "tool-definition-id",
        "function": {
          "name": "send_message",
          "arguments": {
            "recipient": "Anne",
            "msg": "Hello."
          }
        }
      },
      {
        "id": "tool-call-id-2",
        "tool_definition_id": "tool-definition-id",
        "function": {
          "name": "send_message",
          "arguments": {
            "recipient": "John",
            "msg": "Call me later."
          }
        }
      }
    ]
  }
}
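Since the webhook carries an HMAC-SHA256 signature of the body, a backend would normally check it before trusting the payload. The sketch below shows one way to do that in Python. It assumes the signature header is hex-encoded and computed over the raw request bytes; the encoding is not stated here, so treat that as an assumption to confirm. The function names are illustrative only.

```python
import hashlib
import hmac


def compute_signature(body: bytes, api_key: str) -> str:
    """Return the HMAC-SHA256 of the raw body, keyed with the API key.

    Assumption: the signature is transmitted as a lowercase hex string.
    """
    return hmac.new(api_key.encode("utf-8"), body, hashlib.sha256).hexdigest()


def verify_signature(body: bytes, api_key: str, received_signature: str) -> bool:
    """Check a received signature against the one computed locally.

    hmac.compare_digest runs in constant time, which avoids leaking
    how many leading characters of the signature matched.
    """
    expected = compute_signature(body, api_key)
    return hmac.compare_digest(expected, received_signature)
```

Verify against the raw bytes of the request rather than a re-serialized JSON object, because re-serializing can change whitespace or key order and so change the digest.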
2. Makes two POST requests in parallel, one for each tool call:
POST Tool call with message to Anne
POST https://api.your-backend.com/gabber_tool_calls/send_message
Content-Type: application/json
X-Gabber-Tool-Call-Id: tool-call-id-1
X-Gabber-Tool-Call-Group-Id: tool-call-group-id
X-Gabber-Tool-Call-Group-Index: 0
X-Gabber-Tool-Call-Group-Length: 2
X-Gabber-Tool-Definition-Id: tool-definition-id
X-Gabber-Tool-Definition-Name: send_message
X-Gabber-Body-Signature: <hmac-sha256 signature of body signed with Gabber API key>
X-Gabber-Realtime-Session-Id: realtime-session-id
{
  "recipient": "Anne",
  "msg": "Hello."
}
POST Tool call with message to John
POST https://api.your-backend.com/gabber_tool_calls/send_message
Content-Type: application/json
X-Gabber-Tool-Call-Id: tool-call-id-2
X-Gabber-Tool-Call-Group-Id: tool-call-group-id
X-Gabber-Tool-Call-Group-Index: 1
X-Gabber-Tool-Call-Group-Length: 2
X-Gabber-Tool-Definition-Id: tool-definition-id
X-Gabber-Tool-Definition-Name: send_message
X-Gabber-Body-Signature: <hmac-sha256 signature of body signed with Gabber API key>
X-Gabber-Realtime-Session-Id: realtime-session-id
{
  "recipient": "John",
  "msg": "Call me later."
}
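The tool call requests above put their metadata in X-Gabber-* headers and the arguments in the JSON body. The following is a minimal Python sketch of how a handler might collect both into one structure. The ToolCallContext class and the function names are hypothetical, not part of any Gabber SDK; the header names are the ones shown in the requests above.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict, Mapping, Tuple


@dataclass(frozen=True)
class ToolCallContext:
    """Metadata for one tool call, taken from the request headers."""
    tool_call_id: str
    group_id: str
    group_index: int
    group_length: int
    tool_name: str
    session_id: str


def parse_tool_call(
    headers: Mapping[str, str], body: bytes
) -> Tuple[ToolCallContext, Dict[str, Any]]:
    """Split an incoming tool call request into context and arguments."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    h = {name.lower(): value for name, value in headers.items()}
    context = ToolCallContext(
        tool_call_id=h["x-gabber-tool-call-id"],
        group_id=h["x-gabber-tool-call-group-id"],
        group_index=int(h["x-gabber-tool-call-group-index"]),
        group_length=int(h["x-gabber-tool-call-group-length"]),
        tool_name=h["x-gabber-tool-definition-name"],
        session_id=h["x-gabber-realtime-session-id"],
    )
    arguments = json.loads(body)
    return context, arguments


def is_last_in_group(context: ToolCallContext) -> bool:
    """True for the final call of a group (indexes start at 0, as shown above)."""
    return context.group_index == context.group_length - 1
```

Because the calls in a group arrive in parallel, the group index says where a call sits in the group, not the order in which requests reach your server.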
3. Triggers a webhook of type tool.calls_finished
NOTE: This webhook is triggered only after all of the tool calls have finished and returned a response.
POST Webhook with type tool.calls_finished
POST https://api.your-backend.com/gabber_webhook
Content-Type: application/json
X-Webhook-Signature: <hmac-sha256 signature of body signed with Gabber API key>
{
  "type": "tool.calls_finished",
  "payload": {
    "realtime_session": "realtime-session-id",
    "group": "tool-call-group-id",
    "tool_calls": [
      {
        "id": "tool-call-id-1",
        "tool_definition_id": "tool-definition-id",
        "function": {
          "name": "send_message",
          "arguments": {
            "recipient": "Anne",
            "msg": "Hello."
          }
        }
      },
      {
        "id": "tool-call-id-2",
        "tool_definition_id": "tool-definition-id",
        "function": {
          "name": "send_message",
          "arguments": {
            "recipient": "John",
            "msg": "Call me later."
          }
        }
      }
    ],
    "tool_call_results": [
      {
        "tool_call_id": "tool-call-id-1",
        "tool_definition_id": "tool-definition-id",
        "code": 200,
        "response_string": "OK"
      },
      {
        "tool_call_id": "tool-call-id-2",
        "tool_definition_id": "tool-definition-id",
        "code": 200,
        "response_string": "OK"
      }
    ]
  }
}
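The tool.calls_finished payload lists calls and results in separate arrays, so a handler usually needs to match each result to its call by id. The Python sketch below shows one way to do that. It assumes that code is the HTTP status returned by your tool endpoint and that any 2xx value counts as success; neither is stated explicitly here. The function names are illustrative.

```python
from typing import Any, Dict, List, Optional, Tuple

ToolCall = Dict[str, Any]
ToolResult = Dict[str, Any]


def pair_calls_with_results(
    payload: Dict[str, Any]
) -> List[Tuple[ToolCall, Optional[ToolResult]]]:
    """Match every tool call with its result, using tool_call_id as the key.

    A call with no matching result is paired with None.
    """
    results_by_id = {
        result["tool_call_id"]: result
        for result in payload.get("tool_call_results", [])
    }
    return [(call, results_by_id.get(call["id"])) for call in payload["tool_calls"]]


def failed_call_ids(payload: Dict[str, Any]) -> List[str]:
    """Return the ids of calls that have no result or a non-2xx code.

    Assumption: "code" is an HTTP status and 2xx means success.
    """
    failed = []
    for call, result in pair_calls_with_results(payload):
        if result is None or not 200 <= result["code"] < 300:
            failed.append(call["id"])
    return failed
```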
Example Backend
In response to step three, you might want to send a response from the "assistant" like "I sent those messages to Anne and John".
Here's an example of how you might do that using the "speak" API.
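A minimal Python sketch of such a backend handler follows. It reads the tool.calls_finished payload, builds a sentence for the assistant, and passes it to a speak callable. That callable is a hypothetical stand-in for a request to the speak API, whose endpoint and parameters are not shown in this section. Treating 2xx codes as success is also an assumption.

```python
from typing import Any, Callable, Dict, List, Optional


def join_names(names: List[str]) -> str:
    """Join names as natural English: 'Anne', 'Anne and John', 'A, B, and C'."""
    if len(names) <= 2:
        return " and ".join(names)
    return ", ".join(names[:-1]) + ", and " + names[-1]


def build_reply(payload: Dict[str, Any]) -> str:
    """Turn the results of a send_message tool call group into one sentence."""
    # Assumption: "code" is an HTTP status and 2xx means the call succeeded.
    ok_ids = {
        result["tool_call_id"]
        for result in payload["tool_call_results"]
        if 200 <= result["code"] < 300
    }
    sent, failed = [], []
    for call in payload["tool_calls"]:
        recipient = call["function"]["arguments"]["recipient"]
        (sent if call["id"] in ok_ids else failed).append(recipient)

    parts = []
    if sent:
        noun = "that message" if len(sent) == 1 else "those messages"
        parts.append(f"I sent {noun} to {join_names(sent)}.")
    if failed:
        parts.append(f"I couldn't send a message to {join_names(failed)}.")
    return " ".join(parts)


def handle_webhook(
    event: Dict[str, Any], speak: Callable[[str, str], None]
) -> Optional[str]:
    """Respond to a tool.calls_finished webhook by speaking into the session.

    `speak(session_id, text)` is a hypothetical wrapper around the speak API.
    Returns the spoken text, or None when the event is ignored.
    """
    if event.get("type") != "tool.calls_finished":
        return None
    payload = event["payload"]
    text = build_reply(payload)
    if text:
        speak(payload["realtime_session"], text)
    return text
```

Passing speak in as a parameter keeps the handler independent of how the speak API is actually called, and makes it easy to exercise with a stub.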