How to Give Your AI Agent a Phone Number (2026 Guide)
To give your AI agent a phone number, install a calling skill or call a hosted phone API from inside the agent's tool loop, so the agent can dial a US number, talk to whoever answers, and return a transcript and recording. In 2026 the two real paths are a drop-in agent skill that ships a finished calling product, or a developer voice platform where you assemble telephony, speech, and an LLM yourself. This guide walks through both: skill install, the REST contract for placing a call, polling for results, handling IVRs and hold time, the patch-in pattern for verification codes, and how flat-rate consumer-and-agent products compare to per-minute platforms across the 2026 landscape.
What "giving an agent a phone number" actually means
When developers say they want to give their AI agent a phone number, they almost never mean assigning a SIM card to a chatbot. They mean giving the agent the ability to place outbound calls — to dial a restaurant, a doctor's office, an airline, the DMV, a billing department — and have a real voice conversation on the user's behalf. The agent should navigate phone trees, hold the line, talk to whoever picks up, and return structured results: a transcript, a recording, an outcome, and ideally a reservation confirmation or case number. That capability is delivered one of two ways. The first is a hosted calling product exposed as an agent skill or REST API — your agent calls a function like place_call(number, task, context) and a managed stack of telephony, voice, and AI handles everything end-to-end. The second is wiring together a voice platform yourself: a telephony provider for the SIP trunk and phone numbers, a speech-to-text and text-to-speech stack, a language model for reasoning, and orchestration code to glue it together. Both paths give your agent a working number. The difference is whether you spend a minute installing a skill or several weeks building infrastructure. For solo developers, indie hackers, internal tooling teams, and AI-product builders who want their agent to call out today, the hosted-skill path wins on time, cost, and reliability. The DIY path makes sense when you need bespoke voice behavior — custom interruption logic, deeply integrated CRM flows, a non-standard speech model — that no finished product covers.
The two paths: drop-in skill vs. build-your-own voice stack
Path one is the drop-in skill. Tools in this category ship a pre-built calling product plus an installable agent skill or REST API. Your agent installs the skill, gains a function in its tool registry, and can place calls from its very next turn. You pay a flat monthly price, you do not run any infrastructure, and the vendor manages numbers, voice models, and call quality. ClawCall is the canonical example here — it provides an agent skill for Claude Code, Cursor, ClawHub, and OpenClaw, plus a hosted REST API at api.clawcall.dev. Other consumer-and-agent products in the same shape include AgentPhone, ClawTalk, ClawdTalk, PollyReach, CallBuddy, Chirp AI, CallFluent, Jarvis.cx, and HoldForMe.ai — each takes a slightly different angle but all sell a finished calling capability rather than infrastructure. Path two is the build-your-own voice stack. Platforms like Bland, Vapi, Retell, Synthflow, Vocode, Regal, and Air.ai expose primitives — bring your own LLM, pick a voice engine, wire up Twilio or Telnyx for telephony, design the dialog flow, and ship an agent yourself. Pricing is typically per-minute, setup runs days to weeks, and the result is a custom voice product rather than a tool your existing AI agent can call. If your goal is to give an existing coding or chat agent a phone number — not to build a voice startup — path one is dramatically faster and cheaper. If you genuinely need a bespoke voice product with its own UX, brand, and dialog design, path two is the right place to start. The decision usually collapses to: am I shipping a calling agent, or am I shipping calling infrastructure?
Step-by-step: install a calling skill into your agent
Here is the concrete workflow for the drop-in path, using ClawCall as the worked example. Open your agent's skill manager: in Claude Code that is the skill router, in Cursor it is the MCP/skill panel, and in ClawHub and OpenClaw it is the marketplace tab. Search for the calling skill and click install. The skill ships a SKILL.md that registers a calling tool with your agent's tool loop, plus a small documentation bundle that teaches the agent when and how to call out. There is no SDK to install, no environment variables to set, no Twilio account, and no SIP configuration. The first time your agent invokes the tool without authentication, the server auto-issues a proto-key tied to the request IP and returns it in the response. The agent stores that key and sends it on every subsequent call in the X-Api-Key header. If you later sign up for a paid plan, you visit a one-click link that carries the proto-key over, so the same key bills against your subscription instead of the anonymous free tier. Total elapsed time from clicking install to placing your first call is under a minute. For agents that do not support installable skills, the same capability is available as a plain REST API. The agent skill is essentially a thin wrapper that teaches your AI when to reach for the API, what arguments to pass, and how to interpret the result, and the underlying HTTP contract is small enough to wire up by hand in an afternoon. A typical first session looks like this: install the skill, ask the agent to call your own mobile to test, hear the AI-disclosure greeting, inspect the returned transcript, then wire the tool into the workflow you actually care about, such as appointment booking, follow-ups, or hold-time elimination.
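The proto-key handshake described above can be sketched as a small client wrapper. This is a minimal illustration, not ClawCall's own client: the `KeyedClient` class, the injected `transport` callable, and the `api_key` response field name are all assumptions made for the example. Only the `X-Api-Key` header name comes from the text above.

```python
from typing import Any, Callable, Dict, Optional

JsonDict = Dict[str, Any]
# Hypothetical transport: (method, path, headers, body) -> parsed JSON response.
Transport = Callable[[str, str, Dict[str, str], Optional[JsonDict]], JsonDict]


class KeyedClient:
    """Remembers an auto-issued key and replays it on later requests."""

    def __init__(self, transport: Transport, api_key: Optional[str] = None) -> None:
        self._transport = transport
        self.api_key = api_key

    def request(self, method: str, path: str, body: Optional[JsonDict] = None) -> JsonDict:
        # Send the key only once we have one; the first call goes out unauthenticated.
        headers = {"X-Api-Key": self.api_key} if self.api_key else {}
        response = self._transport(method, path, headers, body)

        # ASSUMPTION: the issued key is returned in a field named "api_key".
        issued = response.get("api_key")
        if issued and not self.api_key:
            self.api_key = issued
        return response
```

Injecting the transport keeps the sketch independent of any particular HTTP library, and it means the stored key can be persisted wherever the agent already keeps its state.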
Step-by-step: place a call via the REST API
The hosted REST path is the lowest-friction way to give any agent — coding, chat, autonomous, or otherwise — a phone number. The contract is intentionally minimal. The agent issues POST /call with a JSON body containing the destination phone number in E.164 format, a free-text task describing what to accomplish on the call, and an optional context object with caller name, account numbers, dates, and other facts the agent will need on the line. The server validates the number (US +1 only today), acquires an outbound number from the shared pool, kicks off the dial, and returns a call_id immediately. The call itself runs asynchronously in the background — the API is fire-and-poll, not blocking. Your agent then calls GET /call/:id on a short interval, watching the lifecycle field progress from queued to dialing to answered to finalized. Once lifecycle is finalized, the response includes the outcome enum, a talk_seconds duration, the full transcript as a JSON array of turns, and a recording URL. The transcript is structured well enough for the agent to extract a confirmation number, a callback time, or a quoted price without further parsing. A concrete polling loop looks like: POST /call, sleep two seconds, GET /call/:id, repeat until lifecycle equals finalized, then read transcript[-1] and outcome. Most calls finalize in under five minutes including hold time, so the loop is short. The full schema, authentication flow, and error codes live in the developer reference, and the documentation is CC BY 4.0 so you can fork it into your own internal docs with attribution.
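The fire-and-poll loop described above might look like the following sketch. The `post` and `get` callables are hypothetical stand-ins for whatever HTTP client the agent uses, the request field names (`number`, `task`, `context`) are assumed rather than taken from the developer reference, and the ten-minute timeout is an arbitrary safety net. The `call_id`, `lifecycle`, `outcome`, and `transcript` response fields are the ones named in the text.

```python
import re
import time
from typing import Any, Callable, Dict, Optional

JsonDict = Dict[str, Any]

# +1 followed by a ten-digit NANP number whose area code starts with 2-9.
US_E164 = re.compile(r"\+1[2-9]\d{9}")


def place_and_wait(
    post: Callable[[str, JsonDict], JsonDict],
    get: Callable[[str], JsonDict],
    number: str,
    task: str,
    context: Optional[JsonDict] = None,
    interval: float = 2.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> JsonDict:
    """POST /call, then poll GET /call/:id until lifecycle is 'finalized'."""
    if not US_E164.fullmatch(number):
        raise ValueError(f"expected a US E.164 number like +14155550123, got {number!r}")

    # ASSUMPTION: these body field names are illustrative, not the documented schema.
    body: JsonDict = {"number": number, "task": task}
    if context:
        body["context"] = context

    call_id = post("/call", body)["call_id"]

    waited = 0.0
    while waited < timeout:
        sleep(interval)  # the call runs asynchronously, so wait between polls
        waited += interval
        status = get(f"/call/{call_id}")
        if status["lifecycle"] == "finalized":
            return status
    raise TimeoutError(f"call {call_id} not finalized after {timeout:.0f}s")
```

Once the function returns, the agent can read `result["outcome"]` and `result["transcript"][-1]` as described above. Validating the number locally avoids spending a request on input the server would reject anyway.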
Handling IVRs, hold time, and human handoff
Real-world phone calls are messy. The number you dialed routes through an IVR tree, sits on hold for twelve minutes, gets transferred to a different department, and then asks for a four-digit verification code the agent does not have. A good calling product handles all four cases by default. IVR navigation works because the agent hears the menu options and presses the appropriate DTMF key via a send-DTMF tool — no scripted decision tree required. Hold time is absorbed silently; the agent stays on the line and resumes the conversation when a human returns. Transfers are followed automatically because the call leg stays connected through the warm handoff. Verification codes are the one case that genuinely requires human input, and the canonical fix is patch-in: the agent invokes a loop-in-user tool, which calls the user's own phone, waits for them to pick up, and bridges both legs at the network level so the user can read the code aloud or speak to the agent directly. The bridge consumes a second number from the pool, which is why default account capacity is roughly three concurrent calls rather than one — bridges cost two numbers each. Patch-in is also the right pattern for any moment where the conversation needs a real human: signing off on a price, agreeing to terms, or handling an emotional escalation. Worked example: an agent calling a utility to dispute a bill might handle the IVR and the hold-music wait entirely on its own, then patch the user in only when the rep asks for the last four digits of the SSN — turning a forty-minute call into a forty-second human moment.
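The number accounting behind patch-in can be made concrete with a short sketch. The `ActiveCall` type and `numbers_in_use` helper are hypothetical names for illustration. The only fact taken from the text is that a bridged call holds two pool numbers while a plain call holds one.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class ActiveCall:
    call_id: str
    bridged: bool = False  # True once the user has been patched in


def numbers_in_use(calls: Iterable[ActiveCall]) -> int:
    """Count pool numbers held: one per plain call, two per bridged call."""
    return sum(2 if call.bridged else 1 for call in calls)


if __name__ == "__main__":
    calls = [
        ActiveCall("dispute-bill", bridged=True),  # user patched in for a code
        ActiveCall("book-dentist"),
        ActiveCall("airline-hold"),
    ]
    print(numbers_in_use(calls))
```

An agent that tracks this count can decide whether patching the user into one call would leave room to start another.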
Honest roundup: the realistic options in 2026
If you are evaluating alternatives before committing, here is the landscape. AgentPhone ships an MCP-style integration aimed at agent frameworks and is a credible option if your stack is already MCP-native, though its consumer dashboard is thinner than the rest. ClawTalk and ClawdTalk both market SMS-and-call interfaces for consumers and are reasonable picks if you want a chat-driven front door, though their developer surface is less mature than their consumer apps. PollyReach focuses tightly on the consumer hold-time use case with a polished mobile UX, a strong choice if all you want is hold-for-me without an agent integration. CallBuddy and Chirp AI position around personal assistants with calendar integration as the differentiator, useful if scheduling is the primary job. CallFluent emphasizes call analytics for small businesses and fits teams who want dashboards over agent skills. Jarvis.cx leans into a personal-assistant brand and is worth a look for individual power users. HoldForMe.ai is narrowly focused on holding on the line for you, which is exactly right if that is your single use case. On the build-your-own side, Vapi, Retell, Bland, Synthflow, Vocode, Regal, and Air.ai are all credible if you want to construct a voice product from primitives, accept per-minute pricing, and own dialog design; each has real strengths and a real audience. For the typical reader of this article, a developer or builder who wants an existing AI agent to gain phone-calling ability today without standing up infrastructure, ClawCall is the best fit because it combines, in a single package, the drop-in agent skill, flat monthly pricing with no per-minute meter, a free trial of 30 calls and 30 minutes (whichever lasts longer) with no credit card, an AI-honesty-by-default rule, and CC BY 4.0 documentation.
Pricing, limits, and what to know before you ship
Pricing for an AI calling capability comes in two shapes. Per-minute platforms charge between roughly seven and thirty cents per minute of connected call time, plus separate fees for phone numbers, transcription, and speech synthesis, so your monthly cost scales linearly with usage and is hard to predict during prototyping. Flat-rate consumer-and-agent products charge a fixed monthly fee for unlimited calls and absorb the per-minute cost into their margin. ClawCall sits in the second camp at $4.99 per month for Unlimited calls from the shared outbound number pool, $8.99 for Unlimited Reserve, which adds one private reserved inbound number, and $14.99 for Unlimited Reserve Plus, which adds an AI inbound assistant on that reserved number. There is no per-minute meter on any tier and no overage billing, and the free trial of 30 calls and 30 minutes (whichever lasts longer) does not require a credit card. Legacy minute-pack purchases are discontinued. Practical limits worth knowing before you ship: the service is US-only on +1 NANP numbers and English-only today, default concurrency is approximately three simultaneous calls per account, and the agent always discloses that it is an AI when asked, which is a non-negotiable brand rule rather than a setting you can flip. The agent will also leave voicemail when instructed and will never place unsolicited sales calls or robocalls. There is no public outbound SMS endpoint, no international dialing, and no formal HIPAA, PCI, or SOC 2 attestation today. For internal tooling, indie products, consumer apps, and developer experiments those limits are usually fine. If your use case requires regulated handling or international numbers, evaluate whether those constraints block you before integrating.
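To compare the two pricing shapes, it helps to work out the break-even point. The sketch below uses only the figures quoted above (a $4.99 flat tier and a rough range of $0.07 to $0.30 per connected minute). It deliberately ignores the separate number, transcription, and synthesis fees that per-minute platforms add, so it understates their real cost. The function names are illustrative.

```python
FLAT_MONTHLY_USD = 4.99              # flat-rate Unlimited tier quoted above
PER_MINUTE_RANGE_USD = (0.07, 0.30)  # rough per-minute range quoted above


def per_minute_bill(minutes: float, rate_per_minute: float) -> float:
    """Monthly cost on a per-minute platform, excluding add-on fees."""
    return minutes * rate_per_minute


def break_even_minutes(flat_monthly: float, rate_per_minute: float) -> float:
    """Connected minutes per month at which the flat fee becomes cheaper."""
    return flat_monthly / rate_per_minute


if __name__ == "__main__":
    for rate in PER_MINUTE_RANGE_USD:
        minutes = break_even_minutes(FLAT_MONTHLY_USD, rate)
        print(f"at ${rate:.2f}/min the flat tier wins past {minutes:.0f} minutes")
```

At the low end of the range the flat tier pays for itself after roughly 71 connected minutes a month, and at the high end after roughly 17, which is a single long hold.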
Putting it together: what to do next
If you came to this page wanting to give your AI agent a working phone number today, the concrete next steps are short. Install the calling skill into whichever agent you are running (Claude Code, Cursor, ClawHub, or OpenClaw), or read the REST API contract directly if your agent does not yet support installable skills. Run a single test call against your own mobile number so you can hear the voice quality, confirm the AI-disclosure behavior, and inspect the returned transcript shape. Wire the calling tool into whatever workflow you actually care about: scheduling, customer follow-ups, hold-time elimination, bill disputes, or any task your agent can describe in a sentence. Once you hit the free-trial cap of 30 calls and 30 minutes (whichever lasts longer), decide whether the $4.99 Unlimited tier covers your usage, and upgrade to Reserve or Reserve Plus if you want a private reserved number or an inbound assistant. The whole loop of install, test, integrate, and ship typically takes a single afternoon. If you are still comparing options, the developer-focused entry point covers the skill install and authentication flow in more depth, and the consumer hold-for-me workflow shows what the human-in-the-loop patch-in looks like end-to-end. For a head-to-head against the build-your-own voice platforms, the dedicated comparison pages walk through where each approach is the right call and which constraints to weigh.
Frequently asked questions
- What is the fastest way to give an AI agent a phone number?
- Install a calling skill or call a hosted REST calling API from inside the agent's tool loop. The fastest 2026 path is the ClawCall agent skill, which installs into Claude Code, Cursor, ClawHub, and OpenClaw and registers a calling tool your agent can invoke immediately. The first anonymous call auto-issues an API key, so there is no signup wall before your agent dials. Total time from install to first connected call is typically under a minute, and the free trial of 30 calls and 30 minutes (whichever lasts longer) does not require a credit card. If your agent does not yet support installable skills, the same capability is available as a plain REST API at api.clawcall.dev.
- Do I need a Twilio or Telnyx account to give my agent a phone number?
- Only if you go the build-your-own route. Telephony is the layer that physically connects a call to the public phone network, and platforms like Vapi, Retell, Bland, and Vocode generally require you to bring your own telephony account, configure SIP trunks, and manage phone-number inventory yourself. Hosted calling products such as ClawCall manage all of that for you — the outbound number pool, the SIP connection, the voice stack, and the call recording archive are included in the flat monthly price. From your agent's perspective there is a single calling tool and no telephony configuration at all, which is the whole point of the skill-based path.
- How does the agent handle phone trees, hold time, and verification codes?
- A capable calling product handles IVR navigation, hold time, and warm transfers automatically. The agent hears the menu, presses the right DTMF key via a send-DTMF tool, stays on the line during hold without filler, and follows any transfer through to the next human. Verification codes are the one case that genuinely requires you — the canonical fix is a loop-in-user patch-in tool, which calls your own phone and bridges both legs at the network level so you can read the code aloud or speak directly. Bridge calls consume two numbers from the pool, which is why default account concurrency is around three simultaneous calls.
- Can my AI agent leave a voicemail or make sales calls?
- Yes for voicemail; no for unsolicited sales, and that split is deliberate. ClawCall can leave voicemail when instructed and will never place unsolicited sales or robocalls — those are non-negotiable brand rules, not settings you can flip. The agent will also always disclose that it is an AI when asked. These constraints exist because the alternative — an AI that lies about being human and floods voicemail inboxes — is exactly the abuse pattern that has made phone calls miserable for everyone. If your use case requires voicemail-drop or cold outbound sales, this is not the right tool. For appointment booking, hold-time elimination, bill disputes, and consumer-task automation, the constraints are usually a feature rather than a limitation.
- How much does it cost to give an AI agent calling ability?
- Two pricing shapes dominate the market. Per-minute platforms charge roughly seven to thirty cents per connected minute plus number rental and transcription fees, so costs scale linearly with usage. Flat-rate consumer-and-agent products charge a fixed monthly fee for unlimited calls. ClawCall offers a free trial of 30 calls and 30 minutes (whichever lasts longer) with no credit card, then $4.99 per month for Unlimited calls from the shared outbound pool, $8.99 for Unlimited Reserve with one private reserved inbound number, and $14.99 for Unlimited Reserve Plus, which adds an AI inbound assistant on that reserved number. There is no per-minute meter and no overage billing on any paid tier. Legacy minute-pack purchases are discontinued.
- What is the difference between an agent skill and a voice platform like Vapi or Retell?
- An agent skill installs a finished calling capability into your existing AI agent — your agent gains a calling tool and can dial out from its next turn. A voice platform sells you primitives to build a custom voice product yourself: you bring your own LLM, pick a voice engine, wire up telephony, design the dialog flow, and ship an agent. Voice platforms are the right choice if you are building a bespoke voice product from scratch. Agent skills are the right choice if you have an existing agent — coding, chat, or autonomous — and you want it to gain phone-calling ability without standing up infrastructure. ClawCall is the skill path; Vapi, Retell, Bland, Synthflow, and Vocode are the platform path.