Setting up a SIP trunk for Voice AI: step-by-step
Setting up a SIP trunk for Voice AI involves five stages: choosing and configuring your SIP provider, setting up your Voice AI platform to accept SIP, configuring credentials and codec settings, testing the call path end-to-end, and validating under load before go-live. This guide walks through every step using Twilio as the SIP provider and Vapi as the Voice AI platform - the most common combination in production deployments - with the exact settings that work and the errors you will encounter along the way.
The SIP trunk configuration is the part of a Voice AI project that looks simple in the documentation and takes three times longer than expected in practice. Not because the technology is fundamentally difficult - it is not - but because the failure modes are silent, the error messages are cryptic, and the gap between a correct-looking configuration and a working one can be a single codec setting or a missing firewall rule that nobody mentioned.
This guide covers the complete setup process - not the theory of how SIP works, but the actual steps to get a working SIP trunk connected to a Voice AI platform. Every setting shown here is taken from configurations that work in production deployments. Every error described is one I have personally encountered and debugged.
Step 1 - Create a SIP trunk in Twilio
Log in to your Twilio console at console.twilio.com. In the left sidebar, navigate to Voice → SIP Trunking → Trunks. Click Create new SIP Trunk. Give it a descriptive name - something like voice-ai-production or vapi-integration. Click Create.
You now have an empty SIP trunk. The next steps configure it to route calls to your Voice AI platform and to allow calls from your Voice AI platform to go out to the PSTN.
Step 2 - Configure origination (inbound calls to your AI)
Origination settings tell Twilio where to send inbound calls - i.e. what SIP URI your Voice AI platform is listening on. Inside your new SIP trunk, click the Origination tab.
Click Add new Origination URI. You need the SIP URI from your Voice AI platform. In Vapi, this is found under Settings → SIP → Origination SIP URI. It will look something like:
Paste this URI into the Origination URI field. Set the Priority to 10 and Weight to 10 - these are the defaults and are fine for a single-trunk setup. Click Add.
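Before pasting the URI, it can be worth checking mechanically that it parses and carries the transport parameter your platform expects. The following is a minimal sketch, not a full RFC 3261 parser: the regular expression is deliberately simplified, and the host sip.example.com used in the usage note is a placeholder, not a real platform address.

```python
import re

# Simplified pattern for sip:/sips: URIs. It does not cover every RFC 3261
# form (no IPv6 literals, no "?header=value" suffix).
SIP_URI = re.compile(
    r"^(?P<scheme>sips?):"
    r"(?:(?P<user>[^@;]+)@)?"
    r"(?P<host>[A-Za-z0-9.-]+)"
    r"(?::(?P<port>\d+))?"
    r"(?P<params>(?:;[^;=]+(?:=[^;]*)?)*)$"
)


def parse_sip_uri(uri: str) -> dict:
    """Split a SIP URI into scheme, user, host, port and a params dict."""
    match = SIP_URI.match(uri.strip())
    if match is None:
        raise ValueError(f"not a recognisable SIP URI: {uri!r}")
    params = {}
    for item in filter(None, match.group("params").split(";")):
        key, _, value = item.partition("=")
        params[key.lower()] = value.lower()
    return {
        "scheme": match.group("scheme"),
        "user": match.group("user"),
        "host": match.group("host"),
        "port": int(match.group("port")) if match.group("port") else None,
        "params": params,
    }


def transport_matches(uri: str, expected: str) -> bool:
    """True only if the URI explicitly carries the expected transport."""
    return parse_sip_uri(uri)["params"].get("transport") == expected.lower()
```

For example, transport_matches("sip:agent@sip.example.com;transport=udp", "tcp") returns False, which is the cue to stop and check the platform documentation before saving the URI. A URI with no transport parameter at all also returns False, on the assumption that an explicit setting is safer than relying on a default.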
Common mistake: setting transport=udp instead of transport=tcp. Many Voice AI platforms require TCP transport for reliability. If your SIP INVITE is sent over UDP and the platform expects TCP, calls will fail silently - no error, just no answer. Always confirm the required transport with your Voice AI platform's documentation before setting this.
Step 3 - Configure termination (outbound calls from your AI)
Termination settings allow your Voice AI platform to make outbound calls through Twilio - dialling real phone numbers via the PSTN. Click the Termination tab on your SIP trunk.
You will see a Termination SIP URI field. Twilio auto-generates a unique domain for your trunk - it looks like:
Copy this URI - you will paste it into your Voice AI platform in Step 4. This is the address your platform will use when initiating outbound calls.
Now configure authentication. Under Authentication click Add credential list. Create a new credential with a username and a strong password - this is what your Voice AI platform will use to authenticate with Twilio when making outbound calls. Note these credentials down - you will need them in Step 4.
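For the password itself, a generated value is safer than one typed by hand. Here is a small sketch using Python's standard secrets module. The 12-character minimum and the requirement for lower case, upper case and a digit are assumptions made for illustration; check the exact password rules your provider enforces.

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits


def generate_sip_password(length: int = 24) -> str:
    """Return a random password with lower case, upper case and a digit."""
    if length < 12:  # assumed minimum, not a confirmed provider rule
        raise ValueError("use at least 12 characters")
    while True:
        candidate = "".join(secrets.choice(ALPHABET) for _ in range(length))
        # Retry until all three character classes are present.
        if (any(c.islower() for c in candidate)
                and any(c.isupper() for c in candidate)
                and any(c.isdigit() for c in candidate)):
            return candidate
```

Store the result in a secrets manager rather than a chat message or ticket, since the same value has to be entered on the Voice AI platform side later.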
Under IP Access Control Lists, add the IP addresses of your Voice AI platform's servers. For Vapi, these are published in their documentation under SIP configuration. Adding these IP addresses allows Twilio to accept SIP requests from your platform without username/password authentication on inbound traffic - which reduces latency on the SIP handshake.
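When a call is rejected, one quick check is whether the source address seen in the SIP logs actually falls inside the ranges you entered. The sketch below uses Python's standard ipaddress module. The addresses in the usage note are documentation-only ranges (203.0.113.0/24 and 198.51.100.0/24), not any platform's real IPs.

```python
import ipaddress


def is_allowed(source_ip: str, allowlist: list[str]) -> bool:
    """Check whether source_ip falls inside any CIDR entry of the allowlist."""
    address = ipaddress.ip_address(source_ip)
    return any(
        # strict=False accepts entries like "203.0.113.5/24" with host bits set
        address in ipaddress.ip_network(entry, strict=False)
        for entry in allowlist
    )
```

With allowlist = ["203.0.113.0/24", "198.51.100.7/32"], the call is_allowed("203.0.113.42", allowlist) returns True and is_allowed("192.0.2.1", allowlist) returns False. A False result for an address you see in the logs usually means the platform has added or changed an IP since you built the list.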
Step 4 - Connect your Voice AI platform to the SIP trunk
Now configure Vapi (or your Voice AI platform of choice) to use your Twilio SIP trunk. In Vapi, navigate to Settings → Phone Numbers → Import. Select SIP Trunk as the type.
After saving these settings, go back to your Twilio SIP trunk and attach your phone number to it. In Twilio Console navigate to Phone Numbers → Manage → Active Numbers. Click your number. Under Voice & Fax → A CALL COMES IN, change the setting from Webhook to SIP Trunk and select the trunk you just created. Save.
The codec setting is where most teams lose the most unplanned debugging time. Twilio defaults to G.711 PCMU. Some Voice AI platforms default to G.722 or Opus. When the codec on one end does not match the other, the call connects - you can see it in the SIP logs as a 200 OK - but there is no audio. Complete silence on an otherwise successful call is almost always a codec mismatch; audio in one direction only more often points to a NAT or firewall problem on the RTP path.
What I do now: I explicitly set PCMU (G.711) as the only accepted codec on both sides during initial setup - even if the platform supports better codecs. Once the call path is confirmed working end-to-end, I then test enabling Opus or G.722 if the platform supports it. Narrowing the codec to one option first eliminates the most common failure mode before introducing variables.
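To see what each side is actually offering, you can pull the SDP bodies out of the INVITE and the 200 OK in your SIP logs and compare the codec lists. The following is a rough sketch of that comparison. It handles only a single audio section, knows just three static payload types, and ignores telephone-event (DTMF), so treat it as a debugging aid and not a complete SDP parser.

```python
# Static RTP payload types that may appear without an a=rtpmap line.
STATIC_PAYLOADS = {0: "PCMU", 8: "PCMA", 9: "G722"}


def audio_codecs(sdp: str) -> list[str]:
    """Return codec names from the m=audio line, in order of preference."""
    payloads: list[int] = []
    rtpmap: dict[int, str] = {}
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("m=audio"):
            # Format: m=audio <port> <proto> <payload type> <payload type> ...
            payloads = [int(pt) for pt in line.split()[3:]]
        elif line.startswith("a=rtpmap:"):
            pt, _, encoding = line[len("a=rtpmap:"):].partition(" ")
            rtpmap[int(pt)] = encoding.split("/")[0].upper()
    names = []
    for pt in payloads:
        name = rtpmap.get(pt) or STATIC_PAYLOADS.get(pt)
        if name and name != "TELEPHONE-EVENT":
            names.append(name)
    return names


def common_codecs(offer_sdp: str, answer_sdp: str) -> list[str]:
    """Codecs present on both sides, keeping the offerer's order."""
    theirs = set(audio_codecs(answer_sdp))
    return [codec for codec in audio_codecs(offer_sdp) if codec in theirs]
```

An empty list from common_codecs is the situation described above: one side offers only PCMU, the other only Opus or G.722, and there is nothing to negotiate.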
Step 5 - Test the call path end-to-end
Never skip this step. A configuration that looks correct in the console can fail for reasons that only become visible when a real call is placed. Run these four tests in order before declaring the integration complete.
Test 1 - Inbound call. Dial your Twilio number from a mobile phone. The call should connect and your Voice AI agent should answer. If you hear a fast busy signal, the SIP origination URI is wrong or Twilio cannot reach your platform. If you hear silence, it connected but the codec is mismatched. If you hear the Twilio error message "we are sorry, an application error has occurred", check your Twilio number's voice configuration - it may not be pointing to your SIP trunk yet.
Test 2 - Outbound call. Use your Voice AI platform to initiate an outbound call to your own mobile number. The call should ring and, when you answer, you should hear your AI agent. If the call fails to initiate, check the termination SIP URI and credential authentication settings. If the call connects but you cannot hear the AI, again suspect a codec mismatch.
Test 3 - Full conversation. Run your twelve-turn test script - a complete representative conversation from greeting to resolution. Measure per-turn latency using your platform's call logs. Confirm that VAD is correctly detecting end-of-utterance without cutting you off or waiting too long. This is also where you test the human escalation transfer - initiate a transfer to a real phone number mid-conversation and confirm it completes cleanly.
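Once you have the per-turn latencies out of the call logs, a short summary makes it easier to compare one run against the next. This sketch assumes you have already extracted one latency value per turn in milliseconds; how you get those numbers depends on your platform's log format, which is not covered here. The 95th percentile uses the simple nearest-rank method.

```python
import math
import statistics


def summarize_latency(turn_latencies_ms: list[float]) -> dict:
    """Median, nearest-rank 95th percentile and maximum of per-turn latency."""
    if not turn_latencies_ms:
        raise ValueError("no turns to summarize")
    ordered = sorted(turn_latencies_ms)
    rank = max(1, math.ceil(0.95 * len(ordered)))  # nearest-rank p95
    return {
        "turns": len(ordered),
        "median_ms": statistics.median(ordered),
        "p95_ms": ordered[rank - 1],
        "max_ms": ordered[-1],
    }
```

With only twelve turns the 95th percentile is the same as the slowest turn, so the median and the maximum are the two numbers worth watching at this stage. The percentile becomes more useful once you aggregate several test calls.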
Test 4 - Concurrent load. Place five to ten simultaneous calls through the same SIP trunk and Voice AI platform. Measure whether latency degrades materially under concurrent load. Check Twilio's concurrent call limits on your account - free and trial accounts have very low concurrency limits that will cause calls to fail under even light load. Upgrade your Twilio account before any production deployment.
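If your platform exposes an API for starting calls, the simultaneous calls can be launched from a small script instead of by hand. The harness below is only a sketch: place_call is a hypothetical coroutine that you would write yourself around your platform's outbound call API, and fake_call is a stand-in so the harness can be run without a real trunk.

```python
import asyncio
import time


async def timed_call(place_call, index: int) -> dict:
    """Run one call attempt and record whether it succeeded and how long it took."""
    started = time.perf_counter()
    try:
        await place_call(index)
        ok = True
    except Exception:
        ok = False
    return {"index": index, "ok": ok, "seconds": time.perf_counter() - started}


async def run_concurrent(place_call, count: int) -> list[dict]:
    """Start `count` call attempts at the same time and wait for all of them."""
    return await asyncio.gather(*(timed_call(place_call, i) for i in range(count)))


async def fake_call(index: int) -> None:
    """Stand-in for a real call: sleeps briefly and fails on one index."""
    await asyncio.sleep(0.01)
    if index == 3:
        raise RuntimeError("simulated failure")


if __name__ == "__main__":
    results = asyncio.run(run_concurrent(fake_call, 5))
    failed = [r["index"] for r in results if not r["ok"]]
    print(f"{len(results)} attempts, failed: {failed}")
```

Failures that appear only when the count goes up, and not on single calls, are the pattern to look for when an account concurrency limit is the cause.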
Common errors and exactly how to fix them
| Error or symptom | Cause | Fix |
|---|---|---|
| Fast busy signal on inbound call | Origination URI wrong or unreachable | Check SIP URI format and platform availability |
| Call connects but no audio either way | Codec mismatch | Set PCMU only on both sides and retest |
| One-way audio (you hear AI, AI cannot hear you) | NAT/firewall blocking RTP return path | Open UDP ports 10000-20000 on firewall |
| SIP 407 Proxy Authentication Required | Wrong username or password in credentials | Recreate credential list and re-enter in platform |
| SIP 488 Not Acceptable Here | No matching codec in SDP negotiation | Add G.711 PCMU to platform codec list |
| SIP 503 Service Unavailable | Platform server unreachable or overloaded | Check platform status page and IP allowlist |
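| Calls drop after about 30 seconds | The ACK for the 200 OK never reaches the answering side (often a NAT or Contact address problem), or a session timer is misconfigured | Trace the ACK in the SIP logs and fix its routing, then check session timer settings in the platform SIP settings |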
"The most expensive SIP debugging sessions I have run were caused by one wrong character in a SIP URI or one missing firewall rule. Always start with the simplest possible configuration - one codec, one transport, no encryption - and add complexity only after the basic call path is confirmed working."
- What I tell every team before their first SIP integration
Pre-go-live checklist
The setup that takes 15 minutes - or four days
A SIP trunk setup that follows the steps above correctly takes about 15 minutes of configuration and two to three hours of testing. The same setup done without a clear sequence - jumping between the Twilio console and the Voice AI platform, changing multiple settings at once, skipping the codec verification step - can consume four days of engineering time and still leave you with an intermittent bug you cannot explain.
The principle that saves the most time: change one thing at a time. When a call fails, revert to the last known working state, change one variable, and test again. SIP debugging done methodically is fast. SIP debugging done by changing multiple settings simultaneously is how teams spend a week on something that was always going to work.
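One way to hold yourself to this is to keep the last known working settings written down and compare them against what you are about to test. The sketch below treats a configuration as a plain dictionary; the keys used in the usage note (codec, transport, srtp) are illustrative names, not fields from Twilio or Vapi.

```python
_MISSING = object()  # distinguishes "key absent" from "key set to None"


def changed_keys(known_good: dict, candidate: dict) -> list[str]:
    """Names of settings that differ between two configurations."""
    keys = set(known_good) | set(candidate)
    return sorted(
        key for key in keys
        if known_good.get(key, _MISSING) != candidate.get(key, _MISSING)
    )


def is_single_change(known_good: dict, candidate: dict) -> bool:
    """True if at most one setting differs from the known working state."""
    return len(changed_keys(known_good, candidate)) <= 1
```

Comparing {"codec": "PCMU", "transport": "tcp", "srtp": False} with a candidate that switches both transport and srtp gives changed_keys of ["srtp", "transport"], which is the signal to split the change into two separate tests.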
Once this setup is complete and all ten pre-go-live checklist items are ticked, you have a production-ready SIP trunk connected to your Voice AI platform. Everything above this layer - the STT engine, the LLM, the TTS, the system prompt - can be iterated on without touching the SIP configuration again.
Setting up Voice AI infrastructure?
I write every week about Voice AI and SIP telephony from real enterprise deployments - practical guides, not theory. Get in touch if you are working through a deployment and need a second opinion.