The Zero-Cost Fortress Phone

// Building a free PBX on GCP with Tailscale, Callcentric, and a Grandstream ATA //

I wanted a house phone. Not a cell phone, not a VoIP app, but a real, pick-up-the-handset, hear-the-dial-tone house phone — the kind my kids (ages 7 to 12) could use to call grandparents without needing a screen, a charger, or a YouTube rabbit hole. The problem: traditional landlines are dead, and the telcos want $30-50/month for something that should basically be free in 2025. So I built my own. On the Google Cloud free tier. With a VPN mesh. And an analog telephone adapter. And I made exactly zero of the decisions correctly on the first try.

This is the story of the Zero-Cost Fortress Phone — a project that took me through three SIP trunk providers, two driver architectures I didn’t know were different, a CPU steal time rabbit hole, a NAT masquerade bug that took tcpdump to find, and enough pivots to qualify as an agility course. If you’re here to follow along and build your own, every step is documented. If you’re here to laugh at my mistakes, there’s plenty of that too. Either way, by the end, I had a working household phone system that costs roughly $1.35/month and is invisible to the public internet.

/* Table of Contents */

The Dream: Why Build This?
The Architecture
Phase 1 — GCP Free Tier + Incredible PBX
Phase 2 — The Callcentric Trunk (and the chan_sip Disaster)
Phase 3 — Tailscale: Making the PBX Reach the Living Room
Phase 4 — Choppy Audio: CPU Steal Time and Strict RTP
Phase 5 — The Grandstream HT801V2: Registering the Analog Phone
Phase 6 — “The Person You Are Trying to Reach Is Unavailable”
Phase 7 — The Sophos XG Static Route That Wasn’t Enough
Phase 8 — The nftables Masquerade Fix: The Final Boss
The Real Cost
Lessons Learned
References and Resources

The Dream: Why Build This?

Here’s the thing about having kids in the 7-12 age range: they need to make phone calls, but they shouldn’t have smartphones. Grandparents want to hear their voices. The school office needs a number to call. And sometimes you just want a phone that rings in the kitchen like it’s 1997 and the only thing waiting on the other end is your mom asking if you’ve eaten.

But landline service from the telco? That’s $30-50/month for something that costs approximately nothing to provide. VoIP providers like Vonage and Ooma charge $15-20/month. And all of them want you to use their hardware, their app, their ecosystem. I wanted out of all of it.

The requirements were simple: a phone that works like a phone, that my kids can pick up and dial, that rings when someone calls, that has voicemail, and that doesn’t cost me a car payment every year. Oh, and I wanted it to be secure — no SIP ports exposed to the internet, no attack surface for toll fraud, no trusting my home IP to the mercy of a dynamic DNS update.

What I ended up with is a system where a virtual machine on Google Cloud runs Asterisk and FreePBX, connects to a SIP provider through the public internet for PSTN access, and reaches my living room through a WireGuard-based mesh VPN called Tailscale — with no public SIP or RTP ports exposed anywhere. The analog phone on my desk connects through a Grandstream HT801V2 ATA, which registers to the PBX over the encrypted tunnel. The whole thing costs about what a cup of coffee costs per month.

💡 Why “Fortress” Phone?

Because the PBX has zero public-facing SIP or RTP ports. The only way to reach it for VoIP traffic is through the Tailscale mesh, which requires authentication through your identity provider. No port scanning, no SIP enumeration, no brute-force registration attacks. The attack surface is effectively nil.

The Architecture

Before we get into the build, let me show you what we’re building. This diagram represents the final, working architecture — not the first draft, which looked very different and had a certain Google Voice-shaped hole in it.

      PUBLIC INTERNET
             │
    ┌────────┴────────┐
    │   Callcentric   │
    │     SIP/RTP     │
    └────────┬────────┘
             │  (SIP/RTP via PJSIP trunk)
    ┌────────┴────────┐
    │ GCP e2-micro    │
    │ Incredible PBX  │
    │ Asterisk 22     │
    │ FreePBX 17      │
    │ Tailscale IP:   │
    │ 100.99.48.20    │
    │ Public IP:      │
    │ 35.208.173.52   │
    └────────┬────────┘
             │  WireGuard/Tailscale mesh
             │  (encrypted, authenticated)
    ┌────────┴────────┐
    │ Arch Linux      │
    │ (kvm-host)      │
    │ Subnet Router   │
    │ 172.16.2.3      │
    │ Tailscale:      │
    │ 100.105.118.74  │
    │ Advertises:     │
    │ 172.16.2.0/24   │
    └────────┬────────┘
             │
    ┌────────┴────────┐
    │ Sophos XG       │
    │ (SIP ALG OFF)   │
    │ 172.16.2.1      │
    └────────┬────────┘
             │  LAN: 172.16.2.0/24
    ┌────────┴────────┐
    │ Grandstream     │
    │ HT801V2 (ATA)   │
    │ 172.16.2.x      │
    │ NAT: No         │
    │ SIP→100.99.48.20│
    └────────┬────────┘
             │  (RJ-11 FXS)
    ┌────────┴────────┐
    │ Analog Phone    │
    │ (cordless)      │
    └─────────────────┘

The key insight in this architecture is that the PBX lives in the cloud, but the phone lives on my desk. They’re connected by Tailscale, which creates an encrypted WireGuard tunnel between them. The PBX’s Tailscale IP (100.99.48.20) is the only SIP address the Grandstream ever sees — not the GCP public IP, not my home IP, not anything that shows up in a Shodan search. The GCP firewall only allows SIP/RTP traffic from the Tailscale CGNAT range (100.64.0.0/10). If you’re not on my Tailnet, you can’t even find the PBX’s VoIP ports.

The Arch Linux machine (kvm-host) serves dual duty: it’s both a hypervisor running other VMs and a Tailscale subnet router that advertises the home LAN (172.16.2.0/24) to the Tailnet. This means the PBX can reach devices on my home LAN without those devices needing Tailscale installed — which is critical because the Grandstream HT801V2 is a “dumb” SIP device that can’t run a VPN client.

Phase 1 — GCP Free Tier + Incredible PBX

Spinning Up the Cloud PBX

Google Cloud’s Always Free tier gives you one e2-micro VM per month in us-central1, us-east1, or us-west1. That’s 2 shared vCPUs (meaning you get 25% of a physical core on average) and 1 GB of RAM. It’s not a lot, but Asterisk doesn’t need a lot for a household PBX handling a couple of concurrent calls. The G.711 codec uses about 1% CPU per call. The math works — on paper.

The setup process is straightforward if you’ve ever used GCP before. Create a project, enable billing (required even for free tier — Google puts a $1 hold on your card to verify it), deploy an e2-micro VM with Debian 12 (Bookworm) and a 30 GB standard persistent disk. Both the VM and the disk fall within the free tier limits.

⚠️ Convert Your Ephemeral IP to Static

By default, GCP assigns an ephemeral external IP that changes every time you stop and start the VM. Since your SIP trunk and Tailscale both depend on this IP being stable, you must convert it to a static IP in VPC Network → External IP Addresses. A static IP is free while attached to a running VM — you only pay (~$7.30/month) if the VM is stopped with the IP reserved.

The Firewall: Lock It Down

Here’s where the “Fortress” part starts. I created exactly two GCP firewall rules. The first allows SSH from my home public IP only. The second allows all traffic from the Tailscale CGNAT range (100.64.0.0/10). That’s it. No public SIP ports, no public RTP ports, nothing for Shodan to find. The PBX is completely dark to the public internet for VoIP traffic.

Incredible PBX 2025 Installation

I used Ward Mundy’s Incredible PBX 2025 distribution, which bundles Asterisk 22, FreePBX 17, and a suite of preconfigured security tools (Fail2Ban, Travelin’ Man 3 IPtables firewall). The installer compiles Asterisk from source, which takes 25-40 minutes on the e2-micro’s shared vCPU. Go get coffee. Or lunch. Don’t interrupt it.

root@fortress-pbx:~$ wget http://incrediblepbx.com/IncrediblePBX2025.sh
root@fortress-pbx:~$ chmod +x IncrediblePBX2025.sh
root@fortress-pbx:~$ ./IncrediblePBX2025.sh
# … 25-40 minutes of compilation …
Incredible PBX 2025 installation complete!

The Swap File (Don’t Skip This)

With 1 GB of RAM, the Linux OOM killer will eventually decide that Asterisk is expendable — usually at 3 AM on a Tuesday. A 2 GB swap file is mandatory, not optional:

fallocate -l 2G /swapfile
chmod 600 /swapfile
mkswap /swapfile
swapon /swapfile
echo '/swapfile none swap sw 0 0' >> /etc/fstab
echo 'vm.swappiness = 10' >> /etc/sysctl.conf
sysctl -p

Setting swappiness to 10 tells the kernel to use RAM first and swap only as an emergency buffer. This is what you want for a VoIP server — you don’t want Asterisk’s memory pages getting swapped out during an active call.

Narrow the RTP Port Range

The default Asterisk RTP range of 10000-20000 spans roughly 10,000 ports. For a household PBX, I narrowed it to 10000-10100. Each media stream needs an RTP port plus an RTCP port, so that leaves room for about 50 streams, far more than we’ll ever need:

; /etc/asterisk/rtp.conf
[general]
rtpstart=10000
rtpend=10100
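
If you want to sanity-check how many streams a given range can carry, here is a small Python sketch of the arithmetic. It assumes each stream takes an even-numbered RTP port plus the following odd port for RTCP, and that both must fall inside the range; the function name is made up for illustration.

```python
def rtp_stream_capacity(rtpstart, rtpend):
    """Estimate how many RTP streams fit in an inclusive port range.

    Assumption: every stream uses an even RTP port and the next odd
    port for RTCP, and both ports must lie inside the range.
    """
    # First even port at or above rtpstart.
    first_even = rtpstart if rtpstart % 2 == 0 else rtpstart + 1
    # Last even port whose odd partner is still at or below rtpend.
    last_even = rtpend - 1 if rtpend % 2 == 1 else rtpend - 2
    if last_even < first_even:
        return 0
    return (last_even - first_even) // 2 + 1


print(rtp_stream_capacity(10000, 10100))  # the narrowed range
print(rtp_stream_capacity(10000, 20000))  # the Asterisk default
```

Keep in mind that a call bridged through the PBX has two legs (trunk side and ATA side), so each call consumes two of these streams.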

Whitelist Tailscale in the Incredible PBX Firewall

Incredible PBX includes its own iptables-based firewall (Travelin’ Man 3) that blocks all non-whitelisted IPs. You need to whitelist the entire Tailscale CGNAT range so any device on your Tailnet can reach the PBX:

/root/add-ip tailscale 100.64.0.0/10
/root/add-ip homelan 172.16.2.0/24

Phase 2 — The Callcentric Trunk (and the chan_sip Disaster)

This is the first major pivot in the project, and it’s entirely my fault. Let me tell you about it so you don’t make the same mistake.

🔄 PIVOT #1: Google Voice → Callcentric

The original plan was to use Google Voice as the SIP trunk via GVSIP (a community-maintained bridge between Asterisk and Google’s proprietary telephony API). This involved OAuth 2.0 credentials, refresh tokens, and a setup process that was finicky enough to have its own support forum thread. I went through the entire process before realizing that Google Voice doesn’t support native 911 calling — a hard requirement for a household with kids. I switched to Callcentric, which provides both a real PSTN DID and E911 service for about $0.85/month.

The chan_sip vs. chan_pjsip Mistake

When I first set up the Callcentric trunk, I followed every guide I could find online. The problem? Most of those guides were written for chan_sip — Asterisk’s legacy SIP driver. My Incredible PBX 2025 system runs Asterisk 22 with chan_pjsip exclusively, which is a completely different configuration model with a completely different GUI.

Those guides told me to add 40 alpha/bravo server entries to sip_custom_post.conf (a chan_sip concept that doesn’t exist in PJSIP), fill in a “PEER Details” text box (a chan_sip GUI element that doesn’t exist in the PJSIP trunk GUI), and use a “Register String” (a chan_sip mechanism replaced by structured PJSIP fields). I was essentially trying to configure a PJSIP system using chan_sip instructions, and of course nothing worked.

⚠️ chan_sip vs. chan_pjsip — Know Which You’re Using

If your FreePBX trunk menu says “Add SIP (chan_pjsip) Trunk”, you are using PJSIP. There is no PEER Details box, no Register String, no sip_custom_post.conf. The configuration model is entirely different. Every guide written before ~2022 probably assumes chan_sip. Adjust accordingly.

The Correct PJSIP Configuration

After scraping together information from Callcentric’s own PJSIP vanilla guide, the FreePBX community forums (specifically contributor Stewart1, who is a treasure), and the VoIP-info wiki, I arrived at a working PJSIP trunk configuration. Here are the critical settings that differ from the defaults:

Field	Value	Why It Matters
Context	`from-pstn-toheader`	Callcentric sends the called number (DID) in the To: header. The default `from-pstn` doesn’t extract it, so inbound calls arrive with EXTEN `s` and your Inbound Routes won’t match.
Outbound Proxy	`sip:sip.callcentric.net\;lr`	Enables SIP loose routing (RFC 3261). Without the `\;lr`, PJSIP may alter the Request-URI and Callcentric rejects the call. The backslash escapes the semicolon so that Asterisk’s config parser doesn’t treat the rest of the line as a comment.
Contact User	`17778534517`	Sets the contact_user on the PJSIP registration. Without it, inbound calls enter the dialplan with EXTEN `s` instead of your DID.
From Domain	`sip.callcentric.net`	Required. Callcentric checks the From domain on inbound auth.
From User	`17778534517`	Required. Callcentric uses this to identify your account on outbound calls.
Match (Permit)	`199.87.144.0/21,204.11.192.0/22`	These two CIDR ranges cover all of Callcentric’s server IPs. This is the PJSIP equivalent of chan_sip’s 40 alpha/bravo host entries — far simpler and more reliable.
Direct Media	`No`	Callcentric’s own config specifies `direct_media=no`. With the PBX behind NAT on GCP, direct media would break anyway.
RTP Symmetric	`Yes`	Ensures RTP media flows symmetrically — essential behind NAT.
Force rport	`Yes`	Tells Asterisk to send responses to the actual source address, not the Via header. Essential for NAT traversal.

After applying these settings, the trunk registered immediately:

root@fortress-pbx:~$ asterisk -rx 'pjsip show registrations'
Registration: callcentric/sip:17778534517@sip.callcentric.net Auth: callcentric Registered

I also whitelisted Callcentric’s IP ranges in the Incredible PBX firewall, which is critical — without it, inbound INVITEs from Callcentric’s servers are silently dropped:

/root/add-ip ip-callcentric1 199.87.144.0/21
/root/add-ip ip-callcentric2 204.11.192.0/22
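
To see exactly which addresses those two ranges cover, a short Python sketch using the standard `ipaddress` module can expand them. The helper name `is_callcentric` is only illustrative; the ranges themselves are the two quoted above.

```python
import ipaddress

# The two ranges used in the Match (Permit) field and the firewall whitelist.
CALLCENTRIC_RANGES = [
    ipaddress.ip_network("199.87.144.0/21"),
    ipaddress.ip_network("204.11.192.0/22"),
]


def is_callcentric(addr):
    """Return True if addr falls inside either whitelisted range."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in CALLCENTRIC_RANGES)


for net in CALLCENTRIC_RANGES:
    # First address, last address, and size of each range.
    print(net, net.network_address, net.broadcast_address, net.num_addresses)
```

This is handy when an unfamiliar source IP shows up in the Asterisk log and you want to know whether it should have matched the trunk.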

Phase 3 — Tailscale: Making the PBX Reach the Living Room

Here’s the fundamental problem: the PBX lives in Google Cloud, and the phone lives on my desk. The phone is an analog device connected to a Grandstream HT801V2 ATA, which is a “dumb” SIP endpoint that can’t run a VPN client. I needed a way to bridge the PBX’s SIP/RTP traffic to the ATA without exposing any ports to the public internet.

The answer is Tailscale — a WireGuard-based mesh VPN that creates encrypted tunnels between authenticated devices. Each device on the Tailnet gets an IP in the 100.64.0.0/10 CGNAT range, and all traffic between them is encrypted and authenticated through your identity provider.

Installing Tailscale on the GCP Instance

curl -fsSL https://tailscale.com/install.sh | sh
sudo tailscale up
# Authenticate via the URL printed to terminal
tailscale ip -4
# Returns something like 100.99.48.20 — this is the PBX Tailscale IP

This IP (100.99.48.20) is the address the Grandstream ATA will use as its SIP server. Every device on the Tailnet can reach the PBX at this IP, and no one else can.

The Subnet Router: Reaching Devices That Can’t Run Tailscale

The Grandstream HT801V2 can’t run Tailscale — it’s a purpose-built SIP device with no ability to install software. So I needed one of my existing machines to act as a Tailscale subnet router, making the entire home LAN reachable from the GCP instance through the Tailnet.

My Arch Linux server (kvm-host, at 172.16.2.3 on the LAN) already runs Tailscale, so I reconfigured it as a subnet router:

# Enable IP forwarding
echo 'net.ipv4.ip_forward = 1' | sudo tee -a /etc/sysctl.d/99-tailscale.conf
sudo sysctl -p /etc/sysctl.d/99-tailscale.conf

# Reconfigure Tailscale as subnet router
sudo tailscale up --advertise-routes=172.16.2.0/24

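Then I approved the subnet route in the Tailscale admin console (login.tailscale.com/admin/machines → edit route settings → check the box for 172.16.2.0/24). This makes the home LAN reachable from the GCP instance, so the PBX can send RTP audio back to the HT801V2’s LAN IP. One caveat: a Linux Tailscale client only installs advertised subnet routes when it is brought up with --accept-routes, so the PBX needs that flag before it can actually use this route.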

NAT Settings in Asterisk

Even though Tailscale encrypts and tunnels the traffic, Asterisk still sees the HT801V2’s source IP (a 172.16.2.x address) as being on a different network from itself (a 10.128.x.x GCP VPC address). Without proper NAT configuration, Asterisk will direct RTP audio to the wrong IP, resulting in one-way audio. In the FreePBX GUI under Settings → Asterisk SIP Settings, I configured:

Setting	Value
External Address	`35.208.173.52` (GCP public IP)
Local Network: Tailscale	`100.64.0.0/255.192.0.0`
Local Network: GCP VPC	`10.128.0.0/255.255.240.0`
Local Network: Home LAN	`172.16.2.0/255.255.255.0`

These local network entries tell Asterisk that any call from an address in 100.64.0.0/10, 10.128.0.0/20, or 172.16.2.0/24 is “local” and doesn’t need external NAT manipulation. This is critical for two-way audio through the Tailscale tunnel.
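
The address/netmask pairs in that table are easier to reason about in CIDR form. The following Python sketch converts them and classifies a few addresses from this build; the `classify` helper and the dictionary labels are illustrative and are not part of FreePBX.

```python
import ipaddress

# Local Network entries as typed into the GUI (address/netmask).
LOCAL_NETS = {
    "Tailscale": "100.64.0.0/255.192.0.0",
    "GCP VPC": "10.128.0.0/255.255.240.0",
    "Home LAN": "172.16.2.0/255.255.255.0",
}


def classify(addr):
    """Return the name of the local network containing addr, else 'external'."""
    ip = ipaddress.ip_address(addr)
    for name, spec in LOCAL_NETS.items():
        if ip in ipaddress.ip_network(spec):
            return name
    return "external"


for name, spec in LOCAL_NETS.items():
    # ip_network accepts the netmask form and prints it back as CIDR.
    print(name, ipaddress.ip_network(spec))

for addr in ("100.99.48.20", "172.16.2.4", "35.208.173.52"):
    print(addr, classify(addr))
```

Anything that comes back as "external" is an address for which Asterisk would apply the External Address setting.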

Phase 4 — Choppy Audio: CPU Steal Time and Strict RTP

With the trunk registered and the Tailscale tunnel up, I made my first test call. The audio was soooo laggy and choppy — to the point of being completely unusable. The call connected, the IVR played, but the audio came through in stutters and gaps, like the PBX was underwater and only occasionally surfacing for air.

🔄 PIVOT #2: From “Network Problem” to “CPU Problem”

My first instinct was to blame the network — Tailscale adds encryption overhead, there’s a Sophos firewall in the path, maybe the RTP ports are being throttled. But the Asterisk CLI showed the call flowing correctly, with no errors in the SIP signaling. The problem was in the RTP media stream, not the signaling. A session of top on the GCP instance revealed the real culprit: CPU steal time of 5-20%.

What Is CPU Steal Time?

The GCP e2-micro uses a shared vCPU — Google schedules your VM’s execution time on physical cores alongside other tenants. When the hypervisor takes the CPU away from your VM to give it to another tenant, that’s called steal time. For most workloads (web servers, databases), brief CPU delays are invisible. For VoIP, they’re catastrophic.

Here’s why: Asterisk generates RTP audio packets on a strict 20ms cadence (50 packets per second for G.711 ulaw). If the hypervisor steals the CPU for even 50-100ms during an active call, Asterisk can’t send or process 2-5 RTP packets on schedule. The result is a gap in the audio stream — what you hear as “choppiness.” And because Asterisk on the e2-micro uses only 2-5% CPU, the hypervisor sees that low utilization and becomes more aggressive about taking CPU time away, making the steal time problem worse, not better.
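
The numbers in that paragraph fall out of simple arithmetic. This Python sketch works them through, assuming the standard G.711 parameters (8 kHz sampling, one byte per sample) and the 20 ms packet interval mentioned above; the function names are made up for illustration.

```python
PTIME_MS = 20          # packetization interval from the text
SAMPLE_RATE_HZ = 8000  # G.711 samples audio at 8 kHz
BYTES_PER_SAMPLE = 1   # G.711 encodes each sample in 8 bits


def packets_per_second(ptime_ms=PTIME_MS):
    """RTP packets sent each second at the given interval."""
    return 1000 / ptime_ms


def payload_bytes(ptime_ms=PTIME_MS):
    """Audio payload per packet, excluding RTP/UDP/IP headers."""
    return SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * ptime_ms // 1000


def packet_slots_missed(stall_ms, ptime_ms=PTIME_MS):
    """Packet slots that pass while the vCPU is not scheduled."""
    return stall_ms / ptime_ms


print(packets_per_second(), payload_bytes())
for stall in (50, 100):
    print(f"{stall} ms stall -> {packet_slots_missed(stall):.1f} packet slots")
```

A 50 ms stall costs 2.5 packet slots and a 100 ms stall costs 5, which is where the "2-5 RTP packets" figure comes from.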

A sysadmin on ServerFault documented this exact phenomenon: “This is a somewhat common issue with workloads that are both latency-sensitive and CPU non-intensive. The power management sees the very low CPU usage and assumes it can throttle the processor, even though it should not!”

The Fix: Disable Strict RTP

While CPU steal time was the underlying cause, there was a second contributing factor that was much easier to fix: Asterisk’s Strict RTP setting. Strict RTP drops RTP packets that arrive from an unexpected source IP or port. In a NAT environment with Tailscale in the path, the RTP source address seen by Asterisk may differ from what was negotiated in the SDP, especially if Tailscale’s relay path is involved.

Disabling Strict RTP was the immediate fix:

FreePBX GUI → Settings → Asterisk SIP Settings → RTP Settings
Strict RTP: No

✅ RESULT: AUDIO IMMEDIATELY IMPROVED

After disabling Strict RTP, the choppy audio was resolved. The call quality went from unusable to clear. Strict RTP is a security feature that prevents RTP injection attacks, but since the PBX is behind Incredible PBX’s firewall and GCP’s firewall, and only whitelisted IPs can reach the server, the risk of disabling it is minimal.

For the record, if steal time had been the only issue, the fix would have been upgrading from the e2-micro to the e2-small (0.5 vCPU and 2 GB RAM, roughly $12/month at on-demand list prices in the US regions; check the GCP pricing page for the current figure). Ward Mundy explicitly recommends this: “We would encourage you to move up to the Standard machine type for consistent performance.” But in my case, disabling Strict RTP was enough to get clean audio on the free tier.

Phase 5 — The Grandstream HT801V2: Registering the Analog Phone

The Grandstream HT801V2 is a single-port FXS ATA (Analog Telephone Adapter). It has one RJ-11 phone jack and one Ethernet port. You plug an analog phone into the phone jack, plug the Ethernet into your LAN, and configure it to register to your PBX via SIP. The ATA converts the analog audio signals to RTP packets and back again.

Creating the PJSIP Extension

Before touching the HT801V2, I created a PJSIP extension in FreePBX for it to register against. The critical settings on the Advanced tab:

Field	Value	Why
Direct Media	`No`	Asterisk must stay in the media path. With Tailscale in the path, direct media between the trunk and the ATA would break.
RTP Symmetric	`Yes`	Accepts RTP from whatever address/port it actually arrives from, rather than strictly matching SDP.
Force rport	`Yes`	Forces Asterisk to use the actual source IP:port of incoming SIP requests as seen through Tailscale.

Configuring the HT801V2

The HT801V2’s web interface is at whatever IP it gets from DHCP. You can find the IP by picking up the connected analog phone and dialing *** then 02 — the IVR announces the current IP address. Then open that IP in a browser.

Here are the critical settings that differ from defaults, and why each one matters:

Field	Value	Why This Matters
Primary SIP Server	`100.99.48.20:5060`	The PBX’s Tailscale IP with port appended. The `:5060` is mandatory — the HT801V2 needs the server port explicitly specified.
NAT Traversal	`No`	Counterintuitive but critical. Tailscale handles NAT. Enabling STUN/Auto would cause the HT801V2 to discover the public IP and put it in SDP headers, breaking audio because RTP must flow through Tailscale.
SIP OPTIONS Keep Alive	`Yes`	Sends periodic OPTIONS messages to keep the SIP path alive through the Tailscale tunnel.
Keep Alive Interval	`30` seconds	Responsive enough to detect a dropped tunnel without being chatty.
Register Expiration	`60` minutes	Matches FreePBX default. The ATA re-registers every hour; the 30-second OPTIONS keep-alive is what keeps the path open in between.
SIP REGISTER Contact	`LAN Address`	The HT801V2 advertises its LAN IP in the Contact header. The PBX can reach this IP through Tailscale subnet routing.

💡 The “NAT Traversal = No” Paradox

Normally, a SIP device behind NAT needs STUN or a manually configured public IP so the PBX knows where to send RTP. But in this architecture, the PBX can already reach the HT801V2’s LAN IP (172.16.2.x) through the Tailscale subnet router. If you enable NAT Traversal, the HT801V2 puts your public IP in the SDP, and the PBX tries to send RTP to your public IP — which will fail because RTP must flow through Tailscale, not through the public internet with its SIP-unfriendly carrier-grade NAT.

Phase 6 — “The Person You Are Trying to Reach Is Unavailable”

With everything configured, I called my Callcentric DID from my cell phone. The call went through the trunk correctly. The IVR played. But when the call was routed to extension 1001 (the HT801V2), I got: “The person you are trying to reach is unavailable.”

Investigation: The Asterisk CLI Trace

I fired up asterisk -rvvv and called again. The trace told the whole story:

-- Executing [1001@ext-local:1] Set("PJSIP/callcentric-00000001", "RINGTIMER_EXTEN=1001")
-- Executing [1001@ext-local:2] Macro("PJSIP/callcentric-00000001", "exten-vm,1001,1001")
-- Executing [s@macro-dial-one:12] Set("PJSIP/callcentric-00000001", "THISDIAL=")
WARNING: PJSIP_DIAL_CONTACTS(1001) returned empty -- no registered contacts
-- Executing [s@macro-dial-one:41] Set("PJSIP/callcentric-00000001", "DIALSTATUS=CHANUNAVAIL")
-- Auto fallthrough, channel 'PJSIP/callcentric-00000001' status is 'CHANUNAVAIL'

The THISDIAL was set to an empty string because PJSIP_DIAL_CONTACTS(1001) returned nothing. This function resolves a PJSIP endpoint to its registered Contact URI — and when nothing is registered, it returns empty. The dial status was set to CHANUNAVAIL, and the “unavailable” message played.
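
The failure mode can be modelled in a few lines of Python. This is an illustrative sketch, not Asterisk code: `registered_contacts`, `dial_contacts`, and `dial_status` are hypothetical stand-ins for the PJSIP contact registry and the dialplan logic, the example Contact URI is invented, and "WOULD_DIAL" is a placeholder label rather than a real DIALSTATUS value.

```python
# Hypothetical registry: endpoint name -> list of registered Contact URIs.
registered_contacts = {
    "1001": [],  # the ATA never managed to register
}


def dial_contacts(endpoint):
    """Build a dial string from an endpoint's registered contacts.

    Returns an empty string when nothing is registered, which is the
    situation shown in the trace above.
    """
    contacts = registered_contacts.get(endpoint, [])
    return "&".join(f"PJSIP/{endpoint}/{uri}" for uri in contacts)


def dial_status(endpoint):
    """With an empty dial string there is nothing to ring."""
    return "WOULD_DIAL" if dial_contacts(endpoint) else "CHANUNAVAIL"


print(repr(dial_contacts("1001")), dial_status("1001"))

# Once the ATA registers (example URI only), the dial string is populated.
registered_contacts["1001"].append("sip:1001@172.16.2.50:5060")
print(repr(dial_contacts("1001")), dial_status("1001"))
```

The point of the model is that CHANUNAVAIL here says nothing about the extension's configuration; it only says that no device has registered against it.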

The HT801V2 wasn’t registered. But why?

The Suspects

Before I identified the real cause, I went down a rabbit hole investigating a known FreePBX 17 bug. GitHub issue #758 documented that extensions created via the “Quick Create Extension” modal get PJSIP/3 instead of PJSIP/EXTENSION in the Asterisk database. I checked:

asterisk*CLI> database show DEVICE/1001/dial
/DEVICE/1001/dial: PJSIP/1001

Not the bug. The database entry was correct. The extension was properly configured. The problem was that the HT801V2 simply couldn’t reach the PBX to register.

🔄 PIVOT #3: From “Extension Bug” to “Network Connectivity”

I spent time investigating the FreePBX Quick Create Extension bug before confirming it wasn’t my issue. The real problem was much simpler: the HT801V2 on the 172.16.2.0/24 LAN couldn’t reach the PBX’s Tailscale IP (100.99.48.20). The Tailscale subnet router only provided a path in one direction (GCP → home LAN), not the other (home LAN → GCP).

I tested from a computer on the same LAN:

user@kvm-dr:~$ ping -c 3 100.99.48.20
PING 100.99.48.20: 3 packets transmitted, 0 received, 100% packet loss

But from kvm-host (the Tailscale subnet router), the same ping worked:

user@kvm-host:~$ ping -c 3 100.99.48.20
PING 100.99.48.20: 64 bytes, time=28.3 ms
3 packets transmitted, 3 received, 0% packet loss

The subnet router could reach the PBX, but other LAN devices couldn’t. The missing piece: a static route on the Sophos XG firewall telling LAN traffic destined for the Tailscale CGNAT range (100.64.0.0/10) to go through kvm-host (172.16.2.3).

Phase 7 — The Sophos XG Static Route That Wasn’t Enough

I added the static route on the Sophos XG: destination 100.64.0.0/10, gateway 172.16.2.3 (kvm-host). I figured that would be it — the Sophos would forward Tailscale-destined traffic to kvm-host, kvm-host would send it into the Tailscale tunnel, and we’d be good.

I was wrong.

user@kvm-dr:~$ traceroute 100.99.48.20
traceroute to 100.99.48.20, 30 hops max, 60 byte packets
1 _gateway (172.16.2.1) 0.248 ms 0.227 ms 0.243 ms
2 * * *
3 * * *

Traffic reached the Sophos (hop 1 = 172.16.2.1) and then died. The static route was there, but the Sophos wasn’t forwarding the traffic. Why?

Lesson: Sophos XG Routes ≠ Sophos XG Firewall Rules

Sophos XG is a stateful firewall. Having a static route only tells the Sophos where to send packets — it does NOT mean the Sophos will actually allow the traffic through. Every packet that traverses the Sophos must pass through the firewall engine, and if there’s no rule explicitly allowing LAN → 100.64.0.0/10, the traffic gets silently dropped.

This is the most important lesson about Sophos XG for anyone building network infrastructure: routes and firewall rules are separate. A route without a matching firewall rule means traffic gets silently dropped. No error, no log entry in the default view — just silence.

Creating the Firewall Rule

I created a firewall rule on the Sophos XG:

Field	Value
Rule Name	`Allow-LAN-to-Tailscale`
Source Zones	LAN
Source Networks	`172.16.2.0/24`
Destination Zones	LAN (the next hop 172.16.2.3 is on the same LAN zone)
Destination Networks	`Tailscale-CGNAT` (100.64.0.0/10)
Services	Any
Masquerading	UNCHECKED
Action	Accept

⚠️ Masquerading Must Be OFF

Do NOT enable masquerading on this rule. The Sophos should route the packet, not NAT it. kvm-host needs to see the original source IP (172.16.2.x) so it knows the packet isn’t from itself. If masquerading is on, kvm-host sees the packet as coming from 172.16.2.1, which technically works but is less clean and can cause issues with asymmetric routing.
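
The "routes ≠ rules" behaviour can be captured in a tiny Python model. It is a deliberate simplification of what the Sophos does (no zones, services, or rule ordering), and the function names are invented for this sketch; the networks are the ones from the static route and the firewall rule above.

```python
import ipaddress


def has_route(dst, routes):
    """True if any static route's destination network contains dst."""
    ip = ipaddress.ip_address(dst)
    return any(ip in ipaddress.ip_network(net) for net, _gateway in routes)


def is_allowed(src, dst, rules):
    """True if any firewall rule permits this source/destination pair."""
    s = ipaddress.ip_address(src)
    d = ipaddress.ip_address(dst)
    return any(
        s in ipaddress.ip_network(src_net) and d in ipaddress.ip_network(dst_net)
        for src_net, dst_net in rules
    )


def forwarded(src, dst, routes, rules):
    """A packet is forwarded only if it has a route AND a rule allows it."""
    return has_route(dst, routes) and is_allowed(src, dst, rules)


routes = [("100.64.0.0/10", "172.16.2.3")]
allow_lan_to_tailscale = [("172.16.2.0/24", "100.64.0.0/10")]

print(forwarded("172.16.2.4", "100.99.48.20", routes, rules=[]))
print(forwarded("172.16.2.4", "100.99.48.20", routes, rules=allow_lan_to_tailscale))
```

With only the route in place the model drops the packet; with the route plus the rule it forwards it, matching the two traceroute results in this phase.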

After adding the firewall rule, I tested again:

user@kvm-dr:~$ traceroute 100.99.48.20
traceroute to 100.99.48.20, 30 hops max, 60 byte packets
1 _gateway (172.16.2.1) 0.248 ms
2 172.16.2.3 0.312 ms
3 * * *
4 * * *

Progress! Traffic now reached kvm-host (hop 2 = 172.16.2.3), which meant the Sophos was correctly forwarding it. But then it died again at kvm-host. The Sophos firewall rule was fixed, but kvm-host wasn’t forwarding the traffic into the Tailscale tunnel.

Phase 8 — The nftables Masquerade Fix: The Final Boss

At this point, I knew the Sophos was forwarding traffic correctly to kvm-host, and kvm-host could reach the PBX directly (pings from kvm-host worked). But kvm-host wasn’t forwarding traffic from other LAN devices into the Tailscale tunnel.

I ran a series of diagnostics on kvm-host to figure out what was happening.

Diagnostic 1: IP Forwarding

user@kvm-host:~$ sysctl net.ipv4.ip_forward
net.ipv4.ip_forward = 1

Good — IP forwarding was enabled.

Diagnostic 2: Tailscale iptables FORWARD Chain

user@kvm-host:~$ sudo iptables -L ts-forward -v -n
Chain ts-forward (1 references)
MARK all -- 0.0.0.0/0 0.0.0.0/0 MARK xset 0x40000/0xff0000
ACCEPT all -- 0.0.0.0/0 0.0.0.0/0 mark match 0x40000/0xff0000
DROP all -- 100.64.0.0/10 0.0.0.0/0
ACCEPT all -- 0.0.0.0/0 0.0.0.0/0

The last ACCEPT rule was there, so Tailscale was allowing forwarded traffic through the FORWARD chain.

Diagnostic 3: The tcpdump That Revealed Everything

This is the diagnostic that cracked the case. I ran tcpdump on the Tailscale interface while pinging from a LAN device:

user@kvm-host:~$ sudo tcpdump -i tailscale0 -n -c 5 icmp and host 100.99.48.20
IP 172.16.2.4 > 100.99.48.20: ICMP echo request
IP 172.16.2.4 > 100.99.48.20: ICMP echo request
IP 172.16.2.4 > 100.99.48.20: ICMP echo request

There it was. The packets were entering the Tailscale tunnel with their original LAN source IP (172.16.2.4) — they were NOT being NAT’d. The PBX at the other end received these packets, but when it tried to reply, it had no route back to 172.16.2.4. The reply packets went nowhere.
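
To make the "no route back" problem concrete, here is a Python sketch of a longest-prefix-match lookup on the PBX side. The routing table is an assumption for illustration only: the interface name `ens4` and the default route are guesses rather than output from the real host, Tailscale on Linux really uses its own policy-routing table, and the sketch assumes the PBX has not accepted the 172.16.2.0/24 subnet route.

```python
import ipaddress

# Assumed PBX routing table, for illustration only.
PBX_ROUTES = [
    ("100.64.0.0/10", "tailscale0"),  # Tailnet addresses go into the tunnel
    ("10.128.0.0/20", "ens4"),        # GCP VPC subnet
    ("0.0.0.0/0", "ens4"),            # everything else: default route
]


def egress_interface(dst, routes=PBX_ROUTES):
    """Choose the outgoing interface by longest-prefix match."""
    ip = ipaddress.ip_address(dst)
    candidates = [
        (ipaddress.ip_network(net), dev)
        for net, dev in routes
        if ip in ipaddress.ip_network(net)
    ]
    # The most specific (longest) matching prefix wins.
    return max(candidates, key=lambda c: c[0].prefixlen)[1]


print(egress_interface("172.16.2.4"))      # reply to the un-NAT'd LAN source
print(egress_interface("100.105.118.74"))  # reply to kvm-host's Tailscale IP
```

Under these assumptions a reply addressed to 172.16.2.4 leaves by the default route instead of the tunnel, while a reply addressed to a Tailscale IP goes back the way the request came.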

Why Tailscale’s Default SNAT Didn’t Help

By default, Tailscale subnet routers have SNAT (Source NAT) enabled via the --snat-subnet-routes flag. But Tailscale’s SNAT only handles traffic coming from the tailscale0 interface (inbound from the Tailnet). It uses a packet mark (0x40000) to identify this traffic and applies masquerade in the ts-postrouting chain.

I checked the nftables ruleset:

user@kvm-host:~$ sudo nft list ruleset
chain ts-postrouting {
    meta mark & 0x00ff0000 == 0x00040000 counter packets 1 bytes 84 xt target "MASQUERADE"
}

This rule only matches packets with mark 0x40000 — which is set on traffic coming in from tailscale0 (the Tailscale → LAN direction). Traffic from the LAN going out to tailscale0 has no such mark and no masquerade rule. The packets enter the tunnel with their original source IP, the PBX can’t route replies back, and everything dies in silence.

🔄 PIVOT #4: Understanding Tailscale’s Asymmetric SNAT

The key insight: Tailscale’s default SNAT only masquerades traffic coming IN from tailscale0 (marked 0x40000). Traffic from the LAN going OUT to tailscale0 has no masquerade. This makes sense for the primary use case (Tailnet devices accessing your LAN), but it’s a problem when you need LAN devices to access the Tailnet through a static route on a separate firewall.

The Fix: One nftables Rule

The solution is one nftables masquerade rule for LAN traffic going out to tailscale0:

sudo nft add rule ip nat ts-postrouting oifname "tailscale0" ip saddr 172.16.2.0/24 counter masquerade

This rule says: any packet from the 172.16.2.0/24 LAN going out the tailscale0 interface should have its source IP masqueraded to kvm-host’s Tailscale IP. Now the PBX sees the packets as coming from kvm-host’s Tailscale IP (which it can reach), and return traffic flows naturally.
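
The before-and-after behaviour can be sketched as a small Python model of the postrouting decision. It is a simplification of what netfilter does, the interface name `lan0` is a placeholder, and the function names are invented; the addresses and the mark value are the ones used in this build.

```python
import ipaddress

LAN = ipaddress.ip_network("172.16.2.0/24")
# kvm-host's address on each side. "lan0" is a placeholder interface name.
IFACE_ADDR = {"tailscale0": "100.105.118.74", "lan0": "172.16.2.3"}
TS_MARK = 0x40000


def ts_rule_matches(mark):
    """Tailscale's own rule: meta mark & 0x00ff0000 == 0x00040000."""
    return (mark & 0x00FF0000) == 0x00040000


def source_after_postrouting(src, in_if, out_if, lan_rule=False):
    """Return the source address a forwarded packet leaves with."""
    # Traffic arriving from the tunnel carries Tailscale's mark.
    mark = TS_MARK if in_if == "tailscale0" else 0
    lan_to_tunnel = (
        lan_rule
        and out_if == "tailscale0"
        and ipaddress.ip_address(src) in LAN
    )
    if ts_rule_matches(mark) or lan_to_tunnel:
        # Masquerade rewrites the source to the outgoing interface's address.
        return IFACE_ADDR[out_if]
    return src


print(source_after_postrouting("100.99.48.20", "tailscale0", "lan0"))
print(source_after_postrouting("172.16.2.4", "lan0", "tailscale0"))
print(source_after_postrouting("172.16.2.4", "lan0", "tailscale0", lan_rule=True))
```

The second call shows the broken state (the LAN address leaks into the tunnel unchanged) and the third shows the effect of the added rule.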

To persist this across reboots:

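# A plain "sudo nft ... > file" fails because the redirect runs as the unprivileged user
sudo nft list ruleset | sudo tee /etc/nftables.conf > /dev/null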
sudo systemctl enable nftables

✅ RESULT: EVERYTHING WORKS

After adding the nftables masquerade rule, ping from the LAN to the PBX Tailscale IP succeeded immediately. The HT801V2 registered, inbound calls rang the analog phone, and two-way audio flowed clearly through the encrypted Tailscale tunnel. The Zero-Cost Fortress Phone was finally online.

The Three-Part Fix Summary

Getting LAN devices to reach the PBX through Tailscale required three separate changes, each one building on the last:

#	Component	Change	What It Does
1	Sophos XG	Static route: `100.64.0.0/10 → 172.16.2.3`	Tells the Sophos where to send Tailscale-destined traffic
2	Sophos XG	Firewall rule: Allow LAN → Tailscale-CGNAT	Permits the traffic through the firewall engine (routes ≠ rules)
3	kvm-host (Arch)	nftables masquerade for LAN→tailscale0	NATs the source IP so the PBX can route replies back

Miss any one of these three, and the whole thing falls apart. The static route without the firewall rule: silently dropped. The firewall rule without the masquerade: packets arrive but replies never come back. All three together: a working, encrypted, zero-port-exposed VoIP path from the cloud to your living room.

The Real Cost

Item	Monthly Cost
GCP e2-micro (Always Free tier)	$0.00
GCP 30 GB Standard Persistent Disk (Always Free tier)	$0.00
GCP Static IP (free while attached to running instance)	$0.00
GCP Egress (estimated 5-10 GB/month over free 1 GB)	~$0.50 – $1.00
Callcentric DID + E911	~$0.85
Tailscale (Personal plan, up to 100 devices)	$0.00
Grandstream HT801V2 (one-time hardware purchase)	~$35-45
Total Monthly	~$1.35 – $1.85/month

Not literally zero, but close enough that the “Zero-Cost Fortress Phone” name is still in the spirit of the thing. The one-time hardware cost for the Grandstream ATA pays for itself in about two months compared to a $20/month VoIP service.
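The payback claim is easy to sanity-check. This small sketch divides the one-time hardware cost by the monthly savings, using the numbers from the table above. `payback_months` is an illustrative helper name, and the $20/month alternative is the same assumed comparison point as in the text.

```shell
#!/bin/sh
# Sketch: months until the ATA pays for itself versus a flat-rate VoIP plan.
# payback_months is an illustrative helper; all arguments are in dollars.
payback_months() {
    # $1 = one-time hardware cost
    # $2 = monthly cost of the alternative service
    # $3 = monthly cost of this build
    awk -v hw="$1" -v alt="$2" -v own="$3" \
        'BEGIN { printf "%.1f\n", hw / (alt - own) }'
}

payback_months 35 20 1.35   # cheapest ATA, lowest monthly cost  -> 1.9
payback_months 45 20 1.85   # priciest ATA, highest monthly cost -> 2.5
```

So the break-even point lands somewhere between about 1.9 and 2.5 months, depending on what you paid for the ATA and how much egress you use.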

Lessons Learned

Every project is really two projects: the one you planned and the one you actually built. Here’s what I learned from the delta between them.

1. Know Your SIP Driver

chan_sip and chan_pjsip are not interchangeable. They are two completely different SIP stack implementations with different configuration models, different GUIs, different file formats, and different troubleshooting commands. If your FreePBX says “Add SIP (chan_pjsip) Trunk,” you are using PJSIP, and every guide written for chan_sip (PEER Details, Register Strings, sip_custom_post.conf) will lead you astray. Check which driver you’re using before following any guide.
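If the GUI label isn’t conclusive, you can ask Asterisk which channel drivers are actually loaded. The sketch below filters the output of the Asterisk CLI command `module show like chan_`. The helper name `loaded_sip_drivers` is made up, and the parsing assumes the module file name is the first column of each row.

```shell
#!/bin/sh
# Sketch: report which SIP channel drivers appear in Asterisk's module list.
# loaded_sip_drivers is an illustrative helper; it assumes the module file
# name is the first whitespace-separated column of each row.
loaded_sip_drivers() {
    awk '$1 == "chan_pjsip.so" || $1 == "chan_sip.so" { print $1 }'
}

# Usage on the PBX (needs access to the Asterisk console):
# sudo asterisk -rx "module show like chan_" | loaded_sip_drivers
```

If only chan_pjsip.so comes back, any guide that tells you to edit PEER Details or a Register String is written for the wrong driver.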

2. Routes ≠ Firewall Rules (Sophos XG)

On a Sophos XG, a static route tells the kernel where to send packets, but the firewall engine independently decides whether to allow them. A route without a matching firewall rule means silently dropped traffic — no error, no obvious log entry, just nothing. This is probably the single most counterintuitive thing about Sophos XG for people coming from consumer routers where “port forwarding” does both jobs at once.

3. Tailscale SNAT Is Asymmetric

Tailscale’s default SNAT (--snat-subnet-routes) only masquerades traffic coming IN from tailscale0 (marked 0x40000). Traffic from the LAN going OUT to tailscale0 has no masquerade. This works for the primary use case (Tailnet devices accessing your LAN), but it’s a trap when you’re routing LAN traffic to the Tailnet through a separate firewall’s static route. You need to add your own masquerade rule for the LAN→tailscale0 direction.

4. tcpdump Is the Truth

When I was troubleshooting the routing issue, I spent time checking iptables rules, sysctl settings, and Tailscale configuration. None of it told me the real problem. Running tcpdump -i tailscale0 immediately revealed that packets were entering the tunnel with their original LAN source IP (172.16.2.4) instead of being NAT’d. The packet capture doesn’t lie — it shows you exactly what’s on the wire. When in doubt, tcpdump.
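To make the capture easier to read, you can reduce it to one question: is anything entering the tunnel with a LAN source address? This is a rough sketch, assuming the usual `tcpdump -n` line shape of timestamp, `IP`, source, `>`, destination. `lan_sources` and `LAN_PREFIX` are illustrative names.

```shell
#!/bin/sh
# Sketch: list LAN source addresses seen on tailscale0 (i.e. packets that were
# NOT masqueraded). lan_sources and LAN_PREFIX are illustrative names; the
# parsing assumes lines shaped like "<time> IP <src> > <dst>: ...".
LAN_PREFIX="172.16.2."

lan_sources() {
    awk -v p="$LAN_PREFIX" '$2 == "IP" && index($3, p) == 1 { print $3 }' | sort -u
}

# Usage on kvm-host while pinging the PBX from a LAN device (needs root):
# sudo tcpdump -n -l -c 20 -i tailscale0 icmp | lan_sources
```

Any address printed is a packet that went into the tunnel un-NAT’d. Once the masquerade rule is in place, the same command should print nothing.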

5. Strict RTP Can Break Audio Through VPN Tunnels

Asterisk’s Strict RTP setting drops packets from unexpected source IPs/ports. In a VPN tunnel scenario where the RTP source address may differ from what was negotiated in SDP (especially if the VPN path changes between direct and relay), Strict RTP will silently drop valid audio packets. Disabling it resolved choppy audio immediately, and the security tradeoff is acceptable when the PBX is behind multiple firewalls.

6. CPU Steal Time Kills VoIP on Shared vCPUs

The GCP e2-micro’s shared vCPU works fine for a PBX that spends 99% of its time idle. But during the 1% of the time it’s handling an active call, any CPU time the hypervisor steals creates gaps in the RTP packet stream that show up as choppy audio. If you need consistent VoIP quality, the next size up (e2-small at ~$7/month) is the minimum viable upgrade. Keep in mind that e2-small is still a shared-core machine type with a larger CPU share, not a dedicated vCPU, so steal time should shrink rather than disappear.
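Steal time can be measured straight from /proc/stat, where it is the eighth counter on the aggregate `cpu` line. The sketch below takes two samples of that line and reports steal as a percentage of all CPU time between them. `steal_pct` is an illustrative helper name, not a standard tool.

```shell
#!/bin/sh
# Sketch: percentage of CPU time stolen by the hypervisor between two samples
# of the aggregate "cpu" line in /proc/stat. steal_pct is an illustrative helper.
steal_pct() {
    # $1 and $2 are full "cpu ..." lines. Counters after the label are:
    # user nice system idle iowait irq softirq steal
    # (the trailing guest counters are skipped; they are already part of user/nice).
    printf '%s\n%s\n' "$1" "$2" | awk '
        { total[NR] = 0; for (i = 2; i <= 9; i++) total[NR] += $i; steal[NR] = $9 }
        END {
            dt = total[2] - total[1]
            if (dt <= 0) { print "0.0"; exit }
            printf "%.1f\n", 100 * (steal[2] - steal[1]) / dt
        }'
}

# Usage on the PBX during a call:
# a=$(head -n 1 /proc/stat); sleep 5; b=$(head -n 1 /proc/stat); steal_pct "$a" "$b"
```

Sampling during an active call matters here, because a figure averaged over a mostly idle day hides the bursts that actually break audio.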

7. NAT Traversal on the ATA Should Be “No” When Using Tailscale

This is the most counterintuitive setting in the entire project. Normally, a SIP device behind NAT needs STUN or a public IP in its SDP so the PBX knows where to send RTP. But with Tailscale, the PBX can already reach the ATA’s LAN IP through the subnet router. Enabling NAT traversal causes the ATA to put the public IP in the SDP, which breaks audio because RTP must flow through the encrypted tunnel, not through carrier-grade NAT on the public internet.

References and Resources

Incredible PBX / Nerd Vittles

Ward Mundy’s Incredible PBX project is the foundation of this build. The installer script, the Travelin’ Man 3 firewall, and the comprehensive documentation at nerdvittles.com are what make deploying Asterisk on GCP feasible for mere mortals. His specific warning about e2-micro performance for VoIP (“don’t expect much more performance-wise than what you’d get with the original Raspberry Pi”) was prescient.

Callcentric PJSIP Vanilla Guide

Callcentric’s own Asterisk PJSIP configuration guide (callcentric.com/support/device/asterisk_pjsip) provides the canonical PJSIP settings for their service. I cross-referenced this with the FreePBX community forums, particularly contributor Stewart1, who provides detailed PJSIP guidance specific to FreePBX 17.

Tailscale Site-to-Site Documentation

Tailscale’s site-to-site networking documentation (tailscale.com/docs/features/site-to-site) covers the subnet router configuration, SNAT behavior, and the --snat-subnet-routes flag. The documentation is thorough but the asymmetric SNAT behavior (only masquerading traffic from tailscale0, not to it) is not prominently called out — you discover it through tcpdump.

FreePBX Community Forums

The FreePBX community forums at community.freepbx.org are an invaluable resource. The thread on Tailscale/pfSense/FreePBX integration, the Quick Create Extension bug discussion (GitHub issue #758), and Stewart1’s posts on Callcentric PJSIP configuration were all critical to solving problems in this project.

Grandstream HT801V2 Administration Guide

Available from grandstream.com/support/firmware. The VoIP-info forum thread on HT801 PJSIP configuration revealed the critical detail that the SIP server port must be appended to the server address (e.g., 100.x.y.z:5060) — the “Local SIP Port” field on the HT801V2 is for its own listener, not the server port.

ServerFault: Causes of RTP Jitter at the Server

The ServerFault thread documenting CPU steal time (ready time) causing periodic RTP jitter on VMware was the key insight for diagnosing the choppy audio issue. The specific observation that low CPU utilization makes hypervisors more aggressive about throttling — not less — explained why the e2-micro was particularly bad for VoIP despite Asterisk using only 2-5% CPU.

/* end of transmission */

built with: GCP free tier • Incredible PBX 2025 • Tailscale • Callcentric • Grandstream HT801V2 • stubbornness