iOS (Swift)
Native iOS applications integrate with the Atoms agent over the raw WebSocket protocol with zero third-party dependencies.
URLSessionWebSocketTask handles transport and has been available since iOS 13. AVAudioEngine captures microphone PCM16 through an input tap, and AVAudioPlayerNode plays agent audio with sample-accurate scheduling.
Configure AVAudioSession with the .playAndRecord category and .voiceChat mode to enable the system echo cancellation pipeline.
Validated end-to-end on the iOS simulator (iPhone 16 Pro, iOS 26.4): URLSessionWebSocketTask connects, AVAudioEngine input tap captures, AVAudioPlayerNode plays back. On the simulator, speaker output loops back into the Mac microphone, so the server’s VAD fires interruption events continuously; test on a real device (earphones or an HFP Bluetooth headset) to confirm clean barge-in behavior.
When to use native iOS
- iOS-only app, or a cross-platform app where iOS is the priority platform.
- You want zero external dependencies for audio and networking.
- You need fine control over audio session routing (Bluetooth, CarPlay, external mics).
If your app is primarily React Native, the React Native guide is simpler. For Flutter, see Flutter.
Dependencies
None. URLSessionWebSocketTask and AVAudioEngine are part of the iOS SDK. Minimum deployment target: iOS 13.
Permissions
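Add the NSMicrophoneUsageDescription key to Info.plist with a short, user-facing explanation of why the app needs the microphone.
Request microphone permission at runtime before starting a session.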
AVAudioApplication.requestRecordPermission replaces the deprecated AVAudioSession.sharedInstance().requestRecordPermission in iOS 17. For earlier versions, use the older API.
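A minimal sketch of the runtime request, covering both API generations. The function name requestMicrophonePermission is illustrative and not part of any SDK.

```swift
import AVFoundation

/// Asks the user for microphone access and reports whether it was granted.
/// Uses AVAudioApplication on iOS 17+, and the older AVAudioSession API before that.
func requestMicrophonePermission() async -> Bool {
    if #available(iOS 17.0, *) {
        return await AVAudioApplication.requestRecordPermission()
    } else {
        // Bridge the callback-based API into async/await.
        return await withCheckedContinuation { continuation in
            AVAudioSession.sharedInstance().requestRecordPermission { granted in
                continuation.resume(returning: granted)
            }
        }
    }
}
```

Call it before configuring the audio session, and do not open the WebSocket if it returns false.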
Audio session setup
The audio session governs capture and playback routing. For a bidirectional voice call, configure it as .playAndRecord with the .voiceChat mode. .voiceChat enables the system’s echo cancellation and noise suppression pipeline.
setPreferredSampleRate(24000) asks the hardware to match the rate the server negotiates. The system may not honor it exactly on all devices. If the active sample rate differs, resample before sending or when receiving.
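The setup described above could look roughly like the following. The helper name configureVoiceSession is an assumption made for illustration. It returns the rate the hardware actually settled on so the caller can decide whether resampling is needed.

```swift
import AVFoundation

/// Configures the shared audio session for a two-way voice call.
/// Returns the sample rate the hardware actually uses, which may differ
/// from the preferred rate.
func configureVoiceSession(preferredRate: Double = 24_000) throws -> Double {
    let session = AVAudioSession.sharedInstance()
    // .voiceChat turns on the system echo cancellation pipeline.
    try session.setCategory(.playAndRecord, mode: .voiceChat)
    // This is a request, not a guarantee.
    try session.setPreferredSampleRate(preferredRate)
    try session.setActive(true)
    return session.sampleRate
}
```

If the returned value is not 24000, convert capture and playback buffers (for example with AVAudioConverter) rather than assuming the rates match.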
Quickstart
A full working session: configure the audio session, open the WebSocket, stream mic PCM16, play agent PCM16, close cleanly.
Open the WebSocket
URLSessionWebSocketTask.receive is one-shot. Re-call it after every message to keep the stream flowing.
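A sketch of the re-arming receive loop. The AgentSocket class and its callbacks are hypothetical names. The endpoint URL is passed in by the caller because this guide does not state it here, and the sketch assumes the server sends text frames.

```swift
import Foundation

final class AgentSocket {
    private let task: URLSessionWebSocketTask
    /// Called for every text frame. Runs on URLSession's delegate queue.
    var onText: ((String) -> Void)?
    /// Called when a send or receive fails.
    var onFailure: ((Error) -> Void)?

    init(url: URL, session: URLSession = .shared) {
        task = session.webSocketTask(with: url)
    }

    func connect() {
        task.resume()
        receiveNext()
    }

    func send(_ text: String) {
        task.send(.string(text)) { [weak self] error in
            if let error { self?.onFailure?(error) }
        }
    }

    func close() {
        task.cancel(with: .normalClosure, reason: nil)
    }

    private func receiveNext() {
        task.receive { [weak self] result in
            guard let self else { return }
            switch result {
            case .success(let message):
                if case .string(let text) = message { self.onText?(text) }
                // receive is one-shot, so re-arm it after every message.
                self.receiveNext()
            case .failure(let error):
                // Stop the loop and let the caller decide whether to reconnect.
                self.onFailure?(error)
            }
        }
    }
}
```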
Microphone capture
Install an audio tap on the input node. The tap runs on a high-priority audio thread and hands you an AVAudioPCMBuffer every few milliseconds. Convert it to Int16 PCM and send as base64.
The tap’s closure runs on the audio thread. Keep it short. Do not block on UI updates or synchronous I/O. URLSessionWebSocketTask.send is asynchronous and non-blocking.
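The capture path might be sketched as follows. Both function names are illustrative. The sketch assumes the input node delivers Float32 samples (its usual format) and leaves both resampling to the negotiated rate and the JSON envelope to the caller, since neither is specified in this section.

```swift
import AVFoundation

/// Converts Float32 samples in [-1, 1] to little-endian PCM16 bytes.
func pcm16Data<S: Sequence>(from samples: S) -> Data where S.Element == Float {
    var data = Data()
    for sample in samples {
        // Clamp to avoid overflow; treat NaN as silence.
        let clamped: Float = sample.isNaN ? 0 : max(-1, min(1, sample))
        let value = Int16(clamped * Float(Int16.max)).littleEndian
        withUnsafeBytes(of: value) { data.append(contentsOf: $0) }
    }
    return data
}

/// Installs a tap on the input node and hands each chunk to `onChunk`
/// as base64-encoded PCM16. `onChunk` runs on the audio thread.
func startCapture(on engine: AVAudioEngine,
                  onChunk: @escaping (String) -> Void) throws {
    let input = engine.inputNode
    let format = input.outputFormat(forBus: 0)
    input.installTap(onBus: 0, bufferSize: 1024, format: format) { buffer, _ in
        // Use the first channel only (mono capture).
        guard let channel = buffer.floatChannelData?[0] else { return }
        let samples = UnsafeBufferPointer(start: channel,
                                          count: Int(buffer.frameLength))
        onChunk(pcm16Data(from: samples).base64EncodedString())
    }
    engine.prepare()
    try engine.start()
}
```

Keep onChunk limited to wrapping the payload and calling the socket's asynchronous send, so the tap returns quickly.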
Playback
Schedule incoming PCM16 chunks on an AVAudioPlayerNode. The player node manages its own queue, so you can schedule many buffers in sequence and they play gaplessly.
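One possible shape for the playback side, assuming mono PCM16 at 24 kHz arrives as raw bytes (already base64-decoded). AgentPlayer and floatSamples are hypothetical names.

```swift
import AVFoundation

/// Decodes little-endian PCM16 bytes into Float32 samples in [-1, 1).
/// A trailing odd byte is ignored.
func floatSamples(fromPCM16 data: Data) -> [Float] {
    let bytes = [UInt8](data)
    return stride(from: 0, to: bytes.count - 1, by: 2).map { i -> Float in
        let bits = UInt16(bytes[i]) | (UInt16(bytes[i + 1]) << 8)
        return Float(Int16(bitPattern: bits)) / 32768
    }
}

final class AgentPlayer {
    private let node = AVAudioPlayerNode()
    private let format: AVAudioFormat

    /// Attaches a player node to `engine`. The engine must be running
    /// before the first call to `enqueue`.
    init?(engine: AVAudioEngine, sampleRate: Double = 24_000) {
        guard let format = AVAudioFormat(commonFormat: .pcmFormatFloat32,
                                         sampleRate: sampleRate,
                                         channels: 1,
                                         interleaved: false) else { return nil }
        self.format = format
        engine.attach(node)
        engine.connect(node, to: engine.mainMixerNode, format: format)
    }

    /// Appends one chunk to the node's queue; chunks play back to back.
    func enqueue(pcm16 data: Data) {
        let samples = floatSamples(fromPCM16: data)
        guard !samples.isEmpty,
              let buffer = AVAudioPCMBuffer(pcmFormat: format,
                                            frameCapacity: AVAudioFrameCount(samples.count)),
              let channel = buffer.floatChannelData?[0] else { return }
        buffer.frameLength = AVAudioFrameCount(samples.count)
        for (i, sample) in samples.enumerated() { channel[i] = sample }
        node.scheduleBuffer(buffer, completionHandler: nil)
        if !node.isPlaying { node.play() }
    }
}
```

The mixer converts from the 24 kHz buffer format to the hardware rate, so playback does not need a manual resampling step in this sketch.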
Handle server events
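A minimal sketch of event dispatch. The event names agent_start_talking, agent_stop_talking, and error come from this guide. The JSON shape is an assumption: the sketch supposes each message is an object with a string type field and, for errors, an integer code field. Check the protocol reference for the real field names before relying on it.

```swift
import Foundation

enum ServerEvent: Equatable {
    case agentStartTalking
    case agentStopTalking
    case error(code: Int?)
    case other(String)
}

/// Parses one text frame into a ServerEvent.
/// Assumes {"type": "...", "code": ...}; both field names are hypothetical.
func parseServerEvent(_ text: String) -> ServerEvent? {
    guard let data = text.data(using: .utf8),
          let object = try? JSONSerialization.jsonObject(with: data) as? [String: Any],
          let type = object["type"] as? String else { return nil }
    switch type {
    case "agent_start_talking": return .agentStartTalking
    case "agent_stop_talking": return .agentStopTalking
    case "error": return .error(code: object["code"] as? Int)
    default: return .other(type)
    }
}
```

Returning .other for unknown types keeps the client forward-compatible with events it does not yet handle.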
Threading model
- Audio callbacks (mic tap, player completion) run on a high-priority audio thread. Touch no UI state from there. Dispatch to MainActor for anything the user sees.
- WebSocket callbacks run on URLSession’s delegate queue. Same rule: no UI on that queue; hop to main for anything visual.
- handleServerEvent above dispatches implicitly through Foundation serialization; it is safe to call from the WS delegate queue.
A clean pattern: from the audio or WS thread, call await MainActor.run { viewModel.handleStateChange("connected") }.
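Sketched out, the pattern might look like this. CallViewModel is a hypothetical view model; only handleStateChange appears in the text above. The second helper shows the equivalent hop from a synchronous callback, where await is not available.

```swift
import Combine

/// All UI-facing state lives on the main actor.
@MainActor
final class CallViewModel: ObservableObject {
    @Published private(set) var state = "idle"

    func handleStateChange(_ newState: String) {
        state = newState
    }
}

/// From async code running off the main actor.
func reportConnected(to viewModel: CallViewModel) async {
    await MainActor.run { viewModel.handleStateChange("connected") }
}

/// From a synchronous callback (audio tap, WebSocket completion handler).
func reportDisconnected(to viewModel: CallViewModel) {
    Task { @MainActor in viewModel.handleStateChange("disconnected") }
}
```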
Interruption handling
When a phone call comes in or the user triggers Siri, the audio session posts an interruption notification. Pause capture and playback, resume on the “ended” notification.
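A sketch of the notification handling, with the pause and resume work supplied by the caller. The function name observeInterruptions is illustrative. Resuming only when the system sets shouldResume is a conservative choice, not a requirement stated in this guide.

```swift
import AVFoundation

/// Observes audio session interruptions. Keep the returned token and pass it
/// to NotificationCenter.removeObserver when the call ends.
func observeInterruptions(pause: @escaping () -> Void,
                          resume: @escaping () -> Void) -> NSObjectProtocol {
    NotificationCenter.default.addObserver(
        forName: AVAudioSession.interruptionNotification,
        object: AVAudioSession.sharedInstance(),
        queue: .main
    ) { note in
        guard let raw = note.userInfo?[AVAudioSessionInterruptionTypeKey] as? UInt,
              let type = AVAudioSession.InterruptionType(rawValue: raw) else { return }
        switch type {
        case .began:
            pause()
        case .ended:
            let optionsRaw = note.userInfo?[AVAudioSessionInterruptionOptionKey] as? UInt ?? 0
            let options = AVAudioSession.InterruptionOptions(rawValue: optionsRaw)
            if options.contains(.shouldResume) { resume() }
        @unknown default:
            break
        }
    }
}
```

In the resume closure, reactivate the audio session and restart the engine before reinstalling the tap, since the system stops the engine during an interruption.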
Route changes
Bluetooth connect/disconnect, headphone unplug, and CarPlay activation trigger AVAudioSession.routeChangeNotification. The audio engine handles most transitions transparently. Subscribe if you want to update UI (show “using Bluetooth” indicator, etc.).
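For the UI indicator case, a subscription could be sketched like this. The function name and the choice to report a single Bluetooth flag are assumptions for illustration.

```swift
import AVFoundation

/// Reports whether the current output route is a Bluetooth device
/// each time the route changes. The callback runs on the main queue.
func observeRouteChanges(
    onChange: @escaping (_ usingBluetooth: Bool) -> Void
) -> NSObjectProtocol {
    let bluetoothPorts: [AVAudioSession.Port] = [.bluetoothHFP, .bluetoothA2DP, .bluetoothLE]
    return NotificationCenter.default.addObserver(
        forName: AVAudioSession.routeChangeNotification,
        object: nil,
        queue: .main
    ) { _ in
        let outputs = AVAudioSession.sharedInstance().currentRoute.outputs
        onChange(outputs.contains { bluetoothPorts.contains($0.portType) })
    }
}
```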
Background modes
For calls that continue when the user locks the screen, add the appropriate UIBackgroundModes entry (typically audio) to Info.plist.
Apple’s review expects this to be used for VoIP-style apps. Combine with PushKit and CallKit for a compliant VoIP experience. For short in-app calls that end when backgrounded, skip this and tear down on UIApplication.didEnterBackgroundNotification.
Production hardening
Reconnect on transient failure
When the connection drops, URLSessionWebSocketTask reports a URLError; a server-initiated close carries a WebSocket close code instead. Retry only on network-transient URLError codes (.notConnectedToInternet, .timedOut, .networkConnectionLost), never on auth close codes (4401, 4403) or a clean client close.
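The retry rule can be expressed as a small pure function. This is a sketch: the function names are hypothetical, treating close code 1000 as the clean close is an assumption, and the capped exponential backoff is an addition not specified in this guide.

```swift
import Foundation

/// Decides whether a dropped connection should be retried.
func shouldReconnect(error: Error?, closeCode: Int?) -> Bool {
    let authCloseCodes: Set<Int> = [4401, 4403]
    let normalClosure = 1000  // assumed to represent a clean close
    if let closeCode, authCloseCodes.contains(closeCode) { return false }
    if closeCode == normalClosure { return false }
    guard let urlError = error as? URLError else { return false }
    switch urlError.code {
    case .notConnectedToInternet, .timedOut, .networkConnectionLost:
        return true
    default:
        return false
    }
}

/// Delay before the given retry attempt (0-based): 1 s, 2 s, 4 s, ... capped at 30 s.
func reconnectDelay(attempt: Int) -> TimeInterval {
    min(30, pow(2, Double(max(0, attempt))))
}
```

Reset the attempt counter once a connection succeeds so a later drop starts again from the shortest delay.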
Mic mute while agent speaks
To reduce echo when headset AEC underperforms, stop the mic tap on agent_start_talking and reinstall it on agent_stop_talking. The user’s speech during that window goes undetected, which is usually preferable to the agent hearing itself.
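A sketch of that gating, driven by the two event names above. MicGate is a hypothetical helper. It tracks whether the tap is installed because installing a second tap on the same bus raises an exception. Call it from a single queue.

```swift
import AVFoundation

final class MicGate {
    private let input: AVAudioInputNode
    private let onBuffer: (AVAudioPCMBuffer) -> Void
    private var tapInstalled = false

    init(engine: AVAudioEngine, onBuffer: @escaping (AVAudioPCMBuffer) -> Void) {
        input = engine.inputNode
        self.onBuffer = onBuffer
    }

    /// Installs the tap if it is not already installed.
    func open() {
        guard !tapInstalled else { return }
        let onBuffer = self.onBuffer
        input.installTap(onBus: 0, bufferSize: 1024,
                         format: input.outputFormat(forBus: 0)) { buffer, _ in
            onBuffer(buffer)
        }
        tapInstalled = true
    }

    /// Removes the tap so no microphone audio is sent.
    func close() {
        guard tapInstalled else { return }
        input.removeTap(onBus: 0)
        tapInstalled = false
    }

    /// Mutes the mic while the agent speaks and reopens it afterwards.
    func handle(eventType: String) {
        switch eventType {
        case "agent_start_talking": close()
        case "agent_stop_talking": open()
        default: break
        }
    }
}
```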
Error event from server
The error event from the server carries actionable codes. Surface auth failures (401, 403) to the user immediately and stop retrying.

