Flutter
Flutter applications integrate with the Atoms agent over the raw WebSocket protocol.
The Dart web_socket_channel package handles transport. mic_stream captures microphone PCM16, and flutter_pcm_sound plays agent audio with low-latency scheduling.
The stack is WebRTC-free. No LiveKit, no Daily, no platform-specific media engines.
Validated end-to-end on the iOS simulator: WebSocket connects, mic captures, agent audio plays back. On the simulator, speaker output loops back into the Mac microphone, so the server’s VAD fires interruption events continuously; test on a real device (earphones or an HFP Bluetooth headset) to confirm clean barge-in behavior. mic_stream on iOS does not configure the audio session for voice chat, so you get no echo cancellation out of the box. See the iOS audio session section.
When to use Flutter
- Cross-platform mobile or desktop app with a shared Dart codebase.
- You want a single audio pipeline that works across iOS, Android, and desktop targets.
- You do not need character-level TTS alignment timings.
For single-platform native apps, the iOS (Swift) guide gives you full platform control with fewer intermediaries.
Dependencies
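A sketch of the pubspec.yaml entries; the version constraints are illustrative, so check pub.dev for the current releases before pinning:

```yaml
dependencies:
  flutter:
    sdk: flutter
  web_socket_channel: ^2.4.0   # raw WebSocket transport
  mic_stream: ^0.7.1           # PCM16 microphone capture
  flutter_pcm_sound: ^3.1.0    # low-latency PCM playback
  audio_session: ^0.1.18       # iOS AVAudioSession configuration
  permission_handler: ^11.3.0  # runtime mic permission
```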
Platform configuration
iOS
Add to ios/Runner/Info.plist:
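```xml
<key>NSMicrophoneUsageDescription</key>
<string>Microphone access is required to talk to the voice agent.</string>
<!-- Optional: keep audio running when the app is backgrounded -->
<key>UIBackgroundModes</key>
<array>
  <string>audio</string>
</array>
```

The usage description is mandatory; iOS terminates the app at capture time without it. The audio background mode is only needed if the call should survive backgrounding.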
Android
Add to android/app/src/main/AndroidManifest.xml:
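```xml
<uses-permission android:name="android.permission.INTERNET" />
<uses-permission android:name="android.permission.RECORD_AUDIO" />
<uses-permission android:name="android.permission.MODIFY_AUDIO_SETTINGS" />
```

RECORD_AUDIO covers capture and INTERNET covers the WebSocket; MODIFY_AUDIO_SETTINGS is commonly recommended so the platform can switch to the voice-communication audio path.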
Minimum Android SDK should be 24 (Android 7) for mic_stream compatibility. Set in android/app/build.gradle:
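```groovy
android {
    defaultConfig {
        minSdkVersion 24   // Android 7; required by mic_stream
    }
}
```

With the Kotlin Gradle DSL the equivalent is minSdk = 24 inside defaultConfig.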
Request at runtime
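A minimal sketch using the permission_handler package (an assumption; any permission plugin works):

```dart
import 'package:permission_handler/permission_handler.dart';

// Prompts the user on first call; returns true once granted.
Future<bool> ensureMicPermission() async {
  final status = await Permission.microphone.request();
  return status.isGranted;
}
```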
Quickstart
A full agent session: check permission, open the WebSocket, stream mic audio, play agent audio, clean up.
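A compact sketch of the whole loop. The endpoint URL, the 24 kHz sample rate, the binary-audio-vs-JSON-event framing, and the exact plugin signatures (FlutterPcmSound.setup/feed, MicStream.microphone parameters) are assumptions to verify against the wire protocol reference and the plugin versions you pin:

```dart
import 'dart:convert';
import 'dart:typed_data';

import 'package:flutter_pcm_sound/flutter_pcm_sound.dart';
import 'package:mic_stream/mic_stream.dart';
import 'package:web_socket_channel/web_socket_channel.dart';

Future<void> runSession(Uri agentUrl) async {
  // 1. Mic permission must already be granted (see Request at runtime).

  // 2. Open the WebSocket to the agent.
  final channel = WebSocketChannel.connect(agentUrl);
  await channel.ready;

  // 3. Prepare playback; the rate must match what the agent sends.
  await FlutterPcmSound.setup(sampleRate: 24000, channelCount: 1);

  // 4. Stream mic PCM16 frames to the server as binary messages.
  final micSub = MicStream.microphone(
    audioSource: AudioSource.VOICE_COMMUNICATION,
    sampleRate: 24000,
    channelConfig: ChannelConfig.CHANNEL_IN_MONO,
    audioFormat: AudioFormat.ENCODING_PCM_16BIT,
  ).listen(channel.sink.add);

  // 5. Binary frames are agent audio; text frames are JSON events.
  await for (final message in channel.stream) {
    if (message is Uint8List) {
      final samples = Int16List.view(
          message.buffer, message.offsetInBytes, message.lengthInBytes ~/ 2);
      await FlutterPcmSound.feed(PcmArrayInt16.fromList(samples.toList()));
    } else {
      final event = jsonDecode(message as String);
      // Dispatch agent_start_talking / agent_stop_talking etc. here.
    }
  }

  // 6. Clean up when the server closes the socket.
  await micSub.cancel();
  await channel.sink.close();
}
```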
Microphone capture
MicStream.microphone returns a Stream<Uint8List> synchronously (not a Future); don’t await it.
On Android, AudioSource.VOICE_COMMUNICATION selects the platform’s echo-cancelled audio path. On iOS, mic_stream uses AVCaptureSession with a default capture device and does not configure AVAudioSession for voice chat. For iOS echo cancellation on a real device, configure the audio session yourself via the audio_session package (category .playAndRecord, mode .voiceChat) before starting the stream, or mute the mic while the agent speaks (see Mic mute while agent speaks).
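A capture sketch showing the synchronous return; channel is the open WebSocketChannel from the quickstart, and the parameter names should be verified against the mic_stream version you pin:

```dart
// Returned synchronously as Stream<Uint8List> -- do not await it.
final mic = MicStream.microphone(
  audioSource: AudioSource.VOICE_COMMUNICATION, // echo-cancelled path on Android
  sampleRate: 16000,
  channelConfig: ChannelConfig.CHANNEL_IN_MONO,
  audioFormat: AudioFormat.ENCODING_PCM_16BIT,
);

// Forward each PCM16 frame to the agent as a binary WebSocket message.
final sub = mic.listen((Uint8List frame) => channel.sink.add(frame));
```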
Server events
FlutterPcmSound.feed queues the chunk for playback. Internally the plugin manages a ring buffer on the platform side and drains it at the hardware sample rate, so you can push chunks as fast as they arrive.
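A dispatch sketch; the event field name (event_type here) and the specific event names are assumptions to check against the wire protocol reference:

```dart
void onServerMessage(dynamic message) {
  if (message is Uint8List) {
    // Agent audio: feed queues it, and the native ring buffer drains
    // at the hardware sample rate, so push chunks as fast as they arrive.
    final samples = Int16List.view(
        message.buffer, message.offsetInBytes, message.lengthInBytes ~/ 2);
    FlutterPcmSound.feed(PcmArrayInt16.fromList(samples.toList()));
    return;
  }
  final event = jsonDecode(message as String) as Map<String, dynamic>;
  switch (event['event_type'] as String?) {
    case 'agent_start_talking':
      // Optionally mute the mic (see Mic mute while agent speaks).
      break;
    case 'agent_stop_talking':
      break;
  }
}
```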
Platform differences
iOS audio session
mic_stream does not configure AVAudioSession. It uses AVCaptureSession directly, so you get whatever the system default category is (usually .soloAmbient), and no echo cancellation. Configure the session yourself with audio_session before starting the stream:
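A configuration sketch with the audio_session package:

```dart
import 'package:audio_session/audio_session.dart';

// Put iOS into the voice-chat audio path (enables the system AEC)
// before starting microphone capture.
Future<void> configureVoiceSession() async {
  final session = await AudioSession.instance;
  await session.configure(const AudioSessionConfiguration(
    avAudioSessionCategory: AVAudioSessionCategory.playAndRecord,
    avAudioSessionMode: AVAudioSessionMode.voiceChat,
    avAudioSessionCategoryOptions: AVAudioSessionCategoryOptions.allowBluetooth,
  ));
  await session.setActive(true);
}
```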
.playAndRecord + .voiceChat enables the iOS system AEC pipeline and the voice-chat audio mode. Without it, the agent hears its own audio through the mic and the server’s VAD fires continuous interruption events. If your app uses other audio plugins (for example, just_audio for media playback), coordinate their session categories through the same audio_session package; two plugins fighting over the session will cause one to silence the other.
Android foreground service
If the call should continue when the app is backgrounded, start a foreground service on the native Android side. mic_stream keeps capturing briefly when backgrounded, but Android 12+ revokes mic access within seconds unless a foreground service declaring the phoneCall type is running. See the Android foreground services for voice calls reference for the service implementation.
Flutter-side, trigger the service from your MainActivity or via a plugin like flutter_background_service.
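A platform-channel trigger sketch; the channel and method names are hypothetical and must match whatever you register in MainActivity:

```dart
import 'package:flutter/services.dart';

// Hypothetical channel name -- mirror it on the Kotlin/Java side.
const _callService = MethodChannel('com.example.app/voice_call_service');

Future<void> startCallService() => _callService.invokeMethod('start');
Future<void> stopCallService() => _callService.invokeMethod('stop');
```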
Desktop support
mic_stream and flutter_pcm_sound currently target mobile. Desktop targets (macOS, Windows, Linux) need flutter_webrtc or platform-channel bridges. If you need desktop today, use web_socket_channel for the WS and write platform-channel code for capture and playback.
Threading and isolates
WebSocketChannel events arrive on the main isolate. MicStream.microphone delivers its Uint8List frames on the main isolate as well. FlutterPcmSound.feed is fast (it enqueues to a native buffer), but avoid calling it from a blocking UI build method.
For CPU-intensive preprocessing (resampling, denoising beyond what the platform provides), use compute() or a dedicated isolate. The baseline pipeline shown here does not need one.
Interruption handling
Incoming phone calls and other audio-focus events interrupt the stream. On Android, subscribe to audio focus via a platform channel or the audio_session plugin. On iOS, the operating system pauses mic_stream automatically and resumes after the interruption ends.
Call _installInterruptionHandler once during app startup.
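A sketch of such a handler using audio_session's interruption stream; micSub is the mic subscription from the quickstart:

```dart
import 'package:audio_session/audio_session.dart';

Future<void> _installInterruptionHandler() async {
  final session = await AudioSession.instance;
  session.interruptionEventStream.listen((event) {
    if (event.begin) {
      // A phone call or Siri took the audio route: stop sending mic audio.
      micSub?.pause();
    } else {
      // Interruption ended: resume streaming to the agent.
      micSub?.resume();
    }
  });
}
```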
Production hardening
Reconnect on transient failure
The onError/onDone callbacks on the WebSocket stream fire when the connection drops. Retry with exponential backoff up to 30 s for transient network errors. Do not retry on close codes 1000, 4401, 4403.
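A backoff sketch; pumpSession is a hypothetical helper that runs the session loop and returns when the socket closes:

```dart
import 'package:web_socket_channel/web_socket_channel.dart';

Future<void> connectWithRetry(Uri uri) async {
  const fatalCloseCodes = {1000, 4401, 4403}; // normal close, auth failures
  var delay = const Duration(seconds: 1);
  while (true) {
    try {
      final channel = WebSocketChannel.connect(uri);
      await channel.ready;
      delay = const Duration(seconds: 1); // reset after a successful connect
      await pumpSession(channel);          // returns when the socket closes
      if (fatalCloseCodes.contains(channel.closeCode)) return;
    } catch (_) {
      // Transient network error: fall through to backoff.
    }
    await Future<void>.delayed(delay);
    delay *= 2;
    if (delay > const Duration(seconds: 30)) {
      delay = const Duration(seconds: 30); // cap at 30 s
    }
  }
}
```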
Mic mute while agent speaks
If your target devices have weak echo cancellation, cancel the mic subscription on agent_start_talking and restart it on agent_stop_talking. The user's speech during the agent's turn goes undetected, but that is a safer trade-off than audible feedback.
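A mute sketch; channel is the open WebSocketChannel and the mic parameters match the capture section:

```dart
import 'dart:async';
import 'dart:typed_data';

import 'package:mic_stream/mic_stream.dart';

StreamSubscription<Uint8List>? micSub;

void onAgentTalking(String eventType) {
  if (eventType == 'agent_start_talking') {
    micSub?.cancel(); // hard mute: no mic audio reaches the server
    micSub = null;
  } else if (eventType == 'agent_stop_talking') {
    micSub ??= MicStream.microphone(
      audioSource: AudioSource.VOICE_COMMUNICATION,
      sampleRate: 16000,
      channelConfig: ChannelConfig.CHANNEL_IN_MONO,
      audioFormat: AudioFormat.ENCODING_PCM_16BIT,
    ).listen(channel.sink.add);
  }
}
```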
App lifecycle
Use WidgetsBindingObserver to tear down on background transitions:
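A sketch; CallScreen and endCall are hypothetical names for your call UI and teardown routine:

```dart
import 'package:flutter/material.dart';

class _CallScreenState extends State<CallScreen> with WidgetsBindingObserver {
  @override
  void initState() {
    super.initState();
    WidgetsBinding.instance.addObserver(this);
  }

  @override
  void dispose() {
    WidgetsBinding.instance.removeObserver(this);
    super.dispose();
  }

  @override
  void didChangeAppLifecycleState(AppLifecycleState state) {
    if (state == AppLifecycleState.paused) {
      endCall(); // cancel the mic subscription, close the WebSocket
    }
  }

  @override
  Widget build(BuildContext context) => const SizedBox.shrink();
}
```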
Battery
The pipeline draws roughly 3–5 % battery per hour on mobile, comparable to the native implementations. Do not keep the session open while idle.
Next steps
- The full wire protocol with every message type, payload, and error code.
- Native iOS integration with URLSessionWebSocketTask and AVAudioEngine.
- A cross-platform mobile client built on react-native-audio-api.
- HTTP status codes returned by every Atoms endpoint.

