Aperture · Audio · AUD · TTS · 2026

voice-synth

Natural speech and voice cloning.

Low-latency text-to-speech with natural prosody and optional voice cloning — built for agents and apps.


Specs

Voices: 40+ presets

Cloning: from a 30-second sample

Formats: WAV · MP3 · stream

Avg latency: ~0.3 s

Capabilities

Natural prosody · Cloning · Streaming · Low latency

Sample prompt

“Read this announcement in a warm, upbeat tone.”

Pricing

$0.02 per minute
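
For budgeting, the listed rate can be turned into a rough estimate. The sketch below assumes billing is strictly linear in audio duration with no minimum charge and no rounding to whole minutes; the page does not say how partial minutes are billed, so treat the result as an approximation. The function name estimate_cost_usd is hypothetical.

```python
from decimal import Decimal

# Listed price from the pricing section above.
PRICE_PER_MINUTE_USD = Decimal("0.02")


def estimate_cost_usd(audio_seconds: float) -> Decimal:
    """Estimate cost for a given audio duration.

    Assumption: linear per-second proration with no minimum charge.
    The actual billing granularity is not documented on this page.
    """
    if audio_seconds < 0:
        raise ValueError("audio_seconds must be non-negative")
    minutes = Decimal(str(audio_seconds)) / Decimal(60)
    # Round to a hundredth of a cent to keep the estimate readable.
    return (minutes * PRICE_PER_MINUTE_USD).quantize(Decimal("0.0001"))
```

Under these assumptions, one hour of generated speech (3600 seconds) comes to about $1.20.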


API

Call it in one request

curl https://api.aperture.network/v1/audio/generations \
  -H "Authorization: Bearer $APERTURE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"voice-synth","prompt":"Read this announcement in a warm, upbeat tone."}'
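
The same request can be sketched in Python using only the standard library. The URL, headers, and JSON fields mirror the curl command above; the helper names build_request and generate are hypothetical, and because the page does not describe the response format, generate simply returns the raw response bytes rather than assuming audio or JSON.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
API_URL = "https://api.aperture.network/v1/audio/generations"


def build_request(prompt: str, api_key: str, model: str = "voice-synth") -> urllib.request.Request:
    """Build the POST request without sending it (hypothetical helper)."""
    body = json.dumps({"model": model, "prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def generate(prompt: str, api_key: str, timeout: float = 30.0) -> bytes:
    """Send the request and return the raw response body.

    Assumption: the response format (audio bytes vs. JSON) is not
    documented on this page, so the caller decides how to interpret it.
    Raises urllib.error.HTTPError on non-2xx responses.
    """
    request = build_request(prompt, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


# Example usage (performs a network call, so it is left commented out):
#   import os
#   data = generate("Read this announcement in a warm, upbeat tone.",
#                   os.environ["APERTURE_API_KEY"])
```

Separating request construction from sending keeps the payload easy to inspect or log before any network call is made.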
