# MobDial Agents (MobAgents) — Full Documentation Index

> MobDial provides AI voice infrastructure: text-to-speech, speech-to-text, voice cloning, conversational voice agents, and generative audio. Everything is accessible through a REST API with official Python and TypeScript SDKs, and through a web application for no-code use. A documentation index is available at https://www.mobdial.com/llms.txt and https://www.mobdial.com/llms-full.txt. Append `.md` to any documentation page URL for its markdown source. Powered by MobDial Agents (MobAgents).
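
The `.md` convention mentioned above can be applied programmatically. The following is a minimal sketch; the helper name `markdown_source_url` is hypothetical, and the handling of trailing slashes, query strings, and fragments is an assumption rather than documented behavior.

```python
from urllib.parse import urlsplit, urlunsplit


def markdown_source_url(page_url: str) -> str:
    """Build the markdown-source URL for a documentation page by appending `.md`."""
    parts = urlsplit(page_url)
    # Assumption: a trailing slash is dropped before the suffix is added.
    path = parts.path.rstrip("/")
    if not path.endswith(".md"):
        path += ".md"
    # Query string and fragment are carried over unchanged.
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


print(markdown_source_url("https://www.mobdial.com/docs/overview/models"))
```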

## How MobDial works

**Voices** are the speech personas used in audio generation. Each voice has a unique ID (for example, `JBFqnCBsd6RMkjVDRZzb`) that you pass in each request that generates speech. MobDial maintains a library of 10,000+ voices (https://www.mobdial.com/docs/overview/capabilities/voices). You can also clone a voice from an audio recording or generate one from a text description.

**Models** control the quality, latency, and language coverage of generated audio. `mobdial_v3` produces the most expressive output across 70+ languages. `mobdial_flash_v2_5` targets real-time use at ~75ms latency. Each capability — speech-to-text, music, sound effects — has its own dedicated model. See https://www.mobdial.com/docs/overview/models.

**Credits** are the unit of API consumption. Text-to-speech costs one credit per character of input text. Other operations are charged per second of audio processed. Your credit allowance renews monthly, and unused credits roll over for up to two months. See https://www.mobdial.com/pricing.
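
To make the per-character rule concrete, here is a rough estimator for text-to-speech usage. It is a sketch under stated assumptions: the function name is hypothetical, characters are counted as Python string length (Unicode code points), and the `credits_per_char` parameter exists only so a different rate can be tried. How discounted models map to credits is not specified here, so check the pricing page before relying on any rate other than 1.0.

```python
import math


def estimate_tts_credits(text: str, credits_per_char: float = 1.0) -> int:
    """Estimate credits for one text-to-speech request.

    The default of one credit per character comes from the description above.
    Any other rate passed in is the caller's assumption.
    """
    return math.ceil(len(text) * credits_per_char)


line = "The first move is what sets everything in motion."
print(f"{len(line)} characters -> {estimate_tts_credits(line)} credits")
```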

## Models

- **MobDial v3** — most emotionally rich, expressive speech synthesis. Dramatic delivery, 70+ languages, 5,000 character limit, natural multi-speaker dialogue.
- **MobDial Multilingual v2** — lifelike, consistent quality. 29 languages, 10,000 character limit, most stable on long-form generations.
- **MobDial Flash v2.5** — fast and affordable. Ultra-low latency (~75ms, excluding application & network latency), 32 languages, 40,000 character limit, 50% lower price per character.
- **Scribe v2** — state-of-the-art speech recognition. 90+ languages, keyterm prompting (up to 1000 terms), entity detection, word-level timestamps, speaker diarization (up to 32 speakers).
- **Scribe v2 Realtime** — real-time speech recognition. 90+ languages, ~150ms latency, precise word-level timestamps.
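
Because each speech model above has a character limit, long input may need to be split before it is sent. The sketch below splits on sentence boundaries and falls back to a hard cut for oversized sentences. It assumes the limit applies per request; the `CHAR_LIMITS` table and the `split_for_model` helper are illustrative names, not part of the SDK.

```python
import re

# Character limits taken from the model list above, keyed by display name.
CHAR_LIMITS = {
    "MobDial v3": 5_000,
    "MobDial Multilingual v2": 10_000,
    "MobDial Flash v2.5": 40_000,
}


def split_for_model(text: str, model: str) -> list[str]:
    """Split text into chunks that each fit within the model's character limit."""
    limit = CHAR_LIMITS[model]
    chunks: list[str] = []
    current = ""
    # Split after sentence-ending punctuation followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        # Hard-cut any single sentence that is itself longer than the limit.
        pieces = [sentence[i:i + limit] for i in range(0, len(sentence), limit)]
        for piece in pieces:
            candidate = f"{current} {piece}" if current else piece
            if len(candidate) <= limit:
                current = candidate
            else:
                chunks.append(current)
                current = piece
    if current:
        chunks.append(current)
    return chunks


script = "This is a sentence. " * 600  # 12,000 characters of input
parts = split_for_model(script, "MobDial v3")
print(len(parts), [len(p) for p in parts])
```

Splitting at sentence boundaries keeps each request's prosody self-contained, which tends to matter more for speech than for plain text processing.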

## API Quickstart

Create an API key in the dashboard and store it as a managed secret (a `.env` file or your app's configuration).

```
MOBDIAL_API_KEY=<your_api_key_here>
```

Install the SDK:

```
# Python
pip install mobdial python-dotenv

# TypeScript
npm install @mobdial/sdk dotenv
```

Make your first text-to-speech request:

```python
from dotenv import load_dotenv
from mobdial.client import MobDial
from mobdial.play import play
import os

load_dotenv()
mobdial = MobDial(api_key=os.getenv("MOBDIAL_API_KEY"))

audio = mobdial.text_to_speech.convert(
    text="The first move is what sets everything in motion.",
    voice_id="JBFqnCBsd6RMkjVDRZzb",
    model_id="mobdial_v3",
    output_format="mp3_44100_128",
)
play(audio)
```

```typescript
import "dotenv/config";
import { MobDial } from "@mobdial/sdk";
import { play } from "@mobdial/sdk/play";

const mobdial = new MobDial({ apiKey: process.env.MOBDIAL_API_KEY });

const audio = await mobdial.textToSpeech.convert({
  text: "The first move is what sets everything in motion.",
  voiceId: "JBFqnCBsd6RMkjVDRZzb",
  modelId: "mobdial_v3",
  outputFormat: "mp3_44100_128",
});
await play(audio);
```

Full reference: https://www.mobdial.com/docs/mobdial-api/quickstart

## Capabilities

- **Text to Speech** (https://www.mobdial.com/docs/overview/capabilities/text-to-speech): Convert text into lifelike speech.
- **Speech to Text** (https://www.mobdial.com/docs/overview/capabilities/speech-to-text): Transcribe spoken audio into text.
- **Speech Engine** (https://www.mobdial.com/docs/overview/capabilities/speech-engine): Add real-time voice to any chat agent.
- **Music** (https://www.mobdial.com/docs/overview/capabilities/music): Generate music from text.
- **Text to Dialogue** (https://www.mobdial.com/docs/overview/capabilities/text-to-dialogue): Create natural-sounding multi-speaker dialogue.
- **Voice Changer** (https://www.mobdial.com/docs/overview/capabilities/voice-changer): Modify and transform voices.
- **Voice Isolator** (https://www.mobdial.com/docs/overview/capabilities/voice-isolator): Isolate voices from background noise.
- **Dubbing** (https://www.mobdial.com/docs/overview/capabilities/dubbing): Dub audio and video across languages.
- **Sound Effects** (https://www.mobdial.com/docs/overview/capabilities/sound-effects): Create cinematic sound effects from text.
- **Forced Alignment** (https://www.mobdial.com/docs/overview/capabilities/forced-alignment): Align text to audio with word-level timestamps.
- **Voices** (https://www.mobdial.com/docs/overview/capabilities/voices): Clone and design custom voices.

## MobDial Agents

Deploy intelligent conversational voice agents: configure behavior and procedures, attach a knowledge base and tools, connect integrations and voices, monitor conversations, test, and deploy across phone numbers, WhatsApp, iMessage, and outbound campaigns.

- Overview: https://www.mobdial.com/docs/mobdial-agents/overview
- Agent tooling (tools, function calling, knowledge bases): https://www.mobdial.com/docs/mobdial-agents/tooling
- Phone Numbers: https://www.mobdial.com/docs/mobdial-agents/phone-numbers
- WhatsApp: https://www.mobdial.com/docs/mobdial-agents/whatsapp
- iMessage: https://www.mobdial.com/docs/mobdial-agents/imessage
- Outbound: https://www.mobdial.com/docs/mobdial-agents/outbound

## Concepts

- **Audio streaming** (https://www.mobdial.com/docs/concepts/audio-streaming): Stream audio as it generates to reduce latency rather than waiting for the complete file.
- **Latency** (https://www.mobdial.com/docs/concepts/latency): What contributes to end-to-end latency and how to minimize it.
- **Voice cloning** (https://www.mobdial.com/docs/concepts/voice-cloning): Create a custom voice from a short audio recording.
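
The audio streaming idea above can be illustrated without calling the API. This sketch assumes a streaming response can be consumed as an iterable of byte chunks; that shape is an assumption, not a confirmed SDK interface. `fake_audio_stream` is a stand-in generator and `consume_stream` is a hypothetical helper. It shows why streaming lowers perceived latency: the first chunk can be handled long before the last one arrives.

```python
import io
import time
from typing import BinaryIO, Iterable, Iterator, Optional


def fake_audio_stream(n_chunks: int = 3, delay: float = 0.01) -> Iterator[bytes]:
    """Stand-in for a streaming response: yields small byte chunks with a delay."""
    for i in range(n_chunks):
        time.sleep(delay)
        yield bytes([i]) * 4


def consume_stream(chunks: Iterable[bytes], sink: BinaryIO) -> Optional[float]:
    """Write chunks to `sink` as they arrive.

    Returns the seconds elapsed until the first chunk, or None if the stream was empty.
    """
    start = time.perf_counter()
    first_chunk_after = None
    for chunk in chunks:
        if first_chunk_after is None:
            first_chunk_after = time.perf_counter() - start
        sink.write(chunk)
    return first_chunk_after


buffer = io.BytesIO()
time_to_first = consume_stream(fake_audio_stream(), buffer)
print(f"first chunk after {time_to_first:.3f}s, {len(buffer.getvalue())} bytes total")
```

In a real application the sink would typically be an audio player or an HTTP response rather than an in-memory buffer.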

## Reference

- **API Reference** (https://www.mobdial.com/api-reference) and **OpenAPI spec** (https://www.mobdial.com/api/docs/openapi.json).
- **Libraries & SDKs** (https://www.mobdial.com/docs/reference/sdks): Official Python and TypeScript SDKs with full type definitions.
- **Errors** (https://www.mobdial.com/docs/reference/errors): Error codes and recommended handling.
- **Webhooks** (https://www.mobdial.com/docs/reference/webhooks): Real-time events with HMAC-SHA256 signed payloads and automatic retry.
- **Zero Retention Mode** (https://www.mobdial.com/docs/reference/zero-retention): Process requests without storing input or output.
- **Breaking changes policy** (https://www.mobdial.com/docs/reference/breaking-changes): How MobDial versions and deprecates APIs.
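
Since webhook payloads are signed with HMAC-SHA256, a receiver should verify the signature before trusting an event. The sketch below uses only the Python standard library. It assumes the signature is a hex digest computed over the raw request body with a shared secret; the actual header name, encoding, and any timestamp component are not specified in this index, so confirm them in the webhooks reference. Both function names are hypothetical.

```python
import hashlib
import hmac


def sign_payload(secret: str, body: bytes) -> str:
    """Compute a hex-encoded HMAC-SHA256 digest of the raw request body."""
    return hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()


def verify_signature(secret: str, body: bytes, received_signature: str) -> bool:
    """Compare the expected and received signatures in constant time."""
    expected = sign_payload(secret, body)
    return hmac.compare_digest(expected.encode("utf-8"), received_signature.encode("utf-8"))


secret = "example-webhook-secret"  # placeholder value, not a real secret
body = b'{"type": "example.event"}'  # placeholder payload
signature = sign_payload(secret, body)

print(verify_signature(secret, body, signature))         # untampered body
print(verify_signature(secret, body + b" ", signature))  # modified body
```

Using `hmac.compare_digest` instead of `==` avoids leaking how many leading characters matched through timing differences.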

## Compliance & Security

MobDial enforces TCPA consent, Do-Not-Call (DNC) scrubbing, time-zone-aware calling windows, and an immutable consent vault. See https://www.mobdial.com/security.
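
As an illustration of how these checks combine, the sketch below gates an outbound call on consent, a Do-Not-Call list, and the recipient's local time. It is not MobDial's implementation: the function name, the in-memory sets, and the 8:00 to 21:00 local window (a common reading of TCPA calling hours) are all assumptions. A fixed UTC offset keeps the example self-contained; in practice `zoneinfo.ZoneInfo("America/New_York")` would also handle daylight saving time.

```python
from datetime import datetime, time, timedelta, timezone, tzinfo

# Assumption: calls are allowed from 08:00 up to (not including) 21:00, recipient-local time.
CALL_WINDOW = (time(8, 0), time(21, 0))


def may_place_call(
    number: str,
    now_utc: datetime,
    recipient_tz: tzinfo,
    dnc_numbers: set[str],
    consented_numbers: set[str],
) -> bool:
    """Return True only if the number has consent, is not on the DNC list,
    and the recipient's local time falls inside the calling window."""
    if number in dnc_numbers or number not in consented_numbers:
        return False
    local_time = now_utc.astimezone(recipient_tz).time()
    start, end = CALL_WINDOW
    return start <= local_time < end


eastern = timezone(timedelta(hours=-5))  # fixed offset used for illustration only
consented = {"+15550100", "+15550101"}   # hypothetical numbers
dnc = {"+15550101"}

morning = datetime(2024, 1, 15, 13, 30, tzinfo=timezone.utc)  # 08:30 local
too_early = datetime(2024, 1, 15, 12, 30, tzinfo=timezone.utc)  # 07:30 local

print(may_place_call("+15550100", morning, eastern, dnc, consented))
print(may_place_call("+15550100", too_early, eastern, dnc, consented))
print(may_place_call("+15550101", morning, eastern, dnc, consented))
```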
