Professional Voice Cloning - Best Practices

To get the most accurate and natural voice clone, provide high-quality reference audio. The recording best practices are the same as those for Instant Voice Cloning, which you can find here:

🔗 Instant Voice Cloning - Best Practices

However, Professional Voice Cloning (PVC) significantly improves upon Instant Voice Cloning in the following ways:

🎙️ How PVC Enhances Voice Cloning

1. Handles Background Noise More Effectively

  • PVC can filter out mild background noise without affecting voice quality.
  • PVC adapts better than Instant Voice Cloning to real-world recording conditions.
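
To get a rough sense of whether the background noise in a reference recording is "mild", you can estimate its signal-to-noise ratio before uploading. The sketch below is illustrative only and is not part of the Smallest AI tooling. It assumes a 16-bit mono PCM WAV file and uses a simple heuristic: the quietest tenth of frames is treated as the noise floor and the loudest tenth as speech. The function names, the frame size, and the heuristic itself are all assumptions made for this example.

```python
import array
import math
import sys
import wave


def read_mono_16bit_wav(path):
    """Return (samples, sample_rate) for a 16-bit mono PCM WAV file."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2 or wav.getnchannels() != 1:
            raise ValueError("expected 16-bit mono PCM audio")
        samples = array.array("h")
        samples.frombytes(wav.readframes(wav.getnframes()))
        sample_rate = wav.getframerate()
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    return list(samples), sample_rate


def frame_rms(samples, frame_size):
    """Root-mean-square level of each non-overlapping frame."""
    levels = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        levels.append(math.sqrt(sum(s * s for s in frame) / frame_size))
    return levels


def estimate_snr_db(samples, frame_size=480):
    """Heuristic SNR: loudest 10% of frames vs. quietest 10% of frames."""
    levels = sorted(frame_rms(samples, frame_size))
    if len(levels) < 10:
        raise ValueError("recording too short to estimate noise")
    tenth = len(levels) // 10
    noise = sum(levels[:tenth]) / tenth    # assumed noise floor
    speech = sum(levels[-tenth:]) / tenth  # assumed speech level
    if noise == 0:
        return math.inf  # digitally silent gaps: no measurable noise
    return 20 * math.log10(speech / noise)
```

For a real file you would call `read_mono_16bit_wav` on your own recording and pass the samples to `estimate_snr_db`. A higher value suggests a cleaner recording; any cutoff you choose for "mild" noise is your own judgment call, not a documented limit.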

2. Captures a More Natural Speaking Style

  • Supports a wider range of tones and vocal inflections.
  • Preserves the natural rhythm and personality of speech.

3. Understands Extreme Emotions & Variability

  • PVC models can learn from expressive speech, making them ideal for voices with dynamic emotions (anger, excitement, sadness).
  • Instant Voice Cloning may struggle with highly expressive tones.

4. Handles Inconsistent Speaking Patterns

  • Can learn from pauses, breath sounds, and fluctuations in speaking speed.
  • Works well even if the reference recordings contain slight variations.

5. More Robust for Long-Form Content

  • Best suited for audiobook narration, dubbing, and professional voice applications.
  • Produces high-quality results even in long recordings.

If you have any questions or run into any issues, our community is here to help!

  • Join our Discord server to connect with other developers and get real-time support.
  • Reach out to our team via email: support@smallest.ai.