Real-Time AI Voice Chat
Description
Have a natural, spoken conversation with an AI.
Official Website
Visit the software's official website for more information:
github.com
What is Real-Time AI Voice Chat?
Have a natural, spoken conversation with an AI! This project lets you chat with a Large Language Model (LLM) using just your voice, receiving spoken responses in near real-time. Think of it as your own digital conversation partner. What's under the hood? A sophisticated client-server system built for low-latency interaction:
- Capture: Your voice is captured by your browser.
- Stream: Audio chunks are whisked away via WebSockets to a Python backend.
- Transcribe: RealtimeSTT rapidly converts your speech to text.
- Think: The text is sent to an LLM (like Ollama or OpenAI) for processing.
- Synthesize: The AI's text response is turned back into speech using RealtimeTTS.
- Return: The generated audio is streamed back to your browser for playback.
- Interrupt: Jump in anytime! The system handles interruptions gracefully.
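The pipeline above can be sketched as a simple chain of stages. This is a minimal illustration only: every function body here is a stub standing in for the real components (the browser capture, RealtimeSTT, the LLM backend, and RealtimeTTS each have their own APIs), and the chunk size is an assumed value, not taken from the project.

```python
# Illustrative sketch of the capture -> stream -> transcribe -> think ->
# synthesize -> return loop. All bodies are stubs, not the project's code.

CHUNK_SIZE = 3200  # ~100 ms of 16 kHz, 16-bit mono PCM (assumed format)

def chunk_audio(pcm: bytes, size: int = CHUNK_SIZE) -> list[bytes]:
    """Split a PCM buffer into the fixed-size chunks a client would stream."""
    return [pcm[i:i + size] for i in range(0, len(pcm), size)]

def transcribe(chunks: list[bytes]) -> str:
    """Stand-in for RealtimeSTT: speech -> text."""
    return "hello there"

def think(prompt: str) -> str:
    """Stand-in for the pluggable LLM backend (Ollama or OpenAI)."""
    return f"You said: {prompt}"

def synthesize(text: str) -> bytes:
    """Stand-in for RealtimeTTS: text -> audio bytes to stream back."""
    return text.encode("utf-8")

def handle_utterance(pcm: bytes) -> bytes:
    """One full turn: audio in, synthesized reply audio out."""
    chunks = chunk_audio(pcm)
    text = transcribe(chunks)
    reply = think(text)
    return synthesize(reply)
```

In the real system these stages run concurrently over a WebSocket so partial results stream back before a turn finishes; the linear version here only shows the order of the stages.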
Key features:
- Fluid Conversation: Speak and listen, just like a real chat.
- Real-Time Feedback: See partial transcriptions and AI responses as they happen.
- Low-Latency Focus: Optimized architecture using audio chunk streaming.
- Smart Turn-Taking: Dynamic silence detection (turndetect.py) adapts to the conversation pace.
- Flexible AI Brains: Pluggable LLM backends (Ollama by default, OpenAI support via llm_module.py).
- Customizable Voices: Choose from different Text-to-Speech engines (Kokoro, Coqui, Orpheus via audio_module.py).
- Web Interface: Clean and simple UI using Vanilla JS and the Web Audio API.
- Dockerized Deployment: Recommended setup using Docker Compose for easier dependency management.
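To make the "dynamic silence detection" feature concrete, here is a hedged sketch of energy-free, VAD-driven turn detection: the end of a turn is declared after a pause whose required length adapts to how the user has been speaking. The class name, thresholds, and adaptation heuristic are all assumptions for illustration, not the actual contents of turndetect.py.

```python
# Hypothetical sketch of adaptive silence-based turn detection,
# loosely in the spirit of turndetect.py (names/values are assumptions).

class TurnDetector:
    def __init__(self, base_silence_s: float = 0.8):
        self.base_silence_s = base_silence_s   # default required pause
        self.silence_s = base_silence_s        # current (adapted) pause
        self.elapsed_silence = 0.0

    def adapt(self, recent_turn_lengths_s: list[float]) -> None:
        """Shorten the required pause for quick back-and-forth exchanges,
        lengthen it when the user tends toward long, deliberate turns."""
        if not recent_turn_lengths_s:
            return
        avg = sum(recent_turn_lengths_s) / len(recent_turn_lengths_s)
        # Scale between 0.5x and 1.5x of the base pause (assumed heuristic).
        factor = min(1.5, max(0.5, avg / 5.0))
        self.silence_s = self.base_silence_s * factor

    def feed(self, is_speech: bool, frame_s: float) -> bool:
        """Feed one VAD frame; return True once the turn has ended."""
        if is_speech:
            self.elapsed_silence = 0.0
            return False
        self.elapsed_silence += frame_s
        return self.elapsed_silence >= self.silence_s
```

A detector like this also supports the interruption feature naturally: any speech frame arriving while the AI is talking resets the silence timer and can signal the playback side to stop.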