Open
Description
Issue
Add voice input capability to HumanCLI so users can speak commands instead of typing them. Validate that it works on the Unitree Go2 platform.
Requirements
- Implement speech-to-text using Python audio libraries (e.g., pyaudio, speech_recognition, or similar)
- Add directly to HumanCLI module (dimos/agents/cli/human.py)
- Support microphone input on Go2
- Handle audio device selection/configuration
- Test on actual Go2 hardware
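As a starting point for the device selection and microphone capture requirements, here is a minimal sketch using the speech_recognition package. The function names `pick_device_index` and `transcribe_once` are hypothetical and are not part of HumanCLI. The use of `recognize_google` (an external web API) is an assumption, and an on-device engine could be swapped in.

```python
from typing import List, Optional


def pick_device_index(device_names: List[Optional[str]], preferred: Optional[str]) -> Optional[int]:
    """Return the index of the first device whose name contains `preferred`.

    Returns None (meaning "use the system default device") when no
    preference is given or when nothing matches. Hypothetical helper.
    """
    if not preferred:
        return None
    needle = preferred.lower()
    for index, name in enumerate(device_names):
        if name and needle in name.lower():
            return index
    return None


def transcribe_once(preferred_device: Optional[str] = None, timeout: float = 5.0) -> str:
    """Capture one utterance from the microphone and return its transcript.

    Hypothetical helper; requires the speech_recognition and pyaudio packages.
    """
    # Imported lazily so this module still loads when the package is missing.
    import speech_recognition as sr

    names = sr.Microphone.list_microphone_names()
    index = pick_device_index(names, preferred_device)
    recognizer = sr.Recognizer()
    with sr.Microphone(device_index=index) as source:
        # Sample ambient noise briefly to calibrate the energy threshold.
        recognizer.adjust_for_ambient_noise(source, duration=0.5)
        audio = recognizer.listen(source, timeout=timeout, phrase_time_limit=10)
    # Assumption: an external web API is acceptable for transcription.
    return recognizer.recognize_google(audio)
```

Matching on a name substring rather than a fixed index is a deliberate choice here, since device indices can change between boots or when USB audio hardware is replugged. The actual device name on the Go2 would need to be checked on hardware.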
Implementation Considerations
- Use lightweight STT that runs on-device or can call external API
- Handle noise/background audio on robot
- Provide fallback to typed input if voice fails
- Toggle for enabling/disabling voice input
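The fallback and toggle considerations could be combined roughly as follows. This is a sketch under assumptions: `read_command`, `voice_enabled_from_env`, and the `HUMANCLI_VOICE` variable name are all hypothetical, and the real HumanCLI input loop may be structured differently. The transcriber is passed in as a callable so the fallback logic can be exercised without a microphone.

```python
from typing import Callable, Mapping


def voice_enabled_from_env(env: Mapping[str, str], key: str = "HUMANCLI_VOICE") -> bool:
    """Read the voice toggle from an environment-like mapping.

    The variable name is an assumption made for illustration only.
    """
    return env.get(key, "").strip().lower() in {"1", "true", "yes", "on"}


def read_command(
    voice_enabled: bool,
    transcribe: Callable[[], str],
    typed_input: Callable[[str], str] = input,
) -> str:
    """Return one command, preferring voice and falling back to typed input."""
    if voice_enabled:
        try:
            text = transcribe().strip()
            if text:
                return text
            print("No speech recognized; falling back to typed input.")
        except Exception as exc:
            # Broad catch on purpose: a missing device, a timeout, or an API
            # error should all degrade to typing instead of crashing the CLI.
            print(f"Voice input failed ({exc}); falling back to typed input.")
    return typed_input("> ").strip()
```

A caller might wire this up as `read_command(voice_enabled_from_env(os.environ), transcribe_fn)`, where `transcribe_fn` is whatever speech-to-text function the module ends up providing.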
Acceptance Criteria
- Voice input works on Go2
- User can speak commands to the agent
- Transcription is accurate enough for navigation/control commands
- Graceful fallback to text input if voice unavailable
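One way to make "accurate enough" measurable is word error rate (WER) over a small set of spoken navigation/control commands recorded on the robot. The sketch below is an illustration, not a required metric: `word_error_rate` is a hypothetical helper, and the pass threshold is left open because the issue does not specify one.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Comparison is case-insensitive and splits on whitespace only.
    By convention here, an empty reference scores 0.0 against an empty
    hypothesis and 1.0 against anything else.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        return 0.0 if not hyp else 1.0

    # Levenshtein distance over words, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)
```

For example, a reference of "move forward two meters" transcribed as "move forward to meters" has one substitution in four words, giving a WER of 0.25.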
Related
- Pair with: Add voice output (text-to-speech) to HumanCLI for Go2 #1273 (Voice output TTS)