Dev/ben graphs by plutoless · Pull Request #2128 · TEN-framework/ten-framework

plutoless · 2026-04-03T07:47:53Z

No description provided.

- Add new deepgram_tts extension using WebSocket streaming API - Support for Aura-2 voices with linear16/mulaw/alaw encoding - Add flux_gpt_5_4_deepgramtts graph (Flux STT + GPT-5.4 + Deepgram TTS) - Uses same DEEPGRAM_API_KEY as STT extensions

…raphs The basic_connections template included a flush command to avatar extension even for graphs without avatars, causing runtime errors. Now avatar flush is only added conditionally when has_avatar=True.

- Add comprehensive test suite (13 tests covering basic, error, metrics, params, and robustness) - Add README.md with configuration and usage documentation - Add test configs for various scenarios - Update manifest.json with base_url property - Update property.json with complete defaults

Restore AI_working_with_ten.md and AI_working_with_ten_compact.md which contain useful development guidelines for this branch.

Explicitly set api_key to empty string in property_miss_required.json to match cartesia_tts pattern. Without this, the default api_key from property.json with env var expansion was being used, causing the test_miss_required_params integration test to timeout instead of receiving the expected error response.

Previously, when the Deepgram websocket timed out waiting for audio, the code would silently break without yielding any termination event. This caused the extension to never call finish_request(), leaving tests waiting indefinitely for tts_audio_end. Now yields EVENT_TTS_ERROR on timeout, consistent with the Cartesia TTS pattern where all exit paths yield a termination event.

Deepgram TTS has a 10s websocket timeout which exceeds the test's timing assumptions (7s total: 2s + 5s wait). This mismatch causes false test failures.

Consistent with other TTS extensions (ElevenLabs, Bytedance), set the cancellation flag before closing websocket to ensure pending recv operations exit promptly.

Track whether tts_audio_start was sent for each request and ensure it is always sent before tts_audio_end, even in error cases. This fixes state machine errors in tests that expect the complete event sequence (start -> frames -> end).

github-actions · 2026-04-03T07:52:04Z

PR Review: Dev/ben graphs (#2128)

Thanks for this contribution — the new deepgram_tts extension and accompanying graph are well-structured overall, and the rebuild_property.py bug fix is a solid improvement. Here's my detailed feedback.

Bug Fix ✅

rebuild_property.py — flush-to-avatar connection
The fix to gate the flush command inside if has_avatar: is correct. Previously it was unconditionally added to main_control connections, which would have caused a broken graph for non-avatar configurations. Good catch.

Issues to Address

1. Race condition: `_is_cancelled` reset vs. concurrent `cancel()` call

In deepgram_tts.py, get() resets self._is_cancelled = False at the top of the method. Since cancel() is async and can be called concurrently, there's a window where:

cancel() sets _is_cancelled = True
A new get() call immediately resets it to False
The cancellation is silently dropped

Consider using asyncio.Event or protecting the flag reset with a check that no cancellation is in-flight, or use an integer generation counter to detect stale cancellations.

2. Fire-and-forget `asyncio.create_task` for preheat — silent failures

In both on_init and request_tts (when self.client is None):

asyncio.create_task(self.client.start())

This fires and forgets. If preheat fails (network error, bad API key, etc.), the error is only logged — the extension reports successful initialization and the caller has no visibility into the failure. For request_tts specifically, self.client.get(text) is called immediately after without waiting for the connection, which means _ensure_connection() inside get() will handle reconnection, but this creates an unnecessary double-connect scenario.

Suggestion: await client.start() in on_init, or at minimum propagate preheat errors via on_error so the runtime knows initialization is degraded.

3. WebSocket reconnection yields `EVENT_TTS_FLUSH` to caller

In deepgram_tts.py, when EVENT_TTS_FLUSH is received mid-stream, the code closes the WebSocket, reconnects, and then yields the flush event back to the caller. The reconnect is an internal implementation detail — the caller may not expect a flush event to arrive mid-reconnect or after a cancel. Consider whether the flush event should be consumed internally here rather than propagated upstream.

4. `config.update_params()` strips all fields into typed config

Unlike the pattern described in AGENTS.md (which strips only api_key at request time, leaving other params in the params dict for passthrough), DeepgramTTSConfig.update_params() strips api_key, model, encoding, sample_rate, and base_url — leaving self.params empty after the call. This isn't a bug, but it diverges from the documented pattern and means any future Deepgram-supported params would need explicit code changes rather than being passed through automatically.

Minor Issues

5. `manifest.json`: `sample_rate` uses `int64` instead of `int32`

Per AGENTS.md conventions, integer params should use int32. The sample_rate field in deepgram_tts/manifest.json uses int64. Please update to int32 for consistency.

6. Test skip in `test_invalid_text_handling.py`

The skip comment notes that the test assumes a 7-second wait but the extension has a 10-second internal timeout. The skip is acceptable short-term, but it would be better to make the test wait configurable per extension rather than skip it entirely. Consider adding a tts_timeout parameter to the test fixture.

Concern: AI documentation files committed to repo root

The two new files ai/AI_working_with_ten.md and ai/AI_working_with_ten_compact.md appear to be personal operational runbooks containing:

Hardcoded paths specific to one contributor's server (/home/ubuntu/ten-framework/, ten_agent_dev container name)
Personal deployment workflows that won't generalize to other contributors
A git commit-msg hook that blocks the word "claude" in commit messages — this would affect any contributor using Claude Code for commits

If this documentation is meant to help AI coding assistants, it should be generalized (no hardcoded personal paths) and the commit hook policy should be discussed with the team before being committed as official guidance. If it's personal notes, consider keeping it outside the repo or in a personal branch.

Nits

property.json array reformatting (inline → multi-line for on_user_joined/on_user_left/tool_register) is cosmetic — fine to include but produces noisy diffs
Thymia graph: min_speech_duration changed from 20.0 → 10.0 seconds with no explanation in the PR description — worth a brief comment on why this was adjusted

Summary

Area	Status
`rebuild_property.py` bug fix	✅ Correct
`deepgram_tts` WebSocket design	⚠️ Race condition + preheat error handling
Config pattern consistency	⚠️ Diverges from documented pattern
`manifest.json` type	🔧 `int64` → `int32`
AI docs in repo root	❗ Needs discussion (personal paths + git hook concern)
Test coverage	🔧 Skip should be temporary

The core extension logic is solid — the main asks are addressing the _is_cancelled race condition, preheat error propagation, and resolving the AI docs concern before merge.

wangyimin-agora · 2026-04-03T08:03:18Z

ai_agents/agents/ten_packages/extension/deepgram_tts/extension.py

+        try:
+            await super().on_init(ten_env)
+            config_json_str, _ = await self.ten_env.get_property_to_json("")
+            ten_env.log_info(f"config_json_str: {config_json_str}")


remove, key will be printed

wangyimin-agora · 2026-04-03T08:05:59Z

ai_agents/agents/ten_packages/extension/deepgram_tts/extension.py

+                self.sent_ts = None
+                self._audio_start_sent = False  # Reset for new request
+                if t.metadata is not None:
+                    self.session_id = t.metadata.get("session_id", "")


these two parameters not used in other places

wangyimin-agora · 2026-04-03T08:52:24Z

ai_agents/agents/integration_tests/tts_guarder/tests/test_invalid_text_handling.py

+    # Skip for extensions with longer timeouts that don't match test timing assumptions
+    # Deepgram TTS has a 10s timeout, but this test waits only 7s total (2s + 5s)
+    if extension_name == "deepgram_tts":
+        pytest.skip("Deepgram TTS timeout (10s) exceeds test timing assumptions (7s)")


This case is designed for verifying tts client can recover immediately after an impossible error input text.
It is mandatory for all tts extension

wangyimin-agora · 2026-04-03T08:56:51Z

ai_agents/agents/ten_packages/extension/deepgram_tts/deepgram_tts.py

+    async def _ensure_connection(self) -> None:
+        """Ensure websocket connection is established"""
+        if not self.ws:
+            await self._connect()


Reconnection should be implemented after disconnection ASAP, it is later to reconnect when get next input text.

wangyimin-agora · 2026-04-03T09:08:43Z

ai_agents/agents/ten_packages/extension/deepgram_tts/deepgram_tts.py

+                    break
+
+                try:
+                    message = await asyncio.wait_for(


it's better to use duplex for websocket. One progress for sending, and another progress for receiving

wangyimin-agora · 2026-04-03T09:10:16Z

ai_agents/agents/ten_packages/extension/deepgram_tts/extension.py

+            elif self.current_request_finished:
+                self.ten_env.log_error(
+                    f"Received a message for a finished request_id "
+                    f"'{t.request_id}' with text_input_end=False."


text_input_end couldn't be hardcode

wangyimin-agora · 2026-04-03T09:11:22Z

ai_agents/agents/ten_packages/extension/deepgram_tts/extension.py

+                            self.ten_env.log_debug(
+                                "Received empty payload for TTS response"
+                            )
+                            if t.text_input_end:


duplicate code, it is better to call one function

Ubuntu added 10 commits March 27, 2026 07:39

fix: remove avatar connection from basic_connections for non-avatar g…

0252c3e

…raphs The basic_connections template included a flush command to avatar extension even for graphs without avatars, causing runtime errors. Now avatar flush is only added conditionally when has_avatar=True.

Merge remote-tracking branch 'origin/main' into dev/ben-graphs

a72b5c4

chore: restore ai documentation files for dev/ben-graphs branch

a0c4c02

Restore AI_working_with_ten.md and AI_working_with_ten_compact.md which contain useful development guidelines for this branch.

test: skip test_invalid_text_handling for deepgram_tts

626cc4e

Deepgram TTS has a 10s websocket timeout which exceeds the test's timing assumptions (7s total: 2s + 5s wait). This mismatch causes false test failures.

fix: set _is_cancelled flag in stop() for proper cleanup

52c1d3e

Consistent with other TTS extensions (ElevenLabs, Bytedance), set the cancellation flag before closing websocket to ensure pending recv operations exit promptly.

plutoless requested a review from halajohn as a code owner April 3, 2026 07:47

wangyimin-agora reviewed Apr 3, 2026

View reviewed changes

wangyimin-agora suggested changes Apr 3, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Dev/ben graphs#2128

Dev/ben graphs#2128
plutoless wants to merge 10 commits intomainfrom
dev/ben-graphs

plutoless commented Apr 3, 2026

Uh oh!

github-actions bot commented Apr 3, 2026

Uh oh!

wangyimin-agora Apr 3, 2026

Uh oh!

wangyimin-agora Apr 3, 2026

Uh oh!

wangyimin-agora Apr 3, 2026

Uh oh!

wangyimin-agora Apr 3, 2026

Uh oh!

wangyimin-agora Apr 3, 2026

Uh oh!

wangyimin-agora Apr 3, 2026

Uh oh!

wangyimin-agora Apr 3, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

plutoless commented Apr 3, 2026

Uh oh!

github-actions bot commented Apr 3, 2026

PR Review: Dev/ben graphs (#2128)

Bug Fix ✅

Issues to Address

1. Race condition: _is_cancelled reset vs. concurrent cancel() call

2. Fire-and-forget asyncio.create_task for preheat — silent failures

3. WebSocket reconnection yields EVENT_TTS_FLUSH to caller

4. config.update_params() strips all fields into typed config

Minor Issues

5. manifest.json: sample_rate uses int64 instead of int32

6. Test skip in test_invalid_text_handling.py

Concern: AI documentation files committed to repo root

Nits

Summary

Uh oh!

wangyimin-agora Apr 3, 2026

Choose a reason for hiding this comment

Uh oh!

wangyimin-agora Apr 3, 2026

Choose a reason for hiding this comment

Uh oh!

wangyimin-agora Apr 3, 2026

Choose a reason for hiding this comment

Uh oh!

wangyimin-agora Apr 3, 2026

Choose a reason for hiding this comment

Uh oh!

wangyimin-agora Apr 3, 2026

Choose a reason for hiding this comment

Uh oh!

wangyimin-agora Apr 3, 2026

Choose a reason for hiding this comment

Uh oh!

wangyimin-agora Apr 3, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

1. Race condition: `_is_cancelled` reset vs. concurrent `cancel()` call

2. Fire-and-forget `asyncio.create_task` for preheat — silent failures

3. WebSocket reconnection yields `EVENT_TTS_FLUSH` to caller

4. `config.update_params()` strips all fields into typed config

5. `manifest.json`: `sample_rate` uses `int64` instead of `int32`

6. Test skip in `test_invalid_text_handling.py`