feat: add health-supplement-search ability #214
megz2020 wants to merge 13 commits into openhome-dev:dev
Conversation
Voice-driven semantic search over 100 curated supplement products. Supports Weaviate (built-in Snowflake Arctic embeddings) and Qdrant (Jina AI embeddings) via a config flag. Falls back to Serper web search when a supplement is not found in the local DB.
- Remove unused json import from main.py
- Replace CONFIG_FILE/load_config with top-level constant block
- Update README to document constants-based setup (not JSON file)
- Fix setup branch link in README (root, not subfolder path)
Architecture:
- Add per-provider threshold note to DISTANCE_THRESHOLD config comment
- Extract trigger text in call() to pre-fill first search turn
- Initialize _last_results/_last_source/_trigger_text in call() (not class-level)

Code quality:
- Remove LLM fallback from _wants_exit; expand EXIT_WORDS with phrase set
- Add ordering comment above rerank/detail checks
- Add _strip_llm_fences + ordinal word fallback to _wants_detail int parse
- Wrap _log/_err in try/except matching local-event-explorer pattern
- Add isinstance(reviews, list) guard in _detail_response
- Add payload guard in _wants_detail list comprehension

Performance:
- Wrap all text_to_text_response calls in asyncio.to_thread (non-blocking)
- Make _summarize_curated, _summarize_web, _detail_response async
- Expand _DETAIL_TRIGGERS: add affirmative follow-ups (yes, give me, show me, the first/second/third) so 'Yes. Give me' correctly routes to detail mode
- Add clarification response when detail intent detected but product not resolved (e.g. STT garble like 'the restaurant') instead of falling through to search
- Tighten _summarize_curated prompt: explicitly forbid inferring benefits not listed in the data to prevent LLM hallucination (e.g. 'cancer treatment')
- Add _is_health_query() guard: keyword-first check then short LLM fallback rejects off-topic inputs before triggering a vector DB search
- Add thank you/thanks/cheers to EXIT_WORDS (covers 'Thank you, Snowby' garble)
- Add short-input LLM fallback in _wants_exit for inputs <=5 words that pass keyword check — catches STT garbles of goodbye that keyword matching misses
- Add _just_showed_detail flag: blocks _DETAIL_TRIGGERS from re-matching on the turn immediately after detail was shown, preventing the double-detail loop
- Strip HTML tags from reviews before passing to _detail_response using _strip_html() — source data contains raw <span className=...> tags that garble the review text and cause the LLM to paraphrase instead of quoting
…ealth_query

Previously the LLM fallback only ran for inputs <=6 words, so long off-topic queries like "What is the result between Liverpool and Tottenham today?" bypassed the guard and triggered a supplement search. Now the LLM is called for all inputs that don't match health keywords.
- Remove implementation-detail and narrative comments; keep only "why" comments
- Update README: Qdrant as primary provider, STT resilience section, run_io_loop listed
- Apply ruff format (no logic changes)
…apping

- Declare _trigger_text and _just_showed_detail as class attributes to match OpenHome convention (alongside _last_results / _last_source)
- Remove awkward multi-line parens ruff introduced around pending_guess and confirmed_search inline comments
✅ Community PR Path Check — Passed. All changed files are inside the …
🔀 Branch Merge Check — PR direction: ✅ Passed
✅ Ability Validation Passed
🔍 Lint Results ✅
uzair401 left a comment
Hey @megz2020, ran this through the voice naturalness audit. LLM-based intent routing is correctly used throughout and the STT resilience design is well thought out. A few issues to address:
1. Hardcoded string matching

- The guess confirmation tuple is missing coverage for common spoken affirmatives. Add: "absolutely", "go ahead", "do it", "sounds good", "for sure", "yup".
- `_wants_rerank` uses hardcoded substring matching ("best rated", "highest rated") which will miss paraphrases like "which one has the best reviews", "most popular", "sort by rating". Replace with an LLM classifier:

```python
result = self.capability_worker.text_to_text_response(
    f"The user said: '{user_input}'. Are they asking to sort results by rating? "
    "Reply ONLY with: RATING_HIGH, RATING_LOW, or NO."
).strip().upper()
```

- `"more"`, `"ok"`, and `"okay"` in `_DETAIL_TRIGGERS` will produce false positives on inputs like "no more", "one more search", "ok thanks". Remove them — the LLM path in `_wants_detail` already handles intent resolution correctly without these.
2. LLM classifier prompts missing few-shot examples

- Both `_wants_exit` classifier prompts provide no examples, which reduces reliability on STT-garbled farewell phrases. Add inline examples: "'bye', 'that's all', 'im done', 'cheers' = YES. Reply YES or NO only."
- `_guess_health_intent` provides no garbled input examples for the LLM to calibrate against. Add: "Examples: 'join te pin' → 'joint pain', 'sleep iz shoes' → 'sleep issues'."
3. EXIT_WORDS coverage gap
The set is missing several common spoken closing phrases. Add: "i'm good", "all set", "i'm all set", "that's enough", "nothing else".
4. LLM output formatting constraints incomplete
`_summarize_curated`, `_summarize_web`, and `_detail_response` instruct the LLM to avoid markdown but do not explicitly prohibit bullet points or numbered lists. A response like "1. Product A 2. Product B" will pass the markdown check but produce broken TTS output. Add to all three prompts: "Plain spoken English only. No bullet points, no numbered lists, no formatting of any kind."
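One way to apply the same constraint to all three prompts is a shared suffix constant; the names and prompt framing here are hypothetical, not the ability's actual code:

```python
# Shared voice-output rules appended to every summarization prompt.
VOICE_STYLE_RULES = (
    "Plain spoken English only. No bullet points, no numbered lists, "
    "no formatting of any kind."
)

def summarize_prompt(results_text: str) -> str:
    # Keep delivery short (2-3 sentences) per the voice delivery guidelines.
    return (
        "Summarize these supplement results for a voice assistant in at most "
        f"three short sentences. {VOICE_STYLE_RULES}\n\n{results_text}"
    )

print(summarize_prompt("Product A, 4.5 stars"))
```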
5. Response length violations

- The opening `speak()` string is 46 words, exceeding the 30-word ceiling. Refactor to: "Welcome to Health Supplement Search — informational only, not medical advice. What health concern can I help you with?" (17 words)
- The setup error string references `main.py` and the README — both are meaningless in a voice context. Refactor to: "Health Supplement Search isn't configured yet. Please add your API keys and re-upload the ability."
- `_summarize_curated` instructs the LLM to respond in 3-4 sentences. Per voice delivery guidelines, result delivery should be capped at 2-3 sentences max. Update accordingly.
No menu-driven flow issues found. Please push fixes and we'll take another look!
- Add _normalize_query() to clean garbled STT before vector search
- Replace hardcoded _wants_rerank with LLM classifier
- Replace keyword-based _wants_exit with fully LLM-based classifier
- Add few-shot examples to exit/intent/guess prompts
- Expand affirmatives; remove false-positive detail triggers
- Cap LLM responses to 30-40 words for voice-appropriate length
- Bump DISTANCE_THRESHOLD to 0.85 for Weaviate compatibility
- Replace magic numbers with named constants
Adds a new voice-driven health supplement search ability that lets users ask about a health concern and get personalized supplement recommendations from a curated database of 100 real iHerb products.
Health Supplement Search demo (Loom recording)
- Semantic vector search over 100 curated supplement products (names, brands, ratings, reviews, effects)
- Supports Qdrant Cloud (free 1 GB tier, recommended) and Weaviate Cloud (14-day sandbox) as vector backends
- Falls back to Serper web search when a product isn't found in the local database
- Multi-turn conversation: ask for product details, re-rank by rating, or search a new concern
- STT-resilient: handles garbled voice input via LLM intent classification and a guess-and-confirm flow
- Passes the local validator (validate_ability.py) with zero errors
How It Works
1. The user speaks a health concern ("find me something for joint pain")
2. The query is embedded via Jina AI (jina-embeddings-v3, 1024 dims) and searched against the Qdrant collection
3. If cosine distance < 0.70, curated results are returned; otherwise fall back to Serper
4. Results are summarized by the OpenHome LLM into a natural voice response
5. The user can ask for details on a specific product, re-rank by rating, or search something new
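The curated-vs-web routing in the steps above can be sketched as follows; the threshold value is the one stated here, but the function shape and hit format are assumptions:

```python
DISTANCE_THRESHOLD = 0.70  # cosine distance ceiling for curated matches

def route_results(hits, web_search):
    """Return curated hits when the closest match is under the threshold,
    otherwise fall back to a Serper-style web search."""
    if hits and hits[0]["distance"] < DISTANCE_THRESHOLD:
        return "curated", hits
    return "web", web_search()

source, results = route_results(
    [{"name": "Glucosamine Chondroitin", "distance": 0.42}],
    web_search=lambda: [],
)
print(source)  # → curated
```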
STT Resilience
Voice recognition often garbles health queries. This ability handles it in two layers:
1. LLM intent check: all inputs of 3+ words go through an LLM to judge health intent, even if no keyword matched
2. Guess and confirm: short or ambiguous inputs trigger a guess ("Did you mean joint pain?") — a "yes" confirms and searches
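The two layers above can be sketched as a single turn handler; the affirmative set, the 3-word cutoff, and the hardcoded guess are illustrative stand-ins (the real ability asks the LLM to produce the guess):

```python
AFFIRMATIVES = {"yes", "yeah", "yep", "sure", "absolutely", "go ahead", "sounds good"}

def handle_turn(state, user_input, search):
    """Guess-and-confirm flow: guess on short/ambiguous input, then confirm."""
    text = user_input.lower().strip(".,!? ")
    if state.get("pending_guess"):
        guess = state.pop("pending_guess")
        if text in AFFIRMATIVES:
            return search(guess)  # confirmed → run the search
        # Not confirmed → fall through and treat the input as a fresh query.
    if len(text.split()) < 3:  # short/ambiguous → guess and ask
        state["pending_guess"] = "joint pain"  # stand-in; the real ability asks the LLM
        return 'Did you mean "joint pain"?'
    return search(text)

state = {}
print(handle_turn(state, "joints", lambda q: f"searching for {q}"))
print(handle_turn(state, "Yes.", lambda q: f"searching for {q}"))
```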
Setup Required
The ability needs a pre-loaded vector database. Full setup instructions and scripts are in the companion branch: feat/health-supplement-search-setup