Add spec-extraction-workflow: bootstrap repos with semantic baseline #119

Merged
Alan-Jowett merged 4 commits into microsoft:main from Alan-Jowett:feature/spec-extraction-workflow
Mar 30, 2026

@Alan-Jowett
Member

Summary

Adds a spec-extraction-workflow -- an interactive orchestration template that bootstraps any repository with structured requirements, design, and validation specifications.

Closes #117

The Bootstrap Complement

This is the first half of a two-workflow model: spec-extraction-workflow (bootstrap) -> engineering-workflow (evolve).

Together they form a complete lifecycle for any engineering domain.

Workflow Phases

Phase 1: Repository Scan (agent reads code, docs, tests with tools)
Phase 2: Draft Extraction (requirements + design + validation, confidence-tagged)
Phase 3: Human Clarification Loop (iterate until specs are crisp)
Phase 4: Consistency Audit (adversarial, D1-D7)
Phase 5: Human Approval (loop back if needed)
Phase 6: Create Deliverable (PR with spec files)
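
The phase sequence above can be read as a small state machine. The following is only a minimal Python sketch of that reading, not the template's actual mechanism: the `Phase` enum, `run_workflow`, the `approve` callback, and `max_rounds` are hypothetical names, and looping back to Phase 3 on rejection is an assumption, since the description says only "loop back if needed".

```python
from enum import Enum
from typing import Callable, List


class Phase(Enum):
    REPOSITORY_SCAN = 1
    DRAFT_EXTRACTION = 2
    HUMAN_CLARIFICATION = 3
    CONSISTENCY_AUDIT = 4
    HUMAN_APPROVAL = 5
    CREATE_DELIVERABLE = 6


def run_workflow(
    approve: Callable[[int], bool],
    loop_back_to: Phase = Phase.HUMAN_CLARIFICATION,  # assumed loop-back target
    max_rounds: int = 10,
) -> List[Phase]:
    """Return the ordered list of phases visited.

    `approve(round_number)` stands in for the human decision in Phase 5.
    """
    visited: List[Phase] = []
    phase = Phase.REPOSITORY_SCAN
    rounds = 0
    while True:
        visited.append(phase)
        if phase is Phase.CREATE_DELIVERABLE:
            return visited
        if phase is Phase.HUMAN_APPROVAL:
            rounds += 1
            if not approve(rounds):
                if rounds >= max_rounds:
                    raise RuntimeError("approval not reached")
                # Rejected: revisit the assumed loop-back phase.
                phase = loop_back_to
                continue
        # Otherwise advance to the next phase in numeric order.
        phase = Phase(phase.value + 1)


# Example: the reviewer rejects the first draft and approves the second.
trace = run_workflow(lambda round_number: round_number >= 2)
print([p.name for p in trace])
```

Passing a different `loop_back_to` value models a workflow where rejection sends the agent back to an earlier phase, such as draft extraction.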

Entry point: read and execute templates/spec-extraction-workflow.md

New Components

Just 1 template -- everything else is reused:

| Reused Component | Role |
|---|---|
| requirements-from-implementation protocol | Systematic code-to-spec extraction |
| requirements-elicitation protocol | Decompose into atomic requirements |
| traceability-audit protocol | Consistency audit |
| adversarial-falsification protocol | Challenge findings |
| requirements-doc format | Requirements output structure |
| design-doc format | Design output structure |
| validation-plan format | Validation output structure |
| specification-drift taxonomy | Audit classifications (D1-D7) |

Key Design Decisions

  • Domain-agnostic: Configurable persona works for any domain
  • Agent-driven scanning: Uses tools to read the repo (not user-pasted content)
  • User-specified output paths: No opinionated filename defaults
  • Confidence marking: Every extracted item tagged HIGH/MEDIUM/LOW
  • Zero new protocols/formats/personas: Maximum reuse of existing components
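
As an illustration of the confidence-marking decision, here is a minimal Python sketch of how a tagged item might be represented. Everything in it is an assumption made for illustration: the template itself is a prompt document, and `ExtractedItem`, its fields, the `REQ-001` identifier scheme, and the rule that anything below high confidence goes to the human clarification loop are not taken from the PR. The enum values use the High/Medium/Low spelling that a later commit in this PR standardizes on.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Confidence(Enum):
    HIGH = "High"
    MEDIUM = "Medium"
    LOW = "Low"


@dataclass
class ExtractedItem:
    item_id: str            # hypothetical identifier scheme, e.g. "REQ-001"
    text: str               # the extracted requirement or design statement
    source: str             # where in the repo the item was inferred from
    confidence: Confidence


def needs_clarification(items: List[ExtractedItem]) -> List[ExtractedItem]:
    """Return items below HIGH confidence, lowest confidence first."""
    order = {Confidence.LOW: 0, Confidence.MEDIUM: 1}
    pending = [item for item in items if item.confidence in order]
    return sorted(pending, key=lambda item: order[item.confidence])


items = [
    ExtractedItem("REQ-001", "Requests time out after 30 s", "src/client.py", Confidence.HIGH),
    ExtractedItem("REQ-002", "Retries use backoff", "src/retry.py", Confidence.MEDIUM),
    ExtractedItem("REQ-003", "Cache is per-process", "README.md", Confidence.LOW),
]
for item in needs_clarification(items):
    print(item.item_id, item.confidence.value)
```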

Validation

Both validators pass cleanly.

Add an interactive orchestration template that bootstraps any repository
with structured requirements, design, and validation specifications.

Workflow phases:
1. Repository scan (agent uses tools to read code, docs, tests)
2. Draft extraction (requirements + design + validation with confidence)
3. Human clarification loop (iterate until specs are crisp)
4. Consistency audit (adversarial, D1-D7 classification)
5. Human approval (loop back if needed)
6. Create deliverable (PR with spec files)

Key design decisions:
- Domain-agnostic: configurable persona for any engineering domain
- Agent-driven scanning: uses tools to read the repo, not user-pasted
  content
- User-specified output paths: no opinionated filename defaults
- Confidence marking: every extracted item tagged HIGH/MEDIUM/LOW
- Reuses all existing protocols (requirements-from-implementation,
  requirements-elicitation, traceability-audit, etc.)
- No new protocols, formats, or personas needed

This is the bootstrap complement to engineering-workflow:
  spec-extraction (bootstrap) -> engineering-workflow (evolve)

Closes microsoft#117

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 30, 2026 03:58
Contributor

Copilot AI left a comment

Pull request overview

Adds a new interactive PromptKit orchestration template to bootstrap an existing repository with an initial “semantic baseline” (requirements, design, validation) extracted from implementation and refined via human-in-the-loop review, then audited for consistency before producing PR-ready spec files.

Changes:

  • Added spec-extraction-workflow interactive template with phased repo scanning, extraction, clarification, audit, and deliverable creation.
  • Registered the new template in manifest.yaml (persona configurable, multi-artifact output, drift taxonomy).

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.

| File | Description |
|---|---|
| templates/spec-extraction-workflow.md | New interactive workflow template defining a 6-phase code-to-spec extraction + clarification + audit loop. |
| manifest.yaml | Adds the new template entry so bootstrap/assembly can discover and run it. |

… quality checklist

- Standardize confidence labels to High/Medium/Low (matches
  investigation-report format convention)
- Inline section skeletons for requirements-doc, design-doc, and
  validation-plan formats since only multi-artifact is assembled
- Enumerate investigation-report's 9 required sections with
  mapping for Phase 4 audit output and verdict placement
- Add output_audit param for persisting the audit report
- Add Quality Checklist section (12 verification items)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.

- Add output_audit to Inputs display section
- Replace [UNCERTAIN]/[AMBIGUOUS] with [UNKNOWN: ...]/[ASSUMPTION]
  to match anti-hallucination protocol conventions
- Align requirements-doc skeleton to 8-section structure with
  separate Constraints, Dependencies, Assumptions, Risks sections
- Align design-doc skeleton to 9-section structure (Context & Goals,
  Non-Goals, Requirements Summary, Architecture, Detailed Design,
  Security/Ops, Tradeoffs, Open Questions, Revision History)
- Align validation-plan skeleton to 10-section structure (Overview,
  Scope, Test Strategy, Risk Prioritization, Test Cases, Traceability,
  Pass/Fail Criteria, Coverage, Environment, Revision History)
- Add 'None identified' rule for empty sections

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
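
The marker replacement and the 'None identified' rule in the commit above could be checked mechanically. The following is a minimal Python sketch under stated assumptions: the helper names and regular expressions are hypothetical, and the marker spellings are inferred only from how the commit message writes them (`[UNCERTAIN]`, `[AMBIGUOUS]`, `[UNKNOWN: ...]`, `[ASSUMPTION]`).

```python
import re
from typing import Dict, List

# Markers the commit replaces, and the markers it replaces them with.
LEGACY_MARKERS = re.compile(r"\[(?:UNCERTAIN|AMBIGUOUS)\b[^\]]*\]")
CURRENT_MARKERS = re.compile(r"\[UNKNOWN: [^\]]+\]|\[ASSUMPTION\]")


def find_legacy_markers(text: str) -> List[str]:
    """Return every legacy marker still present in a draft."""
    return LEGACY_MARKERS.findall(text)


def fill_empty_sections(sections: Dict[str, str]) -> Dict[str, str]:
    """Replace blank section bodies with the literal 'None identified'."""
    return {
        title: body if body.strip() else "None identified"
        for title, body in sections.items()
    }


draft = "Timeout is 30 s [UNCERTAIN]; retry count [UNKNOWN: not found in code]."
print(find_legacy_markers(draft))
print(CURRENT_MARKERS.findall(draft))
print(fill_empty_sections({"Scope": "Client library only", "Risks": "   "}))
```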
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

…ctures

- requirements-doc: match 8-section structure (Overview, Scope,
  Definitions and Glossary, Requirements, Dependencies, Assumptions,
  Risks, Revision History)
- design-doc: match 9-section structure (Overview, Requirements
  Summary, Architecture, Detailed Design, Tradeoff Analysis, Security
  Considerations, Operational Considerations, Open Questions,
  Revision History)
- validation-plan: match 8-section structure with correct ordering
  (Overview, Scope of Validation, Test Strategy, Requirements
  Traceability Matrix, Test Cases, Risk-Based Test Prioritization,
  Pass/Fail Criteria, Revision History)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
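
The section lists in the commit above lend themselves to a simple structural check. Below is a minimal Python sketch: the section names and their order are copied from the commit message, while `SKELETONS`, `check_sections`, and the assumption that each section appears as a `## ` Markdown heading with the bare section title are hypothetical.

```python
from typing import Dict, List

# Section names and ordering as listed in the commit message.
SKELETONS: Dict[str, List[str]] = {
    "requirements-doc": [
        "Overview", "Scope", "Definitions and Glossary", "Requirements",
        "Dependencies", "Assumptions", "Risks", "Revision History",
    ],
    "design-doc": [
        "Overview", "Requirements Summary", "Architecture", "Detailed Design",
        "Tradeoff Analysis", "Security Considerations",
        "Operational Considerations", "Open Questions", "Revision History",
    ],
    "validation-plan": [
        "Overview", "Scope of Validation", "Test Strategy",
        "Requirements Traceability Matrix", "Test Cases",
        "Risk-Based Test Prioritization", "Pass/Fail Criteria",
        "Revision History",
    ],
}


def check_sections(doc_format: str, markdown: str) -> List[str]:
    """Return a list of problems; an empty list means the headings match."""
    expected = SKELETONS[doc_format]
    # Assumption: sections are second-level Markdown headings ("## Title").
    found = [line[3:].strip() for line in markdown.splitlines()
             if line.startswith("## ")]
    problems = [f"missing section: {name}" for name in expected
                if name not in found]
    present = [name for name in found if name in expected]
    if not problems and present != expected:
        problems.append("sections out of order or duplicated")
    return problems


doc = "\n\n".join(f"## {name}\nNone identified"
                  for name in SKELETONS["validation-plan"])
print(check_sections("validation-plan", doc))
```

A check like this would flag the kind of ordering and section-count mismatches that the successive review rounds in this PR corrected by hand.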
@Alan-Jowett merged commit 0d0233b into microsoft:main on Mar 30, 2026
3 checks passed
@Alan-Jowett deleted the feature/spec-extraction-workflow branch on March 30, 2026 15:48

Development

Successfully merging this pull request may close these issues.

🧩 Spec Extraction Workflow
