# Transcript Import Workflow
Transcript Import restructures a raw conversation — between humans, between human and AI, or a monolog — into a structured intermediate document ready for scaffold mode. It does not write .mdoc files; it produces a .processed.md file.
**Trigger:** `import transcript <file>` where `<file>` ends in `.transcript.md`.
**Input format:** The transcript file has two sections:
## Instructions
(Optional processing context — scope constraints, focus areas, known decisions. May be empty. When present, treat as authoritative directives that override inferences from the transcript body.)
## Transcript
(Raw conversation content — speaker-attributed or plain prose.)

**Output:** `docs/{basename}.processed.md`, where `{basename}` is the input filename without the `.transcript.md` suffix.
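The output path derivation can be sketched as a small Python helper; this is illustrative only (the function name and the use of `pathlib` are assumptions, not part of the workflow):

```python
from pathlib import Path

def processed_path(transcript: Path) -> Path:
    """Derive the output path: strip the .transcript.md suffix, write under docs/."""
    name = transcript.name
    if not name.endswith(".transcript.md"):
        raise ValueError(f"not a transcript file: {transcript}")
    basename = name[: -len(".transcript.md")]
    return Path("docs") / f"{basename}.processed.md"
```

For example, `notes/kickoff.transcript.md` maps to `docs/kickoff.processed.md`.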
## Phase 1 — Read & Parse
Read the .transcript.md file in full. Separate the two sections:
- **Instructions** — extract all processing directives. These constrain every subsequent phase:
- Scope restrictions (“focus on the auth domain only”)
- Known decisions (“we decided on event sourcing”)
- Exclusions (“ignore the billing discussion”)
- Role assignments (“Sarah is the product owner”)
- Any other context the user considers relevant
- **Transcript** — identify the conversation structure:
- Speakers — extract all distinct participants (by name, label, or role). If speakers are not attributed, treat the transcript as a monolog.
- Conversation type — classify: requirements gathering, design discussion, brainstorm, decision meeting, technical deep-dive, user interview, or mixed.
- Topic segments — identify natural topic boundaries (subject changes, agenda items, explicit transitions).
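The section split at the start of this phase can be sketched as a minimal Python helper (hypothetical name `split_transcript`; it assumes the two `##` headings appear exactly as in the input format above):

```python
def split_transcript(text: str) -> dict:
    """Split a .transcript.md file into its Instructions and Transcript sections."""
    sections = {"Instructions": "", "Transcript": ""}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped in ("## Instructions", "## Transcript"):
            current = stripped[3:]  # section name after "## "
            continue
        if current:
            sections[current] += line + "\n"
    return {k: v.strip() for k, v in sections.items()}
```

An empty Instructions section simply yields an empty string, matching the "may be empty" rule.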
Output a brief parse summary before proceeding:
Parsed: {n} speakers, {m} topic segments, type: {conversation-type}
Instructions: {summary of directives, or "none"}

## Phase 2 — Extract & Classify
Walk through the transcript segment by segment. For each substantive statement, extract and classify it into one of these categories:
| Category | What to look for | Maps to |
|---|---|---|
| Role / Actor | Named users, personas, system actors, “as a …” | role document |
| Domain concept | Bounded concerns, entities, data ownership, “the X subsystem” | domain document |
| Feature | User-facing capabilities, “users can …”, “the system allows …” | feature document |
| Flow / Scenario | Step sequences, “first … then …”, “the happy path is …” | flow document |
| Requirement | Constraints, “must”, “shall”, invariants, acceptance criteria | {% requirement %} |
| Policy | Reactive rules, “when X → Y”, “automatically”, triggers | {% policy %} |
| API item | Actions, events, operations, error cases | {% action %} / {% event %} / {% operation %} / {% error %} |
| Data model | Entities, attributes, relationships, “has a”, “belongs to” | {% model %} |
| NFR | Performance, security, scalability, accessibility constraints | {% requirement %} with NFR scope |
| Value / Principle | Beliefs, design philosophies, “we value …”, “our approach is …” | {% value %} / {% principle %} |
| Goal | Success metrics, launch targets, KPIs | {% goal %} |
| Architecture | Technology choices, integration patterns, infrastructure decisions | blueprint document |
| UI / Interaction | Screen descriptions, navigation, layout, “the user sees …” | surface document or {% surface %} |
| Open question | Unresolved discussions, “we need to decide”, “TBD”, “not sure” | Open Questions section |
| Decision | Resolved choices, “we agreed”, “the decision is”, “let’s go with” | Decisions section |
**Attribution:** For each extract, record the speaker (if identifiable) and approximate position in the transcript (beginning / middle / end). This is metadata for the processed file, not for the final spec.
**Instructions override:** If the Instructions section restricts scope, skip extracts outside that scope entirely. If it declares a decision, mark any contradicting transcript content as superseded.
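As an illustration of the classification pass, here is a minimal keyword-cue sketch in Python. The cue table is a toy subset of the categories above; a real pass would rely on the full table and surrounding context, not regex matching alone:

```python
import re

# Indicative phrases per category (toy subset of the full classification table).
# Categories are checked in order; the first match wins.
CATEGORY_CUES = {
    "requirement": [r"\bmust\b", r"\bshall\b"],
    "policy": [r"\bwhen\b.*\bthen\b", r"\bautomatically\b"],
    "feature": [r"\busers can\b", r"\bthe system allows\b"],
    "open-question": [r"\bTBD\b", r"\bwe need to decide\b", r"\bnot sure\b"],
    "decision": [r"\bwe agreed\b", r"\blet'?s go with\b", r"\bthe decision is\b"],
}

def classify(statement: str) -> str:
    for category, patterns in CATEGORY_CUES.items():
        if any(re.search(p, statement, re.IGNORECASE) for p in patterns):
            return category
    return "unclassified"
```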
## Phase 3 — Deduplicate & Consolidate
Conversations revisit topics. The same concept may be discussed three times with slight variations. This phase collapses redundancy.
For each category from Phase 2:
- Group extracts that refer to the same concept (same entity, same feature, same constraint — even if worded differently).
- Merge into a single canonical statement per concept:
  - Prefer the most specific and complete formulation.
  - Prefer later statements over earlier ones (conversations tend to refine).
  - Prefer statements from domain experts over general discussion (use speaker roles from Instructions if available).
- Discard pure repetitions, filler, off-topic tangents, and social exchanges.
Track the merge count — report how many raw extracts collapsed into how many canonical items.
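The grouping-and-merge step can be sketched in Python. This toy version implements only the "prefer later statements" tie-breaker and the merge count; it assumes concept keys have already been normalized so that differently worded statements about the same concept share a key:

```python
from collections import defaultdict

def consolidate(extracts):
    """Group extracts by (category, concept); keep one canonical statement each.

    Extracts are (category, concept, statement) tuples in transcript order;
    the last statement per group wins, per the refinement preference.
    """
    groups = defaultdict(list)
    for category, concept, statement in extracts:
        groups[(category, concept)].append(statement)
    canonical = {key: statements[-1] for key, statements in groups.items()}
    merge_report = f"{len(extracts)} raw extracts -> {len(canonical)} canonical items"
    return canonical, merge_report
```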
## Phase 4 — Detect Inconsistencies
Compare all canonical items for contradictions:
| Inconsistency type | Example |
|---|---|
| Contradicting requirements | “Must be real-time” vs. “batch processing is fine” |
| Scope conflict | Feature X assigned to two different domains |
| Naming collision | Two different concepts using the same term |
| Undecided alternative | Multiple options discussed, no resolution recorded |
| Decision vs. discussion | A later discussion reopens a previously recorded decision |
For each inconsistency, record:
- The conflicting statements (with speaker attribution)
- The category and concept they affect
- A suggested resolution if one is obvious from context (e.g., later statement supersedes earlier)
Do not silently resolve inconsistencies — always flag them. The processed file must make every conflict visible.
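One possible record shape for flagged inconsistencies, sketched in Python; the field names are assumptions, and the `render` format mirrors the Inconsistencies section layout used in the processed file:

```python
from dataclasses import dataclass

@dataclass
class Inconsistency:
    """One flagged conflict: both statements, both speakers, and a suggestion."""
    concept: str
    category: str
    statement_a: str
    speaker_a: str
    statement_b: str
    speaker_b: str
    suggested_resolution: str = "needs user input"

    def render(self) -> str:
        return (
            f'**{self.concept}** — "{self.statement_a}" ({self.speaker_a}) '
            f'vs. "{self.statement_b}" ({self.speaker_b}). '
            f"Suggested resolution: {self.suggested_resolution}"
        )
```

Defaulting the resolution to "needs user input" keeps the never-silently-resolve rule explicit in the data model.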
## Phase 5 — Map to StarSpec Metamodel
Organize the deduplicated, classified extracts into a document plan aligned to the StarSpec content tree:
- Roles — one entry per identified actor with goals and responsibilities.
- Domains — group related concepts, entities, policies, and API items under bounded domains. Each domain gets:
  - Glossary terms
  - API items (actions, events, operations, errors)
  - Data model entities
  - Policies
- Features — map user-facing capabilities to features, linking to domains and roles.
- Flows — map step sequences to flows, linking to features.
- Blueprints — group architecture and technology decisions.
- Manifest — collect values, principles, goals, and NFRs.
Apply naming conventions from starspec/agents/conventions/naming-standards:
- Document ids: `kebab-case`
- Actions: `kebab-case` imperative (e.g. `create-order`)
- Events: `kebab-case` past tense (e.g. `order-created`)
- Operations: `kebab-case` (e.g. `get-order`)
- Errors: `kebab-case` noun phrase (e.g. `order-not-found`)
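A minimal sketch of id normalization (the `to_kebab` helper is hypothetical; it handles spaces, underscores, and punctuation, but not camelCase splitting):

```python
import re

def to_kebab(phrase: str) -> str:
    """Normalize a phrase to a kebab-case id."""
    words = re.findall(r"[A-Za-z0-9]+", phrase)
    return "-".join(w.lower() for w in words)
```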
For each planned document, estimate completeness (0–100%) using the same scale as scaffold mode.
## Phase 6 — Write Processed File
Write the output to docs/{basename}.processed.md. The file must follow this exact structure:
# Transcript Import — {title or subject}
**Source:** {relative path to .transcript.md file}
**Date:** {ISO 8601 date}
**Speakers:** {comma-separated list, or "monolog"}
**Type:** {conversation type from Phase 1}
**Instructions:** {summary of processing directives, or "none"}
---
## Document Plan
| Type | Id | Purpose | Completeness |
|------|-----|---------|-------------|
| manifest | {id} | {one-line} | {n}% |
| role | {id} | {one-line} | {n}% |
| domain | {id} | {one-line} | {n}% |
| ... | ... | ... | ... |
---
## Roles
### {role-id}
- **Goals:** {extracted goals}
- **Responsibilities:** {extracted responsibilities}
- **Source:** {speaker, position}
---
## Domains
### {domain-id}
**Scope:** {one-line scope description}
**Glossary:**
- **{term}** — {definition}

**API:**
- Action: `{signature}`
- Event: `{event-name}: { fields }`
- Operation: `{signature}`
- Error: `{error-name}` — {description}

**Model:**
{DBML or entity-relationship description}

**Policies:**
- When {source} → {reaction} ({policy-id})
---
## Features
### {feature-id}
- **Domains:** {domain-id list}
- **Roles:** {role-id list}
- **Requirements:**
  - {requirement description} (priority: {must/should/could})
- **Acceptance criteria:**
  - {criterion}
---
## Flows
### {flow-id}
- **Feature:** {feature-id}
- **Preconditions:** {list}
- **Steps:**
  1. {Actor} {action} → {outcome}
  2. ...
- **Postconditions:** {list}
---
## Blueprints
### {blueprint-id}
- **Decisions:** {architecture/technology choices}
- **Rationale:** {why}
---
## Manifest
**Values:**
- {VALUE_NAME}: {description}

**Principles:**
- {principle-name}: {description}

**Goals:**
- {goal-name}: {target} (status: pending)

**NFRs:**
- {nfr-name}: {constraint} (priority: {must/should})
---
## Decisions
Resolved choices extracted from the transcript:
1. **{decision}** — decided by {speaker}, {position in transcript}. Rationale: {why}
2. ...
---
## Open Questions
Unresolved items that require follow-up:
1. [{GAP_TAG}] **{question}** — raised by {speaker}. Context: {surrounding discussion}
2. ...
Gap tags: [SCOPE], [NAMING], [DOMAIN], [ERROR], [FLOW], [UI], [POLICY], [NFR], [ACTOR]
---
## Inconsistencies
Conflicts detected between transcript statements:
1. **{concept}** — "{statement A}" ({speaker A}) vs. "{statement B}" ({speaker B}). Suggested resolution: {suggestion, or "needs user input"}
2. ...
---
## Processing Summary
- **Raw extracts:** {n}
- **After deduplication:** {m} ({n - m} redundant statements removed)
- **Inconsistencies:** {k}
- **Open questions:** {q}
- **Decisions recorded:** {d}
- **Documents planned:** {p} (avg. completeness: {avg}%)

Omit any section that has zero items (e.g., if there are no blueprints, omit the Blueprints section). Do not leave empty sections.
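The omit-empty-sections rule can be sketched in Python (the `render_sections` helper is hypothetical; it takes (title, items) pairs and skips any section with no items):

```python
def render_sections(sections):
    """Render only non-empty sections, separated by horizontal rules."""
    parts = []
    for title, items in sections:
        if not items:
            continue  # omit empty sections entirely
        body = "\n".join(f"- {item}" for item in items)
        parts.append(f"## {title}\n{body}")
    return "\n\n---\n\n".join(parts)
```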
## Phase 7 — Report & Next Steps
Output a concise summary to the user:
Transcript imported — {m} items extracted from {n} raw statements
Documents planned: {p} {type counts: e.g. 2 roles, 3 domains, 4 features, 2 flows}
Decisions recorded: {d}
Open questions: {q}
Inconsistencies: {k}
Output: docs/{basename}.processed.md

Then suggest next steps:
- If open questions or inconsistencies exist: “Review open questions and inconsistencies in the processed file before scaffolding.”
- If the processed file is clean: “Ready to scaffold. Run: scaffold from docs/{basename}.processed.md”
- If the Instructions section requested specific focus: note what was excluded and suggest separate import runs if needed.
## Do’s and Don’ts
**Do:**
- Read the entire transcript before extracting — context from later sections often reframes earlier statements
- Honour the Instructions section as authoritative — it overrides inferences from the transcript
- Preserve the speakers’ own terminology in glossary terms and concept names
- Record attribution (speaker + position) for traceability
- Flag every inconsistency — never silently resolve conflicts
- Apply StarSpec naming conventions to all proposed ids
- Omit empty sections from the processed file
**Don’t:**
- Write `.mdoc` files — this workflow produces `.processed.md` only
- Ask clarifying questions during import — infer, document uncertainty, and proceed (same principle as scaffold mode)
- Invent requirements not supported by the transcript text or Instructions
- Discard “off-topic” remarks without checking — they sometimes contain implicit NFRs, policies, or domain boundaries
- Merge statements from different speakers without recording the merge
- Resolve inconsistencies silently — the user must see every conflict
## Definition of Done
- Both sections (Instructions + Transcript) parsed and accounted for
- All substantive statements extracted and classified
- Redundancy eliminated with merge counts reported
- Inconsistencies flagged with suggested resolutions
- Open questions gathered with gap tags
- Document plan aligned to StarSpec metamodel with completeness scores
- Processed file written to `docs/{basename}.processed.md`
- Summary with next-step guidance displayed to user