Speech-To-Text (STT)

Manages the transcription of audio recordings into text using local AI.

Data Model

erDiagram
  transcript {
    string id PK
    string sessionId 
    string recordingId 
    datetime createdAt 
  }
  transcript_segment {
    string id PK
    string transcriptId 
    string participantId 
    string text 
    integer startMs 
    integer endMs 
  }
  transcript_segment }o--|| transcript : "transcriptId"

Table transcript {
  id string [pk]
  sessionId string [note: 'Links to conversation/session']
  recordingId string [note: 'Links to audio/audio-recording']
  createdAt datetime
  note: 'Always shared between participants'
}

Table transcript_segment {
  id string [pk]
  transcriptId string [ref: > transcript.id]
  participantId string [null, note: 'Speaker; links to conversation/participant']
  text string
  startMs integer
  endMs integer
}
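
The two tables above could be mirrored in application code roughly as follows. This is a minimal sketch, not the project's actual types: the snake_case field names, the in-memory `segments` list, and the `add_segment` and `render` helpers are assumptions introduced for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class TranscriptSegment:
    id: str
    transcript_id: str             # transcriptId: references Transcript.id
    participant_id: Optional[str]  # participantId: speaker, nullable in the schema
    text: str
    start_ms: int
    end_ms: int


@dataclass
class Transcript:
    id: str
    session_id: str     # sessionId: links to conversation/session
    recording_id: str   # recordingId: links to audio/audio-recording
    created_at: datetime
    # Assumption: segments are held in memory for convenience; this is not a column.
    segments: List[TranscriptSegment] = field(default_factory=list)


def add_segment(transcript: Transcript, segment: TranscriptSegment) -> None:
    """Attach a segment after checking the reference and the time range (hypothetical checks)."""
    if segment.transcript_id != transcript.id:
        raise ValueError("segment belongs to a different transcript")
    if segment.start_ms < 0 or segment.end_ms < segment.start_ms:
        raise ValueError("segment has an invalid time range")
    transcript.segments.append(segment)


def render(transcript: Transcript) -> str:
    """Return the segments as lines of text, ordered by start time."""
    ordered = sorted(transcript.segments, key=lambda s: s.start_ms)
    return "\n".join(
        f"[{s.start_ms}-{s.end_ms} ms] {s.participant_id or 'unknown'}: {s.text}"
        for s in ordered
    )


transcript = Transcript("tr-1", "s-1", "rec-1", datetime.now(timezone.utc))
add_segment(transcript, TranscriptSegment("seg-2", "tr-1", None, "Hi there", 1200, 2500))
add_segment(transcript, TranscriptSegment("seg-1", "tr-1", "p-1", "Hello", 0, 1100))
print(render(transcript))
```

Sorting by `start_ms` at render time means segments can be stored in whatever order the transcription engine produces them.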

API

request-transcription

Queues a recording for transcription.

Properties

Parameter	Type	Required	Description
`recording-id`	string	—

transcription-completed

Properties

Parameter	Type	Required	Description
`session-id`	string	—
`transcript-id`	string	—

write

transcribe-audio

Converts an audio buffer into a structured Transcript with timestamps.

Input Parameters

Parameter	Type	Required	Description
`audio-data`	object	—	Buffer

Returns

Parameter	Type	Required	Description
`id`	string	✓
`length`	number	✓

Declared Errors

transcription-failed

Error ID	Code	Category	Description
`transcription-failed`	`TODO`	domain	Thrown when audio cannot be transcribed locally.
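
A rough sketch of how `transcribe-audio` and its declared error might fit together is shown below. Several things here are assumptions rather than documented behavior: the `engine` callable stands in for whatever local model is used, the returned `length` is taken to be the number of segments (the page does not say what it measures), and the UUID-based `id` is only a placeholder.

```python
import uuid
from typing import Callable, Dict, List


class TranscriptionFailed(Exception):
    """Domain error mirroring the declared `transcription-failed` error."""

    error_id = "transcription-failed"


def transcribe_audio(
    audio_data: bytes,
    engine: Callable[[bytes], List[Dict]],  # hypothetical local speech-to-text engine
) -> Dict:
    """Transcribe a buffer and return the `id` and `length` fields listed under Returns."""
    if not audio_data:
        raise TranscriptionFailed("audio buffer is empty")
    try:
        segments = engine(audio_data)
    except Exception as exc:
        # Any engine failure is surfaced as the single declared domain error.
        raise TranscriptionFailed(str(exc)) from exc
    # Assumption: `length` counts segments; the documentation leaves this unspecified.
    return {"id": str(uuid.uuid4()), "length": len(segments)}


def fake_engine(buffer: bytes) -> List[Dict]:
    """Stand-in engine that returns two fixed segments regardless of input."""
    return [
        {"text": "Hello", "startMs": 0, "endMs": 1100},
        {"text": "Hi there", "startMs": 1200, "endMs": 2500},
    ]


result = transcribe_audio(b"\x00\x01", fake_engine)
```

Injecting the engine as a parameter keeps the sketch independent of any particular speech-to-text library and makes the failure path easy to exercise.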

Source	Condition	Reaction	Rule
`session-ended`	—	`request-transcription`	When a session ends, transcription is automatically requested.
`request-transcription`	—	`transcription-completed`	When a transcription succeeds, its completion is announced.
`request-transcription`	—	`transcription-failed`	When a transcription fails, the failure is announced.
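
The three reactions in the table can be sketched as handlers on a small synchronous event bus. Everything below is illustrative: the `Bus` class, the `wire_policies` function, the `session_of_recording` lookup, and the payload of `session-ended` and `transcription-failed` are assumptions, since the page documents only the properties of `request-transcription` and `transcription-completed`.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple


class Bus:
    """Tiny synchronous publish/subscribe bus (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self._handlers = defaultdict(list)
        self.log: List[Tuple[str, dict]] = []

    def on(self, name: str, handler: Callable[[dict], None]) -> None:
        self._handlers[name].append(handler)

    def emit(self, name: str, payload: dict) -> None:
        self.log.append((name, payload))
        for handler in list(self._handlers[name]):
            handler(payload)


def wire_policies(
    bus: Bus,
    session_of_recording: Dict[str, str],  # assumed lookup: recording-id -> session-id
    transcribe: Callable[[str], str],      # assumed worker: recording-id -> transcript-id
) -> None:
    def on_session_ended(payload: dict) -> None:
        # Rule 1: when a session ends, request transcription of its recordings.
        for recording_id, session_id in session_of_recording.items():
            if session_id == payload["session-id"]:
                bus.emit("request-transcription", {"recording-id": recording_id})

    def on_request(payload: dict) -> None:
        recording_id = payload["recording-id"]
        try:
            transcript_id = transcribe(recording_id)
        except Exception as exc:
            # Rule 3: announce the failure (payload shape is an assumption).
            bus.emit("transcription-failed",
                     {"recording-id": recording_id, "reason": str(exc)})
            return
        # Rule 2: announce the completed transcript.
        bus.emit("transcription-completed",
                 {"session-id": session_of_recording[recording_id],
                  "transcript-id": transcript_id})

    bus.on("session-ended", on_session_ended)
    bus.on("request-transcription", on_request)


def fake_transcribe(recording_id: str) -> str:
    """Stand-in worker that fails for one known recording."""
    if recording_id == "rec-bad":
        raise RuntimeError("no speech detected")
    return "tr-" + recording_id


bus = Bus()
wire_policies(bus, {"rec-1": "s-1", "rec-bad": "s-2"}, fake_transcribe)
bus.emit("session-ended", {"session-id": "s-1"})
bus.emit("session-ended", {"session-id": "s-2"})
for name, payload in bus.log:
    print(name, payload)
```

A real system would likely process the queue asynchronously; the synchronous bus is used here only to make the order of events easy to follow.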

Reactive Topology

flowchart LR;
    classDef actionNode fill:#e3f2fd,stroke:#1e88e5,color:#0d47a1;
    classDef eventNode fill:#e8f5e9,stroke:#4caf50,color:#1b5e20;
    classDef errorNode fill:#ffebee,stroke:#ef5350,color:#b71c1c;
    classDef externalNode fill:#f5f5f5,stroke:#9e9e9e,color:#616161,stroke-dasharray:4 4;
    subgraph conversation ["Conversation"]
        n3{{"session-ended"}}
    end
    subgraph speech-to-text ["Speech-To-Text (STT)"]
        n0(["request-transcription"])
        n1{{"transcription-completed"}}
        n2{{"transcription-failed"}}
    end
    n3 -. "auto-transcription" .-> n0
    n0 -. "policy (success)" .-> n1
    n0 -. "policy (failed)" .-> n2
    class n0 actionNode;
    class n1 eventNode;
    class n2 errorNode;
    class n3 externalNode;
    click n0 href "/domains/speech-to-text#request-transcription" _self;
    click n1 href "/domains/speech-to-text#transcription-completed" _self;
    click n2 href "/domains/speech-to-text#transcription-failed" _self;
    click n3 href "/domains/conversation#session-ended" _self;