simulstream.server.speech_processors

Functions

`build_speech_processor`(speech_processor_config)	Instantiate a SpeechProcessor subclass based on configuration.
`class_load`(class_string)
`speech_processor_class_load`(...)	Import the speech processor class from its string definition.

Classes

`SpeechProcessor`(config)	Abstract base class for speech processors.

class simulstream.server.speech_processors.SpeechProcessor(config: SimpleNamespace)

Abstract base class for speech processors.

Subclasses must implement methods to load models, process audio chunks, set source/target languages, and clear internal states.
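As a rough illustration of the interface listed above, the following sketch defines a toy processor with the documented method names and drives one stream through it. Several things here are assumptions made for the example: `ToyOutput` is a hypothetical stand-in for `IncrementalOutput` (whose real fields may differ), the `speech_chunk_size` config field is invented, and the call order (clear, set languages, process chunks, end of stream) is one plausible sequence rather than something the reference specifies. A real implementation would subclass `SpeechProcessor` instead of being a standalone class.

```python
from dataclasses import dataclass, field
from types import SimpleNamespace
from typing import List


@dataclass
class ToyOutput:
    """Hypothetical stand-in for IncrementalOutput; the real fields may differ."""
    new_tokens: List[str] = field(default_factory=list)
    deleted_tokens: List[str] = field(default_factory=list)


class EchoProcessor:
    """Toy processor mirroring the documented method names.

    A real implementation would subclass
    simulstream.server.speech_processors.SpeechProcessor instead.
    """

    def __init__(self, config: SimpleNamespace):
        self.config = config
        self.source_language = None
        self.target_language = None
        self.chunks_seen = 0

    @classmethod
    def load_model(cls, config: SimpleNamespace):
        # Nothing to load for this toy example.
        return None

    @property
    def speech_chunk_size(self) -> float:
        # Hypothetical config field; falls back to 1 second if absent.
        return getattr(self.config, "speech_chunk_size", 1.0)

    def set_source_language(self, language: str) -> None:
        self.source_language = language

    def set_target_language(self, language: str) -> None:
        self.target_language = language

    def process_chunk(self, waveform) -> ToyOutput:
        # Emit one placeholder token per chunk instead of running a model.
        self.chunks_seen += 1
        return ToyOutput(new_tokens=[f"<chunk{self.chunks_seen}>"])

    def end_of_stream(self) -> ToyOutput:
        # Flush a final marker to conclude the output.
        return ToyOutput(new_tokens=["<eos>"])

    def tokens_to_string(self, tokens: List[str]) -> str:
        return " ".join(tokens)

    def clear(self) -> None:
        # Reset per-stream state before the next stream.
        self.chunks_seen = 0


# Drive one stream through an assumed call sequence.
processor = EchoProcessor(SimpleNamespace(speech_chunk_size=0.5))
processor.clear()
processor.set_source_language("en")
processor.set_target_language("de")
tokens = []
for chunk in ([0.0] * 8000, [0.0] * 8000):
    tokens.extend(processor.process_chunk(chunk).new_tokens)
tokens.extend(processor.end_of_stream().new_tokens)
transcript = processor.tokens_to_string(tokens)
```

The toy class keeps all per-stream state in plain attributes so that `clear()` has something concrete to reset; a model-backed processor would reset its audio and token caches there instead.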

abstractmethod clear() → None

Clear internal states, such as history of cached audio and/or tokens, in preparation for a new stream or conversation.

abstractmethod end_of_stream() → IncrementalOutput

Called once all audio chunks of a stream have been processed. It can be used to emit any remaining hypotheses and conclude the output.

abstractmethod classmethod load_model(config: SimpleNamespace)

Load and initialize the underlying speech model.

Parameters: config (SimpleNamespace) – Configuration of the speech processor.

abstractmethod process_chunk(waveform: float32) → IncrementalOutput

Process a chunk of waveform and produce incremental output.

Parameters: waveform (np.float32) – A 1D NumPy array containing the audio chunk. The samples are PCM audio normalized to the range [-1.0, 1.0] and sampled at simulstream.server.speech_processors.SAMPLE_RATE.
Returns: The incremental output (new and deleted tokens/strings).
Return type: IncrementalOutput
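The input format described above can be produced from 16-bit PCM as in the following sketch. The 16 kHz rate is an assumption used only for illustration (the actual value is whatever `SAMPLE_RATE` is set to), and `pcm16_to_float32` is a hypothetical helper, not part of the module.

```python
import numpy as np

# Assumption: 16 kHz is used only for illustration; the real value is
# simulstream.server.speech_processors.SAMPLE_RATE.
ASSUMED_SAMPLE_RATE = 16000


def pcm16_to_float32(pcm: np.ndarray) -> np.ndarray:
    """Scale signed 16-bit PCM samples into [-1.0, 1.0) as float32."""
    return pcm.astype(np.float32) / 32768.0


# Half a second of a 440 Hz tone, generated as 16-bit PCM.
t = np.arange(ASSUMED_SAMPLE_RATE // 2) / ASSUMED_SAMPLE_RATE
pcm = (np.sin(2 * np.pi * 440 * t) * 32767).astype(np.int16)

# 1D float32 array in the normalized range, ready to pass to process_chunk.
waveform = pcm16_to_float32(pcm)
```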

abstractmethod set_source_language(language: str) → None

Set the source language for the speech processor.

abstractmethod set_target_language(language: str) → None

Set the target language for the speech processor (for translation).

property speech_chunk_size: float

Return the size of the speech chunks to be processed (in seconds).
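Because the chunk size is expressed in seconds, converting it to a number of samples requires the sample rate. The sketch below shows that arithmetic and a simple way to cut a buffer into chunks. The 16 kHz rate and both helper functions are assumptions for illustration; the reference does not say where the splitting happens or how a final partial chunk is handled.

```python
from typing import Iterator, Sequence

# Assumption; in practice use the module's SAMPLE_RATE.
ASSUMED_SAMPLE_RATE = 16000


def samples_per_chunk(chunk_size_seconds: float, sample_rate: int) -> int:
    """Convert a chunk duration in seconds to a whole number of samples."""
    return int(round(chunk_size_seconds * sample_rate))


def split_into_chunks(samples: Sequence[float], chunk_len: int) -> Iterator[Sequence[float]]:
    """Yield consecutive slices of chunk_len samples; the last may be shorter."""
    for start in range(0, len(samples), chunk_len):
        yield samples[start:start + chunk_len]


chunk_len = samples_per_chunk(0.5, ASSUMED_SAMPLE_RATE)  # 0.5 s * 16000 Hz
audio = [0.0] * 20000  # 1.25 s of silence at the assumed rate
chunk_lengths = [len(c) for c in split_into_chunks(audio, chunk_len)]
```

With these numbers each full chunk holds 8000 samples, and the 1.25-second buffer yields two full chunks followed by one 4000-sample remainder.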

abstractmethod tokens_to_string(tokens: List[str]) → str

Convert a sequence of tokens into a human-readable string.
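What this conversion looks like depends entirely on the tokenizer of the underlying model. As one possible case, the sketch below assumes SentencePiece-style subword tokens, where "▁" marks the start of a word; the function name is hypothetical, and a processor with a different vocabulary would need different logic.

```python
from typing import List

WORD_BOUNDARY = "\u2581"  # "▁", the SentencePiece word-boundary marker


def sentencepiece_tokens_to_string(tokens: List[str]) -> str:
    """Join subword tokens and turn word-boundary markers into spaces."""
    return "".join(tokens).replace(WORD_BOUNDARY, " ").strip()


text = sentencepiece_tokens_to_string(["\u2581Hello", "\u2581wor", "ld", "!"])
```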

simulstream.server.speech_processors.build_speech_processor(speech_processor_config: SimpleNamespace) → SpeechProcessor

Instantiate a SpeechProcessor subclass based on configuration.

The configuration should specify the fully-qualified class name in the type field (e.g. "simulstream.server.speech_processors.MyProcessor").

Parameters: speech_processor_config (SimpleNamespace) – Configuration for the speech processor.
Returns: An instance of the configured speech processor.
Return type: SpeechProcessor
Raises: AssertionError – If the specified class is not a subclass of SpeechProcessor.
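A configuration of the expected shape can be built as in the sketch below, which parses a JSON string into a `SimpleNamespace` so that the `type` field is available as an attribute. The JSON source and the `beam_size` field are assumptions for illustration; the reference only states that `type` holds the fully-qualified class name and does not describe how the configuration is loaded.

```python
import json
from types import SimpleNamespace

# "beam_size" is a hypothetical extra field; only "type" is documented.
raw = '{"type": "simulstream.server.speech_processors.MyProcessor", "beam_size": 4}'

# object_hook turns every JSON object into a SimpleNamespace.
config = json.loads(raw, object_hook=lambda d: SimpleNamespace(**d))

# With simulstream installed and a real class named in "type", the config
# would then be passed on:
# processor = build_speech_processor(config)
```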

simulstream.server.speech_processors.speech_processor_class_load(speech_processor_class_string: str) → type[SpeechProcessor]

Import the speech processor class from its string definition.

Parameters: speech_processor_class_string (str) – Full name of the speech processor class to load.
Returns: A class object for the speech processor class.
Return type: type[SpeechProcessor]
Raises: AssertionError – If the specified class is not a subclass of SpeechProcessor.
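Loading a class from its dotted name and checking its base class is a common pattern, sketched below with `importlib`. This is a generic illustration and not necessarily how simulstream implements it: `load_class` and `load_subclass` are hypothetical names, and standard-library classes stand in for a processor class and `SpeechProcessor`.

```python
import importlib
from collections import OrderedDict


def load_class(class_string: str) -> type:
    """Import 'package.module.ClassName' and return the class object."""
    module_name, _, class_name = class_string.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


def load_subclass(class_string: str, base: type) -> type:
    """Load a class and assert that it derives from the expected base."""
    cls = load_class(class_string)
    assert issubclass(cls, base), (
        f"{class_string} must be a subclass of {base.__name__}"
    )
    return cls


# OrderedDict derives from dict, so this check passes.
loaded = load_subclass("collections.OrderedDict", dict)
```

Calling `load_subclass("collections.OrderedDict", list)` instead would raise an `AssertionError`, analogous to the error documented above for classes that do not derive from `SpeechProcessor`.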