simulstream.metrics.stats

Functions

cli_main()

Module for computing evaluation statistics from Simulstream logs.

main(args)

Main entry point for computing statistics.

Classes

NormalizedErasure()

Compute the Normalized Erasure metric.

RealTimeFactor()

Compute the Real Time Factor.

Stats()

Abstract base class for defining evaluation statistics.

class simulstream.metrics.stats.NormalizedErasure

Compute the Normalized Erasure metric.

This measures the amount of flickering in retranslation, as defined in Arivazhagan et al., “Re-translation versus Streaming for Simultaneous Translation”.

It is defined as the ratio:

\[\text{Normalized Erasure} = \frac{\text{# Deleted Tokens}}{\text{# Final Generated Tokens}}\]
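As a rough illustration of the ratio above, the sketch below counts deleted tokens across successive retranslation outputs. It assumes that an update deletes every token of the previous output that follows the longest common prefix with the new output. The `normalized_erasure` helper and the whitespace-tokenized inputs are hypothetical and are not part of simulstream, whose actual counting may differ.

```python
def normalized_erasure(outputs):
    """Ratio of deleted tokens to tokens in the final output.

    `outputs` is a list of token lists, one per successive hypothesis.
    Assumption: an update deletes every token of the previous hypothesis
    that lies after the longest common prefix with the new hypothesis.
    """
    if not outputs or not outputs[-1]:
        raise ValueError("need at least one hypothesis and a non-empty final output")
    deleted = 0
    for prev, curr in zip(outputs, outputs[1:]):
        # Length of the longest common prefix of the two hypotheses.
        common = 0
        for a, b in zip(prev, curr):
            if a != b:
                break
            common += 1
        deleted += len(prev) - common
    return deleted / len(outputs[-1])


hypotheses = [
    "the cat".split(),
    "the dog sat".split(),       # "cat" is erased: 1 deleted token
    "the dog sat down".split(),  # pure extension: nothing erased
]
print(normalized_erasure(hypotheses))  # 1 deleted / 4 final tokens = 0.25
```

A value of 0 means every update only appended tokens; larger values mean more of the displayed text was rewritten.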

compute(log_reader: LogReader) → float

Compute the value of the statistic.

Parameters:

log_reader (LogReader) – Reader object encapsulating log data.

Returns:

The computed value of the statistic.

Return type:

float

description() → str

The human-readable explanation of the statistic.

name() → str

The unique name of the statistic.

class simulstream.metrics.stats.RealTimeFactor

Compute the Real Time Factor.

This measures how many seconds of computation are required on average for each second of input audio.

Values greater than 1 indicate that the system is slower than real time and cannot process input before the next audio chunk arrives.
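The arithmetic behind this statistic can be sketched as follows. The example assumes the factor is the total computation time divided by the total audio duration; averaging per-chunk ratios would be another possible reading. The `real_time_factor` helper and the `(compute_seconds, audio_seconds)` pairs are hypothetical and do not reflect the actual log format.

```python
def real_time_factor(chunks):
    """Seconds of computation per second of input audio.

    `chunks` is a list of (compute_seconds, audio_seconds) pairs.
    Assumption: the factor is total compute time over total audio time.
    """
    total_compute = sum(compute for compute, _ in chunks)
    total_audio = sum(audio for _, audio in chunks)
    if total_audio <= 0:
        raise ValueError("total audio duration must be positive")
    return total_compute / total_audio


# Three chunks: 3.0 s of computation for 4.0 s of audio.
chunks = [(0.5, 1.0), (1.5, 1.0), (1.0, 2.0)]
rtf = real_time_factor(chunks)
print(rtf)        # 0.75
print(rtf > 1.0)  # False: this toy system keeps up with real time
```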

compute(log_reader: LogReader) → float

Compute the value of the statistic.

Parameters:

log_reader (LogReader) – Reader object encapsulating log data.

Returns:

The computed value of the statistic.

Return type:

float

description() → str

The human-readable explanation of the statistic.

name() → str

The unique name of the statistic.

class simulstream.metrics.stats.Stats

Abstract base class for defining evaluation statistics.

Subclasses must implement:

  • name(): unique identifier of the statistic.

  • description(): a human-readable explanation.

  • compute(): logic to compute the metric from a LogReader.
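A minimal sketch of the subclassing pattern described above. To stay self-contained, it defines a stand-in `Stats` base class that mirrors the three documented methods rather than importing the real one. The `FakeLogReader` class, its `entries` attribute, and the `EntryCount` statistic are hypothetical; the real LogReader API is not described on this page, and the real methods may be declared differently (for example as class methods).

```python
from abc import ABC, abstractmethod


class Stats(ABC):
    """Stand-in mirroring the documented interface (not the real class)."""

    @abstractmethod
    def name(self) -> str: ...

    @abstractmethod
    def description(self) -> str: ...

    @abstractmethod
    def compute(self, log_reader) -> float: ...


class FakeLogReader:
    """Hypothetical reader holding log entries in a plain list."""

    def __init__(self, entries):
        self.entries = entries


class EntryCount(Stats):
    """Toy statistic: the number of entries in the log."""

    def name(self) -> str:
        return "entry_count"

    def description(self) -> str:
        return "Number of entries in the log."

    def compute(self, log_reader) -> float:
        return float(len(log_reader.entries))


stat = EntryCount()
print(stat.name(), stat.compute(FakeLogReader([{}, {}, {}])))  # entry_count 3.0
```

Because the three methods are abstract, instantiating the base class directly, or a subclass that omits one of them, raises `TypeError`.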

abstractmethod compute(log_reader: LogReader) → float

Compute the value of the statistic.

Parameters:

log_reader (LogReader) – Reader object encapsulating log data.

Returns:

The computed value of the statistic.

Return type:

float

abstractmethod description() → str

The human-readable explanation of the statistic.

abstractmethod name() → str

The unique name of the statistic.

simulstream.metrics.stats.cli_main()

Module for computing evaluation statistics from Simulstream logs.

This script provides a command-line interface for computing metrics that describe the behavior of streaming systems. Metrics are computed from JSONL log files generated during evaluation and include:

  • Normalized Erasure: measures flickering in retranslation processors.

  • Computational Cost (Real Time Factor): measures average computation time per second of audio.

The output is printed on standard output in JSON format.

Typical usage from the command line:

$ python -m simulstream.metrics.stats --eval-config config/speech_processor.yaml \
    --log-file metrics.jsonl
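For orientation, the sketch below shows how the two flags in the usage line would look once parsed with `argparse`. It is an assumption about the shape of the parser, not the module's actual code: the real parser may define further options, defaults, or help text. Only the flag names are taken from the usage example.

```python
import argparse

# Hypothetical parser declaring only the two flags from the usage example.
parser = argparse.ArgumentParser(prog="simulstream.metrics.stats")
parser.add_argument("--eval-config", required=True)
parser.add_argument("--log-file", required=True)

args = parser.parse_args(
    ["--eval-config", "config/speech_processor.yaml", "--log-file", "metrics.jsonl"]
)
# argparse turns dashes into underscores for attribute names.
print(args.eval_config)  # config/speech_processor.yaml
print(args.log_file)     # metrics.jsonl
```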

simulstream.metrics.stats.main(args: Namespace)

Main entry point for computing statistics.

Loads the evaluation configuration and log file, computes all defined statistics, and prints them in JSON format.

Parameters:

args (argparse.Namespace) – Parsed command-line arguments.
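The compute-and-print step described for main() could look roughly like the sketch below. It assumes the output is a flat JSON object mapping each statistic's name to its value; the actual layout is not specified on this page. `ConstantStat`, `compute_all`, the statistic names, and the values are hypothetical placeholders.

```python
import json


class ConstantStat:
    """Hypothetical statistic exposing the documented name()/compute() methods."""

    def __init__(self, name, value):
        self._name = name
        self._value = value

    def name(self):
        return self._name

    def compute(self, log_reader):
        # A real statistic would read the log; this one returns a fixed value.
        return self._value


def compute_all(stats, log_reader):
    """Map each statistic's name to its computed value."""
    return {stat.name(): stat.compute(log_reader) for stat in stats}


stats = [ConstantStat("normalized_erasure", 0.25), ConstantStat("real_time_factor", 0.75)]
report = compute_all(stats, log_reader=None)  # no real log is needed for the toy stats
print(json.dumps(report, indent=2))
```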