simulstream.metrics.score_quality
Functions
| cli_main() | Quality scoring script for Simulstream evaluation. |
| main(scorer_cls, args) | Main entry point for quality scoring. |
- simulstream.metrics.score_quality.cli_main()
Quality scoring script for Simulstream evaluation.
This module provides functionality to compute quality-based evaluation metrics on system outputs stored in JSONL log files. It uses pluggable scorers from the simulstream.metrics.scorers.quality registry and compares system outputs against references and/or transcripts.
It supports:
- Reference-based metrics (e.g., BLEU, COMET).
- Source-based metrics (e.g., reference-free COMET).
- Hybrid setups when both references and transcripts are available.
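The idea of pluggable scorers that declare which inputs they need can be sketched as follows. This is a minimal illustration and not the simulstream implementation: `Sample`, `Scorer`, `SCORERS`, `register`, and both toy metrics are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Type


@dataclass
class Sample:
    """One scoring unit; reference and transcript are optional (hypothetical layout)."""
    hypothesis: str
    reference: Optional[str] = None
    transcript: Optional[str] = None


class Scorer:
    """Hypothetical base class: subclasses declare which inputs they need."""
    requires_reference = False
    requires_transcript = False

    def score(self, samples: List[Sample]) -> float:
        raise NotImplementedError


# Hypothetical registry mapping a CLI-style name to a scorer class.
SCORERS: Dict[str, Type[Scorer]] = {}


def register(name: str) -> Callable[[Type[Scorer]], Type[Scorer]]:
    """Class decorator that stores the scorer class under `name`."""
    def decorator(cls: Type[Scorer]) -> Type[Scorer]:
        SCORERS[name] = cls
        return cls
    return decorator


@register("exact_match")
class ExactMatchScorer(Scorer):
    """Toy reference-based metric: fraction of hypotheses equal to their reference."""
    requires_reference = True

    def score(self, samples: List[Sample]) -> float:
        hits = sum(s.hypothesis == s.reference for s in samples)
        return hits / len(samples)


@register("length_ratio")
class LengthRatioScorer(Scorer):
    """Toy source-based metric: hypothesis-to-transcript word-count ratio."""
    requires_transcript = True

    def score(self, samples: List[Sample]) -> float:
        hyp_words = sum(len(s.hypothesis.split()) for s in samples)
        src_words = sum(len(s.transcript.split()) for s in samples)
        return hyp_words / src_words
```

In such a design, a scorer name given on the command line is looked up in the registry, and the `requires_*` flags tell the caller whether references, transcripts, or both must be loaded before scoring.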
The script can be invoked as a standalone CLI:
- $ python -m simulstream.metrics.score_quality \
      --eval-config config/speech-processor.yaml \
      --log-file metrics.jsonl \
      --references ref.en \
      --transcripts src.it \
      --scorer sacrebleu
- simulstream.metrics.score_quality.main(scorer_cls: type[QualityScorer], args: Namespace)
Main entry point for quality scoring.
This function loads the evaluation configuration, system hypotheses, and reference/transcript data (if required), then constructs scoring samples and computes the final quality score using the selected scorer.
The resulting score is printed to standard output.
- Parameters:
scorer_cls (type[QualityScorer]) – Class implementing the quality metric.
args (argparse.Namespace) – Parsed command-line arguments.
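The flow described for this function (load hypotheses, load optional references and transcripts, build samples, score, print) might look roughly like the sketch below. It is an illustration under stated assumptions, not the actual code: the JSONL schema (one object per line with a `"text"` field), the dictionary-based samples, `read_lines`, `main_sketch`, and the toy `ExactMatchScorer` are all hypothetical, and the evaluation configuration passed via `--eval-config` is left out entirely. The attribute names `log_file`, `references`, and `transcripts` are the ones argparse would derive from the CLI flags shown earlier.

```python
import json
from argparse import Namespace
from pathlib import Path
from typing import List, Optional


def read_lines(path: Optional[str]) -> Optional[List[str]]:
    """Return the lines of a text file, or None when no path is given."""
    if path is None:
        return None
    return Path(path).read_text(encoding="utf-8").splitlines()


class ExactMatchScorer:
    """Toy stand-in for a QualityScorer subclass (hypothetical interface)."""

    def __init__(self, args: Namespace):
        self.args = args

    def score(self, samples: List[dict]) -> float:
        hits = sum(s["hypothesis"] == s["reference"] for s in samples)
        return hits / len(samples)


def main_sketch(scorer_cls, args: Namespace) -> float:
    # Assumed log schema: one JSON object per line with a "text" field.
    hypotheses = [
        json.loads(line)["text"]
        for line in read_lines(args.log_file)
        if line.strip()
    ]
    references = read_lines(args.references)
    transcripts = read_lines(args.transcripts)

    # Every side file that is present must align line-by-line with the hypotheses.
    for name, lines in (("references", references), ("transcripts", transcripts)):
        if lines is not None and len(lines) != len(hypotheses):
            raise ValueError(f"{name} has {len(lines)} lines, expected {len(hypotheses)}")

    samples = [
        {
            "hypothesis": hyp,
            "reference": references[i] if references is not None else None,
            "transcript": transcripts[i] if transcripts is not None else None,
        }
        for i, hyp in enumerate(hypotheses)
    ]

    score = scorer_cls(args).score(samples)
    print(score)  # the result goes to standard output
    return score
```

Passing the scorer class rather than an instance lets the entry point construct the scorer from the parsed arguments, which is consistent with the `scorer_cls` parameter documented above.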