adding spec-kit information for audio qc pipeline for discussion#261
Conversation
wilke0818
commented
Apr 10, 2026
- Adding document created by spec-kit to have discussion about planned audio QC pipeline
Code Review
This pull request introduces the specification and implementation plan for an Audio Quality Assurance Pipeline, covering automated screening, human review, and release reporting. The feedback identifies a significant mathematical flaw in the composite confidence calculation and a security risk regarding the persistence of raw PII text in metadata files. Additionally, the reviewer noted inconsistencies between the research findings and the data model's classification logic, and suggested improvements for CLI documentation and the configurability of task-specific thresholds.
| Field | Type | Description |
| --- | --- | --- |
| `session_id` | string | |
| `task_name` | string | |
| `composite_score` | float 0.0–1.0 | Weighted mean of per-check scores |
| `composite_confidence` | float 0.0–1.0 | 1 − std_dev of per-check confidences |
The proposed formula for composite_confidence ($1 - \text{std_dev}(\text{confidences})$) is mathematically unsound for this context. If all individual checks have very low confidence (e.g., all are 0.1), the standard deviation is 0, resulting in a composite confidence of 1.0 (perfect confidence). This would lead to misleadingly high confidence claims in the final report. A more robust approach would be to use the weighted mean of confidences, potentially penalized by the variance to account for disagreement between checks.
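A quick sketch of the failure mode and the suggested alternative. Function names and the exact penalty term are illustrative, not taken from the pipeline's code:

```python
import statistics

def composite_confidence_stddev(confidences):
    # Proposed spec formula: 1 - std_dev of per-check confidences.
    # Flaw: identical inputs give std_dev 0, so uniformly LOW
    # confidences still yield a "perfect" composite of 1.0.
    return 1.0 - statistics.pstdev(confidences)

def composite_confidence_weighted(confidences, weights=None):
    # Suggested alternative: weighted mean of confidences,
    # penalized by the weighted variance to reflect disagreement.
    if weights is None:
        weights = [1.0] * len(confidences)
    total = sum(weights)
    mean = sum(c * w for c, w in zip(confidences, weights)) / total
    variance = sum(w * (c - mean) ** 2 for c, w in zip(confidences, weights)) / total
    return max(0.0, mean - variance)

low = [0.1, 0.1, 0.1]  # every check is individually unsure
print(composite_confidence_stddev(low))    # ~1.0: misleading "perfect" confidence
print(composite_confidence_weighted(low))  # ~0.1: tracks the actual low confidence
```

With all checks at 0.1 the proposed formula reports near-total confidence, while the weighted-mean variant stays near 0.1; that is the inversion the review is flagging.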
Noted and changing
*pii_disclosure*:

```json
{
  "entities_detected": [{"text": "...", "label": "name", "score": 0.82}],
  "transcript_confidence": 0.88,
  "model_used": "gliner-pii",
  "redacted_transcript": "My name is [NAME]."
}
```
The pii_disclosure detail block includes an entities_detected list containing the text of the detected PII. Storing the actual PII text in qa_check_results.tsv (which is written to the BIDS root) creates a security and privacy risk, as these metadata files might be inadvertently included in data shares or releases. To support human review without persisting PII in logs, consider storing only the entity labels, confidence scores, and character offsets, or ensuring that these specific logs are strictly excluded from any external distribution.
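One way to implement the suggestion, assuming entities carry character offsets as in typical NER output (the helper name and `start`/`end` keys are hypothetical, not from the spec):

```python
def strip_pii_text(entities):
    """Keep only label, score, and character offsets from detected
    entities, so raw PII text never lands in qa_check_results.tsv."""
    return [
        {"label": e["label"], "score": e["score"],
         "start": e.get("start"), "end": e.get("end")}
        for e in entities
    ]

detected = [{"text": "Jane Doe", "label": "name", "score": 0.82,
             "start": 11, "end": 19}]
print(strip_pii_text(detected))
# [{'label': 'name', 'score': 0.82, 'start': 11, 'end': 19}]
```

Reviewers can still locate the span in the source transcript via the offsets, but the aggregate TSV itself is safe to share.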
Aggregate results won't have this stored, but the per-session JSON sidecars should, for human review.
satra's offline notes:
- qa_pipeline_config.json: default PipelineConfig with all thresholds, weights, compliance params, YAMNet noise classes, evans_model TODO, and sc_004_review_fraction_warn
- qa_pipeline_schema.json: JSON Schema (draft-07) documenting all 8 output file formats and 12 entity definitions
- qa_models.py: CheckResult, CompositeScore, ReviewDecision, QualityReport, PipelineConfig dataclasses + 4 enums (no I/O logic)
- tasks.md: T001, T002, T003 marked complete

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
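A minimal sketch of what the pure-data shape described above could look like. Field names and enum values are assumptions for illustration; the actual definitions live in qa_models.py:

```python
from dataclasses import dataclass, field
from enum import Enum

class ReviewStatus(Enum):
    # One plausible enum of the four; actual names/values are in qa_models.py.
    PASS = "pass"
    FLAG = "flag"
    FAIL = "fail"

@dataclass
class CheckResult:
    check_name: str
    score: float        # 0.0-1.0
    confidence: float   # 0.0-1.0
    details: dict = field(default_factory=dict)  # check-specific payload

@dataclass
class CompositeScore:
    session_id: str
    task_name: str
    composite_score: float
    composite_confidence: float

r = CheckResult(check_name="pii_disclosure", score=0.95, confidence=0.88)
print(r.details)  # {} (plain data holders, no I/O logic, matching the notes)
```

Keeping these as dataclasses with no I/O keeps the model layer trivially testable, with serialization handled elsewhere against qa_pipeline_schema.json.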
wilke0818
left a comment
seems good so far, mostly documentation files and some model definitions.