adding spec-kit information for audio qc pipeline for discussion#261

Draft
wilke0818 wants to merge 3 commits into main from 185-audio-quality-pipeline

Conversation

@wilke0818
Contributor

  • Adding a document created by spec-kit to open discussion of the planned audio QC pipeline

Contributor

@gemini-code-assist bot left a comment


Code Review

This pull request introduces the specification and implementation plan for an Audio Quality Assurance Pipeline, covering automated screening, human review, and release reporting. The feedback identifies a significant mathematical flaw in the composite confidence calculation and a security risk regarding the persistence of raw PII text in metadata files. Additionally, the reviewer noted inconsistencies between the research findings and the data model's classification logic, and suggested improvements for CLI documentation and the configurability of task-specific thresholds.

| Field | Type | Description |
|---|---|---|
| `session_id` | string | Identifier of the recording session |
| `task_name` | string | Name of the task the recording belongs to |
| `composite_score` | float 0.0–1.0 | Weighted mean of per-check scores |
| `composite_confidence` | float 0.0–1.0 | 1 − std_dev of per-check confidences |
Contributor


high

The proposed formula for composite_confidence ($1 - \text{std_dev}(\text{confidences})$) is mathematically unsound for this context. If all individual checks have very low confidence (e.g., all are 0.1), the standard deviation is 0, resulting in a composite confidence of 1.0 (perfect confidence). This would lead to misleadingly high confidence claims in the final report. A more robust approach would be to use the weighted mean of confidences, potentially penalized by the variance to account for disagreement between checks.
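The flaw, and the reviewer's suggested fix, can be sketched in a few lines. This is a hypothetical implementation of the suggestion (the spec does not define `composite_confidence_proposed`; the variance penalty is one plausible reading of "penalized by the variance"):

```python
import statistics

def composite_confidence_spec(confidences):
    """Formula as written in the spec: 1 - std_dev of per-check confidences."""
    return 1.0 - statistics.pstdev(confidences)

def composite_confidence_proposed(confidences, weights=None):
    """Reviewer's suggestion (hypothetical implementation): weighted mean of
    confidences, penalized by their variance to reflect disagreement."""
    if weights is None:
        weights = [1.0] * len(confidences)
    total = sum(weights)
    mean = sum(w * c for w, c in zip(weights, confidences)) / total
    var = sum(w * (c - mean) ** 2 for w, c in zip(weights, confidences)) / total
    return max(0.0, mean - var)

# All checks deeply uncertain, yet the spec formula reports perfect confidence:
low = [0.1, 0.1, 0.1]
print(composite_confidence_spec(low))      # 1.0 (misleading)
print(composite_confidence_proposed(low))  # 0.1 (reflects low confidence)
```

Note that the spec formula also reports 1.0 for three checks that all report 0.9 confidence, so it cannot distinguish unanimous certainty from unanimous uncertainty; the weighted-mean form does.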

Contributor Author


Noted and changing

Comment on lines +71 to +79
*pii_disclosure*:
```json
{
"entities_detected": [{"text": "...", "label": "name", "score": 0.82}],
"transcript_confidence": 0.88,
"model_used": "gliner-pii",
"redacted_transcript": "My name is [NAME]."
}
```
Contributor


security-high

The pii_disclosure detail block includes an entities_detected list containing the text of the detected PII. Storing the actual PII text in qa_check_results.tsv (which is written to the BIDS root) creates a security and privacy risk, as these metadata files might be inadvertently included in data shares or releases. To support human review without persisting PII in logs, consider storing only the entity labels, confidence scores, and character offsets, or ensuring that these specific logs are strictly excluded from any external distribution.
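One way to implement the reviewer's "labels, scores, and offsets only" suggestion is to sanitize the detail block before it is written to the BIDS-root TSV. The field names follow the spec's JSON example above; the `start`/`end` character offsets are an assumption added on top of it:

```python
def sanitize_pii_detail(detail: dict) -> dict:
    """Return a copy of the pii_disclosure detail block with raw PII text
    removed, keeping only entity labels, scores, and character offsets.
    The already-redacted transcript is safe to keep as-is."""
    sanitized = dict(detail)
    sanitized["entities_detected"] = [
        {k: v for k, v in ent.items() if k != "text"}
        for ent in detail.get("entities_detected", [])
    ]
    return sanitized

detail = {
    "entities_detected": [
        {"text": "John Doe", "label": "name", "score": 0.82, "start": 11, "end": 19}
    ],
    "transcript_confidence": 0.88,
    "model_used": "gliner-pii",
    "redacted_transcript": "My name is [NAME].",
}
print(sanitize_pii_detail(detail)["entities_detected"])
# [{'label': 'name', 'score': 0.82, 'start': 11, 'end': 19}]
```

The offsets let a human reviewer locate the flagged span in the (access-controlled) transcript without the raw PII ever landing in `qa_check_results.tsv`.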

Contributor Author


Aggregate results won't store this, but the per-file JSON sidecars should, for human review.

@wilke0818
Contributor Author

satra's offline notes:

The few bits that caught my eye: the spec implies there could be only one extra voice. Perhaps clarify that while it's likely there is zero or one extra voice, there could be more (and potentially in different languages, although relatively unlikely). The other thing to add is potential considerations for recording environment: while many recordings are done in a quiet setting, it's possible not all are, so consider running something like YAMNet (or some recent variant) to detect background noise.
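The YAMNet suggestion could feed a simple per-recording check. YAMNet emits one 521-class AudioSet score vector per audio frame; a minimal post-processing sketch (class indices other than Speech, and the noisy/speech comparison itself, are placeholder assumptions, not values from the spec or its `qa_pipeline_config.json`):

```python
import numpy as np

SPEECH_IDX = 0             # AudioSet class 0 is "Speech"
NOISE_IDXS = [36, 37, 38]  # placeholder indices for noise classes

def noisy_frame_fraction(frame_scores: np.ndarray) -> float:
    """frame_scores: (n_frames, n_classes) array of per-frame class scores.
    Returns the fraction of frames where noise-class mass exceeds speech."""
    noise = frame_scores[:, NOISE_IDXS].sum(axis=1)
    speech = frame_scores[:, SPEECH_IDX]
    return float((noise > speech).mean())

# Toy example: 4 frames, 521 classes; the last two frames are noise-dominated.
scores = np.zeros((4, 521))
scores[:2, SPEECH_IDX] = 0.9
scores[2:, NOISE_IDXS[0]] = 0.8
print(noisy_frame_fraction(scores))  # 0.5
```

A recording whose noisy-frame fraction exceeds a configured threshold would then fail (or flag for review) an environment-noise check alongside the existing ones.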

wilke0818 and others added 2 commits April 13, 2026 15:28
- qa_pipeline_config.json: default PipelineConfig with all thresholds,
  weights, compliance params, YAMNet noise classes, evans_model TODO,
  and sc_004_review_fraction_warn
- qa_pipeline_schema.json: JSON Schema (draft-07) documenting all 8
  output file formats and 12 entity definitions
- qa_models.py: CheckResult, CompositeScore, ReviewDecision,
  QualityReport, PipelineConfig dataclasses + 4 enums (no I/O logic)
- tasks.md: T001, T002, T003 marked complete

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
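The commit message names the `qa_models.py` dataclasses and enums but not their fields; as a rough illustration of the shape implied by the spec excerpt above, something like the following (all field names beyond `session_id`, `task_name`, `composite_score`, and `composite_confidence` are guesses, as is the enum):

```python
from dataclasses import dataclass, field
from enum import Enum

class CheckStatus(Enum):
    """Hypothetical status enum; the PR defines 4 enums, contents unknown."""
    PASS = "pass"
    FAIL = "fail"
    NEEDS_REVIEW = "needs_review"

@dataclass
class CheckResult:
    check_name: str
    score: float       # 0.0-1.0
    confidence: float  # 0.0-1.0
    status: CheckStatus
    detail: dict = field(default_factory=dict)  # e.g. the pii_disclosure block

@dataclass
class CompositeScore:
    session_id: str
    task_name: str
    composite_score: float       # weighted mean of per-check scores
    composite_confidence: float
    checks: list = field(default_factory=list)  # list[CheckResult]
```

Keeping these as plain dataclasses with no I/O logic, as the commit describes, lets the TSV/JSON serialization live entirely in the pipeline layer.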
Contributor Author

@wilke0818 left a comment


Seems good so far — mostly documentation files and some model definitions.


3 participants