
Track image validation rejections in MongoDB for data-driven tuning #83

@taha-abbasi


Problem

Image padding detection (#69) rejects DALL-E images that have padding/borders, but we have no structured data on:

  • Rejection rate by niche (is med spa worse than bakery?)
  • Rejection reasons (solid padding vs soft padding vs square/portrait)
  • Regeneration success rate (does the 2nd attempt fix it?)
  • Cost impact (each regen = extra $0.08 DALL-E call)

Currently, rejections are only written with console.log, so they end up in journald and cannot be queried.
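To make the four questions above concrete, here is a minimal sketch of how they might be answered once rejections are stored as structured records. It assumes records shaped like the document in Proposed Solution (`niche`, `reason`, `resolved`), that every rejection triggers exactly one regeneration at $0.08, and that `resolved` is set on the rejection whose regeneration succeeded. `summarizeRejections` is a hypothetical helper, not existing project code.

```javascript
// Hypothetical helper: summarize an array of rejection records in memory.
// Assumes one regeneration per rejection (an assumption, not confirmed here).
function summarizeRejections(records, costPerRegenUsd = 0.08) {
  const byNiche = {};
  const byReason = {};
  let resolved = 0;

  for (const r of records) {
    byNiche[r.niche] = (byNiche[r.niche] || 0) + 1;
    byReason[r.reason] = (byReason[r.reason] || 0) + 1;
    if (r.resolved) resolved += 1;
  }

  const total = records.length;
  return {
    total,
    byNiche,   // rejection counts per niche
    byReason,  // rejection counts per reason
    // Share of rejections that a regeneration eventually fixed.
    regenSuccessRate: total === 0 ? null : resolved / total,
    // Extra DALL-E spend attributable to regenerations.
    extraCostUsd: total * costPerRegenUsd,
  };
}
```

Note that these are rejection counts, not rates: a true rejection rate per niche also needs the number of images validated per niche, which a rejection-only log does not capture.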

Proposed Solution

Log image validation results to MongoDB:

db.imageValidation.insertOne({
  articleId, siteId, niche, brandName,
  reason,           // one of 'solid-padding', 'soft-padding', 'square-portrait'
  dimensions: { width, height },
  regenAttempt: 1,  // increments on each retry
  resolved: false,  // true when regen produces a valid image
  timestamp: new Date()
})
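As a rough sketch of how the validation step might write this document, the helper below builds the record and inserts it through any object exposing `insertOne()`, such as a MongoDB Node.js driver `Collection`. The function names and the `context` and `image` argument shapes are assumptions for illustration, not existing project code.

```javascript
// Allowed values for `reason`, taken from the proposed document shape.
const REASONS = ['solid-padding', 'soft-padding', 'square-portrait'];

// Hypothetical helper: build the document to store.
// `context` carries article/site fields; `image` carries width and height.
function buildRejectionDoc(context, image, reason, regenAttempt = 1) {
  if (!REASONS.includes(reason)) {
    throw new Error(`Unknown rejection reason: ${reason}`);
  }
  const { articleId, siteId, niche, brandName } = context;
  return {
    articleId, siteId, niche, brandName,
    reason,
    dimensions: { width: image.width, height: image.height },
    regenAttempt,            // 1 for the first rejection, higher on retries
    resolved: false,         // flipped later if a regeneration passes
    timestamp: new Date(),
  };
}

// Hypothetical helper: persist the rejection.
// `collection` is anything with an async insertOne(doc) method.
async function recordRejection(collection, context, image, reason, regenAttempt = 1) {
  const doc = buildRejectionDoc(context, image, reason, regenAttempt);
  try {
    await collection.insertOne(doc);
  } catch (err) {
    // A failed insert is logged and swallowed so it cannot block generation.
    console.error('imageValidation insert failed:', err.message);
  }
  return doc;
}
```

Taking the collection as a parameter keeps the helper testable with a stub and leaves connection handling to the caller.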

Why Before Tuning

We need data before tuning DALL-E prompts. Current observation (1 test, Sally's Spa):

  • 2 of 3 images rejected (about 67% rejection rate)
  • 1 of 2 regenerations succeeded
  • Sample size too small to act on
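One way to quantify how little three images tell us is a confidence interval on the observed rate. The sketch below uses the Wilson score interval, which is a choice made here for illustration and not something this issue prescribes. `wilsonInterval` is a hypothetical helper.

```javascript
// Wilson score interval for a binomial proportion.
// z = 1.96 corresponds to an approximate 95% confidence level.
function wilsonInterval(successes, n, z = 1.96) {
  if (n <= 0) throw new Error('n must be positive');
  const p = successes / n;
  const z2 = z * z;
  const denom = 1 + z2 / n;
  const center = (p + z2 / (2 * n)) / denom;
  const half = (z * Math.sqrt((p * (1 - p)) / n + z2 / (4 * n * n))) / denom;
  return { low: center - half, high: center + half };
}

// 2 rejections out of 3 images, as in the Sally's Spa test.
const observed = wilsonInterval(2, 3);
console.log(observed.low.toFixed(2), observed.high.toFixed(2));
```

For 2 of 3, the interval spans roughly 21% to 94%, so the true rejection rate could be anywhere from modest to severe.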

Priority

P2 — not blocking but needed before any DALL-E prompt optimization work.
