
refactor adult content detection #1301

Open
wastaken7 wants to merge 2 commits into master from adult-content
Conversation

@wastaken7
Collaborator

@wastaken7 wastaken7 commented Mar 19, 2026

#1279

Summary by CodeRabbit

  • New Features

    • Introduced centralized adult media detection system with consistent confirmation workflow across all trackers.
  • Refactor

    • Consolidated adult content validation logic from individual tracker modules into a unified system.
    • Standardized adult media upload handling and confirmation prompts across all platforms.
    • Improved metadata handling with new adult media classification flag.

@github-actions

Thanks for taking the time to contribute to this project. Upload Assistant is currently undergoing a complete rewrite, and no new development is being conducted on this Python source at this time.

If you have come this far, please feel free to leave open any pull requests that add new sites to the source, as these can serve as the baseline for later conversion.

If your pull request relates to a critical bug, it will be addressed in this code base, and a new release will be published as needed.

If your pull request addresses only a minor bug, it is unlikely to be addressed in this code base.

Details for the new code base will follow at a later date.

@coderabbitai

coderabbitai Bot commented Mar 19, 2026

Walkthrough

This PR centralizes adult media detection logic across trackers. Instead of each tracker independently detecting adult content via regex pattern matching on keywords and genres, the system now extracts adult flags from TMDB, computes a unified adult_media flag in Prep, and provides a shared confirmation method in COMMON that trackers delegate to.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Core Adult Media Detection Infrastructure: src/prep.py, src/tmdb.py, src/trackers/COMMON.py | Added check_adult_media() method in Prep that checks TMDB adult flag and matches keywords; TMDB now extracts and returns tmdb_adult_media; COMMON introduces Meta type alias and new check_and_confirm_adult_media_upload() method for centralized gating logic. |
| Tracker Updates (Simple Delegation): src/trackers/AR.py, src/trackers/ULCX.py, src/trackers/OE.py, src/trackers/LUME.py, src/trackers/RTF.py | Removed local keyword-based adult detection and replaced with direct calls to check_and_confirm_adult_media_upload(); unused import re removed where applicable. |
| Tracker Updates (Control Flow Refactoring): src/trackers/ANT.py, src/trackers/BHD.py, src/trackers/IHD.py, src/trackers/RF.py | Replaced conditional branches and flags with early-return patterns; adult validation now gates upload behavior via the centralized method. |
| Tracker Updates (New/Enhanced Methods): src/trackers/TL.py | Added new get_additional_checks() method delegating to check_and_confirm_adult_media_upload(); search_existing() now invokes the additional check before login with early termination on failure. |
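
The TL row describes a delegation pattern: run the shared gate before login and stop early when it fails. A minimal sketch of that shape follows. `ExampleTracker`, the injected `adult_check` callable, and the `logged_in` flag are hypothetical stand-ins, and setting `meta["skipping"]` mirrors the RTF snippet in this review rather than anything confirmed for TL.

```python
from typing import Any, Callable

Meta = dict[str, Any]


class ExampleTracker:
    """Hypothetical tracker, used only to illustrate the early-return pattern."""

    tracker = "EXAMPLE"

    def __init__(self, adult_check: Callable[[Meta, str], bool]) -> None:
        # Stand-in for COMMON.check_and_confirm_adult_media_upload.
        self.adult_check = adult_check
        self.logged_in = False

    def get_additional_checks(self, meta: Meta) -> bool:
        # Delegate the decision instead of re-implementing keyword matching.
        return self.adult_check(meta, self.tracker)

    def search_existing(self, meta: Meta) -> list[str]:
        # Run the gate first so a rejected upload never triggers a login.
        if not self.get_additional_checks(meta):
            meta["skipping"] = self.tracker
            return []
        self.logged_in = True  # placeholder for the real login and dupe search
        return []
```

Injecting the check as a callable keeps the sketch independent of the real COMMON class; in the PR the tracker would call the shared method directly.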

Sequence Diagram

sequenceDiagram
    participant User
    participant Prep
    participant TMDB as TMDB API
    participant Tracker
    participant COMMON

    User->>Prep: Start upload
    Prep->>TMDB: Fetch media data
    TMDB-->>Prep: Return adult flag
    Prep->>Prep: check_adult_media()
    Note over Prep: Evaluate TMDB flag<br/>+ keyword matching
    Prep->>Tracker: meta["adult_media"] = bool
    
    Tracker->>COMMON: check_and_confirm_adult_media_upload(meta)
    alt Adult media detected
        COMMON->>User: Prompt confirmation<br/>(if not unattended)
        User-->>COMMON: Confirm/Deny
        COMMON-->>Tracker: Return bool
    else Not adult media
        COMMON-->>Tracker: Return True
    end
    
    alt Confirmation allowed
        Tracker->>Tracker: Proceed with upload
    else Confirmation denied
        Tracker->>Tracker: Skip/Abort upload
    end

Estimated Code Review Effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly Related PRs

Suggested Reviewers

  • Audionut

Poem

🐰 Hopping through the code with glee,
No more duplication do we see!
Adult detection, now as one,
Trackers unified—the job is done!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 23.08% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'refactor adult content detection' accurately describes the main change: consolidating fragmented adult-content detection logic from multiple tracker files into centralized methods in Prep and COMMON classes. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.



@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

🧹 Nitpick comments (1)
src/trackers/COMMON.py (1)

1318-1336: Centralized adult content check looks correct.

The logic correctly handles:

  • Interactive mode: prompts user for confirmation
  • Unattended with unattended_confirm: prompts user
  • Unattended without confirm: silently blocks upload

Minor suggestion for slightly cleaner control flow:

♻️ Optional: Simplify conditional with inverted check
         if meta.get("adult_media", False):
             if not meta["unattended"] or (meta["unattended"] and meta.get("unattended_confirm", False)):
                 console.print(f"[bold red]Pornography is not allowed at {tracker}.[/bold red]")
-                if cli_ui.ask_yes_no("Do you want to upload anyway?", default=False):
-                    pass
-                else:
+                if not cli_ui.ask_yes_no("Do you want to upload anyway?", default=False):
                     return False
             else:
                 return False
 
         return True
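
For reference, the simplified control flow can be written as a standalone function. This is a sketch under assumptions, not the code in COMMON.py: the name `confirm_adult_upload` and the injected `ask` callable (standing in for `cli_ui.ask_yes_no`) are hypothetical, and it reads `unattended` with `.get()` where the PR indexes `meta["unattended"]` directly.

```python
from typing import Any, Callable


def confirm_adult_upload(
    meta: dict[str, Any],
    tracker: str,
    ask: Callable[[str], bool],
) -> bool:
    """Return True when the upload may proceed."""
    if not meta.get("adult_media", False):
        return True

    unattended = meta.get("unattended", False)
    if unattended and not meta.get("unattended_confirm", False):
        # Fully unattended: nobody can answer a prompt, so block silently.
        return False

    # Interactive, or unattended with confirmation enabled: let the user decide.
    print(f"Pornography is not allowed at {tracker}.")
    return ask("Do you want to upload anyway?")
```

Returning the prompt result directly removes the `pass` branch, and passing the prompt in as a parameter makes the three modes listed above easy to exercise without a terminal.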
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/trackers/COMMON.py` around lines 1318 - 1336, The control flow in
check_and_confirm_adult_media_upload is correct but can be simplified: if
meta["adult_media"] is truthy, first return False immediately when
meta.get("unattended", False) is True and meta.get("unattended_confirm", False)
is False; otherwise print the console message and call cli_ui.ask_yes_no("Do you
want to upload anyway?", default=False) and return True only if that call
returns True (else return False); keep all checks referencing meta keys
"adult_media", "unattended", and "unattended_confirm" and use the existing
console.print and cli_ui.ask_yes_no calls in the function.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e84e2b24-5765-4b08-a29f-bc7422401cfe

📥 Commits

Reviewing files that changed from the base of the PR and between 7df0d14 and f56ed60.

📒 Files selected for processing (13)
  • src/prep.py
  • src/tmdb.py
  • src/trackers/ANT.py
  • src/trackers/AR.py
  • src/trackers/BHD.py
  • src/trackers/COMMON.py
  • src/trackers/IHD.py
  • src/trackers/LUME.py
  • src/trackers/OE.py
  • src/trackers/RF.py
  • src/trackers/RTF.py
  • src/trackers/TL.py
  • src/trackers/ULCX.py

Comment thread src/prep.py Outdated
Comment on lines +1274 to +1278
genres = f"{meta.get('keywords', '')} {meta.get('combined_genres', '')}"
if meta.get("tmdb_adult_media", False):
    return True
else:
    return bool(any(re.search(rf"(^|,\s*){re.escape(keyword)}(\s*,|$)", genres, re.IGNORECASE) for keyword in adult_keywords))

⚠️ Potential issue | 🟠 Major

Use a comma-aware separator before running the adult regex.

The pattern only matches at start-of-string or after a comma. f"{keywords} {combined_genres}" inserts a plain space, so an adult value at the start of combined_genres—for example Erotic, Drama—won't match and the shared adult gate can be bypassed.
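
The bypass can be reproduced in isolation. The snippet below reuses the regex and keyword list from the PR; the helper name `matches_adult_keyword` and the sample `keywords` and `combined_genres` values are made up for illustration.

```python
import re

ADULT_KEYWORDS = ["xxx", "erotic", "porn", "adult", "orgy"]


def matches_adult_keyword(text: str) -> bool:
    # Same pattern as the PR: a keyword must be a whole comma-delimited entry.
    return any(
        re.search(rf"(^|,\s*){re.escape(keyword)}(\s*,|$)", text, re.IGNORECASE)
        for keyword in ADULT_KEYWORDS
    )


keywords = "based on novel, period piece"  # sample values, not taken from the PR
combined_genres = "Erotic, Drama"

# Space separator: "Erotic" follows "piece " and is not seen as its own entry.
space_joined = f"{keywords} {combined_genres}"
# Comma separator, skipping empty parts: every entry is properly delimited.
comma_joined = ", ".join(part for part in (keywords, combined_genres) if part)

print(matches_adult_keyword(space_joined))  # False
print(matches_adult_keyword(comma_joined))  # True
```

The same miss occurs when `keywords` is empty, because the f-string then yields a leading space before `Erotic` and the `^` anchor no longer lines up with the keyword.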

Suggested fix
     def check_adult_media(self, meta) -> bool:
         adult_keywords = ["xxx", "erotic", "porn", "adult", "orgy"]
-        genres = f"{meta.get('keywords', '')} {meta.get('combined_genres', '')}"
         if meta.get("tmdb_adult_media", False):
             return True
-        else:
-            return bool(any(re.search(rf"(^|,\s*){re.escape(keyword)}(\s*,|$)", genres, re.IGNORECASE) for keyword in adult_keywords))
+        searchable = ", ".join(
+            part for part in (meta.get('keywords', ''), meta.get('combined_genres', '')) if part
+        )
+        return any(
+            re.search(rf"(^|,\s*){re.escape(keyword)}(\s*,|$)", searchable, re.IGNORECASE)
+            for keyword in adult_keywords
+        )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/prep.py` around lines 1274 - 1278, The combined genres/keywords string is
built with a space so entries at the start of meta["combined_genres"] can evade
the comma-aware regex; change how genres is constructed (e.g., join keywords and
combined_genres with a comma and/or use filter to avoid empty parts) so genres
is a true comma-separated list before the regex runs (refer to the variable
genres and the use of meta.get('keywords', ''), meta.get('combined_genres', ''),
and adult_keywords in this block).

Comment thread src/tmdb.py
'networks': networks,
'imdb_mismatch': imdb_mismatch,
'mismatched_imdb_id': mismatched_imdb_id,
"tmdb_adult_media": adult_media,

⚠️ Potential issue | 🟠 Major

Backfill the new TMDb adult fields on already-populated metadata paths.

Adding tmdb_adult_media here isn't sufficient while set_tmdb_metadata() still skips the TMDb fetch as soon as title/year/genres/overview exist. Those paths never populate tmdb_adult_media or TMDb keywords, so Prep.check_adult_media() can classify adult titles as safe.

Suggested fix
# src/tmdb.py
-        essential_fields = ['title', 'year', 'genres', 'overview']
+        essential_fields = ['title', 'year', 'genres', 'overview', 'keywords', 'tmdb_adult_media']
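
To see why the field list matters, here is a small sketch of a short-circuit check of this kind. The function `needs_tmdb_fetch` and the sample metadata are hypothetical; the real condition inside set_tmdb_metadata() is not shown in this review and may differ.

```python
from typing import Any, Iterable

ORIGINAL_FIELDS = ["title", "year", "genres", "overview"]
EXTENDED_FIELDS = ORIGINAL_FIELDS + ["keywords", "tmdb_adult_media"]


def needs_tmdb_fetch(meta: dict[str, Any], essential_fields: Iterable[str]) -> bool:
    # Test for presence rather than truthiness: tmdb_adult_media=False is a
    # valid, already-fetched value and should not trigger another request.
    return any(meta.get(field) is None for field in essential_fields)


# Metadata populated before the new fields existed (sample values).
cached_meta = {
    "title": "Example",
    "year": 2020,
    "genres": "Drama",
    "overview": "A sample overview.",
}

print(needs_tmdb_fetch(cached_meta, ORIGINAL_FIELDS))  # False: fetch is skipped
print(needs_tmdb_fetch(cached_meta, EXTENDED_FIELDS))  # True: new fields get backfilled
```

One design point to verify against the actual code: if the existing check relies on truthiness, a stored `False` adult flag or an empty keyword string would look missing and could cause a refetch on every run.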
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tmdb.py` at line 1323, The new tmdb_adult_media field (and TMDb keywords)
won't be backfilled because set_tmdb_metadata() short-circuits when
title/year/genres/overview exist; update set_tmdb_metadata() so it still fetches
TMDb data when tmdb_adult_media or tmdb_keywords are missing (or null) even if
title/year/genres/overview are present, then write those fields into the same
metadata paths (tmdb_adult_media and tmdb_keywords) so Prep.check_adult_media()
can see accurate values; reference set_tmdb_metadata(), tmdb_adult_media, and
tmdb_keywords when making the conditional and write-back changes.

Comment thread src/trackers/RTF.py
Comment on lines +202 to 205
common = COMMON(config=self.config)
if not common.check_and_confirm_adult_media_upload(meta, self.tracker):
    meta['skipping'] = "RTF"
    return []

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

common_file="$(fd 'COMMON.py$' src | head -n 1)"
[ -n "$common_file" ] || { echo "COMMON.py not found" >&2; exit 1; }

echo "=== check_and_confirm_adult_media_upload ==="
python - "$common_file" <<'PY'
import sys
from pathlib import Path

path = Path(sys.argv[1])
lines = path.read_text(encoding="utf-8").splitlines()

start = None
indent = None
for i, line in enumerate(lines):
    stripped = line.lstrip()
    if stripped.startswith("def check_and_confirm_adult_media_upload("):
        start = i
        indent = len(line) - len(stripped)
        break

if start is None:
    raise SystemExit("method not found")

for j, line in enumerate(lines[start:], start=start):
    stripped = line.lstrip()
    current_indent = len(line) - len(stripped)
    if j > start and stripped.startswith(("def ", "class ")) and current_indent <= indent:
        break
    print(line)
PY

echo
echo "=== call sites ==="
rg -n -C2 'check_and_confirm_adult_media_upload\s*\(' src

Repository: Audionut/Upload-Assistant

Length of output: 3601


Remove user override capability from adult-content check in shared helper.

The shared helper check_and_confirm_adult_media_upload() allows uploads to proceed when the user confirms "yes" to the override prompt. This changes RTF (and other trackers like LUME, OE) from a hard-reject to a user-overridable block.

Looking at COMMON.py, when adult_media is flagged and the user is not in unattended mode, the method prints a warning and calls ask_yes_no(). If the user selects yes, it executes pass and then returns True, allowing the upload:

if cli_ui.ask_yes_no("Do you want to upload anyway?", default=False):
    pass  # Falls through to return True
else:
    return False

Either the helper should return False unconditionally when adult content is detected (restoring the hard-reject behavior), or it should only be used for trackers that legitimately allow user override.
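
One way to support both options in a single helper is an explicit per-tracker opt-in. The sketch below is an assumption, not part of the PR: `gate_adult_upload`, its `allow_override` parameter, and the injected `ask` callable (standing in for `cli_ui.ask_yes_no`) are hypothetical names.

```python
from typing import Any, Callable


def gate_adult_upload(
    meta: dict[str, Any],
    tracker: str,
    ask: Callable[[str], bool],
    allow_override: bool = False,
) -> bool:
    """Return True when the upload may proceed."""
    if not meta.get("adult_media", False):
        return True
    if not allow_override:
        # Hard reject: trackers that never accept adult content skip the prompt.
        return False
    if meta.get("unattended", False) and not meta.get("unattended_confirm", False):
        # Nobody is available to confirm, so fall back to rejecting.
        return False
    return ask(f"{tracker} does not allow adult content. Upload anyway?")
```

Defaulting `allow_override` to False keeps the previous hard-reject behavior for callers such as RTF, while a tracker that tolerates a manual override would have to opt in explicitly.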

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/trackers/RTF.py` around lines 202 - 205, The shared helper
COMMON.check_and_confirm_adult_media_upload currently allows a user override;
change its behavior so that when meta indicates adult_media it does not prompt
and always returns False (hard-reject) instead of calling cli_ui.ask_yes_no and
returning True on confirmation; keep callers like RTF (the call in RTF where
COMMON(config=self.config).check_and_confirm_adult_media_upload(meta,
self.tracker) is used) unchanged so they continue to set meta['skipping']="RTF"
and return [] when the helper returns False.

@wastaken7 wastaken7 requested a review from Audionut March 19, 2026 14:09
@wastaken7 wastaken7 linked an issue Mar 19, 2026 that may be closed by this pull request


Development

Successfully merging this pull request may close these issues.

TL: Disallow adult content

1 participant