4 changes: 4 additions & 0 deletions README.md
@@ -97,6 +97,10 @@ The bridge is conservative by default. `policy.json` decides what is trusted, wh

```json
{
"source": {
"from": ["notifications@github.com", "giscebot@gisce.net"]
},
"botLogins": ["pilipilisbot"],
"trustedOrgs": ["your-org"],
"enabledRepos": ["your-org/your-repo"],
"orgRoutes": {
46 changes: 43 additions & 3 deletions docs/policy-reference.md
@@ -44,11 +44,12 @@ gab --policy ~/.config/github-agent-bridge/policy.json enqueue-comment-url ...
```json
{
"source": {
"from": "notifications@github.com",
"from": ["notifications@github.com", "giscebot@gisce.net"],
"requiredAuth": ["spf=pass", "dkim=pass", "dmarc=pass"],
"requiredUrlPrefix": "https://github.com/",
"messageIdDomain": "github.com"
},
"botLogins": ["pilipilisbot"],
"trustedRepos": ["your-org/your-repo"],
"trustedOrgs": ["your-org"],
"enabledRepos": ["your-org/your-repo"],
@@ -97,6 +98,7 @@ gab --policy ~/.config/github-agent-bridge/policy.json enqueue-comment-url ...
| `orgRoutes` | object | `{}` | Per-owner delivery routes used when no `repoRoutes` entry matches. |
| `repoRoles` | object | `{}` | Exact per-repo operating role. Takes precedence over `orgRoles`. |
| `orgRoles` | object | `{}` | Per-owner operating role used when no `repoRoles` entry matches. |
| `botLogins` | array of strings | `["pilipilisbot"]` | GitHub login names that should count as addressed bots when classifying mentions, assignments, and review requests. |
| `actions` | object | built-in action defaults | Maps classified notification actions to policy decisions. |
| `promptOverrides` | object | `{}` | Optional Markdown files that replace selected packaged prompt resources. |
| `feedbackLearning` | object | `{ "enabled": true, "minConfidence": 0.5, "autoApproveConfidence": 0.8 }` | Controls candidate capture, autonomous learning, and prompt threshold for feedback rules. |
@@ -109,7 +111,7 @@ Unknown top-level keys are ignored by the current implementation.

| Key | Type | Default | Meaning |
| --- | --- | --- | --- |
| `from` | string | `notifications@github.com` | Required substring in the decoded email `From` header. |
| `from` | string or array of strings | `notifications@github.com` | Required substring in the decoded email `From` header. Use an array when GitHub notifications are forwarded or rewritten by a trusted mail gateway while GitHub reply headers and message ids are preserved. |
| `requiredUrlPrefix` | string | `https://github.com/` | At least one extracted URL must start with this prefix. |
| `messageIdDomain` | string | `github.com` | Required substring in the email `Message-ID`. |
| `requiredAuth` | array of strings | currently documented only | Intended SPF/DKIM/DMARC requirements. See note below. |
@@ -123,14 +125,52 @@ Current auth behavior:
Source trust fails when any of these are false:

```text
source.from is in From header
any configured source.from value is in From header
AND auth is OK
AND at least one GitHub URL has source.requiredUrlPrefix
AND Message-ID contains source.messageIdDomain
```

If source trust fails, the decision is always `deny`.
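
As a rough illustration of the rule above, the following Python sketch evaluates the four conditions for one message. `SourceRules`, `normalize_from`, and `source_trusted` are hypothetical names chosen for this example, not the bridge's API, and treating an empty auth mapping as OK is an assumption modeled on the `trusted_source` change in this pull request.

```python
from dataclasses import dataclass


def normalize_from(value):
    """Accept the string or array form of `source.from` and return a tuple."""
    return (value,) if isinstance(value, str) else tuple(value)


@dataclass(frozen=True)
class SourceRules:
    # Hypothetical container; the defaults mirror the documented policy defaults.
    from_values: tuple = ("notifications@github.com",)
    required_url_prefix: str = "https://github.com/"
    message_id_domain: str = "github.com"


def source_trusted(rules, from_header, auth, urls, message_id):
    """Return True only when every source-trust condition holds."""
    header = from_header.lower()
    source_ok = any(value.lower() in header for value in rules.from_values)
    # Assumption: an empty auth mapping (nothing parsed) is treated as OK.
    auth_ok = all(auth.get(key, False) for key in ("spf", "dkim", "dmarc")) if auth else True
    url_ok = any(url.startswith(rules.required_url_prefix) for url in urls)
    id_ok = rules.message_id_domain in message_id
    return source_ok and auth_ok and url_ok and id_ok


rules = SourceRules(from_values=normalize_from(["notifications@github.com", "giscebot@gisce.net"]))
forwarded_ok = source_trusted(
    rules,
    "'Eduard Carreras' via GISCE Bot <giscebot@gisce.net>",
    {"spf": True, "dkim": True, "dmarc": True},
    ["https://github.com/gisce/erp/pull/27853#issuecomment-4547966148"],
    "<gisce/erp/pull/27853/c4547966148@github.com>",
)
```

Any single failing condition, such as a `From` header that matches none of the configured values, makes the whole check fail, which corresponds to a `deny` decision.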

Example with Google Groups or similar forwarded GitHub notifications:

```json
{
  "source": {
    "from": ["notifications@github.com", "giscebot@gisce.net"],
    "requiredUrlPrefix": "https://github.com/",
    "messageIdDomain": "github.com"
  }
}
```

The parser still requires GitHub-specific headers (`X-GitHub-Recipient` and `X-GitHub-Reason`), a `Reply-To` address at `reply.github.com`, a `Message-ID` containing `github.com`, and normal source trust before forwarded messages are accepted.
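
When a forwarded message is unexpectedly ignored, it can help to look at each parser requirement separately. The sketch below is a hypothetical diagnostic helper (`forwarded_mail_checks` is not part of the bridge) that reports the header checks one by one; it skips the header decoding the real parser performs.

```python
from email.message import EmailMessage


def forwarded_mail_checks(msg):
    """Report each requirement separately so a rejected message is easy to diagnose."""
    reply_to = str(msg.get("Reply-To", "")).lower()
    message_id = str(msg.get("Message-ID", "")).lower()
    return {
        "x_github_recipient": bool(msg.get("X-GitHub-Recipient")),
        "x_github_reason": bool(msg.get("X-GitHub-Reason")),
        "github_reply_address": "@reply.github.com" in reply_to,
        "github_message_id": "github.com" in message_id,
    }


# A message shaped like a Google Groups rewrite of a GitHub notification.
msg = EmailMessage()
msg["From"] = "'Eduard Carreras' via GISCE Bot <giscebot@gisce.net>"
msg["Reply-To"] = "gisce/erp <reply+abc@reply.github.com>"
msg["Message-ID"] = "<gisce/erp/pull/27853/c4547966148@github.com>"
msg["X-GitHub-Recipient"] = "giscebot"
msg["X-GitHub-Reason"] = "mention"

checks = forwarded_mail_checks(msg)
missing = [name for name, ok in checks.items() if not ok]
```

An empty `missing` list means the header requirements are met; source trust from `policy.json` is still evaluated separately.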

## `botLogins`

`botLogins` defines the GitHub accounts that count as the addressed bot for mention, assignment, and review-request classification.

Default:

```json
{
  "botLogins": ["pilipilisbot"]
}
```

Configured names are case-insensitive and may include or omit the leading `@`.

Example:

```json
{
  "botLogins": ["pilipilisbot", "giscebot"]
}
```

With this policy, comments that mention `@giscebot`, assignments to `@giscebot`, and review requests addressed to `@giscebot` are classified the same way as the default `@pilipilisbot` notifications. Set an explicit empty array only if the deployment should rely on GitHub footer text such as “You are receiving this because you were mentioned” instead of login matching.
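
The normalization and matching described above can be sketched as follows. `normalize_bot_logins` and `bot_addressed` are hypothetical helpers written for this example; the sketch uses plain substring matching and leaves out the generic footer and "assigned you" style patterns the parser also checks.

```python
def normalize_bot_logins(logins):
    """Lowercase, trim, and drop a leading '@'; skip entries that end up empty."""
    cleaned = (str(login).strip().lower().lstrip("@") for login in logins)
    return {name for name in cleaned if name}


def bot_addressed(text, logins):
    """Return which kinds of addressing appear in the text for the given logins."""
    text = text.lower()
    names = sorted(normalize_bot_logins(logins))
    # Each login is matched both with and without the leading '@'.
    patterns = [p for name in names for p in (f"@{name}", name)]
    return {
        "mentioned": any(p in text for p in patterns),
        "assigned": any(f"assigned {p}" in text for p in patterns),
        "review_requested": any(
            f"requested review from {p}" in text or f"requested {p}" in text
            for p in patterns
        ),
    }


flags = bot_addressed("alice requested review from @GisceBot", ["pilipilisbot", "@giscebot"])
```

Because matching is by substring, a login that is also a common word would match unrelated text, so distinctive bot names are the safer choice.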

## `trustedRepos`

Exact repositories trusted for `trustedAuto` actions.
8 changes: 7 additions & 1 deletion policy.example.json
@@ -1,6 +1,9 @@
{
"source": {
"from": "notifications@github.com",
"from": [
"notifications@github.com",
"giscebot@gisce.net"
],
"requiredAuth": [
"spf=pass",
"dkim=pass",
@@ -10,6 +13,9 @@
"messageIdDomain": "github.com"
},
"trustedRepos": [],
"botLogins": [
"pilipilisbot"
],
"trustedOrgs": [
"your-org"
],
4 changes: 2 additions & 2 deletions src/github_agent_bridge/cli.py
@@ -18,7 +18,7 @@
from .models import Notification, utc_now
from .monitor import MonitorThresholds, monitor, report_json
from .observability import DEFAULT_PROCESS_SAMPLE_RETENTION_SECONDS
from .parser import decode_header_value, extract_body_text, parse_auth_results
from .parser import decode_header_value, extract_body_text, is_github_notification_message, parse_auth_results
from .policy import Policy
from .queue import JobQueue
from .reader import ImapConfig, ImapReader, imap_mailbox_arg
@@ -34,7 +34,7 @@ def load_policy(path: str | None) -> Policy:

def msg_to_notification(msg, uid: int | None = None) -> Notification | None:
from_addr = decode_header_value(msg.get("From", ""))
if "notifications@github.com" not in from_addr.lower():
if not is_github_notification_message(msg, from_addr):
return None
return Notification(
uid=uid,
3 changes: 3 additions & 0 deletions src/github_agent_bridge/dispatch.py
@@ -131,6 +131,9 @@ def is_non_actionable_review(self, ctx: GitHubContext) -> bool:
review = self.pull_request_review(ctx)
if not review:
return False
state = (review.get("state") or "").upper()
if state == "APPROVED":
return True
body = (review.get("body") or "").lower()
non_actionable_markers = (
"generated no new comments",
46 changes: 37 additions & 9 deletions src/github_agent_bridge/parser.py
@@ -8,9 +8,9 @@

REVIEW_ONLY_PATTERNS = ("fes-ne una review", "fes una review", "fes review", "fer una review", "fes-ne una revisio", "fes-ne una revisió", "fes una revisio", "fes una revisió", "fer una revisio", "fer una revisió", "review de la pr", "revisió de la pr", "revisio de la pr", "revisa aquesta pr", "revisa els canvis", "revisar els canvis", "com veus els canvis", "què et semblen els canvis", "que et semblen els canvis", "what do you think of these changes", "please review", "can you review")
IMPLEMENTATION_PATTERNS = ("fes els canvis", "fes-ho", "implementa", "modifica", "canvia", "arregla", "corregeix", "fix", "push", "commit", "aplica", "resol", "resolve")
BOT_MENTION_PATTERNS = ("@pilipilisbot", "pilipilisbot", "you are receiving this because you were mentioned")
ASSIGNMENT_PATTERNS = ("assigned you", "assigned to you", "you were assigned", "you are assigned", "assigned pilipilisbot", "assigned @pilipilisbot")
REVIEW_REQUEST_PATTERNS = ("requested your review", "requested a review from you", "you were requested for review", "review requested", "requested review from pilipilisbot", "requested review from @pilipilisbot", "requested @pilipilisbot")
BOT_MENTION_PATTERNS = ("you are receiving this because you were mentioned",)
ASSIGNMENT_PATTERNS = ("assigned you", "assigned to you", "you were assigned", "you are assigned")
REVIEW_REQUEST_PATTERNS = ("requested your review", "requested a review from you", "you were requested for review", "review requested")
COPILOT_REVIEW_PATTERNS = ("copilot-pull-request-reviewer", "github-copilot", "github copilot", "copilot reviewed", "copilot commented", "copilot left a comment", "copilot suggested", "copilot requested changes")
WORKFLOW_RUN_FAILED_PATTERNS = ("run failed", "workflow run failed", "workflow failed", "job failed", "failing after")

@@ -37,23 +37,51 @@ def parse_auth_results(msg: Message) -> dict[str, bool]:
return {"spf": "spf=pass" in raw, "dkim": "dkim=pass" in raw, "dmarc": "dmarc=pass" in raw}


def is_github_notification_message(msg: Message, from_addr: str | None = None) -> bool:
    """Return True for direct GitHub notifications and Google Groups rewrites.

    GISCE routes GitHub mail through a Google Group, so incoming notifications can
    arrive as `From: ... via GISCE Bot <giscebot@gisce.net>` while retaining the
    GitHub reply address, message id and X-GitHub headers.
    """
    sender = (from_addr or decode_header_value(msg.get("From", ""))).lower()
    if "notifications@github.com" in sender:
        return True
    reply_to = decode_header_value(msg.get("Reply-To", "")).lower()
    message_id = decode_header_value(msg.get("Message-ID", "")).lower()
    return (
        bool(msg.get("X-GitHub-Recipient"))
        and bool(msg.get("X-GitHub-Reason"))
        and "@reply.github.com" in reply_to
        and "github.com" in message_id
    )


def _contains_any(text: str, patterns: tuple[str, ...]) -> bool:
return any(p in text for p in patterns)


def github_event_flags(subject: str, body: str) -> dict[str, bool]:
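def _bot_patterns(bot_logins: set[str] | None) -> tuple[str, ...]:
    # Normalize before filtering: a bare "@" would otherwise become an empty pattern that matches any text.
    normalized = (login.strip().lower().lstrip("@") for login in (bot_logins or set()))
    names = sorted({name for name in normalized if name})
    return tuple(pattern for name in names for pattern in (f"@{name}", name))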


def github_event_flags(subject: str, body: str, bot_logins: set[str] | None = None) -> dict[str, bool]:
text = f"{subject}\n{body}".lower()
return {"bot_mentioned": _contains_any(text, BOT_MENTION_PATTERNS), "assigned": _contains_any(text, ASSIGNMENT_PATTERNS), "review_requested": _contains_any(text, REVIEW_REQUEST_PATTERNS), "copilot_review": _contains_any(text, COPILOT_REVIEW_PATTERNS)}
bot_patterns = _bot_patterns(bot_logins)
assignment_patterns = ASSIGNMENT_PATTERNS + tuple(f"assigned {p}" for p in bot_patterns)
review_patterns = REVIEW_REQUEST_PATTERNS + tuple(f"requested review from {p}" for p in bot_patterns) + tuple(f"requested {p}" for p in bot_patterns)
return {"bot_mentioned": _contains_any(text, BOT_MENTION_PATTERNS + bot_patterns), "assigned": _contains_any(text, assignment_patterns), "review_requested": _contains_any(text, review_patterns), "copilot_review": _contains_any(text, COPILOT_REVIEW_PATTERNS)}


def _looks_like_pr_thread(subject: str, body: str) -> bool:
text = f"{subject}\n{body}".lower()
return bool(re.search(r"\bpr #\d+\b|\bpull request #\d+\b", text) or re.search(r"github\.com/[^/]+/[^/]+/pull/\d+", text))


def classify_work_intent(subject: str, body: str) -> str:
def classify_work_intent(subject: str, body: str, bot_logins: set[str] | None = None) -> str:
text = f"{subject}\n{body}".lower()
flags = github_event_flags(subject, body)
flags = github_event_flags(subject, body, bot_logins)
asks_review = flags["review_requested"] or _contains_any(text, REVIEW_ONLY_PATTERNS)
asks_implementation = flags["assigned"] or _contains_any(text, IMPLEMENTATION_PATTERNS)
if asks_review and not asks_implementation:
@@ -66,9 +94,9 @@ def classify_work_intent(subject: str, body: str) -> str:
return "work_allowed"


def classify_github_action(subject: str, body: str) -> str:
def classify_github_action(subject: str, body: str, bot_logins: set[str] | None = None) -> str:
text = f"{subject}\n{body}".lower()
flags = github_event_flags(subject, body)
flags = github_event_flags(subject, body, bot_logins)
if re.search(r"github\.com/[^/]+/[^/]+/actions/runs/\d+", text) and _contains_any(text, WORKFLOW_RUN_FAILED_PATTERNS):
return "workflow_run_failed"
if "merged" in text:
16 changes: 13 additions & 3 deletions src/github_agent_bridge/policy.py
@@ -21,6 +21,7 @@
"worktree",
}
DEFAULT_REPO_ROLE = "contributor"
DEFAULT_BOT_LOGINS = frozenset({"pilipilisbot"})


@dataclass(frozen=True)
@@ -60,7 +61,7 @@ class FeedbackLearning:

@dataclass(frozen=True)
class Policy:
source_from: str = "notifications@github.com"
source_from: str | tuple[str, ...] = "notifications@github.com"
required_url_prefix: str = "https://github.com/"
message_id_domain: str = "github.com"
trusted_repos: set[str] = field(default_factory=set)
@@ -73,6 +74,7 @@ class Policy:
org_routes: dict[str, Route] = field(default_factory=dict)
repo_roles: dict[str, str] = field(default_factory=dict)
org_roles: dict[str, str] = field(default_factory=dict)
bot_logins: set[str] = field(default_factory=lambda: set(DEFAULT_BOT_LOGINS))
prompt_overrides: PromptOverrides = field(default_factory=PromptOverrides)
feedback_learning: FeedbackLearning = field(default_factory=FeedbackLearning)

@@ -150,7 +152,7 @@ def feedback_learning(raw: dict) -> FeedbackLearning:
)

return cls(
source_from=source.get("from", cls.source_from),
source_from=tuple(source.get("from")) if isinstance(source.get("from"), list) else source.get("from", cls.source_from),
required_url_prefix=source.get("requiredUrlPrefix", cls.required_url_prefix),
message_id_domain=source.get("messageIdDomain", cls.message_id_domain),
trusted_repos={r.lower() for r in data.get("trustedRepos", [])},
@@ -161,13 +163,21 @@ def feedback_learning(raw: dict) -> FeedbackLearning:
trusted_auto_actions=set(actions.get("trustedAuto", ["reply_comment", "open_issue", "submit_review", "sync_after_merge", "workflow_run_failed"])),
repo_routes=routes(data.get("repoRoutes", {})), org_routes=routes(data.get("orgRoutes", {})),
repo_roles=roles(data.get("repoRoles", {})), org_roles=roles(data.get("orgRoles", {})),
bot_logins=(
{str(login).lower().lstrip("@") for login in data.get("botLogins", []) if str(login).strip()}
if "botLogins" in data
else set(DEFAULT_BOT_LOGINS)
),
prompt_overrides=prompt_overrides(data.get("promptOverrides", {})),
feedback_learning=feedback_learning(data.get("feedbackLearning", {})),
)

def trusted_source(self, n: Notification, ctx: GitHubContext) -> bool:
auth_ok = all(bool(n.auth.get(k)) for k in ("spf", "dkim", "dmarc")) if n.auth else True
return self.source_from in n.from_addr and auth_ok and any(u.startswith(self.required_url_prefix) for u in ctx.urls) and self.message_id_domain in n.message_id
sources = (self.source_from,) if isinstance(self.source_from, str) else self.source_from
from_addr = n.from_addr.lower()
source_ok = any(str(source).lower() in from_addr for source in sources)
return source_ok and auth_ok and any(u.startswith(self.required_url_prefix) for u in ctx.urls) and self.message_id_domain in n.message_id

def repo_trusted(self, repo: str | None) -> bool:
if not repo:
4 changes: 2 additions & 2 deletions src/github_agent_bridge/queue.py
@@ -44,8 +44,8 @@ def init(self) -> None:

def enqueue(self, n: Notification, policy: Policy) -> tuple[Job | None, str]:
ctx = extract_github_context(n.body)
action = classify_github_action(n.subject, n.body)
intent = classify_work_intent(n.subject, n.body)
action = classify_github_action(n.subject, n.body, policy.bot_logins)
intent = classify_work_intent(n.subject, n.body, policy.bot_logins)
decision = policy.decision(n, ctx, action)
status = {"auto": "done", "ask": "waiting_approval", "deny": "denied"}.get(decision, "pending")
now = utc_now()
4 changes: 2 additions & 2 deletions src/github_agent_bridge/reader.py
@@ -5,7 +5,7 @@
from dataclasses import dataclass

from .models import Notification
from .parser import decode_header_value, extract_body_text, parse_auth_results
from .parser import decode_header_value, extract_body_text, is_github_notification_message, parse_auth_results
from .policy import Policy
from .queue import JobQueue

@@ -57,7 +57,7 @@ def fetch_once(self) -> int:
from_addr = decode_header_value(msg.get("From", ""))
subject = decode_header_value(msg.get("Subject", ""))
message_id = decode_header_value(msg.get("Message-ID", ""))
if "notifications@github.com" in from_addr.lower():
if is_github_notification_message(msg, from_addr):
n = Notification(uid=uid, message_id=message_id, subject=subject, from_addr=from_addr, body=extract_body_text(msg), auth=parse_auth_results(msg))
self.queue.enqueue(n, self.policy)
# Only GitHub notifications belong to this bounded context.
22 changes: 22 additions & 0 deletions tests/test_github_followup_detection.py
@@ -65,6 +65,28 @@ def test_visible_followup_ignores_review_comment_before_trigger():
assert github.visible_followup_after_trigger(ctx) is None


def test_approved_review_is_non_actionable():
    ctx = GitHubContext(
        urls=["https://github.com/gisce/erp/pull/27805#pullrequestreview-4325056741"],
        repo="gisce/erp",
        issue_number=27805,
        review_id=4325056741,
    )
    github = RecordingGitHubClient(
        {
            ("api", "repos/gisce/erp/pulls/27805/reviews/4325056741"): json.dumps(
                {
                    "state": "APPROVED",
                    "body": "Looks good after the follow-up commit.",
                    "submitted_at": "2026-05-20T03:59:00Z",
                }
            ),
        }
    )

    assert github.is_non_actionable_review(ctx) is True


def test_visible_followup_for_issue_comment_returns_newest_bot_comment_after_trigger():
ctx = GitHubContext(
urls=["https://github.com/pilipilisbot/github-agent-bridge/pull/13#issuecomment-4524715895"],
28 changes: 27 additions & 1 deletion tests/test_manual_enqueue_cli.py
@@ -1,6 +1,7 @@
import json
from email.message import EmailMessage

from github_agent_bridge.cli import _parse_github_comment_url, notification_from_comment_url
from github_agent_bridge.cli import _parse_github_comment_url, msg_to_notification, notification_from_comment_url
from github_agent_bridge.parser import extract_github_context


@@ -32,3 +33,28 @@ def fake_gh(args, gh_bin="gh"):
assert ctx.repo == "gisce/erp"
assert ctx.issue_number == 27675
assert ctx.comment_id == 4419572864


def test_giscebot_mention_classifies_as_reply_comment():
    from github_agent_bridge.parser import classify_github_action

    body = "@giscebot pots mirar això?\nhttps://github.com/gisce/erp/pull/27675#issuecomment-4419572864"

    assert classify_github_action("Re: [gisce/erp] Example (PR #27675)", body, {"giscebot"}) == "reply_comment"


def test_msg_to_notification_accepts_google_group_rewritten_github_mail():
    msg = EmailMessage()
    msg["From"] = "'Eduard Carreras' via GISCE Bot <giscebot@gisce.net>"
    msg["Reply-To"] = "gisce/erp <reply+abc@reply.github.com>"
    msg["Message-ID"] = "<gisce/erp/pull/27853/c4547966148@github.com>"
    msg["Subject"] = "Re: [gisce/erp] Example (PR #27853)"
    msg["X-GitHub-Recipient"] = "giscebot"
    msg["X-GitHub-Reason"] = "mention"
    msg.set_content("https://github.com/gisce/erp/pull/27853#issuecomment-4547966148")

    n = msg_to_notification(msg, uid=6)

    assert n is not None
    assert n.uid == 6
    assert n.from_addr == "'Eduard Carreras' via GISCE Bot <giscebot@gisce.net>"