
Add notification mechanism to compliance monitor#1103

Draft
mbuechse wants to merge 2 commits into main from issue/899

Conversation

@mbuechse
Contributor

No description provided.

Signed-off-by: Matthias Büchse <matthias.buechse@alasca.cloud>
@mbuechse
Contributor Author

@mitch000001 That's how far we got yesterday

Signed-off-by: Matthias Büchse <matthias.buechse@alasca.cloud>
@mbuechse mbuechse linked an issue Feb 18, 2026 that may be closed by this pull request
@mbuechse
Contributor Author

I think the approach is not quite right. Here, we track whether the result for a given subject/scope/version triple changes, e.g., from PASS to FAIL. However, too many variables are at play, because this result is compiled from testcase results, and each testcase result can stem from a different run with a different software version etc. What we ought to track instead are individual testcase results. For instance, given subject X and testcase Y, two consecutive runs of the corresponding check script might yield result r1 at time t1 with software revision v1, and result r2 at time t2 with software revision v2. If r1 is not equal to r2, then we can tell the partner: hey, there has been a change, and we can even say whether v1 != v2 or not.
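To illustrate the idea, here is a minimal sketch. All names (`RunResult`, `detect_change`, the field names) are hypothetical, not taken from the actual compliance monitor code or schema:

```python
from dataclasses import dataclass

# Hypothetical sketch: track individual testcase results per (subject, testcase)
# and compare two consecutive runs, noting whether the software revision changed.

@dataclass(frozen=True)
class RunResult:
    subject: str       # subject under test, e.g. X
    testcase: str      # testcase id, e.g. Y
    timestamp: float   # t1, t2, ...
    revision: str      # software revision v1, v2, ...
    result: str        # "PASS", "FAIL", "ABORT", ...

def detect_change(r1: RunResult, r2: RunResult):
    """Return a notification payload if the result changed between two
    consecutive runs for the same subject/testcase, else None."""
    assert (r1.subject, r1.testcase) == (r2.subject, r2.testcase)
    if r1.result == r2.result:
        return None
    return {
        "subject": r2.subject,
        "testcase": r2.testcase,
        "old": r1.result,
        "new": r2.result,
        # lets us tell the partner whether v1 != v2 or not
        "revision_changed": r1.revision != r2.revision,
    }
```
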

@mbuechse
Contributor Author

What's also not covered here is the case where the state changes because the latest result expires. We have no outside trigger for that, so we might need to add a dedicated one.

@mbuechse
Contributor Author

Since #889, the risk that the latest result expires is actually quite minimal, because we always get something (be it an ABORT). What could expire, though, is the latest PASSing result. This, however, is probably not what we want! As long as the test suite is run regularly, we can't fault the partner if it produces false positives. We actually have to extend the lifetime of the passing result. That's not something we currently do (and I'm not even sure the database schema is up for that).
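One way this could look, as a rough sketch: the PASSing result expires relative to the latest run of the suite, not relative to the PASS itself. The names and the lifetime period below are assumptions for illustration, not the actual schema:

```python
from datetime import datetime, timedelta

# Hypothetical sketch: extend the effective expiry of the latest PASSing
# result each time the test suite runs again, so regular runs keep the
# PASS alive even if later runs produce false positives.

BASE_LIFETIME = timedelta(days=7)  # assumed lifetime, for illustration only

def effective_expiry(pass_checked_at: datetime, subsequent_run_times: list[datetime]) -> datetime:
    """Expiry of a PASSing result, pushed forward by every later run of the suite."""
    latest_activity = max([pass_checked_at, *subsequent_run_times])
    return latest_activity + BASE_LIFETIME
```
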

@mbuechse
Contributor Author

mbuechse commented Apr 30, 2026

The principle should be: the subject passes the testcase until/unless proven otherwise. We have to prove that. If we get an ABORT, it should probably count as PASS, until/unless a human assessor reviews it and judges that it's indeed a fail. We should write the tests in such a way that FAIL results are quite airtight; then the fail could take effect immediately.

However, some tests are flaky, in the sense that they usually pass but sometimes inexplicably fail (because of some overly strict timeout or whatnot). This is particularly true in the case of Sonobuoy, where we don't even control the code. So we should probably at least retry failed tests, or only count multiple failed tests in a row. Again, in the case of Sonobuoy this is not trivial, because Sonobuoy actually aggregates multiple testcases, each of which can be flaky individually, so three consecutive runs of Sonobuoy may all fail with no two runs failing because of the same testcase.
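The "count multiple failed tests in a row" rule could be sketched like this. The function name and the threshold are illustrative assumptions; the ABORT-as-PASS normalization follows the principle above:

```python
# Hypothetical sketch: report FAIL for one (subject, testcase) only after
# N consecutive failed runs; ABORT counts as PASS until a human assessor
# reviews it. Threshold and names are assumptions for illustration.

def effective_result(history: list[str], fail_threshold: int = 3) -> str:
    """history: raw results for one (subject, testcase), oldest first.
    Returns the result we would report to the partner."""
    # ABORT counts as PASS until/unless a human judges otherwise.
    normalized = ["PASS" if r == "ABORT" else r for r in history]
    tail = normalized[-fail_threshold:]
    if len(tail) == fail_threshold and all(r == "FAIL" for r in tail):
        return "FAIL"
    return "PASS"
```

Note that for an aggregator like Sonobuoy, this would have to be applied per inner testcase, not per run, since consecutive runs may fail for entirely different testcases.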



Development

Successfully merging this pull request may close these issues.

[Feature Request] Notification mechanism for compliance monitor
