25.8 Antalya backport of #90825: Add role-based access to Glue catalog (#1428)
zvonand merged 9 commits into antalya-25.8
Conversation
Add role-based access to Glue catalog
Force-pushed from cef9232 to fba0893
Force-pushed from fba0893 to c96e9ea
Integration Test Failure (all 12 tests)

Root Cause: The new call site passes 4-element tuples:

```python
start_mock_servers(
    started_cluster, script_dir,
    [("mock_sts.py", "sts.us-east-1.amazonaws.com", "80", args)],  # 4 elements
)
```

But the loop unpacks only three:

```python
for server_name, container, port in mocks:  # ValueError: too many values to unpack
```

The upstream repo has an updated version of this helper.

Traceback

Impact: This is directly caused by this PR. Since the fixture crashes, all 12 tests fail.
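A minimal sketch of the unpacking mismatch and one way to fix it, using illustrative stand-in names (these functions mimic the discussion above, not the actual integration-test harness):

```python
# Illustrative stand-ins for the integration-test fixture code; names
# mirror the discussion above but this is not the actual harness.

# The new call site passes 4-element tuples (script, host, port, extra args):
mocks = [("mock_sts.py", "sts.us-east-1.amazonaws.com", "80", ["--foo"])]

def start_mock_servers_old(mocks):
    # Old loop unpacks exactly three fields, so a 4-tuple raises
    # "ValueError: too many values to unpack (expected 3)".
    for server_name, container, port in mocks:
        print(server_name, container, port)

def start_mock_servers_fixed(mocks):
    # Tolerant loop: any optional trailing element is collected into `rest`,
    # so both 3-tuples and 4-tuples unpack cleanly.
    started = []
    for server_name, container, port, *rest in mocks:
        args = rest[0] if rest else []
        started.append((server_name, container, port, args))
    return started
```

With this shape, 3-tuples from older call sites and the new 4-tuples both work, which is presumably what the updated upstream helper accomplishes.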
@Selfeer I saw that, I'll fix it. The problem is that the regression run fails.
Regression aarch64 swarms - Failure Analysis

PR: #1428 | Workflow: 22222536503 | Version:
Reason: Not related to this specific PR
Report: report.html

Summary

Failed Scenarios & Historical Flakiness
Root Cause Analysis

All 6 failures originate from
Conclusion
Appendix: Queries Used

All queries below were run against the `gh-data`.clickhouse_regression_results table.

1. Historical pass/fail ratios per scenario (all-time)

This query computes the historical pass/fail percentage for each of the failed scenarios across all recorded runs.

```sql
SELECT
    test_name,
    result,
    count() AS cnt,
    round(count() * 100.0 / sum(count()) OVER (PARTITION BY test_name), 1) AS pct
FROM `gh-data`.clickhouse_regression_results
WHERE test_name IN (
    '/swarms/feature/node failure/check restart clickhouse on swarm node',
    '/swarms/feature/node failure/check restart swarm node',
    '/swarms/feature/node failure/cpu overload',
    '/swarms/feature/node failure/initiator out of disk space',
    '/swarms/feature/node failure/network failure',
    '/swarms/feature/node failure/swarm out of disk space',
    '/swarms/feature/swarm joins',
    '/swarms/feature/swarm union'
)
GROUP BY test_name, result
ORDER BY test_name, result;
```

Result:
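The window-function percentage in the query above can be mirrored in plain Python to show what it computes; the rows below are made-up illustrative data, not actual results from the regression table:

```python
from collections import Counter

def pass_fail_pct(rows):
    # Mirrors: count() * 100.0 / sum(count()) OVER (PARTITION BY test_name),
    # i.e. each (test, result) count as a percentage of that test's total runs.
    counts = Counter(rows)
    totals = Counter(test for test, _ in rows)
    return {
        (test, result): round(cnt * 100.0 / totals[test], 1)
        for (test, result), cnt in counts.items()
    }

# Made-up sample data for illustration only.
rows = [
    ("/swarms/feature/swarm joins", "OK"),
    ("/swarms/feature/swarm joins", "OK"),
    ("/swarms/feature/swarm joins", "Fail"),
    ("/swarms/feature/swarm union", "OK"),
]
```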
2. Recent timeline for disk space scenarios

This query shows the recent pass/fail timeline for the disk-space and node-failure swarm scenarios.

```sql
SELECT
    test_name,
    result,
    start_time,
    clickhouse_version,
    job_url
FROM `gh-data`.clickhouse_regression_results
WHERE test_name LIKE '%swarm%'
  AND (
       test_name LIKE '%kill swarm%'
    OR test_name LIKE '%out of disk%'
    OR test_name LIKE '%initiator%'
    OR test_name LIKE '%kill initiator%'
    OR test_name LIKE '%node failure%feature%'
  )
ORDER BY start_time DESC
LIMIT 200;
```

Key observation: The most recent ~30 entries for both
'/tiered storage/with s3gcs/background move/max move factor': https://github.com/Altinity/ClickHouse/actions/runs/22222536503/job/64284577377 is also not related and looks like an infra failure: the test queries
There is a swarms failure that doesn't seem to be related to the changes in this PR; in fact, these are the same failures described here: #1428 (comment). @alsugiliazova can we mark this as verified regardless of the issue that was raised? ClickHouse#97858
I think it is OK (as long as this behavior matches upstream and they are OK with that). It also does not look like a problem to me at all.
This PR does not include the S3 credentials cache (it was added in 25.11 and 25.12 and is not in the scope of this backport), so performance may differ from the latest upstream. Backporting the cache logic would be quite heavy. The code in this PR is actually somewhat similar to what upstream had before 91706 (which is related, but its backport was not requested).
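For context, the credentials cache mentioned above amounts to memoizing temporary credentials until shortly before they expire, avoiding a fresh assume-role round trip on every request. A generic sketch under that assumption (names and structure are illustrative, not the actual upstream implementation):

```python
import time

class CredentialsCache:
    """Generic TTL cache for temporary credentials. Illustrative only --
    not the upstream ClickHouse S3 credentials cache."""

    def __init__(self, fetch, refresh_margin_s=60.0):
        self._fetch = fetch            # callable returning (creds, expiry_ts)
        self._margin = refresh_margin_s
        self._creds = None
        self._expiry = 0.0

    def get(self):
        # Refresh only when missing or within the margin of expiry;
        # otherwise reuse the cached credentials.
        now = time.time()
        if self._creds is None or now >= self._expiry - self._margin:
            self._creds, self._expiry = self._fetch()
        return self._creds
```

Without such a cache, each query path may re-fetch credentials, which is the likely source of the performance difference noted above.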
25.8 does not include
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):
Add role-based access to Glue catalog. Use settings `aws_role_arn` and, optionally, `aws_role_session_name`. (ClickHouse#90825 by @antonio2368)

CI/CD Options
Exclude tests:
Regression jobs to run: