Skip to content

feat: add Data Quality Agent configuration and setup#373

Merged
Threated merged 7 commits intosamply:developfrom
RadovanTomik:feat/data_quality_agent
Apr 7, 2026
Merged

feat: add Data Quality Agent configuration and setup#373
Threated merged 7 commits intosamply:developfrom
RadovanTomik:feat/data_quality_agent

Conversation

@RadovanTomik
Copy link
Copy Markdown
Contributor

Adds an opt-in option to install the Data Quality Agent as part of the BH

@RadovanTomik RadovanTomik requested a review from a team as a code owner March 3, 2026 12:37
@RadovanTomik
Copy link
Copy Markdown
Contributor Author

still need to figure out the best way to point it to Blaze without user input on start

Comment on lines +8 to +10
# Remove these 2 environment variables if you do not wish to share Data Quality Reports
REPORTING_SERVER_URL: ${DATA_QUALITY_SERVER_URL}
REPORTING_SERVER_NAME: ${DATA_QUALITY_SERVER_NAME}
Copy link
Copy Markdown
Member

@Threated Threated Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment here is rather useless since one would just not set ENABLE_DATA_QUALITY_AGENT if they don't want it (and nobody really reads the compose files).
Would it make sense to default these vars to the values you mentioned in the README or are they defaulted by the component itself?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand this correctly, setting those ENVs activates remote sharing of the quality reports. The application can be used for local reporting nevertheless. Does the application just check if the ENVs are set or does it also check if they are empty? If the latter, you could include the ENVs (and the comment) in the configuration template (bbmri.conf). That way, sites not opting in can leave them empty there.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you for the comments, I have modified the documentation and setup so its clear they can be used for local data quality evaluation and also optional sharing of the metrics

Comment thread README.md Outdated
DATA_QUALITY_SERVER_NAME=Central Data Quality Server of BBMRI
```

Reports are stored under `/var/cache/bridgehead/bbmri/data-quality-agent-reports/` and are accessible at `https://<your-host>/bbmri-data-quality-agent` (requires Bridgehead login).
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is currently not mounted as a volume in the compose file so it won't be persisted.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed and they can be accessed via the UI.

labels:
- "traefik.enable=true"
- "traefik.http.routers.data_quality_agent_bbmri.rule=PathPrefix(`/bbmri-data-quality-agent`)"
- "traefik.http.services.data_quality_agent_bbmri.loadbalancer.server.port=8080"
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I can see through docker inspecting the image this service runs on 8082 no?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed. Is the path prefix correct/conventional for the BH?

@Threated
Copy link
Copy Markdown
Member

Threated commented Mar 3, 2026

still need to figure out the best way to point it to Blaze without user input on start

You can just use point it to "http://bridgehead-bbmri-blaze:8080/fhir" (with or without the /fhir whatever you like) its a bbmri module so there should always be a blaze under that url without auth because its doing docker networking and not going via traefik

@RadovanTomik
Copy link
Copy Markdown
Contributor Author

still need to figure out the best way to point it to Blaze without user input on start

You can just use point it to "http://bridgehead-bbmri-blaze:8080/fhir" (with or without the /fhir whatever you like) its a bbmri module so there should always be a blaze under that url without auth because its doing docker networking and not going via traefik

added as an ENV

@RadovanTomik
Copy link
Copy Markdown
Contributor Author

Can we test the config on BBMRI-ERIC dev BH?

@RadovanTomik
Copy link
Copy Markdown
Contributor Author

Hi, @TKussel could you please take a look at the PR?

Copy link
Copy Markdown
Member

@Threated Threated left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. @TKussel do we want to replicate the ghcr image to our registry or just keep the ghcr url?

- /etc/timezone:/etc/timezone:ro
volumes:
agent-data:
driver: local No newline at end of file
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
driver: local

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

better yet, remove the volume and mount a folder (see above)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment thread bbmri/modules/data-quality-agent-compose.yml
depends_on:
- "blaze"
volumes:
- agent-data:/app/data
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't use a docker volume but use a folder mount in /var/cache/bridgehead/

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if cache is the correct term/place, but I have updated the config to match your request based on the existing exporter config

- /etc/timezone:/etc/timezone:ro
volumes:
agent-data:
driver: local No newline at end of file
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

better yet, remove the volume and mount a folder (see above)

Comment thread README.md Outdated
@TKussel
Copy link
Copy Markdown
Member

TKussel commented Mar 24, 2026

Hi, @TKussel could you please take a look at the PR?

Sorry, took me a while. Thank you for continually improving the PR. I only had some nitpicks regarding the volume remaining :)

@TKussel
Copy link
Copy Markdown
Member

TKussel commented Mar 24, 2026

Looks good to me. @TKussel do we want to replicate the ghcr image to our registry or just keep the ghcr url?

We currently don't have the ghcr in Bridgehead's required endpoint list so, I replicated the latest tag to our registry: https://docker.verbis.dkfz.de/cache/bbmri-cz/data-quality-server:latest

RadovanTomik and others added 2 commits April 7, 2026 10:16
Co-authored-by: Tobias Kussel <TKussel@users.noreply.github.com>
@RadovanTomik
Copy link
Copy Markdown
Contributor Author

Hi, @TKussel could you please take a look at the PR?

Sorry, took me a while. Thank you for continually improving the PR. I only had some nitpicks regarding the volume remaining :)

thanks for taking a look :) I have applied your suggestions, let me know if there is something missing

@Threated Threated merged commit 71b25fe into samply:develop Apr 7, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants