Detailed outlier flags by vergauwenthomas · Pull Request #611 · vergauwenthomas/MetObs_toolkit

vergauwenthomas · 2026-01-20T10:09:26Z

No description provided.

* fix bug and add tests baselines (#571) * fix bug and add tests baselines * fix location of baseline * black edits * handle comma as decimal symbol when importing data (#576) * Add convert_to_numeric_series function and integrate into dataset and sensordata modules * black edits * fix review * Error make plot model name (#579) * added modeldata_name variable to make_plot() of stations class. * add modeldata_kwargs to make_plot in order to select a specific modeldata series * add plot test * black edits * new parameter modeltype adds functionality to plot different type of modeldata compared to obstype, if nothing is specified modeltype=obstype * trigger * resolve empty dicts * tests added * tests adapted (no fake argument anymore) * remove check for dataset if modeltype exists * arg renaming + argsorder change * black edits --------- Co-authored-by: Thomas Vergauwen <thomas.vergauwen@meteo.be> * Fix buddy check bug (#583) * add reference of the iteration to the detailed msg * fix duplicates by joining messages * add a test to trigger this edgecase * black edits * Update tests/test_qc.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * grammer errors in comments --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * develop alpha version bump * first attempt max_gap_duration_to_fill implementation * changed > to >= * rechanged >= to > * print add * print * gap is filled or not filled , not partly * small error fix * errorfix * implementation of intuitive gap overview on all levels called 'singular_gaps() ( datset, station, sensor) + deal with edgecase of partially filled gaps by including a new option in def fillstatus _partially_successful_label = _partially_successful_label = "partially successful gapfill" currently this will cause the gap to be refilled totally (also were the gap filling was succesfull) when a new gapfilling is called. 
we can extend this later to only filling the 'gap in the gap' * version set to prerelease * Refactor gap handling: remove singular_gaps property and add gap_status_overview_df method for improved gap status reporting * make the gapfill status flagger more human readable * Refactor gap handling: replace singular_gaps property with gap_status_overview_df method for improved gap status reporting * Refactor gap handling: remove singular_gaps property and implement gap_status_overview_df method for enhanced gap reporting; update max_gap_duration_to_fill default to 12 hours * Update max_gap_duration_to_fill default to 12 hours and improve gap size validation checks * minor code style refactoring * black edits * fix identation bug * code refactoring of gaps overview in sensordata * gap_overview_dfs in seperate module * add the new method to the API docs * update and add tests * Update the example * rename function to gap_overview_df * fix bug * black edits * serialize timedelta and timestamp attr in xr conv * fix unicode error when exporting to netcdf * fix tests * by default overwrite the partially GF * update tests * black edits * black edits --------- Co-authored-by: Thomas Vergauwen <82087298+vergauwenthomas@users.noreply.github.com> Co-authored-by: Thomas Vergauwen <thomas.vergauwen@meteo.be> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
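
The gap-filling commits above describe a `max_gap_duration_to_fill` limit (default 12 hours), all-or-nothing filling ("gap is filled or not filled, not partly"), and a "partially successful gapfill" status that is overwritten by default. The following is a minimal sketch of that decision logic, not the toolkit's code. The `Gap` class, the `should_fill` function, the `overwrite_partial` argument and the `"unfilled"`/`"filled"` labels are hypothetical. The comparison direction (skip only when the duration is strictly greater than the limit) is one reading of the "rechanged >= to >" commit.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Label quoted in the commit message above.
PARTIALLY_SUCCESSFUL = "partially successful gapfill"
# Hypothetical labels, used only for this illustration.
UNFILLED = "unfilled"
FILLED = "filled"


@dataclass
class Gap:
    """Hypothetical stand-in for one gap in a sensor time series."""

    start: datetime
    end: datetime
    status: str = UNFILLED

    @property
    def duration(self) -> timedelta:
        return self.end - self.start


def should_fill(
    gap: Gap,
    max_gap_duration_to_fill: timedelta = timedelta(hours=12),
    overwrite_partial: bool = True,
) -> bool:
    """Decide whether a gap is (re)filled as a whole."""
    # Assumption: a gap is skipped only when it is strictly longer
    # than the limit, so a gap of exactly 12 hours is still filled.
    if gap.duration > max_gap_duration_to_fill:
        return False
    if gap.status == UNFILLED:
        return True
    # A partially filled gap is refilled completely, including the
    # records where the earlier fill had succeeded.
    if gap.status == PARTIALLY_SUCCESSFUL:
        return overwrite_partial
    return False
```

Under these assumptions, a 12-hour gap is filled and a 13-hour gap is left untouched; filling only the "gap in the gap" would need a finer-grained status than this sketch tracks.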

* develop version bump * fix the filtering to modeldata dataframe * use helper func for extracting modeltimeseries * test different plotting methods of modeldata on different levels * add log_entry * black edits

* functionality of pd_plot * move the sensordata pd plot to seperate module * fix warnings * implementation for modeldata pd_plot * add the pd_plot methods to the api doc * use id for sensordata label in pd_plot * add tests * prerelease version bump * black edits * code review fixes * fix docstring identation * black edits

…ent unphysical data (#598) * Initial plan * Add min_value and max_value parameters to core gap-fill functions Co-authored-by: ADRIE-A3 <109740197+ADRIE-A3@users.noreply.github.com> * Add tests for min_value and max_value gap-fill parameters Co-authored-by: ADRIE-A3 <109740197+ADRIE-A3@users.noreply.github.com> * Add documentation and examples for gap-fill value constraints Co-authored-by: ADRIE-A3 <109740197+ADRIE-A3@users.noreply.github.com> * Add implementation summary for gap-fill value constraints Co-authored-by: ADRIE-A3 <109740197+ADRIE-A3@users.noreply.github.com> * remove unwanted files * Remove min_value and max_value parameters from Gap and SensorData classes * pass minvalue maxvalue args for GF methods * Add min_value and max_value parameters to Dataset gap-filling methods * remove unwanted files * minor version bump * add test * black edits * Gee credits local pipeline (#599) * black edits * bugfix that data._credentials does not exist in ee v1.6.12. Now with a test catch condition on ee.initialize * add gee credential testing and checking for the credential file in the (local) testing pipeline --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: ADRIE-A3 <109740197+ADRIE-A3@users.noreply.github.com> Co-authored-by: Thomas Vergauwen <thomas.vergauwen@meteo.be> Co-authored-by: Thomas Vergauwen <82087298+vergauwenthomas@users.noreply.github.com>
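
The `min_value` and `max_value` parameters mentioned above constrain gap-filled values so that they stay physically plausible. A minimal sketch of one way to apply such bounds is shown here, assuming that out-of-range values are clipped to the nearest bound. The function name `clip_filled_values` is hypothetical and is not part of the toolkit's API.

```python
from typing import List, Optional, Sequence


def clip_filled_values(
    values: Sequence[float],
    min_value: Optional[float] = None,
    max_value: Optional[float] = None,
) -> List[float]:
    """Clip gap-filled values to an optional [min_value, max_value] range."""
    if min_value is not None and max_value is not None and min_value > max_value:
        raise ValueError("min_value must not be larger than max_value")

    clipped = []
    for value in values:
        # A bound of None means "no constraint on this side".
        if min_value is not None and value < min_value:
            value = min_value
        if max_value is not None and value > max_value:
            value = max_value
        clipped.append(value)
    return clipped


# Example: interpolated relative humidity (%) that overshoots 100.
filled_humidity = clip_filled_values([98.0, 101.5, 104.0], min_value=0.0, max_value=100.0)
```

With both bounds left at `None` the values pass through unchanged, which matches the idea of the constraint being opt-in.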

* add comment * open logs in default text browser * close figures after comparison to fix the failing tests * black edit * Update pyproject.toml Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Add catch_white_records function and update repetitions_check to exclude white records from outlier detection * implement white records for repetitions check on all levels. * whitelist for persistence * Update white_records parameter type to Union[pd.Index, None] in Dataset and Station classes * white records for gross value check * add white_records to step and window variation check * implement whitelist argument for buddy checks * pass white records for repetitions check * update to use the sensorwhiteset argument * update to using the sensorwhiteset argument * implement the sensorwhiteset argument * refactor: replace white_records with whiteset in QC methods and related functions * cleanup the module * add tests * black formatting * add WhiteSet to api docs * add important box * fix for timezone bug * add topic and refer in the qc example to the topic on WhiteSets * comment protector * black edits * remove default value for sensorwhiteset in persistence_check, repetitions_check, and step_check functions * refactor SensorWhiteSet initialization to use None as default for white_timestamps * remove default value for sensorwhiteset in window_variation_check function * Update tests/test_qc.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * refactor save_whitelist_records function to improve documentation and remove unused code * add logging decorator to get_info method in WhiteSet class * remove unused import of SensorWhiteSet from whitelist module * add a string formatting for in the xr attributes * black edits * refer to the topic section * sync docstrings * fix futurwarning * dev version bump --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
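
The commits above add "white records": records that a quality-control check must not flag as outliers. A minimal sketch of the idea follows, using a gross value check as the example. The function name, its arguments and the use of a plain `set` of timestamps are assumptions for illustration; the toolkit's `WhiteSet` and `SensorWhiteSet` classes are not reproduced here.

```python
from datetime import datetime
from typing import Dict, Iterable, List, Optional


def gross_value_check(
    records: Dict[datetime, float],
    lower: float,
    upper: float,
    white_timestamps: Optional[Iterable[datetime]] = None,
) -> List[datetime]:
    """Return timestamps outside [lower, upper] that are not whitelisted."""
    # None means "no white records", mirroring the None default that the
    # commits mention for white_timestamps.
    white = set(white_timestamps) if white_timestamps is not None else set()
    return [
        timestamp
        for timestamp, value in records.items()
        if not (lower <= value <= upper) and timestamp not in white
    ]
```

A whitelisted timestamp is skipped even when its value falls outside the thresholds, so a known extreme event can be kept in the data without loosening the thresholds for every other record.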

* adding a 'details' column to the outliersdf * implement the buddy with safetynets (generalisation) * black edits * check the safety net argument * exclude 'details' form comparison * fix tests * update the examples * update solutions * fix: correct formatting of log messages and documentation comments * minor version bump * Update src/metobs_toolkit/qc_collection/buddy_check.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * fix: deprecate buddy_check_with_LCZ_safety_net method in Dataset class * rename safety_nets to safetynets * black edits * drop explicit attributs in whitelist * fix test issue * fix link to whitelist topic page * drop duplicated attributes in docstring * fix bool and list attributes of QC * fix tests * black edits --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* rename all occurences of target_obstype to obstype * revert function name * rename trgobstype to obstype * rename target_obstype to obstype * rename obstype name to name * fix tests * rename arguments to more easy to read version of gee_*_dataset * implement a formatter for output files * filepath arugment consistency * add initialize_gee argument where missing and consitent use of update arguments * pass modelname and modelvariable arugments to dataset GF methods * minor version bump * test examples * simpler naming for gee_manager * filepath handling for logging to file * black edits * store and check version compatibility * fix circular import with lazy import * fix tests * fix test * black edits * running tests

Copilot

Pull request overview

Copilot reviewed 66 out of 143 changed files in this pull request and generated 5 comments.

Comments suppressed due to low confidence (3)

tests/pkled_solutions/test_qc_solutions/testwhiterecords/test_white_multi_idx_records_buddy_check_with_safety_nets/datatype.json:1

This solution directory no longer contains a datatype.json, but SolutionFixer2.get_solution() unconditionally reads datatype.json to determine how to load the stored parquet files. Removing this file will cause the corresponding test to fail when loading the expected solution. Restore datatype.json (or update the solution loader to handle missing metadata).

tests/pkled_solutions/test_qc_solutions/testwhiterecords/test_white_dt_only_records_buddy_check_with_safety_nets/datatype.json:1

This solution directory no longer contains a datatype.json, but SolutionFixer2.get_solution() unconditionally reads datatype.json to determine how to load the stored parquet files. Removing this file will cause the corresponding test to fail when loading the expected solution. Restore datatype.json (or update the solution loader to handle missing metadata).

tests/pkled_solutions/test_qc_solutions/testwhiterecords/test_white_multi_idx_records_buddy_check/datatype.json:1

This solution directory no longer contains a datatype.json, but SolutionFixer2.get_solution() unconditionally reads datatype.json to determine how to load the stored parquet files. Removing this file will cause the corresponding test to fail when loading the expected solution. Restore datatype.json (or update the solution loader to handle missing metadata).
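
The second remedy suggested in these comments, a loader that tolerates missing metadata, could look roughly like the sketch below. It is not the code of `SolutionFixer2.get_solution()`. The helper name `read_solution_metadata` is hypothetical, and the assumption that `datatype.json` holds a JSON object is not confirmed by the review.

```python
import json
from pathlib import Path
from typing import Optional, Union


def read_solution_metadata(solution_dir: Union[str, Path]) -> Optional[dict]:
    """Read datatype.json from a solution directory, if it exists.

    Returns None when the file is absent, so the caller can fall back
    to a default loading strategy or raise a descriptive error.
    """
    metadata_file = Path(solution_dir) / "datatype.json"
    if not metadata_file.is_file():
        return None
    with metadata_file.open("r", encoding="utf-8") as handle:
        return json.load(handle)
```

Returning `None` keeps the decision with the caller: a test helper might prefer to fail with a message that names the missing file rather than guess how the parquet files were stored.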

src/metobs_toolkit/qc_collection/spatial_checks/methods/pdmethods.py

src/metobs_toolkit/plot_collection/qc_info_pies.py

src/metobs_toolkit/qc_collection/spatial_checks/methods/safetynets.py

pyproject.toml

src/metobs_toolkit/plot_collection/qc_info_pies.py

vergauwenthomas and others added 30 commits September 29, 2025 14:39

Update ModelTimeSeries repr and change modeldata return type to list

3902321

Fix filter to modeldata bug (#592)

2f4b0ef

* develop version bump * fix the filtering to modeldata dataframe * use helper func for extracting modeltimeseries * test different plotting methods of modeldata on different levels * add log_entry * black edits

fix the xr engine problem when saving as netcdf

549e95b

check for duplicates in loghandlers before adding

235bb8e

Distance matric uses BallTree with haversine (for scaling)

a04fe5e

debug version bump

3a1a358

add clipping in docstring schematic description

b6b630b

Update src/metobs_toolkit/sensordata.py

878f09c

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Update tests/test_gf.py

f5aaffc

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Update tests/test_plotting.py

246fd52

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

fixes for codereview

aa578e1

Merge branch 'dev' of github.com:vergauwenthomas/MetObs_toolkit into dev

d28d81a

black edit

5610861

version bump

198d69e

trigger WF

c5595c3

remove all references to max_consec_fill

dc79e1f

code review fixes

502f741

introduce a seperate module for datetime transformations

fefa405

Merge branch 'master' into dev

cdf1b1b

fix and add log tests

b0d940b

Imported classes for typehinting are now imported via TYPE_CHECKING

43e9f1d

remove info file

671e661

add debug logging for CRS conversion failure in _validate_metadf

bc93d96

vergauwenthomas added 16 commits March 18, 2026 18:00

add distancematrix to as method to Dataset

79b40c1

update docstring

771c3b5

fix example

20680c2

fix and extend the qc example

2188d35

bugfix in the QCresult to string formatting

96e2592

fix filling example

79ad693

version bump

0393b89

fix examples

f5d0b5a

black edits

8c0553c

include details in the solution

6ed24d5

remove warning in docstring

dfc27c6

fix the raised warning

bc60169

cleanup docstring

25c3c72

cleanup docstrings

3a7b1af

cleanup docstrings

4b83132

replace pd.concat to save_concat

1662a48

vergauwenthomas requested a review from Copilot March 19, 2026 13:12

Copilot AI reviewed Mar 19, 2026

vergauwenthomas force-pushed the dev branch from 95ee076 to 0025b16 Compare March 19, 2026 13:21

vergauwenthomas added 10 commits March 19, 2026 16:31

Merge branch 'dev' into detailed_outlier_flags

711a550

Bugfix in syncronization of timeseries in buddy check

8653f74

minor code review fixes

707f51c

add sanity checks (in __main__ protectors) to testing framework (CI/C…

d808e05

…D, pytest and develop-pipeline)

convert sanity checks to series of assert checks

8bb24f5

update docstrings

2877d73

explicit add scikit-learn (for balltree method)

ed5fba2

black edits

84323d1

excluse the metobs package when pip audit in CI/CD pipeline

3745133

fix black vulnerability (CVE-2026-32274)

8da21c7

vergauwenthomas merged commit 5406252 into dev Mar 23, 2026
13 checks passed

vergauwenthomas merged 154 commits into dev from detailed_outlier_flags

4 participants
