Training Dataset Check by MichaelTj02 · Pull Request #13 · Metacreation-Lab/autolume

MichaelTj02 · 2026-03-16T23:20:57Z

Perform check on dataset prior to training (resolution, consistency, format, colour channels)
Add lazy loading to the thumbnails section of the preprocessing module, improving performance on larger file imports

ucodia

I found multiple bugs in this PR.

I would suggest splitting the dataset checks and the thumbnail lazy loading into separate PRs for easier review and fixing.

ucodia · 2026-03-27T04:44:25Z

-                _, new_data_path = imgui_utils.input_text("##preprocessing_data_path", self.preprocessing_data_path, 1024, 0, 
+                _, new_data_path = imgui_utils.input_text("##preprocessing_data_path", str(self.preprocessing_data_path), 1024, 0, 
                width=imgui.get_window_width() - self.menu.app.button_w - imgui.calc_text_size("Browse")[0])
                if new_data_path != self.preprocessing_data_path:


Bug: We are using != to compare a str with a Path, so this check will always be True. I advise you to wrap Path objects in str when comparing against values returned by input_text, like you did on line 279.

We need to review all types to make sure we are not duplicating this issue elsewhere and that we are consistent in the types we use.
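
To illustrate the comparison issue, here is a minimal standalone sketch. The helper name `text_differs` and the sample values are hypothetical and not part of the PR; it only shows that a `str` never compares equal to a `Path`, and that normalising the stored value to `str` restores a meaningful check.

```python
from pathlib import Path
from typing import Union


def text_differs(new_text: str, current: Union[str, Path]) -> bool:
    """Return True only when the typed text differs from the stored path.

    Hypothetical helper: the stored value is converted to str first, so the
    comparison is always str against str.
    """
    return new_text != str(current)


stored = Path("faces")  # hypothetical value held by the widget
typed = "faces"         # what a text input would hand back

# A str and a Path never compare equal, so the naive check is always True.
naive_result = typed != stored

# Comparing str against str only reports a change when the text differs.
fixed_result = text_differs(typed, stored)
```

Converting once at the boundary (where the text input returns its value) would keep the rest of the module free to use a single type.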

ucodia · 2026-03-27T05:27:46Z

        try:
            if Path(self.icon_path).exists():
-                help_img = cv2.imread(Path(self.icon_path).as_posix(), cv2.IMREAD_UNCHANGED)
+                help_img = cv2.imread(Path(self.icon_path), cv2.IMREAD_UNCHANGED)


Potential bug: Does imread really take a Path object? str seemed safer.

ucodia · 2026-03-27T05:32:48Z

                    self.save_path = directory_path
                else:
-                    print("No save path selected")
+                    self.save_path = self.save_path


Was that a mistake? Assigning self.save_path to itself is a no-op.

ucodia · 2026-03-27T05:34:16Z

+                target_dataset_path = Path(self.data_path)
+                if target_dataset_path.is_dir():
+                    image_files = [f for f in target_dataset_path.iterdir()
+                                if f.is_file()]


Bug: We used to have a .png filter, so we never tried to open non-image files. If a .txt file exists in the dataset folder, I think PIL is going to crash. Any reason for removing that filter?
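
One way the filter could be restored is sketched below. The function name `list_image_files` and the `IMAGE_SUFFIXES` set are assumptions for illustration; the original code only filtered on .png, and the extensions the project actually wants to accept are not stated in this PR.

```python
from pathlib import Path

# Assumed set of accepted extensions; the original filter only allowed .png.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def list_image_files(dataset_dir):
    """Return the files in dataset_dir whose suffix looks like an image.

    Hypothetical helper: non-image files such as notes.txt are skipped, so
    they are never handed to the image loader.
    """
    root = Path(dataset_dir)
    if not root.is_dir():
        return []
    return sorted(
        f for f in root.iterdir()
        if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
    )
```

A suffix check does not prove a file is a valid image, so the loader may still need to handle corrupt files, but it keeps obviously unrelated files out of the loop.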

ucodia · 2026-03-27T05:35:49Z

@@ -1,9 +1,11 @@
 from pathlib import Path
 import zipfile
+import io


Nit: io is imported but never used.

ucodia · 2026-03-27T05:37:36Z

-        for file_path in file_paths:
-            self.get_thumbnail(file_path)
+
+        self.file_index_map = {fp: idx for idx, fp in enumerate(file_paths)}


We create file_index_map in this function, then clear it in clear_thumbnails but never seem to use it anywhere. Either we use it or we remove it.

ucodia · 2026-03-27T05:51:02Z

+            raise ValueError(
+                "Invalid dataset:\n"
+                f"- Image '{filename}' is not square ({width}x{height}).\n"
+                "- StyleGAN3 training only accepts square images."


Just mention StyleGAN; the version is irrelevant to the error.

ucodia · 2026-03-27T05:52:05Z

+        if height != expected_height or width != expected_width:
+            raise ValueError(
+                "Invalid dataset:\n"
+                f"- Image '{filename}' has resolution {width}x{height}, expected {expected_width}x{expected_height}.\n"


I would suggest the following error format instead for clarity:

Inconsistent dataset: 'foo.png' is 512x512 but the detected dataset resolution is 256x256. Ensure all images have been preprocessed to the same size.
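
A minimal sketch of a check that raises this message, assuming a hypothetical helper named `check_resolution` that receives the same values used in the diff above (`filename`, `width`, `height`, `expected_width`, `expected_height`):

```python
def check_resolution(filename, width, height, expected_width, expected_height):
    """Raise ValueError when an image does not match the dataset resolution.

    Hypothetical helper; the parameter names mirror the variables in the diff.
    """
    if (width, height) != (expected_width, expected_height):
        raise ValueError(
            f"Inconsistent dataset: '{filename}' is {width}x{height} but the "
            f"detected dataset resolution is {expected_width}x{expected_height}. "
            "Ensure all images have been preprocessed to the same size."
        )
```

Keeping the whole message on one line makes it easier to show in the training popup without extra formatting.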

ucodia · 2026-04-16T22:27:45Z

Closing in favor of #15, #16, and #17.

MichaelTj02 added 16 commits February 17, 2026 11:30

Fix as_posix and change it to str for imgui compatibility

02a4197

Add dataset check logic after hitting training button

32a5c21

Dataset check

f37e127

Add error message in training popup if dataset is incorrect

b58c656

Do dataset check in dataset.py script to avoid double loop

42e58c3

Add image resolution power of 2 check, and separate colour channel and resolution consistency check

d8ca69b

Add lazy loading in preprocessing module. Only on placeholder thumbnails

ab1f405

Change dataset detection error message

56bc949

Add lazy loading on thumbnail rendering

4511a9a

Fix indexing bug for thumbnail rendering

225a53d

fix

6c3e6c2

Fix thumbnail render lazy loading

7a14cc2

Merge remote-tracking branch 'origin/main' into dataset_check

59e1b42

merge new training_module origin

cb4a5be

remove comments

c962fe5

fix: Path to string usage causing crash if a duplicate dataset output path is found

3510b1e

ucodia requested changes Mar 27, 2026

ucodia closed this Apr 16, 2026

MichaelTj02 wants to merge 16 commits into main from dataset_check
