SimpleTuner WebUI Tutorial

Introduction

This tutorial will help you get started with the SimpleTuner Web interface.

Installing requirements

For Ubuntu systems, start by installing the required packages:

apt -y install python3.12-venv python3.12-dev
apt -y install libopenmpi-dev openmpi-bin cuda-toolkit-12-8 libaio-dev # if you're using DeepSpeed

Creating a workspace directory

A workspace contains your configurations, output models, validation images, and potentially your datasets.

export SIMPLETUNER_WORKSPACE=~/simpletuner-workspace
mkdir -p "$SIMPLETUNER_WORKSPACE"
cd "$SIMPLETUNER_WORKSPACE"

Installing SimpleTuner into your workspace

Create a virtual environment to install dependencies to:

python3.12 -m venv .venv
. .venv/bin/activate

CUDA-specific dependencies

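NVIDIA users will have to use the CUDA extras to pull in all the right dependencies. The command below is an editable install (-e .), so it expects to be run from the root of a SimpleTuner source checkout: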

pip install -e '.[cuda]'

Other extras are available for Apple and ROCm hardware; see the installation instructions.

Starting the server

To start the server with SSL on port 8080:

# for DeepSpeed, we'll need CUDA_HOME pointing to the correct location
export CUDA_HOME=/usr/local/cuda-12.8
export LIBRARY_PATH=$CUDA_HOME/targets/x86_64-linux/lib/stubs:$LIBRARY_PATH
export LD_LIBRARY_PATH=$CUDA_HOME/lib64:$CUDA_HOME/targets/x86_64-linux/lib/stubs:$LD_LIBRARY_PATH

simpletuner server --ssl --port 8080

Now, visit https://localhost:8080 in your web browser. If the server runs on a remote machine, you may need to forward the port over SSH, for example:

ssh -L 8080:localhost:8080 user@remote-server
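If you rely on DeepSpeed, it can help to confirm that CUDA_HOME points at a real toolkit before launching the server. The following is a minimal sketch, assuming the toolkit ships its compiler at bin/nvcc under CUDA_HOME; check_cuda_home is a helper name made up for this example and is not part of SimpleTuner.

```shell
#!/usr/bin/env bash
# Sanity-check a CUDA_HOME candidate before starting the server.
# check_cuda_home is a hypothetical helper for this sketch only.
check_cuda_home() {
  local cuda_home="$1"
  if [ -x "$cuda_home/bin/nvcc" ]; then
    echo "ok: found $cuda_home/bin/nvcc"
  else
    echo "missing: $cuda_home/bin/nvcc" >&2
    return 1
  fi
}

# Falls back to the path used earlier in this tutorial when CUDA_HOME is unset.
check_cuda_home "${CUDA_HOME:-/usr/local/cuda-12.8}" \
  || echo "CUDA_HOME looks wrong; DeepSpeed may be unable to compile its extensions"
```

A passing check only shows that a compiler binary is present; it does not verify that the toolkit version matches your installed PyTorch build.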

Using the WebUI

Onboarding steps

Once you have the page loaded, you'll be asked onboarding questions to set up your environment.

Configuration directory

The special configuration value configs_dir points to a folder that contains all of your SimpleTuner configurations. Sorting them into one subdirectory per environment is recommended, and the WebUI will do this for you:

configs/
├── an-environment-named-something
│   ├── config.json
│   ├── lycoris_config.json
│   └── multidatabackend-DataBackend-Name.json
image

Migrating from command-line usage

If you previously used SimpleTuner without the WebUI, you can point to your existing config/ folder and all of your environments will be auto-discovered.
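To preview which folders might be picked up as environments, here is a rough sketch. It assumes an environment is any direct subdirectory that contains a config.json, as in the layout shown earlier; the WebUI's actual discovery logic may differ. CONFIGS_DIR and list_environments are names invented for this example.

```shell
#!/usr/bin/env bash
# List subdirectories of a configs folder that contain a config.json.
# CONFIGS_DIR is a hypothetical variable for this sketch, not a SimpleTuner setting.
list_environments() {
  local configs_dir="$1"
  local cfg
  for cfg in "$configs_dir"/*/config.json; do
    # Skip the unexpanded glob when nothing matches.
    [ -f "$cfg" ] || continue
    basename "$(dirname "$cfg")"
  done
}

list_environments "${CONFIGS_DIR:-config}"
```

Run it from the directory that holds your existing config/ folder, or set CONFIGS_DIR to an absolute path first.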

For new users, the default location of your configs and datasets will be ~/.simpletuner/ and it's recommended to move your datasets somewhere with more space:

image

(Multi-)GPU selection and configuration

After configuring the default paths, you'll reach a step where multi-GPU training can be configured (pictured on a MacBook):

image

If you've got multiple GPUs and would like to just use the second one, this is where you can do that.

Creating your first training environment

If no pre-existing configurations were found in your configs_dir, you'll be asked to create your first training environment:

image

Use Bootstrap From Example to select an example config to start from, or simply enter a descriptive name and create a random environment if you prefer to use a setup wizard instead.

Switching between training environments

If you had any pre-existing configuration environments, they will show up in this drop-down menu.

Otherwise, the environment you just created during onboarding will already be selected and active.

image

Use Manage Configs to get to the Environment tab, where a list of your environments, dataloaders, and other configurations can be found.

Configuration wizard

I've worked hard to provide a comprehensive setup wizard that walks you through the most important settings: a no-nonsense bootstrap to get you started.

image

In the upper left navigation menu, the Wizard button will bring you to a selection dialogue:

image

All built-in model variants are then offered. Each variant will pre-enable required settings like Attention Masking or extended token limits.

LoRA model options

If you wish to train a LoRA, you'll be able to set the model quantisation options here.

In general, unless you're training a Stable Diffusion type model, int8-quanto is recommended as it won't harm quality, and allows higher batch sizes.

Some small models, such as Cosmos2, Sana, and PixArt, really do not like being quantised.

image

Full-rank training

Full-rank training is discouraged, as it generally takes a lot longer and costs more in resources than a LoRA/LyCORIS, for the same dataset.

However, if you do wish to train a full checkpoint, you can configure DeepSpeed ZeRO stages here, which will be required for models the size of Auraflow, Flux, and larger.

FSDP2 is supported, but not configurable in this wizard. Simply leave DeepSpeed disabled and manually configure FSDP2 later if you wish to use it.

image

How long do you want to train for?

You'll have to decide whether you wish to measure training time in epochs or steps. The two are roughly equivalent in the end, though some people develop a preference one way or the other.
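To get a feel for how the two units relate, here is a back-of-the-envelope sketch. It assumes one optimiser step consumes batch size × GPU count × gradient accumulation steps samples, and that a partial final batch still counts as a step; it ignores any dataset repeats or bucketing the trainer may apply. The function and its parameter names are illustrative, not SimpleTuner options.

```shell
#!/usr/bin/env bash
# Estimate the number of optimiser steps for a given number of epochs.
# All names here are illustrative, not SimpleTuner options.
steps_for_epochs() {
  local samples="$1" batch_size="$2" num_gpus="$3" grad_accum="$4" epochs="$5"
  local effective_batch=$(( batch_size * num_gpus * grad_accum ))
  # Ceiling division: assume a partial final batch still costs one step.
  local steps_per_epoch=$(( (samples + effective_batch - 1) / effective_batch ))
  echo $(( steps_per_epoch * epochs ))
}

# 1000 images, batch size 4, 1 GPU, no gradient accumulation, 10 epochs
steps_for_epochs 1000 4 1 1 10   # prints 2500
```

Under these assumptions, 10 epochs over 1000 images at batch size 4 is the same amount of training as 2500 steps, so either unit can describe the same run.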

image

Sharing your model via Hugging Face Hub

Optionally, you can publish your final and intermediate checkpoints to Hugging Face Hub, which requires an account. You can log in to the Hub via the wizard or the Publishing tab. Either way, you can always change your mind and enable or disable publishing later.

If you do choose to publish your model, remember to select Private repo if you don't want your model to be accessible to the broader public.

image

Model validations

If you want the trainer to generate images periodically, you can configure a single validation prompt at this point of the wizard. A library of multiple prompts can be configured inside the Validations & Output tab after the wizard is complete.

image

Logging training statistics

SimpleTuner has support for multiple target APIs if you wish to send your training statistics to one.

Note: None of your personal data, training logs, captions, or other data are ever sent to SimpleTuner project developers. Control of your data is in your hands.

image

Dataset Configuration

At this point, you can decide whether to keep any existing dataset, or create a new configuration (leaving any others untouched) through the Dataset Creation Wizard, which will appear upon clicking.

image

Dataset Wizard

If you elected to create a new dataset, you'll see the following wizard, which will walk you through adding a local or cloud dataset.

image image

For a local dataset, you'll be able to use the Browse directories button to access a dataset browser modal.

image

If you pointed the datasets directory correctly during onboarding, your datasets will be listed here.

Click the directory you wish to add, and then Select Directory.

image

After this, you'll be guided through configuring resolution values and cropping.

NOTE: SimpleTuner doesn't upscale images, so ensure they are at least as large as your configured resolution.
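One way to spot undersized images before training is sketched below. It assumes ImageMagick's identify command is installed and that the configured resolution is a pixel value measured on the shorter edge; if you configure resolution differently (for example by area), adjust the comparison. The helper names and the example path are placeholders, and only the top level of the directory is scanned.

```shell
#!/usr/bin/env bash
# Flag images whose shorter edge is below a target pixel resolution.
# Assumes ImageMagick's `identify` is available; helper names are made up for this sketch.
shorter_edge() {
  local w="$1" h="$2"
  if [ "$w" -lt "$h" ]; then echo "$w"; else echo "$h"; fi
}

find_small_images() {
  local dir="$1" min_edge="$2" f dims w h
  command -v identify >/dev/null 2>&1 || { echo "identify not found" >&2; return 1; }
  for f in "$dir"/*; do
    [ -f "$f" ] || continue
    # First frame only; non-image files produce no output and are skipped.
    dims=$(identify -format '%w %h\n' "$f" 2>/dev/null | head -n 1)
    [ -n "$dims" ] || continue
    w=${dims% *}
    h=${dims#* }
    if [ "$(shorter_edge "$w" "$h")" -lt "$min_edge" ]; then
      echo "$f (${w}x${h})"
    fi
  done
}

# Example (placeholder path): find_small_images ~/datasets/my-dataset 1024
```

Any file it prints is smaller than the target on at least one edge and is worth replacing or removing before you start training.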

When you reach the step to configure your captions, carefully consider which option is correct.

If you just want to use a single trigger word, choose the Instance Prompt option.

image

Learning rate, batch size & optimiser

Once you complete the dataset wizard (or if you elected to keep your existing datasets), you'll be offered presets for optimiser/learning rate and batch size.

These are just starting points that help newcomers make better choices for their first few training runs. Experienced users can choose Manual configuration for complete control.

NOTE: If you plan on using DeepSpeed later, the optimiser choice doesn't matter much here.

image

Review & save

If you're happy with all of your selected values, go ahead and Finish the wizard.

You'll then see your new environment actively selected and ready for training!

In most cases, these settings are all you will need to configure. You may still want to add extra datasets or fiddle with other settings.

image

On the Environment page, you'll see the newly-configured training job, and buttons to download or duplicate the configuration if you wish to use it as a template.

image

NOTE: The Default environment is special, and not recommended for use as a general training environment. Its settings can be automatically merged into any environment that enables the Use environment defaults option:

image