This repository demonstrates using ExecuTorch to convert a PyTorch model to the `.pte` file format and run it on the Himax WE2 chip.
- The package has been tested in Ubuntu 20.04 LTS environment.
- Create a Python 3.10 virtual environment
  ```shell
  # install python3.10-dev first
  sudo apt install python3.10-dev
  git clone --recursive https://github.com/HimaxWiseEyePlus/ExecuTorch_Convert_Example
  cd ExecuTorch_Convert_Example
  # create python 3.10 venv
  python3.10 -m venv --without-pip executorch_env_py_3_10_dev
  wget https://bootstrap.pypa.io/get-pip.py
  source './executorch_env_py_3_10_dev/bin/activate'
  which python
  python --version
  which pip
  python get-pip.py
  ```
- Install the required packages and download the Himax Vela config ini file
  ```shell
  pip install -r requirements.txt
  pip install --no-deps git+https://git.gitlab.arm.com/tosa/tosa-reference-model.git@v2025.07.1
  # download himax vela config ini
  wget https://raw.githubusercontent.com/HimaxWiseEyePlus/ML_FVP_EVALUATION/main/vela/himax_vela.ini
  ```
- We use the MobileNetV2 model pretrained on CIFAR-10 from huyvnphan/PyTorch_CIFAR10 for the examples.
- There are four examples:
  - PTQ (Post-Training Quantization) example; it generates `mbv2_cifar10_ptq_quant_himax_ini_vela_4_5_0.pte`:
    ```shell
    python ptq_example.py
    ```
  - QAT (Quantization-Aware Training) example; it generates `mbv2_cifar10_qat_himax_ini_vela_4_5_0.pte`:
    ```shell
    python qat_example.py
    ```
  - Unstructured pruning + PTQ example; it generates `mbv2_cifar10_global_unstruct_prune_ptq_himax_ini_vela_4_5_0.pte`:
    ```shell
    python unstruct_prune_ptq.py
    ```
  - Structured pruning + PTQ example; it generates `mbv2_cifar10_structured_prune_ptq_himax_ini_vela_4_5_0.pte`:
    ```shell
    python struct_prune_ptq.py
    ```
The conversion flow uses PyTorch 2.0 Export (`torch.export`) and ExecuTorch's Arm backend to compile the models for the Himax WE2 NPU (Arm Ethos-U55). The core steps, as demonstrated in the Python scripts, are:
1. Model Preparation & Quantization Setup
   - The PyTorch model is placed in evaluation mode.
   - We create an `EthosUCompileSpec` specifying the target unit (`ethos-u55-64`) and referencing the configuration file `himax_vela.ini`. You can also pass extra Vela options to the `EthosUCompileSpec`, such as `--optimise Size`, to reduce the model's SRAM usage.
   - The `EthosUQuantizer` is initialized to annotate the model with the NPU's quantization constraints.
2. Calibration (PTQ) or Training (QAT)
   - For Post-Training Quantization (PTQ), the model undergoes calibration via `prepare_pt2e`: a representative dataset is passed through the model so it can record activation ranges.
   - For Quantization-Aware Training (QAT), the model is prepared using `prepare_qat_pt2e` and then fine-tuned through normal training loops.
3. Quantization Conversion
   - The `convert_pt2e` function converts the recorded activation ranges or fake-quant observers into actual quantized integer operations.
4. Lowering to ExecuTorch Edge Dialect
   - The model is exported via `torch.export.export`.
   - Using `EthosUPartitioner`, the graph is partitioned and delegated to the Arm Ethos-U backend via `to_edge_transform_and_lower`.
   - Passes like `QuantizeInputs` and `QuantizeOutputs` are applied to enforce integer inputs and outputs.
5. Saving the `.pte` File
   - Finally, the edge dialect is compiled into an ExecuTorch program with `edge.to_executorch()`.
   - It is saved to disk using `save_pte_program`.
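The PTQ variant of the steps above can be sketched end-to-end as follows. This is a minimal sketch, not the repository's exact code: the import paths and constructor arguments (e.g. the `EthosUCompileSpec` parameters, `get_symmetric_quantization_config`, and the use of `.module()` after export) are assumptions inferred from the API names mentioned in this README; consult `ptq_example.py` in this repository for the actual calls.

```python
def convert_mbv2_to_pte(model, calibration_loader, output_name="mbv2_cifar10_ptq"):
    # Imports are kept inside the function; paths below are assumptions
    # based on the ExecuTorch Arm backend API names used in this README.
    import torch
    from executorch.backends.arm.ethosu import EthosUCompileSpec, EthosUPartitioner
    from executorch.backends.arm.quantizer import (
        EthosUQuantizer,
        get_symmetric_quantization_config,
    )
    from executorch.exir import to_edge_transform_and_lower
    from executorch.extension.export_util.utils import save_pte_program
    from torch.ao.quantization.quantize_pt2e import convert_pt2e, prepare_pt2e

    # 1. Model preparation & quantization setup
    model.eval()
    example_input = (torch.randn(1, 3, 32, 32),)  # CIFAR-10 input shape
    compile_spec = EthosUCompileSpec(
        "ethos-u55-64",                       # target unit on the WE2
        system_config="Ethos_U55_High_End_Embedded",   # assumed values
        memory_mode="Shared_Sram",
        extra_flags=["--config", "himax_vela.ini", "--optimise", "Size"],
    )
    quantizer = EthosUQuantizer(compile_spec)
    quantizer.set_global(get_symmetric_quantization_config())

    # 2. PTQ calibration: record activation ranges on representative data
    graph_module = torch.export.export(model, example_input).module()
    prepared = prepare_pt2e(graph_module, quantizer)
    for images, _ in calibration_loader:
        prepared(images)

    # 3. Convert recorded ranges into actual int8 operations
    quantized = convert_pt2e(prepared)

    # 4. Lower: partition and delegate the graph to the Ethos-U backend
    edge = to_edge_transform_and_lower(
        torch.export.export(quantized, example_input),
        partitioner=[EthosUPartitioner(compile_spec)],
    )

    # 5. Compile to an ExecuTorch program and save the .pte file
    save_pte_program(edge.to_executorch(), output_name)
```

For the QAT variant, `prepare_pt2e` is replaced by `prepare_qat_pt2e` and the calibration loop by a normal fine-tuning loop before `convert_pt2e`.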
- You can refer to torch_mb_cls to try running the models on WE2.