This is a fork of PyTorch Image Models (TIMM), adding more options for training and custom model types. The original TIMM readme with installation instructions can be found here.
--last-layer: new parameter that freezes the model weights and trains only the classifier parameters
--apr-per-class: calculates accuracy, precision, and recall per class
--acc-pm1: calculates accuracy, counting predictions whose class index is within plus/minus 1 of the target as correct
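To make the two metric flags concrete, here is a minimal pure-Python sketch of what they compute. The function names and the exact tie-breaking details are assumptions for illustration, not the fork's actual implementation:

```python
from collections import defaultdict

def accuracy_pm1(preds, targets):
    """Accuracy where a prediction counts as correct if its class
    index is within +/-1 of the target index (cf. --acc-pm1)."""
    hits = sum(1 for p, t in zip(preds, targets) if abs(p - t) <= 1)
    return hits / len(targets)

def per_class_metrics(preds, targets, num_classes):
    """Per-class precision and recall (cf. --apr-per-class)."""
    tp = defaultdict(int)          # correct predictions per class
    pred_count = defaultdict(int)  # how often each class was predicted
    true_count = defaultdict(int)  # how often each class occurs
    for p, t in zip(preds, targets):
        pred_count[p] += 1
        true_count[t] += 1
        if p == t:
            tp[t] += 1
    metrics = {}
    for c in range(num_classes):
        precision = tp[c] / pred_count[c] if pred_count[c] else 0.0
        recall = tp[c] / true_count[c] if true_count[c] else 0.0
        metrics[c] = {"precision": precision, "recall": recall}
    return metrics

preds = [0, 1, 2, 2, 1]
targets = [0, 2, 2, 0, 1]
print(accuracy_pm1(preds, targets))  # → 0.8 (only the pair (2, 0) misses)
```

The ±1 tolerance is useful for ordinal label sets, where adjacent class indices represent semantically neighboring categories.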
For all datasets a create_timm_* script is provided that creates a folder structure from the annotations.
Place classification using the Places365 dataset.
In addition, a script is provided to modify checkpoints so that their weights can be reused in the modified model types.
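The core idea of such checkpoint adaptation can be sketched on plain dictionaries: keep weights whose (possibly renamed) key exists in the target model and drop the rest, so layers added by the modified model types keep their fresh initialization. The function name and key names below are hypothetical; the real script would operate on a torch checkpoint's state dict:

```python
def adapt_state_dict(src_state, dst_keys, rename=None):
    """Filter a source state dict down to the keys the target model
    expects, optionally renaming keys on the way. Weights with no
    matching key are dropped; target keys with no source weight
    keep their fresh initialization."""
    rename = rename or {}
    out = {}
    for key, value in src_state.items():
        new_key = rename.get(key, key)
        if new_key in dst_keys:
            out[new_key] = value
    return out

# Hypothetical example: the modified model adds a supercategory head.
src = {"conv_stem.weight": "w1", "classifier.weight": "w2"}
dst = {"conv_stem.weight", "classifier.weight", "supercat_head.weight"}
adapted = adapt_state_dict(src, dst)
```

In the real workflow the filtered dict would then be loaded with `load_state_dict(..., strict=False)` so the missing head weights stay randomly initialized.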
Room type classification using the NYU Depth v2 dataset.
Additionally, scripts to convert the RGB images to JPEG and to create a balanced subset are provided.
Shot type classification using the Movienet dataset.
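The create_timm_* scripts mentioned above produce the ImageFolder-style layout (split/class/image) that timm's training script reads. A minimal sketch of that step, assuming a hypothetical CSV annotation format of `filename,split,label` (the real annotation formats differ per dataset):

```python
import csv
import shutil
from pathlib import Path

def create_timm_folders(annotations_csv, image_dir, out_dir):
    """Copy images into a split/label/image folder structure from a
    CSV annotation file. Format of the CSV is an assumption for
    illustration; the real scripts parse each dataset's own format."""
    out = Path(out_dir)
    with open(annotations_csv, newline="") as f:
        for row in csv.DictReader(f):
            dest = out / row["split"] / row["label"]
            dest.mkdir(parents=True, exist_ok=True)
            shutil.copy(Path(image_dir) / row["filename"],
                        dest / row["filename"])
```

After this step, the resulting root folder can be passed directly as the data directory to timm's `train.py`.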
Variants of EfficientNet-B3 for specific training configurations on Places365 have been added:
efficientnet_b3_places365supercat: Adds a layer to predict the Places365 supercategories from a weighted sum of the probabilities of the more fine-grained classes.
efficientnet_b3_places365supercatmax: Adds a layer to predict the Places365 supercategories from the fine-grained class with the highest probability.
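The two aggregation strategies can be sketched as follows. The fine-to-supercategory mapping and uniform weights here are placeholders; the actual variants use the Places365 hierarchy, and the "supercat" variant's weights may be learned rather than uniform:

```python
def supercat_probs_sum(fine_probs, fine_to_super, num_super):
    """'supercat'-style aggregation: each supercategory's score is the
    sum of its fine-grained class probabilities (uniform weights here;
    the model variant uses a weighted sum)."""
    out = [0.0] * num_super
    for cls, p in enumerate(fine_probs):
        out[fine_to_super[cls]] += p
    return out

def supercat_pred_max(fine_probs, fine_to_super):
    """'supercatmax'-style aggregation: the supercategory of the single
    fine-grained class with the highest probability."""
    best = max(range(len(fine_probs)), key=fine_probs.__getitem__)
    return fine_to_super[best]

# Toy example: 4 fine classes mapped onto 2 supercategories.
fine_to_super = [0, 0, 1, 1]
fine_probs = [0.1, 0.4, 0.3, 0.2]
```

With these toy numbers the sum variant scores both supercategories at 0.5, while the max variant picks supercategory 0 (the argmax fine class, index 1, maps there), illustrating how the two strategies can disagree.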
The research leading to these results has been funded partially by the program ICT of the Future by the Austrian Federal Ministry of Climate Action, Environment, Energy, Mobility, Innovation and Technology (BMK) in the project TailoredMedia.


