#APN currently sends the correct VRAM and clock frequency to primenet, but uses the CPU's core+thread count and cache sizes. This doesn't really affect anything, but perhaps reporting consistent per-model values (i.e. deriving nothing from the CPU) would be cleaner.
- The "core" count can be the GPU's streaming multiprocessor/compute unit count. On OpenCL this is CL_DEVICE_MAX_COMPUTE_UNITS; my RTX 2060 has 30.
- There are several dispatch units in a GPU multiprocessor (my RTX 2060/Turing has 4 dispatchers per SM), but that number matters little to most people and isn't easy to get from any API (you would instead need a database of models). Probably just leave the thread count at 1.
- For cache, OpenCL reports a global memory cache size (CL_DEVICE_GLOBAL_MEM_CACHE_SIZE), and nvidia-smi reports an analogous L2 size; both are closer in role to a CPU's L3. What happens inside each CU is harder to know, again assuming you don't want to maintain a big database of models — the amount of local memory per CU (CL_DEVICE_LOCAL_MEM_SIZE) matters more in comparison. Some small placeholder number might be good enough there.
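A minimal sketch of the mapping the bullets above suggest, assuming a pyopencl-style `Device` object (the lowercase attribute names follow pyopencl's mapping of the CL_DEVICE_* queries). The `gpu_topology` helper name and the fixed thread/placeholder values are illustrative, not APN's actual code; the RTX 2060 numbers in the usage example are typical but should not be taken as authoritative.

```python
from types import SimpleNamespace

def gpu_topology(device):
    """Derive primenet topology fields from a GPU device instead of the CPU.

    `device` only needs pyopencl-style attributes, so any object with
    max_compute_units / global_mem_cache_size / local_mem_size works.
    """
    return {
        "cores": device.max_compute_units,          # CL_DEVICE_MAX_COMPUTE_UNITS (SM/CU count)
        "threads": 1,                               # dispatchers per SM aren't exposed; fix at 1
        "l2_cache_bytes": device.global_mem_cache_size,  # CL_DEVICE_GLOBAL_MEM_CACHE_SIZE
        "local_mem_bytes": device.local_mem_size,   # CL_DEVICE_LOCAL_MEM_SIZE, per-CU local memory
    }

# Stand-in for an RTX 2060 (values are illustrative: 30 SMs, 3 MiB L2,
# 48 KiB local memory per SM). With pyopencl you would instead pass a
# real device, e.g. cl.get_platforms()[0].get_devices()[0].
rtx2060 = SimpleNamespace(max_compute_units=30,
                          global_mem_cache_size=3 * 1024 * 1024,
                          local_mem_size=48 * 1024)
print(gpu_topology(rtx2060))
```

Duck-typing the device keeps the mapping testable without an OpenCL runtime, and makes it easy to swap in nvidia-smi-derived values where OpenCL reports nothing useful.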