[Performance] Compatibility Analysis of OCR Model's Inability to Use NNAPI/CoreML Acceleration #2

@longipinnatus

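Summary

While optimizing the OCR detection model det_mobile.onnx for mobile, I checked it with the onnxruntime.tools.check_onnx_model_mobile_usability tool. The result shows that the current model architecture cannot make effective use of NNAPI (Android) or CoreML (iOS) hardware acceleration, so for now it is only suitable for running on the CPU.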

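Analysis

According to the tool output, the main limitations are:

  1. Unsupported operators:

    • The model contains operators that NNAPI/CoreML do not support, or do not support on certain paths: ConvTranspose and HardSigmoid. The CoreML MLProgram path reports BatchNormalization and HardSigmoid instead.
    • HardSigmoid is one of the main reasons the model cannot be fully taken over by the accelerated operator libraries.
  2. Dynamic shapes:

    • Because the original model's input has a dynamic shape, NNAPI cannot run any nodes (0/280).
  3. Excessive partitioning degrades performance:

    • Even if the model is changed to fixed shapes, NNAPI can take over about 87.1% of the nodes (244/280), but the unsupported operators split the model into 31 partitions.
    • The overhead of frequently exchanging data between the accelerator hardware (NPU/GPU) and the CPU would cancel out the speedup, so final performance may be worse than running on the CPU alone.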
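
The partitioning figures can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: the partition sizes and node total are copied from the NNAPI fixed-shape section of the log, while the helper name `partition_stats` is hypothetical and not part of ONNX Runtime.

```python
from statistics import mean, median

# Partition sizes copied from the NNAPI "fixed shapes" section of the log.
NNAPI_PARTITION_SIZES = [4, 6, 9, 6, 6, 15, 8, 6, 15, 8, 6, 6, 6, 6, 6, 6,
                         6, 16, 6, 6, 7, 4, 6, 6, 6, 6, 9, 28, 14, 3, 2]
TOTAL_NODES = 280  # total node count reported by the tool


def partition_stats(sizes, total_nodes):
    """Summarize how an execution provider would split the graph (hypothetical helper)."""
    supported = sum(sizes)
    return {
        "partitions": len(sizes),
        "supported_nodes": supported,
        "cpu_nodes": total_nodes - supported,
        "coverage_pct": round(100 * supported / total_nodes, 1),
        "mean_size": round(mean(sizes), 2),
        "median_size": median(sizes),
    }


stats = partition_stats(NNAPI_PARTITION_SIZES, TOTAL_NODES)
print(stats)
```

The 36 nodes left on the CPU match the "Unsupported nodes due to operator=36" line, and the median partition holds only 6 nodes, which is why the hand-offs between accelerator and CPU are expected to dominate.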

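Conclusion

  • Use NNAPI/CoreML: No. For this model, the CPU Execution Provider should be preferred on mobile.
  • Note: Issues in other GitHub repos describe cases of accelerating PP-OCRv5 with NNAPI, which is worth investigating further.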
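
As a rough illustration of the conclusion, a session can be pinned to the CPU Execution Provider by passing an explicit provider list. This is a minimal sketch using the ONNX Runtime Python API on a desktop machine; a mobile app would use the Java/Kotlin or Swift/Objective-C bindings instead. The function name `make_cpu_session` and the default model path are assumptions for illustration.

```python
# Only the CPU EP is requested, so no NNAPI/CoreML partitioning takes place.
CPU_PROVIDERS = ["CPUExecutionProvider"]


def make_cpu_session(model_path="det_mobile.onnx"):
    """Create an inference session restricted to the CPU EP (hypothetical helper)."""
    # Imported lazily so this module can be loaded without onnxruntime installed.
    import onnxruntime as ort

    return ort.InferenceSession(model_path, providers=CPU_PROVIDERS)
```

Calling `make_cpu_session().get_providers()` should then report only the CPU provider, which is a quick way to confirm that no accelerator EP was picked up implicitly.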

Logs

python -m onnxruntime.tools.check_onnx_model_mobile_usability .\det_mobile.onnx

INFO:  Checking det_mobile.onnx for usability with ORT Mobile.
INFO:  Checking NNAPI
INFO:  0 partitions with a total of 0/280 nodes can be handled by the NNAPI EP.
INFO:  Unsupported nodes due to operator=36
INFO:   Unsupported ops: ai.onnx:ConvTranspose,ai.onnx:HardSigmoid
INFO:   Caveats that have not been checked and may result in a node not actually being supported:
     ai.onnx:Conv:Only 2D Conv is supported. Weights and bias should be constant.
     ai.onnx:GlobalAveragePool:Only 2D Pool is supported.
     ai.onnx:Resize:Only 2D Resize is supported.
INFO:  Unsupported nodes due to input having a dynamic shape=280
INFO:  NNAPI cannot run any nodes in this model.
INFO:  Model should perform well with NNAPI as is: NO
INFO:  --------
INFO:  Checking if model will perform better if the dynamic shapes are fixed...
INFO:  Partition information if the model was updated to make the shapes fixed:
INFO:  31 partitions with a total of 244/280 nodes can be handled by the NNAPI EP.
INFO:   Partition sizes: [4, 6, 9, 6, 6, 15, 8, 6, 15, 8, 6, 6, 6, 6, 6, 6, 6, 16, 6, 6, 7, 4, 6, 6, 6, 6, 9, 28, 14, 3, 2]
INFO:  Unsupported nodes due to operator=36
INFO:   Unsupported ops: ai.onnx:ConvTranspose,ai.onnx:HardSigmoid
INFO:   Caveats that have not been checked and may result in a node not actually being supported:
     ai.onnx:Conv:Only 2D Conv is supported. Weights and bias should be constant.
     ai.onnx:GlobalAveragePool:Only 2D Pool is supported.
     ai.onnx:Resize:Only 2D Resize is supported.
INFO:  NNAPI is not recommended with this model as there are 31 partitions covering 87.1% of the nodes in the model. This will most likely result in worse performance than just using the CPU EP.
INFO:  Model should perform well with NNAPI if modified to have fixed input shapes: NO
INFO:  ================
INFO:
INFO:  Checking CoreML NeuralNetwork
INFO:  0 partitions with a total of 0/280 nodes can be handled by the CoreML NeuralNetwork EP.
INFO:  Unsupported nodes due to operator=36
INFO:   Unsupported ops: ai.onnx:ConvTranspose,ai.onnx:HardSigmoid
INFO:   Caveats that have not been checked and may result in a node not actually being supported:
     ai.onnx:Conv:Only 1D/2D Conv is supported. Weights and bias should be constant.
     ai.onnx:GlobalAveragePool:Only 2D Pool is supported.
     ai.onnx:Resize:4D input. `coordinate_transformation_mode` == `asymmetric`. `mode` == `linear` or `nearest`. `nearest_mode` == `floor`. `exclude_outside` == false `scales` or `sizes` must be constant.
INFO:  Unsupported nodes due to input having a dynamic shape=280
INFO:  CoreML NeuralNetwork cannot run any nodes in this model.
INFO:  Model should perform well with CoreML NeuralNetwork as is: NO
INFO:  --------
INFO:  Checking if model will perform better if the dynamic shapes are fixed...
INFO:  Partition information if the model was updated to make the shapes fixed:
INFO:  31 partitions with a total of 244/280 nodes can be handled by the CoreML NeuralNetwork EP.
INFO:   Partition sizes: [4, 6, 9, 6, 6, 15, 8, 6, 15, 8, 6, 6, 6, 6, 6, 6, 6, 16, 6, 6, 7, 4, 6, 6, 6, 6, 9, 28, 14, 3, 2]
INFO:  Unsupported nodes due to operator=36
INFO:   Unsupported ops: ai.onnx:ConvTranspose,ai.onnx:HardSigmoid
INFO:   Caveats that have not been checked and may result in a node not actually being supported:
     ai.onnx:Conv:Only 1D/2D Conv is supported. Weights and bias should be constant.
     ai.onnx:GlobalAveragePool:Only 2D Pool is supported.
     ai.onnx:Resize:4D input. `coordinate_transformation_mode` == `asymmetric`. `mode` == `linear` or `nearest`. `nearest_mode` == `floor`. `exclude_outside` == false `scales` or `sizes` must be constant.
INFO:  CoreML NeuralNetwork is not recommended with this model as there are 31 partitions covering 87.1% of the nodes in the model. This will most likely result in worse performance than just using the CPU EP.
INFO:  Model should perform well with CoreML NeuralNetwork if modified to have fixed input shapes: NO
INFO:  ================
INFO:
INFO:  Checking CoreML MLProgram
INFO:  0 partitions with a total of 0/280 nodes can be handled by the CoreML MLProgram EP.
INFO:  Unsupported nodes due to operator=35
INFO:   Unsupported ops: ai.onnx:BatchNormalization,ai.onnx:HardSigmoid
INFO:   Caveats that have not been checked and may result in a node not actually being supported:
     ai.onnx:Conv:Only 1D/2D Conv is supported. Bias if provided must be constant.
     ai.onnx:ConvTranspose:Weight and bias must be constant. padding_type of SAME_UPPER/SAME_LOWER is not supported. kernel_shape must have default values. output_shape is not supported. output_padding must have default values.
     ai.onnx:GlobalAveragePool:Only 2D Pool is supported currently. 3D and 5D support can be added if needed.
     ai.onnx:Resize:See [resize_op_builder.cc](https://github.com/microsoft/onnxruntime/blob/main/onnxruntime/core/providers/coreml/builders/impl/resize_op_builder.cc) implementation. There are too many permutations to describe the valid combinations.
INFO:  Unsupported nodes due to input having a dynamic shape=280
INFO:  CoreML MLProgram cannot run any nodes in this model.
INFO:  Model should perform well with CoreML MLProgram as is: NO
INFO:  --------
INFO:  Checking if model will perform better if the dynamic shapes are fixed...
INFO:  Partition information if the model was updated to make the shapes fixed:
INFO:  30 partitions with a total of 245/280 nodes can be handled by the CoreML MLProgram EP.
INFO:   Partition sizes: [4, 6, 9, 6, 6, 15, 8, 6, 15, 8, 6, 6, 6, 6, 6, 6, 6, 16, 6, 6, 7, 4, 6, 6, 6, 6, 9, 28, 16, 4]
INFO:  Unsupported nodes due to operator=35
INFO:   Unsupported ops: ai.onnx:BatchNormalization,ai.onnx:HardSigmoid
INFO:   Caveats that have not been checked and may result in a node not actually being supported:
     ai.onnx:Conv:Only 1D/2D Conv is supported. Bias if provided must be constant.
     ai.onnx:ConvTranspose:Weight and bias must be constant. padding_type of SAME_UPPER/SAME_LOWER is not supported. kernel_shape must have default values. output_shape is not supported. output_padding must have default values.
     ai.onnx:GlobalAveragePool:Only 2D Pool is supported currently. 3D and 5D support can be added if needed.
     ai.onnx:Resize:See [resize_op_builder.cc](https://github.com/microsoft/onnxruntime/blob/main/onnxruntime/core/providers/coreml/builders/impl/resize_op_builder.cc) implementation. There are too many permutations to describe the valid combinations.
INFO:  CoreML MLProgram is not recommended with this model as there are 30 partitions covering 87.5% of the nodes in the model. This will most likely result in worse performance than just using the CPU EP.
INFO:  Model should perform well with CoreML MLProgram if modified to have fixed input shapes: NO
INFO:  ================
INFO:
INFO:  For optimal performance the model should be used with the CPU EP.
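
To compare the three execution providers side by side, the summary lines of the log can be pulled out with a regular expression. The sketch below assumes the line wording shown in the log above; `parse_summaries` and `SUMMARY_RE` are hypothetical names, and the sample text holds only the three fixed-shape summary lines.

```python
import re

# Matches lines such as:
# "31 partitions with a total of 244/280 nodes can be handled by the NNAPI EP."
SUMMARY_RE = re.compile(
    r"(\d+) partitions with a total of (\d+)/(\d+) nodes "
    r"can be handled by the (.+?) EP\."
)


def parse_summaries(log_text):
    """Extract per-EP partition summaries from the tool output (hypothetical helper)."""
    rows = []
    for match in SUMMARY_RE.finditer(log_text):
        partitions, supported, total = (int(match.group(i)) for i in (1, 2, 3))
        rows.append({
            "ep": match.group(4),
            "partitions": partitions,
            "supported": supported,
            "total": total,
            "coverage_pct": round(100 * supported / total, 1),
        })
    return rows


# Fixed-shape summary lines copied from the log.
SAMPLE = """\
INFO:  31 partitions with a total of 244/280 nodes can be handled by the NNAPI EP.
INFO:  31 partitions with a total of 244/280 nodes can be handled by the CoreML NeuralNetwork EP.
INFO:  30 partitions with a total of 245/280 nodes can be handled by the CoreML MLProgram EP.
"""

rows = parse_summaries(SAMPLE)
for row in rows:
    print(row)
```

All three providers land at roughly 87% coverage spread over about 30 partitions, consistent with the tool's final recommendation to use the CPU EP.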
