yiqisoft/Face-Recognition-Ensemble-Service-with-Triton-Inference-Server

Face Recognition Ensemble Service (UltraFace + ArcFace)

This repository details a high-performance face recognition pipeline deployed on the NVIDIA Triton Inference Server using an ensemble strategy.

1. Description

This service implements a complete face recognition workflow by chaining three distinct models:

  1. UltraFace: For fast face detection.
  2. face_preprocess (Python Backend): Acts as the orchestrator, performing detection post-processing, face cropping, image format conversion, and synchronous internal calls to the ArcFace model.
  3. ArcFace: For generating L2-normalized 512-D face embeddings.

The face_recognition_ensemble configuration wires UltraFace to face_preprocess. ArcFace is not a separate ensemble step: face_preprocess calls it internally.

2. Model Configurations Overview

2.1. face_recognition_ensemble (The Main Service)

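| I/O    | Name        | Data Type | Dimensions        | Description                                     |
|--------|-------------|-----------|-------------------|-------------------------------------------------|
| Input  | image_input | TYPE_FP32 | [1, 3, 480, 640]  | Normalized input image (NCHW).                  |
| Output | face_tokens | TYPE_FP32 | [-1, 512]         | L2-normalized 512-D embeddings for N faces.     |
| Output | face_images | TYPE_FP32 | [-1, 3, 112, 112] | Cropped face images (N faces, NCHW, normalized). |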

2.2. Individual Model Specifications

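| Model Name      | Type           | Input (Name / Dims)      | Output (Name / Dims)                          | Key Role                              |
|-----------------|----------------|--------------------------|-----------------------------------------------|---------------------------------------|
| ultraface       | Detector       | input / [1, 3, 480, 640] | scores / [1, 17640, 2], boxes / [1, 17640, 4] | Face localization.                    |
| arcface         | Embedder       | input_1 / [1, 112, 112, 3] | embedding / [1, 512]                        | Feature extraction.                   |
| face_preprocess | Python Backend | scores, boxes, image     | face_tokens, face_images                      | NMS, cropping, ArcFace orchestration. |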
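
The NMS step listed for face_preprocess can be illustrated with a greedy non-maximum suppression over the detector's candidate boxes. The following is a minimal NumPy sketch, not the repository's code: the function name `nms`, the [x1, y1, x2, y2] box layout, and the default IoU threshold of 0.5 are assumptions made for illustration.

```python
import numpy as np


def nms(boxes: np.ndarray, scores: np.ndarray, iou_threshold: float = 0.5) -> list:
    """Greedy non-maximum suppression.

    boxes:  [M, 4] array of [x1, y1, x2, y2] corners (any consistent unit).
    scores: [M] array of face confidences.
    Returns the indices of the boxes that are kept, highest score first.
    """
    order = np.argsort(scores)[::-1]  # candidate indices, best score first
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        # Intersection of the best box with every remaining box.
        x1 = np.maximum(boxes[best, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[best, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[best, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[best, 3], boxes[rest, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        iou = inter / (areas[best] + areas[rest] - inter + 1e-9)
        # Drop boxes that overlap the best box too much.
        order = rest[iou <= iou_threshold]
    return keep
```

In practice the candidates would first be filtered by a confidence threshold so that NMS only runs on a small subset of the 17640 anchors.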

3. Installation

3.1. Prerequisites

  1. NVIDIA Triton Inference Server: Install a Triton build that includes the Python Backend.
  2. Model Files: You must acquire and place the ultraface and arcface ONNX/TensorRT model files into their respective version directories (e.g., ultraface/1/model.onnx).
  3. Python Dependencies (for face_preprocess): The Triton Python Backend container must have the following packages installed:
    pip install numpy opencv-python tritonclient

3.2. Model Repository Setup

Organize your model repository as follows:

<model_repository>
├── ultraface/
│   └── 1/
│       └── model.onnx  # UltraFace model file
├── arcface/
│   └── 1/
│       └── model.onnx  # ArcFace model file
├── face_preprocess/
│   ├── config.pbtxt
│   └── 1/
│       └── model.py    # The Python backend script
└── face_recognition_ensemble/
    └── config.pbtxt    # Ensemble definition
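
Before starting the server, it can help to confirm that the files in the layout above are actually in place. The helper below is a small sketch using only the standard library; `missing_files` and `EXPECTED_FILES` are hypothetical names, and the list covers only the files this README mentions.

```python
from pathlib import Path

# Files named in the layout above, relative to the model repository root.
EXPECTED_FILES = [
    "ultraface/1/model.onnx",
    "arcface/1/model.onnx",
    "face_preprocess/config.pbtxt",
    "face_preprocess/1/model.py",
    "face_recognition_ensemble/config.pbtxt",
]


def missing_files(repo_root):
    """Return the expected files that are absent under repo_root."""
    root = Path(repo_root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]
```

For example, `missing_files("/path/to/model_repository")` returns an empty list when every listed file exists.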

3.3. Running Triton

Start the Triton server, pointing it to your configured model repository.

tritonserver --model-repository=/path/to/model_repository
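
Model loading can take a while, so a client may want to wait until both the server and the ensemble report ready before sending requests. The sketch below polls the `is_server_ready()` and `is_model_ready()` methods that `tritonclient.grpc.InferenceServerClient` provides; the function name `wait_until_ready` and the timeout values are assumptions.

```python
import time


def wait_until_ready(client, model_name="face_recognition_ensemble",
                     timeout_s=60.0, poll_s=1.0):
    """Poll until the server and the model report ready, or the timeout expires.

    `client` is expected to expose is_server_ready() and is_model_ready(name),
    as tritonclient.grpc.InferenceServerClient does.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            if client.is_server_ready() and client.is_model_ready(model_name):
                return True
        except Exception:
            # The server may still be starting; treat errors as "not ready yet".
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)
```

A typical call, assuming the default gRPC port, would be `wait_until_ready(grpcclient.InferenceServerClient(url="localhost:8001"))`.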

4. Usage

4.1. Input Format

The client must prepare the input image to match the ensemble's expectation:

  • Shape: [1, 3, 480, 640] (NCHW format).
  • Data Type: TYPE_FP32 (32-bit float).
  • Normalization: The image must be pre-normalized the same way the UltraFace model was trained. Publicly distributed UltraFace models commonly expect RGB input scaled as (pixel - 127) / 128; verify this against the model you deploy.
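
The three requirements above can be combined into one conversion step. This is a sketch under stated assumptions: the input is already a 480×640 RGB uint8 array in HWC order, and the normalization constants (mean 127, scale 128) are the values commonly used with UltraFace rather than values confirmed by this repository. The function name `to_ensemble_input` is hypothetical.

```python
import numpy as np


def to_ensemble_input(image_rgb: np.ndarray, mean: float = 127.0,
                      scale: float = 128.0) -> np.ndarray:
    """Convert a 480x640 RGB uint8 image (HWC) into a [1, 3, 480, 640] FP32 tensor.

    mean and scale are assumptions; replace them with the values your
    UltraFace model was trained with.
    """
    if image_rgb.shape != (480, 640, 3):
        raise ValueError(f"expected shape (480, 640, 3), got {image_rgb.shape}")
    x = (image_rgb.astype(np.float32) - mean) / scale  # normalize
    x = np.transpose(x, (2, 0, 1))                      # HWC -> CHW
    x = np.expand_dims(x, axis=0)                       # add batch dimension
    return np.ascontiguousarray(x, dtype=np.float32)
```

Images of a different size need to be resized to 640×480 (width × height) before this step, for example with `cv2.resize`.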

4.2. Client Request

Use the Triton client (e.g., Python tritonclient.grpc) to request inference on the ensemble model:

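import numpy as np
import tritonclient.grpc as grpcclient

# Assumes Triton's gRPC endpoint is reachable on the default port 8001.
client = grpcclient.InferenceServerClient(url="localhost:8001")

# Placeholder input; replace with a real normalized image of shape [1, 3, 480, 640].
image_data = np.zeros((1, 3, 480, 640), dtype=np.float32)

# Build the input first; chaining set_data_from_numpy() is not portable across client versions.
image_input = grpcclient.InferInput("image_input", list(image_data.shape), "FP32")
image_input.set_data_from_numpy(image_data)

result = client.infer(
    model_name="face_recognition_ensemble",
    inputs=[image_input],
    outputs=[
        grpcclient.InferRequestedOutput("face_tokens"),
        grpcclient.InferRequestedOutput("face_images"),
    ],
)
face_tokens = result.as_numpy("face_tokens")  # shape [N, 512]
face_images = result.as_numpy("face_images")  # shape [N, 3, 112, 112]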

4.3. Outputs

The service returns two outputs whose first dimension, N, is the number of detected faces:

  1. face_tokens: Embeddings for N detected faces. Use these 512-D vectors for similarity calculation.
  2. face_images: The cropped and normalized $112 \times 112$ face images. Useful for debugging or visualization.
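
Because the face_tokens vectors are L2-normalized, the cosine similarity between two faces reduces to a dot product. The sketch below computes all pairwise similarities between two sets of embeddings. The function name is hypothetical, and the rows are re-normalized defensively so the result stays a cosine similarity even if an input is not exactly unit length.

```python
import numpy as np


def cosine_similarity_matrix(tokens_a: np.ndarray, tokens_b: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between two sets of embeddings.

    tokens_a: [N, D], tokens_b: [M, D]. Returns an [N, M] matrix in [-1, 1].
    """
    norm_a = np.maximum(np.linalg.norm(tokens_a, axis=1, keepdims=True), 1e-12)
    norm_b = np.maximum(np.linalg.norm(tokens_b, axis=1, keepdims=True), 1e-12)
    return (tokens_a / norm_a) @ (tokens_b / norm_b).T
```

A match is usually declared when the similarity exceeds a threshold. The right threshold depends on the ArcFace model and the data, so it should be tuned on labeled pairs rather than taken from this example.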

5. Notice

5.1. Coordinate System

  • The UltraFace boxes output contains corner coordinates normalized to [0, 1] relative to the $640 \times 480$ input dimensions; the scores output contains the per-anchor class confidences.
  • The face_preprocess script scales these back to pixel values for cropping and performs necessary image format conversions.
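
The scaling from normalized coordinates back to pixels can be sketched as follows. This is an illustration rather than the repository's implementation: the [x1, y1, x2, y2] box layout and the function name `boxes_to_pixels` are assumptions.

```python
import numpy as np


def boxes_to_pixels(boxes_norm: np.ndarray, width: int = 640,
                    height: int = 480) -> np.ndarray:
    """Scale [x1, y1, x2, y2] boxes from [0, 1] to integer pixel corners.

    Coordinates are clipped to the image so that later cropping stays in bounds.
    """
    scale = np.array([width, height, width, height], dtype=np.float32)
    pixels = np.clip(boxes_norm * scale, 0, scale)
    return np.rint(pixels).astype(np.int32)
```

Clipping matters because detectors can predict boxes that extend slightly past the image border.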

5.2. ArcFace Call Format

  • The arcface model is configured for NHWC input ([1, 112, 112, 3]), so the face_preprocess script transposes each NCHW crop ([1, 3, 112, 112]) to NHWC before the internal call.
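
The layout change amounts to a single axis permutation. A minimal NumPy sketch follows; the function name `nchw_to_nhwc` is hypothetical.

```python
import numpy as np


def nchw_to_nhwc(face_nchw: np.ndarray) -> np.ndarray:
    """Reorder a [N, 3, 112, 112] tensor into [N, 112, 112, 3]."""
    # np.transpose returns a strided view; copy it into contiguous memory
    # so the tensor can be handed to another model without surprises.
    return np.ascontiguousarray(np.transpose(face_nchw, (0, 2, 3, 1)))
```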

5.3. Face Alignment

  • The current implementation of _crop_and_resize_face uses a simple crop and resize to $112 \times 112$ based on the bounding box. It does NOT perform landmark-based affine alignment. This design choice (using the simple crop) may impact the quality of the ArcFace embeddings compared to a pipeline using sophisticated alignment.
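
A simple crop and resize of the kind described above can be sketched as follows. This is not the repository's `_crop_and_resize_face`: the function name `crop_and_resize` is hypothetical, and nearest-neighbor sampling is used only to keep the example dependency-free, where a real pipeline would more likely call an interpolating resize such as `cv2.resize`.

```python
import numpy as np


def crop_and_resize(image_chw: np.ndarray, box, size: int = 112) -> np.ndarray:
    """Crop box = (x1, y1, x2, y2) in pixels from a CHW image and resize it.

    Returns a [C, size, size] array sampled with nearest-neighbor lookup.
    """
    _, height, width = image_chw.shape
    x1, y1, x2, y2 = box
    # Clamp the box to the image and guarantee at least one pixel per axis.
    x1 = int(min(max(x1, 0), width - 1))
    y1 = int(min(max(y1, 0), height - 1))
    x2 = int(min(max(x2, x1 + 1), width))
    y2 = int(min(max(y2, y1 + 1), height))
    # Source row and column index for every output pixel.
    rows = y1 + (np.arange(size) * (y2 - y1)) // size
    cols = x1 + (np.arange(size) * (x2 - x1)) // size
    return image_chw[:, rows[:, None], cols[None, :]]
```

Because the crop ignores facial landmarks, a tilted or off-center face reaches ArcFace unaligned, which is the quality trade-off noted above.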

5.4. Error Handling

  • The face_preprocess model wraps the internal ArcFace call in an error handler (_call_arcface). If the call fails, it returns random normalized features as a fallback, so the ensemble request still completes instead of failing.
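
The fallback pattern can be sketched like this. It is an illustration under assumptions, not the repository's `_call_arcface`: the function name `embed_with_fallback`, the `embed_fn` callable, and the extra `ok` flag are all hypothetical. The flag is added here so a caller can tell placeholder vectors apart from real embeddings.

```python
import numpy as np


def embed_with_fallback(embed_fn, face_batch: np.ndarray, dim: int = 512, rng=None):
    """Call embed_fn(face_batch); on failure return random unit vectors instead.

    Returns (embeddings, ok). ok is False when the fallback was used.
    """
    try:
        return np.asarray(embed_fn(face_batch), dtype=np.float32), True
    except Exception:
        if rng is None:
            rng = np.random.default_rng()
        # One random vector per face, scaled to unit L2 norm.
        random = rng.standard_normal((face_batch.shape[0], dim)).astype(np.float32)
        random /= np.linalg.norm(random, axis=1, keepdims=True)
        return random, False
```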

6. Ensemble Configuration Details

face_recognition_ensemble/config.pbtxt

name: "face_recognition_ensemble"
platform: "ensemble"

input [
  {
    name: "image_input"
    data_type: TYPE_FP32
    dims: [1, 3, 480, 640]
  }
]

output [
  {
    name: "face_tokens"
    data_type: TYPE_FP32
    dims: [-1, 512]
  },
  {
    name: "face_images"
    data_type: TYPE_FP32
    dims: [-1, 3, 112, 112]
  }
]

ensemble_scheduling {
  step [
    {
      model_name: "ultraface"
      model_version: -1
      input_map { key: "input", value: "image_input" }
      output_map { key: "scores", value: "ultraface_scores" }
      output_map { key: "boxes", value: "ultraface_boxes" }
    },
    {
      model_name: "face_preprocess"
      model_version: -1
      input_map { key: "scores", value: "ultraface_scores" }
      input_map { key: "boxes", value: "ultraface_boxes" }
      input_map { key: "image", value: "image_input" }
      output_map { key: "face_tokens", value: "face_tokens" }
      output_map { key: "face_images", value: "face_images" }
    }
  ]
}
