From 2732cb0cb29898a9bfb03b019b02b308f89f3748 Mon Sep 17 00:00:00 2001
From: IgorSwat
Date: Wed, 25 Mar 2026 16:21:21 +0100
Subject: [PATCH] update s2t benchmarks

---
 docs/docs/02-benchmarks/inference-time.md         | 4 ++--
 .../version-0.8.x/02-benchmarks/inference-time.md | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/docs/docs/02-benchmarks/inference-time.md b/docs/docs/02-benchmarks/inference-time.md
index e0723c2681..1f0a394e74 100644
--- a/docs/docs/02-benchmarks/inference-time.md
+++ b/docs/docs/02-benchmarks/inference-time.md
@@ -139,7 +139,7 @@ Average time for encoding audio of given length over 10 runs. For `Whisper` mode
 
 | Model | iPhone 17 Pro (XNNPACK) [ms] | iPhone 16 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] |
 | ------------------ | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: |
-| Whisper-tiny (30s) | 248 | 254 | 1145 | 435 | 526 |
+| Whisper-tiny (30s) | 89 | 93 | 403 | 277 | 260 |
 
 ### Decoding
 
@@ -147,7 +147,7 @@ Average time for decoding one token in sequence of approximately 100 tokens, wit
 
 | Model | iPhone 17 Pro (XNNPACK) [ms] | iPhone 16 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] |
 | ------------------ | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: |
-| Whisper-tiny (30s) | 23 | 25 | 121 | 92 | 115 |
+| Whisper-tiny (30s) | 6 | 6 | 40 | 28 | 25 |
 
 ## Text to Speech
 
diff --git a/docs/versioned_docs/version-0.8.x/02-benchmarks/inference-time.md b/docs/versioned_docs/version-0.8.x/02-benchmarks/inference-time.md
index e0723c2681..1f0a394e74 100644
--- a/docs/versioned_docs/version-0.8.x/02-benchmarks/inference-time.md
+++ b/docs/versioned_docs/version-0.8.x/02-benchmarks/inference-time.md
@@ -139,7 +139,7 @@ Average time for encoding audio of given length over 10 runs. For `Whisper` mode
 
 | Model | iPhone 17 Pro (XNNPACK) [ms] | iPhone 16 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] |
 | ------------------ | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: |
-| Whisper-tiny (30s) | 248 | 254 | 1145 | 435 | 526 |
+| Whisper-tiny (30s) | 89 | 93 | 403 | 277 | 260 |
 
 ### Decoding
 
@@ -147,7 +147,7 @@ Average time for decoding one token in sequence of approximately 100 tokens, wit
 
 | Model | iPhone 17 Pro (XNNPACK) [ms] | iPhone 16 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] |
 | ------------------ | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: |
-| Whisper-tiny (30s) | 23 | 25 | 121 | 92 | 115 |
+| Whisper-tiny (30s) | 6 | 6 | 40 | 28 | 25 |
 
 ## Text to Speech
 