VPLAY-12800:"Technical Fault "Error message is displayed during long run playback of linear channel#1079
VPLAY-12800:"Technical Fault "Error message is displayed during long run playback of linear channel#1079vinodkadungoth wants to merge 7 commits intodev_sprint_25_2from
Conversation
…keeping existing firstDownload behavior unchanged), so live refresh continues with the 500ms retry path on timeout-class failures.
There was a problem hiding this comment.
Pull request overview
This PR fixes a regression introduced in VPLAY-10563 where certain timeout errors during live manifest refresh caused the download thread to exit incorrectly, resulting in playback failures. The fix ensures that all timeout-class curl errors trigger a 500ms manifest refresh and allow playback to continue using already downloaded fragments.
Changes:
- Replaced hardcoded
CURLE_OPERATION_TIMEDOUTcheck withIsCurlTimeoutFailure()helper function to properly detect all timeout error types
|
@vinodkadungoth I've opened a new pull request, #1080, to work on those changes. Once the pull request is ready, I'll request review from you. |
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
| ManifestDownloadResponsePtr thirdManifest = secondManifest; | ||
| while (std::chrono::steady_clock::now() - betweenRefreshStart < | ||
| std::chrono::milliseconds(1000)) | ||
| { | ||
| auto current = mAampMPDDownloader->GetManifest(false, 0); | ||
| if ((current.get() != secondManifest.get()) && | ||
| IsCurlTimeoutFailure(current->mMPDDownloadResponse->iHttpRetValue)) | ||
| { | ||
| thirdManifest = current; | ||
| break; | ||
| } | ||
| usleep(100 * 1000); | ||
| } | ||
|
|
||
| EXPECT_TRUE(thirdManifest.get() != secondManifest.get()); | ||
| EXPECT_TRUE(IsCurlTimeoutFailure( | ||
| thirdManifest->mMPDDownloadResponse->iHttpRetValue)); |
There was a problem hiding this comment.
The retry verification uses another polling loop with sleep-based timing. This can intermittently fail on slow CI machines even if the logic is correct, and it adds unnecessary delay. Consider waiting for the next manifest update via GetManifest(true, timeoutMs) and asserting the elapsed time between updates (rather than polling every 100ms) to make the test deterministic and faster.
| ManifestDownloadResponsePtr thirdManifest = secondManifest; | |
| while (std::chrono::steady_clock::now() - betweenRefreshStart < | |
| std::chrono::milliseconds(1000)) | |
| { | |
| auto current = mAampMPDDownloader->GetManifest(false, 0); | |
| if ((current.get() != secondManifest.get()) && | |
| IsCurlTimeoutFailure(current->mMPDDownloadResponse->iHttpRetValue)) | |
| { | |
| thirdManifest = current; | |
| break; | |
| } | |
| usleep(100 * 1000); | |
| } | |
| EXPECT_TRUE(thirdManifest.get() != secondManifest.get()); | |
| EXPECT_TRUE(IsCurlTimeoutFailure( | |
| thirdManifest->mMPDDownloadResponse->iHttpRetValue)); | |
| // Wait for the next manifest update instead of polling with sleeps | |
| ManifestDownloadResponsePtr thirdManifest = | |
| mAampMPDDownloader->GetManifest(true, 2000); | |
| auto betweenRefreshEnd = std::chrono::steady_clock::now(); | |
| auto betweenRefreshDuration = std::chrono::duration_cast< | |
| std::chrono::milliseconds>(betweenRefreshEnd - betweenRefreshStart); | |
| EXPECT_TRUE(thirdManifest != nullptr); | |
| EXPECT_TRUE(thirdManifest.get() != secondManifest.get()); | |
| EXPECT_TRUE(IsCurlTimeoutFailure( | |
| thirdManifest->mMPDDownloadResponse->iHttpRetValue)); | |
| EXPECT_GT(betweenRefreshDuration.count(), 0); |
| TEST_F(FunctionalTests, | ||
| AampMPDDownloader_LiveRefreshRetriesAt500MsOnTimeoutClassFailure) | ||
| { |
There was a problem hiding this comment.
The test name claims retries happen at 500ms, but the assertions only check that a subsequent timeout manifest appears within 1000ms. Either measure/assert the ~500ms behavior (e.g., bound the elapsed time between updates) or rename the test to reflect what it actually verifies to avoid misleading future readers.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
| } | ||
|
|
||
| TEST_F(FunctionalTests, | ||
| AampMPDDownloader_LiveRefreshRetriesAt500MsOnTimeoutClassFailure) |
There was a problem hiding this comment.
The test name asserts a 500ms retry interval, but the assertions only verify that additional manifests arrive and that they fail with a timeout-class error. Consider explicitly validating the retry cadence (e.g., measure elapsed time around GetManifest calls with std::chrono::steady_clock and assert the refresh occurs ~500ms after the timeout, with a small tolerance), otherwise the test can pass even if the retry interval regresses.
| AampMPDDownloader_LiveRefreshRetriesAt500MsOnTimeoutClassFailure) | |
| AampMPDDownloader_LiveRefreshRetriesOnTimeoutClassFailure) |
| ManifestDownloadResponsePtr firstManifest = | ||
| mAampMPDDownloader->GetManifest(true, 2000); |
There was a problem hiding this comment.
The test name asserts a 500ms retry interval, but the assertions only verify that additional manifests arrive and that they fail with a timeout-class error. Consider explicitly validating the retry cadence (e.g., measure elapsed time around GetManifest calls with std::chrono::steady_clock and assert the refresh occurs ~500ms after the timeout, with a small tolerance), otherwise the test can pass even if the retry interval regresses.
| ManifestDownloadResponsePtr secondManifest = | ||
| mAampMPDDownloader->GetManifest(true, 3000); |
There was a problem hiding this comment.
The test name asserts a 500ms retry interval, but the assertions only verify that additional manifests arrive and that they fail with a timeout-class error. Consider explicitly validating the retry cadence (e.g., measure elapsed time around GetManifest calls with std::chrono::steady_clock and assert the refresh occurs ~500ms after the timeout, with a small tolerance), otherwise the test can pass even if the retry interval regresses.
|
|
||
| ManifestDownloadResponsePtr thirdManifest = | ||
| mAampMPDDownloader->GetManifest(true, 1500); | ||
| ASSERT_TRUE(thirdManifest != nullptr); | ||
| EXPECT_TRUE(thirdManifest.get() != secondManifest.get()); | ||
| EXPECT_TRUE(IsCurlTimeoutFailure( | ||
| thirdManifest->mMPDDownloadResponse->iHttpRetValue)); | ||
|
|
There was a problem hiding this comment.
The test name asserts a 500ms retry interval, but the assertions only verify that additional manifests arrive and that they fail with a timeout-class error. Consider explicitly validating the retry cadence (e.g., measure elapsed time around GetManifest calls with std::chrono::steady_clock and assert the refresh occurs ~500ms after the timeout, with a small tolerance), otherwise the test can pass even if the retry interval regresses.
| ManifestDownloadResponsePtr thirdManifest = | |
| mAampMPDDownloader->GetManifest(true, 1500); | |
| ASSERT_TRUE(thirdManifest != nullptr); | |
| EXPECT_TRUE(thirdManifest.get() != secondManifest.get()); | |
| EXPECT_TRUE(IsCurlTimeoutFailure( | |
| thirdManifest->mMPDDownloadResponse->iHttpRetValue)); | |
| auto thirdStart = std::chrono::steady_clock::now(); | |
| ManifestDownloadResponsePtr thirdManifest = | |
| mAampMPDDownloader->GetManifest(true, 1500); | |
| auto thirdEnd = std::chrono::steady_clock::now(); | |
| auto elapsedMs = std::chrono::duration_cast<std::chrono::milliseconds>(thirdEnd - thirdStart).count(); | |
| ASSERT_TRUE(thirdManifest != nullptr); | |
| EXPECT_TRUE(thirdManifest.get() != secondManifest.get()); | |
| EXPECT_TRUE(IsCurlTimeoutFailure( | |
| thirdManifest->mMPDDownloadResponse->iHttpRetValue)); | |
| // Verify that the manifest refresh after the timeout occurs with roughly | |
| // the expected 500ms retry cadence. Allow some tolerance to avoid flakiness. | |
| EXPECT_GE(elapsedMs, 400); | |
| EXPECT_LE(elapsedMs, 2000); |
| if(!firstDownload && (IsCurlTimeoutFailure(mMPDData->mMPDDownloadResponse->iHttpRetValue) || CURLE_COULDNT_CONNECT == mMPDData->mMPDDownloadResponse->iHttpRetValue)) | ||
| { | ||
| AAMPLOG_WARN("Refresh every 500ms to handle a manifest timeout error."); | ||
| //Forcefully go with 500 ms refresh |
There was a problem hiding this comment.
The warning text says this is handling a 'manifest timeout error', but the condition now covers IsCurlTimeoutFailure(...) plus CURLE_COULDNT_CONNECT. Update the log message to reflect the broader timeout-class/connect-failure behavior (and optionally include the curl error code) so logs remain accurate when diagnosing issues.
Curl error 28 (timeout) should trigger a manifest refresh after 500ms and allow
playback to continue using already downloaded fragments. Due to a regression
introduced in VPLAY-10563, the player was exiting the download thread and
incorrectly reporting a playback failure to the application.
This change ensures timeout-class failures do not terminate the download thread and instead trigger a manifest refresh after 500ms.