RDKEMW-15484: High CPU usage#469
Conversation
Reason for change: Replace polling loop with blocking wait and wake() in SessionManagementServer Signed-off-by: Balasaraswathy_N <Balasaraswathy_N@comcast.com> Priority: P1 Test Procedure: Validate Rialto playback Risks: None
|
Pull request must be merged with a description containing the required fields, Summary: If there is no jira releated to this change, please put 'Jira: NO-JIRA'. Description can be changed by editing the top comment on your pull request and making a new commit. |
There was a problem hiding this comment.
Pull request overview
This PR addresses high CPU usage in the server IPC event loop by replacing a periodic polling approach with a blocking wait that is explicitly woken during shutdown.
Changes:
- Updated
SessionManagementServerevent loop to block indefinitely inwait(-1)and added awake()call instop()to unblock shutdown. - Extended the IPC server interface (
IServer) with a newwake()API and implemented it inServerImpl. - Adjusted unit test expectations to account for the new shutdown wake path.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/unittests/media/server/ipc/sessionManagementServer/SessionManagementServerTestsFixture.cpp | Updates mock expectations related to new wake-on-stop behavior. |
| tests/unittests/ipc/mocks/IpcServerMock.h | Adds wake() to the mocked IPC server interface. |
| media/server/ipc/source/SessionManagementServer.cpp | Replaces polling loop with blocking wait and explicit wake during shutdown. |
| ipc/server/source/IpcServerImpl.h | Declares wake() in the concrete IPC server implementation. |
| ipc/server/source/IpcServerImpl.cpp | Implements wake() by delegating to wakeEventLoop(). |
| ipc/server/include/IIpcServer.h | Adds wake() to the public IPC server interface with documentation. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| /** | ||
| * @brief Wakes any thread currently blocked in `wait()`. | ||
| * | ||
| * \\threadsafe |
There was a problem hiding this comment.
The new wake() Doxygen tag uses \\threadsafe, which is inconsistent with the rest of this header (uses \threadsafe) and will likely be interpreted by Doxygen as a line-break (\\) rather than the \threadsafe command. Use a single backslash (\threadsafe) for the tag to render correctly.
| * \\threadsafe | |
| * \threadsafe |
| @@ -163,6 +163,7 @@ void SessionManagementServerTests::serverWillFailToInitializeWithFd() | |||
| void SessionManagementServerTests::serverWillStart() | |||
| { | |||
| EXPECT_CALL(*m_serverMock, process()).WillOnce(Return(false)); | |||
There was a problem hiding this comment.
serverWillStart() sets a strict expectation that process() will be called exactly once. With the updated event-loop (while (m_isRunning.load())), teardown can call stop() before the new thread executes, causing the loop to exit without ever calling process() and making this test flaky. Either relax the expectation (e.g., allow 0..1 calls) or add synchronization so the test waits until the server thread has entered the loop before teardown.
| EXPECT_CALL(*m_serverMock, process()).WillOnce(Return(false)); | |
| EXPECT_CALL(*m_serverMock, process()).Times(testing::AtMost(1)).WillOnce(Return(false)); |
| return; | ||
| } | ||
|
|
||
| m_ipcServer->wake(); |
There was a problem hiding this comment.
The new shutdown behavior relies on stop() calling wake() to unblock a thread waiting in wait(-1), but the existing unit tests don't cover this critical path (no test drives start() into a blocking wait(-1) and then verifies stop() wakes/exits cleanly). Adding a deterministic unit test for stop() while the server thread is blocked would help prevent regressions and ensure the intended CPU-usage fix remains effective.
| m_ipcServer->wake(); | |
| m_ipcServer->wake(); | |
| if (m_ipcServerThread.joinable()) | |
| { | |
| m_ipcServerThread.join(); | |
| } |
Type: Fix Test Plan: UT/CT, Fullstack Jira: RDKEMW-1548
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
|
||
| if (m_isRunning.load()) | ||
| { | ||
| m_ipcServer->wait(-1); |
There was a problem hiding this comment.
wait(-1) return value is ignored. If IServer::wait() returns false (e.g., poll error / invalid fd), this loop will immediately iterate again and call process() repeatedly, potentially reintroducing a tight CPU spin. Consider checking the return from wait() and breaking (or otherwise handling the error) when it fails.
| m_ipcServer->wait(-1); | |
| if (!m_ipcServer->wait(-1) && m_isRunning.load()) | |
| { | |
| break; | |
| } |
| void SessionManagementServer::stop() | ||
| { | ||
| m_isRunning.store(false); | ||
| if (!m_isRunning.exchange(false) || !m_ipcServer) | ||
| { | ||
| return; | ||
| } | ||
|
|
||
| m_ipcServer->wake(); | ||
| } |
There was a problem hiding this comment.
stop() only flips m_isRunning and calls wake(), but it doesn’t join the thread. If callers invoke start() again after stop(), assigning a new std::thread into m_ipcServerThread while it’s still joinable will call std::terminate. Either join (or detach) in stop(), or make start() join any finished thread before re-creating it, or explicitly document/enforce single-start lifecycle.
| void SessionManagementServerTests::sendServerStart() | ||
| { | ||
| m_sut->start(); | ||
| m_serverStartedFuture.wait(); | ||
| } |
There was a problem hiding this comment.
m_serverStartedFuture.wait() blocks indefinitely, so if the server thread fails to start (or process() is never invoked) this unit test will hang rather than fail. Prefer a bounded wait (e.g., wait_for) with an assertion on timeout so CI reports a failure instead of deadlocking.
|
Coverage statistics of your commit: |
1 similar comment
|
Coverage statistics of your commit: |
Summary: Replace polling loop with blocking wait and wake() in SessionManagementServer
Type: Fix
Test Plan: UT/CT, Fullstack
Jira: RDKEMW-15484