nrf_rpc: implement timeout for response #1764

Damian-Nordic · 2025-06-06T09:50:42Z

Introduce the changes needed to support timing out while waiting for a command response:

1. Handle nrf_rpc_os_msg_get() returning a null data pointer, which indicates that the response has not arrived within the required maximum time.
2. Introduce a context mutex to protect all context members. This is necessary because the response may now arrive after the initiating thread has already released the context.

Note that item 2 may not be a sufficient solution for another scenario, in which the context is freed by the initiator but then reused before a delayed response arrives. To solve this, we likely need to introduce the concept of a session ID or conversation ID to detect outdated packets.
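For illustration, the conversation ID idea could look roughly like the sketch below. It is not part of this PR: every name (`demo_ctx`, `demo_ctx_alloc`, `demo_rsp_deliver`, `demo_cmd_finish`, `DEMO_ERR_TIMEOUT`) is hypothetical, the model is single-threaded, and it assumes each response carries the ID of the command it answers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_ERR_TIMEOUT (-1) /* hypothetical error code, not an nRF RPC value */

/* Hypothetical, simplified command context; not the real nRF RPC structure. */
struct demo_ctx {
	bool in_use;        /* true while an initiator owns the context */
	uint8_t conv_id;    /* bumped on every allocation; wraps after 255 */
	const uint8_t *rsp; /* accepted response data, NULL if none arrived */
};

/* Claim the context for a new command and return its conversation ID. */
static uint8_t demo_ctx_alloc(struct demo_ctx *ctx)
{
	assert(!ctx->in_use);
	ctx->in_use = true;
	ctx->conv_id++;
	ctx->rsp = NULL;
	return ctx->conv_id;
}

/* Receive path: accept a response only if it answers the current conversation. */
static bool demo_rsp_deliver(struct demo_ctx *ctx, uint8_t conv_id, const uint8_t *data)
{
	if (!ctx->in_use || conv_id != ctx->conv_id) {
		return false; /* outdated packet: the initiator gave up or moved on */
	}
	ctx->rsp = data;
	return true;
}

/* Initiator path: a NULL data pointer models a wait that timed out. */
static int demo_cmd_finish(struct demo_ctx *ctx)
{
	const uint8_t *data = ctx->rsp;

	ctx->in_use = false;
	ctx->rsp = NULL;
	return (data == NULL) ? DEMO_ERR_TIMEOUT : 0;
}
```

In this model, a response tagged with the ID of a timed-out command is rejected even after the same context has been allocated again, which is the case the mutex alone does not cover.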

manifest-pr-skip

Damian-Nordic · 2025-06-18T14:40:56Z

@doki-nordic @grochu Friendly reminder to take a look.

grochu

Sorry for not reviewing this sooner; I hoped someone more experienced with nRF RPC could do it. Anyway, the changes look technically OK to me. I just have some minor comments from the documentation perspective.

nrf_rpc/template/nrf_rpc_os_tmpl.h

nrf_rpc/nrf_rpc.c

doki-nordic · 2025-06-24T19:19:52Z

I started reviewing this PR, but since I have limited time for it, I did not finish the review. I have concerns about the mutex locking logic. Do we really need to lock it in the context alloc function? When locking and unlocking logic is spread across multiple functions and multiple files, there is a higher risk of edge cases or race conditions that may lead to a deadlock. I am also concerned about the context reusing functionality. Can you run a simple context reusing test on this PR? See the docs for some details about it: https://docs.nordicsemi.com/bundle/ncs-latest/page/nrfxlib/nrf_rpc/doc/architecture.html#id5 . This happens, for example, when the local side sends a command and waits for a response, and then the remote side sends another command back to the local side from within the command handler context. In this case, the context on the local side should be reused.

Damian-Nordic · 2025-06-24T21:20:21Z

Locking in cmd_ctx_alloc may seem counterintuitive, but I didn't find a better place to do it:

1. cmd_ctx_alloc already modifies the context to initialize it, so it must lock the mutex to prevent another thread from using a partially initialized context, for example when a delayed response arrives. If we wanted to lock and unlock within a single function every time, we would need to do this in many functions, such as cmd_ctx_alloc, cmd_ctx_reserve, and nrf_rpc_cmd_common. I thought this would be less efficient.
2. cmd_ctx_alloc is called only when a new context is allocated. In the case of context reusing, it won't be called again. If we moved the locking to, for example, nrf_rpc_cmd_common, we might try to lock an already locked mutex, so we would need to assume the mutex is reentrant or add an extra check.

I'm happy to document it better, or rework this if you prefer locking the mutex in individual functions, but I thought doing this the current way actually simplifies the design - you always run nRF RPC code with the context mutex held and unlock it only when waiting for the peer or when releasing the context.
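A minimal sketch of that lock discipline, under stated assumptions: the function names below only echo cmd_ctx_alloc and nrf_rpc_cmd_common and are not the real implementations, `demo_mutex` is a single-threaded stand-in that merely records whether the lock is held, and the wait functions are stubs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Single-threaded stand-in for an OS mutex; it only tracks the held state. */
struct demo_mutex {
	bool held;
};

static void demo_lock(struct demo_mutex *m)
{
	assert(!m->held); /* non-reentrant: locking twice is a bug in this model */
	m->held = true;
}

static void demo_unlock(struct demo_mutex *m)
{
	assert(m->held);
	m->held = false;
}

/* Hypothetical command context protected by its own mutex. */
struct demo_cmd_ctx {
	struct demo_mutex mutex;
	bool in_use;
};

/* Called only for a newly allocated context: lock first, then initialize. */
static void demo_cmd_ctx_alloc(struct demo_cmd_ctx *ctx)
{
	demo_lock(&ctx->mutex);
	ctx->in_use = true;
}

/* Stub waits: a non-NULL pointer is a response, NULL models a timeout. */
static const char *demo_wait_ok(void)
{
	return "rsp";
}

static const char *demo_wait_timeout(void)
{
	return NULL;
}

/* Runs with the mutex held and drops it only while waiting for the peer. */
static const char *demo_cmd_common(struct demo_cmd_ctx *ctx, const char *(*wait_for_peer)(void))
{
	const char *rsp;

	assert(ctx->mutex.held);
	demo_unlock(&ctx->mutex); /* the receive path may touch the context now */
	rsp = wait_for_peer();
	demo_lock(&ctx->mutex);
	return rsp;
}

/* Releasing the context is the only other place where the mutex is unlocked. */
static void demo_cmd_ctx_free(struct demo_cmd_ctx *ctx)
{
	ctx->in_use = false;
	demo_unlock(&ctx->mutex);
}
```

Because `demo_cmd_common` expects the mutex to be held on entry, a second command sent on a reused context does not need a reentrant mutex or an extra "already locked" check in this model.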

And yeah, I can run a test like that :)

Damian-Nordic · 2025-07-07T05:27:53Z

@grochu I addressed your comments, please take another look.
@doki-nordic I tested this code with context reusing (I implemented a UDP callback that retrieves extra information about the received UDP packet from the remote device) and it works as expected.
Let me know if you have time for a call so I can provide the rationale for the current design.

grochu

Approving to unblock development. Let's discuss the context locking with mutex later in another context.

nrf_rpc/CHANGELOG.rst

Introduce the changes needed to support timing out while waiting for a command response: 1. Handle nrf_rpc_os_msg_get() returning a null data pointer, which indicates that the response has not arrived within the required maximum time. 2. Introduce a context mutex to protect all context members. This is necessary because the response may now arrive after the initiating thread has already released the context. Note that item 2 may not be a sufficient solution for another scenario, in which the context is freed by the initiator but then reused before a delayed response arrives. To solve this, we likely need to introduce the concept of a session ID or conversation ID to detect outdated packets. Signed-off-by: Damian Krolik <damian.krolik@nordicsemi.no>

Damian-Nordic requested review from hakonfam and grochu June 6, 2025 09:50

Damian-Nordic requested a review from doki-nordic as a code owner June 6, 2025 09:50

This was referenced Jun 6, 2025

manifest: sdk-nrfxlib: nrf_rpc: implement timeout for response nrfconnect/sdk-nrf#22673

Closed

nrfxlib: nrf_rpc: implement response timeout nrfconnect/sdk-nrf#22674

Open

grochu reviewed Jun 24, 2025

nrf_rpc/template/nrf_rpc_os_tmpl.h (resolved)

nrf_rpc/nrf_rpc.c (resolved)

Damian-Nordic force-pushed the nrf_rpc_timeout branch from 6265956 to f4b99f3 Compare July 7, 2025 05:18

grochu approved these changes Jul 9, 2025

Damian-Nordic force-pushed the nrf_rpc_timeout branch from cfe4b53 to d4e34fe Compare July 9, 2025 12:56

bpienk approved these changes Jul 9, 2025

Damian-Nordic force-pushed the nrf_rpc_timeout branch from d4e34fe to 50ef472 Compare July 9, 2025 13:14

github-actions bot added the doc-required label (PR must not be merged without tech writer approval) Jul 9, 2025

Damian-Nordic requested a review from annwoj July 9, 2025 13:15

annwoj reviewed Jul 10, 2025

nrf_rpc/CHANGELOG.rst (outdated, resolved)

Damian-Nordic force-pushed the nrf_rpc_timeout branch from 50ef472 to 3084eae Compare July 10, 2025 09:34

annwoj approved these changes Jul 10, 2025
