RFC on Tasking Inside our NodeMCU Framework #2803
Comments
Excellent concept, fully support that 👍
One footnote here. After a side conversation with @jmattsson, I've just realised that the use of
I've had a few other commitments over this last few weeks, so progress has been slow on this, but I now consider this chunk of work as stable.
I am visiting family this next couple of days so I will do the PR itself on Thursday. Given that this is an architectural alignment for @marcelstoer, are you comfortable with this?
Implemented in #2836
The NodeMCU architecture in essence.
NodeMCU works broadly the same as Node.js (here is a good overview). On the ESP variants (RTOS for the ESP32 or the non-OS SDK), a Lua application is composed of a set of tasks organised and scheduled through a Single Threaded Event Loop Scheduler.
Each task typically has a thin C initiator which then calls a Lua function that may call other Lua functions in turn, but the whole then runs to completion. The event scheduler then starts the next task that is ready to run, based on FIFO within priority.
The whole framework is based on the rule that individual tasks run to completion and are not interrupted by other tasks, so the system as a whole can be implemented in a single processing thread. For this to work, tasks should be short, sharp and non-blocking.
Each task is typically initiated by an external event: a timer has fired, a GPIO has been set, a network packet has arrived. These are therefore referred to as callbacks in SDK terminology.
Because the Lua VM only executes one task at a time, we don't need mutexes or other fancy task synchronisation mechanisms. Multi-tasking is cooperative: a task yields control by terminating.
In a typical well-written ESP Lua application, most tasks are short and execute within a few milliseconds, so the ESP processor can complete hundreds of tasks a second with minimal overhead, making it and NodeMCU really well suited to embedded IoT applications.
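The scheduling rule described above (run to completion, FIFO within priority) can be modelled in a few lines of plain C. This is a toy sketch and not the SDK or NodeMCU scheduler: task_post, task_run_one, task_run_all, log_task, the three priority levels and the queue length are all names and values assumed here for illustration.

```c
#include <stddef.h>

/* Toy model only. Every name below is illustrative and is not part of
   the NodeMCU or SDK API. */

enum { PRIO_LOW, PRIO_MEDIUM, PRIO_HIGH, PRIO_COUNT };
#define QUEUE_LEN 8

typedef void (*task_fn)(int param);

typedef struct { task_fn fn; int param; } task_t;

typedef struct {
  task_t slot[QUEUE_LEN];
  size_t head, count;          /* ring buffer: FIFO order within one priority */
} queue_t;

static queue_t queues[PRIO_COUNT];

/* Queue a task. Returns 0 on success, -1 on a bad priority or a full queue. */
int task_post(int prio, task_fn fn, int param) {
  if (prio < 0 || prio >= PRIO_COUNT) return -1;
  queue_t *q = &queues[prio];
  if (q->count == QUEUE_LEN) return -1;
  q->slot[(q->head + q->count) % QUEUE_LEN] = (task_t){ fn, param };
  q->count++;
  return 0;
}

/* Run the oldest task of the highest non-empty priority to completion.
   Returns 1 if a task ran, 0 if every queue was empty. */
int task_run_one(void) {
  for (int prio = PRIO_COUNT - 1; prio >= 0; prio--) {
    queue_t *q = &queues[prio];
    if (q->count == 0) continue;
    task_t t = q->slot[q->head];
    q->head = (q->head + 1) % QUEUE_LEN;
    q->count--;
    t.fn(t.param);             /* the next task starts only after this returns */
    return 1;
  }
  return 0;
}

/* Drain all queues. Returns the number of tasks that ran. */
int task_run_all(void) {
  int n = 0;
  while (task_run_one()) n++;
  return n;
}

/* Example task body: appends its parameter to a log so the run order is visible. */
static int run_log[16];
static size_t run_len;
static void log_task(int param) {
  if (run_len < 16) run_log[run_len++] = param;
}
```

Posting log_task with parameters 1 (low), 2 (high), 3 (low) and 4 (high), in that order, and then calling task_run_all() fills run_log with 2, 4, 1, 3: both high-priority tasks run first, each group in the order it was posted.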
So a typical implementation pattern for a task is that it comprises:
An initiator coded in C which is scheduled in response to an external event. For example a network socket event, such as receiving a TCP packet, invokes the routine net_recv_cb() which then decodes the event and decides which Lua action function to execute.
There is typically a 1-1 association with a Lua-callable booking function which can book such events and associate the correct Lua function with the event occurring, in this case net_on('receive',func).
Because each task exits from a Lua VM perspective, that is the Lua call stack unrolls entirely, the only Lua variables that are preserved from task to task are those stored in the Lua environment (_G) and in the Lua Registry, or their direct children. The Lua GC will collect all local variables created and released during the task execution.
Because Lua task functions must persist from task to task, these are all stored in the Lua Registry and referenced using an integer handle. The booking function will use the luaL_ref() API to allocate this registry slot and obtain the handle, and then the event routine will retrieve the task function using this handle and then call luaL_unref() to return the used slot to the pool, before executing a lua_call() to execute the Lua task.
This is a pretty fixed implementation pattern but we haven't encapsulated this in a higher level API, so there are subtle differences in how this is coded from task to task. Not good.
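The book / retrieve / release / call sequence described above can be sketched without the Lua headers by standing in a small slot pool for the Lua Registry. Everything here is an assumption made for illustration: ref_alloc, ref_get and ref_free only mimic the roles of luaL_ref(), lua_rawgeti() and luaL_unref(), and socket_t, socket_on_receive and socket_recv_event are hypothetical stand-ins for the module's userdata, its booking function and its C initiator.

```c
#include <stddef.h>

/* Toy stand-in for the Lua Registry: a fixed pool of callback slots
   addressed by integer handles. All names are hypothetical. */

#define NOREF     (-1)
#define POOL_SIZE 4

typedef void (*callback_fn)(const char *data);

static callback_fn pool[POOL_SIZE];    /* NULL marks a free slot */

/* Role of luaL_ref(): store the callback, hand back an integer handle. */
int ref_alloc(callback_fn fn) {
  if (fn == NULL) return NOREF;
  for (int i = 0; i < POOL_SIZE; i++)
    if (pool[i] == NULL) { pool[i] = fn; return i; }
  return NOREF;                        /* pool exhausted */
}

/* Role of lua_rawgeti() on the registry: look a callback up by handle. */
callback_fn ref_get(int ref) {
  return (ref >= 0 && ref < POOL_SIZE) ? pool[ref] : NULL;
}

/* Role of luaL_unref(): return the slot to the pool. */
void ref_free(int ref) {
  if (ref >= 0 && ref < POOL_SIZE) pool[ref] = NULL;
}

/* Per-socket state. cb_receive_ref must start out as NOREF. */
typedef struct { int cb_receive_ref; } socket_t;

/* Booking function: plays the part of net_on('receive', func). */
void socket_on_receive(socket_t *s, callback_fn fn) {
  ref_free(s->cb_receive_ref);         /* drop any earlier booking */
  s->cb_receive_ref = ref_alloc(fn);
}

/* Event routine: plays the part of net_recv_cb().
   Returns 1 if a callback ran, 0 if nothing was booked. */
int socket_recv_event(socket_t *s, const char *data) {
  callback_fn fn = ref_get(s->cb_receive_ref);
  if (fn == NULL) return 0;
  ref_free(s->cb_receive_ref);         /* slot goes back before the call */
  s->cb_receive_ref = NOREF;
  fn(data);                            /* the task itself */
  return 1;
}

/* Example callback: remembers the last payload it was given. */
static const char *last_data;
static void on_data(const char *data) { last_data = data; }
```

With a socket initialised as socket_t s = { NOREF }, booking on_data and then delivering one event runs the callback once and leaves the handle released, so a second event finds nothing to call. Each real module repeats some variant of these steps by hand, which is where the subtle differences creep in.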
Whilst NodeMCU as a whole makes very effective use of this framework through its modules library, ironically the core Lua VM does not. This is possibly because the Lua port was done first to bootstrap the implementation. A good example of where we could use this effectively follows:
Lua error handling and Panics
NodeMCU implements the standard Lua error handling model. In this model, any call level can establish an error handler as part of calling a sub-function. If errors are thrown in this sub-function then they are caught by the error handler. If an error is thrown and not caught by an error handler then it is caught at the top level by what is known as the Lua Panic handler, and on NodeMCU this emits a terse error message to UART0 before rebooting the ESP. This makes Panic errors very difficult to diagnose.
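The difference between a caught error and a panic can be modelled with setjmp/longjmp, which is also the mechanism the stock C build of Lua uses for its error unwinding. The sketch below is a toy model under stated assumptions, not the Lua API: try_call, throw_error and set_panic_handler are hypothetical names that play the roles of lua_pcall(), lua_error() and lua_atpanic().

```c
#include <setjmp.h>
#include <stddef.h>

/* Toy model of protected calls. All names are hypothetical. */

typedef void (*fn_t)(void);
typedef void (*panic_fn)(const char *msg);

static jmp_buf *handler_top;       /* innermost active error handler, or NULL */
static const char *pending_msg;    /* message carried across the longjmp */
static panic_fn panic_handler;     /* top-level handler of last resort */

void set_panic_handler(panic_fn fn) { panic_handler = fn; }

/* Throw an error. With a handler active, unwind to it. With none active,
   this is a panic: call the panic handler. The real Lua VM terminates after
   its panic function returns and NodeMCU reboots; this toy just returns so
   that it can be exercised in a test. */
void throw_error(const char *msg) {
  if (handler_top != NULL) {
    pending_msg = msg;
    longjmp(*handler_top, 1);
  }
  if (panic_handler != NULL) panic_handler(msg);
}

/* Protected call: returns NULL on success, or the error message if fn threw.
   Either way control comes back to the caller. */
const char *try_call(fn_t fn) {
  jmp_buf here;
  jmp_buf *outer = handler_top;    /* remember the enclosing handler */
  const char *result = NULL;
  handler_top = &here;
  if (setjmp(here) == 0) {
    fn();                          /* normal path */
  } else {
    result = pending_msg;          /* reached via longjmp from throw_error */
  }
  handler_top = outer;             /* handlers nest like a stack */
  return result;
}

/* Example functions for trying the model out. */
static void ok_task(void) { }
static void bad_task(void) { throw_error("attempt to index a nil value"); }
static const char *panic_seen;
static void record_panic(const char *msg) { panic_seen = msg; }
```

Here try_call(bad_task) hands the message back to its caller, whereas calling bad_task() directly, with no handler established, ends up in whatever was registered through set_panic_handler().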
There is absolutely no reason for panics to be handled this way. If we look at a typical pattern for calling a task function:
Here we are calling the function with the handle ud->client.cb_sent_ref passing the userdata ud->client.cb_sent_ref as context. If this cb_sent_ref routine throws an error then this will panic and reboot the ESP. Why do this? If we replace this with a pattern:
We can not only save on coding space, but also get panic handling with full error traceback 'for free'. There are 71 such fragments in the modules directory so doing this is a pretty straightforward batch edit. We would need one extra node call, node.atpanic(function), which establishes a non-default panic handler.
The nodemcu_call() would be something along the lines of:
Now the call always returns whether or not the function throws an error. However if it does then the nodemcu_traceback() gathers a full error traceback and does a task post to the registered atpanic routine with the traceback as a string argument. The default atpanic routine would print this full traceback and restart the CPU. However a production application might log the error over the network.
Other possible uses of tasking within the Lua VM / NodeMCU runtime.
It is moot whether we should regard such features as Lua components (i.e. with a lua_ prefix and part of the lua file hierarchy) or as NodeMCU ones (i.e. with a nodemcu_ prefix and part of the platform or similar file hierarchy). My view is that these extensions are intimately tied into the Lua VM and we already have a Lua module for the NodeMCU extensions; this uses the luaF_ prefix and is in lflash.c, but there is sound sense in lumping all of these extras together and calling this file lnodemcu.c instead.
Anyway, as well as error handling, other places where I am planning to use this tasking model include:
node.output() spooling
Well, any considered responses?
PS: or follow the dev-esp32 lead and use luaX_ for this and keep luaN_ for LFS.