Lua 5.1 to 5.3 realignment phase 1 #2836

TerryE · 2019-07-18T16:16:10Z

See #2033, #2787, #2802, #2803, #2808 and #2823.

This PR is for the dev branch rather than for master.
This PR is compliant with the other contributing guidelines as well.
I have thoroughly tested my contribution.
The code changes are reflected in the documentation at docs/*.

Overview

My aim here is to make the app/lua53 interface with the platform and other libraries as clean as possible, and since all of the other directories are going to be shared, it is just simpler to backport this tidying into the app/lua code hierarchy. This PR makes the first tranche of these changes to achieve this alignment.

This has ended up as a pretty large change, and it does change the feel of the Lua interpreter:

You can now paste 100s of lines into either the UART or telnet interface without data overrun. (The 4Kb string limit hasn't been changed.)
Error reporting is more robust and works with telnet. The hook to remove CB panics is included (but not yet deployed in all modules).
Telnet works as expected, with the VM facilitating the necessary field marshalling.

Detailed changes

lua/lua.c. Strip out all dead code left over from the eLua and core versions but not used in NodeMCU (e.g. standard Lua supports command line arguments but we don't use them, so there's no point in having all of these argument decoding functions). Tidy up how startup and the input loop are handled using the task interface. (Pasting bulk content into PuTTY etc. no longer overruns the input buffer.)
app/lua/lnodemcu.c. The NodeMCU extensions to the Lua API are now located in 3 API files: lflash.c for LFS (luaN_*() calls), lrotables.c for ROTables (luaF_*() calls) and lnodemcu.c for other extensions (for other luaN_*() calls). The current items in this last category include Lua task handling and catch-all panic handling. The new luaN_call() variant establishes a PANIC call handler that uses luaN_posttask() to send the error traceback to the stdout pipe. Note that the error reporter defaults to using the base print function to send the error to stdout, but I plan to add a node.setatpanic() API call in a subsequent PR to allow production systems to establish an alternative logging function for logging such panic errors.
lua/lua.h. Add entries for luaN_posttask() and luaN_call() as these functions are used as part of the core Lua VM. Add new "any" variants lua_isanyfunction(L,n) and lua_isanytable(L,n) to hide the type test differences between the Lua 5.1 and 5.3 APIs. (Note that these are used in pipe.c and uart.c, but will also be adopted in a later PR for all modules.)
lua/lbaselib.c. Stderr-type errors now use the base print function and this calls c_puts() rather than dbg_printf() as the latter is really only for debugging and bypasses node.output() redirection. (It was this change to use dbg_printf() that broke telnet).
Task Library. This functionality has now been moved into the platform core. See RFC Why have different coding standards for ESP32 and EP8266 modules? #2811 (comment) for background to this. Note that I have temporarily retained a task.h header that remaps the old interface to the platform calling conventions, so that modules using the task interface don't need recoding, except that the modules that use the task interface have the #include "task.h" hoisted in the include order. The luaX_ functions now include the Lua call interface.
app/modules/pipe.c. Add the ability to bind a Lua CB function at pipe.create() to read and empty the pipe.
app/modules/node.c. The node.output() function now takes a pipe reader as an argument. The field output writes to a stdout pipe and the output reader is now executed as a separate task. This means that node.output() works fine from the interactive session, and the stdin and stdout pipes handle all of the field marshalling automatically.
Make for lua now includes -Wall plus minor fixes to lua files to remove -Wall warnings. The one non-trivial change here is that app/platform/vfs.h previously used static declarations in this header, which generated compiler warnings, when these should have been static inline. Plus the lflash.c changes also picked up by Johny on the esp32 branch. Also includes fixing some operator precedence issues caused by the lack of extra guard parentheses in some define macros in driver/rotary.c and driver/spi.c.
Move the driver file receive.c to input.c plus enhancements, and ditto for its header. The old file didn't actually do the UART receive handling as this had been moved into lua.c. I've moved the receive handling back and since it does all of the line input handling for the UART0, I decided to rename it to input.c as this better reflects its purpose.
The driver file uart.c has had its initialisation binding split into two parts: uart_init() which is called from user_main.c and uart_init_task() which is called from the input.c driver as part of Lua startup to enable input delivery.
.gdbinit. Some minor enhancements to this remote GDB macro set for API developers.
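
The guard-parentheses and static inline points in the list above are easy to see in isolation. The following is a minimal sketch, not the actual macros from driver/rotary.c or driver/spi.c: the names SCALE_UNGUARDED, SCALE_GUARDED and the helper functions are hypothetical and exist only for illustration.

```c
#include <assert.h>

/* Hypothetical macros, not taken from the NodeMCU drivers. */
#define SCALE_UNGUARDED(x) x * 4        /* argument and result unprotected */
#define SCALE_GUARDED(x)   ((x) * 4)    /* extra guard parentheses */

/* Expands to a + b * 4, so the multiply binds to b alone. */
int scale_unguarded(int a, int b) { return SCALE_UNGUARDED(a + b); }

/* Expands to ((a + b) * 4), which is what the caller meant. */
int scale_guarded(int a, int b) { return SCALE_GUARDED(a + b); }

/* A helper defined in a header is normally declared static inline, so that
** a translation unit which includes the header but never calls the helper
** does not trigger an unused-function warning under -Wall. */
static inline int scale_inline(int x) { return x * 4; }

/* Self-check documenting the three results. */
void scale_selfcheck(void) {
  assert(scale_unguarded(1, 2) == 9);   /* 1 + 2 * 4 */
  assert(scale_guarded(1, 2) == 12);    /* (1 + 2) * 4 */
  assert(scale_inline(1 + 2) == 12);    /* a function evaluates its argument first */
}
```

Calling scale_selfcheck() from a test harness exercises all three forms; only the unguarded macro gives the surprising result.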

Notes

This is a large change because of all of the interdependencies. I suggest we flush out the pending PRs that we want to do in the next master drop, and merge this immediately after.

TerryE · 2019-07-19T03:00:27Z

And the Pipe version of the telnet server. This one is actually usable. I was pasting the following to test out the marshalling:

r=debug.getregistry() 
function list(name, t)
  if #name < 15 and (name == '_G' or t ~= _G) then
    print(name,t)
    for k,v in pairs(t) do
      local name = name..'.'..k
      if type(v):sub(-5) == 'table' and name:sub(1,8) ~= '_G.r.std' then
        list(name, v)
      else
        print(name,v)
      end
    end
  end
end
list("_G",_G)

It kept running out of memory, until I twigged that it was also enumerating r.stdin and r.stdout. The pipes support the tostring intrinsic for debugging, so r.stdout[2] contains the string representation of the first stdout pipe UData -- a recursive chain reaction, hence the name:sub(1,8) ~= '_G.r.std' guard. Phew!!

jmattsson

Phew! That's a lot of changes!

Overall looks really good! There are a few serious questions I have (in the comments above), but for the most part the comments are for grunt-work cleanup.

The input handling of course looks different on the esp32, but there's nothing here that makes me concerned that it'll cause issues on that side. A bit of integration work and platform adjustments, but should be pretty easy (famous last words, I know).

jmattsson · 2019-07-19T03:08:53Z

app/driver/input.c

+
+/**DEBUG**/extern void dbg_printf(const char *fmt, ...) __attribute__ ((format (printf, 1, 2)));
+
+static void input_handler(platform_task_param_t flag, uint8 priority);


task_prio_t ?

jmattsson · 2019-07-19T03:11:07Z

app/driver/input.c

+
+/*
+** input_handler at high-priority is a system post task used to process pending Rx
+** data on UART0.  The flag is used as a latch to stops the interrupt handler posting


s/stops/stop/

jmattsson · 2019-07-19T03:11:53Z

app/driver/input.c

+int lua_main (void);
+static bool input_readline(void);
+
+static void input_handler(platform_task_param_t flag, uint8 priority) {


task_prio_t

jmattsson · 2019-07-19T03:12:19Z

app/driver/input.c

+    lua_main();
+    return;
+  }
+  ins.input_sig_flag = flag & 0x1;


I'd quite prefer a symbolic name here. Even if the lack of one is my fault so far >.>

It's now a 0/1 field, so this is one case where I think that a symbolic name isn't really needed.

jmattsson · 2019-07-19T03:13:17Z

app/driver/input.c

+static void input_handler(platform_task_param_t flag, uint8 priority) {
+  (void) priority;
+  if (!ins.data) {
+    lua_main();


What's the advantage of doing the init here, rather than when establishing this task handler?

jmattsson · 2019-07-19T04:28:32Z

app/modules/pipe.c

-** Note that pipes also support the undocumented length and tostring operators
-** for debugging puposes, so if p is a pipe then #p[1] gives the effective
-** length of pipe slot 1 and printing p[1] gives its contents
+** Note that pipe tables also support the undocumented length and tostring


Looks like it's no longer undocumented! ;)

Documented as in readthedocs general documentation. Anyone reading this is digging around in the pipe internals anyway.

jmattsson · 2019-07-19T04:30:41Z

app/modules/pipe.c

+#define CB_QUIESCENT      4
+/*
+** Note that nothing precludes the Lua CB function from itself writing to the 
+** pipe and in this case this routine will call itself recursively. 


Recursively, or merely retasking itself?

Recursively vs retasking. I leave it to you to propose alternative wording.

The real point here was that node.output() used to be a minefield as it was really breaking the VM architectural assumptions in that the print base function was never intended to embed a Lua CB, so doing a print inside this Lua CB code could trash the whole runtime environment and PANIC the processor.

Because the output interface now uses pipes and the tasking mechanism, we now comply with the architecture and there is actually nothing stopping the output CB doing a print.

jmattsson · 2019-07-19T04:34:15Z

app/modules/uart.c

  lua_rawgeti(L, LUA_REGISTRYINDEX, uart_receive_rf);
  lua_pushlstring(L, buf, len);
-  lua_call(L, 1, 0);
-  return !run_input;
+  luaN_call(L, 1, 0, 0);


So here for example, I believe if the call fails we've just littered the stack with an error message.

This actually touches on a more systemic issue in that all CB action routines should always balance the stack from entry to return but we don't have any guidance / quality check / coding pattern to ensure this and failure to do so leads to resource leakage that will ultimately exhaust memory. We need to address this wider issue, IMO.

However as to this specific point, we need to decide the stack rules for luaN_call(). IMO we have two options:

1. It follows the lua_pcall() convention of returning 1 argument and an error status if the called routine throws an error, or

2. It mimics lua_call() by returning nresults arguments (which might be nil in the case of an error) and an error status.

IIRC, there is only one use in the modules library where the library routine accepts any result from the Lua CB, and all that typically happens is that the wrapper function returns control to the SDK on return from the CB. I think that we should go for ease of migration to luaN_call() and that adopting option 2 will best achieve this.

My vote would be for #2 as well. If there are any instances where a module is actually interested in the error message, they can use lua_pcall() for that. For everyone else, I think the simplicity of lua_call() behaviour will give us the most robust outcome.

jmattsson · 2019-07-19T04:35:05Z

app/modules/uart.c

  {
-    need_len = ( uint16_t )luaL_checkinteger( L, stack );
+    data_len = luaL_checkinteger( L, stack );
+    luaL_argcheck(L, data_len >= 0 && data_len <= 255, stack, "wrong arg range");


Not LUA_MAXINPUT?

Dunno. Ask Zeroday why he used 255. All I did was to recode the existing validation logic to use luaL_argcheck() whilst getting rid of some warnings.

The actual UART input buffer is allocated with input_setup() and this is called in lua.c using LUA_MAXINPUT so I agree that this would be a better symbolic value. So updated.

jmattsson · 2019-07-19T04:35:59Z

app/modules/uart.c

-  if (lua_type(L, stack) == LUA_TFUNCTION || lua_type(L, stack) == LUA_TLIGHTFUNCTION){
-    if ( lua_isnumber(L, stack+1) ){
-      run = lua_tointeger(L, stack+1);
+  if (lua_isfunction(L, stack) || lua_islightfunction(L, stack)) {


I believe you added a nice lua_isanyfunction(L, stack) macro above.

Yup, but I'll be doing all of these as a separate sweep.

Noted. And thumbs up.

jmattsson · 2019-07-19T04:48:37Z

Oh, and Travis is unhappy with you. :)

TerryE · 2019-07-19T13:29:52Z

Oh, and Travis is unhappy with you.

@jmattsson, you can blame @marcelstoer and @galjonsfigur and #2790 for this one 😄 This does a lint-style check of all our Lua examples. This is a great idea, IMO, but we haven't as yet done the hard bit of going through all of the examples and fixing the issues, so it is supposed to be optional for now, but for some reason after I rebaselined against dev it is running in this PR. The Travis fail arises from these checks failing.

As to your other comments, I will go through them and do a separate consolidated reply for most as this will probably be the most readable for you and others.

galjonsfigur · 2019-07-19T14:33:08Z

@TerryE I just looked at the Travis log and lint check does not break the build - it only spills lots of warnings, but because of true in this line it won't break the build. The real error is here in the build log:

lflash.c: In function 'flashBlock':
lflash.c:124:3: error: format '%x' expects argument of type 'unsigned int', but argument 3 has type 'const void *' [-Werror=format=]
   NODE_DBG("flashBlock((%04x),%08x,%04x)\n", curOffset,b,size);
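
For readers unfamiliar with this class of error: %x requires an unsigned int argument, so passing a pointer trips -Werror=format. A minimal sketch of one way to make such a call well-typed follows. format_flash_block is a hypothetical stand-in built on snprintf (NODE_DBG itself is not reproduced here), and this is not necessarily the fix that was applied in lflash.c.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the debug call in flashBlock(). The pointer is
** converted explicitly via uintptr_t so that every argument matches its
** conversion specifier. On a 32-bit target the value is preserved; on a
** 64-bit host the cast keeps only the low 32 bits. */
int format_flash_block(char *out, size_t n, unsigned cur_offset,
                       const void *b, unsigned size) {
  assert(out != NULL && n > 0);
  return snprintf(out, n, "flashBlock((%04x),%08x,%04x)",
                  cur_offset, (unsigned) (uintptr_t) b, size);
}
```

An alternative is to print the pointer with %p and a (void *) cast, at the cost of an implementation-defined output format.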

TerryE · 2019-07-19T14:34:24Z

Reasoning behind some of these changes:

Tasking. We added the task interface to Lua quite late on, but the more that I use it and node.js, the more I realise that NodeMCU Lua should fully embrace the tasking paradigm. As node.js demonstrates, this isn't really an SDK vs RTOS issue, but is core to the NodeMCU execution model. The SDK vs RTOS aspect simply dictates whether the next event scheduler is provided as an external (SDK) service or is part of the platform environment (under RTOS).
We need the tasking interface to be available before the Lua environment has had a chance to initialise itself and any ROM-based libraries, so it makes sense to hoist this code into the app/platform code set, hence the rename to the platform_xxx "namespace", though since we have other namespaces such as vfs_ already as part of platform there would be no harm in retaining the task_ prefix but simply moving the service into the hierarchy. This includes decisions like renaming types like platform_task_param_t.
One thing that I want the platform headers to do is to fully encapsulate any links to SDK or RTOS headers, so that we handle these within the two platform layers and not through #ifdef-bracketed include sets in our code. Reviewers' thoughts, please.
stdin and stdout pipes are an example of how we now use tasking. The UART ISR posts a high priority task to request the UART buffer to be emptied. This task is handled by the input driver code, which buffers whole input lines and then does a stdin:write() to stash them in the stdin pipe. The pipe is read by a low priority pipe reader task internal to lua.c which processes one (extra) Lua line per invocation. Since the stashing is done at a higher priority, this will be done preferentially to starting processing the next line. What this means in practice is that it is really difficult to overflow the Lua input: you can just cut and paste large blocks of code into the UART or telnet without source data drop. For me this makes using tools like ESPlorer obsolete. I also don't bother with SPIFFSimg any more since it is just as easy to have a block of source Lua containing file.putcontents('file1.lua', [=[ ... ]=]) chunks and paste this into the interpreter to set up the FS.
Likewise the core print function / output_redirect writes to the stdout pipe and this is emptied by a CB task which reads the pipe.
luaN_call() uses this task interface as well. Like lua_pcall() it doesn't throw an error and so always returns. Internally it establishes a traceback error handler which catches any error traceback as a string and then posts a separate task to print this out / send it to the stdout pipe (or in future to an application-selectable handler if the app wants to log these errors separately). So no more PANIC reboots for modules which use luaN_call() to call their Lua CBs, just a proper error traceback or the ability to log this separately.
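
The stdin flow described above (a high priority task stashing lines, a low priority reader executing one line per invocation) can be modelled on a host machine. This is only an illustrative sketch under stated assumptions: it does not use the NodeMCU platform task API, and every name in it (post_task, run_one_task, input_task, reader_task, simulate_paste) is hypothetical. A counter stands in for the stdin pipe.

```c
#include <assert.h>
#include <stddef.h>

enum { PRIO_LOW, PRIO_HIGH, PRIO_COUNT, QUEUE_LEN = 8 };

typedef void (*task_fn)(int param);
typedef struct { task_fn fn; int param; } task_t;
typedef struct { task_t slot[QUEUE_LEN]; size_t head, count; } queue_t;

static queue_t queues[PRIO_COUNT];   /* one FIFO per priority */
static int lines_in_pipe;            /* models the stdin pipe depth */
static int lines_executed;           /* lines the "interpreter" has run */
static int peak_backlog;             /* largest pipe depth seen */

/* Queue a task; returns 0 if that priority's FIFO is full. */
int post_task(int prio, task_fn fn, int param) {
  assert(prio >= 0 && prio < PRIO_COUNT);
  queue_t *q = &queues[prio];
  if (q->count == QUEUE_LEN) return 0;
  q->slot[(q->head + q->count) % QUEUE_LEN] = (task_t){ fn, param };
  q->count++;
  return 1;
}

/* Run one pending task, highest priority first; returns 0 when idle. */
int run_one_task(void) {
  for (int prio = PRIO_COUNT - 1; prio >= 0; prio--) {
    queue_t *q = &queues[prio];
    if (q->count == 0) continue;
    task_t t = q->slot[q->head];
    q->head = (q->head + 1) % QUEUE_LEN;
    q->count--;
    t.fn(t.param);
    return 1;
  }
  return 0;
}

/* Low priority: execute exactly one line, then re-post if more remain. */
void reader_task(int unused) {
  (void) unused;
  lines_in_pipe--;
  lines_executed++;
  if (lines_in_pipe > 0) post_task(PRIO_LOW, reader_task, 0);
}

/* High priority: stash one complete line and wake the reader if it is idle. */
void input_task(int unused) {
  (void) unused;
  if (lines_in_pipe++ == 0) post_task(PRIO_LOW, reader_task, 0);
  if (lines_in_pipe > peak_backlog) peak_backlog = lines_in_pipe;
}

/* Simulate a paste of n lines (n <= QUEUE_LEN) arriving in one burst. */
int simulate_paste(int n) {
  lines_in_pipe = lines_executed = peak_backlog = 0;
  for (int i = 0; i < n; i++) post_task(PRIO_HIGH, input_task, 0);
  while (run_one_task()) { /* drain both queues */ }
  return lines_executed;
}
```

In this model, simulate_paste(5) stashes all five lines before the reader executes the first one, which is the preferential behaviour the description relies on. The real firmware buffers in a pipe rather than a counter, and its queue depths and back-pressure rules are not modelled here.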

I'll fix the minor review points that you also picked up. Thanks :)

jmattsson · 2019-07-20T03:52:43Z

When Travis is happy, I'll be happy. This is a very nice (and large) improvement - thanks for all the work Terry!

TerryE · 2019-07-20T06:28:17Z

Thanks Jonny for all of the feedback. I've tried to action most of your comments except for a few where I've given my reasons. As I say in the title, this is only phase 1. We've got a few more steps to go. However now that you've taken off #2838, I'll be pausing further work here until #2838 is landed and then I can rebase this against the new dev.

However as a general comment, we've had 3 committers contributing to this review and one other monitoring on the sidelines, so unless any of the others chip in with objections, I now view this as broadly acceptable as a general canon of work, so I will round this off and merge it in immediately after the next master drop.

devsaurus · 2019-07-20T10:45:59Z

These changes look good and are a significant step forward. Thanks a lot Terry!
Haven't had the time to actually test drive this PR though.

Can't formally approve atm with the android client here 😐

devsaurus

👍

TerryE · 2019-07-23T20:25:34Z

I've just rebaselined against the current dev, hence the forced push. Not ready for dev merge yet. More testing needed and I need to do the master drop anyway.

TerryE · 2019-07-24T17:35:59Z

@jmattsson @devsaurus @nwf, I have another big batch of changes pending, tidying up the modules. I suspect that it is going to be another couple of weeks before we eventually do the next master drops and are positioned to merge this PR. That will mean another month before we also review and do the module changes as well. Would there be any advantage in also rolling the module changes into this PR? That way we can drop a couple of weeks from the timeline.

marcelstoer · 2019-07-24T18:41:30Z

I suspect that it is going to be another couple of weeks before we eventually do the next master drops

Why? Most issues in https://github.com/nodemcu/nodemcu-firmware/milestone/13 are either completed or close to completion. If it's any help we can drop some PRs from the milestone and snap to master next week. What's your preference?

TerryE · 2019-07-24T19:16:21Z

Marcel, whenever you, Gregor and Nathaniel can make the cut ... but if these are joined, then I can roll on the prep work. It's really up to the reviewers as to whether they think this rolled up patch will be too big to review.

jmattsson · 2019-07-25T06:50:00Z

I'd prefer a second PR for the next round of clean-up, it'd be a lot easier to review. Nothing stopping you from just branching off from this PR branch and continuing work though.

TerryE · 2019-07-25T08:40:44Z

OK will do it that way. Just have to rebase twice.

jmattsson · 2019-07-25T09:30:59Z

Might not need any more rebasing. This one is looking all green at the moment, so unless there's something conflicting going in soon it should just be a couple of squash-n-merges :)

TerryE · 2019-07-26T01:18:55Z

I've got my dev-rotable-opts rebased against this PR and everything seems to be working fine, though there is little point in raising this as a PR until this one is merged, as I will want to rebase against the current dev before pushing the new PR.

marcelstoer · 2019-09-10T10:40:40Z

@TerryE can you please update or close all the referenced issues if necessary.

TerryE · 2019-09-12T22:35:32Z

@TerryE can you please update or close all the referenced issues if necessary.

On my TODO list 😄

TerryE · 2019-09-12T23:15:05Z

Done. #2808 carried forward to next PR. All others are closed.

TerryE requested review from nwf, jmattsson, pjsg, devsaurus and HHHartmann July 18, 2019 16:16

jmattsson mentioned this pull request Jul 19, 2019

LFS support for ESP32 NodeMCU #2801

Merged

4 tasks

jmattsson requested changes Jul 19, 2019

TerryE mentioned this pull request Jul 19, 2019

Major cleanup - c_whatever is finally history. #2838

Merged

4 tasks

jmattsson approved these changes Jul 20, 2019

TerryE mentioned this pull request Jul 21, 2019

Major PRs in the run up to Lua53 #2816

Closed

devsaurus approved these changes Jul 21, 2019

TerryE added 5 commits July 23, 2019 19:16

Lua 5.1 to 5.3 realignement phase 1

14c1b8f

Add telnet example

ba03cb0

Updates following JM review

bc98174

Rebased against current dev

39bb60e

Rebased against current dev, tweaks for clean compile

522b1d0

TerryE force-pushed the dev-new-lua.c branch from c7a106b to 522b1d0 Compare July 23, 2019 20:23

This was referenced Jul 23, 2019

Basic pipe support for inter-task communication #2802

Closed

Adding a coroutining example and explaining how it can be really useful #2848

Closed

This was referenced Jul 26, 2019

More ROTable optimisations #2859

Closed

Merging some lua53 changes back into app/lua #2872

Closed

TerryE mentioned this pull request Aug 24, 2019

Telnet / node.output example not working #2033

Closed

TerryE merged commit fff9f95 into nodemcu:dev Sep 7, 2019

TerryE deleted the dev-new-lua.c branch September 7, 2019 09:54

marcelstoer added this to the Next release milestone Sep 10, 2019

This was referenced Sep 12, 2019

RFC - Sorting out the input handling #2787

Closed

RFC on Tasking Inside our NodeMCU Framwork #2803

Closed

Code logic flaw in `app/driver/rotary,c #2823

Closed

matsievskiysv mentioned this pull request Nov 12, 2019

Cannot load LFS image #2959

Closed

HHHartmann mentioned this pull request Apr 21, 2020

Correct integer types in BMP085 driver #3070

Merged

2 tasks

marcelstoer mentioned this pull request May 16, 2020

Error Handling in Lua #3078

Closed

TerryE mentioned this pull request May 18, 2020

Tidy up Telnet and other documentation #3116

Closed

