Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

boards: Set stack size from within apps #180

Closed
wants to merge 2 commits into from

Conversation

alistair23
Copy link
Collaborator

Stack size isn't board specific, but more app specific. So let's set the
stack size from within apps.

This follows what TockOS does and sets a buffer for the stack. We then
parse that buffer when running elf2tab to use that value.

Signed-off-by: Alistair Francis alistair.francis@wdc.com

@torfmaster
Copy link
Collaborator

I have to say that I find this quite inconvenient to use. Firstly, it depends on convention and it is not clear why a properly compiling app fails because of not enough stack. Secondly, I don't want to bother the user with such details.
My proposal would be to

  1. set a "one size fits all" stack for standard applications
  2. find a way to opt-in a different stack size which is less fragile (using environment variables and macros for examples).

@alistair23
Copy link
Collaborator Author

not clear why a properly compiling app fails because of not enough stack

This is the exact problem we have today. I usually test with a larger stack size as I have seen multiple times where the default 2K stack isn't large enough.

I don't want to bother the user with such details.

They already have to think about the stack. To fit apps on the HiFive1 you need to shrink the stack size, but to run complex examples you need to increase it.

There is no one size fits all size.

Do we really want to spend 2K of stack to blink LEDs? That seems a little big and makes the example too big to fit on the HiFive1.

On top of that we are also starting to see example apps that need stack sizes larger then 2048. What then? Every app has to have a 3K stack just for one large example?

I tried a few different ways and this is the least fragile. I can't find anyway to use environment variables in a linker script. I tried using sed to edit the linker script, but that doesn't really get you anything. How would Rust macros be used to edit the linker script?

The stack usage changes per app, not per board so it makes more sense to have an app set it. It's now clearly set out for every app. If start with the hello_world app and start changing it you can then easily increase the stack size. It's more obvious that it is something that you have to worry about now, instead of being hidden away.

@jrvanwhy
Copy link
Collaborator

I think we should have per-app linker scripts that define STACK_SIZE then #include the per-board linker script.

Unfortunately, because all the examples live in a single directory, I think they need to share a linker script. I think we should either move larger examples out to their own directory or we should optimize until stack usage isn't a problem for any of the examples.

@alistair23
Copy link
Collaborator Author

I don't see how a linker script per app is any easier. It just means there is another place to edit values.

Stack usage is always going to be somewhat of a problem though. It will take less stack to blink an LED then to do complex computation and send it over Bluetooth. A per app stack avoids wasting space on the simpler apps.

@torfmaster
Copy link
Collaborator

I can't find anyway to use environment variables in a linker script.

I see to ways here:

  • extend the build.rs to generate the linker script
  • use a proc macro to move your approach (if it works well) somewhere inside libtock-rs (you can read env-variables inside proc macros)

I think 2) is trivial, 1) is more involved but would prepare the "slots" approach for relocation if it is still the plan to use it.

@torfmaster
Copy link
Collaborator

Do we really want to spend 2K of stack to blink LEDs? That seems a little big and makes the example too big to fit on the HiFive1.

This sounds to me that the stack for the examples should be limited per board for now.

@torfmaster
Copy link
Collaborator

They already have to think about the stack.

There are some users that have to do it. On the nrf52dk I never had stack issues and I think users with more powerful boards will never think about stack sizes.

@torfmaster
Copy link
Collaborator

It's more obvious that it is something that you have to worry about now, instead of being hidden away.

That's how I think about configuration: If there's a default value - don't show it. Configuration is much more maintainable when you see only the non-default values.
In principle stack issues can be reported: On compile time by static analysis and on run time by the MPU. So if the stack is too small, the users can take measures (and we document how to do it). I think we should put the stack size in every example just that the user is aware that could be an issue. Otherwise we could do that with the heap size as well.

@Woyten
Copy link
Contributor

Woyten commented May 15, 2020

* extend the build.rs to generate the linker script

This is exactly what I was thinking about in the past. There are some great templating engines. Why not use one of them in the build.rs?

@ppannuto
Copy link
Member

Sorry to be late to this, but I think libtock-c already does what you want?

The Makefile picks up the environment variable STACK_SIZE and then passes it to the linker via gcc ... -Xlinker --defsym=STACK_SIZE=2048 which defines a symbol used by the generic linker script.

From some quick googling, it looks like the llvm linker also supports the defsym flag, so you should be able to just do the same thing I think?

@torfmaster
Copy link
Collaborator

@ppannuto: I think this depends on rust-lang/cargo#7811.

@alistair23
Copy link
Collaborator Author

alistair23 commented May 15, 2020

So the problem with build.rs generating the linker script is that build.rs runs first so then there is no way to have an app override the size. For stack size you don't really get anything expect maybe an environment variable to set the size, but that is more confusing for users.

A macro has the same problem. There isn't really a way to pass a macro from an app to the linker script.

@torfmaster you are right that large boards won't really have an issue. You can just set the stack higher then you need and not worry about it. It is wasting space, but maybe that doesn't matter. The problem is there are smaller boards where this is a problem.

The MPU errors aren't always obvious. Especially on RISC-V boards without PMP (the MPU equivalent) like the HiFive1 revA. OpenTitan doesn't have working PMP implementation today, so there is no reporting if you overgrow the stack.

@torfmaster
Copy link
Collaborator

torfmaster commented May 15, 2020

A macro has the same problem. There isn't really a way to pass a macro from an app to the linker script.

You can just create an array as you did using a proc macro without touching the linker script. Inside you can read whatever you like: variables, files, ... I don't want to imply that I prefer this solution. But I prefer it over putting arrays into the examples for non-obvious reasons.

Edit: typo

@alistair23
Copy link
Collaborator Author

Ok, I created an issue to track this: #181

I'll leave the macro or linker script generation up to you as I don't see how it would work.

Copy link
Contributor

@bradjc bradjc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that apps here don't have an associated makefile, this seems like a pretty reasonable method for setting the stack size.

@torfmaster
Copy link
Collaborator

@alistair23: Here's one issue I found while implementing the dynamic linker script generation. How do you plan to pass the stack size used in your apps to elf2tab?

@torfmaster
Copy link
Collaborator

@alistair23: Here's one issue I found while implementing the dynamic linker script generation. How do you plan to pass the stack size used in your apps to elf2tab?

I see, you are reading it from the artifact. I'm leaving this comment here for reference.

@torfmaster
Copy link
Collaborator

When I forget to create a stack buffer inside an example I get the following error message

error: The argument '--stack <STACK_SIZE>' requires a value but none was supplied

plus the usage of elf2tab. I think this should be replaced by an error message explaining what failed and why.

@alistair23
Copy link
Collaborator Author

How would an error message be generated?

@bradjc
Copy link
Contributor

bradjc commented Jun 24, 2020

I happen to be a big fan of this change because of how it causes the segments to be created in the ELF. On master:

$ readelf.py -l target/thumbv7em-none-eabi/release/examples/blink

Elf file type is EXEC (Executable file)
Entry point is 0x30081
There are 6 program headers, starting at offset 52

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD           0x001040 0x00030040 0x00030040 0x00980 0x00980 R E 0x1000
  LOAD           0x002800 0x20004800 0x000309c0 0x00000 0x00004 RW  0x1000
  LOAD           0x0029c0 0x000309c0 0xe005cb80 0x00010 0x00010 R E 0x1000
  GNU_RELRO      0x002800 0x20004800 0x000309c0 0x00000 0x00000 R   0x1
  GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0x0
  ARM_UNWIND     0x0029c0 0x000309c0 0xe005cb80 0x00010 0x00010 R   0x4

 Section to Segment mapping:
  Segment Sections...
   00     .crt0_header .text
   01     .bss
   02     .endflash .ARM.exidx
   03
   04
   05     .endflash .ARM.exidx

With this PR and the following diff:

diff --git a/layout_generic.ld b/layout_generic.ld
index 7536a3b..c5b5130 100644
--- a/layout_generic.ld
+++ b/layout_generic.ld
@@ -85,7 +85,7 @@ SECTIONS {
     } > FLASH =0xFF

     /* Application stack */
-    .stack (NOLOAD) :
+    .stack :
     {
          _sstack = .;
          KEEP(*(.stack_buffer))
$ readelf.py -l target/thumbv7em-none-eabi/release/examples/blink

Elf file type is EXEC (Executable file)
Entry point is 0x30081
There are 7 program headers, starting at offset 52

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD           0x001040 0x00030040 0x00030040 0x00980 0x00980 R E 0x1000
  LOAD           0x002000 0x20004000 0x20004000 0x00400 0x00400 RW  0x1000
  LOAD           0x002400 0x20004400 0x000309c0 0x00000 0x00004 RW  0x1000
  LOAD           0x0029c0 0x000309c0 0xe005cf80 0x00010 0x00010 R E 0x1000
  GNU_RELRO      0x002400 0x20004400 0x000309c0 0x00000 0x00000 R   0x1
  GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0x0
  AARCH64_UNWIND 0x0029c0 0x000309c0 0xe005cf80 0x00010 0x00010 R   0x4

 Section to Segment mapping:
  Segment Sections...
   00     .crt0_header .text
   01     .stack
   02     .bss
   03     .endflash .ARM.exidx
   04
   05
   06     .endflash .ARM.exidx

When the stack has a section mapped to it (as in this PR), and is not marked as no load, we get a RW segment at address 0x20004000, which is where the blink binary expects its RAM region to start. This allows elf2tab to find this, and add headers to the TBF indicating this fixed, required address so the kernel can display an error if it is unable to meet this requirement. I think this would help a lot with catching any address misalignment issues early.

With the stack marked NOLOAD, the stack is not included in a segment, and there is no segment at the intended address (0x20004000).

With the stack only set by the STACK_SIZE variable (i.e. no section assigned to it) but not marked NOLOAD, there is a segment at the correct address, but it for some reason gets marked R E, which then elf2tab cannot differentiate from the flash region.

@bradjc
Copy link
Contributor

bradjc commented Jun 24, 2020

After more digging, the incompatibilities between llvm and gcc make even having this PR merged not enough to do automatic RAM address detection without a bit of guesswork and heuristics (at least as far as I can tell).

I still think that we need to be able to set the stack on a per-app basis, however.

@alistair23
Copy link
Collaborator Author

Ping! This is an even bigger problem today now that we have more complex examples such as CTAP.

This has been reviewed by someone and ACKed by a few people. If I rebase it can it be merged?

@torfmaster
Copy link
Collaborator

I'm strictly against this solution as I already mentioned. It makes the library hard to maintain for no good reason.

@jrvanwhy
Copy link
Collaborator

I'm not enthused about the mechanism this PR uses to specify stack sizes (I think it is really ugly), but I recognize that there is a problem here and this is the only PR that offers a solution. #184 allows the stack size to be set via environment variable -- which I prefer in general -- but that doesn't work for libtock-rs' examples.

As such, I think this should be merged despite its uglyness.

@jrvanwhy
Copy link
Collaborator

I'm for merging this if it is rebased.

@ppannuto
Copy link
Member

Agreed that it's not the prettiest, but it also matches how we set stack size for boards in the kernel, so there's some consistency there at least.

Biggest win for me is that changing stack size used to require changing things in two places (linker script and flash script argument to elf2tab), but now there's just one authoritative stack size location.

I think this is a go for me for now, with the understanding that we're open to a another solution if/when one presents itself later.

@torfmaster
Copy link
Collaborator

As a user I wouldn't use the library anymore if I had to manipulate such low-level-details in such a way - it feels to much like C. As I said: we should create a separate CTAP crate and use a feasible solution for setting the stack size - but if you want to make this library unsable for users exept some few people from WDC and google - go for it.

@jrvanwhy
Copy link
Collaborator

As a user I wouldn't use the library anymore if I had to manipulate such low-level-details in such a way - it feels to much like C. As I said: we should create a separate CTAP crate and use a feasible solution for setting the stack size - but if you want to make this library unsable for users exept some few people from WDC and google - go for it.

Your concern is that the following 4 lines are too low-level to be in an application, correct?

/// Dummy buffer that causes the linker to reserve enough space for the stack.
#[no_mangle]
#[link_section = ".stack_buffer"]
pub static mut STACK_MEMORY: [u8; 0x400] = [0; 0x400];

Other than a stylistic objection, do you have a reason why this solution is unworkable? If not, then "unusable for users except some few people from WDC and google" is an exaggeration.

This PR isn't a code style improvement, it is a fix to allow us to continue developing and testing libtock-rs. Our ability to continue developing libtock-rs is more important than style here.

@alistair23
Copy link
Collaborator Author

I'm all agreed it's ugly. If someone has a nice way to improve it please send a patch and I'll happily review it.

I have rebased this PR, it's failing again from the OT boot ROM being out of date. Just waiting on my Tock PR to be merged: tock/tock#2151

@alistair23 alistair23 force-pushed the alistair/stack-setup branch 2 times, most recently from 5c724d9 to 62d5979 Compare October 14, 2020 21:09
@hudson-ayers
Copy link
Contributor

https://github.com/hudson-ayers/libtock-rs/tree/stack-setup-2 contains a patch that maintains a default buffer size, but lets apps that want a different size than the default change it by passing the desired stack size as an argument to #[libtock::main].

E.g. most apps stay exactly as they are today, but if an app wants a bigger stack size it uses #[libtock::main(0x2000)] or whatever.

Thoughts?

@alistair23
Copy link
Collaborator Author

That works for me.

@jrvanwhy
Copy link
Collaborator

I'm going to see if I can generate that symbol from a macro_rules! macro rather than a procedural macro. I want to avoid procedural macros as much as possible. That said, the procedural macro has the advantage of allowing libtock-rs to provide a default stack size.

@hudson-ayers
Copy link
Contributor

I'm going to see if I can generate that symbol from a macro_rules! macro rather than a procedural macro. I want to avoid procedural macros as much as possible. That said, the procedural macro has the advantage of allowing libtock-rs to provide a default stack size.

Yeah, I went the procedural macro approach just because we already are using one to declare main, and it seemed sensible that setting a stack size be associated with declaring main - each can only be done once. Also this prevents having to add any new code to apps that want to keep a default.

@hudson-ayers
Copy link
Contributor

I updated my branch so the syntax is now
#[libtock::main(stack_size = 4096)] to make it more explicit what is happening when a non-default size is used. I also added checks for error cases like the stack size not being a multiple of 8.

@jrvanwhy
Copy link
Collaborator

I'm going to see if I can generate that symbol from a macro_rules! macro rather than a procedural macro. I want to avoid procedural macros as much as possible. That said, the procedural macro has the advantage of allowing libtock-rs to provide a default stack size.

Yeah, I went the procedural macro approach just because we already are using one to declare main, and it seemed sensible that setting a stack size be associated with declaring main - each can only be done once. Also this prevents having to add any new code to apps that want to keep a default.

Yeah it makes sense for async main functions, but in libtock_core I don't have the procedural macro. Those mains are synchronous.

Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Stack size isn't board specific, but more app specific. So let's set the
stack size from within apps.

This follows what TockOS does and sets a buffer for the stack. We then
parse that buffer when running elf2tab to use that value.

Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
@hudson-ayers
Copy link
Contributor

Yeah it makes sense for async main functions, but in libtock_core I don't have the procedural macro. Those mains are synchronous.

Gotcha. If we have a create_stack_buffer!(0x800) macro, maybe it could be called from within the procedural macro for examples with an async main, and called directly from main for those without? Its less uniform, but hopefully still addresses @torfmaster concerns by leaving the interface unchanged for async examples.

@jrvanwhy
Copy link
Collaborator

Yeah it makes sense for async main functions, but in libtock_core I don't have the procedural macro. Those mains are synchronous.

Gotcha. If we have a create_stack_buffer!(0x800) macro, maybe it could be called from within the procedural macro for examples with an async main, and called directly from main for those without? Its less uniform, but hopefully still addresses @torfmaster concerns by leaving the interface unchanged for async examples.

That seems like it should be possible. I just opened #244 which demonstrates the macro_rules! approach, in case we want to pursue that approach further.

@hudson-ayers
Copy link
Contributor

I opened #245 which combines #244 with the approach in my branch. I am personally fine with either being merged, but wanted to at least propose a concrete version that sets the stack size automatically for async apps.

@torfmaster
Copy link
Collaborator

This PR isn't a code style improvement, it is a fix to allow us to continue developing and testing libtock-rs. Our ability to continue developing libtock-rs is more important than style here.

I disagree that this is necessary to actually develop libtock-rs it's rather necessary to make an example run on a specific board without creating a separate crate. I would say what we are trying to do in the examples is far beyond the idea of rust examples and we should just stop thinking that examples give us the complete flexibility of binary crates - they don't.

@torfmaster
Copy link
Collaborator

I want to avoid procedural macros as much as possible.

Is there a reason for that? Declarative macros have the disadvantage that

  • it is hard to validate the input
  • they almost not testable
  • they are hard to read and hard to maintain

Do they have advantages at all?

@torfmaster
Copy link
Collaborator

Other than a stylistic objection, do you have a reason why this solution is unworkable? If not, then "unusable for users except some few people from WDC and google" is an exaggeration.

From a design perspective libtock-rs moves from "there is a library to write safe embedded code in a high level manner - fearlessly" to "it's just another embedded library you always have to take care about all the low-level-details and if you don't you are screwed". Prior to this PR it was possible to write high level code on your microcontroller, now you have to specify link sections, things a Rust programmer may not have done before (and does not want to do).

The other objections is that this is error prone: alignment is not checked, error messages are not really helpful if you forget to specify the link section. It's exactly opposed to the Rust philosophy: If it compiles - it works. Now you would have to copy and paste the examples and fiddle in the code. If something doesn't work you have to search for the error in hundrets of lines of Makefiles and bash scripts - which are untested and ill-documented - exactly as in C embedded development.

@torfmaster
Copy link
Collaborator

And I still don't understand why we don't choose a solution like #184. I know it is a little bit of work to be done but I also took a lot a work of me in my spare time with that without beeing interested in the topic of setting stack sizes because I care about the quality of the library. My forecast is: in 6 months we'll be again discussion about workarounds resulting from the limitations of "examples" in Rust crates and we'll have another hacky workaround to fix that instead of finally making libtock-rs a crate which is usable as such (and not a playground).

@hudson-ayers
Copy link
Contributor

Is there a reason for that? Declarative macros have the disadvantage that

  • it is hard to validate the input
  • they almost not testable
  • they are hard to read and hard to maintain
  • Do they have advantages at all?

I don't speak for Johnathan, but I can think of a few. Declarative macros are hygenic, and procedural macros are not. Procedural macros have some security concerns: https://doc.rust-lang.org/reference/procedural-macros.html. Procedural macros require the use of crates not contained in std.

From a design perspective libtock-rs moves from "there is a library to write safe embedded code in a high level manner - fearlessly" to "it's just another embedded library you always have to take care about all the low-level-details and if you don't you are screwed". Prior to this PR it was possible to write high level code on your microcontroller, now you have to specify link sections, things a Rust programmer may not have done before (and does not want to do).

For this PR as it stands I somewhat see this perspective, but for #244 , for instance, I think that asking novice users to include stack_size!(2048) at the top of their main file is not very different from asking novice users to annotate their main function with #[libtock::main]. A user may not know why this is necessary, but all examples do it and documentation can indicate it is necessary. I would be surprised if this single macro line that most users can ignore was the difference maker in discouraging a Rust programmer from using this crate.

The other objections is that this is error prone: alignment is not checked, error messages are not really helpful if you forget to specify the link section.

This seems pretty easily resolvable with an additional assert in the linker. The alignment is already checked at compile time, we could add a check that the .stack_buffer symbol is defined and print an error otherwise.

It's exactly opposed to the Rust philosophy: If it compiles - it works. Now you would have to copy and paste the examples and fiddle in the code. If something doesn't work you have to search for the error in hundrets of lines of Makefiles and bash scripts - which are untested and ill-documented - exactly as in C embedded development.

The code is linked at compile time, so an assertion in general_layout.ld should suffice to enforce that if code compiles it works. You already have to copy the examples to some extent in order to get the async runtime to work, or to figure out how to properly print (up until the new println!() approach was merged). To some extent users cannot ignore the fact that they are developing on limited platforms, a user today could easily write code that overflows the stack, and currently they have no way to address that. We also already support boards like the HiFive where flashing multiple apps fails because the stack memory requirements are too high for even extremely simple examples like blink.

And I still don't understand why we don't choose a solution like #184.

The biggest issue I see with #184 is that it does not allow setting a stack size for a particular example, something we support in libtock-c and use in several places. I disagree that any app more complex than the basic examples currently in libtock-rs should have to live in a separate crate from the other examples, I think doing so represents a usability cost in itself. Also, using environment variables is somewhat messy when flashing multiple apps in a row or when writing a script that automatically flashes different apps.

My forecast is: in 6 months we'll be again discussion about workarounds resulting from the limitations of "examples" in Rust crates and we'll have another hacky workaround to fix that instead of finally making libtock-rs a crate which is usable as such (and not a playground).

I'm a little confused as to what you are asking for here; do you think that no examples should live in the main libtock-rs crate, and each example should live in its own crate?

@jrvanwhy
Copy link
Collaborator

Is there a reason for that? Declarative macros have the disadvantage that

  • it is hard to validate the input
  • they almost not testable
  • they are hard to read and hard to maintain
  • Do they have advantages at all?

I don't speak for Johnathan, but I can think of a few. Declarative macros are hygenic, and procedural macros are not. Procedural macros have some security concerns: https://doc.rust-lang.org/reference/procedural-macros.html. Procedural macros require the use of crates not contained in std.

Those crates (quote and syn, IIRC) are an auditing issue (they are large) and have been causing me pain in tock-on-titan ever since we added procedural macros to libtock-rs. I intend to remove libtock_core's dependency on them if possible.

We won't be using futures in the long run so we eventually won't have the #[libtock::main] line.

@torfmaster
Copy link
Collaborator

torfmaster commented Oct 16, 2020

We won't be using futures in the long run so we eventually won't have the #[libtock::main] line.

You mean, libtock-core won't be using futures right?

@jrvanwhy
Copy link
Collaborator

jrvanwhy commented Oct 16, 2020

We won't be using futures in the long run so we eventually won't have the #[libtock::main] line.

You mean, libtock-core won't be using futures right?

I intend to add some futures support to libtock_core, but tock-on-titan (and most likely other OpenTitan-related codebases) won't use them. I wasn't referring to libtock-rs in general.

bors bot added a commit that referenced this pull request Oct 22, 2020
244: Make binaries specify their stack size using a new stack_size!{} macro. r=alistair23 a=jrvanwhy

Prior to this PR, it was not possible for binaries to specify their own stack size. libtock-rs had a hardcoded stack size of 2KiB.

This is an alternative to #180, which had some pushback for requiring `libtock-rs`'s users to write low-level code.

Co-authored-by: Johnathan Van Why <jrvanwhy@google.com>
@alistair23 alistair23 closed this Oct 22, 2020
@alistair23
Copy link
Collaborator Author

Closing as this was addressed by: #244

@alistair23 alistair23 deleted the alistair/stack-setup branch November 20, 2020 01:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants