Support code generators #610

Open
kubukoz opened this issue Feb 3, 2022 · 15 comments
Labels: enhancement (New feature or request)

Comments

@kubukoz
Contributor

kubukoz commented Feb 3, 2022

Is your feature request related to a problem? Please describe.

Build tools generally provide ways to generate source files before compilation. scala-cli doesn't, although it was previously mentioned that it potentially could.

Describe the solution you'd like

Some way to run an arbitrary script before every compilation, probably configured in directives or by convention (i.e. putting a script in a ./scala-scripts directory).

The interface would be basically Unit => Unit, maybe with extra context passed as environment variables.

Some possible things I'd like to see passed:

  • a way to gather all the sources used in the build (e.g. :-separated list of source paths)
  • the path to a directory where I can output files that'll be included in the compilation. Alternatively I could write to anywhere I want and these paths would be added manually in a directive. Not sure how this would work with cross-compilation.
  • pwd (or something like a workspace root, in case of bsp? I don't know how exactly that protocol works though)
  • build metadata, e.g. deps, resolved scala version, target platform (in theory I could read the directives myself, but having this passed would be much better)
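
To make the wish list above concrete, here is a minimal sketch of how a generator script could read such a context. The environment variable names (`SCALA_CLI_SOURCES`, `SCALA_CLI_GENERATED_DIR`, `SCALA_CLI_WORKSPACE`, `SCALA_CLI_SCALA_VERSION`) and the `GeneratorContext` type are made up for illustration; scala-cli does not define any of them.

```scala
import java.nio.file.{Path, Paths}

// Hypothetical context a pre-compilation script might receive.
// None of these names exist in scala-cli; they only mirror the list above.
final case class GeneratorContext(
    sources: List[Path],         // all sources used in the build
    outputDir: Path,             // where generated files should be written
    workspace: Path,             // pwd / workspace root
    scalaVersion: Option[String] // one example of build metadata
)

object GeneratorContext {
  // Build the context from a map of environment variables (sys.env in real use).
  def fromEnv(env: Map[String, String]): Either[String, GeneratorContext] =
    env
      .get("SCALA_CLI_GENERATED_DIR")
      .toRight("SCALA_CLI_GENERATED_DIR is not set")
      .map { out =>
        GeneratorContext(
          // ':'-separated list of source paths, as suggested in the issue
          sources = env
            .getOrElse("SCALA_CLI_SOURCES", "")
            .split(':')
            .toList
            .filter(_.nonEmpty)
            .map(p => Paths.get(p)),
          outputDir = Paths.get(out),
          workspace = Paths.get(env.getOrElse("SCALA_CLI_WORKSPACE", ".")),
          scalaVersion = env.get("SCALA_CLI_SCALA_VERSION")
        )
      }
}
```

A script would call `GeneratorContext.fromEnv(sys.env)` and stop with the error message if the output directory was not provided.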

Note that this would be triggered on IDE compilations as well.

There should be a way to customize the script beyond the runnable name. Ideas:

  • Passing args within the directive, e.g. //> build script echo foo bar
  • Wrapping the script with another, and using the other script in the directive. E.g. script echow runs echo foo bar, and I do //> build script echow

Unfortunately, any kind of scripts would mean that builds can't be shared via e.g. gists. Personally I'm fine with this.

Describe alternatives you've considered

Making my own build tool wrapping scala-cli :)

Additional context

Not much to write here.

@ckipp01
Contributor

ckipp01 commented Apr 24, 2023

Just to tie these together: I just had a use case where something like this would have been super useful. I wrote about it here, but to reiterate, I essentially needed a BuildInfo that held the version of my app so that it could be displayed to users. Since I use a Makefile for the project, I was able to make sure that any command ran a script before running the actual compile command. This sort of works fine until you get out of the context of the Makefile. For example, I just realized that Scala Steward can no longer run on my project because of this. Having something like the approach outlined by @kubukoz would really help in situations like this.
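
For illustration, a BuildInfo generator of the kind described here could be as small as the sketch below. The object name `BuildInfoGenerator`, the output file name, and the shape of the generated object are assumptions, not the script actually used in that project.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path}

// Hypothetical generator: renders and writes a BuildInfo object holding the app version.
object BuildInfoGenerator {
  // Render the source text, escaping backslashes and quotes in the version string.
  def render(version: String): String = {
    val escaped = version.replace("\\", "\\\\").replace("\"", "\\\"")
    s"""// Generated file, do not edit.
       |object BuildInfo {
       |  val version: String = "$escaped"
       |}
       |""".stripMargin
  }

  // Write BuildInfo.scala into outputDir, creating the directory if needed.
  // Returns the path of the written file.
  def write(outputDir: Path, version: String): Path = {
    Files.createDirectories(outputDir)
    val target = outputDir.resolve("BuildInfo.scala")
    Files.write(target, render(version).getBytes(StandardCharsets.UTF_8))
  }
}
```

The point of the feature request is that a step like `BuildInfoGenerator.write(...)` would run as part of every compilation, so tools that invoke scala-cli directly would see the generated file too.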

@slabuz

slabuz commented May 8, 2023

Hi @kubukoz

After a little cooperation with the scala-cli team, I've come up with a proposal on how to define code generators. You can find it here https://gist.github.com/slabuz/b66432d9c71dd100d193617754c79911. In my proposal, I focused on providing a code generator for protobuf. Let me extend it with a few words of commentary.

Where do generators come from?
The idea is to include some of the popular generators in scala-cli and keep extending the library. In the final version, users will be able to provide new generators locally or from gist.

How are they written?
They can be written using any version of scala, not necessarily the same as the main code version. They can use external dependencies, just like normal scala-cli code.

When do they run?
The code generation step will be an obligatory part of the compilation process, run before it to make sure that all generated sources are in place and up to date. In addition, code generation can be triggered automatically when using code editors such as IntelliJ. For those writing code without such tools, a new step is added, scala-cli generate, as shown in the example.

How is the code structured?
At this point, we have identified 2 main aspects of the generator API. The first is a way for the generator to describe itself, giving the most useful data. In my example it's just a JSON, but in the end there will be a case class definition that every generator will need to instantiate and return. The second part of the API is an actual method to generate source code, given the source file and output location.
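
The two aspects described above could be sketched roughly as follows. All names here (`GeneratorInfo`, `CodeGenerator`, `inputExtensions`, `applicable`) are hypothetical and only illustrate the shape of such an API; the actual definitions are in the linked proposal, not in this thread.

```scala
import java.nio.file.Path

// Part 1: how a generator describes itself (hypothetical fields).
final case class GeneratorInfo(
    name: String,
    inputExtensions: List[String] // e.g. List("proto") for a protobuf generator
)

trait CodeGenerator {
  // Self-description, a case class instead of the JSON used in the example.
  def info: GeneratorInfo

  // Part 2: generate sources for one input file into outputDir,
  // returning the files that were written.
  def generate(source: Path, outputDir: Path): List[Path]
}

object CodeGenerator {
  // Pick the generators whose declared extensions match the given source file.
  def applicable(generators: List[CodeGenerator], source: Path): List[CodeGenerator] = {
    val fileName = source.getFileName.toString
    generators.filter(_.info.inputExtensions.exists(ext => fileName.endsWith("." + ext)))
  }
}
```

Having the description as plain data lets the tool decide which generator to run for which file without executing any generator code first.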

We are open to any constructive criticism and suggestions on how to make this solution even better ;)

@bishabosha
Contributor

Note that bloop also has integrated support for execution of source generators - and it is aware of project dependencies and is cached: scalacenter/bloop#1774, scalacenter/bloop#1819, scalacenter/bloop#1784

@tgodzik
Member

tgodzik commented May 8, 2023

> Note that bloop also has integrated support for execution of source generators - and it is aware of project dependencies and is cached: scalacenter/bloop#1774, scalacenter/bloop#1819, scalacenter/bloop#1784

Yes, the plan would be to use that.

@kubukoz
Contributor Author

kubukoz commented May 9, 2023

The plan looks great, would love to see it :) let me know once I can try integrating https://github.com/disneystreaming/smithy4s/

@przemek-pokrywka

I think that code generation is clearly beyond the ideal scope of scala-cli, because it makes it very difficult to define a clear feature set for the tool. Clear definitions are essential for anyone who comes to learn about stuff. An ocean of idiosyncrasies is the worst thing to confront.

Maybe if the tool allows for a hook, like in Cargo (https://web.mit.edu/rust-lang_v1.25/arch/amd64_ubuntu1404/share/doc/rust/html/cargo/reference/build-scripts.html#build-scripts - the script would need to be Scala to exclude OS differences etc) then the damage could be contained. But, again, what is the new clear definition of the scala-cli?
How do you explain it briefly to newcomers / other-lang-refugees?

@przemek-pokrywka

To add some constructiveness to the criticism above, in my opinion it would be good to make scala-cli a well-behaving component of arbitrary systems larger than itself. I'm thinking of things like Bazel or Nix.

@Luka-J9

Luka-J9 commented May 16, 2023

I'd love to see this feature. I would look at how Rust/Bleep designed their solutions. Bleep is especially interesting because it has some interop with sbt plugins. From a design perspective I also like how configuration is handled there, as a one-liner with lots of configuration (which is what I understand the current proposal to be) can end up being cumbersome. However, I also understand that file formats like yaml/toml would be a larger departure from how scala-cli currently functions (although it might be worth revisiting in this light?)

I disagree with the notion that this is outside the ideal scope of scala-cli, to me it seems like a natural progression. And wedging it into a larger system like Nix or Bazel raises the barrier to entry for newcomers in an unnecessary way in my opinion.

Currently scala-cli has the concept of exporting to Mill/Sbt for when build requirements become sufficiently complex that scala-cli is no longer the appropriate tool. Adding code generation would allow users to stay on scala-cli for longer before needing to resort to such an option. While ejecting into a different build tool is a fine option for those of us who are familiar with sbt or mill, from the perspective of a newcomer I think it would be frustrating to have to learn a completely different tool to achieve functionality like "I want to generate code from my protobuf" or "I want access to BuildInfo."

@Gedochao added this to the Scala CLI 1.1.0 milestone May 24, 2023
@He-Pin

He-Pin commented Sep 8, 2023

Will it support protobuf code generation?

@tgodzik
Member

tgodzik commented Sep 18, 2023

Yes, this is the intention and basic feature we want to support

@bishabosha
Contributor

bishabosha commented Jan 30, 2024

I would like to propose this as a GSOC project under Scala org, if anyone wants to object

Edit: It is now being worked on by Rizky Maulana @Perklone

@przemek-pokrywka

Hi, seeing the "manual code-gen directive" in action changed my mind as it's much better to support the popular use cases in a standard way rather than forcing users to hack their workarounds.

It would be indeed very helpful to have the ability to depend on code that would be generated in the process of building the script/app.

The main question would be how to implement it in a sound, pragmatic, and ergonomic way. If we tried to formalize @WojciechMazur's proto-directive (naming mine), it might look like this:

//> generate --channel https://disneystreaming.github.io/coursier.json smithy4s generate --dependencies com.disneystreaming.smithy:aws-dynamodb-spec:2023.02.10 -o ./handlers/wildrides

so (provided the code generator exists somewhere as a binary) the main Scala-CLI script could even stay stand-alone / as a gist, easily copy-and-paste-able wherever necessary.

If we wanted the code generation to be sound, I'd propose to treat the generator's output directory as a dependency (writable by the generator only).

There are multiple questions about the interface exposed by the generator to Scala-CLI. How would Scala-CLI know what is the output directory etc?

@bishabosha
Contributor

bishabosha commented Jul 15, 2024

As I understand it, the directives syntax is very limited. It would be nice to support a directive with multiple fields, not just "list of strings, where each string has its own custom dsl", where multiple directives all collapse into the same list of strings.

Edit: after investigating this, the base parser itself does collapse repeated directives (e.g. 1 per line) into a single list of strings, so separation isn't possible without some custom logic.

@Perklone

This is a concept for supporting source generators that I had in mind, also discussed with @bishabosha and @kannupriyakalra.

The idea is to use the source generator via directives that would look something like this:

//> using sourceGenerator "${.}/source-generator-input|glob:test.in|python3 ${.}/source-generator-1.py"

the format for the directive is

//> using sourceGenerator inputDirectory|glob|commandProcessor
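
As a rough illustration of that three-field format, the sketch below splits a directive value on `|`. The `SourceGeneratorSpec` type and its `parse` method are hypothetical and are not the code from the draft PR; the sketch leaves `${.}` unexpanded and splits the command naively on whitespace, without any quoting support.

```scala
// Hypothetical model of the inputDirectory|glob|commandProcessor value.
final case class SourceGeneratorSpec(
    inputDirectory: String, // kept verbatim, `${.}` is not expanded here
    glob: String,           // kept verbatim, including any "glob:" prefix
    command: List[String]   // command processor split on whitespace
)

object SourceGeneratorSpec {
  // Parse a directive value; exactly three '|'-separated fields are expected.
  def parse(value: String): Either[String, SourceGeneratorSpec] =
    value.split("\\|", -1).toList match {
      case dir :: glob :: cmd :: Nil if dir.nonEmpty && cmd.trim.nonEmpty =>
        Right(SourceGeneratorSpec(dir, glob, cmd.trim.split("\\s+").toList))
      case _ =>
        Left("expected inputDirectory|glob|commandProcessor, got: " + value)
    }
}
```

Packing three fields into one string is what makes a small parser like this necessary, which relates to the earlier remark about directives being limited to lists of strings.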

This solves a few points that are addressed in the issue:

> Some way to run an arbitrary script before every compilation, probably configured in directives or by convention (i.e. putting a script in a ./scala-scripts directory).

Since this uses directives, the generator will be run before every compilation. However, due to the nature of bloop's caching mechanism, the result is cached if you run an identical command after the first compilation.

> a way to gather all the sources used in the build (e.g. :-separated list of source paths)

By using directives, we can use multiple lines of such directives to gather all of the sources that are needed for the compilation.

Let me know what you think, I have created a draft PR where you could try it out. Sorry that it's not fully fleshed out yet, but will be working on improving it 👍

@bishabosha
Contributor

bishabosha commented Jul 17, 2024

> This will be using directives so it will be run before every compilation, but due to the nature of bloop caching mechanism, it will cache if you run an identical command after the first compilation.

So this seems to be because bloop treats the command as not mutable - so if instead of arbitrary commands, we fix the command to be something that is checked every time - such as a scala-cli command on a scala source file - then that should go away.

But also it might be a design decision that actually source generators should be published in a library, so they have a version number and shouldn't change
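
The staleness concern can be pictured with a small sketch: a key derived from the command line alone stays the same when the script is edited, while a key that also covers the script's bytes changes. This is only an illustration of the idea; `GeneratorFingerprint` is a made-up name and this is not how bloop computes its cache.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path}
import java.security.MessageDigest

// Hypothetical fingerprinting helpers, for illustration only.
object GeneratorFingerprint {
  private def hex(bytes: Array[Byte]): String =
    bytes.map(b => f"${b & 0xff}%02x").mkString

  // Key from the command line only: unchanged when the script's contents change.
  def commandOnly(command: List[String]): String = {
    val md = MessageDigest.getInstance("SHA-256")
    command.foreach { arg =>
      md.update(arg.getBytes(StandardCharsets.UTF_8))
      md.update(0.toByte) // separator so List("ab") and List("a", "b") differ
    }
    hex(md.digest())
  }

  // Key from the command line plus the bytes of the given files
  // (for example the generator script and its inputs).
  def withContents(command: List[String], files: List[Path]): String = {
    val md = MessageDigest.getInstance("SHA-256")
    md.update(commandOnly(command).getBytes(StandardCharsets.UTF_8))
    files.sortBy(_.toString).foreach(f => md.update(Files.readAllBytes(f)))
    hex(md.digest())
  }
}
```

Publishing generators as versioned libraries, as suggested above, sidesteps the problem differently: the version number in the coordinates becomes the thing that changes.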

Projects
Status: In progress

10 participants