Enable definition for generated interpreter cases to be composed from multiple files #102021

Closed
jbower-fb opened this issue Feb 18, 2023 · 1 comment
Labels
type-feature A feature request or enhancement

Comments

@jbower-fb
Contributor

jbower-fb commented Feb 18, 2023

Feature or enhancement

Allow the generate_cases.py script to accept multiple input files, behaving mostly as if the input files were concatenated. Additionally, allow an existing instruction definition to be explicitly overridden using a new override keyword.

I attach a PR with a proposed initial implementation.

Pitch

In Cinder we add a number of new instructions to support features such as Static Python, and we currently carry a few tweaks to existing instructions. When we migrate to the new upstream generated interpreter, we would prefer to avoid changing the core bytecodes.c and to keep our own definitions and changes separate. As well as easing upstream merges, this would let us avoid copying or forking more than we need to provide Cinder features in a standalone module.

I've made an initial implementation which allows extra files to be passed to generate_cases.py by repeated use of the -i argument. E.g.:

$ generate_cases.py -i bytecodes.c -i cinder-bytecodes.c -o generated_cases.c.h

This mostly behaves as if the input files were concatenated, except that parsing only takes place between the BEGIN/END BYTECODES markers in each file. We also reuse bookkeeping that mostly already exists to track which input file each definition comes from, so that errors can name the right file.
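To make the multi-file behaviour concrete, here is a minimal sketch of the idea, not the code from the attached PR. It assumes the markers are spelled `// BEGIN BYTECODES //` and `// END BYTECODES //`, and the names `Section`, `extract_section`, `load_sections` and `parse_args` are hypothetical. Only the `-i` and `-o` flags come from the command shown above.

```python
import argparse
from dataclasses import dataclass
from pathlib import Path

# Assumed marker spellings; the issue only names them "BEGIN/END BYTECODES".
BEGIN_MARKER = "// BEGIN BYTECODES //"
END_MARKER = "// END BYTECODES //"


@dataclass
class Section:
    filename: str    # input file the text came from (used in error messages)
    start_line: int  # 1-based line on which the section text starts
    text: str        # raw text between the two markers


def parse_args(argv):
    """Accept -i any number of times; each use appends one input file."""
    parser = argparse.ArgumentParser()
    parser.add_argument("-i", dest="inputs", action="append", default=[])
    parser.add_argument("-o", dest="output", required=True)
    return parser.parse_args(argv)


def extract_section(filename, source):
    """Return only the text between the markers, remembering its origin."""
    begin = source.find(BEGIN_MARKER)
    if begin < 0:
        raise ValueError(f"{filename}: missing {BEGIN_MARKER!r}")
    start = begin + len(BEGIN_MARKER)
    end = source.find(END_MARKER, start)
    if end < 0:
        raise ValueError(f"{filename}: missing {END_MARKER!r}")
    start_line = source.count("\n", 0, start) + 1
    return Section(filename, start_line, source[start:end])


def load_sections(filenames):
    """Read the files in order, as if their marked regions were concatenated."""
    return [extract_section(name, Path(name).read_text()) for name in filenames]
```

Keeping the filename and starting line next to each extracted region is what lets a later parse error point at, say, cinder-bytecodes.c rather than at an offset in a concatenated blob.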

I've also added a new override keyword which can prefix instruction definitions to explicitly express the intent to override an existing definition. E.g.:

inst(NOP, (--)) {
}

// This is the definition which ends up being used in generation.
override inst(NOP, (--)) {
  magic();
}

// Error - previous definition of NOP exists and "override" not specified.
inst(NOP, (--)) {
}

// Error - requested override but no previous definition of ZOP exists.
override inst(ZOP, (--)) {
}

The goal of explicitly calling out overrides is to quickly reveal either of two problems: something we modify has been removed upstream, or a new opcode we add clashes by name with a new upstream opcode.
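The override rules can be sketched as a single pass over the definitions in input order. This is an illustrative sketch under assumptions, not the PR's implementation: `Definition` and `resolve` are hypothetical names, and the error wording is invented.

```python
from dataclasses import dataclass


@dataclass
class Definition:
    name: str       # instruction name, e.g. "NOP"
    override: bool  # True when written as `override inst(...)`
    filename: str   # input file the definition came from
    body: str       # instruction body text


def resolve(definitions):
    """Map each instruction name to the definition used for generation.

    Raises ValueError when a name is redefined without `override`, or when
    `override` is requested but no earlier definition exists.
    """
    table = {}
    for d in definitions:
        previous = table.get(d.name)
        if previous is not None and not d.override:
            raise ValueError(
                f"{d.filename}: {d.name} already defined in "
                f"{previous.filename}; 'override' not specified"
            )
        if previous is None and d.override:
            raise ValueError(
                f"{d.filename}: override requested but no previous "
                f"definition of {d.name} exists"
            )
        table[d.name] = d  # the most recent valid definition wins
    return table
```

With this shape, an upstream removal of an instruction we override trips the second check, and an upstream addition that collides with one of our new opcodes trips the first.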

Previous discussion

The idea of having multiple input files for interpreter generation was briefly discussed around the faster-cpython project.

Linked PRs

@gvanrossum
Member

Thanks!

carljm added a commit to carljm/cpython that referenced this issue Mar 4, 2023
* main:
  pythongh-102021 : Allow multiple input files for interpreter loop generator (python#102022)
  Add import of `unittest.mock.Mock` in documentation (python#102346)
  pythongh-102383: [docs] Arguments of `PyObject_CopyData` are `PyObject *` (python#102390)
  pythongh-101754: Document that Windows converts keys in `os.environ` to uppercase (pythonGH-101840)
  pythongh-102324: Improve tests of `typing.override` (python#102325)
hugovk pushed a commit to hugovk/cpython that referenced this issue Mar 6, 2023