Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

src: allow CLI args in env with NODE_OPTIONS #12028

Merged
merged 1 commit into from
Apr 25, 2017

Conversation

sam-github
Copy link
Contributor

@sam-github sam-github commented Mar 24, 2017

Not all CLI options are supported, those that are problematic from a
security or implementation point of view are disallowed, as are ones
that are inappropriate.

Disallowed because they don't make any sense to inject, they change node behaviour to fundamentally to be useful:

  • --version/-v
  • --help/-h
  • --eval/-e/--print/-pe/-p
  • --check/-c
  • --interactive/-i
  • --v8-options (mapped to v8's --help)
  • --: NODEOPT only supports options, so doesn't need a seperator between options and script
  • script.js: the main script and its CLI options can only be specified on the actual command line
  • --preserve-symlinks

Disallowed because of security concerns:

  • --tls-cipher-list: This is the kind of thing I personally prefer to see configured by env var than by CLI, but @jasnell has concerns, and I've never seen the CLI option used, so I am prepared to wait until someone presents a use-case before allowing this in the future (if ever).

Disallowed because of implementation difficulties:

  • --expose-internals: Isn't actually implemented by the options parser, will need some code rearranged to become possible to specify in the env

V8 options:

  • --help (mentioned above)
  • I don't see any other V8 options that should be disallowed, they mostly tune or profile the V8 engine run, which is specifically a use-case for NODEOPT

Tests are still missing, and I'm in the process of testing -r, the most important option, but it seems basically workable.

Fix: #11997
Replace: #11888
Reference: #881

NODEOPT=--expose_gc ./node -p 'typeof gc'
NODEOPT=--expose_gc ./node --no-expose_gc -p 'typeof gc'
NODEOPT=hi ./node --no-expose_gc -p 'typeof gc'
NODEOPT=--expose-internals ./node -p "require('internals')"
NODEOPT=--throw-deprecation ./node -p "require('_linklist')"
Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • documentation is changed or added
  • commit message follows commit guidelines
Affected core subsystem(s)

src

@nodejs-github-bot nodejs-github-bot added the c++ Issues and PRs that require attention from people who are familiar with C++. label Mar 24, 2017
src/node.cc Outdated
char* cstr = strdup(nodeopt.c_str());
// [0] is expected to be the program name, fill it in from the real argv.
argv_from_env[argc_from_env++] = argv[0];
// XXX(sam) can I use strtok or strtok_r?
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bnoordhuis ----^

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would you not be able to use strtok_r?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not use std::string::find and substr?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

strtok_r isn't ANSI C, not sure it exists on Windows or even all the unixen.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sam-github My version of the manpage says it’s part of the recent POSIX standards, so I’d take that as a good sign. If you want to be sure whether it’s okay you could just kick of a CI run that uses it.

(You can also wait for Ben’s answer, of course, but I got the impression that he’s a bit more busy than usual right now…)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought the little one wasn't due for a while. :-) Anyhow, Windows is not so POSIX, but I'll just do it and if it works in CI I guess its good for us. Of course, strtok is everywhere, but its not clear to me whether we care about it not being thread safe. Node doesn't here, but it might matter when its used as a library.

src/node.cc Outdated
const char** argv,
int* exec_argc,
const char*** exec_argv,
bool is_nodeopt = false) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you align the arguments vertically?

src/node.cc Outdated
char* cstr = strdup(nodeopt.c_str());
// [0] is expected to be the program name, fill it in from the real argv.
argv_from_env[argc_from_env++] = argv[0];
// XXX(sam) can I use strtok or strtok_r?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would you not be able to use strtok_r?

src/node.cc Outdated
if (SafeGetenv("NODEOPT", &nodeopt)) {
const char** argv_from_env = new const char*[(nodeopt.length()+1) / 2];
int argc_from_env = 0;
char* cstr = strdup(nodeopt.c_str());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the copy? nodeopt will have its own memory anyway

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suspicion, the c_str() docs aren't amazingly clear about the lifetime of the underlying mem. I'll try without (though I've a mem bug somewhere anyway that I have to find).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sam-github c_str() is an alias for data() nowadays, it lives as long as the string instance itself

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can't non-const methods on the string reallocate the memory invalidating the last pointer?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They can, but you’re not doing that? Indexed accesses seem to be excluded from invalidating the pointer (according to http://en.cppreference.com/w/cpp/string/basic_string/c_str), which makes sense

@jasnell jasnell added the wip Issues and PRs that are still a work in progress. label Mar 24, 2017
src/node.cc Outdated
@@ -3661,16 +3665,19 @@ static void ParseArgs(int* argc,
if (debug_options.ParseOption(arg)) {
// Done, consumed by DebugOptions::ParseOption().
} else if (strcmp(arg, "--version") == 0 || strcmp(arg, "-v") == 0) {
DisallowInNodeopts(argv[0], is_nodeopt, arg);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably worth listing all the options you've excluded in the first comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean in git commit body, list the options? I can do.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I suspect a large amount of the discussion will be around which options we should enable under what conditions etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the list and reasons to the description of this item for visibility, and will add them back in later.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when you say you will add them back in later. Do you mean add the reasons for the exclusion as comments in the code ? I think a comment in the code as well is good in terms of longer term history when people look and wonder why its not allowed. I do see you have the comments for some of them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added comments in source, will add to the commit message when the list stabilizes

@jasnell
Copy link
Member

jasnell commented Mar 25, 2017

Couple of discussion points on this:

  1. For the blacklisted options like --tls-cipher-list, should be there an explicit opt in that would allow those to work. For instance NODEOPT="--tls-cipher-list" node --ignore-nodeopt-blacklist foo ... or do we simply always blacklist those items for sake of simplicity?

  2. There needs to be a simple way for embedders like Electron to switch off NODEOPT support (a compile time flag should work fine for this)

  3. I still very much want to see NODEOPT to not be passed down to child_process.fork() processes by default.

@sam-github
Copy link
Contributor Author

@jasnell

  1. There are no other options like --tls-cipher-list, its one of a kind, all the other blacklisted items are pretty clearly not the kind of thing that should be in an env var. Having to specify a CLI option to enable an env var isn't something I've heard a use-case for, I think we should wait until someone asks for it and makes a case.

  2. Good idea, I'll add a compile time flag.

  3. Stripping some env vars is something that needs more discussion, I'll follow up when I get out of this meeting.

@jasnell
Copy link
Member

jasnell commented Mar 27, 2017

For (1) .. good point :-) The only other one that I would definitely rule out, however, is the --security-revert= flag. It's a purposefully undocumented flag that is a very special case. It's use needs to be explicitly opt-in in every situation.

@sam-github
Copy link
Contributor Author

sam-github commented Mar 27, 2017

Right, forgot --security-revert (I'd read through the code and it didn't do anything, there aren't any reverts at the moment, so I forgot). See c0836e5

@jasnell
Copy link
Member

jasnell commented Mar 27, 2017

+1 ... yeah, --security-revert is a special snowflake that is reset with every semver-major. We haven't had any CVE's that needed to be included in this since 4.x I think so it's a non-op at the current time. Hopefully it'll be able to remain a non-op forever :-)

@sam-github
Copy link
Contributor Author

On stripping of NODEOPT env var during spawn: I think it takes away from the usefullness of NODEOPT (specifically, it means we can't use NODEOPT="-r node-report" npm install ... to debug install scripts), intrudes confusingly into the child_process API, and goes against the basic philosophy of environment, which is that environment is inherited by children.

However, I'd rather have at least a one-level NODEOPT with this confusing caveat than none at all.

@jasnell, can you make a case for why you want this non-inheritable env var behaviour? And what specifically you propose?

For example:

require('child_process').exec('node -p process.env.NODE_PATH', function(err, out) {
  console.log(err || out);                                                       
});
require('child_process').exec('node -p process.env.NODEOPT', function(err, out) {
  console.log(err || out);                                                       
});

That NODE_PATH=/some/path NODEOPT=--prof node echo.js should print:

/some/path

instead of

/some/path
--prof

I would find confusing, is that what you propose?

And what do you propose that the following code would print?

process.env.NODEOPT = '-r node-report';
require('child_process').exec('node -p process.env.NODEOPT', function(err, out) {
  console.log(err || out);
});

I would expect it to print the value of NODEOPT, do you propose it print something else?

@jasnell
Copy link
Member

jasnell commented Mar 27, 2017

@sam-github ... the concern is largely about backwards compatibility and the possibility of NODEOPT settings altering the behavior of child processes in ways that could break.

For instance, imagine a child process that uses fs.readFileSync() internally to read a file and return it's contents to the parent. The parent is written to scan stdout / stderr to determine whatever status it needs to know. The child process is launched specifically assuming that flags like --trace-sync-io are not passed (because it is not passed in the args). Using NODEOPT=--trace-sync-io, however, causes a number of warnings to be printed to stderr in the child process potentially causing existing code to break.

(Essentially, passing NODEOPT automatically to child processes breaks the existing API contract around args passing to forked child processes.)

My proposal is that NODEOPT would be simply removed from the env passed to the child process unless an option is set, e.g. fork(path, {nodeopt: true}).

@richardlau
Copy link
Member

But don't we already have that issue with existing environment variables, e.g. NODE_DEBUG?

@sam-github
Copy link
Contributor Author

NODE_TTY_UNSAFE_ASYNC is also inherited, and modifies API contract in ways that would break child processes, there seems to be some precedence for these "use at own risk" env vars.

If there is widespread concern, I'd rather disable V8 options that output to stdio then break the inheritance, it leaves the behaviour much more predictable.

@jasnell
Copy link
Member

jasnell commented Mar 27, 2017

NODE_TTY_UNSAFE_ASYNC is in the process of being removed entirely, however. There is also extremely little evidence that it's actually used by anyone.

@sam-github
Copy link
Contributor Author

NODE_DEBUG isn't getting removed, and has the same characteristics as you describe for NODEOPT=--trace-sync-io: it can be useful, it is inherited, and it will (unsprisingly) break any code that doesn't expect extra output.

@bmeck
Copy link
Member

bmeck commented Mar 27, 2017

I would be much more comfortable if this was a whitelist instead of a blacklist. It would prevent accidentally harmful flags from being exposed.

@jasnell
Copy link
Member

jasnell commented Mar 27, 2017

NODE_DEBUG has been around since v0.1.32 and is well established as an environment variable. It predates the child_process API and it's side effects and inheritance by spawned child processes is well known. It also does not alter arbitrary command line arguments passed into the child process. There is no backwards compatibility concern there and therefore is not directly comparable.

@mhdawson
Copy link
Member

@jasnell , what about if we have NODEOPT and NODEOPT_INHERIT and let people choose the behavior they want. NODEOPT_INHERIT would be passed to child processes while NODEOPT would not. In that way you can use the environment to set options and chose if they are inherited or not. I think having the ability to pass on options to child processes is important and this would be one way to let the end user make a specific choice.

@sam-github
Copy link
Contributor Author

@jasnell Before I implement this, lets be really clear:

My proposal is that NODEOPT would be simply removed from the env passed to the child process unless an option is set, e.g. fork(path, {nodeopt: true}).

child_process.spawn(process.execPath, {env: {NODEOPT='xxx'}}, function(...

You propose NODEOPT to not be set in the child process's env?

@bmeck, Do you mean for node options, or V8 options? I can switch it around to opt-in for node, I think there is a way I can do that, but there are 417 V8 options as of this moment on master, and they all look like the kinds of things someone might want to set via env to tune performance or diagnose mis-performance, and there don't appear to be anything we'd want to blacklist.

ProcessArgv(&argc_from_env, argv_from_env, &exec_argc_, &exec_argv_, true);
delete[] exec_argv_;
delete[] argv_from_env;
free(cstr);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally, I'd rather that C++ std lib classes are used to avoid all the manual allocations. And you shouldn't have to worry about if the C++ std lib is available on a certain platform. And you don't need to rely on the non-reentrant legacy c strtok.

  std::vector<std::string> env_args;
  std::string::size_type pos = 0, pos_next = 0;
  while (pos_next < nodeopt.size())
  {
    pos_next = nodeopt.find(" ", pos);
    if (pos_next == std::string::npos)
    {
      pos_next = nodeopt.size();
    }

    if (pos_next > pos)
    {
      env_args.push_back(nodeopt.substr(pos, pos_next - pos));
    }

    pos = pos_next + 1;
  }

  std::vector<const char *> argv_from_env;
  argv_from_env.reserve(args.size() + 2);
  argv_from_env.push_back(argv[0]);
  for (size_t i = 0; i < args.size(); i++)
  {
    argv_from_env.push_back(env_args[i].c_str());
  }
  argv_from_env.push_back(nullptr);

and now you can call ProcessArgv:

    ProcessArgv(&argc_from_env, &argv_from_env[0], &exec_argc_, &exec_argv_, true );

@jasnell
Copy link
Member

jasnell commented Mar 27, 2017

@sam-github ... no, if NODEOPT is explicitly passed in the call to fork like in your example, it would be used. What I'm saying is that env is not explicitly given and the default process.env is used, then it's NODEOPT would not be inherited.

In other words... NODEOPT is used:

child_process.spawn(process.execPath, {env: {NODEOPT:'...'}}, () => {});

process.env.NODEOPT is used:

child_process.spawn(process.execPath, {nodeopt: true}, () => {});

process.env.NODEOPT is not used:

child_process.spawn(process.execPath, () => {});

@mhdawson ... as long as there is an explicit opt-in for the options to be used in child processes I'm fine with however it happens. A second environment variable would not be my first choice tho but I could live with it.

@jasnell
Copy link
Member

jasnell commented Mar 27, 2017

Note: that if we went with the nodeopt:true option to spawn/fork now, we can always choose to switch the default later (making the option a non-op) if it can be shown that inheriting the NODEOPT env does not break anyone in practice.

src/node.cc Outdated
break;
char* cstr = strdup(nodeopt.c_str());
char* initptr = cstr;
while (char* token = strtok(initptr, " ")) {
Copy link

@jchip jchip Mar 27, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this handle multiple spaces? like "-a -b".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

@addaleax
Copy link
Member

@jasnell @mhdawson If the main risk you have in mind is security, it won’t matter whether you pass one or two environment variables.

(But as I’ve said before, using environment variables that are not inherited by child processes seems pointless by definition.)

@bmeck
Copy link
Member

bmeck commented Mar 27, 2017

@sam-github I am definitely thinking about more than just node arguments. Things like --harmony-* would prevent some polyfill detection and use incomplete/broken v8 implementations if they are passed.

@jasnell
Copy link
Member

jasnell commented Mar 27, 2017

Another note: the documentation on this should need to be clear that:

(a) command line arguments passed directly on the command line take precedence over those passed in NODEOPT and
(b) Child processing inheriting NODEOPT would not inherit arguments passed on the command line.

To illustrate the kind of impact this has consider the following example:

$ NODEOPT="--icu-data-dir=a" ./node --icu-data-dir=b
> process.binding('config');
{ hasIntl: true, hasSmallICU: true, icuDataDir: 'b' }
> var m = child_process.spawn('./node', ['-pe', 'console.log(process.binding("config"))'], {stdio:['inherit','inherit', 'inherit']})
undefined
> { hasIntl: true, hasSmallICU: true, icuDataDir: 'a' }

To be certain, this is not a criticism, just something that should be noted. (This is also true when using the NODE_ICU_DATA environment variable)

Note that the current CLI/env documentation (https://nodejs.org/dist/latest-v7.x/docs/api/cli.html) currently does not explain that arguments would take precedence over env vars.

@cjihrig
Copy link
Contributor

cjihrig commented Mar 27, 2017

I agree with @addaleax that environment variables should be passed to child processes by default.

@jasnell
Copy link
Member

jasnell commented Mar 27, 2017

Another test case that will need be looked at:

NODEOPT="--inspect=localhost:1234" ./node
Debugger listening on port 1234.
Warning: This is an experimental feature and could change at any time.
To start debugging, open the following URL in Chrome:
    chrome-devtools://devtools/bundled/inspector.html?experiments=true&v8only=true&ws=localhost:1234/f8884219-6fd2-442e-aa5e-257cb686e3a5
> var m = child_process.spawn('./node', {stdio:['inherit', 'inherit', 'inherit']})
undefined
> Starting inspector on localhost:1234 failed: address already in use

@jasnell jasnell mentioned this pull request May 11, 2017
sam-github added a commit to sam-github/node that referenced this pull request Oct 11, 2017
Not all CLI options are supported, those that are problematic from a
security or implementation point of view are disallowed, as are ones
that are inappropriate (for example, -e, -p, --i), or that only make
sense when changed with code changes (such as options that change the
javascript syntax or add new APIs).

PR-URL: nodejs#12028
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Michael Dawson <michael_dawson@ca.ibm.com>
Reviewed-By: Refael Ackermann <refack@gmail.com>
Reviewed-By: Gibson Fahnestock <gibfahn@gmail.com>
Reviewed-By: Bradley Farias <bradley.meck@gmail.com>
MylesBorins pushed a commit that referenced this pull request Oct 16, 2017
Not all CLI options are supported, those that are problematic from a
security or implementation point of view are disallowed, as are ones
that are inappropriate (for example, -e, -p, --i), or that only make
sense when changed with code changes (such as options that change the
javascript syntax or add new APIs).

Backport-PR-URL: #12677
PR-URL: #12028
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Michael Dawson <michael_dawson@ca.ibm.com>
Reviewed-By: Refael Ackermann <refack@gmail.com>
Reviewed-By: Gibson Fahnestock <gibfahn@gmail.com>
Reviewed-By: Bradley Farias <bradley.meck@gmail.com>
@MylesBorins MylesBorins mentioned this pull request Oct 17, 2017
MylesBorins pushed a commit that referenced this pull request Oct 25, 2017
Not all CLI options are supported, those that are problematic from a
security or implementation point of view are disallowed, as are ones
that are inappropriate (for example, -e, -p, --i), or that only make
sense when changed with code changes (such as options that change the
javascript syntax or add new APIs).

Backport-PR-URL: #12677
PR-URL: #12028
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Michael Dawson <michael_dawson@ca.ibm.com>
Reviewed-By: Refael Ackermann <refack@gmail.com>
Reviewed-By: Gibson Fahnestock <gibfahn@gmail.com>
Reviewed-By: Bradley Farias <bradley.meck@gmail.com>
@MylesBorins MylesBorins mentioned this pull request Nov 3, 2017
MylesBorins added a commit that referenced this pull request Nov 6, 2017
Notable Changes:

* assert:
  - assert.fail() can now take one or two arguments (Rich Trott)
    #12293
* crypto:
  - add sign/verify support for RSASSA-PSS (Tobias Nießen)
    #11705
* deps:
  - upgrade openssl sources to 1.0.2m (Shigeki Ohtsu)
    #16691
  - upgrade libuv to 1.15.0 (cjihrig)
    #15745
  - upgrade libuv to 1.14.1 (cjihrig)
    #14866
  - upgrade libuv to 1.13.1 (cjihrig)
    #14117
  - upgrade libuv to 1.12.0 (cjihrig)
    #13306
* fs:
  - Add support for fs.write/fs.writeSync(fd, buffer, cb) and
    fs.write/fs.writeSync(fd, buffer, offset, cb) as documented
    (Andreas Lind) #7856
* inspector:
  - enable --inspect-brk (Refael Ackermann)
    #12615
* process:
  - add --redirect-warnings command line argument (James M Snell)
    #10116
* src:
  - allow CLI args in env with NODE_OPTIONS (Sam Roberts)
    #12028)
  - --abort-on-uncaught-exception in NODE_OPTIONS (Sam Roberts)
    #13932
  - allow --tls-cipher-list in NODE_OPTIONS (Sam Roberts)
    #13172
  - use SafeGetenv() for NODE_REDIRECT_WARNINGS (Sam Roberts)
    #12677
* test:
  - remove common.fail() (Rich Trott)
    #12293

PR-URL: #16263
MylesBorins added a commit that referenced this pull request Nov 7, 2017
Notable Changes:

* assert:
  - assert.fail() can now take one or two arguments (Rich Trott)
    #12293
* crypto:
  - add sign/verify support for RSASSA-PSS (Tobias Nießen)
    #11705
* deps:
  - upgrade openssl sources to 1.0.2m (Shigeki Ohtsu)
    #16691
  - upgrade libuv to 1.15.0 (cjihrig)
    #15745
  - upgrade libuv to 1.14.1 (cjihrig)
    #14866
  - upgrade libuv to 1.13.1 (cjihrig)
    #14117
  - upgrade libuv to 1.12.0 (cjihrig)
    #13306
* fs:
  - Add support for fs.write/fs.writeSync(fd, buffer, cb) and
    fs.write/fs.writeSync(fd, buffer, offset, cb) as documented
    (Andreas Lind) #7856
* inspector:
  - enable --inspect-brk (Refael Ackermann)
    #12615
* process:
  - add --redirect-warnings command line argument (James M Snell)
    #10116
* src:
  - allow CLI args in env with NODE_OPTIONS (Sam Roberts)
    #12028)
  - --abort-on-uncaught-exception in NODE_OPTIONS (Sam Roberts)
    #13932
  - allow --tls-cipher-list in NODE_OPTIONS (Sam Roberts)
    #13172
  - use SafeGetenv() for NODE_REDIRECT_WARNINGS (Sam Roberts)
    #12677
* test:
  - remove common.fail() (Rich Trott)
    #12293

PR-URL: #16263
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c++ Issues and PRs that require attention from people who are familiar with C++.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Proposal: --config=file command line argument