Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

buffer: add swap16() and swap32() methods #5724

Closed
wants to merge 2 commits into from

Conversation

jasnell
Copy link
Member

@jasnell jasnell commented Mar 15, 2016

Pull Request check-list

  • Does make -j8 test (UNIX) or vcbuild test nosign (Windows) pass with
    this change (including linting)?
  • Is the commit message formatted according to [CONTRIBUTING.md][0]?
  • If this change fixes a bug (or a performance problem), is a regression
    test (or a benchmark) included?
  • Is a documentation update included (if this change modifies
    existing APIs, or introduces new ones)?

Affected core subsystem(s)

buffer

Description of change

Adds Buffer.prototype.swapShort() Buffer.prototype.swap16() and
Buffer.prototype.swapLong() Buffer.prototype.swap32() methods that
mutate the Buffer instance in-place by swapping the 16-bit and 32-bit
byte-order.

Example:

const buf = Buffer([0x1, 0x2, 0x3, 0x4]);
buf.swap16();
console.log(buf);
  // prints Buffer(0x2, 0x1, 0x4, 0x3);

buf.swap32();
console.log(buf);
  // prints Buffer(0x3, 0x4, 0x1, 0x2);

Was looking at a number of use cases recently around reading in UTF16BE data and converting that to UTF8 and realized that we really didn't expose an efficient mechanism for doing a byte-swap in Buffer. This seemed like a useful API addition.

/cc @trevnorris @srl295

@jasnell jasnell added buffer Issues and PRs related to the buffer subsystem. semver-minor PRs that contain new features and should be released in the next minor version. labels Mar 15, 2016
@mscdex
Copy link
Contributor

mscdex commented Mar 15, 2016

I think swap16() and swap32() might be better names.

Just how common are these anyway?

@jasnell
Copy link
Member Author

jasnell commented Mar 15, 2016

I can live with swap16 and swap32.

The swap16 is particularly useful when dealing with utf16 big-endian encoded buffer content that needs to be converted. The swap32 is less common but included for completeness. There are certain formats (such as TIFF) that can be big-endian or little-endian and it's common when working with those to have to swap the order to meet expectations. I wouldn't say that these are critical by any stretch, but they are useful and building the implementation directly into node_buffer is cheap.

@jasnell
Copy link
Member Author

jasnell commented Mar 15, 2016

Methods renamed to swap16/swap32

@jasnell jasnell changed the title buffer: add swapShort() and swapLong() methods buffer: add swap16() and swap32() methods Mar 15, 2016
@@ -1094,6 +1130,8 @@ void SetupBufferJS(const FunctionCallbackInfo<Value>& args) {
env->SetMethod(proto, "hexWrite", HexWrite);
env->SetMethod(proto, "ucs2Write", Ucs2Write);
env->SetMethod(proto, "utf8Write", Utf8Write);
env->SetMethod(proto, "swap16", SwapShort);
env->SetMethod(proto, "swap32", SwapLong);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While case differs, generally the names are the same between c++ and js. Would Swap{16,32} work?

@jasnell
Copy link
Member Author

jasnell commented Mar 15, 2016

@trevnorris ... ok, updated! PTAL

@jasnell
Copy link
Member Author

jasnell commented Mar 15, 2016

@Fishrock123
Copy link
Contributor

Does this belong in core?


if (ts_obj_length % sizeof(uint16_t) != 0)
return env->ThrowRangeError(
"Buffer length must be a multiple of 16-bits");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: wrap multi-line statements in conditionals with {}, and double indent with multi-line statements. another thing lint should be catching.

@trevnorris
Copy link
Contributor

@Fishrock123 Asked myself the same thing. I believe it may just fit within the parameters of what functionality belongs in core. For example:

// Read in utf16be file
const file = fs.readFileSync(path).swap16().toString('utf16le');
// Write it back to disk
fs.writeFileSync(path, Buffer(file, 'utf16le').swap16());

Though additional input is welcome.

@jasnell have a nit, but LGTM otherwise. Let's leave this open though for at least a couple days for others to respond.

@jasnell
Copy link
Member Author

jasnell commented Mar 15, 2016

@Fishrock123 ... it is certainly possible to do in userland but there's a bit of a performance penalty if it's done in pure javascript... I'll post some benchmark numbers in a couple of minutes that compares the performance of swap16/swap32 to equivalent pure javascript code. I went back and forth on this one also but I landed in the same @trevnorris did... that it likely fits just inside that boundary.

@jasnell
Copy link
Member Author

jasnell commented Mar 15, 2016

Benchmarks ... the swap(16|21) is the impl in node_buffer, while the htons/htonl is a pure javascript impl... note that at the low end of the range the javascript impl's blow away the native impls, but that reverses once the buffer size grows. Given that, I may change this up a bit to do the swap in javascript if the length is below a certain threshold.

4     - swap16/htons - 15881312.97675 / 56764416.88469 - htons
64    - swap16/htons - 11189303.45557 / 18467670.86474 - htons
1024  - swap16/htons - 2019124.50116  / 1546034.86159  - swap16
2056  - swap16/htons - 1070521.29584  / 732798.39598   - swap16
4096  - swap16/htons - 591436.23889   / 383412.43533   - swap16

4     - swap32/htonl - 15827403.80753 / 78215981.07361 - htonl
64    - swap32/htonl - 11970720.57513 / 20776537.63779 - htonl
1024  - swap32/htonl - 2255585.06207  / 1886075.54128  - swap32
2056  - swap32/htonl - 1242583.71829  / 1036100.06795  - swap32
4096  - swap32/htonl - 621999.48227   / 511625.91432   - swap32

buf[i - 1] = i;
bench.start();
for (i = 0; i < n; i += 1)
method.apply(buf);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe avoid using apply inside the bench start/end?

In buffer-iterate.js it just calls off the instance...

const name = conf.method;
//...
for (i = 0; i < n; i += 1)
  method[name](buf);

@jasnell
Copy link
Member Author

jasnell commented Mar 15, 2016

@trevnorris @williamkapke ... nits addressed.

I also updated the implementation such that if Buffer length is below a certain threshold, a pure javascript version of the swap will be used rather than dropping down to the native layer. Based on the benchmarks, when the Buffer is below a certain size, it's far more efficient to swap in js. We may be able to further tweak the threshold but the current limits appear to be within the margin of error.

if (len < 512) {
if (len % 2 !== 0)
throw new RangeError('Buffer length must be a multiple of 16-bits');
for (var i = 0, n = 0; i < this.length; i += 2) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you already have len above... you might as well use it here too! :D
(Same for swap32)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doh! missed that one. good catch :-)

@williamkapke
Copy link
Contributor

If we're doing this, what about (also) including a more generic buffer.swap(a,b) that swap16 and swap32 could leverage? Maybe that would tank perf on the native side?

I think it makes the JS side cleaner...

Buffer.prototype.swap = function swap(a, b) {
  var x = this[a];
  this[a] = this[b];
  this[b] = x;
  return this;
};

Buffer.prototype.swap32 = function swap32() {
  for (var i = 0; i < buf.length; i += 2) {
    this.swap(i++, i + 1);
    this.swap(i++, i + 1);
  }
  return this;
};

Buffer.prototype.swap16 = function swap16() {
  for (var i = 0; i < buf.length; i++) {
    this.swap(i, ++i);
  }
  return this;
};

const buf = Buffer([0x1, 0x2, 0x3, 0x4, 0x5, 0x6, 0x7, 0x8]);
console.log(buf.swap(1,2));
//or
console.log(buf.swap16());
//or
console.log(buf.swap32());

@jasnell
Copy link
Member Author

jasnell commented Mar 16, 2016

Having a swap function makes sense but I wouldn't expose it as part of the Buffer API.

Nits addressed!

@williamkapke
Copy link
Contributor

heh- that's how I originally wrote it and then decided to put swap on the prototype ;)

LGTM!

bench.start();
for (i = 0; i < n; i += 1)
buf[method]();
bench.end(n);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mind throwing these (the for loop) in their own function? when profiling it's easier to look for execution of a named function that only contains the code actually being profiled.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will do

@jasnell jasnell force-pushed the buffer-byte-swap branch 2 times, most recently from 74769f0 to 26b4ffd Compare March 17, 2016 00:46
@jasnell
Copy link
Member Author

jasnell commented Mar 17, 2016

@trevnorris ... done! also squashed the commits and rebased to pick up the buffer api changes also

Interprets the `Buffer` as an array of unsigned 32-bit integers and swaps
the byte-order *in-place*. Throws a `RangeError` if the `Buffer` length is
not a multiple of 32 bits. The method returns a reference to the Buffer, so
calls can be chained
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing closing period.

@ofrobots
Copy link
Contributor

What are the semantics when the size of the buffer is not a multiple of 2 / 4?

EDIT: found the answer by looking at the test-case.

@jasnell
Copy link
Member Author

jasnell commented Mar 22, 2016

It's also covered in the documentation addition. A RangeError is thrown.

@jasnell jasnell modified the milestone: 6.0.0 Mar 22, 2016
@Fishrock123 Fishrock123 removed this from the 6.0.0 milestone Mar 22, 2016
@@ -17,6 +17,48 @@ var poolSize, poolOffset, allocPool;


binding.setupBufferJS(Buffer.prototype, bindingObj);

const swap16n = Buffer.prototype.swap16;
const swap32n = Buffer.prototype.swap32;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we doing this instead of attaching the native implementation to binding.swap*()?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just an artifact of how it was originally written. You're right, attaching directly to binding.swap*() is better.

@trevnorris
Copy link
Contributor

Nice tests. Left two comments. Other than that LGTM.

@jasnell
Copy link
Member Author

jasnell commented Mar 22, 2016

@trevnorris .. thanks .. updated to address those nits!

@trevnorris
Copy link
Contributor

@jasnell Thanks much. LGTM

// dropping down to the native code is faster.
const len = this.length;
if (len % 2 !== 0)
throw new RangeError('Buffer length must be a multiple of 16-bits');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incredible nitpicking, but nonetheless I would opt for one of these:

Buffer size must be a multiple of 16 bits

or:

Buffer length must be a multiple of 2

I hope you understand the nuance difference I'm getting at.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will change to Buffer size must be a multiple of 16 bits when I land.

jasnell added a commit that referenced this pull request Mar 23, 2016
Adds Buffer.prototype.swap16() and Buffer.prototype.swap32()
methods that mutate the Buffer instance in-place by swapping the
16-bit and 32-bit byte-order.

Example:

```js
const buf = Buffer([0x1, 0x2, 0x3, 0x4]);
buf.swap16();
console.log(buf);
  // prints Buffer(0x2, 0x1, 0x4, 0x3);

buf.swap32();
console.log(buf);
  // prints Buffer(0x3, 0x4, 0x1, 0x2);
```

PR-URL: #5724
Reviewed-By: Trevor Norris <trev.norris@gmail.com>
@jasnell
Copy link
Member Author

jasnell commented Mar 23, 2016

Landed in 7d73e60. Thanks all!

@jasnell jasnell closed this Mar 23, 2016
evanlucas pushed a commit that referenced this pull request Mar 30, 2016
Adds Buffer.prototype.swap16() and Buffer.prototype.swap32()
methods that mutate the Buffer instance in-place by swapping the
16-bit and 32-bit byte-order.

Example:

```js
const buf = Buffer([0x1, 0x2, 0x3, 0x4]);
buf.swap16();
console.log(buf);
  // prints Buffer(0x2, 0x1, 0x4, 0x3);

buf.swap32();
console.log(buf);
  // prints Buffer(0x3, 0x4, 0x1, 0x2);
```

PR-URL: #5724
Reviewed-By: Trevor Norris <trev.norris@gmail.com>
evanlucas added a commit that referenced this pull request Mar 31, 2016
Notable changes:

* buffer:
  * make byteLength work with ArrayBuffer & DataView (Jackson Tian)
[#5255](#5255)
  * backport --zero-fill-buffers command line option (James M Snell)
[#5744](#5744)
  * backport new buffer constructor APIs (James M Snell)
[#5763](#5763)
  * add swap16() and swap32() methods (James M Snell)
[#5724](#5724)
* fs: add the fs.mkdtemp() function. (Florian MARGAINE)
[#5333](#5333)
* net: emit host in lookup event (HUANG Wei)
[#5598](#5598)
* node: --no-browser-globals configure flag (Fedor Indutny)
[#5853](#5853)
* npm: Upgrade to v3.8.3. (Forrest L Norvell)
* repl: support standalone blocks (Prince J Wesley)
[#5581](#5581)
* src: override v8 thread defaults using cli options (Tom Gallacher)
[#4344](#4344)
evanlucas added a commit that referenced this pull request Mar 31, 2016
Notable changes:

* buffer:
  * make byteLength work with ArrayBuffer & DataView (Jackson Tian)
[#5255](#5255)
  * backport --zero-fill-buffers command line option (James M Snell)
[#5744](#5744)
  * backport new buffer constructor APIs (James M Snell)
[#5763](#5763)
  * add swap16() and swap32() methods (James M Snell)
[#5724](#5724)
* fs: add the fs.mkdtemp() function. (Florian MARGAINE)
[#5333](#5333)
* net: emit host in lookup event (HUANG Wei)
[#5598](#5598)
* node: --no-browser-globals configure flag (Fedor Indutny)
[#5853](#5853)
* npm: Upgrade to v3.8.3. Fixes a security flaw in the use of
authentication tokens in HTTP requests that would allow an attacker to
set up a server that could collect tokens from users of the command-line
interface. Authentication tokens have previously been sent with every
request made by the CLI for logged-in users, regardless of the destination
of the request. This update fixes this by only including those tokens
for requests made against the registry or registries used for the
current install. (Forrest L Norvell)
[npm#6](npm#6)
* repl: support standalone blocks (Prince J Wesley)
[#5581](#5581)
* src: override v8 thread defaults using cli options (Tom Gallacher)
[#4344](#4344)
evanlucas added a commit that referenced this pull request Apr 1, 2016
Notable changes:

* buffer:
  * make byteLength work with ArrayBuffer & DataView (Jackson Tian)
[#5255](#5255)
  * backport --zero-fill-buffers command line option (James M Snell)
[#5744](#5744)
  * backport new buffer constructor APIs (James M Snell)
[#5763](#5763)
  * add swap16() and swap32() methods (James M Snell)
[#5724](#5724)
* fs: add the fs.mkdtemp() function. (Florian MARGAINE)
[#5333](#5333)
* net: emit host in lookup event (HUANG Wei)
[#5598](#5598)
* node: --no-browser-globals configure flag (Fedor Indutny)
[#5853](#5853)
* npm: Upgrade to v3.8.3. Fixes a security flaw in the use of
authentication tokens in HTTP requests that would allow an attacker to
set up a server that could collect tokens from users of the command-line
interface. Authentication tokens have previously been sent with every
request made by the CLI for logged-in users, regardless of the destination
of the request. This update fixes this by only including those tokens
for requests made against the registry or registries used for the
current install. (Forrest L Norvell)
[npm#6](npm#6)
* repl: support standalone blocks (Prince J Wesley)
[#5581](#5581)
* src: override v8 thread defaults using cli options (Tom Gallacher)
[#4344](#4344)
evanlucas added a commit that referenced this pull request Apr 1, 2016
Notable changes:

* buffer:
  * make byteLength work with ArrayBuffer & DataView (Jackson Tian)
[#5255](#5255)
  * backport --zero-fill-buffers command line option (James M Snell)
[#5744](#5744)
  * backport new buffer constructor APIs (James M Snell)
[#5763](#5763)
  * add swap16() and swap32() methods (James M Snell)
[#5724](#5724)
* fs: add the fs.mkdtemp() function. (Florian MARGAINE)
[#5333](#5333)
* net: emit host in lookup event (HUANG Wei)
[#5598](#5598)
* node: --no-browser-globals configure flag (Fedor Indutny)
[#5853](#5853)
* npm: Upgrade to v3.8.3. Fixes a security flaw in the use of
authentication tokens in HTTP requests that would allow an attacker to
set up a server that could collect tokens from users of the command-line
interface. Authentication tokens have previously been sent with every
request made by the CLI for logged-in users, regardless of the destination
of the request. This update fixes this by only including those tokens
for requests made against the registry or registries used for the
current install. (Forrest L Norvell)
[npm#6](npm#6)
* repl: support standalone blocks (Prince J Wesley)
[#5581](#5581)
* src: override v8 thread defaults using cli options (Tom Gallacher)
[#4344](#4344)

PR-URL: #5970
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
buffer Issues and PRs related to the buffer subsystem. semver-minor PRs that contain new features and should be released in the next minor version.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants