This repository has been archived by the owner on Apr 22, 2023. It is now read-only.

buffer: return offset for end of last write #5843

Merged
merged 1 commit into from
Jul 19, 2013

Conversation

trevnorris

Users asked to have Buffer#write* methods return the last offset. Now they do.

Fixes #3597

@isaacs

isaacs commented Jul 19, 2013

A silly feature, but it puts the "return buffer instance for chainability" and "implicit offset" to bed permanently, which is nice, because those are even sillier features. LGTM.

@Mithgol

Mithgol commented Jul 19, 2013

Wait, why is the “implicit offset” silly and how does this feature put that to bed permanently?

@isaacs

isaacs commented Jul 19, 2013

@Mithgol Because this means that crud like buf.writeInt32(10).writeDouble(8).writeFloat(99) will never ever work (since they all return Numbers).

@Mithgol

Mithgol commented Jul 19, 2013

Returning Numbers obviously buries such chainability, but an implicit offset does not necessarily imply chainability. It could look like the following:

buf.writeInt32(10);
buf.writeDouble(8);
buf.writeFloat(99);

In my opinion, an implicit offset would actually be good. (I've experienced it once with jDataView's interfaces.) That's why I'm interested in your reasons for thinking it's silly.

Code based on implicit offsets tends to look more compact than the explicit equivalent:

offset = buf.writeInt32LE(10, offset);
offset = buf.writeDoubleBE(8, offset);
offset = buf.writeFloatLE(99, offset);

@trevnorris
Author

@Mithgol I created a patch to do this, but for performance reasons it won't work. The elegant way is to set the necessary object property from cc, but this slows down write{Float|Double}{LE|BE} considerably. There is an ugly hack I could use to get around this problem, but I reserve those fixes for mission-critical parts (e.g. the nextTickQueue).

@trevnorris trevnorris merged commit 4a34c69 into nodejs:master Jul 19, 2013
@Mithgol

Mithgol commented Jul 20, 2013

Of course those three additions (to set, to increment, to read an object property from cc) slow down write{Float|Double}{LE|BE}. There's no way to add more working code to a function and expect it to run as fast as before.

The intention behind the “implicit offset” feature request, as far as I understand it, is to pay that price consciously and in a hope that typical userland code utilizing those functions becomes faster because it gets rid of setting a variable, and incrementing it, and passing it to the write{Float|Double}{LE|BE} methods. (Such a workload is transferred to the methods themselves, slowing them down.)

And it's the overall effect that is expected to be beneficial. That seems feasible because the cc code of core functions is expected to run faster than userland JavaScript, and because centralized code (built once) is expected to run faster than scattered JIT-compiled lines.

But still there's a valid counter-argument to the above statement. The performance drop of write{Float|Double}{LE|BE} in the case of “implicit offsets” is inevitable, while the performance gain of the userland code (using those methods) is relative because it depends on how much the operations of storing and incrementing and passing the offset have been repeated previously and how much they adhered to the implicit rule. (The latter dependency seems the most important: if some code reads one float out of five and skips the others, then the writes are not sequential and they won't benefit much from implicit offsets and could even be slowed down by the proposed changes in write{Float|Double}{LE|BE}.)

That's why a better proposal for changing Buffer would be to leave the readXXXX and writeXXXX methods intact (for their current usage) but to add a set of similar methods implementing implicit offsets:

  • nextXXXX([noAssert]) — reads like readXXXX(offset, [noAssert]), but reads the next value (the offset is implicit).
  • appendXXXX(value, [noAssert]) — writes like writeXXXX(value, offset, [noAssert]), but appends to the results of the previous write (the offset is implicit). Returns the buffer instead of the next offset (i.e. becomes a chainable method).

(A set of offset-related methods would also be necessary, such as offset() to get it, seek(position) to set it, and skip(numberOfBytes) to adjust it. Of course they aren't necessary for the typical cases of sequential reading and writing, but there are edge cases where such methods are genuinely needed. Debugging, for example. Or just logging.)

Those methods would then be used only when a benefit is obvious for the user.

(I would expect the sequential readers/writers to switch to nextXXXX/appendXXXX for the performance gain, while the non-sequential readers/writers would stay on readXXXX/writeXXXX. However, when the code speed is less important than the code clarity, the coder may switch to nextXXXX just because he hates writing offset+=4 repeatedly. And who am I to judge him.)

When two sets of methods coexist, users have a choice and can make a conscious and comfortable decision in each case they face. (That's almost like setXXXX vs. writeXXXX in jDataView, except that jDataView always stores an offset and does not care about the inevitable slowdown.)

What do you think about it?

@Mithgol

Mithgol commented Jul 20, 2013

Additional note: the above statements about “incrementing the offset” in userland are currently valid only for sequential reading, because the writeXXXX functions are already incrementing and returning the offset after 4a34c69.

For the sake of writeXXXX performance, please consider reverting 4a34c69 after the appendXXXX methods are born. (The non-sequential writes are slowed down by 4a34c69, I would guess, because the offset is calculated unnecessarily in their case.)

However, most userland writes are likely to be sequential, so you should refrain from reverting 4a34c69 until/unless the appendXXXX methods are born.

@isaacs

isaacs commented Jul 22, 2013

@Mithgol The approaches you're suggesting have been explored, and were found to introduce performance problems. Additionally, while buf.writeInt32LE(10).writeDoubleBE(10) reads reasonably, it's surprising to many people that a buf.writeXXXX function will have the side effect of causing future writes to have an implicit offset. Buffers are not streams, so it is strange that future writes would be at a different offset.

Bottom line: I'm sorry, we're not changing this behavior.

@Mithgol

Mithgol commented Jul 23, 2013

Wait a minute.

The last of my suggestions wasn't really about changing the behavior of writeXXXX to have an implicit offset (I understand now that would be surprising to people and would introduce performance problems in some scenarios… @trevnorris persuaded me ≈3 days ago).

I have suggested a new set of methods for sequential writing and reading, meant as additional (and not as a replacement): appendXXXX in addition to writeXXXX, and nextXXXX in addition to readXXXX.

I believe that would diminish most of the problems you recounted.

  • Different interface and behavior.
    • It won't be surprising for new methods to have somewhat different interfaces and behavior.
    • The names appendXXXX (e.g. buf.appendDoubleBE(10).appendInt8(75)) make it easier to understand that the writes are sequential, and nextXXXX (e.g. buf.nextUInt16LE()) that the next value is read.
  • Performance problems.
    • Whatever performance problems nextXXXX and appendXXXX methods may have, readXXXX and writeXXXX methods are not meant to be affected and people would still be able to use them instead of nextXXXX/appendXXXX.
    • Actually the appendXXXX methods are expected to be slower than writeXXXX (and nextXXXX slower than readXXXX) because they perform three additional steps (they read the offset, increment it, and store it). However, moving those steps from userland code (which would otherwise have to read the offset, increment it and pass it to the methods) into the Node core (where things run faster) might bring some speed to sequential reading and writing. That special case (reading or writing several values that immediately follow each other in a buffer) is the only case the new methods target.
    • The step of incrementing the offset (calculating the next offset) is actually necessary only for this special case (reading or writing several values that immediately follow each other in a buffer). Therefore you would be able to move the 4a34c69 code to appendXXXX methods and remove that code from writeXXXX methods, and thus the latter would have a small performance gain.
  • Buffers are not streams.
    • Though the case of reading or writing several values that immediately follow each other in a buffer is indeed special, such a case is not unusual. People do that a lot. (For example, it's already usual enough to make the 4a34c69 changes acceptable in the core.)
    • Buffers are not streams and can be read/written in arbitrary order. That's exactly why changes to the common methods have to be a tradeoff; you cannot fully alter these methods to address a special case (sequential reading and writing) at the expense of its alternative (arbitrary, non-sequential reading and writing). Even the 4a34c69 change is such a tradeoff: for non-sequential access that change is unnecessary (i.e. a burden), while for sequential access that change is not enough (i.e. while the offset is incremented by the core, it's still the userland that has to store the offset and pass it to the core methods). An additional set of methods is the only solution that could address such a case completely and without tradeoffs. That's what I suggest.

If that wasn't enough to address a concern, please mention it.

@isaacs

isaacs commented Jul 23, 2013

@Mithgol We have no interest in adding those methods to node core. Please feel free to do so in a userland module in npm, or use http://npm.im/buffercursor

@Mithgol

Mithgol commented Jul 23, 2013

Thank you for mentioning node-buffercursor.

(I was not aware that it existed.)
