
[DISCUSS] input plugins codec thread safety #8830

Closed
colinsurprenant opened this issue Dec 11, 2017 · 18 comments

@colinsurprenant
Contributor

In logstash-plugins/logstash-input-udp/issues/4 (and the related https://discuss.elastic.co/t/udp-input-plugin-error-typeerror-cant-convert-nil-into-string/110565) I realized that the udp input plugin uses multiple input workers to parallelize the decoding of the input data.

The problem here is that, in this specific case, the line codec is not thread safe (more precisely, its usage of the FileWatch::BufferedTokenizer is not threadsafe).

Now, this can be fixed in multiple ways:

  • Add thread safety in the udp input to protect the codec decoding execution

    • The advantage would be that any other potential non-threadsafe codec used in the udp input would also be protected.
    • The disadvantage is that this added thread safety would practically negate the idea of having multiple workers.
  • Add thread safety in the line codec to protect the usage of the FileWatch::BufferedTokenizer.

    • The advantage would be more granular thread safety, allowing better parallelization in the udp input.
    • The disadvantage would be that other potentially non-threadsafe codecs could still have problems in the udp input.
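The first option can be sketched roughly like this in Ruby. The wrapper class and method names are illustrative, not the actual plugin API; it only assumes a codec object exposing `decode(data, &block)`:

```ruby
# Sketch of option 1: wrap a shared, non-threadsafe codec so that all input
# workers serialize their decode calls through a single mutex. Safe, but it
# effectively negates the parallelism of having multiple workers.
class SynchronizedCodec
  def initialize(codec)
    @codec = codec
    @lock = Mutex.new
  end

  # Only one worker thread can be decoding at a time.
  def decode(data, &block)
    @lock.synchronize { @codec.decode(data, &block) }
  end
end
```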

Observations/Questions:

  • I did not survey all codecs, but are codecs generally supposed to be thread safe?
  • We did introduce thread safety settings for the filters and outputs but not for the codecs; should we?
  • What other input(s) have this concept of parallel decoding? Is this just a udp input thing? I see that the TCP input clones the codec and runs the clones in parallel, one per socket, which is safe.
  • Fixing the thread safety problem of the line codec is certainly the best thing to do from the udp input POV, but I am afraid we might run into a similar problem with other codecs, unless I survey them to assess their thread safety.

WDYT?

@colinsurprenant
Contributor Author

Decided to go ahead and make the line codec threadsafe logstash-plugins/logstash-codec-line/pull/13

@colinsurprenant
Contributor Author

new threadsafe line codec plugin version 3.0.6 pushed.

@colinsurprenant
Contributor Author

After discussing this further: the line codec patch logstash-plugins/logstash-codec-line#13 does fix the problem, but it might not be the optimal solution here WRT the UDP input plugin.

  • UDP does not guarantee ordering, so using a shared buffer to reconstruct a UDP "stream" in the line codec does not make much sense.
  • A more efficient strategy would be to have separate line codec instances per worker and treat each UDP socket read as a separate "complete" unit of data that can be correctly decoded. This means that each decode call should be systematically followed by a flush.
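This per-datagram strategy could look something like the sketch below. The codec interface (`decode`/`flush` yielding events) is assumed loosely from the plugin API; the helper name is invented:

```ruby
# Each worker owns its own codec instance; every datagram is decoded and then
# immediately flushed, so each UDP packet is treated as a complete "stream"
# and no buffered state leaks between unordered packets. Illustrative only.
def decode_datagram(codec, payload)
  events = []
  codec.decode(payload) { |event| events << event }
  codec.flush { |event| events << event } # emit any buffered remainder
  events
end
```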

One potential problem I could think of with this strategy is if the kernel UDP receive+buffering is non-atomic and a socket read could return a partially copied UDP packet that would be completed by a subsequent read operation, in which case the above patch would be required. WDYT @jordansissel ?

Another problem I can think of is if (and this is a big if) someone wanted to implement some higher-level protocol in a codec; that codec would probably be better off being shared across threads in the UDP input plugin.

@colinsurprenant
Contributor Author

It seems that BSD-based implementations use lists of mbufs, and each mbuf holds a single udp packet's data, so in that respect socket read operations would always return a full udp packet. I believe this should also be true in Linux implementations.

@colinsurprenant
Contributor Author

Interesting related question on SO that talks about truncation: https://stackoverflow.com/questions/3069204/reading-partially-from-sockets

But we provide sensible option defaults to deal with that:

  # The maximum packet size to read from the network
  config :buffer_size, :validate => :number, :default => 65536

  # The socket receive buffer size in bytes.
  # If option is not set, the operating system default is used.
  # The operating system will use the max allowed value if receive_buffer_bytes is larger than allowed.
  # Consult your operating system documentation if you need to increase this max allowed value.
  config :receive_buffer_bytes, :validate => :number

@colinsurprenant
Contributor Author

So it seems we do not need to guard against partial socket reads.

@jordansissel
Contributor

jordansissel commented Dec 13, 2017

@colinsurprenant yeah, I don't think we would get partial reads. The maximum udp packet size is 65536 bytes (the IP packet length field is a 2-byte value). The network stack should deliver datagrams one at a time, so no read call should ever return data from two different UDP packets.

This means that each decode call should be systematically followed by a flush.

+1

I enjoyed reading the summary of your research on this topic! :)

@andrewvc
Contributor

I enjoyed reading the research too.

Some thoughts:

  1. Why is the receive buffer size configurable at all? It seems like there is one correct value. I think we should deprecate it today and obsolete it in 7.0.
  2. In what scenario do we actually want a threadsafe codec? If the right answer is to clone the codec for the local thread then isn't that encouraging bad behavior? In a multi-threaded situation that creates a choke point.
  3. Perhaps we can change the plugin API to encourage thread local codecs. Maybe a factory method for getting new ones would be nice.
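Point 3 could look something like the sketch below; `CodecFactory` and its interface are hypothetical, not an existing Logstash API:

```ruby
# Hypothetical factory that hands each input worker its own codec copy,
# encouraging thread-local codecs instead of one shared instance.
class CodecFactory
  def initialize(prototype)
    @prototype = prototype
  end

  # Each call returns an independent clone, so workers share no codec state.
  def create
    @prototype.clone
  end
end
```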

@IrlJidel
Contributor

IrlJidel commented Dec 14, 2017

I added the option for receive_buffer_bytes.

There isn't one correct value. OS default could be 64K while max allowed could be set to 1024K.

@andrewvc
Contributor

@IrlJidel sure, my question is what's the use case for setting it lower? You could always truncate the message later in the filter section of logstash.

@jordansissel
Contributor

@andrewvc receive_buffer_bytes is used when setting up the socket to set the buffer the OS network stack uses to store data to be read by the application. Honestly, not sure if most users will need to set this, and I'm open to exploring this setting being omitted. It exists to set the buffer size of unread bytes before the OS starts dropping packets and is useful in cases where a burst of data comes in temporarily faster than Logstash can process.

What you describe "you could always truncate" is about the confusingly-named buffer_size setting which sets the size of the read() call. Given IPv4 and IPv6 both have a 65536 (2-byte) maximum packet size, the buffer_size makes no sense to be larger than the default value it already has, and I am OK removing it.
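The truncation behavior behind `buffer_size` can be demonstrated directly with a pair of loopback UDP sockets (a standalone illustration, not plugin code):

```ruby
require "socket"

# A recvfrom with a maxlen smaller than the waiting datagram silently
# truncates it; the discarded remainder is NOT returned by a later read.
# This is why buffer_size defaults to 65536, the maximum datagram size.
receiver = UDPSocket.new
receiver.bind("127.0.0.1", 0)
port = receiver.addr[1]

sender = UDPSocket.new
sender.send("x" * 100, 0, "127.0.0.1", port)

data, _addr = receiver.recvfrom(10) # ask for only 10 bytes
sender.close
# data now holds 10 bytes; the other 90 bytes of the datagram are gone
receiver.close
```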

@jordansissel
Contributor

In what scenario do we actually want a threadsafe codec?

Codecs aren't bound to threads, they are bound to streams. In the TCP input, for example, one connection is one stream. For UDP, one packet is one stream. For the File input, one file is one stream. For Kafka, one message is one stream. For HTTP, one request body is one stream.
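That binding can be sketched as one codec instance per stream key (a connection, a file, a request, ...); the class and names here are invented for illustration:

```ruby
# One codec per stream: lazily clone a fresh codec for each stream key,
# e.g. a TCP connection, a file path, or an HTTP request id. Threads never
# share a codec unless they share a stream.
class StreamCodecs
  def initialize(prototype)
    @prototype = prototype
    @codecs = Hash.new { |h, key| h[key] = @prototype.clone }
  end

  def codec_for(stream_key)
    @codecs[stream_key]
  end
end
```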

@IrlJidel
Contributor

IrlJidel commented Dec 14, 2017

receive_buffer_bytes is the socket receive buffer size, which holds multiple packets.

It's an important setting for us to handle microbursts. The main udp thread is always the bottleneck on our system, and if the buffer is set too low it results in udp drops.

13479 XXXXXX    20   0 10.6g 1.3g  11m S 29.6  2.1 531:03.35 [YYYY]<udp
13483 XXXXXX    20   0 10.6g 1.3g  11m S  9.0  2.1 159:47.11 <udp.0
13484 XXXXXX    20   0 10.6g 1.3g  11m S  9.0  2.1 159:54.72 <udp.1

@colinsurprenant
Contributor Author

A few notes about buffer sizes:

  • a single recvfrom call will return a single datagram from the receive buffers and will truncate it if it is larger than @buffer_size. This is why the default @buffer_size is 65536: a datagram cannot be larger than 64k (16-bit length field in the protocol), so the default prevents any potential truncation.
  • I am not sure why anyone would want to change @buffer_size; there is no point in setting it larger than 64k. Maybe for memory usage optimization you could lower it if you know you will never receive large datagrams.
  • for @receive_buffer_bytes, as @jordansissel and @IrlJidel pointed out this is for the kernel socket buffer size. Per Linux man:
    SO_RCVBUF
              Sets or gets the maximum socket receive buffer in bytes.  The
              kernel doubles this value (to allow space for bookkeeping
              overhead) when it is set using setsockopt(2), and this doubled
              value is returned by getsockopt(2).  The default value is set
              by the /proc/sys/net/core/rmem_default file, and the maximum
              allowed value is set by the /proc/sys/net/core/rmem_max file.
              The minimum (doubled) value for this option is 256.
  • As noted by @IrlJidel I believe it is important to provide the flexibility of setting a custom @receive_buffer_bytes depending on the use-case and performance characteristics of the udp input.
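The doubling described in the man page is easy to observe from Ruby (a standalone sketch; the requested size is arbitrary):

```ruby
require "socket"

# Request a 128 KiB receive buffer and read back what the kernel granted.
# On Linux the value read back is doubled (bookkeeping overhead) and capped
# at /proc/sys/net/core/rmem_max for unprivileged processes.
sock = UDPSocket.new
sock.setsockopt(Socket::SOL_SOCKET, Socket::SO_RCVBUF, 128 * 1024)
granted = sock.getsockopt(Socket::SOL_SOCKET, Socket::SO_RCVBUF).int
puts "SO_RCVBUF: #{granted} bytes"
sock.close
```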

@IrlJidel
Contributor

Ok thanks for keeping it ☺

Sorry if this is off topic, but the main reason for the udp bottleneck is locking, since we're only doing a recvmsg call per packet. Ruby doesn't support recvmmsg, but I wonder what perf would be like if we used Java, just like the tcp input.

@colinsurprenant
Contributor Author

In what scenario do we actually want a threadsafe codec? If the right answer is to clone the codec for the local thread then isn't that encouraging bad behavior? In a multi-threaded situation that creates a choke point.

@andrewvc not sure I am following. As @jordansissel mentioned, a codec needs to be bound to a stream. This whole issue was raised because the UDP input plugin uses multiple worker threads that were sharing a single codec instance. In the case of the line codec this created a problem because the line codec is not threadsafe. My first idea was to fix the line codec thread safety issue (which in fact solves the issue), but that overlooked the fact that UDP cannot form a stream, since UDP packet ordering is not guaranteed. It turns out that each UDP packet (resulting from the @udp.recvfrom_nonblock(@buffer_size) call) should be considered a complete stream in itself, so each input worker can just safely clone the configured codec. In that respect I updated the udp input in logstash-plugins/logstash-input-udp/pull/32 and reverted my line codec thread safety fix in logstash-plugins/logstash-codec-line#14.
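Putting the pieces together, the final approach can be sketched roughly like this; the queue, codec interface, and worker loop are stand-ins for the real plugin internals, not the merged code:

```ruby
# Each worker clones the configured codec and treats every datagram popped
# from the queue as its own complete stream: decode, then flush. A nil
# sentinel on the queue stops a worker.
def run_workers(base_codec, packet_queue, worker_count, output)
  worker_count.times.map do
    Thread.new do
      codec = base_codec.clone # private per-thread copy, nothing shared
      while (payload = packet_queue.pop)
        codec.decode(payload) { |event| output << event }
        codec.flush { |event| output << event } # each datagram is complete
      end
    end
  end
end
```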

@andrewvc
Contributor

andrewvc commented Dec 14, 2017 via email

@colinsurprenant
Contributor Author

Closing - this is now resolved with latest udp input and line codec releases.
