-
Hi @a-abella, apologies for the delayed response—this one slipped through the cracks. You’ve provided sufficient information; we just need to carve out some time to investigate what’s happening. In the meantime, please don’t hesitate to reach out or share any additional details.
-
Hello, I've been investigating Vector for a use case for which I'm surprised no dedicated solution exists (or at least, none that I could find): a persistent, disk-buffered HTTP POST proxy with all the retry and backoff bells and whistles.
In my configuration I have `http_server` sources that accept arbitrary JSON and protobuf payloads without decoding. I set the event body to the raw bytes in a `remap` transform, then send the raw bytes to an `http` sink with appropriate content-type headers and a blocking disk buffer. I also have acknowledgements enabled.

This is pretty much working, but I was confused by a memory-usage behavior that occurs when the buffer fills and begins to block: Vector's memory usage climbs as though it is spilling over into a secondary in-memory buffer, even though I don't have multiple buffers configured.
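In sketch form, the remap step that keeps only the raw bytes looks something like this (component names are placeholders; the source is assumed to use `decoding.codec: bytes`, which puts the raw request body in `.message`):

```yaml
transforms:
  keep_raw_body:
    type: remap
    inputs: ["proxy_in"]   # placeholder for the http_server source
    source: |
      # With the "bytes" codec the raw request body is in .message;
      # drop the metadata the source adds and forward only the payload.
      . = { "message": .message }
```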
While the memory usage grows, the `http_server` source responds with 5XX to my clients. So despite returning an error, it seems to be accepting the payload anyway and storing it in memory. The `http` sink does receive and attempt to post the events held in memory, and once the backend services stop back-pressuring, memory usage drops as events are sent and accepted. So it is working exactly like a buffer.

I enabled the `--allocation-tracing` command line argument to understand which component was consuming the memory, as I understand there are memory buffers between components in the pipeline. What I found is that the memory usage is attributed to `component_id="root"`, not to any user-defined component. By comparing timestamps, it's visible that the memory usage began increasing when the disk buffer's growth plateaued after reaching its 2 GB size.
This is the complete config for one HTTP source-sink pipeline I've been testing (configured via helm):
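Roughly, it has the following shape (component names, address, upstream URI, and header values below are illustrative placeholders):

```yaml
sources:
  proxy_in:
    type: http_server
    address: 0.0.0.0:8080
    decoding:
      codec: bytes                # accept arbitrary JSON/protobuf bodies without decoding

transforms:
  keep_raw_body:
    type: remap
    inputs: ["proxy_in"]
    source: |
      . = { "message": .message }

sinks:
  proxy_out:
    type: http
    inputs: ["keep_raw_body"]
    uri: https://backend.example.com/ingest
    encoding:
      codec: text                 # forward the raw bytes as-is
    request:
      headers:
        Content-Type: application/octet-stream
    acknowledgements:
      enabled: true
    buffer:
      type: disk
      max_size: 2147483648        # 2 GiB, the size the buffer plateaus at
      when_full: block
```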
I suppose my expectation would have been that with a full buffer, blocking mode, and acknowledgements enabled, the `http_server` sources would start rejecting client payloads without any additional in-memory buffering. I'm not sure if this is working as designed, or if I've stumbled onto something unexpected.