-
-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Http optimizations #8002
Http optimizations #8002
Conversation
454f744
to
d8e2294
Compare
Instead of storing headers internally as `Hash(String, Array(String))` we store them as `Hash(String, String | Array(String))`. This involves a bit more logic when dealing with this union but it saves a fair amount of memory because for headers with just a single value, which is the most common case, we avoid allocating memory for an array.
1ddca5b
to
023be0a
Compare
023be0a
to
258cd80
Compare
Sorry about the constant rebase, next time I'll keep adding commits to make it easier to review and just at the end I'll rebase everything before merging (I learned |
Instead of using `String#split`, which creates an array with three strings, we find the space indexes and create subslices/substrings for each of the pieces. We also avoid allocating a string for common HTTP methods (GET, POST, etc.) and for the supported HTTP versions. Finally, we use `IO#peek` to see if we can find the request line there instead of allocating a String for it.
258cd80
to
569d813
Compare
|
Ooooh... I didn't know that. I'll try it next time. Thanks! |
When creating an `HTTP::Request` and passing it some `HTTP::Headers`, the headers are dupped to prevent the request from modifying data that the user might hold. However, dupping the headers when parsing a request from an IO is not necessary. This avoid some unneeded memory allocations.
We try to use `IO#peek` and read header lines directly from there, avoiding an extra String allocation for the entire request line. Then we avoid allocating strings for common header field names like `Host` and `Content-Length`).
569d813
to
614d0f9
Compare
Thank you @asterite |
I was still reviewing this and had changes to request :< |
Oh, I'm sorry 😢 I should've just merged it yesterday right away^^ Just a suggestion: When I'm reviewing a PR that might get merged while doing that, I tend do request a review from myself to signal that I'm currently looking at it (or intend to do so). |
By the way, I benchmarked the simple http server on the samples directory before and after this change with Doing:
Before:
After:
And I'm almost sure if we change Hash to have an open addressing implementation it could go up to By comparison, doing the same benchmark against a simple server in Go gives these results:
However, Go handles parallelism and when we'll have parallelism the performance will get a bit worse, but on the other hand a single request doing expensive CPU won't be able to stop the server from receiving other requests. |
@asterite could you test with |
@RX14 Sure! Here it it:
(I don't know if those values are good, I just copied them from their github repo) Before this PR:
After this PR:
With a "hypothetical" Hash with open addressing:
Go:
So the results are very different from |
@asterite is that Go with |
Also, are you going to address the review? |
No, that's without specifying If I pass
So I guess if with parallelism we can be close to Go numbers it'll be more than good ( /cc @waj @bcardiff )
Yes, but in a couple of hours. |
These are a series of refactors to improve parsing of HTTP requests. Check each commit for more details (each has a description in it).
This is more code, more complex code, and more low-level code, but that's the whole idea of Crystal: if you want to go deeper and optimize hot paths you can do it in Crystal itself.
I wrote this benchmark:
I read from a file to simulate reading from an external resource, though the entire file probably fits
IO::Buffered
's buffer (but the same should be true forSocket
). But later I'll show the results withIO::Memory
too.If I run the above benchmark against
master
:When against this PR:
So a 30% improvement! 😄
Also note the memory allocated per op: 2.1kB before, now 816B. This is the main reason it's faster.
If I use
IO::Memory
instead ofFile
I get:Since HTTP servers are pretty common in Crystal I thought this is a good way to optimize all apps out there. I know usually a lot more goes on in a typical web server (for example rendering) but the less time and memory the framework takes for itself, the better.
I didn't benchmark an HTTP::Server with
ab
or similar after this change, but you are welcome to do that and post the results here!