
Increasing read chunk size could improve performance for large responses #135

Closed
florimondmanca opened this issue Aug 5, 2020 · 2 comments · Fixed by #136
Comments

@florimondmanca
Member

florimondmanca commented Aug 5, 2020

Coming from discussion on Gitter with @dalf

Currently we are reading response data in chunks of 4kB…

READ_NUM_BYTES = 4096

Benchmarking with @dalf's pyhttp-benchmark tool showed that increasing this to 64kB could lead to a 2-3x improvement in execution time for large responses (typically > 256kB).

My rationale would be that reading N bytes in one go via a syscall is faster than reading n = N/k bytes k times — mostly because the kernel is way faster than Python.
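As a rough illustration of that rationale (a minimal sketch against an in-memory stream, not the pyhttp-benchmark setup itself; `read_all` and the payload size are made up for the demo):

```python
import io
import time

def read_all(stream, chunk_size):
    """Read a stream to exhaustion in fixed-size chunks, returning total bytes read."""
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # b"" signals EOF
            break
        total += len(chunk)
    return total

payload = b"x" * (1024 * 1024)  # 1MB body, as in the benchmark

for chunk_kb in (4, 64):
    stream = io.BytesIO(payload)
    start = time.perf_counter()
    total = read_all(stream, chunk_kb * 1024)
    elapsed = time.perf_counter() - start
    print(f"{chunk_kb:>3}kB chunks: {total} bytes in {elapsed:.6f}s")
```

With 64kB chunks the loop body runs 16 times instead of 256, so far less time is spent in Python-level loop overhead relative to the work done per read.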

@florimondmanca
Member Author

florimondmanca commented Aug 5, 2020

Running the 1MB-response benchmark for various READ_NUM_BYTES values, and a matplotlib script later, here's a handy little plot to support this assessment that 64kB is probably a good value…

import matplotlib.pyplot as plt

data = [
    # (READ_NUM_BYTES in kB, median runtime in s, median CPU time in s)
    (4, 2.36, 2.13),
    (16, 1.18, 0.99),
    (32, 1.07, 0.94),
    (64, 0.95, 0.81),
    (96, 0.87, 0.74),
    (128, 0.82, 0.68),
    (160, 0.80, 0.66),
]

x, runtime, cpu = zip(*data)

plt.grid()
plt.xlabel("READ_NUM_BYTES (kB)")
plt.xticks(range(0, 192, 16))
plt.ylim(0, 3)
plt.ylabel("Median execution time (s)")
plt.scatter(x, runtime)
plt.scatter(x, cpu)
plt.legend(["runtime", "cpu"])
plt.show()

[plot: median execution time (s) vs READ_NUM_BYTES (kB)]

Looks like the median execution time drops steeply as we increase the chunk size, with diminishing returns once we hit 32kB. 64kB gets us "below 1s in wall time for a 1MB response", and that sounds very satisfying. :-)
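For reference, the constant is consumed by a read loop along these lines (an illustrative sketch, not httpcore's actual code; `iter_chunks` is a made-up name):

```python
import io

READ_NUM_BYTES = 64 * 1024  # proposed new value (was 4096)

def iter_chunks(stream):
    """Yield the stream's contents in chunks of at most READ_NUM_BYTES."""
    while True:
        chunk = stream.read(READ_NUM_BYTES)
        if not chunk:  # b"" signals EOF
            return
        yield chunk

# A 1MB body comes back in 16 chunks of 64kB instead of 256 chunks of 4kB.
chunks = list(iter_chunks(io.BytesIO(b"x" * (1024 * 1024))))
print(len(chunks))  # -> 16
```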

@dalf

dalf commented Aug 5, 2020

An additional graph (made using this tool):
[image: additional benchmark graph]

  • httpcore_True_*: http2=True
  • httpcore_False_*: http2=False
