Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Problems with Content-Length and Transfer-Encoding #2427

Open
3 tasks
jeddenlea opened this issue Feb 9, 2021 · 8 comments · May be fixed by #2432
Open
3 tasks

Problems with Content-Length and Transfer-Encoding #2427

jeddenlea opened this issue Feb 9, 2021 · 8 comments · May be fixed by #2432
Labels
A-body Area: body streaming. A-headers Area: headers.

Comments

@jeddenlea
Copy link

This could potentially be seen as a couple of different issues, but they're all tightly overlapping, and can probably all be solved with a common fix. This also overlaps with #2215.

Currently, Content-Length: 0 may be automatically applied to any 204 response (just like 101 responses in #2215). Also, other Content-Length headers can be set explicitly within a service, or be implied by a body with a KnownLength. A 204 response MUST NOT contain either Content-Length or Transfer-Encoding.

On the other hand, a 304 response will never have a body, but MAY contain either Content-Length or Transfer-Encoding. Currently, the only way to get a Content-Length header attached to a 304 response is to try to actually send a Body of that length.

The headers made in response to a HEAD request should match those which would have been made for a GET request. Currently, if a service attempts to force chunked encoding on what might otherwise imply a Content-Length header be added, GET requests will see Transfer-Encoding: chunked while a HEAD request receives Content-Length: .... It is not currently possible to set Transfer-Encoding on a response to a HEAD request.

It seems the things which need to be fixed are:

  • For 204s, 100..199s, and any 2xx response made to a CONNECT request, any Content-Length or Transfer-Encoding header set by a service must be dropped. Further, neither header should be automatically added.
  • Any Content-Length or Transfer-Encoding header set by a service for any kind of response should be handled the same for GET and HEAD requests.
  • Any Content-Length or Transfer-Encoding header set by a service for a 304 response should be left alone. (perhaps unless a BodySize::Known greater than 0 conflicts with a Content-Length header, which could be an error?)

For reference, https://httpwg.org/specs/rfc7230.html#rfc.section.3.3

@seanmonstar
Copy link
Member

Thanks for the write-up, yea let's fix this! Not that you have to, but since you quoted a few types internal to hyper, do you have a suggestion for how to adjust the code to fix these? It's fine if they need separate fixes, we can make small PRs if need be...

@jeddenlea
Copy link
Author

jeddenlea commented Feb 9, 2021

I think the can_have_body and can_chunked functions are a little confusing at the moment, because they are the same function, and conflate HEAD requests as well as certain statuses...

I've put together a draft, jeddenlea@c5dd693

Are there some regression tests for all of this I could take a look at and alter if necessary?

@seanmonstar
Copy link
Member

I agree that those functions are confusing. There's tests in role.rs, at the bottom, but they might not test those functions directly. I think there's some encode tests. And then there's a set of body length tests for responses in tests/server.rs.

Want to post what you have in PR?

@jeddenlea jeddenlea linked a pull request Feb 10, 2021 that will close this issue
@jeddenlea
Copy link
Author

@seanmonstar sure! I think I've cleaned it up enough to maybe actually work. #2432

I'm still not 100% happy with the new can_have_body and imply_body_only. Maybe there's a better way of expressing those ideas, or maybe I'll just get used to it :)

@jeddenlea
Copy link
Author

I'm new enough to Rust that it took me a while just to figure out how to run the tests under /tests/, but they've proven helpful! I'm also fairly new to Hyper in general, so I'm not sure what patterns are broadly used with it. I'm struggling a little bit to come up with an answer that fits all cases without potentially causing unexpected regressions for existing use cases.

For instance, take a HEAD request. I would expect most naive services to not handle GET or HEAD differently, and if it doesn't set Content-Length itself, it'll be inferred from the Option<BodyLength>. However, if a service did want to conserve energy for a HEAD request, it's not obvious what the right way to do that is. Body::empty() seemed to work, but that ended up being the same things as Body::from("") because From<Bytes> for Body uses Body::empty() if the implied input is empty.

And, encode would take None and write content-length: 0. I think "" should be Some(Known(0)), but Empty should be None. We don't want to be attaching Content-Length: 0 to 304 responses unless the real response would have been something with Content-Length: 0.

So, I'm trying to make Empty and "" slightly more distinct. This subtle change now has me digging through a lot of the integration test macros to figure out what the heck is going on :)

@seanmonstar
Copy link
Member

I think we need to keep empty and "" the same. It's likely behavior that is relied upon, and so I don't think we can change it really...

@jeddenlea
Copy link
Author

I am worried about breaking the similarity too. However, having maybe made it work in my draft, it doesn't appear to be as bad as I had feared.

The changes to behavior are all reasonable, at least. Maybe some projects' test suites will require subtle adjusting, but there are no actual substantive changes. Everything is equivalent on the wire.

First, which should not be controversial, 204s, any 1xx, and any 2xx response to a CONNECT will now never have either Content-Length or Transfer-Encoding. This is an essential fix.

The more worrying subtle changes happen with empty bodies on HEAD requests and 304 responses...

req resp body current proposed
GET 200 "" content-length: 0 content-length: 0
HEAD 200 "" content-length: 0 content-length: 0
GET 200 Body::empty() content-length: 0 content-length: 0
HEAD 200 Body::empty() content-length: 0 _
GET 304 "" _ content-length: 0
HEAD 304 "" _ content-length: 0
GET 304 Body::empty() _ _
HEAD 304 Body::empty() _ _

Of the changes, the worst really are the addition of content-length: 0 to 304 responses that happen to use anything other than Body::empty(). And, even those don't strike me as terrible if gone uncaught, though they may technically break the spec.

I've considered allowing the addition of content-length to 304s based on the body to only happen if the length is greater than 0. That kind of special casing can get annoying though; 0 is just as valid a length as 1...

@jeddenlea
Copy link
Author

On the 304 front, the most conservative change we could make could be to simply never guess Content-Length or Transfer-Encoding unless the service sets a header itself.

For HEADs, unfortunately I don't think there's a clean path forward without potentially changing something in a way which someone may find unexpected. I imagine there may be some accidental, incorrect Content-Length: 0 generated from Body::empty() today which would be fixed. But, some number may be intentional as well.

jeddenlea added a commit to jeddenlea/hyper that referenced this issue Feb 23, 2021
Fixes hyperium#2427

204s must never have either Content-Length or Transfer-Encoding headers.
Nor should any 1xx response. Nor should any 2xx response made to a
CONNECT request.

On the other hand, any other kind of response may have either a
Content-Length or Transfer-Encoding header.  This includes 304s in
general, and any response made to a HEAD request. In those of those
cases, the headers should match what would have been sent with a normal
response.

The trick then is to ensure that such headers are not mis-inferred. When
responding to a HEAD request, a service may reasonably opt out of
computing the full response, and thus may not know how large it would
be. To add automatically add `Content-Length: 0` would be a mistake.

As Body::empty() seems to be the defacto "no response" Body, it shall be
used as such.  If a service responds to a HEAD request, or to a
conditional GET request with a 304, it may specify Body::empty() as the
body, and no Content-Length or Transfer-Encoding header will be added.
In all other cases when a Content-Length header is required for HTTP
message framing, `Content-Length: 0` will be still automatically added.

Body::from("") on the other hand will now implie a specific empty Body.
It will continue to imply `Content-Length: 0` in all cases, now
including a 304 response.

Either Content-Length or Transfer-Encoding may be added explicitly as
headers for either HEAD requests or 304 responses.
jeddenlea added a commit to jeddenlea/hyper that referenced this issue Feb 23, 2021
Fixes hyperium#2427

204s must never have either Content-Length or Transfer-Encoding headers.
Nor should any 1xx response. Nor should any 2xx response made to a
CONNECT request.

On the other hand, any other kind of response may have either a
Content-Length or Transfer-Encoding header.  This includes 304s in
general, and any response made to a HEAD request. In those of those
cases, the headers should match what would have been sent with a normal
response.

The trick then is to ensure that such headers are not mis-inferred. When
responding to a HEAD request, a service may reasonably opt out of
computing the full response, and thus may not know how large it would
be. To add automatically add `Content-Length: 0` would be a mistake.

As Body::empty() seems to be the defacto "no response" Body, it shall be
used as such.  If a service responds to a HEAD request, or to a
conditional GET request with a 304, it may specify Body::empty() as the
body, and no Content-Length or Transfer-Encoding header will be added.
In all other cases when a Content-Length header is required for HTTP
message framing, `Content-Length: 0` will be still automatically added.

Body::from("") on the other hand will now implie a specific empty Body.
It will continue to imply `Content-Length: 0` in all cases, now
including a 304 response.

Either Content-Length or Transfer-Encoding may be added explicitly as
headers for either HEAD requests or 304 responses.
jeddenlea added a commit to jeddenlea/hyper that referenced this issue Mar 2, 2021
Fixes hyperium#2427

204s must never have either Content-Length or Transfer-Encoding headers.
Nor should any 1xx response. Nor should any 2xx response made to a
CONNECT request.

On the other hand, any other kind of response may have either a
Content-Length or Transfer-Encoding header.  This includes 304s in
general, and any response made to a HEAD request. In those of those
cases, the headers should match what would have been sent with a normal
response.

The trick then is to ensure that such headers are not mis-inferred. When
responding to a HEAD request, a service may reasonably opt out of
computing the full response, and thus may not know how large it would
be. To add automatically add `Content-Length: 0` would be a mistake.

As Body::empty() seems to be the defacto "no response" Body, it shall be
used as such.  If a service responds to a HEAD request, or to a
conditional GET request with a 304, it may specify Body::empty() as the
body, and no Content-Length or Transfer-Encoding header will be added.
In all other cases when a Content-Length header is required for HTTP
message framing, `Content-Length: 0` will be still automatically added.

Body::from("") on the other hand will now implie a specific empty Body.
It will continue to imply `Content-Length: 0` in all cases, now
including a 304 response.

Either Content-Length or Transfer-Encoding may be added explicitly as
headers for either HEAD requests or 304 responses.
@davidpdrsn davidpdrsn added A-body Area: body streaming. A-headers Area: headers. labels May 7, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-body Area: body streaming. A-headers Area: headers.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants