Support anonymous reads and authenticated writes #381
The topic of allowing some clients read-only access has come up in the past, but I haven't had a concrete-enough use case to spend time investigating such a feature yet. However it might actually be possible to set something like this up with the httpproxy backend: use a primary bazel-remote instance with large storage and authentication, and a secondary bazel-remote instance with |
That is an interesting approach that does sound like it could work today. I hadn't thought of that, although I'm not sure I like it. I think it would be a common setup where:
What do you think of that setup? Presumably, a flag could be introduced that only checks for auth when the http method is not GET and that would satisfy the requirement? |
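To make the proposed flag concrete, here is a minimal sketch of an HTTP wrapper that only asks for credentials when the method is not GET or HEAD. It is not bazel-remote's code: `requiresAuth`, `authForWrites`, `checkAuth`, the demo credentials and the stub cache handler are all hypothetical names chosen for illustration.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// requiresAuth reports whether a request with this HTTP method must carry
// valid credentials. GET and HEAD are treated as anonymous reads.
func requiresAuth(method string) bool {
	return method != http.MethodGet && method != http.MethodHead
}

// authForWrites wraps next and checks basic auth only for non-read requests.
// checkAuth is a hypothetical callback standing in for the real credential
// check (for example, a lookup in an htpasswd file).
func authForWrites(next http.Handler, checkAuth func(user, pass string) bool) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if requiresAuth(r.Method) {
			user, pass, ok := r.BasicAuth()
			if !ok || !checkAuth(user, pass) {
				w.Header().Set("WWW-Authenticate", `Basic realm="cache"`)
				http.Error(w, "authentication required", http.StatusUnauthorized)
				return
			}
		}
		next.ServeHTTP(w, r)
	})
}

// demoHandler builds a stub cache that accepts the made-up credentials
// "ci"/"secret" for writes and answers 200 once a request gets through.
func demoHandler() http.Handler {
	stub := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	return authForWrites(stub, func(user, pass string) bool {
		return user == "ci" && pass == "secret"
	})
}

// statusFor sends one in-memory request and returns the response status code.
// An empty user means the request is sent without credentials.
func statusFor(h http.Handler, method, user, pass string) int {
	req := httptest.NewRequest(method, "/ac/0123", nil)
	if user != "" {
		req.SetBasicAuth(user, pass)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	h := demoHandler()
	fmt.Println("anonymous GET:", statusFor(h, "GET", "", ""))
	fmt.Println("anonymous PUT:", statusFor(h, "PUT", "", ""))
	fmt.Println("authenticated PUT:", statusFor(h, "PUT", "ci", "secret"))
}
```

In this sketch the decision is made purely on the HTTP method, so it covers only the HTTP cache API; gRPC clients would need a separate check.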
Maybe there's an even simpler proxy solution: run one bazel-remote instance with authentication, and set up a simple unauthenticated http proxy that forwards GET and HEAD requests (and maybe also cas PUT requests) to it and rejects everything else. This could be a starting point (add some checks in the handler): https://gist.github.com/JalfResi/6287706 - I think it would be worth investigating this modular setup a bit before considering adding this feature to bazel-remote. Of course you can probably also set up nginx or apache to do this kind of proxying, but they have a lot more configuration options you would probably need to understand. Some questions spring to mind, if we were to consider adding this as a feature:
|
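The proxy idea above could look roughly like the following sketch, which uses Go's standard `httputil.ReverseProxy`. It forwards only GET and HEAD, attaches the proxy's own credentials to the upstream request, and rejects everything else. The function names, the "ci"/"secret" credentials and the stub backend are assumptions for illustration; the optional "also allow cas PUT" variant is not implemented here.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/http/httputil"
	"net/url"
)

// forwardable reports whether the proxy passes a method on to the cache.
func forwardable(method string) bool {
	return method == http.MethodGet || method == http.MethodHead
}

// newReadOnlyProxy returns an unauthenticated handler that forwards read
// requests to the authenticated cache at upstream, adding the given
// (hypothetical) service credentials, and rejects all other methods.
func newReadOnlyProxy(upstream *url.URL, user, pass string) http.Handler {
	proxy := httputil.NewSingleHostReverseProxy(upstream)
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !forwardable(r.Method) {
			w.Header().Set("Allow", "GET, HEAD")
			http.Error(w, "read-only proxy", http.StatusMethodNotAllowed)
			return
		}
		r.SetBasicAuth(user, pass)
		proxy.ServeHTTP(w, r)
	})
}

// demoStatus starts a stub backend that requires basic auth, sends one
// anonymous request through the proxy, and returns the status code seen by
// the anonymous client.
func demoStatus(method string) int {
	backend := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		user, pass, ok := r.BasicAuth()
		if !ok || user != "ci" || pass != "secret" {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		w.WriteHeader(http.StatusOK)
	}))
	defer backend.Close()

	upstream, err := url.Parse(backend.URL)
	if err != nil {
		panic(err)
	}
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(method, "/cas/0123", nil)
	newReadOnlyProxy(upstream, "ci", "secret").ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println("GET via proxy:", demoStatus("GET"))
	fmt.Println("PUT via proxy:", demoStatus("PUT"))
}
```

For real use, the handler returned by `newReadOnlyProxy` would be passed to `http.ListenAndServe` on whatever address developers should read from, with the upstream URL pointing at the authenticated bazel-remote instance.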
Yes, I think an unauthenticated proxy could work fine, however I still think this is a valid and potentially common use case.
I would think they would impact the LRU index, as they are still being leveraged even if anonymously.
I am not aware of any clients like this. I think it would negatively affect performance as the bandwidth would likely still be consumed. If one wanted this setup without a proper client, I would assume they would be willing to set up their own proxy like you are discussing to achieve this behavior.
Personally, a CAS is a CAS and it should be fine to upload to by anyone, however, a malicious client could lead to the emptying of cache items by populating unused CAS items. But it would probably be fine to allow for this anyway. (Does Bazel-remote validate blobs sent to /cas? or is the client expected to do so upon reading?)
They are not less reliable, however, the concern would be of the population of entries to /ac that could lead the CI/CD servers to use maliciously uploaded blobs to the cache. If crafted properly, a CI build could use/execute any random blob that a user might choose to craft. That is why I would protect the service user of the CI agents but still allow devs to read from the populated cache. I would love to be told that my fears are unfounded, as that would simplify things, but currently my understanding is this. |
It's definitely a valid use case. I have no idea how common it is. I'm pretty focused on the compressed-blobs feature at the moment, but once that lands I could look into this feature (if you would like to work on it, contributions are welcome of course). In the meantime, if you try the proxy hacks and find one that works, we could add notes to the README file.
I was mostly curious if there are any clients that have a read-only cache mode. It looks like bazel does, with
bazel-remote validates CAS items on upload- it's required by the REAPI spec, so other gRPC REAPI cache implementations should check this too.
Ah, I was thinking about this more from a reliability point of view. Your understanding is correct: the action cache is certainly more vulnerable since there's no way to check if the entries have been tampered with. Can you describe your threat model a bit more? You have trusted CI/infra, and want untrusted clients to benefit from the cache but prevent them from injecting malicious files for CI? Do you care about untrusted clients injecting malicious files for other untrusted clients? Do you care about DoS attacks (either raw traffic, or trashing the LRU/cache efficiency)? What does your network layout look like? Note that I do not consider bazel-remote to be security-hardened enough to expose it on the public internet, even with TLS and authentication. |
The threat here is an employee/developer at an enterprise tricking a CI server into using a falsified blob as an artifact that could then be sent to production. We want to allow blobs created by CI to be used by developers as a cache, but we do not want developers to be able to send PUTs to the cache and corrupt it. The network itself would all be secured behind firewalls/VPN and not exposed directly to the internet. TLS would be enabled, and authentication would be enabled for the agents that write to the cache. I am not concerned about DoS attacks in this scenario. |
I have the same setup and would be interested in a read-only option for developers and write option only from trusted servers as well. Would it work to have a configuration option to have an optional read-only port separate from the other ports? |
I will probably start working on this soon. Using a different address/port for read-write and readonly access is an option, but I think I would prefer a solution that doesn't require external configuration like firewalls/VLANs/etc. |
I have started looking into this... It seems fairly easy to support on a single port/address with basic authentication, but I'm unsure how to do this for mTLS authentication. @Mythra: any tips? If it's not possible to support unauthenticated read-only access + authenticated read-write access on the same port, then maybe adding a separate read-only port is the way to go after all. (Silly me, @cheister's suggestion wouldn't require external firewall/VLAN configuration.) |
Hey! So if you wanted to tackle mTLS-authenticated but read-only access, I'd imagine you'd have a separate intermediate/root certificate you'd validate against. In this case, from the code's point of view, you'd have two "CA files". One CA file would validate write connections, the other CA file would validate read-only connections. (These CA files work as they do now: they'll be either two separate root CAs, or much more likely in this case both CA files will contain two certs, the same root but a separate intermediate underneath.) Implementation-wise though, I'd think you'd be forced to use a separate port for gRPC. I don't think it gives you enough information to say "I validated with this particular cert chain for this connection". Each port would be set up with its own unique CA cert pool. The write port would only accept connections from your first CA file, the read port would accept connections if either CA file signed it. Then, depending on the port you connected to, you would get read-only or read-write access. |
Oh silly me! You mentioned unauthenticated read-only access. I don't think grpc-go gives you that level of fine-grained control today. I believe there is a permissive mode where it accepts both mTLS and non-mTLS connections, but then I don't know of a way to tell whether the connection you're on actually used mTLS. (Note: I may just not be seeing it in the grpc-go docs; I haven't spent much time with it.) |
Thanks- I'm not seeing an easy solution with grpc-go either. It looks like grpc methods can extract a So I'm leaning towards just using additional addresses/ports to keep things simple. |
https://jbrandhorst.com/post/grpc-auth/ has some details on how to achieve this with a single port/address. |
Separate ports could allow additional use cases such as:
|
I'm not sure how that would look with regards to command line args/config file settings, but it sounds complicated. |
If authentication is enabled, this new flag allows readonly access for unauthenticated clients. Technically, some of the "readonly" API calls do modify the cache (by affecting the LRU index, or by causing blobs to be downloaded from proxy backends), but they do not add or modify ActionCache blobs so cannot inject new data into the cache. Implements buchgr#381.
Here's a candidate PR, feedback welcome: #412 |
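As a rough illustration of what "readonly" could mean on the gRPC side, the sketch below classifies REAPI and ByteStream calls by their full method name and lets unauthenticated callers through only for the read calls. This is an assumption about one possible design, not the logic from the PR: the `readOnlyMethods` table and `allowCall` are hypothetical, and this sketch treats every write (including CAS uploads) as requiring authentication.

```go
package main

import "fmt"

// readOnlyMethods lists gRPC full method names that this sketch treats as
// reads. The names come from the REAPI v2 and ByteStream service definitions.
var readOnlyMethods = map[string]bool{
	"/build.bazel.remote.execution.v2.Capabilities/GetCapabilities":               true,
	"/build.bazel.remote.execution.v2.ActionCache/GetActionResult":                true,
	"/build.bazel.remote.execution.v2.ContentAddressableStorage/FindMissingBlobs": true,
	"/build.bazel.remote.execution.v2.ContentAddressableStorage/BatchReadBlobs":   true,
	"/build.bazel.remote.execution.v2.ContentAddressableStorage/GetTree":          true,
	"/google.bytestream.ByteStream/Read":                                          true,
}

// allowCall reports whether a call may proceed: authenticated callers may do
// anything, unauthenticated callers only the read calls listed above.
func allowCall(fullMethod string, authenticated bool) bool {
	return authenticated || readOnlyMethods[fullMethod]
}

func main() {
	get := "/build.bazel.remote.execution.v2.ActionCache/GetActionResult"
	update := "/build.bazel.remote.execution.v2.ActionCache/UpdateActionResult"
	fmt.Println("anonymous GetActionResult allowed:", allowCall(get, false))
	fmt.Println("anonymous UpdateActionResult allowed:", allowCall(update, false))
	fmt.Println("authenticated UpdateActionResult allowed:", allowCall(update, true))
}
```

A check like this would typically sit in a gRPC interceptor, where the full method name of each call is available.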
This feature is now available on the master branch, but not in a released version yet. |
It appears that when using htpasswd, one can only enforce auth for both reads and writes, or not at all. I would like to enforce auth when writing, but allow anonymous reads.