
Need pvc namespace passed to CSI driver #170

Closed
lpabon opened this issue Nov 21, 2018 · 16 comments · Fixed by #274

Comments

@lpabon
Member

lpabon commented Nov 21, 2018

The in-tree Portworx Kubernetes driver currently relies on determining the namespace of the PVC during creation.

There are two issues with the current implementation, neither of which passes this information during creation of a CSI volume:

1: On mounts, the sidecar containers can pass the pod.Namespace (or, more likely, the PVC name and namespace) to the CSI driver. We would like to have the same information on creation.

and

2: Secrets from the PVC's namespace are passed for every call except creation.

We need 1 to work, and we would also really like 2, so that obtaining secrets is more efficient.
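
For context, a minimal Go sketch of what the driver side could do if the provisioner forwarded PVC metadata as CreateVolume parameters. The parameter keys below are illustrative assumptions, not keys defined by the CSI spec:

package main

import "fmt"

// Illustrative parameter keys. These are assumptions about how a provisioner
// could forward PVC metadata to CreateVolume; the CSI spec itself does not
// define them.
const (
	pvcNameKey      = "csi.storage.k8s.io/pvc/name"
	pvcNamespaceKey = "csi.storage.k8s.io/pvc/namespace"
)

// pvcInfoFromParams extracts the PVC name and namespace from CreateVolume
// parameters, if the provisioner passed them along.
func pvcInfoFromParams(params map[string]string) (name, namespace string, ok bool) {
	name, nameOK := params[pvcNameKey]
	namespace, nsOK := params[pvcNamespaceKey]
	return name, namespace, nameOK && nsOK
}

func main() {
	params := map[string]string{
		pvcNameKey:      "data-cassandra-0",
		pvcNamespaceKey: "tenant-a",
	}
	if name, ns, ok := pvcInfoFromParams(params); ok {
		fmt.Printf("provisioning volume for PVC %s/%s\n", ns, name)
	}
}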

@saad-ali
Member

The request is to:

  1. Ability to set namespace for secret during provisioning
    • Need to discuss security implications with @liggitt
  2. Be able to pass pvc info (namespace, name, id) to provisioning call, like we do for mount.
    • What are the use cases?

@liggitt
Contributor

liggitt commented Nov 29, 2018

Did #69 not address using secrets in other namespaces? (documented at https://kubernetes-csi.github.io/docs/secrets-and-credentials.html)

provisionerCredentials, err := getCredentials(p.client, provisionerSecretRef)
if err != nil {
	return nil, err
}
req.Secrets = provisionerCredentials
// Resolve controller publish, node stage, node publish secret references
controllerPublishSecretRef, err := getSecretReference(controllerPublishSecretNameKey, controllerPublishSecretNamespaceKey, options.Parameters, pvName, options.PVC)
if err != nil {
	return nil, err
}
nodeStageSecretRef, err := getSecretReference(nodeStageSecretNameKey, nodeStageSecretNamespaceKey, options.Parameters, pvName, options.PVC)
if err != nil {
	return nil, err
}
nodePublishSecretRef, err := getSecretReference(nodePublishSecretNameKey, nodePublishSecretNamespaceKey, options.Parameters, pvName, options.PVC)
if err != nil {
	return nil, err
}
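
The getSecretReference call above resolves template variables such as ${pvc.namespace} in StorageClass secret parameters. As a rough illustration only (the real provisioner restricts which variables each secret type may use, per the documentation linked above), the substitution amounts to something like:

package main

import (
	"fmt"
	"strings"
)

// resolveTemplate is a rough sketch of the substitution that getSecretReference
// performs on StorageClass secret parameters. The real provisioner validates
// which variables are allowed for each secret type; this only illustrates the idea.
func resolveTemplate(tmpl, pvName, pvcName, pvcNamespace string) string {
	r := strings.NewReplacer(
		"${pv.name}", pvName,
		"${pvc.name}", pvcName,
		"${pvc.namespace}", pvcNamespace,
	)
	return r.Replace(tmpl)
}

func main() {
	// A StorageClass parameter such as "${pvc.namespace}" resolves to the
	// namespace of the PVC being provisioned.
	fmt.Println(resolveTemplate("${pvc.namespace}", "pvc-123", "data", "tenant-a"))
}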

@liggitt
Contributor

liggitt commented Nov 29, 2018

The reason secrets in the PVC namespace are not made available for paired create/delete operations is that the PVC and its namespace may not exist at deletion time.

@lpabon
Member Author

lpabon commented Dec 5, 2018

@liggitt we only need it in create. It must exist at create time, right? Or am I wrong?

@harsh-px

harsh-px commented Dec 8, 2018

Here are some use cases that we, at Portworx, have seen over the years.

  1. Generally, the storage driver needs to know the orchestrator namespace when provisioning the volume so that it can restrict the volume bits at the same level of isolation. In a multi-tenant cluster, Kubernetes relies on namespaces to provide complete isolation between users. This isolation needs to go right down to the storage layer.
  2. If the namespace is provided, the storage driver provisioning the volume can query additional metadata for the PVC using the API server (see the sketch after this list). For example, the storage driver can get the labels applied to the PVC spec and use those to perform storage-level affinities and anti-affinities, e.g. do not co-locate any volumes that have the label app=cassandra. We have seen significant use cases for this, where users want to influence storage replica placement just like pod affinity and anti-affinity affect scheduling placement. With Portworx 2.1, we are introducing volume placement strategies which will be centered around PVC labels. Examples below:
  3. Annotations on the PVCs can be used to instruct the storage driver on custom per-PVC behaviors (during creation) which may not be apt at a StorageClass level. While the goal is for the StorageClass and the PVC spec to provide all the fields sufficient for the creation of a volume, there will always be a use case where the storage driver wants to expose functionality that is not part of the official spec yet. In the past, we have used annotations to
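
A rough sketch of the label lookup in use case 2, assuming the driver is handed the PVC namespace and name at create time (which is what this issue asks for) and holds a client-go clientset with read access to PVCs (recent client-go Get signature):

package driver

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// pvcLabels fetches the labels on a PVC so a driver could honor label-based
// placement rules (for example, "do not co-locate volumes labeled app=cassandra").
// It assumes the driver already knows the PVC namespace and name, which is
// exactly the information this issue asks to have passed at create time.
func pvcLabels(ctx context.Context, cs kubernetes.Interface, namespace, name string) (map[string]string, error) {
	pvc, err := cs.CoreV1().PersistentVolumeClaims(namespace).Get(ctx, name, metav1.GetOptions{})
	if err != nil {
		return nil, err
	}
	return pvc.Labels, nil
}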

@harsh-px

@saad-ali I've added some use cases for this request ^^

@lingxiankong

Also, we (Catalyst Cloud) have some similar requirements; I'm not sure whether they should be merged with this one. Besides the PV name/namespace, we also need the reclaim policy of the PV passed to the CSI driver (which in turn passes it to the volume properties in the storage backend), so that as a public cloud provider we know whether we should delete the volumes in the backend when the cloud user deletes the k8s cluster.

So, could we change the issue title to something like "Need pvc/pv information passed to CSI driver"? Then we could discuss what kind of information should be passed.

@lingxiankong

For the reclaim policy, to deal with the fact that the policy can be changed at any time during a PV's lifetime, I think we need the external-provisioner to call something like an 'updateVolume' method on the CSI driver, which is not in the CSI spec yet.

@msau42
Collaborator

msau42 commented Feb 11, 2019

Potentially related: #213

@saad-ali
Member

saad-ali commented Feb 21, 2019

OK, there are two asks in this issue:

  1. Ability to reference a secret from the PVC's namespace.
  2. Be able to pass PVC info (namespace, name, id) to the provisioning call, like we do for mount.
    • I'm reading @harsh-px's comments and will respond shortly.

@lingxiankong

@saad-ali thanks for responding

@saad-ali
Member

  1. Generally, the storage driver needs to know the orchestrator namespace when provisioning the volume so that it can restrict the volume bits at the same level of isolation. In a multi-tenant cluster, Kubernetes relies on namespaces to provide complete isolation between users. This isolation needs to go right down to the storage layer.

So you record some metadata when a volume is provisioned to say it belongs to namespace foo? And then when a volume is mounted, you verify that the pod is in namespace foo? If so, why? Kubernetes does this enforcement already.

  2. If the namespace is provided, the storage driver provisioning the volume can query additional metadata for the PVC using the API server. For example, the storage driver can get the labels applied to the PVC spec and use those to perform storage-level affinities and anti-affinities, e.g. do not co-locate any volumes that have the label app=cassandra. We have seen significant use cases for this, where users want to influence storage replica placement just like pod affinity and anti-affinity affect scheduling placement. With Portworx 2.1, we are introducing volume placement strategies which will be centered around PVC labels. Examples below:

This is a great use case. But making a CSI driver reach into Kubernetes to figure this out on its own is a hack.

Kubernetes and CSI already support topology where a volume is only accessible by certain nodes in a cluster. However, Kubernetes and CSI don't provide a way to handle the case where a volume is equally accessible by all nodes but has some internal storage system topology that can influence application performance.

Rather than poking a hole in the API to make the hack easier, I would strongly suggest working with the community to come up with a generic way to influence storage-specific topology. A good place to start is the long-standing CSI issue already opened for this: container-storage-interface/spec#44.
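
For reference, a rough Go sketch of the node-accessibility topology that CreateVolume already carries today via the CSI spec's Go bindings; the storage-internal placement hints discussed above have no counterpart in this structure, which is the gap raised in the linked issue:

package driver

import (
	csi "github.com/container-storage-interface/spec/lib/go/csi"
)

// preferredSegments reads the node-accessibility topology that the CO already
// passes to CreateVolume today (for example zone/region segments). It is only
// a reference sketch: the storage-internal placement hints discussed above
// have no counterpart in this structure.
func preferredSegments(req *csi.CreateVolumeRequest) []map[string]string {
	var segments []map[string]string
	if tr := req.GetAccessibilityRequirements(); tr != nil {
		for _, t := range tr.GetPreferred() {
			segments = append(segments, t.GetSegments())
		}
	}
	return segments
}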

  3. Annotations on the PVCs can be used to instruct the storage driver on custom per-PVC behaviors (during creation) which may not be apt at a StorageClass level. While the goal is for the StorageClass and the PVC spec to provide all the fields sufficient for the creation of a volume, there will always be a use case where the storage driver wants to expose functionality that is not part of the official spec yet. In the past, we have used annotations to

Annotations on PVCs MUST NOT be passed to CSI drivers. The Kubernetes PVC object is intended for application portability. When we start leaking cluster/implementation-specific details into it, we are violating that principle. And explicitly passing PVC annotations to CSI drivers encourages that pattern.

Let's discuss the specific use cases you have in mind, and see if we can come up with better solutions for each of those use cases (for example, the use case you pointed out above) rather than opening up a hole in the API.

Also, we (Catalyst Cloud) have some similar requirements; I'm not sure whether they should be merged with this one. Besides the PV name/namespace, we also need the reclaim policy of the PV passed to the CSI driver (which in turn passes it to the volume properties in the storage backend), so that as a public cloud provider we know whether we should delete the volumes in the backend when the cloud user deletes the k8s cluster.

For the reclaim policy, to deal with the fact that the policy can be changed at any time during a PV's lifetime, I think we need the external-provisioner to call something like an 'updateVolume' method on the CSI driver, which is not in the CSI spec yet.

The Kubernetes cluster does leak resources on deletion today. That is a problem, but fixing it at the storage system layer is a hack. Cleaning up cluster resources (and ensuring PVC reclaim policy) on cluster deletion is the responsibility of Kubernetes or the Kubernetes deployment system. Please open an issue on https://github.com/kubernetes/kubernetes/issues to do the right thing at those layers.

@lingxiankong

The Kubernetes cluster does leak resources on deletion today. That is a problem, but fixing it at the storage system layer is a hack. Cleaning up cluster resources (and ensuring PVC reclaim policy) on cluster deletion is the responsibility of Kubernetes or the Kubernetes deployment system. Please open an issue on https://github.com/kubernetes/kubernetes/issues to do the right thing at those layers.

Hi @saad-ali, thanks for your reply. I'm confused about something.

Cleaning up cluster resources (and ensuring PVC reclaim policy) on cluster deletion is the responsibility of Kubernetes or the Kubernetes deployment system

I agree that resource cleanup is the responsibility of the Kubernetes deployment system, which in this case is the public cloud provider and not Kubernetes itself, because when the end user deletes a Kubernetes cluster, it is impossible for the cloud system to log into the Kubernetes cluster and delete everything automatically.

Usually, the cloud system relies on resource metadata/description/tags to identify which resources belong to the Kubernetes cluster, and that metadata/description/tags are set when the resource is created. I have some examples:

  • The load balancer created in the OpenStack cloud for a Kubernetes Service of type LoadBalancer has the cluster information in its description, see here.
  • The volume created in the cloud (by the in-tree PV controller) for the PV has PV name/namespace information in its properties, see here. However, when switching to kubernetes-csi, there is only a volName param left, indicating the PV's name. IMHO, we lost feature parity in the migration from the in-tree PV controller to CSI.

@kerneltime

kerneltime commented Apr 19, 2019

We have a use case for managing thousands of datasets that in turn can have an arbitrary number of versions.
These datasets are created and consumed within Kubernetes and have usage outside of Kubernetes as well.
The mapping of which dataset a volume refers to is currently done via the StorageClass. When creating a volume (filesystem), there is no ability to address a specific version.
Currently we use a fork to test things out.
The snapshots model also runs into similar friction, from what I can tell.
The only solution currently seems to be for a pod to directly address the volume (ephemeral volume). The creation of a new version that is written will have to be implied based on the spec passed in, versus being able to make any kind of central decision for it.

@msau42
Collaborator

msau42 commented Apr 19, 2019

@kerneltime for your use case, is your dataset read-only? Can users create a PVC based on some dataset, modify it, and persist it separately from another PVC based off of the same dataset? I'm trying to understand whether CSI ephemeral volumes can suit your use case better than PVs.

@kerneltime

There are two kinds: read-only and new datasets. If developers modify a dataset, it becomes a new version.
Ephemeral volumes should work, and for scaling the number of volumes (millions over a few months for the same etcd process) they might be the only scalable way.
