Metadata load error from volume KV during access update for the volume. #1371
The datastore1 type is local.
Yeah, it is a local VMFS datastore.
https://ci.vmware.run/vmware/docker-volume-vsphere/411: changing a readOnly volume to a writable volume also fails intermittently.
/CC @pshahzeb
Seen 3 failures so far today, so setting it to P0.
I also see this error stack trace:
File a separate issue for handling this?
@ashahi1 Yes, this is a separate trace for an unhandled exception in the case of invalid volume option size literals.
@shuklanirdesh82
Another instance of the same failure on CI: https://ci.vmware.run/vmware/docker-volume-vsphere/455
Hit another failure after enabling the log to dump kv_str. This is the CI run
Then the test issues an AdminCLI command to change the volume access from "read-write" to "read-only", which invokes a call stack starting at set_vol_opts -> Then the following exception happens during the call to kvESX.load:
From the log I have, somehow the kv_str read from the meta_file is an empty string. My plan is to add more debug logging to see why the kv_str read from the meta_file is empty, and to merge the code with the additional logging ASAP. Another problem I found is that in the "VolumeAccessTestSuite", after we change the volume access type, we don't have code to verify whether that change was successful, and I think we should add those checks. I will file a separate issue to track this.
Got this error in CI - https://ci.vmware.run/vmware/docker-volume-vsphere/520. The root cause may be outside of the service: the side car reads are returning 0 bytes, for whatever reason, after the VMDK is detached from the VM, and then it's fine again. This may be an ESX-side issue, as there is no indication that the service code is doing anything wrong. Immediately after the error is seen, the side car (KV) again has all the data in it. But the update from read-only to read-write has failed by then, and the VMDK is left read-only, which causes the writes from the test to fail. I'd suggest that if the load fails in kv.load(), then loop around a fixed number of retries before giving up.
The root cause for this behavior is not known, but at least when the issue happens there aren't any parallel accesses from the admin CLI and the service on the same volume. All actions are sequential, and the admin CLI sets RW access on the volume before the service takes the next docker request to attach the volume. This may be a quirk of the side-cars used for the KV, and the retry is basically to tide over the zero-byte reads on the KV.
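To make the retry suggestion concrete, here is a minimal sketch (not the project's actual code; the wrapper name load_with_retry and the retry constants are hypothetical) of a bounded retry around a kv.load()-style call that is assumed to return None or an empty dict on a bad read:

```python
import logging
import time

KV_LOAD_RETRIES = 5        # hypothetical bound on retries
KV_LOAD_RETRY_SLEEP = 0.5  # hypothetical delay in seconds between attempts


def load_with_retry(volpath, kv_load):
    """Call kv_load(volpath) and retry a few times if it returns nothing.

    The intent is only to tide over transient zero-byte reads of the
    side-car meta file; a persistent failure is still reported as None.
    """
    for attempt in range(1, KV_LOAD_RETRIES + 1):
        kv_dict = kv_load(volpath)
        if kv_dict:
            return kv_dict
        logging.warning("Empty KV read for %s (attempt %d/%d), retrying",
                        volpath, attempt, KV_LOAD_RETRIES)
        time.sleep(KV_LOAD_RETRY_SLEEP)
    return None
```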
I am able to reproduce this in my local setup. It looks like it happens after a plugin restart, and I am not sure whether the plugin restart causes this issue or not.
From the log, we can see that the meta file is empty, and therefore the "kv_str" read from the meta file is an empty string. As @govint mentioned above, we should add code to retry if the load fails in kv.load().
In kvESX.py::def load(volpath)
XXX: Can open/read throw an exception other than IOError, e.g. OSError? We would miss it and end up calling json.load with an empty kv_str. Can we change it to cc/ @msterin
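For illustration, a more defensive load() could look roughly like the sketch below (this is not the repository's actual kvESX.py; side-car path handling is omitted and the function is assumed to take the meta file path directly): catch both IOError and OSError, and never hand an empty string to json.loads:

```python
import json
import logging


def load(meta_file):
    """Read the KV side-car file and return its decoded contents, or None."""
    try:
        with open(meta_file, "r") as fh:
            kv_str = fh.read()
    except (IOError, OSError):
        # Catching both keeps an unexpected OSError from escaping the handler.
        logging.exception("Failed to read meta file %s", meta_file)
        return None

    if not kv_str:
        # Don't hand an empty string to json.loads - report the bad read instead.
        logging.error("Empty KV string read from meta file %s", meta_file)
        return None

    try:
        return json.loads(kv_str)
    except ValueError:
        logging.exception("Failed to decode KV JSON from meta file %s", meta_file)
        return None
```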
Hit it again in CI. This is what I found:
We hit this consistently after the plugin restart. Not sure if it is related to #1395, although we don't have a parallel volume create here.
Root-caused to the test not waiting for the docker operation to finish before issuing the admin update command. Confirmed that adding
@pdhamdhere Shouldn't the driver take care of this race condition, i.e. shouldn't the admin update command be blocked in this case until the docker operation is finished? I'm not really familiar with the detailed logic in this module, but I think the metadata file should be protected by a R/W lock - any read operation should be blocked while a write operation is in progress. Please correct me if I'm wrong here.
Theoretically, the underlying FS (VMFS in this case) locking should correctly handle R/W to the same file from multiple processes and should not result in a 0-byte file. Another cause could be the way SideCars are handled. However, this is the rarest of races - the admin updates the status of a volume while the KV is being written - so I didn't spend too much time on it. We should fix the test and move on to higher-priority issues. Makes sense?
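For reference, an explicit interprocess lock on the meta file would look roughly like the following sketch, assuming advisory fcntl locks are honored on the filesystem in question (which is not a given for VMFS); this is not something the service currently does, and the path in the usage comment is made up:

```python
import fcntl
from contextlib import contextmanager


@contextmanager
def meta_file_lock(meta_file, exclusive=True):
    """Hold an advisory flock on the meta file for the duration of the block."""
    mode = fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH
    fh = open(meta_file, "a")
    try:
        fcntl.flock(fh, mode)
        yield fh
    finally:
        fcntl.flock(fh, fcntl.LOCK_UN)
        fh.close()


# Hypothetical usage: both the admin CLI and the service would wrap every
# KV load/save in the same lock to serialize access across processes.
# with meta_file_lock("/path/to/volume-meta-file"):
#     ...
```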
@lipingxue But why did the write fail on the KV? Plus, a lock is held on every load and save op on the KV, so even if an admin CLI command is issued on the volume while a detach is in progress, they will both sync on that KV lock to prevent simultaneous access. From the log trace, two parallel ops are loading (locked) the KV and then saving (again locked) it. Both ops run for both threads in parallel, but they do sync on the KV lock and are hence sequenced. @lipingxue can you also share the logs of the write failing?
@govint - what KV lock are you talking about?
Both kv load and save take a lock before executing the op.
Which lock? An in-process Python one? That is irrelevant when the admin CLI and the vmdk_ops service race. I am not aware of any interprocess lock for the KV. Did I miss something?
Correct, it's not a file lock: the kvESX.py load() and save() use a lock decorator, so those functions are synchronized within the ESX service and the admin CLI individually.
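A lock decorator of that kind is roughly the following sketch (not the actual kvESX.py code); the point is that a threading.Lock only serializes callers within a single process, so the ESX service and the admin CLI each hold their own, independent lock:

```python
import threading
from functools import wraps

_kv_lock = threading.Lock()  # one lock per process, not per file


def locked(func):
    """Serialize KV load/save calls within this process only."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        with _kv_lock:
            return func(*args, **kwargs)
    return wrapper


@locked
def save(volpath, kv_dict):
    pass  # write the side-car meta file


@locked
def load(volpath):
    pass  # read the side-car meta file
```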
The test case itself has been fixed, but we still need more discussion about this issue.
@shaominchen, @tusharnt - what do we want to do for this issue? There is a race between an admin CLI user and the service user (via docker). One way is to have the admin CLI stop using the service code and instead post requests to the service and get the responses. There are no file locks used here, so different processes can't safely access the meta-data in parallel.
This sounds like a reasonable solution to me. Do we have an estimate for this approach? Depending on the schedule and the estimate of the fix, we can decide whether we want to fix this for GA or not.
This approach will need a new listener in the ESX service (we can't reuse the VMCI listener in the service) to handle requests from the admin CLI. I'd prefer not to do this for GA, as the use case is fairly remote - a race condition between the admin CLI and docker. In fact, I'd prefer not to address this issue unless a customer hits it, in which case something like #1079 may be usable - a more generic API-based service that can be accessed not only locally but across hosts.
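For a sense of scale, a minimal request/response channel between the admin CLI and the service could look like this sketch (a hypothetical Unix-domain-socket listener with a made-up socket path; it is neither the existing VMCI listener nor #1079):

```python
import json
import os
import socket

SOCK_PATH = "/var/run/vmdk_ops_admin.sock"  # hypothetical socket path


def serve_admin_requests(handle_request):
    """Accept admin CLI requests on a Unix socket and return JSON replies."""
    if os.path.exists(SOCK_PATH):
        os.remove(SOCK_PATH)
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(SOCK_PATH)
    server.listen(1)
    while True:
        conn, _ = server.accept()
        try:
            request = json.loads(conn.recv(65536).decode())
            # handle_request runs inside the service process, so it is
            # serialized with docker-driven ops by the same in-process locks.
            reply = handle_request(request)
            conn.sendall(json.dumps(reply).encode())
        finally:
            conn.close()
```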
Closing this issue. The approach was to change the way the admin CLI shares the ESX service code, but I don't see that changing any more, and definitely not for addressing the scenario here.
While updating the volume access from read-write (default) to read-only through the automated volume access test, the update fails.
The access is updated using the admin CLI.
The exact logs for the failure are: