In some cases, a drone is started but never reaches the AvailableState. However, during clean-up, it still reaches the DownState. This case is currently not handled correctly in the auditor plugin, see tardis/tardis/plugins/auditor.py, lines 78 to 88 in 2bf9dcb.
A new record is created when the drone reaches AvailableState. The record is then updated with the stop time once the drone reaches DownState. If the drone never reached AvailableState, there is no record to update. In this case, AUDITOR returns an HTTP error `400 BAD REQUEST` (The server cannot or will not process the request due to something that is perceived to be a client error).
Jul 13 12:20:02 monopol.bfg.privat docker-COBalD-Tardis-atlhei[3808458]: cobald.runtime.tardis.plugins.auditor: 2023-07-13 10:20:02 Drone: {'site_name': 'NEMO', 'machine_type': 'tardis_c40m100', 'obs_machine_meta_data_translation_mapping': {'Cores': 1, 'Memory': 1000, 'Disk': 1000}, 'remote_resource_uuid': 16996522, 'created': datetime.datetime(2023, 7, 12, 20, 4, 53, 130170), 'updated': datetime.datetime(2023, 7, 13, 10, 20, 2, 340694), 'drone_uuid': 'nemo-f25919f1d0', 'resource_status': <ResourceStatus.Deleted: 4>} has changed state to DownState
Jul 13 12:20:02 monopol.bfg.privat docker-COBalD-Tardis-atlhei[3808458]: cobald.runtime.runner.asyncio: 2023-07-13 10:20:02 runner aborted: <cobald.daemon.runners.asyncio_runner.AsyncioRunner object at 0x7f17fab74040>
Jul 13 12:20:02 monopol.bfg.privat docker-COBalD-Tardis-atlhei[3808458]: Traceback (most recent call last):
Jul 13 12:20:02 monopol.bfg.privat docker-COBalD-Tardis-atlhei[3808458]: File "/usr/local/lib/python3.8/site-packages/cobald/daemon/runners/base_runner.py", line 68, in run
Jul 13 12:20:02 monopol.bfg.privat docker-COBalD-Tardis-atlhei[3808458]: await self.manage_payloads()
Jul 13 12:20:02 monopol.bfg.privat docker-COBalD-Tardis-atlhei[3808458]: File "/usr/local/lib/python3.8/site-packages/cobald/daemon/runners/asyncio_runner.py", line 54, in manage_payloads
Jul 13 12:20:02 monopol.bfg.privat docker-COBalD-Tardis-atlhei[3808458]: await self._payload_failure
Jul 13 12:20:02 monopol.bfg.privat docker-COBalD-Tardis-atlhei[3808458]: File "/usr/local/lib/python3.8/site-packages/cobald/daemon/runners/asyncio_runner.py", line 40, in _monitor_payload
Jul 13 12:20:02 monopol.bfg.privat docker-COBalD-Tardis-atlhei[3808458]: result = await payload()
Jul 13 12:20:02 monopol.bfg.privat docker-COBalD-Tardis-atlhei[3808458]: File "/usr/local/lib/python3.8/site-packages/tardis/resources/drone.py", line 123, in run
Jul 13 12:20:02 monopol.bfg.privat docker-COBalD-Tardis-atlhei[3808458]: await current_state.run(self)
Jul 13 12:20:02 monopol.bfg.privat docker-COBalD-Tardis-atlhei[3808458]: File "/usr/local/lib/python3.8/site-packages/tardis/resources/dronestates.py", line 288, in run
Jul 13 12:20:02 monopol.bfg.privat docker-COBalD-Tardis-atlhei[3808458]: await drone.set_state(await cls.run_processing_pipeline(drone))
Jul 13 12:20:02 monopol.bfg.privat docker-COBalD-Tardis-atlhei[3808458]: File "/usr/local/lib/python3.8/site-packages/tardis/resources/drone.py", line 143, in set_state
Jul 13 12:20:02 monopol.bfg.privat docker-COBalD-Tardis-atlhei[3808458]: await self.notify_plugins()
Jul 13 12:20:02 monopol.bfg.privat docker-COBalD-Tardis-atlhei[3808458]: File "/usr/local/lib/python3.8/site-packages/tardis/resources/drone.py", line 153, in notify_plugins
Jul 13 12:20:02 monopol.bfg.privat docker-COBalD-Tardis-atlhei[3808458]: await plugin.notify(self.state, self.resource_attributes)
Jul 13 12:20:02 monopol.bfg.privat docker-COBalD-Tardis-atlhei[3808458]: File "/usr/local/lib/python3.8/site-packages/tardis/plugins/auditor.py", line 88, in notify
Jul 13 12:20:02 monopol.bfg.privat docker-COBalD-Tardis-atlhei[3808458]: await self._client.update(record)
Jul 13 12:20:02 monopol.bfg.privat docker-COBalD-Tardis-atlhei[3808458]: RuntimeError: Reqwest Error: HTTP status client error (400 Bad Request) for url (http://10.18.0.12:8001/update)
I think this is the correct behaviour from auditor, as updating a non-existing record does not make sense.
So we should handle this exception somehow or try to find another solution.
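One possible way to handle the exception is sketched below. This is only an illustration, not the plugin's actual code: `FakeAuditorClient`, `safe_update`, and the `record_id` parameter are hypothetical stand-ins, and the real client is called as `self._client.update(record)` in the traceback above. The sketch assumes the failure surfaces as a `RuntimeError` whose message contains `400 Bad Request`, as seen in the log.

```python
import asyncio
import logging

logger = logging.getLogger("auditor_sketch")


class FakeAuditorClient:
    """Hypothetical stand-in for the AUDITOR client (not the real API)."""

    def __init__(self):
        self.records = {}

    async def add(self, record_id, record):
        # Corresponds to creating the record when AvailableState is reached.
        self.records[record_id] = dict(record)

    async def update(self, record_id, record):
        # Mimics the observed behaviour: updating a missing record fails.
        if record_id not in self.records:
            raise RuntimeError(
                "Reqwest Error: HTTP status client error (400 Bad Request)"
            )
        self.records[record_id].update(record)


async def safe_update(client, record_id, record):
    """Update a record; return False instead of raising if it does not exist."""
    try:
        await client.update(record_id, record)
    except RuntimeError as err:
        # Assumption: only the 400 case is safe to swallow; re-raise the rest.
        if "400 Bad Request" not in str(err):
            raise
        logger.warning("No record for %s, skipping update: %s", record_id, err)
        return False
    return True


async def main():
    client = FakeAuditorClient()
    # Drone went to DownState without ever reaching AvailableState.
    skipped = await safe_update(client, "nemo-f25919f1d0", {"stop_time": "t1"})
    print("update applied:", skipped)


if __name__ == "__main__":
    asyncio.run(main())
```

Matching on the error message is fragile; if the client exposes a more specific exception type or status code, catching that would be preferable. Another option would be for the plugin to track which drones have had a record created and skip the update otherwise.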