Retry on socket timeouts #9
Context: I have 3 nodes running in different environments and occasionally the monitor crashes after a few days on different nodes.

Crash log:

2020-01-10T14:37:07Z INFO Refresh took 0:00:00.654659 seconds, sleeping for 30.0 seconds
2020-01-10T14:37:38Z INFO Refresh took 0:00:00.596821 seconds, sleeping for 30.0 seconds
2020-01-10T14:38:08Z INFO Refresh took 0:00:00.600586 seconds, sleeping for 30.0 seconds
Traceback (most recent call last):
  File "/home/bitcoin/jvstein/bitcoin-prometheus-exporter/bitcoind-monitor.py", line 334, in <module>
    main()
  File "/home/bitcoin/jvstein/bitcoin-prometheus-exporter/bitcoind-monitor.py", line 318, in main
    refresh_metrics()
  File "/home/bitcoin/jvstein/bitcoin-prometheus-exporter/bitcoind-monitor.py", line 216, in refresh_metrics
    blockchaininfo = bitcoinrpc("getblockchaininfo")
  File "/home/bitcoin/monitoring-bitcoind/lib/python3.7/site-packages/riprova/retry.py", line 129, in wrapper
    return retrier.run(fn, *args, **_kw)
  File "/home/bitcoin/monitoring-bitcoind/lib/python3.7/site-packages/riprova/retrier.py", line 291, in run
    self._handle_error(err)
  File "/home/bitcoin/monitoring-bitcoind/lib/python3.7/site-packages/riprova/retrier.py", line 232, in _handle_error
    raise err
  File "/home/bitcoin/monitoring-bitcoind/lib/python3.7/site-packages/riprova/retrier.py", line 288, in run
    return self._call(fn, *args, **kw)
  File "/home/bitcoin/monitoring-bitcoind/lib/python3.7/site-packages/riprova/retrier.py", line 162, in _call
    res = fn(*args, **kw)
  File "/home/bitcoin/jvstein/bitcoin-prometheus-exporter/bitcoind-monitor.py", line 180, in bitcoinrpc
    result = rpc_client().call(*args)
  File "/home/bitcoin/monitoring-bitcoind/lib/python3.7/site-packages/bitcoin/rpc.py", line 352, in call
    return self._call(service_name, *args)
  File "/home/bitcoin/monitoring-bitcoind/lib/python3.7/site-packages/bitcoin/rpc.py", line 236, in _call
    response = self._get_response()
  File "/home/bitcoin/monitoring-bitcoind/lib/python3.7/site-packages/bitcoin/rpc.py", line 261, in _get_response
    http_response = self.__conn.getresponse()
  File "/usr/lib/python3.7/http/client.py", line 1321, in getresponse
    response.begin()
  File "/usr/lib/python3.7/http/client.py", line 296, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python3.7/http/client.py", line 257, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "/usr/lib/python3.7/socket.py", line 589, in readinto
    return self._sock.recv_into(b)
socket.timeout: timed out
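The last frames of the traceback show the timeout surfacing from `getresponse()` while the client waits for the HTTP status line. Below is a minimal sketch of that failure mode using only the standard library. The listening socket that never answers is a stand-in for a stalled bitcoind, and the function name and the 0.2 second timeout are made up for illustration; none of this is the exporter's code.

```python
import http.client
import socket


def request_against_silent_server(timeout: float = 0.2) -> str:
    """Send an HTTP request to a local socket that never replies."""
    # A listening socket that never accepts or answers. The kernel still
    # completes the TCP handshake through the listen backlog, so the client
    # can connect and send its request.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    host, port = server.getsockname()

    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("POST", "/", body=b"{}")
        # Blocks reading the status line until the socket timeout expires.
        conn.getresponse()
        return "response received"
    except socket.timeout as err:
        # On Python 3.10+ socket.timeout is an alias of TimeoutError.
        return f"socket.timeout: {err}"
    finally:
        conn.close()
        server.close()


if __name__ == "__main__":
    print(request_against_silent_server())
```

Without something around the call that treats this exception as retryable, it propagates up to `main()` and ends the process, which matches the crash above.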
Btw, I tested this by sending SIGSTOP and then SIGCONT to the bitcoind process:

2020-01-10T16:55:13Z INFO Refresh took 0:00:01.985098 seconds, sleeping for 30.0 seconds
2020-01-10T16:58:21Z ERROR Retry after exception socket.timeout: timed out
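As the title suggests, the change amounts to treating `socket.timeout` as a retryable error instead of letting it escape. The sketch below shows the idea with a small standard-library decorator. It is a stand-in, not riprova's API and not the exporter's actual code: `retry_on`, `RETRY_EXCEPTIONS`, `flaky_rpc`, and the attempt and delay values are all hypothetical.

```python
import functools
import logging
import socket
import time

logger = logging.getLogger("retry-sketch")

# Hypothetical set of errors considered transient for this sketch.
RETRY_EXCEPTIONS = (ConnectionError, socket.timeout)


def retry_on(exceptions, attempts=5, base_delay=0.1, sleep=time.sleep):
    """Retry the wrapped call on the given exceptions with exponential backoff."""

    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(1, attempts + 1):
                try:
                    return fn(*args, **kwargs)
                except exceptions as err:
                    if attempt == attempts:
                        raise  # out of attempts: let the caller see the error
                    name = f"{type(err).__module__}.{type(err).__name__}"
                    logger.error("Retry after exception %s: %s", name, err)
                    # Delay doubles after each failed attempt.
                    sleep(base_delay * 2 ** (attempt - 1))

        return wrapper

    return decorator


calls = {"count": 0}


@retry_on(RETRY_EXCEPTIONS, attempts=4, sleep=lambda seconds: None)
def flaky_rpc():
    """Pretend RPC call that times out twice, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise socket.timeout("timed out")
    return {"blocks": 1}


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    print(flaky_rpc())
```

Exceptions outside `RETRY_EXCEPTIONS` are not caught, so genuine bugs still fail fast. The `sleep` parameter exists only so the example can run without real delays.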
Thanks! Merged.