Graceful shutdown #170

kao73 · 2021-02-09T09:41:24Z

How should I configure the CloudWatch Agent to make sure all collected data is pushed out when I stop the agent?
I tried many options, but wasn't able to make it work as expected.

We use metrics to collect statistics in our application, so we expect them to be precise.
We have several ASG clusters with the CloudWatch Agent and its StatsD plugin installed on each EC2 instance. Several modules send metrics to the agent. We found that some metrics are lost when an ASG scales down. During investigation and some manual tests, we found that the agent doesn't publish the collected data even when the amazon-cloudwatch-agent service is stopped gracefully.

Could this just be an incorrect configuration, or is it working as designed (WAD)?

My configuration:

{
  "agent": {
    "metrics_collection_interval": 60,
    "omit_hostname": true,
    "logfile": "/var/log/amazon-cloudwatch-agent.log"
  },
  "metrics": {
    "namespace": "MyNameSpace",
    "append_dimensions": {
      "AutoScalingGroupName": "${aws:AutoScalingGroupName}"
    },
    "metrics_collected": {
      "statsd": {
        "metrics_collection_interval": 60,
        "metrics_aggregation_interval": 0
      },
      "mem": {
        "metrics_collection_interval": 60,
        "metrics_aggregation_interval": 0,
        "measurement": [
          "used_percent"
        ]
      }
    }
  }
}

For the manual test we did the following:

start the agent service
send 1000 data points for a single metric with netcat
wait about 20 seconds
stop the agent service
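
The manual test above can also be approximated without netcat. This is a minimal sketch, assuming the agent's StatsD listener is reachable over UDP on port 8125 of the local host (the configuration above does not set a listener address, so this is an assumption) and using a hypothetical metric name `test.requests`. The helper names are illustrative and not part of any agent API.

```python
import socket


def statsd_counter(name: str, value: int = 1) -> bytes:
    """Format one StatsD counter data point, e.g. b"test.requests:1|c"."""
    return f"{name}:{value}|c".encode("ascii")


def send_counters(host: str, port: int, name: str, count: int) -> int:
    """Send `count` counter increments over UDP and return how many were sent.

    UDP gives no delivery guarantee, so the return value only confirms what
    was handed to the local network stack, not what the agent received.
    """
    sent = 0
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for _ in range(count):
            sock.sendto(statsd_counter(name), (host, port))
            sent += 1
    return sent
```

Calling `send_counters("127.0.0.1", 8125, "test.requests", 1000)`, waiting about 20 seconds, and then stopping the service mirrors the steps listed. If the final flush worked and no datagrams were dropped, the metric's Sum in CloudWatch should come to 1000.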

Thanks.


pingleig · 2021-02-09T15:27:59Z

I think if you send a SIGTERM to the agent, it should flush the data in the buffer. Btw, we are not following Telegraf closely, which supports flushing without shutdown using SIGUSR1: influxdata/telegraf#7366
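
For readers unfamiliar with the pattern being discussed, here is a generic sketch of "flush the buffer on SIGTERM". It is not the CloudWatch Agent's code; `BufferedPublisher` and its methods are hypothetical names used only for illustration. The signal handler merely records the stop request, and the main loop performs one last flush so that a partially filled interval is not dropped.

```python
import signal
import threading
import time


class BufferedPublisher:
    """Toy stand-in for a process that buffers data points between flushes."""

    def __init__(self, publish):
        self._publish = publish  # callable that receives a list of points
        self._buffer = []
        self._lock = threading.Lock()
        self.stopping = False

    def add(self, point):
        with self._lock:
            self._buffer.append(point)

    def flush(self):
        """Publish everything buffered so far; return the number of points."""
        with self._lock:
            pending, self._buffer = self._buffer, []
        if pending:
            self._publish(pending)
        return len(pending)

    def handle_sigterm(self, signum, frame):
        # Keep the handler trivial: no locks, no I/O, just record the request.
        self.stopping = True

    def install(self):
        signal.signal(signal.SIGTERM, self.handle_sigterm)

    def run(self, interval_seconds=60.0, poll_seconds=0.2):
        """Flush every interval until asked to stop, then flush once more."""
        next_flush = time.monotonic() + interval_seconds
        while not self.stopping:
            time.sleep(poll_seconds)
            if time.monotonic() >= next_flush:
                self.flush()
                next_flush += interval_seconds
        self.flush()  # final flush covers the last partial interval
```

The symptom reported in this issue corresponds to the case where that final `flush()` is skipped, or where the process exits before the flush completes.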

Shigerello · 2021-04-15T01:46:44Z

Fluentd supports a variety of flushing and shutting-down options using signals as well.

https://docs.fluentd.org/deployment/signals

SIGUSR1
Forces the buffered messages to be flushed and reopens Fluentd's log. Fluentd will try to flush the current buffer (both memory and file) immediately, and keep flushing at flush_interval.

mihaileu · 2023-07-21T07:55:57Z

I have the same issue:

[agent] Hang on, flushing any cached metrics before shutdown

doesn't flush the last aggregated stats. Is there a workaround or a fix for this?

jhnlsn added the enhancement New feature or request label Oct 1, 2021

brandondahler mentioned this issue Feb 1, 2022

Make cloudwatchlogs's pusher wait for the final flush to complete before returning #350

Merged

dipeshgautamsol mentioned this issue Nov 15, 2023

amazon-cloudwatch-agent doesnt respect SIGTERM to flush metrics #961

Open
