Remap transform skips parsing for nginx "upstream info" logs (only "error" logs processed) #23092
Hi,

These logs are not dropped: they appear in OpenSearch. But I don't see the .nginx field or any custom fields like .resources.new_third_field, which I add at the top of the script. This suggests the whole transform block is skipped or fails silently for these messages.

Here's the relevant part of my remap script. I use parse_regex! because I had the same issue earlier when trying parse_nginx_log!(..., format: "upstream_info"): logs were never parsed either.

transforms:
  restructure_logs:
    type: remap
    inputs:
      - otlp.logs
    source: |-
      .resources.new_third_field = "new_third_value"
      msg = string!(.message)
      ns = get(.resources, ["k8s","namespace","name"]) ?? ""
      if ns == "ingress-nginx" || contains(msg, "HTTP/") {
        err = parse_regex!(
          msg,
          r'^(?P<timestamp>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})\s+\[(?P<severity>\w+)\]\s+(?P<pid>\d+)#(?P<tid>\d+):\s+\*(?P<cid>\d+)\s+(?P<errmsg>.*?)(?:,\s+client:\s+(?P<client>\S+))?(?:,\s+server:\s+(?P<server>[^,]+))?(?:,\s+request:\s+"(?P<request>[^"]+)")?(?:,\s+upstream:\s+"(?P<upstream>[^"]+)")?(?:,\s+host:\s+"(?P<host>[^"]+)")?$'
        )
        if length(keys(err)) > 0 {
          .nginx = err
          if exists(err.severity) { .severity_text = downcase!(string!(err.severity)) }
        } else {
          ing = parse_regex!(
            msg,
            r'^(?P<remote_addr>\S+)\s+-\s+-\s+\[(?P<timestamp>[^\]]+)]\s+"(?P<request>[^"]+)"\s+(?P<status>\d{3})\s+(?P<body_bytes_size>\d+)\s+"[^"]*"\s+"(?P<user_agent>[^"]+)"\s+(?P<request_length>\d+)\s+(?P<request_time>[\d\.]+)\s+\[(?P<upstream_name>[^\]]*)]\s+\[[^\]]*]\s+(?P<upstream_addr>\S+(?:,\s*\S+)*)\s*(?P<rest>.*)$'
          )
        }
      }
      if is_json(msg, variant:"object") {
        parsed = parse_json!(msg)
        merge!(., parsed)
      }
      if exists(.level) && !exists(.severity_text) {
        .severity_text = downcase!(string!(.level))
      }
      if exists(.nginx.status) && .nginx.status != "" {
        v = to_int(.nginx.status) ?? null
        if v != null { .nginx.status = v }
      }
      if exists(.nginx.upstream_status) {
        .nginx.upstream_status = string!(.nginx.upstream_status)
      }

Both log types (error and upstream info) come from the same container (ingress-nginx-controller), only difference is stdout vs stderr. In internal logs, I often see:
Below are examples of an ingress-nginx "error" log that I see in OpenSearch and an upstreaminfo log, with some field values redacted on purpose.

ingress-nginx "error" log:
{
"_index": "logs-2025.05.22",
"_id": "",
"_version": 1,
"_score": null,
"_source": {
"attributes": {
"log.file.path": "/var/log/pods/ingress-nginx_ingress-nginx-controller/controller/0.log",
"log.iostream": "stderr",
"logtag": "F"
},
"dropped_attributes_count": 0,
"message": "2025/05/22 07:28:29 [error] 186#186: ...
"nginx": {
"cid": "4021063",
"client": "10.204.2.233",
"errmsg": "upstream timed out (110: Operation timed out) while connecting to upstream",
"host": "",
"pid": "186",
"request": "GET ...",
"server": "",
"severity": "error",
"tid": "186",
"timestamp": "2025/05/22 07:28:29",
"upstream": ""
},
"observed_timestamp": "2025-05-22T07:28:29.801672351Z",
"resources": {
"host.name": "otel-agent-agent-wg8t5",
"k8s.container.name": "controller",
"k8s.container.restart_count": "0",
"k8s.namespace.name": "ingress-nginx",
"k8s.pod.name": "ingress-nginx-controller",
"k8s.pod.uid": "",
"new_third_field": "new_third_value",
"os.type": ""
},
"severity_text": "error",
"source_type": "opentelemetry",
"timestamp": "2025-05-22T07:28:29.703365917Z"
},
"fields": {
"observed_timestamp": [
"2025-05-22T07:28:29.801Z"
],
"nginx.timestamp": [
"2025-05-22T07:28:29.000Z"
],
"timestamp": [
"2025-05-22T07:28:29.703Z"
]
},
"highlight": {
"resources.k8s.namespace.name": [
""
]
},
"sort": [
]
}

ingress-nginx upstream info log:
{
"_index": "logs-2025.05.22",
"_id": "C6Xk9pYBDrkFTkXWjXGg",
"_version": 1,
"_score": null,
"_source": {
"attributes": {
"log.file.path": "/var/log/pods/ingress-nginx_ingress-nginx-controller/controller/0.log",
"log.iostream": "stdout",
"logtag": "F"
},
"dropped_attributes_count": 0,
"message": "10.204.2.233 - - [22/May/2025:07:28:14 +0000] \"GET ...,
"observed_timestamp": "2025-05-22T07:28:14.817477922Z",
"resources": {
"host.name": "otel-agent-agent-8vlfz",
"k8s.container.name": "controller",
"k8s.container.restart_count": "0",
"k8s.namespace.name": "ingress-nginx",
"k8s.pod.name": "ingress-nginx-controller",
"k8s.pod.uid": "",
"os.type": "linux"
},
"source_type": "opentelemetry",
"timestamp": "2025-05-22T07:28:14.698255068Z"
},
"fields": {
"observed_timestamp": [
"2025-05-22T07:28:14.817Z"
],
"timestamp": [
"2025-05-22T07:28:14.698Z"
]
},
"highlight": {
"resources.k8s.namespace.name": [
""
]
},
"sort": [
]
}

Any idea why the transform seems to skip certain logs even though they reach Vector and are not dropped?

Thanks!
Replies: 1 comment
Hi,

The logs are not being dropped because you have not configured the transform to drop events on failure, so when the script fails to parse a log, Vector passes that log along as is. The failure is in one of the two parse_regex calls.

To drop these logs, update your transform as shown below. Failed events will then no longer be passed on and will instead go out through a different 'path'. If you add this and tap the output of that route, Vector will add metadata about why each log was dropped, which will help you fix the issue.

transforms:
  restructure_logs:
    type: remap
    drop_on_error: true
    reroute_dropped: true
    inputs:
      - otlp.logs
    source: |
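
As a complement to dropping failed events, here is a minimal sketch of how the two parse attempts could be made non-aborting, so that a line that does not match the first pattern still gets a chance at the second. It relies on the fact that a `!` call such as parse_regex! aborts the whole VRL program when the regex does not match, whereas the plain fallible form lets you capture the error in a second variable. The two regexes are deliberately shortened for illustration and are not the full patterns from the question, and the sink name debug_dropped is hypothetical. With reroute_dropped enabled, the failed events should be available on the restructure_logs.dropped output.

```yaml
transforms:
  restructure_logs:
    type: remap
    inputs:
      - otlp.logs
    drop_on_error: true
    reroute_dropped: true
    source: |-
      msg = string!(.message)

      # Fallible call without "!": a non-match is captured in `e`
      # instead of aborting the program.
      # Shortened, illustrative pattern for the nginx error log format.
      parsed, e = parse_regex(msg, r'^(?P<timestamp>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})\s+\[(?P<severity>\w+)\]\s+(?P<rest>.*)$')
      if e == null {
        .nginx = parsed
      } else {
        # The first pattern did not match, so try the access-log shape.
        # Shortened, illustrative pattern for the upstream info format.
        parsed, e = parse_regex(msg, r'^(?P<remote_addr>\S+)\s+-\s+-\s+\[(?P<timestamp>[^\]]+)\]\s+"(?P<request>[^"]+)"\s+(?P<status>\d{3})\s+(?P<rest>.*)$')
        if e == null {
          .nginx = parsed
        }
      }

sinks:
  # Hypothetical debugging sink: prints events that the transform dropped.
  debug_dropped:
    type: console
    inputs:
      - restructure_logs.dropped
    encoding:
      codec: json
```

Events that neither pattern matches simply pass through without a .nginx field in this sketch. Events that hit a real runtime error, for example a non-string .message, go to the dropped output, where the metadata Vector attaches should indicate the reason.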