AWS Lambda forwards everything you write to stdout into CloudWatch for free, and it keeps doing so even after your handler has returned. The moment you want richer signals than text logs, or you want to send them anywhere other than CloudWatch, that free path stops being enough: you have to push the data over the network yourself. On a function that freezes between invocations, every way of doing that trades off against one of two things. You either pay Lambda’s per-millisecond billing for the time spent waiting on a remote endpoint, or you risk dropping the telemetry entirely when the environment is suspended out from under you. This post walks the approaches from simplest to most reliable and names where each one bills you or loses signals. The problem is not specific to Lambda; it applies to any serverless environment that scales to zero.
The free path: stdout to CloudWatch
Lambda’s default logging mode is stdout. Anything you print with console.log, println!, or your language’s equivalent is picked up by the execution environment and shown in CloudWatch as a new log event. This works without any setup and without you knowing how. The environment captures those stdout lines and forwards them to CloudWatch out of band, even if the handler has already finished its execution phase. It is reliable and effectively free of latency on your execution path, because the forwarding is not your code’s problem.
The ceiling is that it only does plain logs, and only to CloudWatch. As soon as you want metrics or traces, or you want the data in Grafana, Datadog, or anything that is not CloudWatch, this mechanism does not carry you any further.
When you need more than logs
Metrics and traces cannot ride stdout the way logs do. They need an exporter: your code calls into a telemetry provider, the provider exports the signals to a collector, and the collector eventually forwards them to whatever backend ingests them. That is a network call originating from inside your handler, which is the root of everything below.
AWS has native options here. X-Ray does distributed traces and the Embedded Metric Format does metrics. Neither is a cross-vendor standard, and building on them is an easy way to end up locked into one observability stack. The vendor-agnostic standard today is OpenTelemetry, with OTLP as the wire protocol, implemented across most backends and languages. Picking OTLP keeps the backend swappable; it does not change the fact that you still have to get the bytes out of a function that may be frozen a millisecond after it responds.
Flushing in the handler
The simplest approach is to flush before returning. You produce your signals during the request and, just before sending the response, you call the telemetry framework to flush its buffer to the remote endpoint. James Eastham demonstrates this pattern for serverless Rust by flushing at the end of each invocation.
It is correct, and the signals reliably arrive, but you pay for it directly. Lambda bills wall-clock duration, including time the handler spends waiting on IO, so the flush is billed compute spent doing nothing but waiting for a network round trip. If the OTLP endpoint is slow, degraded, or unreachable, your execution time absorbs all of it, and retries make the cost worse. You might spend 10 ms processing the request and then 500 ms waiting for the endpoint to confirm the signals were accepted, and you are billed for all 510.
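The arithmetic behind that example can be sketched as follows. This is a rough estimate only: the per-GB-second price is an assumption (check current pricing for your region and architecture), the 512 MB memory size is arbitrary, and `invocation_cost` is a hypothetical helper that ignores the per-request charge.

```rust
/// Assumed duration price in USD per GB-second; verify against current pricing.
const PRICE_PER_GB_SECOND: f64 = 0.0000166667;

/// Duration cost in USD of one invocation for a memory size and billed duration.
/// Hypothetical helper: it leaves out the flat per-request charge.
fn invocation_cost(memory_mb: u32, billed_ms: u64) -> f64 {
    let gb = f64::from(memory_mb) / 1024.0;
    let seconds = billed_ms as f64 / 1000.0;
    gb * seconds * PRICE_PER_GB_SECOND
}

fn main() {
    let memory_mb = 512; // arbitrary example size
    let work_only = invocation_cost(memory_mb, 10);
    let work_plus_flush = invocation_cost(memory_mb, 10 + 500);

    println!("work only:       ${:.10} per invocation", work_only);
    println!("work plus flush: ${:.10} per invocation", work_plus_flush);
    println!(
        "the flush multiplies duration cost by about {:.0}x",
        work_plus_flush / work_only
    );
}
```

Whatever the real price is, the ratio is what matters: the billed duration is dominated by the wait, not by the work.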
Why a detached background task does not work
The obvious next idea is to move the flush off the response path with a background task that is not tied to the handler, for example a tokio::spawn that the handler does not await. This should be avoided. The Lambda execution environment suspends your process after the handler returns the result and does not resume it until the next invocation, to save resources. That is a fundamental property of the platform, not an edge case.
So a detached task is racing the freeze. If the environment is suspended before the task finishes, the task can stall mid-transmission with a connection that never completes, hit a timeout it cannot observe, or have the environment shut down before any retry runs. The failure mode is silent: the handler returned 200, the caller is happy, and the telemetry for that invocation simply never arrived.
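The race can be illustrated with a small std-only model. It is an analogy rather than a Lambda reproduction: a plain thread stands in for the un-awaited `tokio::spawn`, the 200 ms sleep is a made-up export round trip, and the check right after `handler` returns stands in for the moment the environment may be frozen.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Stand-in for a handler that hands its flush to a detached thread.
/// The sleep is a hypothetical network round trip to the telemetry endpoint.
fn handler(flushed: Arc<AtomicBool>) -> u16 {
    thread::spawn(move || {
        thread::sleep(Duration::from_millis(200)); // pretend export call
        flushed.store(true, Ordering::SeqCst);
    });
    200 // respond immediately; nothing waits for the thread
}

fn main() {
    let flushed = Arc::new(AtomicBool::new(false));
    let status = handler(Arc::clone(&flushed));

    // This is the point at which the environment may be suspended.
    println!(
        "status = {}, telemetry flushed at return = {}",
        status,
        flushed.load(Ordering::SeqCst)
    );
}
```

The handler has already reported success while the export is still pending, which is exactly the window in which a freeze loses the signals.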
The sidecar pattern and Lambda extensions
A common way around this is to make exporting someone else’s concern. In ECS this is the sidecar pattern: one container runs the service and a second runs a forwarder, often an OpenTelemetry Collector, and the service ships signals to it over loopback while the forwarder owns the remote call. Lambda has no second container, but it has extensions, shipped as Lambda layers and driven by the Extensions API.
As Melvin Philips describes in his InfoQ article on deferred flushing, the response to the client does not wait for the flush. The extension registers, then blocks on /extension/event/next; Lambda will not freeze the environment until the extension signals readiness by calling that endpoint again. By deferring that signal until after a flush cycle completes, instead of immediately, the extension gets to buffer and flush when convenient, and the freeze-before-flush failure goes away. The catch is billing: the client gets its response early, but the function’s billed duration still includes the extension’s post-processing, which can run into the hundreds of milliseconds depending on the latency, load, and reliability of the OTLP endpoint.
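The ordering that makes this work can be sketched without any HTTP code. Everything named here (`Runtime`, `Event`, `run_extension`, `Scripted`) is hypothetical scaffolding: a real extension would register and poll the Extensions API over HTTP, and it would need its own way to learn that the handler has finished, which this sketch leaves out. The only point shown is that the flush happens before the next-event call.

```rust
use std::collections::VecDeque;

/// Events the extension can receive from the next-event call.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Event {
    Invoke,
    Shutdown,
}

/// Hypothetical seam over the Extensions API and the exporter.
trait Runtime {
    /// Blocks for the next event. Calling it also signals that the extension
    /// is done with the previous one, so the environment may be frozen.
    fn next_event(&mut self) -> Event;
    /// Exports whatever the function handed over during the invocation.
    fn drain_and_export(&mut self);
}

/// Deferred flushing: never ask for the next event until the flush is done.
fn run_extension<R: Runtime>(rt: &mut R) {
    let mut event = rt.next_event();
    loop {
        rt.drain_and_export();
        if event == Event::Shutdown {
            break; // final flush done, let the environment go
        }
        event = rt.next_event();
    }
}

/// Scripted stand-in that records the order of calls.
struct Scripted {
    events: VecDeque<Event>,
    calls: Vec<&'static str>,
}

impl Scripted {
    fn new(events: &[Event]) -> Self {
        Scripted { events: events.iter().copied().collect(), calls: Vec::new() }
    }
}

impl Runtime for Scripted {
    fn next_event(&mut self) -> Event {
        self.calls.push("next");
        self.events.pop_front().unwrap_or(Event::Shutdown)
    }
    fn drain_and_export(&mut self) {
        self.calls.push("flush");
    }
}

fn main() {
    let mut rt = Scripted::new(&[Event::Invoke, Event::Invoke, Event::Shutdown]);
    run_extension(&mut rt);
    println!("{:?}", rt.calls);
}
```

Every "next" after the first is preceded by a "flush", so the environment is never released with unsent signals. The time spent inside `drain_and_export` is the part that still shows up in billed duration.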
The most reliable path: stdout plus a log forwarder
The most reliable approach is to go back to stdout. Serialize the signals and write them to stdout as encoded entries. The handler does no network IO for telemetry and has no side effects to fail; it just prints. A separate log forwarder, wired up as a subscription filter feeding a forwarder Lambda or Firehose into an OTLP collector, subscribes to the CloudWatch log group, parses the encoded lines, and ships them to the backend asynchronously and entirely off your execution path.
- Minimal handler, near-zero overhead. The function code stays as simple as plain logging. stdout does not block the execution path and adds almost no telemetry overhead.
- Decoupled availability. If OTLP ingestion goes down, the signals sit in CloudWatch until it recovers, and the forwarder catches up later. Your function’s availability is unaffected by the backend’s.
- CloudWatch still bills ingestion. You pay CloudWatch to ingest these logs, so keep telemetry to the minimum you actually need to diagnose.
- More infrastructure. You now own a separate stack that subscribes to CloudWatch, parses the lines, and forwards them.
- Polluted text logs. Your human-readable logs are now interleaved with encoded telemetry entries.
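A minimal sketch of both ends of this approach, under stated assumptions: the `TELEMETRY ` marker, the field names, and the helpers `encode_metric` and `telemetry_payload` are all hypothetical, and the JSON is hand-formatted for brevity rather than being OTLP's JSON encoding. A marker prefix is one way to let the forwarder separate encoded entries from human-readable lines.

```rust
/// Hypothetical marker that separates telemetry from ordinary log lines.
const MARKER: &str = "TELEMETRY ";

/// Handler side: encodes one metric data point as a single marked JSON line.
/// Assumes `name` and `unit` need no JSON escaping and `value` is finite;
/// a real implementation would use a JSON serializer.
fn encode_metric(name: &str, value: f64, unit: &str, unix_ms: u64) -> String {
    format!(
        r#"{}{{"name":"{}","value":{},"unit":"{}","ts":{}}}"#,
        MARKER, name, value, unit, unix_ms
    )
}

/// Forwarder side: returns the JSON payload for telemetry lines and
/// `None` for ordinary logs, which can be left alone.
fn telemetry_payload(line: &str) -> Option<&str> {
    line.strip_prefix(MARKER)
}

fn main() {
    // The handler only prints; it performs no network IO for telemetry.
    println!("order 42 processed");
    println!(
        "{}",
        encode_metric("handler.duration", 12.5, "ms", 1_700_000_000_000)
    );

    // The forwarder sees both lines in the log group and keeps only one.
    let lines = [
        "order 42 processed".to_string(),
        encode_metric("handler.duration", 12.5, "ms", 1_700_000_000_000),
    ];
    for line in &lines {
        if let Some(payload) = telemetry_payload(line) {
            println!("forward to collector: {}", payload);
        }
    }
}
```

The same marker could also drive the subscription filter pattern, so that only telemetry lines reach the forwarder at all.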
Closing
Enriching a function that scales to zero with metrics and traces is not as straightforward as adding a logging library, particularly when you are performance or cost constrained. Each approach moves the cost somewhere: into billed duration, into dropped signals, or into infrastructure you have to operate. None of them is free. The right choice depends on which of simplicity, cost, performance, and reliability is the binding constraint for the service in front of you.
Links
- AWS X-Ray developer guide: AWS-native distributed tracing
- CloudWatch Embedded Metric Format: AWS-native metrics via structured logs
- OpenTelemetry Protocol (OTLP): the vendor-agnostic telemetry wire protocol
- AWS Lambda extensions: running additional processes alongside a function
- Lambda Extensions API: the lifecycle and event API extensions use
- Lambda Extensions for Deferred Telemetry Flushing: Melvin Philips, InfoQ, on deferring the flush off the response path
- OpenTelemetry For Your Serverless Rust Applications: James Eastham, video walkthrough
- James Eastham’s blog: distributed systems, serverless, and AWS
- CloudWatch Logs subscription filters: streaming log events to a forwarder
- AWS Distro for OpenTelemetry (Lambda): the ADOT Lambda layer