In Part 1, Part 2, and Part 3 I argued for Rust on Lambda and measured where the gains come from. This post draws the other edge of that fence: the shapes of work where Lambda is the wrong architectural choice, however lean the binary, and whichever language you wrote it in. The common thread across every case below is a single billing fact: Lambda charges you for wall-clock duration, including the time your handler spends waiting on IO.

Long-running synchronous work

The 29-second trap is the most common way this goes wrong. API Gateway HTTP APIs cap the integration timeout at 30 seconds and will not let you raise it. REST APIs historically capped it at 29 seconds. In June 2024 AWS made that cap raisable, but only for REST APIs (Regional and Private); HTTP APIs are excluded and still hit the 30-second hard cap. Even on REST, the extra timeout comes out of your account-level throttle quota: it is a tradeoff, not a free upgrade. And whatever you do at the gateway, Lambda itself caps execution at 15 minutes, a hard ceiling that cannot be raised. Between the gateway's limit and Lambda's, a handler can easily outlive the caller that is waiting on it.

The failure mode this creates is the dangerous one. The client sends a request. API Gateway forwards it. Lambda begins the work, writes to the database, calls the downstream service. At 29 seconds API Gateway gives up and returns a 504 Gateway Timeout to the client, the HTTP status a proxy sends when its upstream did not respond in time. Lambda keeps going: the write commits, the downstream call succeeds, and Lambda returns a perfectly good response that API Gateway has already stopped listening to. The client sees a failed request and retries, and the same insert lands again.

This is not really a Lambda problem; it is a distributed-systems one. Any caller that can retry will eventually retry against a handler that already succeeded, and without idempotency every retry is a duplicate: two charges, two orders, two rows. Idempotency means designing the handler so that two invocations with the same input have the same effect as one. It is easy to treat as optional until the first duplicate charge shows up in production. I plan to cover concrete idempotency patterns in a later post; for now it is enough to flag that any synchronous Lambda behind API Gateway is a retry-prone shape that needs this baked in.
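Ahead of that fuller treatment, here is a minimal sketch of the idea, not a production pattern. It assumes the client sends an idempotency key with each request; `OrderStore` and `place_order` are hypothetical names, and the in-memory `HashMap` stands in for a durable store where the check and the write would have to happen atomically (a conditional write, for example).

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a durable store keyed by idempotency key.
/// In-memory only; a real store must make check-and-write atomic.
#[derive(Default)]
struct OrderStore {
    /// Idempotency key -> order id produced by the first invocation.
    completed: HashMap<String, u64>,
    /// Counts real side effects, so duplicates would be visible.
    inserts: u32,
    next_order_id: u64,
}

impl OrderStore {
    /// Performs the insert at most once per idempotency key.
    /// A retry with the same key gets the original result back.
    fn place_order(&mut self, idempotency_key: &str) -> u64 {
        if let Some(&order_id) = self.completed.get(idempotency_key) {
            return order_id; // retry: no second insert
        }
        self.next_order_id += 1;
        self.inserts += 1; // the one real write for this key
        let order_id = self.next_order_id;
        self.completed.insert(idempotency_key.to_string(), order_id);
        order_id
    }
}
```

With this shape, the retry that follows a 504 returns the order created by the first invocation instead of inserting a second row.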

Figure: API Gateway returns 504 while Lambda keeps running and inserts into the database.

Unreliable IO and retry storms

The instinct when a downstream call is flaky (a third-party API, an internal microservice over HTTP, a database query, an S3 read, a DynamoDB write, anything your handler waits on) is to wrap it in a retry with exponential backoff. On a long-lived server, that costs almost nothing: the process is already running, the thread is cheap, the only price is wall-clock latency on the request that happened to draw the short straw. On Lambda the same code has a very different bill: you pay for every millisecond the handler is alive, including the time it spends doing nothing but waiting for the next retry.

Every millisecond your handler spends sleeping between retries is billed at the function’s full per-ms rate, and it holds a concurrency slot the whole time. A handler that retries three times with 1s, 2s, 4s of backoff has paid for seven seconds of sleep, on top of the attempts themselves, to do work that would have taken maybe 50 ms on a healthy day. Multiply that across a traffic spike where the downstream is degraded for everyone, and you get a retry storm: every concurrent invocation is asleep waiting on the same flaky dependency, the concurrency pool fills up, and new requests start getting throttled while the in-flight ones quietly run the meter.
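The arithmetic is simple enough to sketch. The two helpers below are hypothetical names of my own, and the memory size is whatever your function is configured with; multiply the GB-seconds by your region's per-GB-second price to get a dollar figure.

```rust
/// Total milliseconds spent sleeping across `retries` retries with
/// exponential backoff that starts at `base_ms` and doubles each time.
fn backoff_sleep_ms(base_ms: u64, retries: u32) -> u64 {
    (0..retries).map(|i| base_ms << i).sum()
}

/// Billed GB-seconds for `duration_ms` of wall-clock time
/// at a configured memory size of `memory_mb`.
fn billed_gb_seconds(duration_ms: u64, memory_mb: u64) -> f64 {
    (duration_ms as f64 / 1000.0) * (memory_mb as f64 / 1024.0)
}
```

For the 1s, 2s, 4s schedule, `backoff_sleep_ms(1000, 3)` gives 7,000 ms. Against a 50 ms healthy call, the sleep alone is 140 times the billed duration, before counting the failed attempts.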

The cheaper shape is usually to fail fast: make a single attempt with a tight timeout and, on failure, return a clear error to the caller and let them decide whether to retry. The caller is not billed by the millisecond, and a client-side retry with backoff costs you nothing while the dependency is down. You only pay for the one failed attempt, not for sitting in a sleep loop.
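A minimal sketch of that single bounded attempt, using only the standard library so it stands alone. `call_once_with_timeout` and `CallError` are hypothetical names, and `call` stands in for whatever blocking downstream call your handler makes.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

#[derive(Debug, PartialEq)]
enum CallError {
    /// The downstream call did not answer within the limit.
    TimedOut,
    /// The worker thread ended without producing a result.
    WorkerDied,
}

/// Makes exactly one attempt and gives up after `limit`. No retries.
fn call_once_with_timeout<T, F>(call: F, limit: Duration) -> Result<T, CallError>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The send fails harmlessly if the receiver already gave up.
        let _ = tx.send(call());
    });
    rx.recv_timeout(limit).map_err(|e| match e {
        mpsc::RecvTimeoutError::Timeout => CallError::TimedOut,
        mpsc::RecvTimeoutError::Disconnected => CallError::WorkerDied,
    })
}
```

Note that the spawned thread is not cancelled when the timeout fires; it only stops being waited on. In an async handler you would more likely wrap the future in something like `tokio::time::timeout`, which drops the pending call instead. Either way the handler maps `TimedOut` to an error response and returns immediately.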

What to reach for instead

When Lambda is the wrong shape, the replacement usually depends on what kind of work is misfitting. A short map of the common ones:

  • Long-running jobs. Anything that legitimately takes minutes or longer belongs on Step Functions, AWS Batch, or a container task on ECS/Fargate. Step Functions in particular handles the orchestration, retries, and state for you, and each step can still be a Lambda if it is short.
  • Stream processing. For continuous high-throughput event streams, Kinesis Data Streams, MSK (managed Kafka), or DynamoDB Streams are the right primitives. Lambda can still be the consumer, but the stream service owns ordering, partitioning, and durability.
  • Fan-out async work. When one event needs to trigger many independent units of work, SQS or SNS in front of worker Lambdas is the idiomatic shape, provided each worker is bounded and short. For workers that genuinely need to be long-running, ECS workers consuming from the same queue are the better fit.