Thanks for the deep dive. One approach I have seen systems take is to have the async logger write to a short-lived file on disk (rotated every 5 minutes or 1 hour) and let a daemon running on the same instance, independent of the application, take care of flushing those logs to the stream/queue system over the network. This way: (1) the company can have a centralized log-publishing system decoupled from the application's language etc.; (2) Kafka/Kinesis failures are handled with retries in the daemon without bothering the application; (3) application crashes won't affect logging (especially the info needed to debug the crash), since the daemon continues publishing as long as the instance is up; and (4) it allows batching to avoid too many network calls.
The logs are batched in memory and sent to Kafka. How is the partition key chosen here? And how will consumers know whether to persist a log in the hot, warm, or cold tier? Is it pushed to the hot tier first, with a separate daemon process transferring the logs to the warm and cold tiers? Also, Kafka's retention holds for 7 days by default (it can be extended), but increasing it also increases Kafka's storage footprint.
Those who do not know syslog are doomed to reinvent it.
Another key thing: if a log is not going to be emitted due to its log level, don't do the string operations to build a message that will never be sent.
Also, the most expensive floating-point operation is printf, so only log floating-point quantities when essential.
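A quick illustration of both points using Python's standard `logging` module (assuming Python here purely as an example; the same idea applies to any logging framework with level guards):

```python
import logging

logger = logging.getLogger("example")
logger.setLevel(logging.WARNING)  # DEBUG records will be dropped

values = [0.1, 0.2, 0.3]

# Wasteful: the f-string (including the float formatting) is built
# eagerly, even though the DEBUG record is discarded:
#   logger.debug(f"stats: mean={sum(values) / len(values):.6f}")

# Better: %-style formatting is deferred until the record is actually
# emitted (though the arguments themselves are still evaluated).
logger.debug("stats: mean=%.6f", sum(values) / len(values))

# Best for genuinely expensive work: guard it with an explicit level
# check so nothing is computed at all when the level is disabled.
if logger.isEnabledFor(logging.DEBUG):
    logger.debug("stats: sorted=%s", sorted(values))
```

The `isEnabledFor` guard is the one that fully avoids both the string work and the float formatting when the message would be dropped anyway.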
Really strong breakdown of how logging becomes its own infrastructure problem. The point about treating logs as a firehose instead of structured data really captures why most teams hit a wall. One thing that often gets underestimated is the query cost later on: even with hot/warm/cold storage, people don't realize they're still indexing way too much in the hot tier. Sampling or filtering earlier can save tons of compute.
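For instance, head-based sampling ahead of the indexer can be as simple as the sketch below (hypothetical function, and the levels/rate are assumptions): keep every error and warning, but let only a fraction of routine INFO records reach the hot tier.

```python
import random


def should_index_hot(record, info_sample_rate=0.05):
    """Decide whether a log record goes to the hot (indexed) tier.

    Errors and warnings are always kept; INFO is sampled at a fixed
    rate, so the hot tier only indexes ~5% of routine traffic.
    """
    if record["level"] in ("ERROR", "WARN"):
        return True
    return random.random() < info_sample_rate
```

Records that fail the check can still be written unindexed to the warm/cold tiers, so nothing is lost, only the expensive hot-tier indexing is reduced.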
Nice article.
Great solution. Looking forward to seeing the hot warm cold solution!