5 Comments
Abhishek Pathak

Nice article.

Spraghav

Thanks for the deep dive. One approach I have seen systems take is to have the async logger write to a short-lived file on disk (rotated every 5 minutes or 1 hour) and let a daemon running on the same instance, independent from the application, take care of flushing those logs to the stream/queue system over the network. This way, (1) the company can have a centralized log-publishing system decoupled from the application's language, (2) Kafka/Kinesis failures are handled with retries without bothering the application, (3) application crashes won't affect logging (especially the info needed to debug the crash), since the daemon keeps publishing as long as the instance is up, and (4) writes can be batched to avoid too many network calls.
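
A minimal sketch of that spool-and-ship pattern (the file layout, rotation interval, paths, topic name, and the confluent_kafka producer are all illustrative assumptions, not the commenter's actual setup):

```python
import glob
import json
import os
import time

from confluent_kafka import Producer  # assumed transport; Kinesis etc. would work similarly

SPOOL_DIR = "/var/spool/applogs"
BUCKET_SECONDS = 300  # rotate the spool file every 5 minutes

def spool_path(now=None):
    """The application appends log lines to a file named after the current time bucket."""
    bucket = int((now or time.time()) // BUCKET_SECONDS) * BUCKET_SECONDS
    return os.path.join(SPOOL_DIR, f"app-{bucket}.jsonl")

def app_log(record: dict):
    """Called by the async logger: a local append, no network I/O on the application's path."""
    with open(spool_path(), "a") as f:
        f.write(json.dumps(record) + "\n")

def ship_closed_buckets(producer: Producer, topic: str = "app-logs"):
    """Daemon loop body: publish any bucket file that is no longer being written to."""
    current = spool_path()
    for path in sorted(glob.glob(os.path.join(SPOOL_DIR, "app-*.jsonl"))):
        if path == current:
            continue  # the application is still appending to this bucket
        with open(path) as f:
            for line in f:
                producer.produce(topic, value=line.rstrip("\n").encode())  # batched internally
        producer.flush()  # retries happen here, away from the application
        os.remove(path)   # a real daemon would check delivery callbacks before deleting

if __name__ == "__main__":
    os.makedirs(SPOOL_DIR, exist_ok=True)
    p = Producer({"bootstrap.servers": "localhost:9092"})
    while True:
        ship_closed_buckets(p)
        time.sleep(30)
```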

Gilberto Martins

Great solution. Looking forward to seeing the hot/warm/cold solution!

Benard Mesander

Those who do not know syslog are doomed to reinvent it.

Another key thing: if a log line is not going to be emitted due to its log level, don't do the string operations to build a message that will never be sent.

Also, the most expensive floating-point operation is printf, so only log floating-point quantities when essential.
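
A quick illustration of the level-guard point using Python's stdlib logging (the expensive_summary helper and its numbers are made up for the example):

```python
import logging

logging.basicConfig()
logger = logging.getLogger("orders")
logger.setLevel(logging.INFO)  # DEBUG records from this logger will be dropped

def expensive_summary() -> str:
    # Pretend this walks a large structure and formats many floats.
    return ", ".join(f"{k}={v:.6f}" for k, v in {"p50": 0.00123, "p99": 0.04567}.items())

# Bad: the f-string (and the float formatting) runs even though DEBUG is disabled.
logger.debug(f"latency summary: {expensive_summary()}")

# Better: logging builds the message only if the record will actually be emitted,
# but note the argument itself is still evaluated.
logger.debug("latency summary: %s", expensive_summary())

# Best for genuinely expensive arguments: guard the call site explicitly.
if logger.isEnabledFor(logging.DEBUG):
    logger.debug("latency summary: %s", expensive_summary())
```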

The AI Architect

Really strong breakdown of how logging becomes its own infrastructure problem. The part about treating logs as a firehose instead of structured data really captures why most teams hit a wall. One thing that often gets underestimated is the query cost later on: even with hot/warm/cold storage, people don't realize they're still indexing way too much in the hot tier. Sampling or filtering earlier can save a ton of compute.
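
One way to do that sampling at the edge, sketched with a stdlib logging.Filter (the 1% rate and the WARNING cutoff are arbitrary assumptions):

```python
import logging
import random

class SamplingFilter(logging.Filter):
    """Pass all WARNING-and-above records; keep only a fraction of INFO/DEBUG."""
    def __init__(self, sample_rate: float = 0.01):
        super().__init__()
        self.sample_rate = sample_rate

    def filter(self, record: logging.LogRecord) -> bool:
        if record.levelno >= logging.WARNING:
            return True
        return random.random() < self.sample_rate

handler = logging.StreamHandler()        # stand-in for whatever ships to the hot tier
handler.addFilter(SamplingFilter(0.01))  # drop ~99% of INFO/DEBUG before they're indexed
logging.getLogger().addHandler(handler)
```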
