How Would You Compute Trending Posts at Millions of Events per Second?
Computing "Trending Posts" Is Harder Than It Looks
Every social platform has it. A Trending section.
The posts everyone seems to be talking about right now - not yesterday, not last week, but this minute.
At first glance, it sounds easy. Just count likes and sort, right?
That illusion lasts until traffic grows, events start arriving out of order, and one viral post suddenly floods your system. What felt like a simple ranking problem quickly turns into a real-time data challenge.
In This Chapter
The solutions that look correct but break at scale
Why “trending” is not a batch problem
How real systems compute it continuously using streaming data
We will solve it with first-principles thinking.
First Attempts (and Why They Break)
Attempt 1: Recompute Everything Every Minute
The most obvious idea is this:
Every minute, scan all recent likes, comments, shares, and views
Group them by post
Sort by count
Pick the top 10
Simple SQL or Spark job. Done.
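Here is a minimal sketch of that job in Python, assuming the recent interactions have already been pulled into memory as a list of event dicts (in a real setup this would be the SQL or Spark query itself):

```python
import time
from collections import Counter

def compute_trending(events, window_seconds=60, k=10):
    """Naive recompute: rescan every recent event on every run."""
    cutoff = time.time() - window_seconds
    counts = Counter(e["post_id"] for e in events if e["timestamp"] >= cutoff)
    return counts.most_common(k)

# A scheduler calls this every minute - and the full scan repeats every time.
```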
Why it feels right:
Easy to reason about
Works fine when you have 10k posts
“Trending every minute” sounds like a batch problem
Why it fails:
You’re reprocessing the same data again and again
At scale, each minute means scanning millions of events
Latency creeps up → your “every minute” job takes longer than a minute
Spikes (viral posts) make jobs miss their SLA
What looked like a clean batch job quietly turns into a runaway compute bill.
Lesson: Recomputing from scratch doesn’t scale with velocity.
Attempt 2: Maintain Counters in the Database
Okay, let’s be smarter.
Whenever a user likes or comments on a post, increment a counter in the posts table and use that counter to decide what’s trending.
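As a rough sketch, using SQLite from the standard library as a stand-in for the transactional database (the posts table with id and like_count columns is made up for illustration):

```python
import sqlite3

conn = sqlite3.connect("app.db")  # stand-in for the primary database

def record_like(post_id):
    # Every single interaction becomes a write against the same hot row.
    conn.execute(
        "UPDATE posts SET like_count = like_count + 1 WHERE id = ?",
        (post_id,),
    )
    conn.commit()

def trending(k=10):
    return conn.execute(
        "SELECT id, like_count FROM posts ORDER BY like_count DESC LIMIT ?",
        (k,),
    ).fetchall()
```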
Why it feels right:
Constant-time updates
Data is always “fresh”
Easy queries like ORDER BY score DESC
Why it fails:
High write contention on hot posts
Viral posts become bottlenecks
Locks, retries, and replication lag start showing up
Your primary DB is now handling analytics traffic
Suddenly, your transactional database is doing real-time ranking. That never ends well.
Lesson: Databases are great at state, not high-frequency event aggregation.
Attempt 3: Cache Trending Posts in Redis
Fine. Move fast stuff to Redis.
Increment counters in Redis
Every minute, read top keys
Redis is fast, right?
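A sketch of that pattern with redis-py, using a sorted set so the "read top keys" step is a single command (the key name is illustrative):

```python
import redis

r = redis.Redis(host="localhost", port=6379)

def record_like(post_id):
    # Every interaction bumps the post's score in one shared sorted set.
    r.zincrby("trending:counts", 1, post_id)

def top_posts(k=10):
    # Highest scores first, read once a minute (or on demand).
    return r.zrevrange("trending:counts", 0, k - 1, withscores=True)
```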
Why it feels right:
In-memory speed
No DB locks
Widely used pattern
What’s the problem:
Redis becomes a single hot spot
You lose ordering guarantees under heavy concurrency
Crash or restart = counters gone (unless heavily persisted)
Cross-region or multi-shard ranking gets messy
Redis helps performance, but it doesn’t solve time-windowed aggregation cleanly.
Lesson: Speed alone doesn’t solve correctness at scale.
Attempt 4: Trigger Jobs with Cron
Another classic move:
Run a cron job every minute
Compute trending posts for the last 60 seconds
Publish the result
Why it feels right:
Simple mental model
Clear time boundaries
Easy to debug
What’s the issue:
All computation spikes at the same second
Late events get dropped or miscounted
Real-time systems don’t respect clock boundaries
Users see inconsistent “trending” lists
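That "late events" failure is easy to demonstrate. In the sketch below (pure Python, with illustrative timestamps), a like that happened inside the minute but arrived a couple of seconds late is never counted by any run:

```python
WINDOW = 60

def in_window(event_ts, job_start_ts):
    # Each cron run only looks at the previous fixed 60-second block.
    return job_start_ts - WINDOW <= event_ts < job_start_ts

job_at_1201 = 1_702_999_260   # cron fires at 12:01:00
late_like = 1_702_999_255     # happened at 12:00:55, arrived after 12:01:00

# The 12:01 run had already executed without it...
print(in_window(late_like, job_at_1201))           # True - but that job is done
# ...and the 12:02 run excludes it by timestamp.
print(in_window(late_like, job_at_1201 + WINDOW))  # False - dropped for good
```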
Trending is a flowing signal, not a punctual snapshot.
Lesson: Time-based cron jobs don’t mix well with real-time streams.
Attempt 5: “Just Use Machine Learning”
Someone always suggests this. “Why not train a model to predict trending posts?”
Why it feels right:
Sounds sophisticated
Great for long-term ranking
Why it fails (for this problem):
ML still needs clean, real-time features
You still haven’t solved streaming aggregation
Latency requirements make inference tricky
Overkill for minute-level trends
ML improves ranking. It doesn’t replace real-time computation.
Lesson: Intelligence doesn’t fix broken pipelines.
The Pattern Emerging
All these approaches fail for the same reason:
They treat trending as a static calculation.
In reality, trending is a moving window over a live stream of events.
To solve it properly, we need to stop thinking in rows and start thinking in flows.
The Right Way to Think About Trending
Once you step back, the mistake in all the earlier attempts becomes obvious.
Trending is not a number. It’s a signal changing over time.
A post doesn’t become trending because it crossed a fixed count. It becomes trending because activity around it is rising faster than others right now.
That immediately tells us two things:
We must process events as they happen
We must reason in sliding time windows, not batches or cron jobs
This is a streaming problem, not a database one.
The Mental Model Shift
Instead of asking:
“Every minute, what are the top posts?”
We ask:
“As events flow in, how does each post’s momentum change?”
Once you adopt this mindset, the architecture almost designs itself.
The Architecture: Step by Step
Step 1: Treat Every Interaction as an Event
Likes, comments, shares, views - they are all signals. Each interaction becomes a small event:
{
"post_id": "abc123",
"event_type": "like",
"timestamp": 1702999200
}
Nothing is aggregated yet. Nothing is ranked yet. We simply emit events and move on.
This keeps the write path extremely fast and lets the system absorb sudden spikes when something goes viral.
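Here is a sketch of that write path, assuming kafka-python and a topic called post-interactions (the library choice and all names are illustrative). Keying by post_id also sets up the per-post ordering discussed in the next step:

```python
import json
import time
from kafka import KafkaProducer  # kafka-python

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    key_serializer=lambda k: k.encode("utf-8"),
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

def emit_interaction(post_id, event_type):
    # Fire-and-forget on the hot path: no aggregation, no ranking here.
    producer.send(
        "post-interactions",
        key=post_id,  # same post -> same partition -> ordered per post
        value={
            "post_id": post_id,
            "event_type": event_type,
            "timestamp": int(time.time()),
        },
    )

emit_interaction("abc123", "like")
```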
Step 2: Stream, Don’t Store First
These events flow into a stream - Kafka, Pulsar, Kinesis, Pub/Sub.
Why a stream? Because streams:
Preserve order within partitions
Handle burst traffic naturally
Allow multiple consumers to compute different views
At this point, the system isn’t “computing trending” yet. It’s just recording reality as it unfolds.
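To make the "multiple consumers" point concrete: with kafka-python, two consumer groups each receive the full stream and can build completely different views (the group names are made up):

```python
from kafka import KafkaConsumer

# Separate group_ids mean separate offsets: the trending pipeline and an
# analytics pipeline both read every event without stepping on each other.
trending_consumer = KafkaConsumer(
    "post-interactions",
    bootstrap_servers="localhost:9092",
    group_id="trending-aggregator",
)
analytics_consumer = KafkaConsumer(
    "post-interactions",
    bootstrap_servers="localhost:9092",
    group_id="daily-analytics",
)
```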
Step 3: Compute Trends Using Sliding Windows
Now comes the core idea.
Instead of recomputing everything every minute, we maintain rolling windows:
Last 1 minute
Last 5 minutes
Last 15 minutes
As events flow in, stream processors continuously update counters per post per window.
This is not batch. This is incremental math.
When a new like arrives:
Add +1 to that post’s current window
Expire events that fall out of the window
Update the score instantly
No rescans. No cron jobs. No waiting.
Trending becomes a continuously updated result.
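A minimal in-memory sketch of that incremental math for a single window size (a stream processor like Flink or Kafka Streams manages this state for you across partitions; the class below just shows the idea):

```python
import time
from collections import defaultdict, deque

class SlidingWindowCounter:
    """Per-post counts over the last `window_seconds`, updated one event at a time."""

    def __init__(self, window_seconds=300):
        self.window = window_seconds
        self.counts = defaultdict(int)   # post_id -> count inside the window
        self.events = deque()            # (timestamp, post_id), oldest first

    def add(self, post_id, ts=None):
        ts = time.time() if ts is None else ts
        self.events.append((ts, post_id))
        self.counts[post_id] += 1        # +1 to the current window - no rescan
        self._expire(ts)

    def _expire(self, now):
        # Drop events that have slid out of the window and decrement their posts.
        while self.events and self.events[0][0] < now - self.window:
            _, old_post = self.events.popleft()
            self.counts[old_post] -= 1
            if self.counts[old_post] == 0:
                del self.counts[old_post]

window_5min = SlidingWindowCounter(window_seconds=300)
window_5min.add("abc123")
```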
Step 4: Separate Scoring from Ranking
Raw counts alone don’t tell the full story.
A post with 10 likes in 1 minute might be more interesting than one with 1,000 likes over a day.
So instead of just counting, we compute a score:
Weight recent events higher than older ones
Give comments more weight than likes
Normalize by post age
This scoring logic lives inside the stream processor.
The output is simple:
{
"post_id": "abc123",
"trend_score": 847.5,
"window": "5min"
}
The heavy thinking happens once, in the stream - not on every read.
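One way that score could be expressed; the weights, half-life, and age normalization below are illustrative choices, not a standard formula:

```python
import time

EVENT_WEIGHTS = {"view": 1.0, "like": 3.0, "comment": 8.0, "share": 10.0}
HALF_LIFE_SECONDS = 300  # an event from 5 minutes ago counts half as much

def trend_score(events, post_created_at, now=None):
    """events: (event_type, timestamp) pairs for one post inside the window."""
    now = time.time() if now is None else now
    score = 0.0
    for event_type, ts in events:
        weight = EVENT_WEIGHTS.get(event_type, 1.0)
        decay = 0.5 ** ((now - ts) / HALF_LIFE_SECONDS)  # newer -> closer to 1.0
        score += weight * decay
    age_hours = max((now - post_created_at) / 3600, 1.0)
    return score / age_hours                             # normalize by post age
```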
Step 5: Maintain a Live Top-K Structure
From these scores, we maintain a Top-K list (say top 100 posts per region or category).
This structure updates incrementally:
When a score changes, we adjust its position
No full sorting required
No global locks
The result is always ready.
When the UI asks for “trending posts”, it’s a simple read - not a computation.
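A simple sketch of a bounded top-K held in memory (in practice this often lives in the stream processor's state store or a Redis sorted set):

```python
import heapq

class TopK:
    """Keep only the K highest-scoring posts; never sorts the full corpus."""

    def __init__(self, k=100):
        self.k = k
        self.scores = {}  # post_id -> latest trend score

    def update(self, post_id, score):
        self.scores[post_id] = score
        if len(self.scores) > self.k:
            # Evict the current minimum instead of re-ranking everything.
            lowest = min(self.scores, key=self.scores.get)
            del self.scores[lowest]

    def current(self):
        # Already bounded to K entries, so this stays cheap.
        return heapq.nlargest(self.k, self.scores.items(), key=lambda kv: kv[1])

top100 = TopK(k=100)
top100.update("abc123", 847.5)
```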
Step 6: Serve Reads Without Touching the Stream
The final trending list is pushed to a fast store:
Redis
DynamoDB
In-memory cache
Reads are cheap. Writes are controlled. The stream keeps flowing independently.
If the UI goes down, streaming continues. If the stream lags, the UI still serves the last known good state.
Failures don’t cascade.
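A sketch of that hand-off, assuming Redis as the serving store and an illustrative key scheme; note that the read path does no computation at all:

```python
import json
import redis

store = redis.Redis(host="localhost", port=6379)

def publish_trending(region, ranked_posts):
    # Called by the stream processor whenever its top-K changes.
    store.set(f"trending:{region}", json.dumps(ranked_posts))

def read_trending(region):
    # What the API / UI calls: one key lookup, no aggregation.
    cached = store.get(f"trending:{region}")
    return json.loads(cached) if cached else []
```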
Why This Works (At Scale)
This architecture works because nothing waits.
Events flow in once
Computation happens incrementally
Ranking updates continuously
Reads never trigger heavy work
You’re no longer fighting time or traffic spikes.
Trending stops being a periodic job and becomes a living signal.
Interview Pivot (How to Explain It)
If someone asks:
“How would you compute trending posts every minute?”
A strong answer: “I wouldn’t recompute every minute. I’d treat user interactions as a stream, aggregate them in sliding windows, compute a rolling trend score per post, and maintain a live top-K list.
The UI simply reads the latest result, while the stream updates continuously in the background.”
That answer shows you understand data flow, not just data storage.
What’s Next?
Trending isn’t something you calculate on a schedule. It’s something you observe as it emerges.
This design opens the door to deeper questions:
How do you handle late or out-of-order events?
How do you avoid one viral post dominating forever?
How do you scale this across regions?
How do you debug wrong trending results?
The key insight: Trending isn’t about counting - it’s about capturing momentum. The systems that get this right don’t fight the stream of events. They ride it.
Think in flows, not snapshots. That’s how real-time systems scale.
Related
This article scratched the surface. The real interview goes 10 levels deeper.
How do you handle hot partitions?
What if the cache goes down during a spike?
How do you avoid counting the same view twice?
I've written an ebook that prepares you for all of it.
35 real problems. The patterns that solve them. The follow-ups you'll actually face. The principles behind solving problems at scale, not just the final answers.
Thanks for reading. See you in the next post.






The shift from "recompute every minute" to "observe momentum as it emerges" really captures the fundamental mindset change. Most people get stuck treating this as a database query problem when it's actually a flow problem. I ran into this exact issue last year on a project where we initially tried Redis counters and hit the hot-partition bottleneck you described. Switching to a sliding window with Kafka made the difference, but the toughest part was handling late arrivals without double-counting or introducing crazy lag. How do you typically balance window size vs. accuracy when events show up out of order?