Why does full recalculation seem fine at first?
When your GA4 dataset is small, rebuilding the whole table every day is no big deal. Takes a minute, costs nothing notable, everyone's happy. I've built pipelines this way and they worked for a while.
When does it stop working?
As data grows, full recalculation gets slow and expensive. At some point, teams start running updates weekly instead of daily. Dashboards show stale data. Someone makes a decision based on numbers that are five days old. This is when "it works fine" becomes "it's a problem."
What does MERGE do differently?
MERGE compares incoming data against the existing table and applies targeted changes:
- INSERT new records that don't exist yet
- UPDATE records where values have changed
- DELETE records that should no longer be there
For GA4 data, a typical pattern for a daily aggregate table: MERGE on user_pseudo_id + event_date as the key (at the raw event level this pair isn't unique, so aggregate first). Only the new day's data touches the table. Everything else stays as-is. Runtime drops from minutes to seconds.
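A minimal sketch of that daily pattern in BigQuery SQL. The table and column names (analytics.daily_user_metrics, staging.ga4_daily, event_count, total_revenue) are assumptions for illustration, not from any real schema:

```sql
-- Target: a daily aggregate table keyed on (user_pseudo_id, event_date).
-- Source: one day's freshly aggregated GA4 rows in a staging table.
MERGE `analytics.daily_user_metrics` AS t
USING `staging.ga4_daily` AS s
  ON t.user_pseudo_id = s.user_pseudo_id
 AND t.event_date = s.event_date
-- UPDATE rows whose values changed (e.g. late-arriving events)
WHEN MATCHED AND t.event_count != s.event_count THEN
  UPDATE SET event_count   = s.event_count,
             total_revenue = s.total_revenue
-- INSERT rows that don't exist yet
WHEN NOT MATCHED THEN
  INSERT (user_pseudo_id, event_date, event_count, total_revenue)
  VALUES (s.user_pseudo_id, s.event_date, s.event_count, s.total_revenue)
-- DELETE rows for that day that no longer appear in the source;
-- the date filter keeps the delete scoped so historical rows are untouched
WHEN NOT MATCHED BY SOURCE AND t.event_date = CURRENT_DATE() THEN
  DELETE;
```

The date predicate on the DELETE branch matters: without it, WHEN NOT MATCHED BY SOURCE would wipe every historical row absent from the one-day staging table.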
Why is this a business reliability problem?
Fresh data is a business dependency. Marketing decisions, budget pacing, campaign optimization — all of it runs on data that needs to be current. I've watched teams lose confidence in their dashboards because updates were too slow and too expensive. MERGE is one of the simplest ways to make data pipelines fast enough to actually be trusted.

