Webhook-Driven, Reconciler-Bounded

Freshness comes from webhooks. The reconciler is what catches the webhooks that get dropped. Full-table polling is not an option.

The rule: webhooks are the primary freshness mechanism. A periodic, change-based reconciler handles the gap between "usually delivered" and "always delivered." Never poll the upstream for the full list on a schedule.


The two-level design

Level 1: webhook hydration. Upstream fires an event → our webhook handler verifies the signature → writes the payload directly into the local cache. The matching in-process (L1) cache entries are then invalidated so stale copies don't linger. This is the real-time path.
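
A minimal sketch of that path, under stated assumptions: the upstream signs the raw request body with HMAC-SHA256 and sends a hex digest, and the payload carries `entity` and `id` fields. Neither is confirmed by this document, so check the upstream's actual signing scheme and payload shape. The two dicts are hypothetical stand-ins for the shared cache and the in-process copy.

```python
import hashlib
import hmac
import json

# Hypothetical in-memory stand-ins for the shared cache and the in-process copy.
shared_cache: dict[str, dict] = {}
l1_cache: dict[str, dict] = {}


def verify_signature(secret: bytes, body: bytes, signature_hex: str) -> bool:
    """Return True if signature_hex is the HMAC-SHA256 hex digest of body."""
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    # compare_digest does a constant-time comparison; compare as bytes so a
    # non-ASCII header value is rejected instead of raising TypeError.
    return hmac.compare_digest(expected.encode(), signature_hex.encode())


def handle_webhook(secret: bytes, body: bytes, signature_hex: str) -> bool:
    """Verify, write through to the shared cache, then drop the in-process copy."""
    if not verify_signature(secret, body, signature_hex):
        return False
    payload = json.loads(body)
    key = f"{payload['entity']}:{payload['id']}"  # assumed payload shape
    shared_cache[key] = payload
    l1_cache.pop(key, None)
    return True
```

Verification runs against the raw bytes, before any JSON parsing, so an unsigned or tampered body never reaches the cache.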

Level 2: reconciler. A periodic sweep from the worker binary that asks the upstream "what changed since I last asked?" Any changes are upserted into the cache. Default interval: 15 minutes. This is the safety net.

The reconciler is not a substitute for webhooks. If webhooks go down for an extended period, the reconciler reduces the damage from "missed changes never arrive" to "staleness bounded by the interval." It is not fast enough to be the primary freshness path: users notice 15-minute lag.
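
One sweep might look like the sketch below. `fetch_changed_since` and `upsert` are hypothetical callables standing in for the upstream client and the cache writer. Advancing the cursor to the sweep's start time, rather than its end time, is a design assumption made here and not something this document specifies.

```python
from datetime import datetime, timezone
from typing import Callable, Iterable


def utc_now() -> datetime:
    return datetime.now(timezone.utc)


def reconcile_once(
    cursor: datetime,
    fetch_changed_since: Callable[[datetime], Iterable[dict]],
    upsert: Callable[[dict], None],
    now: Callable[[], datetime] = utc_now,
) -> datetime:
    """Run one sweep and return the cursor to persist for the next one."""
    # Capture the time before asking, so a change that lands mid-sweep is
    # picked up again next time instead of falling between two cursors.
    sweep_started = now()
    for record in fetch_changed_since(cursor):
        upsert(record)
    return sweep_started
```

The worker would call this every interval (15 minutes by default) and persist the returned cursor. A record fetched twice because of the overlap is harmless as long as the upsert is idempotent.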


The reconciler MUST be change-based

Full-table pulls are forbidden. Use upstream since-filters wherever they exist:

Endpoint example      Since-filter
HaloPSA /Tickets      lastactiondate_start=<timestamp>
HaloPSA /Invoice      start_date=<timestamp> + datesearch=last_modified
HaloPSA /Quotation    No filter: cap with count=200, rely on skip-if-unchanged upsert
HaloPSA /Client       No filter: same strategy as /Quotation
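
The table can be encoded as a small lookup that turns a cursor into query parameters. The parameter names come from the table. The ISO 8601 timestamp format and the helper names are assumptions, so confirm the format the upstream actually accepts.

```python
from datetime import datetime, timezone

# Since-filter parameters per endpoint, copied from the table above.
# None means the endpoint has no since-filter and gets a capped pull instead.
SINCE_FILTERS = {
    "/Tickets": lambda ts: {"lastactiondate_start": ts},
    "/Invoice": lambda ts: {"start_date": ts, "datesearch": "last_modified"},
    "/Quotation": None,
    "/Client": None,
}
PAGE_CAP = 200


def build_params(endpoint: str, since: datetime) -> dict[str, str]:
    """Return the query parameters for one reconciler pull of `endpoint`."""
    builder = SINCE_FILTERS[endpoint]
    if builder is None:
        return {"count": str(PAGE_CAP)}
    # ISO 8601 in UTC is an assumed wire format, not a documented one.
    return builder(since.astimezone(timezone.utc).isoformat())
```

An endpoint missing from the lookup raises `KeyError`, which forces a deliberate choice between a since-filter and a capped pull for every new endpoint.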

When the upstream offers no since-filter, cap the pull (count=200) and lean on the skip-if-unchanged guard in the upsert. The upsert comparison is cheap; redundant DB writes are the expensive thing.
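
A sketch of the skip-if-unchanged guard, assuming records are JSON-serializable dicts and that a content digest is stored next to each row. The two dicts are hypothetical stand-ins for the cache table, and the digest approach is one way to do the comparison, not necessarily the one in use.

```python
import hashlib
import json

store: dict[str, dict] = {}    # hypothetical stand-in for the cache table
digests: dict[str, str] = {}   # content digest stored alongside each row


def digest(record: dict) -> str:
    """Stable SHA-256 of a record; sort_keys makes key order irrelevant."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()


def upsert_if_changed(key: str, record: dict) -> bool:
    """Write the record only when its content differs. Returns True on write."""
    new_digest = digest(record)
    if digests.get(key) == new_digest:
        return False  # unchanged: skip the write entirely
    store[key] = record
    digests[key] = new_digest
    return True
```

With a capped pull of 200 records, most of them unchanged, this turns a sweep into 200 digest comparisons and a handful of writes.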

Never full-table. Tenants have hundreds of thousands of records. A scheduled "SELECT all invoices" against HaloPSA is an incident.


First-run handling

On a fresh deploy, the reconciler records a baseline timestamp without pulling history. Historical records hydrate via per-user backfill on first login (see the section below), not via a one-shot tenant-wide pull. This spreads hydration load across real usage instead of hammering the upstream at deploy time.
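
The first-run branch is small enough to sketch directly. `stored_cursor` is whatever the reconciler persisted last time (`None` on a fresh deploy), and `pull_changes_since` is a hypothetical callable that performs the change-based pull and returns how many records it handled.

```python
from datetime import datetime, timezone
from typing import Callable, Optional


def sweep_or_baseline(
    stored_cursor: Optional[datetime],
    pull_changes_since: Callable[[datetime], int],
    now: datetime,
) -> tuple[datetime, int]:
    """Return (cursor to persist, number of records pulled)."""
    if stored_cursor is None:
        # Fresh deploy: remember "now" as the baseline and pull no history.
        return now, 0
    return now, pull_changes_since(stored_cursor)
```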


Per-user backfill on login

Webhooks only fire on changes. Records that predate your deployment never arrive via webhook. Solution: on each user's first login, dispatch a backfill job scoped to that user's client_id. Idempotent via a database marker keyed per (user_id, client_id) so subsequent logins skip it.

This pattern handles three cases:

  • Fresh deploy against existing tenant. First users to log in trigger backfills for their clients; historical data flows in organically.
  • Re-enabling a client. A client reactivated after being dormant gets backfilled the next time a user from that client logs in.
  • New employee at an existing client. If for some reason the existing client's backfill marker is missing, the new employee's first login reruns it.
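
The marker check can be sketched with SQLite, letting the primary key on (user_id, client_id) provide the idempotency. The table name, `on_login`, and the `dispatch` callable are hypothetical; the real marker lives in whatever database the portal uses.

```python
import sqlite3
from typing import Callable


def make_db() -> sqlite3.Connection:
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE backfill_marker ("
        " user_id INTEGER NOT NULL,"
        " client_id INTEGER NOT NULL,"
        " PRIMARY KEY (user_id, client_id))"
    )
    return db


def on_login(
    db: sqlite3.Connection,
    user_id: int,
    client_id: int,
    dispatch: Callable[[int], None],
) -> bool:
    """Dispatch a client-scoped backfill on first login. Returns True if dispatched."""
    cur = db.execute(
        "INSERT OR IGNORE INTO backfill_marker (user_id, client_id) VALUES (?, ?)",
        (user_id, client_id),
    )
    db.commit()
    if cur.rowcount == 0:
        return False  # marker already present: a previous login handled it
    dispatch(client_id)
    return True
```

Claiming the marker before dispatching means two concurrent logins cannot both enqueue a backfill. The cost is that a dispatch that fails after the insert is not retried automatically, so a real implementation would need to clear the marker or write it only on job completion.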

Closed-record optimization

Terminal-state records (closed tickets, paid invoices, rejected quotes) rarely or never change. When the cache serves a terminal record, it serves the cached copy regardless of TTL and dispatches a background refresh job for drift detection. A dedup guard on the queue stops hot pages from stacking duplicate refreshes.

This eliminates unnecessary upstream calls for records that are effectively immutable while retaining eventual correctness.
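
A sketch of the read path with that optimization. The status names, the 300-second TTL, and the `RefreshQueue` class are illustrative assumptions; the real queue and its dedup guard will look different.

```python
from typing import Callable

TERMINAL_STATUSES = {"closed", "paid", "rejected"}  # assumed status names


class RefreshQueue:
    """Toy queue with a dedup guard: one pending refresh per key."""

    def __init__(self) -> None:
        self._pending: set[str] = set()
        self.jobs: list[str] = []

    def enqueue(self, key: str) -> bool:
        if key in self._pending:
            return False  # a refresh for this key is already queued
        self._pending.add(key)
        self.jobs.append(key)
        return True

    def done(self, key: str) -> None:
        self._pending.discard(key)


def read(
    key: str,
    cache: dict[str, dict],
    queue: RefreshQueue,
    fetch_upstream: Callable[[str], dict],
    now: float,
    ttl_seconds: float = 300.0,
) -> dict:
    entry = cache.get(key)
    if entry is not None:
        if entry["record"]["status"] in TERMINAL_STATUSES:
            # Terminal: serve the cached copy regardless of age and let a
            # background job check for drift.
            queue.enqueue(key)
            return entry["record"]
        if now - entry["cached_at"] < ttl_seconds:
            return entry["record"]
    record = fetch_upstream(key)
    cache[key] = {"record": record, "cached_at": now}
    return record
```

The worker that processes a refresh job calls `done(key)` when it finishes, which reopens the key for the next drift check.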


Why we do this

  • Rate-limit friendliness. Change-based pulls use orders of magnitude less of the upstream's budget than full-table pulls.
  • Webhook misses self-heal within one reconciler interval. At the default interval, worst-case lag for a missed webhook is about 15 minutes.
  • Deploy-time load is flat. No tenant-wide spike when the portal launches; load scales with actual user activity.
  • Terminal records pay once. Closed tickets stop costing read-path API calls after their last webhook; only the deduplicated background drift checks remain.

When this applies

Every integration where we maintain a local cache of upstream entities.

When it does not apply

  • Upstream has no webhook support at all. In that case the reconciler becomes the primary mechanism — at which point we need to think hard about interval vs. API budget, and whether a cache is even worth it.
  • Upstream pushes changes to us via a different mechanism (streaming, long-poll). Adapt the pattern — the webhook handler becomes the stream handler, but the rest stays.