Skip to content
Informer Cache Optimization
On this page

Informer Cache Optimization

This page describes the informer cache transform functions that Pipelines-as-Code uses to reduce the memory footprint of its watcher controller. These transforms are applied automatically and require no configuration.

Background

The PAC watcher controller maintains in-memory caches (via Kubernetes informers) for Repository and PipelineRun objects. In clusters with many repositories or long-running PipelineRuns, these caches can consume a significant amount of memory because each cached object carries fields that the watcher never reads — managedFields, last-applied-configuration annotations, embedded specs, and status details.

Pipelines-as-Code registers TransformFunc callbacks on each informer. A TransformFunc is called on every object before it is stored in the cache, allowing unnecessary fields to be stripped while the object is still in use.

This is the same approach used by the Tekton Pipelines controller to reduce its own cache memory usage.

What Gets Stripped

Repository Objects

Field Why it is safe to strip
metadata.managedFields Written by the API server for server-side apply tracking; not read by any reconciler logic
metadata.annotations No reconciler logic reads Repository annotations from the lister; the largest annotation (kubectl.kubernetes.io/last-applied-configuration) can be 500-2000 bytes alone
status The reconciler always fetches Repository.Status via a direct API call before updating it; it is never read from the lister

Benchmark result: ~89% JSON size reduction per Repository object.

PipelineRun Objects

Fields preserved (required for reconciliation)

Field Used for
metadata.name, metadata.namespace Object identity
metadata.labels Repository lookup, pipeline identification
metadata.annotations PAC state (pipelinesascode.tekton.dev/state), repository keys
metadata.finalizers, metadata.deletionTimestamp Finalizer-based cleanup
spec.status Detecting PipelineRunSpecStatusPending
status.conditions Completion state and reason
status.startTime, status.completionTime Metrics recording

Fields stripped

Field Why it is safe to strip
metadata.managedFields API server bookkeeping, not used in reconciliation
spec.pipelineRef Not read from the cache by the watcher
spec.pipelineSpec Embedded pipeline definition; can be very large (~20KB in production)
spec.params Not read from the cache
spec.workspaces Not read from the cache
spec.taskRunSpecs Not read from the cache
spec.taskRunTemplate Not read from the cache
spec.timeouts Not read from the cache
status.pipelineSpec Snapshot of the executed pipeline spec; the largest status field (~20KB)
status.childReferences References to child TaskRuns
status.provenance Build provenance metadata
status.spanContext Tracing span context

When the reconciler needs the full PipelineRun (for example, during postFinalStatus or GetStatusFromTaskStatusOrFromAsking), it fetches the complete object directly from the API server.

Benchmark result: ~94% JSON size reduction per PipelineRun object.

Memory Impact

Benchmarks using realistic object sizes from production clusters show the following per-object savings:

Object Original size After transform Reduction
Repository (5 status entries) ~5.7 KB ~0.6 KB ~89%
PipelineRun (15-task pipeline) ~10.7 KB ~0.7 KB ~94%

For a cluster with 1000 Repositories and 700 PipelineRuns, the estimated watcher cache reduction is approximately 12 MB.

Graceful Degradation

The transform functions are designed to degrade gracefully:

  • If an object is wrapped in a DeletedFinalStateUnknown tombstone (which happens when the watcher misses a delete event), the transform unwraps it, strips the inner object, and re-wraps it.
  • If the transform receives an unexpected type, it returns the object unmodified rather than returning an error.
  • If any error occurs during transformation, the original object is returned unchanged so that the informer cache continues to function.

Developer Notes

If you add new reconciliation logic that reads a field from cached objects (via listers), you must verify that the field is not stripped by these transforms. Fields stripped from cached objects will be nil or empty even though they exist in etcd.

If you need a stripped field, fetch the full object via the API client instead of the lister:

// Don't do this — spec.params is stripped from the cache:
pr, _ := pipelineRunLister.PipelineRuns(ns).Get(name)
params := pr.Spec.Params // always nil!

// Do this instead — fetch from the API server:
pr, _ := tektonClient.TektonV1().PipelineRuns(ns).Get(ctx, name, metav1.GetOptions{})
params := pr.Spec.Params // full object from etcd

The transform functions and their benchmarks live in pkg/informer/transform/.

To run the benchmarks yourself:

go test -bench=. -benchmem -v ./pkg/informer/transform/

To see the size reduction report:

go test -v -run 'TestMeasure.*TransformSavings' ./pkg/informer/transform/