KEDA vs HPA: Kubernetes Event-Driven Autoscaling Compared (2026)
KEDA vs HPA compared for 2026 - how the Horizontal Pod Autoscaler and KEDA differ on metrics, scale-to-zero, event sources, and why KEDA builds on HPA rather than replacing it. Which pod autoscaler should you use?
KEDA vs HPA is the core decision for Kubernetes autoscaling at the pod layer in 2026. The Horizontal Pod Autoscaler (HPA) is built into Kubernetes and scales pods on CPU, memory, and custom metrics. KEDA is a CNCF project that adds event-driven autoscaling on 60+ external sources plus scale-to-zero. The most important thing to understand up front: KEDA does not replace HPA - it builds on top of it.
This guide compares KEDA and HPA on what actually matters for cost and responsiveness: the metrics each can scale on, event-source support, scale-to-zero, operational complexity, and exactly when to use each.
The short answer
Pick HPA if:
- Your workloads scale cleanly on CPU, memory, or a custom metric you already expose
- You do not need to drop to zero replicas when idle
- You want zero extra components - HPA is built into Kubernetes
- Your scaling signals are internal resource utilization, not external events
Pick KEDA if:
- You need to scale on external events - Kafka consumer lag, RabbitMQ / SQS queue depth, Prometheus queries, cron schedules, or any of KEDA's other 60+ scalers
- You want scale-to-zero so idle workloads cost nothing until work arrives
- You run queue consumers, batch processors, or event-driven microservices
- You want event-driven scaling without hand-building custom metrics adapters
Both at once: this is the normal case, because using KEDA means you are using HPA too. KEDA creates and manages a standard HPA for every ScaledObject. The real choice is whether you drive that HPA directly with resource metrics or let KEDA drive it with external events.
Deciding factors at a glance
| If your priority is… | Choose |
|---|---|
| Simple CPU / memory scaling | HPA |
| Scale on Kafka lag, queue depth, or Prometheus | KEDA |
| Scale-to-zero for idle workloads | KEDA |
| Nothing extra to install | HPA |
| Cron / schedule-based scaling | KEDA |
| Custom metrics without building adapters | KEDA |
| Maximum simplicity for steady web traffic | HPA |
What each tool is
Horizontal Pod Autoscaler (HPA)
The Horizontal Pod Autoscaler is a built-in Kubernetes controller (autoscaling/v2) that adjusts the replica count of a Deployment, StatefulSet, or other scalable resource to keep an observed metric near a target. Out of the box it scales on CPU and memory using the metrics server. With a metrics adapter (for example the Prometheus Adapter) it can also scale on custom and external metrics via the Kubernetes metrics APIs.
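To make "keep an observed metric near a target" concrete, here is a minimal Python sketch of the replica calculation HPA applies to a single metric: the current replica count is multiplied by the ratio of observed to target value and rounded up. The function name and defaults are illustrative assumptions; the real controller also handles missing metrics, pod readiness, and stabilization windows, which this sketch leaves out.

```python
import math


def desired_replicas(current_replicas, current_value, target_value,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Approximate the core HPA calculation for one metric.

    Simplified sketch: ignores unready pods, missing metrics, and
    scale-up/scale-down stabilization behavior.
    """
    ratio = current_value / target_value
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: keep the current replica count.
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # The result is always clamped to the configured bounds.
    return max(min_replicas, min(max_replicas, desired))


print(desired_replicas(4, 90, 60))   # 4 pods at 90% CPU, target 60% -> 6
print(desired_replicas(4, 63, 60))   # within 10% tolerance -> stays at 4
print(desired_replicas(10, 30, 60))  # half the target -> 5
```

The tolerance band (10% by default in Kubernetes) is what stops HPA from flapping when the metric hovers around the target.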
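In a default configuration, HPA's minimum replica count is one, so it cannot scale a workload to zero. (Kubernetes does have an HPAScaleToZero feature gate that permits a minimum of zero with object or external metrics, but it has long been an alpha feature and is not something most clusters can rely on.) HPA is the standard, zero-install answer for workloads whose load correlates well with resource utilization: web frontends, APIs, and most request-driven services.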
KEDA
KEDA (Kubernetes Event-Driven Autoscaling) is a CNCF-graduated project that adds event-driven autoscaling to Kubernetes. You declare a ScaledObject that references your Deployment and one or more triggers (scalers). KEDA ships 60+ scalers for sources like Kafka, RabbitMQ, AWS SQS, Azure Service Bus, NATS, Google Pub/Sub, Prometheus, Datadog, CloudWatch, databases, cloud storage, and cron.
Crucially, KEDA is an HPA extension, not a competitor. For each ScaledObject it creates and manages a standard HPA and acts as an external metrics adapter feeding that HPA values from your event source. KEDA’s own controller adds the scale-to-zero transition - activating from 0 to 1 (and back to 0) based on the event source - while the managed HPA handles ongoing scaling between its minimum and maximum.
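The division of labor described above can be sketched in a few lines of Python. This is a simplified model, not KEDA's implementation: the function and parameter names are hypothetical, and it ignores polling intervals, cooldown periods, and HPA tolerance. It assumes a queue-style trigger where the target is a per-replica average (for example, 50 messages per pod).

```python
import math


def keda_replicas(metric_value, target_per_replica,
                  min_replicas=0, max_replicas=20, activation_threshold=0):
    """Simplified model of how KEDA and its managed HPA split the work.

    Hypothetical helper for illustration only.
    """
    # Activation phase (KEDA's own controller): the workload stays at its
    # minimum, which may be zero, until the source reports a value above
    # the activation threshold.
    if metric_value <= activation_threshold:
        return min_replicas
    # Scaling phase (the HPA that KEDA manages): with a per-replica
    # average target, the desired count is total / target, rounded up.
    desired = math.ceil(metric_value / target_per_replica)
    # Once active, at least one replica runs.
    floor = max(1, min_replicas)
    return max(floor, min(max_replicas, desired))


print(keda_replicas(0, 50))     # empty queue -> 0 replicas
print(keda_replicas(1, 50))     # first message activates the workload -> 1
print(keda_replicas(420, 50))   # 420 messages / 50 per pod -> 9
print(keda_replicas(5000, 50))  # capped at max_replicas -> 20
```

The point of the sketch is the boundary: the 0-to-1 decision is a yes/no question about whether there is any work, while everything from 1 upward is ordinary HPA arithmetic.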
KEDA vs HPA: head-to-head
| Dimension | KEDA | HPA |
|---|---|---|
| What it is | CNCF project that extends HPA with event-driven scaling | Built-in Kubernetes controller |
| Relationship | Creates and manages an HPA under the hood | The autoscaling primitive KEDA builds on |
| Metrics out of the box | 60+ event sources via scalers | CPU and memory |
| Custom / external metrics | Built in per scaler, no adapter wiring | Needs a metrics adapter you deploy and maintain |
| Scale-to-zero | Yes - activates from 0 on first event | No - minimum is one replica |
| Event sources | Kafka, queues, Prometheus, cron, DBs, cloud, more | None natively; only what an adapter exposes |
| Install footprint | Deploy KEDA operator + CRDs | None - part of Kubernetes |
| Best for | Event-driven, bursty, scale-to-zero workloads | Steady, resource-correlated workloads |
| Maturity | CNCF graduated, production-proven | Core Kubernetes, extremely mature |
The defining contrast: HPA scales on internal resource utilization and never reaches zero; KEDA turns external events into autoscaling signals and adds scale-to-zero, doing so by managing an HPA for you rather than reinventing the scaling loop.
When to choose KEDA
Choose KEDA when:
- Load comes from a queue or stream. Consumers of Kafka, RabbitMQ, SQS, Service Bus, NATS, or Pub/Sub scale best on queue depth or consumer lag, not CPU. A backlog should add pods immediately, even if CPU has not spiked yet. KEDA reads the source directly and scales accordingly.
- You want scale-to-zero. Batch processors, infrequently used internal tools, and event-driven microservices can sit at zero pods when idle and activate on the first message. On clusters where capacity is paid for, this is a direct cost saving - especially combined with node autoscaling.
- You scale on schedules. The cron scaler pre-warms capacity before known peaks (market open, business hours) and scales down afterward, without custom controllers.
- You need custom metrics without the adapter tax. Rather than building and operating a Prometheus Adapter to expose a metric to HPA, KEDA’s Prometheus and vendor scalers consume those signals directly.
For UAE AI/ML and data teams, KEDA shines on inference queues and batch pipelines where scale-to-zero between bursts meaningfully cuts GPU and compute spend. Pair pod-level KEDA with node-level provisioning - see our companion guide on Karpenter vs Cluster Autoscaler - so that scaling pods to zero also lets idle nodes drain away.
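As a rough illustration of the queue-consumer case, the following Python sketch builds a ScaledObject manifest for a Kafka consumer and prints it as JSON, which kubectl can apply just like YAML. The resource names, topic, consumer group, and broker address are all hypothetical, and the trigger metadata keys should be checked against the Kafka scaler documentation for the KEDA version you run.

```python
import json


def kafka_scaled_object(name, deployment, topic, consumer_group,
                        bootstrap_servers, lag_threshold=50, max_replicas=20):
    """Build a KEDA ScaledObject that scales a Deployment on Kafka lag."""
    return {
        "apiVersion": "keda.sh/v1alpha1",
        "kind": "ScaledObject",
        "metadata": {"name": name},
        "spec": {
            "scaleTargetRef": {"name": deployment},
            # Zero allows the consumer to disappear entirely when idle.
            "minReplicaCount": 0,
            "maxReplicaCount": max_replicas,
            "triggers": [{
                "type": "kafka",
                # Trigger metadata values are strings in the manifest.
                "metadata": {
                    "bootstrapServers": bootstrap_servers,
                    "consumerGroup": consumer_group,
                    "topic": topic,
                    "lagThreshold": str(lag_threshold),
                },
            }],
        },
    }


# Hypothetical names, for illustration only.
manifest = kafka_scaled_object(
    name="orders-consumer-scaler",
    deployment="orders-consumer",
    topic="orders",
    consumer_group="orders-workers",
    bootstrap_servers="kafka.example.internal:9092",
)
print(json.dumps(manifest, indent=2))
```

Here lagThreshold plays the role of the per-replica target: roughly one pod per 50 messages of lag, up to the configured maximum.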
When to choose HPA
Choose plain HPA when:
- Load correlates with CPU or memory. Classic web and API workloads scale well on resource utilization. HPA handles this with no extra components - it is already in your cluster.
- You do not need scale-to-zero. If a baseline of at least one replica is always acceptable (or desirable for latency), HPA’s one-replica minimum is fine, and you avoid the cold-start latency that scale-to-zero introduces on the first request.
- You want the smallest possible operational surface. No operator to install, patch, or reason about during incidents. For risk-averse or tightly governed platforms, fewer moving parts is a feature.
- Your custom metric is simple and already exposed. If you have one straightforward custom metric and an adapter already running, HPA can use it directly without adopting KEDA.
A steady-traffic UAE banking portal with predictable diurnal load is often best served by HPA on CPU plus a sensible minimum replica count - simple, well understood, and easy to audit.
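For comparison, a plain CPU-based HPA is a single small object. The sketch below builds an autoscaling/v2 manifest in Python and prints it as JSON; the Deployment name, replica bounds, and 70% target are hypothetical values chosen for illustration. Note that CPU utilization is measured against each pod's CPU request, so the target Deployment needs requests set for this to work.

```python
import json


def cpu_hpa(name, deployment, min_replicas=2, max_replicas=10, cpu_percent=70):
    """Build an autoscaling/v2 HPA that targets average CPU utilization."""
    return {
        "apiVersion": "autoscaling/v2",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": name},
        "spec": {
            "scaleTargetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": deployment,
            },
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
            "metrics": [{
                "type": "Resource",
                "resource": {
                    "name": "cpu",
                    # Percentage of the pods' CPU *requests*, averaged.
                    "target": {
                        "type": "Utilization",
                        "averageUtilization": cpu_percent,
                    },
                },
            }],
        },
    }


# Hypothetical workload name, for illustration only.
hpa = cpu_hpa(name="portal-web", deployment="portal-web")
print(json.dumps(hpa, indent=2))
```

A minimum of two replicas rather than one is a common choice for user-facing services, since it keeps the service available while a single pod restarts.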
Can you use them together?
In practice, using KEDA already means using HPA: every ScaledObject is backed by an HPA that KEDA creates and manages. So the real question is how to avoid conflicts between the two.
The rule: never point a manually created HPA and a KEDA ScaledObject at the same Deployment. Both would try to set the replica count and fight each other. If one workload needs both a resource signal and an event signal, add a cpu or memory trigger to the ScaledObject instead of creating a second HPA. Keep one controller per workload:
- Use plain HPA directly for simple CPU / memory workloads.
- Use KEDA (which owns its HPA) for event-driven or scale-to-zero workloads.
A common real-world split is HPA for request-driven frontends and KEDA for the queue consumers and batch jobs behind them - different workloads, different controllers, no overlap. And remember that pod autoscaling only frees real capacity if node autoscaling reclaims the emptied nodes, which is why teams tune KEDA / HPA alongside a node autoscaler rather than in isolation.
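The one-controller-per-workload rule is easy to check mechanically. The sketch below is a hypothetical lint step, not part of KEDA or Kubernetes: it takes manifests already parsed into Python dicts and reports any scale target claimed by more than one autoscaler. It is meant for manifests in source control; run against a live cluster it would also see the HPAs that KEDA generates itself, which would need to be filtered out first.

```python
from collections import defaultdict

AUTOSCALER_KINDS = ("HorizontalPodAutoscaler", "ScaledObject")


def find_conflicts(manifests):
    """Return scale targets referenced by more than one autoscaler."""
    owners = defaultdict(list)
    for m in manifests:
        kind = m.get("kind")
        if kind not in AUTOSCALER_KINDS:
            continue
        ref = m["spec"]["scaleTargetRef"]
        namespace = m["metadata"].get("namespace", "default")
        # A ScaledObject may omit the target kind; Deployment is assumed.
        target_kind = ref.get("kind", "Deployment")
        key = (namespace, target_kind, ref["name"])
        owners[key].append(f'{kind}/{m["metadata"]["name"]}')
    return {key: names for key, names in owners.items() if len(names) > 1}


# Hypothetical manifests: two autoscalers fight over "orders-consumer".
manifests = [
    {"kind": "HorizontalPodAutoscaler",
     "metadata": {"name": "orders-cpu"},
     "spec": {"scaleTargetRef": {"kind": "Deployment", "name": "orders-consumer"}}},
    {"kind": "ScaledObject",
     "metadata": {"name": "orders-lag"},
     "spec": {"scaleTargetRef": {"name": "orders-consumer"}}},
    {"kind": "HorizontalPodAutoscaler",
     "metadata": {"name": "web-cpu"},
     "spec": {"scaleTargetRef": {"kind": "Deployment", "name": "web"}}},
]

conflicts = find_conflicts(manifests)
for target, names in conflicts.items():
    print(target, "is claimed by", names)
```

A check like this fits naturally into CI, where it can fail a pull request before two controllers ever reach the cluster.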
How NomadX Kubernetes Delivers
NomadX Kubernetes runs autoscaling and cost optimization as fixed-scope sprints:
- 5-day Autoscaling Readiness Assessment - reviews current pod and node autoscaling, identifies workloads that should be event-driven or scale-to-zero, and recommends KEDA or HPA per workload
- 2-3 week Autoscaler Implementation Sprint - deploys and tunes KEDA scalers and HPAs, wires event sources (Kafka, queues, Prometheus, cron), and validates scale-to-zero behavior with safe rollback
- Monthly Cost Optimization Retainer - ongoing autoscaler tuning, rightsizing, and spend reporting across pod and node layers
Book a free 30-minute discovery call to scope your Kubernetes autoscaling and cost engagement with a NomadX Kubernetes engineer.
Frequently Asked Questions
KEDA vs HPA: which should I use?
Use plain HPA if your workloads scale well on CPU, memory, or a custom metric you already expose, and you do not need scale-to-zero - it is built into Kubernetes with nothing to install. Use KEDA when you need event-driven autoscaling on external sources (Kafka lag, queue depth, SQS messages, Prometheus queries, cron schedules) or scale-to-zero for idle workloads. The key point: KEDA is not a replacement for HPA - it builds on top of HPA, creating and managing an HPA object for you and feeding it external metrics. For most event-driven or bursty workloads in 2026, KEDA is the better fit; for simple CPU / memory scaling, HPA alone is enough.
Does KEDA replace HPA?
No. KEDA extends HPA rather than replacing it. When you create a KEDA ScaledObject, KEDA generates and manages a standard HorizontalPodAutoscaler under the hood and acts as an external metrics adapter that feeds it values from your event source. HPA still does the actual scaling math between its minimum and maximum replicas. KEDA's added value is the scale-to-zero transition (0 to 1 and back) and the 60+ scalers that turn external signals into metrics HPA can consume. Think of KEDA as a superset: everything HPA does, plus event-driven sources and scale-to-zero.
What is scale-to-zero and can HPA do it?
Scale-to-zero means running zero pods when there is no work, then spinning the first pod up the moment work arrives. In a default configuration, plain HPA cannot scale a Deployment below one replica. KEDA adds true scale-to-zero: its controller watches the event source directly and activates the workload from zero to one when there are messages in a queue, lag on a topic, or any other trigger, then hands ongoing scaling back to the HPA it manages. Scale-to-zero is one of the main reasons teams adopt KEDA for queue consumers, batch processors, and infrequently used services.
What metrics can HPA scale on?
HPA scales on resource metrics (CPU and memory) out of the box via the metrics server, plus custom metrics and external metrics if you deploy an adapter (such as Prometheus Adapter) that implements the Kubernetes custom / external metrics API. So HPA can technically scale on almost anything, but you have to build and maintain the metrics-adapter plumbing yourself. KEDA bundles that plumbing for 60+ sources, which is why event-driven scaling is far less work with KEDA than wiring custom adapters into raw HPA.
What event sources does KEDA support?
KEDA ships 60+ scalers covering message queues (Kafka, RabbitMQ, AWS SQS, Azure Service Bus, NATS, Google Pub/Sub), databases (PostgreSQL, MySQL, MongoDB, Redis), metrics systems (Prometheus, Datadog, New Relic, Azure Monitor, CloudWatch), cloud storage and serverless triggers, cron schedules, and many more. Each scaler knows how to read a meaningful signal - queue length, consumer lag, query result, schedule - and turn it into a metric that drives autoscaling, including the activation from and to zero.
Can I use KEDA and HPA together?
You already are when you use KEDA - it creates and manages an HPA for each ScaledObject. What you must not do is point a manually created HPA and a KEDA ScaledObject at the same Deployment, because they will both try to set replica counts and conflict. The clean model is: use plain HPA directly for simple CPU / memory workloads, and use KEDA (which owns its HPA) for event-driven or scale-to-zero workloads. One controller per Deployment.