HPA Kubernetes: Scale Pods Automatically with the Horizontal Pod Autoscaler

Scaling applications effectively is one of the primary challenges in modern cloud-native environments. Kubernetes addresses this with the Horizontal Pod Autoscaler (HPA), a built-in controller that automatically adjusts the number of pods in a Deployment, ReplicationController, or StatefulSet based on observed resource usage.

Rather than over-provisioning or risking performance issues during peak traffic, HPA helps balance resource consumption against application responsiveness. Whether you are dealing with fluctuating workloads or planning for growth, understanding how HPA works in Kubernetes is essential.

In this guide, we explore how the Horizontal Pod Autoscaler works, which metrics it uses, how to set it up, and how to troubleshoot it.

How HPA Kubernetes Works: Core Concepts

The Horizontal Pod Autoscaler (HPA) is a Kubernetes controller that automatically scales the number of pod replicas in a Deployment, ReplicationController, StatefulSet, or other scalable workload.

Here’s how it works:

  • Metrics Collection: HPA fetches resource usage metrics (such as CPU or memory) from the Kubernetes Metrics Server, or other values from the custom and external metrics APIs.
  • Evaluation Loop: At regular intervals (every 15 seconds by default), HPA compares the observed metrics against the target values defined in your configuration.
  • Scaling Decision: Based on the ratio of observed to target metrics, HPA calculates the desired number of replicas, rounding up to a whole pod. For example:

    desiredReplicas = ceil(currentReplicas × (currentMetric / targetMetric))
  • Update Replica Count: If the ratio differs from 1.0 by more than a small tolerance (10% by default), the number of pods in the target workload is adjusted up or down accordingly.
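The scaling decision described above can be sketched in a few lines of Python. This is an illustrative model, not the controller's actual code: the function name `desired_replicas` is hypothetical, and the sketch assumes the documented behavior that the result is rounded up and that ratios within the tolerance (0.1 by default) leave the replica count unchanged.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric, tolerance=0.1):
    """Hypothetical helper modelling the HPA replica calculation."""
    ratio = current_metric / target_metric
    # Inside the tolerance band the replica count is left alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    # Otherwise scale proportionally and round up to a whole pod.
    return math.ceil(current_replicas * ratio)


print(desired_replicas(4, 100, 50))  # usage is double the target
print(desired_replicas(4, 52, 50))   # within tolerance, no change
print(desired_replicas(4, 25, 50))   # usage is half the target
```

With four replicas, the three calls print 8, 4, and 2 respectively.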

HPA operates at the workload level, changing only the replica count. It does not change pod specs such as CPU or memory requests and limits (that is the role of the Vertical Pod Autoscaler).

When Should You Use HPA Kubernetes?

HPA is a good fit when an application experiences fluctuating workloads or inconsistent traffic. Some common scenarios include:

  • Web applications with variable traffic: add pods during peak hours and remove them when traffic is low.
  • Data processing workloads: scale automatically based on CPU or memory usage while processing load is high.
  • Microservices architecture: scale individual services independently based on their own resource demands.
  • APIs or backend services: maintain consistent performance without constant manual intervention.

Use HPA when:

  • You need to optimize resource usage. 
  • Your app performance is directly related to CPU or memory load.
  • You need to scale horizontally based on demand. 

Avoid HPA when:

  • The application keeps state that cannot easily be replicated across pods.
  • You require vertical scaling (more resources per pod). In that case, consider VPA.

Important Metrics Used by HPA Kubernetes

The effectiveness of the Horizontal Pod Autoscaler depends on the metrics it uses to make decisions. HPA supports several types of metrics for determining when to scale pods up or down.

CPU Utilization

CPU utilization is one of the most widely used HPA metrics. It is the average CPU used by the pods' containers, expressed as a percentage of the CPU requests (not the limits) defined in the pod spec.

  • Default behavior: HPA uses this if no other metric is specified.
  • Example: If your HPA is set to maintain 50% CPU usage and usage rises to 100%, it will trigger a scale-up.
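A small Python sketch shows why the utilization figure depends on requests. It is a simplification under stated assumptions: the helper name `cpu_utilization_percent` is hypothetical, every pod is assumed to have the same CPU request, all pods are assumed ready, and the usage numbers are made up.

```python
def cpu_utilization_percent(usage_millicores, request_millicores):
    """Hypothetical helper: total usage as a percentage of total requests."""
    total_usage = sum(usage_millicores)
    total_requests = request_millicores * len(usage_millicores)
    return 100 * total_usage // total_requests


# Three pods, each requesting 100m of CPU.
print(cpu_utilization_percent([90, 110, 100], 100))  # 300m used of 300m requested

# Usage above the request (possible when the limit is higher) exceeds 100%.
print(cpu_utilization_percent([150, 150], 100))
```

The first call reports 100% and the second 150%, so with a 50% target both would trigger a scale-up.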

CPU utilization is a sensible starting point because it is simple, effective, and supported out of the box by the Kubernetes Metrics Server.

Memory Utilization

Memory utilization tracks the memory consumed by the pods compared to their requested memory resources.

  • Memory-based scaling requires proper memory requests to be defined in your pod specs.
  • HPA does not scale on memory unless you ask for it. Configure it explicitly by adding a Resource metric named memory to an autoscaling/v2 HPA manifest.

Custom Metrics

For more advanced scaling needs, HPA Kubernetes can use custom metrics, such as: 

  • Request rates (e.g., HTTP requests per second)
  • Queue length (e.g., messages in a RabbitMQ or Kafka topic)
  • Business metrics (e.g., active users or transactions)

Custom metrics are served through the Custom Metrics API or the External Metrics API, typically by an adapter such as the Prometheus Adapter or a Stackdriver adapter.
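For a per-pod target such as requests per second, the replica calculation reduces to dividing the total by the per-pod target. The Python sketch below assumes an average-value style target. The helper name and the traffic numbers are hypothetical, and the tolerance and min/max bounds are left out for brevity.

```python
import math


def replicas_for_average_value(total_metric, target_per_pod):
    """Hypothetical helper: pods needed so each handles at most target_per_pod."""
    return math.ceil(total_metric / target_per_pod)


# Made-up numbers: 950 requests/s across the service, 100 requests/s target per pod.
print(replicas_for_average_value(950, 100))
```

Here the service would need 10 pods to keep each one at or below 100 requests per second.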

Setting Up HPA Kubernetes – Step-by-Step Guide

Setting up HPA Kubernetes is pretty straightforward if you have already defined the resource requests in your deployments. Here is a step-by-step guide.

Step 1: Ensure the Metrics Server is Installed

HPA relies on the Kubernetes Metrics Server to gather CPU and memory metrics.

kubectl get deployment metrics-server -n kube-system

If it’s not installed, you can install it using:

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

Step 2: Define Resource Requests in Your Deployment

Utilization-based targets work only if your pods define resource requests for the metric in question: CPU, and memory if you scale on it.

resources:
  requests:
    cpu: "100m"
  limits:
    cpu: "200m"

Step 3: Create an HPA Resource

You can use kubectl autoscale or define it via YAML.

Option 1: CLI

kubectl autoscale deployment my-app --cpu-percent=50 --min=2 --max=10

Option 2: YAML

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50

Apply it:

kubectl apply -f hpa.yaml
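To get a feel for what this configuration asks for, the following Python sketch applies the replica formula with the same values (50% CPU target, minimum 2, maximum 10). It is a rough model only: the helper names are hypothetical, and the tolerance, pod readiness, and scaling policies are not modelled.

```python
import math


def clamp_replicas(recommended, min_replicas=2, max_replicas=10):
    """Hypothetical helper: keep a recommendation within the HPA's bounds."""
    return max(min_replicas, min(recommended, max_replicas))


def recommend(current_replicas, cpu_percent, target_percent=50):
    """Raw proportional recommendation, then clamped to minReplicas/maxReplicas."""
    raw = math.ceil(current_replicas * cpu_percent / target_percent)
    return clamp_replicas(raw)


print(recommend(2, 100))  # raw 4, within bounds
print(recommend(8, 90))   # raw 15, capped at maxReplicas
print(recommend(2, 10))   # raw 1, raised to minReplicas
```

The three calls print 4, 10, and 2: the HPA never goes above maxReplicas or below minReplicas, whatever the metric says.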

Step 4: Verify HPA Behavior

Monitor the HPA:

kubectl get hpa

kubectl describe hpa my-app-hpa

You can also simulate CPU load with a stress tool or use kubectl exec to run CPU-intensive commands.

Advanced HPA Configurations

If you require more control, the autoscaling/v2 API supports multiple metrics and configurable scaling behavior.

  1. Scaling Based on Multiple Metrics

You can combine CPU, memory, and custom metrics:

metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 60
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 70
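When several metrics are configured, HPA computes a desired replica count for each metric and uses the largest. The Python sketch below illustrates this with the 60% CPU and 70% memory targets from the example. The helper name and the observed values are hypothetical, and the tolerance is ignored.

```python
import math


def desired_for_metrics(current_replicas, observations):
    """Hypothetical helper: observations maps metric name -> (current, target)."""
    proposals = {
        name: math.ceil(current_replicas * current / target)
        for name, (current, target) in observations.items()
    }
    # The largest proposal wins, so no metric is left above its target.
    return max(proposals.values()), proposals


replicas, per_metric = desired_for_metrics(4, {"cpu": (90, 60), "memory": (84, 70)})
print(replicas, per_metric)
```

With four replicas, CPU at 90% proposes 6 pods and memory at 84% proposes 5, so the HPA would scale to 6.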

  2. Custom Scaling Policies

Fine-tune how aggressively HPA scales:

behavior:
  scaleUp:
    stabilizationWindowSeconds: 30
    policies:
    - type: Percent
      value: 100
      periodSeconds: 60
  scaleDown:
    stabilizationWindowSeconds: 60
    policies:
    - type: Pods
      value: 1
      periodSeconds: 60
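The effect of these policies can be approximated in Python: a Percent policy of 100 lets the replica count at most double within one 60-second period, and a Pods policy of 1 removes at most one pod per period. The helper names are hypothetical. The sketch looks at a single period and ignores the stabilization windows and the min/max replica bounds.

```python
import math


def scale_up_limit(current_replicas, percent=100):
    """Most replicas allowed after one period under a Percent scale-up policy."""
    return current_replicas + math.ceil(current_replicas * percent / 100)


def scale_down_limit(current_replicas, pods=1):
    """Fewest replicas allowed after one period under a Pods scale-down policy."""
    return max(current_replicas - pods, 0)


def apply_policies(current_replicas, desired_replicas):
    """Clamp the desired count to what the policies permit in one period."""
    low = scale_down_limit(current_replicas)
    high = scale_up_limit(current_replicas)
    return max(low, min(desired_replicas, high))


print(apply_policies(4, 12))  # scale-up capped at double the current count
print(apply_policies(4, 1))   # scale-down limited to one pod per period
```

Starting from four replicas, a desired count of 12 is reached in steps (8 first), and a desired count of 1 is approached one pod at a time (3 first).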

Troubleshooting HPA Kubernetes Common Issues

| Issue | Likely Cause | Solution |
| --- | --- | --- |
| HPA not scaling pods | Metrics Server not installed or not running | Install or restart the Metrics Server |
| HPA remains at minimum replicas | Low CPU/memory usage | Test with a load generator or verify actual resource usage |
| Error: unknown metric source type | Invalid metric type in HPA YAML | Use a valid type: Resource, ContainerResource, Pods, Object, or External |
| No CPU/memory data | Resource requests not defined in pod specs | Define resources.requests in your deployment configuration |
| Overreacting to short spikes | No scaling stabilization configured | Add behavior settings to control how fast HPA reacts |
| Custom metrics not working | Prometheus Adapter not set up | Install and configure the Prometheus Adapter |
| HPA reports <unknown> for its targets | API server or Metrics Server issues | Check logs, ensure API access, and verify RBAC permissions |

HPA vs VPA vs Cluster Autoscaler – Comparison Table 

| Feature | HPA (Horizontal Pod Autoscaler) | VPA (Vertical Pod Autoscaler) | Cluster Autoscaler |
| --- | --- | --- | --- |
| What it scales | Number of pod replicas | Resource requests (CPU/memory) per pod | Number of nodes in the cluster |
| When it triggers | Based on metrics like CPU, memory, or custom | When pods are under/over-provisioned | When pods are pending due to lack of resources |
| Best for | Apps with varying load or traffic | Stateful apps or those with changing resource needs | Automatically increasing/decreasing cluster size |
| Requires Metrics Server? | ✅ Yes | ✅ Yes | ❌ No |
| Can they be used together? | ✅ Yes | ✅ Yes, but not with HPA on the same CPU/memory metrics of one workload | ✅ Yes |
| Granularity | Per workload (e.g., Deployment) | Per pod level | Cluster/node level |
| Configuration complexity | Medium | Medium | High (with cloud provider-specific setup) |

Conclusion – HPA In Kubernetes 

The Horizontal Pod Autoscaler is an essential part of running Kubernetes workloads efficiently. Implemented correctly, HPA can improve application performance, make scaling automatic, and reduce the risk of downtime under load.

FAQs

What metrics does HPA use to scale pods?

HPA typically uses CPU and memory utilization but can also work with custom or external metrics through tools like Prometheus Adapter.

Can HPA and VPA be used together?

Yes, but with limitations. You should avoid using HPA and VPA on the same resource simultaneously if both are targeting CPU or memory to prevent conflicting behavior.

Why is my HPA not scaling?

Common reasons include a missing Metrics Server, missing or incorrect resource requests, low actual usage, or misconfigured HPA settings.

Marium Fahim
Hi! I am Marium, and I am a full-time content marketer fueled by an iced coffee. I mainly write about tech, and I absolutely love doing opinion-based pieces. Hit me up at [email protected].