    Kubernetes HPA


    What is Kubernetes HPA

    Kubernetes HPA (Horizontal Pod Autoscaler) is a Kubernetes feature that automatically scales the number of pods in a workload such as a Deployment based on observed metrics (for example, CPU utilization). It helps ensure that the application has enough resources to handle incoming traffic while minimizing resource waste.



    Why we need Kubernetes HPA

    1. Efficient resource utilization: By automatically scaling the number of pods, HPA adjusts the resources allocated to your application based on the current demand. It helps ensure that your application has enough resources to handle the workload efficiently, avoiding over-provisioning or under-provisioning.

    2. Automatic scaling: HPA monitors the metrics of your application, such as CPU utilization, memory usage, or custom metrics, and scales the number of pods accordingly. This automation eliminates the need for manual scaling, making it easier to handle fluctuations in traffic and workload.

    3. Improved application availability: HPA helps maintain high availability by dynamically scaling the pods. It ensures that your application can handle increased traffic without becoming overwhelmed, resulting in a better user experience.

    4. Cost optimization: With HPA, you can optimize resource utilization and avoid unnecessary costs. By scaling up when needed and scaling down during periods of low demand, you can make efficient use of your cloud resources.

    Kubernetes HPA simplifies the management of your application’s scalability, improves resource utilization, enhances availability, and reduces manual intervention, making it an essential tool for managing applications on Kubernetes.



    How to set HPA

    Let’s assume we have a deployment called “my-app” that we want to scale using HPA.

    Here’s an example YAML configuration for creating an HPA:

    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: my-app-hpa
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: my-app
      minReplicas: 2
      maxReplicas: 5
      metrics:
        - type: Resource
          resource:
            name: cpu
            target:
              type: Utilization
              averageUtilization: 50
    

    Let’s break down each field:

    • metadata.name: Specifies the name of the HPA resource, in this case, “my-app-hpa”.

    • spec.scaleTargetRef: Defines the target for scaling, which is the deployment “my-app”. It specifies the API version, kind (Deployment), and name of the deployment.

    • spec.minReplicas: Specifies the minimum number of replicas (pods) that should always be running, even during low demand. In this example, it’s set to 2.

    • spec.maxReplicas: Specifies the maximum number of replicas (pods) that can be scaled up to. In this example, it’s set to 5.

    • spec.metrics: Defines the metrics used for scaling. In this example, we’re using CPU utilization as the metric.

    • spec.metrics.type: Specifies the type of metric used for scaling. In this case, we’re using a resource metric.

    • spec.metrics.resource.name: Specifies the resource metric to use, which is “cpu” in this example.

    • spec.metrics.resource.target.type: Specifies how the scaling target is expressed. In this case, it’s “Utilization”, meaning we scale based on average utilization as a percentage rather than an absolute metric value.

    • spec.metrics.resource.target.averageUtilization: Specifies the target average CPU utilization, expressed as a percentage of the CPU requested by each pod. In this example, it’s set to 50, so the HPA tries to keep average usage at about half of the requested CPU.

    Once you apply this YAML configuration using kubectl apply -f hpa.yaml, Kubernetes will create the HPA resource, and it will automatically scale the number of pods in the “my-app” deployment based on the CPU utilization metric.
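
    If you only need this kind of CPU-based autoscaler, a roughly equivalent HPA can also be created imperatively with kubectl autoscale, without maintaining a separate YAML file. The command below is a sketch that assumes the same deployment name and targets as the example above:

    kubectl autoscale deployment my-app --cpu-percent=50 --min=2 --max=5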

    Remember to adjust the values according to your specific needs, such as the target CPU utilization or the minimum/maximum number of replicas.
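
    Note that a Utilization target is calculated against the CPU requests declared on the pods: the HPA controller averages the actual CPU usage of the pods, divides it by their requested CPU, and adjusts the replica count roughly as desiredReplicas = ceil[currentReplicas × (currentUtilization / targetUtilization)]. This means the deployment being scaled must define CPU requests, and a metrics source such as metrics-server must be running in the cluster, otherwise the HPA has no utilization data to act on. The snippet below is a minimal sketch of the relevant part of the “my-app” deployment; the container name, image, and request values are placeholders rather than values from the original example:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: my-app
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: my-app
      template:
        metadata:
          labels:
            app: my-app
        spec:
          containers:
            - name: my-app            # placeholder container name
              image: my-app:1.0       # placeholder image
              resources:
                requests:
                  cpu: 200m           # HPA utilization is measured against this request
                limits:
                  cpu: 500m           # optional upper bound, not used in the HPA calculation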



    How to inspect the HPA

    To inspect the HorizontalPodAutoscaler (HPA) and get information about its current status, you can use the kubectl command-line tool. Here are a few commands you can use to inspect the HPA:

    1. To get a list of all HPAs in your cluster, run:

      kubectl get hpa
      

      This command will show you the names and basic information of all the HPAs in your cluster.

    2. To get detailed information about a specific HPA, including the current number of replicas and the target metrics, run:

      kubectl describe hpa <hpa-name>
      

      Replace <hpa-name> with the name of the HPA you want to inspect. This command will provide detailed information about the HPA, including the scaling target, current and desired replicas, target metrics, recent scaling events, and other useful details.

    3. To see the current metrics and scaling behavior of the HPA, run:

      kubectl get hpa <hpa-name> -o yaml
      

      This command will display the HPA configuration and the current metrics in YAML format.

    These commands will help you inspect and gather information about your HPAs, allowing you to monitor their behavior and ensure they are functioning as expected.
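
    For example, to watch the HPA react while load on the application changes, you can keep the first command running with the --watch flag (the HPA name below matches the earlier example):

    kubectl get hpa my-app-hpa --watch

    The output includes NAME, REFERENCE, TARGETS (current/target metric values), MINPODS, MAXPODS, REPLICAS, and AGE columns, so you can see the replica count move between the configured minimum and maximum as CPU utilization rises above or falls below the 50% target. The Events section of kubectl describe hpa also records each scaling decision, which is useful when the HPA does not behave as expected.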

  • Original article: https://blog.csdn.net/mukouping82/article/details/133970123