Title Performance-based SLO recovery for containerized applications
Authors Pozdniakova, Olesia ; Mažeika, Dalius
DOI 10.15388/DAMSS.14.2023
ISBN 9786090709856
Is Part of DAMSS 2023: 14th conference on data analysis methods for software systems, Druskininkai, Lithuania, November 30 - December 2, 2023. Vilnius : Vilniaus universiteto leidykla, 2023. p. 72-73. ISBN 9786090709856
Keywords [eng] autoscaler ; SLO recovery ; microservices
Abstract [eng] Developing cloud-ready applications involves a focus on scalability and loose coupling of containerized microservices to guarantee seamless deployment on cloud or container orchestration platforms. The auto-scaler is the component responsible for dynamic resource provisioning and for the quality of service (QoS) provided by the cloud. Premature or excessive resource provisioning may increase costs, while delayed provisioning can lead to service degradation and service level agreement (SLA) violations. Hence, balancing the avoidance of SLA violations against effective cost management is the problem addressed by the majority of auto-scaling algorithms. Rule-based or predictive machine-learning-based algorithms are used to determine a proper amount of resources and to meet service level objective (SLO) requirements. Usually, service level indicator (SLI) values such as CPU utilization or transactions per second are used as input datasets. However, existing auto-scaling approaches minimize the risk of SLA violations but do not address the problem of a degraded SLA, i.e., they lack a mechanism for restoring the overall SLA after a violation, which leaves a gap in comprehensive SLA management. A threshold-based auto-scaling approach was proposed and investigated. The auto-scaler monitors the SLO value during a particular evaluation timeframe and restores the SLO in case of violations. The proposed approach ensures that the application operates at an elevated service performance level for a specific duration, aiming to achieve the SLO target during that period. An experimental study was conducted to evaluate the ability of the proposed auto-scaler to recover or maintain the SLO under five distinct load scenarios. The results were compared with a similar dynamic-threshold-based auto-scaling solution known as DM, which previously demonstrated good results in terms of resource provisioning and response time compared to other rule-based algorithms.
Several criteria were used to assess how effectively the proposed algorithm recovers and maintains the required SLO level, including the amount of provided resources and the number of over-provisioned and under-provisioned pods. The results revealed that the proposed auto-scaling solution demonstrates better adherence to the SLA across the majority of evaluated load scenarios, even when employing a similar amount of resources as other threshold-based algorithms.
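The idea of operating at an elevated performance level until the SLO target is reached can be illustrated with a small calculation. The sketch below is not the authors' algorithm: it assumes a request-based SLO (a target fraction of "good" requests per evaluation window) and a forecast of the requests still to arrive, and the names `WindowState`, `required_remaining_ratio`, and `recovery_mode` are hypothetical. It only shows how a violation early in the window raises the service level required for the rest of it.

```python
from dataclasses import dataclass


@dataclass
class WindowState:
    """Hypothetical snapshot of one SLO evaluation window."""
    good_requests: int       # requests that met the SLI threshold so far
    total_requests: int      # all requests observed so far in the window
    expected_remaining: int  # assumed forecast of requests still to arrive


def required_remaining_ratio(state: WindowState, slo_target: float) -> float:
    """Fraction of the remaining requests that must be good so that the
    whole window ends at slo_target."""
    if state.expected_remaining <= 0:
        raise ValueError("expected_remaining must be positive")
    total_at_end = state.total_requests + state.expected_remaining
    good_needed = slo_target * total_at_end - state.good_requests
    return max(0.0, good_needed / state.expected_remaining)


def recovery_mode(state: WindowState, slo_target: float) -> str:
    """Classify the window: 'normal', 'elevated' (the service must run above
    the target level to recover), or 'unrecoverable' within this window."""
    ratio = required_remaining_ratio(state, slo_target)
    if ratio > 1.0:
        return "unrecoverable"
    if ratio > slo_target:
        return "elevated"
    return "normal"


if __name__ == "__main__":
    # 94% good so far against a 95% target, half of the window remaining.
    state = WindowState(good_requests=940, total_requests=1000,
                        expected_remaining=1000)
    print(recovery_mode(state, slo_target=0.95))
```

In this sketch an auto-scaler would react to the "elevated" state by provisioning for a stricter performance level than the nominal target; how that level maps to scaling thresholds or pod counts is not specified in the abstract and is left out here.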
Published Vilnius : Vilniaus universiteto leidykla, 2023
Type Conference paper
Language English
Publication date 2023