Energy aware load balancing and application scaling for the cloud ecosystem

2168-7161 (c) 2015 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TCC.2015.2396059, IEEE Transactions on Cloud Computing

Energy-aware Load Balancing and Application Scaling for the Cloud Ecosystem

Ashkan Paya and Dan C. Marinescu
Computer Science Division, EECS Department
University of Central Florida, Orlando, FL 32816, USA
Email: ashkan,

Abstract - In this paper we introduce an energy-aware operation model used for load balancing and application scaling on a cloud. The basic philosophy of our approach is to define an energy-optimal operation regime and attempt to maximize the number of servers operating in this regime. Idle and lightly loaded servers are switched to one of the sleep states to save energy. The load balancing and scaling algorithms also exploit some of the most desirable features of the server consolidation mechanisms discussed in the literature.

Index terms - load balancing, application scaling, idle servers, server consolidation, energy proportional systems.

1 Motivation and Related Work

In the last few years, packaging computing cycles and storage and offering them as a metered service became a reality. Large farms of computing and storage platforms have been assembled, and a fair number of Cloud Service Providers (CSPs) offer computing services based on three cloud delivery models: SaaS (Software as a Service), PaaS (Platform as a Service), and IaaS (Infrastructure as a Service).

Warehouse-scale computers (WSCs) are the building blocks of a cloud infrastructure. A hierarchy of networks connects 50,000 to 100,000 servers in a WSC. The servers are housed in racks; typically, the 48 servers in a rack are connected by a 48-port Gigabit Ethernet switch.
The switch has two to eight uplinks which go to the higher-level switches in the network hierarchy [4, 17].

Cloud elasticity, the ability to use as many resources as needed at any given time, and low cost, since a user is charged only for the resources actually consumed, are solid incentives for many organizations to migrate their computational activities to a public cloud. The number of CSPs, the spectrum of services offered by the CSPs, and the number of cloud users have increased dramatically during the last few years. For example, in 2007 EC2 (Elastic Compute Cloud) was the first service provided by AWS (Amazon Web Services); five years later, in 2012, AWS was used by businesses in 200 countries. Amazon's S3 (Simple Storage Service) has surpassed two trillion objects and routinely handles more than 1.1 million requests per second at peak. Elastic MapReduce has launched 5.5 million clusters since May 2010, when the service started [35].

The rapid expansion of cloud computing has a significant impact on energy consumption in the US and the world. The costs for energy and for cooling large-scale data centers are significant and are expected to increase in the future. In 2006, the 6,000 data centers in the U.S. reportedly consumed 61 × 10^9 kWh of energy, 1.5% of all electricity consumption in the country, at a cost of $4.5 billion [34]. The energy consumption of data centers and of the network infrastructure is predicted to reach 10,300 TWh/year (1 TWh = 10^9 kWh) in 2030, based on 2010 efficiency levels [28]. These increases are expected in spite of extraordinary reductions in the energy requirements of computing activities.

Idle and under-utilized servers contribute significantly to wasted energy; see Section 2. A 2010 survey [8] reports that idle servers are responsible for 11 million tons of unnecessary CO2 emissions each year and that the total yearly cost of idle servers is $19 billion.
Recently, Gartner Research [30] reported that the average server utilization in large data centers is 18%, while the utilization of x86 servers is even lower, at 12%. These results confirm earlier estimates that the average server utilization is in the 10-30% range [3].

The concept of "load balancing" dates back to the time when the first distributed computing systems were implemented. It means exactly what the name implies: evenly distribute the workload across a set of servers to maximize throughput, minimize response time, and increase the system's resilience to faults by avoiding overload. An important strategy for energy reduction is to concentrate the load on a subset of servers and, whenever possible, switch the rest of them to a state with low energy consumption. This observation implies that the traditional concept of load balancing in a large-scale system could be reformulated as follows: distribute the workload evenly to the smallest set of servers operating at optimal or near-optimal energy levels, while observing the Service Level Agreement (SLA) between the CSP and a cloud user. An optimal energy level is one at which the performance per watt of power is maximized.

Scaling is the process of allocating additional resources to a cloud application in response to a request consistent with the SLA. We distinguish two scaling modes, horizontal and vertical. Horizontal scaling is the most common mode of scaling on a cloud; it is done by increasing the number of Virtual Machines (VMs) when the load of an application increases and reducing this number when the load decreases.
Load balancing is critical for this mode of operation. Vertical scaling keeps the number of VMs of an application constant, but increases the amount of resources allocated to each one of them. This can be done either by migrating the VMs to more powerful servers or by keeping the VMs on the same servers but increasing their share of the server capacity. The first alternative involves additional overhead: the VM is stopped, a snapshot is taken, the file is migrated to a more powerful server, and the VM is restarted at the new site.

The alternative to the wasteful resource management policy in which servers are always on, regardless of their load, is to develop energy-aware load balancing and scaling policies. Such policies combine dynamic power management with load balancing; they attempt to identify servers operating outside their optimal energy regime and decide if and when they should be switched to a sleep state, or what other actions should be taken to optimize energy consumption. The vast literature on energy-aware resource management concepts and ideas discussed in this paper includes [1, 3, 5, 6, 7, 10, 11, 22, 25, 28, 32, 33, 34].

Some of the questions posed by energy-aware load balancing and application scaling are: (a) Under what conditions should a server be switched to a sleep state? (b) What sleep state should the server be switched to? (c) How much energy is necessary to switch a server to a sleep state and then back to an active state? (d) How much time does it take to switch a server to a running state from a sleep state?
(e) How much energy is necessary to migrate a VM running on a server to another one? (f) How much energy is necessary to start the VM on the target server? (g) How should the target to which the VM migrates be chosen? (h) How much time does it take to migrate a VM?

The answers to some of these questions depend on the server's hardware and software, including the virtual machine monitor and the operating systems, and change as the technology evolves and energy awareness becomes increasingly important. In this paper we are concerned with high-level policies which, to some extent, are independent of the specific attributes of the server's hardware and, due to space limitations, we only discuss (a), (b), and (g). We assume that the workload is predictable, has no spikes, and that the demand of an application for additional computing power during an evaluation cycle is limited. We also assume a clustered organization, typical of existing cloud infrastructure [4, 17].

There are three primary contributions of this paper: (1) a new model of cloud servers based on different operating regimes with various degrees of "energy efficiency" (processing power versus energy consumption); (2) a novel algorithm that performs load balancing and application scaling to maximize the number of servers operating in the energy-optimal regime; and (3) an analysis and comparison of techniques for load balancing and application scaling using three differently sized clusters and two different average load profiles.

Models for energy-aware resource management and application placement policies, and the mechanisms to enforce these policies, such as the ones introduced in this paper, can be evaluated theoretically [1], experimentally [10, 11, 13, 22], through simulation [5, 7, 27], based on published data [2, 8, 19, 20], or through a combination of these techniques.
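The reformulated load-balancing objective, packing the workload onto the smallest set of servers kept in an energy-optimal regime so the rest can sleep, can be sketched with a simple greedy first-fit heuristic. This is an illustrative sketch, not the algorithm of this paper; the function name, the 80% optimal-regime ceiling, and the task sizes are all assumptions.

```python
def pack_load(tasks, server_capacity, optimal_high=0.8):
    """Greedy first-fit packing: each task goes to the first active
    server whose load would stay at or below the assumed upper bound
    of the energy-optimal regime; otherwise one more server is
    activated. Servers never placed in `servers` can stay asleep."""
    servers = []  # load currently placed on each active server
    for demand in tasks:
        for i, load in enumerate(servers):
            if load + demand <= optimal_high * server_capacity:
                servers[i] += demand
                break
        else:
            servers.append(demand)  # wake/activate one more server
    return servers

# Six tasks fit on two servers instead of six always-on ones.
loads = pack_load([20, 35, 10, 25, 30, 15], server_capacity=100)
print(len(loads), loads)
```

First-fit is a classic bin-packing heuristic; it illustrates the goal (minimize the active-server count subject to a per-server regime bound) without modeling SLAs or migration costs.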
Analytical models can be used to derive high-level insight into the behavior of the system in a very short time, but the biggest challenge is determining the values of the parameters; while the results from an analytical model can give a good approximation of the relative trends to expect, there may be significant errors in the absolute predictions. Experimental data is collected on small-scale systems; such experiments provide useful performance data for individual system components, but no insight into the interaction between the system and applications, or into the scalability of the policies. Trace-based workload analyses such as the ones in [10] and [33] are very useful, though they provide information only for a particular experimental setup, hardware configuration, and set of applications. Typically, trace-based simulations need more time to produce results. Traces can also be very large, and it is hard to generate representative traces from one class of machines that will be valid for all classes of simulated machines. To evaluate the energy-aware load balancing and application scaling policies and mechanisms introduced in this paper, we chose simulation using data published in the literature [4].

The operating efficiency of a system and server consolidation are discussed in Sections 2 and 3, respectively. The model described in Section 4 introduces the operating regimes of a processor and the conditions under which to switch a server to a sleep state. Load balancing and scaling algorithms suitable for a clustered cloud organization based on the model are presented in Section 5; these algorithms aim to optimize energy efficiency and to balance the load. Simulation experiments and conclusions are covered in Sections 6 and 7.

2 Energy Efficiency of a System

The energy efficiency of a system is captured by the ratio "performance per watt of power." During the last two decades the performance of computing systems has increased much faster than their energy efficiency [3].
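A minimal linear power model illustrates the "performance per watt" measure: a server that draws half its peak power when idle (a typical figure cited in this section) is far less efficient at low utilization than at high utilization. The 200 W peak value and the function names are illustrative assumptions, not figures from this paper's model.

```python
def power_draw(load, peak_power=200.0, idle_fraction=0.5):
    """Linear power model. idle_fraction=0 gives an ideal
    energy-proportional server; idle_fraction=0.5 reflects the
    observation that real servers idle at over half their peak power.
    `load` is utilization in [0, 1]; returns power in watts."""
    idle = idle_fraction * peak_power
    return idle + (peak_power - idle) * load

def perf_per_watt(load, peak_power=200.0, idle_fraction=0.5):
    """Efficiency proxy: delivered work (proportional to utilization)
    per watt of power drawn."""
    return load / power_draw(load, peak_power, idle_fraction)

# A server at 30% load (typical data-center utilization) versus the
# same server at 90% load:
print(round(perf_per_watt(0.3), 5), round(perf_per_watt(0.9), 5))
```

Under these assumptions the server at 90% utilization delivers roughly twice the work per watt of the one at 30%, which is why concentrating load on fewer servers saves energy.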
Energy proportional systems. In an ideal world, the energy consumed by an idle system should be near zero and grow linearly with the system load. In real life, even systems whose energy requirements scale linearly use, when idle, more than half the energy they use at full load. Data collected over a long period of time shows that the typical operating regime for data center servers is far from an optimal energy consumption regime. An energy-proportional system consumes no energy when idle, very little energy under a light load, and gradually more energy as the load increases. An ideal energy-proportional system is always operating at 100% efficiency [3].

Energy efficiency of a data center; the dynamic range of subsystems. The energy efficiency of a data center is measured by the power usage effectiveness (PUE), the ratio of the total energy used to power a data center to the energy used to power computational servers, storage servers, and other IT equipment. The PUE has improved from around 1.93 in 2003 to 1.63 in 2005; recently, Google reported a PUE ratio as low
as 1.15 [2]. The improvement in PUE forces us to concentrate on the energy efficiency of computational resources [7].

The dynamic range is the difference between the upper and lower limits of the energy consumption of a system as a function of the load placed on the system. A large dynamic range means that a system is able to operate at a lower fraction of its peak energy when its load is low. Different subsystems of a computing system behave differently in terms of energy efficiency; while many processors have reasonably good energy-proportional profiles, significant improvements in the memory and disk subsystems are necessary. The largest consumer of energy in a server is the processor, followed by memory and storage systems. The estimated distribution of peak power among the hardware subsystems in one of Google's datacenters is: CPU 33%, DRAM 30%, disks 10%, network 5%, and others 22% [4].

Power consumption can vary from 45 W to 200 W per multi-core CPU. The power consumption of servers has increased over time; during the period 2001-2005 the estimated average power use increased from 193 to 225 W for volume servers, from 457 to 675 W for mid-range servers, and from 5,832 to 8,163 W for high-end ones [19]. Volume servers are priced below $25 K, mid-range servers between $25 K and $499 K, and high-end servers carry a price tag above $500 K. Newer processors include power-saving technologies.
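The PUE definition above reduces to a one-line ratio. The sketch below merely restates the definition; the 11.5/10.0 MWh split is an assumption chosen to reproduce the 1.15 value cited for Google.

```python
def pue(total_facility_energy, it_equipment_energy):
    """Power usage effectiveness: total energy used to power the data
    center divided by the energy used by IT equipment (computational
    servers, storage servers, networking). 1.0 is the ideal; the
    overhead is cooling, power conversion, and distribution losses."""
    if it_equipment_energy <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_energy / it_equipment_energy

# Illustrative: a facility drawing 11.5 MWh in total for 10 MWh of
# IT load has the 1.15 PUE reported by Google.
print(pue(11.5, 10.0))
```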
The processors used in servers consume less than one-third of their peak power at very low load and have a dynamic range of more than 70% of peak power; the processors used in mobile and/or embedded applications are better in this respect. According to [3], the dynamic power range of other components of a system is much narrower: less than 50% for DRAM, 25% for disk drives, and 15% for networking switches. Large servers often use 32-64 dual in-line memory modules (DIMMs); the power consumption of one DIMM is in the 5 to 21 W range. A server with 2-4 hard disk drives (HDDs) consumes 24-48 W.

A strategy to reduce the energy consumed by disk drives is to concentrate the workload on a small number of disks and allow the others to operate in a low-power mode. One of the techniques to accomplish this is based on replication [34]; another is based on data migration [15].

A number of proposals have emerged for energy-proportional networks, in which the energy consumed is proportional to the communication load; an example of such a network is InfiniBand. Elastic Tree, a network-wide power manager which dynamically adjusts the network links and switches to satisfy changing datacenter traffic loads, is described in [16].

Sleep states. The effectiveness of sleep states in optimizing energy consumption is analyzed in [11]. A comprehensive document [18] describes the Advanced Configuration and Power Interface (ACPI) specifications, which allow an operating system (OS) to effectively manage the power consumption of the hardware. Several types of sleep states are defined: C-states (C1-C6) for the CPU, D-states (D0-D3) for modems, hard drives, and CD-ROMs, and S-states (S1-S4) for the basic input-output system (BIOS). The C-states allow a computer to save energy when the CPU is idle. In a sleep state, the idle units of a CPU have their clock signal and power cut.
The higher the state number, the deeper the CPU sleep mode, the larger the energy saved, and the longer the time for the CPU to return to state C0, which corresponds to a fully operational CPU. The clock signal and the power of different CPU units are cut in states C1 to C3, while in states C4 to C6 the CPU voltage is reduced. In state C1 the main internal CPU clock is stopped by software, but the bus interface and the advanced programmable interrupt controller (APIC) keep running; in state C3 all internal clocks are stopped, and in state C4 the CPU voltage is reduced.

Resource management policies for large-scale data centers. These policies can be loosely grouped into five classes: (a) admission control; (b) capacity allocation; (c) load balancing; (d) energy optimization; and (e) quality of service (QoS) guarantees. The explicit goal of an admission control policy is to prevent the system from accepting workload in violation of high-level system policies; a system should not accept additional workload that would prevent it from completing work already in progress or contracted. Limiting the workload requires some knowledge of the global state of the system; in a dynamic system such knowledge, when available, is at best obsolete. Capacity allocation means allocating resources to individual instances; an instance is an activation of a service. Some of the mechanisms for capacity allocation are based on either static or dynamic thresholds [23].

Economy of scale affects the energy efficiency of data processing. For example, Google reports that the annual energy consumption for an email service varies significantly depending on the business size and can be 15 times larger for a small business than for a large one [13]. Cloud computing can be more energy efficient than on-premise computing for many organizations [2, 26].
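The C-state trade-off described above (deeper states save more energy but take longer to return to C0) can be sketched as a small selection rule. The power fractions and wake-up latencies below are illustrative assumptions, not figures from the ACPI specification.

```python
# (state, fraction of active power still drawn, wake-up latency in µs)
# These numbers are invented for illustration only.
C_STATES = [
    ("C0", 1.00, 0),    # fully operational
    ("C1", 0.70, 2),    # main internal clock stopped by software
    ("C3", 0.40, 50),   # all internal clocks stopped
    ("C6", 0.05, 200),  # voltage reduced, deepest sleep
]

def deepest_affordable_state(idle_time_us, latency_budget_us):
    """Pick the deepest state whose wake-up latency fits both the
    expected idle interval and the allowed response-time budget.
    The list is ordered shallow to deep, so the last fit wins."""
    best = C_STATES[0]
    for state in C_STATES:
        _, _, wake = state
        if wake <= idle_time_us and wake <= latency_budget_us:
            best = state
    return best[0]

print(deepest_affordable_state(idle_time_us=100, latency_budget_us=60))
```

This captures why question (b) in Section 1 (which sleep state to choose) depends on how long the server is expected to stay idle and how quickly it must respond.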
3 Server Consolidation

The term server consolidation is used to describe: (1) switching idle and lightly loaded systems [33] to a sleep state; (2) workload migration to prevent the overloading of systems [7]; or (3) any optimization of cloud performance and energy efficiency achieved by redistributing the workload [25].

Server consolidation policies. Several policies have been proposed for deciding when to switch a server to a sleep state. The reactive policy [31] responds to the current load; it switches servers to a sleep state when the load decreases and switches them back to the running state when the load increases. Generally, this policy leads to SLA violations and can work only for slowly varying, predictable loads. To reduce SLA violations one can envision a reactive with extra capacity policy, in which one maintains a safety margin and keeps running a fraction of the total number of servers, e.g., 20%, above those needed for the current load. The AutoScale policy [10] is a very conservative reactive policy that is reluctant to switch servers to a sleep state, so as to avoid the power consumption and the delay of switching them back to the running state. This can be
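The reactive and reactive-with-extra-capacity policies can be sketched as simple server-count rules. The function names, per-server capacity, and load figures are illustrative assumptions; the 20% margin is the example given in the text.

```python
import math

def servers_needed(load, per_server_capacity):
    """Minimum number of running servers for the current load."""
    return math.ceil(load / per_server_capacity)

def reactive(load, per_server_capacity):
    """Purely reactive policy: run exactly as many servers as the
    current load requires; the rest go to a sleep state."""
    return servers_needed(load, per_server_capacity)

def reactive_with_extra_capacity(load, per_server_capacity, margin=0.20):
    """Reactive policy with a safety margin: keep a fraction `margin`
    more servers running than strictly needed, so moderate load
    increases can be absorbed without waking servers from sleep."""
    base = servers_needed(load, per_server_capacity)
    return base + math.ceil(base * margin)

print(reactive(950, 100), reactive_with_extra_capacity(950, 100))
```

The margin trades energy for responsiveness: the extra servers burn power while idle, but they mask the wake-up latency that makes the purely reactive policy violate SLAs under rising load.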