Cross-Layer Adaptation for QualityAware and Energy-Efficient Next Generation

Cross-Layer Adaptation for QualityAware and Energy-Efficient Next Generation

Cross-Layer Adaptation for QualityAware and Energy-Efficient Next Generation Mobile Multimedia Devices Klara Nahrstedt GRACE [email protected] Department of Computer Science University of Illinois at Urbana-Champaign Joined work with Wanghong Yuan, and PIs of NSF ITR Sarita Adve, Doug Jones, Robin Kravets Motivation Mobile devices Running multimedia apps (e.g., MP3 players, DVD players) Running on general purpose systems Demanding quality requirements System resources: high performance OS: predictable resource management Limited battery energy System resources: low power consumption OS: energy as first-class resource

New Opportunities Adaptability of software and hardware Multimedia applications Multiple Quality levels: quality vs. resource usage Statistical performance requirements (e.g., meeting 96% of guarantees) Soft guarantees from OS Hardware components Multiple operating states: performance vs. power (e.g., mobile processors Intels XSacle, AMDs Athlon, Transmetas Crusoe) Reducing CPU voltage can reduce CPU energy consumption substantially Goal for Next Generation Mobile Devices Take advantage of new opportunities adaptability Address new challenges quality provision and energy saving

1. Design a cross-layer adaptation framework Each layer adapts to changes All layers adapt cooperatively for system-wide optimal configuration 2. OS support for such coordinated cross-layer adaptation Outline 1. Motivation 2. Existing Approaches 3. GRACE Cross-Layer Adaptation Framework 4. Evaluation 5. Conclusion Layered Adaptation

Application Network Protocols Operating System Architecture and Hardware Each adaptive layer must make several decisions affecting all resources - time, energy, bandwidth other layers Layered Adaptation Application Which video compression technique? How much compression? Network Protocols Operating System Architecture and Hardware Each adaptive layer must make several decisions affecting all resources - time, energy, bandwidth

other layers Layered Adaptation Application Which video compression technique? How much compression? Network Protocols How much error correction for wireless channel? Which congestion control protocols for wired network? Operating System Architecture and Hardware Each adaptive layer must make several decisions affecting all resources - time, energy, bandwidth other layers Layered Adaptation Application Which video compression technique? How much compression? Network Protocols How much error correction for wireless channel? Which congestion control protocols for wired network?

Operating System How to allocate resources to multiple applications? How to allocate among components of the same application? Architecture and Hardware Each adaptive layer must make several decisions affecting all resources - time, energy, bandwidth other layers Layered Adaptation Application Which video compression technique? How much compression? Network Protocols How much error correction for wireless channel? Which congestion control protocols for wired network? Operating System How to allocate resources to multiple applications? How to allocate among components of the same application? Architecture and Hardware Which processor, cache, memory configuration? Which frequency, voltage? Each adaptive layer must make several decisions affecting

all resources - time, energy, bandwidth other layers State of the Art Quality or energy aware adaptation Hardware layer Dynamic power management (e.g., Simunic01,Benini00) Dynamic voltage scaling - DVS (e.g., Ishihaa98, Pering00, Pillai01) Common mechanism to save CPU energy; Important characteristics of CMOS-based processors - lower frequency enables lower voltage and yields a quadratic energy reduction) Effectiveness of DVS dependent on predictions of application CPU demands

OS layer Soft-real-time scheduling (e.g., Bavier00, Banachowski02) Task-based Speed and Voltage Scheduling (e.g., Lorch01, Lorch03) Application layer Trade off quality for resource usage (e.g., Flinn01, Chandra02) Network layer Power Management (e.g., Krashinsky02) Energy-aware routing and transmission (e.g., Kravets98,Gomez03) What Is Missing

Most current work adapts a single layer Some jointly adapt two layers, BUT one layer drives adaptation (e.g., application controls video coding and network error correction) Applications Applications Applications Applications OS/Network OS/Network OS/Network OS/Network

Hardware Hardware Hardware Hardware (a) hardware adaptation (b) OS adaptation (c) app. adaptation (d) OS/app. adaptation For our target mobile systems, we need Applications OS/Network Hardware cross-layer adaptation Cross-layer != Simple Combination Combination is not straightforward Adaptations may be in conflict E.g., CPU slows down, while apps increase demand

Various adaptation objectives E.g., maximizing quality vs. minimizing energy Different adaptation costs and impact E.g., OS adaptation for small variations, application adaptation for large variations Consider integration and coordination ! Outline 1. Motivation 2. Existing approaches 3. GRACE Cross-Layer Adaptation Framework 4. Evaluation 5. Conclusion GRACE Global Resource Adaptation via CoopEration

Current approaches GRACE Operating System Operating System Architecture, Hardware Coordinator Application Network Protocols Architecture, Hardware Application Network Protocols System divided into layers

Adapt 1 or 2 layers Global community All adapt cooperatively via coordinator S. Adve et al. The Illinois GRACE Project: Global Resource Adaptation through CoopEration, Workshop on Self-Healing Adaptive and self-MANaged Systems, 2002 Global and Internal Adaptation Global Triggers: rare, coarse-grain Application joins or leaves Internal Triggers: frequent, fine-grain Small usage change Large usage change Large availability change Adaptation: Via coordinator Determine a system-wide

optimal configuration Adaptation: Each layer adapts locally Respect the global configuration Cost: expensive Cost: cheap adapt App Adaptor QoS level residual energy Battery Monitor CPU allocation

CPU frequency CPU Speed Adaptor Soft-Real-Time Scheduling OS Coordinator schedule Adjusted CPU demand adapt CPU W. Yuan, K. Nahrstedt, et al Design and Evaluation of a Cross-Layer Adaptation Framework for Mobile Multimedia Systems, SPIE Multimedia Computing and Networking (MMCN), 2003 Hardware

QoS Level Options Application Application Application Application GRACE Architecture (First Version) OS Role in GRACE GRACE-OS: Coordinator Coordinate in cooperative manner hardware, OS, and application layers Soft real-time scheduling framework Support multimedia application quality requirements Adapt internal scheduling Monitor and react to variations in CPU usage Integrates dynamic voltage scaling (DVS) into soft-real-time (SRT) scheduling

Uses stochastic scheduling and allocation based on statistical performance requirements and probability distribution of cycle demands of individual application tasks Estimates demand distribution of tasks via online profiling and estimations Finds speed schedule for each task based on probabilistic distribution of the tasks cycle demands (this speed schedule enables each job of a task to start slowly and accelerate as the job progresses) Decides how fast to execute applications in addition to when and how long to execute them Outline 1. Motivation 2. Existing approaches 3. GRACE Cross-Layer Adaptation Framework GRACE Architecture Global coordination

Soft real-time scheduling (Internal Adaptation) 4. Evaluation 5. Conclusion System Models Adaptive periodic multimedia application Multiple QoS levels, {q1, , qm} Utility u(q) CPU demand: period P(q) and cycle C(q) Statistical performance requirement: probability to meet deadlines Adaptive processor Multiple speeds, {f1, , fmax} Frequency f Power p(f) Battery Desired lifetime Tlife and residual energy Eres Coordination Problem

Mediate three layers to find QoS level for each application CPU allocation for each application CPU frequency to maximize overall system utility under CPU and energy constraints Constrained Optimization (accumulated system utility) (CPU constraint: EDF schedulability) (energy constraint: last for desired lifetime) Heuristic Approaches Utility-greedy Maximize current utility Energy-greedy Guarantee desired lifetime NP-hard problems can be mapped to multi-choice Knapsack problem; use dynamic programming

with complexity O(mlogm), with m Quality Levels Coordination Protocol (5.2) adapt QoS parameters App Adaptor (1) utility demand (5.1) coordinated QoS level Coordinator (2) residual energy (3) optimization Battery Monitor application

(6.1) coord. allocation SRT CPU Scheduler (4.1) coordinated speed CPU Speed Adaptor (4.2) adapt speed CPU Outline 1. Motivation 2. Existing approaches 3. GRACE Cross-Layer Adaptation Framework GRACE Architecture

Global coordination Soft real-time scheduling (Internal Adaptation) 4. Evaluation 5. Conclusion Soft-Real-Time Scheduling Multimedia tasks (processes or threads) GRACE-OS performance requirements (via system calls) monitoring scheduling

Stochastic SRT Scheduler Profiler demand distribution time allocation CPU Speed Adaptor (Stochastic DVS) speed scaling CPU SRT Scheduling Framework Profiler monitors cycle usage of individual tasks derives probability distribution of their cycle demands from cycle usage Stochastic SRT scheduler allocates cycles to task

schedules them to deliver performance guarantees, performs SRT scheduling based on the statistical performance requirements and demand distribution Speed adaptor adjusts CPU speed dynamically to save energy W. Yuan, K. Nahrstedt, Energy-Efficient Soft Real-Time CPU Scheduling for Mobile Multimedia Systems, ACM Symposium on Operating Systems Principles (SOSP), 2003 Demand Estimation (1) 1. Kernel-based online profiling Measure cycles between switch-in (in) and switch-out (out) Accurate with small overhead in out in finish/out c1

c2 c3 c4 c2 c1 cycles c4 c3 cycles for the job = (c2 c1) + (c4 c3) Measured cycles are kept in cycle counter of the process control block of each task. Demand Estimation (2) 2. Histogram for probability distribution Group profiled cycles Use profiling window of n jobs with cycles [Cmin, Cmax] Partition profiling window into r equal-sized groups (Cmin = b0 < b1 <

Count occurrence in each group distribution function P[X<=x] 1 cumulative probability P[X<=bi] = Cmin=b0 b1 b2 bi cycle demand br-1 br=Cmax Demand Estimation (3) 3. Determine amount of cycles C allocated to each task

Statistical performance requirement of a task Meet percent of deadlines so that Search tasks histogram to find smallest bm with P[X bm] cumulative probability statistical performance requirement Cmin=b0 b1 b2 cycle demand C br-1 br=Cmax

Demand Estimation (b) Cycle demand distribution for gray dithering cumulative probability # of cycles (millions) (a) Profiled decoding cycles for gray dithering 8 6 4 2 0 0 500 1000 1500 2000 # of frames

1 0.8 0.6 0.4 first 100 jobs first 200 jobs all jobs 0.2 0 2.1 2.6 3.1 3.6 4.1 4.6 5.1 5.6 6.1 job cycles (millions) Probability distribution is more stable, but changes slowly and smoothly Stochastic SRT Scheduling (Speed-Aware EDF Scheduling) Variable speed constant bandwidth server(VS-CBS) Maximum budget C -- Period P

Budget c -- Deadline d Hierarchical scheduling 1. SRT scheduler selects earliest-deadline VS-CBS 2. VS-CBS executes the application Decrease budget c by # of consumed cycles If c=0, then c = C and d = d + P Stochastic SRT scheduling determines which task to execute, when and how long Stochastic DVS Scheduling Dynamic speed scaling policy: GRACE-OS starts a job at a lower speed and accelerate as it progresses Speed Schedule for each task

Each point (x,y) in schedule specifies that a job accelerates to the speed y when it uses x cycles Speed list is sorted in ascending order of cycle number x We calculate speed schedule based on tasks demand distribution (similar to techniques proposed by Lorch/Smith and Gruian) Stochastic DVS (Example) speed (MHz) speed (MHz) cycle: speed: 0 100 MHz

1 x 106 120 MHz 2 x 106 180 MHz (a) Speed schedule with four scaling points 120 100 job1's cycles=1.6x10 6 10 180 120 100 time (ms) 15 job2's cycles = 2.5 x 10 6 10 speed (MHz)

3 x 106 300 MHz 18.3 time (ms) 21.1 300 180 120 100 job3's cycles = 3.9 x 10 6 10 18.3 (b) Speed scaling for three jobs using speed schedule in (a) 23.8 26.8

time (ms) Outline 1. Motivation 2. Existing approaches 3. GRACE Cross-Layer Adaptation Framework 4. Evaluation 5. Conclusion GRACE-OS Implementation Hardware: HP N5470 laptop AMD Athlon processor, six speeds p freq x volt2 Implementation: Software Adaptive applications w/ application adaptor coordinator application

middleware system call Standard Linux scheduler SRT -DVS modules SRT scheduling hook PowerNow module GRACE-OS message queue Linux kernel Experiments Application: MPEG video player Video: 4Dice (352 x 240 pixels, 1679 frames) QoS parameters (dithering method, frame rate) Dithering: gray, ordered, and color2

Frame rate: 20, 25, and 33 fps Nine QoS levels Utility function Utility for SRT mode Utility for QoS level q Global Coordination Overhead Overhead of global coordination # of cycles (thousands) 250 utility-greedy energy-greedy 200 150 100 50 0

1 2 3 4 5 6 7 # of applications 8 9 10 SRT Scheduling Overhead

Comparison w/ Other Policies CPU speed App QoS internal simplified adaptation None No-adapt Single-layer CPU-only App-only highest adapt highest highest no

highest adapt single app no no adapt single app adapt all apps adapt all apps no no no adapt all apps adapt all apps yes yes Uncoordinated multi-layers App-CPU App-OS App-OS-CPU

Cross-layer Utility-greedy adapt highest adapt adapt Energy-greedy adapt Methodology Start a player every 12 seconds Each exits after finishing 4Dice video Normalized energy measurement Normalized energy = time * relative power If 300 MHz for 1 second, energy is 1 * 22% = 0.22 Battery Desired lifetime 900 seconds Initial battery energy: 300, 600, 900, and 1200 Compare Lifetime

achieved lifetime no-adapt time (seconds) 900 app-only app-OS 600 CPU-only app-CPU 300 app-OS-CPU utility-greedy 0 300 600

900 initial energy 1200 energy-greedy Compare Utility accumulated system utility no-adapt accumulated utility 2000 app-only app-OS CPU-only 1000 app-CPU

app-OS-CPU utility-greedy 0 300 600 900 initial energy 1200 energy-greedy Process Group Management in Cross-Layer Adaptation miss ratio (%) 5 4 3

2 1.2 1.4 1 0 GRACE -1 GRACE -grp normalized energy Deadline miss ratio of the hyper-video Normalized CPU energy Consumption (hyper-video 4 mpgplay) 180 130.2 120

80.8 70.7 60 0 Static CPU GRACE GRACE -1 -grp W. Yuan, K. Nahrstedt, Process Group Management in Cross-Layer Adaptation, SPIE Multimedia Computing and Networking (MMCN), 2004 Outline 1. Motivation 2. Existing approaches 3. GRACE Cross-Layer Adaptation Framework 4. Evaluation 5. Conclusion

Lessons Learned So Far GRACE 1. Coordinate cross-layer adaptation for energy saving and Quality provision 2. Consider stochastic real-time scheduling for soft-real time applications Statistical performance requirement and probability distribution of demand Integration of SRT and DVS 3. Build real systems and test-beds for experimental validation (GRACE-OS is first implementation of OS resource manager for cross-layer adaptation in Linux) Acknowledgements NSF ITR Funding CCR 02-055638

NSF CISE EIA 99-72884 GRACE Group Sarita Adve, Douglas Jones, Robin Kravets, Wanghong Yuan, Albert F. Harris, Christopher J. Hughes, Daniel Grobe Sachs,Ruchira Sasanka, Jayanth Srinivasan Contact: [email protected]

Recently Viewed Presentations

  • Investigations into Mechanisms of Word Finding John Hart, Jr ...

    Investigations into Mechanisms of Word Finding John Hart, Jr ...

    Frontotemporal Dementia American Academy of Neurology John Hart, Jr., M.D. Frontotemporal Dementia Definition: clinicopathologic condition consisting of deterioration of personality and cognition assoc. with prominent frontal and temporal lobe atrophy Accounts for up to 3-20% of dementias Third behind AD...
  • Thermal Soaring Forecasting

    Thermal Soaring Forecasting

    Depends on shape of lapse rate curve Strong winds result in turbulence that tends to break up thermals Buoyancy/shear is a ratio of the potential thermal strength to the wind shear that causes turbulent break up Ratio below ~ 5...
  • Machine Guarding in The Timber Industry - Ttia

    Machine Guarding in The Timber Industry - Ttia

    Guide to best practice machine guarding in the timber industry - 2016
  • Literacy Toolkit Asking Asking the the Right Right

    Literacy Toolkit Asking Asking the the Right Right

    Asking the right questions at the right time. The objective of this PPT is to explore what types of questions we ask, the developmental nature of verbal reasoning skills and how to adapt our language to achieve improved learning outcomes...
  • Chapter 10

    Chapter 10

    created a standard set of terms and graphical notations for documenting business processes known as Business Process Modeling Notation (www.bpmn.org). As-Is Business Order Process: Existing Ordering Process
  • COALA prezint BULETIN EUROPEAN INFORMATIV BUCURETI 04 /

    COALA prezint BULETIN EUROPEAN INFORMATIV BUCURETI 04 /

    La proba de matematică a Concursului de Evaluare în Educaţie, etapa a III-a, următorii elevi au obţinut peste 90 de puncte: -clasa a II-a: Cristian Iorgoveanu, Teodora Nacea, Luca Vlăsceanu, Andreea Popescu, Ştefan Cotruţă, Maria Rotaru, Ana Popescu, Vlad Calomfirescu,...
  • FOR FACILITATORS ONLY: THE STORY OF DAY 3

    FOR FACILITATORS ONLY: THE STORY OF DAY 3

    Teacher reads aloud to students, modeling pronunciation, pacing, and expression. ... Purpose: understanding there is a method to the madness of development. Speaker Notes: When you build sets of texts around a topic, it's important to be strategic. Using complexity...
  • Hvad siger Bibelen om tro?

    Hvad siger Bibelen om tro?

    C.S. Lewis. Enhver må træffe et valg. Enten var og er dette menneske Guds Søn, eller også er han en galning eller noget endnu værre. Man kan spærre ham inde som sindssyg, man kan spytte på ham og slå ham...