Computer Architecture: A Quantitative Approach, Sixth Edition
Chapter 6: Warehouse-Scale Computers to Exploit Request-Level and Data-Level Parallelism
Copyright 2019, Elsevier Inc. All rights reserved.

Introduction
  Warehouse-scale computer (WSC)
    Provides Internet services
      Search, social networking, online maps, video sharing, online shopping, email, cloud computing, etc.
    Differences with HPC clusters:
      Clusters have higher-performance processors and network
      Clusters emphasize thread-level parallelism, WSCs emphasize request-level parallelism
    Differences with datacenters:
      Datacenters consolidate different machines and software into one location
      Datacenters emphasize virtual machines and hardware heterogeneity in order to serve varied customers

Introduction
  Important design factors for WSC:
    Cost-performance
      Small savings add up
    Energy efficiency
      Affects power distribution and cooling
      Work per joule
    Dependability via redundancy
    Network I/O
    Interactive and batch processing workloads

Introduction
  Ample computational parallelism is not a concern
    Most jobs are totally independent
    Request-level parallelism
  Operational costs count
    Power consumption is a primary, not secondary, constraint when designing the system
  Scale and its opportunities and problems
    Can afford to build customized systems since WSCs require volume purchases
  Location counts
    Real estate, power cost; Internet, end-user, and workforce availability
  Computing efficiently at low utilization
  Scale and the opportunities/problems associated with scale
    Unique challenges: custom hardware, failures
    Unique opportunities: bulk discounts

Efficiency and Cost of WSC
  Location of WSC
    Proximity to Internet backbones, electricity cost, property tax rates, low risk from earthquakes, floods, and hurricanes
  Power distribution

Programming Models and Workloads for WSCs
  Batch processing framework: MapReduce
    Map: applies a programmer-supplied function to each logical input record
      Runs on thousands of computers
      Provides new set of key-value pairs as intermediate values
    Reduce: collapses values using another programmer-supplied function
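The map and reduce phases described above can be illustrated with a minimal single-process sketch in Python. This is not the distributed framework itself: the names `word_count_map`, `word_count_reduce`, and `run_map_reduce` are hypothetical, and the in-memory grouping step stands in for the shuffle that a real MapReduce runtime would perform across thousands of machines.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


def word_count_map(doc_name: str, contents: str) -> Iterator[Tuple[str, int]]:
    """Map: emit an intermediate (word, 1) pair for every word in one input record."""
    for word in contents.split():
        yield word, 1


def word_count_reduce(word: str, counts: Iterable[int]) -> int:
    """Reduce: collapse all intermediate values for one key into a single total."""
    return sum(counts)


def run_map_reduce(
    records: Dict[str, str],
    map_fn: Callable[[str, str], Iterator[Tuple[str, int]]],
    reduce_fn: Callable[[str, Iterable[int]], int],
) -> Dict[str, int]:
    """Apply map_fn to every record, group by key, then apply reduce_fn per key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in records.items():
        for out_key, out_value in map_fn(key, value):
            grouped[out_key].append(out_value)  # stand-in for the shuffle step
    return {key: reduce_fn(key, values) for key, values in grouped.items()}


# Hypothetical input: two tiny "documents" keyed by name.
counts = run_map_reduce(
    {"doc1": "to be or not to be", "doc2": "to do"},
    word_count_map,
    word_count_reduce,
)
print(counts)
```

Because each call to the map function depends only on its own record, the calls could in principle run on separate machines, which is the request-level independence the framework relies on.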

Example:
  map(String key, String value):
    // key: document name
    // value: document contents
    for each word w in value:
      EmitIntermediate(w, "1");   // Produce list of all words

  reduce(String key, Iterator values):
    // key: a word
    // values: a list of counts
    int result = 0;
    for each v in values:
      result += ParseInt(v);      // get integer from key-value pair
    Emit(AsString(result));

Availability:

  Use replicas of data across different servers
  Use relaxed consistency:
    No need for all replicas to always agree
  File systems: GFS and Colossus
  Databases: Dynamo and BigTable

MapReduce runtime environment schedules map and reduce tasks to WSC nodes
  Workload demands often vary considerably
  Scheduler assigns tasks based on completion of prior tasks
  Tail latency/execution time variability: a single slow task can hold up a large MapReduce job
  Runtime libraries replicate tasks near end of job
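The effect of replicating slow tasks near the end of a job can be sketched with a small model. Everything here is an assumption for illustration: the function names, the idea of a single backup launch time, and the task durations are hypothetical, and a real runtime decides when and where to launch backups with far more information.

```python
from typing import List


def job_time(task_times: List[int]) -> int:
    """A job finishes only when its slowest task finishes."""
    return max(task_times)


def job_time_with_backups(
    task_times: List[int], backup_launch: int, backup_duration: int
) -> int:
    """Model backup tasks: any task still running at `backup_launch` gets a
    duplicate that takes `backup_duration`; the task completes when either
    copy finishes first."""
    finish_times = []
    for original_finish in task_times:
        if original_finish > backup_launch:
            backup_finish = backup_launch + backup_duration
            finish_times.append(min(original_finish, backup_finish))
        else:
            finish_times.append(original_finish)
    return max(finish_times)


# Hypothetical task durations in seconds; the last task is a straggler.
tasks = [10, 11, 9, 60]
print(job_time(tasks))
print(job_time_with_backups(tasks, backup_launch=12, backup_duration=10))
```

In this toy model the straggler no longer determines the job's completion time, at the cost of running one task twice.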

Computer Architecture of WSC
  WSCs often use a hierarchy of networks for interconnection
  Each 19-inch rack holds 48 1U servers connected to a rack switch
  Rack switches are uplinked to a switch higher in the hierarchy
    Uplink has 6-24X lower bandwidth
    Goal is to maximize locality of communication relative to the rack
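The 6-24X figure can be read as an oversubscription ratio: bandwidth inside the rack divided by bandwidth leaving it. The sketch below assumes all ports run at the same speed, and the uplink port counts of 8 and 2 are assumptions chosen because they reproduce the quoted range for a 48-server rack; the function name is hypothetical.

```python
def rack_oversubscription(server_ports: int, uplink_ports: int) -> float:
    """Ratio of in-rack bandwidth to uplink bandwidth, assuming equal port speeds."""
    if uplink_ports <= 0:
        raise ValueError("a rack switch needs at least one uplink port")
    return server_ports / uplink_ports


# 48 servers per rack (from the text); 8 and 2 uplinks are assumed values.
print(rack_oversubscription(48, 8))
print(rack_oversubscription(48, 2))
```

The higher the ratio, the more a workload benefits from keeping communication between servers in the same rack.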

Computer Architecture of WSC: Storage
  Storage options:
    Use disks inside the servers, or
    Network attached storage through Infiniband
  WSCs generally rely on local disks
  Google File System (GFS) uses local disks and maintains at least three replicas

Computer Architecture of WSC: Array Switch
  Switch that connects an array of racks
    Array switch should have 10X the bisection bandwidth of a rack switch
    Cost of an n-port switch grows as n^2
    Often utilize content addressable memory chips and FPGAs

Computer Architecture of WSC: WSC Memory Hierarchy
  Servers can access DRAM and disks on other servers using a NUMA-style interface

Physical Infrastructure and Costs of WSC
  Cooling
    Air conditioning used to cool server room
    64 F to 71 F
      Keep temperature higher (closer to 71 F)
    Cooling towers can also be used
      Minimum temperature is the wet bulb temperature

  Cooling system also uses water (evaporation and spills)
    E.g. 70,000 to 200,000 gallons per day for an 8 MW facility
  Power cost breakdown:
    Chillers: 30-50% of the power used by the IT equipment
    Air conditioning: 10-20% of the IT power, mostly due to fans
  How many servers can a WSC support?
    Each server:
      Nameplate power rating gives maximum power consumption
      To get actual, measure power under actual workloads
    Oversubscribe cumulative server power by 40%, but monitor power closely

  Determining the maximum server capacity
    Nameplate power rating: maximum power that a server can draw
    Better approach: measure under various workloads
    Oversubscribe by 40%
  Typical power usage by component:
    Processors: 42%
    DRAM: 12%
    Disks: 14%
    Networking: 5%
    Cooling: 15%
    Power overhead: 8%
    Miscellaneous: 4%
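A rough sketch of the capacity arithmetic follows. It reads "oversubscribe by 40%" as deploying 40% more servers than a nameplate-based count would allow, which is one interpretation of the slide; the function names, the 1 MW budget, and the 500 W nameplate rating are hypothetical values chosen for illustration. Only the component percentages come from the text.

```python
from typing import Dict

# Component shares from the slide, in percent of total power.
COMPONENT_SHARE_PCT: Dict[str, int] = {
    "processors": 42,
    "dram": 12,
    "disks": 14,
    "networking": 5,
    "cooling": 15,
    "power overhead": 8,
    "miscellaneous": 4,
}


def servers_supported(
    it_power_budget_w: int, nameplate_w: int, oversubscribe_pct: int = 40
) -> int:
    """Servers that fit in the budget if the nameplate-based count is
    oversubscribed by `oversubscribe_pct` percent (integer arithmetic)."""
    baseline = it_power_budget_w // nameplate_w
    return baseline * (100 + oversubscribe_pct) // 100


def component_watts(total_w: int) -> Dict[str, int]:
    """Split a total power figure across components using the slide's shares."""
    return {name: total_w * pct // 100 for name, pct in COMPONENT_SHARE_PCT.items()}


# Hypothetical facility: 1 MW of IT power, 500 W nameplate rating per server.
print(servers_supported(1_000_000, 500, oversubscribe_pct=0))
print(servers_supported(1_000_000, 500))
print(component_watts(1_000_000)["processors"])
```

Oversubscription only works if actual draw stays well below nameplate, which is why the slide pairs it with close power monitoring.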

Physical Infrastructure and Costs of WSC: Measuring Efficiency of a WSC
  Power Utilization Effectiveness (PUE) = Total facility power / IT equipment power
    Median PUE in a 2006 study was 1.69
  Performance
    Latency is an important metric because it is seen by users
    Bing study: users will use search less as response time increases
    Service Level Objectives (SLOs)/Service Level Agreements (SLAs)
      E.g. 99% of requests be below 100 ms

Physical Infrastructure and Costs of WSC: Cost of a WSC
  Capital expenditures (CAPEX)
    Cost to build a WSC
    $9 to $13 per watt
  Operational expenditures (OPEX)
    Cost to operate a WSC

Cloud Computing
  Amazon Web Services
    Virtual Machines: Linux/Xen
    Low cost
    Open source software
    Initially no guarantee of service
    No contract

Cloud Computing Growth
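The PUE ratio and the percentile-style SLO from the Measuring Efficiency slide can be sketched as two small functions. The function names and all sample numbers are hypothetical; the 1690 kW and 1000 kW figures are chosen only so that the result matches the 1.69 median quoted in the text.

```python
from typing import Sequence


def pue(total_facility_power_kw: float, it_equipment_power_kw: float) -> float:
    """Power Utilization Effectiveness: total facility power over IT equipment power."""
    if it_equipment_power_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    return total_facility_power_kw / it_equipment_power_kw


def meets_slo(
    latencies_ms: Sequence[float], percent: int = 99, threshold_ms: float = 100
) -> bool:
    """True if at least `percent` percent of requests finished below `threshold_ms`."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    fast = sum(1 for latency in latencies_ms if latency < threshold_ms)
    return fast * 100 >= percent * len(latencies_ms)


# Hypothetical facility drawing 1690 kW in total for 1000 kW of IT load.
print(pue(1690, 1000))

# Hypothetical samples: 99 fast requests and one slow one.
print(meets_slo([10] * 99 + [500]))
print(meets_slo([10] * 98 + [500, 500]))
```

A PUE of 1.0 would mean every watt reaches the IT equipment, so values closer to 1 indicate less overhead. The SLO check looks at the tail of the latency distribution, so a good average cannot hide a slow 1% of requests.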

Fallacies and Pitfalls
  Cloud computing providers are losing money
    AWS has a margin of 25%, Amazon retail 3%
  Focusing on average performance instead of 99th percentile performance
  Using too wimpy a processor when trying to improve WSC cost-performance
  Inconsistent measure of PUE by different companies
  Capital costs of the WSC facility are higher than for the servers that it houses

Fallacies and Pitfalls
  Trying to save power with inactive low power modes versus active low power modes
  Given improvements in DRAM dependability and the fault tolerance of WSC systems software, there is no need to spend extra for ECC memory in a WSC
  Coping effectively with microsecond (e.g. Flash and 100 GbE) delays as opposed to nanosecond or millisecond delays
  Turning off hardware during periods of low activity improves the cost-performance of a WSC
