Data Center Specification
Gregory Provan
CS6404
Global Properties
• Global data structures
• Global clock
• System inputs: job specification
• Sub-systems
• Input/output
• Computations
Plant Model
[Diagram: Data Center and CRAC]
Key Global data structures
R11 . . . R1k
R21 . . . R2k
 .         .
Rm1 . . . Rmk
• CPU matrix C (m x k)
• CPU health H (m x k)
• H: C → h, h ∈ {0,1}
• CPU thermal output O (m x k)
• O: C → T, with T ∈ [0,…,Tmax] (kW)
• Thermal map T (m x k)
• T: C → temperature (degrees C)
• CRAC parameters
• Thermal outputs: Temp x mass-flowrate
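As an illustration only, the m x k structures above could be held as lists of rows. The class name `DataCenterState`, the field names, the reading of h = 1 as "working", and the initial values are all assumptions made for this sketch, not part of the specification.

```python
from dataclasses import dataclass, field
from typing import List

Grid = List[List[float]]


def make_grid(m: int, k: int, fill: float) -> Grid:
    """Build an m x k grid whose rows are independent lists."""
    return [[fill] * k for _ in range(m)]


@dataclass
class DataCenterState:
    m: int  # number of rows in the CPU matrix C
    k: int  # number of CPUs per row
    health: Grid = field(init=False)          # H, entries in {0, 1}
    thermal_output: Grid = field(init=False)  # O, in kW
    thermal_map: Grid = field(init=False)     # T, in degrees C

    def __post_init__(self) -> None:
        # Assumed starting point: all CPUs working (1), idle, at 20 degrees C.
        self.health = make_grid(self.m, self.k, 1)
        self.thermal_output = make_grid(self.m, self.k, 0.0)
        self.thermal_map = make_grid(self.m, self.k, 20.0)


state = DataCenterState(m=2, k=3)
state.health[0][1] = 0  # mark one CPU as failed
```

Building each row separately matters here: `[[fill] * k] * m` would alias the same row m times, so a change to one CPU would appear in every row.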
System Clock
• The two basic metrics
• Job-epoch j
• CRAC-epoch C = kj, for integer k
• The CRAC is held constant during this epoch
• System has temporal window j
• Used to compute future schedules, temperatures, etc.
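The relation between the two epochs can be sketched as integer arithmetic on a job-epoch counter. The function names and the convention that tick 0 starts the first CRAC-epoch are assumptions for illustration.

```python
def is_crac_boundary(tick: int, k: int) -> bool:
    """True when job-epoch `tick` starts a new CRAC-epoch of k job-epochs."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    return tick % k == 0


def crac_epoch_index(tick: int, k: int) -> int:
    """Index of the CRAC-epoch that contains job-epoch `tick`."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    return tick // k


# With k = 3, the CRAC may only change at job-epochs 0, 3, 6, ...
boundaries = [t for t in range(7) if is_crac_boundary(t, 3)]
```

Between two boundaries the CRAC stays fixed, so only job allocation changes from one job-epoch to the next.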
Jobs and Allocation
• Jobs arrive at rate r; we assume they arrive in batches, at job-epochs j apart
• J = {J1,…,Jn}
• Each job is denoted by a pair of integers: a duration d and a deadline
• The deadline is j longer than the duration
• Assignment A: J x C
• Jobs can only be assigned to CPUs that are not expected to fail during the job duration
• The CPU processing-rate takes values in {low, high}
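One way to picture the assignment constraint is the sketch below. It assumes a hypothetical `expected_lifetime` table giving, for each CPU, the number of job-epochs it is expected to survive. It uses an earliest-deadline-first order with at most one job per CPU, which is a choice made for this example rather than anything the slides prescribe.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Cpu = Tuple[int, int]  # (row, column) position in the CPU matrix


@dataclass(frozen=True)
class Job:
    name: str
    duration: int  # d, in job-epochs
    deadline: int  # in job-epochs


def assign(jobs: List[Job], expected_lifetime: Dict[Cpu, int]) -> Dict[str, Optional[Cpu]]:
    """Give each job a CPU expected to outlive the job, or None if there is none."""
    free = dict(expected_lifetime)  # copy so the caller's table is untouched
    assignment: Dict[str, Optional[Cpu]] = {}
    for job in sorted(jobs, key=lambda j: j.deadline):
        choice = next((cpu for cpu, life in free.items() if life >= job.duration), None)
        assignment[job.name] = choice
        if choice is not None:
            del free[choice]  # one job per CPU in this sketch
    return assignment


jobs = [Job("a", duration=5, deadline=7), Job("b", duration=2, deadline=3)]
result = assign(jobs, {(0, 0): 3, (0, 1): 10})
```

Here job "b" has the earlier deadline and takes CPU (0, 0), and job "a" needs a CPU that survives 5 job-epochs, so it goes to (0, 1).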
Job Revenue
• Revenue function
• Amount based on meeting deadlines, with penalties for missing them
• R: J x A x (euros)
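A revenue function of this kind might look like the following sketch. The linear late penalty, the parameter names, and the numbers are assumptions. The slides only state that revenue depends on meeting deadlines and that misses are penalised.

```python
def job_revenue(finish: int, deadline: int, payment_eur: float,
                penalty_eur_per_epoch: float) -> float:
    """Full payment if the job finishes by its deadline, minus a
    per-epoch penalty for every job-epoch it runs late (assumed linear)."""
    lateness = max(0, finish - deadline)
    return payment_eur - penalty_eur_per_epoch * lateness


on_time = job_revenue(finish=5, deadline=7, payment_eur=100.0, penalty_eur_per_epoch=10.0)
late = job_revenue(finish=9, deadline=7, payment_eur=100.0, penalty_eur_per_epoch=10.0)
```

Nothing stops the result from going negative for a very late job. Whether that is desirable is a modelling decision the slides leave open.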
Health Computation
• Requirements
• Health decreases with high temperature and use
• Health increases with low temperature and no use
• If h crosses the threshold hmax, the CPU fails
• CPUs also fail randomly, at a given rate per day
• H(t+1) is computed from H(t), T(t), A(t) and the random failure rate
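The requirements above can be sketched as a per-CPU update rule. Everything numeric here is an assumption: the continuous health score in [0, 1], the 30 degree "hot" cut-off, the step size, and the convention that a CPU has failed once its score drops to the hypothetical threshold `h_fail`. The slides do not fix the functional form.

```python
import random
from typing import Optional


def update_health(health: float, temp_c: float, in_use: bool, *,
                  hot_c: float = 30.0, step: float = 0.05, h_fail: float = 0.0,
                  fail_prob: float = 0.0,
                  rng: Optional[random.Random] = None) -> float:
    """One step of an assumed health model for a single CPU."""
    if health <= h_fail:
        return health  # a failed CPU stays failed
    if rng is not None and rng.random() < fail_prob:
        return h_fail  # random failure during this step
    if temp_c > hot_c and in_use:
        health -= step  # hot and busy: health decreases
    elif temp_c <= hot_c and not in_use:
        health += step  # cool and idle: health increases
    return min(1.0, max(h_fail, health))


worn = update_health(0.5, temp_c=40.0, in_use=True)
rested = update_health(0.5, temp_c=20.0, in_use=False)
```

The mixed cases (hot but idle, cool but busy) leave health unchanged in this sketch, since the requirements only describe the two extremes.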
Job Scheduling
• Inputs
• Jobs J(t)
• Weight function: w(t) = (w_revenue, w_thermal)
• Governs the schedule returned
• Prior data: A(t-1), H(t-1), T(t-1)
• Output
• Schedules A(t),… over the system's temporal window
• Computation
• A(t) = f(J(t), A(t-1), H(t-1), T(t-1), w(t))
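To make the role of the weights concrete, here is a minimal greedy sketch of one possible f. It is not the scheduler from the slides. It assumes jobs carry a revenue value, ignores A(t-1) for brevity, produces a single A(t) rather than a whole window, and places a job only when its weighted revenue exceeds the weighted temperature of the coolest healthy CPU.

```python
from typing import Dict, Tuple

Cpu = Tuple[int, int]


def schedule(job_revenue: Dict[str, float], prev_health: Dict[Cpu, int],
             prev_temp: Dict[Cpu, float],
             weights: Tuple[float, float]) -> Dict[str, Cpu]:
    """Greedy A(t): most valuable jobs first, coolest healthy CPUs first."""
    w_revenue, w_thermal = weights
    # Healthy CPUs (h == 1), ordered from coolest to warmest.
    candidates = sorted((c for c, h in prev_health.items() if h == 1),
                        key=lambda c: prev_temp[c])
    assignment: Dict[str, Cpu] = {}
    for name, revenue in sorted(job_revenue.items(), key=lambda kv: -kv[1]):
        if not candidates:
            break
        cpu = candidates[0]
        # Place the job only if the weighted revenue beats the thermal cost.
        if w_revenue * revenue - w_thermal * prev_temp[cpu] > 0:
            assignment[name] = candidates.pop(0)
    return assignment


health = {(0, 0): 1, (0, 1): 0, (0, 2): 1}
temp = {(0, 0): 30.0, (0, 1): 20.0, (0, 2): 25.0}
cautious = schedule({"a": 100.0, "b": 10.0}, health, temp, (1.0, 1.0))
eager = schedule({"a": 100.0, "b": 10.0}, health, temp, (1.0, 0.1))
```

With equal weights the low-value job "b" is left unscheduled. Lowering the thermal weight lets it run on the warmer CPU, which is one simple way for the weights to govern the schedule returned.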
Power Computation
• CPU power
• Each CPU consumes power based on its processing rate
• PC: processing rate → power (kW)
• Total power: A x PC
• CRAC power
• CRAC cooling rate: based on Tr x T* x flow-rate
• Return-air temp Tr
• Setpoint temp T*
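The CPU side of this computation can be sketched as a lookup and a sum. The kW figures for the two rates are placeholders chosen for illustration, and the CRAC side is left out because the slides do not give its formula.

```python
from typing import Dict, Iterable

# Assumed power draw per processing rate, in kW (illustrative values only).
RATE_POWER_KW: Dict[str, float] = {"low": 0.10, "high": 0.25}


def total_cpu_power(assigned_rates: Iterable[str]) -> float:
    """Sum P_C over the processing rates of all CPUs that have a job."""
    return sum(RATE_POWER_KW[rate] for rate in assigned_rates)


# Three busy CPUs: one running at the low rate, two at the high rate.
power_kw = total_cpu_power(["low", "high", "high"])
```

CPUs with no job are simply absent from the input, which corresponds to assuming that idle CPUs draw no power.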
Thermal Computation
• Uses history of CPU usage, CRAC output and data-center airflow to compute the thermal map
• For our purposes, the thermal map is T
• T: C → temperature (degrees C)
• Must specify
• Thermal maps over CRAC time-windows
• Critical-temp maps, where Tcrit: C (-Tmax) (degrees C)
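One plausible reading of the critical-temperature map is the margin between each CPU's temperature and a limit Tmax. That reading, and the function names below, are assumptions made for this sketch.

```python
from typing import List, Tuple

Grid = List[List[float]]


def critical_margin(thermal_map: Grid, t_max: float) -> Grid:
    """Per-CPU margin T - Tmax in degrees C; positive means over the limit."""
    return [[t - t_max for t in row] for row in thermal_map]


def over_limit(thermal_map: Grid, t_max: float) -> List[Tuple[int, int]]:
    """Positions (row, column) whose temperature exceeds Tmax."""
    margins = critical_margin(thermal_map, t_max)
    return [(i, j) for i, row in enumerate(margins)
            for j, margin in enumerate(row) if margin > 0]


cells = over_limit([[20.0, 36.0], [35.0, 40.0]], t_max=35.0)
```

A CPU sitting exactly at the limit is not flagged here. Using `>=` instead would make the check stricter.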
Hot-Spot Computation
• Assume a simple air-circulation model for the rack
[Figure: rack with cold air inflow, temperature gradient (with equal loads), warm air outflow]
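A very rough sketch of such a gradient, assuming air passes the CPUs of a rack one after another and each equally loaded CPU warms it by the same amount. The linear form, the parameter names, and the numbers are assumptions, not taken from the slides.

```python
from typing import List


def rack_air_temps(cold_inlet_c: float, rise_per_cpu_c: float, n_cpus: int) -> List[float]:
    """Air temperature reaching each CPU, ordered from cold inflow to warm outflow."""
    # CPU i sees air that has already been warmed by the i CPUs before it.
    return [cold_inlet_c + i * rise_per_cpu_c for i in range(n_cpus)]


gradient = rack_air_temps(cold_inlet_c=18.0, rise_per_cpu_c=2.0, n_cpus=4)
```

Under equal loads the warmest air always reaches the CPU nearest the outflow, so that end of the rack is the natural first place to look for a hot spot.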
Hot-Spot Computation
• Impact of radiation on adjacent racks (optional)
[Figure: adjacent racks with cold air inflow, temperature gradient (with equal loads), warm air outflow]
Key Publication
• Banerjee et al., Integrating cooling awareness with thermal aware workload placement, Sustainable Computing 1, 2011.