A hierarchy independent approach (ongoing work) Predicting cache contents by abstract interpretation Michael Monerau Courant Institute, NYU Ecole Normale Suprieure de Paris, France Chris Hankin Imperial College London Overview Problem description Abstract Interpretation Cache prediction Related & Future Conclusion work 2
Cache mechanics Quick introduction: How do cache work? 3 Cache mechanics: Cache hit Offchip On-chip Virtual Memory CPU s Cache hierarchy L3 L2 L1 L3 L2
Read / Write 4 Cache mechanics: Cache miss Offchip On-chip Virtual Memory Sends data Cache hierarchy L3 Ask for data CPU s
L2 L1 L3 L2 Read / Write 5 Cache mechanisms: LowLevel An example: 4-Way Associative Cache Level Cache 0. 1. 2. 3. 4. 5. 6. 7. 8. 9.
10. 11. Cache line . . . . . . . . . . . A cache line contains a copy of a Virtual Memory Block VM Address % 3 == 0 VM Address % 3 == 1 VM Address % 3 == 2 6
Cache hierarchies Cache hierarchies may vary a lot Number of levels Internals of each level Replacement strategies (each level) Write strategies Inter-level strategies Several CPUs, shared/unshared level Combinatorial explosion ! 7 Predict it! Knowing cache contents can help WCET (Worst Case Execution Time) Clever allocation on scratchpads (software-managed small caches in
embedded systems) Goals Modular and hierarchy independent prediction framework 8 Abstract Interpretation A quick overview of Abstract Interpretation 9 Abstract Interpretation: Principle CONCRETE WORLD ABSTRACT WORLD 10 Abstract Interpretation:
Principle CONCRETE WORLD ABSTRACT WORLD T T T 11 T Abstract Interpretation: Galois Given C, two posets Best abstraction Connection :
P C , Q A, A, and (ie. most precise) ( P) Q iff Galois P (Q) Soundness is optimal 12 Abstract Interpretation: Analysis 1. Goal x = Rand(1, 10) Find an
overapproximation x > 5 2. y = 0 x <= 5 3. of the abstract state at each control point y = 1 Equations (fixpoint) 4. Print y
2 x 5 1 3 x 5 1 4 2 3 13 Abstract Interpretation: Analysis 1. x = Rand(1, 10) x > 5 2. y = 0 4. x <= 5 3. y = 1
e.g. with intervals: x [1,10], y ] , [ 1 1 x2 [6,10], y2 [0, 0] x3 [1,5], y3 [1,1] x4 [1,5] [6,10] [1,10] y4 [0, 0] [1,1] [0,1] Print y 14 Cache prediction Description of the abstract domain for cache prediction 15
Cache Prediction: Dispatch Cache Level Receive Receive // Forward Other level Forward L3 Relation Strategy Reques Reques ts ts Receive Receive // Forward Forward Other level Report Report updates
updates Stock Strategy Make Make Address Address Available Available L1 Candidate Candidate locations locations Stock Management (low-level data organization) L2 16 Cache Prediction: Priority Goal:
Modeling the replacement dynamic Instead of abstracting locations contents abstract addresses priorities Priority for each Virtual Memory Address p [0, n 1] Or: the address is not in the cache Meaning: p 0 Victim is the lowest priority 17 Cache Prediction: ChangePriority Main ingredient to implement
strategies: Change Priority(VMAddress, p) LRU: Just call ChangePriority(.., Highest+1) FIFO: Call ChangePriority(.., Highest+1) only if the address is not in the cache level 18 Cache Prediction: Abstraction nA ABSTRACTION FOR LINES CACHE C=P ({ p : VM [0, n 1] { } A { p : VM P ([0, n - 1]{ })} / p is injective})
P p : ad p(ad ) pP { p : VM [0, n 1] { } / p is injective and ad VM , p(ad ) p (ad )} p Galois Connection 19 Cache Prediction: ChangePriority# Priority pr
p 1 ( pr ) ChangePriority#(A Priority d, pr) p ( pr ) 1 . a1 a2 a3 Ad a4a4 a5 a6 Ad a7 . . pr Moved Moved 1 1 down down Moved
Moved 0 0 or or 1 1 down down . a1 a2 a3 Ad a4 a5 a6 Ad a7 . . # Soundness : CP CP 20 Cache Prediction: Hits & Misses CONCRETE WORLD
ABSTRACT WORLD Let P p (Address) Miss Miss Hit/Miss? { } P Hit p(Address) Hit p(Address) { } P
P 21 Cache Prediction: Scratchpad Hardware managed Virtual Cache hierarchy Memory L3 L2 CPU L1 If (unlikely) y=0
Analysis gives dont put y information: Possible contents of the cache on the Simulate Scratchpad as ascratchpad L0 cache Heuristic Software manage d Scratchp ad Scratchpad allocation strategy 22 Related & Future Work Related Work
[Wilhelm et al. VMCAI10] and [Reineke et al. SAS09]: clever abstract domains, no generic proof, no hierarchy [Marvadel et al.]: profiling (for scratchpad allocation) Future work Safe initial states for some replacement strategies (domino effect, cf. [SAS09], [Berg 06]) Improvements proposed in [SAS09] Implementation & benchmarks Clever strategies for scratchpads 23 Conclusion Modular analysis: Hierarchy-independent Replacement strategy-independent Easy-to-build transfer functions
Formal and modular proofs of soundness Thank you! 24