This presentation explores techniques for predictive caching of query results, using history to exploit both the similarity of queries within a single-user stream and long-term patterns across the global query sequence. Cache selection is integrated into query optimization: cached items are matched against new execution plans, new plans are unified with previously seen identical ones, and the items to cache are chosen with a cost-based greedy heuristic. The approach also addresses keeping cached nodes relevant as the workload changes.
Query Result Caching
Prasan Roy, Krithi Ramamritham, S. Seshadri, S. Sudarshan
Pradeep Shenoy, Jinesh Vora
Model
• Predictive caching: use history
• Cache query results and intermediate results
• Single-user stream: very similar queries
• Global sequence of queries: long-term patterns
• Leverages off MQO (P. Roy et al.)
Issues
• Matching and reuse of cached items
• Choice of items to cache
Matching
• Integrated into optimization
• Hash-based storage of DAGs and plans (sketched below)
• New plans unified with old, identical plans
• Cache items chosen in a cost-based manner
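To make the matching idea concrete, here is a minimal sketch of hash-based DAG storage with plan unification (hash consing), assuming children are already canonical nodes; the `DagStore` class and the operator strings are illustrative, not the system's actual structures.

```python
# Minimal sketch: structurally identical subplans hash to one shared node,
# so a new plan automatically "matches" any cached node it overlaps with.

class DagStore:
    def __init__(self):
        self._nodes = {}  # structural key -> canonical node

    def unify(self, op, *children):
        # Children are already canonical nodes, so identity-based keys suffice.
        key = (op, tuple(id(c) for c in children))
        return self._nodes.setdefault(key, (op, children))

store = DagStore()
a1 = store.unify("scan(A)")
a2 = store.unify("scan(A)")      # the same logical subplan appears again...
assert a1 is a2                  # ...and unifies to the same stored node
join = store.unify("join", a1, store.unify("scan(B)"))
```

Because equal subplans map to the same physical node, matching a new plan against the cache reduces to hash lookups during optimization.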
Review of MQO
• Basic idea:
  • Sharable nodes are considered for caching
  • Benefit of all subsets is computed; the best set is chosen
  • Greedy heuristic: take the highest-benefit node at each step (see the sketch after this list)
• Several optimizations included
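The greedy step can be sketched as below; `benefit(node, chosen)` stands in for the optimizer's cost-model call, and the toy savings values are made up for illustration.

```python
# Hedged sketch of the greedy heuristic: at each step, cache the candidate
# with the highest marginal benefit, stopping when no candidate helps.

def greedy_select(candidates, benefit):
    chosen = set()
    while True:
        best, best_gain = None, 0.0
        for node in candidates - chosen:
            gain = benefit(node, chosen)   # marginal benefit given earlier picks
            if gain > best_gain:
                best, best_gain = node, gain
        if best is None:                   # nothing improves workload cost
            return chosen
        chosen.add(best)

# Toy usage: fixed savings per node, discounted once something is cached,
# so choices interact (the reason exhaustive subset evaluation blows up).
savings = {"A join B": 10.0, "A join B join C": 14.0, "sigma(C)": 3.0}
result = greedy_select(set(savings),
                       lambda n, chosen: savings[n] * (0.5 if chosen else 1.0))
print(sorted(result))
```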
Adaptation
• Characterizing the query workload
• Weighted set of queries, frequency-based (example below)
• The candidate set for caching is varied between approaches
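One plausible representation of the frequency-weighted workload, assuming queries can be normalized and counted; this is a sketch, not the paper's actual representation.

```python
# Sketch: the workload as counts over normalized queries, so a node's
# benefit is weighted by how often the queries that use it recur.
from collections import Counter

history = ["Q1", "Q2", "Q1", "Q3", "Q1", "Q2"]   # normalized query stream
workload = Counter(history)                       # Q1: 3, Q2: 2, Q3: 1

def weighted_benefit(per_use_saving, using_queries, workload):
    # Total saving = per-use saving times total recurrences of its users.
    return per_use_saving * sum(workload[q] for q in using_queries)

print(weighted_benefit(2.0, ["Q1", "Q2"], workload))  # 2.0 * (3 + 2) = 10.0
```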
Local Commonality
• Use a small window of recent queries
• Candidate set: current cache contents + nodes of the new execution plan
• Make the greedy choice on this set
• Re-check whether old nodes are still relevant (cleanup)
• Check whether any nodes in the current plan are worth caching (scavenge)
• Metric: benefit to the representative set (the cycle is sketched below)
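A sketch of the per-query maintenance cycle, reusing the `greedy_select` placeholder from the MQO sketch above; the window size, node type, and benefit function are assumptions for illustration.

```python
# Sketch: on each query, greedily re-select over cache + new plan, then
# split the decision into the cleanup and scavenge steps named above.
from collections import deque

recent_queries = deque(maxlen=10)   # representative set: small recent window

def on_new_query(query, plan_nodes, cache, benefit):
    recent_queries.append(query)
    # Candidate set: current cache contents plus nodes of the new plan.
    candidates = set(cache) | set(plan_nodes)
    # Greedy choice; benefit is assumed to score against recent_queries.
    keep = greedy_select(candidates, benefit)
    stale = cache - keep            # cleanup: cached nodes no longer beneficial
    fresh = keep - cache            # scavenge: new-plan nodes now worth caching
    return (cache - stale) | fresh
```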
Disadvantage
• DAG is small: no long-term patterns captured
• Candidate set is small: only a local optimum is reached
• Similar to the "quick-and-dirty" Volcano-RU method
Global Commonality
• Dynamic "view selection"
• Large DAG, full-scale MQO
• Candidate set includes all sharable nodes
• Extended-predictive: no immediate caching (sketched below)
  • Compute and materialize during slack time
  • Cache on first use
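A rough sketch of the extended-predictive policy: nodes chosen by global MQO are only queued, then materialized when the system is idle. The idleness test and function names are assumptions, not the system's interface.

```python
# Sketch: defer the work of caching to slack time; optionally admit a
# node to the cache only when a query first uses it.
import queue
import time

pending = queue.Queue()          # nodes chosen by global MQO, work deferred

def schedule_for_caching(node):
    pending.put(node)            # no immediate caching

def slack_time_worker(materialize, system_is_idle):
    while True:
        node = pending.get()
        while not system_is_idle():
            time.sleep(0.1)      # wait for slack in the query load
        materialize(node)        # compute and store the result now

def on_use(node, materialized, cache):
    # Variant from the slide: cache a materialized node on its first use.
    if node in materialized:
        cache.add(node)
```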
Status
• In progress!