Comprehensive Research Topics in Data Warehousing Directed by Dr. Mostafa H. Chehreghani

Research topics in data warehouse • Directed by: Dr. Rahgozar • Mostafa H. Chehreghani

List of research topics • Lineage tracing • Incremental view maintenance • Indexing in data warehouse • Data quality

Lineage tracing • List of papers : • Using AutoMed Metadata in Data Warehousing Environments • A Tutorial on the IQL Query Language • Practical Lineage Tracing in Data Warehouses • Incremental view maintenance and data lineage tracing in heterogeneous database environments • A Framework for supporting data integration using the materialized and virtual approaches

Lineage tracing • AutoMed: model for metadata in data warehouse • Use tags for relations • Use a language such as IQL • Node, Edge, Constraint • IQL: • Functional and typed language • Prefix and infix functions • New functions by lambda • lambda {x,y,z} ((*) ((+) x y) z)

IQL • let v = q1 in q2 • let v = ((+) 200 500) in ((*) v v) • union : R ++ S • duplicate elimination : distinct (R) • setUnion R S ≡ distinct (R ++ S) • difference : R -- S • projection : [{x,z} | {x,y,z} <- R] • Cartesian product and joins • gc agFun xs • map f xs • Grouping and aggregation operations
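The IQL constructs listed on this slide and the lambda example on the previous one have close Python analogues. The sketch below is only an illustration: the sample bags R and S are hypothetical, bags are modelled as Python lists, and the difference operator is assumed to have bag (monus) semantics.

```python
from collections import Counter
import operator

# IQL: lambda {x,y,z} ((*) ((+) x y) z)
f = lambda x, y, z: operator.mul(operator.add(x, y), z)

# IQL: let v = ((+) 200 500) in ((*) v v)
v = operator.add(200, 500)
let_result = operator.mul(v, v)

# Hypothetical sample bags of {x,y,z} tuples (not from the slides).
R = [(1, "a", 10), (1, "a", 10), (2, "b", 20)]
S = [(2, "b", 20), (3, "c", 30)]

# IQL: R ++ S (bag union keeps duplicates)
bag_union = R + S

def distinct(xs):
    """IQL: distinct (R); duplicate elimination, keeping first occurrences."""
    return list(dict.fromkeys(xs))

# IQL: setUnion R S, defined as distinct (R ++ S)
set_union = distinct(R + S)

def bag_difference(xs, ys):
    """IQL: R -- S, assumed here to remove one copy per matching element of S."""
    remaining = Counter(ys)
    out = []
    for x in xs:
        if remaining[x] > 0:
            remaining[x] -= 1  # cancel one occurrence instead of emitting it
        else:
            out.append(x)
    return out

# IQL: [{x,z} | {x,y,z} <- R] (projection as a comprehension)
projection = [(x, z) for (x, y, z) in R]
```

Comprehensions carry over almost one to one, which is why list comprehensions are used here rather than a relational library.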

Using IQL in AutoMed • Example : Enforce unique key constraint: (=) (count (distinct [n | {s,n} <- <<Student,name>>])) (count <<Student>>) • name : field • Student : table
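The uniqueness check above compares the number of distinct names with the number of students. A minimal Python sketch of the same comparison follows; the sample extents and the helper name name_is_unique are hypothetical.

```python
# Hypothetical extents: <<Student>> holds student keys,
# <<Student,name>> holds {s,n} pairs of key and name.
student = ["s1", "s2", "s3"]
student_name = [("s1", "Ana"), ("s2", "Ben"), ("s3", "Cy")]

def name_is_unique(student, student_name):
    """Mirror of: (=) (count (distinct [n | {s,n} <- <<Student,name>>])) (count <<Student>>)."""
    names = [n for (s, n) in student_name]
    return len(set(names)) == len(student)

ok = name_is_unique(student, student_name)

# Two students sharing a name violate the constraint.
violated = name_is_unique(student, [("s1", "Ana"), ("s2", "Ana"), ("s3", "Cy")])
```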


Example of lineage tracing • TS1,S2 = addNode («dept», {"Maths", "CompSci"}); • addNode («person», [x | x <- «mathematician»] ++ [x | x <- «compScientist»]); • addNode («avgDeptSalary», {avg [s | (m,s) <- «_, mathematician, salary»]} ++ {avg [s | (c,s) <- «_, compScientist, salary»]}); • addEdge («_, dept, person», [("Maths", x) | x <- «mathematician»] ++ [("CompSci", x) | x <- «compScientist»]); • addEdge («_, person, salary», «_, mathematician, salary» ++ «_, compScientist, salary»); • addEdge («_, dept, avgDeptSalary», {("Maths", avg [s | (m,s) <- «_, mathematician, salary»]), ("CompSci", avg [s | (c,s) <- «_, compScientist, salary»])}); • delEdge («_, mathematician, salary», [(p, s) | (d, p) <- «_, dept, person»; (p', s) <- «_, person, salary»; d = "Maths"; p = p']); • delEdge («_, compScientist, salary», [(p, s) | (d, p) <- «_, dept, person»; (p', s) <- «_, person, salary»; d = "CompSci"; p = p']); • delNode («mathematician», [p | (d, p) <- «_, dept, person»; d = "Maths"]); • delNode («compScientist», [p | (d, p) <- «_, dept, person»; d = "CompSci"]);
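To make the transformation pathway more concrete, here is a small Python sketch that evaluates the addNode/addEdge queries over made-up source data and then re-derives a source edge with the delEdge query. The names and salaries are hypothetical, and lineage_of_avg is an illustrative helper, not the lineage tracing algorithm from the cited papers.

```python
# Hypothetical source extents (not from the slides).
mathematician_salary = [("alice", 100), ("bob", 200)]   # «_, mathematician, salary»
comp_scientist_salary = [("carol", 300)]                # «_, compScientist, salary»
mathematician = [p for (p, s) in mathematician_salary]
comp_scientist = [p for (p, s) in comp_scientist_salary]

def avg(xs):
    return sum(xs) / len(xs)

# addNode / addEdge steps: build the integrated extents from the sources.
dept = ["Maths", "CompSci"]
person = mathematician + comp_scientist
dept_person = ([("Maths", x) for x in mathematician]
               + [("CompSci", x) for x in comp_scientist])
person_salary = mathematician_salary + comp_scientist_salary
dept_avg_salary = [
    ("Maths", avg([s for (m, s) in mathematician_salary])),
    ("CompSci", avg([s for (c, s) in comp_scientist_salary])),
]

# delEdge step: the source edge can be recovered from the integrated schema.
recovered_math_salary = [
    (p, s)
    for (d, p) in dept_person
    for (p2, s) in person_salary
    if d == "Maths" and p == p2
]

def lineage_of_avg(dept_name):
    """Source tuples that contributed to one («_, dept, avgDeptSalary») tuple."""
    by_dept = {"Maths": mathematician_salary, "CompSci": comp_scientist_salary}
    return list(by_dept[dept_name])
```

With this data, the lineage of the "Maths" average is the two mathematician salary tuples, and the delEdge query reproduces the original source edge exactly.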

Incremental view maintenance • List of papers • Incremental view maintenance and data lineage tracing in heterogeneous database environments • View maintenance in a warehousing environment • A System Prototype for Warehouse View Maintenance

Incremental view maintenance • Di : set of base relations • ΔDi : bags inserted into Di • ∇Di : bags deleted from Di • V : materialized view • ΔV : bags inserted into V • ∇V : bags deleted from V • Vnew = (V ++ ΔV) -- ∇V • Minimality condition • ∇V ⊆ V • ΔV ∩ ∇V = Ø
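The update rule and the minimality condition can be sketched with Python's collections.Counter as a bag type. This is a minimal illustration, assuming the minimality condition means that the deleted bag is contained in V and shares no tuples with the inserted bag; the function names and sample data are hypothetical.

```python
from collections import Counter

def apply_delta(view, inserted, deleted):
    """Vnew = (V ++ inserted) -- deleted, using bag union and bag difference."""
    return (view + inserted) - deleted  # Counter subtraction drops non-positive counts

def is_minimal(view, inserted, deleted):
    """Check that deleted is a sub-bag of V and that inserted and deleted are disjoint."""
    deleted_in_view = all(view[t] >= n for t, n in deleted.items())
    disjoint = not (inserted & deleted)  # & is bag intersection (minimum of counts)
    return deleted_in_view and disjoint

# Hypothetical materialized view and deltas.
V = Counter({"a": 2, "b": 1})
delta_ins = Counter({"c": 1})
delta_del = Counter({"a": 1})

V_new = apply_delta(V, delta_ins, delta_del)
```

Inserting and deleting the same tuple in one step would still produce a correct view, but it does redundant work, which is what the minimality condition rules out.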


Indexing in data warehouse • Paper • Bitmap Index Design and Evaluation • Advantages : • Compact size • Efficient hardware support for bitmap operations (AND, OR, XOR, NOT) • Fast search
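The idea behind a bitmap index can be shown with Python integers used as bit vectors: one bitmap per distinct column value, with bit i set when row i holds that value. This is a simplified sketch, not the design evaluated in the cited paper; the column data and function names are hypothetical.

```python
def build_bitmap_index(values):
    """Map each distinct value to an int whose bit i is set if row i has that value."""
    index = {}
    for row, value in enumerate(values):
        index[value] = index.get(value, 0) | (1 << row)
    return index

def rows_of(bitmap):
    """Row numbers whose bit is set in the bitmap."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]

# Hypothetical fact-table columns.
region = ["north", "south", "north", "east"]
status = ["open", "open", "closed", "open"]
n_rows = len(region)

region_idx = build_bitmap_index(region)
status_idx = build_bitmap_index(status)

# region = 'north' AND status = 'open'
north_and_open = rows_of(region_idx["north"] & status_idx["open"])

# region = 'north' OR region = 'east'
north_or_east = rows_of(region_idx["north"] | region_idx["east"])

# NOT status = 'open' (mask to the table size, since Python ints are unbounded)
not_open = rows_of(~status_idx["open"] & ((1 << n_rows) - 1))
```

Each predicate combination is a single bitwise operation over the whole column, which is where the hardware support for AND, OR, XOR, and NOT pays off.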

Bitmap Index

Data quality in data warehouse • List of papers • Towards Quality-Oriented Data Warehouse Usage and Evolution • Data Quality Problems and Proactive Data Quality Management in Data-Warehouse-Systems • Data Warehouse Data Policy • Fitness for use • Subjective : • Related to end users • Objective : • Definition of system • Models: • GQM : Goal Question Metric • English

GQM • Goal factor • Importance of each factor is determined with respect to the goal • Quality dimensions : • Data coherence • Data completeness • Data freshness

GQM
