700 likes | 948 Vues
Many-to-many Relationships in DAX. When Many To Many Are Really Too Many . Alberto Ferrari Senior Consultant SQLBI.COM. Who’s Speaking ?. BI Expert and Consultant Problem Solving Complex Project Assistance DataWarehouse Assesments and Development Courses, Trainings and Workshops
E N D
Many-to-many Relationships in DAX When Many To Many Are Really Too Many Alberto Ferrari Senior Consultant SQLBI.COM
Who’sSpeaking? • BI Expert and Consultant • ProblemSolving • Complex Project Assistance • DataWarehouseAssesments and Development • Courses, Trainings and Workshops • SQLBI: Branch of a Microsoft BI Gold Partner • Book Author, SSAS Maestro • alberto.ferrari@sqlbi.com • «Spaghetti English» • Getprepared, I can’t help it
Agenda • Many To ManyRelationships • Data Model • DAX Formula pattern • ClassicalMany To Many • CascadingMany To Many • Survey Data Model • Basket Analysis
Prerequisites • Wewill use, butnotdescribe: • Tabular basics • DAX basics • Evaluation contexts and interactions with relationships • Data modeling with Many To Many • Afterall… thisis a 400 session
No support for M2M in Tabular • Facts • Multidimensionalhandles M2M relationships • Tabular doesnot • Thus, myfastestever session endshere • Or… we can dive into DAX and play with it
Demo – Classical M2M • Wewillalways start lookingat the finalresult • Then, we dive into the DAX code
The M2M DAX Pattern • Leverages • CALCULATE • RowContexts • FilterContexts • Automatictransformation of RowContextintoFilterContextusingCALCULATE • Next slide: the formula. Keepit in mind Itwillappearquiteoften from now on
The DAX Formula of M2M Pattern AmountM2M:=CALCULATE (SUM (Fact_Transaction[Amount]),FILTER (Dim_Account,CALCULATE (COUNTROWS(Bridge_AccountCustomer) ) > 0 ))
What the formula shouldperform AmountM2M:=CALCULATE (SUM (Fact_Transaction[Amount]),FILTER (Dim_Account,CALCULATE (COUNTROWS(Bridge_AccountCustomer) ) > 0 )) Filter on Dim_Customer Filter on Dim_Account
How the formula works AmountM2M:=CALCULATE (SUM (Fact_Transaction[Amount]),FILTER (Dim_Account,CALCULATE (COUNTROWS(Bridge_AccountCustomer) ) > 0 )) The filterisapplied by theTabular data model
How the formula works AmountM2M:=CALCULATE (SUM (Fact_Transaction[Amount]),FILTER (Dim_Account,CALCULATE (COUNTROWS(Bridge_AccountCustomer) ) > 0 )) FILTER: For eachrow in Dim_AccountCALCULATE: filters the bridge The filterisapplied by theTabular data model
How the formula works AmountM2M:=CALCULATE (SUM (Fact_Transaction[Amount]),FILTER (Dim_Account,CALCULATE (COUNTROWS(Bridge_AccountCustomer) ) > 0 )) FILTER: For eachrow in Dim_AccountCALCULATE: filters the bridge Onlyrows in Dim_AccountwhereCOUNTROWS () > 0 survive the FILTER The filterisapplied by theTabular data model
How the formula works AmountM2M :=CALCULATE (SUM (Fact_Transaction[Amount]),FILTER (Dim_Account,CALCULATE (COUNTROWS(Bridge_AccountCustomer) ) > 0 )) FILTER: For eachrow in Dim_AccountCALCULATE: filters the bridge Onlyrows in Dim_AccountwhereCOUNTROWS () > 0 survive the FILTER Thisfilterisapplied by theTabular data model SUMisevaluatedonly for the accounts filtered from the customers
BI123|many-to-many In DAX The Karma of CALCULATE AmountM2M_Wrong:= CALCULATE(SUM (Fact_Transaction[Amount]),FILTER (Dim_Account,COUNTROWS(Bridge_AccountCustomer) > 0 )) AmountM2M_Correct := CALCULATE (SUM (Fact_Transaction[Amount]),FILTER (Dim_Account,CALCULATE (COUNTROWS (Bridge_AccountCustomer) ) > 0 ))
Wrong formula in action AmountM2M:=CALCULATE (SUM (Fact_Transaction[Amount]),FILTER (Dim_Account,COUNTROWS(Bridge_AccountCustomer) > 0 )) All the rows in the Account tablesurvived the FILTER
Many-to-many in action AmountM2M:=CALCULATE (SUM (Fact_Transaction[Amount]),FILTER (Dim_Account,CALCULATE (COUNTROWS(Bridge_AccountCustomer) ) > 0 )) Tworows in the Account tablesurvived the FILTER
BI123|many-to-many In DAX The Karma of CALCULATE AmountM2M_Wrong:= CALCULATE(SUM (Fact_Transaction[Amount]),FILTER (Dim_Account,COUNTROWS(Bridge_AccountCustomer) > 0 )) AmountM2M_Correct:= CALCULATE (SUM (Fact_Transaction[Amount]),FILTER (Dim_Account,CALCULATE (COUNTROWS (Bridge_AccountCustomer) ) > 0 ))
Many to Many - Conclusions • Complexity • Not in the data model • Only in the formula Goodunderstanding of CALCULATE and of relationships in Tabular needed Thisis just the beginningof ourjourney
CascadingMany To Many The pattern is the same, butthis time weneed to jumptwosteps to complete ourtask Filter on Dim_Category Filter on Dim_Account
CascadingMany To Many CALCULATE ( CALCULATE ( SUM (Fact_Transaction[Amount]), FILTER (Dim_Account, CALCULATE ( COUNTROWS (Bridge_AccountCustomer) ) > 0 )), FILTER (Dim_Customer, CALCULATE ( COUNTROWS (Bridge_CustomerCategory) ) > 0))
CascadingMany To Many • Generic Formula • Works with anynumber of steps • Be careful • Always start with the farthesttable • And moveonestepat a time in the direction of the facttable • One CALCULATE for eachstep • Complexity: M x N (geometric…)
Cascading Alternative Need for someETL here
Cascading Alternative The bridge now «feels»threedifferentfilters, butthe formula becomes a classicalmany to manyand runsfaster
FlattenedCascadingMany To Many • Flattened Data Model • Fasterthan the cascadingone • Simpler formula • Needs ETL • Worth a try in Multidimensionaltoo…
Survey Data Model • Facts: • Onecustomeranswersmanyquestions • Onequestionisanswered by manycustomers • Wewant to use a PivotTable to query this
Survey Scenario • Question: «Whatis the yearlyincome of consultants?» In otherwords… • Take the customerswhoanswered «Consultant» at the «Job» question • Slicethemusing the answer to the question «YearlyIncome»
SurveyAnalytical Data Model • Two «Filter» Dimensions • One for «Job» = «Consultants» • One for «YearlyIncome» Question1 = «Job»Answer1 = «Consultant» Question2 = «YearlyIncome»
Survey: The FinalResult • Value = CustomerCount • The facttable, this time, actsas the bridge
SurveyAnalytical Data Model No relationshipbetween the fact table and the twofilterdimensions Becausehereisonlyonevalue for ID_Answer
SurveyAnalytical Data Model • The twofilters • Are applied on the Customerstable • Use separate instancesof the Answerstable
Survey: The DAX Formula IF ( COUNTROWS (VALUES (Filter1[ID_Answer])) = 1 && COUNTROWS (VALUES (Filter2[ID_Answer])) = 1, CALCULATE ( CALCULATE ( COUNTROWS (Customers), FILTER ( Customers, CALCULATE ( COUNTROWS (Answers), Answers[ID_Answer] = VALUES (Filter2[ID_Answer])) > 0 ) ), FILTER ( Customers, CALCULATE ( COUNTROWS (Answers), Answers[ID_Answer] = VALUES (Filter1[ID_Answer])) > 0 ) ) ) Additionalconditions toset the relationships withtwotables in DAX only
Survey: The DAX Formula IF ( COUNTROWS (VALUES (Filter1[ID_Answer])) = 1 && COUNTROWS (VALUES (Filter2[ID_Answer])) = 1, CALCULATE ( CALCULATE ( COUNTROWS (Customers), FILTER ( Customers, CALCULATE ( COUNTROWS (Answers), Answers[ID_Answer] = VALUES (Filter2[ID_Answer])) > 0 ) ), FILTER ( Customers, CALCULATE ( COUNTROWS (Answers), Answers[ID_Answer] = VALUES (Filter1[ID_Answer])) > 0 ) ) ) The tworeferences to «Answers» work on separate instances of the sametable
Survey - Conclusions • Verypowerful data model • Very compact • Duplicatesonlydimensions • Different from the same pattern in Multidimensional • Handlesanynumber of questions • Wehaveshowntwo • Butitisnot a limit • Interestingtopics • Facttableas the bridge • Relationships set in DAX • Can be queried with a simplePivotTable
Basket Analysis: The Scenario Of all the customerswhohavebought a Mountain Bike, howmanyhaveneverbought a mountain tire tube?
Basket Analysis in SQL Two iterations over the fact table needed SELECT COUNT (DISTINCT A.CustomerID) FROM FactSales A INNER JOIN FactSalesB ON A.CustomerID= B.CustomerID WHERE A.ProductModel= ‘MOUNTAIN TIRE TUBE' AND A.Year<= 2004 AND B.ProductModel= ‘MOUNTAIN-100' AND B.Year<= 2004
Look the query plan… Thisis the facttable… Do youreallylike to self-join it?
Basket Analysis: The Data Model Of all the customerswhohavebought a Mountain Bike, howmanyhaveneverbought a mountain tire tube? We can filter «Mountain Tire Tube» with thistablebut… where do wefilter «Mountain Bike»?
Basket Analysis: The Data Model 2° filter: «Mountain Bike» No relationships 1° filter: «Mountain Tire Tube»
The FinalResult 2° filter: «Mountain Tire Tube» HavingProduct = BoughtBoth NotHavingProduct = Bought Bike, No Tire 1° filter: «Mountain Bike»
HavingProduct := CALCULATE ( COUNTROWS (DISTINCT (FactSales[CustomerKey])), FILTER ( ALL (DimTime), DimTime[TimeKey] <= MAX (DimTime[TimeKey]) ), FILTER (DimCustomer, SUMX (ProductFilter, CALCULATE ( COUNTROWS (FactSales), ALL (FactSales), FactSales[CustomerKey] = EARLIER (DimCustomer[CustomerKey]), FactSales[ProductKey] = EARLIER (ProductFilter[ProductKey]), FILTER ( ALL (DimTime), DimTime[TimeKey] <= MAX (DimTime[TimeKey]) ) ) ) > 0 ))
HavingProduct := CALCULATE (COUNTROWS (DISTINCT (FactSales[CustomerKey])), FILTER ( ALL (DimTime), DimTime[TimeKey] <= MAX (DimTime[TimeKey]) ), FILTER (DimCustomer, SUMX (ProductFilter, CALCULATE ( COUNTROWS (FactSales), ALL (FactSales), FactSales[CustomerKey] = EARLIER (DimCustomer[CustomerKey]), FactSales[ProductKey] = EARLIER (ProductFilter[ProductKey]), FILTER ( ALL (DimTime), DimTime[TimeKey] <= MAX (DimTime[TimeKey]) ) ) ) > 0 )) Count the number of customers…
HavingProduct := CALCULATE ( COUNTROWS (DISTINCT (FactSales[CustomerKey])),FILTER ( ALL (DimTime), DimTime[TimeKey] <= MAX (DimTime[TimeKey]) ), FILTER (DimCustomer, SUMX (ProductFilter, CALCULATE ( COUNTROWS (FactSales), ALL (FactSales), FactSales[CustomerKey] = EARLIER (DimCustomer[CustomerKey]), FactSales[ProductKey] = EARLIER (ProductFilter[ProductKey]), FILTER ( ALL (DimTime), DimTime[TimeKey] <= MAX (DimTime[TimeKey]) ) ) ) > 0 )) In the period of time before the end of the currentlyselectedfilter