1 / 47

SQL Server 2012 for Business Intelligence

SQL Server 2012 for Business Intelligence. UTS Short Course. Mehmet Ozdemir – SA @ SSW. w: blog.ozdemir.id.au | e: mehmet @ssw.com.au | t: @ mozdemir_au. SQL Server, BI, Infrastructure Specializes in Application architecture and design SQL Performance Tuning and Optimization

avani
Télécharger la présentation

SQL Server 2012 for Business Intelligence

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. SQL Server 2012 for Business Intelligence UTS Short Course

  2. Mehmet Ozdemir – SA @ SSW w: blog.ozdemir.id.au | e: mehmet@ssw.com.au | t: @mozdemir_au • SQL Server, BI, Infrastructure • Specializes in • Application architecture and design • SQL Performance Tuning and Optimization • HyperV, SCVMM • Technology aficionado • Virtualization • Reporting/BI • Cubes

  3. Admin Stuff • Attendance • You initial sheet • Hands On Lab • You get me to initial sheet • Homework • Certificate • At end of 5 sessions • If I say if you have completed successfully 

  4. Course Website • http://sharepoint.ssw.com.au/Training/UTSSQL/Pages/default.aspx • Course Timetable • Course Materials

  5. Course Overview

  6. Last week(s) • Other cube browsers • Excel reports to SSAS • Power View • Power Pivot • Extending with DAX • SharePoint deployments

  7. Install SSAS!!!!

  8. The plan

  9. Step by step to BI • Create Data Warehouse • Copy data to data warehouse • Create OLAP Cubes • Create Reports • Browse the cube • Do some Data Mining • Discovering relationships • Predict future events

  10. Agenda • What is Data Mining? • Why? • Uses • Algorithms • Demo • Hands on Lab

  11. What is Data Mining? “Data mining is the use of powerful software tools to discover significant traits or relationships,from databases or data warehouses and often used to predict future events”

  12. What is Data Mining? • It exploits statistical algorithms • Once the “knowledge” is extracted it: • Can be used to discover • Can be used to predict values of other cases

  13. Why Data Mining? • Marketing • Who picks the movie? The kids, the wife, me • Who are our Customers and what sort of films do they hire? • Is a 30 year old woman with 2 children going to hire Arnie’s latest film • Validation • Is this data sensible? Terminator 2 and Toy Story • Prediction • Sales Next Year

  14. Why? Its all about money • Get new information from data, future trends, past trends, outlier, maximums, minimums • Analyse data from different perspectives and summarizing it into useful information • New information to • increase revenue • cuts costs • or both :-)

  15. Which Questions are Data Mining? • Who are our biggest customers? • What are customers buying with cigars? • What are the customer retention levels of our branches? • Which customers have bought olives, feta cheese but no ciabatta bread? • Which regions have the highest male/female ratio of single 20 something's? • Which region has lowest customer retention levels and list out lost customers?

  16. What’s not data mining • Ad hoc query • Drill through to details

  17. Data - Uncover patterns in samples • Huge amount of data • Good raw material  good data mining • Samples should be representative • Samples "similar" to domain • Not all-seeing crystal ball • Verify and Validate!

  18. OLAP versus Data Mining • OLAP • Is about fast ad hoc querying • Analysis by dimensions and measures • Gives precise answers • Data Mining • May use RDBMS or OLAP source • Is about discovering and predicting • Gives imprecise answers • OLAP is not a prerequisite for data mining, but it almost always comes first (learning to ride a bike before a car)

  19. Types of Data Mining Algorithms • Classification algorithms • predictone or more discrete variables, based on the other attributes in the dataset • Regression algorithms • predictone or more continuous variables, such as profit or loss, based on other attributes in the dataset • Segmentation algorithms • dividedata into groups, or clusters, of items that have similar properties • Association algorithms • find correlations between different attributes in a dataset • Sequence analysis algorithms • summarize frequent sequences or episodes in data, such as a Web path flow

  20. Complete Set Of AlgorithmsWays to analyze your data Clustering Time Series Decision Trees Naïve Bayes Association Linear Regression Neural Network Sequence Clustering Logistic Regression

  21. Decision trees • Split data • Each of branch is like an attribute • Brightness = amount of data

  22. Decision Trees (1) • Decision Trees assign (classify) each case to one of a few (discrete) broad categories of selected attribute (variable) and explains the classification with few selected input variables • The process of building is recursive partitioning – splitting data into partitions and then splitting it up more • Initially all cases are in one big box

  23. Decision Trees (2) • The algorithm tries all possible breaks in classes using all possible values of each input attribute; it then selects the split that partitions data to the purest classes of the searched variable • Several measures of purity • Then it repeats splitting for each new class • Again testing all possible breaks • Unuseful branches of the tree can be pre-pruned or post-pruned

  24. Decision Trees (3) • Decision trees are used for classification and prediction • Typical questions: • Predict which customers will leave • Help in mailing and promotion campaigns • Explain reasons for a decision • What are the movies young female customers like to buy?

  25. Decision Trees – Who Decides

  26. Naïve Bayes • Bayes Formula • Uses statistics to say falls into certain category or not with probability • Spam filtering: score of spam (Bayes) • Testing only a particular attribute

  27. Naïve Bayes • Quickly builds mining models that can be used for classification and prediction • It calculates probabilities for each possible state of the input attribute, given each state of the predictable attribute • This can later be used to predict an outcome of the predicted attribute based on the known input attributes • This makes the model a good option for exploring the data

  28. Cluster Analysis (1) • Grouping data into clusters • Objects within a cluster have high similarity based on the attribute values • The class label of each object is not known • Several techniques • Partitioning methods • Hierarchical methods • Density based methods • Model based methods • And more…

  29. Cluster Analysis (2) • Segments a heterogeneous population into a number of more homogenous subgroups or clusters • Some typical questions: • Discover distinct groups of customers • Identification of groups of houses in a city • In biology, derive animal and plant taxonomies • Find outliers

  30. Clustering Annual Income Age

  31. Time series • Timebaseddata  prediction

  32. Sequence clustering • Numbers orders stronger associations • Direction of association (not necessary the other direction)

  33. Association • If you own certain stocks ' you own maybe other ones as well • Probability = thickness of line

  34. Neural Nets • Let system learn how to classify data • Neural Network adapts to the new data • Formulate statement/hypothesis • Outcome is know • (Data / Surveys) • 1. 70% data to train network (outcome is known) • 2. 30% of data to test network (outcome is known) • 3. New data (no survey needed, predict from network) • Other example: OCR

  35. Conclusion: When To Use What

  36. Excel Data Mining • Microsoft SQL Server 2012 Data Mining Add-ins for Microsoft Office 2013 • http://www.microsoft.com/en-us/download/details.aspx?id=35578

  37. Other usages of data miningFind patterns - Profiling • Train station / airport • Who is the bad guy • Farmers • Find the best crops • Supermarket • Find to figure out how to get you to buy more, where the expensive items

  38. Create a Data Mining model • Create the Analysis Services Database • Creating a Named Calculation • Creating a Named Query • Creating the Mining Structure and Model • Create and Modifying Additional Models • Processing the Mining Structure • Using the Model Viewer to examine the models

  39. Master Data Management (MDM) • Collect and maintain accurate and complete master data to ensure standardized data definitions of key business entities across all of your IT assets. • Manage data consistency across different environments • Manage the superset of all data attributes across all systems. • Create versions for each model at different time intervals. • http://www.microsoft.com/en-us/sqlserver/solutions-technologies/enterprise-information-management/master-data-services.aspx

  40. Tip • SSIS 2012 - Data profiling task • Get a profile of the data in a table • potential candidate keys • length of data values in columns • Null percentage of rows • distribution of values • ....

  41. Resources 1 • Video: Simple data mining model http://www.sqlservercentral.com/articles/Video/65055/ • Video: Data mining and Reporting Services http://www.sqlservercentral.com/articles/Video/64190/ • Data Mining Algorithms http://msdn.microsoft.com/en-us/library/ms175595.aspx

  42. Resources 2 • Data Mining Resourceshttp://www.sqlserverdatamining.com/ssdm/ • Richard Lees on BIhttp://richardlees.blogspot.com/ Book Data Mining with Microsoft SQL Server 2008http://www.amazon.com/gp/product/0470277742?ie=UTF8&tag=sqlserverda09-20&linkCode=as2&camp=1789&creative=9325&creativeASIN=0470277742http://au.wiley.com/WileyCDA/WileyTitle/productCd-0470650931.html

  43. Summary • Why Data Mining? • Uses • Algorithms • Demo • Hands on Lab

  44. 3things… • mehmet@ssw.com.au • http://blog.ozdemir.id.au • twitter.com/mozdemir_au

  45. Thank You! Gateway Court Suite 10 81 - 91 Military Road Neutral Bay, Sydney NSW 2089 AUSTRALIA ABN: 21 069 371 900 Phone: + 61 2 9953 3000 Fax: + 61 2 9953 3105 info@ssw.com.auwww.ssw.com.au

More Related