ESS DATA VISUALISATION EBA’s BI and ITS analytics approach. 6-7 May 2019 JULIO ROCHA. Agenda. The EBA’s data analytics platform. Demos. Data Quality – the start of an open discussion. About us.
ESS DATA VISUALISATION: EBA’s BI and ITS analytics approach • 6-7 May 2019 • JULIO ROCHA
Agenda • The EBA’s data analytics platform • Demos • Data Quality – the start of an open discussion
About us • Established by Regulation (EC) No. 1093/2010 of the European Parliament and of the Council of 24 November 2010. • Came into being on 1 January 2011. • Took over all existing and ongoing tasks and responsibilities from the Committee of European Banking Supervisors (CEBS). • Hub-and-spoke network of EU and national bodies safeguarding public values such as the stability of the financial system, the transparency of markets and financial products, and the protection of depositors and investors.
Situation of supervisory data exploration • ITS Data • Very heterogeneous, different granularity • Complex definitions • Data Dictionary • Hyper-dimensional DPM; complex meta model structure • Static, hidden from end user; no semantic layer for data analysis • Data Warehouse • Structure designed for data integration, not for data analysis • No subject-oriented data marts • Analytical Tools • Difficult for non-expert end users to use • No visual analytics capabilities • Overall: difficult to understand, difficult to use
Defining a BI strategy • WHAT we’re trying to achieve • Make ITS data easily accessible for exploration to all EBA information analysts • WHO we’re trying to reach • The majority of EBA users, who lack the means to access ITS data in a self-sufficient manner • WHY it matters • To unlock the value of collected ITS data, which remains largely unexploited because it is not accessible • HOW we’ll measure success • A steady increase in the usage of ITS data
In a nutshell • Report definitions • Collected Data • Analysis
The idea • Comprehensible Data • Report definitions • Collected Data • Subject Knowledge • Analysis
Business requirements (1) • Use of all data properties • template based properties (row, column, cell references), including open axes • data point metric and dimensions • instance level dimensions (entity, period, consolidation scope, subsidiaries) • Automatic refresh of content • scheduled or on demand • Specific submission periods (data submitted ‘as of’) • Immediate adjustment to structural changes • introduced by a new DPM release • of the subject area (adding or removing templates) • Exhibiting time series continuity over all data point versions • by referencing data to invariant Data Point IDs • Tackling shared data points, changes in templates over time etc.
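The time series continuity requirement can be illustrated with a minimal sketch. It assumes a hypothetical fact layout (DPM release, template cell reference, data point ID, period, value); the template codes, IDs and values are invented for illustration and do not come from the EBA's systems. The point is only that grouping by the invariant data point ID keeps a series continuous even when the cell reference moves between releases.

```python
from collections import defaultdict

# Hypothetical facts: (dpm_release, cell_reference, data_point_id, period, value).
# The cell reference changes between releases; the data point ID does not.
facts = [
    ("release A", "T 01/r010/c010", 1001, "2017-12-31", 120.0),
    ("release B", "T 01/r015/c010", 1001, "2018-12-31", 135.0),
    ("release B", "T 01/r020/c010", 1002, "2018-12-31", 40.0),
]


def series_by_data_point(rows):
    """Group values by invariant data point ID, ordered by period."""
    series = defaultdict(list)
    for _release, _cell, dp_id, period, value in rows:
        series[dp_id].append((period, value))
    # ISO dates sort correctly as strings.
    return {dp_id: sorted(points) for dp_id, points in series.items()}


continuous = series_by_data_point(facts)
print(continuous[1001])
```

Keying on the cell reference instead would split data point 1001 into two unrelated one-point series.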
Business requirements (2) • Offering multiple possibilities of data visualisation • by using standard template layout (or combining templates) • by defining an ad hoc report layout, based on any of the available dimensions, or template references, or both • Easy filtering of data • slicing based on any combination of • periods, entities, entities master data • DPM dimension members • frameworks, report types, template groups, templates, cell coordinates • filtering for different coverage (selected banks, countries, EU, country aggregates) • guided analysis, never losing sight of the selection criteria, nor getting empty result sets • Aggregating/drilling through data • analysing data at different levels, by adding or removing dimensions
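Slicing and aggregating at different levels can be sketched as follows. The rows, dimension names and the helpers `slice_rows` and `aggregate` are hypothetical, and summing is only meaningful for additive metrics; this is an illustration of the idea, not the platform's implementation.

```python
from collections import defaultdict

# Hypothetical fact rows: dimension members plus a value.
facts = [
    {"period": "2018-12-31", "country": "PT", "entity": "Bank A", "value": 10.0},
    {"period": "2018-12-31", "country": "PT", "entity": "Bank B", "value": 5.0},
    {"period": "2018-12-31", "country": "ES", "entity": "Bank C", "value": 7.0},
    {"period": "2017-12-31", "country": "PT", "entity": "Bank A", "value": 9.0},
]


def slice_rows(rows, **criteria):
    """Keep rows whose member for each named dimension is in the allowed set."""
    return [
        row for row in rows
        if all(row[dim] in allowed for dim, allowed in criteria.items())
    ]


def aggregate(rows, dims):
    """Sum values grouped by the chosen dimensions (additive metrics only)."""
    totals = defaultdict(float)
    for row in rows:
        totals[tuple(row[d] for d in dims)] += row["value"]
    return dict(totals)


selected = slice_rows(facts, period={"2018-12-31"})
by_country = aggregate(selected, ["country"])            # coarser level
by_entity = aggregate(selected, ["country", "entity"])   # drill down by adding a dimension
print(by_country)
print(by_entity)
```

Adding "entity" to the grouping drills down; removing it rolls the same selection back up to country level.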
BI delivery mode • Mode 1 • Plan-based, approval-based • Waterfall • Conventional projects • IT-centric • Long cycle • Mode 2 • Empirical, continuous, process-based • Agile • New and uncertain projects • Business-centric • Short cycle • The delivery of ITS Analytics may look like an obvious candidate for a Mode 2 approach • … however, it is also clear in this case that reliability and accuracy cannot be compromised in any way! • The envisaged solution combines the advantages of both modes in an alternative approach, based on a highly structured process for delivering user-defined analytical models in which internal and mutual consistency is always guaranteed. • Bimodal BI (according to Gartner): Mode 1 (reliability, accuracy, stability); Mode 2 (agility, speed, autonomy)
The design Reporting metadata Collected data Subject specific, comprehensible, analytically focussed data sets Master Data Exploration and Usage Data Marts Metadata DPM Metadata Data Warehouse Transformation (ETL) Data Analytics System
Analytical model - Sources Sheets Rows Columns Tables Reporting Entities Report Types Reference Dates Consolidation Scopes Master data Metrics Reception Dates Report Instances Cells Data Points Main Dimensions Other Dimensions DPM Instances Fact Values Open Dimensions Typed Dimensions Facts & Context + DPM
DPM-based dimensions
Process Overview Reported Data Warehouse Analytical Data Marts SAS Analytics Data Point Model 1 2 3 Data Marts Definition Data Marts Generation Cubes Generation Business User Business User Master Data & Reference Data Data Marts Metadata Dimensional Data Cubes Excel + Power BI
Ad-hoc creation of data marts and cubes Defining Subject Areas
Process for defining a data mart • Define data mart properties • Name • Description • Business Justification • Owner/Requester • Refresh frequency • Start/End date • User access format (i.e. MS Cube, SAS data mart, or both) • Define data mart scope • Selecting the relevant Templates • Excluding unwanted cells (columns, rows, sheets) • Identify data mart main dimensions • by selecting from the returned list of applicable dimensions
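The properties, scope and main dimensions listed above could be captured in a single definition record. The sketch below is an assumption about how such a record might look: the class `DataMartDefinition`, its field names, the `problems` check and all sample values are hypothetical, with only the list of properties and the two access formats taken from the slide.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional, Set

# The two access formats named on the slide.
ALLOWED_FORMATS = {"MS Cube", "SAS data mart"}


@dataclass
class DataMartDefinition:
    # Data mart properties
    name: str
    description: str
    business_justification: str
    owner: str
    refresh_frequency: str
    start_date: date
    end_date: Optional[date] = None
    access_formats: Set[str] = field(default_factory=lambda: {"MS Cube"})
    # Data mart scope
    templates: List[str] = field(default_factory=list)
    excluded_cells: Set[str] = field(default_factory=set)
    # Dimensions promoted to "main" by the user
    main_dimensions: List[str] = field(default_factory=list)

    def problems(self) -> List[str]:
        """Return a list of basic consistency issues (empty if none)."""
        issues = []
        if not self.templates:
            issues.append("no templates selected")
        unknown = self.access_formats - ALLOWED_FORMATS
        if unknown:
            issues.append(f"unknown access formats: {sorted(unknown)}")
        if self.end_date is not None and self.end_date < self.start_date:
            issues.append("end date precedes start date")
        return issues


mart = DataMartDefinition(
    name="Example mart",
    description="Illustrative subject area",
    business_justification="Illustration only",
    owner="analyst",
    refresh_frequency="on demand",
    start_date=date(2018, 1, 1),
    access_formats={"MS Cube", "SAS data mart"},
    templates=["T 01"],
    excluded_cells={"T 01/r020/c010"},
    main_dimensions=["Main category"],
)
print(mart.problems())
```

Checking the definition before generation is one way to keep user-defined marts consistent without manual review.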
1. Create a subject area • Basic management info • Selection of date range • Choice of outputs to be built.
2. Choose some tables • Only shows reporting frameworks and tables to which the user has access • Select by table • Can select/deselect specific cells • Shows all versions of tables (changes over time)
3. Decide on the more/less relevant dimensions • Indicates: • The dimensions present in the selected data • The values used • The number of data points each dimension applies to • Depending on the data and the intended analysis, some dimensions will be useful, others less useful or simply redundant. • The user assigns each dimension to “main” or “other” Used a lot… Don’t apply to much Looks interesting … but not informative
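The dimension overview in this step can be approximated with a small profiling routine. Everything here is hypothetical: the sample data points, the dimension names, and in particular the `suggest_role` heuristic, which is only one possible rule of thumb. In the process described above the user makes the final main/other assignment.

```python
from collections import defaultdict

# Hypothetical data points, each described as dimension -> member.
data_points = {
    1001: {"Main category": "Loans", "Currency": "EUR"},
    1002: {"Main category": "Deposits", "Currency": "EUR"},
    1003: {"Main category": "Loans", "Currency": "EUR",
           "Counterparty sector": "Households"},
}


def profile_dimensions(points):
    """For each dimension, list the members used and count the data points."""
    members = defaultdict(set)
    usage = defaultdict(int)
    for dims in points.values():
        for dim, member in dims.items():
            members[dim].add(member)
            usage[dim] += 1
    return {
        dim: {"members": sorted(members[dim]), "data_points": usage[dim]}
        for dim in usage
    }


def suggest_role(stats, total, min_share=0.5):
    """Hypothetical heuristic: 'main' if widely used and more than one member."""
    informative = len(stats["members"]) > 1
    widely_used = stats["data_points"] / total >= min_share
    return "main" if informative and widely_used else "other"


profile = profile_dimensions(data_points)
roles = {dim: suggest_role(s, len(data_points)) for dim, s in profile.items()}
print(roles)
```

A dimension with a single member on every data point is used a lot but adds nothing to the analysis, so the heuristic leaves it in "other".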
4. Build/Refresh the marts and cubes
5. Explore the cube • Directly launch into Excel…
Demo Some of the EBA’s data visualisation tools
Enhancing ITS data quality • Quality of reported data: Over 2,000 validation rules are systematically applied to ITS reports upon reception at the EBA, and failures have decreased significantly over time. However, these validations cover only a small fraction of the collected data, in a reporting framework with more than 65,000 data points. Widespread use of analytic tools on ITS data will multiply the opportunities to spot abnormalities in reported data, learn from them, and enhance the data quality process. • Quality of metadata: The development of the DPM has so far focused solely on defining a data dictionary for data collection, with no consideration of later use for data analysis. Exploring DPM-based analytical models will provide feedback to the DPM development process, which will be used to fine-tune the model for easier data analysis.
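As one illustration of how analytic tools might surface abnormalities that validation rules do not cover, the sketch below flags outliers in a reported time series using a modified z-score based on the median absolute deviation. This is a generic statistical technique, not a method described in the presentation; the function `flag_outliers`, the sample values, and the conventional constants 0.6745 and 3.5 are assumptions.

```python
from statistics import median


def flag_outliers(values, threshold=3.5):
    """Return indices whose modified z-score exceeds the threshold.

    Uses the median absolute deviation (MAD), which is less sensitive
    to the outliers themselves than the mean and standard deviation.
    """
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # No spread around the median: the score is undefined, flag nothing.
        return []
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - med) / mad > threshold
    ]


# Hypothetical values reported for one data point over six periods.
reported = [100.0, 102.0, 98.0, 101.0, 990.0, 99.0]
suspicious = flag_outliers(reported)
print(suspicious)
```

A flagged value is only a candidate for review, for example a possible unit or sign error, not proof of a reporting mistake.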