1 / 14

More ETL

More ETL. ETL in a nutshell. ETL is an abbreviation of the three words  E xtract,  T ransform and L oad. It is an ETL process to extract data, mostly from different types of systems, transform it into a structure that's more appropriate for reporting and analysis

Télécharger la présentation

More ETL

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. More ETL

  2. ETL in a nutshell • ETL is an abbreviation of the three words Extract, Transform and Load. • It is an ETL process to • extract data, mostly from different types of systems, • transform it into a structure that's more appropriate for reporting and analysis • finally load it into the database and or cube(s).

  3. Logical Data Map • Used to collect and document source systems to be used for DW • Should contain the following: • Target table name • Target column name • Table type • SCD type • Source db • Source table name • Source column name • Transformation

  4. ETL in a nutshell • Extract from source • Transform the data • Load the data into the datawarehouse

  5. Do not forget • Data Profiling • Data Quality Control • Metadata Management

  6. ETL Tools 1. Oracle Warehouse Builder (OWB) Oracle 2. Data Services SAP Business Objects 3. IBM Information Server (Datastage) IBM 4. SAS Data Integration Studio SAS Institute 5. PowerCenter Informatica 9.0 Informatica 6. Elixir Repertoire 7.2.2 Elixir 7. Data Migrator 7.7 Information Builders 8. SQL Server Integration Services (SSIS) Microsoft 9. Talend Open Studio & Integration Suite Talend • DataFlow Manager 6.5 Pitney Bowes Business Insight

  7. On ETL Tools • All the ETL vendors say the same thing: “our biggest competitor is custom code and stored procedures”. There are still a large number of organizations building data warehouses without using an ETL tool, writing their own, mostly very complex, SQL statements, often difficult to maintain. • ETL tools can increase productivity by a factor of three to five. Two of the most important features in this case are that these tools are completely meta data driven, and work with an easy to understand graphical user interface (GUI).

  8. On ETL Tools (WYSWYG) • Tools with data driven, embedded data driven approach. They can show you, at any time you choose, how the data will be transformed and loaded into the data warehouse, without actually loading it. At any time, the user can press the ‘WYSIWYG’ button, and the ETL tool shows immediately what the results of an ETL flow would be

  9. The Future? EAI and ETL • What are the similarities between Enterprise Application Integration and Extraction Transformation and Loading tools? Some people (analysts) think these technologies will converge to form one product, Enterprise Information Management. They both move data, often use the same metadata, so why shouldn't they merge into one package?

  10. ETL Tool Selection • In many organisations there is no software selection process at all. Sometimes there is a corporate standard you need to comply to, or you may have a database or BI tool in place from a specific vendor, and you decide to buy an ETL tool from the same vendor, because you expect that these will work better together. • Often organisations find it difficult and labor intensive to execute a software selection process in order to gather all the necessary data. 

  11. ETL Tool Selection: From Company Perspective • Create data integration strategy • Define criteria in business terms • Create a short-list • Invite vendors for live demonstration • Perform Proof of Concept • Negotiate with Vendors • Close the Deal

  12. Selection Criteria • Architecture • ETL Functionality • Ease-of-Use • Reusability • Debugging • Real-time • Connectivity • General ETL tool characteristics

  13. Selection Criteria • Architecture • Symmetric Multiprocssing? Grid? etc • ETL Functionality • Conditional Splitting? Pivoting? etc • Ease-of-Use • Screen design? Task compatibility? etc • Reusability • Decompisition? User-defined functions? • Debugging • Breakpoints? Validators? etc • Real-time • On demand data integration? etc

  14. Long Test • Date and Venue • Coverage

More Related