SQL Server 2005 Integration Services Matthew Stephen IT Pro Evangelist (SQL Server) http://blogs.technet.com/mat_stephen Microsoft Ltd.
What is SQL Server Integration Services? • A new Microsoft SQL Server Business Intelligence application • The successor to Data Transformation Services • The platform for a new generation of high performance data integration technologies
Integrate Analyze Report SQL Server Business Intelligence • Data enrichment, with business logic, hierarchical views • Data discovery via data mining • Data acquisition from source systems and integration • Data transformation and synthesis • Data presentation and distribution • Data access for the masses
Alerts & escalation Call centre data: semi structured Data mining ETL Text Mining Staging Legacy data: binary files Staging Warehouse ETL Hand coding Staging Cleansing & ETL Reports Application database ETL Mobile data Example: before Integration Services • Integration and warehousing require separate, staged, operations. • Preparation of data requires different, often incompatible, tools. • Reporting and escalation is a slow process, delaying smart responses. • Heavy data volumes make this scenario increasingly unworkable.
Alerts & escalation Mobile data Text mining components Call centre: semi-structured data Merges Data cleansing components Data mining components Standard sources Custom source Warehouse Legacy data: binary files Reports SQL Server Integration Services Application database Example: with Integration Services • Integration and warehousing are a seamless, manageable, operation. • Sourced, prepare and load data in a single, auditable process. • Reporting and escalation can be parallelized with the warehouse load. • Scales to handle heavy and complex data requirements.
How SSIS Operates • Data sources can be diverse, including custom or scripted adapters • Transformation components shape and modify data in many ways. • Data is routed by rules or error conditions for cleansing and conforming. • Flows can be as complex as your business rules, but highly concurrent. • And finally data can be loaded in parallel to many varied destinations.
Customer benefits of SSIS • Performance • Data flows process large volumes of data efficiently • Facility • Many prebuilt adapters and transformations reduce hand coding • Extensible object model • Highly productive visual environment • Data cleansing features • Data mining • imputation of incomplete data
Feature drilldown:Data Integration • Traditional data sources • XML • Custom data sources • Integrate diverse sources • Parallel loading of diverse destinations
Feature drilldown:Data Warehousing • Sorting and aggregation during loading • Multicast and partition • Slowly changing dimensions • Load and process Analysis Services cubes
Feature drilldown:Intelligent Data Handling • Capture error rows • Fuzzy lookup and grouping • Data mining on the data flow • Text mining
Feature drilldown:Large Data Volumes • Efficient data sources • High performance processing • > 700% faster than DTS 2000 • Unique SQL Server Destination • > 8% faster than Bulk Insert • Advanced data flow architecture • Enables flexible concurrent processing
Feature drilldown:Development • Visual studio integration • Visual designer • Visual debugging • Build and deploy • Custom code integration
Enabling new architectures … Traditional (DTS) warehouse loading • Integration process simply conforms data and loads the database server • The database performs aggregations, sorting and other operations • Database competes for resources from user queries • This solution does not scale very well
Enabling new architectures … Warehouse loading with SQL Server Integration Services • SQL Server Integration Services conforms the data • But also aggregates and sorts, and loads the database • This frees-up the database server for user queries
Life Cycle tools • Design • Business Intelligence Designer • Migration wizard for pre SQL 2005 packages • Execute DTS 2000 package Task • Visual Source Safe Integration • Deployment • Deployment Utility • Command Line execution • Flexible Configuration Options • Supportability • Rich per package Logging • SSIS service • SQL Management Studio. • Checkpoint - Restart ability
Packages on file system Destination data Source data SSIS package error rows SSIS package Logging SSIS packages stored in SQL Sample Server Layout Integration ServicesPackage Execution Source Flat Files SSIS support Server(s)
Destination data Source data SSIS packages in SQL SSIS support Server(s) Logging Error Rows Sample Server Layout SSIS ParentPackage Execution via SQL Agent (scheduled) Parent calls children via SQL Agent on other machines
Summary • SQL Server Integration Services is an exceptionally high performance integration and transformation tool • Some processes benefit more from parallelism, some from memory • Many new tasks and transforms • Separation of control flow and data flow
Resources • Microsoft SQL Team blogshttp://www.sqljunkies.com/blogs • Microsoft SQL Server community on the webwww.sqlservercentral.com • SQL Server Integration Services on the webwww.sqlis.com • SQL Server Developer Centerhttp://msdn.microsoft.com/sql/ • Microsoft SQL Server 2005 websitehttp://www.microsoft.com/sql/2005/default.asp • SQL Newsgroups http://www.microsoft.com/technet/community/newsgroups/server/sql.mspx • SQL Server 2005 Datamininghttp://www.sqlserverdatamining.com/DMCommunity/