Back to description
This book is about applications. Specifically, this book is about applying the functionality of SQL Server 2005 Integration... more
This book is about applications. Specifically, this book is about applying the functionality of SQL Server 2005 Integration Services (SSIS) to help you envision, develop, and implement your data processing needs. The discussions throughout the book spotlight how SSIS can help you accomplish your data integration and processing requirements.
Core to the data processing that SSIS does best is extraction, transformation, and loading (ETL). Over the years, this ETL has taken on a range of different meanings, from the general perspective of moving data from somewhere to somewhere else, to the specific application of data warehousing ETL. In fact, ETL has its roots in business intelligence (BI) and data warehouse processing.
This chapter provides important background information for generalized ETL that DBAs will need, as well as basic data warehousing ETL concepts. In addition, this chapter includes a practical review of SSIS functionality and provides the foundation for building the book’s examination of applying the functionality of SSIS to help you accomplish your individual goals in data integration and processing requirements.
... less
Whether applying SSIS toward warehouse ETL or data integration, or toward more complicated DBA tasks, the programming extendibility... more
Whether applying SSIS toward warehouse ETL or data integration, or toward more complicated DBA tasks, the programming extendibility can be a useful and powerful tool to accomplish workflow tasks in the control flow, or data-centric operations in the data flow, that are not easily handled by out-of-the-box components. You can extend SSIS in this way by using a couple of different supported approaches:
Through Custom Tasks, Components, and Enumerators, you are able to compile and register native or managed code, as well as extend the product with reusable components just like the built-in components found in the toolbox.
By using the Script Task or the Script Component, you can write code to accomplish things that are better suited to scripting than what can be accomplished through other components.
Both of these topics are discussed in the Professional SQL Server 2005 Integration Services book (Wiley Publishing, 2006). In that book, Chapter 14 discusses writing a custom task or custom component, and Chapter 15 shows how to write a user interface for your component. Scripting is also covered in Chapter 7 of that book, presenting the basics of the Script Task and the Script Component. All those chapters are commendable. It is beyond the scope of this chapter to cover those topics again at that level. However, we felt that more could be said and exemplified about extending your scriptingtaking full advantage of scripting in SSIS when it is appropriate.
What is meant here is that there may be specific times when you might want to customize or extend that functionality beyond basic scripting practices. For example, SSIS runs within the .NET Framework, and, therefore, you can use the base .NET libraries in the Global Access Cache (GAC), or you can build your own custom libraries to handle the custom logic that you need. This chapter shows you how to leverage the power of SSIS by explaining how to extend the Script Tasks and the Script Components, including the following:
Working with package variables
Referencing custom libraries
Updating connections in the Script Task
Raising error events
Encryption data in the Script Component
Profiling data in the data flow
A natural first discussion point for ETL is the extraction, the E in ETL. This chapter applies the concepts of the data extraction... more
A natural first discussion point for ETL is the extraction, the E in ETL. This chapter applies the concepts of the data extraction with using SSIS. As discussed in Chapter 1, ETL applies to a broad spectrum of applications beyond just data warehousing and ETL loading. Therefore, the discussion of this topic will include both generalized extraction concepts and data warehouse–specific concepts.
Data extraction is the process of moving data off of a source system, potentially to a staging environment or into the transformation phase of the ETL. Figure 3-1 shows the extraction process separated out on the left. An extraction process may pull data from a variety of sources, including files or database systems, as this figure highlights.
Figure 3-1: Extraction process
A few common objectives of data extraction include the following:
Consistency in how data is extracted across source systems
Performance of the extraction
Minimal impact on the source to avoid contention with critical source processes
Flexibility to handle source system changes
The ability to target only new or changed records
This chapter is structured into the following three sections:
SSIS data flow source adaptersBeyond just a review of the data flow sources, this chapter looks at performance and configuration options.
Incremental extractionPart of applying data extraction in SSIS involves seeing how to design SSIS packages that perform incremental extractionsextracting only changed records from a source.
Data lineageMany applications require that the source data being extracted include a mechanism that points back to the very records that were extracted, which provides valuable data validation.
The next three chapters focus on the discussion of ETL for data warehousing and business intelligence processing. This chapter... more
The next three chapters focus on the discussion of ETL for data warehousing and business intelligence processing. This chapter examines the processing of dimension tables in SSIS. Dimension tables are a data warehouse concept, which this chapter describes and then discusses how to move data from your data sources to your data warehouse dimension tables. Similar to this, Chapter 5 reviews the same things, but only applied for fact tables. Chapter 6 also covers business intelligence, but looks at the integration of SSIS with SQL Server 2005 Analysis Services (SSAS). Integration between SSIS and SSAS involves Online Analytical Processing (OLAP) cube and dimension processing, as well as SSAS data mining querying and training.
When processing data warehouses, dimension table ETL is only half the story. The next major aspect of ETL involves fact tables... more
When processing data warehouses, dimension table ETL is only half the story. The next major aspect of ETL involves fact tables. But the good news is that dimension ETL usually represents the majority of the complexity in the overall warehouse ETL. To be sure, a data warehouse involves more than just dimension and fact ETL (such as data lineage, auditing, and execution precedence), but in terms of business data, dimension, and facts, it contains the core information.
In this chapter, we focus on applying SSIS to fact table ETL. We discuss the theory and general concepts of fact table ETL, including data mapping, workflow, and precedence. We also identify dimension surrogate keys and measure calculations, and discuss how to manage data grain changes.
This chapter targets the integration points between SSIS and SQL Server 2005 Analysis Services (SSAS). Business Intelligence... more
This chapter targets the integration points between SSIS and SQL Server 2005 Analysis Services (SSAS). Business Intelligence (BI) solutions that involve SSAS cubes typically include a method for processing the objects from the data warehouse or data mart database sources into the OLAP cube structures. With the focus of Chapters 4 and 5 on handling dimension and fact table ETL, the final step in the process is to load the related SSAS objects, which, in most cases, will be the responsibility of the ETL process.
Since a big portion of SSAS integration in SSIS involves processing data and managing measure group partitions, the focus of this chapter is on processing and partition management techniques and best practices, starting with the out-of-the-box basics and followed by handling more complicated situations. The chapter concludes with a consideration of dimension change type implications with SSAS attribute relationships.
What sets enterprise ETL solutions like SSIS apart from just writing your complete process in a TSQL script is how SSIS handles... more
What sets enterprise ETL solutions like SSIS apart from just writing your complete process in a TSQL script is how SSIS handles problems. This chapter walks you through several scenarios on how to trap an error that makes your system go bump in the night without having to receive that call or page. In SSIS, you can develop a package that can self-heal or, at worst, roll back to a predictable state where you can fix things in the morning. If you must get a 4 a.m. call, you will want to be able to fix the problem and re-execute the package without having to worry about your data being in an inconsistent state. This chapter walks you through a number of examples on how to stabilize and make your packages bulletproof.
Most enterprise environments rely on consistent and reproducible patterns in development to keep cost down and deployments... more
Most enterprise environments rely on consistent and reproducible patterns in development to keep cost down and deployments predictable. Whether your environment is an enterprise-sized company or a smaller company, you want minimal work in order to move your solution through development, quality control, and production. This chapter shows a number of options to help you with development of your SSIS solution, and shows how to lower your total cost of development by creating simple, easy migrations from environment to environment.
So, you have a set of packages and are ready to run the package in production. This chapter focuses on how to administer... more
So, you have a set of packages and are ready to run the package in production. This chapter focuses on how to administer packages after they’re deployed to production. The discussion also examines how to create a stand-alone ETL server and some of the command line utilities you can use to make your job easier.
Because of its price-point (out of the box with SQL Server 2005), SSIS is not only being used to push data into SQL Server... more
Because of its price-point (out of the box with SQL Server 2005), SSIS is not only being used to push data into SQL Server, but also to heterogeneous sources. On the other side of the coin, you will often receive data that’s not anywhere close to perfect that you must transform with SSIS. This chapter walks you through some of the scenarios witnessed in the field, that kept us up late at night. These examples should cover most scenarios that you’ll see in your company, and will hopefully keep you from having to have the same sleepless nights that we had while creating these solutions!
The SQL Server perfect storm is about to approach. SQL Server 2005 does still support Data Transformation Services... more
The SQL Server perfect storm is about to approach. SQL Server 2005 does still support Data Transformation Services (DTS) from a runtime perspective, but the next release of SQL Server may not. This gives you a small window of opportunity to upgrade your packages from DTS to SSIS using SQL Server 2005. This chapter discusses first how to run DTS from within a SQL Server 2005 environment, and then how to upgrade your DTS packages to SSIS. The first part of this process can be done through a wizard, but the conversion doesn’t stop when the package is migrated.
As a fitting conclusion to this book, considering how to scale SSIS will help tie several concepts together. The topic is... more
As a fitting conclusion to this book, considering how to scale SSIS will help tie several concepts together. The topic is not new, as ways to achieve scalability have been presented throughout the book (such as how to efficiently work with transactions and snapshots, and how to scale your dimension processing). This chapter hones the discussion of scalability to several key areas, including data flow tuning, destination adapter optimization, SSIS execution architectures, and effective use of SQL.
This chapter is grouped into three sections:
The first section, “SSIS Scalability Foundations,” looks at the low-hanging fruit. In other words, you may be reading this chapter because your packages are not performing to the level you would like. In this section, you'll find out what you should consider when trying to identify and improve performance. Additionally, one of the main areas of concern when architecting a package is when to use database scripts versus leveraging the data flow. Therefore, this section will provide guidelines to identifying when to use each of these valuable tools.
The second section, “Data Flow Optimization,” deals with scalability of the data flow. When it comes to tuning SSIS performance, a majority of the resource impact (directly from SSIS) will come from the data flow itself, since the control flow simply is coordinating the execution of other services and programs. Among other topics, this optimization review of the data flow will examine destination optimizations.
The final section, “Package Execution Principles,” discusses package architecture and execution location. Identifying where packages should run impacts scalability. Included in this section is a discussion of distributed package execution.
Purchase Before purchasing this product, please be sure you have met all software and system requirements, and that you understand any limits placed upon its use.
Return Policy Wrox Chapters on Demand are non-returnable and non-refundable.
Reader Software Wrox Chapters on Demand are offered as PDFs, and they must be viewed using the Adobe Reader. If you do not have the Reader installed, it can be downloaded for free at Adobe.com.
Test Download As Wrox Chapters on Demand purchases are non-returnable, it is advisable that you test your system and software configurations with a free sample download before you place an order.
Usage Rights for a Wrox Chapter on Demand File Any Wrox Chapter on Demand product you purchase from this site will come with certain restrictions that allow Wiley to protect the copyrights of its products. After you purchase and download this title, you:
If you have any questions about these restrictions, you may contact Customer Care at (877) 762-2974 (8 a.m. - 5 p.m. EST, Monday - Friday). If you have any issues related to Technical Support, please contact us at 800-762-2974 (United States only) or 317-572-3994 (International) 8 a.m. - 8 p.m. EST, Monday - Friday).
Related Books