Wrox Home  
Search
Professional SQL Server 2005 XML
by Scott Klein
January 2006, Paperback


SQL Server 2005 Programmability Enhancements — Common Table Expressions

By Scott Klein

There is a plethora of new T-SQL features and enhancements in SQL Server 2005, and you could probably write a small book talking about all of them. However, this article will focus on one of them — specifically, Common Table Expressions (CTE), because personally they have come in handy on many occasions.

I have used a few of the other new features and enhancements, and if you have not had a chance to delve into any of them, I would highly recommend familiarizing yourself with some of them, such as the new xml data type, improved error handling, query and event notifications, new DDL (Data Definition Language) triggers, ranking functions, TOP operator enhancements, the PIVOT and UNPIVOT operators, and the new APPLY operator. There are a few others, and they all add tremendous value to T-SQL developers.

This article will introduce common table expressions and go as in-depth as space will allow, showing how CTEs can be used to improve T-SQL's capabilities and how they can help better your T-SQL.

Common Table Expressions

A common table expression is expressed as a temporary table or view defined within an executing statement, such as SELECT INSERT, UPDATE, or DELETE. Think of a CTE as somewhat like a derived table in that a CTE only lasts throughout the duration of the executing statement. When that execution of the statement finishes, the CTE is gone. However, there are a few differences between a derived table and a CTE:

A CTE can reference itself. A derived table cannot.

A CTE can be referenced multiple times in the same query.

So what is the difference between a CTE and a true temporary table? The quick and easy answer is the overhead of having to create a temporary table and the performance consequence of using a temporary table. Think about what happens when you use a true temporary table. First, you have to define and create the table then populate it, and then run queries against it. When using CTEs you can forgo the first step and the associated overhead and use the built-in CTE, which provides all of that functionality for you.

CTE Structure

The structure of CTE is fairly simple. It basically contains two main parts, the first part being the CTE followed by the second part, which is the normal execution statement. The basic syntax for a CTE is as follows:

WITH common_table_expression_name [column(s)]
AS
(
  query_definition
)
EXECUTION STATEMENT

As defined in the syntax above, a common table expression contains several arguments that are defined as follows:

common_table_expression_name: The identifier for the common table expression. This name must be different than any other table or view used within the CTE. It can, however, be named the same as a base table or view, but when the common table expression is referenced in the normal execution statement, the common table expression name must be used, rather than the base name.

column(s): The column names in the common table expression. These column names must be unique (meaning, no duplicate names allowed) and must match the same number of columns returned by the query_definition.

query_definition: The SELECT statement in which the results are used to populate the common table expression. This statement follows the same requirements that the normal execution statement follows.

A CTE expression cannot define another CTE expression.

The following example uses the syntax outlined above to create a very simple common table expression query. Open a query window in SQL Server Management Studio and execute the following query against the AdventureWorks database.

WITH CTE_ProdMod (ProductModelID, ProductModelCount) AS
(
	SELECT   ProductModelID, COUNT(*)
	FROM     Production.Product
	WHERE    ProductModelID IS NOT NULL
	GROUP BY ProductModelID
)
SELECT   ProductModelID, ProductModelCount
FROM     CTE_ProdMod
ORDER BY ProductModelID

The example above uses a common table expression to count the number of products for each product model in the Production.Product table. A portion of the results of this query when executed are shown below.

ProductModelID ProductModelCount
-------------- -----------------
1              3
2              1
3              3
4              3
5              10
6              11
7              8
8              10
9              12
10             10

A simple example, yet not that impressive because it doesn't quite showcase the true power and flexibility of what a common table expression can really do. So, the following example uses a common table expression to return the count of products for each product model. The count is returned in the CTE, then the CTE is used to filter the results even further based on what was returned in the CTE.

WITH CTE_TranHist (ProductID, OrderQuantity) AS
(
	SELECT   ProductID, COUNT(*)
	FROM     Production.TransactionHistory
	WHERE    TransactionDate > '05/01/2004'
	GROUP BY ProductID
)
SELECT cte.ProductID, cte.OrderQuantity, 
	pp.productmodelid, ProductSubcategoryID
FROM Production.Product pp
INNER JOIN CTE_TranHist AS cte ON pp.productID = cte.productid
WHERE cte.productid BETWEEN 500 AND 750
AND ProductModelID IS NOT NULL
ORDER BY cte.ProductID

The example above uses a common table expression to count the number of products for each product model in the Production.Product table. A portion of the results of this query when executed are shown below.

ProductID   OrderQuantity ProductModelID ProductSubcategoryID
----------- ------------- -------------- --------------------
680         21            6              14
706         15            6              14
707         567           33             31
708         526           33             31
711         578           33             31
712         567           2              19
713         118           11             21
714         150           11             21
715         138           11             21
716         99            11             21
717         10            6              14
718         12            6              14
722         57            9              14
725         1             9              14
726         1             9              14
729         1             9              14
730         2             9              14
736         54            9              14
737         45            9              14
738         71            9              14
739         63            5              12
742         67            5              12
743         78            5              12
746         47            5              12
747         62            5              12
748         80            5              12

Imagine now that syntax you would have had to use to get the same results prior to SQL Server 2005. It would have looked something like this:

SELECT TranHist.ProductID, TranHist.OrderQuantity, 
	pp.ProductModelID, pp.ProductSubcategoryID
FROM Production.Product pp
INNER JOIN 
	(SELECT ProductID, COUNT(*)
		FROM Production.TransactionHistory
		WHERE TransactionDate > '05/01/2004'
		GROUP BY ProductID) 
	AS TranHist (ProductID, OrderQuantity)
ON pp.productID = TranHist.productid
WHERE TranHist.productid BETWEEN 500 AND 750
AND ProductModelID IS NOT NULL

Now, which would you rather read, let alone write? I thought so. Common table expressions have a number of advantages, including better readability to make it much easier to maintain more complex queries.

Now that you have a general understanding of CTEs, the true value of CTEs appear when you have the need for recursive queries. This topic is discussed next.