Common Table Expression (CTE)


A common table expression (CTE) is a bit like a temporary table. It’s transient, lasting only as long as the query requires it. Temporary tables are available for use during the lifetime of the session of the query running the code or until they are explicitly dropped. The creation and use of temporary tables is a two- or three-part phase: table creation, population, and use. A CTE is built in the same code line as the SELECT, INSERT, UPDATE, or DELETE statements that use it. The best way to understand a CTE is to demonstrate an example with some code. Within the AdventureWorks database, there are a number of products held in the Production.Product table. For this example, let’s say you want to know the maximum list price of stock you’re holding over all the product categories. Using a temporary table, this would be a two-part process, as follows:

USE AdventureWorks
GO
SELECT p.ProductSubcategoryID, s.Name, SUM(ListPrice) AS ListPrice
INTO #Temp1
FROM Production.Product p
JOIN Production.ProductSubcategory s ON s.ProductSubcategoryID =
p.ProductSubcategoryID
WHERE p.ProductSubcategoryID IS NOT NULL
GROUP BY p.ProductSubcategoryID, s.Name
SELECT ProductSubcategoryID,Name,MAX(ListPrice)
FROM #Temp1
GROUP BY ProductSubcategoryID, Name
HAVING MAX(ListPrice) = (SELECT MAX(ListPrice) FROM #Temp1)
DROP TABLE #Temp1

However, with CTEs, this becomes a bit simpler and more efficient. In the preceding code snippet, we’ve created a temporary table. This table has no index on it, and therefore SQL Server will complete a table scan operation on it when executing the second part. In contrast, the upcoming code snippet uses the raw AdventureWorks tables. There is no creation of a temporary table, which would have used up processing time, and also existing indexes could be used in building up the query as well rather than a table scan.
The CTE is built up using the WITH statement, which defines the name of the CTE you’ll be returning, in this case ProdList, and the columns contained within it. The columns returned within the CTE will take the data types placed into it from the SELECT statement within the brackets. Of course, the number of columns within the CTE has to be the same as the table defined within the brackets. This table is built up, returned, and passed immediately into the following SELECT statement outside of the WITH block where the rows of data can then be processed as required. Therefore, the rows returned between the brackets could be seen as a temporary table that is used by the statement outside of the brackets.

WITH ProdList (ProductSubcategoryID,Name,ListPrice) AS
(
SELECT p.ProductSubcategoryID, s.Name,SUM(ListPrice) AS ListPrice
FROM Production.Product p
JOIN Production.ProductSubcategory s ON s.ProductSubcategoryID =
p.ProductSubcategoryID
WHERE p.ProductSubcategoryID IS NOT NULL
GROUP BY p.ProductSubcategoryID, s.Name
)
SELECT ProductSubcategoryID,Name,MAX(ListPrice)
FROM ProdList
GROUP BY ProductSubcategoryID, Name
HAVING MAX(ListPrice) = (SELECT MAX(ListPrice) FROM ProdList)

SQL Server 2005: CTE(Recursive)
SQL Server 2005 adds a new feature called Common Table Expressions (CTE). The true power of CTEs emerges when you use them recursively to perform hierarchical queries on tree structured data. In fact, besides SQL-92 compliance, this was the main reason Microsoft built CTEs. A recursive CTE is constructed from a minimum of two queries, the first, or anchor member (AM), is a nonrecursive query, and the second, or recursive member (RM), is the recursive query. Within your CTE’s parentheses (after the AS clause), you define queries that are either independent or refer back to the same CTE. The AM and RM are separated by a UNION ALL statement. Anchor members and are invoked only once and are invoked repeatedly until the query returns no rows. Multiple AMs can be appended to each other using either a UNION or a UNION ALL operator, depending on whether you want to eliminate duplicates. (You must append recursive members using a UNION ALL operator.)  Here is the syntax:
With SimpleRecursive( field names)
As
(
     <Select Statement for the Anchor Member>
     Union All
     <Select Statement for the Recursive Member>
)
Select * From SimpleRecursive

To demonstrate this feature, I will create an example here. We create a table with employees and a self referencing field back to Employee_ID called Reports To (I call this a Domestic Key in value of a Foreign Key). We are going to write a query that returns all the employees who report to Stephen (Employee_ID=2) and all the employees who report to Stephen’s subordinates: 
--create a table with tree data
--Reports to is a "domestic key" back to Employee_id
create table Employee_Tree (Employee_NM nvarchar(50), Employee_ID int Primary Key, ReportsTo int)
--insert some data, build a reporting tree
insert into employee_tree values(Richard, 1, 1)
insert into employee_tree values(Stephen, 2, 1)
insert into employee_tree values(Clemens, 3, 2)
insert into employee_tree values(Malek, 4, 2)
insert into employee_tree values(Goksin, 5, 4)
insert into employee_tree values(Kimberly, 6, 1)
insert into employee_tree values(Ramesh, 7, 5)

 

 

Our table looks like this:

Employee_NM

Employee_ID

ReportsTo

Richard

1

null

Stephen

2

1

Clemens

3

2

Malek

4

2

Goksin

5

4

Kimberly

6

1

Ramesh

7

5

Now the recursive query to determine all the employees who will report to Stephen:

--Recursive Query
WITH SimpleRecursive(Employee_NM, Employee_ID, ReportsTO)
            AS
(SELECT Employee_NM, Employee_ID, ReportsTO
  FROM Employee_Tree WHERE Employee_ID = 2
UNION ALL
SELECT p.Employee_NM, p.Employee_ID, p.ReportsTO
 FROM Employee_Tree  P  INNER JOIN
 SimpleRecursive A ON A.Employee_ID = P.ReportsTO
)
SELECT Employee_NM FROM SimpleRecursive

Employee_NM
--------------------------------------------------
Stephen
Clemens
Malek
Goksin
Ramesh

(5 row(s) affected)
This recursion starts where Employee_ID= 2 (the ANCHOR MEMBER or the first SELECT). It picks up that record and then, via the RECURSIVE MEMBER (the SELECT after the UNION ALL), picks up all of the records that report to Stephen and that record’s children (Goksin reports to Malek and Malek reports to Stephen). Each subsequent recursion tries to find more children that have as parents the employees found by the previous recursion. Eventually the recursion returns no results and that is what causes the recursion to stop (the reason why Kimberly is not returned).