);
The table value constructor is defined in the FROM clause of the outer query.
The table’s body is made of a VALUES clause, followed by a comma-separated list of pairs of parentheses, each defining a row with a comma-separated list of expressions forming the row’s values.
The table’s heading is a comma-separated list of the target column names. I’ll talk about a shortcoming of this syntax regarding the table’s heading shortly.
The following code uses a table value constructor to define a table called MyCusts with three columns called custid, companyname and contractdate, and three rows:
SELECT custid, companyname, contractdate
FROM ( VALUES( 2, 'Cust 2', '20200212' ),
( 3, 'Cust 3', '20200118' ),
( 5, 'Cust 5', '20200401' ) )
AS MyCusts(custid, companyname, contractdate);
The above code is equivalent (both logically and in performance terms) in T-SQL to the following alternative:
SELECT custid, companyname, contractdate
FROM ( SELECT 2, 'Cust 2', '20200212' UNION ALL
SELECT 3, 'Cust 3', '20200118' UNION ALL
SELECT 5, 'Cust 5', '20200401' )
AS MyCusts(custid, companyname, contractdate);
The two are internally algebrized the same way. The syntax with the VALUES clause is standard whereas the syntax with the unified FROMless queries isn’t, hence I prefer the former.
There is a shortcoming in the design of table value constructors in both standard SQL and in T-SQL. Remember that the heading of a relation is made of a set of attributes, and an attribute has a name and a type name. In the table value constructor’s syntax, you specify the column names, but not their data types. Suppose that you need the custid column to be of a SMALLINT type, the companyname column of a VARCHAR(50) type, and the contractdate column of a DATE type. It would have been good if we were able to define the column types as part of the definition of the table’s heading, like so (this syntax isn’t supported):
SELECT custid, companyname, contractdate
FROM ( VALUES( 2, 'Cust 2', '20200212' ),
( 3, 'Cust 3', '20200118' ),
( 5, 'Cust 5', '20200401' ) )
AS MyCusts(custid SMALLINT, companyname VARCHAR(50), contractdate DATE);
That’s of course just wishful thinking.
The way it works in T-SQL is that each literal has a predetermined type, irrespective of context. For instance, can you guess the types of the following literals?
- 1
- 2147483647
- 2147483648
- 1E
- '1E'
- '20200212'
Is 1 considered BIT, INT, SMALLINT, other?
Is 1E considered VARBINARY(1), VARCHAR(2), other?
Is '20200212' considered DATE, DATETIME, VARCHAR(8), CHAR(8), other?
There’s a simple trick to figure out the default type of a literal, using the SQL_VARIANT_PROPERTY function with the 'BaseType' property, like so:
SELECT SQL_VARIANT_PROPERTY(2147483648, 'BaseType');
What happens is that SQL Server implicitly converts the literal to SQL_VARIANT—since that’s what the function expects—but preserves its base type. It then reports the base type as requested.
Similarly, you can query other properties of the input value, like the maximum length (MaxLength), Precision, Scale, and so on.
Try it with the aforementioned literal values, and you will get the following:
- 1:INT
- 2147483647:INT
- 2147483648:NUMERIC(10, 0)
- 1E:FLOAT
- '1E':VARCHAR(2)
- '20200212':VARCHAR(8)
As you can see, SQL Server has default assumptions about the data type, maximum length, precision, scale, and so on.
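To look at several properties at once, you could combine multiple SQL_VARIANT_PROPERTY calls per literal. The following is a minimal sketch rather than code from the original discussion; the column aliases (literal, basetype, prec_val, scale_val, maxlength) are names I made up for illustration, and the literal column is only a text label for each row:

```sql
-- Each row inspects one of the literals discussed above.
-- The first column is just a label; the function calls receive the actual literal.
SELECT '1' AS literal,
  SQL_VARIANT_PROPERTY(1, 'BaseType')  AS basetype,
  SQL_VARIANT_PROPERTY(1, 'Precision') AS prec_val,
  SQL_VARIANT_PROPERTY(1, 'Scale')     AS scale_val,
  SQL_VARIANT_PROPERTY(1, 'MaxLength') AS maxlength
UNION ALL
SELECT '2147483648',
  SQL_VARIANT_PROPERTY(2147483648, 'BaseType'),
  SQL_VARIANT_PROPERTY(2147483648, 'Precision'),
  SQL_VARIANT_PROPERTY(2147483648, 'Scale'),
  SQL_VARIANT_PROPERTY(2147483648, 'MaxLength')
UNION ALL
SELECT '1E',
  SQL_VARIANT_PROPERTY(1E, 'BaseType'),
  SQL_VARIANT_PROPERTY(1E, 'Precision'),
  SQL_VARIANT_PROPERTY(1E, 'Scale'),
  SQL_VARIANT_PROPERTY(1E, 'MaxLength')
UNION ALL
SELECT '''20200212''',
  SQL_VARIANT_PROPERTY('20200212', 'BaseType'),
  SQL_VARIANT_PROPERTY('20200212', 'Precision'),
  SQL_VARIANT_PROPERTY('20200212', 'Scale'),
  SQL_VARIANT_PROPERTY('20200212', 'MaxLength');
```

The base types reported should match the list above; the other properties let you see the precision, scale and maximum length that SQL Server assumed for each literal.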
There are some cases where you need to specify a literal of a certain type, but you cannot do it directly in T-SQL. For example, you cannot specify a literal of the following types directly: BIT, TINYINT, BIGINT, all date and time types, and quite a few others. Unfortunately, T-SQL doesn’t provide a selector property for its types, which would have served exactly the needed purpose of selecting a value of the given type. Of course, you can always convert an expression’s type explicitly using the CAST or CONVERT function, as in CAST(5 AS SMALLINT). If you don’t, SQL Server will sometimes need to implicitly convert some of your expressions to a different type based on its implicit conversion rules. One example is when you compare values of different types, as in WHERE datecol = '20200212', assuming datecol is of a DATE type. Another example is when you specify a literal in an INSERT or an UPDATE statement, and the literal’s type is different from the target column’s type.
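As a quick illustration of the explicit route, here is a small sketch that wraps literals in CAST and then checks the resulting base type with SQL_VARIANT_PROPERTY. The column aliases are hypothetical names chosen for this example only:

```sql
-- Explicit conversions stand in for the literals T-SQL cannot express directly.
SELECT
  SQL_VARIANT_PROPERTY(CAST(1 AS BIT), 'BaseType')           AS bit_expr,
  SQL_VARIANT_PROPERTY(CAST(5 AS TINYINT), 'BaseType')       AS tinyint_expr,
  SQL_VARIANT_PROPERTY(CAST(5 AS SMALLINT), 'BaseType')      AS smallint_expr,
  SQL_VARIANT_PROPERTY(CAST(5 AS BIGINT), 'BaseType')        AS bigint_expr,
  SQL_VARIANT_PROPERTY(CAST('20200212' AS DATE), 'BaseType') AS date_expr;
```

Each column should report the type named in the corresponding CAST, which confirms that the conversion, and not the literal's default type, determines the type of the expression.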
If all this is not confusing enough, set operators like UNION ALL rely on data type precedence to define the target column types—and remember, a table value constructor is algebrized like a series of UNION ALL operations. Consider the table value constructor shown earlier:
SELECT custid, companyname, contractdate
FROM ( VALUES( 2, 'Cust 2', '20200212' ),
( 3, 'Cust 3', '20200118' ),
( 5, 'Cust 5', '20200401' ) )
AS MyCusts(custid, companyname, contractdate);
Each literal here has a predetermined type. 2, 3 and 5 are all of an INT type, so clearly the custid target column type is INT. If you had the values 1000000000, 3000000000 and 2000000000, the first and the third are considered INT and the second is considered NUMERIC(10, 0). According to data type precedence NUMERIC (same as DECIMAL) is stronger than INT, hence in such a case the target column type would be NUMERIC(10, 0).
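One quick way to check this claim is to reuse the SQL_VARIANT_PROPERTY function from earlier against a single-column table value constructor. This is only a sketch; the derived table name D and the column name val are placeholders I picked for the example:

```sql
-- The second value exceeds the INT range, so its literal is NUMERIC(10, 0);
-- data type precedence should then make the whole column NUMERIC(10, 0).
SELECT TOP (1)
  SQL_VARIANT_PROPERTY(val, 'BaseType')  AS val_typename,
  SQL_VARIANT_PROPERTY(val, 'Precision') AS val_precision,
  SQL_VARIANT_PROPERTY(val, 'Scale')     AS val_scale
FROM ( VALUES( 1000000000 ),
             ( 3000000000 ),
             ( 2000000000 ) )
       AS D(val);
```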
If you want to figure out which data types SQL Server chooses for the target columns in your table value constructor, you have a few options. One is to use a SELECT INTO statement to write the table value constructor’s data into a temporary table, and then query the metadata for the temporary table, like so:
SELECT custid, companyname, contractdate
INTO #MyCusts
FROM ( VALUES( 2, 'Cust 2', '20200212' ),
( 3, 'Cust 3', '20200118' ),
( 5, 'Cust 5', '20200401' ) )
AS MyCusts(custid, companyname, contractdate);
SELECT name AS colname, TYPE_NAME(system_type_id) AS typename, max_length AS maxlength
FROM tempdb.sys.columns
WHERE object_id = OBJECT_ID(N'tempdb..#MyCusts');
Here’s the output of this code:
colname typename maxlength
------------- ---------- ---------
custid int 4
companyname varchar 6
contractdate varchar 8
You can then drop the temporary table for cleanup:
DROP TABLE IF EXISTS #MyCusts;
Another option is to use the SQL_VARIANT_PROPERTY function, which I mentioned earlier, like so:
SELECT TOP (1)
SQL_VARIANT_PROPERTY(custid, 'BaseType') AS custid_typename,
SQL_VARIANT_PROPERTY(custid, 'MaxLength') AS custid_maxlength,
SQL_VARIANT_PROPERTY(companyname, 'BaseType') AS companyname_typename,
SQL_VARIANT_PROPERTY(companyname, 'MaxLength') AS companyname_maxlength,
SQL_VARIANT_PROPERTY(contractdate, 'BaseType') AS contractdate_typename,
SQL_VARIANT_PROPERTY(contractdate, 'MaxLength') AS contractdate_maxlength
FROM ( VALUES( 2, 'Cust 2', '20200212' ),
( 3, 'Cust 3', '20200118' ),
( 5, 'Cust 5', '20200401' ) )
AS MyCusts(custid, companyname, contractdate);
This code generates the following output (formatted for readability):
custid_typename custid_maxlength
-------------------- ----------------
int 4
companyname_typename companyname_maxlength
-------------------- ---------------------
varchar 6
contractdate_typename contractdate_maxlength
--------------------- ----------------------
varchar 8
So, what if you need to control the types of the target columns? As mentioned earlier, say you need custid to be SMALLINT, companyname VARCHAR(50), and contractdate DATE.
Don’t be misled into thinking that it’s enough to explicitly convert just one row’s values. If a corresponding value’s type in any other row has a higher data type precedence, it dictates the target column’s type. Here’s an example demonstrating this:
SELECT custid, companyname, contractdate
INTO #MyCusts1
FROM ( VALUES( CAST(2 AS SMALLINT), CAST('Cust 2' AS VARCHAR(50)), CAST('20200212' AS DATE)),
( 3, 'Cust 3', '20200118' ),
( 5, 'Cust 5', '20200401' ) )
AS MyCusts(custid, companyname, contractdate);
SELECT name AS colname, TYPE_NAME(system_type_id) AS typename, max_length AS maxlength
FROM tempdb.sys.columns
WHERE object_id = OBJECT_ID(N'tempdb..#MyCusts1');
This code generates the following output:
colname typename maxlength
------------- --------- ---------
custid int 4
companyname varchar 50
contractdate date 3
Notice that the type for custid is INT.
The same applies regardless of which row’s values you explicitly convert, as long as you don’t convert all of them. For example, here the code explicitly converts the types of the values in the second row:
SELECT custid, companyname, contractdate
INTO #MyCusts2
FROM ( VALUES( 2, 'Cust 2', '20200212'),
( CAST(3 AS SMALLINT), CAST('Cust 3' AS VARCHAR(50)), CAST('20200118' AS DATE) ),
( 5, 'Cust 5', '20200401' ) )
AS MyCusts(custid, companyname, contractdate);
SELECT name AS colname, TYPE_NAME(system_type_id) AS typename, max_length AS maxlength
FROM tempdb.sys.columns
WHERE object_id = OBJECT_ID(N'tempdb..#MyCusts2');
This code generates the following output:
colname typename maxlength
------------- --------- ---------
custid int 4
companyname varchar 50
contractdate date 3
As you can see, custid is still of an INT type.
You basically have two main options. One is to explicitly convert all values, like so:
SELECT custid, companyname, contractdate
INTO #MyCusts3
FROM ( VALUES( CAST(2 AS SMALLINT), CAST('Cust 2' AS VARCHAR(50)), CAST('20200212' AS DATE)),
( CAST(3 AS SMALLINT), CAST('Cust 3' AS VARCHAR(50)), CAST('20200118' AS DATE)),
( CAST(5 AS SMALLINT), CAST('Cust 5' AS VARCHAR(50)), CAST('20200401' AS DATE)) )
AS MyCusts(custid, companyname, contractdate);
SELECT name AS colname, TYPE_NAME(system_type_id) AS typename, max_length AS maxlength
FROM tempdb.sys.columns
WHERE object_id = OBJECT_ID(N'tempdb..#MyCusts3');
This code generates the following output, showing all target columns have the desired types:
colname typename maxlength
------------- --------- ---------
custid smallint 2
companyname varchar 50
contractdate date 3
That’s a lot of coding, though. Another option is to apply the conversions in the SELECT list of the query against the table value constructor, and then define a derived table against the query that applies the conversions, like so:
SELECT custid, companyname, contractdate
INTO #MyCusts4
FROM ( SELECT
CAST(custid AS SMALLINT) AS custid,
CAST(companyname AS VARCHAR(50)) AS companyname,
CAST(contractdate AS DATE) AS contractdate
FROM ( VALUES( 2, 'Cust 2', '20200212' ),
( 3, 'Cust 3', '20200118' ),
( 5, 'Cust 5', '20200401' ) )
AS D(custid, companyname, contractdate) ) AS MyCusts;
SELECT name AS colname, TYPE_NAME(system_type_id) AS typename, max_length AS maxlength
FROM tempdb.sys.columns
WHERE object_id = OBJECT_ID(N'tempdb..#MyCusts4');
This code generates the following output:
colname typename maxlength
------------- --------- ---------
custid smallint 2
companyname varchar 50
contractdate date 3
The reason for using the additional derived table lies in how logical query processing is designed. The SELECT clause is evaluated after FROM, WHERE, GROUP BY and HAVING. By applying the conversions in the SELECT list of the inner query, you allow expressions in all clauses of the outermost query to interact with the columns with the proper types.
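To make this concrete, here is a sketch of an outermost query whose WHERE and ORDER BY clauses interact with the already converted columns. The filter and the extra contractdate_typename column are my own additions for illustration:

```sql
-- The conversions happen in the inner query, so the outer WHERE and ORDER BY
-- clauses already see contractdate as DATE rather than VARCHAR(8).
SELECT custid, companyname, contractdate,
  SQL_VARIANT_PROPERTY(contractdate, 'BaseType') AS contractdate_typename
FROM ( SELECT
         CAST(custid AS SMALLINT) AS custid,
         CAST(companyname AS VARCHAR(50)) AS companyname,
         CAST(contractdate AS DATE) AS contractdate
       FROM ( VALUES( 2, 'Cust 2', '20200212' ),
                    ( 3, 'Cust 3', '20200118' ),
                    ( 5, 'Cust 5', '20200401' ) )
              AS D(custid, companyname, contractdate) ) AS MyCusts
WHERE contractdate >= DATEFROMPARTS(2020, 2, 1)
ORDER BY contractdate;
```

This should return the rows for customers 2 and 5, with the base type of contractdate reported as date. Had the conversions been applied only in the outermost SELECT list, the WHERE clause would have been evaluated against the unconverted character column.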
Back to our wishful thinking: clearly, it would be good if we someday got a syntax that allows explicit control of the types in the definition of the table value constructor’s heading, like so (again, this syntax isn’t supported):
SELECT custid, companyname, contractdate
FROM ( VALUES( 2, 'Cust 2', '20200212' ),
( 3, 'Cust 3', '20200118' ),
( 5, 'Cust 5', '20200401' ) )
AS MyCusts(custid SMALLINT, companyname VARCHAR(50), contractdate DATE);
When you’re done, run the following code for cleanup:
DROP TABLE IF EXISTS #MyCusts1, #MyCusts2, #MyCusts3, #MyCusts4;
Used in modification statements
T-SQL allows you to modify data through table expressions. That’s true for derived tables, CTEs, views and inline TVFs. What gets modified in practice is some underlying base table that is used by the table expression. I have much to say about modifying data through table expressions, and I will in a future article dedicated to this topic. Here, I just wanted to briefly mention the types of modification statements that specifically support derived tables, and provide the syntax.
Derived tables can be used as the target table in DELETE and UPDATE statements, and also as the source table in the MERGE statement (in the USING clause). They cannot be used in the TRUNCATE statement, nor as the target in the INSERT and MERGE statements.
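As an illustration of the MERGE case, the following sketch uses a derived table based on a table value constructor as the source in the USING clause. The temporary table #MergeDemo and the aliases TGT and SRC are hypothetical names used only for this example:

```sql
-- Hypothetical target table, created just for this demonstration.
DROP TABLE IF EXISTS #MergeDemo;

CREATE TABLE #MergeDemo
(
  custid      INT         NOT NULL PRIMARY KEY,
  companyname VARCHAR(50) NOT NULL
);

INSERT INTO #MergeDemo(custid, companyname)
  VALUES( 2, 'Cust 2' ), ( 3, 'Cust 3' );

-- The derived table SRC is the source; the base table is the target.
MERGE INTO #MergeDemo AS TGT
USING ( VALUES( 3, 'Cust 3 (renamed)' ),
              ( 5, 'Cust 5' ) )
        AS SRC(custid, companyname)
  ON SRC.custid = TGT.custid
WHEN MATCHED THEN
  UPDATE SET TGT.companyname = SRC.companyname
WHEN NOT MATCHED THEN
  INSERT(custid, companyname) VALUES(SRC.custid, SRC.companyname);

SELECT custid, companyname
FROM #MergeDemo
ORDER BY custid;

DROP TABLE IF EXISTS #MergeDemo;
```

Customer 3 exists in the target and gets the new name, customer 5 doesn't exist and gets inserted, and customer 2 is left untouched.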
For the DELETE and UPDATE statements, the syntax for defining the derived table is a bit awkward. You don’t define the derived table in the DELETE and UPDATE clauses, like you would expect, but rather in a separate FROM clause. You then specify the derived table name in the DELETE or UPDATE clause.
Here’s the general syntax of a DELETE statement against a derived table:
DELETE [ FROM ] <derived table name>
FROM ( <table expression> ) [ AS ] <derived table name> [ (<target columns>) ]
[ WHERE <filter predicate> ];
As an example (don’t actually run it), the following code deletes all US customers with a customer ID that is greater than the minimum for the same region (the region column represents the state for US customers):
DELETE FROM UC
FROM ( SELECT *, ROW_NUMBER() OVER(PARTITION BY region ORDER BY custid) AS rownum
FROM Sales.Customers
WHERE country = N'USA' ) AS UC
WHERE rownum > 1;
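If you want to experiment with this pattern without touching the sample database, you could run a variation against a throwaway temporary table. This is a minimal sketch; the table #DeleteDemo, its sample rows and the alias C are all made up for the example:

```sql
-- Hypothetical data: two regions with two customers each, one with a single customer.
DROP TABLE IF EXISTS #DeleteDemo;

CREATE TABLE #DeleteDemo
(
  custid INT          NOT NULL PRIMARY KEY,
  region NVARCHAR(15) NOT NULL
);

INSERT INTO #DeleteDemo(custid, region)
  VALUES( 1, N'WA' ), ( 2, N'WA' ), ( 3, N'OR' ), ( 4, N'OR' ), ( 5, N'CA' );

-- Delete through the derived table C; the rows removed are those of #DeleteDemo.
DELETE FROM C
FROM ( SELECT *, ROW_NUMBER() OVER(PARTITION BY region ORDER BY custid) AS rownum
       FROM #DeleteDemo ) AS C
WHERE rownum > 1;

-- Only the customer with the minimum ID per region should remain.
SELECT custid, region
FROM #DeleteDemo
ORDER BY custid;

DROP TABLE IF EXISTS #DeleteDemo;
```

After the DELETE, the remaining rows should be customers 1, 3 and 5, one per region.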
Here’s the general syntax of an UPDATE statement against a derived table:
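UPDATE <derived table name>
SET <assignments>
FROM ( <table expression> ) [ AS ] <derived table name> [ (<target columns>) ]
[ WHERE <filter predicate> ];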
As you can see, from the perspective of the definition of the derived table, it’s quite similar to the syntax of the DELETE statement.
As an example, the following code changes the company names of US customers to one using the format N'USA Cust ' + rownum, where rownum represents a position based on customer ID ordering:
BEGIN TRAN;
UPDATE UC
SET companyname = newcompanyname
OUTPUT
inserted.custid,
deleted.companyname AS oldcompanyname,
inserted.companyname AS newcompanyname
FROM ( SELECT custid, companyname,
N'USA Cust ' + CAST(ROW_NUMBER() OVER(ORDER BY custid) AS NVARCHAR(10)) AS newcompanyname
FROM Sales.Customers
WHERE country = N'USA' ) AS UC;
ROLLBACK TRAN;
The code applies the update in a transaction that it then rolls back so that the change won't stick.
This code generates the following output, showing both the old and the new company names:
custid oldcompanyname newcompanyname
------- --------------- ----------------
32 Customer YSIQX USA Cust 1
36 Customer LVJSO USA Cust 2
43 Customer UISOJ USA Cust 3
45 Customer QXPPT USA Cust 4
48 Customer DVFMB USA Cust 5
55 Customer KZQZT USA Cust 6
65 Customer NYUHS USA Cust 7
71 Customer LCOUJ USA Cust 8
75 Customer XOJYP USA Cust 9
77 Customer LCYBZ USA Cust 10
78 Customer NLTYP USA Cust 11
82 Customer EYHKM USA Cust 12
89 Customer YBQTI USA Cust 13
That’s it for now on the topic.
Summary
Derived tables are one of the four main types of named table expressions that T-SQL supports. In this article I focused on the logical aspects of derived tables. I described the syntax for defining them and their scope.
Remember that a table expression is a table and as such, all of its columns must have names, all column names must be unique, and the table has no order.
The design of derived tables suffers from two main flaws. In order to query one derived table from another, you need to nest your code, making it more complex to maintain and troubleshoot. If you need to interact with multiple occurrences of the same table expression, using derived tables you are forced to duplicate your code, which hurts the maintainability of your solution.
You can use a table value constructor to define a table based on self-contained expressions as opposed to querying some existing base tables.
You can use derived tables in modification statements like DELETE and UPDATE, though the syntax for doing so is a bit awkward.