Does SQL Server Cache Aggregate Results When Duplicated Across Columns?
Clash Royale CLAN TAG#URR8PPP
Suppose we have a table Orders
containing the columns order_id, total, discount
Then we write a query similar to the following
SELECT
COUNT(order_id) AS num_orders
, SUM(total) / COUNT(order_id) as avg_total
, SUM(discount) / COUNT(order_id) AS avg_discount
FROM Orders
Is the value for COUNT(order_id)
Preserved across columns or re-computed (Eg, is there a performance hit)? Or is it better to determine the computed value(s) first and used those in the query, for example:
DECLARE @order_count AS INT
SELECT
@order_count = COUNT(order_id)
FROM Orders
SELECT
@order_count AS num_orders
, SUM(total) / @order_count as avg_total
, SUM(discount) / @order_count AS avg_discount
FROM Orders
Note that, while writing this question I noticed that since SQL Server 2008 AVG()
is supported. However, I continued this question and intend this to be more general to wanting to understand how SQL server handles Identical aggregates across columns as I do sometimes run into this in other forms.
sql-server sql-server-2012
add a comment |
Suppose we have a table Orders
containing the columns order_id, total, discount
Then we write a query similar to the following
SELECT
COUNT(order_id) AS num_orders
, SUM(total) / COUNT(order_id) as avg_total
, SUM(discount) / COUNT(order_id) AS avg_discount
FROM Orders
Is the value for COUNT(order_id)
Preserved across columns or re-computed (Eg, is there a performance hit)? Or is it better to determine the computed value(s) first and used those in the query, for example:
DECLARE @order_count AS INT
SELECT
@order_count = COUNT(order_id)
FROM Orders
SELECT
@order_count AS num_orders
, SUM(total) / @order_count as avg_total
, SUM(discount) / @order_count AS avg_discount
FROM Orders
Note that, while writing this question I noticed that since SQL Server 2008 AVG()
is supported. However, I continued this question and intend this to be more general to wanting to understand how SQL server handles Identical aggregates across columns as I do sometimes run into this in other forms.
sql-server sql-server-2012
See also.
– Jeroen Mostert
Feb 5 at 13:08
add a comment |
Suppose we have a table Orders
containing the columns order_id, total, discount
Then we write a query similar to the following
SELECT
COUNT(order_id) AS num_orders
, SUM(total) / COUNT(order_id) as avg_total
, SUM(discount) / COUNT(order_id) AS avg_discount
FROM Orders
Is the value for COUNT(order_id)
Preserved across columns or re-computed (Eg, is there a performance hit)? Or is it better to determine the computed value(s) first and used those in the query, for example:
DECLARE @order_count AS INT
SELECT
@order_count = COUNT(order_id)
FROM Orders
SELECT
@order_count AS num_orders
, SUM(total) / @order_count as avg_total
, SUM(discount) / @order_count AS avg_discount
FROM Orders
Note that, while writing this question I noticed that since SQL Server 2008 AVG()
is supported. However, I continued this question and intend this to be more general to wanting to understand how SQL server handles Identical aggregates across columns as I do sometimes run into this in other forms.
sql-server sql-server-2012
Suppose we have a table Orders
containing the columns order_id, total, discount
Then we write a query similar to the following
SELECT
COUNT(order_id) AS num_orders
, SUM(total) / COUNT(order_id) as avg_total
, SUM(discount) / COUNT(order_id) AS avg_discount
FROM Orders
Is the value for COUNT(order_id)
Preserved across columns or re-computed (Eg, is there a performance hit)? Or is it better to determine the computed value(s) first and used those in the query, for example:
DECLARE @order_count AS INT
SELECT
@order_count = COUNT(order_id)
FROM Orders
SELECT
@order_count AS num_orders
, SUM(total) / @order_count as avg_total
, SUM(discount) / @order_count AS avg_discount
FROM Orders
Note that, while writing this question I noticed that since SQL Server 2008 AVG()
is supported. However, I continued this question and intend this to be more general to wanting to understand how SQL server handles Identical aggregates across columns as I do sometimes run into this in other forms.
sql-server sql-server-2012
sql-server sql-server-2012
edited Feb 4 at 17:54
DavidScherer
asked Feb 4 at 17:37
DavidSchererDavidScherer
1516
1516
See also.
– Jeroen Mostert
Feb 5 at 13:08
add a comment |
See also.
– Jeroen Mostert
Feb 5 at 13:08
See also.
– Jeroen Mostert
Feb 5 at 13:08
See also.
– Jeroen Mostert
Feb 5 at 13:08
add a comment |
3 Answers
3
active
oldest
votes
SQL Server only calculates the COUNT
once. You can see this by looking at the properties of the execution plan for
create table Orders(order_id int, total int, discount int)
SELECT
COUNT(order_id) AS num_orders
, SUM(total) / COUNT(order_id) as avg_total
, SUM(discount) / COUNT(order_id) AS avg_discount
FROM Orders
The stream aggregate (1) has the following defined values
[Expr1008] = Scalar Operator(COUNT([tempdb].[dbo].[Orders].[order_id])),
[Expr1009] = Scalar Operator(COUNT_BIG([tempdb].[dbo].[Orders].[total])),
[Expr1010] = Scalar Operator(SUM([tempdb].[dbo].[Orders].[total])),
[Expr1011] = Scalar Operator(COUNT_BIG([tempdb].[dbo].[Orders].[discount])),
[Expr1012] = Scalar Operator(SUM([tempdb].[dbo].[Orders].[discount]))
Expr1008
is the calculation of the COUNT
that you ask about.
There are some other COUNT
aggregates for the other two columns. These are needed because the correct result for SUM(total)
(for example) if COUNT(total)
is 0
should be NULL
.
This is carried out by the next compute scalar along (2). This also converts the COUNT
result (Expr1008
) from bigint
to int
and labels that as Expr1003
[Expr1003] = Scalar Operator(CONVERT_IMPLICIT(int,[Expr1008],0)),
[Expr1004] = Scalar Operator(CASE WHEN [Expr1009]=(0) THEN NULL ELSE [Expr1010] END),
[Expr1005] = Scalar Operator(CASE WHEN [Expr1011]=(0) THEN NULL ELSE [Expr1012] END)
Finally the left most compute scalar (3) uses Expr1003
in the division operation...
[Expr1006] = Scalar Operator([Expr1004]/[Expr1003]),
[Expr1007] = Scalar Operator([Expr1005]/[Expr1003])
... and outputs columns Expr1003, Expr1006, Expr1007
as the final result
PS: AVG
has been supported much longer than SQL Server 2008. I imagine it has likely been available in the beginning. However it does not have the same semantics as your rewrite in the presence of NULL
s anyway.
I assume order_id
is the primary key and therefore not nullable but for a table with 10 orders and two NOT NULL
total
values of 2
and 4
then AVG(total)
would be 3
but SUM(total) / COUNT(order_id)
would be 0.6
(or 0
once integer division is taken into account).
add a comment |
It should be calculated only once. Here's little test that shows that even if the aggregate calls a UDF, it's not re-executed in every expression in which it appears.
use tempdb
go
create or alter function slow(@ms int)
returns int
as
begin
declare @buf varbinary(max) = cast( replicate(cast(0x as nvarchar(max)),1000*1000*20) as varbinary(max))
declare @start datetime2 = sysdatetime()
while datediff(ms,@start,sysdatetime()) < @ms
begin
declare @hash binary(256) = hashbytes('sha_256',@buf)
end
return 42;
end
go
set statistics time on
go
select max(dbo.slow(10)) a
from sys.objects
select max(dbo.slow(10)) a, sin(max(dbo.slow(10))) b, max(dbo.slow(10))/500 c, count(*) + max(dbo.slow(10)) d
from sys.objects
add a comment |
You are trying to optimize for things that aren't necessary to optimize. Consider these queries:
SET STATISTICS IO ON;
SELECT SUM(OrderQty)/COUNT(SalesOrderID) FROM SalesLT.SalesOrderDetail; -- scans once
SELECT COUNT(SalesOrderID),
SUM(OrderQty)/COUNT(SalesOrderID),
SUM(ProductID)/COUNT(SalesOrderID),
SUM(UnitPriceDiscount)/COUNT(SalesOrderID),
MIN(OrderQty)/COUNT(SalesOrderID),
MAX(OrderQty)/COUNT(SalesOrderID)
FROM SalesLT.SalesOrderDetail; -- still only scans once
DECLARE @order_count AS INT
SELECT @order_count = COUNT(SalesOrderID)
FROM SalesLT.SalesOrderDetail; -- scans once
SELECT @order_count,
SUM(OrderQty)/@order_count,
SUM(ProductID)/@order_count,
SUM(UnitPriceDiscount)/@order_count,
MIN(OrderQty)/@order_count,
MAX(OrderQty)/@order_count
FROM SalesLT.SalesOrderDetail; -- has to scan again
The cost of scanning the entire table once just to get the count and then scanning again to use that count in operations within the second scan is definitely more trouble than it's worth. This is obvious in both the plans and the stats I/O, even on a very small table:
-- first query - 7 reads
Table 'SalesOrderDetail'. Scan count 1, logical reads 7, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
-- second query - still just 7 reads
Table 'SalesOrderDetail'. Scan count 1, logical reads 7, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
-- variable + last two queries - 11 reads
Table 'SalesOrderDetail'. Scan count 1, logical reads 4, physical reads 1, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'SalesOrderDetail'. Scan count 1, logical reads 7, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
If you're going to be reusing the count in multiple queries, queries against different tables, or queries where you're not otherwise going to have to scan (joins, filtered queries etc.), then it's a different question, potentially. But you shouldn't be concerned that the count will happen more than once during a single scan; as others have demonstrated, it won't. To justify pulling that count out of the query you'd have to have other scenarios happening.
That's what I was hoping, but wasn't sure of. :) I grew up in MySQL and only recently have started doing any serious work with SQL Server, and I often find SQL Server query plans overwhelming compared to what I'm used to looking at in MySQL so I'm still getting used to inspecting them.
– DavidScherer
Feb 4 at 18:35
1
@DavidScherer you will come to appreciate the power of a real query optimizer!
– usr
Feb 5 at 9:40
add a comment |
Your Answer
StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "182"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);
else
createEditor();
);
function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);
);
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f228868%2fdoes-sql-server-cache-aggregate-results-when-duplicated-across-columns%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
3 Answers
3
active
oldest
votes
3 Answers
3
active
oldest
votes
active
oldest
votes
active
oldest
votes
SQL Server only calculates the COUNT
once. You can see this by looking at the properties of the execution plan for
create table Orders(order_id int, total int, discount int)
SELECT
COUNT(order_id) AS num_orders
, SUM(total) / COUNT(order_id) as avg_total
, SUM(discount) / COUNT(order_id) AS avg_discount
FROM Orders
The stream aggregate (1) has the following defined values
[Expr1008] = Scalar Operator(COUNT([tempdb].[dbo].[Orders].[order_id])),
[Expr1009] = Scalar Operator(COUNT_BIG([tempdb].[dbo].[Orders].[total])),
[Expr1010] = Scalar Operator(SUM([tempdb].[dbo].[Orders].[total])),
[Expr1011] = Scalar Operator(COUNT_BIG([tempdb].[dbo].[Orders].[discount])),
[Expr1012] = Scalar Operator(SUM([tempdb].[dbo].[Orders].[discount]))
Expr1008
is the calculation of the COUNT
that you ask about.
There are some other COUNT
aggregates for the other two columns. These are needed because the correct result for SUM(total)
(for example) if COUNT(total)
is 0
should be NULL
.
This is carried out by the next compute scalar along (2). This also converts the COUNT
result (Expr1008
) from bigint
to int
and labels that as Expr1003
[Expr1003] = Scalar Operator(CONVERT_IMPLICIT(int,[Expr1008],0)),
[Expr1004] = Scalar Operator(CASE WHEN [Expr1009]=(0) THEN NULL ELSE [Expr1010] END),
[Expr1005] = Scalar Operator(CASE WHEN [Expr1011]=(0) THEN NULL ELSE [Expr1012] END)
Finally the left most compute scalar (3) uses Expr1003
in the division operation...
[Expr1006] = Scalar Operator([Expr1004]/[Expr1003]),
[Expr1007] = Scalar Operator([Expr1005]/[Expr1003])
... and outputs columns Expr1003, Expr1006, Expr1007
as the final result
PS: AVG
has been supported much longer than SQL Server 2008. I imagine it has likely been available in the beginning. However it does not have the same semantics as your rewrite in the presence of NULL
s anyway.
I assume order_id
is the primary key and therefore not nullable but for a table with 10 orders and two NOT NULL
total
values of 2
and 4
then AVG(total)
would be 3
but SUM(total) / COUNT(order_id)
would be 0.6
(or 0
once integer division is taken into account).
add a comment |
SQL Server only calculates the COUNT
once. You can see this by looking at the properties of the execution plan for
create table Orders(order_id int, total int, discount int)
SELECT
COUNT(order_id) AS num_orders
, SUM(total) / COUNT(order_id) as avg_total
, SUM(discount) / COUNT(order_id) AS avg_discount
FROM Orders
The stream aggregate (1) has the following defined values
[Expr1008] = Scalar Operator(COUNT([tempdb].[dbo].[Orders].[order_id])),
[Expr1009] = Scalar Operator(COUNT_BIG([tempdb].[dbo].[Orders].[total])),
[Expr1010] = Scalar Operator(SUM([tempdb].[dbo].[Orders].[total])),
[Expr1011] = Scalar Operator(COUNT_BIG([tempdb].[dbo].[Orders].[discount])),
[Expr1012] = Scalar Operator(SUM([tempdb].[dbo].[Orders].[discount]))
Expr1008
is the calculation of the COUNT
that you ask about.
There are some other COUNT
aggregates for the other two columns. These are needed because the correct result for SUM(total)
(for example) if COUNT(total)
is 0
should be NULL
.
This is carried out by the next compute scalar along (2). This also converts the COUNT
result (Expr1008
) from bigint
to int
and labels that as Expr1003
[Expr1003] = Scalar Operator(CONVERT_IMPLICIT(int,[Expr1008],0)),
[Expr1004] = Scalar Operator(CASE WHEN [Expr1009]=(0) THEN NULL ELSE [Expr1010] END),
[Expr1005] = Scalar Operator(CASE WHEN [Expr1011]=(0) THEN NULL ELSE [Expr1012] END)
Finally the left most compute scalar (3) uses Expr1003
in the division operation...
[Expr1006] = Scalar Operator([Expr1004]/[Expr1003]),
[Expr1007] = Scalar Operator([Expr1005]/[Expr1003])
... and outputs columns Expr1003, Expr1006, Expr1007
as the final result
PS: AVG
has been supported much longer than SQL Server 2008. I imagine it has likely been available in the beginning. However it does not have the same semantics as your rewrite in the presence of NULL
s anyway.
I assume order_id
is the primary key and therefore not nullable but for a table with 10 orders and two NOT NULL
total
values of 2
and 4
then AVG(total)
would be 3
but SUM(total) / COUNT(order_id)
would be 0.6
(or 0
once integer division is taken into account).
add a comment |
SQL Server only calculates the COUNT
once. You can see this by looking at the properties of the execution plan for
create table Orders(order_id int, total int, discount int)
SELECT
COUNT(order_id) AS num_orders
, SUM(total) / COUNT(order_id) as avg_total
, SUM(discount) / COUNT(order_id) AS avg_discount
FROM Orders
The stream aggregate (1) has the following defined values
[Expr1008] = Scalar Operator(COUNT([tempdb].[dbo].[Orders].[order_id])),
[Expr1009] = Scalar Operator(COUNT_BIG([tempdb].[dbo].[Orders].[total])),
[Expr1010] = Scalar Operator(SUM([tempdb].[dbo].[Orders].[total])),
[Expr1011] = Scalar Operator(COUNT_BIG([tempdb].[dbo].[Orders].[discount])),
[Expr1012] = Scalar Operator(SUM([tempdb].[dbo].[Orders].[discount]))
Expr1008
is the calculation of the COUNT
that you ask about.
There are some other COUNT
aggregates for the other two columns. These are needed because the correct result for SUM(total)
(for example) if COUNT(total)
is 0
should be NULL
.
This is carried out by the next compute scalar along (2). This also converts the COUNT
result (Expr1008
) from bigint
to int
and labels that as Expr1003
[Expr1003] = Scalar Operator(CONVERT_IMPLICIT(int,[Expr1008],0)),
[Expr1004] = Scalar Operator(CASE WHEN [Expr1009]=(0) THEN NULL ELSE [Expr1010] END),
[Expr1005] = Scalar Operator(CASE WHEN [Expr1011]=(0) THEN NULL ELSE [Expr1012] END)
Finally the left most compute scalar (3) uses Expr1003
in the division operation...
[Expr1006] = Scalar Operator([Expr1004]/[Expr1003]),
[Expr1007] = Scalar Operator([Expr1005]/[Expr1003])
... and outputs columns Expr1003, Expr1006, Expr1007
as the final result
PS: AVG
has been supported much longer than SQL Server 2008. I imagine it has likely been available in the beginning. However it does not have the same semantics as your rewrite in the presence of NULL
s anyway.
I assume order_id
is the primary key and therefore not nullable but for a table with 10 orders and two NOT NULL
total
values of 2
and 4
then AVG(total)
would be 3
but SUM(total) / COUNT(order_id)
would be 0.6
(or 0
once integer division is taken into account).
SQL Server only calculates the COUNT
once. You can see this by looking at the properties of the execution plan for
create table Orders(order_id int, total int, discount int)
SELECT
COUNT(order_id) AS num_orders
, SUM(total) / COUNT(order_id) as avg_total
, SUM(discount) / COUNT(order_id) AS avg_discount
FROM Orders
The stream aggregate (1) has the following defined values
[Expr1008] = Scalar Operator(COUNT([tempdb].[dbo].[Orders].[order_id])),
[Expr1009] = Scalar Operator(COUNT_BIG([tempdb].[dbo].[Orders].[total])),
[Expr1010] = Scalar Operator(SUM([tempdb].[dbo].[Orders].[total])),
[Expr1011] = Scalar Operator(COUNT_BIG([tempdb].[dbo].[Orders].[discount])),
[Expr1012] = Scalar Operator(SUM([tempdb].[dbo].[Orders].[discount]))
Expr1008
is the calculation of the COUNT
that you ask about.
There are some other COUNT
aggregates for the other two columns. These are needed because the correct result for SUM(total)
(for example) if COUNT(total)
is 0
should be NULL
.
This is carried out by the next compute scalar along (2). This also converts the COUNT
result (Expr1008
) from bigint
to int
and labels that as Expr1003
[Expr1003] = Scalar Operator(CONVERT_IMPLICIT(int,[Expr1008],0)),
[Expr1004] = Scalar Operator(CASE WHEN [Expr1009]=(0) THEN NULL ELSE [Expr1010] END),
[Expr1005] = Scalar Operator(CASE WHEN [Expr1011]=(0) THEN NULL ELSE [Expr1012] END)
Finally the left most compute scalar (3) uses Expr1003
in the division operation...
[Expr1006] = Scalar Operator([Expr1004]/[Expr1003]),
[Expr1007] = Scalar Operator([Expr1005]/[Expr1003])
... and outputs columns Expr1003, Expr1006, Expr1007
as the final result
PS: AVG
has been supported much longer than SQL Server 2008. I imagine it has likely been available in the beginning. However it does not have the same semantics as your rewrite in the presence of NULL
s anyway.
I assume order_id
is the primary key and therefore not nullable but for a table with 10 orders and two NOT NULL
total
values of 2
and 4
then AVG(total)
would be 3
but SUM(total) / COUNT(order_id)
would be 0.6
(or 0
once integer division is taken into account).
edited Feb 4 at 22:25
answered Feb 4 at 18:13
Martin SmithMartin Smith
63.4k10170254
63.4k10170254
add a comment |
add a comment |
It should be calculated only once. Here's little test that shows that even if the aggregate calls a UDF, it's not re-executed in every expression in which it appears.
use tempdb
go
create or alter function slow(@ms int)
returns int
as
begin
declare @buf varbinary(max) = cast( replicate(cast(0x as nvarchar(max)),1000*1000*20) as varbinary(max))
declare @start datetime2 = sysdatetime()
while datediff(ms,@start,sysdatetime()) < @ms
begin
declare @hash binary(256) = hashbytes('sha_256',@buf)
end
return 42;
end
go
set statistics time on
go
select max(dbo.slow(10)) a
from sys.objects
select max(dbo.slow(10)) a, sin(max(dbo.slow(10))) b, max(dbo.slow(10))/500 c, count(*) + max(dbo.slow(10)) d
from sys.objects
add a comment |
It should be calculated only once. Here's little test that shows that even if the aggregate calls a UDF, it's not re-executed in every expression in which it appears.
use tempdb
go
create or alter function slow(@ms int)
returns int
as
begin
declare @buf varbinary(max) = cast( replicate(cast(0x as nvarchar(max)),1000*1000*20) as varbinary(max))
declare @start datetime2 = sysdatetime()
while datediff(ms,@start,sysdatetime()) < @ms
begin
declare @hash binary(256) = hashbytes('sha_256',@buf)
end
return 42;
end
go
set statistics time on
go
select max(dbo.slow(10)) a
from sys.objects
select max(dbo.slow(10)) a, sin(max(dbo.slow(10))) b, max(dbo.slow(10))/500 c, count(*) + max(dbo.slow(10)) d
from sys.objects
add a comment |
It should be calculated only once. Here's little test that shows that even if the aggregate calls a UDF, it's not re-executed in every expression in which it appears.
use tempdb
go
create or alter function slow(@ms int)
returns int
as
begin
declare @buf varbinary(max) = cast( replicate(cast(0x as nvarchar(max)),1000*1000*20) as varbinary(max))
declare @start datetime2 = sysdatetime()
while datediff(ms,@start,sysdatetime()) < @ms
begin
declare @hash binary(256) = hashbytes('sha_256',@buf)
end
return 42;
end
go
set statistics time on
go
select max(dbo.slow(10)) a
from sys.objects
select max(dbo.slow(10)) a, sin(max(dbo.slow(10))) b, max(dbo.slow(10))/500 c, count(*) + max(dbo.slow(10)) d
from sys.objects
It should be calculated only once. Here's little test that shows that even if the aggregate calls a UDF, it's not re-executed in every expression in which it appears.
use tempdb
go
create or alter function slow(@ms int)
returns int
as
begin
declare @buf varbinary(max) = cast( replicate(cast(0x as nvarchar(max)),1000*1000*20) as varbinary(max))
declare @start datetime2 = sysdatetime()
while datediff(ms,@start,sysdatetime()) < @ms
begin
declare @hash binary(256) = hashbytes('sha_256',@buf)
end
return 42;
end
go
set statistics time on
go
select max(dbo.slow(10)) a
from sys.objects
select max(dbo.slow(10)) a, sin(max(dbo.slow(10))) b, max(dbo.slow(10))/500 c, count(*) + max(dbo.slow(10)) d
from sys.objects
answered Feb 4 at 18:11
David Browne - MicrosoftDavid Browne - Microsoft
11.9k729
11.9k729
add a comment |
add a comment |
You are trying to optimize for things that aren't necessary to optimize. Consider these queries:
SET STATISTICS IO ON;
SELECT SUM(OrderQty)/COUNT(SalesOrderID) FROM SalesLT.SalesOrderDetail; -- scans once
SELECT COUNT(SalesOrderID),
SUM(OrderQty)/COUNT(SalesOrderID),
SUM(ProductID)/COUNT(SalesOrderID),
SUM(UnitPriceDiscount)/COUNT(SalesOrderID),
MIN(OrderQty)/COUNT(SalesOrderID),
MAX(OrderQty)/COUNT(SalesOrderID)
FROM SalesLT.SalesOrderDetail; -- still only scans once
DECLARE @order_count AS INT
SELECT @order_count = COUNT(SalesOrderID)
FROM SalesLT.SalesOrderDetail; -- scans once
SELECT @order_count,
SUM(OrderQty)/@order_count,
SUM(ProductID)/@order_count,
SUM(UnitPriceDiscount)/@order_count,
MIN(OrderQty)/@order_count,
MAX(OrderQty)/@order_count
FROM SalesLT.SalesOrderDetail; -- has to scan again
The cost of scanning the entire table once just to get the count and then scanning again to use that count in operations within the second scan is definitely more trouble than it's worth. This is obvious in both the plans and the stats I/O, even on a very small table:
-- first query - 7 reads
Table 'SalesOrderDetail'. Scan count 1, logical reads 7, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
-- second query - still just 7 reads
Table 'SalesOrderDetail'. Scan count 1, logical reads 7, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
-- variable + last two queries - 11 reads
Table 'SalesOrderDetail'. Scan count 1, logical reads 4, physical reads 1, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'SalesOrderDetail'. Scan count 1, logical reads 7, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
If you're going to be reusing the count in multiple queries, queries against different tables, or queries where you're not otherwise going to have to scan (joins, filtered queries etc.), then it's a different question, potentially. But you shouldn't be concerned that the count will happen more than once during a single scan; as others have demonstrated, it won't. To justify pulling that count out of the query you'd have to have other scenarios happening.
That's what I was hoping, but wasn't sure of. :) I grew up in MySQL and only recently have started doing any serious work with SQL Server, and I often find SQL Server query plans overwhelming compared to what I'm used to looking at in MySQL so I'm still getting used to inspecting them.
– DavidScherer
Feb 4 at 18:35
1
@DavidScherer you will come to appreciate the power of a real query optimizer!
– usr
Feb 5 at 9:40
add a comment |
You are trying to optimize for things that aren't necessary to optimize. Consider these queries:
SET STATISTICS IO ON;
SELECT SUM(OrderQty)/COUNT(SalesOrderID) FROM SalesLT.SalesOrderDetail; -- scans once
SELECT COUNT(SalesOrderID),
SUM(OrderQty)/COUNT(SalesOrderID),
SUM(ProductID)/COUNT(SalesOrderID),
SUM(UnitPriceDiscount)/COUNT(SalesOrderID),
MIN(OrderQty)/COUNT(SalesOrderID),
MAX(OrderQty)/COUNT(SalesOrderID)
FROM SalesLT.SalesOrderDetail; -- still only scans once
DECLARE @order_count AS INT
SELECT @order_count = COUNT(SalesOrderID)
FROM SalesLT.SalesOrderDetail; -- scans once
SELECT @order_count,
SUM(OrderQty)/@order_count,
SUM(ProductID)/@order_count,
SUM(UnitPriceDiscount)/@order_count,
MIN(OrderQty)/@order_count,
MAX(OrderQty)/@order_count
FROM SalesLT.SalesOrderDetail; -- has to scan again
The cost of scanning the entire table once just to get the count and then scanning again to use that count in operations within the second scan is definitely more trouble than it's worth. This is obvious in both the plans and the stats I/O, even on a very small table:
-- first query - 7 reads
Table 'SalesOrderDetail'. Scan count 1, logical reads 7, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
-- second query - still just 7 reads
Table 'SalesOrderDetail'. Scan count 1, logical reads 7, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
-- variable + last two queries - 11 reads
Table 'SalesOrderDetail'. Scan count 1, logical reads 4, physical reads 1, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'SalesOrderDetail'. Scan count 1, logical reads 7, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
If you're going to be reusing the count in multiple queries, queries against different tables, or queries where you're not otherwise going to have to scan (joins, filtered queries etc.), then it's a different question, potentially. But you shouldn't be concerned that the count will happen more than once during a single scan; as others have demonstrated, it won't. To justify pulling that count out of the query you'd have to have other scenarios happening.
That's what I was hoping, but wasn't sure of. :) I grew up in MySQL and only recently have started doing any serious work with SQL Server, and I often find SQL Server query plans overwhelming compared to what I'm used to looking at in MySQL so I'm still getting used to inspecting them.
– DavidScherer
Feb 4 at 18:35
1
@DavidScherer you will come to appreciate the power of a real query optimizer!
– usr
Feb 5 at 9:40
add a comment |
You are trying to optimize for things that aren't necessary to optimize. Consider these queries:
SET STATISTICS IO ON;
SELECT SUM(OrderQty)/COUNT(SalesOrderID) FROM SalesLT.SalesOrderDetail; -- scans once
SELECT COUNT(SalesOrderID),
SUM(OrderQty)/COUNT(SalesOrderID),
SUM(ProductID)/COUNT(SalesOrderID),
SUM(UnitPriceDiscount)/COUNT(SalesOrderID),
MIN(OrderQty)/COUNT(SalesOrderID),
MAX(OrderQty)/COUNT(SalesOrderID)
FROM SalesLT.SalesOrderDetail; -- still only scans once
DECLARE @order_count AS INT
SELECT @order_count = COUNT(SalesOrderID)
FROM SalesLT.SalesOrderDetail; -- scans once
SELECT @order_count,
SUM(OrderQty)/@order_count,
SUM(ProductID)/@order_count,
SUM(UnitPriceDiscount)/@order_count,
MIN(OrderQty)/@order_count,
MAX(OrderQty)/@order_count
FROM SalesLT.SalesOrderDetail; -- has to scan again
The cost of scanning the entire table once just to get the count and then scanning again to use that count in operations within the second scan is definitely more trouble than it's worth. This is obvious in both the plans and the stats I/O, even on a very small table:
-- first query - 7 reads
Table 'SalesOrderDetail'. Scan count 1, logical reads 7, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
-- second query - still just 7 reads
Table 'SalesOrderDetail'. Scan count 1, logical reads 7, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
-- variable + last two queries - 11 reads
Table 'SalesOrderDetail'. Scan count 1, logical reads 4, physical reads 1, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'SalesOrderDetail'. Scan count 1, logical reads 7, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
If you're going to be reusing the count in multiple queries, queries against different tables, or queries where you're not otherwise going to have to scan (joins, filtered queries etc.), then it's a different question, potentially. But you shouldn't be concerned that the count will happen more than once during a single scan; as others have demonstrated, it won't. To justify pulling that count out of the query you'd have to have other scenarios happening.
You are trying to optimize for things that aren't necessary to optimize. Consider these queries:
SET STATISTICS IO ON;
SELECT SUM(OrderQty)/COUNT(SalesOrderID) FROM SalesLT.SalesOrderDetail; -- scans once
SELECT COUNT(SalesOrderID),
SUM(OrderQty)/COUNT(SalesOrderID),
SUM(ProductID)/COUNT(SalesOrderID),
SUM(UnitPriceDiscount)/COUNT(SalesOrderID),
MIN(OrderQty)/COUNT(SalesOrderID),
MAX(OrderQty)/COUNT(SalesOrderID)
FROM SalesLT.SalesOrderDetail; -- still only scans once
DECLARE @order_count AS INT
SELECT @order_count = COUNT(SalesOrderID)
FROM SalesLT.SalesOrderDetail; -- scans once
SELECT @order_count,
SUM(OrderQty)/@order_count,
SUM(ProductID)/@order_count,
SUM(UnitPriceDiscount)/@order_count,
MIN(OrderQty)/@order_count,
MAX(OrderQty)/@order_count
FROM SalesLT.SalesOrderDetail; -- has to scan again
The cost of scanning the entire table once just to get the count and then scanning again to use that count in operations within the second scan is definitely more trouble than it's worth. This is obvious in both the plans and the stats I/O, even on a very small table:
-- first query - 7 reads
Table 'SalesOrderDetail'. Scan count 1, logical reads 7, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
-- second query - still just 7 reads
Table 'SalesOrderDetail'. Scan count 1, logical reads 7, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
-- variable + last two queries - 11 reads
Table 'SalesOrderDetail'. Scan count 1, logical reads 4, physical reads 1, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'SalesOrderDetail'. Scan count 1, logical reads 7, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
If you're going to be reusing the count in multiple queries, queries against different tables, or queries where you're not otherwise going to have to scan (joins, filtered queries etc.), then it's a different question, potentially. But you shouldn't be concerned that the count will happen more than once during a single scan; as others have demonstrated, it won't. To justify pulling that count out of the query you'd have to have other scenarios happening.
answered Feb 4 at 18:14
Aaron Bertrand♦Aaron Bertrand
152k18289489
152k18289489
That's what I was hoping, but wasn't sure of. :) I grew up in MySQL and only recently have started doing any serious work with SQL Server, and I often find SQL Server query plans overwhelming compared to what I'm used to looking at in MySQL so I'm still getting used to inspecting them.
– DavidScherer
Feb 4 at 18:35
1
@DavidScherer you will come to appreciate the power of a real query optimizer!
– usr
Feb 5 at 9:40
add a comment |
That's what I was hoping, but wasn't sure of. :) I grew up in MySQL and only recently have started doing any serious work with SQL Server, and I often find SQL Server query plans overwhelming compared to what I'm used to looking at in MySQL so I'm still getting used to inspecting them.
– DavidScherer
Feb 4 at 18:35
1
@DavidScherer you will come to appreciate the power of a real query optimizer!
– usr
Feb 5 at 9:40
That's what I was hoping, but wasn't sure of. :) I grew up in MySQL and only recently have started doing any serious work with SQL Server, and I often find SQL Server query plans overwhelming compared to what I'm used to looking at in MySQL so I'm still getting used to inspecting them.
– DavidScherer
Feb 4 at 18:35
That's what I was hoping, but wasn't sure of. :) I grew up in MySQL and only recently have started doing any serious work with SQL Server, and I often find SQL Server query plans overwhelming compared to what I'm used to looking at in MySQL so I'm still getting used to inspecting them.
– DavidScherer
Feb 4 at 18:35
1
1
@DavidScherer you will come to appreciate the power of a real query optimizer!
– usr
Feb 5 at 9:40
@DavidScherer you will come to appreciate the power of a real query optimizer!
– usr
Feb 5 at 9:40
add a comment |
Thanks for contributing an answer to Database Administrators Stack Exchange!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f228868%2fdoes-sql-server-cache-aggregate-results-when-duplicated-across-columns%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
See also.
– Jeroen Mostert
Feb 5 at 13:08