Does SQL Server Cache Aggregate Results When Duplicated Across Columns?

Suppose we have a table Orders containing the columns order_id, total, and discount.

Then we write a query similar to the following:



SELECT
COUNT(order_id) AS num_orders
, SUM(total) / COUNT(order_id) as avg_total
, SUM(discount) / COUNT(order_id) AS avg_discount
FROM Orders


Is the value for COUNT(order_id) preserved across columns or re-computed (i.e., is there a performance hit)? Or is it better to determine the computed value(s) first and use those in the query, for example:



DECLARE @order_count AS INT 

SELECT
@order_count = COUNT(order_id)
FROM Orders

SELECT
@order_count AS num_orders
, SUM(total) / @order_count as avg_total
, SUM(discount) / @order_count AS avg_discount
FROM Orders


Note that, while writing this question, I noticed that AVG() has been supported since SQL Server 2008. However, I kept the question because I intend it to be more general: I want to understand how SQL Server handles identical aggregates across columns, since I sometimes run into this in other forms.
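For reference, a sketch of the AVG() form of the same query (note that AVG ignores NULLs, so it is not always equivalent to dividing by COUNT(order_id)):

SELECT
COUNT(order_id) AS num_orders
, AVG(total) AS avg_total
, AVG(discount) AS avg_discount
FROM Orders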










sql-server sql-server-2012

asked Feb 4 at 17:37, edited Feb 4 at 17:54
DavidScherer


  • See also.

    – Jeroen Mostert
    Feb 5 at 13:08

3 Answers


SQL Server only calculates the COUNT once. You can see this by looking at the properties of the execution plan for:



create table Orders(order_id int, total int, discount int)


SELECT
COUNT(order_id) AS num_orders
, SUM(total) / COUNT(order_id) as avg_total
, SUM(discount) / COUNT(order_id) AS avg_discount
FROM Orders


(screenshot of the execution plan: a scan feeding a Stream Aggregate (1) and two Compute Scalar operators (2) and (3))



The stream aggregate (1) has the following defined values:



[Expr1008] = Scalar Operator(COUNT([tempdb].[dbo].[Orders].[order_id])), 
[Expr1009] = Scalar Operator(COUNT_BIG([tempdb].[dbo].[Orders].[total])),
[Expr1010] = Scalar Operator(SUM([tempdb].[dbo].[Orders].[total])),
[Expr1011] = Scalar Operator(COUNT_BIG([tempdb].[dbo].[Orders].[discount])),
[Expr1012] = Scalar Operator(SUM([tempdb].[dbo].[Orders].[discount]))


Expr1008 is the calculation of the COUNT that you ask about.



There are also COUNT_BIG aggregates for the other two columns. These are needed because the correct result for SUM(total) (for example) when COUNT(total) is 0 is NULL, not 0.



This is carried out by the next compute scalar along (2). This also converts the COUNT result (Expr1008) from bigint to int and labels it as Expr1003:



[Expr1003] = Scalar Operator(CONVERT_IMPLICIT(int,[Expr1008],0)),
[Expr1004] = Scalar Operator(CASE WHEN [Expr1009]=(0) THEN NULL ELSE [Expr1010] END),
[Expr1005] = Scalar Operator(CASE WHEN [Expr1011]=(0) THEN NULL ELSE [Expr1012] END)


Finally, the leftmost compute scalar (3) uses Expr1003 in the division operations...



[Expr1006] = Scalar Operator([Expr1004]/[Expr1003]), 
[Expr1007] = Scalar Operator([Expr1005]/[Expr1003])


... and outputs columns Expr1003, Expr1006, and Expr1007 as the final result.



PS: AVG has been supported for much longer than SQL Server 2008; I imagine it has been available since the beginning. However, it does not have the same semantics as your rewrite in the presence of NULLs anyway.



I assume order_id is the primary key and therefore not nullable, but for a table with 10 orders and only two non-NULL total values of 2 and 4, AVG(total) would be 3 while SUM(total) / COUNT(order_id) would be 0.6 (or 0 once integer division is taken into account).
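To make that concrete, here is a minimal sketch of that scenario against a hypothetical #Orders temp table (10 rows, two non-NULL totals):

CREATE TABLE #Orders (order_id int NOT NULL, total int NULL);

INSERT INTO #Orders (order_id, total)
VALUES (1, 2), (2, 4), (3, NULL), (4, NULL), (5, NULL),
(6, NULL), (7, NULL), (8, NULL), (9, NULL), (10, NULL);

SELECT
AVG(total) AS avg_total -- 3: AVG ignores the NULLs (6 / 2)
, SUM(total) / COUNT(order_id) AS sum_over_count -- 0: integer division of 6 / 10
, SUM(total) * 1.0 / COUNT(order_id) AS sum_over_count_decimal -- 0.6
FROM #Orders;

DROP TABLE #Orders;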






edited Feb 4 at 22:25, answered Feb 4 at 18:13
Martin Smith


It should be calculated only once. Here's a little test that shows that even if the aggregate calls a UDF, it's not re-executed in every expression in which it appears.



use tempdb
go
-- a deliberately slow scalar UDF: spins for @ms milliseconds, then returns 42
create or alter function slow(@ms int)
returns int
as
begin
    declare @buf varbinary(max) = cast(replicate(cast(0x as nvarchar(max)), 1000*1000*20) as varbinary(max))
    declare @start datetime2 = sysdatetime()

    -- burn CPU by hashing until @ms milliseconds have elapsed
    while datediff(ms, @start, sysdatetime()) < @ms
    begin
        declare @hash binary(256) = hashbytes('SHA2_256', @buf) -- 'SHA2_256' is the supported algorithm name
    end
    return 42;
end
go
set statistics time on
go
select max(dbo.slow(10)) a
from sys.objects

select max(dbo.slow(10)) a, sin(max(dbo.slow(10))) b, max(dbo.slow(10))/500 c, count(*) + max(dbo.slow(10)) d
from sys.objects
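
If the UDF were re-executed for each expression, the second query would be expected to take roughly four times as long as the first, since dbo.slow(10) appears in four expressions; comparable elapsed times from SET STATISTICS TIME are what indicate the aggregate result is computed once and reused.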





answered Feb 4 at 18:11
David Browne - Microsoft



You are trying to optimize for things that aren't necessary to optimize. Consider these queries:

SET STATISTICS IO ON;

SELECT SUM(OrderQty)/COUNT(SalesOrderID) FROM SalesLT.SalesOrderDetail; -- scans once

SELECT COUNT(SalesOrderID),
SUM(OrderQty)/COUNT(SalesOrderID),
SUM(ProductID)/COUNT(SalesOrderID),
SUM(UnitPriceDiscount)/COUNT(SalesOrderID),
MIN(OrderQty)/COUNT(SalesOrderID),
MAX(OrderQty)/COUNT(SalesOrderID)
FROM SalesLT.SalesOrderDetail; -- still only scans once

DECLARE @order_count AS INT
SELECT @order_count = COUNT(SalesOrderID)
FROM SalesLT.SalesOrderDetail; -- scans once

SELECT @order_count,
SUM(OrderQty)/@order_count,
SUM(ProductID)/@order_count,
SUM(UnitPriceDiscount)/@order_count,
MIN(OrderQty)/@order_count,
MAX(OrderQty)/@order_count
FROM SalesLT.SalesOrderDetail; -- has to scan again


The cost of scanning the entire table once just to get the count, and then scanning it again to use that count in a second query, is definitely more trouble than it's worth. This is obvious in both the plans and the stats I/O, even on a very small table:

-- first query - 7 reads
Table 'SalesOrderDetail'. Scan count 1, logical reads 7, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

-- second query - still just 7 reads
Table 'SalesOrderDetail'. Scan count 1, logical reads 7, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

-- variable + last two queries - 11 reads
Table 'SalesOrderDetail'. Scan count 1, logical reads 4, physical reads 1, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'SalesOrderDetail'. Scan count 1, logical reads 7, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.


If you're going to be reusing the count in multiple queries, queries against different tables, or queries where you're not otherwise going to have to scan (joins, filtered queries, etc.), then it's potentially a different question. But you shouldn't be concerned that the count will happen more than once during a single scan; as others have demonstrated, it won't. To justify pulling that count out of the query you'd need other scenarios like those in play.






answered Feb 4 at 18:14
Aaron Bertrand


  • That's what I was hoping, but wasn't sure of. :) I grew up in MySQL and only recently have started doing any serious work with SQL Server, and I often find SQL Server query plans overwhelming compared to what I'm used to looking at in MySQL so I'm still getting used to inspecting them.

    – DavidScherer
    Feb 4 at 18:35

  • @DavidScherer you will come to appreciate the power of a real query optimizer!

    – usr
    Feb 5 at 9:40









