Sort spills to tempdb due to varchar(max)

The name of the pictureThe name of the pictureThe name of the pictureClash Royale CLAN TAG#URR8PPP












9















On a server with 32GB we are running SQL Server 2014 SP2 with a max memory of 25GB we have two tables, here you find a simplified structure of both tables:



CREATE TABLE [dbo].[Settings](
[id] [int] IDENTITY(1,1) NOT NULL,
[resourceId] [int] NULL,
[typeID] [int] NULL,
[remark] [varchar](max) NULL,
CONSTRAINT [PK_Settings] PRIMARY KEY CLUSTERED ([id] ASC)
) ON [PRIMARY]
GO

CREATE TABLE [dbo].[Resources](
[id] [int] IDENTITY(1,1) NOT NULL,
[resourceUID] [int] NULL,
CONSTRAINT [PK_Resources] PRIMARY KEY CLUSTERED ([id] ASC)
) ON [PRIMARY]
GO


with following non-clustered indexes:



CREATE NONCLUSTERED INDEX [IX_UID] ON [dbo].[Resources]
(
[resourceUID] ASC
)

CREATE NONCLUSTERED INDEX [IX_Test] ON [dbo].[Settings]
(
[resourceId] ASC,
[typeID] ASC
)


The database is configured with compatibility level 120.



When I run this query there are spills to tempdb.
This is how I execute the query:



exec sp_executesql N'
select r.id,remark
FROM Resources r
inner join Settings on resourceid=r.id
where resourceUID=@UID
ORDER BY typeID',
N'@UID int',
@UID=38


If don't select the [remark] field no spills occurs. My first reaction was that the spills occurred due to the low number of estimated rows on the nested-loop operator.



So I add 5 datetime and 5 integer columns to the settings table and add them to my select statement. When I execute the query no spills are happening.



Why are the spills only happening when [remark] is selected? It has probably something to do with the fact that this is a varchar(max).
What can I do to avoid spilling to tempdb?



Adding OPTION (RECOMPILE) to the query makes no difference.










share|improve this question
























  • May be you can try select r.id, LEFT(remark, 512) (or whatever sensible substring length might be).

    – mustaccio
    Jan 16 at 14:48











  • @Forrest: I'm trying to reproduce the data needed to simulate the problem. At first sight it has to do with the low estimate of the nested loop. In my dummy data the estimated number of rows is much higher and no spilling is happening

    – Frederik Vanderhaegen
    Jan 16 at 16:08















9















On a server with 32GB we are running SQL Server 2014 SP2 with a max memory of 25GB we have two tables, here you find a simplified structure of both tables:



CREATE TABLE [dbo].[Settings](
[id] [int] IDENTITY(1,1) NOT NULL,
[resourceId] [int] NULL,
[typeID] [int] NULL,
[remark] [varchar](max) NULL,
CONSTRAINT [PK_Settings] PRIMARY KEY CLUSTERED ([id] ASC)
) ON [PRIMARY]
GO

CREATE TABLE [dbo].[Resources](
[id] [int] IDENTITY(1,1) NOT NULL,
[resourceUID] [int] NULL,
CONSTRAINT [PK_Resources] PRIMARY KEY CLUSTERED ([id] ASC)
) ON [PRIMARY]
GO


with following non-clustered indexes:



CREATE NONCLUSTERED INDEX [IX_UID] ON [dbo].[Resources]
(
[resourceUID] ASC
)

CREATE NONCLUSTERED INDEX [IX_Test] ON [dbo].[Settings]
(
[resourceId] ASC,
[typeID] ASC
)


The database is configured with compatibility level 120.



When I run this query there are spills to tempdb.
This is how I execute the query:



exec sp_executesql N'
select r.id,remark
FROM Resources r
inner join Settings on resourceid=r.id
where resourceUID=@UID
ORDER BY typeID',
N'@UID int',
@UID=38


If don't select the [remark] field no spills occurs. My first reaction was that the spills occurred due to the low number of estimated rows on the nested-loop operator.



So I add 5 datetime and 5 integer columns to the settings table and add them to my select statement. When I execute the query no spills are happening.



Why are the spills only happening when [remark] is selected? It has probably something to do with the fact that this is a varchar(max).
What can I do to avoid spilling to tempdb?



Adding OPTION (RECOMPILE) to the query makes no difference.










share|improve this question
























  • May be you can try select r.id, LEFT(remark, 512) (or whatever sensible substring length might be).

    – mustaccio
    Jan 16 at 14:48











  • @Forrest: I'm trying to reproduce the data needed to simulate the problem. At first sight it has to do with the low estimate of the nested loop. In my dummy data the estimated number of rows is much higher and no spilling is happening

    – Frederik Vanderhaegen
    Jan 16 at 16:08













9












9








9


2






On a server with 32GB we are running SQL Server 2014 SP2 with a max memory of 25GB we have two tables, here you find a simplified structure of both tables:



CREATE TABLE [dbo].[Settings](
[id] [int] IDENTITY(1,1) NOT NULL,
[resourceId] [int] NULL,
[typeID] [int] NULL,
[remark] [varchar](max) NULL,
CONSTRAINT [PK_Settings] PRIMARY KEY CLUSTERED ([id] ASC)
) ON [PRIMARY]
GO

CREATE TABLE [dbo].[Resources](
[id] [int] IDENTITY(1,1) NOT NULL,
[resourceUID] [int] NULL,
CONSTRAINT [PK_Resources] PRIMARY KEY CLUSTERED ([id] ASC)
) ON [PRIMARY]
GO


with following non-clustered indexes:



CREATE NONCLUSTERED INDEX [IX_UID] ON [dbo].[Resources]
(
[resourceUID] ASC
)

CREATE NONCLUSTERED INDEX [IX_Test] ON [dbo].[Settings]
(
[resourceId] ASC,
[typeID] ASC
)


The database is configured with compatibility level 120.



When I run this query there are spills to tempdb.
This is how I execute the query:



exec sp_executesql N'
select r.id,remark
FROM Resources r
inner join Settings on resourceid=r.id
where resourceUID=@UID
ORDER BY typeID',
N'@UID int',
@UID=38


If don't select the [remark] field no spills occurs. My first reaction was that the spills occurred due to the low number of estimated rows on the nested-loop operator.



So I add 5 datetime and 5 integer columns to the settings table and add them to my select statement. When I execute the query no spills are happening.



Why are the spills only happening when [remark] is selected? It has probably something to do with the fact that this is a varchar(max).
What can I do to avoid spilling to tempdb?



Adding OPTION (RECOMPILE) to the query makes no difference.










share|improve this question
















On a server with 32GB we are running SQL Server 2014 SP2 with a max memory of 25GB we have two tables, here you find a simplified structure of both tables:



CREATE TABLE [dbo].[Settings](
[id] [int] IDENTITY(1,1) NOT NULL,
[resourceId] [int] NULL,
[typeID] [int] NULL,
[remark] [varchar](max) NULL,
CONSTRAINT [PK_Settings] PRIMARY KEY CLUSTERED ([id] ASC)
) ON [PRIMARY]
GO

CREATE TABLE [dbo].[Resources](
[id] [int] IDENTITY(1,1) NOT NULL,
[resourceUID] [int] NULL,
CONSTRAINT [PK_Resources] PRIMARY KEY CLUSTERED ([id] ASC)
) ON [PRIMARY]
GO


with following non-clustered indexes:



CREATE NONCLUSTERED INDEX [IX_UID] ON [dbo].[Resources]
(
[resourceUID] ASC
)

CREATE NONCLUSTERED INDEX [IX_Test] ON [dbo].[Settings]
(
[resourceId] ASC,
[typeID] ASC
)


The database is configured with compatibility level 120.



When I run this query there are spills to tempdb.
This is how I execute the query:



exec sp_executesql N'
select r.id,remark
FROM Resources r
inner join Settings on resourceid=r.id
where resourceUID=@UID
ORDER BY typeID',
N'@UID int',
@UID=38


If don't select the [remark] field no spills occurs. My first reaction was that the spills occurred due to the low number of estimated rows on the nested-loop operator.



So I add 5 datetime and 5 integer columns to the settings table and add them to my select statement. When I execute the query no spills are happening.



Why are the spills only happening when [remark] is selected? It has probably something to do with the fact that this is a varchar(max).
What can I do to avoid spilling to tempdb?



Adding OPTION (RECOMPILE) to the query makes no difference.







sql-server query-performance tempdb cardinality-estimates






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Jan 17 at 5:05









Paul White

50.7k14277447




50.7k14277447










asked Jan 16 at 14:10









Frederik VanderhaegenFrederik Vanderhaegen

7681317




7681317












  • May be you can try select r.id, LEFT(remark, 512) (or whatever sensible substring length might be).

    – mustaccio
    Jan 16 at 14:48











  • @Forrest: I'm trying to reproduce the data needed to simulate the problem. At first sight it has to do with the low estimate of the nested loop. In my dummy data the estimated number of rows is much higher and no spilling is happening

    – Frederik Vanderhaegen
    Jan 16 at 16:08

















  • May be you can try select r.id, LEFT(remark, 512) (or whatever sensible substring length might be).

    – mustaccio
    Jan 16 at 14:48











  • @Forrest: I'm trying to reproduce the data needed to simulate the problem. At first sight it has to do with the low estimate of the nested loop. In my dummy data the estimated number of rows is much higher and no spilling is happening

    – Frederik Vanderhaegen
    Jan 16 at 16:08
















May be you can try select r.id, LEFT(remark, 512) (or whatever sensible substring length might be).

– mustaccio
Jan 16 at 14:48





May be you can try select r.id, LEFT(remark, 512) (or whatever sensible substring length might be).

– mustaccio
Jan 16 at 14:48













@Forrest: I'm trying to reproduce the data needed to simulate the problem. At first sight it has to do with the low estimate of the nested loop. In my dummy data the estimated number of rows is much higher and no spilling is happening

– Frederik Vanderhaegen
Jan 16 at 16:08





@Forrest: I'm trying to reproduce the data needed to simulate the problem. At first sight it has to do with the low estimate of the nested loop. In my dummy data the estimated number of rows is much higher and no spilling is happening

– Frederik Vanderhaegen
Jan 16 at 16:08










3 Answers
3






active

oldest

votes


















9














There are going to be several possible workarounds here.



You can manually adjust the memory grant, though I probably wouldn't go that route.



You can also use a CTE and TOP to push the sort lower, before grabbing the max length column. It will look something like below.



WITH CTE AS (
SELECT TOP 1000000000 r.ID, s.ID AS ID2, s.typeID
FROM Resources r
inner join Settings s on resourceid=r.id
where resourceUID=@UID
ORDER BY s.typeID
)
SELECT c.ID, ca.remark
FROM CTE c
CROSS APPLY (SELECT remark FROM dbo.Settings s WHERE s.id = c.ID2) ca(remark)
ORDER BY c.typeID


Proof-of-concept dbfiddle here.
Sample data would still be appreciated!



If you want to read an excellent analysis by Paul White, read here.






share|improve this answer
































    6















    Why are the spills only happening when [remark] is selected?




    The spill is happening when you include that column because you don't get a big enough memory grant for the large string data being sorted.



    You don't get a big enough memory grant because the actual number of rows is 10x more than the estimated number of rows (1,302 actual vs 126 estimated).



    Why is the estimate off? Why does SQL Server think there's only one row in dbo.Settings with a resourceid of 38?



    It could be a statistics issue, which you can check by running DBCC SHOW_STATISTICS('dbo.Settings', 'IX_Test') and see the counts for that histogram step. But the execution plan seems to indicate that the stats are as complete and up-to-date as they could be.



    Since stats aren't helping, your best bet is probably a query rewrite - which Forrest has covered in his answer.






    share|improve this answer






























      2














      To me it appears that the where clause in the query is giving the issue, and is the cause of the low estimates, even if OPTION(RECOMPILE) is used.



      I created some test data, and in the end came up with two solutions, storing the ID field from resources in either a variable (if it is always unique) or a temp table, if we can have more than one ID's.



      Base test records



      SET NOCOUNT ON
      DECLARE @i int= 1;
      WHILE @i <= 10000
      BEGIN
      INSERT INTO [dbo].[Settings]([resourceId],[typeID],remark)
      VALUES(@i,@i,'KEPT THESE VALUES OUT BECAUSE IT WOULD CLUTTER THE EXAMPLES, VALUES OVER 8000 Chars entered here'); -- 23254 character length on each value
      INSERT INTO [dbo].[Resources](resourceUID)
      VALUES(@i);
      SET @i += 1;
      END


      Insert the 'Seek' values, as to get to the same approximate resultset as OP (1300 records)



      INSERT INTO [dbo].[Settings]([resourceId],[typeID],remark)
      VALUES(38,38,'KEPT THESE VALUES OUT BECAUSE IT WOULD CLUTTER THE EXAMPLES, VALUES OVER 8000 Chars entered here')
      GO 1300


      Change compat & Update statistics to match OP



      ALTER DATABASE StackOverflow SET COMPATIBILITY_LEVEL = 120;
      UPDATE STATISTICS settings WITH FULLSCAN;
      UPDATE STATISTICS resources WITH FULLSCAN;


      Original query



      exec sp_executesql N'
      select r.id
      FROM Resources r
      inner join Settings on resourceid=r.id
      where resourceUID=@UID
      ORDER BY typeID',
      N'@UID int',
      @UID=38


      My estimates are even worse, with one estimated row, while 1300 are returned.
      And like OP stated, it does not matter if I add OPTION(RECOMPILE)



      An important thing to note, is that when we get rid of the where clause the estimates are 100% correct, which is expected since we are using all data in both tables.



      I forced the indexes just to make sure we use the same ones as in the previous query, to prove the point



      exec sp_executesql N'
      select r.id,remark
      FROM Resources r with(index([IX_UID]))
      inner join Settings WITH(INDEX([IX_Test]))
      on resourceid=r.id
      ORDER BY typeID',
      N'@UID int',
      @UID=38


      As expected, good estimates.



      So, what could we change to get better estimates but still seek on our values?



      IF @UID is unique, as in the example OP gave, we could put the single id that was returned from resources in a variable, then seek on that variable with an OPTION(RECOMPILE)



      DECLARE @UID int =38 , @RID int;
      SELECT @RID=r.id from
      Resources r where resourceUID = @UID;

      SELECT @uid, remark
      from Settings
      where resourceId = @uid
      Order by typeID
      OPTION(RECOMPILE);


      Which gives 100% accurate estimates



      But what if there are multiple resourceUID's in resources?



      add some test data



      INSERT INTO Resources(ResourceUID)
      VALUES (38);
      go 50


      This could be resolved with a temp table



      CREATE TABLE #RID (id int)
      DECLARE @UID int =38
      INSERT INTO #RID
      SELECT r.id
      from
      Resources r where resourceUID = @UID

      SELECT @uid, remark
      from Settings s
      INNER JOIN #RID r
      ON r.id =s.resourceId
      Order by typeID
      OPTION(RECOMPILE)

      DROP TABLE #RID


      Again with accurate estimates.



      This was done with my own dataset,YMMV.




      Written with sp_executesql



      With a variable



      exec sp_executesql N'
      DECLARE @RID int;
      SELECT @RID=r.id from
      Resources r where resourceUID = @UID;

      SELECT @uid, remark
      from Settings
      where resourceId = @uid
      Order by typeID
      OPTION(RECOMPILE);',
      N'@UID int',
      @UID=38


      With a temp table



      exec sp_executesql N'

      CREATE TABLE #RID (id int)

      INSERT INTO #RID
      SELECT r.id
      from
      Resources r where resourceUID = @UID

      SELECT @uid, remark
      from Settings s
      INNER JOIN #RID r
      ON r.id =s.resourceId
      Order by typeID
      OPTION(RECOMPILE)

      DROP TABLE #RID',
      N'@UID int',
      @UID=38


      Still 100% correct estimates on my test






      share|improve this answer
























        Your Answer








        StackExchange.ready(function()
        var channelOptions =
        tags: "".split(" "),
        id: "182"
        ;
        initTagRenderer("".split(" "), "".split(" "), channelOptions);

        StackExchange.using("externalEditor", function()
        // Have to fire editor after snippets, if snippets enabled
        if (StackExchange.settings.snippets.snippetsEnabled)
        StackExchange.using("snippets", function()
        createEditor();
        );

        else
        createEditor();

        );

        function createEditor()
        StackExchange.prepareEditor(
        heartbeatType: 'answer',
        autoActivateHeartbeat: false,
        convertImagesToLinks: false,
        noModals: true,
        showLowRepImageUploadWarning: true,
        reputationToPostImages: null,
        bindNavPrevention: true,
        postfix: "",
        imageUploader:
        brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
        contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
        allowUrls: true
        ,
        onDemand: true,
        discardSelector: ".discard-answer"
        ,immediatelyShowMarkdownHelp:true
        );



        );













        draft saved

        draft discarded


















        StackExchange.ready(
        function ()
        StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f227288%2fsort-spills-to-tempdb-due-to-varcharmax%23new-answer', 'question_page');

        );

        Post as a guest















        Required, but never shown

























        3 Answers
        3






        active

        oldest

        votes








        3 Answers
        3






        active

        oldest

        votes









        active

        oldest

        votes






        active

        oldest

        votes









        9














        There are going to be several possible workarounds here.



        You can manually adjust the memory grant, though I probably wouldn't go that route.



        You can also use a CTE and TOP to push the sort lower, before grabbing the max length column. It will look something like below.



        WITH CTE AS (
        SELECT TOP 1000000000 r.ID, s.ID AS ID2, s.typeID
        FROM Resources r
        inner join Settings s on resourceid=r.id
        where resourceUID=@UID
        ORDER BY s.typeID
        )
        SELECT c.ID, ca.remark
        FROM CTE c
        CROSS APPLY (SELECT remark FROM dbo.Settings s WHERE s.id = c.ID2) ca(remark)
        ORDER BY c.typeID


        Proof-of-concept dbfiddle here.
        Sample data would still be appreciated!



        If you want to read an excellent analysis by Paul White, read here.






        share|improve this answer





























          9














          There are going to be several possible workarounds here.



          You can manually adjust the memory grant, though I probably wouldn't go that route.



          You can also use a CTE and TOP to push the sort lower, before grabbing the max length column. It will look something like below.



          WITH CTE AS (
          SELECT TOP 1000000000 r.ID, s.ID AS ID2, s.typeID
          FROM Resources r
          inner join Settings s on resourceid=r.id
          where resourceUID=@UID
          ORDER BY s.typeID
          )
          SELECT c.ID, ca.remark
          FROM CTE c
          CROSS APPLY (SELECT remark FROM dbo.Settings s WHERE s.id = c.ID2) ca(remark)
          ORDER BY c.typeID


          Proof-of-concept dbfiddle here.
          Sample data would still be appreciated!



          If you want to read an excellent analysis by Paul White, read here.






          share|improve this answer



























            9












            9








            9







            There are going to be several possible workarounds here.



            You can manually adjust the memory grant, though I probably wouldn't go that route.



            You can also use a CTE and TOP to push the sort lower, before grabbing the max length column. It will look something like below.



            WITH CTE AS (
            SELECT TOP 1000000000 r.ID, s.ID AS ID2, s.typeID
            FROM Resources r
            inner join Settings s on resourceid=r.id
            where resourceUID=@UID
            ORDER BY s.typeID
            )
            SELECT c.ID, ca.remark
            FROM CTE c
            CROSS APPLY (SELECT remark FROM dbo.Settings s WHERE s.id = c.ID2) ca(remark)
            ORDER BY c.typeID


            Proof-of-concept dbfiddle here.
            Sample data would still be appreciated!



            If you want to read an excellent analysis by Paul White, read here.






            share|improve this answer















            There are going to be several possible workarounds here.



            You can manually adjust the memory grant, though I probably wouldn't go that route.



            You can also use a CTE and TOP to push the sort lower, before grabbing the max length column. It will look something like below.



            WITH CTE AS (
            SELECT TOP 1000000000 r.ID, s.ID AS ID2, s.typeID
            FROM Resources r
            inner join Settings s on resourceid=r.id
            where resourceUID=@UID
            ORDER BY s.typeID
            )
            SELECT c.ID, ca.remark
            FROM CTE c
            CROSS APPLY (SELECT remark FROM dbo.Settings s WHERE s.id = c.ID2) ca(remark)
            ORDER BY c.typeID


            Proof-of-concept dbfiddle here.
            Sample data would still be appreciated!



            If you want to read an excellent analysis by Paul White, read here.







            share|improve this answer














            share|improve this answer



            share|improve this answer








            edited Jan 16 at 20:45

























            answered Jan 16 at 14:58









            ForrestForrest

            2,0471518




            2,0471518























                6















                Why are the spills only happening when [remark] is selected?




                The spill is happening when you include that column because you don't get a big enough memory grant for the large string data being sorted.



                You don't get a big enough memory grant because the actual number of rows is 10x more than the estimated number of rows (1,302 actual vs 126 estimated).



                Why is the estimate off? Why does SQL Server think there's only one row in dbo.Settings with a resourceid of 38?



                It could be a statistics issue, which you can check by running DBCC SHOW_STATISTICS('dbo.Settings', 'IX_Test') and see the counts for that histogram step. But the execution plan seems to indicate that the stats are as complete and up-to-date as they could be.



                Since stats aren't helping, your best bet is probably a query rewrite - which Forrest has covered in his answer.






                share|improve this answer



























                  6















                  Why are the spills only happening when [remark] is selected?




                  The spill is happening when you include that column because you don't get a big enough memory grant for the large string data being sorted.



                  You don't get a big enough memory grant because the actual number of rows is 10x more than the estimated number of rows (1,302 actual vs 126 estimated).



                  Why is the estimate off? Why does SQL Server think there's only one row in dbo.Settings with a resourceid of 38?



                  It could be a statistics issue, which you can check by running DBCC SHOW_STATISTICS('dbo.Settings', 'IX_Test') and see the counts for that histogram step. But the execution plan seems to indicate that the stats are as complete and up-to-date as they could be.



                  Since stats aren't helping, your best bet is probably a query rewrite - which Forrest has covered in his answer.






                  share|improve this answer

























                    6












                    6








                    6








                    Why are the spills only happening when [remark] is selected?




                    The spill is happening when you include that column because you don't get a big enough memory grant for the large string data being sorted.



                    You don't get a big enough memory grant because the actual number of rows is 10x more than the estimated number of rows (1,302 actual vs 126 estimated).



                    Why is the estimate off? Why does SQL Server think there's only one row in dbo.Settings with a resourceid of 38?



                    It could be a statistics issue, which you can check by running DBCC SHOW_STATISTICS('dbo.Settings', 'IX_Test') and see the counts for that histogram step. But the execution plan seems to indicate that the stats are as complete and up-to-date as they could be.



                    Since stats aren't helping, your best bet is probably a query rewrite - which Forrest has covered in his answer.






                    share|improve this answer














                    Why are the spills only happening when [remark] is selected?




                    The spill is happening when you include that column because you don't get a big enough memory grant for the large string data being sorted.



                    You don't get a big enough memory grant because the actual number of rows is 10x more than the estimated number of rows (1,302 actual vs 126 estimated).



                    Why is the estimate off? Why does SQL Server think there's only one row in dbo.Settings with a resourceid of 38?



                    It could be a statistics issue, which you can check by running DBCC SHOW_STATISTICS('dbo.Settings', 'IX_Test') and see the counts for that histogram step. But the execution plan seems to indicate that the stats are as complete and up-to-date as they could be.



                    Since stats aren't helping, your best bet is probably a query rewrite - which Forrest has covered in his answer.







                    share|improve this answer












                    share|improve this answer



                    share|improve this answer










                    answered Jan 16 at 15:02









                    jadarnel27jadarnel27

                    4,7441634




                    4,7441634





















                        2














                        To me it appears that the where clause in the query is giving the issue, and is the cause of the low estimates, even if OPTION(RECOMPILE) is used.



                        I created some test data, and in the end came up with two solutions, storing the ID field from resources in either a variable (if it is always unique) or a temp table, if we can have more than one ID's.



                        Base test records



                        SET NOCOUNT ON
                        DECLARE @i int= 1;
                        WHILE @i <= 10000
                        BEGIN
                        INSERT INTO [dbo].[Settings]([resourceId],[typeID],remark)
                        VALUES(@i,@i,'KEPT THESE VALUES OUT BECAUSE IT WOULD CLUTTER THE EXAMPLES, VALUES OVER 8000 Chars entered here'); -- 23254 character length on each value
                        INSERT INTO [dbo].[Resources](resourceUID)
                        VALUES(@i);
                        SET @i += 1;
                        END


                        Insert the 'Seek' values, as to get to the same approximate resultset as OP (1300 records)



                        INSERT INTO [dbo].[Settings]([resourceId],[typeID],remark)
                        VALUES(38,38,'KEPT THESE VALUES OUT BECAUSE IT WOULD CLUTTER THE EXAMPLES, VALUES OVER 8000 Chars entered here')
                        GO 1300


                        Change compat & Update statistics to match OP



                        ALTER DATABASE StackOverflow SET COMPATIBILITY_LEVEL = 120;
                        UPDATE STATISTICS settings WITH FULLSCAN;
                        UPDATE STATISTICS resources WITH FULLSCAN;


                        Original query



                        exec sp_executesql N'
                        select r.id
                        FROM Resources r
                        inner join Settings on resourceid=r.id
                        where resourceUID=@UID
                        ORDER BY typeID',
                        N'@UID int',
                        @UID=38


                        My estimates are even worse, with one estimated row, while 1300 are returned.
                        And like OP stated, it does not matter if I add OPTION(RECOMPILE)



                        An important thing to note, is that when we get rid of the where clause the estimates are 100% correct, which is expected since we are using all data in both tables.



                        I forced the indexes just to make sure we use the same ones as in the previous query, to prove the point



                        exec sp_executesql N'
                        select r.id,remark
                        FROM Resources r with(index([IX_UID]))
                        inner join Settings WITH(INDEX([IX_Test]))
                        on resourceid=r.id
                        ORDER BY typeID',
                        N'@UID int',
                        @UID=38


                        As expected, good estimates.



                        So, what could we change to get better estimates but still seek on our values?



                        IF @UID is unique, as in the example OP gave, we could put the single id that was returned from resources in a variable, then seek on that variable with an OPTION(RECOMPILE)



                        DECLARE @UID int =38 , @RID int;
                        SELECT @RID=r.id from
                        Resources r where resourceUID = @UID;

                        SELECT @uid, remark
                        from Settings
                        where resourceId = @uid
                        Order by typeID
                        OPTION(RECOMPILE);


                        Which gives 100% accurate estimates



                        But what if there are multiple resourceUID's in resources?



                        add some test data



                        INSERT INTO Resources(ResourceUID)
                        VALUES (38);
                        go 50


                        This could be resolved with a temp table



                        CREATE TABLE #RID (id int)
                        DECLARE @UID int =38
                        INSERT INTO #RID
                        SELECT r.id
                        from
                        Resources r where resourceUID = @UID

                        SELECT @uid, remark
                        from Settings s
                        INNER JOIN #RID r
                        ON r.id =s.resourceId
                        Order by typeID
                        OPTION(RECOMPILE)

                        DROP TABLE #RID


                        Again with accurate estimates.



                        This was done with my own dataset,YMMV.




                        Written with sp_executesql



                        With a variable



                        exec sp_executesql N'
                        DECLARE @RID int;
                        SELECT @RID=r.id from
                        Resources r where resourceUID = @UID;

                        SELECT @uid, remark
                        from Settings
                        where resourceId = @uid
                        Order by typeID
                        OPTION(RECOMPILE);',
                        N'@UID int',
                        @UID=38


                        With a temp table



                        exec sp_executesql N'

                        CREATE TABLE #RID (id int)

                        INSERT INTO #RID
                        SELECT r.id
                        from
                        Resources r where resourceUID = @UID

                        SELECT @uid, remark
                        from Settings s
                        INNER JOIN #RID r
                        ON r.id =s.resourceId
                        Order by typeID
                        OPTION(RECOMPILE)

                        DROP TABLE #RID',
                        N'@UID int',
                        @UID=38


                        Still 100% correct estimates on my test






                        share|improve this answer





























                          2














                          To me it appears that the where clause in the query is giving the issue, and is the cause of the low estimates, even if OPTION(RECOMPILE) is used.



                          I created some test data, and in the end came up with two solutions, storing the ID field from resources in either a variable (if it is always unique) or a temp table, if we can have more than one ID's.



                          Base test records



                          SET NOCOUNT ON
                          DECLARE @i int= 1;
                          WHILE @i <= 10000
                          BEGIN
                          INSERT INTO [dbo].[Settings]([resourceId],[typeID],remark)
                          VALUES(@i,@i,'KEPT THESE VALUES OUT BECAUSE IT WOULD CLUTTER THE EXAMPLES, VALUES OVER 8000 Chars entered here'); -- 23254 character length on each value
                          INSERT INTO [dbo].[Resources](resourceUID)
                          VALUES(@i);
                          SET @i += 1;
                          END


                          Insert the 'Seek' values, as to get to the same approximate resultset as OP (1300 records)



                          INSERT INTO [dbo].[Settings]([resourceId],[typeID],remark)
                          VALUES(38,38,'KEPT THESE VALUES OUT BECAUSE IT WOULD CLUTTER THE EXAMPLES, VALUES OVER 8000 Chars entered here')
                          GO 1300


                          Change compat & Update statistics to match OP



                          ALTER DATABASE StackOverflow SET COMPATIBILITY_LEVEL = 120;
                          UPDATE STATISTICS settings WITH FULLSCAN;
                          UPDATE STATISTICS resources WITH FULLSCAN;


                          Original query



                          exec sp_executesql N'
                          select r.id
                          FROM Resources r
                          inner join Settings on resourceid=r.id
                          where resourceUID=@UID
                          ORDER BY typeID',
                          N'@UID int',
                          @UID=38


                          My estimates are even worse, with one estimated row, while 1300 are returned.
                          And like OP stated, it does not matter if I add OPTION(RECOMPILE)



                          An important thing to note, is that when we get rid of the where clause the estimates are 100% correct, which is expected since we are using all data in both tables.



                          I forced the indexes just to make sure we use the same ones as in the previous query, to prove the point



                          exec sp_executesql N'
                          select r.id,remark
                          FROM Resources r with(index([IX_UID]))
                          inner join Settings WITH(INDEX([IX_Test]))
                          on resourceid=r.id
                          ORDER BY typeID',
                          N'@UID int',
                          @UID=38


                          As expected, good estimates.



                          So, what could we change to get better estimates but still seek on our values?



                          IF @UID is unique, as in the example OP gave, we could put the single id that was returned from resources in a variable, then seek on that variable with an OPTION(RECOMPILE)



                          DECLARE @UID int =38 , @RID int;
                          SELECT @RID=r.id from
                          Resources r where resourceUID = @UID;

                          SELECT @uid, remark
                          from Settings
                          where resourceId = @uid
                          Order by typeID
                          OPTION(RECOMPILE);


                          Which gives 100% accurate estimates



                          But what if there are multiple resourceUID's in resources?



                          add some test data



                          INSERT INTO Resources(ResourceUID)
                          VALUES (38);
                          go 50


                          This could be resolved with a temp table



                          CREATE TABLE #RID (id int)
                          DECLARE @UID int =38
                          INSERT INTO #RID
                          SELECT r.id
                          from
                          Resources r where resourceUID = @UID

                          SELECT @uid, remark
                          from Settings s
                          INNER JOIN #RID r
                          ON r.id =s.resourceId
                          Order by typeID
                          OPTION(RECOMPILE)

                          DROP TABLE #RID


                          Again with accurate estimates.



                          This was done with my own dataset,YMMV.




                          Written with sp_executesql



                          With a variable



                          exec sp_executesql N'
                          DECLARE @RID int;
                          SELECT @RID=r.id from
                          Resources r where resourceUID = @UID;

                          SELECT @uid, remark
                          from Settings
                          where resourceId = @uid
                          Order by typeID
                          OPTION(RECOMPILE);',
                          N'@UID int',
                          @UID=38


                          With a temp table



                          exec sp_executesql N'

                          CREATE TABLE #RID (id int)

                          INSERT INTO #RID
                          SELECT r.id
                          from
                          Resources r where resourceUID = @UID

                          SELECT @uid, remark
                          from Settings s
                          INNER JOIN #RID r
                          ON r.id =s.resourceId
                          Order by typeID
                          OPTION(RECOMPILE)

                          DROP TABLE #RID',
                          N'@UID int',
                          @UID=38


                          Still 100% correct estimates on my test






                          share|improve this answer



























                            2












                            2








                            2







                            To me it appears that the where clause in the query is giving the issue, and is the cause of the low estimates, even if OPTION(RECOMPILE) is used.



                            I created some test data, and in the end came up with two solutions, storing the ID field from resources in either a variable (if it is always unique) or a temp table, if we can have more than one ID's.



                            Base test records



                            SET NOCOUNT ON
                            DECLARE @i int= 1;
                            WHILE @i <= 10000
                            BEGIN
                            INSERT INTO [dbo].[Settings]([resourceId],[typeID],remark)
                            VALUES(@i,@i,'KEPT THESE VALUES OUT BECAUSE IT WOULD CLUTTER THE EXAMPLES, VALUES OVER 8000 Chars entered here'); -- 23254 character length on each value
                            INSERT INTO [dbo].[Resources](resourceUID)
                            VALUES(@i);
                            SET @i += 1;
                            END


                            Insert the 'Seek' values, as to get to the same approximate resultset as OP (1300 records)



                            INSERT INTO [dbo].[Settings]([resourceId],[typeID],remark)
                            VALUES(38,38,'KEPT THESE VALUES OUT BECAUSE IT WOULD CLUTTER THE EXAMPLES, VALUES OVER 8000 Chars entered here')
                            GO 1300


                            Change compat & Update statistics to match OP



                            ALTER DATABASE StackOverflow SET COMPATIBILITY_LEVEL = 120;
                            UPDATE STATISTICS settings WITH FULLSCAN;
                            UPDATE STATISTICS resources WITH FULLSCAN;


                            Original query



                            exec sp_executesql N'
                            select r.id
                            FROM Resources r
                            inner join Settings on resourceid=r.id
                            where resourceUID=@UID
                            ORDER BY typeID',
                            N'@UID int',
                            @UID=38


                            My estimates are even worse, with one estimated row, while 1300 are returned.
                            And like OP stated, it does not matter if I add OPTION(RECOMPILE)



                            An important thing to note, is that when we get rid of the where clause the estimates are 100% correct, which is expected since we are using all data in both tables.



                            I forced the indexes just to make sure we use the same ones as in the previous query, to prove the point



                            exec sp_executesql N'
                            select r.id,remark
                            FROM Resources r with(index([IX_UID]))
                            inner join Settings WITH(INDEX([IX_Test]))
                            on resourceid=r.id
                            ORDER BY typeID',
                            N'@UID int',
                            @UID=38


                            As expected, good estimates.



                            So, what could we change to get better estimates but still seek on our values?



                            IF @UID is unique, as in the example OP gave, we could put the single id that was returned from resources in a variable, then seek on that variable with an OPTION(RECOMPILE)



                            DECLARE @UID int =38 , @RID int;
                            SELECT @RID=r.id from
                            Resources r where resourceUID = @UID;

                            SELECT @uid, remark
                            from Settings
                            where resourceId = @uid
                            Order by typeID
                            OPTION(RECOMPILE);


                            Which gives 100% accurate estimates



                            But what if there are multiple resourceUID's in resources?



                            add some test data



                            INSERT INTO Resources(ResourceUID)
                            VALUES (38);
                            go 50


                            This could be resolved with a temp table



                            CREATE TABLE #RID (id int)
                            DECLARE @UID int =38
                            INSERT INTO #RID
                            SELECT r.id
                            from
                            Resources r where resourceUID = @UID

                            SELECT @uid, remark
                            from Settings s
                            INNER JOIN #RID r
                            ON r.id =s.resourceId
                            Order by typeID
                            OPTION(RECOMPILE)

                            DROP TABLE #RID


                            Again with accurate estimates.



                            This was done with my own dataset,YMMV.




                            Written with sp_executesql



                            With a variable



                            exec sp_executesql N'
                            DECLARE @RID int;
                            SELECT @RID=r.id from
                            Resources r where resourceUID = @UID;

                            SELECT @uid, remark
                            from Settings
                            where resourceId = @uid
                            Order by typeID
                            OPTION(RECOMPILE);',
                            N'@UID int',
                            @UID=38


                            With a temp table



                            exec sp_executesql N'

                            CREATE TABLE #RID (id int)

                            INSERT INTO #RID
                            SELECT r.id
                            from
                            Resources r where resourceUID = @UID

                            SELECT @uid, remark
                            from Settings s
                            INNER JOIN #RID r
                            ON r.id =s.resourceId
                            Order by typeID
                            OPTION(RECOMPILE)

                            DROP TABLE #RID',
                            N'@UID int',
                            @UID=38


                            Still 100% correct estimates on my test






                            share|improve this answer















                            To me it appears that the where clause in the query is giving the issue, and is the cause of the low estimates, even if OPTION(RECOMPILE) is used.



                            I created some test data, and in the end came up with two solutions, storing the ID field from resources in either a variable (if it is always unique) or a temp table, if we can have more than one ID's.



                            Base test records



                            SET NOCOUNT ON
                            DECLARE @i int= 1;
                            WHILE @i <= 10000
                            BEGIN
                            INSERT INTO [dbo].[Settings]([resourceId],[typeID],remark)
                            VALUES(@i,@i,'KEPT THESE VALUES OUT BECAUSE IT WOULD CLUTTER THE EXAMPLES, VALUES OVER 8000 Chars entered here'); -- 23254 character length on each value
                            INSERT INTO [dbo].[Resources](resourceUID)
                            VALUES(@i);
                            SET @i += 1;
                            END


                            Insert the 'Seek' values, as to get to the same approximate resultset as OP (1300 records)



                            INSERT INTO [dbo].[Settings]([resourceId],[typeID],remark)
                            VALUES(38,38,'KEPT THESE VALUES OUT BECAUSE IT WOULD CLUTTER THE EXAMPLES, VALUES OVER 8000 Chars entered here')
                            GO 1300


                            Change compat & Update statistics to match OP



                            ALTER DATABASE StackOverflow SET COMPATIBILITY_LEVEL = 120;
                            UPDATE STATISTICS settings WITH FULLSCAN;
                            UPDATE STATISTICS resources WITH FULLSCAN;


                            Original query



                            exec sp_executesql N'
                            select r.id
                            FROM Resources r
                            inner join Settings on resourceid=r.id
                            where resourceUID=@UID
                            ORDER BY typeID',
                            N'@UID int',
                            @UID=38


                            My estimates are even worse, with one estimated row, while 1300 are returned.
                            And like OP stated, it does not matter if I add OPTION(RECOMPILE)



                            An important thing to note, is that when we get rid of the where clause the estimates are 100% correct, which is expected since we are using all data in both tables.



                            I forced the indexes just to make sure we use the same ones as in the previous query, to prove the point



                            exec sp_executesql N'
                            select r.id,remark
                            FROM Resources r with(index([IX_UID]))
                            inner join Settings WITH(INDEX([IX_Test]))
                            on resourceid=r.id
                            ORDER BY typeID',
                            N'@UID int',
                            @UID=38


                            As expected, good estimates.



                            So, what could we change to get better estimates but still seek on our values?



                            IF @UID is unique, as in the example OP gave, we could put the single id that was returned from resources in a variable, then seek on that variable with an OPTION(RECOMPILE)



                            DECLARE @UID int =38 , @RID int;
                            SELECT @RID=r.id from
                            Resources r where resourceUID = @UID;

                            SELECT @uid, remark
                            from Settings
                            where resourceId = @uid
                            Order by typeID
                            OPTION(RECOMPILE);


                            Which gives 100% accurate estimates



                            But what if there are multiple resourceUID's in resources?



                            add some test data



                            INSERT INTO Resources(ResourceUID)
                            VALUES (38);
                            go 50


                            This could be resolved with a temp table



                            CREATE TABLE #RID (id int)
                            DECLARE @UID int =38
                            INSERT INTO #RID
                            SELECT r.id
                            from
                            Resources r where resourceUID = @UID

                            SELECT @uid, remark
                            from Settings s
                            INNER JOIN #RID r
                            ON r.id =s.resourceId
                            Order by typeID
                            OPTION(RECOMPILE)

                            DROP TABLE #RID


                            Again with accurate estimates.



                            This was done with my own dataset,YMMV.




                            Written with sp_executesql



                            With a variable



                            exec sp_executesql N'
                            DECLARE @RID int;
                            SELECT @RID=r.id from
                            Resources r where resourceUID = @UID;

                            SELECT @uid, remark
                            from Settings
                            where resourceId = @uid
                            Order by typeID
                            OPTION(RECOMPILE);',
                            N'@UID int',
                            @UID=38


                            With a temp table



                            exec sp_executesql N'

                            CREATE TABLE #RID (id int)

                            INSERT INTO #RID
                            SELECT r.id
                            from
                            Resources r where resourceUID = @UID

                            SELECT @uid, remark
                            from Settings s
                            INNER JOIN #RID r
                            ON r.id =s.resourceId
                            Order by typeID
                            OPTION(RECOMPILE)

                            DROP TABLE #RID',
                            N'@UID int',
                            @UID=38


                            Still 100% correct estimates on my test







                            share|improve this answer














                            share|improve this answer



                            share|improve this answer








                            edited Jan 16 at 19:30

























                            answered Jan 16 at 19:10









                            Randi VertongenRandi Vertongen

                            2,233519




                            2,233519



























                                draft saved

                                draft discarded
















































                                Thanks for contributing an answer to Database Administrators Stack Exchange!


                                • Please be sure to answer the question. Provide details and share your research!

                                But avoid


                                • Asking for help, clarification, or responding to other answers.

                                • Making statements based on opinion; back them up with references or personal experience.

                                To learn more, see our tips on writing great answers.




                                draft saved


                                draft discarded














                                StackExchange.ready(
                                function ()
                                StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f227288%2fsort-spills-to-tempdb-due-to-varcharmax%23new-answer', 'question_page');

                                );

                                Post as a guest















                                Required, but never shown





















































                                Required, but never shown














                                Required, but never shown












                                Required, but never shown







                                Required, but never shown

































                                Required, but never shown














                                Required, but never shown












                                Required, but never shown







                                Required, but never shown






                                Popular posts from this blog

                                Peggy Mitchell

                                Palaiologos

                                The Forum (Inglewood, California)