How to prevent GPU from overheating and auto turning off

I was wondering how well Linux could handle a gaming computer, so I built one. Since GeForce cards are known to get along with Linux less well than AMD ones, I chose AMD.



I built the computer with an AMD Ryzen 7 1800X CPU and a Radeon RX 560D GPU, since Vega is too expensive for me and the benchmarks I read said the 560 currently has the best cost-benefit ratio.



After some research I discovered that the D suffix means the card runs at a slightly lower clock speed than the plain RX 560, in order to reduce power consumption.



After countless crashes at random points while gaming, I finally found out that the problem is the GPU overheating. Its fan speed tends to follow the CPU fan speed, but in some games the CPU is under much less load than the GPU.



I partially solved the problem by driving the fan speed from the GPU temperature instead of the CPU temperature. It now ramps up gradually and reaches maximum speed at 50 degrees Celsius. The problem is that in some games the fan stays at maximum speed all the time, and the system eventually still crashes.
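
For illustration, a temperature-driven fan curve like the one described above could look roughly like this. It is only a sketch: the 50 degree full-speed point comes from the text, while the ramp start, the minimum duty and the function name are made-up values. The 0..255 range is the usual hwmon PWM scale, and temperatures are in millidegrees Celsius as hwmon reports them.

```shell
# Linear fan curve: PWM_MIN at or below T_START_MC, 255 at or above T_FULL_MC.
T_START_MC=${T_START_MC:-30000}  # hypothetical ramp start (30 C)
T_FULL_MC=${T_FULL_MC:-50000}    # full speed at 50 C, as in the question
PWM_MIN=${PWM_MIN:-64}           # hypothetical idle duty on the 0..255 scale

# fan_pwm TEMP_MC: print the PWM duty for a temperature in millidegrees C.
fan_pwm() {
    temp_mc=$1
    if [ "$temp_mc" -le "$T_START_MC" ]; then
        echo "$PWM_MIN"
    elif [ "$temp_mc" -ge "$T_FULL_MC" ]; then
        echo 255
    else
        # Integer interpolation between PWM_MIN and 255.
        echo $(( PWM_MIN + (255 - PWM_MIN) * (temp_mc - T_START_MC) / (T_FULL_MC - T_START_MC) ))
    fi
}
```

To apply such a value one would typically, as root, set pwm1_enable to 1 (manual) and write the result to pwm1 in the card's hwmon directory.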



Describing the crash: the screen flickers and then goes black, the GPU fan stops, the keyboard LEDs blink and then turn off, and the mouse does the same. The other fan, on the CPU, keeps running. Sometimes the system stays frozen forever, and sometimes it reboots by itself.



Since a reboot is required, I could not find any hint in the system logs. Initially I thought it was a kernel panic, but even using kdump with a duplicate kernel the system still crashes in a way I cannot recover from.



I do not know whether Windows would have the same problem, but I strongly believe it would not, as I have never seen anyone report it on Windows. So my question is: is there some way to tell the kernel to make the GPU take it easy when it is about to overheat, perhaps by automatically reducing the GPU clock speed?







asked May 16 at 16:44 by Tiago Pimenta (edited May 16 at 16:49)




















1 Answer (accepted)

I found the solution. There are some files under /sys/class/drm/card0/device: the file pp_dpm_mclk lists the GPU memory clock levels, and the file pp_dpm_sclk lists the GPU core clock levels. The level currently in use is marked with an asterisk. Mine:



          $ egrep -H . /sys/class/drm/card0/device/pp_dpm_*
          /sys/class/drm/card0/device/pp_dpm_mclk:0: 300Mhz
          /sys/class/drm/card0/device/pp_dpm_mclk:1: 1500Mhz *
          /sys/class/drm/card0/device/pp_dpm_pcie:0: 2.5GB, x8 *
          /sys/class/drm/card0/device/pp_dpm_pcie:1: 8.0GB, x16
          /sys/class/drm/card0/device/pp_dpm_sclk:0: 214Mhz *
          /sys/class/drm/card0/device/pp_dpm_sclk:1: 481Mhz
          /sys/class/drm/card0/device/pp_dpm_sclk:2: 760Mhz
          /sys/class/drm/card0/device/pp_dpm_sclk:3: 1000Mhz
          /sys/class/drm/card0/device/pp_dpm_sclk:4: 1050Mhz
          /sys/class/drm/card0/device/pp_dpm_sclk:5: 1100Mhz
          /sys/class/drm/card0/device/pp_dpm_sclk:6: 1150Mhz
          /sys/class/drm/card0/device/pp_dpm_sclk:7: 1196Mhz
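
A script that adjusts the clock needs to know which level is active and how many levels exist. A minimal sketch of two helpers for that, assuming the per-file format shown above (index, colon, frequency, and an asterisk on the active line); the function names are hypothetical:

```shell
# Both helpers read the pp_dpm_sclk format on stdin, e.g. "0: 214Mhz *".

# Print the index of the active level (the line ending in "*").
active_level() {
    awk -F: '/\*[ \t]*$/ { print $1; exit }'
}

# Print the index of the last level listed (the highest clock in the
# listing above).
max_level() {
    awk -F: '{ last = $1 } END { print last }'
}
```

For example, `active_level < /sys/class/drm/card0/device/pp_dpm_sclk` reads the file directly, where the lines carry no file name prefix.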


The file power_dpm_force_performance_level holds the performance level, which can be set to values such as low, auto or manual. The default is auto. With low the GPU always runs at its lowest clock, which is not exactly what I want, so I set it to manual and wrote a script that keeps changing the clock according to the GPU temperature. Voilà, it worked!
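
The script itself is not reproduced here. As a rough sketch of the idea only, the function below maps a temperature to a core clock level: full speed while cool, then one level lower for every few degrees above a threshold. The thresholds and the function name are hypothetical values to tune, not the ones the script uses.

```shell
# Temperatures are in millidegrees Celsius, as hwmon reports them.
COOL_MC=${COOL_MC:-60000}  # hypothetical: at or below this, allow the top level
STEP_MC=${STEP_MC:-5000}   # hypothetical: drop one level per 5 C above COOL_MC

# pick_level TEMP_MC MAX_LEVEL: print the clock level to use.
pick_level() {
    temp_mc=$1
    max_level=$2
    if [ "$temp_mc" -le "$COOL_MC" ]; then
        echo "$max_level"
        return 0
    fi
    # Round the division up so any excess costs at least one level.
    drop=$(( (temp_mc - COOL_MC + STEP_MC - 1) / STEP_MC ))
    level=$(( max_level - drop ))
    if [ "$level" -lt 0 ]; then
        level=0
    fi
    echo "$level"
}
```

A polling loop would read the temperature from temp1_input in the card's hwmon directory every second or so and pass it to pick_level together with the highest level index (7 on this card).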



To change the clock while the level is set to manual, just write a number to the file pp_dpm_sclk. The number selects a line of the listing, starting from 0 and, in my case, going up to 7.
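
As a small sketch of that step, assuming the card is card0 as above (the DEV variable and the function name are hypothetical, and the writes need root):

```shell
# Directory holding the amdgpu power files; override DEV for another card.
DEV=${DEV:-/sys/class/drm/card0/device}

# set_sclk_level LEVEL: switch to manual and select one core clock level.
set_sclk_level() {
    case $1 in
        ''|*[!0-9]*)
            echo "usage: set_sclk_level LEVEL" >&2
            return 1
            ;;
    esac
    echo manual > "$DEV/power_dpm_force_performance_level"
    echo "$1" > "$DEV/pp_dpm_sclk"
}
```

For example, `set_sclk_level 3` would select the 1000Mhz line of the listing above. Writing auto back to power_dpm_force_performance_level hands control back to the driver.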



If you are interested in my script, here it is.






answered May 17 at 0:29 by Tiago Pimenta






















                     
