Working out what is causing my machine to freeze
























Background



For a few years now, I have been having problems with my Xeon workstations freezing. For many things they have been lightning fast, but sometimes applications or even the desktop just freeze for no apparent reason.



It got so bad last year that I had my whole workstation replaced with new hardware, but the problems persisted with the new machine. Both were installed from the same RHEL6 boot image. Both were specified with a decent CPU, plenty of memory and a direct connection to the corporate network gig-e switch. The original machine had an SSD, the new one a spinning hunk of rust. On the original machine, I even switched briefly to RHEL7, but it behaved the same way and I found Gnome 3 to be a step backwards in usability, so I reinstalled RHEL6.



I do not have root access to my workstation, though I do have the ability to use other software, via modules.



How the problems manifest



The problem is most severe (and most reproducible) when running my Eclipse development environment. Often just saving a file, or committing changes through eGit, will cause the whole of Eclipse to stop responding for 10 to 30 seconds. When this happens, I double-click on the title bar twice to restore and then maximise the window, then wait for the window to be repainted before I can continue working.



I use synergy to share my Linux workstation keyboard and mouse with my Windows laptop. Sometimes the whole desktop freezes, the mouse pointer flicks back to the workstation and I lose the ability to control the laptop until the workstation unfreezes.



I also see problems with Firefox freezing. At one point it was freezing for 10 seconds every 30 seconds, and during each freeze I was unable to scroll or switch tabs. Now it only happens occasionally (it has happened once while writing this post).



Although not as common as the others, I've also seen problems on the bash command line. Just pressing enter with no command can take 10 to 30 seconds for the subsequent prompt to be displayed.



What I've tried so far



I've monitored CPU and IO usage during application freezes, and usage appears to be minimal. Obviously monitoring tools like System Monitor and top at the command line also freeze when the whole desktop freezes, so it's difficult to see what is happening then.
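Since CPU and IO usage look idle during a freeze, one thing that could be checked without root is whether processes are stuck in uninterruptible sleep (state `D` in `/proc/<pid>/stat`), which is commonly what a process waiting on an unresponsive NFS server looks like. The following is only a minimal sketch, assuming Python 3 is available (for example via modules) and that `/proc` is readable by an ordinary user. The function names `parse_stat_line`, `blocked_processes` and `watch` are made up for illustration.

```python
import os
import time


def parse_stat_line(line):
    """Split a /proc/<pid>/stat line into (pid, comm, state).

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so split on the LAST ')' rather than on spaces.
    """
    left, right = line.rsplit(")", 1)
    pid, comm = left.split("(", 1)
    state = right.split()[0]
    return int(pid), comm, state


def blocked_processes(proc_root="/proc"):
    """Return (pid, comm) for every process currently in state 'D'."""
    blocked = []
    for entry in sorted(os.listdir(proc_root)):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                pid, comm, state = parse_stat_line(handle.read())
        except (OSError, ValueError):
            continue  # process exited, or the line was not in the expected form
        if state == "D":
            blocked.append((pid, comm))
    return blocked


def watch(seconds=60, interval=1.0):
    """Print a timestamped list of blocked processes for a bounded period."""
    deadline = time.monotonic() + seconds
    while time.monotonic() < deadline:
        hits = blocked_processes()
        if hits:
            print(time.strftime("%H:%M:%S"), hits, flush=True)
        time.sleep(interval)
```

Running `watch()` from a script stored on the local disk, with output redirected to a local file, would presumably give it the best chance of carrying on while the desktop is frozen. If the script itself stalls, the gaps between timestamps in the log are still informative.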



I have tried moving my Eclipse application to my local disk, and symlinking ~/.eclipse to a directory on my local disk, but this didn't make a significant difference. The problems also occur whether my Eclipse workspace is on a local drive, or one of the nfs shares.



I've tried tracing file accesses in Eclipse in an attempt to minimise network file access, but this didn't suggest any particular problem.



After tweaking my strace invocation to include child processes, however, I'm seeing lots of messages of the following form around the time of each freeze:



[pid 13513] --- SIGSEGV si_signo=SIGSEGV, si_code=SEGV_ACCERR, si_addr=0x7fe7db165000 ---


I'm not sure how to investigate these access errors further however.
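The SIGSEGV lines may not be the most useful part of the trace on their own, since the HotSpot JVM is known to use SIGSEGV internally (for example for implicit null checks). A possibly more direct approach is to look for system calls that took a long time. The sketch below assumes the trace was captured with `strace -f -T -o trace.log ...`, where `-T` appends the time spent in each call as `<seconds>` at the end of the line. The file name `trace.log`, the function name `slow_calls` and the one-second threshold are illustrative choices, not anything from my actual setup.

```python
import re

# Matches the "<seconds>" suffix that strace -T appends to each completed call.
DURATION = re.compile(r"<(\d+\.\d+)>\s*$")


def slow_calls(lines, threshold=1.0):
    """Yield (seconds, line) for each traced call lasting at least `threshold` seconds.

    Lines without a trailing duration (signal deliveries such as
    "--- SIGSEGV ... ---", or "<unfinished ...>" markers) are skipped.
    """
    for line in lines:
        match = DURATION.search(line)
        if match is None:
            continue
        seconds = float(match.group(1))
        if seconds >= threshold:
            yield seconds, line.rstrip("\n")


def report(path, threshold=1.0):
    """Print the slow calls found in a saved strace log, slowest first."""
    with open(path, errors="replace") as handle:
        for seconds, line in sorted(slow_calls(handle, threshold), reverse=True):
            print("%8.3fs  %s" % (seconds, line))
```

Calling `report("trace.log")` after a freeze should list any calls that blocked for a second or more. If those calls turn out to be operations on paths under an NFS mount, that would support the network suspicion. If they are futex or poll calls, the delay is probably elsewhere.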



I have tried creating a new firefox profile from scratch and using that, but again this made no real difference. I can't move my firefox profiles to a local disk as I need to be able to access them when working on other machines on our network. Those other machines don't seem to have the same problems as I see on my personal workstation, but I rarely use them for more than a few hours at a time.



I have tried running benchmarks on our filesystems (both local and network), but the tools I've found seem to concentrate on average transfer rates. I suspect the problem is related to worst-case latency, which gets averaged away because most transfers are fast.
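To look at worst-case latency rather than averages, a small probe could time many tiny file operations and report the tail of the distribution. This is a rough sketch under the assumption that Python 3 is available and that the directory being probed is writable. The names `probe_once`, `probe` and `summarise` and the choice of a create, write, fsync and delete cycle are illustrative, not an established benchmark.

```python
import os
import time


def probe_once(directory):
    """Time one small create-write-fsync-delete cycle in `directory`, in seconds."""
    path = os.path.join(directory, ".latency_probe_%d" % os.getpid())
    start = time.monotonic()
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.write(fd, b"x")
        os.fsync(fd)  # force the write out to the disk or the file server
    finally:
        os.close(fd)
    os.unlink(path)
    return time.monotonic() - start


def probe(directory, samples=100, interval=0.1):
    """Run `samples` probes, sleeping `interval` seconds between them."""
    latencies = []
    for _ in range(samples):
        latencies.append(probe_once(directory))
        time.sleep(interval)
    return latencies


def summarise(latencies):
    """Report the median, 99th percentile and maximum, not the mean."""
    ordered = sorted(latencies)

    def percentile(p):
        # Integer arithmetic avoids floating-point surprises in the index.
        index = min(len(ordered) - 1, (p * len(ordered)) // 100)
        return ordered[index]

    return {
        "count": len(ordered),
        "median": percentile(50),
        "p99": percentile(99),
        "max": ordered[-1],
    }
```

Running `summarise(probe(os.path.expanduser("~")))` against the NFS home directory and then against a local directory such as `/tmp` would show whether the maximum is far larger on one than on the other, even when the medians look similar.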



Checking out answers to How do I figure out what's freezing up my machine? I can confirm that the local filesystems are ext4 (on LVM), while network filesystems are all NFS, and my machine isn't using LUKS.



Looking at Can I try to induce a freeze in my computer to isolate what is causing freezing? reminds me to say that this problem has persisted across many kernel versions, Red Hat releases and Nvidia Quadro drivers.



My suspicions



My persistent suspicion has been that my problems are network related. But I'm not sure how best to investigate.



I know that if I lose the network connection for any reason, the whole machine will freeze until the network is restored. I've never seen this before, but our systems seem to assume that home directories and application server shares are always available and responsive.



My question



What do I need to look at to work out why my machine is behaving as it is?



What RHEL tools can I use to track down these performance problems, and can I use these tools without root access?




























  • Let me guess: your desktop manager is KDE, right? If true, dig no further - KDE is notorious for freezing at random. Try Mate or another alternative, or a different video card altogether (i.e. if you are on a built-in Intel, try discrete AMD or Nvidia, etc).
    – ajeh
    Apr 19 at 14:49










  • Nope @ajeh, Gnome shop here, and the Quadro is a discrete Nvidia card already. I've updated my question to mention Gnome.
    – Mark Booth
    Apr 19 at 15:44










  • I am fresh out of ideas then, as I never experienced random freezes with Gnome 2 and am still on Mate due to its premature death at the hands of the Gnome 3+ team. Maybe indicate if you are running open source or closed source video drivers? I had issues with proprietary video drivers a long time ago and have never touched them since.
    – ajeh
    Apr 19 at 16:16










  • If your home directory is on NFS you may want to make sure that this works as intended. A 10 to 30 second wait for the next prompt could mean that the home directory is unavailable for brief periods of time, and it could also be the reason why other software freezes. If you're at a company, get a network guy to debug the NFS mount options that you are using to make sure they are optimal.
    – Kusalananda
    May 2 at 13:34














asked Apr 19 at 11:56 by Mark Booth
edited May 2 at 18:59

























