Arch Linux freeze: “kernel: watchdog: BUG: soft lockup” and once more w/o logs - how to investigate

I recently bought a new desktop PC, and it froze a week ago. The kernel log shows a soft lockup (output of journalctl -xa):



Mar 03 23:48:55 xxx kernel: watchdog: BUG: soft lockup - CPU#6 stuck for 23s! [systemd:1]
Mar 03 23:48:55 xxx kernel: Modules linked in: nls_iso8859_1 nls_cp437 vfat fat amdgpu chash amd_iommu_v2 snd_hda_codec_realtek gpu_sched snd_hda_codec_generic snd_hda_codec_hdmi ttm snd_hda_intel drm_kms_helper snd_hda_codec edac_mce_amd eeepc_wmi asus_wmi snd_hda_core sparse_keymap drm snd_hwdep kvm_amd rfkill mxm_wmi wmi_bmof snd_pcm kvm igb agpgart syscopyarea snd_timer sysfillrect sysimgblt sp5100_tco i2c_algo_bit mousedev joydev input_leds fb_sys_fops dca i2c_piix4 pcspkr snd irqbypass k10temp soundcore pinctrl_amd wmi evdev gpio_amdpt pcc_cpufreq mac_hid acpi_cpufreq ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 fscrypto algif_skcipher af_alg dm_crypt dm_mod sr_mod cdrom hid_holtek_mouse hid_cherry hid_generic usbhid hid sd_mod crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel ahci libahci aesni_intel libata aes_x86_64 crypto_simd cryptd xhci_pci ccp glue_helper xhci_hcd scsi_mod rng_core
Mar 03 23:48:55 xxx kernel: CPU: 6 PID: 1 Comm: systemd Tainted: G L 4.20.13-arch1-1-ARCH #1
Mar 03 23:48:55 xxx kernel: Hardware name: System manufacturer System Product Name/PRIME X370-PRO, BIOS 4207 12/08/2018
Mar 03 23:48:55 xxx kernel: RIP: 0010:smp_call_function_many+0x1f4/0x250
Mar 03 23:48:55 xxx kernel: Code: c7 e8 60 b1 6f 00 3b 05 6e 3d 21 01 0f 83 8f fe ff ff 48 63 d0 48 8b 0b 48 03 0c d5 20 c8 53 91 8b 51 18 83 e2 01 74 0a f3 90 <8b> 51 18 83 e2 01 75 f6 eb c8 48 c7 c2 60 47 73 91 4c 89 e6 89 df
Mar 03 23:48:55 xxx kernel: RSP: 0018:ffff99e08004bb40 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff13
Mar 03 23:48:55 xxx kernel: RAX: 0000000000000002 RBX: ffff9481ceba2bc0 RCX: ffff9481ceaa7ac0
Mar 03 23:48:55 xxx kernel: RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff9481ceba2bc8
Mar 03 23:48:55 xxx kernel: RBP: ffff9481ceba2bc8 R08: 0000000000000000 R09: 0000000000000000
Mar 03 23:48:55 xxx kernel: R10: ffff9481ceba2bc8 R11: 0000000000000005 R12: ffff9481ceba2bf0
Mar 03 23:48:55 xxx kernel: R13: ffffffff90476530 R14: 0000000000000000 R15: 0000000000000140
Mar 03 23:48:55 xxx kernel: FS: 00007f7111232e80(0000) GS:ffff9481ceb80000(0000) knlGS:0000000000000000
Mar 03 23:48:55 xxx kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar 03 23:48:55 xxx kernel: CR2: 00005574b01882c8 CR3: 0000000405a68000 CR4: 00000000003406e0
Mar 03 23:48:55 xxx kernel: Call Trace:
Mar 03 23:48:55 xxx kernel: ? tlbflush_read_file+0x70/0x70
Mar 03 23:48:55 xxx kernel: smp_call_function+0x36/0x60
Mar 03 23:48:55 xxx kernel: ? tlbflush_read_file+0x70/0x70
Mar 03 23:48:55 xxx kernel: on_each_cpu+0x2a/0x80
Mar 03 23:48:55 xxx kernel: flush_tlb_kernel_range+0x48/0x90
Mar 03 23:48:55 xxx kernel: ? cpumask_next+0x16/0x20
Mar 03 23:48:55 xxx kernel: ? purge_fragmented_blocks_allcpus+0x45/0x1d0
Mar 03 23:48:55 xxx kernel: __purge_vmap_area_lazy+0x4d/0xc0
Mar 03 23:48:55 xxx kernel: vm_unmap_aliases+0xf0/0x120
Mar 03 23:48:55 xxx kernel: change_page_attr_set_clr+0xc8/0x2d0
Mar 03 23:48:55 xxx kernel: ? bpf_jit_binary_alloc+0x60/0xe0
Mar 03 23:48:55 xxx kernel: set_memory_ro+0x26/0x30
Mar 03 23:48:55 xxx kernel: bpf_int_jit_compile+0x237/0x2fe
Mar 03 23:48:55 xxx kernel: bpf_prog_select_runtime+0xa5/0xf0
Mar 03 23:48:55 xxx kernel: bpf_prog_load+0x37a/0x5a0
Mar 03 23:48:55 xxx kernel: ? lpm_trie_node_alloc+0x32/0x90
Mar 03 23:48:55 xxx kernel: ? lpm_trie_node_alloc+0x58/0x90
Mar 03 23:48:55 xxx kernel: ? _raw_spin_unlock_irqrestore+0x20/0x40
Mar 03 23:48:55 xxx kernel: ? trie_update_elem+0x20b/0x340
Mar 03 23:48:55 xxx kernel: __se_sys_bpf+0x5c6/0x16d0
Mar 03 23:48:55 xxx kernel: do_syscall_64+0x5b/0x170
Mar 03 23:48:55 xxx kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9
Mar 03 23:48:55 xxx kernel: RIP: 0033:0x7f71132534ed
Mar 03 23:48:55 xxx kernel: Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 43 79 0c 00 f7 d8 64 89 01 48
Mar 03 23:48:55 xxx kernel: RSP: 002b:00007fff04946e98 EFLAGS: 00000246 ORIG_RAX: 0000000000000141
Mar 03 23:48:55 xxx kernel: RAX: ffffffffffffffda RBX: 00005574b01c95d0 RCX: 00007f71132534ed
Mar 03 23:48:55 xxx kernel: RDX: 0000000000000048 RSI: 00007fff04946ea0 RDI: 0000000000000005
Mar 03 23:48:55 xxx kernel: RBP: 0000000000000000 R08: 656369767265732e R09: 0000002500000008
Mar 03 23:48:55 xxx kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 00005574b01ca7c0
Mar 03 23:48:55 xxx kernel: R13: 0000000000000001 R14: 00005574afa8412a R15: 0000000000000000


All necessary info should be in there, but feel free to ask for more details.
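To make the report above easier to skim, here is a minimal sketch (not part of the original log, and only an assumption about how one might read it) that pulls the CPU number, the stall duration, the task, and the call-trace function names out of journal lines formatted like the ones above. The names `parse_soft_lockup`, `LOCKUP_RE` and `FRAME_RE` are made up for illustration. Frames the kernel prefixes with `?`, which it could not verify as part of the call chain, are skipped.

```python
import re

# Header line written by the soft lockup detector, e.g.
# "watchdog: BUG: soft lockup - CPU#6 stuck for 23s! [systemd:1]"
LOCKUP_RE = re.compile(
    r"soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^:\]]+):(?P<pid>\d+)\]"
)

# A call-trace frame such as "smp_call_function+0x36/0x60",
# optionally prefixed with "? " for an unverified frame.
FRAME_RE = re.compile(
    r"kernel: (?P<q>\? )?(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+\s*$"
)


def parse_soft_lockup(lines):
    """Return (header, frames) for one soft lockup report.

    header is a dict with cpu, secs, comm and pid, or None if no
    "soft lockup" line is found. frames lists the function names of
    the call trace in the order printed, without the '?' frames.
    """
    header = None
    frames = []
    in_trace = False
    for line in lines:
        m = LOCKUP_RE.search(line)
        if m:
            header = {
                "cpu": int(m["cpu"]),
                "secs": int(m["secs"]),
                "comm": m["comm"],
                "pid": int(m["pid"]),
            }
            continue
        if "Call Trace:" in line:
            in_trace = True
            continue
        if in_trace:
            f = FRAME_RE.search(line)
            if f is None:
                in_trace = False  # first non-frame line ends the trace
            elif not f["q"]:
                frames.append(f["func"])
    return header, frames
```

Fed the log above, the first frames it keeps are smp_call_function, on_each_cpu and flush_tlb_kernel_range, and the last one is entry_SYSCALL_64_after_hwframe.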



I don't think this should happen, and I don't know how to deal with it. Since Arch is on kernel 5.0 by now, I was unsure whether it was worth investigating. But another freeze occurred just a few minutes ago (sadly, this time the journal holds no useful kernel messages for the last 1.5 hours of uptime). After rebooting, I grepped for "watchdog" and found the following entries:



Mar 10 22:03:48 xxx kernel: NMI watchdog: Enabled. Permanently consumes one hw-PMU counter.
Mar 10 22:03:48 xxx kernel: sp5100_tco: SP5100/SB800 TCO WatchDog Timer Driver
Mar 10 22:03:48 xxx kernel: sp5100-tco sp5100-tco: Using 0xfed80b00 for watchdog MMIO address
Mar 10 22:03:48 xxx kernel: sp5100-tco sp5100-tco: Watchdog hardware is disabled
Mar 10 22:05:10 xxx rtkit-daemon[1392]: Watchdog thread running.
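As a rough sketch of how such grep hits could be sorted, the helper below separates lines carrying the "watchdog: BUG:" prefix from the soft lockup report further up from every other line that merely mentions "watchdog". The function name `split_watchdog_lines` is hypothetical, and matching on that one prefix is an assumption based only on the report shown in this question, not a complete list of lockup message formats.

```python
def split_watchdog_lines(lines):
    """Split journal lines that mention 'watchdog' into (reports, other).

    reports: lines containing the "watchdog: BUG:" prefix seen in the
             soft lockup report quoted earlier.
    other:   every remaining line that mentions 'watchdog', for example
             driver or daemon start-up messages.
    """
    reports, other = [], []
    for line in lines:
        if "watchdog" not in line.lower():
            continue  # unrelated journal line
        if "watchdog: BUG:" in line:
            reports.append(line)
        else:
            other.append(line)
    return reports, other
```

For the five entries above, `reports` comes back empty and all five lines land in `other`.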


Is this related to the strange artifacts I get when switching to another desktop or opening a new window (see my other question)?










      arch-linux linux-kernel freeze watchdog






      edited Mar 11 at 4:39









      Rui F Ribeiro











      asked Mar 10 at 21:33









      noxnox
