new_server

Differences

This shows you the differences between two versions of the page.

new_server [2017/12/13 21:15] – add M.2 drive disappearing problem josh
new_server [2018/01/16 10:35] (current) josh
Line 181: Line 181:
  
 After the third time this happened, on 2017-12-13, I updated the UEFI firmware on the ASRock motherboard to version 3.30. The upgrade was successful, and after resetting my options (particularly re-enabling SVM and power on after AC loss), the system appears to be running properly again now.
 +
 +On 2017-12-15 the system froze again. I disabled "C6 Mode" in the UEFI setup and started up again.
 +
 +2017-12-17: I have not observed the problem since disabling "C6 Mode", but I have a feeling it will still come back and could be related to NVMe APST modes. A similar problem is described here: [[https://bbs.archlinux.org/viewtopic.php?id=232692]]. I added the kernel parameter:
 +
 +<code>
 +nvme_core.default_ps_max_latency_us=0
 +</code>
 +
 +Now ''nvme get-feature -f 0x0c -H /dev/nvme0n1'' shows APST is disabled.
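
As a cross-check that does not depend on the nvme tool, the running kernel can be asked directly whether the parameter took effect. This is only a sketch and not part of the original notes: the helper name `has_kernel_param` is made up for illustration, and the sysfs path assumes the `nvme_core` module exposes its parameter under `/sys/module/nvme_core/parameters/`.

```shell
#!/bin/sh
# has_kernel_param CMDLINE PARAM
# Hypothetical helper: succeeds if PARAM appears as a whole
# space-separated word in CMDLINE.
has_kernel_param() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

param='nvme_core.default_ps_max_latency_us=0'

# Check the command line the running kernel was booted with.
if [ -r /proc/cmdline ]; then
    if has_kernel_param "$(cat /proc/cmdline)" "$param"; then
        echo "$param is on the kernel command line"
    else
        echo "$param is NOT on the kernel command line"
    fi
fi

# Assumed sysfs location of the live value; 0 should mean APST is disabled.
sysfs=/sys/module/nvme_core/parameters/default_ps_max_latency_us
if [ -r "$sysfs" ]; then
    echo "default_ps_max_latency_us = $(cat "$sysfs")"
fi
```

Matching on whole words keeps a value such as `=5500` from being mistaken for `=0`.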
 +
 +Turned "C6 Mode" back on but have not observed the M.2 drive disappearing problem again with APST disabled.
 +
 +===== BUG: soft lockup =====
 +
 +A few times after fixing the M.2 drive disappearing issue, my server has frozen. On one occasion I caught the console output, which listed several messages like "watchdog: BUG: soft lockup - CPU#10 stuck for 22s! [worker:14788]", and one "INFO: rcu_sched detected stalls on CPUs/tasks:" ... "rcu_sched kthread starved for 19150377 jiffies!". I'm not sure what is causing this.
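
If any of these messages reach the log before the machine hangs, they can be counted after the reboot. The following is a minimal sketch, not something from the original notes: `count_lockups` is a made-up helper name, and the `journalctl` call assumes a systemd system with a persistent journal. A hard freeze may well prevent the messages from being written to disk at all, so a count of 0 proves little.

```shell
#!/bin/sh
# count_lockups: read kernel log text on stdin and print how many lines
# report a soft lockup or an rcu_sched stall (hypothetical helper).
count_lockups() {
    # grep -c prints 0 but exits non-zero when nothing matches,
    # so force a zero exit status for callers.
    grep -c -E 'BUG: soft lockup|rcu_sched detected stalls' || true
}

# Assumption: journalctl is available and kept the previous boot (-b -1).
if command -v journalctl >/dev/null 2>&1; then
    journalctl -k -b -1 --no-pager 2>/dev/null | count_lockups
fi
```

The same function also works on a saved console capture, for example `count_lockups < console.txt`, where `console.txt` is a placeholder file name.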
 +
 +On 2018-01-08, I upgraded from kernel 4.14.4 to 4.14.11 and added "consoleblank=0" to the kernel command line so that, if this happens again, the console output will hopefully not be lost to screen blanking.
 +
 +2018-01-09: Igor seems to be experiencing the same bug and pointed me to [[https://bugzilla.kernel.org/show_bug.cgi?id=196683]]. I will probably disable C6 in UEFI setup again.
 +
 +2018-01-15: Got the freeze again after disabling C6 mode in setup. I looked deeper in the setup options and found the buried "Global C-State Control" option that Igor and Mike had disabled, so I disabled that as well. No freezes since then.
new_server.1513217730.txt.gz · Last modified: by josh