5 08 2020
CRITICAL – hpasmd needs to be restarted, System: ‘unknown’, S/N: ‘unknown’, ROM: ‘unknown’
We encountered this issue on our HP servers when trying to use the check_hp health check for Sensu/Nagios. Starting the hp-health service also produced a segfault in dmesg:
[6822665.991794] hpasmlited[1183913]: segfault at 0 ip 0000000000421ffb sp 00007ffe3c24fe40 error 4 in hpasmlited[400000+38000]
I did some digging and found that the issue seems to stem from servers using the legacy BIOS setting instead of UEFI.
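To see which servers might be affected, you can check how a machine was booted. This is a minimal sketch, assuming a Linux host where the kernel exposes /sys/firmware/efi only when it was booted via UEFI. The function name boot_mode and its optional directory argument are my own, added so the check can be tried against any path.

```shell
#!/bin/sh
# Report the firmware boot mode of the running system.
# Assumption: /sys/firmware/efi exists only on UEFI-booted Linux hosts.
# boot_mode is a hypothetical helper; the optional argument overrides
# the directory that is checked.
boot_mode() {
    efi_dir="${1:-/sys/firmware/efi}"
    if [ -d "$efi_dir" ]; then
        echo "UEFI"
    else
        echo "legacy BIOS"
    fi
}
```

Calling boot_mode with no arguments on a server prints either "UEFI" or "legacy BIOS".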
The resolution is to add the “nopat” parameter to the kernel boot line. I appended it to GRUB_CMDLINE_LINUX in /etc/default/grub and regenerated the grub config. Once the server had been rebooted, the service started:
    # service hp-health status
    Redirecting to /bin/systemctl status hp-health.service
    ● hp-health.service - HP System Health Monitor
       Loaded: loaded (/usr/lib/systemd/system/hp-health.service; enabled; vendor preset: disabled)
       Active: active (running) since Wed 2020-08-05 19:53:42 BST; 1s ago
      Process: 922779 ExecStop=/usr/lib/systemd/scripts/hp-health.sh stop (code=exited, status=0/SUCCESS)
      Process: 922868 ExecStart=/usr/lib/systemd/scripts/hp-health.sh start (code=exited, status=0/SUCCESS)
     Main PID: 922941 (hpasmlited)
       CGroup: /system.slice/hp-health.service
               └─922941 hpasmlited -f /dev/hpilo
check_hp also returned a healthy result:

    # check_hp
    OK - System: 'proliant dl360 gen9', S/N: 'xxxxx', ROM: 'P89 03/25/2019', hardware working fine, da: 1 logical drives, 2 physical drives
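For a single server, the manual change to /etc/default/grub described above can be sketched as a small shell function. This is a sketch, not a definitive procedure: it assumes GNU sed, reuses the same sed expression as the Puppet code in this post, and the function name add_nopat and its optional file argument are my own.

```shell
#!/bin/sh
# Append "nopat" to the GRUB_CMDLINE_LINUX line of a grub defaults file.
# add_nopat is a hypothetical helper; the optional argument lets you try
# it on a copy of the file first. Requires GNU sed for "-i".
add_nopat() {
    grub_default="${1:-/etc/default/grub}"

    # Do nothing if the parameter is already present (keeps it idempotent).
    if grep -q 'nopat' "$grub_default"; then
        return 0
    fi

    # Insert " nopat" before the closing double quote of the matching line.
    sed -i '/GRUB_CMDLINE_LINUX/ s/"$/ nopat"/' "$grub_default"
}

# On a real GRUB 2 server (as root), the follow-up would be roughly:
#   add_nopat
#   grub2-mkconfig > /etc/grub2.cfg
#   reboot
# and after the reboot the parameter should show up in /proc/cmdline.
```

Running the function against a copy of the file first is a cheap way to confirm the edit looks right before touching the real config.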
I then applied the fix to the rest of our HP servers with Puppet:

    if ($::productname =~ /ProLiant/) {
      if ($::operatingsystemmajrelease >= '7') {
        # Release 7 and later (GRUB 2): append nopat to GRUB_CMDLINE_LINUX.
        exec { 'set-kernel-nopat':
          command => 'sed -i \'/GRUB_CMDLINE_LINUX/ s/"$/ nopat"/\' /etc/default/grub',
          unless  => 'grep -q nopat /etc/default/grub',
          notify  => Exec['regen-grub2'],
        }

        # Regenerate the GRUB 2 config if it does not yet contain nopat.
        exec { 'regen-grub2':
          command => 'grub2-mkconfig > /etc/grub2.cfg',
          onlyif  => 'grep -q nopat /etc/default/grub',
          unless  => 'grep -q nopat /etc/grub2.cfg',
        }
      } else {
        # Older releases (legacy GRUB): let grubby update every kernel entry.
        exec { 'set-kernel-nopat-grubby':
          command => 'grubby --update-kernel=ALL --args="nopat"',
          unless  => 'grep -q kernel.*nopat$ /boot/grub/grub.conf',
        }
      }
    }
I can’t find anything that says setting nopat is bad, and the systems are running fine with the parameter set.