Nerdier

Adjective: Comparative form of nerdy: more nerdy.

CRITICAL – hpasmd needs to be restarted, System: ‘unknown’, S/N: ‘unknown’, ROM: ‘unknown’

We hit this error on our HP servers when running the check_hp health check for Sensu/Nagios. Starting the hp-health service also produced a segfault in dmesg:

[6822665.991794] hpasmlited[1183913]: segfault at 0 ip 0000000000421ffb sp 00007ffe3c24fe40 error 4 in hpasmlited[400000+38000]

After some digging, I found that the issue seems to stem from servers booting in legacy BIOS mode instead of UEFI.

The fix is to add the “nopat” parameter to the kernel boot line. I appended it to GRUB_CMDLINE_LINUX in /etc/default/grub and regenerated the grub config. Once the server had been rebooted, the service started:

# service hp-health status
Redirecting to /bin/systemctl status hp-health.service
● hp-health.service - HP System Health Monitor
   Loaded: loaded (/usr/lib/systemd/system/hp-health.service; enabled; vendor preset: disabled)
   Active: active (running) since Wed 2020-08-05 19:53:42 BST; 1s ago
  Process: 922779 ExecStop=/usr/lib/systemd/scripts/hp-health.sh stop (code=exited, status=0/SUCCESS)
  Process: 922868 ExecStart=/usr/lib/systemd/scripts/hp-health.sh start (code=exited, status=0/SUCCESS)
 Main PID: 922941 (hpasmlited)
   CGroup: /system.slice/hp-health.service
           └─922941 hpasmlited -f /dev/hpilo

and check_hp returned the expected result:

# check_hp
OK - System: 'proliant dl360 gen9', S/N: 'xxxxx', ROM: 'P89 03/25/2019', hardware working fine, da: 1 logical drives, 2 physical drives
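
The manual change described above can be sketched as a small shell function. This is only a sketch, assuming GNU sed and a GRUB_CMDLINE_LINUX line that ends in a double quote; add_nopat is a hypothetical helper name, not something shipped with hp-health or grub.

```shell
# Hypothetical helper: append nopat to GRUB_CMDLINE_LINUX in the given
# grub defaults file, unless the parameter is already there.
add_nopat() {
    grub_defaults="$1"
    # Already present on the GRUB_CMDLINE_LINUX line: nothing to do.
    if grep -q '^GRUB_CMDLINE_LINUX=.*nopat' "$grub_defaults"; then
        return 0
    fi
    # Replace the closing quote of the line with ' nopat"' (GNU sed -i).
    sed -i '/^GRUB_CMDLINE_LINUX=/ s/"$/ nopat"/' "$grub_defaults"
}

# Usage on a real host (as root), followed by a reboot:
#   add_nopat /etc/default/grub
#   grub2-mkconfig > /etc/grub2.cfg
```

The function is safe to run more than once, which is the same idea as the unless guard in the Puppet code.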

I then applied the fix to the rest of our HP servers with Puppet:

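# Only touch HP ProLiant hardware.
if ($::productname =~ /ProLiant/) {
    # versioncmp compares release numbers numerically, so '10' sorts after '7'.
    if (versioncmp($::operatingsystemmajrelease, '7') >= 0) {
        # EL7 and later: append nopat to GRUB_CMDLINE_LINUX, then regenerate the grub2 config.
        exec { 'set-kernel-nopat':
            command => 'sed -i \'/GRUB_CMDLINE_LINUX/ s/"$/ nopat"/\' /etc/default/grub',
            unless  => 'grep -q nopat /etc/default/grub',
            path    => ['/usr/sbin', '/usr/bin', '/sbin', '/bin'],
            notify  => Exec['regen-grub2'],
        }
        exec { 'regen-grub2':
            command  => 'grub2-mkconfig > /etc/grub2.cfg',
            onlyif   => 'grep -q nopat /etc/default/grub',
            unless   => 'grep -q nopat /etc/grub2.cfg',
            path     => ['/usr/sbin', '/usr/bin', '/sbin', '/bin'],
            # The command uses output redirection, so run it through a shell.
            provider => shell,
        }
    } else {
        # EL6 and earlier: legacy grub, so let grubby add the argument to every kernel entry.
        exec { 'set-kernel-nopat-grubby':
            command => 'grubby --update-kernel=ALL --args="nopat"',
            unless  => 'grep -q kernel.*nopat$ /boot/grub/grub.conf',
            path    => ['/usr/sbin', '/usr/bin', '/sbin', '/bin'],
        }
    }
}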

The nopat parameter disables the kernel’s use of the x86 Page Attribute Table. I haven’t found anything that says setting it is harmful, and the systems are running fine with the parameter set.
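
To confirm that a rebooted host really is running with the parameter, the kernel command line in /proc/cmdline can be checked. A minimal sketch; has_nopat is a hypothetical helper name, and the optional file argument exists only so the check can be pointed at a copy of the command line.

```shell
# Hypothetical helper: succeed if "nopat" appears as a whole word in the
# kernel command line file (defaults to the running kernel's /proc/cmdline).
has_nopat() {
    cmdline_file="${1:-/proc/cmdline}"
    grep -qw nopat "$cmdline_file"
}

# Usage on a host:
#   has_nopat && echo "nopat is active" || echo "nopat is missing"
```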
