Recommendations for HPC Platform Verification¶
For general notes on running memtester, IOZone, IMB and HPL see - https://github.com/alces-software/knowledgebase/wiki/Burn-in-Testing
Further details can be found at:
Checking Hardware and Software Configuration¶
Check CPU type:
pdsh -g groupname 'grep -m 1 name /proc/cpuinfo'
Check CPU count:
pdsh -g groupname 'grep processor /proc/cpuinfo |wc -l'
Check RAID active:
pdsh -g groupname 'cat /proc/mdstat | grep md[0-1]'
Check Infiniband up/active:
pdsh -g groupname 'ibstatus |grep phys'
Check free memory:
pdsh -g groupname 'free -m |grep ^Mem'
Check GPU type and count:
pdsh -g groupname 'nvidia-smi'
Grab serial numbers:
pdsh -g groupname 'dmidecode -t baseboard |grep -i serial'