One of my colleagues has been building beautiful CPU utilisation graphs using Zabbix and data drawn from snmpd. Inconveniently, one machine was persistently showing no idle time, despite being largely idle. A little digging unearthed the following:
- "snmpget -Ov -OU -v 2c -c public localhost .1.3.6.1.4.1.2021.11.53.0 -t 10 | cut --delimiter=' ' -f2" was reliably returning 4294967295 (2^32 minus 1), which rather pointed to the problem. (Note that Zabbix is configured to calculate the difference between successive samples, which is why it was seeing "zero" idle: all of the samples were the same.)
- snmpd draws CPU idle data from /proc/stat which, in this case, contained "cpu 252742675 1810147 102583077 5150841115 48268073 1099082 0 0". The fourth value is the idle counter, and 5150841115 is larger than 2^32.
- .1.3.6.1.4.1.2021.11.53.0 is a 32-bit "COUNTER" rather than a 64-bit big-counter.
My impression is that counters are supposed to simply wrap around (which would make this an snmpd bug, since it's not reporting the value "mod 2^32"), but I can't find clear information. In any event, a simple reboot resolved the problem. Uptime fetishists, take note…
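For anyone who wants to check the arithmetic, here is a minimal Python sketch using the /proc/stat line quoted above. The function names are my own, purely for illustration, and the wrap-aware delta is only what I would expect a consumer of a 32-bit counter to do; I have not checked how Zabbix or snmpd actually implement it.

```python
COUNTER32_MODULUS = 2**32  # a 32-bit counter can hold 0 .. 2^32 - 1


def parse_idle_jiffies(cpu_line: str) -> int:
    """Return the idle field from the aggregate 'cpu' line of /proc/stat."""
    fields = cpu_line.split()
    # fields[0] is the label 'cpu'; the numbers that follow are
    # user, nice, system, idle, ... so idle is fields[4].
    return int(fields[4])


def as_counter32(value: int) -> int:
    """Value a wrapping 32-bit counter would report for a larger raw count."""
    return value % COUNTER32_MODULUS


def counter32_delta(previous: int, current: int) -> int:
    """Difference between two samples, allowing for a single wrap."""
    return (current - previous) % COUNTER32_MODULUS


line = "cpu 252742675 1810147 102583077 5150841115 48268073 1099082 0 0"
idle = parse_idle_jiffies(line)

print(idle > COUNTER32_MODULUS - 1)          # True: too big for 32 bits
print(min(idle, COUNTER32_MODULUS - 1))      # 4294967295, the value observed
print(as_counter32(idle))                    # 855873819, the wrapped value
print(counter32_delta(4294967290, 5))        # 11, a delta across a wrap
```

The observed 4294967295 is what you get if the raw value is pinned at the maximum rather than wrapped, and once every sample is pinned there, the difference between successive samples is always zero.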