Here’s a good question for all of you: What is the average daily CPU utilization of your Linux systems? If you know, you’re probably on the higher end of utilization, or you have recently done a study with an eye to reducing costs, for example through server consolidation. If you don’t know, why not? I always recommend that you gather data, even if it’s not that often, because it’s better to know than not to know, and your systems can give you a lot of good information.

Over the years I have used a wide range of methods to get system information, including extremely complex and expensive tools from the higher-end vendors. However, as with most of Open Source, no single tool is the be-all and end-all; it takes a toolbox to really get what you want. Here is the set of tools that I think will be of help to you.

Command Line Tools:

top – One of the most basic tools. It’s quick, easy to use and has some surprising features, such as toggle keys that turn features on and off while top is displaying its information (i to toggle off idle processes, b to toggle bold highlighting and H to toggle the display of individual threads).

htop – An improved version of top, not included in most distros, but you can get it here. Check out the comparison of top and htop.

iftop – Like the top command, but for network traffic: it shows a sorted view of bandwidth usage by connection on a network interface. Toggles include s to show or hide the source host, d to show or hide the destination host and t to cycle through the display options. Get iftop here.

iptraf – Where iftop is a little simplistic, iptraf offers a lot more complexity and options. Rather than try to explain it in detail, there’s a great article that explains a lot about iptraf. Get iptraf here.

uptime – Talk about simple: the uptime command reports the current time, how long the system has been up, the number of logged-in users and the load averages for the last 1, 5 and 15 minutes. (Included in SLE) Sample output is shown below:

13:30 up 2 days, 15:39, 2 users, load averages: 1.07 1.16 1.24
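
If you want to collect these numbers over time rather than eyeball them, the load averages can be pulled out of a line like the one above with a few lines of Python. This is only a minimal sketch: the function name parse_load_averages is made up for illustration, and it assumes the label reads either "load average:" or "load averages:" with values separated by commas or spaces.

```python
import re


def parse_load_averages(uptime_line):
    """Return the three load averages from an uptime-style line as floats."""
    # Accept both "load average:" and "load averages:" as the label.
    match = re.search(r"load averages?:\s*(.*)$", uptime_line)
    if match is None:
        raise ValueError("no load average field found")
    # Values may be separated by commas, spaces, or both.
    parts = [p for p in re.split(r"[,\s]+", match.group(1).strip()) if p]
    if len(parts) != 3:
        raise ValueError("expected exactly three load averages")
    return tuple(float(p) for p in parts)


# The sample line shown in this article.
sample = "13:30 up 2 days, 15:39, 2 users, load averages: 1.07 1.16 1.24"
print(parse_load_averages(sample))
```

On a live system, Python's os.getloadavg() returns the same three values directly, so parsing is mostly useful when all you have is saved uptime output.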

strace – Traces the system calls and signals that a particular command uses. You can save the output to a file or filter it through grep for keywords like “error”. Very useful for a misbehaving program, or for errors that aren’t displayed in the interface. It can trace either a program at execution (“strace /bin/date”) or a running process (“strace -p 4899”). (Included in SLE)
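
Once the trace is saved to a file (for example with strace's -o option), a small script can do a more targeted job than grep. The sketch below lists only the system calls that failed. It assumes the usual strace line shape, where a failed call ends in "= -1" followed by an errno name; the function name failed_syscalls and the sample lines are invented for illustration and are not real trace output.

```python
import re

# A failed call looks roughly like:
#   name(args...) = -1 ERRNAME (description)
FAILED_CALL = re.compile(r"(\w+)\(.*\)\s+=\s+-1\s+([A-Z][A-Z0-9]*)")


def failed_syscalls(lines):
    """Return (syscall, errno_name) pairs for every failed call in the trace."""
    failures = []
    for line in lines:
        match = FAILED_CALL.search(line)
        if match:
            failures.append((match.group(1), match.group(2)))
    return failures


# Illustrative lines in the shape strace prints; not captured from a real run.
sample_trace = [
    'openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3',
    'access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)',
    'read(3, "...", 832) = 832',
    'openat(AT_FDCWD, "/root/.config", O_RDONLY) = -1 EACCES (Permission denied)',
]

for name, err in failed_syscalls(sample_trace):
    print(name, err)
```

To run it against a real trace, open the saved file and pass the file object to failed_syscalls, since it only needs something that yields lines.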

ltrace – Monitors dynamic library calls made by a program, either at execution or while running (same syntax as strace above). (Included in SLE)

sar – Possibly one of the most useful (and vexing) commands you can use to monitor a system. The System Activity Reporter gathers the specified information from the system on a scheduled basis and builds log files, which you can then report on or mine for specific time frames to see what was going on between two points in time. You can monitor many things with sar, including file access routines, buffer activity, system call activity, block device activity, paging and much more.
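
To make the idea of reporting on what happened "between two points in time" concrete, here is a rough sketch of the arithmetic behind a CPU utilization figure, which also answers the question this article opened with. It is not how sar itself is implemented. It assumes two snapshots of the aggregate "cpu" line from /proc/stat, the function names are hypothetical, and the sample counters are made up for illustration.

```python
def cpu_times(stat_cpu_line):
    """Parse the aggregate 'cpu' line of /proc/stat into tick counters."""
    fields = stat_cpu_line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected a line starting with 'cpu'")
    # Keep user, nice, system, idle, iowait, irq, softirq, steal.
    # Guest time is already included in user time, so later fields are skipped.
    return [int(value) for value in fields[1:9]]


def utilization_percent(earlier_line, later_line):
    """Percentage of CPU time spent non-idle between two snapshots."""
    before = cpu_times(earlier_line)
    after = cpu_times(later_line)
    deltas = [b - a for a, b in zip(before, after)]
    total = sum(deltas)
    if total <= 0:
        raise ValueError("snapshots must be given in increasing time order")
    # Count both idle and iowait (when present) as idle time.
    idle = deltas[3] + (deltas[4] if len(deltas) > 4 else 0)
    return 100.0 * (total - idle) / total


# Made-up counters, for illustration only.
earlier = "cpu  1000 0 500 8000 500 0 0 0 0 0"
later = "cpu  1300 0 600 8500 600 0 0 0 0 0"
print(utilization_percent(earlier, later))
```

On a live system you would read the first line of /proc/stat, wait for the interval you care about, read it again, and pass both lines in. Averaging such readings over a day gives the daily figure.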

Hopefully this little roundup is helpful to the troops out there. Part II will cover the GUI tools for monitoring Linux systems.