Sorting Processes
A few weeks back, while teaching a SmartOS Internals class, a few of the students had an interesting suggestion for me. It was either Sam Zaydel from Racktop, or one of the Voxer guys that asked me to do a weekly tweet showing a use of mdb, DTrace, or something else in SmartOS that might be interesting. I thought about this for a bit, and I really like the idea, as one of my favorite things is instructing. The ZFS internals class also like the idea and say they have some good questions I could answer this way, so ask @MrBruning your SmartOS, DTrace, or mdb questions!
There’s a few ways to submit questions to me: Email MrBruning@joyent.com, ask on twitter at hashtag #BruningQuestions, or attend one of my classes. Every week, I’ll answer one of your questions in this new column, to be published every Friday on the Joyent blog. And for Sam, Dave, and Mike, thanks for the idea, and I’ll tweet it out just for you. Here we go with the first question! -- Mr. Bruning.
Q. How do I sort processes in the order they last ran?
Occasionally, while examining a system crash (or even the running system), I would like to see a sorted list of processes with respect to the time they last ran. I used to use scat (Solaris Crash Analysis Tool) for this, but found that the tool is no longer available without logging in to Oracle support. So, how to do the same thing with mdb.
# ls -lh *.0-rw-r--r-- 1 root root 1.7M Nov 29 01:40 unix.0-rw-r--r-- 1 root root 260M Nov 29 01:40 vmcore.0-rw-r--r-- 1 root root 59M Nov 14 08:24 vmdump.0# mdb 0Loading modules: [ unix genunix specfs dtrace mac cpu.generic uppc pcplusmp scsi_vhci ufs ip hook neti sockfs arp usba stmf_sbd stmf zfs lofs mpt idm sd crypto random cpc logindmux ptm sppp nfs ]> ::walk thread | ::print -a -d kthread_t t_hrtime !sed -e 's/t_hrtime = //' -e 's/0t//' | sort -n -k 2,2 | cut -d " " -f 1 > /tmp/foo> ::offsetof kthread_t t_hrtimeoffsetof (kthread_t, t_hrtime) = 0x370, sizeof (...->t_hrtime) = 8> ::cat /tmp/foo | ::eval ".-370::print kthread_t t_procp | ::print proc_t p_user.u_psargs"p_user.u_psargs = [ "sched" ]p_user.u_psargs = [ "sched" ]…p_user.u_psargs = [ "/usr/sbin/nscd" ]p_user.u_psargs = [ "/usr/sbin/nscd" ]p_user.u_psargs = [ "/lib/svc/bin/svc.configd" ]p_user.u_psargs = [ "sched" ]p_user.u_psargs = [ "/usr/sbin/nscd" ]p_user.u_psargs = [ "zpool-testpool" ]>
I’m running mdb on a kernel crash dump (unix.0 and vmcore.0). The same set of commands would also work on a running system, though may not be entirely accurate (things can change while running the three mdb command lines). Note that some processes show up more than once in the output. This is either because there are multiple instances of the process, or the process is multithreaded.
The following goes through the mdb lines and explains what they do. The line:
> ::walk thread | ::print -a -d kthread_t t_hrtime !sed -e 's/t_hrtime = //' -e 's/0t//' | sort -n -k 2,2 | cut -d " " -f 1 > /tmp/foo
does the following:
Walk through all of the kernel threads on the system (threads run, processes are just containers for an address space and for one or more threads).
For each kthread_t, print the
t_hrtime
member. This field is the (high resolution) time that the thread last ran. The “-a” option to print will print the address of the field, and the “-d” will print the value in decimal. The output of::walk thread | ::print -a -d kthread_t t_hrtime
looks like:fffffffffbc2fe30 t_hrtime = 0t361484793159ffffff0003e05fb0 t_hrtime = 0ffffff0003e0bfb0 t_hrtime = 0t1970215059950ffffff0003e11fb0 t_hrtime = 0ffffff0003e17fb0 t_hrtime = 0t4834878495122ffffff0003e1dfb0 t_hrtime = 0…
The next step is to pipe the output of
::walk thread | ::print -a -d kthread_t t_hrtime
tosed -e ‘s/t_hrtime = //’ -e ‘s/0t//’
which stripst_hrtime
and0t
from the output. (The “0t” is mdb’s way of telling you that it is a decimal number. By default, mdb shows hex with::print
).The output from
sed
is then piped tosort -n -k 2,2 | cut -d " " -f 1 > /tmp/foo
, which sorts the output by thet_hrtime
value, and then strips off the time, leaving only the addresses. This output is put into/tmp/foo
. This will sort from earliest to latest. Adding “-r” to thesort
command will sort from latest to earliest. A quick look at/tmp/foo
shows:> ::cat /tmp/fooffffff0003e05fb0ffffff0003e11fb0ffffff0003e1dfb0ffffff0003e23fb0ffffff0003e29fb0ffffff0003e2ffb0…
The next step is to get the offset within kthread_t
of the t_hrtime
member. We need this because we want to get other information from the kthread_t
structure. This is done via:
> ::offsetof kthread_t t_hrtimeoffsetof (kthread_t, t_hrtime) = 0x370, sizeof (...->t_hrtime) = 8>
The final line:
> ::cat /tmp/foo | ::eval ".-370::print kthread_t t_procp | ::print proc_t p_user.u_psargs"
does the following:
- Pipes the contents of the
/tmp/foo
file to the::eval
dcmd. - The
::eval “.-370:print kthread_t t_procp | ::print proc_t p_user.u_psargs"
line takes each line of input from the::cat
, subtracts 0x370 from the address to get the start of thekthread_t
, and prints the command line for the process in which the thread is running. If you wanted to print the stacks instead of the command line, you would use::cat /tmp/foo | ::eval “.-370::threadlist -v”
, or, instead of::threadlist -v
, you could use::findstack
.
What if you wanted to see the order of threads, but only for a specific process? You could use:
> ::pgrep inetd | ::walk thread | ::print -a -d kthread_t t_hrtime !sed -e 's/t_hrtime = //' -e 's/0t//' | sort -n -k 2,2 | cut -d " " -f 1 > /tmp/foo> ::cat /tmp/foo | ::eval ".-370::threadlist -v" ADDR PROC LWP CLS PRI WCHANffffff014aa29860 ffffff01498b7078 ffffff014a597cc0 2 29 0 PC: _resume_from_idle+0xf1 CMD: /usr/lib/inet/inetd start stack pointer for thread ffffff014aa29860: ffffff00045c6d50 [ ffffff00045c6d50 _resume_from_idle+0xf1() ] swtch+0x141() shuttle_swtch+0x203() door_return+0x214() doorfs32+0x180() _sys_sysenter_post_swapgs+0x149() ADDR PROC LWP CLS PRI WCHANffffff01496c14e0 ffffff01498b7078 ffffff014a5983c0 2 29 0 PC: _resume_from_idle+0xf1 CMD: /usr/lib/inet/inetd start stack pointer for thread ffffff01496c14e0: ffffff00045c0d20 [ ffffff00045c0d20 _resume_from_idle+0xf1() ] swtch_to+0xb6() shuttle_resume+0x2af() door_return+0x3e0() doorfs32+0x180() _sys_sysenter_post_swapgs+0x149()…
What if you wanted to know how long it has been since threads last ran? Or you wanted to look at user level stacks? That will wait for a future blog. To ask this question, or questions of your own, submit to MrBruning@joyent.com or on Twitter at #BruningQuestions
Post written by Joyent