Now consider the case where Thread 2 is made to heartbeat. A HAM will automatically detect when the thread is unresponsive and can terminate it and/or perform recovery.
Thread 1 Thread 2 ... ... (Run at high priority) (Run at low priority) while true while true do do obtain lock a obtain lock a send heartbeat (compute section1) (compute section1) release lock a release lock a done done ... ...
Here Thread 2 is expected to send heartbeats to a HAM. By placing the heartbeat call within the inside loop, the HAM detects when Thread 2 begins to starve.
The threads will execute as described earlier, but eventually Thread 2 will starve. We'll wait for a reasonable amount of time (some seconds) until it does. The threads write a simple execution log in /dev/shmem/mutex-starvation-heartbeat.log. The HAM detects that the thread has stopped heartbeating and terminates the process, after saving its state for postmortem analysis.
Let's look at what happens:
Here's the current state of the threads in process 753682 and the state of mutex-starvation when it missed heartbeats:
pid tid name prio STATE Blocked 753682 1 t/mutex-starvation 10r JOIN 4 753682 2 t/mutex-starvation 63r RECEIVE 1 753682 3 t/mutex-starvation 1r MUTEX 753682-04 #-2147 753682 4 t/mutex-starvation 63r NANOSLEEP Entity state from HAM Path : mutex-starvation Entity Pid : 753682 Num conditions : 1 Condition type : ATTACHEDSELF Stats: HeartBeat Period: 1000000000 HB Low Mark : 5 HB High Mark : 5 Last Heartbeat : 2001/09/03 14:44:37:796119160 HeartBeat State : MISSEDHIGH Created : 2001/09/03 14:44:34:780239800 Num Restarts : 0
And here's the tail from the threads' log file:
Thread 1: Unlocking lock a Thread 1: Locking lock a Thread 1: Unlocking lock a Thread 1: Locking lock a Thread 1: Unlocking lock a Thread 1: Locking lock a Thread 1: Unlocking lock a Thread 1: Locking lock a Thread 1: Unlocking lock a Thread 1: Locking lock a
/tmp/mutex-starvation.core: processor=ARM num_cpus=2 cpu 1 cpu=602370 name=604e speed=299 flags=0xc0000001 FPU MMU EAR cpu 2 cpu=602370 name=604e speed=299 flags=0xc0000001 FPU MMU EAR cyc/sec=16666666 tod_adj=999522656000000000 nsec=5627098907040 inc=999960 boot=999522656 epoch=1970 intr=-2147483648 rate=600000024 scale=-16 load=16666 MACHINE="mtx604-smp" HOSTNAME="localhost" hwflags=0x000004 pretend_cpu=0 init_msr=36866 pid=753682 parent=729105 child=0 pgrp=753682 sid=1 flags=0x000300 umask=0 base_addr=0x48040000 init_stack=0x4803f9f0 ruid=0 euid=0 suid=0 rgid=0 egid=0 sgid=0 ign=0000000006801000 queue=ff00000000000000 pending=0000000000000000 fds=5 threads=4 timers=1 chans=4 thread 1 REQUESTED ip=0xfe32f8c8 sp=0x4803f880 stkbase=0x47fbf000 stksize=528384 state=JOIN flags=0 last_cpu=2 timeout=00000000 pri=10 realpri=10 policy=RR thread 2 ip=0xfe32f1a8 sp=0x47fbef50 stkbase=0x47f9e000 stksize=135168 state=RECEIVE flags=4000000 last_cpu=2 timeout=00000000 pri=63 realpri=63 policy=RR blocked_chid=1 thread 3 ip=0xfe32f838 sp=0x47f9df80 stkbase=0x47f7d000 stksize=135168 state=MUTEX flags=4000000 last_cpu=2 timeout=00000000 pri=1 realpri=1 policy=RR thread 4 ip=0xfe32f9a0 sp=0x47f7cf20 stkbase=0x47f5c000 stksize=135168 state=NANOSLEEP flags=4000000 last_cpu=1 timeout=0x1001000 pri=63 realpri=63 policy=RR