Monday, June 2, 2014

Suspended Process (KiDeliverApc)

Many kinds of software problems can show up while we are using a computer. Today, I am going to walk through the analysis of a problem caused by suspended processes.

It was a system hang case. Many processes in the system were suspended, and my goal was to find out who suspended them.

Here is a call stack of taskmgr.exe in the system.


THREAD 8956e660  Cid 0c18.0bcc  Teb: 7ffdd000 Win32Thread: e2975360 WAIT: (Suspended) KernelMode Non-Alertable SuspendCount 1
...
    Owning Process            899cd5a8       Image:      taskmgr.exe
    Attached Process          N/A            Image:         N/A
...
    ChildEBP RetAddr  
    aad52c18 80504d50 nt!KiSwapContext+0x2f (FPO: [Uses EBP] [0,0,4])
    aad52c24 804fcf40 nt!KiSwapThread+0x8a (FPO: [0,0,0])
    aad52c4c 8050448c nt!KeWaitForSingleObject+0x1c2 (FPO: [5,5,4])
    aad52c64 80500dfa nt!KiSuspendThread+0x18 (FPO: [3,0,0])
    aad52cac 80504d6e nt!KiDeliverApc+0x124 (FPO: [3,10,0])
    aad52cc4 804fcf40 nt!KiSwapThread+0xa8 (FPO: [0,0,0])
    aad52cec 805c108c nt!KeWaitForSingleObject+0x1c2 (FPO: [5,5,4])
    aad52d50 8054289c nt!NtWaitForSingleObject+0x9a (FPO: [Non-Fpo])
    aad52d50 7c93e514 nt!KiFastCallEntry+0xfc (FPO: [0,0] TrapFrame @ aad52d64)
    00dbff84 7c802532 ntdll!KiFastSystemCallRet
    00dbff98 0100d90d kernel32!WaitForSingleObject+0x12 (FPO: [2,0,0])
    00dbffb4 7c80b699 taskmgr!WorkerThread+0x3a (FPO: [1,0,4])
    00dbffec 00000000 kernel32!BaseThreadStart+0x37 (FPO: [Non-Fpo])


KiSwapContext is at the top of the call stack, which usually just means the thread has been switched out. The odd thing is that KiSwapThread appears twice in this call stack, which is not the usual case. The KiSwapThread in the middle of the call stack is the context switch for the thread's own wait in WaitForSingleObject, so what is the second KiSwapThread near the top of the call stack?

The call flow there is KiDeliverApc, KiSuspendThread, KeWaitForSingleObject, KiSwapThread, so the second KiSwapThread comes from KiSuspendThread. From this I could tell that some process had suspended the taskmgr.exe thread, for example with the SuspendThread API, and never called ResumeThread.

Now we need to find the process that suspended all the other processes in the system. To find the caller, we first have to understand how KiDeliverApc ends up calling KiSuspendThread, and how the APC (Asynchronous Procedure Call) mechanism works in Windows.
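
The "SuspendCount 1" in the thread header is the key to this reasoning: every suspend request needs a matching resume before the thread can run again. A minimal user-mode sketch of that bookkeeping follows. The type and function names are hypothetical and only model the documented return values of SuspendThread and ResumeThread (both return the previous count); this is not kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one thread's suspend bookkeeping. */
typedef struct {
    int suspend_count;  /* number of outstanding suspend requests */
} model_thread;

/* Models SuspendThread: returns the previous count, then increments it. */
static int model_suspend(model_thread *t)
{
    assert(t != NULL);
    return t->suspend_count++;
}

/* Models ResumeThread: returns the previous count and decrements it
   only if a suspend request is actually outstanding. */
static int model_resume(model_thread *t)
{
    assert(t != NULL);
    int previous = t->suspend_count;
    if (previous > 0)
        t->suspend_count--;
    return previous;
}

/* The thread is runnable only when no suspend request is outstanding. */
static bool model_can_run(const model_thread *t)
{
    assert(t != NULL);
    return t->suspend_count == 0;
}
```

In this model a thread that was suspended once and never resumed stays at count 1 forever, which matches what the taskmgr.exe thread shows.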

The Windows kernel has the concept of IRQL (Interrupt Request Level), and the kernel's thread scheduler performs thread context switching at DISPATCH_LEVEL (IRQL 2). After the scheduler has chosen a thread to run, the IRQL is lowered to APC_LEVEL (IRQL 1) and the APCs queued to the thread are delivered. Then the IRQL is lowered to PASSIVE_LEVEL (IRQL 0), where the thread's own code can run.

Therefore, KiDeliverApc in the call stack means that the APCs queued to the thread are being delivered, and KiSuspendThread being called by KiDeliverApc means that KiSuspendThread was queued to the thread as an APC routine. The prototype of KiDeliverApc is as follows.
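
The ordering of these levels can be sketched as a simple masking rule: a software interrupt requested at some level is only taken while the processor runs below that level. The function name below is hypothetical, and the sketch deliberately ignores everything else the kernel checks before delivering an APC.

```c
#include <assert.h>
#include <stdbool.h>

/* The three lowest IRQL values on x86 Windows. */
enum irql { PASSIVE_LEVEL = 0, APC_LEVEL = 1, DISPATCH_LEVEL = 2 };

/* Hypothetical model of IRQL masking: an interrupt requested at
   'requested' is taken only while the current IRQL is strictly lower. */
static bool interrupt_deliverable(enum irql current, enum irql requested)
{
    /* Nothing is ever requested at PASSIVE_LEVEL itself. */
    assert(requested > PASSIVE_LEVEL);
    return current < requested;
}
```

Under this rule an APC stays pending while the scheduler runs at DISPATCH_LEVEL and is taken as soon as the IRQL drops back below APC_LEVEL.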


VOID
NTAPI
KiDeliverApc(IN KPROCESSOR_MODE DeliveryMode,
             IN PKEXCEPTION_FRAME ExceptionFrame,
             IN PKTRAP_FRAME TrapFrame)

It doesn’t take any APC structure through its parameters. I wondered how KiDeliverApc finds the APCs of the thread, so I analyzed the inside of the KiDeliverApc function. The following pseudocode represents what it does.

KiDeliverApc()
{
    PKTHREAD Thread = KeGetCurrentThread();

    //
    // While APCs exist in ApcState.ApcListHead of the current thread,
    // KiDeliverApc removes each APC from the list and calls its
    // KernelRoutine and then its NormalRoutine.
    //
    while (!IsListEmpty(&Thread->ApcState.ApcListHead[KernelMode]))
    {
        ApcListEntry = Thread->ApcState.ApcListHead[KernelMode].Flink;
        Apc = CONTAINING_RECORD(ApcListEntry, KAPC, ApcListEntry);

        KernelRoutine = Apc->KernelRoutine;
        NormalRoutine = Apc->NormalRoutine;

        // Unlink the APC before running it; otherwise the loop never ends.
        RemoveEntryList(ApcListEntry);

        KernelRoutine(Apc, ...);
        if (NormalRoutine != NULL)  // special kernel APCs have no NormalRoutine
            NormalRoutine(...);
    }
}

The point is that someone inserts an APC whose NormalRoutine is KiSuspendThread into ApcState.ApcListHead of the thread.

To find out who that someone is, I set a memory write breakpoint on the address of Thread->ApcState.ApcListHead so that the debugger breaks whenever an APC is inserted into this thread. The ApcState field is at offset 0x34 in the KTHREAD structure.
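
To make the queueing and delivery concrete, here is a small user-mode simulation of the idea in the pseudocode: entries embedded in a larger record are linked into a circular list, and the delivery loop recovers each record from its list entry before calling the routine. All names are hypothetical stand-ins, and the real KiDeliverApc does considerably more (locking, IRQL handling, separate kernel and user lists).

```c
#include <assert.h>
#include <stddef.h>

/* User-mode stand-in for LIST_ENTRY. */
typedef struct list_entry {
    struct list_entry *flink;
    struct list_entry *blink;
} list_entry;

typedef void (*apc_routine)(void *context);

/* Stand-in for KAPC: the list entry is embedded in the record. */
typedef struct {
    list_entry  entry;
    apc_routine routine;
    void       *context;
} model_apc;

/* Same idea as CONTAINING_RECORD: member address back to record address. */
#define CONTAINER_OF(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

static void list_init(list_entry *head)
{
    head->flink = head->blink = head;
}

static void list_insert_tail(list_entry *head, list_entry *e)
{
    e->flink = head;
    e->blink = head->blink;
    head->blink->flink = e;
    head->blink = e;
}

static list_entry *list_remove_head(list_entry *head)
{
    list_entry *e = head->flink;
    head->flink = e->flink;
    e->flink->blink = head;
    return e;
}

/* Drain the queue in FIFO order; returns how many APCs were delivered. */
static int deliver_apcs(list_entry *head)
{
    int delivered = 0;
    while (head->flink != head) {
        list_entry *e = list_remove_head(head);
        model_apc *apc = CONTAINER_OF(e, model_apc, entry);
        assert(apc->routine != NULL);
        apc->routine(apc->context);
        delivered++;
    }
    return delivered;
}

/* Stand-in for KiSuspendThread: just records that it ran. */
static void mark_suspended(void *context)
{
    *(int *)context += 1;
}
```

Queueing a `model_apc` whose routine is `mark_suspended` and then calling `deliver_apcs` plays the role of another process queueing the suspend APC and the target thread delivering it the next time it is scheduled.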


kd> dt _KTHREAD ApcState.
nt!_KTHREAD
   +0x034 ApcState  : 
      +0x000 ApcListHead : [2] _LIST_ENTRY
      +0x010 Process   : Ptr32 _KPROCESS
      +0x014 KernelApcInProgress : UChar
      +0x015 KernelApcPending : UChar
      +0x016 UserApcPending : UChar
   +0x138 ApcStatePointer : [2] 
   +0x165 ApcStateIndex : UChar

kd> ba w4 (Address of Thread+0x34)


The following call stack is from the moment the debugger hit the breakpoint. This is an operation at thread creation time.

kd> k
ChildEBP RetAddr  
f9b9bb60 804fdaa3 nt!KiInsertQueueApc+0x79
f9b9bb80 805c7436 nt!KeSuspendThread+0x67
f9b9bcc4 805c7f02 nt!PspCreateThread+0x570
f9b9bd3c 8053ea48 nt!NtCreateThread+0xfc
f9b9bd3c 7c93e514 nt!KiFastCallEntry+0xf8

kd> ub nt!KiInsertQueueApc+0x79
nt!KiInsertQueueApc+0x64:
804ff374 eb3f            jmp     nt!KiInsertQueueApc+0xa5 (804ff3b5)
804ff376 0fbeda          movsx   ebx,dl
804ff379 8d3cdf          lea     edi,[edi+ebx*8]
804ff37c 8b5f04          mov     ebx,dword ptr [edi+4]
804ff37f 8d700c          lea     esi,[eax+0Ch]       ; ApcListEntry
804ff382 893e            mov     dword ptr [esi],edi
804ff384 895e04          mov     dword ptr [esi+4],ebx  
804ff387 8933            mov     dword ptr [ebx],esi ; Insertion to list


The list entry is linked in by the instruction at 804ff387. Here edi is the address of Thread->ApcState.ApcListHead, ebx is the list head's Blink (which equals the list head itself when the list is empty), and esi is the address of the list entry being inserted. esi was set at 804ff37f to eax+0Ch, so esi is ApcListEntry and eax is the address of the KAPC, as the following structure shows.

kd> dt _KAPC @eax
nt!_KAPC
   +0x000 Type             : 18
   +0x002 Size             : 48
   +0x004 Spare0           : 0
   +0x008 Thread           : 0x812f1020 _KTHREAD
   +0x00c ApcListEntry     : _LIST_ENTRY [ 0x812f1054 - 0x812f1054 ]
   +0x014 KernelRoutine    : 0x80501ed4     void  nt!KiSuspendNop+0
   +0x018 RundownRoutine   : 0x805277ba     void  nt!PopAttribNop+0
   +0x01c NormalRoutine    : 0x8050230a     void  nt!KiSuspendThread+0
   +0x020 NormalContext    : (null) 
   +0x024 SystemArgument1  : (null) 
   +0x028 SystemArgument2  : (null) 
   +0x02c ApcStateIndex    : 0 ''
   +0x02d ApcMode          : 0 ''
   +0x02e Inserted         : 0 ''


We can see nt!KiSuspendThread+0 in NormalRoutine at offset 0x01c. We now know the instruction pointer where the KAPC is inserted and the KAPC field offset where the KiSuspendThread pointer is stored, so I am going to set a breakpoint at 804ff389 (nt!KiInsertQueueApc+0x79) with a condition that it stops only if NormalRoutine is KiSuspendThread. That way the debugger breaks whenever someone inserts an APC that carries KiSuspendThread.

kd> u nt!KiSuspendThread L1
nt!KiSuspendThread:
8050230a 64a124010000    mov     eax,dword ptr fs:[00000124h]

kd> bp nt!KiInsertQueueApc+79 ".if (poi(@eax+1c)==8050230a) {} .else {gc}"
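
What the breakpoint condition evaluates can be sketched in C. The struct below mirrors the 32-bit KAPC layout from the dt output, with pointers modeled as uint32_t so that the offsets are the same on any host; `kapc32`, `poi32` and `should_break` are hypothetical names for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 32-bit KAPC layout as printed by "dt _KAPC". */
typedef struct {
    int16_t  Type;             /* +0x000 */
    int16_t  Size;             /* +0x002 */
    uint32_t Spare0;           /* +0x004 */
    uint32_t Thread;           /* +0x008 */
    uint32_t ApcListEntry[2];  /* +0x00c Flink, Blink */
    uint32_t KernelRoutine;    /* +0x014 */
    uint32_t RundownRoutine;   /* +0x018 */
    uint32_t NormalRoutine;    /* +0x01c */
    uint32_t NormalContext;    /* +0x020 */
    uint32_t SystemArgument1;  /* +0x024 */
    uint32_t SystemArgument2;  /* +0x028 */
    int8_t   ApcStateIndex;    /* +0x02c */
    int8_t   ApcMode;          /* +0x02d */
    uint8_t  Inserted;         /* +0x02e */
} kapc32;

/* Models the debugger's poi(): read a 32-bit value at base + offset. */
static uint32_t poi32(const void *base, size_t offset)
{
    uint32_t value;
    memcpy(&value, (const unsigned char *)base + offset, sizeof value);
    return value;
}

/* Models ".if (poi(@eax+1c)==8050230a) {} .else {gc}":
   stop only when NormalRoutine is the given KiSuspendThread address. */
static bool should_break(const kapc32 *apc, uint32_t ki_suspend_thread)
{
    assert(apc != NULL);
    return poi32(apc, 0x1c) == ki_suspend_thread;
}
```

The address 8050230a is specific to this kernel build, which is why it was looked up with `u nt!KiSuspendThread L1` first; on another system the constant in the condition would have to be looked up again.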

After several breakpoint hits from thread creation, I eventually got the following call stack.

kd> k
ChildEBP RetAddr  
f766ecc8 804fdaa3 nt!KiInsertQueueApc+0x79
f766ece8 805cb671 nt!KeSuspendThread+0x67
f766ed28 805cb706 nt!PsSuspendThread+0x6f
f766ed44 805cb8fe nt!PsSuspendProcess+0x28
f766ed58 8053ea48 nt!NtSuspendProcess+0x40
f766ed58 7c93e514 nt!KiFastCallEntry+0xf8
WARNING: Stack unwind information not available. Following frames may be wrong.
00000000 00000000 ntdll!KiFastSystemCallRet

This call stack means that the process called the native API NtSuspendProcess from user mode (there is no documented Win32 SuspendProcess function). It is responsible for calling NtResumeProcess afterwards, but it never did, so this process is the root cause of the system hang.
