Tuesday, September 1, 2015

Debugging Process Hang

A few month ago, when I was launching my Visual Studio, it took almost 2 minutes until it was loaded completely. It seemed like a process hang. I wondered what the problem was, so I decided to dump the system into a complete dump file.

I looked for a running thread of msdev.exe after I opened the dump file with WinDbg.

0: kd> !process 0 3f msdev.exe
PROCESS fffffa800fce6b30
    SessionId: 1  Cid: 1a10    Peb: 7efdf000  ParentCid: 1268
    DirBase: 1dfa1c000  ObjectTable: fffff8a00862a110  HandleCount: 314.
    Image: MSDEV.EXE
    VadRoot fffffa80107d3c90 Vads 367 Clone 0 Private 12752. Modified 197443. Locked 0.
    DeviceMap fffff8a007ec5160
    Token                             fffff8a0157176f0
    ElapsedTime                       00:00:45.997
    UserTime                          00:00:00.000
    KernelTime                        00:00:00.062
    ...

        THREAD fffffa80126a0060  Cid 1a10.19d8  Teb: 000000007efdb000 Win32Thread: fffff900c2d53c20 RUNNING on processor 3

        Not impersonating
        DeviceMap                 fffff8a007ec5160
        Owning Process            fffffa800fce6b30       Image:         MSDEV.EXE
        Attached Process          N/A            Image:         N/A
        Wait Start TickCount      131878         Ticks: 0
        Context Switch Count      156311         IdealProcessor: 7                 LargeStack
        UserTime                  00:00:21.762
        KernelTime                00:00:13.041
*** ERROR: Module load completed but symbols could not be loaded for MSDEV.EXE
        Win32 Start Address MSDEV (0x000000000040155c)
        Stack Init fffff8800c441530 Current fffff8800c441210
        Base fffff8800c442000 Limit fffff8800c435000 Call fffff8800c441580
        Priority 9 BasePriority 8 UnusualBoost 0 ForegroundBoost 0 IoPriority 2 PagePriority 5

        Child-SP          RetAddr           Call Site

        fffff880`0c440680 fffff880`012aa404 Ntfs!NtfsFindPrefix+0x222
        fffff880`0c440730 fffff880`012a7af3 Ntfs!NtfsFindStartingNode+0x6e4
        fffff880`0c440800 fffff880`0128a149 Ntfs!NtfsCommonCreate+0x3d3
        fffff880`0c4409d0 fffff880`01140367 Ntfs!NtfsNetworkOpenCreate+0x119
        fffff880`0c440cb0 fffff880`011438ba fltmgr!FltpPerformFastIoCall+0x357
        fffff880`0c440d10 fffff880`01161a9f fltmgr!FltpPassThroughFastIo+0xda
        fffff880`0c440d50 fffff800`035d3c72 fltmgr!FltpFastIoQueryOpen+0x10f
        fffff880`0c440df0 fffff800`035cf878 nt!IopParseDevice+0x1249
        fffff880`0c440f50 fffff800`035d0a96 nt!ObpLookupObjectName+0x588
        fffff880`0c441040 fffff800`035aef66 nt!ObOpenObjectByName+0x306
        fffff880`0c441110 fffff800`032d2e53 nt!NtQueryAttributesFile+0x145
        fffff880`0c4413a0 00000000`7754168a nt!KiSystemServiceCopyEnd+0x13 (TrapFrame @ fffff880`0c4413a0)
        00000000`0008c9f8 00000000`73a2ae19 ntdll!NtQueryAttributesFile+0xa
        00000000`0008ca00 00000000`73a1d18f wow64!whNtQueryAttributesFile+0x91
        00000000`0008ca80 00000000`750c2776 wow64!Wow64SystemServiceEx+0xd7
        00000000`0008d340 00000000`73a1d286 wow64cpu!ServiceNoTurbo+0x2d
        00000000`0008d400 00000000`73a18a90 wow64!RunCpuSimulation+0xa
        00000000`0008d450 00000000`739e2c52 wow64!Wow64KiUserCallbackDispatcher+0x204
        00000000`0008d7a0 00000000`775411f5 wow64win!whcbfnDWORD+0xe2
        00000000`0008e190 00000000`739efe4a ntdll!KiUserCallbackDispatcherContinue (TrapFrame @ 00000000`0008e058)
        00000000`0008e218 00000000`739caf02 wow64win!ZwUserMessageCall+0xa
        00000000`0008e220 00000000`739e281f wow64win!whNT32NtUserMessageCallCB+0x32
        00000000`0008e270 00000000`739cb03e wow64win!Wow64DoMessageThunk+0x8b
        00000000`0008e2b0 00000000`73a1d18f wow64win!whNtUserMessageCall+0x12e
        00000000`0008e350 00000000`750c2776 wow64!Wow64SystemServiceEx+0xd7
        00000000`0008ec10 00000000`73a1d286 wow64cpu!ServiceNoTurbo+0x2d
        00000000`0008ecd0 00000000`73a1c69e wow64!RunCpuSimulation+0xa
        00000000`0008ed20 00000000`77534966 wow64!Wow64LdrpInitialize+0x42a
        00000000`0008f270 00000000`77531937 ntdll!LdrpInitializeProcess+0x17e3
        00000000`0008f760 00000000`7751c34e ntdll! ?? ::FNODOBFM::`string'+0x28ff0
        00000000`0008f7d0 00000000`00000000 ntdll!LdrInitializeThunk+0xe


The thread was consuming UserTime 21 seconds and KernelTime 13 seconds. The other threads were not meaningful because they were suspended all.

I could be aware of that the running thread was handling a file by calling NtQueryAttributesFile from the call stack. I wondered what the file was as well so I decided to get the file name from the first parameter of NtQueryAttributesFile.

Even though analyzing 64 bit call stack is a little tough task but I can suggest simplified steps as following.


1. Finding the stack location of the first parameter

This is the function prototype of NtQueryAttributesFile.

NTSTATUS 
NtQueryAttributesFile(
    POBJECT_ATTRIBUTES ObjectAttributes,
    PFILE_BASIC_INFORMATION FileInformation
);

The first parameter ObjectAttributes contains a file name so we need to find a location where ObjectAttributes saved from the code flow of NtQueryAttributesFile.

0: kd> uf nt!NtQueryAttributesFile
nt!NtQueryAttributesFile:
fffff800`035aee20 48895c2418      mov     qword ptr [rsp+18h],rbx
fffff800`035aee25 56              push    rsi
fffff800`035aee26 57              push    rdi
fffff800`035aee27 4154            push    r12
fffff800`035aee29 4881ec70020000  sub     rsp,270h
fffff800`035aee30 488b05a9fbe9ff  mov     rax,qword ptr [nt!_security_cookie (fffff800`0344e9e0)]
fffff800`035aee37 4833c4          xor     rax,rsp
fffff800`035aee3a 4889842460020000 mov     qword ptr [rsp+260h],rax
fffff800`035aee42 488bda          mov     rbx,rdx
fffff800`035aee45 488bf1          mov     rsi,rcx  ; saving 1st param (rsi <= rcx)
fffff800`035aee48 65488b042588010000 mov   rax,qword ptr gs:[188h]
fffff800`035aee51 0fb6b8f6010000  movzx   edi,byte ptr [rax+1F6h]
fffff800`035aee58 4084ff          test    dil,dil
fffff800`035aee5b 7432            je      nt!NtQueryAttributesFile+0x6e (fffff800`035aee8f)

nt!NtQueryAttributesFile+0x3d:
fffff800`035aee5d 488bca          mov     rcx,rdx
fffff800`035aee60 f6c207          test    dl,7
fffff800`035aee63 750e            jne     nt!NtQueryAttributesFile+0x53 (fffff800`035aee73)

nt!NtQueryAttributesFile+0x45:
fffff800`035aee65 488b0594c1f5ff  mov     rax,qword ptr [nt!MmUserProbeAddress (fffff800`0350b000)]
fffff800`035aee6c 483bd0          cmp     rdx,rax
fffff800`035aee6f 7308            jae     nt!NtQueryAttributesFile+0x58 (fffff800`035aee79)

nt!NtQueryAttributesFile+0x51:
fffff800`035aee71 eb09            jmp     nt!NtQueryAttributesFile+0x5b (fffff800`035aee7c)

nt!NtQueryAttributesFile+0x53:
fffff800`035aee73 e818070b00      call    nt!ExRaiseDatatypeMisalignment (fffff800`0365f590)
fffff800`035aee78 cc              int     3

nt!NtQueryAttributesFile+0x58:
fffff800`035aee79 488bc8          mov     rcx,rax

nt!NtQueryAttributesFile+0x5b:
fffff800`035aee7c 0fb601          movzx   eax,byte ptr [rcx]
fffff800`035aee7f 8801            mov     byte ptr [rcx],al
fffff800`035aee81 0fb64127        movzx   eax,byte ptr [rcx+27h]
fffff800`035aee85 884127          mov     byte ptr [rcx+27h],al
fffff800`035aee88 eb05            jmp     nt!NtQueryAttributesFile+0x6e (fffff800`035aee8f)

nt!NtQueryAttributesFile+0x6e:
fffff800`035aee8f 41bcb8000000    mov     r12d,0B8h
fffff800`035aee95 4d8bc4          mov     r8,r12
fffff800`035aee98 33d2            xor     edx,edx
fffff800`035aee9a 488d4c2450      lea     rcx,[rsp+50h]
fffff800`035aee9f e8ec6ed2ff      call    nt!memset (fffff800`032d5d90)
fffff800`035aeea4 41bb08000000    mov     r11d,8
fffff800`035aeeaa 6644895c2450    mov     word ptr [rsp+50h],r11w
fffff800`035aeeb0 664489642452    mov     word ptr [rsp+52h],r12w
fffff800`035aeeb6 b807000000      mov     eax,7
fffff800`035aeebb 6689842496000000 mov     word ptr [rsp+96h],ax
fffff800`035aeec3 c78424a800000001000000 mov dword ptr [rsp+0A8h],1
fffff800`035aeece c784249000000000402000 mov dword ptr [rsp+90h],204000h
fffff800`035aeed9 48899c24b0000000 mov     qword ptr [rsp+0B0h],rbx
fffff800`035aeee1 488d842410010000 lea     rax,[rsp+110h]
fffff800`035aeee9 48898424b8000000 mov     qword ptr [rsp+0B8h],rax
fffff800`035aeef1 c68424d100000001 mov     byte ptr [rsp+0D1h],1
fffff800`035aeef9 488d842450010000 lea     rax,[rsp+150h]
fffff800`035aef01 48898424d8000000 mov     qword ptr [rsp+0D8h],rax
fffff800`035aef09 4889b42480000000 mov     qword ptr [rsp+80h],rsi     <= saving rsi to a stack area
fffff800`035aef11 c78424e000000020000000 mov dword ptr [rsp+0E0h],20h
fffff800`035aef1c 65488b042588010000 mov   rax,qword ptr gs:[188h]
fffff800`035aef25 48ff8030030000  inc     qword ptr [rax+330h]
fffff800`035aef2c 65ff042564220000 inc     dword ptr gs:[2264h]
fffff800`035aef34 488d442440      lea     rax,[rsp+40h]
fffff800`035aef39 4889442430      mov     qword ptr [rsp+30h],rax
fffff800`035aef3e 488d442450      lea     rax,[rsp+50h]
fffff800`035aef43 4889442428      mov     qword ptr [rsp+28h],rax
fffff800`035aef48 c744242080000000 mov     dword ptr [rsp+20h],80h
fffff800`035aef50 4533c9          xor     r9d,r9d
fffff800`035aef53 440fb6c7        movzx   r8d,dil
fffff800`035aef57 488b15eac0f5ff  mov     rdx,qword ptr [nt!IoFileObjectType (fffff800`0350b048)]
fffff800`035aef5e 488bce          mov     rcx,rsi    <= rsi is passed to 1st param of ObOpenObjectByName
fffff800`035aef61 e82a180200      call    nt!ObOpenObjectByName (fffff800`035d0790)


We know that a first parameter of a function is passed through rcx on 64 bit platform so I traced the data flow of rcx from entry point of NtQueryAttributesFile.

I can summarize the code snippet like this.

fffff800`035aee45 mov rsi,rcx  ; saving 1st param (rsi <= rcx)
fffff800`035aef09 mov qword ptr [rsp+80h],rsi  ; saving rsi to [rsp+80]
fffff800`035aef5e mov rcx,rsi  ; rsi is passed to 1st param of ObOpenObjectByName

The first parameter of ObOpenObjectByName is also POBJECT_ATTRIBUTES.

NTSTATUS STDCALL
ObOpenObjectByName(
  POBJECT_ATTRIBUTES ObjectAttributes,
  POBJECT_TYPE ObjectType,
  PVOID ParseContext,
  KPROCESSOR_MODE AccessMode,
  ACCESS_MASK DesiredAccess,
  PACCESS_STATE PassedAccessState,
  PHANDLE Handle
);

I can be sure that rsi contains the first parameter of NtQueryAttributesFile because it is not changed between fffff8000'035aee45 and fffff800'035aef5e, and it is saved on [rsp+80h].


2. Finding the base pointer of stack frame

I need to know the rsp value when NtQueryAttributesFile is running in order to dereference [rsp+80h] in the stack frame. I can get the rsp of NtQueryAttributesFile from the call stack.

Child-SP          RetAddr           Call Site
fffff880`0c441040 fffff800`035aef66 nt!ObOpenObjectByName+0x306
fffff880`0c441110 fffff800`032d2e53 nt!NtQueryAttributesFile+0x145

WinDbg shows Child-SP kindly, it is rsp in the function so rsp of NtQueryAttributesFile is fffff880'0c441110.

On 64 bit Windows, rsp is a base pointer of stack frame like ebp on 32 bit platform. The rsp value is not changed until the function returns.


3. Checking the parameter

Now, we can read the value stored on [rsp+80].

0: kd> dq fffff880`0c441110+80
fffff880`0c441190  00000000`0008d2e0 00000000`00000000
fffff880`0c4411a0  00070000`00200000 00000000`00000000
...

0: kd> dq 00000000`0008d2e0
00000000`0008d2e0  00000032`00000030 00000000`00000000
00000000`0008d2f0  00000000`0008d2d0 00000000`00000040
00000000`0008d300  00000000`00000000 00000000`00000000

The 0x00000000'0008d2e0 is POBJECT_ATTRIBUTES ObjectAttributes.

0: kd> dt _OBJECT_ATTRIBUTES
ntdll!_OBJECT_ATTRIBUTES
   +0x000 Length           : Uint4B
   +0x008 RootDirectory    : Ptr64 Void
   +0x010 ObjectName       : Ptr64 _UNICODE_STRING
   +0x018 Attributes       : Uint4B
   +0x020 SecurityDescriptor : Ptr64 Void
   +0x028 SecurityQualityOfService : Ptr64 Void

I am interested in the ObjectName field, it has 0x00000000'0008d2d0.

0: kd> dt _UNICODE_STRING 00000000`0008d2d0
ntdll!_UNICODE_STRING
 "\??\D:\TEMP\Project\TestTool\dummy\dummy.cpp"
   +0x000 Length           : 0x5A
   +0x002 MaximumLength    : 0x21a
   +0x008 Buffer           : 0x00000000`04ffd1b8  "\??\D:\TEMP\Project\TestTool\dummy\dummy.cpp"


4. The cause of the process hang

"Oops, the path does not exist."

I realized something after I looked at the path. I had removed the path a few days ago before the process hang happened. The Visual Studio might have a bug that it could not deal with a removed path of recent workspaces correctly.

The process hang was fixed after I had deleted all temporary files of Visual Studio. :)