Wednesday, January 9, 2013

A micro Windows crash catcher in Python

In this article we describe how to write a minimalistic Windows debugging loop in Python. Modern applications usually spawn more than one process, and the bugs in them generate different types of crashes. Our minimalistic debugger shall detect "any" crash condition of a process or process tree. Be aware that our aim is purely educational; more mature and complete options exist. If you need a full-fledged debugger in Python, you should check winappdbg.


After dealing with full/complete/true debuggers for a while, we decided to develop our own tiny debugger in Python from scratch. Understanding the inner workings of a debugger is a must when dealing with complex bugs in complex application settings. Our tiny debugger is only a few lines long and does one of two things: it detects a crash, or it closes the process after a timeout.

First steps:

Our code runs on Windows and depends only on ctypes. ctypes is a mature and powerful library for accessing native code from Python. With ctypes we can interface with the system DLLs, using C-like structures for parameter passing, directly from the interpreter. For example, we can call the kernel32.CreateProcess function from our Python code.

Using ctypes:

Suppose we want to launch the Windows calculator. The Windows API call for creating new processes is kernel32.CreateProcess. Its signature is the following:
BOOL WINAPI CreateProcess(
  _In_opt_     LPCTSTR lpApplicationName,
  _Inout_opt_  LPTSTR lpCommandLine,
  _In_opt_     LPSECURITY_ATTRIBUTES lpProcessAttributes,
  _In_opt_     LPSECURITY_ATTRIBUTES lpThreadAttributes,
  _In_         BOOL bInheritHandles,
  _In_         DWORD dwCreationFlags,
  _In_opt_     LPVOID lpEnvironment,
  _In_opt_     LPCTSTR lpCurrentDirectory,
  _In_         LPSTARTUPINFO lpStartupInfo,
  _Out_        LPPROCESS_INFORMATION lpProcessInformation
);

Passing "calc.exe" as the path argument should do the trick. But first we need to teach Python how to encode the different data types needed. Fundamental data types such as char, int and void* are supported by ctypes out of the box. Most of the CreateProcess argument types are fundamental and natively supported by ctypes, for example lpCommandLine.

Other parameters, like lpProcessAttributes, are optional and can be NULL in most cases, so there is no need to define their types in Python.

Summing up, for our purpose CreateProcess uses only two arguments with complex types: LPPROCESS_INFORMATION and LPSTARTUPINFO. So we need to define these two non-trivial data structures in ctypes.

Defining the structures:

The MSDN C definition of the PROCESS_INFORMATION structure is the following:

  typedef struct _PROCESS_INFORMATION {
    HANDLE hProcess;
    HANDLE hThread;
    DWORD  dwProcessId;
    DWORD  dwThreadId;
  } PROCESS_INFORMATION, *LPPROCESS_INFORMATION;

We implement the equivalent structure in Python as a ctypes Structure subclass:

  class ProcessInfo(Structure):
      _fields_ = [('hProcess', HANDLE),
                  ('hThread',  HANDLE),
                  ('dwProcessId', DWORD),
                  ('dwThreadId',  DWORD)]

where HANDLE and DWORD are aliases for the existing ctypes types c_void_p and c_ulong, respectively.

Note: StartupInfo, the counterpart of the STARTUPINFO structure, is defined in a similar way.
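As an illustration of that note, here is a minimal sketch of what StartupInfo could look like. It mirrors the ANSI STARTUPINFOA layout from the Windows headers. The alias names are assumptions (the real ones live in the defines module, which is not shown here), and DWORD is spelled c_uint32 instead of c_ulong only so that the layout is the same on every platform.

```python
from ctypes import (Structure, POINTER, sizeof,
                    c_char_p, c_ubyte, c_uint32, c_ushort, c_void_p)

# Assumed aliases; on Windows c_ulong is also 32 bits wide.
HANDLE = c_void_p
DWORD = c_uint32
WORD = c_ushort
LPSTR = c_char_p
LPBYTE = POINTER(c_ubyte)


class StartupInfo(Structure):
    """ctypes mirror of the ANSI STARTUPINFOA structure."""
    _fields_ = [('cb',              DWORD),
                ('lpReserved',      LPSTR),
                ('lpDesktop',       LPSTR),
                ('lpTitle',         LPSTR),
                ('dwX',             DWORD),
                ('dwY',             DWORD),
                ('dwXSize',         DWORD),
                ('dwYSize',         DWORD),
                ('dwXCountChars',   DWORD),
                ('dwYCountChars',   DWORD),
                ('dwFillAttribute', DWORD),
                ('dwFlags',         DWORD),
                ('wShowWindow',     WORD),
                ('cbReserved2',     WORD),
                ('lpReserved2',     LPBYTE),
                ('hStdInput',       HANDLE),
                ('hStdOutput',      HANDLE),
                ('hStdError',       HANDLE)]


si = StartupInfo()
# The Windows documentation asks callers to store the structure size in cb.
si.cb = sizeof(si)
```

All other fields can stay zeroed for our purpose, which is what a freshly created ctypes structure gives us.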

Running Calc.exe with ctypes:

Once the structures used in the call to CreateProcess are defined, the code to launch calc.exe is:

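  from ctypes import *
  from defines import *
  kernel32 = windll.kernel32
  # Instantiate the structs
  si = StartupInfo()
  pi = ProcessInfo()
  # CreateProcessA takes ANSI byte strings; the b prefix also works on Python 3
  cmd = b"c:\\windows\\system32\\calc.exe"
  kernel32.CreateProcessA(c_char_p(cmd),      # lpApplicationName
                          c_char_p(b"calc"),  # lpCommandLine
                          None,               # lpProcessAttributes
                          None,               # lpThreadAttributes
                          False,              # bInheritHandles
                          0,                  # dwCreationFlags
                          None,               # lpEnvironment
                          None,               # lpCurrentDirectory
                          byref(si),          # lpStartupInfo
                          byref(pi))          # lpProcessInformation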

If you run this, the calc.exe program will be launched.

Note: the defines module is a Python file with the definitions of structures such as STARTUPINFO and PROCESS_INFORMATION (StartupInfo and ProcessInfo in the code) and of the other types and constants used in this article.

Debugging a process:
We craft the CreateProcess arguments so that the debuggee process passes every exception to the debugger. In a debugging session, any exception in the debugged process is reported to the debugger as an event. Events include process and thread creation, the loading of a dynamic-link library and, in general, any exception occurring in the debugged process.

A process can also create multiple child processes and, of course, we are interested in the children's crashes as well. So, using the dwCreationFlags argument, we indicate that once the debuggee creates a new child process, the debugger also gets the events produced by the newly created processes.

Most of the magic resides in the dwCreationFlags argument. Setting the DEBUG_PROCESS flag (value 1) there indicates that the calling thread is debugging the new process and its child processes.
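To make the flag concrete, the sketch below names the two debug-related creation flags with their values from the Windows SDK headers. The helper creation_flags is hypothetical and only illustrates the choice between following child processes or not; the article's own code simply passes 1.

```python
# Process creation flags, values as defined in the Windows SDK headers.
DEBUG_PROCESS = 0x00000001            # debug the new process and its children
DEBUG_ONLY_THIS_PROCESS = 0x00000002  # debug the new process only


def creation_flags(follow_children=True):
    """Hypothetical helper: pick the dwCreationFlags value for CreateProcess."""
    if follow_children:
        return DEBUG_PROCESS
    return DEBUG_ONLY_THIS_PROCESS


flags = creation_flags(follow_children=True)
```

Our crash catcher wants the whole process tree, so it uses DEBUG_PROCESS.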

The next step is to control the behavior of the newly created process, the debuggee. For this the debugger must execute a debugging loop, as explained in the next section.

The main loop:

The main debugging loop is quite simple and is described in some detail in the MSDN article "Writing the Debugger's Main Loop". It consists of an "infinite loop" that waits for an event to occur in the attached processes. When an event happens, the debugger decides how to proceed based on the information provided in the DEBUG_EVENT structure.

Each event is obtained using the WaitForDebugEvent function, provided by kernel32.dll.
For each event received, the debugger must decide what to do with it. For convenience, custom handler functions are defined for each type of event (e.g. OnCreateThreadDebugEvent in "Writing the Debugger's Main Loop") and a switch statement dispatches each event to its corresponding handler.
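Python has no switch statement, so one way to express the same dispatch is a dictionary that maps event codes to handlers. The following is only a sketch: the event codes and continue statuses are the values from the Windows headers, while the handler names and the dispatch function are made up for illustration.

```python
# Debug event codes and continue statuses from the Windows headers.
EXCEPTION_DEBUG_EVENT = 1
CREATE_THREAD_DEBUG_EVENT = 2
CREATE_PROCESS_DEBUG_EVENT = 3
EXIT_THREAD_DEBUG_EVENT = 4
EXIT_PROCESS_DEBUG_EVENT = 5
LOAD_DLL_DEBUG_EVENT = 6
UNLOAD_DLL_DEBUG_EVENT = 7
OUTPUT_DEBUG_STRING_EVENT = 8
RIP_EVENT = 9

DBG_CONTINUE = 0x00010002
DBG_EXCEPTION_NOT_HANDLED = 0x80010001


def on_exception(pid, tid):
    # Hypothetical handler: let the debuggee deal with its own exception.
    return DBG_EXCEPTION_NOT_HANDLED


def on_ignored(pid, tid):
    # Hypothetical default handler: nothing to do, just resume the thread.
    return DBG_CONTINUE


# Event code -> handler; codes missing from the table use on_ignored.
HANDLERS = {
    EXCEPTION_DEBUG_EVENT: on_exception,
}


def dispatch(event_code, pid, tid):
    """Return the continue status chosen by the handler for this event."""
    handler = HANDLERS.get(event_code, on_ignored)
    return handler(pid, tid)
```

The value returned by dispatch is what would be passed to ContinueDebugEvent as the continue status.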

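Thus the debugger's main task is to react to every DEBUG_EVENT structure:
typedef struct _DEBUG_EVENT {
  DWORD dwDebugEventCode;
  DWORD dwProcessId;
  DWORD dwThreadId;
  union {
    EXCEPTION_DEBUG_INFO      Exception;
    CREATE_THREAD_DEBUG_INFO  CreateThread;
    CREATE_PROCESS_DEBUG_INFO CreateProcessInfo;
    EXIT_THREAD_DEBUG_INFO    ExitThread;
    EXIT_PROCESS_DEBUG_INFO   ExitProcess;
    LOAD_DLL_DEBUG_INFO       LoadDll;
    UNLOAD_DLL_DEBUG_INFO     UnloadDll;
    OUTPUT_DEBUG_STRING_INFO  DebugString;
    RIP_INFO                  RipInfo;
  } u;
} DEBUG_EVENT, *LPDEBUG_EVENT;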

This structure is filled in for our target process (and its child processes) and gives us information about the event fired by the debugged process. When a debugged process starts, it generally produces and dispatches events; each one must be "handled" by the debugger and then eventually passed back to the originating thread so it can continue.
To do that, we need to build a debug loop to manage all the events produced.

Note: We need to build this structure using ctypes (the procedure is similar to the one for the ProcessInfo class). ctypes also provides union support.
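A reduced sketch of such a definition follows. It declares only the Exception member of the union, which is the one our crash catcher reads; a complete definition would list the other members as well. The assumption here is that Exception is the largest member, so the reduced union still has the right size. As elsewhere, the alias names are illustrative and DWORD is spelled c_uint32 to keep the layout platform independent.

```python
from ctypes import (Structure, Union, POINTER, sizeof,
                    c_size_t, c_uint32, c_void_p)

DWORD = c_uint32
PVOID = c_void_p
ULONG_PTR = c_size_t              # pointer-sized unsigned integer
EXCEPTION_MAXIMUM_PARAMETERS = 15


class EXCEPTION_RECORD(Structure):
    pass  # fields assigned below because the struct points to itself


EXCEPTION_RECORD._fields_ = [
    ('ExceptionCode',        DWORD),
    ('ExceptionFlags',       DWORD),
    ('ExceptionRecord',      POINTER(EXCEPTION_RECORD)),
    ('ExceptionAddress',     PVOID),
    ('NumberParameters',     DWORD),
    ('ExceptionInformation', ULONG_PTR * EXCEPTION_MAXIMUM_PARAMETERS),
]


class EXCEPTION_DEBUG_INFO(Structure):
    _fields_ = [('ExceptionRecord', EXCEPTION_RECORD),
                ('dwFirstChance',   DWORD)]


class DEBUG_EVENT_UNION(Union):
    # Only the member used in this article; the others are omitted.
    _fields_ = [('Exception', EXCEPTION_DEBUG_INFO)]


class DEBUG_EVENT(Structure):
    _fields_ = [('dwDebugEventCode', DWORD),
                ('dwProcessId',      DWORD),
                ('dwThreadId',       DWORD),
                ('u',                DEBUG_EVENT_UNION)]


# Fill one in by hand, the way WaitForDebugEvent would for an access violation.
debug = DEBUG_EVENT()
debug.dwDebugEventCode = 1  # EXCEPTION_DEBUG_EVENT
debug.u.Exception.ExceptionRecord.ExceptionCode = 0xC0000005
```

The nested attribute path debug.u.Exception.ExceptionRecord.ExceptionCode is the same one used in the debugging loop.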

We aim for a simple crash catcher, so we only need to know whether the event was due to a crash.

  if debug.dwDebugEventCode == EXCEPTION_DEBUG_EVENT:
      if debug.u.Exception.ExceptionRecord.ExceptionCode in InterestingExceptions:
          print('EXCEPTION CODE: %s' % hex(debug.u.Exception.ExceptionRecord.ExceptionCode))
          closed = 'Crashed'
      else:
          dwContinueStatus = DBG_EXCEPTION_NOT_HANDLED
  elif debug.dwDebugEventCode == CREATE_PROCESS_DEBUG_EVENT:
      pids.append(debug.dwProcessId)  # a new process joined the tree
  elif debug.dwDebugEventCode == EXIT_PROCESS_DEBUG_EVENT:
      pids.remove(debug.dwProcessId)

At each event the originating thread is stopped to let the debugger handle the event. After the event is handled, we need to explicitly call ContinueDebugEvent to let the debuggee thread continue.

At this point we can write the whole debugging loop like this (cmd, the command line as a byte string, and timeout, in seconds, are set by the caller; time comes from "from time import time"):

  1. pi = ProcessInfo()
  2. si = StartupInfo()
  3. success = kernel32.CreateProcessA(c_char_p(0),  #NULL application name, so cmd must not be None
  4.                                   c_char_p(cmd),
  5.                                   0,
  6.                                   0,
  7.                                   0,
  8.                                   #debug flag with follow forks (DEBUG_PROCESS)
  9.                                   1,
  10.                                  0,
  11.                                  0,
  12.                                  byref(si),
  13.                                  byref(pi))
  14. pids = None
  15. closed = "Normal"
  16. maxTime = time()+timeout
  17. dwContinueStatus = DBG_CONTINUE
  18. debug = DEBUG_EVENT()
  19. while pids is None or pids:
  20.     #Wait up to 100 ms for a debugging event to occur. The second
  21.     #parameter is the timeout in milliseconds.
  22.     if kernel32.WaitForDebugEvent(byref(debug), 100):
  23.         #Process the debugging event code.
  24.         if debug.dwDebugEventCode == EXCEPTION_DEBUG_EVENT:
  25.             #Process the exception code. When handling
  26.             #exceptions, remember to set the continuation
  27.             #status parameter (dwContinueStatus). This value
  28.             #is used by the ContinueDebugEvent function.
  29.             if debug.u.Exception.ExceptionRecord.ExceptionCode in InterestingExceptions:
  30.                 print('EXCEPTION CODE: %s' % hex(debug.u.Exception.ExceptionRecord.ExceptionCode))
  31.                 closed = 'Crashed'
  32.             else:
  33.                 dwContinueStatus = DBG_EXCEPTION_NOT_HANDLED
  34.         elif debug.dwDebugEventCode == CREATE_PROCESS_DEBUG_EVENT:
  35.             if pids is None:
  36.                 pids = []
  37.             pids.append(debug.dwProcessId)
  38.         elif debug.dwDebugEventCode == EXIT_PROCESS_DEBUG_EVENT:
  39.             pids.remove(debug.dwProcessId)
  40.
  41.         #If crashed or the timeout was reached
  42.         #close all processes in the debugging loop.
  43.         if maxTime < time() or closed == 'Crashed':
  44.             if closed != 'Crashed':
  45.                 closed = 'Timeout'
  46.             for pid in reversed(pids):
  47.                 handle = kernel32.OpenProcess(1, 0, pid)  #PROCESS_TERMINATE, no handle inheritance
  48.                 kernel32.TerminateProcess(handle, 0)
  49.                 kernel32.CloseHandle(handle)
  50.         #print(repr(pids))
  51.         kernel32.ContinueDebugEvent(debug.dwProcessId, debug.dwThreadId, dwContinueStatus)

The main loop runs until there are no more PIDs in the pids array. Whenever a process is created, either the main process or one of its children, its PID is stored in the pids array; it is removed from there when the process exits.

If the debuggee crashed or exceeded the timeout, we close all the running processes in reverse order (lines 46 - 49). Terminating a process at this point (line 48) forces its removal from the pids array in a later iteration (line 39).

Handling events:
For each iteration, we wait up to 100 ms for an event to occur. Several events could happen, but at this point we are interested in only three of them: EXCEPTION_DEBUG_EVENT, CREATE_PROCESS_DEBUG_EVENT and EXIT_PROCESS_DEBUG_EVENT. The rest are ignored.

EXCEPTION_DEBUG_EVENT is used to detect a crash condition.
CREATE_PROCESS_DEBUG_EVENT and EXIT_PROCESS_DEBUG_EVENT are used to manage the pids array: to add PIDs to it (line 37) or remove them from it (line 39).
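The bookkeeping can be tried without a debuggee by replaying a made-up event stream. In this sketch the function track_pids and the PIDs 100 and 200 are purely illustrative; the event codes are the values from the Windows headers.

```python
CREATE_PROCESS_DEBUG_EVENT = 3
EXIT_PROCESS_DEBUG_EVENT = 5


def track_pids(events):
    """Replay (event_code, pid) pairs and return the live PIDs after each one."""
    pids = None  # None means "no process seen yet", as in the main loop
    history = []
    for code, pid in events:
        if code == CREATE_PROCESS_DEBUG_EVENT:
            if pids is None:
                pids = []
            pids.append(pid)
        elif code == EXIT_PROCESS_DEBUG_EVENT and pids:
            pids.remove(pid)
        # Any other event code leaves the list untouched.
        history.append(list(pids or []))
    return history


# A parent (100) spawns a child (200); the child exits first, then the parent.
events = [(CREATE_PROCESS_DEBUG_EVENT, 100),
          (CREATE_PROCESS_DEBUG_EVENT, 200),
          (EXIT_PROCESS_DEBUG_EVENT, 200),
          (EXIT_PROCESS_DEBUG_EVENT, 100)]
history = track_pids(events)
```

Once the last snapshot is empty, the condition of the main loop becomes false and the debugger stops.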

The interesting exceptions:
MSDN lists a set of exception codes that can be reported with an event of type EXCEPTION_DEBUG_EVENT (line 24). From that set, we filter out the least interesting codes. The crash-related codes are compiled in the InterestingExceptions array:

  InterestingExceptions = [EXCEPTION_ACCESS_VIOLATION,
                           EXCEPTION_ARRAY_BOUNDS_EXCEEDED,
                           EXCEPTION_DATATYPE_MISALIGNMENT,
                           EXCEPTION_ILLEGAL_INSTRUCTION,
                           EXCEPTION_IN_PAGE_ERROR,
                           EXCEPTION_PRIV_INSTRUCTION,
                           EXCEPTION_STACK_OVERFLOW]

So, when an exception code falls in InterestingExceptions, we conclude that a crash occurred and can proceed to collect all the information about why it was produced. Otherwise we continue waiting for another event.

Note: The exception codes are simply numbers; the definitions for each code are in the defines module mentioned previously.
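For reference, this is roughly what that part of the defines module might contain. The numeric values are the ones from the Windows headers; the helper is_crash and the use of EXCEPTION_BREAKPOINT as a counterexample are additions made for illustration.

```python
# Exception codes, values as defined in the Windows headers.
EXCEPTION_ACCESS_VIOLATION = 0xC0000005
EXCEPTION_ARRAY_BOUNDS_EXCEEDED = 0xC000008C
EXCEPTION_DATATYPE_MISALIGNMENT = 0x80000002
EXCEPTION_ILLEGAL_INSTRUCTION = 0xC000001D
EXCEPTION_IN_PAGE_ERROR = 0xC0000006
EXCEPTION_PRIV_INSTRUCTION = 0xC0000096
EXCEPTION_STACK_OVERFLOW = 0xC00000FD
EXCEPTION_BREAKPOINT = 0x80000003  # not a crash: debuggers see it routinely

InterestingExceptions = [EXCEPTION_ACCESS_VIOLATION,
                         EXCEPTION_ARRAY_BOUNDS_EXCEEDED,
                         EXCEPTION_DATATYPE_MISALIGNMENT,
                         EXCEPTION_ILLEGAL_INSTRUCTION,
                         EXCEPTION_IN_PAGE_ERROR,
                         EXCEPTION_PRIV_INSTRUCTION,
                         EXCEPTION_STACK_OVERFLOW]


def is_crash(exception_code):
    """Hypothetical helper: True when the code is one we treat as a crash."""
    return exception_code in InterestingExceptions
```

With this, is_crash(EXCEPTION_STACK_OVERFLOW) is true while is_crash(EXCEPTION_BREAKPOINT) is false, so breakpoint events do not end the session.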

Testing our micro debugger:

We test our implementation using crash_test.cpp. The program crashes in several different ways, possibly spawning child processes. Different crashes can be triggered using command line arguments.

For example we can run:
  crash_test.exe spawn spawn crash

This run will create 2 child processes and then crash. With this we can test how our micro debugger performs in different settings.

How to use it:
    crash_test.exe [ behavior  ... ]
      where behavior = crash | spawn | EXCEPTION
                         EXCEPTION_ARRAY_BOUNDS_EXCEEDED |
                         EXCEPTION_ILLEGAL_INSTRUCTION | 
                         EXCEPTION_PRIV_INSTRUCTION | 

Note: The crash_test.exe can't reproduce a DATATYPE_MISALIGNMENT or IN_PAGE_ERROR exception. Any comments about how to produce such exceptions are welcome.

The Demo:
We have tested our approach thoroughly. To show the behavior of our micro crash catcher, here we paste some pictures of it in action.

A normal program won't crash:
In this case we launch "crash_test.exe SPAWN SPAWN SPAWN SPAWN" with our debugger. Five processes are created (including the crash_test.exe parent process) and their PIDs are shown in brackets. When the crash_test.exe process exits, the variable "closed" must be equal to "Normal".

A crashing process:
In contrast, here we debug "crash_test.exe SPAWN EXCEPTION_STACK_OVERFLOW", obtaining the exception code 0xc00000fd, which represents the STACK_OVERFLOW exception.

A divergent program will timeout:
In this case we want to show that a process that never ends eventually times out. Here we use the never-ending interactive calc.exe. The timeout is set to 5 seconds. As you can see, the calculator starts and waits for the user to input some numbers, and after 5 seconds it is forced to close.

Other combinations of crashes, normal termination and never-ending processes were also tested. Comments, doubts or suggestions are welcome.


  1. Maybe DATATYPE_MISALIGNMENT can be triggered by using SSE instructions; their aligned loads and stores require pointers to be aligned to 16 bytes. I haven't tried it, though.

    It may also be possible on Windows Phone, since ARM processors are also sensitive to data alignment.

  2. There's also this for triggering exceptions, but it's cheating :)