DoS? Then Who Was Phone?

Introduction

This post presents exploitation notes on a vulnerability we discovered in Asterisk, an open source telephony solution produced by Digium. We reported this bug to Digium on November 27th, 2012, and provided it to customers of the Exodus Intelligence Feed as EIP-2012-0008. Digium released the advisory AST-2012-014 for this vulnerability on January 2nd, 2013, which was picked up shortly after by some of the aggregator sites and incorrectly categorized as a denial-of-service; however, this bug is certainly exploitable. As we found it fun to analyze, and since discussions about server-side memory bugs are a little sparse now-a-days, we thought it would be cool to share for others who might also find it interesting.

Vulnerability

The vulnerability resides in the HTTP Asterisk Management Interface (AMI) service, and is the result of an alloca being used to “allocate” memory with a remotely-supplied, untrusted size value. The vulnerability is present in the Asterisk source code file main/http.c, specifically in the function ast_http_get_post_vars, which as the name would suggest is used to parse HTTP POST variable data. A snip of the pertinent vulnerable code in this function is shown below:

struct ast_variable *ast_http_get_post_vars(
  struct ast_tcptls_session_instance *ser, struct ast_variable *headers)
{
  int content_length = 0;
  struct ast_variable *v, *post_vars=NULL, *prev = NULL;
  char *buf, *var, *val;

  for (v = headers; v; v = v->next) {
    if (!strcasecmp(v->name, "Content-Type")) {
      if (strcasecmp(v->value, "application/x-www-form-urlencoded")) {
        return NULL;
      }
      break;
    }
  }

  for (v = headers; v; v = v->next) {
    if (!strcasecmp(v->name, "Content-Length")) {
      content_length = atoi(v->value) + 1;
      break;
    }
  }

  if (!content_length) {
    return NULL;
  }

  if (!(buf = alloca(content_length))) {
    return NULL;
  }
  if (!fgets(buf, content_length, ser->f)) {
    return NULL;
  }

The code shows the length value being converted from the Content-Length string using atoi, then incremented by one and stored in the content_length variable. Memory is obtained by alloca for the expected content length, and pointed to by *buf. Finally, fgets is called to read the expected amount of content data into this buffer. I found it interesting that the code looks as though it may have been written with memory management issues in mind, as the check to ensure content_length is not zero would catch an integer overflow caused by adding one to the value.

Below is a snip of disassembled code for the vulnerable function as compiled in the Asterisk package shipped with Ubuntu. This snip shows the size value being set and used to subtract the stack pointer (ESP) to “allocate” stack memory:

<ast_http_get_post_vars+187>:   call   <strtol@plt>
<ast_http_get_post_vars+192>:   mov    edx,eax
<ast_http_get_post_vars+194>:   add    edx,0x1
<ast_http_get_post_vars+197>:   je     <ast_http_get_post_vars+408>
<ast_http_get_post_vars+203>:   mov    ecx,DWORD PTR [ebp-0x30]
<ast_http_get_post_vars+206>:   add    eax,0x1f
<ast_http_get_post_vars+209>:   and    eax,0xfffffff0
<ast_http_get_post_vars+212>:   sub    esp,eax <----- LOL
<ast_http_get_post_vars+214>:   lea    esi,[esp+0x1b]

As shown, the alloca is compiled into a simple set of instructions to ADD and AND-off the size to be allocated from the stack. It then subtracts the revised size from the stack pointer, and stores an address derived from this into the ESI register for further use.

Exploitation Obstacles

Since most compilers implement alloca as a fairly direct subtraction of the stack pointer, the exploitation of alloca is often as simple as providing a size value large enough to wrap the stack pointer around to a desirable location higher on the stack. Subsequent use of the pointer to store remotely supplied data would then result in stack memory corruption, and allow for vanilla exploitation techniques to gain control of program execution flow.

However, here the vulnerable code uses the function fgets to read network data into the obtained memory space. This complicates the situation for exploitation as the libc implementation of fgets performs a check on its length argument to ensure that it is not beyond the signed integer boundary of 0x7FFFFFFF. If this check fails, fgets does not read data and returns an error. The code snip below shows the check performed inside of fgets as implemented in libc.6.so:

<fgets+0>:     sub    esp,0x4c
<fgets+3>:     mov    DWORD PTR [esp+0x48],ebp
<fgets+7>:     mov    ebp,DWORD PTR [esp+0x54]
<fgets+11>:    mov    DWORD PTR [esp+0x3c],ebx
<fgets+15>:    call   <mov_esp_ebx>
<fgets+20>:    add    ebx,0x14051c
<fgets+26>:    mov    DWORD PTR [esp+0x40],esi
<fgets+30>:    mov    esi,DWORD PTR [esp+0x58]
<fgets+34>:    test   ebp,ebp
<fgets+36>:    mov    DWORD PTR [esp+0x44],edi
<fgets+36>:    mov    DWORD PTR [esp+0x44],edi
<fgets+40>:    jle    <fgets+336>
...
<fgets+336>:   mov    DWORD PTR [esp+0x50],0x0
<fgets+344>:   jmp    <fgets+256>
...
<fgets+256>:   mov    eax,DWORD PTR [esp+0x50]
<fgets+260>:   mov    ebx,DWORD PTR [esp+0x3c]
<fgets+264>:   mov    esi,DWORD PTR [esp+0x40]
<fgets+268>:   mov    edi,DWORD PTR [esp+0x44]
<fgets+272>:   mov    ebp,DWORD PTR [esp+0x48]
<fgets+276>:   add    esp,0x4c
<fgets+279>:   ret    

The EBP register, containing the length argument, is checked to be a positive signed value using the TEST and JLE instructions at <fgets+34> and <fgets+40>. If the check fails, the code jumps to return an error, making fgets unusable for exploiting a wrapped stack pointer to overwrite memory with data read from the network. While stack corruption by this means is still possible through the pushing and moving of data to the stack by other compiled code operations, the lack of control and limited set of operations make this approach undesirable.

At this point some might categorize this vulnerability as purely theoretical or possibly even unexploitable. As I hope many readers would agree, a challenge of this nature is always inviting. The Exodus team loves goading and trolling one another in these scenarios, usually with something along the lines of “Yeah, it is probably too tough for you to exploit…” or “you should probably just give up.” The recipient of this pep talk usually proceeds to cry and reevaluate the code until an idea hits them or they decide to resign to a life of PCI compliance auditing. Challenge accepted.

EIP Control

After spending some time analyzing the problem and hating computers, I found a way to exploit this vulnerability. The HTTP listener for the Asterisk Management Interface handles every new connection by creating a new thread to execute a designated worker function to process the request. The code to setup and complete this task is spread out across multiple functions and macros and is a little messy, so we’ll try to keep details to a minimum. The HTTP AMI is started initially by a call chain of functions starting with ast_http_init, which calls __ast_http_load, which then calls ast_tcptls_server_start. The function ast_tcptls_server_start performs standard TCP socket setup operations, and is defined as:

void ast_tcptls_server_start(struct ast_tcptls_session_args *desc)

Despite the name, ast_tcptls_server_start is used for both TLS and non-TLS service setup. The single argument taken by this function is a structure describing aspects of the server to be started. From __ast_http_load, the call looks like:

ast_tcptls_server_start(&http_desc);

The structure structure http_desc is defined in main/http.c as:

static struct ast_tcptls_session_args http_desc = {
  .accept_fd = -1,
  .master = AST_PTHREADT_NULL,
  .tls_cfg = NULL,
  .poll_timeout = -1,
  .name = "http server",
  .accept_fn = ast_tcptls_server_root,
  .worker_fn = httpd_helper_thread,
};

The .accept_fn is a function pointer for a function to accept the connection, and the worker_fn is a pointer to the worker function responsible for processing the request once a new thread is created. After more setup code, a new thread is created to accept socket connections by calling the function ast_tcptls_server_root. For each TCP connection accepted on the listening HTTP port (default 8088), ast_tcptls_server_root calls the following thread creation wrapper to create a new thread and eventually call the worker function:

...
if (ast_pthread_create_detached_background(&launched, NULL, handle_tcptls_connection, tcptls_session)) {
  ast_log(LOG_WARNING, "Unable to launch helper thread: %s\n", strerror(errno));
   ast_tcptls_close_session_file(tcptls_session);
   ao2_ref(tcptls_session, -1);
   }
...

The function ast_pthread_create_detached_background is a macro wrapper for the function ast_pthread_create_stack. The macro definition looks roughly like:

ast_pthread_create_detached_stack(a, b, c, d, AST_BACKGROUND_STACKSIZE, ...)

The important thing to note here is the argument AST_BACKGROUND_STACKSIZE. This is used by the function to set the new thread's stack size attribute before starting the thread:

pthread_attr_setstacksize(attr, stacksize ? stacksize : AST_STACKSIZE)
...
return pthread_create(thread, attr, start_routine, data);

For builds without low memory restrictions defined, the AST_BACKGROUND_STACKSIZE and the AST_STACKSIZE macros are defined as:

#define AST_BACKGROUND_STACKSIZE AST_STACKSIZE
#define AST_STACKSIZE (((sizeof(void *) * 8 * 8) - 16) * 1024) /* becomes 0x3C000 */

The use of AST_STACKSIZE, or 0x3C000, to set the size of the stack for each new HTTP thread is significant, as it means the stack of the newly created thread will begin at 0x3C000 below the top of the previous thread's stack. In turn, if a value of this size or greater is used for alloca pointer subtraction, the resulting stack pointer will overlap with the stack memory of a newer thread. By carefully synchronizing the state of the threads involved so they do not collide their shared use of stack memory, it is possible to use this behavior to overwrite the contents of one thread's stack area with network data read by another thread. To visualize this, and because I love drawing stack diagrams, I present the following bad art:

By offsetting from the higher stack by 0x3C000, the stack pointer will be at the equivalent location in the lower stack

Synchronizing the two threads such that they do not collide and clobber each other's critical stack contents is as simple as not sending data when a given thread is expecting it. While one thread is waiting for data in a blocking read operation, the other thread may be using the stack. Using the HTTP POST method (as is required to trigger the vulnerability) allows for two separate network read operations per thread: one for the initial read of HTTP headers, and a second for reading the HTTP Content-Data. Having two individual network read operations per thread provides enough blocking opportunity to align the augmented stack pointer of the first thread to a desirable location used by the second thread. Better yet, this provides an opportunity to align the pointer of the first thread to a location that is not yet used by the second thread, but will be used once the second thread completes its initial read and resumes execution. The following diagram steps attempt to illustrate this process, ignoring trivial details and using round numbers for simplicity.

1. Two socket connections to the HTTP AMI service are established, causing Asterisk to create two threads to handle the connections. Both threads are expecting HTTP headers and so they are both blocking on a read operation. To depict the state of these threads:

two threads created, with their stacks offset by 0x3C000

2. Thread1 is sent HTTP headers with an HTTP Content-Length string equivalent to 0x3C900. Once headers are received, Thread1's initial read operation finishes. It performs the alloca, subtracting its stack pointer by 0x3C900, which places its pointer for *buf at 0x900 bytes down from the top of Thread2's stack:

Thread1 stack pointer now overlaps with the stack area allocated for thread2

3. Thread1 is then sent approximately 0x700 bytes of the 0x3C900 it is expecting. This advances the *buf pointer index used by fgets up the stack, closer to Thread2's current stack pointer. Thread1 continues waiting as it has not yet received the full amount of data expected (0x3C900).

The location into *buf is advanced by 0x700, moving it up the stack

4. Thread2, still waiting on its initial network read, is sent HTTP POST headers with a Content-Length string equivalent to approximately 0x200, which it uses for its own alloca, subtracting from its stack pointer. Coordinating this length carefully places it precisely where the *buf pointer in Thread1 fgets currently points. Thread2 then calls fgets to receive its HTTP Content-Data, causing it to block while waiting to read in data.

The stack frame for the call performed by thread2 sites directly next to the current *buf index of thread1

5. Thread1 is sent 4 more bytes of the data it is waiting to receive, which is stored starting at its current *buf index in fgets, and overwrites where Thread2's stored return address is for fgets. A return from fgets can then be triggered by sending the remaining data expected, or a newline character, or also by simply closing the connection. Once Thread2 returns, EIP is restored from the overwritten return address value and execution flow is controlled.

Clockwork

Protection Mechanisms

Precisely overwriting only desired stack contents leaves stack canaries intact so that they do not interfere with exploitation. To avoid non-executable memory protections, typical return-oriented techniques may be employed to reuse existing executable memory once execution flow is controlled. This leaves Address Space Layout Randomization (ASLR), and more specifically, Asterisk builds compiled as Position-Independent-Executables (PIE) as the only remaining obstacle to overcome, as fixed return locations cannot be used in this case.

While the default Makefile generated to compile Asterisk from source does not include flags for PIE, popular Linux distributions may package their own Asterisk builds compiled with PIE for extra security, such as with Ubuntu (props to @kees_cook for keeping us on our toes with this). ASLR via PIE significantly frustrates exploitation. Since Ubuntu is a popular distribution, and having set the bar for difficulty in this case, the Ubuntu Asterisk build is the target we challenged ourselves with.

Who Was Phone

I will save you from babble about entropy and efforts made to try and guess addresses in the presence of ASLR. Instead we will discuss how this vulnerability can be reliably exploited for memory disclosure, and used to determine the location of Asterisk code memory to redirect execution to.

The function generic_http_callback in main/manager.c is the URL handling function executed when triggering the vulnerability, and is defined as:

static int generic_http_callback(struct ast_tcptls_session_instance *ser,
             enum ast_http_method method,
             enum output_format format,
             struct sockaddr_in *remote_address, const char *uri,
             struct ast_variable *get_params,
             struct ast_variable *headers)
{

Above you can see the output_format argument format is an enumeration value for one of the possible formats used for the reply. Its expected possible values are 0, 1, or 2 for "plain", "html", "xml" respectively. This value is used to retrieve a pointer from a global array when constructing a response in generic_http_callback:

/* ... */
  ast_str_append(&http_header, 0,
    "Content-type: text/%s\r\n"
    "Cache-Control: no-cache;\r\n"
    "Set-Cookie: mansession_id=\"%08x\"; Version=1; Max-Age=%d\r\n"
    "Pragma: SuppressEvents\r\n",
    contenttype[format],
    session->managerid, httptimeout);
/* ... */
  ast_http_send(ser, method, 200, NULL, http_header, out, 0, 0);
/* ... */

The contenttype array contains the pointers to the strings used for the HTTP response, and thus the pointer retrieved from this look-up directly influences data sent back to the HTTP user. By conducting the same style of stack pointer manipulation previously described, it is possible to align a thread's *buf pointer to overwrite the stack memory where format is stored, so it indexes beyond the contenttype array into other memory. With the help of some handy debugger scripting, I was able to find a pointer->pointer->code from a relative offset of contenttype. My code to do this with VDB is shown below. (Comments document the code a little bit, but a more extensive explanation of VDB is beyond the scope of this post):

for m in trace.getMemoryMaps():

  # check memory map name
  if m[3].lower() == "/usr/sbin/asterisk":

    #  check for flags Read & Write for data segment
    if m[2] == 6:
      addr = m[0]
      memlen = m[1]
      memory = trace.readMemory(addr, memlen)
  
    # check for Execute flag
    elif m[2] == 5:
      # save beginning and ending of executable memory
      code = m[0]
      codestop = code+m[1]

# from each offset in the memory
for offset in range(memlen-4):

  # read for the size of a pointer
  ptr = struct.unpack("<I", memory[offset:offset+4])[0]

  # check if it is a pointer
  if ispoi(ptr):
    # read the value at the pointer
    ptr = struct.unpack("<I", trace.readMemory(ptr, 4))[0]

    # is that value in the asterisk code?
    if ptr > code and ptr < codestop:
      print "[*] Found 0x%08x -> 0x%08x" % (addr+offset, ptr)

The script simply searches the memory maps of the attached process for the Asterisk data and code memory regions. Once they are found, the value at every possible offset in the data map is checked to be a valid memory address. Passing this check, the value at the memory it points to is then also checked to see if it is a pointer to code memory and then prints out valid matches. This script identified 8 locations of usable pointers when ran against Ubuntu's packaged Asterisk binary.

By overwriting the saved format variable with an index to offset to one of these pointer sequences, it is possible to manufacture a remote memory disclosure and determine an address of Asterisk code memory. Putting this all together allows for successful remote arbitrary code execution despite ASLR/PIE/NX/STACK COOKIES/ALL_OF_THE_THINGS compiled in with the Ubuntu build. To add to an already silly amount of convenience with the conditions surrounding this bug, when gaining EIP control through the method described, the next value on the stack above the overwritten return address is a pointer to the buffer passed to fgets in the second thread. This buffer is populated with the second thread's received HTTP Content-Data (remotely-controlled data). Using the memory disclosure to calculate the address of a call to the function ast_safe_system, which takes a single string pointer argument to execute as a command line, it is simple to exploit the return in the second thread to execute arbitrary commands from the Asterisk process -- which often runs as root. Using this to spawn a remote shell with Ubuntu's default dash shell is a little obnoxious, but possible, and an exercise left up to the reader.

Hope you enjoyed the post!
--
Brandon Edwards
@drraid

10 thoughts on “DoS? Then Who Was Phone?

  1. I don’t think you can say this is the fault of alloca.
    The programmer made the decision to trust user supplied data in memory manipulating operations. Even if the implementation of alloca checked the size argument in the same manner fgets does this bug, introduced by the programmer, would still result in at least a denial of service condition.

  2. Pingback: Kritische Schwachstellen in Asterisk | Edv-Sicherheitskonzepte.de – News Blog aus vielen Bereichen
  3. Pingback: Kritische Schwachstellen in Asterisk | virtualfiles.net
  4. Pingback: IT Secure Site » Blog Archive » Critical vulnerabilities in Asterisk
  5. Fantastic exploit and writeup! Thanks for taking the time to explain everything so well.

    The general approach reminds me a bit of Jon Oberheide’s half-nelson.c, which involves colliding two Linux kernel stacks.

    > I found it interesting that the code looks as though it may have been written with memory management issues in mind, as the check to ensure content_length is not zero would catch an integer overflow caused by adding one to the value.

    I think they’re just checking that content_length got set within the for loop. At any rate, content_length is a signed int, so overflow behavior is undefined — of course, many developers don’t know this.

  6. Pingback: Weekendowa Lektura | Zaufana Trzecia Strona
  7. Pingback: j00ru i Gynvael laureatami Pwnie Awards! Niestety Hakin9 te┼╝… | Zaufana Trzecia Strona

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s