VxWorks: Execute My Packets

Contributors

David Barksdale and Alex Wheeler

1. Background

Earlier this year we reported 3 vulnerabilities in VxWorks to Wind River. Each of these vulnerabilities can be exploited by anonymous remote attackers on the same network without user interaction to take control of the affected device. VxWorks is widely used in Aerospace and Defense, Automotive, Industrial, Medical, Consumer Electronics, Networking and Communication Infrastructure applications (https://en.wikipedia.org/wiki/VxWorks#Notable_uses).

2. Summary

As of this writing the flaws have not been assigned CVE numbers, they are:

  1. DHCP client heap overflow in handle_ip() affecting VxWorks 6.4 and prior
  2. DHCP server stack overflow in ipdhcps_negotiate_lease_time() affecting VxWorks 6.9 versions prior to 6.9.3.1, VxWorks 6.8, VxWorks 6.7, VxWorks 6.6, and VxWorks 6.5 and prior versions
  3. DNS client stack overflow in ipdnsc_decode_name() affecting VxWorks 7, VxWorks 6.9, VxWorks 6.8, VxWorks 6.7, VxWorks 6.6, and VxWorks 6.5

Please login to your support account on windriver.com or contact your Wind River support representative for mitigation of these issues.

3. Vulnerabilities

A. DHCP IP Address Option Client Heap Overflow

VxWorks 6.4 and prior fail to properly handle the lengths of IP addresses in DHCP Options in  handle_ip() and handle_ips().  handle_ip() contains a trivial overflow and will be the focus of this section. The flaw was initially found while auditing the network stack of IN_DISCLOSURE. Below is the disassembly describing the flaw in handle_ip() from the IN_DISCLOSURE firmware.

RAM:803D6D38 handle_ip: # DATA XREF: RAM:80F18AC8o
RAM:803D6D38            # RAM:80F18B04o ...
RAM:803D6D38
RAM:803D6D38 addiu $sp, -0x28
RAM:803D6D3C sw    $s3, 0x28+var_C($sp)
RAM:803D6D40 sw    $s2, 0x28+var_10($sp)
RAM:803D6D44 sw    $s0, 0x28+var_18($sp)
RAM:803D6D48 sw    $ra, 0x28+var_8($sp)
RAM:803D6D4C sw    $s1, 0x28+var_14($sp)
RAM:803D6D50 move  $s3, $a0
RAM:803D6D54 lb    $s1, 0($s3)
RAM:803D6D58 move  $s2, $a1
RAM:803D6D5C li    $v0, 0x36
RAM:803D6D60 beq   $s1, $v0, __copy_option__ # code == 36h
RAM:803D6D64 addiu $s0, $s2, 0x98
RAM:803D6D68 li    $v0, 0x20
RAM:803D6D6C beq   $s1, $v0, __copy_option__ # code == 20h
RAM:803D6D70 addiu $s0, $s2, 0xB8
RAM:803D6D74 li    $a0, 1 # num
RAM:803D6D78 jal   my_calloc # 4 byte buffer
RAM:803D6D7C li    $a1, 4 # size
RAM:803D6D80 move  $s0, $v0
RAM:803D6D84 beqz  $s0, __exit__ # calloc() == ERROR
RAM:803D6D88 li    $v0, 0xFFFFFFFF
<...SNIP...>
RAM:803D6E28 __copy_option__: # CODE XREF: handle_ip+28j
RAM:803D6E28 lbu   $a2, 1($s3) # len (1 BYTE FROM PACKET)
RAM:803D6E2C move  $a1, $s0 # dst (4 BYTE BUFFER)
RAM:803D6E30 jal   my_bcopy
RAM:803D6E34 addiu $a0, $s3, 2 # src (OptionPtr + 2)
RAM:803D6E38 move  $v0, $zero
RAM:803D6E3C __exit__: # CODE XREF: handle_ip+4Cj
RAM:803D6E3C lw    $ra, 0x28+var_8($sp)
RAM:803D6E40 lw    $s3, 0x28+var_C($sp)
RAM:803D6E44 lw    $s2, 0x28+var_10($sp)
RAM:803D6E48 lw    $s1, 0x28+var_14($sp)
RAM:803D6E4C lw    $s0, 0x28+var_18($sp)
RAM:803D6E50 jr    $ra
RAM:803D6E54 addiu $sp, 0x28
RAM:803D6E54 # End of function handle_ip

As described in the disassembly above, the vulnerability is caused by using a DHCP option length from the packet to copy into a 4 byte heap buffer, resulting in a heap overflow. This vulnerability can be exploited by responding to an affected device’s DHCP request with a malicious response containing a DHCP option length larger than 4 for the following DHCP option codes: 1, 16, 28, 32, and 54.

B. DHCP Option Lease Time Negotiation Server Stack Overflow

VxWorks 6.5 through VxWorks 6.9.3 fail to properly validate a lease time length when a DHCP server parses DHCP option 51 in ipdhcps_negotiate_lease_time(), which results in a stack overflow. The flaw is caused by using a DHCP IP Address Time option length from the packet to copy into a 4 byte stack buffer, resulting in a stack overflow. In either a DHCP Discovery or Request packet, the attacker simply includes an option of type 51 (the lease time option) that is larger than the expected 4 bytes. The entire contents of the option record (up to 255 bytes) will be copied into a buffer on the stack that is only 4 bytes.

C. DNS Response Decompression Stack Overflow

VxWorks 6.5 through VxWorks 7 fail to properly bound the decompression of names in ipdnsc_decode_name() which results in a stack overflow. The following is a snippet of the affected code for your review.

IP_STATIC Ip_s32
ipdnsc_decode_name(Ip_u8 *name, Ip_u8 *buf, Ip_u8 *start, Ip_u8 *end)
{
  Ip_u8 *ptr, *prev;
  Ip_s32 i, len, tot_len = 0, compress = 0;

  ptr = buf;
  while (*ptr && ptr < end)
  {
    /* Loop until we find a non-pointer */
    while ((*ptr & 0xc0) == 0xc0 && ptr < end)
    {
      prev = ptr;
      ptr = start + (IP_GET_NTOHS(ptr) & 0x3fff);
      if (ptr >= prev)
        return -1; /* Do not allow forward jumps (avoid loops) */
      if (!compress)
        tot_len += 2;
      compress = 1;
    }
    /* Store the length of the label */
    if (ptr >= end)
      return -1;
    len = *ptr++;
    if (len > IPDNSC_MAXLABEL)
      return -1;
    if (!compress) 
      tot_len = tot_len + len + 1;
    if (tot_len > IPDNSC_MAXNAME)
      return -1;
    /* Copy the label to name */
    for (i=0; i<len; i++)
    {
      if (ptr >= end)
        return -1;
      *name++ = *ptr++;
    }
    *name++ = '.';
  }

  if (!compress)/* Increment for the last zero */
    tot_len++;

  /* Null terminate the name string */
  if (tot_len)
    name--;
  *name = 0;
  return tot_len;
}

In the above code, the programmer fails to properly bound the decoded name to IPDNSC_MAXNAME when decompression is involved.  The only caller to this function, ipdnsc_parse_response(), passes the address of a 255-byte stack buffer as the output buffer name. When an attacker causes the target to process a DNS response with a name record that decompresses to larger than 255 bytes, the stack buffer will be overflowed.

4. Exploitation

Attack Vectors

All 3 vulnerabilities may be exploited by anonymous remote attackers on the same network as the target. Since the DHCP vulnerabilities are reachable over UDP and we found no TTL enforcement, in theory, an anonymous remote attacker may be able to exploit them while not on the same network by spoofing packets. Non-local network exploitation seems more plausible against the DHCP Option IP Lease Time Server Stack Overflow than the DHCP Option IP Client Heap Overflow – mainly because you need to guess the client’s 2 byte Transaction ID to trigger the client overflow (spoof, spray, and pray). The DNS Decompression Stack Overflow may be exploited by attackers that are on the same network, in control of a name server, or MITM between the target and a legit name server.

The remainder of this post discusses exploitation of the DHCP IP Option Client Heap Overflow. The stack overflows are left as an exercise for the reader.

Exploiting the Heap Overflow in handle_ip()

The DHCP client heap overflow occurs when parsing option records in the DHCP Offer packet normally sent to clients from a DHCP server during start-up and periodically afterwards. DHCP option records which correspond to IP address values (type 1, 16, 28, 32, and 54) are assumed to have a length of four bytes and the function which processes these options (named handle_ip) allocates a 4-byte buffer on the heap. However when copying the contents of the option record into the buffer, the function uses the length value in the option record for the number of bytes to copy. An attacker can provide up to 255 bytes to copy into the 4-byte heap buffer.

While we weren’t able to test all affected versions on all platforms, we were able to develop an exploit for two IP Deskphones from two different vendors both running VxWorks 5.5 on MIPS32.

In broad strokes our exploit needs to corrupt heap metadata in such a way that gives us control of execution, then it needs to flush our exploit code from the data cache to main memory (MIPS has separate data and instruction caches) so it can be executed, then it needs to jump to that code. The exploit code then needs to repair the heap and for convenience start an OS task that executes whatever payload we may have.

The heap allocator maintains a doubly-linked list of free chunks which it scans when allocating memory. The previous and next pointers are stored in the chunk header along with the size of the chunk, a flag indicating if the chunk is free or allocated, and a pointer to the previous chunk in memory.

Previous Chunk Free Chunk Next Chunk
Previous chunk pointer Chunk size and free flag Next free chunk pointer Previous free chunk pointer Data

Our exploit overwrites the previous and next pointers of a free chunk and then allocates that chunk. During allocation the free chunk is removed from the doubly-linked list, giving us the ability to write an arbitrary 4-byte value to an arbitrary location in memory. In order to get control of execution we overwrite the function pointer in the table of DHCP option handling functions for option type 48, then cause that function to be called by adding an option of that type to our DHCP Offer packet.

To accomplish this we need to arrange the heap so that our buffer is adjacent to a free heap chunk of a known size, overflow that chunk’s header, and then allocate that chunk. This turns out to be easier than it sounds. The following DHCP option list does the job in most cases:

    Code   Len
   +-----+-----+
   |  3  |  0  |
   +-----+-----+
   |  4  |  0  |
   +-----+-----+
   \\    \\    \\
   +-----+-----+
   | 11  |  0  |
   +-----+-----+
   |  1  |  0  |
   +-----+-----+---\\---+
   |  1  | 32  |  Data  |
   +-----+-----+---\\---+
   | 28  | ... |
   +-----+-----+
   | 48  | ... |
   +-----+-----+

Option codes 3-11 cause two small allocations from the heap each, this helps defragment the heap and makes it more likely that the next chunk on the free list is large enough for our next two allocations. Assuming the next free chunk is large enough, the heap allocator will split it into two smaller chunks and return the one at the end for our first option 1. When handle_ip() processes the second option 1 record, it will allocate the heap buffer (which will be before and adjacent to the one we just allocated), notice that a buffer for option 1 was already allocated and free it (adding it to the head of the free list), then write our 32 bytes of data into the buffer which overflows into the metadata of the first free chunk on the free list. Option 28 then allocates that corrupt chunk and in doing so overwrites the function pointer for handling option 48. Option 48 then calls that pointer and we have control of execution.

We will post more details about exploitation of this issue in the near future, after IN_DISCLOSURE have had a chance to publish a fix. If you have a VxWorks-based device and would like us to develop a PoC for it, please contact info@exodusintel.com with the details.

5. Detection

A. DHCP Option IP Address Client Heap Overflow

Detection of attempts to exploit this vulnerability can be accomplished by examining the length field of DHCP Option Codes 1, 16, 28, 32, and 54 for values greater than 4 in DHCP Offers.

B. DHCP Option Lease Time Server Stack Overflow

Detection of attempts to exploit this vulnerability can be accomplished by examining the length field of DHCP Option Code 51 for values greater than 4 in DHCP Discover and Request packets.

C. DNS Response Decompression Stack Overflow

Detection of attempts to exploit this vulnerability can be accomplished by examining names in DNS responses for compression that results in a decoding of a name to larger than 255 bytes.

Changing to Coordinated Disclosure

UPDATE 5/17/2016: The link for the POC for CVE-2016-1287 is live at https://github.com/exodusintel/disclosures

Last week Exodus finished disclosure on CVE-2016-1287 “Cisco ASA Software IKEv1 and IKEv2 Buffer Overflow Vulnerability” officially marking the first time that we have gone through the process of coordinated disclosure. This disclosure represents a change in our internal policies and warrants discussion regarding the particulars of the change and what it means for Exodus going forward.

Read moreChanging to Coordinated Disclosure

Execute My Packet

Contributors

David Barksdale, Jordan Gruskovnjak, and Alex Wheeler

1. Background

Cisco has issued a fix to address CVE-2016-1287. The Cisco ASA Adaptive Security Appliance is an IP router that acts as an application-aware firewall, network antivirus, intrusion prevention system, and virtual private network (VPN) server. It is advertised as “the industry’s most deployed stateful firewall.” When deployed as a VPN, the device is accessible from the Internet and provides access to a company’s internal networks.

Read moreExecute My Packet

DoS? Then Who Was Phone?

Introduction

This post presents exploitation notes on a vulnerability we discovered in Asterisk, an open source telephony solution produced by Digium. We reported this bug to Digium on November 27th, 2012, and provided it to customers of the Exodus Intelligence Feed as EIP-2012-0008. Digium released the advisory AST-2012-014 for this vulnerability on January 2nd, 2013, which was picked up shortly after by some of the aggregator sites and incorrectly categorized as a denial-of-service; however, this bug is certainly exploitable. As we found it fun to analyze, and since discussions about server-side memory bugs are a little sparse now-a-days, we thought it would be cool to share for others who might also find it interesting.

Vulnerability

The vulnerability resides in the HTTP Asterisk Management Interface (AMI) service, and is the result of an alloca being used to “allocate” memory with a remotely-supplied, untrusted size value. The vulnerability is present in the Asterisk source code file main/http.c main/http.c, specifically in the function ast_http_get_post_vars, which as the name would suggest is used to parse HTTP POST variable data. A snip of the pertinent vulnerable code in this function is shown below:

struct ast_variable *ast_http_get_post_vars(
  struct ast_tcptls_session_instance *ser, struct ast_variable *headers)
{
  int content_length = 0;
  struct ast_variable *v, *post_vars=NULL, *prev = NULL;
  char *buf, *var, *val;

  for (v = headers; v; v = v->next) {
    if (!strcasecmp(v->name, "Content-Type")) {
      if (strcasecmp(v->value, "application/x-www-form-urlencoded")) {
        return NULL;
      }
      break;
    }
  }

  for (v = headers; v; v = v->next) {
    if (!strcasecmp(v->name, "Content-Length")) {
      content_length = atoi(v->value) + 1;
      break;
    }
  }

  if (!content_length) {
    return NULL;
  }

  if (!(buf = alloca(content_length))) {
    return NULL;
  }
  if (!fgets(buf, content_length, ser->f)) {
    return NULL;
  }

The code shows the length value being converted from the Content-Length string using atoi, then incremented by one and stored in the content_length variable. Memory is obtained by alloca for the expected content length, and pointed to by *buf. Finally, fgets is called to read the expected amount of content data into this buffer. I found it interesting that the code looks as though it may have been written with memory management issues in mind, as the check to ensure content_length is not zero would catch an integer overflow caused by adding one to the value.

Below is a snip of disassembled code for the vulnerable function as compiled in the Asterisk package shipped with Ubuntu. This snip shows the size value being set and used to subtract the stack pointer (ESP) to “allocate” stack memory:

<ast_http_get_post_vars+187>:   call   <strtol@plt>
<ast_http_get_post_vars+192>:   mov    edx,eax
<ast_http_get_post_vars+194>:   add    edx,0x1
<ast_http_get_post_vars+197>:   je     <ast_http_get_post_vars+408>
<ast_http_get_post_vars+203>:   mov    ecx,DWORD PTR [ebp-0x30]
<ast_http_get_post_vars+206>:   add    eax,0x1f
<ast_http_get_post_vars+209>:   and    eax,0xfffffff0
<ast_http_get_post_vars+212>:   sub    esp,eax <----- LOL
<ast_http_get_post_vars+214>:   lea    esi,[esp+0x1b]

As shown, the alloca is compiled into a simple set of instructions to ADD and AND-off the size to be allocated from the stack. It then subtracts the revised size from the stack pointer, and stores an address derived from this into the ESI register for further use.

Exploitation Obstacles

Since most compilers implement alloca as a fairly direct subtraction of the stack pointer, the exploitation of alloca is often as simple as providing a size value large enough to wrap the stack pointer around to a desirable location higher on the stack. Subsequent use of the pointer to store remotely supplied data would then result in stack memory corruption, and allow for vanilla exploitation techniques to gain control of program execution flow.

However, here the vulnerable code uses the function fgets to read network data into the obtained memory space. This complicates the situation for exploitation as the libc implementation of fgets performs a check on its length argument to ensure that it is not beyond the signed integer boundary of 0x7FFFFFFF. If this check fails, fgets does not read data and returns an error. The code snip below shows the check performed inside of fgets as implemented in libc.6.so:

<fgets+0>:     sub    esp,0x4c
<fgets+3>:     mov    DWORD PTR [esp+0x48],ebp
<fgets+7>:     mov    ebp,DWORD PTR [esp+0x54]
<fgets+11>:    mov    DWORD PTR [esp+0x3c],ebx
<fgets+15>:    call   
<fgets+20>:    add    ebx,0x14051c
<fgets+26>:    mov    DWORD PTR [esp+0x40],esi
<fgets+30>:    mov    esi,DWORD PTR [esp+0x58]
<fgets+34>:    test   ebp,ebp
<fgets+36>:    mov    DWORD PTR [esp+0x44],edi
<fgets+36>:    mov    DWORD PTR [esp+0x44],edi
<fgets+40>:    jle    <fgets+336>
...
<fgets+336>:   mov    DWORD PTR [esp+0x50],0x0
<fgets+344>:   jmp    <fgets+256>
...
<fgets+256>:   mov    eax,DWORD PTR [esp+0x50]
<fgets+260>:   mov    ebx,DWORD PTR [esp+0x3c]
<fgets+264>:   mov    esi,DWORD PTR [esp+0x40]
<fgets+268>:   mov    edi,DWORD PTR [esp+0x44]
<fgets+272>:   mov    ebp,DWORD PTR [esp+0x48]
<fgets+276>:   add    esp,0x4c
<fgets+279>:   ret    

The EBP register, containing the length argument, is checked to be a positive signed value using the TEST and JLE instructions at <fgets+34> and <fgets+40>. If the check fails, the code jumps to return an error, making fgets unusable for exploiting a wrapped stack pointer to overwrite memory with data read from the network. While stack corruption by this means is still possible through the pushing and moving of data to the stack by other compiled code operations, the lack of control and limited set of operations make this approach undesirable.

At this point some might categorize this vulnerability as purely theoretical or possibly even unexploitable. As I hope many readers would agree, a challenge of this nature is always inviting. The Exodus team loves goading and trolling one another in these scenarios, usually with something along the lines of “Yeah, it is probably too tough for you to exploit…” or “you should probably just give up.” The recipient of this pep talk usually proceeds to cry and reevaluate the code until an idea hits them or they decide to resign to a life of PCI compliance auditing. Challenge accepted.

EIP Control

After spending some time analyzing the problem and hating computers, I found a way to exploit this vulnerability. The HTTP listener for the Asterisk Management Interface handles every new connection by creating a new thread to execute a designated worker function to process the request. The code to setup and complete this task is spread out across multiple functions and macros and is a little messy, so we’ll try to keep details to a minimum. The HTTP AMI is started initially by a call chain of functions starting with ast_http_init, which calls __ast_http_load, which then calls ast_tcptls_server_start. The function ast_tcptls_server_start performs standard TCP socket setup operations, and is defined as:

void ast_tcptls_server_start(struct ast_tcptls_session_args *desc)

Despite the name, ast_tcptls_server_start is used for both TLS and non-TLS service setup. The single argument taken by this function is a structure describing aspects of the server to be started. From __ast_http_load, the call looks like:

ast_tcptls_server_start(&http_desc);

The structure structure http_desc is defined in main/http.c as:

static struct ast_tcptls_session_args http_desc = {
  .accept_fd = -1,
  .master = AST_PTHREADT_NULL,
  .tls_cfg = NULL,
  .poll_timeout = -1,
  .name = "http server",
  .accept_fn = ast_tcptls_server_root,
  .worker_fn = httpd_helper_thread,
};

The .accept_fn is a function pointer for a function to accept the connection, and the worker_fn is a pointer to the worker function responsible for processing the request once a new thread is created. After more setup code, a new thread is created to accept socket connections by calling the function ast_tcptls_server_root. For each TCP connection accepted on the listening HTTP port (default 8088), ast_tcptls_server_root calls the following thread creation wrapper to create a new thread and eventually call the worker function:

...
if (ast_pthread_create_detached_background(&launched, NULL, handle_tcptls_connection, tcptls_session)) {
  ast_log(LOG_WARNING, "Unable to launch helper thread: %sn", strerror(errno));
   ast_tcptls_close_session_file(tcptls_session);
   ao2_ref(tcptls_session, -1);
   }
...

The function ast_pthread_create_detached_background is a macro wrapper for the function ast_pthread_create_stack. The macro definition looks roughly like:

ast_pthread_create_detached_stack(a, b, c, d, AST_BACKGROUND_STACKSIZE, ...)

The important thing to note here is the argument AST_BACKGROUND_STACKSIZE. This is used by the function to set the new thread’s stack size attribute before starting the thread:

pthread_attr_setstacksize(attr, stacksize ? stacksize : AST_STACKSIZE)
...
return pthread_create(thread, attr, start_routine, data);

For builds without low memory restrictions defined, the AST_BACKGROUND_STACKSIZE and the AST_STACKSIZE macros are defined as:

#define AST_BACKGROUND_STACKSIZE AST_STACKSIZE
#define AST_STACKSIZE (((sizeof(void *) * 8 * 8) - 16) * 1024) /* becomes 0x3C000 */

The use of AST_STACKSIZE, or 0x3C000, to set the size of the stack for each new HTTP thread is significant, as it means the stack of the newly created thread will begin at 0x3C000 below the top of the previous thread’s stack. In turn, if a value of this size or greater is used for alloca pointer subtraction, the resulting stack pointer will overlap with the stack memory of a newer thread. By carefully synchronizing the state of the threads involved so they do not collide their shared use of stack memory, it is possible to use this behavior to overwrite the contents of one thread’s stack area with network data read by another thread. To visualize this, and because I love drawing stack diagrams, I present the following bad art:

By offsetting from the higher stack by 0x3C000, the stack pointer will be at the equivalent location in the lower stack

Synchronizing the two threads such that they do not collide and clobber each other’s critical stack contents is as simple as not sending data when a given thread is expecting it. While one thread is waiting for data in a blocking read operation, the other thread may be using the stack. Using the HTTP POST method (as is required to trigger the vulnerability) allows for two separate network read operations per thread: one for the initial read of HTTP headers, and a second for reading the HTTP Content-Data. Having two individual network read operations per thread provides enough blocking opportunity to align the augmented stack pointer of the first thread to a desirable location used by the second thread. Better yet, this provides an opportunity to align the pointer of the first thread to a location that is not yet used by the second thread, but will be used once the second thread completes its initial read and resumes execution. The following diagram steps attempt to illustrate this process, ignoring trivial details and using round numbers for simplicity.

1. Two socket connections to the HTTP AMI service are established, causing Asterisk to create two threads to handle the connections. Both threads are expecting HTTP headers and so they are both blocking on a read operation. To depict the state of these threads:

two threads created, with their stacks offset by 0x3C000

2. Thread1 is sent HTTP headers with an HTTP Content-Length string equivalent to 0x3C900. Once headers are received, Thread1’s initial read operation finishes. It performs the alloca, subtracting its stack pointer by 0x3C900, which places its pointer for *buf at 0x900 bytes down from the top of Thread2’s stack:

Thread1 stack pointer now overlaps with the stack area allocated for thread2

3. Thread1 is then sent approximately 0x700 bytes of the 0x3C900 it is expecting. This advances the *buf pointer index used by fgets up the stack, closer to Thread2’s current stack pointer. Thread1 continues waiting as it has not yet received the full amount of data expected (0x3C900).

The location into *buf is advanced by 0x700, moving it up the stack

4. Thread2, still waiting on its initial network read, is sent HTTP POST headers with a Content-Length string equivalent to approximately 0x200, which it uses for its own alloca, subtracting from its stack pointer. Coordinating this length carefully places it precisely where the *buf pointer in Thread1 fgets currently points. Thread2 then calls fgets to receive its HTTP Content-Data, causing it to block while waiting to read in data.

The stack frame for the call performed by thread2 sites directly next to the current *buf index of thread1

5. Thread1 is sent 4 more bytes of the data it is waiting to receive, which is stored starting at its current *buf index in fgets, and overwrites where Thread2’s stored return address is for fgets. A return from fgets can then be triggered by sending the remaining data expected, or a newline character, or also by simply closing the connection. Once Thread2 returns, EIP is restored from the overwritten return address value and execution flow is controlled.

Clockwork

Protection Mechanisms

Precisely overwriting only desired stack contents leaves stack canaries intact so that they do not interfere with exploitation. To avoid non-executable memory protections, typical return-oriented techniques may be employed to reuse existing executable memory once execution flow is controlled. This leaves Address Space Layout Randomization (ASLR), and more specifically, Asterisk builds compiled as Position-Independent-Executables (PIE) as the only remaining obstacle to overcome, as fixed return locations cannot be used in this case.

While the default Makefile generated to compile Asterisk from source does not include flags for PIE, popular Linux distributions may package their own Asterisk builds compiled with PIE for extra security, such as with Ubuntu (props to @kees_cook for keeping us on our toes with this). ASLR via PIE significantly frustrates exploitation. Since Ubuntu is a popular distribution, and having set the bar for difficulty in this case, the Ubuntu Asterisk build is the target we challenged ourselves with.

Who Was Phone

I will save you from babble about entropy and efforts made to try and guess addresses in the presence of ASLR. Instead we will discuss how this vulnerability can be reliably exploited for memory disclosure, and used to determine the location of Asterisk code memory to redirect execution to.

The function generic_http_callback in main/manager.c is the URL handling function executed when triggering the vulnerability, and is defined as:

static int generic_http_callback(struct ast_tcptls_session_instance *ser,
             enum ast_http_method method,
             enum output_format format,
             struct sockaddr_in *remote_address, const char *uri,
             struct ast_variable *get_params,
             struct ast_variable *headers)
{

Above you can see the output_format argument format is an enumeration value for one of the possible formats used for the reply. Its expected possible values are 0, 1, or 2 for “plain”, “html”, “xml” respectively. This value is used to retrieve a pointer from a global array when constructing a response in generic_http_callback:

/* ... */
  ast_str_append(&http_header, 0,
    "Content-type: text/%srn"
    "Cache-Control: no-cache;rn"
    "Set-Cookie: mansession_id="%08x"; Version=1; Max-Age=%drn"
    "Pragma: SuppressEventsrn",
    contenttype[format],
    session->managerid, httptimeout);
/* ... */
  ast_http_send(ser, method, 200, NULL, http_header, out, 0, 0);
/* ... */

The contenttype array contains the pointers to the strings used for the HTTP response, and thus the pointer retrieved from this look-up directly influences data sent back to the HTTP user. By conducting the same style of stack pointer manipulation previously described, it is possible to align a thread’s *buf pointer to overwrite the stack memory where format is stored, so it indexes beyond the contenttype array into other memory. With the help of some handy debugger scripting, I was able to find a pointer->pointer->code from a relative offset of contenttype. My code to do this with VDB is shown below. (Comments document the code a little bit, but a more extensive explanation of VDB is beyond the scope of this post):

for m in trace.getMemoryMaps():

  # check memory map name
  if m[3].lower() == "/usr/sbin/asterisk":

    #  check for flags Read & Write for data segment
    if m[2] == 6:
      addr = m[0]
      memlen = m[1]
      memory = trace.readMemory(addr, memlen)
  
    # check for Execute flag
    elif m[2] == 5:
      # save beginning and ending of executable memory
      code = m[0]
      codestop = code+m[1]

# from each offset in the memory
for offset in range(memlen-4):

  # read for the size of a pointer
  ptr = struct.unpack("<I", memory[offset:offset+4])[0]

  # check if it is a pointer
  if ispoi(ptr):
    # read the value at the pointer
    ptr = struct.unpack("<I", trace.readMemory(ptr, 4))[0]

    # is that value in the asterisk code?
    if ptr > code and ptr < codestop:
      print "[*] Found 0x%08x -> 0x%08x" % (addr+offset, ptr)

The script simply searches the memory maps of the attached process for the Asterisk data and code memory regions. Once they are found, the value at every possible offset in the data map is checked to be a valid memory address. Passing this check, the value at the memory it points to is then also checked to see if it is a pointer to code memory and then prints out valid matches. This script identified 8 locations of usable pointers when ran against Ubuntu’s packaged Asterisk binary.

By overwriting the saved format variable with an index to offset to one of these pointer sequences, it is possible to manufacture a remote memory disclosure and determine an address of Asterisk code memory. Putting this all together allows for successful remote arbitrary code execution despite ASLR/PIE/NX/STACK COOKIES/ALL_OF_THE_THINGS compiled in with the Ubuntu build. To add to an already silly amount of convenience with the conditions surrounding this bug, when gaining EIP control through the method described, the next value on the stack above the overwritten return address is a pointer to the buffer passed to fgets in the second thread. This buffer is populated with the second thread’s received HTTP Content-Data (remotely-controlled data). Using the memory disclosure to calculate the address of a call to the function ast_safe_system, which takes a single string pointer argument to execute as a command line, it is simple to exploit the return in the second thread to execute arbitrary commands from the Asterisk process — which often runs as root. Using this to spawn a remote shell with Ubuntu’s default dash shell is a little obnoxious, but possible, and an exercise left up to the reader.

Hope you enjoyed the post!

Brandon Edwards
@drraid