Post

0xL4ugh Writeup: 'nano'

A writeup for a reversing challenge from the 0xL4ugh CTF involving assembly level obfuscation and cross process fault handling.

0xL4ugh Writeup: 'nano'

This is my solution to the “nano” crackme which was created by Stoopid for 0xL4ugh CTF 2024. You can find it on crackmes.one.

Step 1: Just run it

After just running ./nano we are greeted with a nice usage text:

1
2
> ./nano
usage: ./nano <flag>

Ok so then lets just try giving it some random string to see how it behaves:

1
2
> ./nano test
no

Since there is not much more here to see, let’s load it into Ghidra.

Step 2: A first look in Ghidra

Loading it into Ghidra we see a large main function. Ignoring a bunch of unreadable code we are left with this:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
int main(int argc, char** argv, char** envp, char param_4) {
    // declarations and initializations

    if (argc != 2) {
        printf("usage: %s <flag>\n", argv[0]);
        exit(1);
    }
    pid = fork();
    if (pid == 0) {
        // a bunch of unreadable code

        if (check(argv[1]) == 0) {
            puts("yes");
        } else {
            puts("no");
        }
        exit(0);
    }
    // a bunch of unreadable code
}

This seems to suggest that we only need to look at the check function to see how the flag is validated, so let’s do that:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
undefined4 check(char *param_1) {
    size_t sVar1;
    byte local_1d;
    undefined4 local_1c;
    
    sVar1 = strlen(param_1);
    local_1c = 0;
    local_1d = 0;
    while( true ) {
        if (0x23 < local_1d) {
            return local_1c;
        }
        if ((int)sVar1 < (int)(uint)local_1d) break;
        if ((char)(KEY[(int)(uint)local_1d] ^ param_1[local_1d]) != flag[(int)(uint)local_1d]) {
            local_1c = 1;
        }
        local_1d = local_1d + 1;
    }
    return 1;
}

Ok not very readable, let’s clean that up a little bit:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
int check(char *input) {
    size_t length;
    byte i;
    int wrong;
    
    length = strlen(input);
    wrong = false;
    for (int i = 0; i <= 35; i++) {
        if (i > length) return 1;
        if ((char)(KEY[i] ^ input[i]) != flag[i]) {
          wrong = true;
        }
    }
    return wrong;
}

Ok so it seems like the input is verified by xor encrypting it with KEY and comparing the result to the xor encrypted flag. Since xor encryption and decryption are equivalent, we can just take the cipher text in flag and decrypt it using KEY. And now we have our flag:

watch : https://youtu.be/dQw4w9WgXcQ

Hmm suspicicous you should take a look at that video, I will try validating it with the binary:

1
2
> ./nano 'watch : https://youtu.be/dQw4w9WgXcQ'
no

Ok that was to be expected, let’s keep investigating.

Step 3: GDB

To get a better idea of what happens in the check function, let’s load it into gdb (with pwndbg) and see what happens exactly happens there:

1
2
3
4
5
> gdb ./nano
> set follow-fork-mode child  # set gdb to follow the child process on a fork() call
> start test
> b check                     # set a breakpoint at check()
> continue

Now gdb should stop execution at the beginning of a call to check. Stepping through the function one instruction at a time, we find the following instruction which is executed right before the segfault:

mov r11, qword ptr [0]

Ok that is weird, there is no circumstance under which this shouldn’t lead to a segfault, yet when we executed the binary without gdb and with the same input, no segfault occured.

It seems we will have to look at all of that unreadable code in main afterall.

Step 4: Deobfuscating main

Looking at the main function more closely, the first thing I notice is these weird if statements all over the place:

1
2
3
if ((!bVar11) && (bVar11)) {
    // more code
}

That looks very suspicious, so let’s look at the assembly code that leads to this decompilation error:

addressbytesinstruction
001012d289 45 f8MOV dword ptr [RBP + local_10], EAX
001012d583 7d f8 00CMP dword ptr [RBP + local_10], 0x0
001012d975 6cJNZ LAB_00101347
001012db74 03JZ 0x001012e0
001012dd75 01JNZ 0x001012e0
001012dfe8 48 c7 c7 00CALL SUB_00d7da2c
001012e400 00ADD byte ptr [RAX],AL
001012e600 48 8bADD byte ptr [RAX + -0x75],param_4

Looking at these instructions, we can see that there are three jump instructions after another. The latter two jump to the same address and each tests the exact negation of the other’s condition. Looking at the address they jump to, ghidra seems to think it is in the middle of an instruction?

This is because Ghidra assumes that the third conditional jump may not trigger a jump and therefore it concludes that the first byte after it must be the first byte of an instruction.

Since we know that it will always jump, we can easily remove this type of obfuscation by replacing the five involved bytes (two bytes per jump and one garbage byte that is skipped by the jumps) with two NOP instructions, as you can see here:

addressbytesinstruction
001012d289 45 f8MOV dword ptr [RBP + local_10], EAX
001012d583 7d f8 00CMP dword ptr [RBP + local_10], 0x0
001012d975 6cJNZ LAB_00101347
001012db66 48 90NOP
001012de48 90NOP
001012e048 c7 c7 00 00 00 00MOV argc,0x0
001012e748 8b ......

Having fixed one of these is great, but it would be nice to find and fix all of them and be done with this obfuscation. Luckily the jumps are relative, meaning it is likely the obfuscator used the same bytes everytime.

Searching the program memory for the four bytes that make up the jump instructions (74 03 75 01), we can indeed find more occurences of this type of obfuscation. After patching all of these out, the decompiler output already looks a LOT better:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
int main(int argc,char **argv,char **envp,int param_4) {
    int result;
    undefined4 in_register_0000000c;
    undefined auStack_108 [24];
    ulong uStack_f0;
    long lStack_88;
    uint local_24;
    undefined *puStack_20;
    byte local_11;
    __pid_t pid;
    uint uStack_c;
    
    uStack_c = 0;
    if (argc != 2) {
        printf("usage: %s <flag>\n",*argv,envp,CONCAT44(in_register_0000000c,param_4));
        exit(1);
    }
    pid = fork();
    if (pid != 0) {
        func_00101189(0x10,CONCAT44(uStack_c,pid),0,0);
        while (waitpid(pid,(int *)&local_24,0), (local_24 & 0x7f) != 0) {
            if (local_24 != 0xffff) {
                if (((local_24 & 0xff) == 0x7f) && ((local_24 & 0xff00) == 0xb00)) {
                    uStack_c = uStack_c + 1;
                    local_11 = ((char)uStack_c * '\b' ^ 0xcaU | (byte)((int)uStack_c >> 5)) ^ 0xfe;
                    puStack_20 = auStack_108;
                    func_00101189(0xc,CONCAT44(uStack_c,pid),0,puStack_20);
                    uStack_f0 = (ulong)local_11 | 0x7ffc9286a800;
                    lStack_88 = lStack_88 + 8;
                    puStack_20 = auStack_108;
                    func_00101189(0xd,CONCAT44(uStack_c,pid),0,puStack_20);
                }
                func_00101189(7,CONCAT44(uStack_c,pid),0,0);
            }
        }
        return 0;
    }
    func_00101189(0,(ulong)uStack_c << 0x20,0,0);
    result = check(argv[1]);
    if (result == 0) {
        puts("yes");
    } else {
        puts("no");
    }
    exit(0);
}

Since we are modifying the binary, we will have to make sure to validate the flag with the original unmodifed file.

Now that the cpde isn’t wrong anymore, it is time to understand what it does.

Step 5: Understanding the behaviour of the parent process

As we can see the main function branches into two parts after a fork of the process, meaning the first part is executed by the parent process, while the second part is executed by the child process.

Looking at the parent process code we see a bunch of calls to func_00101189, so let’s take a look at that function.

The decompiler output is relatively useless here, but the disassembly speaks volumes:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
MOV  qword ptr [RBP + local_20],RDI # unnecessary
MOV  qword ptr [RBP + local_28],RSI # unnecessary
MOV  qword ptr [RBP + local_30],RDX # unnecessary
MOV  qword ptr [RBP + local_38],RCX # unnecessary
MOV  RAX,0x4f
MOV  RDI,qword ptr [RBP + local_20] # unnecessary
MOV  RSI,qword ptr [RBP + local_28] # unnecessary
MOV  RDX,qword ptr [RBP + local_30] # unnecessary
NOP                                 # result of our deobfuscation
NOP                                 # result of our deobfuscation
XOR  RAX,0x2a
MOV  R10,qword ptr [RBP + local_38]
SYSCALL
MOV  RAX,RAX                        # unnecessary
MOV  qword ptr [RBP + local_10],RAX # unnecessary
MOV  RAX,qword ptr [RBP + local_10] # unnecessary

I have marked a bunch of instructions in there that are unneccessary to understanding the function. And with those marked, the purpose of the method becomes relatively obvious: It makes a syscall and returns the result. Looking up what syscall the id 0x2a belongs to, we can now update the functions signature with the following result:

1
2
3
long ptrace(long request,long pid,void *addr,void *data) {
    return syscall();
}

Knowing the purpose of this function and having corrected its signature, we can go back and take another look at main. Looking at it now, the next thing we need to find out is what ptrace requests are actually happening there. Luckily Ghidra has a really nice feature called “Equate” using this we get a list of C macros with the same value. Searching those for “ptrace” we can quickly identify all the names for them:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
int main(int argc,char **argv) {
    int result;
    undefined auStack_108 [24];
    ulong uStack_f0;
    long lStack_88;
    int wstatus;
    undefined *puStack_20;
    byte local_11;
    __pid_t pid;
    uint uStack_c;
    
    uStack_c = 0;
    if (argc != 2) {
        printf("usage: %s <flag>\n",*argv);
        exit(1);
    }
    pid = fork();
    if (pid != 0) {
        ptrace(PTRACE_ATTACH,CONCAT44(uStack_c,pid),(void *)0x0,(void *)0x0);
        while (waitpid(pid,(int *)&wstatus,0), (wstatus & 0x7f) != 0) {
            if (wstatus != 0xffff) {
                if (((wstatus & 0xff) == 0x7f) && ((wstatus & 0xff00) == 0xb00)) {
                    uStack_c = uStack_c + 1;
                    local_11 = ((char)uStack_c * '\b' ^ 0xcaU | (byte)((int)uStack_c >> 5)) ^ 0xfe;
                    puStack_20 = auStack_108;
                    ptrace(PTRACE_GETREGS,CONCAT44(uStack_c,pid),(void *)0x0,puStack_20);
                    uStack_f0 = (ulong)local_11 | 0x7ffc9286a800;
                    lStack_88 = lStack_88 + 8;
                    puStack_20 = auStack_108;
                    ptrace(PTRACE_SETREGS,CONCAT44(uStack_c,pid),(void *)0x0,puStack_20);
                }
                ptrace(PTRACE_CONT,CONCAT44(uStack_c,pid),(void *)0x0,(void *)0x0);
            }
        }
        return 0;
    }
    ptrace(PTRACE_TRACEME,(ulong)uStack_c << 0x20,(void *)0x0,(void *)0x0);
    result = check(argv[1]);
    if (result == 0) {
        puts("yes");
    } else {
        puts("no");
    }
    exit(0);
}

If Ghidra doesn’t let you set an Equate, try to set it on the value in the Assembly listing and re-create the function.

Looking at this now, it appears that the parent process waits for status changes in the child and then, if some conditions apply, modifies the register contents before continuing the child process.

Looking at the man page for waitpid (man 2 waitpid) tells us that the wstatus result is interpreted via macros which can be found in the sys/wait.h file. Considering the definitions in that header file and its dependencies, this is what the original C code most likely looked like:

1
2
3
4
5
6
7
8
9
ptrace(PTRACE_ATTACH,pid,(void *)0x0,(void *)0x0);
while (waitpid(pid, &wstatus, 0), WTERMSIG(wstatus) != 0) {
    if (!WIFCONTINUED(wstatus)) {
        if (WIFSTOPPED(wstatus) && WSTOPSIG(wstatus) == SIGSEGV) {
            // register modification code
        }
        ptrace(PTRACE_CONT,pid,(void *)0x0,(void *)0x0)
    }
}

The SIGSEGV macro is included via signal.h.

Looking at this the general behaviour of the parent process should be easy to understand: It waits until the child process is stopped, modifies its registers only if it was stopped because of a segfault (remember that weird instruction that always triggers a segfault? That seems to trigger this handler) and continues its execution again.

We can also take a look at the man page for ptrace (man 2 ptrace) and update some types to end up with the following:

1
2
3
4
5
6
segfault_cnt++;
override = ((char)segfault_cnt * '\b' ^ 0xcaU | (byte)((int)segfault_cnt >> 5)) ^ 0xfe;
ptrace(PTRACE_GETREGS,pid,(void *)0x0,&user_regs);
user_regs.r12 = (ulong)override | 0x7ffc9286a800;
user_regs.rip = user_regs.rip + 8;
ptrace(PTRACE_SETREGS,pid,(void *)0x0,&user_regs);

So from this it seems like on every segfault the R12 register of the child process is set to a new value and the Program Counter / Return Instuction Pointer Register is increased by 8, mostly likely to skip an instruction to prevent the child process from segfaulting again.

Step 6: Understanding the check function

Now let’s try to understand what that those register modifications are doing by looking closer at the check function where the segfault is caused:

addressbytesinstruction / comment
  Load the value at KEY[i] into R12D
001012060f b6 45 ebMOVZX length,byte ptr [RBP + i]
0010120a48 98CDQE
0010120c48 8d 15 8d 2e 00 00LEA RDX,[KEY]
001012130f b6 04 10MOVZX length,byte ptr [length + RDX]=>KEY
001012170f b6 c0MOVZX length,length
0010121a41 89 c4MOV R12D,length
  Trigger segfault
0010121d4c 8b 1c 25 00 00 00 00MOV R11,qword ptr [DAT_00000000]
  Load the value at flag[i] into local_25
001012250f b6 45 ebMOVZX length,byte ptr [RBP + i]
0010122948 98CDQE
0010122b48 8d 15 2e 2e 00 00LEA RDX,[flag]
001012320f b6 04 10MOVZX length,byte ptr [length + RDX]=>flag
0010123688 45 e3MOV byte ptr [RBP + local_25],length
  Load the value at input2[i] into EDX
001012390f b6 55 ebMOVZX EDX,byte ptr [RBP + i]
0010123d48 8b 45 d8MOV length,qword ptr [RBP + input2]
0010124148 01 d0ADD length,RDX
001012440f b6 00MOVZX length,byte ptr [length]
0010124789 c2MOV EDX,length
  XOR the input and KEY values
0010124944 89 e0MOV length,R12D
0010124c31 d0XOR length,EDX
0010124e88 45 e2MOV byte ptr [RBP + local_26],length
001012510f b6 45 e2MOVZX length,byte ptr [RBP + local_26]
  Compare XOR result to flag value and jump
001012553a 45 e3CMP length,byte ptr [RBP + local_25]
0010125874 07JZ LAB_00101261

As we can see, the Segfault instruction is exactly eight bytes long, which lines up nicely with the eight byte increase of the program counter in the segfault handler, which therefore exactly skips that instruction.

We can also see that the byte loaded from KEY is loaded into R12D meaning that it is overwritten by the segfault handler which is triggered by the instruction at 0010121d.

In conclusion this means the check iterates over the input, xors it with the characters generated by the segfault handler and compares the result to the hardcoded encrypted flag.

From here we can extract the flag by extracting the generated key bytes and then using those the same way we did in the naive approach to the check function.

Step 7: Getting the Flag

To extract the key bytes we can simply put a breakpoint where the R12 register is overwritten and take the value from the EAX register when the breakpoint hits.

(Make sure to start it with input that is long enough, otherwise the loop in check will end early)

Doing that we get a list of values like these:

1
2
3
4
5
6
7
8
9
0x7ffc9286a83c
0x7ffc9286a824
0x7ffc9286a82c
0x7ffc9286a814
...
0x7ffc9286a83d
0x7ffc9286a825
0x7ffc9286a82d
0x7ffc9286a815

Since these are not one byte each we need to know which byte is xored with the input. Looking at the instructions from the check function, we can see that the last byte of the xored value is moved onto the stack and back. Therefore we know that only the last byte of those values is relevant. If we truncate the extracted values accordingly, we get a list of of values like these:

1
2
3
4
5
6
7
8
9
0x3c
0x24
0x2c
0x14
...
0x3d
0x25
0x2d
0x15

Xoring these with the encrypted flag, we get the unencrypted flag:

0xL4ugh{3z_n4n0mites_t0_g3t_st4rt3d}

Which we can validate on the unpatched binary like so:

1
2
> ./nano 0xL4ugh{3z_n4n0mites_t0_g3t_st4rt3d}
yes
This post is licensed under CC BY 4.0 by the author.