At BlockHarbor, we find it extremely valuable to “sharpen the saw” by competing in Capture The Flag competitions. This helps us stay up to date on new tools, techniques, and procedures relating to the work we do every day. One such event was the annual online HackTheBox Business CTF for 2024. From this event, we are going to take a look at one reverse engineering challenge, satellitehijack.
Challenge Introduction
Satellitehijack was rated as a hard challenge, with 90 of 943 teams successfully solving it. It was of particular interest because the program modifies itself and hooks one of its own functions at runtime. This is a scheme that malware and code protection software both employ to obfuscate the real behavior of a program.
As part of the challenge, we are given a zip file containing a binary `satellite` and a shared library `library.so`. Let’s load the given files into Ghidra and find out what they are doing.
Static Analysis of the Binary
Checking out the `main` function for `satellite`, we see the following (cleaned up) code:
void main(void)
{
long lVar1;
undefined8 *puVar2;
byte bVar3;
undefined8 uStack_420;
undefined8 local_418;
undefined8 local_410;
undefined8 local_408 [127];
ssize_t local_10;
bVar3 = 0;
setbuf(stdout,(char *)0x0);
puts(banner);
send_satellite_message(0,"START");
local_418 = 0;
local_410 = 0;
puVar2 = local_408;
for (lVar1 = 0x7e; lVar1 != 0; lVar1 = lVar1 + -1) {
*puVar2 = 0;
puVar2 = puVar2 + (ulong)bVar3 * -2 + 1;
}
do {
while( true ) {
putchar(0x3e);
putchar(0x20);
local_10 = read(1,&local_418,0x400);
if (-1 < local_10) break;
puts("ERROR READING DATA");
}
if (0 < local_10) {
*(undefined *)((long)&uStack_420 + local_10 + 7) = 0;
}
printf("Sending `%s`\n",&local_418);
send_satellite_message(0,&local_418);
} while( true );
}
The main function loops forever, reading up to 0x400 bytes of input on each pass and then calling send_satellite_message with a pointer to that input as an argument.
Of interest is the external symbol send_satellite_message, which we can guess is defined in the shared library. Let’s take a look at that next.
Shared Library Analysis
code * send_satellite_message(void)
{
long lVar1;
char *pcVar2;
long in_FS_OFFSET;
uint i;
char local_28 [24];
lVar1 = *(long *)(in_FS_OFFSET + 0x28);
local_28[0] = 'T';
local_28[1] = 'B';
local_28[2] = 'U';
local_28[3] = '`';
local_28[4] = 'Q';
local_28[5] = 'S';
local_28[6] = 'P';
local_28[7] = 'E';
local_28[8] = '`';
local_28[9] = 'F';
local_28[10] = 'O';
local_28[11] = 'W';
local_28[12] = 'J';
local_28[13] = 'S';
local_28[14] = 'P';
local_28[15] = 'O';
local_28[16] = 'N';
local_28[17] = 'F';
local_28[18] = 'O';
local_28[19] = 'U';
local_28[20] = '\0';
for (i = 0; i < 0x14; i = i + 1) {
local_28[(int)i] = local_28[(int)i] + -1;
}
pcVar2 = getenv(local_28);
if (pcVar2 != (char *)0x0) {
FUN_001023e3();
}
if (lVar1 != *(long *)(in_FS_OFFSET + 0x28)) {
/* WARNING: Subroutine does not return */
__stack_chk_fail();
}
return FUN_001024db;
}
It dynamically builds a string on the stack and subtracts one from each of its first 0x14 bytes. Let’s take this into Python quickly and find out what this string is.
In [1]: string = ''' local_28[0] = 'T';
...: local_28[1] = 'B';
...: local_28[2] = 'U';
...: local_28[3] = '`';
...: local_28[4] = 'Q';
...: local_28[5] = 'S';
...: local_28[6] = 'P';
...: local_28[7] = 'E';
...: local_28[8] = '`';
...: local_28[9] = 'F';
...: local_28[10] = 'O';
...: local_28[11] = 'W';
...: local_28[12] = 'J';
...: local_28[13] = 'S';
...: local_28[14] = 'P';
...: local_28[15] = 'O';
...: local_28[16] = 'N';
...: local_28[17] = 'F';
...: local_28[18] = 'O';
...: local_28[19] = 'U';
...: local_28[20] = '\0';'''
In [2]: for c in string.split('\n')[:-1]:  # skip the '\0' terminator, which the C loop never touches
...:     print(chr(ord(c.split("'")[1])-1),end='')
...:
SAT_PROD_ENVIRONMENT
After decoding the string, we see that send_satellite_message calls getenv on the environment variable SAT_PROD_ENVIRONMENT and only goes on to call FUN_001023e3 if the result is not NULL, i.e. if the variable is set.
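If you prefer to work from the raw bytes rather than the pasted decompiler text, the same decode can be sketched as below. The names `ENCODED` and `decode_stack_string` are made up for this illustration; the bytes are simply the 0x14 characters from the listing above, without the terminator.

```python
# The 0x14 encoded characters copied from the decompiled stack string
# (the trailing NUL is left out, since the C loop never modifies it).
ENCODED = b"TBU`QSPE`FOWJSPONFOU"


def decode_stack_string(encoded: bytes, shift: int = 1) -> str:
    """Subtract `shift` from every byte, mirroring the loop in send_satellite_message."""
    return bytes((b - shift) & 0xFF for b in encoded).decode("ascii")


print(decode_stack_string(ENCODED))
```

Keeping the shift as a parameter makes the helper reusable for other "add N to every byte" stack strings, which are a common sight in challenges like this one.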
Let’s move on to the function it calls next, FUN_001023e3.
void FUN_001023e3(void)
{
ulong uVar1;
void **ppvVar2;
void *__dest;
uVar1 = getauxval(3);
ppvVar2 = (void **)FUN_001021a9(uVar1 & 0xfffffffffffff000,&DAT_00103000);
__dest = mmap((void *)0x0,0x2000,7,0x22,-1,0);
memcpy(__dest,&DAT_001011a9,0x1000);
memfrob(__dest,0x1000);
*ppvVar2 = __dest;
return;
}
In this function, we see a call to getauxval(3). Let’s identify what this is retrieving. In the man page for getauxval we read that “getauxval – retrieve a value from the auxiliary vector”, but what is the “auxiliary vector”? Reading further, it is described as:
> The getauxval() function retrieves values from the auxiliary vector, a mechanism that the kernel’s ELF binary loader uses to pass certain information to user space when a program is executed.
Understanding the Auxval
Various useful pieces of information about the current program are located in the auxiliary vector. Note that the argument to getauxval is an entry type, not a position, so we are retrieving the entry whose type is 3. Let’s find out what type 3 is defined as.
Grepping inside the Linux include files (/usr/include on my system) for any of the AT_* values in the manual page leads us to the elf.h file, which has definitions for these. Of note is #define AT_PHDR 3 /* Program headers for program */. The program masks the AT_PHDR auxval with 0xfffffffffffff000, which rounds it down to a page boundary. Because the program headers normally sit in the first page of the image, right after the ELF header, this yields the base address of our binary in memory. That base address is passed, along with the symbol name to look up, to FUN_001021a9, which I labeled parseElfHeader.
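The masking step is plain arithmetic, which a few lines of Python can illustrate. The address used here is a hypothetical AT_PHDR value chosen for the example, not one taken from the challenge, and `image_base_from_phdr` is a name invented for this sketch.

```python
AT_PHDR = 3                     # from elf.h: address of the program headers
PAGE_MASK = 0xFFFFFFFFFFFFF000  # clears the low 12 bits (4 KiB pages)


def image_base_from_phdr(phdr_addr: int) -> int:
    """Round the program header address down to the start of its page."""
    return phdr_addr & PAGE_MASK


# Hypothetical value: a 64-bit ELF header is 0x40 bytes long, and the
# program headers usually follow it directly, at base + 0x40.
example_phdr = 0x555555554040
print(hex(image_base_from_phdr(example_phdr)))
```

This trick only works when the program headers live in the first page of the mapped image, which is the usual layout but not something the ELF format guarantees.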
char * parseElfHeader(Elf64_Ehdr *param_1,char *param_2)
{
int iVar1;
int i;
int read_offset;
int k;
Elf64_Sym *symtab;
Elf64_Rela *jmprel;
char *strtab;
Elf64_Dyn *j;
symtab = (Elf64_Sym *)0x0;
jmprel = (Elf64_Rela *)0x0;
strtab = (char *)0x0;
/* for all Phdr entries */
for (i = 0; i < (int)(uint)param_1->e_phnum; i = i + 1) {
/* look for PT_DYNAMIC (2) */
if (*(int *)(param_1->e_ident_magic_str + (long)i * 0x38 + param_1->e_phoff + -1) == 2) {
/* for all Dyn entries */
for (j = (Elf64_Dyn *)
(param_1->e_ident_magic_str +
*(long *)(param_1->e_ident_pad + (long)i * 0x38 + param_1->e_phoff + -1) + -1);
j->d_tag != DT_NULL; j = j + 1) {
if (j->d_tag == DT_SYMTAB) {
symtab = (Elf64_Sym *)(param_1->e_ident_magic_str + (j->d_val - 1));
}
else if (j->d_tag == DT_STRTAB) {
strtab = param_1->e_ident_magic_str + (j->d_val - 1);
}
else if (j->d_tag == DT_JMPREL) {
jmprel = (Elf64_Rela *)(param_1->e_ident_magic_str + (j->d_val - 1));
}
}
}
}
if (((symtab != (Elf64_Sym *)0x0) && (strtab != (char *)0x0)) && (jmprel != (Elf64_Rela *)0x0)) {
read_offset = -1;
for (k = 0; symtab + k < strtab; k = k + 1) {
if ((symtab[k].st_name != 0) &&
(iVar1 = strcmp(strtab + symtab[k].st_name,param_2), iVar1 == 0)) {
read_offset = k;
break;
}
}
if (-1 < read_offset) {
for (; jmprel->r_offset != 0; jmprel = jmprel + 1) {
if (jmprel->r_info >> 0x20 == (long)read_offset) {
return param_1->e_ident_magic_str + (jmprel->r_offset - 1);
}
}
}
}
return (char *)0x0;
}
Let’s dissect this function and see what values it extracts from the ELF image. The first loop iterates over the e_phnum program headers, each of type Elf64_Phdr, looking for the one whose type is PT_DYNAMIC. That segment holds the dynamic section of our binary, which carries the information needed at runtime, e.g. for resolving symbols. In this section, we iterate over each Elf64_Dyn entry, looking for three tags: DT_SYMTAB, DT_STRTAB, and DT_JMPREL. DT_SYMTAB points to the dynamic symbol table, which has an entry for each runtime symbol, such as read. DT_STRTAB points to the string table where the symbol names are actually stored. DT_JMPREL points to the relocation entries associated with the Procedure Linkage Table. Each of those entries records, in r_offset, the location of the pointer that the PLT stub jumps through (the function’s Global Offset Table slot) and, in the upper 32 bits of r_info, the index of the symbol it belongs to. With all three tables located, the function walks the symbol table, comparing each symbol’s name in the string table against param_2 (“read”). If it finds a match, it searches the relocations for the entry with that symbol index and returns the base address plus r_offset, i.e. the address of the pointer that calls to read go through.
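The lookup described above can be sketched in Python over raw table bytes. This is a rough illustration under stated assumptions, not a reimplementation of the library: the tables below are tiny synthetic ones built with struct, the helper names are invented, and the symbol table is bounded by its byte length rather than by the strtab address as the decompiled loop does.

```python
import struct

DT_NULL, DT_STRTAB, DT_SYMTAB, DT_JMPREL = 0, 5, 6, 23  # tag values from elf.h
SYM_SIZE = 24    # sizeof(Elf64_Sym)
RELA_SIZE = 24   # sizeof(Elf64_Rela)
R_X86_64_JUMP_SLOT = 7


def parse_dynamic(dyn: bytes) -> dict:
    """Collect d_val for the three tags of interest from raw Elf64_Dyn entries."""
    found = {}
    for off in range(0, len(dyn), 16):
        d_tag, d_val = struct.unpack_from("<qQ", dyn, off)
        if d_tag == DT_NULL:
            break
        if d_tag in (DT_STRTAB, DT_SYMTAB, DT_JMPREL):
            found[d_tag] = d_val
    return found


def symbol_index(symtab: bytes, strtab: bytes, name: bytes) -> int:
    """Return the index of the symbol called `name`, or -1 if it is absent."""
    for idx in range(len(symtab) // SYM_SIZE):
        (st_name,) = struct.unpack_from("<I", symtab, idx * SYM_SIZE)
        if st_name == 0:
            continue  # unnamed symbol, skipped just like in the decompilation
        end = strtab.index(b"\0", st_name)
        if strtab[st_name:end] == name:
            return idx
    return -1


def got_slot_offset(jmprel: bytes, sym_idx: int):
    """Return r_offset of the PLT relocation whose symbol index matches."""
    for off in range(0, len(jmprel), RELA_SIZE):
        r_offset, r_info, _addend = struct.unpack_from("<QQq", jmprel, off)
        if r_info >> 32 == sym_idx:
            return r_offset
    return None


# Synthetic tables: a null symbol, then "puts" and "read" (all values made up).
strtab = b"\0puts\0read\0"
symtab = b"".join(struct.pack("<IBBHQQ", n, 0x12 if n else 0, 0, 0, 0, 0)
                  for n in (0, 1, 6))
jmprel = b"".join(struct.pack("<QQq", slot, (i << 32) | R_X86_64_JUMP_SLOT, 0)
                  for i, slot in ((1, 0x4018), (2, 0x4020)))
dyn = b"".join(struct.pack("<qQ", t, v) for t, v in
               ((DT_STRTAB, 0x500), (DT_SYMTAB, 0x3C0), (DT_JMPREL, 0x600), (DT_NULL, 0)))

idx = symbol_index(symtab, strtab, b"read")
print(parse_dynamic(dyn), idx, hex(got_slot_offset(jmprel, idx)))
```

In the real library the three tables are read from the mapped image at base plus the d_val found in the dynamic section, and the returned slot address is base plus r_offset.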
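Back in FUN_001023e3, the program maps 0x2000 bytes of anonymous memory with mmap (protection 7, i.e. readable, writable, and executable), uses memcpy to copy 0x1000 bytes from 0x001011a9 into it, and calls memfrob on the copy. According to the manual page…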
> The memfrob() function obfuscates the first n bytes of the memory areas by exclusive-ORing each character with the number 42. The effect can be reversed by using memfrob() on the obfuscated memory area.
Finally, it overwrites the pointer returned by parseElfHeader, the slot that PLT calls to read jump through, with the address of the newly allocated memory. So, the library is setting up a custom hook of the read function, which will alter the intended program execution flow.
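The memfrob step is easy to reproduce outside the program. Below is a minimal sketch of the same XOR in Python; the sample bytes are just a common x86-64 function prologue picked for illustration, not data taken from library.so.

```python
def memfrob(data: bytes) -> bytes:
    """XOR every byte with 42 (0x2a), as the memfrob() man page describes."""
    return bytes(b ^ 42 for b in data)


# Illustrative input: push rbp; mov rbp, rsp
sample = b"\x55\x48\x89\xe5"
obfuscated = memfrob(sample)

# Applying memfrob twice gives back the original bytes.
assert memfrob(obfuscated) == sample
print(obfuscated.hex())
```

Because the operation is its own inverse, running this over the 0x1000 bytes that Ghidra shows at 0x001011a9 should recover the hidden code, assuming you have first worked out where those bytes sit in the file on disk.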
Let’s investigate the new function.
Investigating the Hook
To undo the memfrob on the memory at 0x001011a9 inside Ghidra, we can use the provided XorMemory.java script. Selecting the entire section and running the script with a key of 0x2a (42, the value memfrob uses), we see a new function…
ulong hooked_read(int param_1,long param_2,long param_3)
{
int iVar1;
ulong uVar2;
int *i;
uVar2 = wrapped_read();
if (((param_1 == 1) && (-1 < (long)uVar2)) && (4 < uVar2)) {
i = (int *)(param_2 + 4);
do {
if ((i[-1] == 0x7b425448) &&
(iVar1 = FUN_00101235((char *)i,(param_3 + param_2) - (long)i), iVar1 != 0)) {
FUN_001012b2(param_2,0,uVar2);
return -1;
}
i = (int *)((long)i + 1);
} while (i != (int *)(param_2 + uVar2));
}
return uVar2;
}
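The wrapped_read function simply calls the actual read via a manual syscall. Once it returns, and provided the file descriptor is 1 and more than 4 bytes were read, the hook scans the buffer for the 4-byte value 0x7b425448, which, read as little-endian ASCII, is HTB{. Looks like we are close to the flag! When that marker is found, the bytes following it are passed to the next function, FUN_00101235, in which we see: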
long FUN_00101235(char *param_1,long param_2)
{
long i;
char stackStr [30];
stackStr[0] = 'l';
stackStr[1] = '5';
stackStr[2] = '{';
stackStr[3] = '0';
stackStr[4] = 'v';
stackStr[5] = '0';
stackStr[6] = 'Y';
stackStr[7] = '7';
stackStr[8] = 'f';
stackStr[9] = 'V';
stackStr[10] = 'f';
stackStr[11] = '?';
stackStr[12] = 'u';
stackStr[13] = '>';
stackStr[14] = '|';
stackStr[15] = ':';
stackStr[16] = 'O';
stackStr[17] = '!';
stackStr[18] = '|';
stackStr[19] = 'L';
stackStr[20] = 'x';
stackStr[21] = '!';
stackStr[22] = 'o';
stackStr[23] = '$';
stackStr[24] = 'j';
stackStr[25] = ',';
stackStr[26] = ';';
stackStr[27] = 'f';
stackStr[28] = '\0';
i = 0;
if (param_2 == 0) {
return i;
}
while( true ) {
if ((char)(param_1[i] ^ stackStr[i]) != i) {
return 0;
}
i = i + 1;
if (param_2 == i) break;
if (i == 0x1c) {
return 1;
}
}
return 0;
}
Another stack string. This time, each byte of it is XOR’d with the corresponding byte of our input, and the result is checked to be equal to i, which is our index into the string. Since XOR is its own inverse, we can reverse this by taking that string and XORing each byte with an incrementing value…
In [1]: string =''' stackStr[0] = 'l';
...: stackStr[1] = '5';
...: stackStr[2] = '{';
...: stackStr[3] = '0';
...: stackStr[4] = 'v';
...: stackStr[5] = '0';
...: stackStr[6] = 'Y';
...: stackStr[7] = '7';
...: stackStr[8] = 'f';
...: stackStr[9] = 'V';
...: stackStr[10] = 'f';
...: stackStr[11] = '?';
...: stackStr[12] = 'u';
...: stackStr[13] = '>';
...: stackStr[14] = '|';
...: stackStr[15] = ':';
...: stackStr[16] = 'O';
...: stackStr[17] = '!';
...: stackStr[18] = '|';
...: stackStr[19] = 'L';
...: stackStr[20] = 'x';
...: stackStr[21] = '!';
...: stackStr[22] = 'o';
...: stackStr[23] = '$';
...: stackStr[24] = 'j';
...: stackStr[25] = ',';
...: stackStr[26] = ';';
...: stackStr[27] = 'f';
...: stackStr[28] = '\0';'''
In [2]: for i,c in enumerate(string.split('\n')[:-1]):  # skip the '\0' terminator, which is never compared
...:     print(chr(ord(c.split("'")[1])^i),end='')
l4y3r5_0n_l4y3r5_0n_l4y3r5!}
Prepending the HTB{ marker that the hook searches for, we get
HTB{l4y3r5_0n_l4y3r5_0n_l4y3r5!}
Challenge solved, statically. Not too bad.
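As a sanity check, the hook’s logic can be re-created in Python and run against the recovered flag. This is a sketch based on my reading of the decompilation above, with invented names (`MAGIC`, `KEY`, `check_tail`, `hook_would_trigger`); it models the 0x400-byte buffer from main as a zero-padded bytes object.

```python
MAGIC = (0x7B425448).to_bytes(4, "little")  # the marker the hook scans for: b"HTB{"
KEY = b"l5{0v0Y7fVf?u>|:O!|Lx!o$j,;f"       # the 0x1c-byte stack string


def check_tail(tail: bytes) -> bool:
    """Mirror FUN_00101235: tail[i] ^ KEY[i] must equal i for all 0x1c bytes.

    The decompiled loop bails out if it runs out of room first, so strictly
    more than 0x1c bytes must remain in the buffer for it to return 1.
    """
    if len(tail) <= len(KEY):
        return False
    return all((tail[i] ^ KEY[i]) == i for i in range(len(KEY)))


def hook_would_trigger(buf: bytes, nread: int) -> bool:
    """Mirror the scan in hooked_read over the first `nread` bytes of `buf`."""
    for pos in range(0, nread - 4):
        if buf[pos:pos + 4] == MAGIC and check_tail(buf[pos + 4:]):
            return True
    return False


# Recover the flag, then feed it back through the re-created check.
flag = MAGIC + bytes(k ^ i for i, k in enumerate(KEY))
line = flag + b"\n"
buf = line.ljust(0x400, b"\0")
print(flag.decode(), hook_would_trigger(buf, len(line)))
```

If the reading of the decompilation is right, the recovered flag passes the check while any other input does not, which gives some extra confidence in a purely static solve.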
Conclusion
How can we apply this to our work? As reverse engineers, when we encounter unknown program behavior, we must be willing to learn and research until we have a solid understanding of what is occurring. This is why CTF challenges are a great way to practice your skills and improve your pattern recognition. If you are interested in more CTF challenges, please take a look at our retired challenges from our first season of the Automotive CTF (https://ctf.blockharbor.io) and their associated walkthroughs available at https://vsec.blockharbor.io/theplunge. If you are someone who enjoys participating in CTFs, you might be interested in our upcoming season two CTF. Check out the BlockHarbor LinkedIn page (https://www.linkedin.com/company/block-harbor) for the latest information on our upcoming events!