Mach-O file format for red team professionals - Part 4
Diving deep into the Load Commands part of the Mach-O file format. Mach-O is the preferred file format on macOS.
Previously, we looked at the Mach-O file format at a high level, covered the differences between standard and universal binaries, and discussed the header part in detail. Now, let's continue our descent into the details of the Mach-O file format and talk about the Load Commands part.
One of the most critical components of a Mach-O file is the Load Command part. This part contains essential instructions that guide the macOS loader on how to load and execute the file.
Each Mach-O file may contain multiple load commands, which specify different types of information like:
The program's entry point
Memory layout
Linked libraries
Code signature details
Debugging symbols
These commands help macOS understand how to handle the file correctly. In a Mach-O file, the Load Commands are located right after the initial Mach-O header.
Each Load Command starts with a common structure:
cmd - Specifies the type of command (e.g., segment, dynamic linker, library, etc.). This holds one of the LC_* constants defined in loader.h (54 values at the time of writing; the count changes between SDK versions).
cmdsize - Specifies the total size of the command in bytes, including these two header fields.
Additional fields follow, depending on the command type.
struct load_command {
uint32_t cmd; // For example, LC_SEGMENT_64
uint32_t cmdsize; // Size of this command
};
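As a minimal sketch of how a tool might read this common header, the snippet below unpacks cmd and cmdsize from raw bytes with Python's struct module. It assumes a little-endian image (the case on x86_64 and arm64 Macs). The function name read_load_command_header and the sample bytes are illustrative, not part of any Apple API.

```python
import struct

# struct load_command: two little-endian uint32 fields (cmd, cmdsize).
LOAD_COMMAND = struct.Struct("<II")


def read_load_command_header(buf: bytes, offset: int = 0):
    """Return (cmd, cmdsize) for the load command starting at `offset`."""
    cmd, cmdsize = LOAD_COMMAND.unpack_from(buf, offset)
    if cmdsize < LOAD_COMMAND.size:
        # cmdsize counts the header itself, so it can never be below 8.
        raise ValueError(f"cmdsize {cmdsize} is smaller than the 8-byte header")
    return cmd, cmdsize


# Synthetic example: cmd = 0x19 (LC_SEGMENT_64), cmdsize = 72.
sample = struct.pack("<II", 0x19, 72)
cmd, cmdsize = read_load_command_header(sample)
print(hex(cmd), cmdsize)  # prints: 0x19 72
```

Because cmdsize covers the whole command, adding it to the current offset gives the start of the next load command.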
Mach-O supports many types of Load Commands, each serving a specific purpose. Below are the most common ones:
LC_SEGMENT_64 / LC_SEGMENT (Memory Segments) - Defines a segment in memory (e.g., __TEXT, __DATA, __LINKEDIT). Each segment may contain multiple sections that hold actual code, data, or metadata.
LC_LOAD_DYLIB (Dynamic Libraries) - Specifies a shared library (dylib) that the executable depends on. Other load commands pertaining to dynamic libraries include LC_LOAD_WEAK_DYLIB, which allows optional linking (if the library is missing, execution still continues), and LC_REEXPORT_DYLIB, which re-exports symbols from another library.
LC_LOAD_DYLINKER / LC_DYLD_ENVIRONMENT (Dynamic Linker) - LC_LOAD_DYLINKER specifies the dynamic linker (like /usr/lib/dyld) that will be used to load the program. LC_DYLD_ENVIRONMENT uses the same structure to pass a DYLD_* environment variable string to dyld.
LC_UUID (Unique Identifier) - Contains a 128-bit UUID to uniquely identify the Mach-O file. It is used for debugging and crash reports.
LC_MAIN (Program Entry Point) - Specifies the offset of the entry point function (like main() in C). It replaces the older LC_UNIXTHREAD command.
LC_SYMTAB (Symbol Table) - Locates the symbol table and string table, which hold function names, variable names, and debugging symbols. It is useful for debugging and profiling tools.
LC_DYSYMTAB (Dynamic Symbol Table) - Stores additional information about dynamically linked symbols.
LC_CODE_SIGNATURE (Code Signing) - Points to the code signature data (stored in __LINKEDIT) used to verify file integrity and authenticity.
LC_RPATH (Runtime Search Path) - Defines custom search paths for dynamic libraries.
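To make the list above concrete, here is a small lookup sketch that maps raw cmd values to the names discussed. The numeric values are the ones published in loader.h. The table covers only the commands mentioned in this post, and LC_NAMES and lc_name are illustrative names chosen for this example.

```python
# Commands that dyld must understand carry this high bit in their cmd value.
LC_REQ_DYLD = 0x80000000

# Partial table: only the load commands discussed in this post.
LC_NAMES = {
    0x1: "LC_SEGMENT",
    0x2: "LC_SYMTAB",
    0x5: "LC_UNIXTHREAD",
    0xB: "LC_DYSYMTAB",
    0xC: "LC_LOAD_DYLIB",
    0xE: "LC_LOAD_DYLINKER",
    0x18 | LC_REQ_DYLD: "LC_LOAD_WEAK_DYLIB",
    0x19: "LC_SEGMENT_64",
    0x1B: "LC_UUID",
    0x1C | LC_REQ_DYLD: "LC_RPATH",
    0x1D: "LC_CODE_SIGNATURE",
    0x1F | LC_REQ_DYLD: "LC_REEXPORT_DYLIB",
    0x27: "LC_DYLD_ENVIRONMENT",
    0x28 | LC_REQ_DYLD: "LC_MAIN",
}


def lc_name(cmd: int) -> str:
    """Return a readable name for a cmd value, or a hex fallback."""
    return LC_NAMES.get(cmd, f"unknown (0x{cmd:x})")


print(lc_name(0x19))        # prints: LC_SEGMENT_64
print(lc_name(0x80000028))  # prints: LC_MAIN
```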
You can view the load commands of a Mach-O file by using otool with the -lv flags:
otool -lv /bin/ls
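For readers who want to see roughly what such a listing involves, the following sketch walks the load commands of a Mach-O image held in memory. It is a simplified assumption-laden version: it handles only thin, little-endian, 64-bit files, and iter_load_commands is a hypothetical helper name, not how otool itself is implemented.

```python
import struct

MH_MAGIC_64 = 0xFEEDFACF
# struct mach_header_64: magic, cputype, cpusubtype, filetype,
# ncmds, sizeofcmds, flags, reserved (32 bytes in total).
MACH_HEADER_64 = struct.Struct("<IiiIIIII")


def iter_load_commands(data: bytes):
    """Yield (offset, cmd, cmdsize) for each load command in the image."""
    magic, _, _, _, ncmds, sizeofcmds, _, _ = MACH_HEADER_64.unpack_from(data, 0)
    if magic != MH_MAGIC_64:
        raise ValueError("not a thin little-endian 64-bit Mach-O")
    offset = MACH_HEADER_64.size       # load commands start right after the header
    end = offset + sizeofcmds
    for _ in range(ncmds):
        cmd, cmdsize = struct.unpack_from("<II", data, offset)
        if cmdsize < 8 or offset + cmdsize > end:
            raise ValueError(f"malformed load command at offset {offset}")
        yield offset, cmd, cmdsize
        offset += cmdsize              # cmdsize leads to the next command
```

To try it on a real file, read the bytes with open(path, "rb").read() and pass them in. A universal binary would first need one architecture slice extracted, since this sketch does not parse the fat header.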
Let's take a closer look at the structures for the LC_SEGMENT and LC_SEGMENT_64 load commands:
struct segment_command {
uint32_t cmd; // LC_SEGMENT
uint32_t cmdsize; // includes sizeof section structs
char segname[16]; // segment name
uint32_t vmaddr; // memory address of this segment
uint32_t vmsize; // memory size of this segment
uint32_t fileoff; // file offset of this segment
uint32_t filesize; // amount to map from the file
int32_t maxprot; // maximum VM protection
int32_t initprot; // initial VM protection
uint32_t nsects; // number of sections in segment
uint32_t flags; // flags
};
struct segment_command_64 {
uint32_t cmd; // LC_SEGMENT_64
uint32_t cmdsize; // includes sizeof section_64 structs
char segname[16]; // segment name
uint64_t vmaddr; // memory address of this segment
uint64_t vmsize; // memory size of this segment
uint64_t fileoff; // file offset of this segment
uint64_t filesize; // amount to map from the file
int32_t maxprot; // maximum VM protection
int32_t initprot; // initial VM protection
uint32_t nsects; // number of sections in segment
uint32_t flags; // flags
};
The structure starts with the cmd and cmdsize fields common to every load command, followed by segname, a character array that holds the segment's name. The name can be at most 16 bytes long and is padded with NUL bytes when shorter.
Next are vmaddr and vmsize, which define the memory region where the segment should be loaded. The fields fileoff and filesize specify the region of the file from which the segment should be loaded.
The maxprot and initprot fields indicate the maximum and initial memory protection levels for the segment, respectively. The nsects field specifies the number of section structures that follow. Lastly, flags controls various segment-specific settings.
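The protection fields are bit masks built from the VM_PROT_READ (1), VM_PROT_WRITE (2), and VM_PROT_EXECUTE (4) constants. As a small illustration, the helper below (prot_to_str is a name made up for this example) renders a protection value in the familiar rwx notation.

```python
VM_PROT_READ = 0x1
VM_PROT_WRITE = 0x2
VM_PROT_EXECUTE = 0x4


def prot_to_str(prot: int) -> str:
    """Render a maxprot/initprot value as an 'rwx'-style string."""
    flags = ((VM_PROT_READ, "r"), (VM_PROT_WRITE, "w"), (VM_PROT_EXECUTE, "x"))
    return "".join(ch if prot & bit else "-" for bit, ch in flags)


print(prot_to_str(5))  # prints: r-x
print(prot_to_str(3))  # prints: rw-
```

A segment whose initprot decodes to rwx is both writable and executable, which is usually worth a second look during analysis.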
In the 64-bit version of this structure, the fields remain the same, except vmaddr, vmsize, fileoff, and filesize are expanded to 64-bit values.
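The 64-bit layout translates directly into a struct format string. The sketch below decodes a segment_command_64 from raw bytes, assuming little-endian data. Segment64 and parse_segment_64 are illustrative names, and the sample values are synthetic.

```python
import struct
from typing import NamedTuple

# cmd, cmdsize, segname[16], vmaddr, vmsize, fileoff, filesize,
# maxprot, initprot, nsects, flags (72 bytes in total).
SEGMENT_COMMAND_64 = struct.Struct("<II16sQQQQiiII")


class Segment64(NamedTuple):
    cmd: int
    cmdsize: int
    segname: str
    vmaddr: int
    vmsize: int
    fileoff: int
    filesize: int
    maxprot: int
    initprot: int
    nsects: int
    flags: int


def parse_segment_64(buf: bytes, offset: int = 0) -> Segment64:
    """Decode one segment_command_64 starting at `offset`."""
    fields = SEGMENT_COMMAND_64.unpack_from(buf, offset)
    # segname is NUL-padded; keep only the bytes before the first NUL.
    name = fields[2].split(b"\x00", 1)[0].decode("ascii", "replace")
    return Segment64(fields[0], fields[1], name, *fields[3:])


# Synthetic __TEXT segment with no sections.
raw = SEGMENT_COMMAND_64.pack(0x19, 72, b"__TEXT", 0x100000000, 0x4000,
                              0, 0x4000, 5, 5, 0, 0)
seg = parse_segment_64(raw)
print(seg.segname, hex(seg.vmaddr), seg.nsects)  # prints: __TEXT 0x100000000 0
```

For the 32-bit segment_command, the four Q fields would become I, giving a 56-byte structure.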
Immediately following the segment structure are nsects section structures (struct section for LC_SEGMENT and struct section_64 for LC_SEGMENT_64, both defined in loader.h). The following three segments are most commonly found in Mach-O binaries:
__TEXT - Contains executable code and read-only data
__DATA - Contains writable data
__LINKEDIT - Contains information used by the dynamic loader
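Since the section structures sit directly after the segment command, a parser can step through them using nsects. The sketch below does this for the 64-bit case, assuming little-endian data. iter_sections_64 is a hypothetical helper and the sample bytes are synthetic.

```python
import struct

SEGMENT_COMMAND_64 = struct.Struct("<II16sQQQQiiII")  # 72 bytes
# struct section_64: sectname[16], segname[16], addr, size, offset, align,
# reloff, nreloc, flags, reserved1, reserved2, reserved3 (80 bytes).
SECTION_64 = struct.Struct("<16s16sQQIIIIIIII")


def _cstr(raw: bytes) -> str:
    """Strip NUL padding from a fixed-width name field."""
    return raw.split(b"\x00", 1)[0].decode("ascii", "replace")


def iter_sections_64(buf: bytes, seg_offset: int = 0):
    """Yield (segname, sectname, addr, size) for each section of a segment."""
    nsects = SEGMENT_COMMAND_64.unpack_from(buf, seg_offset)[9]
    offset = seg_offset + SEGMENT_COMMAND_64.size
    for _ in range(nsects):
        fields = SECTION_64.unpack_from(buf, offset)
        yield _cstr(fields[1]), _cstr(fields[0]), fields[2], fields[3]
        offset += SECTION_64.size


# Synthetic __TEXT segment carrying a single __text section.
segment = SEGMENT_COMMAND_64.pack(0x19, 72 + 80, b"__TEXT", 0x100000000,
                                  0x4000, 0, 0x4000, 5, 5, 1, 0)
section = SECTION_64.pack(b"__text", b"__TEXT", 0x100003F00, 0x40,
                          0x3F00, 2, 0, 0, 0x80000400, 0, 0, 0)
for entry in iter_sections_64(segment + section):
    print(entry)
```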
Here’s how load commands work:
During execution, the Mach-O loader (dyld) processes the load commands to determine how the executable should be loaded into memory. First, the loader reads the Mach-O header and retrieves the list of load commands. It then processes these commands to allocate memory for different segments, such as __TEXT, __DATA, and __LINKEDIT, ensuring they are correctly mapped based on their memory protections. If the file depends on shared libraries, the loader resolves LC_LOAD_DYLIB commands to locate and link the required dynamic libraries. It also checks for security-related commands like LC_CODE_SIGNATURE to verify the file's integrity. Once all dependencies are resolved and memory is correctly allocated, the loader identifies the program's entry point from the LC_MAIN command and transfers control to the executable, allowing it to start running. If any critical load command is missing or malformed, the loader may terminate execution with an error.
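The last step, locating the entry point, can be sketched as follows. LC_MAIN uses struct entry_point_command, whose entryoff field is an offset relative to the start of __TEXT. The helper below adds it to the __TEXT vmaddr to get the unslid entry address. This is a simplification that ignores the ASLR slide applied at load time, and entry_address is a name invented for this example.

```python
import struct

LC_MAIN = 0x80000028
# struct entry_point_command: cmd, cmdsize, entryoff, stacksize (24 bytes).
ENTRY_POINT_COMMAND = struct.Struct("<IIQQ")


def entry_address(lc_main: bytes, text_vmaddr: int) -> int:
    """Return the unslid virtual address of the entry point."""
    cmd, _cmdsize, entryoff, _stacksize = ENTRY_POINT_COMMAND.unpack_from(lc_main, 0)
    if cmd != LC_MAIN:
        raise ValueError(f"expected LC_MAIN, got 0x{cmd:x}")
    return text_vmaddr + entryoff


# Synthetic LC_MAIN with entryoff = 0x3f00 and __TEXT at 0x100000000.
command = ENTRY_POINT_COMMAND.pack(LC_MAIN, 24, 0x3F00, 0)
print(hex(entry_address(command, 0x100000000)))  # prints: 0x100003f00
```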
You can view a Mach-O binary's dependencies using otool with the -L flag:
otool -L /bin/ls
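The dependency information shown by that command comes from the dylib load commands. As a rough sketch, the code below decodes a single struct dylib_command: the library path is stored as a NUL-terminated string at the given offset inside the command, and versions are packed as X.Y.Z in 16, 8, and 8 bits. The helper names and the synthetic command bytes are assumptions made for illustration.

```python
import struct

# struct dylib_command: cmd, cmdsize, name offset, timestamp,
# current_version, compatibility_version (all little-endian uint32).
DYLIB_COMMAND = struct.Struct("<IIIIII")


def version_str(value: int) -> str:
    """Decode a packed X.Y.Z version number."""
    return f"{value >> 16}.{(value >> 8) & 0xFF}.{value & 0xFF}"


def parse_dylib_command(buf: bytes, offset: int = 0):
    """Return (path, compatibility_version, current_version)."""
    _cmd, cmdsize, name_off, _ts, current, compat = DYLIB_COMMAND.unpack_from(buf, offset)
    # The path lives inside the command, from name_off up to cmdsize.
    raw = buf[offset + name_off:offset + cmdsize]
    path = raw.split(b"\x00", 1)[0].decode("utf-8", "replace")
    return path, version_str(compat), version_str(current)


# Synthetic LC_LOAD_DYLIB (0xc): 24-byte header plus a NUL-padded path.
name = b"/usr/lib/libSystem.B.dylib"
body = name + b"\x00" * (32 - len(name))
command = DYLIB_COMMAND.pack(0xC, 24 + len(body), 24, 0,
                             (1 << 16) | (2 << 8) | 3, 1 << 16) + body
print(parse_dylib_command(command))
```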
Red Team Notes - Load commands in the Mach-O file format provide essential instructions for loading and executing a program on macOS. They define memory segments, dynamic libraries, entry points, and security settings. During execution, the Mach-O loader (dyld) processes these commands to allocate memory, resolve dependencies, and verify integrity before starting the program. Missing or incorrect load commands can prevent execution.