Abstract
HyperVenom is a hypervisor injection and attachment framework that leverages Microsoft’s Hyper-V for memory introspection. Taking advantage of Microsoft’s Virtualization-Based Security (VBS) on Windows 11, the framework intercepts VM-exits directly inside the hypervisor, allowing code execution at the hypervisor privilege level. HyperVenom demonstrates how a lightweight, symbiotic payload can bypass Ring 0 visibility without causing the timing or performance anomalies that telemetry would pick up.
Background: Hypervisors and VBS
A hypervisor, also known as a virtual machine monitor (VMM), is software that creates and manages virtual machines on a physical host machine. Hyper-V is Windows’ built-in hypervisor.
VBS (Virtualization-Based Security), a Microsoft security feature enabled by default on most modern Windows 11 systems, uses hardware virtualization to isolate sensitive security features from normally accessible kernel code. It raises the cost of kernel exploitation and makes it harder for malware to tamper with the kernel or achieve persistence. Features like HVCI (Hypervisor-protected Code Integrity) and Credential Guard rely on this isolation model.
On VBS-enabled systems, Hyper-V boots first and owns the real hardware translation path. To the user, Windows seems like the host machine, but under the hood, it is treated like a managed guest partition under the hypervisor.
Memory access on a VBS-enabled system is mediated by two translation layers. The operating system first resolves a Guest Virtual Address (GVA) to a Guest Physical Address (GPA) through its own page tables and, from the kernel’s perspective, appears to be modifying hardware memory directly. That Guest Physical Address, however, is not the final hardware destination. Hyper-V then performs a second translation to the underlying Machine Address (MA). This GPA -> MA indirection (Second Level Address Translation, or SLAT) enables the hypervisor to enforce isolation and maintain authoritative control over memory visibility. Code executing at Ring -1 therefore sits at the control boundary beneath Ring 0 and can observe, constrain, or redirect mappings that the kernel cannot directly govern.
The Approach: Symbiosis
The goal isn’t to replace the hypervisor, but to live inside of it. Microsoft’s Hyper-V already controls the hardware, manages page tables, and handles VM-exits. Fighting that infrastructure is a losing battle.
What made this project possible was being symbiotic rather than adversarial. Unlike previous approaches that attempt to build entirely custom hypervisors from scratch (like Illusion-rs), rely on active dual-EPT switching (like noahware/hyper-reV), or destroy SLAT memory protections entirely (like Ring-1.io), acting as a parasite inside the existing infrastructure avoids state collisions. This approach repeatedly got me past the biggest hurdles, where everything I tried would otherwise triple fault. Instead of fighting Hyper-V, I use native APIs and inject just enough code to intercept the events I care about, handing everything else back to Microsoft untouched.
Project Architecture
The project is broken down into three primary parts:
- Part 1: The Bootloader (Injector): A UEFI boot application that runs before `bootmgfw.efi` to hijack execution flow and inject the attachment into Hyper-V as it loads.
- Part 2: The Attachment (Payload): This code sits inside the hypervisor, intercepts VM-exits, manipulates hardware page tables, and performs memory introspection.
- Part 3: The Usermode Interface(s): This usermode hypercall library acts as a communication channel, enabling a standard Ring 3 guest application to pass discrete messages into Ring -1.
Technical Deep Dive Part 1 - The Bootloader
In the normal Windows boot sequence, the boot manager bootmgfw.efi loads the boot application winload.efi. Winload, as the name suggests, loads the Windows kernel. Under VBS, it also stages the Hyper-V environment through hvloader.dll and the hypervisor binary hvix64.exe (on Intel systems).
The bootloader follows the execution flow through this process, intercepting the hypervisor launch function hv_launch. Subsequently, it maps the payload into the hypervisor’s page tables.
I break this process into four phases:
Phase 1: UefiMain
This first phase loads the bootloader, prepares it for future phases, and hooks the ImgpLoadPEImage function, which loads the winload executable into memory.
The application’s base address is saved so it can later be protected in the GetMemoryMap hook. Then, the attachment EFI\HyperVenom\HyperVenomAttachment.efi is loaded into memory (as EfiRuntimeServicesData for persistence). The hypervisor heap (EfiRuntimeServicesCode) and the log buffer are also allocated here. Finally, a hook is planted on GetMemoryMap.
GetMemoryMap:
`winload` calls this to get a map of the system’s physical RAM from the firmware. At this point, standard UEFI applications are marked to be overwritten. `GetMemoryMap` is intercepted, and the original function is called to obtain the true memory layout. That layout is modified to protect the application base, the hypervisor heap, and the trampoline, and is then forwarded to winload. The bootloader’s memory is retyped to `EfiRuntimeServicesCode` with `EFI_MEMORY_RUNTIME` so that it is not reclaimed.
A similar approach is used to bypass TPM measurements.
The HashLogExtendEvent function within the EFI_TCG2_PROTOCOL is intercepted. Due to the Static Root of Trust for Measurement (SRTM), the initial execution of the bootkit is inherently logged to PCR 4 before UefiMain gains control. To prevent OS-level security panics (such as BitLocker Recovery triggers), the correct bytes must be passed into PCR4. When the bootkit manually chainloads the Windows OS, a hook intercepts the secondary measurement, reads the authentic Microsoft bootloader from the disk into a hidden buffer, and feeds those bytes to the TPM. In testing, this kept PCR 11 and 12 aligned with expected values and reduced System Guard alerts on the modified chain.
What the TPM measures
During boot, the UEFI firmware extends measurements into Platform Configuration Registers (PCRs) inside the TPM chip. “Extending” means:
PCR[n] = SHA-256(PCR[n] || SHA-256(data))
It’s a hash chain, meaning each measurement is cryptographically bound to all previous measurements. The PCR values represent the entire boot history.
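The extend semantics can be sketched in a few lines of C. This is an illustrative model only: it substitutes 64-bit FNV-1a for SHA-256 purely to show the chaining behavior, and the function names are mine, not a TPM API.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for SHA-256 (64-bit FNV-1a). A real TPM uses SHA-256;
 * this only demonstrates the extend semantics, not the cryptography. */
static uint64_t toy_hash(const void *data, size_t len) {
    const uint8_t *p = (const uint8_t *)data;
    uint64_t h = 0xcbf29ce484222325ULL;
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 0x100000001b3ULL;
    }
    return h;
}

/* PCR[n] = H(PCR[n] || H(data)): each extend is bound to every
 * measurement that came before it, so order matters. */
static uint64_t pcr_extend(uint64_t pcr, const void *data, size_t len) {
    uint64_t measured = toy_hash(data, len);
    uint8_t buf[16];
    memcpy(buf, &pcr, 8);
    memcpy(buf + 8, &measured, 8);
    return toy_hash(buf, sizeof buf);
}
```

Because the old PCR value is folded into every new one, the final value is a commitment to the whole boot sequence, which is exactly why the spoof below has to feed the clean bytes in at the right moment.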
PCR assignment for Secure Boot:
| PCR | What gets measured |
|---|---|
| 0 | BIOS firmware code |
| 4 | Boot loader (bootmgfw.efi) |
| 7 | Secure Boot state |
| 11 | BitLocker key release |
When the firmware loads bootx64.efi (which is the bootloader), it calls TCG2.HashLogExtendEvent() to measure it into PCR 4. This means PCR 4 would contain our bootloader’s hash instead of the authentic bootloader’s hash.
What the spoof does
EFI_STATUS EFIAPI HookedHashLogExtendEvent(..., DataToHash, DataToHashLen, ...) {
if (DataToHash == g_AppBase) {
// The firmware is measuring US.
// Feed it the clean bootmgfw.efi we cached from disk instead.
return g_OriginalHashLogExtendEvent(
..., CleanMicrosoftBuffer, CleanMicrosoftSize, ...);
}
// Everything else gets measured normally
return g_OriginalHashLogExtendEvent(..., DataToHash, DataToHashLen, ...);
}
Next SetVirtualAddressMap is hooked for later:
SetVirtualAddressMap: SVAM is called by `winload` to switch the CPU from raw physical addressing to virtual memory. Here the bootloader’s memory must be transferred as well.
The problem
During UEFI boot, everything uses physical addresses (identity mapped). When Windows calls ExitBootServices(), it’s about to switch to virtual addressing. It calls SetVirtualAddressMap() to tell UEFI runtime services “here’s where your physical pages ended up in virtual memory.”
Normal UEFI runtime drivers get their pointers automatically translated. But the bootloader is a standalone UEFI application that hooked itself into the memory map as EfiRuntimeServicesCode. UEFI doesn’t know about HyperVenom’s internal global variables, thus it won’t translate them.
The fix:
During SVAM, pointers are translated into virtual memory. The hypervisor heap is skipped, as it will be used to build Extended Page Tables later and must remain a physical address. Trampolines are also translated such that the hooks remain functional.
The untouched original boot manager is now loaded in, with the bootloader acting as a proxy loader.
Now that the boot manager is loaded, ImgpLoadPEImage is found in its .text section. This is the function which loads winload.efi.
It is hooked using a trampoline hook.
Trampoline Hook: This hook works by finding a code cave, building a trampoline in the cave with stolen native instructions, preserving the original logic, and then overwriting the stolen bytes with a detour. The CPU’s hardware write protection is dropped to do this.
Code Cave: When the MSVC compiler emits a PE, it pads the space between functions with `0xCC` bytes, leaving unused executable memory.
...[end of FunctionA]...
CC CC CC CC CC CC CC CC CC CC CC CC CC CC CC CC ← Code cave
CC CC CC CC CC CC CC CC CC CC CC CC CC CC CC CC
...[start of FunctionB]...
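A minimal scanner for this padding might look like the following. It is a sketch, not HyperVenom’s actual routine: a real implementation would also respect section bounds and required hook-size alignment.

```c
#include <stddef.h>
#include <stdint.h>

/* Scan a .text region for a run of `need` consecutive 0xCC padding
 * bytes (MSVC inter-function padding). Returns the offset of the
 * first byte of the first such run, or -1 if none is found. */
static ptrdiff_t find_code_cave(const uint8_t *text, size_t size, size_t need) {
    size_t run = 0;
    for (size_t i = 0; i < size; i++) {
        run = (text[i] == 0xCC) ? run + 1 : 0;
        if (run == need)
            return (ptrdiff_t)(i - need + 1);  /* start of the run */
    }
    return -1;
}
```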
The trampoline structure
When hooking ImgpLoadPEImage, you need a trampoline so the hook can call the original function. Here’s the layout:
Original Function (e.g. ImgpLoadPEImage):
┌──────────────────────────────┐
│ JMP [rip] → HookedFunction │ ← 14 bytes (overwrites original code)
│ (address payload) │
├──────────────────────────────┤
│ ... rest of function ... │ ← Continues from byte 15 onward
└──────────────────────────────┘
Code Cave (found via 0xCC scan):
┌──────────────────────────────┐
│ [15 stolen bytes] │ ← The original first 15 bytes, copied here
├──────────────────────────────┤
│ JMP [rip] → Original + 15 │ ← 14-byte absolute jump back
│ (address payload) │
└──────────────────────────────┘
Call flow:
1. Someone calls `ImgpLoadPEImage`.
2. The first 14 bytes are the JMP, so execution goes to the detour function `HookedWinloadPeImage`.
3. The detour does its work, then calls `g_OriginalWinloadPEImage` (which points to the code cave).
4. The code cave executes the 15 stolen bytes, then JMPs back to original+15.
5. The rest of the original function executes normally and returns.
Why 15 bytes? The hook JMP is 14 bytes (FF 25 00 00 00 00 + 8-byte address). But x86 instructions have variable length and you can’t split an instruction. If byte 14 is in the middle of a multi-byte instruction, you’d corrupt it. 15 bytes is chosen because it’s the maximum x86 instruction length, guaranteeing at least one complete instruction boundary.
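The 14-byte jump itself can be emitted with a small helper. This is a sketch with names of my own choosing, and it assumes a little-endian (x86) host, which holds everywhere this document applies.

```c
#include <stdint.h>
#include <string.h>

/* Encode the 14-byte absolute jump used for hooks:
 *   FF 25 00 00 00 00   jmp qword ptr [rip+0]
 *   <8-byte target>     address payload read by the jmp
 * `out` must have room for 14 bytes. */
static void emit_abs_jmp(uint8_t *out, uint64_t target) {
    static const uint8_t stub[6] = { 0xFF, 0x25, 0x00, 0x00, 0x00, 0x00 };
    memcpy(out, stub, sizeof stub);
    memcpy(out + 6, &target, sizeof target);  /* little-endian address */
}
```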
Phase 2: The Pivot
When the boot manager loads an application, calling ImgpLoadPEImage, the hook intercepts it, lets Microsoft’s loader finish, and then checks the file name. This continues until winload.efi is found.
Once winload.efi is loaded, BlImgLoadPEImage is found in its .text section using pattern scanning.
This specific (wildcard) pattern masks out the compiler’s stack frame allocation sizes (sub rsp, ???), ensuring that the hypervisor can still locate the 15-parameter core application loader (BlImgLoadPEImage) even if Microsoft modifies winload.efi’s local variables in future Windows updates. An improvement could use Zydis for further resilience.
A trampoline hook is then planted on this winload.efi application loader.
BlImgLoadPEImage:
winload.efiuses this function to load the core Windows operating system and hypervisor modules into memory.
The bootloader is now in place to monitor the rest of the boot chain and wait for the target hypervisor modules to load in Phase 3.
Phase 3: Pre-Virtualization Catch
Using the BlImgLoadPEImage hook established in Phase 2, the Windows boot chain is passively monitored as the operating system loads its core modules into memory. Microsoft loads each file natively; the hook only intervenes when the file name matches hvix64.exe or hvloader.dll.
When the Intel Hypervisor (hvix64.exe) is caught loading, the Relative Virtual Address (RVA) of its Entry Point is extracted from the PE headers. Using a wildcard pattern, the Hardware VM-exit stub is found in the .text section.
Instead of hooking it now, the delta between the hypervisor’s Entry Point and this VM-exit stub is cached. This is necessary to locate the true VM-exit handler later without performing pattern scans on live, isolated, SLAT-protected hypervisor memory.
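The delta trick is plain pointer arithmetic: cache an offset between two points now, re-derive the second point after relocation later. A sketch, with illustrative addresses and names:

```c
#include <stdint.h>

/* Cache the offset between an anchor (e.g. a module entry point) and
 * a target (e.g. the VM-exit stub) while both are readable. */
static uint64_t cache_delta(uint64_t anchor_va, uint64_t target_va) {
    return target_va - anchor_va;
}

/* Later, re-derive the target from the live (relocated) anchor without
 * scanning protected memory. */
static uint64_t resolve_live(uint64_t live_anchor_va, uint64_t delta) {
    return live_anchor_va + delta;
}
```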
When hvloader.dll is caught, a similar pattern scan finds hv_launch, the thunk responsible for ultimately launching Hyper-V.
Once hv_launch is found, the delta between it and the base of hvloader.dll is cached to dynamically resolve the live loader again later. A 14-byte absolute jump hook is placed directly onto hv_launch, pointing execution to the custom assembly wrapper (AsmHypervisorLaunchIntercept). The wrapper saves registers and prepares the stack for HookedHvlLaunchHypervisor, the hv_launch detour.
Absolute Hook: Unlike the trampoline hooks used earlier, I explicitly do not build a trampoline for `hv_launch`. Instead, the CPU’s hardware write protection is temporarily dropped, the original 14 bytes of `hv_launch` are copied into a global array (`g_OriginalHvLaunchBytes`), and the start of the function is overwritten with an absolute jump. This ensures that when Phase 4 intercepts the launch, it can restore the original bytes inline and perform a clean handoff back to Microsoft.
The bootloader is now ready to intercept the boot flow when Windows attempts to transition the CPU into a virtualized hardware state in Phase 4.
Phase 4: Ring -1 Escalation
When Windows attempts to launch Hyper-V, the absolute jump redirects to HookedHvlLaunchHypervisor. The original 14 bytes of hv_launch are immediately restored in memory to prepare for the handoff back to Microsoft.
1. Live Resolution of winload.efi
Using the delta cached in Phase 3, the live base address of hvloader.dll is resolved in memory. The bootloader walks backward from this base, parses the PE DOS and NT headers, traverses the Import Directory, and resolves the live base of winload.efi. The .text section of winload.efi is then scanned to locate its native memory manager, BlMmMapPhysicalAddressEx.
2. Shared PML4 Bridge
The attachment needs to execute inside Hyper-V’s address space, but Hyper-V’s CR3 is completely isolated from the bootloader’s CR3. This is solved by injecting the same page table entry into both CR3s, creating a shared virtual address range that maps to the same physical memory regardless of which context the CPU is in.
Here, Microsoft’s native boot memory allocator (BlMmMapPhysicalAddressEx) is used to map the payload allocation into the active page tables of both the guest OS loader and the isolated hypervisor. Rather than building a page table allocator and risking conflicts with Winload’s memory state, Microsoft’s own API does the work (symbiosis). The page table structures (PDPT, PD, PT) are written into PML4 Index 500, which corresponds to the high-canonical virtual address 0xFFFFFA0000000000.
Why use Microsoft’s memory manager? During `hv_launch`, Windows is tearing down the UEFI memory map and building secure hypervisor structures. Standard allocations get overwritten or fault on access. By reusing the loader’s own API, the mapping stays native to the transition path and avoids exhausting Winload’s limited PTEs. This was a key step in avoiding immediate crashes.
To evade hardware MMU traps and HVCI shadow maps, these structures are mapped using explicit hardware flags 0x63 (Present, Read/Write, Accessed, Dirty, Write-Back PAT). The TLB is then flushed by reloading CR3 (__writecr3(__readcr3())).
3. Native PE Mapping
Because this high-canonical virtual address maps to the exact same physical memory inside both the guest’s CR3 and Hyper-V’s CR3, it creates a shared memory window across the privilege boundary. This allows the payload, executing in the bootloader environment, to safely calculate page boundaries, translate isolated virtual addresses into raw physical addresses, and directly overwrite the hypervisor’s hardware VM-exit handler.
With the shared memory window established, the VMX-Root payload (HyperVenomAttachment.efi) is manually mapped into the new executable virtual address space. Uninitialized memory regions are zeroed to prevent security cookie access violations. The Portable Executable (PE) headers and sections are copied into the space, and Base Relocation blocks (IMAGE_REL_BASED_DIR64) are processed to ensure the payload functions correctly at the new virtual base. Finally, the .payload section is located to resolve the payload’s entry point.
4. Isolated Page Table (PT) Walk
Hyper-V’s memory is isolated, so the delta from Phase 3 is used to get the virtual address of the VM-exit handler. A page table walk is done through Hyper-V’s isolated CR3 page tables (traversing the PML4E, PDPTE, PDE, and PTE), calculating 1GB, 2MB, or 4KB page boundaries to successfully translate the VM-exit handler’s virtual address into a raw physical address.
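A simplified version of such a 4-level walk is sketched below, with a flat buffer standing in for physical RAM (the real code reads each entry through the PTE bridge instead). It handles the 1GB and 2MB large-page cases the text mentions; permissions, NX, and 5-level paging are deliberately ignored.

```c
#include <stdint.h>
#include <string.h>

#define PTE_PRESENT 0x1ULL
#define PTE_LARGE   0x80ULL                    /* PS bit in PDPTE/PDE */
#define ADDR_MASK   0x000FFFFFFFFFF000ULL

/* Read a 64-bit entry from simulated physical RAM starting at PA 0. */
static uint64_t rd64(const uint8_t *phys, uint64_t pa) {
    uint64_t v;
    memcpy(&v, phys + pa, sizeof v);
    return v;
}

/* Walk a 4-level x86-64 page table rooted at `cr3`.
 * Returns the physical address, or 0 on a non-present entry. */
static uint64_t va_to_pa(const uint8_t *phys, uint64_t cr3, uint64_t va) {
    uint64_t pml4e = rd64(phys, (cr3 & ADDR_MASK) + (((va >> 39) & 0x1FF) * 8));
    if (!(pml4e & PTE_PRESENT)) return 0;
    uint64_t pdpte = rd64(phys, (pml4e & ADDR_MASK) + (((va >> 30) & 0x1FF) * 8));
    if (!(pdpte & PTE_PRESENT)) return 0;
    if (pdpte & PTE_LARGE)                     /* 1GB page */
        return (pdpte & 0x000FFFFFC0000000ULL) | (va & 0x3FFFFFFFULL);
    uint64_t pde = rd64(phys, (pdpte & ADDR_MASK) + (((va >> 21) & 0x1FF) * 8));
    if (!(pde & PTE_PRESENT)) return 0;
    if (pde & PTE_LARGE)                       /* 2MB page */
        return (pde & 0x000FFFFFFFE00000ULL) | (va & 0x1FFFFFULL);
    uint64_t pte = rd64(phys, (pde & ADDR_MASK) + (((va >> 12) & 0x1FF) * 8));
    if (!(pte & PTE_PRESENT)) return 0;
    return (pte & ADDR_MASK) | (va & 0xFFFULL);
}
```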
5. The XMM-Safe Master Router & The Canonical Anchor
I anchor my hook directly to the tail end of the 2MB Heap (ExecutionBase + 0x1E0000). Using the OS’s native BlMmMapPhysicalAddressEx, this physical address is mapped into Winload’s context and the assembly “Master Wrapper” is constructed.
This wrapper:
- Saves all General Purpose Registers (GPRs) and Volatile XMM Registers (XMM0-XMM5).
- Aligns the stack to the C-calling convention boundary.
- Executes my C-based detour payload.
- Restores the XMM and GPR state.
- Routes execution either to a premature `VMRESUME` or falls through to Microsoft’s native handler, based on the payload’s return value.
The Triple Fault Preventer
If VMRESUME fails due to an invalid guest state, the CPU can triple fault and reboot the machine. To prevent this, the wrapper includes a 14-byte JMP immediately after the VMRESUME instruction. If resume fails, execution falls through to that jump and returns to Microsoft’s original handler.
6. Planting the Hijack
The CPU’s hardware write protection (CR0.WP) is dropped and the wrapper is written to the anchor, followed by the stolen original Intel stack alignment bytes (18 bytes), followed by a 14-byte absolute jump back to the rest of Hyper-V. Finally, a 14-byte absolute jump is planted at the actual VM-exit entry point, pointing directly to the wrapper.
7. Removing the Guest Bridge
Before returning control to the system, the shared memory bridge from winload.efi’s CR3 is removed by zeroing out PML4 Index 500 and flushing the TLB via CR3 reload. This ensures the Windows kernel inherits a clean, artifact-free memory environment.
The raw pointer of the restored hv_launch is then returned to the initial assembly stub, which unwraps the stack (HOST_RSP) and jumps to the original execution path, booting Hyper-V natively with the payload actively loaded.
Physical Memory:
┌─────────────────────────────────────┐
│ 0x00000000 - 0x00000FFF │ Real Mode IVT / BDA
├─────────────────────────────────────┤
│ 0x00001000 │ ← BIOS template page
│ │ (cloned for EPT camouflage)
├─────────────────────────────────────┤
│ ... │
├─────────────────────────────────────┤
│ g_HeapPhysBase │ ← 2MB Heap Start
│ ├─ +0x000000 — PE Image │ Attachment code + data
│ ├─ +0x100000 — Usable Heap │ 4KB free-list blocks
│ ├─ +0xF0000 — PDPT (bridge) │ Page table level 3
│ ├─ +0xF1000 — PD (bridge) │ Page table level 2
│ ├─ +0xF2000 — PT (bridge) │ Page table level 1
│ │ ├─ [0-506] Static heap map │
│ │ ├─ [507] Scan slot │
│ │ ├─ [508] Primary dynamic │
│ │ └─ [509] Secondary dynamic │
│ ├─ +0x1E0000 — MasterWrapper │ 186-byte VM-exit trampoline
│ 1. Preserve GPRs, XMM0-XMM5, MXCSR │
│ 2. Stack Alignment │
│ 3. Execute .payload (Detour) │
│ 4. Restore GPRs, XMM0-XMM5, MXCSR │
│ 5. Router: │
│ ├─ [Route A] JMP to Microsoft │
│ └─ [Route B] VMRESUME + Fallback │
│ ├─ +0x1FE000 — (reserved) │
│ └─ +0x1FF000 — Log Buffer (4KB) │ Ring buffer for debug output
├─────────────────────────────────────┤
│ g_HeapPhysBase + 0x200000 │ ← 2MB Heap End
└─────────────────────────────────────┘
Virtual Memory (via PML4[500]):
┌─────────────────────────────────────┐
│ 0xFFFFFA0000000000 │ ← IDENTITY_BASE
│ ├─ +0x000000 — PE code (RX) │
│ ├─ +0x100000 — Heap (RW) │
│ ├─ +0xF2000 — Page Table (RW) │
│ ├─ +0x1E0000 — MasterWrapper (RX) │
│ ├─ +0x1FC000 — PTE 507 (dynamic) │ boot-time scan
│ ├─ +0x1FD000 — PTE 508 (dynamic) │ runtime reads
│ ├─ +0x1FE000 — PTE 509 (dynamic) │ runtime reads
│ └─ +0x1FF000 — Log Buffer │
└─────────────────────────────────────┘
┌─────────────────────────────────────┐
│HARDWARE VM-exit HANDLER ENTRY POINT │
├─────────────────────────────────────┤
│14-Byte Absolute JMP (Planted Hijack)│─────┐
├─────────────────────────────────────┤ │
│18 Stolen Intel Stack Alignment Bytes│ │
├─────────────────────────────────────┤ │
│14-Byte Absolute JMP (Native Return) │ │
├─────────────────────────────────────┤ │
│Native Microsoft Handler Continuation│ │
└─────────────────────────────────────┘ │
│
Jump immediately to MasterWrapper <┘
Successful Boot (Serial Output)
[HyperVenom] Ghost App Bootkit Initializing
[Phase 1] Ghost App Base Address Captured: 0x00000000F6AF4000
[Phase 1] TPM 2.0 TCG2 Protocol successfully blinded!
[Phase 1] bootmgfw.efi loaded! Scanning for ImgpLoadPEImage
[Phase 1] Found High-Level PE Wrapper! Planting static trampoline
[Phase 1.5] *** WINLOAD.EFI CAUGHT ***
[Phase 2] Pivoting to winload.efi natively
[Phase 3] hv_launch found! Planting 14-Byte Static Hook
[Phase 3] Hardware Stub mathematically resolved! Delta Cached: 0x000000000000226D
[Phase 4] Hooked SetVirtualAddressMap called! Finalizing Ring -1 pivot
[!] ========================================
[!] PHASE 4: HYPERVISOR LAUNCH INTERCEPTED
[!] ========================================
[Phase 4] Original hv_launch restored. Handing off to Hypervisor
[Phase 4] OriginalVmExitHandler resolved at: 0xFFFFF819705A743D
[Phase 4] Surgical Canonical Bridge built at Index 480 via LiveBlMmMap!
[PT Walk] Target resolved via 2MB Large Page!
[Phase 4] VMX-Root handler hijacked via Reordered Trampoline Cave!
[Phase 4] Winload Bridge burned! Guest OS will boot pristine.
[Phase 4] Handing off natively to Hypervisor
[-] Rootkit Heap safely cloaked!
[-] Commencing EFI Memory Map Ghosting
[-] EFI Allocation successfully unlinked & ghosted!
[*] Ring -1 Reality (True Memory) : 0x0000000000005A4D
[*] Guest Reality (Spoofed EPT) : 0x0000000100064DE9
[+] Successfully Removed EPT Hook & Freed Shadow Page.
Technical Deep Dive Part 2 - The Attachment
Once the bootkit finishes Phase 4 and calls the original hv_launch, the CPU enters VMX-root. From this point on, every VM-exit on the system flows through the assembly wrapper before reaching Microsoft’s handler.
This section is organized around operation groups: foundation (bootstrapping), runtime operations, and handoff (how control returns to Hyper-V safely).
Foundation: The VMX-Root Environment
The attachment runs as a standalone PE inside Hyper-V’s address space. There is no CRT, no kernel, and no UEFI runtime. Everything must be built from scratch.
Entry Point
Execution begins at entry_point, located in a custom .payload PE section. The bootkit resolved this section by name during Phase 4. The PE’s normal AddressOfEntryPoint is a dummy that returns EFI_UNSUPPORTED.
Heap Allocator
A singly-linked free-list heap is initialized over the upper 1MB of the 2MB physical allocation. Every block is exactly 4KB (one page), which aligns with the granularity of both guest page tables and the EPT. This matters later when the hooking engine needs to allocate shadow pages and split EPT entries at page boundaries. HeapAlloc() pops the head; HeapFree() pushes back. Both are protected by g_HeapLock (spinlock via _InterlockedCompareExchange).
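A minimal model of this allocator is shown below, with the free list threaded through the pages themselves so no external metadata is needed. Names are illustrative and the spinlock is omitted for brevity; the real allocator guards these operations with `g_HeapLock`.

```c
#include <stdint.h>
#include <stddef.h>

#define PAGE_SIZE 4096

/* Each free page's first 8 bytes hold the pointer to the next one. */
typedef struct FreePage { struct FreePage *next; } FreePage;

static FreePage *g_free_head = NULL;

/* Carve `size` bytes at `base` into page-sized blocks on the list. */
static void heap_init(void *base, size_t size) {
    uint8_t *p = (uint8_t *)base;
    for (size_t off = 0; off + PAGE_SIZE <= size; off += PAGE_SIZE) {
        FreePage *pg = (FreePage *)(p + off);
        pg->next = g_free_head;
        g_free_head = pg;
    }
}

/* Pop the head; NULL when exhausted. */
static void *heap_alloc(void) {
    FreePage *pg = g_free_head;
    if (pg) g_free_head = pg->next;
    return pg;
}

/* Push a page back onto the list (LIFO). */
static void heap_free(void *page) {
    FreePage *pg = (FreePage *)page;
    pg->next = g_free_head;
    g_free_head = pg;
}
```

The fixed 4KB block size makes allocation O(1) with zero fragmentation concerns, which is exactly what a hostile, CRT-free environment wants.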
SMP Tracking
InitializeSmpSupport uses CPUID leaf 1 to determine the maximum logical processor count and initializes placeholder state. APIC IDs for per-core MSR indexing are not cached at init but fetched live at each VM-exit via CPUID leaf 0xB (x2APIC ID), which indexes into the per-core g_Virtual_1D[1024] and g_Virtual_570[1024] arrays.
The Dynamic PTE Bridge
The attachment frequently needs to read or write physical addresses it doesn’t have mapped, such as EPT entries, guest page tables, MSR bitmaps, and arbitrary guest RAM. A dedicated set of reserved PTE slots in the page table acts as a bridge: each slot is a fixed virtual address whose backing physical page can be swapped at runtime by rewriting the PTE. This gives VMX-root safe, on-demand access to any physical address on the machine.
Without this bridge, reading an unmapped or MMIO physical address from VMX-root causes a Machine Check Exception.
Recall from Part 1 that the bootkit injected page tables at PML4 Index 500 (0xFFFFFA0000000000). Page Table entries 0-506 are static mappings of the heap. Entries 507-509 are the bridge slots. The mechanism:
1. Acquire `g_PteLock` (only one core can use the slot at a time).
2. Write a new PTE at Index 508 pointing to the target physical page, with flags `0x63` (Present + R/W + Accessed + Dirty).
3. `INVLPG` to flush the stale TLB entry for that virtual address.
4. Read through the virtual address: the CPU walks the page table and reaches the target physical page.
5. Zero the PTE, flush again, release the lock.
static UINT64 SafeReadSystemRam64(UINT64 PhysicalAddress) {
while (_InterlockedCompareExchange(&g_PteLock, 1, 0) != 0)
_mm_pause();
volatile UINT64 *MappedPt = (volatile UINT64 *)(IDENTITY_BASE + 0xF2000);
UINT64 dynamic_va = IDENTITY_BASE + (508 * 4096);
*(MappedPt + 508) = (PhysicalAddress & g_PhysMask) | 0x63;
__invlpg((void *)dynamic_va);
UINT64 value = *(UINT64 *)(dynamic_va + (PhysicalAddress & 0xFFF));
*(MappedPt + 508) = 0;
__invlpg((void *)dynamic_va);
_InterlockedExchange(&g_PteLock, 0);
return value;
}
Why `0x63`? If Present and R/W are set but Accessed and Dirty are not, the CPU will try to set them itself via an atomic read-modify-write on the PTE. In VMX-root, that hardware write can cause unexpected behavior (faults, hangs, etc.). Pre-setting A+D tells the CPU the bits are already there.

Why `g_PhysMask`? Different CPUs support different physical address widths (36-bit, 39-bit, 46-bit, 52-bit). `g_PhysMask` is built from CPUID leaf `0x80000008`: `((1ULL << (eax & 0xFF)) - 1) & ~0xFFFULL`. This ensures PTEs contain valid physical addresses regardless of the hardware.
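The mask construction from the CPUID-reported width can be sketched directly (function name is mine):

```c
#include <stdint.h>

/* Build the physical-address mask from the CPU-reported address width
 * (CPUID leaf 0x80000008, EAX bits [7:0]): keep only the bits that are
 * both valid physical-address bits and above the 12-bit page offset. */
static uint64_t build_phys_mask(uint32_t cpuid_80000008_eax) {
    uint32_t width = cpuid_80000008_eax & 0xFF;
    return ((1ULL << width) - 1) & ~0xFFFULL;
}
```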
Log Buffer
A 4KB ring buffer is initialized at heap offset +0x1FF000 for debug output. Usermode can query the physical address of this buffer via Op 5.
Finally, the address of vmexit_handler_detour is written into the VmExitDetourOut pointer, embedding the dispatch logic into the hardware VM-exit path.
The Dispatcher: Intercepting VM-exits
Every VM-exit on the system hits the Master Wrapper first. The wrapper saves all GPRs, XMM0-XMM5, and MXCSR, then calls the C function vmexit_handler_detour(context).
┌─────────────────────────────┐
│ Hardware VM-exit │
│ (CPU traps to VMX-root) │
└────────────┬────────────────┘
│
▼
┌─────────────────────────────┐
│ Master Wrapper │
│ Save GPRs + XMM0-5 + MXCSR│
│ Align stack, CLD │
│ CALL vmexit_handler_detour │
└────────────┬────────────────┘
│
┌─────┴──────┐
│ Return val │
└─────┬──────┘
│
┌────────┴────────┐
│ │
▼ ▼
Route A Route B
(non-zero) (zero)
│ │
Restore regs Restore regs
│ │
JMP to VMRESUME
Microsoft's (return to
handler guest now)
│ │
Microsoft ┌──────────┐
processes │ Fallback │
the exit │ JMP if │
normally │ VMRESUME │
│ fails │
└──────────┘
The detour reads the exit reason from the VMCS (__vmx_vmread(0x4402, &reason)) and dispatches:
| Exit Reason | Trigger | Handled by |
|---|---|---|
| 10 (CPUID) | Any CPUID instruction | Op Group A |
| 18 (VMCALL) | VMCALL instruction | Op Group A (affinity spoof) |
| 31 (MSR Read) | RDMSR on DEBUGCTL or RTIT_CTL | Op Group C |
| 32 (MSR Write) | WRMSR on DEBUGCTL, RTIT_CTL, or LSTAR | Op Group C |
| Everything else | — | Falls through to Microsoft (Route A) |
The detour returns either a pointer to g_original_vmexit_handler (Route A — let Microsoft handle it) or zero (Route B — premature VMRESUME, Microsoft never sees the exit).
Operation Group A: The Stealth Interface (CPUID & VMCALL)
This is the primary communication channel between usermode and Ring -1. The CPUID instruction is used because it is the only unprivileged instruction that unconditionally causes a VM-exit on VBS systems. Any Ring 3 process can execute it, it requires no driver or syscall, and the hypervisor always intercepts it. The path is: CPUID instruction → hardware VM-exit → the handler → VMRESUME → usermode reads the result from the guest register context.
The Knock: Secure Authentication & CR3 Locking
The attachment stays dormant until it receives a valid knock. On EXIT_REASON_CPUID, a check is performed to determine whether the leaf matches the backdoor value and whether RDX contains the correct magic. If not, the exit falls through to Microsoft and is handled as a normal CPUID path.
When a valid knock arrives (Op 0):
- CPL check. The Guest CS selector is read from the VMCS. The low 2 bits are the Current Privilege Level. Only Ring 3 (CPL == 3) can authenticate. This limits the interface to user-mode callers and reduces straightforward probing from kernel-mode contexts.
- First-caller-wins CR3 lock. The guest’s CR3 (page table base) is captured and stored as `g_AuthenticatedCr3`. Every subsequent operation compares the caller’s CR3 against this value; a mismatched CR3 is silently rejected. Because every process has a unique CR3, only the process that authenticated can issue commands.
- KVA Shadow handling. Windows uses KPTI, so each process has two CR3 values (user/kernel) that differ only in the low bits. Both sides are masked with `g_PhysMask` before comparing, so both CR3s match.
- Privilege violation lockout. A wrong magic value is silently dropped and the exit falls through to Microsoft with no penalty. Strikes fire on privilege violations only: a kernel-mode caller (CPL != 3) hitting Op 0, or bypassing the CR3 check on an authenticated op. Three such violations trigger a permanent lockout for the boot session. This protects against ring-escalation attacks, not brute-forcing of the magic value.
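The masked CR3 comparison can be modeled as follows (mask value in the example is illustrative):

```c
#include <stdint.h>
#include <stdbool.h>

/* Compare a caller's CR3 against the authenticated one. Under KVA
 * Shadow (KPTI), a process's user and kernel CR3 differ only in low
 * control bits, so both sides are reduced to page-frame bits first.
 * `phys_mask` is the CPUID-derived g_PhysMask. */
static bool cr3_matches(uint64_t caller_cr3, uint64_t auth_cr3,
                        uint64_t phys_mask) {
    return (caller_cr3 & phys_mask) == (auth_cr3 & phys_mask);
}
```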
After authentication, Op 0 triggers the one-time stealth sequence (covered below in “Stealth Cloaking”).
Physical Memory Operations
Once authenticated, the DLL can issue operations via the sub-operation field in RCX:
| Op | Name | Description |
|---|---|---|
| 0 | Init | Authenticate, lock CR3, trigger stealth |
| 1 | TranslateVirtual | Walk guest page tables (CR3 → PML4 → … → PA) |
| 2 | ReadPhysical | Map a PA via the PTE bridge, return 16 bytes in R10+R9 |
| 3 | WritePhysical | Write up to 16 bytes to a PA (size in R8[63:56], data in R9+R10) |
| 5 | GetLogBuffer | Return the log buffer physical address |
| 6 | DeployHook | Full passive VTable EPT hook (see Op Group B) |
| 7 | RemoveHook | Restore original EPT PTE, merge pages (see Op Group B) |
| 9 | RegisterAffinity | Store a VA for VMCALL-based display affinity spoofing |
| 10 | GetKernelBase | Return the LSTAR-resolved ntoskrnl.exe base |
Ops 4 and 8 are currently unused.
Since x64 MSVC does not support inline assembly, the usermode DLL tunnels parameters through an external MASM stub (HyperVenomHypercall) that loads R8, R9, and R10 before firing the CPUID instruction. The stub returns the modified register values as the result.
Page Boundary Guards
When returning data to usermode (e.g., the 16-byte HyperVenom_HOOK_OUT struct from Op 6), data is not written blindly to user virtual pointers. A structure that straddles a 4KB page boundary could land on two different physical pages — the PTE bridge maps one page at a time. The user’s VA is translated to a PA, the page offset is checked (offset <= 0xFF0 for 16 bytes, <= 0xFF8 for 8 bytes), and writing only occurs if the structure fits within the current page.
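The boundary check reduces to a small predicate, sketched here with an illustrative name:

```c
#include <stdint.h>
#include <stdbool.h>

/* Returns true if a write of `size` bytes starting at the guest VA
 * stays within one 4KB page, i.e. the PTE-bridge mapping of a single
 * physical page covers the whole structure. */
static bool fits_in_page(uint64_t guest_va, uint64_t size) {
    uint64_t offset = guest_va & 0xFFF;   /* offset within the page */
    return offset + size <= 0x1000;
}
```

For a 16-byte structure this accepts offsets up to 0xFF0, and for 8 bytes up to 0xFF8, matching the limits described above.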
The VMCALL Spoof: Display Affinity Hijacking
Some anti-cheats don’t just rely on kernel telemetry. They also check user-visible OS state. One common check is GetWindowDisplayAffinity, a Win32 API that detects whether a window has screen capture protection enabled. If the user is running an overlay or screen capture tool, the anti-cheat sees it. This check is intercepted at the hardware level via EXIT_REASON_VMCALL (18), making the function report that no capture protection is active.
The setup (done in usermode):
- Allocate a “Touch-and-Call” trampoline. The first instruction is `mov eax, dword ptr [rdx]`, which forces the OS memory manager to page the data in, preventing a crash if the page was swapped to disk (demand paging).
- Overwrite the start of `GetWindowDisplayAffinity` with an absolute jump to the trampoline.
- Register the trampoline’s VMCALL address with the hypervisor via CPUID Op 9.

When the anti-cheat calls the API:

- Execution hits the trampoline → touches memory → fires `VMCALL` → VM-exit. `Guest RIP == registered address` is verified.
- The anti-cheat’s output pointer (`RDX`) is translated to a physical address, mapped via the PTE bridge, and written with `0` (`WDA_NONE`) → “no capture protection detected.”
- A `RET` is simulated from VMX-root: the return address is read from the top of the guest stack, Guest RIP is set to that address, and Guest RSP is advanced by 8.
- Premature `VMRESUME`. The anti-cheat resumes execution with a spoofed result and never observes a VM-exit in its own call path.
Operation Group B: The EPT Hooking Engine
This is the primary hooking capability. A VTable is a table of function pointers that game engines use for polymorphic dispatch. When the engine calls a virtual method, it reads a pointer from the VTable to find the actual function address. By swapping one of these pointers, execution can be redirected to custom code.
The challenge is doing this invisibly. Modifying the VTable in guest memory would be caught by CRC integrity checks. Instead, the engine performs data pointer swaps via EPT page shadowing. A copy of the page containing the VTable is created, the pointer on the copy is modified, and the EPT is instructed to serve the copy instead of the original. The guest’s memory is never touched. The engine avoids EPT violations and MTF traps, both of which are commonly monitored.
Background: Extended Page Tables
On a VBS system, there are three layers of address translation:
Guest Virtual Address → [Guest Page Tables / CR3] → Guest Physical Address
│
▼
[EPT / SLAT]
│
▼
Host Physical Address
(actual RAM chip)
The EPT is a 4-level page table structure managed by the hypervisor. Microsoft maps most physical memory as 2MB Large Pages for performance. Each EPT entry has its own permission bits (Read/Write/Execute) and memory type.
Surgical Fracturing: Splitting 1GB/2MB Pages into 4KB
A 2MB EPT entry maps an entire 2MB block as one unit. Any permission change affects all 512 constituent 4KB pages. To modify a single 4KB page, the large page must be split into 512 individual entries.
BEFORE (one 2MB entry): AFTER (512 × 4KB entries):
┌─────────────────────────┐ ┌─────────────────────────┐
│ PDE: Phys 0x1000000 │ │ PDE → new Page Table │
│ LargePage=1, RWX │ │ LargePage=0 │
│ Maps 0x1000000-0x11FFFFF│ ├─────────────────────────┤
│ as one block │ │ PT[0] = 0x1000000 RWX │
└─────────────────────────┘ │ PT[1] = 0x1001000 RWX │
│ PT[2] = 0x1002000 RWX │
│ ... │
│ PT[511]= 0x11FF000 RWX │
└─────────────────────────┘
SafePageSplit2MB walks the EPT via the PTE bridge to find the PDE for the target address. If it’s a 2MB Large Page (Bit 7 set):

- Allocate a 4KB page from the heap to serve as the new Page Table.
- Populate its 512 entries, each mapping one 4KB slice of the original 2MB range.
- Inherit the original page’s caching and permission attributes via `(pde & 0xFFF0000000000F7FULL)`. This copies R/W/X, Memory Type, and Ignore PAT while stripping the Large Page flag (Bit 7).
- Swap the PDE to point to the new Page Table.
- `INVEPT` to flush the EPT TLB. This is the EPT equivalent of `INVLPG`: it tells the CPU to discard cached EPT translations so the hardware picks up the new page table structure.
From the guest’s perspective, nothing changed. The mapping is identical, but now individual page entries can be modified.
SafePageSplit1GB does the same for 1GB Huge Pages → 512 × 2MB entries. If the target is inside a 1GB page, a 1GB → 2MB split is performed first, then 2MB → 4KB.
Shadowing: Cloning Pages to Hide VTable Hooks from VBS
Instead of modifying the guest’s page directly (which anti-cheats guard with CRC checks), the entire physical page is swapped in the EPT. The guest reads from what it thinks is the same address, but the EPT redirects the read to a shadow page containing the modified pointer.
The EPT hook lifecycle (Op 6):
1. Translate target VA → PA
Walk guest page tables via SafeReadSystemRam64
│
▼
2. Allocate shadow page (4KB from heap)
Clone original page byte-for-byte
│
▼
3. Modify shadow: overwrite target pointer
with detour function address
(original page untouched)
│
▼
4. Split EPT to 4KB granularity
SafePageSplit1GB → SafePageSplit2MB
│
▼
5. Swap EPT PTE: point at shadow page
Inherit original attribute bits via
(original_pte & 0xFFF0000000000FFFULL)
│
▼
6. INVEPT — flush EPT TLB
Hardware picks up new mapping
Result:
BEFORE: Guest reads VTable → EPT serves original page → original pointer
AFTER: Guest reads VTable → EPT serves SHADOW page → detour pointer
Original page: untouched in RAM. CRC checks pass.
Guest PT: untouched. VA → GPA mapping unchanged.
Shadow page: full R/W/X permissions. No EPT violations fire.
The hook executes on the direct EPT translation path, with zero additional VM-exits during the call chain. This is another example of symbiosis: the hooks don’t fight the hypervisor, they ride the existing EPT hardware. No traps fire, no handler re-entry, no performance delta for telemetry to detect.
Slot Recycling: When allocating hook tracking structures, the g_EptHooks array is scanned for inactive slots (IsActive == FALSE) and they are reused instead of burning new entries.
Hook Removal & Page Merging
On Op 7:

- Find the hook entry in `g_EptHooks[]` by target physical address.
- Restore the original EPT PTE (saved during deploy).
- `INVEPT`.
- `PageMerge2MB`: read all 512 PTEs and verify that every entry’s PFN equals `first_pfn + i*0x1000` (strictly contiguous physical pages). If a shadow hook left a non-contiguous PFN behind, the merge aborts. If all entries are clean, the 2MB Large Page bit is restored on the PDE and the orphaned 4KB Page Table is abandoned (deferred free).

Merging matters: leaving split regions around increases TLB pressure, which degrades performance and is detectable.

Why abandon instead of free? Other cores may hold stale TLB entries pointing to the old Page Table. If that memory were freed and reused, a remote core could read garbage and triple-fault. The old PT is instead pushed to `g_AbandonedPt[]`, so it stays valid even if a remote core holds a stale entry; this is effectively a deferred free.
Operation Group C: Hardware Telemetry Evasion (MSRs)
Anti-forensics tools and anti-cheats use hardware tracing (LBR, BTS, Intel PT) to reconstruct execution flow. These are enabled via Model-Specific Registers. The relevant MSR reads and writes are intercepted to blind them.
Blinding Intel PT/LBR: Sabotaging DEBUGCTL and RTIT_CTL
On MSR Write (EXIT_REASON_MSR_WRITE, 32):
When the guest writes to IA32_DEBUGCTL (0x1D9) or IA32_RTIT_CTL (0x570):
- Extract the hardware APIC ID via CPUID leaf 1 to index into a per-core state array.
- Cache the guest’s intended 64-bit value in the per-core array (this is the “virtual” value).
- Strip the tracking enable bits from the value in `RAX`:
  - DEBUGCTL: clear Bit 0 (LBR) and Bit 7 (BTS).
  - RTIT_CTL: clear Bit 0 (TraceEn).
- Fall through to Microsoft’s handler. Microsoft performs the actual hardware write, believing the sabotaged value is correct; the write is not blocked and the exit is not consumed.
On MSR Read (EXIT_REASON_MSR_READ, 31):
When the guest reads either register:
- Return the cached virtual value (what the guest originally wrote, before sabotage) via `RAX`/`RDX`.
- Advance Guest RIP past the `RDMSR` instruction.
- Premature `VMRESUME`.
The guest believes tracing is enabled (reads return the intended value). The hardware has the enable bits stripped and no tracing occurs.
Guest writes DEBUGCTL = 0x81 (LBR=1, BTS=1)
│
▼ VM-exit
Cache 0x81 in g_Virtual_1D[core_id]
Strip → RAX = 0x00
Fall through → Microsoft writes 0x00 to hardware
Guest reads DEBUGCTL
│
▼ VM-exit
Return 0x81 from cache → VMRESUME
Guest sees 0x81. Hardware holds 0x00.
Autonomous Kernel Resolution (LSTAR)
IA32_LSTAR (0xC0000082) is the MSR that holds the address of KiSystemCall64, the kernel’s syscall entry point. Every time usermode executes a syscall instruction, the CPU jumps to the address stored in LSTAR. When Windows initializes, it writes this MSR, and the first write is intercepted:
- Page-align the syscall handler address.
- Scan backward through physical memory, page by page, reading via `SafeReadSystemRam64`, until the `MZ` DOS header is found.
- Cache the result as `g_KernelBase`.
Once resolved, the MSR is unhooked by clearing the LSTAR intercept bit in the VMCS MSR Bitmap (mapped via the PTE bridge). Future LSTAR writes no longer cause VM-exits.
Stealth Cloaking (Triggered by Op 0)
When the first valid knock arrives, the attachment performs a one-time stealth sequence:
1. Physical Memory Camouflage
A physical memory scanner dumping every page would notice 2MB of “weird” data where the heap is. Physical page 0x1000 (BIOS/IVT data: legitimate, hardware-reserved content) is cloned into a dummy page. The EPT is then walked (splitting to 4KB as needed), and every PTE that maps a heap page is swapped to point at the dummy. Any guest-side scan of those physical addresses now sees authentic BIOS data.
The batch strategy: instead of acquiring the spinlock 512 times (once per page), the number of pages that fit in the current Page Table is computed (PagesToProcess = 512 - StartIndex), the lock is acquired once, batch-overwritten, released, and stepped to the next PT boundary.
2. EFI Memory Map Ghosting
The UEFI memory map persists in physical RAM after ExitBootServices. Windows tools like MmGetPhysicalMemoryRanges() can read it. Physical memory is scanned from 1MB to 3GB (skipping the PCI MMIO hole at 3GB-4GB) looking for the EFI descriptor whose PhysicalStart matches the heap base. Once found, all five fields are zeroed (Type, PhysicalStart, VirtualStart, NumberOfPages, Attribute). An sfence drains the store buffer before the mapping PTE is torn down. The allocation no longer exists as far as Windows is concerned.
3. hvix64 Hook Page Shadow
The bootkit planted a 14-byte JMP at the VM-exit handler inside hvix64.exe. If the guest EPT exposes that page, a memory scanner would see the hook bytes. The Host CR3 is read from the VMCS, the VM-exit handler’s VA is translated to a host-physical address, and it is checked whether the page is visible in the guest EPT. If SLAT already hides it (the expected case), no action is needed. If it’s exposed, a shadow page is allocated, cloned with original bytes restored, and the EPT PTE is swapped. The guest sees clean Microsoft code while the CPU continues executing the real hooked page in VMX-root.
The Handoff: The Triple-Fault Preventer
The Master Wrapper must safely return control to Microsoft for any exit it doesn’t handle (or partially handles). This is the backbone of the symbiotic design: the vast majority of VM-exits pass straight through to Microsoft. The system runs exactly as it would without the payload, and the custom code only activates for the handful of exits it cares about.
Route A — Forward to Microsoft
The detour returns g_original_vmexit_handler. The wrapper restores all registers and jumps to Microsoft’s handler. For authenticated CPUID exits that the custom logic handled, RAX and RCX are first zeroed (simulating a standard CPUID leaf 0 request). Microsoft then processes a harmless dummy exit instead of the original request semantics.
Route B — Premature VMRESUME
The detour returns zero. The wrapper restores registers and executes VMRESUME, returning directly to the guest. Microsoft’s handler never runs.
The Fallback
If VMRESUME fails at the instruction level (e.g., a non-current or corrupted VMCS), the CPU sets a failure flag in RFLAGS (CF or ZF, depending on the failure mode) and falls through to the next instruction. Immediately following VMRESUME in the wrapper is a 14-byte absolute JMP to Microsoft’s native handler. Instead of triple-faulting, the system falls back gracefully.
Route B in the Master Wrapper:
VMRESUME ← Try to return to guest
JMP [rip] ← CPU falls here if VMRESUME fails
<handler addr> ← Microsoft takes over, system stays stable
Technical Deep Dive Part 3 - The Usermode DLL
The usermode DLL (HyperVenom.dll) is the messenger. It issues CPUID instructions to send structured commands into Ring -1 and reads the result from the guest register context.
The MASM Tunnel
The x64 MSVC compiler does not support inline assembly. To load specific values into R8, R9, and R10 before executing CPUID, the DLL calls an external MASM stub:
; HyperVenomHypercall(leaf, op, arg1, arg2, arg3)
HyperVenomHypercall PROC
mov eax, ecx ; leaf
mov ecx, edx ; op (sub-leaf)
mov r8, r8 ; arg1 (no-op: already in R8 per x64 ABI)
mov r9, r9 ; arg2 (no-op: already in R9 per x64 ABI)
mov r10, [rsp+28h] ; arg3 (5th param from stack)
mov edx, MAGIC ; silent knock value
cpuid ; → VM-exit
; On return: EAX, EBX, ECX, EDX contain hypervisor response
; R10, R11 contain extended status/data
ret
HyperVenomHypercall ENDP
The stub returns the modified register values. The DLL reads R11 as the status code and R10/R9 as data.
The Gatekeeper Handshake
Before any operations, the DLL must authenticate:
int HyperVenomInit(void) {
CPUID_RESULT r = HyperVenomHypercall(LEAF, /*op=*/0, 0, 0, 0);
if (r.r11 != SUCCESS) return -1; // Locked out or wrong magic
// From this point, the CR3 is bound.
// Only this process can issue commands.
return 0;
}
The hypervisor checks CPL == 3, validates the magic, and locks the caller’s CR3. Lockout is triggered by repeated privilege violations (as described in Part 2), not by wrong-magic CPUID probes alone.
Operation Reference
| Op | DLL Function | What it does | Return |
|---|---|---|---|
| 0 | HyperVenomInit | Authenticate, bind CR3, trigger stealth | R11 = status |
| 1 | TranslateVirtual | Walk guest PTs for a given VA+CR3 | R10 = PA |
| 2 | ReadPhysical | Read 16 bytes at a PA | R10+R9 = data |
| 3 | WritePhysical | Write up to 16 bytes to a PA | R11 = status |
| 5 | GetLogBuffer | Get log buffer PA | R10 = PA |
| 6 | DeployHook | EPT shadow hook on a VTable pointer | R10 = original ptr, R11 = status |
| 7 | RemoveHook | Restore EPT, merge pages | R11 = status |
| 9 | RegisterAffinity | Register VMCALL gadget for display spoof | R11 = status |
| 10 | GetKernelBase | Return ntoskrnl base (LSTAR-resolved) | R10 = base |
End-to-End Flow: Deploying an EPT Hook
Here is the complete sequence when the test DLL deploys a VTable hook:
test_dll.c (Ring 3)
│
├─ 1. LoadLibrary("HyperVenom.dll")
│
├─ 2. HyperVenomInit()
│ └─ CPUID(LEAF, op=0, magic=MAGIC)
│ CPU traps → VM-exit → MasterWrapper → vmexit_handler_detour
│ Op 0: verify magic, CPL==3, save CR3, trigger stealth
│ Return 0 → VMRESUME → DLL reads R11=SUCCESS
│
├─ 3. VirtualAlloc() two pages: VTable page + Code page
│ └─ Write function bytes, set PAGE_EXECUTE_READ
│ └─ *(UINT64*)pVTable = pRealFunction
│
├─ 4. DeployHook(VTable_VA, Detour_VA, &OriginalPtr)
│ └─ CPUID(LEAF, op=6, target_va, detour_va)
│ VM-exit → vmexit_handler_detour
│ Op 6:
│ a. TranslateGuestVirtual(cr3, target_va) → target_phys
│ b. HeapAlloc(4096) → shadow page
│ c. Clone original page to shadow (via PTE 508)
│ d. Overwrite target pointer on shadow with detour_va
│ e. SafePageSplit1GB + SafePageSplit2MB (if needed)
│ f. Swap EPT PTE: original_phys → shadow_phys
│ g. INVEPT → flush EPT TLB
│ R10 = original function pointer, R11 = SUCCESS
│ VMRESUME → DLL reads OriginalPtr from R10
│
├─ 5. GameEngineCall = *(volatile UINT64*)pVTable
│ └─ CPU reads from VTable VA
│ Guest PT: VA → GPA (unchanged)
│ EPT: GPA → shadow HPA (swapped)
│ CPU serves shadow page → reads detour address
│ └─ CPU calls MyDetour(100)
│ MyDetour calls OriginalGameFunction(100) → returns 1437
│ MyDetour adds 9000 → returns 10437
│ (ZERO VM-exits during this entire call)
│
├─ 6. RemoveHook(VTable_VA)
│ └─ CPUID(LEAF, op=7, target_va)
│ VM-exit → vmexit_handler_detour
│ Op 7:
│ a. Find hook in g_EptHooks[] by target phys
│ b. Restore original EPT PTE
│ c. INVEPT
│ d. PageMerge2MB (restore large page if clean)
│ e. Abandon old PT (deferred free)
│ VMRESUME → DLL reads R11 = SUCCESS
│
└─ 7. GameEngineCall reads original pointer → returns 1437
Why the hook is zero-VM-exit: The shadow page has full R/W/X permissions in the EPT. When the guest reads the VTable pointer or executes the detour, no EPT violation fires. The EPT hardware translates GPA -> shadow HPA directly, so the hypervisor is not re-entered during that execution path.
The Stress Test
The test DLL validates correctness under concurrency:
16 threads:
├─ 8 × PhysicalReadSpammer: 50,000 CPUID calls each → 400,000 VM-exits
│ Tests: g_PteLock contention, PTE remapping, TLB coherency
│
└─ 8 × HookExecutionSpammer: 10,000,000 VTable reads each → 80M EPT translations
Tests: EPT consistency, shadow page stability, zero-VM-exit path
Pass criteria:
- Every `SafeReadSystemRam64(0x1000)` call returns the same value (no read corruption).
- Every hooked `GameEngineCall(100)` call returns exactly `10437` (no EPT inconsistency).
This validates spinlock correctness, TLB flush coverage across all cores, EPT shadow coherency, and heap allocator thread safety.
Conclusion
The framework has been running stably on a bare-metal machine for several weeks, exhibiting no noticeable change in system performance or stability. Stress tests indicate minimal latency in the hypercalls, with both CPUID and VMCALL operations taking approximately 3,000 cycles on average. The usermode application successfully introspects memory, reads the BIOS, and verifies that the hypervisor footprint remains hidden from the guest OS.
========================================
HyperVenom DLL Test
========================================
[1/3] Checking for Hyper-V...
-> PASS: Hyper-V is present!
-> Hyper-V Vendor: Microsoft Hv
[2/3] Loading HyperVenom.dll...
-> PASS: DLL loaded (handle: 00007FFE3DD10000)
-> PASS: Function addresses obtained
-> PASS: HyperVenomGetLogBuffer available
[3/3] Initializing HyperVenom...
-> Init returned: 0
-> PASS: HyperVenom initialized!
-> Hyper-V Present: YES
-> Partition ID: 0x0
-> VP Index: 0
-> Version: 0x1
[Log Buffer] (4096 bytes):
[-] Commencing Hardware-Safe EFI Memory Map Ghosting
[-] EFI Allocation completely wiped from physical map safely!
[Ring -1] Detour installed, returning to Bootkit
[+] Ring -1 Autonomy: ntoskrnl.exe safely resolved at: 0xFFFFF80284B40000
[+] IA32_LSTAR dynamically unhooked!
[-] hvix64 hook NOT in guest EPT SLAT hides JMP natively.
[-] Rootkit Heap safely cloaked behind the Black Hole!
[+] Successfully Removed EPT Hook (Shadow Page Abandoned).
[RDTSC] Commencing CPUID Execution Profiling
-> Native CPUID execution time: 2932 cycles
-> HyperVenom Backdoor execution time: 4917 cycles
[4/4] Zero-VM-exit Data-Pointer Hook Test
-> Simulated Game VTable Allocated at: 00000260FCF60000
-> Original Function Address: 0x260fcf70000
-> Ring -1 deployed Passive EPT Hook
-> Hypervisor returned Original Pointer: 0x260fcf70000
-> [GUEST] Executing VTable Pointer Native Call
-> [GUEST] Call Returned: 10437 (Expected 10437)
-> Spawning 8 threads to hammer g_PteLock Bridge
-> Spawning 8 threads to saturate hardware TLB with VTable hooks
-> [PASS] Concurrency: 0 Data Races. g_PteLock & TLBs stable.
-> Profiling Ring -1 VM-exit MSR Latency (100,000 iterations)
-> [PASS] Average VM-exit Overhead: 3120 clock cycles.
-> Hook successfully removed & Shadow Page Freed.
-> [GUEST] Call After Removal Returned: 1437 (Expected 1437)
[5/5] Executing Genuine Programmatic Stealth Audit
-> [PASS] HVCI / VAD Audit: OS natively sees PAGE_READWRITE.
-> [PASS] Absolute Hardware Access Confirmed. Hex Dump:
[0x1000] E9 4D 06 00 47 00 00 00 A0 86 01 00 9D 05 00 00
-> [PASS] Hook Integrity: Extracted function pointer is natively executable in Guest.
-> [PASS] Bootkit Audit: OS runtime stability verified.
-> [PASS] Hardware Tracing Audit: MSR Virtualization active and context-safe.
To validate the framework’s efficacy against Ring 0 telemetry agents, proof-of-concept memory introspection tools were developed and evaluated within the environments of industry-leading kernel anti-cheats (specifically, Vanguard and Easy Anti-Cheat). The framework caused no system instability and easily handled 5,000,000 hypervisor calls a second for extended periods without detection, although unoptimized high hypercall frequencies produced minor CPU overhead (~5% on an Intel i7-8700).
Demonstrations are shown below. (Recorded externally, as the hypervisor intercepts display affinity checks to prevent local capture).
Limitations
Secure Boot. This project requires Secure Boot to be disabled or a custom Platform Key (PK) enrolled. On systems with enforced Secure Boot policies (enterprise environments, Windows 11 default configurations), the unsigned bootkit cannot load without first managing the UEFI key database. A production implementation would require either a signed shim loader or physical access to modify the Secure Boot configuration.
AMD. The entire project targets Intel VT-x and specifically hvix64.exe, the Intel variant of Microsoft’s hypervisor. AMD systems use a different binary (hvax64.exe), a different virtualization instruction set (AMD-V / SVM), and different VM-exit handler structures. The pattern scans, VMCS field reads, and VMRESUME/INVEPT instructions would all need to be adapted for AMD-V equivalents (VMRUN, #VMEXIT, VMCB).
Related Work
- Voyager & Sputnik: backengineering/Voyager & cutecatsandvirtualmachines/Sputnik
- Illusion-rs: memN0ps/illusion-rs
- Matrix-rs: memN0ps/matrix-rs
- Ring-1.io: ring-1.io
- hyper-reV: noahware/hyper-reV