A recent journal publication by members of our team examined this issue for userland API hooks. In this blog post we will show an example of what AV kernel hooks often look like. The sample investigated in this post was sent by a friend who wanted clarification of the behavior, as it was part of a real investigation. In particular, Volatility's ssdt plugin was showing many system call table entries as being hooked, and our friend wanted to know if the hooks were malicious or benign.
After reading the disassembly, several things stand out. First, there is a real function prologue with the instructions of "PUSH EBP; MOV EBP, ESP;" [1] and, shortly after, a proper function epilogue [5]. Examining the code in between, a hardcoded address is stored in ECX [2]. That address is dereferenced and stored in EAX [3]. A dereference of EAX + 4 is then used as the target address of a CALL instruction [4].
Looking at the instructions here, there is the beginning of a long function which appears to be the real hook implementation. To determine which, if any, module is hosting the real payload, the drivermodule plugin can be invoked with the address of the function:
Running this loop over the sample showed that all system call hooks transferred control flow to 0x964ba922, which means that all hooks belonged to the same Symantec driver.
If you like this type of analysis and want to challenge yourself to write plugins that automate memory forensic techniques, then consider a submission to our 2020 Volatility Plugin Contest. This year's contest is based on Volatility 3, and you can learn about the new, exciting features of Volatility 3 in our recently recorded presentation.
System Calls Background
System calls are the common mechanism by which unprivileged code in userland (process) memory requests services through the kernel to access privileged resources, such as the hard drive to create/read/write/delete files, the network stack to send/receive packets, the monitor to display information to the user, and numerous other resources. System calls are also used to enumerate the active system state, such as the list of running processes, active network connections, loaded kernel modules, and nearly everything else that live forensic tools and endpoint agents gather for analysis. This obviously makes the system call table an attractive target for malware to tamper with, as well as a high-value location for security agents to monitor.
To check for signs of rootkits, Volatility's ssdt plugin locates each system call table present in a memory sample and then enumerates each system call table entry. For each entry, it prints either the containing module, if found, or UNKNOWN if the entry points to a memory location not associated with the kernel itself or a module contained within the kernel module list. Several common kernel rootkit methods of hiding code will trigger the classification as UNKNOWN, such as unlinking the malicious module from the module list, or allocating a RWX memory region and then using it to host code after the malicious module is unloaded (see this blog post on analyzing a detached kernel thread).
Unfortunately, the use of system call hooking by AV and EDR engines frequently triggers UNKNOWN entries in ssdt output, as security agents' attempts to "hide" on a system often leverage the same tactics used by malware.
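The owner lookup described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the ssdt plugin's actual code: the Module type, the find_owner helper, and the module bases and sizes are all hypothetical.

```python
from typing import List, NamedTuple


class Module(NamedTuple):
    """One entry of a (hypothetical) kernel module list."""
    name: str
    base: int
    size: int


def find_owner(modules: List[Module], address: int) -> str:
    """Return the module whose [base, base + size) range holds address."""
    for mod in modules:
        if mod.base <= address < mod.base + mod.size:
            return mod.name
    # No listed module covers the address, e.g. an unlinked module
    # or a standalone executable allocation.
    return "UNKNOWN"


# Made-up bases and sizes, for illustration only.
modules = [
    Module("ntoskrnl.exe", 0x82800000, 0x410000),
    Module("win32k.sys", 0x96000000, 0x250000),
]

print(find_owner(modules, 0x82A01234))  # falls inside the first range
print(find_owner(modules, 0x886EA580))  # covered by no range in this list
```

Any address that falls outside every listed range is reported as UNKNOWN, which is why both hidden malware and self-protecting security agents end up with the same label.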
Initial Analysis
To start the analysis, the ssdt plugin was run against the sample:
$ python vol.py -f sample.raw --profile=Win7SP1x86 ssdt > ssdt-output.txt
Volatility Foundation Volatility Framework 2.6
$ grep -c UNKNOWN ssdt-output.txt
45
$ grep UNKNOWN ssdt-output.txt | head -20
Entry 0x000d: 0x886ea580 (NtAlertResumeThread) owned by UNKNOWN
Entry 0x000e: 0x881f70b0 (NtAlertThread) owned by UNKNOWN
Entry 0x0013: 0x886e4d88 (NtAllocateVirtualMemory) owned by UNKNOWN
Entry 0x0016: 0x88217ca0 (NtAlpcConnectPort) owned by UNKNOWN
Entry 0x002b: 0x886e9e80 (NtAssignProcessToJobObject) owned by UNKNOWN
Entry 0x004a: 0x886ea390 (NtCreateMutant) owned by UNKNOWN
Entry 0x0056: 0x886e9c48 (NtCreateSymbolicLinkObject) owned by UNKNOWN
Entry 0x0057: 0x88438858 (NtCreateThread) owned by UNKNOWN
Entry 0x0058: 0x886e9d00 (NtCreateThreadEx) owned by UNKNOWN
Entry 0x0060: 0x886e9f30 (NtDebugActiveProcess) owned by UNKNOWN
Entry 0x006f: 0x886e4910 (NtDuplicateObject) owned by UNKNOWN
Entry 0x0083: 0x886eaba8 (NtFreeVirtualMemory) owned by UNKNOWN
Entry 0x0091: 0x88436198 (NtImpersonateAnonymousToken) owned by UNKNOWN
Entry 0x0093: 0x886ea4d0 (NtImpersonateThread) owned by UNKNOWN
Entry 0x009b: 0x880d3478 (NtLoadDriver) owned by UNKNOWN
Entry 0x00a8: 0x886eaae0 (NtMapViewOfSection) owned by UNKNOWN
Entry 0x00b1: 0x886ea2e0 (NtOpenEvent) owned by UNKNOWN
Entry 0x00be: 0x886e4de0 (NtOpenProcess) owned by UNKNOWN
Entry 0x00bf: 0x886e4db8 (NtOpenProcessToken) owned by UNKNOWN
Entry 0x00c2: 0x886ea1a8 (NtOpenSection) owned by UNKNOWN
Looking at the output, it appears that 45 entries are hooked by UNKNOWN module(s). Furthermore, the hooked functions are commonly targeted by both malware and security agents.
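For readers who would rather triage the saved ssdt output in Python than with grep, a minimal sketch follows. The regular expression and the parse_ssdt helper are assumptions based only on the line format shown in the listing above; other Volatility versions or profiles may print the entries differently.

```python
import re

# Matches lines such as:
#   Entry 0x000d: 0x886ea580 (NtAlertResumeThread) owned by UNKNOWN
ENTRY_RE = re.compile(
    r"Entry (0x[0-9a-f]+): (0x[0-9a-f]+) \((\w+)\) owned by (\S+)"
)


def parse_ssdt(text):
    """Yield (index, address, name, owner) for each entry line."""
    for line in text.splitlines():
        match = ENTRY_RE.search(line)
        if match:
            index, address, name, owner = match.groups()
            yield int(index, 16), int(address, 16), name, owner


# Three lines copied from the output above; in practice the text
# would be read from ssdt-output.txt.
sample = """\
Entry 0x000d: 0x886ea580 (NtAlertResumeThread) owned by UNKNOWN
Entry 0x000e: 0x881f70b0 (NtAlertThread) owned by UNKNOWN
Entry 0x0013: 0x886e4d88 (NtAllocateVirtualMemory) owned by UNKNOWN
"""

unknown = [entry for entry in parse_ssdt(sample) if entry[3] == "UNKNOWN"]
print(len(unknown), "UNKNOWN entries")
for index, address, name, _owner in unknown:
    print("%#06x %#010x %s" % (index, address, name))
```

Keeping the index and function name alongside each address makes it easier to report later which system calls were hooked, not just where the hooks live.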
Classifying One Hook
To begin the process of determining if the hooks are benign or malicious, volshell was used to examine the code of a few hooks. volshell is a Volatility plugin that allows you to interactively explore a memory sample, including viewing, searching, and disassembling arbitrary addresses in any context (the kernel or a particular process).
To start, volshell was loaded and the hook for Entry 0x000d: 0x886ea580 (NtAlertResumeThread) was disassembled (numbers, dashes, and arrows added for ease of explanation):
$ python vol.py -f sample.raw --profile=Win7SP1x86 volshell
<snip>
> dis(0x886ea580)
0x886ea580 55           PUSH EBP                 <----- [1]
0x886ea581 8bec         MOV EBP, ESP
0x886ea583 ff750c       PUSH DWORD [EBP+0xc]
0x886ea586 b9fca56e88   MOV ECX, 0x886ea5fc      <----- [2]
0x886ea58b ff7508       PUSH DWORD [EBP+0x8]
0x886ea58e 51           PUSH ECX
0x886ea58f 8b01         MOV EAX, [ECX]           <----- [3]
0x886ea591 ff5004       CALL DWORD [EAX+0x4]     <----- [4]
0x886ea594 83c40c       ADD ESP, 0xc
0x886ea597 5d           POP EBP
0x886ea598 c20800       RET 0x8                  <----- [5]
0x886ea59b cc           INT 3
0x886ea59c 0000         ADD [EAX], AL
0x886ea59e 0000         ADD [EAX], AL
<snip>
This code flow can be re-implemented in volshell through the use of its dd (display double word) and dis functions.
> dd(0x886ea5fc, 4)
886ea5fc 964c72b4
> dd(0x964c72b4 + 4, 4)
964c72b8 964ba922
> dis(0x964ba922)
0x964ba922 55           PUSH EBP
0x964ba923 8bec         MOV EBP, ESP
0x964ba925 51           PUSH ECX
0x964ba926 51           PUSH ECX
0x964ba927 53           PUSH EBX
0x964ba928 56           PUSH ESI
0x964ba929 57           PUSH EDI
0x964ba92a bf010000c0   MOV EDI, 0xc0000001
<snip>
The first invocation of dd reads the hardcoded address from [2] earlier. As can be seen, the value stored there is 0x964c72b4. The second dd invocation dereferences this value + 4, just as the disassembly at [4] shows. This gives a CALL target address of 0x964ba922, which is the address passed to the dis call above.
$ python vol.py -f sample.raw --profile=Win7SP1x86 drivermodule -a 0x964ba922
Volatility Foundation Volatility Framework 2.6
Module                Driver        Alt. Name     Service Key
--------------------- ------------- ------------- -----------
SYMEVENT.SYS          SymEvent      SymEvent      \Driver\SymEvent
This output shows that the function is inside of the SYMEVENT.SYS driver, which is part of the Symantec AV/Endpoint protection suite installed on the system.
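As a small worked example of how the hardcoded address at [2] is encoded in the hook stub, the following Python sketch decodes it from the raw instruction bytes shown in the dis(0x886ea580) listing. The variable names are only illustrative. The one fact relied on beyond the listing is that 0xb9 is the x86 opcode byte for MOV ECX with a 32-bit immediate, which is stored in little-endian order.

```python
import struct

# First four instructions of the stub, copied from the disassembly.
stub = bytes.fromhex(
    "55"          # PUSH EBP
    "8bec"        # MOV EBP, ESP
    "ff750c"      # PUSH DWORD [EBP+0xc]
    "b9fca56e88"  # MOV ECX, 0x886ea5fc
)

MOV_ECX_IMM32 = 0xB9  # opcode byte for MOV ECX, imm32
offset = 1 + 2 + 3    # sizes of the three preceding instructions

# The opcode sits at stub offset 6, i.e. 0x886ea580 + 6 = 0x886ea586.
assert stub[offset] == MOV_ECX_IMM32

# The four bytes after the opcode are the immediate, little-endian.
(hardcoded,) = struct.unpack_from("<I", stub, offset + 1)
print(hex(hardcoded))  # 0x886ea5fc
```

The bytes fc a5 6e 88 read in reverse order give 0x886ea5fc, the same address that was passed to the first dd call.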
Classifying All Hooks
So far, only 1 of the 45 hooks has been shown to belong to Symantec. To confirm that all of them do, the remaining 44 must be checked. Repeating the volshell steps for two more of the system calls showed the same pattern as before: a hardcoded address (different for each hook) is dereferenced, 4 is added to the resulting value, and that location is dereferenced in turn. In each case the final destination was the same: 0x964ba922. With this in mind, it became clear that checking each hook could be automated to see whether they all ended at 0x964ba922.
To accomplish this, awk was first used to extract just the UNKNOWN addresses from the ssdt output saved earlier:
$ grep UNKNOWN ssdt-output.txt | awk '{ print $3 }' > unknown-addresses
This placed all of the UNKNOWN addresses in the unknown-addresses file. Next, code was written inside of volshell that repeated the pattern of reading the hardcoded address, dereferencing it, and then dereferencing that value + 4. Note: volshell runs inside of the Python shell, so whatever you can do in Python, you can do in volshell as well. The following shows this code in its entirety, with line numbers added to the beginning of each line.
1. > for line in open("unknown-addresses", "r").readlines():
2. ...:     address = int(line.strip(), 16)
3. ...:     insts = addrspace().zread(address + 6, 5)
4. ...:     if ord(insts[0]) != 0xb9:
5. ...:         print "invalid instruction at %x | %x" % (address, ord(insts[0]))
6. ...:         break
7. ...:
8. ...:     first_addr = struct.unpack("<I", insts[1:])[0]
9. ...:
10. ...:     second_addr_str = addrspace().zread(first_addr, 4)
11. ...:     second_addr = struct.unpack("<I", second_addr_str)[0]
12. ...:
13. ...:     function_addr_str = addrspace().zread(second_addr + 4, 4)
14. ...:     function_addr = struct.unpack("<I", function_addr_str)[0]
15. ...:
16. ...:     if function_addr != 0x964ba922:
17. ...:         print "incorrect function_address: %x" % function_addr
18. ...:
This code starts by looping over each line in the file, which contains one line for each UNKNOWN hook address. On line 2, the address is converted to an integer. On line 3, 5 bytes are read from the kernel address space, starting at the hook address plus 6. This corresponds to reading what should be the MOV ECX, <hardcoded address> instruction at [2] in the first disassembly listing. On line 4, the first byte of the instruction is verified to be 0xb9, which is the opcode for MOV ECX with a 32-bit immediate. On line 8, the remaining 4 bytes are converted to an integer, as they are the hardcoded address in little endian. On lines 10 and 11, the hardcoded address is dereferenced and the result converted to an integer. On lines 13 and 14, the dereferenced value plus 4 is dereferenced and converted to an integer. On line 16, it is verified to be the target CALL address of 0x964ba922.
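The same check can be sketched outside of volshell as a standalone Python 3 function, which is convenient for testing the logic before pointing it at a real sample. This is a hedged re-implementation, not the code used in the investigation: follow_hook, the read callable, and the simulated memory dictionary are assumptions made for illustration. Inside volshell, read would be backed by addrspace().zread. The simulated memory contains only values already shown in the listings above, plus one made-up address (0xdead0000) that is not from the sample.

```python
import struct


def follow_hook(read, hook_addr):
    """Follow one hook stub to its CALL target.

    read(addr, length) must return `length` bytes of kernel memory.
    Returns the target address, or None if the stub does not start
    its MOV ECX, imm32 instruction at hook_addr + 6.
    """
    insts = read(hook_addr + 6, 5)
    if insts[0] != 0xB9:  # opcode for MOV ECX, imm32
        return None
    (first_addr,) = struct.unpack("<I", insts[1:])
    (second_addr,) = struct.unpack("<I", read(first_addr, 4))
    (function_addr,) = struct.unpack("<I", read(second_addr + 4, 4))
    return function_addr


# Simulated memory holding only the values seen in the listings above.
memory = {
    0x886EA586: bytes.fromhex("b9fca56e88"),    # MOV ECX, 0x886ea5fc
    0x886EA5FC: struct.pack("<I", 0x964C72B4),  # first dereference
    0x964C72B8: struct.pack("<I", 0x964BA922),  # [EAX+4], the CALL target
}


def read(addr, length):
    """Stand-in for addrspace().zread: zero-fill unknown addresses."""
    return memory.get(addr, b"\x00" * length)[:length]


# Group hooks by destination instead of stopping at the first mismatch.
# 0xdead0000 is a made-up address that does not match the pattern.
targets = {}
for hook in (0x886EA580, 0xDEAD0000):
    targets.setdefault(follow_hook(read, hook), []).append(hook)

for target, hooks in targets.items():
    label = "no match" if target is None else hex(target)
    print(label, [hex(h) for h in hooks])
```

Grouping the hooks by destination, rather than breaking out of the loop on the first unexpected stub, makes any outliers visible in a single pass.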
Closing Thoughts
Through a mix of manual reverse engineering and automated comparisons based on knowledge learned, the system call table was automatically verified as clean, and our friend was given a repeatable methodology to apply to other samples in his investigation.
If you would like to stay in touch with the Volatility Team and community then consider following us on Twitter, joining our Slack Server, and subscribing to our email list.