Wednesday, May 27, 2020

When Anti-Virus Engines Look Like Kernel Rootkits

While analyzing real-world systems, memory analysts will often encounter anti-virus (AV) engines, EDRs, and similar products that, at first glance, look suspiciously like malware. This occurs because these security products leverage the same techniques commonly employed by malware—such as API hooking, system call hooking, and registering callbacks—in order to gain the insight they need to detect and analyze threats.

A recent journal publication by members of our team examined this issue for userland API hooks. In this blog post we will show an example of what AV kernel hooks often look like. The sample investigated in this post was sent by a friend who wanted clarification of the behavior, as it was part of a real investigation. In particular, Volatility's ssdt plugin was showing many system call table entries as being hooked, and our friend wanted to know if the hooks were malicious or benign.

System Calls Background


System calls are the common mechanism by which unprivileged code in userland (process) memory requests services through the kernel to access privileged resources, such as the hard drive to create/read/write/delete files, the network stack to send/receive packets, the monitor to display information to the user, and numerous other resources. System calls are also used to enumerate the active system state, such as the list of running processes, active network connections, loaded kernel modules, and nearly everything else that live forensic tools and endpoint agents gather for analysis. This obviously makes the system call table an attractive target for malware to tamper with, as well as a high-value location for security agents to monitor.

To check for signs of rootkits, Volatility's ssdt plugin locates each system call table present in a memory sample and then enumerates each system call table entry. For each entry, it prints either the containing module, if found, or UNKNOWN if the entry points to a memory location not associated with the kernel itself or a module contained within the kernel module list. Several common kernel rootkit methods of hiding code will trigger the classification as UNKNOWN, such as unlinking the malicious module from the module list, or allocating a RWX memory region and then using it to host code after the malicious module is unloaded (see this blog post on analyzing a detached kernel thread).

Unfortunately, the use of system call hooking by AV and EDR engines frequently triggers UNKNOWN entries in ssdt output, as security agents' attempts to "hide" on a system often leverage the same tactics used by malware.

Initial Analysis


To start the analysis, the ssdt plugin was run against the sample:

$ python vol.py -f sample.raw --profile=Win7SP1x86  ssdt > ssdt-output.txt 
Volatility Foundation Volatility Framework 2.6
$ grep -c UNKNOWN ssdt-output.txt 
45
$ grep UNKNOWN ssdt-output.txt | head -20
  Entry 0x000d: 0x886ea580 (NtAlertResumeThread) owned by UNKNOWN
  Entry 0x000e: 0x881f70b0 (NtAlertThread) owned by UNKNOWN
  Entry 0x0013: 0x886e4d88 (NtAllocateVirtualMemory) owned by UNKNOWN
  Entry 0x0016: 0x88217ca0 (NtAlpcConnectPort) owned by UNKNOWN
  Entry 0x002b: 0x886e9e80 (NtAssignProcessToJobObject) owned by UNKNOWN
  Entry 0x004a: 0x886ea390 (NtCreateMutant) owned by UNKNOWN
  Entry 0x0056: 0x886e9c48 (NtCreateSymbolicLinkObject) owned by UNKNOWN
  Entry 0x0057: 0x88438858 (NtCreateThread) owned by UNKNOWN
  Entry 0x0058: 0x886e9d00 (NtCreateThreadEx) owned by UNKNOWN
  Entry 0x0060: 0x886e9f30 (NtDebugActiveProcess) owned by UNKNOWN
  Entry 0x006f: 0x886e4910 (NtDuplicateObject) owned by UNKNOWN
  Entry 0x0083: 0x886eaba8 (NtFreeVirtualMemory) owned by UNKNOWN
  Entry 0x0091: 0x88436198 (NtImpersonateAnonymousToken) owned by UNKNOWN
  Entry 0x0093: 0x886ea4d0 (NtImpersonateThread) owned by UNKNOWN
  Entry 0x009b: 0x880d3478 (NtLoadDriver) owned by UNKNOWN
  Entry 0x00a8: 0x886eaae0 (NtMapViewOfSection) owned by UNKNOWN
  Entry 0x00b1: 0x886ea2e0 (NtOpenEvent) owned by UNKNOWN
  Entry 0x00be: 0x886e4de0 (NtOpenProcess) owned by UNKNOWN
  Entry 0x00bf: 0x886e4db8 (NtOpenProcessToken) owned by UNKNOWN
  Entry 0x00c2: 0x886ea1a8 (NtOpenSection) owned by UNKNOWN

Looking at the output, it appears that 45 entries are hooked by UNKNOWN module(s). Furthermore, the hooked functions are commonly targeted by both malware and security agents.

Classifying One Hook


To begin the process of determining if the hooks are benign or malicious, volshell was used to examine the code of a few hooks. volshell is a Volatility plugin that allows you to interactively explore a memory sample, including viewing, searching, and disassembling arbitrary addresses in any context (the kernel or a particular process).

To start, volshell was loaded and the hook for Entry 0x000d: 0x886ea580 (NtAlertResumeThread) was disassembled (numbers, dashes, and arrows added for ease of explanation):

$ python vol.py -f sample.raw --profile=Win7SP1x86  volshell
<snip>
> dis(0x886ea580)
0x886ea580 55                               PUSH EBP                               <----- [1]
0x886ea581 8bec                           MOV EBP, ESP
0x886ea583 ff750c                         PUSH DWORD [EBP+0xc]
0x886ea586 b9fca56e88                MOV ECX, 0x886ea5fc            <-----  [2]
0x886ea58b ff7508                         PUSH DWORD [EBP+0x8]
0x886ea58e 51                               PUSH ECX
0x886ea58f 8b01                            MOV EAX, [ECX]                      <-----  [3]
0x886ea591 ff5004                         CALL DWORD [EAX+0x4]        <-----  [4]
0x886ea594 83c40c                       ADD ESP, 0xc
0x886ea597 5d                               POP EBP
0x886ea598 c20800                       RET 0x8                                    <-----  [5]
0x886ea59b cc                               INT 3
0x886ea59c 0000                           ADD [EAX], AL
0x886ea59e 0000                           ADD [EAX], AL
<snip>

After reading the disassembly, several things stand out. First, there is a real function prologue with the instructions of "PUSH EBP; MOV EBP, ESP;" [1] and, shortly after, a proper function epilogue [5]. Examining the code in between, a hardcoded address is stored in ECX [2]. That address is dereferenced and stored in EAX [3]. A dereference of EAX + 4 is then used as the target address of a CALL instruction [4].

This code flow can be re-implemented in volshell through the use of its dd (display double word) and dis functions.

> dd(0x886ea5fc, 4)                   
886ea5fc  964c72b4

> dd(0x964c72b4 + 4, 4)          
964c72b8  964ba922

> dis(0x964ba922)
0x964ba922 55                               PUSH EBP
0x964ba923 8bec                           MOV EBP, ESP
0x964ba925 51                               PUSH ECX
0x964ba926 51                               PUSH ECX
0x964ba927 53                               PUSH EBX
0x964ba928 56                               PUSH ESI
0x964ba929 57                               PUSH EDI
0x964ba92a bf010000c0                MOV EDI, 0xc0000001
<snip>

The first invocation of dd is at the hardcoded address from [2] earlier. As can be seen, the address stored here is 0x964c72b4. The second dd invocation dereferences this address + 4 just as the disassembly in [4] from earlier shows. This gives a CALL target address of 0x964ba922, which the address used in the above dis call.

Looking at the instructions here, there is the beginning of a long function which appears to be the real hook implementation. To determine which, if any, module is hosting the real payload, the drivermodule plugin can be invoked with the address of the function:

$ python vol.py -f sample.raw --profile=Win7SP1x86 drivermodule -a 0x964ba922
Volatility Foundation Volatility Framework 2.6
Module                   Driver          Alt. Name    Service Key
--------------------- ------------- ------------- -----------
SYMEVENT.SYS     SymEvent   SymEvent   \Driver\SymEvent

This output shows that the function is inside of the SYMEVENT.SYS driver, which is part of the Symantec AV/Endpoint protection suite installed on the system.

Classifying All Hooks


So far, it has only been shown that 1 of the 45 hooks belongs to Symantec. To ensure that all of them belong to Symantec, the remaining 44 must be checked. Repeating the volshell steps for two more of the system calls showed the same pattern as before: a hardcoded address (different address for each hook) being dereferenced, followed by the value of the first dereference having 4 added then being dereferenced itself. The consistent result between the hooks of this sample was that the final destination was the same: 0x964ba922. With this in mind, it became clear that the process of checking each hook could be automated to see if they all ended at 0x964ba922. 

To accomplish this, awk was first used to extract just the UNKNOWN addresses from the ssdt output saved earlier:
grep UNKNOWN ssdt-output.txt | awk '{ print $3 }' > unknown-addresses
This placed all of the UNKNOWN addresses in the unknown-addresses file. Next, code was written inside of volshell that repeated the read hardcoded address-> dereference -> dereference+4 pattern. Note: volshell runs inside of the Python shell, so whatever you can do in Python, you can do in volshell as well.  The following shows this code in its entirety, with line numbers added to the beginning of each line.
1.    > for line in open("unknown-addresses", "r").readlines():
2.   ...:    address = int(line.strip(), 16)
3.   ...:    insts = addrspace().zread(address + 6, 5)
4.   ...:    if ord(insts[0]) != 0xb9:
5.   ...:        print "invalid instruction at %x | %x" % (address, ord(insts[0]))
6.   ...:        break
7.   ...:     
8.   ...:    first_addr = struct.unpack("<I", insts[1:])[0]
9.   ...:     
10. ...:   second_addr_str = addrspace().zread(first_addr, 4)
11. ...:   second_addr = struct.unpack("<I", second_addr_str)[0]
12. ...:     
13. ...:   function_addr_str = addrspace().zread(second_addr + 4, 4)
14. ...:   function_addr = struct.unpack("<I", function_addr_str)[0]
15. ...:     
16. ...:   if function_addr != 0x964ba922:
17. ...:        print "incorrect function_address: %x" % function_addr
18. ...:     

This code starts by looping for each line in the file, which contains one line for each UNKNOWN hook address. Next, the address is converted to an integer. On line 3, the hook address plus 6 is read for 5 bytes inside of the kernel address space. This corresponds to reading what should be the MOV ECX, <hardcoded address> instruction at [2] in the first disassembly listing. The first byte of the instruction is verified to be 0xb9, which is the MOV instruction opcode. On line 8, the remaining 4 bytes are then converted to an integer, as they are the hardcoded address in little endian. On lines 10  and 11, the hardcoded address is dereferenced and then converted to an integer. On lines 13 and 14, the dereferenced valued plus 4 is dereferenced and converted to an integer. It is then verified to be the target CALL address of 0x964ba922.

Running this loop over the sample showed that all system call hooks transferred control flow to 0x964ba922, which means that all hooks belonged to the same Symantec driver.

Closing Thoughts


Through a mix of manual reverse engineering and automated comparisons based on knowledge learned, the system call table was automatically verified as clean, and our friend was given a repeatable methodology to apply to other samples in his investigation.

If you like this type of analysis and want to challenge yourself to write plugins that automate memory forensic techniques, then consider a submission to our 2020 Volatility Plugin Contest. This year's contest is based on Volatility 3, and you can learn about the new, exciting features of Volatility 3 in our recently recorded presentation.

If you would like to stay in touch with the Volatility Team and community then consider following us on Twitter, joining our Slack Server, and subscribing to our email list.


No comments:

Post a Comment