[Archive of Volatility Labs]: MoVP 1.5 KBeast Rootkit, Detecting Hidden Modules, and sysfs

Month of Volatility Plugins

In this post I will analyze the KBeast rootkit using Volatility’s new Linux features. This will include finding hidden modules, network connections, opened files, and hooked system calls.

If you would like to follow along or recreate the steps taken, please see the LinuxForensicsWiki for instructions on how to do so.

Obtaining the Samples

To have a sample to test against I installed the KBeast rootkit in my Debian virtual machine that was running the 2.6.26-2-686 32-bit kernel.

KBeast

KBeast is a kernel mode rootkit that loads as a kernel module. It also has a userland component that provides remote access to the computer. This userland backdoor is hidden from other userland applications by the kernel module. KBeast also hides files, directories, and processes that start with a user defined prefix. Keylogging abilities are also optionally provided.

KBeast gains its control over a computer by hooking the system call table and by hooking the operations structures used to implement the netstat interface to userland.

We will now go through each piece of functionality the rootkit offers, how it accomplishes it, and how we can detect it with Volatility.

Hiding the Kernel Module

Effect on Forensics

Rootkits hide themselves from the module list as any unknown modules will be very noticeable to IT security staff as well as to integrity verifiers that operate in userland. The inability to locate hidden modules can give investigators a false sense of security and make them trust the output of tools on a live machine that they should not.

How KBeast Accomplishes it

To hide its kernel module component, KBeast uses the same technique that many other rootkits do: it unlinks itself from the linked list of loaded kernel modules. This list is exported through /proc/modules, and the lsmod binary reads this file to list the modules loaded on a system. As a result, the module remains active in memory but is not detectable with lsmod or with kernel tools that simply walk the linked list.
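
The effect of unlinking can be sketched with a toy doubly linked list. This is only an illustration in Python, not kernel or KBeast code; the Module class and the list_add, list_del, and walk helpers are hypothetical names that loosely mirror the kernel's circular list handling.

```python
class Module:
    """Toy stand-in for a kernel module entry on a circular doubly linked list."""

    def __init__(self, name):
        self.name = name
        self.prev = self
        self.next = self


def list_add(new, head):
    """Insert 'new' right after 'head'."""
    new.prev = head
    new.next = head.next
    head.next.prev = new
    head.next = new


def list_del(entry):
    """Unlink 'entry' from its neighbours; the object itself stays alive."""
    entry.prev.next = entry.next
    entry.next.prev = entry.prev


def walk(head):
    """Return the names reachable by walking the list, as a list walker would."""
    names = []
    node = head.next
    while node is not head:
        names.append(node.name)
        node = node.next
    return names


head = Module("<list head>")
ext3 = Module("ext3")
rootkit = Module("ipsecs_kbeast_v1")
list_add(ext3, head)
list_add(rootkit, head)

list_del(rootkit)      # the module hides itself
visible = walk(head)   # the walker no longer reaches it
still_alive = rootkit.name
```

After list_del the rootkit object still exists and could keep running, but anything that only walks the list never reaches it.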

How Volatility Detects This

Volatility leverages sysfs to find modules that are removed from the modules list but still active. sysfs is a kernel to userland interface, similar to /proc, that exports a wide range of kernel information and statistics. One of these types of data is the loaded modules and their associated information such as parameters, sections, and reference counts. On a running system, this information is exported through the /sys/module directory.

Inside of this directory, there is one directory per kernel module, and each directory has the same name that the module has in lsmod. The per-module sub-directories contain more sub-directories that hold the parameters, sections, and other module data. The following shows reading the parameters passed to LiME to obtain the memory capture for this blog post (the original command was insmod lime.ko "path=kbeast.this format=lime"):

# cat /sys/module/lime/parameters/path

kbeast.this

# cat /sys/module/lime/parameters/format

lime
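
On a live system the same values can be collected programmatically. The following is a minimal sketch that assumes the /sys/module/<name>/parameters layout described above; the function name read_module_parameters and its sysfs_root argument are hypothetical, introduced only so the lookup can be pointed at a different directory.

```python
import os


def read_module_parameters(module, sysfs_root="/sys/module"):
    """Return {parameter name: value} for one module, or {} if it has none."""
    params_dir = os.path.join(sysfs_root, module, "parameters")
    values = {}
    if not os.path.isdir(params_dir):
        return values
    for name in sorted(os.listdir(params_dir)):
        try:
            with open(os.path.join(params_dir, name)) as handle:
                values[name] = handle.read().strip()
        except OSError:
            # Some parameters may not be readable by the current user.
            values[name] = None
    return values
```

Calling read_module_parameters("lime") on the infected VM would be expected to return the path and format values shown above. Keep in mind that output gathered on a live, compromised system is exactly what a rootkit can tamper with.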

The linux_check_modules plugin finds hidden modules by walking the linked list of modules as well as enumerating all the directories under /sys/module. These two lists are then compared and any entries that are only found in sysfs are reported as hidden kernel modules. We have yet to find a rootkit that hides from sysfs at all, so this method has worked well across a number of malware samples. The following shows this plugin against KBeast:

# python vol.py -f kbeast.this --profile=LinuxDebianx86 linux_check_modules

Volatile Systems Volatility Framework 2.2_rc1

Module Name

-----------

ipsecs_kbeast_v1

As can be seen, the KBeast module is detected as hidden.

The sysfs enumeration code works by finding the module_kset variable, of type kset, that holds all information for /sys/module. The plugin then walks each member of the kset’s entry list which is of type kobject. Each of these kobject structures represents a module and its subdirectory immediately under /sys/module. The names of these directories are then gathered to be compared with the module list names.
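
The comparison step itself is a simple set difference. The sketch below is not the plugin's source; it assumes the two name lists have already been gathered (one from the module linked list, one from the sysfs kset), and find_hidden_modules is a hypothetical helper name.

```python
def find_hidden_modules(list_names, sysfs_names):
    """Return names that appear under sysfs but not in the module linked list."""
    return sorted(set(sysfs_names) - set(list_names))


# Illustrative inputs: the rootkit unlinked itself from the list
# but left its /sys/module entry in place.
from_module_list = ["lime", "ext3", "loop"]
from_sysfs = ["lime", "ext3", "loop", "ipsecs_kbeast_v1"]

hidden = find_hidden_modules(from_module_list, from_sysfs)
```

With these inputs, hidden contains only ipsecs_kbeast_v1, which matches what the plugin reported for the infected image.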

Hooking System Call Table

Effect on Forensics

System calls are the main mechanism for userland code to trigger event handling by the kernel. Reading and writing files, sending network data, spawning and exiting processes, etc. are all done through system calls. The system call table is an array of function pointers, in which each pointer corresponds to a system call handler (e.g. sys_read handles the read system call).

Rootkits often target this table due to the power it gives them over the control flow of the running kernel. KBeast hooks a number of entries in this table in order to hide files, processes, and more.

How KBeast Accomplishes it

During the initialization of its kernel module, KBeast hooks the unlink, rmdir, unlinkat, rename, open, kill, read, write, getdents, and delete_module system calls with its own handlers. These handlers ensure that files and processes that start with the user-supplied prefix are hidden and that they cannot be tampered with.

The overwritten kill system call handler also acts as the mechanism that the rootkit provides in order for userland processes to elevate privileges. All a userland process has to do is send a signal with the backdoor signal value and the process will be elevated. If you read our post yesterday, you know that the Average Coder rootkit used a mechanism that allowed us to detect elevated processes. Unfortunately, KBeast does not use this mechanism and instead uses the proper interfaces provided by the kernel, namely prepare_creds and commit_creds. This mechanism does not produce any inconsistencies, so we cannot immediately find processes elevated by KBeast.

How Volatility Detects This

Volatility detects all of these hooks by enumerating and verifying each entry in the system call table. This is implemented in the linux_check_syscall plugin, which, for every entry in the system call table, prints either the symbol name or, if the entry is hooked, the hook address. Since there are anywhere from 300 to 400+ system calls on a normal Linux system, it is advisable to redirect the plugin output to a file and then grep for bad entries as shown here:

# python vol.py -f kbeast.lime --profile=LinuxDebianx86 linux_check_syscall > ksyscall

# head -10 ksyscall

Table Name Index Address Symbol

---------- ---------- ---------- ------------------------------

32bit 0x0 0xc103ba61 sys_restart_syscall

32bit 0x1 0xc103396b sys_exit

32bit 0x2 0xc100333c ptregs_fork

32bit 0x3 0xe0fb46b9 HOOKED

32bit 0x4 0xe0fb4c56 HOOKED

32bit 0x5 0xe0fb4fad HOOKED

32bit 0x6 0xc10b1b16 sys_close

32bit 0x7 0xc10331c0 sys_waitpid

# grep HOOKED ksyscall

32bit 0x3 0xe0fb46b9 HOOKED

32bit 0x4 0xe0fb4c56 HOOKED

32bit 0x5 0xe0fb4fad HOOKED

32bit 0xa 0xe0fb4d30 HOOKED

32bit 0x25 0xe0fb4412 HOOKED

32bit 0x26 0xe0fb4ebd HOOKED

32bit 0x28 0xe0fb4db1 HOOKED

32bit 0x81 0xe0fb5044 HOOKED

32bit 0xdc 0xe0fb4b9e HOOKED

32bit 0x12d 0xe0fb4e32 HOOKED

We can see in the first output what some clean entries look like and that the system call table index is reported along with the symbol name and address. For hooked entries, we instead see HOOKED in place of a symbol name because the hooked entry points to an unknown address (in this case inside the rootkit’s module).

The plugin only prints the index of the system call entries and not a name because the system call table varies widely across distributions and kernel versions, and determining the name of each one requires the debug build of the kernel (vmlinux). This may be incorporated into future versions of the plugins, but will require additions to the current code base, and in many cases the debug build is not made available by the distribution package maintainers.
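
Until names are reported by the plugin, an analyst can map indexes by hand. The sketch below parses lines in the format shown above and looks each hooked index up in a small table. The table is an assumption: it uses the standard 32-bit x86 (i386) system call numbering and covers only the indexes seen in this image, so it should be checked against the headers of the kernel being analyzed. The names I386_SYSCALLS and hooked_entries are hypothetical.

```python
# Assumed i386 syscall numbers for the hooked indexes only; verify for your kernel.
I386_SYSCALLS = {
    3: "read",
    4: "write",
    5: "open",
    10: "unlink",
    37: "kill",
    38: "rename",
    40: "rmdir",
    129: "delete_module",
    220: "getdents64",
    301: "unlinkat",
}


def hooked_entries(lines):
    """Return (index, hook address, guessed name) for every HOOKED line."""
    results = []
    for line in lines:
        fields = line.split()
        if len(fields) == 4 and fields[3] == "HOOKED":
            index = int(fields[1], 16)
            results.append((index, fields[2], I386_SYSCALLS.get(index, "unknown")))
    return results


sample = [
    "32bit 0x2 0xc100333c ptregs_fork",
    "32bit 0x3 0xe0fb46b9 HOOKED",
    "32bit 0x6 0xc10b1b16 sys_close",
    "32bit 0x12d 0xe0fb4e32 HOOKED",
]
found = hooked_entries(sample)
```

Under that numbering, index 0xdc corresponds to getdents64 rather than the older getdents entry, which is worth confirming against the rootkit source when attributing each hook.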

Hiding Network Connections

Effect on Forensics

The ability to hide network connections from userland frustrates not only host investigators, but also network forensics teams who wish to tie traffic back to a specific computer. The ease with which kernel modules can hide information from userland makes a strong case for all incident response to be based on offline memory captures and not on the output from tools running on the live system.

How KBeast Accomplishes it

To hide network connections from netstat and the userland interfaces it uses, KBeast hooks the show member of the tcp4_seq_afinfo sequence operation structure. This structure is of type tcp_seq_afinfo and has members of type file_operations and of type seq_operations. Please refer to yesterday’s blog post to learn about file_operations structures. Sequence operation structures provide a generic mechanism to display information inside of the /proc filesystem. This structure has the members start, show, next, stop, and the wrapping code provides handling of partial seeks, buffered reads, and other complicated logic so that it only has to be implemented once throughout the entire kernel.

Sequence operations structures are often targeted by malware because they directly affect what is populated in /proc. By overwriting the show member of such a structure, a rootkit can easily filter out entries it does not want to appear in userland. KBeast effectively hides its backdoored network connection by hooking the show member of the TCP4 structure and filtering its output. This technique is also used by many other rootkits.
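
The filtering idea can be modeled in a few lines. This is a simplified Python illustration, not KBeast's code: original_show, hooked_show, and render are hypothetical names, the line format only loosely imitates /proc/net/tcp (which prints ports in hexadecimal), and the hidden port is taken from the image analyzed later in this post.

```python
HIDDEN_PORT = 13377  # backdoor port seen later in the linux_netstat output


def original_show(buf, conn):
    """Append one simplified /proc/net/tcp-style line, ports in uppercase hex."""
    buf.append("%s:%04X %s:%04X" % (conn["laddr"], conn["lport"],
                                    conn["raddr"], conn["rport"]))


def hooked_show(buf, conn):
    """Call the real handler, then drop the line if it mentions the hidden port."""
    original_show(buf, conn)
    if ":%04X" % HIDDEN_PORT in buf[-1]:
        buf.pop()


def render(conns, show):
    """Produce the listing a userland reader would see with a given show handler."""
    buf = []
    for conn in conns:
        show(buf, conn)
    return buf


connections = [
    {"laddr": "192.168.110.150", "lport": 22,
     "raddr": "192.168.110.1", "rport": 50122},
    {"laddr": "192.168.110.150", "lport": 13377,
     "raddr": "192.168.110.140", "rport": 41745},
]
clean_view = render(connections, original_show)
hooked_view = render(connections, hooked_show)
```

The kernel's own socket structures are untouched in this model; only the text handed to userland changes, which is why a memory-based view still shows the connection.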

How Volatility Detects This

To detect KBeast’s overwriting of network sequence operation structures, the linux_check_afinfo plugin walks the file_operations and seq_operations structures of all UDP and TCP protocol structures, including tcp6_seq_afinfo, tcp4_seq_afinfo, udplite6_seq_afinfo, udp6_seq_afinfo, udplite4_seq_afinfo, and udp4_seq_afinfo, and verifies each member. This effectively detects any tampering with the interesting members of these structures. The following output shows this plugin against the VM infected with KBeast:

# python vol.py -f kbeast.lime --profile=LinuxDebianx86 linux_check_afinfo

Volatile Systems Volatility Framework 2.2_rc1

Symbol Name Member Address

----------- ------ ----------

tcp4_seq_afinfo show 0xe0fb9965

This plugin reports and verifies that the show member is indeed hooked and that the system is compromised.

Analyzing the Userland Backdoor

Effect on Forensics

The kernel module provides cover for the attacker by hiding any processes or files that start with a user-defined prefix and any network connection on a specified port. By default, the prefix is set to “_h4x_”, but the rootkit’s README recommends changing it to something that is not so simple to grep for. For this demo, I just left it as the default. The port number to hide is also a compile-time configuration option chosen by the user.

The userland backdoor consists of a simple application that listens on the hidden network port, requires a password, and then spawns a bash shell with the privileges of root if the password is correct.

How KBeast Accomplishes it

As stated in the section on hooking the system call table, these userland activities are hidden by hooking the system call table and the sequence operations structure of TCP. Once connected to the backdoor, the attacker can perform a wide range of attacks and post-compromise activity. We will now focus on recovering this activity.

How Volatility Detects This

Fortunately for Volatility’s users, particularly those with a baseline of the system they are analyzing or a copy of ps output from the infected system, finding the hidden process is trivial. The output of linux_pslist can simply be compared with that of the baseline or ps. Since KBeast hides processes by hooking the system call table, the process list is untouched and the hidden process will be in Volatility’s output but not the others. In the case of my infected image the _h4x_bd process has a PID of 2777:

# python vol.py --profile=LinuxDebianx86 -f kbeast.lime linux_pslist -p 2777

Volatile Systems Volatility Framework 2.2_rc1

Offset Name Pid Uid Start Time

---------- ----- ------- --- ----------

0xdf4cd5a0 _h4x_bd 2777 0 Wed, 12 Sep 2012 20:49:25 +0000

Since we know the PID is 2777, we can then investigate the rest of the application’s activities using Volatility. First, we want to determine if any processes have the backdoor as a parent process. We can use the linux_pstree plugin to determine this and it will show us what programs were executed by the backdoor:

# python vol.py --profile=LinuxDebianx86 -f kbeast.lime linux_pstree

Volatile Systems Volatility Framework 2.2_rc1

Name Pid Uid

<snip>

._h4x_bd 2777 0

..bash 3053 0

...sleep 3077 0

<snip>

This plugin lists the parent/child relationship between processes by adding a ‘.’ for each depth in the hierarchy. The displayed portions of the output show us that the backdoor is active with a spawned bash shell and that this shell ran the sleep command. We can then use the linux_psaux plugin to display the command line arguments of each of these processes and their start time:

# python vol.py --profile=LinuxDebianx86 -f kbeast.lime linux_psaux -p 2777,3053,3077

Volatile Systems Volatility Framework 2.2_rc1

Pid Uid Arguments

2777 2 ./_h4x_bd Wed, 12 Sep 2012 20:49:25 +0000

3053 2 bash -i Thu, 13 Sep 2012 01:00:31 +0000

3077 2 sleep 100 Thu, 13 Sep 2012 01:02:22 +0000

In this output we can see that bash was run in interactive mode and that sleep was passed a parameter of 100. In a real incident response situation, this can reveal what parameters were passed to a wide range of tools used during post-compromise activity.

Now that we know a connection was active to the backdoor at the time of the compromise, we want to recover the network connections associated with it. We can use the linux_netstat plugin with the backdoor’s PID to accomplish this:

# python vol.py --profile=LinuxDebianx86 -f kbeast.lime linux_netstat -p 2777

Volatile Systems Volatility Framework 2.2_rc1

TCP 192.168.110.150:13377 192.168.110.140:41744 CLOSE_WAIT _h4x_bd/2777

TCP 0.0.0.0:13377 0.0.0.0:0 LISTEN _h4x_bd/2777

TCP 192.168.110.150:13377 192.168.110.140:41745 ESTABLISHED _h4x_bd/2777

This shows us that the backdoor is listening on port 13377 and that there is an active connection from 192.168.110.140 on port 41745. We also see a previous connection in the CLOSE_WAIT state on port 41744. As we will see in a future blog post on recovering network data, we could attempt to recover the packets associated with these connections by using the linux_sk_buff_cache and linux_pkt_queues plugins. Having the IP address and port pairs also allows us to focus network forensics investigations on only the streams associated with the communication channels of the malware.

At this point, we have found the processes and network activity associated with the backdoor, all of which would be hidden from us on a live system, and are able to dig deep into the workings of the process. Now our goal is to discover the hidden directory that the backdoor is placed in, as the keylogging file is stored in the same directory. We can use linux_proc_maps for this:

# python vol.py --profile=LinuxDebianx86 -f kbeast.lime linux_proc_maps -p 2777

Volatile Systems Volatility Framework 2.2_rc1

0x8048000-0x8049000 r-x 0 8: 1 301353 /usr/_h4x_/_h4x_bd

0x8049000-0x804a000 rw- 4096 8: 1 301353 /usr/_h4x_/_h4x_bd

0xb75d7000-0xb75d8000 rw- 0 0: 0 0

0xb75d8000-0xb772d000 r-x 0 8: 1 513087 /lib/i686/cmov/libc-2.7.so

0xb772d000-0xb772e000 r-- 1396736 8: 1 513087 /lib/i686/cmov/libc-2.7.so

0xb772e000-0xb7730000 rw- 1400832 8: 1 513087 /lib/i686/cmov/libc-2.7.so

0xb7730000-0xb7733000 rw- 0 0: 0 0

0xb7739000-0xb773b000 rw- 0 0: 0 0

0xb773b000-0xb773c000 r-x 0 0: 0 0

0xb773c000-0xb7756000 r-x 0 8: 1 505267 /lib/ld-2.7.so

0xb7756000-0xb7758000 rw- 106496 8: 1 505267 /lib/ld-2.7.so

0xbf81b000-0xbf831000 rw- 0 0: 0 0 [stack]

By looking at the mapping starting at 0x8048000, we see that our backdoor binary is loaded at that address and that its full path is /usr/_h4x_/_h4x_bd. Since the directory name has the hidden prefix, this directory would not show on a live machine, and we would have to analyze a disk image to find it. Timelining would be a good method to narrow down the results quickly.

We can partially recover the backdoor binary by using the linux_dump_map plugin:

# python vol.py --profile=LinuxDebianx86 -f kbeast.lime linux_dump_map -p 2777 -s 0x8048000 -O h4xbd

This invocation focuses on PID 2777 (the network backdoor), selects the mapping that starts at 0x8048000 with -s, and tells the plugin to write that mapping to the h4xbd file. This will only partially recover the file, though, as the binary is not loaded directly from disk into the process’s memory; instead, its sections are spread throughout the address space. We can verify this with the file and readelf commands:

# file h4xbd

bin22: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), statically linked (uses shared libs), stripped

# readelf -s h4xbd

readelf: Error: Unable to read in 0x28 bytes of section headers

readelf: Error: Unable to read in 0x5a0 bytes of section headers

readelf: Error: Unable to read in 0xd0 bytes of dynamic section
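
One plausible reading of these errors is that the ELF header in the dump points at a section header table that lies beyond the end of the dumped mapping. The sketch below checks for that condition by parsing the 52-byte little-endian ELF32 header. It is an illustration only; section_table_status is a hypothetical helper, and it has not been run against the actual dump.

```python
import struct

# Little-endian ELF32 file header: e_ident[16] followed by 13 fixed-size fields.
ELF32_HEADER = struct.Struct("<16sHHIIIIIHHHHHH")


def section_table_status(data):
    """Report where the section header table should be and whether it is present."""
    if len(data) < ELF32_HEADER.size or data[:4] != b"\x7fELF" or data[4] != 1:
        raise ValueError("not an ELF32 image")
    (e_ident, e_type, e_machine, e_version, e_entry, e_phoff, e_shoff,
     e_flags, e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum,
     e_shstrndx) = ELF32_HEADER.unpack_from(data)
    table_size = e_shentsize * e_shnum
    return {
        "offset": e_shoff,
        "size": table_size,
        "present": e_shoff + table_size <= len(data),
    }
```

For the dump above, this could be called as section_table_status(open("h4xbd", "rb").read()). A result with present set to False would mean the header is valid enough for file to identify the format while the section headers that readelf needs are missing from the data.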

Note that the file command sees it as an ELF file, but readelf is unable to process the file. To recover the file intact, we need to acquire it from the page cache using the linux_find_file plugin. This is because the page cache holds all the physical pages backing a file in memory without any modifications.

# python vol.py --profile=LinuxDebianx86 -f kbeast.lime linux_find_file -F "/usr/_h4x_/_h4x_bd"

Volatile Systems Volatility Framework 2.2_rc1

Inode Number Inode

---------------- ----------

301353 0xd606ea70

We then recover the file with another invocation of linux_find_file:

# python vol.py --profile=LinuxDebianx86 -f kbeast.lime linux_find_file -i 0xd606ea70 -O h4xbd

Now when we run readelf, we get much better results:

# readelf -s h4xbd | head -15

Symbol table '.dynsym' contains 25 entries:

Num: Value Size Type Bind Vis Ndx Name

0: 00000000 0 NOTYPE LOCAL DEFAULT UND

1: 00000000 220 FUNC GLOBAL DEFAULT UND signal@GLIBC_2.0 (2)

2: 00000000 0 NOTYPE WEAK DEFAULT UND __gmon_start__

3: 00000000 112 FUNC GLOBAL DEFAULT UND write@GLIBC_2.0 (2)

4: 00000000 55 FUNC GLOBAL DEFAULT UND listen@GLIBC_2.0 (2)

5: 00000000 44 FUNC GLOBAL DEFAULT UND setsid@GLIBC_2.0 (2)

6: 00000000 441 FUNC GLOBAL DEFAULT UND __libc_start_main@GLIBC_2.0 (2)

7: 00000000 14 FUNC GLOBAL DEFAULT UND htons@GLIBC_2.0 (2)

8: 00000000 112 FUNC GLOBAL DEFAULT UND read@GLIBC_2.0 (2)

9: 00000000 210 FUNC GLOBAL DEFAULT UND perror@GLIBC_2.0 (2)

10: 00000000 108 FUNC GLOBAL DEFAULT UND accept@GLIBC_2.0 (2)

11: 00000000 55 FUNC GLOBAL DEFAULT UND socket@GLIBC_2.0 (2)

# readelf -s h4xbd | wc -l

131

This shows us that the symbol table is intact and that readelf printed 131 lines of symbol information. (Thanks to the malware author for not stripping his bins ;). In fact, if we hash the recovered binary from memory and the backdoor binary on the infected VM, the hashes will match exactly.
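
The hash comparison can be done with any hashing tool. A minimal sketch using Python's hashlib follows; the choice of SHA-256 and the helper names sha256_of and same_content are my own assumptions, since the post does not say which hash was used.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """True when both files hash to the same SHA-256 value."""
    return sha256_of(path_a) == sha256_of(path_b)
```

Here same_content would be given the h4xbd file recovered from memory and a copy of /usr/_h4x_/_h4x_bd taken from a disk image of the infected VM.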

As a final step, we will quickly perform binary analysis of the binary recovered from memory. Since the password, hidden port, secret signal number, etc. are all compile-time options, they will be different per instance of the sample, but can be recovered with simple reverse engineering. To start this process, we find symbols from the binary that may be interesting by using nm and filtering for functions (code).

# nm h4xbd | grep -wi "t"

08048b70 t __do_global_ctors_aux

08048770 t __do_global_dtors_aux

08048b6a T __i686.get_pc_thunk.bx

08048b00 T __libc_csu_fini

08048b10 T __libc_csu_init

08048b9c T _fini

08048584 T _init

08048740 T _start

08048906 T bindshell

0804881d T enterpass

080487f4 T error_ret

080487d0 t frame_dummy

08048ace T main

From this output, the functions bindshell and enterpass look interesting. If we load the binary into gdb and disassemble enterpass, we notice a few things:

# gdb -q h4xbd

Reading symbols from /root/h4xbd...done.

(gdb) set disassembly-flavor intel

(gdb) disassemble enterpass

Dump of assembler code for function enterpass:

0x0804881d <enterpass+0>: push ebp

0x0804881e <enterpass+1>: mov ebp,esp

0x08048820 <enterpass+3>: sub esp,0x68

0x08048823 <enterpass+6>: mov DWORD PTR [ebp-0x8],0x8048ea8 <--- banner string

0x0804882a <enterpass+13>: mov DWORD PTR [ebp-0x4],0x8048ec9 <--- another banner string

<…snip…>

0x08048892 <enterpass+117>: mov DWORD PTR [esp+0x8],0x5

0x0804889a <enterpass+125>: mov DWORD PTR [esp+0x4],0x8048ee6 <---- hardcoded address of password

0x080488a2 <enterpass+133>: lea eax,[ebp-0x48]

0x080488a5 <enterpass+136>: mov DWORD PTR [esp],eax

0x080488a8 <enterpass+139>: call 0x8048714 <strncmp@plt> <---- strncmp call

<…snip…>

What becomes immediately apparent is that we have a call to strncmp at 0x080488a8, which is likely where the password check is performed, and that we see other hardcoded strings in the address range of 0x8048eXX. At address 0x0804889a, we can see one of these strings being placed on the stack as a parameter to the strncmp call. If we investigate these addresses, we see that the password (“h4x3d”) is contained in cleartext and that the other strings in the same memory region contain the backdoor’s login banner, debug information, the hidden directory “/usr/_h4x_”, and other interesting information.

(gdb) x/s 0x8048ee6

0x8048ee6: "h4x3d"

(gdb) x/30s 0x8048e00

0x8048e7d: ""

0x8048e7e: ""

0x8048e7f: ""

0x8048e80: "ERROR! Error occured on your system!"

0x8048ea5: ""

0x8048ea6: ""

0x8048ea7: ""

0x8048ea8: "Password [displayed to screen]: "

0x8048ec9: "<< Welcome To The Server >>\n"

0x8048ee6: "h4x3d"

0x8048eec: "Wrong!\n"

0x8048ef4: "socket"

0x8048efb: "bind"

0x8048f00: "listen"

0x8048f07: "Daemon running with PID = %i\n"

0x8048f25: "/usr/_h4x_"

0x8048f30: "/bin/bash"

If we analyze the bindshell function, we find more configuration information about the particular KBeast instance:

(gdb) disassemble bindshell

Dump of assembler code for function bindshell:

0x08048906 <bindshell+0>: push ebp

0x08048907 <bindshell+1>: mov ebp,esp

0x08048909 <bindshell+3>: sub esp,0x58

0x0804890c <bindshell+6>: mov WORD PTR [ebp-0x24],0x2

0x08048912 <bindshell+12>: mov DWORD PTR [esp],0x3441 <-- the backdoor port

0x08048919 <bindshell+19>: call 0x8048624 <htons@plt>

<snip>

0x080489d9 <bindshell+211>: mov DWORD PTR [esp],0x8048f25 <- the hidden directory

0x080489e0 <bindshell+218>: call 0x80486b4 <chdir@plt>

<snip>

(gdb) x/s 0x8048f25

0x8048f25: "/usr/_h4x_"
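
The immediate value passed to htons can be tied back to the earlier network output with a quick conversion. This small sketch only uses the constant visible in the disassembly; the variable names are hypothetical.

```python
import struct

PORT_IMMEDIATE = 0x3441  # value moved onto the stack before the htons call

# The same value in decimal, as tools such as linux_netstat display it.
port = int(PORT_IMMEDIATE)

# htons converts to network (big-endian) byte order; these are the two bytes
# that end up in the socket address structure.
wire_bytes = struct.pack("!H", PORT_IMMEDIATE)
```

The value 0x3441 is 13377 in decimal, which is the port the backdoor was listening on in the linux_netstat output.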

At this point we have done a fairly thorough job of analyzing the rootkit and can perform very effective analysis against it. If needed, we could even write Volatility plugins to automatically recover the configuration parameters directly from memory.

Conclusion

We have thoroughly investigated the KBeast rootkit, including its internals, artifacts left on a system, and interactions with the attackers who place it on a system. This includes hooking the system call table, overwriting network operation structures, and allowing “stealth” access to the compromised computer over the network.

In next week’s Linux posts, we will analyze another rootkit, Jynx, which requires more plugins to analyze, and we will have a blog post on analyzing network information with Volatility. If you have any questions or comments please use the comment section of the blog or you can find me on Twitter (@attrc).

Friday, September 14, 2012
