In this post I will analyze the KBeast rootkit using Volatility’s new Linux features. This will include finding hidden modules, network connections, opened files, and hooked system calls.
If you would like to follow along or recreate the steps taken, please see the LinuxForensicsWiki for instructions on how to do so.
Obtaining the Samples
To have a sample to test against I installed the KBeast rootkit in my Debian virtual machine that was running the 2.6.26-2-686 32-bit kernel.
KBeast
KBeast is a kernel mode rootkit that loads as a
kernel module. It also has a userland component that provides remote access to
the computer. This userland backdoor is hidden from other userland applications
by the kernel module. KBeast also hides
files, directories, and processes that start with a user defined prefix.
Keylogging abilities are also optionally provided.
KBeast gains its control over a computer by hooking
the system call table and by hooking the operations structures used to
implement the netstat interface to
userland.
We will now go through each piece of functionality the rootkit offers, how it accomplishes it, and how we can detect it with Volatility.
Hiding the Kernel Module
Effect on Forensics
Rootkits hide themselves from the module list as any
unknown modules will be very noticeable to IT security staff as well as to
integrity verifiers that operate in userland.
The inability to locate hidden modules can give investigators a false
sense of security and make them trust the output of tools on a live machine
that they should not.
How KBeast Accomplishes it
To hide its kernel module component, KBeast uses the same technique that many other rootkit modules do: unlinking itself from the linked list of loaded kernel modules. This list is exported through /proc/modules and the lsmod binary reads this file to list the loaded modules of a system. As a result, the module is still active in memory, but it is not detectable with lsmod or with kernel tools that simply walk the linked list.
How Volatility Detects This
Volatility leverages sysfs to find modules that are removed from the modules list but
still active. sysfs is a kernel to
userland interface, similar to /proc,
that exports a wide range of kernel information and statistics. One of these
types of data is the loaded modules and their associated information such as
parameters, sections, and reference counts. On a running system, this
information is exported through the /sys/module
directory.
Inside this directory, there is one directory per kernel module, named exactly as the module appears in lsmod. The per-module subdirectories contain further subdirectories that hold the parameters, sections, and other module data. The following shows the parameters that were passed to LiME to obtain the memory capture for this blog post being read back from sysfs (the original command was insmod lime.ko "path=kbeast.this format=lime"):
# cat /sys/module/lime/parameters/path
kbeast.this
# cat /sys/module/lime/parameters/format
lime
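The same sysfs files can also be read programmatically. The following is a minimal Python sketch and is not part of Volatility or LiME; the helper name read_module_parameters and its sysfs_root argument are assumptions made for illustration.

```python
import os


def read_module_parameters(module, sysfs_root="/sys/module"):
    """Return {parameter: value} for one module's sysfs parameters directory."""
    params_dir = os.path.join(sysfs_root, module, "parameters")
    values = {}
    if not os.path.isdir(params_dir):
        return values  # module absent, or it exposes no parameters
    for name in sorted(os.listdir(params_dir)):
        try:
            with open(os.path.join(params_dir, name)) as handle:
                values[name] = handle.read().strip()
        except OSError:
            values[name] = None  # some parameters are not readable
    return values


if __name__ == "__main__":
    # On the capture VM this would be expected to list path and format.
    print(read_module_parameters("lime"))
```

The sysfs_root argument exists only so the helper can be pointed at a copied directory tree instead of the live /sys/module.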
The linux_check_modules
plugin finds hidden modules by walking the linked list of modules as well
as enumerating all the directories under /sys/module.
These two lists are then compared and any entries that are only found in sysfs are reported as hidden kernel modules. We have yet
to find a rootkit that hides from sysfs
at all, so this method has worked well across a number of malware samples. The following shows this plugin against KBeast:
# python vol.py -f kbeast.this --profile=LinuxDebianx86 linux_check_modules
Volatile Systems Volatility Framework 2.2_rc1
Module Name
-----------
ipsecs_kbeast_v1
As can be seen, the KBeast module is detected as
hidden.
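The comparison at the heart of this check can be sketched in a few lines of Python. This is an illustration of the idea only, not the plugin's code; the function name find_hidden_modules and the sample module names other than ipsecs_kbeast_v1 are hypothetical.

```python
def find_hidden_modules(module_list_names, sysfs_names):
    """Return names present under /sys/module but absent from the module list."""
    listed = set(module_list_names)
    return sorted(name for name in set(sysfs_names) if name not in listed)


# Hypothetical sample data: the rootkit unlinked itself from the module list,
# but its sysfs directory is still present.
module_list = ["lime", "ipv6", "ext3"]
sysfs_dirs = ["lime", "ipv6", "ext3", "ipsecs_kbeast_v1"]

hidden = find_hidden_modules(module_list, sysfs_dirs)
print(hidden)
```

Running the same comparison naively on a live system (/proc/modules against /sys/module) would likely produce false positives, since /sys/module can also contain entries for components built into the kernel.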
The sysfs enumeration
code works by finding the module_kset variable,
of type kset, that holds all
information for /sys/module. The
plugin then walks each member of the kset’s
entry list which is of type kobject. Each of these kobject structures represents a module
and its subdirectory immediately under /sys/module.
The names of these directories are then gathered to be compared with the module
list names.
Hooking System Call Table
Effect on Forensics
System calls are the main mechanism for userland code to trigger event handling by the kernel. Reading and writing files, sending network data, spawning and exiting processes, etc. are all done through system calls. The system call table is an array of function pointers, in which each pointer corresponds to a system call handler (e.g. sys_read handles the read system call).
Rootkits often target this table due to the power it
gives them over the control flow of the running kernel. KBeast hooks a number of entries in this
table in order to hide files, processes, and more.
How KBeast Accomplishes it
During the initialization of its kernel module,
KBeast hooks the unlink, rmdir, unlinkat,
rename, open, kill, read, write, getdents, and delete_module system calls with its own handlers. These handlers
ensure that files and processes that start with the user-supplied prefix are
hidden and that they cannot be tampered with.
The overwritten kill
system call handler also acts as the mechanism that the rootkit provides in
order for userland processes to elevate privileges. All a userland process has
to do is send a signal with the backdoor signal value and the process will be
elevated. If you read our post yesterday, you
know that the Average Coder rootkit used a mechanism that allowed us to detect
elevated processes. Unfortunately, KBeast does not use this mechanism and
instead uses the proper interfaces provided by the kernel, namely prepare_creds and
commit_creds. This mechanism does not
produce any inconsistencies, so we cannot immediately find processes elevated
by KBeast.
How Volatility Detects This
Volatility detects all of these hooks by enumerating and verifying each entry in the system call table. This is implemented in the linux_check_syscall plugin, which, for every member of the system call table, either prints out the symbol name or, if it is hooked, prints out the hook address. Since there are anywhere from 300 to 400+ system calls on a normal Linux system, it is advisable to redirect the plugin output to a file and then grep for bad entries as shown here:
# python vol.py -f kbeast.lime --profile=LinuxDebianx86 linux_check_syscall > ksyscall
# head -10 ksyscall
Table Name Index      Address    Symbol
---------- ---------- ---------- ------------------------------
32bit      0x0        0xc103ba61 sys_restart_syscall
32bit      0x1        0xc103396b sys_exit
32bit      0x2        0xc100333c ptregs_fork
32bit      0x3        0xe0fb46b9 HOOKED
32bit      0x4        0xe0fb4c56 HOOKED
32bit      0x5        0xe0fb4fad HOOKED
32bit      0x6        0xc10b1b16 sys_close
32bit      0x7        0xc10331c0 sys_waitpid
# grep HOOKED ksyscall
32bit      0x3        0xe0fb46b9 HOOKED
32bit      0x4        0xe0fb4c56 HOOKED
32bit      0x5        0xe0fb4fad HOOKED
32bit      0xa        0xe0fb4d30 HOOKED
32bit      0x25       0xe0fb4412 HOOKED
32bit      0x26       0xe0fb4ebd HOOKED
32bit      0x28       0xe0fb4db1 HOOKED
32bit      0x81       0xe0fb5044 HOOKED
32bit      0xdc       0xe0fb4b9e HOOKED
32bit      0x12d      0xe0fb4e32 HOOKED
We can see in
the first output what some clean entries look like and that the system call table
index is reported along with the symbol name and address. For hooked entries,
we instead see HOOKED in place of a symbol name because the hooked function
points to an unknown address (in this case inside the rootkit’s module).
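For scripted triage, the grep step can be replaced by a small parser. The sketch below assumes the four-column row layout shown in the output above (table, index, address, symbol); the function name hooked_entries is made up for this example.

```python
def hooked_entries(lines):
    """Yield (index, address) integer pairs for rows whose symbol is HOOKED."""
    for line in lines:
        fields = line.split()
        # Expected row shape: <table> <index> <address> <symbol>
        if len(fields) == 4 and fields[3] == "HOOKED":
            yield int(fields[1], 16), int(fields[2], 16)


# A few rows copied from the plugin output above.
sample = """\
32bit 0x2 0xc100333c ptregs_fork
32bit 0x3 0xe0fb46b9 HOOKED
32bit 0x4 0xe0fb4c56 HOOKED
32bit 0x6 0xc10b1b16 sys_close
""".splitlines()

for index, address in hooked_entries(sample):
    print("syscall %d -> 0x%08x" % (index, address))
```

Having the addresses as integers makes it easy to check, for example, whether all hooks land in the same small address range, as they do here.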
The plugin only
prints the index of the system call entries and not a name because the system
call table varies widely across distributions and kernel versions, and
determining the name of each one requires the debug build of the kernel
(vmlinux). This may be incorporated into
future versions of the plugins, but will require additions to the current code
base, and in many cases the debug build is not made available by the
distribution package maintainers.
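Until the plugin prints names itself, an analyst can translate the hooked indexes by hand. The sketch below uses a small, hand-copied subset of the standard 32-bit x86 (i386) system call numbers; it assumes the analyzed kernel uses that standard numbering, and the helper name name_hooks is hypothetical.

```python
# Hand-copied subset of the 32-bit x86 (i386) system call numbers.
# Assumption: the analyzed kernel uses the standard i386 numbering.
I386_SYSCALLS = {
    3: "read", 4: "write", 5: "open", 10: "unlink", 37: "kill",
    38: "rename", 40: "rmdir", 129: "delete_module",
    220: "getdents64", 301: "unlinkat",
}

# Indexes reported as HOOKED by linux_check_syscall in this post.
hooked_indexes = [0x3, 0x4, 0x5, 0xa, 0x25, 0x26, 0x28, 0x81, 0xdc, 0x12d]


def name_hooks(indexes, table=I386_SYSCALLS):
    """Pair each index with its name, or 'unknown' if it is not in the table."""
    return [(index, table.get(index, "unknown")) for index in indexes]


for index, name in name_hooks(hooked_indexes):
    print("0x%x (%d): %s" % (index, index, name))
```

Under this numbering the ten hooked slots line up with the list of hooked calls given earlier, with the directory-listing hook landing on the getdents64 slot (0xdc).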
Hiding Network Connections
Effect on Forensics
The ability to hide network connections from
userland frustrates not only host investigators, but also network forensics
teams who wish to tie traffic back to a specific computer. The ease with which kernel modules can hide
information from userland makes a strong case for all incident response to be
based on offline memory captures and not on the output from tools running on
the live system.
How KBeast Accomplishes it
To hide network connections from netstat and the userland interfaces it
uses, KBeast hooks the show member of the tcp4_seq_afinfo sequence operation
structure. This structure is of type tcp_seq_afinfo and has members of type file_operations and of type seq_operations. Please refer to yesterday’s blog
post to learn about file_operations
structures. Sequence operation structures provide a generic mechanism to
display information inside of the /proc filesystem.
This structure has the members start,
show, next, stop, and the wrapping code provides handling of partial seeks,
buffered reads, and other complicated logic so that it only has to be
implemented once throughout the entire kernel.
Sequence operations structures are often targeted by
malware because they directly affect what is populated in /proc. By overwriting the show
member of such a structure, a rootkit can easily filter out entries it does
not want to appear in userland. KBeast
effectively hides its backdoored network connection by filtering the show member of the TCP4 structure. This technique is also used by many other
rootkits.
How Volatility Detects This
To detect KBeast’s overwriting of network sequence operation structures, the linux_check_afinfo plugin walks the file_operations and seq_operations structures of all UDP and TCP protocol structures, including tcp6_seq_afinfo, tcp4_seq_afinfo, udplite6_seq_afinfo, udp6_seq_afinfo, udplite4_seq_afinfo, and udp4_seq_afinfo, and verifies each member. This effectively detects any tampering with the interesting members of
these structures. The following output shows this plugin against the VM infected
with KBeast:
# python vol.py -f kbeast.lime --profile=LinuxDebianx86 linux_check_afinfo
Volatile Systems Volatility Framework 2.2_rc1
Symbol Name Member Address
----------- ------ ----------
tcp4_seq_afinfo show 0xe0fb9965
This plugin
reports and verifies that the show member
is indeed hooked and that the system is compromised.
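One way to picture this kind of verification is as a range check on each function pointer. The Python sketch below is a toy model and is not Volatility's implementation: the kernel text bounds and the start, next, and stop addresses are hypothetical values chosen for illustration, and only the show address is taken from the output above.

```python
def find_hooked_members(members, text_start, text_end):
    """Return (member, address) pairs whose pointer falls outside kernel text."""
    return [(name, address) for name, address in members
            if not (text_start <= address < text_end)]


# Hypothetical bounds; real values would come from the profile's symbols.
KERNEL_TEXT_START = 0xc1000000
KERNEL_TEXT_END = 0xc1300000

tcp4_members = [
    ("start", 0xc1234000),  # hypothetical clean handler
    ("next", 0xc1234100),   # hypothetical clean handler
    ("stop", 0xc1234200),   # hypothetical clean handler
    ("show", 0xe0fb9965),   # address reported by linux_check_afinfo
]

for name, address in find_hooked_members(
        tcp4_members, KERNEL_TEXT_START, KERNEL_TEXT_END):
    print("tcp4_seq_afinfo %s 0x%08x" % (name, address))
```

A real check would also need to account for handlers that legitimately live in known loaded modules rather than in the core kernel.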
Analyzing the Userland Backdoor
Effect on Forensics
The kernel module provides cover for the attacker by
hiding any processes or files that start with a user-defined prefix or any network
connection on a specified port. By default, the prefix is set to “_h4x_”, but
the rootkit’s README recommends changing it to something that is not so simple
to grep for. For this demo, I just left it as the default. The port number to hide is also a compile time configuration option
chosen by the user.
The userland backdoor consists of a simple
application that listens on the hidden network port, requires a password, and
then spawns a bash shell with the privileges of root if the password is
correct.
How KBeast Accomplishes it
As stated in the section on Hooking the System Call
table, these userland activities are hidden by hooking the system call table
and the sequence operations structure of TCP. Once connected to the backdoor,
the attacker can perform a wide range of attacks and post-compromise activity.
We will now focus on recovering this activity.
How Volatility Detects This
Fortunately for Volatility’s users, particularly
those with a baseline of the system they are analyzing or a copy of ps output from the infected system, finding
the hidden process is trivial. The output of linux_pslist can simply be compared with that of the baseline or ps. Since KBeast hides processes by
hooking the system call table, the process list is untouched and the hidden
process will be in Volatility’s output but not the others. In the case of my
infected image the _h4x_bd process
has a PID of 2777:
# python vol.py --profile=LinuxDebianx86 -f kbeast.lime linux_pslist -p 2777
Volatile Systems Volatility Framework 2.2_rc1
Offset     Name    Pid  Uid Start Time
---------- ------- ---- --- ----------
0xdf4cd5a0 _h4x_bd 2777 0   Wed, 12 Sep 2012 20:49:25 +0000
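The comparison against a baseline or live ps listing is easy to script. The sketch below assumes linux_pslist rows in the layout shown above (offset, name, pid, uid, start time); the function names and the init row are hypothetical.

```python
def parse_pslist(lines):
    """Map pid -> name from rows shaped like: <offset> <name> <pid> <uid> ..."""
    processes = {}
    for line in lines:
        fields = line.split()
        if len(fields) >= 4 and fields[0].startswith("0x") and fields[2].isdigit():
            processes[int(fields[2])] = fields[1]
    return processes


def hidden_processes(pslist_lines, live_pids):
    """Return {pid: name} for processes in memory that the live listing lacks."""
    live = set(live_pids)
    return {pid: name for pid, name in parse_pslist(pslist_lines).items()
            if pid not in live}


pslist_rows = [
    "0xdf000000 init 1 0 Wed, 12 Sep 2012 20:40:00 +0000",  # hypothetical row
    "0xdf4cd5a0 _h4x_bd 2777 0 Wed, 12 Sep 2012 20:49:25 +0000",
]
print(hidden_processes(pslist_rows, live_pids=[1]))
```

Processes that started or exited between the ps run and the memory capture will also show up as differences, so each hit still needs a manual look.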
Since we know the PID is 2777, we can then
investigate the rest of the application’s activities using Volatility. First,
we want to determine if any processes have the backdoor as a parent process. We
can use the linux_pstree plugin to
determine this and it will show us what programs were executed by the backdoor:
# python vol.py --profile=LinuxDebianx86 -f kbeast.lime linux_pstree
Volatile Systems Volatility Framework 2.2_rc1
Name Pid Uid
<snip>
._h4x_bd 2777 0
..bash 3053 0
...sleep 3077 0
<snip>
This plugin lists the parent/child relationship
between processes by adding a ‘.’ for each depth in the hierarchy. The displayed
portions of the output show us that the backdoor is active with a spawned bash
shell and that this shell ran the sleep
command. We can then use the linux_psaux plugin to display the
command line arguments of each of these processes and their start time:
# python vol.py --profile=LinuxDebianx86 -f kbeast.lime linux_psaux -p 2777,3053,3077
Volatile Systems Volatility Framework 2.2_rc1
Pid  Uid Arguments
2777 2   ./_h4x_bd Wed, 12 Sep 2012 20:49:25 +0000
3053 2   bash -i Thu, 13 Sep 2012 01:00:31 +0000
3077 2   sleep 100 Thu, 13 Sep 2012 01:02:22 +0000
In this output
we can see that bash was run in
interactive mode and that sleep was
passed a parameter of 100. In a real
incident response situation, this can determine what parameters were sent to a
wide range of tools used during post-compromise activity.
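On a busy system the linux_pstree output is long, so it helps to pull out only the processes nested beneath the backdoor. The sketch below relies on the dotted-depth convention described above; the function name descendants and the trailing cron row are hypothetical.

```python
def descendants(pstree_lines, root_pid):
    """Return (name, pid) rows nested beneath root_pid in dotted pstree output."""
    found = []
    root_depth = None
    for line in pstree_lines:
        fields = line.split()
        if len(fields) < 2 or not fields[0].startswith(".") or not fields[1].isdigit():
            continue  # header, <snip> marker, or malformed row
        name = fields[0].lstrip(".")
        depth = len(fields[0]) - len(name)
        pid = int(fields[1])
        if root_depth is None:
            if pid == root_pid:
                root_depth = depth
        elif depth > root_depth:
            found.append((name, pid))
        else:
            break  # back at the root's level, so its subtree has ended
    return found


rows = [
    "Name Pid Uid",
    "._h4x_bd 2777 0",
    "..bash 3053 0",
    "...sleep 3077 0",
    ".cron 2000 0",  # hypothetical sibling, not from the real output
]
print(descendants(rows, 2777))
```

The resulting PIDs can then be passed to linux_psaux with -p, as in the command above.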
Now that we know
a connection was active to the backdoor at the time of the compromise, we want
to recover the network connections associated with it. We can use the linux_netstat plugin with the backdoor’s
PID to accomplish this:
# python vol.py --profile=LinuxDebianx86 -f kbeast.lime linux_netstat -p 2777
Volatile Systems Volatility Framework 2.2_rc1
TCP 192.168.110.150:13377 192.168.110.140:41744 CLOSE_WAIT _h4x_bd/2777
TCP 0.0.0.0:13377 0.0.0.0:0 LISTEN _h4x_bd/2777
TCP 192.168.110.150:13377 192.168.110.140:41745 ESTABLISHED _h4x_bd/2777
This shows us that the backdoor is listening on port
13377 and that there is an active
connection from 192.168.110.140 on port 41745.
We also see a previous connection in the CLOSE_WAIT
state on port 41744. As we will see in a
future blog post on recovering network data, we could attempt to recover the
packets associated with these connections by using the linux_sk_buff_cache and linux_pkt_queues
plugins. Having the IP address and port
pairs also allows us to focus network forensics investigations on only the
streams associated with the communication channels of the malware.
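To hand these endpoints to a network forensics team, the linux_netstat rows can be reduced to a list of remote peers. The sketch below assumes the five-field TCP row layout shown above; the function name peers_for_port is made up for this example.

```python
def peers_for_port(lines, local_port):
    """Return (remote endpoint, state) for TCP rows on local_port, minus listeners."""
    peers = []
    for line in lines:
        fields = line.split()
        # Expected row shape: TCP <local ip:port> <remote ip:port> <state> <name/pid>
        if len(fields) != 5 or fields[0] != "TCP":
            continue
        local, remote, state = fields[1], fields[2], fields[3]
        if local.rsplit(":", 1)[-1] == str(local_port) and state != "LISTEN":
            peers.append((remote, state))
    return peers


netstat_rows = [
    "TCP 192.168.110.150:13377 192.168.110.140:41744 CLOSE_WAIT _h4x_bd/2777",
    "TCP 0.0.0.0:13377 0.0.0.0:0 LISTEN _h4x_bd/2777",
    "TCP 192.168.110.150:13377 192.168.110.140:41745 ESTABLISHED _h4x_bd/2777",
]
for remote, state in peers_for_port(netstat_rows, 13377):
    print(remote, state)
```

The remote address and port pairs are the values one would typically turn into capture filters when searching packet captures for the backdoor's streams.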
At this point, we have found the processes and
network activity associated with the backdoor, all of which would be hidden
from us on a live system, and are able to dig deep into the workings of the
process. Now our goal is to discover the hidden directory that the backdoor is
placed in, as the keylogging file is stored in the same directory. We can use linux_proc_maps for this:
# python vol.py --profile=LinuxDebianx86 -f kbeast.lime linux_proc_maps -p 2777
Volatile Systems Volatility Framework 2.2_rc1
0x8048000-0x8049000 r-x 0 8: 1 301353 /usr/_h4x_/_h4x_bd
0x8049000-0x804a000 rw- 4096 8: 1 301353 /usr/_h4x_/_h4x_bd
0xb75d7000-0xb75d8000 rw- 0 0: 0 0
0xb75d8000-0xb772d000 r-x 0 8: 1 513087 /lib/i686/cmov/libc-2.7.so
0xb772d000-0xb772e000 r-- 1396736 8: 1 513087 /lib/i686/cmov/libc-2.7.so
0xb772e000-0xb7730000 rw- 1400832 8: 1 513087 /lib/i686/cmov/libc-2.7.so
0xb7730000-0xb7733000 rw- 0 0: 0 0
0xb7739000-0xb773b000 rw- 0 0: 0 0
0xb773b000-0xb773c000 r-x 0 0: 0 0
0xb773c000-0xb7756000 r-x 0 8: 1 505267 /lib/ld-2.7.so
0xb7756000-0xb7758000 rw- 106496 8: 1 505267 /lib/ld-2.7.so
0xbf81b000-0xbf831000 rw- 0 0: 0 0 [stack]
And by looking
at the mapping starting at 0x8048000,
we see that our backdoor binary is loaded at that address and that its full
path is /usr/_h4x_/_h4x_bd. Since the
directory name has the hidden prefix, this directory would not show on a live
machine, and we would have to analyze a disk image to find it. Timelining
would be a good method to narrow down the results quickly.
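When the prefix is known, the mapped files that carry it can be picked out of linux_proc_maps output automatically. The sketch below assumes the path is the last whitespace-separated field of a row, so paths containing spaces would need extra handling; the function name hidden_paths is hypothetical.

```python
import posixpath


def hidden_paths(lines, prefix="_h4x_"):
    """Return the distinct mapped file paths that contain the hidden prefix."""
    paths = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        path = fields[-1]
        if path.startswith("/") and prefix in path and path not in paths:
            paths.append(path)
    return paths


# Rows copied from the linux_proc_maps output above.
maps_rows = [
    "0x8048000-0x8049000 r-x 0 8: 1 301353 /usr/_h4x_/_h4x_bd",
    "0x8049000-0x804a000 rw- 4096 8: 1 301353 /usr/_h4x_/_h4x_bd",
    "0xb75d8000-0xb772d000 r-x 0 8: 1 513087 /lib/i686/cmov/libc-2.7.so",
    "0xbf81b000-0xbf831000 rw- 0 0: 0 0 [stack]",
]
for path in hidden_paths(maps_rows):
    print(posixpath.dirname(path), path)
```

The directory component is the location to examine first in a disk image, since the keylogging file is stored alongside the backdoor.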
We can partially
recover the backdoor binary by using the linux_dump_map
command:
# python vol.py --profile=LinuxDebianx86 -f kbeast.lime linux_dump_map -p 2777 -s 0x8048000 -O h4xbd
This invocation focuses on PID 2777 (the network backdoor), selects the mapping that starts at 0x8048000, and tells the plugin to write that mapping to the h4xbd file. This will only partially recover the file, though, as the binary is not loaded directly from disk into the process’s memory and instead its sections are spread throughout the address space. We can verify this with the file and readelf commands:
# file h4xbd
bin22: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), statically linked (uses shared libs), stripped
# readelf -s h4xbd
readelf: Error: Unable to read in 0x28 bytes of section headers
readelf: Error: Unable to read in 0x5a0 bytes of section headers
readelf: Error: Unable to read in 0xd0 bytes of dynamic section
Note that the file command sees it as an ELF file, but readelf is unable to process the file. To recover the file intact, we need to acquire it from the page cache using the linux_find_file plugin. This is because the page cache holds all the physical pages backing a file in memory without any modifications.
# python vol.py --profile=LinuxDebianx86 -f kbeast.lime linux_find_file -F "/usr/_h4x_/_h4x_bd"
Volatile Systems Volatility Framework 2.2_rc1
Inode Number     Inode
---------------- ----------
301353           0xd606ea70
We then recover
the file with another invocation of linux_find_file:
# python vol.py --profile=LinuxDebianx86 -f kbeast.lime linux_find_file -i 0xd606ea70 -O h4xbd
Now when we run readelf, we get much better results:
# readelf -s h4xbd | head -15
Symbol table '.dynsym' contains 25 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 00000000 0 NOTYPE LOCAL DEFAULT UND
1: 00000000 220 FUNC GLOBAL DEFAULT UND signal@GLIBC_2.0 (2)
2: 00000000 0 NOTYPE WEAK DEFAULT UND __gmon_start__
3: 00000000 112 FUNC GLOBAL DEFAULT UND write@GLIBC_2.0 (2)
4: 00000000 55 FUNC GLOBAL DEFAULT UND listen@GLIBC_2.0 (2)
5: 00000000 44 FUNC GLOBAL DEFAULT UND setsid@GLIBC_2.0 (2)
6: 00000000 441 FUNC GLOBAL DEFAULT UND __libc_start_main@GLIBC_2.0 (2)
7: 00000000 14 FUNC GLOBAL DEFAULT UND htons@GLIBC_2.0 (2)
8: 00000000 112 FUNC GLOBAL DEFAULT UND read@GLIBC_2.0 (2)
9: 00000000 210 FUNC GLOBAL DEFAULT UND perror@GLIBC_2.0 (2)
10: 00000000 108 FUNC GLOBAL DEFAULT UND accept@GLIBC_2.0 (2)
11: 00000000 55 FUNC GLOBAL DEFAULT UND socket@GLIBC_2.0 (2)
# readelf -s h4xbd | wc -l
131
This shows us that the symbol table is intact and that readelf produced 131 lines of symbol output. (Thanks to the malware author for not stripping his bins ;). In fact, if we hash the recovered binary from memory and the backdoor binary on the infected VM, the hashes will match exactly.
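The hash comparison can be done with any hashing tool; a minimal Python sketch follows. SHA-256 is an arbitrary choice here, and both the helper names and the second path in the usage comment are hypothetical.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_file(path_a, path_b):
    """True when both files have identical contents according to SHA-256."""
    return sha256_of(path_a) == sha256_of(path_b)


# Example usage (the second path is a hypothetical copy taken from the VM's disk):
#   same_file("h4xbd", "/mnt/evidence/usr/_h4x_/_h4x_bd")
```

A match gives confidence that the copy recovered from the page cache is byte-for-byte the binary the attacker installed.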
As a final step, we will quickly perform binary
analysis of the binary recovered from memory. Since the password, hidden port,
secret signal number, etc. are all compile-time options, they will be different
per instance of the sample, but can be recovered with simple reverse
engineering. To start this process, we find symbols from the binary that may be
interesting, by using nm and
filtering for functions (code).
# nm h4xbd | grep -wi "t"
08048b70 t __do_global_ctors_aux
08048770 t __do_global_dtors_aux
08048b6a T __i686.get_pc_thunk.bx
08048b00 T __libc_csu_fini
08048b10 T __libc_csu_init
08048b9c T _fini
08048584 T _init
08048740 T _start
08048906 T bindshell
0804881d T enterpass
080487f4 T error_ret
080487d0 t frame_dummy
08048ace T main
From this output, the functions bindshell and enterpass look interesting. If we load the binary into gdb and disassemble enterpass, we notice a few things:
# gdb -q h4xbd
Reading symbols from /root/h4xbd...done.
(gdb) set disassembly-flavor intel
(gdb) disassemble enterpass
Dump of assembler code for function enterpass:
0x0804881d <enterpass+0>: push ebp
0x0804881e <enterpass+1>: mov ebp,esp
0x08048820 <enterpass+3>: sub esp,0x68
0x08048823 <enterpass+6>: mov DWORD PTR [ebp-0x8],0x8048ea8 <--- banner string
0x0804882a <enterpass+13>: mov DWORD PTR [ebp-0x4],0x8048ec9 <--- another banner string
<…snip…>
0x08048892 <enterpass+117>: mov DWORD PTR [esp+0x8],0x5
0x0804889a <enterpass+125>: mov DWORD PTR [esp+0x4],0x8048ee6 <---- hardcoded address of password
0x080488a2 <enterpass+133>: lea eax,[ebp-0x48]
0x080488a5 <enterpass+136>: mov DWORD PTR [esp],eax
0x080488a8 <enterpass+139>: call 0x8048714 <strncmp@plt> <---- strncmp call
<…snip…>
What becomes immediately apparent is that we have a call to strncmp at 0x080488a8, which is likely where the password check is performed, and that we see other hardcoded strings in the address range of 0x8048eXX. At address 0x0804889a, we can see one of these strings being placed on the stack as a parameter to the string comparison call. If we investigate these addresses, we see that the password (“h4x3d”) is contained in cleartext and that the other strings in the same memory region contain the backdoor’s login banner, debug information, the hidden directory “/usr/_h4x_”, and other interesting information.
(gdb) x/s 0x8048ee6
0x8048ee6: "h4x3d"
(gdb) x/30s 0x8048e00
0x8048e7d: ""
0x8048e7e: ""
0x8048e7f: ""
0x8048e80: "ERROR! Error occured on your system!"
0x8048ea5: ""
0x8048ea6: ""
0x8048ea7: ""
0x8048ea8: "Password [displayed to screen]: "
0x8048ec9: "<< Welcome To The Server >>\n"
0x8048ee6: "h4x3d"
0x8048eec: "Wrong!\n"
0x8048ef4: "socket"
0x8048efb: "bind"
0x8048f00: "listen"
0x8048f07: "Daemon running with PID = %i\n"
0x8048f25: "/usr/_h4x_"
0x8048f30: "/bin/bash"
If we analyze
the bindshell function, we
find more configuration information about the particular KBeast instance:
(gdb) disassemble bindshell
Dump of assembler code for function bindshell:
0x08048906 <bindshell+0>: push ebp
0x08048907 <bindshell+1>: mov ebp,esp
0x08048909 <bindshell+3>: sub esp,0x58
0x0804890c <bindshell+6>: mov WORD PTR [ebp-0x24],0x2
0x08048912 <bindshell+12>: mov DWORD PTR [esp],0x3441 <-- the backdoor port
0x08048919 <bindshell+19>: call 0x8048624 <htons@plt>
<snip>
0x080489d9 <bindshell+211>: mov DWORD PTR [esp],0x8048f25 <- the hidden directory
0x080489e0 <bindshell+218>: call 0x80486b4 <chdir@plt>
<snip>
(gdb) x/s 0x8048f25
0x8048f25: "/usr/_h4x_"
At this point we
have done a fairly thorough job of analyzing the rootkit and can perform very
effective analysis against it. If
needed, we could even write Volatility plugins to automatically recover the
configuration parameters directly from memory.
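As a starting point for such automation, the two manual steps above (reading the strings region and decoding the immediate passed to htons) can be sketched in plain Python. This is not a Volatility plugin, and the helper names and the sample bytes are assumptions made for illustration.

```python
import re
import struct


def printable_strings(data, min_length=4):
    """Return runs of printable ASCII at least min_length long, like strings(1)."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.decode("ascii") for match in pattern.findall(data)]


def port_from_immediate(hex_text):
    """Convert an immediate as shown in the disassembly (e.g. '0x3441') to decimal."""
    return int(hex_text, 16)


# Hypothetical bytes standing in for the string region of the recovered binary.
region = b"\x00\x00h4x3d\x00Wrong!\n\x00/usr/_h4x_\x00/bin/bash\x00ab\x00"
print(printable_strings(region))

port = port_from_immediate("0x3441")
print(port)
# htons would place this value on the wire in network (big-endian) byte order.
print(struct.pack("!H", port))
```

The value 0x3441 is 13377 in decimal, which matches the listening port seen in the linux_netstat output earlier.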
Conclusion
We have thoroughly
investigated the KBeast rootkit, including its internals, artifacts left on a
system, and interactions with the attackers who place it on a system. This includes hooking the system call table,
overwriting network operation structures, and allowing “stealth” access to the
compromised computer over the network.
In next week’s
Linux posts, we will analyze another rootkit, Jynx, which requires more plugins to analyze, and we will have a
blog post on analyzing network information with Volatility. If you have any questions or comments please
use the comment section of the blog or you can find me on Twitter (@attrc).