Friday, May 31, 2013

MoVP II - 3.3 - Automated Linux/Android Bash History Scanning

Recovering bash command history from Linux and Android memory dumps just got a lot easier.

In previous releases of Volatility, extracting commands and the associated timestamps was possible, but with one caveat - you needed to know the offset into the /bin/bash binary where a pointer to the start of the command history list existed. As described in the linux_bash documentation, you could get the offset in a few ways - by using gdb on the live system or analyzing /bin/bash with IDA. Unfortunately, one of those ways requires live access to the victim machine (which you may not have) and the other requires reverse engineering experience.

Starting with Volatility 2.3, the linux_bash command doesn't need any help. It scans through the heap of bash processes and automatically finds command histories. This is a major enhancement to the plugin and brings it a lot closer to the practical world of memory forensics. You get a whole lot more by doing less.

Glancing Back 

Here's a look at the linux_bash output prior to Volatility 2.3. As you can see, the -H/--history-list argument is required.

$ python vol.py -f avgcoder.mem --profile=LinuxCentOS63x64 linux_bash --history-list=0x6ed4a0
Volatile Systems Volatility Framework 2.2_rc1
Command Time         Command
-------------------- -------
#1376083693          dmesg | head -50
#1376085088          df
#1376085110          dmesg | tail -50
#1376085118          sudo mount /dev/sda1 /mnt
#1376085122          cd /mnt

If the history list offset varies per system, and you need to know it in advance before analyzing a memory dump, what options do you have? As mentioned earlier, one way is to load /bin/bash in gdb and disassemble the history_list function.

mhl@ubuntu:~$ gdb /bin/bash 
GNU gdb (Ubuntu/Linaro 7.4-2012.02-0ubuntu2) 7.4-2012.02
Reading symbols from /bin/bash...(no debugging symbols found)...done.

(gdb) disassemble history_list
Dump of assembler code for function history_list:
   0x00000000004a5030 <+0>:  mov    0x248469(%rip),%rax  # 0x6ed4a0
   0x00000000004a5037 <+7>:  retq   
End of assembler dump.

The number you see in the comment (0x6ed4a0) is the offset you pass to the plugin as the --history-list value. As previously stated, however, running gdb on a live system or reversing the binary in IDA is not always the optimal solution, especially for forensic investigators who don't have debugging and reverse engineering skills.
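If your disassembler does not print that comment, the same value can be worked out by hand. The short sketch below is only an illustration using the numbers from the gdb listing above. It assumes x86-64 RIP-relative addressing, where the displacement is added to the address of the next instruction. The variable names are made up for this example.

```python
# Values copied from the gdb disassembly of history_list.
instruction_address = 0x4A5030  # address of the mov instruction (<+0>)
instruction_length = 7          # the following retq starts at <+7>
displacement = 0x248469         # RIP-relative displacement in the mov operand

# RIP-relative operands are resolved against the address of the NEXT instruction.
next_instruction = instruction_address + instruction_length
history_list_offset = next_instruction + displacement

print(hex(history_list_offset))  # 0x6ed4a0
```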

Moving Forward

Yesterday, we discussed the new Linux volshell command, which you can use when actively developing a new plugin or an extension/improvement to an existing plugin. For example, before we can scan for history commands we have to know what a history structure looks like. You can find out like this:

$ python vol.py -f avgcoder.mem --profile=LinuxCentOS63x64 linux_volshell
Volatile Systems Volatility Framework 2.3_alpha
Current context: process init, pid=1 DTB=0x366ec000
Welcome to volshell! Current memory image is:
To get help, type 'hh()'
>>> dt("_hist_entry")
'_hist_entry' (24 bytes)
0x0   : line              ['pointer', ['String', {'length': 1024}]]
0x8   : timestamp         ['pointer', ['String', {'length': 1024}]]
0x10  : data              ['pointer', ['void']]

What you see is a 24-byte structure with a member at offset 0 named line. This is a pointer to a string, which is the actual command entered. There is also a timestamp at offset 8, which is rather distinctive: it's a string of digits (representing epoch seconds) preceded by a pound/hash (#) character. If we knew, for example, that all commands entered started with a common value such as "sudo", we could easily scan memory for all instances of "sudo" and then print them. But alas, that's theoretical at best. We can, however, rely on the fact that all timestamps start with a # character followed by at least 10 digits (unless you're dealing with a memory dump from before September 2001, when epoch time first reached 10 digits, which is highly unlikely).
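To make the timestamp signature concrete, here is a minimal standalone sketch (Python 3, not the plugin's own code) that checks whether a byte string looks like a bash history timestamp and converts it to UTC. The names TIMESTAMP_RE and parse_timestamp are hypothetical and exist only for this illustration.

```python
import re
from datetime import datetime, timezone

# '#' followed by at least 10 digits, as described in the text.
TIMESTAMP_RE = re.compile(rb"#(\d{10,})")


def parse_timestamp(raw):
    """Return a UTC datetime for a bash history timestamp, or None."""
    match = TIMESTAMP_RE.fullmatch(raw)
    if match is None:
        return None
    return datetime.fromtimestamp(int(match.group(1)), tz=timezone.utc)


print(parse_timestamp(b"#1376083693"))  # 2013-08-09 21:28:13+00:00
print(parse_timestamp(b"#!/bin/sh"))    # None
```

The first value is the timestamp of the "dmesg | head -50" command from the earlier output. The second shows that a stray # character in memory is rejected when digits do not follow it.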

So the plan of action becomes:

  1. Scan the heap of all running /bin/bash instances, or all processes period if --scan-all is supplied. The --scan-all option allows you to ignore the process name, in case an attacker copied a /bin/bash shell to /tmp/a and then entered commands. Furthermore, since we're only scanning the heap of the process, it's much quicker than a whole process address space scan.
  2. Look for # characters in heap segments. With the address in process memory for each # character, do a second scan for pointers to that address elsewhere on the heap. The goal is to find the timestamp member of the _hist_entry structure. We're essentially linking up data with pointers to the data.
  3. With each potential timestamp, we subtract 8 bytes (since it exists at offset 8 of the structure). That should give us the base address of the _hist_entry. Now we can associate any other members of _hist_entry (in particular the line member) with the timestamp.
  4. Once the scan is finished, collect all _hist_entry structures and place them in chronological order by timestamp. Then report the results.
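The four steps can be tried out on a toy buffer without Volatility. The sketch below is not the plugin's implementation. It assumes a 64-bit little-endian layout matching the 24-byte _hist_entry shown earlier, and it fakes a heap as a single bytes object. HEAP_BASE, build_toy_heap, read_cstring and scan_history are hypothetical names invented for this example.

```python
import struct

HEAP_BASE = 0x1000000  # hypothetical virtual address of the toy heap


def build_toy_heap(commands):
    """Lay out command/timestamp strings, then 24-byte _hist_entry records."""
    heap = bytearray()
    pointers = []
    for epoch, line in commands:
        line_addr = HEAP_BASE + len(heap)
        heap += line.encode("ascii") + b"\x00"
        ts_addr = HEAP_BASE + len(heap)
        heap += b"#%d" % epoch + b"\x00"
        pointers.append((line_addr, ts_addr))
    heap += b"\x00" * (-len(heap) % 8)  # align the records to 8 bytes
    for line_addr, ts_addr in pointers:
        # line at offset 0, timestamp at offset 8, data at offset 0x10
        heap += struct.pack("<QQQ", line_addr, ts_addr, 0)
    return bytes(heap)


def read_cstring(heap, addr):
    """Read a NUL-terminated string at a virtual address, or return None."""
    start = addr - HEAP_BASE
    if not 0 <= start < len(heap):
        return None
    end = heap.find(b"\x00", start)
    if end == -1:
        return None
    return heap[start:end].decode("ascii", "replace")


def scan_history(heap):
    """Find '#' strings, find pointers to them, rebuild entries, sort by time."""
    found = []
    pos = heap.find(b"#")
    while pos != -1:
        ts_addr = HEAP_BASE + pos
        stamp = read_cstring(heap, ts_addr)
        if stamp is not None and len(stamp) >= 11 and stamp[1:].isdigit():
            needle = struct.pack("<Q", ts_addr)
            ptr = heap.find(needle)
            while ptr != -1:
                entry = ptr - 8  # timestamp member lives at offset 8
                if entry >= 0:
                    (line_addr,) = struct.unpack_from("<Q", heap, entry)
                    line = read_cstring(heap, line_addr)
                    if line is not None:
                        found.append((int(stamp[1:]), line))
                ptr = heap.find(needle, ptr + 1)
        pos = heap.find(b"#", pos + 1)
    return sorted(found, key=lambda item: item[0])


# Commands stored out of order on purpose; the scan sorts them by timestamp.
toy = build_toy_heap([
    (1376085088, "df"),
    (1376083693, "dmesg | head -50"),
    (1376085118, "sudo mount /dev/sda1 /mnt"),
])
for epoch, line in scan_history(toy):
    print(epoch, line)
```

A real scan has to walk heap mappings through the process address space and cope with unreadable pages, which this toy version skips entirely.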

If you want to explore the steps from a code perspective, check out the source file here.

Getting More by Doing Less

Here's an example of the described algorithm in action. Notice we don't have to supply any parameters to the linux_bash plugin anymore:

$ python vol.py -f avgcoder.mem --profile=LinuxCentOS63x64 linux_bash
Volatile Systems Volatility Framework 2.3_alpha
Pid      Name         Command Time                   Command
-------- ------------ ------------------------------ -------
    2738 bash         2013-08-09 21:28:13 UTC+0000   dmesg | head -50
    2738 bash         2013-08-09 21:51:28 UTC+0000   df
    2738 bash         2013-08-09 21:51:50 UTC+0000   dmesg | tail -50
    2738 bash         2013-08-09 21:51:58 UTC+0000   sudo mount /dev/sda1 /mnt

Important Reminders 

There are a few things you should note about the linux_bash plugin which aren't immediately obvious:

  • The plugin works flawlessly on Android memory dumps also
  • If you supply the -P/--printunalloc parameter, the plugin will print potentially unallocated entries (can be verbose)
  • Although the default mode is to use the brute force scanning approach, if you know the --history-list value, you can still supply it. The advantage of using the list head pointer is that you see commands in exactly the same order they were entered by an attacker. The brute force scanning approach is a little different. If multiple commands were entered within the same second, there's no way to determine which of them was entered first.
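The ordering caveat in the last point is easy to see with a few lines of Python. This is only an illustration: the recovered list is hypothetical data in which two commands were given the same epoch second, and the variable names are made up for the example.

```python
from collections import defaultdict

# (epoch seconds, command) pairs in the order a scan happened to find them.
# Hypothetical data: two entries share the second 1376085118.
recovered = [
    (1376085118, "cd /mnt"),
    (1376085088, "df"),
    (1376085118, "sudo mount /dev/sda1 /mnt"),
]

# Sorting by timestamp is stable, so ties keep their discovery order,
# which is not necessarily the order in which they were typed.
by_time = sorted(recovered, key=lambda entry: entry[0])

same_second = defaultdict(list)
for epoch, command in by_time:
    same_second[epoch].append(command)

ambiguous = {epoch: cmds for epoch, cmds in same_second.items() if len(cmds) > 1}
for epoch, cmds in ambiguous.items():
    print(epoch, "order unknown:", cmds)
```

Here the sort places "cd /mnt" before the mount command simply because it was found first. Only the history list itself preserves the true sequence.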


Recovering commands and timestamps from memory dumps is one of the most powerful investigation techniques. What more could you ask for than the ability to figure out exactly what an attacker did on the machine? In the new Volatility 2.3 release, this task is even easier: it doesn't require any debugging or reversing, and it now converts the epoch seconds to human-readable timestamps (you can also set the timezone with the --tz option if you're generating timelines from artifacts in memory).


  1. Can an attacker disable history collection in bash by prefixing commands with spaces?

  2. No, it's not the command that's prefixed with a '#', it's the timestamp. So regardless of whether the command is "ifconfig" or " ifconfig" (with a space), the timestamp is still "#1234567890", for example. Other anti-forensic tricks include setting HISTSIZE=0 or redirecting the history to /dev/null.