[Archive of Volatility Labs]: 2015

Monday, November 30, 2015

Guest Post: Martin Korman (VolatilityBot - An Automated Malicious Code Dumper)

This is a guest post from Martin Korman, author of VolatilityBot.

Lately, I've found myself manually unpacking different versions of the same malware in order to perform static analysis with IDA and BinDiff. Therefore, I've decided to write a small system that will automate the entire process – the VolatilityBot.

How does VolatilityBot work?

It executes the malware on a VM.
It waits for a pre-defined period of time.
It suspends the VM.
It compares the snapshot to a golden image of the VM, finds new processes, injected code, loaded DLLs or Kernel Modules, dumps them from the memory and fixes the PE file in order to make static analysis easier.

All metadata is saved to a SQLite DB. Dumps are saved to a configured storage path on disk. All PE files pass a short static analysis and reports are stored as well. VolatilityBot can theoretically manage an unlimited quantity of virtual machines, depending on the performance of your host machine.

Some core capabilities of this automation tool were designed to make it as scalable as possible and easy to work with over time:

The Bot-Excavator's Manager can handle an unlimited number of machines at once, all depending on the performance of the researcher's equipment.
Automatic "Golden Image" generation is provided in order to ease the process of creating all "Golden Image" data. Just create your configuration file, and execute the script.
To simplify scalability, any database supported by SQLalchemy can be set or replaced as needed. The default Bot-Excavator's backend saves data to an SQLite database.
Memory dumps and data from all configured post processing modules are saved to a predefined storage directory.
Add tags to each execution of the script for quick visual reference to information you may need later.
Add dynamic tags to post-processing modules. For example, for malware that loads a kernel mode driver, tag it with "loads_kmd".Volatility Bot-Excavator's modular structure makes it easy to add additional modules and post-processors that may be deemed necessary with time.

Let's dive deeper into the architecture of VolatilityBot. The Bot-Excavator is made of four (4) major components:

1. The Manager. This is the core of the Bot-Excavator tool. The Manager executes the automatic extraction as well as the post-processing modules. This module also controls the associated machines' activity to streamline the workflow.

2. Machines Module. This is an abstract design of a research machine; it contains five functions: Revert, Start, Suspend, Clean-up and Get memory path. Each machine has a minimal python agent running, that listens and waits for a malware sample. When the sample is sent, it executes it by double clicking the sample, using an AutoIt script.

The agent is really minimal in order to not affect the behavior of the malware in the machine. The agent does not perform any API hooking and does not control the machine in any form besides executing the malware. The machines are controlled and monitored by the manager component, which knows not to send more analyses to a machine if its busy or if it encountered any errors analyzing the previous sample.

3. Code Extractors. This component consists of a number of modules grouped together for the purpose of extracting all the different malicious code components from memory. These are separate for code injections, new processes, etc. This component is modular and allows researchers to write new code for other extractors they may need.

These are the existing modules for this component:

injected_code – uses the Volatility "malfind" plugin in order to find suspect memory areas, after dumping them it tries to determine if the section contains a valid PE or valid shell code. If it found a valid PE, it fixes PE the header. Either way it will extract strings and execute YARA on the section dumped from memory.
module_scan - Uses the Volatility "modscan" plugin to load kernel modules. It detects and dumps newly loaded kernel modules.
create_process_dump - Uses the Volatility "procdump" and "pslist" plugins to dump new processes created by malware. Executes YARA, and extracts malware strings.
create_process_dump_as - Uses the Volatility "procdump" plugin to dump address space. Executes YARA, and extracts malware strings.
hooks – Extract API hooks done by malware (User-mode and Kernel-mode)

4. Post-Processing Modules. The post-processing modules kick in after the extraction is complete. They are then tasked with automated actions like fixing the PE, or availing the resulting elements to static analysis, YARA scans, strings and IP address logging etc.

This module type can help with:

Executing the configured YARA rules on post-extraction input as defined by the researcher.
Extracting strings from the input defined. Other modules can be layered on top in order to extract IP addresses, URLs, etc.
Producing a report for static analysis with basic PE analysis of the input file.

Because the malware samples are executed in a virtual machine, in order to avoid VM detection, a few tricks were used:

Registry keys cleanup (all VMware stuff I don't think there is a need to describe, as there's a lot of information on the internet regarding this issue).
A macro that moves the mouse and executes the malware.
Of course, no VMware tools on the machine.

Documentation and installation instructions are available here:
https://bitbucket.org/martink90/volatilitybot_public/downloads/Documentation_Nov_2015.pdf

Source code is available here: https://bitbucket.org/martink90/volatilitybot_public

My VB2015 slides are available here: https://www.virusbtn.com/pdf/conference_slides/2015/Korman-VB2015.pdf

For any questions, feel free to reach me via twitter (@MartinKorman), or open an issue in the project's Bit Bucket.

Friday, November 6, 2015

PlugX: Memory Forensics Lifecycle with Volatility

At OSDFCon last week, we discussed a case study showing how we identified manipulated memory artifacts in an infected environment. We were then able to rapidly introduce new capabilities to Volatility that could be used proactively in other environments.

The presentation (hosted on prezi) takes you on a tour of this activity, using a series of videos. Unfortunately, there's no audio (you should have attended the conference!) but it should be fairly self explanatory what's going on. Here's a short description of what the presentation covers:

A recent PlugX variant penetrated the target's network. It installed a service named RasTls for persistence and then hid the service from all live tools (WMI, Power Shell, services.msc, sc query, etc).
It injected code into a process and then erased its PE header to evade detection and make dumping the PE file out of memory a bit more complicated.
We identified the injected region with malfind, extracted it with Volatility, then used a hex editor and PE editor to reconstruct the PE file format.
We leveraged the impscan plugin to label API calls inside IDA Pro, then reverse engineered the binary to figure out how it's rootkit component was working. It was clearly targeting the double-linked list of services in usermode space of services.exe.
Next we wrote a new Volatility plugin to automatically detect unlinked services, by comparing the records in memory with the memory-resident registry hives.

The second part of our talk covered the new features in Volatility 2.5 (including the unified output, community integration, and support for Windows 10, Mac El Capitan, and Linux kernels up to 4.2.3) and all of the submissions to this year's Volatility Plugin Contest.

You can find the slide deck for our talk here (pdf).

Thursday, October 29, 2015

Results from the 2015 Volatility Plugin Contest are in!

The competition this year was fierce! We received 12 plugins to the contest. Similar to last year, ranking the submissions was one of the hardest things we’ve had to do. Each plugin is unique in its own way and introduces a capability to open source memory forensics that didn’t previously exist. Although a few people will receive prizes for their work, the real winners in this contest are the practitioners and investigators in the community that perform memory forensics.

Needless to say, we're very proud of everyone who submitted to the contest.

Here are this year’s rankings:

Fred House, Andrew Davis, and Claudiu Teodorescu for the shimcachemem plugin.
James Habben for the Evolve web interface to Volatility.
Philip Huppert for the VM migration address space.
Ying Li for the linux python strings and SSH agent plugins.
Adam Bridge for the NDIS packet scanning plugin.

Here is a detailed summary of the submissions. If you have feedback for the authors, we're sure they'd love to hear your thoughts.

1st: Shimcache Memory Scan

This plugin by FireEye and Mandiant researchers Fred House, Andrew Davis, and Claudiu Teodorescu parses the Windows Application Compatibility Database (aka, ShimCache) from the module or process memory that contain the database. In the authors' own words:

Shim cache is a highly valuable forensic artifact used to identify evidence of file execution. In addition to recording potential file executions, the cache is ordered, meaning that an analyst can identify other files that may have executed before or after a file of interest.

Most forensic tools that parse the shim cache rely on the cache stored in the Windows registry. The cache in the registry is only updated when a system is shutdown so this approach has the disadvantage of only parsing cache entries since the last shutdown. On systems that are not rebooted regularly (e.g., production servers) an analyst must either use out-of-date shim cache data or request a system reboot.

This plugin parses the shim cache directly from the module or process containing the cache, thereby providing analysts access to the most up-to-date cache. The plugin supports Windows XP SP2 through Windows 2012 R2 on both 32 and 64 bit architectures.

Fred's Twitter: @0xF2EDCA5A

Claudiu's Twitter: @cteo13

FireEye's blog post: https://www.fireeye.com/blog/threat-research/2015/10/shim_shady_live_inv.html

Shimcachemem Submission: https://github.com/volatilityfoundation/community/tree/master/ShimcacheMemory

2nd: James Habben: Evolve Web Interface

This submission provides a web interface to Volatility built with AJAX, jQuery, and JSON. It allows the user to run several plugins at once and leverage the power of multi-processing. It shows the plugins' status and places the completed plugins near the top for easy access. The output is easily searchable given its sqlite3 backend and there is a "morph" option as well for highlighting interesting artifacts. In fact, users can create their own morphs using a plugin-based template system. Existing morphs include associating country codes with IP addresses and marking filenames that do not match the NSRL list.

James' Submission: https://github.com/JamesHabben/evolve

James' Twitter: @JamesHabben
James' GitHub: https://github.com/jameshabben
Evolve Videos: https://youtu.be/55G2oGPQHF8 and https://youtu.be/mqMuQQowqM

3rd: Philip Huppert: VM Live Migration

This submission includes an address space plugin for accessing memory samples found in network data captured during a VMotion live migration. In particular, it currently supports analysis of samples collected during VMotion migrations between ESXi hosts. This allows an analyst to access the entire runtime state of a virtual machine as it is being transferred between two physical hosts over the network. This has a number of valuable applications toward virtual machine introspection and forensics investigations of cloud environments.

Philip's plugin represents interesting implications toward VM's that migrate between different cloud providers or countries. Wireshark has the capability to analyze the traffic protocols but not the payload of the traffic, so this is a new and exciting capability for both forensic researchers and offensive security analysts.

Philip's Submission: https://github.com/volatilityfoundation/community/tree/master/PhilipHuppert

Philip's Twitter: @oilheap

Philip's GitHub: https://github.com/Phaeilo

4th: Ying Li: Python Strings and SSH Keys

This submission includes a collection of plugins for analyzing memory samples acquired from 64-bit Linux systems. The plugins were initially presented at PyCon 2015. The first plugin in the submission, linux_python_strings, extracts memory resident Python strings from the heap of Python processes. This is accomplished by scanning the heap for Python string objects. The advantage of this approach is that we are able to leverage context that the interpreter has about the particular string and how it is being used. The author also includes a plugin for extracting strings stored in Python dictionaries. The context provided by this plugin could allow an analyst to determine which strings were associated as key-value pairs. The final plugin, linux_ssh_keys, will extract RSA keys from the heap of ssh-agent processes. These keys could be useful when investigating lateral movement or performing an investigation of a suspected person's machine.

Ying's plugins continue the trend of moving further up the analysis stack into the application to extract more memory resident context. As for RSA keys, extracting crypto artifacts is always interesting. To date, we haven't seen anyone extract the RSA keys from ssh-agents with Volatility. Additionally, Ying created an automated test harness for generating memory samples, which we thought was particularly useful.

Ying's Submission: https://github.com/volatilityfoundation/community/tree/master/YingLi

Ying's Twitter: @cyli
Ying's GitHub: https://github.com/cyli
Ying's PyCon 2015 Talk: https://www.youtube.com/watch?v=tMKXcc2-xO8

5th: Adam Bridge: NDIS Packet Scan

This submission carves packets and ethernet frames from NDIS shared memory sections (regions of RAM shared between the OS and DMA NIC). Although tools exist to scan an arbitrary binary file for packets, the extra context that Adam's plugin provides is extremely valuable. For example, the methodology is far less likely to produce false positives or report fake/decoy packets. In addition to outputting text and pcap formatted results, the plugin also decodes NetBIOS names found in DNS traffic and extracts slack space between packet payloads that other tools may miss. This research represents the beginning of a very exciting realm of future plugins that focus on NDIS private data structures including data waiting in sent/received buffers.

Adam's Submission: https://github.com/volatilityfoundation/community/tree/master/AdamBridge

Adam's Twitter: @bridgeythegeek
Adam's GitHub: https://github.com/bridgeythegeek

The following submissions appear in the order they were received. As previously mentioned, everyone succeeded in solving a specific problem that they (and undoubtedly others) faced. For this, they deserve huge props. We look forward to seeing future work by these authors!

Joe Greenwood: Hacking Team RCS Attribution

This plugin searches a memory dump for evidence of the Hacking Team Galileo Remote Control System (RCS), and attempts to attribute the infection to particular Hacking Team client. One of our favorite aspects of this plugin is that it detects RCS based on its predictable named shared memory sections, a creative alternative to scanning for the typical byte signatures and mutexes. Its also a really nice touch to provide the extra attribution context via the watermark lookups.

Joe's Submission: https://github.com/volatilityfoundation/community/tree/master/JoeGreenwood

Joe's Twitter: @SeawolfRN
Joe's (4ARMED) GitHub: https://github.com/4ARMED
Joe's Blog: https://www.4armed.com/blog/memory-forensics-detecting-galileo-rcs-windows/

Alexander Tarasenko: Pykd/Windbg Address Space

This submission lets you integrate Volatility into Windbg through Pykd. You're then able to query for addresses of critical data structures, functions, and other variables using the debugging APIs. Additionally, you can connect Volatility to a running Windows system in debugging mode and run Volatility plugins against the live system, which particularly comes in handy for malware/rootkit analysis (not necessarily IR/forensics).

The author of this address space also mentioned a new project named Karmadbg (see the link below) which is a GUI intended for development of new debugging and memory analysis scripts. Check it out! Unfortunately, documentation in English is not yet available.

Alexander's Submission: https://github.com/volatilityfoundation/community/tree/master/AlexanderTarasenko

Pykd's Twitter: @pykd and @pykd_dev
Pykd's Website: https://pykd.codeplex.com
Karmadbg's Website: https://karmadbg.codeplex.com

Loïc Jaquemet: Haystack

These plugins are an interface between the Volatility framework and the haystack framework. While Volatility establishes a forensic framework to analyse a system's RAM, the haystack framework is intended to analyse a process's RAM, allowing an analyst to search for defined structures in a process's memory. You can build structure definitions from known/public C header files or with Python's ctypes library (based on undocumented or reverse engineered data structures) and then plug them into this framework to scan across process heaps, all process memory allocations, etc. The author provided examples of constraints that can find openssl cipher contexts, session keys, and passphrases, but its surely not limited to those types of data. Give it a shot and see what you can find!

Loïc's Submission: https://github.com/volatilityfoundation/community/tree/master/Lo%C3%AFcJaquemet

Loïc's Twitter: @trolldbois
Loïc's GitHub: https://github.com/trolldbois

The python-haystack module: https://github.com/trolldbois/python-haystack

May Medhat (et. al.): GVol Tool

This tool by EG-CERT researchers May Medhat and Mohamad Shawkey provides a (thick) GUI front end for Volatility written in Java. It lets you run preconfigured batch scripts against a memory dump with just a couple mouse clicks or you can customize your own batch scripts. You can choose from categories such as Rootkits, Kernel Artifacts, Networking, etc. and they link up with Volatility plugins in the backend. We imagine this tool will be very handy for analysts who are not comfortable on the command line or with Volatility usage in general. Additionally, for those who are seasoned Volatility users, this tool can reduce the amount of time you spend typing commands before you actually dive into the details of your case.

May's GitHub: https://github.com/May-Medhat
Mohomad's GitHub: https://github.com/m-shawkey

GVol User Guide: https://github.com/eg-cert/GVol/blob/master/docs/GVol-User%20Guide.pdf

EG-CERT's GitHub: https://github.com/eg-cert/GVol

Monnappa Ka: Linux Memory Diff

This plugin uses the Volatility advanced memory forensics framework to run various plugins against a clean and infected Linux memory image and reports the changes. Many times while doing memory analysis (or malware analysis) an analyst is presented with an abundance of data and the analyst has to manually find the malicious artifacts from that data which takes time and effort. This tool helps in solving that problem by comparing the results between the clean and infected memory images. This tool helps speed up analysis, reduce manual effort and allows you to focus on the relevant data.

Monnappa's Submission: https://github.com/volatilityfoundation/community/tree/master/MonnappaKa

Monnappa's Twitter: @monnappa22
Monnappa's GitHub: https://github.com/monnappa22
Monnappa's Blog: http://malware-unplugged.blogspot.com

Bart Inglot: Scheduled Task and Job Scanners

This plugin scans for job files and prints out their information. It's useful from a DFIR perspective, since job files are often used by attackers in order to run programs with SYSTEM privileges, run at scheduled moments, or to move laterally within a network. Aside from the job plugin for Volatility, adapted from Jamie Levy's jobparser.py (http://gleeda.blogspot.com/2012/09/job-file-parser.html) Bart also has written standalone carvers for jobs and scheduled tasks that work against memory samples, disk images, and other binary files.

Bart's Submission: https://github.com/volatilityfoundation/community/tree/master/BartoszInglot

Bart's Twitter: @BartInglot
Bart's GitHub: https://github.com/binglot
Bart's Blog: http://passionateaboutis.blogspot.com

Tuesday, August 25, 2015

Volatility Updates Summer 2015

Summer 2015 has been quite a busy time for the memory forensics community. We wanted to write a quick update to talk about some recent events and research as well as upcoming news.

Conferences

Black Hat Vegas 2015

We wanted to again thank everyone who came out and supported us during Black Hat. Between our Arsenal demo, book signing, and party, we met hundreds of Volatility users and fans. Your support and enthusiasm is greatly appreciated. Come back next year for twice the champagne, twice the suite size, and twice the fun!

HTCIA International Conference (Orlando)

We're putting on a lab session at HTCIA's International Conference in Orlando next week. You can also stop by the Volexity booth for a chance to win a free seat at any upcoming Windows Malware and Memory Forensics Training course.

Open Source Digital Forensics Conference (OSDFC) 2015

The Volatility team will once again be presenting the latest in memory forensic research at OSDFC 2015. This year we will be focusing on the anti-forensic capabilities of PlugX as well as new Volatility capabilities that can auto-detect them. We will also be discussing the results of the 2015 Volatility Plugin contest.

OSDFC has a great lineup this year so you should try and attend. We hope to see many Volatility users while we are out there.

Research

A New Paper on OS X Memory Forensics

At DFRWS 2015, Dr. Golden Richard and I published a paper on OS X memory forensics entitled: Advancing OS X Rootkit Detection. The purpose of this paper was to document gaps in existing OS X rootkit detection techniques and then develop new methods, in the form of Volatility plugins, to close these gaps. The plugins introduced in the paper will be committed to GitHub in the coming weeks.

DFRWS had a number of good submissions this year, and we recommend browsing the program for other interesting forensics research.

Volatility vs Hacking Team

The malware used by Hacking Team to control victim's computers is known as Galileo RCS, and in a well done blog post, Joe Greenwood shows how to use Volatility to detect RCS in a number of ways.

RCS now joins a long list (Stuxnet, Careto, Flame, and more) of 'advanced', 'stealthy' malware that immediately falls to inspection by memory forensics.

Volatility at PyCon

At PyCon 2015, Ying Li showed how to use newly developed Volatility capabilities in order to find artifacts of Python scripts that were executing on the system. This was very interesting work, and we suggest watching the video of her talk.

Projects Building on Volatility

VolDiff

VolDiff is a project that compares the results of a number of Volatility plugins against two memory samples and automatically the reports the differences. Compared to manually running and comparing the plugins, this can save a substantial amount of time.

The purpose of VolDiff is to compare in-memory artifacts both before (clean state) and after (post-install state) an application, such as a malware sample, has executed. The new artifacts that appear post-installation can be immediately isolated for further analysis and/or for the creation of highly-effective IOCs.

If you wanted to get started with VolDiff then you should read the author's post on analyzing DarkComet with VolDiff.

Evolve

Evolve is an open source web interface to Volatility. It is under very active development and is constantly having new features added. Consult the README on GitHub for the latest features, and be sure to follow the tool's author on Twitter.

To see Evolve in action, check out a video showing the basic features here and advanced features here.

As much as we love the command line, it is sometimes nice to have a GUI visualize and shuffle data for you!

Plugin Contest

The 2015 Volatility Plugin Contest is underway and accepting submissions until October. The contest is a great way to win cash and other prizes, gain recognition in the community and become more familiar with Volatility and Python. We feature the research submitted to the contest on this blog, during conferences, presentations, and all throughout social media and our mailing lists.

New Volatility Capabilities

Windows 10 Support

While not merged in the official Volatility branch (yet), the Windows 10 branch is currently under active development. Nearly all of the plugins are working at the current time, and we would be very appreciative of any bug reports you may have when testing the branch. Please post any bugs to the GitHub issue tracker. Currently supported functionality includes process and kernel module listing, all pool scanning plugins (files, mutexes, processes, drivers, etc.), handles, DLLs, PE file extraction, process memory (VAD) parsing, service enumeration, cached file extraction, and memory signature scanning with Yara.

OS X 10.10.x Support and New Plugins

Volatility now has official support for OS X 10.10.4 and 10.10.5, which are the latest two versions. We have also tested Volatility on a preview release of 10.11, and it appears that all of the plugins work as expected. We will release an official profile for 10.11 once Apple releases debug kits (we are currently using special custom built profiles).

Volatility also has a new plugin named mac_get_profile. This plugin allows Volatility to auto-detect which profile (OS version) matches the given memory sample. To use this plugin you do NOT need any OS X profiles installed. Instead, you can run a fresh checkout of Volatility, determine the profile by using mac_get_profile, and then download the correct profile from our profiles repository.

To use mac_get_profile, simply pass the path to your memory sample as the -f option and then put mac_get_profile after it:

$ python vol.py -f <path to memory sample> mac_get_profile

Linux

Volatility Linux support has now been tested through kernel version 3.19. As many of the data structures do not change between versions, we expect that most or all of the plugins will work with bleeding edge developments kernels as well. Please file an issue if you encounter any bugs.

Memory Forensic Trainings

Our memory forensics training class in Amsterdam is now SOLD OUT. We appreciate the support and word of mouth praise from past attendees as well as fans of the project.

Our current course is 5 days of memory forensics and malware analysis training against Windows systems. Full information on the course, as well as upcoming dates and locations, can be found here. If our current set of public offerings does not work for your company then please contact us about conducting a private training at one of your facilities.

Saturday, August 1, 2015

Recovering TeamViewer (and other) Credentials from RAM with EditBox

I recently stumbled upon the TeamViewer-dumper-in-CPP project, which shows just how easy it is to recover TeamViewer IDs, passwords, and account information from a running TV instance by enumerating child windows (on a live machine). The method is based on sending a WM_GETTEXT message to the TV GUI controls that contain the credentials. In particular, we're looking for the two fields under the "Allow Remote Control" heading (Your ID: 567 744 114 and Password q16jp7).

The equivalent of TeamViewer-dumper for memory forensics analysts is Adam Bridge's EditBox plugin for Volatility. Adam's submission won 3rd place in last years Volatility Plugin Contest, but I still feel like many people don't realize the full potential of this plugin. While TeamViewer-dumper is specific to TV, the EditBox plugin recovers text from editbox controls for all applications (that depend on Microsoft Common Controls) across all user sessions (local or remote via RDP/VNC), even for "special" editboxes that contain passwords and show up as asterisks on the screen.

Here's an example of the editbox plugin's output when TV is running:


$ python vol.py -f memory.dmp --profile=Win7SP1x64 editbox 
Volatility Foundation Volatility Framework 2.4
41 processes to check.

*******************************************************
Wnd context          : 1\WinSta0\Default
Window title         : -
pointer-to tagWND    : 0xfffff900c062b510 [0x67dc6510]
pid                  : 2524
imageFileName        : TeamViewer.exe
wow64                : Yes
atom_class           : 6.0.7601.17514!Edit
address-of cbwndExtra: 0xfffff900c062b5f8 [0x67dc65f8]
value-of cbwndExtra  : 4 (0x4)
address-of WndExtra  : 0xfffff900c062b638 [0x67dc6638]
value-of WndExtra    : 0x46e0480 [0x67302480]
pointer-to hBuf      : 0x46af000 [0x67e28000]
hWnd                 : 0x10228
parenthWnd           : 0x1020a
nChars               : 6 (0x6)
selStart             : 0 (0x0)
selEnd               : 0 (0x0)
text_md5             : 7a62c5fa901ff86a1562b9c7075674f8
isPwdControl         : No
q16jp7

*******************************************************
Wnd context          : 1\WinSta0\Default
Window title         : -
pointer-to tagWND    : 0xfffff900c062b150 [0x67dc6150]
pid                  : 2524
imageFileName        : TeamViewer.exe
wow64                : Yes
atom_class           : 6.0.7601.17514!Edit
address-of cbwndExtra: 0xfffff900c062b238 [0x67dc6238]
value-of cbwndExtra  : 4 (0x4)
address-of WndExtra  : 0xfffff900c062b278 [0x67dc6278]
value-of WndExtra    : 0x46a0f98 [0x689d7f98]
pointer-to hBuf      : 0x46bf390 [0x6769d390]
hWnd                 : 0x10224
parenthWnd           : 0x1020a
nChars               : 11 (0xb)
selStart             : 0 (0x0)
selEnd               : 0 (0x0)
text_md5             : b45dfe635940d5490276a5ae41e1422f
isPwdControl         : No
567 744 114

*******************************************************
Wnd context          : 1\WinSta0\Default
Window title         : -
pointer-to tagWND    : 0xfffff900c0631a50 [0x552cea50]
pid                  : 2524
imageFileName        : TeamViewer.exe
wow64                : Yes
atom_class           : 6.0.7601.17514!Edit
address-of cbwndExtra: 0xfffff900c0631b38 [0x552ceb38]
value-of cbwndExtra  : 4 (0x4)
address-of WndExtra  : 0xfffff900c0631b78 [0x552ceb78]
value-of WndExtra    : 0x4781678 [0x6648b678]
pointer-to hBuf      : 0x46fac80 [0x68493c80]
hWnd                 : 0x801aa
parenthWnd           : 0x70186
nChars               : 15 (0xf)
selStart             : 0 (0x0)
selEnd               : 0 (0x0)
text_md5             : 2cbe388f82d11af92a8d4950e24db799
isPwdControl         : No
WIN-948O8I1DO91

[snip]

As you can see, the ID, password, computer name, and various other fields are recovered. This is a powerful way to reconstruct the state of the user interface from memory. Although technically you could also find the values by brute force string scanning in process memory, but there's no need to brute force when you can use a structured, focused approach. Kudos to Adam for creating such a useful extension to last year's plugin contest.

Thursday, July 16, 2015

The 2015 Volatility Plugin contest is now live!

This is a quick update to announce that the 2015 Volatility Plugin contest is now live and accepting submissions until October 1st. Winners of this year's contest will be receiving over $2,000 in cash prizes as well as plenty of Volatility swag (t-shirts, stickers, etc.).

The purpose of the contest is to encourage open memory forensics research and development. It is a great opportunity for students to become familiar with memory forensics, develop a master's thesis or PhD project, as well as gain experience that will be very desirable by future employers. For those already in the field, submitting to the contest is a great way to gain experience and visibility in the memory forensics community. After the contest is over we promote the work in our conference presentations, blogs, and social media.

If you are looking for inspiration or to see the past winners, please check out the pages from 2013 and 2014. You will find projects that allow for inspection of virtual machines guests from the view of the host, recovery of in-memory browser artifacts, methods to detect stealthy rootkits, and much more.

If you have any questions please feel free to reach out to us.

We are looking forward to another year of innovative open source research!

Monday, July 13, 2015

Volatility at Black Hat USA & DFRWS 2015!

Due to another year of open research and giving back to the open source community, Volatility will have a strong presence at both Black Hat USA and DFRWS 2015. This includes presentations, a book signing, and even a party!

At Black Hat, the core Volatility Developers (@4tphi, @attrc, @gleeda, and @iMHLv2) will be partaking in a number of events including:

Demoing Volatility at Black Hat Arsenal. This will include new plugins targeted at the PlugX malware, showing how to write simple, but effective Volatility plugins, and more!

Book signing for The Art of Memory Forensics starting at 11:10AM on Wednesday in the Black Hat book store. All four authors will be present, so be sure to bring your book along or purchase a copy on-site in the bookstore.

Volatility Happy Hour: This will be an open bar party where you can meet our team, bring books to be signed, and get Volatility swag all while enjoying tasty beverages. You must register (free) if you wish to attend! Note that the party will be at MGM Grand.

Friends of Volatility will also be leading a number of events at Black Hat including Briefing presentations from Jonathan Brossard, jduck, and Alex Ionescu as well as Arsenal Demos from Brian Baskin, Marc Ochsenmeier, David Cowen, and Takahiro Haruyama.

At DFRWS, Dr. Golden Richard will be presenting a paper that he and I wrote: Advancing Mac OS X Rootkit Detection. In this paper, we present several new methods to detect rootkits on OS X systems through memory forensic analysis. All the of the plugins described in the paper will be incorporated into Volatility after the conference.

Also, at DFRWS, Joe Sylve and Vico Marziale will be leading a workshop on creating forensics tools in Go. If you have never seen Go before, or want to gain some hands-on experience, then we recommend checking it out.

And finally, be sure to check out the "Finding your naughty BITS" presentation by Matthew Geiger, who has been a long time friend of the project.

We hope to see everyone at these events, and we are looking forward to an exciting August!

Wednesday, June 3, 2015

Volshell Quickie: The Case of the Missing Unicode Characters

The other day someone reached out to me because they had a case that involved files with Arabic names. Unfortunately the filenames were only question marks when using filescan or handles, so I set out to figure out why.

In order to figure out why, I created a few files with Hebrew names (which I can read and write, so I can verify if it is correct) and Arabic names (which was just me tapping on the keyboard, so they don't say anything). After creating them, I interacted with them to make sure they'd show up in filescan. Below you can see the filescan results:


[snip]
$ python vol.py -f Win7x86.vmem --profile=Win7SP1x86 filescan
0x000000003d7008d0     16      0 RW-rw-
\Device\HarddiskVolume2\Users\user\Desktop\????.txt
0x000000003ddfef20     18      1 RW-r--
\Device\HarddiskVolume2\Windows\Tasks\SCHEDLGU.TXT
0x000000003def9340     16      0 RW-r--
\Device\HarddiskVolume2\Users\user\Desktop\????????????????????.txt
[snip]

In order to understand how the filename is output, you can look at the code in filescan:


        for file in data:
            header = file.get_object_header()
            self.table_row(outfd,
                         file.obj_offset,
                         header.PointerCount,
                         header.HandleCount,
                         file.access_string(),
                         str(file.file_name_with_device() or ''))

The function file_name_with_device() is defined in volatility/plugins/overlays/windows/windows.py and the relevant part is highlighted in red:


class _FILE_OBJECT(obj.CType, ExecutiveObjectMixin):
    """Class for file objects"""

    def file_name_with_device(self):
        """Return the name of the file, prefixed with the name
        of the device object to which the file belongs"""
        name = "" 
        if self.DeviceObject:
            object_hdr = obj.Object("_OBJECT_HEADER",
                            self.DeviceObject - self.obj_vm.profile.get_obj_offset("_OBJECT_HEADER", "Body"),
                            self.obj_native_vm)
            if object_hdr:
                name = "\\Device\\{0}".format(str(object_hdr.NameInfo.Name or '')) 
        if self.FileName:
            name += str(self.FileName)
        return name

So we can take a look at this in volshell:


  1 $ python vol.py -f Win7x86.vmem --profile=Win7SP1x86 volshell
  2 [snip]
  3 >>> file = obj.Object("_FILE_OBJECT", offset = 0x000000003d7008d0, vm = addrspace().base, native_vm = addrspace())
  4 >>> print file.FileName
  5 \Users\user\Desktop\????.txt
  6
  7 >>> file2 = obj.Object("_FILE_OBJECT", offset = 0x000000003def9340, vm = addrspace().base, native_vm = addrspace())
  8 >>> print file2.FileName
  9 \Users\user\Desktop\????????????????????.txt

On line 3 we create a _FILE_OBJECT object. We know the offset where this object resides (0x000000003d7008d0) from filescan. Since this object was obtained from the physical address space, we specify this by setting vm to addrspace().base, or the physical layer, since this is a raw memory sample. Since the _FILE_OBJECT's native address space is virtual, we specify this as well: native_vm = addrspace(). At this point we have instantiated a _FILE_OBJECT in a variable called "file". We then print out the FileName member on line 5 and see its output on line 6. We follow the same process for the second file, except we save the object as "file2". As you can see, the output is not very helpful. So now we need to know what type of member, FileName is. In order to accomplish this, we need to look at the vtypes in the volatility/plugins/overlays/windows/win7_sp1_x86_vtypes.py file:


  '_FILE_OBJECT' : [ 0x80, {
    'Type' : [ 0x0, ['short']],
[snip]
    'FileName' : [ 0x30, ['_UNICODE_STRING']],
[snip]

We have some functionality added to this type in volatility/plugins/overlays/windows/windows.py:


class _UNICODE_STRING(obj.CType):
    [snip]
    def v(self):
        """  
        If the claimed length of the string is acceptable, return a unicode string.
        Otherwise, return a NoneObject.
        """
        data = self.dereference()
        if data:
            return unicode(data)
        return data 

    def dereference(self):
        length = self.Length.v()
        if length > 0 and length <= 1024:
            data = self.Buffer.dereference_as('String', encoding = 'utf16', length = length)
            return data
        else:
            return obj.NoneObject("Buffer length {0} for _UNICODE_STRING not within bounds".format(length))

    [snip]

    def __format__(self, formatspec):
        return format(self.v(), formatspec)

    def __str__(self):
        return str(self.dereference())
    
    [snip]

We know that the file_name_with_device() function uses str() in order to transform the _UNICODE_STRING into something readable and if we look at the overridden __str__() operator in the above code, we see that it uses the dereference() function. The dereference() function never casts the data as unicode, however, so the data is printed incorrectly. If we look at the above v() function, we see that there is a call to dereference() and that the resulting data is case as unicode, so let's see if we get valid data back by calling that function instead:


>>> print file.FileName.v()
\Users\user\Desktop\שלום.txt

>>> print file2.FileName.v()
\Users\user\Desktop\تهحححتهحححتهحححتهححح.txt

Success! So let's modify the __str__() operator to use v() instead and see if that fixes filescan:


[snip]
    def __str__(self):
        return str(self.v())
[snip]

Now let's examine the filescan data:


0x000000003d7008d0     16      0 RW-rw-
\Device\HarddiskVolume2\Users\user\Desktop\שלום.txt
0x000000003ddfef20     18      1 RW-r--
\Device\HarddiskVolume2\Windows\Tasks\SCHEDLGU.TXT
0x000000003def9340     16      0 RW-r--
\Device\HarddiskVolume2\Users\user\Desktop\تهحححتهحححتهحححتهححح.txt

Success! Just to make dually sure, I then created a user with a Hebrew name: גלידה, and created some files with Hebrew characters as well. If we look back to the file_name_with_device() function, you'll see that the complete file path is populated by using the NameInfo optional header (name = "\\Device\\{0}".format(str(object_hdr.NameInfo.Name or ''))). If you look in the volatility/plugins/overlays/windows/win7.py file, you'll see the following definition:


class _OBJECT_HEADER(windows._OBJECT_HEADER):
    [snip]
    optional_header_mask = (('CreatorInfo', '_OBJECT_HEADER_CREATOR_INFO', 0x01),
                            ('NameInfo', '_OBJECT_HEADER_NAME_INFO', 0x02),
                            ('HandleInfo', '_OBJECT_HEADER_HANDLE_INFO', 0x04),
                            ('QuotaInfo', '_OBJECT_HEADER_QUOTA_INFO', 0x08),
                            ('ProcessInfo', '_OBJECT_HEADER_PROCESS_INFO', 0x10))

Now we know the object to find in the types file in order to figure out what type Name is. Look in volatility/plugins/overlays/windows/win7_sp1_x86_vtypes.py:


  '_OBJECT_HEADER_NAME_INFO' : [ 0x10, {
    'Directory' : [ 0x0, ['pointer', ['_OBJECT_DIRECTORY']]],
    'Name' : [ 0x4, ['_UNICODE_STRING']],
    'ReferenceCount' : [ 0xc, ['long']],
} ],

Since Name is of the same type (_UNICODE_STRING), we should be covered. In the words of Al Bundy, "Let's rock":


$ python vol.py -f Win7x86.vmem --profile=Win7SP1x86 filescan
[snip]
0x000000003e8fb1c0      2      0 RW-rw- \Device\HarddiskVolume1\Users\גלידה\Desktop\עצם.txt
0x000000003f83c038      2      0 RW-rw- \Device\HarddiskVolume1\Users\גלידה\AppData\Roaming\Microsoft\Windows\Recent\עצם.lnk
[snip]

Success! A lot of other objects use _UNICODE_STRINGs, including mutants, registry paths and symbolic links. So this was an important fix.

Changes have already been reflected in the master branch of Volatility. We hope that you have enjoyed this not so short, quickie ;-)

Friday, May 15, 2015

Using mprotect(.., .., PROT_NONE) on Linux

After deciding to revisit some old code of mine (ok, very old), I realized that there was something different about how Linux was allocating pages of data I wanted to hide. At first, I was glad that I couldn't see the data using yarascan, but then I realized that I was unable to access the memory regions at all in linux_volshell to verify that they were, in fact, obfuscated. So I decided to take a look at using a smaller, stripped down program. Below is one such example, comments are included to explain what is happening:


int main( int argc, char *argv[]){
    // pid: the process ID of this process 
    // so we can print it out
    int pid;
    pid = getpid();

    //size: an integer to hold the current page size
    int size;
    size = getpagesize();

    //[1] create two pointers in order to allocate 
    //memory regions
    char *buffer;
    char *buffer2;

    //unprotected buffer:
    //allocate memory using mmap()
    buffer2 = (caddr_t) mmap(NULL,
                      size,
                      PROT_READ|PROT_WRITE,
                      MAP_PRIVATE|MAP_ANONYMOUS,
                      0,0);

    //[2] put some characters in the allocated memory
    //we're setting these characters one at a time in order
    //to avoid our strings being detected from the binary itself
    buffer2[0] = 'n';
    buffer2[1] = 'o';
    buffer2[2] = 't';
    buffer2[3] = ' ';
    buffer2[4] = 'h';
    buffer2[5] = 'e';
    buffer2[6] = 'r';
    buffer2[7] = 'e';
    
    //protected buffer:
    //allocate memory with mmap() like before
    buffer = (caddr_t) mmap(NULL,
                      size,
                      PROT_READ|PROT_WRITE,
                      MAP_PRIVATE|MAP_ANONYMOUS,
                      0,0);

    //[2] put some characters in the allocated memory
    //we're setting these characters one at a time in order
    //to avoid our strings being detected from the binary itself
    buffer[0] = 'f';
    buffer[1] = 'i';
    buffer[2] = 'n';
    buffer[3] = 'd';
    buffer[4] = ' ';
    buffer[5] = 'm';
    buffer[6] = 'e';

    //[3] protect the page with PROT_NONE:
    mprotect(buffer, size, PROT_NONE);

    //[4] print PID and buffer addresses:
    printf("PID %d\n", pid);
    printf("buffer at %p\n", buffer);
    printf("buffer2 at %p\n", buffer2);

    //spin until killed so that we know it's in memory:
    while(1);
    return 0;
}

Now we'll just compile the above program and run it. In short, the above program [1] creates two buffers, [2] places characters in these buffers, [3] then calls mprotect() on one of them with PROT_NONE; [4] the program then prints out its process ID and the virtual addresses of the aforementioned buffers:


$ ./victim
PID 29620
buffer at 0x7f2bec9a1000
buffer2 at 0x7f2bec9a2000

Let's set up a config file so that we don't have to type as much:


$ cat linux.config
[DEFAULT]
LOCATION=file:///Path/to/Virtual%20Machine/Linux%20Mint%20Cinnamon/Mint%2064-bit-d583834d.vmem
PROFILE=LinuxLinuxMintCinnamonx64x64
PID="29620"

If we try to search for the strings we placed in the buffers we get:


$ python vol.py --conf-file=linux.config linux_yarascan -Y "not here"
Volatility Foundation Volatility Framework 2.4
Task: victim pid 29620 rule r1 addr 0x7f2bec9a2000
0x7f2bec9a2000  6e 6f 74 20 68 65 72 65 00 00 00 00 00 00 00 00   not.here........
0x7f2bec9a2010  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0x7f2bec9a2020  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
[snip]
$ python vol.py --conf-file=linux.config linux_yarascan -Y "find me"
Volatility Foundation Volatility Framework 2.4

Well that was disappointing, we couldn't find the "find me" string. What you would expect, is to be able to access the memory contents using Volatility. Let's use the linux_volshell plugin to explore the victim process' memory and see if we can access the memory addresses (given to us from the program at run time) directly. First we'll examine the contents at 0x7f2bec9a2000:


$ python vol.py --conf-file=linux.config linux_volshell
Volatility Foundation Volatility Framework 2.4
>>> Current context: process victim, pid=29620 DTB=0x3cfb1000
>>> db(0x7f2bec9a2000)
0x7f2bec9a2000  6e 6f 74 20 68 65 72 65 00 00 00 00 00 00 00 00   not.here........
0x7f2bec9a2010  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0x7f2bec9a2020  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0x7f2bec9a2030  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0x7f2bec9a2040  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0x7f2bec9a2050  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0x7f2bec9a2060  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0x7f2bec9a2070  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................

As we see, we have the contents that we'd expect. Now to try to access the other memory location:


>>> db(0x7f2bec9a1000)
Memory unreadable at 7f2bec9a1000

We see that we've been rejected. After a bit of investigation, we see that the point of failure is in the entry_present() function in the amd64.py address space. In order to find the value of the entry that failed, we'll print it out. First we'll add a print statement to entry_present():


    def entry_present(self, entry):
        if entry:
            if (entry & 1): 
                return True

            arch = self.profile.metadata.get('os', 'Unknown').lower()

            # The page is in transition and not a prototype.
            # Thus, we will treat it as present.
            if arch == "windows" and ((entry & (1 << 11)) and not (entry & (1 << 10))):
                return True

            # we want a valid entry that hasn't been found valid:
            print hex(entry)
        return False

Now let's see what the entry is:


$ python vol.py --conf-file=linux.config linux_yarascan -Y "find me"
Volatility Foundation Volatility Framework 2.4
0x2681d160

So let's look a little closer at the page table entry:


>>> print "{0:b}".format(0x2681d160)
100110100000011101000101100000

The way we read this is from right to left. The entry_present() function checks the 0th bit to see if the entry is present. In this case the bit is not set (0); therefore this function returns False.

At this point, I decided to take a look at the Linux source code to see how mprotect() is actually implemented. As it turns out, the _PAGE_PRESENT bit is cleared when mprotect(...PROT_NONE) is called on a page and the _PAGE_PROTNONE bit is set [3]. Looking at how _PAGE_PROTNONE is defined [4][5][6] we'll see that it's actually equivalent to the global bit (8th bit) [1][2]. So let's look at our page table entry again, we'll notice that the 8th bit is indeed 1:


>>> print "{0:b}".format(0x2681d160)
100110100000011101000101100000

So let's patch the entry_present() function with our findings:


    def entry_present(self, entry):
        if entry:
            if (entry & 1): 
                return True

            arch = self.profile.metadata.get('os', 'Unknown').lower()

            # The page is in transition and not a prototype.
            # Thus, we will treat it as present.
            if arch == "windows" and ((entry & (1 << 11)) and not (entry & (1 << 10))):
                return True

            # Linux pages that have had mprotect(...PROT_NONE) called on them
            # have the present bit cleared and global bit set
            if arch == "linux" and ((entry & (1 << 8))):
                return True

        return False

Let's see if we get anything back:


$ python vol.py --conf-file=linux.config linux_yarascan -Y "find me"
Volatility Foundation Volatility Framework 2.4
Task: victim pid 29620 rule r1 addr 0x7f2bec9a1000
0x7f2bec9a1000  66 69 6e 64 20 6d 65 00 00 00 00 00 00 00 00 00   find.me.........
0x7f2bec9a1010  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0x7f2bec9a1020  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
[snip]

Success! So now we're able to see the "forbidden" string that we couldn't access before. This goes to show that sometimes you have to dig a little into "why" something is not working as expected. In this case, we lucked out and had source code to examine, but sometimes things are not as easy as all that. Both intel.py and amd64.py have been patched in order to accommodate memory sections that have had mprotect() called on them with PROT_NONE.

References

[1] AMD64 Architecture Programmer's Manual Volume 2: System Programming http://developer.amd.com/wordpress/media/2012/10/24593_APM_v21.pdf

[2] Intel(R) 64 and IA-32 Architectures Software Developer's Manual Volume 3A: System Programming Guide. Section 4.3 http://www.intel.com/products/processor/manuals/index.htm

[3] https://www.kernel.org/doc/gorman/html/understand/understand006.html

[4] http://www.cs.fsu.edu/~baker/devices/lxr/http/source/linux/arch/x86/include/asm/pgtable_types.h#L63

[5] http://www.cs.fsu.edu/~baker/devices/lxr/http/source/linux/arch/x86/include/asm/pgtable_types.h#L29

[6] http://www.cs.fsu.edu/~baker/devices/lxr/http/source/linux/arch/x86/include/asm/pgtable_types.h#L18

Monday, March 16, 2015

Windows Malware and Memory Forensics Training in the UK

Windows Malware and Memory Forensics Training by The Volatility Project is the only memory forensics course officially designed, sponsored, and taught by the Volatility developers. One of the main reasons we made Volatility open-source is to encourage and facilitate a deeper understanding of how memory analysis works, where the evidence originates, and how to interpret the data collected by the framework's extensive set of plugins. Now you can learn about these benefits first hand from the developers of the most powerful, flexible, and innovative memory forensics tool.

Our last public class in the UK sold out, so we are back again! This training is organized and hosted by CCL Group, the UK's largest provider of digital forensics.

Details

Location: Stratford-Upon-Avon, United Kingdom

Cost: £ 2595 + VAT (lunch provided)

Dates: Monday June 1st - Friday June 5th 2015

Times: 9 AM - 5 PM daily

Instructors: Michael Ligh, Jamie Levy, Andrew Case

View this course's full page

Booking

For bookings, please contact training@cclgroupltd.com or call 01789 261200 (UK).

Other Courses

The following courses are also open for registration.

* April 13-17 in Reston, VA

* May 11-15 in New York, NY

* August 31-September 3 in Amsterdam, NL

Monday, February 2, 2015

Advice from Det. Michael Chaves on Memory Forensics, KnTDD, and POS Malware

The following story was shared by Detective Michael Chaves. It describes how he's used Volatility, KnTDD, and memory forensics over the past year to investigate POS breaches at local businesses. Kudos to Michael for applying his skills in an effective and meaningful way, then taking the time to share experiences with others. Without a doubt, detectives in every police department have or will encounter situations like Michael describes.

It's been about year since I've taken the Volatility Windows Malware and Memory Forensics Training in NYC. I wanted to take this time to share some of my experiences to hopefully help examiners/investigators early on in their exposure to Volatility and to help identify unknown malware. Over the past 10 months I have responded to about a dozen POS breaches at local businesses; mainly liquor stores and restaurants. These breaches are identified rather quickly from local banks that call me with the details and I usually respond to a location within 2 days. It should come as no surprise that I have yet to respond to a location that was anywhere near being PCI compliant.

The large majority of POS terminals were running Windows XP some with SP2, most with SP3. Two machines were even running Windows 2K and all had direct connections to the Internet. Antivirus IF present was either out of date or turned off. I still have yet to see a firewall present or any security policy in place. My RAM capture tool of choice is Kntdd and I’ll use FTK Imager Lite to obtain all registry files, App Data directory, $log, $MFT and prefetch directory. I carry with me several portable drives to make the acquisition from each POS location in the shortest amount of time possible as the store still needs to process customer purchases.

For most of these breaches I have been able to identify the malware pretty easily. I usually begin by running, pslist, psscan, psxview and connections (if supported). In the majority of the breaches, the processes were not hidden and had an active process listed, usually called by ‘explorer’. If I was not able to easily identify the malware process, I’d run dlllist to locate any programs running from odd locations followed by malfind and yarascan. Once I have identified a suspect process, I’d dump that process usually by procdump or dlldump.

During the early part of the investigation, I am not too concerned how the malware works or what the Initial Infection Vector was. I want to know where the credit cards are going and how are they getting out. I’d run strings on the exported out suspected malware file and I would generally find, the URL used to send out the cards via POST, an e-mail address associated with the malware and/or IP addresses. The majority of my cases the POST command was used to send out the cards, in others it was via SMTP. In 5 of my 12 breaches, the malware family was JACKPOS or Alina variants. I will search the Internet on file names, URL’s and artifacts that usually result with great write ups that show what I may be investigating, as well researching with Virustotal. It should also be noted that there have been a few times that I sought assistance from the Volatility community and other students from the NYC class. They have been extremely receptive to my questions and information provided to was invaluable!

I realize there are many readers out there that have a far greater understanding of malware and memory forensics. I’m slowly getting there, but I hope this helps out the people just beginning, like I am/was by describing my workflow and perhaps give confidence to some that may otherwise doubt their ability.

Tuesday, January 27, 2015

Incorporating Disk Forensics with Memory Forensics - Bulk Extractor

In this post we will take our first look at a tool that is primarily used for disk forensics and show how it can be useful during memory forensics analysis as well. In the coming weeks we will have several follow on posts highlighting other tools and techniques.

Background

As you are likely aware, much of the information that is recoverable from disk and the network is also recoverable from memory - assuming you get a valid sample within a proper time frame. If these conditions are met, then many forensics tools not usually associated with memory forensics can be incorporated into the memory analysis process. The artifacts recovered by these tools are often ones that memory tools do not focus on and are ill-suited to handle. Through incorporation of such tools and techniques, investigators can fully leverage all information available in volatile memory.

In many real-world investigation scenarios, you may only ever get a memory sample - see Jared's OMFW 2014 presentation for an excellent example of this [6] - and even less likely will you get full network flow (PCAP) during the time frame of an attack. This is unfortunate as the network often has information vital to an investigation, such as which servers were used to attack the network, which systems attackers laterally moved to, which domain(s) malware was downloaded from, and what commands were sent by a C&C server to local nodes. Even without a PCAP, all hope is not lost though, as network data must traverse main memory in order to be sent and received by applications.** This leaves the opportunity for historical network information to be left in memory long after it was active.

This idea was explored in great detail by Simpson Garfinkel and his co-authors in their 2011 DFRWS paper 'Forensic carving of network packets and associated data structures' [5]. In the paper they discuss not only historical network data in volatile memory, but also how that information can be stored on disk through system swapping and hibernation. This approach has the advantage of potentially getting network data from even previous reboots of the system.

To demonstrate the validity of their approach and evaluate the research, new modules were added to the open source bulk_extractor [1, 2, 3] tool in order to automate extraction and processing.

Since the publication of this paper and the popularization of bulk_extractor, extracting network data from memory captures has become a technique used by many analysts and it has even appeared in leading college digital forensics courses [4].

** With the exception of hardware rootkits within NIC firmware. If you believe this type of malware is active on a system that you need to investigate then you should check out the KntDD acquisition tool [14]. It can safely acquire memory from select NICs as well as acquire memory from other hardware devices.

Bulk Extractor

bulk_extractor is a highly-optimized open source tool that can scan inputs (disk, memory captures, etc.) and automatically find a wide range of information useful to investigators, such as email addresses, URLs, domains, credit cards numbers, and more. The project's wiki documents all of its scanning features [7]. Due to its multi-threaded, C++ design, bulk_extractor can often process inputs and extract all features at the speed at which the input can be read, making it very useful for efficient analysis.

When bulk_extractor is finished processing a target it then produces a number of output files. The exact files produced depends on which scanners were activated. For most scanners, in particular the network-related ones, not only is there a file of the raw data recovered, but there is also a histogram produced that orders entries based on the number of times found. This can be interesting in a number of ways, such as looking for the most common email addresses to determine who a user being investigated communicated with frequently.

Carving Network Packets and Streams from Memory

One of bulk_extractor's most useful features to memory forensics, and the one this blog will focus on, is the ability to profile network data in a sample as well as produce a PCAP of all packets found.

The profiled data includes recovered IP addresses, which can be mapped to known-bad lists or researched through threat intelligence tools, and Ethernet frames, which can be used to determine which local systems were contacted. Both of these profiled data sets include a histogram file.

To get this set of information you can run bulk_extractor as:

bulk_extractor -E net -o <output directory> <memory image path>

This will only run the net scanner and quickly produce the results described above in the directory specified with -o. Depending on your time constraints and desired data recovered, you can also run with other options, such as:

bulk_extractor -e net -e email -o <output directory> <memory image path>

This will run both the net and email scanners, which adds several more output files of interesting forensics data, including information about all domains found as well as email addresses and URLs. This can obviously give deep insight into what network activity occurred on the system being analyzed - potential phishing email accounts, malware URLs and domains, attacker C&C infrastructure, and so on.

Analyzing the PCAP File

In many of our own investigations we have found the PCAP file produced by bulk_extractor to be invaluable. In the connection traces we have uncovered evidence of data exfiltration, including (partial) file contents, commands sent by attackers to malware on client systems, input and output of tools on remotely controlled cmd.exe sessions, re-directed HTTP traffic from web servers, and much more.

The number of packets in a created PCAP file is limited to what is available in memory and not overwritten. For systems with large amounts of RAM, this can be many megabytes of data. For systems with less RAM, the in-memory caches will obviously be smaller, which often leads to less remnant information. As we will show with the following forensics challenges though, even systems with small amounts of RAM can still retain crucial, historical network data long after it was processed.

Forensics Challenges

To show the usefulness of extracting network data from memory samples, we ran bulk_extractor against several memory captures included with popular forensics challenges. We choose these challenges for several reasons. First, they include both Linux and Windows samples. This shows the usefulness of the technique across several operating systems. Second, these challenges include small memory captures while still containing very useful data. Finally, by choosing public memory samples, anyone can perform their own research and follow the steps shown in the blog.

DFRWS 2005 Challenge

The first challenge we will analyze is the DFRWS 2005 challenge [8]. This is one the most well-known challenges since it directly inspired what we now consider modern memory forensics.

In this challenge, investigators were given a full network capture as well as a memory capture both before and after the suspect's laptop crashed. After running bulk_extractor, with -E net chosen, against the first sample we obtain a PCAP file (packets.pcap in BE's output folder) that has 130 packets. This is in comparison to 6304 packets in the full network capture, meaning roughly 2% of the related packets were recovered from a 126MB memory capture (no, the size is not a typo).

While this number may seem incredibly small, analysis shows that the recovered network data has highly useful information. As seen in the following screenshot from the PCAP loaded into Wireshark, proof of the "Back Orifice" (BO2k) malware being used on the system is present. This also includes partial file names and commands used on the system:

As discussed in the challenge's solutions, these backdoor interactions were a key part of solving what the attacker did to the system. Assuming a PCAP wasn't given along with the challenge, which we will see is not always the case, then volatile memory would be all investigators had to rely on for network clues.

DFRWS 2008 Challenge

The second challenge we will discuss is the one from DFRWS 2008 [9]. This included a Linux memory memory sample along with an accompanying PCAP file. As with the previous challenge, we will compare the network data obtained from memory to the provided PCAP along with analyzing the data from memory. This memory capture was 284MB and 122 packets were recovered. This is in comparison to 10243 contained in the full network capture (~1%).

Of these 122 packets, there was quite a bit of interesting data. As documented in the winning solution to the challenge [10], an encrypted ZIP file was exfiltrated from the network. Portions of this exfiltration appear in the capture. Also, the challenge details a Perl script used to extfiltrate the data using HTTP cookies on particular domains. Both fragments of the Perl script as well as several cookie instances are contained within the memory capture. The following shows one of the malicious requests encoding data within a msn.com cookie.

Note that the real msn.com is not contacted and a server under attacker control in Malaysia (219.93.175.67) is contacted instead. For more information on this Perl's scripts chunking and encoding of data, please read the winning submission.

Honeynet 2010 Challenge

Honeynet's 2010 Challenge [11] focused on a end-user infected with a banking trojan. In the challenge you were given only a memory sample of the victim system. The given memory sample was 512MB and bulk_extractor recovered 256 packets. From our analysis, we believe this is a fairly significant amount of packets that were sent or received during creation of the challenge, even though it is a small number.

As shown in the following Wireshark screenshot, one of the packets recovered includes the request that downloaded the ZBot malware:

This is the focal point of the investigation, and if you read the challenge's official solution [12], you will see that once the malicious domain and connection are determined, the rest of the investigation falls into place. Through the use of bulk_extractor we can immediately find the request in-tact, whereas the solution required an intricate mixing of strings and Volatility to map the pieces together. It is also interesting to note that when trying to definitively piece together the network traffic the solution states:

"It’s not possible to state it for sure since no network dump is available."

As we have just learned, we do not need to be given a network capture in order to get verifiable and useful network data from memory.

Honeynet 2011 Challenge

The 2011 Honeynet Challenge was an investigation against a memory sample and disk image of a compromised server. No network capture was provided. The memory capture is 256MB and 347 packets were recovered. As discussed in the solutions' for this challenge, the initial foothold was gained on the server through a heap-based vulnerability in the Exim mail server. The following from the PCAP of recovered packets shows this exploit in action:

Besides the large buffer of AAAAA's used to fill the heap, you can also see the IP address (192.168.56.101) of the attacking system as well as its destination port (25) against the local server. This is an immediate warning sign of an attack against the mail server.

Mapping Connections to Processes

As useful information is recovered from memory, investigators often want to determine which process or kernel component was responsible for sending or receiving the particular packets. Volatility provides two ways to do this. The first is through the use of the strings plugin. The strings plugin can map a physical offset in a memory capture (the offset in the capture file) to its owning process or kernel module. Unfortunately, the strings plugins is not very straightforward with a PCAP file as the offset in physical memory of the packet is not encoded within the PCAP's data**.

To get around this limitation, you can use the yarascan plugin to search for unique data within the packet(s) that you find interesting. To do this, pass the data as the -Y option to yarascan. Note that you can pass arbitrary strings, bytes, and hex values to yarascan and you are not just limited to searching readable text. As yarascan searches for your signature it will list any processes or kernel components that contain the data. This can then immediately point you to the area of code responsible for generating or receiving the packet, which can greatly focus your analysis.

For those who closely follow Volatility, you know that there are also the ethscan [15] and pktscan [16] plugins that can perform the mapping as well. Unfortunately, pktscan is no longer supported and ethscan can take substantially longer to produce results versus bulk_extractor. For these reasons and others we choose to incorporate bulk_extractor in our analysis instead of using only Volatility components.

** Someone please let me know if I am wrong about this. I have not seen this documented anywhere for bulk_extractor and I am not sure its possible using the old PCAP format that BE writes the capture in. The newer PCAP-ng format allows for arbitrary comments though and a nice addition to BE would be writing to the new format with a comment of the physical offset of where the particular packet was found.

Conclusion

This post was written to highlight the power of carving network data from memory. When full packet capture is not available, or when no network data is available, the ability to extract network data from memory may be your only chance of recovering it. As shown, bulk_extractor is a highly capable tool for this task. Not only can it recover packets to a PCAP, but it can also be used to recover a wide range of other information as well as arbitrary strings or regular expressions that you configure it for.

We strongly encourage our readers who are currently not utilizing this capability to start to incorporate in into your memory forensics workflows. We believe you will see immediate usefulness of the data recovered and bulk_extractor makes recovery trivial.

References

[1] https://github.com/simsong/bulk_extractor
[2] http://simson.net/ref/2012/2012-02-02%20USMA%20bulk_extractor.pdf
[3] http://simson.net/ref/2013/2013-12-05_tcpflow-and-BE-update.pdf
[4] https://samsclass.info/121/proj/p3-Bulk.htm
[5] http://simson.net/clips/academic/2011.DFRWS.ipcarving.pdf
[6] http://www.slideshare.net/jared703/vol-ir-jgss114
[7] https://github.com/simsong/bulk_extractor/wiki
[8] http://www.dfrws.org/2005/challenge/
[9] http://www.dfrws.org/2008/challenge/
[10] http://sandbox.dfrws.org/2008/Cohen_Collet_Walters/Digital_Forensics_Research_Workshop_2.pdf
[11] http://www.honeynet.org/challenges/2010_3_banking_troubles
[12] http://www.honeynet.org/files/Forensic_Challenge_3_-_Banking_Troubles_Solution.pdf
[13] http://www.honeynet.org/challenges/2011_7_compromised_server
[14] http://www.gmgsystemsinc.com/knttools/
[15] http://jamaaldev.blogspot.com/2013/07/ethscan-volatility-memory-forensics.html
[16] https://code.google.com/p/volatility/issues/detail?id=233