Wednesday, April 9, 2014

Volatility Memory Forensics and Malware Analysis Training in Australia!

We are happy to announce that our popular Memory Forensics and Malware Analysis Training course is going to be held in Canberra, Australia in August. This is our first offering in Australia, and we are already extremely excited to have a great training session full of inquisitive and enthusiastic students.

This is the only memory forensics course officially designed, sponsored, and taught by the Volatility developers. One of the main reasons we made Volatility open-source is to encourage and facilitate a deeper understanding of how memory analysis works, where the evidence originates, and how to interpret the data collected by the framework's extensive set of plugins. Now you can learn about these benefits first hand from the researchers and developers of the most powerful, flexible, and innovative memory forensics tool.


When
August 25th through 29th
Class runs from 9AM to 5PM

Where
Canberra, AU

Instructors
Michael Ligh (@iMHLv2), Andrew Case (@attrc), and Jamie Levy (@gleeda)
Information on each instructor can be found here.

Registration Process
To request a link to the online registration site or to receive a detailed course agenda/outline, please send an email to voltraining [[ at ]] memoryanalysis.net or contact us through our web form.

Past Reviews
Many past reviews of the course can be found on our website here as well as a previous blog post here. We also have some additional feedback from our course in Europe last week:
"Labs are amazingly close to real incidents" - Security Engineer 
"Full overview of Volatility. Extremely useful. Required for anyone wanting to do memory forensics effectively." - CERT operator
"Years of experience by 3 world class experts on memory analysis" - Senior Security Analyst
"Very inspiring training, which is a must for any individual taking cyber incident analysis and handling serious." - Incident Handling Manager
"The perfect balance between what you need to know about Windows internals, malware and forensics." - Security Officer 
"Before the course it was like being in a dark room, then Volatility guys opened windows. Solid base of theory and thorough explanations of options and approaches. Recommended for everyone"
"I work with reverse engineering malware and this course has helped me to simplify the process" - Malware analyst
"This course is extremely valuable to all forensic investigators and malware researchers" - Forensic Investigator and Incident Handler 
"High tech knowledge on Windows memory analysis that is very relevant to real-life situations" - Head of CSIRT 
"It gives such deep knowledge in this special field, as I have never thought before" - Security Analyst

Tuesday, April 8, 2014

Building a Decoder for the CVE-2014-0502 Shellcode

In late February of this year multiple security companies (FireEye, AlienVault, SecPod, Symantec, plus many more) were reporting on a Flash zero-day vulnerability (CVE-2014-0502) being exploited in the wild. Around this time a friend asked me if I could reverse the exploit and its associated files in order to write a decoder for it. The purpose of the requested decoder was to statically determine the URL from which the backdoor executable (shown later) would be downloaded.

As explained in the referenced links, the exploit found in the wild works by:
  1. Having a victim's browser load a Flash file (cc.swf) that exploits the vulnerable Flash player
  2. The exploit (shellcode) then downloads a GIF file (logo.gif) from the web server hosting the SWF file
  3. This GIF file contains encrypted/encoded shellcode embedded within it that eventually downloads a backdoor executable from an encrypted URL within the file. Static decryption of this URL is what my friend was after
Note: For simplicity, in the steps above, I skipped some of the details on how the shellcode utilizes ROP, defeats ASLR, and finds the necessary libraries and functions (LoadLibrary, InternetOpenUrl, CreateFile, etc.). If you are interested in these details please see the FireEye writeup linked above. If you are interested in how the exploit achieves arbitrary code execution then you should read the writeup from Spider Labs here. (You should probably just read the Spider Labs writeup anyway, as it's very well done...)

Unfortunately, many of the previous research writeups were not available at the time of my friend's request. To assist with Flash decompilation, I used SoThink SWF Decompiler, a tool I cannot recommend enough and one I have used to successfully analyze numerous Flash files. Since I did this work, Zscaler has published a nice writeup on the Flash file and how it constructs its payload, although it misses a key part of writing the decoder -- how to determine where the encrypted shellcode starts within the downloaded GIF file.

Through analysis with SoThink's tool I was able to determine that the last four bytes of the GIF file contained a little endian integer that represented the offset of the encrypted payload from the beginning of where the offset is stored (the end of the file minus 4). The following decompiled function shows this process. The decompilation is from SoThink and the comments are mine:

1  public class cc extends Sprite
2  {
3   [snip]
4   _loc_4 = new URLLoader();
5   _loc_4.dataFormat = "binary";
6   _loc_4.addEventListener("complete", Ƿ); // sets mpsc
7   _loc_4.load(new URLRequest("logo.gif")); // get logo.gif from same server as loaded
8   [snip]
9  }
10 
11 public function Ƿ(event:Event) : void
12 {
13  var _loc_3:* = new ByteArray();
14 
15  /* writes logo.gif  to loc_3 */
16  _loc_3.writeBytes(event.target.data as ByteArray, 0, (event.target.data as ByteArray).length);
17 
18  /* move to last 4 bytes */
19  _loc_3.position = _loc_3.length - 4;
20  _loc_3.endian = "littleEndian";
21         
22  /* last four bytes of logo.gif */
23  var _loc_4:* = _loc_3.readUnsignedInt();
24  var _loc_2:* = new ByteArray();
25   
26  /* length of file - integer from last 4 bytes - 4 */
27  _loc_2.writeBytes(_loc_3, _loc_3.length - 4 - _loc_4, _loc_4);
28  _loc_2.position = 0;
29 
30  /* integer read from offset: length of file - integer from last 4 bytes - 4 */
31  Ǵ.setSharedProperty("mpsc", _loc_2);
32  Ǵ.start();
33 
34  return;
35 }// end function

As can be seen on lines 4-7, which are inside the constructor of the Flash file, the logo.gif file is downloaded. The URLLoader instance used to download the file has its complete listener set to the function shown on lines 11 through 35. This function is triggered once the file finishes downloading. On line 16 the file's contents are read into an array the decompiler named _loc_3 (the third declared local variable). On lines 19 and 20 the array's position is moved to four bytes before the end of the file and its endianness is set to little endian. On line 23 the integer in these last four bytes is read, and a new byte array, _loc_2, is declared. Line 27 is the key one, as _loc_2 is filled with the bytes of _loc_3 (logo.gif) starting at the end of the file minus the integer read minus another four bytes. At this point _loc_2 holds the encrypted shellcode, which is stored in the mpsc shared property. This buffer will later be executed by the exploit payload.
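The same carving logic is straightforward to mirror in Python. The sketch below is mine (the function name and the toy sample are illustrative, not taken from the exploit), but it performs the identical last-four-bytes arithmetic as the decompiled ActionScript:

```python
import struct

def extract_payload(gif_bytes):
    """Mirror of the ActionScript: read a little-endian uint32 from the
    last four bytes, then carve that many bytes ending just before it."""
    size = struct.unpack("<I", gif_bytes[-4:])[0]
    start = len(gif_bytes) - 4 - size
    return gif_bytes[start:start + size]

# Toy layout: fake GIF header, four "payload" bytes, then the length trailer
sample = b"GIF89a.." + b"\xde\xad\xbe\xef" + struct.pack("<I", 4)
assert extract_payload(sample) == b"\xde\xad\xbe\xef"
```

Running this against a real logo.gif yields the encrypted shellcode blob that the remaining stages operate on.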

Now that I knew how to find the shellcode reliably, I then needed to decrypt the shellcode in order to find the instructions that located and decrypted the embedded backdoor URL. When analyzing the shellcode I was working in a native Linux environment so I used a mix of vim, dd, and ndisasm. To start, I figured out the offset of the encrypted shellcode within my file and then extracted a copy of the shellcode.

The following lists the beginning of the shellcode:

 1 $ ndisasm -b32 stage1 
 2 00000000  D9EE            fldz
 3 00000002  D97424F4        fnstenv [esp-0xc]
 4 00000006  5E              pop esi
 5 00000007  83C61F          add esi,byte +0x1f
 6 0000000A  33C9            xor ecx,ecx
 7 0000000C  66B90009        mov cx,0x900
 8 00000010  8A06            mov al,[esi]
 9 00000012  8A6601          mov ah,[esi+0x1]
10 00000015  8826            mov [esi],ah
11 00000017  884601          mov [esi+0x1],al
12 0000001A  83C602          add esi,byte +0x2
13 0000001D  E2F1            loop 0x10
14 0000001F  EE              out dx,al
15 00000020  D974D9F4        fnstenv [ecx+ebx*8-0xc]
16 00000024  2483            and al,0x83
[snip]

For those of you who have never used ndisasm before, it is the disassembler that comes with the nasm assembler. The -b option selects the architecture (16-, 32-, or 64-bit Intel), and ndisasm simply treats the file as raw instructions, which makes it very useful when analyzing shellcode. In the output, the first column is the offset of the instruction from the beginning of the file, the second column is the instruction's opcodes, and the third column is the instruction's mnemonic.

As you can see in the ndisasm output, the instructions make sense until line 14 (offset 0x1f), where the out instruction appears. out is used to talk directly with hardware devices and as such is a privileged operation. Since this shellcode runs in userland, out cannot be used; even the operating system's kernel-mode components use it sparingly. Examination of the instructions on lines 2 through 13 reveals a decryptor loop that targets the instructions starting at line 14. To begin, lines 2-4 leverage the floating point unit to determine the runtime address of where the fldz instruction (offset 0) is in memory. The floating point internals that enable this are explained in a Symantec paper here, and the shellcode trick was first disclosed by noir in 2003 here.

Lines 5-7 then set up the loop. First, 0x1f is added to the esi register, which moves it to the offset of the out instruction. ecx is then zeroed using xor and the value 0x900 is moved into cx (the bottom half of ecx). This is the loop counter, so we know that the first layer of decryption will run for 0x900 (2304) iterations, transforming two bytes per pass. Lines 8 through 12 then implement the deobfuscation of the bytes beginning at line 14 (offset 0x1f) with an algorithm that translates to:

1  void stage1(unsigned char *buf)
2  {
3    unsigned char *esi;
4    int ecx;
5    unsigned char ah, al;
6
7    // add esi,byte +0x1f
8    esi = buf + 0x1f;
9
10   // xor ecx,ecx
11   // mov cx,0x900
12   ecx = 0x900;
13
14   while(ecx > 0)
15   {
16     // mov al,[esi]
17     al = *esi;
18
19     // mov ah,[esi+0x1]
20     ah = *(esi + 1);
21
22     // mov [esi],ah
23     *esi = ah;
24
25     // mov [esi+0x1],al
26     *(esi + 1) = al;
27
28     // add esi,byte +0x2
29     esi = esi + 2;
30
31     // loop 0x10
32     ecx = ecx - 1;
33   }
34 }

As you can see from the converted assembly, the purpose of this loop is to swap each pair of adjacent bytes in the obfuscated shellcode, using the ah and al registers (one byte each) as scratch space. After running the decoder above, the instructions starting at our original out instruction (offset 0x1f) now make sense and become the second stage of shellcode decryption:

 1 $ ndisasm -b32 stage2 
 2 00000000  D9EE          fldz
 3 00000002  D97424F4      fnstenv [esp-0xc]
 4 00000006  5E            pop esi
 5 00000007  83C621        add esi,byte +0x21
 6 0000000A  56            push esi
 7 0000000B  5F            pop edi
 8 0000000C  33C9          xor ecx,ecx
 9 0000000E  66B9F008      mov cx,0x8f0
10 00000012  90            nop
11 00000013  66AD          lodsw
12 00000015  662D6161      sub ax,0x6161
13 00000019  C0E004        shl al,0x4
14 0000001C  02C4          add al,ah
15 0000001E  AA            stosb
16 0000001F  E2F2          loop 0x13
17 00000021  6E            outsb
18 00000022  6A6F          push byte +0x6f
19 00000024  6F            outsd
[snip]

As can be seen, this decryptor stage is another loop that transforms the code that follows it. Lines 2-4 contain code necessary to place esi at the fldz instruction of the second stage decryptor. Line 5 then adds 0x21 to esi in order to point it to the junk outsb instruction at line 17. The loop counter is initialized to 0x8f0 at line 9 and then lines 11 through 15 perform the transformation. This transformation can be expressed in C as:

1 void stage2(unsigned char *buf)
2 {
3   int ecx;
4   unsigned char *esi;
5   unsigned char *edi;
6   unsigned short ax;
7   unsigned char al, ah;
8
9   // add esi,byte +0x21
10  esi = buf + 0x21;
11
12  // push esi
13  // pop edi
14  edi = esi;
15
16  // xor ecx,ecx
17  // mov cx,0x8f0
18  ecx = 0x8f0;
19
20  while(ecx > 0)
21  {
22    // lodsw
23    ax = *(unsigned short *)esi;
24    esi = esi + 2;
25
26    // sub ax,0x6161
27    ax = ax - 0x6161;
28
29    ah = (ax >> 8) & 0xff;
30    al = ax & 0xff;
31
32    // shl al,0x4
33    al = al << 4;
34
35    // add al,ah
36    al = al + ah;
37
38    // stosb
39    *edi = al;
40    edi = edi + 1;
41
42    ecx = ecx - 1;
43   }
44 }
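As a sanity check, the C above can be ported to a few lines of Python. This is my own sketch (stage2_decode is a name I made up), operating on the raw bytes that follow the stage-2 decryptor:

```python
def stage2_decode(data):
    """Collapse each little-endian word into one output byte, mirroring the
    lodsw / sub ax,0x6161 / shl al,4 / add al,ah loop."""
    out = bytearray()
    for i in range(0, len(data) - 1, 2):
        ax = (int.from_bytes(data[i:i + 2], "little") - 0x6161) & 0xffff
        al, ah = ax & 0xff, (ax >> 8) & 0xff
        out.append(((al << 4) + ah) & 0xff)
    return bytes(out)

# "aa" decodes to a zero byte: 0x6161 - 0x6161 = 0
assert stage2_decode(b"aa") == b"\x00"
```

The subtraction of 0x6161 ("aa" in ASCII) suggests the encoded bytes were chosen to look like printable text, which fits the letter-heavy junk instructions (outsb, "jo") visible in the stage-2 listing.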

After the second stage of deobfuscation the outsb (line 17 from the previous ndisasm output) and its following instructions look like:

 1 $ ndisasm -b32 stage3
 2 00000000  D9EE          fldz
 3 00000002  D97424F4      fnstenv [esp-0xc]
 4 00000006  5E            pop esi
 5 00000007  83C61F        add esi,byte +0x1f
 6 0000000A  33C9          xor ecx,ecx
 7 0000000C  66B96804      mov cx,0x468
 8 00000010  8A06          mov al,[esi]
 9 00000012  8A6601        mov ah,[esi+0x1]
10 00000015  8826          mov [esi],ah
11 00000017  884601        mov [esi+0x1],al
12 0000001A  83C602        add esi,byte +0x2
13 0000001D  E2F1          loop 0x10
14 0000001F  EE            out dx,al
15 00000020  D974D9F4      fnstenv [ecx+ebx*8-0xc]
16 00000024  2483          and al,0x83
[snip]

You may notice that this is the same algorithm used in stage 1 for decryption, just with a different loop counter, since the decryption process is moving further down the file. By running the algorithm starting at line 14 (offset 0x1f) we get the fourth level of decryptor shellcode:

 1 $ ndisasm -b32 stage4 
 2 00000000  D9EE          fldz
 3 00000002  D97424F4      fnstenv [esp-0xc]
 4 00000006  5E            pop esi
 5 00000007  83C616        add esi,byte +0x16
 6 0000000A  33C9          xor ecx,ecx
 7 0000000C  66B9BB08      mov cx,0x8bb
 8 00000010  803631        xor byte [esi],0x31
 9 00000013  46            inc esi
10 00000014  E2FA          loop 0x10
[snip]

This decryptor loop decrypts the next 0x8bb bytes using the following algorithm:

1 void stage4(unsigned char *buf)
2 {
3   int ecx;
4   unsigned char *esi;
5
6   // add esi,byte +0x16
7   esi = buf + 0x16;
8
9   // xor ecx,ecx
10  // mov cx,0x8bb
11  ecx = 0x8bb;
12
13  while (ecx > 0)
14  {
15    // xor byte [esi],0x31
16    *esi = *esi ^ 0x31;
17
18    // inc esi
19    esi = esi + 1;
20
21    // loop 0x10
22    ecx = ecx - 1;
23  }
24 }
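The stage-4 pass is a plain single-byte XOR, which makes it easy to mirror in Python. The sketch below is mine (name and defaults are illustrative, with the offset and counter taken from the disassembly above):

```python
def stage4_decode(buf, start=0x16, count=0x8bb):
    """XOR `count` bytes with 0x31 beginning at `start`, leaving the rest
    of the buffer untouched (mirrors the xor/inc esi/loop sequence)."""
    out = bytearray(buf)
    for i in range(start, min(start + count, len(out))):
        out[i] ^= 0x31
    return bytes(out)

# XOR with a constant is its own inverse, so applying it twice is a no-op
data = bytes(range(0x20, 0x40))
assert stage4_decode(stage4_decode(data)) == data
```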

After running this algorithm, we finally get to the fully deobfuscated shellcode and can begin analysis:

 1 $ ndisasm -b32 stage5 
 2 00000000  55               push ebp
 3 00000001  8BEC             mov ebp,esp
 4 00000003  81EC90010000     sub esp,0x190
 5 00000009  53               push ebx
 6 0000000A  56               push esi
 7 0000000B  57               push edi
[snip]

Remember that the original purpose of my friend's request was a decoder that could decrypt the encrypted URL used to download the backdoor file. The first relevant instruction that I found related to this task is at offset 0x90 in the deobfuscated function:

 1 00000090  E800000000        call dword 0x95
 2 00000095  5B                pop ebx
 3 00000096  83C350            add ebx,byte +0x50
 4 00000099  899D74FEFFFF      mov [ebp-0x18c],ebx
 5 0000009F  8B8574FEFFFF      mov eax,[ebp-0x18c]
 6 000000A5  813831123112      cmp dword [eax],0x12311231
 7 000000AB  740F              jz 0xbc
 8 000000AD  8B8574FEFFFF      mov eax,[ebp-0x18c]
 9 000000B3  40                inc eax
10 000000B4  898574FEFFFF      mov [ebp-0x18c],eax
11 000000BA  EBE3              jmp short 0x9f
[snip]

On line 1 we see a call instruction to the next instruction (pop ebx). This has the effect of placing the runtime address of the pop ebx instruction into ebx. 0x50 is then added to this address and a loop begins, searching toward the end of the GIF file for 0x12311231 (stored on disk as the byte sequence 31 12 31 12 due to little-endian ordering). This is the special marker used to denote where the encrypted URL begins. Once this marker is found, control is transferred to offset 0xbc (this check and jump occur on lines 6-7).

Starting at offset 0xbc, we have the following loop. Note that this disassembly is annoyingly long because the code was compiled without optimizations.

  1 000000BC  8B8574FEFFFF      mov eax,[ebp-0x18c]
  2 000000C2  83C004            add eax,byte +0x4
  3 000000C5  898574FEFFFF      mov [ebp-0x18c],eax
  4 000000CB  8B8574FEFFFF      mov eax,[ebp-0x18c]
  5 000000D1  898580FEFFFF      mov [ebp-0x180],eax
  6 000000D7  83A588FEFFFF00    and dword [ebp-0x178],byte +0x0
  7 000000DE  8B8574FEFFFF      mov eax,[ebp-0x18c]
  8 000000E4  038588FEFFFF      add eax,[ebp-0x178]
  9 000000EA  0FBE00            movsx eax,byte [eax]
 10 000000ED  83F8FF            cmp eax,byte -0x1
 11 000000F0  744F              jz 0x141
 12 000000F2  8B8574FEFFFF      mov eax,[ebp-0x18c]
 13 000000F8  038588FEFFFF      add eax,[ebp-0x178]
 14 000000FE  0FBE00            movsx eax,byte [eax]
 15 00000101  83F012            xor eax,byte +0x12
 16 00000104  8B8D74FEFFFF      mov ecx,[ebp-0x18c]
 17 0000010A  038D88FEFFFF      add ecx,[ebp-0x178]
 18 00000110  8801              mov [ecx],al
 19 00000112  8B8574FEFFFF      mov eax,[ebp-0x18c]
 20 00000118  038588FEFFFF      add eax,[ebp-0x178]
 21 0000011E  0FBE00            movsx eax,byte [eax]
 22 00000121  83E831            sub eax,byte +0x31
 23 00000124  8B8D74FEFFFF      mov ecx,[ebp-0x18c]
 24 0000012A  038D88FEFFFF      add ecx,[ebp-0x178]
 25 00000130  8801              mov [ecx],al
 26 00000132  8B8588FEFFFF      mov eax,[ebp-0x178]
 27 00000138  40                inc eax
 28 00000139  898588FEFFFF      mov [ebp-0x178],eax
 29 0000013F  EB9D              jmp short 0xde

On lines 1-3 the pointer to where the marker was found is incremented by 4 to skip the marker and then placed into [ebp-0x18c]. This value is also placed into [ebp-0x180] on line 5. The buffer holding the URL is then enumerated until an 0xff terminator is found; this is accomplished by the comparison against -0x1 on line 10 and the exit jump on line 11. Lines 12 through 29 perform the decryption of the URL. The core of this decryption is on lines 15 and 22, where each byte is transformed by XOR'ing it with 0x12 and then subtracting 0x31.

After this analysis, we now finally know how to find and decrypt the URL:
  1. Read in logo.gif
  2. Search the file for 0x31123112 in little endian
  3. Once found, decrypt each byte by XORing it with 0x12 and then subtracting 0x31
  4. Stop processing when a byte of 0xff is found
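Those four steps translate almost line for line into Python. The sketch below is not my friend's actual script, just a minimal reconstruction of the logic from the analysis above (names and the round-trip test data are mine):

```python
MARKER = b"\x31\x12\x31\x12"  # 0x12311231 as stored on disk (little endian)

def decode_url(gif_bytes):
    """Locate the marker, then decrypt the URL that follows it: each byte
    is XOR'd with 0x12 and then 0x31 is subtracted; 0xff terminates."""
    off = gif_bytes.find(MARKER)
    if off == -1:
        return None
    url = bytearray()
    for b in gif_bytes[off + 4:]:
        if b == 0xff:
            break
        url.append(((b ^ 0x12) - 0x31) & 0xff)
    return url.decode("ascii", errors="replace")

# Round trip against the inverse transform (illustrative data, not a real sample)
enc = bytes((((c + 0x31) & 0xff) ^ 0x12) for c in b"http://example/a.exe")
blob = b"GIF89a" + b"\x00" * 16 + MARKER + enc + b"\xff"
assert decode_url(blob) == "http://example/a.exe"
```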
After implementing the above steps I was able to give my friend a Python script that could read the decrypted URL from arbitrary logo.gif files that he found through his analysis and hunting:

$ python decode.py logo.gif
Found \x31\x12\x31\x12 marker at offset 17764
Found 0xff, breaking URL processing loop
Download URL: http://redacted/redacted.exe

In the end, a simple decryptor loop is all that is needed to decrypt the URL, but it took reversing multiple stages of shellcode and runtime code modifications in order to find and understand this algorithm.

I hope that you found reading this blog post informative and interesting. I would like to thank the other members of the Volatility Team (@iMHLv2, @gleeda, @4tphi) for proofreading this before I hit 'Publish' and @justdionysus for providing the historical references related to the use of the FPU for finding EIP. If you have any questions or comments on the post please leave a comment below, ping me on Twitter (@attrc), or shoot me an email (andrew @@@ memoryanalysis.net).

Wednesday, February 19, 2014

Training by The Volatility Project Now Available In Three Continents!

The Volatility Team is very happy to announce that we have a new website (http://www.memoryanalysis.net) and a number of upcoming training courses this year. With opportunities across three different continents, it's now easier than ever before to learn about the most exciting realms of digital forensics from instructors who pioneered the field and developed some of the industry's most powerful tools.

You can visit our new website to learn details about our offerings, including the popular Malware & Memory Forensics class, our Digital Forensics & Incident Response class at BlackHat Vegas, and our online Registry Forensics training.

The next public offerings of the Malware & Memory Forensics class include:
To provide fair warning, the New York and London classes WILL SELL OUT SOON. We've already exceeded our seating capacity in invites for both events - and it's first-come, first-served for those who complete the registration. Please contact us ASAP if you would like to attend these courses. We're also nearly at capacity for private/closed training events - with one slot remaining for 2014. If you want dedicated training on-site at your place of business, drop us a line.

The course in Australia is still currently being planned, but we expect it to be in either one of the last two weeks of August or the first week of September. It will be held in Canberra or Sydney. If you are interested in being put on the notification list for when registration opens for this course, then please contact us.

Here are a few other shortcuts to our new website for you:
Reviews of our past Malware & Memory Forensics offerings can be found here. Ian Ahl (@tekdefense) also wrote a blog post of his experience here.

Our most recent public offering in San Diego received very high praise as well:
"This was the most in depth forensic course I've ever taken. The instructors are top notch and really know the material and concepts behind it. If you're serious about protecting your network, you need to take this course" - Ryan G. 
"This is the best forensics training I have ever participated in. You don't just learn what commands you blindly punch in; you gain deep insight into win internals, understand how malware can subvert the OS, and how to detect these abuses. Also, tons of stuff I can bring home to continue training and apply to my work." - Christian B.  
"This was hands down the best (technical, useful, well explained, and relevent to current investigations) DFIR course/materials I have taken in the last 10 years! This is a must take class for anyone in DFIR. Aside from the knowledge + lab experience, the tools provided may be worth the class attendance alone. If you don't take this course, you're doing Windows DFIR wrong!" - Anonymous
If you are serious about learning memory forensics and want to learn it from the researchers and developers of The Volatility Project then you should consider taking our course. If you do take the course, you will not only be able to conquer advanced threats and malware, but you will also understand how your tool is operating every step of the way.

Monday, February 3, 2014

ADD: The Next Big Threat To Memory Forensics....Or Not

Similar to a rootkit, an anti-forensics tool or technique must possess two critical traits in order to be significant:

1. It must do something
2. It must get away with it

Satisfying #1 is the easy part. You can hide a process, hide a kernel module, or in the case of ADD - create fake, decoy objects to lead investigators down the wrong path. Although ADD is just a proof-of-concept, we're not convinced there's a concept that needs proving. The idea of creating decoy objects was presented in 2007:
Another area of concern is the susceptibility of these tools to false positives or decoys. It is possible for a malicious adversary to dramatically increase the noise to signal ratio and keep the digital investigator busy. Unfortunately, using this [the pool tag scanning method] makes it extremely easy for a malicious adversary to create "life-like" decoys. 
In other words, tools that use object carving (i.e. pattern matching, scanning) as an analysis technique are implicitly susceptible to attacks that create objects that look like the ones being carved. This is a well-understood consequence of the analysis technique and is true of file carving, internet artifact extraction, and various other types of forensic data.  It would not be responsible for a forensics analyst to ignore legitimate artifacts found using these techniques because they are susceptible to false positives. An analyst should understand the limitations of their tools/techniques and know how to validate or refute their findings with supporting artifacts.

Let's pretend for a moment that the decoy idea is new, however. Indeed it may be new, to some people, who have not seen the previous research. Yet, regardless of what action(s) are carried out in #1, the real challenge is satisfying #2. Once you've done what you want to do, can you clean up after yourself and not get busted?

Think of it this way - a suspect wants to rob a bank. It is implied that this crime is possible to commit - no proof is required. In fact, it's quite easy, as several very unintelligent people have shown in the past. The suspect gets so far as to take physical possession of the cash, but either gets trapped inside the bank or leaves a trail of money all the way back to his front door.

As the suspect sits in prison, he wonders "what have I accomplished?" and comes up blank. By failing to achieve #2, his efforts toward #1 are futile. Even if he came up with a completely new way of robbing a bank, one that had never been considered by another criminal, he still got caught.

The authors of ADD will argue that the time investigators spend pursuing the criminal makes the decoy concept worthwhile. They make absolutely no attempt to achieve #2. As a result, a talented memory analyst (who happens to be an alumnus of our training class) made short work of the anti-forensics tool - finding various ways to determine what happened, when it happened, and how it happened in a matter of minutes. In this case, it took the adversary considerably longer (probably weeks) to develop the tool, and it took the investigator the amount of time it takes to eat a bag of chips to blow the case wide open.

Another goal of ADD is to "reset the bar" and convince investigators not to trust what they find in memory. In an online recording, the author stated that the tool serves to teach a valuable lesson to people in the "point and click" forensics mindset. First of all, to reset the bar, you don't scale back and create a tool that only tricks the least skilled investigator. That may indeed reset the bar, but in the wrong direction.

Similarly, no investigator is so naive as to base their conclusions on one piece of data alone. There are various components to the digital crime scene, and one main reason we perform memory forensics is to corroborate evidence. If the supporting data isn't there (e.g. network connections in the firewall, packet captures, file system artifacts, etc.), then the fake artifact is quickly exposed.

In fact, ADD doesn't even do a good job of creating fake objects. The fake connections are created without process association, so you see an ESTABLISHED TCP connection with no owner. The fake processes stick out like a sore thumb, because they're only found by one of the 7 techniques that psxview uses to identify process objects. Attempting to dump the fake processes results in an error (expectedly), which raises even more suspicion. Also, the fake files it creates are found floating off a device that doesn't exist rather than a real physical drive.

The sheer amount of nonsense artifacts that this tool disperses in memory just begs for it to be noticed.  While stealth is admittedly not the motivation for this particular technique, increasing the noise becomes a liability when it can be easily triaged.

Perhaps the most astonishing aspect of ADD is that the author(s) failed to advise the audience on how their tool, or any anti-forensics method, could be detected. The question was posed once during the Q&A session at Shmoocon and again nearly two weeks later at about 40:40 into the online recording.

Host: What would you think the signs are [that someone should be looking for] whether or not there is in fact some reason to believe that you should go in and check for these [anti-forensics attacks]?

ADD Author: You know, unfortunately I don't have a good answer for that. I think this is going to be prohibitively difficult. 

After reading Forensic Analysis of Anti-Forensic Activities, you be the judge - is it prohibitively difficult to detect? This exemplifies the value of learning memory forensics techniques from the actual developers who performed the research and intimately understand the limitations of their tools. 

To conclude, in its current state, ADD creates poorly faked objects on one version of Windows (32-bit Windows 7) and draws more attention to itself than any other anti-forensics tool. There is a significant amount of work that needs to be done for this to change, so while the attackers are spending their weeks and months trying to build things up to spec, rest assured that with proper training and the right tools, you won't need to worry about future versions.

Tuesday, January 21, 2014

Malware Superlatives: Most Likely to Cry s/Wolf/Crocodile/

As a young boy once learned, it's bad to cry wolf. It's not necessarily bad to cry crocodile, but the authors of Blazgel decided to do it anyway. Blazgel is a kernel rootkit that hooks various SSDT entries and has some backdoor capabilities. When I first saw it hooking NtWriteVirtualMemory, it piqued my interest, because this is the native API called by WriteProcessMemory - a function commonly used for code injection. Presumably, by hooking this function, the rootkit could also prevent antivirus from disinfecting some of its components from memory. As I went to explore the real reason this malware hooked NtWriteVirtualMemory, I was a little surprised to see this:
Blazgel's NtWriteVirtualMemory API Hook Cries Crocodile
You may need to click the image to view a larger disassembly, but essentially what you're seeing is code like the following:

NTSTATUS Hook_NtWriteVirtualMemory(ProcessHandle,
                        BaseAddress,
                        Buffer,
                        NumberOfBytesToWrite,
                        NumberOfBytesWritten)
{

    if (True_NtWriteVirtualMemory != NULL) 
    {
        DbgPrint("crocodile");
        return True_NtWriteVirtualMemory(ProcessHandle, 
                                         BaseAddress,
                                         Buffer,
                                         NumberOfBytesToWrite,
                                         NumberOfBytesWritten);
    }
    //snip
}

The function named Hook_NtWriteVirtualMemory is the malicious handler that executes when NtWriteVirtualMemory is called. True_NtWriteVirtualMemory is the saved pointer to the real API function. Upon hooking the function, the malware saves the real API address so that it can still be referenced when needed. Strangely, this rootkit must have been deployed while still under development, because all the hook does is print crocodile to the kernel debug message facility and then pass the call through to the valid API function.

This post is an excerpt from Malware Superlatives, a sequel to the Making Fun of Your Malware presentation.

- Michael Ligh (@iMHLv2)

Thursday, January 16, 2014

Comparing the Dexter and BlackPOS (Target) RAM Scraping Techniques

Up until yesterday, when Brian Krebs wrote A First Look at the Target Intrusion, Malware, there weren't many details about the involved code. Now that it's out there, I thought it might be interesting to see how the "RAM scraping" feature worked in comparison to the Dexter malware. As it turns out, the two are quite similar, and neither is really exciting. This just goes to show that you don't need advanced (fine-tuned, maybe) tools to be successful at cyber crime.

The BlackPOS malware (see Krebs' article), a.k.a. Trojan.POSRAM, uses EnumProcesses to get the list of active PIDs on the system. It then cycles through the list, skipping its own PID (Dexter also skipped its parent PID). For all other processes, it calls GetModuleFileNameEx to get the full path to the executable, strips off the directory portion to leave just the file name (e.g., explorer.exe), converts it to lowercase, and compares it against "pos.exe" with strcmp. Had Target known that this specific sample only looks in the memory of processes named pos.exe, it could have renamed its point-of-sale application and avoided the news (for a few minutes anyway... other samples are known to exist that looked for other process names).
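The name-matching step (strip the path, lowercase, compare) can be sketched in portable C. The function name and buffer size are ours; the real sample does this with the Windows path returned by GetModuleFileNameEx:

```c
#include <ctype.h>
#include <string.h>

/* Return 1 if the file-name portion of a full path, lowercased,
 * equals "pos.exe" -- the comparison the malware performs. */
static int is_pos_process(const char *fullpath)
{
    /* strip everything up to the last backslash */
    const char *name = strrchr(fullpath, '\\');
    name = name ? name + 1 : fullpath;

    char lowered[260];  /* MAX_PATH */
    size_t i;
    for (i = 0; name[i] != '\0' && i < sizeof(lowered) - 1; i++)
        lowered[i] = (char)tolower((unsigned char)name[i]);
    lowered[i] = '\0';

    return strcmp(lowered, "pos.exe") == 0;
}
```

Note how brittle this is: the match is an exact string comparison, so any rename of the target process defeats it entirely.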

The "pos.exe" string isn't exactly in plain text, but the malware isn't packed either. It uses a very simple obfuscation technique where the string is split and the characters are shifted around a bit. You can see it in the read-only section of the PE file:


Once it finds a pos.exe process, it opens a handle with OpenProcess and uses VirtualQueryEx to begin iterating through the process's available memory blocks. This is exactly what Dexter did -- the only difference being that Dexter also checked memory protection constants and skipped ranges that were marked PAGE_NOACCESS or PAGE_GUARD. As a result, the BlackPOS malware could easily trigger an access violation (e.g. STATUS_GUARD_PAGE_VIOLATION) capable of crashing the malware. I can only imagine the look on the attackers' faces once they realized that they came up with 0 credit cards because they didn't check page permissions before reading a memory address. Unfortunately for Target, the pos.exe processes must not have had any no-access or guard pages set.

Here's a snippet of code reverse engineered from the malware that shows how it determines which ranges to scan:

VOID ScanMemory(HANDLE hProcess)
{
    int lpAddress = 0;
    int lpMaxAddress = 0x6FFFFFFF;
    int endAddress = 0;
    MEMORY_BASIC_INFORMATION anMBI;
    SIZE_T cbReturned;

    do { 
        cbReturned = VirtualQueryEx(hProcess, 
                        (LPCVOID)lpAddress, 
                        &anMBI, 
                        sizeof(anMBI));

        if (cbReturned && anMBI.RegionSize)
        {
            endAddress = (int)((char *)anMBI.BaseAddress + anMBI.RegionSize);
            ScanRange(hProcess, 
                      anMBI.BaseAddress,
                      endAddress);
        }

        // Note: if VirtualQueryEx fails, the cursor is still advanced by
        // the (stale or uninitialized) RegionSize from the structure
        lpAddress += anMBI.RegionSize;
    } while (lpAddress < lpMaxAddress);
}

Notice the malware stops at 0x6FFFFFFF to avoid scanning inside system DLLs, which normally reside in the higher regions of process memory. The ScanRange function (not shown) breaks the ranges identified by VirtualQueryEx into (roughly) 10 MB chunks and uses ReadProcessMemory to read the data into a buffer, which it then checks for patterns and substrings related to the credit card information it wants to steal. Dexter did something similar, but read data in roughly 400 KB chunks instead.
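The chunked-scan idea can be sketched in portable C against an in-memory buffer. The function name, chunk size, and pattern are illustrative (the real ScanRange used ~10 MB chunks read via ReadProcessMemory and matched track-data patterns), and note that this naive form would miss a match that straddles a chunk boundary:

```c
#include <stddef.h>
#include <string.h>

/* Walk a memory range in fixed-size chunks, searching each chunk for
 * a byte pattern. Returns 1 on the first hit, 0 if nothing matches. */
static int scan_range(const unsigned char *base, size_t len,
                      const char *pattern, size_t chunk)
{
    size_t plen = strlen(pattern);
    for (size_t off = 0; off < len; off += chunk) {
        size_t n = (len - off < chunk) ? (len - off) : chunk;
        if (n < plen)
            break;  /* remaining chunk too small to hold the pattern */
        for (size_t i = 0; i + plen <= n; i++)
            if (memcmp(base + off + i, pattern, plen) == 0)
                return 1;
    }
    return 0;
}
```

The real samples search for track 1/track 2 formatting (separators, PAN digit runs) rather than a fixed literal, but the scan loop structure is the same.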

In conclusion, Dexter and Trojan.POSRAM are really quite similar in terms of how they scan memory for the sensitive data.

-Michael Ligh (@iMHLv2)

Tuesday, January 14, 2014

TrueCrypt Master Key Extraction And Volume Identification

One of the disclosed pitfalls of TrueCrypt disk encryption is that the master keys must remain in RAM in order to provide fully transparent encryption. In other words, if master keys were allowed to be flushed to disk, the design would suffer in terms of security (writing plain-text keys to more permanent storage) and performance. This is a risk that suspects have to live with, and one that law enforcement and government investigators can capitalize on.

The default encryption scheme is AES in XTS mode. In XTS mode, primary and secondary 256-bit keys are concatenated together to form one 512-bit (64-byte) master key. An advantage you gain right off the bat is that patterns in expanded AES key schedules can be distinguished from other seemingly random blocks of data. This is how tools like aeskeyfind and bulk_extractor locate the keys in memory dumps, packet captures, etc. In most cases, extracting the keys from RAM is as easy as this:

$ ./aeskeyfind Win8SP0x86.raw
f12bffe602366806d453b3b290f89429
e6f5e6511496b3db550cc4a00a4bdb1b
4d81111573a789169fce790f4f13a7bd
a2cde593dd1023d89851049b8474b9a0
269493cfc103ee4ac7cb4dea937abb9b
4d81111573a789169fce790f4f13a7bd
4d81111573a789169fce790f4f13a7bd
269493cfc103ee4ac7cb4dea937abb9b
4d81111573a789169fce790f4f13a7bd
0f2eb916e673c76b359a932ef2b81a4b
7a9df9a5589f1d85fb2dfc62471764ef47d00f35890f1884d87c3a10d9eb5bf4
e786793c9da3574f63965803a909b8ef40b140b43be062850d5bb95d75273e41
Keyfind progress: 100%

Several keys were identified, but only the final two are 256-bit keys (the others are 128-bit). Thus, you can bet that by combining those two 256-bit keys, you'll have your 512-bit master AES key. That's all pretty straightforward and has been documented in quite a few places, one of my favorites being Michael Weissbacher's blog.
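Combining the two recovered values is just hex-string concatenation. A trivial sketch (the function is ours; which value is the primary versus the secondary key should be verified against the volume, since XTS distinguishes the two roles):

```c
#include <string.h>

/* Concatenate two 64-hex-char (256-bit) keys into one
 * 128-hex-char (512-bit) XTS master key string. */
static void make_master_key(const char *primary, const char *secondary,
                            char *out, size_t outsz)
{
    if (outsz < 129) { out[0] = '\0'; return; }
    memcpy(out, primary, 64);
    memcpy(out + 64, secondary, 64);
    out[128] = '\0';
}
```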

The problem is - what if suspects change the default AES encryption scheme? TrueCrypt also supports Twofish, Serpent, and combinations thereof (AES-Twofish, AES-Twofish-Serpent). Furthermore, it supports modes other than XTS, such as LRW, CBC, outer CBC, and inner CBC (though many of the CBC variants are either deprecated or not recommended).

What do you do if a suspect uses non-default encryption schemes or modes? You can't find Twofish or Serpent keys with tools designed to scan for AES keys -- that just doesn't work. As pointed out by one of our Twitter followers (@brnocrist), a tool by Carsten Maartmann-Moe named Interrogate could be of use here (as could several commercial implementations from Elcomsoft or Passware). 

Another challenge that investigators face, in the case of file-based containers, is figuring out which file on the suspect's hard disk serves as the container. If you don't know that, then having the master keys is like finding the key to a house without knowing where the house is. 

To address these issues, I wrote several new Volatility plugins. The truecryptsummary plugin gives you a detailed description of all TrueCrypt-related artifacts in a given memory dump. Here's how it appears on a test system running 64-bit Windows 2012:

$ python vol.py -f WIN-QBTA4959AO9.raw --profile=Win2012SP0x64 truecryptsummary
Volatility Foundation Volatility Framework 2.3.1 (T)

Process              TrueCrypt.exe at 0xfffffa801af43980 pid 2096
Kernel Module        truecrypt.sys at 0xfffff88009200000 - 0xfffff88009241000
Symbolic Link        Volume{52b24c47-eb79-11e2-93eb-000c29e29398} -> \Device\TrueCryptVolumeZ mounted 2013-10-11 03:51:08 UTC+0000
Symbolic Link        Volume{52b24c50-eb79-11e2-93eb-000c29e29398} -> \Device\TrueCryptVolumeR mounted 2013-10-11 03:55:13 UTC+0000
File Object          \Device\TrueCryptVolumeR\$Directory at 0x7c2f7070
File Object          \Device\TrueCryptVolumeR\$LogFile at 0x7c39d750
File Object          \Device\TrueCryptVolumeR\$MftMirr at 0x7c67cd40
File Object          \Device\TrueCryptVolumeR\$Mft at 0x7cf05230
File Object          \Device\TrueCryptVolumeR\$Directory at 0x7cf50330
File Object          \Device\TrueCryptVolumeR\$BitMap at 0x7cfa7a00
File Object          \Device\TrueCryptVolumeR\Chats\Logs\bertha.xml at 0x7cdf4a00
Driver               \Driver\truecrypt at 0x7c9c0530 range 0xfffff88009200000 - 0xfffff88009241000
Device               TrueCryptVolumeR at 0xfffffa801b4be080 type FILE_DEVICE_DISK
Container            Path: \Device\Harddisk1\Partition1
Device               TrueCrypt at 0xfffffa801ae3f500 type FILE_DEVICE_UNKNOWN

Among other things, you can see that the TrueCrypt volume was mounted on the suspect system on October 11th 2013. Furthermore, the path to the container is \Device\Harddisk1\Partition1, because in this case, the container was an entire partition (a USB thumb drive). If we were dealing with a file-based container as previously mentioned, the output would show the full path on disk to the file.

Perhaps even more exciting than all that is the fact that, despite the partition being fully encrypted, once it's mounted, any files accessed on the volume are cached by the Windows Cache Manager as normal -- which means the dumpfiles plugin can help you recover them in plain text. Yes, this includes $Mft, $MftMirr, $Directory, and other NTFS metadata files, which are decrypted immediately when the volume is mounted. In fact, even if the values that lead us to the master keys are swapped to disk, or if TrueCrypt (or other disk encryption suites like PGP or BitLocker) begins using algorithms without predictable/detectable keys, you can still recover all or part of any files accessed while the volume was mounted, because the Windows OS itself caches the file contents (remember, the encryption is transparent to the OS, so it caches files from encrypted volumes the same way it always does). 

After running a plugin such as truecryptsummary, you should have no doubts as to whether TrueCrypt was installed and in use, and which files or partitions are your targets. You can then run the truecryptmaster plugin which performs nothing short of magic. 

$ python vol.py -f WIN-QBTA4.raw --profile=Win2012SP0x64 truecryptmaster -D . 
Volatility Foundation Volatility Framework 2.3.1 (T)

Container: \Device\Harddisk1\Partition1
Hidden Volume: No
Read Only: No
Disk Length: 7743733760 (bytes)
Host Length: 7743995904 (bytes)
Encryption Algorithm: SERPENT
Mode: XTS
Master Key
0xfffffa8018eb71a8 bbe1dc7a8e87e9f1f7eef37e6bb30a25   ...z.......~k..%
0xfffffa8018eb71b8 90b8948fefee425e5105054e3258b1a7   ......B^Q..N2X..
0xfffffa8018eb71c8 a76c5e96d67892335008a8c60d09fb69   .l^..x.3P......i
0xfffffa8018eb71d8 efb0b5fc759d44ec8c057fbc94ec3cc9   ....u.D.......<.
Dumped 64 bytes to ./0xfffffa8018eb71a8_master.key

You now have a 512-bit Serpent master key, which you can use to decrypt the roughly 8 GB USB drive. The output also tells you the encryption mode the suspect used, the full path to the file or container, and some additional properties, such as whether the volume is read-only or hidden. As you may suspect, the plugin works regardless of the encryption algorithm, mode, key length, and various other factors that may complicate the procedure of finding keys. This is because it doesn't rely on key or key-schedule patterns -- it finds the keys in the exact same way the TrueCrypt driver itself finds them in RAM when it needs to encrypt or decrypt a block of data. 

The truecryptsummary plugin supports all versions of TrueCrypt since 3.1a (released 2005) and truecryptmaster supports 6.3a (2009) and later. In one of the more exciting hands-on labs in our memory forensics training class, students experiment with these plugins and learn how to make suspects wish there was no such thing as Volatility. 

UPDATE 1/15/2014: In our opinion, what's described here is not a vulnerability in TrueCrypt (that was the reason we linked to their FAQ in the first sentence). We don't intend to cause mass paranoia or discourage readers from using the TrueCrypt software. Our best advice to people seeking to keep data secure and private is to read the TrueCrypt documentation carefully, so you're aware of the risks. As stated in the comments to this post, powering your computer off is probably the best way to clear the master keys from RAM. However, you don't always get that opportunity (the FBI doesn't call in advance before kicking in doors) and there's also the possibility of cold boot attacks even if you do shut down.

-Michael Ligh (@iMHLv2)