PDF analysis with peepdf
PDF analysis
In this post, I will show a basic use of peepdf to analyze a malicious PDF file.
peepdf is a great Python tool available at "http://eternal-todo.com/tools/peepdf-pdf-analysis-tool". It is well documented and extensible.
An extension example is ParanoiDF available at "http://securityblackhawk.blogspot.fr/2014/08/paranoidf-analyse-for-pdf-files.html".
peepdf can be driven by scripts or through an intuitive interactive shell (the "-i" option).
The analyzed sample is identified by its MD5 hash: "369614d7c422201f2d1605f4befd452d".
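Before going further, it is worth checking that the file on disk really is the sample identified above. A minimal sketch, using only the Python standard library; the function names and the idea of hard-coding the expected hash are mine, not part of peepdf:

```python
import hashlib

# MD5 quoted in this post for the analyzed sample.
EXPECTED_MD5 = "369614d7c422201f2d1605f4befd452d"


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large samples do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_expected_sample(path):
    """True when the file hashes to the MD5 given in the post."""
    return md5_of_file(path) == EXPECTED_MD5
```

MD5 is only used here as an identifier for the sample, not as a security guarantee.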
First, let's open the file with peepdf:
There are a lot of objects, but peepdf highlights those of interest (the "suspicious elements" section). Each of these suspicious elements plays a role in the attack scenario.
ParanoiDF provides a nice feature to get a text rendering of the content of the PDF file. This file seems to have been used in some sort of phishing campaign and most likely presents a form to the victim.
OpenAction object
An OpenAction object is "a value specifying a destination to be displayed or an action to be performed when the document is opened" (source: http://partners.adobe.com/public/developper/en/pdf/PDFReference.pdf).
So when the document is opened, the object referenced by this OpenAction object will be called. peepdf lets you see which object that is: just use the "object" command with the OpenAction object id as a parameter.
The JavaScript function called there, exportDataObject, is used to extract the specified data object to an external file. "cName" is the name of the data object to extract.
"nLaunch" controls whether the file is launched, or opened, after it is saved. A value of "0" means that the file to extract will not be launched after having been saved.
So we know that an embedded file will be extracted from the PDF and saved on disk. But there is no direct link with an object in the document.
However, looking back at the OpenAction object, we can see a reference to a "/Names" object whose id is 49.
Names object
This object contains one entry with an /EmbeddedFiles element (id 50). Such an element contains a name tree that maps name strings to embedded file streams.
Looking at object 50, we can find a reference to "W2" and a mapping to object 51.
As illustrated above, object 51 is a file specification for a file object embedded in object 52.
From the initial information provided by peepdf, object 52 is indeed an embedded file. This is how it is reached when the document is opened. But what is this file?
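The chain followed so far (49, then 50, then 51, then 52) can be sketched in a few lines of Python. The object bodies below are simplified stand-ins that I wrote for illustration; they are not copied from the sample, and the helper names are hypothetical. The sketch just follows the first indirect reference ("N G R") found in each object:

```python
import re

# Illustrative object bodies only: simplified reconstructions based on the
# description above, NOT the real content of the sample.
OBJECTS = {
    49: "<< /EmbeddedFiles 50 0 R >>",
    50: "<< /Names [ (W2) 51 0 R ] >>",
    51: "<< /Type /Filespec /F (W2.pdf) /EF << /F 52 0 R >> >>",
    52: "<< /Type /EmbeddedFile /Filter /FlateDecode >>",
}

# An indirect reference is written "<object number> <generation> R".
INDIRECT_REF = re.compile(r"\b(\d+)\s+(\d+)\s+R\b")


def references(body):
    """Return the object numbers referenced in a raw object body."""
    return [int(number) for number, _generation in INDIRECT_REF.findall(body)]


def follow_chain(objects, start):
    """Follow the first unseen reference of each object, starting at `start`."""
    chain = [start]
    seen = {start}
    current = start
    while True:
        targets = [ref for ref in references(objects[current])
                   if ref in objects and ref not in seen]
        if not targets:
            return chain
        current = targets[0]
        seen.add(current)
        chain.append(current)


print(follow_chain(OBJECTS, 49))
```

A real PDF object can reference many others, so following only the first reference is a shortcut that happens to fit this small chain. In practice peepdf's "object" command does this resolution for you.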
Embedded file
peepdf shows that the file is supposed to be another PDF file (Subtype: /application/pdf) and that the stream is encoded using the Flate encoding filter. Note that when an object is displayed using the "object" command, peepdf applies the necessary filters to display the uncompressed version of the object.
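To give an idea of what applying the Flate filter means, here is a minimal sketch that inflates the stream of a raw object with Python's zlib module. It assumes a single stream with a single Flate filter and no predictor, and the function name is mine; it is not how peepdf is implemented:

```python
import re
import zlib


def inflate_stream(raw_object):
    """Inflate the Flate-encoded stream found in a raw PDF object (bytes)."""
    start = re.search(rb"stream\r?\n", raw_object)
    end = raw_object.rfind(b"endstream")
    if start is None or end < start.end():
        raise ValueError("no stream found in object")
    payload = raw_object[start.end():end]
    # decompressobj() stops at the end of the zlib data, so the end-of-line
    # marker that precedes "endstream" is ignored instead of raising an error.
    return zlib.decompressobj().decompress(payload)
```

For example, `inflate_stream(raw)` on the raw bytes of an object such as object 52 would return the decoded content, which can then be written to disk for further inspection.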
The stream is too large to be displayed in the console, but peepdf allows for easy export using the ">" command.
Let's see what the embedded file actually is. Instead of being a PDF file, it seems to be a PE file.
The W2.pdf file could be carved to extract the PE file, but there is another way to do it with peepdf, using raw stream manipulation and decoding.
The file is recognized as malicious by ClamAV.
Again, peepdf has a cool feature to help with this identification task: the vtcheck command, which checks the PDF file against the VirusTotal database using the public API of this site.
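This kind of identification boils down to looking at magic bytes. A minimal sketch, assuming the decoded content is available as a bytes object and looking only at the start of the buffer (the function name and return values are mine):

```python
def guess_file_type(data):
    """Roughly classify a buffer as 'pdf', 'pe', 'mz' or 'unknown'."""
    if data.startswith(b"%PDF-"):
        return "pdf"
    if data.startswith(b"MZ"):
        # In a PE file, the 4 bytes at offset 0x3C give the offset of the
        # "PE\0\0" signature.
        if len(data) >= 0x40:
            pe_offset = int.from_bytes(data[0x3C:0x40], "little")
            if data[pe_offset:pe_offset + 4] == b"PE\x00\x00":
                return "pe"
        return "mz"  # DOS header only, PE signature not confirmed
    return "unknown"
```

This is far less thorough than the "file" utility or a real PE parser, but it is enough to notice that a file named W2.pdf does not start like a PDF.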
The vtcheck command can be used to check the analyzed PDF file itself (default behavior), part of it (either object, rawstream, range of bytes, ...) or a file.
|vtcheck PDF file|
|vtcheck PE file|
So how is it ever executed?
There is another object of interest in this PDF file: a "Launch" object (object 54).
This object's type is /Action/Launch. According to the PDF reference document (available on the Adobe website), a Launch action is used to "launch an application, usually to open a file".
Displaying the content of the object makes the execution method quite obvious: a Windows shell is launched with a script that searches for the W2.pdf file in common places ("Desktop" or "My Documents") and, if found, executes it.
Again, how is this action launched?
The AA object
The Launch action is associated with a media box defined in object 3, whose "kid", object 2, defines an "AA" dictionary. An "AA" (additional actions) dictionary defines the actions to be taken in response to events affecting the document. Object 2 is one of the "suspicious elements" reported by peepdf after opening the file.
So the message "To view [...]" will certainly appear in a media box.
If I understood the PDF specification correctly, the "/O" key in the /AA entry of a page object means that the Launch action is triggered when that page is opened, which for the first page happens as soon as the document is opened.
The attack relies a bit on social engineering because the user is expected to:
1/ agree to save an embedded file when the PDF file opens and
2/ press "open" to view the so-called "encrypted content" (from the saved embedded PDF file).
I tested the file under Windows using Sandboxie and Adobe Reader 9 to illustrate the attack from the user's point of view.
|Open the PDF file|
|Ask to save W2.pdf|
|Ask for file execution|
Conclusion
This was a short introduction to peepdf and its features. The analyzed PDF file does not use any exploit and the attack strategy relies only on social engineering.
Late addition: this looks like a direct use of Metasploit's module "Adobe PDF Embedded Exe Social Engineering". This source code excerpt explains why it did not work on my French XP OS:
# check for the pdf in these dirs, in this order..
dirs = [ "Desktop", "My Documents", "Documents", "Escritorio", "Mis Documentos" ]
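To make the point concrete, here is a small Python sketch of the lookup only (it searches, it does not launch anything). The directory list comes from the excerpt above; the function name is hypothetical, and the folder names "Bureau" and "Mes documents" for a French XP profile are my assumption about where the file ended up:

```python
import os

# Directory names taken from the Metasploit excerpt above, in the same order.
SEARCHED_DIRS = ["Desktop", "My Documents", "Documents",
                 "Escritorio", "Mis Documentos"]


def find_dropped_file(profile_dir, filename, dirs=SEARCHED_DIRS):
    """Return the first existing <profile_dir>/<dir>/<filename>, or None."""
    for name in dirs:
        candidate = os.path.join(profile_dir, name, filename)
        if os.path.isfile(candidate):
            return candidate
    return None
```

With a profile where W2.pdf was saved under "Bureau" or "Mes documents", none of the searched directories matches, `find_dropped_file` returns None, and the second stage is never found.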