SANS Penetration Testing

Finding Zero-Day XSS Vulns via Doc Metadata

[Editor's Note: Chris Andre Dale has a nice article for us about cross-site-scripting attacks, and he's found a ton of them in various high-profile platforms on the Internet, especially in sites that display or process images. He even found one in WordPress and responsibly disclosed it, resulting in a fix for the platform released just a few weeks ago. In this article, Chris shares his approach and discoveries, with useful lessons for all pen testers. Oh... and if you are going to test systems, make sure you have appropriate permission and don't do anything that could break a target system or harm its users. Thanks for the article, Chris! -Ed.]

By Chris Andre Dale

XSS Here, XSS There, XSS Everywhere!

Today Cross-Site Scripting (XSS) is very widespread. While it is not a newly discovered attack vector, we still see it all the time in the wild. Do you remember back in the day, when you would click on a website's guestbook and suddenly get tons of pop-ups or redirections? Yeah, that's often XSS for you. Today I see XSS vulnerabilities in almost all of the penetration testing engagements that I conduct.

Even to this very day, there is evidence of old XSS worms stuck on the web. Remember MySpace? Yeah, me neither. Do a Google search for "Samy is my hero". You will see thousands of ghostly remains of an XSS worm from 2005! The infamous Samy worm itself no longer lingers; what you are seeing is the remains of MySpace profiles that fell victim to it back then.

XSS is usually ranked as only medium impact when exploited. For instance, OWASP has rated this vulnerability as moderate impact. I disagree with this. In many cases XSS can be truly brutal and potentially life threatening. What do I mean? When XSS is bundled with other vulnerabilities, such as Cross-Site Request Forgery (CSRF), we can quickly imagine some very nasty scenarios. What if your XSS exploit hooks an IT Operations administrator, and through the XSS you add your CSRF payloads to perform administrative functions on their HVAC solutions? Alternatively, consider the unfortunate event where an attacker has successfully compromised thousands of hosts, using them all to DDoS an unsuspecting victim.

New XSS attack vectors arise all the time; however, we don't often see something truly new or untraditional. Wouldn't it be cool to see something other than just your ordinary filter bypass? In this article, I'll cover how I've successfully found 0-day exploits in WordPress, public sites and plugins for popular CMS systems, merely by using this technique.

Let's take a look at embedding XSS payloads into image metadata, more specifically EXIF data in JPEG images. This can be accomplished several ways. If you are old school (or perhaps just old :), you can accomplish this by modifying your camera settings:

The camera type used here is a Canon(1) camera.

Any self-respecting hacker uses ExifTool(2) by Phil Harvey to accomplish the task. The following command allows us to add or overwrite an EXIF tag, specifically the camera model that has allegedly been used to take this photograph:

exiftool.exe -"Camera Model Name"="// " "C:\research.jpg" 

Let's not just add the model name, let's extend it to other values as well:
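One way to script that step is sketched below in Python. It only builds the ExifTool argument list that writes the same harmless test string into several text tags. The tag selection (Make, Model, Artist, Copyright, ImageDescription), the alert string and the helper name build_exiftool_args are assumptions for illustration; the article does not list exactly which fields were set.

```python
import shlex

# Harmless proof-of-concept marker; an assumption, not the article's exact string.
PAYLOAD = "<script>alert('XSS')</script>"

# Text-valued EXIF tags that ExifTool can write; this selection is illustrative.
TAGS = ["Make", "Model", "Artist", "Copyright", "ImageDescription"]


def build_exiftool_args(image_path, payload=PAYLOAD, tags=TAGS):
    """Return an argv list that sets every tag in `tags` to `payload`."""
    args = ["exiftool"]
    for tag in tags:
        # ExifTool's write syntax is -TAG=VALUE, one argument per tag.
        args.append(f"-{tag}={payload}")
    args.append(image_path)
    return args


if __name__ == "__main__":
    argv = build_exiftool_args("research.jpg")
    # Print a shell-quoted version for review; hand `argv` to
    # subprocess.run(argv, check=True) to actually write the tags.
    print(shlex.join(argv))
```

Building a list instead of a single command string avoids the quoting headaches that angle brackets and quotes cause in a shell.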

As you can see, we've added the standard JavaScript alert code to a whole set of different EXIF data fields. Now we'll create a simple PHP script that will mimic a real-world example of a system that uses EXIF data:

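<?php
// Demo only: deliberately echoes EXIF values without any output encoding.
$filename = $_GET['filename'];
echo $filename . "<br />\n";
$exif = exif_read_data('tests/' . $filename, 'IFD0');
echo $exif === false ? "No header data found.<br />\n" : "Image contains headers<br />\n";
// Read every section, grouped as $exif[section][tag].
$exif = exif_read_data('tests/' . $filename, 0, true);
foreach ($exif as $key => $section) {
    foreach ($section as $name => $val) {
        echo "$key.$name: $val<br />\n";
    }
}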

The above script will simply iterate through all EXIF data keys it finds and will output the respective value. For testing purposes, this is exactly what we want.

PHP's EXIF parser does not have filtering in place by default, making it very interesting to test this attack vector. In cases where a developer has forgotten to sanitize these fields, we may have a successful attack. Developers think, in many cases, that some of their data is read-only, so why would they EVER need to sanitize it?! Common mistake?

Using the script above, and armed with our metadata-bombed picture, we can try to attack ourselves through the demo script. In the following picture, we have told the script to fetch our metadata-bombed picture, simply illustrating the attack with a JavaScript pop-up:

So what? We've successfully attacked ourselves with a pop-up message? Well, there is so much more to this than just attacking ourselves. First of all, we've verified that there is no built-in filtering in the PHP exif_read_data function. That means developers have to remember to apply filtering manually, and as we covered before, we all know that developers always remember this... Secondly, we've verified that we can gain executable JavaScript in someone's browser. From here, we can simply rewrite our pop-up payload to something much more subtle and evil, such as introducing a BeEF hook. More on this later.
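To make the manual filtering concrete, here is a minimal sketch in Python (not the PHP of the demo script) of encoding EXIF values before they reach the page. The function name render_exif_rows and the sample dictionary are made up for illustration; in PHP the rough equivalent would be wrapping each value in htmlspecialchars() before echoing it.

```python
import html


def render_exif_rows(exif):
    """Return HTML lines for a {section: {tag: value}} mapping, encoding every piece."""
    rows = []
    for section, tags in exif.items():
        for name, value in tags.items():
            # Treat keys and values alike as untrusted: all come from the uploaded file.
            label = html.escape(f"{section}.{name}")
            text = html.escape(str(value))
            rows.append(f"{label}: {text}<br />")
    return rows


if __name__ == "__main__":
    sample = {"IFD0": {"Model": "<script>alert('XSS')</script>"}}
    for row in render_exif_rows(sample):
        print(row)
```

The point of the sketch is where the encoding happens: at output time, for every field, regardless of whether the data "should" be read-only.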

Scouring for 0-Days

Armed with a fully metadata-bombed picture, I set sail into the Wild Wild Web. I had to check whether my assumption that developers fail to sanitize EXIF data was true or not. From there, I roamed into the depths of picture upload sites... I started googling "upload picture", "picture sharing", "photograph sharing" and much more. On the sites that I found interesting, I registered an account and started uploading my pictures.

On a side note, Mailinator(3) comes in handy doing this kind of research. In fact, I registered with the account for most of the sites; however, to my great surprise, one of the sites already had an account with this username! What?! Someone had actually registered with this account before? Then undoubtedly, I could do a password reset! Sure enough, doing a password reset, I gained access to someone else's account. Whew... Now, who would EVER register an account on a Mailinator address for their private pictures? Another security researcher? Criminals? Where do I go from here? Do I really want to venture into someone else's account? If so, what will I find? Regardless of my questions and doubt, I decided to continue, knowing surely that there is no turning back from what I might be about to see. To my surprise, and more importantly, to my relief, the site contained a bunch of family vacation pictures from a trip to Indonesia.

Many of the sites required registration, while many of them did not show any metadata at all. Out of 21 sites tested, 11 sites did not have a feature to display EXIF data, 7 sites had at least rudimentary filtering and 3 sites were found to be vulnerable. Not amazing numbers, but still fun to see it working in the wild, outside of my lab. What do I mean by rudimentary filtering? Well, it just means I didn't try to bypass the filtering. Additionally, I tested the attack vector on 3 WordPress plugins, of which 2 were found vulnerable and one had the appropriate filtering in place. Responsible disclosure to the sites and plugin authors has been conducted. Some of the examples in this article have been anonymized because, as of the launch of the article, they have still not patched the issue.

Keep in mind, many of the sites that were applying filtering could still be vulnerable. I did not conduct any filter bypass in my testing. My gut feeling is that the filters were very rudimentary and could easily be bypassed.

First, here is an example from which was not found to be vulnerable:

You can see the payload present in the title and the camera is automatically populated by the site. That means that instead of prompting me to set a title for my picture, the site used one of the EXIF data fields to pre-populate it for me. Interesting...this was something I saw as a repeating characteristic when doing my testing.

Flickr also applied appropriate filtering, keeping in mind that no filter bypass was attempted:

One particular site did not like my testing at all. When trying to upload my picture, it seemed to break something:

Anyway, we're not here for the failures, are we? We're here for the success stories! Ahh, this is the wonderful world of hacking...gaining success through other people's failures... *evil grin*

Here is a site where we can see our attack manifest itself. Just by uploading the picture and then viewing it, it triggers this vulnerability:

I also found the same vulnerability at other sites. We can see the image I've uploaded in the background: a princess and a unicorn. Sadly, no farting rainbows...

Many of the big sites were also tested, such as Google Plus, DeviantArt, and Photobucket. These were all applying some filtering. A site, however, that did not apply the necessary filtering was WordPress.

In the screenshot above, I've successfully triggered the payload on an uploaded image by accessing it through its attachment page. Remember, I am using a harmless payload, just alerting a text message. This could be a completely stealthy attack payload if I wanted it to be. Let's dive further into the WordPress finding.

The WordPress Exploit

WordPress is the most popular blogging platform on the internet today, racking up more than 60 million websites in 2012 (4). Finding working exploits in such a platform can be very interesting for many actors, hence they also have a working bug bounty program (5). The vulnerability I'm demonstrating in this article has been submitted to WordPress through responsible disclosure, and we held this article until they had properly patched the issue.

The WordPress vulnerability manifests when an administrator, or editor, uploads an image with the ImageDescription EXIF data tag set to a JavaScript payload. The exploit works only for these user accounts, as stricter filtering is applied to the other accounts. This has sparked some controversy about this vulnerability; however, as I will show in this article, we can create an attack that is fully stealthy, allowing it to take place without an administrator knowing what is going on.

Why the controversy? With WordPress, and other CMS systems such as SharePoint, some roles are allowed to upload HTML elements. With WordPress, administrators and editors are allowed to post unfiltered HTML (6). The other side of the controversy is how the attack can be made super stealthy. The administrator has very limited ways to realize that he is doing something wrong and actually uploading malware into his site. Now, that's cool! This is also why WordPress has chosen to patch this issue.

Embedding some JavaScript into the tag and then uploading it will trigger the vulnerability once a user views the attachment page of the image. Using Exiftool, you can accomplish this with the following command:

Here I've changed my JavaScript payload to a reference instead of embedding the JavaScript file itself in the image. This will give us increased flexibility when creating working payloads.

exiftool.exe -"ImageDescription"="<script src=\"\">" paramtest1.jpg

The following example is one of my first runs of the attack. It is not stealthy as the administrator can easily pick up that something is wrong, simply by looking at the title element of the page. WordPress uses the ImageDescription element to populate the title element, and properly filters before doing so. We'll see soon how to bypass this.

The attack works when you navigate to the attachment page; however, any WordPress editor with an IQ higher than their shoe size would most likely realize that something is fishy and immediately delete the picture. If we stopped at this point, I don't think the issue would warrant a patch or much attention at all; however, the next steps allow us to go into stealth mode.

If I figured out a way for the payload to be embedded, but without the title element being overridden, I could make the attack feasible. Luckily, I discovered a small artifact when doing the testing. Trying different types of encoding, and other obfuscation techniques, produced some really long strings. When producing a long enough string, I noticed that WordPress suddenly defaulted to using the filename as the title element! Nice!
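The behaviour described above can be modelled roughly as follows. This is a toy model in Python, not WordPress's actual code: the function names are hypothetical, and the length limit is a made-up parameter, since the article does not state the real threshold.

```python
def choose_title(description, filename, max_title_len):
    """Toy model: use the description as title unless it is empty or too long."""
    if description.strip() and len(description) < max_title_len:
        return description.strip()
    return filename


def pad_description(payload, max_title_len):
    """Left-pad the payload with spaces so its total length reaches the limit."""
    padding = max(0, max_title_len - len(payload))
    return " " * padding + payload


if __name__ == "__main__":
    limit = 100  # hypothetical value, not WordPress's real threshold
    payload = '<script src=""></script>'  # URL left empty, as in the article
    padded = pad_description(payload, limit)
    print(choose_title(payload, "paramtest1.jpg", limit))  # short: description wins
    print(choose_title(padded, "paramtest1.jpg", limit))   # padded: filename wins
```

In this model the padded description is still stored and rendered elsewhere on the page; only the title falls back to the filename, which is what keeps the payload out of sight.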

The following Exiftool command makes WordPress ignore the ImageDescription, allowing a more stealthy attack:

exiftool.exe -"ImageDescription"="                                                                                <script src=\"\"></script>" paramtest1.jpg

Notice all the extra spaces. This extra padding makes WordPress think that this is too long for the title field, thus defaulting to simply using the filename. The attack now manifests more beautifully when we upload the picture:

The picture loads normally. Our XSS vector is currently invisible. Here is what happens when someone, e.g. the administrator, visits the picture:

The screenshot shows how I've successfully included my malicious JavaScript. This could be a simple BeEF (7) hook, allowing us a very high level of control over the victims. From here, it's game over.

Best Regards, Cross-Site Scripting

Why stop at EXIF data? Other types of metadata may not be parsed online on the same scale as EXIF, but let's look at embedding XSS into them as well.

What if a webpage allowed you to upload a Word document, and it would then automatically extract the Author field of the document and embed it on the site? That could definitely lead to a vulnerability. It sounds like a good vector for XSS attacks, or even other types of attacks such as SQL Injection if they store the information in a database. When I look at the document I'm writing right now, I can see the following metadata information:

Without a doubt, many of these parameters can easily be changed by the user, either through ExifTool or using the word processor itself. The following example shows editing the username to an XSS payload. I do apologize for the Norwegian text; I've been cursed with a Norwegian installation of Windows and Office by my IT department.
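To illustrate the Author-field scenario from the receiving side, here is a hedged sketch in Python. A .docx file is a ZIP archive whose docProps/core.xml part carries the author in a dc:creator element; the sketch builds a stripped-down archive containing only that part (so it is not a complete Word document), reads the author back the way an upload handler might, and encodes it before display. The helper names make_core_xml and read_author are invented for this example.

```python
import html
import io
import zipfile
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
CP_NS = "http://schemas.openxmlformats.org/package/2006/metadata/core-properties"


def make_core_xml(author):
    """Build a minimal docProps/core.xml with the given author (for the demo)."""
    ET.register_namespace("cp", CP_NS)
    ET.register_namespace("dc", DC_NS)
    root = ET.Element(f"{{{CP_NS}}}coreProperties")
    ET.SubElement(root, f"{{{DC_NS}}}creator").text = author
    return ET.tostring(root, encoding="utf-8")


def read_author(docx_bytes):
    """Extract dc:creator from a .docx (a ZIP archive) given as bytes."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        root = ET.fromstring(archive.read("docProps/core.xml"))
    node = root.find(f"{{{DC_NS}}}creator")
    return node.text if node is not None and node.text else ""


if __name__ == "__main__":
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("docProps/core.xml", make_core_xml("<script>alert(1)</script>"))
    author = read_author(buffer.getvalue())
    print("raw:    ", author)
    print("encoded:", html.escape(author))
```

The XML parser hands back the author string exactly as the uploader wrote it, so the encoding step at display time is the only thing standing between the metadata and the page.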

Pictures and documents. What about audio? Here is an example adding XSS to an mp3 file through the awesome, free and open-source tool Audacity (8):

There are probably tons of other situations where we can add these types of attacks. It's up to us to find them before the bad guys do.


Let's consider the future. The data we embed in metadata today might, sometime in the future, exploit services that have not yet been developed. Perhaps we'll see XSS shooting out from projectors, chat services, glasses (e.g. Google Glass) or robots going crazy with alert(1)'s all over the place. Or perhaps even cooler, your files embedded with XSS today might someday trigger a callback connection straight back to your BeEF hook?

The bottom line is, data coming from a third party, be it a user or another system, should be sanitized! You know that whole concept of garbage in, garbage out? Let's stop that.
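For the database case mentioned earlier, a small Python sketch of handling metadata safely at both sinks might look like this: placeholders when storing, HTML encoding when displaying. The table layout, function names and sample values are assumptions made up for the example, and sqlite3 simply stands in for whatever database a real site uses.

```python
import html
import sqlite3


def store_author(conn, filename, author):
    """Insert metadata using placeholders so the value is never parsed as SQL."""
    conn.execute(
        "INSERT INTO uploads (filename, author) VALUES (?, ?)",
        (filename, author),
    )


def render_authors(conn):
    """Return one HTML-encoded line per stored upload."""
    rows = conn.execute("SELECT filename, author FROM uploads ORDER BY rowid")
    return [f"{html.escape(name)}: {html.escape(author)}" for name, author in rows]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE uploads (filename TEXT, author TEXT)")
    store_author(conn, "report.docx", "x'); DROP TABLE uploads;--")
    store_author(conn, "photo.jpg", "<script>alert(1)</script>")
    for line in render_authors(conn):
        print(line)
```

Both hostile strings end up stored verbatim and displayed as inert text, which is the outcome we want from any system that ingests third-party metadata.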

Additionally, it is important for pen testers to have this information in their arsenal when doing their testing. The testers need to think outside the box and cover as much testing surface as possible.

Also, Ed Skoudis had a student who mentioned some great research that has been made on sites processing metadata. I recommend checking out the research done at (9). It might spark some further testing and research for some of our readers.

Now, go onward my friends and...



-Chris Andre Dale


Posted December 26, 2014 at 4:24 AM

Mohamed Ramadan

Well done Chris, I found XXE in Facebook using a similar approach.
Keep Rocking

Posted January 6, 2015 at 9:29 AM

Chris Andre Dale

Thanks Mohamed!
I know! I saw your XXE exploit. Very well done my friend.

Posted June 1, 2016 at 4:30 AM


i like this

Posted February 1, 2018 at 9:09 AM


This post saved my week!Cheers!!!
