Exploiting XSS in Ajax Web Applications

Following up on yesterday's post, Pluck SiteLife software multiple XSS vulnerabilities, let's take a look at how to exploit XSS in JSON responses using Internet Explorer.
Quick introduction to JSON
JSON is a model for encoding data, used by many web applications that want to serve dynamic or updating content within a single web page. It's formatted like so:
{"parameter":"value","next_parameter":"next_value"}
Using a technique called Ajax, JSON data is normally transferred behind the scenes as a web page is loading. Some people may not realize that because Ajax uses the standard HTTP protocol, it's possible to access JSON data directly by navigating the web browser to a specific URL. An example of this is the Twitter API, which allows me to construct a URL that provides a JSON-encoded version of my Twitter profile and my last tweet. The JSON in the response can be accessed directly or used with embedded scripts to display inline information.
<textarea id="nerds" style="width:700; height:34" disabled="true"></textarea>
<script>function callback(twitters){document.getElementById("nerds").value=twitters[0].text}</script>
<script src="https://api.twitter.com/1/statuses/user_timeline.json?include_entities=true&include_rts=true&screen_name=superevr&count=1&callback=callback"></script>
Websites using JSON without proper output encoding are likely to be vulnerable to XSS
Like any other web page, JSON responses are likely to reflect back the values they are given. This becomes problematic when the response contains HTML syntax and characters. Web browsers are designed to render HTML, and as soon as they see it they want to render the code into an image, a link, or a form field as quickly as possible. When testing for XSS, I inject sample code like the HTML strikeout tag <s> into one of the request parameters and see whether the browser displays text with a line through it. If it does, that is a pretty good indication of a cross-site scripting vulnerability.
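To make the reflection concrete, here is a minimal Python sketch of a JSONP-style endpoint that echoes its parameters without output encoding. The function name `build_jsonp_response` and its parameters are hypothetical and not taken from any real application; the point is only that a standard JSON serializer leaves `<` and `>` untouched.

```python
import json

def build_jsonp_response(callback: str, query: str) -> str:
    """Hypothetical, deliberately vulnerable handler: reflects input as-is."""
    # json.dumps escapes quotes and backslashes, but it does not touch < or >,
    # and the callback name is not validated at all.
    return f"{callback}({json.dumps({'query': query})})"

body = build_jsonp_response("callback", "<s>struck through?")
print(body)
```

If a browser can be convinced to treat this response as HTML, the `<s>` tag is rendered rather than displayed.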
The catch
In a clever attempt to prevent browsers from incorrectly rendering JSON code, the web server presents these pages with a special Content-Type of application/json or application/x-javascript. This tells the browser that it shouldn't render any code here because it has a special use. Unfortunately, this isn't enough.
Content Sniffing for HTML in Internet Explorer
But web browsers really do love rendering code, and will render HTML regardless of the Content-Type if you give them a good enough excuse. This is called content sniffing, and attackers can use it in different scenarios to run malicious JavaScript on a website that was thought to be immune to attack. Here are two facts about content sniffing that hackers already know:
- Internet Explorer relies heavily on the file extension when content sniffing.
- File extensions can be spoofed by the requestor.
This means that user/json will be displayed as plaintext, but user/json.htm can render as HTML! Depending on the web server, there are several ways to spoof the file extension. A few examples:
- /json.htm
- /json.html
- /json/.html (PHP and ASP.NET applications)
- /json;.html (JSP applications) (see three semicolon vulnerabilities)
- /json.cgi?a.html (discovered by Hasegawayosuke)
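When testing an endpoint, the variants above can be generated mechanically. This is only a sketch: the helper name `spoofed_extension_variants` and the example host are made up, and the last variant appends an extension-looking query string to whatever path is given rather than requiring a .cgi path.

```python
def spoofed_extension_variants(url: str) -> list:
    """Return candidate URLs that tack an HTML-looking extension onto url."""
    base = url.rstrip("/")
    return [
        base + ".htm",
        base + ".html",
        base + "/.html",   # extra path info (PHP and ASP.NET style)
        base + ";.html",   # path parameter (JSP style)
        base + "?a.html",  # extension-looking query string
    ]

variants = spoofed_extension_variants("http://site.example/user/json")
for candidate in variants:
    print(candidate)
```

Whether any of these still returns the JSON response depends entirely on how the server routes requests, so each one has to be tried by hand.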
Troubleshooting
Content sniffing is not always that easy. Here are some factors that may interfere with basic tests for content sniffing:
- Unable to add arbitrary file extensions in the URL path
- Site is using HTTPS
- Site has headers for cache-control: no-cache or pragma: no-cache
- Site has header content-disposition: attachment
- Site Content-Type header is set to image/[anything]
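The header-related items in this checklist are easy to check automatically. The sketch below assumes you already have the response headers as a dictionary; the function name `sniffing_blockers` and the example URLs are hypothetical, and the first item (whether the server accepts arbitrary extensions) still has to be tested manually.

```python
def sniffing_blockers(url: str, headers: dict) -> list:
    """List the checklist items that apply to a given response."""
    # Header names are case-insensitive, so normalise before comparing.
    h = {name.lower(): value.lower() for name, value in headers.items()}
    reasons = []
    if url.lower().startswith("https://"):
        reasons.append("site is using HTTPS")
    if "no-cache" in h.get("cache-control", "") or "no-cache" in h.get("pragma", ""):
        reasons.append("no-cache header present")
    if h.get("content-disposition", "").startswith("attachment"):
        reasons.append("content-disposition: attachment")
    if h.get("content-type", "").startswith("image/"):
        reasons.append("image/* content type")
    return reasons

found = sniffing_blockers(
    "https://site.example/user/json",
    {"Cache-Control": "no-cache", "Content-Type": "application/json"},
)
print(found)
```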
Remediation
To protect against this type of vulnerability, several changes must be made. As always, programs should first validate that user input contains appropriate text characters. Also, any time user input is reflected back to a web browser, that text should be encoded properly (e.g. replace < with an HTML entity or an escape sequence such as &lt;, \x3c, or \u003C). Finally, as an extra protection measure, have the web server include the additional header X-Content-Type-Options: nosniff to prevent content sniffing in Internet Explorer 8+ and other browsers.
JSON is generally designed to be processed in the background by JavaScript, so I understand why developers forget about, or are unaware of, what can happen when the JSON data is accessed directly. Hopefully this post raises awareness of these security issues.
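A minimal Python sketch of the encoding and header steps might look like the following. The helper name `json_for_browser` and the `RESPONSE_HEADERS` dictionary are illustrative assumptions, not part of any framework; the idea is to escape HTML-significant characters with \uXXXX sequences, which remain valid JSON, and to send the nosniff header alongside the response.

```python
import json

def json_for_browser(obj) -> str:
    """Serialize obj as JSON with HTML-significant characters escaped."""
    text = json.dumps(obj)
    # In serialized JSON these characters can only occur inside strings,
    # and \u003c style escapes decode back to the original characters.
    return (text.replace("<", "\\u003c")
                .replace(">", "\\u003e")
                .replace("&", "\\u0026"))

# Hypothetical header set a handler could attach to the response.
RESPONSE_HEADERS = {
    "Content-Type": "application/json; charset=utf-8",
    "X-Content-Type-Options": "nosniff",
}

payload = {"cb": "<body onload=alert(1)//>"}
safe = json_for_browser(payload)
print(safe)
```

A JavaScript client that parses this response still receives the original string, but a browser that sniffs the body as HTML finds no tags to render.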
Pluck SiteLife software multiple XSS vulnerabilities

On November 30, 2011 I reported to US-CERT that I found multiple XSS vulnerabilities in Demand Media's Pluck SiteLife software. The details of the vulnerabilities (now patched) were published yesterday as US-CERT Vulnerability Note VU#400619.
Here's the original report I sent to US-CERT on November 30, 2011:
I would like to report multiple XSS vulnerabilities.
...Here are the vulnerability details for Pluck:
This demonstrates multiple XSS vulnerabilities in the Pluck SiteLife Software. According to a sales associate, "The SiteLife product was rolled into a broad social/community platform offering about 2.5 years ago. It's simply called Pluck now and Pluck 5 is the latest version." The version of Pluck that is exploitable is unknown by me at this time.
Here are a few of the known vulnerable URL's and URL parameters:
http://sitelife.example.host/ver1.0/Direct/Process?referrerURL=x&jsonRequest=<body%20onload=alert(1)//> (Internet Explorer)
http://sitelife.example.host/ver1.0/Direct/jsonp.htm?r=<img%20src=x%20onerror=alert(2)//>&cb=<body%20onload=alert(1)//> (Internet Explorer)
http://sitelife.example.host/ver1.0/sys/jsonp.app/.htm?cb=<img%20src=x%20onerror=alert(1)>&widget_path=pluck%2fuser%2fpersona%wffirstperson%2fprofile.app
In addition to the "cv", "jsonRequest", and "r" parameters, the "ctk" parameter is also vulnerable in some instances.
Here is a proof of concept affecting the pluck.com domain: http://sitelife.pluck.com/ver1.0/direct/process?referrerURL=x&jsonRequest=<body%20onload=alert(1)//>
Here are SOME of the sites that appear to be using the vulnerable SiteLife software. ...
I go on to list over 40 popular websites running Pluck SiteLife software that have the vulnerability, which I won't list here.
Tomorrow, I will post an in-depth look at XSS in Ajax Web Applications and tell you why some of these vulnerabilities were Internet Explorer specific.
Bug Bounties Part 1

I think it's great that companies have started programs to reward ethical hackers that responsibly disclose vulnerabilities before they become a problem to their customers.
Each of the vulnerabilities that I will be posting has already been fixed by the group responsible, or the site has been retired. I'm publishing this information because it was an interesting exercise to find these vulnerabilities. I think that by examining previously found vulnerabilities, we can determine patterns that can help reveal vulnerabilities in the future.
Vark.com
One of the first bug bounties I received was for the website "vark.com". Vark.com was a site where users could ask questions, and Vark would find users with knowledge of that topic to answer them. It's a pretty cool concept, one that's been done a few times before.
Vark had XSS on the main search query field:
http://vark.com/users?q=xss%C0%3Cimg+src=x+onerror=alert(1)//&commit=Search
The cool thing about this attack vector is the byte sequence "%C0%3C", or 0xC03C. If you are unfamiliar, this is an attack on improper decoding of UTF-8 characters. UTF-8 is a variable-width encoding that allows a text character to be represented by multiple bytes (e.g. Latin Capital Letter A with Grave, À, is represented as 0xC380). A funny thing is that there are no characters between 0xC000 and 0xC1FF, even though that range looks like it falls within UTF-8's two-byte space: the lead bytes 0xC0 and 0xC1 could only begin overlong encodings, so they are never valid. A parser that isn't working correctly could interpret 0xC03C as one character, allowing it to pass a filter against 0x3C ("<"). When a secondary parser or web browser sees the invalid UTF-8 encoding, it will typically drop or replace the invalid byte (0xC0), and you are left with 0x3C and the ability to insert malicious code.
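The mismatch between the two parsers can be reproduced in a few lines of Python. This is an illustrative sketch, not Vark's actual code: `naive_filter_blocks` is a made-up filter with the bug described above (it trusts a two-byte lead byte without checking the byte that follows), and Python's lenient decoder stands in for the second parser.

```python
def naive_filter_blocks(raw: bytes) -> bool:
    """Hypothetical buggy filter: returns True if it notices a '<'."""
    i = 0
    while i < len(raw):
        byte = raw[i]
        if 0xC0 <= byte <= 0xDF:
            # Bug: assumes a two-byte character without checking that the
            # next byte is a valid continuation byte (0x80 to 0xBF).
            i += 2
            continue
        if byte == 0x3C:
            return True
        i += 1
    return False

payload = b"xss\xc0<img src=x onerror=alert(1)//"

blocked = naive_filter_blocks(payload)
# A second, lenient decoder discards the invalid 0xC0 and keeps the '<'.
rendered = payload.decode("utf-8", errors="ignore")
print(blocked, rendered)
```

The filter swallows 0xC0 and 0x3C together as one "character", while the lenient decoder hands the browser an intact `<img` tag.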
The same parsing issue allowed for XSS on new questions that would be viewed by other users.
Google paid out $500 for this vulnerability. Vark.com has since been discontinued.
Jaiku.com
Jaiku was a Twitter clone that Google purchased in 2007 and re-launched in 2009 after bringing it into the Google development ecosystem.
I found XSS on the page that you get after registering for the system. Incidentally, the page could also be visited by users that were not logged in, or users that had been logged in for a long time.
http://www.jaiku.com/welcome/done?redirect_to=javascript:alert%281%29
The redirect_to URL was reflected in a link.
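One common defence for this kind of parameter is to allow only http and https targets before reflecting the value into a link. The sketch below is an assumption about how such a check could look, not what Jaiku deployed, and `is_safe_redirect` is a hypothetical name.

```python
from urllib.parse import urlsplit

def is_safe_redirect(target: str) -> bool:
    """Accept only absolute http(s) URLs as redirect targets."""
    parts = urlsplit(target)
    # urlsplit lowercases the scheme, so "JavaScript:" is rejected as well.
    return parts.scheme in ("http", "https") and bool(parts.netloc)

print(is_safe_redirect("javascript:alert%281%29"))
print(is_safe_redirect("http://www.jaiku.com/welcome"))
```

A real application would usually also restrict the host to its own domains, which this sketch does not attempt.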
Jaiku was losing popularity, and with the Google Buzz and Google+ initiatives already out there, it was likely to be shut down soon. However, the Google Security Team told me that there were still quite a few diehard Jaiku users around, and so they paid out $500 for the vulnerability.
Google retired Jaiku.com on January 15, 2012.
spreadsheets.google.com
Google Docs has functionality to create surveys for people to fill out and provide information. The extended description for a survey allowed a user to enter some HTML elements. Additionally, there is a feature to edit the confirmation page that is presented once the survey is completed, and an attacker could include some HTML here as well. Google has a really comprehensive filter against XSS here, where they actually allow users to enter certain HTML elements (like images, styling, etc). At the time, Google Docs didn't seem to be using the latest version of this filter.
The XSS is exploited using an <IMG> tag with a "src" attribute set to execute JavaScript, a vulnerability that affects Internet Explorer 6:
<img src="javascript:alert('img-src')">xxx</img>
//xss
You can still go to the POC that I sent to Google Security, and if you look at the source code of the page you'll see what their filter is now doing to prevent XSS:
<img src="//images-docs-opensocial.googleusercontent.com/gadgets/proxy?url=javascript:alert(document.domain)&container=docs&gadget=docs&rewriteMime=image/*" /> //xss
Google sent me a $500 check for reporting this vulnerability.
“I’m shocked a URL can look like this”
Here's something that I had never seen before:
A Top-Level Domain being used as a hostname for a website.
It's actually a mirror of http://nic.ac/, but web browsers are able to access it at http://ac/ or http://ac./ (with a trailing period). The extra period is sometimes required to force a DNS lookup, but isn't required on subsequent requests.
These URLs all go to the same place:
- http://ac/
- http://ac./
- http://ac.:80/
- http://nic.ac/
- http://nic.ac./
- http://nic.ac.:80/
- http://www.nic.ac/
- http://www.nic.ac./
- http://www.nic.ac.:80/
- http://193.223.78.210
Let's look at what makes up a domain. A domain name consists of parts separated by periods. For a domain like www.example.com, com is the top-level domain and example is a sub-domain of that. The leftmost part, www, is a sub-domain of example.com. Oh, and a hostname is a domain that points to an IP address, like 193.223.78.210.
There is a list of generic top-level domains such as GOV, EDU, COM, MIL, ORG, and NET. Many more top-level domains have been given out by ICANN (icann.org) for use by specific countries. That list can be found in the Root Zone Database (iana.org).
It's rare to see web sites hosted on a Top-Level Domain. In fact, there are currently 312 TLDs and only 17 of them resolve to an IP address.
AC - 193.223.78.210
AI - 209.59.119.34
CM - 195.24.205.60
DK - 193.163.102.24
GG - 87.117.196.80
IO - 193.223.78.212
JE - 87.117.196.80
PH - 203.119.4.7
PN - 80.68.93.100
SH - 193.223.78.211
TK - 217.119.57.22
TM - 193.223.78.213
TO - 216.74.32.107
UZ - 91.212.89.8
VI - 193.0.0.198
WS - 64.70.19.33
XN--O3CW4H - 203.146.249.130
Only 9 of those 17 domains with IP addresses are hosting a web server on port 80.
AC AI DK IO PN SH TM UZ WS
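A survey like this can be repeated with a short script. The sketch below is not the method used for the numbers above; it assumes a plain IPv4 lookup is enough, `resolving_tlds` is a hypothetical helper, and the stub resolver exists only so the example runs without network access.

```python
import socket

def resolving_tlds(tlds, resolve=socket.gethostbyname):
    """Map each TLD that resolves on its own to the address returned."""
    found = {}
    for tld in tlds:
        try:
            # The trailing dot makes the name fully qualified, so the
            # resolver does not append a local search suffix.
            found[tld] = resolve(tld.lower() + ".")
        except OSError:
            continue  # lookup failed, so this TLD has no address of its own
    return found

def stub_resolver(name):
    # Offline stand-in for DNS, seeded with one address from the list above.
    table = {"ac.": "193.223.78.210"}
    if name not in table:
        raise OSError("name does not resolve")
    return table[name]

offline = resolving_tlds(["AC", "COM"], resolve=stub_resolver)
print(offline)
```

Calling `resolving_tlds` without the `resolve` argument performs real DNS lookups, and the results will differ from the list above as TLD operators change their records.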
How can a URL look like this? I naturally asked myself if this could raise any security issues.
If there are XSS vulnerabilities (Cross-Site Scripting) on a Top-Level Domain, could it affect all of its subdomains?
Could you use the XSS to grab records and spoof content on all xx.yy.ac subdomains?
Could you create a cookie on the ".ac" domain that is re-sent for all sub-domains for the ultimate ad-network cookie or session-fixation attack?
Fortunately, domain policies work from left to right. For example, xx.yy.ac can set a cookie for .yy.ac, but not the other way around. Additionally, browser vendors collaborate (publicsuffix.org) on a list of domain name suffixes (mxr.mozilla.org), so they can set rules that restrict the way TLDs are used.
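The cookie rule can be sketched as a small check. This is a simplified model rather than any browser's implementation: `can_set_cookie` is a hypothetical name, and the one-entry `suffixes` set is a stand-in for the real public suffix list.

```python
def can_set_cookie(host: str, cookie_domain: str, public_suffixes: set) -> bool:
    """Simplified check: may a page on host set a cookie for cookie_domain?"""
    domain = cookie_domain.lstrip(".").lower()
    host = host.lower()
    if domain in public_suffixes:
        return False  # never allow a cookie scoped to a bare suffix like "ac"
    # The host must be the cookie domain itself or a sub-domain of it.
    return host == domain or host.endswith("." + domain)

suffixes = {"ac"}  # stand-in for the public suffix list
print(can_set_cookie("xx.yy.ac", ".yy.ac", suffixes))
print(can_set_cookie("yy.ac", ".xx.yy.ac", suffixes))
print(can_set_cookie("xx.yy.ac", ".ac", suffixes))
```

The third call is the interesting one for this post: without the suffix list, the host comparison alone would happily accept a cookie scoped to .ac.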
I have a feeling that a web server running on a TLD could mess with a browser/plugin/proxy filter somewhere and cause some security issues, but I couldn't find anything concrete. I'm hoping that this post will inspire security researchers to look into the risks that this brings, because the scope of the issue is about to explode!
ICANN has a new initiative (icann.org) that intends to add between 300 and 1,000 new TLDs. Organizations can apply to control their own generic TLD, like .coke or .pepsi.
Security researchers should take a look at this now before things get crazy.
Three Semicolon Vulnerabilities

I have three new web bugs to demonstrate. Each of them takes advantage of how a semicolon character is interpreted by a web server or browser. Each of these bugs can be demonstrated on the latest release of Apache Tomcat, 7.0.22, and the latest browsers. Exploiting these bugs requires particular issues to be present on a vulnerable website.