Log4j, an open source logging framework widely used by major software vendors including Oracle, MySQL, RedHat and NetApp, has been repeatedly affected by a deserialization vulnerability (CVE-2017-5645). This post explains how the vulnerability can be exploited, why it is dangerous and, most importantly, how developers and security professionals can determine whether they are vulnerable.
Logging equips the developer with detailed context for application observability, performance and failures. With log4j it is possible to enable logging at runtime without modifying the application binary.
Log4j offers multiple options for formatting the log files created by the framework. It can create simple log files (W3C, HTML, XML or JSON based).
A little-known capability of log4j is that it ships with a fully functional socket server, which can listen for network connections and record log events sent to it from various peer nodes and locations.
Sometimes, we need the logs on different machines. Let’s consider that the application is running on a remote system and we need the logs dispatched to and aggregated on our centralized machine.
To configure the socket server, create a log4j-server.properties file in the root folder of the application in question. This file configures how and where the received log events will be logged.
Use a SocketAppender to direct logs to the designated log machine. See the following log4j-reciever.properties file on the receiver side.
log4j.appender.file.layout.ConversionPattern=[%d] [%t] [%m]%n
Searching GitHub with the query https://github.com/search?p=4&q=log4j.appender.server.RemoteHost&type=Code turns up several exposed host IPs that repo owners have inadvertently hardcoded. Several key projects also happen to use a SocketServer-based setup.
Running the following command starts the logging server on the receiver side:
java -classpath log4j-path.jar org.apache.log4j.net.SimpleSocketServer 4712 log4j-reciever.properties
After this setup, run your program/application, and the logs will arrive at your designated machine.
In April 2017, a deserialization vulnerability (CVE-2017-5645) was disclosed in Apache Log4j. An attacker can trigger it by sending a specially crafted binary payload that the receiver deserializes from bytes into objects.
Vulnerability trigger point
This vulnerability arises because the receiver does not sanitize, filter or validate an input stream from untrusted sources before processing it with ObjectInputStream.
The vulnerability is remotely exploitable without authentication: it may be exploited over a network, transitively, through another application that receives events from a customer-facing application. It can be addressed by adding configurable filtering capabilities and related settings to TcpSocketServer and UdpSocketServer.
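The filtering idea can be sketched with the JDK alone. The example below is a minimal illustration, not Log4j's actual fix: it overrides `ObjectInputStream.resolveClass` so that only allowlisted class names are ever resolved. The names `AllowlistDemo`, `AllowlistObjectInputStream`, `LogRecord` and `Gadget` are hypothetical stand-ins introduced for this sketch.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InvalidClassException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Set;

// Hypothetical sketch; none of these names come from Log4j.
class AllowlistDemo {

    /** Look-ahead stream: refuses a class before any instance of it is created. */
    static class AllowlistObjectInputStream extends ObjectInputStream {
        private final Set<String> allowedClassNames;

        AllowlistObjectInputStream(InputStream in, Set<String> allowedClassNames)
                throws IOException {
            super(in);
            this.allowedClassNames = allowedClassNames;
        }

        @Override
        protected Class<?> resolveClass(ObjectStreamClass desc)
                throws IOException, ClassNotFoundException {
            if (!allowedClassNames.contains(desc.getName())) {
                throw new InvalidClassException(desc.getName(), "class is not on the allowlist");
            }
            return super.resolveClass(desc);
        }
    }

    /** Stand-in for a legitimate log event type. */
    static class LogRecord implements Serializable {
        private static final long serialVersionUID = 1L;
        final String message;
        LogRecord(String message) { this.message = message; }
    }

    /** Stand-in for an unexpected, attacker-chosen type. */
    static class Gadget implements Serializable {
        private static final long serialVersionUID = 1L;
    }

    static byte[] serialize(Object value) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns true if the payload deserialized, false if the filter rejected it. */
    static boolean accepts(byte[] payload) {
        Set<String> allowed = Set.of(LogRecord.class.getName());
        try (ObjectInputStream in =
                 new AllowlistObjectInputStream(new ByteArrayInputStream(payload), allowed)) {
            in.readObject();
            return true;
        } catch (InvalidClassException e) {
            return false;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("LogRecord accepted: " + accepts(serialize(new LogRecord("hello"))));
        System.out.println("Gadget accepted: " + accepts(serialize(new Gadget())));
    }
}
```

The check runs while the class descriptor is being read, which is before the object is instantiated, so a rejected class never gets the chance to run its own deserialization logic.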
The Log4j project has released a new version (2.8.2+) that fixes the vulnerability.
Sphere of influence
All Apache Log4j 2.* series versions: from 2.0-alpha1 to 2.8.1
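As a rough sketch of the range check above, the following Java helper flags a Log4j 2 version string as affected when it is older than 2.8.2. The class and method names are hypothetical, and the parsing assumes plain `major.minor[.patch][-qualifier]` version strings.

```java
// Hypothetical helper; the names are illustrative and not part of Log4j.
class Log4jVersionCheck {

    /** Returns true if the version falls in the range 2.0-alpha1 to 2.8.1. */
    static boolean isAffected(String version) {
        // Drop any qualifier such as "-alpha1", "-beta9" or "-rc2".
        String numeric = version.split("-", 2)[0];
        String[] parts = numeric.split("\\.");
        int major = Integer.parseInt(parts[0]);
        int minor = parts.length > 1 ? Integer.parseInt(parts[1]) : 0;
        int patch = parts.length > 2 ? Integer.parseInt(parts[2]) : 0;

        if (major != 2) {
            return false; // the range listed in this post covers the 2.x series only
        }
        // Every 2.x release before 2.8.2 is inside the listed range.
        return minor < 8 || (minor == 8 && patch < 2);
    }

    public static void main(String[] args) {
        for (String v : new String[] {"2.0-alpha1", "2.8.1", "2.8.2", "2.11.0"}) {
            System.out.println(v + " affected: " + isAffected(v));
        }
    }
}
```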
Attackers think in Graphs
An attacker’s mindset can be likened to the OODA (Observe, Orient, Decide, Act) model, which traces a cyclical graph (from exposed API to valuable asset), not a linear checklist.
Let’s deconstruct this cyclical graph as a mental exercise.
The starting point is a TCP/UDP based SerializedSocketServer that is initialized in the receiving application. Every network-bound event received triggers readObject to unmarshal/deserialize the LogEventBridge inputStream, which represents log events.
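Why an unfiltered readObject call is dangerous can be shown with a small, JDK-only sketch. It is not Log4j code: `ExpectedEvent` and `Gadget` are hypothetical stand-ins, and the "harmful" action is just a flag being set. The point is that a class's own readObject hook runs during deserialization, before the receiver has any chance to check the type of what it received.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical sketch; none of these names come from Log4j.
class DeserializationSideEffectDemo {
    static boolean sideEffectRan = false;

    /** Stand-in for the event type the receiver expects. */
    static class ExpectedEvent implements Serializable {
        private static final long serialVersionUID = 1L;
    }

    /** Stand-in for an attacker-chosen class that exists on the receiver's classpath. */
    static class Gadget implements Serializable {
        private static final long serialVersionUID = 1L;

        // Invoked by the serialization runtime while the object is being read.
        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            sideEffectRan = true; // a real gadget would do something harmful here
        }
    }

    static byte[] serialize(Object value) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Mimics a receiver that reads an object first and checks its type afterwards. */
    static boolean receive(byte[] payload) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(payload))) {
            Object event = in.readObject();        // Gadget.readObject has already run here
            return event instanceof ExpectedEvent; // the type check comes too late
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        boolean accepted = receive(serialize(new Gadget()));
        System.out.println("accepted as event: " + accepted);
        System.out.println("side effect ran: " + sideEffectRan);
    }
}
```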
Let’s examine the subclasses of the LogEventBridge type (as logEvent is defined by LogEventBridge).
The core logic extracts the characters between specific tags in the array held by logEvent, and then invokes a deserialize function provided by the dependent Jackson framework.
Going beyond SCA and legacy SAST analysis
SCA (Software Composition Analysis) tools merely examine the application’s bill of materials and then look up a vulnerability database (using dependency:version coordinates) to identify whether any of the dependencies have active CVEs associated with them. Such systems are blind to whether the OSS dependency is used in an unsafe way within the application.
Legacy Static Analysis (SAST) tools are inherently prone to false positives and noise, because they perform generic taint-style vulnerability analysis in a restricted manner.
Based on a prescribed, generic ruleset, the SAST engine identifies sources and sinks and then conducts reachability analysis from source to sink without taking the influencing data flow, custom control validation logic and configuration into consideration.
Also, simple taint-style analysis is restricted to the application boundary, with no intrinsic understanding of how data flows out of an application and into its transitive OSS dependencies and OSS frameworks to trigger a seeded vulnerable condition.
Why should this matter?
Let’s examine these criteria by assessing the conditions leading to CVE-2017-5645 (refer to Figure 1 above for visual detail):
- Attacker injects malicious payload via a web form field or API based attribute via a public facing application
- The application uses a vulnerable version of log4j (series versions: from 2.0-alpha1 to 2.8.1)
- The application is configured to use SocketAppender to direct/forward logs to the log4jserver application
- The application logs every event (info/debug) for observability purposes without sanitizing/validating data appended to the log stream
- The log4jserver application (receiving logs) uses a vulnerable version of log4j (series versions: from 2.0-alpha1 to 2.8.1)
- The log4jserver application creates a TCP or UDP SocketServer with an appropriate ObjectInputStream configured (XML or JSON input stream)
- Log4j (series versions: from 2.0-alpha1 to 2.8.1) transitively pulls in the Jackson deserialization framework.
- A data flow that accepts unvalidated input is deserialized via Jackson’s readObject API.
For the exploit to succeed, the attacker-controlled flow must weave through the exposed application (configured with SocketAppender and depending on a vulnerable version of log4j), be transmitted over the network to the receiving log4j server (which has an initialized SocketServer and also depends on a vulnerable version of log4j), and be unmarshalled/deserialized there as the exploit payload.
A legacy static analysis (SAST) toolset cannot keep up and maintain context across all these boundaries.
As defenders, we need to think in graphs too. We at ShiftLeft put forward the idea of combining three graphs of an application (syntax, control and data) into a single multigraph (a graph that allows multiple edges between a pair of nodes), which we call the code property graph. We then go a step further: we assess the application’s OSS consumption and the OSS frameworks it uses, and connect the associated graphs of all OSS libraries and frameworks to the primary graph.
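To make the multigraph idea concrete, here is a toy sketch in Java. It is not ShiftLeft’s implementation and has nothing to do with Ocular’s query language: the class `TinyPropertyGraph`, the edge kinds and the node names are all hypothetical. It only shows how one pair of nodes can carry several labeled edges, and how reachability can be asked along a single kind of edge, across the boundary between the application and its dependencies.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical toy model of a labeled multigraph; not a real code property graph.
class TinyPropertyGraph {
    enum EdgeKind { AST, CFG, DATA_FLOW }

    private static final class Edge {
        final String to;
        final EdgeKind kind;
        Edge(String to, EdgeKind kind) { this.to = to; this.kind = kind; }
    }

    private final Map<String, List<Edge>> outgoing = new HashMap<>();

    /** Adds a labeled edge; parallel edges between the same pair of nodes are allowed. */
    void addEdge(String from, String to, EdgeKind kind) {
        outgoing.computeIfAbsent(from, k -> new ArrayList<>()).add(new Edge(to, kind));
    }

    /** Breadth-first search that follows only edges of the given kind. */
    boolean reachable(String source, String sink, EdgeKind kind) {
        Deque<String> queue = new ArrayDeque<>();
        Set<String> seen = new HashSet<>();
        queue.add(source);
        seen.add(source);
        while (!queue.isEmpty()) {
            String node = queue.remove();
            if (node.equals(sink)) {
                return true;
            }
            for (Edge edge : outgoing.getOrDefault(node, List.of())) {
                if (edge.kind == kind && seen.add(edge.to)) {
                    queue.add(edge.to);
                }
            }
        }
        return false;
    }

    /** Builds a made-up graph loosely shaped like the flow described in this post. */
    static TinyPropertyGraph example() {
        TinyPropertyGraph g = new TinyPropertyGraph();
        g.addEdge("main", "httpParam", EdgeKind.AST);
        g.addEdge("httpParam", "logger.info", EdgeKind.DATA_FLOW);
        g.addEdge("httpParam", "logger.info", EdgeKind.CFG); // parallel edge, different kind
        g.addEdge("logger.info", "SocketAppender", EdgeKind.DATA_FLOW); // into the OSS dependency
        g.addEdge("SocketAppender", "readObject", EdgeKind.DATA_FLOW);  // on to the receiver
        return g;
    }

    public static void main(String[] args) {
        TinyPropertyGraph g = example();
        System.out.println("data flow from httpParam to readObject: "
                + g.reachable("httpParam", "readObject", EdgeKind.DATA_FLOW));
    }
}
```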
Upon this densely connected graph of graphs (created in minutes), we can craft custom queries using Ocular to identify all of the criteria listed above.
Please refer to the accompanying project hosted on GitHub, titled CVE-2017-5645, which is a POC of log4jServer.
Release date: 17/04/17 Discovered by: Marcio Almeida from TELSTRA Red Team Severity: Critical CVSS Base Score: 7.5…
Is your project using log4j, and is it vulnerable to these conditions?
The first step in this investigation is to fire up the Ocular distribution and create CPGs for the client and server applications.
Let’s now proceed to understand the type hierarchy associated with this condition.
AbstractLogEventBridge which implements
The next query focuses on conducting reachability analysis from untrusted input source
to sensitive exploitive sink
log4j version 2.8.2 as a whitelist check has been incorporated in control flow path
- Download a trial version of Ocular here
- Clone/Star or Watch this project https://github.com/conikeec/CVE-2017-5645 and prepare environment to test the POC (based on README).
- The query above has been exported as a script that can be incorporated into any CI (Jenkins/Travis/Circle/GitLab) pipeline for continuous validation of risk across all projects in your organization.
Until then, happy hunting and hacking!