Core: stricter UTF-8 handling in ngx_utf8_decode().
A UTF-8 octet sequence cannot start with a 11111xxx byte (0xf8 and above),
see https://datatracker.ietf.org/doc/html/rfc3629#section-3.  Previously,
such bytes were accepted by ngx_utf8_decode() and misinterpreted as 11110xxx
bytes (as in a 4-byte sequence).  While unlikely, this could cause issues.

The fix is to explicitly reject such bytes in ngx_utf8_decode().
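
The misinterpretation described above follows from the 0x07 mask applied in the 4-byte branch: it discards bit 3, so 0xf8..0xff leave the same payload bits as 0xf0..0xf7. A minimal sketch of that arithmetic, where lead4_payload() and is_lead4() are hypothetical helper names used only for illustration and are not part of nginx:

```c
#include <stdint.h>

/*
 * Hypothetical helper (not in nginx): payload bits kept from a lead byte
 * when it is treated as the start of a 4-byte sequence (mask 0x07).
 */
static uint32_t
lead4_payload(uint8_t lead)
{
    return lead & 0x07;
}

/*
 * Hypothetical helper (not in nginx): 1 if the byte matches the 11110xxx
 * pattern that RFC 3629 assigns to 4-byte lead bytes, 0 otherwise.
 */
static int
is_lead4(uint8_t lead)
{
    return (lead & 0xf8) == 0xf0;
}
```

With only a ">= 0xf0" test, 0xf8 and 0xf0 both reach the mask and both yield a payload of 0, so they are indistinguishable afterwards; checking the full 5-bit prefix, or rejecting ">= 0xf8" first as this commit does, keeps them apart.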
u5surf committed Feb 22, 2023
1 parent 4ace957 commit 2c5fccd
Showing 1 changed file with 6 additions and 1 deletion.
7 changes: 6 additions & 1 deletion src/core/ngx_string.c
@@ -1364,7 +1364,12 @@ ngx_utf8_decode(u_char **p, size_t n)

     u = **p;

-    if (u >= 0xf0) {
+    if (u >= 0xf8) {
+
+        (*p)++;
+        return 0xffffffff;
+
+    } else if (u >= 0xf0) {

         u &= 0x07;
         valid = 0xffff;
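
To see the effect of the new branch in isolation, here is a small standalone decoder written for illustration. It is a sketch under stated assumptions, not the nginx function: the name utf8_decode_sketch(), the UTF8_INVALID macro, and the handling of truncated input are all hypothetical, and it checks only lead-byte patterns, continuation bytes, and overlong forms (it does not reject surrogates or values above U+10FFFF).

```c
#include <stddef.h>
#include <stdint.h>

#define UTF8_INVALID  0xffffffffu   /* same sentinel value as in the diff */

/*
 * Illustrative sketch only, not ngx_utf8_decode(): decode one code point
 * from at most n bytes at *p and advance *p past the bytes consumed.
 * Assumes n >= 1.
 */
static uint32_t
utf8_decode_sketch(const uint8_t **p, size_t n)
{
    size_t    len;
    uint32_t  u, c, min;

    u = **p;
    (*p)++;                         /* the first byte is always consumed */

    if (u < 0x80) {                 /* 0xxxxxxx: single byte */
        return u;
    }

    if (u >= 0xf8) {                /* 11111xxx: never a valid lead byte */
        return UTF8_INVALID;

    } else if (u >= 0xf0) {         /* 11110xxx: 4-byte sequence */
        u &= 0x07;
        len = 3;
        min = 0x10000;

    } else if (u >= 0xe0) {         /* 1110xxxx: 3-byte sequence */
        u &= 0x0f;
        len = 2;
        min = 0x800;

    } else if (u >= 0xc0) {         /* 110xxxxx: 2-byte sequence */
        u &= 0x1f;
        len = 1;
        min = 0x80;

    } else {                        /* 10xxxxxx: stray continuation byte */
        return UTF8_INVALID;
    }

    if (n - 1 < len) {              /* input ends before the sequence does */
        return UTF8_INVALID;
    }

    while (len--) {
        c = **p;

        if ((c & 0xc0) != 0x80) {   /* not a continuation byte; leave it */
            return UTF8_INVALID;
        }

        (*p)++;
        u = (u << 6) | (c & 0x3f);
    }

    return (u < min) ? UTF8_INVALID : u;   /* reject overlong encodings */
}
```

Called on the bytes f8 80 80 80, this sketch consumes one byte and returns the sentinel, mirroring the added branch; without the ">= 0xf8" test the same input would fall into the 4-byte branch and be decoded as if the lead byte were f0.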