PHP 8.1 | Tokenizer/PHP: hotfix for overeager explicit octal notation backfill

Follow up on 3481 and 3552.

While working on PHPCompatibility/PHPCSUtils, I found another instance where the explicit octal notation backfill is overeager.

PHP natively tokenizes invalid octals, like `0o91`, as `T_LNUMBER` + `T_STRING` in all PHP versions. With the backfill in place, this was no longer the case: on PHP < 8.1, such a sequence would be tokenized as a single `T_LNUMBER`, making tokenization across PHP versions unpredictable and inconsistent.

Fixed now. Including tests.
jrfnl committed Apr 10, 2022
1 parent 2596a15 commit 2d71c52
Showing 3 changed files with 59 additions and 3 deletions.
22 changes: 19 additions & 3 deletions src/Tokenizers/PHP.php
@@ -732,16 +732,32 @@ protected function tokenize($string)
                 && $tokens[($stackPtr + 1)][0] === T_STRING
                 && strtolower($tokens[($stackPtr + 1)][1][0]) === 'o'
                 && $tokens[($stackPtr + 1)][1][1] !== '_')
+                && preg_match('`^(o[0-7]+(?:_[0-7]+)?)([0-9_]*)$`i', $tokens[($stackPtr + 1)][1], $matches) === 1
             ) {
                 $finalTokens[$newStackPtr] = [
                     'code' => T_LNUMBER,
                     'type' => 'T_LNUMBER',
-                    'content' => $token[1] .= $tokens[($stackPtr + 1)][1],
+                    'content' => $token[1] .= $matches[1],
                 ];
-                $stackPtr++;
                 $newStackPtr++;
+
+                if (isset($matches[2]) === true && $matches[2] !== '') {
+                    $type = 'T_LNUMBER';
+                    if ($matches[2][0] === '_') {
+                        $type = 'T_STRING';
+                    }
+
+                    $finalTokens[$newStackPtr] = [
+                        'code' => constant($type),
+                        'type' => $type,
+                        'content' => $matches[2],
+                    ];
+                    $newStackPtr++;
+                }
+
+                $stackPtr++;
                 continue;
-            }
+            }//end if
 
             /*
                 PHP 8.1 introduced two dedicated tokens for the & character.
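
The split performed by the new `preg_match()` condition can be tried in isolation. The following is a minimal sketch, not code from the patch: the function `splitExplicitOctal()` and its return shape are hypothetical, and only the regular expression and the `_` check are taken from the diff above.

```php
<?php
/**
 * Hypothetical helper, for illustration only; not part of PHP_CodeSniffer.
 *
 * Splits the content of the T_STRING token which follows a "0" T_LNUMBER
 * into the valid explicit octal part and whatever is left over.
 *
 * @param string $content Content of the T_STRING token, e.g. "o28_2".
 *
 * @return array|null Null when the content does not start with a valid octal part.
 */
function splitExplicitOctal(string $content): ?array
{
    // Same pattern as in the patch: "o", one or more octal digits, optionally
    // one "_"-separated group of octal digits, then any remaining digits/underscores.
    $pattern = '`^(o[0-7]+(?:_[0-7]+)?)([0-9_]*)$`i';
    if (preg_match($pattern, $content, $matches) !== 1) {
        return null;
    }

    $rest     = ($matches[2] ?? '');
    $restType = null;
    if ($rest !== '') {
        // Mirrors the patch: a leftover starting with "_" becomes a T_STRING,
        // anything else becomes a second T_LNUMBER.
        $restType = ($rest[0] === '_') ? 'T_STRING' : 'T_LNUMBER';
    }

    return [
        'octal'    => $matches[1],
        'rest'     => $rest,
        'restType' => $restType,
    ];
}

var_dump(splitExplicitOctal('o91'));   // NULL: no octal digit directly after the "o".
print_r(splitExplicitOctal('O282'));   // octal "O2", rest "82" (T_LNUMBER).
print_r(splitExplicitOctal('o28_2'));  // octal "o2", rest "8_2" (T_LNUMBER).
print_r(splitExplicitOctal('o2_82'));  // octal "o2", rest "_82" (T_STRING).
```

When the helper returns `null`, the condition in the patch fails and the `0` stays a separate `T_LNUMBER`, which matches the `'0'` value expected for `testInvalid3`.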
12 changes: 12 additions & 0 deletions tests/Core/Tokenizer/BackfillExplicitOctalNotationTest.inc
@@ -14,3 +14,15 @@ $foo = 0o_137;
 
 /* testInvalid2 */
 $foo = 0O_41;
+
+/* testInvalid3 */
+$foo = 0o91;
+
+/* testInvalid4 */
+$foo = 0O282;
+
+/* testInvalid5 */
+$foo = 0o28_2;
+
+/* testInvalid6 */
+$foo = 0o2_82;
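
To see what PHP itself does with the `testInvalid3` input, the native tokenizer can be queried directly. This is a rough sketch for illustration: `nativeTokens()` is a hypothetical helper, not something from the test suite, and it relies only on PHP's built-in `token_get_all()` and `token_name()`.

```php
<?php
/**
 * Hypothetical helper, for illustration only: list the tokens PHP itself
 * produces for a statement, skipping the open tag and whitespace.
 *
 * @param string $code PHP code without the opening tag.
 *
 * @return array List of [token name, content] pairs.
 */
function nativeTokens(string $code): array
{
    $result = [];
    foreach (token_get_all('<?php ' . $code) as $token) {
        if (is_array($token) === false) {
            // Single-character tokens such as "=" and ";" are plain strings.
            $result[] = [$token, $token];
            continue;
        }

        if ($token[0] === T_OPEN_TAG || $token[0] === T_WHITESPACE) {
            continue;
        }

        $result[] = [token_name($token[0]), $token[1]];
    }

    return $result;
}

print_r(nativeTokens('$foo = 0o91;'));
```

Per the commit message, the `0o91` case should show a `T_LNUMBER` with content `0` followed by a `T_STRING` with content `o91` on every PHP version. The other new cases start with a valid octal part, so their native tokenization can be expected to differ between PHP < 8.1 and PHP 8.1+.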
28 changes: 28 additions & 0 deletions tests/Core/Tokenizer/BackfillExplicitOctalNotationTest.php
@@ -83,6 +83,34 @@ public function dataExplicitOctalNotation()
                     'value' => '0',
                 ],
             ],
+            [
+                [
+                    'marker' => '/* testInvalid3 */',
+                    'type' => 'T_LNUMBER',
+                    'value' => '0',
+                ],
+            ],
+            [
+                [
+                    'marker' => '/* testInvalid4 */',
+                    'type' => 'T_LNUMBER',
+                    'value' => '0O2',
+                ],
+            ],
+            [
+                [
+                    'marker' => '/* testInvalid5 */',
+                    'type' => 'T_LNUMBER',
+                    'value' => '0o2',
+                ],
+            ],
+            [
+                [
+                    'marker' => '/* testInvalid6 */',
+                    'type' => 'T_LNUMBER',
+                    'value' => '0o2',
+                ],
+            ],
         ];
 
     }//end dataExplicitOctalNotation()