Skip to content
This repository has been archived by the owner on Jan 13, 2024. It is now read-only.

Commit

Permalink
Fix bounds for recursive calls to match from lookbehinds
Browse files Browse the repository at this point in the history
Summary:
When a lookbehind recursively calls `match`, the cursor direction is
reversed. In this case, `remaining` returns the distance from the
start of the string to the current position instead of the distance
to the end of the string.

When the cursor is at the end of the string, this leads to
advanceStringIndex incorrectly being given a non-zero string length,
and it then attempts to do an OOB read from the string.

This problem could be solved equally well by conditioning the call to
`advanceStringIndex` on whether `onlyAtStart` is true, but it is
probably cleaner to just ensure that the length being passed is
correct. (even if it is a little counterintuitive that
advanceStringIndex is called at all while the cursor direction is
reversed)

The spec does not offer much guidance here since we effectively
combine `@match` and `RegExpExec`.

Reviewed By: avp

Differential Revision: D23445121

fbshipit-source-id: 2341d20853083ff25376ac920cea27f220d91b21
  • Loading branch information
neildhar authored and facebook-github-bot committed Sep 2, 2020
1 parent 340790d commit e4afea8
Show file tree
Hide file tree
Showing 2 changed files with 18 additions and 4 deletions.
18 changes: 14 additions & 4 deletions lib/Regex/Executor.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -114,6 +114,14 @@ class Cursor {
return current_ - first_;
}

/// \return the number of code units between the current position and the end
/// of the string.
/// This is called "offsetFromRight" and not "offsetFromEnd" to indicate that
/// it does not change under backwards tracking.
uint32_t offsetFromRight() const {
return last_ - current_;
}

/// \return whether we are at the leftmost position.
/// This does not change under backwards tracking.
bool atLeft() const {
Expand Down Expand Up @@ -924,12 +932,14 @@ auto Context<Traits>::match(State<Traits> *s, bool onlyAtStart)

const CodeUnit *const startLoc = c.currentPointer();

const size_t charsToEnd = c.remaining();
// Use offsetFromRight() instead of remaining() here so that the length passed
// to advanceStringIndex is accurate even when the cursor is going backwards.
const size_t charsToRight = c.offsetFromRight();

// Decide how many locations we'll need to check.
// Note that we do want to check the empty range at the end, so add one to
// charsToEnd.
const size_t locsToCheckCount = onlyAtStart ? 1 : 1 + charsToEnd;
// charsToRight.
const size_t locsToCheckCount = onlyAtStart ? 1 : 1 + charsToRight;

// If we are tracking backwards, we should only ever have one potential match
// location. This is because advanceStringIndex only ever tracks forwards.
Expand All @@ -946,7 +956,7 @@ auto Context<Traits>::match(State<Traits> *s, bool onlyAtStart)
} while (0)

for (size_t locIndex = 0; locIndex < locsToCheckCount;
locIndex = advanceStringIndex(startLoc, locIndex, charsToEnd)) {
locIndex = advanceStringIndex(startLoc, locIndex, charsToRight)) {
const CodeUnit *potentialMatchLocation = startLoc + locIndex;
c.setCurrentPointer(potentialMatchLocation);
s->ip_ = startIp;
Expand Down
4 changes: 4 additions & 0 deletions test/hermes/regexp.js
Original file line number Diff line number Diff line change
Expand Up @@ -543,3 +543,7 @@ print(/-/[Symbol.split]('a-b-c'));
// Check UTF-16 string matching executes correctly
print(/abc/u.exec("\u20ac\u20ac\u20ac\u20ac"));
// CHECK-LABEL: null

// Check that lookbehind searches stay within bounds
print(/(?<=a)/u[Symbol.match](["\u00E9",34534502349000]))
// CHECK-LABEL: null

0 comments on commit e4afea8

Please sign in to comment.