QRegularExpression: coalesce consecutive * tokens in wildcards
When converting a wildcard into a regexp, convert a series of
consecutive '*' tokens into just one '.*' (instead of a series of '.*').
The set of strings matched is the same, but we reduce the risk of
catastrophic backtracking. I'm not actually sure whether PCRE
optimizes this case on its own; Perl appears not to.

Change-Id: Ia83336391593d56cf6d8332c96649a034a83a15b
Pick-to: 6.8
Fixes: QTBUG-127672
Reviewed-by: Thiago Macieira <[email protected]>
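
The coalescing described in the commit message can be sketched outside of Qt roughly as follows. This is a minimal, standalone illustration and not the Qt implementation: the function name wildcardToRegexSketch is hypothetical, only '*' and '?' are treated as wildcard tokens, and '*' is hard-coded to ".*" as in the commit message (in Qt the replacement comes from settings.starEscape and depends on the conversion options).

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper, not part of Qt: translates a simplified wildcard
// (only '*' and '?' are special) into a regex fragment, emitting a single
// ".*" for any run of consecutive '*' characters.
std::string wildcardToRegexSketch(std::string_view wc)
{
    std::string rx;
    std::size_t i = 0;
    while (i < wc.size()) {
        const char c = wc[i++];
        switch (c) {
        case '*':
            rx += ".*"; // assumption: stands in for settings.starEscape
            // Coalesce: skip the rest of the run of '*'
            while (i < wc.size() && wc[i] == '*')
                ++i;
            break;
        case '?':
            rx += '.';
            break;
        default:
            // Escape characters that are regex metacharacters
            if (std::string_view("\\^$.|+()[]{}").find(c) != std::string_view::npos)
                rx += '\\';
            rx += c;
            break;
        }
    }
    return rx;
}
```

With this sketch, a wildcard such as "foo**********bar" yields the same regex fragment as "foo*bar", so the regex engine only has one unbounded quantifier to backtrack over between the two literals.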
dangelog committed Aug 7, 2024
1 parent d17d260 commit a041cd3
Showing 2 changed files with 27 additions and 0 deletions.
3 changes: 3 additions & 0 deletions src/corelib/text/qregularexpression.cpp
@@ -1964,6 +1964,9 @@ QString QRegularExpression::wildcardToRegularExpression(QStringView pattern, Wil
         switch (c.unicode()) {
         case '*':
             rx += settings.starEscape;
+            // Coalesce sequences of *
+            while (i < wclen && wc[i] == u'*')
+                ++i;
             break;
         case '?':
             rx += settings.questionMarkEscape;
@@ -2486,6 +2486,30 @@ void tst_QRegularExpression::wildcard_data()
     addRow("foo*bar", "foo\nbar", true, true);
     addRow("foo*bar", "foo\r\nbar", true, true);
 
+    addRow("foo**********bar", "foo/fie/baz/bar", false, true);
+    addRow("foo**********bar", "foo bar bar test bar bar bar", true, true);
+    addRow("foo**********bar", "foo\tbar", true, true);
+    addRow("foo**********bar", "foo\nbar", true, true);
+    addRow("foo**********bar", "foo\r\nbar", true, true);
+
+    addRow("foo**********bar", "foo/fie/baz/baz", false, false);
+    addRow("foo**********bar", "foo bar bar test bar bar baz", false, false);
+    addRow("foo**********bar", "foo\tbaz", false, false);
+    addRow("foo**********bar", "foo\nbaz", false, false);
+    addRow("foo**********bar", "foo\r\nbaz", false, false);
+
+    addRow("foo*****x*****bar", "foo/fie/bax/bar", false, true);
+    addRow("foo*****x*****bar", "foo bar bax test bar bar bar", true, true);
+    addRow("foo*****x*****bar", "foo\tbar foo\tbax foo\tbar foo\tbar", true, true);
+    addRow("foo*****x*****bar", "foo\nx\nbar", true, true);
+    addRow("foo*****x*****bar", "foo\r\nxbar", true, true);
+
+    addRow("foo*****x*****bar", "foo/fie/baz/bar", false, false);
+    addRow("foo*****x*****bar", "foo bar baz test bar bar bar", false, false);
+    addRow("foo*****x*****bar", "foo\tbar foo\tbar foo\tbar foo\tbar", false, false);
+    addRow("foo*****x*****bar", "foo\nbar", false, false);
+    addRow("foo*****x*****bar", "foo\r\nbar", false, false);
+
     // different anchor modes
     addRow("foo", "afoob", false, false, true);
     addRow("foo", "afoob", true, true, false);
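
The new rows rely on a run of '*' matching exactly what a single '*' matches. As a rough, standalone way to spot-check that equivalence at the regex level, the sketch below compares two patterns over a list of subjects. It assumes std::regex (ECMAScript grammar) as a stand-in for the PCRE2 engine behind QRegularExpression, and the helper name sameMatches is hypothetical. The subjects are kept short and free of newlines, since '.' does not match line terminators in that grammar and long inputs could themselves trigger heavy backtracking.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns true if both patterns agree (full match or
// no match) on every subject. std::regex stands in for PCRE2 here.
bool sameMatches(const std::string &patternA,
                 const std::string &patternB,
                 const std::vector<std::string> &subjects)
{
    const std::regex ra(patternA);
    const std::regex rb(patternB);
    for (const std::string &s : subjects) {
        if (std::regex_match(s, ra) != std::regex_match(s, rb))
            return false;
    }
    return true;
}
```

For example, sameMatches("foo.*.*.*bar", "foo.*bar", {"foobar", "foo bar", "foo bar baz"}) compares the uncoalesced and coalesced forms on a few small inputs.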