Bug 1884649: Add test and comments. r=jandem
Depends on D204330

Differential Revision: https://phabricator.services.mozilla.com/D204331
anba committed Mar 14, 2024
1 parent cd7b136 commit 7d09504
Showing 3 changed files with 29 additions and 1 deletion.
@@ -0,0 +1,12 @@
let right = newRope("b", "012345678901234567890123456789");
let latin1Rope = newRope("a", right);
let twoByteRope = newRope("\u221e", right);

// Flattening |twoByteRope| changes |right| from a Latin-1 rope into a two-byte
// dependent string. At this point, |latin1Rope| has the Latin-1 flag set, but
// also has a two-byte rope child.
ensureLinearString(twoByteRope);

let result = latin1Rope.substring(0, 3);

assertEq(result, "ab0");
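The situation this test sets up can be modelled outside the engine. The following is a minimal sketch, not SpiderMonkey's implementation: `makeLinear`, `makeRope`, `contentOf`, and `flatten` are hypothetical helpers, the object layout is invented for illustration, and `assertEq` is a local stand-in for the shell function of the same name. It assumes only what the comments in this commit state: a rope's Latin-1 flag is computed from its children at construction, and flattening a two-byte rope turns child ropes into two-byte dependent strings.

```javascript
// Toy model only: hypothetical helpers, not SpiderMonkey's real data structures.

// Local stand-in for the JS shell's assertEq so the sketch runs in any engine.
function assertEq(actual, expected) {
  if (actual !== expected) {
    throw new Error(`Assertion failed: got ${actual}, expected ${expected}`);
  }
}

function isLatin1Text(text) {
  return [...text].every((c) => c.codePointAt(0) <= 0xff);
}

function makeLinear(text) {
  return { kind: "linear", latin1: isLatin1Text(text), text };
}

// The flag is computed once, from the children's flags at construction time.
function makeRope(left, right) {
  return { kind: "rope", latin1: left.latin1 && right.latin1, left, right };
}

// Read a node's characters regardless of its kind.
function contentOf(node) {
  switch (node.kind) {
    case "linear":
      return node.text;
    case "dependent":
      return node.base.text.slice(node.start, node.start + node.length);
    default:
      return contentOf(node.left) + contentOf(node.right);
  }
}

// Flatten |root| in place. Ropes below the root become dependent strings that
// point into the root's buffer and take on the root's encoding flag.
function flatten(root) {
  if (root.kind !== "rope") return root;
  const text = contentOf(root);
  const children = [root.left, root.right];
  const rootLatin1 = root.latin1;
  delete root.left;
  delete root.right;
  Object.assign(root, { kind: "linear", text });

  const rewrite = (node, start) => {
    const length = contentOf(node).length; // measured before mutating |node|
    if (node.kind === "rope") {
      const leftLength = rewrite(node.left, start);
      rewrite(node.right, start + leftLength);
      delete node.left;
      delete node.right;
      Object.assign(node, { kind: "dependent", latin1: rootLatin1, base: root, start, length });
    }
    return length;
  };
  const leftLength = rewrite(children[0], 0);
  rewrite(children[1], leftLength);
  return root;
}

// Same shape as the test: |right| is shared by a Latin-1 and a two-byte rope.
const right = makeRope(makeLinear("b"), makeLinear("012345678901234567890123456789"));
const latin1Rope = makeRope(makeLinear("a"), right);
const twoByteRope = makeRope(makeLinear("\u221e"), right);

flatten(twoByteRope);

assertEq(latin1Rope.latin1, true); // flag on the parent is unchanged
assertEq(right.latin1, false);     // shared child is now a two-byte dependent string
assertEq(contentOf(latin1Rope).substring(0, 3), "ab0");
```

In this model the parent's flag stays a correct statement about its content (every character still fits in Latin-1) even though it no longer describes how the child is stored, which matches the distinction the new `LATIN1_CHARS_BIT` comment draws between linear strings and ropes.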
4 changes: 4 additions & 0 deletions js/src/vm/StringType-inl.h
@@ -321,6 +321,10 @@ inline JSRope::JSRope(JSString* left, JSString* right, size_t length) {
// |length| must be the sum of the length of both child nodes.
MOZ_ASSERT(left->length() + right->length() == length);

// |isLatin1| is set when both children are guaranteed to contain only Latin-1
// characters. Note that flattening either rope child can clear the Latin-1
// flag of that child, so it's possible that a Latin-1 rope can end up with
// both children being two-byte (dependent) strings.
bool isLatin1 = left->hasLatin1Chars() && right->hasLatin1Chars();

// Do not try to make a rope that could fit inline.
14 changes: 13 additions & 1 deletion js/src/vm/StringType.h
@@ -297,7 +297,10 @@ class JSString : public js::gc::CellWithLengthAndFlags {
* If LATIN1_CHARS_BIT is set, the string's characters are stored as Latin1
* instead of TwoByte. This flag can also be set for ropes, if both the
* left and right nodes are Latin1. Flattening will result in a Latin1
* string in this case.
* string in this case. When we flatten a TwoByte rope, we turn child ropes
* (including Latin1 ropes) into TwoByte dependent strings. If one of these
* strings is also part of another Latin1 rope tree, we can have a Latin1 rope
* with a TwoByte descendent.
*
* The other flags store the string's type. Instead of using a dense index
* to represent the most-derived type, string types are encoded to allow
@@ -385,6 +388,15 @@ class JSString : public js::gc::CellWithLengthAndFlags {
static_assert((TYPE_FLAGS_MASK & js::gc::HeaderWord::RESERVED_MASK) == 0,
"GC reserved bits must not be used for Strings");

// Linear strings:
// - Content and representation are Latin-1 characters.
// - Unmodifiable after construction.
//
// Ropes:
// - Content are Latin-1 characters.
// - Flag may be cleared when the rope is changed into a dependent string.
//
// Also see LATIN1_CHARS_BIT description under "Flag Encoding".
static const uint32_t LATIN1_CHARS_BIT = js::Bit(9);

// Whether this atom's characters store an uint32 index value less than or
