22 Text Processing

22.1 String Objects

22.1.1 The String Constructor

The String constructor:

  • is %String%.
  • is the initial value of the "String" property of the global object.
  • creates and initializes a new String object when called as a constructor.
  • performs a type conversion when called as a function rather than as a constructor.
  • may be used as the value of an extends clause of a class definition. Subclass constructors that intend to inherit the specified String behaviour must include a super call to the String constructor to create and initialize the subclass instance with a [[StringData]] internal slot.

22.1.1.1 String ( value )

This function performs the following steps when called:

  1. If value is not present, then
    1. Let s be the empty String.
  2. Else,
    1. If NewTarget is undefined and value is a Symbol, return SymbolDescriptiveString(value).
    2. Let s be ? ToString(value).
  3. If NewTarget is undefined, return s.
  4. Return StringCreate(s, ? GetPrototypeFromConstructor(NewTarget, "%String.prototype%")).
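
The following is a brief illustration of the two call paths described by the steps above, not part of the algorithm itself. The variable names are chosen for this sketch only.

```javascript
// Called as a function: performs a type conversion and yields a String value.
const converted = String(42);            // "42"
const noArgument = String();             // "" (value not present)

// Symbols are accepted only on the function-call path (step 2.a).
const described = String(Symbol("id"));  // "Symbol(id)"

// Called as a constructor: yields a String object wrapping the converted value.
const wrapped = new String(42);

// With NewTarget defined, step 2.b applies ToString, which throws for Symbols.
let symbolError = null;
try {
  new String(Symbol("id"));
} catch (e) {
  symbolError = e;
}

console.log(typeof converted, typeof wrapped);   // string object
console.log(symbolError instanceof TypeError);   // true
```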

22.1.2 Properties of the String Constructor

The String constructor:

  • has a [[Prototype]] internal slot whose value is %Function.prototype%.
  • has the following properties:

22.1.2.1 String.fromCharCode ( ...codeUnits )

This function may be called with any number of arguments which form the rest parameter codeUnits.

It performs the following steps when called:

  1. Let result be the empty String.
  2. For each element next of codeUnits, do
    1. Let nextCU be the code unit whose numeric value is ℝ(? ToUint16(next)).
    2. Set result to the string-concatenation of result and nextCU.
  3. Return result.

The "length" property of this function is 1𝔽.
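
As an informal illustration of the ToUint16 conversion in step 2.a, the sketch below shows that out-of-range arguments wrap modulo 2^16 rather than throwing. The variable names are illustrative.

```javascript
const plain = String.fromCharCode(0x48, 0x69);  // "Hi"

// 0x10041 modulo 2^16 is 0x0041, so the result is "A", not a supplementary character.
const wrappedAround = String.fromCharCode(0x10041);

// ToUint16(-1) is 65535, so the result is the single code unit 0xFFFF.
const negative = String.fromCharCode(-1);

// With no arguments the loop body never runs.
const empty = String.fromCharCode();

console.log(plain, wrappedAround, negative.charCodeAt(0), empty.length);
```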

22.1.2.2 String.fromCodePoint ( ...codePoints )

This function may be called with any number of arguments which form the rest parameter codePoints.

It performs the following steps when called:

  1. Let result be the empty String.
  2. For each element next of codePoints, do
    1. Let nextCP be ? ToNumber(next).
    2. If nextCP is not an integral Number, throw a RangeError exception.
    3. If ℝ(nextCP) < 0 or ℝ(nextCP) > 0x10FFFF, throw a RangeError exception.
    4. Set result to the string-concatenation of result and UTF16EncodeCodePoint(ℝ(nextCP)).
  3. Assert: If codePoints is empty, then result is the empty String.
  4. Return result.

The "length" property of this function is 1𝔽.
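
A minimal sketch of the behaviour specified above, contrasting it with String.fromCharCode: invalid inputs throw instead of wrapping. The helper throwsRangeError is hypothetical and exists only for this illustration.

```javascript
// Hypothetical helper: reports whether calling fn throws a RangeError.
function throwsRangeError(fn) {
  try {
    fn();
    return false;
  } catch (e) {
    return e instanceof RangeError;
  }
}

// A supplementary code point is encoded as a surrogate pair (two code units).
const grin = String.fromCodePoint(0x1F600);

// Arguments pass through ToNumber first, so numeric strings are accepted.
const fromString = String.fromCodePoint("65");  // "A"

const rejectsFraction = throwsRangeError(() => String.fromCodePoint(1.5));
const rejectsNegative = throwsRangeError(() => String.fromCodePoint(-1));
const rejectsTooLarge = throwsRangeError(() => String.fromCodePoint(0x110000));

console.log(grin.length, grin === "\uD83D\uDE00", fromString);
console.log(rejectsFraction, rejectsNegative, rejectsTooLarge);
```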

22.1.2.3 String.prototype

The initial value of String.prototype is the String prototype object.

This property has the attributes { [[Writable]]: false, [[Enumerable]]: false, [[Configurable]]: false }.

22.1.2.4 String.raw ( template, ...substitutions )

This function may be called with a variable number of arguments. The first argument is template and the remainder of the arguments form the List substitutions.

It performs the following steps when called:

  1. Let substitutionCount be the number of elements in substitutions.
  2. Let cooked be ? ToObject(template).
  3. Let literals be ? ToObject(? Get(cooked, "raw")).
  4. Let literalCount be ? LengthOfArrayLike(literals).
  5. If literalCount ≤ 0, return the empty String.
  6. Let R be the empty String.
  7. Let nextIndex be 0.
  8. Repeat,
    1. Let nextLiteralVal be ? Get(literals, ! ToString(𝔽(nextIndex))).
    2. Let nextLiteral be ? ToString(nextLiteralVal).
    3. Set R to the string-concatenation of R and nextLiteral.
    4. If nextIndex + 1 = literalCount, return R.
    5. If nextIndex < substitutionCount, then
      1. Let nextSubVal be substitutions[nextIndex].
      2. Let nextSub be ? ToString(nextSubVal).
      3. Set R to the string-concatenation of R and nextSub.
    6. Set nextIndex to nextIndex + 1.

Note

This function is intended for use as a tag function of a Tagged Template (13.3.11). When called as such, the first argument will be a well formed template object and the rest parameter will contain the substitution values.
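
The sketch below illustrates both the tagged-template use and a direct call with a hand-built object that has a "raw" property. The direct calls are shown only to make the interleaving of literals and substitutions in step 8 visible.

```javascript
// As a tag function: escape sequences are left uninterpreted.
const tagged = String.raw`C:\temp\new${1 + 1}`;

// Direct call: literals and substitutions are interleaved.
const interleaved = String.raw({ raw: ["a", "b", "c"] }, 1, 2);   // "a1b2c"

// Fewer substitutions than gaps: the missing ones are simply skipped (step 8.e).
const fewer = String.raw({ raw: ["a", "b", "c"] }, 1);            // "a1bc"

// Extra substitutions are ignored once the last literal has been appended (step 8.d).
const extra = String.raw({ raw: ["x"] }, 1, 2);                   // "x"

// literalCount of 0 returns the empty String (step 5).
const none = String.raw({ raw: [] }, 1);                          // ""

console.log(tagged, interleaved, fewer, extra, none.length);
```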

22.1.3 Properties of the String Prototype Object

The String prototype object:

  • is %String.prototype%.
  • is a String exotic object and has the internal methods specified for such objects.
  • has a [[StringData]] internal slot whose value is the empty String.
  • has a "length" property whose initial value is +0𝔽 and whose attributes are { [[Writable]]: false, [[Enumerable]]: false, [[Configurable]]: false }.
  • has a [[Prototype]] internal slot whose value is %Object.prototype%.

Unless explicitly stated otherwise, the methods of the String prototype object defined below are not generic and the this value passed to them must be either a String value or an object that has a [[StringData]] internal slot that has been initialized to a String value.

22.1.3.1 String.prototype.at ( index )

  1. Let O be the this value.
  2. Perform ? RequireObjectCoercible(O).
  3. Let S be ? ToString(O).
  4. Let len be the length of S.
  5. Let relativeIndex be ? ToIntegerOrInfinity(index).
  6. If relativeIndex ≥ 0, then
    1. Let k be relativeIndex.
  7. Else,
    1. Let k be len + relativeIndex.
  8. If k < 0 or k ≥ len, return undefined.
  9. Return the substring of S from k to k + 1.
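
An informal illustration of the relative-index handling in steps 5 to 8. The sample string is arbitrary.

```javascript
const word = "spec";

const first = word.at(0);        // "s"
const last = word.at(-1);        // "c": k is len + relativeIndex = 4 + (-1) = 3
const truncated = word.at(1.9);  // "p": ToIntegerOrInfinity truncates to 1
const noIndex = word.at();       // "s": undefined converts to 0
const tooLarge = word.at(4);     // undefined: k is not less than len
const tooSmall = word.at(-5);    // undefined: k is negative

console.log(first, last, truncated, noIndex, tooLarge, tooSmall);
```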

22.1.3.2 String.prototype.charAt ( pos )

Note 1

This method returns a single element String containing the code unit at index pos within the String value resulting from converting this object to a String. If there is no element at that index, the result is the empty String. The result is a String value, not a String object.

If pos is an integral Number, then the result of x.charAt(pos) is equivalent to the result of x.substring(pos, pos + 1).

This method performs the following steps when called:

  1. Let O be the this value.
  2. Perform ? RequireObjectCoercible(O).
  3. Let S be ? ToString(O).
  4. Let position be ? ToIntegerOrInfinity(pos).
  5. Let size be the length of S.
  6. If position < 0 or position ≥ size, return the empty String.
  7. Return the substring of S from position to position + 1.

Note 2

This method is intentionally generic; it does not require that its this value be a String object. Therefore, it can be transferred to other kinds of objects for use as a method.

22.1.3.3 String.prototype.charCodeAt ( pos )

Note 1

This method returns a Number (a non-negative integral Number less than 2^16) that is the numeric value of the code unit at index pos within the String resulting from converting this object to a String. If there is no element at that index, the result is NaN.

This method performs the following steps when called:

  1. Let O be the this value.
  2. Perform ? RequireObjectCoercible(O).
  3. Let S be ? ToString(O).
  4. Let position be ? ToIntegerOrInfinity(pos).
  5. Let size be the length of S.
  6. If position < 0 or position ≥ size, return NaN.
  7. Return the Number value for the numeric value of the code unit at index position within the String S.

Note 2

This method is intentionally generic; it does not require that its this value be a String object. Therefore it can be transferred to other kinds of objects for use as a method.

22.1.3.4 String.prototype.codePointAt ( pos )

Note 1

This method returns a non-negative integral Number less than or equal to 0x10FFFF𝔽 that is the numeric value of the UTF-16 encoded code point (6.1.4) starting at the string element at index pos within the String resulting from converting this object to a String. If there is no element at that index, the result is undefined. If a valid UTF-16 surrogate pair does not begin at pos, the result is the code unit at pos.

This method performs the following steps when called:

  1. Let O be the this value.
  2. Perform ? RequireObjectCoercible(O).
  3. Let S be ? ToString(O).
  4. Let position be ? ToIntegerOrInfinity(pos).
  5. Let size be the length of S.
  6. If position < 0 or position ≥ size, return undefined.
  7. Let cp be CodePointAt(S, position).
  8. Return 𝔽(cp.[[CodePoint]]).

Note 2

This method is intentionally generic; it does not require that its this value be a String object. Therefore it can be transferred to other kinds of objects for use as a method.
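
The following sketch contrasts charAt, charCodeAt, and codePointAt on a string containing a surrogate pair, and shows the three different "no element" results. The sample string and the property names of the report object are illustrative.

```javascript
// "a" followed by U+1F600, which is stored as the surrogate pair D83D DE00.
const text = "a\u{1F600}";

const report = {
  length: text.length,                      // 3 code units
  charAtPair: text.charAt(1),               // lead surrogate only
  charCodeAtPair: text.charCodeAt(1),       // 0xD83D
  codePointAtPair: text.codePointAt(1),     // 0x1F600, the whole pair
  codePointAtTrail: text.codePointAt(2),    // 0xDE00, no pair begins here
  charAtMissing: text.charAt(5),            // empty String
  charCodeAtMissing: text.charCodeAt(5),    // NaN
  codePointAtMissing: text.codePointAt(5),  // undefined
};

// The methods are generic: the this value is converted with ToString.
const fromNumber = String.prototype.charAt.call(12345, 1);  // "2"

console.log(report, fromNumber);
```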

22.1.3.5 String.prototype.concat ( ...args )

Note 1

When this method is called it returns the String value consisting of the code units of the this value (converted to a String) followed by the code units of each of the arguments converted to a String. The result is a String value, not a String object.

This method performs the following steps when called:

  1. Let O be the this value.
  2. Perform ? RequireObjectCoercible(O).
  3. Let S be ? ToString(O).
  4. Let R be S.
  5. For each element next of args, do
    1. Let nextString be ? ToString(next).
    2. Set R to the string-concatenation of R and nextString.
  6. Return R.

The "length" property of this method is 1𝔽.

Note 2

This method is intentionally generic; it does not require that its this value be a String object. Therefore it can be transferred to other kinds of objects for use as a method.
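
As a small illustration, every argument is converted with ToString before concatenation, and the this value need not be a String:

```javascript
const mixed = "a".concat(1, null, undefined);          // "a1nullundefined"
const withArray = "x".concat([1, 2]);                  // "x1,2"
const unchanged = "same".concat();                     // "same"
const generic = String.prototype.concat.call(1, 2);    // "12"

console.log(mixed, withArray, unchanged, generic);
```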

22.1.3.6 String.prototype.constructor

The initial value of String.prototype.constructor is %String%.

22.1.3.7 String.prototype.endsWith ( searchString [ , endPosition ] )

This method performs the following steps when called:

  1. Let O be the this value.
  2. Perform ? RequireObjectCoercible(O).
  3. Let S be ? ToString(O).
  4. Let isRegExp be ? IsRegExp(searchString).
  5. If isRegExp is true, throw a TypeError exception.
  6. Let searchStr be ? ToString(searchString).
  7. Let len be the length of S.
  8. If endPosition is undefined, let pos be len; else let pos be ? ToIntegerOrInfinity(endPosition).
  9. Let end be the result of clamping pos between 0 and len.
  10. Let searchLength be the length of searchStr.
  11. If searchLength = 0, return true.
  12. Let start be end - searchLength.
  13. If start < 0, return false.
  14. Let substring be the substring of S from start to end.
  15. If substring is searchStr, return true.
  16. Return false.

Note 1

This method returns true if the sequence of code units of searchString converted to a String is the same as the corresponding code units of this object (converted to a String) starting at endPosition - length(searchString). Otherwise it returns false.

Note 2

Throwing an exception if the first argument is a RegExp is specified in order to allow future editions to define extensions that allow such argument values.

Note 3

This method is intentionally generic; it does not require that its this value be a String object. Therefore, it can be transferred to other kinds of objects for use as a method.
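
A brief sketch of the clamping and RegExp-rejection behaviour described in the steps above. The sample strings are arbitrary.

```javascript
const subject = "ECMAScript";

const suffix = subject.endsWith("Script");         // true
const prefixAsEnd = subject.endsWith("ECMA", 4);   // true: compares code units 0 to 4
const clampedHigh = "abc".endsWith("c", 100);      // true: end is clamped to len
const clampedLow = "abc".endsWith("a", -5);        // false: end is 0, start is -1
const emptySearch = "abc".endsWith("", 0);         // true: searchLength is 0
const tooLong = "abc".endsWith("xabc");            // false: start would be negative

// A RegExp argument is rejected before any conversion (steps 4 and 5).
let regexpError = null;
try {
  subject.endsWith(/Script/);
} catch (e) {
  regexpError = e;
}

console.log(suffix, prefixAsEnd, clampedHigh, clampedLow, emptySearch, tooLong);
console.log(regexpError instanceof TypeError);
```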

22.1.3.8 String.prototype.includes ( searchString [ , position ] )

This method performs the following steps when called:

  1. Let O be the this value.
  2. Perform ? RequireObjectCoercible(O).
  3. Let S be ? ToString(O).
  4. Let isRegExp be ? IsRegExp(searchString).
  5. If isRegExp is true, throw a TypeError exception.
  6. Let searchStr be ? ToString(searchString).
  7. Let pos be ? ToIntegerOrInfinity(position).
  8. Assert: If position is undefined, then pos is 0.
  9. Let len be the length of S.
  10. Let start be the result of clamping pos between 0 and len.
  11. Let index be StringIndexOf(S, searchStr, start).
  12. If index is not-found, return false.
  13. Return true.

Note 1

If searchString appears as a substring of the result of converting this object to a String, at one or more indices that are greater than or equal to position, this function returns true; otherwise, it returns false. If position is undefined, 0 is assumed, so as to search all of the String.

Note 2

Throwing an exception if the first argument is a RegExp is specified in order to allow future editions to define extensions that allow such argument values.

Note 3

This method is intentionally generic; it does not require that its this value be a String object. Therefore, it can be transferred to other kinds of objects for use as a method.

22.1.3.9 String.prototype.indexOf ( searchString [ , position ] )

Note 1

If searchString appears as a substring of the result of converting this object to a String, at one or more indices that are greater than or equal to position, then the smallest such index is returned; otherwise, -1𝔽 is returned. If position is undefined, +0𝔽 is assumed, so as to search all of the String.

This method performs the following steps when called:

  1. Let O be the this value.
  2. Perform ? RequireObjectCoercible(O).
  3. Let S be ? ToString(O).
  4. Let searchStr be ? ToString(searchString).
  5. Let pos be ? ToIntegerOrInfinity(position).
  6. Assert: If position is undefined, then pos is 0.
  7. Let len be the length of S.
  8. Let start be the result of clamping pos between 0 and len.
  9. Let result be StringIndexOf(S, searchStr, start).
  10. If result is not-found, return -1𝔽.
  11. Return 𝔽(result).

Note 2

This method is intentionally generic; it does not require that its this value be a String object. Therefore, it can be transferred to other kinds of objects for use as a method.
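
The sketch below illustrates the clamping of position in indexOf and one difference from String.prototype.includes: indexOf has no IsRegExp check, so a RegExp argument is simply converted to a String. Sample values are illustrative.

```javascript
const fromStart = "abcabc".indexOf("c");         // 2
const fromThree = "abcabc".indexOf("c", 3);      // 5
const missing = "abc".indexOf("z");              // -1
const negativeStart = "abc".indexOf("a", -5);    // 0: start is clamped to 0
const emptyPastEnd = "abc".indexOf("", 10);      // 3: start is clamped to len

// ToString(/x/) is "/x/", which occurs at index 1 of "a/x/b".
const regexpAsText = "a/x/b".indexOf(/x/);       // 1

// includes performs the IsRegExp check and throws instead.
let includesError = null;
try {
  "a/x/b".includes(/x/);
} catch (e) {
  includesError = e;
}

console.log(fromStart, fromThree, missing, negativeStart, emptyPastEnd, regexpAsText);
console.log(includesError instanceof TypeError);
```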

22.1.3.10 String.prototype.isWellFormed ( )

This method performs the following steps when called:

  1. Let O be the this value.
  2. Perform ? RequireObjectCoercible(O).
  3. Let S be ? ToString(O).
  4. Return IsStringWellFormedUnicode(S).
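
The following is a hedged sketch of what a well-formedness check amounts to: scanning for surrogate code units that are not part of a valid pair. The function isWellFormedSketch is hypothetical and is not the specification's IsStringWellFormedUnicode; it uses String(value) for conversion, which differs from the steps above for null, undefined, and Symbol values.

```javascript
// Hypothetical helper, written for illustration only.
function isWellFormedSketch(value) {
  const s = String(value);
  for (let i = 0; i < s.length; i++) {
    const cp = s.codePointAt(i);
    // codePointAt yields the bare code unit when no valid pair begins at i,
    // so a result in the surrogate range indicates a lone surrogate.
    if (cp >= 0xD800 && cp <= 0xDFFF) return false;
    // A supplementary code point occupies two code units; skip the trail unit.
    if (cp > 0xFFFF) i++;
  }
  return true;
}

console.log(isWellFormedSketch("a\u{1F600}b"));  // true
console.log(isWellFormedSketch("a\uD83D"));      // false: lone lead surrogate
console.log(isWellFormedSketch("\uDE00a"));      // false: lone trail surrogate
```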

22.1.3.11 String.prototype.lastIndexOf ( searchString [ , position ] )

Note 1

If searchString appears as a substring of the result of converting this object to a String at one or more indices that are smaller than or equal to position, then the greatest such index is returned; otherwise, -1𝔽 is returned. If position is undefined, the length of the String value is assumed, so as to search all of the String.

This method performs the following steps when called:

  1. Let O be the this value.
  2. Perform ? RequireObjectCoercible(O).
  3. Let S be ? ToString(O).
  4. Let searchStr be ? ToString(searchString).
  5. Let numPos be ? ToNumber(position).
  6. Assert: If position is undefined, then numPos is NaN.
  7. If numPos is NaN, let pos be +∞; otherwise let pos be ! ToIntegerOrInfinity(numPos).
  8. Let len be the length of S.
  9. Let searchLen be the length of searchStr.
  10. If len < searchLen, return -1𝔽.
  11. Let start be the result of clamping pos between 0 and len - searchLen.
  12. Let result be StringLastIndexOf(S, searchStr, start).
  13. If result is not-found, return -1𝔽.
  14. Return 𝔽(result).

Note 2

This method is intentionally generic; it does not require that its this value be a String object. Therefore, it can be transferred to other kinds of objects for use as a method.
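
An informal illustration of how position is interpreted: NaN (including the result of converting undefined) means "search from the end", and other values are clamped between 0 and len - searchLen.

```javascript
const word = "canal";

const fromEnd = word.lastIndexOf("a");              // 3
const explicitNaN = word.lastIndexOf("a", NaN);     // 3: pos becomes +∞
const upToTwo = word.lastIndexOf("a", 2);           // 1
const upToZero = word.lastIndexOf("a", 0);          // -1
const negative = word.lastIndexOf("c", -5);         // 0: start is clamped to 0
const emptySearch = word.lastIndexOf("", 2);        // 2
const searchTooLong = "abc".lastIndexOf("abcd");    // -1: len is less than searchLen

console.log(fromEnd, explicitNaN, upToTwo, upToZero, negative, emptySearch, searchTooLong);
```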

22.1.3.12 String.prototype.localeCompare ( that [ , reserved1 [ , reserved2 ] ] )

An ECMAScript implementation that includes the ECMA-402 Internationalization API must implement this method as specified in the ECMA-402 specification. If an ECMAScript implementation does not include the ECMA-402 API the following specification of this method is used:

This method returns a Number other than NaN representing the result of an implementation-defined locale-sensitive String comparison of the this value (converted to a String S) with that (converted to a String thatValue). The result is intended to correspond with a sort order of String values according to conventions of the host environment's current locale, and will be negative when S is ordered before thatValue, positive when S is ordered after thatValue, and zero in all other cases (representing no relative ordering between S and thatValue).

Before performing the comparisons, this method performs the following steps to prepare the Strings:

  1. Let O be the this value.
  2. Perform ? RequireObjectCoercible(O).
  3. Let S be ? ToString(O).
  4. Let thatValue be ? ToString(that).

The meaning of the optional second and third parameters to this method is defined in the ECMA-402 specification; implementations that do not include ECMA-402 support must not assign any other interpretation to those parameter positions.

The actual return values are implementation-defined to permit encoding additional information in them, but this method, when considered as a method of two arguments, is required to be a consistent comparator defining a total ordering on the set of all Strings. This method is also required to recognize and honour canonical equivalence according to the Unicode Standard, including returning +0𝔽 when comparing distinguishable Strings that are canonically equivalent.

Note 1

This method itself is not directly suitable as an argument to Array.prototype.sort because the latter requires a function of two arguments.

Note 2

This method may rely on whatever language- and/or locale-sensitive comparison functionality is available to the ECMAScript environment from the host environment, and is intended to compare according to the conventions of the host environment's current locale. However, regardless of comparison capabilities, this method must recognize and honour canonical equivalence according to the Unicode Standard—for example, the following comparisons must all return +0𝔽:

// Å ANGSTROM SIGN vs.
// Å LATIN CAPITAL LETTER A + COMBINING RING ABOVE
"\u212B".localeCompare("A\u030A")

// Ω OHM SIGN vs.
// Ω GREEK CAPITAL LETTER OMEGA
"\u2126".localeCompare("\u03A9")

// ṩ LATIN SMALL LETTER S WITH DOT BELOW AND DOT ABOVE vs.
// ṩ LATIN SMALL LETTER S + COMBINING DOT ABOVE + COMBINING DOT BELOW
"\u1E69".localeCompare("s\u0307\u0323")

// ḍ̇ LATIN SMALL LETTER D WITH DOT ABOVE + COMBINING DOT BELOW vs.
// ḍ̇ LATIN SMALL LETTER D WITH DOT BELOW + COMBINING DOT ABOVE
"\u1E0B\u0323".localeCompare("\u1E0D\u0307")

// 가 HANGUL CHOSEONG KIYEOK + HANGUL JUNGSEONG A vs.
// 가 HANGUL SYLLABLE GA
"\u1100\u1161".localeCompare("\uAC00")

For a definition and discussion of canonical equivalence see the Unicode Standard, chapters 2 and 3, as well as Unicode Standard Annex #15, Unicode Normalization Forms and Unicode Technical Note #5, Canonical Equivalence in Applications. Also see Unicode Technical Standard #10, Unicode Collation Algorithm.

It is recommended that this method should not honour Unicode compatibility equivalents or compatibility decompositions as defined in the Unicode Standard, chapter 3, section 3.7.

Note 3

This method is intentionally generic; it does not require that its this value be a String object. Therefore, it can be transferred to other kinds of objects for use as a method.
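
As Note 1 observes, the method cannot be passed to Array.prototype.sort directly. A minimal sketch of the usual workaround is to wrap it in a two-argument comparator. The sample names are arbitrary, and the resulting order depends on the host environment's locale, so no particular order is claimed here.

```javascript
const names = ["\u00E9clair", "zebra", "apple"];

// Wrap the method in a function of two arguments before handing it to sort.
const sorted = [...names].sort((a, b) => a.localeCompare(b));

// Canonically equivalent Strings must compare as equal (first case in Note 2).
const equivalent = "\u212B".localeCompare("A\u030A");

console.log(sorted, equivalent);
```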

22.1.3.13 String.prototype.match ( regexp )

This method performs the following steps when called:

  1. Let O be the this value.
  2. Perform ? RequireObjectCoercible(O).
  3. If regexp is an Object, then
    1. Let matcher be ? GetMethod(regexp, %Symbol.match%).
    2. If matcher is not undefined, then
      1. Return ? Call(matcher, regexp, « O »).
  4. Let S be ? ToString(O).
  5. Let rx be ? RegExpCreate(regexp, undefined).
  6. Return ? Invoke(rx, %Symbol.match%, « S »).

Note

This method is intentionally generic; it does not require that its this value be a String object. Therefore, it can be transferred to other kinds of objects for use as a method.
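
The sketch below illustrates the two paths through the algorithm: an object with a %Symbol.match% method is consulted first (step 3), while any other argument is turned into a regular expression via RegExpCreate (step 5). The object digitMatcher is hypothetical.

```javascript
// Hypothetical matcher object: only its Symbol.match method matters here.
const digitMatcher = {
  [Symbol.match](receiver) {
    const found = /\d/.exec(String(receiver));
    return found === null ? null : found[0];
  },
};

// Step 3 path: the matcher's return value is passed through unchanged.
const viaMatcher = "abc1".match(digitMatcher);   // "1"

// Step 5 path: the String "." becomes the pattern /./, not a literal dot,
// so the first match is the first character.
const viaPattern = "a.c".match(".");

console.log(viaMatcher, viaPattern[0], viaPattern.index);
```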

22.1.3.14 String.prototype.matchAll ( regexp )

This method performs a regular expression match of the String representing the this value against regexp and returns an iterator that yields match results. Each match result is an Array containing the matched portion of the String as the first element, followed by the portions matched by any capturing groups. If the regular expression never matches, the returned iterator does not yield any match results.

It performs the following steps when called:

  1. Let O be the this value.
  2. Perform ? RequireObjectCoercible(O).
  3. If regexp is an Object, then
    1. Let isRegExp be ? IsRegExp(regexp).
    2. If isRegExp is true, then
      1. Let flags be ? Get(regexp, "flags").
      2. Perform ? RequireObjectCoercible(flags).
      3. If ? ToString(flags) does not contain "g", throw a TypeError exception.
    3. Let matcher be ? GetMethod(regexp, %Symbol.matchAll%).
    4. If matcher is not undefined, then
      1. Return ? Call(matcher, regexp, « O »).
  4. Let S be ? ToString(O).
  5. Let rx be ? RegExpCreate(regexp, "g").
  6. Return ? Invoke(rx, %Symbol.matchAll%, « S »).

Note 1

This method is intentionally generic; it does not require that its this value be a String object. Therefore, it can be transferred to other kinds of objects for use as a method.

Note 2

Similarly to String.prototype.split, String.prototype.matchAll is designed to typically act without mutating its inputs.
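
A short, informal illustration of the points above: the iterator yields one match result per match, a RegExp without the "g" flag is rejected, a non-RegExp argument is compiled with the "g" flag, and the original RegExp's lastIndex is left untouched.

```javascript
const pattern = /[a-z](\d+)/g;
const matches = [..."a1b22".matchAll(pattern)];

// Each result holds the full match, then the capturing groups.
const summary = matches.map((m) => [m[0], m[1], m.index]);

// The input RegExp is not mutated by iteration.
const lastIndexAfter = pattern.lastIndex;

// A String argument is treated as a pattern source with the "g" flag added,
// so "." matches every character.
const everyChar = [..."a.b".matchAll(".")].map((m) => m[0]);

// A RegExp without "g" throws (step 3.b.iii).
let flagError = null;
try {
  "abc".matchAll(/b/);
} catch (e) {
  flagError = e;
}

console.log(summary, lastIndexAfter, everyChar, flagError instanceof TypeError);
```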

22.1.3.15 String.prototype.normalize ( [ form ] )

This method performs the following steps when called:

  1. Let O be the this value.
  2. Perform ? RequireObjectCoercible(O).
  3. Let S be ? ToString(O).
  4. If form is undefined, let f be "NFC".
  5. Else, let f be ? ToString(form).
  6. If f is not one of "NFC", "NFD", "NFKC", or "NFKD", throw a RangeError exception.
  7. Let ns be the String value that is the result of normalizing S into the normalization form named by f as specified in the latest Unicode Standard, Normalization Forms.
  8. Return ns.

Note

This method is intentionally generic; it does not require that its this value be a String object. Therefore it can be transferred to other kinds of objects for use as a method.
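
A minimal sketch of the four accepted form names and the RangeError for anything else. It assumes a host whose Unicode normalization data is available, as is the case in common engines.

```javascript
const composed = "\u00E9";       // é as a single code point
const decomposed = "e\u0301";    // e followed by COMBINING ACUTE ACCENT

const toNFD = composed.normalize("NFD");     // equals decomposed
const toNFC = decomposed.normalize();        // form defaults to "NFC"
const ligature = "\uFB01".normalize("NFKC"); // "fi": compatibility decomposition
const keptNFC = "\uFB01".normalize("NFC");   // ligature is left as is

// Form names are compared exactly; lower case is rejected (step 6).
let formError = null;
try {
  composed.normalize("nfc");
} catch (e) {
  formError = e;
}

console.log(toNFD === decomposed, toNFC === composed, ligature, formError instanceof RangeError);
```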

22.1.3.16 String.prototype.padEnd ( maxLength [ , fillString ] )

This method performs the following steps when called:

  1. Let O be the this value.
  2. Perform ? RequireObjectCoercible(O).
  3. Return ? StringPaddingBuiltinsImpl(O, maxLength, fillString, end).

22.1.3.17 String.prototype.padStart ( maxLength [ , fillString ] )

This method performs the following steps when called:

  1. Let O be the this value.
  2. Perform ? RequireObjectCoercible(O).
  3. Return ? StringPaddingBuiltinsImpl(O, maxLength, fillString, start).

22.1.3.17.1 StringPaddingBuiltinsImpl ( O, maxLength, fillString, placement )

The abstract operation StringPaddingBuiltinsImpl takes arguments O (an ECMAScript language value), maxLength (an ECMAScript language value), fillString (an ECMAScript language value), and placement (start or end) and returns either a normal completion containing a String or a throw completion. It performs the following steps when called:

  1. Let S be ? ToString(O).
  2. Let intMaxLength be ℝ(? ToLength(maxLength)).
  3. Let stringLength be the length of S.
  4. If intMaxLength ≤ stringLength, return S.
  5. If fillString is undefined, set fillString to the String value consisting solely of the code unit 0x0020 (SPACE).
  6. Else, set fillString to ? ToString(fillString).
  7. Return StringPad(S, intMaxLength, fillString, placement).

22.1.3.17.2 StringPad ( S, maxLength, fillString, placement )

The abstract operation StringPad takes arguments S (a String), maxLength (a non-negative integer), fillString (a String), and placement (start or end) and returns a String. It performs the following steps when called:

  1. Let stringLength be the length of S.
  2. If maxLength ≤ stringLength, return S.
  3. If fillString is the empty String, return S.
  4. Let fillLen be maxLength - stringLength.
  5. Let truncatedStringFiller be the String value consisting of repeated concatenations of fillString truncated to length fillLen.
  6. If placement is start, return the string-concatenation of truncatedStringFiller and S.
  7. Else, return the string-concatenation of S and truncatedStringFiller.

Note 1

The argument maxLength will be clamped such that it can be no smaller than the length of S.

Note 2

The argument fillString defaults to " " (the String value consisting of the code unit 0x0020 SPACE).
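
The StringPad steps can be sketched in ECMAScript itself as follows. The function stringPadSketch is hypothetical: it represents the placement values start and end as the Strings "start" and "end", and it assumes its arguments already have the types the abstract operation requires.

```javascript
// Hypothetical helper mirroring the StringPad steps; not a built-in.
function stringPadSketch(s, maxLength, fillString, placement) {
  const stringLength = s.length;
  if (maxLength <= stringLength) return s;       // step 2
  if (fillString === "") return s;               // step 3
  const fillLen = maxLength - stringLength;      // step 4
  // Step 5: repeat the filler often enough, then truncate to fillLen.
  const repeats = Math.ceil(fillLen / fillString.length);
  const filler = fillString.repeat(repeats).slice(0, fillLen);
  return placement === "start" ? filler + s : s + filler;  // steps 6 and 7
}

console.log(stringPadSketch("7", 3, "0", "start"));   // 007
console.log(stringPadSketch("abc", 8, "12", "end"));  // abc12121
console.log(stringPadSketch("abc", 2, "x", "end"));   // abc
```

For inputs of the expected types, the results agree with the built-in padStart and padEnd methods, which reach StringPad through StringPaddingBuiltinsImpl.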

22.1.3.17.3 ToZeroPaddedDecimalString ( n, minLength )

The abstract operation ToZeroPaddedDecimalString takes arguments n (a non-negative integer) and minLength (a non-negative integer) and returns a String. It performs the following steps when called:

  1. Let S be the String representation of n, formatted as a decimal number.
  2. Return StringPad(S, minLength, "0", start).
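
A rough equivalent can be written with the built-in padStart method. The function toZeroPaddedDecimalStringSketch is hypothetical, and it assumes n is a non-negative safe integer so that String(n) yields plain decimal digits.

```javascript
// Hypothetical helper; assumes n is a non-negative safe integer.
function toZeroPaddedDecimalStringSketch(n, minLength) {
  return String(n).padStart(minLength, "0");
}

console.log(toZeroPaddedDecimalStringSketch(7, 2));     // 07
console.log(toZeroPaddedDecimalStringSketch(2024, 2));  // 2024
```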

22.1.3.18 String.prototype.repeat ( count )

This method performs the following steps when called:

  1. Let O be the this value.
  2. Perform ? RequireObjectCoercible(O).
  3. Let S be ? ToString(O).
  4. Let n be ? ToIntegerOrInfinity(count).
  5. If n < 0 or n = +∞, throw a RangeError exception.
  6. If n = 0, return the empty String.
  7. Return the String value that is made from n copies of S appended together.

Note 1

This method creates the String value consisting of the code units of the this value (converted to String) repeated count times.

Note 2

This method is intentionally generic; it does not require that its this value be a String object. Therefore, it can be transferred to other kinds of objects for use as a method.
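
An informal illustration of the count handling in steps 4 to 6. The helper throwsRangeError is hypothetical.

```javascript
// Hypothetical helper: reports whether calling fn throws a RangeError.
function throwsRangeError(fn) {
  try {
    fn();
    return false;
  } catch (e) {
    return e instanceof RangeError;
  }
}

const thrice = "ab".repeat(3);                         // "ababab"
const truncated = "ab".repeat(2.9);                    // "abab": count truncates to 2
const zero = "ab".repeat(0);                           // ""
const smallNegative = "ab".repeat(-0.5);               // "": count truncates to 0
const generic = String.prototype.repeat.call(7, 2);    // "77"

const rejectsNegative = throwsRangeError(() => "ab".repeat(-1));
const rejectsInfinity = throwsRangeError(() => "ab".repeat(Infinity));

console.log(thrice, truncated, zero.length, smallNegative.length, generic);
console.log(rejectsNegative, rejectsInfinity);
```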

22.1.3.19 String.prototype.replace ( searchValue, replaceValue )

This method performs the following steps when called:

  1. Let O be the this value.
  2. Perform ? RequireObjectCoercible(O).
  3. If searchValue is an Object, then
    1. Let replacer be ? GetMethod(searchValue, %Symbol.replace%).
    2. If replacer is not undefined, then
      1. Return ? Call(replacer, searchValue, « O, replaceValue »).
  4. Let string be ?