A one-stop shop for working with text displayed in the terminal.
The goal of this project is to alleviate the headache of working with JavaScript's internal representation of Unicode characters, particularly within the context of displaying text in the terminal for command line applications.
Features
- Implements the Unicode grapheme cluster breaking algorithm outlined in UAX #29 to split strings into user perceived characters (graphemes).
- Accurately measures the visual width of strings when they are displayed in the terminal, with support for emoji characters and ZWJ sequences. For more details see the descriptions of the codePointWidth, stringWidth, and charWidths functions below.
- Provides methods for slicing and wrapping strings that contain ANSI escape codes.
Everything in this module is up to date with the latest version of Unicode (currently version 16.0.0).
Check out the acknowledgements section below for a look at the other JavaScript projects that inspired this module.
Install with npm:
npm install tty-strings
Or with yarn:
yarn add tty-strings
Then import one or more of the functions detailed in the section below.
Get the visual width of a Unicode code point, this project's equivalent of wcwidth. Using this function alone to accurately measure the visual width of strings is insufficient; instead, use the stringWidth or charWidths functions, as they take into account the context of each code point.
- code - Unicode code point (must be a number).
Returns number - 2 for a full width code point, 0 for a zero width code point, and 1 for everything else.
Code points
Full width code points are all Unicode code points whose East_Asian_Width property value is F or W, which are derived from the EastAsianWidth.txt data file associated with UAX #11: East Asian Width.
Zero width code points include all Unicode code points whose General_Category property value is Mn, Me, or Cc (derived from the DerivedGeneralCategory.txt data file), as well as all code points with the Default_Ignorable_Code_Point property (derived from the DerivedCoreProperties.txt data file). Check out UAX #44: Unicode Character Database for more information about these properties.
Example
const { codePointWidth } = require('tty-strings');
// The numerical code point for '古' is 53E4
codePointWidth(0x53E4);
// > 2
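The classification rules described above can be approximated in a few lines. The following is only a rough sketch, not this module's implementation: the helper name sketchCodePointWidth and the FULL_WIDTH_RANGES table are assumptions made for illustration, and the table is a small hand-picked subset of the F and W ranges in EastAsianWidth.txt. The zero width check relies on the Unicode property escapes built into JavaScript regular expressions.

```javascript
// Hypothetical helper, not part of tty-strings.
// ASSUMPTION: a tiny hand-picked subset of the F/W ranges from EastAsianWidth.txt;
// real coverage requires the complete data file.
const FULL_WIDTH_RANGES = [
    [0x1100, 0x115F], // Hangul Jamo initial consonants (W)
    [0x4E00, 0x9FFF], // CJK Unified Ideographs (W)
    [0xAC00, 0xD7A3], // Hangul Syllables (W)
    [0xFF01, 0xFF60], // Fullwidth ASCII variants (F)
];

// General_Category Mn, Me, or Cc, or the Default_Ignorable_Code_Point property
const ZERO_WIDTH = /^[\p{Mn}\p{Me}\p{Cc}\p{Default_Ignorable_Code_Point}]$/u;

function sketchCodePointWidth(code) {
    // Zero width is checked first in this sketch; the ordering is an assumption
    if (ZERO_WIDTH.test(String.fromCodePoint(code))) return 0;
    if (FULL_WIDTH_RANGES.some(([lo, hi]) => code >= lo && code <= hi)) return 2;
    return 1;
}

sketchCodePointWidth(0x53E4); // 2 (CJK ideograph)
sketchCodePointWidth(0x0300); // 0 (combining grave accent, category Mn)
sketchCodePointWidth(0x0061); // 1 (latin small letter a)
```

Because the range table is incomplete, most emoji and many other wide characters fall through to 1 here, which is one reason a complete data-driven table is needed in practice.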
Get the length of a string in grapheme clusters. ANSI escape codes will be ignored.
- string - Input string to measure.
Returns number - the length of the string in graphemes.
Example
const { stringLength } = require('tty-strings');
'🏳️‍🌈'.length;
// > 6
stringLength('🏳️‍🌈');
// > 1
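For comparison, a similar grapheme count can be sketched with the Intl.Segmenter API that ships with recent JavaScript runtimes. This is not how this module works internally (it carries its own implementation of the UAX #29 algorithm); the helper name sketchStringLength is hypothetical, and the escape pattern is a deliberate simplification that only recognizes SGR style sequences.

```javascript
// ASSUMPTION: simplified pattern that matches only SGR style sequences
// such as '\x1b[32m'; other kinds of escape sequences are not handled.
const SGR_PATTERN = /\x1b\[[0-9;]*m/g;

// Hypothetical helper, not part of tty-strings.
function sketchStringLength(string) {
    const plain = string.replace(SGR_PATTERN, '');
    const segmenter = new Intl.Segmenter(undefined, { granularity: 'grapheme' });
    let count = 0;
    // Each iteration yields one user perceived character
    for (const _ of segmenter.segment(plain)) count += 1;
    return count;
}

sketchStringLength('\x1b[32mfoo\x1b[39m'); // 3
```

Results from Intl.Segmenter depend on the Unicode version bundled with the runtime, so counts for very recent emoji sequences may differ between environments.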
Measure the visual width of a string. ANSI escape codes will be ignored. This function endeavors to be more accurate than an implementation of wcswidth, as it takes into account the context of each code point within a grapheme cluster.
- string - Input string to measure.
Returns number - the visual width of the string.
Example
const { stringWidth } = require('tty-strings');
stringWidth('🧑🏻‍🤝‍🧑🏼');
// > 2
Word wrap text to a specified column width.
Input string may contain ANSI escape codes. Style and hyperlink sequences will be wrapped, while all other types of control sequences will be ignored and will not be included in the output string.
- string - Text to word wrap.
- columns - Column width to wrap text to.
- options - Optional options object specifying the properties detailed below.
Returns string - the word wrapped string.
Type: boolean
Default: false
By default, words that are longer than the specified column width will not be broken and will therefore extend past the specified column width. Setting this to true will enable hard wrapping, in which words longer than the column width will be broken and wrapped across multiple rows.
Type: boolean
Default: true
Trim leading whitespace from the beginning of each line. Setting this to false will preserve any leading whitespace found before each line in the input string.
Example
const { wordWrap } = require('tty-strings'),
    chalk = require('chalk');
const text = 'The ' + chalk.bgGreen.magenta('quick brown 🦊 jumps over') + ' the 😴 🐶.';
console.log(wordWrap(text, 20));
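The difference between the default soft wrapping and hard wrapping can be seen in a stripped-down greedy wrapper. This is a sketch under strong assumptions, not this module's algorithm: it handles plain single-width text only, ignores ANSI escape codes entirely, measures width with String.length, and collapses runs of whitespace. The function name sketchWordWrap and the option name hard are chosen for this sketch.

```javascript
// Hypothetical helper, not part of tty-strings. Plain single-width text only.
function sketchWordWrap(text, columns, { hard = false } = {}) {
    if (!(columns > 0)) throw new RangeError('columns must be a positive number');
    const lines = [];
    let line = '';
    for (let word of text.split(/\s+/).filter(Boolean)) {
        // Hard wrapping: break an over-long word into column-sized chunks.
        // Simplification: the current line is flushed first rather than filled up.
        while (hard && word.length > columns) {
            if (line) {
                lines.push(line);
                line = '';
            }
            lines.push(word.slice(0, columns));
            word = word.slice(columns);
        }
        if (!line) {
            line = word;
        } else if (line.length + 1 + word.length <= columns) {
            line += ' ' + word;
        } else {
            // Word does not fit: start a new row with it
            lines.push(line);
            line = word;
        }
    }
    if (line) lines.push(line);
    return lines.join('\n');
}

sketchWordWrap('the quick brown fox', 10);
// 'the quick\nbrown fox'
sketchWordWrap('a extraordinarily b', 5);
// 'a\nextraordinarily\nb' (the long word overflows the column width)
sketchWordWrap('a extraordinarily b', 5, { hard: true });
// 'a\nextra\nordin\narily\nb'
```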
Slice a string by character index. Behaves like the native String.slice(), except that indexes refer to grapheme clusters within the string.
Input string may contain ANSI escape sequences. Style and hyperlink sequences that apply to the sliced portion of the string will be preserved, while all other types of control sequences will be ignored and will not be included in the output slice.
- string - Input string to slice.
- beginIndex - Character index (defaults to 0) at which to begin the slice. Negative values specify a position measured from the character length of the string.
- endIndex - Character index before which to end the slice. Negative values specify a position measured from the character length of the string. If omitted, the slice will extend to the end of the string.
Returns string - the sliced string.
Example
const { sliceChars } = require('tty-strings');
sliceChars('🙈🙉🙊', 0, 2);
// > '🙈🙉'
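To illustrate how grapheme indexes differ from the UTF-16 indexes used by the native slice method, here is a minimal sketch built on Intl.Segmenter. It assumes plain input with no ANSI escape sequences and is not this module's implementation; the helper name sketchSliceChars is hypothetical.

```javascript
// Hypothetical helper, not part of tty-strings. No ANSI handling.
function sketchSliceChars(string, beginIndex = 0, endIndex = Infinity) {
    const segmenter = new Intl.Segmenter(undefined, { granularity: 'grapheme' });
    const graphemes = Array.from(segmenter.segment(string), (s) => s.segment);
    // Array.prototype.slice already resolves negative indexes from the end
    return graphemes.slice(beginIndex, endIndex).join('');
}

// Three graphemes, each a base letter followed by two combining marks
const input = 'a\u0300\u0330b\u0341\u0331c\u0302\u0325';

input.slice(0, 2);
// 'a\u0300' (native slice cuts through the first grapheme)
sketchSliceChars(input, 0, 2);
// 'a\u0300\u0330b\u0341\u0331'
sketchSliceChars(input, -1);
// 'c\u0302\u0325'
```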
Slice a string by column index. Behaves like the native String.slice(), except that indexes account for the visual width of each character.
Input string may contain ANSI escape sequences. Style and hyperlink sequences that apply to the sliced portion of the string will be preserved, while all other types of control sequences will be ignored and will not be included in the output slice.
- string - Input string to slice.
- beginIndex - Column index (defaults to 0) at which to begin the slice. Negative values specify a position measured from the visual width of the string.
- endIndex - Column index before which to end the slice. Negative values specify a position measured from the visual width of the string. If omitted, the slice will extend to the end of the string.
Returns string - the sliced string.
Example
const { sliceColumns } = require('tty-strings');
// '🙈', '🙉', and '🙊' are all full width characters
sliceColumns('🙈🙉🙊', 0, 2);
// > '🙈'
Insert, remove, or replace characters from a string, similar to the native Array.splice() method, except that the start index and delete count refer to grapheme clusters within the string.
String may contain ANSI escape codes; inserted content will adopt any ANSI styling applied to the character immediately preceding the insert point. ANSI control sequences that are not style or hyperlink sequences will be preserved in the output string.
- string - Input string to remove, insert, or replace characters from.
- start - Character index at which to begin splicing. Negative values specify a position measured from the character length of the string.
- deleteCount - The number of characters to remove from the string. If 0, no characters will be removed from the string.
- insert - Optional string to be inserted at the index specified by the start parameter. If omitted, nothing will be inserted into the string.
Returns string - the modified input string.
Example
const { spliceChars } = require('tty-strings');
spliceChars('à̰ b̸ ĉ̥', 2, 1, 'x͎͛ÿz̯̆');
// > 'à̰ x͎͛ÿz̯̆ ĉ̥'
Split a string with ANSI escape codes into an array of lines. Supports both CRLF and LF newlines.
ANSI escape codes that are style and hyperlink sequences will be wrapped across the output lines, while all other types of control sequences will be ignored but preserved in the output.
- string - Input string to split.
Returns string[] - lines in the input string.
Example
const { splitLines } = require('tty-strings'),
    chalk = require('chalk');
splitLines(chalk.green('foo\nbar'));
// > ['\x1b[32mfoo\x1b[39m', '\x1b[32mbar\x1b[39m']
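The idea of wrapping a style across lines, closing it at the end of one line and reopening it at the start of the next, can be sketched as follows. This is a heavily simplified illustration, not this module's implementation: it assumes the input contains only SGR foreground color sequences, treats codes 0 and 39 as closing the active style, and knows nothing about hyperlinks or nested styles. The helper name sketchSplitLines is hypothetical.

```javascript
// Hypothetical helper, not part of tty-strings.
// ASSUMPTION: the only escape sequences present are SGR foreground colors,
// opened by a code such as '\x1b[32m' and closed by '\x1b[39m' or '\x1b[0m'.
function sketchSplitLines(string) {
    let active = ''; // opening sequence currently in effect, if any
    return string.split(/\r?\n/).map((line) => {
        // Reopen whatever style was still active at the end of the previous line
        const prefix = active;
        for (const [seq, params] of line.matchAll(/\x1b\[([0-9;]*)m/g)) {
            const closes = params === '' || params === '0' || params === '39';
            active = closes ? '' : seq;
        }
        // Close the style at the end of the line if it is still open
        return prefix + line + (active ? '\x1b[39m' : '');
    });
}

sketchSplitLines('\x1b[32mfoo\nbar\x1b[39m');
// ['\x1b[32mfoo\x1b[39m', '\x1b[32mbar\x1b[39m']
```

Each output line is then self-contained, so it can be printed, padded, or truncated on its own without leaking color into neighboring rows.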
Remove ANSI escape codes from a string.
- string - Input string to strip.
Returns string - the input string with all ANSI escape codes removed.
This method is adapted from chalk's strip-ansi package, and is essentially identical.
Example
const { stripAnsi } = require('tty-strings');
stripAnsi('\x1b[32mfoo\x1b[39m');
// > 'foo'
A generator function that splits a string into its component graphemes. Does not handle ANSI escape codes, so make sure to use stripAnsi on any input string before calling this generator.
- string - Input string to split.
Yields - a string for each grapheme.
Example
const { splitChars } = require('tty-strings');
[...'à̰ḇ́ĉ̥'];
// > ['a', '\u0300', '\u0330', 'b', '\u0341', '\u0331', 'c', '\u0302', '\u0325']
[...splitChars('à̰ḇ́ĉ̥')];
// > ['à̰', 'ḇ́', 'ĉ̥']
A generator function that splits a string into measured graphemes. Does not handle ANSI escape codes, so make sure to use stripAnsi on any input string before calling this generator. This function endeavors to be more accurate than an implementation of wcswidth, as it takes into account the context of each code point within a grapheme cluster.
- string - Input string to split.
Yields - a [char, width] pair for each grapheme in the string.
Example
const { charWidths } = require('tty-strings');
// Basic latin characters
[...charWidths('abc')]
// > [['a', 1], ['b', 1], ['c', 1]]
// Full width emoji characters
[...charWidths('🙈🙉🙊')]
// > [['🙈', 2], ['🙉', 2], ['🙊', 2]]
Contributions are welcome!
To report a bug or request a feature, please open a new issue.
Install project dependencies and run the test suite with the following command:
yarn && yarn test
To generate coverage reports, run:
yarn test --coverage
This project was conceived of as a single module offering improved implementations of the following JavaScript packages, all of which are great projects that served as inspiration:
This project's internal implementation of the Unicode grapheme cluster breaking algorithm is inspired by Devon Govett's grapheme-breaker and Orlin Georgiev's grapheme-splitter.