[RFC] Allow terminal programs to display text in different sizes #8226

kovidgoyal opened this issue Jan 18, 2025 · 36 comments

@kovidgoyal
Owner

kovidgoyal commented Jan 18, 2025

Hi all,

After several months of hard work refactoring kitty internals extensively, I am excited to announce that kitty can now let terminal programs display text in multiple font sizes. This is implemented in a backwards-compatible way, using a new escape code that terminal programs can opt in to if they wish.

This feature works with the existing cell-based grid terminal. We now just have the concept of "multicell" characters, an extension of the existing way "wide" characters and emoji are handled: a single character can now extend over a block of cells spanning multiple lines.

A screenshot showing seamless mixed multi-sized text

Image

For a quickstart, some simple commands you can use to display sized text:

printf "\e]66;s=2;Double sized text\a\n\n"
printf "\e]66;s=3;Triple sized text\a\n\n\n"
printf "\e]66;n=1:d=2;Half sized text\a\n"

To use these commands you must either use kitty nightly or build and run kitty from source.
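
For programs that emit these sequences from code rather than from a shell, the quickstart commands can be reproduced with a small helper. This is only a sketch built from the three printf examples above: `sized_text` is a hypothetical name, and the only structure it assumes is what those examples show (OSC 66, `key=value` pairs joined by `:`, then `;`, the text, and a BEL terminator).

```python
import sys

OSC = "\x1b]"   # Operating System Command introducer, "\e]" in the printf examples
BEL = "\a"      # terminator used by the printf examples


def sized_text(text: str, **params: int) -> str:
    """Wrap text in an OSC 66 sequence (hypothetical helper, not part of kitty).

    Keyword arguments become key=value metadata joined by ':',
    mirroring "s=2" and "n=1:d=2" in the quickstart commands.
    """
    metadata = ":".join(f"{key}={value}" for key, value in params.items())
    return f"{OSC}66;{metadata};{text}{BEL}"


if __name__ == "__main__":
    # Same output as the three printf commands; the extra newlines move the
    # cursor below the multi-line block, as in the original examples.
    sys.stdout.write(sized_text("Double sized text", s=2) + "\n\n")
    sys.stdout.write(sized_text("Triple sized text", s=3) + "\n\n\n")
    sys.stdout.write(sized_text("Half sized text", n=1, d=2) + "\n")
```

On a terminal without support for the escape code, the output depends on how that terminal handles unknown OSC sequences, so a real program would query for support first.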

For more details on how the escape code works and its various features, see the text-sizing-protocol.rst file. If anyone knows of a conflicting use of OSC 66, please do let me know; the number can easily be changed at this point.

This protocol also robustly solves the long standing problem in the terminal ecosystem of determining the width in cells of a string by allowing the client program to simply inform the terminal of how many cells to render any particular piece of text in.

Finally, as a bonus treat, since I was refactoring rendering anyway, I implemented underline gaps for descenders. This means that underlines are now rendered with gaps where the text has a descender below the baseline. This is the (famous?) skip-ink CSS feature (text-decoration-skip-ink) implemented in browsers. It can be controlled by a new option in kitty.conf called underline_exclusion.

Image

Since text-sizing required extensive changes to kitty internals, I would appreciate testing from those of you who are willing to test beta software. Note that it causes an approximately 10% penalty in throughput; this is unavoidable because of the extra bookkeeping needed to deal with multiline characters. But I think the ability to use different font sizes is worth the tradeoff.

I am also looking for feedback on the protocol design itself, so please read the spec document and comment. I am particularly interested in feedback from people implementing terminal programs where such a feature may be useful. To that end I am pinging some such people I know of, please excuse the noise: @rockorager @neurocyte @dankamongmen @justinmk @swsnr @benjajaja @aymanbagabas @mfontanini

Anyone else is of course welcome to comment as well. I don't know how much interest this feature will generate. In case it generates a lot, please remember that I am only one person and have limited bandwidth. I will try to respond to any and all serious comments, and I am willing to entertain reasonable changes or additions to how this feature works, but please remember that everything has tradeoffs and I may not judge every tradeoff to be worth it.

@kovidgoyal kovidgoyal pinned this issue Jan 18, 2025
@kovidgoyal kovidgoyal changed the title [RFC] Allowing terminal programs to display text in multiple font sizes [RFC] Allow terminal programs to display text in different sizes Jan 18, 2025
@benjajaja

Thank you for making a comprehensive RFC!

From my experience with images in TUIs, when some operation is not contained to a single cell, then it's imperative to be able to know the precise cell area that is affected, not only to avoid unnecessary redrawing of cells, but also simply to fit UI elements together. Even rendering plain text has pitfalls when accounting for multi-width characters, where the rendered width depends on font and renderer. Rendering text twice as big seems straightforward, the area is just (2 × text-width, 2), and text can be split up accordingly, just like regular text. When using a fractional scale I guess the text-width can also just be multiplied by the scale. In other words, if I have a line of text that is expected to be 16 characters wide, and I would render it at a fractional scale of 1.5, then it would result in being exactly 24 characters wide - since the big text is also monospace.

I could use this right away for my project mdfried, which displays big text by means of re-rendering it as images. Thanks for adding a way to query for support of this feature! I think I might even be able to extract the big-text part into a ratatui-widget, including support for this text-sizing-protocol. Cool stuff!

@kovidgoyal
Owner Author

No, not quite. The number of cells is determined solely by the scale and
the width. Fractional scale will adjust the size of the actual rendered
text, but it does not change the number of cells that text occupies.
Maybe I should make this clearer in the spec.

The main use of fractional scales is for super/subscripts and for
rendering normal sized text but in N times the cells with adjustable
whitespace above/below, the canonical example being rendering normal
sized text with only half a cell height of whitespace above it and half
a cell height below it, this can be useful for some UI elements.

@benjajaja

benjajaja commented Jan 18, 2025

But I can calculate the width, right? Like I would calculate the width anyway, for line wrapping etc., just adjusted to the scale.

Examples:

  1. The text is hello world, 11 characters. I want to scale it at 2, so it's simply a width of 11 * 2 = 22 and height 2.
  2. The text is hello world, 11 characters. I want to scale it at 1.5, so it's simply a width of ceil(11 * 1.5) = 17 and height ceil(1 * 1.5) = 2.

I could break up a line too, if I had to fit it within some constraint I can use those calculations.

I'm ignoring multi-width characters, so far I have just used textwrap. I understand that it's not 100% accurate, but it's not really feasible to query how wide some text renders, before rendering, in ratatui's rendering style - if this querying is at all possible.

My use case is markdown headers, there are different "tiers", 1-6 IIRC. The easiest is to render tier 1 with a scale of 2.0, and the others proportional between 1.0 and 2.0. That's how it's done in mdfried and it looks quite good.
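
To make the distinction from the earlier reply concrete, here is a minimal sketch of how heading tiers might map onto the protocol. It rests on two assumptions that are not stated in this thread: that the rendered size is roughly `s * n / d`, and that the proportional tier sizes are spaced linearly between 2.0 and 1.0. The names `heading_params` and `footprint` are hypothetical.

```python
from fractions import Fraction


def heading_params(tier: int) -> dict[str, int]:
    """Map a markdown heading tier (1-6) to OSC 66 metadata (illustrative only).

    ASSUMPTION: the glyphs are drawn at about s * n / d times normal size,
    so a target size between 1 and 2 uses s=2 plus a fraction n/d < 1.
    """
    if not 1 <= tier <= 6:
        raise ValueError("tier must be between 1 and 6")
    target = Fraction(2) - Fraction(tier - 1, 5)  # 2.0 for tier 1, 1.0 for tier 6
    if target == 1:
        return {"s": 1}
    if target == 2:
        return {"s": 2}
    frac = target / 2  # fraction of the 2x block the glyphs should fill
    return {"s": 2, "n": frac.numerator, "d": frac.denominator}


def footprint(text_cells: int, s: int) -> tuple[int, int]:
    """Cells occupied as (columns, rows): set by the scale s, never by n/d."""
    return (s * text_cells, s)


for tier in range(1, 7):
    params = heading_params(tier)
    print(tier, params, footprint(11, params["s"]))
```

Under these assumptions, "hello world" in tiers 1 through 5 always occupies 22 x 2 cells, not ceil(11 * 1.5) = 17 columns; only the glyph size inside that block changes, which is what a layout engine needs to know when wrapping.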

@rockorager
Contributor

rockorager commented Jan 18, 2025 via email

@kovidgoyal
Owner Author

kovidgoyal commented Jan 18, 2025 via email

@kovidgoyal
Owner Author

kovidgoyal commented Jan 18, 2025

@rockorager This protocol allows setting width without scale (aka scale=1) and the two can be queried for independently as well. If a terminal wants to implement only width and not scale it can do so. I will add a note to clarify that to the spec.

I don't yet see a need for an independent query escape code, since the CPR-based query works fine for querying both width and scale independently. If the protocol evolves further we can revisit. Also, I highly doubt there will ever be a terminal that implements this but does not implement sync. Even if that becomes the case, one can print spaces, so the only thing the user might see is cursor movement, which can also be removed by hiding the cursor before querying.

As for using a CSI-based global state, I deliberately chose against that because I don't like global state; it just makes debugging harder. You have to worry about resetting it, about what happens if an application using it crashes without restoring it, and so on.
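
A rough sketch of what a CPR-based query could look like from the client side. The exact probe sequence here is an assumption modelled on the description above (print a space with `w=2`, then a space with `s=2`, asking for the cursor position after each); `build_query` and `detect_support` are hypothetical names, and reading the replies from the tty is left out.

```python
import re

CPR_REQUEST = "\x1b[6n"                          # ask the terminal for the cursor position
CPR_REPLY = re.compile(r"\x1b\[(\d+);(\d+)R")    # reply format: ESC [ row ; col R


def build_query() -> str:
    """Probe string: baseline position, then one space each with w=2 and s=2."""
    return (
        "\r" + CPR_REQUEST
        + "\x1b]66;w=2; \a" + CPR_REQUEST
        + "\x1b]66;s=2; \a" + CPR_REQUEST
    )


def detect_support(replies: str) -> dict[str, bool]:
    """Infer support from three cursor position reports.

    ASSUMPTION: a supporting terminal advances the cursor by two columns
    for each probe; any other movement is treated as "not supported".
    """
    columns = [int(col) for _row, col in CPR_REPLY.findall(replies)]
    if len(columns) != 3:
        raise ValueError("expected exactly three cursor position reports")
    return {
        "width": columns[1] - columns[0] == 2,
        "scale": columns[2] - columns[1] == 2,
    }


print(detect_support("\x1b[5;1R\x1b[5;3R\x1b[5;5R"))
```

Because only spaces are printed, the user sees at most some cursor movement, which can be hidden by hiding the cursor or wrapping the probe in synchronized output, as noted above.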

@kovidgoyal
Owner Author

Added a note about separate querying for scale and width: b552d77

@dnkl

dnkl commented Jan 27, 2025

fyi, I plan to support the "width" part of the protocol in foot (https://codeberg.org/dnkl/foot/pulls/1927), unless there are major architectural changes to the protocol that make me change my mind :)

@kovidgoyal
Owner Author

Cool, good to know. I don't anticipate any major changes to the protocol, but this is the RFC period so no guarantees.

@kovidgoyal
Owner Author

Merged into master. Will become kitty 0.40.0

rockorager added a commit to rockorager/libvaxis that referenced this issue Feb 3, 2025
Implement explicit width hint extension, developed by kitty. When
both explicit width and mode 2027 are available, we default to explicit
width. Custom event loop authors will need to update their loops to add
support for this by setting the new capability value.

For simplicity, we don't actually add a flag in the parser for checking
between a cursor position and an F3 key. Instead, we send the cursor
home, then do an explicit width command, *then* check the cursor
position. If the cursor has moved - meaning the extension is supported -
we will see an F3 key with the shift modifier. The response will be
something like `\x1b[1;2R` which we parse as a shift+F3. But in the
loop, we check the flag if we have sent queries and handle this specific
event differently.

Reference: kovidgoyal/kitty#8226
@rockorager
Contributor

I've added support for this in libvaxis. Works really well!

recording.mp4

@rockorager
Contributor

As a note to possible later implementers, I don't actually have any other place in libvaxis that does a cursor position report. So instead of adding a flag in the parser for determining if I have an F3 or a cursor position report, I've instead done the query like so:

  1. Set flag in event handler that we are awaiting a DA1 response
  2. Send cursor home
  3. Explicit width command (\x1b]66;w=1; \x1b\\)
  4. Cursor position request
  5. Other queries
  6. DA1 request, turn flag off on receipt

Then in my event handler, I can check the flag and if I received a shift+F3 (\x1b[1;2R), then the extension is supported since the cursor will have moved 1 column. If the extension is not supported, I will receive an unmodified F3 press - which we promptly ignore.
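
The numbered steps and the flag check can be sketched as a small state holder. This is not libvaxis code (which is written in Zig); `WidthProbe` is a hypothetical class that only mirrors the described logic, and the "other queries" step is omitted.

```python
class WidthProbe:
    """Flag-gated interpretation of ESC[1;2R, following the steps above."""

    QUERY = (
        "\x1b[H"                 # send cursor home
        "\x1b]66;w=1; \x1b\\"    # explicit width command around one space
        "\x1b[6n"                # cursor position request
        "\x1b[c"                 # DA1 request, marks the end of the probe
    )

    def __init__(self) -> None:
        self.awaiting_da1 = False
        self.supported = False

    def start(self) -> str:
        """Set the flag and return the bytes to write to the terminal."""
        self.awaiting_da1 = True
        return self.QUERY

    def on_event(self, seq: str) -> None:
        """Feed one already-parsed input sequence from the event loop."""
        if not self.awaiting_da1:
            return  # outside a probe, ESC[1;2R is a genuine shift+F3
        if seq == "\x1b[1;2R":
            self.supported = True  # cursor moved one column from home
        elif seq.startswith("\x1b[?") and seq.endswith("c"):
            self.awaiting_da1 = False  # DA1 reply arrived, probe finished


probe = WidthProbe()
probe.start()
for event in ("\x1b[1;2R", "\x1b[?62;c"):
    probe.on_event(event)
print(probe.supported)
```

The design choice is that the parser stays untouched: the ambiguity between a cursor position report and a key press is resolved by context in the event handler rather than by a second parsing mode.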

rockorager added a commit to rockorager/vaxis that referenced this issue Feb 3, 2025
Implement explicit width support based on the spec developed by kitty.
In the presence of both mode 2027 and explicit width, we opt for
explicit width

Reference: kovidgoyal/kitty#8226
@kovidgoyal
Owner Author

You mean you added support for the width part not scale, correct?

@rockorager
Contributor

rockorager commented Feb 4, 2025 via email

@kovidgoyal
Owner Author

Cool, feel free to ask for clarification, if needed.

@dankamongmen
Contributor

this is really impressive work, @kovidgoyal . good to see you continuing to break new ground in the terminal paradigm.

my major questions were all answered by the "Wrapping and overwriting behavior" section of your spec. things there look sane. one question remains, which might have been answered: if i change global font size, does everything rescale based off of the new default?

@kovidgoyal
Owner Author

kovidgoyal commented Feb 5, 2025 via email

@neurocyte

I've added text sizing support (via libvaxis) to flow. Widths only so far.

The results are great!

Before:
Image

After:
Image

Very nice work!

@rockorager
Contributor

rockorager commented Feb 5, 2025

> When w is a non-zero value, it specifies the width in scaled cells of the following text.

This portion of the spec doesn't seem to be what kitty is actually doing.

I read this as: w should be the final scaled width of the cells. So an emoji in scale=1 should have w=2, and an emoji with scale=2 should have w=4. What I get when doing this is the following:

Image

Which seems that w should be the unscaled width. I'm fine with either case, but I think the spec or implementation should be clarified here.

EDIT: I also see that my prompt character is scaled and I'm not sure why. I wonder if the w=4 scaling is extending down into that cell? But the ~ character is not scaled...

@neurocyte

> When w is a non-zero value, it specifies the width in scaled cells of the following text.
>
> This portion of the spec doesn't seem to be what kitty is actually doing.
>
> I read this as: w should be the final scaled width of the cells. So an emoji in scale=1 should have w=2, and an emoji with scale=2 should have w=4.

I read this the exact opposite way. I read it to mean that w is applied to the already scaled size of the cells. So if the cell is scaled to be 2 cells wide, w=2 would give the final glyph a total cell size of 4.

So I guess a more explicit clarification in the text would be good.

> EDIT: I also see that my prompt character is scaled and I'm not sure why. I wonder if the w=4 scaling is extending down into that cell? But the ~ character is not scaled...

Do you have the latest kitty version with the #8286 fix?

@rockorager
Contributor

rockorager commented Feb 5, 2025 via email

@kovidgoyal
Owner Author

The final width is w * s. The final height is s. So the final multicell is (s*w, s) cells in size. When rendering the actual emoji, kitty maintains its aspect ratio. So for a square emoji you should always use w=2 and s=whatever. Changing w to a value larger than 2 will just leave some blank space after the emoji, since kitty won't stretch it to fit.
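
Spelled out as code, the rule above is a one-liner; `multicell_size` is a hypothetical name used only for illustration, and it covers only the explicit-width case where w is non-zero.

```python
def multicell_size(s: int = 1, w: int = 1) -> tuple[int, int]:
    """Return (columns, rows) for a multicell with scale s and explicit width w."""
    if s < 1 or w < 1:
        raise ValueError("s and w must be positive for an explicit-width multicell")
    return (s * w, s)


# A square emoji keeps w=2 at every scale; only s changes.
print(multicell_size(s=1, w=2))  # (2, 1)
print(multicell_size(s=2, w=2))  # (4, 2)
# Passing the already-scaled width (w=4 at s=2) yields an 8-column block,
# half of which stays blank because the glyph is not stretched.
print(multicell_size(s=2, w=4))  # (8, 2)
```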

@rockorager
Contributor

Thanks for the clarification!

@rockorager
Contributor

Is it invalid to specify a numerator > denominator?

@rockorager
Contributor

(I previously commented about bit size but had done the math completely wrong. All is well.)

mfontanini added a commit to mfontanini/presenterm that referenced this issue Feb 6, 2025
This adds support for kitty's font size protocol
(kovidgoyal/kitty#8226) which allows printing
characters that take up more than one cell. This feature will be
available in kitty >= 0.40.0 and is currently only available in nightly
builds.

This for now is only supported in a subset of the theme components,
namely:

* The introduction slide's presentation title
(`intro_slide.title.font_size`).
* The slide titles (`slide_title.font_size`).
* The headings (`headings.h*.font_size`).

Font sizes are only used if the terminal emulator supports it so this
doesn't change anything for emulators other than kitty (or other
implementors of the protocol). If you find this somehow breaks
something, please create an issue.

For now all built in themes set `intro_slide.title.font_size=2` and
`slide_title.font_size=2`. I think this looks a lot better this way but
please do comment here if you don't think built in themes should come
with these values set.

These are now the first 2 slides in the `demo.md` example:


https://github.com/user-attachments/assets/8d761d86-8855-498a-9766-5294cdae3b57
@kovidgoyal
Owner Author

kovidgoyal commented Feb 6, 2025

> Is it invalid to specify a numerator > denominator?

Yes. Quoting the spec:
"The denominator for the fractional scale. Must be > n when non-zero."

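A client can check this rule before emitting the escape code. The sketch below takes the `d > n` rule from the quoted spec text; the numeric ranges in `ASSUMED_RANGES` are assumptions that do not appear in this thread and should be checked against the spec, and `validate` is a hypothetical helper.

```python
# ASSUMED ranges, not confirmed anywhere in this thread; verify against the spec.
ASSUMED_RANGES = {"s": (1, 7), "w": (0, 7), "n": (0, 15), "d": (0, 15), "v": (0, 2)}


def validate(params: dict[str, int]) -> list[str]:
    """Return a list of problems found in OSC 66 metadata (empty if none)."""
    problems = []
    for key, value in params.items():
        if key in ASSUMED_RANGES:
            low, high = ASSUMED_RANGES[key]
            if not low <= value <= high:
                problems.append(f"{key}={value} is outside the assumed range {low}..{high}")
    n = params.get("n", 0)
    d = params.get("d", 0)
    # Rule quoted from the spec: the denominator must be > n when non-zero.
    if d != 0 and d <= n:
        problems.append(f"d={d} must be greater than n={n} when non-zero")
    return problems


print(validate({"n": 1, "d": 2}))  # []
print(validate({"n": 3, "d": 2}))  # one problem: denominator not greater than numerator
```
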
@ad-chaos
Contributor

ad-chaos commented Feb 17, 2025

It would be nice to also specify horizontal alignment for fractionally scaled text. Useful for math equations:

printf "\e]66;n=1:d=2:v=0;w\aC\e]66;n=1:d=2:v=1;t\a•p\e]66;n=1:d=2:v=0;w\a•(1-p)\e]66;n=1:d=2:v=0;t-w\a\n"

Notice how the ws and ts are all left aligned.

Image

@kovidgoyal
Owner Author

You can achieve that by preceding w with a space, although I agree it's not ideal, as it means copy-pasting will include an extra space. Also, because of a current bug in kitty, you can't use a regular space; it has to be one of the Unicode spaces, e.g.:

printf "\e]66;n=1:d=2:v=0:w=1;\u2009w\aC\e]66;n=1:d=2:v=1;t\a"

Image

@ad-chaos
Contributor

Ah, the space will work just fine for my usecase in the time being :)

@Safari77

It needs more sanity checks; this crashes kitty:

printf "\e]66;n=1:d=20;crashing text\a\n"
Increasing cell height by 1 pixels to work around buggy font that renders underscore outside the bounding box
[31,149] Cell width invalid after adjustment, ignoring modify_font cell_width
[31,149] Cell width too small: 1

@kovidgoyal
Owner Author

It's not a crash, it's an abort and should be fixed by
45d931f

@kovidgoyal
Owner Author

Implemented support for horizontal alignment using h key with left (default), center and right.
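
As a sketch of how the math example from earlier in the thread might use the new key without a leading space: the numeric values in `H_ALIGN` are an assumption (this comment only names left, center and right, with left as the default), `w=1` is borrowed from the workaround example above, and `half_size` is a hypothetical helper.

```python
# ASSUMED numeric values for the h key; only "left is the default" is stated here.
H_ALIGN = {"left": 0, "right": 1, "center": 2}


def half_size(text: str, v: int, h: str = "left") -> str:
    """Half-size text in one cell with vertical (v) and horizontal (h) alignment."""
    return f"\x1b]66;n=1:d=2:v={v}:h={H_ALIGN[h]}:w=1;{text}\a"


# Superscript w and subscript t around "C", with the superscript pushed to
# the right edge of its cell instead of being padded with a space.
line = half_size("w", v=0, h="right") + "C" + half_size("t", v=1)
print(repr(line))
```

Unlike the space workaround, the payload stays exactly "w", so copied text would not pick up an extra space.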

@00-kat

00-kat commented Feb 19, 2025

> copy pasting will have an extra space

On the topic of copy-pasting, it would be nice if copy-pasting scaled text into programs which support richer markup (like LibreOffice) would retain the formatting (though I'm not sure if I should create a new issue for this, since it's related to Kitty's implementation and not the spec). For example, pasting the math above would result in wCt•pw•(1-p)t-w right now. I'm not sure if this is feasible though.

@kovidgoyal
Owner Author

kovidgoyal commented Feb 20, 2025 via email
