utf8 input #5

Closed
k-omura opened this issue May 4, 2021 · 10 comments

@k-omura
Owner

k-omura commented May 4, 2021

Suggestion from Greg in #4

I also have an old algo for translating utf8 to unicode that I used on your old code. I can drop it in here if you like.
utf8 is especially good for European languages and data transfer. Shorter strings and a consistent single byte stream to read.
Or you could leave it as it is and, as I did, let the user convert the string to a wchar string themselves.
I'll hack out a simple Arduino example once I have time to start working with the code.
That is probably the best way I can help you.
My C++ is very limited; I work and think mostly in C.
@Yardie-
Contributor

Yardie- commented May 4, 2021

I will test it, but having looked at your code, as you said, it's already there in what you have done.
Maybe this will just work:

void truetypeClass::string(uint16_t _x, uint16_t _y, const uint8_t _character[]){
  this->string(_x, _y, (wchar_t*)_character);
}

or just calling the function like this:
truetype.string(10, 10, L"L'année 1866 fut marquée par un événement bizarre,");

I'll try that.

@Yardie-
Contributor

Yardie- commented May 5, 2021

Cool, the code you sent works nicely with Alex Brush.

String test = "èàùëïü";
// Write a string to the framebuffer
truetype.string(10, 0, test);
truetype.string(10, 10, L"èàùëïü");
truetype.string(10, 20, L"L'année");

All of that works, but another font,
https://www.1001fonts.com/kg-les-bouquinistes-de-paris-font.html
blows out the heap with:

3360
/FONTFILE.ttf
68400
CORRUPT HEAP: Bad head at 0x3ffb2d24. Expected 0xabba1234 got 0x3ffb6380
abort() was called at PC 0x400832f1 on core 1

ELF file SHA256: 0000000000000000

Backtrace: 0x40085004:0x3ffb1a70 0x40085279:0x3ffb1a90 0x400832f1:0x3ffb1ab0 0x4008341d:0x3ffb1ae0 0x400da493:0x3ffb1b00 0x400d618d:0x3ffb1dc0 0x400d60ac:0x3ffb1e10 0x400893e1:0x3ffb1e40 0x40081546:0x3ffb1e60 0x400831e9:0x3ffb1e80 0x4000bec7:0x3ffb1ea0 0x400d11f9:0x3ffb1ec0 0x400d2366:0x3ffb1ee0 0x400d1121:0x3ffb1f30 0x400d3cfa:0x3ffb1fb0 0x40086289:0x3ffb1fd0

But I can work with this, thanks heaps.
I'll let you enjoy the rest of your holiday in peace.

@Yardie-
Contributor

Yardie- commented May 5, 2021

but
https://www.1001fonts.com/chopin-script-font.html
works fine
I guess you can't be expected to support all the fonts in the world :)

@k-omura
Owner Author

k-omura commented May 5, 2021

That's right.
I have confirmed that some font files trigger bugs.
Fixing that is on the to-do list.

Thanks for letting me know what the buggy font files have in common!
If a font file can't be used for some reason, I think the library should give feedback to the user.

I can only touch my favorite code on holidays, so it's rather welcome!
I want to stay home and fix the code.

@Yardie-
Contributor

Yardie- commented May 5, 2021

I like to try different fonts out, so I will let you know which ones do what as I find them.
Fonts are notoriously variable.
That is why I like that site. There are so many different types.
It helps me test lots of different things.

It would be good to at least reject a font rather than crash the code, though it seems that is quite tricky.

@k-omura
Owner Author

k-omura commented May 6, 2021

https://www.1001fonts.com/kg-les-bouquinistes-de-paris-font.html
blows out the heap. with

I checked the above font.
It turns out the cause is that the library does not currently support characters built from a combination of multiple glyphs, called "compound glyphs".
https://developer.apple.com/fonts/TrueType-Reference-Manual/RM01/Chap1.html#compound

Supporting compound glyphs is future work. For now, a compound glyph is left blank so that it does not crash the code.

I recognize that "Compound glyphs" are especially important in European languages.
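
For readers wondering how a compound glyph can be told apart from a simple one, here is a minimal sketch. It relies on the TrueType rule that each glyph record in the `glyf` table starts with a big-endian signed 16-bit `numberOfContours`, and that a negative value marks a compound glyph. The helper names `readInt16BE` and `isCompoundGlyph` are made up for illustration and are not taken from this library; how the library actually detects and skips these glyphs may differ.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: reads a big-endian signed 16-bit value,
// the byte order used by TrueType tables.
int16_t readInt16BE(const uint8_t* p) {
  return static_cast<int16_t>(static_cast<uint16_t>((p[0] << 8) | p[1]));
}

// Hypothetical helper: 'glyph' is assumed to point at the start of one
// glyph record in the 'glyf' table, 'length' is the number of readable bytes.
// Returns true when numberOfContours is negative (compound glyph).
bool isCompoundGlyph(const uint8_t* glyph, std::size_t length) {
  if (glyph == nullptr || length < 2) {
    return false;  // not enough data to decide; treat as "not compound"
  }
  return readInt16BE(glyph) < 0;
}
```

A renderer that cannot handle compound glyphs yet could call a check like this before parsing the outline and simply advance the cursor without drawing when it returns true.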

@Yardie-
Contributor

Yardie- commented May 6, 2021

Well done, that is fantastic,
and
the perfect response to the problem.
Yep, OK, that's a problem: stop the code crashing in a nice clean way by not doing that part and moving on.
The API can now clearly say that these types will be supported one day, but not now.
I am sure many of the people who require compound glyphs understand that it is not trivial.
You are getting a deep understanding of how ttf works now.
Well done again.

That is ground many fear to tread.

@Yardie-
Contributor

Yardie- commented May 7, 2021

You are welcome to close this if you like.

@Yardie-
Contributor

Yardie- commented May 9, 2021

Actually, it may be worth another look.

http://utf8everywhere.org/
Although (wchar_t*)_character seems to work so far, it might be better to somehow use the algorithms in stringToWchar.
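
To illustrate the difference: a pointer cast only reinterprets the bytes, while decoding combines each multi-byte UTF-8 sequence into one code point. Below is a minimal, standalone sketch of such a decoder. The function name `utf8ToCodePoints` is hypothetical and this is not the library's `stringToWchar`, whose implementation is not shown in this thread. The sketch replaces malformed or truncated sequences with U+FFFD and, for brevity, does not reject overlong encodings or surrogate values.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not part of the library): decodes UTF-8 bytes into
// Unicode code points. Malformed input yields U+FFFD per offending byte.
std::u32string utf8ToCodePoints(const std::string& in) {
  std::u32string out;
  std::size_t i = 0;
  while (i < in.size()) {
    const std::uint8_t b0 = static_cast<std::uint8_t>(in[i]);
    char32_t cp = 0;
    std::size_t extra = 0;  // number of continuation bytes expected
    bool ok = true;

    if (b0 < 0x80) { cp = b0; }                                   // 1 byte
    else if ((b0 & 0xE0) == 0xC0) { cp = b0 & 0x1F; extra = 1; }  // 2 bytes
    else if ((b0 & 0xF0) == 0xE0) { cp = b0 & 0x0F; extra = 2; }  // 3 bytes
    else if ((b0 & 0xF8) == 0xF0) { cp = b0 & 0x07; extra = 3; }  // 4 bytes
    else { ok = false; }  // stray continuation byte or invalid lead byte

    if (ok && i + extra >= in.size()) {
      ok = false;  // sequence is cut off at the end of the input
    }
    for (std::size_t k = 1; ok && k <= extra; ++k) {
      const std::uint8_t b = static_cast<std::uint8_t>(in[i + k]);
      if ((b & 0xC0) != 0x80) {
        ok = false;  // expected a 10xxxxxx continuation byte
      } else {
        cp = (cp << 6) | (b & 0x3F);
      }
    }

    if (ok) {
      out.push_back(cp);
      i += extra + 1;
    } else {
      out.push_back(0xFFFD);
      ++i;
    }
  }
  return out;
}
```

With this, the two bytes 0xC3 0xA9 become the single code point U+00E9 (é), which is the value a glyph lookup by Unicode code point would need.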

@k-omura
Owner Author

k-omura commented May 15, 2021

I am closing this issue because the library now supports single-byte characters.
Certainly, explicitly converting to UTF-16 may be the "right way"...

The "Compound glyphs" mentioned in this issue will be supported in #13.

@k-omura k-omura closed this as completed May 15, 2021