JTidy breaks unicode characters #10

Open
korri123 opened this issue Jan 15, 2023 · 0 comments

Hey, not sure if this is still being developed but here it goes.

The JTidy dependency, which is used to convert an HTML snippet into a full page, mangles special characters (þ, æ, ð, á, etc.) because it's buggy and uses UTF-16.

A temporary, hacky workaround is to replace the body of HtmlCleaner#clean with:

return String.format("<html><head><meta charset=\"UTF-8\" /></head><body>%s</body></html>", input);
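
For anyone who wants to try the workaround in isolation, here is a minimal sketch of it as a standalone class. The class name `HtmlWrapSketch` and the `main` method are hypothetical and only there for illustration; the real `HtmlCleaner#clean` may have a different signature. It just wraps the snippet in a page that declares UTF-8 and never calls JTidy, so it also skips whatever cleanup JTidy would otherwise do.

```java
import java.nio.charset.StandardCharsets;

public class HtmlWrapSketch {

    // Hypothetical stand-in for HtmlCleaner#clean: wraps the snippet in a
    // full page that declares UTF-8, without passing it through JTidy.
    public static String clean(String input) {
        return String.format(
            "<html><head><meta charset=\"UTF-8\" /></head><body>%s</body></html>",
            input);
    }

    public static void main(String[] args) {
        // Unicode escapes for þ, æ, ð, á so the source file encoding does not matter.
        String snippet = "<p>\u00fe \u00e6 \u00f0 \u00e1</p>";
        String page = clean(snippet);

        // Round-trip through UTF-8 bytes to check the characters survive encoding.
        byte[] bytes = page.getBytes(StandardCharsets.UTF_8);
        String decoded = new String(bytes, StandardCharsets.UTF_8);

        System.out.println(decoded.equals(page));
        System.out.println(decoded.contains(snippet));
    }
}
```

Note that this assumes the snippet is already well-formed HTML, since nothing repairs or validates it anymore.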