Fix implementation of to_utf8 and to_utf16 #123

Open
wants to merge 1 commit into
base: master
Fix implementation of to_utf8 and to_utf16
The current implementations of these functions do not properly convert between the Unicode encodings and have caused bugs in the past. The new implementation uses std::codecvt and std::wstring_convert.
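
To illustrate why a per-element cast is not a conversion, here is a simplified sketch of the old behavior. The name narrow_cast_utf16 is hypothetical and does not exist in this repository; it mirrors the std::transform plus static_cast<char> logic of the removed code, without the manual calloc buffer.

```cpp
#include <algorithm>
#include <string>
#include <string_view>

// Hypothetical helper for illustration only: narrows every UTF-16 code unit
// to a single char, the way the removed implementation did.
std::string narrow_cast_utf16(std::u16string_view view) {
    std::string out(view.size(), '\0');
    std::transform(view.begin(), view.end(), out.begin(),
                   [](char16_t unit) { return static_cast<char>(unit); });
    return out;
}
```

For ASCII input the result happens to match UTF-8, but U+20AC (the euro sign) comes out as the single byte 0xAC rather than the three-byte UTF-8 sequence E2 82 AC.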

Although std::wstring_convert and std::codecvt_utf8_utf16 have been deprecated since C++17, they will still work until there is a suitable replacement in the standard library, and as far as I know, no such replacement exists yet.
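
For a sense of what avoiding the deprecated facilities would involve, the following is a minimal hand-written UTF-16 to UTF-8 encoder. It is only a sketch: utf16_to_utf8_manual is a hypothetical helper, not part of this PR, and replacing unpaired surrogates with U+FFFD is an assumed policy rather than anything the PR specifies.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>

// Hypothetical helper, not part of this PR: encodes UTF-16 as UTF-8 by hand.
// Assumption: unpaired surrogates are replaced with U+FFFD.
std::string utf16_to_utf8_manual(std::u16string_view in) {
    std::string out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: combine it with a following low surrogate.
            if (i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
                ++i;
            } else {
                cp = 0xFFFD;  // high surrogate without a partner
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD;  // stray low surrogate
        }
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

The reverse direction needs comparable care (overlong sequences, truncated input), which is part of why relying on an existing, tested library is attractive.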

A proper library is the recommended tool for this task. il2cpp has built-in methods for performing this conversion, so it may be worth looking into those in the future.
StackDoubleFlow committed Feb 5, 2022
commit 7005725bb2e7b0878060f95d045475ea8616f2a5
21 changes: 4 additions & 17 deletions src/utils/utils.cpp
@@ -9,6 +9,8 @@
 #include <fstream>
 #include <sstream>
 #include <unordered_set>
+#include <codecvt>
+#include <locale>
 #include "il2cpp-object-internals.h"
 #include "modloader/shared/modloader.hpp"
 #include "shared/utils/gc-alloc.hpp"
@@ -327,27 +329,12 @@ void setcsstr(Il2CppString* in, std::u16string_view str) {
     in->chars[in->length] = (Il2CppChar)'\u0000';
 }
 
-// Inspired by DaNike
 std::string to_utf8(std::u16string_view view) {
-    char* dat = static_cast<char*>(calloc(view.length() + 1, sizeof(char)));
-    std::transform(view.data(), view.data() + view.size(), dat, [](auto utf16_char) {
-        return static_cast<char>(utf16_char);
-    });
-    dat[view.length()] = '\0';
-    std::string out(dat);
-    free(dat);
-    return out;
+    return std::wstring_convert<std::codecvt_utf8_utf16<char16_t>, char16_t>{}.to_bytes(view.data());
 }
 
 std::u16string to_utf16(std::string_view view) {
-    char16_t* dat = static_cast<char16_t*>(calloc(view.length() + 1, sizeof(char16_t)));
-    std::transform(view.data(), view.data() + view.size(), dat, [](auto standardChar) {
-        return static_cast<char16_t>(standardChar);
-    });
-    dat[view.length()] = '\0';
-    std::u16string out(dat);
-    free(dat);
-    return out;
+    return std::wstring_convert<std::codecvt_utf8_utf16<char16_t>, char16_t>{}.from_bytes(view.data());
 }
 
 std::u16string_view csstrtostr(Il2CppString* in)
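
One detail worth noting about the new calls: to_bytes and from_bytes are given only view.data(), and the single-pointer overloads read up to a null terminator, which a string view is not guaranteed to have. A possible variant, sketched below, passes an explicit begin and end instead. The names to_utf8_ranged, to_utf16_ranged, and Utf8Utf16Converter are hypothetical and chosen here for illustration; only the std::wstring_convert range overloads themselves are standard.

```cpp
#include <codecvt>
#include <locale>
#include <string>
#include <string_view>

// Alias introduced for this sketch only. Both templates are deprecated
// since C++17 but remain available in current standard libraries.
using Utf8Utf16Converter =
    std::wstring_convert<std::codecvt_utf8_utf16<char16_t>, char16_t>;

// Converts exactly the code units covered by the view, so a view into the
// middle of a larger buffer is not read past its end.
std::string to_utf8_ranged(std::u16string_view view) {
    return Utf8Utf16Converter{}.to_bytes(view.data(), view.data() + view.size());
}

// Same idea in the other direction: UTF-8 bytes to UTF-16 code units.
std::u16string to_utf16_ranged(std::string_view view) {
    return Utf8Utf16Converter{}.from_bytes(view.data(), view.data() + view.size());
}
```

With this form, converting a substring view such as the first five code units of u"hello world" yields "hello" rather than the whole underlying string. Both functions throw std::range_error on malformed input, which callers may want to handle.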