Skip to content

Commit

Permalink
Convert properly UTF-8 to UTF-16
Browse files Browse the repository at this point in the history
wchar_t is currently 16bit so converting a utf8 encoded characters not
in plane 0 (>= 0x10000) to wchar_t (that is calling char2uni) lead to a
-EINVAL return. This patch detect utf8 in cifs_strtoUTF16 and add special
code calling utf8s_to_utf16s.

Signed-off-by: Frediano Ziglio <[email protected]>
Acked-by: Jeff Layton <[email protected]>
Signed-off-by: Steve French <[email protected]>
  • Loading branch information
Frediano Ziglio authored and smfrench committed Oct 8, 2012
1 parent b7a1062 commit fd3ba42
Showing 1 changed file with 22 additions and 0 deletions.
22 changes: 22 additions & 0 deletions fs/cifs/cifs_unicode.c
Original file line number Diff line number Diff line change
Expand Up @@ -203,6 +203,27 @@ cifs_strtoUTF16(__le16 *to, const char *from, int len,
int i;
wchar_t wchar_to; /* needed to quiet sparse */

/* special case for utf8 to handle no plane0 chars */
if (!strcmp(codepage->charset, "utf8")) {
/*
* convert utf8 -> utf16, we assume we have enough space
* as caller should have assumed conversion does not overflow
* in destination len is length in wchar_t units (16bits)
*/
i = utf8s_to_utf16s(from, len, UTF16_LITTLE_ENDIAN,
(wchar_t *) to, len);

/* if success terminate and exit */
if (i >= 0)
goto success;
/*
* if fails fall back to UCS encoding as this
* function should not return negative values
* currently can fail only if source contains
* invalid encoded characters
*/
}

for (i = 0; len && *from; i++, from += charlen, len -= charlen) {
charlen = codepage->char2uni(from, len, &wchar_to);
if (charlen < 1) {
Expand All @@ -215,6 +236,7 @@ cifs_strtoUTF16(__le16 *to, const char *from, int len,
put_unaligned_le16(wchar_to, &to[i]);
}

success:
put_unaligned_le16(0, &to[i]);
return i;
}
Expand Down

0 comments on commit fd3ba42

Please sign in to comment.