Use the encoded byte length of text when parsing

GSam · web-flow · commit 65bdf46dfcab · 2018-12-17T21:02:58.000+13:00
UnicodeCharsXXTextRecord is encoded as UTF-16, and its length field counts encoded bytes, not Unicode characters. Because the parser used the character count to pick the record size, text longer than about 128 Unicode characters could fail to convert to binary elsewhere: the byte length did not fit in the record's length field (whether this happened depended on the exact length).

(Switching this code to the alternative CharsXXTextRecord, which uses UTF-8, might also save space.)
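
To illustrate the mismatch, here is a minimal standalone sketch; it does not use the wcf package. The helper `length_prefix_bits` is hypothetical and only mirrors the size thresholds visible in the diff below. The 32-bit fallback is an assumption about how larger records are handled.

```python
def length_prefix_bits(text: str, encoding: str = "utf-16le") -> int:
    """Return the smallest length-prefix width that can hold the encoded size.

    Hypothetical helper for illustration only; it is not part of wcf.
    """
    size = len(text.encode(encoding))  # length in encoded bytes, not characters
    if size < 2**8:
        return 8
    elif size < 2**16:
        return 16
    return 32  # assumed fallback for anything larger


sample = "a" * 200

# The character count suggests an 8-bit length prefix would be enough...
print(len(sample))                          # 200
# ...but the UTF-16LE payload is twice as long and needs 16 bits.
print(len(sample.encode("utf-16le")))       # 400
print(length_prefix_bits(sample))           # 16
# The same ASCII text encoded as UTF-8 still fits an 8-bit prefix.
print(length_prefix_bits(sample, "utf-8"))  # 8
```

Characters outside the Basic Multilingual Plane take four bytes in UTF-16, so the byte length is not always exactly twice the character count.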
diff --git a/wcf/xml2records.py b/wcf/xml2records.py
@@ -166,7 +166,7 @@ def _parse_data(self, data, is_cdata=False):
             return DateTimeTextRecord(dt, tz)
 
         # text as fallback
-        val = len(data)
+        val = len(data.encode('utf-16le'))
         if val < 2**8:
             return UnicodeChars8TextRecord(data)
         elif val < 2**16: