problem with long UTF8 char (such as 𨑨 U+28468 / UTF8:F0 A8 91 A8) #17

Open
a-tsioh opened this issue Nov 15, 2014 · 0 comments

a-tsioh commented Nov 15, 2014

I have no idea what is going wrong here.
The database is named "twblg" and is encoded in UTF8.
The data comes from moedict-data-twblg and was imported with:

$ xzcat dump.sql.xz | psql --db twblg

I get the following behaviour with some long (4-byte) UTF8 characters.

This works:

$ plv8x -d twblg -c "SELECT * FROM entries WHERE 詞目 = '𨑨' ;"
[ { '主編號': '23001',
    '屬性代號': '2',
    '詞目': '𨑨',
    '音讀': 'tshit',
    '文白': '替',
    '部首': '辵',
    '部首序': '162-04-08',
    '方言差對應': '' } ]

This does not:

$ plv8x -d twblg -E "plv8.execute 'SELECT * FROM entries WHERE 詞目=\'𨑨\''"
[]

but the same call does work with '好':

$ plv8x -d twblg -E "plv8.execute 'SELECT * FROM entries WHERE 詞目=\'好\''"
[ { '主編號': '2282',
    '屬性代號': '1',
    '詞目': '好',
    '音讀': 'hó',
    '文白': '白',
    '部首': '女',
    '部首序': '038-03-06',
    '方言差對應': '[方]043' },
  { '主編號': '2283',
    '屬性代號': '1',
    '詞目': '好',
    '音讀': 'hònn',
    '文白': '文',
    '部首': '女',
    '部首序': '038-03-06',
    '方言差對應': '' } ]
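
One difference between the two test characters that may be relevant (an unconfirmed assumption, not a diagnosis): 𨑨 (U+28468) lies outside the Basic Multilingual Plane, so it takes 4 bytes in UTF8 and two UTF-16 code units (a surrogate pair) in a JavaScript string, whereas 好 (U+597D) takes 3 bytes and a single code unit. A minimal Node.js sketch that shows this; `describeChar` is a hypothetical helper written for illustration and is not part of plv8x:

```javascript
// Diagnostic sketch, assuming Node.js (uses the built-in Buffer).
// describeChar is a hypothetical helper, not a plv8x or plv8 API.
function describeChar(ch) {
  // Full Unicode code point of the first character.
  const codePoint = ch.codePointAt(0);

  // UTF-16 code units, i.e. what a JavaScript string stores internally.
  const utf16Units = [];
  for (let i = 0; i < ch.length; i++) {
    utf16Units.push(ch.charCodeAt(i).toString(16).toUpperCase());
  }

  // UTF8 bytes, i.e. what a UTF8-encoded database stores.
  const utf8Bytes = Array.from(Buffer.from(ch, 'utf8'), (b) =>
    b.toString(16).toUpperCase().padStart(2, '0')
  );

  return {
    codePoint: 'U+' + codePoint.toString(16).toUpperCase(),
    utf16Units,
    utf8Bytes,
    outsideBMP: codePoint > 0xffff,
  };
}

console.log(describeChar('𨑨')); // the character that fails via plv8.execute
console.log(describeChar('好')); // the character that works
```

For 𨑨 this reports the UTF8 bytes F0 A8 91 A8 given in the title and the surrogate pair D861 DC68; for 好 it reports the bytes E5 A5 BD and the single unit 597D. Whether the surrogate pair is mishandled somewhere between the JavaScript string and the SQL query is only a guess at this point.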

(SHOW SERVER_ENCODING returns UTF8)
(Debian wheezy)
