I'm currently having trouble uploading emoji data to BigQuery (using Python).

Here is a sample of the data I want to upload to BQ:

 {"emojiCharts":{"emoji_icon":"\ud83d\udc4d","repost": 4, "doc": 4, "engagement": 0, "reach": 0, "impression": 0}}
 {"emojiCharts":{"emoji_icon":"\ud83d\udc49","repost": 4, "doc": 4, "engagement": 43, "reach": 722, "impression": 4816}}
 {"emojiCharts":{"emoji_icon":"\u203c","repost": 4, "doc": 4, "engagement": 0, "reach": 0, "impression": 0}}
 {"emojiCharts":{"emoji_icon":"\ud83c\udf89","repost": 5, "doc": 5, "engagement": 43, "reach": 829, "impression": 5529}}
 {"emojiCharts":{"emoji_icon":"\ud83d\ude34","repost": 5, "doc": 5, "engagement": 222, "reach": 420, "impression": 2805}}
 {"emojiCharts":{"emoji_icon":"\ud83d\ude31","repost": 3, "doc": 3, "engagement": 386, "reach": 2868, "impression": 19122}}
 {"emojiCharts":{"emoji_icon":"\ud83d\udc4d\ud83c\udffb","repost": 5, "doc": 5, "engagement": 43, "reach": 1064, "impression": 7098}}
 {"emojiCharts":{"emoji_icon":"\ud83d\ude3b","repost": 3, "doc": 3, "engagement": 93, "reach": 192, "impression": 1283}}
 {"emojiCharts":{"emoji_icon":"\ud83d\ude2d","repost": 6, "doc": 6, "engagement": 212, "reach": 909, "impression": 6143}}
 {"emojiCharts":{"emoji_icon":"\ud83e\udd84","repost": 8, "doc": 8, "engagement": 313, "reach": 402, "impression": 2681}}
 {"emojiCharts":{"emoji_icon":"\ud83d\ude18","repost": 7, "doc": 7, "engagement": 0, "reach": 8454, "impression": 56366}}
 {"emojiCharts":{"emoji_icon":"\ud83d\ude05","repost": 5, "doc": 5, "engagement": 74, "reach": 1582, "impression": 10550}}
 {"emojiCharts":{"emoji_icon":"\ud83d\ude04","repost": 5, "doc": 5, "engagement": 73, "reach": 3329, "impression": 22206}}


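The problem is that BigQuery does not display any emoji written in this form (\ud83d\ude04); it only renders the ones written in this form (\u203c).

Even though the field is a STRING, each of these emoji shows up as two small black diamond placeholders. Why can't BQ turn the value into the actual emoji instead of displaying it as a raw string?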

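Questions:

Is there any way to upload emoji to BigQuery so that they load correctly? (The data will be used in Google Data Studio.)

Should I manually (hard-code) convert all of the emoji codes to an acceptable format, and if so, which format is acceptable?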

Best answer

As user "数字" mentioned in a comment:


  Check out charbase.com/1f618-unicode-face-throwing-a-kiss. What you want is to convert the JavaScript escape sequences into actual Unicode data.


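In other words, you need to change how the emoji are encoded so that each one is represented as a single character (one code point) rather than as a pair of UTF-16 surrogate escapes: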

SELECT "\U0001f604 \U0001f4b8"
--   , "\ud83d\udcb8"
--   , "\ud83d\ude04"


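Lines 2 and 3 (the commented-out ones) fail with an error like Illegal escape sequence: Unicode value \ud83d is invalid at [2:7], because \ud83d is only one half of a surrogate pair and not a character in its own right. The first line, however, displays correctly in both BigQuery and Data Studio.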
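Since the question mentions Python, here is a minimal sketch of one way to do that conversion before loading. It relies on the standard json module, which joins a \ud83d\ude04 escape pair into a single character when parsing and writes the character itself when ensure_ascii=False is passed. The function names and the sample record are illustrative and not part of the original answer; the output file would then need to be written with encoding="utf-8".

```python
import json


def reencode_line(line: str) -> str:
    """Parse one JSON line and re-serialize it with literal (unescaped) emoji."""
    record = json.loads(line)  # the parser merges surrogate escape pairs
    return json.dumps(record, ensure_ascii=False)


def to_bigquery_escape(text: str) -> str:
    """Render each character as the 8-digit \\UXXXXXXXX form used in the SELECT above."""
    return "".join(f"\\U{ord(ch):08x}" for ch in text)


# Raw string, so the backslashes reach the JSON parser untouched.
sample = r'{"emojiCharts":{"emoji_icon":"\ud83d\ude04","repost": 5}}'

fixed_line = reencode_line(sample)
emoji = json.loads(fixed_line)["emojiCharts"]["emoji_icon"]
sql_literal = to_bigquery_escape(emoji)
```

Here fixed_line holds the emoji as one real character, and sql_literal is the \U0001f604 spelling that worked in the query above.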


Further thoughts on this:


https://stackoverflow.com/search?q=%5Cud83d
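A related case that the answer does not cover: if the text has already been read into a Python str that holds the two surrogate halves as separate code points, a common idiom is to round-trip it through UTF-16 with the surrogatepass error handler. This is only a sketch, and the helper name merge_surrogates is made up for illustration.

```python
def merge_surrogates(text: str) -> str:
    """Combine UTF-16 surrogate halves in a str into single code points."""
    # 'surrogatepass' lets the encoder emit the lone halves as UTF-16 code units;
    # decoding the resulting bytes then reads each pair as one character.
    return text.encode("utf-16", "surrogatepass").decode("utf-16")


broken = "\ud83d\ude04"  # two separate surrogate code points
fixed = merge_surrogates(broken)
unchanged = merge_surrogates("\u203c")  # ordinary characters pass through as-is
```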

Regarding "python - Emoji crashed when uploading to BigQuery", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/52169443/
