Problem description
XSSFCell seems to encode certain character sequences as Unicode characters. How can I prevent this? Do I need to apply some kind of character escaping?
For example:
cell.setCellValue("LUS_BO_WP_x24B8_AI"); // The cell value is now "LUS_BO_WPⒸAI"
In Unicode, Ⓒ is U+24B8.
I've already tried setting an ANSI font and setting the cell type to string; neither helped.
Recommended answer
Based on what @matthias-gerth suggested, with small adaptations:
Create your own copy of the XSSFRichTextString class.
Adapt XSSFRichTextString.setString like this: replace st.setT(s); with st.setT(escape(s));
Adapt the constructor of XSSFRichTextString like this: replace st.setT(str); with st.setT(escape(str));
Add the following to XSSFRichTextString (this is very close to Matthias's suggestion):
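// Requires: import java.util.regex.Matcher; import java.util.regex.Pattern;

// Matches "_x" followed by four hex digits, i.e. anything that looks like the start of an escape.
private static final Pattern PATTERN = Pattern.compile("_x[a-fA-F0-9]{4}");
// "_x005F" is the escaped form of the underscore (U+005F, LOW LINE).
private static final String UNICODE_CHARACTER_LOW_LINE = "_x005F";

// Prefixes every escape-like sequence with "_x005F" so that it is kept as literal text.
private String escape(String str) {
    if (str != null) {
        Matcher m = PATTERN.matcher(str);
        if (m.find()) {
            StringBuilder buf = new StringBuilder();
            int idx = 0;
            do {
                int pos = m.start();
                if (pos > idx) {
                    // Copy the unmatched text before this occurrence.
                    buf.append(str, idx, pos);
                }
                buf.append(UNICODE_CHARACTER_LOW_LINE).append(m.group(0));
                idx = m.end();
            } while (m.find());
            // Copy whatever follows the last occurrence.
            buf.append(str.substring(idx));
            return buf.toString();
        }
    }
    return str;
}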
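If patching a copy of XSSFRichTextString is not an option, the same escaping can probably be applied to the string before it is passed to setCellValue. The following is only a minimal sketch under that assumption, and I have not checked it against every POI version. CellTextEscaper and both of its methods are hypothetical names, not part of Apache POI. The decode method is a simplified model of how "_xHHHH_" sequences are turned into characters when the text is read back; it is there only to show why the escaped string survives the round trip.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, not part of Apache POI. */
final class CellTextEscaper {
    // Same pattern as in the answer: "_x" followed by four hex digits.
    private static final Pattern ESCAPE_LIKE = Pattern.compile("_x[a-fA-F0-9]{4}");
    // Simplified model of the reading side: a complete "_xHHHH_" escape.
    private static final Pattern ENCODED = Pattern.compile("_x([a-fA-F0-9]{4})_");

    private CellTextEscaper() {}

    /** Prefixes every escape-like sequence with "_x005F" (an escaped underscore). */
    static String escape(String s) {
        if (s == null) {
            return null;
        }
        // "$0" re-inserts the matched text after the "_x005F" prefix.
        return ESCAPE_LIKE.matcher(s).replaceAll("_x005F$0");
    }

    /** Assumption: a stand-in for the decoding step, not POI's actual code. */
    static String decode(String s) {
        Matcher m = ENCODED.matcher(s);
        StringBuilder out = new StringBuilder();
        int idx = 0;
        while (m.find()) {
            out.append(s, idx, m.start());
            // Turn the four hex digits into the character they name.
            out.append((char) Integer.parseInt(m.group(1), 16));
            idx = m.end();
        }
        out.append(s, idx, s.length());
        return out.toString();
    }

    public static void main(String[] args) {
        String raw = "LUS_BO_WP_x24B8_AI";
        System.out.println(escape(raw));         // LUS_BO_WP_x005F_x24B8_AI
        System.out.println(decode(escape(raw))); // LUS_BO_WP_x24B8_AI
    }
}
```

With this helper the call from the question would become cell.setCellValue(CellTextEscaper.escape("LUS_BO_WP_x24B8_AI")). The upside is that no POI class has to be copied; the downside is that every call site has to remember to escape.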