Problem description
I'm trying to display a character stored in a database: the Unicode character \u0096. Because of a strange Windows-versus-web-browser quirk, this is a control character in the Unicode standard, but web pages will display it as an en dash. See @AlanMoore's answer on "Some UTF-8 characters do not show up on browser".
I have the following JSP file. I want to display the \u0096 character as an en dash (a feat that other front-end solutions can accomplish).
<%@ page pageEncoding="UTF-8" contentType="text/html; charset=UTF-8"%>
<%@ page session="false" trimDirectiveWhitespaces="true"%>
<%@ taglib prefix="c" uri="http://java.sun.com/jsp/jstl/core"%>
<!doctype html>
<html>
<c:set var="control" scope="request" value= "b"/>
<c:set var="endash" scope="request" value="a"/>
<% request.setAttribute("control", "\u0096");%>
<% request.setAttribute("endash", "\u2013");%>
Match? 0096: <c:out value="${control}"/> 2013: <c:out value="${endash}"/>
The output I get is
Match? 0096: 2013: –
What I want is
Match? 0096: – 2013: –
Recommended answer
The character denoted by \u0096, i.e. U+0096, is unambiguously a control character in Unicode, with undefined meaning. This should not be confused with the fact that in the windows-1252 encoding, the byte 0x96 denotes U+2013 EN DASH.
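The distinction can be checked with a few lines of plain Java. This is only a minimal sketch: the class name `ControlVsDash` and its helper methods are made up for illustration, and it assumes the JVM provides the `windows-1252` charset (standard JDKs do).

```java
import java.nio.charset.Charset;

// Hypothetical helper class, for illustration only.
final class ControlVsDash {

    // True if the code point is an ISO control character (C0 or C1 range).
    static boolean isControl(char c) {
        return Character.isISOControl(c);
    }

    // Decodes a single byte value as windows-1252 text.
    static String decodeCp1252(int byteValue) {
        byte[] bytes = { (byte) byteValue };
        return new String(bytes, Charset.forName("windows-1252"));
    }

    public static void main(String[] args) {
        // The code point U+0096 is a control character.
        System.out.println(isControl('\u0096'));
        // The byte 0x96, read as windows-1252, is U+2013 EN DASH.
        System.out.println(decodeCp1252(0x96).equals("\u2013"));
    }
}
```

The first check is about a Unicode code point, the second about a byte in a particular encoding; the two only look alike because they share the number 0x96.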
Thus, instead of trying to render an invisible character as visible, you should simply replace U+0096 with U+2013 or, depending on the actual setup, convert the data you get from the database from windows-1252 to, e.g., UTF-16. It is unlikely that the database contains something meant to be U+0096. Rather, it contains bytes that are now being misinterpreted as UTF-16 but are actually windows-1252 encoded representations of characters.
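Both options can be sketched in Java before the value is placed in the request. This is a hedged illustration, not a drop-in fix: the class `C1Repair` and its method names are hypothetical, and the second method assumes that every character in the range U+0080 to U+009F in the string is really a windows-1252 byte that was decoded incorrectly.

```java
import java.nio.charset.Charset;

// Hypothetical helper class, for illustration only.
final class C1Repair {

    private static final Charset WINDOWS_1252 = Charset.forName("windows-1252");

    // Option 1: targeted replacement of U+0096 with U+2013 EN DASH.
    static String replaceDash(String s) {
        return s.replace('\u0096', '\u2013');
    }

    // Option 2: treat every C1 control character (U+0080 to U+009F) as a
    // windows-1252 byte and decode it again. Assumes such characters are
    // always the result of a wrong decoding step.
    static String repairC1(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c >= 0x80 && c <= 0x9F) {
                String decoded = new String(new byte[] { (byte) c }, WINDOWS_1252);
                // Bytes that windows-1252 leaves undefined may decode to the
                // replacement character; keep the original in that case.
                out.append("\uFFFD".equals(decoded) ? String.valueOf(c) : decoded);
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

In the JSP from the question, the attribute could then be set with something like `request.setAttribute("control", C1Repair.repairC1(value))`, where `value` stands for the string read from the database. If the setup allows it, fixing the decoding where the data is read from the database is the cleaner approach, since it avoids patching strings afterwards.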