Question
I am using a contenteditable <div/> with pasting enabled.
The amount of markup that gets pasted in from a clipboard copy from Microsoft Word is amazing. I am battling this, and have gotten about halfway there using Prototype's stripTags() function (which unfortunately does not seem to let me keep some tags).
However, even after that, I wind up with a mind-blowing amount of unneeded markup.
So my question is: is there some JavaScript function, or an approach I can use, that will clean up the majority of this unneeded markup?
Recommended answer
Here is the function I wound up writing. It does the job fairly well (as far as I can tell, anyway).
I am certainly open to improvement suggestions if anyone has any. Thanks.
function cleanWordPaste( in_word_text ) {
    // parse the pasted HTML in a detached element and keep only its text
    var tmp = document.createElement("DIV");
    tmp.innerHTML = in_word_text;
    var newString = tmp.textContent || tmp.innerText || "";
    // this next piece converts line breaks into break tags
    // and removes the seemingly endless junk code (everything up to
    // and including the last "-->", such as the text of Word's style rules)
    newString = newString.replace(/\n/g, "<br />").replace(/.*<!--.*-->/g, "");
    // this next piece removes any break tags (up to 10) at beginning
    for ( var i = 0; i < 10; i++ ) {
        if ( newString.substr(0, 6) == "<br />" ) {
            newString = newString.replace("<br />", "");
        }
    }
    return newString;
}
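On the point from the question about keeping some tags: below is a minimal sketch of one possible approach, not part of Prototype and not the function above. It assumes a hypothetical helper named stripTagsExcept that takes the pasted HTML and a list of tag names to keep, removes comments and style/script/xml blocks, and strips every other tag. Tags that are kept lose their attributes, which is where most of Word's inline styling lives. It works on strings with regular expressions, so it is fragile (for example, a ">" inside an attribute value will confuse it) and should not be treated as a sanitizer for untrusted input.

```javascript
// Hypothetical helper (an assumption for illustration, not a Prototype API):
// strips every tag except those named in `allowed`, and drops all
// attributes from the tags it keeps.
function stripTagsExcept(html, allowed) {
    var keep = {};
    for (var i = 0; i < allowed.length; i++) {
        keep[allowed[i].toLowerCase()] = true;
    }
    return html
        // remove comments, including Word's conditional comments
        .replace(/<!--[\s\S]*?-->/g, "")
        // remove <style>, <script> and <xml> blocks together with their contents
        .replace(/<(style|script|xml)\b[^>]*>[\s\S]*?<\/\1\s*>/gi, "")
        // rewrite each remaining tag without attributes, or drop it entirely
        .replace(/<(\/?)([a-zA-Z][a-zA-Z0-9:]*)\b[^>]*>/g, function (match, slash, name) {
            name = name.toLowerCase();
            return keep[name] ? "<" + slash + name + ">" : "";
        });
}
```

For example, calling stripTagsExcept(pastedHtml, ["p", "b", "i", "br"]) would keep paragraph, bold, italic and break tags and discard everything else, such as Word's <o:p> and <span> wrappers. Here pastedHtml is a placeholder for whatever string the paste handler receives.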
Hope this is helpful to some of you.