Problem description
I need to transform every | character into a <bar/> tag in all text blocks of a large XML file. That is, whenever I find
<test att="one|two">content | something more | and done</test>
I need to transform it into
<test att="one|two">content <bar/> something more <bar/> and done</test>
Note that | can also occur in attribute values and, in that case, it must be kept unchanged. After reading the Transforming slide of the SAX Overview part of the CXML documentation, I wrote
(defclass preproc (cxml:sax-proxy) ())

(defmethod sax:characters ((handler preproc) data)
  (call-next-method handler (cl-ppcre:regex-replace "\\|" data "<bar/>")))
But of course, it produces an (escaped) string, not a tag, in the final XML.
WML> (cxml:parse "<test>content | ola</test>"
                 (make-instance 'preproc
                                :chained-handler (cxml:make-string-sink)))
"<?xml version="1.0" encoding="UTF-8"?>
<test>content &lt;bar/&gt; ola</test>"
Any ideas or pointers?
Recommended answer
The handler doesn't call the parser; it handles values that have already been parsed. So, rather than constructing a string that contains <bar/>, what you want to do is call the methods that would have been called if <bar/> had actually been encountered. In this case, if the document had actually contained
content <bar/> ola
inside the test element, then there would have been the calls:
(sax:characters handler "content ")
(sax:start-element handler nil nil "bar" '())
(sax:end-element handler nil nil "bar")
(sax:characters handler " ola")
So all you need to do is split the string on the | character (you can use CL-PPCRE for this if you want, though there may be more lightweight solutions), then do a call-next-method for each string part, with calls to sax:start-element and sax:end-element in between:
(defmethod sax:characters ((handler preproc) data)
  (let ((parts (cl-ppcre:split "\\|" data)))
    ;; check this on edge cases, though, e.g., "", "|", "a|", strings
    ;; without any "|", etc.
    (call-next-method handler (pop parts))
    (dolist (part parts)
      (sax:start-element handler nil nil "bar" '())
      (sax:end-element handler nil nil "bar")
      (call-next-method handler part))))

(cxml:parse "<test>content | ola</test>"
            (make-instance 'preproc
                           :chained-handler (cxml:make-string-sink)))
;=>
; "<?xml version=\"1.0\" encoding=\"UTF-8\"?>
; <test>content <bar/> ola</test>"
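The answer mentions that a lighter-weight alternative to CL-PPCRE may exist and that edge cases such as "", "|" and "a|" deserve a check. The following is only a sketch of one way to handle both in plain Common Lisp. The names split-on-bar and walk-bar-split are hypothetical helpers introduced here for illustration; they are not part of CXML or CL-PPCRE. The idea is to keep empty leading and trailing pieces when splitting, so that every | yields exactly one <bar/> event, and to skip only the empty text pieces.

```lisp
;; Hypothetical helper (not part of CXML or CL-PPCRE): split STRING on
;; every | character, keeping empty leading and trailing pieces so that
;; the number of pieces is always (number of | characters) + 1.
(defun split-on-bar (string)
  (loop with start = 0
        for pos = (position #\| string :start start)
        collect (subseq string start pos)
        while pos
        do (setf start (1+ pos))))

;; Hypothetical helper: call ON-TEXT for each non-empty piece and ON-BAR
;; once per | character, in document order.
(defun walk-bar-split (string on-text on-bar)
  (loop for (part . more) on (split-on-bar string)
        do (unless (string= part "")
             (funcall on-text part))
           (when more
             (funcall on-bar))))

;; Recording the events instead of calling SAX methods:
(let ((events '()))
  (walk-bar-split "a|"
                  (lambda (text) (push text events))
                  (lambda () (push :bar events)))
  (reverse events))
;; => ("a" :BAR)
```

Inside the preproc method, on-text would presumably be a closure that does (call-next-method handler text), and on-bar a closure that calls sax:start-element followed by sax:end-element exactly as in the answer's code; that wiring is an assumption and has not been run against CXML here.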