Using Python, I would like to "educate" the quotes in a plain-text input and turn them into ConTeXt syntax. Here is a (recursive) example:

Input:

Using python, I would like "educate" quotes of
a plain text input and turn them into the Context syntax.
Here is a (recursive) example:


Output:

Using python, I would like \quotation{educate} quotes of
a plain text input and turn them into the Context syntax.
Here is a (recursive) example:


I would also like it to handle nested quotes:

Input:

Original text: "Using python, I would like 'educate' quotes of
a plain text input and turn them into the Context syntax.
Here is a (recursive) example:"


Output:

Original text: \quotation {Using python, I would like \quotation{educate} quotes of
a plain text input and turn them into the Context syntax.
Here is a (recursive) example:}


Of course, I also need to take care of edge cases such as an apostrophe that is not a quotation mark:

She said "It looks like we are back in the '90s"


The specification for ConTeXt quotations is here:

http://wiki.contextgarden.net/Nested_quotations#Nested_quotations_in_MkIV

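What is the most sensible approach in this case? Thank you very much!

Best answer

This works for nested quotes, although it does not handle your edge case: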

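def quote(string):
    """Replace paired quote characters with ConTeXt quotation groups."""
    text = ''
    stack = []  # quote characters that are currently open
    for token in iter_tokes(string):
        if is_quote(token):
            if stack and stack[-1] == token:  # closing
                text += '}'
                stack.pop()
            else:  # opening
                text += '\\quotation{'
                stack.append(token)
        else:
            text += token
    return text

def iter_tokes(string):
    """Yield runs of plain text and individual quote characters, in order."""
    # Iterative rather than recursive, so long inputs cannot hit the
    # recursion limit. Empty tokens are never yielded.
    start = 0
    while True:
        i = find_quote(string, start)
        if i is None:
            if start < len(string):
                yield string[start:]
            return
        if i > start:
            yield string[start:i]
        yield string[i]
        start = i + 1

def find_quote(string, start=0):
    """Return the index of the first quote at or after start, or None."""
    for i in range(start, len(string)):
        if is_quote(string[i]):
            return i
    return None

def is_quote(char):
    # Compare against a tuple: with a plain string, '' in '\'"' would be True.
    return char in ("'", '"')

def main():
    with open('input.txt') as fh:
        print(quote(fh.read()))

if __name__ == '__main__':
    main()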
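
The edge case from the question (the apostrophe in '90s) could be approached by deciding, before the stack logic runs, whether a single quote is really an apostrophe. The following is only a sketch under assumed heuristics, not part of the original answer: `is_apostrophe` and `quote_with_apostrophes` are hypothetical names, and the two rules (a single quote between two alphanumeric characters, or a single quote that starts a word and is followed by a digit) are guesses that will misread some inputs, for example a plural possessive such as dogs' or a quotation that begins with a number.

```python
def is_apostrophe(text, i):
    """Guess whether the single quote at text[i] is an apostrophe (heuristic)."""
    prev_is_word = i > 0 and text[i - 1].isalnum()
    next_char = text[i + 1] if i + 1 < len(text) else ''
    # Between two letters or digits: a contraction such as don't.
    if prev_is_word and next_char.isalnum():
        return True
    # At the start of a word and followed by a digit: an abbreviated year such as '90s.
    if not prev_is_word and next_char.isdigit():
        return True
    return False


def quote_with_apostrophes(text):
    """Stack-based quote conversion that leaves likely apostrophes untouched."""
    out = []
    stack = []  # quote characters that are currently open
    for i, char in enumerate(text):
        is_quote_char = char in ("'", '"')
        if is_quote_char and not (char == "'" and is_apostrophe(text, i)):
            if stack and stack[-1] == char:  # closing
                out.append('}')
                stack.pop()
            else:  # opening
                out.append('\\quotation{')
                stack.append(char)
        else:
            out.append(char)
    return ''.join(out)


if __name__ == '__main__':
    print(quote_with_apostrophes('She said "It looks like we are back in the \'90s"'))
```

With these rules the apostrophe in '90s is copied through unchanged, while 'educate' inside a double-quoted sentence is still treated as a nested quotation. A more reliable solution would likely need language-aware rules rather than character-level guesses.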

This Q&A, "python - Parsing text to replace quotes and nested quotes", is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/16262004/
