This article covers how to handle PyYAML automatically converting certain keys to boolean values.

Question

I've been working with the PyYAML parser for a few months now to convert file types as part of a data pipeline. I've found the parser to be quite idiosyncratic at times, and it seems that today I've stumbled on another strange behavior. The file I'm currently converting contains the following section:

off:
    yes: "Flavor text for yes"
    no: "Flavor text for no"

I keep a list of the current nesting in the dictionary so that I can construct a flat document, but save the nesting to convert back to YAML later on. I got a TypeError saying I was trying to concatenate a str and a bool together. I investigated and found that PyYAML is actually taking my section of text above and converting it to the following:

import yaml

with open(filename, "r") as f:
    # PyYAML 6+ requires the Loader argument to be passed explicitly
    data = yaml.load(f, Loader=yaml.Loader)
print(data)

>> {False: {True: "Flavor text for yes", False: "Flavor text for no"}}

I did a quick check and found that PyYAML was doing this for yes, no, true, false, on, off. It only does this conversion if the keys are unquoted. Quoted values and keys will be passed fine. Looking for solutions, I found this behavior documented here.
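The quoted versus unquoted difference can be reproduced with a small check. This is only a sketch: the variable names and the two inline documents are made up for illustration, and it uses yaml.safe_load, which resolves these scalars the same way as the default loader.

```python
import yaml

# Unquoted yes/no keys are resolved as YAML 1.1 booleans.
unquoted = yaml.safe_load("yes: 1\nno: 2")

# Quoted keys are always plain strings.
quoted = yaml.safe_load('"yes": 1\n"no": 2')

print(unquoted)
print(quoted)
```

The first result has the keys True and False, while the second keeps the strings "yes" and "no".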

Although it might be helpful to others to know that quoting the keys will stop PyYAML from doing this, I don't have this option as I am not the author of these files and have written my code to touch the data as little as possible.

Is there a workaround for this issue or a way to override the default conversion behavior in PyYAML?

Recommended answer

yaml.load takes a second argument, a loader class (yaml.loader.Loader by default in older releases; PyYAML 6 and later require it to be passed explicitly). The predefined loader is a mash-up of a number of others:

class Loader(Reader, Scanner, Parser, Composer, Constructor, Resolver):

    def __init__(self, stream):
        Reader.__init__(self, stream)
        Scanner.__init__(self)
        Parser.__init__(self)
        Composer.__init__(self)
        Constructor.__init__(self)
        Resolver.__init__(self)

The Constructor class is the one performing the data type conversion. One (kludgy, but fast) way to override the boolean conversion could be:

from yaml.constructor import Constructor

def add_bool(self, node):
    return self.construct_scalar(node)

Constructor.add_constructor(u'tag:yaml.org,2002:bool', add_bool)

which overrides the function that the constructor uses to turn boolean-tagged data into Python booleans. What we're doing here is just returning the string, verbatim.

This affects ALL YAML loading, though, because you're overriding the behaviour of the default constructor. A more proper way to do things could be to create a new class derived from Constructor, and a new Loader class that uses your custom constructor.
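A minimal sketch of that more contained approach follows. It simplifies the suggestion slightly: rather than deriving a separate Constructor and assembling a new Loader from parts, it subclasses an existing loader and registers the override on the subclass only. The names NoBoolLoader and construct_bool_as_str are hypothetical, and basing the subclass on yaml.SafeLoader instead of yaml.Loader is an assumption made here because the files come from other authors.

```python
import yaml


class NoBoolLoader(yaml.SafeLoader):
    """Hypothetical loader that keeps yes/no/on/off/true/false as strings."""


def construct_bool_as_str(loader, node):
    # Return the scalar text verbatim instead of converting it to a bool.
    return loader.construct_scalar(node)


# add_constructor copies the constructor table onto the subclass before
# modifying it, so SafeLoader and the other stock loaders are unaffected.
NoBoolLoader.add_constructor("tag:yaml.org,2002:bool", construct_bool_as_str)

doc = """
off:
    yes: "Flavor text for yes"
    no: "Flavor text for no"
"""

data = yaml.load(doc, Loader=NoBoolLoader)
print(data)
```

With this loader the keys come back as the strings "off", "yes" and "no", while calls that use the stock loaders elsewhere in the program keep their normal boolean handling.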


10-19 02:24