Parsing nested custom YAML tags


Problem description

I have some YAML with application-specific tags (from an AWS CloudFormation template, to be exact) that looks like this:

example_yaml = "Name: !Join [' ', ['EMR', !Ref 'Environment', !Ref 'Purpose']]"

I want to parse it so that I can do this:

>>> print(result)
{'Name': 'EMR {Environment} {Purpose}'}

>>> name = result['Name'].format(
...     Environment='Development',
...     Purpose='ETL'
... )
>>> print(name)
EMR Development ETL

Currently my code looks like this:

import yaml
from pprint import pprint


def aws_join(loader, node):
    join_args = loader.construct_yaml_seq(node)
    delimiter = list(join_args)[0]
    joinables = list(join_args)[1]
    join_result = delimiter.join(joinables)
    return join_result

def aws_ref(loader, node):
    value = loader.construct_scalar(node)
    placeholder = '{'+value+'}'
    return placeholder

yaml.add_constructor('!Join', aws_join)
yaml.add_constructor('!Ref', aws_ref)

example_yaml = "Name: !Join [' ', ['EMR', !Ref 'Environment', !Ref 'Purpose']]"

pprint(yaml.load(example_yaml))

Unfortunately this results in an error.

...
   joinables = list(join_args)[1]
IndexError: list index out of range

Adding print('What I am: '+str(join_args)) to aws_join shows that I'm getting a generator:

What I am: <generator object SafeConstructor.construct_yaml_seq at 0x1082ece08>

That's why I tried to cast the generator as a list. The generator eventually populates correctly though, just not in time for me to use it. If I change my aws_join function to this:

def aws_join(loader, node):
    join_args = loader.construct_yaml_seq(node)
    return join_args

Then the final result looks like this:

{'Name': [' ', ['EMR', '{Environment}', '{Purpose}']]}

So the pieces my function needs are there, just not at the moment the function needs them.

Solution

You are close, but the problem is that you are using the method construct_yaml_seq(). That method is actually the registered constructor for the normal YAML sequence (the one that eventually makes a Python list). It calls the construct_sequence() method to handle the node that gets passed in, and that is what you should do as well.

As you are returning a string, which cannot deal with recursive data structures, you don't need the two-step creation process (first yield-ing, then filling out) that the construct_yaml_seq() method follows. That two-step creation process is why you encountered a generator.
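
To illustrate why the built-in sequence constructor works in two steps, here is a minimal sketch (not part of the original answer) that loads a list containing itself through an anchor and an alias. The anchor name `a` is arbitrary. Because the empty list is yielded before its elements are filled in, the alias can point back at the very list that is still being built:

```python
import yaml

# "&a" anchors the list, "*a" refers back to it from inside the list itself.
data = yaml.safe_load("&a [1, *a]")

# The second element is the outer list object, not a copy of it.
print(data[0])          # the plain first element
print(data[1] is data)  # the list contains itself
```

A constructor that returns an immutable value such as a string can never be referenced from inside itself this way, so it can return its result directly.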

construct_sequence() returns a simple list, but as you want the nodes underneath the !Join to be available when you start processing, make sure to specify the deep=True parameter; otherwise the second list element will be an empty list. And because construct_yaml_seq() doesn't specify deep=True, you did not get the pieces in time in your function (otherwise you could actually have used that method).
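
The effect of deep=True can be seen in isolation with the following sketch. The helper name `build` is hypothetical, and the example uses plain strings instead of !Ref tags so that no custom constructors are needed. It composes the node with a fresh SafeLoader each time, because a loader caches the objects it has already constructed:

```python
import yaml


def build(text, deep):
    """Compose `text` into a node and construct it as a sequence (hypothetical helper)."""
    loader = yaml.SafeLoader(text)
    try:
        node = loader.get_single_node()
        return loader.construct_sequence(node, deep=deep)
    finally:
        loader.dispose()


text = "[' ', ['EMR', 'Environment', 'Purpose']]"

# Without deep=True the nested list is still empty at this point; it would
# only be filled later, when the loader finishes the whole document.
shallow = build(text, deep=False)

# With deep=True the nested list is complete as soon as the call returns.
deep = build(text, deep=True)

print(shallow)
print(deep)
```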

import yaml
from pprint import pprint


def aws_join(loader, node):
    join_args = loader.construct_sequence(node, deep=True)
    # you can comment out next line
    assert join_args == [' ', ['EMR', '{Environment}', '{Purpose}']]
    delimiter = join_args[0]
    joinables = join_args[1]
    return delimiter.join(joinables)

def aws_ref(loader, node):
    value = loader.construct_scalar(node)
    placeholder = '{'+value+'}'
    return placeholder

yaml.add_constructor('!Join', aws_join, Loader=yaml.SafeLoader)
yaml.add_constructor('!Ref', aws_ref, Loader=yaml.SafeLoader)

example_yaml = "Name: !Join [' ', ['EMR', !Ref 'Environment', !Ref 'Purpose']]"

pprint(yaml.safe_load(example_yaml))

which gives:

{'Name': 'EMR {Environment} {Purpose}'}

You should not use load(): it is documented to be potentially unsafe, and above all, it is not necessary here. Register with the SafeLoader and call safe_load().
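
As a variation that is not part of the original answer, the constructors can be registered on a dedicated subclass of SafeLoader, so the tags are not added to SafeLoader globally. The class name `CfnLoader` is an assumption made for this sketch. Passing a SafeLoader subclass explicitly to yaml.load() keeps the safe construction rules, and the example also shows the final format() step from the question:

```python
import yaml


class CfnLoader(yaml.SafeLoader):
    """Hypothetical SafeLoader subclass that carries the CloudFormation-style tags."""


def aws_ref(loader, node):
    # Turn !Ref 'X' into the placeholder '{X}'.
    return '{' + loader.construct_scalar(node) + '}'


def aws_join(loader, node):
    # deep=True so the nested list is already filled in when we join it.
    delimiter, joinables = loader.construct_sequence(node, deep=True)
    return delimiter.join(joinables)


# add_constructor is a classmethod, so this only affects CfnLoader.
CfnLoader.add_constructor('!Ref', aws_ref)
CfnLoader.add_constructor('!Join', aws_join)

example_yaml = "Name: !Join [' ', ['EMR', !Ref 'Environment', !Ref 'Purpose']]"

result = yaml.load(example_yaml, Loader=CfnLoader)
name = result['Name'].format(Environment='Development', Purpose='ETL')
print(name)
```

With this layout a plain yaml.safe_load() call elsewhere in the program still rejects the !Join and !Ref tags, which may or may not be what you want.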
