Problem description
I need to get the line numbers of certain keys of a YAML file.
Please note that this answer does not solve the issue: I do use ruamel.yaml, and the answers there do not work with ordered maps.
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
from ruamel import yaml

data = yaml.round_trip_load("""
key1: !!omap
  - key2: item2
  - key3: item3
  - key4: !!omap
    - key5: item5
    - key6: item6
""")
print(data)
As a result I get:
CommentedMap([('key1', CommentedOrderedMap([('key2', 'item2'), ('key3', 'item3'), ('key4', CommentedOrderedMap([('key5', 'item5'), ('key6', 'item6')]))]))])
This does not allow access to the line numbers, except for the !!omap keys:
print(data['key1'].lc.line) # output: 1
print(data['key1']['key4'].lc.line) # output: 4
But:
print(data['key1']['key2'].lc.line) # output: AttributeError: 'str' object has no attribute 'lc'
Indeed, data['key1']['key2'] is a str.
I've found a workaround:
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
from ruamel import yaml

DATA = yaml.round_trip_load("""
key1: !!omap
  - key2: item2
  - key3: item3
  - key4: !!omap
    - key5: item5
    - key6: item6
""")


def get_line_nb(data):
    if isinstance(data, dict):
        offset = data.lc.line
        for i, key in enumerate(data):
            if isinstance(data[key], dict):
                get_line_nb(data[key])
            else:
                print('{}|{} found in line {}\n'
                      .format(key, data[key], offset + i + 1))


get_line_nb(DATA)
Output:
key2|item2 found in line 2
key3|item3 found in line 3
key5|item5 found in line 5
key6|item6 found in line 6
But this looks a little bit "dirty". Is there a more proper way of doing it?
This workaround is not only dirty, it also only works for simple cases like the one above, and it will give wrong results as soon as there are nested lists in the way.
Recommended answer
The issue is not that you are using !omap and that it doesn't give you the line numbers the way "normal" mappings do. That should be clear from the fact that you get 4 from doing print(data['key1']['key4'].lc.line) (where key4 is a key in the outer !omap).
The value for data['key1']['key4'] is a collection item (another !omap), but the value for data['key1']['key2'] is not a collection item but a built-in Python string, which has no slot to store the lc attribute.
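The difference can be reproduced without any YAML at all. The following is a minimal sketch, not ruamel.yaml code: `StrWithLC` is a hypothetical name, `SimpleNamespace` stands in for the library's line/column object, and the position values are only illustrative.

```python
from types import SimpleNamespace


class StrWithLC(str):
    # Hypothetical str subclass with one extra slot for position info.
    __slots__ = ('lc',)


plain = 'item2'
try:
    # A built-in str instance has nowhere to store extra attributes.
    plain.lc = SimpleNamespace(line=2, col=10)
    plain_accepts_lc = True
except AttributeError:
    plain_accepts_lc = False

tagged = StrWithLC('item2')
tagged.lc = SimpleNamespace(line=2, col=10)  # works: the slot exists

print(plain_accepts_lc)               # whether the plain str took the attribute
print(tagged == 'item2', tagged.lc.line)
```

The subclass still compares equal to the plain string, so code that only reads the value should not notice the difference.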
To get an .lc attribute on a non-collection like a string, you have to subclass the RoundTripConstructor so that it uses something like the classes in scalarstring.py (with __slots__ adjusted to accept the lc attribute), and then transfer the line and column information available in the nodes to that attribute:
import sys
import ruamel.yaml

yaml_str = """
key1: !!omap
  - key2: item2
  - key3: item3
  - key4: !!omap
    - key5: 'item5'
    - key6: |
        item6
"""


class Str(ruamel.yaml.scalarstring.ScalarString):
    __slots__ = ('lc')

    style = ""

    def __new__(cls, value):
        return ruamel.yaml.scalarstring.ScalarString.__new__(cls, value)


class MyPreservedScalarString(ruamel.yaml.scalarstring.PreservedScalarString):
    __slots__ = ('lc')


class MyDoubleQuotedScalarString(ruamel.yaml.scalarstring.DoubleQuotedScalarString):
    __slots__ = ('lc')


class MySingleQuotedScalarString(ruamel.yaml.scalarstring.SingleQuotedScalarString):
    __slots__ = ('lc')


class MyConstructor(ruamel.yaml.constructor.RoundTripConstructor):
    def construct_scalar(self, node):
        # type: (Any) -> Any
        if not isinstance(node, ruamel.yaml.nodes.ScalarNode):
            raise ruamel.yaml.constructor.ConstructorError(
                None, None,
                "expected a scalar node, but found %s" % node.id,
                node.start_mark)

        if node.style == '|' and isinstance(node.value, ruamel.yaml.compat.text_type):
            ret_val = MyPreservedScalarString(node.value)
        elif bool(self._preserve_quotes) and isinstance(node.value, ruamel.yaml.compat.text_type):
            if node.style == "'":
                ret_val = MySingleQuotedScalarString(node.value)
            elif node.style == '"':
                ret_val = MyDoubleQuotedScalarString(node.value)
            else:
                ret_val = Str(node.value)
        else:
            ret_val = Str(node.value)
        ret_val.lc = ruamel.yaml.comments.LineCol()
        ret_val.lc.line = node.start_mark.line
        ret_val.lc.col = node.start_mark.column
        return ret_val


yaml = ruamel.yaml.YAML()
yaml.Constructor = MyConstructor

data = yaml.load(yaml_str)
print(data['key1']['key4'].lc.line)
print(data['key1']['key2'].lc.line)
print(data['key1']['key4']['key6'].lc.line)
Please note that the output of the last call to print is 6, as the literal scalar string starts with the |.
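To see which text line that 6 refers to, the input can be split with the standard library alone. This is only a sketch: it assumes the block-scalar indentation shown here, and it relies on the reported line numbers being 0-based, which matches the values quoted above.

```python
yaml_str = """
key1: !!omap
  - key2: item2
  - key3: item3
  - key4: !!omap
    - key5: 'item5'
    - key6: |
        item6
"""

lines = yaml_str.splitlines()  # lines[0] is the empty line after the opening quotes
reported = 6                   # value quoted above for the key6 scalar

indicator_line = lines[reported]    # the line holding the | indicator
content_line = lines[reported + 1]  # the first line of the scalar's text

print(repr(indicator_line))
print(repr(content_line))
```

So a tool that wants to point at the text of a literal scalar, rather than at its indicator, would need to add one to the stored line.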
If you also want to dump data, you'll need to make a Representer aware of those My.... types.
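Once every scalar carries an lc attribute, the line numbers can be collected with one generic traversal instead of counting offsets. The sketch below is an assumption-laden illustration: `iter_line_numbers` is a hypothetical helper, and `FakeStr`, `FakeMap` and `at` are stand-ins so that it runs without ruamel.yaml installed. Data loaded with a constructor like the one above should work the same way as long as its mappings are dict subclasses, its sequences are list subclasses, and every node of interest has an lc.line.

```python
from types import SimpleNamespace


def iter_line_numbers(node, path=()):
    """Yield (path, line) for every value below node that has an .lc attribute."""
    if isinstance(node, dict):
        children = node.items()
    elif isinstance(node, list):
        children = enumerate(node)
    else:
        children = ()
    for key, value in children:
        child_path = path + (key,)
        lc = getattr(value, 'lc', None)
        if lc is not None:
            yield child_path, lc.line
        # Recurse so nested mappings and lists are covered as well.
        yield from iter_line_numbers(value, child_path)


# Stand-ins for the loaded data (hypothetical, for demonstration only).
class FakeStr(str):
    __slots__ = ('lc',)


class FakeMap(dict):
    pass


def at(obj, line):
    """Attach a fake position object to obj and return it."""
    obj.lc = SimpleNamespace(line=line)
    return obj


inner = at(FakeMap(key5=at(FakeStr('item5'), 5), key6=at(FakeStr('item6\n'), 6)), 4)
outer = at(FakeMap(key2=at(FakeStr('item2'), 2), key3=at(FakeStr('item3'), 3), key4=inner), 1)
doc = FakeMap(key1=outer)

found = dict(iter_line_numbers(doc))
for path, line in found.items():
    print('/'.join(path), line)
```

Note that the reported position is that of the value, not of the key; the two coincide only when both sit on the same line.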