Here is my parsing code:

import yaml

def yaml_as_python(val):
    """Convert YAML to dict"""
    try:
        return yaml.load_all(val)
    except yaml.YAMLError as exc:
        return exc

with open('circuits-small.yaml','r') as input_file:
    results = yaml_as_python(input_file)
    print results
    for value in results:
        print value

Here is a sample of the file:
ingests:
  - timestamp: 1970-01-01T00:00:00.000Z
    id: SwitchBank_35496721
    attrs:
      Feeder: Line_928
      Switch.normalOpen: 'true'
      IdentifiedObject.description: SwitchBank
      IdentifiedObject.mRID: SwitchBank_35496721
      PowerSystemResource.circuit: '928'
      IdentifiedObject.name: SwitchBank_35496721
      IdentifiedObject.aliasName: SwitchBank_35496721
    loc: vector [43.05292, -76.126800000000003, 0.0]
    kind: SwitchBank
  - timestamp: 1970-01-01T00:00:00.000Z
    id: UndergroundDistributionLineSegment_34862802
    attrs:
      Feeder: Line_928
      status: de-energized
      IdentifiedObject.description: UndergroundDistributionLineSegment
      IdentifiedObject.mRID: UndergroundDistributionLineSegment_34862802
      PowerSystemResource.circuit: '928'
      IdentifiedObject.name: UndergroundDistributionLineSegment_34862802
    path:
    - vector [43.052942000000002, -76.126716000000002, 0.0]
    - vector [43.052585000000001, -76.126515999999995, 0.0]
    kind: UndergroundDistributionLineSegment
  - timestamp: 1970-01-01T00:00:00.000Z
    id: UndergroundDistributionLineSegment_34806014
    attrs:
      Feeder: Line_928
      status: de-energized
      IdentifiedObject.description: UndergroundDistributionLineSegment
      IdentifiedObject.mRID: UndergroundDistributionLineSegment_34806014
      PowerSystemResource.circuit: '928'
      IdentifiedObject.name: UndergroundDistributionLineSegment_34806014
    path:
    - vector [43.05292, -76.126800000000003, 0.0]
    - vector [43.052928999999999, -76.126766000000003, 0.0]
    - vector [43.052942000000002, -76.126716000000002, 0.0]
    kind: UndergroundDistributionLineSegment
...
ingests:
  - timestamp: 1970-01-01T00:00:00.000Z
    id: OverheadDistributionLineSegment_31168454

In the traceback, note that it starts having problems at the ... line.
Traceback (most recent call last):
  File "convert.py", line 29, in <module>
    for value in results:
  File "/Users/conduce-laptop/anaconda2/lib/python2.7/site-packages/yaml/__init__.py", line 82, in load_all
    while loader.check_data():
  File "/Users/conduce-laptop/anaconda2/lib/python2.7/site-packages/yaml/constructor.py", line 28, in check_data
    return self.check_node()
  File "/Users/conduce-laptop/anaconda2/lib/python2.7/site-packages/yaml/composer.py", line 18, in check_node
    if self.check_event(StreamStartEvent):
  File "/Users/conduce-laptop/anaconda2/lib/python2.7/site-packages/yaml/parser.py", line 98, in check_event
    self.current_event = self.state()
  File "/Users/conduce-laptop/anaconda2/lib/python2.7/site-packages/yaml/parser.py", line 174, in parse_document_start
    self.peek_token().start_mark)
yaml.parser.ParserError: expected '<document start>', but found '<block mapping start>'
  in "circuits-small.yaml", line 42, column 1

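What I would like is to parse each of those documents as a separate object, perhaps all of them in the same list, or pretty much anything else that works with the PyYAML module. I believe the ... is actually valid YAML, so I am surprised that it is not handled automatically.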

Best answer

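The error message is quite specific: a document needs to start with a document start marker. Your first document has no such marker, although it does have a document end marker. Once you explicitly end the first document with ..., PyYAML no longer accepts a following document without document boundary markers; you have to start it explicitly with ---.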

The end of your file should look like this:

    kind: UndergroundDistributionLineSegment
...
---
ingests:
  - timestamp: 1970-01-01T00:00:00.000Z
    id: OverheadDistributionLineSegment_31168454

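You can leave the explicit document start marker off the first document, but you need to include a start marker for every following document. Document end markers are optional.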
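As a minimal sketch of the difference, the snippet below parses two shortened strings instead of the real file (the keys a and b are placeholders, and load_documents is a hypothetical helper name). It assumes PyYAML is installed. Because the load_all family returns a lazy generator, the parse error only surfaces while iterating, so the helper wraps the result in list() to force parsing at that point.

```python
import yaml

# The second document follows '...' without a '---' start marker.
BROKEN = "a: 1\n...\nb: 2\n"
# Same content, but the second document starts explicitly with '---'.
FIXED = "a: 1\n...\n---\nb: 2\n"


def load_documents(text):
    """Parse every document in `text` and return them as a list."""
    # safe_load_all is lazy; list() forces parsing here, so any
    # YAMLError is raised inside this function rather than later.
    return list(yaml.safe_load_all(text))


try:
    load_documents(BROKEN)
    broken_error = None
except yaml.YAMLError as exc:
    # Keep the exception so it can be inspected or reported.
    broken_error = exc

documents = load_documents(FIXED)
print(broken_error)
print(documents)
```

With the start marker in place, each document comes back as its own dict in one list, which is the shape asked for in the question.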

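If you do not have complete control over the input, using .load_all() is unsafe. There is seldom a reason to take that risk: you should use .safe_load_all() and extend the SafeLoader to handle any specific tags your YAML might contain.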
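One way to extend the SafeLoader is sketched below. The sample file stores its vectors as plain strings, so the !vector tag here is purely hypothetical, as are the names VectorLoader and construct_vector; the point is only to show how a constructor can be registered on a SafeLoader subclass without falling back to the unsafe loader.

```python
import yaml


class VectorLoader(yaml.SafeLoader):
    """SafeLoader subclass (hypothetical) that also understands '!vector'."""


def construct_vector(loader, node):
    # Turn a YAML sequence tagged '!vector' into a tuple of floats.
    return tuple(float(x) for x in loader.construct_sequence(node))


# Register on the subclass only; yaml.SafeLoader itself stays unchanged.
VectorLoader.add_constructor("!vector", construct_vector)

TEXT = "loc: !vector [43.05292, -76.1268, 0.0]\n---\nloc: !vector [1, 2, 3]\n"

# Passing Loader= explicitly keeps the safe construction rules.
documents = list(yaml.load_all(TEXT, Loader=VectorLoader))
print(documents)
```

Registering the constructor on a subclass, rather than on SafeLoader directly, keeps the extra tag scoped to code that opts in to it.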

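Apart from that, you should start your YAML documents with an explicit version directive before the document start indicator (which you should add to the first document as well):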
%YAML 1.1
---

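This is for the benefit of future editors of your YAML files, because you are using PyYAML, which only supports (most of) YAML 1.1 and not the YAML 1.2 specification (from 2009). The alternative is of course to upgrade your YAML parser to ruamel.yaml, which would also have warned you about using the unsafe load_all() (disclaimer: I am the author of that parser). ruamel.yaml does not allow you to have a bare document after an explicit document end marker either (as @flyx pointed out), which is a bug.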
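The version directive described above can be tried out with a short string. This is a sketch assuming PyYAML's pure-Python SafeLoader; the keys a and b are placeholders. Each document carries its own directive, and the first document is closed with ... before the next directive begins.

```python
import yaml

TEXT = (
    "%YAML 1.1\n"
    "---\n"
    "a: 1\n"
    "...\n"
    "%YAML 1.1\n"
    "---\n"
    "b: 2\n"
)

# Both documents parse; the directive itself does not appear in the data.
documents = list(yaml.safe_load_all(TEXT))
print(documents)

# A directive naming a different major version is refused by the parser.
try:
    yaml.safe_load("%YAML 2.0\n---\na: 1\n")
    rejected = False
except yaml.YAMLError:
    rejected = True
print(rejected)
```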

10-07 19:00