I am trying to convert lvm.conf into a Python (JSON-like) object.
An LVM (Logical Volume Management) configuration file looks like this:
# Configuration section config.
# How LVM configuration settings are handled.
config {
# Configuration option config/checks.
# If enabled, any LVM configuration mismatch is reported.
# This implies checking that the configuration key is understood by
# LVM and that the value of the key is the proper type. If disabled,
# any configuration mismatch is ignored and the default value is used
# without any warning (a message about the configuration key not being
# found is issued in verbose mode only).
checks = 1
# Configuration option config/abort_on_errors.
# Abort the LVM process if a configuration mismatch is found.
abort_on_errors = 0
# Configuration option config/profile_dir.
# Directory where LVM looks for configuration profiles.
profile_dir = "/etc/lvm/profile"
}
local {
}
log {
verbose=0
silent=0
syslog=1
overwrite=0
level=0
indent=1
command_names=0
prefix=" "
activation=0
debug_classes=["memory","devices","activation","allocation","lvmetad","metadata","cache","locking","lvmpolld","dbus"]
}
I would like a Python dictionary like this:
{ "section_name":
    {"value1" : 1,
     "value2" : "some_string",
     "value3" : [list, of, strings]}... and so on.}
The parser function:
import pyparsing as pp

def parseLvmConfig2(path="/etc/lvm/lvm.conf"):
    try:
        EQ, LBRACE, RBRACE, LQ, RQ = map(pp.Suppress, "={}[]")
        comment = pp.Suppress("#") + pp.Suppress(pp.restOfLine)
        configSection = pp.Word(pp.alphas + "_") + LBRACE
        sectionKey = pp.Word(pp.alphas + "_")
        sectionValue = pp.Forward()
        entry = pp.Group(sectionKey + EQ + sectionValue)
        real = pp.Regex(r"[+-]?\d+\.\d*").setParseAction(lambda x: float(x[0]))
        integer = pp.Regex(r"[+-]?\d+").setParseAction(lambda x: int(x[0]))
        listval = pp.Regex(r'(?:\[)(.*)?(?:\])').setParseAction(lambda x: eval(x[0]))
        pp.dblQuotedString.setParseAction(pp.removeQuotes)
        struct = pp.Group(pp.ZeroOrMore(entry) + RBRACE)
        sectionValue << (pp.dblQuotedString | real | integer | listval)
        parser = pp.ZeroOrMore(configSection + pp.Dict(struct))
        res = parser.parseFile(path)
        print(res)
    except (pp.ParseBaseException, ) as e:
        print("lvm.conf bad format {0}".format(e))
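The result is a mess. The question is: how can I get pyparsing to do this job without any additional logic?
Update (solved):
For anyone who wants to understand pyparsing better, see @PaulMcG's explanation below. (Thanks for pyparsing, Paul!)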
import pyparsing as pp
ppc = pp.pyparsing_common

def parseLvmConf(conf="/etc/lvm/lvm.conf", res_type="dict"):
    EQ, LBRACE, RBRACE, LQ, RQ = map(pp.Suppress, "={}[]")
    comment = "#" + pp.restOfLine
    # ppc.integer and ppc.real convert the matched text to int and float
    integer = ppc.integer
    real = ppc.real
    pp.dblQuotedString.setParseAction(pp.removeQuotes)
    scalar_value = real | integer | pp.dblQuotedString
    list_value = pp.Group(LQ + pp.delimitedList(scalar_value) + RQ)
    key = pp.Word(pp.alphas + "_", pp.alphanums + '_')
    key_value = pp.Group(key + EQ + (scalar_value | list_value))
    struct = pp.Forward()
    entry = key_value | pp.Group(key + struct)
    struct <<= pp.Dict(LBRACE + pp.ZeroOrMore(entry) + RBRACE)
    parser = pp.Dict(pp.ZeroOrMore(entry))
    parser.ignore(comment)
    try:
        # return lvm.conf as dict
        if res_type == "dict":
            return parser.parseFile(conf).asDict()
        # return lvm.conf as list
        elif res_type == "list":
            return parser.parseFile(conf).asList()
        else:
            # return lvm.conf as ParseResults
            return parser.parseFile(conf)
    except (pp.ParseBaseException,) as e:
        print("lvm.conf bad format {0}".format(e))
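Best answer
Step 1 should always be to sketch out at least a rough BNF for the format you want to parse. It really helps organize your thoughts before you write any actual code, and it makes you think about the structure and the data you are parsing.
Here is the BNF I came up with for this config (it looks like a Python string because that makes it easy to paste into your code for future reference, but pyparsing does not work with or require such strings; they are purely a design tool):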
BNF = '''
key_struct ::= key struct
struct ::= '{' (key_value | key_struct)... '}'
key_value ::= key '=' (scalar_value | list_value)
key ::= word composed of alphas and '_'
list_value ::= '[' scalar_value [',' scalar_value]... ']'
scalar_value ::= real | integer | double-quoted-string
comment ::= '#' rest-of-line
'''
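Note that the opening and closing {}'s and []'s sit at the same level, rather than having an opener in one expression and its closer in another.
This BNF also allows structs nested within structs, which the sample text you posted does not strictly require, but since your code appeared to support it, I included it.
From here, translating to pyparsing is fairly straightforward, working bottom-up through the BNF: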
import pyparsing as pp
ppc = pp.pyparsing_common

EQ, LBRACE, RBRACE, LQ, RQ = map(pp.Suppress, "={}[]")
comment = "#" + pp.restOfLine
integer = ppc.integer  # or: pp.Regex(r"[+-]?\d+").setParseAction(lambda x: int(x[0]))
real = ppc.real  # or: pp.Regex(r"[+-]?\d+\.\d*").setParseAction(lambda x: float(x[0]))
pp.dblQuotedString.setParseAction(pp.removeQuotes)
scalar_value = real | integer | pp.dblQuotedString
# `delimitedList(expr)` is a shortcut for `expr + ZeroOrMore(',' + expr)`
list_value = pp.Group(LQ + pp.delimitedList(scalar_value) + RQ)
key = pp.Word(pp.alphas + "_", pp.alphanums + '_')
key_value = pp.Group(key + EQ + (scalar_value | list_value))
struct = pp.Forward()
entry = key_value | pp.Group(key + struct)
struct <<= (LBRACE + pp.ZeroOrMore(entry) + RBRACE)
parser = pp.ZeroOrMore(entry)
parser.ignore(comment)
Running this code:
# lvm_source holds the lvm.conf text shown in the question
try:
    res = parser.parseString(lvm_source)
    # print(res.dump())
    res.pprint()
except pp.ParseBaseException as e:
    print("lvm.conf bad format {0}".format(e))
gives this nested list:
[['config',
['checks', 1],
['abort_on_errors', 0],
['profile_dir', '/etc/lvm/profile']],
['local'],
['log',
['verbose', 0],
['silent', 0],
['syslog', 1],
['overwrite', 0],
['level', 0],
['indent', 1],
['command_names', 0],
['prefix', ' '],
['activation', 0],
['debug_classes',
['memory',
'devices',
'activation',
'allocation',
'lvmetad',
'metadata',
'cache',
'locking',
'lvmpolld',
'dbus']]]]
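The BNF also allows structs nested inside structs, which the sample lvm.conf does not exercise. As a quick check, here is a minimal sketch that rebuilds the same grammar and feeds it a made-up nested input. The `build_parser` helper and the `outer`/`inner` section names are hypothetical and exist only for illustration.

```python
import pyparsing as pp

ppc = pp.pyparsing_common


def build_parser():
    """Rebuild the grammar from the answer (helper name is hypothetical)."""
    EQ, LBRACE, RBRACE, LQ, RQ = map(pp.Suppress, "={}[]")
    comment = "#" + pp.restOfLine
    # copy() so the shared pp.dblQuotedString object is left untouched
    quoted = pp.dblQuotedString.copy().setParseAction(pp.removeQuotes)
    scalar_value = ppc.real | ppc.integer | quoted
    list_value = pp.Group(LQ + pp.delimitedList(scalar_value) + RQ)
    key = pp.Word(pp.alphas + "_", pp.alphanums + "_")
    key_value = pp.Group(key + EQ + (scalar_value | list_value))
    # Forward lets struct refer to entry, which in turn refers to struct
    struct = pp.Forward()
    entry = key_value | pp.Group(key + struct)
    struct <<= LBRACE + pp.ZeroOrMore(entry) + RBRACE
    parser = pp.ZeroOrMore(entry)
    parser.ignore(comment)
    return parser


# Made-up input with one struct nested inside another
nested_source = """
# made-up input, not taken from a real lvm.conf
outer {
    inner {
        depth = 2
    }
    level = 1
}
"""

nested = build_parser().parseString(nested_source, parseAll=True).asList()
print(nested)
```

Passing `parseAll=True` makes the parser raise an exception if any trailing text is left unparsed, which is usually what you want for a whole configuration file.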
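I think you would prefer a format where you can access the values by key in a nested dict or hierarchical object. Pyparsing has a class called Dict that does this at parse time, automatically assigning results names to the nested subgroups. Change these two lines so that they automatically dict-ify their sub-entries: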
struct <<= pp.Dict(LBRACE + pp.ZeroOrMore(entry) + RBRACE)
parser = pp.Dict(pp.ZeroOrMore(entry))
Now, if we call dump() instead of pprint(), we see the hierarchical naming:
[['config', ['checks', 1], ['abort_on_errors', 0], ['profile_dir', '/etc/lvm/profile']], ['local'], ['log', ['verbose', 0], ['silent', 0], ['syslog', 1], ['overwrite', 0], ['level', 0], ['indent', 1], ['command_names', 0], ['prefix', ' '], ['activation', 0], ['debug_classes', ['memory', 'devices', 'activation', 'allocation', 'lvmetad', 'metadata', 'cache', 'locking', 'lvmpolld', 'dbus']]]]
- config: [['checks', 1], ['abort_on_errors', 0], ['profile_dir', '/etc/lvm/profile']]
  - abort_on_errors: 0
  - checks: 1
  - profile_dir: '/etc/lvm/profile'
- local: ''
- log: [['verbose', 0], ['silent', 0], ['syslog', 1], ['overwrite', 0], ['level', 0], ['indent', 1], ['command_names', 0], ['prefix', ' '], ['activation', 0], ['debug_classes', ['memory', 'devices', 'activation', 'allocation', 'lvmetad', 'metadata', 'cache', 'locking', 'lvmpolld', 'dbus']]]
  - activation: 0
  - command_names: 0
  - debug_classes: ['memory', 'devices', 'activation', 'allocation', 'lvmetad', 'metadata', 'cache', 'locking', 'lvmpolld', 'dbus']
  - indent: 1
  - level: 0
  - overwrite: 0
  - prefix: ' '
  - silent: 0
  - syslog: 1
  - verbose: 0
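Since the original goal was a JSON-like Python object, the Dict-based grammar can be combined with asDict() to get plain dicts and lists. The following is a minimal sketch under that assumption. The shortened `sample` text and the names `conf` and `as_json` are made up for illustration and are not part of the original code.

```python
import json

import pyparsing as pp

ppc = pp.pyparsing_common

EQ, LBRACE, RBRACE, LQ, RQ = map(pp.Suppress, "={}[]")
comment = "#" + pp.restOfLine
# copy() so the shared pp.dblQuotedString object is left untouched
quoted = pp.dblQuotedString.copy().setParseAction(pp.removeQuotes)
scalar_value = ppc.real | ppc.integer | quoted
list_value = pp.Group(LQ + pp.delimitedList(scalar_value) + RQ)
key = pp.Word(pp.alphas + "_", pp.alphanums + "_")
key_value = pp.Group(key + EQ + (scalar_value | list_value))
struct = pp.Forward()
entry = key_value | pp.Group(key + struct)
# Dict assigns each group's first token as the results name of the rest
struct <<= pp.Dict(LBRACE + pp.ZeroOrMore(entry) + RBRACE)
parser = pp.Dict(pp.ZeroOrMore(entry))
parser.ignore(comment)

# Shortened, made-up excerpt in the lvm.conf format
sample = '''
config {
    # comments are skipped
    checks = 1
    profile_dir = "/etc/lvm/profile"
}
log {
    prefix = " "
    debug_classes = ["memory", "devices"]
}
'''

# asDict() converts the named results into plain dicts and lists
conf = parser.parseString(sample, parseAll=True).asDict()
as_json = json.dumps(conf, indent=2, sort_keys=True)
print(as_json)
```

Because asDict() returns only built-in types here, the result can be handed directly to json.dumps or any other code that expects ordinary dictionaries.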
You can then access the fields using
res['config']['checks']
or
res.log.indent
For more on "python - converting lvm.conf to a Python dict with pyparsing", see the similar question on Stack Overflow: https://stackoverflow.com/questions/54501226/
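To make the two access styles concrete, here is a small sketch that parses a made-up excerpt with the Dict-based grammar and reads values by key and by attribute. The `sample` text and the variable names are assumptions for illustration only.

```python
import pyparsing as pp

ppc = pp.pyparsing_common

EQ, LBRACE, RBRACE, LQ, RQ = map(pp.Suppress, "={}[]")
comment = "#" + pp.restOfLine
# copy() so the shared pp.dblQuotedString object is left untouched
quoted = pp.dblQuotedString.copy().setParseAction(pp.removeQuotes)
scalar_value = ppc.real | ppc.integer | quoted
list_value = pp.Group(LQ + pp.delimitedList(scalar_value) + RQ)
key = pp.Word(pp.alphas + "_", pp.alphanums + "_")
key_value = pp.Group(key + EQ + (scalar_value | list_value))
struct = pp.Forward()
entry = key_value | pp.Group(key + struct)
struct <<= pp.Dict(LBRACE + pp.ZeroOrMore(entry) + RBRACE)
parser = pp.Dict(pp.ZeroOrMore(entry))
parser.ignore(comment)

# Shortened, made-up excerpt in the lvm.conf format
sample = """
config {
    checks = 1
    abort_on_errors = 0
}
log {
    indent = 1
    syslog = 1
}
"""

res = parser.parseString(sample, parseAll=True)

checks = res["config"]["checks"]   # dict-style access
indent = res.log.indent            # attribute-style access
has_config = "config" in res       # membership test on results names
missing = res.log.no_such_key      # absent attributes yield "" rather than raising
print(checks, indent, has_config, repr(missing))
```

Attribute access is convenient, but a key that collides with a ParseResults method name (for example `keys` or `get`) can only be reached with the `res['...']` form, so key access is the safer default for arbitrary configuration names.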