I have the following peg.js script:

start = name*

name = '** name ' var ws 'var:' vr:var ws 'len:' n:num? ws 'label:' lb:label? 'type:' ws t:type? '**\n'
  {return {NAME: vr,
           LENGTH: n,
           LABEL:lb,
           TYPE: t
  }}

type = 'CHAR'/'NUM'
var = $([a-zA-Z_][a-zA-Z0-9_]*)
label = p:labChar* { return p.join('')}
labChar = [^'"<>|\*\/]
ws = [\t\r ]
num  = n:[0-9]+ {return n.join('')}


which I use to parse:

** name a1 var:a1 len:9 label:The is the label for a1 type:NUM **
** name a2 var:a2 len: label:The is the label for a2 type:CHAR **
** name a3 var:a3 len:67 label: type: **


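I have two problems.

First, the text I am parsing contains certain value labels, such as "var:", "len:", "label:" and "type:". These labels are fixed, so I would like to use them to mark the boundary between one value and the next.

Second, I need to allow for missing values.

Am I going about this the right way? At the moment my script merges the label value with the type and then fails with this error: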

Line 1, column 64: Expected "type:" or [^'"<>|*/] but "*" found.


Also, can I do this with blocks of text? I tried parsing:

** name a1 var:a1 len:9 label:The is the label for a1 type:NUM **
** name a2 var:a2 len: label:The is the label for a2 type:CHAR **

randomly created text ()= that I would like to keep

** name b1 var:b1 len:9 label:This is the label for b1 type:NUM **
** name b2 var:b2 len: label:This is the label for b2 type:CHAR **

more text


by changing the first line of the script and adding the following:

start = (name/random)*

random = r:.+ (!'** name')
    {return {RANDOM: r.join('')}}


My end result would be:

[
   [{
      "NAME": "a1",
      "LENGTH": "9",
      "LABEL": "The is the label for a1",
      "TYPE": "NUM"
   },
   {
      "NAME": "a2",
      "LENGTH": null,
      "LABEL": "The is the label for a2",
      "TYPE": "CHAR"
   },
   {"RANDOM":"randomly created text ()= that I would like to keep"}],
   [{
      "NAME": "b1",
      "LENGTH": "9",
      "LABEL": "This is the label for b1",
      "TYPE": "NUM"
   },
   {
      "NAME": "b2",
      "LENGTH": null,
      "LABEL": "This is the label for b2",
      "TYPE": "CHAR"
   },
   {"RANDOM":"more text "}]
]

Best answer

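You need a negative lookahead, !(ws 'type:'). Without it the label rule is too greedy and consumes the rest of the line, including the 'type:' field, up to the closing '**'.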
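
The effect of the lookahead can be sketched outside peg.js with JavaScript regular expressions, which support the same negative-lookahead idea. This is only an illustration under that assumption, not part of the grammar; greedyLabel and guardedLabel are hypothetical names chosen for the example.

```javascript
// Illustration only: JavaScript regexes standing in for the peg.js label rule.
const input = "The is the label for a1 type:NUM **";

// Like labChar*: takes every allowed character, including " type:NUM ".
const greedyLabel = /^[^'"<>|*\/]*/;

// Like (!(ws 'type:') labChar)*: stops before whitespace followed by "type:".
const guardedLabel = /^(?:(?![\t\r ]type:)[^'"<>|*\/])*/;

const greedy = input.match(greedyLabel)[0];
const guarded = input.match(guardedLabel)[0];

console.log(JSON.stringify(greedy));  // "The is the label for a1 type:NUM "
console.log(JSON.stringify(guarded)); // "The is the label for a1"
```

The greedy version stops only at the first excluded character, the '*', which is where the parser in the question then reports that it expected "type:".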

As a side note, you can use the $() syntax instead of {return n.join('')} to get the matched text of the elements.
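
For context, here is a small sketch in plain JavaScript of what the two forms do. A peg.js match of [0-9]+ hands the action an array of single-character strings, which is why the original rule joined them, whereas $() returns the matched text directly. The array below is a hand-written stand-in for what the parser would pass in, and the variable names are hypothetical.

```javascript
// Stand-in for what peg.js passes to the action of  n:[0-9]+  on input "67".
const digits = ["6", "7"];

// What the original action {return n.join('')} computes.
const joinedDigits = digits.join("");

console.log(joinedDigits); // "67", the same string $([0-9]+) would return directly
```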

start = name*

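// One record per line; the trailing newline is optional so the last line still matches.
name = '** name ' var ws 'var:' vr:var ws 'len:' n:num? ws 'label:' lb:label? ws 'type:' t:type? ws '**' '\n'?
  {return {NAME: vr,
           LENGTH: n,
           LABEL:lb,
           TYPE: t
  }}

var = $([a-zA-Z_][a-zA-Z0-9_]*)

num  = $([0-9]+)

// Consume label characters only while the next input is not ws followed by 'type:'.
label = $((!(ws 'type:') [^'"<>|\*\/])*)

type = 'CHAR'/'NUM'

ws = [\t\r ]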


Output:

[
   {
      "NAME": "a1",
      "LENGTH": "9",
      "LABEL": "The is the label for a1",
      "TYPE": "NUM"
   },
   {
      "NAME": "a2",
      "LENGTH": null,
      "LABEL": "The is the label for a2",
      "TYPE": "CHAR"
   },
   {
      "NAME": "a3",
      "LENGTH": "67",
      "LABEL": "",
      "TYPE": null
   }
]
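
The answer above does not address the second part of the question, mixing records with free text. In the question's random rule, .+ consumes all remaining input before the lookahead is ever checked; the usual PEG idiom would presumably be something like random = $((!'** name' .)+), which tests the lookahead before every character. The sketch below shows the same idea in plain JavaScript rather than peg.js: it works line by line, assumes a single space between fields, and returns a flat list rather than the nested arrays in the question. recordLine, parseBlocks and sample are hypothetical names.

```javascript
// Illustration only: a line-by-line stand-in for  start = (name / random)*.
// The pattern mirrors the grammar above; it is not generated by peg.js.
const recordLine =
  /^\*\* name [A-Za-z_]\w* var:(?<name>[A-Za-z_]\w*) len:(?<len>\d+)? label:(?<label>(?:(?! type:)[^'"<>|*\/])*) type:(?<type>CHAR|NUM)? \*\*$/;

function parseBlocks(text) {
  const out = [];
  let pending = []; // free-text lines collected since the last record

  // Emit the collected free text, if any, as one RANDOM entry.
  const flush = () => {
    const chunk = pending.join("\n").trim();
    if (chunk) out.push({ RANDOM: chunk });
    pending = [];
  };

  for (const line of text.split("\n")) {
    const m = recordLine.exec(line.trimEnd());
    if (m) {
      flush();
      out.push({
        NAME: m.groups.name,
        LENGTH: m.groups.len ?? null, // missing value becomes null
        LABEL: m.groups.label,        // empty label stays ""
        TYPE: m.groups.type ?? null,
      });
    } else {
      pending.push(line);
    }
  }
  flush();
  return out;
}

const sample = [
  "** name a1 var:a1 len:9 label:The is the label for a1 type:NUM **",
  "** name a2 var:a2 len: label:The is the label for a2 type:CHAR **",
  "",
  "randomly created text ()= that I would like to keep",
  "",
  "** name b1 var:b1 len:9 label:This is the label for b1 type:NUM **",
  "",
  "more text",
].join("\n");

console.log(JSON.stringify(parseBlocks(sample), null, 2));
```

Any line that does not match the record pattern is treated as free text, so a malformed record would silently end up in a RANDOM entry; a real grammar may want to report that as an error instead.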

Regarding javascript - Peg.js differentiate between missing value and whitespace, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/52554566/
