本文介绍了如何在OCaml中将字符串解析为正则表达式类型的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!
问题描述
我们定义这样的正则表达式类型:
We define a regex type like this:
type regex_t =
| Empty_String
| Char of char
| Union of regex_t * regex_t
| Concat of regex_t * regex_t
| Star of regex_t
我们要编写一个函数string_to_regex: string -> regex_t
.
-
Empty_string
的唯一字符是'E' -
Char
的唯一字符是'a'..'z' - '|'用于
Union
- '*'用于
Star
假定 -
Concat
可以进行连续解析. - '('/')'具有最高优先级,然后是星号,然后是concat,然后是联合
- The only char for
Empty_string
is 'E' - The only chars for
Char
are 'a'..'z' - '|' is for
Union
- '*' is for
Star
Concat
is assumed for continuous parsing.- '(' / ')' have highest Precedence, then star, then concat, then union
例如
(a|E)*(a|b)
将是
Concat(Star(Union(Char 'a',Empty_String)),Union(Char 'a',Char 'b'))
如何实现string_to_regex
?
推荐答案
Ocamllex和menhir是编写词法分析器和解析器的绝佳工具
Ocamllex and menhir are wonderful tools to write lexers and parsers
ast.mli
type regex_t =
| Empty
| Char of char
| Concat of regex_t * regex_t
| Choice of regex_t * regex_t
| Star of regex_t
lexer.mll
{ open Parser }
rule token = parse
| ['a'-'z'] as c { CHAR c }
| 'E' { EMPTY }
| '*' { STAR }
| '|' { CHOICE }
| '(' { LPAR }
| ')' { RPAR }
| eof { EOF }
parser.mly
%{ open Ast %}
%token <char> CHAR
%token EMPTY STAR CHOICE LPAR RPAR CONCAT
%token EOF
%nonassoc LPAR EMPTY CHAR
%left CHOICE
%left STAR
%left CONCAT
%start main
%type <Ast.regex_t> main
%%
main: r = regex EOF { r }
regex:
| EMPTY { Empty }
| c = CHAR { Char c }
| LPAR r = regex RPAR { r }
| a = regex CHOICE b = regex { Choice(a, b) }
| r = regex STAR { Star r }
| a = regex b = regex { Concat(a, b) } %prec CONCAT
main.ml
open Ast
let rec format_regex = function
| Empty -> "Empty"
| Char c -> "Char " ^ String.make 1 c
| Concat(a, b) -> "Concat("^format_regex a^", "^format_regex b^")"
| Choice(a, b) -> "Choice("^format_regex a^", "^format_regex b^")"
| Star(a) -> "Star("^format_regex a^")"
let () =
let s = read_line () in
let r = Parser.main Lexer.token (Lexing.from_string s) in
print_endline (format_regex r)
并进行编译
ocamllex lexer.mll
menhir parser.mly
ocamlc -c ast.mli
ocamlc -c parser.mli
ocamlc -c parser.ml
ocamlc -c lexer.ml
ocamlc -c main.ml
ocamlc -o regex parser.cmo lexer.cmo main.cmo
然后
$ ./regex
(a|E)*(a|b)
Concat(Star(Choice(Char a, Empty)), Choice(Char a, Char b))
这篇关于如何在OCaml中将字符串解析为正则表达式类型的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!