Problem description
I want to extract some string data from a given file. The file has a structure such as:
name, catg, {y:2006, v:1000, c:100, vt:1}, {y:2007, v:1000, c:100, vt:1}, .. {..} ..
I want to extract the following values:
- name;
- catg;
- the numbers after the y, v, c, vt labels.
I used the following regexes:
- @"(?<name>\w+), (?<cat>\w+)" for extraction of the first two items;
- @"(?:\{y:(?<y>\d+), +v:(?<v>\d+), +c:(?<c>\d+), +vt:(?<vt>\d+)\}, ?)+" for extraction of the other values enclosed in curly brackets.
I concatenated those two and tested the result in a regex tester. But, as expected, I get only one set of extracted numbers, and I also need the result from the other part ({y:2007, v:1000, c:100, vt:1}). Moreover, there could be more than two parts.
How do I fix my regex? And then how do I collect all the number sets from the corresponding parts?
Recommended answer
Here's the fixed regex (you need to specify the IgnorePatternWhitespace option):
(?'name'\w+), \s*
(?'category'\w+), \s*
(?:
\{ \s*
y: (?'y'\d+), \s*
v: (?'v'\d+), \s*
c: (?'c'\d+), \s*
vt: (?'vt'\d+)
\} \s*
,? \s*
)*
Here's the usage:
using System;
using System.Text.RegularExpressions;

String input = @"name, catg, {y:2006, v:1000, c:100, vt:1}, {y:2007, v:1000, c:100, vt:1}";
// Whitespace and line breaks inside the pattern are ignored because of IgnorePatternWhitespace.
String pattern =
    @"(?'name'\w+), \s*
    (?'category'\w+), \s*
    (?:
    \{ \s*
    y: (?'y'\d+), \s*
    v: (?'v'\d+), \s*
    c: (?'c'\d+), \s*
    vt: (?'vt'\d+)
    \} \s*
    ,? \s*
    )* ";
RegexOptions options = RegexOptions.IgnoreCase | RegexOptions.IgnorePatternWhitespace | RegexOptions.Singleline;
Match match = Regex.Match(input, pattern, options);
if (match.Success)
{
    String name = match.Groups["name"].Value;
    String category = match.Groups["category"].Value;
    Console.WriteLine("name = {0}, category = {1}", name, category);
    // Each pass through the repeated (?: ... )* block adds one capture to y, v, c and vt,
    // so index i addresses the i-th brace-enclosed part.
    for (Int32 i = 0; i < match.Groups["y"].Captures.Count; ++i)
    {
        Int32 y = Int32.Parse(match.Groups["y"].Captures[i].Value);
        Int32 v = Int32.Parse(match.Groups["v"].Captures[i].Value);
        Int32 c = Int32.Parse(match.Groups["c"].Captures[i].Value);
        Int32 vt = Int32.Parse(match.Groups["vt"].Captures[i].Value);
        Console.WriteLine("y = {0}, v = {1}, c = {2}, vt = {3}", y, v, c, vt);
    }
}
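
A note on why the Captures loop is needed, with a minimal sketch that is not part of the original answer. In .NET, a named group placed inside a repeated construct keeps every substring it matched in Group.Captures, while Group.Value returns only the last one. A tester that displays just the group value may therefore appear to yield a single set of numbers. The RecordParser class and its Parse and LastY methods below are hypothetical names introduced for illustration, and the sketch assumes a compiler with value tuple support (C# 7 or later).

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of the original answer.
public static class RecordParser
{
    // Same pattern as the answer above. Layout whitespace is ignored
    // because of RegexOptions.IgnorePatternWhitespace.
    private static readonly Regex Pattern = new Regex(
        @"(?'name'\w+), \s*
          (?'category'\w+), \s*
          (?:
            \{ \s*
              y:  (?'y'\d+),  \s*
              v:  (?'v'\d+),  \s*
              c:  (?'c'\d+),  \s*
              vt: (?'vt'\d+)
            \} \s*
            ,? \s*
          )*",
        RegexOptions.IgnorePatternWhitespace);

    // Returns one (Y, V, C, Vt) tuple per brace-enclosed part, in input order.
    // Returns an empty list when the input does not match at all.
    public static List<(int Y, int V, int C, int Vt)> Parse(string input)
    {
        var result = new List<(int Y, int V, int C, int Vt)>();
        Match match = Pattern.Match(input);
        if (!match.Success)
        {
            return result;
        }

        // All four groups sit in the same repeated block, so they hold
        // the same number of captures and can be indexed in parallel.
        CaptureCollection ys = match.Groups["y"].Captures;
        CaptureCollection vs = match.Groups["v"].Captures;
        CaptureCollection cs = match.Groups["c"].Captures;
        CaptureCollection vts = match.Groups["vt"].Captures;

        for (int i = 0; i < ys.Count; i++)
        {
            result.Add((int.Parse(ys[i].Value), int.Parse(vs[i].Value),
                        int.Parse(cs[i].Value), int.Parse(vts[i].Value)));
        }
        return result;
    }

    // Group.Value exposes only the last capture of a repeated group.
    public static string LastY(string input)
    {
        return Pattern.Match(input).Groups["y"].Value;
    }
}
```

For the sample input from the usage above, Parse returns two tuples, (2006, 1000, 100, 1) and (2007, 1000, 100, 1), while LastY returns "2007".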