Problem description
I have a complicated problem I am trying to solve. Here it goes:
This is my string:
string RangeNum1 = "R(1 - 9) AND B750KK OR R(4 - 12) AND S750FF OR R(1 - 10) AND Y750RR";
I want to loop through the ranges and, for each number, output the terms whose condition is true. The results should look like this:
OUTPUT
1 B750KK, Y750RR
2 B750KK, Y750RR
3 B750KK, Y750RR
4 B750KK, S750FF, Y750RR
5 B750KK, S750FF, Y750RR
6 B750KK, S750FF, Y750RR
7 B750KK, S750FF, Y750RR
8 B750KK, S750FF, Y750RR
9 B750KK, S750FF, Y750RR
10 S750FF, Y750RR
11 S750FF
12 S750FF
Thanks in advance for your help!
Matt T Heffron's answer was great for my original question.
Is it possible to extend Matt T Heffron's answer to cover the following extra requirements?
EXTRA REQUIREMENTS:
string RangeNum1 = "R(1 - 9) AND B750KK NOT IN StarX OR R(4 - 12) AND S750FF OR R(1 - 10) AND Y750RR NOT IN MoonX";
string mostExist = "B750KK, S750FF, T768RR, F453PP";
string StarX = "B750KK, S750FF, T768RR, F453PP";
string MoonX = "N750KK, D768DD, A453AA";
So basically, the output must meet these conditions before it is written to the final table:
1. The term must exist in the mostExist string. If it doesn't exist in the mostExist string, it is not written to the table.
2. If the clause says NOT IN StarX, then the term must exist in the mostExist string and must not exist in StarX. If it exists in StarX, it should not be written to the final table.
I don't know if my example is clear. I look forward to your help.
Recommended answer
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

namespace ConsoleApplication20
{
    class Program
    {
        static void Main(string[] args)
        {
            string rangeNum1 = "R(1 - 9) AND B750KK OR R(4 - 12) AND S750FF OR R(1 - 10) AND Y750RR";
            ParseAndReportRangeMap(rangeNum1);
        }

        private static readonly Regex AndClauses = new Regex(@"\bR\((?<low>\d+) *- *(?<high>\d+)\) +AND +(?<term>[A-Z0-9]+)", RegexOptions.Compiled | RegexOptions.IgnoreCase);

        private static void ParseAndReportRangeMap(string rangeConditions)
        {
            if (string.IsNullOrWhiteSpace(rangeConditions))
                return;
            SortedDictionary<int, List<string>> rangeMap = new SortedDictionary<int, List<string>>();
            var clauses = AndClauses.Matches(rangeConditions);
            foreach (Match clause in clauses)
            {
                var groups = clause.Groups;
                int low = int.Parse(groups["low"].Value);
                int high = int.Parse(groups["high"].Value);
                string term = groups["term"].Value;
                for (int i = low; i <= high; i++)
                {
                    List<string> terms;
                    if (!rangeMap.TryGetValue(i, out terms))
                    {
                        terms = new List<string>();
                        rangeMap[i] = terms;
                    }
                    terms.Add(term); // not checking for duplicates
                }
            }
            foreach (var item in rangeMap)
            {
                Console.WriteLine("{0} {1}", item.Key, string.Join(", ", item.Value));
            }
        }
    }
}
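The code above covers the original question only. One possible way to extend it to the extra requirements is sketched below; it is not Matt's code, just an illustration under stated assumptions. The names RangeFilterSketch, ToSet, BuildRangeMap and namedSets are made up for this sketch, each clause is assumed to carry at most one "NOT IN <name>" suffix, and an unknown set name is treated as an input error.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class RangeFilterSketch
{
    // Same shape as the AndClauses pattern, plus an optional "NOT IN <name>" tail.
    private static readonly Regex Clause = new Regex(
        @"\bR\((?<low>\d+) *- *(?<high>\d+)\) +AND +(?<term>[A-Z0-9]+)(?: +NOT +IN +(?<set>\w+))?",
        RegexOptions.IgnoreCase);

    // Turns "A, B, C" into a case-insensitive set of trimmed items.
    public static HashSet<string> ToSet(string csv)
    {
        var set = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        foreach (string item in csv.Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries))
            set.Add(item.Trim());
        return set;
    }

    public static SortedDictionary<int, List<string>> BuildRangeMap(
        string rangeConditions,
        HashSet<string> mostExist,
        Dictionary<string, HashSet<string>> namedSets)
    {
        var rangeMap = new SortedDictionary<int, List<string>>();
        foreach (Match clause in Clause.Matches(rangeConditions))
        {
            string term = clause.Groups["term"].Value;

            // Rule 1: the term must be listed in mostExist.
            if (!mostExist.Contains(term)) continue;

            // Rule 2: "NOT IN X" drops the term when X contains it.
            Group setGroup = clause.Groups["set"];
            if (setGroup.Success)
            {
                HashSet<string> excluded;
                if (!namedSets.TryGetValue(setGroup.Value, out excluded))
                    throw new FormatException("Unknown set: " + setGroup.Value); // assumption: fail loudly
                if (excluded.Contains(term)) continue;
            }

            int low = int.Parse(clause.Groups["low"].Value);
            int high = int.Parse(clause.Groups["high"].Value);
            for (int i = low; i <= high; i++)
            {
                List<string> terms;
                if (!rangeMap.TryGetValue(i, out terms))
                {
                    terms = new List<string>();
                    rangeMap[i] = terms;
                }
                terms.Add(term);
            }
        }
        return rangeMap;
    }

    static void Main()
    {
        string rangeNum1 = "R(1 - 9) AND B750KK NOT IN StarX OR R(4 - 12) AND S750FF OR R(1 - 10) AND Y750RR NOT IN MoonX";
        var mostExist = ToSet("B750KK, S750FF, T768RR, F453PP");
        var namedSets = new Dictionary<string, HashSet<string>>(StringComparer.OrdinalIgnoreCase)
        {
            { "StarX", ToSet("B750KK, S750FF, T768RR, F453PP") },
            { "MoonX", ToSet("N750KK, D768DD, A453AA") }
        };
        foreach (var item in BuildRangeMap(rangeNum1, mostExist, namedSets))
            Console.WriteLine("{0} {1}", item.Key, string.Join(", ", item.Value));
    }
}
```

With the sample data from the question, B750KK is dropped because it is in StarX, and Y750RR is dropped because it is not in mostExist, so only rows 4 to 12 are printed, each with S750FF.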
string RangeNum1 = "R(1 - 9) AND B750KK OR R(4 - 12) AND S750FF OR R(1 - 10) AND Y750RR";
string RangeNum2 = "R(1 - 9) AND B750KK NOT IN StarX OR R(4 - 12) AND S750FF OR R(1 - 10) AND Y750RR NOT IN MoonX";
b. I find it helpful to actually go ahead and re-format the input strings; it helps me visualize what can be done:
string RangeNum1 =
"R(1 - 9) AND B750KK
OR R(4 - 12) AND S750FF
OR R(1 - 10) AND Y750RR";
string RangeNum2 =
"R(1 - 9) AND B750KK NOT IN StarX
OR R(4 - 12) AND S750FF
OR R(1 - 10) AND Y750RR NOT IN MoonX";
What "jumps out at me" from studying the examples is that: in both cases you can divide the string to be parsed based on "OR."
c. Then, I think it's helpful to visualize what needs to be stripped away by mentally "boiling down" the string to its essentials:
/*string RangeNum1 =
"1 9 B750KK
4 12 S750FF
1 10 Y750RR";
string RangeNum2 =
"1 9 B750KK NOT IN StarX
4 12 S750FF
1 10 Y750RR NOT IN MoonX";*/
d. Then, move on to start actually coding.
~ 2. initial coding and testing
a. I conclude that if I split either string using "OR" as the delimiter, I will have a set of valid sub-strings to parse. So I can implement, and test, that:
// in Form scope
string[] OrStrings;
string[] split1 = new string[] {"OR"};
// inside some method/function/EventHandler, etc.
// note we use split with an array of string here
OrStrings = RangeNum2.Split(split1, StringSplitOptions.RemoveEmptyEntries);
b. Then, having tested that, and further having "immersed myself in the data," I can think about what I need to get rid of, to trim away, to leave me with the minimal usable set of parameters to be used in actually implementing the parsing.
c. Since I know that each item rendered by the split operation needs to be processed, I can now make a first attempt to define the loop, and to remove extraneous information in the loop:
foreach (string orstr in OrStrings)
{
    // trim white space front/back, just in case
    string finalString = orstr.Trim();
    // remove the characters "R("
    finalString = finalString.Remove(0, 2);
    // remove ") AND"
    finalString = finalString.Replace(") AND", "");
    // remove "-"
    finalString = finalString.Replace("-", "");
}
d. After running this test, and carefully observing the results of parsing both the original and the revised input strings, I start thinking about what needs to be done next. At this point I will usually make notes for the future, like:
1. for final code: need to use StringBuilder !
2. for final code: can I somehow improve, or combine, the Replace and Remove operations ?
e. Since I observe that I can handle the first three items in 'finalString in both the original and the revised input format, I temporarily ignore the additional content in the revised example at this stage. My next task is to make the solution work using only the first three entries. That will tell me to what extent I can reuse the code for parsing the first input data format when parsing the revised input data format.
f. So now I focus on how to split the substrings in 'finalString usefully, and it's obvious that I need to split using a space character only.
// in Form scope
string[] rangeStrings;
// note we use split with an array of char here
char[] split2 = new char[] {' '};
// inside the main loop that parses the original input string: see 2.c. above
// get the start index, end index, and value
rangeStrings = finalString.Split(split2, StringSplitOptions.RemoveEmptyEntries);
// check to make sure I have a valid entry
if (rangeStrings.Length < 3 || rangeStrings.Length > 6) throw new IndexOutOfRangeException();
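As a possible follow-up to note d.2 above, the Trim, Remove, Replace and Split steps can be folded into a single split on several separator characters. This is only a sketch under assumptions: ClauseTokenizerSketch and Tokenize are hypothetical names, the clauses are split on " OR " with surrounding spaces (so a term that merely contains the letters "OR" is not cut in half), and no term is literally "R" or "AND" or contains a space, parenthesis or hyphen.

```csharp
using System;
using System.Collections.Generic;

static class ClauseTokenizerSketch
{
    private static readonly string[] OrSeparator = { " OR " };
    private static readonly char[] TokenSeparators = { ' ', '(', ')', '-' };

    // Returns one token array per OR-clause, e.g. { "1", "9", "B750KK", "NOT", "IN", "StarX" }.
    public static List<string[]> Tokenize(string rangeConditions)
    {
        var result = new List<string[]>();
        foreach (string clause in rangeConditions.Split(OrSeparator, StringSplitOptions.RemoveEmptyEntries))
        {
            var tokens = new List<string>();
            foreach (string token in clause.Split(TokenSeparators, StringSplitOptions.RemoveEmptyEntries))
            {
                // "R" and "AND" are syntax only; skip them.
                if (token == "R" || token == "AND") continue;
                tokens.Add(token);
            }
            result.Add(tokens.ToArray());
        }
        return result;
    }

    static void Main()
    {
        string rangeNum2 = "R(1 - 9) AND B750KK NOT IN StarX OR R(4 - 12) AND S750FF OR R(1 - 10) AND Y750RR NOT IN MoonX";
        foreach (string[] tokens in Tokenize(rangeNum2))
            Console.WriteLine(string.Join(" | ", tokens));
    }
}
```

Each resulting array has the same layout as the rangeStrings array above: start, end, key, and then any trailing condition words.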
So now I have an array of either three items (original string), or six items (revised string). I can now "sketch in" what my code is going to look like for processing these:
// will definitely need to reuse the key terms, like 'S750FF, in each entry to be parsed.
string key = rangeStrings[2];
// in every case the key string must be in the 'mustInclude string !
if (! mustInclude.Contains(key)) continue;
// do we need to consider the revised format
bool doProcess = true;
if (rangeStrings.Length == 6)
{
    // test the two cases in which we'll exclude the current entry
    // from being processed further
    // tbd: this kind of sucks: clean this up
    if (rangeStrings[3] + rangeStrings[4] == "NOTIN")
    {
        // the exclusion key is always in position #5 ... we hope
        string testString = rangeStrings[5];
        if (testString == "MoonX")
        {
            if (MoonX.Contains(key)) doProcess = false;
        }
        else if (testString == "StarX")
        {
            if (StarX.Contains(key)) doProcess = false;
        }
    }
}
// keep going ?
if (! doProcess) continue;
// now we're going to need the range in integers
int start = Convert.ToInt32(rangeStrings[0]);
int end = Convert.ToInt32(rangeStrings[1]);
// create the final data structure [1]
buildOutput(start, end, key);
// create a report ?
// tbd
Once again, at this point, I'd "step back," and make notes, or insert comments in the code, maybe go back and refactor the code into separate method calls for clarity. Test possible revisions of the code for improved efficiency, or memory conservation, etc.
Obviously I'll have to implement a method 'buildOutput that takes the integer values for 'start and 'end, and the string 'key as parameters.
[1] But, what kind of data structure do I want/need to build ? Or, do I even need to build a data structure ?
There are many possibilities for that, and, imho, this is the time to think about what the data structure, if any, should be, because: depending on what your overall goal is, and the extent to which the "distilled" data needs to be reused in your application, you might make very different choices.
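One of those many possibilities, shown purely as an illustration: a SortedDictionary keyed by row number, wrapped in a small class. RangeTable and Report are hypothetical names, and the choice to skip duplicate keys within a row is an assumption, not something the question specifies.

```csharp
using System;
using System.Collections.Generic;

sealed class RangeTable
{
    // Row number -> keys that apply to that row, kept in insertion order.
    private readonly SortedDictionary<int, List<string>> rows =
        new SortedDictionary<int, List<string>>();

    public void BuildOutput(int start, int end, string key)
    {
        for (int i = start; i <= end; i++)
        {
            List<string> keys;
            if (!rows.TryGetValue(i, out keys))
            {
                // Rows are created on demand, so no place-holders are needed.
                keys = new List<string>();
                rows[i] = keys;
            }
            if (!keys.Contains(key)) keys.Add(key); // assumption: no duplicates per row
        }
    }

    // One line per row, sorted by row number.
    public IEnumerable<string> Report()
    {
        foreach (KeyValuePair<int, List<string>> row in rows)
            yield return row.Key + " " + string.Join(", ", row.Value);
    }
}

static class RangeTableDemo
{
    static void Main()
    {
        var table = new RangeTable();
        table.BuildOutput(4, 12, "S750FF"); // added first, even though it starts at row 4
        table.BuildOutput(1, 10, "Y750RR");
        foreach (string line in table.Report())
            Console.WriteLine(line);
    }
}
```

Because the dictionary sorts by key, rows come out in numeric order regardless of the order in which ranges were added; only the order of keys inside a row depends on the order of the calls.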
~ 3. summary
The overall strategy I'm trying to demonstrate with all this (and, it's only one of many possible strategies ... one that reflects my temperament and biases) is what some would call "divide and conquer," or "incremental" or "step-wise" solution. I like to think of it as a strategy where you alternate coding with "meditating on your data (working result set)," and testing, breaking off "small chunks," and solving the problems inherent in those "chunks," and then moving on to "bigger picture" aspects of the solution.
At each stage of the solution process, I think it's valuable to "pull back," reflect on what's been done, and make notes for future improvements (perhaps as comments in your code) to be implemented once you have reached the proof-of-concept stage.
~ 4. questions for you
Have you tried running the first two solutions on this thread with the revised RangeNum string, to see what happens ?
Have you tried modifying Matt's code to handle the revised data ?
Depending on where you are in your software career, is this the right time for you to make a major investment to learn RegEx (imho, RegEx is a programming language in "its own right") ?
What are you doing now to solve parsing the revised problem ?
The revised problem poses a challenge: the first entry that produces output is the one that puts 'S750FF in rows 4~12, and at that point the output is still "empty." Depending on whatever data structure you use to store your "results," that may require you to create "empty place-holders" for rows 1~3, so that the first occurrence of 'S750FF lands at position #4 and the next parsing step, which puts 'Y750RR in positions 1~10, gives correct results.
That's an example of where a two-pass parsing process might become valuable: if you parsed the 'Y750RR entry before the 'S750FF entry, the first ten rows would already be defined ... although you'd still need to create new rows for 'S750FF at positions 11~12. Such optimizations become "worth the time/money" to develop when your application is under "high load," and performance is critical.
In considering a strategy to handle both the original string format, and the revised string format, it's important to clarify how "robust" the solution has to be in terms of handling future variations in the data.
For example, is it the case that in the future, you may need to deal with something like:
string RangeNum3 = "R(1 - 9) AND B750KK NOT IN StarX NOT IN MOONX OR R(4 - 12) AND S750FF OR R(1 - 10) AND Y750RR IN STARX NOT IN MoonX";
Where there are multiple logic clauses, to determine if the case is handled ?
Some clarification, please.
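Pending that clarification, here is a rough sketch of how several conditions per clause could be evaluated. It assumes, without any confirmation from the question, that all conditions in a clause must hold together (logical AND), that "IN X" requires membership and "NOT IN X" forbids it, and that the clause has already been reduced to tokens laid out as start, end, key, then the condition words. ConditionSketch and Passes are hypothetical names; the mostExist check is left out and would be applied separately.

```csharp
using System;
using System.Collections.Generic;

static class ConditionSketch
{
    // tokens: { start, end, key, [NOT] IN <set>, [NOT] IN <set>, ... }
    public static bool Passes(string[] tokens, Dictionary<string, HashSet<string>> namedSets)
    {
        string key = tokens[2];
        int i = 3;
        while (i < tokens.Length)
        {
            bool negated = false;
            if (string.Equals(tokens[i], "NOT", StringComparison.OrdinalIgnoreCase))
            {
                negated = true;
                i++;
            }
            if (i + 1 >= tokens.Length ||
                !string.Equals(tokens[i], "IN", StringComparison.OrdinalIgnoreCase))
                throw new FormatException("Expected IN <set> or NOT IN <set>.");

            HashSet<string> set;
            if (!namedSets.TryGetValue(tokens[i + 1], out set))
                throw new FormatException("Unknown set: " + tokens[i + 1]);

            // "IN" fails for non-members, "NOT IN" fails for members.
            bool isMember = set.Contains(key);
            if (isMember == negated) return false;
            i += 2;
        }
        return true; // every condition held (or there were none)
    }

    static void Main()
    {
        var ignoreCase = StringComparer.OrdinalIgnoreCase;
        var namedSets = new Dictionary<string, HashSet<string>>(ignoreCase)
        {
            { "StarX", new HashSet<string>(new[] { "B750KK", "S750FF", "T768RR", "F453PP" }, ignoreCase) },
            { "MoonX", new HashSet<string>(new[] { "N750KK", "D768DD", "A453AA" }, ignoreCase) }
        };
        string[][] clauses =
        {
            new[] { "1", "9", "B750KK", "NOT", "IN", "StarX", "NOT", "IN", "MOONX" },
            new[] { "4", "12", "S750FF" },
            new[] { "1", "10", "Y750RR", "IN", "STARX", "NOT", "IN", "MoonX" }
        };
        foreach (string[] clause in clauses)
            Console.WriteLine("{0}: {1}", clause[2], Passes(clause, namedSets));
    }
}
```

If the real rule turns out to be different (for example, conditions combined with OR), only the body of the loop would need to change.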
using System.Text.RegularExpressions;
using System.Windows.Forms; // for MessageBox

Regex rx = new Regex(@"\bR\((?<range>\d+ ?- ?\d+)\)",
    RegexOptions.Compiled | RegexOptions.IgnoreCase);
string text = "R(1 - 9) AND B750KK OR R(4 - 12) AND S750FF OR R(1 - 10) AND Y750RR";
// Find matches.
MatchCollection matches = rx.Matches(text);
foreach (Match match in matches)
{
    GroupCollection groups = match.Groups;
    // show the text captured between the parentheses, e.g. "1 - 9"
    MessageBox.Show(groups["range"].Value);
}