Problem description
I have a text file whose contents I receive as a formatted string, like this:
TABLE OperatorPoints \r\nHEADING\r\nDGName InstanceName OP1_X OP1_Y OP2_X OP2_Y\r\nDATA\r\nG1[1,1] L1[1,1] 251.8419 -804.6282 768.8362 -290.5563\r\nG1[1,1] die[1,1] 357.8950 -714.6857 652.0683 -395.6490\r\nEND
In the text file itself, the content looks like this:
TABLE OperatorPoints
HEADING
DGName InstanceName OP1_X OP1_Y OP2_X OP2_Y
DATA
G1[1,1] L1[1,1] 251.8419 -804.6282 768.8362 -290.5563
G1[1,1] valve[1,1] 357.8950 -714.6857 652.0683 -395.6490
END
The contents may vary; however, the headers (DGName, InstanceName, etc.) are always the same.
I am only interested in the values and need to read each one and store it in a class object. The values may vary in length: for example, instead of G1[1,1] there might be longer text, and the same goes for the other cells. Cells may also be missing, for example the 251.8419 under OP1_X. What I have in mind is a dictionary <string,int> holding each header and its 'cell' width, so that I can read each value from an exact position.
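The dictionary-of-widths idea could be sketched roughly as below. This is only an illustration under a strong assumption: that every column in the file is padded to a known, fixed width. The widths shown and the name `FixedWidthReader` are hypothetical and would have to match the real file. An ordered list of pairs is used instead of a `Dictionary<string,int>` because a dictionary does not guarantee enumeration order, and the column order matters here.

```csharp
using System;
using System.Collections.Generic;

public static class FixedWidthReader
{
    // Hypothetical column widths, listed in header order.
    // These numbers are assumptions; adjust them to the actual file layout.
    public static readonly List<KeyValuePair<string, int>> Columns =
        new List<KeyValuePair<string, int>>
        {
            new KeyValuePair<string, int>("DGName", 10),
            new KeyValuePair<string, int>("InstanceName", 12),
            new KeyValuePair<string, int>("OP1_X", 10),
            new KeyValuePair<string, int>("OP1_Y", 10),
            new KeyValuePair<string, int>("OP2_X", 10),
            new KeyValuePair<string, int>("OP2_Y", 10),
        };

    // Cuts one data line into cells by position. A cell that is blank
    // (or lies beyond the end of a short line) becomes an empty string.
    public static Dictionary<string, string> ParseLine(string line)
    {
        var values = new Dictionary<string, string>();
        int start = 0;
        foreach (KeyValuePair<string, int> column in Columns)
        {
            string cell = string.Empty;
            if (start < line.Length)
            {
                int length = Math.Min(column.Value, line.Length - start);
                cell = line.Substring(start, length).Trim();
            }
            values[column.Key] = cell;
            start += column.Value;
        }
        return values;
    }
}
```

Because each cell is read by position, a blank OP1_X leaves OP1_Y and the later columns where they belong. This only helps if the producer of the file really pads the columns; with single-space separators, as in the sample above, the positions carry no information.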
What I have tried:
Right now I am separating each line:
string[] result = myString.Split(new string[] { "\n", "\r\n" }, StringSplitOptions.RemoveEmptyEntries);
and then use foreach on each line. The disadvantage is that if a value is missing, all the remaining values on that line shift to the left, so I end up with a missing value in the last column instead.
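The shifting described above can be reproduced with a small sketch. The names `ShiftDemo`, `SplitCells` and `HasMissingCells` are made up for illustration. Splitting on whitespace can at least reveal that a row is short, although it cannot tell which cell was missing.

```csharp
using System;

public static class ShiftDemo
{
    // Splits a data line on spaces and tabs, dropping empty entries.
    public static string[] SplitCells(string line)
    {
        return line.Split(new[] { ' ', '\t' }, StringSplitOptions.RemoveEmptyEntries);
    }

    // True when the line has fewer cells than the header row has columns.
    public static bool HasMissingCells(string line, int expectedColumns)
    {
        return SplitCells(line).Length < expectedColumns;
    }
}
```

For a line whose OP1_X cell is absent, `SplitCells` returns five items and the OP1_Y value lands at index 2, the slot that belongs to OP1_X.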
Recommended answer
using System.Collections;
using System.Data;
using System.IO;
using System.Text;

public class CsvParser
{
    public static DataTable Parse(string data, bool headers)
    {
        return Parse(new StringReader(data), headers);
    }

    public static DataTable Parse(string data)
    {
        return Parse(new StringReader(data));
    }

    public static DataTable Parse(TextReader stream)
    {
        return Parse(stream, false);
    }

    public static DataTable Parse(TextReader stream, bool headers)
    {
        DataTable table = new DataTable();
        CsvStream csv = new CsvStream(stream);
        string[] row = csv.GetNextRow();
        if (row == null)
            return null;
        if (headers)
        {
            // use the first row as column names, generating a name where
            // a header is empty or duplicated
            foreach (string header in row)
            {
                if (header != null && header.Length > 0 && !table.Columns.Contains(header))
                    table.Columns.Add(header, typeof(string));
                else
                    table.Columns.Add(GetNextColumnHeader(table), typeof(string));
            }
            row = csv.GetNextRow();
        }
        while (row != null)
        {
            // grow the table if this row has more items than columns so far
            while (row.Length > table.Columns.Count)
                table.Columns.Add(GetNextColumnHeader(table), typeof(string));
            table.Rows.Add(row);
            row = csv.GetNextRow();
        }
        return table;
    }

    // returns the first unused name of the form Column1, Column2, ...
    private static string GetNextColumnHeader(DataTable table)
    {
        int c = 1;
        while (true)
        {
            string h = "Column" + c++;
            if (!table.Columns.Contains(h))
                return h;
        }
    }

    private class CsvStream
    {
        private TextReader stream;

        public CsvStream(TextReader s)
        {
            stream = s;
        }

        public string[] GetNextRow()
        {
            ArrayList row = new ArrayList();
            while (true)
            {
                string item = GetNextItem();
                if (item == null)
                    return row.Count == 0 ? null : (string[])row.ToArray(typeof(string));
                row.Add(item);
            }
        }

        private bool EOS = false;
        private bool EOL = false;

        private string GetNextItem()
        {
            if (EOL)
            {
                // previous item was last in line, start new line
                EOL = false;
                return null;
            }
            bool quoted = false;
            bool predata = true;
            bool postdata = false;
            StringBuilder item = new StringBuilder();
            while (true)
            {
                char c = GetNextChar(true);
                if (EOS)
                    return item.Length > 0 ? item.ToString() : null;
                if ((postdata || !quoted) && c == ',')
                    // end of item, return
                    return item.ToString();
                if ((predata || postdata || !quoted) && (c == '\x0A' || c == '\x0D'))
                {
                    // we are at the end of the line, eat newline characters and exit
                    EOL = true;
                    if (c == '\x0D' && GetNextChar(false) == '\x0A')
                        // new line sequence is 0D0A
                        GetNextChar(true);
                    return item.ToString();
                }
                if (predata && c == ' ')
                    // whitespace preceding data, discard
                    continue;
                if (predata && c == '"')
                {
                    // quoted data is starting
                    quoted = true;
                    predata = false;
                    continue;
                }
                if (predata)
                {
                    // data is starting without quotes
                    predata = false;
                    item.Append(c);
                    continue;
                }
                if (c == '"' && quoted)
                {
                    if (GetNextChar(false) == '"')
                        // double quotes within quoted string means add a quote
                        item.Append(GetNextChar(true));
                    else
                        // end-quote reached
                        postdata = true;
                    continue;
                }
                // all cases covered, character must be data
                item.Append(c);
            }
        }

        private char[] buffer = new char[4096];
        private int pos = 0;
        private int length = 0;

        // returns the next character; when eat is false the character is
        // only peeked at and stays in the buffer
        private char GetNextChar(bool eat)
        {
            if (pos >= length)
            {
                length = stream.ReadBlock(buffer, 0, buffer.Length);
                if (length == 0)
                {
                    EOS = true;
                    return '\0';
                }
                pos = 0;
            }
            if (eat)
                return buffer[pos++];
            else
                return buffer[pos];
        }
    }
}