Parsing a formatted text file

Problem description

I have a text file whose contents I receive as a single formatted string, like this:
TABLE OperatorPoints \r\nHEADING\r\nDGName InstanceName OP1_X OP1_Y OP2_X OP2_Y\r\nDATA\r\nG1[1,1] L1[1,1] 251.8419 -804.6282 768.8362 -290.5563\r\nG1[1,1] die[1,1] 357.8950 -714.6857 652.0683 -395.6490\r\nEND
In the text file, the same content is formatted as shown below:

TABLE OperatorPoints
HEADING
DGName                  InstanceName        OP1_X        OP1_Y        OP2_X        OP2_Y
DATA
G1[1,1]                      L1[1,1]     251.8419    -804.6282     768.8362    -290.5563
G1[1,1]                   valve[1,1]     357.8950    -714.6857     652.0683    -395.6490
END

The contents may vary, but the headers (DGName, InstanceName, etc.) are always the same.

I am only interested in the values, and I need to read each value and store it in a class object. Because the contents vary, there might be longer text instead of G1[1,1], for example, and likewise for the other cells. Cells might also be missing; for example, 251.8419 in OP1_X could be absent. What I have in mind is a dictionary <string,int> that holds each header and its 'cell' width, so that I can read the values from their exact positions.

What I have tried:

Right now I am separating the string into lines:

string[] result = myString.Split(new string[] { "\n", "\r\n" }, StringSplitOptions.RemoveEmptyEntries);

and then use foreach on each line. The disadvantage is that if a value is missing, all the remaining values on that line shift to the left, so I end up with a missing value in the last column.
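
The shift described above can be reproduced with a small sketch. The class name ShiftDemo, the helper SplitOnSpaces, and the padding widths (24, 12, 13) are assumptions made for illustration; the widths are only meant to imitate the sample layout, with the OP1_X cell left blank.

```csharp
using System;

public static class ShiftDemo
{
    // Hypothetical helper: splits a line on runs of spaces, dropping empty tokens.
    public static string[] SplitOnSpaces(string line)
    {
        return line.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
    }

    public static void Main()
    {
        // A row padded like the sample, but with the OP1_X cell left blank.
        string row = "G1[1,1]".PadRight(24) + "L1[1,1]".PadLeft(12) + "".PadLeft(13)
                   + "-804.6282".PadLeft(13) + "768.8362".PadLeft(13) + "-290.5563".PadLeft(13);

        string[] cells = SplitOnSpaces(row);

        Console.WriteLine(cells.Length); // prints 5: one cell fewer than the six headers
        Console.WriteLine(cells[2]);     // prints -804.6282: the OP1_Y value now sits in the OP1_X slot
    }
}
```

Splitting on whitespace discards the position of each value, which is why a blank cell cannot be told apart from the padding around it.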

Recommended answer

using System.Collections;
using System.Data;
using System.IO;
using System.Text;

public class CsvParser
    {
        public static DataTable Parse(string data, bool headers)
        {
            return Parse(new StringReader(data), headers);
        }

        public static DataTable Parse(string data)
        {
            return Parse(new StringReader(data));
        }

        public static DataTable Parse(TextReader stream)
        {
            return Parse(stream, false);
        }

        public static DataTable Parse(TextReader stream, bool headers)
        {
            DataTable table = new DataTable();
            CsvStream csv = new CsvStream(stream);
            string[] row = csv.GetNextRow();
            if (row == null)
                return null;
            if (headers)
            {
                foreach (string header in row)
                {
                    if (header != null && header.Length > 0 && !table.Columns.Contains(header))
                        table.Columns.Add(header, typeof(string));
                    else
                        table.Columns.Add(GetNextColumnHeader(table), typeof(string));
                }
                row = csv.GetNextRow();
            }
            while (row != null)
            {
                while (row.Length > table.Columns.Count)
                    table.Columns.Add(GetNextColumnHeader(table), typeof(string));
                table.Rows.Add(row);
                row = csv.GetNextRow();
            }
            return table;
        }

        private static string GetNextColumnHeader(DataTable table)
        {
            int c = 1;
            while (true)
            {
                string h = "Column" + c++;
                if (!table.Columns.Contains(h))
                    return h;
            }
        }

        private class CsvStream
        {
            private TextReader stream;

            public CsvStream(TextReader s)
            {
                stream = s;
            }

            public string[] GetNextRow()
            {
                ArrayList row = new ArrayList();
                while (true)
                {
                    string item = GetNextItem();
                    if (item == null)
                        return row.Count == 0 ? null : (string[])row.ToArray(typeof(string));
                    row.Add(item);
                }
            }

            private bool EOS = false;
            private bool EOL = false;

            private string GetNextItem()
            {
                if (EOL)
                {
                    // previous item was last in line, start new line
                    EOL = false;
                    return null;
                }

                bool quoted = false;
                bool predata = true;
                bool postdata = false;
                StringBuilder item = new StringBuilder();

                while (true)
                {
                    char c = GetNextChar(true);
                    if (EOS)
                        return item.Length > 0 ? item.ToString() : null;

                    if ((postdata || !quoted) && c == ',')
                        // end of item, return
                        return item.ToString();

                    if ((predata || postdata || !quoted) && (c == '\x0A' || c == '\x0D'))
                    {
                        // we are at the end of the line, eat newline characters and exit
                        EOL = true;
                        if (c == '\x0D' && GetNextChar(false) == '\x0A')
                            // new line sequence is 0D0A
                            GetNextChar(true);
                        return item.ToString();
                    }

                    if (predata && c == ' ')
                        // whitespace preceding data, discard
                        continue;

                    if (predata && c == '"')
                    {
                        // quoted data is starting
                        quoted = true;
                        predata = false;
                        continue;
                    }

                    if (predata)
                    {
                        // data is starting without quotes
                        predata = false;
                        item.Append(c);
                        continue;
                    }

                    if (c == '"' && quoted)
                    {
                        if (GetNextChar(false) == '"')
                            // double quotes within quoted string means add a quote
                            item.Append(GetNextChar(true));
                        else
                            // end-quote reached
                            postdata = true;
                        continue;
                    }

                    // all cases covered, character must be data
                    item.Append(c);
                }
            }

            private char[] buffer = new char[4096];
            private int pos = 0;
            private int length = 0;

            private char GetNextChar(bool eat)
            {
                if (pos >= length)
                {
                    length = stream.ReadBlock(buffer, 0, buffer.Length);
                    if (length == 0)
                    {
                        EOS = true;
                        return '\0';
                    }
                    pos = 0;
                }
                if (eat)
                    return buffer[pos++];
                else
                    return buffer[pos];
            }
        }
    }
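
Note that CsvParser ends an item at a comma, while the sample in the question is padded with spaces into fixed-width columns. As a complement, here is a minimal sketch of the position-based idea from the question: take the column boundaries from the header line and slice each data line at those positions. It rests on assumptions read off the sample: the first column is left-aligned, every other column is right-aligned to the right edge of its header, there are at least two columns, and no value is wider than its column. The names FixedWidthSketch, ColumnCuts and SplitRow are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class FixedWidthSketch
{
    // Returns the exclusive end position of every column, derived from the header line.
    // Assumption: column 0 ends where the second header starts (left-aligned);
    // every later column ends where its own header ends (right-aligned).
    public static List<int> ColumnCuts(string headerLine)
    {
        var cuts = new List<int>();
        int i = 0, column = 0;
        while (i < headerLine.Length)
        {
            while (i < headerLine.Length && headerLine[i] == ' ') i++;   // skip padding
            if (i >= headerLine.Length) break;
            int start = i;
            while (i < headerLine.Length && headerLine[i] != ' ') i++;   // consume header token
            if (column == 1) cuts.Add(start); // end of the left-aligned first column
            if (column >= 1) cuts.Add(i);     // end of a right-aligned column
            column++;
        }
        return cuts;
    }

    // Slices one data line at the cut positions; a blank slice yields an empty string.
    public static string[] SplitRow(string line, List<int> cuts)
    {
        var cells = new string[cuts.Count];
        int start = 0;
        for (int c = 0; c < cuts.Count; c++)
        {
            int end = Math.Min(cuts[c], line.Length); // tolerate lines shorter than the header
            cells[c] = end > start ? line.Substring(start, end - start).Trim() : "";
            if (end > start) start = end;
        }
        return cells;
    }

    public static void Main()
    {
        // Header and row padded to imitate the sample layout; OP1_X is left blank.
        string header = "DGName".PadRight(24) + "InstanceName" + "OP1_X".PadLeft(13)
                      + "OP1_Y".PadLeft(13) + "OP2_X".PadLeft(13) + "OP2_Y".PadLeft(13);
        string row = "G1[1,1]".PadRight(24) + "L1[1,1]".PadLeft(12) + "".PadLeft(13)
                   + "-804.6282".PadLeft(13) + "768.8362".PadLeft(13) + "-290.5563".PadLeft(13);

        List<int> cuts = ColumnCuts(header);
        Console.WriteLine(string.Join("|", SplitRow(row, cuts)));
        // prints G1[1,1]|L1[1,1]||-804.6282|768.8362|-290.5563
    }
}
```

Because each value is read from a fixed position, the blank OP1_X cell comes back as an empty string and the remaining values stay in their own columns. If a value can be wider than its column, the alignment assumption no longer holds and the cut positions would have to come from somewhere else.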
