Problem converting large JSON files to CSV

Problem description

I have large JSON files (50+ MB) that I need to convert to .csv. I also have a JSON-to-CSV converter written in Visual Basic, but because the number of rows in the csv file is limited to 1,048,576, I'm unable to convert everything successfully onto one sheet.

Can I add some code to the converter so that it starts an extra .csv file when it reaches a certain limit? This is the code for the JSON-to-CSV program:

Imports System.IO

Public Class Form1

    Private marketDictionary As New Dictionary(Of String, String)
    Private runnerDictionary As New Dictionary(Of Integer, String)

    Public Sub Print(ByVal Message As String)
        TextBox1.SelectionStart = TextBox1.Text.Length
        TextBox1.SelectedText = vbCrLf & Message
    End Sub

    Private Sub OpenToolStripMenuItem_Click(sender As Object, e As EventArgs) Handles OpenToolStripMenuItem.Click
        With OpenFileDialog1
            .Title = "Open File ..."
            .InitialDirectory = "C:\Users\Anto\Desktop\Application Files\Betfair_1_0_0_1\"
            .FileName = "*.json"
            .ShowDialog()
        End With

    End Sub

    Private Sub ExitToolStripMenuItem_Click(sender As Object, e As EventArgs) Handles ExitToolStripMenuItem.Click

        Me.Close()

    End Sub

    Private Sub ProcessToolStripMenuItem_Click(sender As Object, e As EventArgs) Handles ProcessToolStripMenuItem.Click

        Print("Processing JSON file")

        ProcessJSON(OpenFileDialog1.FileName.ToString())

        Print("Processing complete")

    End Sub

    Private Sub OpenFileDialog1_FileOk(sender As Object, e As System.ComponentModel.CancelEventArgs) Handles OpenFileDialog1.FileOk

        Dim jsonFilename As String

        jsonFilename = OpenFileDialog1.FileName.ToString()

        Dim dateString As String = jsonFilename.Replace("C:\Users\Anto\Desktop\Application Files\Betfair_1_0_0_1\", "").Replace(".json", "")

        Dim marketKeys As String = "C:\Users\Anto\Desktop\Application Files\Betfair_1_0_0_1\marketKeys-" & dateString & ".csv"
        Dim runnerKeys As String = "C:\Users\Anto\Desktop\Application Files\Betfair_1_0_0_1\runnerKeys-" & dateString & ".csv"


        Print(jsonFilename)
        Print(marketKeys)
        Print(runnerKeys)
        Print("")
        Print("Loading market and runner keys")

        LoadKeys(marketKeys, runnerKeys)

        Print("Keys loaded - System ready for processing")
        Print("")

    End Sub

    Private Sub LoadKeys(ByVal marketKeysFilename As String, ByVal runnerKeysFilename As String)

        Dim line As String

        Using reader As StreamReader = New StreamReader(marketKeysFilename)

            line = reader.ReadLine

            Do While (Not line Is Nothing)

                Dim parts As String() = Strings.Split(line, ",")
                Try
                    marketDictionary.Add(parts(0), parts(1))
                Catch ex As Exception
                    Print("Ignoring duplicate market key")
                End Try

                line = reader.ReadLine

            Loop

        End Using

        Using reader As StreamReader = New StreamReader(runnerKeysFilename)

            line = reader.ReadLine

            Do While (Not line Is Nothing)

                Dim parts As String() = Strings.Split(line, ",")

                Try
                    runnerDictionary.Add(parts(0), parts(1))
                Catch ex As Exception
                    Print("Ignoring duplicate runner key")
                End Try

                line = reader.ReadLine

            Loop

        End Using

    End Sub

    Private Sub ProcessJSON(ByVal jsonFilename As String)

        Dim outputFilename As String = jsonFilename.Replace("json", "csv")

        Dim line As String

        Using reader As StreamReader = New StreamReader(jsonFilename)

            line = reader.ReadLine

            Do While (Not line Is Nothing)

                Dim parts As String() = Strings.Split(line, "*")
                Dim book() As MarketBookResponse = DeserializeRawBook(parts(1))
                For bookCount As Integer = 0 To book(0).result.Count - 1
                    For runnerCount As Integer = 0 To book(0).result(bookCount).runners.Count - 1

                        With book(0).result(bookCount).runners(runnerCount)

                            Using writer As StreamWriter = File.AppendText(outputFilename)

                                writer.WriteLine(parts(0) & "," & marketDictionary.Item(book(0).result(bookCount).marketId) & "," & runnerDictionary.Item(.selectionId) & "," & .lastPriceTraded)

                            End Using

                        End With

                    Next

                Next

                line = reader.ReadLine 'read in the next line.

            Loop

        End Using

    End Sub

End Class

By the way, I have very little coding experience; this is coded straight from a book I bought.

Any help would be greatly appreciated, thanks.

Recommended answer

What is a 'sheet'? Do you mean that your file has to comply with some limit imposed by some other processing?

In that case, you need to count the lines as you write them. That code seems very inefficient (it reopens the output file for every line it writes), but it is actually easy to modify. You will need to build a new filename for each portion, however. Something like:

        Dim outputFileBase As String = jsonFilename.Replace(".json", "")
        Dim line As String
        Dim LineCount As Integer = 0
        Dim FileCount As Integer = 0
        ' Plain concatenation gives <base>0.csv; Path.Combine would treat ".csv" as a separate path segment.
        Dim OutputFileName As String = outputFileBase & FileCount.ToString() & ".csv"

        Using reader As StreamReader = New StreamReader(jsonFilename)
            line = reader.ReadLine
            Do While (Not line Is Nothing)
                Dim parts As String() = Strings.Split(line, "*")
                Dim book() As MarketBookResponse = DeserializeRawBook(parts(1))
                For bookCount As Integer = 0 To book(0).result.Count - 1
                    For runnerCount As Integer = 0 To book(0).result(bookCount).runners.Count - 1
                        With book(0).result(bookCount).runners(runnerCount)
                            Using writer As StreamWriter = File.AppendText(OutputFileName)
                                writer.WriteLine(parts(0) & "," & marketDictionary.Item(book(0).result(bookCount).marketId) & "," & runnerDictionary.Item(.selectionId) & "," & .lastPriceTraded)
                                LineCount += 1
                                If LineCount >= 1000000 Then
                                    ' Current file is full: move to the next numbered file and restart the count.
                                    FileCount += 1
                                    LineCount = 0
                                    OutputFileName = outputFileBase & FileCount.ToString() & ".csv"
                                End If
                            End Using
                        End With
                    Next
                Next
                line = reader.ReadLine 'read in the next line.
            Loop
        End Using
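
If the per-line reopening of the output file turns out to be too slow, the counting and renaming can be moved into a small helper that keeps one writer open and only swaps files when the limit is hit. The following is a rough sketch, not a tested drop-in: `RollingCsvWriter` and its member names are hypothetical, and it assumes that overwriting existing output files is acceptable (the original code appends instead).

```vb
Imports System
Imports System.IO

' Hypothetical helper (not part of the original converter): writes lines to
' numbered files <base>0.csv, <base>1.csv, ... and starts a new file once
' maxLinesPerFile lines have been written to the current one.
Public Class RollingCsvWriter
    Implements IDisposable

    Private ReadOnly _basePath As String          ' full path without extension
    Private ReadOnly _maxLinesPerFile As Integer
    Private _writer As StreamWriter
    Private _lineCount As Integer = 0
    Private _fileCount As Integer = 0

    Public Sub New(basePath As String, maxLinesPerFile As Integer)
        If maxLinesPerFile < 1 Then
            Throw New ArgumentOutOfRangeException("maxLinesPerFile")
        End If
        _basePath = basePath
        _maxLinesPerFile = maxLinesPerFile
    End Sub

    ' Name of the file currently being written.
    Public ReadOnly Property CurrentFileName As String
        Get
            Return _basePath & _fileCount.ToString() & ".csv"
        End Get
    End Property

    Public Sub WriteLine(text As String)
        If _writer IsNot Nothing AndAlso _lineCount >= _maxLinesPerFile Then
            ' Current file is full: close it and advance to the next number.
            _writer.Dispose()
            _writer = Nothing
            _fileCount += 1
            _lineCount = 0
        End If
        If _writer Is Nothing Then
            ' Opened lazily so that no empty trailing file is created.
            ' Assumption: False means an existing file is overwritten, not appended to.
            _writer = New StreamWriter(CurrentFileName, False)
        End If
        _writer.WriteLine(text)
        _lineCount += 1
    End Sub

    Public Sub Dispose() Implements IDisposable.Dispose
        If _writer IsNot Nothing Then
            _writer.Dispose()
            _writer = Nothing
        End If
    End Sub
End Class
```

To use it in ProcessJSON, you would wrap the whole read loop in a single `Using out As New RollingCsvWriter(outputFileBase, 1000000)`, drop the inner `Using writer ... File.AppendText(...)` block, and call `out.WriteLine(...)` with the same string that is currently passed to `writer.WriteLine(...)`. The file is then opened once per million lines rather than once per line.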

