Question

I'm using pandas to import a lot of data from a CSV file, and once it is read I format it to contain only numerical data. The result is a list of lists, numericalData[][], in which each inner list holds around 140k values.

From this list, I wish to create testing and training data. For my testing data, I want 30% of the data in numericalData, so I use the following bit of code:

testingAmount = len(numericalData[0]) * trainingDataPercentage // 100  # integer count of rows to sample

Works a treat. Then, I use numpy to select that amount of data from each column of my imported numericalData:

testingData.append(np.random.choice(numericalData[x], testingAmount))

This then returns a sample with 38 columns (running in a loop), where each column has around 49k elements of data randomly selected from my imported numericalData.

The issue is that my trainingData needs to hold the other 70% of the data, but I'm unsure how to do this. I tried comparing each element against my testingData and adding it to trainingData when the two weren't equal. This resulted in an error and didn't work. Next, I tried to delete the selected testingData from my imported data and save what remained of each column to trainingData; alas, that didn't work either.

I've only been working with python for the past week so I'm a bit lost on what to try now.

Recommended answer

You can use random.shuffle and then split the list. A toy example:

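import random

# random.shuffle needs a mutable sequence, so turn the range into a list
data = list(range(1, 11))

# shuffles in place and returns None
random.shuffle(data)

# the two slices are disjoint and together cover all of data
training = data[:5]
testing = data[5:]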

To get more information, read the docs.
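
For the data in the question, the same idea might look like the sketch below. It shuffles row indices once and reuses them for every column, so the columns stay aligned and each row ends up in exactly one of the two sets. That matters because np.random.choice samples with replacement by default and was called separately per column, which leaves no well-defined "other 70%". The helper name split_columns, its parameters, and the small stand-in for numericalData are assumptions made for illustration, not part of the original code.

```python
import random


def split_columns(columns, test_percentage, seed=None):
    """Split each column into (testing, training) using one shared row shuffle.

    Hypothetical helper: `columns` is a list of equal-length lists.
    """
    n_rows = len(columns[0])
    n_test = n_rows * test_percentage // 100

    # Shuffle the row indices instead of the data itself, so every row
    # lands in exactly one set and all columns use the same rows.
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    test_idx, train_idx = indices[:n_test], indices[n_test:]

    testing = [[col[i] for i in test_idx] for col in columns]
    training = [[col[i] for i in train_idx] for col in columns]
    return testing, training


# Stand-in for numericalData (assumption): 3 columns of 10 values each.
numericalData = [list(range(c * 100, c * 100 + 10)) for c in range(3)]

testingData, trainingData = split_columns(numericalData, 30, seed=0)
print(len(testingData[0]), len(trainingData[0]))  # prints: 3 7
```

Passing a seed makes the split repeatable between runs; leaving it as None gives a different split each time.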
