Reading random rows from a data set
Problem description

I have a data set with 1000 records. This is the list used to store the randomly selected records:

    private static List<string>[] BChrom = new List<string>[10];

How can I add a random 20% of the whole data set to that list of strings?

    try
    {
        using (StreamReader sr = new StreamReader(@"C:\Users\*****\Documents\sub0000.data"))
        {
            for (int i = 0; i < BChrom.Length; i++)
            {
                // how do I pick the random records here?
            }
        }
    }

Solution

Try this:

    // Requires: using System; using System.Collections; using System.IO;
    ArrayList ReadRandom(string sourceFile, int sampleSize)
    {
        ArrayList BChrom = new ArrayList(sampleSize);
        Random random = new Random();

        using (FileStream ifs = new FileStream(sourceFile, FileMode.Open, FileAccess.Read))
        using (StreamReader sr = new StreamReader(ifs))
        {
            // determine the extent of the source file
            long lastPos = ifs.Length;

            for (int i = 0; i < sampleSize; ++i)
            {
                // generate a random byte position in [0, lastPos)
                double pct = random.NextDouble();               // [0.0, 1.0)
                long randomPos = (long)(pct * lastPos);
                if (pct >= 0.99)
                    randomPos = Math.Max(0L, randomPos - 1024); // if near the end, back up a bit

                ifs.Seek(randomPos, SeekOrigin.Begin);
                sr.DiscardBufferedData();                       // drop text buffered before the seek

                sr.ReadLine();                                  // consume the current partial line
                string line = sr.ReadLine();                    // a full line, or null at end of file
                if (line != null)
                    BChrom.Add(line);
            }
        }
        return BChrom;
    }

For 1000 records, a 20% sample means calling it with sampleSize = 200.

There are some drawbacks. The first line is never returned, because every read skips ahead to the start of the next line. A position that lands in the last line yields nothing, so the result can hold fewer than sampleSize lines. The same line can be picked more than once, and a line's chance of being picked grows with the length of the line before it. In exchange, the cost per sampled line does not depend on the file size, so it stays fast on large files.
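For a data set of only 1000 records, another option is to read every line into memory and draw the 20% sample there, which avoids repeated rows and gives every row the same chance. The following is a minimal sketch of that idea using a partial Fisher-Yates shuffle, not the method from the answer above. The names `RandomSampler` and `SampleFraction` are hypothetical and chosen only for illustration.

```csharp
using System;
using System.Collections.Generic;

static class RandomSampler
{
    // Hypothetical helper: returns the given fraction of the rows,
    // chosen uniformly at random and without repeats.
    public static List<string> SampleFraction(
        IReadOnlyList<string> rows, double fraction, Random random)
    {
        if (fraction < 0.0 || fraction > 1.0)
            throw new ArgumentOutOfRangeException(nameof(fraction));

        int sampleSize = (int)Math.Round(rows.Count * fraction);

        // Work on a copy so the caller's collection is left untouched.
        var pool = new List<string>(rows);

        // Partial Fisher-Yates shuffle: after step i, the first i + 1
        // entries of the pool are a uniform random sample of the rows.
        for (int i = 0; i < sampleSize; i++)
        {
            int j = random.Next(i, pool.Count); // i <= j < pool.Count
            (pool[i], pool[j]) = (pool[j], pool[i]);
        }

        return pool.GetRange(0, sampleSize);
    }
}
```

Assuming the file holds one record per line, the rows could be loaded with `File.ReadAllLines` (from `System.IO`) and passed in as `RandomSampler.SampleFraction(rows, 0.20, new Random())`. With 1000 records this returns 200 distinct rows. This trades memory for simplicity, so for very large files the seek-based approach may still be the better fit.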