Removing duplicates across an entire row

Problem description

Hi guys,

I have thousands of rows with 106 columns. The first column (chromosome and location) contains only a chromosome and location, and its values can be duplicated. The remaining columns are numbered 1-105, each corresponding to a sample number. If a sample has a given chromosome and location, I want to put the number one in that sample's cell, so that at the end I can calculate the sum for each sample.

The part I am having a tough time programming in Python is how to write this to a file when the same key appears more than once, in different samples. How can I add the number one to the right cell so that I can get the sum later on?

Thanks a lot in advance. The code I have so far is below:

    import os
    from collections import defaultdict

    # file_out, chro_pos, sample_num and samp_num are set by code not shown here.
    with open(os.path.join(file_out + ".txt"), "w") as outpt:
        dic = defaultdict(list)
        dic[chro_pos].append(sample_num)
        outpt.write("chrom_pos" + "\t" + "\t".join(samp_num) + "\t" + "\n")
        # k is the chromosome:location; val holds the sample numbers (1 out of 105).
        for k, val in dic.items():
            for v in val:
                # This writes duplicate chrom_pos rows, which I don't want. I want one
                # chrom_pos row with ones in the columns of all its samples.
                outpt.write(k + int(v) * "\t" + str(1) + "\n")

Solution

Write each val to a new list. For every following value, check whether it already exists in that list, and skip it if it does.
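The answer is brief, so here is one way it might look in practice. This is a minimal sketch, not the asker's actual pipeline: the function names `build_presence_table` and `write_presence_table`, the `(chrom_pos, sample_num)` input pairs, and the example data are all assumptions made for illustration. It also uses a `set` per position instead of the list-plus-membership check the answer describes. The effect is the same (a repeated value is skipped), but the lookup is cheaper.

```python
from collections import defaultdict
import io


def build_presence_table(pairs, n_samples=105):
    """Collapse (chrom_pos, sample_num) pairs into one 0/1 row per chrom_pos."""
    samples_by_pos = defaultdict(set)  # a set drops repeated pairs automatically
    for chrom_pos, sample_num in pairs:
        samples_by_pos[chrom_pos].add(int(sample_num))

    rows = {}
    for chrom_pos, samples in samples_by_pos.items():
        # List index i (0-based) holds the flag for sample number i + 1.
        rows[chrom_pos] = [1 if s in samples else 0 for s in range(1, n_samples + 1)]
    return rows


def write_presence_table(rows, out, n_samples=105):
    """Write a header and one line per chrom_pos; return the per-sample sums."""
    header = ["chrom_pos"] + [str(s) for s in range(1, n_samples + 1)]
    out.write("\t".join(header) + "\n")
    totals = [0] * n_samples
    for chrom_pos, flags in rows.items():
        out.write(chrom_pos + "\t" + "\t".join(map(str, flags)) + "\n")
        totals = [t + f for t, f in zip(totals, flags)]
    return totals


# Hypothetical data with 3 samples instead of 105, including one repeated pair.
pairs = [("chr1:100", 1), ("chr1:100", 3), ("chr2:50", 3), ("chr1:100", 1)]
rows = build_presence_table(pairs, n_samples=3)
buffer = io.StringIO()  # stands in for the real output file
totals = write_presence_table(rows, buffer, n_samples=3)
print(buffer.getvalue())
print(totals)
```

Each chrom_pos now appears on exactly one line, with a 1 in the column of every sample that has it, and `totals` holds the sum for each sample column. To write to disk, pass an open file handle in place of the `io.StringIO` buffer.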