Counting a person's prior occurrences in a CSV and appending the counts to each row

Problem description

What I need to do is calculate the following:


  1. The number of times a person (column 8) appears in the list on dates prior to the date specified in the row with a 1 in column 7.

  2. The number of times a person (column 8) appears in the list on dates prior to the date specified in the row (note that the rows are sorted chronologically).

It might be easier to demonstrate this with an example. Here is the raw data from the CSV:

02/01/2005,Data,Class xpv,4,11yo+,4,1,George Smith
02/01/2005,Data,Class xpv,4,11yo+,4,2,Ted James
02/01/2005,Data,Class xpv,4,11yo+,4,3,Emma Lilly
02/01/2005,Data,Class xpv,4,11yo+,4,5,George Smith
02/01/2005,Data,Class tn2,4,10yo+,6,4,Tom Phillips
03/01/2005,Data,Class tn2,4,10yo+,6,2,Tom Phillips
03/01/2005,Data,Class tn2,4,10yo+,6,5,George Smith
03/01/2005,Data,Class tn2,4,10yo+,6,3,Tom Phillips
03/01/2005,Data,Class tn2,4,10yo+,6,1,Emma Lilly
03/01/2005,Data,Class tn2,4,10yo+,6,6,George Smith
04/01/2005,Data,Class tn2,4,10yo+,6,6,Ted James
04/01/2005,Data,Class tn2,4,10yo+,6,3,Tom Phillips
04/01/2005,Data,Class tn2,4,10yo+,6,2,George Smith
04/01/2005,Data,Class tn2,4,10yo+,6,4,George Smith
04/01/2005,Data,Class tn2,4,10yo+,6,1,George Smith
04/01/2005,Data,Class tn2,4,10yo+,6,5,Tom Phillips
05/01/2005,Data,Class 22zn,2,10yo+,5,3,Emma Lilly
05/01/2005,Data,Class 22zn,2,10yo+,5,1,Ted James
05/01/2005,Data,Class 22zn,2,10yo+,5,2,George Smith
05/01/2005,Data,Class 22zn,2,10yo+,5,4,Emma Lilly
05/01/2005,Data,Class 22zn,2,10yo+,5,5,Tom Phillips

This is what I need the CSV to look like after following the instructions above:

02/01/2005,Data,Class xpv,4,11yo+,4,1,George Smith,0,0
02/01/2005,Data,Class xpv,4,11yo+,4,2,Ted James,0,0
02/01/2005,Data,Class xpv,4,11yo+,4,3,Emma Lilly,0,0
02/01/2005,Data,Class xpv,4,11yo+,4,5,George Smith,0,0
02/01/2005,Data,Class tn2,4,10yo+,6,4,Tom Phillips,0,0
03/01/2005,Data,Class tn2,4,10yo+,6,2,Tom Phillips,0,1
03/01/2005,Data,Class tn2,4,10yo+,6,5,George Smith,1,2
03/01/2005,Data,Class tn2,4,10yo+,6,3,Tom Phillips,0,1
03/01/2005,Data,Class tn2,4,10yo+,6,1,Emma Lilly,0,1
03/01/2005,Data,Class tn2,4,10yo+,6,6,George Smith,1,2
04/01/2005,Data,Class tn2,4,10yo+,6,6,Ted James,0,1
04/01/2005,Data,Class tn2,4,10yo+,6,3,Tom Phillips,0,3
04/01/2005,Data,Class tn2,4,10yo+,6,2,George Smith,1,4
04/01/2005,Data,Class tn2,4,10yo+,6,4,George Smith,1,4
04/01/2005,Data,Class tn2,4,10yo+,6,1,George Smith,1,4
04/01/2005,Data,Class tn2,4,10yo+,6,5,Tom Phillips,0,3
05/01/2005,Data,Class 22zn,2,10yo+,5,3,Emma Lilly,1,2
05/01/2005,Data,Class 22zn,2,10yo+,5,1,Ted James,0,2
05/01/2005,Data,Class 22zn,2,10yo+,5,2,George Smith,2,7
05/01/2005,Data,Class 22zn,2,10yo+,5,4,Emma Lilly,1,2
05/01/2005,Data,Class 22zn,2,10yo+,5,5,Tom Phillips,0,5

So you can see that on the last row Tom Phillips had occurred 5 times on days previous to this one (column 10), and in none of those 5 occurrences was column 7 equal to "1" (column 9).
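As a cross-check of the rule described above, here is a minimal Python 3 sketch that applies it to a few of the sample rows (the George Smith and Emma Lilly rows only). The names `SAMPLE`, `append_prior_counts`, `totals` and `pending` are made up for this illustration, and the sketch assumes the rows are already sorted by date, so that two rows belong to the same day exactly when their date strings are equal.

```python
import csv
import io
from collections import defaultdict

# A few rows copied from the sample data (George Smith and Emma Lilly only).
SAMPLE = """\
02/01/2005,Data,Class xpv,4,11yo+,4,1,George Smith
02/01/2005,Data,Class xpv,4,11yo+,4,3,Emma Lilly
02/01/2005,Data,Class xpv,4,11yo+,4,5,George Smith
03/01/2005,Data,Class tn2,4,10yo+,6,5,George Smith
03/01/2005,Data,Class tn2,4,10yo+,6,1,Emma Lilly
05/01/2005,Data,Class 22zn,2,10yo+,5,3,Emma Lilly
"""


def append_prior_counts(rows):
    """Yield each row plus [prior "1"s in column 7, prior appearances]."""
    totals = defaultdict(lambda: [0, 0])   # counts from strictly earlier dates
    pending = defaultdict(lambda: [0, 0])  # counts from the date being read
    current_date = None
    for row in rows:
        date, flag, name = row[0], row[6], row[7]
        if date != current_date:
            # A new date starts: fold the finished date into the totals.
            for person, (ones, seen) in pending.items():
                totals[person][0] += ones
                totals[person][1] += seen
            pending.clear()
            current_date = date
        yield row + list(totals[name])
        if flag == "1":
            pending[name][0] += 1
        pending[name][1] += 1


result = list(append_prior_counts(csv.reader(io.StringIO(SAMPLE))))
for row in result:
    print(",".join(map(str, row)))
```

Rows that share a date never see each other's counts, because the counts for the current date are held in `pending` until the date changes.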

My CSV data is obviously much larger than this, so efficient techniques and suggestions would also be appreciated. If more clarification is required please say so; it's hard to tell if this example is understandable.

Recommended answer

A simple one:

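import csv
import datetime
from collections import defaultdict

# Python 3: open CSV files in text mode; newline="" lets the csv module handle line endings.
with open(r"C:\Temp\test.csv", newline="") as i, open(r"C:\Temp\resuls.csv", "w", newline="") as o:
    rdr = csv.reader(i)
    wrt = csv.writer(o)

    # Per person: [ones before today, appearances before today, ones so far, appearances so far]
    data, currdate = defaultdict(lambda: [0, 0, 0, 0]), None
    for line in rdr:
        date, name = datetime.datetime.strptime(line[0], '%d/%m/%Y'), line[7]

        if date != currdate:
            # New date: publish the running totals as the "before today" values.
            for v in data.values():
                v[:2] = v[2:]
            currdate = date

        wrt.writerow(line + data[name][:2])

        # Update the running totals only after the row has been written.
        data[name][3] += 1
        if line[6] == "1":
            data[name][2] += 1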
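The four-element list per person is the core of the script above: the last two slots are running totals, and the first two slots hold the totals as they stood before the current date. A small Python 3 sketch of that bookkeeping, with illustrative values that are not taken from a real run of the script:

```python
from collections import defaultdict

# Each value is [ones_before_today, seen_before_today, ones_so_far, seen_so_far].
data = defaultdict(lambda: [0, 0, 0, 0])

data["Tom Phillips"][3] += 1   # Tom appears once on the first date
data["George Smith"][2] += 1   # George appears with a "1" in column 7
data["George Smith"][3] += 1

before = list(data["George Smith"][:2])  # same date, so nothing is visible yet

# The date changes: copy the running totals into the first two slots, in place.
for v in data.values():
    v[:2] = v[2:]

after = list(data["George Smith"][:2])   # the earlier date is now visible
print(before, after)
```

Because the slice assignment `v[:2] = v[2:]` mutates each list in place, no new dictionary is built when the date changes. The loop still visits every person seen so far on each new date.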

One with a deep copy:

import csv
import datetime
import copy
from collections import defaultdict

# Python 3: open CSV files in text mode; newline="" lets the csv module handle line endings.
with open(r"C:\Temp\test.csv", newline="") as i, open(r"C:\Temp\resuls.csv", "w", newline="") as o:
    rdr, wrt = csv.reader(i), csv.writer(o)

    # Per person: [ones so far, appearances so far]
    curr, currdate = defaultdict(lambda: [0, 0]), None
    for line in rdr:
        date, name = datetime.datetime.strptime(line[0], '%d/%m/%Y'), line[7]

        if date != currdate:
            # New date: freeze a snapshot of the totals from all earlier dates.
            prev = copy.deepcopy(curr)
            currdate = date

        wrt.writerow(line + prev[name])

        curr[name][1] += 1
        if line[6] == "1":
            curr[name][0] += 1
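The deep copy matters here because the dictionary values are mutable lists. A short Python 3 sketch of the difference, using hypothetical variable names (`shallow`, `deep`) that do not appear in the script above:

```python
import copy
from collections import defaultdict

curr = defaultdict(lambda: [0, 0])
curr["Emma Lilly"][1] += 1       # one appearance on an earlier date

shallow = copy.copy(curr)        # new dict, but it shares the inner lists
deep = copy.deepcopy(curr)       # new dict with its own inner lists

curr["Emma Lilly"][1] += 1       # another appearance on the current date

shallow_view = shallow["Emma Lilly"]  # follows curr, so it is not a snapshot
deep_view = deep["Emma Lilly"]        # stays at the value from the copy
print(shallow_view, deep_view)
```

With a shallow copy, rows later on the same date would see counts from earlier rows of that date. The trade-off is that `copy.deepcopy` copies every person's list at each date change, which may become noticeable on large files with many distinct names and dates.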

