python - Creating a loop to find the number of sales in the first 20 days

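I am new to Python and can't figure out how to find the number of sales calls made in the 20 days after the first sale. The question asks me to work out the percentage of salespeople who made at least 10 calls in their first 20 days.
Each row is one sales call; the salesperson is identified by the column id, and the time of the call is recorded in call_starttime.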

The df is very simple and looks like this

    id      call_starttime  level
0   66547   7/28/2015 23:18 1
1   66272   8/10/2015 20:48 0
2   66547   8/20/2015 17:32 2
3   66272   8/31/2015 18:21 0
4   66272   8/31/2015 20:25 0
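
To make the sample reproducible, the rows above can be rebuilt as a DataFrame roughly as follows. This is only a sketch: the `format` string is an assumption inferred from the timestamps shown (month/day/year and a 24-hour clock), not something stated in the question.

```python
import pandas as pd

# Sample rows copied from the table above
df = pd.DataFrame({
    "id": [66547, 66272, 66547, 66272, 66272],
    "call_starttime": [
        "7/28/2015 23:18",
        "8/10/2015 20:48",
        "8/20/2015 17:32",
        "8/31/2015 18:21",
        "8/31/2015 20:25",
    ],
    "level": [1, 0, 2, 0, 0],
})

# Assumed format: month/day/year hour:minute (24-hour clock)
df["call_starttime"] = pd.to_datetime(df["call_starttime"], format="%m/%d/%Y %H:%M")

print(df.dtypes)
```

Once the column holds real datetimes rather than strings, day offsets can be added to it or subtracted from it directly.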

I have already worked out the number of calls per id, and I can filter out anyone who has not made at least 10 sales calls

The code I am currently using is

df_withcount=df.groupby(['cc_user_id','cc_cohort']).size().reset_index(name='count')
df_20andmore=df_withcount.loc[(df_withcount['count'] >= 20)]

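I would like the output to tell me the number of IDs (salespeople) who made at least 10 calls in their first 20 days. So far I have only been able to work out how to find those with at least 10 calls over the entire period.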

Best answer

I used a Person class to help solve this problem.

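Create a DataFrame
Convert call_start_time from strings to datetime values
Work out the date 20 days after the FIRST call_start_time
Create a Person class to keep track of days_count and id
Create a list to hold Person objects and populate them with data from the DataFrame
Print the Person objects in the list whose sales count reaches 10 or more within the 20 days between the start date and the end date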

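I have tested my code and it works well. It could be improved, but my main focus was on getting a good working solution. Let me know if you have any questions.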

import pandas as pd
from datetime import timedelta
import datetime
import numpy as np

# prep data for dataframe
lst = {'call_start_time':['7/28/2015','8/10/2015','7/28/2015','7/28/2015'],
        'level':['1','0','1','1'],
        'id':['66547', '66272', '66547','66547']}

# create dataframe
df = pd.DataFrame(lst)

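# convert the strings to pandas Timestamps so that days can be added and subtracted
# (assigning to `row` inside iterrows() only changes a copy, so convert the whole column)
df["call_start_time"] = pd.to_datetime(df["call_start_time"], format="%m/%d/%Y")

# sort so that the first row seen for each id is that person's first call
df = df.sort_values("call_start_time").reset_index(drop=True)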

# get the end date by adding 20 days to start day
df["end_of_20_days"] = df["call_start_time"] + timedelta(days=20)

# used below comment for testing might need it later
# df['Difference'] = (df['end_of_20_days'] - df['call_start_time']).dt.days

# created person class to keep track of days_count and id
class Person(object):
    def __init__(self, id, start_date, end_date):
        self.id = id
        self.start_date = start_date
        self.end_date = end_date
        self.days_count = 1

# create list to hold objects of person class
person_list = []

# populate person_list with Person objects and their attributes
for index, row in df.iterrows():
    # get result_id to use as conditional for populating Person objects
    result_id = any(x.id == row['id'] for x in person_list)

    # initialize Person objects and inject with data from dataframe
    if len(person_list) == 0:
        person_list.append(Person(row['id'], row['call_start_time'], row['end_of_20_days']))
    elif not(result_id):
        person_list.append(Person(row['id'], row['call_start_time'], row['end_of_20_days']))
    else:
        for x in person_list:
            # count the call only if it falls inside the 20 days after this person's first call
            diff = (x.end_date - row['call_start_time']).days
            if x.id == row['id'] and 0 <= diff <= 20:
                x.days_count += 1
                break

# flag to check if nobody hit the sales mark
flag = 0

# print out only person_list ids who have hit the sales mark
for person in person_list:
    if person.days_count >= 10:
        flag = 1
        print("person id:{} has made {} calls within the past 20 days since first call date".format(person.id, person.days_count))

if flag == 0:
    print("No one has hit the sales mark")
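
The question also asks for a percentage, which the loop above does not print. As an alternative sketch, not the original answer's method, the same count can be done with `groupby` instead of a class and explicit loops. The function name `share_hitting_target`, its parameters, and the demo data are all hypothetical, and treating the 20-day boundary as inclusive is an assumption.

```python
import pandas as pd

def share_hitting_target(calls, id_col="id", time_col="call_start_time",
                         window_days=20, min_calls=10):
    """Return (calls per id inside the window, percentage of ids reaching min_calls)."""
    calls = calls.copy()
    calls[time_col] = pd.to_datetime(calls[time_col])

    # First call per salesperson, broadcast back onto every row
    first_call = calls.groupby(id_col)[time_col].transform("min")

    # Keep calls made no later than window_days after that first call
    # (boundary treated as inclusive; this is an assumption)
    cutoff = first_call + pd.Timedelta(days=window_days)
    in_window = calls[calls[time_col] <= cutoff]

    # Every id appears here because its first call is always inside the window
    counts = in_window.groupby(id_col).size()
    pct = 100 * (counts >= min_calls).mean()
    return counts, pct

# Hypothetical demo data: "a" makes 12 calls on consecutive days,
# "b" makes 3 calls, only the first of which is inside its 20-day window
demo = pd.DataFrame({
    "id": ["a"] * 12 + ["b"] * 3,
    "call_start_time": list(pd.date_range("2015-07-28", periods=12, freq="D"))
    + list(pd.to_datetime(["2015-08-10", "2015-08-31", "2015-09-15"])),
})

counts, pct = share_hitting_target(demo)
print(counts)
print("percentage of salespeople with at least 10 calls:", pct)
```

In the demo, salesperson `a` reaches the threshold and `b` does not, so the share is 50%. On the real data, pass the actual column name (for example `time_col="call_starttime"`).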

Regarding python - Creating a loop to find the number of sales in the first 20 days, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/57105184/