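I am looking for a function that takes two dates (admission and discharge) and a financial year, and returns the number of days in each month between those two dates.
The financial year runs from 1 April to 31 March.
I currently have a solution (below) that is a messy mix of SPSS and Python. Ultimately it will need to be implemented back in SPSS, but as a tidier Python function; unfortunately this means it can only use the standard library (not pandas).
For example: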

+-----------------+-----------------+------+--+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+
|    Admission    |    Discharge    |  FY  |  | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec | Jan | Feb | Mar |
+-----------------+-----------------+------+--+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+
| 01 January 2017 | 05 January 2017 | 1617 |  |   0 |   0 |   0 |   0 |   0 |   0 |   0 |   0 |   0 |   4 |   0 |   0 |
| 01 January 2017 | 05 June 2017    | 1617 |  |   0 |   0 |   0 |   0 |   0 |   0 |   0 |   0 |   0 |  31 |  28 |  31 |
| 01 January 2017 | 05 June 2017    | 1718 |  |  30 |  31 |   4 |   0 |   0 |   0 |   0 |   0 |   0 |   0 |   0 |   0 |
| 01 January 2017 | 01 January 2019 | 1718 |  |  30 |  31 |  30 |  31 |  31 |  30 |  31 |  30 |  31 |  31 |  28 |  31 |
+-----------------+-----------------+------+--+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+

Related - How to calculate number of days between two given dates?

Current solution (SPSS code)

 * Count the beddays.
 * Similar method to that used in Care homes.
 * 1) Declare an SPSS macro which will set the beddays for each month.
 * 2) Use python to run the macro with the correct parameters.
 * This means that different month lengths and leap years are handled correctly.
Define !BedDaysPerMonth (Month = !Tokens(1)
   /MonthNum = !Tokens(1)
   /DaysInMonth = !Tokens(1)
   /Year = !Tokens(1))

 * Store the start and end date of the given month.
Compute #StartOfMonth = Date.DMY(1, !MonthNum, !Year).
Compute #EndOfMonth = Date.DMY(!DaysInMonth, !MonthNum, !Year).

 * Create the name of the variable e.g. Apr_beddays.
!Let !BedDays = !Concat(!Month, "_beddays").

 * Create variables for the month.
Numeric !BedDays (F2.0).

 * Go through all possibilities to decide how many days to be allocated.
Do if keydate1_dateformat LE #StartOfMonth.
   Do if keydate2_dateformat GE #EndOfMonth.
      Compute !BedDays = !DaysInMonth.
   Else.
      Compute !BedDays = DateDiff(keydate2_dateformat, #StartOfMonth, "days").
   End If.
Else if keydate1_dateformat LE #EndOfMonth.
   Do if keydate2_dateformat GT #EndOfMonth.
      Compute !BedDays = DateDiff(#EndOfMonth, keydate1_dateformat, "days") + 1.
   Else.
      Compute !BedDays = DateDiff(keydate2_dateformat, keydate1_dateformat, "days").
   End If.
Else.
   Compute !BedDays = 0.
End If.

 * Months after the discharge date will end up with negatives.
If !BedDays < 0 !BedDays = 0.
!EndDefine.

 * This python program will call the macro for each month with the right variables.
 * They will also be in FY order.
Begin Program.
from calendar import month_name, monthrange
from datetime import date
import spss

#Set the financial year, this line reads the first variable ('year')
fin_year = int((int(spss.Cursor().fetchone()[0]) // 100) + 2000)

#This line generates a 'dictionary' which will hold all the info we need for each month
#month_name is a list of all the month names and just needs the number of the month
#(m < 4) + fin_year - This sets the year to fin_year for April onwards and fin_year + 1 otherwise
#monthrange takes a year and a month number and returns 2 numbers, the weekday of the first day and the number of days in the month, we only need the second.
months = {m: [month_name[m], (m < 4) + fin_year, monthrange((m < 4) + fin_year, m)[1]]  for m in range(1,13)}
print(months) #Print to the output window so you can see how it works

#This will make the output look a bit nicer
print("\n\n***This is the syntax that will be run:***")

#This loops over the months above but first sorts them by year, meaning they are in correct FY order
for month in sorted(months.items(), key=lambda x: x[1][1]):
   syntax = "!BedDaysPerMonth Month = " + month[1][0][:3]
   syntax += " MonthNum = " + str(month[0])
   syntax += " DaysInMonth = " + str(month[1][2])
   syntax += " Year = " + str(month[1][1]) + "."

   print(syntax)
   spss.Submit(syntax)
End Program.
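
To see what the Python block above generates without running it inside SPSS, here is a minimal sketch that rebuilds the same dictionary and returns the macro calls as strings. The function name `build_macro_calls` and the FY code `1718` are assumptions made for illustration; the sketch leaves out the `spss` module entirely and only prints the syntax.

```python
from calendar import month_name, monthrange


def build_macro_calls(fy_code):
    """Return the 12 macro calls for a financial year code such as 1718 (hypothetical helper)."""
    fin_year = fy_code // 100 + 2000  # 1718 -> 2017, the year in which the FY starts
    # month number -> [name, calendar year, days in month]; Jan-Mar fall in fin_year + 1
    months = {
        m: [month_name[m], (m < 4) + fin_year, monthrange((m < 4) + fin_year, m)[1]]
        for m in range(1, 13)
    }
    calls = []
    # sorted() is stable, so sorting by year yields Apr..Dec followed by Jan..Mar
    for num, (name, year, days) in sorted(months.items(), key=lambda x: x[1][1]):
        calls.append(
            "!BedDaysPerMonth Month = {} MonthNum = {} DaysInMonth = {} Year = {}.".format(
                name[:3], num, days, year
            )
        )
    return calls


for call in build_macro_calls(1718):
    print(call)
```

The month names come from `calendar.month_name`, which follows the current locale, so the three-letter abbreviations are English only under the default locale.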

Best answer

The only way I can think of is to loop over every day and work out which month it belongs to:

import calendar, collections, time
SECONDS_PER_DAY = 24 * 60 * 60
def monthlyBedDays(admission, discharge, fy=None):
    # Work in UTC (timegm/gmtime) so daylight-saving changes cannot shift a day boundary
    start = calendar.timegm(time.strptime(admission, '%d-%b-%Y'))
    end = calendar.timegm(time.strptime(discharge, '%d-%b-%Y'))
    if fy is not None:
        # Clip the stay to the financial year, e.g. 1718 -> 01-Apr-2017 to 31-Mar-2018
        fy = str(fy)
        start = max(start, calendar.timegm(time.strptime('01-Apr-'+fy[:2], '%d-%b-%y')))
        end   = min(end,   calendar.timegm(time.strptime('31-Mar-'+fy[2:], '%d-%b-%y')))
    days = collections.defaultdict(int)
    # Step one day at a time; the discharge day itself is counted
    for day in range(start, end + SECONDS_PER_DAY, SECONDS_PER_DAY):
        day = time.gmtime(day)
        key = time.strftime('%Y-%m', day)  # use '%b' to answer the question exactly, but that's not such a good idea
        days[ key ] += 1
    return days

output = monthlyBedDays(admission="01-Jan-2018", discharge="25-Apr-2018")
print(output)
# Prints:
# defaultdict(<class 'int'>, {'2018-01': 31, '2018-02': 28, '2018-03': 31, '2018-04': 25})

print(monthlyBedDays(admission="01-Jan-2018", discharge="25-Apr-2018", fy=1718))
# Prints:
# defaultdict(<class 'int'>, {'2018-01': 31, '2018-02': 28, '2018-03': 31})

print(monthlyBedDays(admission="01-Jan-2018", discharge="25-Apr-2018", fy=1819))
# Prints:
# defaultdict(<class 'int'>, {'2018-04': 25})

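Note that the output is a defaultdict, so if you ask it for the number of days in any month (or any key) that was not recorded (e.g. output['1999-12']), it returns 0. Also note that I used the '%Y-%m' format for the output keys. Compared with the key type originally asked for ('%b' -> 'Jan'), this makes it much easier to sort the output and to disambiguate between months that occur in different years.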
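
As an alternative to looping over every day, the per-month count can also be computed as the overlap between the stay and each month of the financial year. The following is only a sketch under stated assumptions: the function name `fy_bed_days` and the `FY_MONTHS` list are made up for illustration, the FY code is assumed to be a four-digit number in the 2000s such as 1718, and the discharge day is not counted, which matches the table in the question rather than the outputs shown above.

```python
from datetime import date

# Month labels in financial-year order (April first), as in the question's table
FY_MONTHS = ["Apr", "May", "Jun", "Jul", "Aug", "Sep",
             "Oct", "Nov", "Dec", "Jan", "Feb", "Mar"]


def fy_bed_days(admission, discharge, fy):
    """Days per FY month between two dates; the discharge day is excluded (assumption)."""
    start_year = 2000 + fy // 100  # 1718 -> FY starting 1 April 2017
    result = {}
    for i, name in enumerate(FY_MONTHS):
        month = (3 + i) % 12 + 1                      # 4, 5, ..., 12, 1, 2, 3
        year = start_year + (1 if month < 4 else 0)   # Jan-Mar belong to the next calendar year
        month_start = date(year, month, 1)
        # First day of the following month, rolling over the year after December
        if month == 12:
            next_start = date(year + 1, 1, 1)
        else:
            next_start = date(year, month + 1, 1)
        # Length of the overlap of [admission, discharge) with [month_start, next_start)
        overlap = (min(discharge, next_start) - max(admission, month_start)).days
        result[name] = max(overlap, 0)
    return result


print(fy_bed_days(date(2017, 1, 1), date(2017, 6, 5), 1718))
```

Because the result covers exactly one financial year, the short month names are unambiguous here, and the dictionary keeps the April to March order on Python 3.7 or later.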
