I am reading in a bunch of SAS files like this:

    import pandas as pd

    demography = pd.read_sas("demography.sas7bdat", encoding='latin-1')
    adverse_event_ds = pd.read_sas("adverse_event_ds.sas7bdat", encoding='latin-1')
    rpt10344 = pd.read_sas("rpt10344.sas7bdat", encoding='latin-1')
    vaccine_administration = pd.read_sas("vaccine_administration.sas7bdat", encoding='latin-1')
    lab_tests_blood_chemistry_ds = pd.read_sas("lab_tests_blood_chemistry_ds.sas7bdat", encoding='latin-1')
    lab_tests_hematology_ds = pd.read_sas("lab_tests_hematology_ds.sas7bdat", encoding='latin-1')
    lab_tests_miscellaneous_ds = pd.read_sas("lab_tests_miscellaneous_ds.sas7bdat", encoding='latin-1')
    vital_signs = pd.read_sas("vital_signs.sas7bdat", encoding='latin-1')
I would like to be able to replace that with something like this:

    datasets = ["demography", "adverse_event_ds", "rpt10344", "vaccine_administration",
                "lab_tests_blood_chemistry_ds", "lab_tests_hematology_ds",
                "lab_tests_miscellaneous_ds", "vital_signs"]

    for dataset in datasets:
        dataset = pd.read_sas(dataset + ".sas7bdat", encoding='latin-1')
But when I then do something like:

    demography.info()

I get:

    NameError: name 'demography' is not defined

What is going on, and how can I fix it?
Best answer
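The loop assigns to the name dataset on every iteration; it does not create new variables (e.g. demography, rpt10344, etc.). Each pass simply rebinds dataset to the DataFrame that was just read, so when the loop finishes only the last DataFrame is still reachable, under the name dataset.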
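To make the rebinding concrete, here is a minimal sketch that needs neither pandas nor any SAS file. The placeholder strings stand in for the DataFrames that `pd.read_sas` would return; they are an assumption made purely for illustration.

```python
datasets = ["demography", "vital_signs"]

for dataset in datasets:
    # Rebinds the loop variable `dataset` to a new object.
    # This does NOT create a variable named `demography` or `vital_signs`.
    dataset = f"<contents of {dataset}.sas7bdat>"

# Only the value from the final iteration survives, under the name `dataset`.
last_value = dataset

# The list itself is untouched: it still holds the original names.
names_after_loop = list(datasets)

# Looking up `demography` fails, exactly as in the question.
try:
    demography
except NameError as exc:
    error_message = str(exc)
```

After this runs, `last_value` holds the placeholder for `vital_signs`, and `error_message` reports that `demography` is not defined.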
I would use a dictionary of datasets, like so:

    dsd = {}
    for dataset in datasets:
        # Key each DataFrame by its dataset name.
        dsd[dataset] = pd.read_sas(dataset + ".sas7bdat", encoding='latin-1')
Or, the more Pythonic route, a dict comprehension:

    dsd = {d: pd.read_sas(d + ".sas7bdat", encoding='latin-1') for d in datasets}
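With the dictionary in place, each DataFrame is looked up by name, e.g. `dsd["demography"].info()`. The sketch below wraps the same comprehension in a small helper so it can be tried without real SAS files. `load_datasets`, its `loader` parameter and its `directory` parameter are hypothetical names introduced here, not part of pandas.

```python
from pathlib import Path


def load_datasets(names, loader, directory="."):
    """Return {name: loader(path)} for every dataset name.

    `loader` is any callable taking a file path; with pandas it would
    typically wrap pd.read_sas (see the commented usage below).
    """
    base = Path(directory)
    return {name: loader(base / f"{name}.sas7bdat") for name in names}


# Stand-in loader so the sketch runs without pandas or SAS files:
# it just returns the file name it was asked to read.
demo = load_datasets(["demography", "vital_signs"], lambda path: path.name)

# Assumed real usage, with pandas installed and the files present:
# import pandas as pd
# dsd = load_datasets(datasets, lambda p: pd.read_sas(p, encoding="latin-1"))
# dsd["demography"].info()
```

Passing the loader in as an argument is only a convenience for experimenting; the plain comprehension above does the same job.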
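I strongly recommend against assigning to individual variable names, for the reasons explained here and here, but if you absolutely must, you can use:

    for d in datasets:
        # Creates a module-level variable named after each dataset (discouraged).
        globals()[d] = pd.read_sas(d + ".sas7bdat", encoding='latin-1')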
Source: the Stack Overflow question on reading multiple SAS files from a list into separate dataframes with Python pandas: https://stackoverflow.com/questions/52894787/