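I want to convert the entire dataset below to percentages.
https://cocl.us/datascience_survey_data
Each value should be expressed as a percentage of its row total.
For example, for Big Data (Spark / Hadoop) the row total is 1332 + 729 + 127 = 2188,
so the percentage for Very interested would be 60.87%.
I want to do this automatically for every row.
How can I do that?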
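Best answer
You can divide every value by the sum of its row with DataFrame.div (passing axis=0 so that the row sums are aligned with the index), then multiply by 100 with mul: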
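import pandas as pd

# The first column of the CSV holds the topic names, so use it as the index.
df = pd.read_csv('Topic_Survey_Assignment.csv', index_col=0)
# Divide each row by its own total, then scale to percentages.
df1 = df.div(df.sum(axis=1), axis=0).mul(100)
print(df1)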
                            Very interested  Somewhat interested  Not interested
Big Data (Spark / Hadoop)         60.877514            33.318099        5.804388
Data Analysis / Statistics        77.007299            20.255474        2.737226
Data Journalism                   20.235849            50.990566       28.773585
Data Visualization                61.580882            33.731618        4.687500
Deep Learning                     58.229599            35.500231        6.270171
Machine Learning                  74.724771            21.880734        3.394495
Details (the row totals used as divisors):
print(df.sum(axis=1))
Big Data (Spark / Hadoop)     2188
Data Analysis / Statistics    2192
Data Journalism               2120
Data Visualization            2176
Deep Learning                 2169
Machine Learning              2180
dtype: int64
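As a quick sanity check that does not need the CSV file, here is a minimal sketch with two rows typed in by hand. The Big Data counts come from the question; the Data Analysis counts are back-calculated from the row total and percentages printed above, so treat them as an assumption rather than a copy of the file. The sketch also shows why axis=0 matters: a plain `df / row_totals` aligns the totals with the column labels instead of the index and yields only NaN.

```python
import pandas as pd

# Hand-typed sample. The second row is reconstructed from the printed
# row total (2192) and percentages, not read from the CSV.
df = pd.DataFrame(
    {
        "Very interested": [1332, 1688],
        "Somewhat interested": [729, 444],
        "Not interested": [127, 60],
    },
    index=["Big Data (Spark / Hadoop)", "Data Analysis / Statistics"],
)

row_totals = df.sum(axis=1)                 # one total per row
pct = df.div(row_totals, axis=0).mul(100)   # axis=0: match totals to the index

# Each row of percentages should add up to 100 (within float error).
assert (pct.sum(axis=1) - 100).abs().max() < 1e-9

# Pitfall: without axis=0 the Series index is matched against the column
# names, nothing lines up, and every cell becomes NaN.
wrong = df / row_totals
assert wrong.isna().all().all()

print(pct.round(2))
```

If two decimals are enough for display, `round(2)` on the result as in the last line keeps the underlying data untouched.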
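The NumPy alternative is very similar:
import numpy as np
import pandas as pd

df = pd.read_csv('Topic_Survey_Assignment.csv', index_col=0)
arr = df.to_numpy()
# [:, None] turns the row sums into a column so they broadcast across each row.
df1 = pd.DataFrame(arr / np.sum(arr, axis=1)[:, None] * 100,
                   index=df.index,
                   columns=df.columns)
print(df1)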
                            Very interested  Somewhat interested  Not interested
Big Data (Spark / Hadoop)         60.877514            33.318099        5.804388
Data Analysis / Statistics        77.007299            20.255474        2.737226
Data Journalism                   20.235849            50.990566       28.773585
Data Visualization                61.580882            33.731618        4.687500
Deep Learning                     58.229599            35.500231        6.270171
Machine Learning                  74.724771            21.880734        3.394495
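The `[:, None]` indexing in the NumPy version reshapes the row sums from shape (n,) to (n, 1) so they broadcast across the columns. A possible variation, sketched below on a hand-typed one-row sample (the Big Data counts from the question), is to pass `keepdims=True` to `sum`, which keeps that extra axis directly. The comparison against the pandas result is only there to show that both routes agree.

```python
import numpy as np
import pandas as pd

# One-row sample taken from the question; the real data has six rows.
df = pd.DataFrame(
    {
        "Very interested": [1332],
        "Somewhat interested": [729],
        "Not interested": [127],
    },
    index=["Big Data (Spark / Hadoop)"],
)

arr = df.to_numpy(dtype=float)

# keepdims=True keeps the summed axis as length 1, giving shape (n, 1),
# which broadcasts against the (n, 3) array just like [:, None] does.
pct_arr = arr / arr.sum(axis=1, keepdims=True) * 100
pct_np = pd.DataFrame(pct_arr, index=df.index, columns=df.columns)

# Same computation with pandas only, for comparison.
pct_pd = df.div(df.sum(axis=1), axis=0).mul(100)
assert np.allclose(pct_np.to_numpy(), pct_pd.to_numpy())

print(pct_np.round(2))
```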
Original question on Stack Overflow (python, converting an entire dataset to percentages): https://stackoverflow.com/questions/58785948/