I want to convert the entire dataset below to percentages.

https://cocl.us/datascience_survey_data

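Each value should be expressed as a percentage of its row's total.

For example, Big Data (Spark / Hadoop) = 1332 + 729 + 127 = 2188

So the percentage for Very interested would be 60.87% (1332 / 2188).

I want to do this automatically for all rows.
How can I do that?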

Best answer

You can divide every value by its row sum with DataFrame.div, then multiply by 100 with DataFrame.mul:

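import pandas as pd

# Use the first CSV column (the topic names) as the index
df = pd.read_csv('Topic_Survey_Assignment.csv', index_col=0)

# Divide each row by its own total, then scale to percent
df1 = df.div(df.sum(axis=1), axis=0).mul(100)
print(df1)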
                            Very interested  Somewhat interested  \
Big Data (Spark / Hadoop)         60.877514            33.318099
Data Analysis / Statistics        77.007299            20.255474
Data Journalism                   20.235849            50.990566
Data Visualization                61.580882            33.731618
Deep Learning                     58.229599            35.500231
Machine Learning                  74.724771            21.880734

                            Not interested
Big Data (Spark / Hadoop)         5.804388
Data Analysis / Statistics        2.737226
Data Journalism                  28.773585
Data Visualization                4.687500
Deep Learning                     6.270171
Machine Learning                  3.394495
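
As a quick sanity check, each row of the result should add up to 100. The following is a minimal sketch that uses a small hand-built frame instead of the CSV: the first row holds the Big Data counts quoted in the question, while the second row (`Example topic`) is made up purely for illustration.

```python
import pandas as pd

# First row: counts quoted in the question. Second row: hypothetical values.
df = pd.DataFrame(
    {
        "Very interested": [1332, 50],
        "Somewhat interested": [729, 30],
        "Not interested": [127, 20],
    },
    index=["Big Data (Spark / Hadoop)", "Example topic"],
)

# Divide each row by its own total, then scale to percent.
pct = df.div(df.sum(axis=1), axis=0).mul(100)

# Every row should now sum to 100 (up to floating-point error).
row_totals = pct.sum(axis=1)
print(row_totals)

# Round only for display; keep the unrounded frame for further computation.
print(pct.round(2))
```

Rounding at display time rather than before the check avoids rows that appear to total 99.99 or 100.01.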


Details:

print(df.sum(axis=1))
Big Data (Spark / Hadoop)     2188
Data Analysis / Statistics    2192
Data Journalism               2120
Data Visualization            2176
Deep Learning                 2169
Machine Learning              2180
dtype: int64
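
The `axis=0` argument matters here: `df.sum(axis=1)` returns a Series indexed by the row labels, and `axis=0` tells `div` to match that index against the DataFrame's index rather than its columns. A small sketch of the difference, assuming a hand-built frame whose first row is the Big Data counts from the question and whose second row is hypothetical:

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Very interested": [1332, 50],
        "Somewhat interested": [729, 30],
        "Not interested": [127, 20],
    },
    index=["Big Data (Spark / Hadoop)", "Example topic"],  # second row is made up
)
row_sums = df.sum(axis=1)

# axis=0: align the Series with the row labels,
# so each row is divided by its own total.
by_row = df.div(row_sums, axis=0)

# Default axis ('columns'): the row labels are matched against the column
# names, nothing lines up, and every cell comes back as NaN.
misaligned = df.div(row_sums)

print(by_row)
print(misaligned.isna().all().all())
```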


A NumPy alternative is very similar:

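import numpy as np
import pandas as pd

df = pd.read_csv('Topic_Survey_Assignment.csv', index_col=0)

# Work on the underlying 2-D array
arr = df.to_numpy()
# [:, None] turns the row sums into a column so they broadcast across each row
df1 = pd.DataFrame(arr / np.sum(arr, axis=1)[:, None] * 100,
                   index=df.index,
                   columns=df.columns)
print(df1)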
                            Very interested  Somewhat interested  \
Big Data (Spark / Hadoop)         60.877514            33.318099
Data Analysis / Statistics        77.007299            20.255474
Data Journalism                   20.235849            50.990566
Data Visualization                61.580882            33.731618
Deep Learning                     58.229599            35.500231
Machine Learning                  74.724771            21.880734

                            Not interested
Big Data (Spark / Hadoop)         5.804388
Data Analysis / Statistics        2.737226
Data Journalism                  28.773585
Data Visualization                4.687500
Deep Learning                     6.270171
Machine Learning                  3.394495
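
A small variation that is not part of the original answer: passing `keepdims=True` keeps the summed axis as a length-1 dimension, so the row totals broadcast without the `[:, None]` indexing. This is only a sketch on a hand-built array, where the first row is the Big Data counts from the question and the second row is made up for illustration:

```python
import numpy as np

# First row: counts from the question. Second row: hypothetical values.
arr = np.array([[1332, 729, 127],
                [50, 30, 20]])

# Shape (2, 1): one total per row, ready to broadcast across the columns.
totals = arr.sum(axis=1, keepdims=True)
pct = arr / totals * 100

# The [:, None] form gives the same result.
pct_alt = arr / np.sum(arr, axis=1)[:, None] * 100
print(np.allclose(pct, pct_alt))
```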

For "python - converting an entire dataset to percentages", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/58785948/
