I have a table in which two different kinds of columns, attributes and questions, have been stacked into the field column.
+-------+------------+-------+
| id | field | value |
+-------+------------+-------+
| 52394 | gender | M |
| 52394 | age | 24 |
| 52394 | question_1 | 2 |
| 52394 | question_2 | 1 |
+-------+------------+-------+
I want to reshape it so that gender and age become columns while question_1 and question_2 stay stacked.
+-------+--------+-----+------------+-------+
| id | gender | age | field | value |
+-------+--------+-----+------------+-------+
| 52394 | M | 24 | question_1 | 2 |
| 52394 | M | 24 | question_2 | 1 |
+-------+--------+-----+------------+-------+
Any ideas on how to do this?
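Best answer
Here is my strategy:
Apply pivot to the rows of df where field is gender or age, and save the result as df1. Select the rows of df where field is not gender or age, and save them as df2. Then merge df1 and df2 on id. Here is my full code: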
import pandas as pd
import sys
# StringIO lives in the io module on Python 3
if sys.version_info[0] < 3:
    from StringIO import StringIO
else:
    from io import StringIO
# Create df
rawText = StringIO("""
id field value
52394 gender M
52394 age 24
52394 question_1 2
52394 question_2 1
""")
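df = pd.read_csv(rawText, sep=r"\s+")
# Pivot only the attribute rows (gender, age) into columns
df1 = df[df['field'].isin(['gender', 'age'])]
df1 = df1.pivot(index='id', columns='field', values='value').reset_index()
# Keep the question rows stacked
df2 = df[~df['field'].isin(['gender', 'age'])]
# Merge the two pieces on the shared id column
print(df1.merge(df2, on='id'))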
The result is:
id age gender field value
0 52394 24 M question_1 2
1 52394 24 M question_2 1
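The same steps can be wrapped in a small helper so the list of fields to pivot is passed in once. This is only a sketch: the function name pivot_selected and its parameters are hypothetical, not part of pandas. It also shows one detail the output above hides: because the value column mixes text and numbers, age comes out of the pivot as a string and has to be converted explicitly if a number is wanted.

```python
import pandas as pd


def pivot_selected(df, keys, id_col="id", field_col="field", value_col="value"):
    """Pivot only the rows whose field is in `keys`; leave the rest stacked.

    Hypothetical helper for illustration, not a pandas API.
    """
    is_key = df[field_col].isin(keys)
    wide = (
        df[is_key]
        .pivot(index=id_col, columns=field_col, values=value_col)
        .reset_index()
    )
    # pivot leaves the name "field" on the column axis; clear it for tidy output
    wide.columns.name = None
    # Join the pivoted attributes back onto the rows that stayed stacked
    return wide.merge(df[~is_key], on=id_col)


df = pd.DataFrame({
    "id": [52394] * 4,
    "field": ["gender", "age", "question_1", "question_2"],
    "value": ["M", "24", "2", "1"],
})

result = pivot_selected(df, ["gender", "age"])
# The value column held mixed text, so age is still a string here
result["age"] = pd.to_numeric(result["age"])
print(result)
```

Passing on=id_col to merge makes the join key explicit instead of relying on the two frames happening to share only one column name.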
Regarding python - Pandas: pivoting only selected rows, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/47963769/