Problem description
I am having some trouble with the wide_to_long function. This example works fine:
Loc Nom Meas-1 Meas-2 Meas-3
200 A 0.8 1.1 1.2
201 B 4.9 5.1 5.2
pd.wide_to_long(df, 'Meas', i=['Loc','Nom'], j='Ref', sep='-').reset_index()
Loc Nom Meas Ref
200 A 0.8 1
200 A 1.1 2
200 A 1.2 3
201 B 4.9 1
201 B 5.1 2
201 B 5.2 3
My problem is that the string that follows "Meas-" in my dataframe is a random, alpha-numeric serial number. A basic example:
Loc Nom Meas-1 Meas-2D Meas-3
200 A 0.8 1.1 1.2
201 B 4.9 5.1 5.2
pd.wide_to_long(df, 'Meas', i=['Loc','Nom'], j='Ref', sep='-').reset_index()
Loc Nom Meas Meas-2D Ref
200 A 0.8 1.1 1
200 A 1.2 1.1 3
201 B 4.9 5.1 1
201 B 5.2 5.1 3
Worse, if all of the "Meas-" parts are followed by strings containing letters, I get an empty dataframe error:
Loc Nom Meas-1D Meas-2D Meas-3D
200 A 0.8 1.1 1.2
201 B 4.9 5.1 5.2
pd.wide_to_long(df, 'Meas', i=['Loc','Nom'], j='Ref', sep='-').reset_index()
Empty DataFrame
How can I get this function to use whatever string follows "Meas-" for Ref, and not only numbers?
Thanks!
Recommended answer
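You should look at the suffix parameter. If you do not specify it, it defaults to the regex '\d+', which matches only digits. Since '2D' is not purely numeric, the 'Meas-2D' column is not recognized as part of the 'Meas' group, so it is left unmelted (and if no column matches at all, nothing is returned). Passing a broader pattern such as '\w+' makes letters and digits both count as a valid suffix: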
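To see why the default suffix skips 'Meas-2D', here is a minimal sketch of the column matching using only the standard `re` module. The helper `matching_columns` is hypothetical and only approximates how `wide_to_long` selects columns (stub name, then separator, then suffix, anchored at both ends); it is not the library's actual code.

```python
import re

# Column names from the second example in the question.
columns = ["Loc", "Nom", "Meas-1", "Meas-2D", "Meas-3"]


def matching_columns(columns, stub, sep, suffix):
    """Hypothetical helper: return the columns of the form stub + sep + suffix.

    Assumes the whole column name must match, which is why '2D' fails
    against a digits-only suffix.
    """
    pattern = re.compile(rf"^{re.escape(stub)}{re.escape(sep)}{suffix}$")
    return [c for c in columns if pattern.match(c)]


# Default suffix: digits only, so 'Meas-2D' is left out.
default_match = matching_columns(columns, "Meas", "-", r"\d+")

# Broader suffix: any run of letters, digits, or underscores.
word_match = matching_columns(columns, "Meas", "-", r"\w+")

print(default_match)
print(word_match)
```

If the serial numbers can contain characters outside `\w` (for example another hyphen), a wider pattern such as `.+` may be needed; that depends on the actual column names.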
pd.wide_to_long(df, 'Meas', i=['Loc','Nom'], j='Ref', sep='-', suffix=r'\w+').reset_index()
Out[289]:
Loc Nom Ref Meas
0 200 A 1 0.8
1 200 A 2D 1.1
2 200 A 3 1.2
3 201 B 1 4.9
4 201 B 2D 5.1
5 201 B 3 5.2
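For anyone who wants to reproduce this, the following is a self-contained sketch. The DataFrame is rebuilt by hand from the table in the question (the construction itself is an assumption, since the original post does not show how `df` was created), and the suffix is written as a raw string so the backslash is not treated as a string escape.

```python
import pandas as pd

# Rebuild the question's example frame (assumed construction).
df = pd.DataFrame({
    "Loc": [200, 201],
    "Nom": ["A", "B"],
    "Meas-1": [0.8, 4.9],
    "Meas-2D": [1.1, 5.1],
    "Meas-3": [1.2, 5.2],
})

# suffix=r"\w+" lets alphanumeric serials such as '2D' count as a suffix.
long_df = pd.wide_to_long(
    df, "Meas", i=["Loc", "Nom"], j="Ref", sep="-", suffix=r"\w+"
).reset_index()

print(long_df)
```

Each (Loc, Nom) pair should now appear once per serial, with 'Meas-2D' folded into the Meas column instead of remaining as a separate column.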