python - How do I correctly identify float values in [0, 1] that contain a dot in an object-dtype DataFrame?

I have a DataFrame like this, where my values are of object dtype:

import pandas as pd

df = pd.DataFrame(data=['A', '290', '0.1744175757', '1', '1.0000000000'], columns=['Value'])

df
Out[65]:
          Value
0             A
1           290
2  0.1744175757
3             1
4  1.0000000000

df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5 entries, 0 to 4
Data columns (total 1 columns):
Value    5 non-null object
dtypes: object(1)
memory usage: 120.0+ bytes

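What I want to do is select only the percentages, in this case the values 0.1744175757 and 1.0000000000. In my data these always happen to contain a period/dot. This is the key point: I need to be able to distinguish the integer value 1 from the percentage 1.0000000000, and likewise 0 from 0.0000000000.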

I tried checking for the presence of the dot character, but that doesn't work: it returns True for every value, and I'm not clear why.

df[df['Value'].str.contains('.')]
Out[67]:
          Value
0             A
1           290
2  0.1744175757
3             1
4  1.0000000000

I also tried isdecimal(), but that is not what I want:

df[df['Value'].str.isdecimal()]
Out[68]:
  Value
1   290
3     1

The closest I have come is this function:

def isPercent(x):

    if pd.isnull(x):
        return False

    try:
        x = float(x)
        return x % 1 != 0
    except (TypeError, ValueError):
        return False

df[df['Value'].apply(isPercent)]
Out[74]:
          Value
2  0.1744175757

But this fails to identify cases such as 1.0000000000 (and 0.0000000000).

I have two questions:

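Why doesn't str.contains('.') work here? It seems like the simplest approach, since it would get me the data I need 100% of the time, but it returns True even for values that clearly contain no '.' character.
How can I correctly identify all values in [0, 1] that contain a dot character?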

Best answer

By default, str.contains performs a regex-based search, and the regex engine treats '.' as matching any character. To disable this, use regex=False:

df[df['Value'].str.contains('.', regex=False)]

          Value
2  0.1744175757
4  1.0000000000
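
To make the difference concrete, here is a minimal sketch that compares the two modes side by side on the same values. The names values, regex_mask and literal_mask are only illustrative and do not come from the original post.

```python
import pandas as pd

values = pd.Series(['A', '290', '0.1744175757', '1', '1.0000000000'])

# Interpreted as a regex, '.' matches any single character,
# so every non-empty string counts as a match.
regex_mask = values.str.contains('.')

# With regex=False the pattern is treated as a plain substring,
# so only strings containing a literal dot match.
literal_mask = values.str.contains('.', regex=False)

print(regex_mask.tolist())    # [True, True, True, True, True]
print(literal_mask.tolist())  # [False, False, True, False, True]
```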

You can also escape it so that it is treated literally:

df[df['Value'].str.contains(r'\.')]

          Value
2  0.1744175757
4  1.0000000000
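
If the literal text comes from a variable rather than being typed by hand, one option is to let the standard library's re.escape do the escaping. This is only a sketch: the variable name needle is hypothetical and stands in for whatever literal string you are searching for.

```python
import re

import pandas as pd

values = pd.Series(['A', '290', '0.1744175757', '1', '1.0000000000'])

needle = '.'                 # hypothetical literal text to search for
escaped = re.escape(needle)  # backslash-escapes regex metacharacters

# The escaped pattern now matches only a literal dot.
mask = values.str.contains(escaped)
print(values[mask].tolist())  # ['0.1744175757', '1.0000000000']
```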

If you really want only the floating-point numbers, try a more robust regex:

df[df['Value'].str.contains(r'\d+\.\d+')].astype(float)

      Value
2  0.174418
4  1.000000
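
The pattern above is not anchored and does not check the [0, 1] range from the question. The following sketch is one possible way to combine both conditions; it is not part of the original answer. It assumes pandas 1.1 or later for str.fullmatch, and the extra sample rows ('0.0000000000', '12.5', None) are added purely for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {'Value': ['A', '290', '0.1744175757', '1', '1.0000000000',
               '0.0000000000', '12.5', None]}
)

# The whole string must be digits, a literal dot, then digits.
# na=False makes missing values count as non-matches.
has_dot = df['Value'].str.fullmatch(r'\d+\.\d+', na=False)

# Convert to numbers; anything that cannot be parsed becomes NaN.
numeric = pd.to_numeric(df['Value'], errors='coerce')

# between() is inclusive on both ends by default and is False for NaN.
in_range = numeric.between(0, 1)

percent = df[has_dot & in_range]
print(percent['Value'].tolist())
# ['0.1744175757', '1.0000000000', '0.0000000000']
```

Checking the string form first is what keeps '1' and '1.0000000000' apart; after conversion to float both would simply be 1.0.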

Regarding "python - How do I correctly identify float values in [0, 1] that contain a dot in an object-dtype DataFrame?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/55582520/