This regular expression needs to find "1x" and "x1", but it must also be able to find two-digit numbers such as "10x" and "x11".

leverage_match = re.compile(r"\d+X|X\d+", flags=re.IGNORECASE)
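
As a quick sanity check outside pandas, the pattern can be run directly against the four example strings mentioned above. This is only a minimal sketch using the standard `re` module; the names `samples` and `matches` are illustrative and not part of the original code.

```python
import re

# Same pattern as in the question.
leverage_match = re.compile(r"\d+X|X\d+", flags=re.IGNORECASE)

# The four example strings from the description above.
samples = ["1x", "x1", "10x", "x11"]

# findall returns every non-overlapping match in each string.
matches = [leverage_match.findall(s) for s in samples]
print(matches)
```

Each sample is matched in full, which suggests that the pattern itself is not what drops the second digit.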


According to regex101.com, the regular expression above should be sufficient to capture all of the numbers in the following:

import pandas as pd
import re

df = pd.DataFrame(["BULL ESTOX 11X S", "BULL ESTOX X12 S"], columns=["name"])

name
"BULL ESTOX 11X S"
"BULL ESTOX X12 S"


However, with the code below it returns only one digit; for example, "11X" becomes "1X".

leverage_match = re.compile(r"\d+X|X\d+", flags=re.IGNORECASE) #<- Same as seen above

def f(value):

    f2 = lambda x: leverage_match.findall(x)[0] if len(leverage_match.findall(x)) > 0 else ""

    leverage = f2(value)

    if leverage != "":
        return "{}".format(leverage)

    if leverage[0].replace("X","x") == "x":
        leverage = leverage[1]+leverage[0].replace('X','x')

df["description"] = df["name"].map(lambda x:f(x))


----------

Update: here is the complete code, to make sure I have not left out anything necessary:

import pandas as pd
import re

df = pd.DataFrame(["BULL ESTOX 11X S", "BULL ESTOX X12 S"], columns=["name"])

description_map = {"ESTOX":"Euro STOXX 50"}
underlying_match = re.compile(r"\s(\S+)\s")
leverage_match = re.compile(r"\d+X|X\d+", flags=re.IGNORECASE)

def f(value):

    f1 = lambda x: description_map[underlying_match.findall(x)[0]] if underlying_match.findall(x)[0] in description_map else ""
    f2 = lambda x: leverage_match.findall(x)[0] if len(leverage_match.findall(x)) > 0 else ""
    f3 = lambda x: "-" if "BEAR" in x else "-" if "SHORT" in x else ""

    underlying = f1(value)
    leverage = f2(value)
    sign = f3(value)

    statement = "Tracks " + underlying

    if underlying == "":
        if sign == "-" and leverage == "":
            return statement + "{}".format("inversely.")
        if sign == "-" and leverage != "":
            return statement + "{} with {}{} leverage.".format("inversely", sign, leverage)
        if sign == "" and leverage != "":
            return statement + "with {}{} leverage.".format(sign, leverage)
        else:
            return "Tracks"

    if leverage[0].replace("X","x") == "x":
        leverage = leverage[1]+leverage[0].replace('X','x')

    if leverage != "" and sign == "-":
        statement += " {} with {}{} leverage.".format("inversely", sign, leverage)
    elif leverage != "" and sign == "":
        statement += " with {} leverage.".format(leverage)
    else:
        if sign == "-":
            statement += " {} ".format("inversely")

    return statement

df["description"] = df["name"].map(lambda x:f(x))

print(df)

Best answer

I think the example you provided is wrong. With the following df:

df = pd.DataFrame(["BULL AXP 11X S", "BULL AXP X11 S"], columns=["name"])


the output is as follows:

             name                                 description
0  BULL AXP 11X S  Tracks American Express with 11X leverage.
1  BULL AXP X11 S   Tracks American Express with 1x leverage.


Here x11 becomes 1x because of a logic error in the following part of your code:

if leverage[0].replace("X","x") == "x":
    leverage = leverage[1]+leverage[0].replace('X','x')


Instead, it should look like this: (UPDATE)

if leverage[0].replace("X","x") == "x":
    leverage = ''.join(leverage[1:])+leverage[0].replace('X','x')
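
The difference comes down to indexing versus slicing: `leverage[1]` returns a single character, while `leverage[1:]` returns everything from position 1 to the end of the string. A minimal illustration, using "X11" as a sample value (the variable names here are only for demonstration):

```python
leverage = "X11"

# Indexing picks exactly one character, so the second digit is lost.
first_digit_only = leverage[1] + leverage[0].replace("X", "x")

# Slicing keeps every character after the leading "X".
all_digits = leverage[1:] + leverage[0].replace("X", "x")

print(first_digit_only)
print(all_digits)
```

Since `leverage` is already a string, `leverage[1:]` on its own gives the same result as `''.join(leverage[1:])`.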


Once you fix that, the output is what you expect, as shown below:

             name                                 description
0  BULL AXP 11X S  Tracks American Express with 11X leverage.
1  BULL AXP X11 S  Tracks American Express with 11x leverage.


Full code

import pandas as pd
import re

df = pd.DataFrame(["BULL ESTOX 11X S", "BULL ESTOX X12 S"], columns=["name"])

description_map = {"ESTOX":"Euro STOXX 50"}
underlying_match = re.compile(r"\s(\S+)\s")
leverage_match = re.compile(r"\d+X|X\d+", flags=re.IGNORECASE)

def f(value):

    f1 = lambda x: description_map[underlying_match.findall(x)[0]] if underlying_match.findall(x)[0] in description_map else ""
    f2 = lambda x: leverage_match.findall(x)[0] if len(leverage_match.findall(x)) > 0 else ""
    f3 = lambda x: "-" if "BEAR" in x else "-" if "SHORT" in x else ""

    underlying = f1(value)
    leverage = f2(value)
    sign = f3(value)

    statement = "Tracks " + underlying

    if underlying == "":
        if sign == "-" and leverage == "":
            return statement + "{}".format("inversely.")
        if sign == "-" and leverage != "":
            return statement + "{} with {}{} leverage.".format("inversely", sign, leverage)
        if sign == "" and leverage != "":
            return statement + "with {}{} leverage.".format(sign, leverage)
        else:
            return "Tracks"

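    # Move a leading "x"/"X" behind the digits, e.g. "X12" -> "12x".
    # The emptiness check avoids an IndexError when no leverage was found.
    if leverage and leverage[0].replace("X","x") == "x":
        leverage = ''.join(leverage[1:])+leverage[0].replace('X','x')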

    if leverage != "" and sign == "-":
        statement += " {} with {}{} leverage.".format("inversely", sign, leverage)
    elif leverage != "" and sign == "":
        statement += " with {} leverage.".format(leverage)
    else:
        if sign == "-":
            statement += " {} ".format("inversely")

    return statement

df["description"] = df["name"].map(lambda x:f(x))

print(df)
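
As an alternative sketch (not part of the original answer), the reordering step could be avoided altogether by capturing only the digits with groups and appending the "x" afterwards. The names `leverage_pattern` and `normalize_leverage` are hypothetical, and note that this variant lowercases the suffix in both cases ("11X" becomes "11x"), which differs slightly from the output shown above.

```python
import re

# Group 1 captures digits before the X, group 2 captures digits after it.
leverage_pattern = re.compile(r"(\d+)X|X(\d+)", flags=re.IGNORECASE)

def normalize_leverage(name):
    """Return the leverage as '<digits>x', or '' if none is found."""
    match = leverage_pattern.search(name)
    if match is None:
        return ""
    # Exactly one of the two groups is set, depending on which branch matched.
    digits = match.group(1) or match.group(2)
    return digits + "x"

for name in ["BULL ESTOX 11X S", "BULL ESTOX X12 S", "BULL ESTOX S"]:
    print(repr(normalize_leverage(name)))
```

Because the digits are captured as a whole group, there is no index arithmetic that could silently drop a character, and a name without any leverage simply yields an empty string.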

Regarding "python - Python: re.compile() function error with two-digit numbers", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/31070854/
