Selecting multiple values from a string


Problem description

I have data like the sample data below, and I'm trying to pattern match and parse it to create something like the output data. The idea is: if a string value contains "Aggr(", parse the "stuff" inside the parentheses, then parse the "something" that follows the comma, up to the next parenthesis. Is there a slick way to do this, for example with a regex, or is it going to require a couple of loops?

Sample Data:

import pandas as pd

SampleDf=pd.DataFrame([['tom',"words Aggr(stuff),something1"],['bob',"Morewords Aggr(Diffstuff),something2"]],columns=['ReportField','OtherField'])

Sample Output:

OutputDf=pd.DataFrame([['tom',"words Aggr(stuff),something1",'stuff', 'something1'],['bob',"Morewords Aggr(Diffstuff),something2",'Diffstuff','something2']],columns=['ReportField','OtherField','Part1','Part2'])

Recommended answer

You can use str.extract to capture the patterns in the string and turn each named group into a column:

pd.concat([
    SampleDf,
    SampleDf.OtherField.str.extract(r"Aggr\((?P<Part1>.*?)\),(?P<Part2>[^\(]*)", expand=True)
], axis=1)

#   ReportField                             OtherField      Part1        Part2
#0          tom           words Aggr(stuff),something1      stuff   something1
#1          bob   Morewords Aggr(Diffstuff),something2  Diffstuff   something2
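As a self-contained variant of the snippet above, the following sketch rebuilds the sample frame and adds one extra row without "Aggr(" to show what happens when the pattern does not match. The third row ("amy") and the variable names `sample_df`, `parts`, and `result` are assumptions added for illustration, and `DataFrame.join` is used here in place of `pd.concat` simply as an alternative way to align on the index.

```python
import pandas as pd

# Sample data from the question, plus one hypothetical row with no "Aggr(" in it.
sample_df = pd.DataFrame(
    [
        ["tom", "words Aggr(stuff),something1"],
        ["bob", "Morewords Aggr(Diffstuff),something2"],
        ["amy", "no aggregate here"],  # assumption: added to show the no-match case
    ],
    columns=["ReportField", "OtherField"],
)

# Named groups become the column names of the extracted frame.
pattern = r"Aggr\((?P<Part1>.*?)\),(?P<Part2>[^\(]*)"
parts = sample_df["OtherField"].str.extract(pattern, expand=True)

# Join on the shared index; rows that do not match get NaN in Part1 and Part2.
result = sample_df.join(parts)
print(result)
```

Rows where the pattern is absent are kept and filled with NaN rather than dropped, so the result can be filtered afterwards with something like `result.dropna(subset=["Part1"])` if only matching rows are wanted.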

The regex Aggr\((?P<Part1>.*?)\),(?P<Part2>[^\(]*) captures the two pieces you need. The first part, Aggr\((?P<Part1>.*?)\), contains the group named Part1, which matches the content of the first parentheses after Aggr. The second part, ,(?P<Part2>[^\(]*), contains the group named Part2, which matches everything after the comma that follows the first part, up to the next opening parenthesis.
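To see the two named groups in isolation, here is a minimal sketch that applies the same pattern to a single string with Python's standard `re` module. The helper name `parse_aggr` and the test string with a trailing "(next)" are hypothetical, chosen only to show that Part2 stops at the next opening parenthesis.

```python
import re

# Same pattern as in the answer, compiled once for reuse.
PATTERN = re.compile(r"Aggr\((?P<Part1>.*?)\),(?P<Part2>[^\(]*)")


def parse_aggr(text):
    """Return a dict with Part1 and Part2, or None if "Aggr(...)," is not found."""
    match = PATTERN.search(text)
    return match.groupdict() if match else None


# Hypothetical input: the "(next)" suffix is not part of the original sample data.
parsed = parse_aggr("Morewords Aggr(Diffstuff),something2(next)")
print(parsed)

# A string without "Aggr(" yields None instead of raising.
missing = parse_aggr("no aggregate here")
print(missing)
```

The lazy `.*?` keeps Part1 from running past the first `),`, while `[^\(]*` lets Part2 extend only until an opening parenthesis or the end of the string.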

