This post covers how to use the gsub function with multiple conditions in R.
Problem description
This is a follow-up to the question "Searching for unique values in dataframe and creating a table with them".
Here is what my data looks like:
      UUID    Source
    1 Jane    http//mywebsite.com44bb00?utm_source=ADW&utm_medium=banner&utm_campaign=Monk&gclid1234
    2 Mike    http//mywebsite.com44bb00?utm_source=Google&utm_medium=cpc&utm_campaign=DOG&gclid1234
    3 John    http//mywebsite.com44bb00?utm_source=Yahoo&utm_medium=banner&utm_campaign=DOG&gclid1234
    4 Sarah   http//mywebsite.com44bb00?utm_source=Facebookdw&utm_medium=cpc&utm_campaign=CAT&gclid1234
    5 Michael http//mywebsite.com44bb00?utm_source=Twitter&utm_medium=GDNr&utm_campaign=CAT&gclid1234
    6 Bob     http//mywebsite.com44bb00?utm_source=ADW&utm_medium=GDN&utm_campaign=DOG&gclid1234
    7 Mark    http//mywebsite.com44bb00?utm_source=Twitter&utm_medium=banner&utm_campaign=MONK&gclid1234
    8 Anna    http//mywebsite.com44bb00?utm_source=Facebook&utm_medium=banner&utm_campaign=MONK&gclid1234
And here is the desired output of what I am trying to achieve
      NAME    UTM_SOURCE UTM_MEDIUM UTM_CAMPAIGN
    1 Jane    ADW        banner     Monk
    2 Mike    Google     cpc        DOG
    3 John    Yahoo      banner     DOG
    4 Sarah   Faceboo    cpc        CAT
    5 Michael Twitter    GDN        CAT
    6 Bob     ADW        GDN        DOG
    7 Mark    Twitter    banner     MONK
    8 Anna    Facebook   banner     MONK
I understand that the function gsub can help me. Here is what I have tried so far:
    file1 <- read.csv("C:/Users/Dumitru Ostaciu/Desktop/Users.csv")
    file1 <- transform(file1, Source = as.character(Source))
    file2 <- gsub(".*\\?utm_source=", "", file1$Source)
      UUID SOURCE
    1 ADW&utm_medium=banner&utm_campaign=Monk&gclid1234
    2 Google&utm_medium=cpc&utm_campaign=DOG&gclid1234
    3 Yahoo&utm_medium=banner&utm_campaign=DOG&gclid1234
    4 Facebookdw&utm_medium=cpc&utm_campaign=CAT&gclid1234
    5 Twitter&utm_medium=GDNr&utm_campaign=CAT&gclid1234
    6 ADW&utm_medium=GDN&utm_campaign=DOG&gclid1234
    7 Twitter&utm_medium=banner&utm_campaign=MONK&gclid1234
    8 Facebook&utm_medium=banner&utm_campaign=MONK&gclid1234
1) In the output I get, the function keeps everything that follows utm_source=. How do I add another condition so that only the text between = and & is kept?
2) How do I keep the values that were originally in the first column (UUID): Jane, Mike, John, etc.?
Recommended answer
- Use gsub to strip the website name (everything up to and including the ?) from your Source
- Use strsplit to separate the remaining string at each occurrence of &
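The two steps above can be sketched on a single string before applying them to the whole column. This is only an illustration: the variable names url, query, parts and values are made up for the example and do not come from the original answer.

```r
# One Source value copied from the question's data
url <- "http//mywebsite.com44bb00?utm_source=ADW&utm_medium=banner&utm_campaign=Monk&gclid1234"

# Step 1: drop everything up to and including the "?"
query <- gsub(".*\\?", "", url)

# Step 2: split what is left at each "&" (fixed = TRUE treats "&" literally)
parts <- strsplit(query, "&", fixed = TRUE)[[1]]

# Keep only the text after "=" in each piece; "gclid1234" has no "=" and is left unchanged
values <- gsub(".*=", "", parts)
print(values)
```

The last element is the gclid piece, which is why the answer later drops the final column.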
    x <- read.table(text = "
      UUID Source
    1 Jane http//mywebsite.com44bb00?utm_source=ADW&utm_medium=banner&utm_campaign=Monk&gclid1234
    2 Mike http//mywebsite.com44bb00?utm_source=Google&utm_medium=cpc&utm_campaign=DOG&gclid1234
    3 John http//mywebsite.com44bb00?utm_source=Yahoo&utm_medium=banner&utm_campaign=DOG&gclid1234
    4 Sarah http//mywebsite.com44bb00?utm_source=Facebookdw&utm_medium=cpc&utm_campaign=CAT&gclid1234
    5 Michael http//mywebsite.com44bb00?utm_source=Twitter&utm_medium=GDNr&utm_campaign=CAT&gclid1234
    6 Bob http//mywebsite.com44bb00?utm_source=ADW&utm_medium=GDN&utm_campaign=DOG&gclid1234
    7 Mark http//mywebsite.com44bb00?utm_source=Twitter&utm_medium=banner&utm_campaign=MONK&gclid1234
    8 Anna http//mywebsite.com44bb00?utm_source=Facebook&utm_medium=banner&utm_campaign=MONK&gclid1234
    ", header = TRUE, stringsAsFactors = FALSE)
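Use gsub to drop everything up to the ?, then strsplit to separate the rest of the Source string at each &. A second gsub removes the "name=" prefix from every piece: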
    z <- matrix(
      unlist(strsplit(gsub(".*\\?", "", x$Source), "\\&")),
      ncol = 4, byrow = TRUE
    )
    cbind(x$UUID, gsub(".*=", "", z))

         [,1]      [,2]         [,3]     [,4]   [,5]
    [1,] "Jane"    "ADW"        "banner" "Monk" "gclid1234"
    [2,] "Mike"    "Google"     "cpc"    "DOG"  "gclid1234"
    [3,] "John"    "Yahoo"      "banner" "DOG"  "gclid1234"
    [4,] "Sarah"   "Facebookdw" "cpc"    "CAT"  "gclid1234"
    [5,] "Michael" "Twitter"    "GDNr"   "CAT"  "gclid1234"
    [6,] "Bob"     "ADW"        "GDN"    "DOG"  "gclid1234"
    [7,] "Mark"    "Twitter"    "banner" "MONK" "gclid1234"
    [8,] "Anna"    "Facebook"   "banner" "MONK" "gclid1234"
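One detail this step relies on is that gsub keeps the attributes of its input, so a character matrix goes in and a matrix of the same shape comes out, ready for cbind. A small sketch with a made-up 2 x 2 matrix (the names m and stripped are illustrative only):

```r
# A tiny matrix of "name=value" pieces, laid out row by row like z in the answer
m <- matrix(
  c("utm_source=ADW",    "utm_medium=banner",
    "utm_source=Google", "utm_medium=cpc"),
  ncol = 2, byrow = TRUE
)

# gsub works element by element and preserves the dim attribute
stripped <- gsub(".*=", "", m)

print(dim(stripped))
print(stripped)
```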
And then convert to a data frame and add names:
    z <- matrix(
      unlist(strsplit(gsub(".*\\?", "", x$Source), "\\&")),
      ncol = 4, byrow = TRUE
    )
    z <- cbind(x$UUID, gsub(".*=", "", z))
    z <- as.data.frame(z[, -5])
    names(z) <- c("UUID", "UTM_SOURCE", "UTM_MEDIUM", "UTM_CAMPAIGN")
    z

         UUID UTM_SOURCE UTM_MEDIUM UTM_CAMPAIGN
    1    Jane        ADW     banner         Monk
    2    Mike     Google        cpc          DOG
    3    John      Yahoo     banner          DOG
    4   Sarah Facebookdw        cpc          CAT
    5 Michael    Twitter       GDNr          CAT
    6     Bob        ADW        GDN          DOG
    7    Mark    Twitter     banner         MONK
    8    Anna   Facebook     banner         MONK
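The approach above assumes every URL carries the same four pieces in the same order. If that cannot be guaranteed, one possible alternative (not part of the original answer) is to pull each parameter out by name with sub and a capture group. In the sketch below, get_param is a hypothetical helper, and it assumes the parameter name contains no regex metacharacters.

```r
# Hypothetical helper (not from the original answer): return the value of one
# named query parameter for each URL, or NA where the parameter is absent.
# Assumes `name` contains no regex metacharacters.
get_param <- function(url, name) {
  # "[?&]" anchors the name to the start of a parameter; "([^&]*)" captures
  # everything up to the next "&" (or the end of the string)
  pattern <- paste0(".*[?&]", name, "=([^&]*).*")
  found <- grepl(pattern, url)
  out <- rep(NA_character_, length(url))
  out[found] <- sub(pattern, "\\1", url[found])
  out
}

# Two Source values copied from the question's data
src <- c(
  "http//mywebsite.com44bb00?utm_source=ADW&utm_medium=banner&utm_campaign=Monk&gclid1234",
  "http//mywebsite.com44bb00?utm_source=Google&utm_medium=cpc&utm_campaign=DOG&gclid1234"
)

result <- data.frame(
  UUID         = c("Jane", "Mike"),
  UTM_SOURCE   = get_param(src, "utm_source"),
  UTM_MEDIUM   = get_param(src, "utm_medium"),
  UTM_CAMPAIGN = get_param(src, "utm_campaign"),
  stringsAsFactors = FALSE
)
print(result)
```

Because each column is extracted independently, the UUID column stays aligned with its row without any matrix reshaping.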