
Problem description

I ran the following code a few months ago and it worked fine:

ceo1_nochange <- ceo1 %>% 
  group_by(ISIN, year) %>% 
  nest(.key = "OTHER_DATA") %>% 
  group_by(ISIN) %>% 
  mutate(OTHER_DATA_LAG = lag(OTHER_DATA, 1), 
         OTHER_DATA_LEAD = lead(OTHER_DATA, 1), 
         KEEP = pmap(list(OTHER_DATA_LAG, OTHER_DATA, OTHER_DATA_LEAD), function(x, y, z) {
           isTRUE(all_equal(x["DirectorID"], y["DirectorID"])) ||
             isTRUE(all_equal(y["DirectorID"], z["DirectorID"]))
         })) %>% 
  filter(unlist(KEEP)) %>% 
  select(-OTHER_DATA_LAG, -OTHER_DATA_LEAD, -KEEP) %>% 
  unnest() %>% 
  ungroup()

My goal was to identify the observations in which DirectorID did not change from one year to the next.

But now I get the following error:

Error: Problem with `mutate()` input `KEEP`.
x argument is of length zero
i Input `KEEP` is `pmap(...)`.
i The error occurred in group 1: ISIN = "AN8068571086".
Run `rlang::last_error()` to see where the error occurred.
In addition: Warning message:
 Error: Problem with `mutate()` input `KEEP`.
x argument is of length zero
i Input `KEEP` is `pmap(...)`.
i The error occurred in group 1: ISIN = "AN8068571086".
Run `rlang::last_error()` to see where the error occurred.

Can anyone explain this?

This is a sample dataset:

"ROW,ISIN,YEAR,DIRECTOR_NAME,DIRECTOR_ID
1,US9898171015,2006,Thomas (Tom) E Davin,2247441792
2,US9898171015,2006,Matthew (Matt) L Hyde,4842568996
3,US9898171015,2007,James (Jim) M Weber,3581636766
4,US9898171015,2007,Matthew (Matt) L Hyde,4842568996
5,US9898171015,2007,David (Dave) M DeMattei,759047198
6,US9898171015,2008,James (Jim) M Weber,3581636766
7,US9898171015,2008,Matthew (Matt) L Hyde,4842568996
8,US9898171015,2008,David (Dave) M DeMattei,759047198
9,US9898171015,2009,William (Bill) Milroy Barnum Jr,20462211719
10,US9898171015,2009,James (Jim) M Weber,3581636766
11,US9898171015,2009,Matthew (Matt) L Hyde,4842568996
12,US9898171015,2009,David (Dave) M DeMattei,759047198
13,US9898171015,2010,William (Bill) Milroy Barnum Jr,20462211719
14,US9898171015,2010,James (Jim) M Weber,3581636766
15,US9898171015,2010,Matthew (Matt) L Hyde,4842568996
16,US9898171015,2011,Sarah (Sally) Gaines McCoy,11434863691
17,US9898171015,2011,William (Bill) Milroy Barnum Jr,20462211719
18,US9898171015,2011,James (Jim) M Weber,3581636766
19,US9898171015,2011,Matthew (Matt) L Hyde,4842568996
20,US9898171015,2012,Sarah (Sally) Gaines McCoy,11434863691
21,US9898171015,2012,Ernest R Johnson,40425210975
22,US9898171015,2013,Sarah (Sally) Gaines McCoy,11434863691
23,US9898171015,2013,Ernest R Johnson,40425210975
24,US9898171015,2013,Travis D Smith,53006212569
25,US9898171015,2014,Sarah (Sally) Gaines McCoy,11434863691
26,US9898171015,2014,Ernest R Johnson,40425210975
27,US9898171015,2014,Travis D Smith,53006212569
28,US9898171015,2015,Kalen F Holmes,11051172801
29,US9898171015,2015,Sarah (Sally) Gaines McCoy,11434863691
30,US9898171015,2015,Ernest R Johnson,40425210975
31,US9898171015,2015,Travis D Smith,53006212569
32,US9898171015,2016,Sarah (Sally) Gaines McCoy,11434863691
33,US9898171015,2016,Ernest R Johnson,40425210975
34,US9898171015,2016,Travis D Smith,53006212569
35,US9898171015,2017,Sarah (Sally) Gaines McCoy,11434863691
36,US9898171015,2017,Scott Andrew Bailey,174000000000
37,US9898171015,2017,Ernest R Johnson,40425210975
38,US9898171015,2017,Travis D Smith,53006212569
" 

Can someone provide a clue?

Recommended answer

I didn't find anything in the code that recent package changes would have affected. The error comes from the lag and lead functions: applied to the nested list column, they pad the first and last positions, respectively, with NULL. If you check for those empty values inside the pmap call, it should work.
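The NULL padding described above can be seen on a small list. This is a minimal sketch, not the asker's data: `boards` is a hypothetical list standing in for the nested OTHER_DATA column, and `has_data` is an illustrative helper using the same kind of length check as the fix.

```r
library(dplyr)

# Hypothetical stand-in for a list column: one element per year.
boards <- list(c(1, 2), c(1, 2), c(2, 3))

# lag() shifts the list down and pads the first slot;
# lead() shifts it up and pads the last slot.
lagged <- dplyr::lag(boards, 1)
led    <- dplyr::lead(boards, 1)

# The padded slots are NULL, which has length zero.
first_is_null <- is.null(lagged[[1]])
last_is_null  <- is.null(led[[length(led)]])

# Illustrative guard: only compare elements that actually hold data.
has_data <- function(x) length(x) > 0
guarded  <- vapply(lagged, has_data, logical(1))
```

Here `guarded` flags which positions are safe to compare; any comparison that runs on the padded first or last element has to handle the NULL case itself.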

I also made a few other changes to the code:

  • .key has been deprecated in nest, so I used nest(OTHER_DATA = c(ROW, DIRECTOR_NAME, DIRECTOR_ID)) instead.
  • Used pmap_lgl (instead of pmap) so that you don't have to call unlist(KEEP) in filter.
  • unnest needs the columns to unnest named explicitly, so I used unnest(cols = c(OTHER_DATA)).

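The three changes listed above can be tried on a toy table before applying them to the real data. This is a small sketch under assumptions: `toy` is made-up data, and the `nrow(d) > 1` predicate is only a placeholder for the real comparison.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Made-up data: one firm, two years (hypothetical values).
toy <- tibble(
  ISIN = "X1",
  YEAR = c(2006, 2006, 2007),
  DIRECTOR_ID = c(1, 2, 1)
)

# Name the new list column directly instead of using .key.
nested <- toy %>% nest(OTHER_DATA = c(DIRECTOR_ID))

# pmap_lgl() returns a plain logical vector, so no unlist() is needed.
more_than_one <- pmap_lgl(list(nested$OTHER_DATA), function(d) nrow(d) > 1)

# unnest() is told explicitly which column to expand.
round_trip <- nested %>% unnest(cols = c(OTHER_DATA))
```

`more_than_one` can be passed straight to filter(), and `round_trip` has the same columns and rows as `toy`.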
library(tidyverse)

ceo1 %>% 
  group_by(ISIN, YEAR) %>% 
  nest(OTHER_DATA = c(ROW, DIRECTOR_NAME, DIRECTOR_ID)) %>% 
  group_by(ISIN) %>% 
  mutate(OTHER_DATA_LAG = lag(OTHER_DATA, 1), 
         OTHER_DATA_LEAD = lead(OTHER_DATA, 1),
         KEEP = pmap_lgl(list(OTHER_DATA_LAG, OTHER_DATA, OTHER_DATA_LEAD), function(x, y, z) {
           if(length(x) > 0 && length(y) > 0 && length(z) > 0)
                isTRUE(all_equal(x["DIRECTOR_ID"], y["DIRECTOR_ID"])) ||
                isTRUE(all_equal(y["DIRECTOR_ID"], z["DIRECTOR_ID"]))
           else FALSE
         })) %>% 
  filter(KEEP) %>% 
  select(-OTHER_DATA_LAG, -OTHER_DATA_LEAD, -KEEP) %>% 
  unnest(cols = c(OTHER_DATA)) %>% 
  ungroup()

#   ISIN          YEAR   ROW DIRECTOR_NAME              DIRECTOR_ID
#   <chr>        <int> <int> <chr>                            <dbl>
# 1 US9898171015  2007     3 James (Jim) M Weber         3581636766
# 2 US9898171015  2007     4 Matthew (Matt) L Hyde       4842568996
# 3 US9898171015  2007     5 David (Dave) M DeMattei      759047198
# 4 US9898171015  2008     6 James (Jim) M Weber         3581636766
# 5 US9898171015  2008     7 Matthew (Matt) L Hyde       4842568996
# 6 US9898171015  2008     8 David (Dave) M DeMattei      759047198
# 7 US9898171015  2013    22 Sarah (Sally) Gaines McCoy 11434863691
# 8 US9898171015  2013    23 Ernest R Johnson           40425210975
# 9 US9898171015  2013    24 Travis D Smith             53006212569
#10 US9898171015  2014    25 Sarah (Sally) Gaines McCoy 11434863691
#11 US9898171015  2014    26 Ernest R Johnson           40425210975
#12 US9898171015  2014    27 Travis D Smith             53006212569
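
Since dplyr 1.1.0 deprecated all_equal(), a variant that compares the sets of director IDs directly may be worth considering. The following is only a sketch under assumptions, not the answer's method: `toy`, `same_board`, and `boards` are illustrative names, and unlike the code above it also keeps the first or last year when it matches its single neighbour.

```r
library(dplyr)
library(purrr)

# Made-up data: one firm, three years (hypothetical values).
toy <- tibble(
  ISIN = "X1",
  YEAR = c(2006, 2006, 2007, 2007, 2008, 2008),
  DIRECTOR_ID = c(1, 2, 1, 2, 2, 3)
)

# TRUE only when both years exist and hold the same set of IDs
# (order and duplicates are ignored by base::setequal).
same_board <- function(a, b) {
  !is.null(a) && !is.null(b) && base::setequal(a, b)
}

boards <- toy %>%
  group_by(ISIN, YEAR) %>%
  summarise(IDS = list(DIRECTOR_ID), .groups = "drop") %>%
  arrange(ISIN, YEAR) %>%
  group_by(ISIN) %>%
  mutate(
    # Compare each year with the previous and the next year of the same firm.
    KEEP = map2_lgl(dplyr::lag(IDS), IDS, same_board) |
      map2_lgl(IDS, dplyr::lead(IDS), same_board)
  ) %>%
  ungroup()
```

Filtering `boards` on KEEP and joining back to the original rows by ISIN and YEAR would then recover the director-level observations.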

