Problem description
Beginner R user here. I have a dataset of yearly employment numbers for different industry classifications and different subregions. For some observations, the number of employees is missing (NA). I would like to fill these values through linear interpolation (using na.approx or some other method). However, I only want to interpolate within the same industry classification and subregion.
For example, I have this:
subregion <- c("East Bay", "East Bay", "East Bay", "East Bay", "East Bay", "South Bay")
industry <-c("A","A","A","A","A","B" )
year <- c(2013, 2014, 2015, 2016, 2017, 2002)
emp <- c(50, NA, NA, 80,NA, 300)
data <- data.frame(cbind(subregion,industry,year, emp))
  subregion industry year  emp
1  East Bay        A 2013   50
2  East Bay        A 2014 <NA>
3  East Bay        A 2015 <NA>
4  East Bay        A 2016   80
5  East Bay        A 2017 <NA>
6 South Bay        B 2002  300
I need to generate this table, skipping interpolation for the fifth observation, because the next observation belongs to a different subregion and industry.
  subregion industry year  emp
1  East Bay        A 2013   50
2  East Bay        A 2014   60
3  East Bay        A 2015   70
4  East Bay        A 2016   80
5  East Bay        A 2017 <NA>
6 South Bay        B 2002  300
Articles like this have been helpful, but I cannot figure out how to adapt the solution so that interpolation occurs only when two columns match, instead of one. Any help would be appreciated.
Recommended answer
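We could group by both columns and apply na.approx (from zoo) within each group. Setting na.rm = FALSE keeps leading and trailing NAs in place, so the result has the same length as the group.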
library(tidyverse)

data %>%
  # interpolate separately within each subregion/industry pair
  group_by(subregion, industry) %>%
  mutate(emp = zoo::na.approx(emp, na.rm = FALSE))
# A tibble: 6 x 4
# Groups:   subregion, industry [2]
#  subregion industry  year   emp
#  <fct>     <fct>    <dbl> <dbl>
#1 East Bay  A         2013    50
#2 East Bay  A         2014    60
#3 East Bay  A         2015    70
#4 East Bay  A         2016    80
#5 East Bay  A         2017    NA
#6 South Bay B         2002   300
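One caveat: when na.approx is given only the values, it interpolates by row position within each group, which implicitly assumes the years are evenly spaced. The following is a minimal sketch, assuming some years may be missing from a group, that passes the year column as x so the actual spacing is respected. The data frame df and the column emp_filled are hypothetical names used only for illustration, and the single-row group relies on a reasonably recent zoo version, as the output above does.

```r
library(dplyr)
library(zoo)

# Hypothetical data: same layout as the question, but year 2015 is absent
df <- data.frame(
  subregion = c("East Bay", "East Bay", "East Bay", "East Bay", "South Bay"),
  industry  = c("A", "A", "A", "A", "B"),
  year      = c(2013, 2014, 2016, 2017, 2002),
  emp       = c(50, NA, 80, NA, 300)
)

result <- df %>%
  group_by(subregion, industry) %>%
  # sort each group by year so the interpolation runs in time order
  arrange(year, .by_group = TRUE) %>%
  # x = year makes the interpolation use the year values, not row positions
  mutate(emp_filled = na.approx(emp, x = year, na.rm = FALSE)) %>%
  ungroup()

print(result)
```

With x = year, the 2014 value sits one third of the way between the 2013 and 2016 values; without it, the same row would be treated as the midpoint between its two neighbours.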
Data
# build the data frame directly from the vectors, without cbind()
data <- data.frame(subregion, industry, year, emp)
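The data frame here is built without the cbind() call used in the question. A small sketch of why that matters, using shortened hypothetical vectors: cbind() on mixed character and numeric vectors produces a character matrix, so emp is no longer numeric and cannot be interpolated.

```r
# Shortened, hypothetical versions of the question's vectors
subregion <- c("East Bay", "East Bay", "South Bay")
industry  <- c("A", "A", "B")
year      <- c(2013, 2014, 2002)
emp       <- c(50, NA, 300)

# cbind() builds a character matrix first, so every column loses its numeric type
coerced <- data.frame(cbind(subregion, industry, year, emp))

# data.frame() on the vectors themselves keeps each column's own type
direct <- data.frame(subregion, industry, year, emp)

print(is.numeric(coerced$emp))  # not numeric after the cbind() round trip
print(is.numeric(direct$emp))   # numeric, so it can be interpolated
```

This also explains why the question's printout shows missing values as <NA> rather than NA: that is how R prints missing entries in character or factor columns.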