Formatting multi-row data into a single row in R

Problem description

I have a strangely formatted Excel or CSV file that I want to import into R as a data frame. The problem is that some records span multiple rows. In the example below there are three columns and two records, but the Tools column has several rows for each record. Is there a way to reformat the data so that each record becomes a single row with multiple tool columns (say Tools1, Tools2, etc.)?

Task             Location  Tools
Raising ticket   Alabama   sharepoint
                           word
                           oracle
Changing ticket  Seattle   word
                           oracle

Expected final output

Task             Location  Tools1      Tools2  Tools3
Raising ticket   Alabama   sharepoint  word    oracle
Changing ticket  Seattle   word        oracle

Recommended answer

Use dplyr and tidyr. First convert the blank cells to NA and use fill to carry Task and Location down, so that every row includes both values. Then group_by Task and use mutate to add an id column that numbers the tools within each task (Tools1, Tools2, and so on). Finally, use spread to turn the newly created id values into separate columns, one per tool.

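library(dplyr)
library(tidyr)

# Sample data: blank Task/Location cells mark continuation rows of the record above
df <- data.frame(Task = c("Raising ticket","","","Changing ticket",""), Location = c("Alabama","","","Seattle",""), Tools = c("sharepoint","word","oracle","word","oracle"))

# Turn blank cells into NA so that fill() can carry values down
df[df==""]  <- NA

df %>%
  fill(Task,Location) %>%                         # copy Task and Location into the blank rows
  group_by(Task) %>%                              # number the tools separately for each task
  mutate(id = paste0("Tools",row_number())) %>%   # id becomes Tools1, Tools2, ...
  spread(id, Tools)                               # one column per id value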

# A tibble: 2 x 5
# Groups: Task [2]
#   Task            Location Tools1     Tools2 Tools3
#   <fct>           <fct>    <fct>      <fct>  <fct>
# 1 Changing ticket Seattle  word       oracle <NA>
# 2 Raising ticket  Alabama  sharepoint word   oracle
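
The same reshaping can also be sketched with pivot_wider, which the tidyr documentation recommends over spread in newer releases. This variant is not part of the original answer: it assumes tidyr 1.0.0 or later, sets stringsAsFactors = FALSE so the columns are character rather than factor, groups by both Task and Location, and stores the result in a variable named wide, which is introduced here only for illustration.

```r
library(dplyr)
library(tidyr)

# Same sample data as above; blank cells mark continuation rows.
# stringsAsFactors = FALSE is an assumption made for this sketch.
df <- data.frame(
  Task     = c("Raising ticket", "", "", "Changing ticket", ""),
  Location = c("Alabama", "", "", "Seattle", ""),
  Tools    = c("sharepoint", "word", "oracle", "word", "oracle"),
  stringsAsFactors = FALSE
)

# Blank cells become NA so that fill() can carry values down
df[df == ""] <- NA

wide <- df %>%
  fill(Task, Location) %>%                        # copy Task and Location into the blank rows
  group_by(Task, Location) %>%                    # number the tools within each record
  mutate(id = paste0("Tools", row_number())) %>%  # id becomes Tools1, Tools2, ...
  ungroup() %>%                                   # drop the grouping before reshaping
  pivot_wider(names_from = id, values_from = Tools)

print(wide)
```

Records with fewer tools than the longest one get NA in the remaining Tools columns, as in the spread version.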
