This post covers how to fix a translation (recoding) error in R.

Problem description

Here is a small example:

X1 <- c("AC", "AC", "AC", "CA", "TA", "AT", "CC", "CC")
X2 <- c("AC", "AC", "AC", "CA", "AT", "CA", "AC", "TC")
X3 <- c("AC", "AC", "AC", "AC", "AA", "AT", "CC", "CA")
mydf1 <- data.frame(X1, X2, X3)

Input data frame

  X1 X2 X3
1 AC AC AC
2 AC AC AC
3 AC AC AC
4 CA CA AC
5 TA AT AA
6 AT CA AT
7 CC AC CC
8 CC TC CA

Function

# Function: look up each genotype code in a named vector
atgc <- function(x) {
  xlate <- c("AA" = 11, "AC" = 12, "AG" = 13, "AT" = 14,
             "CA" = 12, "CC" = 22, "CG" = 23, "CT" = 24,
             "GA" = 13, "GC" = 23, "GG" = 33, "GT" = 34,
             "TA" = 14, "TC" = 24, "TG" = 34, "TT" = 44,
             "ID" = 56, "DI" = 56, "DD" = 55, "II" = 66)
  x = xlate[x]
}
outdataframe <- sapply (mydf1, atgc)
outdataframe
   X1 X2 X3
AA 11 11 12
AA 11 11 12
AA 11 11 12
AG 13 13 12
CA 12 12 11
AC 12 13 13
AT 14 11 12
AT 14 14 14

Problem: AC is not equal to 12 in the output but 11, and similarly for the others. It's just a mess!

(Extra: I also do not know how to get rid of the row names.)

Recommended answer

Just use apply and transpose:

t(apply (mydf1, 1, atgc))
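
The reason the original call goes wrong is worth spelling out. Before R 4.0.0, data.frame() turned character columns into factors by default, and indexing a vector with a factor uses the factor's integer codes rather than its labels. apply avoids this because it first converts the data frame to a character matrix. A minimal sketch of the effect, using a small made-up lookup vector (the names `f` and `lookup` are only for illustration):

```r
# Build a factor explicitly so the behaviour is the same on any R version.
f <- factor(c("AC", "CA", "TA"))
lookup <- c("AA" = 11, "AC" = 12, "AG" = 13, "CA" = 12, "TA" = 14)

as.integer(f)             # internal codes 1 2 3 (levels sort as AC, CA, TA)
lookup[f]                 # indexes by position 1, 2, 3: picks AA, AC, AG
lookup[as.character(f)]   # indexes by name: picks AC, CA, TA
```

The second lookup is what the question's output shows: values taken from the first few entries of xlate, with their names leaking through as row names.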

To use sapply instead, either:


  1. Set stringsAsFactors = FALSE when creating the data frame (this is the default from R 4.0.0 onward), i.e.

mydf1 <- data.frame(X1, X2, X3, stringsAsFactors=FALSE)

(thanks @joran) or

  2. Change the last line of your function to: x = xlate[as.vector(x)]
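
Putting the second option together, here is one possible sketch of the whole thing with sapply. It is not part of the original answer: the name `atgc_fixed` is made up, as.character() is used in place of as.vector() (both turn the factor into its labels), and unname() is added as one way to address the row-name side question. stringsAsFactors = TRUE is set only to reproduce the factor columns on current R versions.

```r
X1 <- c("AC", "AC", "AC", "CA", "TA", "AT", "CC", "CC")
X2 <- c("AC", "AC", "AC", "CA", "AT", "CA", "AC", "TC")
X3 <- c("AC", "AC", "AC", "AC", "AA", "AT", "CC", "CA")
# Force factor columns so the example behaves like the question on any R version.
mydf1 <- data.frame(X1, X2, X3, stringsAsFactors = TRUE)

xlate <- c("AA" = 11, "AC" = 12, "AG" = 13, "AT" = 14,
           "CA" = 12, "CC" = 22, "CG" = 23, "CT" = 24,
           "GA" = 13, "GC" = 23, "GG" = 33, "GT" = 34,
           "TA" = 14, "TC" = 24, "TG" = 34, "TT" = 44,
           "ID" = 56, "DI" = 56, "DD" = 55, "II" = 66)

# Hypothetical helper: look up by label, then drop the names so that
# sapply() has nothing to turn into row names.
atgc_fixed <- function(x) {
  unname(xlate[as.character(x)])
}

out <- sapply(mydf1, atgc_fixed)
out
#      X1 X2 X3
# [1,] 12 12 12
# [2,] 12 12 12
# [3,] 12 12 12
# [4,] 12 12 12
# [5,] 14 14 11
# [6,] 14 12 14
# [7,] 22 12 22
# [8,] 22 24 12
```

The result is a numeric matrix; wrapping it in as.data.frame() would give a data frame with the same columns if that is what is needed downstream.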

