Creating a dictionary and using it to replace Latin words in R
Problem description
I have a dataset with Latin words:
mydat <- data.frame(text = c("TESS",
                             "MAG"),
                    stringsAsFactors = FALSE)
I want to transliterate it from Latin to Cyrillic:
library(stringi)
d=stri_trans_general(mydat$text, "latin-cyrillic")
But I want to create the transliteration dictionary manually. For example:
dictionary <- c("Tess" = "ТЕСС",
                "MAG" = "МАГ"
                # .......
                # ......
)
Once the dictionary is created, all Latin words in mydat$text must be replaced by the Cyrillic words that I set. Something like this:
d=dictionary(mydat$text)
How can I perform such a replacement?
text<-c("TESS",
"MAG")
The file with the transliteration:
dict=path.csv
It contains:
dict=
structure(list(old = structure(c(2L, 1L), .Label = c("mag", "tess"
), class = "factor"), new = structure(c(2L, 1L), .Label = c("маг",
"тесс"), class = "factor")), .Names = c("old", "new"), class = "data.frame", row.names = c(NA,
-2L))
#output
text<-c("ТЕСС",
"МАГ")
That's all.
Recommended answer
Here you go!
dict <- structure(list(
old = structure(c(2L, 1L), .Label = c("mag", "tess"),class = "factor"),
new = structure(c(2L, 1L), .Label = c("маг", "тесс"), class = "factor")),
.Names = c("old", "new"), class = "data.frame", row.names = c(NA, -2L))
input<-c("TESS","MAG")
output <- with(lapply(dict,as.character), new[match(tolower(input),old)])
output
# [1] "тесс" "маг"
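The lookup above returns NA for any word that is missing from the dictionary. As a minimal sketch, assuming you would rather keep such words unchanged, the same match() idea can be wrapped in a small helper. The name translit_lookup and the extra input "FOO" are hypothetical and only serve the illustration; they are not part of the original answer.

```r
# Hypothetical helper (not from the original answer): look up each word in a
# two-column dictionary and keep the original word when there is no entry.
translit_lookup <- function(x, dict) {
  old <- as.character(dict$old)  # as.character() also handles factor columns
  new <- as.character(dict$new)
  # Case-insensitive lookup on the Latin side; NA where no entry exists
  idx <- match(tolower(x), tolower(old))
  # Fall back to the untouched input word for unmatched entries
  ifelse(is.na(idx), x, new[idx])
}

# Same dictionary content as above, built as plain character columns
dict <- data.frame(old = c("tess", "mag"),
                   new = c("тесс", "маг"),
                   stringsAsFactors = FALSE)

input <- c("TESS", "MAG", "FOO")  # "FOO" is deliberately absent from dict
output <- translit_lookup(input, dict)
output
```

The result is lowercase because the values in dict$new are lowercase. To get the uppercase output shown in the question, one option is to store uppercase values in the dictionary; another is to pass the result through stringi::stri_trans_toupper().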