Problem description
I am looking to use the arulesSequences package in R. However, I have no idea how to coerce my data frame into an object that this package can work with.
Here is a toy dataset that replicates my data structure:
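# Three groups (X, Y, Z), each with five ordered events
ids <- c(rep("X", 5), rep("Y", 5), rep("Z", 5))
# Position of each event within its group (note: this name masks base::seq)
seq <- rep(1:5, 3)
# One random item per event; no seed is set, so values differ between runs
val <- sample(LETTERS, 15, replace = TRUE)
df <- data.frame(ids, seq, val)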
df
   ids seq val
1    X   1   T
2    X   2   H
3    X   3   V
4    X   4   A
5    X   5   X
6    Y   1   D
7    Y   2   B
8    Y   3   A
9    Y   4   D
10   Y   5   P
11   Z   1   Q
12   Z   2   R
13   Z   3   W
14   Z   4   W
15   Z   5   P
Any help would be greatly appreciated.
Recommended answer
What worked for me was adding what is essentially an "order" column that lists an order ranking rather than a time value. You just have to be very specific about the naming convention: name the "group" or "ordered basket #" variable sequenceID, and name the ranking or ordering variable eventID.
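As a rough sketch of that naming convention applied to the toy data from the question: the block below derives sequenceID and eventID columns, writes one header-less line per event, and only then hands the file to arulesSequences. The fixed val values, the integer recoding of ids, the SIZE value of 1, the temporary file, and the support threshold of 0.5 are all assumptions made for illustration, not something the package is known to require.

```r
# Toy data shaped like the question's data frame (values fixed so the run is reproducible).
df <- data.frame(
  ids = rep(c("X", "Y", "Z"), each = 5),
  seq = rep(1:5, times = 3),
  val = c("T", "H", "V", "A", "X", "D", "B", "A", "D", "P", "Q", "R", "W", "W", "P"),
  stringsAsFactors = FALSE
)

# Assumption: recode the group labels as integers (X = 1, Y = 2, Z = 3) to play it safe.
df$sequenceID <- as.integer(factor(df$ids))
# The within-group ranking becomes eventID.
df$eventID <- as.integer(df$seq)

# Keep rows ordered by sequence, then by event.
df <- df[order(df$sequenceID, df$eventID), ]

# One line per event: sequenceID eventID SIZE item.
# SIZE is 1 here because every event in the toy data holds a single item.
basket_lines <- paste(df$sequenceID, df$eventID, 1, df$val)

# Write the lines without a header row.
basket_file <- tempfile(fileext = ".txt")
writeLines(basket_lines, basket_file)

# Only attempt the mining step if the package is installed.
if (requireNamespace("arulesSequences", quietly = TRUE)) {
  library(arulesSequences)
  trans <- read_baskets(con = basket_file, sep = " ",
                        info = c("sequenceID", "eventID", "SIZE"))
  # support = 0.5 is an arbitrary illustrative threshold.
  freq <- cspade(trans, parameter = list(support = 0.5))
  print(as(freq, "data.frame"))
}
```

With real data, an event that holds several items would put all of them on the same line, with SIZE set to the item count.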
Another thing that helped me (and had me scratching my head for a long time) was that read_baskets() seemed to need the info argument spelled out explicitly:
read_baskets(con = filePath.txt, sep = " ", info = c("sequenceID","eventID","SIZE"))
Even though the help page makes the c() details look like an optional header, they are not optional. I seemed to need to remove the header row from my file and supply the column names in the read_baskets() call instead; otherwise I ran into problems.
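If the file already exists with a header row, one way to drop it before reading could look like the sketch below. The file contents, the variable names, and the header check (a first line starting with "sequenceID") are hypothetical; adapt the check to whatever the real header looks like.

```r
# Hypothetical input: a basket file that still carries a header row.
raw_file <- tempfile(fileext = ".txt")
writeLines(c("sequenceID eventID SIZE item",
             "1 1 1 T",
             "1 2 1 H",
             "2 1 1 D"), raw_file)

# Read the raw text and drop the first line only if it looks like a header.
raw_lines <- readLines(raw_file)
has_header <- grepl("^sequenceID", raw_lines[1])
clean_lines <- if (has_header) raw_lines[-1] else raw_lines

# Write the header-less copy that read_baskets() will consume.
clean_file <- tempfile(fileext = ".txt")
writeLines(clean_lines, clean_file)

# Supply the column names through info instead of through the file.
if (requireNamespace("arulesSequences", quietly = TRUE)) {
  library(arulesSequences)
  trans <- read_baskets(con = clean_file, sep = " ",
                        info = c("sequenceID", "eventID", "SIZE"))
  print(transactionInfo(trans))
}
```

Writing to a separate file leaves the original untouched, which makes it easy to compare the two if the import still misbehaves.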