Converting a CHAID regression tree to a table in R

Problem description

I used the CHAID package from this link. It gives me a chaid object that can be plotted. I want a decision table with each decision rule in a column instead of a decision tree, but I don't understand how to access the nodes and paths in this chaid object. Kindly help me. I followed the procedure given in this link.

I can't post my data here because it is too long, so I am posting code that uses the sample dataset provided with CHAID to perform the task.

Copied from the CHAID help manual:

library("CHAID")

### fit tree to subsample
set.seed(290875)
USvoteS <- USvote[sample(1:nrow(USvote), 1000), ]

ctrl <- chaid_control(minsplit = 200, minprob = 0.1)
chaidUS <- chaid(vote3 ~ ., data = USvoteS, control = ctrl)

print(chaidUS)
plot(chaidUS)

Output:

Model formula:
vote3 ~ gender + ager + empstat + educr + marstat

Fitted party:
[1] root
|   [2] marstat in married
|   |   [3] educr in <HS, HS, >HS: Gore (n = 311, err = 49.5%)
|   |   [4] educr in College, Post Coll: Bush (n = 249, err = 35.3%)
|   [5] marstat in widowed, divorced, never married
|   |   [6] gender in male: Gore (n = 159, err = 47.8%)
|   |   [7] gender in female
|   |   |   [8] ager in 18-24, 25-34, 35-44, 45-54: Gore (n = 127, err = 22.0%)
|   |   |   [9] ager in 55-64, 65+: Gore (n = 115, err = 40.9%)

Number of inner nodes:    4
Number of terminal nodes: 5

So my question is: how do I get this tree data into a decision table with each decision rule (branch/path) in a column? I don't understand how to access the different tree paths from this chaid object.

Recommended answer

The CHAID package uses partykit (recursive partitioning) tree structures. You can walk the tree through its party nodes: a node is either terminal or holds a list of child nodes, together with information about the decision rule (split) and the fitted data.
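To make the node structure concrete, here is a minimal sketch that builds a tiny tree by hand with partykit and inspects it with the package's accessor functions. The data frame `toy` and the object names are made up for illustration and are not part of the question's data; a tree returned by chaid() should expose the same accessors, but that is not verified here.

```r
library("partykit")

# Toy data with two factor predictors (illustrative only)
toy <- data.frame(
  marstat = factor(c("married", "single", "single", "married")),
  gender  = factor(c("male", "female", "male", "female"))
)

# Split on column 1 (marstat): level 1 goes to kid 1, level 2 to kid 2
sp   <- partysplit(varid = 1L, index = 1:2)
root <- partynode(id = 1L, split = sp,
                  kids = list(partynode(id = 2L), partynode(id = 3L)))
toy_tree <- party(root, data = toy)

n <- node_party(toy_tree)          # root partynode
is.terminal(n)                     # inner node, so FALSE
length(n)                          # number of child nodes
names(toy)[varid_split(split_node(n))]   # name of the split variable

# Levels of the split variable that are routed to the second child
levels(toy$marstat)[which(index_split(split_node(n)) == 2L)]

# Visit the children and report which ones are terminal
for (kid in kids_node(n)) {
  cat("node", id_node(kid), "terminal:", is.terminal(kid), "\n")
}
```

The same accessors (`split_node`, `varid_split`, `index_split`, `kids_node`, `id_node`) can be used instead of reaching into `node$split` and `node$id` directly.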

The code below walks the tree and creates the decision table. It is written for demonstration purposes and has been tested on only one sample tree.

tree2table <- function(party_tree) {

  df_list <- list()
  var_names <-  attr( party_tree$terms, "term.labels")
  var_levels <- lapply( party_tree$data, levels)

  walk_the_tree <- function(node, rule_branch = NULL) {
    # depth-first walk on partynode structure (recursive function)
    # decision rules are extracted for every branch
    if(missing(rule_branch)) {
      rule_branch <- setNames(data.frame(t(replicate(length(var_names), NA))), var_names)
      rule_branch <- cbind(rule_branch, nodeId = NA)
      rule_branch <- cbind(rule_branch, predict = NA)
    }
    if(is.terminal(node)) {
      rule_branch[["nodeId"]] <- node$id
      rule_branch[["predict"]] <- predict_party(party_tree, node$id)
      df_list[[as.character(node$id)]] <<- rule_branch
    } else {
      for(i in 1:length(node)) {
        rule_branch1 <- rule_branch
        val1 <- decision_rule(node,i)
        rule_branch1[[names(val1)[1]]] <- val1
        walk_the_tree(node[i], rule_branch1)
      }
    }
  }

  decision_rule <- function(node, i) {
    # returns the split decision rule as a named string (variable name and values)
    var_name <- var_names[node$split$varid[[1]]]
    # which() drops NA entries in the split index (levels not assigned to any child)
    values_vec <- var_levels[[var_name]][which(node$split$index == i)]
    values_txt <- paste(values_vec, collapse = ", ")
    return( setNames(values_txt, var_name))
  }
  # compile data frame list
  walk_the_tree(party_tree$node)
  # merge all dataframes
  res_table <- Reduce(rbind, df_list)
  return(res_table)
}
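The core of decision_rule is the lookup from a split index to factor levels: the index holds one entry per level of the split variable, and each entry is the number of the child node that level is sent to. A small base R sketch of that lookup follows; the level order and the index vectors are assumed for illustration, and `kid_levels` is a hypothetical helper name.

```r
# Assumed level order and split index, for illustration only
lev   <- c("married", "widowed", "divorced", "never married")
index <- c(1L, 2L, 2L, 2L)   # child number for each level

# Levels routed to a given child; which() ignores NA index entries
kid_levels <- function(levels, index, kid) levels[which(index == kid)]

left  <- paste(kid_levels(lev, index, 1L), collapse = ", ")
right <- paste(kid_levels(lev, index, 2L), collapse = ", ")

# An index with NA (a level not assigned to any child) is handled too
index_na <- c(1L, NA, 2L, 2L)
right_na <- paste(kid_levels(lev, index_na, 2L), collapse = ", ")
```

Without which(), a logical subscript containing NA would add an NA element to the result, so the guard only matters when the split index has unassigned levels.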

Call the function with the CHAID tree object:

table1 <- tree2table(chaidUS)

The result should look like this:

gender   ager                       empstat   educr              marstat                          nodeId   predict
-------- -------------------------- --------- ------------------ -------------------------------- -------- ---------
NA       NA                         NA        <HS, HS, >HS       married                          3        Gore
NA       NA                         NA        College, Post Coll married                          4        Bush
male     NA                         NA        NA                 widowed, divorced, never married 6        Gore
female   18-24, 25-34, 35-44, 45-54 NA        NA                 widowed, divorced, never married 8        Gore
female   55-64, 65+                 NA        NA                 widowed, divorced, never married 9        Gore
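If a single readable rule per terminal node is more convenient than one column per variable, the table can be collapsed into text. The sketch below retypes the result shown above as a plain data frame (so it runs without CHAID) and joins the non-NA cells of each row; `rule_text` and the `rule` column are hypothetical names introduced here.

```r
# The decision table from above, retyped by hand for a self-contained example
rules <- data.frame(
  gender  = c(NA, NA, "male", "female", "female"),
  ager    = c(NA, NA, NA, "18-24, 25-34, 35-44, 45-54", "55-64, 65+"),
  empstat = NA_character_,
  educr   = c("<HS, HS, >HS", "College, Post Coll", NA, NA, NA),
  marstat = c("married", "married",
              rep("widowed, divorced, never married", 3)),
  nodeId  = c(3L, 4L, 6L, 8L, 9L),
  predict = c("Gore", "Bush", "Gore", "Gore", "Gore"),
  stringsAsFactors = FALSE
)

# Join the non-NA conditions of one row into a single rule string
rule_text <- function(row, vars) {
  used <- vars[!is.na(unlist(row[vars]))]
  paste(sprintf("%s in {%s}", used, unlist(row[used])), collapse = " AND ")
}

vars <- setdiff(names(rules), c("nodeId", "predict"))
rules$rule <- vapply(seq_len(nrow(rules)),
                     function(i) rule_text(rules[i, ], vars),
                     character(1))

print(rules[, c("nodeId", "predict", "rule")])
```

The conditions appear in column order, not in the order the splits occur along the path, because the table itself does not record split depth.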

