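Would you write this algorithm in Haskell using if/else? Is there a way to express it without them? It is hard to pull meaningful functions out of the middle of it, because it is just the output of a machine learning system.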

I am implementing the algorithm described here, which classifies segments of HTML content as Content or Boilerplate. The weights are already hard-coded.

curr_linkDensity <= 0.333333
| prev_linkDensity <= 0.555556
| | curr_numWords <= 16
| | | next_numWords <= 15
| | | | prev_numWords <= 4: BOILERPLATE
| | | | prev_numWords > 4: CONTENT
| | | next_numWords > 15: CONTENT
| | curr_numWords > 16: CONTENT
| prev_linkDensity > 0.555556
| | curr_numWords <= 40
| | | next_numWords <= 17: BOILERPLATE
| | | next_numWords > 17: CONTENT
| | curr_numWords > 40: CONTENT
curr_linkDensity > 0.333333: BOILERPLATE

Best answer

Since only three paths in this decision tree lead to the BOILERPLATE state, I would just enumerate them and simplify:

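-- The first two clauses can omit curr_linkDensity <= 0.333333,
-- because the third clause already covers the opposite case.
isBoilerplate =
  prev_linkDensity   <= 0.555556 && curr_numWords <= 16 && next_numWords <= 15 && prev_numWords <= 4
  || prev_linkDensity > 0.555556 && curr_numWords <= 40 && next_numWords <= 17
  || curr_linkDensity > 0.333333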
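
For anyone who wants to try this out, here is a minimal self-contained sketch of the same idea. The `Block` and `Label` types, their field names, and the function `classify` are hypothetical names chosen for illustration; they are not from the question or from the original algorithm's code. The thresholds are copied from the decision tree in the question. `classify` uses guards with comma-separated conditions instead of if/else, and `isBoilerplate` is the flattened boolean form.

```haskell
-- Hypothetical types for illustration; not part of the original post.
data Block = Block
  { numWords    :: Int     -- number of words in the segment
  , linkDensity :: Double  -- fraction of the segment's words inside links
  }

data Label = Content | Boilerplate
  deriving (Eq, Show)

-- One guard per BOILERPLATE path of the tree; everything else is CONTENT.
-- Arguments are the previous, current and next segment, in that order.
classify :: Block -> Block -> Block -> Label
classify prev curr next
  | linkDensity curr > 0.333333 = Boilerplate
  | linkDensity prev <= 0.555556
  , numWords curr <= 16
  , numWords next <= 15
  , numWords prev <= 4          = Boilerplate
  | linkDensity prev > 0.555556
  , numWords curr <= 40
  , numWords next <= 17         = Boilerplate
  | otherwise                   = Content

-- The same three paths as a single boolean expression.
-- (&&) binds tighter than (||), so no parentheses are needed.
isBoilerplate :: Block -> Block -> Block -> Bool
isBoilerplate prev curr next =
     linkDensity curr > 0.333333
  || linkDensity prev <= 0.555556 && numWords curr <= 16
       && numWords next <= 15 && numWords prev <= 4
  || linkDensity prev > 0.555556 && numWords curr <= 40
       && numWords next <= 17
```

In GHCi, `classify (Block 3 0.1) (Block 10 0.1) (Block 10 0.1)` evaluates to `Boilerplate`. The guard version stays closer to the shape of the tree, which may make it easier to regenerate if the learned thresholds change.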

Regarding "algorithm - How would you express this in Haskell?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/31414337/
