Algorithm to implement a word cloud like Wordle

Question


  • Take a look at Wordle: http://www.wordle.net/
  • It's much better looking than any other word cloud generators I've seen
  • Note: the source is not available - read the FAQ: http://www.wordle.net/faq#code

  • Is there an algorithm available that does what Wordle does?
  • If not, what are some alternatives that produce similar output?

  • Just curious
  • Want to learn

Recommended answer

I'm the creator of Wordle. Here's how Wordle actually works:

Count the words, throw away boring words, and sort by the count, descending. Keep the top N words for some N. Assign each word a font size proportional to its count. Generate a Java2D Shape for each word, using the Java2D API.
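The counting and sizing steps could be sketched like this in Python. This is only an illustration, not Wordle's code: the stop-word list, the `top_words` and `font_sizes` names, and the 96-point maximum are all assumptions, and the Java2D Shape step is left out.

```python
import re
from collections import Counter

# Hypothetical stop-word list for illustration; a real one is much longer.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it"}


def top_words(text, n, stop_words=STOP_WORDS):
    """Return the n most frequent non-stop words as (word, count) pairs."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in stop_words)
    # most_common sorts by count, descending, and keeps the top n.
    return counts.most_common(n)


def font_sizes(word_counts, max_size=96.0):
    """Font size proportional to count; the most frequent word gets max_size."""
    if not word_counts:
        return {}
    biggest = max(count for _, count in word_counts)
    return {word: max_size * count / biggest for word, count in word_counts}


text = "the cat and the hat and the cat sat in the hat with a cat"
ranked = top_words(text, n=3)
sizes = font_sizes(ranked)
print(ranked)
print(sizes)
```

Scaling relative to the largest count is one way to read "proportional to its count"; any other constant factor would fit the description equally well.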

Each word "wants" to be somewhere, such as "at some random x position in the vertical center". In decreasing order of frequency, do this for each word:

place the word where it wants to be
while it intersects any of the previously placed words
    move it one step along an ever-increasing spiral
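A rough, runnable version of that loop, under some loud assumptions: words are reduced to axis-aligned bounding rectangles rather than real glyph shapes, the spiral is a plain Archimedean one, the canvas is unbounded, and intersection is tested by brute force. The names `Rect`, `place_words`, `gap` and `step` are made up for this sketch.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class Rect:
    x: float
    y: float
    w: float
    h: float

    def intersects(self, other):
        return (self.x < other.x + other.w and other.x < self.x + self.w and
                self.y < other.y + other.h and other.y < self.y + self.h)


def place_words(sizes, canvas_w=800.0, seed=0, gap=1.0, step=0.1,
                max_steps=100_000):
    """Place (w, h) boxes, given in decreasing order of frequency."""
    rng = random.Random(seed)
    placed = []
    for w, h in sizes:
        # Where the word "wants" to be: a random x, vertically centred on y = 0.
        x0 = rng.uniform(0.0, canvas_w - w)
        y0 = -h / 2.0
        angle = 0.0
        for _ in range(max_steps):
            # Archimedean spiral: the radius grows linearly with the angle,
            # so the first candidate (angle 0) is the desired position itself.
            radius = gap * angle
            cand = Rect(x0 + radius * math.cos(angle),
                        y0 + radius * math.sin(angle), w, h)
            if not any(cand.intersects(p) for p in placed):
                break
            angle += step
        else:
            raise RuntimeError("no free position found within max_steps")
        placed.append(cand)
    return placed


boxes = [(200, 80), (120, 50), (120, 50), (60, 30), (60, 30)]
layout = place_words(boxes)
for rect in layout:
    print(rect)
```

Bounding rectangles leave more white space than shape-level tests would, since a small word can never tuck into the gaps around a large word's letters. That difference is where the efficient intersection testing comes in.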

That's it. The hard part is in doing the intersection-testing efficiently, for which I use last-hit caching, hierarchical bounding boxes, and a quadtree spatial index (all of which are things you can learn more about with some diligent googling).
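To give a feel for two of those ideas, here is a small sketch of a quadtree over rectangles combined with last-hit caching. It is an assumption-laden illustration and not Wordle's implementation: rectangles are plain `(x, y, w, h)` tuples, the `Quadtree` and `CollisionIndex` classes are invented for this example, and hierarchical bounding boxes are not shown at all.

```python
def overlaps(a, b):
    """True if rectangles a and b, given as (x, y, w, h), intersect."""
    return (a[0] < b[0] + b[2] and b[0] < a[0] + a[2] and
            a[1] < b[1] + b[3] and b[1] < a[1] + a[3])


def contains(outer, inner):
    """True if inner lies entirely inside outer."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1] and
            inner[0] + inner[2] <= outer[0] + outer[2] and
            inner[1] + inner[3] <= outer[1] + outer[3])


class Quadtree:
    """Minimal quadtree; a rectangle lives in the smallest node containing it."""

    def __init__(self, x, y, w, h, capacity=4, depth=0, max_depth=8):
        self.bounds = (x, y, w, h)
        self.capacity, self.depth, self.max_depth = capacity, depth, max_depth
        self.items = []
        self.children = None

    def insert(self, rect):
        if self.children is not None:
            for child in self.children:
                if contains(child.bounds, rect):
                    child.insert(rect)
                    return
        # Rectangles that straddle a child boundary stay at this node.
        self.items.append(rect)
        if (self.children is None and len(self.items) > self.capacity
                and self.depth < self.max_depth):
            self._split()

    def _split(self):
        x, y, w, h = self.bounds
        hw, hh = w / 2.0, h / 2.0
        self.children = [
            Quadtree(cx, cy, hw, hh, self.capacity, self.depth + 1,
                     self.max_depth)
            for cx in (x, x + hw) for cy in (y, y + hh)]
        old, self.items = self.items, []
        for rect in old:
            self.insert(rect)

    def first_hit(self, rect):
        """Return some stored rectangle that intersects rect, or None."""
        for item in self.items:
            if overlaps(item, rect):
                return item
        for child in self.children or []:
            # Skip whole quadrants the query rectangle cannot touch.
            if overlaps(child.bounds, rect):
                hit = child.first_hit(rect)
                if hit is not None:
                    return hit
        return None


class CollisionIndex:
    def __init__(self, x, y, w, h):
        self.tree = Quadtree(x, y, w, h)
        self.last_hit = None

    def add(self, rect):
        self.tree.insert(rect)

    def collides(self, rect):
        # Last-hit caching: a word nudged one spiral step often still overlaps
        # the same neighbour, so test that neighbour before the tree.
        if self.last_hit is not None and overlaps(self.last_hit, rect):
            return True
        self.last_hit = self.tree.first_hit(rect)
        return self.last_hit is not None


index = CollisionIndex(0, 0, 100, 100)
index.add((10, 10, 20, 20))
index.add((60, 60, 20, 20))
print(index.collides((15, 15, 5, 5)), index.collides((40, 40, 10, 10)))
```

In a placement loop, `collides` would replace the brute-force scan over all previously placed words, and `add` would be called once a word finds its final position.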

As Reto Aebersold pointed out, there's now a book chapter, freely available, that covers this same territory: Beautiful Visualization, Chapter 3: Wordle
