Running StormCrawler in local mode, or installing Apache Storm?

Question

I'm trying to figure out how to install and set up Storm/StormCrawler with ES and Kibana as described here.

I've never installed Storm on my local machine. I've worked with Nutch before and never had to install Hadoop locally, so I thought it might be the same with Storm (maybe not?).

I'd like to start crawling with StormCrawler instead of Nutch now.

It seems that if I just download a release and add the /bin to my PATH, I can only talk to a remote cluster.

It seems like I need to set up a development environment according to this, so that I can develop different topologies over time and then just talk to the remote cluster from my local machine when I'm ready to deploy them. Is that right?

So it seems like all I need to do is add Storm as a dependency to my StormCrawler project when I build it with Maven?

Recommended answer

See the getting started page and the tutorials on YouTube.

You don't need to install Storm, as you can run the topology in local mode, just as you'd do with Nutch and Hadoop. Just generate a topology from the archetype, modify it to your needs (e.g. add the ES components), and run it with -local. See the README generated by the archetype.

Later on, you'd install Storm to benefit from the UI and possibly run it on multiple nodes, but as a starting point, running it locally is a good way of exploring the capabilities of StormCrawler.
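
On the Maven question: a project generated from the archetype should already declare Storm as a dependency, so there is normally nothing to add by hand. As a rough sketch of what that declaration tends to look like (the artifact name assumes Storm 2.x, where the client library is storm-client; Storm 1.x used storm-core, and the ${storm.version} property is a placeholder to be set to whatever version your project targets):

```xml
<!-- Sketch of a Storm dependency in pom.xml; check the pom generated by the archetype for the actual entry. -->
<dependency>
    <groupId>org.apache.storm</groupId>
    <!-- Assumption: Storm 2.x. For Storm 1.x the artifact would be storm-core. -->
    <artifactId>storm-client</artifactId>
    <!-- Placeholder property; define it in <properties> or replace it with a concrete version. -->
    <version>${storm.version}</version>
    <!-- "provided": a real cluster supplies Storm at runtime, so it is kept out of the topology jar. -->
    <scope>provided</scope>
</dependency>
```

The provided scope matters mainly once you deploy to a cluster; for local runs, follow the README generated by the archetype, which covers how the topology is launched.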
