This article looks at the preferred way to run Scrapyd in the background / as a service. It should be a useful reference for anyone facing the same problem; read on below to learn how.

Problem Description

I am trying to run Scrapyd on a virtual Ubuntu 16.04 server, to which I connect via SSH. When I run Scrapyd by simply running

$ scrapyd

I can connect to the web interface by going to http://82.165.102.18:6800.

However, once I close the SSH connection, the web interface is no longer available, therefore, I think I need to run Scrapyd in the background as a service somehow.

After some research I came across a few proposed solutions:

  • daemon (sudo apt install daemon)
  • screen (sudo apt install screen)
  • tmux (sudo apt install tmux)

Does someone know what the best / recommended solution is? Unfortunately, the Scrapyd documentation is rather thin and outdated.

For some background, I need to run about 10-15 spiders on a daily basis.

Solution

Set Scrapyd up as a system service

sudo nano /lib/systemd/system/scrapyd.service

Then copy-paste the following

[Unit]
Description=Scrapyd service
After=network.target

[Service]
User=<YOUR-USER>
Group=<USER-GROUP>
WorkingDirectory=/any/directory/here
ExecStart=/usr/local/bin/scrapyd

[Install]
WantedBy=multi-user.target
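
The User, Group, and WorkingDirectory values above are placeholders for your own account and project directory, and the ExecStart path can differ depending on how Scrapyd was installed (for example inside a virtualenv). A quick sketch of how to check that path and make systemd pick up the new unit file, assuming a sudo-capable user:

# Confirm where the scrapyd executable actually lives and use that path in ExecStart
which scrapyd
# Tell systemd to re-read unit files after creating or editing scrapyd.service
sudo systemctl daemon-reload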

Then enable the service

systemctl enable scrapyd.service

Then start the service

systemctl start scrapyd.service
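
To confirm that the service actually came up, you can check it from both sides; the sketch below assumes a reasonably recent Scrapyd listening on the default port 6800:

# Ask systemd whether the unit is active
systemctl status scrapyd.service
# Ask Scrapyd itself via its status endpoint
curl http://localhost:6800/daemonstatus.json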

Another method (not recommended)

Use this command.

cd /path/to/your/project/folder && nohup scrapyd >& /dev/null &

Now you can close your SSH connection but scrapyd will keep running.
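
Note that >& /dev/null is a bash shorthand for redirecting both stdout and stderr; if your login shell is plain sh, a portable equivalent of the same command would be:

# Discard stdout and stderr, and keep scrapyd running after the shell exits
cd /path/to/your/project/folder && nohup scrapyd > /dev/null 2>&1 &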

And to make sure that scrapyd runs automatically whenever your server restarts, do this:

Copy the output of echo $PATH from your terminal, then open your crontab with crontab -e.

Now at the very top of that file, write this

PATH=YOUR_COPIED_CONTENT

And now at the end of your crontab, write this.

@reboot cd /path/to/your/project/folder && nohup scrapyd >& /dev/null &

This means that each time your server is restarted, the above command will run automatically.
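
Since the question mentions running 10-15 spiders every day, the same crontab can also schedule them through Scrapyd's HTTP API. A minimal sketch, assuming a project named myproject with spiders spider1 and spider2 already deployed to a Scrapyd instance on the default port 6800:

# Hypothetical entries: launch two example spiders daily at 06:00 via schedule.json
0 6 * * * curl -s http://localhost:6800/schedule.json -d project=myproject -d spider=spider1
0 6 * * * curl -s http://localhost:6800/schedule.json -d project=myproject -d spider=spider2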

That's all for this article on the preferred way to run Scrapyd in the background / as a service. We hope the answer above helps, and thank you for your support!
