Tuesday, November 22, 2016

Deploy the dataset project to Scrapyd
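This note walks through deploying a Scrapy project to a Scrapyd server running inside a Docker container: clone the project from GitHub, enable the url line in the [deploy] section of scrapy.cfg, push the project with scrapyd-deploy, and finally schedule a crawl through Scrapyd's JSON API. The second part below shows the initial setup of the container and a first deploy of a fresh tutorial project.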

scrapyd@a4c2642d74db:~/projects$ git clone https://github.com/nutthaphon/PyLearning.git
Cloning into 'PyLearning'...
remote: Counting objects: 323, done.
remote: Compressing objects: 100% (6/6), done.
remote: Total 323 (delta 0), reused 0 (delta 0), pack-reused 317
Receiving objects: 100% (323/323), 61.27 KiB | 0 bytes/s, done.
Resolving deltas: 100% (146/146), done.
Checking connectivity... done.
scrapyd@a4c2642d74db:~/projects$ cd PyLearning/
scrapyd@a4c2642d74db:~/projects/PyLearning$ ls
Animals  CherryPy  DJango  README.md  Scraping  dataset  decor  foo  serial  test  tutorial
scrapyd@a4c2642d74db:~/projects/PyLearning$ cd dataset/
scrapyd@a4c2642d74db:~/projects/PyLearning/dataset$ ls
__init__.py  __init__.pyc  build  project.egg-info  scrapinghub.yml  scrapy.cfg  settrade  setup.py
scrapyd@a4c2642d74db:~/projects/PyLearning/dataset$ sed -i 's/#url/url/g' scrapy.cfg
scrapyd@a4c2642d74db:~/projects/PyLearning/dataset$ cat scrapy.cfg
# Automatically created by: scrapy startproject
#
# For more information about the [deploy] section see:
# https://scrapyd.readthedocs.org/en/latest/deploy.html

[settings]
default = settrade.settings

[deploy]
url = http://localhost:6800/
project = settrade
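With the url line uncommented, scrapyd-deploy knows where to push the project. Because this [deploy] section has no explicit name, scrapyd-deploy calls the target default, which is what the -l listing below reports; named targets of the form [deploy:somename] could be added if there were more than one Scrapyd instance to deploy to.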
scrapyd@a4c2642d74db:~/projects/PyLearning/dataset$ ls
__init__.py  __init__.pyc  build  project.egg-info  scrapinghub.yml  scrapy.cfg  settrade  setup.py
scrapyd@a4c2642d74db:~/projects/PyLearning/dataset$ cd settrade/
scrapyd@a4c2642d74db:~/projects/PyLearning/dataset/settrade$ ls
DailyStockQuote.py   QueryStockSymbol.py   ThinkSpeakChannels.py   __init__.py   items.py      scrapinghub.yml  settings.pyc
DailyStockQuote.pyc  QueryStockSymbol.pyc  ThinkSpeakChannels.pyc  __init__.pyc  pipelines.py  settings.py      spiders
scrapyd@a4c2642d74db:~/projects/PyLearning/dataset/settrade$ cd spiders/
scrapyd@a4c2642d74db:~/projects/PyLearning/dataset/settrade/spiders$ ls
SettradeSpider.py  SettradeSpider.pyc  __init__.py  __init__.pyc
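SettradeSpider.py is presumably where the settrade_dataset spider (seen later in the listspiders.json output) is defined.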
scrapyd@a4c2642d74db:~/projects/PyLearning/dataset/settrade/spiders$ cd ..
scrapyd@a4c2642d74db:~/projects/PyLearning/dataset/settrade$ cd ..
scrapyd@a4c2642d74db:~/projects/PyLearning/dataset$ ls
__init__.py  __init__.pyc  build  project.egg-info  scrapinghub.yml  scrapy.cfg  settrade  setup.py
scrapyd@a4c2642d74db:~/projects/PyLearning/dataset$ ~/scrapyd-client-master/scrapyd-client/scrapyd-deploy -l
default              http://localhost:6800/
scrapyd@a4c2642d74db:~/projects/PyLearning/dataset$ ~/scrapyd-client-master/scrapyd-client/scrapyd-deploy default -p dataset
Packing version 1479804720
Deploying to project "dataset" in http://localhost:6800/addversion.json
Server response (200):
{"status": "ok", "project": "dataset", "version": "1479804720", "spiders": 1, "node_name": "a4c2642d74db"}

scrapyd@a4c2642d74db:~/projects/PyLearning/dataset$ ~/scrapyd-client-master/scrapyd-client/scrapyd-deploy -L default
tutorial
dataset
scrapyd@a4c2642d74db:~/projects/PyLearning/dataset$ curl http://localhost:6800/listprojects.json
{"status": "ok", "projects": ["tutorial", "dataset"], "node_name": "a4c2642d74db"}
scrapyd@a4c2642d74db:~/projects/PyLearning/dataset$ curl http://localhost:6800/listspiders.json?project=dataset
{"status": "ok", "spiders": ["settrade_dataset"], "node_name": "a4c2642d74db"}
scrapyd@a4c2642d74db:~/projects/PyLearning/dataset$ curl http://localhost:6800/schedule.json -d project=dataset -d spider=settrade_dataset -d setting=DOWNLOAD_DELAY=5 -d start_urls=http://www.settrade.com/servlet/IntradayStockChartDataServlet?symbol=INTUCH,http://www.settrade.com/servlet/IntradayStockChartDataServlet?symbol=CPF,http://www.settrade.com/servlet/IntradayStockChartDataServlet?symbol=ADVANC
{"status": "ok", "jobid": "c60e4566b09111e68c380242ac110002", "node_name": "a4c2642d74db"}



Scrapyd: deploying your project

ubuntu@node2:~/Docker/nutthaphon/scrapyd/user$ docker run --name scrapyd-server --user scrapyd -it -P nutthaphon/scrapyd:1.1.1 bash
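The container is started with -P, which publishes the image's exposed ports on random host ports; to reach Scrapyd on a fixed host port, -p 6800:6800 would be used instead. Everything below runs as the scrapyd user inside the container.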
scrapyd@a4c2642d74db:~$ pwd
/home/scrapyd
scrapyd@a4c2642d74db:~$ ls
master.zip  scrapyd-client-master  setuptools-28.8.0.zip
scrapyd@a4c2642d74db:~$ mkdir projects
scrapyd@a4c2642d74db:~$ cd projects/
scrapyd@a4c2642d74db:~/projects$ scrapy startproject tutorial
New Scrapy project 'tutorial', using template directory '/usr/local/lib/python2.7/dist-packages/scrapy/templates/project', created in:
    /home/scrapyd/projects/tutorial

You can start your first spider with:
    cd tutorial
    scrapy genspider example example.com
scrapyd@a4c2642d74db:~/projects$ cd tutorial/
scrapyd@a4c2642d74db:~/projects/tutorial$ ls
scrapy.cfg  tutorial
scrapyd@a4c2642d74db:~/projects/tutorial$ vi scrapy.cfg 
bash: vi: command not found
scrapyd@a4c2642d74db:~/projects/tutorial$ cat scrapy.cfg 
# Automatically created by: scrapy startproject
#
# For more information about the [deploy] section see:
# https://scrapyd.readthedocs.org/en/latest/deploy.html

[settings]
default = tutorial.settings

[deploy]
#url = http://localhost:6800/
project = tutorial
scrapyd@a4c2642d74db:~/projects/tutorial$ sed -i 's/#utl/url/g' scrapy.cfg 
scrapyd@a4c2642d74db:~/projects/tutorial$ cat scrapy.cfg 
# Automatically created by: scrapy startproject
#
# For more information about the [deploy] section see:
# https://scrapyd.readthedocs.org/en/latest/deploy.html

[settings]
default = tutorial.settings

[deploy]
#url = http://localhost:6800/
project = tutorial
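Note that the first sed pattern contained a typo (#utl instead of #url), so scrapy.cfg is still unchanged above; the corrected command below actually uncomments the url line.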
scrapyd@a4c2642d74db:~/projects/tutorial$ sed -i 's/#url/url/g' scrapy.cfg 
scrapyd@a4c2642d74db:~/projects/tutorial$ cat scrapy.cfg 
# Automatically created by: scrapy startproject
#
# For more information about the [deploy] section see:
# https://scrapyd.readthedocs.org/en/latest/deploy.html

[settings]
default = tutorial.settings

[deploy]
url = http://localhost:6800/
project = tutorial
scrapyd@a4c2642d74db:~/projects/tutorial$ ~/scrapyd-client-master/scrapyd-client/scrapyd-deploy -l
default              http://localhost:6800/
scrapyd@a4c2642d74db:~/projects/tutorial$ ~/scrapyd-client-master/scrapyd-client/scrapyd-deploy default -p tutorial
Packing version 1479799362
Deploying to project "tutorial" in http://localhost:6800/addversion.json
Deploy failed: <urlopen error [Errno 111] Connection refused>
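The "Connection refused" error simply means nothing was listening on port 6800 yet, i.e. the Scrapyd service inside the container was not yet running (or still starting up); the retry below succeeds once the daemon is reachable.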
scrapyd@a4c2642d74db:~/projects/tutorial$ ~/scrapyd-client-master/scrapyd-client/scrapyd-deploy default -p tutorial
Packing version 1479799403
Deploying to project "tutorial" in http://localhost:6800/addversion.json
Server response (200):
{"status": "ok", "project": "tutorial", "version": "1479799403", "spiders": 0, "node_name": "a4c2642d74db"}

scrapyd@a4c2642d74db:~/projects/tutorial$ ~/scrapyd-client-master/scrapyd-client/scrapyd-deploy -L default
tutorial
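The tutorial project deploys with "spiders": 0 because no spider has been generated yet (scrapy genspider was never run); the dataset project in the first part of this post is deployed the same way and brings its settrade_dataset spider along.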