Friday, December 30, 2016

Docker Daemon



The Docker daemon can listen for Docker Remote API requests on three different types of socket: unix, tcp, and fd

unix  :- /var/run/docker.sock  (default)

tcp :- for remote access; supports TLS, e.g.
-H tcp://0.0.0.0:2375          # listen on all interfaces
-H tcp://192.168.59.103:2375   # listen on a specific IP address
*** Conventional Docker daemon ports: 2375 (unencrypted), 2376 (TLS)

fd :- systemd socket activation (file descriptors passed to the daemon), e.g.
-H fd://
-H fd://3

You can specify multiple sockets like this:
dockerd -H unix:///var/run/docker.sock -H tcp://192.168.59.106 -H tcp://10.10.10.2
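
To double-check that the daemon is reachable on a given socket, you can connect from the client side. Below is a minimal sketch using docker-py (which gets installed later in this post as a dependency of shub); the tcp address is only an assumption -- use whatever -H address your daemon actually listens on.

# Minimal sketch with docker-py 1.x: connect to the daemon over the
# default unix socket and over an (assumed) unencrypted tcp socket.
import docker

local_cli = docker.Client(base_url='unix://var/run/docker.sock')
remote_cli = docker.Client(base_url='tcp://192.168.59.106:2375')  # assumption

print(local_cli.version()['Version'])   # daemon version string
print(remote_cli.containers())          # running containers on the remote host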


Thursday, December 22, 2016

How to deploy a Scrapy project to scrapinghub.com and save crawled data to MongoDB




Required pip module:
pymongo
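
pymongo is what the project's item pipeline uses to write crawled items into MongoDB. A minimal sketch of such a pipeline is below; the class name, the MONGO_URI / MONGO_DATABASE settings, and the collection name are assumptions, not the project's exact code.

# Minimal sketch of a MongoDB item pipeline (names are assumptions).
# Enable it in settings.py with:
#   ITEM_PIPELINES = {'stack.pipelines.MongoDBPipeline': 300}
#   MONGO_URI = 'mongodb://localhost:27017'
#   MONGO_DATABASE = 'stackoverflow'
import pymongo


class MongoDBPipeline(object):

    def __init__(self, mongo_uri, mongo_db):
        self.mongo_uri = mongo_uri
        self.mongo_db = mongo_db

    @classmethod
    def from_crawler(cls, crawler):
        # Read connection details from the project settings
        return cls(
            mongo_uri=crawler.settings.get('MONGO_URI'),
            mongo_db=crawler.settings.get('MONGO_DATABASE', 'items'),
        )

    def open_spider(self, spider):
        self.client = pymongo.MongoClient(self.mongo_uri)
        self.db = self.client[self.mongo_db]

    def close_spider(self, spider):
        self.client.close()

    def process_item(self, item, spider):
        # One document per scraped item
        self.db['questions'].insert_one(dict(item))
        return item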

ubuntu@nutthaphongmail:~$ docker exec -it --user root scrapyd-server bash
root@6b82490da131:/home/scrapyd# pip install shub
Collecting shub
  Downloading shub-2.5.0-py2.py3-none-any.whl (47kB)
    100% |################################| 51kB 773kB/s 
Collecting requests (from shub)
  Downloading requests-2.12.4-py2.py3-none-any.whl (576kB)
    100% |################################| 583kB 1.3MB/s 
Requirement already satisfied: six in /usr/local/lib/python2.7/dist-packages (from shub)
Collecting scrapinghub (from shub)
  Downloading scrapinghub-1.9.0-py2-none-any.whl
Requirement already satisfied: pip in /usr/local/lib/python2.7/dist-packages (from shub)
Collecting PyYAML (from shub)
  Downloading PyYAML-3.12.tar.gz (253kB)
    100% |################################| 256kB 2.5MB/s 
Collecting click (from shub)
  Downloading click-6.6-py2.py3-none-any.whl (71kB)
    100% |################################| 71kB 4.2MB/s 
Collecting retrying (from shub)
  Downloading retrying-1.3.3.tar.gz
Collecting docker-py (from shub)
  Downloading docker_py-1.10.6-py2.py3-none-any.whl (50kB)
    100% |################################| 51kB 4.2MB/s 
Collecting backports.ssl-match-hostname>=3.5; python_version < "3.5" (from docker-py->shub)
  Downloading backports.ssl_match_hostname-3.5.0.1.tar.gz
Collecting websocket-client>=0.32.0 (from docker-py->shub)
  Downloading websocket_client-0.40.0.tar.gz (196kB)
    100% |################################| 204kB 2.7MB/s 
Requirement already satisfied: ipaddress>=1.0.16; python_version < "3.3" in /usr/local/lib/python2.7/dist-packages (from docker-py->shub)
Collecting docker-pycreds>=0.2.1 (from docker-py->shub)
  Downloading docker_pycreds-0.2.1-py2.py3-none-any.whl
Building wheels for collected packages: PyYAML, retrying, backports.ssl-match-hostname, websocket-client
  Running setup.py bdist_wheel for PyYAML ... done
  Stored in directory: /root/.cache/pip/wheels/2c/f7/79/13f3a12cd723892437c0cfbde1230ab4d82947ff7b3839a4fc
  Running setup.py bdist_wheel for retrying ... done
  Stored in directory: /root/.cache/pip/wheels/d9/08/aa/49f7c109140006ea08a7657640aee3feafb65005bcd5280679
  Running setup.py bdist_wheel for backports.ssl-match-hostname ... done
  Stored in directory: /root/.cache/pip/wheels/5d/72/36/b2a31507b613967b728edc33378a5ff2ada0f62855b93c5ae1
  Running setup.py bdist_wheel for websocket-client ... done
  Stored in directory: /root/.cache/pip/wheels/d1/5e/dd/93da015a0ecc8375278b05ad7f0452eff574a044bcea2a95d2
Successfully built PyYAML retrying backports.ssl-match-hostname websocket-client
Installing collected packages: requests, retrying, scrapinghub, PyYAML, click, backports.ssl-match-hostname, websocket-client, docker-pycreds, docker-py, shub
Successfully installed PyYAML-3.12 backports.ssl-match-hostname-3.5.0.1 click-6.6 docker-py-1.10.6 docker-pycreds-0.2.1 requests-2.12.4 retrying-1.3.3 scrapinghub-1.9.0 shub-2.5.0 websocket-client-0.40.0
root@6b82490da131:/home/scrapyd# exit
exit
ubuntu@nutthaphongmail:~$ docker exec -it scrapyd-server bash
scrapyd@6b82490da131:~$ 
scrapyd@6b82490da131:~$ shub
Usage: shub [OPTIONS] COMMAND [ARGS]...

  shub is the Scrapinghub command-line client. It allows you to deploy
  projects or dependencies, schedule spiders, and retrieve scraped data or
  logs without leaving the command line.

Options:
  --version  Show the version and exit.
  --help     Show this message and exit.

Commands:
  copy-eggs     Sync eggs from one project with other project
  deploy        Deploy Scrapy project to Scrapy Cloud
  deploy-egg    [DEPRECATED] Build and deploy egg from source
  deploy-reqs   [DEPRECATED] Build and deploy eggs from requirements.txt
  fetch-eggs    Download project eggs from Scrapy Cloud
  image         Manage project based on custom Docker image
  items         Fetch items from Scrapy Cloud
  log           Fetch log from Scrapy Cloud
  login         Save your Scrapinghub API key
  logout        Forget saved Scrapinghub API key
  migrate-eggs  Migrate dash eggs to requirements.txt and project's directory
  requests      Fetch requests from Scrapy Cloud
  schedule      Schedule a spider to run on Scrapy Cloud
  version       Show shub version

  For usage and help on a specific command, run it with a --help flag, e.g.:

      shub schedule --help
scrapyd@6b82490da131:~$ 
scrapyd@6b82490da131:~$ 
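
Besides the shub schedule command listed above, spiders can also be scheduled programmatically with the scrapinghub library that was installed alongside shub. A minimal sketch, assuming a spider named 'stack' (the API key and project ID are the ones used further down in this post):

# Sketch using the legacy python-scrapinghub (1.x) client installed above.
# The spider name 'stack' is an assumption.
from scrapinghub import Connection

conn = Connection('e4bfa1fd7f8d4d9da817aa112bb82095')   # API key from the account page
project = conn[136494]                                  # project ID from scrapinghub.yml
job_id = project.schedule('stack')                      # returns the new job's id
print(job_id)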
scrapyd@6b82490da131:~$ pwd
/home/scrapyd
scrapyd@6b82490da131:~/projects/PyLearning$ cd stack/
scrapyd@6b82490da131:~/projects/PyLearning/stack$ cat requirements.txt 
pymongo==3.4.0
scrapyd@6b82490da131:~/projects/PyLearning/stack$ 
scrapyd@6b82490da131:~/projects/PyLearning/stack$ 
scrapyd@6b82490da131:~/projects/PyLearning/stack$ 
scrapyd@6b82490da131:~/projects/PyLearning/stack$ cat scrapinghub.yml 
projects:
  default: 136494
requirements_file: requirements.txt
scrapyd@6b82490da131:~/projects/PyLearning/stack$ 
scrapyd@6b82490da131:~/projects/PyLearning/stack$ shub login
-------------------------------------------------------------------------------
Welcome to shub version 2!

This release contains major updates to how shub is configured, as well as
updates to the commands and shub's look & feel.

Run 'shub' to get an overview over all available commands, and
'shub command --help' to get detailed help on a command.

Definitely try the new 'shub items -f [JOBID]' to see items live as they are
being scraped!

From now on, shub configuration should be done in a file called
'scrapinghub.yml', living next to the previously used 'scrapy.cfg' in your
Scrapy project directory. Global configuration, for example API keys, should
be done in a file called '.scrapinghub.yml' in your home directory.

But no worries, shub has automatically migrated your global settings to
~/.scrapinghub.yml, and will also automatically migrate your project settings
when you run a command within a Scrapy project.

Visit http://doc.scrapinghub.com/shub.html for more information on the new
configuration format and its benefits.

Happy scraping!
-------------------------------------------------------------------------------
Enter your API key from https://app.scrapinghub.com/account/apikey
API key: e4bfa1fd7f8d4d9da817aa112bb82095
Validating API key...
API key is OK, you are logged in now.
scrapyd@6b82490da131:~/projects/PyLearning/stack$ 
scrapyd@6b82490da131:~/projects/PyLearning/stack$ shub deploy
Target project ID: 136494
Save as default [Y/n]: Y
Project 136494 was set as default in scrapinghub.yml.
You can deploy to it via 'shub deploy' from now on.
Packing version e7d7f6c-inet1
Deploying to Scrapy Cloud project "136494"
{"spiders": 1, "status": "ok", "project": 136494, "version": "e7d7f6c-inet1"}
Run your spiders at: https://app.scrapinghub.com/p/136494/

* The API key comes from https://app.scrapinghub.com/account/apikey
** For a new project, the project ID can be found at https://app.scrapinghub.com/p/PROJECT_ID/deploy
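
Once a scheduled job has finished, the crawled items should be visible in MongoDB. Below is a quick check with pymongo; the host, database, and collection names are assumptions -- use the values configured in the project's settings.

# Verify that crawled items landed in MongoDB (names are assumptions).
from pymongo import MongoClient

client = MongoClient('mongodb://localhost:27017')
db = client['stackoverflow']
print(db['questions'].count())      # number of stored items
print(db['questions'].find_one())   # one sample document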