==============
Scrapyd-client
==============

.. image:: https://secure.travis-ci.org/scrapy/scrapyd-client.png?branch=master
   :target: http://travis-ci.org/scrapy/scrapyd-client

Scrapyd-client is a client for Scrapyd_. It provides the general
``scrapyd-client`` command line tool and the ``scrapyd-deploy`` utility,
which allows you to deploy your project to a Scrapyd server.

.. _Scrapyd: https://scrapyd.readthedocs.io

scrapyd-client
--------------

For a reference on each subcommand, invoke ``scrapyd-client --help``.

Where filtering with wildcards is possible, it is facilitated with fnmatch_.
The ``--project`` option can be omitted if a project is found in a
``scrapy.cfg`` file.

.. _fnmatch: https://docs.python.org/library/fnmatch.html

deploy
~~~~~~

At the moment, this is a wrapper around `scrapyd-deploy`_. Note that the
command line options of this subcommand are likely to change.
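The wildcard filtering mentioned above follows ``fnmatch`` pattern rules; a
quick illustration with made-up spider names (``*_daily`` is the kind of
pattern you would pass, escaped, on the command line):

```python
from fnmatch import fnmatch

# Hypothetical spider names, for illustration only.
spiders = ["news_daily", "prices_daily", "archive"]

# '*_daily' matches any name ending in '_daily'.
matches = [name for name in spiders if fnmatch(name, "*_daily")]
print(matches)  # -> ['news_daily', 'prices_daily']
```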
projects
~~~~~~~~

Lists all projects of a Scrapyd instance::

    # lists all projects on the default target
    scrapyd-client projects
    # lists all projects from a custom URL
    scrapyd-client -t http://scrapyd.example.net projects

schedule
~~~~~~~~

Schedules one or more spiders to be executed::

    # schedules any spider
    scrapyd-client schedule
    # schedules all spiders from the 'knowledge' project
    scrapyd-client schedule -p knowledge \*
    # schedules any spider from any project whose name ends with '_daily'
    scrapyd-client schedule -p \* \*_daily

spiders
~~~~~~~

Lists spiders of one or more projects::

    # lists all spiders
    scrapyd-client spiders
    # lists all spiders from the 'knowledge' project
    scrapyd-client spiders -p knowledge

scrapyd-deploy
--------------

How It Works
~~~~~~~~~~~~

Deploying your project to a Scrapyd server typically involves two steps:

1. Eggifying_ your project. You'll need to install setuptools_ for this. See
   `Egg Caveats`_ below.
2. Uploading the egg to the Scrapyd server through the `addversion.json`_
   endpoint.

The ``scrapyd-deploy`` tool automates the process of building the egg and
pushing it to the target Scrapyd server.

.. _addversion.json: https://scrapyd.readthedocs.org/en/latest/api.html#addversion-json
.. _Eggifying: http://peak.telecommunity.com/DevCenter/PythonEggs
.. _setuptools: https://pypi.python.org/pypi/setuptools

Deploying a Project
~~~~~~~~~~~~~~~~~~~

First ``cd`` into your project's root; you can then deploy your project with
the following::

    scrapyd-deploy <target> -p <project>

This will eggify your project and upload it to the target. If you have a
``setup.py`` file in your project, it will be used; otherwise one will be
created automatically.
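The upload step described under "How It Works" can also be reproduced by hand.
A minimal sketch of building the ``multipart/form-data`` request body that a
client would POST to ``addversion.json`` (the helper name, filename, and egg
bytes are illustrative, not part of scrapyd-client's API):

```python
import uuid

def build_addversion_request(project, version, egg_bytes):
    """Build a (body, headers) pair for an HTTP POST to Scrapyd's
    addversion.json endpoint, encoded as multipart/form-data."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="project"\r\n\r\n{project}\r\n'
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="version"\r\n\r\n{version}\r\n'
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="egg"; filename="project.egg"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    body = head + egg_bytes + tail
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return body, headers
```

The result could then be sent with, for example,
``urllib.request.Request(url, data=body, headers=headers, method="POST")`` --
though in practice ``scrapyd-deploy`` handles all of this for you.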
If successful, you should see a JSON response similar to the following::

    Deploying myproject-1287453519 to http://localhost:6800/addversion.json
    Server response (200):
    {"status": "ok", "spiders": ["spider1", "spider2"]}

To save yourself from having to specify the target and project, you can set
the defaults in the ``scrapy.cfg`` file::

    [deploy]
    url = http://scrapyd.example.com/api/scrapyd
    username = scrapy
    password = secret
    project = yourproject

You can now deploy your project with just the following::

    scrapyd-deploy

If you have more than one target to deploy to, you can deploy your project to
all targets with one command::

    scrapyd-deploy -a -p <project>

Versioning
~~~~~~~~~~

By default, ``scrapyd-deploy`` uses the current timestamp for generating the
project version, as shown above. However, you can pass a custom version using
``--version``::

    scrapyd-deploy <target> -p <project> --version <version>

Or for all targets::

    scrapyd-deploy -a -p <project> --version <version>

The version must be comparable with LooseVersion_. Scrapyd will use the
greatest version unless one is specified. If you use Mercurial or Git, you
can use ``HG`` or ``GIT`` respectively as the argument supplied to
``--version`` to use the current revision as the version.

You can save yourself having to specify the version parameter by adding it to
your target's entry in ``scrapy.cfg``::

    [deploy:target]
    ...
    version = HG

.. _LooseVersion: http://epydoc.sourceforge.net/stdlib/distutils.version.LooseVersion-class.html

Local Settings
~~~~~~~~~~~~~~

You may want to keep certain settings local and not have them deployed to
Scrapyd. To accomplish this, you can create a ``local_settings.py`` file at
the root of your project, where your ``scrapy.cfg`` file resides, and add the
following to your project's settings::

    try:
        from local_settings import *
    except ImportError:
        pass

``scrapyd-deploy`` doesn't deploy anything outside of the project module, so
the ``local_settings.py`` file won't be deployed.
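As noted under "Versioning", the default version is the current Unix
timestamp. A small, purely illustrative sketch of why timestamp versions
always pick out the most recent deploy (the helper below mimics the described
default; it is not scrapyd-client code):

```python
import time

def default_version():
    # Mimics the described default: the current Unix timestamp as a string.
    return str(int(time.time()))

# Equal-length timestamp strings order the same way as the numbers they
# encode, so the most recent deploy is always the "greatest" version.
versions = ["1287453519", "1287453520", "1287000000"]
print(max(versions))  # -> 1287453520
```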
Egg Caveats
~~~~~~~~~~~

Some things to keep in mind when building eggs for your Scrapy project:

* Make sure no local development settings are included in the egg when you
  build it. The ``find_packages`` function may be picking up your custom
  settings. In most cases you want to upload the egg with the default project
  settings.
* Avoid using ``__file__`` in your project code, as it doesn't play well with
  eggs. Consider using `pkgutil.get_data`_ instead.
* Be careful when writing to disk in your project, as Scrapyd will most
  likely be running under a different user which may not have write access to
  certain directories. If you can, avoid writing to disk, and always use
  tempfile_ for temporary files.

.. _pkgutil.get_data: http://docs.python.org/library/pkgutil.html#pkgutil.get_data
.. _tempfile: http://docs.python.org/library/tempfile.html

Global settings
---------------

Targets
~~~~~~~

You can define Scrapyd targets in your project's ``scrapy.cfg`` file.
Example::

    [deploy:example]
    url = http://scrapyd.example.com/api/scrapyd
    username = scrapy
    password = secret

While your target needs to be defined with its URL in ``scrapy.cfg``, you can
use netrc_ for the username and password, like so::

    machine scrapyd.example.com
        login scrapy
        password secret

To list all available targets, use the ``-l`` option::

    scrapyd-deploy -l

To list projects available on a specific target, use the ``-L`` option::

    scrapyd-deploy -L example

.. _netrc: https://www.gnu.org/software/inetutils/manual/html_node/The-_002enetrc-file.html
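Python's standard ``netrc`` module reads the same file format, which makes it
easy to sanity-check an entry like the one above. A quick sketch (the host
and credentials are the placeholder values from the example, written to a
throwaway file standing in for ``~/.netrc``):

```python
import netrc
import tempfile

# Write a one-entry netrc file; the credentials are placeholders.
with tempfile.NamedTemporaryFile("w", suffix="_netrc", delete=False) as f:
    f.write("machine scrapyd.example.com login scrapy password secret\n")
    path = f.name

# authenticators() returns a (login, account, password) tuple for the host.
login, _account, password = netrc.netrc(path).authenticators("scrapyd.example.com")
print(login, password)  # -> scrapy secret
```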