Scrapyd (with authentication)
Scrapyd is an application for deploying and running Scrapy spiders. It enables you to deploy (upload) your projects and control their spiders using a JSON API.
Scrapyd doesn't include any provision for password protecting itself. This container packages Scrapyd with an nginx proxy in front of it providing basic HTTP authentication. The username and password are configured through environment variables.
For more about Scrapyd, see the Scrapyd documentation.
How to use this image
Start a Scrapyd server
$ docker run -d -e USERNAME=my_username -e PASSWORD=hunter123 cdrx/scrapyd-authenticated
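Once the container is up, you can verify that the proxy is actually enforcing authentication by querying Scrapyd's daemonstatus.json endpoint with curl. The port mapping here is an assumption: add -p 6800:6800 to the run command above if you want the port exposed on the host.

```shell
# Without credentials, nginx should reject the request with HTTP 401
$ curl -i http://localhost:6800/daemonstatus.json

# With the configured username/password, the request reaches Scrapyd
$ curl -u my_username:hunter123 http://localhost:6800/daemonstatus.json
```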
You can then use the Scrapyd client to easily deploy the scraper from your machine to the running container.
How to configure Scrapy to use HTTP basic authentication
Support for HTTP authentication is built into scrapyd-client. Add the username and password fields to your scrapy.cfg file and then deploy as you normally would.
[deploy]
url = http://scrapyd:6800/
username = my_username
password = hunter123
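Under the hood there is nothing exotic here: scrapyd-client (like any HTTP client) sends a standard Basic Authorization header, which the nginx proxy checks against the configured credentials. A minimal sketch of how that header value is built, using only the Python standard library and the example credentials from above:

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build the HTTP Basic Authorization header value that
    scrapyd-client sends and the nginx proxy verifies."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


print(basic_auth_header("my_username", "hunter123"))
# Basic bXlfdXNlcm5hbWU6aHVudGVyMTIz
```

This is why any other HTTP client (curl, requests, a browser) can talk to the authenticated container too, not just scrapyd-client.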
Installing Python packages that your scraper depends on
If your scraper depends on some 3rd party Python packages (Redis, MySQL drivers, etc) you can install them when the container launches by adding the PACKAGES environment variable.
$ docker run -d -e USERNAME=my_username -e PASSWORD=hunter123 -e PACKAGES=requests,simplejson cdrx/scrapyd-authenticated
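Conceptually, the container's startup just pip-installs whatever PACKAGES lists before launching scrapyd. The sketch below is illustrative, not the container's actual startup code; the function names and the handling of whitespace and empty entries are assumptions.

```python
import subprocess


def packages_from_env(environ: dict) -> list[str]:
    """Split the comma-separated PACKAGES variable into pip arguments,
    ignoring empty entries and surrounding whitespace (illustrative)."""
    raw = environ.get("PACKAGES", "")
    return [pkg.strip() for pkg in raw.split(",") if pkg.strip()]


def install_packages(environ: dict) -> None:
    """Invoke pip once for all requested packages, if any."""
    pkgs = packages_from_env(environ)
    if pkgs:
        subprocess.check_call(["pip", "install", *pkgs])


print(packages_from_env({"PACKAGES": "requests,simplejson"}))
# ['requests', 'simplejson']
```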
This will make the container a bit slower to boot, so if you're starting / stopping the container regularly you would be better off forking this repository and adding the packages to the Dockerfile.
Supported environment variables
| Variable | Required | Example | Description |
| --- | --- | --- | --- |
| `USERNAME` | Yes | `my_user` | The username for authentication with the Scrapyd server |
| `PASSWORD` | Yes | `hunter123` | The password for authentication with the Scrapyd server |
| `PACKAGES` | No | `simplejson,requests` | Comma-separated list of Python packages to install before starting scrapyd |
To persist data between launches, you can mount the container's /scrapyd directory to a path on your Docker host.
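Putting that together with the earlier example, the mount is a standard -v bind mount; the host path here is an illustrative example:

```shell
$ docker run -d \
    -e USERNAME=my_username -e PASSWORD=hunter123 \
    -v /srv/scrapyd-data:/scrapyd \
    cdrx/scrapyd-authenticated
```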
If you have any problems with or questions about this image, please file an issue on the GitHub repository.
Pull requests welcome :-)