Last pushed: 2 years ago

Everything You Need To Scrape The Web With Python

This Docker image contains everything you need to build and run a Python web scraper, and then analyze the data you've scraped.

What's In The Box

Built on the base Ubuntu 14.04 image, the Python-Scraper V2 includes:

Base

  • Python 2.7.6

Web Scraping

  • Scrapy 0.24.5
  • BeautifulSoup4 4.3.2
  • Requests 2.2.1
  • Fake UserAgent 0.0.7
  • wget 2.2

Data Wrangling & Analysis

  • Pandas 0.13.1
  • Matplotlib 1.3.1
  • Scipy 0.13.3
  • FuzzyWuzzy 0.5.0
  • PyParsing 2.0.3
  • SimpleJSON 3.6.5

Output

  • XlsxWriter 0.6.7
  • Python Logstash 0.4.2
  • Redis 2.10.3
  • PyMySQL 0.6.6

... and more assorted goodness.
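
To confirm what is actually in the box, something like the following could be run inside a container. It is only a sketch and not part of the image: the check_modules helper and the EXPECTED_MODULES list are hypothetical names, and the import names are assumptions based on how these packages are usually imported (for example, BeautifulSoup4 as bs4 and Python Logstash as logstash). It uses only the standard library and is written to run on the image's Python 2.7 as well as Python 3.

```python
from __future__ import print_function  # keeps print() working on Python 2.7

import importlib

# Assumed import names: a package's import name can differ from the name
# shown in the list above (e.g. BeautifulSoup4 is imported as bs4).
EXPECTED_MODULES = [
    "scrapy", "bs4", "requests", "fake_useragent", "wget",
    "pandas", "matplotlib", "scipy", "fuzzywuzzy", "pyparsing", "simplejson",
    "xlsxwriter", "logstash", "redis", "pymysql",
]


def check_modules(names):
    """Map each module name to a version string, to "unknown" if the module
    imports but exposes no __version__, or to None if the import fails."""
    report = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except Exception:
            # Any failure at import time means the module is not usable here.
            report[name] = None
            continue
        report[name] = str(getattr(module, "__version__", "unknown"))
    return report


if __name__ == "__main__":
    for name, version in sorted(check_modules(EXPECTED_MODULES).items()):
        print("{:<16} {}".format(name, version if version else "MISSING"))
```

Any module reported as MISSING is either not installed or uses a different import name than the one assumed here. Versions shown as "unknown" simply mean the module does not define __version__.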

Owner: rdempsey