Youtube Comment Crawler

Scrapes the YouTube trending videos page once a day, and the comments posted
to those videos every 30 minutes.

Crawled comments are stored in comments.json; each line of the file is a
JSON object output by youtube-comment-scraper.
See the youtube-comment-scraper project page for details of the format.
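
Because comments.json holds one JSON object per line rather than a single JSON array, it has to be parsed line by line. The following is a minimal sketch of one way to do that in Node.js, not code from this project: the function names `parseJsonLines` and `readComments` are made up for illustration, and the `id` and `text` fields in the sample are hypothetical placeholders, since the actual fields are defined by youtube-comment-scraper.

```javascript
const fs = require("fs");

// Parse line-delimited JSON: one JSON object per non-empty line.
function parseJsonLines(text) {
  return text
    .split(/\r?\n/)
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line));
}

// Read and parse a comments file. The caller supplies the path;
// where comments.json ends up depends on how the crawler was started.
function readComments(path) {
  return parseJsonLines(fs.readFileSync(path, "utf8"));
}

// Hypothetical sample data; real records use the fields emitted by
// youtube-comment-scraper, which are not reproduced here.
const sample = '{"id": "a", "text": "first"}\n{"id": "b", "text": "second"}\n';
console.log(parseJsonLines(sample).length); // prints 2
```

Assuming the crawler was started with `--dir ./data` and writes the file into that directory, `readComments("./data/comments.json")` would return the crawled records as an array.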

Run via npm

Prepare

Clone this repository and install its dependencies via npm:

$ git clone https://github.com/itslab-kyushu/youtube-comment-crawler.git
$ cd youtube-comment-crawler
$ npm install

Start

To start the crawling service and store the database files in ./data, run:

$ npm start --dir ./data

By default, it crawls the English pages;
to crawl pages in another language, specify the language with the --lang option.
For example, the following command crawls the Japanese pages:

$ npm start --dir ./data --lang JP

Run as a Docker container

Youtube Comment Crawler is also provided as a Docker image,
itslabq/youtube-comment-crawler.
The container stores its database files in /data, so do not pass the --dir option.

To run a container with ./data mounted at /data, so that the database files
are stored in ./data on the host, run:

$ docker run -d --name crawler -v $(pwd)/data:/data:Z itslabq/youtube-comment-crawler

To crawl pages in another language, specify the language with the --lang
option. The following example crawls the Japanese pages:

$ docker run -d --name crawler -v $(pwd)/data:/data:Z itslabq/youtube-comment-crawler --lang JP

License

This software is released under the MIT License; see LICENSE.
