# lucatnt/youtube-to-mp3-to-s3

Schedule youtube-dl to monitor a playlist and upload the mp3 audio to an S3 bucket.
The purpose of this thing is to have a recurring youtube-dl job that monitors a playlist, downloads the audio of new uploads, converts it to mp3, and uploads it to an S3-compatible storage service.
It uses yt-dlp, a fork of the venerable youtube-dl, to perform the download; go-cron to schedule the job; and MinIO Client to provide the S3 upload functionality (which I'm using with Backblaze B2).
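For a sense of what the scheduled job boils down to, here is a rough one-off equivalent in plain shell; the flags, file names, and playlist URL are illustrative and may not match the actual script exactly:

```sh
# Grab the newest playlist items as mp3, skipping anything already listed in the archive file
yt-dlp --extract-audio --audio-format mp3 \
       --playlist-end 15 \
       --download-archive archive.txt \
       --output '%(upload_date)s_%(title)s.%(ext)s' \
       "https://www.youtube.com/playlist?list=S0m3N1cePl4yl1st"

# Move the results into the bucket with MinIO Client
# (assumes an "s3" alias pointing at your bucket's endpoint is already configured)
mc --config-dir /tmp/.mc mv ./*.mp3 s3/YourBucket/
```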
It can be used in conjunction with worker-feed-from-bucket to generate an RSS feed from the bucket contents, thus creating a podcast out of the downloaded mp3s.
NOTE: Even though I am making all this public, it is intended for my own personal use and might not fit your needs. Feel free to fork it or submit PRs if you think it is appropriate.
Environment variables are used to configure this thing.
- `DOWNLOAD_CRON`: (Docker only) If set, it defines the schedule for the youtube-dl job. The syntax is the same used by cron, with an optional first field which specifies seconds. If unset, the script will run once and then exit.
- `OUTPUT_RENAME_PATTERN`: The renaming pattern fed to youtube-dl; it defaults to `%(upload_date)s_%(title)s.%(ext)s`. Check its documentation for further info.
- `DOWNLOAD_LIMIT`: The maximum number of items of a playlist to download. Defaults to `15`.
- `DOWNLOAD_ARCHIVE_PATH`: The path of the "database" (just a text file) where youtube-dl keeps track of what it has already downloaded. Defaults to `archive.txt` (in the Docker image that file is in `/workdir`).
- `S3_ENDPOINT`: The S3 endpoint to use (just the hostname, without `https://`). For example, AWS's is `s3.amazonaws.com`, B2's is `s3.us-west-001.backblazeb2.com` or `s3.eu-central-003.backblazeb2.com`, and so forth.
- `S3_ACCESS_KEY_ID`: Your API key ID.
- `S3_SECRET_ACCESS_KEY`: Your API secret.
- `S3_BUCKET`: The name of your bucket.
- `COMMAND_AFTER_SINGLE_FILE`: The command to execute after each file has been downloaded. Use `{}` to refer to the path of the file. Defaults to `mc --config-dir /tmp/.mc mv {} s3/$S3_BUCKET/`, which moves the mp3 to the S3 bucket defined through the previous variables.
- `PRE_COMMANDS`: (Optional) Commands to be executed before the script starts checking for new videos.
- `POST_COMMANDS_EXIT`: (Optional) Commands to be executed after the download script exits (with or without an error).
- `POST_COMMANDS_FAILURE`: (Optional) Commands to be executed after the download fails.
- `POST_COMMANDS_SUCCESS`: (Optional) Commands to be executed after the download succeeds.

If you wish to ignore the S3 capabilities of this thing, just set `COMMAND_AFTER_SINGLE_FILE` to a blank string or whatever you think is appropriate.
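If you prefer not to repeat each `-e` flag on the command line, the same settings can be collected in an env file and passed to Docker with `--env-file`. A minimal sketch, assuming a file named `settings.env` (all values are placeholders):

```sh
# settings.env -- placeholder values, adapt to your playlist and bucket
YOUTUBE_URL=https://www.youtube.com/playlist?list=S0m3N1cePl4yl1st
S3_ENDPOINT=s3.eu-central-003.backblazeb2.com
S3_ACCESS_KEY_ID=SomeKeyID
S3_SECRET_ACCESS_KEY=SomeSecret
S3_BUCKET=YourBucket
DOWNLOAD_LIMIT=15
DOWNLOAD_CRON=0 0 */6 * * *
```

It could then be used as `docker run --env-file settings.env ...` instead of the individual `-e` flags shown below.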
```sh
docker run \
  -e YOUTUBE_URL="https://www.youtube.com/playlist?list=S0m3N1cePl4yl1st" \
  -e S3_ENDPOINT=s3.eu-central-003.backblazeb2.com \
  -e S3_ACCESS_KEY_ID=SomeKeyID \
  -e S3_SECRET_ACCESS_KEY=SomeSecret \
  -e S3_BUCKET=YourBucket \
  -e DOWNLOAD_CRON='0 0 */6 * * *' \
  -v $(pwd)/archive.txt:/workdir/archive.txt \
  lucatnt/youtube-to-mp3-to-s3
```
This would monitor the given YouTube playlist every 6 hours, download at most 15 videos (the default value for `DOWNLOAD_LIMIT`), upload them to B2, and keep track of the downloaded files by mounting `archive.txt` from the local directory into `/workdir/archive.txt`, which is the script's default.
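To check that the uploads are actually landing in the bucket, you can point a locally installed MinIO Client at the same endpoint. A minimal sketch, reusing the placeholder endpoint and keys from the example above and an arbitrary alias name `b2`:

```sh
# One-time alias for the bucket's endpoint (placeholder credentials)
mc alias set b2 https://s3.eu-central-003.backblazeb2.com SomeKeyID SomeSecret
# List what the job has uploaded so far
mc ls b2/YourBucket/
```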
```sh
docker pull lucatnt/youtube-to-mp3-to-s3
```