Short Description
Apache Sqoop is a tool designed to transfer bulk data between structured datastores, such as relational databases, and Hadoop
Full Description

Supported tags

  • 0.2.1, latest

For more information about Stratio Sqoop, please see our GitHub repository.

This image contains the Sqoop server, so we recommend running it together with the Stratio sqoop-shell image, which is available here.

What is Stratio Sqoop?

Traditional applications that interact with relational databases through an RDBMS are one of the sources of Big Data. The data they generate is stored on relational database servers, in relational form.

When the Big Data storage and analysis tools of the Hadoop ecosystem (MapReduce, Hive, HBase, Cassandra, Pig, etc.) came into the picture, they needed a way to interact with relational database servers in order to import and export the data residing in them. Sqoop fills that place in the Hadoop ecosystem: it provides the interaction between relational database servers and Hadoop's HDFS.

Sqoop is a tool designed to transfer data between Hadoop and relational database servers. It is used to import data from relational databases such as MySQL and Oracle into HDFS, and to export data from HDFS back to relational databases. It is provided by the Apache Software Foundation.

Why Stratio Sqoop?

Stratio Sqoop lets you take the next step: the job operations that you used to run on a Hadoop cluster can run on a Spark cluster instead, giving you all the benefits of Spark.

How to use this image

Start a Sqoop-server instance

$ docker run --name sqoop-server -d stratio/sqoop-server
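
Once the container has been started, it can be useful to confirm that it is actually up and to look at its log output. The following is only a minimal sketch using standard Docker CLI commands; it assumes the container was started with the name sqoop-server, as in the command above, and it says nothing about what the server itself logs. The CONTAINER_NAME variable and the choice of 50 log lines are illustrative.

```shell
#!/bin/sh
# Name given to the container by --name in the run command above.
CONTAINER_NAME="sqoop-server"

if command -v docker >/dev/null 2>&1; then
    # List the container if it is currently running.
    docker ps --filter "name=${CONTAINER_NAME}"
    # Print the last 50 lines of the container's log output.
    docker logs --tail 50 "${CONTAINER_NAME}"
else
    echo "docker not found on PATH" >&2
fi
```

If the container does not appear in the docker ps output, running docker ps -a shows stopped containers as well, which helps tell a container that failed on startup from one that was never created.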