Apache Sqoop

by | Jul 6, 2018 | Big Data

Sqoop definition

Apache Sqoop is a command-line tool developed to transfer large volumes of data between relational databases and Hadoop, hence its name, which comes from the merger of SQL and Hadoop. Specifically, it moves data into Hive or HBase in one direction, and in the other direction from HDFS to relational databases such as MySQL, Oracle, PostgreSQL, or a data warehouse.

The transfer process consists of reading each table in the database row by row and importing it into HDFS. The output is a set of files that can be in delimited text (CSV), Avro, binary, or SequenceFile format.

Sqoop2 has recently been created to improve Sqoop's usability, providing a web application and greater integration with Oozie.

 

Features

  • Provides a Java API for ingest processing, making it possible to write applications that perform some type of transformation on the data.
  • Provides commands for listing databases, tables, and schemas.
  • Supports incremental data loads.
  • Provides many connectors, such as FTP, JDBC, Kafka, Kite, and SFTP.
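As a sketch of the listing commands and incremental loads mentioned above (the connection string, credentials, and the `--check-column`/`--last-value` values are illustrative and reuse the example database created later in this post):

```shell
# List the databases and tables visible through a JDBC connection
sqoop list-databases --connect jdbc:mysql://localhost --username root -P
sqoop list-tables --connect jdbc:mysql://localhost/myddbb --username root -P

# Incremental load: import only the rows whose id is greater
# than the last value already imported
sqoop import \
  --connect jdbc:mysql://localhost/myddbb \
  --username root -P \
  --table mytable \
  --target-dir /my_hdfs_table \
  --incremental append \
  --check-column id \
  --last-value 7
```

With `--incremental append`, Sqoop records the highest value seen in `--check-column`, so subsequent runs only pick up new rows.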

 

Example of how to pass data between MySQL and HDFS using Sqoop

Prerequisites

Have the Hortonworks sandbox installed (VirtualBox), as used in the Spark posts.

Creating a MySQL database table

A database is created with a table on which to perform the tests, using the following commands.

Accessing MySQL

$ mysql
You may get "ERROR 1045 (28000): Access denied for user 'root'@'localhost'", which is resolved with:
mysql -u root -p

Note: Keep in mind that the MySQL password and the system password can be different.

Consult database

mysql> show databases;

Create Database

mysql> create database myddbb;

Use database

mysql> use myddbb;

Create base table

mysql> CREATE TABLE mytable (
         id MEDIUMINT NOT NULL AUTO_INCREMENT,
         name CHAR(30) NOT NULL,
         age INTEGER(30),
         salary INTEGER(30),
         PRIMARY KEY (id));

Create second table

mysql> CREATE TABLE mytable2 (
         id MEDIUMINT NOT NULL AUTO_INCREMENT,
         name CHAR(30) NOT NULL,
         age INTEGER(30),
         salary INTEGER(30),
         PRIMARY KEY (id));

Insert Data

mysql> INSERT INTO mytable (name, age, salary) values
        ("Peter", 24, 21000), ("Maria", 26, 24000), ("John", 28, 25000), ("Louis", 35, 28000), ("Monica", 42, 30000), ("Rose", 43, 25000), ("Susana", 45, 39000);

Note: If you have permission problems, enter MySQL and grant all privileges:

grant all privileges on *.* to 'root'@'localhost' IDENTIFIED BY 'MySQL_Key' WITH GRANT OPTION;

Exit MySQL

mysql> exit;

 

Load MySQL data to HDFS

Example of loading data from the table "mytable" of the database "myddbb" to the HDFS folder named "my_hdfs_table".

MySQL to HDFS

$ sqoop import \
 --connect jdbc:mysql://localhost/myddbb \
 --username root -P \
 --table mytable \
 --driver com.mysql.jdbc.Driver \
 --target-dir /my_hdfs_table \
 --fields-terminated-by ',' \
 --lines-terminated-by '\n'

--target-dir: HDFS directory where the imported data is stored.
--table: identifies the table to be copied.
--clear-staging-table: indicates that any data in the staging table can be deleted.
--verbose: prints additional information to facilitate debugging.
--fields-terminated-by: defines the field delimiter.
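Once the import finishes, the result can be inspected from HDFS. A quick check might look like this (the path is the target directory used above; "part-m-*" is the usual naming of the map output files):

```shell
# List the files produced by the import and print their contents
hdfs dfs -ls /my_hdfs_table
hdfs dfs -cat /my_hdfs_table/part-m-*
```

Each line of the output should be one row of "mytable", with the fields separated by commas as specified with --fields-terminated-by.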

 

Loading data from HDFS to MySQL

Example of loading data from the HDFS folder named “my_hdfs_table” to the “mytable2” table in the “myddbb” database.

HDFS to MySQL

$ sqoop export \
 --connect jdbc:mysql://localhost/myddbb \
 --username root -P \
 --table mytable2 \
 --export-dir /my_hdfs_table -m 1

Note: If you get the warning "Set $ACCUMULO_HOME to the root of your Accumulo installation", it can be avoided with:

$ sudo mkdir /var/lib/accumulo
$ ACCUMULO_HOME='/var/lib/accumulo'
$ export ACCUMULO_HOME
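After the export, the result can be verified from MySQL (a hedged check; the query simply counts the rows that arrived in the target table):

```shell
# Check that the exported rows arrived in mytable2
mysql -u root -p -e "SELECT COUNT(*) FROM myddbb.mytable2;"
```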

To practice with Sqoop, consult "Examples of Sqoop", a section that compiles many examples you may find useful.

Source: official Shell documentation

Source: official documentation for JAVA API
