Apache Sqoop Examples

by | Aug 19, 2018 | Big Data, Data bases | 0 comments

Prerequisites of Apache Sqoop ExamplesSqoop logo

The prerequisites for these examples are the same as for the previous post of Sqoop.

These examples create a database “myddbb” and a table with values entered “mytable” and another empty table “mytable2”.

Example of loading data from MySQL to HDFS (compression: Snappy and Avro format)

$ sqoop import \
 --connect jdbc:mysql://localhost/mybbdd \
 --username=root -P \
 --table=mytable \
 --driver=com.mysql.jdbc.Driver \
 --target-dir=/ej_snappy_avro \
 --compress \
 --compression-codec org.apache.hadoop.io.compress.SnappyCodec \ 
 --as-avrodatafile

Example of loading data from MySQL to HDFS (compression: gzip and Avro format)

$ sqoop import \
 --connect jdbc:mysql://localhost/mibbdd \
 --username=root -P \
 --table=mitabla \
 --driver=com.mysql.jdbc.Driver \
 --target-dir=/ej_gzip_avro \
 --compress \
 --compression-codec org.apache.hadoop.io.compress.GzipCodec \
 --as-avrodatafile

 

Example of loading data from MySQL to HDFS (compression: BZIP2 and Sequence format)

$ sqoop import \
 --connect jdbc:mysql://localhost/mibbdd \
 --username=root -P \
 --table=mitabla \
 --driver=com.mysql.jdbc.Driver \
 --target-dir=/ej_bzip2_sequence \
 --compress \
 --compression-codec org.apache.hadoop.io.compress.BZip2Codec \
 --as-sequencefile

 

Example of loading data from MySQL to HDFS (restricting data with columns)

$ sqoop import \
 --connect jdbc:mysql://localhost/mibbdd \
 --username=root -P \
 --table=mitabla \
 --driver=com.mysql.jdbc.Driver \
 --target-dir=/ej_2_columns \
 --columns nombre,edad

 

Example of loading data from MySQL to HDFS (restricting data with WHERE)

$ sqoop import \
 --connect jdbc:mysql://localhost/mybbdd \
 --username=root -P \
 --table=mytable \
 --driver=com.mysql.jdbc.Driver \
 --target-dir=/ej_mayor_age_40 \
 --where "edad > 40"

 

Example of loading data from MySQL to HDFS (incremental load)

In order to make an incremental insertion we need to include new data to the table “MyTable”, for this we execute in MySQL the following sentence:

mysql> 
INSERT INTO mytable (nombre, edad, salario) VALUES
        ("Diego", 24, 21000), ("Rosa", 26, 24000), ("Javier", 28, 25000), ("Lorena", 35, 28000), ("Miriam", 42, 30000), ("Patricia", 43, 25000), ("Natalia", 45, 39000);

Note: To make the insertion necessary to do it in the db “MIBBDD”

Once the insertion is done we can make the incremental insertion from the 8 as it is the first element introduced in the new insertion.

$ sqoop import \ 
 --connect jdbc:mysql://localhost/mybbdd \ 
 --username=root -P \
 --table=mytable \ 
 --driver=com.mysql.jdbc.Driver \ 
 --target-dir=/my_table_hdfs \ 
 --incremental append \
 --check-column id \
 --last-value 8

 

Example of loading data from MySQL to HDFS and consultable from HIVE

In order to make an insertion of the table in the hive database, we must create db where it will be inserted, to avoid problems:

Hive > CREATE DATABASE mybbddhive;

Once the database is created, you are ready to run the query:

$ sqoop import \
 --connect jdbc:mysql://localhost/mybbdd \
 --username=root -P \
 --table=mytable \
 --driver=com.mysql.jdbc.Driver \
 --target-dir=/ej_hive \
 --compress \
 --compression-codec org.apache.hadoop.io.compress.SnappyCodec \ 
 --hive-import \
 --hive-database mihive \ 
 --create-hive-table \
 --hive-table ej_hive_table

0 Comments

Submit a Comment

Your email address will not be published. Required fields are marked *