Sqoop

Sqoop is a command-line interface application for transferring data between relational databases and Hadoop. It supports incremental loads of a single table or of a free-form SQL query, as well as saved jobs that can be run multiple times to import the updates made to a database since the last import.
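
As a rough sketch of how an incremental import can be wrapped in a saved job, the script below only prints the commands, so they can be reviewed before being run on a machine where Sqoop is installed. The connection string and user name are the ones used later in this article; the job name `daily_employees`, the table `employees`, and the check column `emp_no` are hypothetical placeholders.

```shell
#!/bin/sh
# Dry-run sketch: prints Sqoop commands instead of executing them.
# Job, table and column names passed below are hypothetical placeholders.
DB_URL="jdbc:mysql://localhost/employees"
DB_USER="sandy"

# Print the command that defines a saved incremental-import job.
# $1 = job name, $2 = table, $3 = check column, $4 = starting last-value
print_create_job() {
  printf 'sqoop job --create %s -- import --connect %s --username %s -P --table %s --incremental append --check-column %s --last-value %s\n' \
    "$1" "$DB_URL" "$DB_USER" "$2" "$3" "$4"
}

# Print the command that re-runs a saved job.
# $1 = job name
print_exec_job() {
  printf 'sqoop job --exec %s\n' "$1"
}

print_create_job daily_employees employees emp_no 0
print_exec_job daily_employees
```

When an incremental import is stored as a saved job, Sqoop keeps track of the last imported value between runs, so each `--exec` should pick up only the rows added since the previous one.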

How Sqoop works

Sqoop first asks the relational database for metadata about the table to be imported. From that metadata it generates a Java class that represents a row of the table, compiles it, and packages the compiled class into a JAR file. The job that Sqoop runs internally then uses the generated class, reading the data through the JDBC API.
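
The class-generation step described above can also be triggered on its own with the `sqoop codegen` tool. The following is a minimal sketch under stated assumptions, not a definitive recipe: it prints the command rather than running it, and the table name and the two output directories are hypothetical.

```shell
#!/bin/sh
# Dry-run sketch: prints a `sqoop codegen` command instead of executing it.
DB_URL="jdbc:mysql://localhost/employees"   # connection string used in this article
DB_USER="sandy"                             # user name used in this article

# $1 = table (hypothetical)
# $2 = directory for the generated .java source
# $3 = directory for the compiled classes and the JAR file
print_codegen() {
  printf 'sqoop codegen --connect %s --username %s -P --table %s --outdir %s --bindir %s\n' \
    "$DB_URL" "$DB_USER" "$1" "$2" "$3"
}

print_codegen employees ./gen-src ./gen-bin
```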

Why Sqoop?

For a Hadoop developer, the real work starts once the data has been loaded into HDFS. Developers explore this data to uncover the insights hidden in it. The data needed for an analysis usually resides in relational database management systems, so it first has to be transferred to HDFS. Writing MapReduce code to import and export data between a relational database and HDFS is uninteresting and tedious. This is where Apache Sqoop comes to the rescue. It automates the process of importing and exporting data, and makes developers' lives easier by offering a CLI for both operations. The developer only has to provide basic information such as database authentication, source, destination, and operation; Sqoop takes care of the rest. Sqoop converts the command into MapReduce tasks, which are then executed over HDFS. Because it uses the YARN framework to import and export the data, it offers fault tolerance on top of parallelism.

Key features

Sqoop provides the following salient features:

1. Full load - Apache Sqoop can load an entire table with a single command. It can also load all the tables of a database with a single command.

2. Incremental load - Sqoop can also load data incrementally, importing only the parts of a table that have changed since the last import.

3. Parallel import/export - Sqoop uses the YARN framework to import and export data, which provides fault tolerance on top of parallelism.

4. Import results of SQL query - We can also import the result returned by an SQL query into HDFS.

5. Compression - We can compress the data using the deflate (gzip) algorithm with the --compress argument, or choose a different codec with the --compression-codec argument.

6. Connectors for all major RDBMS databases - Apache Sqoop provides connectors for various RDBMS databases, covering most of the commonly used ones.

7. Kerberos security integration - Kerberos is a computer network authentication protocol that works on the basis of tickets, allowing nodes communicating over a non-secure network to prove their identity to one another in a secure manner. Sqoop supports Kerberos authentication.

8. Load data directly into Hive/HBase - Sqoop can load data directly into Apache Hive for analysis, and can also put the data into HBase, which is a NoSQL database.

9. Support for Accumulo - We can instruct Sqoop to import a table into Accumulo rather than into a directory in HDFS.
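
Several of the features listed above map to specific command-line options. The sketch below prints one example command for a full load of all tables, a free-form query import, a compressed import, and a Hive import. It prints the commands rather than executing them; the table name, column names, target directory, and the choice of the Snappy codec are assumptions made purely for illustration.

```shell
#!/bin/sh
# Dry-run sketch: prints Sqoop commands instead of executing them.
# Table, column and directory names passed below are hypothetical.
DB_URL="jdbc:mysql://localhost/employees"
DB_USER="sandy"
BASE="--connect $DB_URL --username $DB_USER -P"

# Feature 1: full load of every table in the database.
print_import_all() {
  printf 'sqoop import-all-tables %s\n' "$BASE"
}

# Feature 4: import the result of a free-form query. Sqoop expects the
# literal token $CONDITIONS in the WHERE clause, plus --split-by when the
# import runs with more than one mapper, and an explicit --target-dir.
# $1 = query, $2 = split column, $3 = HDFS target directory
print_query_import() {
  printf "sqoop import %s --query '%s' --split-by %s --target-dir %s\n" \
    "$BASE" "$1" "$2" "$3"
}

# Feature 5: compressed import with an explicit codec.
# $1 = table, $2 = Hadoop codec class name
print_compressed_import() {
  printf 'sqoop import %s --table %s --compress --compression-codec %s\n' \
    "$BASE" "$1" "$2"
}

# Feature 8: load a table directly into Hive.
# $1 = table
print_hive_import() {
  printf 'sqoop import %s --table %s --hive-import\n' "$BASE" "$1"
}

print_import_all
# Single quotes keep the shell from expanding $CONDITIONS.
print_query_import 'SELECT emp_no, first_name FROM employees WHERE $CONDITIONS' emp_no /user/sandy/query_result
print_compressed_import employees org.apache.hadoop.io.compress.SnappyCodec
print_hive_import employees
```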

Sqoop commands

1. Sqoop import command - The import command is used to import a table from a relational database into HDFS. In the examples here, tables are imported from a MySQL database into HDFS.

The command for importing a table is:

a) sqoop import --connect jdbc:mysql://localhost/employees --username sandy --table <table-name>

Sqoop import command with target directory

1. sqoop import --connect jdbc:mysql://localhost/employees --username sandy --table <table-name> --target-dir <hdfs-directory>

Sqoop imports data in parallel from most database sources. The -m property specifies the number of mappers (map tasks) to be executed.
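
To show the target directory and the mapper count in a single command, here is a small sketch that assembles and prints such an import. It does not run Sqoop itself. The table name, the HDFS path, and the mapper count of 4 are hypothetical values, and the check on the mapper count is an added convenience, not something Sqoop requires of the caller.

```shell
#!/bin/sh
# Dry-run sketch: prints a Sqoop import command instead of executing it.
DB_URL="jdbc:mysql://localhost/employees"
DB_USER="sandy"

# $1 = table, $2 = HDFS target directory, $3 = number of mappers
print_parallel_import() {
  # Reject an empty or non-numeric mapper count before building the command.
  case "$3" in
    ''|*[!0-9]*)
      echo "mapper count must be a whole number" >&2
      return 1
      ;;
  esac
  printf 'sqoop import --connect %s --username %s -P --table %s --target-dir %s -m %s\n' \
    "$DB_URL" "$DB_USER" "$1" "$2" "$3"
}

print_parallel_import employees /user/sandy/employees 4
```

With more than one mapper, Sqoop needs a column on which to divide the work; by default it uses the table's primary key, and `--split-by` can name a different column.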

Questions

1. What is Apache Sqoop?

2. Explain the architecture of Sqoop.
