...

Cloud Computing - Sqoop


Lesson Description


Lesson - #790 Sqoop Installation


As Sqoop is a sub-project of Hadoop, it can only work on a Linux operating system. Follow the steps given below to install Sqoop on your system.

Step 1: Verifying JAVA Installation

You need to have Java installed on your system before installing Sqoop. Let us verify the Java installation using the following command −
$ java –version
If Java is already installed on your system, you will see the following response −
java version "1.7.0_71"
Java(TM) SE Runtime Environment (build 1.7.0_71-b13)
Java HotSpot(TM) Client VM (build 25.0-b02, mixed mode)
If Java is not installed on your system, then follow the steps given below.

Installing Java

Follow the simple steps given below to install Java on your system.

Step 1
Download Java (JDK - X64.tar.gz) by visiting the following link: Java install. Then jdk-7u71-linux-x64.tar.gz will be downloaded onto your system.

Step 2
Generally, you will find the downloaded Java file in the Downloads folder. Verify it and extract the jdk-7u71-linux-x64.gz file using the following commands.
$ cd Downloads/
$ ls
jdk-7u71-linux-x64.gz
$ tar zxf jdk-7u71-linux-x64.gz
$ ls
jdk1.7.0_71 jdk-7u71-linux-x64.gz
Step 3
To make Java available to all users, you have to move it to the location "/usr/local/". Open root, and type the following commands.
$ su
password:

# mv jdk1.7.0_71 /usr/local/java
# exit
Step 4
To set up the PATH and JAVA_HOME variables, add the following commands to the ~/.bashrc file.
export JAVA_HOME=/usr/local/java
export PATH=$PATH:$JAVA_HOME/bin
Now apply all the changes to the current running system.
$ source ~/.bashrc
Step 5
Use the following commands to configure Java alternatives −
# alternatives --install /usr/bin/java java /usr/local/java/bin/java 2
# alternatives --install /usr/bin/javac javac /usr/local/java/bin/javac 2
# alternatives --install /usr/bin/jar jar /usr/local/java/bin/jar 2

# alternatives --set java /usr/local/java/bin/java
# alternatives --set javac /usr/local/java/bin/javac
# alternatives --set jar /usr/local/java/bin/jar
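
After the alternatives are registered, you can confirm that they point to the new installation and re-run the version check; this is a quick sanity check, not part of the original steps −
# alternatives --display java
# java -version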

Step 2: Verifying Hadoop Installation

Hadoop must be installed on your system before installing Sqoop. Let us verify the Hadoop installation using the following command −
$ hadoop version
If Hadoop is already installed on your system, you will get the following response −
Hadoop 2.4.1
--
Subversion https://svn.apache.org/repos/asf/hadoop/common -r 1529768
Compiled by hortonmu on 2013-10-07T06:28Z
Compiled with protoc 2.5.0
If Hadoop is not installed on your system, then proceed with the following steps −

Downloading Hadoop

Download and extract Hadoop 2.4.1 from the Apache Software Foundation using the following commands.
$ su
password:

# cd /usr/local
# wget http://apache.claz.org/hadoop/common/hadoop-2.4.1/hadoop-2.4.1.tar.gz
# tar xzf hadoop-2.4.1.tar.gz
# mkdir hadoop
# mv hadoop-2.4.1/* hadoop/
# exit

Installing Hadoop in Pseudo Distributed Mode

Follow the steps given below to install Hadoop 2.4.1 in pseudo-distributed mode.

Step 1: Setting up Hadoop
You can set the Hadoop environment variables by appending the following commands to the ~/.bashrc file.
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
Now, apply all the changes to the current running system.
$ source ~/.bashrc
Step 2: Hadoop Configuration
You can find all the Hadoop configuration files in the location "$HADOOP_HOME/etc/hadoop". You need to make suitable changes in those configuration files according to your Hadoop infrastructure.
$ cd $HADOOP_HOME/etc/hadoop
In order to develop Hadoop programs in Java, you have to reset the Java environment variables in the hadoop-env.sh file by replacing the JAVA_HOME value with the location of Java on your system.
export JAVA_HOME=/usr/local/java
Given below is the list of files that you need to edit to configure Hadoop.

core-site.xml

The core-site.xml file contains information such as the port number used for the Hadoop instance, the memory allocated for the file system, the memory limit for storing data, and the size of the Read/Write buffers. Open core-site.xml and add the following properties between the <configuration> and </configuration> tags.
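
The exact values depend on your cluster; a minimal sketch, assuming the HDFS namenode runs locally on port 9000 (a common single-node default; fs.default.name is the legacy property name still accepted by Hadoop 2.x), looks like this −
<configuration>
   <property>
      <!-- URI of the default file system; adjust host and port to your setup -->
      <name>fs.default.name</name>
      <value>hdfs://localhost:9000</value>
   </property>
</configuration>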

hdfs-site.xml

The hdfs-site.xml file contains information such as the replication value, the namenode path, and the datanode path of your local file systems, that is, the place where you want to store the Hadoop infrastructure. Let us assume the following data.
dfs.replication (data replication value) = 1

(In the following path, /hadoop/ is the user name and hadoopinfra/hdfs/namenode is the directory created by the HDFS file system.)
namenode path = //home/hadoop/hadoopinfra/hdfs/namenode

(hadoopinfra/hdfs/datanode is the directory created by the HDFS file system.)
datanode path = //home/hadoop/hadoopinfra/hdfs/datanode
Open this file and add the following properties between the <configuration> and </configuration> tags in this file.
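
Using the values assumed above, the file could look like the following sketch (dfs.name.dir and dfs.data.dir are the legacy property names accepted by Hadoop 2.x; dfs.namenode.name.dir and dfs.datanode.data.dir are the newer equivalents) −
<configuration>
   <property>
      <!-- number of copies kept for each block -->
      <name>dfs.replication</name>
      <value>1</value>
   </property>
   <property>
      <!-- local directory where the namenode stores its metadata -->
      <name>dfs.name.dir</name>
      <value>file:///home/hadoop/hadoopinfra/hdfs/namenode</value>
   </property>
   <property>
      <!-- local directory where the datanode stores block data -->
      <name>dfs.data.dir</name>
      <value>file:///home/hadoop/hadoopinfra/hdfs/datanode</value>
   </property>
</configuration>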

yarn-site.xml

This file is used to configure YARN in Hadoop. Open the yarn-site.xml file and add the following properties between the <configuration> and </configuration> tags in this file.
<configuration>
   <property>
      <name>yarn.nodemanager.aux-services</name>
      <value>mapreduce_shuffle</value>
   </property>
</configuration>
mapred-site.xml

This file is used to specify which MapReduce framework we are using. By default, Hadoop contains a template of this file, mapred-site.xml.template. First of all, you need to copy the file from mapred-site.xml.template to mapred-site.xml using the following command.
$ cp mapred-site.xml.template mapred-site.xml
Open the mapred-site.xml file and add the following properties between the <configuration> and </configuration> tags in this file.
<configuration>
   <property>
      <name>mapreduce.framework.name</name>
      <value>yarn</value>
   </property>
</configuration>

Verifying Hadoop Installation

Step 1 - Name Node Setup
Set up the namenode using the command "hdfs namenode -format" as follows.
$ cd ~
$ hdfs namenode -format
Step 2 - Verifying Hadoop dfs
The following command is used to start dfs. Executing this command will start your Hadoop file system.
$ start-dfs.sh
Step 3 - Verifying Yarn Script
The following command is used to start the yarn script. Executing this command will start your yarn daemons.
$ start-yarn.sh
Step 4 - Accessing Hadoop on Browser
The default port number to access Hadoop is 50070. Use the following URL to get Hadoop services on your browser.
http://localhost:50070/
Step 5 - Verify All Applications for Cluster
The default port number to access all the applications of the cluster is 8088. Use the following URL to visit this service.
http://localhost:8088/

Step 3: Downloading Sqoop

We can download the latest version of Sqoop from the following link: Sqoop download. For this tutorial, we are using version 1.4.5, that is, sqoop-1.4.5.bin__hadoop-2.0.4-alpha.tar.gz.

Step 4: Installing Sqoop

The following commands are used to extract the Sqoop tarball and move it to the "/usr/lib/sqoop" directory.
$ tar -xvf sqoop-1.4.5.bin__hadoop-2.0.4-alpha.tar.gz
$ su
password:

# mv sqoop-1.4.5.bin__hadoop-2.0.4-alpha /usr/lib/sqoop
# exit

Step 5: Configuring bashrc

You have to set up the Sqoop environment by appending the following lines to the ~/.bashrc file −
#Sqoop
export SQOOP_HOME=/usr/lib/sqoop
export PATH=$PATH:$SQOOP_HOME/bin
The following command is used to execute the ~/.bashrc file.
$ source ~/.bashrc

Step 6: Configuring Sqoop

To configure Sqoop with Hadoop, you need to edit the sqoop-env.sh file, which is placed in the $SQOOP_HOME/conf directory. First of all, change to the Sqoop config directory and copy the template file using the following commands −
$ cd $SQOOP_HOME/conf
$ mv sqoop-env-template.sh sqoop-env.sh
Open sqoop-env.sh and edit the following lines −
export HADOOP_COMMON_HOME=/usr/local/hadoop 
export HADOOP_MAPRED_HOME=/usr/local/hadoop

Step 7: Download and Configure mysql-connector-java

We can download the mysql-connector-java-5.1.30.tar.gz file from the following link. The following commands are used to extract the mysql-connector-java tarball and move mysql-connector-java-5.1.30-bin.jar to the /usr/lib/sqoop/lib directory.
$ tar -zxf mysql-connector-java-5.1.30.tar.gz
$ su
password:

# cd mysql-connector-java-5.1.30
# mv mysql-connector-java-5.1.30-bin.jar /usr/lib/sqoop/lib

Step 8: Verifying Sqoop

The following command is used to verify the Sqoop version.
$ cd $SQOOP_HOME/bin
$ sqoop-version