Thursday, September 12, 2013

Installation of Hive on a Single-Node Hadoop Cluster Machine

Hive Installation: 

Hi all,

Here, I am going to show you how to install Hive on a single-node Hadoop cluster using the tarball (offline installation).


Prerequisites:

- Hadoop must be installed. To check, type " $ echo $HADOOP_HOME ".
- If HADOOP_HOME is not set, set it immediately (see the sketch after this list), because Hive and any other Hadoop-related application always looks for this variable on the current machine.
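
A quick check, and a fallback if the variable is missing, might look like the following. The /usr/local/hadoop path is only an assumption here; point it at your actual Hadoop install directory.

    $ echo $HADOOP_HOME
    # if the line above prints nothing, add the variable to ~/.bashrc (assumed path)
    $ echo 'export HADOOP_HOME=/usr/local/hadoop' >> ~/.bashrc
    $ source ~/.bashrc
    $ echo $HADOOP_HOME
    /usr/local/hadoop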

Note: Commands and file names in Linux are fully case sensitive, so be careful while typing them or while adding environment variables to .bashrc or .profile.

Here I am using 'hduser' as the default user to run the Hadoop cluster, so I install Hive through this user. Don't be confused when you see hduser, nagarjuna, or sudo along the way.

First, download the stable version of Hive from the Apache website. It must match the currently installed Hadoop version, otherwise you will face errors or bugs.

Installation Steps: 

Here I am using Hive version 0.9.0 for a Hadoop 1.0.4 cluster.
Download hive-0.9.0-bin.tar.gz (the pre-built binary release), not hive-0.9.0.tar.gz (the source release).
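
If you prefer to download from the command line, something like this should work; the URL below assumes the usual Apache archive layout for Hive 0.9.0.

    $ cd ~/Downloads
    $ wget https://archive.apache.org/dist/hive/hive-0.9.0/hive-0.9.0-bin.tar.gz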

See the directory structure I have on my machine below.

- In the above diagram, I have copied the hivexxx.tar.gz file to /usr/local/. Observe that no user other than root/sudo has permission to access it.
- So give permission to this file using the chmod command with sudo, like below.
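
A minimal sketch of those two steps, assuming the tarball was downloaded to hduser's Downloads folder and keeping the 0.9.0 file name from above:

    $ sudo cp ~/Downloads/hive-0.9.0-bin.tar.gz /usr/local/
    $ sudo chmod 755 /usr/local/hive-0.9.0-bin.tar.gz
    $ ls -l /usr/local/hive-0.9.0-bin.tar.gz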



Then extract the tar file using the tar command, like below. hduser may not have permission to extract into that folder, so use sudo to extract.
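
For example, assuming the 0.9.0 file name and /usr/local/ as the target folder:

    $ cd /usr/local/
    $ sudo tar -xzf hive-0.9.0-bin.tar.gz
    $ ls -d /usr/local/hive-0.9.0-bin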

Now, you will see the hivexxx folder, like below.

So far, we have only extracted the hivexxx tar.gz file to a location using the required permissions.
Okay, now we have to set system variables to run Hive.

- We need the HIVE_HOME and PATH system variables.
- Here we use user-level system variables, by placing a few lines of bash script in the .bashrc or .profile file under hduser's home directory (these files are hidden by default).

- Which editor you use for these files is up to you; I use nano or gedit.

For example
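To open the file with nano (assuming you are logged in as hduser, so ~ is hduser's home directory):

    $ nano ~/.bashrc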


Add the script to the end of the .bashrc or .profile file:
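
The lines to append might look like this; the /usr/local/hive-0.9.0-bin path is an assumption, so point HIVE_HOME at wherever you extracted the tarball.

    # Hive environment variables (assumed extract location)
    export HIVE_HOME=/usr/local/hive-0.9.0-bin
    export PATH=$PATH:$HIVE_HOME/bin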


Now, log out and log in again (re-login) to hduser. Then check for $HIVE_HOME; if it shows the Hive home directory, then Hive is ready to use.

Check like the screen below.
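
The check would look roughly like this (the output shown assumes the paths used above, and the hive shell will print some logging before its prompt):

    $ echo $HIVE_HOME
    /usr/local/hive-0.9.0-bin
    $ hive
    hive>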


Note:
   As the above screen shows, the hive shell opens even though Hadoop is not running. To run any SQL queries in the hive shell, you must start Hadoop first; otherwise you will get connection errors.
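
So, before running queries, start Hadoop and then try a simple statement in the hive shell. A rough sketch, assuming $HADOOP_HOME/bin is on your PATH (start-all.sh is the Hadoop 1.x start script):

    $ start-all.sh     # starts the HDFS and MapReduce daemons (Hadoop 1.x)
    $ jps              # verify NameNode, DataNode, JobTracker, TaskTracker are up
    $ hive
    hive> SHOW TABLES;
    hive> quit;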



Please let me know if there are any mistakes in this post; your valuable feedback is welcome.
Contact me at nagarjuna.lingala@gmail.com

You can also find me at javaojava.blogspot.com.
