~lazypower/charms/bundles/hdp-hadoop-hive-mysql/bundle

Viewing changes to README.md

Committer: amir sanjar
Date: 2014-08-15 14:33:33 UTC
Revision ID: amir.sanjar@canonical.com-20140815143333-8ep526w7x53r72h1

hdp 2.1 data analytic solution using HIVE, mysql, and hadoop

files added:

README.md

bundles.yaml

README.md

# A Hortonworks HDP 2.1 Hive, MySQL, and Hadoop Cluster

This bundle is a 7 node Hadoop cluster designed to scale out. It contains the following units:

One Hadoop master (YARN & HDFS) node

One Hadoop compute node

One Hive node

One MySQL node

## Usage

Once you have a cluster running, just run:

1) juju ssh yarn-hdfs-master/0 <<= ssh to hadoop master

2) Smoke test HDFS admin functionality - As the HDFS user, create a /user/$CLIENT_USER directory in the Hadoop file system. The steps below verify and demonstrate HDFS functionality:

a) sudo su $HDFS_USER

b) hdfs dfs -mkdir -p /user/ubuntu

c) hdfs dfs -chown ubuntu:ubuntu /user/ubuntu

d) hdfs dfs -chmod -R 755 /user/ubuntu

e) exit

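3) Smoke test YARN and MapReduce - Run the smoke test as the $CLIENT_USER, using TeraGen and TeraSort. TeraGen writes 100-byte rows, so the 10000 rows below produce about 1 MB; generating 10 GB would take 100000000 rows.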

a) hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples-*.jar teragen 10000 /user/ubuntu/teragenout

b) hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples-*.jar terasort /user/ubuntu/teragenout /user/ubuntu/terasortout

4) Smoke test HDFS functionality from the ubuntu user space - delete the MapReduce output from HDFS

hdfs dfs -rm -r /user/ubuntu/teragenout
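The row count passed to `teragen` in step 3 determines how much data is sorted. The following is a minimal sketch for sizing that argument, assuming TeraGen's fixed 100-byte row; the function names `bytes_for_rows` and `rows_for_bytes` are hypothetical helpers, not part of Hadoop.

```shell
#!/bin/sh
# TeraGen writes fixed-size rows of 100 bytes each, so the row count
# passed to `teragen` determines the dataset size.
ROW_BYTES=100

# Print the dataset size in bytes produced by the given number of rows.
# (Hypothetical helper for illustration.)
bytes_for_rows() {
    echo $(( $1 * $ROW_BYTES ))
}

# Print the number of rows needed for a dataset of the given size in bytes.
# (Hypothetical helper for illustration.)
rows_for_bytes() {
    echo $(( $1 / $ROW_BYTES ))
}

bytes_for_rows 10000          # row count used in step 3: prints 1000000
rows_for_bytes 10000000000    # 10 GB (decimal): prints 100000000
```

The printed row count can then be substituted for `10000` in the `teragen` command when a larger run is wanted.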

HIVE+HDFS Usage:

1) juju ssh hdphive/0 <<= ssh to hive server

2) sudo su $HIVE_USER

3) hive

4) from Hive console:

show databases;

create table test(col1 int, col2 string);

show tables;

exit;

5) exit from $HIVE_USER session

6) sudo su $HDFS_USER

7) hadoop dfsadmin -report <<== verify connection to the remote HDFS cluster

8) hdfs dfs -ls <<== verify that "test" directory has been created on the remote HDFS cluster
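The interactive Hive session in step 4 could also be run non-interactively, since the Hive CLI accepts a script file through `hive -f`. The sketch below only writes the statements to a file and prints the command to run; the file location and the function name `hive_smoke_sql` are assumptions made for illustration.

```shell
#!/bin/sh
# Print the HiveQL statements from step 4, one per line.
# (Hypothetical helper for illustration.)
hive_smoke_sql() {
    cat <<'EOF'
show databases;
create table test(col1 int, col2 string);
show tables;
EOF
}

# Write the statements to a script file that `hive -f` can execute.
# The path is an assumption; any writable location works.
SQL_FILE="${SQL_FILE:-${TMPDIR:-/tmp}/hive-smoke.sql}"
hive_smoke_sql > "$SQL_FILE"

# This sketch does not invoke Hive itself; it only shows the command.
echo "Run as \$HIVE_USER on the Hive node: hive -f $SQL_FILE"
```

Keeping the statements in a file makes the smoke test repeatable, although the `create table` statement will fail on a second run if the `test` table already exists.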

## Scale Out Usage

To increase the number of slave nodes, add units to the compute-node service. To add one unit:

juju add-unit compute-node

Or you can add multiple units at once:

juju add-unit -n4 compute-node
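When scaling to a specific cluster size, the number of units to add is the difference between the target and the current unit count. A minimal sketch of that arithmetic follows; `scale_out_cmd` is a hypothetical helper that only prints the `juju add-unit` command and does not run it, and the current unit count is assumed to be known (for example from `juju status`).

```shell
#!/bin/sh
# Print the juju command that grows a service from CURRENT to TARGET units.
# Usage: scale_out_cmd CURRENT TARGET [SERVICE]
# SERVICE defaults to "compute-node", the service name used in this README.
scale_out_cmd() {
    service=${3:-compute-node}
    delta=$(( $2 - $1 ))
    if [ "$delta" -le 0 ]; then
        echo "nothing to add: $service already has $1 unit(s)" >&2
        return 1
    fi
    echo "juju add-unit -n $delta $service"
}

scale_out_cmd 1 5    # prints: juju add-unit -n 4 compute-node
```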

## References
