~asanjar/charms/bundles/hdp-accumulo-hadoop/bundle

# A Hortonworks HDP 2.1 Accumulo Cluster
Apache™ Accumulo is a high-performance data storage and retrieval system with
cell-level access control. It is a scalable implementation of Google’s Bigtable
design that runs on top of Apache Hadoop® and Apache ZooKeeper.

## Usage

From the bundle's home directory, run:

    juju quickstart bundles.yaml

## Scale Out Usage

To increase the number of slave nodes, add units. To add one unit:

     juju add-unit compute-node

Or you can add multiple units at once:   

     juju add-unit -n4 compute-node
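
When scale-out is scripted, it can help to validate the unit count before calling juju. The following is a minimal sketch and not part of the bundle: the function name `scale_out` and the `DRY_RUN` variable are hypothetical, and only the `juju add-unit -n<count> compute-node` form is taken from the commands above.

```shell
#!/usr/bin/env bash
# Sketch: add N compute-node units after checking that N is a positive integer.
# scale_out and DRY_RUN are illustrative names, not part of the bundle.

scale_out() {
    local count="${1:-1}"
    # Reject anything that is not made up of digits only (e.g. "-2", "four").
    case "$count" in
        ''|*[!0-9]*)
            echo "count must be a positive integer: $count" >&2
            return 1
            ;;
    esac
    # Reject zero as well.
    if [ "$count" -eq 0 ]; then
        echo "count must be a positive integer: $count" >&2
        return 1
    fi
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Print the command instead of running it.
        echo "juju add-unit -n${count} compute-node"
    else
        juju add-unit "-n${count}" compute-node
    fi
}
```

Calling `DRY_RUN=1 scale_out 4` prints the command without contacting juju; without `DRY_RUN` the same call runs it.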
    
## Validate Accumulo

    1. $ juju ssh accumulo-master/0    # SSH to the Accumulo master
    2. $ cd $ACCUMULO_HOME
    3. Initialize and start the Accumulo service:
       a. $ /usr/lib/accumulo/bin/accumulo init
       b. Enter instance name: accumulo (the smoke test below expects this name)
       c. Enter password: accumulo
       d. Start Accumulo: $ /usr/lib/accumulo/bin/start-all.sh
    4. View the Accumulo native UI:
       http://<$accumulo-master>:50095
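
To confirm from a script that the native UI is answering, one option is to poll the port until it responds. This is a minimal sketch, assuming `curl` is installed and that the caller supplies the master's address. `monitor_url` and `wait_for_monitor` are hypothetical helper names; only port 50095 comes from the steps above.

```shell
#!/usr/bin/env bash
# Sketch: poll the Accumulo native UI until it answers or the attempts run out.
# monitor_url and wait_for_monitor are illustrative names, not part of the bundle.

monitor_url() {
    # Build the UI URL for a given master address (port 50095 per the steps above).
    printf 'http://%s:50095\n' "$1"
}

wait_for_monitor() {
    local host="$1" attempts="${2:-10}" delay="${3:-5}"
    local url i=1
    url="$(monitor_url "$host")"
    while [ "$i" -le "$attempts" ]; do
        # -s: silent, -f: fail on HTTP errors, -o: discard the body,
        # --max-time: bound each individual try to 5 seconds.
        if curl -sf -o /dev/null --max-time 5 "$url"; then
            echo "monitor UI is up at $url"
            return 0
        fi
        sleep "$delay"
        i=$((i + 1))
    done
    echo "monitor UI did not respond after $attempts attempts: $url" >&2
    return 1
}
```

For example, `wait_for_monitor 10.0.0.5 20 3` would try up to 20 times, three seconds apart; the address here is a placeholder for the master's real one.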

## Smoke test Accumulo
### Apache Accumulo Bulk Ingest Example
This example shows how to bulk ingest data into Accumulo using MapReduce.

The commands below create a table called test_bulk with two initial split points,
generate 1000 rows of test data in HDFS, ingest those rows into Accumulo, and
then verify that all 1000 rows are present. Run them from $ACCUMULO_HOME on the
Accumulo master, with ZOOKEEPER_HOSTS set for your cluster.

    1. $ PKG=org.apache.accumulo.examples.simple.mapreduce.bulk
    2. $ ARGS="-i accumulo -z $ZOOKEEPER_HOSTS -u root -p accumulo"
    3. $ bin/accumulo $PKG.SetupTable $ARGS -t test_bulk row_00000333 row_00000666
    4. $ bin/accumulo $PKG.GenerateTestData --start-row 0 --count 1000 --output bulk/test_1.txt
    5. $ bin/tool.sh lib/accumulo-examples-simple.jar $PKG.BulkIngestExample $ARGS -t test_bulk --inputDir bulk --workDir tmp/bulkWork
    6. $ bin/accumulo $PKG.VerifyIngest $ARGS -t test_bulk --start-row 0 --count 1000
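
The six steps above can be collected into one script so the smoke test is repeatable. This is a sketch under stated assumptions rather than a tested part of the bundle. It assumes it is run from `$ACCUMULO_HOME` on the Accumulo master, that `ZOOKEEPER_HOSTS` is already set in the environment, and that the instance name and root password are both `accumulo`, as in the validation steps. The `run` helper, the `bulk_ingest_smoke_test` function, and the `DRY_RUN` variable are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: run the bulk ingest smoke test end to end, stopping at the first failure.
# run, bulk_ingest_smoke_test, and DRY_RUN are illustrative names.

PKG=org.apache.accumulo.examples.simple.mapreduce.bulk

run() {
    # Echo each command; execute it unless DRY_RUN=1.
    echo "+ $*"
    if [ "${DRY_RUN:-0}" != "1" ]; then
        "$@"
    fi
}

bulk_ingest_smoke_test() {
    local table="${1:-test_bulk}" count="${2:-1000}"
    if [ -z "${ZOOKEEPER_HOSTS:-}" ]; then
        echo "ZOOKEEPER_HOSTS must be set" >&2
        return 1
    fi
    # Connection arguments shared by every step except data generation.
    local args="-i accumulo -z $ZOOKEEPER_HOSTS -u root -p accumulo"

    # $args is intentionally unquoted so that it splits into separate arguments.
    run bin/accumulo "$PKG.SetupTable" $args -t "$table" row_00000333 row_00000666 &&
    run bin/accumulo "$PKG.GenerateTestData" --start-row 0 --count "$count" --output bulk/test_1.txt &&
    run bin/tool.sh lib/accumulo-examples-simple.jar "$PKG.BulkIngestExample" $args -t "$table" --inputDir bulk --workDir tmp/bulkWork &&
    run bin/accumulo "$PKG.VerifyIngest" $args -t "$table" --start-row 0 --count "$count"
}
```

Running `DRY_RUN=1 bulk_ingest_smoke_test` prints the four Accumulo commands without executing them, which is a cheap way to check the arguments before touching the cluster.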
       
## Contact Information
Amir Sanjar <amir.sanjar@canonical.com>