## Overview

The Apache Hadoop software library is a framework that allows for the
distributed processing of large data sets across clusters of computers
using a simple programming model.

This charm deploys a client node running
[Apache Hadoop 2.4.1](http://hadoop.apache.org/docs/r2.4.1/)
from which workloads can be manually run.

## Usage

This charm is intended to be deployed as part of the
[core bundle](https://jujucharms.com/u/bigdata-dev/apache-core-batch-processing/):

    juju quickstart u/bigdata-dev/apache-core-batch-processing

This will deploy the Apache Hadoop platform with a single client unit.
From there, you can manually load and run map-reduce jobs:

    juju scp my-job.jar client/0:
    juju ssh client/0
    hadoop jar my-job.jar
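
If you don't have a job of your own at hand, you can smoke-test the cluster
with the examples JAR that ships with Hadoop. A minimal sketch, run from the
client unit; the `EXAMPLES_JAR` discovery below is an assumption, since the
JAR's exact location depends on how Hadoop was installed:

    # Locate the MapReduce examples JAR bundled with Hadoop
    # (its path is install-dependent, hence the find).
    EXAMPLES_JAR=$(find / -name 'hadoop-mapreduce-examples-*.jar' 2>/dev/null | head -n 1)

    # Stage some input text in HDFS and run the stock wordcount job.
    hadoop fs -mkdir -p input
    hadoop fs -put /etc/hosts input/
    hadoop jar "$EXAMPLES_JAR" wordcount input output

    # Inspect the per-word counts written by the reducer.
    hadoop fs -cat 'output/part-r-*'

Here `wordcount` reads every file under `input` and writes its word counts to
`output`; any other job JAR you upload with `juju scp` runs the same way.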

## Benchmarking

You can run a terasort benchmark to gauge the performance of your environment:

    $ juju action do apache-hadoop-client/0 terasort
    Action queued with id: cbd981e8-3400-4c8f-8df1-c39c55a7eae6
    $ juju action fetch --wait 0 cbd981e8-3400-4c8f-8df1-c39c55a7eae6
    results:
      meta:
        composite:
          direction: asc
          units: ms
          value: "206676"
      results:
        raw: '{"Total vcore-seconds taken by all map tasks": "439783", "Spilled Records":
          "30000000", "WRONG_LENGTH": "0", "Reduce output records": "10000000", "HDFS:
          Number of bytes read": "1000001024", "Total vcore-seconds taken by all reduce
          tasks": "50275", "Reduce input groups": "10000000", "Shuffled Maps ": "8", "FILE:
          Number of bytes written": "3128977482", "Input split bytes": "1024", "Total
          time spent by all reduce tasks (ms)": "50275", "FILE: Number of large read operations":
          "0", "Bytes Read": "1000000000", "Virtual memory (bytes) snapshot": "7688794112",
          "Launched map tasks": "8", "GC time elapsed (ms)": "11656", "Bytes Written":
          "1000000000", "FILE: Number of read operations": "0", "HDFS: Number of write
          operations": "2", "Total megabyte-seconds taken by all reduce tasks": "51481600",
          "Combine output records": "0", "HDFS: Number of bytes written": "1000000000",
          "Total time spent by all map tasks (ms)": "439783", "Map output records": "10000000",
          "Physical memory (bytes) snapshot": "2329722880", "FILE: Number of write operations":
          "0", "Launched reduce tasks": "1", "Reduce input records": "10000000", "Total
          megabyte-seconds taken by all map tasks": "450337792", "WRONG_REDUCE": "0",
          "HDFS: Number of read operations": "27", "Reduce shuffle bytes": "1040000048",
          "Map input records": "10000000", "Map output materialized bytes": "1040000048",
          "CPU time spent (ms)": "195020", "Merged Map outputs": "8", "FILE: Number of
          bytes read": "2080000144", "Failed Shuffles": "0", "Total time spent by all
          maps in occupied slots (ms)": "439783", "WRONG_MAP": "0", "BAD_ID": "0", "Rack-local
          map tasks": "2", "IO_ERROR": "0", "Combine input records": "0", "Map output
          bytes": "1020000000", "CONNECTION": "0", "HDFS: Number of large read operations":
          "0", "Total committed heap usage (bytes)": "1755840512", "Data-local map tasks":
          "6", "Total time spent by all reduces in occupied slots (ms)": "50275"}'
    status: completed
    timing:
      completed: 2015-05-28 20:55:50 +0000 UTC
      enqueued: 2015-05-28 20:53:41 +0000 UTC
      started: 2015-05-28 20:53:44 +0000 UTC
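
The `composite` value is the benchmark's headline metric in milliseconds
(lower is better, per `direction: asc`), and the raw counters above come from
sorting 10,000,000 100-byte records (about 1 GB). If you'd rather drive the
equivalent steps by hand from the client unit, here is a sketch reusing the
hypothetical `EXAMPLES_JAR` variable from the earlier example:

    # Generate 10,000,000 100-byte records (~1 GB), sort them,
    # then verify that the output is correctly ordered.
    hadoop jar "$EXAMPLES_JAR" teragen 10000000 tera-input
    hadoop jar "$EXAMPLES_JAR" terasort tera-input tera-output
    hadoop jar "$EXAMPLES_JAR" teravalidate tera-output tera-report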


## Contact Information

- [bigdata-dev@lists.launchpad.net](mailto:bigdata-dev@lists.launchpad.net)


## Hadoop

- [Apache Hadoop](http://hadoop.apache.org/) home page
- [Apache Hadoop bug trackers](http://hadoop.apache.org/issue_tracking.html)
- [Apache Hadoop mailing lists](http://hadoop.apache.org/mailing_lists.html)
- [Apache Hadoop Juju Charm](http://jujucharms.com/?text=hadoop)