## Overview

The Apache Hadoop software library is a framework that allows for the
distributed processing of large data sets across clusters of computers
using a simple programming model.

This charm deploys a client node running
[Apache Hadoop 2.4.1](http://hadoop.apache.org/docs/r2.4.1/)
from which workloads can be manually run.

## Usage

This charm is intended to be deployed as a part of the
[core bundle](https://jujucharms.com/u/bigdata-dev/apache-core-batch-processing/):

    juju quickstart u/bigdata-dev/apache-core-batch-processing

This will deploy the Apache Hadoop platform with a single client unit.
From there, you can manually load and run map-reduce jobs:

    juju scp my-job.jar client/0:
    juju ssh client/0
    hadoop jar my-job.jar

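If you do not have a job jar handy, map-reduce logic can also be exercised via Hadoop Streaming, which runs plain scripts as mappers and reducers. The following is an illustrative sketch of a word-count mapper, not part of this charm; it assumes you pass it as the `-mapper` argument to the `hadoop-streaming` jar shipped with your Hadoop distribution (the jar's exact path varies by install).

```python
#!/usr/bin/env python
# Sketch of a word-count mapper for Hadoop Streaming.
# Illustrative example only -- not shipped with this charm.
import sys


def map_line(line):
    """Emit (word, 1) pairs for a single line of input."""
    return [(word, 1) for word in line.split()]


if __name__ == "__main__":
    # Hadoop Streaming feeds input records on stdin and expects
    # tab-separated key/value pairs on stdout.
    for line in sys.stdin:
        for word, count in map_line(line):
            sys.stdout.write("%s\t%d\n" % (word, count))
```

A matching reducer would sum the counts per word; both scripts are copied to the client unit with `juju scp` just like a jar.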
## Benchmarking

You can perform a terasort benchmark in order to gauge the performance of your environment:

    $ juju action do apache-hadoop-client/0 terasort
    Action queued with id: cbd981e8-3400-4c8f-8df1-c39c55a7eae6
    $ juju action fetch --wait 0 cbd981e8-3400-4c8f-8df1-c39c55a7eae6
    results:
      meta:
        composite:
          direction: asc
          units: ms
          value: "206676"
      results:
        raw: '{"Total vcore-seconds taken by all map tasks": "439783", "Spilled Records":
          "30000000", "WRONG_LENGTH": "0", "Reduce output records": "10000000", "HDFS:
          Number of bytes read": "1000001024", "Total vcore-seconds taken by all reduce
          tasks": "50275", "Reduce input groups": "10000000", "Shuffled Maps ": "8", "FILE:
          Number of bytes written": "3128977482", "Input split bytes": "1024", "Total
          time spent by all reduce tasks (ms)": "50275", "FILE: Number of large read operations":
          "0", "Bytes Read": "1000000000", "Virtual memory (bytes) snapshot": "7688794112",
          "Launched map tasks": "8", "GC time elapsed (ms)": "11656", "Bytes Written":
          "1000000000", "FILE: Number of read operations": "0", "HDFS: Number of write
          operations": "2", "Total megabyte-seconds taken by all reduce tasks": "51481600",
          "Combine output records": "0", "HDFS: Number of bytes written": "1000000000",
          "Total time spent by all map tasks (ms)": "439783", "Map output records": "10000000",
          "Physical memory (bytes) snapshot": "2329722880", "FILE: Number of write operations":
          "0", "Launched reduce tasks": "1", "Reduce input records": "10000000", "Total
          megabyte-seconds taken by all map tasks": "450337792", "WRONG_REDUCE": "0",
          "HDFS: Number of read operations": "27", "Reduce shuffle bytes": "1040000048",
          "Map input records": "10000000", "Map output materialized bytes": "1040000048",
          "CPU time spent (ms)": "195020", "Merged Map outputs": "8", "FILE: Number of
          bytes read": "2080000144", "Failed Shuffles": "0", "Total time spent by all
          maps in occupied slots (ms)": "439783", "WRONG_MAP": "0", "BAD_ID": "0", "Rack-local
          map tasks": "2", "IO_ERROR": "0", "Combine input records": "0", "Map output
          bytes": "1020000000", "CONNECTION": "0", "HDFS: Number of large read operations":
          "0", "Total committed heap usage (bytes)": "1755840512", "Data-local map tasks":
          "6", "Total time spent by all reduces in occupied slots (ms)": "50275"}'
    status: completed
    timing:
      completed: 2015-05-28 20:55:50 +0000 UTC
      enqueued: 2015-05-28 20:53:41 +0000 UTC
      started: 2015-05-28 20:53:44 +0000 UTC

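The per-job details live in the `raw` field, which is a JSON map of Hadoop job counters (all values are strings). A quick way to pull individual counters out of a fetched result — shown here against a small excerpt of the counters above, since in practice you would read the string from the `juju action fetch` output:

```python
import json

# Excerpt of the `raw` counters from the terasort run shown above;
# a real script would extract this string from `juju action fetch`.
raw = ('{"Total time spent by all map tasks (ms)": "439783", '
       '"CPU time spent (ms)": "195020", '
       '"Launched map tasks": "8"}')

counters = json.loads(raw)

# Counter values are strings, so convert before doing arithmetic.
map_ms = int(counters["Total time spent by all map tasks (ms)"])
map_tasks = int(counters["Launched map tasks"])
print("avg time per map task: %d ms" % (map_ms // map_tasks))
# -> avg time per map task: 54972 ms
```

The composite `value` (206676 ms here) is the headline number for comparing runs; the raw counters help explain *why* two runs differ.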
## Contact Information

- [bigdata-dev@lists.launchpad.net](mailto:bigdata-dev@lists.launchpad.net)

## Hadoop

- [Apache Hadoop](http://hadoop.apache.org/) home page
- [Apache Hadoop bug trackers](http://hadoop.apache.org/issue_tracking.html)
- [Apache Hadoop mailing lists](http://hadoop.apache.org/mailing_lists.html)
- [Apache Hadoop Juju Charm](http://jujucharms.com/?text=hadoop)