## Overview

The Apache Hadoop software library is a framework that allows for the
distributed processing of large data sets across clusters of computers
using a simple programming model.

This charm plugs in to a workload charm to provide the
[Apache Hadoop 2.4.1](http://hadoop.apache.org/docs/r2.4.1/)
libraries and configuration for the workload to use.

## Usage

This charm is intended to be deployed via one of the
[apache bundles](https://jujucharms.com/u/bigdata-charmers/#bundles).
For example:

    juju quickstart apache-analytics-sql

This will deploy the Apache Hadoop platform with a workload node
which is running Apache Hive to perform SQL-like queries against your data.

If you also wanted to be able to analyze your data using Apache Pig,
you could deploy it and attach it to the same plugin:

    juju deploy apache-pig pig
    juju add-relation plugin pig
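Once the relation is added, you can watch the deployment settle with `juju status`, filtered here to the `pig` service name used above:

```shell
# Watch until the pig unit reports "started"
juju status pig
```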

## Benchmarking

You can perform a terasort benchmark to gauge the performance of your environment:

    $ juju action do plugin/0 terasort
    Action queued with id: cbd981e8-3400-4c8f-8df1-c39c55a7eae6
    $ juju action fetch --wait 0 cbd981e8-3400-4c8f-8df1-c39c55a7eae6
    results:
      meta:
        composite:
          direction: asc
          units: ms
          value: "206676"
      results:
        raw: '{"Total vcore-seconds taken by all map tasks": "439783", "Spilled Records":
          "30000000", "WRONG_LENGTH": "0", "Reduce output records": "10000000", "HDFS:
          Number of bytes read": "1000001024", "Total vcore-seconds taken by all reduce
          tasks": "50275", "Reduce input groups": "10000000", "Shuffled Maps ": "8", "FILE:
          Number of bytes written": "3128977482", "Input split bytes": "1024", "Total
          time spent by all reduce tasks (ms)": "50275", "FILE: Number of large read operations":
          "0", "Bytes Read": "1000000000", "Virtual memory (bytes) snapshot": "7688794112",
          "Launched map tasks": "8", "GC time elapsed (ms)": "11656", "Bytes Written":
          "1000000000", "FILE: Number of read operations": "0", "HDFS: Number of write
          operations": "2", "Total megabyte-seconds taken by all reduce tasks": "51481600",
          "Combine output records": "0", "HDFS: Number of bytes written": "1000000000",
          "Total time spent by all map tasks (ms)": "439783", "Map output records": "10000000",
          "Physical memory (bytes) snapshot": "2329722880", "FILE: Number of write operations":
          "0", "Launched reduce tasks": "1", "Reduce input records": "10000000", "Total
          megabyte-seconds taken by all map tasks": "450337792", "WRONG_REDUCE": "0",
          "HDFS: Number of read operations": "27", "Reduce shuffle bytes": "1040000048",
          "Map input records": "10000000", "Map output materialized bytes": "1040000048",
          "CPU time spent (ms)": "195020", "Merged Map outputs": "8", "FILE: Number of
          bytes read": "2080000144", "Failed Shuffles": "0", "Total time spent by all
          maps in occupied slots (ms)": "439783", "WRONG_MAP": "0", "BAD_ID": "0", "Rack-local
          map tasks": "2", "IO_ERROR": "0", "Combine input records": "0", "Map output
          bytes": "1020000000", "CONNECTION": "0", "HDFS: Number of large read operations":
          "0", "Total committed heap usage (bytes)": "1755840512", "Data-local map tasks":
          "6", "Total time spent by all reduces in occupied slots (ms)": "50275"}'
    status: completed
    timing:
      completed: 2015-05-28 20:55:50 +0000 UTC
      enqueued: 2015-05-28 20:53:41 +0000 UTC
      started: 2015-05-28 20:53:44 +0000 UTC

## Deploying in Network-Restricted Environments

The Apache Hadoop charms can be deployed in environments with limited network
access. To deploy in this environment, you will need a local mirror to serve
the packages and resources required by these charms.

### Mirroring Packages

You can set up a local mirror for apt packages using squid-deb-proxy.
For instructions on configuring Juju to use this, see the
[Juju Proxy Documentation](https://juju.ubuntu.com/docs/howto-proxies.html).
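As a rough sketch (this assumes Juju 1.x command syntax and squid-deb-proxy's default port of 8000; `<mirror-host>` is a placeholder for your own machine):

```shell
# On a host reachable by your units: install the apt proxy/cache
sudo apt-get install squid-deb-proxy

# Point the Juju environment at it (Juju 1.x syntax)
juju set-env apt-http-proxy=http://<mirror-host>:8000
```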

### Mirroring Resources

In addition to apt packages, the Apache Hadoop charms require a few binary
resources, which are normally hosted on Launchpad. If access to Launchpad
is not available, the `jujuresources` library makes it easy to create a mirror
of these resources:

    sudo pip install jujuresources
    juju-resources fetch --all /path/to/resources.yaml -d /tmp/resources
    juju-resources serve -d /tmp/resources

This will fetch all of the resources needed by this charm and serve them via a
simple HTTP server. The output from `juju-resources serve` will give you a
URL that you can set as the `resources_mirror` config option for this charm.
Setting this option will cause all resources required by this charm to be
downloaded from the configured URL.

You can fetch the resources for all of the Apache Hadoop charms
(`apache-hadoop-hdfs-master`, `apache-hadoop-yarn-master`,
`apache-hadoop-compute-slave`, `apache-hadoop-plugin`, etc.) into a single
directory and serve them all with a single `juju-resources serve` instance.

## Contact Information

- <bigdata@lists.ubuntu.com>

## Hadoop

- [Apache Hadoop](http://hadoop.apache.org/) home page
- [Apache Hadoop bug trackers](http://hadoop.apache.org/issue_tracking.html)
- [Apache Hadoop mailing lists](http://hadoop.apache.org/mailing_lists.html)
- [Apache Hadoop Juju Charm](http://jujucharms.com/?text=hadoop)