Platform architecture: currently supported platforms are x86 and POWER.
Note: for POWER you **MUST** select IBM as the Java vendor.
Valid selections: "IBM" for POWER and "OPENJDK" for x86.
dfs_namenode_handler_count:
The number of server threads for the namenode. Increase this in larger
deployments to ensure the namenode can cope with the number of datanodes
that it has to deal with.
Default block replication. The actual number of replications can be specified when
the file is created. The default is used if replication is not specified at create time.
The default block size for new files (defaults to 64MB). Increase this in
larger deployments for better large data set performance.
The size of buffer for use in sequence files. The size of this buffer should
probably be a multiple of hardware page size (4096 on Intel x86), and it
determines how much data is buffered during read and write operations.
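A candidate buffer size can be sanity-checked for page-size alignment with a short sketch (the 65536-byte value below is only an illustrative choice, not a default taken from this charm):

```python
PAGE_SIZE = 4096  # hardware page size on Intel x86

def is_page_aligned(buffer_size: int, page_size: int = PAGE_SIZE) -> bool:
    """Return True when the buffer size is a positive whole multiple of the page size."""
    return buffer_size > 0 and buffer_size % page_size == 0

# 65536 bytes (64 KiB) is exactly 16 pages.
print(is_page_aligned(65536))  # True
print(is_page_aligned(50000))  # False: 50000 is not a multiple of 4096
```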
dfs_datanode_max_xcievers:
The number of files that a datanode will serve at any one time.
A Hadoop HDFS datanode has an upper bound on the number of files that it
will serve at any one time. This defaults to 256 (which is low) in Hadoop
1.x; this charm increases it to 4096.
mapreduce_framework_name:
Execution framework set to Hadoop YARN. **DO NOT CHANGE**
mapreduce_reduce_shuffle_parallelcopies:
The default number of parallel transfers run by reduce during the copy (shuffle) phase.
mapred_child_java_opts:
Java opts for the task tracker child processes. The following symbol,
if present, will be interpolated: @taskid@ is replaced by current TaskID.
Any other occurrences of '@' will go unchanged. For example, to enable
verbose gc logging to a file named for the taskid in /tmp and to set
the heap maximum to be a gigabyte, pass a 'value' of:
-Xmx1024m -verbose:gc -Xloggc:/tmp/@taskid@.gc
The configuration variable mapred.child.ulimit can be used to control
the maximum virtual memory of the child processes.
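The @taskid@ interpolation described above can be mimicked with a short sketch (Hadoop performs this substitution itself; the function here only illustrates the documented behaviour, and the task ID is a hypothetical example):

```python
def interpolate_child_opts(opts: str, task_id: str) -> str:
    """Replace every @taskid@ with the current task ID; other '@' characters are untouched."""
    return opts.replace("@taskid@", task_id)

opts = "-Xmx1024m -verbose:gc -Xloggc:/tmp/@taskid@.gc"
print(interpolate_child_opts(opts, "attempt_200707121733_0003_m_000005_0"))
# -Xmx1024m -verbose:gc -Xloggc:/tmp/attempt_200707121733_0003_m_000005_0.gc
```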
mapreduce_task_io_sort_factor:
The number of streams to merge at once while sorting files. This
determines the number of open file handles.
mapreduce_task_io_sort_mb:
The memory limit, in megabytes, used while sorting data; higher values improve efficiency.
mapred_job_tracker_handler_count:
The number of server threads for the JobTracker. This should be roughly
4% of the number of tasktracker nodes.
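The 4% rule of thumb works out as in this sketch (the floor of 10 is an assumption mirroring Hadoop's default handler count, not something this charm specifies):

```python
def jobtracker_handler_count(num_tasktrackers: int, minimum: int = 10) -> int:
    """Roughly 4% of the tasktracker count, with an assumed floor (Hadoop's default is 10)."""
    return max(minimum, round(0.04 * num_tasktrackers))

for nodes in (50, 250, 1000):
    print(nodes, "->", jobtracker_handler_count(nodes))
# 50 -> 10, 250 -> 10, 1000 -> 40
```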
tasktracker_http_threads:
The number of worker threads for the HTTP server. This is used for map output fetching.
default: /usr/local/hadoop/data
The directory under which all other Hadoop data is stored. Use this
to take advantage of extra storage that might be available.
You can change this in a running deployment but all existing data in
HDFS will be inaccessible; you can of course switch it back if you
need to access that data again.
yarn_nodemanager_aux-services:
default: mapreduce_shuffle
Shuffle service that needs to be set for Map Reduce applications.
yarn_nodemanager_aux-services_mapreduce_shuffle_class:
default: org.apache.hadoop.mapred.ShuffleHandler
The class implementing the shuffle service configured above.
dfs_heartbeat_interval:
Determines the datanode heartbeat interval in seconds.
dfs_namenode_heartbeat_recheck_interval:
Determines the datanode recheck heartbeat interval in milliseconds.
It is used to calculate the final timeout value for the namenode, as follows:
timeout = 2 * dfs.namenode.heartbeat.recheck-interval + 10 * 1000 * dfs.heartbeat.interval.
With the defaults (dfs.namenode.heartbeat.recheck-interval = 5*60*1000 ms,
dfs.heartbeat.interval = 3 s) this gives 630000 ms, i.e. 10 minutes 30 seconds.
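The timeout calculation can be checked with a short sketch (a plain restatement of the arithmetic above, not code taken from Hadoop):

```python
def namenode_timeout_ms(recheck_interval_ms: int = 5 * 60 * 1000,
                        heartbeat_interval_s: int = 3) -> int:
    """Final datanode-liveness timeout used by the namenode, in milliseconds."""
    return 2 * recheck_interval_ms + 10 * 1000 * heartbeat_interval_s

timeout = namenode_timeout_ms()
print(timeout, "ms =", timeout / 60000, "minutes")  # 630000 ms = 10.5 minutes
```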