Data warehouse infrastructure built on top of Hadoop.

Hive 0.11.3 is a data warehouse infrastructure built on top of Hadoop that
provides tools for easy data summarization, ad hoc querying and analysis of
large datasets stored in Hadoop files. It provides a mechanism to put
structure on this data, along with a simple query language called HiveQL.
HiveQL is based on SQL, so users familiar with SQL can query this data. At
the same time, the language allows traditional map/reduce programmers to plug
in their custom mappers and reducers for more sophisticated analysis that may
not be supported by the built-in capabilities of the language.

- HiveQL - A SQL dialect for querying data in an RDBMS fashion
- UDF/UDAF/UDTF (User Defined [Aggregate/Table] Functions) - Allows users to
  create custom Map/Reduce based functions for regular use
- Ability to do joins (inner/outer/semi) between tables
- Support (limited) for sub-queries
- Support for table 'Views'
- Ability to partition data into Hive partitions or buckets to enable faster
  querying
- Hive Web Interface - A web interface to Hive
- HiveServer2 - Supports multi-user querying using Thrift, JDBC and ODBC clients
- Hive Metastore - Ability to run a separate Metadata storage process
- Hive CLI - A Hive command line that supports HiveQL
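
As an illustration of the HiveQL and partitioning features listed above, the sketch below writes a small HiveQL script and runs it with the Hive command line if one is on the PATH. The `page_views` table, its columns, the date value and the `/tmp/page_views.hql` path are hypothetical names chosen for this example; they are not part of the charm.

```shell
#!/bin/sh
# Hypothetical example: table name, columns and file path are illustrative only.
HQL_FILE="${HQL_FILE:-/tmp/page_views.hql}"

cat > "$HQL_FILE" <<'EOF'
-- A table partitioned by date, so queries that filter on dt read less data
CREATE TABLE IF NOT EXISTS page_views (user_id STRING, url STRING)
PARTITIONED BY (dt STRING);

-- Count views per URL for a single partition
SELECT url, COUNT(*) AS views
FROM page_views
WHERE dt = '2014-01-01'
GROUP BY url;
EOF

# Run the script only when a Hive command line is available
if command -v hive >/dev/null 2>&1; then
    hive -f "$HQL_FILE"
else
    echo "hive not found; HiveQL written to $HQL_FILE"
fi
```

Filtering on the partition column (`dt` here) is what lets Hive skip the other partitions.
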

See [http://hive.apache.org](http://hive.apache.org) for more information.

This charm provides the Hive Server and Metastore roles which form part of an
overall Hive deployment.

A Hive deployment consists of a Hive service, an RDBMS (only MySQL is currently
supported), an optional Metastore service and a Hadoop cluster.

To deploy a simple four-node Hadoop cluster (see Hadoop charm README for further
details)::

    juju deploy hadoop hadoop-master
    juju deploy hadoop hadoop-slavecluster
    juju add-unit -n 2 hadoop-slavecluster
    juju add-relation hadoop-master:namenode hadoop-slavecluster:datanode
    juju add-relation hadoop-master:resourcemanager hadoop-slavecluster:nodemanager
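
The units take some time to come up after the relations are added. A minimal sketch for checking on them, assuming the `juju` client is on the PATH and the service names used above; the `hadoop_status` function name and the `JUJU` override variable are conventions of this example only:

```shell
# Print the status of the two Hadoop services deployed above.
# Set JUJU=echo to print the commands instead of running them (dry run).
hadoop_status() {
    for service in hadoop-master hadoop-slavecluster; do
        "${JUJU:-juju}" status "$service"
    done
}
```

Calling `hadoop_status` runs `juju status` once per service, while `JUJU=echo hadoop_status` only prints the two commands.
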

A Hive server stores metadata in MySQL::

    # hive requires ROW binlog
    juju set mysql binlog-format=ROW

To deploy a Hive service without a Metastore service::

    # deploy Hive instance (hive-server2)
    juju deploy hive2 hive-server
    # associate Hive with MySQL
    juju add-relation hive-server:db mysql:db

    # associate Hive with HDFS Namenode
    juju add-relation hive-server:namenode hadoop-master:namenode
    # associate Hive with resourcemanager
    juju add-relation hive-server:resourcemanager hadoop-master:resourcemanager
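
Once the Hive server unit is running, JDBC clients such as Beeline should be able to connect to it. The sketch below assumes HiveServer2 is listening on its default port, 10000, which this charm may or may not change, and that `beeline` is installed on the client machine. The function names and the address in the usage note are hypothetical.

```shell
# Build a HiveServer2 JDBC URL. Port 10000 is the HiveServer2 default;
# this is an assumption, the charm may configure a different port.
hive_jdbc_url() {
    host="$1"
    port="${2:-10000}"
    echo "jdbc:hive2://${host}:${port}/default"
}

# Open a Beeline session against the given host (requires beeline on the PATH).
hive_connect() {
    beeline -u "$(hive_jdbc_url "$@")"
}
```

For example, `hive_connect 10.0.0.5` opens a session against a unit at the made-up address 10.0.0.5; the real address can be read from `juju status hive-server`.
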

To deploy a Hive service with a Metastore service::

    # deploy Metastore instance
    juju deploy hive2 hive-metastore
    # associate Metastore with MySQL
    juju add-relation hive-metastore:db mysql:db

    # associate Metastore with Namenode
    juju add-relation hive-metastore:namenode hadoop-master:namenode

    # deploy Hive instance
    juju deploy hive2 hive-server
    # associate Hive with Metastore
    juju add-relation hive-server:server hive-metastore:metastore
    # associate Hive with Namenode
    juju add-relation hive-server:namenode hadoop-master:namenode
    # associate Hive with resourcemanager
    juju add-relation hive-server:resourcemanager hadoop-master:resourcemanager

Further Hive service units may be deployed::

    juju add-unit hive-server

This currently only works when using a Metastore service.
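
Several units can be added in one step with the same `-n` flag used for `hadoop-slavecluster` earlier. A small sketch, in which the `add_hive_units` function name, the default of 2 units and the `JUJU` override variable are choices made for this example:

```shell
# Add N further hive-server units (default 2, an arbitrary example value).
# Set JUJU=echo to print the command instead of running it (dry run).
add_hive_units() {
    "${JUJU:-juju}" add-unit -n "${1:-2}" hive-server
}
```

`add_hive_units 3` adds three units, and `JUJU=echo add_hive_units 3` prints the command without running it. As noted above, this is expected to work only when a Metastore service is in use.
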