forked from nexr/RHive
-
Notifications
You must be signed in to change notification settings - Fork 0
Expand file tree
/
Copy pathREADME
More file actions
68 lines (52 loc) · 2.17 KB
/
README
File metadata and controls
68 lines (52 loc) · 2.17 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
NexR RHive 0.0-6
================
RHive is an R extension facilitating distributed computing via HIVE query.
It allows easy usage of HQL in R, and allows easy usage of R objects and R functions in Hive.
Installation Guide
==================
1. Install Hadoop
1-1. Single Node (http://hadoop.apache.org/common/docs/r0.20.203.0/single_node_setup.html)
1-2. Cluster Node (http://hadoop.apache.org/common/docs/r0.20.203.0/cluster_setup.html)
1-3. set HADOOP_HOME at local machine on which R runs
2. Install Hive
2-1. install local machine and remote machine on which NameNode runs or Hive-Server runs.
2-2. Installation Guide
(https://cwiki.apache.org/confluence/display/Hive/GettingStarted#GettingStarted-InstallationandConfiguration)
2-3. set HIVE_HOME at local machine on which R runs.
2-4. launch Hive Server with following command on remote machine.
e.q. >$HIVE_HOME/bin/hive --service hiveserver
3. Install R and Packages
3-1. install R
3-1-1. need to install R on all tasktracker nodes
3-2. install rJava
3-2-1. only install rJava on local machine.
3-3. install Rserve
3-3-1. need to install Rserve on all tasktracker nodes
3-3-2. set RHIVE_DATA as R objects and R functions repository on all tasktracker nodes.
e.q. >export RHIVE_DATA=/rhive/data
3-3-3. make configuration in path (/etc/Rserv.conf) on all tasktracker nodes.
edit this file to add 'remote enable' to allow remote connection.
3-3-4. launch all Rserve on all tasktracker nodes.
e.q. >R CMD Rserve
3-4. install RUnit
4. Install RHive
4-1. requirement software : ANT.
4-2. R CMD INSTALL RHive_0.0-6.tar.gz
4-3. If HADOOP_HOME doesn't exist, do following instruction :
4-3-1. copy RUDF/RUDAF library(rhive_udf.jar) to '/rhive/lib/' of HDFS path,
using this command : 'hadoop fs -put rhive_udf.jar /rhive/lib/rhive_udf.jar'.
this jar file exists under $HIVE_HOME/lib.
5. Launch RHive
5-1. launch R
5-2. >library(RHive)
5-3. >rhive.connect(hive-server-ip)
6. Tutorial
https://github.com/nexr/RHive/wiki/UserGuides
Requirements
============
- Java 1.6
- R 2.13.0
- Rserve 0.6-0
- rJava 0.9-0
- Hadoop 0.20.x (x >= 1)
- Hive 0.8.x (x >= 0)