Using H2O through R (Easy Start from Download)
----------------------------------------------

Prerequisites
"""""""""""""

0) Comfort with the command line (terminal or cmd)
   (http://www.davidbaumgold.com/tutorials/command-line/)
1) R (version 2.13 or greater) is installed
2) An H2O jar exists on the desktop of the local machine
   (http://0xdata.com/h2o/)

From the R Console
""""""""""""""""""

These instructions assume you are using a recent version of R and are
familiar with the basics of the command line.

The download package can be obtained by clicking the **Download H2O Jar**
button at http://0xdata.com/h2o/. Once the download completes, move the
downloaded file to the desktop.

*Users should be aware that for H2O to run successfully through R, an
instance of H2O must be running at the same time. If the instance of H2O is
stopped, the R program will no longer run, and work done will be lost.*

These instructions first address running H2O, and then running H2O through R.

Open a terminal/cmd and at the prompt enter:

::

  $ cd Desktop/h2o-.../    # change directory to the h2o folder

*Note that "h2o-..." should be completed with the full name of the h2o file
you downloaded. The file number may change to indicate a more recent version
of H2O.*

::

  $ java -Xmx3g -jar h2o.jar -name mystats-cloud    # starts an instance of H2O

This returns output like the following:

::

  08:26:03.603 main INFO WATER: ----- H2O started -----
  08:26:03.607 main INFO WATER: Build git branch: (no branch)
  08:26:03.607 main INFO WATER: Build git hash: d78c92f3b8a4c765b2276...
  08:26:03.607 main INFO WATER: Build git describe: d78c92f
  08:26:03.607 main INFO WATER: Build project version: 1.5.6.137
  08:26:03.608 main INFO WATER: Built by: 'jenkins'
  08:26:03.608 main INFO WATER: Built on: 'Mon Jul 29 17:10:45 PDT 2013'
  08:26:03.608 main INFO WATER: Java availableProcessors: 2
  08:26:03.613 main INFO WATER: Java heap totalMemory: 0.02 gb
  08:26:03.613 main INFO WATER: Java heap maxMemory: 2.90 gb
  08:26:03.635 main INFO WATER: ICE root: '/tmp'
  08:26:03.690 main INFO WATER: Internal communication uses port: 54322
  +                             Listening for HTTP and REST traffic on http://192.168.1.94:54321/
  08:26:03.775 main INFO WATER: H2O cloud name: 'mystats-cloud'
  08:26:03.775 main INFO WATER: (v1.5.6.137) 'mystats-cloud' on /192.168.1.94:54321, discovery address /236.151.114.91:60567
  08:26:03.779 main INFO WATER: Cloud of size 1 formed [/192.168.1.94:54321]
  08:26:03.779 main INFO WATER: Log dir: '/tmp/h2ologs'

**Minimize** the terminal window, and open R. In the R console, install the
package by entering the following command at the prompt:

::

  > install.packages("/Users/UserName/Desktop/h2o_file_name/R/filename.tar.gz", repos = NULL, type = "source")

**To find the tar.gz file name**, go to h2o file -> R, and find the file with
the extension ".tar.gz".

For example, a user at 0xdata enters the following into her R console at the
command prompt:

::

  > install.packages("/Users/Irene/Desktop/h2o-1.5.6.137/R/h2o_1.5.6.137.tar.gz", repos = NULL, type = "source")

This returns the following output:

::

  * installing *source* package 'h2o' ...
  ** R
  ** demo
  ** inst
  ** preparing package for lazy loading
  Creating a generic function for colnames from package base in package h2o
  Creating a generic function for nrow from package base in package h2o
  Creating a generic function for ncol from package base in package h2o
  Creating a generic function for summary from package base in package h2o
  Creating a generic function for as.data.frame from package base in package h2o
  ** help
  *** installing help indices
  ** building package indices
  ** testing if installed package can be loaded
  * DONE (h2o)

**RStudio users** can install the H2O package by finding the tabbed menu
"File; Plots; Packages; Help" and choosing *Packages*. Clicking
*Install Packages* brings up an installation helper. Choose
*Package Archive File (tgz; .tar.gz)* in the *"Install From"* field. Click
browse and follow the helper to specify Desktop -> h2o file -> R -> .tar.gz.
*Click "Open". Click "Install".*

All R users (both console and RStudio) enter the command:

::

  > require(h2o)

which returns the following output:

::

  Loading required package: h2o
  Loading required package: RCurl
  Loading required package: bitops
  Loading required package: rjson

In the R terminal enter:

::

  > localH2O = new("H2OClient")
  > h2o.checkClient(localH2O)

which returns the following output:

::

  Successfully connected to http://127.0.0.1:54321

Users can now run H2O from their R or RStudio console. Additional R
documentation can be found at
https://github.com/0xdata/h2o/blob/master/R/h2o-package/h2o_package.pdf
and in the R section of the main user documentation page. Users can also
enter **??h2o** at any time to access help.

**Users can change the amount of memory allocated to H2O.** In the Java
command entered in the terminal to start H2O, the option **-Xmx3g** was used.
Xmx sets the maximum amount of memory (Java heap) given to H2O. If your data
set is large, give H2O more memory (for example, -Xmx4g gives H2O four
gigabytes of memory). For best performance, Xmx should be 4x the size of your
data, but never more than the total amount of memory on your computer.
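
The sizing rule above can be sketched as a small shell helper. This is only an illustration under stated assumptions: ``suggest_xmx_gb`` is a hypothetical function (it is not part of H2O), it works in whole gigabytes, and the 1 GB data size and 8 GB machine are example values. It prints the start command instead of running it, so you can check the heap size first.

```shell
#!/bin/sh
# Hypothetical helper (not part of H2O): suggest a heap size in whole gigabytes.
# Usage: suggest_xmx_gb DATA_GB TOTAL_GB
suggest_xmx_gb() {
    data_gb=$1
    total_gb=$2
    heap_gb=$((data_gb * 4))          # rule of thumb from the text: 4x the data size
    if [ "$heap_gb" -gt "$total_gb" ]; then
        heap_gb=$total_gb             # never more than the machine's total memory
    fi
    echo "$heap_gb"
}

# Example values (assumptions): a 1 GB data set on a machine with 8 GB of memory.
heap=$(suggest_xmx_gb 1 8)

# Print the command that would start H2O with the suggested heap size.
echo "java -Xmx${heap}g -jar h2o.jar -name mystats-cloud"
```

With these example values the script prints ``java -Xmx4g -jar h2o.jar -name mystats-cloud``. In practice you may prefer a value somewhat below the computed cap so the operating system and R keep some memory for themselves.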