Wednesday, January 3, 2018

Connecting Open Source RStudio Server to Kerberos-enabled Hadoop

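Step 1: Add the line "SPARK_HOME=${SPARK_HOME-'/opt/cloudera/parcels/CDH/lib/spark/'}" to the end of the file "/usr/lib64/R/etc/Renviron". The ${SPARK_HOME-...} form sets SPARK_HOME to the Cloudera parcel path only when the variable is not already defined.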
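
To try the Step 1 line without touching the system-wide Renviron, the following rough sketch writes it to a temporary file and reads it back. The temporary file and the name demo_renviron are assumptions for illustration only. If SPARK_HOME is already set in the session, the existing value is kept.

```r
# Write the Step 1 line to a throwaway file instead of /usr/lib64/R/etc/Renviron.
demo_renviron <- tempfile(fileext = ".Renviron")  # hypothetical demo file
writeLines(
  "SPARK_HOME=${SPARK_HOME-'/opt/cloudera/parcels/CDH/lib/spark/'}",
  demo_renviron
)

# readRenviron() returns TRUE (invisibly) when the file was read.
ok <- readRenviron(demo_renviron)

# Show the value the R session now sees.
print(Sys.getenv("SPARK_HOME"))
```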

Step 2: Connect to Spark using sparklyr inside RStudio Server

> install.packages("sparklyr")
> library(sparklyr)
> readRenviron("/usr/lib64/R/etc/Renviron")
> sc <- spark_connect(master = "yarn-client", version = "1.6.0", config = list(default = list(spark.yarn.keytab = "/home/donghua/donghua.keytab", spark.yarn.principal = "donghua@DBAGLOBE.COM")))
> sc
$master
[1] "yarn-client"

$method
[1] "shell"

$app_name
[1] "sparklyr"

$config
$config$default
$config$default$spark.yarn.keytab
[1] "/home/donghua/donghua.keytab"

$config$default$spark.yarn.principal
[1] "donghua@DBAGLOBE.COM"



$spark_home
[1] "/opt/cloudera/parcels/CDH-5.13.1-1.cdh5.13.1.p0.2/lib/spark"

$backend
A connection with                               
description "->localhost:46015"
class       "sockconn"         
mode        "wb"               
text        "binary"           
opened      "opened"           
can read    "yes"              
can write   "yes"              

$monitor
A connection with                              
description "->localhost:8880"
class       "sockconn"        
mode        "rb"              
text        "binary"          
opened      "opened"          
can read    "yes"             
can write   "yes"             

$output_file
[1] "/tmp/RtmpXWaXfE/file7af1ca61a03_spark.log"

$spark_context
<jobj[6]>
  class org.apache.spark.SparkContext
  org.apache.spark.SparkContext@355d7d99

$java_context
<jobj[7]>
  class org.apache.spark.api.java.JavaSparkContext
  org.apache.spark.api.java.JavaSparkContext@ef616c5

attr(,"class")
[1] "spark_connection"       "spark_shell_connection" "DBIConnection"   
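
A wrong keytab path only shows up after Spark tries to start, so it can help to check the file before connecting. Below is a minimal sketch of that idea. The helper kerberos_spark_config is hypothetical (it is not part of sparklyr) and simply builds the same config list used in Step 2.

```r
# Hypothetical helper, not a sparklyr function: build the Step 2 config list
# and fail early if the keytab file cannot be read.
kerberos_spark_config <- function(keytab, principal) {
  # file.access() returns 0 when the file is readable (mode = 4).
  if (file.access(keytab, mode = 4) != 0) {
    stop("Keytab not readable: ", keytab)
  }
  list(default = list(
    spark.yarn.keytab = keytab,
    spark.yarn.principal = principal
  ))
}

# Usage, assuming sparklyr is installed and the cluster is reachable:
# library(sparklyr)
# sc <- spark_connect(
#   master = "yarn-client", version = "1.6.0",
#   config = kerberos_spark_config("/home/donghua/donghua.keytab",
#                                  "donghua@DBAGLOBE.COM")
# )
```

The connection call is left commented out because it needs a running YARN cluster and a valid keytab.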



> library(DBI)
> iotdatademo <- dbGetQuery(sc, 'Select * from default.iotdatademo limit 10')
> iotdatademo


Reference URL: https://medium.com/@bkvarda/sparklyr-r-interface-for-spark-and-kerberos-on-cloudera-80abf5f6b4ad
