SparkR on CDH and HDP


Spark added support for R back in version 1.4, and you can use SparkR in Spark Standalone mode.

Big Hadoop distros that bundle Spark, like Cloudera’s CDH and Hortonworks’ HDP, have varying degrees of support for R. For the time being, Cloudera has opted out of supporting R (the latest CDH 5.8.x release does not even ship the sparkR binaries), while HDP (versions 2.3.2, 2.4, …) includes SparkR as a technical preview and bundles some R-related components, like the sparkR script. Making it all work (if that is even possible at present) is another story, and making it run on YARN may be a novel the size of War and Peace. So you can view this more as a demonstration of Hortonworks’ commitment to Spark, and we are left with the original supported language triad: Scala, Python, and Java.
