hadoop - Apache Spark: accessing data in HDFS on another cluster

I am running Spark on an Amazon EMR cluster whose public DNS is, let's say, 23.21.40.15.

I am now executing a Spark JAR on this cluster and want to write the output of the Spark job to the HDFS of another Amazon EMR cluster, whose public DNS is 29.45.56.72.

I am able to access my own cluster's HDFS (i.e. 23.21.40.15), but I am not able to write to the cluster at 29.45.56.72.

What do I need to do so that the Spark job can access HDFS on another cluster?
If possible, could you share some sample code for this?

When you set the output directory in your Spark job, you can include the credentials needed to access it in the URI:

hdfs://username:password@hostname:port/pathtofolder
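As a rough illustration of the URI form above, here is a minimal PySpark-style sketch. The helper names `build_hdfs_uri` and `save_rdd`, the host `namenode.example.com`, and port 8020 are all assumptions made for the example, not values from the question. Whether the `username:password` part is honored depends on how the remote cluster's authentication is set up, and the remote NameNode and DataNode ports must also be reachable from the cluster running the job.

```python
from urllib.parse import quote


def build_hdfs_uri(host, port, path, user=None, password=None):
    """Build a fully qualified hdfs:// URI pointing at another cluster's NameNode.

    All arguments are placeholders supplied by the caller (hypothetical helper).
    """
    netloc = f"{host}:{port}"
    if user is not None:
        # Percent-encode credentials so characters such as '@' or ':'
        # cannot be mistaken for URI delimiters.
        userinfo = quote(user, safe="")
        if password is not None:
            userinfo += ":" + quote(password, safe="")
        netloc = f"{userinfo}@{netloc}"
    if not path.startswith("/"):
        path = "/" + path
    return f"hdfs://{netloc}{path}"


def save_rdd(rdd, uri):
    """Write an RDD as text files under the given URI (hypothetical helper).

    Needs a live Spark context; it only forwards to RDD.saveAsTextFile.
    """
    rdd.saveAsTextFile(uri)


if __name__ == "__main__":
    # Assumed host and port; replace with the remote cluster's NameNode address.
    print(build_hdfs_uri("namenode.example.com", 8020, "output/run1"))
```

In a job packaged as a JAR, the same fully qualified URI string would be passed to the Scala or Java `saveAsTextFile` call instead.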

PS: You shouldn't post your clusters' IPs in a public question ;)
