Tuesday, October 9, 2018

Modify the default slaves.sh to start Apache Spark on macOS


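By default, Apache Spark's sbin/start-all.sh tries to start each worker by connecting to the slave node over "ssh", even when we are only testing on a laptop. The modification below starts the worker through Bash instead of SSH. In the diff, lines marked "<" come from the modified slaves.sh and lines marked ">" come from the original, saved as slaves.sh.old.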

Donghuas-MacBook-Air:sbin donghua$ diff slaves.sh slaves.sh.old
92,94c92,93
<       cmd="${@// /\\ } 2>&1"
<       echo $cmd
<       bash -c "$cmd"
---
>     ssh $SPARK_SSH_OPTS "$slave" $"${@// /\\ }" \
>       2>&1 | sed "s/^/$slave: /"
96,98c95,96
<       cmd="${@// /\\ } 2>&1"
<       echo $cmd
<       bash -c "$cmd"
---
>     ssh $SPARK_SSH_OPTS "$slave" $"${@// /\\ }" \
>       2>&1 | sed "s/^/$slave: /" &
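
The idea in the diff above can be sketched as a small standalone function. This is only an illustration under stated assumptions, not the code from the revised slaves.sh: run_on_host is a hypothetical helper name, and treating only localhost and 127.0.0.1 as "local" is an assumption. It keeps the original script's space escaping and host prefix, runs the command with bash -c for local hosts, and falls back to ssh otherwise.

```shell
#!/usr/bin/env bash
# Sketch only: run_on_host is a hypothetical helper, not part of Spark.

run_on_host() {
  local host="$1"
  shift
  # Escape spaces inside each argument so it survives being re-parsed by a
  # shell; this is the same "${@// /\\ }" trick that slaves.sh uses.
  local cmd="${*// /\\ }"
  case "$host" in
    localhost|127.0.0.1)
      # Assumed definition of "local": run through Bash, no SSH needed.
      bash -c "$cmd" 2>&1 | sed "s/^/$host: /"
      ;;
    *)
      # Any other host: keep the original SSH behaviour.
      # SPARK_SSH_OPTS is left unquoted on purpose so its options split.
      ssh ${SPARK_SSH_OPTS:-} "$host" "$cmd" 2>&1 | sed "s/^/$host: /"
      ;;
  esac
}
```

For example, run_on_host localhost echo "hello world" prints "localhost: hello world" without opening an SSH connection.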


The revised code can be found here:

https://github.com/luodonghua/bigdata/blob/master/slaves.sh

A modified version of spark-config.sh that starts multiple workers can be found here:

https://github.com/luodonghua/bigdata/blob/master/spark-config.sh