hadoop - Install Spark on YARN cluster


I am looking for a guide on how to install Spark on an existing virtual YARN cluster.

I have a YARN cluster consisting of two nodes. I ran a MapReduce job and it worked perfectly; I checked the results in the log and everything is working fine.

Now I need to add the Spark installation commands and configuration files to my Vagrantfile. I can't find a good guide; could someone give me a link?

I used this guide for the YARN cluster:

http://www.alexjf.net/blog/distributed-systems/hadoop-yarn-installation-definitive-guide/#single-node-installation

Thanks in advance!

I don't know Vagrant, but I have installed Spark on top of Hadoop 2.6 (referred to as post-YARN in the guide), and I hope this helps.

Installing Spark on an existing Hadoop installation is easy; you only need to install it on one machine. Download the version pre-built for your Hadoop version from the official website (I guess you can also use the "without Hadoop" version, but then you need to point it to the Hadoop binaries on your system). Then decompress it:

tar -xvf spark-2.0.0-bin-hadoop2.x.tgz -C /opt
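For a provisioning script (for example one run by Vagrant's shell provisioner), the extraction step could be wrapped roughly as follows. This is only a sketch: `install_spark` is a hypothetical helper name, the tarball is assumed to be downloaded already, and the script assumes the archive unpacks into a directory named after the tarball.

```shell
# install_spark is a hypothetical helper, not part of Spark or Hadoop.
# Usage: install_spark <tarball> <install_dir>
install_spark() {
  local tarball="$1"      # e.g. spark-2.0.0-bin-hadoop2.x.tgz (already downloaded)
  local install_dir="$2"  # e.g. /opt

  if [ ! -f "$tarball" ]; then
    echo "missing tarball: $tarball" >&2
    return 1
  fi

  mkdir -p "$install_dir"
  tar -xf "$tarball" -C "$install_dir"

  # Assumption: the archive contains one top-level directory named like the tarball.
  local spark_home="$install_dir/$(basename "$tarball" .tgz)"
  if [ ! -d "$spark_home" ]; then
    echo "expected $spark_home after extraction" >&2
    return 1
  fi

  # Print the resulting directory so the caller can use it as SPARK_HOME.
  echo "$spark_home"
}
```

The printed path can then be captured, e.g. `SPARK_HOME=$(install_spark spark-2.0.0-bin-hadoop2.x.tgz /opt)`.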

Now you need to set some environment variables. First, in your ~/.bashrc (or ~/.zshrc) you can set SPARK_HOME and add it to your PATH if you want:

export SPARK_HOME=/opt/spark-2.0.0-bin-hadoop2.x
export PATH=$PATH:$SPARK_HOME/bin

For the changes to take effect, you can run:

source ~/.bashrc 
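A quick way to confirm the variables took effect is a small check like the one below. It is a minimal sketch; `check_spark_path` is a hypothetical helper name, and it only inspects the environment without starting Spark.

```shell
# check_spark_path is a hypothetical helper, not a Spark command.
# It succeeds only if SPARK_HOME is set and $SPARK_HOME/bin appears in PATH.
check_spark_path() {
  if [ -z "${SPARK_HOME:-}" ]; then
    echo "SPARK_HOME is not set"
    return 1
  fi
  # Wrap PATH in colons so the first and last entries match as well.
  case ":$PATH:" in
    *":$SPARK_HOME/bin:"*)
      echo "ok: $SPARK_HOME/bin is on PATH"
      ;;
    *)
      echo "$SPARK_HOME/bin is not on PATH"
      return 1
      ;;
  esac
}
```

Running `check_spark_path` in a new shell shows whether the lines in ~/.bashrc were picked up.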

Second, you need to point Spark to your Hadoop configuration directories. To do this, set these two environment variables in $SPARK_HOME/conf/spark-env.sh:

export HADOOP_CONF_DIR=[your-hadoop-conf-dir, usually $HADOOP_PREFIX/etc/hadoop]
export YARN_CONF_DIR=[your-yarn-conf-dir, usually the same as the last variable]

If this file doesn't exist, you can copy the contents of $SPARK_HOME/conf/spark-env.sh.template and start from there.
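The spark-env.sh step can be scripted so that it is safe to re-run during provisioning. The following is a sketch under stated assumptions: `configure_spark_env` is a hypothetical helper name, and it uses the same directory for both variables, as suggested above.

```shell
# configure_spark_env is a hypothetical helper, not part of Spark.
# Usage: configure_spark_env <spark_conf_dir> <hadoop_conf_dir>
configure_spark_env() {
  local spark_conf_dir="$1"   # e.g. $SPARK_HOME/conf
  local hadoop_conf_dir="$2"  # e.g. $HADOOP_PREFIX/etc/hadoop
  local env_file="$spark_conf_dir/spark-env.sh"

  # Start from the shipped template when spark-env.sh does not exist yet.
  if [ ! -f "$env_file" ]; then
    if [ -f "$env_file.template" ]; then
      cp "$env_file.template" "$env_file"
    else
      touch "$env_file"
    fi
  fi

  # Append the exports only once, so running the script again changes nothing.
  if ! grep -q '^export HADOOP_CONF_DIR=' "$env_file"; then
    printf 'export HADOOP_CONF_DIR=%s\n' "$hadoop_conf_dir" >> "$env_file"
    printf 'export YARN_CONF_DIR=%s\n' "$hadoop_conf_dir" >> "$env_file"
  fi
}
```

A typical call would be `configure_spark_env "$SPARK_HOME/conf" "$HADOOP_PREFIX/etc/hadoop"`.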

Now, to start the shell in YARN mode you can run:

spark-shell --master yarn --deploy-mode client 

(You can't run the shell in cluster deploy-mode.)

----------- Update

I forgot to mention that you can also submit cluster jobs with this configuration (thanks @juliancienfuegos):

spark-submit --master yarn --deploy-mode cluster project-spark.py 

This way you can't see the output in the terminal, and the command exits as soon as the job is submitted (not completed).

You can also use --deploy-mode client to see the output right there in your terminal, but only do this for testing, since the job gets canceled if the command is interrupted (e.g. you press Ctrl+C, or your session ends).
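Since cluster mode does not print the job's output locally, one option is to note the YARN application ID and fetch the logs afterwards with `yarn logs -applicationId`. The sketch below assumes the submission output contains an ID of the form application_<digits>_<digits> and that log aggregation is enabled on the cluster; `extract_app_id` is a hypothetical helper name.

```shell
# extract_app_id is a hypothetical helper: it reads text on stdin and prints
# the first token shaped like a YARN application ID.
extract_app_id() {
  grep -o 'application_[0-9][0-9]*_[0-9][0-9]*' | head -n 1
}

# Example use on a real cluster (commented out; it needs Spark and YARN):
#   app_id=$(spark-submit --master yarn --deploy-mode cluster project-spark.py 2>&1 | extract_app_id)
#   yarn logs -applicationId "$app_id"
```

The same ID should also be visible via `yarn application -list` while the job is running.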

