hadoop - 50-60 GB of data in Spark standalone mode -


I am trying to analyze around 50-60 GB of data. I thought of using Spark for that, but I do not have access to multiple nodes in a cluster. Can this level of processing be done using Spark standalone mode? If yes, I would like to know the estimated time required to process the data. Thanks!

Short answer: yes.

Spark will partition the file into many smaller chunks. In your case only a few chunks will be executed at a time, and these few chunks should fit in memory (you need to play with the configurations to get this right).

To summarize, you will be able to do it, but it would be faster if you had more memory/cores so that more things could be processed in parallel. A minimal sketch of such a setup follows below.
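For illustration, here is a minimal PySpark sketch of the kind of single-machine setup described above: running with a local master (or a single-node standalone master) while controlling memory and partition counts so each chunk fits in memory. The file path, column name, memory size, and partition count are illustrative assumptions, not values from the original question.

```python
# Sketch only: a single-machine Spark job tuned so that 50-60 GB is processed
# in memory-sized chunks. All paths, sizes, and column names are assumptions.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("large-file-standalone")
    # "local[*]" uses all local cores; for a true standalone cluster manager
    # on one box, point this at the master URL instead, e.g. "spark://host:7077".
    .master("local[*]")
    # Driver memory is usually set via spark-submit --driver-memory; shown
    # here only to indicate it must match what the machine actually has.
    .config("spark.driver.memory", "8g")
    # More, smaller partitions so each task's working set fits in memory.
    .config("spark.sql.shuffle.partitions", "400")
    .getOrCreate()
)

# Spark reads the input lazily and splits it into partitions; only a few
# partitions are held in memory at any one time.
df = spark.read.csv("/path/to/large_input.csv", header=True)

# Hypothetical aggregation; repartitioning keeps individual chunks small.
result = df.repartition(400).groupBy("some_column").count()
result.write.mode("overwrite").parquet("/path/to/output")

spark.stop()
```

With more cores and memory you would mainly raise the parallelism (more partitions processed at once), which is where the speedup mentioned above comes from.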

