At how many Spark nodes should I use Mesos or YARN?
I run a cluster with 4 Spark nodes and 1 Solr node. I want to expand the cluster to 20 nodes, and afterwards to around 100. I am not sure at what cluster size it makes sense to use Mesos or YARN. Does it make sense to add YARN or Mesos when I have fewer than 100 nodes?
Thanks
Mesos and YARN can both scale up to thousands of nodes without issue.
It is the workload that decides which one to use: if the workload consists only of Spark- or Hadoop-related jobs/tasks, YARN is the better choice; if you also have Docker containers or something else to run, Mesos is the better choice.
There are many other advantages and disadvantages to using Mesos; please find them in the comparison here.
A Spark standalone cluster provides almost all the same features as the other cluster managers if you are only running Spark.
If you want to run Spark alongside other applications, or to use richer resource scheduling capabilities (e.g. queues), both YARN and Mesos provide these features. Of these, YARN comes preinstalled in many Hadoop distributions.
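As a rough illustration of how the three options differ when you submit a job, the sketch below builds a `spark-submit` argument list for each cluster manager. The helper `submit_args`, the host names, and the queue name `analytics` are hypothetical; 7077 and 5050 are the usual default ports for a standalone master and a Mesos master, so adjust them to your own setup.

```python
def submit_args(manager, app, host=None, queue=None):
    """Build a spark-submit argument list (hypothetical helper, for illustration)."""
    if manager == "standalone":
        master = f"spark://{host}:7077"   # assumed default standalone master port
    elif manager == "mesos":
        master = f"mesos://{host}:5050"   # assumed default Mesos master port
    elif manager == "yarn":
        master = "yarn"                   # cluster location comes from the Hadoop config
    else:
        raise ValueError(f"unknown cluster manager: {manager}")

    args = ["spark-submit", "--master", master]
    if queue is not None:
        # In this sketch a queue is only passed along when submitting to YARN.
        if manager != "yarn":
            raise ValueError("queue is only supported for YARN in this sketch")
        args += ["--queue", queue]
    args.append(app)
    return args


if __name__ == "__main__":
    print(" ".join(submit_args("standalone", "job.py", host="master-host")))
    print(" ".join(submit_args("yarn", "job.py", queue="analytics")))
```

Only the `--master` value changes between the three, so a job written for a standalone cluster can usually be moved to YARN or Mesos later without code changes.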
If you have fewer than 100 nodes and are not going to run other applications alongside Spark, a Spark standalone cluster is the better choice and avoids overkill.
Again, it depends on the capabilities you want to use: if you need queues or schedulers such as the fair scheduler, YARN/Mesos make sense. (Whether to use these features depends on your Spark cluster, the workload, and how busy the cluster is.)
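The rule of thumb in this answer can be written down as a small decision function. This is only a sketch of the reasoning above, not an official recommendation: the function name and its flags are hypothetical, and sending "needs queues" to YARN rather than Mesos is an assumption based on YARN being preinstalled in many Hadoop distributions.

```python
def suggest_cluster_manager(hadoop_jobs=False, other_workloads=False, needs_queues=False):
    """Suggest a cluster manager from workload traits (hypothetical helper).

    hadoop_jobs:     Hadoop jobs run alongside Spark
    other_workloads: Docker containers or other non-Hadoop applications share the cluster
    needs_queues:    queues or a fair scheduler across applications are required
    """
    if other_workloads:
        return "mesos"
    if hadoop_jobs or needs_queues:
        # Assumption: prefer YARN here, although Mesos offers queues as well.
        return "yarn"
    # Spark only, no richer scheduling needed.
    return "standalone"


if __name__ == "__main__":
    print(suggest_cluster_manager())
    print(suggest_cluster_manager(needs_queues=True))
    print(suggest_cluster_manager(other_workloads=True))
```

Node count is deliberately not an input: all three options cope with 20 or 100 nodes, so the decision rests on what else runs on the cluster.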