mapreduce - Number of reducers in Hadoop
I am learning Hadoop, and I find the number of reducers confusing:
1) The number of reducers is the same as the number of partitions.
2) The number of reducers is 0.95 or 1.75 multiplied by (no. of nodes) * (no. of maximum containers per node).
3) The number of reducers is set by mapred.reduce.tasks.
4) The number of reducers is closest to: a multiple of the block size * a task time between 5 and 15 minutes * creates the fewest files possible.
I am confused: do we explicitly set the number of reducers, or is it done by the MapReduce program itself?
How is the number of reducers calculated? Please tell me how to calculate the number of reducers.
1 - The number of reducers is the same as the number of partitions: false. A single reducer might work on one or more partitions, but a chosen partition is processed entirely on the reducer where it was started.
2 - That is just a theoretical maximum number of reducers you can configure for a Hadoop cluster. It also depends heavily on the kind of data you are processing, which decides how much heavy lifting the reducers are burdened with.
3 - The mapred-site.xml configuration is just a suggestion to YARN. Internally, the ResourceManager has its own algorithm running, optimizing things on the go, so that value is not really the number of reducer tasks running every time.
4 - This one seems a bit unrealistic. My block size might be 128 MB, and I can't have 128*5 as the minimum number of reducers every time. So that one is again false, I believe.
There is no fixed number of reducer tasks that can be configured or calculated. It depends on how much of the cluster's resources are actually available to allocate at that moment.
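To make the rule of thumb from point 2 concrete, here is a minimal Python sketch of the arithmetic. The cluster size (10 nodes, 8 containers per node), the function name `suggested_reducers`, and the choice to round down are all assumptions made for illustration; none of them come from Hadoop itself. As far as I recall from the Hadoop documentation, the 0.95 factor is meant to let all reducers launch in a single wave, while 1.75 lets faster nodes pick up a second wave for better load balancing.

```python
# Rule-of-thumb reducer counts from the guideline quoted in point 2.
# The cluster size below is hypothetical, and rounding down is an assumption.

def suggested_reducers(nodes: int, max_containers_per_node: int, percent: int) -> int:
    """Return percent/100 * (nodes * max_containers_per_node), rounded down, at least 1."""
    if nodes < 1 or max_containers_per_node < 1:
        raise ValueError("cluster needs at least one node and one container per node")
    total_containers = nodes * max_containers_per_node
    # Integer arithmetic avoids floating-point surprises when rounding down.
    return max(1, total_containers * percent // 100)

# Hypothetical cluster: 10 nodes with 8 containers each, so 80 containers in total.
single_wave = suggested_reducers(10, 8, 95)   # factor 0.95
two_waves = suggested_reducers(10, 8, 175)    # factor 1.75
print(single_wave, two_waves)  # prints: 76 140
```

Either result is only a starting point for tuning. If you do want to request a specific count, the property from point 3 is named `mapreduce.job.reduces` in current Hadoop releases (`mapred.reduce.tasks` is the older, deprecated name).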