mapreduce - Number of reducers in Hadoop


I am learning Hadoop, and I have found the number of reducers very confusing:

1) The number of reducers is the same as the number of partitions.

2) The number of reducers is 0.95 or 1.75 multiplied by (no. of nodes) * (no. of maximum containers per node).

3) The number of reducers is set by mapred.reduce.tasks.

4) The number of reducers is closest to: a multiple of the block size * a task time between 5 and 15 minutes * creates the fewest files possible.

I am very confused. Do we explicitly set the number of reducers, or is it done by the MapReduce program itself?

How is the number of reducers calculated? Please tell me how to calculate the number of reducers.

1 - The number of reducers is the same as the number of partitions - false. A single reducer might work on one or more partitions, but a chosen partition will be processed entirely on the reducer where it is started.

2 - That is just a theoretical maximum number of reducers you can configure for a Hadoop cluster. It is also very much dependent on the kind of data you are processing (which decides how much heavy lifting the reducers are burdened with).

3 - The mapred-site.xml configuration is just a suggestion to YARN. Internally, the ResourceManager has its own algorithm running, optimizing things on the go, so that value is not really the number of reducer tasks running every time.

4 - This one seems a bit unrealistic. My block size might be 128 MB, and I can't have 128*5 as the minimum number of reducers every time. That's again false, I believe.
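To make point 1 more concrete, here is a minimal sketch of hash-style partitioning in plain Java. It assumes the partition index is the key's hash, with the sign bit masked off, taken modulo the number of partitions. The class `PartitionSketch` and its methods are hypothetical names made up for illustration and are not part of Hadoop. The sketch only shows that every record with the same key lands in the same partition, so a given partition can be handled in one place.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class PartitionSketch {
    // Hash-style partitioning: mask off the sign bit so the index is never negative,
    // then take the remainder modulo the number of partitions.
    static int partitionFor(String key, int numPartitions) {
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    // Group keys by the partition index they are assigned to.
    static Map<Integer, List<String>> assign(List<String> keys, int numPartitions) {
        Map<Integer, List<String>> byPartition = new TreeMap<>();
        for (String key : keys) {
            int partition = partitionFor(key, numPartitions);
            byPartition.computeIfAbsent(partition, p -> new ArrayList<>()).add(key);
        }
        return byPartition;
    }

    public static void main(String[] args) {
        // Hypothetical sample keys; repeated keys always share a partition.
        List<String> keys = List.of("apple", "banana", "apple", "cherry", "banana");
        System.out.println(assign(keys, 3));
    }
}
```

Changing the second argument of `assign` changes how the keys are spread out, but both occurrences of "apple" always end up together.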

There is no fixed number of reducer tasks that can be configured or calculated. It depends on how much of the cluster's resources are actually available to allocate at that moment.
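Even so, the rule of thumb from point 2 can serve as a rough starting estimate. The sketch below just does that arithmetic in plain Java. The class `ReducerEstimate`, its method, and the cluster size (10 nodes, 8 containers per node) are hypothetical values chosen for illustration. The result is an upper-bound guess under those assumptions, not the number the cluster will actually run.

```java
public class ReducerEstimate {
    // Rule of thumb: factor * (number of nodes) * (maximum containers per node),
    // rounded to the nearest whole number of reducers.
    static long estimateReducers(double factor, int nodes, int maxContainersPerNode) {
        return Math.round(factor * nodes * maxContainersPerNode);
    }

    public static void main(String[] args) {
        int nodes = 10;               // hypothetical cluster size
        int maxContainersPerNode = 8; // hypothetical per-node container limit

        // 0.95: slightly fewer reducers than available containers.
        System.out.println(estimateReducers(0.95, nodes, maxContainersPerNode)); // prints 76
        // 1.75: more reducers than containers, so they run in more than one wave.
        System.out.println(estimateReducers(1.75, nodes, maxContainersPerNode)); // prints 140
    }
}
```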

