Distributed TensorFlow parameter server and workers


I am closely following the ImageNet distributed TensorFlow training example.

I am not able to understand how the distribution of the data takes place when the example is run on 2 different workers. In theory, different workers should see different parts of the data. Also, which part of the code tells the parameters to be placed on the parameter server? In the multi-GPU example, there is an explicit section for 'cpu:0'.

The different workers see different parts of the data by virtue of dequeuing mini-batches of images from a single queue of preprocessed images. To elaborate: in the distributed setup for training the ImageNet model, the input images are preprocessed by multiple threads, and the preprocessed images are stored in a single RandomShuffleQueue. You can look for tf.RandomShuffleQueue in this file to see how this is done. The multiple workers are organized as 'Inception towers', and each tower dequeues a mini-batch of images from the same queue, so each one gets a different part of the input.

The picture here answers the second part of your question. Look for slim.variables.VariableDeviceChooser in this file. The logic there makes sure that Variable objects are assigned evenly to the workers that act as parameter servers. The other workers, which do the actual training, fetch the variables at the beginning of a step and push their updates back at the end of the step.
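As a rough illustration of the shared-queue mechanism, here is a minimal sketch in TF 1.x graph mode. The queue capacity, image shape, and batch size are illustrative assumptions, not values taken from the Inception code.

import tensorflow as tf

# Placeholders standing in for one preprocessed example produced by a
# preprocessing thread (the real code builds these from image-decoding ops).
image = tf.placeholder(tf.float32, shape=[299, 299, 3], name='image')
label = tf.placeholder(tf.int32, shape=[], name='label')

# A single RandomShuffleQueue holding preprocessed examples. Because every
# worker dequeues from this one shared queue, each mini-batch it pulls is a
# different, randomly shuffled slice of the input stream.
queue = tf.RandomShuffleQueue(
    capacity=10000,
    min_after_dequeue=1000,
    dtypes=[tf.float32, tf.int32],
    shapes=[[299, 299, 3], []])

# Preprocessing threads run this op repeatedly to keep the queue full.
enqueue_op = queue.enqueue([image, label])

# Each Inception tower (worker) pulls its own mini-batch from the same queue.
image_batch, label_batch = queue.dequeue_many(32)

For the variable-placement part, the sketch below uses tf.train.replica_device_setter, a stock TF 1.x device chooser that plays a role similar to slim.variables.VariableDeviceChooser in the Inception code: it spreads Variable objects across the parameter-server tasks while leaving the compute ops on the worker. The host names, task index, and layer sizes are assumptions for illustration only.

import tensorflow as tf

cluster = tf.train.ClusterSpec({
    'ps': ['ps0.example.com:2222', 'ps1.example.com:2222'],
    'worker': ['worker0.example.com:2222', 'worker1.example.com:2222'],
})

# Variables created inside this scope are assigned round-robin to the 'ps'
# tasks; all other ops stay on the local worker device.
with tf.device(tf.train.replica_device_setter(
        cluster=cluster, worker_device='/job:worker/task:0')):
    weights = tf.get_variable('weights', shape=[784, 10],
                              initializer=tf.zeros_initializer())
    biases = tf.get_variable('biases', shape=[10],
                             initializer=tf.zeros_initializer())
    # ... build the rest of the model and the training op here ...

Each training worker then reads these variables from the parameter servers at the start of a step and sends its gradient updates back at the end of the step.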

