Distributed TensorFlow parameter server and workers


I am closely following the ImageNet distributed TensorFlow training example.

I am not able to understand how the distribution of the data takes place when the example is run on 2 different workers. In theory, different workers should see different parts of the data. Also, which part of the code tells the parameters to be placed on the parameter server? In the multi-GPU example, there is an explicit section for 'cpu:0'.

The different workers see different parts of the data by virtue of dequeuing mini-batches of images from a single queue of preprocessed images. To elaborate: in the distributed setup for training the ImageNet model, the input images are preprocessed by multiple threads, and the preprocessed images are stored in a single RandomShuffleQueue. You can look for tf.RandomShuffleQueue in this file to see how this is done. The multiple workers are organized as 'Inception towers', and each tower dequeues a mini-batch of images from the same queue, so each one gets a different part of the input.

The picture here answers the second part of your question. Look for slim.variables.VariableDeviceChooser in this file. The logic there makes sure that Variable objects are assigned evenly to the workers that act as parameter servers. The other workers, which do the actual training, fetch the variables at the beginning of a step and push their updates back at the end of the step.
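As a rough illustration of the shared-queue mechanism, here is a minimal sketch in TF 1.x graph mode. The queue capacity, image shape, and batch size are illustrative assumptions, not values taken from the Inception code.

import tensorflow as tf

# Placeholders standing in for one preprocessed example produced by a
# preprocessing thread (the real code builds these from image-decoding ops).
image = tf.placeholder(tf.float32, shape=[299, 299, 3], name='image')
label = tf.placeholder(tf.int32, shape=[], name='label')

# A single RandomShuffleQueue holding preprocessed examples. Because every
# worker dequeues from this one shared queue, each mini-batch it pulls is a
# different, randomly shuffled slice of the input stream.
queue = tf.RandomShuffleQueue(
    capacity=10000,
    min_after_dequeue=1000,
    dtypes=[tf.float32, tf.int32],
    shapes=[[299, 299, 3], []])

# Preprocessing threads run this op repeatedly to keep the queue full.
enqueue_op = queue.enqueue([image, label])

# Each Inception tower (worker) pulls its own mini-batch from the same queue.
image_batch, label_batch = queue.dequeue_many(32)

For the variable-placement part, the sketch below uses tf.train.replica_device_setter, a stock TF 1.x device chooser that plays a role similar to slim.variables.VariableDeviceChooser in the Inception code: it spreads Variable objects across the parameter-server tasks while leaving the compute ops on the worker. The host names, task index, and layer sizes are assumptions for illustration only.

import tensorflow as tf

cluster = tf.train.ClusterSpec({
    'ps': ['ps0.example.com:2222', 'ps1.example.com:2222'],
    'worker': ['worker0.example.com:2222', 'worker1.example.com:2222'],
})

# Variables created inside this scope are assigned round-robin to the 'ps'
# tasks; all other ops stay on the local worker device.
with tf.device(tf.train.replica_device_setter(
        cluster=cluster, worker_device='/job:worker/task:0')):
    weights = tf.get_variable('weights', shape=[784, 10],
                              initializer=tf.zeros_initializer())
    biases = tf.get_variable('biases', shape=[10],
                             initializer=tf.zeros_initializer())
    # ... build the rest of the model and the training op here ...

Each training worker then reads these variables from the parameter servers at the start of a step and sends its gradient updates back at the end of the step.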

