multithreading - Multiple streams in one GPU device


I have a multi-threaded program that is supposed to run on 6 GPU devices. I want to open 6 streams on each device and reuse them during the lifetime of the program (36 streams in total).

I'm using cudaStreamCreate(), cublasCreate() and cublasSetStream() to create each stream and handle. I also use a GPU memory monitor to see the memory usage of each handle. However, when I look at the GPU memory usage on each device, it grows only on the first stream creation and doesn't change for the rest of the streams I create.
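For reference, the setup described above might look roughly like the following. This is a minimal single-threaded sketch, not the actual program: the `StreamSlot` struct, the `kStreamsPerDevice` constant and the `CHECK_*` macros are names made up for illustration, and the real code does this work from multiple host threads.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>
#include <cublas_v2.h>

// Hypothetical error-checking helpers (not part of CUDA or cuBLAS).
#define CHECK_CUDA(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "CUDA error: %s (%s:%d)\n",               \
                    cudaGetErrorString(err_), __FILE__, __LINE__);    \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

#define CHECK_CUBLAS(call)                                            \
    do {                                                              \
        cublasStatus_t st_ = (call);                                  \
        if (st_ != CUBLAS_STATUS_SUCCESS) {                           \
            fprintf(stderr, "cuBLAS error %d (%s:%d)\n",              \
                    (int)st_, __FILE__, __LINE__);                    \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Hypothetical pairing of one stream with one cuBLAS handle.
struct StreamSlot {
    cudaStream_t   stream;
    cublasHandle_t handle;
};

constexpr int kStreamsPerDevice = 6;  // 6 streams per device, as in the question

int main() {
    int deviceCount = 0;
    CHECK_CUDA(cudaGetDeviceCount(&deviceCount));

    // slots[dev][i] holds the i-th stream/handle pair of device dev.
    std::vector<std::vector<StreamSlot>> slots(deviceCount);

    for (int dev = 0; dev < deviceCount; ++dev) {
        // Streams and handles belong to the device that is current
        // at the time they are created.
        CHECK_CUDA(cudaSetDevice(dev));
        slots[dev].resize(kStreamsPerDevice);
        for (StreamSlot& s : slots[dev]) {
            CHECK_CUDA(cudaStreamCreate(&s.stream));
            CHECK_CUBLAS(cublasCreate(&s.handle));
            // Route this handle's cuBLAS calls into its own stream.
            CHECK_CUBLAS(cublasSetStream(s.handle, s.stream));
        }
    }

    // ... the streams and handles are reused here for the program's lifetime ...

    for (int dev = 0; dev < deviceCount; ++dev) {
        CHECK_CUDA(cudaSetDevice(dev));
        for (StreamSlot& s : slots[dev]) {
            CHECK_CUBLAS(cublasDestroy(s.handle));
            CHECK_CUDA(cudaStreamDestroy(s.stream));
        }
    }
    return 0;
}
```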

As far as I know, there isn't a limitation on the number of streams I can use. But I can't figure out why the memory usage of the handles and streams doesn't show up in the GPU memory usage.

All the streams you create reside within a single context on a given device, so there is no context-related overhead from creating additional streams after the first one. The streams themselves are lightweight and are (mostly) a host-side scheduler abstraction. As you have observed, they don't in themselves consume much (if any) device memory.
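One way to check this without an external monitor is to query free device memory with cudaMemGetInfo() before and after each stream creation. The following is only a sketch under a few assumptions: it uses device 0, forces context creation up front with cudaFree(0) so that the context's own allocation is excluded from the baseline, and omits error checking for brevity. The helper name `freeBytes` is made up for illustration. The reported numbers will depend on the driver and GPU.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: free device memory in bytes on the current device.
static long long freeBytes() {
    size_t freeMem = 0, totalMem = 0;
    cudaMemGetInfo(&freeMem, &totalMem);
    return static_cast<long long>(freeMem);
}

int main() {
    const int kNumStreams = 6;
    cudaStream_t streams[kNumStreams];

    cudaSetDevice(0);
    cudaFree(0);  // force context creation before taking the baseline

    const long long baseline = freeBytes();
    printf("baseline free memory: %lld bytes\n", baseline);

    for (int i = 0; i < kNumStreams; ++i) {
        cudaStreamCreate(&streams[i]);
        // Change in free memory relative to the post-context baseline.
        printf("after stream %d: %lld bytes consumed since baseline\n",
               i, baseline - freeBytes());
    }

    for (int i = 0; i < kNumStreams; ++i) {
        cudaStreamDestroy(streams[i]);
    }
    return 0;
}
```

If the explanation above holds, the per-stream deltas should be small compared with the one-time cost of creating the context.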

