CUDA shared memory read/write order within a single thread

I know that shared memory is not automatically synchronized between threads in a block, but I don't know whether a shared-memory write is guaranteed to be visible to the thread that performed it.

For example, consider this kernel:

__global__ void kernel()
{
    __shared__ int i, j;

    if(threadIdx.x == 0)
    {
        i = 10;
        j = i;
    }

    // #1
}

Is it guaranteed at #1 that, for thread 0, i == 10 and j == 10, or do I need a memory fence or a local variable?


Solution:

I’m going to assume that by

for thread 0

you mean, "the thread that passed the if-test". And for the sake of this discussion, I will assume there is only one of those.

Yes, it’s guaranteed. Within a single thread, reads and writes to shared memory follow ordinary C++ sequencing rules; otherwise basic C++ compliance would be broken in CUDA.

Challenges in CUDA arise with inter-thread communication and visibility, but your question doesn’t involve that.

As an example, it is certainly not guaranteed that for some other thread, i will be visible as 10, without some sort of fence or barrier.
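To illustrate that last point, here is a minimal sketch (a hypothetical variation on the question's kernel, with an assumed output parameter `out`) showing how a __syncthreads() barrier makes a write by one thread visible to the rest of the block:

```cuda
__global__ void kernel(int *out)
{
    __shared__ int i;

    if (threadIdx.x == 0)
        i = 10;           // only thread 0 writes

    __syncthreads();      // barrier: the write above is now visible block-wide

    out[threadIdx.x] = i; // every thread in the block reads 10
}
```

Without the __syncthreads() call, a thread other than thread 0 could read i before the write lands, and the result would be undefined. Note that __syncthreads() must be reached by all threads in the block, which is why it sits outside the if-block.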
