How do I copy memory from CPU to GPU using CUDA C++?

January 8, 2025

I want to run my work on the GPU instead of on CPU threads, but I'm not really sure how to do that. I tried something like this:

int data_array = readfile();
int array_size = data_array.size();
int iterations = 25;
vector<person> result_array;
run_on_GPU<<<8, 32>>>(data_array, result_array, array_size, iterations);
cudaDeviceSynchronize();

for (int i = 0; i < result_array.size(); i++) {
    if (result_array[i] == condition) break;

    output_file << result_array[i].encoded << endl;
}

This is roughly what I want. I also tried using ChatGPT, but the result still didn't run.

The program did not work and I got something like this:

CUDA Error: invalid argument at launch.
Error in file <secret :)> at line 48: cudaDeviceSynchronize() returned error 11 (cudaErrorInvalidConfiguration)

Solution:

It looks like you never allocated any device memory before launching the kernel: run_on_GPU was given host objects, which the GPU cannot use directly.
DataClass and ResultClass below are placeholder types for this example; use your own instead.
First allocate buffers on the device, something like this:

DataClass* device_entries = NULL;
ResultClass* device_results = NULL;

cudaMalloc(&device_entries, entry_count * sizeof(DataClass));
cudaMalloc(&device_results, entry_count * sizeof(ResultClass));

Here entry_count is the number of elements in your data array.

After that, copy the actual array to the GPU and zero the result buffer using these lines:

cudaMemcpy(device_entries, &entries[0], entry_count * sizeof(DataClass), cudaMemcpyHostToDevice);
cudaMemset(device_results, 0, entry_count * sizeof(ResultClass));
cudaDeviceSynchronize();

As the name suggests, cudaMemcpyHostToDevice copies memory from the host to the device (the GPU). We will use the same call in the other direction later.

block_count and block_size are your own choice, but block_size should be a multiple of 32, which is the warp size.
iteration_count is the number of entries each thread will process. How you compute it is up to you, but you can use something like this:

int iteration_count = entry_count / (block_count * block_size) + 1;
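The expression above hands every thread one extra iteration when entry_count divides evenly by the thread count. That is harmless as long as the kernel checks its indices against entry_count. As a minimal sketch of an alternative, ceiling division avoids the extra iteration. The helper name iterations_per_thread is hypothetical and is not part of the original answer:

```cpp
#include <cstddef>

// Hypothetical helper (not from the original answer): number of entries each
// thread must handle so that block_count * block_size threads together cover
// entry_count entries. Ceiling division adds no extra iteration when the
// division is exact. Assumes block_count and block_size are both non-zero.
constexpr std::size_t iterations_per_thread(std::size_t entry_count,
                                            std::size_t block_count,
                                            std::size_t block_size) {
    const std::size_t total_threads = block_count * block_size;
    return (entry_count + total_threads - 1) / total_threads;
}
```

With the 8 blocks of 32 threads from the question, 1000 entries work out to 4 entries per thread, and exactly 256 entries work out to 1 per thread, where the simpler expression would give 2.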

The kernel launch then looks something like this:

run_on_GPU<<<block_count, block_size>>>(device_entries, device_results, entry_count, iteration_count);
cudaDeviceSynchronize();

The run_on_GPU kernel should have a signature like this:

__global__ void run_on_GPU(DataClass* entries, ResultClass* results, size_t entry_count, int count)
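The answer does not show the kernel body, so how the entries are divided among threads is an assumption here. One common scheme gives each thread count consecutive entries and clamps the range to entry_count. The host-side C++ sketch below models only that index arithmetic so it can be checked without a GPU. The names EntryRange and entries_for_thread are hypothetical, and thread_id stands in for blockIdx.x * blockDim.x + threadIdx.x inside the kernel:

```cpp
#include <algorithm>
#include <cstddef>

// Half-open range [begin, end) of entry indices assigned to one thread.
struct EntryRange {
    std::size_t begin;
    std::size_t end;
};

// Hypothetical helper: thread_id models the global thread index that the
// kernel would compute as blockIdx.x * blockDim.x + threadIdx.x.
// Each thread takes `count` consecutive entries. Both ends are clamped to
// entry_count, so trailing threads get a short or empty range instead of
// reading past the end of the buffer.
inline EntryRange entries_for_thread(std::size_t thread_id,
                                     std::size_t count,
                                     std::size_t entry_count) {
    const std::size_t begin = std::min(thread_id * count, entry_count);
    const std::size_t end = std::min(begin + count, entry_count);
    return {begin, end};
}
```

Inside the kernel, the same bounds would drive a loop of the form for (i = begin; i < end; i++) over entries and results. With 1000 entries and 4 per thread, thread 249 handles the last four entries and every thread from 250 on gets an empty range.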

To get the results from the GPU, copy the memory back from the device to the host:

ResultClass* results = (ResultClass*)malloc(entry_count * sizeof(ResultClass));
cudaMemcpy(results, device_results, entry_count * sizeof(ResultClass), cudaMemcpyDeviceToHost);

Before the program ends, don't forget to free the memory you allocated:

free(results);
cudaFree(device_entries);
cudaFree(device_results);

by MR

Published January 08, 2025
