Highest scored 'cuda' questions

956 votes

30 answers

2.6m views

How to get the CUDA version?

Is there any quick command or script to check for the version of CUDA installed? I found the manual of 4.0 under the installation directory but I'm not sure whether it is of the actual installed ...

Hailiang Zhang

18.5k

asked Mar 15, 2012 at 20:30

567 votes

19 answers

831k views

Nvidia NVML Driver/library version mismatch [closed]

When I run nvidia-smi, I get the following message: "Failed to initialize NVML: Driver/library version mismatch". An hour ago I received the same message and uninstalled my CUDA library and I was able ...

etal

14.2k

asked Mar 25, 2017 at 22:47

450 votes

32 answers

872k views

How to tell if tensorflow is using gpu acceleration from inside python shell?

I have installed TensorFlow on Ubuntu 16.04 using the second answer here with Ubuntu's built-in apt CUDA installation. Now my question is: how can I test whether TensorFlow is really using the GPU? I have a ...

Tamim Addari

7,761

asked Jun 24, 2016 at 9:14

334 votes

7 answers

603k views

Which TensorFlow and CUDA version combinations are compatible?

I have noticed that some newer TensorFlow versions are incompatible with older CUDA and cuDNN versions. Does an overview of the compatible versions or even a list of officially tested combinations ...

whiletrue

10.8k

asked May 31, 2018 at 10:48

319 votes

8 answers

286k views

Different CUDA versions shown by nvcc and NVIDIA-smi

I am very confused by the different CUDA versions shown by running which nvcc and nvidia-smi. I have both cuda9.2 and cuda10 installed on my ubuntu 16.04. Now I set the PATH to point to cuda9.2. So ...

yuqli

4,951

asked Nov 22, 2018 at 0:44

306 votes

5 answers

162k views

What is the canonical way to check for errors using the CUDA runtime API?

Looking through the answers and comments on CUDA questions, and in the CUDA tag wiki, I see it is often suggested that the return status of every API call should be checked for errors. The API ...

talonmies

71.8k

asked Dec 26, 2012 at 9:35

267 votes

17 answers

382k views

A top-like utility for monitoring CUDA activity on a GPU [closed]

I'm trying to monitor a process that uses CUDA and MPI. Is there any way I could do this, something like the command "top" but that monitors the GPU too?

natorro

3,033

asked Nov 22, 2011 at 8:19

266 votes

16 answers

746k views

How to verify CuDNN installation?

I have searched many places but ALL I get is HOW to install it, not how to verify that it is installed. I can verify my NVIDIA driver is installed, and that CUDA is installed, but I don't know how to ...

alfredox

4,302

asked Jul 9, 2015 at 18:58

257 votes

10 answers

411k views

Using GPU from a docker container?

I'm searching for a way to use the GPU from inside a Docker container. The container will execute arbitrary code, so I don't want to use the privileged mode. Any tips? From previous research I ...

Regan

8,561

asked Aug 7, 2014 at 14:41

182 votes

2 answers

81k views

How do CUDA blocks/warps/threads map onto CUDA cores?

I have been using CUDA for a few weeks, but I have some doubts about the allocation of blocks/warps/thread. I am studying the architecture from a didactic point of view (university project), so ...

Daedalus

1,821

asked May 5, 2012 at 9:58

178 votes

2 answers

171k views

Understanding CUDA grid dimensions, block dimensions and threads organization (simple explanation) [closed]

How are threads organized to be executed by a GPU?

cibercitizen1

21.3k

asked Mar 6, 2010 at 11:08

170 votes

7 answers

507k views

How do I select which GPU to run a job on?

In a multi-GPU computer, how do I designate which GPU a CUDA job should run on? As an example, when installing CUDA, I opted to install the NVIDIA_CUDA-<#.#>_Samples then ran several ...

Steven C. Howell

18k

asked Sep 22, 2016 at 21:23

169 votes

5 answers

128k views

Using Java with Nvidia GPUs (CUDA)

I'm working on a business project that is done in Java, and it needs huge computation power to compute business markets. Simple math, but with a huge amount of data. We ordered some CUDA GPUs to try it ...

Hans

1,906

asked Apr 4, 2014 at 15:27

158 votes

22 answers

268k views

CUDA incompatible with my gcc version

I have troubles compiling some of the examples shipped with CUDA SDK. I have installed the developers driver (version 270.41.19) and the CUDA toolkit, then finally the SDK (both the 4.0.17 version). ...

fbielejec

3,620

asked Jul 8, 2011 at 9:25

157 votes

9 answers

107k views

Difference between global and device functions

Can anyone describe the differences between __global__ and __device__? When should I use __device__, and when should I use __global__?

Mehdi Saman Booy

2,860

asked Sep 11, 2012 at 16:15

143 votes

3 answers

152k views

How do I choose grid and block dimensions for CUDA kernels?

This is a question about how to determine the CUDA grid, block and thread sizes. This is an additional question to the one posted here. Following this link, the answer from talonmies contains a code ...

user1292251

1,715

asked Apr 3, 2012 at 1:14

137 votes

7 answers

101k views

GPU Emulator for CUDA programming without the hardware [closed]

Question: Is there an emulator for a Geforce card that would allow me to program and test CUDA without having the actual hardware? Info: I'm looking to speed up a few simulations of mine in CUDA, ...

Narcolapser

6,155

asked Jun 21, 2010 at 18:28

133 votes

9 answers

309k views

Is it possible to run CUDA on AMD GPUs?

I'd like to extend my skill set into GPU computing. I am familiar with raytracing and realtime graphics (OpenGL), but the next generation of graphics and high performance computing seems to be in GPU ...

Lee Jacobs

1,717

asked Oct 10, 2012 at 21:02

122 votes

5 answers

62k views

What is a bank conflict? (Doing Cuda/OpenCL programming)

I have been reading the programming guide for CUDA and OpenCL, and I cannot figure out what a bank conflict is. They just sort of dive into how to solve the problem without elaborating on the subject ...

smuggledPancakes

10.2k

asked Oct 1, 2010 at 18:04

119 votes

13 answers

345k views

How can I flush GPU memory using CUDA (physical reset is unavailable)

My CUDA program crashed during execution, before memory was flushed. As a result, device memory remained occupied. I'm running on a GTX 580, for which nvidia-smi --gpu-reset is not supported. ...

timdim

1,221

asked Mar 4, 2013 at 8:22

119 votes

11 answers

224k views

How to get the nvidia driver version from the command line?

For debugging CUDA code and checking compatibilities I need to find out what nvidia driver version for the GPU I have installed. I found How to get the cuda version? but that does not help me here.

Framester

34.8k

asked Oct 29, 2012 at 16:27

112 votes

2 answers

83k views

nvidia-smi Volatile GPU-Utilization explanation?

I know that nvidia-smi -l 1 will give the GPU usage every one second (similarly to the following). However, I would appreciate an explanation on what Volatile GPU-Util really means. Is that the number ...

user3813674

2,623

asked Dec 2, 2016 at 17:31

110 votes

4 answers

85k views

Streaming multiprocessors, Blocks and Threads (CUDA)

What is the relationship between a CUDA core, a streaming multiprocessor and the CUDA model of blocks and threads? What gets mapped to what, and what is parallelized and how? And what is more ...

user400055

asked Aug 19, 2010 at 7:21

109 votes

6 answers

168k views

Can I run CUDA on Intel's integrated graphics processor?

I have a very simple Toshiba Laptop with i3 processor. Also, I do not have any expensive graphics card. In the display settings, I see Intel(HD) Graphics as display adapter. I am planning to learn ...

Ankit

6,882

asked Nov 19, 2011 at 9:57

108 votes

10 answers

49k views

NVIDIA vs AMD: GPGPU performance

I'd like to hear from people with experience of coding for both. Myself, I only have experience with NVIDIA. NVIDIA CUDA seems to be a lot more popular than the competition. (Just counting question ...

Eugene Smith

9,238

asked Jan 9, 2011 at 8:27

107 votes

4 answers

76k views

In CUDA, what is memory coalescing, and how is it achieved?

What is "coalesced" in CUDA global memory transaction? I couldn't understand even after going through my CUDA guide. How to do it? In CUDA programming guide matrix example, accessing the matrix row by ...

kar

2,695

asked Feb 18, 2011 at 12:33

97 votes

4 answers

47k views

Why is CUDA pinned memory so fast?

I observe substantial speedups in data transfer when I use pinned memory for CUDA data transfers. On Linux, the underlying system call for achieving this is mlock. From the man page of mlock, it ...

Gearoid Murphy

12k

asked Apr 20, 2011 at 21:39

96 votes

8 answers

50k views

Best approach for GPGPU/CUDA/OpenCL in Java?

General-purpose computing on graphics processing units (GPGPU) is a very attractive concept to harness the power of the GPU for any kind of computing. I'd love to use GPGPU for image processing, ...

Frederik

14.4k

asked Apr 13, 2010 at 21:53

95 votes

8 answers

149k views

LNK2038: mismatch detected for 'RuntimeLibrary': value 'MT_StaticRelease' doesn't match value 'MD_DynamicRelease' in file.obj

I am integrating MATLAB, C and CUDA together in a project. I used MATLAB mex in order to connect a MATLAB mx function written in C with the CUDA runtime library, and a linking error appears about a conflict in ...

Ahmed Hassan

991

asked Mar 5, 2015 at 20:22

92 votes

7 answers

335k views

How to remove cuda completely from ubuntu?

I have ubuntu 18.04, and accidentally installed cuda 9.1 to run Tensorflow-gpu, but it seems tensorflow-gpu requires cuda 10.0, so I want to remove cuda first by executing: martin@nlp-server:~$ sudo ...

marlon

7,197

asked Jun 3, 2019 at 16:44

91 votes

4 answers

110k views

When to call cudaDeviceSynchronize?

When is calling the cudaDeviceSynchronize function really needed? As far as I understand from the CUDA documentation, CUDA kernels are asynchronous, so it seems that we should call ...

user1588226

911

asked Aug 9, 2012 at 17:25

78 votes

8 answers

64k views

Passing pointers between C and Java through JNI

At the moment, I'm trying to create a Java application which uses CUDA functionality. The connection between CUDA and Java works fine, but I've got another problem and wanted to ask if my thoughts ...

Volker

783

asked Oct 27, 2009 at 17:22

75 votes

2 answers

53k views

GPU Programming, CUDA or OpenCL? [closed]

I am a newbie to GPU programming. I have a laptop with an NVIDIA GeForce GT 640 card. I am faced with 2 dilemmas; suggestions are most welcome. If I go for CUDA: Ubuntu or Windows? Clearly CUDA is more ...

Arkapravo

4,094

asked Aug 2, 2013 at 10:09

70 votes

5 answers

95k views

CUDA determining threads per block, blocks per grid

I'm new to the CUDA paradigm. My question is in determining the number of threads per block, and blocks per grid. Does a bit of art and trial play into this? What I've found is that many examples have ...

dnbwise

1,092

asked Dec 8, 2010 at 18:58

70 votes

1 answer

24k views

How and when should I use pitched pointer with the cuda API?

I have quite a good understanding about how to allocate and copy linear memory with cudaMalloc() and cudaMemcpy(). However, when I want to use the CUDA functions to allocate and copy 2D or 3D matrices,...

Ernest_Galbrun

2,624

asked Apr 20, 2013 at 11:43

69 votes

6 answers

276k views

Error Message : Cannot find or open the PDB file

I tried running the sample programs provided at NVIDIA's official site. Most of the programs ran smoothly, except a few where I get similar error messages. How can I fix that? Here's a sample of error ...

KNU

2,508

asked Apr 10, 2013 at 22:34

67 votes

5 answers

117k views

Does __syncthreads() synchronize all threads in the grid?

Does __syncthreads() synchronize all threads in the grid or just the threads in the current warp or block? Also, when the threads in a particular block encounter (in the kernel) the following line ...

Wuschelbeutel Kartoffelhuhn

1,948

asked Mar 6, 2013 at 6:25

67 votes

5 answers

223k views

Where did CUDA get installed on Ubuntu 14.04 on my computer?

I'm trying to install CUDA 7.5 on Ubuntu 14.04. I followed everything in this guide (installation through package): http://developer.download.nvidia.com/compute/cuda/7.5/Prod/docs/sidebar/...

krips89

1,733

asked Mar 29, 2016 at 8:23

66 votes

5 answers

90k views

What is the difference between cuda vs tensor cores?

I am completely new to terms related to HPC computing, but I just saw that EC2 released its new type of instance on AWS that's powered by the new Nvidia Tesla V100, which has both kinds of "cores": ...

Simon Ernesto Cardenas Zarate

2,834

asked Nov 16, 2017 at 16:45

65 votes

7 answers

180k views

How to let cmake find CUDA

I am trying to build this project, which has CUDA as a dependency. But the cmake script cannot find the CUDA installation on the system: cls ~/workspace/gpucluster/cluster/build $ cmake .. -- The C ...

clstaudt

22.2k

asked Nov 14, 2013 at 14:35

65 votes

1 answer

58k views

Pytorch. How does pin_memory work in Dataloader?

I want to understand how the pin_memory parameter in Dataloader works. According to the documentation: pin_memory (bool, optional) – If True, the data loader will copy tensors into CUDA pinned memory ...

Ivan Belonogov

819

asked Apr 7, 2019 at 20:27

64 votes

8 answers

105k views

Error compiling CUDA from Command Prompt

I'm trying to compile a CUDA test program on Windows 7 via Command Prompt. I'm using this command: nvcc test.cu But all I get is this error: nvcc fatal : Cannot find compiler 'cl.exe' in PATH What may ...

GennSev

1,636

asked Nov 14, 2011 at 17:49

63 votes

4 answers

110k views

Cuda gridDim and blockDim

I get what blockDim is, but I have a problem with gridDim. blockDim gives the size of the block, but what is gridDim? On the Internet it says gridDim.x gives the number of blocks in the x coordinate. ...

ehah

685

asked May 17, 2013 at 23:38

63 votes

12 answers

32k views

Does CUDA support recursion?

JuanPablo

24.4k

asked Sep 5, 2010 at 2:47

61 votes

3 answers

45k views

Structure of Arrays vs Array of Structures

From some comments that I have read in here, it is preferable to have Structure of Arrays (SoA) over Array of Structures (AoS) for parallel implementations like CUDA. If that is true, can anyone ...

BugShotGG

5,070

asked Jul 29, 2013 at 12:56

61 votes

4 answers

71k views

Coding CUDA with C#?

I've been looking for some information on coding CUDA (the nvidia gpu language) with C#. I have seen a few of the libraries, but it seems that they would add a bit of overhead (because of the p/...

Jess

8,695

asked Jun 25, 2011 at 2:51

60 votes

6 answers

47k views

Compression library using Nvidia's CUDA [closed]

Does anyone know a project which implements standard compression methods (like Zip, GZip, BZip2, LZMA,...) using NVIDIA's CUDA library? I was wondering if algorithms which can make use of a lot of ...

Xn0vv3r

17.9k

asked Jan 19, 2009 at 7:54

60 votes

5 answers

68k views

Fortran vs C++, does Fortran still hold any advantage in numerical analysis these days? [closed]

With the rapid development of C++ compilers, especially the Intel ones, and the ability to directly apply SIMD functions in your C/C++ code, does Fortran still hold any real advantage in the world ...

user0002128

2,905

asked Oct 25, 2012 at 23:18

59 votes

8 answers

184k views

How to install CUDA in Google Colab GPU's

It seems that Google Colab GPUs don't come with the CUDA Toolkit. How can I install CUDA on Google Colab GPUs? I am getting this error when installing mxnet in Google Colab. Installing collected ...

namerbenz

649

asked May 28, 2018 at 6:33

58 votes

15 answers

34k views

CUDA or FPGA for special purpose 3D graphics computations? [closed]

I am developing a product with heavy 3D graphics computations, to a large extent closest point and range searches. Some hardware optimization would be useful. While I know little about this, my boss (...

Fredriku73

3,180

asked Nov 25, 2008 at 15:35
