Day 65: Profiling CUDA performance

Today’s new thing was spending a bit of time profiling how CUDA cores behave when manipulating vectors.

Getting started with CUDA

As with any new platform on a machine you don’t use very often, most of the day was spent updating the machine, reading up on CUDA and getting the libraries installed. And, because I can, finding a managed wrapper so I can use CUDA from C# and F#. Luckily such a library exists, in the form of the well-maintained managedCuda project (get it from GitHub).

The results were nothing short of staggering. For the first test, once I managed to work out how to write a CUDA kernel, I just did some simple vector addition.
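For anyone who has not seen one, a vector addition kernel is about as small as CUDA gets. The sketch below is plain CUDA C++ rather than the C#/managedCuda code used for this post (which isn't reproduced here), and the names vectorAdd and runVectorAdd, the block size of 256 and the test data are all my own assumptions for illustration.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Each thread adds one pair of elements. The bounds check covers the last
// block when n is not a multiple of the block size.
__global__ void vectorAdd(const float* a, const float* b, float* c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        c[i] = a[i] + b[i];
    }
}

// Hypothetical helper: adds two host vectors on the GPU and returns true
// if every element matches the sum computed on the CPU.
bool runVectorAdd(int n)
{
    std::vector<float> a(n), b(n), c(n, 0.0f);
    for (int i = 0; i < n; ++i) { a[i] = 0.5f * i; b[i] = 2.0f; }

    const size_t bytes = static_cast<size_t>(n) * sizeof(float);
    float *dA = nullptr, *dB = nullptr, *dC = nullptr;
    bool ok = cudaMalloc(&dA, bytes) == cudaSuccess
           && cudaMalloc(&dB, bytes) == cudaSuccess
           && cudaMalloc(&dC, bytes) == cudaSuccess
           && cudaMemcpy(dA, a.data(), bytes, cudaMemcpyHostToDevice) == cudaSuccess
           && cudaMemcpy(dB, b.data(), bytes, cudaMemcpyHostToDevice) == cudaSuccess;

    if (ok) {
        const int threads = 256;                         // assumed block size
        const int blocks = (n + threads - 1) / threads;  // round up
        vectorAdd<<<blocks, threads>>>(dA, dB, dC, n);
        // The blocking copy back to the host also waits for the kernel.
        ok = cudaGetLastError() == cudaSuccess
          && cudaMemcpy(c.data(), dC, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    }

    // cudaFree on a null pointer is a no-op, so this is safe on failure too.
    cudaFree(dA); cudaFree(dB); cudaFree(dC);

    for (int i = 0; ok && i < n; ++i) {
        ok = (c[i] == a[i] + b[i]);
    }
    return ok;
}

int main()
{
    std::printf("vector add %s\n", runVectorAdd(50000) ? "matched the CPU result" : "failed");
    return 0;
}
```

Saved as a .cu file, this should build with nvcc on a machine with the CUDA toolkit installed. From C#, the same kernel would typically be compiled separately and loaded through the wrapper library.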

CUDA: Run #1

For vectors with 50,000 random elements, CUDA was about five times faster than plain old C#. It gets more interesting, though. Try it with 500,000 elements, and the CUDA code performs about twenty times faster than C#.

CUDA: Run #2

You can guess where this is going: how about 5,000,000?

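The data in the table below is for 10,000 iterations of adding two random float vectors of length N, averaged over 20 runs. You can spot the pattern, right? The time taken for addition in C# increases roughly linearly with the number of elements (about 8x and then 10x for each tenfold increase in N), while the CUDA time grows more slowly (about 2.3x and then 6.7x). In this data the gap widens from about 4.6x at 50,000 elements to about 24.5x at 5,000,000.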

Vector length N    C# (ms)    CUDA (ms)
50,000                2585          559
500,000              21584         1298
5,000,000           214321         8738
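The timing harness behind these numbers isn't shown here, and whether the CUDA column includes host-to-device copies isn't stated. As a rough sketch of one way to time repeated kernel launches on the GPU side only, CUDA events can bracket the loop. This is CUDA C++, not the C# harness, and timeVectorAdd and the block size of 256 are hypothetical choices of mine.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__global__ void vectorAdd(const float* a, const float* b, float* c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        c[i] = a[i] + b[i];
    }
}

// Hypothetical helper: returns the elapsed GPU time in milliseconds for
// `iterations` kernel launches on vectors of length n, or -1.0f on failure.
// Transfers between host and device are deliberately left out of the timing.
float timeVectorAdd(int n, int iterations)
{
    const size_t bytes = static_cast<size_t>(n) * sizeof(float);
    float *dA = nullptr, *dB = nullptr, *dC = nullptr;
    cudaEvent_t start = nullptr, stop = nullptr;
    float ms = -1.0f;

    bool ok = cudaMalloc(&dA, bytes) == cudaSuccess
           && cudaMalloc(&dB, bytes) == cudaSuccess
           && cudaMalloc(&dC, bytes) == cudaSuccess
           && cudaMemset(dA, 0, bytes) == cudaSuccess   // zero-filled inputs
           && cudaMemset(dB, 0, bytes) == cudaSuccess
           && cudaEventCreate(&start) == cudaSuccess
           && cudaEventCreate(&stop) == cudaSuccess;

    if (ok) {
        const int threads = 256;                         // assumed block size
        const int blocks = (n + threads - 1) / threads;

        cudaEventRecord(start);
        for (int it = 0; it < iterations; ++it) {
            vectorAdd<<<blocks, threads>>>(dA, dB, dC, n);
        }
        cudaEventRecord(stop);

        // Wait until every launch before `stop` has finished on the GPU.
        if (cudaEventSynchronize(stop) == cudaSuccess
            && cudaGetLastError() == cudaSuccess) {
            float elapsed = 0.0f;
            if (cudaEventElapsedTime(&elapsed, start, stop) == cudaSuccess) {
                ms = elapsed;
            }
        }
    }

    if (start) cudaEventDestroy(start);
    if (stop) cudaEventDestroy(stop);
    cudaFree(dA); cudaFree(dB); cudaFree(dC);
    return ms;
}

int main()
{
    const int sizes[] = {50000, 500000, 5000000};
    for (int n : sizes) {
        std::printf("N = %d: %.1f ms for 10000 launches\n", n, timeVectorAdd(n, 10000));
    }
    return 0;
}
```

Numbers from a sketch like this are not directly comparable with the table above, since launch overhead through a managed wrapper and any per-iteration copies would change the picture.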

What’s next

Now, testing even larger vectors on my machine with the test code I wrote is not really possible, since I only have 32GB of system RAM available, and if I start paging (even with the silly M.2 drive) performance will be bound much more by IO than by the GPU/CPU comparison.

I realize none of the above is particularly complex, but it is interesting and very new. And it is tempting me to take a project I was working on a few years ago, swap out the core logic (it had lots of cosine similarity calculations) and see how it would perform with CUDA. It is fun though, and I’m pleasantly surprised at how well documented this whole space is. It took a lot less time than I would have expected.

Hey, where are the missing days?

I’ve been busy at work, travelling and doing the new things. Not to forget avoiding getting mauled by my mini-lions. This means I haven’t found the time to write the posts on schedule for a week or so. I’ll post this one today, and back-fill as quickly as I can during the week.