Modules for GPUmat
These modules have been tested and work with release 0.251 of GPUmat, the current release as of May 18, 2010. They require the CUDA 3.0 driver & runtime. If you find a bug, I am happy to hear from you, but please note that the modules are provided without support.
Download my modules here: gwt_modules-3.0_Jul172010.zip.
I also have a version for the CUDA 2.3 runtime. It was tested and works with GPUmat 0.25. This is likely the last time I will release a version for the 2.3 runtime. It can be downloaded here: gwt_modules-2.3_Jul172010.zip.
They are pre-compiled for the x86_64 architecture. You should be
able to recompile them for another architecture. However, GPUmat
forces us to use the CUDA Driver API, which makes template kernels
tricky: the driver code has to look up each kernel by its mangled
name. In the convolution & subsampling functions, I have had to
copy the mangled name directly from the cubin file into the driver
(cpp) code where the module is loaded. Since changing the
architecture will change the mangled name, this may require some
manual labour on your part.
Update: It seems that since CUDA 3.0 the .cubin files are no longer ASCII, so you can no longer find the mangled names just by opening them in a text editor. However, you can still retrieve this information by passing some additional flags to nvcc: See here.
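One possible way to recover the names, which is not necessarily the method described at the link above, relies on the fact that PTX output is still plain text: compile the kernel source with `nvcc -ptx` and read the name that follows each `.entry` directive. The sketch below assumes such a PTX file already exists; `list_ptx_entries` is a hypothetical helper name used only for illustration.

```shell
# list_ptx_entries PTX_FILE
# Prints the (mangled) kernel name that follows each ".entry" directive in a
# PTX file, one name per line. Hypothetical helper, not part of CUDA or GPUmat.
list_ptx_entries() {
    # -n suppresses default output; the trailing "p" prints only lines where
    # the substitution matched, i.e. lines containing an .entry directive.
    sed -n 's/^.*\.entry[[:space:]]\{1,\}\([A-Za-z_$][A-Za-z0-9_$]*\).*$/\1/p' "$1"
}
```

For example, after something like `nvcc -ptx kernels.cu -o kernels.ptx` (the file names are placeholders), `list_ptx_entries kernels.ptx` lists the names that the driver code would need. The PTX should be generated for the same target as the cubin you intend to load.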
Python users should check out cudamat.
Note: although GPUmat provides double-precision & complex-number support, my functions only work with single precision (i.e. GPUsingle types).
- cuMisc module :
  Implements some missing functionality in GPUmat (like random number
  generation) and other functions useful for third-order RBMs and
  - cuRand - uniform random generation
  - cuRandn - Gaussian random generation
  - cuBinarizeProbs - Bernoulli sampling
  - cuSigmoid - Logistic function
  - cuThreeway - Generalized three-way "outer product"
  - cuDist - Euclidean distance computation
  - cuSquaredDist - Euclidean squared distance computation
  - cuCopyInto - Copies every image into a larger image (zero padding)
  - cuGridToMatrix - like a parallelized im2col
  - cuMatrixToGrid - like a parallelized col2im
  - cuRotate180 - Batch rotation of many filters
  - cuSampleMultinomial - Sample from many same-size multinomial distributions in parallel
  - cuSubsample - Average pooling many images
  - cuSupersample - Upsampling many images by a constant factor
- cuConv module : 2-D convolution
See the help associated with each file. Many of these functions are simply wrappers around Alex Krizhevsky's code, and he deserves all the credit.
- Jul 17, 2010: Added .cuh files (which will allow users to recompile)
- May 18, 2010
  - Name change: cuMisc replaces cuThreeway module
  - Introduced several new functions: cuSquaredDist, cuCopyInto, cuGridToMatrix, cuMatrixToGrid, cuRotate180, cuSampleMultinomial, cuSubsample, cuSupersample, cuNCAreg
  - Release for both GPUmat 0.25 (2.3 runtime) and 0.251 (3.0 runtime supporting new Fermi cards)
- Apr 16, 2010
  - Name change: cuConv2 is now cuConv (done to support Alex's naming convention)
  - Introduced a new function: cuConv2, which is similar to cuConv but convolves each image with a set of filters that is specific to that image
- Feb 23, 2010 : Update to support Dec 29 version of Alex's code (previously was based on his Nov 1 code)
- Jan 21, 2010
  - Added a note to the cuConv2 help that the filter is not rotated ( fliplr(flipud()) ) before the dot-multiply; this differs from Matlab's conv2 functionality
  - Removed references to cuConv.___ (old naming convention, superseded by cuConv2.___)
- Jan 19, 2010 : Initial release
Download GPUmat and verify that it is working for you. In
particular, you should be able to compile and run their sample
modules. It was necessary for me to add:
CUDA_ROOT = '/usr/local/pkg/cuda/current/cuda' to:
This should reflect the actual location of your CUDA install.
Make sure your environment is set correctly. This will, of course,
depend on the actual location of your CUDA install. These
statements, in my ~/.bash_profile were sufficient for the
[ -d /usr/local/pkg/cuda/current/cuda/bin ] && export PATH=/usr/local/pkg/cuda/current/cuda/bin:$PATH
[ -d /usr/local/pkg/cuda/current/cuda/lib64 ] && export LD_LIBRARY_PATH=/usr/local/pkg/cuda/current/cuda/lib64:$LD_LIBRARY_PATH
[ -d /usr/local/pkg/cuda/current/cuda ] && export CUDA_INSTALL_PATH=/usr/local/pkg/cuda/current/cuda
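A quick way to confirm that exports like these took effect is to check that the directories exist and that nvcc resolves through PATH. The following is only a sketch: `check_cuda_env` is a hypothetical helper name, and the directory layout (`bin` and `lib64` under the CUDA root) is taken from the statements above rather than from any CUDA documentation.

```shell
# check_cuda_env ROOT
# Reports whether ROOT, ROOT/bin and ROOT/lib64 exist and whether nvcc is
# visible on PATH. Returns 0 only if every check passes.
# Hypothetical helper, not part of CUDA or GPUmat.
check_cuda_env() {
    local root="$1"
    local status=0
    local dir
    for dir in "$root" "$root/bin" "$root/lib64"; do
        if [ -d "$dir" ]; then
            echo "ok:      $dir"
        else
            echo "missing: $dir"
            status=1
        fi
    done
    # nvcc should be found through PATH once ROOT/bin has been prepended
    if command -v nvcc >/dev/null 2>&1; then
        echo "ok:      nvcc found at $(command -v nvcc)"
    else
        echo "missing: nvcc is not on PATH"
        status=1
    fi
    return "$status"
}
```

In a new login shell, `check_cuda_env "$CUDA_INSTALL_PATH"` should then report every entry as ok; substitute the actual location of your CUDA install if the variable is not set.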
Unzip the above zip file in your GPUmat directory. This should create:
Provided you don't need to recompile for your platform, running GPUstart should automatically initialize my modules (as well as the default modules). You can also manually initialize each module by using its moduleinit.m script. The test scripts (test___.m) demonstrate how to call each function.
One small hiccup (that I will eventually get around to addressing): rnd_multipliers_32bit.txt (provided in modules/gwt/cuMisc) must be in the current directory the first time you call any of the functions that use random number generation (cuRand, cuRandn, cuBinarizeProbs). Otherwise, it will complain about the missing file.
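Until that is addressed, the file can be copied into the working directory before starting Matlab. The sketch below assumes that modules/gwt/cuMisc sits directly under the GPUmat directory where the zip file was unpacked; `stage_rnd_multipliers` is a hypothetical helper name.

```shell
# stage_rnd_multipliers GPUMAT_DIR [WORK_DIR]
# Copies rnd_multipliers_32bit.txt from GPUMAT_DIR/modules/gwt/cuMisc into
# WORK_DIR (default: the current directory) unless it is already there.
# Hypothetical helper, not part of GPUmat or of these modules.
stage_rnd_multipliers() {
    local gpumat_dir="$1"
    local work_dir="${2:-.}"
    local name="rnd_multipliers_32bit.txt"
    local src="$gpumat_dir/modules/gwt/cuMisc/$name"

    if [ -f "$work_dir/$name" ]; then
        return 0    # already staged, nothing to do
    fi
    if [ ! -f "$src" ]; then
        echo "cannot find $src" >&2
        return 1
    fi
    cp "$src" "$work_dir/$name"
}
```

Calling `stage_rnd_multipliers /path/to/GPUmat` from the directory in which you launch Matlab (the path is a placeholder) leaves a copy of the file there; a symbolic link would serve equally well if you prefer not to duplicate the file.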