Some notes on legoing with CUDA on an ASUS EB1012 running Xubuntu 10.10 64-bit.
Installation is straightforward and well-documented in the Getting Started Guide. Download the driver, Toolkit and SDK from CUDA 3.2 Downloads, run the installers, compile the SDK samples and you are ready to start legoing!
I highly recommend compiling/installing everything from scratch; it is by far the fastest path to a working environment! One lazy approach is to skim the Getting Started Guide and follow the copy-pastable notes below.
WARNING: If you try to take a "shortcut" by re-using your system's currently installed driver, or by installing a driver via some arbitrary PPA, then you need to keep the following in mind:
Required software:
sudo apt-get install vim lynx g++ ia32-libs libx11-dev libglut3-dev libgl1-mesa-dev libglu-dev libxmu-dev libxi-dev
Acquire the Toolkit, Device driver and GPU Computing SDK from the CUDA 3.2 Downloads page:
# Fetch files
wget http://developer.download.nvidia.com/compute/cuda/3_2_prod/drivers/devdriver_3.2_linux_64_260.19.21.run
wget http://developer.download.nvidia.com/compute/cuda/3_2_prod/toolkit/cudatoolkit_3.2.16_linux_64_ubuntu10.04.run
wget http://developer.download.nvidia.com/compute/cuda/3_2_prod/sdk/gpucomputingsdk_3.2.16_linux.run
# Set Execution Rights
chmod +x *.run
# Install
sudo ./devdriver_3.2_linux_64_260.19.21.run
sudo ./cudatoolkit_3.2.16_linux_64_ubuntu10.04.run
./gpucomputingsdk_3.2.16_linux.run
# Set library Paths
echo '' >> ~/.bashrc
echo 'export PATH="$PATH:/usr/local/cuda/bin"' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64:/usr/local/cuda/lib"' >> ~/.bashrc
# Fix glut-library
sudo ln -s /usr/lib/libglut.so.3 /usr/lib/libglut.so
# Load driver
sudo modprobe nvidia
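One catch with the two echo lines in the script above: every run appends to ~/.bashrc again, so pasting the notes twice leaves duplicate export lines. Here is a minimal sketch of a guard, assuming a hypothetical helper named append_once (my own name, not part of CUDA or Ubuntu) that only appends a line when an identical one is not already in the file:

```shell
#!/bin/sh
# append_once FILE LINE: add LINE to FILE unless an identical line is already there.
# Hypothetical helper for illustration; not shipped with CUDA or Ubuntu.
append_once() {
    file=$1
    line=$2
    # Make sure the file exists so grep does not complain.
    touch "$file"
    # -q: quiet, -x: match the whole line, -F: fixed string (no regex).
    if ! grep -qxF -- "$line" "$file"; then
        printf '%s\n' "$line" >> "$file"
    fi
}
```

With this in place, the export lines could be added with append_once ~/.bashrc 'export PATH="$PATH:/usr/local/cuda/bin"' and the same for LD_LIBRARY_PATH; running it a second time leaves the file unchanged.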
Output of the SDK's deviceQuery sample on this machine:
./deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
There is 1 device supporting CUDA
Device 0: "ION"
CUDA Driver Version: 3.20
CUDA Runtime Version: 3.20
CUDA Capability Major/Minor version number: 1.1
Total amount of global memory: 534052864 bytes
Multiprocessors x Cores/MP = Cores: 2 (MP) x 8 (Cores/MP) = 16 (Cores)
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 16384 bytes
Total number of registers available per block: 8192
Warp size: 32
Maximum number of threads per block: 512
Maximum sizes of each dimension of a block: 512 x 512 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
Maximum memory pitch: 2147483647 bytes
Texture alignment: 256 bytes
Clock rate: 1.10 GHz
Concurrent copy and execution: No
Run time limit on kernels: Yes
Integrated: Yes
Support host page-locked memory mapping: Yes
Compute mode: Default (multiple host threads can use this device simultaneously)
Concurrent kernel execution: No
Device has ECC support enabled: No
Device is using TCC driver mode: No
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 3.20, CUDA Runtime Version = 3.20, NumDevs = 1, Device = ION
PASSED
Press <Enter> to Quit...
-----------------------------------------------------------
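A few of the deviceQuery numbers above are easier to read after a bit of arithmetic. The sketch below copies the values by hand from the output and derives the global memory in MiB, the total core count, and the architecture name that matches compute capability 1.1. The helpers bytes_to_mib and cap_to_arch are hypothetical names of mine, and passing the result to nvcc as -arch assumes a toolkit of this vintage that still knows sm_11.

```shell
#!/bin/sh
# Values copied by hand from the deviceQuery output above.
global_mem_bytes=534052864
multiprocessors=2
cores_per_mp=8
compute_capability=1.1

# bytes_to_mib BYTES: print BYTES as whole MiB, rounded down.
bytes_to_mib() {
    echo $(( $1 / 1048576 ))
}

# cap_to_arch MAJOR.MINOR: print the matching architecture name,
# e.g. 1.1 -> sm_11 (hypothetical helper; it only strips the dot).
cap_to_arch() {
    echo "sm_$(printf '%s' "$1" | tr -d '.')"
}

echo "Global memory: $(bytes_to_mib "$global_mem_bytes") MiB"   # 509 MiB
echo "CUDA cores: $(( multiprocessors * cores_per_mp ))"        # 16
echo "nvcc flag: -arch=$(cap_to_arch "$compute_capability")"    # -arch=sm_11
```

So the ION shares roughly half a gigabyte with the host and has 16 cores to lego with.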
[bandwidthTest]
./bandwidthTest Starting...
Running on...
Device 0: ION
Quick Mode
Host to Device Bandwidth, 1 Device(s), Paged memory
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 1240.4
Device to Host Bandwidth, 1 Device(s), Paged memory
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 795.0
Device to Device Bandwidth, 1 Device(s)
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 6806.1
[bandwidthTest] - Test results:
PASSED
Press <Enter> to Quit...
-----------------------------------------------------------
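To get a feel for what the host/device bandwidth figures mean in practice, here is a rough sketch that turns them into the time one 32 MiB copy takes. It assumes bandwidthTest reports MB as 2^20 bytes, which I have not verified against the sample's source, and transfer_ms is a hypothetical helper name. The device-to-device figure is left out on purpose, since I am not sure it is counted the same way.

```shell
#!/bin/sh
# transfer_ms BYTES MB_PER_S: estimated milliseconds to copy BYTES at MB_PER_S.
# Assumption: 1 MB = 2^20 bytes. Hypothetical helper for illustration.
transfer_ms() {
    awk -v bytes="$1" -v rate="$2" \
        'BEGIN { printf "%.1f\n", bytes / 1048576 / rate * 1000 }'
}

# Transfer size and rates copied from the bandwidthTest output above.
echo "Host to Device: $(transfer_ms 33554432 1240.4) ms"   # 25.8 ms
echo "Device to Host: $(transfer_ms 33554432 795.0) ms"    # 40.3 ms
```

Under that assumption, reading results back from the ION costs noticeably more than sending data to it, which is worth keeping in mind when deciding how often a kernel's output needs to come back to the host.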