The cuda-oxide Book#
cuda-oxide is an experimental Rust-to-CUDA compiler that lets you write (SIMT) GPU kernels in safe(ish), idiomatic Rust. It compiles standard Rust code directly to PTX — no DSLs, no foreign language bindings, just Rust.
Project Status#
The v0.1.0 release is an early-stage alpha: expect bugs, incomplete features, and API breakage as we work to improve it. We hope you’ll try it and help shape its direction by sharing feedback on your experience.
🚀 Quick start#
use cuda_device::{cuda_module, kernel, thread, DisjointSlice};
use cuda_core::{CudaContext, DeviceBuffer, LaunchConfig};
#[cuda_module]
mod kernels {
use super::*;
#[kernel]
fn vecadd(a: &[f32], b: &[f32], mut c: DisjointSlice<f32>) {
let idx = thread::index_1d();
let i = idx.get();
if let Some(c_elem) = c.get_mut(idx) {
*c_elem = a[i] + b[i];
}
}
}
fn main() {
let ctx = CudaContext::new(0).unwrap();
let stream = ctx.default_stream();
let module = kernels::load(&ctx).unwrap();
let a = DeviceBuffer::from_host(&stream, &[1.0f32; 1024]).unwrap();
let b = DeviceBuffer::from_host(&stream, &[2.0f32; 1024]).unwrap();
let mut c = DeviceBuffer::<f32>::zeroed(&stream, 1024).unwrap();
module
.vecadd(&stream, LaunchConfig::for_num_elems(1024), &a, &b, &mut c)
.unwrap();
let result = c.to_host_vec(&stream).unwrap();
assert_eq!(result[0], 3.0);
}
Build and run with cargo oxide run vecadd upon installing the prerequisites.
Why cuda-oxide?#
Write GPU kernels with Rust’s type system and ownership model. Safety is a first-class goal, but GPUs have subtleties — read about the safety model.
Not a DSL. A custom rustc codegen backend that compiles pure Rust to PTX.
Compose GPU work as lazy DeviceOperation graphs.
Schedule across stream pools. Await results with .await.
next
Installation
- Project Status
- 🚀 Quick start
- Why cuda-oxide?