When linking `libnvvm.so.4` into the `librustc_codegen_nvvm.so` shared
library via the `cust_raw` package, we emit the following directives
from `build.rs`:
- cargo::rustc-link-search=native=xxx
- cargo::rustc-link-lib=dylib=nvvm
According to the [Cargo documentation](https://doc.rust-lang.org/cargo/reference/build-scripts.html#rustc-link-search),
the path we add via `rustc-link-search` is not propagated to
`LD_LIBRARY_PATH`, where transitive dependencies would search for it,
unless it lies inside `OUT_DIR`:
> These paths are also added to the dynamic library search path
> environment variable if they are within the OUT_DIR.
This behavior is intentionally limited to `OUT_DIR`, so the directory
holding `libnvvm.so.4`, which ships as part of the CUDA SDK, is
excluded. Therefore, we must add it to `LD_LIBRARY_PATH` manually in
our container build stage.
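
As a rough illustration of the directives above, here is a minimal sketch of what such a build script might print. The function names and the `/usr/local/cuda/nvvm/lib64` path are assumptions made for this example, not the actual `cust_raw` code, and `forwarded_to_loader_path` is only a simplified model of the rule quoted from the Cargo documentation.

```rust
use std::path::Path;

/// Build the directives a build script would print to link against libnvvm.
/// `nvvm_lib_dir` is a hypothetical parameter: the directory holding libnvvm.so.
fn link_directives(nvvm_lib_dir: &Path) -> Vec<String> {
    vec![
        format!("cargo::rustc-link-search=native={}", nvvm_lib_dir.display()),
        "cargo::rustc-link-lib=dylib=nvvm".to_string(),
    ]
}

/// Simplified model of the documented rule: a search path is only added to the
/// dynamic library search path variable when it is inside OUT_DIR.
fn forwarded_to_loader_path(search_dir: &Path, out_dir: &Path) -> bool {
    search_dir.starts_with(out_dir)
}

fn main() {
    // Assumed location; a real build script would discover the CUDA install.
    let nvvm_lib_dir = Path::new("/usr/local/cuda/nvvm/lib64");
    for line in link_directives(nvvm_lib_dir) {
        println!("{line}");
    }
}
```

Because the CUDA SDK directory is outside `OUT_DIR`, the model returns `false` for it, which is why the directory still has to be placed on `LD_LIBRARY_PATH` by hand.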
Added a new workflow, `container_images.yml`, to automate building and
pushing CI container images to GitHub Container Registry, ensuring
consistent environments for CI runs across Linux distributions.
Added a dedicated CI workflow for Linux to streamline testing across
multiple CUDA and OS configurations. It reuses the images built by the
`container_images.yml` workflow, which saves time by not reinstalling
the CUDA toolkit on every run.
Key changes:
- Matrix configurations for Ubuntu 22/24, RockyLinux 9 with CUDA 12.8.1
- Dockerfiles updated with required dependencies (clang, CUDA
toolchains, etc.) to support bindgen and build examples.
- Removed redundant CI steps from `rust.yml` now handled in
`ci_linux.yml`
The network installer is a smaller initial download that fetches only
the packages required for the installation. This should reduce install
times on Windows CI.
1. Consolidation of bindgen-related "*-sys" packages
- Remove the common dependency of `find_cuda_helper`. Use the cargo
metadata mechanism instead.
- Merged all CUDA bindgen-generated code into the cust_raw crate for
simplicity and maintainability.
- Add CUDA Runtime API bindgen support.
2. cuDNN and OptiX Integration
- Split cudnn into cudnn (high-level API) and cudnn-sys (low-level
bindings) for better abstraction.
- Split optix into optix (high-level API) and optix-sys (low-level
bindings) for better abstraction.
3. CUDA 12+ Support
- Updated cust to support CUDA versions >= 12.
- Added compatibility for CUDA 12.3+ graph API changes:
- Renamed cuGraphKernelNodeGetParams →
cuGraphKernelNodeGetParams_v2.
- Enabled conditional node support for CUDA >= 12.3.
4. Temporarily disable cuDNN in CI
- Windows CI pipelines have no cuDNN support yet.
Co-authored-by: Jorge Ortega <jorge-ortega@outlook.com>
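
The "cargo metadata mechanism" from item 1 above presumably refers to Cargo's `links` metadata: a build script of a package that declares `links = "..."` can print `cargo::metadata=KEY=VALUE`, and the build scripts of direct dependents then see the value as `DEP_<LINKS>_<KEY>`. The sketch below only formats the two sides of that handshake. The `cuda` links name and the `includes` key are hypothetical, chosen for illustration, and the helper assumes a links name without hyphens.

```rust
use std::env;

/// Producer side: the line a `-sys` crate's build script would print.
fn metadata_directive(key: &str, value: &str) -> String {
    format!("cargo::metadata={key}={value}")
}

/// Consumer side: the environment variable Cargo sets for a dependent's
/// build script. Assumes `links` contains no hyphens.
fn dep_env_var(links: &str, key: &str) -> String {
    format!("DEP_{}_{}", links.to_uppercase(), key.to_uppercase())
}

fn main() {
    // Hypothetical key and value, for illustration only.
    println!("{}", metadata_directive("includes", "/usr/local/cuda/include"));

    // In a dependent's build script the value would be read back like this.
    let var = dep_env_var("cuda", "includes");
    match env::var(&var) {
        Ok(value) => println!("{var} = {value}"),
        Err(_) => println!("{var} is not set (not running under Cargo)"),
    }
}
```

Passing paths this way lets each dependent crate learn where the SDK lives from the crate that found it, instead of every crate re-running its own discovery helper.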
* wip
* bootstrap enough optix to get ex02 working
* Add example 03
Generate an animated pattern in the raygen and display it in a window using glfw
* add logging callback
* remove ustr
* Manually create OptixShaderBindingTable
field-by-field instead of transmute
* Switch Module and Pipeline methods to their structs
Instead of having them on DeviceContext
* Switch Module, Pipeline, ProgramGroup methods to their structs
Instead of having them on DeviceContext
* Refactor: remove dead imports
* derive DeviceCopy
* typo
* Better error message
* Move destroy to Drop impl
* typo
* rename OptixContext to DeviceContext
* Make launch params variable name optional
* Remove Clone from Module and ProgramGroup
* Make log callback safe
Use catch_unwind to guard against panic in C. Remove note about lifetime of
closure since it's 'static anyway. Have set_log_callback return a Result instead
of panicking on error.
* add wip glam support
* dont panic in drop
* Rework DevicePointer on top of CUdeviceptr
This switches out *T for CUdeviceptr in DevicePointer. This has the
knock-on effect of removing a lot of "pretend we're a CPU pointer" stuff
from downstream types like DeviceSlice.
* wip accel support
* Add accel wip
Enough acceleration structure stuff to get example 04 running, and
rebasing on top of deviceptr branch
* Rework acceleration structure stuff
Provide simple internally allocating API for Accel, but also allow
creating one from raw parts to let user handle memory allocation.
Original API kept as free functions and marked unsafe.
Implement all build input types.
Add mint support to cust and optix.
* add lifetime bound on Instance to referenced Accel
* Have DeviceCopy impl for lifetime markers use null type
* Add unsafe from_handle ctor
* Add update for DynamicAccel
* Hash build inputs to ensure update is sound
* Add relocation info
* Add remaining DeviceContext methods
* Correct docstrings
* Add doc comments
* Add a prelude
* Own the geometry flags array
* Add prelude
* own the geometry flags array and add support for pre_transform
* Fill out context and add some module docs
* Add some module docs
* Update to latest library changes
* Add more docs
* Remove mut requirement for getting pointer
* Add a simple memcpy_htod wrapper
* Add back pointer offset methods
* Big structure reorg and documentation push
- Reorganized the module structure to something less fragmented. Modules
are longer but more cohesive.
- Integrated the optix programming guide inline into the module
documentation.
To see the equations, you need to build the docs with:
RUSTDOCFLAGS="--html-in-header katex-header.html" cargo doc --no-deps
(this is done automatically on docs.rs)
* Wrap SBT properly
* Rename transform types
* Simplify AccelBuildOptions creation
Just take the build flags and move everything else to builders
* Hide programming guide in details tags
* Adapt to latest changes
* Fix toolchain version
* Fix name of DeviceContext
* first optix rust test
* Set ALLOW_COMPACTION in build options
* Use find_cuda_helper to get cuda path
* Handle differing enum representation on Windows and Linux
* Add DeviceVariable
* Add DeviceMemory trait
Abstracts over different device storage representations
* Add mem_get_info
* Add external memory
* Add a few more types to prelude
* Add more types
* Rework on top of new DeviceVariable
* first optix rust test
* tweak build
* update to latest optix changes
* Split DeviceCopy into cust_core
* update to latest optix changes
* trying to get print working
* tweak test kernel
* stop llvm optimizing out LaunchParams
* Chore: update cargo.toml dep versions
* Feat: second pass for fixing conflicts
* Feat: delete as_ptr and as_mut_ptr on DeviceSlice
* Revert "Feat: delete as_ptr and as_mut_ptr on DeviceSlice"
This reverts commit e858fdcaf3.
* Feat: experiment with deleting as_ptr and as_mut_ptr
* Fix issues and warnings
* Chore: run formatting
* Chore: exclude examples from building in CI
* Feat: update changelog with changes, misc changes before merge
Co-authored-by: rdambrosio <rdambrosio016@gmail.com>
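
The "Make log callback safe" entry above describes guarding the user's closure with `catch_unwind` so a panic cannot unwind into C. A minimal sketch of that idea follows. `call_log_callback` and its `(level, tag, message)` signature are assumptions for illustration, not the optix crate's actual API; in a real binding the guard would sit inside the `extern "C"` trampoline that the C library invokes.

```rust
use std::panic::{self, AssertUnwindSafe};

/// Run a user-supplied log closure without letting a panic escape.
/// Returns `true` if the closure completed, `false` if it panicked.
fn call_log_callback<F>(cb: &mut F, level: u32, tag: &str, msg: &str) -> bool
where
    F: FnMut(u32, &str, &str),
{
    // AssertUnwindSafe: we accept that the closure's captured state may be
    // left half-updated if it panics; the goal is only to stop the unwind.
    panic::catch_unwind(AssertUnwindSafe(|| cb(level, tag, msg))).is_ok()
}

fn main() {
    let mut seen: Vec<String> = Vec::new();
    let mut ok_cb = |level: u32, tag: &str, msg: &str| {
        seen.push(format!("[{level}][{tag}] {msg}"));
    };
    let completed = call_log_callback(&mut ok_cb, 4, "INFO", "pipeline created");
    println!("completed = {completed}, captured {} line(s)", seen.len());

    // A panicking callback is contained instead of aborting the caller.
    let mut bad_cb = |_level: u32, _tag: &str, _msg: &str| {
        panic!("callback failed");
    };
    let completed = call_log_callback(&mut bad_cb, 1, "ERROR", "ignored");
    println!("completed = {completed}");
}
```

The default panic hook still prints the panic message to stderr; the guard only ensures the unwind stops at the Rust boundary.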