10 Commits

Author SHA1 Message Date
Genna Wingert
5b4b334632 chore: Update to upstream changes in cubecl (#4670)
* Update to register changes

* Fix clippy

* Fix candle

* Reformat
2026-03-25 08:13:21 -04:00
Guillaume Lagrange
a615c61451 Refactor backend tests to set device settings at initialization + use Dispatch (#4666)
* Fix autodiff device checkpointing

* Refactor tests and fix some out dtypes

* Cargo fmt

* Update xtask handle_backend_tests

* Remove debug

* Cleanup

* Fix remote dtype usage

* Feature gate f16 tests

* Fix default quant scheme

* Fix cuda enabled by default in workspace

* Cargo fmt

* Fix clippy

* Fix display

* Fix float elem metal

* Update notes

* Fix burn-store tests
2026-03-24 16:04:19 -04:00
Nathaniel Simard
91e784e3fb Bump versions (#4589) 2026-03-02 12:05:46 -05:00
Nathaniel Simard
d4dbe9ef84 Chore: Pre-Release 0.21.0-pre.1 (#4494) 2026-02-09 15:47:05 -05:00
Guillaume Lagrange
e2d8833ff9 Bump burn version 0.21 (#4333)
* Bump burn version 0.21

* Feature gate tracing
2026-01-15 13:14:41 -05:00
Guillaume Lagrange
67e7a966af Set release version (#4331) 2026-01-15 10:10:48 -05:00
Crutcher Dunnavant
8628cd8cd4 Refactor dop_timer for multiple trials (for warmup). (#4288) 2026-01-08 08:48:08 -05:00
Crutcher Dunnavant
439a26c0ff Plumb deep tracing feature. (#4234) 2025-12-23 15:55:40 -05:00
Nathaniel Simard
91dd62c92a Chore/pre release 6 (#4224) 2025-12-18 15:50:25 -05:00
Crutcher Dunnavant
a119b279ce Add tracing::instrument and refactor collective operations. (#4157)
* Tracing; WIP

* WIP towards all_reduce tuning.

* Add tracing support for collective operations and refactor all_reduce implementation

* wip

* Refactor all_reduce by removing monoid_broadcast and migrating shared tensor map utilities to a centralized module. Add tracing instrumentation for collective operations. Include minor performance tweaks and fixes across broadcast, reduce, and related modules.

* Refactor collective operations by introducing `PeerDeviceMap` and `get_peer_devices`. Simplify device handling in broadcast and all_reduce functions. Migrate shared logic to `tensor_map.rs` for improved reuse and clarity.

* Refactor all_reduce_sum_centralized return type to use `CollectiveTensorMap`. Remove unused `HashMap` dependency for cleaner imports.

* Update `tracing` dependency configurations and align features across crates

- Add `attributes` feature to `tracing` for enhanced functionality in multiple crates.
- Standardize inclusion of `tracing/std` in applicable feature sets.
- Fix minor typo in `reduce_timing` example documentation.

* Update `tracing-core` and `tracing-subscriber` configurations to enforce default features across crates

- Add and standardize `default` features for `tracing-core` and `tracing-subscriber`.
- Extend `tokio` with `tracing` feature in `burn-communication`.
- Align dependency feature sets for consistent tracing behavior.

* Add `tracing` and `tracing-core` dependencies with standardized feature configurations

- Include `tracing` and `tracing-core` in `Cargo.toml` with appropriate features across crates.
- Add `tracing/default` and `tracing-core/default` to feature sets in `burn`.
- Remove redundant `default` feature usage for `tracing-core` in `burn-train` and `burn-import`.

This is still broken for the no-std target:
```terminaloutput
$ cargo build --color always --no-default-features --target thumbv6m-none-eabi -p burn
```

* This doesn't work

* Remove portable-atomic dependencies and make tracing optional

Eliminates all uses of portable-atomic and portable-atomic-util from the workspace, including conditional dependencies for non-atomic pointer targets. Updates tracing and tracing-core dependencies to be optional in all affected crates, and adjusts feature flags to use dep:tracing and conditional tracing features. Also updates attribute macros in burn-autodiff to only use tracing instrumentation when the std feature is enabled.

* Add AllReduceOp and AllReduceResult imports to server.rs

Imported AllReduceOp and AllReduceResult in local/server.rs to support additional collective operations. This prepares the server module for handling all-reduce functionality.

* Reformat imports in server.rs for readability

Adjusted the formatting of the import statements in server.rs to improve readability and maintain consistency with Rust style guidelines. No functional changes were made.

* Refactor collective operations to generalize `WebSocket` to `Protocol` and improve shared logic reuse.

* Fix typo in error message: "missmatch" → "mismatch" in collective operations

* Refactor: Replace `send_err_to_all` with `fail` for improved error handling consistency in collective operations.

* Refactor `Op` to simplify `peer_devices` acquisition logic and reuse shared operations

* Refactor: Inline await expressions in `all_reduce` and `broadcast` operations; simplify `peer_devices` logic by removing redundant implementation.

* Refactor: Simplify `reduce`, `all_reduce`, and `broadcast` operations by replacing `into_iter` with `iter`, removing unused imports, and streamlining collection mechanisms

* Refactor: Move `effective_root` and `peers` methods from `BroadcastOp` to `BroadcastOpCall` to simplify struct design and enhance cohesion

* Refactor: Simplify `broadcast` operation by reordering logic, inlining expressions, and enhancing readability

* Refactor: Use `expect` for clearer error handling in `reduce` operation, inline variable for global strategy

* Refactor: Rename `reduce_timing` to `dop_timer`, modularize components into `workers`, `parsers`, and `event_utils`, and simplify event instrumentation logic.

* Refactor: Remove unused imports, streamline tracing setup, and simplify `WorkerHandle` interface by removing unused methods and variables

* Refactor: Relocate `WorkRequest` enum to `workers.rs` and remove unused import in `run` function

* Refactor: Remove redundant default values in `Args` struct definition

* Refactor: Simplify tensor reduction and broadcasting logic, streamline `peer_devices` handling, and remove unused imports

* readme/otel

* Simplify imports and std propagation.

* fmt

* rebase/fix
2025-12-16 08:14:46 -05:00
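
Commit a119b279ce refers to a centralized all-reduce (`all_reduce_sum_centralized`) that returns a per-peer tensor map. As a rough illustration of that general pattern only, the sketch below gathers every peer's buffer at a single point, sums them element-wise, and hands the same result back to each peer. It is not burn's implementation: `PeerId`, the `BTreeMap<PeerId, Vec<f32>>` stand-in for a tensor map, the `String` error type, and the function signature are all assumptions made for this example.

```rust
use std::collections::BTreeMap;

/// Hypothetical peer identifier used only in this sketch (not a burn type).
type PeerId = u32;

/// Centralized all-reduce (sum), sketched on plain `Vec<f32>` buffers:
/// 1. gather every peer's buffer at one place,
/// 2. sum them element-wise,
/// 3. give every peer a copy of the summed buffer.
fn all_reduce_sum_centralized(
    inputs: &BTreeMap<PeerId, Vec<f32>>,
) -> Result<BTreeMap<PeerId, Vec<f32>>, String> {
    // Gather step: start the accumulator from the first peer's buffer.
    let mut buffers = inputs.values();
    let first = buffers
        .next()
        .ok_or_else(|| "no peers registered".to_string())?;
    let mut sum = first.clone();

    // Reduce step: add every remaining buffer into the accumulator.
    for buf in buffers {
        if buf.len() != sum.len() {
            return Err(format!(
                "shape mismatch: expected {} elements, got {}",
                sum.len(),
                buf.len()
            ));
        }
        for (acc, x) in sum.iter_mut().zip(buf) {
            *acc += *x;
        }
    }

    // Broadcast step: every peer receives its own copy of the result.
    Ok(inputs.keys().map(|&peer| (peer, sum.clone())).collect())
}
```

Calling the function with three peers holding `[1.0, 2.0]`, `[3.0, 4.0]`, and `[5.0, 6.0]` gives each peer `[9.0, 12.0]`. A real collective would move device tensors between peers rather than clone host vectors, which is where the choice between centralized and other strategies starts to matter.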