mirror of
https://github.com/jafioti/luminal.git
synced 2026-06-01 21:49:47 +09:00
luminal_python: WIP workaround for dynamo "L not defined" on gemma3
Set torch._dynamo.config.automatic_dynamic_shapes = False at package import time.

With the default (True), dynamo's frame-evaluation cache promotes a varying dim to dynamic on the second compiled call and emits a `_guards_fn` submodule whose source closes over `L` (the dynamo locals namespace). When our backend re-exports the FX graph, the closure's free `L` reference doesn't resolve and we panic with NameError: name 'L' is not defined during aot_export_joint_with_descriptors. gemma3-4b's StaticCache call pattern triggers it deterministically (every search budget, every iter); llama-8b, qwen3-4b, qwen3-moe on the same backend do not.

Disabling automatic_dynamic_shapes forces a fresh-static-trace recompile on each shape mismatch instead of the L-referencing dynamic-shape path.

Cost / why this is WIP, not a fix:

The bench loop calls compiled() with cache_position=[1], [2], [3]… each iter. The shape is constant ([1]) but the value varies. With automatic_dynamic_shapes=False, dynamo recompiles per cache_position *value*, i.e. one full luminal compile per token in the prompt. A search-iters=1 gemma3 smoke takes ~2 hr CPU and pegs at 200 GB host RSS instead of a clean ~30 s. Functional but not shippable as the steady-state path.

Better long-term routes (not in this commit):
- mark cache_position as a static address / specialise it at trace time so dynamo doesn't see value variation.
- handle the L-referencing guards module in pt2.py (inject the expected namespace before aot_export, or strip the guards submodule when re-exporting).
- reuse the SymInt specialisation already in pt2.py (previous commit) and keep automatic_dynamic_shapes=True so the dim becomes a clean symbolic that pt2.py can resolve.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
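The cost described above can be illustrated with a toy model. This is only a sketch: `ToyCompileCache` and its `guard_key` parameter are hypothetical names invented here, and the class is a stand-in for a compile cache in general, not dynamo's real cache or guard machinery. It shows how a cache keyed on the value of `cache_position` compiles once per token, while one keyed only on its shape compiles once in total.

```python
from typing import Callable, Dict, Hashable, List


class ToyCompileCache:
    """Hypothetical stand-in for a compile cache; NOT dynamo's implementation.

    `guard_key` decides what a cached graph is specialised on. A cache miss
    counts as one (expensive) compile.
    """

    def __init__(self, guard_key: Callable[[List[int]], Hashable]) -> None:
        self.guard_key = guard_key
        self.compiled: Dict[Hashable, Callable[[List[int]], List[int]]] = {}
        self.compile_count = 0

    def __call__(self, cache_position: List[int]) -> List[int]:
        key = self.guard_key(cache_position)
        if key not in self.compiled:
            # Cache miss: pretend to run a full backend compile.
            self.compile_count += 1
            self.compiled[key] = lambda pos: [p + 1 for p in pos]
        return self.compiled[key](cache_position)


# Specialised on the value: every new cache_position misses the cache.
by_value = ToyCompileCache(guard_key=lambda pos: tuple(pos))
# Specialised on the shape only: [1], [2], [3], ... all share one entry.
by_shape = ToyCompileCache(guard_key=lambda pos: len(pos))

for token_index in range(1, 9):
    by_value([token_index])
    by_shape([token_index])

print("compiles when keyed on value:", by_value.compile_count)
print("compiles when keyed on shape:", by_shape.compile_count)
```

In this toy, the value-keyed cache grows linearly with the number of tokens, which is the behaviour the long-term routes listed above aim to avoid.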
@@ -1,5 +1,7 @@
 """Luminal Python bindings - PyTorch backend using Luminal."""

+import torch._dynamo
+
 # Import Python components
 # Register DynamicCache pytree serialization once at import time
 from .cache_utils import _register_cache_serialization
@@ -11,6 +13,17 @@ from .main import luminal_backend, register_backend

 _register_cache_serialization()

+# Disable dynamo's automatic-dynamic-shape promotion. On the second compiled
+# call dynamo otherwise promotes any dim that varied to dynamic and emits a
+# `_guards_fn` submodule that closes over `L` (the dynamo locals namespace).
+# When our backend re-exports the FX graph via `torch.export`, that closure's
+# free `L` reference doesn't resolve and we get
+# `NameError: name 'L' is not defined` during aot_export_joint_with_descriptors.
+# Gemma3's StaticCache call pattern triggers it deterministically; llama / qwen
+# don't. Forcing recompile-on-shape-change keeps every call on a static graph
+# the backend can actually translate.
+torch._dynamo.config.automatic_dynamic_shapes = False
+
 # Re-export everything for clean package interface
 __all__ = [
     "CompiledModel",
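The failure mode named in the comment above (generated source with a free `L` reference that no longer resolves once it is reloaded elsewhere) can be reproduced in plain Python. This is a toy analogy only: `GENERATED_GUARD_SOURCE`, `load_guard`, and `guards_fn` are hypothetical names made up for illustration and do not reflect the source dynamo actually emits or how pt2.py loads it. The second half sketches the "inject the expected namespace" idea from the commit message.

```python
# Hypothetical generated source: the function body refers to a free name `L`
# that it expects to find in its module-level namespace.
GENERATED_GUARD_SOURCE = (
    "def guards_fn(position):\n"
    "    return L['cache_position'] == position\n"
)


def load_guard(source: str, namespace: dict):
    """Execute `source` in a fresh scope seeded with `namespace`."""
    scope = dict(namespace)
    exec(source, scope)  # defining the function succeeds either way
    return scope["guards_fn"]


# Reloaded without the namespace it was generated against: `L` is looked up
# only when the function is called, and the lookup fails.
bare_guard = load_guard(GENERATED_GUARD_SOURCE, {})
try:
    bare_guard(3)
    error_message = None
except NameError as exc:
    error_message = str(exc)

# Reloaded with the expected namespace injected: the free name resolves.
patched_guard = load_guard(GENERATED_GUARD_SOURCE, {"L": {"cache_position": 3}})
patched_result = patched_guard(3)

print("without namespace:", error_message)
print("with namespace:", patched_result)
```

Note that the error surfaces at call time rather than at definition time, which matches the general shape of the problem: the source loads without complaint and fails only once something evaluates it.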