Revert "[Unity] Implement relax.transform.KillAfterLastUse (#15810)" #15852
Conversation
)" This reverts commit aa4587f. Unfortunately, this PR broke MLC LLM's build pipeline, more specifically, the command below ```python python3 -m mlc_llm.build --model dist/models/llama-2-13b-chat-hf/ --quantization q4f16_1 ``` leads to the the following error message: <details> ``` Traceback (most recent call last): File "<frozen runpy>", line 198, in _run_module_as_main File "<frozen runpy>", line 88, in _run_code File "/opt/scratch/junrushao/mlc-llm/mlc_llm/build.py", line 13, in <module> main() File "/opt/scratch/junrushao/mlc-llm/mlc_llm/build.py", line 10, in main core.build_model_from_args(parsed_args) File "/opt/scratch/junrushao/mlc-llm/mlc_llm/core.py", line 616, in build_model_from_args new_params = utils.convert_weights(param_manager, params, args) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/opt/scratch/junrushao/mlc-llm/mlc_llm/utils.py", line 258, in convert_weights vm["transform_params"]() File "/opt/scratch/junrushao/tvm-dev/python/tvm/_ffi/_ctypes/packed_func.py", line 239, in __call__ raise_last_ffi_error() File "/opt/scratch/junrushao/tvm-dev/python/tvm/_ffi/base.py", line 476, in raise_last_ffi_error raise py_err File "/opt/scratch/junrushao/tvm-dev/src/runtime/relax_vm/vm.cc", line 634, in tvm::runtime::relax_vm::VirtualMachineImpl::InvokeClosurePacked(tvm::runtime::ObjectRef const&, tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) clo->impl.CallPacked(TVMArgs(values.data(), tcodes.data(), args.size() + 1), rv); ^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/opt/scratch/junrushao/tvm-dev/src/runtime/relax_vm/vm.cc", line 708, in operator() *rv = static_cast<VirtualMachineImpl*>(ctx_ptr)->InvokeBytecode(gf_idx, inputs); ^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/opt/scratch/junrushao/tvm-dev/src/runtime/relax_vm/vm.cc", line 765, in tvm::runtime::relax_vm::VirtualMachineImpl::InvokeBytecode(long, std::vector<tvm::runtime::TVMRetValue, std::allocator<tvm::runtime::TVMRetValue> > const&) RunLoop(); File "/opt/scratch/junrushao/tvm-dev/src/runtime/relax_vm/vm.cc", line 890, in tvm::runtime::relax_vm::VirtualMachineImpl::RunLoop() this->RunInstrCall(curr_frame, instr); ^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/opt/scratch/junrushao/tvm-dev/src/runtime/relax_vm/vm.cc", line 843, in tvm::runtime::relax_vm::VirtualMachineImpl::RunInstrCall(tvm::runtime::relax_vm::VMFrame*, tvm::runtime::relax_vm::Instruction) this->InvokeClosurePacked(func_pool_[instr.func_idx], args, &ret); ^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/opt/scratch/junrushao/mlc-llm/mlc_llm/relax_model/param_manager.py", line 579, in set_item loaded_params[i] = tvm.nd.array(computed_param, device=device_cpu) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/opt/scratch/junrushao/tvm-dev/python/tvm/runtime/ndarray.py", line 635, in array return empty(arr.shape, arr.dtype, device, mem_scope).copyfrom(arr) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/opt/scratch/junrushao/tvm-dev/python/tvm/runtime/ndarray.py", line 390, in empty dtype = DataType(dtype) ^^^^^^^^^^^^^^^ File "/opt/scratch/junrushao/tvm-dev/python/tvm/_ffi/runtime_ctypes.py", line 174, in __init__ raise ValueError("Do not know how to handle type %s" % type_str) ValueError: Do not know how to handle type object ``` </details> To briefly explain the root cause of this issue, `set_item` method, as defined [here](https://github.com/mlc-ai/mlc-llm/blob/4f4a93f03fed3d900605c02de575d7d5f429ed79/mlc_llm/relax_model/param_manager.py#L576-L579), gets `computed_param=None` after this commit. 
As a temporary solution, I'd like to propose that we revert this commit for now to quickly unblock LLM model builds; I'm happy to bring it back as soon as the issue is fixed.
Thank you, and I've identified why this happens. I'm working on two fixes now, one for the short term and one for the long term. The short-term fix will add similar handling to …
The short-term PR is submitted at #15854, which adds another special case in …
Thanks for quickly acting on this! Let’s get #15854 merged instead!
Sounds good! I now also have #15855 prepared, which applies the more general fix and can be merged at a more leisurely pace.
* [Unity] Ensure one VM register for each relax binding

  Prior to this commit, if a relax variable were bound directly to another relax variable, through either `VarBinding` or `MatchCast` nodes, the two relax variables would share the same register in the VM. As a result, any upstream transform that deletes an object with `R.memory.kill_tensor` or `R.memory.kill_storage` must be aware of this VM behavior, and must output only one such instruction for each set of aliased registers.

  This commit updates the VM to produce one register for each aliased relax variable. The trivial bindings can instead be removed by applying `relax.transform.CanonicalizeBindings`, rather than being implicitly de-duplicated at the codegen level.

  This PR is a follow-up to #15854, with a better long-term solution, but which may have knock-on effects that must also be resolved. In addition, this adds a usage example for the bug reported in #15852, to avoid re-occurrence of similar issues.

* Add unit test to validate that the alias is preserved

* Use the new GetBoundValue utility function

* Update VMTIRCodeGen to also avoid de-duplication of bindings

* Move the callback definition to tvm.relax.testing.vm namespace
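To make the long-term fix concrete, the sketch below shows a trivial re-binding in Relax and how `relax.transform.CanonicalizeBindings` removes it before codegen. The module, variable names, and shapes are hypothetical, chosen only for illustration, and this assumes the unity branch's TVMScript syntax for trivial bindings:

```python
import tvm
from tvm import relax
from tvm.script import relax as R


# Hypothetical module containing a trivial re-binding (y = x), written only to
# illustrate the aliasing discussed above.
@tvm.script.ir_module
class Module:
    @R.function
    def main(x: R.Tensor((2, 3), "float32")) -> R.Tensor((2, 3), "float32"):
        y = x  # trivial binding: y aliases x
        z = R.add(y, y)
        return z


# CanonicalizeBindings rewrites uses of y to x and drops the trivial binding,
# so a pass such as KillAfterLastUse never sees two relax variables that the
# VM would previously have collapsed into a single register.
mod = relax.transform.CanonicalizeBindings()(Module)
print(mod)
```

With the trivial binding canonicalized away, emitting one kill instruction per remaining variable is safe, which is the property the commit above relies on.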