Closed
Description
- Tensor::is_sequential() may need stricter conditions. (Minor fixes #111)
- Executor::tensor_memcpy_host_to_device() will cause an unknown error if the tensor on the host is not sequential. We need more checks on host tensors, or maybe a Python wrapper for this (Improve Python interfaces #48)
- Sometimes, if the tensor is padded, the allgather operation might overwrite the recv tensor, and the allreduce tensor will also be incorrect. (@chhwang: now send/recv checks contiguity)
- The current layernorm and softmax operations are scheduled in a rather hacky way and might need further updates in the future. (Minor updates #59)
- ark.init() is not working. (Fix init #39)
- Layernorm needs a recv dependency at its output (@chhwang: it already has)
- ARK environments are not working for Python (Fix scheduler bugs #54)
- [ ] Support both source and destination offsets in NetIbQp::stage_send() (moved to the next version)
- Remove a misleading error message (Minor fixes #111). Lines 19 to 23 in 420c236:
    if (input->ndims() > 1) {
        LOG(INFO, "warning: if the send tensor if not contiguous, the all_gather "
                  "may not work correctly");
    }
- ops_matmul_test.cc is not checking error rates correctly (New unit test framework for operators #91)
- send_mm and recv_mm are temporarily broken (Revisions around schedulers #52)
- When using python -m unittest discover -s . -p "test_*.py" to run all unit tests, the sendrecv test fails, but when the tests are run separately, there is no problem. It seems that in some cases the previous runtime context is not destroyed when one unit test finishes and another starts. This problem also exists in the current main branch. (@chhwang: this is the test code's issue, won't fix for now)
- matmul test failed for matmul larger than 128, 2048, 1024 (Fix scheduler bugs #54)
- [ ] Offsets of importing/exporting tensors are not properly handled (moved to the next version)
- The matmul unit test failed for test_matmul_transpose (Fix a bug in the matmul operator transpose option #94)
- The float matmul error rate seems too high, but it's unclear whether this is ARK's issue or the test code's issue (@chhwang: this is not an issue)
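
Several items above (Tensor::is_sequential(), Executor::tensor_memcpy_host_to_device(), padded tensors in allgather) come down to whether a padded tensor occupies one gap-free memory range. A minimal Python sketch of a stricter check, assuming row-major layout, could look like the following. The function name and the `shape`/`ldims` parameters are illustrative assumptions, not ARK's actual API or its current definition of "sequential".

```python
from typing import Sequence


def is_sequential(shape: Sequence[int], ldims: Sequence[int]) -> bool:
    """Return True if a row-major tensor occupies one gap-free memory range.

    shape: logical extent of each dimension.
    ldims: allocated (possibly padded) extent of each dimension.
    Both parameter names are assumptions made for this sketch.
    """
    if len(shape) != len(ldims):
        raise ValueError("shape and ldims must have the same rank")
    if any(s > l for s, l in zip(shape, ldims)):
        raise ValueError("shape must fit inside ldims")
    # Leading dimensions of extent 1 select a single slice, so padding
    # there cannot introduce gaps between elements.
    first = 0
    while first < len(shape) and shape[first] == 1:
        first += 1
    # Every dimension inside the first non-trivial one must be unpadded,
    # otherwise consecutive rows are separated by padding elements.
    return all(s == l for s, l in zip(shape[first + 1:], ldims[first + 1:]))


# Padding only in the outermost dimension keeps the data in one range.
outer_padded = is_sequential((4, 8), (8, 8))
# Padding in an inner dimension leaves gaps between rows.
inner_padded = is_sequential((4, 6), (4, 8))
```

A host-side wrapper could run a check of this kind before a host-to-device copy and either reject the tensor or pack it into a contiguous buffer first.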