
[Example] SEAL for OGBL#4291

Merged
rudongyu merged 19 commits into dmlc:master from rudongyu:seal_ogbl on Aug 31, 2022

Conversation

@rudongyu
Collaborator

Description

Checklist

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [$CATEGORY] (such as [NN], [Model], [Doc], [Feature])
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage
  • Code is well-documented
  • To the best of my knowledge, examples are either not affected by this change,
    or have been fixed to be compatible with this change
  • Related issue is referenced in this PR
  • If the PR is for a new model/paper, I've updated the example index here.

Changes

  • Add an example for running SEAL on OGBL

TODO for dgl core

  • DRNLTransform
  • Sampling

@dgl-bot
Collaborator

dgl-bot commented Jul 25, 2022

To trigger regression tests:

  • @dgl-bot run [instance-type] [which tests] [compare-with-branch];
    For example: @dgl-bot run g4dn.4xlarge all dmlc/master or @dgl-bot run c5.9xlarge kernel,api dmlc/master

@dgl-bot
Collaborator

dgl-bot commented Jul 25, 2022

Commit ID: fdacec6

Build ID: 1

Status: ✅ CI test succeeded

Report path: link

Full logs path: link

@mufeili mufeili self-requested a review July 25, 2022 08:55
Comment thread examples/README.md Outdated
Comment thread examples/README.md Outdated
Comment thread examples/pytorch/seal_ogbl/README.md
Comment thread examples/pytorch/seal_ogbl/utils.py Outdated
Comment thread examples/pytorch/seal_ogbl/README.md Outdated
Comment thread examples/pytorch/seal_ogbl/README.md Outdated
Comment thread examples/pytorch/seal_ogbl/main.py Outdated
Comment thread examples/pytorch/seal_ogbl/models.py Outdated
Comment thread examples/pytorch/seal_ogbl/main.py Outdated
Comment thread examples/pytorch/seal_ogbl/scripts/run_citation2.sh Outdated
Comment thread examples/pytorch/seal_ogbl/main.py Outdated
Comment thread examples/pytorch/seal_ogbl/utils.py Outdated
Comment thread examples/pytorch/seal_ogbl/main.py Outdated
Comment thread examples/pytorch/seal_ogbl/main.py Outdated
Comment thread examples/pytorch/seal_ogbl/main.py Outdated
Comment thread examples/pytorch/seal_ogbl/main.py Outdated
Comment thread examples/pytorch/seal_ogbl/main.py Outdated
Comment thread examples/pytorch/seal_ogbl/main.py Outdated
    edge_weights = self.edge_weights[EIDs] if self.edge_weights is not None else None
    x = self.node_features[NIDs] if self.node_features is not None else None

    subg_aug = subg.add_self_loop()
Member


With PR #4261, we are able to save L71-73.

Collaborator Author


Not available in the latest release yet. Should we amend it now?

Member


Ok. Let's keep it till the next release.

Comment thread examples/pytorch/seal_ogbl/main.py Outdated
Comment thread examples/pytorch/seal_ogbl/main.py Outdated
Comment thread examples/pytorch/seal_ogbl/main.py Outdated
@mufeili
Member

mufeili commented Jul 26, 2022

Done a first pass

@dgl-bot
Collaborator

dgl-bot commented Aug 1, 2022

Commit ID: aee966b

Build ID: 2

Status: ✅ CI test succeeded

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Aug 23, 2022

Commit ID: 4d8c6a4

Build ID: 10

Status: ✅ CI test succeeded

Report path: link

Full logs path: link

@Ereboas
Contributor

Ereboas commented Aug 23, 2022

I ran main.py on the ogbl-collab dataset but got an exception at line 387, graph = graph.to_simple(copy_edata=True, aggregator='sum').
The error message is: dgl._ffi.base.DGLError: [09:32:16] /opt/dgl/src/array/kernel.cc:399: Check failed: (feat->dtype).code == kDLFloat (

@dgl-bot
Collaborator

dgl-bot commented Aug 24, 2022

Commit ID: 62c5900

Build ID: 11

Status: ✅ CI test succeeded

Report path: link

Full logs path: link

@rudongyu
Collaborator Author

I run main.py on the ogbl-collab dataset, but got an exception at Line 387 graph = graph.to_simple(copy_edata=True, aggregator='sum'). The error message is dgl._ffi.base.DGLError: [09:32:16] /opt/dgl/src/array/kernel.cc:399: Check failed: (feat->dtype).code == kDLFloat (

This was a bug: integer-typed aggregation is not supported in the to_simple transform yet. Fixed.
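For context, the failing call deduplicates ogbl-collab's parallel collaboration edges and sum-aggregates their edge weights; the intended semantics can be sketched in plain Python (the function and variable names below are illustrative, not DGL's API):

```python
from collections import defaultdict

def to_simple_sum(edges, edge_weight):
    """Collapse a multigraph's parallel edges, summing per-edge weights.

    edges: list of (src, dst) pairs, possibly containing duplicates
    edge_weight: one value per edge; integers are fine here, whereas the
    reported DGL bug was that integer dtypes were rejected by the fused
    aggregation kernel backing to_simple(aggregator='sum')
    """
    agg = defaultdict(int)
    for (u, v), w in zip(edges, edge_weight):
        agg[(u, v)] += w  # parallel edges accumulate into one entry
    simple_edges = sorted(agg)
    return simple_edges, [agg[e] for e in simple_edges]

edges = [(0, 1), (0, 1), (1, 2)]
weights = [1, 2, 5]  # e.g. integer collaboration counts
print(to_simple_sum(edges, weights))  # → ([(0, 1), (1, 2)], [3, 5])
```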

@rudongyu
Collaborator Author

The PR is ready for another round of review @jermainewang. However, the data loading and the sampler are implemented in a somewhat complicated way to fit the current dataloader interface. I think it's better to refactor them and move them into dgl core once #4444 is supported.

@dgl-bot
Collaborator

dgl-bot commented Aug 24, 2022

Commit ID: 439482e

Build ID: 12

Status: ✅ CI test succeeded

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Aug 30, 2022

Commit ID: ef37ca9

Build ID: 13

Status: ❌ CI test failed in Stage [Torch GPU].

Report path: link

Full logs path: link

Comment thread examples/pytorch/ogb/seal_ogbl/main.py Outdated
@dgl-bot
Collaborator

dgl-bot commented Aug 31, 2022

Commit ID: dc82fa853a7574dfb70c82351c1e8cf48199a557

Build ID: 14

Status: ✅ CI test succeeded

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Aug 31, 2022

Commit ID: ff67746

Build ID: 15

Status: ✅ CI test succeeded

Report path: link

Full logs path: link

@rudongyu rudongyu merged commit b9290e8 into dmlc:master Aug 31, 2022
@Ereboas
Contributor

Ereboas commented Sep 2, 2022

I simply modified the GPU device to run on, but got an error:

terminate called after throwing an instance of 'c10::CUDAError'
  what():  CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Exception raised from process_events at /opt/conda/conda-bld/pytorch_1656352657443/work/c10/cuda/CUDACachingAllocator.cpp:1470 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x57 (0x7fd6f3fea477 in /opt/conda/envs/pytorch/lib/python3.9/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x25742 (0x7fd7215e3742 in /opt/conda/envs/pytorch/lib/python3.9/site-packages/torch/lib/libc10_cuda.so)
frame #2: <unknown function> + 0x20e61 (0x7fd7215dee61 in /opt/conda/envs/pytorch/lib/python3.9/site-packages/torch/lib/libc10_cuda.so)
frame #3: <unknown function> + 0x41308 (0x7fd7215ff308 in /opt/conda/envs/pytorch/lib/python3.9/site-packages/torch/lib/libc10_cuda.so)
frame #4: <unknown function> + 0x41542 (0x7fd7215ff542 in /opt/conda/envs/pytorch/lib/python3.9/site-packages/torch/lib/libc10_cuda.so)
frame #5: at::detail::empty_generic(c10::ArrayRef<long>, c10::Allocator*, c10::DispatchKeySet, c10::ScalarType, c10::optional<c10::MemoryFormat>) + 0x7bf (0x7fd7228b7daf in /opt/conda/envs/pytorch/lib/python3.9/site-packages/torch/lib/libtorch_cpu.so)
frame #6: at::detail::empty_cuda(c10::ArrayRef<long>, c10::ScalarType, c10::optional<c10::Device>, c10::optional<c10::MemoryFormat>) + 0x115 (0x7fd72ddfdd75 in /opt/conda/envs/pytorch/lib/python3.9/site-packages/torch/lib/libtorch_cuda_cpp.so)
frame #7: at::detail::empty_cuda(c10::ArrayRef<long>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>, c10::optional<c10::MemoryFormat>) + 0x31 (0x7fd72ddfdfd1 in /opt/conda/envs/pytorch/lib/python3.9/site-packages/torch/lib/libtorch_cuda_cpp.so)
frame #8: at::native::empty_cuda(c10::ArrayRef<long>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>, c10::optional<c10::MemoryFormat>) + 0x1f (0x7fd72dedc82f in /opt/conda/envs/pytorch/lib/python3.9/site-packages/torch/lib/libtorch_cuda_cpp.so)
frame #9: <unknown function> + 0x2b57e28 (0x7fd6f6b96e28 in /opt/conda/envs/pytorch/lib/python3.9/site-packages/torch/lib/libtorch_cuda_cu.so)
frame #10: <unknown function> + 0x2b57e9b (0x7fd6f6b96e9b in /opt/conda/envs/pytorch/lib/python3.9/site-packages/torch/lib/libtorch_cuda_cu.so)
frame #11: at::_ops::empty_memory_format::redispatch(c10::DispatchKeySet, c10::ArrayRef<long>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>, c10::optional<c10::MemoryFormat>) + 0xe3 (0x7fd723385c83 in /opt/conda/envs/pytorch/lib/python3.9/site-packages/torch/lib/libtorch_cpu.so)
frame #12: <unknown function> + 0x200341f (0x7fd72361541f in /opt/conda/envs/pytorch/lib/python3.9/site-packages/torch/lib/libtorch_cpu.so)
frame #13: at::_ops::empty_memory_format::call(c10::ArrayRef<long>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>, c10::optional<c10::MemoryFormat>) + 0x1b7 (0x7fd7233c3657 in /opt/conda/envs/pytorch/lib/python3.9/site-packages/torch/lib/libtorch_cpu.so)
frame #14: at::empty(c10::ArrayRef<long>, c10::TensorOptions, c10::optional<c10::MemoryFormat>) + 0xb1 (0x7fd6dda1c20f in /home/ubuntu/.local/lib/python3.9/site-packages/dgl/tensoradapter/pytorch/libtensoradapter_pytorch_1.12.0.so)
frame #15: torch::empty(c10::ArrayRef<long>, c10::TensorOptions, c10::optional<c10::MemoryFormat>) + 0x95 (0x7fd6dda1ddcc in /home/ubuntu/.local/lib/python3.9/site-packages/dgl/tensoradapter/pytorch/libtensoradapter_pytorch_1.12.0.so)
frame #16: TAempty + 0x119 (0x7fd6dda19a38 in /home/ubuntu/.local/lib/python3.9/site-packages/dgl/tensoradapter/pytorch/libtensoradapter_pytorch_1.12.0.so)
frame #17: dgl::runtime::NDArray::Empty(std::vector<long, std::allocator<long> >, DLDataType, DLContext) + 0xb6 (0x7fd6a3168f46 in /home/ubuntu/.local/lib/python3.9/site-packages/dgl/libdgl.so)
frame #18: dgl::aten::NewIdArray(long, DLContext, unsigned char) + 0x6d (0x7fd6a2ded90d in /home/ubuntu/.local/lib/python3.9/site-packages/dgl/libdgl.so)
frame #19: dgl::runtime::NDArray dgl::aten::impl::Range<(DLDeviceType)2, long>(long, long, DLContext) + 0x9a (0x7fd6a331058a in /home/ubuntu/.local/lib/python3.9/site-packages/dgl/libdgl.so)
frame #20: dgl::aten::Range(long, long, unsigned char, DLContext) + 0x1fd (0x7fd6a2dedc9d in /home/ubuntu/.local/lib/python3.9/site-packages/dgl/libdgl.so)
frame #21: dgl::UnitGraph::COO::Edges(unsigned long, std::string const&) const + 0x9b (0x7fd6a32c058b in /home/ubuntu/.local/lib/python3.9/site-packages/dgl/libdgl.so)
frame #22: dgl::UnitGraph::Edges(unsigned long, std::string const&) const + 0xa1 (0x7fd6a32bb251 in /home/ubuntu/.local/lib/python3.9/site-packages/dgl/libdgl.so)
frame #23: dgl::HeteroGraph::Edges(unsigned long, std::string const&) const + 0x2a (0x7fd6a31bac7a in /home/ubuntu/.local/lib/python3.9/site-packages/dgl/libdgl.so)
frame #24: <unknown function> + 0x73af6c (0x7fd6a31c3f6c in /home/ubuntu/.local/lib/python3.9/site-packages/dgl/libdgl.so)
frame #25: DGLFuncCall + 0x48 (0x7fd6a3147548 in /home/ubuntu/.local/lib/python3.9/site-packages/dgl/libdgl.so)
frame #26: <unknown function> + 0x162ac (0x7fd6dd2152ac in /home/ubuntu/.local/lib/python3.9/site-packages/dgl/_ffi/_cy3/core.cpython-39-x86_64-linux-gnu.so)
frame #27: <unknown function> + 0x167db (0x7fd6dd2157db in /home/ubuntu/.local/lib/python3.9/site-packages/dgl/_ffi/_cy3/core.cpython-39-x86_64-linux-gnu.so)
frame #28: _PyObject_MakeTpCall + 0x347 (0x56361a329fa7 in /opt/conda/envs/pytorch/bin/python)
frame #29: <unknown function> + 0x69091 (0x56361a25e091 in /opt/conda/envs/pytorch/bin/python)
frame #30: <unknown function> + 0x12743 (0x7fd8265cc743 in /home/ubuntu/.vscode-server/extensions/ms-python.python-2022.12.1/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_frame_eval/pydevd_frame_evaluator.cpython-39-x86_64-linux-gnu.so)
frame #31: <unknown function> + 0x12aa17 (0x56361a31fa17 in /opt/conda/envs/pytorch/bin/python)
frame #32: <unknown function> + 0x14c328 (0x56361a341328 in /opt/conda/envs/pytorch/bin/python)
frame #33: <unknown function> + 0x1e5ed4 (0x56361a3daed4 in /opt/conda/envs/pytorch/bin/python)
frame #34: <unknown function> + 0x695e1 (0x56361a25e5e1 in /opt/conda/envs/pytorch/bin/python)
frame #35: <unknown function> + 0x12743 (0x7fd8265cc743 in /home/ubuntu/.vscode-server/extensions/ms-python.python-2022.12.1/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_frame_eval/pydevd_frame_evaluator.cpython-39-x86_64-linux-gnu.so)
frame #36: <unknown function> + 0x12aa17 (0x56361a31fa17 in /opt/conda/envs/pytorch/bin/python)
frame #37: <unknown function> + 0x14c328 (0x56361a341328 in /opt/conda/envs/pytorch/bin/python)
frame #38: PyObject_Call + 0xb4 (0x56361a341a84 in /opt/conda/envs/pytorch/bin/python)
frame #39: _PyEval_EvalFrameDefault + 0x39a6 (0x56361a324396 in /opt/conda/envs/pytorch/bin/python)
frame #40: <unknown function> + 0x12743 (0x7fd8265cc743 in /home/ubuntu/.vscode-server/extensions/ms-python.python-2022.12.1/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_frame_eval/pydevd_frame_evaluator.cpython-39-x86_64-linux-gnu.so)
frame #41: <unknown function> + 0x12aa17 (0x56361a31fa17 in /opt/conda/envs/pytorch/bin/python)
frame #42: _PyFunction_Vectorcall + 0xb9 (0x56361a331ff9 in /opt/conda/envs/pytorch/bin/python)
frame #43: _PyObject_FastCallDictTstate + 0x1a5 (0x56361a329745 in /opt/conda/envs/pytorch/bin/python)
frame #44: _PyObject_Call_Prepend + 0x69 (0x56361a33e5e9 in /opt/conda/envs/pytorch/bin/python)
frame #45: <unknown function> + 0x21ea85 (0x56361a413a85 in /opt/conda/envs/pytorch/bin/python)
frame #46: _PyObject_MakeTpCall + 0x347 (0x56361a329fa7 in /opt/conda/envs/pytorch/bin/python)
frame #47: <unknown function> + 0x683df (0x56361a25d3df in /opt/conda/envs/pytorch/bin/python)
frame #48: <unknown function> + 0x12743 (0x7fd8265cc743 in /home/ubuntu/.vscode-server/extensions/ms-python.python-2022.12.1/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_frame_eval/pydevd_frame_evaluator.cpython-39-x86_64-linux-gnu.so)
frame #49: <unknown function> + 0x12aa17 (0x56361a31fa17 in /opt/conda/envs/pytorch/bin/python)
frame #50: _PyFunction_Vectorcall + 0xb9 (0x56361a331ff9 in /opt/conda/envs/pytorch/bin/python)
frame #51: <unknown function> + 0x1e5ed4 (0x56361a3daed4 in /opt/conda/envs/pytorch/bin/python)
frame #52: <unknown function> + 0x695e1 (0x56361a25e5e1 in /opt/conda/envs/pytorch/bin/python)
frame #53: <unknown function> + 0x12743 (0x7fd8265cc743 in /home/ubuntu/.vscode-server/extensions/ms-python.python-2022.12.1/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_frame_eval/pydevd_frame_evaluator.cpython-39-x86_64-linux-gnu.so)
frame #54: <unknown function> + 0x12aa17 (0x56361a31fa17 in /opt/conda/envs/pytorch/bin/python)
frame #55: _PyEval_EvalCodeWithName + 0x47 (0x56361a31f6d7 in /opt/conda/envs/pytorch/bin/python)
frame #56: PyEval_EvalCodeEx + 0x39 (0x56361a31f689 in /opt/conda/envs/pytorch/bin/python)
frame #57: PyEval_EvalCode + 0x1b (0x56361a3dae3b in /opt/conda/envs/pytorch/bin/python)
frame #58: <unknown function> + 0x1ea8fd (0x56361a3df8fd in /opt/conda/envs/pytorch/bin/python)
frame #59: <unknown function> + 0x13d991 (0x56361a332991 in /opt/conda/envs/pytorch/bin/python)
frame #60: <unknown function> + 0x1e5ed4 (0x56361a3daed4 in /opt/conda/envs/pytorch/bin/python)
frame #61: <unknown function> + 0x69091 (0x56361a25e091 in /opt/conda/envs/pytorch/bin/python)
frame #62: <unknown function> + 0x12743 (0x7fd8265cc743 in /home/ubuntu/.vscode-server/extensions/ms-python.python-2022.12.1/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_frame_eval/pydevd_frame_evaluator.cpython-39-x86_64-linux-gnu.so)
frame #63: <unknown function> + 0x12aa17 (0x56361a31fa17 in /opt/conda/envs/pytorch/bin/python)

The problem happened at lines 466-471:

    for subgs, _ in train_loader:
        subgs = dgl.unbatch(subgs)
        if len(num_nodes) > 1000:
            break
        for subg in subgs:
            num_nodes.append(subg.num_nodes())

I tried decreasing batch_size/num_workers, but it didn't help.
Then I monitored GPU utilization and found that even when I set CUDA:1 as the device, CUDA:0 shows abnormal, non-negligible memory usage.

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.73.08    Driver Version: 510.73.08    CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla V100-SXM2...  On   | 00000000:00:17.0 Off |                    0 |
| N/A   47C    P0    60W / 300W |    728MiB / 16384MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  Tesla V100-SXM2...  On   | 00000000:00:18.0 Off |                    0 |
| N/A   42C    P0    57W / 300W |    898MiB / 16384MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   2  Tesla V100-SXM2...  On   | 00000000:00:19.0 Off |                    0 |
| N/A   41C    P0    44W / 300W |      3MiB / 16384MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   3  Tesla V100-SXM2...  On   | 00000000:00:1A.0 Off |                    0 |
| N/A   44C    P0    46W / 300W |      3MiB / 16384MiB |      0%      Default |
|                               |                      |                  N/A |
......

My environment is:

dgl-cu116: 0.9.0
python: 3.9
pytorch: 1.12.0

Does anyone know how to solve this issue?

@mufeili
Member

mufeili commented Sep 4, 2022

I simply modified the GPU device to run on, but got an error: […]

How did you trigger this? @rudongyu for awareness
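One mitigation often suggested for illegal-memory-access crashes when a non-default GPU is selected is to set CUDA_VISIBLE_DEVICES before any CUDA context is created, so that in-process device 0 maps to the intended physical card and nothing can accidentally touch GPU 0. The helper below is a hypothetical sketch of just the mapping logic; the actual workaround is setting the environment variable before launching main.py:

```python
import os

# Hypothetical setup: expose only physical GPU 1 to this process. Inside the
# process it then appears as device index 0, so code that hard-codes
# "cuda:0" lands on the intended card.
os.environ["CUDA_VISIBLE_DEVICES"] = "1"

def visible_devices():
    """Parse CUDA_VISIBLE_DEVICES into the list of physical GPU indices."""
    raw = os.environ.get("CUDA_VISIBLE_DEVICES", "")
    return [int(tok) for tok in raw.split(",") if tok.strip()]

print(visible_devices())  # → [1]
```

Equivalently, launch with `CUDA_VISIBLE_DEVICES=1 python main.py` instead of passing a cuda:1 device string into the script.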

frozenbugs added a commit that referenced this pull request Sep 14, 2022
* Auto fix update-version

* Auto fix setup.py

* Auto fix update-version

* Auto fix setup.py

* [Doc] Change random.py to random_partition.py in guide on distributed partition pipeline (#4438)

* Update distributed-preprocessing.rst

* Update

Co-authored-by: Ubuntu <ubuntu@ip-172-31-9-26.ap-northeast-1.compute.internal>

* fix unpinning when tensoradaptor is not available (#4450)

* [Doc] fix print issue in tutorial (#4459)

* [Example][Refactor] Refactor RGCN example (#4327)

* Refactor full graph entity classification

* Refactor rgcn with sampling

* README update

* Update

* Results update

* Respect default setting of self_loop=false in entity.py

* Update

* Update README

* Update for multi-gpu

* Update

* [doc] fix invalid link in user guide (#4468)

* [Example] directional_GSN for ogbg-molpcba (#4405)

* version-1

* version-2

* version-3

* update examples/README

* Update .gitignore

* update performance in README, delete scripts

* 1st approving review

* 2nd approving review

Co-authored-by: Mufei Li <mufeili1996@gmail.com>

* Clarify the message name, which is 'm'. (#4462)

Co-authored-by: Ubuntu <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
Co-authored-by: Rhett Ying <85214957+Rhett-Ying@users.noreply.github.com>

* [Refactor] Auto fix view.py. (#4461)

Co-authored-by: Ubuntu <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>

* [Example] SEAL for OGBL (#4291)

* [Example] SEAL for OGBL

* update index

* update

* fix readme typo

* add seal sampler

* modify set ops

* prefetch

* efficiency test

* update

* optimize

* fix ScatterAdd dtype issue

* update sampler style

* update

Co-authored-by: Quan Gan <coin2028@hotmail.com>
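The SEAL commits above add a sampler and a DRNL transform. Double-Radius Node Labeling (DRNL), the node-labeling scheme from the SEAL paper, can be sketched in plain Python as follows. This is an illustrative sketch, not the code merged in this PR; the adjacency-dict graph format and the helper names are assumptions:

```python
from collections import deque

def bfs_dist(adj, src, blocked=None):
    # BFS shortest-path distances from src, never passing through `blocked`.
    dist = {src: 0}
    q = deque([src])
    while q:
        x = q.popleft()
        for y in adj.get(x, ()):
            if y != blocked and y not in dist:
                dist[y] = dist[x] + 1
                q.append(y)
    return dist

def drnl_labels(adj, u, v):
    """Double-Radius Node Labeling for the target link (u, v).

    Each node i is labeled from its distances (du, dv) to the two endpoints,
    where du is computed with v masked out and vice versa:
        f(i) = 1 + min(du, dv) + (d // 2) * (d // 2 + d % 2 - 1), with d = du + dv
    The endpoints themselves get label 1; nodes unreachable from either
    endpoint get label 0.
    """
    du_map = bfs_dist(adj, u, blocked=v)
    dv_map = bfs_dist(adj, v, blocked=u)
    labels = {}
    for i in adj:
        if i in (u, v):
            labels[i] = 1
            continue
        du, dv = du_map.get(i), dv_map.get(i)
        if du is None or dv is None:
            labels[i] = 0
            continue
        d = du + dv
        labels[i] = 1 + min(du, dv) + (d // 2) * (d // 2 + d % 2 - 1)
    return labels
```

For example, on a triangle {0, 1, 2} with target link (0, 1), node 2 sits at distance 1 from both endpoints and receives label 2; more distant nodes receive strictly larger labels, which is what lets the GNN distinguish structural roles around the candidate link.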

* [CI] use https instead of http (#4488)

* [BugFix] fix crash due to incorrect dtype in dgl.to_block() (#4487)

* [BugFix] fix crash due to incorrect dtype in dgl.to_block()

* fix test failure in TF

* [Feature] Make TensorAdapter Stream Aware (#4472)

* Allocate tensors in DGL's current stream

* make tensoradaptor stream-aware

* replace TAEmpty with cpu allocator

* fix typo

* try fix cpu allocation

* clean header

* redirect AllocDataSpace as well

* resolve comments

* [Build][Doc] Specify the sphinx version (#4465)

Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>

* reformat

* reformat

* Auto fix update-version

* Auto fix setup.py

* reformat

* reformat

Co-authored-by: Ubuntu <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
Co-authored-by: Rhett Ying <85214957+Rhett-Ying@users.noreply.github.com>
Co-authored-by: Mufei Li <mufeili1996@gmail.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-9-26.ap-northeast-1.compute.internal>
Co-authored-by: Xin Yao <xiny@nvidia.com>
Co-authored-by: Chang Liu <chang.liu@utexas.edu>
Co-authored-by: Zhiteng Li <55398076+ZHITENGLI@users.noreply.github.com>
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>
Co-authored-by: rudongyu <ru_dongyu@outlook.com>
Co-authored-by: Quan Gan <coin2028@hotmail.com>
mufeili added a commit that referenced this pull request Sep 19, 2022
* [Example][Refactor] Refactor graphsage multigpu and full-graph example (#4430)

* Add refactors for multi-gpu and full-graph example

* Fix format

* Update

* Update

* Update

* [Cleanup] Remove async_transferer (#4505)

* Remove async_transferer

* remove test

* Remove AsyncTransferer

Co-authored-by: Xin Yao <xiny@nvidia.com>
Co-authored-by: Xin Yao <yaox12@outlook.com>

* [Cleanup] Remove duplicate entries of CUB submodule   (issue# 4395) (#4499)

* remove third_part/cub

* remove from third_party

Co-authored-by: Israt Nisa <nisisrat@amazon.com>
Co-authored-by: Xin Yao <xiny@nvidia.com>

* [Bug] Enable turn on/off libxsmm at runtime (#4455)

* enable turn on/off libxsmm at runtime by adding a global config and related API


Co-authored-by: Ubuntu <ubuntu@ip-172-31-19-194.ap-northeast-1.compute.internal>

* [Feature] Unify the cuda stream used in core library (#4480)

* Use an internal cuda stream for CopyDataFromTo

* small fix white space

* Fix to compile

* Make stream optional in copydata for compile

* fix lint issue

* Update cub functions to use internal stream

* Lint check

* Update CopyTo/CopyFrom/CopyFromTo to use internal stream

* Address comments

* Fix backward CUDA stream

* Avoid overloading CopyFromTo()

* Minor comment update

* Overload copydatafromto in cuda device api

Co-authored-by: xiny <xiny@nvidia.com>

* [Feature] Added exclude_self and output_batch to knn graph construction (Issues #4323 #4316) (#4389)

* * Added "exclude_self" and "output_batch" options to knn_graph and segmented_knn_graph
* Updated out-of-date comments on remove_edges and remove_self_loop, since they now preserve batch information

* * Changed defaults on new knn_graph and segmented_knn_graph function parameters, for compatibility; pytorch/test_geometry.py was failing

* * Added test to ensure dgl.remove_self_loop function correctly updates batch information

* * Added new knn_graph and segmented_knn_graph parameters to dgl.nn.KNNGraph and dgl.nn.SegmentedKNNGraph

* * Formatting

* * Oops, I missed the one in segmented_knn_graph when I fixed the similar thing in knn_graph

* * Fixed edge case handling when invalid k specified, since it still needs to be handled consistently for tests to pass
* Fixed context of batch info, since it must match the context of the input position data for remove_self_loop to succeed

* * Fixed batch info resulting from knn_graph when output_batch is true, for case of 3D input tensor, representing multiple segments

* * Added testing of new exclude_self and output_batch parameters on knn_graph and segmented_knn_graph, and their wrappers, KNNGraph and SegmentedKNNGraph, into the test_knn_cuda test

* * Added doc comments for new parameters

* * Added correct handling for uncommon case of k or more coincident points when excluding self edges in knn_graph and segmented_knn_graph
* Added test cases for more than k coincident points

* * Updated doc comments for output_batch parameters for clarity

* * Linter formatting fixes

* * Extracted out common function for test_knn_cpu and test_knn_cuda, to add the new test cases to test_knn_cpu

* * Rewording in doc comments

* * Removed output_batch parameter from knn_graph and segmented_knn_graph, in favour of always setting the batch information, except in knn_graph if x is a 2D tensor

Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>

* [CI] only known devs are authorized to trigger CI (#4518)

* [CI] only known devs are authorized to trigger CI

* fix if author is null

* add comments

* [Readability] Auto fix setup.py and update-version.py (#4446)

* Auto fix update-version

* Auto fix setup.py

* Auto fix update-version

* Auto fix setup.py

* [Doc] Change random.py to random_partition.py in guide on distributed partition pipeline (#4438)

* Update distributed-preprocessing.rst

* Update

Co-authored-by: Ubuntu <ubuntu@ip-172-31-9-26.ap-northeast-1.compute.internal>

* fix unpinning when tensoradaptor is not available (#4450)

* [Doc] fix print issue in tutorial (#4459)

* [Example][Refactor] Refactor RGCN example (#4327)

* Refactor full graph entity classification

* Refactor rgcn with sampling

* README update

* Update

* Results update

* Respect default setting of self_loop=false in entity.py

* Update

* Update README

* Update for multi-gpu

* Update

* [doc] fix invalid link in user guide (#4468)

* [Example] directional_GSN for ogbg-molpcba (#4405)

* version-1

* version-2

* version-3

* update examples/README

* Update .gitignore

* update performance in README, delete scripts

* 1st approving review

* 2nd approving review

Co-authored-by: Mufei Li <mufeili1996@gmail.com>

* Clarify the message name, which is 'm'. (#4462)

Co-authored-by: Ubuntu <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
Co-authored-by: Rhett Ying <85214957+Rhett-Ying@users.noreply.github.com>

* [Refactor] Auto fix view.py. (#4461)

Co-authored-by: Ubuntu <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>

* [Example] SEAL for OGBL (#4291)

* [Example] SEAL for OGBL

* update index

* update

* fix readme typo

* add seal sampler

* modify set ops

* prefetch

* efficiency test

* update

* optimize

* fix ScatterAdd dtype issue

* update sampler style

* update

Co-authored-by: Quan Gan <coin2028@hotmail.com>

* [CI] use https instead of http (#4488)

* [BugFix] fix crash due to incorrect dtype in dgl.to_block() (#4487)

* [BugFix] fix crash due to incorrect dtype in dgl.to_block()

* fix test failure in TF

* [Feature] Make TensorAdapter Stream Aware (#4472)

* Allocate tensors in DGL's current stream

* make tensoradaptor stream-aware

* replace TAEmpty with cpu allocator

* fix typo

* try fix cpu allocation

* clean header

* redirect AllocDataSpace as well

* resolve comments

* [Build][Doc] Specify the sphinx version (#4465)

Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>

* reformat

* reformat

* Auto fix update-version

* Auto fix setup.py

* reformat

* reformat

Co-authored-by: Ubuntu <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
Co-authored-by: Rhett Ying <85214957+Rhett-Ying@users.noreply.github.com>
Co-authored-by: Mufei Li <mufeili1996@gmail.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-9-26.ap-northeast-1.compute.internal>
Co-authored-by: Xin Yao <xiny@nvidia.com>
Co-authored-by: Chang Liu <chang.liu@utexas.edu>
Co-authored-by: Zhiteng Li <55398076+ZHITENGLI@users.noreply.github.com>
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>
Co-authored-by: rudongyu <ru_dongyu@outlook.com>
Co-authored-by: Quan Gan <coin2028@hotmail.com>

* Move mock version of dgl_sparse library to DGL main repo (#4524)

* init

* Add api doc for sparse library

* support op between matrices with different sparsity

* Fixed docstring

* addresses comments

* lint check

* change keyword format to fmt

Co-authored-by: Israt Nisa <nisisrat@amazon.com>

* [DistPart] expose timeout config for process group (#4532)

* [DistPart] expose timeout config for process group

* refine code

* Update tools/distpartitioning/data_proc_pipeline.py

Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>

Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>

* [Feature] Import PyTorch's CUDA stream management (#4503)

* add set_stream

* add .record_stream for NDArray and HeteroGraph

* refactor dgl stream Python APIs

* test record_stream

* add unit test for record stream

* use pytorch's stream

* fix lint

* fix cpu build

* address comments

* address comments

* add record stream tests for dgl.graph

* record frames and update dataloader

* add docstring

* update frame

* add backend check for record_stream

* remove CUDAThreadEntry::stream

* record stream for newly created formats

* fix bug

* fix cpp test

* fix None c_void_p to c_handle

* [examples] reduce memory consumption (#4558)

* [examples] reduce memory consumption

* refine help message

* refine

* [Feature][REVIEW] Enable DGL cugraph nightly CI (#4525)

* Added cugraph nightly scripts

* Removed nvcr.io//nvidia/pytorch:22.04-py3 reference

Co-authored-by: Rhett Ying <85214957+Rhett-Ying@users.noreply.github.com>

* Revert "[Feature][REVIEW] Enable DGL cugraph nightly CI (#4525)" (#4563)

This reverts commit ec171c6.

* [Misc] Add flake8 lint workflow. (#4566)

* Add pyproject.toml for autopep8.

* Add pyproject.toml for autopep8.

* Add flake8 annotation in workflow.

* remove

* add

* clean up

Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* [Misc] Try using official pylint workflow. (#4568)

* polish update_version

* update pylint workflow.

* add

* revert.

Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* [CI] refine stage logic (#4565)

* [CI] refine stage logic

* refine

* refine

* remove (#4570)

Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* Add Pylint workflow for flake8. (#4571)

* remove

* Add pylint.

Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* [Misc] Update the python version in Pylint workflow for flake8. (#4572)

* remove

* Add pylint.

* Change the python version for pylint.

Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* Update pylint. (#4574)

Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* [Misc] Use another workflow. (#4575)

* Update pylint.

* Use another workflow.

Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* Update pylint. (#4576)

Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* Update pylint.yml

* Update pylint.yml

* Delete pylint.yml

* [Misc]Add pyproject.toml for autopep8 & black. (#4543)

* Add pyproject.toml for autopep8.

* Add pyproject.toml for autopep8.

Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* [Feature] Bump DLPack to v0.7 and decouple DLPack from the core library (#4454)

* rename `DLContext` to `DGLContext`

* rename `kDLGPU` to `kDLCUDA`

* replace DLTensor with DGLArray

* fix linting

* Unify DGLType and DLDataType to DGLDataType

* Fix FFI

* rename DLDeviceType to DGLDeviceType

* decouple dlpack from the core library

* fix bug

* fix lint

* fix merge

* fix build

* address comments

* rename dl_converter to dlpack_convert

* remove redundant comments

Co-authored-by: Chang Liu <chang.liu@utexas.edu>
Co-authored-by: nv-dlasalle <63612878+nv-dlasalle@users.noreply.github.com>
Co-authored-by: Xin Yao <xiny@nvidia.com>
Co-authored-by: Xin Yao <yaox12@outlook.com>
Co-authored-by: Israt Nisa <neesha295@gmail.com>
Co-authored-by: Israt Nisa <nisisrat@amazon.com>
Co-authored-by: peizhou001 <110809584+peizhou001@users.noreply.github.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-19-194.ap-northeast-1.compute.internal>
Co-authored-by: ndickson-nvidia <99772994+ndickson-nvidia@users.noreply.github.com>
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>
Co-authored-by: Rhett Ying <85214957+Rhett-Ying@users.noreply.github.com>
Co-authored-by: Hongzhi (Steve), Chen <chenhongzhi.nkcs@gmail.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-9-26.ap-northeast-1.compute.internal>
Co-authored-by: Zhiteng Li <55398076+ZHITENGLI@users.noreply.github.com>
Co-authored-by: rudongyu <ru_dongyu@outlook.com>
Co-authored-by: Quan Gan <coin2028@hotmail.com>
Co-authored-by: Vibhu Jawa <vibhujawa@gmail.com>
chang-l added a commit to chang-l/dgl that referenced this pull request Sep 21, 2022
* Auto fix update-version

* Auto fix setup.py

* Auto fix update-version

* Auto fix setup.py

* [Doc] Change random.py to random_partition.py in guide on distributed partition pipeline (dmlc#4438)

* Update distributed-preprocessing.rst

* Update

Co-authored-by: Ubuntu <ubuntu@ip-172-31-9-26.ap-northeast-1.compute.internal>

* fix unpinning when tensoradaptor is not available (dmlc#4450)

* [Doc] fix print issue in tutorial (dmlc#4459)

* [Example][Refactor] Refactor RGCN example (dmlc#4327)

* Refactor full graph entity classification

* Refactor rgcn with sampling

* README update

* Update

* Results update

* Respect default setting of self_loop=false in entity.py

* Update

* Update README

* Update for multi-gpu

* Update

* [doc] fix invalid link in user guide (dmlc#4468)

* [Example] directional_GSN for ogbg-molpcba (dmlc#4405)

* version-1

* version-2

* version-3

* update examples/README

* Update .gitignore

* update performance in README, delete scripts

* 1st approving review

* 2nd approving review

Co-authored-by: Mufei Li <mufeili1996@gmail.com>

* Clarify the message name, which is 'm'. (dmlc#4462)

Co-authored-by: Ubuntu <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
Co-authored-by: Rhett Ying <85214957+Rhett-Ying@users.noreply.github.com>

* [Refactor] Auto fix view.py. (dmlc#4461)

Co-authored-by: Ubuntu <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>

* [Example] SEAL for OGBL (dmlc#4291)

* [Example] SEAL for OGBL

* update index

* update

* fix readme typo

* add seal sampler

* modify set ops

* prefetch

* efficiency test

* update

* optimize

* fix ScatterAdd dtype issue

* update sampler style

* update

Co-authored-by: Quan Gan <coin2028@hotmail.com>

* [CI] use https instead of http (dmlc#4488)

* [BugFix] fix crash due to incorrect dtype in dgl.to_block() (dmlc#4487)

* [BugFix] fix crash due to incorrect dtype in dgl.to_block()

* fix test failure in TF

* [Feature] Make TensorAdapter Stream Aware (dmlc#4472)

* Allocate tensors in DGL's current stream

* make tensoradaptor stream-aware

* replace TAEmpty with cpu allocator

* fix typo

* try fix cpu allocation

* clean header

* redirect AllocDataSpace as well

* resolve comments

* [Build][Doc] Specify the sphinx version (dmlc#4465)

Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>

* reformat

* reformat

* Auto fix update-version

* Auto fix setup.py

* reformat

* reformat

Co-authored-by: Ubuntu <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
Co-authored-by: Rhett Ying <85214957+Rhett-Ying@users.noreply.github.com>
Co-authored-by: Mufei Li <mufeili1996@gmail.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-9-26.ap-northeast-1.compute.internal>
Co-authored-by: Xin Yao <xiny@nvidia.com>
Co-authored-by: Chang Liu <chang.liu@utexas.edu>
Co-authored-by: Zhiteng Li <55398076+ZHITENGLI@users.noreply.github.com>
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>
Co-authored-by: rudongyu <ru_dongyu@outlook.com>
Co-authored-by: Quan Gan <coin2028@hotmail.com>
mufeili added a commit that referenced this pull request Oct 13, 2022
* Update from master (#4584)

* [Example][Refactor] Refactor graphsage multigpu and full-graph example (#4430)

* Add refactors for multi-gpu and full-graph example

* Fix format

* Update

* Update

* Update

* [Cleanup] Remove async_transferer (#4505)

* Remove async_transferer

* remove test

* Remove AsyncTransferer

Co-authored-by: Xin Yao <xiny@nvidia.com>
Co-authored-by: Xin Yao <yaox12@outlook.com>

* [Cleanup] Remove duplicate entries of CUB submodule   (issue# 4395) (#4499)

* remove third_part/cub

* remove from third_party

Co-authored-by: Israt Nisa <nisisrat@amazon.com>
Co-authored-by: Xin Yao <xiny@nvidia.com>

* [Bug] Enable turn on/off libxsmm at runtime (#4455)

* enable turn on/off libxsmm at runtime by adding a global config and related API


Co-authored-by: Ubuntu <ubuntu@ip-172-31-19-194.ap-northeast-1.compute.internal>

* [Feature] Unify the cuda stream used in core library (#4480)

* Use an internal cuda stream for CopyDataFromTo

* small fix white space

* Fix to compile

* Make stream optional in copydata for compile

* fix lint issue

* Update cub functions to use internal stream

* Lint check

* Update CopyTo/CopyFrom/CopyFromTo to use internal stream

* Address comments

* Fix backward CUDA stream

* Avoid overloading CopyFromTo()

* Minor comment update

* Overload copydatafromto in cuda device api

Co-authored-by: xiny <xiny@nvidia.com>

* [Feature] Added exclude_self and output_batch to knn graph construction (Issues #4323 #4316) (#4389)

* * Added "exclude_self" and "output_batch" options to knn_graph and segmented_knn_graph
* Updated out-of-date comments on remove_edges and remove_self_loop, since they now preserve batch information

* * Changed defaults on new knn_graph and segmented_knn_graph function parameters, for compatibility; pytorch/test_geometry.py was failing

* * Added test to ensure dgl.remove_self_loop function correctly updates batch information

* * Added new knn_graph and segmented_knn_graph parameters to dgl.nn.KNNGraph and dgl.nn.SegmentedKNNGraph

* * Formatting

* * Oops, I missed the one in segmented_knn_graph when I fixed the similar thing in knn_graph

* * Fixed edge case handling when invalid k specified, since it still needs to be handled consistently for tests to pass
* Fixed context of batch info, since it must match the context of the input position data for remove_self_loop to succeed

* * Fixed batch info resulting from knn_graph when output_batch is true, for case of 3D input tensor, representing multiple segments

* * Added testing of new exclude_self and output_batch parameters on knn_graph and segmented_knn_graph, and their wrappers, KNNGraph and SegmentedKNNGraph, into the test_knn_cuda test

* * Added doc comments for new parameters

* * Added correct handling for uncommon case of k or more coincident points when excluding self edges in knn_graph and segmented_knn_graph
* Added test cases for more than k coincident points

* * Updated doc comments for output_batch parameters for clarity

* * Linter formatting fixes

* * Extracted out common function for test_knn_cpu and test_knn_cuda, to add the new test cases to test_knn_cpu

* * Rewording in doc comments

* * Removed output_batch parameter from knn_graph and segmented_knn_graph, in favour of always setting the batch information, except in knn_graph if x is a 2D tensor

Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>

* [CI] only known devs are authorized to trigger CI (#4518)

* [CI] only known devs are authorized to trigger CI

* fix if author is null

* add comments

* [Readability] Auto fix setup.py and update-version.py (#4446)

* Auto fix update-version

* Auto fix setup.py

* Auto fix update-version

* Auto fix setup.py

* [Doc] Change random.py to random_partition.py in guide on distributed partition pipeline (#4438)

* Update distributed-preprocessing.rst

* Update

Co-authored-by: Ubuntu <ubuntu@ip-172-31-9-26.ap-northeast-1.compute.internal>

* fix unpinning when tensoradaptor is not available (#4450)

* [Doc] fix print issue in tutorial (#4459)

* [Example][Refactor] Refactor RGCN example (#4327)

* Refactor full graph entity classification

* Refactor rgcn with sampling

* README update

* Update

* Results update

* Respect default setting of self_loop=false in entity.py

* Update

* Update README

* Update for multi-gpu

* Update

* [doc] fix invalid link in user guide (#4468)

* [Example] directional_GSN for ogbg-molpcba (#4405)

* version-1

* version-2

* version-3

* update examples/README

* Update .gitignore

* update performance in README, delete scripts

* 1st approving review

* 2nd approving review

Co-authored-by: Mufei Li <mufeili1996@gmail.com>

* Clarify the message name, which is 'm'. (#4462)

Co-authored-by: Ubuntu <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
Co-authored-by: Rhett Ying <85214957+Rhett-Ying@users.noreply.github.com>

* [Refactor] Auto fix view.py. (#4461)

Co-authored-by: Ubuntu <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>

* [Example] SEAL for OGBL (#4291)

* [Example] SEAL for OGBL

* update index

* update

* fix readme typo

* add seal sampler

* modify set ops

* prefetch

* efficiency test

* update

* optimize

* fix ScatterAdd dtype issue

* update sampler style

* update

Co-authored-by: Quan Gan <coin2028@hotmail.com>

* [CI] use https instead of http (#4488)

* [BugFix] fix crash due to incorrect dtype in dgl.to_block() (#4487)

* [BugFix] fix crash due to incorrect dtype in dgl.to_block()

* fix test failure in TF

* [Feature] Make TensorAdapter Stream Aware (#4472)

* Allocate tensors in DGL's current stream

* make tensoradaptor stream-aware

* replace TAEmpty with cpu allocator

* fix typo

* try fix cpu allocation

* clean header

* redirect AllocDataSpace as well

* resolve comments

* [Build][Doc] Specify the sphinx version (#4465)

Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>

* reformat

* reformat

* Auto fix update-version

* Auto fix setup.py

* reformat

* reformat

Co-authored-by: Ubuntu <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
Co-authored-by: Rhett Ying <85214957+Rhett-Ying@users.noreply.github.com>
Co-authored-by: Mufei Li <mufeili1996@gmail.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-9-26.ap-northeast-1.compute.internal>
Co-authored-by: Xin Yao <xiny@nvidia.com>
Co-authored-by: Chang Liu <chang.liu@utexas.edu>
Co-authored-by: Zhiteng Li <55398076+ZHITENGLI@users.noreply.github.com>
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>
Co-authored-by: rudongyu <ru_dongyu@outlook.com>
Co-authored-by: Quan Gan <coin2028@hotmail.com>

* Move mock version of dgl_sparse library to DGL main repo (#4524)

* init

* Add api doc for sparse library

* support op between matrices with different sparsity

* Fixed docstring

* addresses comments

* lint check

* change keyword format to fmt

Co-authored-by: Israt Nisa <nisisrat@amazon.com>

* [DistPart] expose timeout config for process group (#4532)

* [DistPart] expose timeout config for process group

* refine code

* Update tools/distpartitioning/data_proc_pipeline.py

Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>

Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>

* [Feature] Import PyTorch's CUDA stream management (#4503)

* add set_stream

* add .record_stream for NDArray and HeteroGraph

* refactor dgl stream Python APIs

* test record_stream

* add unit test for record stream

* use pytorch's stream

* fix lint

* fix cpu build

* address comments

* address comments

* add record stream tests for dgl.graph

* record frames and update dataloader

* add docstring

* update frame

* add backend check for record_stream

* remove CUDAThreadEntry::stream

* record stream for newly created formats

* fix bug

* fix cpp test

* fix None c_void_p to c_handle

* [examples] reduce memory consumption (#4558)

* [examples] reduce memory consumption

* refine help message

* refine

* [Feature][REVIEW] Enable DGL cugraph nightly CI (#4525)

* Added cugraph nightly scripts

* Removed nvcr.io//nvidia/pytorch:22.04-py3 reference

Co-authored-by: Rhett Ying <85214957+Rhett-Ying@users.noreply.github.com>

* Revert "[Feature][REVIEW] Enable DGL cugraph nightly CI (#4525)" (#4563)

This reverts commit ec171c6.

* [Misc] Add flake8 lint workflow. (#4566)

* Add pyproject.toml for autopep8.

* Add pyproject.toml for autopep8.

* Add flake8 annotation in workflow.

* remove

* add

* clean up

Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* [Misc] Try using official pylint workflow. (#4568)

* polish update_version

* update pylint workflow.

* add

* revert.

Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* [CI] refine stage logic (#4565)

* [CI] refine stage logic

* refine

* refine

* remove (#4570)

Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* Add Pylint workflow for flake8. (#4571)

* remove

* Add pylint.

Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* [Misc] Update the python version in Pylint workflow for flake8. (#4572)

* remove

* Add pylint.

* Change the python version for pylint.

Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* Update pylint. (#4574)

Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* [Misc] Use another workflow. (#4575)

* Update pylint.

* Use another workflow.

Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* Update pylint. (#4576)

Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* Update pylint.yml

* Update pylint.yml

* Delete pylint.yml

* [Misc]Add pyproject.toml for autopep8 & black. (#4543)

* Add pyproject.toml for autopep8.

* Add pyproject.toml for autopep8.

Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* [Feature] Bump DLPack to v0.7 and decouple DLPack from the core library (#4454)

* rename `DLContext` to `DGLContext`

* rename `kDLGPU` to `kDLCUDA`

* replace DLTensor with DGLArray

* fix linting

* Unify DGLType and DLDataType to DGLDataType

* Fix FFI

* rename DLDeviceType to DGLDeviceType

* decouple dlpack from the core library

* fix bug

* fix lint

* fix merge

* fix build

* address comments

* rename dl_converter to dlpack_convert

* remove redundant comments

Co-authored-by: Chang Liu <chang.liu@utexas.edu>
Co-authored-by: nv-dlasalle <63612878+nv-dlasalle@users.noreply.github.com>
Co-authored-by: Xin Yao <xiny@nvidia.com>
Co-authored-by: Xin Yao <yaox12@outlook.com>
Co-authored-by: Israt Nisa <neesha295@gmail.com>
Co-authored-by: Israt Nisa <nisisrat@amazon.com>
Co-authored-by: peizhou001 <110809584+peizhou001@users.noreply.github.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-19-194.ap-northeast-1.compute.internal>
Co-authored-by: ndickson-nvidia <99772994+ndickson-nvidia@users.noreply.github.com>
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>
Co-authored-by: Rhett Ying <85214957+Rhett-Ying@users.noreply.github.com>
Co-authored-by: Hongzhi (Steve), Chen <chenhongzhi.nkcs@gmail.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-9-26.ap-northeast-1.compute.internal>
Co-authored-by: Zhiteng Li <55398076+ZHITENGLI@users.noreply.github.com>
Co-authored-by: rudongyu <ru_dongyu@outlook.com>
Co-authored-by: Quan Gan <coin2028@hotmail.com>
Co-authored-by: Vibhu Jawa <vibhujawa@gmail.com>

* [Deprecation] Dataset Attributes (#4546)

* Update

* CI

* CI

* Update

Co-authored-by: Ubuntu <ubuntu@ip-172-31-9-26.ap-northeast-1.compute.internal>

* [Example] Bug Fix (#4665)

* Update

* CI

* CI

* Update

* Update

Co-authored-by: Ubuntu <ubuntu@ip-172-31-9-26.ap-northeast-1.compute.internal>

* Update

Co-authored-by: Chang Liu <chang.liu@utexas.edu>
Co-authored-by: nv-dlasalle <63612878+nv-dlasalle@users.noreply.github.com>
Co-authored-by: Xin Yao <xiny@nvidia.com>
Co-authored-by: Xin Yao <yaox12@outlook.com>
Co-authored-by: Israt Nisa <neesha295@gmail.com>
Co-authored-by: Israt Nisa <nisisrat@amazon.com>
Co-authored-by: peizhou001 <110809584+peizhou001@users.noreply.github.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-19-194.ap-northeast-1.compute.internal>
Co-authored-by: ndickson-nvidia <99772994+ndickson-nvidia@users.noreply.github.com>
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>
Co-authored-by: Rhett Ying <85214957+Rhett-Ying@users.noreply.github.com>
Co-authored-by: Hongzhi (Steve), Chen <chenhongzhi.nkcs@gmail.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-9-26.ap-northeast-1.compute.internal>
Co-authored-by: Zhiteng Li <55398076+ZHITENGLI@users.noreply.github.com>
Co-authored-by: rudongyu <ru_dongyu@outlook.com>
Co-authored-by: Quan Gan <coin2028@hotmail.com>
Co-authored-by: Vibhu Jawa <vibhujawa@gmail.com>
mufeili added a commit that referenced this pull request Oct 26, 2022
* Update from master (#4584)

* [Example][Refactor] Refactor graphsage multigpu and full-graph example (#4430)

* Add refactors for multi-gpu and full-graph example

* Fix format

* Update

* Update

* Update

* [Cleanup] Remove async_transferer (#4505)

* Remove async_transferer

* remove test

* Remove AsyncTransferer

Co-authored-by: Xin Yao <xiny@nvidia.com>
Co-authored-by: Xin Yao <yaox12@outlook.com>

* [Cleanup] Remove duplicate entries of CUB submodule   (issue# 4395) (#4499)

* remove third_part/cub

* remove from third_party

Co-authored-by: Israt Nisa <nisisrat@amazon.com>
Co-authored-by: Xin Yao <xiny@nvidia.com>

* [Bug] Enable turn on/off libxsmm at runtime (#4455)

* enable turn on/off libxsmm at runtime by adding a global config and related API


Co-authored-by: Ubuntu <ubuntu@ip-172-31-19-194.ap-northeast-1.compute.internal>

* [Feature] Unify the cuda stream used in core library (#4480)

* Use an internal cuda stream for CopyDataFromTo

* small fix white space

* Fix to compile

* Make stream optional in copydata for compile

* fix lint issue

* Update cub functions to use internal stream

* Lint check

* Update CopyTo/CopyFrom/CopyFromTo to use internal stream

* Address comments

* Fix backward CUDA stream

* Avoid overloading CopyFromTo()

* Minor comment update

* Overload copydatafromto in cuda device api

Co-authored-by: xiny <xiny@nvidia.com>

* [Feature] Added exclude_self and output_batch to knn graph construction (Issues #4323 #4316) (#4389)

* * Added "exclude_self" and "output_batch" options to knn_graph and segmented_knn_graph
* Updated out-of-date comments on remove_edges and remove_self_loop, since they now preserve batch information

* * Changed defaults on new knn_graph and segmented_knn_graph function parameters, for compatibility; pytorch/test_geometry.py was failing

* * Added test to ensure dgl.remove_self_loop function correctly updates batch information

* * Added new knn_graph and segmented_knn_graph parameters to dgl.nn.KNNGraph and dgl.nn.SegmentedKNNGraph

* * Formatting

* * Oops, I missed the one in segmented_knn_graph when I fixed the similar thing in knn_graph

* * Fixed edge case handling when invalid k specified, since it still needs to be handled consistently for tests to pass
* Fixed context of batch info, since it must match the context of the input position data for remove_self_loop to succeed

* * Fixed batch info resulting from knn_graph when output_batch is true, for case of 3D input tensor, representing multiple segments

* * Added testing of new exclude_self and output_batch parameters on knn_graph and segmented_knn_graph, and their wrappers, KNNGraph and SegmentedKNNGraph, into the test_knn_cuda test

* * Added doc comments for new parameters

* * Added correct handling for uncommon case of k or more coincident points when excluding self edges in knn_graph and segmented_knn_graph
* Added test cases for more than k coincident points

* * Updated doc comments for output_batch parameters for clarity

* * Linter formatting fixes

* * Extracted out common function for test_knn_cpu and test_knn_cuda, to add the new test cases to test_knn_cpu

* * Rewording in doc comments

* * Removed output_batch parameter from knn_graph and segmented_knn_graph, in favour of always setting the batch information, except in knn_graph if x is a 2D tensor

Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>

* [CI] only known devs are authorized to trigger CI (#4518)

* [CI] only known devs are authorized to trigger CI

* fix if author is null

* add comments

* [Readability] Auto fix setup.py and update-version.py (#4446)

* Auto fix update-version

* Auto fix setup.py

* Auto fix update-version

* Auto fix setup.py

* [Doc] Change random.py to random_partition.py in guide on distributed partition pipeline (#4438)

* Update distributed-preprocessing.rst

* Update

Co-authored-by: Ubuntu <ubuntu@ip-172-31-9-26.ap-northeast-1.compute.internal>

* fix unpinning when tensoradaptor is not available (#4450)

* [Doc] fix print issue in tutorial (#4459)

* [Example][Refactor] Refactor RGCN example (#4327)

* Refactor full graph entity classification

* Refactor rgcn with sampling

* README update

* Update

* Results update

* Respect default setting of self_loop=false in entity.py

* Update

* Update README

* Update for multi-gpu

* Update

* [doc] fix invalid link in user guide (#4468)

* [Example] directional_GSN for ogbg-molpcba (#4405)

* version-1

* version-2

* version-3

* update examples/README

* Update .gitignore

* update performance in README, delete scripts

* 1st approving review

* 2nd approving review

Co-authored-by: Mufei Li <mufeili1996@gmail.com>

* Clarify the message name, which is 'm'. (#4462)

Co-authored-by: Ubuntu <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
Co-authored-by: Rhett Ying <85214957+Rhett-Ying@users.noreply.github.com>

* [Refactor] Auto fix view.py. (#4461)

Co-authored-by: Ubuntu <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>

* [Example] SEAL for OGBL (#4291)

* [Example] SEAL for OGBL

* update index

* update

* fix readme typo

* add seal sampler

* modify set ops

* prefetch

* efficiency test

* update

* optimize

* fix ScatterAdd dtype issue

* update sampler style

* update

Co-authored-by: Quan Gan <coin2028@hotmail.com>

* [CI] use https instead of http (#4488)

* [BugFix] fix crash due to incorrect dtype in dgl.to_block() (#4487)

* [BugFix] fix crash due to incorrect dtype in dgl.to_block()

* fix test failure in TF

* [Feature] Make TensorAdapter Stream Aware (#4472)

* Allocate tensors in DGL's current stream

* make tensoradaptor stream-aware

* replace TAempty with cpu allocator

* fix typo

* try fix cpu allocation

* clean header

* redirect AllocDataSpace as well

* resolve comments

* [Build][Doc] Specify the sphinx version (#4465)

Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>

* reformat

* reformat

* Auto fix update-version

* Auto fix setup.py

* reformat

* reformat

Co-authored-by: Ubuntu <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
Co-authored-by: Rhett Ying <85214957+Rhett-Ying@users.noreply.github.com>
Co-authored-by: Mufei Li <mufeili1996@gmail.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-9-26.ap-northeast-1.compute.internal>
Co-authored-by: Xin Yao <xiny@nvidia.com>
Co-authored-by: Chang Liu <chang.liu@utexas.edu>
Co-authored-by: Zhiteng Li <55398076+ZHITENGLI@users.noreply.github.com>
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>
Co-authored-by: rudongyu <ru_dongyu@outlook.com>
Co-authored-by: Quan Gan <coin2028@hotmail.com>

* Move mock version of dgl_sparse library to DGL main repo (#4524)

* init

* Add api doc for sparse library

* support ops between matrices with different sparsity

* Fixed docstring

* addresses comments

* lint check

* change keyword format to fmt

Co-authored-by: Israt Nisa <nisisrat@amazon.com>

* [DistPart] expose timeout config for process group (#4532)

* [DistPart] expose timeout config for process group

* refine code

* Update tools/distpartitioning/data_proc_pipeline.py

Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>

Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>

* [Feature] Import PyTorch's CUDA stream management (#4503)

* add set_stream

* add .record_stream for NDArray and HeteroGraph

* refactor dgl stream Python APIs

* test record_stream

* add unit test for record stream

* use pytorch's stream

* fix lint

* fix cpu build

* address comments

* address comments

* add record stream tests for dgl.graph

* record frames and update dataloader

* add docstring

* update frame

* add backend check for record_stream

* remove CUDAThreadEntry::stream

* record stream for newly created formats

* fix bug

* fix cpp test

* fix None c_void_p to c_handle

* [examples] reduce memory consumption (#4558)

* [examples] reduce memory consumption

* refine help message

* refine

* [Feature][REVIEW] Enable DGL cugraph nightly CI (#4525)

* Added cugraph nightly scripts

* Removed nvcr.io//nvidia/pytorch:22.04-py3 reference

Co-authored-by: Rhett Ying <85214957+Rhett-Ying@users.noreply.github.com>

* Revert "[Feature][REVIEW] Enable DGL cugraph nightly CI (#4525)" (#4563)

This reverts commit ec171c6.

* [Misc] Add flake8 lint workflow. (#4566)

* Add pyproject.toml for autopep8.

* Add pyproject.toml for autopep8.

* Add flake8 annotation in workflow.

* remove

* add

* clean up

Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* [Misc] Try use official pylint workflow. (#4568)

* polish update_version

* update pylint workflow.

* add

* revert.

Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* [CI] refine stage logic (#4565)

* [CI] refine stage logic

* refine

* refine

* remove (#4570)

Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* Add Pylint workflow for flake8. (#4571)

* remove

* Add pylint.

Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* [Misc] Update the python version in Pylint workflow for flake8. (#4572)

* remove

* Add pylint.

* Change the python version for pylint.

Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* Update pylint. (#4574)

Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* [Misc] Use another workflow. (#4575)

* Update pylint.

* Use another workflow.

Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* Update pylint. (#4576)

Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* Update pylint.yml

* Update pylint.yml

* Delete pylint.yml

* [Misc] Add pyproject.toml for autopep8 & black. (#4543)

* Add pyproject.toml for autopep8.

* Add pyproject.toml for autopep8.

Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* [Feature] Bump DLPack to v0.7 and decouple DLPack from the core library (#4454)

* rename `DLContext` to `DGLContext`

* rename `kDLGPU` to `kDLCUDA`

* replace DLTensor with DGLArray

* fix linting

* Unify DGLType and DLDataType to DGLDataType

* Fix FFI

* rename DLDeviceType to DGLDeviceType

* decouple dlpack from the core library

* fix bug

* fix lint

* fix merge

* fix build

* address comments

* rename dl_converter to dlpack_convert

* remove redundant comments

Co-authored-by: Chang Liu <chang.liu@utexas.edu>
Co-authored-by: nv-dlasalle <63612878+nv-dlasalle@users.noreply.github.com>
Co-authored-by: Xin Yao <xiny@nvidia.com>
Co-authored-by: Xin Yao <yaox12@outlook.com>
Co-authored-by: Israt Nisa <neesha295@gmail.com>
Co-authored-by: Israt Nisa <nisisrat@amazon.com>
Co-authored-by: peizhou001 <110809584+peizhou001@users.noreply.github.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-19-194.ap-northeast-1.compute.internal>
Co-authored-by: ndickson-nvidia <99772994+ndickson-nvidia@users.noreply.github.com>
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>
Co-authored-by: Rhett Ying <85214957+Rhett-Ying@users.noreply.github.com>
Co-authored-by: Hongzhi (Steve), Chen <chenhongzhi.nkcs@gmail.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-9-26.ap-northeast-1.compute.internal>
Co-authored-by: Zhiteng Li <55398076+ZHITENGLI@users.noreply.github.com>
Co-authored-by: rudongyu <ru_dongyu@outlook.com>
Co-authored-by: Quan Gan <coin2028@hotmail.com>
Co-authored-by: Vibhu Jawa <vibhujawa@gmail.com>

* [Deprecation] Dataset Attributes (#4546)

* Update

* CI

* CI

* Update

Co-authored-by: Ubuntu <ubuntu@ip-172-31-9-26.ap-northeast-1.compute.internal>

* [Example] Bug Fix (#4665)

* Update

* CI

* CI

* Update

* Update

Co-authored-by: Ubuntu <ubuntu@ip-172-31-9-26.ap-northeast-1.compute.internal>

* Update

* Update (#4724)

Co-authored-by: Ubuntu <ubuntu@ip-172-31-9-26.ap-northeast-1.compute.internal>

Co-authored-by: Chang Liu <chang.liu@utexas.edu>
Co-authored-by: nv-dlasalle <63612878+nv-dlasalle@users.noreply.github.com>
Co-authored-by: Xin Yao <xiny@nvidia.com>
Co-authored-by: Xin Yao <yaox12@outlook.com>
Co-authored-by: Israt Nisa <neesha295@gmail.com>
Co-authored-by: Israt Nisa <nisisrat@amazon.com>
Co-authored-by: peizhou001 <110809584+peizhou001@users.noreply.github.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-19-194.ap-northeast-1.compute.internal>
Co-authored-by: ndickson-nvidia <99772994+ndickson-nvidia@users.noreply.github.com>
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>
Co-authored-by: Rhett Ying <85214957+Rhett-Ying@users.noreply.github.com>
Co-authored-by: Hongzhi (Steve), Chen <chenhongzhi.nkcs@gmail.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-9-26.ap-northeast-1.compute.internal>
Co-authored-by: Zhiteng Li <55398076+ZHITENGLI@users.noreply.github.com>
Co-authored-by: rudongyu <ru_dongyu@outlook.com>
Co-authored-by: Quan Gan <coin2028@hotmail.com>
Co-authored-by: Vibhu Jawa <vibhujawa@gmail.com>
peizhou001 added a commit that referenced this pull request Nov 10, 2022
* change DGLHeteroGraph to DGLGraph in DOC

* revert c change

Co-authored-by: Mufei Li <mufeili1996@gmail.com>
Co-authored-by: Chang Liu <chang.liu@utexas.edu>
Co-authored-by: nv-dlasalle <63612878+nv-dlasalle@users.noreply.github.com>
Co-authored-by: Xin Yao <xiny@nvidia.com>
Co-authored-by: Xin Yao <yaox12@outlook.com>
Co-authored-by: Israt Nisa <neesha295@gmail.com>
Co-authored-by: Israt Nisa <nisisrat@amazon.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-19-194.ap-northeast-1.compute.internal>
Co-authored-by: ndickson-nvidia <99772994+ndickson-nvidia@users.noreply.github.com>
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>
Co-authored-by: Rhett Ying <85214957+Rhett-Ying@users.noreply.github.com>
Co-authored-by: Hongzhi (Steve), Chen <chenhongzhi.nkcs@gmail.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-9-26.ap-northeast-1.compute.internal>
Co-authored-by: Zhiteng Li <55398076+ZHITENGLI@users.noreply.github.com>
Co-authored-by: rudongyu <ru_dongyu@outlook.com>
Co-authored-by: Quan Gan <coin2028@hotmail.com>
Co-authored-by: Vibhu Jawa <vibhujawa@gmail.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-16-19.ap-northeast-1.compute.internal>
yaox12 added a commit that referenced this pull request Dec 13, 2022
* Update from master (#4584)

* [Example][Refactor] Refactor graphsage multigpu and full-graph example (#4430)

* Add refactors for multi-gpu and full-graph example

* Fix format

* Update

* Update

* Update

* [Cleanup] Remove async_transferer (#4505)

* Remove async_transferer

* remove test

* Remove AsyncTransferer

Co-authored-by: Xin Yao <xiny@nvidia.com>
Co-authored-by: Xin Yao <yaox12@outlook.com>

* [Cleanup] Remove duplicate entries of CUB submodule   (issue# 4395) (#4499)

* remove third_part/cub

* remove from third_party

Co-authored-by: Israt Nisa <nisisrat@amazon.com>
Co-authored-by: Xin Yao <xiny@nvidia.com>

* [Bug] Enable turn on/off libxsmm at runtime (#4455)

* enable turn on/off libxsmm at runtime by adding a global config and related API


Co-authored-by: Ubuntu <ubuntu@ip-172-31-19-194.ap-northeast-1.compute.internal>

* [Feature] Unify the cuda stream used in core library (#4480)

* Use an internal cuda stream for CopyDataFromTo

* small fix white space

* Fix to compile

* Make stream optional in copydata for compile

* fix lint issue

* Update cub functions to use internal stream

* Lint check

* Update CopyTo/CopyFrom/CopyFromTo to use internal stream

* Address comments

* Fix backward CUDA stream

* Avoid overloading CopyFromTo()

* Minor comment update

* Overload copydatafromto in cuda device api

Co-authored-by: xiny <xiny@nvidia.com>

* [Feature] Added exclude_self and output_batch to knn graph construction (Issues #4323 #4316) (#4389)

* * Added "exclude_self" and "output_batch" options to knn_graph and segmented_knn_graph
* Updated out-of-date comments on remove_edges and remove_self_loop, since they now preserve batch information

* * Changed defaults on new knn_graph and segmented_knn_graph function parameters, for compatibility; pytorch/test_geometry.py was failing

* * Added test to ensure dgl.remove_self_loop function correctly updates batch information

* * Added new knn_graph and segmented_knn_graph parameters to dgl.nn.KNNGraph and dgl.nn.SegmentedKNNGraph

* * Formatting

* * Oops, I missed the one in segmented_knn_graph when I fixed the similar thing in knn_graph

* * Fixed edge case handling when invalid k specified, since it still needs to be handled consistently for tests to pass
* Fixed context of batch info, since it must match the context of the input position data for remove_self_loop to succeed

* * Fixed batch info resulting from knn_graph when output_batch is true, for case of 3D input tensor, representing multiple segments

* * Added testing of new exclude_self and output_batch parameters on knn_graph and segmented_knn_graph, and their wrappers, KNNGraph and SegmentedKNNGraph, into the test_knn_cuda test

* * Added doc comments for new parameters

* * Added correct handling for uncommon case of k or more coincident points when excluding self edges in knn_graph and segmented_knn_graph
* Added test cases for more than k coincident points

* * Updated doc comments for output_batch parameters for clarity

* * Linter formatting fixes

* * Extracted out common function for test_knn_cpu and test_knn_cuda, to add the new test cases to test_knn_cpu

* * Rewording in doc comments

* * Removed output_batch parameter from knn_graph and segmented_knn_graph, in favour of always setting the batch information, except in knn_graph if x is a 2D tensor

Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>

* [CI] only known devs are authorized to trigger CI (#4518)

* [CI] only known devs are authorized to trigger CI

* fix if author is null

* add comments

* [Readability] Auto fix setup.py and update-version.py (#4446)

* Auto fix update-version

* Auto fix setup.py

* Auto fix update-version

* Auto fix setup.py

* [Doc] Change random.py to random_partition.py in guide on distributed partition pipeline (#4438)

* Update distributed-preprocessing.rst

* Update

Co-authored-by: Ubuntu <ubuntu@ip-172-31-9-26.ap-northeast-1.compute.internal>

* fix unpinning when tensoradaptor is not available (#4450)

* [Doc] fix print issue in tutorial (#4459)

* [Example][Refactor] Refactor RGCN example (#4327)

* Refactor full graph entity classification

* Refactor rgcn with sampling

* README update

* Update

* Results update

* Respect default setting of self_loop=false in entity.py

* Update

* Update README

* Update for multi-gpu

* Update

* [doc] fix invalid link in user guide (#4468)

* [Example] directional_GSN for ogbg-molpcba (#4405)

* version-1

* version-2

* version-3

* update examples/README

* Update .gitignore

* update performance in README, delete scripts

* 1st approving review

* 2nd approving review

Co-authored-by: Mufei Li <mufeili1996@gmail.com>

* Clarify the message name, which is 'm'. (#4462)

Co-authored-by: Ubuntu <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
Co-authored-by: Rhett Ying <85214957+Rhett-Ying@users.noreply.github.com>

* [Refactor] Auto fix view.py. (#4461)

Co-authored-by: Ubuntu <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>

* [Example] SEAL for OGBL (#4291)

* [Example] SEAL for OGBL

* update index

* update

* fix readme typo

* add seal sampler

* modify set ops

* prefetch

* efficiency test

* update

* optimize

* fix ScatterAdd dtype issue

* update sampler style

* update

Co-authored-by: Quan Gan <coin2028@hotmail.com>

* [CI] use https instead of http (#4488)

* [BugFix] fix crash due to incorrect dtype in dgl.to_block() (#4487)

* [BugFix] fix crash due to incorrect dtype in dgl.to_block()

* fix test failure in TF

* [Feature] Make TensorAdapter Stream Aware (#4472)

* Allocate tensors in DGL's current stream

* make tensoradaptor stream-aware

* replace TAemtpy with cpu allocator

* fix typo

* try fix cpu allocation

* clean header

* redirect AllocDataSpace as well

* resolve comments

* [Build][Doc] Specify the sphinx version (#4465)

Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>

* reformat

* reformat

* Auto fix update-version

* Auto fix setup.py

* reformat

* reformat

Co-authored-by: Ubuntu <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
Co-authored-by: Rhett Ying <85214957+Rhett-Ying@users.noreply.github.com>
Co-authored-by: Mufei Li <mufeili1996@gmail.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-9-26.ap-northeast-1.compute.internal>
Co-authored-by: Xin Yao <xiny@nvidia.com>
Co-authored-by: Chang Liu <chang.liu@utexas.edu>
Co-authored-by: Zhiteng Li <55398076+ZHITENGLI@users.noreply.github.com>
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>
Co-authored-by: rudongyu <ru_dongyu@outlook.com>
Co-authored-by: Quan Gan <coin2028@hotmail.com>

* Move mock version of dgl_sparse library to DGL main repo (#4524)

* init

* Add api doc for sparse library

* support op btwn matrices with differnt sparsity

* Fixed docstring

* addresses comments

* lint check

* change keyword format to fmt

Co-authored-by: Israt Nisa <nisisrat@amazon.com>

* [DistPart] expose timeout config for process group (#4532)

* [DistPart] expose timeout config for process group

* refine code

* Update tools/distpartitioning/data_proc_pipeline.py

Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>

Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>

* [Feature] Import PyTorch's CUDA stream management (#4503)

* add set_stream

* add .record_stream for NDArray and HeteroGraph

* refactor dgl stream Python APIs

* test record_stream

* add unit test for record stream

* use pytorch's stream

* fix lint

* fix cpu build

* address comments

* address comments

* add record stream tests for dgl.graph

* record frames and update dataloader

* add docstring

* update frame

* add backend check for record_stream

* remove CUDAThreadEntry::stream

* record stream for newly created formats

* fix bug

* fix cpp test

* fix None c_void_p to c_handle

* [Examples] Reduce memory consumption (#4558)

* [Examples] Reduce memory consumption

* refine help message

* refine

* [Feature][REVIEW] Enable DGL cugraph nightly CI (#4525)

* Added cugraph nightly scripts

* Removed nvcr.io//nvidia/pytorch:22.04-py3 reference

Co-authored-by: Rhett Ying <85214957+Rhett-Ying@users.noreply.github.com>

* Revert "[Feature][REVIEW] Enable DGL cugraph nightly CI (#4525)" (#4563)

This reverts commit ec171c6.

* [Misc] Add flake8 lint workflow. (#4566)

* Add pyproject.toml for autopep8.

* Add pyproject.toml for autopep8.

* Add flake8 annotation in workflow.

* remove

* add

* clean up

Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* [Misc] Try use official pylint workflow. (#4568)

* polish update_version

* update pylint workflow.

* add

* revert.

Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* [CI] refine stage logic (#4565)

* [CI] refine stage logic

* refine

* refine

* remove (#4570)

Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* Add Pylint workflow for flake8. (#4571)

* remove

* Add pylint.

Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* [Misc] Update the python version in Pylint workflow for flake8. (#4572)

* remove

* Add pylint.

* Change the python version for pylint.

Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* Update pylint. (#4574)

Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* [Misc] Use another workflow. (#4575)

* Update pylint.

* Use another workflow.

Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* Update pylint. (#4576)

Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* Update pylint.yml

* Update pylint.yml

* Delete pylint.yml

* [Misc]Add pyproject.toml for autopep8 & black. (#4543)

* Add pyproject.toml for autopep8.

* Add pyproject.toml for autopep8.

Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* [Feature] Bump DLPack to v0.7 and decouple DLPack from the core library (#4454)

* rename `DLContext` to `DGLContext`

* rename `kDLGPU` to `kDLCUDA`

* replace DLTensor with DGLArray

* fix linting

* Unify DGLType and DLDataType to DGLDataType

* Fix FFI

* rename DLDeviceType to DGLDeviceType

* decouple dlpack from the core library

* fix bug

* fix lint

* fix merge

* fix build

* address comments

* rename dl_converter to dlpack_convert

* remove redundant comments

Co-authored-by: Chang Liu <chang.liu@utexas.edu>
Co-authored-by: nv-dlasalle <63612878+nv-dlasalle@users.noreply.github.com>
Co-authored-by: Xin Yao <xiny@nvidia.com>
Co-authored-by: Xin Yao <yaox12@outlook.com>
Co-authored-by: Israt Nisa <neesha295@gmail.com>
Co-authored-by: Israt Nisa <nisisrat@amazon.com>
Co-authored-by: peizhou001 <110809584+peizhou001@users.noreply.github.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-19-194.ap-northeast-1.compute.internal>
Co-authored-by: ndickson-nvidia <99772994+ndickson-nvidia@users.noreply.github.com>
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>
Co-authored-by: Rhett Ying <85214957+Rhett-Ying@users.noreply.github.com>
Co-authored-by: Hongzhi (Steve), Chen <chenhongzhi.nkcs@gmail.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-9-26.ap-northeast-1.compute.internal>
Co-authored-by: Zhiteng Li <55398076+ZHITENGLI@users.noreply.github.com>
Co-authored-by: rudongyu <ru_dongyu@outlook.com>
Co-authored-by: Quan Gan <coin2028@hotmail.com>
Co-authored-by: Vibhu Jawa <vibhujawa@gmail.com>

* [Deprecation] Dataset Attributes (#4546)

* Update

* CI

* CI

* Update

Co-authored-by: Ubuntu <ubuntu@ip-172-31-9-26.ap-northeast-1.compute.internal>

* [Example] Bug Fix (#4665)

* Update

* CI

* CI

* Update

* Update

Co-authored-by: Ubuntu <ubuntu@ip-172-31-9-26.ap-northeast-1.compute.internal>

* Update

* Update (#4724)

Co-authored-by: Ubuntu <ubuntu@ip-172-31-9-26.ap-northeast-1.compute.internal>

* [API Deprecation] Rename DGLHeteroGraph to DGLGraph in Py files (#4835)

* rename DGLHeteroGraph to DGLGraph

* [Sparse] Add sparse matrix C++ implementation (#4773)

* [Sparse] Add sparse matrix C++ implementation

* Add documentation

* Update

* Minor fix

* Move Python code to dgl/mock_sparse2

* Move headers to include

* lint

* Update

* Add dgl_sparse directory

* Move src code to dgl_sparse

* Add __init__.py in tests to avoid naming conflict

* Add dgl sparse so in Jenkinsfile

* Complete docstring & SparseMatrix basic op

* lint

* Disable win tests

* fix lint issue
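
The C++ SparseMatrix implementation referenced above is not shown in this log; as an illustrative sketch only (class and method names are hypothetical, not DGL's API), the core COO representation and an elementwise op between operands with different sparsity patterns look like this:

```python
class SparseMatrixCOO:
    """Minimal COO sparse matrix (illustrative, not DGL's actual API)."""

    def __init__(self, rows, cols, vals, shape):
        self.rows, self.cols, self.vals = list(rows), list(cols), list(vals)
        self.shape = shape

    def to_dense(self):
        dense = [[0.0] * self.shape[1] for _ in range(self.shape[0])]
        for r, c, v in zip(self.rows, self.cols, self.vals):
            dense[r][c] += v
        return dense

    def __add__(self, other):
        # Elementwise add that tolerates different sparsity patterns:
        # merge both coordinate lists into a dict keyed by (row, col).
        assert self.shape == other.shape
        acc = {}
        for m in (self, other):
            for r, c, v in zip(m.rows, m.cols, m.vals):
                acc[(r, c)] = acc.get((r, c), 0.0) + v
        rows, cols, vals = zip(*[(r, c, v) for (r, c), v in sorted(acc.items())])
        return SparseMatrixCOO(rows, cols, vals, self.shape)

a = SparseMatrixCOO([0, 1], [1, 0], [1.0, 2.0], (2, 2))
b = SparseMatrixCOO([0], [0], [3.0], (2, 2))
print((a + b).to_dense())  # [[3.0, 1.0], [2.0, 0.0]]
```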

* [Misc] clang-format auto fix. (#4831)

* [Misc] clang-format auto fix.

* blabla

* nolint

* blabla

Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* [Dist] enable access DistGraph.edges via canonical etype (#4814)

* [Dist] enable access DistGraph.edges via canonical etype

* refine code

* refine test

* refine code

* Reading files in chunks to reduce the memory footprint of pyarrow (#4795)

All tasks completed.
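
The pyarrow change above follows the standard chunked-reading pattern: process a fixed-size window at a time so peak memory stays near the chunk size rather than the file size. A generic stdlib sketch of the pattern (not the actual pyarrow code):

```python
import io

def process_in_chunks(stream, chunk_size, handle):
    # Read a fixed-size chunk per iteration instead of the whole
    # stream at once; memory use is bounded by chunk_size.
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        handle(chunk)

sizes = []
process_in_chunks(io.BytesIO(b"x" * 10), chunk_size=4,
                  handle=lambda c: sizes.append(len(c)))
print(sizes)  # [4, 4, 2]
```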

* [Dist] Create <graph_name>_stats.txt file if it does not exist before ParMETIS execution (#4791)

* check if stats file exists, if not create one before parmetis run

* correct the typo error and correctly use constants.GRAPH_NAME
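
The check described above — create `<graph_name>_stats.txt` if it is missing before ParMETIS runs — is a one-liner with `pathlib`; this sketch uses an illustrative helper name:

```python
import tempfile
from pathlib import Path

def ensure_stats_file(graph_name, workdir):
    # Create <graph_name>_stats.txt if it does not already exist,
    # so the downstream ParMETIS step always finds the file.
    stats = Path(workdir) / f"{graph_name}_stats.txt"
    if not stats.exists():
        stats.touch()
    return stats

with tempfile.TemporaryDirectory() as d:
    p = ensure_stats_file("mygraph", d)
    print(p.name, p.exists())  # mygraph_stats.txt True
```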

* alltoall returns tensor list with None values, which is failing torch.cat(). (#4788)
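
The failure mode in #4788 is concatenating a list that contains `None` entries. The usual guard is to filter them out first; this pure-Python stand-in for `torch.cat` shows the shape of the fix (names illustrative):

```python
def safe_concat(parts):
    # Skip None entries that would otherwise make the
    # concatenation raise a TypeError (stand-in for torch.cat).
    kept = [p for p in parts if p is not None]
    out = []
    for p in kept:
        out.extend(p)
    return out

print(safe_concat([[1, 2], None, [3]]))  # [1, 2, 3]
```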

* replace batch_hetero

* [Misc] Add // NOLINT for the very long code. (#4834)

* alternative

* fix

* remove_todo

* blabl

* ablabl

Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* fix (#4841)

Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* a better way to init threadlocal prng (#4808)

Co-authored-by: Quan (Andy) Gan <coin2028@hotmail.com>
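
The commit above concerns lazily initializing a per-thread PRNG. In Python terms the idiom is `threading.local()` with on-first-use seeding, so each thread gets its own generator instead of all threads replaying one sequence (the seed-derivation scheme here is illustrative, not what #4808 does in C++):

```python
import random
import threading

_tls = threading.local()

def thread_rng():
    # Lazily create one PRNG per thread on first access; seeding
    # from a per-thread value keeps the streams distinct.
    if not hasattr(_tls, "rng"):
        _tls.rng = random.Random(threading.get_ident())
    return _tls.rng

results = {}

def worker(name):
    results[name] = thread_rng().random()

threads = [threading.Thread(target=worker, args=(i,)) for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(results))  # 2
```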

* [DIST] Message size to retrieve SHUFFLE_GLOBAL_NIDs results in very large messages and killed processes (#4790)

* Send out the message to the distributed lookup service in batches.

* Update function signature for allgather_sizes function call.

* Removed the unnecessary if statement.

* Removed logging.info message, which is not needed.
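
Sending lookup requests "in batches", as the commit above does, reduces the per-message payload to a fixed cap. A generic sketch of the batching helper (name and batch size illustrative):

```python
def batched(items, batch_size):
    # Yield fixed-size slices so no single message carries the
    # whole ID list at once.
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

batches = list(batched(list(range(10)), 4))
print([len(b) for b in batches])  # [4, 4, 2]
```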

* [Misc] Minor code style fix.  (#4843)

* [Misc] Change the max line length for cpp to 80 in lint.

* blabla

* blabla

* blabla

* ablabla

Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* [Sparse] Lint C++ files (#4845)

* [Dist] Fix typo of METIS preprocess in dist partition pipeline

* Fix ogb/ogbn-mag/heter-RGCN example (#4839)

Co-authored-by: Mufei Li <mufeili1996@gmail.com>

* fix issue

* [Misc] Update cpplint. (#4844)

Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

Co-authored-by: Ubuntu <ubuntu@ip-172-31-16-19.ap-northeast-1.compute.internal>
Co-authored-by: czkkkkkk <zekucai@gmail.com>
Co-authored-by: Hongzhi (Steve), Chen <chenhongzhi.nkcs@gmail.com>
Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
Co-authored-by: Rhett Ying <85214957+Rhett-Ying@users.noreply.github.com>
Co-authored-by: kylasa <kylasa@gmail.com>
Co-authored-by: Muhammed Fatih BALIN <m.f.balin@gmail.com>
Co-authored-by: Quan (Andy) Gan <coin2028@hotmail.com>
Co-authored-by: xiang song(charlie.song) <classicxsong@gmail.com>
Co-authored-by: Chang Liu <chang.liu@utexas.edu>
Co-authored-by: Mufei Li <mufeili1996@gmail.com>

* [API Deprecation] Remove as_heterograph and as_immutable_graph (#4851)

* Remove batch_hetero and unbatch_hetero

* [API Deprecation] Rename DGLHeteroGraph to DGLGraph in Python files (#4833)

* remove trailing new line

* remove node_attrs and edge_attrs in batch.py (#4890)

* [API Deprecation] Remove copy_src, copy_edge, src_mul_edge in dgl.function (#4891)

* remove emb_tensor in NodeEmbedding (#4892)

* [API Deprecation] Remove 5 APIs in DGLGraph (#4902)

* [API Deprecation] Remove APIs in old dgl section in DGLGraph (#4901)

* [API Deprecation] Remove adjacency_matrix_scipy and inplace args in candidates (#4895)

* [API Deprecation] Remove add_edge in DGLGraph (#4894)

* [API Deprecation] Remove __contains__ in DGLGraph (#4937)

* [API Deprecation] Remove edge_id() and force_multi argument in edge_ids() (#4896)

* fix issue

* remove deprecated_kwargs

* remove unused import

Co-authored-by: Mufei Li <mufeili1996@gmail.com>
Co-authored-by: Chang Liu <chang.liu@utexas.edu>
Co-authored-by: nv-dlasalle <63612878+nv-dlasalle@users.noreply.github.com>
Co-authored-by: Xin Yao <xiny@nvidia.com>
Co-authored-by: Xin Yao <yaox12@outlook.com>
Co-authored-by: Israt Nisa <neesha295@gmail.com>
Co-authored-by: Israt Nisa <nisisrat@amazon.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-19-194.ap-northeast-1.compute.internal>
Co-authored-by: ndickson-nvidia <99772994+ndickson-nvidia@users.noreply.github.com>
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>
Co-authored-by: Rhett Ying <85214957+Rhett-Ying@users.noreply.github.com>
Co-authored-by: Hongzhi (Steve), Chen <chenhongzhi.nkcs@gmail.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-9-26.ap-northeast-1.compute.internal>
Co-authored-by: Zhiteng Li <55398076+ZHITENGLI@users.noreply.github.com>
Co-authored-by: rudongyu <ru_dongyu@outlook.com>
Co-authored-by: Quan Gan <coin2028@hotmail.com>
Co-authored-by: Vibhu Jawa <vibhujawa@gmail.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-16-19.ap-northeast-1.compute.internal>
Co-authored-by: czkkkkkk <zekucai@gmail.com>
Co-authored-by: kylasa <kylasa@gmail.com>
Co-authored-by: Muhammed Fatih BALIN <m.f.balin@gmail.com>
Co-authored-by: xiang song(charlie.song) <classicxsong@gmail.com>
@rudongyu rudongyu deleted the seal_ogbl branch January 2, 2023 09:16
Labels

topic: GNN Utilities Issues related to GNN modules, samplers, datasets, etc.
