[AutoDiff] Cut reverse-mode adstack memory usage 10x on all backends #599
Merged
88 commits
b3594f2
[SPIR-V] Sparse adstack heap (Phases A-C): lazy LCA-claim row id
duburcqa 3177b4b
[SPIR-V][Runtime] Sparse adstack heap (Phase D-1, D-2): per-task coun…
duburcqa 9aad0d8
[SPIR-V] Sparse adstack heap (Stage 1, IR pattern match for static bo…
duburcqa 279c257
[SPIR-V] Sparse adstack heap (Stage 1.5, split per-heap-kind row claim)
duburcqa 502bc89
[SPIR-V] Sparse adstack heap (Stage 1.3, generic bound-reducer comput…
duburcqa 8efec24
[Runtime] Sparse adstack heap (Stage 1.4 scaffold): bound-reducer pip…
duburcqa 36d658f
[Runtime] Sparse adstack heap (Stage 1.4): dispatch reducer per task …
duburcqa 6828d76
[SPIR-V] Sparse adstack heap: hoist matched_var alloca above OpBranch…
duburcqa 1733dea
[Test] Sparse adstack heap (Stage 1.6): pin grad correctness on ndarr…
duburcqa dbb27c6
[SPIR-V][Runtime] Sparse adstack heap: defense-in-depth bounds check …
duburcqa bf5eb73
[SPIR-V][Runtime] Sparse adstack heap: bound reducer reads SNode-back…
duburcqa 1b90015
[Test] Sparse adstack heap: pin SNode-backed bound_expr grad correctn…
duburcqa 43a01ad
[WIP] Sparse adstack heap: shared static analysis + LLVM split-heap i…
duburcqa ccb6f91
[LLVM] Sparse adstack heap: enable shared static analysis on LLVM cod…
duburcqa fb7a98f
[LLVM] Sparse adstack heap: ensure helpers for split float / int heap…
duburcqa 474b91f
[LLVM] Sparse adstack heap: emit lazy float-heap row claim helper at …
duburcqa cff3c54
[LLVM] Sparse adstack heap: per-kernel row counter
duburcqa 96b726c
[LLVM] Sparse adstack heap: activate the lazy LCA-block float-heap ro…
duburcqa ba93287
[LLVM] Sparse adstack heap: route emit_ad_stack_single_slot_ptr throu…
duburcqa ce23bc6
[LLVM] Sparse adstack heap: route every push / pop / load-top / load-…
duburcqa 5c5dd9a
[LLVM] Sparse adstack heap: activate the lazy float-heap path in visi…
duburcqa 783a99b
[LLVM] Sparse adstack heap: host-side ndarray bound_expr reducer wire…
duburcqa a2fcade
[LLVM] Sparse adstack heap: split float heap allocation
duburcqa 133c2b2
[LLVM] Sparse adstack heap: per-arch device-side reducer for CUDA / A…
duburcqa 1c0d15f
[LLVM] Sparse adstack heap: reflow comments in llvm_runtime_executor.…
duburcqa 88bae23
[DEBUG] Sparse adstack heap: print [ADSTACK-FHEAP] / [ADSTACK-FHEAP-L…
duburcqa 814b56b
[LLVM] Sparse adstack heap: SNode-backed gate capture
duburcqa e620aaf
[LLVM] Sparse adstack heap: post-reducer float-heap sizing
duburcqa 2ea77d7
[LLVM] Sparse adstack heap: unconditional split routing
duburcqa e0454ac
[LLVM] Sparse adstack heap: drop unused legacy combined-heap allocation
duburcqa ab3ae23
[Lang] Sparse adstack heap: drop the [ADSTACK-FHEAP] / [ADSTACK-HEAP-…
duburcqa 5e1be77
[Lang] Sparse adstack heap: address PR review fixes
duburcqa 90cdb1c
[Test] Sparse adstack heap: extend the bound_expr ndarray gate test t…
duburcqa 1cd2389
[Lang] Sparse adstack heap: handle SNode-backed bound_expr on the LLV…
duburcqa ab6960a
[Lang] Sparse adstack heap: speculative defense-in-depth and predicat…
duburcqa d4c547d
[Test] Sparse adstack heap: parametrize the memory-savings end-to-end…
duburcqa 1425a4c
[Lang] Sparse adstack heap: fix per-task bound-reducer length on the …
duburcqa 1114d99
[Test] Sparse adstack heap: drop the resource-budget meta-commentary …
duburcqa f0759e0
[Lang] Sparse adstack heap: floor (not ceiling) division when computi…
duburcqa 3595c3f
[Lang] Sparse adstack heap: address remaining bot-flagged review issues
duburcqa 9b73e9e
[Test] Sparse adstack heap: clarify the test_adstack_static_bound_exp…
duburcqa 8bdcaf5
[Test] Sparse adstack heap: switch test_adstack_static_bound_expr_non…
duburcqa 3abb018
[Lang] Sparse adstack heap: extend the LLVM CPU / CUDA / AMDGPU bound…
duburcqa 2b73144
[Test] Sparse adstack heap: pin the LLVM CUDA / AMDGPU dispatch-cap f…
duburcqa 99b5743
[Test] Sparse adstack heap: pin the LLVM CPU host-reducer SNode arm o…
duburcqa f4ef8ab
[Lang] Sparse adstack heap: bound the snode_resolver tree-id scan wit…
duburcqa 524e6ac
[Test] Sparse adstack heap: pin the LLVM CUDA / AMDGPU device sizer p…
duburcqa 5c2bbf4
[Lang] Sparse adstack heap: extend the SPIR-V bound-reducer dispatch …
duburcqa 198287c
[Test] Sparse adstack heap: pin the SPIR-V bound-reducer f64 gating-f…
duburcqa 8855942
[Lang] Sparse adstack heap: hoist the AdStackBoundRowCapacity buffer …
duburcqa bef9771
[Lang] Sparse adstack heap: switch the LLVM CUDA / AMDGPU launchers' …
duburcqa 0d3d639
[Test] Sparse adstack heap: pin the SPIR-V launcher's resolve_length …
duburcqa 085f2ef
[Lang] Sparse adstack heap: restore deleted explanatory comments flag…
duburcqa 0ef750f
[Lang] Sparse adstack heap: persistent QD_DEBUG_ADSTACK heap-bind pri…
duburcqa 1c11bd5
[Lang] Sparse adstack heap: skip eager-path tasks in synchronize() la…
duburcqa 12e0ba0
[Test] Sparse adstack heap: pin eager-task last_observed_rows skip vi…
duburcqa dbcdf40
[Lang] Sparse adstack heap: mirror stride_int_bytes in the LLVM devic…
duburcqa 845e168
[Lang] Sparse adstack heap: walk LLVM declaration-order SNode offsets…
duburcqa 6f8de95
[Test] Sparse adstack heap: pin the multi-leaf dense SNode gate offse…
duburcqa c9f44f0
[Lang] Sparse adstack heap: validate gate index loop matches first it…
duburcqa f94d7db
[Test] Sparse adstack heap: drop unjustified arch restrictions on PR-…
duburcqa 3b24178
[Docs] Sparse adstack heap: tighten autodiff.md num_threads cap row +…
duburcqa e002f45
[Lang] Sparse adstack heap: hard-error the SPIR-V tertiary heap-sizin…
duburcqa 36848d8
[Lang] Sparse adstack heap: close the symmetric-form, SNode-arm and S…
duburcqa c8c7274
[Lang] Sparse adstack heap: extend the LLVM codegen adstack-alloca pr…
duburcqa 7b919e0
[Test] Sparse adstack heap: rewrite PR-added test docstrings with end…
duburcqa 6800252
[Lang] Sparse adstack heap: reflow PR-added comments to fill 120 cols…
duburcqa 82002b3
[Lang] Sparse adstack heap: restore the align_up_8 alignment-rational…
duburcqa e8ef1ec
[Lang] Sparse adstack heap: gate the per-launch publish work in CPU /…
duburcqa 871464f
[Lang] Sparse adstack heap: gate the bound_expr capture on a conserva…
duburcqa f5c1fcf
[Lang] Sparse adstack heap: rewrite PR-added comments flagged by the …
duburcqa 27be3ab
[Test] Sparse adstack heap: add a C++ unit test for build_adstack_bou…
duburcqa 45ca699
[Lang] Sparse adstack heap: extract the lazy-claim / bound-reducer / …
duburcqa fe68eed
[Doc] Sparse adstack heap: fix the 'raise it' direction in the ad_sta…
duburcqa beaa49a
[Lang] Sparse adstack heap: validate the SNode access-path's index ex…
duburcqa 9933b62
[Lang] Sparse adstack heap: declare the per_thread_stride_float_bytes…
duburcqa 0ccdc94
[Lang] Sparse adstack heap: serialize ad_stack_sparse_threshold_bytes…
duburcqa d37f2ae
[Doc] Sparse adstack heap: rewrite ad_stack_experimental_enabled draw…
duburcqa bbc0191
[Lang] Sparse adstack heap: refresh two stale codegen_llvm.cpp commen…
duburcqa 76acc1b
[Doc] Sparse adstack heap: spell out that enabling adstack globally i…
duburcqa 47fba3a
[Doc] Sparse adstack heap: add a Recommendation note to autodiff.md t…
duburcqa 24ed143
[Doc] Sparse adstack heap: soften the recommendation prefix from 'we …
duburcqa 2c8d275
[Lang] Sparse adstack heap: address bot review - cache-key gap, debug…
duburcqa 1c0011d
[Lang] Sparse adstack heap: drop over-strict SNode validation, cap SP…
duburcqa f3da2f8
Mark unit test as xfail.
duburcqa aa1e214
[Lang] Sparse adstack heap: address bot review (dead code, strict-ali…
duburcqa e951423
[Doc] Sparse adstack heap: document SNode-gate compound-index limitat…
duburcqa 845bd82
[Doc] Sparse adstack heap: prefix qd.field compound-index gate limita…
duburcqa