
[MPS] Add support for slice_scatter; enable index_put #3399

Closed
DenisVieriu97 wants to merge 1 commit into pytorch:main from DenisVieriu97:dev/denis/mps_scatter_slice

Conversation

@DenisVieriu97
Contributor

Summary of changes:

  • Support for slice_scatter
  • Enable index_put

With whole model delegation, I am seeing the following crash in llama2:

in _verify_exported_program_signature
    raise SpecViolationError(
torch._export.verifier.SpecViolationError: Buffer output getitem_1 does not point to a buffer that exists.
Dict of buffers that are mutated, in order: {'getitem_1': 'layers_0_attention_SDPA_kv_cache_k_cache', 'getitem': 'layers_0_attention_SDPA_kv_cache_v_cache', 'getitem_3': 'layers_1_attention_SDPA_kv_cache_k_cache', 'getitem_2': 'layers_1_attention_SDPA_kv_cache_v_cache', 'getitem_5': 'layers_2_attention_SDPA_kv_cache_k_cache', 'getitem_4': 'layers_2_attention_SDPA_kv_cache_v_cache', 'getitem_7': 'layers_3_attention_SDPA_kv_cache_k_cache', 'getitem_6': 'layers_3_attention_SDPA_kv_cache_v_cache', 'getitem_9': 'layers_4_attention_SDPA_kv_cache_k_cache', 'getitem_8': 'layers_4_attention_SDPA_kv_cache_v_cache'}
Buffer nodes available: []
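For context, the error above comes from a consistency check on the export signature: every output listed as a buffer mutation must point to a buffer node that actually exists in the graph. A simplified, standalone sketch of that check (names here are illustrative; the real check lives in `torch._export.verifier`):

```python
# Hypothetical simplification of _verify_exported_program_signature's
# buffer-mutation check; not the actual torch._export implementation.

class SpecViolationError(Exception):
    pass

def verify_mutated_buffers(mutated_buffers, buffer_nodes):
    """Every output recorded as a buffer mutation (e.g. 'getitem_1') must
    map to a buffer that exists among the graph's buffer nodes."""
    for output_name, buffer_name in mutated_buffers.items():
        if buffer_name not in buffer_nodes:
            raise SpecViolationError(
                f"Buffer output {output_name} does not point to a buffer that exists."
            )
```

With `buffer_nodes` empty (as in the crash log, "Buffer nodes available: []"), any mutated KV-cache buffer trips this check.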

Commands to lower llama2 to MPS:

  • python -m examples.models.llama2.export_llama -kv --mps
  • python3 -m examples.apple.mps.scripts.mps_example --model_name="llama2"
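For readers unfamiliar with the two ops: `slice_scatter` returns a copy of the input with one slice replaced by a source tensor (the out-of-place counterpart of writing into a slice), and `index_put` writes values at explicit indices. A minimal pure-Python sketch of the 1-D semantics, purely illustrative and unrelated to the MPS implementation:

```python
def slice_scatter_1d(input_, src, start, end, step=1):
    """Return a copy of input_ with input_[start:end:step] replaced by src."""
    out = list(input_)
    positions = range(start, end, step)
    assert len(src) == len(positions), "src must match the slice length"
    for pos, val in zip(positions, src):
        out[pos] = val
    return out

def index_put_1d(input_, indices, values):
    """Return a copy of input_ with out[i] = v for each (i, v) pair."""
    out = list(input_)
    for i, v in zip(indices, values):
        out[i] = v
    return out
```

For example, `slice_scatter_1d([0, 0, 0, 0, 0], [1, 2], 1, 3)` yields `[0, 1, 2, 0, 0]`. The KV cache update in llama2 is exactly this pattern: scattering new key/value entries into a slice of a persistent buffer.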

@pytorch-bot

pytorch-bot bot commented Apr 29, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/3399

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures

As of commit ae4940c with merge base 87d828a:

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the CLA Signed label Apr 29, 2024
@DenisVieriu97 changed the title from [MPS - DRAFT] Add support for scatter_slice; enable index_put to [MPS - DRAFT] Add support for slice_scatter; enable index_put Apr 29, 2024
@cccclai
Contributor

cccclai commented May 1, 2024

I checked out this PR and ran

git submodule sync
git submodule update --init
./backends/apple/mps/install_requirements.sh
python -m examples.models.llama2.export_llama -kv --mps

but still can't repro...

@DenisVieriu97
Contributor Author

> I checked out this PR and ran
>
> git submodule sync
> git submodule update --init
> ./backends/apple/mps/install_requirements.sh
> python -m examples.models.llama2.export_llama -kv --mps
>
> but still can't repro...

@cccclai could you please run ./install_requirements.sh or pip install . --no-build-isolation -v after checking out the branch? It seems it's still tracing the old code.

return n
return dim

def get_exapnded_index(self, idx, shape, dim):
typo: `get_exapnded_index` should be `get_expanded_index`

@larryliu0820
Contributor

I see the Metal kernel compilation path is not enabled. Is there a reason why indexing ops require Metal kernels, and is there a plan to enable the Metal kernel path?

Asking because I'm thinking about hooking up an int4 mm kernel using the Metal kernel flow. I have the Metal kernel ready and am trying to figure out how to inject it into the graph builder and so on.

@cccclai
Contributor

cccclai commented May 3, 2024

I have it working with this patch:

diff --git a/backends/apple/mps/partition/mps_partitioner.py b/backends/apple/mps/partition/mps_partitioner.py
index e5497389d..8e22169c0 100644
--- a/backends/apple/mps/partition/mps_partitioner.py
+++ b/backends/apple/mps/partition/mps_partitioner.py
@@ -43,12 +43,6 @@ class MPSOperatorSupport(OperatorSupportBase):
         self.edge_program = edge_program

     def is_node_supported(self, submodules, node: torch.fx.Node) -> bool:
-        # Parameters are supported if any of their users are supported
-        if is_parameter(self.edge_program, node):
-            return any(
-                self.is_node_supported(submodules, user) for user in node.users.keys()
-            )
-
         if node.op != "call_function":
             return False

The root cause is that we're tagging the mutable buffers.

If MPS doesn't support buffer mutation, this line is sufficient for tagging the constants, and it will exclude the mutable buffers:

tag_constant_data(edge_program)
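The fix can be illustrated with a small standalone sketch: tag constant data for the delegate, but skip mutable buffers, rather than tagging everything a parameter check matches. All names below (`tag_for_delegation`, the input sets) are hypothetical, not the actual ExecuTorch partitioner API:

```python
def tag_for_delegation(nodes, constants, mutable_buffers):
    """Tag constant data for the backend partition. Mutable buffers
    (e.g. KV caches) must stay out of the partition so their mutations
    remain visible as real buffer nodes to the export verifier."""
    tagged = {}
    for node in nodes:
        if node in mutable_buffers:
            # Tagging these is what triggered the SpecViolationError:
            # mutated buffer outputs would no longer point to existing buffers.
            continue
        if node in constants:
            tagged[node] = "mps_partition"
    return tagged
```

This mirrors the patch above: dropping the `is_parameter` shortcut means mutable buffers are no longer swept into the partition, and `tag_constant_data` handles the constants.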

cccclai added a commit to cccclai/executorch-1 that referenced this pull request May 3, 2024
Summary:
Test with pytorch#3399 and this command passes 
```
python -m examples.models.llama2.export_llama -kv --mps
```
Without this diff, it will error out
```
in _verify_exported_program_signature
    raise SpecViolationError(
torch._export.verifier.SpecViolationError: Buffer output getitem_1 does not point to a buffer that exists.
Dict of buffers that are mutated, in order: {'getitem_1': 'layers_0_attention_SDPA_kv_cache_k_cache', 'getitem': 'layers_0_attention_SDPA_kv_cache_v_cache', 'getitem_3': 'layers_1_attention_SDPA_kv_cache_k_cache', 'getitem_2': 'layers_1_attention_SDPA_kv_cache_v_cache', 'getitem_5': 'layers_2_attention_SDPA_kv_cache_k_cache', 'getitem_4': 'layers_2_attention_SDPA_kv_cache_v_cache', 'getitem_7': 'layers_3_attention_SDPA_kv_cache_k_cache', 'getitem_6': 'layers_3_attention_SDPA_kv_cache_v_cache', 'getitem_9': 'layers_4_attention_SDPA_kv_cache_k_cache', 'getitem_8': 'layers_4_attention_SDPA_kv_cache_v_cache'}
Buffer nodes available: []
```
The root cause is that by `is_parameter`, it tags all data including mutable buffers.

Differential Revision: D56941763
facebook-github-bot pushed a commit that referenced this pull request May 7, 2024
Summary:
Pull Request resolved: #3503

Test with #3399 and this command passes
```
python -m examples.models.llama2.export_llama -kv --mps
```
Without this diff, it will error out
```
in _verify_exported_program_signature
    raise SpecViolationError(
torch._export.verifier.SpecViolationError: Buffer output getitem_1 does not point to a buffer that exists.
Dict of buffers that are mutated, in order: {'getitem_1': 'layers_0_attention_SDPA_kv_cache_k_cache', 'getitem': 'layers_0_attention_SDPA_kv_cache_v_cache', 'getitem_3': 'layers_1_attention_SDPA_kv_cache_k_cache', 'getitem_2': 'layers_1_attention_SDPA_kv_cache_v_cache', 'getitem_5': 'layers_2_attention_SDPA_kv_cache_k_cache', 'getitem_4': 'layers_2_attention_SDPA_kv_cache_v_cache', 'getitem_7': 'layers_3_attention_SDPA_kv_cache_k_cache', 'getitem_6': 'layers_3_attention_SDPA_kv_cache_v_cache', 'getitem_9': 'layers_4_attention_SDPA_kv_cache_k_cache', 'getitem_8': 'layers_4_attention_SDPA_kv_cache_v_cache'}
Buffer nodes available: []
```
The root cause is that by `is_parameter`, it tags all data including mutable buffers.

Reviewed By: larryliu0820

Differential Revision: D56941763

fbshipit-source-id: a0ed8e00f453bea345f3fdba2c5b30e0241eda8d
@facebook-github-bot
Contributor

@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@DenisVieriu97 marked this pull request as ready for review May 13, 2024 20:13
@DenisVieriu97 force-pushed the dev/denis/mps_scatter_slice branch from c852bf7 to ae4940c May 13, 2024 20:13
@facebook-github-bot
Contributor

@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@DenisVieriu97 changed the title from [MPS - DRAFT] Add support for slice_scatter; enable index_put to [MPS] Add support for slice_scatter; enable index_put May 13, 2024
@facebook-github-bot
Contributor

@cccclai merged this pull request in ea9647f.
