Add Text Generation demo #81

Merged: fdwr merged 7 commits into microsoft:main from ibelem:text-generation on May 8, 2025

ibelem (Contributor) commented Apr 23, 2025

This PR adds a text generation demo with the following models:

| Name | URL | License | Upstream |
| --- | --- | --- | --- |
| Phi-3 Mini 4k Instruct | microsoft/Phi-3-mini-4k-instruct-onnx | MIT | microsoft/Phi-3-mini-4k-instruct |
| TinyLlama 1.1B Chat v1.0 | webnn/TinyLlama-1.1B-Chat-v1.0-onnx | Apache 2.0 | TinyLlama/TinyLlama-1.1B-Chat-v1.0 |
| Qwen2 0.5B Instruct | webnn/Qwen2-0.5B-Instruct-onnx | Apache 2.0 | Qwen/Qwen2-0.5B-Instruct |
| DeepSeek R1 Distill Qwen 1.5B | onnxruntime/DeepSeek-R1-Distill-ONNX | MIT | deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B |

Screenshot 2025-04-23 162522

Currently, the demo uses the test version of ONNX Runtime Web. We will switch to the dev version once the fix from microsoft/onnxruntime#24437 is included in the NPM packages.

Test URL: https://ibelem.github.io/webnn-developer-preview/demos/text-generation/ (I will keep optimizing the performance).
Tests pass on the WebNN DirectML GPU backend.
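
For readers unfamiliar with how ONNX Runtime Web targets the WebNN GPU backend, here is a minimal sketch of the session options involved. It is an illustration only, not the demo's actual code: `buildWebnnSessionOptions` and `modelUrl` are hypothetical names, and the demo may pass additional options.

```javascript
// Hypothetical helper: builds ONNX Runtime Web session options that request
// the WebNN execution provider on a given device type ("cpu", "gpu" or "npu").
function buildWebnnSessionOptions(deviceType = "gpu") {
  return {
    executionProviders: [{ name: "webnn", deviceType }],
  };
}

const sessionOptions = buildWebnnSessionOptions("gpu");

// In a page that has loaded ONNX Runtime Web as `ort`, the options would be
// used roughly like this (`modelUrl` is a placeholder for a model location):
//   const session = await ort.InferenceSession.create(modelUrl, sessionOptions);
console.log(JSON.stringify(sessionOptions));
```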

@fdwr PTAL

CC @Honry @huningxin

fdwr (Collaborator) commented Apr 24, 2025

Note when trying the demo on Edge (Chromium version 135.0.7049.42, vs 132.0.6831.0 from KNOWN_COMPATIBLE_CHROMIUM_VERSION), I see:

image

Honry (Contributor) commented Apr 24, 2025

> Note when trying the demo on Edge (Chromium version 135.0.7049.42, vs 132.0.6831.0 from KNOWN_COMPATIBLE_CHROMIUM_VERSION), I see:
>
> image

The Float16Array feature may be the root cause. In this demo, we don't fall back from Float16Array to Uint16Array. Shall we raise KNOWN_COMPATIBLE_CHROMIUM_VERSION to a newer version?
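
For context, a fallback of the kind mentioned above could look roughly like the sketch below. It is not what the demo does (the demo has no fallback). `createFloat16View` and the `webnnAcceptsFloat16Array` flag are hypothetical: the caller is assumed to know whether the browser's WebNN implementation accepts Float16Array for float16 operands.

```javascript
// Sketch only: returns a 16-bit view over raw float16 bytes.
// `webnnAcceptsFloat16Array` is a hypothetical flag supplied by the caller.
function createFloat16View(buffer, webnnAcceptsFloat16Array) {
  const hasFloat16Array = typeof globalThis.Float16Array === "function";
  const ViewType =
    hasFloat16Array && webnnAcceptsFloat16Array
      ? globalThis.Float16Array
      : Uint16Array; // fallback: same bytes, viewed as raw 16-bit integers
  return new ViewType(buffer);
}

// Four float16 values occupy 8 bytes regardless of which view type is chosen.
const weights = new ArrayBuffer(8);
const view = createFloat16View(weights, false);
console.log(view.constructor.name, view.length); // prints "Uint16Array 4"
```

Both view types use two bytes per element, so the underlying buffer is shared unchanged; only the type that WebNN sees differs.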

ibelem (Contributor, Author) commented Apr 24, 2025

@fdwr @Honry

| Edge | Version | Chromium Version | Test Result |
| --- | --- | --- | --- |
| Stable | 135.0.3179.85 | 135.0.7049.96 | [Error] TypeError: Failed to execute 'constant' on 'MLGraphBuilder': The buffer view type doesn't match the operand data type. |
| Beta | 136.0.3240.29 | 136.0.7103.33 | PASS |
| Canary | 137.0.3278.0 | 137.0.7131.0 | PASS |

Confirmed via source.chromium.org that WebNN accepts Float16Array for the float16 operand type starting from Chromium build 136.0.7051.0.

Please take a look at the doc and code changes for KNOWN_COMPATIBLE_CHROMIUM_VERSION in 0bd1952 (#81).
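
To illustrate the kind of check a minimum-version constant enables, here is a small sketch that compares dotted Chromium version strings. The function names are hypothetical, and the minimum used is the build number mentioned in this comment, which is an assumption and not necessarily the value committed in the PR. How the running browser's full version is obtained is left out.

```javascript
// Assumption: minimum taken from the build number quoted in this thread.
const MINIMUM_CHROMIUM_VERSION = "136.0.7051.0";

// Compares two dotted version strings numerically, part by part.
// Returns -1 if a < b, 0 if equal, 1 if a > b. Missing parts count as 0.
function compareVersions(a, b) {
  const partsA = a.split(".").map(Number);
  const partsB = b.split(".").map(Number);
  const length = Math.max(partsA.length, partsB.length);
  for (let i = 0; i < length; i++) {
    const difference = (partsA[i] ?? 0) - (partsB[i] ?? 0);
    if (difference !== 0) return Math.sign(difference);
  }
  return 0;
}

// Hypothetical helper: true when `version` is at least the minimum.
function isChromiumCompatible(version, minimum = MINIMUM_CHROMIUM_VERSION) {
  return compareVersions(version, minimum) >= 0;
}

// The three Chromium versions from the table above.
for (const version of ["135.0.7049.96", "136.0.7103.33", "137.0.7131.0"]) {
  console.log(version, isChromiumCompatible(version));
}
```

With this minimum, the Stable build from the table is reported as incompatible and the Beta and Canary builds as compatible, matching the test results.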

ibelem (Contributor, Author) commented May 8, 2025

@fdwr Addressed your review comments in f6f3bf8, thanks a lot!

Updated the ORT dist to the dev version, which includes microsoft/onnxruntime#24437.

PTAL

fdwr (Collaborator) left a comment

Few more.

fdwr (Collaborator) left a comment

Ok, reached end.

ibelem (Contributor, Author) commented May 8, 2025

Thanks @fdwr for the further review! Fixed them in 7cdfa06. PTAL

@Honry Let's work together to carefully improve the naming (camelCase, whole words, consistent) and fix the formatting issues.

fdwr (Collaborator) left a comment

👍😎 demo. Thank you for adding it 👨‍💻.

ibelem (Contributor, Author) commented May 8, 2025

Fixed the ctrl key name issue in 4b4f337. Thanks @fdwr!

fdwr (Collaborator) left a comment

👍

fdwr (Collaborator) commented May 8, 2025

> Currently, the demo uses the test version of ONNX Runtime Web. We will switch to the dev version once the fix from microsoft/onnxruntime#24437 is included in the NPM packages.

Can we merge it now, or is there anything to do first (like upload any corresponding models or ORT distributions)?

ibelem (Contributor, Author) commented May 8, 2025

> > Currently, the demo uses the test version of ONNX Runtime Web. We will switch to the dev version once the fix from microsoft/onnxruntime#24437 is included in the NPM packages.
>
> Can we merge it now, or is there anything to do first (like upload any corresponding models or ORT distributions)?

The development version of the ORT distributions has been updated in the codebase and verified to work correctly. No additional ORT distributions are required. Please help merge this PR. Thanks very much @fdwr!

fdwr merged commit 15dc7ab into microsoft:main on May 8, 2025
1 check passed