From 3d16caff4837b8a9e5e55eed9791b3e999e1e73e Mon Sep 17 00:00:00 2001 From: jakmro Date: Thu, 30 Jan 2025 11:13:00 +0100 Subject: [PATCH 1/6] Add Benchmarks --- docs/docs/benchmarks/_category_.json | 7 +++ docs/docs/benchmarks/inference-time.md | 39 ++++++++++++++ docs/docs/benchmarks/memory-usage.md | 33 ++++++++++++ docs/docs/benchmarks/model-size.md | 33 ++++++++++++ ...lassification.mdx => useClassification.md} | 22 ++++++++ ...ectDetection.mdx => useObjectDetection.md} | 22 ++++++++ ...eStyleTransfer.mdx => useStyleTransfer.md} | 22 ++++++++ ...getting-started.mdx => getting-started.md} | 8 ++- ...exporting-llama.mdx => exporting-llama.md} | 27 ++++++---- docs/docs/llms/running-llms.md | 53 +++++++++++++++---- docs/docs/module-api/executorch-bindings.md | 1 - 11 files changed, 247 insertions(+), 20 deletions(-) create mode 100644 docs/docs/benchmarks/_category_.json create mode 100644 docs/docs/benchmarks/inference-time.md create mode 100644 docs/docs/benchmarks/memory-usage.md create mode 100644 docs/docs/benchmarks/model-size.md rename docs/docs/computer-vision/{useClassification.mdx => useClassification.md} (84%) rename docs/docs/computer-vision/{useObjectDetection.mdx => useObjectDetection.md} (86%) rename docs/docs/computer-vision/{useStyleTransfer.mdx => useStyleTransfer.md} (72%) rename docs/docs/fundamentals/{getting-started.mdx => getting-started.md} (99%) rename docs/docs/llms/{exporting-llama.mdx => exporting-llama.md} (66%) diff --git a/docs/docs/benchmarks/_category_.json b/docs/docs/benchmarks/_category_.json new file mode 100644 index 0000000000..8e10f7a3fa --- /dev/null +++ b/docs/docs/benchmarks/_category_.json @@ -0,0 +1,7 @@ +{ + "label": "Benchmarks", + "position": 5, + "link": { + "type": "generated-index" + } +} diff --git a/docs/docs/benchmarks/inference-time.md b/docs/docs/benchmarks/inference-time.md new file mode 100644 index 0000000000..b48c54b1e3 --- /dev/null +++ b/docs/docs/benchmarks/inference-time.md @@ -0,0 +1,39 @@ +--- 
+title: Inference Time +sidebar_position: 3 +--- + +## Classification + + + + + +
+  <tr><th>Model</th><th>Inference Type</th><th>iPhone 16 Pro (CoreML) [ms]</th><th>iPhone 13 Pro (CoreML) [ms]</th><th>iPhone SE 3 (CoreML) [ms]</th><th>Samsung Galaxy S24 (XNNPack) [ms]</th></tr>
+  <tr><td rowspan=2>EFFICIENTNET_V2_S</td><td>First</td><td>140</td><td>180</td><td>210</td><td>220</td></tr>
+  <tr><td>Consecutive</td><td>100</td><td>120</td><td>130</td><td>180</td></tr>
+ +## Object Detection + + + + + +
+  <tr><th>Model</th><th>Inference Type</th><th>iPhone 16 Pro (XNNPack) [ms]</th><th>iPhone 13 Pro (XNNPack) [ms]</th><th>iPhone SE 3 (XNNPack) [ms]</th><th>Samsung Galaxy S24 (XNNPack) [ms]</th></tr>
+  <tr><td rowspan=2>SSDLITE_320_MOBILENET_V3_LARGE</td><td>First</td><td>200</td><td>280</td><td>300</td><td>120</td></tr>
+  <tr><td>Consecutive</td><td>190</td><td>260</td><td>280</td><td>100</td></tr>
+ +## Style Transfer + + + + + +
+  <tr><th>Model</th><th>Inference Type</th><th>iPhone 16 Pro (CoreML) [ms]</th><th>iPhone 13 Pro (CoreML) [ms]</th><th>iPhone SE 3 (CoreML) [ms]</th><th>Samsung Galaxy S24 (XNNPack) [ms]</th></tr>
+  <tr><td rowspan=2>STYLE_TRANSFER_CANDY, STYLE_TRANSFER_MOSAIC, STYLE_TRANSFER_UDNIE, STYLE_TRANSFER_RAIN_PRINCESS</td><td>First</td><td>850</td><td>1150</td><td>1400</td><td>1800</td></tr>
+  <tr><td>Consecutive</td><td>450</td><td>600</td><td>750</td><td>1650</td></tr>
+ +## LLMs + +| Model | iPhone 16 Pro (XNNPack) [tokens/s] | iPhone 13 Pro (XNNPack) [tokens/s] | iPhone SE 3 (XNNPack) [tokens/s] | Samsung Galaxy S24 (XNNPack) [tokens/s] | +| --------------------- | ---------------------------------- | ---------------------------------- | -------------------------------- | --------------------------------------- | +| LLAMA3_2_1B | 16.1 | 11.4 | ❌ | 15.6 | +| LLAMA3_2_1B_SPINQUANT | 40.6 | 16.7 | 16.5 | 40.3 | +| LLAMA3_2_1B_QLORA | 31.8 | 11.4 | 11.2 | 37.3 | +| LLAMA3_2_3B | ❌ | ❌ | ❌ | ❌ | +| LLAMA3_2_3B_SPINQUANT | 17.2 | 8.2 | ❌ | 16.2 | +| LLAMA3_2_3B_QLORA | 14.5 | ❌ | ❌ | 14.8 | diff --git a/docs/docs/benchmarks/memory-usage.md b/docs/docs/benchmarks/memory-usage.md new file mode 100644 index 0000000000..439496e264 --- /dev/null +++ b/docs/docs/benchmarks/memory-usage.md @@ -0,0 +1,33 @@ +--- +title: Memory Usage +sidebar_position: 2 +--- + +## Classification + +| Model | Android (XNNPack) [MB] | iOS (CoreML) [MB] | +| ----------------- | ---------------------- | ----------------- | +| EFFICIENTNET_V2_S | 130 | 85 | + +## Object Detection + +| Model | Android (XNNPack) [MB] | iOS (XNNPack) [MB] | +| ------------------------------ | ---------------------- | ------------------ | +| SSDLITE_320_MOBILENET_V3_LARGE | 90 | 90 | + +## Style Transfer + +| Model | Android (XNNPack) [MB] | iOS (CoreML) [MB] | +| ----------------------------------------------------------------------------------------------- | ---------------------- | ----------------- | +| STYLE_TRANSFER_CANDY, STYLE_TRANSFER_MOSAIC, STYLE_TRANSFER_UDNIE, STYLE_TRANSFER_RAIN_PRINCESS | 950 | 350 | + +## LLMs + +| Model | Android (XNNPack) [GB] | iOS (XNNPack) [GB] | +| --------------------- | ---------------------- | ------------------ | +| LLAMA3_2_1B | 3.2 | 3.1 | +| LLAMA3_2_1B_SPINQUANT | 1.9 | 2 | +| LLAMA3_2_1B_QLORA | 2.2 | 2.5 | +| LLAMA3_2_3B | 7.1 | 7.3 | +| LLAMA3_2_3B_SPINQUANT | 3.7 | 3.8 | +| LLAMA3_2_3B_QLORA | 4 | 4.1 | diff --git 
a/docs/docs/benchmarks/model-size.md b/docs/docs/benchmarks/model-size.md new file mode 100644 index 0000000000..508506f391 --- /dev/null +++ b/docs/docs/benchmarks/model-size.md @@ -0,0 +1,33 @@ +--- +title: Model Size +sidebar_position: 1 +--- + +## Classification + +| Model | XNNPack [MB] | CoreML [MB] | +| ----------------- | ------------ | ----------- | +| EFFICIENTNET_V2_S | 85.6 | 43.9 | + +## Object Detection + +| Model | XNNPack [MB] | +| ------------------------------ | ------------ | +| SSDLITE_320_MOBILENET_V3_LARGE | 13.9 | + +## Style Transfer + +| Model | XNNPack [MB] | CoreML [MB] | +| ----------------------------------------------------------------------------------------------- | ------------ | ----------- | +| STYLE_TRANSFER_CANDY, STYLE_TRANSFER_MOSAIC, STYLE_TRANSFER_UDNIE, STYLE_TRANSFER_RAIN_PRINCESS | 6.78 | 5.22 | + +## LLMs + +| Model | XNNPack [GB] | +| --------------------- | ------------ | +| LLAMA3_2_1B | 2.47 | +| LLAMA3_2_1B_SPINQUANT | 1.14 | +| LLAMA3_2_1B_QLORA | 1.18 | +| LLAMA3_2_3B | 6.43 | +| LLAMA3_2_3B_SPINQUANT | 2.55 | +| LLAMA3_2_3B_QLORA | 2.65 | diff --git a/docs/docs/computer-vision/useClassification.mdx b/docs/docs/computer-vision/useClassification.md similarity index 84% rename from docs/docs/computer-vision/useClassification.mdx rename to docs/docs/computer-vision/useClassification.md index 1043088b21..219cc470c7 100644 --- a/docs/docs/computer-vision/useClassification.mdx +++ b/docs/docs/computer-vision/useClassification.md @@ -86,3 +86,25 @@ function App() { | Model | Number of classes | Class list | | --------------------------------------------------------------------------------------------------------------- | ----------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | 
[efficientnet_v2_s](https://pytorch.org/vision/0.20/models/generated/torchvision.models.efficientnet_v2_s.html) | 1000 | [ImageNet1k_v1](https://github.com/software-mansion/react-native-executorch/blob/main/android/src/main/java/com/swmansion/rnexecutorch/models/classification/Constants.kt) | + +## Benchmarks + +### Model size + +| Model | XNNPack [MB] | CoreML [MB] | +| ----------------- | ------------ | ----------- | +| EFFICIENTNET_V2_S | 85.6 | 43.9 | + +### Memory usage + +| Model | Android (XNNPack) [MB] | iOS (CoreML) [MB] | +| ----------------- | ---------------------- | ----------------- | +| EFFICIENTNET_V2_S | 130 | 85 | + +### Inference time + + + + + +
+  <tr><th>Model</th><th>Inference Type</th><th>iPhone 16 Pro (CoreML) [ms]</th><th>iPhone 13 Pro (CoreML) [ms]</th><th>iPhone SE 3 (CoreML) [ms]</th><th>Samsung Galaxy S24 (XNNPack) [ms]</th></tr>
+  <tr><td rowspan=2>EFFICIENTNET_V2_S</td><td>First</td><td>140</td><td>180</td><td>210</td><td>220</td></tr>
+  <tr><td>Consecutive</td><td>100</td><td>120</td><td>130</td><td>180</td></tr>
diff --git a/docs/docs/computer-vision/useObjectDetection.mdx b/docs/docs/computer-vision/useObjectDetection.md similarity index 86% rename from docs/docs/computer-vision/useObjectDetection.mdx rename to docs/docs/computer-vision/useObjectDetection.md index 5de3da41cc..e86a7c84cb 100644 --- a/docs/docs/computer-vision/useObjectDetection.mdx +++ b/docs/docs/computer-vision/useObjectDetection.md @@ -124,3 +124,25 @@ function App() { | Model | Number of classes | Class list | | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------- | --------------------------------------------------------------------------------------------------------------------------------------------------- | | [SSDLite320 MobileNetV3 Large](https://pytorch.org/vision/main/models/generated/torchvision.models.detection.ssdlite320_mobilenet_v3_large.html#torchvision.models.detection.SSDLite320_MobileNet_V3_Large_Weights) | 91 | [COCO](https://github.com/software-mansion/react-native-executorch/blob/69802ee1ca161d9df00def1dabe014d36341cfa9/src/types/object_detection.ts#L14) | + +## Benchmarks + +### Model size + +| Model | XNNPack [MB] | +| ------------------------------ | ------------ | +| SSDLITE_320_MOBILENET_V3_LARGE | 13.9 | + +### Memory usage + +| Model | Android (XNNPack) [MB] | iOS (XNNPack) [MB] | +| ------------------------------ | ---------------------- | ------------------ | +| SSDLITE_320_MOBILENET_V3_LARGE | 90 | 90 | + +### Inference time + + + + + +
+  <tr><th>Model</th><th>Inference Type</th><th>iPhone 16 Pro (XNNPack) [ms]</th><th>iPhone 13 Pro (XNNPack) [ms]</th><th>iPhone SE 3 (XNNPack) [ms]</th><th>Samsung Galaxy S24 (XNNPack) [ms]</th></tr>
+  <tr><td rowspan=2>SSDLITE_320_MOBILENET_V3_LARGE</td><td>First</td><td>200</td><td>280</td><td>300</td><td>120</td></tr>
+  <tr><td>Consecutive</td><td>190</td><td>260</td><td>280</td><td>100</td></tr>
diff --git a/docs/docs/computer-vision/useStyleTransfer.mdx b/docs/docs/computer-vision/useStyleTransfer.md similarity index 72% rename from docs/docs/computer-vision/useStyleTransfer.mdx rename to docs/docs/computer-vision/useStyleTransfer.md index c5a5e3e0d2..7019a77980 100644 --- a/docs/docs/computer-vision/useStyleTransfer.mdx +++ b/docs/docs/computer-vision/useStyleTransfer.md @@ -78,3 +78,25 @@ function App(){ - [Mosaic](https://github.com/pytorch/examples/tree/main/fast_neural_style) - [Udnie](https://github.com/pytorch/examples/tree/main/fast_neural_style) - [Rain princess](https://github.com/pytorch/examples/tree/main/fast_neural_style) + +## Benchmarks + +### Model size + +| Model | XNNPack [MB] | CoreML [MB] | +| ----------------------------------------------------------------------------------------------- | ------------ | ----------- | +| STYLE_TRANSFER_CANDY, STYLE_TRANSFER_MOSAIC, STYLE_TRANSFER_UDNIE, STYLE_TRANSFER_RAIN_PRINCESS | 6.78 | 5.22 | + +### Memory usage + +| Model | Android (XNNPack) [MB] | iOS (CoreML) [MB] | +| ----------------------------------------------------------------------------------------------- | ---------------------- | ----------------- | +| STYLE_TRANSFER_CANDY, STYLE_TRANSFER_MOSAIC, STYLE_TRANSFER_UDNIE, STYLE_TRANSFER_RAIN_PRINCESS | 950 | 350 | + +### Inference time + + + + + +
+  <tr><th>Model</th><th>Inference Type</th><th>iPhone 16 Pro (CoreML) [ms]</th><th>iPhone 13 Pro (CoreML) [ms]</th><th>iPhone SE 3 (CoreML) [ms]</th><th>Samsung Galaxy S24 (XNNPack) [ms]</th></tr>
+  <tr><td rowspan=2>STYLE_TRANSFER_CANDY, STYLE_TRANSFER_MOSAIC, STYLE_TRANSFER_UDNIE, STYLE_TRANSFER_RAIN_PRINCESS</td><td>First</td><td>850</td><td>1150</td><td>1400</td><td>1800</td></tr>
+  <tr><td>Consecutive</td><td>450</td><td>600</td><td>750</td><td>1650</td></tr>
diff --git a/docs/docs/fundamentals/getting-started.mdx b/docs/docs/fundamentals/getting-started.md similarity index 99% rename from docs/docs/fundamentals/getting-started.mdx rename to docs/docs/fundamentals/getting-started.md index a53be6d147..924bdb3793 100644 --- a/docs/docs/fundamentals/getting-started.mdx +++ b/docs/docs/fundamentals/getting-started.md @@ -7,12 +7,15 @@ import Tabs from '@theme/Tabs'; import TabItem from '@theme/TabItem'; ## What is ExecuTorch? + ExecuTorch is a novel AI framework developed by Meta, designed to streamline deploying PyTorch models on a variety of devices, including mobile phones and microcontrollers. This framework enables exporting models into standalone binaries, allowing them to run locally without requiring API calls. ExecuTorch achieves state-of-the-art performance through optimizations and delegates such as CoreML and XNNPack. It provides a seamless export process with robust debugging options, making it easier to resolve issues if they arise. ## React Native ExecuTorch + React Native ExecuTorch is our way of bringing ExecuTorch into the React Native world. Our API is built to be simple, declarative, and efficient. Plus, we’ll provide a set of pre-exported models for common use cases, so you won’t have to worry about handling exports yourself. With just a few lines of JavaScript, you’ll be able to run AI models (even LLMs 👀) right on your device—keeping user data private and saving on cloud costs. ## Installation + Installation is pretty straightforward, just use your favorite package manager. 
@@ -54,12 +57,15 @@ Because we are using ExecuTorch under the hood, you won't be able to build iOS a ::: Running the app with the library: + ```bash yarn run expo:<ios | android> -d ``` ## Good reads -If you want to dive deeper into ExecuTorch or our previous work with the framework, we highly encourage you to check out the following resources: + +If you want to dive deeper into ExecuTorch or our previous work with the framework, we highly encourage you to check out the following resources: + - [ExecuTorch docs](https://pytorch.org/executorch/stable/index.html) - [Native code for iOS](https://medium.com/swmansion/bringing-native-ai-to-your-mobile-apps-with-executorch-part-i-ios-f1562a4556e8?source=user_profile_page---------0-------------250189c98ccf---------------) - [Native code for Android](https://medium.com/swmansion/bringing-native-ai-to-your-mobile-apps-with-executorch-part-ii-android-29431b6b9f7f?source=user_profile_page---------2-------------b8e3a5cb1c63---------------) diff --git a/docs/docs/llms/exporting-llama.mdx b/docs/docs/llms/exporting-llama.md similarity index 66% rename from docs/docs/llms/exporting-llama.mdx rename to docs/docs/llms/exporting-llama.md index 5acbb5d202..28b3ceb997 100644 --- a/docs/docs/llms/exporting-llama.mdx +++ b/docs/docs/llms/exporting-llama.md @@ -3,32 +3,41 @@ title: Exporting Llama sidebar_position: 2 --- -In order to make the process of export as simple as possible for you, we created a script that runs a Docker container and exports the model. +In order to make the process of export as simple as possible for you, we created a script that runs a Docker container and exports the model. ## Steps to export Llama + ### 1. Create an account -Get a [HuggingFace](https://huggingface.co/) account. This will allow you to download needed files. You can also use the [official Llama website](https://www.llama.com/llama-downloads/). + +Get a [HuggingFace](https://huggingface.co/) account. This will allow you to download needed files. 
You can also use the [official Llama website](https://www.llama.com/llama-downloads/). ### 2. Select a model + Pick the model that suits your needs. Before you download it, you'll need to accept a license. For best performance, we recommend using Spin-Quant or QLoRA versions of the model: - - [Llama 3.2 3B](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct/tree/main/original) - - [Llama 3.2 1B](https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct/tree/main/original) - - [Llama 3.2 3B Spin-Quant](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct-SpinQuant_INT4_EO8/tree/main) - - [Llama 3.2 1B Spin-Quant](https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct-SpinQuant_INT4_EO8/tree/main) - - [Llama 3.2 3B QLoRA](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct-QLORA_INT4_EO8/tree/main) - - [Llama 3.2 1B QLoRA](https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct-QLORA_INT4_EO8/tree/main) + +- [Llama 3.2 3B](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct/tree/main/original) +- [Llama 3.2 1B](https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct/tree/main/original) +- [Llama 3.2 3B Spin-Quant](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct-SpinQuant_INT4_EO8/tree/main) +- [Llama 3.2 1B Spin-Quant](https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct-SpinQuant_INT4_EO8/tree/main) +- [Llama 3.2 3B QLoRA](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct-QLORA_INT4_EO8/tree/main) +- [Llama 3.2 1B QLoRA](https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct-QLORA_INT4_EO8/tree/main) ### 3. Download files + Download the `consolidated.00.pth`, `params.json` and `tokenizer.model` files. If you can't see them, make sure to check the `original` directory. ### 4. Rename the tokenizer file + Rename the `tokenizer.model` file to `tokenizer.bin` as required by the library: + ```bash mv tokenizer.model tokenizer.bin ``` ### 5. 
Run the export script -Navigate to the `llama_export` directory and run the following command: + +Navigate to the `llama_export` directory and run the following command: + ```bash ./build_llama_binary.sh --model-path /path/to/consolidated.00.pth --params-path /path/to/params.json ``` diff --git a/docs/docs/llms/running-llms.md b/docs/docs/llms/running-llms.md index a00bad237b..7b14c9aca7 100644 --- a/docs/docs/llms/running-llms.md +++ b/docs/docs/llms/running-llms.md @@ -45,15 +45,15 @@ Given computational constraints, our architecture is designed to support only on ### Returns -| Field | Type | Description | -| ------------------- | ---------------------------------- | --------------------------------------------------------------------------------------------------------------- | -| `generate` | `(input: string) => Promise` | Function to start generating a response with the given input string. | -| `response` | `string` | State of the generated response. This field is updated with each token generated by the model | -| `error` | string | null | Contains the error message if the model failed to load | -| `isGenerating` | `boolean` | Indicates whether the model is currently generating a response | -| `interrupt` | `() => void` | Function to interrupt the current inference | -| `isReady` | `boolean` | Indicates whether the model is ready | -| `downloadProgress` | `number` | Represents the download progress as a value between 0 and 1, indicating the extent of the model file retrieval. | +| Field | Type | Description | +| ------------------ | ---------------------------------- | --------------------------------------------------------------------------------------------------------------- | +| `generate` | `(input: string) => Promise` | Function to start generating a response with the given input string. | +| `response` | `string` | State of the generated response. 
This field is updated with each token generated by the model | +| `error` | `string \| null` | Contains the error message if the model failed to load | +| `isGenerating` | `boolean` | Indicates whether the model is currently generating a response | +| `interrupt` | `() => void` | Function to interrupt the current inference | +| `isReady` | `boolean` | Indicates whether the model is ready | +| `downloadProgress` | `number` | Represents the download progress as a value between 0 and 1, indicating the extent of the model file retrieval. | ## Sending a message @@ -88,3 +88,38 @@ Behind the scenes, tokens are generated one by one, and the response property is Sometimes, you might want to stop the model while it’s generating. To do this, you can use `interrupt()`, which will halt the model and append the current response to its internal conversation state. There are also cases when you need to check if tokens are being generated, such as to conditionally render a stop button. We’ve made this easy with the `isGenerating` property. 
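The token-by-token update of `response` can be sketched in plain TypeScript. This is only an illustration of the accumulation behavior described above, not the library's actual implementation; `accumulateTokens` is a hypothetical helper invented for this sketch:

```typescript
// Illustration only: mimics how `response` grows as each generated token
// arrives. The real hook updates React state; here we collect every
// intermediate value of the string so the streaming effect is visible.
function accumulateTokens(tokens: string[]): string[] {
  const snapshots: string[] = [];
  let response = "";
  for (const token of tokens) {
    response += token;        // each token is appended to the response
    snapshots.push(response); // each update would trigger a re-render
  }
  return snapshots;
}

const snapshots = accumulateTokens(["Hello", ",", " world", "!"]);
console.log(snapshots[snapshots.length - 1]); // "Hello, world!"
```

Rendering `response` directly in the UI is what produces the familiar "typing" effect, since every snapshot above corresponds to one state update.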
+ +## Benchmarks + +### Model size + +| Model | XNNPack [GB] | +| --------------------- | ------------ | +| LLAMA3_2_1B | 2.47 | +| LLAMA3_2_1B_SPINQUANT | 1.14 | +| LLAMA3_2_1B_QLORA | 1.18 | +| LLAMA3_2_3B | 6.43 | +| LLAMA3_2_3B_SPINQUANT | 2.55 | +| LLAMA3_2_3B_QLORA | 2.65 | + +### Memory usage + +| Model | Android (XNNPack) [GB] | iOS (XNNPack) [GB] | +| --------------------- | ---------------------- | ------------------ | +| LLAMA3_2_1B | 3.2 | 3.1 | +| LLAMA3_2_1B_SPINQUANT | 1.9 | 2 | +| LLAMA3_2_1B_QLORA | 2.2 | 2.5 | +| LLAMA3_2_3B | 7.1 | 7.3 | +| LLAMA3_2_3B_SPINQUANT | 3.7 | 3.8 | +| LLAMA3_2_3B_QLORA | 4 | 4.1 | + +### Inference time + +| Model | iPhone 16 Pro (XNNPack) [tokens/s] | iPhone 13 Pro (XNNPack) [tokens/s] | iPhone SE 3 (XNNPack) [tokens/s] | Samsung Galaxy S24 (XNNPack) [tokens/s] | +| --------------------- | ---------------------------------- | ---------------------------------- | -------------------------------- | --------------------------------------- | +| LLAMA3_2_1B | 16.1 | 11.4 | ❌ | 15.6 | +| LLAMA3_2_1B_SPINQUANT | 40.6 | 16.7 | 16.5 | 40.3 | +| LLAMA3_2_1B_QLORA | 31.8 | 11.4 | 11.2 | 37.3 | +| LLAMA3_2_3B | ❌ | ❌ | ❌ | ❌ | +| LLAMA3_2_3B_SPINQUANT | 17.2 | 8.2 | ❌ | 16.2 | +| LLAMA3_2_3B_QLORA | 14.5 | ❌ | ❌ | 14.8 | diff --git a/docs/docs/module-api/executorch-bindings.md b/docs/docs/module-api/executorch-bindings.md index c7c6ecdb63..4147008673 100644 --- a/docs/docs/module-api/executorch-bindings.md +++ b/docs/docs/module-api/executorch-bindings.md @@ -76,7 +76,6 @@ const executorchModule = useExecutorchModule({ }); ``` - ## Setting up input parameters To prepare the input for the model, define the shape of the input tensor. This shape depends on the model's requirements. For the `STYLE_TRANSFER_CANDY` model, we need a tensor of shape `[1, 3, 640, 640]`, corresponding to a batch size of 1, 3 color channels (RGB), and dimensions of 640x640 pixels. 
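To make the shape arithmetic above concrete, here is a minimal sketch of sizing the flat input buffer for a `[1, 3, 640, 640]` tensor. It is deliberately independent of the bindings API, just the dimension product:

```typescript
// The model expects shape [1, 3, 640, 640]:
// batch size x RGB channels x height x width.
// A flat Float32Array backing that tensor holds the product of all dims.
const shape = [1, 3, 640, 640];
const numElements = shape.reduce((acc, dim) => acc * dim, 1);
const input = new Float32Array(numElements); // zero-filled placeholder input

console.log(numElements); // 1228800 (1 * 3 * 640 * 640)
```

In practice the buffer would be filled with normalized pixel values in channel-first order before being passed to the module.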
From a1cb94936f722da0d8a8782486157e6fe0a5c4dd Mon Sep 17 00:00:00 2001 From: jakmro Date: Thu, 30 Jan 2025 16:27:54 +0100 Subject: [PATCH 2/6] Add OnePlus 12 Benchmarks --- docs/docs/benchmarks/inference-time.md | 34 +++++++++---------- .../docs/computer-vision/useClassification.md | 6 ++-- .../computer-vision/useObjectDetection.md | 6 ++-- docs/docs/computer-vision/useStyleTransfer.md | 6 ++-- docs/docs/llms/running-llms.md | 16 ++++----- 5 files changed, 34 insertions(+), 34 deletions(-) diff --git a/docs/docs/benchmarks/inference-time.md b/docs/docs/benchmarks/inference-time.md index b48c54b1e3..1d8bae6e68 100644 --- a/docs/docs/benchmarks/inference-time.md +++ b/docs/docs/benchmarks/inference-time.md @@ -6,34 +6,34 @@ sidebar_position: 3 ## Classification - - - + + +
-  <tr><th>Model</th><th>Inference Type</th><th>iPhone 16 Pro (CoreML) [ms]</th><th>iPhone 13 Pro (CoreML) [ms]</th><th>iPhone SE 3 (CoreML) [ms]</th><th>Samsung Galaxy S24 (XNNPack) [ms]</th></tr>
-  <tr><td rowspan=2>EFFICIENTNET_V2_S</td><td>First</td><td>140</td><td>180</td><td>210</td><td>220</td></tr>
-  <tr><td>Consecutive</td><td>100</td><td>120</td><td>130</td><td>180</td></tr>
+  <tr><th>Model</th><th>Inference Type</th><th>iPhone 16 Pro (CoreML) [ms]</th><th>iPhone 13 Pro (CoreML) [ms]</th><th>iPhone SE 3 (CoreML) [ms]</th><th>Samsung Galaxy S24 (XNNPack) [ms]</th><th>OnePlus 12 (XNNPack) [ms]</th></tr>
+  <tr><td rowspan=2>EFFICIENTNET_V2_S</td><td>First</td><td>140</td><td>180</td><td>210</td><td>220</td><td>230</td></tr>
+  <tr><td>Consecutive</td><td>100</td><td>120</td><td>130</td><td>180</td><td>170</td></tr>
## Object Detection - - - + + +
-  <tr><th>Model</th><th>Inference Type</th><th>iPhone 16 Pro (XNNPack) [ms]</th><th>iPhone 13 Pro (XNNPack) [ms]</th><th>iPhone SE 3 (XNNPack) [ms]</th><th>Samsung Galaxy S24 (XNNPack) [ms]</th></tr>
-  <tr><td rowspan=2>SSDLITE_320_MOBILENET_V3_LARGE</td><td>First</td><td>200</td><td>280</td><td>300</td><td>120</td></tr>
-  <tr><td>Consecutive</td><td>190</td><td>260</td><td>280</td><td>100</td></tr>
+  <tr><th>Model</th><th>Inference Type</th><th>iPhone 16 Pro (XNNPack) [ms]</th><th>iPhone 13 Pro (XNNPack) [ms]</th><th>iPhone SE 3 (XNNPack) [ms]</th><th>Samsung Galaxy S24 (XNNPack) [ms]</th><th>OnePlus 12 (XNNPack) [ms]</th></tr>
+  <tr><td rowspan=2>SSDLITE_320_MOBILENET_V3_LARGE</td><td>First</td><td>200</td><td>280</td><td>300</td><td>120</td><td>140</td></tr>
+  <tr><td>Consecutive</td><td>190</td><td>260</td><td>280</td><td>100</td><td>90</td></tr>
## Style Transfer - - - + + +
-  <tr><th>Model</th><th>Inference Type</th><th>iPhone 16 Pro (CoreML) [ms]</th><th>iPhone 13 Pro (CoreML) [ms]</th><th>iPhone SE 3 (CoreML) [ms]</th><th>Samsung Galaxy S24 (XNNPack) [ms]</th></tr>
-  <tr><td rowspan=2>STYLE_TRANSFER_CANDY, STYLE_TRANSFER_MOSAIC, STYLE_TRANSFER_UDNIE, STYLE_TRANSFER_RAIN_PRINCESS</td><td>First</td><td>850</td><td>1150</td><td>1400</td><td>1800</td></tr>
-  <tr><td>Consecutive</td><td>450</td><td>600</td><td>750</td><td>1650</td></tr>
+  <tr><th>Model</th><th>Inference Type</th><th>iPhone 16 Pro (CoreML) [ms]</th><th>iPhone 13 Pro (CoreML) [ms]</th><th>iPhone SE 3 (CoreML) [ms]</th><th>Samsung Galaxy S24 (XNNPack) [ms]</th><th>OnePlus 12 (XNNPack) [ms]</th></tr>
+  <tr><td rowspan=2>STYLE_TRANSFER_CANDY, STYLE_TRANSFER_MOSAIC, STYLE_TRANSFER_UDNIE, STYLE_TRANSFER_RAIN_PRINCESS</td><td>First</td><td>850</td><td>1150</td><td>1400</td><td>1800</td><td>1950</td></tr>
+  <tr><td>Consecutive</td><td>450</td><td>600</td><td>750</td><td>1650</td><td>1800</td></tr>
## LLMs -| Model | iPhone 16 Pro (XNNPack) [tokens/s] | iPhone 13 Pro (XNNPack) [tokens/s] | iPhone SE 3 (XNNPack) [tokens/s] | Samsung Galaxy S24 (XNNPack) [tokens/s] | -| --------------------- | ---------------------------------- | ---------------------------------- | -------------------------------- | --------------------------------------- | -| LLAMA3_2_1B | 16.1 | 11.4 | ❌ | 15.6 | -| LLAMA3_2_1B_SPINQUANT | 40.6 | 16.7 | 16.5 | 40.3 | -| LLAMA3_2_1B_QLORA | 31.8 | 11.4 | 11.2 | 37.3 | -| LLAMA3_2_3B | ❌ | ❌ | ❌ | ❌ | -| LLAMA3_2_3B_SPINQUANT | 17.2 | 8.2 | ❌ | 16.2 | -| LLAMA3_2_3B_QLORA | 14.5 | ❌ | ❌ | 14.8 | +| Model | iPhone 16 Pro (XNNPack) [tokens/s] | iPhone 13 Pro (XNNPack) [tokens/s] | iPhone SE 3 (XNNPack) [tokens/s] | Samsung Galaxy S24 (XNNPack) [tokens/s] | OnePlus 12 (XNNPack) [tokens/s] | +| --------------------- | ---------------------------------- | ---------------------------------- | -------------------------------- | --------------------------------------- | ------------------------------- | +| LLAMA3_2_1B | 16.1 | 11.4 | ❌ | 15.6 | 19.3 | +| LLAMA3_2_1B_SPINQUANT | 40.6 | 16.7 | 16.5 | 40.3 | 48.2 | +| LLAMA3_2_1B_QLORA | 31.8 | 11.4 | 11.2 | 37.3 | 44.4 | +| LLAMA3_2_3B | ❌ | ❌ | ❌ | ❌ | 7.1 | +| LLAMA3_2_3B_SPINQUANT | 17.2 | 8.2 | ❌ | 16.2 | 19.4 | +| LLAMA3_2_3B_QLORA | 14.5 | ❌ | ❌ | 14.8 | 18.1 | diff --git a/docs/docs/computer-vision/useClassification.md b/docs/docs/computer-vision/useClassification.md index 219cc470c7..8a341d95d8 100644 --- a/docs/docs/computer-vision/useClassification.md +++ b/docs/docs/computer-vision/useClassification.md @@ -104,7 +104,7 @@ function App() { ### Inference time - - - + + +
-  <tr><th>Model</th><th>Inference Type</th><th>iPhone 16 Pro (CoreML) [ms]</th><th>iPhone 13 Pro (CoreML) [ms]</th><th>iPhone SE 3 (CoreML) [ms]</th><th>Samsung Galaxy S24 (XNNPack) [ms]</th></tr>
-  <tr><td rowspan=2>EFFICIENTNET_V2_S</td><td>First</td><td>140</td><td>180</td><td>210</td><td>220</td></tr>
-  <tr><td>Consecutive</td><td>100</td><td>120</td><td>130</td><td>180</td></tr>
+  <tr><th>Model</th><th>Inference Type</th><th>iPhone 16 Pro (CoreML) [ms]</th><th>iPhone 13 Pro (CoreML) [ms]</th><th>iPhone SE 3 (CoreML) [ms]</th><th>Samsung Galaxy S24 (XNNPack) [ms]</th><th>OnePlus 12 (XNNPack) [ms]</th></tr>
+  <tr><td rowspan=2>EFFICIENTNET_V2_S</td><td>First</td><td>140</td><td>180</td><td>210</td><td>220</td><td>230</td></tr>
+  <tr><td>Consecutive</td><td>100</td><td>120</td><td>130</td><td>180</td><td>170</td></tr>
diff --git a/docs/docs/computer-vision/useObjectDetection.md b/docs/docs/computer-vision/useObjectDetection.md index e86a7c84cb..6177e843f1 100644 --- a/docs/docs/computer-vision/useObjectDetection.md +++ b/docs/docs/computer-vision/useObjectDetection.md @@ -142,7 +142,7 @@ function App() { ### Inference time - - - + + +
-  <tr><th>Model</th><th>Inference Type</th><th>iPhone 16 Pro (XNNPack) [ms]</th><th>iPhone 13 Pro (XNNPack) [ms]</th><th>iPhone SE 3 (XNNPack) [ms]</th><th>Samsung Galaxy S24 (XNNPack) [ms]</th></tr>
-  <tr><td rowspan=2>SSDLITE_320_MOBILENET_V3_LARGE</td><td>First</td><td>200</td><td>280</td><td>300</td><td>120</td></tr>
-  <tr><td>Consecutive</td><td>190</td><td>260</td><td>280</td><td>100</td></tr>
+  <tr><th>Model</th><th>Inference Type</th><th>iPhone 16 Pro (XNNPack) [ms]</th><th>iPhone 13 Pro (XNNPack) [ms]</th><th>iPhone SE 3 (XNNPack) [ms]</th><th>Samsung Galaxy S24 (XNNPack) [ms]</th><th>OnePlus 12 (XNNPack) [ms]</th></tr>
+  <tr><td rowspan=2>SSDLITE_320_MOBILENET_V3_LARGE</td><td>First</td><td>200</td><td>280</td><td>300</td><td>120</td><td>140</td></tr>
+  <tr><td>Consecutive</td><td>190</td><td>260</td><td>280</td><td>100</td><td>90</td></tr>
diff --git a/docs/docs/computer-vision/useStyleTransfer.md b/docs/docs/computer-vision/useStyleTransfer.md index 7019a77980..e964eac63d 100644 --- a/docs/docs/computer-vision/useStyleTransfer.md +++ b/docs/docs/computer-vision/useStyleTransfer.md @@ -96,7 +96,7 @@ function App(){ ### Inference time - - - + + +
-  <tr><th>Model</th><th>Inference Type</th><th>iPhone 16 Pro (CoreML) [ms]</th><th>iPhone 13 Pro (CoreML) [ms]</th><th>iPhone SE 3 (CoreML) [ms]</th><th>Samsung Galaxy S24 (XNNPack) [ms]</th></tr>
-  <tr><td rowspan=2>STYLE_TRANSFER_CANDY, STYLE_TRANSFER_MOSAIC, STYLE_TRANSFER_UDNIE, STYLE_TRANSFER_RAIN_PRINCESS</td><td>First</td><td>850</td><td>1150</td><td>1400</td><td>1800</td></tr>
-  <tr><td>Consecutive</td><td>450</td><td>600</td><td>750</td><td>1650</td></tr>
+  <tr><th>Model</th><th>Inference Type</th><th>iPhone 16 Pro (CoreML) [ms]</th><th>iPhone 13 Pro (CoreML) [ms]</th><th>iPhone SE 3 (CoreML) [ms]</th><th>Samsung Galaxy S24 (XNNPack) [ms]</th><th>OnePlus 12 (XNNPack) [ms]</th></tr>
+  <tr><td rowspan=2>STYLE_TRANSFER_CANDY, STYLE_TRANSFER_MOSAIC, STYLE_TRANSFER_UDNIE, STYLE_TRANSFER_RAIN_PRINCESS</td><td>First</td><td>850</td><td>1150</td><td>1400</td><td>1800</td><td>1950</td></tr>
+  <tr><td>Consecutive</td><td>450</td><td>600</td><td>750</td><td>1650</td><td>1800</td></tr>
diff --git a/docs/docs/llms/running-llms.md b/docs/docs/llms/running-llms.md index 7b14c9aca7..7ee04459f0 100644 --- a/docs/docs/llms/running-llms.md +++ b/docs/docs/llms/running-llms.md @@ -115,11 +115,11 @@ There are also cases when you need to check if tokens are being generated, such ### Inference time -| Model | iPhone 16 Pro (XNNPack) [tokens/s] | iPhone 13 Pro (XNNPack) [tokens/s] | iPhone SE 3 (XNNPack) [tokens/s] | Samsung Galaxy S24 (XNNPack) [tokens/s] | -| --------------------- | ---------------------------------- | ---------------------------------- | -------------------------------- | --------------------------------------- | -| LLAMA3_2_1B | 16.1 | 11.4 | ❌ | 15.6 | -| LLAMA3_2_1B_SPINQUANT | 40.6 | 16.7 | 16.5 | 40.3 | -| LLAMA3_2_1B_QLORA | 31.8 | 11.4 | 11.2 | 37.3 | -| LLAMA3_2_3B | ❌ | ❌ | ❌ | ❌ | -| LLAMA3_2_3B_SPINQUANT | 17.2 | 8.2 | ❌ | 16.2 | -| LLAMA3_2_3B_QLORA | 14.5 | ❌ | ❌ | 14.8 | +| Model | iPhone 16 Pro (XNNPack) [tokens/s] | iPhone 13 Pro (XNNPack) [tokens/s] | iPhone SE 3 (XNNPack) [tokens/s] | Samsung Galaxy S24 (XNNPack) [tokens/s] | OnePlus 12 (XNNPack) [tokens/s] | +| --------------------- | ---------------------------------- | ---------------------------------- | -------------------------------- | --------------------------------------- | ------------------------------- | +| LLAMA3_2_1B | 16.1 | 11.4 | ❌ | 15.6 | 19.3 | +| LLAMA3_2_1B_SPINQUANT | 40.6 | 16.7 | 16.5 | 40.3 | 48.2 | +| LLAMA3_2_1B_QLORA | 31.8 | 11.4 | 11.2 | 37.3 | 44.4 | +| LLAMA3_2_3B | ❌ | ❌ | ❌ | ❌ | 7.1 | +| LLAMA3_2_3B_SPINQUANT | 17.2 | 8.2 | ❌ | 16.2 | 19.4 | +| LLAMA3_2_3B_QLORA | 14.5 | ❌ | ❌ | 14.8 | 18.1 | From 3ffc6838990d4c0f36fa77c29ff56cf1cc86c77b Mon Sep 17 00:00:00 2001 From: jakmro Date: Mon, 3 Feb 2025 14:31:52 +0100 Subject: [PATCH 3/6] Add suggested changes --- docs/docs/benchmarks/inference-time.md | 35 +++++++++++-------- docs/docs/benchmarks/memory-usage.md | 9 +++-- docs/docs/benchmarks/model-size.md | 9 +++-- 
.../docs/computer-vision/useClassification.md | 12 ++++--- .../computer-vision/useObjectDetection.md | 12 ++++--- docs/docs/computer-vision/useStyleTransfer.md | 33 +++++++++++------ docs/docs/llms/running-llms.md | 4 +++ 7 files changed, 72 insertions(+), 42 deletions(-) diff --git a/docs/docs/benchmarks/inference-time.md b/docs/docs/benchmarks/inference-time.md index 1d8bae6e68..d4d26b9d7e 100644 --- a/docs/docs/benchmarks/inference-time.md +++ b/docs/docs/benchmarks/inference-time.md @@ -3,29 +3,30 @@ title: Inference Time sidebar_position: 3 --- +:::info +Times presented in the tables are measured as consecutive runs of the model. Initial run times may be longer due to model loading and initialization. +::: + ## Classification - - - - -
-  <tr><th>Model</th><th>Inference Type</th><th>iPhone 16 Pro (CoreML) [ms]</th><th>iPhone 13 Pro (CoreML) [ms]</th><th>iPhone SE 3 (CoreML) [ms]</th><th>Samsung Galaxy S24 (XNNPack) [ms]</th><th>OnePlus 12 (XNNPack) [ms]</th></tr>
-  <tr><td rowspan=2>EFFICIENTNET_V2_S</td><td>First</td><td>140</td><td>180</td><td>210</td><td>220</td><td>230</td></tr>
-  <tr><td>Consecutive</td><td>100</td><td>120</td><td>130</td><td>180</td><td>170</td></tr>
+| Model | iPhone 16 Pro (CoreML) [ms] | iPhone 13 Pro (CoreML) [ms] | iPhone SE 3 (CoreML) [ms] | Samsung Galaxy S24 (XNNPack) [ms] | OnePlus 12 (XNNPack) [ms] | +| ----------------- | --------------------------- | --------------------------- | ------------------------- | --------------------------------- | ------------------------- | +| EFFICIENTNET_V2_S | 100 | 120 | 130 | 180 | 170 | ## Object Detection - - - - -
-  <tr><th>Model</th><th>Inference Type</th><th>iPhone 16 Pro (XNNPack) [ms]</th><th>iPhone 13 Pro (XNNPack) [ms]</th><th>iPhone SE 3 (XNNPack) [ms]</th><th>Samsung Galaxy S24 (XNNPack) [ms]</th><th>OnePlus 12 (XNNPack) [ms]</th></tr>
-  <tr><td rowspan=2>SSDLITE_320_MOBILENET_V3_LARGE</td><td>First</td><td>200</td><td>280</td><td>300</td><td>120</td><td>140</td></tr>
-  <tr><td>Consecutive</td><td>190</td><td>260</td><td>280</td><td>100</td><td>90</td></tr>
+| Model | iPhone 16 Pro (XNNPack) [ms] | iPhone 13 Pro (XNNPack) [ms] | iPhone SE 3 (XNNPack) [ms] | Samsung Galaxy S24 (XNNPack) [ms] | OnePlus 12 (XNNPack) [ms] | +| ------------------------------ | ---------------------------- | ---------------------------- | -------------------------- | --------------------------------- | ------------------------- | +| SSDLITE_320_MOBILENET_V3_LARGE | 190 | 260 | 280 | 100 | 90 | ## Style Transfer - - - - -
-  <tr><th>Model</th><th>Inference Type</th><th>iPhone 16 Pro (CoreML) [ms]</th><th>iPhone 13 Pro (CoreML) [ms]</th><th>iPhone SE 3 (CoreML) [ms]</th><th>Samsung Galaxy S24 (XNNPack) [ms]</th><th>OnePlus 12 (XNNPack) [ms]</th></tr>
-  <tr><td rowspan=2>STYLE_TRANSFER_CANDY, STYLE_TRANSFER_MOSAIC, STYLE_TRANSFER_UDNIE, STYLE_TRANSFER_RAIN_PRINCESS</td><td>First</td><td>850</td><td>1150</td><td>1400</td><td>1800</td><td>1950</td></tr>
-  <tr><td>Consecutive</td><td>450</td><td>600</td><td>750</td><td>1650</td><td>1800</td></tr>
+| Model | iPhone 16 Pro (CoreML) [ms] | iPhone 13 Pro (CoreML) [ms] | iPhone SE 3 (CoreML) [ms] | Samsung Galaxy S24 (XNNPack) [ms] | OnePlus 12 (XNNPack) [ms] | +| ---------------------------- | --------------------------- | --------------------------- | ------------------------- | --------------------------------- | ------------------------- | +| STYLE_TRANSFER_CANDY | 450 | 600 | 750 | 1650 | 1800 | +| STYLE_TRANSFER_MOSAIC | 450 | 600 | 750 | 1650 | 1800 | +| STYLE_TRANSFER_UDNIE | 450 | 600 | 750 | 1650 | 1800 | +| STYLE_TRANSFER_RAIN_PRINCESS | 450 | 600 | 750 | 1650 | 1800 | ## LLMs @@ -37,3 +38,7 @@ sidebar_position: 3 | LLAMA3_2_3B | ❌ | ❌ | ❌ | ❌ | 7.1 | | LLAMA3_2_3B_SPINQUANT | 17.2 | 8.2 | ❌ | 16.2 | 19.4 | | LLAMA3_2_3B_QLORA | 14.5 | ❌ | ❌ | 14.8 | 18.1 | + +:::info +❌ - Not enough memory +::: diff --git a/docs/docs/benchmarks/memory-usage.md b/docs/docs/benchmarks/memory-usage.md index 439496e264..7fe32f967e 100644 --- a/docs/docs/benchmarks/memory-usage.md +++ b/docs/docs/benchmarks/memory-usage.md @@ -17,9 +17,12 @@ sidebar_position: 2 ## Style Transfer -| Model | Android (XNNPack) [MB] | iOS (CoreML) [MB] | -| ----------------------------------------------------------------------------------------------- | ---------------------- | ----------------- | -| STYLE_TRANSFER_CANDY, STYLE_TRANSFER_MOSAIC, STYLE_TRANSFER_UDNIE, STYLE_TRANSFER_RAIN_PRINCESS | 950 | 350 | +| Model | Android (XNNPack) [MB] | iOS (CoreML) [MB] | +| ---------------------------- | ---------------------- | ----------------- | +| STYLE_TRANSFER_CANDY | 950 | 350 | +| STYLE_TRANSFER_MOSAIC | 950 | 350 | +| STYLE_TRANSFER_UDNIE | 950 | 350 | +| STYLE_TRANSFER_RAIN_PRINCESS | 950 | 350 | ## LLMs diff --git a/docs/docs/benchmarks/model-size.md b/docs/docs/benchmarks/model-size.md index 508506f391..cdf661bb50 100644 --- a/docs/docs/benchmarks/model-size.md +++ b/docs/docs/benchmarks/model-size.md @@ -17,9 +17,12 @@ sidebar_position: 1 ## Style Transfer -| Model | XNNPack [MB] | 
CoreML [MB] | -| ----------------------------------------------------------------------------------------------- | ------------ | ----------- | -| STYLE_TRANSFER_CANDY, STYLE_TRANSFER_MOSAIC, STYLE_TRANSFER_UDNIE, STYLE_TRANSFER_RAIN_PRINCESS | 6.78 | 5.22 | +| Model | XNNPack [MB] | CoreML [MB] | +| ---------------------------- | ------------ | ----------- | +| STYLE_TRANSFER_CANDY | 6.78 | 5.22 | +| STYLE_TRANSFER_MOSAIC | 6.78 | 5.22 | +| STYLE_TRANSFER_UDNIE | 6.78 | 5.22 | +| STYLE_TRANSFER_RAIN_PRINCESS | 6.78 | 5.22 | ## LLMs diff --git a/docs/docs/computer-vision/useClassification.md b/docs/docs/computer-vision/useClassification.md index 8a341d95d8..911ec2501b 100644 --- a/docs/docs/computer-vision/useClassification.md +++ b/docs/docs/computer-vision/useClassification.md @@ -103,8 +103,10 @@ function App() { ### Inference time - - - - -
-    <tr><th>Model</th><th>Inference Type</th><th>iPhone 16 Pro (CoreML) [ms]</th><th>iPhone 13 Pro (CoreML) [ms]</th><th>iPhone SE 3 (CoreML) [ms]</th><th>Samsung Galaxy S24 (XNNPack) [ms]</th><th>OnePlus 12 (XNNPack) [ms]</th></tr>
-    <tr><td>EFFICIENTNET_V2_S</td><td>First</td><td>140</td><td>180</td><td>210</td><td>220</td><td>230</td></tr>
-    <tr><td></td><td>Consecutive</td><td>100</td><td>120</td><td>130</td><td>180</td><td>170</td></tr>
+:::info +Times presented in the tables are measured as consecutive runs of the model. Initial run times may be longer due to model loading and initialization. +::: + +| Model | iPhone 16 Pro (CoreML) [ms] | iPhone 13 Pro (CoreML) [ms] | iPhone SE 3 (CoreML) [ms] | Samsung Galaxy S24 (XNNPack) [ms] | OnePlus 12 (XNNPack) [ms] | +| ----------------- | --------------------------- | --------------------------- | ------------------------- | --------------------------------- | ------------------------- | +| EFFICIENTNET_V2_S | 100 | 120 | 130 | 180 | 170 | diff --git a/docs/docs/computer-vision/useObjectDetection.md b/docs/docs/computer-vision/useObjectDetection.md index 6177e843f1..c3fc0de7a1 100644 --- a/docs/docs/computer-vision/useObjectDetection.md +++ b/docs/docs/computer-vision/useObjectDetection.md @@ -141,8 +141,10 @@ function App() { ### Inference time - - - - -
-    <tr><th>Model</th><th>Inference Type</th><th>iPhone 16 Pro (XNNPack) [ms]</th><th>iPhone 13 Pro (XNNPack) [ms]</th><th>iPhone SE 3 (XNNPack) [ms]</th><th>Samsung Galaxy S24 (XNNPack) [ms]</th><th>OnePlus 12 (XNNPack) [ms]</th></tr>
-    <tr><td>SSDLITE_320_MOBILENET_V3_LARGE</td><td>First</td><td>200</td><td>280</td><td>300</td><td>120</td><td>140</td></tr>
-    <tr><td></td><td>Consecutive</td><td>190</td><td>260</td><td>280</td><td>100</td><td>90</td></tr>
+:::info +Times presented in the tables are measured as consecutive runs of the model. Initial run times may be longer due to model loading and initialization. +::: + +| Model | iPhone 16 Pro (XNNPack) [ms] | iPhone 13 Pro (XNNPack) [ms] | iPhone SE 3 (XNNPack) [ms] | Samsung Galaxy S24 (XNNPack) [ms] | OnePlus 12 (XNNPack) [ms] | +| ------------------------------ | ---------------------------- | ---------------------------- | -------------------------- | --------------------------------- | ------------------------- | +| SSDLITE_320_MOBILENET_V3_LARGE | 190 | 260 | 280 | 100 | 90 | diff --git a/docs/docs/computer-vision/useStyleTransfer.md b/docs/docs/computer-vision/useStyleTransfer.md index e964eac63d..efbc3448b2 100644 --- a/docs/docs/computer-vision/useStyleTransfer.md +++ b/docs/docs/computer-vision/useStyleTransfer.md @@ -83,20 +83,31 @@ function App(){ ### Model size -| Model | XNNPack [MB] | CoreML [MB] | -| ----------------------------------------------------------------------------------------------- | ------------ | ----------- | -| STYLE_TRANSFER_CANDY, STYLE_TRANSFER_MOSAIC, STYLE_TRANSFER_UDNIE, STYLE_TRANSFER_RAIN_PRINCESS | 6.78 | 5.22 | +| Model | XNNPack [MB] | CoreML [MB] | +| ---------------------------- | ------------ | ----------- | +| STYLE_TRANSFER_CANDY | 6.78 | 5.22 | +| STYLE_TRANSFER_MOSAIC | 6.78 | 5.22 | +| STYLE_TRANSFER_UDNIE | 6.78 | 5.22 | +| STYLE_TRANSFER_RAIN_PRINCESS | 6.78 | 5.22 | ### Memory usage -| Model | Android (XNNPack) [MB] | iOS (CoreML) [MB] | -| ----------------------------------------------------------------------------------------------- | ---------------------- | ----------------- | -| STYLE_TRANSFER_CANDY, STYLE_TRANSFER_MOSAIC, STYLE_TRANSFER_UDNIE, STYLE_TRANSFER_RAIN_PRINCESS | 950 | 350 | +| Model | Android (XNNPack) [MB] | iOS (CoreML) [MB] | +| ---------------------------- | ---------------------- | ----------------- | +| STYLE_TRANSFER_CANDY | 950 | 350 | +| STYLE_TRANSFER_MOSAIC | 950 | 350 | +| 
STYLE_TRANSFER_UDNIE | 950 | 350 | +| STYLE_TRANSFER_RAIN_PRINCESS | 950 | 350 | ### Inference time - - - - -
-    <tr><th>Model</th><th>Inference Type</th><th>iPhone 16 Pro (CoreML) [ms]</th><th>iPhone 13 Pro (CoreML) [ms]</th><th>iPhone SE 3 (CoreML) [ms]</th><th>Samsung Galaxy S24 (XNNPack) [ms]</th><th>OnePlus 12 (XNNPack) [ms]</th></tr>
-    <tr><td>STYLE_TRANSFER_CANDY, STYLE_TRANSFER_MOSAIC, STYLE_TRANSFER_UDNIE, STYLE_TRANSFER_RAIN_PRINCESS</td><td>First</td><td>850</td><td>1150</td><td>1400</td><td>1800</td><td>1950</td></tr>
-    <tr><td></td><td>Consecutive</td><td>450</td><td>600</td><td>750</td><td>1650</td><td>1800</td></tr>
+:::info +Times presented in the tables are measured as consecutive runs of the model. Initial run times may be longer due to model loading and initialization. +::: + +| Model | iPhone 16 Pro (CoreML) [ms] | iPhone 13 Pro (CoreML) [ms] | iPhone SE 3 (CoreML) [ms] | Samsung Galaxy S24 (XNNPack) [ms] | OnePlus 12 (XNNPack) [ms] | +| ---------------------------- | --------------------------- | --------------------------- | ------------------------- | --------------------------------- | ------------------------- | +| STYLE_TRANSFER_CANDY | 450 | 600 | 750 | 1650 | 1800 | +| STYLE_TRANSFER_MOSAIC | 450 | 600 | 750 | 1650 | 1800 | +| STYLE_TRANSFER_UDNIE | 450 | 600 | 750 | 1650 | 1800 | +| STYLE_TRANSFER_RAIN_PRINCESS | 450 | 600 | 750 | 1650 | 1800 | diff --git a/docs/docs/llms/running-llms.md b/docs/docs/llms/running-llms.md index 7ee04459f0..60b0d9a68e 100644 --- a/docs/docs/llms/running-llms.md +++ b/docs/docs/llms/running-llms.md @@ -123,3 +123,7 @@ There are also cases when you need to check if tokens are being generated, such | LLAMA3_2_3B | ❌ | ❌ | ❌ | ❌ | 7.1 | | LLAMA3_2_3B_SPINQUANT | 17.2 | 8.2 | ❌ | 16.2 | 19.4 | | LLAMA3_2_3B_QLORA | 14.5 | ❌ | ❌ | 14.8 | 18.1 | + +:::info +❌ - Not enough memory +::: From a7ccd3027c80f5485e34f1df7445f67e56b50fc3 Mon Sep 17 00:00:00 2001 From: jakmro Date: Mon, 3 Feb 2025 14:34:05 +0100 Subject: [PATCH 4/6] Standardize XNNPack to XNNPACK and CoreML to Core ML across documentation --- docs/docs/benchmarks/inference-time.md | 8 ++++---- docs/docs/benchmarks/memory-usage.md | 8 ++++---- docs/docs/benchmarks/model-size.md | 8 ++++---- docs/docs/computer-vision/useClassification.md | 6 +++--- docs/docs/computer-vision/useObjectDetection.md | 6 +++--- docs/docs/computer-vision/useStyleTransfer.md | 6 +++--- docs/docs/fundamentals/getting-started.md | 2 +- docs/docs/llms/running-llms.md | 6 +++--- 8 files changed, 25 insertions(+), 25 deletions(-) diff --git a/docs/docs/benchmarks/inference-time.md 
b/docs/docs/benchmarks/inference-time.md index d4d26b9d7e..067a3a13ee 100644 --- a/docs/docs/benchmarks/inference-time.md +++ b/docs/docs/benchmarks/inference-time.md @@ -9,19 +9,19 @@ Times presented in the tables are measured as consecutive runs of the model. Ini ## Classification -| Model | iPhone 16 Pro (CoreML) [ms] | iPhone 13 Pro (CoreML) [ms] | iPhone SE 3 (CoreML) [ms] | Samsung Galaxy S24 (XNNPack) [ms] | OnePlus 12 (XNNPack) [ms] | +| Model | iPhone 16 Pro (Core ML) [ms] | iPhone 13 Pro (Core ML) [ms] | iPhone SE 3 (Core ML) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | | ----------------- | --------------------------- | --------------------------- | ------------------------- | --------------------------------- | ------------------------- | | EFFICIENTNET_V2_S | 100 | 120 | 130 | 180 | 170 | ## Object Detection -| Model | iPhone 16 Pro (XNNPack) [ms] | iPhone 13 Pro (XNNPack) [ms] | iPhone SE 3 (XNNPack) [ms] | Samsung Galaxy S24 (XNNPack) [ms] | OnePlus 12 (XNNPack) [ms] | +| Model | iPhone 16 Pro (XNNPACK) [ms] | iPhone 13 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | | ------------------------------ | ---------------------------- | ---------------------------- | -------------------------- | --------------------------------- | ------------------------- | | SSDLITE_320_MOBILENET_V3_LARGE | 190 | 260 | 280 | 100 | 90 | ## Style Transfer -| Model | iPhone 16 Pro (CoreML) [ms] | iPhone 13 Pro (CoreML) [ms] | iPhone SE 3 (CoreML) [ms] | Samsung Galaxy S24 (XNNPack) [ms] | OnePlus 12 (XNNPack) [ms] | +| Model | iPhone 16 Pro (Core ML) [ms] | iPhone 13 Pro (Core ML) [ms] | iPhone SE 3 (Core ML) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | | ---------------------------- | --------------------------- | --------------------------- | ------------------------- | --------------------------------- | ------------------------- | | STYLE_TRANSFER_CANDY 
| 450 | 600 | 750 | 1650 | 1800 | | STYLE_TRANSFER_MOSAIC | 450 | 600 | 750 | 1650 | 1800 | @@ -30,7 +30,7 @@ Times presented in the tables are measured as consecutive runs of the model. Ini ## LLMs -| Model | iPhone 16 Pro (XNNPack) [tokens/s] | iPhone 13 Pro (XNNPack) [tokens/s] | iPhone SE 3 (XNNPack) [tokens/s] | Samsung Galaxy S24 (XNNPack) [tokens/s] | OnePlus 12 (XNNPack) [tokens/s] | +| Model | iPhone 16 Pro (XNNPACK) [tokens/s] | iPhone 13 Pro (XNNPACK) [tokens/s] | iPhone SE 3 (XNNPACK) [tokens/s] | Samsung Galaxy S24 (XNNPACK) [tokens/s] | OnePlus 12 (XNNPACK) [tokens/s] | | --------------------- | ---------------------------------- | ---------------------------------- | -------------------------------- | --------------------------------------- | ------------------------------- | | LLAMA3_2_1B | 16.1 | 11.4 | ❌ | 15.6 | 19.3 | | LLAMA3_2_1B_SPINQUANT | 40.6 | 16.7 | 16.5 | 40.3 | 48.2 | diff --git a/docs/docs/benchmarks/memory-usage.md b/docs/docs/benchmarks/memory-usage.md index 7fe32f967e..0526e71cba 100644 --- a/docs/docs/benchmarks/memory-usage.md +++ b/docs/docs/benchmarks/memory-usage.md @@ -5,19 +5,19 @@ sidebar_position: 2 ## Classification -| Model | Android (XNNPack) [MB] | iOS (CoreML) [MB] | +| Model | Android (XNNPACK) [MB] | iOS (Core ML) [MB] | | ----------------- | ---------------------- | ----------------- | | EFFICIENTNET_V2_S | 130 | 85 | ## Object Detection -| Model | Android (XNNPack) [MB] | iOS (XNNPack) [MB] | +| Model | Android (XNNPACK) [MB] | iOS (XNNPACK) [MB] | | ------------------------------ | ---------------------- | ------------------ | | SSDLITE_320_MOBILENET_V3_LARGE | 90 | 90 | ## Style Transfer -| Model | Android (XNNPack) [MB] | iOS (CoreML) [MB] | +| Model | Android (XNNPACK) [MB] | iOS (Core ML) [MB] | | ---------------------------- | ---------------------- | ----------------- | | STYLE_TRANSFER_CANDY | 950 | 350 | | STYLE_TRANSFER_MOSAIC | 950 | 350 | @@ -26,7 +26,7 @@ sidebar_position: 2 ## LLMs -| Model | Android 
(XNNPack) [GB] | iOS (XNNPack) [GB] | +| Model | Android (XNNPACK) [GB] | iOS (XNNPACK) [GB] | | --------------------- | ---------------------- | ------------------ | | LLAMA3_2_1B | 3.2 | 3.1 | | LLAMA3_2_1B_SPINQUANT | 1.9 | 2 | diff --git a/docs/docs/benchmarks/model-size.md b/docs/docs/benchmarks/model-size.md index cdf661bb50..8a2574a74e 100644 --- a/docs/docs/benchmarks/model-size.md +++ b/docs/docs/benchmarks/model-size.md @@ -5,19 +5,19 @@ sidebar_position: 1 ## Classification -| Model | XNNPack [MB] | CoreML [MB] | +| Model | XNNPACK [MB] | Core ML [MB] | | ----------------- | ------------ | ----------- | | EFFICIENTNET_V2_S | 85.6 | 43.9 | ## Object Detection -| Model | XNNPack [MB] | +| Model | XNNPACK [MB] | | ------------------------------ | ------------ | | SSDLITE_320_MOBILENET_V3_LARGE | 13.9 | ## Style Transfer -| Model | XNNPack [MB] | CoreML [MB] | +| Model | XNNPACK [MB] | Core ML [MB] | | ---------------------------- | ------------ | ----------- | | STYLE_TRANSFER_CANDY | 6.78 | 5.22 | | STYLE_TRANSFER_MOSAIC | 6.78 | 5.22 | @@ -26,7 +26,7 @@ sidebar_position: 1 ## LLMs -| Model | XNNPack [GB] | +| Model | XNNPACK [GB] | | --------------------- | ------------ | | LLAMA3_2_1B | 2.47 | | LLAMA3_2_1B_SPINQUANT | 1.14 | diff --git a/docs/docs/computer-vision/useClassification.md b/docs/docs/computer-vision/useClassification.md index 911ec2501b..6aec16e327 100644 --- a/docs/docs/computer-vision/useClassification.md +++ b/docs/docs/computer-vision/useClassification.md @@ -91,13 +91,13 @@ function App() { ### Model size -| Model | XNNPack [MB] | CoreML [MB] | +| Model | XNNPACK [MB] | Core ML [MB] | | ----------------- | ------------ | ----------- | | EFFICIENTNET_V2_S | 85.6 | 43.9 | ### Memory usage -| Model | Android (XNNPack) [MB] | iOS (CoreML) [MB] | +| Model | Android (XNNPACK) [MB] | iOS (Core ML) [MB] | | ----------------- | ---------------------- | ----------------- | | EFFICIENTNET_V2_S | 130 | 85 | @@ -107,6 +107,6 @@ function App() { 
Times presented in the tables are measured as consecutive runs of the model. Initial run times may be longer due to model loading and initialization. ::: -| Model | iPhone 16 Pro (CoreML) [ms] | iPhone 13 Pro (CoreML) [ms] | iPhone SE 3 (CoreML) [ms] | Samsung Galaxy S24 (XNNPack) [ms] | OnePlus 12 (XNNPack) [ms] | +| Model | iPhone 16 Pro (Core ML) [ms] | iPhone 13 Pro (Core ML) [ms] | iPhone SE 3 (Core ML) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | | ----------------- | --------------------------- | --------------------------- | ------------------------- | --------------------------------- | ------------------------- | | EFFICIENTNET_V2_S | 100 | 120 | 130 | 180 | 170 | diff --git a/docs/docs/computer-vision/useObjectDetection.md b/docs/docs/computer-vision/useObjectDetection.md index c3fc0de7a1..9d66637c82 100644 --- a/docs/docs/computer-vision/useObjectDetection.md +++ b/docs/docs/computer-vision/useObjectDetection.md @@ -129,13 +129,13 @@ function App() { ### Model size -| Model | XNNPack [MB] | +| Model | XNNPACK [MB] | | ------------------------------ | ------------ | | SSDLITE_320_MOBILENET_V3_LARGE | 13.9 | ### Memory usage -| Model | Android (XNNPack) [MB] | iOS (XNNPack) [MB] | +| Model | Android (XNNPACK) [MB] | iOS (XNNPACK) [MB] | | ------------------------------ | ---------------------- | ------------------ | | SSDLITE_320_MOBILENET_V3_LARGE | 90 | 90 | @@ -145,6 +145,6 @@ function App() { Times presented in the tables are measured as consecutive runs of the model. Initial run times may be longer due to model loading and initialization. 
::: -| Model | iPhone 16 Pro (XNNPack) [ms] | iPhone 13 Pro (XNNPack) [ms] | iPhone SE 3 (XNNPack) [ms] | Samsung Galaxy S24 (XNNPack) [ms] | OnePlus 12 (XNNPack) [ms] | +| Model | iPhone 16 Pro (XNNPACK) [ms] | iPhone 13 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | | ------------------------------ | ---------------------------- | ---------------------------- | -------------------------- | --------------------------------- | ------------------------- | | SSDLITE_320_MOBILENET_V3_LARGE | 190 | 260 | 280 | 100 | 90 | diff --git a/docs/docs/computer-vision/useStyleTransfer.md b/docs/docs/computer-vision/useStyleTransfer.md index efbc3448b2..b8d0b56075 100644 --- a/docs/docs/computer-vision/useStyleTransfer.md +++ b/docs/docs/computer-vision/useStyleTransfer.md @@ -83,7 +83,7 @@ function App(){ ### Model size -| Model | XNNPack [MB] | CoreML [MB] | +| Model | XNNPACK [MB] | Core ML [MB] | | ---------------------------- | ------------ | ----------- | | STYLE_TRANSFER_CANDY | 6.78 | 5.22 | | STYLE_TRANSFER_MOSAIC | 6.78 | 5.22 | @@ -92,7 +92,7 @@ function App(){ ### Memory usage -| Model | Android (XNNPack) [MB] | iOS (CoreML) [MB] | +| Model | Android (XNNPACK) [MB] | iOS (Core ML) [MB] | | ---------------------------- | ---------------------- | ----------------- | | STYLE_TRANSFER_CANDY | 950 | 350 | | STYLE_TRANSFER_MOSAIC | 950 | 350 | @@ -105,7 +105,7 @@ function App(){ Times presented in the tables are measured as consecutive runs of the model. Initial run times may be longer due to model loading and initialization. 
::: -| Model | iPhone 16 Pro (CoreML) [ms] | iPhone 13 Pro (CoreML) [ms] | iPhone SE 3 (CoreML) [ms] | Samsung Galaxy S24 (XNNPack) [ms] | OnePlus 12 (XNNPack) [ms] | +| Model | iPhone 16 Pro (Core ML) [ms] | iPhone 13 Pro (Core ML) [ms] | iPhone SE 3 (Core ML) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | | ---------------------------- | --------------------------- | --------------------------- | ------------------------- | --------------------------------- | ------------------------- | | STYLE_TRANSFER_CANDY | 450 | 600 | 750 | 1650 | 1800 | | STYLE_TRANSFER_MOSAIC | 450 | 600 | 750 | 1650 | 1800 | diff --git a/docs/docs/fundamentals/getting-started.md b/docs/docs/fundamentals/getting-started.md index 924bdb3793..025b085ad5 100644 --- a/docs/docs/fundamentals/getting-started.md +++ b/docs/docs/fundamentals/getting-started.md @@ -8,7 +8,7 @@ import TabItem from '@theme/TabItem'; ## What is ExecuTorch? -ExecuTorch is a novel AI framework developed by Meta, designed to streamline deploying PyTorch models on a variety of devices, including mobile phones and microcontrollers. This framework enables exporting models into standalone binaries, allowing them to run locally without requiring API calls. ExecuTorch achieves state-of-the-art performance through optimizations and delegates such as CoreML and XNNPack. It provides a seamless export process with robust debugging options, making it easier to resolve issues if they arise. +ExecuTorch is a novel AI framework developed by Meta, designed to streamline deploying PyTorch models on a variety of devices, including mobile phones and microcontrollers. This framework enables exporting models into standalone binaries, allowing them to run locally without requiring API calls. ExecuTorch achieves state-of-the-art performance through optimizations and delegates such as Core ML and XNNPACK. 
It provides a seamless export process with robust debugging options, making it easier to resolve issues if they arise. ## React Native ExecuTorch diff --git a/docs/docs/llms/running-llms.md b/docs/docs/llms/running-llms.md index 60b0d9a68e..6a60e85214 100644 --- a/docs/docs/llms/running-llms.md +++ b/docs/docs/llms/running-llms.md @@ -93,7 +93,7 @@ There are also cases when you need to check if tokens are being generated, such ### Model size -| Model | XNNPack [GB] | +| Model | XNNPACK [GB] | | --------------------- | ------------ | | LLAMA3_2_1B | 2.47 | | LLAMA3_2_1B_SPINQUANT | 1.14 | @@ -104,7 +104,7 @@ There are also cases when you need to check if tokens are being generated, such ### Memory usage -| Model | Android (XNNPack) [GB] | iOS (XNNPack) [GB] | +| Model | Android (XNNPACK) [GB] | iOS (XNNPACK) [GB] | | --------------------- | ---------------------- | ------------------ | | LLAMA3_2_1B | 3.2 | 3.1 | | LLAMA3_2_1B_SPINQUANT | 1.9 | 2 | @@ -115,7 +115,7 @@ There are also cases when you need to check if tokens are being generated, such ### Inference time -| Model | iPhone 16 Pro (XNNPack) [tokens/s] | iPhone 13 Pro (XNNPack) [tokens/s] | iPhone SE 3 (XNNPack) [tokens/s] | Samsung Galaxy S24 (XNNPack) [tokens/s] | OnePlus 12 (XNNPack) [tokens/s] | +| Model | iPhone 16 Pro (XNNPACK) [tokens/s] | iPhone 13 Pro (XNNPACK) [tokens/s] | iPhone SE 3 (XNNPACK) [tokens/s] | Samsung Galaxy S24 (XNNPACK) [tokens/s] | OnePlus 12 (XNNPACK) [tokens/s] | | --------------------- | ---------------------------------- | ---------------------------------- | -------------------------------- | --------------------------------------- | ------------------------------- | | LLAMA3_2_1B | 16.1 | 11.4 | ❌ | 15.6 | 19.3 | | LLAMA3_2_1B_SPINQUANT | 40.6 | 16.7 | 16.5 | 40.3 | 48.2 | From 31db3914c7e1ad3ed814368037dd9dd3524acb12 Mon Sep 17 00:00:00 2001 From: jakmro Date: Mon, 3 Feb 2025 14:46:39 +0100 Subject: [PATCH 5/6] Change admonition type info -> warning, add more details to 
warning notice --- docs/docs/benchmarks/inference-time.md | 18 +++++----- docs/docs/benchmarks/memory-usage.md | 14 ++++---- docs/docs/benchmarks/model-size.md | 14 ++++---- .../docs/computer-vision/useClassification.md | 16 ++++----- .../computer-vision/useObjectDetection.md | 4 +-- docs/docs/computer-vision/useStyleTransfer.md | 34 +++++++++---------- 6 files changed, 50 insertions(+), 50 deletions(-) diff --git a/docs/docs/benchmarks/inference-time.md b/docs/docs/benchmarks/inference-time.md index 067a3a13ee..920d5fc34b 100644 --- a/docs/docs/benchmarks/inference-time.md +++ b/docs/docs/benchmarks/inference-time.md @@ -3,15 +3,15 @@ title: Inference Time sidebar_position: 3 --- -:::info -Times presented in the tables are measured as consecutive runs of the model. Initial run times may be longer due to model loading and initialization. +:::warning warning +Times presented in the tables are measured as consecutive runs of the model. Initial run times may be up to 2x longer due to model loading and initialization. ::: ## Classification | Model | iPhone 16 Pro (Core ML) [ms] | iPhone 13 Pro (Core ML) [ms] | iPhone SE 3 (Core ML) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | -| ----------------- | --------------------------- | --------------------------- | ------------------------- | --------------------------------- | ------------------------- | -| EFFICIENTNET_V2_S | 100 | 120 | 130 | 180 | 170 | +| ----------------- | ---------------------------- | ---------------------------- | -------------------------- | --------------------------------- | ------------------------- | +| EFFICIENTNET_V2_S | 100 | 120 | 130 | 180 | 170 | ## Object Detection @@ -22,11 +22,11 @@ Times presented in the tables are measured as consecutive runs of the model. 
Ini ## Style Transfer | Model | iPhone 16 Pro (Core ML) [ms] | iPhone 13 Pro (Core ML) [ms] | iPhone SE 3 (Core ML) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | -| ---------------------------- | --------------------------- | --------------------------- | ------------------------- | --------------------------------- | ------------------------- | -| STYLE_TRANSFER_CANDY | 450 | 600 | 750 | 1650 | 1800 | -| STYLE_TRANSFER_MOSAIC | 450 | 600 | 750 | 1650 | 1800 | -| STYLE_TRANSFER_UDNIE | 450 | 600 | 750 | 1650 | 1800 | -| STYLE_TRANSFER_RAIN_PRINCESS | 450 | 600 | 750 | 1650 | 1800 | +| ---------------------------- | ---------------------------- | ---------------------------- | -------------------------- | --------------------------------- | ------------------------- | +| STYLE_TRANSFER_CANDY | 450 | 600 | 750 | 1650 | 1800 | +| STYLE_TRANSFER_MOSAIC | 450 | 600 | 750 | 1650 | 1800 | +| STYLE_TRANSFER_UDNIE | 450 | 600 | 750 | 1650 | 1800 | +| STYLE_TRANSFER_RAIN_PRINCESS | 450 | 600 | 750 | 1650 | 1800 | ## LLMs diff --git a/docs/docs/benchmarks/memory-usage.md b/docs/docs/benchmarks/memory-usage.md index 0526e71cba..868a0884b6 100644 --- a/docs/docs/benchmarks/memory-usage.md +++ b/docs/docs/benchmarks/memory-usage.md @@ -6,8 +6,8 @@ sidebar_position: 2 ## Classification | Model | Android (XNNPACK) [MB] | iOS (Core ML) [MB] | -| ----------------- | ---------------------- | ----------------- | -| EFFICIENTNET_V2_S | 130 | 85 | +| ----------------- | ---------------------- | ------------------ | +| EFFICIENTNET_V2_S | 130 | 85 | ## Object Detection @@ -18,11 +18,11 @@ sidebar_position: 2 ## Style Transfer | Model | Android (XNNPACK) [MB] | iOS (Core ML) [MB] | -| ---------------------------- | ---------------------- | ----------------- | -| STYLE_TRANSFER_CANDY | 950 | 350 | -| STYLE_TRANSFER_MOSAIC | 950 | 350 | -| STYLE_TRANSFER_UDNIE | 950 | 350 | -| STYLE_TRANSFER_RAIN_PRINCESS | 950 | 350 | +| ---------------------------- | 
---------------------- | ------------------ | +| STYLE_TRANSFER_CANDY | 950 | 350 | +| STYLE_TRANSFER_MOSAIC | 950 | 350 | +| STYLE_TRANSFER_UDNIE | 950 | 350 | +| STYLE_TRANSFER_RAIN_PRINCESS | 950 | 350 | ## LLMs diff --git a/docs/docs/benchmarks/model-size.md b/docs/docs/benchmarks/model-size.md index 8a2574a74e..a80f59d47f 100644 --- a/docs/docs/benchmarks/model-size.md +++ b/docs/docs/benchmarks/model-size.md @@ -6,8 +6,8 @@ sidebar_position: 1 ## Classification | Model | XNNPACK [MB] | Core ML [MB] | -| ----------------- | ------------ | ----------- | -| EFFICIENTNET_V2_S | 85.6 | 43.9 | +| ----------------- | ------------ | ------------ | +| EFFICIENTNET_V2_S | 85.6 | 43.9 | ## Object Detection @@ -18,11 +18,11 @@ sidebar_position: 1 ## Style Transfer | Model | XNNPACK [MB] | Core ML [MB] | -| ---------------------------- | ------------ | ----------- | -| STYLE_TRANSFER_CANDY | 6.78 | 5.22 | -| STYLE_TRANSFER_MOSAIC | 6.78 | 5.22 | -| STYLE_TRANSFER_UDNIE | 6.78 | 5.22 | -| STYLE_TRANSFER_RAIN_PRINCESS | 6.78 | 5.22 | +| ---------------------------- | ------------ | ------------ | +| STYLE_TRANSFER_CANDY | 6.78 | 5.22 | +| STYLE_TRANSFER_MOSAIC | 6.78 | 5.22 | +| STYLE_TRANSFER_UDNIE | 6.78 | 5.22 | +| STYLE_TRANSFER_RAIN_PRINCESS | 6.78 | 5.22 | ## LLMs diff --git a/docs/docs/computer-vision/useClassification.md b/docs/docs/computer-vision/useClassification.md index 6aec16e327..db33fed113 100644 --- a/docs/docs/computer-vision/useClassification.md +++ b/docs/docs/computer-vision/useClassification.md @@ -92,21 +92,21 @@ function App() { ### Model size | Model | XNNPACK [MB] | Core ML [MB] | -| ----------------- | ------------ | ----------- | -| EFFICIENTNET_V2_S | 85.6 | 43.9 | +| ----------------- | ------------ | ------------ | +| EFFICIENTNET_V2_S | 85.6 | 43.9 | ### Memory usage | Model | Android (XNNPACK) [MB] | iOS (Core ML) [MB] | -| ----------------- | ---------------------- | ----------------- | -| EFFICIENTNET_V2_S | 130 | 85 | +| ----------------- 
| ---------------------- | ------------------ | +| EFFICIENTNET_V2_S | 130 | 85 | ### Inference time -:::info -Times presented in the tables are measured as consecutive runs of the model. Initial run times may be longer due to model loading and initialization. +:::warning warning +Times presented in the tables are measured as consecutive runs of the model. Initial run times may be up to 2x longer due to model loading and initialization. ::: | Model | iPhone 16 Pro (Core ML) [ms] | iPhone 13 Pro (Core ML) [ms] | iPhone SE 3 (Core ML) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | -| ----------------- | --------------------------- | --------------------------- | ------------------------- | --------------------------------- | ------------------------- | -| EFFICIENTNET_V2_S | 100 | 120 | 130 | 180 | 170 | +| ----------------- | ---------------------------- | ---------------------------- | -------------------------- | --------------------------------- | ------------------------- | +| EFFICIENTNET_V2_S | 100 | 120 | 130 | 180 | 170 | diff --git a/docs/docs/computer-vision/useObjectDetection.md b/docs/docs/computer-vision/useObjectDetection.md index 9d66637c82..a0e3033799 100644 --- a/docs/docs/computer-vision/useObjectDetection.md +++ b/docs/docs/computer-vision/useObjectDetection.md @@ -141,8 +141,8 @@ function App() { ### Inference time -:::info -Times presented in the tables are measured as consecutive runs of the model. Initial run times may be longer due to model loading and initialization. +:::warning warning +Times presented in the tables are measured as consecutive runs of the model. Initial run times may be up to 2x longer due to model loading and initialization. 
::: | Model | iPhone 16 Pro (XNNPACK) [ms] | iPhone 13 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | diff --git a/docs/docs/computer-vision/useStyleTransfer.md b/docs/docs/computer-vision/useStyleTransfer.md index b8d0b56075..6a8a346107 100644 --- a/docs/docs/computer-vision/useStyleTransfer.md +++ b/docs/docs/computer-vision/useStyleTransfer.md @@ -84,30 +84,30 @@ function App(){ ### Model size | Model | XNNPACK [MB] | Core ML [MB] | -| ---------------------------- | ------------ | ----------- | -| STYLE_TRANSFER_CANDY | 6.78 | 5.22 | -| STYLE_TRANSFER_MOSAIC | 6.78 | 5.22 | -| STYLE_TRANSFER_UDNIE | 6.78 | 5.22 | -| STYLE_TRANSFER_RAIN_PRINCESS | 6.78 | 5.22 | +| ---------------------------- | ------------ | ------------ | +| STYLE_TRANSFER_CANDY | 6.78 | 5.22 | +| STYLE_TRANSFER_MOSAIC | 6.78 | 5.22 | +| STYLE_TRANSFER_UDNIE | 6.78 | 5.22 | +| STYLE_TRANSFER_RAIN_PRINCESS | 6.78 | 5.22 | ### Memory usage | Model | Android (XNNPACK) [MB] | iOS (Core ML) [MB] | -| ---------------------------- | ---------------------- | ----------------- | -| STYLE_TRANSFER_CANDY | 950 | 350 | -| STYLE_TRANSFER_MOSAIC | 950 | 350 | -| STYLE_TRANSFER_UDNIE | 950 | 350 | -| STYLE_TRANSFER_RAIN_PRINCESS | 950 | 350 | +| ---------------------------- | ---------------------- | ------------------ | +| STYLE_TRANSFER_CANDY | 950 | 350 | +| STYLE_TRANSFER_MOSAIC | 950 | 350 | +| STYLE_TRANSFER_UDNIE | 950 | 350 | +| STYLE_TRANSFER_RAIN_PRINCESS | 950 | 350 | ### Inference time -:::info -Times presented in the tables are measured as consecutive runs of the model. Initial run times may be longer due to model loading and initialization. +:::warning warning +Times presented in the tables are measured as consecutive runs of the model. Initial run times may be up to 2x longer due to model loading and initialization. 
::: | Model | iPhone 16 Pro (Core ML) [ms] | iPhone 13 Pro (Core ML) [ms] | iPhone SE 3 (Core ML) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | -| ---------------------------- | --------------------------- | --------------------------- | ------------------------- | --------------------------------- | ------------------------- | -| STYLE_TRANSFER_CANDY | 450 | 600 | 750 | 1650 | 1800 | -| STYLE_TRANSFER_MOSAIC | 450 | 600 | 750 | 1650 | 1800 | -| STYLE_TRANSFER_UDNIE | 450 | 600 | 750 | 1650 | 1800 | -| STYLE_TRANSFER_RAIN_PRINCESS | 450 | 600 | 750 | 1650 | 1800 | +| ---------------------------- | ---------------------------- | ---------------------------- | -------------------------- | --------------------------------- | ------------------------- | +| STYLE_TRANSFER_CANDY | 450 | 600 | 750 | 1650 | 1800 | +| STYLE_TRANSFER_MOSAIC | 450 | 600 | 750 | 1650 | 1800 | +| STYLE_TRANSFER_UDNIE | 450 | 600 | 750 | 1650 | 1800 | +| STYLE_TRANSFER_RAIN_PRINCESS | 450 | 600 | 750 | 1650 | 1800 | From 1eacf1f27865ee0cf09a3eec1f0f6a2c83235e03 Mon Sep 17 00:00:00 2001 From: jakmro Date: Mon, 3 Feb 2025 20:22:19 +0100 Subject: [PATCH 6/6] Update insufficient-RAM-indicator description --- docs/docs/benchmarks/inference-time.md | 4 +--- docs/docs/llms/running-llms.md | 4 +--- 2 files changed, 2 insertions(+), 6 deletions(-) diff --git a/docs/docs/benchmarks/inference-time.md b/docs/docs/benchmarks/inference-time.md index 920d5fc34b..c1f91a3b7b 100644 --- a/docs/docs/benchmarks/inference-time.md +++ b/docs/docs/benchmarks/inference-time.md @@ -39,6 +39,4 @@ Times presented in the tables are measured as consecutive runs of the model. Ini | LLAMA3_2_3B_SPINQUANT | 17.2 | 8.2 | ❌ | 16.2 | 19.4 | | LLAMA3_2_3B_QLORA | 14.5 | ❌ | ❌ | 14.8 | 18.1 | -:::info -❌ - Not enough memory -::: +❌ - Insufficient RAM. 
diff --git a/docs/docs/llms/running-llms.md b/docs/docs/llms/running-llms.md index 6a60e85214..36016f2c50 100644 --- a/docs/docs/llms/running-llms.md +++ b/docs/docs/llms/running-llms.md @@ -124,6 +124,4 @@ There are also cases when you need to check if tokens are being generated, such | LLAMA3_2_3B_SPINQUANT | 17.2 | 8.2 | ❌ | 16.2 | 19.4 | | LLAMA3_2_3B_QLORA | 14.5 | ❌ | ❌ | 14.8 | 18.1 | -:::info -❌ - Not enough memory -::: +❌ - Insufficient RAM.
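
The warning admonitions added above distinguish first-run latency (model loading and initialization, up to 2x slower) from the steady-state latency measured over consecutive runs. That methodology can be sketched as follows; the stub model below is purely illustrative and is not the react-native-executorch API — a real benchmark would wrap actual `forward` calls with `performance.now()`.

```javascript
// Sketch of the "consecutive runs" benchmarking methodology. The stub model
// is hypothetical: its first call returns a latency that includes simulated
// load/initialization overhead, later calls return the steady-state latency.
function makeStubModel(firstMs, steadyMs) {
  let calls = 0;
  return {
    forward() {
      calls += 1;
      return calls === 1 ? firstMs : steadyMs; // pretend latency in ms
    },
  };
}

function benchmark(model, runs) {
  const first = model.forward(); // first run: includes load/init cost
  const consecutive = [];
  for (let i = 0; i < runs; i += 1) {
    consecutive.push(model.forward()); // steady-state runs
  }
  const avg = consecutive.reduce((a, b) => a + b, 0) / consecutive.length;
  return { first, consecutive: avg }; // tables report the consecutive figure
}

// Illustrative numbers: STYLE_TRANSFER_CANDY on iPhone 16 Pro,
// first ≈ 850 ms, consecutive ≈ 450 ms.
const result = benchmark(makeStubModel(850, 450), 10);
console.log(result); // { first: 850, consecutive: 450 }
```

Only the averaged consecutive figure appears in the tables, which is why the `First` rows were dropped from the old HTML tables in favor of the admonition.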