diff --git a/docs/docs/benchmarks/_category_.json b/docs/docs/benchmarks/_category_.json new file mode 100644 index 0000000000..8e10f7a3fa --- /dev/null +++ b/docs/docs/benchmarks/_category_.json @@ -0,0 +1,7 @@ +{ + "label": "Benchmarks", + "position": 5, + "link": { + "type": "generated-index" + } +} diff --git a/docs/docs/benchmarks/inference-time.md b/docs/docs/benchmarks/inference-time.md new file mode 100644 index 0000000000..c1f91a3b7b --- /dev/null +++ b/docs/docs/benchmarks/inference-time.md @@ -0,0 +1,42 @@ +--- +title: Inference Time +sidebar_position: 3 +--- + +:::warning warning +Times presented in the tables are measured as consecutive runs of the model. Initial run times may be up to 2x longer due to model loading and initialization. +::: + +## Classification + +| Model | iPhone 16 Pro (Core ML) [ms] | iPhone 13 Pro (Core ML) [ms] | iPhone SE 3 (Core ML) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | +| ----------------- | ---------------------------- | ---------------------------- | -------------------------- | --------------------------------- | ------------------------- | +| EFFICIENTNET_V2_S | 100 | 120 | 130 | 180 | 170 | + +## Object Detection + +| Model | iPhone 16 Pro (XNNPACK) [ms] | iPhone 13 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | +| ------------------------------ | ---------------------------- | ---------------------------- | -------------------------- | --------------------------------- | ------------------------- | +| SSDLITE_320_MOBILENET_V3_LARGE | 190 | 260 | 280 | 100 | 90 | + +## Style Transfer + +| Model | iPhone 16 Pro (Core ML) [ms] | iPhone 13 Pro (Core ML) [ms] | iPhone SE 3 (Core ML) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | +| ---------------------------- | ---------------------------- | ---------------------------- | -------------------------- | --------------------------------- | 
------------------------- | +| STYLE_TRANSFER_CANDY | 450 | 600 | 750 | 1650 | 1800 | +| STYLE_TRANSFER_MOSAIC | 450 | 600 | 750 | 1650 | 1800 | +| STYLE_TRANSFER_UDNIE | 450 | 600 | 750 | 1650 | 1800 | +| STYLE_TRANSFER_RAIN_PRINCESS | 450 | 600 | 750 | 1650 | 1800 | + +## LLMs + +| Model | iPhone 16 Pro (XNNPACK) [tokens/s] | iPhone 13 Pro (XNNPACK) [tokens/s] | iPhone SE 3 (XNNPACK) [tokens/s] | Samsung Galaxy S24 (XNNPACK) [tokens/s] | OnePlus 12 (XNNPACK) [tokens/s] | +| --------------------- | ---------------------------------- | ---------------------------------- | -------------------------------- | --------------------------------------- | ------------------------------- | +| LLAMA3_2_1B | 16.1 | 11.4 | ❌ | 15.6 | 19.3 | +| LLAMA3_2_1B_SPINQUANT | 40.6 | 16.7 | 16.5 | 40.3 | 48.2 | +| LLAMA3_2_1B_QLORA | 31.8 | 11.4 | 11.2 | 37.3 | 44.4 | +| LLAMA3_2_3B | ❌ | ❌ | ❌ | ❌ | 7.1 | +| LLAMA3_2_3B_SPINQUANT | 17.2 | 8.2 | ❌ | 16.2 | 19.4 | +| LLAMA3_2_3B_QLORA | 14.5 | ❌ | ❌ | 14.8 | 18.1 | + +❌ - Insufficient RAM. 
diff --git a/docs/docs/benchmarks/memory-usage.md b/docs/docs/benchmarks/memory-usage.md new file mode 100644 index 0000000000..868a0884b6 --- /dev/null +++ b/docs/docs/benchmarks/memory-usage.md @@ -0,0 +1,36 @@ +--- +title: Memory Usage +sidebar_position: 2 +--- + +## Classification + +| Model | Android (XNNPACK) [MB] | iOS (Core ML) [MB] | +| ----------------- | ---------------------- | ------------------ | +| EFFICIENTNET_V2_S | 130 | 85 | + +## Object Detection + +| Model | Android (XNNPACK) [MB] | iOS (XNNPACK) [MB] | +| ------------------------------ | ---------------------- | ------------------ | +| SSDLITE_320_MOBILENET_V3_LARGE | 90 | 90 | + +## Style Transfer + +| Model | Android (XNNPACK) [MB] | iOS (Core ML) [MB] | +| ---------------------------- | ---------------------- | ------------------ | +| STYLE_TRANSFER_CANDY | 950 | 350 | +| STYLE_TRANSFER_MOSAIC | 950 | 350 | +| STYLE_TRANSFER_UDNIE | 950 | 350 | +| STYLE_TRANSFER_RAIN_PRINCESS | 950 | 350 | + +## LLMs + +| Model | Android (XNNPACK) [GB] | iOS (XNNPACK) [GB] | +| --------------------- | ---------------------- | ------------------ | +| LLAMA3_2_1B | 3.2 | 3.1 | +| LLAMA3_2_1B_SPINQUANT | 1.9 | 2 | +| LLAMA3_2_1B_QLORA | 2.2 | 2.5 | +| LLAMA3_2_3B | 7.1 | 7.3 | +| LLAMA3_2_3B_SPINQUANT | 3.7 | 3.8 | +| LLAMA3_2_3B_QLORA | 4 | 4.1 | diff --git a/docs/docs/benchmarks/model-size.md b/docs/docs/benchmarks/model-size.md new file mode 100644 index 0000000000..a80f59d47f --- /dev/null +++ b/docs/docs/benchmarks/model-size.md @@ -0,0 +1,36 @@ +--- +title: Model Size +sidebar_position: 1 +--- + +## Classification + +| Model | XNNPACK [MB] | Core ML [MB] | +| ----------------- | ------------ | ------------ | +| EFFICIENTNET_V2_S | 85.6 | 43.9 | + +## Object Detection + +| Model | XNNPACK [MB] | +| ------------------------------ | ------------ | +| SSDLITE_320_MOBILENET_V3_LARGE | 13.9 | + +## Style Transfer + +| Model | XNNPACK [MB] | Core ML [MB] | +| ---------------------------- | ------------ | 
------------ | +| STYLE_TRANSFER_CANDY | 6.78 | 5.22 | +| STYLE_TRANSFER_MOSAIC | 6.78 | 5.22 | +| STYLE_TRANSFER_UDNIE | 6.78 | 5.22 | +| STYLE_TRANSFER_RAIN_PRINCESS | 6.78 | 5.22 | + +## LLMs + +| Model | XNNPACK [GB] | +| --------------------- | ------------ | +| LLAMA3_2_1B | 2.47 | +| LLAMA3_2_1B_SPINQUANT | 1.14 | +| LLAMA3_2_1B_QLORA | 1.18 | +| LLAMA3_2_3B | 6.43 | +| LLAMA3_2_3B_SPINQUANT | 2.55 | +| LLAMA3_2_3B_QLORA | 2.65 | diff --git a/docs/docs/computer-vision/useClassification.mdx b/docs/docs/computer-vision/useClassification.md similarity index 79% rename from docs/docs/computer-vision/useClassification.mdx rename to docs/docs/computer-vision/useClassification.md index 1043088b21..db33fed113 100644 --- a/docs/docs/computer-vision/useClassification.mdx +++ b/docs/docs/computer-vision/useClassification.md @@ -86,3 +86,27 @@ function App() { | Model | Number of classes | Class list | | --------------------------------------------------------------------------------------------------------------- | ----------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | [efficientnet_v2_s](https://pytorch.org/vision/0.20/models/generated/torchvision.models.efficientnet_v2_s.html) | 1000 | [ImageNet1k_v1](https://github.com/software-mansion/react-native-executorch/blob/main/android/src/main/java/com/swmansion/rnexecutorch/models/classification/Constants.kt) | + +## Benchmarks + +### Model size + +| Model | XNNPACK [MB] | Core ML [MB] | +| ----------------- | ------------ | ------------ | +| EFFICIENTNET_V2_S | 85.6 | 43.9 | + +### Memory usage + +| Model | Android (XNNPACK) [MB] | iOS (Core ML) [MB] | +| ----------------- | ---------------------- | ------------------ | +| EFFICIENTNET_V2_S | 130 | 85 | + +### Inference time + +:::warning warning +Times presented in the tables are measured as consecutive runs of the model. 
Initial run times may be up to 2x longer due to model loading and initialization. +::: + +| Model | iPhone 16 Pro (Core ML) [ms] | iPhone 13 Pro (Core ML) [ms] | iPhone SE 3 (Core ML) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | +| ----------------- | ---------------------------- | ---------------------------- | -------------------------- | --------------------------------- | ------------------------- | +| EFFICIENTNET_V2_S | 100 | 120 | 130 | 180 | 170 | diff --git a/docs/docs/computer-vision/useObjectDetection.mdx b/docs/docs/computer-vision/useObjectDetection.md similarity index 82% rename from docs/docs/computer-vision/useObjectDetection.mdx rename to docs/docs/computer-vision/useObjectDetection.md index 5de3da41cc..a0e3033799 100644 --- a/docs/docs/computer-vision/useObjectDetection.mdx +++ b/docs/docs/computer-vision/useObjectDetection.md @@ -124,3 +124,27 @@ function App() { | Model | Number of classes | Class list | | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------- | --------------------------------------------------------------------------------------------------------------------------------------------------- | | [SSDLite320 MobileNetV3 Large](https://pytorch.org/vision/main/models/generated/torchvision.models.detection.ssdlite320_mobilenet_v3_large.html#torchvision.models.detection.SSDLite320_MobileNet_V3_Large_Weights) | 91 | [COCO](https://github.com/software-mansion/react-native-executorch/blob/69802ee1ca161d9df00def1dabe014d36341cfa9/src/types/object_detection.ts#L14) | + +## Benchmarks + +### Model size + +| Model | XNNPACK [MB] | +| ------------------------------ | ------------ | +| SSDLITE_320_MOBILENET_V3_LARGE | 13.9 | + +### Memory usage + +| Model | Android (XNNPACK) [MB] | iOS (XNNPACK) [MB] | +| ------------------------------ | 
---------------------- | ------------------ | +| SSDLITE_320_MOBILENET_V3_LARGE | 90 | 90 | + +### Inference time + +:::warning warning +Times presented in the tables are measured as consecutive runs of the model. Initial run times may be up to 2x longer due to model loading and initialization. +::: + +| Model | iPhone 16 Pro (XNNPACK) [ms] | iPhone 13 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | +| ------------------------------ | ---------------------------- | ---------------------------- | -------------------------- | --------------------------------- | ------------------------- | +| SSDLITE_320_MOBILENET_V3_LARGE | 190 | 260 | 280 | 100 | 90 | diff --git a/docs/docs/computer-vision/useStyleTransfer.mdx b/docs/docs/computer-vision/useStyleTransfer.md similarity index 61% rename from docs/docs/computer-vision/useStyleTransfer.mdx rename to docs/docs/computer-vision/useStyleTransfer.md index c5a5e3e0d2..6a8a346107 100644 --- a/docs/docs/computer-vision/useStyleTransfer.mdx +++ b/docs/docs/computer-vision/useStyleTransfer.md @@ -78,3 +78,36 @@ function App(){ - [Mosaic](https://github.com/pytorch/examples/tree/main/fast_neural_style) - [Udnie](https://github.com/pytorch/examples/tree/main/fast_neural_style) - [Rain princess](https://github.com/pytorch/examples/tree/main/fast_neural_style) + +## Benchmarks + +### Model size + +| Model | XNNPACK [MB] | Core ML [MB] | +| ---------------------------- | ------------ | ------------ | +| STYLE_TRANSFER_CANDY | 6.78 | 5.22 | +| STYLE_TRANSFER_MOSAIC | 6.78 | 5.22 | +| STYLE_TRANSFER_UDNIE | 6.78 | 5.22 | +| STYLE_TRANSFER_RAIN_PRINCESS | 6.78 | 5.22 | + +### Memory usage + +| Model | Android (XNNPACK) [MB] | iOS (Core ML) [MB] | +| ---------------------------- | ---------------------- | ------------------ | +| STYLE_TRANSFER_CANDY | 950 | 350 | +| STYLE_TRANSFER_MOSAIC | 950 | 350 | +| STYLE_TRANSFER_UDNIE | 950 | 350 | +| STYLE_TRANSFER_RAIN_PRINCESS | 
950 | 350 | + +### Inference time + +:::warning warning +Times presented in the tables are measured as consecutive runs of the model. Initial run times may be up to 2x longer due to model loading and initialization. +::: + +| Model | iPhone 16 Pro (Core ML) [ms] | iPhone 13 Pro (Core ML) [ms] | iPhone SE 3 (Core ML) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | +| ---------------------------- | ---------------------------- | ---------------------------- | -------------------------- | --------------------------------- | ------------------------- | +| STYLE_TRANSFER_CANDY | 450 | 600 | 750 | 1650 | 1800 | +| STYLE_TRANSFER_MOSAIC | 450 | 600 | 750 | 1650 | 1800 | +| STYLE_TRANSFER_UDNIE | 450 | 600 | 750 | 1650 | 1800 | +| STYLE_TRANSFER_RAIN_PRINCESS | 450 | 600 | 750 | 1650 | 1800 | diff --git a/docs/docs/fundamentals/getting-started.mdx b/docs/docs/fundamentals/getting-started.md similarity index 94% rename from docs/docs/fundamentals/getting-started.mdx rename to docs/docs/fundamentals/getting-started.md index a53be6d147..025b085ad5 100644 --- a/docs/docs/fundamentals/getting-started.mdx +++ b/docs/docs/fundamentals/getting-started.md @@ -7,12 +7,15 @@ import Tabs from '@theme/Tabs'; import TabItem from '@theme/TabItem'; ## What is ExecuTorch? -ExecuTorch is a novel AI framework developed by Meta, designed to streamline deploying PyTorch models on a variety of devices, including mobile phones and microcontrollers. This framework enables exporting models into standalone binaries, allowing them to run locally without requiring API calls. ExecuTorch achieves state-of-the-art performance through optimizations and delegates such as CoreML and XNNPack. It provides a seamless export process with robust debugging options, making it easier to resolve issues if they arise. + +ExecuTorch is a novel AI framework developed by Meta, designed to streamline deploying PyTorch models on a variety of devices, including mobile phones and microcontrollers. 
This framework enables exporting models into standalone binaries, allowing them to run locally without requiring API calls. ExecuTorch achieves state-of-the-art performance through optimizations and delegates such as Core ML and XNNPACK. It provides a seamless export process with robust debugging options, making it easier to resolve issues if they arise. ## React Native ExecuTorch + React Native ExecuTorch is our way of bringing ExecuTorch into the React Native world. Our API is built to be simple, declarative, and efficient. Plus, we’ll provide a set of pre-exported models for common use cases, so you won’t have to worry about handling exports yourself. With just a few lines of JavaScript, you’ll be able to run AI models (even LLMs 👀) right on your device—keeping user data private and saving on cloud costs. ## Installation + Installation is pretty straightforward, just use your favorite package manager. @@ -54,12 +57,15 @@ Because we are using ExecuTorch under the hood, you won't be able to build iOS a ::: Running the app with the library: + ```bash yarn run expo: -d ``` ## Good reads -If you want to dive deeper into ExecuTorch or our previous work with the framework, we highly encourage you to check out the following resources: + +If you want to dive deeper into ExecuTorch or our previous work with the framework, we highly encourage you to check out the following resources: + - [ExecuTorch docs](https://pytorch.org/executorch/stable/index.html) - [Native code for iOS](https://medium.com/swmansion/bringing-native-ai-to-your-mobile-apps-with-executorch-part-i-ios-f1562a4556e8?source=user_profile_page---------0-------------250189c98ccf---------------) - [Native code for Android](https://medium.com/swmansion/bringing-native-ai-to-your-mobile-apps-with-executorch-part-ii-android-29431b6b9f7f?source=user_profile_page---------2-------------b8e3a5cb1c63---------------) diff --git a/docs/docs/llms/exporting-llama.mdx b/docs/docs/llms/exporting-llama.md similarity index 
66% rename from docs/docs/llms/exporting-llama.mdx rename to docs/docs/llms/exporting-llama.md index 5acbb5d202..28b3ceb997 100644 --- a/docs/docs/llms/exporting-llama.mdx +++ b/docs/docs/llms/exporting-llama.md @@ -3,32 +3,41 @@ title: Exporting Llama sidebar_position: 2 --- -In order to make the process of export as simple as possible for you, we created a script that runs a Docker container and exports the model. +In order to make the process of export as simple as possible for you, we created a script that runs a Docker container and exports the model. ## Steps to export Llama + ### 1. Create an account -Get a [HuggingFace](https://huggingface.co/) account. This will allow you to download needed files. You can also use the [official Llama website](https://www.llama.com/llama-downloads/). + +Get a [HuggingFace](https://huggingface.co/) account. This will allow you to download needed files. You can also use the [official Llama website](https://www.llama.com/llama-downloads/). ### 2. Select a model + Pick the model that suits your needs. Before you download it, you'll need to accept a license. 
For best performance, we recommend using Spin-Quant or QLoRA versions of the model: - - [Llama 3.2 3B](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct/tree/main/original) - - [Llama 3.2 1B](https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct/tree/main/original) - - [Llama 3.2 3B Spin-Quant](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct-SpinQuant_INT4_EO8/tree/main) - - [Llama 3.2 1B Spin-Quant](https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct-SpinQuant_INT4_EO8/tree/main) - - [Llama 3.2 3B QLoRA](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct-QLORA_INT4_EO8/tree/main) - - [Llama 3.2 1B QLoRA](https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct-QLORA_INT4_EO8/tree/main) + +- [Llama 3.2 3B](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct/tree/main/original) +- [Llama 3.2 1B](https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct/tree/main/original) +- [Llama 3.2 3B Spin-Quant](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct-SpinQuant_INT4_EO8/tree/main) +- [Llama 3.2 1B Spin-Quant](https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct-SpinQuant_INT4_EO8/tree/main) +- [Llama 3.2 3B QLoRA](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct-QLORA_INT4_EO8/tree/main) +- [Llama 3.2 1B QLoRA](https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct-QLORA_INT4_EO8/tree/main) ### 3. Download files + Download the `consolidated.00.pth`, `params.json` and `tokenizer.model` files. If you can't see them, make sure to check the `original` directory. ### 4. Rename the tokenizer file + Rename the `tokenizer.model` file to `tokenizer.bin` as required by the library: + ```bash mv tokenizer.model tokenizer.bin ``` ### 5. 
Run the export script -Navigate to the `llama_export` directory and run the following command: + +Navigate to the `llama_export` directory and run the following command: + ```bash ./build_llama_binary.sh --model-path /path/to/consolidated.00.pth --params-path /path/to/params.json ``` diff --git a/docs/docs/llms/running-llms.md b/docs/docs/llms/running-llms.md index a00bad237b..36016f2c50 100644 --- a/docs/docs/llms/running-llms.md +++ b/docs/docs/llms/running-llms.md @@ -45,15 +45,15 @@ Given computational constraints, our architecture is designed to support only on ### Returns -| Field | Type | Description | -| ------------------- | ---------------------------------- | --------------------------------------------------------------------------------------------------------------- | -| `generate` | `(input: string) => Promise` | Function to start generating a response with the given input string. | -| `response` | `string` | State of the generated response. This field is updated with each token generated by the model | -| `error` | string | null | Contains the error message if the model failed to load | -| `isGenerating` | `boolean` | Indicates whether the model is currently generating a response | -| `interrupt` | `() => void` | Function to interrupt the current inference | -| `isReady` | `boolean` | Indicates whether the model is ready | -| `downloadProgress` | `number` | Represents the download progress as a value between 0 and 1, indicating the extent of the model file retrieval. | +| Field | Type | Description | +| ------------------ | ---------------------------------- | --------------------------------------------------------------------------------------------------------------- | +| `generate` | `(input: string) => Promise` | Function to start generating a response with the given input string. | +| `response` | `string` | State of the generated response. 
This field is updated with each token generated by the model | +| `error` | `string \| null` | Contains the error message if the model failed to load | +| `isGenerating` | `boolean` | Indicates whether the model is currently generating a response | +| `interrupt` | `() => void` | Function to interrupt the current inference | +| `isReady` | `boolean` | Indicates whether the model is ready | +| `downloadProgress` | `number` | Represents the download progress as a value between 0 and 1, indicating the extent of the model file retrieval. | ## Sending a message @@ -88,3 +88,40 @@ Behind the scenes, tokens are generated one by one, and the response property is Sometimes, you might want to stop the model while it’s generating. To do this, you can use `interrupt()`, which will halt the model and append the current response to its internal conversation state. There are also cases when you need to check if tokens are being generated, such as to conditionally render a stop button. We’ve made this easy with the `isTokenBeingGenerated` property.
+ +## Benchmarks + +### Model size + +| Model | XNNPACK [GB] | +| --------------------- | ------------ | +| LLAMA3_2_1B | 2.47 | +| LLAMA3_2_1B_SPINQUANT | 1.14 | +| LLAMA3_2_1B_QLORA | 1.18 | +| LLAMA3_2_3B | 6.43 | +| LLAMA3_2_3B_SPINQUANT | 2.55 | +| LLAMA3_2_3B_QLORA | 2.65 | + +### Memory usage + +| Model | Android (XNNPACK) [GB] | iOS (XNNPACK) [GB] | +| --------------------- | ---------------------- | ------------------ | +| LLAMA3_2_1B | 3.2 | 3.1 | +| LLAMA3_2_1B_SPINQUANT | 1.9 | 2 | +| LLAMA3_2_1B_QLORA | 2.2 | 2.5 | +| LLAMA3_2_3B | 7.1 | 7.3 | +| LLAMA3_2_3B_SPINQUANT | 3.7 | 3.8 | +| LLAMA3_2_3B_QLORA | 4 | 4.1 | + +### Inference time + +| Model | iPhone 16 Pro (XNNPACK) [tokens/s] | iPhone 13 Pro (XNNPACK) [tokens/s] | iPhone SE 3 (XNNPACK) [tokens/s] | Samsung Galaxy S24 (XNNPACK) [tokens/s] | OnePlus 12 (XNNPACK) [tokens/s] | +| --------------------- | ---------------------------------- | ---------------------------------- | -------------------------------- | --------------------------------------- | ------------------------------- | +| LLAMA3_2_1B | 16.1 | 11.4 | ❌ | 15.6 | 19.3 | +| LLAMA3_2_1B_SPINQUANT | 40.6 | 16.7 | 16.5 | 40.3 | 48.2 | +| LLAMA3_2_1B_QLORA | 31.8 | 11.4 | 11.2 | 37.3 | 44.4 | +| LLAMA3_2_3B | ❌ | ❌ | ❌ | ❌ | 7.1 | +| LLAMA3_2_3B_SPINQUANT | 17.2 | 8.2 | ❌ | 16.2 | 19.4 | +| LLAMA3_2_3B_QLORA | 14.5 | ❌ | ❌ | 14.8 | 18.1 | + +❌ - Insufficient RAM. diff --git a/docs/docs/module-api/executorch-bindings.md b/docs/docs/module-api/executorch-bindings.md index c7c6ecdb63..4147008673 100644 --- a/docs/docs/module-api/executorch-bindings.md +++ b/docs/docs/module-api/executorch-bindings.md @@ -76,7 +76,6 @@ const executorchModule = useExecutorchModule({ }); ``` - ## Setting up input parameters To prepare the input for the model, define the shape of the input tensor. This shape depends on the model's requirements. 
For the `STYLE_TRANSFER_CANDY` model, we need a tensor of shape `[1, 3, 640, 640]`, corresponding to a batch size of 1, 3 color channels (RGB), and dimensions of 640x640 pixels.