llama.android : Rewrite Android binding (w/o cpu_features dep)#17413
llama.android : Rewrite Android binding (w/o cpu_features dep)#17413ggerganov merged 275 commits intoggml-org:masterfrom
Conversation
…k navigation inside conversation and benchmark
… enable alert dialog on back navigation inside conversation and benchmark
…ens in generation
… use NavigationActions instead of raw navController
|
just merged and resolved conflicts with changes introduced 3 days ago from cb44fc8 |
|
Not sure why one test case in
|
66b92ef to
21d1876
Compare
hmm... this time |
|
The CI failures are not related to this PR, don't worry about it. @0cc4m @jeffbolznv It might be nothing, but the vulkan failure looks a bit suspicious, the |
Thanks for looking into it! I assume Anyways, plz keep me posted and let me know if there's any other concerns you have regarding this PR, happy to address them! 😊 |
|
Hi @ggerganov @slaren - is this all good to merge in? |
|
Happy Thanksgiving to all the reviewers and collaborators on this Android binding btw 🦃 🍁 |
|
Hello, wondering if there's going to be any movement on this? Let us know if there's anything we can help with/needs doing. |
|
Apologies for the lack of attention here. I don't have capacity to look into this in detail and I was hoping we will get some feedback from users, but so far there isn't much. I think there are 2 approaches we can take:
|
Hi @ggerganov, I understand your concerns. If it were me, I'd also not be comfortable reviewing some backend or infra code which is obviously outside of my domain. We are not here trying to push or even force you take something you don't feel safe or useful to accept. Let's take a step back (again lol) and review the approaches ahead: Approach A: Freeze this PR, perhaps add a link into your main README.md's "bindings" section? either pointing to Arm's Android binding repo (with cpu_features) or my patched (without cpu_features) Approach B Review this PR sometime in near future, preferably with an Android expert you trust, because:
You can easily try out our production app (https://play.google.com/store/apps/details?id=com.arm.aichat) not only on any recent Android or even ChromeOS device (not necessarily an expensive flagship one), but also simply on your computer using Android Studio's emulator or a 3rd party emulator such as https://waydro.id, as long as it comes with Google Play (for finding & installing this app). Finally, here's how the production app works - you can visually see the great potential of how a good Android binding on top of your foundational system can make a difference in this genAI ecosystem, while we are giving the source code to you for free: Chat.short.mp4Finally, thanks again to both you and @slaren for honestly sharing your concerns with us. Please let us know how you'd like to proceed! Merry Christmas 🎄https://open.spotify.com/track/2FRnf9qhLbvw8fu4IBXx78 |
|
in Android domain, your llama.cpp is yet still a hidden gem 💎 post-short.mp4at least let's set aside some time to review this when you are available, because it'd be such a waste of its great potential by missing the opportunity to expand GGML's impact beyond desktops, directly into everyone's everyday life, into their palms. 🙏 |
|
I don't have observations about Google's MediaPipe, but one of the apps in this domain that I think is well know is PocketPal AI. It's also based on top of llama.cpp. Curious have you benchmarked against it? |
Hi @ggerganov, YES we have! PocketPal is using llama.rn, which is covered in those benchmarks. My teammate @BmanClark can provide you the benchmark results, either the summaries, or detailed raw data (perhaps over a Zoom/Teams meeting due to a little sensitivity). TL;DR, llama.rn cannot deliver this good performance due to:
|
|
Alright, let's give it a try. The android example has been stale for quite a while, so in that regard refreshing it might be a good idea. Or at least it's worth a try. I'll ping you for any support that we might need (issues, bugs, CI, etc.) and in case we are not able to handle those issues, will likely roll back to the old example. I tried to get things running on my end to test these changes but Android Studio has changed a lot since I last used it and struggling to update NDK or whatever it needs to update. So gave up on this for now. |
|
Thank you very much for giving us the opportunity to showcase and maintain this binding! |
…org#17413) * UI: implement basic UI components * util: implement performance monitor; wrap it with a viewmodel * util: implement user preferences utility * UI: implement core flow's screens * UI: add a new MainActivity; update manifest * [WIP] DI: implement simple local vm factory provider * UI: disable triggering drawer via gesture; enable alert dialog on back navigation inside conversation and benchmark * UI: allow drawer's gesture control only on Home and Settings screens; enable alert dialog on back navigation inside conversation and benchmark * UI: split a nested parent settings screen into separate child settings screens * UI: polish system prompt setup UI * Deps: bump Kotlin plugin; introduce KSP; apply in :app subproject * DB: setup Room database * data: introduce repo for System Prompt; flow data from Room to VM * bugfix: properly handle user's quitting conversation screen while tokens in generation * UI: rename `ModeSelection` to `ModelLoading` for better clarity * UI: update app name to be more Arm * UI: polish conversation screen * data: code polish * UI: code polish * bugfix: handle user quitting on model loading * UI: locks user in alert dialog when model is unloading * vm: replace token metrics stubs with actual implementation * UI: refactor top app bars * nit: combine temperatureMetrics and useFahrenheit * DI: introduce Hilt plugin + processor + lib dependencies * DI: make app Hilt injectable * DI: make viewmodels Hilt injectable * DI: replace manual DI with Hilt DI * UI: optimize AppContent's composing * bugfix: wait for model to load before navigating to benchmark screen; use NavigationActions instead of raw navController * UI: navigation with more natural animated transitions * DI: Optimize AppModule * Feature: Introduce ModelRepository and ModelsManagementViewModel; update AppModule * UI: polish UI for ModelsManagementScreen; inject ModelsManagementVieModel * DI: abstract the protocol of SystemPromptRepository; update AppModule * data: [WIP] prepare for ModelRepository refactor & impl * data: introduce Model entity and DAO; update DI module * UI: replace Models Management screen's stubbing with instrumentation * UI: polish sort order menu * data: import local model with file picker * bugfix: use List instead of Collection for ModelDao's deletion * data: add a util file for extracting file name & size and model metadata * UI: enrich ModelManagementState; extract filename to show correct importing UI * UI: implement multiple models deletion; update Models Management screen * UI: handle back navigation when user is in multi-selection mode * util: extract file size formatting into ModelUtils * UI: add a confirmation step when user picks a file; refactor model import overlay into AlertDialog * UI: extract a shared ModelCard component * UI: replace model selection screen's data stubbing; add empty view * nit: tidy SystemPromptViewModel * Util: split FileUtils from ModelUtils; extract copy methods into FileUtils * data: pass through getModelById from ModelDao into ModelRepository * core: extract conversation and benchmark logics into InferenceManager; add logs and missing state updates in stub InferenceEngine * vm: split mono MainViewModel into separate individual ViewModels * vm: merge SystemPromptViewModel into ModelLoadingViewModel * core: break down InferenceManager due to Interface Segregation Principle * UI: show model card in Model Loading screen * UI: show model card in Conversation screen * UI: unify Model Card components * core: swap in LLamaAndroid and mark stub engine for testing only * data: allow canceling the ongoing model import * UI: update UI ongoing model import's cancellation * LLama: update engine state after handling the cancellation of sendUserPrompt * VM: handle the cancellation of ongoing token generation * LLama: refactor loadModel by splitting the system prompt setting into a separate method * feature: check for available space before copying local model * UI: centralize the AppScaffold and modularize its configs * UI: refactor BottomBarConfig.ModelsManagement APIs * UI: combine TopBarConfig and BottomBarConfig into each route's ScaffoldConfig * UI: replace ugly optional as casts in AppScaffold with extension functions * UI: fix the typo `totalGb` in `StorageMetrics` * UI: remove code duplication in sort menu * LLama: add ModelUnloadingState to engine State; add missing state checks in stub engine; fix instrumentation engine's error messages * UI: refactor back handling by removing centralized BackHandlerSetup and UnloadModelConfirmationDialog from AppContent * UI: implement BenchmarkScreen's individual back handling * LLama: add a new Initializing state; ; add two extension properties; rename LibraryLoaded state to Initialized * UI: Introduce an abstract ViewModel to handle additional model unloading logics * UI: expose a single facade ModelUnloadDialogHandler; move UnloadModelState into ModelUnloadingViewModel.kt * UI: migrate ModelLoadingScreen onto ModelLoadingViewModel; update & refine ModelLoadingScreen * UI: migrate ConversationViewModel onto ModelLoadingViewModel; update & refine ConversationScreen * nit: extract app name into a constant value; remove unused onBackPressed callbacks * UI: update AppContent to pass in correct navigation callbacks * nit: polish ModelLoadingScreen UI * core: throw Exception instead of returning null if model fails to load * navigation: sink model loading state management from AppContent down into ModelLoadingScreen; pass ModelLoadingMetrics to Benchmark and Conversation screens * gguf: add GGUF metadata data holder and its corresponding extractor implementation * DB: introduce Kotlin serialization extension's library and plugin; add Room runtime library * GGUF: make GgufMetadata serializable in order to be compatible with Room * nit: refactor data.local package structure * nit: rename lastUsed field to dateLastUsed; add dateAdded field * UI: refactor ModelCard UI to show GGUF metadata * UI: update ModelSelectionScreen with a preselect mechanism * UI: polish model card * nit: allow deselect model on Model Selection screen * nit: revert accidental committing of debug code * UI: polish ModelLoading screen * util: extract formatting helper functions from FileUtils into a new FormatUtils * UI: polish model cards on Benchmark and Conversation screens to show model loading metrics * UI: show a Snack bar to warn user that system prompt is not always supported * UI: handle back press on Model Selection screen * UI: finally support theme modes; remove hardcoded color schemes, default to dynamic color scheme implementation * feature: support searching on Model Selection screen * nit: move scaffold related UI components into a separate package * UI: extract InfoView out into a separate file for reusability * data: move Model related actions (query, filter, sort) into ModelInfo file * UI: animate FAB on model preselection states * feature: support filtering in Model Management screen * ui: show empty models info in Model Management screen * ui: add filter off icon to "Clear filters" menu item * [WIP] ui: polish Benchmark screen; implement its bottom app bar * ui: polish Benchmark screen; implement its bottom app bar's rerun and share * nit: disable mode selection's radio buttons when loading model * feature: implement Conversation screen's bottom app bar * pkg: restructure BottomAppBars into separate files in a child package * pkg: restructure TopBarApps into separate files in a child package * pkg: restructure system metrics into a separate file * UI: polish Conversation screen * data: update system prompt presets * UI: allow hide or show model card on Conversation & Benchmark screens; fix message arrangement * data: update & enhance system prompt presets * deps: introduce Retrofit2 * data: implement HuggingFace data model, data source with Retrofit API * data: update Model data repository to support fetching HuggingFace models * [WIP] UI: replace the HuggingFace stub in Model Management screen with actual API call * UI: map language codes into country Emojis * ui: add "clear results" action to Benchmark screen * nit: print current pp & tg in llama-bench * UI: disable landscape mode; prevent duplicated benchmark running * llama: migrate C/CXX flags into CMakeList * [WIP] llama: ABI split builds five .so artifacts. However, all .so are performing on SVE level * [WIP] llama: ABI split where five tiers are built sequentially. * [WIP] llama: disable OpenMP in ABI split since most SoCs are big.LITTLE * [WIP] llama: enable KleidiAI and disable tier 4 due to `+sve+sve2` bug caused by `ggml_add_cpu_backend_variant_impl` as explained below ```CMake if (NOT SME_ENABLED MATCHES -1) ... set(PRIVATE_ARCH_FLAGS "-fno-tree-vectorize;${PRIVATE_ARCH_FLAGS}+sve+sve2") ... ``` * core: add Google's cpu_features as a submodule * core: implement cpu_detector native lib * core: swap out hardcoded LlamaAndroid library loading * core: add back OpenMP due to huge perf loss on TG128 * misc: reorg the pkg structure * misc: rename LlamaAndroid related class to InferenceEngine prefixes * [WIP] lib: move GgufMetadata into the lib submodule * lib: expose GgufMetadataReader as interface only * lib: replace the naive & plain SharedPreferences with DataStore implementation * lib: hide the internal implementations, only expose a facade and interfaces * lib: expose Arm features * di: add a stub TierDetection; provide both actual impl and stub in AppModule * UI: add visualizer UI for Arm features * misc: UI polish * lib: refactored InferenceEngineLoader; added a `NONE` Llama Tier * UI: support `NONE` Llama Tier in general settings * lib: optimize engine loader; always perform a fresh detection when cache is null * remote: add HuggingFaceModelDetails data class * remote: refine HuggingFaceModel data class * nit: remove `trendingScore` field from HuggingFace model entities, weird... * remote: refactor HuggingFaceApiService; implement download feature in HuggingFaceRemoteDataSource * remote: fix the incorrect parse of HuggingFace's inconsistent & weird JSON response * UI: scaffold Models Management screen and view model * UI: implement a dialog UI to show fetched HuggingFace models. * UI: use a broadcast receiver to listen for download complete events and show local import dialog. * data: handle network exceptions elegantly * pkg: restructure `data`'s packages * data: extract local file info, copy and cleanup logics into LocalFileDataSource * nit: minor UI patch; add missing comments * bugfix: tapping "Home" in navigation drawer should simply close it without any navigation action. * UI: improve autoscroll during token generation * lib: tested on JFrog Artifactory for Maven publishing * UI: show RAM warning if model too large * UI: polish model management screen's error dialog * util: add more items into the mapping table of ISO 639-1 language code to ISO 3166-1 country code * llm: properly propagate error to UI upon failing to load selected model * UI: avoid duplicated calculation of token metrics * lib: read & validate the magic number from the picked source file before executing the import * UI: add "Learn More" hyperlinks to Error dialog upon model import failures * lib: refactor the GgufMetadataReader to take InputStream instead of absolute path as argument * lib: fix the `SIMD` typo in Tier description * core: verify model file path is readable * lib: add UnsupportedArchitectureException for triaged error message * util: split FormatUtils into multiple utils for better readability * UI: change benchmark screen from raw markdown to table view * bugfix: reset preselection upon running the preselected model * misc: linter issue * bugfix: fix the malfunctioning monitoring switch * UI: update Arm features indicator; fix the broken hyperlinks * UI: add quick action buttons to benchmark screen's result card * UI: hide share fab after clearing all benchmark results * UI: fix the model unload dialog message; elevate the model card and hide it by default on Conversation screen; * UI: hide the stubbing actions in Conversation screen * UI: add show/hide stats control to conversation screen's assistant message bubble; fix placeholder * UI: add a info button to explain token metrics * misc: remove the redundant `Companion` added due to refactoring * UI: show corresponding system metrics detailed info upon tapping RAM / storage / temperature indicator * UI: add info button to System Prompt switch; expand the model card by default * UI: disable tag & language chips; add section headers to explain what they are * misc: replace top bar indicator's spacer with padding * UI: merge the Model Selection and Model Management into a unified Models screen * UI: split the ModelsManagementViewModel from a unified ModelsViewModel due to huge complexity * UI: add model loading in progress view; polish the empty model info view * UI: polish the bottom bars and info view when no models found; show loading in progress while fetching models * build: [BREAKING] bump the versions of libraries and plugins * UI: fix the breaking build * UI: add Tooltip on Import FAB for user onboarding * UI: adds AppPreferences to track user onboarding status * UI: tracks user's first success on importing a model * data: add hand crafted rules to filter the models fetched from HuggingFace API * UI: update app name & about; polish top bars' indicators & buttons * UI: polish Hugging Face download dialog UI * UX: implement onboarding tooltips for model import and onboarding * misc: use sentence case for CTA button labels * [WIP] UI: add Arm color palette from Philip.Watson3 * UI: address Rojin's UX feedbacks * UI: address Rojin's UX feedbacks - part 2 * UI: update Arm color palette from Philip.Watson3 * data: make sure fetch preselected models in the same order of their IDs * UI: fix UI issues in the generic settings screen and navigation drawer * nit: address Rojin's feedbacks on model import message again * nit: append `®` to all `Arm` labels * UI: extract a reusable InfoAlertDialog * core: support GGML_CPU_ALL_VARIANTS on Android! * core: restructure Kleidi-Llama library * core: organizing cmake arguments * data: sort preselected models according to device's available RAM * app: update adaptive + themed + legacy icons and app name * UI: fix the font size auto scaling for ArmFeaturesVisualizer * core: further improve the performance on native methods * UI: minor color palette changes; emphasize the bottom bar FABs; fix Settings Screen menu item label * UI: make more room for assistant message bubble's width * UI: better usage of tertiary colors to highlight model cards but not for warnings * UI: fix the layout issue on large font sizes * lib: support x86-64 by dynamically set Arm related definitions * lib: replace the factory pattern for deprecated tiered lib loading with single instance pattern * llama: update the library name in JNI and CMake project * llama: update the library's package name and namespace * llama: update the app's package name and namespace * app: bump ksp version * app: remove deprecated SystemUIController from accompanist by migrating to EdgeToEdge * app: extract AppContent from MainActivity to a separate file in ui package * lib: add File version for GGUF Magic number verification * lib: perform engine state check inclusively instead of exclusively * lib: change `LlamaTier` to `ArmCpuTier` * lib: remove kleidi-llama related namings * cleanup: remove Arm AI Chat/Playground app source code; replace with the basic sample app from https://github.com/hanyin-arm/Arm-AI-Chat-Sample Note: the full Google Play version of AI Chat app will be open will be open sourced in another repo soon, therefore didn't go through the trouble of pruning the history using `git filter-repo` here. * [WIP] doc: update main and Android README docs; add self to code owners * lib: revert System.load back to System.loadLibrary * jni: introduce a logging util to filter different logging levels on different build types * lib: enable app optimization * doc: replace stub Google Play app URL with the actual link add screenshots; add my GitHub ID to maintainer list * Remove cpu_features * Fix linters issues in editorconfig-checker job https://github.com/ggml-org/llama.cpp/actions/runs/19548770247/job/55974800633?pr=17413 * Remove unnecessary Android CMake flag * purge include/cpu_features directory --------- Co-authored-by: Han Yin <han.yin@arm.com>
* UI: implement basic UI components
* util: implement performance monitor; wrap it with a viewmodel
* util: implement user preferences utility
* UI: implement core flow's screens
* UI: add a new MainActivity; update manifest
* [WIP] DI: implement simple local vm factory provider
* UI: disable triggering drawer via gesture; enable alert dialog on back navigation inside conversation and benchmark
* UI: allow drawer's gesture control only on Home and Settings screens; enable alert dialog on back navigation inside conversation and benchmark
* UI: split a nested parent settings screen into separate child settings screens
* UI: polish system prompt setup UI
* Deps: bump Kotlin plugin; introduce KSP; apply in :app subproject
* DB: setup Room database
* data: introduce repo for System Prompt; flow data from Room to VM
* bugfix: properly handle user's quitting conversation screen while tokens in generation
* UI: rename `ModeSelection` to `ModelLoading` for better clarity
* UI: update app name to be more Arm
* UI: polish conversation screen
* data: code polish
* UI: code polish
* bugfix: handle user quitting on model loading
* UI: locks user in alert dialog when model is unloading
* vm: replace token metrics stubs with actual implementation
* UI: refactor top app bars
* nit: combine temperatureMetrics and useFahrenheit
* DI: introduce Hilt plugin + processor + lib dependencies
* DI: make app Hilt injectable
* DI: make viewmodels Hilt injectable
* DI: replace manual DI with Hilt DI
* UI: optimize AppContent's composing
* bugfix: wait for model to load before navigating to benchmark screen; use NavigationActions instead of raw navController
* UI: navigation with more natural animated transitions
* DI: Optimize AppModule
* Feature: Introduce ModelRepository and ModelsManagementViewModel; update AppModule
* UI: polish UI for ModelsManagementScreen; inject ModelsManagementVieModel
* DI: abstract the protocol of SystemPromptRepository; update AppModule
* data: [WIP] prepare for ModelRepository refactor & impl
* data: introduce Model entity and DAO; update DI module
* UI: replace Models Management screen's stubbing with instrumentation
* UI: polish sort order menu
* data: import local model with file picker
* bugfix: use List instead of Collection for ModelDao's deletion
* data: add a util file for extracting file name & size and model metadata
* UI: enrich ModelManagementState; extract filename to show correct importing UI
* UI: implement multiple models deletion; update Models Management screen
* UI: handle back navigation when user is in multi-selection mode
* util: extract file size formatting into ModelUtils
* UI: add a confirmation step when user picks a file; refactor model import overlay into AlertDialog
* UI: extract a shared ModelCard component
* UI: replace model selection screen's data stubbing; add empty view
* nit: tidy SystemPromptViewModel
* Util: split FileUtils from ModelUtils; extract copy methods into FileUtils
* data: pass through getModelById from ModelDao into ModelRepository
* core: extract conversation and benchmark logics into InferenceManager; add logs and missing state updates in stub InferenceEngine
* vm: split mono MainViewModel into separate individual ViewModels
* vm: merge SystemPromptViewModel into ModelLoadingViewModel
* core: break down InferenceManager due to Interface Segregation Principle
* UI: show model card in Model Loading screen
* UI: show model card in Conversation screen
* UI: unify Model Card components
* core: swap in LLamaAndroid and mark stub engine for testing only
* data: allow canceling the ongoing model import
* UI: update UI ongoing model import's cancellation
* LLama: update engine state after handling the cancellation of sendUserPrompt
* VM: handle the cancellation of ongoing token generation
* LLama: refactor loadModel by splitting the system prompt setting into a separate method
* feature: check for available space before copying local model
* UI: centralize the AppScaffold and modularize its configs
* UI: refactor BottomBarConfig.ModelsManagement APIs
* UI: combine TopBarConfig and BottomBarConfig into each route's ScaffoldConfig
* UI: replace ugly optional as casts in AppScaffold with extension functions
* UI: fix the typo `totalGb` in `StorageMetrics`
* UI: remove code duplication in sort menu
* LLama: add ModelUnloadingState to engine State; add missing state checks in stub engine; fix instrumentation engine's error messages
* UI: refactor back handling by removing centralized BackHandlerSetup and UnloadModelConfirmationDialog from AppContent
* UI: implement BenchmarkScreen's individual back handling
* LLama: add a new Initializing state; ; add two extension properties; rename LibraryLoaded state to Initialized
* UI: Introduce an abstract ViewModel to handle additional model unloading logics
* UI: expose a single facade ModelUnloadDialogHandler; move UnloadModelState into ModelUnloadingViewModel.kt
* UI: migrate ModelLoadingScreen onto ModelLoadingViewModel; update & refine ModelLoadingScreen
* UI: migrate ConversationViewModel onto ModelLoadingViewModel; update & refine ConversationScreen
* nit: extract app name into a constant value; remove unused onBackPressed callbacks
* UI: update AppContent to pass in correct navigation callbacks
* nit: polish ModelLoadingScreen UI
* core: throw Exception instead of returning null if model fails to load
* navigation: sink model loading state management from AppContent down into ModelLoadingScreen; pass ModelLoadingMetrics to Benchmark and Conversation screens
* gguf: add GGUF metadata data holder and its corresponding extractor implementation
* DB: introduce Kotlin serialization extension's library and plugin; add Room runtime library
* GGUF: make GgufMetadata serializable in order to be compatible with Room
* nit: refactor data.local package structure
* nit: rename lastUsed field to dateLastUsed; add dateAdded field
* UI: refactor ModelCard UI to show GGUF metadata
* UI: update ModelSelectionScreen with a preselect mechanism
* UI: polish model card
* nit: allow deselect model on Model Selection screen
* nit: revert accidental committing of debug code
* UI: polish ModelLoading screen
* util: extract formatting helper functions from FileUtils into a new FormatUtils
* UI: polish model cards on Benchmark and Conversation screens to show model loading metrics
* UI: show a Snack bar to warn user that system prompt is not always supported
* UI: handle back press on Model Selection screen
* UI: finally support theme modes; remove hardcoded color schemes, default to dynamic color scheme implementation
* feature: support searching on Model Selection screen
* nit: move scaffold related UI components into a separate package
* UI: extract InfoView out into a separate file for reusability
* data: move Model related actions (query, filter, sort) into ModelInfo file
* UI: animate FAB on model preselection states
* feature: support filtering in Model Management screen
* ui: show empty models info in Model Management screen
* ui: add filter off icon to "Clear filters" menu item
* [WIP] ui: polish Benchmark screen; implement its bottom app bar
* ui: polish Benchmark screen; implement its bottom app bar's rerun and share
* nit: disable mode selection's radio buttons when loading model
* feature: implement Conversation screen's bottom app bar
* pkg: restructure BottomAppBars into separate files in a child package
* pkg: restructure TopBarApps into separate files in a child package
* pkg: restructure system metrics into a separate file
* UI: polish Conversation screen
* data: update system prompt presets
* UI: allow hide or show model card on Conversation & Benchmark screens; fix message arrangement
* data: update & enhance system prompt presets
* deps: introduce Retrofit2
* data: implement HuggingFace data model, data source with Retrofit API
* data: update Model data repository to support fetching HuggingFace models
* [WIP] UI: replace the HuggingFace stub in Model Management screen with actual API call
* UI: map language codes into country Emojis
* ui: add "clear results" action to Benchmark screen
* nit: print current pp & tg in llama-bench
* UI: disable landscape mode; prevent duplicated benchmark running
* llama: migrate C/CXX flags into CMakeList
* [WIP] llama: ABI split builds five .so artifacts.
However, all .so are performing on SVE level
* [WIP] llama: ABI split where five tiers are built sequentially.
* [WIP] llama: disable OpenMP in ABI split since most SoCs are big.LITTLE
* [WIP] llama: enable KleidiAI and disable tier 4 due to `+sve+sve2` bug caused by `ggml_add_cpu_backend_variant_impl` as explained below
```CMake
if (NOT SME_ENABLED MATCHES -1)
...
set(PRIVATE_ARCH_FLAGS "-fno-tree-vectorize;${PRIVATE_ARCH_FLAGS}+sve+sve2")
...
```
* core: add Google's cpu_features as a submodule
* core: implement cpu_detector native lib
* core: swap out hardcoded LlamaAndroid library loading
* core: add back OpenMP due to huge perf loss on TG128
* misc: reorg the pkg structure
* misc: rename LlamaAndroid related class to InferenceEngine prefixes
* [WIP] lib: move GgufMetadata into the lib submodule
* lib: expose GgufMetadataReader as interface only
* lib: replace the naive & plain SharedPreferences with DataStore implementation
* lib: hide the internal implementations, only expose a facade and interfaces
* lib: expose Arm features
* di: add a stub TierDetection; provide both actual impl and stub in AppModule
* UI: add visualizer UI for Arm features
* misc: UI polish
* lib: refactored InferenceEngineLoader; added a `NONE` Llama Tier
* UI: support `NONE` Llama Tier in general settings
* lib: optimize engine loader; always perform a fresh detection when cache is null
* remote: add HuggingFaceModelDetails data class
* remote: refine HuggingFaceModel data class
* nit: remove `trendingScore` field from HuggingFace model entities, weird...
* remote: refactor HuggingFaceApiService; implement download feature in HuggingFaceRemoteDataSource
* remote: fix the incorrect parse of HuggingFace's inconsistent & weird JSON response
* UI: scaffold Models Management screen and view model
* UI: implement a dialog UI to show fetched HuggingFace models.
* UI: use a broadcast receiver to listen for download complete events and show local import dialog.
* data: handle network exceptions elegantly
* pkg: restructure `data`'s packages
* data: extract local file info, copy and cleanup logics into LocalFileDataSource
* nit: minor UI patch; add missing comments
* bugfix: tapping "Home" in navigation drawer should simply close it without any navigation action.
* UI: improve autoscroll during token generation
* lib: tested on JFrog Artifactory for Maven publishing
* UI: show RAM warning if model too large
* UI: polish model management screen's error dialog
* util: add more items into the mapping table of ISO 639-1 language code to ISO 3166-1 country code
* llm: properly propagate error to UI upon failing to load selected model
* UI: avoid duplicated calculation of token metrics
* lib: read & validate the magic number from the picked source file before executing the import
* UI: add "Learn More" hyperlinks to Error dialog upon model import failures
* lib: refactor the GgufMetadataReader to take InputStream instead of absolute path as argument
* lib: fix the `SIMD` typo in Tier description
* core: verify model file path is readable
* lib: add UnsupportedArchitectureException for triaged error message
* util: split FormatUtils into multiple utils for better readability
* UI: change benchmark screen from raw markdown to table view
* bugfix: reset preselection upon running the preselected model
* misc: linter issue
* bugfix: fix the malfunctioning monitoring switch
* UI: update Arm features indicator; fix the broken hyperlinks
* UI: add quick action buttons to benchmark screen's result card
* UI: hide share fab after clearing all benchmark results
* UI: fix the model unload dialog message; elevate the model card and hide it by default on Conversation screen;
* UI: hide the stubbing actions in Conversation screen
* UI: add show/hide stats control to conversation screen's assistant message bubble; fix placeholder
* UI: add a info button to explain token metrics
* misc: remove the redundant `Companion` added due to refactoring
* UI: show corresponding system metrics detailed info upon tapping RAM / storage / temperature indicator
* UI: add info button to System Prompt switch; expand the model card by default
* UI: disable tag & language chips; add section headers to explain what they are
* misc: replace top bar indicator's spacer with padding
* UI: merge the Model Selection and Model Management into a unified Models screen
* UI: split the ModelsManagementViewModel from a unified ModelsViewModel due to huge complexity
* UI: add model loading in progress view; polish the empty model info view
* UI: polish the bottom bars and info view when no models found; show loading in progress while fetching models
* build: [BREAKING] bump the versions of libraries and plugins
* UI: fix the breaking build
* UI: add Tooltip on Import FAB for user onboarding
* UI: adds AppPreferences to track user onboarding status
* UI: tracks user's first success on importing a model
* data: add hand crafted rules to filter the models fetched from HuggingFace API
* UI: update app name & about; polish top bars' indicators & buttons
* UI: polish Hugging Face download dialog UI
* UX: implement onboarding tooltips for model import and onboarding
* misc: use sentence case for CTA button labels
* [WIP] UI: add Arm color palette from Philip.Watson3
* UI: address Rojin's UX feedbacks
* UI: address Rojin's UX feedbacks - part 2
* UI: update Arm color palette from Philip.Watson3
* data: make sure fetch preselected models in the same order of their IDs
* UI: fix UI issues in the generic settings screen and navigation drawer
* nit: address Rojin's feedbacks on model import message again
* nit: append `®` to all `Arm` labels
* UI: extract a reusable InfoAlertDialog
* core: support GGML_CPU_ALL_VARIANTS on Android!
* core: restructure Kleidi-Llama library
* core: organizing cmake arguments
* data: sort preselected models according to device's available RAM
* app: update adaptive + themed + legacy icons and app name
* UI: fix the font size auto scaling for ArmFeaturesVisualizer
* core: further improve the performance on native methods
* UI: minor color palette changes; emphasize the bottom bar FABs; fix Settings Screen menu item label
* UI: make more room for assistant message bubble's width
* UI: better usage of tertiary colors to highlight model cards but not for warnings
* UI: fix the layout issue on large font sizes
* lib: support x86-64 by dynamically set Arm related definitions
* lib: replace the factory pattern for deprecated tiered lib loading with single instance pattern
* llama: update the library name in JNI and CMake project
* llama: update the library's package name and namespace
* llama: update the app's package name and namespace
* app: bump ksp version
* app: remove deprecated SystemUIController from accompanist by migrating to EdgeToEdge
* app: extract AppContent from MainActivity to a separate file in ui package
* lib: add File version for GGUF Magic number verification
* lib: perform engine state check inclusively instead of exclusively
* lib: change `LlamaTier` to `ArmCpuTier`
* lib: remove kleidi-llama related namings
* cleanup: remove Arm AI Chat/Playground app source code; replace with the basic sample app from https://github.com/hanyin-arm/Arm-AI-Chat-Sample
Note: the full Google Play version of AI Chat app will be open will be open sourced in another repo soon, therefore didn't go through the trouble of pruning the history using `git filter-repo` here.
* [WIP] doc: update main and Android README docs; add self to code owners
* lib: revert System.load back to System.loadLibrary
* jni: introduce a logging util to filter different logging levels on different build types
* lib: enable app optimization
* doc: replace stub Google Play app URL with the actual link add screenshots; add my GitHub ID to maintainer list
* Remove cpu_features
* Fix linters issues in editorconfig-checker job
https://github.com/ggml-org/llama.cpp/actions/runs/19548770247/job/55974800633?pr=17413
* Remove unnecessary Android CMake flag
* purge include/cpu_features directory
---------
Co-authored-by: Han Yin <han.yin@arm.com>
…org#17413) * UI: implement basic UI components * util: implement performance monitor; wrap it with a viewmodel * util: implement user preferences utility * UI: implement core flow's screens * UI: add a new MainActivity; update manifest * [WIP] DI: implement simple local vm factory provider * UI: disable triggering drawer via gesture; enable alert dialog on back navigation inside conversation and benchmark * UI: allow drawer's gesture control only on Home and Settings screens; enable alert dialog on back navigation inside conversation and benchmark * UI: split a nested parent settings screen into separate child settings screens * UI: polish system prompt setup UI * Deps: bump Kotlin plugin; introduce KSP; apply in :app subproject * DB: setup Room database * data: introduce repo for System Prompt; flow data from Room to VM * bugfix: properly handle user's quitting conversation screen while tokens in generation * UI: rename `ModeSelection` to `ModelLoading` for better clarity * UI: update app name to be more Arm * UI: polish conversation screen * data: code polish * UI: code polish * bugfix: handle user quitting on model loading * UI: locks user in alert dialog when model is unloading * vm: replace token metrics stubs with actual implementation * UI: refactor top app bars * nit: combine temperatureMetrics and useFahrenheit * DI: introduce Hilt plugin + processor + lib dependencies * DI: make app Hilt injectable * DI: make viewmodels Hilt injectable * DI: replace manual DI with Hilt DI * UI: optimize AppContent's composing * bugfix: wait for model to load before navigating to benchmark screen; use NavigationActions instead of raw navController * UI: navigation with more natural animated transitions * DI: Optimize AppModule * Feature: Introduce ModelRepository and ModelsManagementViewModel; update AppModule * UI: polish UI for ModelsManagementScreen; inject ModelsManagementVieModel * DI: abstract the protocol of SystemPromptRepository; update AppModule * data: [WIP] prepare for ModelRepository refactor & impl * data: introduce Model entity and DAO; update DI module * UI: replace Models Management screen's stubbing with instrumentation * UI: polish sort order menu * data: import local model with file picker * bugfix: use List instead of Collection for ModelDao's deletion * data: add a util file for extracting file name & size and model metadata * UI: enrich ModelManagementState; extract filename to show correct importing UI * UI: implement multiple models deletion; update Models Management screen * UI: handle back navigation when user is in multi-selection mode * util: extract file size formatting into ModelUtils * UI: add a confirmation step when user picks a file; refactor model import overlay into AlertDialog * UI: extract a shared ModelCard component * UI: replace model selection screen's data stubbing; add empty view * nit: tidy SystemPromptViewModel * Util: split FileUtils from ModelUtils; extract copy methods into FileUtils * data: pass through getModelById from ModelDao into ModelRepository * core: extract conversation and benchmark logics into InferenceManager; add logs and missing state updates in stub InferenceEngine * vm: split mono MainViewModel into separate individual ViewModels * vm: merge SystemPromptViewModel into ModelLoadingViewModel * core: break down InferenceManager due to Interface Segregation Principle * UI: show model card in Model Loading screen * UI: show model card in Conversation screen * UI: unify Model Card components * core: swap in LLamaAndroid and mark stub engine for testing only * data: allow canceling the ongoing model import * UI: update UI ongoing model import's cancellation * LLama: update engine state after handling the cancellation of sendUserPrompt * VM: handle the cancellation of ongoing token generation * LLama: refactor loadModel by splitting the system prompt setting into a separate method * feature: check for available space before copying local model * UI: centralize the AppScaffold and modularize its configs * UI: refactor BottomBarConfig.ModelsManagement APIs * UI: combine TopBarConfig and BottomBarConfig into each route's ScaffoldConfig * UI: replace ugly optional as casts in AppScaffold with extension functions * UI: fix the typo `totalGb` in `StorageMetrics` * UI: remove code duplication in sort menu * LLama: add ModelUnloadingState to engine State; add missing state checks in stub engine; fix instrumentation engine's error messages * UI: refactor back handling by removing centralized BackHandlerSetup and UnloadModelConfirmationDialog from AppContent * UI: implement BenchmarkScreen's individual back handling * LLama: add a new Initializing state; ; add two extension properties; rename LibraryLoaded state to Initialized * UI: Introduce an abstract ViewModel to handle additional model unloading logics * UI: expose a single facade ModelUnloadDialogHandler; move UnloadModelState into ModelUnloadingViewModel.kt * UI: migrate ModelLoadingScreen onto ModelLoadingViewModel; update & refine ModelLoadingScreen * UI: migrate ConversationViewModel onto ModelLoadingViewModel; update & refine ConversationScreen * nit: extract app name into a constant value; remove unused onBackPressed callbacks * UI: update AppContent to pass in correct navigation callbacks * nit: polish ModelLoadingScreen UI * core: throw Exception instead of returning null if model fails to load * navigation: sink model loading state management from AppContent down into ModelLoadingScreen; pass ModelLoadingMetrics to Benchmark and Conversation screens * gguf: add GGUF metadata data holder and its corresponding extractor implementation * DB: introduce Kotlin serialization extension's library and plugin; add Room runtime library * GGUF: make GgufMetadata serializable in order to be compatible with Room * nit: refactor data.local package structure * nit: rename lastUsed field to dateLastUsed; add dateAdded field * UI: refactor ModelCard UI to show GGUF metadata * UI: update ModelSelectionScreen with a preselect mechanism * UI: polish model card * nit: allow deselect model on Model Selection screen * nit: revert accidental committing of debug code * UI: polish ModelLoading screen * util: extract formatting helper functions from FileUtils into a new FormatUtils * UI: polish model cards on Benchmark and Conversation screens to show model loading metrics * UI: show a Snack bar to warn user that system prompt is not always supported * UI: handle back press on Model Selection screen * UI: finally support theme modes; remove hardcoded color schemes, default to dynamic color scheme implementation * feature: support searching on Model Selection screen * nit: move scaffold related UI components into a separate package * UI: extract InfoView out into a separate file for reusability * data: move Model related actions (query, filter, sort) into ModelInfo file * UI: animate FAB on model preselection states * feature: support filtering in Model Management screen * ui: show empty models info in Model Management screen * ui: add filter off icon to "Clear filters" menu item * [WIP] ui: polish Benchmark screen; implement its bottom app bar * ui: polish Benchmark screen; implement its bottom app bar's rerun and share * nit: disable mode selection's radio buttons when loading model * feature: implement Conversation screen's bottom app bar * pkg: restructure BottomAppBars into separate files in a child package * pkg: restructure TopBarApps into separate files in a child package * pkg: restructure system metrics into a separate file * UI: polish Conversation screen * data: update system prompt presets * UI: allow hide or show model card on Conversation & Benchmark screens; fix message arrangement * data: update & enhance system prompt presets * deps: introduce Retrofit2 * data: implement HuggingFace data model, data source with Retrofit API * data: update Model data repository to support fetching HuggingFace models * [WIP] UI: replace the HuggingFace stub in Model Management screen with actual API call * UI: map language codes into country Emojis * ui: add "clear results" action to Benchmark screen * nit: print current pp & tg in llama-bench * UI: disable landscape mode; prevent duplicated benchmark running * llama: migrate C/CXX flags into CMakeList * [WIP] llama: ABI split builds five .so artifacts. However, all .so are performing on SVE level * [WIP] llama: ABI split where five tiers are built sequentially. * [WIP] llama: disable OpenMP in ABI split since most SoCs are big.LITTLE * [WIP] llama: enable KleidiAI and disable tier 4 due to `+sve+sve2` bug caused by `ggml_add_cpu_backend_variant_impl` as explained below ```CMake if (NOT SME_ENABLED MATCHES -1) ... set(PRIVATE_ARCH_FLAGS "-fno-tree-vectorize;${PRIVATE_ARCH_FLAGS}+sve+sve2") ... ``` * core: add Google's cpu_features as a submodule * core: implement cpu_detector native lib * core: swap out hardcoded LlamaAndroid library loading * core: add back OpenMP due to huge perf loss on TG128 * misc: reorg the pkg structure * misc: rename LlamaAndroid related class to InferenceEngine prefixes * [WIP] lib: move GgufMetadata into the lib submodule * lib: expose GgufMetadataReader as interface only * lib: replace the naive & plain SharedPreferences with DataStore implementation * lib: hide the internal implementations, only expose a facade and interfaces * lib: expose Arm features * di: add a stub TierDetection; provide both actual impl and stub in AppModule * UI: add visualizer UI for Arm features * misc: UI polish * lib: refactored InferenceEngineLoader; added a `NONE` Llama Tier * UI: support `NONE` Llama Tier in general settings * lib: optimize engine loader; always perform a fresh detection when cache is null * remote: add HuggingFaceModelDetails data class * remote: refine HuggingFaceModel data class * nit: remove `trendingScore` field from HuggingFace model entities, weird... * remote: refactor HuggingFaceApiService; implement download feature in HuggingFaceRemoteDataSource * remote: fix the incorrect parse of HuggingFace's inconsistent & weird JSON response * UI: scaffold Models Management screen and view model * UI: implement a dialog UI to show fetched HuggingFace models. * UI: use a broadcast receiver to listen for download complete events and show local import dialog. * data: handle network exceptions elegantly * pkg: restructure `data`'s packages * data: extract local file info, copy and cleanup logics into LocalFileDataSource * nit: minor UI patch; add missing comments * bugfix: tapping "Home" in navigation drawer should simply close it without any navigation action. * UI: improve autoscroll during token generation * lib: tested on JFrog Artifactory for Maven publishing * UI: show RAM warning if model too large * UI: polish model management screen's error dialog * util: add more items into the mapping table of ISO 639-1 language code to ISO 3166-1 country code * llm: properly propagate error to UI upon failing to load selected model * UI: avoid duplicated calculation of token metrics * lib: read & validate the magic number from the picked source file before executing the import * UI: add "Learn More" hyperlinks to Error dialog upon model import failures * lib: refactor the GgufMetadataReader to take InputStream instead of absolute path as argument * lib: fix the `SIMD` typo in Tier description * core: verify model file path is readable * lib: add UnsupportedArchitectureException for triaged error message * util: split FormatUtils into multiple utils for better readability * UI: change benchmark screen from raw markdown to table view * bugfix: reset preselection upon running the preselected model * misc: linter issue * bugfix: fix the malfunctioning monitoring switch * UI: update Arm features indicator; fix the broken hyperlinks * UI: add quick action buttons to benchmark screen's result card * UI: hide share fab after clearing all benchmark results * UI: fix the model unload dialog message; elevate the model card and hide it by default on Conversation screen; * UI: hide the stubbing actions in Conversation screen * UI: add show/hide stats control to conversation screen's assistant message bubble; fix placeholder * UI: add a info button to explain token metrics * misc: remove the redundant `Companion` added due to refactoring * UI: show corresponding system metrics detailed info upon tapping RAM / storage / temperature indicator * UI: add info button to System Prompt switch; expand the model card by default * UI: disable tag & language chips; add section headers to explain what they are * misc: replace top bar indicator's spacer with padding * UI: merge the Model Selection and Model Management into a unified Models screen * UI: split the ModelsManagementViewModel from a unified ModelsViewModel due to huge complexity * UI: add model loading in progress view; polish the empty model info view * UI: polish the bottom bars and info view when no models found; show loading in progress while fetching models * build: [BREAKING] bump the versions of libraries and plugins * UI: fix the breaking build * UI: add Tooltip on Import FAB for user onboarding * UI: adds AppPreferences to track user onboarding status * UI: tracks user's first success on importing a model * data: add hand crafted rules to filter the models fetched from HuggingFace API * UI: update app name & about; polish top bars' indicators & buttons * UI: polish Hugging Face download dialog UI * UX: implement onboarding tooltips for model import and onboarding * misc: use sentence case for CTA button labels * [WIP] UI: add Arm color palette from Philip.Watson3 * UI: address Rojin's UX feedbacks * UI: address Rojin's UX feedbacks - part 2 * UI: update Arm color palette from Philip.Watson3 * data: make sure fetch preselected models in the same order of their IDs * UI: fix UI issues in the generic settings screen and navigation drawer * nit: address Rojin's feedbacks on model import message again * nit: append `®` to all `Arm` labels * UI: extract a reusable InfoAlertDialog * core: support GGML_CPU_ALL_VARIANTS on Android! * core: restructure Kleidi-Llama library * core: organizing cmake arguments * data: sort preselected models according to device's available RAM * app: update adaptive + themed + legacy icons and app name * UI: fix the font size auto scaling for ArmFeaturesVisualizer * core: further improve the performance on native methods * UI: minor color palette changes; emphasize the bottom bar FABs; fix Settings Screen menu item label * UI: make more room for assistant message bubble's width * UI: better usage of tertiary colors to highlight model cards but not for warnings * UI: fix the layout issue on large font sizes * lib: support x86-64 by dynamically set Arm related definitions * lib: replace the factory pattern for deprecated tiered lib loading with single instance pattern * llama: update the library name in JNI and CMake project * llama: update the library's package name and namespace * llama: update the app's package name and namespace * app: bump ksp version * app: remove deprecated SystemUIController from accompanist by migrating to EdgeToEdge * app: extract AppContent from MainActivity to a separate file in ui package * lib: add File version for GGUF Magic number verification * lib: perform engine state check inclusively instead of exclusively * lib: change `LlamaTier` to `ArmCpuTier` * lib: remove kleidi-llama related namings * cleanup: remove Arm AI Chat/Playground app source code; replace with the basic sample app from https://github.com/hanyin-arm/Arm-AI-Chat-Sample Note: the full Google Play version of AI Chat app will be open will be open sourced in another repo soon, therefore didn't go through the trouble of pruning the history using `git filter-repo` here. * [WIP] doc: update main and Android README docs; add self to code owners * lib: revert System.load back to System.loadLibrary * jni: introduce a logging util to filter different logging levels on different build types * lib: enable app optimization * doc: replace stub Google Play app URL with the actual link add screenshots; add my GitHub ID to maintainer list * Remove cpu_features * Fix linters issues in editorconfig-checker job https://github.com/ggml-org/llama.cpp/actions/runs/19548770247/job/55974800633?pr=17413 * Remove unnecessary Android CMake flag * purge include/cpu_features directory --------- Co-authored-by: Han Yin <han.yin@arm.com>



a. enable Aarch64 acceleration up to
SME2b. enable x86_64 acceleratiion up to
AMXc. support both
AndroidandChromeOSa. automatic message role formatting
b. system prompt injection
c. context overflow handling
d. batch decoding
a. engine states
b. new API (such as system prompt)
a. parse GGUF metadata
Architectural diagram

Performance comparison
Against Google AI Edge Gallery on the same Pixel 9:
512345275-355b5941-65c2-4088-8622-44979954caef.mp4
Production app

The full-featured production app "Arm AI Chat" is published on Google Play, it's built with the same binding and works on both Android and ChromeOS:
Compared with #17152, this PR avoids introducing Google's
cpu_features.