A demo app showcasing FunctionGemma 270M running on-device via MLX on Apple Silicon. The app is a simple expense tracker where voice input is parsed into structured function calls by the model — no cloud APIs involved.
Built with React Native + Expo as a proof of concept for on-device function calling with a fine-tuned small language model (~540MB fp16).
- On-device function calling: FunctionGemma 270M parses voice input into structured tool calls (`add_expense`, `query_expenses`) entirely on the device
- Voice → structured data: "Spent twenty on lunch" becomes `{amount: 20, category: "food", transaction_type: "expense", date: "2026-03-22"}`
- Relative date handling: pre-computed dates are injected into the prompt so the model maps "yesterday" or "last week" to actual dates
- MLX on iOS: custom Expo native module wrapping mlx-swift for inference on Apple Silicon
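The relative date handling above can be illustrated with a minimal sketch. It assumes the date context is plain text placed ahead of the user's utterance; the function names (`toIsoDate`, `addDays`, `buildPromptSketch`) and the prompt wording are hypothetical and are not taken from `buildPrompt.ts`. It also uses UTC and the built-in `Date` for simplicity, whereas the app lists dayjs in its stack.

```typescript
// Hypothetical sketch: the real buildPrompt.ts may use different wording,
// a different layout, local time, and dayjs instead of the built-in Date.

/** Format a Date as YYYY-MM-DD using its UTC fields. */
function toIsoDate(d: Date): string {
  return d.toISOString().slice(0, 10);
}

/** Return a copy of `d` shifted by `days` (negative = past), in UTC. */
function addDays(d: Date, days: number): Date {
  const copy = new Date(d.getTime());
  copy.setUTCDate(copy.getUTCDate() + days);
  return copy;
}

/** Build a prompt that states concrete dates ahead of the user's utterance. */
function buildPromptSketch(utterance: string, now: Date = new Date()): string {
  const context = [
    `Today is ${toIsoDate(now)}.`,
    `Yesterday was ${toIsoDate(addDays(now, -1))}.`,
    `One week ago was ${toIsoDate(addDays(now, -7))}.`,
  ].join(" ");
  return `${context}\nUser: ${utterance}`;
}

const examplePrompt = buildPromptSketch(
  "Spent twenty on lunch yesterday",
  new Date("2026-03-22T12:00:00Z"),
);
console.log(examplePrompt);
```

Pre-computing the dates means the model only has to copy a date it has already seen rather than do calendar arithmetic itself.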
| Layer | Technology |
|---|---|
| Framework | React Native 0.83 + Expo 55 |
| Navigation | Expo Router (file-based) |
| AI Inference | MLX Swift (custom native module) |
| Model | FunctionGemma 270M (fp16) |
| Voice Input | expo-speech-recognition |
| Database | expo-sqlite |
| Animations | react-native-reanimated |
| Gestures | react-native-gesture-handler |
| Icons | expo-symbols (SF Symbols) |
| Dates | dayjs |
src/
├── app/
│ ├── _layout.tsx # Root stack, DB init, providers
│ ├── index.tsx # Main transaction list screen
│ └── query-results.tsx # Modal sheet for query results
├── components/
│ ├── animated-header-title.tsx
│ ├── loading-screen.tsx # Model download/load progress
│ ├── transaction-list.tsx # Date-grouped FlatList
│ ├── transaction-row.tsx # Swipeable transaction card
│ └── voice-pill.tsx # Floating voice input UI
├── hooks/
│ ├── useModelLoader.ts # Download, cache, load model
│ ├── useVoiceRecognition.ts
│ ├── useExpenseInference.ts # Tool call dispatch
│ ├── useAddExpenseDispatcher.ts
│ ├── useQueryExpensesDispatcher.ts
│ └── useTransactions.ts
├── contexts/
│ ├── query-results-context.tsx
│ └── scroll-context.tsx
├── db/
│ ├── database.ts # SQLite connection
│ ├── schema.ts # Migration
│ ├── transactions.ts # CRUD + filtered queries
│ ├── types.ts # Transaction types
│ └── storage.ts # KV store for settings
├── constants/
│ ├── categories.ts # 23 category icons + colors
│ └── theme.ts # Dark theme palette
├── tools/
│ └── expenseTools.ts # Tool schema definition
└── utils/
  ├── buildPrompt.ts # Date context injection
  └── flattenByDate.ts # Group transactions by date
training/
└── queryfi_v2.py # Fine-tuning script (~1000 examples)
modules/
└── queryfi-mlx/ # Custom Expo native module
  ├── ios/
  │ ├── MLXInferenceService.swift # Model lifecycle + generation
  │ └── QueryfiMlxModule.swift # Expo bridge
  └── src/
    ├── QueryfiMlxModule.ts
    └── QueryfiMlx.types.ts
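The tree describes `flattenByDate.ts` as grouping transactions by date for the date-grouped FlatList. One way this could work is sketched below, assuming each transaction carries a `YYYY-MM-DD` date string and the list wants a flat array of header and item rows. The `Txn` and `Row` shapes and the name `flattenByDateSketch` are hypothetical; the real types live in `src/db/types.ts` and may differ.

```typescript
// Hypothetical shapes for illustration only.
interface Txn {
  id: number;
  amount: number;
  category: string;
  date: string; // YYYY-MM-DD
}

type Row =
  | { kind: "header"; date: string }
  | { kind: "item"; txn: Txn };

/** Turn a list of transactions into header + item rows, newest date first. */
function flattenByDateSketch(txns: Txn[]): Row[] {
  // ISO dates compare correctly as plain strings, so no Date parsing is needed.
  const sorted = txns.slice().sort((a, b) => {
    if (a.date < b.date) return 1;
    if (a.date > b.date) return -1;
    return 0;
  });

  const rows: Row[] = [];
  let currentDate: string | null = null;
  for (const txn of sorted) {
    if (txn.date !== currentDate) {
      // A new date starts a new group, so emit its header first.
      currentDate = txn.date;
      rows.push({ kind: "header", date: txn.date });
    }
    rows.push({ kind: "item", txn });
  }
  return rows;
}
```

A single flat array keeps the list virtualized by one FlatList, with the `kind` field deciding which component renders each row.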
- macOS with Apple Silicon (M1+)
- Xcode 16+
- Node.js 18+
- Physical iPhone or iPad running iOS 18.0+ (the iOS Simulator is not supported)
# Clone the repo
git clone https://github.com/akshayjadhav4/QueryFi.git
cd QueryFi
# Install dependencies
pnpm install
# Build and run on a physical iOS device
npx expo run:ios --device

Note: This app requires a physical iPhone or iPad. It will not work on the iOS Simulator because MLX inference needs the Metal GPU available on real devices.
On first launch, the app downloads the MLX model (~540MB) from Hugging Face. This only happens once — the model is cached locally.
# Start dev server
pnpm start
# Type check
pnpm tsc --noEmit
# Lint
pnpm lint

The app uses a fine-tuned version of Google's FunctionGemma 270M:
- Training data: ~1000 examples covering expense/income logging, transaction queries, and no-op rejection
- Format: FunctionGemma's control token format (`<start_function_call>`, `<escape>`, etc.)
- Quantization: fp16 only; 4-bit quantization destroys fine-tuned patterns on this small model
- Inference: Greedy decoding with hardcoded tool declarations to match training format exactly
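To make the control token format concrete, here is a minimal parsing sketch. It assumes the model's output looks like `<start_function_call>call:NAME{key:value,key:<escape>text<escape>}<end_function_call>`. Only `<start_function_call>` and `<escape>` are named in this README, so the rest of that layout, along with the `ToolCall` type and `parseToolCallSketch` name, is an assumption for illustration and may not match what the app's parser does.

```typescript
// ASSUMPTION: output is shaped like
//   <start_function_call>call:NAME{key:value,key:<escape>text<escape>}<end_function_call>
// Check the actual model output before relying on this layout.

interface ToolCall {
  name: string;
  args: Record<string, string | number>;
}

/** Extract the first tool call from raw model output, or null if none exists. */
function parseToolCallSketch(output: string): ToolCall | null {
  const call = /<start_function_call>call:(\w+)\{([\s\S]*?)\}<end_function_call>/.exec(output);
  if (call === null) return null; // no tool call found (for example a no-op reply)

  const args: Record<string, string | number> = {};
  // One argument per match: key:<escape>string<escape> or key:bareValue
  const argRe = /(\w+):(?:<escape>([\s\S]*?)<escape>|([^,]*))(?:,|$)/g;
  let m: RegExpExecArray | null;
  while ((m = argRe.exec(call[2])) !== null) {
    const key = m[1];
    if (m[2] !== undefined) {
      args[key] = m[2]; // escaped values stay strings
    } else {
      const raw = m[3].trim();
      // Bare values that look numeric become numbers; anything else stays text.
      args[key] = /^-?\d+(\.\d+)?$/.test(raw) ? Number(raw) : raw;
    }
  }
  return { name: call[1], args };
}
```

Returning `null` instead of throwing gives the caller a simple way to treat rejected or malformed output as "do nothing".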
See docs/MLX-Convert-Model.md for the full conversion and upload guide.
| Repository | Description |
|---|---|
| queryfi-functiongemma-270m | Fine-tuned PyTorch model |
| queryfi-functiongemma-270m-mlx | MLX-converted fp16 model (used by app) |
Voice Input → Speech Recognition → Text
                                    ↓
                   buildPrompt() (inject date context)
                                    ↓
                        MLX Generate (on-device)
                                    ↓
                             Parse Tool Call
                                    ↓
                   ┌────────────────┴────────────────┐
                   ↓                                 ↓
              add_expense                     query_expenses
                   ↓                                 ↓
           Insert to SQLite              Query SQLite with filters
                   ↓                                 ↓
           Refresh main list             Open results bottom sheet
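The branch at the bottom of this flow can be sketched as a small dispatcher. This is an illustration under stated assumptions, not the app's code: an in-memory array stands in for expo-sqlite, the argument shapes are inferred from the example payload earlier in this README, and the query filter names (`category`, `from`, `to`) as well as `dispatchSketch` and `Outcome` are hypothetical.

```typescript
// Hypothetical sketch: the real app dispatches through hooks and expo-sqlite.

interface Txn {
  amount: number;
  category: string;
  transaction_type: "expense" | "income";
  date: string; // YYYY-MM-DD
}

type ToolCall =
  | { name: "add_expense"; args: Txn }
  | { name: "query_expenses"; args: { category?: string; from?: string; to?: string } };

type Outcome =
  | { action: "refresh_list"; total: number }
  | { action: "open_results"; results: Txn[] };

// Stand-in for the SQLite table.
const store: Txn[] = [];

/** Route a parsed tool call to the matching branch of the flow. */
function dispatchSketch(call: ToolCall): Outcome {
  if (call.name === "add_expense") {
    // Left branch: insert, then tell the UI to refresh the main list.
    store.push(call.args);
    return { action: "refresh_list", total: store.length };
  }
  // Right branch: filter, then tell the UI to open the results sheet.
  const { category, from, to } = call.args;
  const results = store.filter(
    (t) =>
      (category === undefined || t.category === category) &&
      (from === undefined || t.date >= from) &&
      (to === undefined || t.date <= to),
  );
  return { action: "open_results", results };
}
```

Modelling the tool call as a discriminated union lets the compiler check that each branch only reads the arguments that belong to its tool.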