suryaavala/README.md

👋🏾 Hi, I'm Surya!


Buy Me a Coffee · Sponsor · Email · Website · LinkedIn

✨ "I've always been passionate about computers and the ability to make them solve problems for us. I value encountering challenging situations, enjoy analysing & solving problems and seeing the results. Code is a byproduct of this phenomenon."

I build things across the full stack, from low-level systems optimisation and distributed infrastructure to ML platforms, recommender engines, clinical NLP services, and multi-agent GenAI systems. I care about making things work reliably at scale, in regulated environments, with measurable impact.

For over a decade I've navigated highly regulated domains (Healthcare, Energy, Finance), specialising in chipping away at the "Hidden Technical Debt in Machine Learning".

🌱 Beyond Code

Strong believer in 👉🏾 "AI is the new Electricity".

I want to use whatever skills and resources I have to help build a fairer world for everyone. That means fighting for climate action, standing against racism and inequality, spreading awareness about rational thinking and empathy, and advancing humanity's collective understanding through science and exploration. I'm actively looking for opportunities where technology, especially ML systems, can drive equitable outcomes in underserved communities, support public policy decisions with better data, and help us make progress on the problems that actually matter.

When I'm not debugging distributed systems, I'm probably reading about the intersection of responsible AI and public policy, or figuring out how to make technology work for people who need it most.

📈 Impact

🏗️ Platform & Engineering

  • 67% reduction in feature lead times and 97%+ drop in change-failure rates: architected a modular GCP-native ML platform with ClearML at Montu
  • 40% avg cost optimisation: event-driven energy forecasting systems for 10k+ sites at Amber
  • Hired, onboarded and mentored teams of Data Scientists & ML Engineers across multiple orgs

🧪 Product & Commercial

  • 93% accuracy on medical data extraction, outperforming Google Healthcare NLP by 14%: structuring complex clinical text at Montu
  • 33% adoption gains & 19% profile subscription uplift: personalised recommender systems at Linktree
  • 18–23% repeat order uplift: patient segmentation engine driving commercial growth at Montu
  • 71.3% clinician adoption: hybrid personalised prescription recommender for 100k+ recurring patients

🔐 Privacy & Compliance

  • 0.87+ F1 PII redaction for clinical input/log sanitisation
  • RBAC data access policies, CI/CD security scanning, DevSecOps frameworks
  • Privacy-by-Design across all healthcare ML services

🎯 NLP & GenAI

  • Automated consultation notes reducing 32% of consultation time
  • Care Quality assessment automating 70% of clinician reviews
  • LLM-based data extraction & clinical services at production scale

🕸️ Deep Architectural Competency

CS Fundamentals & Distributed Systems
  • Primitives: C/C++, Big-O Tuning, LSM/B-Trees, PostgreSQL, Advanced SQL (Window Functions, CTEs), Bash/Shell, Perl, Python (advanced internals, Asyncio, Metaprogramming).
  • Systems: Rigorous cProfile analysis crossing CPython GIL/GC boundaries using pure stdlib buffers (heaps, deques). Architecting on CAP/PACELC theorem bounds with Paxos/Raft consensus, Event-Driven Architecture, Domain-Driven Design, and Gang of Four patterns.

Mathematics, Classical ML & Causal Inference
  • Primitives: Linear Algebra (SVD, PCA), Matrix Calculus, Bayesian Inference, Optimisation (SGD, Adam), Clustering.
  • Execution: Determining when not to use Deep Learning. Wielding Causal Inference (Do-Calculus, Propensity matching), XGBoost/LightGBM, and specialised domain models: Recommender Systems (Two-Tower, Collaborative Filtering), Time-Series Forecasting (ARIMA, Prophet, DeepAR).

Deep Learning Internals & Platform MLOps
  • DL Primitives: Re-deriving PyTorch Autograd/Transformer blocks from scratch. PEFT (LoRA/QLoRA), RLHF alignment (PPO/DPO). TensorFlow, Keras, Scikit-Learn.
  • Platform: Strict FTI (Feature-Training-Inference) isolation via multi-stage OCI-compliant containerisation. Orchestrating with Kubeflow, ClearML, TFX, MLflow, DVC, Airflow, Feature Stores (Feast/Tecton). Enforcing data contracts, schema validation, and drift monitoring.

GenAI, HPC Optimisation & FinOps Strategy
  • HPC: Optimising hardware utilisation via vLLM (PagedAttention), FlashAttention, Quantisation (INT8/FP4), GPU Profiling (torch.profiler).
  • Agentic: Multi-Agent Orchestration via LangChain, LangGraph, DSPy, MCP, A2A. GraphRAG (Knowledge Graphs + LLMs). Guardrails, Prompt Engineering, and evals. Federated Learning and Differential Privacy guarantees.
  • Strategy: Cross-functional leadership, RFC authorship, and FinOps architecture driving cost reductions.

🧠 Domain Expertise

πŸ₯ Healthcare & Clinical NLP
Automated consultation notes (32% time reduction), Care Quality assessments (70% of reviews automated), structured medical data extraction (93% accuracy, outperforming Google Healthcare NLP by 14%). Personalised prescription recommenders for 100k+ patients with 71.3% clinician adoption. Privacy-by-Design with 0.87+ F1 PII redaction.
⚑ Energy & Time-Series Forecasting
High-availability energy forecasting systems β€” regional price, load and solar generation forecasts for 10k+ sites, ensuring grid stability and 40% avg cost optimisation at Amber. ARIMA, Prophet, DeepAR for production anomaly detection across high-frequency streaming data with Kafka.
πŸ›’ Recommender Systems & Consumer Tech
Led personalised consumer experiences at Linktree through end-to-end recommender systems (links & profiles) utilising collaborative filtering and embedding-based search β€” 33% gains in Link adoption & 19% profile subscription uplift. Patient segmentation driving 18–23% repeat order uplift.
🏦 Enterprise ML & Document Intelligence
Enterprise ID Fraud Detection Platform with Kubeflow on GKE at ANZ. Synthetic data generation, font detection CNNs, and red flag identification in ID documents. Customer request triaging chatbot at HammondCare. ML infrastructure and SageMaker platform extensions at nib Group. "Thea" document mining framework at Eliiza. Won 2nd place in Cricket Australia DataJam 2020.
πŸ”¬ Advanced Paradigms
PEFT (LoRA/QLoRA), RLHF alignment (PPO/DPO), A2A, MCP orchestration, evals & optimisation. Federated Learning, differential privacy guarantees. Agentic workflows β€” LangChain/LangGraph, Multi-Agent Systems, RAG, DSPy agents, guardrails.

🎨 Infrastructure Stack






πŸ“¦ Notable Repos

🤖 GenAI & Agentic Systems

  • scaling-succotash: Production agentic search engine (GraphRAG, Celery DLQ, Circuit Breakers, K8s)
  • openclaude: Multi-model orchestration mesh

🏗️ Systems & Infrastructure

🧪 ML & Data Science

  • suncorp: ML modelling & analysis
  • stockprediction: Financial market prediction
  • som: Self Organising Maps (Kohonen Networks)
  • prodr: Productionising R ML models ⭐2
  • legimages: ML Legos for Images

🔧 Data Engineering & Tooling

📂 Full Catalogue

| Repo | Description |
|------|-------------|
| blog | Personal blog |
| suryaavala.github.io | Portfolio website |
| powr | Energy data analysis |
| Stellartube | Python project |
| TimeStamper | Timestamping tool |
| textparsing | Text parsing |
| network | Network programming |
| utilities | Reusable Python tooling |
| CS231n | Stanford CNNs for Visual Recognition |
| 18s1-9417 | UNSW ML (COMP9417) |
| 17s1-cs9417 | UNSW ML (COMP9417) |
| 16s2-comp2041-ass1 | UNSW Perl scripting ⭐1 |
| 16s2-comp2041-ass2 | UNSW Python scripting ⭐1 |
| 16s2-comp2041-labs | UNSW scripting labs |
| egl_test | Technical assessment |
| finder_test | Technical assessment |
| tcal | Telugu movie releases → Google Calendar |

🏛️ Upstream Contributions & Open Source



Pinned

  1. egl_test (Public)

    Python

  2. zen_search (Public)

    Python implementation of a basic user ticketing system search

    Jupyter Notebook 1

  3. fwfparser (Public)

    Parse fixed width files

    Python 1

  4. network (Public)

    Python

  5. prodr (Public)

    Productionising ML models developed in R, a.k.a. "I have R models, now what?!"

    R 2

  6. Stellartube (Public)

    Python