khajamoddin/README.md

Khajamoddin Shaik

Senior Systems Architect & AI Infrastructure Engineer

Available for End-to-End AI Solution Architecture & GCP Implementations

Rust · Go · Performance Engineering · Sustainable AI


I design and build reliable, high-performance systems where correctness, efficiency, and long-term maintainability matter. My work sits at the intersection of ML architecture, cloud infrastructure, and cost-aware performance engineering, supporting organizations moving from experimentation to large-scale AI deployment.

"The era of 'AI at any cost' is over. In 2026, leaders win by building AI systems that are high-performance, cost-governed, and energy-aware."

🚀 End-to-End AI Solutions: From Advisory to Production

I help enterprises navigate the "AI Infrastructure Reckoning" by bridging the gap between high-level strategy and production-grade engineering. Whether starting from a blank slate or scaling a pilot, I deliver high-performance, cost-governed, and energy-aware systems.

🛠️ How We Can Work Together

| Engagement Model | Core Deliverables & Outcomes |
| --- | --- |
| A-to-Z Project Build | • Custom Generative AI Agents: Fully integrated, secure workflows using Gemini & Vertex AI Agent Builder.<br>• Production ML Platforms: Scalable GKE/Vertex AI environments built for 99.9% reliability. |
| Consulting & Advisory | • Infrastructure Audit: Deep-dive into GPU/TPU utilization and cloud spend to identify 15–30% efficiency gains.<br>• MLOps/LLMOps Strategy: Architecting end-to-end lifecycles (Pipelines, Model Garden, Feature Store). |
| Data Infrastructure Transformation | • Legacy to Modern AI Pipelines: Modernizing mission-critical workloads (IBM MQ/ACE) by migrating to high-throughput modern OSS stacks including Apache Kafka, Redis, and Airflow. |
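
A reliability target such as the 99.9% figure above is easier to reason about as a downtime budget. The following is a minimal sketch, assuming "reliability" is read as availability over a fixed period; the function name `downtime_budget_minutes` is hypothetical and is not taken from any project listed here.

```python
def downtime_budget_minutes(availability: float, period_days: float = 30.0) -> float:
    """Return the downtime (in minutes) that an availability target allows.

    `availability` is a fraction (0.999 for "three nines"); `period_days`
    is the measurement window. Both names are illustrative assumptions.
    """
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be a fraction between 0 and 1")
    total_minutes = period_days * 24 * 60
    return (1.0 - availability) * total_minutes


if __name__ == "__main__":
    # Budget for a 30-day month and for a 365-day year at 99.9%.
    print(round(downtime_budget_minutes(0.999), 1))
    print(round(downtime_budget_minutes(0.999, 365), 1))
```

At 99.9%, this works out to roughly 43 minutes per 30-day month, which is the figure an on-call and deployment process has to fit within.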

⚑ The Outcome

  • Reduced TCO: Transitioning to efficient Small Language Models (SLMs) and right-sizing model selection to cut inference costs by up to 40%.
  • Infrastructure-Native Intelligence: Leveraging ESNODE telemetry for node-level power, thermal, and GPU utilization optimization.
  • Sustainability: Aligning AI deployments with Carbon-Aware Scheduling and enterprise cloud spend (FinOps) governance.
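
As an illustration of the carbon-aware scheduling idea above, a deferrable batch job can be placed in the forecast window with the lowest grid carbon intensity. This is a minimal sketch under stated assumptions: the hourly forecast values are made up, `best_start_hour` is a hypothetical helper, and a real scheduler would pull the forecast from a grid-data provider and also weigh deadlines and cost.

```python
from typing import Sequence


def best_start_hour(intensity: Sequence[float], job_hours: int) -> int:
    """Return the start index of the contiguous window with the lowest
    total carbon intensity (e.g. gCO2/kWh per hour), using a sliding sum."""
    if job_hours <= 0 or job_hours > len(intensity):
        raise ValueError("job_hours must be between 1 and len(intensity)")
    window = sum(intensity[:job_hours])
    best_sum, best_start = window, 0
    for start in range(1, len(intensity) - job_hours + 1):
        # Slide the window one hour forward: add the new hour, drop the old one.
        window += intensity[start + job_hours - 1] - intensity[start - 1]
        if window < best_sum:
            best_sum, best_start = window, start
    return best_start


if __name__ == "__main__":
    # Hypothetical hourly forecast; not real grid data.
    forecast = [420, 390, 310, 180, 150, 200, 340, 450]
    print(best_start_hour(forecast, job_hours=3))
```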

🧩 ESNODE: Infrastructure-Native Intelligence

Founder & Managing Director

At ESNODE, I lead the development of a vendor-neutral AI infrastructure optimization platform. ESNODE bridges the gap between compute demand and energy realities by delivering real-time telemetry for modern AI clusters.

  • GPU/CPU/Power Telemetry: Deep visibility into node-level power draw and thermal behavior.
  • Inference Economics: Transitioning to efficient SLMs to maximize tokens-per-watt.
  • Modernized Power Footprint: Scaling AI systems from on-prem servers to cloud-scale deployments responsibly.
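
The tokens-per-watt idea above can be made concrete: tokens per second divided by watts is the same quantity as tokens per joule. The sketch below assumes throughput and average power draw have already been measured; the function name and all sample numbers are hypothetical and do not come from ESNODE.

```python
def tokens_per_joule(tokens_generated: int, avg_power_watts: float, duration_s: float) -> float:
    """Energy efficiency of an inference run: tokens produced per joule consumed."""
    energy_joules = avg_power_watts * duration_s  # 1 W sustained for 1 s = 1 J
    if energy_joules <= 0:
        raise ValueError("power and duration must be positive")
    return tokens_generated / energy_joules


if __name__ == "__main__":
    # Hypothetical 60-second runs: a small model on a modest GPU vs a large one.
    small = tokens_per_joule(tokens_generated=120_000, avg_power_watts=300.0, duration_s=60.0)
    large = tokens_per_joule(tokens_generated=90_000, avg_power_watts=700.0, duration_s=60.0)
    print(f"small model: {small:.2f} tokens/J, large model: {large:.2f} tokens/J")
```

Comparing models on this metric, rather than on raw throughput alone, is one way to frame the SLM-versus-large-model trade-off in energy terms.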

🛠 Technology Focus

| Category | Technologies |
| --- | --- |
| Core Languages | Rust, Go, Python, Java, C++ |
| AI & Data Capabilities | RAG, AI Agents, LLMOps, Vector Search, Analytics |
| AI & ML Libraries | PyTorch, Transformers, TensorFlow, scikit-learn |
| AI & Data Platforms | Vertex AI, Gemini, BigQuery |
| Data & Messaging (OSS) | Apache Kafka, Redis, PostgreSQL |
| Vector Databases (OSS) | Milvus, Pinecone, PgVector |
| Workflow & Orchestration | Apache Airflow, ComfyUI |
| Cloud & Infrastructure | Google Cloud, Kubernetes |
| Observability | Prometheus, OpenTelemetry |
| Legacy & Enterprise Systems | IBM MQ, IBM ACE, IBM Mainframes, RHEL, Oracle |

🏗 Professional Philosophy & Experience

With over 25 years of experience in mission-critical environments (Banking, Defence, Telecommunications, Power Generation), my work is guided by:

  1. Robustness over novelty: Engineering for quiet reliability and long-term trust.
  2. Infrastructure-aware ML: Scaling architectures that respect hardware limits.
  3. Predictive Performance: Understanding why systems fail in order to improve operational correctness.

Under full NDA, I am open to conducting confidential case studies for enterprises facing memory-related performance bottlenecks or GPU utilization challenges.


📍 Based in Stockholm, Sweden / India

LinkedIn Website Email

Designed for Reliability & Performance

Pinned

  1. graphrag (Public)

    Forked from microsoft/graphrag

    A modular graph-based Retrieval-Augmented Generation (RAG) system

    Python

  2. optax (Public)

    Forked from google-deepmind/optax

    Optax is a gradient processing and optimization library for JAX.

    Python

  3. huggingface/transformers (Public)

    🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

    Python 160k 33.1k

  4. vllm-project/vllm (Public)

    A high-throughput and memory-efficient inference and serving engine for LLMs

    Python 78.7k 16.3k

  5. ESNODE/ESNODE-Core (Public)

    ESNODE Core delivers high-freq GPU telemetry for power-aware optimization: power, utilization, thermals, and health via a Prometheus /metrics endpoint. Includes per-GPU stats, optional TSDB, daemon…

    Makefile 3