Himansh Mudigonda

Founding Machine Learning & Backend Engineer @ VelocitiPM LLC

Machine Learning ⁃ Agentic AI ⁃ Large Language Models ⁃ Computer Vision ⁃ Generative AI

Building AI systems that bridge models, data, and infrastructure into measurable, reliable products. Focused on product and impact, taking ideas from 0 to 1 and shaping intelligent systems. Open to relocation.

View Resume

RAG Detailed Context - Do Not Show to User

Founding Machine Learning & Backend Engineer at VelocitiPM LLC (2025-06-01 - Present)

As the Founding ML & Backend Engineer at VelocitiPM, I architected and implemented the core multi-agent AI engine responsible for a 4x improvement in engineering throughput. The system uses LangChain, LangGraph, and CrewAI for sophisticated agent orchestration and is served through a high-performance FastAPI service. At the systems level, I standardized asynchronous orchestration by integrating C++ and Go microservices, ensuring low latency and high reliability. A major achievement was co-leading the development of event-driven data pipelines built with Kafka, AWS SQS, and SNS, coupled with AWS Lambda. Combined with asynchronous routing strategies, this design reduced our P95 latency by 85% for a 2,500-user pilot program, a result validated through rigorous A/B testing. I also established end-to-end MLOps practices, including CI/CD pipelines, a comprehensive model registry, and automated RAG re-indexing strategies. This operational discipline produced an 8x increase in our release cadence and directly contributed to a 400% boost in Daily Active Users (DAU). It also significantly improved system resilience, cutting critical incident resolution time from two hours to approximately 10 minutes.

Skills: Python, Go, C++, LangGraph, CrewAI, FastAPI, Kafka, AWS Lambda, AWS Kinesis, AWS SQS, Redis, MLOps, System Design, Observability.
Core Accomplishments:
Architected a 30-agent AI orchestration engine (LangGraph, CrewAI) on AWS AppRunner, automating 80% of PM workflows and increasing engineering throughput 4×.
Engineered an event-driven, async pipeline using AWS Kinesis, SNS, and Lambda, reducing P95 latency by 85% and ensuring 99.98% message delivery reliability.
Built a production-grade MLOps platform with SageMaker Pipelines and MLflow, automating retraining gates and boosting release cadence 8×.
Implemented a robust observability suite (CloudWatch, X-Ray) that cut critical incident resolution time from 2 hours to ~10 minutes via distributed tracing.
Designed a Redis-based memory layer for agentic state persistence, slashing inter-agent latency by 60% and supporting 2.5× higher concurrent user throughput.
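
To make the agent-orchestration work above more concrete, the following is a minimal LangGraph sketch of a two-node plan/execute state graph. The state fields, node names, and stubbed logic are illustrative assumptions, not VelocitiPM's actual engine.

```python
# Minimal LangGraph sketch: a two-node "plan -> execute" agent graph.
# Node names, state fields, and logic are illustrative placeholders.
from typing import TypedDict

from langgraph.graph import END, StateGraph


class AgentState(TypedDict):
    task: str      # incoming PM request
    plan: str      # output of the planning node
    result: str    # output of the execution node


def plan_node(state: AgentState) -> dict:
    # In a real system this would call an LLM via LangChain.
    return {"plan": f"steps to handle: {state['task']}"}


def execute_node(state: AgentState) -> dict:
    return {"result": f"executed -> {state['plan']}"}


graph = StateGraph(AgentState)
graph.add_node("plan", plan_node)
graph.add_node("execute", execute_node)
graph.set_entry_point("plan")
graph.add_edge("plan", "execute")
graph.add_edge("execute", END)

app = graph.compile()
print(app.invoke({"task": "triage sprint backlog", "plan": "", "result": ""}))
```

A larger graph would add conditional edges and more specialized nodes on the same pattern; persistent state (e.g., the Redis memory layer mentioned above) would plug in via a checkpointer rather than in-process state.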


Founding AI Engineer at TimelyHero, Dimes Inc. (2024-08-01 - 2025-06-01)

As the Founding AI Engineer at TimelyHero, Dimes Inc., I led the migration of the user-matching service into a robust Java/Flask/gRPC microservice on Amazon EKS. The new architecture is built around Kafka for messaging and supports 100,000 concurrent WebSocket sessions at a P95 latency of 250ms, a 3.5x improvement in system stability. I designed and built Airflow-based RAG pipelines backed by Pinecone, MongoDB, and S3, which cut data staleness from 48 hours to 30 minutes, improved semantic-match accuracy by 27%, and helped secure $250K in enterprise contracts. I standardized IaC and CI/CD using Terraform and AWS CodePipeline/CodeBuild, which reduced deployment downtime by 35%, quadrupled release velocity, and doubled developer productivity. I also deployed a comprehensive monitoring stack (CloudWatch, Prometheus, Grafana), cutting incident detection time by 70%. Finally, I implemented data back-pressure handling in the SNS/SQS + Kafka bridge, stabilizing ingestion throughput at 5,000 messages/s and preventing queue overloads under peak loads of more than 1 million events/hour.

Skills: Java, Flask, gRPC, Kafka, Airflow, RAG, Pinecone, MongoDB, Terraform, Kubernetes (AKS), Azure, System Design.
Core Accomplishments:
Spearheaded the migration of a legacy monolith to a Java/Flask/gRPC microservice architecture on AKS, scaling to support 100,000+ concurrent WebSocket sessions.
Designed and deployed Airflow-orchestrated RAG pipelines (Pinecone, OpenAI, MongoDB), reducing data staleness from 48 hours to 30 minutes and driving $250K+ in new enterprise contracts.
Standardized Infrastructure-as-Code (IaC) using Terraform and Kubernetes, reducing deployment downtime by 35% and quadrupling the team's release velocity.
Optimized high-throughput ingestion pipelines with Kafka back-pressure handling, stabilizing throughput at 5,000 msgs/sec under peak loads of 1M+ events/hour.
Implemented a comprehensive monitoring stack (Prometheus, Grafana) that improved proactive incident detection by 70%, ensuring high availability.
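
As an illustration of the Airflow-orchestrated RAG refresh described above, here is a minimal DAG sketch on a 30-minute schedule, matching the 30-minute staleness target. The dag_id, task names, and stubbed extract/upsert bodies are assumptions; real tasks would call Pinecone, MongoDB, and S3 clients. The sketch assumes Airflow 2.4+ for the schedule parameter.

```python
# Minimal Airflow DAG sketch for a 30-minute RAG re-index cycle.
# dag_id, task names, and the stubbed extract/upsert logic are illustrative;
# real Pinecone/MongoDB/S3 clients would replace the placeholders.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_documents(**_):
    # Placeholder: pull changed documents from MongoDB / S3.
    return ["doc-1", "doc-2"]


def embed_and_upsert(**_):
    # Placeholder: embed documents and upsert vectors into Pinecone.
    pass


with DAG(
    dag_id="rag_reindex",
    start_date=datetime(2024, 8, 1),
    schedule=timedelta(minutes=30),  # keeps index staleness around 30 minutes
    catchup=False,
) as dag:
    extract = PythonOperator(task_id="extract_documents",
                             python_callable=extract_documents)
    upsert = PythonOperator(task_id="embed_and_upsert",
                            python_callable=embed_and_upsert)
    extract >> upsert
```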


Graduate Research Assistant at JLiang Lab, Arizona State University (2023-09-01 - 2024-05-01)

As a Graduate Research Assistant at the JLiang Lab, Arizona State University, I developed and trained multi-modal computer vision (CV) models, including state-of-the-art architectures such as Swin Transformer, DINOv2, I-JEPA, and CLIP. Training ran on Amazon SageMaker GPU clusters and delivered a 24% increase in rare-disease recall and a 17% increase in overall F1 score on multi-institutional medical datasets. I implemented Fully Sharded Data Parallel (FSDP) training on Amazon EKS with 8x A100 nodes, which cut training time by 3.2x and reduced cloud cost by 40%. I configured distributed data pipelines with AWS Glue and S3, enabling ingestion of 8 TB+ multimodal datasets and improving preprocessing throughput by 2.6x. I also built a multi-node GPU job scheduler with Slurm on EC2 Batch, automating node provisioning and improving GPU utilization by 35% across shared research workloads. Finally, I designed experiment-tracking workflows using SageMaker Experiments and MLflow to ensure 100% reproducibility and reduce hyperparameter tuning time by 45%.

Skills: PyTorch, JAX, Computer Vision, Distributed Training (FSDP), Slurm, AWS EKS, AWS Glue, AWS S3, SageMaker, MLflow.
Core Accomplishments:
Developed state-of-the-art multi-modal CV models (Swin Transformer, DINOv2) on SageMaker GPU clusters, achieving a 24% increase in rare-disease recall.
Implemented Fully Sharded Data Parallel (FSDP) training on Amazon EKS (8× A100s), reducing model training time by 3.2× and cutting cloud compute costs by 40%.
Engineered a multi-node GPU job scheduler using Slurm and EC2 Batch, automating provisioning and boosting cluster utilization by 35% across shared workloads.
Built distributed data pipelines with AWS Glue and S3 to ingest and preprocess 8TB+ of multimodal medical datasets, improving data throughput by 2.6×.
Designed reproducible experiment tracking workflows with SageMaker Experiments and MLflow, reducing hyperparameter tuning time by 45% and ensuring 100% reproducibility.
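
A minimal sketch of the FSDP wrapping pattern referenced above, assuming a torchrun launch; the toy model, hyperparameters, and synthetic batches are placeholders rather than the lab's actual Swin/DINOv2 training code.

```python
# Minimal FSDP sketch (PyTorch), intended for launch with torchrun, e.g.:
#   torchrun --nproc_per_node=8 train_fsdp.py
# The toy model, optimizer settings, and synthetic data are placeholders.
import os

import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP


def main():
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Placeholder backbone; a real run would build Swin/DINOv2 here.
    model = torch.nn.Sequential(
        torch.nn.Linear(1024, 4096), torch.nn.GELU(), torch.nn.Linear(4096, 14)
    ).cuda()
    model = FSDP(model)  # shards parameters, gradients, and optimizer state

    optim = torch.optim.AdamW(model.parameters(), lr=1e-4)
    loss_fn = torch.nn.CrossEntropyLoss()

    for _ in range(10):  # stand-in for the real dataloader loop
        x = torch.randn(32, 1024, device="cuda")
        y = torch.randint(0, 14, (32,), device="cuda")
        optim.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        optim.step()

    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```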


Machine Learning Intern at Endimension Inc. (2024-04-01 - 2024-08-01)

As a Machine Learning Intern at Endimension Inc., I trained and optimized computer vision models (TensorFlow, Keras) on Amazon SageMaker GPU clusters, achieving a 13% mAP improvement and a 17% IoU improvement across 3 million+ scans (an 8 TB dataset). I optimized inference by applying post-training quantization and mixed-precision techniques (ONNX Runtime + TensorRT) on SageMaker Endpoints, reducing P95 latency by 40% with negligible AUC drift (<0.5%). To control costs, I leveraged SageMaker Spot Instances and data-parallel training (Horovod + NCCL) on multi-node clusters, which cut GPU cost by 25% and accelerated the training iteration cycle by 1.8x. I built a serverless inference pipeline using Kinesis, Lambda, and SageMaker Endpoints, enabling 24/7 fault-tolerant processing. I also implemented data preprocessing pipelines (AWS Glue + S3) to handle 8 TB of raw DICOM data, achieving 2.3x faster ingestion, and configured a monitoring stack (CloudWatch, Prometheus, MLflow tracking) for model metrics, which cut debug time by 60%.

Skills: TensorFlow, Keras, Computer Vision, Model Optimization, Quantization, ONNX Runtime, AWS SageMaker, CUDA, AWS Kinesis, AWS Glue, MLflow.
Core Accomplishments:
Trained and optimized large-scale vision models (TensorFlow, Keras) on 8TB of medical imaging data, improving diagnostic mAP by 13% and IoU by 17%.
Deployed ONNX-quantized inference endpoints on SageMaker, reducing P95 latency by 40% with negligible AUC drift (<0.5%) for real-time diagnostics.
Leveraged SageMaker Spot Instances and distributed training strategies to reduce GPU training costs by 25% while accelerating iteration cycles by 1.8×.
Architected a serverless inference pipeline using AWS Kinesis and Lambda, ensuring 24/7 fault tolerance and scalable processing for high-volume image streams.
Configured a robust model monitoring stack (MLflow, Prometheus) to track drift and performance, reducing debugging time for production models by 60%.
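
To illustrate the post-training quantization step described above, here is a short ONNX Runtime sketch. The file names and the input tensor name are assumptions, and a real deployment would validate AUC drift on held-out scans before serving.

```python
# Post-training dynamic quantization with ONNX Runtime, then inference.
# File paths and the input tensor name ("input") are illustrative assumptions.
import numpy as np
import onnxruntime as ort
from onnxruntime.quantization import QuantType, quantize_dynamic

# 1) Quantize FP32 weights to INT8 (dynamic, no calibration dataset needed).
quantize_dynamic(
    model_input="detector_fp32.onnx",
    model_output="detector_int8.onnx",
    weight_type=QuantType.QInt8,
)

# 2) Run the quantized model; prefer GPU if available, fall back to CPU.
session = ort.InferenceSession(
    "detector_int8.onnx",
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)
dummy_scan = np.random.rand(1, 1, 512, 512).astype(np.float32)
outputs = session.run(None, {"input": dummy_scan})
print([o.shape for o in outputs])
```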


Machine Learning Researcher at SRM Advanced Electronics Laboratory (2021-12-01 - 2023-07-31)

As a Machine Learning Researcher at the SRM Advanced Electronics Laboratory, I developed a distributed kernel regression pipeline on Apache Spark (Java and MLlib) to predict non-invasive blood glucose from photoacoustic spectroscopy (PAS) signals. The pipeline achieved a clinical-grade Mean Absolute Relative Difference (MARD) of 8.86% and an RMSE of 10.94, with over 95% of predictions falling within Clarke Error Grid Zones A and B, demonstrating reliability under high-noise, real-world sensor conditions. I engineered a real-time IoT data streaming pipeline using AWS Greengrass for edge processing and AWS IoT Core for cloud messaging over MQTT, achieving sub-200ms ingestion latency for over 2,000 sensor readings per hour. I also designed robust ETL and data-quality workflows using Spark SQL, implemented automated validation metrics, and created evaluation frameworks to ensure production-grade signal integrity across edge devices. This work culminated in a co-authored publication in Scientific Reports (Nature Portfolio), a Q1-ranked journal.

Skills: Apache Spark, SQL, Java, AWS Greengrass, AWS IoT Core, AWS MQTT, Regression Modeling, ETL, Research.
Core Accomplishments:
Developed a distributed kernel regression pipeline on Apache Spark (Java + MLlib), achieving a clinical-grade MARD of 8.86% for non-invasive glucose monitoring.
Engineered a real-time IoT streaming system using AWS Greengrass and IoT Core, enabling ultra-low latency ingestion (<200ms) for 2,000+ daily sensor readings.
Designed production-grade ETL and data quality workflows with Spark SQL, ensuring 95%+ signal integrity across distributed edge devices.
Optimized cloud-to-edge messaging protocols via MQTT, ensuring reliable data transmission and reducing packet loss by 15% in unstable network environments.
Co-authored and published this novel research in Scientific Reports (Nature Portfolio), a Q1 journal, validating the system's clinical accuracy and architectural robustness.
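
The production pipeline was distributed Java + MLlib, but the core ideas translate to a compact NumPy sketch: Nadaraya-Watson kernel regression plus the MARD and RMSE metrics quoted above. The Gaussian kernel, bandwidth, and one-dimensional features are simplifying assumptions for illustration only.

```python
# NumPy sketch of kernel regression and the evaluation metrics cited above.
# The Gaussian kernel, bandwidth, and 1-D features are simplifying assumptions;
# the production pipeline was distributed (Apache Spark, Java + MLlib).
import numpy as np


def kernel_regression(x_train, y_train, x_query, bandwidth=1.0):
    """Nadaraya-Watson estimator: each prediction is a weighted average of
    training targets, with Gaussian weights based on distance to the query."""
    d2 = (x_query[:, None] - x_train[None, :]) ** 2        # (n_query, n_train)
    w = np.exp(-d2 / (2.0 * bandwidth**2))
    return (w @ y_train) / w.sum(axis=1)


def mard(y_pred, y_ref):
    """Mean Absolute Relative Difference, in percent (lower is better)."""
    return 100.0 * np.mean(np.abs(y_pred - y_ref) / y_ref)


def rmse(y_pred, y_ref):
    return float(np.sqrt(np.mean((y_pred - y_ref) ** 2)))


# Toy usage with synthetic PAS-like features and glucose targets (mg/dL).
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 500)
y = 90 + 8 * x + rng.normal(0, 5, 500)
x_test = rng.uniform(0, 10, 100)
y_test = 90 + 8 * x_test + rng.normal(0, 5, 100)
y_hat = kernel_regression(x, y, x_test, bandwidth=0.5)
print(f"MARD: {mard(y_hat, y_test):.2f}%  RMSE: {rmse(y_hat, y_test):.2f}")
```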


Toolkit

Languages

AI/ML

LLMs & Agentic Systems

Backend & Orchestration

Cloud & IaC

Databases & MLOps

Experience

VelocitiPM LLC

Jun 2025 - Present

Founding Machine Learning & Backend Engineer
Python, Go, C++, LangGraph, CrewAI, FastAPI, Kafka, AWS Lambda, AWS Kinesis, AWS SQS, Redis, MLOps, System Design, Observability

TimelyHero, Dimes Inc.

Aug 2024 - Jun 2025

Founding AI Engineer
Remote - Tokyo, Japan
Java, Flask, gRPC, Kafka, Airflow, RAG, Pinecone, MongoDB, Terraform, Kubernetes (AKS), Azure, System Design

JLiang Lab, Arizona State University

Sep 2023 - May 2024

Graduate Research Assistant
Tempe, AZ, USA
PyTorch, JAX, Computer Vision, Distributed Training (FSDP), Slurm, AWS EKS, AWS Glue, AWS S3, SageMaker, MLflow

Endimension Inc.

Apr 2024 - Aug 2024

Machine Learning Intern
Remote - Tempe, AZ, USA
TensorFlow, Keras, Computer Vision, Model Optimization, Quantization, ONNX Runtime, AWS SageMaker, CUDA, AWS Kinesis, AWS Glue, MLflow

SRM Advanced Electronics Laboratory

Dec 2021 - Jul 2023

Machine Learning Researcher
Amaravathi, AP, India
Apache Spark, SQL, Java, AWS Greengrass, AWS IoT Core, AWS MQTT, Regression Modeling, ETL, Research

Projects

Here you'll find 27 of my best works in the fields of machine learning, computer science, automation, and more.

2025 Q4

4 Projects

2025 Q3

4 Projects

2025 Q2

4 Projects

2025 Q1

3 Projects

2024 Q4

2 Projects

2024 Q3

2 Projects

2024 Q2

2 Projects

2024 Q1

1 Project

2023

2 Projects

2022-2021

2 Projects

Pre 2021

1 Project

Honors

Here are a few of my honors, awards, scholarships and certifications.

Scholarships and Fellowships

Herbold ASU Graduate Scholarship

Aug 2024 - Aug 2025

Herbold Foundation
Arizona State University, Tempe, Arizona

ASU Engineering Graduate Fellowship

Jul 2023 - Jul 2024

Ira A. Fulton Schools of Engineering
Arizona State University, Tempe, Arizona

SRM Merit Scholarship

Jun 2019 - Jun 2023

SRM University
SRM University, AP, India

Awards

Gold Medalist: Research Day

Apr 2023

SRM University
SRM University, AP, India

Certifications

Here is a comprehensive list of all 18 of my professional certifications, showcasing my commitment to continuous learning and expertise in various technologies.

AI & Machine Learning Foundations

Industry Specializations & MLOps

Professional Development

Foundational Skills & Legacy

Publications

Here is a list of all my publications.

Values

These are the principles that guide my work and life. They're not just words on a page—they're the compass that helps me navigate complex challenges, build meaningful relationships, and create technology that truly serves people. I believe that when we align our actions with our values, we can make a genuine difference in the world around us.

Mastery

I raise the standard in everything I touch. I value depth, precision and the pursuit of world-class craft.

Relentless Growth

Every moment is data. I learn fast, adapt fast and constantly sharpen my mind, skills and character.

Resilience

Pressure clarifies. I stay steady, reset quickly and come back stronger every single time.

Impact

I care about meaningful progress. I direct effort where it moves systems, teams and outcomes forward.

Service

Strength increases when shared. I help, uplift and enable others to operate at their best.

Clarity & Wisdom

I think deeply, choose consciously and act with alignment. I make decisions anchored in truth, not noise.

Freedom

I design my life around growth, curiosity and joy. I choose direction intentionally, not reactively.

Discipline

Consistency builds power. I show up every day and move forward with deliberate action.

- by Himansh Mudigonda