Founding Machine Learning & Backend Engineer @ VelocitiPM LLC
Machine Learning ⁃ Agentic AI ⁃ Large Language Models ⁃ Computer Vision ⁃ Generative AI
Building AI systems that bridge models, data, and infrastructure into measurable, reliable products. Focused on product and impact, taking ideas from 0 to 1 and shaping intelligent systems. Open to relocation.
Founding Machine Learning & Backend Engineer at VelocitiPM LLC (2025-06-01 - Present)
As the Founding ML & Backend Engineer at VelocitiPM, I architected and implemented the core multi-agent AI engine behind a 4x throughput improvement. The system uses LangChain, LangGraph, and CrewAI for agent orchestration and is served through a high-performance FastAPI service. For system-level operations, I standardized asynchronous orchestration by integrating C++ and Go microservices, keeping operations low-latency and highly reliable. I also co-led the development of event-driven data pipelines built on Kafka, AWS SQS, SNS, and AWS Lambda; validated through rigorous A/B testing and async routing strategies, this design cut our P95 latency by 85% for a 2,500-user pilot program. Finally, I established end-to-end MLOps practices, including CI/CD pipelines, a comprehensive model registry, and automated RAG re-indexing. This operational discipline produced an 8x increase in release cadence, contributed directly to a 400% rise in Daily Active Users (DAU), and cut critical-incident resolution time from two hours to roughly 10 minutes.
Skills: Python, Go, C++, LangGraph, CrewAI, FastAPI, Kafka, AWS Lambda, AWS Kinesis, AWS SQS, Redis, MLOps, System Design, Observability.
Core Accomplishments:
- Architected a 30-agent AI orchestration engine (LangGraph, CrewAI) on AWS App Runner, automating 80% of PM workflows and increasing engineering throughput 4×.
- Engineered an event-driven, async pipeline using AWS Kinesis, SNS, and Lambda, reducing P95 latency by 85% and ensuring 99.98% message delivery reliability.
- Built a production-grade MLOps platform with SageMaker Pipelines and MLflow, automating retraining gates and boosting release cadence 8×.
- Implemented a robust observability suite (CloudWatch, X-Ray) that cut critical-incident resolution time from 2 hours to ~10 minutes via distributed tracing.
- Designed a Redis-based memory layer for agentic state persistence, cutting inter-agent latency by 60% and supporting 2.5× higher concurrent user throughput.
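As an illustration of the Redis-backed agent memory pattern described above, here is a minimal, runnable sketch of per-agent state persistence with a TTL. A plain dict stands in for Redis so the example is self-contained, and all names (`AgentMemory`, the `agent:<id>:<field>` key scheme) are hypothetical, not the production code.

```python
import json
import time


class AgentMemory:
    """Per-agent state store with a TTL, mirroring a Redis SETEX/GET pattern.

    `backend` is any mapping; in production it would be a Redis client,
    here a plain dict keeps the sketch self-contained and runnable.
    """

    def __init__(self, backend=None, ttl_seconds=3600):
        self.backend = backend if backend is not None else {}
        self.ttl = ttl_seconds

    def _key(self, agent_id, field):
        # Namespaced key per agent and field, e.g. "agent:planner-1:scratchpad"
        return f"agent:{agent_id}:{field}"

    def save(self, agent_id, field, state):
        # Store JSON-serialized state with an expiry timestamp,
        # analogous to Redis SETEX.
        self.backend[self._key(agent_id, field)] = (
            json.dumps(state), time.time() + self.ttl
        )

    def load(self, agent_id, field):
        entry = self.backend.get(self._key(agent_id, field))
        if entry is None:
            return None
        payload, expires_at = entry
        if time.time() > expires_at:  # expired, as Redis would evict it
            del self.backend[self._key(agent_id, field)]
            return None
        return json.loads(payload)


memory = AgentMemory(ttl_seconds=60)
memory.save("planner-1", "scratchpad", {"step": 3, "goal": "draft roadmap"})
restored = memory.load("planner-1", "scratchpad")
```

The TTL keeps stale agent scratchpads from accumulating between sessions, which is the main reason a key-value store with expiry fits agentic state better than a durable database.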
Founding AI Engineer at TimelyHero, Dimes Inc. (2024-08-01 - 2025-06-01)
As the Founding AI Engineer at TimelyHero, Dimes Inc., I led the strategic migration of the user-matching service into a robust Java/Flask/gRPC microservice on Amazon EKS. The new architecture is built around Kafka for messaging and supports 100,000 concurrent WebSocket sessions at a P95 latency of 250ms, resulting in a 3.5x improvement in system stability. I designed and built Airflow-based RAG pipelines utilizing Pinecone, MongoDB, and S3, which dramatically cut data staleness from 48 hours to 30 minutes, improved semantic-match accuracy by 27%, and was crucial in securing $250K in enterprise contracts. I standardized IaC and CI/CD using Terraform and AWS CodePipeline/CodeBuild, which reduced deployment downtime by 35%, quadrupled release velocity, and doubled developer productivity. I also deployed a comprehensive monitoring stack (CloudWatch, Prometheus, Grafana), improving incident detection time by 70%. I implemented data back-pressure handling in the SNS/SQS + Kafka bridge, stabilizing ingestion throughput at 5,000 messages/s and preventing queue overloads under peak loads greater than 1 million events/hour.
Skills: Java, Flask, gRPC, Kafka, Airflow, RAG, Pinecone, MongoDB, Terraform, Kubernetes (EKS), AWS, System Design.
Core Accomplishments:
- Spearheaded the migration of a legacy monolith to a Java/Flask/gRPC microservice architecture on EKS, scaling to support 100,000+ concurrent WebSocket sessions.
- Designed and deployed Airflow-orchestrated RAG pipelines (Pinecone, OpenAI, MongoDB), reducing data staleness from 48 hours to 30 minutes and driving $250K+ in new enterprise contracts.
- Standardized Infrastructure-as-Code (IaC) using Terraform and Kubernetes, reducing deployment downtime by 35% and quadrupling the team's release velocity.
- Optimized high-throughput ingestion pipelines with Kafka back-pressure handling, stabilizing throughput at 5,000 msgs/sec under peak loads of 1M+ events/hour.
- Implemented a comprehensive monitoring stack (Prometheus, Grafana) that improved proactive incident detection by 70%, ensuring high availability.
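The back-pressure idea described above can be shown with a bounded buffer: when the buffer is full, the producer is told to pause rather than flooding the downstream consumer. This is a stdlib-only sketch of the general mechanism, not the SNS/SQS + Kafka bridge itself; `offer` and `BUFFER_SIZE` are hypothetical names.

```python
import queue

BUFFER_SIZE = 100  # the bounded buffer is the back-pressure point

buf = queue.Queue(maxsize=BUFFER_SIZE)


def offer(msg) -> bool:
    """Producer-side back-pressure: non-blocking put. On a full buffer,
    return False so the upstream poller can pause and retry instead of
    overwhelming the consumer."""
    try:
        buf.put_nowait(msg)
        return True
    except queue.Full:
        return False


# simulate a burst of 250 messages arriving before the consumer runs
accepted = sum(offer(f"msg-{i}") for i in range(250))
rejected = 250 - accepted  # these would trigger a pause/retry upstream

# the consumer later drains the buffer at its own pace
drained = 0
while not buf.empty():
    buf.get_nowait()
    drained += 1
```

In a real broker bridge the `False` branch would translate into pausing the SQS poll loop or relying on Kafka consumer lag, but the invariant is the same: the buffer caps in-flight work so bursts cannot overrun the slower side.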
Graduate Research Assistant at JLiang Lab, Arizona State University (2023-09-01 - 2024-05-01)
As a Graduate Research Assistant at the JLiang Lab, Arizona State University, I developed and trained multi-modal Computer Vision (CV) models, including state-of-the-art architectures such as Swin Transformer, DINOv2, I-JEPA, and CLIP. Training ran on Amazon SageMaker GPU clusters and produced a 24% increase in rare-disease recall and a 17% increase in overall F1 score on multi-institutional medical datasets. I achieved significant infrastructure efficiency by implementing Fully Sharded Data Parallel (FSDP) training on Amazon EKS with 8x A100 nodes, which cut training time by 3.2x and reduced cloud cost by 40%. I configured distributed data pipelines via AWS Glue and S3, enabling the ingestion of 8 TB+ multimodal datasets and improving preprocessing throughput by 2.6x. I also built a multi-node GPU job scheduler with Slurm on EC2 Batch, automating node provisioning and improving GPU utilization by 35% across shared research workloads. I designed experiment-tracking workflows using SageMaker Experiments and MLflow to ensure 100% reproducibility and reduce hyperparameter tuning time by 45%.
Skills: PyTorch, JAX, Computer Vision, Distributed Training (FSDP), Slurm, AWS EKS, AWS Glue, AWS S3, SageMaker, MLflow.
Core Accomplishments:
- Developed state-of-the-art multi-modal CV models (Swin Transformer, DINOv2) on SageMaker GPU clusters, achieving a 24% increase in rare-disease recall.
- Implemented Fully Sharded Data Parallel (FSDP) training on Amazon EKS (8× A100s), reducing model training time by 3.2× and cutting cloud compute costs by 40%.
- Engineered a multi-node GPU job scheduler using Slurm and EC2 Batch, automating provisioning and boosting cluster utilization by 35% across shared workloads.
- Built distributed data pipelines with AWS Glue and S3 to ingest and preprocess 8TB+ of multimodal medical datasets, improving data throughput by 2.6×.
- Designed reproducible experiment-tracking workflows with SageMaker Experiments and MLflow, reducing hyperparameter tuning time by 45% and ensuring 100% reproducibility.
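FSDP's core trick, flattening a model's parameters and giving each rank an equal contiguous shard, can be illustrated with plain arithmetic. The sketch below only shows the sharding math (the parameter count is made up for illustration); real FSDP in PyTorch additionally all-gathers each layer's shards on demand during forward and backward passes.

```python
def shard_sizes(num_params: int, world_size: int):
    """FSDP-style flat-parameter sharding: pad the flat buffer so it
    divides evenly across ranks, then assign each rank one contiguous,
    equal-sized shard."""
    pad = (-num_params) % world_size           # padding elements appended
    per_rank = (num_params + pad) // world_size
    return [per_rank] * world_size, pad


# hypothetical model size, sharded across the 8 A100 ranks mentioned above
sizes, pad = shard_sizes(1_000_003, 8)
```

Because each rank holds only `1/world_size` of the parameters (plus optimizer state and gradients sharded the same way), per-GPU memory drops roughly linearly with the number of ranks, which is what makes the larger effective batch sizes and shorter wall-clock times possible.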
Machine Learning Intern at Endimension Inc. (2024-04-01 - 2024-08-01)
As a Machine Learning Intern at Endimension Inc., I was responsible for training and optimizing computer vision models (TensorFlow, Keras) on Amazon SageMaker GPU clusters, improving mAP by 13% and IoU by 17% across 3 million+ scans (an 8 TB dataset). I optimized inference by applying post-training quantization and mixed-precision techniques (ONNX Runtime + TensorRT) on SageMaker Endpoints, reducing P95 latency by 40% with negligible AUC drift (<0.5%). To control costs, I leveraged SageMaker Spot Instances and data-parallel training (Horovod + NCCL) on multi-node clusters, which cut GPU cost by 25% and accelerated the training iteration cycle by 1.8x. I built a serverless inference pipeline using Kinesis, Lambda, and SageMaker Endpoints, enabling fault-tolerant, 24/7 processing of high-volume image streams. I also implemented data preprocessing pipelines (AWS Glue + S3) to handle 8 TB of raw DICOM data, achieving 2.3x faster ingestion, and configured a monitoring stack (CloudWatch, Prometheus, MLflow tracking) for model metrics, which cut debug time by 60%.
Skills: TensorFlow, Keras, Computer Vision, Model Optimization, Quantization, ONNX Runtime, AWS SageMaker, CUDA, AWS Kinesis, AWS Glue, MLflow.
Core Accomplishments:
- Trained and optimized large-scale vision models (TensorFlow, Keras) on 8TB of medical imaging data, improving diagnostic mAP by 13% and IoU by 17%.
- Deployed ONNX-quantized inference endpoints on SageMaker, reducing P95 latency by 40% with negligible AUC drift (<0.5%) for real-time diagnostics.
- Leveraged SageMaker Spot Instances and distributed training strategies to reduce GPU training costs by 25% while accelerating iteration cycles by 1.8×.
- Architected a serverless inference pipeline using AWS Kinesis and Lambda, ensuring 24/7 fault tolerance and scalable processing for high-volume image streams.
- Configured a robust model monitoring stack (MLflow, Prometheus) to track drift and performance, reducing debugging time for production models by 60%.
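Post-training quantization of the kind applied here can be sketched in a few lines: map float weights to int8 with a scale and zero-point, then verify the round-trip error stays below one quantization step. This is a generic asymmetric-quantization illustration with toy values, not the ONNX Runtime/TensorRT pipeline itself.

```python
def quantize_int8(values):
    """Asymmetric post-training quantization of a float tensor (here a
    plain list) to int8: derive scale and zero-point from the min/max
    range, then round each value into [-128, 127]."""
    lo, hi = min(values), max(values)
    scale = (hi - lo) / 255.0 or 1.0          # guard against constant input
    zero_point = round(-128 - lo / scale)     # shifts lo onto -128
    q = [max(-128, min(127, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point


def dequantize(q, scale, zero_point):
    """Recover approximate floats from the int8 codes."""
    return [(qi - zero_point) * scale for qi in q]


weights = [-0.42, 0.0, 0.17, 0.98, -0.05]     # toy weights, not model data
q, s, zp = quantize_int8(weights)
restored = dequantize(q, s, zp)
err = max(abs(a - b) for a, b in zip(weights, restored))
```

The round-trip error is bounded by half a quantization step per value, which is why accuracy metrics such as AUC typically drift only fractionally after int8 conversion, while weights shrink 4x and integer kernels speed up inference.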
Machine Learning Researcher at SRM Advanced Electronics Laboratory (2021-12-01 - 2023-07-31)
In my role as a Machine Learning Researcher at SRM Advanced Electronics Laboratory, I developed a distributed kernel regression pipeline on Apache Spark (using Java and MLlib) for the prediction of non-invasive blood glucose from photoacoustic spectroscopy (PAS) signals. The pipeline achieved a clinical-grade Mean Absolute Relative Difference (MARD) of 8.86% and an RMSE of 10.94, with over 95% of predictions falling within Clarke Error Grid Zones A and B, a testament to its reliability under high-noise, real-world sensor conditions. I engineered a real-time IoT data streaming pipeline leveraging AWS Greengrass for edge processing and AWS IoT Core for cloud messaging via MQTT, achieving ultra-low-latency ingestion of less than 200ms for over 2,000 sensor readings per hour. I also designed robust ETL and data-quality workflows using Spark SQL, implemented automated validation metrics, and created evaluation frameworks to ensure production-grade signal integrity across edge devices. This foundational work culminated in the co-authoring and publication of the research in Scientific Reports (Nature Portfolio), a Q1-ranked journal.
Skills: Apache Spark, SQL, Java, AWS Greengrass, AWS IoT Core, MQTT, Regression Modeling, ETL, Research.
Core Accomplishments:
- Developed a distributed kernel regression pipeline on Apache Spark (Java + MLlib), achieving a clinical-grade MARD of 8.86% for non-invasive glucose monitoring.
- Engineered a real-time IoT streaming system using AWS Greengrass and IoT Core, enabling ultra-low-latency ingestion (<200ms) for 2,000+ sensor readings per hour.
- Designed production-grade ETL and data-quality workflows with Spark SQL, ensuring 95%+ signal integrity across distributed edge devices.
- Optimized cloud-to-edge messaging protocols via MQTT, ensuring reliable data transmission and reducing packet loss by 15% in unstable network environments.
- Co-authored and published this research in Scientific Reports (Nature Portfolio), a Q1 journal, validating the system's clinical accuracy and architectural robustness.
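Kernel regression itself is compact enough to sketch: a Nadaraya-Watson estimator predicts each glucose value as a Gaussian-weighted average of training targets, and MARD scores the result. The values below are toy numbers for illustration, not PAS data, and the sketch is single-machine Python; the production pipeline ran distributed on Spark MLlib in Java.

```python
import math


def kernel_regress(x_train, y_train, x_query, bandwidth=1.0):
    """Nadaraya-Watson kernel regression with a Gaussian kernel: each
    prediction is a similarity-weighted average of the training targets."""
    preds = []
    for xq in x_query:
        w = [math.exp(-((xq - xi) ** 2) / (2 * bandwidth ** 2))
             for xi in x_train]
        total = sum(w)
        preds.append(sum(wi * yi for wi, yi in zip(w, y_train)) / total)
    return preds


def mard(reference, predicted):
    """Mean Absolute Relative Difference (%), the glucose-accuracy metric
    cited above; lower is better."""
    return 100 * sum(abs(p - r) / r
                     for r, p in zip(reference, predicted)) / len(reference)


# toy signal-feature / glucose (mg/dL) pairs, purely illustrative
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [90.0, 100.0, 110.0, 120.0, 130.0]
preds = kernel_regress(xs, ys, xs, bandwidth=0.5)
score = mard(ys, preds)
```

The bandwidth controls the bias-variance trade-off: a narrow kernel tracks local structure in noisy sensor signals, while a wide one smooths it away, so bandwidth selection is the main tuning knob for driving MARD down.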
Toolkit
Languages
AI/ML
LLMs & Agentic Systems
Backend & Orchestration
Cloud & IaC
Databases & MLOps
Experience
VelocitiPM LLC
Jun 2025 - Present
TimelyHero, Dimes Inc.
Aug 2024 - Jun 2025
JLiang Lab, Arizona State University
Sep 2023 - May 2024
Endimension Inc.
Apr 2024 - Aug 2024
SRM Advanced Electronics Laboratory
Dec 2021 - Jul 2023
Projects
Here you'll find 27 of my best works in machine learning, computer science, automation, and more.
2025 Q4
4 Projects
2025 Q3
4 Projects
2025 Q2
4 Projects
2025 Q1
3 Projects
2024 Q4
2 Projects
2024 Q3
2 Projects
2024 Q2
2 Projects
2024 Q1
1 Project
2023
2 Projects
2022-2021
2 Projects
Pre 2021
1 Project
Honors
Here are a few of my honors, awards, scholarships and certifications.
Scholarships and Fellowships
Herbold ASU Graduate Scholarship
Aug 2024 - Aug 2025
ASU Engineering Graduate Fellowship
Jul 2023 - Jul 2024
SRM Merit Scholarship
Jun 2019 - Jun 2023
Awards
Gold Medalist: Research Day
Apr 2023
Certifications
Here is a comprehensive list of all 18 of my professional certifications, showcasing my commitment to continuous learning and expertise in various technologies.
AI & Machine Learning Foundations
Industry Specializations & MLOps
Professional Development
Foundational Skills & Legacy
Publications
Here is a list of all my publications.
Values
These are the principles that guide my work and life. They're not just words on a page—they're the compass that helps me navigate complex challenges, build meaningful relationships, and create technology that truly serves people. I believe that when we align our actions with our values, we can make a genuine difference in the world around us.
Mastery
I raise the standard in everything I touch. I value depth, precision and the pursuit of world-class craft.
Relentless Growth
Every moment is data. I learn fast, adapt fast and constantly sharpen my mind, skills and character.
Resilience
Pressure clarifies. I stay steady, reset quickly and come back stronger every single time.
Impact
I care about meaningful progress. I direct effort where it moves systems, teams and outcomes forward.
Service
Strength increases when shared. I help, uplift and enable others to operate at their best.
Clarity & Wisdom
I think deeply, choose consciously and act with alignment. I make decisions anchored in truth, not noise.
Freedom
I design my life around growth, curiosity and joy. I choose direction intentionally, not reactively.
Discipline
Consistency builds power. I show up every day and move forward with deliberate action.
- by Himansh Mudigonda