Hi, I'm
Pradyumna
Senior Data Scientist
Impact-driven Data Scientist with 5 years of experience building production-grade AI/ML systems across Medical AI, NLP, Computer Vision, and Generative AI. Proven track record in architecting GraphRAG pipelines, fine-tuning large medical language models, developing voice agents for clinical applications, and leading cross-functional teams.
Skills & Technologies
A battle-tested toolkit spanning GenAI, Knowledge Graphs, Computer Vision, and production MLOps — built over 5+ years.
GenAI & LLMs
Knowledge Graphs & RAG
Computer Vision & NLP
MLOps & Infrastructure
Core Data Science
Work Experience
5 years building production AI/ML at scale across GenAI, Medical AI, and Data Science.
Senior Data Scientist
Mondee Pvt. Ltd.
Hyderabad, India
- Architected a medical-grade GraphRAG chatbot for clinical decision support by constructing structured knowledge graphs from medical textbooks, enabling traceable, hallucination-resistant drug query responses.
- Engineered a drug–drug interaction checker and dosage scheduler integrating real-time RxNorm and PubMed APIs for conflict alerts and patient-specific recommendations.
- Led end-to-end fine-tuning of Medgemma-27b-text-it for clinical NLP, coordinating data curation, training, and evaluation with teams from IIT Madras and IIT Hyderabad.
- Directed large-scale ASR data preparation for gemma-3n-e2b-it to build a medical voice agent capable of real-time clinical transcription and query resolution.
Data Scientist
ADP India Pvt. Ltd.
Hyderabad, India
- Designed a Knowledge-Graph RAG pipeline on AWS Neptune & Bedrock that reduced AI hallucinations in financial data; recognised as runner-up in the ADP Global Hackathon.
- Built a Process Mining solution analysing millions of client records to optimise payment workflows across 73 payroll cycles per client on average.
- Engineered an agentic assistant that drafts emails and schedules meetings in real time, saving 24 hours per user per month; now rolled out to all ADP employees.
Junior Data Scientist
Claim Genius Pvt. Ltd.
Remote, India
- Built a high-performance Instance Segmentation pipeline with Detectron2 + FastAPI, improving mAP by 12% and accelerating vehicle damage assessment by 26%.
- Enhanced model interpretability for regulatory compliance by integrating GradCAM++ visualisations.
- Trained GAN models for synthetic image generation to resolve class imbalance in insurance damage detection datasets.
Certifications
Agentic Knowledge Graph Construction
DeepLearning.AI × Neo4j
Aug 2025
Neo4j Fundamentals
Neo4j GraphAcademy
Jul 2025
Pretraining LLMs
DeepLearning.AI × Upstage
Feb 2025
TensorFlow Developer Certificate
Coursera
2023
Deep Learning Specialization
Coursera
2022
An Introduction To Practical Deep Learning
Intel - Coursera
2022
Technical Support Fundamentals
Google - Coursera
2021
Education
M.Sc. Computer Science (Big Data Analytics)
Central University of Rajasthan
Kishangarh, India
Integrated B.Sc. B.Ed. (Physical Sciences and Education)
Regional Institute of Education (NCERT), Bhubaneswar
Bhubaneswar, India
Featured Projects
Research and personal projects spanning multi-label learning, medical AI, quantum computing, and recommender systems.
Algorithm Development: Multi-label Classification Enhancement
Improved multi-label classification by generating synthetic data for rare labels with the MLSMOTE technique at TCS Big Data Lab, Rajasthan. Addressed the tail-label problem, where classifiers struggle with underrepresented labels.
LLSF_DL-MLSMOTE-Hybrid
Hybrid deep learning approach combining Label-Specific Feature learning with MLSMOTE for multi-label classification. Implements the LLSF-DL algorithm for improved classification performance on imbalanced datasets.
LLSF-Learning-Label-Specific-Features
Implementation of the Learning Label-Specific Features (LLSF) algorithm for multi-label classification. Enables feature selection by ranking features according to their relevance to each label.
Session-based Recommendation with Graph Neural Networks
Graph Neural Network-based recommendation system for session-based learning. Captures essential features from graph structures to recommend items during ongoing sessions.
Electricity Price Prediction using ELM-PSO-ARIMA
Hybrid model combining Extreme Learning Machine, Particle Swarm Optimization, and ARIMA to capture frequent changes in electricity prices with improved accuracy.
SVM-kNN-PSO Ensemble for Intrusion Detection
Novel ensemble method combining Support Vector Machines, k-Nearest Neighbors, and Particle Swarm Optimization for robust intrusion detection.
Brain-Tumor-Segmentation
Deep learning-based brain tumor segmentation from MRI scans, using a U-Net architecture with attention mechanisms for improved medical imaging analysis.
MLSMOTE
Multi-Label Synthetic Minority Over-sampling Technique for handling class imbalance in multi-label datasets. Generates synthetic samples for minority labels.
Rule-based-Recommender-system
Rule-based recommendation engine using association rules and collaborative filtering techniques with NLP for personalized recommendations.
My-first-quantum-code
Quantum computing experiments using Qiskit — exploring quantum circuit simulations and entanglement phenomena.
Latest Articles
Deep-dives into ML research, audio source separation, and multilingual NLP.

Salesforce Uses AWS Textract For Intelligent Document Automation
The healthcare domain has received all-time higher attention because of the current pandemic... Read the full article →

Extracting Vocals And Instrumentals From Music The Deep Learning Way
Whenever people get exposed to good music, the tune gets stuck in their heads for hours. And at some point, they google up the lyrics, vocals, and instrumental... Read the full article →

Microsoft Speller100: A Spell-Checker For Over 100 Languages
People do not care enough to use their queries’ correct spelling while searching for anything online... Read the full article →

A Deep Dive Into IBM Quantum Roadmap
“It took us 60 years from the first logic gates to modern cloud services. But IBM has set itself on a mission to fast forward the same journey for Quantum Computation (QC) to 3 years,” Jay G.. Read the full article →

IIT Kanpur Offers Free 8-Weeks Computational Science Course, Enrollments Ends 15th Feb
IIT Kanpur has opened up the enrollment for an eight-week online course on computational science on the SWAYAM platform. An AvHumboldt Fellow with over 50 publications in his name, Dr... Read the full article →

Dealing With Racially-Biased Hate-Speech Detection Models
Hate-speech detection models are the most glaring example of biased models, as shown by researchers from Allen Institute for Artificial Intelligence in their linguistic study... Read the full article →
Get In Touch
Open to senior AI/ML roles, GenAI research collaborations, and consulting opportunities.
Pradyumna Kumar Sahoo
Senior Data Scientist
📍 Hyderabad, India
pradyumna.sahoo@outlook.in
✉️ Send Me an Email
Built with Next.js · Tailwind CSS · Framer Motion · © 2026 Pradyumna Kumar Sahoo