Breakdown of DeepSeek R1 Pipeline
Dissecting the Reinforcement Learning, Cold-Start, and Distillation Strategies Behind a Revolutionary Open-Source Reasoning Model
Read on Medium
Lead - AI Systems with 8+ years of experience building scalable AI solutions. I specialize in GenAI, LLMs, MLOps, and distributed systems, and I share my knowledge through in-depth AI/ML tutorials on YouTube.
Join me on my YouTube channel AI & ML with Sanjay Chouhan for engaging tutorials and in-depth walkthroughs. Explore everything from cutting-edge NLP and LLMs to foundational machine learning principles.
I'm a Lead - AI Systems with 8+ years of experience designing and shipping production ML solutions. My expertise spans GenAI, LLMs, NLP, distributed training, and automated MLOps. As a published NLP researcher at ICPR with a Master's from IIIT Guwahati, I transform complex AI challenges into scalable, production-ready systems. Passionate about knowledge sharing, I also create in-depth AI & ML tutorials on YouTube.
8+ years shipping scalable AI solutions to production
NLP research published at ICPR conference
Expert in generative AI, agentic systems, and automation
LLM Fine-tuning, Pretraining, RAG Systems, Prompt Engineering, Model Optimization, Agentic Systems, AI Agents
BERT, Transformers, Text Classification, Sentiment Analysis, Language Understanding
Neural Networks, Distributed Training, Model Development, Feature Engineering
CI/CD Pipelines, Model Monitoring, Automated Retraining, Workflow Orchestration
Distributed Systems, Container Orchestration, Scalable Deployments, Cloud Services
Big Data Processing, Data Pipelines, Analytics, Database Management
Leading development of LLM/NLP solutions and distributed ML architectures. Designed scalable AI systems leveraging Jenkins, Ray, Spark, AWS EKS, and Flyte.
Created ML-powered e-commerce solutions and led development team. Built personalized recommendation systems and NLP-based product features.
Specialized in machine learning and artificial intelligence. Published research on HindiLLM at the ICPR conference.
Advanced diploma focusing on AI and ML fundamentals, deep learning, and practical applications.
Foundation in computer science, algorithms, data structures, and software engineering principles.
Developed a custom neural network architecture achieving 95% accuracy on complex image classification tasks. Implemented novel attention mechanisms for improved performance.
Built a state-of-the-art NLP system for multi-language sentiment analysis and entity extraction, processing millions of documents with high accuracy.
Designed and deployed a comprehensive analytics platform using advanced ML algorithms to forecast trends and provide actionable business insights.
Dissecting the Reinforcement Learning, Cold-Start, and Distillation Strategies Behind a Revolutionary Open-Source Reasoning Model
Read on Medium
Trying out well-known CNN models with TensorFlow 2 to help Jian Yang identify hotdogs.
Read on Medium
A beginner's journey through the various life cycles of a problem on Kaggle.
Read on Medium
I'm always interested in hearing about new opportunities, collaborations, or just chatting about AI and technology.