Predicting the Future of Medicine Supply: ML-Driven Auto-Replenishment at Scale 🏥📈
Managing inventory for India’s largest medicine warehouse is like predicting the unpredictable. With over 100,000 orders processed daily and tens of thousands of SKUs ranging from common painkillers to life-saving prescription drugs, traditional inventory management simply doesn’t cut it. At Tata 1mg, I had the opportunity to build an ML-driven auto-replenishment module for Odin - our proprietary Warehouse Management System (WMS) - using time series forecasting with ARIMA and LSTM models. This system revolutionized how we manage medicine supply chains at scale.
The Challenge: Medicine Inventory at Scale 💊
The pharmaceutical supply chain presents unique challenges that make inventory management incredibly complex:
1. Seasonal Patterns & Health Trends
- Cold medicines spike during monsoon season
- Allergy medications surge during specific months
- Sudden demand for masks and sanitizers (COVID-19 taught us this!)
2. Regulatory Compliance
- Expiry date management for medications
- Temperature-controlled storage requirements
- Batch tracking and recall procedures
3. Economic Impact
- Stockouts mean patients can’t get life-saving medications
- Overstock leads to expired medicines and financial losses
- Each incorrect prediction affects thousands of patients
4. Scale Complexity
ODIN_SCALE_METRICS = {
    "daily_orders": 100_000,
    "unique_skus": 50_000,
    "warehouses": 12,
    "supplier_partners": 500,
    "cities_served": 1000,
    "avg_shelf_life_days": 365,
    "temperature_zones": 4  # Room temp, cool, cold, frozen
}
When I joined the Odin team, the existing system relied on simple threshold-based reordering - essentially “order more when stock hits X units.” This approach led to:
- 23% stockout rate during peak seasons
- ₹2.3 crore worth of expired inventory annually
- Manual intervention required for 40% of replenishment decisions
We needed something smarter.
The Solution: Time Series Forecasting with ML 🧠
I designed a hybrid forecasting system that combines the statistical rigor of ARIMA with the pattern recognition power of LSTM networks. The system continuously learns from historical data, seasonal patterns, and external factors to predict future demand with remarkable accuracy.
Architecture Overview
from typing import Dict, List, Tuple, Optional
import logging

import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
from statsmodels.tsa.arima.model import ARIMA
from tensorflow.keras.layers import LSTM, Dense, Dropout
from tensorflow.keras.models import Sequential


class HybridDemandForecaster:
    """
    Hybrid forecasting system combining ARIMA and LSTM
    for pharmaceutical demand prediction.
    """

    def __init__(self, sku_id: str, warehouse_id: str):
        self.sku_id = sku_id
        self.warehouse_id = warehouse_id
        self.arima_model = None
        self.lstm_model = None
        self.scaler = MinMaxScaler()
        self.historical_data = None
        self.external_factors = ExternalFactorsProcessor()

    def prepare_data(self, historical_sales: pd.DataFrame) -> Tuple[np.ndarray, np.ndarray]:
        """Prepare time series data for training."""
        # Add external factors (weather, disease outbreaks, festivals)
        enhanced_data = self.external_factors.enrich_data(
            historical_sales, self.sku_id
        )
        # Handle missing values and outliers
        cleaned_data = self._clean_time_series(enhanced_data)
        # Create features for LSTM
        lstm_features, lstm_targets = self._create_lstm_sequences(
            cleaned_data, lookback_days=30
        )
        return lstm_features, lstm_targets

    def train_arima_model(self, data: pd.Series) -> None:
        """Train ARIMA model for trend and seasonality."""
        # Automatic order selection using AIC
        best_aic = float('inf')
        best_order = None
        for p in range(0, 4):
            for d in range(0, 2):
                for q in range(0, 4):
                    try:
                        model = ARIMA(data, order=(p, d, q))
                        fitted_model = model.fit()
                        if fitted_model.aic < best_aic:
                            best_aic = fitted_model.aic
                            best_order = (p, d, q)
                            self.arima_model = fitted_model
                    except Exception:
                        # Some (p, d, q) combinations fail to converge
                        continue
        logging.info(f"Best ARIMA order for {self.sku_id}: {best_order}")
Deep Dive: ARIMA for Statistical Forecasting 📊
ARIMA (AutoRegressive Integrated Moving Average) excels at capturing trend and seasonality patterns in pharmaceutical demand:
class ARIMAForecaster:
    def __init__(self):
        self.arima_model = None  # set once a model has been fitted
        self.seasonal_periods = {
            'cold_medicines': 365,     # Yearly seasonality
            'allergy_medicines': 90,   # Quarterly patterns
            'prescription_drugs': 30,  # Monthly refill cycles
            'wellness_products': 7     # Weekly patterns
        }

    def detect_seasonality(self, data: pd.Series, sku_category: str) -> int:
        """Detect seasonal patterns specific to medicine categories."""
        from statsmodels.tsa.seasonal import seasonal_decompose

        # Use domain knowledge for the initial guess
        expected_period = self.seasonal_periods.get(sku_category, 30)
        # Verify with statistical decomposition
        try:
            decomposition = seasonal_decompose(
                data,
                model='additive',
                period=expected_period
            )
            # Measure the strength of seasonality
            seasonal_strength = np.var(decomposition.seasonal) / np.var(data)
            if seasonal_strength > 0.1:  # Significant seasonality
                return expected_period
            return 1  # No meaningful seasonality
        except Exception as e:
            logging.warning(f"Seasonality detection failed: {e}")
            return 1

    def forecast_with_confidence(self, steps: int) -> Dict:
        """Generate forecasts with confidence intervals."""
        forecast_obj = self.arima_model.get_forecast(steps=steps)
        confidence_intervals = forecast_obj.conf_int()
        return {
            'forecast': forecast_obj.predicted_mean,
            'lower_ci': confidence_intervals.iloc[:, 0].values,
            'upper_ci': confidence_intervals.iloc[:, 1].values,
            'model_aic': self.arima_model.aic
        }
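The variance-ratio check above can be reproduced without statsmodels. Here is a minimal, numpy-only sketch - the folding approach and the function name are my own illustrative choices, not the production code:

```python
import numpy as np

def seasonal_strength(series: np.ndarray, period: int) -> float:
    """Rough proxy for seasonal strength: variance of the average
    within-period pattern, relative to total variance."""
    n = len(series) // period * period           # trim to whole periods
    folded = series[:n].reshape(-1, period)      # one row per period
    seasonal = folded.mean(axis=0) - folded.mean()
    return float(np.var(np.tile(seasonal, n // period)) / np.var(series[:n]))

t = np.arange(70, dtype=float)
pure_cycle = np.sin(2 * np.pi * t / 7)  # clean weekly pattern, no noise
print(seasonal_strength(pure_cycle, period=7))  # ~1.0, well above the 0.1 cutoff
```

A pure periodic signal scores close to 1.0; white noise scores near 0, which is why a threshold like 0.1 separates the two regimes.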
Deep Dive: LSTM for Pattern Recognition 🔮
While ARIMA captures statistical patterns, LSTM networks excel at learning complex, non-linear relationships in demand data:
class LSTMDemandForecaster:
    def __init__(self, lookback_days: int = 30):
        self.lookback_days = lookback_days
        self.model = None
        self.scaler = MinMaxScaler()

    def build_model(self, input_shape: Tuple[int, int]) -> Sequential:
        """Build LSTM architecture optimized for demand forecasting."""
        model = Sequential([
            LSTM(64, return_sequences=True, input_shape=input_shape),
            Dropout(0.2),
            LSTM(32, return_sequences=True),
            Dropout(0.2),
            LSTM(16, return_sequences=False),
            Dropout(0.2),
            Dense(8, activation='relu'),
            Dense(1, activation='relu')  # ReLU keeps predicted demand non-negative
        ])
        model.compile(
            optimizer='adam',
            loss='huber',  # Robust to outliers
            metrics=['mae', 'mape']
        )
        return model

    def create_sequences(self, data: np.ndarray) -> Tuple[np.ndarray, np.ndarray]:
        """Create sequences for LSTM training."""
        X, y = [], []
        for i in range(self.lookback_days, len(data)):
            # Features: past demand, day of week, month, external factors
            sequence = data[i - self.lookback_days:i]
            target = data[i, 0]  # Next day's demand
            X.append(sequence)
            y.append(target)
        return np.array(X), np.array(y)

    def train_with_early_stopping(self, X_train: np.ndarray, y_train: np.ndarray,
                                  X_val: np.ndarray, y_val: np.ndarray) -> None:
        """Train LSTM with early stopping and learning rate scheduling."""
        from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau

        callbacks = [
            EarlyStopping(
                monitor='val_loss',
                patience=15,
                restore_best_weights=True
            ),
            ReduceLROnPlateau(
                monitor='val_loss',
                factor=0.5,
                patience=8,
                min_lr=0.0001
            )
        ]
        self.model.fit(
            X_train, y_train,
            validation_data=(X_val, y_val),
            epochs=100,
            batch_size=32,
            callbacks=callbacks,
            verbose=1
        )
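The sliding-window logic in create_sequences is easy to sanity-check in isolation. A standalone numpy sketch (the toy data and function name are illustrative):

```python
import numpy as np

def make_sequences(data: np.ndarray, lookback: int):
    """Turn a (days, features) array into (samples, lookback, features)
    windows, each paired with the next day's demand (column 0)."""
    X = np.array([data[i - lookback:i] for i in range(lookback, len(data))])
    y = data[lookback:, 0]
    return X, y

demand = np.arange(10, dtype=float).reshape(-1, 1)  # 10 days, 1 feature
X, y = make_sequences(demand, lookback=3)
print(X.shape)  # (7, 3, 1): 7 training windows of 3 days each
print(y)        # [3. 4. 5. 6. 7. 8. 9.]
```

Note the shape: Keras LSTMs expect exactly this (samples, timesteps, features) layout.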
External Factors Integration 🌍
Pharmaceutical demand is heavily influenced by external factors. Our system integrates multiple data sources:
class ExternalFactorsProcessor:
    def __init__(self):
        self.weather_api = WeatherDataAPI()
        self.disease_tracker = DiseaseOutbreakTracker()
        self.festival_calendar = IndianFestivalCalendar()
        self.economic_indicators = EconomicDataProvider()

    def enrich_data(self, sales_data: pd.DataFrame, sku_id: str) -> pd.DataFrame:
        """Add external factors relevant to pharmaceutical demand."""
        enriched_data = sales_data.copy()

        # Weather factors (crucial for seasonal medicines)
        weather_data = self.weather_api.get_historical_data(
            sales_data.index.min(), sales_data.index.max()
        )
        enriched_data['temperature'] = weather_data['temperature']
        enriched_data['humidity'] = weather_data['humidity']
        enriched_data['rainfall'] = weather_data['rainfall']
        enriched_data['aqi'] = weather_data['air_quality_index']

        # Disease outbreak indicators
        disease_alerts = self.disease_tracker.get_alerts(
            sales_data.index.min(), sales_data.index.max()
        )
        enriched_data['flu_alert_level'] = disease_alerts['flu']
        enriched_data['dengue_alert_level'] = disease_alerts['dengue']
        enriched_data['covid_cases'] = disease_alerts['covid']

        # Festival and holiday effects
        festivals = self.festival_calendar.get_festivals(
            sales_data.index.min(), sales_data.index.max()
        )
        enriched_data['is_festival'] = festivals['is_festival']
        enriched_data['days_to_festival'] = festivals['days_to_next_festival']

        # Economic factors (affect purchasing power)
        economic_data = self.economic_indicators.get_data(
            sales_data.index.min(), sales_data.index.max()
        )
        enriched_data['inflation_rate'] = economic_data['inflation']
        enriched_data['unemployment_rate'] = economic_data['unemployment']

        return enriched_data

    def calculate_feature_importance(self, sku_category: str) -> Dict[str, float]:
        """Calculate which external factors matter most for each medicine category."""
        feature_weights = {
            'cold_medicines': {
                'temperature': 0.4,
                'humidity': 0.3,
                'flu_alert_level': 0.2,
                'is_festival': 0.1
            },
            'allergy_medicines': {
                'aqi': 0.5,
                'pollen_count': 0.3,
                'humidity': 0.2
            },
            'digestive_health': {
                'is_festival': 0.4,  # Food habits change during festivals
                'temperature': 0.3,
                'economic_factors': 0.3
            }
        }
        return feature_weights.get(sku_category, {})
Ensemble Forecasting: Best of Both Worlds ⚖️
The magic happens when we combine ARIMA and LSTM predictions using an intelligent ensemble approach:
class EnsembleForecaster:
    def __init__(self):
        self.arima_forecaster = ARIMAForecaster()
        self.lstm_forecaster = LSTMDemandForecaster()
        self.ensemble_weights = {}

    def calculate_dynamic_weights(self, sku_id: str, historical_accuracy: Dict) -> Dict[str, float]:
        """Calculate weights based on historical model performance."""
        arima_mape = historical_accuracy.get('arima_mape', 0.3)
        lstm_mape = historical_accuracy.get('lstm_mape', 0.3)

        # Weight inversely proportional to error
        arima_weight = 1 / (1 + arima_mape)
        lstm_weight = 1 / (1 + lstm_mape)

        # Normalize weights to sum to 1
        total_weight = arima_weight + lstm_weight
        return {
            'arima': arima_weight / total_weight,
            'lstm': lstm_weight / total_weight
        }

    def generate_ensemble_forecast(self, sku_id: str, forecast_horizon: int) -> Dict:
        """Generate ensemble forecast combining ARIMA and LSTM."""
        # Get individual forecasts
        arima_forecast = self.arima_forecaster.forecast_with_confidence(forecast_horizon)
        lstm_forecast = self.lstm_forecaster.predict(forecast_horizon)

        # Calculate dynamic weights
        weights = self.calculate_dynamic_weights(sku_id, self.get_historical_accuracy(sku_id))

        # Ensemble prediction
        ensemble_forecast = (
            weights['arima'] * arima_forecast['forecast'] +
            weights['lstm'] * lstm_forecast
        )

        # Ensemble confidence intervals (conservative approach)
        ensemble_lower = np.minimum(
            arima_forecast['lower_ci'],
            lstm_forecast * 0.8  # Assume 20% uncertainty for the LSTM
        )
        ensemble_upper = np.maximum(
            arima_forecast['upper_ci'],
            lstm_forecast * 1.2
        )

        return {
            'forecast': ensemble_forecast,
            'confidence_lower': ensemble_lower,
            'confidence_upper': ensemble_upper,
            'arima_weight': weights['arima'],
            'lstm_weight': weights['lstm'],
            'individual_forecasts': {
                'arima': arima_forecast['forecast'],
                'lstm': lstm_forecast
            }
        }
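To make the inverse-error weighting concrete, here is the same arithmetic run on hypothetical MAPE values (the numbers are made up for illustration):

```python
def inverse_error_weights(arima_mape: float, lstm_mape: float) -> dict:
    """Lower error -> larger weight; weights normalised to sum to 1."""
    wa, wl = 1 / (1 + arima_mape), 1 / (1 + lstm_mape)
    total = wa + wl
    return {'arima': wa / total, 'lstm': wl / total}

# ARIMA at 10% MAPE vs LSTM at 30% MAPE: ARIMA gets ~54% of the vote
weights = inverse_error_weights(0.10, 0.30)
print(round(weights['arima'], 3))  # 0.542
```

The 1/(1+MAPE) form keeps weights bounded even when one model's error approaches zero, so neither model can completely dominate the ensemble.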
Auto-Replenishment Decision Engine 🤖
The forecasting models feed into an intelligent decision engine that determines optimal reorder points and quantities:
class AutoReplenishmentEngine:
    def __init__(self):
        self.safety_stock_calculator = SafetyStockCalculator()
        self.supplier_lead_times = SupplierDatabase()
        self.shelf_life_tracker = ShelfLifeManager()

    def calculate_reorder_point(self, sku_id: str, forecast_data: Dict) -> Dict:
        """Calculate when and how much to reorder."""
        # Get SKU-specific parameters
        sku_info = self.get_sku_info(sku_id)
        lead_time = self.supplier_lead_times.get_lead_time(sku_id)

        # Expected demand during the lead time
        lead_time_demand = forecast_data['forecast'][:lead_time].sum()

        # Safety stock based on forecast uncertainty
        safety_stock = self.safety_stock_calculator.calculate(
            forecast_mean=forecast_data['forecast'].mean(),
            forecast_std=self._calculate_forecast_std(forecast_data),
            lead_time=lead_time,
            service_level=sku_info['target_service_level']
        )

        # Reorder point
        reorder_point = lead_time_demand + safety_stock

        # Economic Order Quantity (EOQ) with shelf life constraints
        eoq = self._calculate_eoq_with_expiry(
            annual_demand=forecast_data['forecast'].sum() * (365 / 30),  # Annualize the 30-day forecast
            holding_cost=sku_info['holding_cost'],
            ordering_cost=sku_info['ordering_cost'],
            shelf_life_days=sku_info['shelf_life_days']
        )

        return {
            'reorder_point': reorder_point,
            'order_quantity': eoq,
            'safety_stock': safety_stock,
            'lead_time_demand': lead_time_demand,
            'confidence_level': self._calculate_decision_confidence(forecast_data)
        }

    def _calculate_eoq_with_expiry(self, annual_demand: float, holding_cost: float,
                                   ordering_cost: float, shelf_life_days: int) -> float:
        """EOQ formula adjusted for perishable goods."""
        import math

        # Standard EOQ
        eoq_standard = math.sqrt((2 * annual_demand * ordering_cost) / holding_cost)
        # Shelf life constraint: never order more than we can sell
        # within 80% of the product's shelf life
        max_order_size = (annual_demand / 365) * shelf_life_days * 0.8
        # Return the minimum of EOQ and the shelf-life constraint
        return min(eoq_standard, max_order_size)
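The SafetyStockCalculator itself isn't shown above; the textbook formula it would implement is SS = z * sigma_daily * sqrt(lead_time). A minimal sketch - the z-score default of 1.65 (roughly a 95% service level) and the independence assumption are my own, not necessarily what the production system uses:

```python
import math

def safety_stock(daily_demand_std: float, lead_time_days: int,
                 z_score: float = 1.65) -> float:
    """Classic safety-stock formula: z * sigma * sqrt(L).
    Assumes daily demand is independent day to day, so variance
    scales linearly with lead time; z = 1.65 targets ~95% service."""
    return z_score * daily_demand_std * math.sqrt(lead_time_days)

# Demand std of 10 units/day with a 4-day supplier lead time
print(safety_stock(10.0, 4))  # 33.0
```

Raising the service level (say z = 2.33 for ~99%) buys fewer stockouts at the cost of more capital tied up in inventory - exactly the trade-off the decision engine tunes per SKU.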
Results: Transforming Medicine Supply Chain 📈
The impact of our ML-driven auto-replenishment system was remarkable:
Performance Metrics:
SYSTEM_PERFORMANCE = {
    "forecast_accuracy_improvement": {  # accuracy = 100% - MAPE
        "arima_alone": "78% accuracy",
        "lstm_alone": "82% accuracy",
        "ensemble_model": "89% accuracy",  # +11 pts over ARIMA alone
        "previous_system": "62% accuracy"
    },
    "business_impact": {
        "stockout_reduction": "23% to 6%",  # 74% relative improvement
        "expired_inventory_reduction": "₹2.3Cr to ₹0.8Cr",  # 65% reduction
        "manual_interventions": "40% to 12%",  # 70% reduction
        "customer_satisfaction": "4.1 to 4.6 rating"
    },
    "operational_efficiency": {
        "inventory_turnover": "8.2x to 12.1x",  # 47% improvement
        "working_capital_reduction": "₹15Cr",
        "procurement_automation": "85%",
        "forecast_generation_time": "2 hours to 15 minutes"
    }
}
Real-World Impact Stories:
COVID-19 Response: When the pandemic hit, our system detected the surge in demand for masks, sanitizers, and immunity boosters 2 weeks before competitors, allowing us to stock up and serve customers when they needed us most.
Monsoon Preparedness: The system accurately predicted a 340% spike in cold and cough medicines during the 2023 monsoon season, preventing stockouts that would have affected thousands of patients.
Festival Planning: During Diwali 2023, the system forecasted increased demand for digestive medicines and antacids, leading to a 95% service level during the peak period.
Technical Challenges and Solutions 🔧
Challenge 1: Cold Start Problem
Problem: New SKUs have no historical data for forecasting.
Solution: Transfer learning approach using similar SKU patterns:
class ColdStartHandler:
    def __init__(self):
        self.sku_similarity_matcher = SKUSimilarityMatcher()

    def generate_initial_forecast(self, new_sku_id: str) -> np.ndarray:
        """Generate a forecast for a new SKU using similar SKUs."""
        # Find similar SKUs based on category, price, and therapeutic class
        similar_skus = self.sku_similarity_matcher.find_similar(new_sku_id, top_k=5)

        # Weighted average of similar SKU patterns
        forecasts = []
        weights = []
        for similar_sku, similarity_score in similar_skus:
            forecast = self.get_sku_forecast(similar_sku)
            forecasts.append(forecast)
            weights.append(similarity_score)

        # Weighted ensemble
        weights = np.array(weights) / sum(weights)
        cold_start_forecast = np.average(forecasts, axis=0, weights=weights)
        return cold_start_forecast
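The weighted ensemble at the end boils down to np.average with similarity scores as weights. A toy example with made-up forecasts and scores shows the mechanics:

```python
import numpy as np

# Hypothetical 5-day forecasts from three similar SKUs,
# weighted by their similarity scores to the new SKU.
forecasts = np.array([[10, 12, 11, 13, 12],
                      [ 8,  9,  9, 10,  9],
                      [20, 22, 21, 23, 22]], dtype=float)
similarity = np.array([0.9, 0.8, 0.3])

cold_start = np.average(forecasts, axis=0, weights=similarity)
print(round(cold_start[0], 1))  # 10.7: the dissimilar high-volume SKU barely moves it
```

Because the least similar SKU gets the smallest weight, one outlier neighbour can't drag the cold-start forecast far from the pattern of its closest matches.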
Challenge 2: Concept Drift
Problem: Demand patterns change over time (new diseases, changing lifestyles).
Solution: Adaptive model retraining:
class ConceptDriftDetector:
    def __init__(self):
        self.drift_threshold = 0.15   # flag a 15-percentage-point rise in MAPE
        self.monitoring_window = 30   # days

    def detect_drift(self, sku_id: str, recent_errors: List[float]) -> bool:
        """Detect whether model performance is degrading."""
        if len(recent_errors) < self.monitoring_window:
            return False

        recent_mape = np.mean(recent_errors[-self.monitoring_window:])
        historical_mape = self.get_historical_mape(sku_id)

        drift_detected = (recent_mape - historical_mape) > self.drift_threshold
        if drift_detected:
            logging.warning(f"Concept drift detected for SKU {sku_id}")
            self.trigger_model_retrain(sku_id)
        return drift_detected
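The drift rule compares the trailing 30-day MAPE against the long-run baseline; a quick numeric check of the threshold logic with invented error values:

```python
# 30 good days (12% MAPE) followed by 30 bad days (35% MAPE)
recent_errors = [0.12] * 30 + [0.35] * 30

recent_mape = sum(recent_errors[-30:]) / 30      # ~0.35
historical_mape = 0.12                           # long-run baseline
drift = (recent_mape - historical_mape) > 0.15   # 0.23 > 0.15
print(drift)  # True -> retraining would be triggered
```

Windowing on the last 30 days means a single bad day can't trigger a retrain, but a sustained regime change (a new outbreak, a lifestyle shift) will.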
Lessons Learned: Building ML at Scale 📚
Domain Knowledge Trumps Complex Models: Understanding pharmaceutical supply chains was more valuable than the fanciest algorithms.
Hybrid Approaches Work: Combining statistical models (ARIMA) with deep learning (LSTM) provided the best results.
External Data is Gold: Weather, disease outbreaks, and festivals significantly improved forecast accuracy.
Start Simple, Iterate Fast: We began with basic ARIMA, then added LSTM, then ensemble methods.
Monitor, Monitor, Monitor: Concept drift is real - models need continuous monitoring and retraining.
Future Enhancements 🚀
We’re continuously improving the system:
- Real-time Learning: Incorporating streaming data for instant model updates
- Multi-warehouse Optimization: Global inventory optimization across all Odin warehouses
- Supplier Integration: Direct API integration with pharmaceutical manufacturers
- Explainable AI: Making forecast decisions interpretable for business users
Conclusion: ML-Powered Healthcare Supply Chain 🎯
Building the ML-driven auto-replenishment module for Odin was one of the most impactful projects of my career. The system doesn’t just predict numbers - it ensures that patients across India have access to the medicines they need, when they need them.
The combination of ARIMA’s statistical rigor and LSTM’s pattern recognition, enhanced with rich external data, created a forecasting system that significantly outperformed traditional approaches. More importantly, it automated 85% of procurement decisions, reduced expired inventory by 65%, and improved customer satisfaction from 4.1 to 4.6.
As machine learning continues to transform supply chains across industries, the principles we established - hybrid modeling, external data integration, and continuous monitoring - remain as relevant as ever. The future of healthcare supply chain management is intelligent, automated, and patient-centric.
Interested in discussing ML applications in healthcare or supply chain optimization? I’d love to connect! Reach out at yashpathania704@gmail.com or find me on LinkedIn.
Coming up next: I’ll be sharing how we built “Free at UCD” - a crowd-sourced web app that serves 400-500 daily sessions, helping UCD students find free food locations across Dublin!