Customer Churn Analysis & Prediction
The Challenge
Subscription businesses lose customers without warning, and by the time companies spot the pattern, thousands in revenue have already walked out the door. Acquiring a new customer costs 5-7x more than retaining an existing one, yet most companies only discover churn during quarterly reviews when it's too late to intervene.
What I Built
End-to-end machine learning pipeline predicting customer churn with 75% accuracy. Analyzed 7,000+ customer records to identify behavioral patterns signaling cancellation, then built an interactive Tableau dashboard providing real-time churn risk scoring for targeted retention efforts.
Business Impact:
• 75% prediction accuracy identifying at-risk customers 30 days in advance
• 10,000+ customers scored by churn probability for prioritized outreach
• Potential $125K+ annual revenue recovery by saving just 50 high-value customers
• Feature importance analysis revealing systemic product and process issues
Technical Approach
# Create tenure risk segments
df['tenure_group'] = pd.cut(df['tenure'],
bins=[0, 6, 12, 24, 999],
labels=['0-6mon', '6-12mon', '12-24mon', '24+mon'])
# Calculate average monthly spend
df['avg_monthly_spend'] = df['totalcharges'] / (df['tenure'] + 1)
# Price sensitivity indicator
df['price_sensitivity'] = df['monthlycharges'] / df['avg_monthly_spend']
Used Logistic Regression with an optimal balance of 75% accuracy and good interpretability.
Built two-page Tableau dashboard: (1) Churn Analysis showing segment patterns, (2) Predictive Analysis with feature importance, risk segmentation, revenue at risk, confusion matrix, and top 10 highest-risk customers ready for immediate outreach.
- Overview Machine learning pipeline achieving 75% accuracy on 7,000+ customer records. Identifies at-risk customers 30 days in advance with interactive dashboard showing churn probability scores, revenue at risk by segment, and prioritized intervention list.
- Technologies Used Python (Pandas, NumPy, Scikit-learn, Matplotlib, Seaborn), Machine Learning (Random Forest, XGBoost, Logistic Regression), Tableau (Interactive Dashboard), Jupyter Notebook, Statistical Analysis
- Skills Demonstrated End-to-end ML pipeline development, feature engineering, model selection and validation, classification modeling, feature importance interpretation, business insight generation, Tableau dashboard design, ROI analysis, stakeholder communication
- Business Impact Provides 30-day advance warning vs. reactive quarterly reviews, enables prioritized retention outreach to high-risk customers, potential $125K+ annual revenue recovery, identified systemic issues (fiber quality, payment friction, onboarding gaps) affecting broader customer base