
Customer Churn Analysis & Prediction

[Image: Customer Churn Analysis showing predictive model results, feature importance, and retention insights]

The Challenge

Subscription businesses often lose customers without warning, and by the time the pattern is visible, thousands of dollars in revenue have already walked out the door. Acquiring a new customer is commonly estimated to cost 5-7x more than retaining an existing one, yet many companies only discover churn during quarterly reviews, when it is too late to intervene.

What I Built

End-to-end machine learning pipeline predicting customer churn with 75% accuracy. Analyzed 7,000+ customer records to identify behavioral patterns signaling cancellation, then built an interactive Tableau dashboard providing real-time churn risk scoring for targeted retention efforts.

Business Impact:

• 75% prediction accuracy identifying at-risk customers 30 days in advance

• 10,000+ customers scored by churn probability for prioritized outreach

• Potential $125K+ annual revenue recovery by saving just 50 high-value customers

• Feature importance analysis revealing systemic product and process issues
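
The revenue figure above can be sanity-checked with simple arithmetic. The sketch below backs out the per-customer value implied by the two stated numbers ($125K and 50 customers). The function name and the derived monthly figure are illustrative and are not taken from the project itself.

```python
def implied_value_per_customer(total_recovery: float, customers_saved: int) -> float:
    """Annual revenue each saved customer must represent to reach the total."""
    if customers_saved <= 0:
        raise ValueError("customers_saved must be positive")
    return total_recovery / customers_saved


# Inputs come from the bullet above: $125K recovered by saving 50 customers.
annual_value = implied_value_per_customer(125_000, 50)
monthly_value = annual_value / 12

print(f"Implied annual value per customer: ${annual_value:,.0f}")
print(f"Implied monthly value per customer: ${monthly_value:,.2f}")
```

In other words, the estimate assumes each retained high-value customer is worth roughly $2,500 per year, or a little over $200 per month.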

Technical Approach

Feature Engineering Example:

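import pandas as pd

# Create tenure risk segments; include_lowest=True keeps tenure == 0 in the first bin
df['tenure_group'] = pd.cut(df['tenure'],
    bins=[0, 6, 12, 24, 999],
    labels=['0-6mon', '6-12mon', '12-24mon', '24+mon'],
    include_lowest=True)

# Calculate average monthly spend (tenure + 1 avoids division by zero for new customers)
df['avg_monthly_spend'] = df['totalcharges'] / (df['tenure'] + 1)

# Price sensitivity indicator: current monthly charge relative to average historical spend
# (a zero average is treated as missing so the ratio is never infinite)
df['price_sensitivity'] = df['monthlycharges'] / df['avg_monthly_spend'].replace(0, float('nan'))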
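
One detail worth checking in the tenure binning: pd.cut uses right-closed intervals by default, so a tenure of exactly 0 falls outside the first bin unless include_lowest=True is passed. The toy frame below (made-up values, not project data) is a minimal sketch that compares both behaviors.

```python
import pandas as pd

# Toy tenures chosen to sit on and around the bin edges (illustrative only).
toy = pd.DataFrame({"tenure": [0, 6, 7, 12, 24, 25, 72]})
bins = [0, 6, 12, 24, 999]
labels = ["0-6mon", "6-12mon", "12-24mon", "24+mon"]

# Default: intervals are (0, 6], (6, 12], ... so tenure == 0 is left unbinned (NaN).
toy["default"] = pd.cut(toy["tenure"], bins=bins, labels=labels)

# include_lowest=True closes the first interval on the left: [0, 6].
toy["with_lowest"] = pd.cut(toy["tenure"], bins=bins, labels=labels, include_lowest=True)

print(toy)
```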

Chose Logistic Regression as the final model: it reached 75% accuracy while keeping coefficients that are straightforward to interpret.
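
To illustrate why Logistic Regression is easy to interpret, here is a minimal sketch using scikit-learn. It runs on synthetic stand-in data, not the project's dataset. The scaling step, the 75/25 split, and the random seed are all assumptions made for the example, and the printed accuracy says nothing about the project's actual results.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (NOT the project's dataset): churn is made more
# likely for short tenure and high monthly charges.
rng = np.random.default_rng(42)
n = 1000
X = pd.DataFrame({
    "tenure": rng.integers(0, 72, n),
    "monthlycharges": rng.uniform(20, 120, n),
})
logit = -0.06 * X["tenure"] + 0.03 * X["monthlycharges"] - 0.5
y = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42)

# Scaling first puts the coefficients on a comparable footing.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

# Standardized coefficients: sign gives direction, magnitude gives relative weight.
coefs = pd.Series(model[-1].coef_[0], index=X.columns).sort_values()
print(coefs)
print(f"Held-out accuracy: {model.score(X_test, y_test):.2f}")
```

A negative coefficient (here, tenure) lowers predicted churn risk and a positive one raises it, which is the kind of direct reading that tree ensembles do not offer as readily.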

Built two-page Tableau dashboard: (1) Churn Analysis showing segment patterns, (2) Predictive Analysis with feature importance, risk segmentation, revenue at risk, confusion matrix, and top 10 highest-risk customers ready for immediate outreach.
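
The dashboard inputs described above (risk segments, revenue at risk, and a top-10 outreach list) could be derived from model scores roughly as follows. This is a sketch under stated assumptions: the customer rows are invented, the column names customer_id and churn_probability are hypothetical, and the 0.3 and 0.6 risk cut-offs are illustrative rather than the thresholds used in the project.

```python
import pandas as pd

# Hypothetical scored customers; in practice churn_probability would come
# from the model's predict_proba output.
scored = pd.DataFrame({
    "customer_id": [f"C{i:03d}" for i in range(1, 13)],
    "monthlycharges": [95.0, 20.5, 70.0, 105.3, 45.0, 89.9,
                       25.0, 60.0, 99.0, 30.0, 80.0, 55.5],
    "churn_probability": [0.91, 0.05, 0.62, 0.78, 0.33, 0.85,
                          0.12, 0.48, 0.71, 0.08, 0.66, 0.27],
})

# Risk tiers; the 0.3 and 0.6 cut-offs are illustrative assumptions.
scored["risk_segment"] = pd.cut(
    scored["churn_probability"],
    bins=[0.0, 0.3, 0.6, 1.0],
    labels=["Low", "Medium", "High"],
    include_lowest=True,
)

# Expected monthly revenue at risk = probability of churn x monthly charge.
scored["revenue_at_risk"] = scored["churn_probability"] * scored["monthlycharges"]

revenue_by_segment = scored.groupby("risk_segment", observed=True)["revenue_at_risk"].sum()
top_10 = scored.nlargest(10, "churn_probability")

print(revenue_by_segment.round(2))
print(top_10[["customer_id", "churn_probability", "revenue_at_risk"]])

# A flat file like this is one simple way to hand the scores to Tableau.
# scored.to_csv("churn_scores.csv", index=False)
```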

  • Overview: Machine learning pipeline achieving 75% accuracy on 7,000+ customer records. Identifies at-risk customers 30 days in advance with interactive dashboard showing churn probability scores, revenue at risk by segment, and prioritized intervention list.
  • Technologies Used: Python (Pandas, NumPy, Scikit-learn, Matplotlib, Seaborn), Machine Learning (Random Forest, XGBoost, Logistic Regression), Tableau (Interactive Dashboard), Jupyter Notebook, Statistical Analysis
  • Skills Demonstrated: End-to-end ML pipeline development, feature engineering, model selection and validation, classification modeling, feature importance interpretation, business insight generation, Tableau dashboard design, ROI analysis, stakeholder communication
  • Business Impact: Provides 30-day advance warning vs. reactive quarterly reviews, enables prioritized retention outreach to high-risk customers, potential $125K+ annual revenue recovery, identified systemic issues (fiber quality, payment friction, onboarding gaps) affecting broader customer base
View Full Project on GitHub | View Live Dashboard