Churn analysis quantifies how many customers stop using a product or service in a given period. This guide covers calculating churn rate and gross revenue churn in SQL, identifying leading indicators of churn using cohort analysis, building a basic predictive churn model with logistic regression, and designing intervention programmes using churn propensity scores.
Why Churn Analysis Matters
Churn is the rate at which customers stop using a product or service. For subscription businesses, churn directly determines whether growth compounds or erodes: a 5% monthly churn rate means replacing 46% of the customer base every year just to stay flat.
Churn analysis answers three distinct questions. Measurement: what is the churn rate, when does churn happen, and which customer segments churn most? Prediction: which active customers are at high risk of churning in the next 30-90 days? Intervention: which actions reduce churn probability, and what is the ROI of retention programmes?
Calculating Churn Rate in SQL
Churn rate definition varies by business model. Define precisely before building the query.
**Monthly user churn** (for subscription or active-user products):
WITH monthly_actives AS (
SELECT
DATE_TRUNC('month', activity_date) AS month,
user_id
FROM user_activity
GROUP BY 1, 2
),
churn_calc AS (
SELECT
m1.month,
COUNT(DISTINCT m1.user_id) AS active_users,
COUNT(DISTINCT m2.user_id) AS retained_users,
COUNT(DISTINCT m1.user_id) - COUNT(DISTINCT m2.user_id) AS churned_users
FROM monthly_actives m1
LEFT JOIN monthly_actives m2
ON m1.user_id = m2.user_id
AND m2.month = m1.month + INTERVAL '1 month'
GROUP BY m1.month
)
SELECT
month,
active_users,
churned_users,
ROUND(100.0 * churned_users / active_users, 2) AS churn_rate_pct
FROM churn_calc
ORDER BY month
**Gross Revenue Churn (GRC)**: Revenue lost from churned and downgraded customers as a percentage of prior-period revenue. Critical for subscription businesses — even if user count stays flat, revenue churn can be high if churning customers are high-value.
SELECT
month,
prior_mrr,
lost_mrr_from_churn,
lost_mrr_from_downgrade,
ROUND(100.0 * (lost_mrr_from_churn + lost_mrr_from_downgrade) / prior_mrr, 2) AS gross_revenue_churn_pct
FROM mrr_movements
**Net Revenue Retention (NRR)**: The amount retained plus expansion revenue, as a percentage of prior-period revenue. NRR above 100% means existing customers are growing faster than they churn — a key SaaS health metric.
Identifying When Churn Happens: Survival Analysis
Survival analysis tracks what percentage of a cohort survives (remains active) at each period after acquisition. It reveals whether churn is front-loaded (users churn immediately), distributed, or back-loaded (users stay for a while then leave).
WITH cohort_periods AS (
SELECT
u.user_id,
DATE_TRUNC('month', u.created_at) AS cohort_month,
generate_series(0, 24) AS period
FROM users u
),
SELECT
cohort_month,
period,
COUNT(DISTINCT cp.user_id) AS cohort_size,
COUNT(DISTINCT CASE WHEN a.user_id IS NOT NULL THEN cp.user_id END) AS surviving_users,
ROUND(100.0 * COUNT(DISTINCT CASE WHEN a.user_id IS NOT NULL THEN cp.user_id END) / MAX(cohort_size), 1) AS survival_rate
FROM cohort_periods cp
LEFT JOIN monthly_actives a
ON cp.user_id = a.user_id
AND DATE_DIFF('month', cp.cohort_month, a.month) = cp.period
GROUP BY 1, 2
The survival curve reveals at which period churn is highest — common patterns: a steep drop in months 1-3 (onboarding failure churn), gradual decline, or a delayed cliff at contract renewal dates.
Segmenting Churn by Customer Attributes
Aggregate churn rate rarely yields actionable insight. Break churn by segment to identify where it is concentrated:
SELECT
u.plan_type,
u.acquisition_channel,
u.company_size_band,
COUNT(DISTINCT CASE WHEN churn.user_id IS NOT NULL THEN u.user_id END) AS churned_users,
COUNT(DISTINCT u.user_id) AS active_users,
ROUND(100.0 * COUNT(DISTINCT CASE WHEN churn.user_id IS NOT NULL THEN u.user_id END) / COUNT(DISTINCT u.user_id), 1) AS churn_rate
FROM users u
LEFT JOIN churned_users churn ON u.user_id = churn.user_id
GROUP BY 1, 2, 3
ORDER BY churn_rate DESC
Common high-churn segments: users on free or trial plans, users who did not complete onboarding, users acquired through low-intent channels, users at small companies with lower budget stability.
Leading Indicators of Churn
Predicting churn requires identifying behaviours that precede churn. Build a feature table for each customer over a trailing period:
- Login frequency in last 30 days (declining = risk signal)
- Days since last login
- Feature adoption breadth (users of one feature churn faster than users of multiple)
- Support ticket volume and sentiment
- Contract renewal date proximity
- Plan tier and price sensitivity signals
- Stakeholder change (new decision-maker = re-evaluation risk)
Compute these features in SQL:
SELECT
u.user_id,
DATEDIFF('day', MAX(a.activity_date), CURRENT_DATE) AS days_since_last_active,
COUNT(DISTINCT CASE WHEN a.activity_date >= CURRENT_DATE - 30 THEN DATE(a.activity_date) END) AS active_days_last_30,
COUNT(DISTINCT a.feature_used) AS feature_breadth,
u.plan_type,
u.contract_renewal_date,
DATEDIFF('day', CURRENT_DATE, u.contract_renewal_date) AS days_to_renewal
FROM users u
LEFT JOIN activity_events a ON u.user_id = a.user_id
GROUP BY u.user_id, u.plan_type, u.contract_renewal_date
Basic Predictive Churn Model
With churn labels (churned in next 30 days: 1/0) and features per customer, a logistic regression provides a churn propensity score for each active customer.
In Python with scikit-learn:
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
X = features[['days_since_last_active', 'active_days_last_30', 'feature_breadth', 'days_to_renewal']]
y = features['churned_next_30_days']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
model = LogisticRegression()
model.fit(X_train_scaled, y_train)
scores = model.predict_proba(scaler.transform(X[features]))[:, 1]
Score every active customer daily. Customers above a threshold (typically the top 10-20% of risk score) are flagged for intervention — customer success outreach, discount offer, proactive training session.
Intervention Design
Churn prediction without intervention design is reporting, not value creation. For each risk tier:
- **High risk, high value**: Customer success manager personal outreach within 24 hours — not an automated email, a human conversation to identify the root cause of declining engagement
- **High risk, medium value**: Automated in-app nudge to a feature they have not adopted; email offering a training session; proactive support check-in
- **Medium risk**: Automated re-engagement campaign highlighting unused features or new releases relevant to their use case
Track intervention outcomes: did customers who received outreach churn at a lower rate than a holdout group that did not? This measures the actual ROI of the retention programme — not just whether the model predicts churn, but whether interventions actually reduce it.
Our BI strategy practice designs churn analytics frameworks and customer health scoring models — contact us to discuss your retention analytics requirements.
A former Microsoft data architect audits your data foundation, identifies your top priorities, and sends you a written plan. Free. No pitch.
Book a Call →