
Tricky SQL Interview Question: Calculating Revenue from Loyal Customers
If you’re preparing for SQL interview questions for data analyst or data scientist roles, you’ll often encounter real-world business scenarios where you need to write queries that not only return correct results but also demonstrate structured problem-solving.
In this post, we’ll walk through an actual SQL interview-style problem that a major U.S. hotel chain could ask during a data role interview. The challenge involves identifying loyal customers and calculating the percentage of total revenue they contributed in a given year.
Interview Question: The Business Scenario
Imagine you are an analyst for a major hotel chain with locations across the United States. The marketing team is planning a new promotion focused on loyal customers. Before launching it, they want to know:
👉 What percentage of our 2017 revenue came from loyal customers?
What Defines a Loyal Customer?
For this problem, a loyal customer must meet all three of these criteria:
-
Membership – The customer must be part of the company’s points system (
is_member = true). -
Repeat Stays per Hotel – At each hotel they visited, they must have stayed more than 2 times.
-
Multiple Locations – They must have stayed at 3 different hotels across the U.S.
Table Schema
You’re given a dataset customer_transactions with the following schema:
| Column Name | Data Type | Description |
|---|---|---|
customer_id |
id | Unique customer identifier |
hotel_id |
integer | Unique ID for hotel |
transaction_id |
integer | Unique ID for transaction |
first_night |
string | Start date of stay, format "YYYY-mm-dd" |
number_of_nights |
integer | Number of nights stayed |
total_spend |
integer | Total spend in USD for the stay |
is_member |
boolean | Whether the customer is a member of the points system |
Step 1: Understanding the Problem
We need two numbers:
-
Revenue from loyal customers
-
Total revenue (all customers)
Then compute:
\(Percent Revenue from Loyal Customers= \frac{\text{Revenue Loyal}}{\text{Revenue Total}} \times 100\)
The challenge lies in identifying loyal customers correctly before aggregating revenue.
You need to be familiar with CTEs and joins to solve this problem.
Step 2: Break Down the Logic
-
Filter to members only →
WHERE is_member = true. -
Check repeat stays per hotel → Customers must have more than 2 stays at each hotel they visited.
-
This means counting transactions per
customer_id+hotel_id.
-
-
Check different hotel count → Customer must have stayed at least in 3 unique hotels.
-
Revenue aggregation → Once loyal customers are identified, calculate their total spend and compare to total revenue.
Step 3: SQL Query
Here’s a possible solution:
We create CTEs for member transactions, hotel stays and "valid customer" - to find customers who satisfy the constraints given in the question. Similary, we create two more CTEs for revenue - total and loyal. The loyal revenue is calculated using the "valid customers" CTEs with the help of a join as shown below.
WITH member_transactions AS (
SELECT *
FROM customer_transactions
WHERE is_member = TRUE
),
hotel_stays AS (
SELECT
customer_id,
hotel_id,
COUNT(DISTINCT transaction_id) AS num_stays
FROM member_transactions
GROUP BY customer_id, hotel_id
),
valid_customers AS (
SELECT
customer_id
FROM hotel_stays
GROUP BY customer_id
HAVING
MIN(num_stays) > 2 -- more than 2 stays at each hotel visited
AND COUNT(DISTINCT hotel_id) >= 3 -- at least 3 unique hotels
),
loyal_revenue AS (
SELECT
SUM(ct.total_spend) AS loyal_spend
FROM customer_transactions ct
JOIN valid_customers vc
ON ct.customer_id = vc.customer_id
WHERE ct.is_member = TRUE
AND strftime('%Y', ct.first_night) = '2017'
),
total_revenue AS (
SELECT
SUM(total_spend) AS total_spend
FROM customer_transactions
WHERE strftime('%Y', first_night) = '2017'
)
SELECT
ROUND(100.0 * loyal_spend / total_spend, 2) AS loyal_revenue_percent
FROM loyal_revenue, total_revenue;
Step 4: Walkthrough of the Query
-
member_transactions→ filters down to only members. -
hotel_stays→ counts stays per customer per hotel. -
valid_customers→ ensures:-
Every hotel they visited had >2 stays (
MIN(num_stays) > 2). -
They stayed in 3+ different hotels.
-
-
loyal_revenue→ sums total spend from valid loyal customers in 2017. -
total_revenue→ sums all customer spend in 2017. -
Final
SELECT→ computes % revenue from loyal customers.

Step 5: Interviewer Follow-Up Questions
In interviews, solving the query is only half the battle. Expect follow-up questions such as:
-
Performance Optimization
-
How would this query perform on billions of rows?
-
Could you use indexes on
customer_id,hotel_id, andfirst_night?
-
-
Edge Cases
-
What if a customer stayed 3 times in one hotel and 0 times in others?
-
How would you handle cancellations or refunds?
-
-
Business Understanding
-
Why does the company define “loyal” this way?
-
How would changing the definition (e.g., 2 different hotels instead of 3) impact the results?
-

Key Takeaways
-
SQL interview questions often combine business rules with technical filtering and aggregation.
-
Always break down complex conditions into smaller logical steps using CTEs.
-
Be prepared to discuss query performance, edge cases, and business implications.
Final Thoughts
This problem is a great example of the type of SQL case study questions asked in interviews for data analysts, data scientists, and data engineers. Practicing problems like this will help you stand out in interviews—not just by writing the query, but by demonstrating structured thinking and real-world awareness.
If you want to practice more real interview questions like this, sign up on Dataloopr.
