Welcome to SQL Adventures!
Inspired by Danny Ma’s 8 Week SQL Challenge, let’s dive into the challenge for Week 2— Pizza Runner. This series is a gateway to my efforts in mastering SQL, one challenge at a time. Through this series of articles, I share my solutions and insights gained from each challenge. The first part of this challenge, Week 1 — Danny’s Diner, is documented here.
Data
The data is about Danny’s Diner, a restaurant specializing in sushi, curry, and and pizza. The data is stored in the pizza_runner
database schema, which contains several interconnected tables that capture details about customers, orders, runners, and the pizzas themselves. Here is a brief overview of the tables:
runners
: Contains details about the runners, including their unique runner IDs and registration dates.customer_orders
: Captures pizza orders from customers, including information on the type of pizza ordered, any customizations (exclusions or extras), and the order time.runner_orders
: Tracks order fulfillment by runners, including pickup times, delivery distances, durations, and cancellations.pizza_names
: A simple reference table that lists the available pizza types, such as Meat Lovers and Vegetarian.pizza_recipes
: Details the standard set of toppings for each pizza type.pizza_toppings
: Provides a mapping of topping IDs to their respective names, helping to identify ingredients used in the pizzas.
I’ll add the ER diagram here for reference, but the data within each table can be seen in the challenge website.
Challenge
This challenge is packed with over 25 questions, divided into four key areas of focus. To maintain my sanity while writing this (and hopefully yours while reading), I’ve decided to break it down into parts. Part I will cover the first area of Pizza Metrics and following articles will cover the remaining areas along with the bonus questions.
Pizza Metrics
- How many pizzas were ordered?
- How many unique customer orders were made?
- How many successful orders were delivered by each runner?
- How many of each type of pizza was delivered?
- How many Vegetarian and Meatlovers were ordered by each customer?
- What was the maximum number of pizzas delivered in a single order?
- For each customer, how many delivered pizzas had at least 1 change and how many had no changes?
- How many pizzas were delivered that had both exclusions and extras?
- What was the total volume of pizzas ordered for each hour of the day?
- What was the volume of orders for each day of the week?
Solutions
Before diving into the questions, it can be seen that there are some null values within certain tables. To fix this, an update command can be used.
In the table customer_orders
, the columns exclusions
and extras
have null values which are updated to NULL
as follows:
UPDATE pizza_runner.customer_orders
SET exclusions = CASE
WHEN exclusions = 'null' OR exclusions = ' ' THEN NULL
ELSE exclusions
END,
extras = CASE
WHEN extras = 'null' OR extras = 'NaN' OR extras = ' ' THEN NULL
ELSE extras
END
RETURNING *;
In the table runner_orders
, the columns pickup_time
, distance
, cancellation
and duration
have ‘null’
or ‘NaN’
values and the columns distance
and duration
have irregularities within their format and data types. These can be fixed as follows:
UPDATE pizza_runner.runner_orders
SET pickup_time = CASE
WHEN pickup_time = 'null' THEN NULL
ELSE pickup_time
END,
distance = CASE
WHEN distance = 'null' THEN NULL
ELSE regexp_replace(distance, '[^0-9]', '', 'g')::NUMERIC(10,2)
END,
duration = CASE
WHEN duration = 'null' THEN NULL
ELSE regexp_replace(duration, '[^0-9]', '', 'g')::INTEGER
END,
cancellation = CASE
WHEN cancellation = 'null' OR cancellation = '' THEN NULL
ELSE cancellation
END
RETURNING *;
Here, the table runner_orders
is updated to clean and format its data. The query converts occurrences of the string 'null'
to actual NULL
values in the pickup_time
, distance
, duration
, and cancellation
columns. For the distance
column, non-numeric characters are removed, and the values are cast to a NUMERIC
type with two decimal places. Similarly, the duration
column is stripped of non-numeric characters and cast to an INTEGER
. The cancellation
column is set to NULL
for empty strings or the string 'null'
.
Let’s move on to solving the questions now.
1. How many pizzas were ordered?
SELECT
COUNT(pizza_id) AS total_pizza_count
FROM
pizza_runner.customer_orders;
This is straightforward — counting the total number of pizzas within the customer_orders
table.
2. How many unique customer orders were made?
SELECT
COUNT(DISTINCT order_id) AS unique_order_count
FROM
pizza_runner.customer_orders;
This is again counting the number of orders within the customer_orders
table and not the number of pizzas.
3. How many successful orders were delivered by each runner?
Since I used DBFiddle for completing this task, I had to rerun the update command before every run. To avoid this repetition, I wrote a CTE that does the same and used it to solve this question.
WITH cte_ro AS (
SELECT
order_id,
runner_id,
CASE
WHEN pickup_time = 'null' THEN NULL
ELSE pickup_time
END AS pickup_time,
CASE
WHEN distance = 'null' THEN NULL
ELSE REGEXP_REPLACE(distance, '[^0-9]', '', 'g')::NUMERIC(10,2)
END AS distance,
CASE
WHEN duration = 'null' THEN NULL
ELSE REGEXP_REPLACE(duration, '[^0-9]', '', 'g')::INTEGER
END AS duration,
CASE
WHEN cancellation = 'null' OR cancellation = '' THEN NULL
ELSE cancellation
END AS cancellation
FROM
pizza_runner.runner_orders
)
SELECT
DISTINCT runner_id,
COUNT(*) AS successful_orders
FROM
cte_ro
WHERE
cancellation IS NULL
GROUP BY
runner_id
ORDER BY
runner_id;
If UPDATE command works fine, then this can be the used directly:
SELECT
runner_id,
COUNT(*) AS successful_orders
FROM
pizza_runner.runner_orders
WHERE
cancellation IS NULL
GROUP BY
runner_id
ORDER BY
runner_id;
The number of successful orders per runner is as shown. Runner with runner_id
of 1 has the most number of successful orders at 4.
4. How many of each type of pizza was delivered?
To just get the number of pizzas of each type that were ordered:
SELECT DISTINCT
co.pizza_id,
pn.pizza_name,
COUNT(co.order_id) AS number_of_pizzas
FROM
pizza_runner.customer_orders co
LEFT JOIN
pizza_runner.pizza_names pn ON co.pizza_id = pn.pizza_id
GROUP BY
co.pizza_id,
pn.pizza_name;
Taking into account only the orders that were delivered and not cancelled, the number of pizzas of each type can be calculated using the same CTE as used in Q3.
WITH cte_ro AS (
SELECT
order_id,
runner_id,
CASE
WHEN pickup_time = 'null' OR pickup_time = '' THEN NULL
ELSE pickup_time
END AS pickup_time,
CASE
WHEN distance = 'null' OR distance = '' THEN NULL
ELSE REGEXP_REPLACE(distance, '[^0-9]', '', 'g')::NUMERIC(10,2)
END AS distance,
CASE
WHEN duration = 'null' THEN NULL
ELSE REGEXP_REPLACE(duration, '[^0-9]', '', 'g')::INTEGER
END AS duration,
CASE
WHEN cancellation = 'null' OR cancellation = '' THEN NULL
ELSE cancellation
END AS cancellation
FROM
pizza_runner.runner_orders
)
SELECT
DISTINCT co.pizza_id,
pn.pizza_name,
COUNT(co.order_id) AS number_of_pizzas
FROM
pizza_runner.customer_orders co
LEFT JOIN
pizza_runner.pizza_names pn ON co.pizza_id = pn.pizza_id
LEFT JOIN
cte_ro c ON co.order_id = c.order_id
WHERE
c.cancellation IS NULL
GROUP BY
co.pizza_id,
pn.pizza_name;
The number of pizzas delivered with respect to the pizza type is as shown. Pizza with pizza_id
of 1, that is the Meatlovers, is the most delivered pizza type at 9 pizzas.
5. How many Vegetarian and Meatlovers were ordered by each customer?
Again, since I used DBFiddle for completing this task, I wrote a CTE that does the same update to customer_orders table.
WITH cte_co AS (
SELECT
order_id,
customer_id,
pizza_id,
CASE
WHEN exclusions = 'null' OR exclusions = '' THEN NULL
ELSE exclusions
END AS exclusions,
CASE
WHEN extras = 'null' OR extras = '' THEN NULL
ELSE extras
END AS extras,
order_time
FROM
pizza_runner.customer_orders
)
SELECT
customer_id,
pn.pizza_name,
COUNT(*) AS pizza_count
FROM
cte_co co
LEFT JOIN
pizza_runner.pizza_names pn ON co.pizza_id = pn.pizza_id
GROUP BY
customer_id,
pn.pizza_name
ORDER BY
customer_id;
The number of pizzas delivered with respect to the pizza type and customer is as shown. Every customer except customers with customer_id
of 104 and 105 have ordered both the types of pizzas. While 104 has only ordered Meatlovers pizza thrice, 105 has only ordered Vegetarian once.
6. What was the maximum number of pizzas delivered in a single order?
Using the CTEs as defined before, the maximum number of pizzas ordered in a single order:
SELECT
order_id,
COUNT(pizza_id) AS number_of_pizzas
FROM
cte_co
GROUP BY
order_id
ORDER BY
number_of_pizzas DESC
LIMIT 1;
To ensure that only the orders that were delivered and not cancelled are considered, the code can be modified as follows which again gives the same result:
SELECT
co.order_id,
COUNT(co.pizza_id) AS number_of_pizzas
FROM
cte_co co
LEFT JOIN
cte_ro ro ON co.order_id = ro.order_id
WHERE
ro.cancellation IS NULL
GROUP BY
co.order_id
ORDER BY
number_of_pizzas DESC
LIMIT 1;
Using both the conditions, the maximum number of pizzas delivered in a single order is 3 which is in the order_id
of 4.
7. For each customer, how many delivered pizzas had at least 1 change and how many had no changes?
Considering either exclusions or extras within the order as a change:
SELECT
co.customer_id,
COUNT(co.pizza_id) AS pizza_count,
SUM(
CASE
WHEN co.exclusions IS NOT NULL OR co.extras IS NOT NULL THEN 1
ELSE 0
END
) AS at_least_1_change,
SUM(
CASE
WHEN co.exclusions IS NULL AND co.extras IS NULL THEN 1
ELSE 0
END
) AS no_changes
FROM
cte_co co
LEFT JOIN cte_ro ro ON co.order_id = ro.order_id
WHERE
ro.cancellation IS NULL
GROUP BY
co.customer_id
ORDER BY
co.customer_id;
This shows that customer with customer_id
of 101 had 2 pizzas delivered, both with no changes, as well as customer with customer_id
of 102 had 3 pizzas delivered, with no changes. Customer with customer_id
of 103 had 3 pizzas delivered, all with at least 1 change, as well as customer with customer_id
of 105 had only 1 pizza delivered and that too with at least 1 change.
8. How many pizzas were delivered that had both exclusions and extras?
SELECT
COUNT(co.pizza_id) AS pizza_count_with_extras_exclusions
FROM cte_co co
LEFT JOIN cte_ro ro
ON co.order_id = ro.order_id
WHERE ro.cancellation IS NULL
AND co.exclusions IS NOT NULL
AND co.extras IS NOT NULL;
Only one pizza was delivered that had both exclusions and extras.
9. What was the total volume of pizzas ordered for each hour of the day?
SELECT
DATE_PART('hour', order_time) AS hour_of_day,
COUNT(order_id) AS pizza_volume
FROM cte_co
GROUP BY DATE_PART('hour', order_time)
ORDER BY DATE_PART('hour', order_time);
The total volume of pizzas ordered each day with respect to the hour of the day is as shown. The majority of orders were placed during the hours of 13 (1 pm), 18 (6 pm), 21 (9 pm) and 23 (11 pm).
10. What was the volume of orders for each day of the week?
SELECT
TO_CHAR(order_time, 'FMDay') AS day_of_week,
COUNT(order_id) AS pizza_volume
FROM cte_co
GROUP BY TO_CHAR(order_time, 'FMDay');
The total volume of pizzas ordered each day of the week is as shown. The majority of orders were placed on Saturday and Wednesday (5) followed by Thursday (3) and then Friday (1).
That was Part I. Onto the next one! See you in Part II!
Toodles!